COGS-109-UCSD-Final-Project

Airbnb Analysis Project

Title: On the relativity of star ratings and how to understand them depending on cultural and geographical groups

Contributors:
Christopher Jensen (cajensen@ucsd.edu)
Lana Andreasyan (landreas@ucsd.edu)
Amine M’Charrak (amine.mcharrak@tum.de)

Q&A:

Q1: Which dataset did we identify to study?

The Airbnb dataset provided by Tom Slee.

Q2: Why did you choose this topic and dataset?

We all agreed that rating systems heavily influence us and that we often let ratings drive a large part of our final decision. We even go so far as to exclude candidates (accommodations) on the basis of a single score, the average rating value.

Q3: What do you want to analyze in this dataset?

We want to investigate the meaning of the star rating system. Is this ranking relevant to everyone? Are the requirements for a five-star rating always the same, or are there surprising differences depending on the cultural group? Can we come up with a general, simplified model with fewer predictors and successfully apply it to similar datasets for different locations?

Q4: Where do you start and how do you proceed?

Step one: We want to do an exploratory analysis of the dataset to identify interesting phenomena or significant associations between predictors and response. Step two: We would like to solve a predictive task by applying our own model to this dataset, using the predictors we found helpful in step one.
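
As a rough illustration of step one, the sketch below ranks candidate predictors by the strength of their linear association with the rating. It is only a minimal example under stated assumptions: the rows are invented toy values, the column names (`price`, `reviews`, `accommodates`, `overall_satisfaction`) are assumed labels rather than confirmed fields of the dataset, and Pearson correlation is just one possible screening measure, not necessarily the one used in the notebooks.

```python
import math

# Hypothetical toy rows, invented for illustration (not taken from the dataset).
# The column names are assumptions about how the predictors might be labelled.
listings = [
    {"price": 60.0, "reviews": 12, "accommodates": 2, "overall_satisfaction": 4.5},
    {"price": 95.0, "reviews": 40, "accommodates": 4, "overall_satisfaction": 5.0},
    {"price": 45.0, "reviews": 3, "accommodates": 1, "overall_satisfaction": 4.0},
    {"price": 120.0, "reviews": 25, "accommodates": 3, "overall_satisfaction": 4.5},
    {"price": 80.0, "reviews": 8, "accommodates": 2, "overall_satisfaction": 3.5},
    {"price": 150.0, "reviews": 60, "accommodates": 6, "overall_satisfaction": 5.0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rank_predictors(rows, response):
    """Return (predictor, r) pairs sorted by |r| with the response, strongest first."""
    ys = [row[response] for row in rows]
    predictors = [name for name in rows[0] if name != response]
    scores = {name: pearson([row[name] for row in rows], ys) for name in predictors}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


ranking = rank_predictors(listings, "overall_satisfaction")
for name, r in ranking:
    print(f"{name:>14}: r = {r:+.3f}")
```

Predictors near the top of such a ranking would be natural candidates to carry into step two, although a strong correlation on its own does not establish that a predictor is useful out of sample.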

Q5: Okay. So how are you going to do this and which methods are you going to apply?

The challenge is that we do not know the true relationship between predictors and response. Therefore, we have to select an appropriate model by taking several aspects into account: the size of our dataset and the risk of overfitting, which we want to limit by choosing a model that is not too complex and by using cross-validation. Next, we want to apply multiple linear regression to our dataset to identify which variables are significant for predicting overall satisfaction, which is denoted by the average rating in stars. Finally, we want to use principal component analysis (PCA) to reduce the dimensionality of our feature space while still making acceptable predictions, this time with a much simpler model.
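
The three methods named above can be combined in one small experiment. The following is a minimal sketch, assuming NumPy is available: it uses k-fold cross-validation to compare a multiple linear regression on all predictors with a regression on a few principal components. The data are synthetic stand-ins generated in the script, and the helper names (`fit_ols`, `predict_ols`, `pca_fit`, `cv_mse`) are hypothetical, not functions from the project notebooks.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares coefficients, with an intercept column prepended to X."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict_ols(coef, X):
    """Predictions from coefficients returned by fit_ols."""
    return np.column_stack([np.ones(len(X)), X]) @ coef


def pca_fit(X, n_components):
    """Return the column means and the top principal directions (as columns)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T


def cv_mse(X, y, k=5, n_components=None, seed=0):
    """Mean squared error averaged over k folds.

    If n_components is given, PCA is fitted on the training fold only and the
    regression is run on the projected predictors.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    errors = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        X_train, X_test = X[train], X[test]
        if n_components is not None:
            mean, components = pca_fit(X_train, n_components)
            X_train = (X_train - mean) @ components
            X_test = (X_test - mean) @ components
        coef = fit_ols(X_train, y[train])
        errors.append(np.mean((predict_ols(coef, X_test) - y[test]) ** 2))
    return float(np.mean(errors))


# Synthetic stand-in data (an assumption for illustration): six correlated
# predictors driven by two latent factors, and a rating-like response.
rng = np.random.default_rng(42)
n = 200
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, 6)) + 0.1 * rng.normal(size=(n, 6))
y = 4.5 + 0.3 * latent[:, 0] - 0.2 * latent[:, 1] + 0.1 * rng.normal(size=n)

mse_full = cv_mse(X, y, k=5)
mse_pca = cv_mse(X, y, k=5, n_components=2)
print(f"CV MSE, all 6 predictors:       {mse_full:.4f}")
print(f"CV MSE, 2 principal components: {mse_pca:.4f}")
```

Fitting PCA inside each training fold keeps information from the held-out fold out of the model. With real predictors measured in different units (price, review counts, and so on), standardizing the columns before PCA would likely be needed; the sketch skips this because the synthetic columns share a similar scale.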

Q6: What outcome do you expect from your project and what decides whether you succeeded?

We would be really happy if we were able to uncover counterintuitive trends and associations in our dataset and perhaps find interesting relationships that contradict our common-sense understanding of how we make decisions based on rating systems. All in all, we want to gain a better understanding of the real value of this five-star rating hierarchy.

Repository files:
5Stars_wordcloud_copenhagen.pdf
5Stars_wordcloud_newyork.pdf
5Stars_wordcloud_sydney.pdf
Airbnb Exploratory Analysis.ipynb
BaseAnalysisAirbnb.ipynb
Neighbourhood_vs_rating_Copenhagen.pdf
Piechart_Copenhagen_reviews_rating.pdf
README.md
Rating_vs_Price_Copenhagen.pdf
Regression_Comparisons.pdf
Rooomtype_distribution_copenhagen.pdf
Rooomtype_distribution_newyork.pdf
Rooomtype_distribution_sydney.pdf
cogs-109-project.pdf
copenhagen_predictors.csv
newyork_predictors.csv
sydney_predictors.csv

Repository: mcharrak/Airbnb-Reviews-Dataset-Analysis