Statistical and ML Modeling of Travel Reviews Data
We analyze the Travel Review Ratings Data Set, which contains almost 5500 user reviews of 24 different activities. We first apply standard statistical and exploratory techniques to better understand the characteristics of the data set. To classify the data, we group the 24 activities into four intuitive categories and evaluate a Decision Tree model and a K-Nearest Neighbors model on three different test set sizes. The results are satisfactory, with accuracies around 90%; we also discuss two improvements for future implementations.
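As a rough illustration of the evaluation protocol summarized above, the following sketch runs a K-Nearest Neighbors classifier on three test set sizes. It is not the authors' implementation. The data are synthetic, the test fractions (0.2, 0.3, 0.4), the value k = 5, and the rule that labels each user by a preferred category are all assumptions made for this example, and the Decision Tree model is omitted for brevity. All function names are hypothetical.

```python
import random
from collections import Counter


def make_synthetic_reviews(n_users=300, n_activities=24, n_groups=4, seed=0):
    """Synthetic stand-in for the ratings table (NOT the real data set).

    Each row holds one user's 0-5 ratings for 24 activities. The label is a
    hypothetical 'preferred category' among four equal blocks of activities.
    """
    rng = random.Random(seed)
    per_group = n_activities // n_groups
    rows, labels = [], []
    for _ in range(n_users):
        favourite = rng.randrange(n_groups)
        row = []
        for a in range(n_activities):
            # Assumption: users rate activities in their preferred block higher.
            base = 4.0 if a // per_group == favourite else 2.0
            row.append(min(5.0, max(0.0, rng.gauss(base, 0.7))))
        rows.append(row)
        labels.append(favourite)
    return rows, labels


def train_test_split(rows, labels, test_size, seed=0):
    """Shuffle indices and hold out a fraction `test_size` for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = int(round(len(rows) * test_size))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    x_train = [rows[i] for i in train_idx]
    x_test = [rows[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k nearest training rows (squared Euclidean)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), y)
        for x, y in zip(train_x, train_y)
    )
    votes = Counter(y for _, y in dists[:k])
    return votes.most_common(1)[0][0]


def evaluate(test_sizes=(0.2, 0.3, 0.4), k=5):
    """Return {test_size: accuracy} for the k-NN classifier."""
    rows, labels = make_synthetic_reviews()
    results = {}
    for ts in test_sizes:
        x_tr, x_te, y_tr, y_te = train_test_split(rows, labels, ts)
        correct = sum(knn_predict(x_tr, y_tr, q, k) == y for q, y in zip(x_te, y_te))
        results[ts] = correct / len(x_te)
    return results


if __name__ == "__main__":
    for ts, acc in evaluate().items():
        print(f"test_size={ts:.1f}  accuracy={acc:.3f}")
```

Because the synthetic classes are well separated by construction, the accuracies printed here say nothing about the real data set. The sketch only shows how the same model can be compared across several train/test splits.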