Skip to content

Latest commit

 

History

History
75 lines (60 loc) · 3.57 KB

README.md

File metadata and controls

75 lines (60 loc) · 3.57 KB

The Hotel Dilemma

  • UW FinTech Boot Camp - Project 3 Submission - May 2021

Team Members

  • Monique T.
  • Tony H.
  • April A.
  • Nick N.

Motivation

  • After a 1-year pandemic, we all need a vacation...
  • Our group wanted to conduct a project using the tools learned in both Module 1 & Module 2 of the boot camp (python + machine learning). We set out looking for data sets and questions related to supply chain management and demand forecasting, which led us to a very interesting prompt related to forecasting hotel reservation cancellations. Given we all need a good vacation, we pursued the the following project...

Research Questions

  • Using data and machine learning models, can hotel reservation cancellations be predicted?
  • If yes to the above question, what models and methods most accurately predict hotel reservation cancellations?

Objectives

  • Build a ML model that can predict whether a hotel reservation will be cancelled
  • Analyze and understand data via organization, visualization, and dashboards

Data Sources

Action Items

  • Data Cleaning & Shaping
    • Data comes from/affiliated with an article: Hotel Booking Demand Datasets
    • Data was cleaned by Thomas Mock and Antoine Bichat (additional cleaning and shaping conducted by our team)
    • Is there further noise/info we want to weed out? Label encoding?
  • Machine Learning Model
    • Which model to use (try multiple models)
    • Ensemble/Classifier/Decision Tree/Regression? Pick several and also apply resampling techniques if needed. We predict classifier models will be the most effective given we will be classifying a binary outcome (cancelled vs not cancelled)
    • Which parameters/inputs produce the best outcomes (train/test split; different inputs for each ML model type aka reference documentation; which models are most efficient; what features in the dataset can we eliminate)
    • Look at data in different ways? Is the model/data better for predicting in the summer/winter/fall/etc? Should we try forecasting for specific date ranges, like spring break, holiday breaks, etc. This will be a reach if we have time.
  • Data Visualization
    • Visualize by city hotel & resort hotel
    • Visualize by season
    • Visualize different demographics
    • Visualize different ML model outcomes?
    • Explore other means and methods of visualization that may give unique insight into the data set

Work Assignments

  • ML Models & Workbook - April + Nick
  • Data Visualizations & Dashboard - Monique + Tony
  • Slide Show - Whole Team

Technologies

  • Jupyter lab
  • Python
  • Pandas
  • Numpy
  • Sklearn
  • Pyviz
  • More to be imported and utilized in our python files

Attachments

Outcomes

  • Using machine learning models, our Team was able to predict hotel cancellations with confidence (particularly using the SMOTEENN Resampling + BalancedRandomForestClassifier model, which rendered a ~90% accuracy score). Vizualizations of our final accuracy score outcomes are below. Please see our uploaded slide show for more information on the outcomes.

Presentation Assignments

  • Intro & Hypothesis: Tony
  • Visualizations & Intro to Data: Monique
  • Data Preparation & Model Selection: Nick
  • Model Outcomes & Takeaways: April