Walmart, one of the leading retail chains in the US, sometimes runs out of stock because its demand-forecasting algorithm handles unforeseen demand poorly. An ideal ML algorithm would predict demand accurately and take into account economic factors such as the CPI and the Unemployment Index. Walmart runs several promotional markdown events throughout the year. These markdowns precede prominent holidays, the four largest of which are the Super Bowl, Labor Day, Thanksgiving, and Christmas. The weeks that include these holidays are weighted five times higher in the evaluation than non-holiday weeks.
The objective is to determine the factors that affect sales and to analyze the impact of markdowns around holidays on sales.
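The holiday weighting described above can be illustrated with a small sketch. It assumes the evaluation metric is a weighted mean absolute error, which this description does not state explicitly. The function name `weighted_mae` and its parameters are hypothetical and are not taken from the notebook.

```python
def weighted_mae(actual, predicted, is_holiday, holiday_weight=5.0):
    """Weighted mean absolute error (assumed metric, for illustration only).

    Holiday weeks get weight `holiday_weight`; all other weeks get weight 1.
    """
    if not (len(actual) == len(predicted) == len(is_holiday)):
        raise ValueError("all inputs must have the same length")
    if len(actual) == 0:
        raise ValueError("inputs must not be empty")

    # One weight per week: 5 for holiday weeks, 1 otherwise (by default).
    weights = [holiday_weight if flag else 1.0 for flag in is_holiday]

    # Sum of weighted absolute errors, normalised by the total weight.
    weighted_error = sum(
        w * abs(a - p) for w, a, p in zip(weights, actual, predicted)
    )
    return weighted_error / sum(weights)
```

Under this assumption, a forecast error in a holiday week moves the score as much as the same error repeated in five ordinary weeks, so a model that is accurate on average but weak around holidays will score poorly.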
In this project, we undertake the following two tasks:
a) Answer the following questions:
- Which store has maximum sales?
- Which store has the maximum standard deviation, i.e., the sales vary the most? Also, find the coefficient of variation (the ratio of the standard deviation to the mean).
- Which stores have a good quarterly growth rate in Q3 2012?
- Some holidays have a negative impact on sales. Identify the holidays with higher sales than the mean sales in the non-holiday season, for all stores together.
- Provide a monthly and semester view of sales in units and give insights.
b) Build three prediction models to forecast demand and identify the most accurate one:
- Linear Regression
- Decision Trees
- Random Forest
The ipynb and html versions of the code are in 'Notebook'.
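As a rough illustration of the store-level questions in task (a), and not the code in the notebook, the sketch below computes total sales, standard deviation, and coefficient of variation per store using only the Python standard library. The input shape, a list of `(store_id, weekly_sales)` pairs, and the names `store_summary` and `sample_rows` are assumptions made for this example. The figures are made up.

```python
from collections import defaultdict
from statistics import mean, stdev


def store_summary(rows):
    """Summarise weekly sales per store.

    `rows` is an iterable of (store_id, weekly_sales) pairs (assumed layout).
    Returns {store_id: {"total", "mean", "std", "cv"}}.
    """
    sales_by_store = defaultdict(list)
    for store_id, weekly_sales in rows:
        sales_by_store[store_id].append(weekly_sales)

    summary = {}
    for store_id, values in sales_by_store.items():
        avg = mean(values)
        # Sample standard deviation needs at least two observations.
        std = stdev(values) if len(values) > 1 else 0.0
        summary[store_id] = {
            "total": sum(values),
            "mean": avg,
            "std": std,
            # Coefficient of variation: standard deviation relative to the mean.
            "cv": std / avg if avg else float("nan"),
        }
    return summary


# Made-up numbers, purely for illustration.
sample_rows = [(1, 100.0), (1, 100.0), (1, 100.0), (2, 50.0), (2, 150.0)]
summary = store_summary(sample_rows)

# Store with the highest total sales and store whose sales vary the most.
top_store = max(summary, key=lambda s: summary[s]["total"])
most_variable_store = max(summary, key=lambda s: summary[s]["std"])
```

The coefficient of variation is useful here because it scales the spread by the store's average sales, which makes large and small stores comparable.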