samirsaci/ml-forecast-features-eng

Machine Learning for Retail Sales Forecasting — Features Engineering 📈

Understand the impact of additional features related to stock-outs, store closing dates or cannibalization on a Machine Learning model for sales forecasting.

According to the results of the latest Makridakis Forecasting Competitions, Machine Learning models can reduce the forecasting error by 20% to 60% compared to benchmark statistical models.

Their major advantage is the capacity to include external features that heavily impact the variability of your sales.

For example, e-commerce cosmetics sales are driven by special events (promotions) and by how you advertise a reference on the website (first page, second page, …).

This process, called feature engineering, relies on analytical concepts and business insights to understand what could drive your sales.

Article

In this article, we will try to understand the impact of several features on the accuracy of a model using the M5 Forecasting competition dataset.

Experiment

Based on business insights or common sense, we will add features, built from existing ones, to help our model capture the key factors impacting customer demand.
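To make the idea of building a feature from existing ones more concrete, here is a minimal sketch of one possible stock-out signal: the number of consecutive days with zero sales for each item. It is an illustration and not necessarily how the notebook computes its features. The column names `item_id`, `date` and `sales`, the function name `add_zero_sales_run` and the 3-day threshold are all assumptions, and the data is expected in long format (one row per item and day).

```python
import pandas as pd


def add_zero_sales_run(df: pd.DataFrame, threshold: int = 3) -> pd.DataFrame:
    """Add a per-item count of consecutive zero-sales days (hypothetical feature).

    Assumes a long-format frame with the columns item_id, date and sales.
    """
    out = df.sort_values(["item_id", "date"]).reset_index(drop=True)
    is_zero = out["sales"].eq(0)

    # Every day with a sale starts a new block for that item, so all the
    # zero-sales days that follow it share the same block number.
    block = (~is_zero).astype(int).groupby(out["item_id"]).cumsum()

    # Within each (item, block) pair, count the zero-sales days seen so far.
    out["zero_run"] = is_zero.astype(int).groupby([out["item_id"], block]).cumsum()

    # Assumed rule of thumb: a long enough run of zeros may be a stock-out.
    out["possible_stockout"] = out["zero_run"] >= threshold
    return out
```

A long run of zeros can also simply mean low demand, so a flag like this is best treated as a candidate feature to test against model accuracy rather than as ground truth.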

Data set

This analysis is based on the M5 Forecasting dataset of Walmart store sales records (Link).

Code

  1. Create a folder named Data in the directory where the notebook is located.
  2. Download all the files of the Kaggle forecasting competition (Link).
  3. Launch the notebook.
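Before launching the notebook, the setup steps above can be checked with a short script. This is a sketch under assumptions: the file names listed are those of the M5 Forecasting Accuracy competition on Kaggle as far as I know, the notebook may only read some of them, and the helper name `missing_files` is hypothetical. It also assumes the downloaded files are placed inside the Data folder.

```python
from pathlib import Path

# Assumed file names from the M5 Forecasting Accuracy competition download;
# adjust the list to whatever the notebook actually reads.
EXPECTED_FILES = [
    "calendar.csv",
    "sell_prices.csv",
    "sales_train_validation.csv",
    "sales_train_evaluation.csv",
    "sample_submission.csv",
]


def missing_files(data_dir, expected=EXPECTED_FILES):
    """Create data_dir if needed and return the expected files not found in it."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    return [name for name in expected if not (data_dir / name).exists()]


if __name__ == "__main__":
    # Run from the directory that contains the notebook.
    missing = missing_files("Data")
    if missing:
        print("Missing files in Data:", ", ".join(missing))
    else:
        print("All expected files found in Data.")
```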

About me 🤓

Senior Supply Chain and Data Science consultant with international experience working on Logistics and Transportation operations.
For consulting or advice on analytics and sustainable supply chain transformation, feel free to contact me via Logigreen Consulting.

Please have a look at my personal blog: Personal Website