*Forecasting the sales: Count-Based Sales prediction with regularized Poisson Regression*
Important notice:
The final model, which went into production, was created using train/test split, autoregressive part (AR), as well as custom α for each UPC, so that the regularization strength was chosen based on a minimalization of MAE. The presented model is a reduced version calculated on a synthetic dataset.
Project Overview:
The aim was to create and implement a predictive model that can forecast the number of items sold for a period of 8 weeks ahead. The model is constructed using a Poisson regression, which is trained on a synthetic dataset with simple hard-coded L2 regularization.
Data Generation:
The synthetic dataset is generated using a custom function that generates random values for UPC, date, and the target variable y. The number of rows per UPC and the total number of unique UPCs are specified using random values to ensure variability.
Model Training:
For each unique UPC, a Poisson regression model was trained on the whole dataset. The model was initially fit without L2 regularization, and the null deviance was calculated using a constant term, for comparison purposes
Conclusion:
The resulting model can predict future values of the target variable with high accuracy. L2 Regularization can significantly reduced the forecasting error, even when the total number of predictors is low.