This repository contains a weather forecasting project based on the Kaggle dataset available at "https://www.kaggle.com/datasets/balabaskar/historical-weather-data-of-all-country-capitals".
The main code is split across three notebooks:
In the notebook 1-eda.ipynb, I perform exploratory data analysis on the data:
- Check for missing values and data types
- Filter the weather data to focus on a particular city
- Time plots, seasonal plots, and ACF plots
- Power spectral decomposition
- Trend-cycle and seasonal decomposition using statsmodels
- Measure signal forecastability using trend and seasonal strength and Shannon spectral entropy
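As an illustration of the last item, here is a minimal sketch of Shannon spectral entropy computed from a raw periodogram with NumPy. It is not the notebook's code: the function name `spectral_entropy` and the choice to normalise by the log of the number of frequency bins are assumptions made for this example. Values near 0 suggest a signal dominated by a few frequencies (easier to forecast), and values near 1 suggest a noise-like signal.

```python
import numpy as np


def spectral_entropy(x):
    """Normalised Shannon entropy of the periodogram, between 0 and 1.

    Hypothetical helper for illustration; not taken from the notebook.
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                       # remove the mean (zero-frequency part)
    power = np.abs(np.fft.rfft(x)) ** 2    # raw periodogram
    power = power[1:]                      # drop the zero-frequency bin
    total = power.sum()
    if total == 0:
        return 0.0                         # constant series carries no spectrum
    p = power / total                      # treat the spectrum as a distribution
    p = p[p > 0]                           # avoid log(0)
    return float(-(p * np.log(p)).sum() / np.log(len(power)))


# A pure seasonal signal versus white noise (synthetic data, for illustration).
t = np.arange(512)
seasonal = np.sin(2 * np.pi * 8 * t / 512)
noise = np.random.default_rng(0).normal(size=512)
print(spectral_entropy(seasonal), spectral_entropy(noise))
```

The sine wave concentrates almost all power in one frequency bin, so its entropy is close to zero, while white noise spreads power across all bins.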
The notebook 2-models.ipynb creates a simple naive seasonal model for making weather forecasts using the sktime package.
The following approach is used:
- Creating a data pipeline
- Training and performing time-series cross-validation on the Naive seasonal model
- Saving the pipeline for later evaluation
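The idea behind the naive seasonal model and time-series cross-validation can be sketched in plain NumPy. This is not the sktime pipeline from the notebook (sktime's `NaiveForecaster` with `strategy="last"` and a seasonal period `sp` is the library counterpart); the function names, the expanding-window scheme, and the use of mean absolute error per fold are assumptions made for this example.

```python
import numpy as np


def seasonal_naive_forecast(y_train, horizon, sp):
    """Repeat the last observed season of length `sp` over `horizon` steps."""
    y_train = np.asarray(y_train, dtype=float)
    last_season = y_train[-sp:]
    reps = int(np.ceil(horizon / sp))
    return np.tile(last_season, reps)[:horizon]


def expanding_window_cv(y, initial_window, horizon, step, sp):
    """Mean absolute error of each fold, growing the training set every fold.

    Assumes initial_window >= sp so a full season is always available.
    """
    y = np.asarray(y, dtype=float)
    errors = []
    for end in range(initial_window, len(y) - horizon + 1, step):
        train, test = y[:end], y[end:end + horizon]
        pred = seasonal_naive_forecast(train, horizon, sp)
        errors.append(float(np.mean(np.abs(test - pred))))
    return errors


# Synthetic series with a period of 4, for illustration only.
y = np.tile([1.0, 2.0, 3.0, 4.0], 6)
forecast = seasonal_naive_forecast(y[:8], horizon=6, sp=4)
fold_errors = expanding_window_cv(y, initial_window=8, horizon=4, step=4, sp=4)
print(forecast, fold_errors)
```

Each fold trains only on observations that precede its test window, which is what distinguishes time-series cross-validation from ordinary shuffled k-fold.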
Notebook 3-eval.ipynb evaluates the model from the previous notebook and benchmarks it on the test set across several metrics.
The evaluation consists of the following steps:
- Time plot of forecasts against actual values, with prediction intervals
- Point forecast evaluation using the MASE (Mean Absolute Scaled Error) metric
- Quantile evaluation using the pinball loss metric
- Prediction interval evaluation using the Winkler score
- Forecast distribution evaluation using the CRPS (Continuous Ranked Probability Score) metric
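For reference, the four metrics listed above can be written down compactly with NumPy. This is a sketch of the standard definitions, not the notebook's implementation (which may rely on sktime's metric classes); all function names and the sample-based CRPS estimator are assumptions made for this example.

```python
import numpy as np


def mase(y_true, y_pred, y_train, sp=1):
    """Forecast MAE divided by the in-sample MAE of the (seasonal) naive forecast."""
    y_true, y_pred, y_train = (
        np.asarray(a, dtype=float) for a in (y_true, y_pred, y_train)
    )
    scale = np.mean(np.abs(y_train[sp:] - y_train[:-sp]))
    return float(np.mean(np.abs(y_true - y_pred)) / scale)


def pinball_loss(y_true, q_pred, alpha):
    """Quantile loss for a predicted alpha-quantile."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(q_pred, dtype=float)
    return float(np.mean(np.maximum(alpha * diff, (alpha - 1) * diff)))


def winkler_score(y_true, lower, upper, alpha):
    """Interval width plus 2/alpha times the miss distance, for a (1 - alpha) interval."""
    y_true, lower, upper = (
        np.asarray(a, dtype=float) for a in (y_true, lower, upper)
    )
    below = np.maximum(lower - y_true, 0.0)   # how far the truth falls under the interval
    above = np.maximum(y_true - upper, 0.0)   # how far the truth falls over the interval
    return float(np.mean((upper - lower) + (2.0 / alpha) * (below + above)))


def crps_from_samples(y_true, samples):
    """Sample-based CRPS for one observation: E|X - y| - 0.5 * E|X - X'|."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y_true))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return float(term1 - term2)


# Toy values, for illustration only.
print(mase([6, 7], [5, 9], y_train=[1, 2, 3, 4, 5]))
print(pinball_loss([10], [8], alpha=0.9))
print(winkler_score([5], [0], [4], alpha=0.1))
print(crps_from_samples(1.0, [0.0, 2.0]))
```

Lower is better for all four. A MASE below 1 means the forecast beats the in-sample naive benchmark on average.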