The bank dataset is not provided here because it is too large. It needs to be extracted into this folder as 'Datathon_sample_final1.csv'.
The first step is to clean the raw data, so the notebook 'RFB - Data Cleaning.ipynb' should be run first. It generates a new dataset: 'RFB - Clean Data.csv'
The next step is to load external data obtained from the Statistical Office of the Republic of Serbia and the Open Data Portal of the Republic of Serbia:
- 'RFB - Potrosacke Cene Indeksi.txt'
- 'RFB - Strukturna Imovina.txt'
and the synthetic label from:
- 'RFB - Prvi Zajam.txt'
NOTE: Change the .txt extension of these files to .csv (they are stored as .txt because of version control issues).
This step generates a new dataset: 'RFB - Merged Data.csv'
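
The renaming described in the NOTE above can be done by hand, or with a small script such as the sketch below. It is only an illustration: the helper name `rename_txt_to_csv` is hypothetical, and it assumes the three listed files sit in the folder passed to it (the current folder by default).

```python
from pathlib import Path

# File names are copied from the list above.
# Their location (the current folder by default) is an assumption.
EXTERNAL_FILES = [
    "RFB - Potrosacke Cene Indeksi.txt",
    "RFB - Strukturna Imovina.txt",
    "RFB - Prvi Zajam.txt",
]


def rename_txt_to_csv(folder="."):
    """Rename each listed .txt file to .csv and return the new paths.

    Hypothetical helper: files that are missing, or whose .csv
    counterpart already exists, are skipped.
    """
    renamed = []
    for name in EXTERNAL_FILES:
        src = Path(folder) / name
        dst = src.with_suffix(".csv")
        if src.exists() and not dst.exists():
            src.rename(dst)
            renamed.append(dst)
    return renamed


if __name__ == "__main__":
    for path in rename_txt_to_csv():
        print(f"renamed to {path}")
```

Running the script a second time changes nothing, because the .txt files no longer exist after the first run.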
A simple EDA is described in the notebook 'RFB - EDA.ipynb'.
To train and test the ML models, run the following notebooks:
- RFB - Classification Random Forest.ipynb
- RFB - Classification Random LightGBM.ipynb