Using machine learning, predict which companies will default on their loans and explain how different features impact the predictions.
- This was a take-home assessment for a job interview.
- Data processing, with a special focus on handling class imbalance using SMOTE and random undersampling
- Exploratory data analysis
- Feature engineering
- Model building and evaluation, using a performance metric suited to this binary classification task: the F2 score, which weights recall more heavily than precision
- Model tuning and optimal threshold identification
- Model performance comparison
- Feature importance
- Test data prediction
- code.ipynb: Notebook for the complete ML pipeline
- data: Folder containing raw train and test data
- output.csv: Predictions for the test data
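
The F2 metric and the threshold identification step listed above can be illustrated with a minimal sketch. It is not taken from `code.ipynb`: the function names `fbeta` and `best_threshold` and the toy labels and scores are hypothetical, and the notebook may well use a library implementation instead. The sketch assumes label 1 means "default" and that scores are predicted default probabilities on a held-out validation set.

```python
from typing import Sequence, Tuple


def fbeta(y_true: Sequence[int], y_pred: Sequence[int], beta: float = 2.0) -> float:
    """F-beta score for binary labels (1 = default, 0 = no default).

    With beta = 2, recall counts more heavily than precision.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    b2 = beta ** 2
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def best_threshold(
    y_true: Sequence[int], scores: Sequence[float], beta: float = 2.0
) -> Tuple[float, float]:
    """Try each distinct score as a cut-off and keep the one with the highest F-beta."""
    best_thr, best_score = 0.5, -1.0
    for thr in sorted(set(scores)):
        preds = [1 if s >= thr else 0 for s in scores]
        score = fbeta(y_true, preds, beta)
        if score > best_score:
            best_thr, best_score = thr, score
    return best_thr, best_score


# Toy validation data (made up for illustration only).
y_val = [0, 0, 1, 0, 1, 1, 0, 1]
p_val = [0.1, 0.3, 0.35, 0.4, 0.55, 0.6, 0.7, 0.9]

default_preds = [1 if s >= 0.5 else 0 for s in p_val]
f2_default = fbeta(y_val, default_preds)      # F2 at the usual 0.5 cut-off
thr, f2_tuned = best_threshold(y_val, p_val)  # F2 at the tuned cut-off
print(f"F2 at 0.5: {f2_default:.3f}; best threshold {thr} gives F2 {f2_tuned:.3f}")
```

On this toy data the tuned threshold falls below 0.5, which is the typical direction when recall is prioritized: lowering the cut-off flags more companies as likely defaulters at the cost of more false positives. The threshold should be chosen on validation data rather than the test set, so that the reported test performance stays unbiased.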