This project aims to maximize predictive performance by applying a range of preprocessing techniques and blind (domain-agnostic) data engineering.
The first step was to project the data with UMAP (Uniform Manifold Approximation and Projection) to understand its structure and how well the classes separate. Missing values were then imputed, and generic data engineering techniques were applied. Finally, features were selected using mutual information.
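As a rough illustration of the feature selection step, here is a minimal from-scratch sketch of mutual information scoring for discrete features. It is not the notebook's code: the function names (`mutual_information`, `select_top_k`) and the toy columns are hypothetical, and a real pipeline would more likely rely on a library routine such as scikit-learn's `mutual_info_classif`, which also handles continuous features.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))  # counts of (feature value, label) pairs
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (count_x * count_y)
        mi += p_xy * math.log(count * n / (feature_counts[x] * label_counts[y]))
    return mi


def select_top_k(columns, labels, k):
    """Return the names of the k columns with the highest mutual information."""
    scores = {name: mutual_information(values, labels) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data, for illustration only.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
columns = {
    "informative": [0, 0, 1, 1, 0, 1, 0, 1],  # identical to the labels
    "partial": [0, 0, 1, 1, 1, 0, 0, 1],      # agrees with the labels on 6 of 8 rows
    "noise": [0, 1, 0, 1, 0, 0, 1, 1],        # statistically independent of the labels
}

selected = select_top_k(columns, labels, k=2)
print(selected)
```

On this toy data the independent column scores zero and is dropped, while the column identical to the labels scores the maximum possible value for a balanced binary target (ln 2).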
Several models were evaluated, and XGBoost emerged as the top-performing model for prediction.
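One common way to compare candidate models is k-fold cross-validation. The sketch below assumes that approach; the evaluation protocol actually used is described in the notebook, not here. The two toy classifiers, the helper names, and the data are hypothetical stand-ins, and any object exposing `fit` and `predict` methods (for example a gradient-boosted classifier) could in principle be plugged into the same harness.

```python
from collections import Counter


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_val_accuracy(make_model, X, y, k=4):
    """Mean held-out accuracy over k folds for a freshly built model per fold."""
    scores = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = make_model()
        model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = model.predict([X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(correct / len(test_idx))
    return sum(scores) / len(scores)


class MajorityClass:
    """Baseline: always predict the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_] * len(X)


class NearestNeighbor:
    """1-nearest-neighbour classifier using squared Euclidean distance."""

    def fit(self, X, y):
        self.X_, self.y_ = list(X), list(y)
        return self

    def predict(self, X):
        preds = []
        for row in X:
            dists = [sum((a - b) ** 2 for a, b in zip(row, ref)) for ref in self.X_]
            preds.append(self.y_[dists.index(min(dists))])
        return preds


# Hypothetical toy data: two well-separated clusters, interleaved so that
# every contiguous fold contains both classes.
X = [[0.0, 0.1], [5.0, 5.1], [0.2, 0.0], [5.2, 4.9],
     [0.1, 0.3], [4.8, 5.0], [0.3, 0.2], [5.1, 5.3]]
y = [0, 1, 0, 1, 0, 1, 0, 1]

candidates = {"majority": MajorityClass, "1-nn": NearestNeighbor}
results = {name: cross_val_accuracy(factory, X, y, k=4) for name, factory in candidates.items()}
best = max(results, key=results.get)
print(results, best)
```

Building a new model inside each fold keeps the held-out rows unseen during fitting, which is what makes the scores comparable across candidates.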
To understand the model's weaknesses, its incorrect predictions were examined with several techniques to identify why they occurred (details can be found in the "main.ipynb" file).
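A simple starting point for this kind of error analysis is to tabulate which classes get confused with which and to pull out the misclassified rows for inspection. The sketch below is only an assumption about what such a first pass might look like; the helper names and the toy labels are hypothetical, and the notebook may use different techniques.

```python
from collections import Counter


def confusion_counts(y_true, y_pred):
    """Count how often each (true label, predicted label) pair occurs."""
    return Counter(zip(y_true, y_pred))


def per_class_error_rate(y_true, y_pred):
    """Fraction of samples of each true class that were predicted incorrectly."""
    totals = Counter(y_true)
    errors = Counter(t for t, p in zip(y_true, y_pred) if t != p)
    return {label: errors[label] / totals[label] for label in totals}


def misclassified_indices(y_true, y_pred):
    """Row indices where the prediction differs from the true label."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p]


# Hypothetical labels and predictions, for illustration only.
y_true = ["a", "a", "b", "b", "b", "c"]
y_pred = ["a", "b", "b", "b", "a", "c"]

confusion = confusion_counts(y_true, y_pred)
error_rates = per_class_error_rate(y_true, y_pred)
wrong_rows = misclassified_indices(y_true, y_pred)
print(confusion, error_rates, wrong_rows)
```

The returned indices can then be used to look up the original feature rows and check whether the errors cluster around particular feature values or imputed entries.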
See the "main.ipynb" file for the full analysis and implementation details.
" Won first place btw 🥱 "