Using data from Taarifa and the Tanzanian Ministry of Water, the goal is to predict which pumps are functional, which need some repairs, and which don't work at all. A better understanding of which waterpoints will fail can improve maintenance operations and help ensure that clean, potable water is available to communities across Tanzania. Competition page: https://www.drivendata.org/competitions/7/pump-it-up-data-mining-the-water-table/page/23/
This repo contains my EDA, data cleaning, feature engineering, and modelling work for this DrivenData competition, where I currently hold a top 4% score (0.8238 out of 1). It also includes all the experiments and iterations I ran before reaching my final model.