This repository houses the ml-poverty-prediction project, which focuses on predicting poverty and malnutrition prevalence with different machine learning techniques. The project is based on the research paper "Multivariate Random Forest Prediction of Poverty and Malnutrition Prevalence" (research paper). However, our approach diverges by employing a variety of machine learning (ML) techniques to potentially improve prediction accuracy and model effectiveness.
A synthesized version of the paper is also available as a research article, which can be accessed here (research article).
The primary objective of this project is to develop an ML model that can accurately predict poverty and malnutrition prevalence, using the data from the aforementioned research but exploring different ML methodologies.
- Validation and Enhancement: We aim to validate the findings of the original research and seek ways to enhance the accuracy of poverty and malnutrition predictions.
- Model Comparison: A key goal is to compare the effectiveness of various ML models in predicting poverty and malnutrition prevalence, thus contributing to the broader field of socio-economic predictive analytics.
In the initial phase of our project, we are adopting a unique approach to model development. This involves:
- Learning Experience: We are intentionally developing our initial machine learning models without consulting the research paper titled "Multivariate Random Forest Prediction of Poverty and Malnutrition Prevalence". The purpose of this approach is to foster an unbiased learning and discovery process, allowing our team to explore and test various methodologies based on our existing knowledge and hypotheses.
- Model Development: Our team will develop and train machine learning models using the available data, focusing on predicting poverty and malnutrition prevalence. We aim to explore a range of algorithms and techniques, distinct from those used in the aforementioned research paper.
Once the initial models are developed and evaluated, our methodology will evolve to include:
- Research Integration: After the initial development phase, we will carefully study the "Multivariate Random Forest Prediction of Poverty and Malnutrition Prevalence" paper. This will provide us with new insights, methodologies, and potential improvements that could be applied to our models.
- Model Refinement: Leveraging the knowledge gained from the paper, we will refine, adjust, and possibly retrain our models. This phase is critical for integrating best practices, novel techniques, and insights gleaned from the paper into our existing framework.
- Comparative Analysis: The outcomes of our models will be compared against the findings in the research paper. This will not only validate our initial models but also help in understanding the effectiveness of different machine learning approaches in predicting poverty and malnutrition.
The rationale behind this two-phase approach is to enhance the learning experience of our team, encourage innovative thinking, and ultimately develop a robust predictive model that can be benchmarked against established research. This method ensures a comprehensive understanding and application of machine learning techniques in the realm of socioeconomic predictions.
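As a rough illustration of the comparative analysis step, the sketch below scores two candidate models on the same held-out data and ranks them. Everything in it is an assumption made for illustration: the data is synthetic, the two models (a mean baseline and a one-feature least-squares line) are simple stand-ins rather than the models used in this project or in the paper, and RMSE is just one possible metric.

```python
import math
import random
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(statistics.fmean(squared_errors))


def fit_mean_baseline(xs, ys):
    """Stand-in model 1: always predict the training mean."""
    mean_y = statistics.fmean(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Stand-in model 2: one-feature ordinary least squares."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


# Synthetic placeholder data (NOT the paper's dataset): a noisy linear trend.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(100)]
ys = [2.0 * x + 1.0 + rng.gauss(0, 1) for x in xs]

# Hold out the last 20 points; the data was generated in random order.
x_train, x_test = xs[:80], xs[80:]
y_train, y_test = ys[:80], ys[80:]

# Fit every candidate on the same training data, score on the same test data.
candidates = {"mean_baseline": fit_mean_baseline, "linear": fit_linear}
results = {}
for name, fit in candidates.items():
    model = fit(x_train, y_train)
    results[name] = rmse(y_test, [model(x) for x in x_test])

best = min(results, key=results.get)
for name, score in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name}: RMSE = {score:.3f}")
print("best:", best)
```

The point of the pattern is that every candidate sees an identical split and metric, so a published result can be added as one more row in the comparison as long as it was measured the same way.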
For further details or inquiries about our research approach, feel free to reach out as outlined in the Contact Information section.
(Here, you would detail the types of data used, how you plan to gather, process, and utilize this data. Be specific about data sources, preprocessing steps, and how this data will be split for training and testing purposes.)
Data from the paper here.
- Preprocessing steps
- Train test split
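The two steps listed above could be sketched as follows, using only the Python standard library. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the records are synthetic, the field names `id` and `feature` are hypothetical, and the 80/20 ratio and standardization step are placeholder choices.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and return (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Return (mean, std) computed from the given values."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values) or 1.0  # guard against zero variance
    return mean, std


def scale(values, mean, std):
    """Standardize values with a previously fitted mean and std."""
    return [(v - mean) / std for v in values]


# Synthetic placeholder records (NOT the paper's data); field names are hypothetical.
rng = random.Random(0)
rows = [{"id": i, "feature": rng.uniform(0, 100)} for i in range(50)]

train, test = train_test_split(rows, test_fraction=0.2, seed=42)

# Fit the scaler on the training split only, then apply it to both splits,
# so that no information from the test set leaks into preprocessing.
mean, std = fit_scaler([r["feature"] for r in train])
train_scaled = scale([r["feature"] for r in train], mean, std)
test_scaled = scale([r["feature"] for r in test], mean, std)

print(len(train), len(test))  # prints: 40 10
```

Fixing the seed keeps the split reproducible across runs, which matters when several models are later compared on the same held-out data.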
(Describe the different ML techniques you plan to use. This section should detail why these methods were chosen, how they differ from the original research paper's methodology, and their expected impact on the project's goals.)
- AutoML libraries
- Quantum ML libraries
- Neural nets
(Identify potential challenges in data gathering, model development, and implementation. Also, discuss the strategies you will employ to overcome these challenges.)
(Explain how you will measure the success of the project. Include metrics for both technical performance, such as model accuracy, and real-world impact, such as improvements in predicting poverty and malnutrition.)
(Outline any potential future developments or extensions of this project. How could this work be scaled, or what other areas could it potentially impact?)
- Zotero + Elicit - literature review
- Obsidian - notes
- Eraser - graphs
- GitHub - repo
- Gitpod - cloud dev env
Linux - Python - Kedro - Docker etc
kedro run
kedro viz run
kedro jupyter lab --ServerApp.allow_remote_access=True
pip install kedro-docker
kedro docker init
kedro docker build
kedro docker run
kedro docker cmd --docker-args="-p=4141:4141" kedro viz --host=0.0.0.0
- The Docker image contains only the pipeline, not the data, so you will not see any data inside the container.
(Provide guidelines on how others can contribute to your project. This may include instructions for submitting issues, pull requests, and contact information for direct communication.)
(State the license under which your project is released, if applicable.)
(Give credit to individuals, organizations, or papers that have contributed significantly to your project.)
(Provide your contact information or that of the main contributors for further inquiries or collaboration.)