This dataset contains georeferenced crop production (yield) and management data from smallholder farmers across Kenya, Rwanda, Uganda, Burundi, Malawi, and Tanzania for the years 2016 to 2020. All relevant metadata and variable descriptions are included within the datasheets. Your assignment is to ingest the data into either R or Python, run a descriptive analysis, and run a predictive analysis of the factors driving crop production. Your approach should use at least one of the following: geospatial analysis or non-parametric methods (e.g., PCA, random forests, clustering). You will be given a cumulative time limit of three hours. You are not expected to produce conclusive results within this timeframe; rather, the selection committee will use your submission to evaluate your data processing skills and analytical approach. You will be required to submit the following:
• All code
• A written description of your process and approach, including why you chose the approaches you used
• A summary of any findings, even if preliminary
• Recommendations on any potential next steps for analysis
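
As a starting point for the ingestion and descriptive steps, a minimal Python sketch might look like the following. It uses only the standard library. The column names (country, year, yield_kg_ha) and the inline sample rows are placeholders invented for illustration, not values from the actual datasheets, so they should be replaced with the names given in the dataset's metadata.

```python
import csv
import io
import statistics
from collections import defaultdict

# Placeholder rows invented purely for illustration. The real file and its
# column names (country, year, yield_kg_ha) are assumptions.
SAMPLE_CSV = """country,year,yield_kg_ha
Kenya,2016,1200
Kenya,2017,1500
Rwanda,2016,900
Rwanda,2017,
Uganda,2018,1100
"""


def load_rows(handle):
    """Parse CSV records, skipping any with a missing or non-numeric yield."""
    rows = []
    for record in csv.DictReader(handle):
        try:
            yield_value = float(record["yield_kg_ha"])
        except (TypeError, ValueError):
            continue  # drop the record rather than guess a value
        rows.append(
            {
                "country": record["country"],
                "year": int(record["year"]),
                "yield_kg_ha": yield_value,
            }
        )
    return rows


def summarize_by_country(rows):
    """Return the count, mean, and median yield for each country."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["country"]].append(row["yield_kg_ha"])
    return {
        country: {
            "n": len(values),
            "mean": statistics.mean(values),
            "median": statistics.median(values),
        }
        for country, values in grouped.items()
    }


rows = load_rows(io.StringIO(SAMPLE_CSV))
summary = summarize_by_country(rows)
for country, stats in sorted(summary.items()):
    print(country, stats)
```

To run this against a real file, pass an open file handle to load_rows instead of the in-memory sample. Dropping incomplete records is only one option for handling missing values, and whichever option is chosen is worth justifying in the written description.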