This GitHub repository hosts the project report and analysis code on investigating lifestyle habits and medical conditions influencing diabetes prevalence. Utilizing data from the CDC's Behavioral Risk Factor Surveillance System survey, the project explores correlations and predictive models for diabetes risk.

sofia-rajan/Diabetes-Predictive-Modeling

Diabetes-Predictive-Modeling

Lifestyle Habits and Medical Conditions Influencing Diabetes

This repository contains the project report and analysis code, focusing on the influence of lifestyle habits and medical conditions on diabetes prevalence.

Project Overview:

Diabetes is a significant chronic health condition in the United States, impacting millions of individuals annually and posing substantial economic burdens. Understanding the factors contributing to diabetes prevalence is crucial for effective prevention and intervention strategies. This project explores the relationship between lifestyle habits, medical conditions, and diabetes prevalence using data from the Behavioral Risk Factor Surveillance System (BRFSS) survey conducted by the CDC.

Key Topics Covered:

  1. Introduction: Provides an overview of diabetes and its impact on public health.
  2. Motivation: Discusses the need for innovative approaches to address diabetes prevalence and associated risks.
  3. Data Description: Describes the dataset used, its sources, and key variables.
  4. Data Preprocessing: Details the steps taken to clean, preprocess, and prepare the dataset for analysis.
  5. Exploratory Data Analysis (EDA): Visualizes and analyzes data patterns, correlations, and distributions to gain insights.
  6. Models Used and Performance Evaluation: Outlines the models employed (Logistic Regression, Decision Tree, K-Nearest Neighbors) and evaluates their performance in predicting diabetes risk.
  7. Conclusion: Summarizes findings, model preferences, computational considerations, and suggestions for further exploration.
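As a rough illustration of the model comparison described in item 6, the sketch below trains the three named model families with scikit-learn and reports test accuracy. It is not the project's actual code: the data is a synthetic stand-in generated with `make_classification` rather than the BRFSS survey, and the hyperparameters (`max_depth=5`, `n_neighbors=15`), the class balance, and the 75/25 split are assumptions chosen only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced binary data standing in for the survey features
# (hypothetical; the real project uses BRFSS data).
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.85, 0.15],
    random_state=0,
)

# Hold out a stratified test set so both classes appear in each split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Distance- and gradient-based models get feature scaling; the tree does not need it.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "K-Nearest Neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=15)
    ),
}

# Fit each model and record its accuracy on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = model.score(X_test, y_test)

for name, acc in results.items():
    print(f"{name}: test accuracy = {acc:.3f}")
```

On imbalanced data such as diabetes labels, accuracy alone can look good for a model that rarely predicts the positive class, so recall or precision on that class may be worth checking alongside it.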

Conclusion:

The project offers insight into how lifestyle habits and medical conditions relate to diabetes prevalence. The three models reached similar test accuracies, so logistic regression and decision tree models were preferred for their interpretability and computational efficiency. K-Nearest Neighbors was effective but became computationally expensive on larger datasets. Further work, such as ensemble methods or hyperparameter tuning, could improve predictive performance and give deeper insight into diabetes risk factors. Overall, the project contributes to the understanding of diabetes risk factors, which can inform prevention efforts and public health outcomes.
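To show why K-Nearest Neighbors grows expensive as the dataset grows, here is a minimal brute-force sketch in plain Python. It is an illustration only, not the project's implementation: the function name `knn_predict` and the toy (BMI, age) rows are hypothetical. The point is that every single prediction measures the distance to every training row, so prediction cost scales with the size of the training set.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Brute-force k-NN: compute the distance from `query` to every training row."""
    distances = [
        (math.dist(row, query), label) for row, label in zip(train_X, train_y)
    ]
    # Sort by distance and keep the labels of the k closest rows.
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    # Majority vote among the k nearest neighbors.
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data: (BMI, age) pairs; 1 = diabetes, 0 = no diabetes.
train_X = [(22.0, 30), (24.5, 35), (31.0, 55), (33.5, 60), (29.0, 50), (21.0, 25)]
train_y = [0, 0, 1, 1, 1, 0]

prediction = knn_predict(train_X, train_y, (32.0, 58), k=3)
print(prediction)
```

Logistic regression and decision trees, by contrast, do their work at training time and predict from a fixed set of coefficients or splits, which is consistent with the efficiency preference stated above.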

For detailed information and analysis code, please refer to the project documentation and code available in this repository.
