A machine learning model for diabetes prediction, built from diagnostic data with algorithms such as logistic regression and decision trees. The models are evaluated against each other, and the top-performing one is deployed for diabetes prediction.
Welcome to the Diabetes Prediction Project repository! This project focuses on utilizing machine learning techniques to predict diabetes outcomes based on diagnostic measurements. By leveraging data preprocessing, exploratory analysis, and model evaluation, this project aims to contribute insights into predictive modeling in the healthcare domain.
- Dataset
- Exploratory Data Analysis
- Model Training and Evaluation
- Results and Metrics
- Contributors
- Tech Stack
- License
The dataset for this project originates from the National Institute of Diabetes and Digestive and Kidney Diseases. It contains the following features:
- Pregnancies: Number of times pregnant
- Glucose: Plasma glucose concentration 2 hours after an oral glucose tolerance test
- BloodPressure: Diastolic blood pressure (mm Hg)
- SkinThickness: Triceps skinfold thickness (mm)
- Insulin: 2-hour serum insulin (mu U/ml)
- BMI: Body mass index (weight in kg / (height in m)^2)
- DiabetesPedigreeFunction: Diabetes pedigree function
- Age: Age in years
The target variable, Outcome, indicates the presence (1) or absence (0) of diabetes.
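As a rough illustration of the schema described above, the sketch below checks that a table has the eight feature columns plus Outcome and splits it into features and target. The helper name `split_features_target` and the two sample rows are made up for illustration and are not part of this repository or its data.

```python
import pandas as pd

# Column names as listed in the dataset description above.
FEATURE_COLUMNS = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]
TARGET_COLUMN = "Outcome"


def split_features_target(df: pd.DataFrame):
    """Hypothetical helper: validate the schema, then return (X, y)."""
    missing = [c for c in FEATURE_COLUMNS + [TARGET_COLUMN] if c not in df.columns]
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    if not df[TARGET_COLUMN].isin([0, 1]).all():
        raise ValueError("Outcome must contain only 0 and 1")
    return df[FEATURE_COLUMNS], df[TARGET_COLUMN]


# Two invented rows, used only to show the expected shape of the data.
sample = pd.DataFrame(
    [[2, 120, 70, 25, 80, 28.5, 0.45, 33, 0],
     [5, 160, 85, 30, 150, 34.1, 0.80, 51, 1]],
    columns=FEATURE_COLUMNS + [TARGET_COLUMN],
)
X, y = split_features_target(sample)
print(X.shape, y.tolist())
```

With the real data, `sample` would be replaced by something like `pd.read_csv("diabetes.csv")`, where the file name is an assumption.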
Exploratory Data Analysis (EDA) is performed using Python libraries such as pandas, numpy, matplotlib, and seaborn. EDA helps in understanding feature distributions, relationships, and potential patterns in the data.
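A minimal sketch of the kind of summaries EDA typically starts with is shown below. It uses synthetic stand-in data with a toy rule linking Glucose to Outcome, so the numbers it produces say nothing about the real dataset; only the column names are taken from the dataset description.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data (NOT the real dataset), seeded for repeatability.
rng = np.random.default_rng(0)
n = 200
glucose = rng.normal(120, 30, n)
df = pd.DataFrame({
    "Glucose": glucose,
    "BMI": rng.normal(32, 7, n),
    "Age": rng.integers(21, 80, n),
    # Toy rule: higher glucose (plus noise) makes a positive outcome more likely.
    "Outcome": (glucose + rng.normal(0, 20, n) > 130).astype(int),
})

# Per-column distribution summary (count, mean, std, quartiles).
summary = df.describe()

# Share of each class, to spot class imbalance.
class_balance = df["Outcome"].value_counts(normalize=True)

# Pearson correlation of each feature with the target, strongest first.
corr_with_outcome = df.corr()["Outcome"].drop("Outcome").sort_values(ascending=False)

# Feature means split by outcome, a quick view of group differences.
means_by_outcome = df.groupby("Outcome").mean()

print(corr_with_outcome)
```

The same quantities are usually also inspected visually, for example with `df.hist()` for distributions or `seaborn.heatmap(df.corr(), annot=True)` for the correlation matrix.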
Two machine learning algorithms are employed for prediction: Logistic Regression and Decision Tree. The models are trained on the dataset and evaluated using various metrics, including precision, recall, confusion matrix, ROC curve, and AUC score.
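The training and evaluation step could look roughly like the scikit-learn sketch below. It is not the project's actual code: the data comes from `make_classification` as a stand-in with eight features, and the hyperparameters, the 75/25 split, and the choice of AUC for picking the best model are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    confusion_matrix, precision_score, recall_score, roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in data with 8 features (assumption: replace with the real X, y).
X, y = make_classification(
    n_samples=500, n_features=8, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters here are illustrative, not tuned.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)               # hard 0/1 predictions
    y_score = model.predict_proba(X_test)[:, 1]  # probability of class 1
    results[name] = {
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_score),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }

# Assumption: "top-performing" is taken to mean highest test-set AUC.
best_model = max(results, key=lambda name: results[name]["auc"])
print(best_model, results[best_model]["auc"])
```

AUC is computed from predicted probabilities rather than hard labels, which is why `predict_proba` is used alongside `predict`. The points of the ROC curve itself can be obtained with `sklearn.metrics.roc_curve(y_test, y_score)`.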
The project provides insights into:
- Feature relationships through visualizations in EDA.
- Classification performance of Logistic Regression and Decision Tree models using precision, recall, confusion matrix, ROC curve, and AUC score.
This code is distributed under the terms of the MIT license.