This project develops a machine learning model that predicts diabetes using the K-Nearest Neighbors (KNN) algorithm, implemented from scratch rather than with a library KNN classifier. The model uses patient data such as blood glucose level, HbA1c level, BMI, and hypertension status to predict whether a person is likely to have diabetes.
- numpy
- pandas
- matplotlib
- seaborn
- scikit-learn
- scipy
Dataset source: Kaggle
Attribute Name | Attribute Name |
---|---|
gender | Age |
hypertension | heart_disease |
smoking_history | bmi |
HbA1c_level | blood_glucose_level |
diabetes | |
- Data Preprocessing: Cleaning the dataset, handling missing values, anomalies, and outliers, and encoding categorical and nominal features.
- Feature Selection: Identifying the features that contribute significantly to the prediction of diabetes.
- Model Training: Implementing the KNN algorithm from scratch.
- Model Evaluation: Assessing the performance of the model using appropriate metrics such as accuracy, precision, recall, and F1-score.
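
The training and evaluation steps above can be sketched with NumPy alone. This is a minimal illustration, not the notebook's actual code: the function names (`standardize`, `knn_predict`, `binary_metrics`), the Euclidean distance, the default `k=5`, and the toy data are all assumptions made for the example.

```python
import numpy as np


def standardize(train, test):
    """Scale features to zero mean and unit variance using training statistics only."""
    mean = train.mean(axis=0)
    std = train.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant columns
    return (train - mean) / std, (test - mean) / std


def knn_predict(X_train, y_train, X_test, k=5):
    """Predict a 0/1 label for each test row by majority vote of its k nearest neighbors."""
    # Pairwise Euclidean distances, shape (n_test, n_train).
    diffs = X_test[:, None, :] - X_train[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Indices of the k closest training rows for every test row.
    nearest = np.argsort(dists, axis=1)[:, :k]
    votes = y_train[nearest]  # shape (n_test, k)
    # Majority vote; an odd k avoids ties for binary labels.
    return (votes.mean(axis=1) > 0.5).astype(int)


def binary_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1-score for 0/1 labels."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(y_true)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy data (made up for illustration): two well-separated clusters.
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                    [5.0, 5.0], [5.2, 4.8], [4.9, 5.1]])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[1.1, 1.0], [5.1, 5.0]])
y_test = np.array([0, 1])

X_train_s, X_test_s = standardize(X_train, X_test)
preds = knn_predict(X_train_s, y_train, X_test_s, k=3)
metrics = binary_metrics(y_test, preds)
print(preds, metrics)
```

Scaling matters here because KNN compares raw distances, so a feature with a large numeric range (for example blood_glucose_level) would otherwise dominate one with a small range (for example HbA1c_level). The broadcasted distance computation holds an array of size n_test × n_train × n_features in memory, so on a large dataset the test rows would likely need to be processed in batches.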
- Clone the repository:
git clone https://github.com/DikkiKartajaya/DiabetesPrediction_KNN.git
- Install the required dependencies:
pip install -r requirement.txt
- Run the Jupyter notebook DiabetesPrediction_KNN.ipynb to train and evaluate the KNN model.
Contributions to the project are welcome! If you have any suggestions for improvement, feature requests, or bug reports, please feel free to open an issue or submit a pull request.