This project builds a machine learning model that predicts whether patients in the dataset have diabetes, based on a set of diagnostic measurements.
The dataset used in this project was downloaded from Kaggle and originally comes from the National Institute of Diabetes and Digestive and Kidney Diseases. All patients in it are females at least 21 years old of Pima Indian heritage. (Acknowledgements: Smith, J.W., Everhart, J.E., Dickson, W.C., Knowler, W.C., & Johannes, R.S. (1988). Using the ADAP learning algorithm to forecast the onset of diabetes mellitus. In Proceedings of the Symposium on Computer Applications and Medical Care (pp. 261-265). IEEE Computer Society Press.)
Azure Machine Learning Designer was used throughout the project to build a "Two-class Logistic Regression" model. A real-time inference endpoint was then created to return predictions for a provided set of diagnostic measurements.
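
As a rough illustration of how a client might prepare a call to such a real-time endpoint, here is a minimal Python sketch. It rests on assumptions that this README does not confirm: the column names are those of the Kaggle Pima Indians Diabetes dataset, the request body follows the `Inputs` / `WebServiceInput0` / `GlobalParameters` layout commonly seen with Designer endpoints, and the endpoint uses key-based bearer authentication. The helper names `build_payload` and `build_request` are hypothetical, and the actual schema, scoring URI and key should be taken from the deployed endpoint's "Consume" information.

```python
import json
import urllib.request

# Assumed feature columns (Kaggle Pima Indians Diabetes dataset);
# they are not listed in this README, so verify them against the endpoint.
FEATURES = [
    "Pregnancies", "Glucose", "BloodPressure", "SkinThickness",
    "Insulin", "BMI", "DiabetesPedigreeFunction", "Age",
]


def build_payload(measurements):
    """Wrap one patient's measurements in the assumed Designer request schema."""
    missing = [name for name in FEATURES if name not in measurements]
    if missing:
        raise ValueError(f"missing measurements: {missing}")
    # Keep only the expected columns, in a fixed order.
    row = {name: measurements[name] for name in FEATURES}
    return {"Inputs": {"WebServiceInput0": [row]}, "GlobalParameters": {}}


def build_request(scoring_uri, api_key, measurements):
    """Prepare (but do not send) a POST request for the scoring endpoint."""
    body = json.dumps(build_payload(measurements)).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # assumes key-based auth
    }
    return urllib.request.Request(
        scoring_uri, data=body, headers=headers, method="POST"
    )
```

A caller would pass the prepared request to `urllib.request.urlopen` and parse the JSON response. The response fields depend on how the inference pipeline was configured, so they are not shown here.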
The trained model could serve as a useful tool for early diabetes screening of females at least 21 years old, particularly those of Pima Indian heritage.