The key concepts of the project are Predictive Data Analysis, Data Mining, Machine Learning and Big Data.
The main goal of the project is to use Machine Learning techniques on EMR datasets to identify which patient is likely to buy which medicine. It also estimates how likely a patient is to switch from one medicine to another, and determines whether the patient is a frequent buyer or not.
Prediction/Classification is done in two ways:
- Predicting Class of Drug : Random Forest gives the highest accuracy; LibSVM gives the lowest error.
  - Binary Class Drug Classification : Butrans / Opana
  - Ternary Class Drug Classification : Butrans / Opana / Both
- Predicting Type of Patient : Frequent Buyer / Infrequent Buyer : Random Forest gives the highest accuracy; LibSVM gives the lowest error.
Frequency is based on refill count: the median refill count is the threshold that separates frequent (F) buyers from infrequent (NF) buyers.
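The median split described above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the function name `label_buyers`, the patient IDs and the refill counts are all hypothetical, and the choice to treat a count equal to the median as NF is an assumption, since the text does not say how ties are handled.

```python
from statistics import median

def label_buyers(refill_counts):
    """Label each patient 'F' (frequent) or 'NF' (infrequent) by a median split."""
    threshold = median(refill_counts.values())
    # Assumption: strictly above the median is frequent; ties fall into NF.
    return {
        patient: ("F" if count > threshold else "NF")
        for patient, count in refill_counts.items()
    }

# Hypothetical patient IDs and refill counts, for illustration only.
example_counts = {"p1": 1, "p2": 2, "p3": 4, "p4": 7, "p5": 9}
labels = label_buyers(example_counts)
print(labels)
```

With an odd number of patients the median is one of the observed counts, so the tie rule decides that patient's class; switching `>` to `>=` would move such patients into F.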
- Weka
- ZeroR (Baseline)
- Naive Bayes
- Logistic Regression
- SVM (linear kernel)
- LibSVM (SVM using radial kernel)
- Random Forest
- Decision Tree
- Bagging on Decision Tree
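As a point of reference for the list above, ZeroR ignores every feature and always predicts the most common class in the training data, which is why it serves as the baseline the other classifiers have to beat. The sketch below shows that idea in plain Python rather than through Weka; the helper names `zero_r_fit` and `accuracy` and the label lists are hypothetical and are not taken from the project's data or results.

```python
from collections import Counter

def zero_r_fit(train_labels):
    """Return the majority class of the training labels."""
    return Counter(train_labels).most_common(1)[0][0]

def accuracy(predicted, actual):
    """Fraction of positions where the prediction matches the true label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Hypothetical labels, for illustration only.
train_labels = ["Butrans", "Butrans", "Opana", "Butrans", "Opana"]
test_labels = ["Butrans", "Opana", "Butrans", "Butrans"]

majority = zero_r_fit(train_labels)
# ZeroR predicts the same class for every test instance.
predictions = [majority] * len(test_labels)
baseline_accuracy = accuracy(predictions, test_labels)
print(majority, baseline_accuracy)
```

Any classifier whose accuracy does not exceed this baseline has learned nothing useful from the features.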
© Saurav Saha (SRvSaha), ACS Lab, 2016-17