Predicting Hospital Readmission of Diabetics | Medical Insurance Cost with EDA + OLS Regression | Predicting Customer Risk by Home Equity Loan | CLICK TO SEE MORE

cc59chong/Data-Analysis-and-Machine-Learning-with-R

Predicting Hospital Readmission of Diabetic Patients Details

Contents

  • Dataset Description
  • Dataset Preparation

Transform data types, handle missing values, recode and collapse features, categorize, remove outliers

  • Feature Selection

Boruta algorithm

  • Analytical Techniques

Split the dataset into training data and test data, Data balancing, k-Fold Cross-validation

  • Models and evaluation
  • Model Comparison
  • Conclusions

Medical insurance cost with EDA + OLS Regression Details

  • Ordinary Least Squares (OLS) Regression: Simple Linear Regression, Polynomial Regression, and Multiple Linear Regression
  • Exploring and Preparing the Data, Model Building, Variable Importance, Regression Assumptions, Improving Model Performance
  • Log Transformation, Outlier Removal Function, Multicollinearity Check, Statistical Interpretation

Predicting Customer Risk by Home Equity Loan Details

  • Data Pre-processing
  • Imbalanced Data (Decision Tree)

Undersampling, Oversampling, Both, SMOTE
Performance Metrics: Accuracy, Error Rate, Specificity, Precision, Recall (Sensitivity), F-Measure, ROC, AUC

  • Build Model (Logistic Regression)
  • Model Diagnostics

VIF, Cutoff, Misclassification Error, Confusion Matrix, Concordance
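
For quick reference, the confusion-matrix metrics listed above have the standard definitions below. The symbols TP, FP, TN, and FN (true/false positives and negatives) are notation assumed here, not taken from the repository, and the F-Measure is taken to be the balanced F1 score, which may differ from the variant used in the project.

```latex
\begin{aligned}
\text{Accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN}, &
\text{Error Rate}  &= 1 - \text{Accuracy}, \\
\text{Specificity} &= \frac{TN}{TN + FP}, &
\text{Precision}   &= \frac{TP}{TP + FP}, \\
\text{Recall}      &= \frac{TP}{TP + FN}, &
F_1                &= \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}.
\end{aligned}
```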

Health Insurance Data -- EDA/Managing (ggplot2) Details

  • Visually checking distributions for a single variable
  • Visually checking relationships between two variables
  • Plotting data with a rug
  • Cleaning data
  • Data Transformations

Analyzing Biological Streams using Multiple Linear Regression

  • Check the Correlation Coefficient (visualization): pairs.panels(), chart.Correlation(), ggpairs()
  • Logarithmic Transformation, Variable Importance, Regression Assumptions (Residuals, Cook's D)

Breast Cancer Wisconsin - Classification

  • Logistic Regression, Decision Trees, Conditional Inference Trees, Random Forest, Support Vector Machines, Choosing the best predictive solution
