Apply_Random_Forest_and_XGBoost_DonorsChoose-DataSet

Step-by-Step Procedure


  • Understanding the business/real-world problem

  • Loading the data

  • Preprocessing the data (based on the type of data: categorical, text, numerical)

  • Preprocessing includes removing outliers, imputing missing values, cleaning the data, etc.

  • Splitting the data into train, cross-validation (CV), and test sets

  • Vectorizing categorical data (one-hot encoding)

  • Vectorizing text data

  • Normalizing numerical features

  • Concatenating all feature types (categorical + text + numerical)

  • Hyperparameter tuning to find the best estimator (GridSearch)

  • Plotting the performance of the model using heatmaps

  • Training the Random Forest model using the best hyperparameters and plotting the AUC-ROC curve

  • Plotting the confusion matrix

  • Hyperparameter tuning to find the best estimator (RandomizedSearch)

  • Plotting the performance of the model using heatmaps

  • Training the XGBoost model using the best hyperparameters and plotting the AUC-ROC curve

  • Plotting the confusion matrix

  • Observations on overall model performance

  • Presenting the performance results in a table format
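
The steps above can be sketched end to end with scikit-learn. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the column names (`category`, `essay`, `price`) are hypothetical rather than the DonorsChoose schema, `TfidfVectorizer` and `StandardScaler` are assumed choices for text vectorization and normalization, and scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost so that the example has no extra dependency.

```python
import numpy as np
from scipy.sparse import csr_matrix, hstack
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Synthetic stand-in data (hypothetical columns, NOT the real dataset).
rng = np.random.default_rng(0)
n = 200
category = rng.choice(["math", "art", "science"], size=n).reshape(-1, 1)
price = rng.uniform(10, 500, size=n).reshape(-1, 1)
words = np.array(["books", "tablets", "paint", "lab", "kits", "students"])
essay = np.array([" ".join(rng.choice(words, size=5)) for _ in range(n)])
y = (price.ravel() + rng.normal(0, 100, size=n) < 250).astype(int)

# Split first, so every transformer is fitted on training rows only.
idx_train, idx_test = train_test_split(
    np.arange(n), test_size=0.3, stratify=y, random_state=0
)

ohe = OneHotEncoder(handle_unknown="ignore").fit(category[idx_train])
tfidf = TfidfVectorizer().fit(essay[idx_train])
scaler = StandardScaler().fit(price[idx_train])

def featurize(idx):
    """Concatenate categorical + text + numerical features into one sparse matrix."""
    return hstack([
        ohe.transform(category[idx]),
        tfidf.transform(essay[idx]),
        csr_matrix(scaler.transform(price[idx])),
    ]).tocsr()

X_train, X_test = featurize(idx_train), featurize(idx_test)
y_train, y_test = y[idx_train], y[idx_test]

# Grid search for the Random Forest; cross-validation plays the role of the CV split.
depths, trees = [3, 5], [20, 50]
grid = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"max_depth": depths, "n_estimators": trees},
    scoring="roc_auc",
    cv=3,
)
grid.fit(X_train, y_train)

# Arrange mean CV AUC as a (max_depth x n_estimators) matrix, ready for a heatmap.
scores = np.zeros((len(depths), len(trees)))
for params, score in zip(grid.cv_results_["params"], grid.cv_results_["mean_test_score"]):
    scores[depths.index(params["max_depth"]), trees.index(params["n_estimators"])] = score

# Evaluate the best Random Forest on the held-out test set.
rf_proba = grid.best_estimator_.predict_proba(X_test)[:, 1]
rf_auc = roc_auc_score(y_test, rf_proba)
rf_cm = confusion_matrix(y_test, (rf_proba >= 0.5).astype(int), labels=[0, 1])

# Randomized search for the boosted model (stand-in for XGBoost).
rand = RandomizedSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_distributions={
        "n_estimators": [20, 50, 100],
        "learning_rate": [0.01, 0.1, 0.3],
        "max_depth": [2, 3],
    },
    n_iter=4,
    scoring="roc_auc",
    cv=3,
    random_state=0,
)
rand.fit(X_train, y_train)
gb_proba = rand.best_estimator_.predict_proba(X_test)[:, 1]
gb_auc = roc_auc_score(y_test, gb_proba)
gb_cm = confusion_matrix(y_test, (gb_proba >= 0.5).astype(int), labels=[0, 1])

# Summary table of both models.
print(f"{'Model':<20}{'Best params':<60}{'Test AUC':>10}")
print(f"{'Random Forest':<20}{str(grid.best_params_):<60}{rf_auc:>10.3f}")
print(f"{'Gradient Boosting':<20}{str(rand.best_params_):<60}{gb_auc:>10.3f}")
```

The `scores` matrix can be passed to a heatmap function such as `seaborn.heatmap`, and the predicted probabilities to `sklearn.metrics.roc_curve`, to produce the plots listed in the procedure. Fitting the encoders and scaler on the training rows only is a deliberate choice here, since it keeps information from the test rows out of the features.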


RAMESH BATTU