Apply-Decision-Trees-on-Donors-Choose-dataset

Step by Step Procedure


  • Understanding the business / real-world problem

  • Loading the data

  • Preprocessing the data (based on the type of data: categorical, text, numerical)

  • Preprocessing includes removing outliers, imputing missing values, cleaning data, removing special characters, etc.

  • Splitting the data into train, CV, and test sets (random splitting)

  • Vectorizing categorical data (one-hot encoding)

  • Vectorizing text data (BoW, TF-IDF, average Word2Vec, TF-IDF-weighted Word2Vec)

  • Vectorizing numerical data (Normalizer)

  • Applying the Decision Tree model on top of the features

  • Concatenating all feature types (categorical + numerical + selected text features)

  • Hyperparameter tuning to find the best estimator (GridSearchCV) and plotting heatmaps

  • Training the Decision Tree model with the best hyperparameters and plotting the ROC curve with AUC

  • Plotting the confusion matrix (heatmaps)

  • Graphviz visualization of Decision Tree

  • Finding the false positive points

  • Plotting a word cloud from the essay text of these false positive data points

  • Plotting a box plot of price for the false positive points

  • PDF & CDF of teacher_number_of_previously_posted_projects for the false positive points

  • Getting the top 5k features using feature_importances_ with TF-IDF

  • Hyperparameter tuning to find the best estimator (GridSearchCV) and plotting heatmaps

  • Training the Decision Tree model with the best hyperparameters and plotting the ROC curve with AUC

  • Plotting the confusion matrix (heatmaps)

  • Graphviz visualization of Decision Tree

  • Finding the false positive points

  • Plotting a word cloud from the essay text of these false positive data points

  • Plotting a box plot of price for the false positive points

  • PDF & CDF of teacher_number_of_previously_posted_projects for the false positive points

  • Observations on overall model performance (conclusion)

  • Presenting the performance results in table format.
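
A few of the steps above (random train/CV/test splitting, confusion matrix counts, and picking out the false positive points) can be sketched in plain Python. This is a minimal illustration, not the notebook's actual code: the helper names `split_indices`, `confusion_counts`, and `false_positive_indices`, the 60/20/20 split ratio, and the toy labels and prices are all assumptions made up for the example. In the project itself these steps would more likely be done with scikit-learn utilities on the real Donors Choose data.

```python
import random


def split_indices(n, train=0.6, cv=0.2, seed=42):
    """Randomly split range(n) into train, CV, and test index lists.

    The 60/20/20 ratio is an assumption for illustration only.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded shuffle for reproducibility
    n_train = int(n * train)
    n_cv = int(n * cv)
    return idx[:n_train], idx[n_train:n_train + n_cv], idx[n_train + n_cv:]


def confusion_counts(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels encoded as 0/1."""
    tn = fp = fn = tp = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tn, fp, fn, tp


def false_positive_indices(y_true, y_pred):
    """Indices where the model predicted 1 but the true label is 0."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred))
            if t == 0 and p == 1]


# Toy data (hypothetical): true labels, model predictions, project prices.
y_true = [1, 0, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 1, 1]
prices = [10.0, 250.0, 30.0, 45.0, 120.0, 80.0]

train_idx, cv_idx, test_idx = split_indices(10)
tn, fp, fn, tp = confusion_counts(y_true, y_pred)
fp_idx = false_positive_indices(y_true, y_pred)

# Values of a numeric column restricted to the false positives; these are
# the values a box plot or PDF/CDF of the false positive points would use.
fp_prices = [prices[i] for i in fp_idx]
```

The same false positive indices can be reused to select the essay text for the word cloud and the teacher_number_of_previously_posted_projects values for the PDF and CDF plots.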


RAMESH BATTU
