## 6.3 Decision trees

Slides

Notes

Decision trees are powerful algorithms capable of fitting complex datasets. A decision tree makes predictions by following a series of if/else conditions, splitting each node into two or more sub-nodes.
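As a minimal sketch of this idea (the feature names and thresholds below are hypothetical, not taken from the course dataset), a trained tree's prediction logic is just nested if/else tests:

```python
# Hypothetical hand-written tree with two splits: each internal node
# tests a single feature and routes the record left or right.
def predict(record):
    if record["debt"] > 5000:          # split at the root
        if record["income"] <= 30000:  # second-level split
            return 1                   # predict: default
        return 0                       # predict: ok
    return 0                           # predict: ok
```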

Along with this versatility, decision trees are prone to overfitting. One of the main reasons is their depth: a deep tree tends to memorize all the patterns in the training data but struggles to perform well on unseen data (the validation or test set).

To overcome overfitting, we can reduce the complexity of the model by limiting its maximum depth.
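A minimal sketch of tuning this hyperparameter, assuming `X_train`, `y_train`, `X_val`, and `y_val` already exist (e.g. produced with `DictVectorizer` as in session 3):

```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import roc_auc_score

# X_train, y_train, X_val, y_val are assumed to already exist
# (e.g. produced with DictVectorizer as in session 3).
for depth in [1, 2, 3, 4, 5, 6, None]:  # None means no depth limit
    dt = DecisionTreeClassifier(max_depth=depth)
    dt.fit(X_train, y_train)

    train_auc = roc_auc_score(y_train, dt.predict_proba(X_train)[:, 1])
    val_auc = roc_auc_score(y_val, dt.predict_proba(X_val)[:, 1])
    print(f'max_depth={depth}: train AUC={train_auc:.3f}, val AUC={val_auc:.3f}')
```

A large gap between train and validation AUC at high depths is the typical sign of overfitting; the best validation score usually appears at a moderate depth.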

A decision tree with a depth of 1 is called a decision stump; it has only one split from the root.
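In scikit-learn, a stump is simply a tree whose depth is capped at one:

```python
from sklearn.tree import DecisionTreeClassifier

# A decision stump: exactly one split at the root node.
stump = DecisionTreeClassifier(max_depth=1)
```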

Classes, functions, and methods:

* `DecisionTreeClassifier`: classification model from the `sklearn.tree` module.
* `max_depth`: hyperparameter that controls the maximum depth of the decision tree.
* `export_text`: function from the `sklearn.tree` module that displays a text report of the rules of a decision tree (see the sketch after this list).
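A short example of printing a tree's rules, assuming `dt` is a fitted `DecisionTreeClassifier` and `dv` is the fitted `DictVectorizer` from session 3 (used here only to supply feature names):

```python
from sklearn.tree import export_text

# feature_names labels each split with the original feature name
# instead of the default feature_0, feature_1, ...
# (on older scikit-learn versions, use dv.get_feature_names() instead)
print(export_text(dt, feature_names=list(dv.get_feature_names_out())))
```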

Note: we have already covered `DictVectorizer` in session 3 and `roc_auc_score` in session 4.

Add notes from the video (PRs are welcome)

⚠️ The notes are written by the community.
If you see an error here, please create a PR with a fix.
