
Commit
improve a bit
christophM committed Nov 11, 2021
1 parent c43a691 commit e7d12ed
Showing 1 changed file with 1 addition and 2 deletions.
3 changes: 1 addition & 2 deletions manuscript/05.6-agnostic-permfeatimp.Rmd
@@ -238,8 +238,6 @@ Since another feature is chosen as the first split, the whole tree can be very different.

### Disadvantages

It is **unclear whether you should use training or test data** to compute the feature importance.

Permutation feature importance is **linked to the error of the model**.
This is not inherently bad, but in some cases not what you need.
In some cases, you might prefer to know how much the model's output varies for a feature without considering what it means for performance.
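The link to model error can be sketched in a few lines: permutation importance is simply the increase in error after shuffling one feature. The following is a minimal pure-Python sketch with a hypothetical toy model standing in for a fitted one (the data, the `model` function, and all names are illustrative assumptions, not from the book).

```python
import random

# Toy dataset: y depends strongly on x0 and only weakly on x1.
random.seed(0)
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(500)]
y = [3.0 * x0 + 0.5 * x1 for x0, x1 in X]

# Hypothetical "model": here just the true function, standing in
# for a trained model.
def model(row):
    return 3.0 * row[0] + 0.5 * row[1]

def mse(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(X, y, predict, feature):
    """Increase in MSE after shuffling one feature column."""
    base = mse(y, [predict(row) for row in X])
    col = [row[feature] for row in X]
    random.shuffle(col)
    X_perm = [row[:feature] + [v] + row[feature + 1:]
              for row, v in zip(X, col)]
    return mse(y, [predict(row) for row in X_perm]) - base

imp0 = permutation_importance(X, y, model, 0)
imp1 = permutation_importance(X, y, model, 1)
# imp0 comes out much larger than imp1: the measure reflects how much
# the *error* grows, so a feature the model barely uses for prediction
# accuracy gets a small score even if the model reads it.
```

Because the score is defined through the error, a feature that changes the model's output but not its accuracy will receive little importance, which is exactly the point made above.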
@@ -260,6 +258,7 @@ The permutation of features produces unlikely data instances when two or more features are correlated.
When they are positively correlated (like height and weight of a person) and I shuffle one of the features, I create new instances that are unlikely or even physically impossible (a 2 meter person weighing 30 kg, for example), yet I use these new instances to measure the importance.
In other words, for the permutation feature importance of a correlated feature, we consider how much the model performance decreases when we exchange the feature with values we would never observe in reality.
Check if the features are strongly correlated and be careful about the interpretation of the feature importance if they are.
However, pairwise correlations might not be sufficient to reveal the problem.
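The effect of shuffling a correlated feature can be made concrete with a small pure-Python sketch. The height/weight numbers below are hypothetical and only for illustration; shuffling one column destroys the joint distribution, so permuted rows pair values that never co-occur in the real data.

```python
import random

random.seed(1)
# Hypothetical correlated pair: height (cm) and weight (kg).
height = [random.gauss(170, 10) for _ in range(1000)]
weight = [0.9 * h - 80 + random.gauss(0, 4) for h in height]

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / n
    sa = (sum((x - ma) ** 2 for x in a) / n) ** 0.5
    sb = (sum((y - mb) ** 2 for y in b) / n) ** 0.5
    return cov / (sa * sb)

shuffled = weight[:]
random.shuffle(shuffled)

r_before = pearson(height, weight)    # strong positive correlation
r_after = pearson(height, shuffled)   # roughly zero after shuffling
# After the shuffle, tall people are paired with arbitrary weights,
# producing rows (e.g. very tall and very light) that the model
# never saw and that cannot occur in reality.
```

Checking `r_before` against `r_after` is the kind of diagnostic the paragraph above suggests, though, as noted, pairwise correlations alone may not reveal every such problem.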

Another tricky thing:
**Adding a correlated feature can decrease the importance of the associated feature** by splitting the importance between both features.
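The splitting effect can be demonstrated with a deliberately extreme case: a perfectly correlated copy of a feature. In this pure-Python sketch (all names and the two hand-written "models" are hypothetical), the same prediction is spread over one feature versus two identical copies, and the per-feature importance shrinks accordingly.

```python
import random

random.seed(2)
n = 2000
x = [random.gauss(0, 1) for _ in range(n)]
y = [3.0 * v for v in x]

def mse(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

def importances(predict, columns, y):
    """MSE increase when shuffling each column in turn."""
    base = mse(y, [predict(*row) for row in zip(*columns)])
    result = []
    for j, col in enumerate(columns):
        perm = col[:]
        random.shuffle(perm)
        cols = list(columns)
        cols[j] = perm
        result.append(mse(y, [predict(*row) for row in zip(*cols)]) - base)
    return result

# Model A: all of the signal rests on a single feature.
imp_a = importances(lambda a: 3.0 * a, [x], y)

# Model B: the same prediction, spread over two perfect copies
# of that feature.
imp_b = importances(lambda a, b: 1.5 * (a + b), [x, x[:]], y)

# Both copies in model B receive far less importance than the single
# feature in model A, even though the two models predict identically:
# the importance has been split between the correlated features.
```

In practice the correlation is rarely perfect and the split is uneven, but the direction of the effect is the same: adding a correlated feature can dilute the importance of the one it duplicates.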
