Commit

replaces data set with dataset

christophM committed May 23, 2018
1 parent dbdeffc commit 70d9b24
Showing 3 changed files with 4 additions and 4 deletions.
2 changes: 1 addition & 1 deletion manuscript/02-interpretability.Rmd
@@ -122,7 +122,7 @@ Having an interpretation for a faulty prediction helps to understand the cause o
It delivers a direction for how to fix the system.
Consider the example of a husky versus wolf classifier that misclassifies some huskies as wolves.
Using interpretable machine learning methods, you would find out that the misclassification happened because of the snow in the image.
-The classifier learned to use snow as a feature for classifying images as wolves, which might make sense in terms of separating features in the training data set, but not in real-world use.
+The classifier learned to use snow as a feature for classifying images as wolves, which might make sense in terms of separating features in the training dataset, but not in real-world use.

If you can ensure that the machine learning model can explain decisions, the following traits can also be checked more easily (Doshi-Velez and Kim 2017):

4 changes: 2 additions & 2 deletions manuscript/04.5-interpretable-rulefit.Rmd
@@ -187,7 +187,7 @@ $x_{s_{jm},\text{lower}}<x_j$ or $x_j<x_{s_{jm},\text{upper}}$.
Further splits in that feature create more complicated intervals.
For categorical features the subset $s_{jm}$ contains some specific categories of $x_j$.

-A made-up example for the bike rental data set:
+A made-up example for the bike rental dataset:

$$r_{17}(x)=I(x_{\text{temp}}<15)\cdot{}I(x_{\text{weather}}\in\{\text{good},\text{cloudy}\})\cdot{}I(10\leq{}x_{\text{windspeed}}<20)$$
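
A minimal sketch in R of how such a rule acts as a binary 0/1 feature (not from the book's code; the column names `temp`, `weather` and `windspeed` and the toy rows are assumed here for illustration, with the thresholds taken from the made-up rule above):

```r
# Sketch: rule r_17 as a product of indicator functions, returning 0 or 1 per row.
r17 <- function(df) {
  as.numeric(df$temp < 15 &
               df$weather %in% c("good", "cloudy") &
               df$windspeed >= 10 & df$windspeed < 20)
}

# Hypothetical rows, just to show the binary output:
toy <- data.frame(temp = c(10, 20), weather = c("good", "rainy"), windspeed = c(15, 5))
r17(toy)
#> [1] 1 0
```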

@@ -216,7 +216,7 @@ You can see the rules simply as new features based on your original features.
**Step 2: Sparse linear model**

You will get A LOT of rules from the first step.
-Since the first step is only a feature transformation of your original data set, you are not done with fitting a model yet, and you also want to reduce the number of rules.
+Since the first step is only a feature transformation of your original dataset, you are not done with fitting a model yet, and you also want to reduce the number of rules.
In addition to the rules, all your 'raw' features from your original dataset will also be used in the Lasso linear model.
Every rule and original feature becomes a feature in Lasso and gets a weight estimate.
The original, raw features are added because trees suck at representing simple linear relationships between y and x.
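
A hedged sketch of this step using the glmnet package on simulated data (glmnet is a common Lasso implementation in R; actual RuleFit software may do this differently, and all features and the target below are made up):

```r
# Step 2 sketch: rules and raw features enter the same sparse linear model.
library(glmnet)

set.seed(1)
n     <- 200
raw   <- matrix(rnorm(n * 3), ncol = 3)             # original 'raw' features (simulated)
rules <- matrix(rbinom(n * 20, 1, 0.3), ncol = 20)   # binary rule features from step 1 (simulated)
X     <- cbind(raw, rules)
y     <- rnorm(n)                                     # simulated target

cv <- cv.glmnet(X, y, alpha = 1)      # alpha = 1 gives the Lasso penalty
coef(cv, s = "lambda.min")            # many rule weights are shrunk to exactly zero
```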
2 changes: 1 addition & 1 deletion manuscript/05.3-agnostic-ice.Rmd
@@ -86,7 +86,7 @@ ice$plot() + my_theme() + scale_color_discrete(guide='none') +
With the centered ICE plots it is easier to compare the curves of individual instances.
This can be useful when we are not interested in seeing the absolute change of a predicted value, but rather the difference in prediction compared to a fixed point of the feature range.
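
A minimal sketch of the centering idea, assuming a generic model with a standard `predict()` method (the helper name `center_ice` is made up; the plots in this chapter use the iml package instead):

```r
# Center ICE curves by hand: for each instance, subtract the prediction at the
# smallest grid value of the feature, so every curve starts at 0.
center_ice <- function(model, data, feature, grid) {
  stopifnot(!is.unsorted(grid))            # grid must start at the anchor (minimal) value
  curves <- sapply(grid, function(v) {
    data[[feature]] <- v
    predict(model, newdata = data)         # one prediction per instance at this grid value
  })
  curves - curves[, 1]                     # difference to the prediction at the minimum
}
```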

-The same for the bike data set and count prediction model:
+The same for the bike dataset and count prediction model:

```{r ice-bike-centered, fig.cap='Centred individual conditional expectation plots of expected bike rentals by weather condition. The lines were fixed at value 0 for each feature and instance. The lines show the difference in prediction compared to the prediction with the respective feature value at their minimal feature value in the data.'}
data(bike)
