---
title: "bupaR Docs | Adapting your model"
---
```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
eval = FALSE,
cache = FALSE
)
```
```{r echo = F, out.width="25%", eval = T, fig.align = "right"}
knitr::include_graphics("images/icons/predict.PNG")
```
***
# Adapting your model
```{r setup, message = F, eval = T, warning = F}
library(processpredictR)
library(bupaR)
```
## Additional features
Additional features for the model, beyond the activity sequences and default features, can be defined when calling `prepare_examples()`. The example below shows how the month in which a case started can be added as a feature.
```{r}
# preprocessed dataset with hot-encoded categorical features
df_next_time <- traffic_fines %>%
group_by_case() %>%
mutate(month = lubridate::month(min(timestamp), label = TRUE)) %>%
ungroup_eventlog() %>%
prepare_examples(task = "next_time", features = "month") %>% split_train_test()
```
```{r}
# the attributes of df are added or changed accordingly
df_next_time$train_df %>% attr("features")
```
```
#> [1] "latest_duration" "throughput_time" "processing_time"
#> [4] "time_before_activity" "month_jan" "month_feb"
#> [7] "month_mrt" "month_apr" "month_mei"
#> [10] "month_jun" "month_jul" "month_aug"
#> [13] "month_sep" "month_okt" "month_nov"
#> [16] "month_dec"
```
```{r}
# the month feature is hot-encoded
df_next_time$train_df %>% attr("hot_encoded_categorical_features")
```
```
#> [1] "month_jan" "month_feb" "month_mrt" "month_apr" "month_mei" "month_jun"
#> [7] "month_jul" "month_aug" "month_sep" "month_okt" "month_nov" "month_dec"
```
Additional features can be either numerical variables or factors. Numerical variables are automatically normalized, and factors are automatically converted to hot-encoded variables. A few important notes:
* Character values are not accepted and should be transformed to factors (see the sketch after this list).
* Features are assumed to have no missing values. If there are any, they should be imputed or removed before using `prepare_examples()`.
* Finally, in case the data is an event log, features should have a single value for each activity instance. The start and complete events of an activity instance should thus share one unique value of a variable for it to be used as a feature.
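For instance, a character attribute can be cast to a factor on the fly before preparing the examples. The sketch below assumes `vehicleclass` is a character attribute of `traffic_fines`; any other case or event attribute works the same way.
```{r}
# hedged sketch: cast a character attribute to a factor so that
# prepare_examples() can hot-encode it
df_vehicle <- traffic_fines %>%
  mutate(vehicleclass = as.factor(vehicleclass)) %>%
  prepare_examples(task = "next_time", features = "vehicleclass") %>%
  split_train_test()
```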
## Changing model dimensions
Some parameters of the model itself are set by default, but they can of course be adjusted. Below is a quick overview of each of these parameters, followed by a short usage sketch:
* `num_heads`: The number of attention heads of `keras::layer_multi_head_attention()`. The more attention heads, the more representational power the multi-head attention layer has to encode (relationships between) activity instances. Default: 4.
* `output_dim`: The dimension of the dense embedding of `keras::layer_embedding()`, the `key_dim` parameter of `keras::layer_multi_head_attention()`, and the `units` parameter (the _dimensionality of the output space_) of the 2nd `keras::layer_dense()` in the feed-forward (sequential) network of the transformer architecture. Default: 36.
* `dim_ff`: The `units` parameter of the 1st `keras::layer_dense()` in the feed-forward (sequential) network of the transformer architecture, i.e. the _width_ (number of nodes) of that layer. Default: 64.
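As a usage sketch, assuming these three parameters are passed directly to `create_model()` (their defaults being the values listed above):
```{r}
# sketch: override the default model dimensions when creating the model
df_dims <- prepare_examples(traffic_fines, task = "next_activity") %>%
  split_train_test()
model_dims <- df_dims$train_df %>%
  create_model(num_heads = 8, output_dim = 48, dim_ff = 128)
```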
## Customizing the model architecture
Instead of using the standard _off-the-shelf_ transformer model that comes with `processpredictR`, you can customize the model. One way to do this is by using the `custom` argument of the `create_model()` function. The resulting model then contains only the input layers, as shown below.
```{r}
df <- prepare_examples(traffic_fines, task = "next_activity") %>% split_train_test()
custom_model <- df$train_df %>% create_model(custom = TRUE, name = "my_custom_model")
custom_model
```
```
#> Model: "my_custom_model"
#> ________________________________________________________________________________
#> Layer (type) Output Shape Param #
#> ================================================================================
#> input_2 (InputLayer) [(None, 9)] 0
#> token_and_position_embedding_1 (To (None, 9, 36) 828
#> kenAndPositionEmbedding)
#> transformer_block_1 (TransformerBl (None, 9, 36) 26056
#> ock)
#> global_average_pooling1d_1 (Global (None, 36) 0
#> AveragePooling1D)
#> ================================================================================
#> Total params: 26,884
#> Trainable params: 26,884
#> Non-trainable params: 0
#> ________________________________________________________________________________
```
You can then stack layers on top of your custom model as you prefer, using the `stack_layers()` function. This function avoids some of the extra coding that is needed when `keras` is used directly (see [customization with keras](predict_keras.html)).
```{r}
custom_model <- custom_model %>%
stack_layers(layer_dropout(rate = 0.1)) %>%
stack_layers(layer_dense(units = 64, activation = 'relu'))
custom_model
```
```
#> Model: "my_custom_model"
#> ________________________________________________________________________________
#> Layer (type) Output Shape Param #
#> ================================================================================
#> input_2 (InputLayer) [(None, 9)] 0
#> token_and_position_embedding_1 (To (None, 9, 36) 828
#> kenAndPositionEmbedding)
#> transformer_block_1 (TransformerBl (None, 9, 36) 26056
#> ock)
#> global_average_pooling1d_1 (Global (None, 36) 0
#> AveragePooling1D)
#> dropout_6 (Dropout) (None, 36) 0
#> dense_6 (Dense) (None, 64) 2368
#> ================================================================================
#> Total params: 29,252
#> Trainable params: 29,252
#> Non-trainable params: 0
#> ________________________________________________________________________________
```
```{r}
# this works too: multiple keras layers can be passed at once,
# separated by commas
custom_model %>%
stack_layers(layer_dropout(rate = 0.1), layer_dense(units = 64, activation = 'relu'))
```
Once you have finalized your model with an appropriate output layer (which should have the correct number of outputs, as recorded in `custom_model$num_outputs`, and an appropriate activation function), you can use the `compile()`, `fit()`, `predict()` and `evaluate()` functions as seen in the introductory [example](https://bupaverse.github.io/docs/predict_workflow.html).
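As a minimal sketch, assuming a classification task (so a softmax output layer with `custom_model$num_outputs` units) and `fit()` arguments as in that workflow example:
```{r}
# sketch: finish the custom model with a softmax output layer sized
# to the number of outputs, then compile and train it
custom_model <- custom_model %>%
  stack_layers(layer_dense(units = custom_model$num_outputs, activation = "softmax"))

custom_model %>% compile()
custom_model %>% fit(train_data = df$train_df, epochs = 5)
```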
***
Read more:
```{r footer, results = "asis", echo = F, eval = T, collapse = F}
source("htmlbuttons.R")
create_buttons(df, "predict_adapt.html")
```