-
Notifications
You must be signed in to change notification settings - Fork 0
/
09-mcmc.Rmd
145 lines (102 loc) · 5.8 KB
/
09-mcmc.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
# The statistical catch-at-age stock assessment framework with MCMC \label{sec:mcmc}
The previous methods were demonstrated using the maximum likelihood estimation method. However, `ADMB` can also use MCMC methods to fit the model. This section shows how the `sca` methods interface with `ADMB` to use the MCMC fits. For this section we'll use the hake assessment in mediterranean areas (gsa) 1, 5, 6 and 7.
The `sca` likelihood estimate is:
```{r}
# ll
data(hke1567)
data(hke1567.idx)
fmod <- ~s(age, k = 4) + s(year, k = 8) + s(year, k = 8, by = as.numeric(age == 0)) + s(year, k = 8, by = as.numeric(age == 4))
qmod <- list(~I(1/(1 + exp(-age))))
fit <- sca(hke1567, hke1567.idx, fmodel=fmod, qmodel=qmod)
fit <- simulate(fit, 1000)
```
To run the MCMC method, one needs to configure a set of arguments, which is done by creating a `SCAMCMC` object. For details on the MCMC configuration in `ADMB` visit the `ADMB` website.
```{r}
# mcmc
mc <- SCAMCMC()
# check the default pars
mc
```
Defaults for now are ok, so lets fit the model. Note that the argument `fit` must be set to `MCMC` and the argument `mcmc` takes the `SCAMCMC` object. A major check when running MCMC is the acceptance rate, which should be around 0.3. This is a rule of thumb, for more information read the (extensive) literature on MCMC. The slot `fitSumm` stores that information.
```{r}
# fit the model
fitmc00 <- sca(hke1567, hke1567.idx, fmodel=fmod, qmodel=qmod, fit = "MCMC", mcmc=mc)
# check acceptance rate
fitSumm(fitmc00)
```
```{r}
plot(hke1567 + fitmc00)
```
As mentioned above `ADMB` has several options for MCMC. Here we demonstrate one of them, `mcprobe` which sets a fat-tailed proposal distribution, as an example of how to use the `SCAMCMC` objects.
```{r}
mc <- SCAMCMC(mcprobe=0.45)
fitmc01 <- sca(hke1567, hke1567.idx, fmodel=fmod, qmodel=qmod, fit = "MCMC", mcmc=mc)
```
All fits together
```{r}
plot(FLStocks(ll=hke1567 + fit, mc=hke1567 + fitmc00, mc_alt=hke1567 + fitmc01))
```
## Diagnostics with CODA
We use the package `CODA` to run the diagnostics on MCMC fits. One needs to convert the `a4a` output into a `mcmc` CODA object over which several diagostics can be ran. The mcmc object is a matrix with the parameters (row = iters, cols= pars).
Common diagnostics for MCMC chains is to look at the burn-in period, auto-correlation and cross correlation^[ToBe Added]. The first can be dealt by droping an initial set of iterations, which is done using the function `burnin`. The second can be managed by thinning the chain, in `ADMB` this is done through the parameter `mcsave N`, which defines the iteration's saving rate (the inverse of the thinning rate). This is the rate at which samples of the parameters are saved, such that thinning is effectively discarding draws.
Next fit will run 1000 iterations and save every iter (`mcsave=1`).
```{r}
library(coda)
mc <- SCAMCMC(mcmc=1000, mcsave=1)
fitmc02 <- sca(hke1567, hke1567.idx, fmodel=fmod, qmodel=qmod, fit = "MCMC", mcmc=mc)
fitmc02.mc <- FLa4a::as.mcmc(fitmc02)
```
The autocorrelation plots will show the very strong correlation across samples, which we want to avoid. Figure \@ref(fig:acf01) shows autocorrelation for the first parameter.
<<acf01>>=
acf(fitmc02.mc[,1])
```
Ploting the chain for the parameter clearly shows the autocorrelation but also the burnin phase, where there's no information about the parameter. These iterations must to be dropped.
<<chain01>>=
xyplot(fitmc02.mc[,1])
```
It's also important to check if the distribution of the parameters is normal, which can be done with the `densityplot`:
<<dens01>>=
densityplot(fitmc02.mc[,1])
```
Another interesting diagnostic is the Geweke-Brooks Z-score check. This diagnostic indicates if the first and last part of a sample from a Markov chain may not be drawn from the same distribution. It's useful to decide if the first few iterations should e discarded.
<<gew01>>=
geweke.plot(fitmc02.mc[,1])
```
It's clear from the above diagnostics that a burnin phase of about 200 iterations should be considered. With relation to thining one needs to try several values until no autocorrelation exits.
Next fit will run 10000 iterations and save every 10th iteration (`mcsave=10`), so that the same 1000 iters are generated by the method.
```{r}
mc <- SCAMCMC(mcmc=10000, mcsave=10)
fitmc03 <- sca(hke1567, hke1567.idx, fmodel=fmod, qmodel=qmod, fit = "MCMC", mcmc=mc)
fitmc03.mc <- FLa4a::as.mcmc(fitmc03)
```
The autocorrelation plots still shows a strong correlation across samples, although less than in the previous model.
```{r}
acf(fitmc03.mc[,1])
```
Next fit will run 100000 iterations and save every 100th iteration (`mcsave=100`), so that the same 1000 iters are generated by the method. Autocorrelation is much weaker, could still be reduced by increasing `mcsave`.
```{r}
mc <- SCAMCMC(mcmc=100000, mcsave=100)
fitmc03 <- sca(hke1567, hke1567.idx, fmodel=fmod, qmodel=qmod, fit = "MCMC", mcmc=mc)
fitmc03.mc <- FLa4a::as.mcmc(fitmc03)
```
Next fit will run 200000 iterations and save every 200th iteration (`mcsave=200`).
```{r}
mc <- SCAMCMC(mcmc=200000, mcsave=200)
fitmc03 <- sca(hke1567, hke1567.idx, fmodel=fmod, qmodel=qmod, fit = "MCMC", mcmc=mc)
fitmc03.mc <- FLa4a::as.mcmc(fitmc03)
```
```{r}
acf(fitmc03.mc[,1])
```
All diagnostics improved with the new thining rate although some other improvements can be done. Note this diagnostics should be checked for all parameters. For the sake of space the demonstration uses only on the first.
```{r}
xyplot(fitmc03.mc[,1])
```
```{r}
densityplot(fitmc03.mc[,1])
```
```{r}
geweke.plot(fitmc03.mc[,1])
```
Note: add correlation across parameters
The manual "A Guide for Bayesian Analysis in AD Model Builder" by Cole C. Monnahan, Melissa L. Muradian and Peter T. Kuriyam describe and explain a larger group of arguments that can be set when running MCMC with ADMB, which the engine `a4a` uses.