This package was specifically for mixed models using mgcv. At present it has been almost entirely superseded by the mixedup package, which provides all the same functionality and more, while the visualizations were originally from the visibly package anyway. That only leaves the prediction functionality as unique to this package, so I leave it here as I may still use it for that, and I’m contemplating moving the GAM specific visuals from visibly at some point.
The goal of gammit is to provide a set of functions to aid using mgcv (possibly solely) for mixed models. Lately I’ve been using it in lieu of lme4, especially the bam function, for GLMM with millions of observations and multiple random effects. It’s turning out very useful in this sense (see this post for details), but I’d like some more/different functionality with the results. Furthermore, mgcv just has some nice things going on for such models anyway, like the ability to add other smooth terms, alternative distributions for the target variable, etc., so I’m looking to make it easier for me to get some things I want when I use it.
At present there are four functions: extract_vc, extract_ranef, extract_fixed, summary_gamm, and predict_gamm.
You can install the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("m-clark/gammit")
This example demonstrates the summary_gamm function with comparison to the corresponding lme4 model.
library(mgcv)
Loading required package: nlme
This is mgcv 1.8-31. For overview type 'help("mgcv-package")'.
library(lme4)
Loading required package: Matrix
Attaching package: 'lme4'
The following object is masked from 'package:nlme':
lmList
library(gammit)
lmer_model = lmer(Reaction ~ Days + (Days || Subject), data=sleepstudy)
ga_model = gam(Reaction ~ Days + s(Subject, bs='re') + s(Days, Subject, bs='re'),
data=sleepstudy,
method = 'REML')
summary(lmer_model)
Linear mixed model fit by REML ['lmerMod']
Formula: Reaction ~ Days + ((1 | Subject) + (0 + Days | Subject))
Data: sleepstudy
REML criterion at convergence: 1743.7
Scaled residuals:
Min 1Q Median 3Q Max
-3.9626 -0.4625 0.0204 0.4653 5.1860
Random effects:
Groups Name Variance Std.Dev.
Subject (Intercept) 627.57 25.051
Subject.1 Days 35.86 5.988
Residual 653.58 25.565
Number of obs: 180, groups: Subject, 18
Fixed effects:
Estimate Std. Error t value
(Intercept) 251.405 6.885 36.513
Days 10.467 1.560 6.712
Correlation of Fixed Effects:
(Intr)
Days -0.184
summary_gamm(ga_model)
Variance components:
group effect variance sd sd_2.5 sd_97.5 var_prop
Subject Intercept 627.571 25.051 16.085 39.015 0.477
Subject Days 35.858 5.988 4.025 8.908 0.027
Residual 653.582 25.565 22.792 28.676 0.496
Fixed Effects:
Term Estimate Std. Error t value Pr(>|t|)
(Intercept) 251.405 6.885 36.513 0.000
Days 10.467 1.560 6.712 0.000
Extract the variance components with extract_vc.
data.frame(VarCorr(lmer_model))
grp var1 var2 vcov sdcor
1 Subject (Intercept) <NA> 627.56905 25.051328
2 Subject.1 Days <NA> 35.85838 5.988187
3 Residual <NA> <NA> 653.58350 25.565279
extract_vc(ga_model)
group effect variance sd sd_2.5 sd_97.5 var_prop
1 Subject Intercept 627.571 25.051 16.085 39.015 0.477
2 Subject Days 35.858 5.988 4.025 8.908 0.027
3 Residual 653.582 25.565 22.792 28.676 0.496
Extract the random effects with extract_ranef.
ranef(lmer_model)
$Subject
(Intercept) Days
308 1.5126648 9.3234970
309 -40.3738728 -8.5991757
310 -39.1810279 -5.3877944
330 24.5189244 -4.9686503
331 22.9144470 -3.1939378
332 9.2219759 -0.3084939
333 17.1561243 -0.2872078
334 -7.4517382 1.1159911
335 0.5787623 -10.9059754
337 34.7679030 8.6276228
349 -25.7543312 1.2806892
350 -13.8650598 6.7564064
351 4.9159912 -3.0751356
352 20.9290332 3.5122123
369 3.2586448 0.8730514
370 -26.4758468 4.9837910
371 0.9056510 -1.0052938
372 12.4217547 1.2584037
with conditional variances for "Subject"
extract_ranef(ga_model)
# A tibble: 36 x 7
group_var effect group value se lower_2.5 upper_97.5
<chr> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 Subject Intercept 308 1.51 13.3 -24.5 27.5
2 Subject Intercept 309 -40.4 13.3 -66.4 -14.3
3 Subject Intercept 310 -39.2 13.3 -65.2 -13.2
4 Subject Intercept 330 24.5 13.3 -1.51 50.5
5 Subject Intercept 331 22.9 13.3 -3.11 48.9
6 Subject Intercept 332 9.22 13.3 -16.8 35.2
7 Subject Intercept 333 17.2 13.3 -8.87 43.2
8 Subject Intercept 334 -7.45 13.3 -33.5 18.6
9 Subject Intercept 335 0.579 13.3 -25.4 26.6
10 Subject Intercept 337 34.8 13.3 8.74 60.8
# … with 26 more rows
Extract the fixed effects extract_fixef.
fixef(lmer_model)
(Intercept) Days
251.40510 10.46729
extract_fixed(ga_model)
# A tibble: 2 x 7
term value se t p lower_2.5 upper_97.5
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Intercept 251. 6.88 36.5 0 238. 265.
2 Days 10.5 1.56 6.71 0 7.39 13.5
There are a couple of ways to do prediction, and the main goal for gammit was to make it easy to use the lme4 style to include random effects or not. mgcv already has this functionality as well, so the functionality of predict_gamm is mostly cosmetic. One benefit here is to provide standard errors for the prediction also.
head(predict_gamm(ga_model))
prediction
1 252.9178
2 272.7086
3 292.4994
4 312.2901
5 332.0809
6 351.8717
Add standard errors.
head(data.frame(predict_gamm(ga_model, se=T)))
prediction se
1 252.9178 12.410220
2 272.7086 10.660891
3 292.4994 9.191224
4 312.2901 8.153871
5 332.0809 7.724998
6 351.8717 8.003034
compare = data.frame(
gam_original = predict_gamm(ga_model)$prediction,
gam_fe_only = predict_gamm(ga_model, re_form = NA)$prediction,
gam_fe_only2 = predict_gamm(ga_model,
exclude = c('s(Subject)', "s(Days,Subject)"))$prediction,
lme4_fe_only = predict(lmer_model, re.form = NA))
head(compare)
gam_original gam_fe_only gam_fe_only2 lme4_fe_only
1 252.9178 251.4051 251.4051 251.4051
2 272.7086 261.8724 261.8724 261.8724
3 292.4994 272.3397 272.3397 272.3397
4 312.2901 282.8070 282.8070 282.8070
5 332.0809 293.2742 293.2742 293.2742
6 351.8717 303.7415 303.7415 303.7415
Along with that, one can still use include/exclude for other smooth terms as above. Unfortunately, some options do not yet work with bam objects, but this is to due to the functionality in predict.gam from mgcv and should change in the near future.