In this vignette, we discuss how to specify multilevel models with
compositional outcomes using multilevelcoda
. In addition to
multilevelcoda
, we will use brms
package (to
fit models) and bayestestR
package (to compute useful
indices and compare models). We will also attach built in datasets
mcompd
(simulated compositional sleep and wake variables)
and sbp
(sequential binary partition).
library(multilevelcoda)
library(brms)
library(bayestestR)
data("mcompd")
data("sbp")
options(digits = 3)
A model with multilevel compositional outcomes is multivariate, as it
has multiple ILR coordinate outcomes, each of which is predicted by a
set of predictors. The ILR coordinates outcomes can be calculated using
the compilr()
functions.
cilr <- compilr(data = mcompd, sbp = sbp,
parts = c("TST", "WAKE", "MVPA", "LPA", "SB"), idvar = "ID")
head(cilr$TotalILR)
#> ilr1 ilr2 ilr3 ilr4
#> [1,] 0.287 1.20 0.6270 1.702
#> [2,] -0.472 1.57 -0.8336 0.984
#> [3,] -0.486 1.33 1.3344 2.659
#> [4,] -0.316 1.37 -0.0332 0.551
#> [5,] 0.205 1.43 -0.6893 0.733
#> [6,] -0.446 1.16 -0.0950 0.670
Our brms
model can be then fitted using the
brmcoda()
function.
mv <- brmcoda(compilr = cilr,
formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ STRESS + (1 | ID),
cores = 8, seed = 123, backend = "cmdstanr")
#> Setting 'rescor' to TRUE by default for this model
#> Warning: In the future, 'rescor' will be set to FALSE
#> by default for all models. It is thus recommended to
#> explicitely set 'rescor' via 'set_rescor' instead of
#> using the default.
#> Compiling Stan program...
#> Start sampling
Here is a summary()
of the model. We can see that stress
significantly predicted ilr1
and ilr2
.
summary(mv$Model)
#> Family: MV(gaussian, gaussian, gaussian, gaussian)
#> Links: mu = identity; sigma = identity
#> mu = identity; sigma = identity
#> mu = identity; sigma = identity
#> mu = identity; sigma = identity
#> Formula: ilr1 ~ STRESS + (1 | ID)
#> ilr2 ~ STRESS + (1 | ID)
#> ilr3 ~ STRESS + (1 | ID)
#> ilr4 ~ STRESS + (1 | ID)
#> Data: tmp (Number of observations: 3540)
#> Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#> total post-warmup draws = 4000
#>
#> Group-Level Effects:
#> ~ID (Number of levels: 266)
#> Estimate Est.Error l-95% CI
#> sd(ilr1_Intercept) 0.33 0.02 0.30
#> sd(ilr2_Intercept) 0.30 0.01 0.28
#> sd(ilr3_Intercept) 0.39 0.02 0.35
#> sd(ilr4_Intercept) 0.30 0.02 0.27
#> u-95% CI Rhat Bulk_ESS Tail_ESS
#> sd(ilr1_Intercept) 0.37 1.00 984 2019
#> sd(ilr2_Intercept) 0.34 1.00 1367 2028
#> sd(ilr3_Intercept) 0.43 1.00 1848 2524
#> sd(ilr4_Intercept) 0.33 1.01 1604 2019
#>
#> Population-Level Effects:
#> Estimate Est.Error l-95% CI u-95% CI
#> ilr1_Intercept -0.43 0.02 -0.48 -0.39
#> ilr2_Intercept 1.46 0.02 1.42 1.51
#> ilr3_Intercept -0.88 0.03 -0.93 -0.82
#> ilr4_Intercept 0.65 0.02 0.60 0.69
#> ilr1_STRESS -0.01 0.00 -0.02 -0.00
#> ilr2_STRESS 0.01 0.00 0.00 0.01
#> ilr3_STRESS 0.00 0.00 -0.01 0.01
#> ilr4_STRESS 0.01 0.00 -0.00 0.01
#> Rhat Bulk_ESS Tail_ESS
#> ilr1_Intercept 1.01 1060 1933
#> ilr2_Intercept 1.00 1078 1660
#> ilr3_Intercept 1.00 2052 2818
#> ilr4_Intercept 1.00 1934 2660
#> ilr1_STRESS 1.00 5597 3516
#> ilr2_STRESS 1.00 4905 3916
#> ilr3_STRESS 1.00 6341 3363
#> ilr4_STRESS 1.00 5854 3408
#>
#> Family Specific Parameters:
#> Estimate Est.Error l-95% CI u-95% CI Rhat
#> sigma_ilr1 0.44 0.01 0.43 0.45 1.00
#> sigma_ilr2 0.38 0.00 0.37 0.39 1.00
#> sigma_ilr3 0.70 0.01 0.68 0.71 1.00
#> sigma_ilr4 0.53 0.01 0.51 0.54 1.00
#> Bulk_ESS Tail_ESS
#> sigma_ilr1 6407 3329
#> sigma_ilr2 6625 3272
#> sigma_ilr3 5853 2767
#> sigma_ilr4 4974 3417
#>
#> Residual Correlations:
#> Estimate Est.Error l-95% CI u-95% CI
#> rescor(ilr1,ilr2) -0.54 0.01 -0.57 -0.52
#> rescor(ilr1,ilr3) -0.18 0.02 -0.21 -0.14
#> rescor(ilr2,ilr3) -0.05 0.02 -0.09 -0.02
#> rescor(ilr1,ilr4) 0.11 0.02 0.07 0.14
#> rescor(ilr2,ilr4) -0.05 0.02 -0.08 -0.01
#> rescor(ilr3,ilr4) 0.56 0.01 0.53 0.58
#> Rhat Bulk_ESS Tail_ESS
#> rescor(ilr1,ilr2) 1.00 6181 3388
#> rescor(ilr1,ilr3) 1.00 4693 3018
#> rescor(ilr2,ilr3) 1.00 5906 3466
#> rescor(ilr1,ilr4) 1.00 5590 3278
#> rescor(ilr2,ilr4) 1.00 6525 3923
#> rescor(ilr3,ilr4) 1.00 5916 3176
#>
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
We are often interested in whether a predictor significantly predict the overall composition, in addition to the individual ILR coordinates. In Bayesian, this can be done by comparing the marginal likelihoods of two models. Bayes Factors (BFs) are indices of relative evidence of one model over another. In the context of compositional multilevel modelling, Bayes Factors provide two main useful functions:
We can utilize Bayes factors to answer the following question: “Which model is more likely to have produced the observed data?”
Let’s examine whether stress predicts the overall sleep-wake composition.
Note: To use Bayes factors, brmsfit
models must
be fitted with an additional non-default argument
save_pars = save_pars(all = TRUE)
.
# intercept only
mv0 <- brmcoda(compilr = cilr,
formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ 1 + (1 | ID),
iter = 6000, chains = 8, cores = 8, seed = 123, warmup = 1000,
backend = "cmdstanr", save_pars = save_pars(all = TRUE))
#> Setting 'rescor' to TRUE by default for this model
#> Warning: In the future, 'rescor' will be set to FALSE
#> by default for all models. It is thus recommended to
#> explicitely set 'rescor' via 'set_rescor' instead of
#> using the default.
#> Compiling Stan program...
#> Start sampling
# full model
mv <- brmcoda(compilr = cilr,
formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ STRESS + (1 | ID),
iter = 6000, chains = 8, cores = 8, seed = 123, warmup = 1000,
backend = "cmdstanr", save_pars = save_pars(all = TRUE))
#> Setting 'rescor' to TRUE by default for this model
#> Warning: In the future, 'rescor' will be set to FALSE
#> by default for all models. It is thus recommended to
#> explicitely set 'rescor' via 'set_rescor' instead of
#> using the default.
#> Compiling Stan program...
#> Start sampling
We can now compare these models with the
bayesfactor_models()
function
comparison <- bayesfactor_models(mv$Model, denominator = mv0$Model)
comparison
#> Bayes Factors for Model Comparison
#>
#> Model BF
#> [1] 2.74e-04
#>
#> * Against Denominator: [2]
#> * Bayes Factor Type: marginal likelihoods (bridgesampling)
With a \(BF\) < 1, our data favours the intercept only model, showing that there is insufficient evidence for stress predicting the overall sleep-wake composition.
Bayes factors provide a intuitive measure of the strength of evidence
of one model over the other or among different models. Check out the
bayestestR
packages for several other useful functions
related to BFs.