In this vignette, we discuss how to specify multilevel models with compositional outcomes using multilevelcoda. In addition to multilevelcoda, we will use the brms package (to fit models) and the bayestestR package (to compute useful indices and compare models). We will also attach the built-in datasets mcompd (simulated compositional sleep and wake variables) and sbp (sequential binary partition).

library(multilevelcoda)
library(brms)
library(bayestestR)

data("mcompd") 
data("sbp") 

options(digits = 3)

Multilevel model with compositional outcomes

Computing compositions and isometric log ratio coordinates

A model with multilevel compositional outcomes is multivariate, as it has multiple ILR coordinate outcomes, each of which is predicted by a set of predictors. The ILR coordinate outcomes can be calculated using the compilr() function.

cilr <- compilr(data = mcompd, sbp = sbp,
                parts = c("TST", "WAKE", "MVPA", "LPA", "SB"), idvar = "ID")

head(cilr$TotalILR)
#>        ilr1 ilr2    ilr3  ilr4
#> [1,]  0.287 1.20  0.6270 1.702
#> [2,] -0.472 1.57 -0.8336 0.984
#> [3,] -0.486 1.33  1.3344 2.659
#> [4,] -0.316 1.37 -0.0332 0.551
#> [5,]  0.205 1.43 -0.6893 0.733
#> [6,] -0.446 1.16 -0.0950 0.670
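
Besides the total ILR coordinates shown above, the object returned by compilr() stores the other elements it computes (for example, the compositions themselves). Element names can differ across package versions, so the simplest way to see what is available is to inspect the object's structure:

str(cilr, max.level = 1)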

Fitting model

Our brms model can then be fitted using the brmcoda() function.

mv <- brmcoda(compilr = cilr,
              formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ STRESS + (1 | ID),
              cores = 8, seed = 123, backend = "cmdstanr")
#> Setting 'rescor' to TRUE by default for this model
#> Warning: In the future, 'rescor' will be set to FALSE
#> by default for all models. It is thus recommended to
#> explicitely set 'rescor' via 'set_rescor' instead of
#> using the default.
#> Compiling Stan program...
#> Start sampling

Here is a summary() of the model. We can see that stress significantly predicted ilr1 and ilr2.

summary(mv$Model)
#>  Family: MV(gaussian, gaussian, gaussian, gaussian) 
#>   Links: mu = identity; sigma = identity
#>          mu = identity; sigma = identity
#>          mu = identity; sigma = identity
#>          mu = identity; sigma = identity 
#> Formula: ilr1 ~ STRESS + (1 | ID) 
#>          ilr2 ~ STRESS + (1 | ID) 
#>          ilr3 ~ STRESS + (1 | ID) 
#>          ilr4 ~ STRESS + (1 | ID) 
#>    Data: tmp (Number of observations: 3540) 
#>   Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
#>          total post-warmup draws = 4000
#> 
#> Group-Level Effects: 
#> ~ID (Number of levels: 266) 
#>                    Estimate Est.Error l-95% CI
#> sd(ilr1_Intercept)     0.33      0.02     0.30
#> sd(ilr2_Intercept)     0.30      0.01     0.28
#> sd(ilr3_Intercept)     0.39      0.02     0.35
#> sd(ilr4_Intercept)     0.30      0.02     0.27
#>                    u-95% CI Rhat Bulk_ESS Tail_ESS
#> sd(ilr1_Intercept)     0.37 1.00      984     2019
#> sd(ilr2_Intercept)     0.34 1.00     1367     2028
#> sd(ilr3_Intercept)     0.43 1.00     1848     2524
#> sd(ilr4_Intercept)     0.33 1.01     1604     2019
#> 
#> Population-Level Effects: 
#>                Estimate Est.Error l-95% CI u-95% CI
#> ilr1_Intercept    -0.43      0.02    -0.48    -0.39
#> ilr2_Intercept     1.46      0.02     1.42     1.51
#> ilr3_Intercept    -0.88      0.03    -0.93    -0.82
#> ilr4_Intercept     0.65      0.02     0.60     0.69
#> ilr1_STRESS       -0.01      0.00    -0.02    -0.00
#> ilr2_STRESS        0.01      0.00     0.00     0.01
#> ilr3_STRESS        0.00      0.00    -0.01     0.01
#> ilr4_STRESS        0.01      0.00    -0.00     0.01
#>                Rhat Bulk_ESS Tail_ESS
#> ilr1_Intercept 1.01     1060     1933
#> ilr2_Intercept 1.00     1078     1660
#> ilr3_Intercept 1.00     2052     2818
#> ilr4_Intercept 1.00     1934     2660
#> ilr1_STRESS    1.00     5597     3516
#> ilr2_STRESS    1.00     4905     3916
#> ilr3_STRESS    1.00     6341     3363
#> ilr4_STRESS    1.00     5854     3408
#> 
#> Family Specific Parameters: 
#>            Estimate Est.Error l-95% CI u-95% CI Rhat
#> sigma_ilr1     0.44      0.01     0.43     0.45 1.00
#> sigma_ilr2     0.38      0.00     0.37     0.39 1.00
#> sigma_ilr3     0.70      0.01     0.68     0.71 1.00
#> sigma_ilr4     0.53      0.01     0.51     0.54 1.00
#>            Bulk_ESS Tail_ESS
#> sigma_ilr1     6407     3329
#> sigma_ilr2     6625     3272
#> sigma_ilr3     5853     2767
#> sigma_ilr4     4974     3417
#> 
#> Residual Correlations: 
#>                   Estimate Est.Error l-95% CI u-95% CI
#> rescor(ilr1,ilr2)    -0.54      0.01    -0.57    -0.52
#> rescor(ilr1,ilr3)    -0.18      0.02    -0.21    -0.14
#> rescor(ilr2,ilr3)    -0.05      0.02    -0.09    -0.02
#> rescor(ilr1,ilr4)     0.11      0.02     0.07     0.14
#> rescor(ilr2,ilr4)    -0.05      0.02    -0.08    -0.01
#> rescor(ilr3,ilr4)     0.56      0.01     0.53     0.58
#>                   Rhat Bulk_ESS Tail_ESS
#> rescor(ilr1,ilr2) 1.00     6181     3388
#> rescor(ilr1,ilr3) 1.00     4693     3018
#> rescor(ilr2,ilr3) 1.00     5906     3466
#> rescor(ilr1,ilr4) 1.00     5590     3278
#> rescor(ilr2,ilr4) 1.00     6525     3923
#> rescor(ilr3,ilr4) 1.00     5916     3176
#> 
#> Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
#> and Tail_ESS are effective sample size measures, and Rhat is the potential
#> scale reduction factor on split chains (at convergence, Rhat = 1).
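
To focus on the STRESS effects across the four ILR coordinates, the fixed-effect estimates can also be pulled directly from the underlying brmsfit (stored in mv$Model) with brms' fixef(). A minimal sketch:

# Fixed effects as a matrix; rows are named <response>_<coefficient>, e.g. ilr1_STRESS
fe <- fixef(mv$Model)
fe[grep("STRESS", rownames(fe)), ]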

Bayes Factor for compositional multilevel modelling

We are often interested in whether a predictor predicts the overall composition, in addition to the individual ILR coordinates. In a Bayesian framework, this can be done by comparing the marginal likelihoods of two models. Bayes Factors (BFs) are indices of relative evidence for one model over another. In the context of compositional multilevel modelling, Bayes Factors serve two main purposes:

  • Testing single parameters within a model
  • Comparing models

We can utilize Bayes factors to answer the following question: “Which model is more likely to have produced the observed data?”

Let’s examine whether stress predicts the overall sleep-wake composition.

Note: To use Bayes factors, brmsfit models must be fitted with an additional non-default argument save_pars = save_pars(all = TRUE).

# intercept only
mv0 <- brmcoda(compilr = cilr,
               formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ 1 + (1 | ID),
               iter = 6000, chains = 8, cores = 8, seed = 123, warmup = 1000,
               backend = "cmdstanr", save_pars = save_pars(all = TRUE))
#> Setting 'rescor' to TRUE by default for this model
#> Warning: In the future, 'rescor' will be set to FALSE
#> by default for all models. It is thus recommended to
#> explicitely set 'rescor' via 'set_rescor' instead of
#> using the default.
#> Compiling Stan program...
#> Start sampling
# full model
mv <- brmcoda(compilr = cilr,
              formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ STRESS + (1 | ID),
              iter = 6000, chains = 8, cores = 8, seed = 123, warmup = 1000,
              backend = "cmdstanr", save_pars = save_pars(all = TRUE))
#> Setting 'rescor' to TRUE by default for this model
#> Warning: In the future, 'rescor' will be set to FALSE
#> by default for all models. It is thus recommended to
#> explicitely set 'rescor' via 'set_rescor' instead of
#> using the default.
#> Compiling Stan program...
#> Start sampling

We can now compare these models with the bayesfactor_models() function.

comparison <- bayesfactor_models(mv$Model, denominator = mv0$Model)
comparison
#> Bayes Factors for Model Comparison
#> 
#>     Model       BF
#> [1]       2.74e-04
#> 
#> * Against Denominator: [2]
#> *   Bayes Factor Type: marginal likelihoods (bridgesampling)

With a BF < 1, our data favour the intercept-only model: the BF of about 2.74e-04 means the observed data are roughly 1/2.74e-04 ≈ 3,650 times more likely under the intercept-only model, so there is little evidence that stress predicts the overall sleep-wake composition.

Bayes factors provide an intuitive measure of the strength of evidence for one model over another, or among several models. Check out the bayestestR package for other useful functions related to BFs; for example, bayesfactor_parameters() can be used to test single parameters within a model, as sketched below.
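
As a rough sketch of that single-parameter use, Savage-Dickey Bayes factors from bayestestR's bayesfactor_parameters() require proper priors on the coefficients (brms' default flat priors will not do). The code below assumes that brmcoda() passes prior and sample_prior through to brm(), as it does for the other brm() arguments used above, and the normal(0, 1) priors are purely illustrative, not a recommendation.

# Refit with explicit (proper) priors on the regression coefficients and
# keep prior draws so Savage-Dickey Bayes factors can be computed
mv_bf <- brmcoda(compilr = cilr,
                 formula = mvbind(ilr1, ilr2, ilr3, ilr4) ~ STRESS + (1 | ID),
                 prior = c(prior(normal(0, 1), class = b, resp = ilr1),
                           prior(normal(0, 1), class = b, resp = ilr2),
                           prior(normal(0, 1), class = b, resp = ilr3),
                           prior(normal(0, 1), class = b, resp = ilr4)),
                 sample_prior = "yes",
                 cores = 8, seed = 123, backend = "cmdstanr")

# Evidence for each STRESS coefficient against a point null of 0
bayesfactor_parameters(mv_bf$Model, null = 0)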