The adaptr package simulates adaptive clinical trials using adaptive stopping, adaptive arm dropping, and/or response-adaptive randomisation.
The package has been developed as part of the INCEPT (Intensive Care Platform Trial) project, funded primarily by a grant from Sygeforsikringen “danmark”.
Additional guidance on the key methodological considerations when planning and comparing adaptive clinical trials can be found in the open access article “An overview of methodological considerations regarding adaptive stopping, arm dropping and randomisation in clinical trials”, available in the Journal of Clinical Epidemiology.
First, load the package:
library(adaptr)
Then, set up a trial with the desired specifications.
adaptr
offers the general purpose function
setup_trial()
, but here we use the built-in
setup_trial_binom()
for a trial with a binary, binomially
distributed, undesirable outcome such as mortality (adaptr
also includes setup_trial_norm()
for continuous, normally
distributed outcomes). The example trial specification has three
characteristics:

- Response-adaptive randomisation with a minimum allocation probability of 15% in each arm (min_probs).
- Stopping for equivalence: if there is a > 90% probability (equivalence_prob) of treatment differences being < 5 %-points (equivalence_diff), the trial is stopped.
- Allocation probabilities are softened (soften_power) by a constant factor.

binom_trial <- setup_trial_binom(
  arms = c("Arm A", "Arm B", "Arm C"),
  true_ys = c(0.25, 0.20, 0.30),
  min_probs = rep(0.15, 3),
  data_looks = seq(from = 300, to = 2000, by = 100),
  equivalence_prob = 0.9,
  equivalence_diff = 0.05,
  soften_power = 0.5
)
See ?setup_trial()
for more details on these arguments
or vignette("Basic-examples", "adaptr")
for
basic example trial specifications and a thorough
review of the general settings, and
vignette("Advanced-example", "adaptr")
for an
advanced example including details on how to specify
user-written functions for generating outcomes and posterior draws.
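For comparison, a specification for a normally distributed outcome could look as follows. This is a minimal, hypothetical sketch: the means (true_ys) and standard deviations (sds) are arbitrary illustration values, and highest_is_best = TRUE marks the outcome as desirable; see ?setup_trial_norm() for the full set of arguments.

# Hypothetical trial with a continuous, desirable outcome
# (e.g., days alive without life support); values are illustrative only
norm_trial <- setup_trial_norm(
  arms = c("Arm A", "Arm B", "Arm C"),
  true_ys = c(15, 20, 13),  # true mean outcome in each arm
  sds = c(2, 2.5, 1.8),     # standard deviation in each arm
  data_looks = seq(from = 300, to = 2000, by = 100),
  highest_is_best = TRUE    # higher values are better
)

In the following, we continue with the binomial specification.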
We can print an overview of the trial specification by simply running:
binom_trial
#> Trial specification: generic binomially distributed outcome trial
#> * Undesirable outcome
#> * No common control arm
#> * Best arm: Arm B
#>
#> Arms, true outcomes, starting allocation probabilities
#> and allocation probability limits:
#> arms true_ys start_probs fixed_probs min_probs max_probs
#> Arm A 0.25 0.333 NA 0.15 NA
#> Arm B 0.20 0.333 NA 0.15 NA
#> Arm C 0.30 0.333 NA 0.15 NA
#>
#> Maximum sample size: 2000
#> Maximum number of data looks: 18
#> Planned data looks after: 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000 patients have reached follow-up
#> Number of patients randomised at each look: 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000
#>
#> Superiority threshold: 0.99 (all analyses)
#> Inferiority threshold: 0.01 (all analyses)
#> Equivalence threshold: 0.9 (all analyses) (no common control)
#> Absolute equivalence difference: 0.05
#> No futility threshold (not relevant - no common control)
#> Soften power for all analyses: 0.5
By default, probabilities are shown with 3 decimals. This can be changed by explicitly print()ing the specification with the prob_digits argument, for example:
print(binom_trial, prob_digits = 2)
#> Trial specification: generic binomially distributed outcome trial
#> * Undesirable outcome
#> * No common control arm
#> * Best arm: Arm B
#>
#> Arms, true outcomes, starting allocation probabilities
#> and allocation probability limits:
#> arms true_ys start_probs fixed_probs min_probs max_probs
#> Arm A 0.25 0.33 NA 0.15 NA
#> Arm B 0.20 0.33 NA 0.15 NA
#> Arm C 0.30 0.33 NA 0.15 NA
#>
#> Maximum sample size: 2000
#> Maximum number of data looks: 18
#> Planned data looks after: 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000 patients have reached follow-up
#> Number of patients randomised at each look: 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000
#>
#> Superiority threshold: 0.99 (all analyses)
#> Inferiority threshold: 0.01 (all analyses)
#> Equivalence threshold: 0.9 (all analyses) (no common control)
#> Absolute equivalence difference: 0.05
#> No futility threshold (not relevant - no common control)
#> Soften power for all analyses: 0.5
A single simulation is run with run_trial(); remember to define the seed to ensure reproducible results:
trial_res <- run_trial(binom_trial, seed = 12345)

trial_res
#> Single simulation result: generic binomially distributed outcome trial
#> * Undesirable outcome
#> * No common control arm
#>
#> Final status: inconclusive, stopped at final allowed adaptive analysis
#> Final/maximum allowed sample sizes: 2000/2000 (100.0%)
#> Available outcome data at last adaptive analysis: 2000/2000 (100.0%)
#>
#> Trial results overview:
#> arms true_ys final_status status_look status_probs final_alloc
#> Arm A 0.25 active NA NA 0.194
#> Arm B 0.20 active NA NA 0.656
#> Arm C 0.30 inferior 2000 0.007 0.150
#>
#> Estimates from final analysis (all patients):
#> arms sum_ys_all ns_all raw_ests_all post_ests_all post_errs_all lo_cri_all
#> Arm A 180 742 0.243 0.243 0.0161 0.213
#> Arm B 178 841 0.212 0.212 0.0141 0.185
#> Arm C 113 417 0.271 0.272 0.0221 0.230
#> hi_cri_all
#> 0.274
#> 0.240
#> 0.316
#>
#> Estimates from last adaptive analysis including each arm:
#> arms sum_ys ns raw_ests post_ests post_errs lo_cri hi_cri
#> Arm A 180 742 0.243 0.243 0.0159 0.213 0.275
#> Arm B 178 841 0.212 0.212 0.0141 0.185 0.241
#> Arm C 113 417 0.271 0.271 0.0215 0.230 0.316
#>
#> Simulation details:
#> * Random seed: 12345
#> * Credible interval width: 95%
#> * Number of posterior draws: 5000
#> * Posterior estimation method: medians with MAD-SDs
Again, we can choose the number of decimals with
print()
:
print(trial_res, prob_digits = 2)
#> Single simulation result: generic binomially distributed outcome trial
#> * Undesirable outcome
#> * No common control arm
#>
#> Final status: inconclusive, stopped at final allowed adaptive analysis
#> Final/maximum allowed sample sizes: 2000/2000 (100.0%)
#> Available outcome data at last adaptive analysis: 2000/2000 (100.0%)
#>
#> Trial results overview:
#> arms true_ys final_status status_look status_probs final_alloc
#> Arm A 0.25 active NA NA 0.19
#> Arm B 0.20 active NA NA 0.66
#> Arm C 0.30 inferior 2000 0.007 0.15
#>
#> Estimates from final analysis (all patients):
#> arms sum_ys_all ns_all raw_ests_all post_ests_all post_errs_all lo_cri_all
#> Arm A 180 742 0.24 0.24 0.016 0.21
#> Arm B 178 841 0.21 0.21 0.014 0.19
#> Arm C 113 417 0.27 0.27 0.022 0.23
#> hi_cri_all
#> 0.27
#> 0.24
#> 0.32
#>
#> Estimates from last adaptive analysis including each arm:
#> arms sum_ys ns raw_ests post_ests post_errs lo_cri hi_cri
#> Arm A 180 742 0.24 0.24 0.016 0.21 0.27
#> Arm B 178 841 0.21 0.21 0.014 0.19 0.24
#> Arm C 113 417 0.27 0.27 0.022 0.23 0.32
#>
#> Simulation details:
#> * Random seed: 12345
#> * Credible interval width: 95%
#> * Number of posterior draws: 5000
#> * Posterior estimation method: medians with MAD-SDs
Generally, we want to run many simulations using the same trial
specification to assess and compare performance metrics of different
trial designs. This is the job of run_trials()
(note the
final s); again, note the use of a
base_seed
for reproducible results. Here we run 25
simulations, but in practice you will generally want to run more. The
low number of simulations in this example has been chosen to make
run-time tolerable when producing the example, but it leads to
uncertainty and instability in the results, as seen below.
trial_res_mult <- run_trials(binom_trial, n_rep = 25, base_seed = 67890)
run_trials()
can run simulations on several CPU cores
concurrently: set the cores
argument to some number greater
than 1
(which is the default value, resulting in
serial/non-parallel processing). As an aside, you can see the number of
available CPU cores by calling parallel::detectCores()
.
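For example, a sketch of running the same simulations on two cores:

# Number of CPU cores available on this machine
parallel::detectCores()

# Run the simulations concurrently on 2 cores (adjust to your machine)
trial_res_mult <- run_trials(binom_trial, n_rep = 25, base_seed = 67890,
                             cores = 2)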
The results of multiple simulations may be summarised by printing the resulting object:
trial_res_mult
#> Multiple simulation results: generic binomially distributed outcome trial
#> * Undesirable outcome
#> * Number of simulations: 25
#> * Number of simulations summarised: 25 (all trials)
#> * No common control arm
#> * Selection strategy: no selection if no superior arm
#> * Treatment effect compared to: no comparison
#>
#> Performance metrics (using posterior estimates from last adaptive analysis):
#> * Sample sizes: mean 1324.0 (SD: 547.2) | median 1200.0 (IQR: 1000.0 to 1900.0)
#> * Total summarised outcomes: mean 296.1 (SD: 111.5) | median 282.0 (IQR: 228.0 to 415.0)
#> * Total summarised outcome rates: mean 0.228 (SD: 0.013) | median 0.230 (IQR: 0.220 to 0.235)
#> * Conclusive: 80.0%
#> * Superiority: 64.0%
#> * Equivalence: 16.0%
#> * Futility: 0.0% [not assessed]
#> * Inconclusive at max sample size: 20.0%
#> * Selection probabilities: Arm A: 0.0% | Arm B: 64.0% | Arm C: 0.0% | None: 36.0%
#> * RMSE: 0.01642
#> * RMSE treatment effect: not estimated
#> * Ideal design percentage: 100.0%
#>
#> Simulation details:
#> * Simulation time: 1.1 secs
#> * Base random seed: 67890
#> * Credible interval width: 95%
#> * Number of posterior draws: 5000
#> * Estimation method: posterior medians with MAD-SDs
This calls the summary()
method (as known from, e.g.,
regression models in R
), which summarises the results, and
prints the output from that function in a human-friendly manner using
the print()
method for summarised simulations.
The summary() function can also be called directly, which allows more
control over how results are summarised (including which arms are
selected in inconclusive trials) and allows subsequent extraction of
individual key results. In addition, the number of digits can be
controlled when printing:
res_sum <- summary(trial_res_mult)

print(res_sum, digits = 1)
#> Multiple simulation results: generic binomially distributed outcome trial
#> * Undesirable outcome
#> * Number of simulations: 25
#> * Number of simulations summarised: 25 (all trials)
#> * No common control arm
#> * Selection strategy: no selection if no superior arm
#> * Treatment effect compared to: no comparison
#>
#> Performance metrics (using posterior estimates from last adaptive analysis):
#> * Sample sizes: mean 1324.0 (SD: 547.2) | median 1200.0 (IQR: 1000.0 to 1900.0)
#> * Total summarised outcomes: mean 296.1 (SD: 111.5) | median 282.0 (IQR: 228.0 to 415.0)
#> * Total summarised outcome rates: mean 0.228 (SD: 0.013) | median 0.230 (IQR: 0.220 to 0.235)
#> * Conclusive: 80.0%
#> * Superiority: 64.0%
#> * Equivalence: 16.0%
#> * Futility: 0.0% [not assessed]
#> * Inconclusive at max sample size: 20.0%
#> * Selection probabilities: Arm A: 0.0% | Arm B: 64.0% | Arm C: 0.0% | None: 36.0%
#> * RMSE: 0.01642
#> * RMSE treatment effect: not estimated
#> * Ideal design percentage: 100.0%
#>
#> Simulation details:
#> * Simulation time: 1.1 secs
#> * Base random seed: 67890
#> * Credible interval width: 95%
#> * Number of posterior draws: 5000
#> * Estimation method: posterior medians with MAD-SDs
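The select_strategy argument, for instance, controls which arm (if any) is considered selected in simulations not ending in superiority, and individual results can subsequently be extracted from the summarised object. A sketch (the "best" strategy value and the prob_superior element name are assumptions based on the package documentation and the metric names printed above):

# Summarise again, selecting the best-performing arm in simulations
# without a superior arm ("best" assumed to be a valid strategy)
res_sum_best <- summary(trial_res_mult, select_strategy = "best")

# Extract a single key result from the summarised object
# (element name assumed to match the printed metrics)
res_sum$prob_superior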
The summary()
method, however, may not necessarily be
what you want. adaptr
has additional functions that may be
used on multiple simulations using the same trial specification.
The extract_results() function extracts key trial results and yields a
tidy data.frame with one simulation per row:
extr_res <- extract_results(trial_res_mult)

nrow(extr_res)
#> [1] 25
head(extr_res)
#> sim final_n sum_ys ratio_ys final_status superior_arm selected_arm
#> 1 1 2000 415 0.2075000 max <NA> <NA>
#> 2 2 600 139 0.2316667 superiority Arm B Arm B
#> 3 3 1000 237 0.2370000 superiority Arm B Arm B
#> 4 4 900 209 0.2322222 equivalence <NA> <NA>
#> 5 5 2000 441 0.2205000 superiority Arm B Arm B
#> 6 6 1900 431 0.2268421 superiority Arm B Arm B
#> sq_err sq_err_te
#> 1 NA NA
#> 2 7.853843e-04 NA
#> 3 4.190319e-05 NA
#> 4 NA NA
#> 5 3.422824e-06 NA
#> 6 4.852161e-05 NA
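Because extract_results() returns an ordinary data.frame, the results can be processed further with standard R tools, for example:

# Tabulate final trial statuses across the 25 simulations
table(extr_res$final_status)

# Mean sample size at trial end
mean(extr_res$final_n)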
Further, the check_performance()
function calculates key
performance metrics and returns them in a tidy data.frame
with one metric per row, and may also be used to assess the uncertainty
in these estimates using bootstrapping:
perf_res <- check_performance(trial_res_mult, uncertainty = TRUE, n_boot = 1000,
                              boot_seed = "base")
print(perf_res, digits = 3)
#> metric est err_sd err_mad lo_ci hi_ci
#> 1 n_summarised 25.000 0.000 0.000 25.000 25.000
#> 2 size_mean 1324.000 104.423 106.747 1131.800 1524.000
#> 3 size_sd 547.175 47.966 48.405 441.268 627.113
#> 4 size_median 1200.000 229.311 296.520 1000.000 1600.000
#> 5 size_p25 1000.000 132.490 0.000 600.000 1100.000
#> 6 size_p75 1900.000 184.737 148.260 1500.000 2000.000
#> 7 sum_ys_mean 296.120 21.297 22.120 255.065 335.856
#> 8 sum_ys_sd 111.481 10.793 10.592 87.716 129.987
#> 9 sum_ys_median 282.000 41.603 63.752 234.000 359.000
#> 10 sum_ys_p25 228.000 31.031 13.343 138.000 266.000
#> 11 sum_ys_p75 415.000 32.732 7.413 325.000 431.000
#> 12 ratio_ys_mean 0.228 0.003 0.003 0.223 0.233
#> 13 ratio_ys_sd 0.013 0.002 0.002 0.009 0.016
#> 14 ratio_ys_median 0.230 0.004 0.003 0.220 0.234
#> 15 ratio_ys_p25 0.220 0.004 0.005 0.210 0.228
#> 16 ratio_ys_p75 0.235 0.003 0.003 0.232 0.242
#> 17 prob_conclusive 0.800 0.079 0.059 0.640 0.960
#> 18 prob_superior 0.640 0.095 0.119 0.440 0.800
#> 19 prob_equivalence 0.160 0.076 0.059 0.040 0.320
#> 20 prob_futility 0.000 0.000 0.000 0.000 0.000
#> 21 prob_max 0.200 0.079 0.059 0.040 0.360
#> 22 prob_select_arm_Arm A 0.000 0.000 0.000 0.000 0.000
#> 23 prob_select_arm_Arm B 0.640 0.095 0.119 0.440 0.800
#> 24 prob_select_arm_Arm C 0.000 0.000 0.000 0.000 0.000
#> 25 prob_select_none 0.360 0.095 0.119 0.200 0.560
#> 26 rmse 0.016 0.004 0.004 0.008 0.024
#> 27 rmse_te NA NA NA NA NA
#> 28 idp 100.000 0.000 0.000 100.000 100.000
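Since check_performance() likewise returns a plain data.frame, individual metrics are easily pulled out, for example:

# Bootstrapped uncertainty for the probability of superiority only
perf_res[perf_res$metric == "prob_superior", ]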
Finally, the stability of performance metrics according to the number
of simulations may be assessed visually using the
plot_convergence()
function. Note that all
adaptr
plotting functions require the ggplot2
package.
# Convergence plots for four performance metrics
plot_convergence(trial_res_mult, metrics = c("size mean", "prob superior",
                                             "rmse", "idp"))
As the convergence plots show, the low number of simulations used in this example leads to substantial uncertainty and instability in the performance metrics (the ideal design percentage is always 100% here, but it would likely drop somewhat if more simulations were conducted).
In addition to the convergence plots, results may be visualised using
the plot_status()
and plot_history()
functions. We need non-sparse results for plot_history()
(but not for plot_status()
or
plot_convergence()
presented above), so first we re-run
run_trials()
with the sparse
argument set to
FALSE
:
trial_res_mult <- run_trials(binom_trial, n_rep = 25, base_seed = 67890,
                             sparse = FALSE)
First, we plot the overall trial statuses according to the total
number of patients randomised (this function does not require
sparse = FALSE
):
plot_status(trial_res_mult, x_value = "total n")
We can also plot the statuses for specific arms, or for all arms by
supplying NA to the arm argument:
plot_status(trial_res_mult, x_value = "total n", arm = NA, ncol = 1)
Next, we plot the history of allocation probabilities at each
adaptive analysis look. Intervals cover the inter-quartile range by
default (interval_width = 0.5
):
plot_history(trial_res_mult)
Plotting other summary metrics is possible; see
plot_history()
.
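For example, a sketch of plotting the number of patients allocated to each arm over time with wider intervals (the y_value value used here is an assumption; see ?plot_history for the metrics actually supported):

# Allocation counts per arm with 95% intervals instead of the default IQR
plot_history(trial_res_mult, y_value = "n", interval_width = 0.95)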
If using the package, please consider citing:
citation(package = "adaptr")
#>
#> To cite adaptr in publications use:
#>
#> Granholm A, Jensen AKG, Lange T, Kaas-Hansen BS (2022). adaptr: an R
#> package for simulating and comparing adaptive clinical trials.
#> Journal of Open Source Software, 7(72), 4284. URL
#> https://doi.org/10.21105/joss.04284.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Article{,
#> title = {{adaptr}: an R package for simulating and comparing adaptive clinical trials},
#> author = {Anders Granholm and Aksel Karl Georg Jensen and Theis Lange and Benjamin Skov Kaas-Hansen},
#> journal = {Journal of Open Source Software},
#> year = {2022},
#> volume = {7},
#> number = {72},
#> pages = {4284},
#> url = {https://doi.org/10.21105/joss.04284},
#> doi = {10.21105/joss.04284},
#> }