optimall
offers a collection of functions that are
designed to streamline the process of optimum sample allocation,
specifically under an adaptive, multi-wave approach. Its main functions
allow users to:
Define, split, and merge strata based on values or quantiles of other variables.
Calculate the optimum number of samples to allocate to each stratum in a given study in order to minimize the variance of an estimate of interest.
Select specific ids to sample based on a stratified sampling design.
Optimally allocate a fixed number of samples to an ancillary sampling wave based on results from a prior wave.
When used together, these functions can automate most of the sampling workflow.
You can install optimall
from CRAN with:
# install.packages("optimall")
Or, you can install the development version from GitHub with:
# install.packages("devtools")
::install_github("yangjasp/optimall") devtools
Given a dataframe where each row represents one unit,
optimall
can define the stratum each unit belongs to:
library(optimall)
<- split_strata(data = data, strata = "old_strata",
data split_var = "variable_to_split_on",
type = "value", split_at = c(1,2))
If the strata or values to split at are not obvious, it may be useful
to try a few different splits and observe the effects that each has on
sample allocation. optimall
makes this process quick and
easy with a Shiny app that can be launched with
optimall_shiny
. This app allows users to adjust inputs to
the split_strata
function and view the results in real
time. Once the parameters are satisfactory, the user can confirm the
split and move on to further ones if desired. The app prints the code
required to replicate the splits in optimall
, so making the
changes inside of R becomes as easy as a copy and paste!
Screenshot:
We can then use optimum_allocation
to calculate the
optimum allocation a fixed number of samples to our strata in order to
minimize the variance of a variable of interest.
optimum_allocation(data = data, strata = "new_strata",
y = "var_of_interest", nsample = 100)
optimall
offers more functions that streamline adaptive,
multi-wave sampling workflows. For a more detailed description, see
package vignettes.
McIsaac MA, Cook RJ. Adaptive sampling in two‐phase designs: a biomarker study for progression in arthritis. Statistics in Medicine. 2015 Sep 20;34(21):2899-912.
Wright, T. (2014). A simple method of exact optimal sample allocation under stratification with any mixed constraint patterns. Statistics, 07.