This vignette shows how to use pirouette
in a scientific experiment.
library(pirouette)
Remember the main question pirouette
helps to answer:
Would we know the true phylogeny in nature, how well would we be able to infer it from a DNA alignment?
In this vignette, we investigate the effect of extintion rate in inferring a phylogeny from a (constant rate) birth-death process. We expect that inference is better if there is more information to work on. For example, when there are more taxa. But extinctions do not only decrease the number of taxa, also the information when these extinct taxa cam into exinstence is lost. As extinctions lower the amount of information, we predict a higher inference error for higher extinction rates.
Our experimental setup will not be the best setup. A good experimental setup will have, among others, more replicates, longer MCMC runs and a more equal comparison.
We specify the extinction rates we will investigate:
seq(0.0, 0.4, length.out = 3) ext_rates <-
For a speciation rate, we use one value:
1.0 spec_rate <-
Now, per extinction rate, we simulate one phylogeny:
set.seed(42)
4.0
crown_age <- 6
n_taxa <- list()
phylogenies <-for (i in seq_along(ext_rates)) {
ext_rates[i]
ext_rate <- create_exemplary_dd_tree(
phylogeny <-n_taxa = n_taxa,
crown_age = crown_age,
extinction_rate = ext_rate
) phylogeny
phylogenies[[i]] <-
}::plot.phylo(phylogenies[[1]]) ape
::plot.phylo(phylogenies[[2]]) ape
::plot.phylo(phylogenies[[3]]) ape
Now, per phylogeny, we estimate the error BEAST2 makes:
create_alignment_params(
alignment_params <-root_sequence = create_blocked_dna(length = 20)
) create_experiment(
experiment <-inference_conditions = create_inference_conditions(
model_type = "generative",
run_if = "always",
do_measure_evidence = FALSE
),inference_model = create_inference_model(
tree_prior = create_bd_tree_prior(),
mcmc = create_test_mcmc()
)
) list(experiment)
experiments <- create_pir_params(
pir_params <-alignment_params = alignment_params,
experiments = experiments
) list()
errors <-if (rappdirs::app_dir()$os != "win" && beastier::is_beast2_installed()) {
for (i in seq_along(ext_rates)) {
phylogenies[[i]]
phylogeny <- pirouette::pir_run(
df <-phylogeny = phylogeny,
pir_params = pir_params
) df
errors[[i]] <-
} }
Now, we put all data in one data frame:
if (rappdirs::app_dir()$os != "win" && beastier::is_beast2_installed()) {
which(colnames(errors[[1]]) == "error_1")
first_error_col <- ncol(errors[[1]])
last_error_col <- 1 + last_error_col - first_error_col
n_errors <- data.frame(
df <-ext_rate = as.factor(rep(ext_rates, each = n_errors)),
idx = seq(1, n_errors),
error = NA
)
for (i in seq_along(ext_rates)) {
errors[[i]]
this_df <- this_df[1, first_error_col:last_error_col]
nltts <- 1 + (i * n_errors) - n_errors
to_row_index <-$error[to_row_index:(to_row_index + n_errors - 1)] <- t(as.numeric(nltts))
df
} }
Plotting it:
if (rappdirs::app_dir()$os != "win" && beastier::is_beast2_installed()) {
::ggplot(
ggplot2::aes(x = ext_rate, y = error)
df, ggplot2+ ggplot2::geom_boxplot()
) ::ggplot(
ggplot2::aes(x = ext_rate, y = error)
df, ggplot2+ ggplot2::geom_violin()
) }