The goal of ‘mustashe’ is to save time on long-running computations by storing and reloading the resulting object after the first run. The next time the computation is run, instead of evaluating the code, the stashed object is loaded. ‘mustashe’ is great for storing intermediate objects in an analysis.
You can install the released version of ‘mustashe’ from CRAN with:
install.packages("mustashe")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("jhrcook/mustashe")
The ‘mustashe’ package is loaded like any other, using the library() function.
library(mustashe)
Below is a simple example of how to use the stash() function from ‘mustashe’.
Let’s say, for part of an analysis, we are running a long simulation to generate random data rnd_vals. This is mocked below using the Sys.sleep() function. We can time this process using the ‘tictoc’ package.
tictoc::tic("random simulation")
stash("rnd_vals", {
  Sys.sleep(3)
  rnd_vals <- rnorm(1e5)
})
#> Stashing object.
tictoc::toc()
#> random simulation: 3.638 sec elapsed
Now, if we come back tomorrow and continue working on the same analysis, the second time this process is run the code is not evaluated, because the code passed to stash() has not changed. Instead, the stashed rnd_vals object is loaded.
tictoc::tic("random simulation")
stash("rnd_vals", {
  Sys.sleep(3)
  rnd_vals <- rnorm(1e5)
})
#> Loading stashed object.
tictoc::toc()
#> random simulation: 0.016 sec elapsed
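To make the idea concrete, here is a minimal sketch of the general technique in base R: keep the text of the code next to the stored value, and re-evaluate only when that text differs. The helper simple_stash() and its cache_dir argument are hypothetical names made up for this illustration. They are not part of ‘mustashe’, whose actual implementation may differ. Unlike stash(), this sketch returns the value instead of assigning it by name.

```r
# Hypothetical helper for illustration only; NOT part of 'mustashe'.
# It stores the deparsed code next to the value and re-runs the code
# only when the code text differs from what was stored.
simple_stash <- function(name, code, cache_dir = tempdir()) {
  caller_env <- parent.frame()
  code_expr <- substitute(code)  # capture the code without evaluating it
  code_text <- paste(deparse(code_expr), collapse = "\n")
  cache_file <- file.path(cache_dir, paste0(name, ".rds"))

  if (file.exists(cache_file)) {
    cached <- readRDS(cache_file)
    if (identical(cached$code_text, code_text)) {
      message("Loading cached object.")
      return(invisible(cached$value))
    }
  }

  message("Evaluating code.")
  value <- eval(code_expr, envir = caller_env)
  saveRDS(list(code_text = code_text, value = value), cache_file)
  invisible(value)
}

# Usage in a fresh, session-specific directory.
demo_dir <- file.path(tempdir(), "simple-stash-demo")
dir.create(demo_dir, showWarnings = FALSE)
rnd_vals <- simple_stash("rnd_vals", rnorm(5), demo_dir)
rnd_vals_again <- simple_stash("rnd_vals", rnorm(5), demo_dir)
identical(rnd_vals, rnd_vals_again)
```

Comparing the code text is the simplest possible check. A hash of the code would serve the same purpose with less stored data.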
A common problem with storing intermediates is that they have dependencies that can change. If a dependency changes, then we want the stashed value to be updated. This is accomplished by passing the names of the dependencies to the depends_on argument.
For instance, let’s say we are calculating some value foo using x. (For the following example, I will use a print statement to indicate when the code is evaluated.)
x <- 100
stash("foo", depends_on = "x", {
  print("Calculating `foo` using `x`.")
  foo <- x + 1
})
#> Stashing object.
#> [1] "Calculating `foo` using `x`."
foo
#> [1] 101
Now if x is not changed, then the code for foo does not get re-evaluated.
x <- 100
stash("foo", depends_on = "x", {
  print("Calculating `foo` using `x`.")
  foo <- x + 1
})
#> Loading stashed object.
foo
#> [1] 101
But if x does change, then foo gets re-evaluated.
x <- 200
stash("foo", depends_on = "x", {
  print("Calculating `foo` using `x`.")
  foo <- x + 1
})
#> Updating stash.
#> [1] "Calculating `foo` using `x`."
foo
#> [1] 201
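The dependency check can be sketched in base R as well: snapshot the current values of the named dependencies, store them with the result, and re-evaluate when the snapshot no longer matches. The helper stash_with_deps() and its cache_dir argument are hypothetical names for this illustration and are not part of ‘mustashe’. The sketch assumes the dependencies are ordinary values that survive a round trip through saveRDS(), and it leaves out the check on the code itself to stay short.

```r
# Hypothetical helper for illustration only; NOT part of 'mustashe'.
# It re-runs the code whenever the values of the named dependencies
# differ from the values recorded at the previous run.
stash_with_deps <- function(name, depends_on, code, cache_dir = tempdir()) {
  caller_env <- parent.frame()
  code_expr <- substitute(code)
  # Snapshot the current dependency values as a named list.
  dep_values <- mget(depends_on, envir = caller_env, inherits = TRUE)
  cache_file <- file.path(cache_dir, paste0(name, ".rds"))

  if (file.exists(cache_file)) {
    cached <- readRDS(cache_file)
    if (identical(cached$dep_values, dep_values)) {
      return(invisible(cached$value))  # dependencies unchanged: reuse
    }
  }

  value <- eval(code_expr, envir = caller_env)
  saveRDS(list(dep_values = dep_values, value = value), cache_file)
  invisible(value)
}

# Usage in a fresh, session-specific directory.
demo_dir <- file.path(tempdir(), "deps-demo")
dir.create(demo_dir, showWarnings = FALSE)
x <- 100
foo <- stash_with_deps("foo", "x", x + 1, demo_dir)
x <- 200
foo <- stash_with_deps("foo", "x", x + 1, demo_dir)  # x changed, so this re-runs
```

Storing full copies of the dependencies is wasteful for large objects. Storing a hash of each dependency would be the natural refinement.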
The ‘here’ package is useful for handling file paths in R projects, particularly when using an RStudio project. To have ‘mustashe’ build the file paths for stashed objects with its main function, here::here(), call use_here().
use_here()
#> The global option "mustashe.here" has been set `TRUE`.
#> Add `mustashe::use_here(silent = TRUE)` to you're '.Rprofile'
#> to have it set automatically in the future.
This behavior can be turned off, too.
dont_use_here()
#> No longer using `here::here()` for creating stash file paths.
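The message printed by use_here() suggests setting the option from a startup file. A possible '.Rprofile' entry might look like the following. The requireNamespace() guard is an addition for this sketch, assumed here so that R still starts cleanly on a machine where ‘mustashe’ is not installed.

```r
# Possible '.Rprofile' entry (a sketch; the guard is an assumption, not
# something the package requires).
if (requireNamespace("mustashe", quietly = TRUE)) {
  # Call taken from the message printed by use_here().
  mustashe::use_here(silent = TRUE)
}
```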
The inspiration for this package came from the cache() feature in the ‘ProjectTemplate’
package. While the functionality and implementation are a bit different,
this would have been far more difficult to do without referencing the
source code from ‘ProjectTemplate’.