Originally designed for application in the context of resource-limited
plant research and breeding programs, waves provides an open-source
solution to spectral data processing and model development by bringing
useful packages together into a streamlined pipeline. This package is a
wrapper for functions related to the analysis of point visible and
near-infrared reflectance measurements. It includes visualization,
filtering, aggregation, pretreatment, cross-validation set formation,
model training, and prediction functions to enable open-source
association of spectral and reference data.
Please note: function names were updated as of version 0.2.0. Old function names still work in this version but will be retired in upcoming package versions.
This package is documented in a peer-reviewed manuscript in the Plant Phenome Journal. Please cite the manuscript if you have found this package to be useful!
Hershberger, J, Morales, N, Simoes, CC, Ellerbrock, B, Bauchet, G, Mueller, LA, Gore, MA. Making waves in Breedbase: An integrated spectral data storage and analysis pipeline for plant breeding programs. Plant Phenome J. 2021; 4:e20012. https://doi.org/10.1002/ppj2.20012
Follow the installation instructions below, and then go wild! Use waves
to analyze your own data. Please report any bugs or feature requests by
opening issues in this repository.
More detailed examples can be found in the package vignette. The vignette can also be found by running the following:
vignette("waves")
Install the latest waves release directly from CRAN:
install.packages("waves")
Alternatively, install the development version to get the most up-to-date (but not necessarily thoroughly tested) version:
# install.packages("devtools")
devtools::install_github("GoreLab/waves")
Format your data. Match spectra with reference values so that you have a dataframe with unique identifiers, reference values, and other metadata as columns to the left of spectral values. Spectral column names should start with “X”.
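As a rough illustration of the layout described above, the following base R sketch builds a small data frame with an identifier column, a reference column, and "X"-prefixed spectral columns. All values and sample names are made up, and the column names `unique.id` and `reference` are assumptions; check each function's documentation for the names it expects.

```r
# Hypothetical reference table: one row per sample (values are illustrative)
reference <- data.frame(
  unique.id = c("plot_1", "plot_2", "plot_3"),
  reference = c(35.2, 41.7, 38.9)
)

# Hypothetical spectra: one row per sample, one column per wavelength (nm)
spectra <- data.frame(
  unique.id = c("plot_1", "plot_2", "plot_3"),
  `350` = c(0.41, 0.39, 0.44),
  `351` = c(0.42, 0.40, 0.45),
  `352` = c(0.43, 0.40, 0.46),
  check.names = FALSE
)

# Prefix every wavelength column name with "X"
is.wavelength <- names(spectra) != "unique.id"
names(spectra)[is.wavelength] <- paste0("X", names(spectra)[is.wavelength])

# Match spectra with reference values by identifier.
# merge() puts the key first, then the reference value, then the spectra,
# so identifiers and reference values end up to the left of the spectral columns.
formatted <- merge(reference, spectra, by = "unique.id")
str(formatted)
```

Any additional metadata columns would need to sit to the left of the first spectral column as well.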
Visualize and filter spectra using plot_spectra() and filter_spectra().
If you have more than one scan per unique identifier, aggregate the
scans by mean or median with aggregate_spectra().
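To make the aggregation step concrete, here is a base R sketch of what averaging repeated scans per identifier looks like. It uses `stats::aggregate()` rather than `aggregate_spectra()` itself, so it illustrates the operation and is not the package's implementation. The data and the column names `unique.id` and `reference` are hypothetical.

```r
# Hypothetical data: two scans for each of two samples
scans <- data.frame(
  unique.id = c("plot_1", "plot_1", "plot_2", "plot_2"),
  reference = c(35.2, 35.2, 41.7, 41.7),
  X350 = c(0.40, 0.42, 0.38, 0.40),
  X351 = c(0.41, 0.43, 0.39, 0.41)
)

# Collapse to one row per identifier by taking the mean of every other column
aggregated <- aggregate(. ~ unique.id, data = scans, FUN = mean)
print(aggregated)
```

Swapping `FUN = mean` for `FUN = median` gives the median-based version of the same idea.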
Use test_spectra() to perform spectral pretreatment, cross-validation
set formation, and model training functions over multiple iterations.
It brings together the following functions:
Spectral pretreatment with pretreat_spectra().
Cross-validation set formation with format_cv(). Choose from random,
stratified random, or a plant breeding-specific scheme from Jarquín et
al., 2017. The Plant Genome.
Model training with train_spectra().
Save trained prediction models with save_model().
Predict phenotypic values with new spectra and a saved model using
predict_spectra().
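The model-testing step above might look roughly like the sketch below, which runs test_spectra() on the bundled example dataset. Several details are assumptions and should be checked against `?test_spectra` and the vignette: the argument names (`train.data`, `num.iterations`, `pretreatment`, `tune.length`), the dataset column names `sample.id` and `DMC.oven`, and the requirement that the input contain only `unique.id`, `reference`, and spectral columns. The code is skipped if waves is not installed.

```r
if (requireNamespace("waves", quietly = TRUE)) {
  library(waves)

  # Start from the example dataset that ships with the package
  prepped <- ikeogu.2017

  # Assumed column names: rename the identifier and one reference phenotype
  names(prepped)[names(prepped) == "sample.id"] <- "unique.id"
  names(prepped)[names(prepped) == "DMC.oven"] <- "reference"

  # Keep identifier, reference, and "X"-prefixed spectral columns only;
  # drop rows with missing values
  spectral.cols <- grep("^X", names(prepped), value = TRUE)
  prepped <- na.omit(prepped[, c("unique.id", "reference", spectral.cols)])

  # Small settings to keep the run short; argument names are assumptions
  results <- test_spectra(
    train.data = prepped,
    num.iterations = 3,
    pretreatment = 1,
    tune.length = 3
  )

  # Inspect the top-level structure of the returned object
  str(results, max.level = 1)
}
```

The small iteration and tuning values are chosen only to keep the example quick; a real analysis would likely use larger values.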
The package comes with an example dataset (ikeogu.2017) from Ikeogu et
al. (2017) PLoS ONE that can be used to try out package capabilities.
This dataset includes vis-NIR spectra from cassava roots as well as two
reference phenotypes: