polyfreqs is an R package for the estimation of biallelic SNP frequencies, genotypes and heterozygosity in autopolyploid taxa using high-throughput sequencing data. It should work for diploids as well, but does not accommodate data sets of mixed ploidy.
0 in the total read count matrix.
NEW: polyfreqs now has a Google Groups page. Please feel free to join the group and post any questions that you may have about the software. [Google Groups link]
polyfreqs uses C++ code to implement its Gibbs sampling algorithm, which
will usually require the installation of additional software (depending on
the operating system [OS] being used). Windows users will need to install
Rtools. Mac OS X users will need to install the Xcode Command Line Tools.
Linux users will need an up-to-date version of the GNU Compiler Collection
(gcc) and the r-base-dev package. polyfreqs relies on the R package Rcpp,
which is also a good place to start for figuring out what you will need.
Note that Rcpp also requires the compilation of C++ code, so make sure that
the necessary compilers are installed appropriately for your OS. You can
install Rcpp directly from CRAN in the usual way using the
install.packages() command:
install.packages("Rcpp")
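Once Rcpp is installed, one quick way to check that the compilers are set up is to ask Rcpp to compile and run a trivial C++ expression with Rcpp::evalCpp(). The following is only a minimal sketch: the helper name toolchain_ok() is made up for illustration and is not part of polyfreqs or Rcpp.

```r
# Hypothetical helper (not part of polyfreqs or Rcpp): returns TRUE if Rcpp
# is installed and can compile and run a trivial C++ expression.
toolchain_ok <- function() {
  if (!requireNamespace("Rcpp", quietly = TRUE)) {
    return(FALSE)  # Rcpp itself is missing
  }
  # evalCpp() compiles the expression; a missing or broken compiler
  # raises an R error, which is turned into FALSE here.
  isTRUE(tryCatch(Rcpp::evalCpp("1 + 1") == 2, error = function(e) FALSE))
}

toolchain_ok()
```

If this returns FALSE, the compiler setup for your OS is the first thing to revisit before installing polyfreqs from source.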
polyfreqs v1.0.0 is now on CRAN.
You can install it like you would any other R package:
install.packages("polyfreqs")
Installing the latest development release of polyfreqs can be done using
the devtools package and the install_github() command. Install devtools
using install.packages("devtools"). polyfreqs can then be installed as
follows:
devtools::install_github("pblischak/polyfreqs")
Example code and tutorials for running polyfreqs can be found in the vignette. For more details on the model underlying polyfreqs, please see the associated paper in Molecular Ecology Resources: Blischak et al. The Supplemental Material also has a walk-through for analyzing a data set collected for autotetraploid potato (Solanum tuberosum).
Release notes
v1.0.2 – Small patch updating the code for sampling genotypes during the MCMC, which was giving underflow errors when total read counts were high (~1000x coverage).
v1.0.1 – Removed dependency on the RcppArmadillo sample() function by coding
our own version (nonunif_int() in the sample_g.cpp source file). The Gibbs
sampler should run a bit faster now.
v1.0.0 – First release. Now available on CRAN.
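For readers curious about the kind of underflow mentioned in the v1.0.2 note, the sketch below shows the general problem and one common remedy (subtracting the maximum log-weight before exponentiating). It is an illustration in plain R under made-up inputs, not the code used in polyfreqs: the log-weights and the function name normalize_log_weights() are assumptions chosen for the example.

```r
# Hypothetical unnormalized log-weights for three genotypes; values this
# small can arise when likelihoods are computed from very deep coverage.
log_w <- c(-1200, -1190, -1185)

# Naive approach: exp() underflows to 0 for every genotype,
# so normalizing gives 0/0 = NaN.
naive <- exp(log_w) / sum(exp(log_w))

# Stable approach: shift by the maximum before exponentiating.
normalize_log_weights <- function(log_w) {
  w <- exp(log_w - max(log_w))  # largest weight becomes exp(0) = 1
  w / sum(w)
}

probs <- normalize_log_weights(log_w)

# Draw one genotype (0-based index) from the stable probabilities.
g <- sample.int(length(probs), size = 1, prob = probs) - 1L
```

The shift leaves the relative weights unchanged, so the sampling distribution is the same one the naive calculation was trying to compute.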