calc_genoprob()
for the X chromosome, so that it just keeps
track of the chromosome from the DO/HS parent. In males, we are assuming
that the DO/HS parent is the mother.Added dependency on version of Rcpp (>= 1.0.7)
Revised genoprob_to_alleleprob()
to work with DOF1
and HSF1. plot_onegeno()
should also work now in these
cases. (Issue #140 and Issue #141)
Revised predict_snpgeno()
to work for DOF1 and HSF1
populations.
Now give a better error message in
genoprob_to_snpprob()
if snpinfo
is missing
the sdp
column (Issue #207).
In read_csv()
, now give warnings if there are
duplicate column names or duplicate row names in the file.
In read_cross2()
, moved the warning regarding the
number of alleles to before the alleles object gets corrected (Issue
#209).
Now issue a warning message if founder genotypes are included but not used (Issue #211).
The treatment of the male X chromosome in DOF1 and HSF1 was incorrect. We’re now assuming that the DO or HS parent was the mother in the F1 cross, in which case males will be hemizygous for one of the DO/HS founder alleles.
The default colors for the Collaborative Cross (CC) have been
changed to a color-blind friendly palette. The original CC colors remain
as CCorigcolors
; the previous default is now
CCaltcolors
. The new colors are derived from the palette in
Wong 2011 Nature
Methods.
plot_coefCC()
was revised to include
col=CCcolors
as an argument. The default is the new
color-blind friendly CC colors, but one can now more easily use
col=CCaltcolors
or col=CCorigcolors
to get a
different choice.
Added plot_sdp()
to plot the strain distribution
patterns of SNPs using tracks of tick-marks for each founder strain.
(Issue #163)
Added arguments sdp_panel
and
strain_labels
to plot_snpasso()
so that you
can include the plot_sdp()
panel with the SNP association
results and/or the genes.
Added replace_ids()
for a matrix or data frame
(using the row names as the individual IDs). (Issue #191)
Have calc_het()
give an error if the input are for
allele dosages. (Issue #190)
Sneaky change in ind_ids()
makes it apply to
calc_genoprob
and fst_genoprob
objects. I’m
not sure how to document this. (Issue #189)
The output of est_herit()
now includes the residual
SD as an attribute, "resid_sd"
. (Issue #16)
Implemented a cross type "hsf1"
that is similar to
"dof1"
, for a cross between an 8-way HS individual and a
9th strain. (Issue #149)
calc_kinship()
died with cryptic error if genotype
probabilities didn’t have a names attribute; now using
seq_len(probs)
.
Give better error messages in est_map()
,
viterbi()
, and sim_geno()
if the cross is
missing the genetic map.
Fixed Issue #194: calc_genoprob()
was taking
chromosome names from cross$gmap
which might have been
missing; now using names(cross$geno)
.
Fixed Issue #195: in create_snpinfo()
, drop markers
that are non-informative.
Fixed Issue #196, that step()
returns
-Inf
rather than NaN
for general AIL. This had
to do with the handling of -Inf
in
addlog()
.
In fit1()
and scan1coef()
, wasn’t
grabbing the ...
arguments. properly.
Ugly c++ revisions to avoid clang UBsan warnings on CRAN. (Issue #169)
Revised reduce_markers()
so that it can handle the
case of many markers, by working with them in smaller
batches.
fit1()
now returns both fitted values and
residuals.
fit1()
can be run with genotype probabilities
omitted, in which case an intercept column of 1’s is used (Issue
#151).
Updated mouse gene database with 2020-09-07 data from MGI.
Implemented Issue #184, to make calc_het()
multi-core.
Make the vdiffr package optional: only test the plots locally, and only if vdiffr is installed.
calc_sdp()
can now take a plain vector (Issue
#142).
Added a lodcolumn
argument to maxlod()
(Issue #137).
Fixed Issue
#181, where calc_het()
gave values > 1 when used
with R/qtl2fst-based
probabilities. Also fixed a similar bug in
calc_geno_freq()
.
Fixed Issue
#172, where fit1()
gave incorrect fitted values when
kinship
is provided, because they weren’t “rotated
back”.
fit1()
no longer provides individual LOD scores
(ind_lod
) when kinship
is used, as with the
linear mixed model, the LOD score is not simply the sum of individual
contributions.
Fixed Issue
#174 re genoprob_to_alleleprob()
for general AIL
crosses. We had not implemented the geno2allele_matrix()
function.
Fixed Issue #164, so plot_pxg()
can handle a
phenotype that is a single-column data frame.
Fixed Issue #135, so plot_scan1()
can take vector
input (which is then converted to a single-column matrix).
Fixed Issue #157, to have calc_genoprob()
give a
better error message about missing genetic map.
Fixed Issue #178, to have read_cross2()
give a
warning not an error if incorrect number of alleles.
Fixed Issue #180 re scan1()
error if phenotypes’
rownames have rownames.
Fixed Issue #146, revising predict_snpgeno()
so that
it works for homozygous populations, like MAGIC lines or the
Collaborative Cross.
Fixed Issue #176, that guess_phase()
doesn’t work
with cross type "genail"
. Needed to define
phase_known_crosstype
as "genailpk"
in cross_genail.h
because otherwise is_phase_known()
will return
TRUE.
Fixed compilation error on Solaris on CRAN, due to a log(10) and sqrt(10). Solaris refuses to do log(int) or sqrt(int), I guess.
Fixed some conflicting enum definitions in c++ code
Added recognition of the R Core Team as a contributor, as the package includes a copy of code for Brent’s method for univariate function optimization. Also added a Copyright field in the DESCRIPTION field, explaining the copyright of that code.
Sped up some of the examples and tests. Tests no longer use more than 2 cores (even those that are only run locally).
Added \value{}
sections in the documentation for
various functions. Added further explanation of "viterbi"
,
"scan1"
, "scan1perm"
, etc. S3 classes in the
documentation.
Added some functions for diagnostics: recode_snps()
,
calc_raw_het()
, calc_raw_geno_freq()
,
calc_raw_maf()
, and
calc_raw_founder_maf()
.
Added argument blup
to fit1()
, for
getting BLUPs for a single fixed QTL position. At present, just gives
estimates and coefficients by calling scan1blup()
with a
single position.
pull_genoprobpos()
can now take either a marker name
(as before) or a set of map, chromosome, and position (from which it
uses find_marker()
to get the marker name).
Added plot function for the results of
compare_geno()
. (Plots histogram of upper
triangle.)
Added functions n_founders()
and
founders()
for getting the number of founders and the
founder strain names for a cross2 object.
scan1()
now takes an optional hsq
argument, so that the residual heritability may be specified rather than
estimated.
write_control_file()
now allows cross info codes
with a cross info file (previously only allowed with a covariate).
read_cross2()
gives a warning if there are cross info
conversion codes but more than one cross info column.
Small fix in read_cross2()
to allow multiple cross
info covariates.
Added a check that the founder genotypes have the same strain IDs on each chromosome.
convert2cross2()
now includes alleles
component even if it wasn’t present as an attribute.
Added function sdp2char()
for converting numeric SDP
codes to character strings like "ABC|DEFGH"
.
Updated mouse gene database with 2019-08-12 data from MGI.
get_common_ids()
strips off names from output, just
in case.
Added internal functions rqtl1_crosstype()
and
rqtl1_chrtype()
.
Fixed typo in help for scan1()
and related
functions.
genoprob_to_snpprob()
was giving an error if you
gave a cross2 object in place of a snpinfo table and it had monomorphic
markers.
Fixed problem with weights in scan1()
and related
functions when their derived from table()
. Make sure
they’re a plain numeric vector, not an array.
Fixed check_cross2()
: the check for invalid
genotypes wasn’t happening.
Better error message for the case that there are no markers in common between map and genotypes.
extract_dim_from_header()
, used by
read_cross2()
and read_csv()
, now just looks
for the number part in the rest of the line.
maxlod()
now handles missing values (forcing
na.rm=TRUE
). If all values are missing it gives a warning
and returns -Inf
. [Fixes Issue #134.]
In max_scan1()
, treat the case that the input has no
column names. [Fixes Issue #133.]
max_scan1()
was giving a messed up error message if
lodcolumn
was out of range. [Fixes Issue #132.]
Revised the script inst/scripts/create_ccvariants.R
to capture all of the consequences and genes for each SNP
(rather than just the first), and fixing a bug that prevented capture of
indels from chromosomes 6-X. Consequently, revised the example SQLite
database extdata/cc_variants_small.sqlite
and associated
tests.
scan1coef()
and fit1()
now, by default,
gives coefficient estimates for the QTL effects that sum to 0, with an
additional coefficient being the intercept. This makes it more like
DOQTL (and scan1blup()
). The previous behavior can be
obtained with the argument zerosum=FALSE
.
Added function create_snpinfo()
for creating a SNP
information table from a cross2 object, for use with
scan1snps()
.
Updated extdata/mouse_genes_small.sqlite
using
updated MGI annotations. Some of the field names have changed.
In check_cross2()
, added a test for alleles being a
vector of character strings.
Fixed some tests for R 3.6, due to change in random number generation.
Use Markdown for function documentation, throughout
In genoprob_to_snpprob()
when a cross object is
provided, make sure the genotype probabilities get subset to the cross
markers.
Fixed bug in scan1snps()
re
keep_all_snps=FALSE
. It wasn’t subsetting to the index SNPs
properly. Added an internal function
reduce_to_index_snps()
. (See Issue #89.)
Fixed bug in step probabilities for 4-, 8-, and 16-way RIL by selfing.
Fixed bug in zip_datafiles()
when the files are in a
subdirectory. (See Issue #102.)
Fixed bug in plot_peaks()
for the case that the
input peaks
object does not contain QTL intervals. (See Issue #107.)
Fixed inappropriate warning message for check of
cross_info
with cross type risib8
.
Fixed bugs in guess_phase()
and
locate_xo()
where we needed an any()
around a
comparison of two vectors.
Added plot_lodpeaks()
for scatterplot of LOD score
vs position for inferred QTL from find_peaks()
output.
Added new cross types "genril"
and
"genail"
, implemented to handle any number of founders;
include the number of founders in the cross type, for example
"genril38"
or "genail38"
. The cross
information has length 1 + number of founders, with first column being
the number of generations and the remaining columns being non-negative
integers that indicate the relative frequencies of the founders in the
initial population (these will be scaled to sum to 1).
"genril"
assumes the progeny are inbred lines (recombinant
inbred lines, RIL), while "genail"
assumes the progeny have
two random chromosomes (advanced intercross lines, AIL).
The internal function batch_vec()
now made
user-accessible, and takes an additional argument n_cores
.
This splits a vector into batches for use in parallel
calculations.
The internal function cbind_expand()
now made
user-accessible. It’s for combining matrices using row names to align
the rows and expanding with missing values if there are rows in some
matrices but not others.
In plot_peaks()
, added lod_labels
argument. If TRUE, include LOD scores as text labels in the
figure.
Added function calc_het()
for calculating estimated
heterozygosities, by individual or by marker, from genotype
probabilities derived by calc_genoprob()
.
Small corrections to documentation.
Revise some tests due to change in Recla and DOex datasets at https://github.com/rqtl/qtl2data
Add tests of decomposed kinship matrix (from
decomp_kinship()
) with scan1()
.
rbind_scan1()
and cbind_scan1()
no
longer give error if inputs don’t all have matching attributes.
Change default gap between chromosomes in
plot_scan1()
(and related) to be 1% of the total genome
length.
Fixed bug in subset_kinship()
that prevented
scan1()
from working with decomposed “loco” kinship
matrices.
Fixed descriptions in help files for
cbind.calc_genoprob()
and
rbind.calc_genoprob()
, for column- and row-binding genotype
probabilities objects (as output by calc_genoprob()
.
cbind()
is for the same set of individuals but different
chromosomes. rbind()
is for the same set of markers and
genotypes but different individuals. Made similar corrections for the
related functions for sim_geno()
and viterbi()
output.
Added pull_genoprobint()
for pulling out the
genotype probabilities for a given genomic interval. Useful, for
example, to apply scan1blup()
over a defined interval
rather than an entire chromosome.
scan1()
, scan1perm()
,
scan1coef()
, fit1()
, and
scan1snps()
can now use weights when kinship
is provided, for example for the case of the analysis of recombinant
inbred line (RIL) phenotype means with differing numbers of individuals
per line. The residual variance matrix is like \(v[h^2 K + (1-h^2)D]\) where D is diagonal
{1/w} for weights w.
Add weights
argument to
est_herit()
.
Added add_threshold()
for adding significance
thresholds to a genome scan plot.
Added predict_snpgeno()
for predicting SNP genotypes
in a multiparent populations, from inferred genotypes plus the founder
strains’ SNP alleles.
In genoprob_to_snpprob()
, the snpinfo
argument can now be a cross object (for a multiparent population with
founder genotypes), in which case the SNP information for all SNPs with
complete founder genotype data is calculated and used.
max_scan1()
with lodcolumn=NULL
returns
the maximum for all lod score columns. If map
is included,
the return value is in the form returned by find_peaks()
,
namely with lodindex
and lodcolumn
arguments
added at the beginning.
Added replace_ids()
for replacing individual IDs in
an object. S3 method for "cross2"
objects and output of
calc_genoprob()
, viterbi()
,
maxmarg()
, and sim_geno()
.
Added clean_scan1()
plus generic function
clean()
that works with both this and with
clean_genoprob()
. clean_scan1()
replaces
negative values with NA
and removes rows that have all
NAs
.
More informative error message in est_herit()
,
scan1()
, etc., when covariates and other data are not
numeric.
Fixed pull_genoprobpos()
so it will work with qtl2feather (and qtl2fst).
In plot_genes()
, if xlim
is provided as
an argument, subset the genes to those that will actually appear in the
plotting region.
Revise find_marker()
so that the input
map
can also be a “snp info” table (with columns
"snp_id"
, "chr"
and
"pos"
).
Added find_index_snp()
for identifying the index SNP
that corresponds to a particular SNP in a snp info table that’s been
indexed with index_snps()
.
Add overwrite
argument (default FALSE
)
to zip_datafiles()
, similar to that for
write_control_file()
.
plot_snpasso()
now takes an argument
chr
.
max_scan1()
no longer gives a warning if
map
is not provided.
insert_pseudomarkers()
will now accept
pseudomarker_map
that includes only a portion of the
chromosomes.
In fit1()
, replaced tol
and
maxit
and added ...
which takes these plus a
few additional hidden control parameters.
Fix a bug in index_snps()
; messed up results when
start
and end
outside the range of the
map.
Fix a bug in scan1snps()
regarding use of
chr
argument: need to force to be unique character strings,
and avoid unnecessary warning about start
and
end
.
Fix a bug in scan1snps()
where it didn’t check that
the genoprobs
and map
conform.
Revised underlying binary trait regression function to avoid some of the tendency towards NAs.
Added function clean_genoprob()
which cleans
genotype probabilities by setting small values to 0 and, for genotype
columns where the maximum value is not large, setting all values to 0.
This is intended to help with the problem of unstable estimates of
genotype effects in scan1coef()
and fit1()
when there’s a genotype that is largely absent.
Added function compare_maps()
for comparing marker
order between two marker maps.
Revised the order of arguments in reduce_markers()
to match pick_marker_subset()
, because I like the latter
better. Removed the function pick_marker_subset()
because
it’s identical to reduce_markers()
. (Seriously, I
implemented the same thing twice.)
plot_coef()
now uses a named lodcolumn
argument, if provided, to subset scan1_output
, if that’s
provided.
In the documentation for scan1coef()
,
scan1blup()
, and fit1()
, revised the suggested
contrasts for getting additive and dominance effects in an
intercross.
plot_coef()
with scan1_output
provided,
ylim_lod
was being ignored.find_peaks()
and max_scan1()
can now take
snpinfo tables (as produced by index_snps()
and
scan1snps()
) in place of the map.find_peaks()
.find_peaks()
. (Stopped with
error if no LOD scores were above the threshold.)The output of fit1()
now includes fitted
values.
Added function pull_genoprobpos()
for pulling out a
specific position (by name or position) from a set of genotype
probabilities.
The chr
column in the result of
find_peaks()
is now a factor. This makes it possible to
sort by chromosome. Also added an argument sort_by
for
choosing how to sort the rows in the result (by column, genomic
position, or LOD score).
In max_scan1()
, if map
is not
provided, rather than stopping with an error, we just issue a warning
and return the genome-wide maximum LOD score.
Revised find_markerpos()
so it can take a map (as a
list of vectors of marker positions) in place of a "cross2"
object.
plot.scan1()
, which failed to pass
lodcolumn
to plot_snpasso()
.The previously separate packages qtl2geno, qtl2scan, qtl2plot, and qtl2db have now been combined into one package.