MOCHA is an R package containing a novel single-cell peak-calling algorithm that leverages single-cell information to determine whether a particular genomic region is open by calculating two measures of intensities, and using these to call peaks via a hierarchical model.
To install in a HISE IDE, run the following lines of code to install directly from GitHub replacing ‘your_token’ with your GitHub Personal Access Token.
Sys.setenv(GITHUB_PAT='your_token')
devtools::install_github("aifimmunology/MOCHA")
Please view the example usage found in the vignette found under
vignettes/COVID-walkthrough.html
.
The example usage demonstrates this workflow:
While the pipeline can be run function-to-function, you may wish to inspect intermediate results or use end results for your own custom analyses. All MOCHA outputs use common Bioconductor data structures. We also provide some getters for accessing specific results.
The output of the first step of the pipeline,
MOCHA::callOpenTiles
is a MultiAssayExperiment
organizing your open-tiles by cell population. This is the input to the
next function, MOCHA::getSampleTileMatrices
.
> tileResults
3 listed
A MultiAssayExperiment object of -defined names and respective classes.
experiments with user3:
Containing an ExperimentList class object of length 1] MAIT: RaggedExperiment with 2741985 rows and 6 columns
[2] CD16 Mono: RaggedExperiment with 2204602 rows and 6 columns
[3] DC: RaggedExperiment with 1439585 rows and 6 columns
[:
Functionalityexperiments() - obtain the ExperimentList instance
colData() - the primary/phenotype DataFrame
sampleMap() - the sample coordination DataFrame
`$`, `[`, `[[` - extract colData columns, subset, or experiment
*Format() - convert into a long or wide DataFrame
assays() - convert ExperimentList to a SimpleList of matrices
exportClass() - save data to flat files
Sample-level metadata can be accessed using
colData(tileResults)
.
A specific cell population’s results are stored as a RaggedExperiment.
This object stores ranged data alongside the sample-level metadata.
# Get open tiles for DC cells
> DCcellRaggedExp <- tileResults[["DC"]]
> DCcellRaggedExp
: RaggedExperiment
class: 1439585 6
dimassays(7): tileID TotalIntensity ... PredictionStrength peak
: NULL
rownamescolnames(6): B011-AP0C1W3 B011-AP0C1W8 ... B025_FSQAAZ0C00P-07
-01
B025_FSQAAZ0BZZSnames(0):
colData
# You can access the underlying GRanges object containing ranges for all samples
> rowRanges(DCcellRaggedExp)
1439585 ranges and 0 metadata columns:
GRanges object with
seqnames ranges strand<Rle> <IRanges> <Rle>
1] chr1 812500-812999 *
[2] chr1 816000-816499 *
[3] chr1 817000-817499 *
[4] chr1 819000-819499 *
[5] chr1 819500-819999 *
[
... ... ... ...1439581] chrX 155841500-155841999 *
[1439582] chrX 155881000-155881499 *
[1439583] chrX 155888000-155888499 *
[1439584] chrX 155891500-155891999 *
[1439585] chrX 155964000-155964499 *
[-------
: 24 sequences from an unspecified genome; no seqlengths seqinfo
The output of getSampleTileMatrix is a SummarizedExperiment
organizing the sample-tile matrices by cell population. This is the
input to MOCHA::getDifferentialAccessibleTiles
and other
downstream analyses.
> SampleTileMatrices
: RangedSummarizedExperiment
class: 212445 6
dimmetadata(6): Log2Intensity NAtoZero ... Org Directory
assays(3): MAIT CD16 Mono DC
rownames(212445): chr1:1000000-1000499 chr1:100028500-100028999 ...
:7730500-7730999 chrY:7940000-7940499
chrYnames(3): MAIT CD16 Mono DC
rowData colnames(6): B011-AP0C1W3 B011-AP0C1W8 ... B025_FSQAAZ0C00P-07
-01
B025_FSQAAZ0C0YJnames(178): Sample PassQC ... ATAC_WellID AIFI.Batch colData
It also holds metadata related to the genome, transcript database,
and annotations: metadata(SampleTileMatrices)
Individual sample-tile matrices for each cell population can be accessed as follows. These are filtered to only return tiles called for the given cell population.
# Get the sample-tile matrix for DC cells
<- MOCHA::getCellPopMatrix(SampleTileMatrices, "DC")
DCcellsMatrix # Get the sample-tile matrix for MAIT cells
<- MOCHA::getCellPopMatrix(SampleTileMatrices, "MAIT") MAITcellsMatrix
Results of MOCHA::getDifferentialAccessibleTiles
is
given as a data.table
and can be filtered using data.table
syntax:
> differentials[FDR<0.4]
Tile CellPopulation Foreground Background P_value1: chr1:1000000-1000499 MAIT Positive Negative 0.0808556
2: chr1:100038500-100038999 MAIT Positive Negative 0.0808556
3: chr1:10032500-10032999 MAIT Positive Negative 0.0808556
4: chr1:100424000-100424499 MAIT Positive Negative 0.0808556
5: chr1:100927500-100927999 MAIT Positive Negative 0.0808556
---
8048: chrX:77899000-77899499 MAIT Positive Negative 0.0808556
8049: chrX:77910500-77910999 MAIT Positive Negative 0.0808556
8050: chrX:78104000-78104499 MAIT Positive Negative 0.0808556
8051: chrX:78945000-78945499 MAIT Positive Negative 0.0808556
8052: chrX:9464500-9464999 MAIT Positive Negative 0.0808556
-Statistic FDR Log2FC_C MeanDiff Avg_Intensity_Case
Test1: 3.047619 0.3457015 0.01558005 0.2991209 3.803282
2: 3.047619 0.3457015 -0.03178178 -0.2803330 3.948745
3: 3.047619 0.3457015 0.02554094 0.3312654 4.103977
4: 3.047619 0.3457015 -0.17011403 -1.6639420 3.500481
5: 3.047619 0.3457015 -0.08410144 -0.6814757 3.689488
---
8048: 3.047619 0.3457015 0.02367545 0.2179824 3.916536
8049: 3.047619 0.3457015 0.02077454 0.2111809 4.019993
8050: 3.047619 0.3457015 0.01051873 0.1599092 4.039884
8051: 3.047619 0.3457015 -0.12853838 -1.2824704 3.670366
8052: 3.047619 0.3457015 -0.08677006 -0.7969690 3.660102
Pct0_Case Avg_Intensity_Control Pct0_Control1: 0 3.787702 0
2: 0 3.956360 0
3: 0 4.079114 0
4: 0 3.746133 0
5: 0 3.746400 0
---
8048: 0 3.907190 0
8049: 0 3.999219 0
8050: 0 4.032155 0
8051: 0 3.798904 0
8052: 0 3.755235 0
Note that these statistics are poor due to the small # of samples in the example vignette.
Results of MOCHA::getCoAccessibleLinks
is given as a
data.frame
and can be filtered further according to
correlation using MOCHA::filterCoAccessibleLinks
.
> links
Correlation Tile1 Tile21: 0.5481481 chrY:7326500-7326999 chrY:6964500-6964999
2: 0.5481481 chrY:7326500-7326999 chrY:6965000-6965499
3: 0.6666667 chrY:7326500-7326999 chrY:7040500-7040999
4: 0.7259259 chrY:7326500-7326999 chrY:7085000-7085499
5: 0.8444444 chrY:7326500-7326999 chrY:7085500-7085999
---
126: 0.4166667 chrY:7344500-7344999 chrY:7300000-7300499
127: 0.6666667 chrY:7344500-7344999 chrY:7310500-7310999
128: 0.6666667 chrY:7344500-7344999 chrY:7311000-7311499
129: 0.6666667 chrY:7344500-7344999 chrY:7326000-7326499
130: 0.3333333 chrY:7344500-7344999 chrY:7340000-7340499
> MOCHA::filterCoAccessibleLinks(links, threshold = 0.7)
Correlation Tile1 Tile2 chr start end1: 0.7259259 chrY:7326500-7326999 chrY:7085000-7085499 chrY 7085000 7326999
2: 0.8444444 chrY:7326500-7326999 chrY:7085500-7085999 chrY 7085500 7326999
3: 0.7851852 chrY:7326500-7326999 chrY:7099500-7099999 chrY 7099500 7326999
4: 0.7259259 chrY:7326500-7326999 chrY:7279500-7279999 chrY 7279500 7326999
5: 0.9037037 chrY:7326500-7326999 chrY:7280000-7280499 chrY 7280000 7326999
To contact the developers on issues and feature requests, please contact us via the discussions tab for feature requests, or open issues for any bugs.
MOCHA follows the Allen Institute Software License - full information about the license can be found on the LICENSE file.