PRECAST: simulation

Wei Liu

2022-10-18

This vignette introduces the PRECAST workflow for the integrative analysis of multiple spatial transcriptomics datasets. The workflow consists of three steps: preprocessing and model setting, model fitting with PRECAST, and downstream analysis (visualization and combined differential expression analysis).

We demonstrate PRECAST on three simulated Visium datasets, which can be downloaded to the current working directory with the following commands:

githubURL <- "https://github.com/feiyoung/PRECAST/blob/main/vignettes_data/data_simu.rda?raw=true"
download.file(githubURL,"data_simu.rda",mode='wb')

Then load the data into R:

load("data_simu.rda")

The required packages can be loaded with the commands:

library(PRECAST)
library(Seurat)

Load the simulated data

First, we view the three simulated spatial transcriptomics datasets from the Visium platform.

data_simu ## a list of three Seurat objects with default assay: RNA

Check the content in data_simu.

head(data_simu[[1]])

Fit PRECAST using simulated data

Prepare the PRECASTObject with a preprocessing step.

In this simulated dataset, gene selection is not required, so we set customGenelist=row.names(data_simu[[1]]).
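A minimal sketch of this step, assuming that data_simu can be passed directly as the list of Seurat objects and that each object stores its spot coordinates in the meta data columns row and col, which CreatePRECASTObject expects. The seed value is an arbitrary choice made only for reproducibility.

```r
library(PRECAST)

# Arbitrary seed (an assumption, not from the text) so later steps are reproducible.
set.seed(2022)

# Build the PRECASTObject from the list of Seurat objects.
# Supplying every gene name as customGenelist skips gene selection.
PRECASTObj <- CreatePRECASTObject(
  seuList = data_simu,
  customGenelist = row.names(data_simu[[1]])
)
```

The preprocessed Seurat objects can then be inspected with PRECASTObj@seulist, for example to check how many genes and spots remain after filtering.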

Add the model setting
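A sketch of the model setting, assuming PRECASTObj was created with CreatePRECASTObject in the preparation step. AddAdjList builds the spot neighborhood structure for the given platform, and AddParSetting stores the fitting options. The argument values below are illustrative choices rather than settings prescribed by this vignette.

```r
library(PRECAST)

# Add the list of adjacency matrices; the simulated data follow the Visium layout.
PRECASTObj <- AddAdjList(PRECASTObj, platform = "Visium")

# Store the model setting in advance.
# verbose = TRUE prints progress of the algorithm;
# coreNumber = 1 is enough when a single number of clusters is fitted.
PRECASTObj <- AddParSetting(
  PRECASTObj,
  Sigma_equal = FALSE,
  maxIter = 30,
  verbose = TRUE,
  coreNumber = 1
)
```

When several candidate numbers of clusters are fitted, a larger coreNumber lets them run in parallel.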

Fit PRECAST

For the function PRECAST, users can either specify a single number of clusters \(K\) or set K to an integer vector, in which case the modified BIC (MBIC) is used to determine \(K\). For convenience, we give a single K here.
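The fit itself might look as follows, assuming the adjacency list and model setting have already been added to PRECASTObj. The value K = 7 is an illustrative assumption and should be replaced by the number of clusters appropriate for the data.

```r
library(PRECAST)

# Fit PRECAST with a single, fixed number of clusters
# (K = 7 is an illustrative choice, not a value stated in the text).
PRECASTObj <- PRECAST(PRECASTObj, K = 7)

# Alternative: fit a range of candidate values and let MBIC choose among them
# in the model selection step.
# PRECASTObj <- PRECAST(PRECASTObj, K = 6:8)
```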

Select the best model and use the adjusted Rand index (ARI) to check the clustering performance.

Integrate the three samples with the function IntegrateSpaData.
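A sketch of this step under two assumptions: the true labels of the simulated spots are stored in a meta data column named true_cluster, and the installed PRECAST version provides SelectModel (older releases named this function selectModel). The ARI is computed with adjustedRandIndex from the mclust package.

```r
library(PRECAST)

# Keep the model chosen by MBIC (with a single K there is only one candidate).
PRECASTObj <- SelectModel(PRECASTObj)

# True labels are read from the preprocessed objects, so they match the
# spots that were actually used in the fit.
# "true_cluster" is an assumed column name in the meta data.
true_cluster <- lapply(PRECASTObj@seulist, function(x) x$true_cluster)

# Compare estimated and true labels across all samples.
ari <- mclust::adjustedRandIndex(
  unlist(PRECASTObj@resList$cluster),
  unlist(true_cluster)
)
ari
```

An ARI of 1 means the estimated clusters agree perfectly with the true labels, while values near 0 indicate agreement no better than chance.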
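A minimal sketch of the integration call, assuming the model has already been selected. The setting species = "unknown" is an assumption made here because the simulated genes do not belong to a real species, so no species-specific housekeeping genes are available.

```r
library(PRECAST)

# Combine the samples into one Seurat object holding the aligned embeddings
# and the cluster labels estimated by PRECAST.
seuInt <- IntegrateSpaData(PRECASTObj, species = "unknown")
seuInt
```

The resulting object seuInt is the input for the visualization and differential expression steps.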

Visualization

Show the spatial scatter plot for clusters

p12 <- SpaPlot(seuInt, batch=NULL,point_size=2, combine=TRUE)
p12
# users can plot each sample by setting combine=FALSE

Show the spatial UMAP/tSNE RGB plot

seuInt <- AddUMAP(seuInt) 
SpaPlot(seuInt, batch=NULL,item='RGB_UMAP',point_size=2, combine=TRUE, text_size=15)

# seuInt <- AddTSNE(seuInt)
# SpaPlot(seuInt, batch=NULL,item='RGB_TSNE',point_size=2, combine=TRUE, text_size=15)

Show the tSNE plot based on the extracted features from PRECAST to check the performance of integration.

seuInt <- AddTSNE(seuInt, n_comp = 2) 
library(patchwork)
cols_cluster <- c("#E04D50", "#4374A5", "#F08A21","#2AB673", "#FCDDDE",  "#70B5B0", "#DFE0EE" ,"#D0B14C")
p1 <- dimPlot(seuInt,  font_family='serif', cols=cols_cluster) # Times New Roman
p2 <- dimPlot(seuInt, item='batch', point_size = 1,  font_family='serif')
p1 + p2 
# Note that only sample batch 1 contains cluster 4, and only sample batch 2 contains cluster 7.

Show the UMAP plot based on the extracted features from PRECAST.

dimPlot(seuInt, reduction = 'UMAP3', item='cluster', cols=cols_cluster, font_family='serif')

Users can also use the visualization functions in Seurat package:

DimPlot(seuInt, reduction = 'position')
DimPlot(seuInt, reduction = 'tSNE')

Combined differential expression analysis

dat_deg <- FindAllMarkers(seuInt)
library(dplyr)
n <- 2
# keep the n genes with the largest average log2 fold change in each cluster
dat_deg %>%
  group_by(cluster) %>%
  top_n(n = n, wt = avg_log2FC) -> top_markers

head(top_markers)

Session information

sessionInfo()