Integrative analysis in Seurat v5
Reference
Introduction
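Integrating single-cell sequencing datasets, for example across experimental batches, donors, or conditions, is often an important step in an scRNA-seq workflow. Integrative analysis helps match shared cell types and states across datasets, which can boost statistical power and, most importantly, supports accurate comparative analysis across datasets. [Note: choose a suitable integration method to remove cell-to-cell differences caused by batch effects, so that the analysis focuses on biologically real clusters and differential expression.]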
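Seurat v5 provides streamlined integrative analysis through the IntegrateLayers function, which currently supports five methods. Each method performs the integration in a low-dimensional space and returns a dimensional reduction (i.e. integrated.xxx) that aims to co-embed shared cell types across batches.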
- Anchor-based CCA integration (method=CCAIntegration)
- Anchor-based RPCA integration (method=RPCAIntegration)
- Harmony (method=HarmonyIntegration)
- FastMNN (method=FastMNNIntegration)
- scVI (method=scVIIntegration)
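IntegrateLayers works across the layers of an assay, so the assay is normally split into one layer per batch before integration. The following is only a minimal sketch on a toy object: the count matrix, the `batch` column, and the name `toy` are all hypothetical and made up for illustration. On real data, the usual NormalizeData, FindVariableFeatures, ScaleData, and RunPCA steps would follow the split and precede integration.

```r
library(Seurat)

set.seed(1)
# Hypothetical toy counts: 20 genes x 12 cells (illustration only)
counts <- Matrix::Matrix(
  matrix(
    rpois(20 * 12, lambda = 3),
    nrow = 20,
    dimnames = list(paste0("gene", 1:20), paste0("cell", 1:12))
  ),
  sparse = TRUE
)
toy <- CreateSeuratObject(counts = counts)

# Hypothetical batch label: first 6 cells from batch A, last 6 from batch B
toy$batch <- rep(c("A", "B"), each = 6)

# Split the RNA assay into one counts layer per batch
toy[["RNA"]] <- split(toy[["RNA"]], f = toy$batch)

# List the layers now present in the assay
split_layers <- Layers(toy[["RNA"]])
print(split_layers)
```

In the workflow below, `obj` is assumed to have been split in the same way by its own batch column.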
Integrate with one of the methods: IntegrateLayers
# Anchor-based CCA integration
obj <- IntegrateLayers(
  object = obj, method = CCAIntegration,
  orig.reduction = "pca", new.reduction = "integrated.cca",
  verbose = FALSE
)
# Anchor-based RPCA integration
obj <- IntegrateLayers(
  object = obj, method = RPCAIntegration,
  orig.reduction = "pca", new.reduction = "integrated.rpca",
  verbose = FALSE
)
# Harmony integration
obj <- IntegrateLayers(
  object = obj, method = HarmonyIntegration,
  orig.reduction = "pca", new.reduction = "harmony",
  verbose = FALSE
)
# FastMNN integration (FastMNNIntegration is provided by the SeuratWrappers package)
obj <- IntegrateLayers(
  object = obj, method = FastMNNIntegration,
  new.reduction = "integrated.mnn",
  verbose = FALSE
)
# scVI integration requires `reticulate`, which can be installed from CRAN
# (`install.packages("reticulate")`), as well as `scvi-tools` and its dependencies
# installed in a conda environment.
# See the scVI installation instructions: https://docs.scvi-tools.org/en/stable/installation.html
obj <- IntegrateLayers(
  object = obj, method = scVIIntegration,
  new.reduction = "integrated.scvi",
  conda_env = "../miniconda3/envs/scvi-env", verbose = FALSE
)
Pick one method, then visualize and cluster
FindNeighbors(), FindClusters(), RunUMAP()
library(patchwork) # provides wrap_plots() and the `|` plot operator
## CCA--------------------------------------------------
obj <- FindNeighbors(obj, reduction = "integrated.cca", dims = 1:30)
obj <- FindClusters(obj, resolution = 2, cluster.name = "cca_clusters")
obj <- RunUMAP(obj, reduction = "integrated.cca", dims = 1:30, reduction.name = "umap.cca")
p1 <- DimPlot(
  obj,
  reduction = "umap.cca",
  group.by = c("Method", "predicted.celltype.l2", "cca_clusters"),
  combine = FALSE, label.size = 2
)
## scVI-------------------------------------------------
obj <- FindNeighbors(obj, reduction = "integrated.scvi", dims = 1:30)
obj <- FindClusters(obj, resolution = 2, cluster.name = "scvi_clusters")
obj <- RunUMAP(obj, reduction = "integrated.scvi", dims = 1:30, reduction.name = "umap.scvi")
p2 <- DimPlot(
  obj,
  reduction = "umap.scvi",
  group.by = c("Method", "predicted.celltype.l2", "scvi_clusters"),
  combine = FALSE, label.size = 2
)
# Lay out the six panels column-wise: CCA on the left, scVI on the right
wrap_plots(c(p1, p2), ncol = 2, byrow = FALSE)
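Comparing the two integration methods
- When choosing a method, the main consideration is how much biological information the clusters preserve. [Check how specific each cluster's marker genes are: cells should group by cell type rather than by batch. A marker gene that is highly expressed in several clusters suggests the clusters are batch-driven.]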
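A quick, informal complement to inspecting marker genes is to cross-tabulate cluster labels against the batch label. The sketch below uses made-up vectors as stand-ins for `obj$cca_clusters` and `obj$Method`; the values and the 90% cutoff are assumptions for illustration, not part of the original workflow.

```r
# Hypothetical stand-ins for obj$cca_clusters and obj$Method
clusters <- factor(c("0", "0", "0", "0", "1", "1", "1", "2", "2", "2"))
batch    <- factor(c("A", "B", "A", "B", "B", "B", "B", "A", "B", "A"))

# Count cells per cluster and batch
tab <- table(cluster = clusters, batch = batch)

# Fraction of each cluster contributed by each batch (rows sum to 1)
batch_fraction <- prop.table(tab, margin = 1)
print(round(batch_fraction, 2))

# Flag clusters where a single batch contributes more than 90% of the cells
dominated <- rownames(batch_fraction)[apply(batch_fraction, 1, max) > 0.9]
print(dominated)
```

A flagged cluster is not necessarily an artifact, since a cell type can genuinely be present in only one batch, so it is worth checking its marker genes before drawing conclusions.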
Comparing integration results: marker genes
p1 <- VlnPlot(
  obj,
  features = "rna_CD8A", group.by = "unintegrated_clusters"
) + NoLegend() + ggtitle("CD8A - Unintegrated Clusters")
p2 <- VlnPlot(
  obj, "rna_CD8A",
  group.by = "cca_clusters"
) + NoLegend() + ggtitle("CD8A - CCA Clusters")
p3 <- VlnPlot(
  obj, "rna_CD8A",
  group.by = "scvi_clusters"
) + NoLegend() + ggtitle("CD8A - scVI Clusters")
p1 | p2 | p3
Figure: marker gene violin plots
- Check how the clusters found after CCA integration are distributed in the other embeddings:
obj <- RunUMAP(obj, reduction = "integrated.rpca", dims = 1:30, reduction.name = "umap.rpca")
p4 <- DimPlot(obj, reduction = "umap.unintegrated", group.by = c("cca_clusters"))
p5 <- DimPlot(obj, reduction = "umap.rpca", group.by = c("cca_clusters"))
p6 <- DimPlot(obj, reduction = "umap.scvi", group.by = c("cca_clusters"))
p4 | p5 | p6
Figure: cca_clusters shown on the unintegrated, RPCA, and scVI UMAPs
After choosing an integration result, rejoin the layers for downstream analysis
Seurat v5 assays store data in layers. These layers can store raw, un-normalized counts (layer='counts'), normalized data (layer='data'), or z-scored/variance-stabilized data (layer='scale.data').
obj <- JoinLayers(obj)
obj
## An object of class Seurat
## 35789 features across 10434 samples within 5 assays
## Active assay: RNA (33694 features, 2000 variable features)
## 3 layers present: data, counts, scale.data
## 4 other assays present: prediction.score.celltype.l1, prediction.score.celltype.l2, prediction.score.celltype.l3, mnn.reconstructed
## 12 dimensional reductions calculated: integrated_dr, ref.umap, pca, umap.unintegrated, integrated.cca, integrated.rpca, harmony, integrated.mnn, integrated.scvi, umap.cca, umap.scvi, umap.rpca
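To make the layer model more concrete, here is a small sketch on a hypothetical toy object (not the dataset above; the matrix, the `batch` column, and the name `toy` are made up). It splits the counts by batch, rejoins them with JoinLayers, and then reads individual layers back with Layers() and LayerData().

```r
library(Seurat)

set.seed(42)
# Hypothetical toy counts: 20 genes x 10 cells (illustration only)
counts <- Matrix::Matrix(
  matrix(
    rpois(20 * 10, lambda = 5),
    nrow = 20,
    dimnames = list(paste0("gene", 1:20), paste0("cell", 1:10))
  ),
  sparse = TRUE
)
toy <- CreateSeuratObject(counts = counts)
toy$batch <- rep(c("A", "B"), each = 5)

# Split into per-batch layers, then collapse them back into a single counts layer
toy[["RNA"]] <- split(toy[["RNA"]], f = toy$batch)
toy <- JoinLayers(toy)

# Normalizing adds a 'data' layer next to 'counts'
toy <- NormalizeData(toy, verbose = FALSE)
layer_names <- Layers(toy[["RNA"]])
print(layer_names)

# Pull individual layers out as matrices
raw_counts <- LayerData(toy, assay = "RNA", layer = "counts")
norm_data  <- LayerData(toy, assay = "RNA", layer = "data")
print(dim(raw_counts))
```

Both matrices have features in rows and cells in columns, so they can be compared or subset with the same indices.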
- Example: SCT normalization followed by integration:
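# Note: IntegrateLayers expects the assay layers to still be split by batch,
# so run this on the split object (i.e. before JoinLayers).
options(future.globals.maxSize = 3e+09) # raise the size limit for globals exported by the future package
obj <- SCTransform(obj)
obj <- RunPCA(obj, npcs = 30, verbose = FALSE)
obj <- IntegrateLayers(
  object = obj,
  method = RPCAIntegration,
  normalization.method = "SCT",
  verbose = FALSE
)
# No new.reduction was given above, so the default name "integrated.dr" is used
obj <- FindNeighbors(obj, dims = 1:30, reduction = "integrated.dr")
obj <- FindClusters(obj, resolution = 2)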