Seurat scRNA-seq data integration


Author: 重拾生活信心 | Source: published 2023-12-17 13:43

    Integrative analysis in Seurat v5

    Reference

    Introduction

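    • Integrating single-cell sequencing datasets, for example across experimental batches, donors, or conditions, is often an important step in scRNA-seq workflows. Integrative analysis helps match shared cell types and states across datasets, which can boost statistical power and, most importantly, facilitates accurate comparative analysis across datasets. [Note: choose a suitable integration method to remove the cell-to-cell differences caused by batch effects, so that the analysis focuses on the true biological clustering and differential expression of the cells.]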

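    • Seurat v5 enables streamlined integrative analysis through the IntegrateLayers function. It currently supports five methods. Each of them performs the integration in low-dimensional space and returns a dimensional reduction (i.e. integrated.xxx) that aims to co-embed shared cell types across batches.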

    1. Anchor-based CCA integration (method=CCAIntegration)
    2. Anchor-based RPCA integration (method=RPCAIntegration)
    3. Harmony (method=HarmonyIntegration)
    4. FastMNN (method=FastMNNIntegration)
    5. scVI (method=scVIIntegration)
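To make "integration in low-dimensional space" concrete, here is a minimal base-R sketch on simulated data. It is deliberately naive: it subtracts each batch's mean inside a PCA embedding, which is not any of the five methods listed above and only works here because both toy batches contain the same mix of cell types. All names (`make_batch`, `expr`, `corrected`) and the simulated values are assumptions made for illustration.

```r
set.seed(42)
n <- 100                                      # cells per batch
cell_type <- rep(c("T", "B"), each = n / 2)   # same composition in both batches

# Simulate one batch: cell types differ along feature f1,
# and `shift` adds a constant batch offset to every cell.
make_batch <- function(shift) {
  x <- cbind(
    f1 = ifelse(cell_type == "T", 3, -3) + rnorm(n, sd = 0.5),
    f2 = rnorm(n, sd = 0.5)
  )
  sweep(x, 2, shift, "+")
}

expr  <- rbind(make_batch(c(0, 0)), make_batch(c(0, 4)))
batch <- rep(c("batch1", "batch2"), each = n)

# Uncorrected low-dimensional embedding (the analogue of the "pca" reduction)
pca <- prcomp(expr, center = TRUE, scale. = FALSE)$x

# Naive correction: centre each batch separately within the embedding
corrected <- pca
for (b in unique(batch)) {
  idx <- batch == b
  corrected[idx, ] <- sweep(pca[idx, , drop = FALSE], 2,
                            colMeans(pca[idx, , drop = FALSE]))
}

# Largest difference between the two batch centroids, before and after
centroid_gap <- function(emb) {
  max(abs(colMeans(emb[batch == "batch1", ]) - colMeans(emb[batch == "batch2", ])))
}
gap_before <- centroid_gap(pca)
gap_after  <- centroid_gap(corrected)
print(c(before = gap_before, after = gap_after))
```

The corrected matrix plays the role of an integrated reduction: downstream steps (neighbors, clustering, UMAP) would run on it instead of the original embedding. Real methods such as CCA, RPCA, Harmony, FastMNN, and scVI are designed to cope with batches whose cell-type composition differs, which this centring trick cannot do.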

    Choose one integration method: IntegrateLayers

    obj <- IntegrateLayers(
      object = obj, method = CCAIntegration,
      orig.reduction = "pca", new.reduction = "integrated.cca",
      verbose = FALSE
    )
    
    obj <- IntegrateLayers(
      object = obj, method = RPCAIntegration,
      orig.reduction = "pca", new.reduction = "integrated.rpca",
      verbose = FALSE
    )
    
    obj <- IntegrateLayers(
      object = obj, method = HarmonyIntegration,
      orig.reduction = "pca", new.reduction = "harmony",
      verbose = FALSE
    )
    
    obj <- IntegrateLayers(
      object = obj, method = FastMNNIntegration,
      new.reduction = "integrated.mnn",
      verbose = FALSE
    )
    
    # scVI integration requires `reticulate`, which can be installed from CRAN (`install.packages("reticulate")`), as well as `scvi-tools` and its dependencies installed in a conda environment.
    # See the scVI installation instructions: https://docs.scvi-tools.org/en/stable/installation.html
    
    obj <- IntegrateLayers(
      object = obj, method = scVIIntegration,
      new.reduction = "integrated.scvi",
      conda_env = "../miniconda3/envs/scvi-env", verbose = FALSE
    )
    

    Choose one method, then visualize and cluster

    • Use FindNeighbors(), FindClusters(), and RunUMAP():
    ## CCA--------------------------------------------------
    obj <- FindNeighbors(obj, reduction = "integrated.cca", dims = 1:30)
    obj <- FindClusters(obj, resolution = 2, cluster.name = "cca_clusters")
    obj <- RunUMAP(obj, reduction = "integrated.cca", dims = 1:30, reduction.name = "umap.cca")
    p1 <- DimPlot(
      obj,
      reduction = "umap.cca",
      group.by = c("Method", "predicted.celltype.l2", "cca_clusters"),
      combine = FALSE, label.size = 2
    )
    
    ## SCVI--------------------------------------------------
    obj <- FindNeighbors(obj, reduction = "integrated.scvi", dims = 1:30)
    obj <- FindClusters(obj, resolution = 2, cluster.name = "scvi_clusters")
    obj <- RunUMAP(obj, reduction = "integrated.scvi", dims = 1:30, reduction.name = "umap.scvi")
    p2 <- DimPlot(
      obj,
      reduction = "umap.scvi",
      group.by = c("Method", "predicted.celltype.l2", "scvi_clusters"),
      combine = FALSE, label.size = 2
    )
    
    library(patchwork)  # provides wrap_plots() and the `|` operator used for plot layouts
    wrap_plots(c(p1, p2), ncol = 2, byrow = FALSE)
    
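    Comparing the two integration methods
    • When choosing a method, the main consideration is how much biological information the clustering preserves. [Note: check how specific the marker genes are to individual clusters. Cells should cluster by cell type rather than by batch; a batch-driven clustering shows up as a marker gene that is highly expressed in several clusters.]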
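One simple way to complement the marker-gene check is to tabulate how much each batch contributes to each cluster. The sketch below uses base R on a made-up metadata table; in a real analysis the two columns would come from the object's metadata (for example the `Method` and `cca_clusters` columns used in this article). The 90% threshold is an arbitrary assumption.

```r
# Hypothetical metadata for 12 cells (illustration only)
meta <- data.frame(
  batch   = rep(c("batchA", "batchB"), times = c(6, 6)),
  cluster = c("0", "0", "0", "1", "1", "2",    # batchA cells
              "0", "0", "1", "1", "1", "3")    # batchB cells
)

# Rows = clusters, columns = batches; each row sums to 1
batch_frac <- prop.table(table(meta$cluster, meta$batch), margin = 1)
print(round(batch_frac, 2))

# Flag clusters in which a single batch contributes more than 90% of the cells
dominated <- rownames(batch_frac)[apply(batch_frac, 1, max) > 0.9]
print(dominated)
```

A flagged cluster is only a hint: it may be a batch artifact, or it may be a cell type that really is present in just one dataset, so it should be read together with the marker-gene plots.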

    Comparing integration results: marker genes

    p1 <- VlnPlot(
      obj,
      features = "rna_CD8A", group.by = "unintegrated_clusters"
    ) + NoLegend() + ggtitle("CD8A - Unintegrated Clusters")
    p2 <- VlnPlot(
      obj, "rna_CD8A",
      group.by = "cca_clusters"
    ) + NoLegend() + ggtitle("CD8A - CCA Clusters")
    p3 <- VlnPlot(
      obj, "rna_CD8A",
      group.by = "scvi_clusters"
    ) + NoLegend() + ggtitle("CD8A - scVI Clusters")
    p1 | p2 | p3
    
    Figure: marker gene violin plots
    • Check how the clusters obtained after CCA integration are distributed in the other embeddings:
    obj <- RunUMAP(obj, reduction = "integrated.rpca", dims = 1:30, reduction.name = "umap.rpca")
    p4 <- DimPlot(obj, reduction = "umap.unintegrated", group.by = c("cca_clusters"))
    p5 <- DimPlot(obj, reduction = "umap.rpca", group.by = c("cca_clusters"))
    p6 <- DimPlot(obj, reduction = "umap.scvi", group.by = c("cca_clusters"))
    p4 | p5 | p6
    
    Figure: CCA clusters shown on the unintegrated, RPCA, and scVI UMAPs
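The visual comparison can be backed by a number. The following base-R sketch cross-tabulates two clusterings and computes the adjusted Rand index (ARI), which is 1 when two partitions agree up to relabelling and close to 0 for chance-level agreement. The helper `adjusted_rand_index` and the label vectors are hypothetical; with a real object the inputs would be metadata columns such as `cca_clusters` and `scvi_clusters`.

```r
# Adjusted Rand index from the contingency table of two label vectors.
# Assumes at least one of the two partitions has more than one cluster.
adjusted_rand_index <- function(a, b) {
  tab <- unclass(table(a, b))                 # plain integer matrix of counts
  n <- sum(tab)
  sum_cells <- sum(choose(tab, 2))            # pairs grouped together in both
  sum_rows  <- sum(choose(rowSums(tab), 2))   # pairs grouped together in a
  sum_cols  <- sum(choose(colSums(tab), 2))   # pairs grouped together in b
  expected  <- sum_rows * sum_cols / choose(n, 2)
  max_index <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (max_index - expected)
}

# Toy cluster labels for 8 cells (illustration only)
cca_labels   <- c("0", "0", "0", "1", "1", "1", "2", "2")
scvi_same    <- c("a", "a", "a", "b", "b", "b", "c", "c")   # same partition, new names
scvi_shifted <- c("a", "a", "b", "b", "b", "c", "c", "c")   # partly different partition

print(table(cca = cca_labels, scvi = scvi_shifted))  # where the clusters overlap

ari_same    <- adjusted_rand_index(cca_labels, scvi_same)
ari_shifted <- adjusted_rand_index(cca_labels, scvi_shifted)
print(c(same = ari_same, shifted = ari_shifted))
```

A high ARI between two integration methods only says that they agree with each other, not that either one is biologically correct, so it complements rather than replaces the marker-gene check.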

    Rejoin the layers and continue the analysis with the chosen integration result

    Seurat v5 assays store data in layers. These layers can store raw, un-normalized counts (layer='counts'), normalized data (layer='data'), or z-scored/variance-stabilized data (layer='scale.data').

    obj <- JoinLayers(obj)
    obj
    
    ## An object of class Seurat 
    ## 35789 features across 10434 samples within 5 assays 
    ## Active assay: RNA (33694 features, 2000 variable features)
    ##  3 layers present: data, counts, scale.data
    ##  4 other assays present: prediction.score.celltype.l1, prediction.score.celltype.l2, prediction.score.celltype.l3, mnn.reconstructed
    ##  12 dimensional reductions calculated: integrated_dr, ref.umap, pca, umap.unintegrated, integrated.cca, integrated.rpca, harmony, integrated.mnn, integrated.scvi, umap.cca, umap.scvi, umap.rpca
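The printed object shows single `data`, `counts`, and `scale.data` layers because `JoinLayers()` has merged the per-batch layers back together. As a minimal sketch of that round trip, assuming Seurat v5 is installed, the example below builds a tiny synthetic object, splits its RNA assay by batch (the state `IntegrateLayers` works on), and joins it again. The object `toy`, the random counts, and the `batch` column are made up for illustration.

```r
library(Seurat)

set.seed(1)
# Synthetic counts: 200 genes x 60 cells (illustration only)
counts <- matrix(
  rpois(200 * 60, lambda = 1), nrow = 200, ncol = 60,
  dimnames = list(paste0("gene", 1:200), paste0("cell", 1:60))
)
toy <- CreateSeuratObject(counts = counts)
toy$batch <- rep(c("A", "B"), each = 30)      # hypothetical batch label

# Split the assay into one counts layer per batch
toy[["RNA"]] <- split(toy[["RNA"]], f = toy$batch)
layers_split <- Layers(toy)
print(layers_split)

# Merge the per-batch layers back into a single layer
toy <- JoinLayers(toy)
layers_joined <- Layers(toy)
print(layers_joined)
```

Joining only changes how the expression matrices are stored; dimensional reductions and cluster labels already saved in the object are kept.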
    
    
    • Example: SCT normalization followed by integration:
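    # Raise the size limit for exported globals to about 3 GB; SCTransform on a large object can exceed the default
    options(future.globals.maxSize = 3e+09)
    obj <- SCTransform(obj)
    obj <- RunPCA(obj, npcs = 30, verbose = FALSE)
    obj <- IntegrateLayers(
      object = obj,
      method = RPCAIntegration,
      normalization.method = "SCT",
      verbose = FALSE
    )
    # No new.reduction was given, so the result is stored under the default name "integrated.dr"
    obj <- FindNeighbors(obj, dims = 1:30, reduction = "integrated.dr")
    obj <- FindClusters(obj, resolution = 2)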
    
