[Package] seurat-2: multi-omics integration

Author: JamesMori | Published: 2022-10-30 21:55

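    Foreword: after reading the official documentation it is easy to conclude that this is all simple, and some people just follow the official workflow as-is. Neither is advisable. What matters most is always your own line of thinking, in the science as well as in the code and the plotting. The official plotting functions look good but are narrowly targeted; learning the underlying plotting tools is what benefits you most. Treat the official material as something to borrow from and learn from.
    Which omics are there?
    Which single-cell multi-omics technologies are there?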

    CITE-seq:Simultaneous epitope and transcriptome measurement in single cells | Nature Methods
    10x multiome kit:Single Cell Multiome ATAC + Gene Expression - 10x Genomics
    Cell Hashing:Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics | Genome Biology | Full Text (biomedcentral.com)

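    In short, the combinations are transcriptome plus surface protein, ATAC, methylome, or genome. I will add more after reading the papers.
    On the surface, multi-omics is about multiple layers; fundamentally it can be read as a story ordered in time, in which the layers can be cause and effect of one another.
    Seurat offers a Weighted Nearest Neighbors (WNN) method, which clusters cells by combining the modalities and involves a weighting step.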
    Weighted Nearest Neighbor Analysis • Seurat (satijalab.org)
    Note: the tutorial followed here does not use WNN.

    1. Preparing the sample data

    1.1. Loading the data: 8,617 cord blood mononuclear cells (CBMCs)

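    RNA plus antibody-derived tags (ADT) for surface proteins. The official description says 11 ADTs, but the ADT matrix loaded below lists 13 features (see the rownames output in section 1.2).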

    library(Seurat)
    library(ggplot2)    # ggtitle()
    library(patchwork)  # p1 | p2 side-by-side layout
    
    # Load in the RNA UMI matrix
    
    # Note that this dataset also contains ~5% of mouse cells, which we can use as negative
    # controls for the protein measurements. For this reason, the gene expression matrix has
    # HUMAN_ or MOUSE_ appended to the beginning of each gene.
    cbmc.rna <- as.sparse(read.csv(file = "../data/GSE100866_CBMC_8K_13AB_10X-RNA_umi.csv.gz", sep = ",",header = TRUE, row.names = 1))
    
    # To make life a bit easier going forward, we're going to discard all but the top 100 most
    # highly expressed mouse genes, and remove the 'HUMAN_' prefix from the human gene names
    cbmc.rna <- CollapseSpeciesExpressionMatrix(cbmc.rna)
    
    # Load in the ADT UMI matrix
    cbmc.adt <- as.sparse(read.csv(file = "../data/GSE100866_CBMC_8K_13AB_10X-ADT_umi.csv.gz", sep = ",", header = TRUE, row.names = 1))
    
    # Note that since measurements were made in the same cells, the two matrices have identical column names
    # Checking is always the most important step
    all.equal(colnames(cbmc.rna), colnames(cbmc.adt))
    
    
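    # For reference, the signature of CollapseSpeciesExpressionMatrix with its defaults
    # (shown for documentation only; 'object' is a placeholder, so do not run this as-is):
    # CollapseSpeciesExpressionMatrix(
    #   object,
    #   prefix = "HUMAN_",
    #   controls = "MOUSE_",
    #   ncontrols = 100
    # )
    # These little utilities are really fun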
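To make the collapsing step concrete, here is a minimal base-R sketch of the idea on a toy matrix. It is an assumption-laden illustration, not Seurat's implementation: `collapse_species_sketch` is a hypothetical helper, and the toy gene names and counts are made up. It keeps all genes carrying the main-species prefix, keeps only the top `ncontrols` control-species genes by total counts, and strips the main-species prefix.

```r
# Hypothetical stand-in for the collapsing logic; not part of Seurat.
collapse_species_sketch <- function(mat, prefix = "HUMAN_", controls = "MOUSE_",
                                    ncontrols = 100) {
  is_main <- startsWith(rownames(mat), prefix)
  is_ctrl <- startsWith(rownames(mat), controls)
  # Rank control-species genes by total counts and keep the top ncontrols
  ctrl_totals <- sort(rowSums(mat[is_ctrl, , drop = FALSE]), decreasing = TRUE)
  keep_ctrl <- names(head(ctrl_totals, ncontrols))
  out <- mat[c(rownames(mat)[is_main], keep_ctrl), , drop = FALSE]
  # Strip the main-species prefix; control genes keep theirs
  rownames(out) <- sub(paste0("^", prefix), "", rownames(out))
  out
}

# Toy data (made up): two human genes, two mouse genes, two cells
toy <- matrix(
  c(5, 0,
    1, 9,
    2, 0,
    3, 4),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("HUMAN_CD3E", "HUMAN_CD19", "MOUSE_Actb", "MOUSE_Gapdh"),
                  c("cell1", "cell2"))
)
collapsed <- collapse_species_sketch(toy, ncontrols = 1)
print(rownames(collapsed))
```

With `ncontrols = 1`, only the mouse gene with the larger total count survives, which mirrors why the real object keeps a small set of mouse genes as negative controls.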
    

    1.2. Creating the Seurat object

    # creates a Seurat object based on the scRNA-seq data
    cbmc <- CreateSeuratObject(counts = cbmc.rna)
    
    # We can see that by default, the cbmc object contains an assay storing RNA measurement
    Assays(cbmc) # Seurat has many query helpers that list one kind of data stored in a Seurat object, i.e. its sub-objects
    ## [1] "RNA"
    

    Query Specific Object Types — Assays • SeuratObject (mojaveazure.github.io)

    # The assay sub-object
    # create a new assay to store ADT information
    adt_assay <- CreateAssayObject(counts = cbmc.adt)
    
    # add this assay to the previously created Seurat object
    cbmc[["ADT"]] <- adt_assay
    
    # Validate that the object now contains multiple assays
    Assays(cbmc)
    ## [1] "RNA" "ADT"
    
    # Extract a list of features measured in the ADT assay
    rownames(cbmc[["ADT"]])
    ##  [1] "CD3"    "CD4"    "CD8"    "CD45RA" "CD56"   "CD16"   "CD10"   "CD11c" 
    ##  [9] "CD14"   "CD19"   "CD34"   "CCR5"   "CCR7"
    
    
    # Note that we can easily switch back and forth between the two assays to specify the default
    # for visualization and analysis
    
    # List the current default assay
    DefaultAssay(cbmc)
    ## [1] "RNA"
    # Switch the default to ADT
    DefaultAssay(cbmc) <- "ADT"
    DefaultAssay(cbmc)
    ## [1] "ADT"
    

    2. Clustering cells based on scRNA-seq

    # Note that all operations below are performed on the RNA assay. Set and verify that the
    # default assay is RNA
    DefaultAssay(cbmc) <- "RNA"
    

    Next comes the usual standard workflow; see [Package] seurat-1 for a review. The code below carries brief comments.

    # perform visualization and clustering steps
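    cbmc <- NormalizeData(cbmc)                                    # normalize the data
    cbmc <- FindVariableFeatures(cbmc)                             # select highly variable genes
    cbmc <- ScaleData(cbmc)                                        # scale the highly variable genes
    cbmc <- RunPCA(cbmc, verbose = FALSE)                          # linear decomposition of the variation across cells in the variable genes
    cbmc <- FindNeighbors(cbmc, dims = 1:30)                       # neighbor graph from cell-cell distances
    cbmc <- FindClusters(cbmc, resolution = 0.8, verbose = FALSE)  # cluster the cells
    cbmc <- RunUMAP(cbmc, dims = 1:30)                             # non-linear low-dimensional embedding for display
    DimPlot(cbmc, label = TRUE)                                    # plot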
    

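    Normalization vs. scaling: normalization removes baseline differences between samples (here cells, e.g. sequencing depth); scaling removes differences in weight between features (here genes), so that highly expressed genes do not dominate.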
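The distinction can be sketched in base R on a made-up count matrix. This is an illustration under assumptions, not the exact Seurat code: it assumes log-normalization of the form log1p(count / cell total * 10000) and per-gene centering and scaling, and it ignores details such as clipping of scaled values.

```r
# Toy counts (made up): 3 genes x 3 cells; cell2 is cell1 sequenced 10x deeper
counts <- matrix(
  c(1, 2, 3,
    10, 20, 30,
    6, 1, 1),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2", "cell3"))
)

# Normalization works per cell (column): divide by the cell's total,
# multiply by a scale factor, then log1p
norm <- log1p(sweep(counts, 2, colSums(counts), "/") * 1e4)

# Scaling works per gene (row): center to mean 0 and scale to sd 1
scaled <- t(scale(t(norm)))

print(round(norm, 2))
print(round(scaled, 2))
```

After normalization, cell1 and cell2 become identical because they differ only in depth; after scaling, every gene has mean 0 and standard deviation 1 across cells.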

    [Figure: clustering result]

    3. Switching between modalities for flexible visualization

    # Normalize ADT data
    DefaultAssay(cbmc) <- "ADT"
    cbmc <- NormalizeData(cbmc, normalization.method = "CLR", margin = 2)
    DefaultAssay(cbmc) <- "RNA"
    
    # Note that the following command is an alternative but returns the same result
    cbmc <- NormalizeData(cbmc, normalization.method = "CLR", margin = 2, assay = "ADT")
    
    # Now, we will visualize CD19 levels for RNA and protein. By setting the default assay, we can
    # visualize one or the other
    DefaultAssay(cbmc) <- "ADT"
    p1 <- FeaturePlot(cbmc, "CD19", cols = c("lightgrey", "darkgreen")) + ggtitle("CD19 protein")
    DefaultAssay(cbmc) <- "RNA"
    p2 <- FeaturePlot(cbmc, "CD19") + ggtitle("CD19 RNA")
    
    # place plots side-by-side
    p1 | p2
    
    # Alternatively, we can use assay keys to specify a modality. Identify the key
    # for the RNA and protein assays
    Key(cbmc[["RNA"]])
    ## [1] "rna_"
    Key(cbmc[["ADT"]])
    ## [1] "adt_"
    # Now, we can include the key in the feature name, which overrides the default assay
    p1 <- FeaturePlot(cbmc, "adt_CD19", cols = c("lightgrey", "darkgreen")) + ggtitle("CD19 protein")
    p2 <- FeaturePlot(cbmc, "rna_CD19") + ggtitle("CD19 RNA")
    p1 | p2
    
    [Figure: feature plots of CD19 protein and CD19 RNA]
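The ADT normalization in this section uses CLR (centered log-ratio) with `margin = 2`. Below is a minimal base-R sketch of what that computes on a toy matrix. It rests on two assumptions worth checking against the Seurat source: that the CLR transform has the form log1p(x / exp(sum(log1p(x[x > 0])) / length(x))), and that `margin = 2` applies it to each cell (column). The helper name `clr_one_cell` and the counts are made up.

```r
# Assumed form of the CLR transform for one cell's vector of ADT counts
clr_one_cell <- function(x) {
  # Denominator: a geometric-mean-like value built from log1p of positive counts
  log1p(x / exp(sum(log1p(x[x > 0])) / length(x)))
}

# Toy ADT counts (made up): 3 proteins x 2 cells
adt <- matrix(
  c(10, 0, 90,
    200, 5, 20),
  nrow = 3,
  dimnames = list(c("CD3", "CD4", "CD19"), c("cell1", "cell2"))
)

# margin = 2 in apply() walks over columns, i.e. one cell at a time
adt_clr <- apply(adt, 2, clr_one_cell)
print(round(adt_clr, 3))
```

Because each cell is divided by its own reference value, cells with very different total antibody counts end up on a comparable scale, while zero counts stay at zero.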

    4. Identifying markers

    # as we know that CD19 is a B cell marker, we can identify cluster 6 as expressing CD19 on the
    # surface
    VlnPlot(cbmc, "adt_CD19")
    
    [Figure: violin plot of adt_CD19]
    # we can also identify alternative protein and RNA markers for this cluster through
    # differential expression
    adt_markers <- FindMarkers(cbmc, ident.1 = 6, assay = "ADT")
    rna_markers <- FindMarkers(cbmc, ident.1 = 6, assay = "RNA")
    
    head(adt_markers)
    ##                p_val avg_log2FC pct.1 pct.2     p_val_adj
    ## CD19   2.067533e-215  1.2787751     1     1 2.687793e-214
    ## CD45RA 8.106076e-109  0.4117172     1     1 1.053790e-107
    ## CD4    1.123162e-107 -0.7255977     1     1 1.460110e-106
    ## CD14   7.212876e-106 -0.5060496     1     1 9.376739e-105
    ## CD3     1.639633e-87 -0.6565471     1     1  2.131523e-86
    ## CD8     1.042859e-17 -0.3001131     1     1  1.355716e-16
    
    head(rna_markers)
    ##       p_val avg_log2FC pct.1 pct.2 p_val_adj
    ## BANK1     0   1.963277 0.456 0.015         0
    ## CD19      0   1.563124 0.351 0.004         0
    ## CD22      0   1.503809 0.284 0.007         0
    ## CD79A     0   4.177162 0.965 0.045         0
    ## CD79B     0   3.774579 0.944 0.089         0
    ## FCRL1     0   1.188813 0.222 0.002         0
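To read the two tables above, it helps to see how the columns can arise. The following base-R sketch uses simulated values, not the CBMC data, and only approximates what `FindMarkers` reports: `pct.1` and `pct.2` are taken as the fraction of cells with a non-zero value inside and outside the cluster, the p-value comes from a Wilcoxon rank-sum test (the `FindMarkers` default), and the adjusted p-value is assumed to be a Bonferroni correction by the number of features. The last assumption is consistent with the ADT table, where p_val_adj equals p_val times 13.

```r
set.seed(1)
# Simulated expression of one feature (made up; not the CBMC data)
in_cluster  <- rpois(50, lambda = 5)     # cells in the cluster of interest
out_cluster <- rpois(200, lambda = 0.2)  # all other cells

# Fraction of cells in which the feature is detected
pct_1 <- mean(in_cluster > 0)
pct_2 <- mean(out_cluster > 0)

# Wilcoxon rank-sum test between the two groups
p_val <- wilcox.test(in_cluster, out_cluster)$p.value

# Bonferroni-style adjustment, assuming 13 tested features as in the ADT assay
n_features <- 13
p_val_adj <- min(1, p_val * n_features)

print(c(pct_1 = pct_1, pct_2 = pct_2))
print(c(p_val = p_val, p_val_adj = p_val_adj))
```

This also explains why every ADT row shows pct.1 = 1 and pct.2 = 1: antibody counts are rarely zero, so detection fractions carry little information for proteins, whereas they are informative for sparse RNA data.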
    
    

    5. Other plots

    # Draw ADT scatter plots (like biaxial plots for FACS). Note that you can even 'gate' cells if
    # desired by using HoverLocator and FeatureLocator
    FeatureScatter(cbmc, feature1 = "adt_CD19", feature2 = "adt_CD3")
    
    [Figure: scatter plot of adt_CD19 vs. adt_CD3]
    # view relationship between protein and RNA
    FeatureScatter(cbmc, feature1 = "adt_CD3", feature2 = "rna_CD3E")
    
    [Figure: scatter plot of adt_CD3 vs. rna_CD3E]
    # It pays to be familiar with the Seurat data format
    # Let's look at the raw (non-normalized) ADT counts. You can see the values are quite high,
    # particularly in comparison to RNA values. This is due to the significantly higher protein
    # copy number in cells, which significantly reduces 'drop-out' in ADT data
    FeatureScatter(cbmc, feature1 = "adt_CD4", feature2 = "adt_CD8", slot = "counts")
    
    [Figure: scatter plot of raw adt_CD4 vs. adt_CD8 counts]

    6. Summary

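    This tutorial covers only the most basic multi-omics joint analysis; it does not even compute correlations. Its final section lists links for further reading: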

    • Defining cellular identity from multimodal data using WNN analysis in Seurat v4 vignette
    • Mapping scRNA-seq data onto CITE-seq references [vignette]
    • Introduction to the analysis of spatial transcriptomics analysis [vignette] (spatial data seems to add just one more layer of information)
    • Analysis of 10x multiome (paired scRNA-seq + ATAC) using WNN analysis [vignette]
    • Signac: Analysis, interpretation, and exploration of single-cell chromatin datasets [package] (which layer of chromatin information?)
    • Mixscape: an analytical toolkit for pooled single-cell genetic screens [vignette]
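Since the summary points out that no correlation is computed, here is a minimal base-R sketch of a protein-versus-RNA correlation. The values are simulated, so the numbers say nothing about the CBMC data; with a real object one would first pull the two columns, for example with `FetchData(cbmc, vars = c("adt_CD3", "rna_CD3E"))`, which is not run here.

```r
set.seed(42)
n_cells <- 300

# Simulated per-cell values (made up): RNA follows protein plus noise
protein <- rnorm(n_cells, mean = 1, sd = 0.5)
rna <- 0.8 * protein + rnorm(n_cells, sd = 0.3)

# Pearson measures linear association; Spearman is rank-based and more
# robust to the skew typical of sparse expression data
pearson  <- cor(protein, rna, method = "pearson")
spearman <- cor(protein, rna, method = "spearman")

print(round(c(pearson = pearson, spearman = spearman), 3))
```

On real CITE-seq data the RNA side is much sparser than the protein side, so a rank-based coefficient is often the safer first look.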
