Seurat 4.0 Tutorial Series, Part 3: Merging 10x Datasets from Two Samples

Author: Seurat_Satija | Published: 2021-05-20 00:11

    Here we merge two 10X PBMC datasets: one containing 4K cells and one containing 8K cells. Both datasets are publicly available from 10x Genomics.

    First, we read in the data and create two Seurat objects.

    library(Seurat)
    pbmc4k.data <- Read10X(data.dir = "../data/pbmc4k/filtered_gene_bc_matrices/GRCh38/")
    pbmc4k <- CreateSeuratObject(counts = pbmc4k.data, project = "PBMC4K")
    pbmc4k
    ## An object of class Seurat 
    ## 33694 features across 4340 samples within 1 assay 
    ## Active assay: RNA (33694 features, 0 variable features)
    
    pbmc8k.data <- Read10X(data.dir = "../data/pbmc8k/filtered_gene_bc_matrices/GRCh38/")
    pbmc8k <- CreateSeuratObject(counts = pbmc8k.data, project = "PBMC8K")
    pbmc8k
    ## An object of class Seurat 
    ## 33694 features across 8381 samples within 1 assay 
    ## Active assay: RNA (33694 features, 0 variable features)
    

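    Merging Two Seurat Objects

    merge() combines the raw count matrices of two Seurat objects and creates a new Seurat object from the result. The add.cell.ids argument prepends an identifier to every cell name so that you can tell which object each cell came from, and the original project name of each cell remains available in the orig.ident metadata column.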

    pbmc.combined <- merge(pbmc4k, y = pbmc8k, add.cell.ids = c("4K", "8K"), project = "PBMC12K")
    pbmc.combined
    ## An object of class Seurat 
    ## 33694 features across 12721 samples within 1 assay 
    ## Active assay: RNA (33694 features, 0 variable features)
    
    # notice the cell names now have an added identifier
    head(colnames(pbmc.combined))
    ## [1] "4K_AAACCTGAGAAGGCCT-1" "4K_AAACCTGAGACAGACC-1" "4K_AAACCTGAGATAGTCA-1"
    ## [4] "4K_AAACCTGAGCGCCTCA-1" "4K_AAACCTGAGGCATGGT-1" "4K_AAACCTGCAAGGTTCT-1"
    
    table(pbmc.combined$orig.ident)
    
    ## 
    ## PBMC4K PBMC8K 
    ##   4340   8381
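
    The reason for prefixing cell names can be illustrated without Seurat. The sketch below uses base R only, with a few made-up barcodes standing in for colnames(pbmc4k) and colnames(pbmc8k); the helper prefix_ids() is hypothetical and is not part of Seurat. It shows that the same barcode can occur in two runs, and that prefixing each run makes the combined names unique.

```r
# Toy barcodes standing in for colnames(pbmc4k) and colnames(pbmc8k).
# These values are made up for illustration.
bc_a <- c("AAACCTGAGAAGGCCT-1", "AAACCTGAGACAGACC-1", "AAACCTGAGATAGTCA-1")
bc_b <- c("AAACCTGAGAAGGCCT-1", "AAACCTGCAAGGTTCT-1")

# Barcodes present in both runs would collide after a naive merge.
shared <- intersect(bc_a, bc_b)

# Hypothetical helper mimicking the effect of add.cell.ids: "<prefix>_<barcode>".
prefix_ids <- function(ids, prefix) paste(prefix, ids, sep = "_")

merged_ids <- c(prefix_ids(bc_a, "4K"), prefix_ids(bc_b, "8K"))

# TRUE if any cell name occurs more than once after prefixing.
any_dup <- anyDuplicated(merged_ids) > 0
```

    On real objects, the same check can be run as intersect(colnames(pbmc4k), colnames(pbmc8k)) before merging.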
    

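    Merging More Than Two Seurat Objects

    To merge more than two objects, pass a vector of Seurat objects to the y argument of merge(). We demonstrate this with the 4K and 8K PBMC datasets together with the Seurat object we computed previously from the 2,700 PBMC dataset.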

    library(SeuratData)
    InstallData("pbmc3k")
    pbmc3k <- LoadData("pbmc3k", type = "pbmc3k.final")
    pbmc3k
    ## An object of class Seurat 
    ## 13714 features across 2638 samples within 1 assay 
    ## Active assay: RNA (13714 features, 2000 variable features)
    ##  2 dimensional reductions calculated: pca, umap
    
    pbmc.big <- merge(pbmc3k, y = c(pbmc4k, pbmc8k), add.cell.ids = c("3K", "4K", "8K"), project = "PBMC15K")
    pbmc.big
    ## An object of class Seurat 
    ## 34230 features across 15359 samples within 1 assay 
    ## Active assay: RNA (34230 features, 0 variable features)
    
    head(colnames(pbmc.big))
    ## [1] "3K_AAACATACAACCAC" "3K_AAACATTGAGCTAC" "3K_AAACATTGATCAGC"
    ## [4] "3K_AAACCGTGCTTCCG" "3K_AAACCGTGTATGCG" "3K_AAACGCACTGGTAC"
    
    tail(colnames(pbmc.big))
    ## [1] "8K_TTTGTCAGTTACCGAT-1" "8K_TTTGTCATCATGTCCC-1" "8K_TTTGTCATCCGATATG-1"
    ## [4] "8K_TTTGTCATCGTCTGAA-1" "8K_TTTGTCATCTCGAGTA-1" "8K_TTTGTCATCTGCTTGC-1"
    
    unique(sapply(X = strsplit(colnames(pbmc.big), split = "_"), FUN = "[", 1))
    ## [1] "3K" "4K" "8K"
    
    table(pbmc.big$orig.ident)
    
    ## pbmc3k PBMC4K PBMC8K 
    ##   2638   4340   8381
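
    The merged object above reports 34230 features, more than any single input, which is consistent with the merge keeping the union of features across objects. The base R sketch below illustrates that idea on toy matrices. It is not Seurat's internal implementation: the gene names, cell names, and the pad_to() helper are all hypothetical, and dense matrices are used for brevity where Seurat works with sparse ones.

```r
# Toy count matrices (genes x cells); all names are made up for illustration.
m1 <- matrix(c(1, 0, 2, 3), nrow = 2,
             dimnames = list(c("GeneA", "GeneB"), c("3K_c1", "3K_c2")))
m2 <- matrix(c(0, 5, 4, 0, 1, 1), nrow = 3,
             dimnames = list(c("GeneB", "GeneC", "GeneD"), c("4K_c1", "4K_c2")))

# Hypothetical helper: expand a matrix to cover every gene in `genes`,
# filling genes that were not measured in this matrix with zeros.
pad_to <- function(m, genes) {
  out <- matrix(0, nrow = length(genes), ncol = ncol(m),
                dimnames = list(genes, colnames(m)))
  out[rownames(m), ] <- m
  out
}

# Union of features, then column-wise concatenation of the padded matrices.
all_genes <- union(rownames(m1), rownames(m2))
merged <- cbind(pad_to(m1, all_genes), pad_to(m2, all_genes))
dim(merged)
```

    A gene missing from one input therefore appears as zero counts for that input's cells, which is worth keeping in mind when the inputs were produced with different references or different feature filtering.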
    

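    Merging Based on Normalized Data

    By default, merge() combines objects based on the raw count matrices only, so the merged object has to be normalized again. To carry over the normalized data matrices along with the raw counts, pass merge.data = TRUE. This is appropriate only when the same normalization approach was applied to every object.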

    pbmc4k <- NormalizeData(pbmc4k)
    pbmc8k <- NormalizeData(pbmc8k)
    pbmc.normalized <- merge(pbmc4k, y = pbmc8k, add.cell.ids = c("4K", "8K"), project = "PBMC12K", 
        merge.data = TRUE)
    GetAssayData(pbmc.combined)[1:10, 1:15]
    
    ## 10 x 15 sparse Matrix of class "dgCMatrix"
    ##                                            
    ## RP11-34P13.3  . . . . . . . . . . . . . . .
    ## FAM138A       . . . . . . . . . . . . . . .
    ## OR4F5         . . . . . . . . . . . . . . .
    ## RP11-34P13.7  . . . . . . . . . . . . . . .
    ## RP11-34P13.8  . . . . . . . . . . . . . . .
    ## RP11-34P13.14 . . . . . . . . . . . . . . .
    ## RP11-34P13.9  . . . . . . . . . . . . . . .
    ## FO538757.3    . . . . . . . . . . . . . . .
    ## FO538757.2    . . . . . . . . . 1 . . . . .
    ## AP006222.2    . . . . . . . . . . . 1 . . .
    GetAssayData(pbmc.normalized)[1:10, 1:15]
    
    ## 10 x 15 sparse Matrix of class "dgCMatrix"
    ##                                                           
    ## RP11-34P13.3  . . . . . . . . . .         . .        . . .
    ## FAM138A       . . . . . . . . . .         . .        . . .
    ## OR4F5         . . . . . . . . . .         . .        . . .
    ## RP11-34P13.7  . . . . . . . . . .         . .        . . .
    ## RP11-34P13.8  . . . . . . . . . .         . .        . . .
    ## RP11-34P13.14 . . . . . . . . . .         . .        . . .
    ## RP11-34P13.9  . . . . . . . . . .         . .        . . .
    ## FO538757.3    . . . . . . . . . .         . .        . . .
    ## FO538757.2    . . . . . . . . . 0.7721503 . .        . . .
    ## AP006222.2    . . . . . . . . . .         . 1.087928 . . .
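
    Why it is safe to merge already normalized data can be sketched in base R. The example assumes the default LogNormalize method of NormalizeData() with a scale factor of 10,000, in which each cell is scaled by its own total count and then log1p-transformed. The log_normalize() function below is a hypothetical re-implementation for illustration, not Seurat code, and the toy matrices share the same genes. Because the calculation uses only each cell's own column, normalizing before or after merging gives the same values.

```r
# Hypothetical re-implementation of log-normalization for illustration:
# divide each cell (column) by its total count, scale, then apply log1p.
log_normalize <- function(counts, scale_factor = 1e4) {
  log1p(sweep(counts, 2, colSums(counts), "/") * scale_factor)
}

# Toy count matrices (genes x cells) sharing the same genes; values are made up.
genes <- paste0("g", 1:3)
a <- matrix(c(1, 0, 3, 2, 2, 0), nrow = 3, dimnames = list(genes, c("a1", "a2")))
b <- matrix(c(0, 4, 1), nrow = 3, dimnames = list(genes, "b1"))

# Normalize each dataset separately and then combine ...
norm_then_merge <- cbind(log_normalize(a), log_normalize(b))
# ... versus combining the raw counts and normalizing once.
merge_then_norm <- log_normalize(cbind(a, b))

# TRUE when both routes give the same normalized matrix.
same <- isTRUE(all.equal(norm_then_merge, merge_then_norm))
```

    The equivalence does not hold for normalization methods that use information across cells, or when the objects were normalized with different settings, which is why the inputs should share one normalization approach.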
    

        Link to this article: https://www.haomeiwen.com/subject/ciopjltx.html