10x Genomics PBMC (7): Cluster Analysis After Data Integration

Author: 程凉皮儿 | Published: 2020-06-11 09:36

    Cluster Analysis in Integrated Data

    clp

    11 June, 2020

    Preparation

    Load the R objects saved in the previous part of this series, together with the required R packages.

    library(data.table)
    library(ggplot2)
    library(Seurat)
    
    load('out/01_immune_combined.rd')  # restores the object immune.combined
    

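    Identifying conserved cell type markers

    To identify canonical cell type marker genes that are conserved across conditions, Seurat provides the FindConservedMarkers function. It performs differential gene expression testing for each dataset/group and combines the p-values using meta-analysis methods from the MetaDE package. For example, we can compute the genes that are conserved markers of the cluster annotated as 'NK', regardless of stimulation condition.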

    DefaultAssay(immune.combined) <- "RNA"
    Idents(immune.combined) <- "seurat_annotations"
    
    message("This run will take 5+ min ...")
    nk.markers <- FindConservedMarkers(immune.combined, ident.1 = "NK", grouping.var = "stim", verbose = FALSE) #default slot: 'data'
    head(nk.markers)
    #>        CTRL_p_val CTRL_avg_logFC CTRL_pct.1 CTRL_pct.2 CTRL_p_val_adj
    #> GNLY            0       4.186117      0.943      0.046              0
    #> NKG7            0       3.164712      0.953      0.085              0
    #> GZMB            0       2.915692      0.839      0.044              0
    #> CLIC3           0       2.407695      0.601      0.024              0
    #> FGFBP2          0       2.241968      0.500      0.021              0
    #> CTSW            0       2.088278      0.537      0.030              0
    #>           STIM_p_val STIM_avg_logFC STIM_pct.1 STIM_pct.2 STIM_p_val_adj
    #> GNLY    0.000000e+00       4.066429      0.956      0.059   0.000000e+00
    #> NKG7    0.000000e+00       2.904602      0.950      0.081   0.000000e+00
    #> GZMB    0.000000e+00       3.128167      0.897      0.060   0.000000e+00
    #> CLIC3   0.000000e+00       2.460388      0.623      0.031   0.000000e+00
    #> FGFBP2 1.674159e-159       1.485116      0.259      0.016  2.352696e-155
    #> CTSW    0.000000e+00       2.175186      0.592      0.035   0.000000e+00
    #>             max_pval minimump_p_val
    #> GNLY    0.000000e+00              0
    #> NKG7    0.000000e+00              0
    #> GZMB    0.000000e+00              0
    #> CLIC3   0.000000e+00              0
    #> FGFBP2 1.674159e-159              0
    #> CTSW    0.000000e+00              0
    

    In addition, we can explore the following marker genes for each cell type to check whether the clusters correspond to specific cell types.

    marker_genes <- c("CD3D", "SELL", "CREM", "CD8A", "GNLY", "CD79A", "FCGR3A", "CCL2", "PPBP")
    
    FeaturePlot(immune.combined, features = marker_genes, min.cutoff = "q9")
    
    [Figure: FeaturePlot panels for the genes in marker_genes]
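
    The min.cutoff = "q9" argument requests a quantile-based lower cutoff: expression values below the 9th percentile are raised to that percentile before coloring, so that near-zero noise does not dominate the color scale. A rough base-R sketch of the idea follows. The helper clip_at_quantile and the toy vector are hypothetical, and Seurat's internal handling may differ in detail (for example, in which cells enter the quantile calculation).

```r
# Hypothetical helper: parse a cutoff such as "q9" and clip values from below
# at the corresponding quantile (a simplified illustration, not Seurat code).
clip_at_quantile <- function(x, cutoff = "q9") {
  prob <- as.numeric(sub("^q", "", cutoff)) / 100
  threshold <- unname(quantile(x, probs = prob))
  pmax(x, threshold)
}

expr <- c(0, 0.1, 0.2, 0.5, 1, 1.5, 2, 2.5, 3, 4, 5)  # toy expression values
clipped <- clip_at_quantile(expr, "q9")
print(clipped)
```

    Only values below the threshold change; everything at or above it is passed through untouched.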

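    The DotPlot function with the split.by parameter is useful for viewing conserved cell type markers across conditions: it shows both the expression level and the percentage of cells in each cluster that express a given gene. Here we plot 2-3 strong marker genes for each of the 13 clusters obtained earlier.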

    
    markers.to.plot <- c("CD3D", "CREM", "HSPH1", "SELL", "GIMAP5", "CACYBP", "GNLY", "NKG7", "CCL5", "CD8A", "MS4A1", "CD79A", "MIR155HG", "NME1", "FCGR3A", "VMO1", "CCL2", "S100A9", "HLA-DQA1", "GPR183", "PPBP", "GNG11", "HBA2", "HBB", "TSPAN13", "IL3RA", "IGJ")
    
    DotPlot(immune.combined,
            features = rev(markers.to.plot), 
            cols = c("blue", "red"), 
            dot.scale = 8, 
            split.by = "stim") + RotatedAxis()
    
    [Figure: DotPlot of markers.to.plot, split by stim]
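
    The two quantities encoded by each dot can be sketched in base R: the average expression of a gene within a cluster/condition group, and the percentage of cells in that group with non-zero expression. The toy data frame and its values are made up, and the sketch omits any rescaling that Seurat applies to the averages before coloring.

```r
# Toy data: one gene measured in six cells (values are made up).
cells <- data.frame(
  cluster = c("NK", "NK", "NK", "B", "B", "B"),
  stim    = c("CTRL", "STIM", "STIM", "CTRL", "CTRL", "STIM"),
  GNLY    = c(4.2, 3.9, 0, 0, 0.3, 0)
)

# One group per cluster/condition pair, as in a DotPlot split by condition.
group <- interaction(cells$cluster, cells$stim, sep = "_", drop = TRUE)
avg <- tapply(cells$GNLY, group, mean)            # would drive dot color
pct <- tapply(cells$GNLY > 0, group, mean) * 100  # would drive dot size

dot_stats <- data.frame(group   = names(avg),
                        avg_exp = as.vector(avg),
                        pct_exp = as.vector(pct))
print(dot_stats)
```

    A marker that is conserved across conditions should show similar dot size and color in the CTRL and STIM rows of the same cluster.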

    Save the R object for use in the next part:

    save(immune.combined, file = 'out/02_immune_cons.rd', compress = TRUE)
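
    One detail of save() and load() is easy to miss: save() stores objects under their variable names, so load() recreates a variable with the same name instead of returning the object itself. The round trip below illustrates this with a toy list and a temporary file; toy_object and path are hypothetical stand-ins for immune.combined and the out/ directory.

```r
# Toy stand-in for immune.combined (a plain list, not a Seurat object).
toy_object <- list(cells = 3, genes = c("GNLY", "NKG7"))

path <- tempfile(fileext = ".rd")   # temporary file instead of out/
save(toy_object, file = path, compress = TRUE)

rm(toy_object)                      # simulate starting a fresh session
restored <- load(path)              # load() returns the restored object names
print(restored)
print(toy_object$genes)
```

    This is why the next part of the series can call load() and then use immune.combined directly.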
    

    Key points to understand at this stage

    • When are conserved genes useful?
    • In a Seurat object:
      • What is an assay?
      • What is a slot?
      • Why are there multiple slots?
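
    As a starting point for the slot questions, the sketch below mimics in base R three representations of the same expression matrix: raw counts, log-normalized data, and scaled data. The toy matrix and variable names are hypothetical, the scale factor of 10,000 is an assumption meant to mirror Seurat's default log-normalization, and this is a conceptual illustration rather than Seurat's implementation.

```r
# Toy count matrix: genes in rows, cells in columns (made-up numbers).
counts <- matrix(c(10,  0,  5,
                    0, 20,  5,
                   90, 80, 90),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("GNLY", "CD79A", "ACTB"),
                                 c("cell1", "cell2", "cell3")))

# "data"-like layer: per-cell library-size normalization, then log1p.
norm_data <- log1p(sweep(counts, 2, colSums(counts), "/") * 1e4)

# "scale.data"-like layer: center and scale each gene across cells.
scaled <- t(scale(t(norm_data)))
print(round(scaled, 2))
```

    Each representation suits a different task, which is one reason to keep several: the FindConservedMarkers call earlier in this post, for instance, notes that its default slot is 'data'.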
