Scillus: Improved Processing and Visualization of scRNA-seq Data (Part 3)

Author: denghb001 | Published: 2021-11-30 09:11

    Plotting

    1. Dimensionality reduction plots

    Dimensionality reduction plots are drawn with the plot_scdata() function:

    plot_scdata(scRNA_int, pal_setup = pal)
    
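    UMAP plot, colored by cluster

    plot_scdata() takes three optional arguments: color_by, split_by, and pal_setup. By default, color_by is "seurat_clusters", so cells are colored by cluster; it can be set to any factor in the metadata, such as "sample" or "group":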

    plot_scdata(scRNA_int, color_by = "group", pal_setup = pal)
    
    UMAP plot, colored by group

    If split_by is set to a factor in the metadata, the plot is split into one panel per level of that factor (compare faceting in ggplot2):

    plot_scdata(scRNA_int, split_by = "sample", pal_setup = pal)
    
    UMAP plot, split by sample

    As with plot_qc(), pal_setup can be an RColorBrewer palette name, a palette setup data frame, or a manually specified color vector.

    plot_scdata(scRNA_int, pal_setup = "Dark2")
    
    UMAP plot, colored by cluster, RColorBrewer Dark2 palette

    plot_scdata(scRNA_int, color_by = "sample", pal_setup = c("red","orange","yellow","green","blue","purple"))

    UMAP plot, colored by sample, manually specified palette

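    2. Statistics plots

    Cell counts and proportions per cluster are plotted with plot_stat(). The plot_type argument is required and must be one of "group_count", "prop_fill", or "prop_multi". The resulting plots are shown below: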

    plot_stat(scRNA_int, plot_type = "group_count")
    
    plot_stat(scRNA_int, "group_count", group_by = "seurat_clusters", pal_setup = pal)
    
    plot_stat(scRNA_int, plot_type = "prop_fill", 
              pal_setup = c("grey90","grey80","grey70","grey60","grey50","grey40","grey30","grey20"))
    
    plot_stat(scRNA_int, plot_type = "prop_multi", pal_setup = "Set3")
    

    The group_by argument defaults to "sample" and can be set to any other factor in the metadata (for example, "group").

    plot_stat(scRNA_int, plot_type = "prop_fill", group_by = "group")
    
    plot_stat(scRNA_int, plot_type = "prop_multi", group_by = "group", pal_setup = c("sienna","bisque3"))
    
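    The numbers behind these plots can be sketched in base R. This is a minimal illustration on made-up metadata, not the code plot_stat() runs: the toy data frame stands in for the metadata of scRNA_int, and the mapping of each plot_type to a table is an assumption based on the description above.

```r
# Toy metadata standing in for the metadata of scRNA_int (values are made up).
meta <- data.frame(
  sample          = c("S1", "S1", "S1", "S2", "S2", "S2", "S2", "S2"),
  seurat_clusters = factor(c(0, 0, 1, 0, 1, 1, 1, 2))
)

# Assumed content of "group_count": number of cells per level of group_by.
group_count <- table(meta$sample)

# Assumed content of "prop_fill" / "prop_multi": share of each cluster
# within each level of group_by.
counts <- table(sample = meta$sample, cluster = meta$seurat_clusters)
props  <- prop.table(counts, margin = 1)  # each row (sample) sums to 1

print(group_count)
print(round(props, 2))
```

    Replacing meta$sample with another metadata factor corresponds to changing group_by.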

    3. Heatmaps

    Plotting a heatmap first requires finding the cluster markers with Seurat:

    markers <- FindAllMarkers(scRNA_int, logfc.threshold = 0.1, min.pct = 0, only.pos = TRUE)
    

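    The top genes of each cluster are then plotted with plot_heatmap(). The number of genes plotted per cluster, n, defaults to 8. In the heatmap, each row is a gene and each column is a cell. Cells are ordered by sort_var, which defaults to c("seurat_clusters"), meaning that cells are ordered by cluster identity. Several variables can be given in sort_var; cells are then sorted by those variables in the order listed.

    Above the heatmap sits an annotation bar. The anno_var argument, a character vector of variable names, selects the categorical or continuous metadata variables to display. The anno_colors argument is a list giving the annotation colors for each of those variables, so it must have the same length as anno_var. Choose palettes appropriate to the variable type, categorical or continuous. As before, RColorBrewer palettes and manually specified palettes are both supported, and a three-color vector can be used for a continuous variable.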

    plot_heatmap(dataset = scRNA_int,
                 markers = markers,
                 sort_var = c("seurat_clusters","sample"),
                 anno_var = c("seurat_clusters","sample","percent.mt","S.Score","G2M.Score"),
                 anno_colors = list("Set2",                                             # RColorBrewer palette
                                    c("red","orange","yellow","purple","blue","green"), # color vector
                                    "Reds",
                                    c("blue","white","red"),                            # Three-color gradient
                                    "Greens"))
    
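    The two selection steps described above, picking the top n genes per cluster and ordering cells by several variables, can be sketched in base R on toy data. The helper top_markers() is hypothetical, and ranking by avg_log2FC is an assumption (older Seurat versions name this column avg_logFC); how plot_heatmap() ranks genes internally is not stated in this article.

```r
# Toy marker table with the FindAllMarkers() columns that matter here.
markers_toy <- data.frame(
  cluster    = factor(c("0", "0", "0", "1", "1", "1")),
  gene       = c("A", "B", "C", "D", "E", "F"),
  avg_log2FC = c(0.5, 2.0, 1.2, 0.3, 0.9, 1.5)
)

# Hypothetical helper: keep the n genes with the largest fold change per cluster.
top_markers <- function(markers, n = 8) {
  ordered <- markers[order(markers$cluster, -markers$avg_log2FC), ]
  # Rank of each row within its cluster (1 = largest fold change).
  rank_in_cluster <- ave(seq_len(nrow(ordered)), ordered$cluster, FUN = seq_along)
  ordered[rank_in_cluster <= n, ]
}

top2 <- top_markers(markers_toy, n = 2)

# Column order of the heatmap: sort cells by the sort_var variables, in order.
meta <- data.frame(
  seurat_clusters = factor(c("1", "0", "1", "0")),
  sample          = c("S2", "S2", "S1", "S1"),
  row.names       = c("cell1", "cell2", "cell3", "cell4")
)
cell_order <- rownames(meta)[order(meta$seurat_clusters, meta$sample)]

print(top2)
print(cell_order)
```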

    In addition, hm_limit and hm_colors set the limits and the color gradient of the heatmap body.

    plot_heatmap(dataset = scRNA_int,
                 n = 6,
                 markers = markers,
                 sort_var = c("seurat_clusters","sample"),
                 anno_var = c("seurat_clusters","sample","percent.mt"),
                 anno_colors = list("Set2",
                                    c("red","orange","yellow","purple","blue","green"),
                                    "Reds"),
                 hm_limit = c(-1,0,1),
                 hm_colors = c("purple","black","yellow"))
    
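    To make the pairing of hm_limit and hm_colors concrete, here is a base R sketch of a three-point color mapping. It assumes that values beyond the outer limits take the end colors and that colors are interpolated linearly in between; map_to_color() is a hypothetical helper, not part of Scillus.

```r
hm_limit  <- c(-1, 0, 1)
hm_colors <- c("purple", "black", "yellow")

# Hypothetical helper: map numeric values to hex colors through three anchors.
map_to_color <- function(x, limits, colors) {
  # Clamp values to the outer limits (assumed behavior).
  x <- pmin(pmax(x, limits[1]), limits[3])
  # Rescale to [0, 1] so that the middle limit lands exactly on 0.5.
  pos <- ifelse(
    x <= limits[2],
    0.5 * (x - limits[1]) / (limits[2] - limits[1]),
    0.5 + 0.5 * (x - limits[2]) / (limits[3] - limits[2])
  )
  ramp <- colorRamp(colors)  # function from [0, 1] to an RGB matrix
  rgb(ramp(pos), maxColorValue = 255)
}

cols <- map_to_color(c(-2.5, -1, 0, 1, 3), hm_limit, hm_colors)
print(cols)
```

    With limits of c(-1, 0, 1), scaled expression values below -1 or above 1 all share the end colors, which keeps a few extreme cells from washing out the rest of the heatmap.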

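    4. GO analysis

    GO analysis results are plotted with plot_cluster_go() and plot_all_cluster_go(). The former plots one specified cluster, while the latter iterates over all clusters. The topn argument of plot_cluster_go() sets the number of top genes used for the GO analysis and defaults to 100. The org argument specifies the organism; "human" and "mouse" are the accepted values. plot_all_cluster_go() is a wrapper around plot_cluster_go(), which is in turn a wrapper around clusterProfiler::enrichGO(), so arguments given through ... are passed on to the inner functions.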

    plot_cluster_go(markers, cluster_name = "1", org = "human", ont = "CC")
    
    plot_all_cluster_go(markers, org = "human", ont = "CC")
    
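    The wrapper chain explains why an argument such as ont = "CC" can be given to the outermost function. The base R sketch below shows the forwarding pattern only; all three functions are hypothetical stand-ins, and enrich_inner() does no enrichment at all.

```r
# Hypothetical stand-in for the innermost enrichment function.
enrich_inner <- function(genes, ont = "BP", pvalueCutoff = 0.05) {
  list(n_genes = length(genes), ont = ont, pvalueCutoff = pvalueCutoff)
}

# Hypothetical per-cluster wrapper: picks the top genes, forwards everything else.
cluster_go_sketch <- function(markers, cluster_name, topn = 100, ...) {
  genes <- head(markers$gene[markers$cluster == cluster_name], topn)
  enrich_inner(genes, ...)
}

# Hypothetical all-cluster wrapper: loops over clusters and forwards `...` again.
all_cluster_go_sketch <- function(markers, ...) {
  clusters <- unique(as.character(markers$cluster))
  res <- lapply(clusters, function(cl) cluster_go_sketch(markers, cl, ...))
  names(res) <- clusters
  res
}

markers_toy <- data.frame(
  cluster = c("0", "0", "1"),
  gene    = c("KRT14", "KRT5", "CD3E")
)

# `ont` is not a formal argument of either wrapper, yet it reaches enrich_inner().
res <- all_cluster_go_sketch(markers_toy, ont = "CC")
str(res)
```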

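    5. Plotting measures

    Measures are defined as the continuous variables in the metadata together with gene expression values. plot_measure() summarizes them as box and violin plots, and plot_measure_dim() as dimensionality reduction plots. Arguments such as group_by, split_by, and pal_setup work as described above.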

    plot_measure(dataset = scRNA_int, 
                 measures = c("KRT14","percent.mt"), 
                 group_by = "seurat_clusters", 
                 pal_setup = pal)
    
    plot_measure_dim(dataset = scRNA_int, 
                     measures = c("nFeature_RNA","nCount_RNA","percent.mt","KRT14"))
    
    plot_measure_dim(dataset = scRNA_int, 
                     measures = c("nFeature_RNA","nCount_RNA","percent.mt","KRT14"),
                     split_by = "sample")
    
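    Because a measure can be either a metadata column or a gene, a name such as "KRT14" or "percent.mt" has to be resolved against two sources. A minimal base R sketch of that lookup on toy data follows; get_measure() is a hypothetical helper, and checking the metadata before the expression matrix is an assumption.

```r
# Toy data: one metadata column and a genes-by-cells expression matrix (made up).
meta <- data.frame(
  percent.mt = c(2.1, 5.4, 3.3),
  row.names  = c("cell1", "cell2", "cell3")
)
expr <- matrix(
  c(0.0, 1.2, 3.4,
    2.0, 0.0, 0.5),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("KRT14", "CD3E"), c("cell1", "cell2", "cell3"))
)

# Hypothetical helper: return one numeric value per cell for a measure name.
get_measure <- function(name, meta, expr) {
  if (name %in% colnames(meta)) {
    setNames(meta[[name]], rownames(meta))   # metadata variable
  } else if (name %in% rownames(expr)) {
    expr[name, rownames(meta)]               # gene expression
  } else {
    stop("Measure not found in metadata or expression matrix: ", name)
  }
}

# One column per measure, one row per cell.
measures <- sapply(c("KRT14", "percent.mt"), get_measure, meta = meta, expr = expr)
print(measures)
```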

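    6. GSEA

    To run GSEA, first find the differentially expressed genes (DEGs) and their associated measures with find_diff_genes(). Then pass the resulting ranked list to test_GSEA() to run the analysis. (Note: finding DEGs with Seurat can take a long time, so parallelizing with the future package is recommended.) Finally, plot the output with plot_GSEA(), which takes additional arguments for adjusting the p-value cutoff and the color gradient.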

    de <- find_diff_genes(dataset = scRNA_int, 
                          clusters = as.character(0:7),
                          comparison = c("group", "CTCL", "Normal"),
                          logfc.threshold = 0,   # threshold of 0 is used for GSEA
                          min.cells.group = 1)   # To include clusters with only 1 cell
    
    gsea_res <- test_GSEA(de, 
                          pathway = pathways.hallmark)
    
    plot_GSEA(gsea_res, p_cutoff = 0.1, colors = c("#0570b0", "grey", "#d7301f"))
    
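    GSEA works on a ranking of all tested genes rather than on a thresholded list, which is why logfc.threshold = 0 is used above. The base R sketch below builds such a ranking from a toy table. Ranking by avg_log2FC is an assumption made for illustration; the statistic that test_GSEA() uses is not stated in this article.

```r
# Toy differential expression table for one cluster (values are made up).
de_toy <- data.frame(
  gene       = c("KRT14", "CD3E", "MKI67", "GAPDH"),
  avg_log2FC = c(1.8, -0.7, 0.4, -0.1)
)

# Ranked list: a named numeric vector, most up-regulated gene first.
ranks <- setNames(de_toy$avg_log2FC, de_toy$gene)
ranks <- sort(ranks, decreasing = TRUE)

# A gene set is a character vector of gene names; GSEA asks whether its
# members cluster toward one end of the ranking.
gene_set  <- c("KRT14", "MKI67")
positions <- match(gene_set, names(ranks))

print(ranks)
print(positions)
```

    Genes with small fold changes, such as GAPDH here, still occupy a position in the ranking, so filtering them out beforehand would distort the enrichment statistic.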

    Reference:
    https://github.com/xmc811/Scillus
