scRNA基础分析-6:富集分析

作者: 小贝学生信 | 来源:发表于2020-08-26 19:34 被阅读0次

    scRNA基础分析-1:安装包、导入数据、过滤质控 - 简书
    scRNA基础分析-2:降维与聚类 - 简书
    scRNA基础分析-3:鉴定细胞类型 - 简书
    scRNA基础分析-4:细胞亚类再聚类、注释 - 简书
    scRNA基础分析-5:伪时间分析 - 简书
    scRNA基础分析-6:富集分析 - 简书

    在之前的分析中,已经将细胞分为多种不同的类型,例如cluster、cell type、stage等。以cluster为例,不同的cluster有哪些基因表达差异显著,有何特别的意义?富集分析可以帮助我们挖掘一些信息。
    之前学习转录组时,已了解有关富集分析的基本概念,详见笔记

    library(Seurat)
    library(tidyverse)
    library(patchwork)
    library(monocle)
    library(clusterProfiler)
    library(org.Hs.eg.db)
    rm(list=ls())
    
    scRNA <- readRDS("scRNA.rds")
    mycds <- readRDS("mycds.rds")
    

    1、差异分析

    • 差异分析是富集分析的基础,基于前面的结果,我们可以进行多种差异分析。
    #比较cluster0和cluster1的差异表达基因
    dge.cluster <- FindMarkers(scRNA,ident.1 = 0,ident.2 = 1)
    sig_dge.cluster <- subset(dge.cluster, p_val_adj<0.01&abs(avg_logFC)>1)
    
    #比较B_cell和T_cells的差异表达基因,后面的演示以此数据为例
    dge.celltype <- FindMarkers(scRNA, ident.1 = 'B_cell', ident.2 = 'T_cells', group.by = 'celltype')
    sig_dge.celltype <- subset(dge.celltype, p_val_adj<0.01&abs(avg_logFC)>1)
    
    #比较拟时State1和State3的差异表达基因
    p_data <- subset(pData(mycds),select='State')
    scRNAsub <- subset(scRNA, cells=row.names(p_data))
    scRNAsub <- AddMetaData(scRNAsub,p_data,col.name = 'State')
    dge.State <- FindMarkers(scRNAsub, ident.1 = 1, ident.2 = 3, group.by = 'State')
    sig_dge.State <- subset(dge.State, p_val_adj<0.01&abs(avg_logFC)>1)
    

    2、go分析

    ego_ALL <- enrichGO(gene          = row.names(sig_dge.celltype),
                        #universe     = row.names(dge.celltype),
                        OrgDb         = 'org.Hs.eg.db',
                        keyType       = 'SYMBOL',
                        ont           = "ALL",
                        pAdjustMethod = "BH",
                        pvalueCutoff  = 0.01,
                        qvalueCutoff  = 0.05)
    ego_all <- data.frame(ego_ALL)
    
    2-1
    ego_CC <- enrichGO(gene          = row.names(sig_dge.celltype),
                       #universe     = row.names(dge.celltype),
                       OrgDb         = 'org.Hs.eg.db',
                       keyType       = 'SYMBOL',
                       ont           = "CC",
                       pAdjustMethod = "BH",
                       pvalueCutoff  = 0.01,
                       qvalueCutoff  = 0.05)
    ego_MF <- enrichGO(gene          = row.names(sig_dge.celltype),
                       #universe     = row.names(dge.celltype),
                       OrgDb         = 'org.Hs.eg.db',
                       keyType       = 'SYMBOL',
                       ont           = "MF",
                       pAdjustMethod = "BH",
                       pvalueCutoff  = 0.01,
                       qvalueCutoff  = 0.05)
    ego_BP <- enrichGO(gene          = row.names(sig_dge.celltype),
                       #universe     = row.names(dge.celltype),
                       OrgDb         = 'org.Hs.eg.db',
                       keyType       = 'SYMBOL',
                       ont           = "BP",
                       pAdjustMethod = "BH",
                       pvalueCutoff  = 0.01,
                       qvalueCutoff  = 0.05) 
    #截取每个description的前70个字符,方便后面作图排版
    ego_CC@result$Description <- substring(ego_CC@result$Description,1,70)
    ego_MF@result$Description <- substring(ego_MF@result$Description,1,70)
    ego_BP@result$Description <- substring(ego_BP@result$Description,1,70)
    p_BP <- barplot(ego_BP,showCategory = 10) + ggtitle("barplot for Biological process")
    p_CC <- barplot(ego_CC,showCategory = 10) + ggtitle("barplot for Cellular component")
    p_MF <- barplot(ego_MF,showCategory = 10) + ggtitle("barplot for Molecular function")
    plotc <- p_BP/p_CC/p_MF
    
    2-2

    3、kegg分析

    genelist <- bitr(row.names(sig_dge.celltype), fromType="SYMBOL",
                               toType="ENTREZID", OrgDb='org.Hs.eg.db')
    # kegg分析的基因名必须要是ENTREZID
    genelist <- pull(genelist,ENTREZID)               
    ekegg <- enrichKEGG(gene = genelist, organism = 'hsa')
    p1 <- barplot(ekegg, showCategory=20)
    p2 <- dotplot(ekegg, showCategory=20)
    plotc = p1/p2
    
    2-3

    相关文章

      网友评论

        本文标题:scRNA基础分析-6:富集分析

        本文链接:https://www.haomeiwen.com/subject/wwkfsktx.html