美文网首页GSEA
GSEA富集分析

GSEA富集分析

作者: MJades | 来源:发表于2020-05-05 17:15 被阅读0次

    GSEA is designed to analyze ranked lists of all available genes and does not require a threshold.

    Y叔的clusterProfiler做GSEA分析超级方便~~

    参数解读:

    • GSEA searches for pathways whose genes are enriched at the top or bottom of the ranked gene list, more so than expected by chance alone.
    • To calculate an enrichment score (ES) for a pathway, GSEA progressively examines genes from the top to the bottom of the ranked list, increasing the ES if a gene is part of the pathway and decreasing the score otherwise.
    • These running sum values are weighted, so that enrichment in the very top- (and bottom-) ranking genes is amplified, whereas enrichment in genes with more moderate ranks are not amplified.
    • The ES score is calculated as the maximum value of the running sum and normalized relative to pathway size, resulting in a normalized enrichment score (NES) that reflects the enrichment of the pathway in the list. Positive and negative NES values represent enrichment at the top and bottom of the list, respectively.
    • Finally, a permutation-based P value is computed and corrected for multiple testing to produce a permutation-based false-discovery rate (FDR) Q value that ranges from 0 (highly significant) to 1 (not significant)
    • Resulting pathways are selected using the FDR Q value threshold (e.g., Q < 0.05) and ranked using NES. In addition, the ‘leading edge’ aspect of the GSEA analysis identifies specific genes that most strongly contribute to the detected enrichment signal of a pathway.

    从以上参数解读了解到,结果可通过 FDR q-value筛选,利用NES进行排序解读。

    # Select gene set form msigdf
    msigdf.human <-msigdf::msigdf.human 
    GeneS<-msigdf.human[grepl("LYSOSOM",msigdf.human$geneset),]%>% select(geneset,   symbol) %>% as.data.frame
    
    # gene.rank is a data.frame, which contains Gene and FC (Fold-Change)
    gene.rank
    gene.sort <- arrange(gene.rank, desc(logFC)) # 降序排列
    geneList1<-gene.sort$FC
    names(geneList1)<-gene.sort$Gene # 命名
    
    # GSEA
    gsea<- GSEA(geneList, TERM2GENE = GeneS, verbose=FALSE, pvalueCutoff = 0.5)
    # Note: 1. 结果是不断变的,原因是by="fgsea",可以将nPerm设置的大一些,默认是1000.
    #       2. 原始文献中的选择:p value < 0.05, FDR < 0.25,可结合自己需求调整。
    
    # 可视化第一个结果,gseaplot2中by函数可以选择部分结果展示,如"preranked".
    gseaplot2(gsea,geneSetID=1,title=gsea$Description[1],pvalue_table=T)
    # geneSetID多选择几个,即可实现多个gene set的可视化。
    
    修图之后

    相关文章

      网友评论

        本文标题:GSEA富集分析

        本文链接:https://www.haomeiwen.com/subject/ldzmghtx.html