美文网首页
Analyzing RNA-seq data with DESe

Analyzing RNA-seq data with DESe

作者: BINBINCC | 来源:发表于2020-10-16 15:48 被阅读0次

    前面铺垫了那么多终于要开始了

    Analyzing RNA-seq data with DESeq2(一)
    Analyzing RNA-seq data with DESeq2(二)
    Analyzing RNA-seq data with DESeq2(三)
    Analyzing RNA-seq data with DESeq2(四)
    Analyzing RNA-seq data with DESeq2(五)

    Differential expression analysis

    The standard differential expression analysis steps are wrapped into a single function, DESeq.
    Results tables are generated using the function results, which extracts a results table with log2 fold changes, p values and adjusted p values.

    dds <- DESeq(dds)
    res <- results(dds)
    res
    
    ## log2 fold change (MLE): condition treated vs untreated 
    ## Wald test p-value: condition treated vs untreated 
    ## DataFrame with 9921 rows and 6 columns
    ##               baseMean log2FoldChange     lfcSE       stat    pvalue      padj
    ##              <numeric>      <numeric> <numeric>  <numeric> <numeric> <numeric>
    ## FBgn0000008   95.14429     0.00227644  0.223729   0.010175 0.9918817  0.997211
    ## FBgn0000014    1.05652    -0.49512039  2.143186  -0.231021 0.8172987        NA
    ## FBgn0000017 4352.55357    -0.23991894  0.126337  -1.899041 0.0575591  0.288002
    ## FBgn0000018  418.61048    -0.10467391  0.148489  -0.704927 0.4808558  0.826834
    ## FBgn0000024    6.40620     0.21084779  0.689588   0.305759 0.7597879  0.943501
    ## ...                ...            ...       ...        ...       ...       ...
    ## FBgn0261570 3208.38861      0.2955329  0.127350  2.3206264  0.020307  0.144240
    ## FBgn0261572    6.19719     -0.9588230  0.775315 -1.2366888  0.216203  0.607848
    ## FBgn0261573 2240.97951      0.0127194  0.113300  0.1122634  0.910615  0.982657
    ## FBgn0261574 4857.68037      0.0153924  0.192567  0.0799327  0.936291  0.988179
    ## FBgn0261575   10.68252      0.1635705  0.930911  0.1757102  0.860522  0.967928
    

    Note that we could have specified the coefficient or contrast we want to build a results table for, using either of the following equivalent commands:
    如果需要明确如何处理对象则可以使用以下方法:

    res <- results(dds, name="condition_treated_vs_untreated")
    ##The text, condition treated vs untreated, tells you that the estimates are of the logarithmic fold change log2(treated/untreated).
    通过对treated和untreated顺序的更改来达到明确处理对象的问题
    res2 <- results(dds, contrast=c("condition","untreated","treated"))
    res2
    
    log2 fold change (MLE): condition untreated vs treated 
    Wald test p-value: condition untreated vs treated 
    DataFrame with 9921 rows and 6 columns
                        baseMean       log2FoldChange             lfcSE                 stat             pvalue              padj
                       <numeric>            <numeric>         <numeric>            <numeric>          <numeric>         <numeric>
    FBgn0000008 95.1440789963134 -0.00215437939131803 0.223778503681452 -0.00962728481903151  0.992318656737681 0.997034169677036
    FBgn0000014 1.05657219346166    0.495299273223956  2.14327479881991    0.231094619083222  0.817241295341332                NA
    FBgn0000017  4352.5928987935    0.240025555383081 0.126342087768198     1.89980678349608 0.0574584804033397 0.287669966702742
    FBgn0000018 418.614930505122    0.104799219010569 0.148536767352526    0.705543959778297  0.480471785048082 0.826695551321695
    FBgn0000024 6.40628919476961   -0.210746819818719 0.689970836726196   -0.305443083389843  0.760028712717949 0.943790945818601
    ...                      ...                  ...               ...                  ...                ...               ...
    FBgn0261570 3208.38445969106   -0.295433108560876 0.127325952346867    -2.32028980043316 0.0203252055046957 0.144360633334623
    FBgn0261572 6.19713652050888     0.95895731601274 0.775588232046352     1.23642582028685  0.216300323350011 0.608242035446242
    FBgn0261573 2240.98398636611  -0.0126194294797913 0.113296402610125   -0.111384202755468  0.911311686482631 0.982594010602204
    FBgn0261574 4857.74267170989  -0.0152592268752615 0.192504546880859  -0.0792668387448812  0.936820382128824 0.988423794214432
    FBgn0261575 10.6835537573787   -0.163219202829332 0.930591813482421   -0.175392906389902  0.860770914073928 0.967954942188896
    

    Log fold change shrinkage for visualization and ranking

    对数据进行收缩处理

    resultsNames(dds)
    ## [1] "Intercept"                      "condition_treated_vs_untreated"
    
    resLFC <- lfcShrink(dds, coef="condition_treated_vs_untreated", type="apeglm")
    resLFC
    
    ## log2 fold change (MAP): condition treated vs untreated 
    ## Wald test p-value: condition treated vs untreated 
    ## DataFrame with 9921 rows and 5 columns
    ##               baseMean log2FoldChange     lfcSE    pvalue      padj
    ##              <numeric>      <numeric> <numeric> <numeric> <numeric>
    ## FBgn0000008   95.14429     0.00119920  0.151897 0.9918817  0.997211
    ## FBgn0000014    1.05652    -0.00473412  0.205468 0.8172987        NA
    ## FBgn0000017 4352.55357    -0.18989990  0.120377 0.0575591  0.288002
    ## FBgn0000018  418.61048    -0.06995753  0.123901 0.4808558  0.826834
    ## FBgn0000024    6.40620     0.01752715  0.198633 0.7597879  0.943501
    ## ...                ...            ...       ...       ...       ...
    ## FBgn0261570 3208.38861     0.24110290 0.1244469  0.020307  0.144240
    ## FBgn0261572    6.19719    -0.06576173 0.2141351  0.216203  0.607848
    ## FBgn0261573 2240.97951     0.01000619 0.0993764  0.910615  0.982657
    ## FBgn0261574 4857.68037     0.00843552 0.1408267  0.936291  0.988179
    ## FBgn0261575   10.68252     0.00809101 0.2014704  0.860522  0.967928
    

    Using parallelization

    并行处理

    使用BiocParallel包

    Parallelizing DESeq, results, and lfcShrink can be easily accomplished by loading the BiocParallel package, and then setting the following arguments: parallel=TRUE and BPPARAM=MulticoreParam(4), for example, splitting the job over 4 cores.

    library("BiocParallel")  ##使用BiocParallel包
    register(MulticoreParam(4))
    #提前声明好后,直接添加parallel=TRUE选项即可
    
    ?results ?DESeq ?lfcShrink

    p-values and adjusted p-values

    P值调整

    ##p值从小到大排列
    resOrdered <- res[order(res$pvalue),]
    
    ##使用summary函数获取简单的基本结果
    summary(res)
    ## 
    ## out of 9921 with nonzero total read count
    ## adjusted p-value < 0.1
    ## LFC > 0 (up)       : 518, 5.2%
    ## LFC < 0 (down)     : 536, 5.4%
    ## outliers [1]       : 1, 0.01%
    ## low counts [2]     : 1539, 16%
    ## (mean count < 6)
    ## [1] see 'cooksCutoff' argument of ?results
    ## [2] see 'independentFiltering' argument of ?results
    
    ##p值小于0.1的共有多少个
    sum(res$padj < 0.1, na.rm=TRUE)
    ##[1] 1048
    

    如果想要提取排序后的差异分析结果可以这样操作:

    resOrdered <- res[order(res$padj),]  #padj排序
    resOrdered_frame=as.data.frame(resOrdered)
    write.csv(resOrdered_frame, 
              file="condition_treated_results.csv")
    
    
    ##通过class函数看看数据类型
    > class(resOrdered)
    ##[1] "DESeqResults"
    ##attr(,"package")
    ##[1] "DESeq2"
    
    > class(resOrdered_frame)
    ##[1] "data.frame"
    
    

    函数results有着丰富的参数设置,可以自定义生成的结果,默认p值=0.1,可使用alpha参数更改为合适的值。
    例如:

    res05 <- results(dds, alpha=0.05)
    summary(res05)
    ## 
    ##out of 9921 with nonzero total read count
    ##adjusted p-value < 0.05
    ##LFC > 0 (up)       : 408, 4.1%
    ##LFC < 0 (down)     : 428, 4.3%
    ##outliers [1]       : 1, 0.01%
    ##low counts [2]     : 1154, 12%
    ##(mean count < 4)
    ##[1] see 'cooksCutoff' argument of ?results
    ##[2] see 'independentFiltering' argument of ?results
    
    sum(res05$padj < 0.05, na.rm=TRUE)
    ## [1] 836
    

    基本的数据处理到这里就完成了,接下来就是结果展示了。

    下次见咯( ^ . ^ )

    大家一起学习讨论鸭!

    来一杯!

    相关文章

      网友评论

          本文标题:Analyzing RNA-seq data with DESe

          本文链接:https://www.haomeiwen.com/subject/dovppktx.html