ATAC-Seq 差异分析 DiffBind

作者: BeeBee生信 | 来源:发表于2020-12-01 10:34 被阅读0次

本文是《ATAC-Seq 分析流程》拆分出来的 DiffBind 差异分析部分。DiffBind 有许多的 Normalization 方法,建议使用前先看文档了解清楚,这里只提供简单的分析步骤,只用默认参数。

DiffBind 借助 RNA-seq 分析差异基因思路。软件首先将所有样本的峰合并(bedtools merge),得到总的峰区域信息,再计算每个峰区域 Reads 数,最后调用 DESeq2/edgeR 进行差异分析。

library(DiffBind, quietly = TRUE)
library(tidyverse, quietly = TRUE)

读入数据,没有的信息就填 NA. 不会有影响。

> samples <- read.csv("Samples.csv", header = TRUE)
> samples
  SampleID Tissue Factor Condition Treatment Replicate
1      KO1     NA     NA        NA        KO         1
2      KO2     NA     NA        NA        KO         2
3      WT1     NA     NA        NA        WT         1
4      WT2     NA     NA        NA        WT         2
                                                                  bamReads
1 /Example/KO1_ATAC_ShiftSort.bam
2 /Example/KO2_ATAC_ShiftSort.bam
3 /Example/WT1_ATAC_ShiftSort.bam
4 /Example/WT2_ATAC_ShiftSort.bam
  ControlID bamControl
1        NA         NA
2        NA         NA
3        NA         NA
4        NA         NA
                                                                   Peaks
1 /Example/KO1_ATAC_peaks.narrowPeak
2 /Example/KO2_ATAC_peaks.narrowPeak
3 /Example/WT1_ATAC_peaks.narrowPeak
4 /Example/WT2_ATAC_peaks.narrowPeak
  PeakCaller
1     narrow
2     narrow
3     narrow
4     narrow

构建 dba 对象。

> sampleDba <- dba(sampleSheet = samples)
KO1    KO 1 narrow
KO2    KO 2 narrow
WT1    WT 1 narrow
WT2    WT 2 narrow

计算 count, 然后会给出 FRiP 数值。

sampleCount <- dba.count(sampleDba)
> sampleCount
4 Samples, 9127 sites in matrix:
   ID Treatment Replicate    Reads FRiP
1 KO1        KO         1 12317324 0.01
2 KO2        KO         2 11922986 0.01
3 WT1        WT         1 30846990 0.02
4 WT2        WT         2 29012696 0.02

进行均一化,然后分析差异的 Peaks.

sampleNor <- dba.normalize(sampleCount)
sampleCon <- dba.contrast(sampleNor, design = ~ Treatment, contrast = c("Treatment", "WT", "KO"), reorderMeta = list(Treatment="WT"))
sampleDiff <- dba.analyze(sampleCon, bBlacklist = FALSE, bGreylist = FALSE)

> dba.show(sampleDiff, bContrasts = TRUE)
     Factor Group Samples Group2 Samples2 DB.DESeq2
1 Treatment    WT       2     KO        2      6293

导出文本的结果

diffPeak <- dba.report(sampleDiff)
> diffPeak
GRanges object with 6293 ranges and 6 metadata columns:
       seqnames              ranges strand |      Conc   Conc_WT   Conc_KO
          <Rle>           <IRanges>  <Rle> | <numeric> <numeric> <numeric>
  1041    chr10   74826166-74826566      * |      6.69      7.44      5.00
  7544     chr6 155314400-155314800      * |      6.61      7.38      4.84
  8254     chr8   73972212-73972612      * |      7.19      7.82      6.02
  6555     chr5   91383243-91383643      * |      6.84      7.54      5.41
  7830     chr7   94657864-94658264      * |      6.47      7.24      4.70
   ...      ...                 ...    ... .       ...       ...       ...
  3936    chr19   35612533-35612933      * |      4.46      4.91      3.79
  4659     chr2 175181550-175181950      * |      4.04      4.55      3.25
  1275    chr11   13463100-13463500      * |      4.64      5.06      4.04
  8757     chr9   97983046-97983446      * |      5.10      5.47      4.60
  6553     chr5   91280440-91280840      * |      5.28      5.62      4.84
            Fold   p-value       FDR
       <numeric> <numeric> <numeric>
  1041      2.34  1.11e-15  1.02e-11
  7544      2.42  4.96e-15  2.26e-11
  8254      1.76  8.53e-15  2.60e-11
  6555      2.06  4.11e-14  9.38e-11
  7830      2.41  1.21e-13  2.20e-10
   ...       ...       ...       ...
  3936      0.99    0.0344    0.0499
  4659      1.10    0.0344    0.0499
  1275      0.91    0.0344    0.0500
  8757      0.81    0.0344    0.0500
  6553      0.73    0.0345    0.0500
  -------
  seqinfo: 23 sequences from an unspecified genome; no seqlengths

还有些函数可以出一些简单的图。

相关文章

网友评论

    本文标题:ATAC-Seq 差异分析 DiffBind

    本文链接:https://www.haomeiwen.com/subject/cvnjwktx.html