ChIP-Seq: DiffBind with no control and no replicate samples


Author: scdzzdw | Published 2019-10-23 17:14

Others have already written detailed guides to using DiffBind, which are worth consulting:

The official manual is also attached:

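DiffBind first needs a SampleSheet file to be imported, in csv format. The official documentation says "Spreadsheets in Excel® format, with a .xls or .xlsx suffix, are also accepted", but importing one failed for me.
The SampleSheet file has a fixed set of columns:

[image: SampleSheet columns]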
The options for PeakCaller are:
– “raw”: text file; peak score is in fourth column
– “bed”: .bed file; peak score is in fifth column
– “narrow”: default peak.format: narrowPeaks file
– “macs”: MACS .xls file
– “swembl”: SWEMBL .peaks file
– “bayes”: bayesPeak file
– “peakset”: peakset written out using pv.writepeakset
– “fp4”: FindPeaks v4
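(See the manual for details: https://www.bioconductor.org/packages/release/bioc/manuals/DiffBind/man/DiffBind.pdf)
My samples have no control input, so the two columns ControlID and bamControl are left empty.
[image: the SampleSheet for these samples]
The bam file path can also stay as E:\defect\DNA_protein_interaction\GSE55506\Differential_expression\T2N_H3K4me3_sorted.bam. There is no need to write it with the \\ that R string literals require, because the csv is read as data rather than as R code. Importing it into R as a test: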
> dbObj <- dba(sampleSheet="SampleSheet.csv")
trisomy_21 fibroblasts trisomy_21 trisomy_21 trisomy_21 1 narrow
euploid fibroblasts euploid euploid euploid 1 narrow
> dbObj
2 Samples, 33153 sites in matrix (47495 total):
          ID      Tissue     Factor  Condition  Treatment Replicate Caller Intervals
1 trisomy_21 fibroblasts trisomy_21 trisomy_21 trisomy_21         1 narrow     40820
2    euploid fibroblasts    euploid    euploid    euploid         1 narrow     44391
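
The sample sheet described above can also be built in R itself. This is a minimal sketch, not the author's actual file. The column names follow the standard DiffBind sample sheet layout and the sample metadata is taken from the output above. All file paths are hypothetical placeholders. Forward slashes are used so that no backslash escaping is needed inside R strings.

```r
# Build a DiffBind-style sample sheet without control samples.
# All file paths below are hypothetical placeholders.
sheet <- data.frame(
  SampleID   = c("trisomy_21", "euploid"),
  Tissue     = "fibroblasts",
  Factor     = c("trisomy_21", "euploid"),
  Condition  = c("trisomy_21", "euploid"),
  Treatment  = c("trisomy_21", "euploid"),
  Replicate  = 1L,
  bamReads   = c("bam/T21_H3K4me3_sorted.bam", "bam/D21_H3K4me3_sorted.bam"),
  ControlID  = NA_character_,   # no control/input sample
  bamControl = NA_character_,   # no control bam file
  Peaks      = c("peaks/T21_peaks.narrowPeak", "peaks/D21_peaks.narrowPeak"),
  PeakCaller = "narrow",
  stringsAsFactors = FALSE
)

# na = "" writes the control columns as empty cells rather than the text "NA".
csv_path <- file.path(tempdir(), "SampleSheet.csv")
write.csv(sheet, csv_path, row.names = FALSE, na = "")

# Reading the file back shows the empty control columns come in as NA.
check <- read.csv(csv_path, stringsAsFactors = FALSE)
print(check[, c("SampleID", "ControlID", "bamControl", "PeakCaller")])
```

The DiffBind documentation describes the sampleSheet argument of dba() as accepting either a file name or a data frame, so passing `sheet` directly may be a way to skip the csv step; I have not verified that against the version used in this post.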

No problem here, which shows that input without a control is accepted. The differential analysis, however, fails:

> dbObj <- dba.contrast(dbObj, categories=DBA_FACTOR,minMembers = 1)
Error in dba.contrast(dbObj, categories = DBA_FACTOR, minMembers = 1) : 
  minMembers must be at least 2. Use of replicates strongly advised.
> dbObj <- dba.contrast(dbObj, categories=DBA_FACTOR,minMembers = 2)
Warning message:
No contrasts added. Perhaps try more categories, or lower value for minMembers. 
> dbObj <- dba.analyze(dbObj)
Error in pv.DBA(DBA, method, bSubControl, bFullLibrarySize, bTagwise = bTagwise,  : 
  Unable to perform analysis: no contrasts specified.
In addition: Warning message:
No contrasts added. Perhaps try more categories, or lower value for minMembers. 
> dbObj <- dba.contrast(dbObj, categories=DBA_CONDITION)
Warning message:
No contrasts added. Perhaps try more categories, or lower value for minMembers. 
> dbObj <- dba.contrast(dbObj, categories=DBA_CONDITION, minMembers = 1)
Error in dba.contrast(dbObj, categories = DBA_CONDITION, minMembers = 1) : 
  minMembers must be at least 2. Use of replicates strongly advised.

Going by the messages, are at least 2 replicates really required? Still unresolved......
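
The condition hinted at by these messages can be checked on the sample sheet before dba.contrast is called. The sketch below is plain R and assumes only what the error text states: each group in a contrast needs at least minMembers samples, with a minimum of 2. `has_enough_replicates` is a hypothetical helper, not part of DiffBind.

```r
# Count samples per group and report whether a two-group contrast is possible.
# has_enough_replicates() is a hypothetical helper, not a DiffBind function.
has_enough_replicates <- function(groups, min_members = 2) {
  counts <- table(groups)
  # A contrast needs at least two groups that each reach min_members.
  sum(counts >= min_members) >= 2
}

# The two samples from this post: one sample per condition.
print(has_enough_replicates(c("trisomy_21", "euploid")))        # FALSE

# A hypothetical design with two replicates per condition.
print(has_enough_replicates(c("trisomy_21", "trisomy_21",
                              "euploid", "euploid")))           # TRUE
```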

=================================================================================
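I asked on the forum and the official site, and the author of DiffBind replied: DiffBind requires replicates among the input samples.
Original answer: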
Yes, replicates are required to do any kind of statistical analysis. Replicates are required to estimate the variance in the data and calculate confidence statistics such as p-values/FDRs.

Without replicates, you can do some exploratory analysis of overlapping peaks (occupancy analysis). For example using dba.plotVenn(). But not knowing if your data represents an outlier, combined with the inherent noisiness of peak calling, means you will have to have another way to validate any "differential" peaks you identify.
Link: https://support.bioconductor.org/p/125809/#125840
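
The occupancy analysis mentioned in the answer could look roughly like the sketch below. It assumes the DiffBind package is installed and that `dbObj` is the two-sample object loaded earlier. `occupancy_summary` is a hypothetical wrapper. dba.plotVenn and dba.overlap are DiffBind functions, but their behaviour may vary between versions, so treat this as a starting point rather than a tested recipe.

```r
# occupancy_summary() is a hypothetical wrapper; it assumes the DiffBind
# package is installed and that dbObj holds exactly two peaksets.
occupancy_summary <- function(dbObj) {
  if (!requireNamespace("DiffBind", quietly = TRUE)) {
    stop("the DiffBind package is required")
  }
  # Venn diagram of the overlap between peakset 1 and peakset 2.
  DiffBind::dba.plotVenn(dbObj, mask = 1:2)
  # Peaks split into those unique to each sample and those shared by both.
  olap <- DiffBind::dba.overlap(dbObj, mask = 1:2)
  # Number of peaks in each part of the overlap.
  vapply(olap, NROW, integer(1))
}

# Usage, once dbObj has been created with dba():
# occupancy_summary(dbObj)
```

As the answer notes, peaks that appear in only one sample are candidates at best and still need independent validation.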
