Manually converting Genrich's bedgraph-like file to big


Author: BeeBee生信 | Published 2020-12-06 16:26

    This post supplements "ATAC-Seq Analysis Pipeline" (《ATAC-Seq 分析流程》) and explains why you should not manually convert the bedgraph-ish file to bigwig yourself.

    Example of the file format written by the Genrich -k option.

    # experimental file: /Example/WT1_ATAC.bam; control file: NA
    chr     start   end     experimental    control -log(p)
    chr1    0       9944    0.000000        2.238139        0.000000
    chr1    9944    9946    1.000000        2.238139        0.188209
    chr1    9946    9947    2.200000        2.238139        0.488245
    

    The background signal in the "control" column is calculated as follows:

    The background pileup value is calculated by dividing the total sequence information (sum of read/fragment/interval lengths) in the experimental sample by the calculated genome length. The net control pileup value at a particular genomic position is the maximum of the background pileup value and the pileup of the control sample at that position (if a control sample is specified). Note that control pileups are scaled to match the experimental, based on the total sequence information in each.

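    In short, the background is derived from the sequencing depth (library size): the background level is proportional to the overall sequencing depth.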
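The calculation quoted above can be sketched with awk. The two input numbers are made up for illustration (they are not taken from the samples in this post), and `net_control` is a hypothetical helper name; only the formula itself comes from the quoted description.

```shell
#!/usr/bin/env bash
# Hypothetical inputs, NOT taken from the article's samples.
total_bases=6000000000     # sum of read/fragment/interval lengths in the experimental sample
genome_length=2700000000   # calculated genome length

# background pileup = total sequence information / genome length
background=$(awk -v t="$total_bases" -v g="$genome_length" 'BEGIN { printf "%.6f", t / g }')
echo "background pileup: $background"

# net control pileup at one position = max(background, scaled control pileup)
# (net_control is a hypothetical helper for this sketch)
net_control() {
    awk -v b="$1" -v c="$2" 'BEGIN { printf "%.6f\n", (c > b ? c : b) }'
}

net_control "$background" 0     # control pileup below background: background wins
net_control "$background" 3.5   # control pileup above background: control wins
```

Doubling `total_bases` doubles the background, which is why a more deeply sequenced sample gets a larger "control" value.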

    Summary of the "control" column in the Genrich -k output files.

    $ awk '{print$5}' KO_pileup_p.bed | sort | uniq
    0.000000
    0.866656
    0.900041
    control
    
    $ awk '{print$5}' WT_pileup_p.bed | sort | uniq
    0.000000
    2.114438
    2.238139
    control
    

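    Here the WT sample was sequenced more deeply, so its background value is larger and the KO value is smaller. The problem lies in using the cut command to take the first 4 columns and then converting the result to a bigwig file, because that step discards the background. Some regions are called as peaks in KO but not in WT, yet in the generated bigwig files the WT "experimental" signal is stronger than the KO signal. When such a region is visualized, the peak track disagrees with the software's conclusion.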
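As a sketch of the manual extraction discussed above, the following reproduces it on the three rows from the format example. The exact commands are not shown in this post, so the `tail`/`cut` form, the tab-separated layout, and the temporary file are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Header lines and three data rows copied from the format example,
# written tab-separated (an assumption about the real file layout).
pileup=$(mktemp)
{
  printf '# experimental file: /Example/WT1_ATAC.bam; control file: NA\n'
  printf 'chr\tstart\tend\texperimental\tcontrol\t-log(p)\n'
  printf 'chr1\t0\t9944\t0.000000\t2.238139\t0.000000\n'
  printf 'chr1\t9944\t9946\t1.000000\t2.238139\t0.188209\n'
  printf 'chr1\t9946\t9947\t2.200000\t2.238139\t0.488245\n'
} > "$pileup"

# Assumed form of the manual step: skip the two header lines, keep columns 1-4.
# A bedGraph-to-bigwig converter would then be run on this output (not done here).
bedgraph=$(tail -n +3 "$pileup" | cut -f1-4)
printf '%s\n' "$bedgraph"
rm -f "$pileup"
```

After this step the "control" and "-log(p)" columns are gone, so nothing in the resulting track records the background that each sample was compared against.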

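    Take the region chr3:170386225-170386419 as an example. Genrich calls a peak here in the KO group but not in the WT group. The pileup records written by the -k option for this region are shown below.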

    # KO sample
    chr     start   end     experimental    control -log(p)
    chr3    170386225       170386227       4.700000        0.866656        1.926705
    chr3    170386227       170386267       3.700000        0.866656        1.652342
    chr3    170386267       170386268       4.700000        0.866656        1.926705
    chr3    170386268       170386279       4.200000        0.866656        1.794608
    chr3    170386279       170386286       3.200000        0.866656        1.497904
    chr3    170386286       170386294       4.200000        0.866656        1.794608
    chr3    170386294       170386310       5.200000        0.866656        2.050165
    chr3    170386310       170386311       6.200000        0.866656        2.275715
    chr3    170386311       170386312       6.533333        0.866656        2.345520
    chr3    170386312       170386319       7.533333        0.866656        2.541555
    chr3    170386319       170386320       8.533333        0.866656        2.720584
    chr3    170386320       170386325       8.700000        0.866656        2.748994
    chr3    170386325       170386359       8.500000        0.866656        2.714856
    chr3    170386359       170386367       9.500000        0.866656        2.880330
    chr3    170386367       170386386       8.500000        0.866656        2.714856
    chr3    170386386       170386394       7.500000        0.866656        2.535314
    chr3    170386394       170386399       6.500000        0.866656        2.338648
    chr3    170386399       170386410       5.500000        0.866656        2.120604
    chr3    170386410       170386411       4.500000        0.866656        1.874980
    chr3    170386411       170386412       4.166667        0.866656        1.785459
    chr3    170386412       170386419       3.166667        0.866656        1.487114
    chr3    170386419       170386420       2.166667        0.866656        1.127307
    
    
    # WT sample
    chr     start   end     experimental    control -log(p)
    chr3    170386225       170386228       9.400000        2.114438        1.697015
    chr3    170386228       170386231       9.733333        2.114438        1.736043
    chr3    170386231       170386234       10.733333       2.114438        1.848394
    chr3    170386234       170386235       9.733333        2.114438        1.736043
    chr3    170386235       170386256       8.733333        2.114438        1.616348
    chr3    170386256       170386259       8.983333        2.114438        1.647022
    chr3    170386259       170386261       7.483334        2.114438        1.454529
    chr3    170386261       170386268       6.483334        2.114438        1.313350
    chr3    170386268       170386273       5.483334        2.114438        1.159408
    chr3    170386273       170386282       5.033333        2.114438        1.085210
    chr3    170386282       170386285       8.533333        2.114438        1.591427
    chr3    170386285       170386292       9.033333        2.114438        1.653094
    chr3    170386292       170386299       10.033333       2.114438        1.770472
    chr3    170386299       170386305       11.033333       2.114438        1.880816
    chr3    170386305       170386311       12.033333       2.114438        1.985027
    chr3    170386311       170386318       12.366667       2.114438        2.018532
    chr3    170386318       170386319       12.166667       2.114438        1.998500
    chr3    170386319       170386320       12.366667       2.114438        2.018532
    chr3    170386320       170386321       11.366667       2.114438        1.916196
    chr3    170386321       170386328       10.366667       2.114438        1.807986
    chr3    170386328       170386329       10.033333       2.114438        1.770472
    chr3    170386329       170386331       10.366667       2.114438        1.807986
    chr3    170386331       170386337       9.366667        2.114438        1.693066
    chr3    170386337       170386339       9.700000        2.114438        1.732178
    chr3    170386339       170386356       10.700000       2.114438        1.844757
    chr3    170386356       170386357       10.200000       2.114438        1.789324
    chr3    170386357       170386379       11.200000       2.114438        1.898589
    chr3    170386379       170386382       12.200000       2.114438        2.001853
    chr3    170386382       170386385       8.700000        2.114438        1.612219
    chr3    170386385       170386392       8.200000        2.114438        1.549103
    chr3    170386392       170386399       7.200000        2.114438        1.415684
    chr3    170386399       170386405       6.200000        2.114438        1.271142
    chr3    170386405       170386411       5.200000        2.114438        1.113082
    chr3    170386411       170386419       4.866666        2.114438        1.056857
    chr3    170386419       170386425       4.666667        2.114438        1.022170
    

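    Measured against its own "control", the KO group has the more significant peak, but the absolute "experimental" values are higher in the WT group. A bigwig file built from the "experimental" column extracted with cut therefore shows a stronger peak in WT when visualized, which disagrees with the software's conclusion.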

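    Figure: the WT signal is higher than the KO signal, but relative to each sample's own background, KO is the group with a significant peak in this region.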
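To make the comparison concrete, here is a minimal sketch using the summit rows from the two tables above (the KO row at 170386359-170386367 and the WT row at 170386311-170386318). The "fold" value, experimental divided by control, is only a rough illustration introduced here. It is not the statistic Genrich uses to call peaks.

```shell
#!/usr/bin/env bash
# Summit rows copied from the KO and WT tables above, prefixed with a sample label.
# Columns: label, chr, start, end, experimental, control, -log(p)
summits=$(printf 'KO\tchr3\t170386359\t170386367\t9.500000\t0.866656\t2.880330\nWT\tchr3\t170386311\t170386318\t12.366667\t2.114438\t2.018532\n')

# Report the raw signal, the signal relative to the sample's own control
# (illustrative "fold", not Genrich's test), and the reported -log(p).
report=$(printf '%s\n' "$summits" | awk -F'\t' '{
    printf "%s\texperimental=%.2f\tfold=%.2f\t-log(p)=%.2f\n", $1, $5, $5 / $6, $7
}')
printf '%s\n' "$report"
```

The raw "experimental" value ranks WT above KO, while both the ratio to each sample's own control and the reported -log(p) rank KO above WT. A track built from column 4 alone shows only the first ranking.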


          Article link: https://www.haomeiwen.com/subject/ylwowktx.html