Aligning RRBS data with Bismark (Part 2): analyzing the output files


Author: 蓬举举 | Published: 2020-11-14 15:17

After running bismark

1. test_dataset_bismark_bt2.bam (contains all alignments plus methylation call strings) 
2. test_dataset_bismark_SE_report.txt (contains alignment and methylation summary)

Use samtools view -h filename to inspect the BAM file in the directory.



Each column of the file carries the following information:

    QNAME (seq-ID)
    FLAG (this flag tries to take the strand a bisulfite read originated from into account (this is different from ordinary DNA alignment flags!))
    RNAME (chromosome)
    POS (start position)
    MAPQ (calculated for Bowtie 2 and HISAT2)
    CIGAR
    RNEXT
    PNEXT
    TLEN
    SEQ
    QUAL (Phred33 scale)
    NM-tag (edit distance to the reference)
    MD-tag (base-by-base mismatches to the reference) 
    XM-tag (methylation call string)
    XR-tag (read conversion state for the alignment) 
    XG-tag (genome conversion state for the alignment)
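
To make the column layout concrete, here is a minimal sketch that splits one alignment line (as printed by samtools view) into the 11 mandatory SAM fields plus the optional tags. The function name `parse_bismark_sam_line` and the example read are made up for illustration and do not come from the test dataset; header lines starting with `@` (shown when `-h` is used) would need to be skipped before calling it.

```python
def parse_bismark_sam_line(line):
    """Split one tab-separated alignment line into named fields and tags."""
    fields = line.rstrip("\n").split("\t")
    names = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
             "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]
    record = dict(zip(names, fields[:11]))
    # These mandatory fields are integers in the SAM format.
    for key in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        record[key] = int(record[key])
    # Optional tags have the form TAG:TYPE:VALUE, e.g. XM:Z:h..x...h..
    tags = {}
    for item in fields[11:]:
        tag, tag_type, value = item.split(":", 2)
        tags[tag] = int(value) if tag_type == "i" else value
    record["tags"] = tags
    return record


# Hypothetical alignment line, invented for this example.
example_line = "\t".join([
    "read_1", "0", "chr1", "100", "42", "10M", "*", "0", "0",
    "TTGATTGTAG", "IIIIIIIIII",
    "NM:i:2", "MD:Z:0C4C4", "XM:Z:h....x....", "XR:Z:CT", "XG:Z:CT",
])

record = parse_bismark_sam_line(example_line)
print(record["RNAME"], record["POS"], record["tags"]["XM"])
```

For real work a dedicated BAM library is probably a better choice than hand-splitting text; the sketch is only meant to show which column is which.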

Of particular interest is the methylation call stored in the XM-tag. It typically looks like XM:Z:h..x...h.., so what does each character mean?

    z - C in CpG context - unmethylated
    Z - C in CpG context - methylated
    x - C in CHG context - unmethylated
    X - C in CHG context - methylated
    h - C in CHH context - unmethylated
    H - C in CHH context - methylated
    u - C in Unknown context (CN or CHN) - unmethylated
    U - C in Unknown context (CN or CHN) - methylated
    . - not a C or irrelevant position
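
As a small illustration of the letter code above, the following sketch tallies the calls in one methylation call string by context. The helper name `summarize_xm` is an assumption made for this example; the input is the string quoted earlier in the text.

```python
from collections import Counter

# (methylated letter, unmethylated letter) for each context, per the list above.
CONTEXT_LETTERS = {
    "CpG": ("Z", "z"),
    "CHG": ("X", "x"),
    "CHH": ("H", "h"),
    "Unknown": ("U", "u"),
}


def summarize_xm(xm):
    """Count methylated and unmethylated calls per context in an XM string."""
    if xm.startswith("XM:Z:"):
        xm = xm[len("XM:Z:"):]  # accept the full tag as well as the bare string
    counts = Counter(ch for ch in xm if ch != ".")
    return {
        context: {"methylated": counts[meth], "unmethylated": counts[unmeth]}
        for context, (meth, unmeth) in CONTEXT_LETTERS.items()
    }


summary = summarize_xm("XM:Z:h..x...h..")
for context, tally in summary.items():
    print(context, tally)
```

For this string the tally contains two unmethylated CHH calls and one unmethylated CHG call, and nothing in the CpG context.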

For a more detailed walkthrough of the BAM file, see https://www.jianshu.com/p/ba89ec471dfe

After running bismark_methylation_extractor

    CpG_context_test_dataset_bismark_bt2.txt.gz
    CHG_context_test_dataset_bismark_bt2.txt.gz
    CHH_context_test_dataset_bismark_bt2.txt.gz

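as well as a bedGraph and a Bismark coverage file.
The output of the methylation extractor can be transformed into a bedGraph and coverage file using the option --bedGraph.
In other words, the bedGraph and the Bismark coverage file are another, consolidated form (a transform) of the methylation extractor output.
The methylation extractor output, i.e. the gzipped files listed above, looks like this when opened: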

HWUSI-EAS611_0006:3:1:1058:15806#0/1 - 6 91793279 z
HWUSI-EAS611_0006:3:1:1058:17564#0/1 + 8 122855484 Z

Each column carries the following information:

1. seq-ID
2. methylation state (+ for a methylated call, - for an unmethylated call)
3. chromosome
4. start position (= end position)
5. methylation call
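
A minimal sketch of reading these five columns, using the two example lines shown above. The names `ExtractorCall` and `parse_extractor_line` are assumptions made for illustration. The sketch splits on any whitespace and returns None for lines that do not match the five-column layout (for instance a header line, if one is present).

```python
from collections import namedtuple

ExtractorCall = namedtuple("ExtractorCall", "read_id state chrom pos call")


def parse_extractor_line(line):
    """Parse one methylation extractor line; return None if it is not a data line."""
    parts = line.split()
    # A data line has exactly five columns and '+' or '-' in the second one.
    if len(parts) != 5 or parts[1] not in ("+", "-"):
        return None
    read_id, state, chrom, pos, call = parts
    return ExtractorCall(read_id, state, chrom, int(pos), call)


extractor_lines = [
    "HWUSI-EAS611_0006:3:1:1058:15806#0/1 - 6 91793279 z",
    "HWUSI-EAS611_0006:3:1:1058:17564#0/1 + 8 122855484 Z",
]

extractor_calls = [parse_extractor_line(line) for line in extractor_lines]
for call in extractor_calls:
    print(call.chrom, call.pos, call.state, call.call)
```

To read the compressed files directly, the same function could be applied to each line of `gzip.open(path, "rt")` from the Python standard library.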

The bedGraph file:



The columns are:

<chromosome> <start position> <end position> <methylation percentage>

The Bismark coverage file carries a little more information: not only the percentage but also the actual counts.

<chromosome> <start position> <end position> <methylation percentage> <count methylated> <count unmethylated>
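
The relationship between the percentage and the two count columns can be sketched as follows. This is not Bismark's own code, only an illustration: per-read calls are pooled by position, and the percentage is 100 * methylated / (methylated + unmethylated). The function name `aggregate_calls` and the toy calls are made up; the sketch assumes "+" marks a methylated call and "-" an unmethylated one (as in the example extractor lines, where "-" goes with z and "+" with Z), and it simply writes the same position as start and end.

```python
from collections import defaultdict


def aggregate_calls(calls):
    """Pool (chrom, pos, state) calls into coverage-style rows per position."""
    counts = defaultdict(lambda: [0, 0])  # [methylated, unmethylated]
    for chrom, pos, state in calls:
        counts[(chrom, pos)][0 if state == "+" else 1] += 1
    rows = []
    for (chrom, pos), (meth, unmeth) in sorted(counts.items()):
        percentage = 100.0 * meth / (meth + unmeth)
        rows.append((chrom, pos, pos, percentage, meth, unmeth))
    return rows


# Toy calls invented for this example: three reads cover one position, one read another.
toy_calls = [
    ("6", 91793279, "-"),
    ("6", 91793279, "+"),
    ("6", 91793279, "+"),
    ("8", 122855484, "+"),
]

coverage_rows = aggregate_calls(toy_calls)
for row in coverage_rows:
    print("\t".join(str(value) for value in row))
```

The counts matter because a percentage alone hides depth: 100% from a single read is much weaker evidence than 100% from fifty reads.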

Running bismark2report

bismark2report looks for the Bismark alignment, deduplication and methylation extraction (splitting) reports as well as the M-bias files.


Visualizing the files in the directory

In other words, it produces a visual report of the alignment results.

The M-bias plot

The M-bias plot can, for example, show the methylation bias at the start of reads in PBAT-Seq experiments (it is generated during the methylation extractor step).
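
To give a feel for the quantity behind such a plot, here is a rough sketch that computes the CpG methylation percentage at each position of the read from a set of methylation call strings. It is a simplification under stated assumptions and not how Bismark itself produces the M-bias data: it ignores read orientation and paired-end reads, and the function name `cpg_mbias` and the toy strings are invented for this example.

```python
def cpg_mbias(xm_strings):
    """Return {read position: % methylated CpG calls}, positions are 1-based."""
    meth = {}
    unmeth = {}
    for xm in xm_strings:
        for position, ch in enumerate(xm, start=1):
            if ch == "Z":
                meth[position] = meth.get(position, 0) + 1
            elif ch == "z":
                unmeth[position] = unmeth.get(position, 0) + 1
    bias = {}
    for position in sorted(set(meth) | set(unmeth)):
        m = meth.get(position, 0)
        u = unmeth.get(position, 0)
        bias[position] = 100.0 * m / (m + u)
    return bias


# Toy methylation call strings, invented for this example.
toy_xm_strings = ["z...Z...Z.", "z...Z...z.", "Z...Z....."]

mbias = cpg_mbias(toy_xm_strings)
for position, percentage in mbias.items():
    print(position, round(percentage, 1))
```

In this toy data, position 1 is methylated far less often than position 5. A systematic pattern of that kind near the read ends is what the plot is meant to reveal.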


M-bias plot

Running bismark2summary

It first identifies the BAM files, then uses them to scan the current directory for the different Bismark alignment, deduplication and methylation extraction (splitting) reports.
Both tools produce their output as HTML files.
