[3D Genomics] Visualizing ChIP Peak Tracks with pyGenomeTracks

Author: XuningFan | Published 2020-07-12 00:49

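    We often need to plot ChIP peak tracks, and pyGenomeTracks (https://pygenometracks.readthedocs.io/en/latest/index.html) makes this easy.

    Before plotting peaks with pyGenomeTracks, a few preparation steps are needed:

    Gene annotation (UCSC)

    1. Convert the GTF file to a GenePred file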

    gtfToGenePred -genePredExt -geneNameAsName2 genes.gtf genePredName.txt
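
    For orientation, the genePredExt output of this command has 15 whitespace-separated columns, and with -geneNameAsName2 the name2 column holds the gene name. The sketch below names those columns in Python; the helper `parse_genepred_ext` and the sample record (including the placeholder transcript ID "tx1") are made up for illustration and are not part of the original workflow.

```python
from collections import namedtuple

# Column order of the genePredExt format (15 columns).
GenePredExt = namedtuple("GenePredExt", [
    "name", "chrom", "strand", "txStart", "txEnd", "cdsStart", "cdsEnd",
    "exonCount", "exonStarts", "exonEnds", "score", "name2",
    "cdsStartStat", "cdsEndStat", "exonFrames",
])


def parse_genepred_ext(line):
    """Split one genePredExt line into named fields (hypothetical helper)."""
    rec = GenePredExt(*line.split())
    return rec._replace(
        txStart=int(rec.txStart),
        txEnd=int(rec.txEnd),
        exonCount=int(rec.exonCount),
        # exonStarts/exonEnds are comma-terminated lists such as "100,200,"
        exonStarts=[int(x) for x in rec.exonStarts.rstrip(",").split(",")],
        exonEnds=[int(x) for x in rec.exonEnds.rstrip(",").split(",")],
    )


# Illustrative record only: "tx1" is a placeholder transcript ID.
example = ("tx1 chr1 - 3205900 3216344 3216344 3216344 2 "
           "3205900,3213608, 3207317,3216344, 0 Xkr4 none none -1,-1,")
rec = parse_genepred_ext(example)
print(rec.name2, rec.exonStarts)
```

    Counting from the end, name2 is the fourth-to-last column, which is why a converter can read the gene name as items[-4].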

    2. Convert the GenePred file to UCSC BED12
    BED12 format:

    chr1    3073252 3074322 4933401J01Rik   0   +   3074322 3074322 0   1   1070,   0,
    chr1    3102015 3102125 Gm26206 0   +   3102125 3102125 0   1   110,    0,
    chr1    3205900 3216344 Xkr4    0   -   3216344 3216344 0   2   1417,2736,  0,7708,
    chr1    3206522 3215632 Xkr4    0   -   3215632 3215632 0   2   795,2194,   0,6916,
    chr1    3214481 3671498 Xkr4    0   -   3216021 3671348 0   3   2487,200,947,   0,207220,456070,
    chr1    3252756 3253236 Gm18956 0   +   3253236 3253236 0   1   480,    0,
    chr1    3365730 3368549 Gm37180 0   -   3368549 3368549 0   1   2819,   0,
    chr1    3375555 3377788 Gm37363 0   -   3377788 3377788 0   1   2233,   0,
    chr1    3464976 3467285 Gm37686 0   -   3467285 3467285 0   1   2309,   0,
    chr1    3466586 3513553 Gm1992  0   +   3513553 3513553 0   2   101,149,    0,46818,
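
    In BED12, the last two columns are blockSizes and blockStarts, where each blockStart is an offset from chromStart, so the last block must end exactly at chromEnd. As a sanity check on a converted file, something like the following could be used; `check_bed12` is a hypothetical helper, and the test record is built from the Xkr4 coordinates.

```python
def check_bed12(line):
    """Return a list of problems found in one BED12 line (empty if none)."""
    f = line.split()
    start, end = int(f[1]), int(f[2])
    count = int(f[9])
    sizes = [int(x) for x in f[10].rstrip(",").split(",")]
    starts = [int(x) for x in f[11].rstrip(",").split(",")]

    problems = []
    if end <= start:
        problems.append("chromEnd must be greater than chromStart")
    if len(sizes) != count or len(starts) != count:
        problems.append("blockCount does not match blockSizes/blockStarts")
        return problems
    if starts[0] != 0:
        problems.append("first blockStart must be 0")
    if starts[-1] + sizes[-1] != end - start:
        problems.append("last block must end at chromEnd")
    return problems


good = "chr1\t3205900\t3216344\tXkr4\t0\t-\t3216344\t3216344\t0\t2\t1417,2736,\t0,7708,"
# Same record, but with the intron length (6291) used instead of the offset.
bad = "chr1\t3205900\t3216344\tXkr4\t0\t-\t3216344\t3216344\t0\t2\t1417,2736,\t0,6291,"
print(check_bed12(good))
print(check_bed12(bad))
```

    A common mistake is to write the gap between consecutive exons into blockStarts; the last check catches that case.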
    

    The following code can be used as a reference for the conversion:

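    def genePredName2bed12(file, bedpos=None):
        """Convert a genePredExt file (from gtfToGenePred) to BED12.

        bedpos is optional: a dict mapping gene name -> "chrom:start-end,strand".
        Records matching it are also written to a second, filtered file.
        """
        out = file[:-4] + "_bed12.bed"
        # The filtered output was never opened in the original code;
        # this file name is an assumption.
        out_filter = file[:-4] + "_bed12_filter.bed"
        with open(file) as f, open(out, "w") as fo:
            fo1 = open(out_filter, "w") if bedpos else None
            for line in f:
                items = line.strip().split()
                # items[-4] is name2 (the gene name with -geneNameAsName2)
                name = "None" if items[-4] == "0" else items[-4]
                # chrom, start, end, name, score, strand,
                # thickStart, thickEnd, itemRgb, blockCount
                con = [items[1], items[3], items[4], name, "0", items[2],
                       items[5], items[6], "0", items[7]]
                start_list = [int(ite) for ite in items[8].split(",")[:-1]]
                end_list = [int(ite) for ite in items[9].split(",")[:-1]]
                n = int(items[7])
                tx_start = int(items[3])
                # blockSizes: exon lengths
                length = [str(end_list[i] - start_list[i]) for i in range(n)]
                # blockStarts: exon starts relative to chromStart, as BED12 requires
                distance = [str(start_list[i] - tx_start) for i in range(n)]
                con.extend([",".join(length) + ",", ",".join(distance) + ","])
                fo.write("\t".join(con) + "\n")
                pos = items[1] + ":" + items[3] + "-" + items[4] + "," + items[2]
                if fo1 is not None and bedpos.get(name) == pos:
                    fo1.write("\t".join(con) + "\n")
            if fo1 is not None:
                fo1.close()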
    
    
    

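    Note: the BED file must be sorted. BED coordinates are 0-based and half-open, [start, end), so end must be strictly greater than start; a record with start = end will raise an error.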
    sort -k 1,1 -k 2,2n file.bed > out.bed
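
    The same two requirements (sorted by chromosome and start, and end > start) can also be checked in Python. This is a minimal sketch, not part of the original workflow; `sort_bed_lines` is a hypothetical helper, and it compares chromosome names by code point, which corresponds to `sort` under the C locale and may differ from other locales.

```python
def sort_bed_lines(lines):
    """Sort BED lines by chromosome, then numerically by start.

    Raises ValueError for records where end <= start, since BED
    intervals are half-open and must not be empty.
    """
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        start, end = int(fields[1]), int(fields[2])
        if end <= start:
            raise ValueError("empty or inverted interval: " + line)
        records.append((fields[0], start, line))
    records.sort(key=lambda r: (r[0], r[1]))
    return [r[2] for r in records]


unsorted = [
    "chr2\t500\t900\tgeneC",
    "chr1\t3000\t4000\tgeneB",
    "chr1\t100\t200\tgeneA",
]
for row in sort_bed_lines(unsorted):
    print(row)
```
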
    Once the gene annotation file has been converted, we can write the configuration.
    Here is the config file:

    [x-axis]
    #optional
    fontsize=6
    # default is bottom meaning below the axis line
    where=top
    
    [spacer]
    # height of space in cm (optional)
    height = 0.5
    
    
    #c("#999999", "#E69F00", "#56B4E9", "#009E73", "#F0E442", "#0072B2", "#D55E00", "#CC79A7")
    ##999999, #E69F00, #56B4E9, #009E73
    #G0:   MEF
    #G1: PGFH3d, PGFH6d, PGFH9d, PGFHiHep
    #G2:   PGF3d,PGF6d,PGF9d
    #G3:   CRGF3d,CRGF6d,CRGF9d, CRGFiHep
    
    
    [bigwig]
    file=sam1.bw
    title = sam1
    height = 2
    color = #7427A5
    number of bins = 500
    summary method = mean
    show data range = yes
    file_type = bigwig
    
    
    [bigwig]
    file=sam2.bw
    title = sam2
    height = 2
    color = #955122
    number of bins = 500
    summary method = mean
    show data range = yes
    file_type = bigwig
    
    [bigwig]
    file=sam3.bw
    title = sam3
    height = 2
    color = #4ECEC5
    number of bins = 500
    summary method = mean
    show data range = yes
    file_type = bigwig
    
    [bigwig]
    file=sam4.bw
    title = sam4
    height = 2
    color = #178D7C
    number of bins = 500
    summary method = mean
    show data range = yes
    file_type = bigwig
    
    
    [bigwig]
    file=sam5.bw
    title = sam5
    height = 2
    color = #4ECEC5
    number of bins = 500
    summary method = mean
    show data range = yes
    file_type = bigwig
    
    [bigwig]
    file=sam6.bw
    title = sam6
    height = 2
    color = #178D7C
    number of bins = 500
    summary method = mean
    show data range = yes
    file_type = bigwig
    
        
    [spacer]
    [mm10_genePredName_bed12_filter]
    file=Mus_musculus.GRCm38.90_genePredName_bed12.bed
    height = 4
    title = refGenes
    fontsize = 6
    style = UCSC
    gene_rows = 2
    color=black
    border color = black
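
    The six [bigwig] sections differ only in file name, title, and color, so they can be generated rather than typed by hand. A minimal sketch follows, assuming each bigWig file is named <title>.bw as in the config above; `bigwig_sections` is a hypothetical helper, and the text is built with plain string formatting because the section name [bigwig] is repeated.

```python
BIGWIG_TEMPLATE = """[bigwig]
file={file}
title = {title}
height = 2
color = {color}
number of bins = 500
summary method = mean
show data range = yes
file_type = bigwig
"""


def bigwig_sections(samples):
    """Build one [bigwig] section per (title, color) pair.

    Assumes the bigWig file for each sample is named <title>.bw.
    """
    return "\n".join(
        BIGWIG_TEMPLATE.format(file=title + ".bw", title=title, color=color)
        for title, color in samples
    )


# Sample names and colors taken from the config above.
samples = [
    ("sam1", "#7427A5"), ("sam2", "#955122"), ("sam3", "#4ECEC5"),
    ("sam4", "#178D7C"), ("sam5", "#4ECEC5"), ("sam6", "#178D7C"),
]
text = bigwig_sections(samples)
print(text)
```

    The printed text can be pasted between the [x-axis]/[spacer] header and the gene annotation track.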
    

    With the configuration in place, the following command produces the plot:

    pyGenomeTracks  --tracks  config.ini  --region chr11:4181821-4220502 --outFileName test.pdf --width 20 --height 20 
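
    To plot several regions with the same config, the command can be assembled in Python. This sketch only builds and prints the command, using the same options as above; `build_command` is a hypothetical helper, and running it (for example with subprocess.run(cmd, check=True)) assumes pyGenomeTracks is installed and on the PATH.

```python
import shlex


def build_command(region, out_file, tracks="config.ini", width=20, height=20):
    """Return the pyGenomeTracks command as an argument list."""
    return [
        "pyGenomeTracks",
        "--tracks", tracks,
        "--region", region,
        "--outFileName", out_file,
        "--width", str(width),
        "--height", str(height),
    ]


cmd = build_command("chr11:4181821-4220502", "test.pdf")
# Print a shell-ready version of the command for inspection.
print(shlex.join(cmd))
```
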
    