美文网首页
使用circos绘制变异分布图

使用circos绘制变异分布图

作者: 花生学生信 | 来源:发表于2022-10-22 17:55 被阅读0次

    一、数据准备:

    使用SURVIVOR将得到的vcf文件合并,

    SURVIVOR merge filter_vcf_files.txt 1000 0 1 1 0 0  sample.vcf
    
    ####最后一个参数是保留的最小长度,这里用到SNP数据,所以用0
    

    将变异个数按照1MB窗口进行滑窗统计变异个数

    
    # variation_number_in_bin.py
    
    import sys
    
    input_file = sys.argv[1]
    step_len = 1000000
    result = {}
    
    chrom_stats = {
        "Chr01":44717177,
        "Chr02":37244950,
        "Chr03":39727707,
        "Chr04":37171045,
        "Chr05":31106453,
        "Chr06":32139443,
        "Chr07":31006940,
        "Chr08":30241818,
        "Chr09":24981782,
        "Chr10":25400072,
        "Chr11":33669327,
        "Chr12":26652165
        }
    
    for chrom in chrom_stats.keys():
        step= 0
        step_start = 0
        result[chrom]={}
        result[chrom][step]=0
        for line in open(input_file):
            if line.startswith(chrom):
                pos = line.split()[1]
                if step_start < int(pos) < (step_start + step_len):
                    result[chrom][step] += 1
                if int(pos) >= (step_start + step_len):
                    step += 1
                    step_start += step_len
                    result[chrom][step] = 0
                    result[chrom][step] += 1
    
        for steps,nums in result[chrom].items():
            print(chrom, steps, nums, sep="\t")
    
    

    circos需要准备的数据:

    >**基因组信息数据**  rice1.txt
    #第一列:染色体,表明这是一条染色体
    #第二列:占位符,定义所属关系
    #第三列:ID,跟身份证号码一样,唯一不能重复的标识(这一列很重要,后面其他作图数据都是要和ID对应)
    #第四列:  图中显示的文本  
    #第五列:起始。以0开始
    #第六列:终止。第五列和第六列定义了染色体的实际大小
    #第七列:颜色     (这一列我写的是Chr1,Chr2,Chr3... 其实在这里他们代表的是颜色,因为在配置文件colors.ucsc.conf
     #               中将不同的色号赋值给了Chr1,Chr2,Chr3... 所以就可用Chr1,Chr2,Chr3... 来代替颜色,当然你可以直接
       #             写色号,或是颜色,都可以)
    ####
    chr - hs01 Chr1 0 43270923 Chr1
    chr - hs02 Chr2 0 35937250 Chr2
    chr - hs03 Chr3 0 36413819 Chr3
    chr - hs04 Chr4 0 35502694 Chr4
    chr - hs05 Chr5 0 29958434 Chr5
    chr - hs06 Chr6 0 31248787 Chr6
    chr - hs07 Chr7 0 29697621 Chr7
    chr - hs08 Chr8 0 28443022 Chr8
    chr - hs09 Chr9 0 23012720 Chr9
    chr - hs10 Chr10 0 23207287 Chr10
    chr - hs11 Chr11 0 29021106 Chr11
    chr - hs12 Chr12 0 27531856 Chr12
    
    ###配置文件  circos.conf
    
    
    karyotype = rice1.txt
    chromosomes_units =100000
    <plots>
    
    <plot>
    type    = line
    file    = num/SNP
    color   = blue
    #color=div-5-spectral
    r1      = 0.90r
    r0      = 0.81r
    #type      = line
    thickness = 2
    max_gap = 1u
    
    
    </plot>
    
    
    <plot>
    type    = heatmap
    #type      = line
    thickness = 2
    max_gap = 1u
    #color   = redv
    file    = num/DEL
    color   = reds-9-seq
    #color=div-5-spectral
    r1      = 0.80r
    r0      = 0.71r
    </plot>
    
    <plot>
    type    = heatmap
    
    #type      = line
    thickness = 2
    max_gap = 1u
    #color   = redv
    file    = num/INS
    color   = reds-9-seq
    #color=div-5-spectral
    r1      = 0.70r
    r0      = 0.61r
    </plot>
    
    #<plot>
    #type    = heatmap
    #file    = num/INS
    #color   = reds-9-seq
    #color=div-5-spectral
    #type      = line
    #thickness = 2
    #max_gap = 1u
    #color   = redv
    #r1      = 0.60r
    #r0      = 0.51r
    #</plot>
    
    #<plot>
    #type    = heatmap
    #file    = num/DEL
    #color   = reds-9-seq
    #color=div-5-spectral
    #type      = line
    #thickness = 2
    #max_gap = 1u
    #color   = redv
    #r1      = 0.50r
    #r0      = 0.41r
    #</plot>
    
    
    #<plot>
    #type = scatter
    #fill_color       = black # 填充色
    #stroke_color     = black
    #glyph            = circle
    #glyph_size       = 5 # 元素大小
    #file = INV.txt
    #r1   = 0.80r
    #r0   = 0.71r
    #</plot>
    
    #<plot>
    #type = histogram
    #file = INV.txt
    #fill_color = blue # 填充色
    #r1   = 0.89r
    #r0   = 0.81r
    #</plot>
    
    
    </plots>
    
    
    #<links>
    
    #<link>
    #file          = links.txt
    #radius        = 0.61r
    #color         = blue_a4
    #ribbon = yes
    #</link>
    
    #</links>
    
    
    # Includes content from ideogram.conf (included file path is relative
    # to the file that included it). Conventionally, I separate the 
    # contents of the <ideogram> block from circos.conf and include
    # it via ideogram.conf.
    <<include ideogram.conf>>
     
    # Similarly, I put the <ticks> block in ticks.conf
    #<<include ticks.conf>>
    
    <image>
    <<include etc/image.conf>>                
    </image>
    
    <<include etc/colors_fonts_patterns.conf>> 
    <<include etc/housekeeping.conf>> 
    
    
    circos需要的文件格式
    ####ideogram.conf
    <ideogram>
    
    <spacing>
    default = 0.001r
    <pairwise hs01;hs12>
    # 以下设置两条染色体之间的间隙为圆的20度角
    spacing = 40r
    #radius=0.1r
    </pairwise>
    
    
    </spacing>
    
    #<spacing>
    # 也可设置指定的两条染色体之间的间隙
    
    #</spacing>
    
    # Ideogram position, fill and outline
    radius           = 0.90r
    thickness        = 20p
    fill             = yes
    stroke_color     = dgrey
    stroke_thickness = 2p
    
    # Minimum definition for ideogram labels.
    
    show_label       = yes
    # see etc/fonts.conf for list of font names
    label_font       = default 
    label_radius     = dims(image,radius)-60p
    label_size       = 30
    label_parallel   = yes
    
    </ideogram>
    
    

    最后在终端直接运行circos,

    可以得到一张png和一张svg格式的图片,最后还可以用PS或者AI进行适当修改,

    不说了,开组会了

    相关文章

      网友评论

          本文标题:使用circos绘制变异分布图

          本文链接:https://www.haomeiwen.com/subject/gazvzrtx.html