美文网首页重测序比较基因组比较与进化基因组
转载|VCF变异统计及绘制染色体分布图

转载|VCF变异统计及绘制染色体分布图

作者: Zukunft_Lab | 来源:发表于2021-05-13 22:43 被阅读0次

    <meta charset="utf-8">

    VCF文件的变异分布图,除了使用Circos图之外,统计特定材料间的变异分布情况时可能会用到染色体分布图,为了节省时间,实现统计及绘图的自动化,现提供统计数量用的Python脚本和绘图使用的R脚本,仅供参考。如有其它更快捷的方法,或觉得脚本太拙劣,欢迎留言批评交流。

    image

    PYTHON脚本,使用脚本整理100Kb滑窗中变异位点数量:

    variation_number_in_bin.py

    import sys
    
    input_file = sys.argv[1]
    step_len = 100000
    result = {}
    
    chrom_stats = {
    "Chr1":43270923,
    "Chr2":35937250,
    "Chr3":36413819,
    "Chr4":35502694,
    "Chr5":29958434,
    "Chr6":31248787,
    "Chr7":29697621,
    "Chr8":28443022,
    "Chr9":23012720,
    "Chr10":23207287,
    "Chr11":29021106,
    "Chr12":27531856
    }
    
    for chrom in chrom_stats.keys():
        step= 0
        step_start = 0
        result[chrom]={}
        result[chrom][step]=0
        for line in open(input_file):
            if line.startswith(chrom):
                pos = line.split()[1]
                if step_start < int(pos) < (step_start + step_len):
                    result[chrom][step] += 1
                if int(pos) >= (step_start + step_len):
                    step += 1
                    step_start += step_len
                    result[chrom][step] = 0
                    result[chrom][step] += 1
    
        for steps,nums in result[chrom].items():
            print(chrom, steps, nums, sep="\t")
    
    image

    下面画图,使用R脚本如下:

     library(ggplot2)
     library(Cairo)
    
     var_density <- read.table("snp.distribution.stats", sep = "\t", header = T)
    
     names(var_density) <- c("chrom", "region", "variation_count")
     var_density$chrom <- factor(var_density$chrom, levels=c(
       "Chr1", "Chr2", "Chr3", "Chr4", "Chr5", "Chr6",
       "Chr7", "Chr8", "Chr9", "Chr10", "Chr11", "Chr12"))
    
     CairoPNG(filename = "variation_distribution.png", width = 12000, height = 7500, res = 600)
    
     distribution <- ggplot(var_density, aes(x=region, y=variation_count))
    
     distribution + geom_line(colour = "darkorange4", size = 0.35) +
       guides(fill=FALSE) +
       theme(axis.line = element_line(size=0.5),
             panel.background = element_rect(fill="white",size=rel(20)),
             panel.grid.minor = element_line(colour = "gray"),
             axis.text = element_text(family="Times New Roman",size=16,face="bold"),
             axis.title = element_text(family="Times New Roman",size=18,face="bold"),
             strip.text = element_text(family="Times New Roman",size=12,face="bold")) +
       facet_wrap(~chrom, ncol=2)
    
     dev.off()
    

    作者:生物数据分析笔记
    链接:https://www.jianshu.com/p/9b5541a3f67c
    来源:简书
    著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。

    相关文章

      网友评论

        本文标题:转载|VCF变异统计及绘制染色体分布图

        本文链接:https://www.haomeiwen.com/subject/ajlsdltx.html