美文网首页基因组组装
基因组圈图Circos

基因组圈图Circos

作者: 吴十三和小可爱的札记 | 来源:发表于2020-10-13 18:16 被阅读0次

    数据准备

    circos前期需要准备的文件很多,首先是KARYOTYPE — BIOLOGY APPLICATIONS

    染色体长度统计(KARYOTYPE — BIOLOGY APPLICATIONS )

    """
    @Description: Calculate the length of chromosomes and get it ready for the Circos.
    @useage: python  count_chr_length.py input_flie output.txt
    @File: count_chr_length.py
    @Time: 2020/10/13
    """
    
    import sys
    import pandas as pd
    
    
    input_file = sys.argv[1]
    output_file = sys.argv[2]
    
    dic = {}
    
    with open(input_file, "r") as read_fa:
        for line in read_fa:
            if line.startswith(">"):
                key = line.strip("[>\n]")
                dic[key] = 0
            else:
                value = line.strip()
                seq_len = len(value)
                dic[key] += seq_len
    
    
    # data frame - nice~~~
    df_raw = pd.DataFrame(dic, index = ["end"])
    
    # transformation
    df = df_raw.T
    df["chr"] = "chr"
    df["start"] = 0
    df["-"] = "-"
    df["label"] = df_raw.columns
    df["ID"] = df_raw.columns
    
    # reorder
    index = ["chr", "-", "ID", "label", "start", "end"]
    
    # if head != False, some strings will be find in "start and end"
    reslut = df[index]
    reslut.to_csv(path_or_buf = output_file, sep = "\t", index = False, header = False)
    

    相关文章

      网友评论

        本文标题:基因组圈图Circos

        本文链接:https://www.haomeiwen.com/subject/vctbpktx.html