ID conversion: replacing ensembl_id with gene_name


Author: 幸福疼痛 | Published: 2020-04-12 22:00

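    import re

    # Map Ensembl gene IDs (version suffix removed) to gene names from the GTF.
    gene_dict = {}
    with open('human.gtf') as gtf:
        for raw in gtf:
            if raw.startswith('#'):  # skip GTF header comments
                continue
            fields = raw.strip().split('\t')
            # Only 'gene' records are used; the attributes are in column 9.
            if len(fields) < 9 or fields[2] != 'gene':
                continue
            id_match = re.search(r'gene_id "([^"]+)"', fields[8])
            name_match = re.search(r'gene_name "([^"]+)"', fields[8])
            # Some gene records carry no gene_name attribute; skip those.
            if id_match and name_match:
                gene_dict[id_match.group(1).split('.')[0]] = name_match.group(1)

    # Rewrite the expression matrix with gene names in the first column.
    with open('mRNAmatrix.txt') as matrix, open('sym.txt', 'w') as out:
        for raw in matrix:
            row = raw.strip()
            if row.startswith('id'):  # the header row is copied unchanged
                print(row, file=out)
                continue
            fields = row.split('\t')
            # Human Ensembl gene IDs are 15 characters; this drops a suffix like ".5".
            gene_inf = fields[0][:15]
            # Rows whose ID is not found in the GTF are dropped.
            if gene_inf in gene_dict:
                print(gene_dict[gene_inf] + '\t' + '\t'.join(fields[1:]), file=out)
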
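To check the conversion logic without the real files, the same steps can be run on a few in-memory lines. The following is only a minimal sketch: the helper names (`parse_attributes`, `build_gene_map`, `convert_rows`) are hypothetical, the sample GTF lines and counts are made up for illustration, and it assumes the matrix header starts with `id` as in the script above.

```python
import re

# Hypothetical helpers; the sample lines and counts below are illustrative only.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')

def parse_attributes(attr_field):
    """Return the key/value pairs of a GTF attribute column as a dict."""
    return dict(ATTR_RE.findall(attr_field))

def build_gene_map(gtf_lines):
    """Map unversioned gene IDs to gene names, using 'gene' records only."""
    gene_map = {}
    for raw in gtf_lines:
        if raw.startswith('#'):
            continue
        fields = raw.rstrip('\n').split('\t')
        if len(fields) < 9 or fields[2] != 'gene':
            continue
        attrs = parse_attributes(fields[8])
        if 'gene_id' in attrs and 'gene_name' in attrs:
            gene_map[attrs['gene_id'].split('.')[0]] = attrs['gene_name']
    return gene_map

def convert_rows(matrix_lines, gene_map):
    """Yield rows with the ID column replaced; unmapped rows are dropped."""
    for raw in matrix_lines:
        row = raw.rstrip('\n')
        if row.startswith('id'):  # assumed header marker, as in the script
            yield row
            continue
        fields = row.split('\t')
        name = gene_map.get(fields[0].split('.')[0])
        if name is not None:
            yield '\t'.join([name] + fields[1:])

sample_gtf = [
    '#!genome-build GRCh38\n',
    '1\thavana\tgene\t11869\t14409\t.\t+\t.\t'
    'gene_id "ENSG00000223972"; gene_version "5"; gene_name "DDX11L1";\n',
    '1\thavana\ttranscript\t11869\t14409\t.\t+\t.\t'
    'gene_id "ENSG00000223972"; transcript_id "ENST00000456328";\n',
]
sample_matrix = [
    'id\tS1\tS2\n',
    'ENSG00000223972.5\t3\t7\n',
    'ENSG00000000000.1\t0\t1\n',  # not in the GTF, so it is dropped
]

gene_map = build_gene_map(sample_gtf)
converted = list(convert_rows(sample_matrix, gene_map))
print(converted)
```

Splitting the ID on `.` instead of slicing the first 15 characters is a design choice of this sketch; it gives the same result for human IDs and does not depend on the ID length.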
