ID conversion: replacing Ensembl IDs with gene names


Author: 幸福疼痛 | Published: 2020-04-12 22:00
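import re

# Build a lookup table from Ensembl gene ID to gene name using the GTF annotation.
gene_dict = {}
with open('human.gtf') as gtf:
    for raw in gtf:
        if raw.startswith('#'):  # skip header comment lines
            continue
        fields = raw.rstrip('\r\n').split('\t')
        # Only "gene" records are needed; also skip blank or malformed lines.
        if len(fields) < 9 or fields[2] != 'gene':
            continue
        id_match = re.search(r'gene_id "([^"]+)"', fields[8])
        name_match = re.search(r'gene_name "([^"]+)"', fields[8])
        # Some gene records have no gene_name attribute; skip them instead of crashing.
        if id_match and name_match:
            # Human Ensembl gene IDs are 15 characters (ENSG + 11 digits);
            # slicing drops any ".version" suffix so keys match the lookup below.
            gene_dict[id_match.group(1)[:15]] = name_match.group(1)

# Rewrite the expression matrix with gene names in place of Ensembl IDs.
with open('mRNAmatrix.txt') as matrix, open('sym.txt', 'w') as out:
    for raw in matrix:
        row = raw.rstrip('\r\n')
        if row.startswith('id'):  # the header row is copied unchanged
            print(row, file=out)
            continue
        fields = row.split('\t')
        gene_id = fields[0][:15]  # strip the version suffix, e.g. ".5"
        # Rows whose ID is not found in the annotation are dropped.
        if gene_id in gene_dict:
            print('\t'.join([gene_dict[gene_id]] + fields[1:]), file=out)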
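
To check what the regular expressions extract from column 9 of a GTF "gene" record, the following is a minimal sketch that can be run without any input files. The helper name `parse_gene_attributes` and the sample attribute string are illustrative assumptions, not part of the original script; the sample follows the Ensembl GTF attribute style.

```python
import re


def parse_gene_attributes(attributes):
    """Return (gene_id, gene_name) from a GTF attribute column.

    Hypothetical helper for illustration. Returns None when either
    attribute is missing, so the caller can skip the record.
    """
    id_match = re.search(r'gene_id "([^"]+)"', attributes)
    name_match = re.search(r'gene_name "([^"]+)"', attributes)
    if not (id_match and name_match):
        return None
    # Keep the first 15 characters so a versioned ID such as
    # "ENSG00000223972.5" matches the [:15] lookup used for the matrix rows.
    return id_match.group(1)[:15], name_match.group(1)


# Illustrative attribute column (column 9) of a "gene" record.
example = ('gene_id "ENSG00000223972"; gene_version "5"; '
           'gene_name "DDX11L1"; gene_biotype "transcribed_unprocessed_pseudogene";')

result = parse_gene_attributes(example)
print(result)
```

Trying this on a few lines of the actual `human.gtf` before running the full conversion is a quick way to confirm that the annotation's ID format matches the IDs in `mRNAmatrix.txt`.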


Article link: https://www.haomeiwen.com/subject/ledimhtx.html