美文网首页
【Python编程】---查找两DNA序列片段中相同位置的差异碱

【Python编程】---查找两DNA序列片段中相同位置的差异碱

作者: 卡布达b1 | 来源:发表于2020-04-15 15:20 被阅读0次

(接上篇)
前言:现有长度相等,但碱基存在一定差异的两条DNA序列,求它们之间的最长相同子序列,并输出差异碱基的位置。
代码如下:

import os,sys
file = sys.argv[1]
outdir = sys.argv[2]
os.system('mkdir %s'%(outdir))

def HitsParser(f):
    load = open(f,'r')
    Hits = [line for line in load.readlines()]
    All_align = []
    for i in range(1,len(Hits)):
        align = {}
        align['query.seq'] = Hits[i].strip().split('\t')[12]
        align['subject.seq'] = Hits[i].strip().split('\t')[13]
        All_align.append(align)
    return All_align

def DiffPos(f,n):
    str1 = HitsParser(f)[n]['query.seq']
    str2 = HitsParser(f)[n]['subject.seq']
    length = len(str1)
    All_pos = []
    for i in range(length):
        if str1[i] != str2[i]:
            pos = [str(i),str1[i],str2[i]]
            All_pos.append(pos)
    return All_pos

Header = 'Position\tQuery.Base\tSubject.base\n'
All_hits = HitsParser(file)
Hits_num = len(All_hits)
for j in range(Hits_num):
    L = DiffPos(file,j)
    S = Header+'\n'.join(['\t'.join(i) for i in L])+'\n'
    with open('%s/DiffPos_%s.txt'%(outdir,str(j)),'w') as g:
        g.write(S)
    g.close()

相关文章

网友评论

      本文标题:【Python编程】---查找两DNA序列片段中相同位置的差异碱

      本文链接:https://www.haomeiwen.com/subject/vjmcvhtx.html