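Ungapped global alignment of two equal-length sequences is arguably the simplest alignment model, so here is a short Python implementation of it. The scoring matrix used is BLOSUM62, a very widely used amino acid substitution matrix that BLAST also uses for protein alignments.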
The BLOSUM62 substitution matrix can be found at: ftp://ftp.ncbi.nih.gov/blast/matrices/
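The matrix files in that directory are plain text, so one option is to read them directly. The sketch below is a minimal parser that assumes the usual layout: comment lines starting with `#`, one header line of column letters, then one row per residue beginning with its letter. The function name `parse_ncbi_matrix` and the three-residue excerpt are made up for illustration; they are not part of the code in this post.

```python
import io

def parse_ncbi_matrix(handle):
    """Parse a whitespace-separated substitution matrix into a nested dict.

    Assumed layout: '#' comment lines, a header line of column letters,
    then rows that start with a residue letter followed by integer scores.
    The result is indexed as scores[row_letter][column_letter].
    """
    columns = None
    scores = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if columns is None:
            columns = fields  # first non-comment line is the header
            continue
        row_label, values = fields[0], fields[1:]
        scores[row_label] = {c: int(v) for c, v in zip(columns, values)}
    return scores

# A three-residue excerpt in the assumed layout, for illustration only
sample = """\
# Excerpt in the assumed matrix layout
   A  R  N
A  4 -1 -2
R -1  5  0
N -2  0  6
"""
toy = parse_ncbi_matrix(io.StringIO(sample))
print(toy["A"]["R"])  # -1
```

For a real file, an open file handle can be passed in place of the `io.StringIO` object.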
import pandas as pd
from random import choice
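# BLOSUM62 saved locally as an Excel sheet; the first row and the first column hold the residue letters
matrix_path = "./BLOSUM62.xlsx"
matrix = pd.read_excel(matrix_path, sheet_name=0, header=0, index_col=0)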
aa_list = ['A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V', 'B', 'Z', 'X']
# Randomly generate two sequences of 100 amino acids each, then align and score them
seq1 = ''
seq2 = ''
for i in range(100):
    seq1 += choice(aa_list)
    seq2 += choice(aa_list)
print(seq1 + "\n" + seq2)
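# Add up the substitution score of each aligned residue pair
# (named score rather than sum to avoid shadowing the built-in sum())
score = 0
for i in range(len(seq1)):
    col_name = seq1[i]
    index_name = seq2[i]
    score += matrix[col_name][index_name]
print("the score of alignment is: " + str(score) + ".")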
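With no gaps, the alignment score is simply the sum of the substitution scores s(a_i, b_i) over all positions i. The same calculation can be wrapped in a small reusable function. This is only a sketch: the name `ungapped_score` and the three-residue `toy_scores` table are hypothetical, chosen so the example runs without the Excel file.

```python
def ungapped_score(seq_a, seq_b, scores):
    """Sum the substitution scores of two equal-length sequences.

    `scores` is anything indexable as scores[a][b], such as a nested dict.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("ungapped alignment needs sequences of equal length")
    # Pair up residues position by position and add their scores
    return sum(scores[a][b] for a, b in zip(seq_a, seq_b))

# Tiny illustrative score table covering only A, R and N
toy_scores = {
    "A": {"A": 4, "R": -1, "N": -2},
    "R": {"A": -1, "R": 5, "N": 0},
    "N": {"A": -2, "R": 0, "N": 6},
}
result = ungapped_score("ARN", "ANN", toy_scores)
print(result)  # 4 + 0 + 6 = 10
```

Because a pandas DataFrame is also indexed as `matrix[column][row]`, the `matrix` loaded above should work as the `scores` argument too.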