[Python script] Computing FASTA sequence lengths; genome contig/

Author: 山竹山竹px | Published 2020-09-05 17:39

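Purpose

As the title says: compute the length of each sequence in a FASTA file, for example the contigs of a genome assembly.

Script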

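import sys


def process_file(reader):
    '''Read FASTA lines and return a dict mapping each header line to its sequence.'''
    seqs = {}
    name = None
    for line in reader:
        # Strip only the line ending, so a final line without a newline keeps its last base
        line = line.rstrip('\r\n')
        if line.startswith('>'):
            # Header line: start a new record; the key keeps the leading ">"
            name = line
            seqs[name] = []
        elif name is not None and line:
            # Sequence line: collect the pieces and join them once at the end
            seqs[name].append(line)
    return {name: ''.join(parts) for name, parts in seqs.items()}


if __name__ == "__main__":
    # Usage: python stat_length.py input.fasta
    with open(sys.argv[1], "r") as input_file:
        items = process_file(input_file)
    for key, seq in items.items():
        print("%s\t%d" % (key, len(seq)))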

Source: https://blog.csdn.net/tangxc10/article/details/48833989?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-1.channel_param
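
When only the lengths are needed, the sequences themselves do not have to be kept in memory, which matters for large assemblies. The following is a minimal sketch of that idea, not part of the original script: `iter_fasta_lengths` is a hypothetical name chosen for illustration, and the in-memory sample stands in for a real FASTA file.

```python
import io


def iter_fasta_lengths(handle):
    """Yield (header, length) pairs from an open FASTA handle, one record at a time."""
    name = None
    length = 0
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if name is not None:
                yield name, length
            name, length = line, 0
        elif name is not None:
            # Only the running count is kept, not the sequence text
            length += len(line)
    # Emit the last record in the file
    if name is not None:
        yield name, length


# Hypothetical sample input standing in for a real FASTA file
demo = io.StringIO(">contig1\nACGT\nAC\n>contig2\nGGG\n")
for header, n in iter_fasta_lengths(demo):
    print("%s\t%d" % (header, n))
```

As in the script above, the header is kept with its leading ">" so the output format stays the same.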

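Running

Put the FASTA file in the same directory as the script.

Enter the following command in a terminal. The output is redirected to HHG.len, and the trailing & runs the job in the background.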

python stat_length.py HHGassembly.fasta > HHG.len &

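Result

The output is tab-separated.

The first column is the name of each contig/scaffold/chromosome in the sequence file, including the leading ">" symbol.

The second column is the sequence length in bases.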
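
The two-column output can be read back for downstream use. The sketch below assumes the format described above (header with a leading ">", a tab, then an integer length); `read_lengths` is a hypothetical helper, and the sample text stands in for a file such as HHG.len.

```python
import io


def read_lengths(handle):
    """Parse 'name<TAB>length' lines into a list of (name, length) tuples."""
    records = []
    for line in handle:
        line = line.rstrip("\r\n")
        if not line:
            continue
        # Split on the last tab, since a header line may itself contain tabs
        name, length = line.rsplit("\t", 1)
        # Drop the leading ">" that the script keeps in the first column
        records.append((name.lstrip(">"), int(length)))
    return records


# Hypothetical sample standing in for the contents of the output file
sample = io.StringIO(">contig1\t6\n>contig2 some description\t3\n")
records = read_lengths(sample)

# Sort longest first and report the total size
records.sort(key=lambda r: r[1], reverse=True)
total = sum(n for _, n in records)
for name, n in records:
    print("%s\t%d" % (name, n))
print("total\t%d" % total)
```

With a real file, `sample` would be replaced by `open("HHG.len")`.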
