Building a BLAST Database and Extracting Matching Sequences

Author: 惊鸿影 | Published 2023-07-14 16:25

1. Install BLAST

conda install -c bioconda blast

2. Build the BLAST database

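# Protein database:
makeblastdb -in name.pep.fasta -parse_seqids -hash_index -out name.db -dbtype prot  # -out sets the database location and name
# Nucleotide database:
makeblastdb -in name.cds.fasta -parse_seqids -hash_index -out name.db -dbtype nucl
# -parse_seqids and -hash_index are usually included; -parse_seqids indexes the FASTA IDs so sequences can later be looked up by ID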

3. Run BLAST

blastp -query yourfasta.fa -db name.db -out blast_out
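# -out names the output file.
# An output format can be set by appending, for example:
#   -outfmt '6 qseqid sseqid pident length mismatch gapopen qstart qend sstart send qseq sseq evalue bitscore'
# This writes tab-separated (format 6) output containing the fields listed inside the quotes.
# Every field is optional: to get only the IDs of the matched sequences, use -outfmt '6 sseqid'.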
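
With the 14-field layout quoted above, pident is column 3 and evalue is column 13, so weak hits can be dropped before going further. The following is only a sketch: the file name example_blast_out.tsv, its four rows, and the cutoffs (e-value at most 1e-5, identity at least 70%) are made up for illustration and are not from the original post.

```shell
# Hypothetical sample in the 14-column layout shown above (tab-separated).
{
  printf 'q1\tgeneA\t98.5\t200\t3\t0\t1\t200\t1\t200\tMKT\tMKT\t1e-100\t400\n'
  printf 'q1\tgeneB\t35.0\t80\t52\t2\t10\t89\t5\t84\tMKA\tMRA\t0.5\t30\n'
  printf 'q2\tgeneA\t90.0\t150\t15\t0\t1\t150\t1\t150\tMKV\tMKV\t2e-60\t250\n'
  printf 'q2\tgeneC\t75.2\t120\t30\t1\t1\t120\t3\t122\tMLS\tMLT\t3e-20\t110\n'
} > example_blast_out.tsv

# Keep rows whose e-value (column 13) and identity (column 3) pass the
# illustrative cutoffs; "+0" forces a numeric comparison.
awk -F'\t' '$13+0 <= 1e-5 && $3+0 >= 70' example_blast_out.tsv > example_filtered.tsv

cat example_filtered.tsv
```

If a different field list is passed to -outfmt, the column numbers in the awk condition have to be adjusted to match.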

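The previous step produces a list of BLAST hits. The result file contains only the aligned segments of each hit, not the full-length protein sequences stored in the database. To get the full sequences, collect the hit IDs into an ID list and use it to extract the corresponding records, for example with seqkit.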
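
The post does not show how the ID list is made. One possible way, sketched here, assumes the search was run with -outfmt '6 qseqid sseqid' so that the subject ID is the second tab-separated column; the file example_hits.tsv and its contents are made up for illustration.

```shell
# Hypothetical two-column stand-in for output written with -outfmt '6 qseqid sseqid'.
printf 'q1\tgeneA\nq1\tgeneB\nq2\tgeneA\nq2\tgeneC\n' > example_hits.tsv

# Column 2 holds sseqid; sort -u removes IDs that were hit by more than one query.
cut -f2 example_hits.tsv | sort -u > ID.list

cat ID.list
```

Deduplicating matters because the same database sequence is often hit by several queries, and each ID only needs to appear once in the list.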

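seqkit grep -f ID.list name.pep.fasta > blast.pep
# -f takes a file of IDs, one per line. The input must be the FASTA file the database was built from (name.pep.fasta here), not the database name name.db: seqkit reads sequence files and cannot read BLAST database files.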
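
As an alternative not covered in the original post, BLAST+ includes blastdbcmd, which can pull full-length sequences directly from the database by ID. This sketch assumes the database was built with -parse_seqids as in step 2 and that ID.list holds one ID per line; the guard and the status variable are additions for illustration only.

```shell
# Only attempt the extraction if blastdbcmd is installed and name.db is readable.
if blastdbcmd -db name.db -info >/dev/null 2>&1; then
    # -entry_batch reads sequence IDs from a file; -out writes the records as FASTA.
    blastdbcmd -db name.db -entry_batch ID.list -out blast.pep
    status="extracted"
else
    status="skipped"
    echo "blastdbcmd is not installed or name.db was not found; skipping" >&2
fi
echo "$status"
```

This route works even when the original FASTA file is no longer at hand, since the sequences are read from the database files themselves.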


Article link: https://www.haomeiwen.com/subject/meovudtx.html
