Batch extraction of sequences at specified genomic positions with bedtools

Author: 陈光辉_山东花生 | Published: 2020-02-18 11:06

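Earlier posts covered several ways to extract sequences, some using scripts and some using dedicated software. This post introduces one more.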

The tool used here is bedtools, specifically its getfasta subcommand. Its usage is as follows:

Usage:   bedtools getfasta [OPTIONS] -fi <fasta> -bed <bed/gff/vcf>

Options:
    -fi          Input FASTA file
    -bed         BED/GFF/VCF file of ranges to extract from -fi
    -name        Use the name field for the FASTA header
    -split       Given BED12 format, extract and concatenate the sequences
                 from the BED "blocks" (e.g., exons)
    -tab         Write output in TAB delimited format.
                 - Default is FASTA format.
    -s           Force strandedness. If the feature occupies the antisense
                 strand, the sequence will be reverse complemented.
                 - By default, strand information is ignored.
    -fullHeader  Use full fasta header.
                 - By default, only the word before the first space or tab is used.
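Here, -fi specifies the genome FASTA file and -bed specifies the file listing the positions to extract, which can be a BED, GFF, or VCF file. Note that BED coordinates are 0-based and half-open: the start is counted from 0 and the end position itself is not included, so the interval 0-10 covers the first 10 bases of a chromosome. GFF and VCF coordinates, by contrast, are 1-based.
-tab switches the output from FASTA to a tab-delimited format, with the region name and its sequence on one line.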
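
The post does not show the contents of id.bed. The following is a minimal sketch that assumes the three intervals implied by the headers in the output (CM004359.1:0-10, CM004359.1:100-200, CM004359.1:1000-1050); the author's actual file may have contained extra columns. The awk line is only a sanity check that each interval length equals end minus start.

```shell
# Build a three-interval BED file: chromosome, 0-based start, exclusive end,
# separated by tabs. These intervals are inferred from the output headers,
# not copied from the author's file.
printf 'CM004359.1\t0\t10\n'      >  id.bed
printf 'CM004359.1\t100\t200\n'   >> id.bed
printf 'CM004359.1\t1000\t1050\n' >> id.bed

# Print each region as chrom:start-end followed by its length (end - start).
awk -F'\t' '{ print $1 ":" $2 "-" $3, $3 - $2 }' id.bed
```

The reported lengths (10, 100, and 50) match the lengths of the sequences that getfasta returns for these regions.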

$ bedtools getfasta -fi GCA_001651475.1_Ler_Assembly_genomic.fna -bed id.bed

>CM004359.1:0-10
gtttagggtt
>CM004359.1:100-200
ttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggttt
>CM004359.1:1000-1050
TTGTGGgaaaattatttagttgtaGGGATGAAGTCTTTCTTCGTTGTTGT
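
Each header above has the form chrom:start-end, using BED-style coordinates. To make that convention concrete, here is a small sketch that emulates the slicing with awk on a toy sequence. The sequence and the `slice` helper are hypothetical illustrations, not part of bedtools.

```shell
# Toy sequence (hypothetical, not taken from the genome above).
seq="ACGTTGCAAT"

# Emulate BED-style slicing. awk's substr is 1-based, so the 0-based start
# is shifted by one, and (end - start) characters are taken.
slice() {
    awk -v s="$1" -v b="$2" -v e="$3" 'BEGIN { print substr(s, b + 1, e - b) }'
}

slice "$seq" 0 4    # first four bases: ACGT
slice "$seq" 4 10   # remaining six bases: TGCAAT
```

The two intervals 0-4 and 4-10 tile the sequence with no overlap and no gap, which is the main practical benefit of half-open coordinates.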

$ bedtools getfasta -fi GCA_001651475.1_Ler_Assembly_genomic.fna -bed id.bed -tab

CM004359.1:0-10	gtttagggtt
CM004359.1:100-200	ttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggtttagggttt
CM004359.1:1000-1050	TTGTGGgaaaattatttagttgtaGGGATGAAGTCTTTCTTCGTTGTTGT
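
The tab-delimited layout is convenient for line-oriented tools. As a sketch, the snippet below writes two records based on the ones above into a file named regions.tab (a hypothetical name chosen for this example) and reports the length of each sequence. It assumes each line holds a region name, a tab, and the sequence.

```shell
# Two records in name<TAB>sequence layout, based on the output above.
printf 'CM004359.1:0-10\tgtttagggtt\n' > regions.tab
printf 'CM004359.1:1000-1050\tTTGTGGgaaaattatttagttgtaGGGATGAAGTCTTTCTTCGTTGTTGT\n' >> regions.tab

# Print each region name followed by the length of its sequence.
awk -F'\t' '{ print $1 "\t" length($2) }' regions.tab
```

In practice the same awk command could read the output of bedtools getfasta -tab directly through a pipe instead of going through an intermediate file.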

Article link: https://www.haomeiwen.com/subject/xotgfhtx.html
