1. Introduction
ABySS is a de novo assembler for paired-end short reads and large genomes. Test data is available on the official website.
2. Download and installation
Install on Ubuntu:
sudo apt-get install abyss
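After installing, it can help to confirm that the main entry points are actually on the PATH. The following is a minimal sketch; the helper name check_abyss is made up for this example, and the three program names are taken from the program list later in this post.

```shell
# Report which ABySS entry points can be found on PATH.
# check_abyss is a hypothetical helper name, not part of ABySS.
check_abyss() {
    missing=0
    for prog in ABYSS abyss-pe abyss-fac; do
        if command -v "$prog" >/dev/null 2>&1; then
            echo "found: $prog"
        else
            echo "missing: $prog"
            missing=1
        fi
    done
    return "$missing"
}

check_abyss || echo "ABySS does not appear to be fully installed"
```

The function returns a non-zero status when any program is missing, so it can also be used as a guard at the top of a pipeline script.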
3. Parameters
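- ABYSS: assembles short reads into contigs
- abyss-pe: assembles short reads into contigs and scaffolds
- --chastity: discard unchaste reads, i.e. reads that failed the Illumina chastity filter (default)
- --no-chastity: do not discard unchaste reads
- --trim-masked: trim masked bases from the ends of reads (default)
- --no-trim-masked: do not trim masked bases from the ends of reads
- -q | --trim-quality=N: trim bases from the ends of reads whose quality is below N
- --standard-quality: base qualities are encoded as Phred+33 (default)
- --illumina-quality: base qualities are encoded as Phred+64
- -o | --out=FILE: name of the output contigs file
- -k | --kmer=N: k-mer length (required)
- -t | --trim-length=N: maximum length of dangling edges to trim
- -c | --coverage=FLOAT: remove contigs whose mean k-mer coverage is below this value
- -b | --bubbles=N: pop bubbles shorter than N bp (default 3*k)
- -b0 | --no-bubbles: do not pop bubbles
- name (abyss-pe): prefix for the output files. A run produces many files; with name=test, the main ones are test-contigs.fa and test-scaffolds.fa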
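As a rough illustration of how these options combine, the sketch below builds a direct ABYSS call for contig assembly. The input path, the output name, and the threshold values 3 and 2 are placeholders chosen for this example, not recommendations. The command is printed first and only executed when ABYSS is installed and the reads file exists.

```shell
# Placeholder values (assumptions for illustration only).
kmer=25
out=test-contigs.fa
reads=test-data/reads1.fastq

# Long options from the list above: k-mer size, output file,
# quality-trimming threshold, and minimum mean k-mer coverage.
cmd="ABYSS --kmer=$kmer --out=$out --trim-quality=3 --coverage=2 $reads"
echo "$cmd"

# Run only when ABYSS and the input file are both available.
if command -v ABYSS >/dev/null 2>&1 && [ -f "$reads" ]; then
    $cmd
fi
```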
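4. Examples
Use abyss-pe to assemble the short reads of one paired-end library into scaffolds. abyss-pe takes its parameters as name=value pairs, and the output prefix is set with name, so no --out option is needed:
wget http://www.bcgsc.ca/platform/bioinfo/software/abyss/releases/1.3.4/test-data.tar.gz
tar xzvf test-data.tar.gz
abyss-pe \
k=25 name=test \
in='test-data/reads1.fastq test-data/reads2.fastq'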
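Once the run finishes, a quick sanity check is to count the sequences in the two main output files. This is only a sketch: count_seqs is a hypothetical helper that counts FASTA header lines, and the file names assume name=test as in the command above.

```shell
# Count FASTA records by counting header lines that start with '>'.
# count_seqs is a hypothetical helper name, not an ABySS tool.
count_seqs() {
    grep -c '^>' "$1"
}

for f in test-contigs.fa test-scaffolds.fa; do
    if [ -f "$f" ]; then
        echo "$f: $(count_seqs "$f") sequences"
    else
        echo "$f: not found (run abyss-pe first)"
    fi
done
```

For contiguity statistics such as N50, abyss-fac is the better tool; the count above only confirms that the files exist and are non-empty.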
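Assembling multiple libraries. lib names the paired-end libraries, each library is then given its own list of files, and se lists the single-end reads:
abyss-pe \
k=25 name=ecoli \
lib='pea peb' \
pea='pea_1.fa pea_2.fa' \
peb='peb_1.fa peb_2.fa' \
se='se1.fa se2.fa'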
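Assemble the genome with several values of k, then look for the best one. Exporting k passes the current value to abyss-pe through the environment, and -C k$k runs each assembly inside its own directory, which is why the reads are referenced as ../reads.fa:
export k
for k in {20..40}; do
mkdir k$k
abyss-pe -C k$k name=ecoli in=../reads.fa
done
abyss-fac k*/ecoli-contigs.fa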
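To pick a candidate k from the abyss-fac table automatically, one option is to select the row with the largest N50. The sketch below assumes that abyss-fac prints a tab-separated table whose header row contains columns named N50 and name; the columns are located by header text rather than by position. best_n50 is a hypothetical helper name, and N50 is only one of several criteria worth looking at.

```shell
# Print the assembly with the largest N50 from abyss-fac output on stdin.
# Assumes a tab-separated table with "N50" and "name" header columns.
best_n50() {
    awk -F'\t' '
        NR == 1 {
            # Locate the columns by header text.
            for (i = 1; i <= NF; i++) {
                if ($i == "N50")  n = i
                if ($i == "name") m = i
            }
            next
        }
        # Repeated header rows evaluate to 0 here and are skipped.
        n && $n + 0 > best { best = $n + 0; name = $m }
        END { if (name != "") print name "\t" best }
    '
}

# Run only when abyss-fac and the assemblies are available.
if command -v abyss-fac >/dev/null 2>&1 && ls k*/ecoli-contigs.fa >/dev/null 2>&1; then
    abyss-fac k*/ecoli-contigs.fa | best_n50
fi
```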
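Use a for loop to assemble over a range of k-mer sizes. Because -C k$k makes abyss-pe run inside the k$k directory, the reads file in the parent directory is referenced as ../reads.fa:
for k in `seq 50 8 90`; do
mkdir k$k
abyss-pe -C k$k name=test k=$k in=../reads.fa
done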
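Before committing compute time, it can be useful to preview which k values such a sweep visits. seq 50 8 90 counts from 50 to 90 in steps of 8; the sketch below only prints the values and the directory each assembly would use, without running anything.

```shell
# Preview the k values and working directories of the sweep.
k_values=$(seq 50 8 90)
for k in $k_values; do
    echo "k=$k -> directory k$k"
done
```

This visits k = 50, 58, 66, 74, 82 and 90, so six assemblies would be run.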
Other topics
Scaffolding
Scaffolding with linked reads
Rescaffolding with long sequences
Assembling using a Bloom filter de Bruijn graph
Assembling using a paired de Bruijn graph
Assembling a strand-specific RNA-Seq library
Parallel processing
Running ABySS on a cluster
Using the DIDA alignment framework
Tips
abyss-pe is a driver script implemented as a Makefile, so any make option can be used together with abyss-pe.
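One make option that is handy here is -n (dry run), which prints the commands of the pipeline without executing them. The sketch below wraps it in a small function. The function name dry_run and the ABYSS_PE override variable are inventions of this example, and the parameters reuse the test data from the Examples section.

```shell
# Print the commands abyss-pe would run, without running them.
# -n is make's dry-run flag. ABYSS_PE is a hypothetical override
# for the abyss-pe executable, introduced only for this sketch.
dry_run() {
    "${ABYSS_PE:-abyss-pe}" -n "$@"
}

if command -v abyss-pe >/dev/null 2>&1; then
    dry_run k=25 name=test \
        in='test-data/reads1.fastq test-data/reads2.fastq' \
        || echo "dry run failed (are the input files present?)"
else
    echo "abyss-pe not found; skipping dry run"
fi
```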
ABySS programs:
- ABYSS: de Bruijn graph assembler
- ABYSS-P: parallel (MPI) de Bruijn graph assembler
- AdjList: find overlapping sequences
- DistanceEst: estimate the distance between sequences
- MergeContigs: merge sequences
- MergePaths: merge overlapping paths
- Overlap: find overlapping sequences using paired-end reads
- PathConsensus: find a consensus sequence of ambiguous paths
- PathOverlap: find overlapping paths
- PopBubbles: remove bubbles from the sequence overlap graph
- SimpleGraph: find paths through the overlap graph
- abyss-fac: calculate assembly contiguity statistics
- abyss-filtergraph: remove shim contigs from the overlap graph
- abyss-fixmate: fill the paired-end fields of SAM alignments
- abyss-map: map reads to a reference sequence
- abyss-scaffold: scaffold contigs using distance estimates
- abyss-todot: convert graph formats and merge graphs
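The programs need to be on the PATH environment variable. For example, to configure ABYSS, add the export lines to ~/.bashrc and reload it: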
export abyss_home=~/soft/abyss-...
export PATH=$abyss_home/ABYSS:$PATH
source ~/.bashrc
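Because ~/.bashrc may be sourced more than once, a plain export PATH=...:$PATH keeps adding the same directory. A minimal sketch of an idempotent alternative follows; path_prepend is a hypothetical helper, and the directory passed to it is a placeholder that should be replaced with the real ABySS install location.

```shell
# Prepend a directory to PATH only if it is not already listed.
# path_prepend is a hypothetical helper name.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already present, leave PATH unchanged
        *) PATH="$1:$PATH" ;;
    esac
}

# Placeholder directory (assumption); substitute the actual install path.
path_prepend "$HOME/soft/abyss/ABYSS"
export PATH
```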