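I never paid much attention to the details of how fasterq-dump decompresses files; I would just look up a tutorial online and follow it.
That usually means a command like this: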
fasterq-dump --split-3 -e 20 *.sra
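But I noticed that this does not seem to actually make use of multiple threads. When a job was not urgent I let it go, but now I have close to a hundred files to process, each of them very large once decompressed, so I finally had to read the help text: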
Usage: fasterq-dump [ options ] [ accessions(s)... ]
Parameters:
accessions(s) list of accessions to process
Options:
-o|--outfile <path> full path of outputfile (overrides usage
of current directory and given accession)
-O|--outdir <path> path for outputfile (overrides usage of
current directory, but uses given
accession)
-b|--bufsize <size> size of file-buffer (dflt=1MB, takes
number or number and unit)
-c|--curcache <size> size of cursor-cache (dflt=10MB, takes
number or number and unit)
-m|--mem <size> memory limit for sorting (dflt=100MB,
takes number or number and unit)
-t|--temp <path> path to directory for temp. files
(dflt=current dir.)
-e|--threads <count> how many threads to use (dflt=6)
-p|--progress show progress (not possible if stdout used)
-x|--details print details of all options selected
-s|--split-spot split spots into reads
-S|--split-files write reads into different files
-3|--split-3 writes single reads into special file
--concatenate-reads writes whole spots into one file
-Z|--stdout print output to stdout
-f|--force force overwrite of existing file(s)
-N|--rowid-as-name use rowid as name (avoids using the name
column)
--skip-technical skip technical reads
--include-technical explicitly include technical reads
-P|--print-read-nr include read-number in defline
-M|--min-read-len <count> filter by sequence-lenght
--table <name> which seq-table to use in case of pacbio
--strict terminate on invalid read
-B|--bases <bases> filter output by matching against given
bases
-A|--append append to output-file, instead of
overwriting it
--ngc <path> <path> to ngc file
--perm <path> <path> to permission file
--location <location> location in cloud
--cart <path> <path> to cart file
--disable-multithreading disable multithreading
-V|--version Display the version of the program
-L|--log-level <level> Logging level as number or enum string.
One of
(fatal|sys|int|err|warn|info|debug) or
(0-6) Current/default is warn
--option-file file Read more options and parameters from the
file.
-h|--help print this message
version 2.10.7
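So it is easy to see why the speed never picks up: the memory and cache limits default to small values (1MB file buffer, 10MB cursor cache, 100MB for sorting), and these need to be raised instead of left at their defaults. (Only do this if you have a server; on a local machine, don't bother.)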
fasterq-dump -e 30 -3 -b 100MB -c 200MB -m 2000MB *.sra
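With close to a hundred files, it can be easier to process them one at a time in a loop, so that a single failure does not stop the whole batch. Below is a minimal sketch of that idea. It uses the same buffer settings as the command above plus the -t and -O options from the help text; the names THREADS, OUTDIR, TMPDIR_SRA, DRY_RUN and dump_one are made up for this example, and whether these values actually help will depend on your disks and memory. By default it only prints the commands; set DRY_RUN=0 to really run them.

```shell
#!/bin/sh
# Sketch: decompress every .sra file in the current directory, one at a time.
# THREADS, OUTDIR, TMPDIR_SRA and DRY_RUN are hypothetical names for this example.
THREADS="${THREADS:-30}"                 # value passed to -e
OUTDIR="${OUTDIR:-fastq}"                # value passed to -O (output directory)
TMPDIR_SRA="${TMPDIR_SRA:-tmp_fasterq}"  # value passed to -t (temp files)
DRY_RUN="${DRY_RUN:-1}"                  # 1 = only print commands, 0 = run them

dump_one() {
    # $1 is the path to a single .sra file
    run=""
    if [ "$DRY_RUN" = "1" ]; then
        run="echo"                       # print the command instead of running it
    fi
    $run fasterq-dump -e "$THREADS" -3 -b 100MB -c 200MB -m 2000MB \
        -t "$TMPDIR_SRA" -O "$OUTDIR" "$1"
}

if [ "$DRY_RUN" != "1" ]; then
    mkdir -p "$OUTDIR" "$TMPDIR_SRA"     # create directories only for a real run
fi

for f in *.sra; do
    [ -e "$f" ] || continue              # skip if the glob matched nothing
    dump_one "$f" || echo "failed: $f" >&2
done
```

Pointing the temp directory at a fast disk with plenty of free space is probably worth trying, since the temp files default to the current directory.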