Some details of unpacking with fasterq-dump

Author: 可能性之兽 | Published 2022-03-09 12:48

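    I never paid much attention to how fasterq-dump actually unpacks .sra files; I would just look up a tutorial online and copy its command.
    That usually means something like this: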

    fasterq-dump --split-3  -e 20 *.sra
    

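    But I noticed that run this way it does not really seem to use multiple threads? When a job was not urgent I let it go, but now I have close to a hundred files to process, each of them very large once unpacked, so I finally had to read the help text.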

    
    Usage: fasterq-dump [ options ] [ accessions(s)... ]
    
    Parameters:
    
      accessions(s)                    list of accessions to process
    
    
    Options:
    
      -o|--outfile <path>              full path of outputfile (overrides usage
                                         of current directory and given accession)
      -O|--outdir <path>               path for outputfile (overrides usage of
                                         current directory, but uses given
                                         accession)
      -b|--bufsize <size>              size of file-buffer (dflt=1MB, takes
                                         number or number and unit)
      -c|--curcache <size>             size of cursor-cache (dflt=10MB, takes
                                         number or number and unit)
      -m|--mem <size>                  memory limit for sorting (dflt=100MB,
                                         takes number or number and unit)
      -t|--temp <path>                 path to directory for temp. files
                                         (dflt=current dir.)
      -e|--threads <count>             how many threads to use (dflt=6)
      -p|--progress                    show progress (not possible if stdout used)
      -x|--details                     print details of all options selected
      -s|--split-spot                  split spots into reads
      -S|--split-files                 write reads into different files
      -3|--split-3                     writes single reads into special file
         --concatenate-reads           writes whole spots into one file
      -Z|--stdout                      print output to stdout
      -f|--force                       force overwrite of existing file(s)
      -N|--rowid-as-name               use rowid as name (avoids using the name
                                         column)
         --skip-technical              skip technical reads
         --include-technical           explicitly include technical reads
      -P|--print-read-nr               include read-number in defline
      -M|--min-read-len <count>        filter by sequence-lenght
         --table <name>                which seq-table to use in case of pacbio
         --strict                      terminate on invalid read
      -B|--bases <bases>               filter output by matching against given
                                         bases
      -A|--append                      append to output-file, instead of
                                         overwriting it
         --ngc <path>                  <path> to ngc file
         --perm <path>                 <path> to permission file
         --location <location>         location in cloud
         --cart <path>                 <path> to cart file
         --disable-multithreading      disable multithreading
      -V|--version                     Display the version of the program
      -L|--log-level <level>           Logging level as number or enum string.
                                         One of
                                         (fatal|sys|int|err|warn|info|debug) or
                                         (0-6) Current/default is warn
         --option-file file            Read more options and parameters from the
                                         file.
      -h|--help                        print this message
    
    version 2.10.7
    
    
    

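    From this it is easy to see why the speed would not go up: the memory and cache limits. The defaults are a 1MB file buffer (-b), a 10MB cursor cache (-c) and a 100MB memory limit for sorting (-m), so raising the thread count alone is not enough and these parameters have to be raised as well. (Only do this on a server; on a local machine it is not worth it.)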

    fasterq-dump -e 30 -3 -b 100MB -c 200MB -m 2000MB *sra
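For a batch of around a hundred files, the tuned command above can be wrapped in a small loop so that one failed file does not stop the rest. This is only a minimal sketch: the variable names, the `dump_all` function, the `DRY_RUN` switch and the directory names are assumptions made for illustration, while every fasterq-dump flag it uses is taken from the help text above. The sizes mirror the command above and should be adjusted to the memory actually available.

```shell
#!/bin/sh
# Sketch only: variable names and default values are illustrative assumptions.
# Every fasterq-dump flag used below appears in the help text above.

THREADS=${THREADS:-20}        # -e: number of threads
BUFSIZE=${BUFSIZE:-100MB}     # -b: file buffer (tool default is 1MB)
CURCACHE=${CURCACHE:-200MB}   # -c: cursor cache (tool default is 10MB)
MEM=${MEM:-2000MB}            # -m: memory limit for sorting (tool default is 100MB)
OUTDIR=${OUTDIR:-fastq}       # -O: directory for the output files
SCRATCH=${SCRATCH:-fq_tmp}    # -t: directory for temp files

# Unpack each file given as an argument, one fasterq-dump run per file.
# If DRY_RUN is set to a non-empty value, the commands are printed, not run.
dump_all() {
    mkdir -p "$OUTDIR" "$SCRATCH" || return 1
    for sra in "$@"; do
        [ -e "$sra" ] || continue   # skip names that do not exist (e.g. an unmatched *.sra)
        ${DRY_RUN:+echo} fasterq-dump --split-3 -e "$THREADS" \
            -b "$BUFSIZE" -c "$CURCACHE" -m "$MEM" \
            -t "$SCRATCH" -O "$OUTDIR" "$sra" \
            || echo "failed: $sra" >&2   # report the failure and continue
    done
}
```

After sourcing the script, `DRY_RUN=1 dump_all *.sra` prints the commands for inspection, and `dump_all *.sra` runs them. Adding `-x` to the command prints the options fasterq-dump actually selected, which is a quick way to confirm that the thread count and buffer sizes were picked up.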
    

Article link: https://www.haomeiwen.com/subject/kbstdrtx.html