bmtagger -- Aligning Reads Against a Host Database


Author: QXPLUS | Published: 2022-05-13 12:53
    • View the help text: bmtagger.sh -h
    usage: bmtagger [-hV] [-q 0|1] [-C config] -1 input.fa [-2 matepairs.fa] -b genome.wbm -d genome-seqdb -x srindex [-o blacklist] [-T tmpdir] [-X]
    usage: bmtagger [-hV] [-q 0|1] [-C config] -1 input.fa [-2 matepairs.fa] --ref=reference [-o blacklist] [-T tmpdir] [-X]
    usage: bmtagger [-hV] [-q 0|1] [-C config] -A accession [--ref=reference] [-b genome.wbm] [-d genome-seqdb] [-x srindex] [-T tmpdir]
    use --ref=name to point to .wbm, seqdb and srprism index if they have the same path and basename
    use --extract or -X to generate fasta or fastq files which will NOT contain tagged sequences (-o required)
    use --debug to leave temporary data on exit
    use --old-srprism to use options for older version of srprism (interferes with config file)
    Using following programs:
    /software/miniconda2/envs/bin/bmfilter
    /software/miniconda2/envs/bin/srprism
    /software/miniconda2/envs/bin/blastn
    /software/miniconda2/envs/bin/extract_fullseq
    

    Removing host reads from the raw data

    • Remove human host sequences
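    # Shared path prefix of the host bitmask, BLAST database and srprism index
    ref='/database/ref/human/human'
    
    # -q 1: input is FASTQ; -X: write the reads that were NOT tagged as host
    bmtagger.sh -q 1 -1 2_cleandata/${sample}_clean_R1.fq \
                     -2 2_cleandata/${sample}_clean_R2.fq \
                     -b "$ref".bitmask -d "$ref" -x "$ref" \
                     -o 2_cleandata/${sample} \
                     -X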
    

    This produces the result files {sample}_1.fastq and {sample}_2.fastq.
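To see how many reads the host filter removed, the read counts before and after filtering can be compared. The following is a minimal sketch, assuming standard four-line FASTQ records; `count_reads` is a hypothetical helper written for this post and is not part of bmtagger.

```shell
# count_reads FILE
# Print the number of reads in a FASTQ file (plain or .gz),
# assuming exactly 4 lines per read.
count_reads() {
    local fq="$1"
    local lines
    case "$fq" in
        *.gz) lines=$(gzip -dc "$fq" | wc -l) ;;
        *)    lines=$(wc -l < "$fq") ;;
    esac
    echo $(( lines / 4 ))
}
```

For example, `count_reads 2_cleandata/${sample}_clean_R1.fq` and `count_reads 2_cleandata/${sample}_1.fastq` give the counts before and after host removal; the difference is the number of reads that were dropped.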

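    • This step easily runs out of memory. To avoid that, split the reads into N chunks (for example N=10), run the host alignment on each chunk separately, and then merge the N results.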
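The post does not show how the chunks were made. One possible way is sketched below, assuming four-line FASTQ records and output names in the `0001.{sample}_clean_R1.fq` style used in this post; `split_fastq` is a hypothetical helper, not a bmtagger tool. Chunks are cut on read boundaries, so running it with the same N on R1 and R2 keeps the mates in matching chunks as long as both files hold the same number of reads in the same order.

```shell
# split_fastq FILE N OUTDIR
# Split FILE into at most N chunks written to OUTDIR as
# 0001.<basename>, 0002.<basename>, ... (4 lines per read assumed).
split_fastq() {
    local fq="$1"
    local n="$2"
    local outdir="$3"
    local base total_lines reads per_chunk
    base=$(basename "$fq")
    total_lines=$(wc -l < "$fq")
    reads=$(( total_lines / 4 ))
    # Reads per chunk, rounded up so that no more than N chunks are written.
    per_chunk=$(( (reads + n - 1) / n ))
    [ "$per_chunk" -gt 0 ] || return 1   # nothing to split
    mkdir -p "$outdir"
    awk -v per="$per_chunk" -v dir="$outdir" -v base="$base" '
        # Start a new chunk file at the first line of every per-th read.
        (NR - 1) % (4 * per) == 0 {
            if (out != "") close(out)
            out = sprintf("%s/%04d.%s", dir, ++k, base)
        }
        { print > out }
    ' "$fq"
}
```

A call such as `split_fastq ${sample}_clean_R1.fq 10 2_cleandata`, repeated for the R2 file, would give the ten chunk pairs. The input location here is an assumption; adjust it to wherever the clean data actually lives.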

    Continuing host removal on the quality-controlled clean data

    • The data has been split into 10 chunks (0001-0010.{sample}_clean_R[12].fq)
    ref='/database/ref/human/human'
    
    # Extract the chunk IDs (0001 ... 0010) and run up to 5 bmtagger jobs in parallel
    ls 2_cleandata/*.${sample}_clean_R1.fq | cut -d"/" -f2 | cut -d"." -f1 | \
     xargs -P5 -I{} sh -c "~/R3.6/bin/bmtagger.sh -q 1 \
     -1 2_cleandata/{}.${sample}_clean_R1.fq \
     -2 2_cleandata/{}.${sample}_clean_R2.fq \
     -b $ref.bitmask -d $ref -x $ref \
     -o 2_cleandata/{}.${sample} \
     -X"
    

    This produces 0001-0010.{sample}_[12].fastq.
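Because a chunk that runs out of memory can fail without being noticed, it may be worth confirming that every chunk produced both mate files before going further. The sketch below assumes the `NNNN.{sample}_[12].fastq` naming shown above; `list_incomplete_chunks` is a hypothetical helper, and treating an empty file as a failure is an assumption that would not hold if a chunk legitimately contained only host reads.

```shell
# list_incomplete_chunks DIR SAMPLE N
# Print the expected output files (chunks 0001..N, mates 1 and 2)
# that are missing or empty. Prints nothing when all are present.
list_incomplete_chunks() {
    local dir="$1"
    local sample="$2"
    local n="$3"
    local i=1
    local id mate f
    while [ "$i" -le "$n" ]; do
        id=$(printf '%04d' "$i")
        for mate in 1 2; do
            f="$dir/$id.${sample}_$mate.fastq"
            # -s is true only if the file exists and is not empty
            [ -s "$f" ] || echo "$f"
        done
        i=$(( i + 1 ))
    done
}
```

Running `list_incomplete_chunks 2_cleandata ${sample} 10` after the parallel step lists any chunk output that needs to be rerun.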

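    • Merge the 10 chunks back into one file per mate for downstream analysis
    # Merge each mate separately so that R1 and R2 end up in separate files
    for i in 1 2; do
        cat [0-9][0-9][0-9][0-9].${sample}_${i}.fastq | \
        pigz -p 12 -f -c \
        > ${sample}_clean_R${i}.fq.gz
    done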
    
