2019-07-29 lima

Author: 老_Z | Source: published 2019-07-30 20:44

Preparation for the downstream code
In the working directory, create a primers.fasta file that holds the primer (adapter) sequences:

>primer_5p
AAGCAGTGGTATCAACGCAGAGTACATGGGG
>primer_3p
AAGCAGTGGTATCAACGCAGAGTAC
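One way to create and sanity-check this file from the command line is sketched below. It assumes the file is written to the current directory; the variable names `n_records` and `n_bad` are only illustrative, and the check is a plain text test, not something lima itself requires.

```shell
# Write the two primer records from this post into primers.fasta (current directory).
cat > primers.fasta <<'EOF'
>primer_5p
AAGCAGTGGTATCAACGCAGAGTACATGGGG
>primer_3p
AAGCAGTGGTATCAACGCAGAGTAC
EOF

# Sanity checks: count the records and flag sequence lines with non-ACGT characters.
n_records=$(grep -c '^>' primers.fasta)
n_bad=$(grep -v '^>' primers.fasta | grep -c '[^ACGT]' || true)
echo "records: $n_records, invalid sequence lines: $n_bad"
```

A stray space or lowercase base in a sequence line would show up as a non-zero count in the second number.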

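Run lima, PacBio's primer-removal tool (not limma, the package used for differential expression analysis of microarray data). The parameters are as follows:
lima: Demultiplex barcoded samples. It can remove primers, and it can also split samples that carry multiple barcodes.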

lima /home/Acer1/ZhaoJing/rice_pacbio/movie.ccs.bam /home/Acer1/ZhaoJing/rice_pacbio/primers.fasta /home/Acer1/ZhaoJing/rice_pacbio/demux.ccs.bam --isoseq --no-pbi
Usage: lima [options] INPUT BARCODES OUTPUT
Lima, Demultiplex Barcoded PacBio Data and Clip Barcodes 

Library Design:
  -s,--same                      Only keep same barcodes in a pair in BAM
                                 output.
  -d,--different                 Only keep different barcodes in a pair in BAM
                                 output. Enforces --min-passes ≥ 1.

Input Limitations:
  -p,--per-read                  Do not tag per ZMW, but per read.
  -f,--score-full-pass           Only use subreads flanked by adapters for
                                 barcode identification.
  -n,--max-scored-barcode-pairs  Only use up to N barcode pair regions to find
                                 the barcode, 0 means use all. [0]
  -b,--max-scored-barcodes       Analyze at maximum the provided number of
                                 barcodes per ZMW; 0 means deactivated. [0]
  -a,--max-scored-adapters       Analyze at maximum the provided number of
                                 adapters per ZMW; 0 means deactivated. [0]
  -u,--min-passes                Minimal number of full passes. [0]
  -l,--min-length                Minimum sequence length after clipping. [50]
  -L,--max-input-length          Maximum input sequence length, 0 means
                                 deactivated. [0]
  -M,--bad-adapter-ratio         Maximum ratio of bad adapter. [0]
  -P,--shared-prefix             Barcodes may be substrings of others.

Barcode Region:
  -w,--window-size-mult          The candidate region size multiplier:
                                 barcode_length * multiplier. [1.5]
  -W,--window-size-bp            The candidate region size in bp. If set,
                                 --window-size-mult will be ignored. [0]
  -r,--min-ref-span              Minimum reference span relative to the barcode
                                 length. [0.5]
  -R,--min-scoring-regions       Minimum number of barcode regions with
                                 sufficient relative span to the barcode length.
                                 [1]

Score Filters:
  -m,--min-score                 Reads below the minimum barcode score are
                                 removed from downstream analysis. [0]
  -i,--min-end-score             Minimum end barcode score threshold is applied
                                 to the individual leading and trailing ends.
                                 [0]
  -x,--min-signal-increase       The minimal score difference, between first
                                 and combined, required to call a barcode pair
                                 different. [10]
  -y,--min-score-lead            The minimal score lead required to call a
                                 barcode pair significant. [10]

Index Sorting:
  -k,--keep-tag-idx-order        Keep identified order of barcode pair indices
                                 in BC tag; CCS only.
  -K,--keep-split-idx-order      Keep identified order of barcode pair indices
                                 in split BAM names; CCS only.

Aligner Configuration:
  --ccs                          CCS mode, use optimal alignment options -A 1
                                 -B 4 -D 3 -I 3 -X 4.
  -A,--match-score               Score for a sequence match. [4]
  -B,--mismatch-penalty          Penalty for a mismatch. [13]
  -D,--deletion-penalty          Deletions penalty. [7]
  -I,--insertion-penalty         Insertion penalty. [7]
  -X,--branch-penalty            Branch penalty. [4]

Output Restrictions:
  --split-bam                    Split BAM output by barcode pair.
  --split-bam-named              Split BAM output by resolved barcode pair name.
  --bam-handles                  Maximum number of open BAM files. [500]
  --dump-clips                   Dump clipped regions in a separate output file
                                 <prefix>.lima.clips
  --dump-removed                 Dump removed records to
                                 <prefix>.lima.removed.bam.
  --no-pbi                       Do not generate a PBI file that is needed for SMRTLink.
  --no-bam                       Do not generate BAM output.
  --no-reports                   Do not generate reports.

Single Side:
  -S,--single-side               Assign single side barcodes by score clustering.
  --scored-adapter-ratio         Minimum ratio of scored vs sequenced adapters. [0.25]

IsoSeq:
  --isoseq                       Activate specialized IsoSeq mode.

Advanced:
  --peek                         Demux the first N ZMWs and return the mean
                                 score; 0 means peeking deactivated. [0]
  --guess                        Try to guess the used barcodes, using the
                                 provided mean score threshold; 0 means guessing
                                 deactivated. [0]
  --guess-min-count              Minimum number of ZMWs observed to whitelist
                                 barcodes. [0]
  --peek-guess                   Try to infer the used barcodes subset, by
                                 peeking at the first 50,000 ZMWs, whitelisting
                                 barcode pairs with more than 10 counts and mean
                                 score ≥ 45.

Options:
  -h,--help                      Output this help.
  --version                      Output version info.
  -j,--num-threads               Number of threads to use, 0 means
                                 autodetection. [0]
  --emit-tool-contract           Emit tool contract.
  --resolved-tool-contract       Use args from resolved tool contract.

Arguments:
  input                          Source BAM or DATASET
  barcode                        FASTA or BARCODESET file
  output                         Output BAM or DATASET file
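The invocation used in this post follows the `lima [options] INPUT BARCODES OUTPUT` pattern from the usage text. As a rough sketch, it can be wrapped in a small helper that prints the command before running it. The function name `run_lima` and the `DRY_RUN` variable are hypothetical conveniences, not part of lima; the file names and the `--isoseq --no-pbi` flags are the ones from the command in this post.

```shell
# Hypothetical wrapper (not part of lima): build the command from the post and
# print it by default; set DRY_RUN=0 to actually execute it.
run_lima() {
  workdir=$1
  # Positional order follows the usage text: INPUT BARCODES OUTPUT, then options.
  set -- lima \
    "$workdir/movie.ccs.bam" \
    "$workdir/primers.fasta" \
    "$workdir/demux.ccs.bam" \
    --isoseq --no-pbi
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$*"
  else
    "$@"
  fi
}

run_lima /home/Acer1/ZhaoJing/rice_pacbio
```

Printing the command first makes it easy to confirm that the three positional arguments are in the right order before committing to a real run.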


With the default parameters the run is quick: a 600 MB ccs.reads.bam was finished in a little over ten minutes.
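To measure the runtime on your own data, a minimal sketch is a helper that reports the wall-clock time of any command. The name `timed` is hypothetical, and the resolution is whole seconds because it relies on `date +%s`.

```shell
# Hypothetical helper: run a command, report elapsed wall-clock seconds on stderr,
# and pass through the command's exit status.
timed() {
  start=$(date +%s)
  "$@"
  status=$?
  end=$(date +%s)
  echo "elapsed: $((end - start)) s" >&2
  return "$status"
}

# Example with a harmless command; replace it with the lima invocation to time a real run.
timed sleep 1
```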
