Usearch fastq_mergepairs command usage notes

Author: 代号北极能 | Published 2019-06-28 16:33

All of the following information comes from www.drive5.com. I use this page only as a notebook for my own learning and declare no commercial interest in it. Anyone reading this document should refer to www.drive5.com.

I ran into some problems when trying to merge my read data, so I collected the information shown below.

The fastq_mergepairs command merges (assembles) paired-end reads to create consensus sequences and, optionally, consensus quality scores. This command has many features and options so I recommend spending some time browsing the documentation to get familiar with the capabilities of fastq_mergepairs and issues that arise in read merging.

Basic usage

The simplest way to use fastq_mergepairs is to specify the forward and reverse FASTQ filenames and an output FASTQ filename.

usearch -fastq_mergepairs SampleA_R1.fastq -reverse SampleA_R2.fastq -fastqout merged.fq

Automatic R2 filename

If the -reverse option is omitted, the reverse FASTQ filename is constructed by replacing R1 with R2. The following command line is equivalent to the example above.

usearch -fastq_mergepairs SampleA_R1.fastq -fastqout merged.fq
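Because the reverse filename is derived rather than stated, a missing or misnamed R2 file is easy to overlook. The following is a minimal shell sketch of the substitution, assuming that only the first occurrence of R1 in the name is replaced (the notes above do not say how names containing R1 more than once are handled). The helpers derive_r2 and check_pairs are hypothetical and are not part of usearch.

```shell
# Hypothetical helper: guess the R2 filename from an R1 filename.
# Assumption: only the first "R1" in the name is replaced.
derive_r2() {
    printf '%s\n' "$1" | sed 's/R1/R2/'
}

# Report forward files whose derived reverse file does not exist.
# Returns non-zero if at least one reverse file is missing.
check_pairs() {
    status=0
    for fwd in "$@"; do
        rev=$(derive_r2 "$fwd")
        if [ ! -f "$rev" ]; then
            echo "missing reverse file for $fwd: expected $rev" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, derive_r2 SampleA_R1.fastq prints SampleA_R2.fastq, and running check_pairs *_R1*.fastq before merging flags any forward file without a matching reverse file.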

Merging multiple FASTQ file pairs in a single command

You can specify two or more FASTQ filenames following -fastq_mergepairs. In the following example, SampleA and SampleB are both merged. The R2 filenames are constructed automatically as explained above, or can be given explicitly using the -reverse option.

usearch -fastq_mergepairs SampleA_R1.fastq SampleB_R1.fastq -fastqout merged.fq

usearch -fastq_mergepairs *_R1*.fastq -fastqout merged.fq

(This is the form I used when I had 45 read files to merge.)

Adding sample identifiers to read labels

If multiple samples are combined into a single file as shown in some of the above examples, then you lose track of which read came from which sample. This is addressed by adding a sample identifier to each read label. The simplest method is to use the -sample option, e.g.

usearch -fastq_mergepairs SampleA_R1.fastq -fastqout merged.fq -sample SampleA

The string sample=SampleA; will be added at the end of the read label.
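Since -sample takes a single identifier, one way to label several samples is to run the command once per sample and combine the outputs afterwards. The sketch below only prints the commands it would run, so it can be inspected without usearch installed. It assumes the forward files are in the current directory and that the sample name is everything before the first underscore; the function name print_merge_commands and the per-sample output names are hypothetical.

```shell
# Hypothetical dry run: print one merge command per forward FASTQ file.
# Assumption: the sample name is the filename up to the first underscore.
print_merge_commands() {
    for fwd in "$@"; do
        sample=${fwd%%_*}
        printf 'usearch -fastq_mergepairs %s -fastqout %s_merged.fq -sample %s\n' \
            "$fwd" "$sample" "$sample"
    done
}
```

Calling print_merge_commands SampleA_R1.fastq SampleB_R1.fastq prints two commands, one with -sample SampleA and one with -sample SampleB. The per-sample output files can then be concatenated into a single FASTQ file.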

Getting the sample identifier from the FASTQ filename

FASTQ filenames are often based on the sample identifier, e.g. SampleA_R1.fastq. If you specify -relabel @ then fastq_mergepairs gets the sample identifier from the FASTQ filename by truncating at the first underscore (_) or period (.). A period and the read number are added after the sample identifier to make the new read label, which replaces the original label. This differs from the -sample option, which adds the sample= annotation at the end of the label. The usearch_global command understands both of these methods for putting sample identifiers into read labels.

usearch -fastq_mergepairs SampleA_R1.fastq -fastqout merged.fq -relabel @
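To preview which sample identifiers -relabel @ would produce for a set of filenames, the truncation rule described above can be imitated in the shell. This is only a sketch of the rule as stated here: it assumes any directory part of the path is ignored, and the helpers sample_from_filename and preview_label are hypothetical, not usearch features.

```shell
# Hypothetical helper: truncate a FASTQ filename at the first
# underscore or period to obtain the sample identifier.
# Assumption: the directory part of the path is ignored.
sample_from_filename() {
    base=$(basename "$1")
    printf '%s\n' "${base%%[_.]*}"
}

# Preview a new read label: sample identifier, a period, the read number.
preview_label() {
    printf '%s.%s\n' "$(sample_from_filename "$1")" "$2"
}
```

With these definitions, sample_from_filename SampleA_R1.fastq prints SampleA and preview_label SampleA_R1.fastq 1 prints SampleA.1. A filename such as Mock.run2_R1.fastq would yield Mock, which is worth checking before merging if your sample names contain periods or underscores.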

Merging multiple files with sample identifiers

By using wildcards and the -relabel @ option you can merge multiple files and add sample identifiers to the read labels, for example:

usearch -fastq_mergepairs *R1*.fastq -fastqout merged.fq -relabel @

fastq_mergepairs options

Input files

-fastq_mergepairs Forward FASTQ filename(s).

-reverse Reverse FASTQ filename(s). If not given, constructed by replacing R1 with R2.

-interleaved Forward and reverse reads are interleaved in the same file (sometimes produced by SRA fastq-dump).

Output files

-fastqout FASTQ filename for merged reads.

-fastaout FASTA filename for merged reads.

-fastqout_notmerged_fwd FASTQ filename for forward reads which were not merged.

-fastaout_notmerged_fwd FASTA filename for forward reads which were not merged.

-fastqout_notmerged_rev FASTQ filename for reverse reads which were not merged.

-fastaout_notmerged_rev FASTA filename for reverse reads which were not merged.

Reports

-report Filename for summary report. See Reviewing a fastq_mergepairs report to check for problems.

-tabbedout Tabbed text file containing detailed information about merging process for each pair including reason for discarding.

-alnout Human-readable alignments. Useful for trouble-shooting.

Merged read labels

-relabel Prefix string for output labels. The read number 1, 2, 3... is appended after the prefix.

-relabel @ Relabel using prefix string constructed from FASTQ filename, this will be understood as the sample identifier.

-sample xxx Append sample identifier to read label using sample=xxx; format. This is an alternative method for adding sample ids.

-fastq_eeout Add ee=xxx; annotation with the number of expected errors in the merged read.

-label_suffix Suffix to append to merged read label. Can be used e.g. to add sample=xxx; type of sample identifier annotations.
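To give a feel for the number that -fastq_eeout reports, the sketch below computes expected errors from a quality string using the usual definition: each Q score corresponds to an error probability of 10^(-Q/10), and the expected number of errors is the sum of those probabilities over the read. It assumes Phred+33 encoding. The function expected_errors is a hypothetical helper, and it does not reproduce how usearch calculates the Q scores of the merged read.

```shell
# Hypothetical helper: expected errors for one quality string.
# Assumption: Phred+33 encoding (Q = ASCII code - 33).
expected_errors() {
    printf '%s\n' "$1" | awk '
        # Build a character-to-ASCII-code lookup table.
        BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i }
        {
            ee = 0
            for (i = 1; i <= length($0); i++) {
                q = ord[substr($0, i, 1)] - 33
                ee += exp(-q * log(10) / 10)   # same as 10^(-q/10)
            }
            printf "%.4f\n", ee
        }'
}
```

For example, expected_errors 5555 prints 0.0400, because each of the four bases has Q20 and therefore an error probability of 0.01.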

Filtering

-fastq_maxdiffs Maximum number of mismatches in the alignment. Default 5. Consider increasing if you have long overlaps.

-fastq_pctid Minimum %id of alignment. Default 90. Consider decreasing if you have long overlaps.

-fastq_nostagger Discard staggered pairs. Default is to trim overhangs (non-biological sequence).

-fastq_minmergelen Minimum length for the merged sequence. See Filtering artifacts by setting a merge length range.

-fastq_maxmergelen Maximum length for the merged sequence.

-fastq_minqual Discard merged read if any merged Q score is less than the given value. (No minimum by default).

-fastq_minovlen Discard pair if alignment is shorter than given value. Default 16.
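When choosing values for -fastq_minmergelen and -fastq_maxmergelen, it can help to look at the length distribution of reads merged without a length filter. The sketch below tabulates sequence lengths in a FASTQ file. It assumes exactly four lines per record (no wrapped sequences), and length_histogram is a hypothetical helper, not a usearch command.

```shell
# Hypothetical helper: print "length<TAB>count" for a FASTQ file,
# sorted by length.
# Assumption: four lines per record, sequence on the second line.
length_histogram() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len "\t" n[len] }' "$1" | sort -n
}
```

Running length_histogram merged.fq on a first, unfiltered merge shows where most merged lengths fall, so that outlying lengths can be excluded in a second run.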

Pre-processing of reads before alignment

-fastq_trunctail Truncate reads at the first Q score <= this value. Default 2.

-fastq_minlen Discard pair if either read is shorter than this, after truncating by -fastq_trunctail if applicable. Default 64.

Multi-threading

-threads Specifies the number of threads. Default 10, or the number of CPU cores, whichever is less.
