Using Flye (Genome Assembly)

Author: GenomeStudy | Published: 2023-08-16 15:39

Introduction

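Flye is a de novo assembler for single-molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies (ONT). It is designed for a wide range of datasets, from small bacterial projects to large mammalian-scale assemblies. The package is a complete pipeline: it takes raw PacBio / ONT reads as input and outputs polished contigs.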

Installation

1. Install with conda

conda install -c conda-forge -c bioconda flye

2. Build from source

Get the latest source from GitHub and compile it:

git clone https://github.com/fenderglass/Flye
cd Flye
make

3. Install with Python's setup.py, which puts the flye command on your PATH

git clone https://github.com/fenderglass/Flye
cd Flye
python setup.py install
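
Whichever route you take, it is worth confirming that the flye executable is reachable before starting a long run. The snippet below is a minimal sketch: the function name check_flye is made up for illustration, and it relies only on the --version flag that appears in Flye's help output. If you only ran make, the launcher may sit inside the cloned directory instead of on your PATH.

```shell
# check_flye: hypothetical helper, not part of Flye itself.
# Prints the Flye version if the executable is reachable, otherwise returns 1.
check_flye() {
    if command -v flye >/dev/null 2>&1; then
        echo "flye found: $(flye --version 2>&1)"
    else
        echo "flye not found on PATH" >&2
        return 1
    fi
}

# Report the version, or print a hint when Flye is missing.
check_flye || echo "install Flye first, or call the launcher from the build directory"
```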

Usage

Flye takes raw PacBio / ONT reads as input.
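
Before assembling, a quick count of reads and bases helps sanity-check the input and gives a rough coverage estimate. The following is a sketch under a few assumptions: the helper name fastq_stats is hypothetical, the input is strict 4-line FASTQ (each sequence on a single line), and the genome size is passed in base pairs.

```shell
# fastq_stats FILE GENOME_SIZE_BP
# Hypothetical helper: counts reads and bases in a FASTQ file (plain or .gz)
# and reports approximate coverage for the given genome size.
# Assumes strict 4-line FASTQ records.
fastq_stats() {
    file=$1
    genome_bp=$2
    case "$file" in
        *.gz) gzip -dc "$file" ;;   # decompress on the fly
        *)    cat "$file" ;;
    esac | awk -v g="$genome_bp" '
        NR % 4 == 2 { reads++; bases += length($0) }   # line 2 of each record is the sequence
        END { printf "reads=%d bases=%.0f coverage=%.1fx\n", reads, bases, bases / g }'
}

# Example call (file name taken from the example later in this post):
#   fastq_stats hifi_reads.fastq.gz 1400000000
```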

1. Check the help message first

$ flye -h
usage: flye (--pacbio-raw | --pacbio-corr | --pacbio-hifi | --nano-raw |
             --nano-corr | --nano-hq ) file1 [file_2 ...]
             --out-dir PATH

             [--genome-size SIZE] [--threads int] [--iterations int]
             [--meta] [--polish-target] [--min-overlap SIZE]
             [--keep-haplotypes] [--debug] [--version] [--help]
             [--scaffold] [--resume] [--resume-from] [--stop-after]
             [--read-error float] [--extra-params]
             [--deterministic]

Assembly of long reads with repeat graphs

options:
  -h, --help            show this help message and exit
  --pacbio-raw path [path ...]
                        PacBio regular CLR reads (<20% error)
  --pacbio-corr path [path ...]
                        PacBio reads that were corrected with other methods (<3% error)
  --pacbio-hifi path [path ...]
                        PacBio HiFi reads (<1% error)
  --nano-raw path [path ...]
                        ONT regular reads, pre-Guppy5 (<20% error)
  --nano-corr path [path ...]
                        ONT reads that were corrected with other methods (<3% error)
  --nano-hq path [path ...]
                        ONT high-quality reads: Guppy5+ SUP or Q20 (<5% error)
  --subassemblies path [path ...]
                        [deprecated] high-quality contigs input
  -g size, --genome-size size
                        estimated genome size (for example, 5m or 2.6g)
  -o path, --out-dir path
                        Output directory
  -t int, --threads int
                        number of parallel threads [1]
  -i int, --iterations int
                        number of polishing iterations [1]
  -m int, --min-overlap int
                        minimum overlap between reads [auto]
  --asm-coverage int    reduced coverage for initial disjointig assembly [not set]
  --hifi-error float    [deprecated] same as --read-error
  --read-error float    adjust parameters for given read error rate (as fraction e.g. 0.03)
  --extra-params extra_params
                        extra configuration parameters list (comma-separated)
  --plasmids            unused (retained for backward compatibility)
  --meta                metagenome / uneven coverage mode
  --keep-haplotypes     do not collapse alternative haplotypes
  --no-alt-contigs      do not output contigs representing alternative haplotypes
  --scaffold            enable scaffolding using graph [disabled by default]
  --trestle             [deprecated] enable Trestle [disabled by default]
  --polish-target path  run polisher on the target sequence
  --resume              resume from the last completed stage
  --resume-from stage_name
                        resume from a custom stage
  --stop-after stage_name
                        stop after the specified stage completed
  --debug               enable debug output
  -v, --version         show program's version number and exit
  --deterministic       perform disjointig assembly single-threaded

Input reads can be in FASTA or FASTQ format, uncompressed
or compressed with gz. Currently, PacBio (CLR, HiFi, corrected)
and ONT reads (regular, HQ, corrected) are supported. Expected error rates are
<15% for PB CLR/regular ONT; <5% for ONT HQ, <3% for corrected, and <1% for HiFi. Note that Flye
was primarily developed to run on uncorrected reads. You may specify multiple
files with reads (separated by spaces). Mixing different read
types is not yet supported. The --meta option enables the mode
for metagenome/uneven coverage assembly.

To reduce memory consumption for large genome assemblies,
you can use a subset of the longest reads for initial disjointig
assembly by specifying --asm-coverage and --genome-size options. Typically,
40x coverage is enough to produce good disjointigs.

You can run Flye polisher as a standalone tool using
--polish-target option.
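
The memory-saving tip in the help text can be sketched as a command. This is an illustration, not a recommended configuration: the read file and genome size reuse the values from the example in this post, the output directory name is hypothetical, and the command is printed for review unless RUN=1 is set.

```shell
# Hypothetical inputs; adjust to your own data.
reads=hifi_reads.fastq.gz
outdir=./flye_asmcov

# --asm-coverage is used together with --genome-size (see the help text above).
# 40 follows the help text's note that 40x is typically enough for disjointigs.
cmd="flye --pacbio-hifi $reads --out-dir $outdir --genome-size 1.4g --asm-coverage 40 --threads 40"

# Print the command first; set RUN=1 to actually start the assembly.
# (Unquoted expansion is fine here because none of the values contain spaces.)
echo "$cmd"
if [ "${RUN:-0}" = "1" ]; then
    $cmd
fi

# If a run is interrupted, the help text lists --resume to continue
# from the last completed stage.
```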

2. Based on the options above, pick the input option that matches your raw data

--pacbio-raw | --pacbio-corr | --pacbio-hifi | --nano-raw | --nano-corr | --nano-hq
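
To make the choice explicit in a script, the mapping from data type to option can be written as a small lookup. This is only a sketch: the helper name flye_read_flag and the short labels on the left (pacbio-clr, ont-hq, and so on) are invented here, while the options on the right come from the help text.

```shell
# flye_read_flag TYPE
# Hypothetical helper: maps a short data-type label (labels invented for this
# example) to the matching Flye input option from the help text.
flye_read_flag() {
    case "$1" in
        pacbio-clr)  echo "--pacbio-raw"  ;;  # PacBio regular CLR reads
        pacbio-corr) echo "--pacbio-corr" ;;  # PacBio reads corrected with other methods
        pacbio-hifi) echo "--pacbio-hifi" ;;  # PacBio HiFi reads
        ont-raw)     echo "--nano-raw"    ;;  # ONT regular reads, pre-Guppy5
        ont-corr)    echo "--nano-corr"   ;;  # ONT reads corrected with other methods
        ont-hq)      echo "--nano-hq"     ;;  # ONT high-quality reads (Guppy5+ SUP or Q20)
        *) echo "unknown read type: $1" >&2; return 1 ;;
    esac
}

# Example: prints --pacbio-hifi
flye_read_flag pacbio-hifi
```

Because Flye does not yet support mixing read types, a script built this way should pass exactly one such option per run.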

3. A worked example

flye --pacbio-hifi hifi_reads.fastq.gz --out-dir ./03.flye --iterations 3 --genome-size 1.4g --threads 40

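# Parameters
--pacbio-hifi   # our data are PacBio HiFi reads, so we pick this option
--out-dir   # path of the output directory
--iterations 3 # run 3 polishing iterations
--genome-size 1.4g # estimated genome size (units: m, g)
--threads 40 # number of threads to use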
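
Once the run finishes, a quick look at the output directory shows whether an assembly was produced. The sketch below assumes the file names described in the Flye documentation (assembly.fasta for the contigs and flye.log for the run log); the helper name summarize_assembly is hypothetical.

```shell
# summarize_assembly OUTDIR
# Hypothetical helper: checks a Flye output directory and reports the number
# of contigs and the total assembled length in base pairs.
summarize_assembly() {
    fasta="$1/assembly.fasta"
    if [ ! -s "$fasta" ]; then
        echo "no assembly.fasta in $1 (check flye.log)" >&2
        return 1
    fi
    # Header lines start with '>'; everything else is sequence.
    contigs=$(grep -c '^>' "$fasta")
    bases=$(grep -v '^>' "$fasta" | tr -d '\n' | wc -c | tr -d ' ')
    echo "contigs=$contigs total_bp=$bases"
}

# Example call for the run above:
#   summarize_assembly ./03.flye
```

Comparing total_bp with the value given to --genome-size is a cheap first check before moving on to a fuller quality assessment.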

References

https://github.com/fenderglass/Flye

Article link: https://www.haomeiwen.com/subject/hrokmdtx.html
