Bacterial genome analysis software: Bactopia

Author: yilunanxia | Published: 2020-08-05 10:34


I. Software overview

1. Publication:

Petit RA III, Read TD. Bactopia: a flexible pipeline for complete analysis of bacterial genomes. mSystems 5 (2020). https://doi.org/10.1128/mSystems.00190-20

2. Project page:

https://github.com/bactopia/bactopia

3. Workflow:

[Figure: Bactopia analysis workflow]

4. Main features

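In my view, its biggest selling point is that it is foolproof: one step does everything. Previously, an analysis meant chaining several tools and running them one after another, for example FastQC -> Trimmomatic -> Unicycler (SPAdes) -> Prokka -> BLAST against a custom database. Worse, you constantly had to write small scripts to convert between formats. In short, it was frustrating, and it was still hard to get a good paper out of it (a lesson learned through blood and tears).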
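To make the contrast concrete, here is a rough dry-run sketch of such a manual chain. It only prints the commands rather than running them, so none of the tools need to be installed. The function name `manual_pipeline`, all file and directory names, the trimming settings, and the database name `custom_db` are hypothetical placeholders, not something taken from the Bactopia paper.

```shell
#!/usr/bin/env bash
# Dry-run sketch: print (do not execute) one possible manual chain for a
# paired-end sample. Every path and parameter below is an illustrative
# assumption; adjust to your own data before running anything for real.
manual_pipeline() {
    local s="$1"                 # sample name (hypothetical)
    local db="${2:-custom_db}"   # BLAST database name (hypothetical)
    local r1="${s}_R1.fastq.gz" r2="${s}_R2.fastq.gz"

    # 1. Read quality control
    echo "fastqc -o qc/ $r1 $r2"
    # 2. Adapter and quality trimming
    echo "trimmomatic PE $r1 $r2 ${s}_R1.trim.fastq.gz ${s}_R1.unpaired.fastq.gz ${s}_R2.trim.fastq.gz ${s}_R2.unpaired.fastq.gz SLIDINGWINDOW:4:20 MINLEN:36"
    # 3. Assembly (Unicycler drives SPAdes internally)
    echo "unicycler -1 ${s}_R1.trim.fastq.gz -2 ${s}_R2.trim.fastq.gz -o assembly/${s}"
    # 4. Annotation
    echo "prokka --outdir annotation/${s} --prefix ${s} assembly/${s}/assembly.fasta"
    # 5. Search the predicted genes against a custom database
    echo "blastn -query annotation/${s}/${s}.ffn -db ${db} -outfmt 6 -out ${s}.blast.tsv"
}

manual_pipeline sampleA
```

Five separate tools, five sets of conventions for input and output names: this is the bookkeeping that a single pipeline command is meant to take over.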

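Once configured, Bactopia runs the whole analysis in a single step. Exciting, isn't it? Summary statistics, 16S sequence extraction and phylogenetic tree building, taxonomic classification, finer ANI-based classification (species or subspecies?), pan-genome analysis and so on are all handled in one go. I wonder whether companies that are currently building their own pipelines will find this just as exciting.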

Software list for version 1.4, as given in the paper

--------------------------------------------------------------------------------------------------------------------------------
Software              Version      Description
--------------------------------------------------------------------------------------------------------------------------------
AMRFinder             3.6.7        Finds acquired antimicrobial resistance genes and some point mutations in protein or assembled nucleotide sequences
Aragorn               1.2.38       Finds transfer RNA (tRNA) features
Ariba                 2.14.4       Antimicrobial resistance identification by assembly
ART                   2016.06.05   A set of simulation tools to generate synthetic next-generation sequencing reads
assembly-scan         0.3.0        Generates basic stats for an assembly
Barrnap               0.9          Bacterial ribosomal RNA predictor
BBMap                 38.76        A suite of fast, multithreaded bioinformatics tools designed for analysis of DNA and RNA sequence data
BCFtools              1.9          Utilities for variant calling and manipulating VCFs and BCFs
Bedtools              2.29.2       A powerful tool set for genome arithmetic
BioPython             1.76         Tools for biological computation written in Python
BLAST                 2.9.0        Basic local alignment search tool
Bowtie2               2.4.1        A fast and sensitive gapped-read aligner
BWA                   0.7.17       Burrows-Wheeler Aligner for short-read alignment
CD-HIT                4.8.1        Accelerated for clustering the next-generation sequencing data
CheckM                1.1.2        Assesses the quality of microbial genomes recovered from isolates, single cells, and metagenomes
ClonalFrameML         1.12         Efficient inference of recombination in whole bacterial genomes
DiagrammeR            1.0.0        Graph and network visualization using tabular data in R (https://github.com/rich-iannone/DiagrammeR)
DIAMOND               0.9.35       Accelerated BLAST-compatible local sequence aligner (https://github.com/bbuchfink/diamond)
eggNOG-Mapper         2.0.1        Fast genome-wide functional annotation through orthology assignment
EMIRGE                0.61.1       Reconstructs full-length ribosomal genes from short-read sequencing data
FastANI               1.3          Fast whole-genome similarity (ANI) estimation
FastTree2             2.1.10       Approximately-maximum-likelihood phylogenetic trees from alignments of nucleotide or protein sequences
fastq-dl              1.0.3        Downloads FASTQ files from SRA or ENA repositories
FastQC                0.11.9       A quality control analysis tool for high throughput sequencing data
fastq-scan            0.4.3        Outputs FASTQ summary statistics in JSON format
FLASH                 1.2.11       A fast and accurate tool to merge paired-end reads
freebayes             1.3.2        Bayesian haplotype-based genetic polymorphism discovery and genotyping
GNU Parallel          20200122     A shell tool for executing jobs in parallel
GTDB-tk               1.0.2        A tool kit for assigning objective taxonomic classifications to bacterial and archaeal genomes
HMMER                 3.3          Biosequence analysis using profile hidden Markov models
Infernal              1.1.2        Searches DNA sequence databases for RNA structure and sequence similarities
IQ-TREE               1.6.12       Efficient phylogenomic software by maximum likelihood
ISMapper              2.0          Insertion sequence mapping software
Lighter               1.1.2        Fast and memory-efficient sequencing error corrector
MAFFT                 7.455        Multiple alignment program for amino acid or nucleotide sequences
Mash                  2.2.2        Fast genome and metagenome distance estimation using MinHash
Mashtree              1.1.2        Creates a tree using Mash distances
maskrc-svg            0.5          Masks recombination as detected by ClonalFrameML or Gubbins and draws an SVG
McCortex              1.0          De novo genome assembly and multisample variant calling
MEGAHIT               1.2.9        Ultra-fast and memory-efficient (meta-)genome assembler
MinCED                0.4.2        Mining CRISPRs in environmental data sets
Minimap2              2.17         A versatile pairwise aligner for genomic and spliced nucleotide sequences
ncbi-genome-download  0.2.12       Scripts to download genomes from the NCBI FTP servers
Nextflow              19.10.0      A DSL for data-driven computational pipelines
phyloFlash            3.3b3        Rapidly reconstruct the SSU rRNAs and explore phylogenetic composition of an Illumina (meta)genomic data set
Pigz                  2.3.4        A parallel implementation of gzip for modern multiprocessor, multicore machines
Pilon                 1.23         An automated genome assembly improvement and variant detection tool
PIRATE                1.0.3        A toolbox for pan-genome analysis and threshold evaluation
pplacer               1.1.alpha19  Phylogenetic placement and downstream analysis
Prodigal              2.6.3        Fast, reliable protein-coding gene prediction for prokaryotic genomes
Prokka                1.14.5       Rapid prokaryotic genome annotation
QUAST                 5.0.2        Quality assessment tool for genome assemblies
Racon                 1.4.13       Ultrafast consensus module for raw de novo genome assembly of long uncorrected reads
Roary                 3.13.0       Rapid large-scale prokaryote pan genome analysis
samclip               0.2          Filter SAM file for soft and hard clipped alignments
SAMtools              1.9          Tools for manipulating next-generation sequencing data
Seqtk                 1.3          A fast and lightweight tool for processing sequences in the FASTA or FASTQ format
Shovill               1.0.9se      Faster assembly of Illumina reads
SKESA                 2.3.0        Strategic k-mer extension for scrupulous assemblies
Snippy                4.4.5        Rapid haploid variant calling and core genome alignment
SnpEff                4.3.1        Genomic variant annotations and functional effect prediction toolbox
snp-dists             0.6.3        Pairwise SNP distance matrix from a FASTA sequence alignment
SNP-sites             2.5.1        Rapidly extracts SNPs from a multi-FASTA alignment
Sourmash              3.2.0        Compute and compare MinHash signatures for DNA data sets
SPAdes                3.13.0       An assembly toolkit containing various assembly pipelines
Trimmomatic           0.39         A flexible read trimming tool for Illumina NGS data
Unicycler             0.4.8        Hybrid assembly pipeline for bacterial genomes
vcf-annotator         0.5          Add biological annotations to variants in a VCF file
Vcflib                1.0.0rc3     A simple C library for parsing and manipulating VCF files
Velvet                1.2.10       Short read de novo assembler using de Bruijn graphs
VSEARCH               2.14.1       Versatile open-source tool for metagenomics
vt                    2015.11.10   A tool set for short-variant discovery in genetic sequence data
--------------------------------------------------------------------------------------------------------------------------------

5. Usage

5.1 Installation

conda create -y -n bactopia -c conda-forge -c bioconda bactopia

conda activate bactopia

bactopia datasets datasets/   # downloads into the given directory 'datasets/': CARD, VFDB (core), RefSeq Mash sketch, GenBank Sourmash signatures, PLSDB Mash sketch and BLAST database
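Before launching a run, it can save time to confirm that the environment is active and the datasets were actually downloaded. The helper below is a minimal sketch of such a check; the function name `check_setup` is hypothetical, and it only tests that a `bactopia` command is on the PATH and that the datasets directory exists, not that the contents are complete.

```shell
#!/usr/bin/env bash
# check_setup [DATASETS_DIR]
# Hypothetical helper: reports whether the bactopia command is reachable and
# whether the datasets directory exists. Returns non-zero if either is missing.
check_setup() {
    local datasets_dir="${1:-datasets}"
    local status=0

    if ! command -v bactopia >/dev/null 2>&1; then
        echo "bactopia not found on PATH (did you run 'conda activate bactopia'?)"
        status=1
    fi

    if [ ! -d "$datasets_dir" ]; then
        echo "datasets directory '$datasets_dir' is missing"
        status=1
    fi

    if [ "$status" -eq 0 ]; then
        echo "setup looks OK"
    fi
    return "$status"
}

# Example call; '|| true' keeps a failed check from aborting an interactive shell.
check_setup datasets/ || true
```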

5.2 Running the software

Paired-end data

bactopia --R1 ${SAMPLE}_R1.fastq.gz --R2 ${SAMPLE}_R2.fastq.gz --sample ${SAMPLE} \
    --datasets datasets/ --outdir ${OUTDIR}

Single-end data

bactopia --SE ${SAMPLE}.fastq.gz --sample ${SAMPLE} --datasets datasets/ --outdir ${OUTDIR}
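If a script has to handle both layouts, one option is to pick the command from the files that are actually present. The following is a small sketch under the naming convention used in the two commands above (`${SAMPLE}_R1.fastq.gz` and `${SAMPLE}_R2.fastq.gz` for paired-end, `${SAMPLE}.fastq.gz` for single-end). The function name `build_bactopia_cmd` is hypothetical, and it only prints the command so you can inspect it before running.

```shell
#!/usr/bin/env bash
# build_bactopia_cmd SAMPLE FASTQ_DIR OUTDIR
# Hypothetical helper: prints a paired-end command when both R1 and R2 files
# exist, a single-end command when only SAMPLE.fastq.gz exists, and fails
# otherwise. It uses only the options shown in this post.
build_bactopia_cmd() {
    local sample="$1" dir="$2" outdir="$3"
    local r1="$dir/${sample}_R1.fastq.gz"
    local r2="$dir/${sample}_R2.fastq.gz"
    local se="$dir/${sample}.fastq.gz"

    if [ -f "$r1" ] && [ -f "$r2" ]; then
        echo "bactopia --R1 $r1 --R2 $r2 --sample $sample --datasets datasets/ --outdir $outdir"
    elif [ -f "$se" ]; then
        echo "bactopia --SE $se --sample $sample --datasets datasets/ --outdir $outdir"
    else
        echo "no FASTQ files found for $sample in $dir" >&2
        return 1
    fi
}

# Demo with empty placeholder files in a temporary directory.
demo_dir=$(mktemp -d)
touch "$demo_dir/sampleA_R1.fastq.gz" "$demo_dir/sampleA_R2.fastq.gz" "$demo_dir/sampleB.fastq.gz"
build_bactopia_cmd sampleA "$demo_dir" results
build_bactopia_cmd sampleB "$demo_dir" results
```

Printing instead of executing is a deliberate choice here: paths with spaces would need proper quoting before the output could be passed to a shell.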

Multiple samples

bactopia prepare directory-of-fastqs/ > fastqs.txt

bactopia --fastqs fastqs.txt --datasets datasets/ --outdir ${OUTDIR}

ENA data (this one is really nice)

bactopia --accessions ena-accessions.txt \
    --datasets datasets/ \
    --species "Staphylococcus aureus" \
    --coverage 100 \
    --genome_size median \
    --cpus 2 \
    --outdir ena-multiple-samples
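A typo in the accessions file may only surface after the pipeline has started downloading, so a quick format check beforehand can help. The sketch below assumes that `ena-accessions.txt` holds one accession per line and that only experiment (SRX/ERX/DRX) or run (SRR/ERR/DRR) accessions are wanted; both are assumptions on my part, so check the Bactopia documentation for what `--accessions` accepts in your version. The function name `bad_accessions` is hypothetical.

```shell
#!/usr/bin/env bash
# bad_accessions FILE
# Hypothetical helper: prints every non-blank line that does not look like an
# SRA/ENA/DDBJ experiment or run accession (e.g. SRX1234567, ERR000001).
# Prints nothing when all lines pass.
bad_accessions() {
    # First grep drops blank lines; second grep drops well-formed accessions.
    # '|| true' hides grep's non-zero exit status when nothing is left.
    grep -v -E '^[[:space:]]*$' "$1" | grep -v -E '^[SED]R[XR][0-9]+$' || true
}

# Demo on a temporary file: two valid accessions, one blank line, and two
# entries that should be flagged (a BioProject ID and a lowercase accession).
demo_file=$(mktemp)
printf 'SRX1234567\nERR000001\n\nPRJNA480016\nsrx12\n' > "$demo_file"
bad_accessions "$demo_file"
```

An empty result means every line matched the expected pattern; it does not confirm that the accessions exist in ENA.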
