Visualizing nucmer output

Author: 胡童远 | Published 2024-03-21 11:50

    gitee:https://gitee.com/liaochenlanruo/mummer2circos
    github: https://github.com/metagenlab/mummer2circos

    Source: https://taylorreiter.github.io/2019-05-11-Visualizing-NUCmer-Output/

    Alignment and visualization in R

    Installing mummer

    conda create -n mummer 
    conda activate mummer
    conda install -c bioconda mummer4=4.0.0beta2
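
With the environment active, it can help to confirm that the programs used in the rest of this post are actually on the PATH. The following is a minimal sketch; `require_tools` is a hypothetical helper name, not part of MUMmer or conda.

```shell
# Report every required command that is not on PATH.
# Returns 0 when all tools are found, 1 otherwise.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: require_tools nucmer delta-filter show-coords
```

If a tool is reported as missing, the usual cause is that the `mummer` environment has not been activated in the current shell.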
    

    Running nucmer

    To download the test data, run:

    # M. harundinacea 6AC
    wget -O mh6ac.fasta.gz ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/235/565/GCA_000235565.1_ASM23556v1/GCA_000235565.1_ASM23556v1_genomic.fna.gz
    gunzip mh6ac.fasta.gz
    
    # M. harundinacea MAG07
    wget -O mag07.fasta https://osf.io/d9qyg/download 
    

    The general structure of the nucmer command looks like this:

    nucmer --mum reference.fasta query.fasta -p query_ref_nucmer
    

    We will use the GenBank assembly as the reference and the metagenome-assembled genome (MAG) bin as the query.

    nucmer --mum mh6ac.fasta mag07.fasta -p m_harundinacea
    

    Here, we filter the nucmer output to keep only alignments at least 1000 bp long. This threshold is arbitrary; choose a length that makes sense for your biological question.

    delta-filter -l 1000 -q m_harundinacea.delta > m_harundinacea_filter.delta
    show-coords -c -l -L 1000 -r -T m_harundinacea_filter.delta > m_harundinacea_filter_coords.txt
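
Before plotting, a quick summary of the coordinates table can show whether the filter left a sensible number of alignments. The sketch below assumes the tab-delimited layout produced by `show-coords -T`, with the reference alignment length in column 5 (`[LEN 1]`) and percent identity in column 7 (`[% IDY]`); `summarize_coords` is a hypothetical helper name.

```shell
# Summarize a tab-delimited show-coords table: number of alignments,
# summed alignment length on the reference (overlaps are counted twice),
# and length-weighted mean identity.
# Assumes column 5 is [LEN 1] and column 7 is [% IDY].
summarize_coords() {
    awk -F '\t' '
        $1 ~ /^[0-9]+$/ {          # skip header lines; data rows start with a coordinate
            n++
            bases += $5
            weighted += $5 * $7
        }
        END {
            if (n == 0) { print "no alignments found"; exit 1 }
            printf "alignments\t%d\n", n
            printf "aligned_ref_bases\t%d\n", bases
            printf "mean_identity\t%.2f\n", weighted / bases
        }
    ' "$1"
}

# Example: summarize_coords m_harundinacea_filter_coords.txt
```

Header lines are skipped by keeping only rows whose first field is an integer, so the function does not depend on the exact number of header lines.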
    

    Simple plot

    • -r reference fasta
    • -q other FASTA file(s) to compare with the reference FASTA
    • -l mandatory option to build circular plots
    • genome tracks are ordered based on the order of the input query fasta files
    mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*fna
    
    nucmer2circos_simple.png

    Condensed tracks

    mummer2circos -l -c -r genomes/NZ_CP008827.fna -q genomes/*fna
    
    nucmer2circos_condensed.png

    With gene tracks

    • the FASTA header of the reference chromosome (and of any plasmids) should be the same as the LOCUS accession in the GenBank file. See example file NZ_CP008828.fna.

    LOCUS NZ_CP008828 15096 bp DNA CON 16-AUG-2015
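
The header requirement can be checked before running the command. The sketch below compares the first word of each FASTA header with the second field of each `LOCUS` line; the helper names `fasta_ids`, `genbank_loci`, and `check_headers` are hypothetical and not part of mummer2circos.

```shell
# Print the sequence IDs of a FASTA file (first word of each header, without ">").
fasta_ids() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1" | sort -u
}

# Print the LOCUS names of a GenBank file (second field of each LOCUS line).
genbank_loci() {
    awk '$1 == "LOCUS" { print $2 }' "$1" | sort -u
}

# Report FASTA IDs that have no matching LOCUS record; returns 1 if any exist.
check_headers() {
    loci=$(genbank_loci "$2")
    status=0
    for id in $(fasta_ids "$1"); do
        if ! printf '%s\n' "$loci" | grep -Fxq -- "$id"; then
            echo "no LOCUS record for FASTA header: $id" >&2
            status=1
        fi
    done
    return "$status"
}

# Example: check_headers genomes/NZ_CP008827.fna GCF_000281535_merged.gbk
```

The comparison is an exact string match, so a header that carries a version suffix such as `.1` is reported as unmatched when the LOCUS line omits it.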

    mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk
    
    nucmer2circos_gene_tracks.png

    Label specific genes

    • given a FASTA file of proteins of interest, label the BBH (best bidirectional hit) of each amino acid sequence on the circular plot
    • the fasta headers are used as labels (see example file VF.faa)
    mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk -b VF.faa
    
    nucmer2circos_labels.png

    Show mapping depth along the chromosome (and plasmids)

    • depth files can be generated from bam file using samtools depth
    • the labels used in the .depth file should be the same as the fasta header (see example files)
    • regions with depth higher than 2 times the median are cropped to that limit and coloured in green (this handles highly repeated sequences).
    • regions with depth lower than half of the median depth are coloured in red.
    mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk -b VF.faa -s GCF_000281535.depth
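
To see where the two colour thresholds fall for a given sample, the median can be computed directly from the depth file. This is a sketch of the rule as described in the bullets above, not the mummer2circos implementation: it takes a single median over the whole file, and whether the tool computes it per sequence is not stated here. `depth_thresholds` is a hypothetical helper name.

```shell
# Print the median depth and the two cutoffs described above
# (2 x median and 0.5 x median) for a samtools depth file.
# Assumes three tab-separated columns: sequence name, position, depth.
depth_thresholds() {
    cut -f 3 "$1" | sort -n | awk '
        { d[NR] = $1 }
        END {
            if (NR == 0) { print "empty depth file"; exit 1 }
            if (NR % 2) median = d[(NR + 1) / 2]
            else        median = (d[NR / 2] + d[NR / 2 + 1]) / 2
            printf "median\t%g\n", median
            printf "high_cutoff\t%g\n", 2 * median
            printf "low_cutoff\t%g\n", median / 2
        }
    '
}

# Example: depth_thresholds GCF_000281535.depth
```

Note that `samtools depth` omits zero-coverage positions unless it is run with `-a`, which shifts the median upward for genomes with large uncovered regions.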
    
    nucmer2circos_depth.png

    Add labels based on coordinate file

    • structure: LOCUS start stop label (see labels.txt)
    • labels cannot include spaces
    mummer2circos -l -r genomes/NZ_CP008827.fna -q genomes/NZ_FO834906.fna -gb GCF_000281535_merged.gbk -b VF.faa -s GCF_000281535.depth -lf labels.txt
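
A malformed label file is easy to catch before plotting. The sketch below assumes the four fields are separated by whitespace (the exact delimiter expected by mummer2circos is not stated here) and that start and stop are integers; `check_labels` is a hypothetical helper name.

```shell
# Validate a label file with one "LOCUS start stop label" record per line.
# Assumes whitespace-separated fields; prints offending lines and returns 1 if any.
check_labels() {
    awk '
        NF == 0 { next }                                  # ignore blank lines
        NF != 4 || $2 !~ /^[0-9]+$/ || $3 !~ /^[0-9]+$/ {
            printf "line %d: %s\n", NR, $0
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}

# Example: check_labels labels.txt
```

A label that contains a space shows up as a line with more than four fields, which is how this check enforces the no-spaces rule.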
    
    nucmer2circos_labels_coord.png

    Show links between two genomes

    mummer2circos -r genomes/NZ_CP012745.fna -q genomes/*.fna -gb GCF_000281535_merged.gbk -b VF.faa -s GCF_000281535.depth -lf labels.txt
    
    nucmer2circos_links.png
