Error correction (polishing) of an assembled mitochondrial genome

Author: 多啦A梦的时光机_648d | Published: 2023-01-12 20:49

    1. Long-read (third-generation) polishing: minimap2 + racon (three iterations)

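    ## Step 1: long-read polishing
    # Install the tools from the bioconda channel
    conda install -c bioconda minimap2 racon
    # Long-read remapping, iteration 1 (minimap2)
    # 127kb.fasta: the draft assembly; the fastq.gz file: the long-read (third-generation) sequencing data
    # No -x preset is given, so minimap2 uses its default settings; pick a preset that matches the read type if needed
    minimap2 127kb.fasta ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz > 127kb.paf
    gzip 127kb.paf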
    
    # Long-read consensus call, iteration 1 (racon)
    racon -t 20 ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz 127kb.paf.gz 127kb.fasta >127kb1.fasta
    
    # Long-read remapping, iteration 2 (minimap2)
    minimap2 127kb1.fasta  ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz >127kb1.paf
    gzip 127kb1.paf
    
    # Long-read consensus call, iteration 2 (racon)
    racon -t 20 ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz 127kb1.paf.gz 127kb1.fasta >127kb2.fasta
    
    # Long-read remapping, iteration 3 (minimap2)
    minimap2 127kb2.fasta  ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz >127kb2.paf
    gzip 127kb2.paf
    
    # Long-read consensus call, iteration 3 (racon)
    racon -t 30 ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz 127kb2.paf.gz 127kb2.fasta >127kb3.fasta
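
The three rounds repeat the same pair of commands with shifted file names, so they can be generated rather than typed by hand. The following is a minimal sketch, not the author's script: `polish_plan` is a hypothetical helper, `reads.fastq.gz` is a placeholder for the long-read file, and the racon thread count is fixed at 20 for simplicity. It only prints the commands so they can be reviewed first.

```shell
# Print the minimap2 + racon commands for a given number of polishing rounds.
# Nothing is executed here: review the output, then pipe it to `sh` to run it.
# Usage: polish_plan PREFIX READS ROUNDS   (hypothetical helper)
polish_plan() (
    prefix=$1
    reads=$2
    rounds=$3
    draft="${prefix}.fasta"         # round 1 starts from the draft assembly
    i=1
    while [ "$i" -le "$rounds" ]; do
        paf="${draft%.fasta}.paf"   # 127kb.paf, 127kb1.paf, ...
        out="${prefix}${i}.fasta"   # 127kb1.fasta, 127kb2.fasta, ...
        echo "minimap2 ${draft} ${reads} > ${paf}"
        echo "gzip ${paf}"
        echo "racon -t 20 ${reads} ${paf}.gz ${draft} > ${out}"
        draft=$out                  # the next round polishes this output
        i=$((i + 1))
    done
)

# reads.fastq.gz is a placeholder file name
polish_plan 127kb reads.fastq.gz 3
```

Once the printed commands look right, `polish_plan 127kb reads.fastq.gz 3 | sh` would run them in order. File names containing spaces are not handled by this sketch.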
    

    2. Short-read (second-generation) polishing: bwa + pilon (one round)

    ## Step 2: short-read polishing
    conda install -c bioconda pilon
    # BWA genome indexing and short-read remapping
    bwa index 127kb3.fasta
    # samtools 1.6 was used here; the version must be recent enough, otherwise this command fails.
    # Note that -m is the memory per sorting thread, so -m 10G -@ 20 can use far more than 10G in total.
    bwa mem -t 30 127kb3.fasta ../WGS/Cassytha_filiformis_BDSW210000018-1A_1.clean.fq.gz ../WGS/Cassytha_filiformis_BDSW210000018-1A_2.clean.fq.gz | /home/lx_sky6/software/miniconda3/envs/yt/bin/samtools sort -m 10G -@ 20 > 127kb3.bam
    samtools index 127kb3.bam
    # Short-read consensus call, iteration 1 (pilon); the polished sequence is written to 127kb4.fasta
    /home/lx_sky6/software/miniconda3/bin/pilon --genome 127kb3.fasta --frags 127kb3.bam --fix all --output 127kb4
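
Because the `bwa mem | samtools sort` pipeline depends on a sufficiently recent samtools, it can help to check the version before starting a long run. The sketch below is only an illustration: `version_ge` is a hypothetical helper, and the minimum of 1.3 is an assumption (the text only says that the version must be recent, and that 1.6 worked).

```shell
# Succeed if dotted version $1 is >= $2, comparing major and minor numerically.
# Usage: version_ge HAVE WANT   (hypothetical helper)
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, "."); split(b, y, ".")
        if (x[1] + 0 != y[1] + 0) exit !(x[1] + 0 > y[1] + 0)
        exit !(x[2] + 0 >= y[2] + 0)
    }'
}

# Only runs the check when samtools is on PATH.
if command -v samtools >/dev/null 2>&1; then
    # The first line of `samtools --version` looks like "samtools 1.6"
    have=$(samtools --version 2>/dev/null | awk 'NR == 1 { print $2 }')
    # 1.3 is an assumed minimum, not a value taken from the text
    if ! version_ge "$have" 1.3; then
        echo "samtools '${have}' may be too old for the sort pipeline" >&2
    fi
fi
```

A plain numeric comparison is used instead of `sort -V` so that the check does not depend on GNU coreutils.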
    

    Error: Pilon runs out of memory

    Pilon version 1.24 Thu Jan 28 13:00:45 2021 -0500
    Genome: 127kb3.fasta
    Fixing snps, indels, gaps, local
    Input genome size: 124420
    Scanning BAMs
    127kb3.bam: 158192970 reads, 0 filtered, 1947329 mapped, 1787406 proper, 11840 stray, FR 100% 371+/-82, max 615
    Processing ctg000000:1-124420
    frags 127kb3.bam: coverage 1760
    Total Reads: 2087914, Coverage: 1760, minDepth: 176
    Confirmed 102590 of 124420 bases (82.45%)
    Corrected 822 snps; 1 ambiguous bases; corrected 176 small insertions totaling 452 bases, 233 small deletions totaling 962 bases
    # Attempting to fix local continuity breaks
    Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.broadinstitute.pilon.PileUp.<init>(PileUp.scala:26)
        at org.broadinstitute.pilon.Assembler.$anonfun$addToPileups$1(Assembler.scala:83)
        at org.broadinstitute.pilon.Assembler$$Lambda$98/843467284.apply$mcVI$sp(Unknown Source)
        at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
        at org.broadinstitute.pilon.Assembler.addToPileups(Assembler.scala:80)
        at org.broadinstitute.pilon.Assembler.addRead(Assembler.scala:65)
        at org.broadinstitute.pilon.Assembler.$anonfun$addReads$1(Assembler.scala:47)
        at org.broadinstitute.pilon.Assembler.$anonfun$addReads$1$adapted(Assembler.scala:47)
        at org.broadinstitute.pilon.Assembler$$Lambda$97/1176735295.apply(Unknown Source)
        at scala.collection.immutable.List.foreach(List.scala:333)
        at org.broadinstitute.pilon.Assembler.addReads(Assembler.scala:47)
        at org.broadinstitute.pilon.GapFiller.assembleIntoBreak(GapFiller.scala:127)
        at org.broadinstitute.pilon.GapFiller.assembleAcrossBreak(GapFiller.scala:55)
        at org.broadinstitute.pilon.GapFiller.fixBreak(GapFiller.scala:46)
        at org.broadinstitute.pilon.GenomeRegion.$anonfun$identifyAndFixIssues$6(GenomeRegion.scala:401)
        at org.broadinstitute.pilon.GenomeRegion.$anonfun$identifyAndFixIssues$6$adapted(GenomeRegion.scala:399)
        at org.broadinstitute.pilon.GenomeRegion$$Lambda$88/233021551.apply(Unknown Source)
        at scala.collection.immutable.List.foreach(List.scala:333)
        at org.broadinstitute.pilon.GenomeRegion.identifyAndFixIssues(GenomeRegion.scala:399)
        at org.broadinstitute.pilon.GenomeFile.$anonfun$processRegions$4(GenomeFile.scala:113)
        at org.broadinstitute.pilon.GenomeFile.$anonfun$processRegions$4$adapted(GenomeFile.scala:102)
        at org.broadinstitute.pilon.GenomeFile$$Lambda$43/1863932867.apply(Unknown Source)
        at scala.collection.immutable.List.foreach(List.scala:333)
        at org.broadinstitute.pilon.GenomeFile.processRegions(GenomeFile.scala:102)
        at org.broadinstitute.pilon.Pilon$.main(Pilon.scala:111)
        at org.broadinstitute.pilon.Pilon.main(Pilon.scala)
    

    Or:

    Pilon version 1.24 Thu Jan 28 13:00:45 2021 -0500
    Genome: YC.asm.hic.p_ctg.fasta
    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at org.broadinstitute.pilon.GenomeRegion.<init>(GenomeRegion.scala:54)
        at org.broadinstitute.pilon.GenomeFile.$anonfun$contigRegions$1(GenomeFile.scala:73)
        at org.broadinstitute.pilon.GenomeFile.$anonfun$contigRegions$1$adapted(GenomeFile.scala:73)
        at org.broadinstitute.pilon.GenomeFile$$Lambda$25/731260860.apply(Unknown Source)
        at scala.collection.immutable.Range.map(Range.scala:59)
        at org.broadinstitute.pilon.GenomeFile.contigRegions(GenomeFile.scala:73)
        at org.broadinstitute.pilon.GenomeFile.$anonfun$regions$1(GenomeFile.scala:53)
        at org.broadinstitute.pilon.GenomeFile$$Lambda$24/1795799895.apply(Unknown Source)
        at scala.collection.immutable.List.map(List.scala:250)
        at org.broadinstitute.pilon.GenomeFile.<init>(GenomeFile.scala:53)
        at org.broadinstitute.pilon.Pilon$.main(Pilon.scala:108)
        at org.broadinstitute.pilon.Pilon.main(Pilon.scala)
    

    Raise the memory limit configured for the program

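    # Find the path of the pilon launcher script
    which pilon
    # Edit the pilon launcher script
    vim ~/software/miniconda3/bin/pilon
    
    # In that file, change the Java memory limit from
    #   1g
    # to
    #   10g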
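
The same edit can be made without opening an editor. This is a hedged sketch: it assumes the launcher script contains a JVM option of the form `-Xmx1g` (check with `grep -n Xmx "$(which pilon)"` first), and `raise_heap` is a hypothetical helper. The result is written back with `cat` rather than `sed -i` so that a symlinked launcher stays a symlink.

```shell
# Replace the JVM max-heap option (-Xmx<size>) in a launcher script.
# Usage: raise_heap SCRIPT NEW_SIZE   (hypothetical helper)
# A backup of the original content is kept as SCRIPT.bak.
raise_heap() {
    script=$1
    new=$2
    tmp=$(mktemp) || return 1
    cp "$script" "${script}.bak" || return 1
    # Rewrite the first -Xmx option on each line, then write the result back
    # through the existing file so symlinks and permissions are preserved.
    sed "s/-Xmx[0-9][0-9]*[kKmMgG]/-Xmx${new}/" "$script" > "$tmp" &&
        cat "$tmp" > "$script"
    rm -f "$tmp"
    # Show the resulting setting; fails if no -Xmx option is present.
    grep -n -e '-Xmx' "$script"
}

# Example (uncomment to apply to the installed launcher):
# raise_heap "$(which pilon)" 10g
```

If the launcher has no `-Xmx` option at all, the function changes nothing and returns a non-zero status. Running the Pilon jar directly with `java -Xmx10G -jar` followed by the jar path is another common way to set the heap size.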

    3. Inspection, validation, and assembly quality assessment

    # BWA genome indexing and short-read remapping
    bwa index 127kb4.fasta
    bwa mem -t 30 127kb4.fasta ../WGS/Cassytha_filiformis_BDSW210000018-1A_1.clean.fq.gz ../WGS/Cassytha_filiformis_BDSW210000018-1A_2.clean.fq.gz | /home/lx_sky6/software/miniconda3/envs/yt/bin/samtools sort -m 10G -@ 20 > 127kb4.bam
    # If the assembly has not been circularized yet, map with circlator instead:
    # circlator mapreads 127kb4.fasta ../WGS/Cassytha_filiformis_BDSW210000018-1A_1.clean.fq.gz 127kb4.bam
    
    samtools index 127kb4.bam
    # Inspect the alignments interactively
    samtools tview 127kb4.bam 127kb4.fasta
    
    # Sequencing depth
    samtools depth 127kb4.bam > 127kb4_depth.txt
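
The depth file has three tab-separated columns (sequence name, position, depth), so a quick summary can flag weakly supported regions. The sketch below is one possible way to do this; `depth_summary` is a hypothetical helper and the threshold of 10 is an arbitrary example value. Keep in mind that `samtools depth` skips zero-coverage positions unless `-a` is given, so uncovered bases are not counted here.

```shell
# Summarize a `samtools depth` file: number of reported positions, mean depth,
# and how many positions fall below a minimum depth.
# Usage: depth_summary DEPTH_FILE [MIN_DEPTH]   (hypothetical helper)
depth_summary() {
    awk -F'\t' -v min="${2:-10}" '
        { sum += $3; n++; if ($3 < min) low++ }
        END {
            if (n == 0) { print "no positions"; exit 1 }
            printf "positions=%d mean=%.1f below_%d=%d\n", n, sum / n, min, low
        }' "$1"
}

# Run on the depth file from the step above when it exists.
if [ -f 127kb4_depth.txt ]; then
    depth_summary 127kb4_depth.txt 10
fi
```

Positions listed below the threshold are good candidates for a closer look in `samtools tview`.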
    

    Article link: https://www.haomeiwen.com/subject/nhjycdtx.html