1. Long-read (third-generation) error correction with minimap2 + racon (three iterations)
## step1: long reads polishing
long reads remapping - Iteration 1 (minimap2)
conda install -c bioconda racon
minimap2 127kb.fasta ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz >127kb.paf ## 127kb.fasta: the draft assembly ## ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz: the long-read (third-generation) sequencing data
gzip 127kb.paf
long reads consensus call - Iteration 1 (racon)
racon -t 20 ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz 127kb.paf.gz 127kb.fasta >127kb1.fasta
long reads remapping - Iteration 2 (minimap2)
minimap2 127kb1.fasta ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz >127kb1.paf
gzip 127kb1.paf
long reads consensus call - Iteration 2 (racon)
racon -t 20 ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz 127kb1.paf.gz 127kb1.fasta >127kb2.fasta
long reads remapping - Iteration 3 (minimap2)
minimap2 127kb2.fasta ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz >127kb2.paf
gzip 127kb2.paf
long reads consensus call - Iteration 3 (racon)
racon -t 30 ../output.1061_FPAC22H004594_1A--1061_FPAC22H004594_1A.fastq.fastq.gz 127kb2.paf.gz 127kb2.fasta >127kb3.fasta
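The three rounds above repeat the same commands with shifted file names, so they can also be written as a loop. The following is only a minimal sketch under the same naming scheme (127kb.fasta, then 127kb1.fasta, 127kb2.fasta, 127kb3.fasta). The function name polish_rounds, the DRY_RUN switch, and the shortened reads file name reads.fastq.gz are hypothetical and introduced purely for illustration. With DRY_RUN left at its default of 1, the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: N rounds of minimap2 + racon, mirroring the manual commands above.
# polish_rounds and DRY_RUN are illustrative names, not part of either tool.
set -eu

polish_rounds() {
    draft=$1      # starting assembly, e.g. 127kb.fasta
    reads=$2      # long reads (FASTQ, may be gzipped)
    prefix=$3     # output prefix, e.g. 127kb
    rounds=$4     # number of iterations, e.g. 3
    threads=$5    # threads passed to racon -t

    current=$draft
    i=1
    while [ "$i" -le "$rounds" ]; do
        paf="${current%.fasta}.paf"
        next="${prefix}${i}.fasta"
        if [ "${DRY_RUN:-1}" = 1 ]; then
            # Only show what would be run.
            echo "minimap2 $current $reads > $paf"
            echo "gzip $paf"
            echo "racon -t $threads $reads $paf.gz $current > $next"
        else
            minimap2 "$current" "$reads" > "$paf"
            gzip "$paf"
            racon -t "$threads" "$reads" "$paf.gz" "$current" > "$next"
        fi
        current=$next   # the polished output becomes the next round's target
        i=$((i + 1))
    done
}

# Print the commands for three rounds (set DRY_RUN=0 to execute them).
polish_rounds 127kb.fasta reads.fastq.gz 127kb 3 20
```

Reading the printed commands before setting DRY_RUN=0 is a cheap way to confirm that each round polishes the output of the previous one rather than the original draft.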
2. Short-read (second-generation) error correction with bwa + pilon (one round)
## step2: short reads polishing
conda install -c bioconda pilon
BWA genome indexing & short read remapping
bwa index 127kb3.fasta
bwa mem -t 30 127kb3.fasta ../WGS/Cassytha_filiformis_BDSW210000018-1A_1.clean.fq.gz ../WGS/Cassytha_filiformis_BDSW210000018-1A_2.clean.fq.gz |/home/lx_sky6/software/miniconda3/envs/yt/bin/samtools sort -m 10G -@ 20 >127kb3.bam ## samtools=1.6; the samtools version must be recent enough, otherwise this command line will not work
samtools index 127kb3.bam
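Because the pipe into samtools sort depends on a reasonably recent samtools, a quick version check before mapping can save a long failed run. This is a minimal sketch: version_ge is a hypothetical helper, and the minimum of 1.3 is an assumption about when the newer sort syntax became standard, so adjust it to your own needs. It relies on sort -V (version sort), which GNU coreutils provides.

```shell
#!/bin/sh
# Sketch: warn when samtools is older than a chosen minimum version.
# version_ge is a hypothetical helper; MIN_VERSION=1.3 is an assumption.
set -eu

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

MIN_VERSION=1.3

if command -v samtools >/dev/null 2>&1; then
    # The first line of `samtools --version` looks like "samtools 1.6".
    installed=$(samtools --version | head -n 1 | awk '{print $2}')
    if version_ge "$installed" "$MIN_VERSION"; then
        echo "samtools $installed is recent enough"
    else
        echo "samtools $installed is older than $MIN_VERSION" >&2
    fi
else
    echo "samtools not found on PATH" >&2
fi
```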
short reads consensus call - Iteration 1 (pilon)
/home/lx_sky6/software/miniconda3/bin/pilon --genome 127kb3.fasta --frags 127kb3.bam --fix all --output 127kb4
Error: pilon runs out of memory
Pilon version 1.24 Thu Jan 28 13:00:45 2021 -0500
Genome: 127kb3.fasta
Fixing snps, indels, gaps, local
Input genome size: 124420
Scanning BAMs
127kb3.bam: 158192970 reads, 0 filtered, 1947329 mapped, 1787406 proper, 11840 stray, FR 100% 371+/-82, max 615
Processing ctg000000:1-124420
frags 127kb3.bam: coverage 1760
Total Reads: 2087914, Coverage: 1760, minDepth: 176
Confirmed 102590 of 124420 bases (82.45%)
Corrected 822 snps; 1 ambiguous bases; corrected 176 small insertions totaling 452 bases, 233 small deletions totaling 962 bases
# Attempting to fix local continuity breaks
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
at org.broadinstitute.pilon.PileUp.<init>(PileUp.scala:26)
at org.broadinstitute.pilon.Assembler.$anonfun$addToPileups$1(Assembler.scala:83)
at org.broadinstitute.pilon.Assembler$$Lambda$98/843467284.apply$mcVI$sp(Unknown Source)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:190)
at org.broadinstitute.pilon.Assembler.addToPileups(Assembler.scala:80)
at org.broadinstitute.pilon.Assembler.addRead(Assembler.scala:65)
at org.broadinstitute.pilon.Assembler.$anonfun$addReads$1(Assembler.scala:47)
at org.broadinstitute.pilon.Assembler.$anonfun$addReads$1$adapted(Assembler.scala:47)
at org.broadinstitute.pilon.Assembler$$Lambda$97/1176735295.apply(Unknown Source)
at scala.collection.immutable.List.foreach(List.scala:333)
at org.broadinstitute.pilon.Assembler.addReads(Assembler.scala:47)
at org.broadinstitute.pilon.GapFiller.assembleIntoBreak(GapFiller.scala:127)
at org.broadinstitute.pilon.GapFiller.assembleAcrossBreak(GapFiller.scala:55)
at org.broadinstitute.pilon.GapFiller.fixBreak(GapFiller.scala:46)
at org.broadinstitute.pilon.GenomeRegion.$anonfun$identifyAndFixIssues$6(GenomeRegion.scala:401)
at org.broadinstitute.pilon.GenomeRegion.$anonfun$identifyAndFixIssues$6$adapted(GenomeRegion.scala:399)
at org.broadinstitute.pilon.GenomeRegion$$Lambda$88/233021551.apply(Unknown Source)
at scala.collection.immutable.List.foreach(List.scala:333)
at org.broadinstitute.pilon.GenomeRegion.identifyAndFixIssues(GenomeRegion.scala:399)
at org.broadinstitute.pilon.GenomeFile.$anonfun$processRegions$4(GenomeFile.scala:113)
at org.broadinstitute.pilon.GenomeFile.$anonfun$processRegions$4$adapted(GenomeFile.scala:102)
at org.broadinstitute.pilon.GenomeFile$$Lambda$43/1863932867.apply(Unknown Source)
at scala.collection.immutable.List.foreach(List.scala:333)
at org.broadinstitute.pilon.GenomeFile.processRegions(GenomeFile.scala:102)
at org.broadinstitute.pilon.Pilon$.main(Pilon.scala:111)
at org.broadinstitute.pilon.Pilon.main(Pilon.scala)
or
Pilon version 1.24 Thu Jan 28 13:00:45 2021 -0500
Genome: YC.asm.hic.p_ctg.fasta
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at org.broadinstitute.pilon.GenomeRegion.<init>(GenomeRegion.scala:54)
at org.broadinstitute.pilon.GenomeFile.$anonfun$contigRegions$1(GenomeFile.scala:73)
at org.broadinstitute.pilon.GenomeFile.$anonfun$contigRegions$1$adapted(GenomeFile.scala:73)
at org.broadinstitute.pilon.GenomeFile$$Lambda$25/731260860.apply(Unknown Source)
at scala.collection.immutable.Range.map(Range.scala:59)
at org.broadinstitute.pilon.GenomeFile.contigRegions(GenomeFile.scala:73)
at org.broadinstitute.pilon.GenomeFile.$anonfun$regions$1(GenomeFile.scala:53)
at org.broadinstitute.pilon.GenomeFile$$Lambda$24/1795799895.apply(Unknown Source)
at scala.collection.immutable.List.map(List.scala:250)
at org.broadinstitute.pilon.GenomeFile.<init>(GenomeFile.scala:53)
at org.broadinstitute.pilon.Pilon$.main(Pilon.scala:108)
at org.broadinstitute.pilon.Pilon.main(Pilon.scala)
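Fix: increase the memory limit configured for pilon
# find the pilon path
which pilon
# edit the pilon launcher script
vim ~/software/miniconda3/bin/pilon
In the script, change the JVM maximum heap size from 1g to 10g (for example, a -Xmx1g setting becomes -Xmx10g).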
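As an alternative to editing the launcher, the heap size can often be given per run. This is a minimal sketch, assuming the conda-installed pilon launcher forwards JVM options such as -Xmx to java; reading the script found with `which pilon` is the way to confirm that. pilon_cmd is a hypothetical helper that only prints the command line, so nothing is executed until you copy the printed line.

```shell
#!/bin/sh
# Sketch: print a pilon command line with an explicit JVM heap size.
# Assumption: the conda `pilon` launcher passes -Xmx... through to java.
set -eu

pilon_cmd() {
    heap=$1     # JVM maximum heap, e.g. 10g
    genome=$2   # assembly to polish
    bam=$3      # sorted, indexed short-read BAM
    out=$4      # output prefix (pilon writes <out>.fasta)
    echo "pilon -Xmx$heap --genome $genome --frags $bam --fix all --output $out"
}

pilon_cmd 10g 127kb3.fasta 127kb3.bam 127kb4
```

If the launcher does not accept the option, calling the jar directly with `java -Xmx10g -jar` followed by the path to the pilon jar is the generic Java way to set the same limit.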
3. Checking, validation, and assembly quality assessment
BWA genome indexing & short read remapping
bwa index 127kb4.fasta
bwa mem -t 30 127kb4.fasta ../WGS/Cassytha_filiformis_BDSW210000018-1A_1.clean.fq.gz ../WGS/Cassytha_filiformis_BDSW210000018-1A_2.clean.fq.gz |/home/lx_sky6/software/miniconda3/envs/yt/bin/samtools sort -m 10G -@ 20 >127kb4.bam
(If the assembly has not been circularized: circlator mapreads 127kb4.fasta ../WGS/Cassytha_filiformis_BDSW210000018-1A_1.clean.fq.gz 127kb4.bam)
samtools index 127kb4.bam
samtools tview 127kb4.bam 127kb4.fasta
sequencing depth
samtools depth 127kb4.bam >127kb4_depth.txt
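The depth table is easier to judge once it is reduced to a few numbers. This is a minimal sketch that assumes the three-column output of samtools depth (contig, position, depth); depth_summary and the low-depth threshold of 10 are illustrative choices, not values from this workflow.

```shell
#!/bin/sh
# Sketch: summarise a `samtools depth` table (columns: contig, position, depth).
# depth_summary and the default threshold of 10 are illustrative choices.
set -eu

depth_summary() {
    file=$1
    min=${2:-10}   # positions with depth below this are counted as "low"
    awk -v min="$min" '
        { sum += $3; n++; if ($3 < min) low++ }
        END {
            if (n == 0) { print "no positions in input"; exit 1 }
            printf "positions=%d mean_depth=%.1f below_%d=%d\n", n, sum / n, min, low
        }
    ' "$file"
}

# Summarise the depth file from the step above when it is present.
if [ -f 127kb4_depth.txt ]; then
    depth_summary 127kb4_depth.txt
fi
```

Note that samtools depth skips positions with zero coverage unless it is run with -a, so the position count can be smaller than the assembly length.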