If you suspect that any of your samples contain substitution errors that occur on a single strand before sequencing, you should definitely use Mutect2's orientation bias filter. This applies to all FFPE tumor samples and to samples sequenced on Illumina NovaSeq machines, among others. In fact, with the optimizations in 4.1.1.0 you can run the filter even when you have no such suspicion: it won't hurt accuracy, and the CPU cost is now quite small.
A step-by-step guide to the new Mutect2 Read Orientation Artifacts Workflow
There are three steps to the filter.
# First, run Mutect2 with the --f1r2-tar-gz argument. This creates an output with the raw data used to learn the orientation bias model.
gatk Mutect2 -R ref.fasta \
-L intervals.interval_list \
-I tumor.bam \
-germline-resource af-only-gnomad.vcf \
-pon panel_of_normals.vcf \
--f1r2-tar-gz f1r2.tar.gz \
-O unfiltered.vcf
# Next, pass this raw data to LearnReadOrientationModel:
gatk LearnReadOrientationModel -I f1r2.tar.gz -O read-orientation-model.tar.gz
## Run GetPileupSummaries to summarize read support for a set of known variant sites.
gatk GetPileupSummaries \
-I tumor.bam \
-V chr17_small_exac_common_3_grch38.vcf.gz \
-L chr17_small_exac_common_3_grch38.vcf.gz \
-O getpileupsummaries.table
## Estimate contamination with CalculateContamination.
gatk CalculateContamination \
-I getpileupsummaries.table \
-tumor-segmentation segments.table \
-O calculatecontamination.table
# Finally, pass the learned read orientation model to FilterMutectCalls with the --ob-priors argument:
gatk FilterMutectCalls -R ref.fasta -V unfiltered.vcf \
[--tumor-segmentation segments.table] \
[--contamination-table calculatecontamination.table] \
--ob-priors read-orientation-model.tar.gz \
-O filtered.vcf
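The three steps above can be chained in a small wrapper. The following is only a sketch under stated assumptions: the `run` helper, the `orientation_filter` function, and the `DRY_RUN` and `GATK` variables are hypothetical names introduced here, and the `-L`, germline resource, panel of normals, and contamination arguments are left out for brevity. Passing `-R` to FilterMutectCalls assumes a GATK 4.1 or later release, where that tool expects the same reference that was given to Mutect2. By default the sketch only prints the commands, so it is safe to try without GATK installed.

```shell
set -eu

GATK="${GATK:-gatk}"      # assumption: gatk is on PATH unless GATK is set
DRY_RUN="${DRY_RUN:-1}"   # default: print commands; set DRY_RUN=0 to execute

# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# orientation_filter REF TUMOR_BAM PREFIX
# Runs the three orientation-bias steps, naming every output after PREFIX.
orientation_filter() {
  ref=$1; bam=$2; prefix=$3
  # Step 1: call variants and collect the raw F1R2 counts.
  run "$GATK" Mutect2 -R "$ref" -I "$bam" \
    --f1r2-tar-gz "${prefix}.f1r2.tar.gz" -O "${prefix}.unfiltered.vcf"
  # Step 2: learn the read orientation model from those counts.
  run "$GATK" LearnReadOrientationModel -I "${prefix}.f1r2.tar.gz" \
    -O "${prefix}.read-orientation-model.tar.gz"
  # Step 3: filter the calls using the learned priors.
  run "$GATK" FilterMutectCalls -R "$ref" -V "${prefix}.unfiltered.vcf" \
    --ob-priors "${prefix}.read-orientation-model.tar.gz" -O "${prefix}.filtered.vcf"
}

orientation_filter ref.fasta tumor.bam tumor
```

With the default `DRY_RUN=1` this prints the three gatk commands in order. Because `set -e` is on, a real run (`DRY_RUN=0`) stops at the first failing step rather than filtering with a missing model.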
A step-by-step guide to the new Mutect2 Panel of Normals Workflow
The three steps to create a panel of normals are:
# Step 1: Run Mutect2 in tumor-only mode for each normal sample:
gatk Mutect2 -R reference.fasta -I normal1.bam --max-mnp-distance 0 -O normal1.vcf.gz
gatk Mutect2 -R reference.fasta -I normal2.bam --max-mnp-distance 0 -O normal2.vcf.gz
gatk Mutect2 -R reference.fasta -I normal3.bam --max-mnp-distance 0 -O normal3.vcf.gz
# Step 2: Create a GenomicsDB from the normal Mutect2 calls:
gatk GenomicsDBImport -R reference.fasta -L intervals.interval_list \
--genomicsdb-workspace-path pon_db \
-V normal1.vcf.gz \
-V normal2.vcf.gz \
-V normal3.vcf.gz
# Step 3: Combine the normal calls using CreateSomaticPanelOfNormals:
gatk CreateSomaticPanelOfNormals -R reference.fasta \
--germline-resource af-only-gnomad.vcf.gz \
-V gendb://pon_db \
-O pon.vcf.gz
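With more than a handful of normals, typing one Mutect2 command and one `-V` argument per sample becomes error-prone. The sketch below generates the command list for the three steps above from a list of sample names. The `pon_commands` function is a hypothetical helper, not part of GATK, and it assumes each sample name maps to `NAME.bam` and `NAME.vcf.gz` with no spaces in any path. It only prints commands so they can be reviewed before anything runs.

```shell
set -eu

# pon_commands REF INTERVALS SAMPLE...
# Prints one gatk command per line: a tumor-only Mutect2 call per normal,
# then GenomicsDBImport with one -V per normal, then CreateSomaticPanelOfNormals.
pon_commands() {
  ref=$1; intervals=$2; shift 2
  vcf_args=""
  for sample in "$@"; do
    # Step 1: tumor-only Mutect2 call for this normal.
    echo "gatk Mutect2 -R $ref -I ${sample}.bam --max-mnp-distance 0 -O ${sample}.vcf.gz"
    # Collect the matching -V argument for step 2.
    vcf_args="$vcf_args -V ${sample}.vcf.gz"
  done
  # Step 2: import all normal calls into one GenomicsDB workspace.
  echo "gatk GenomicsDBImport -R $ref -L $intervals --genomicsdb-workspace-path pon_db$vcf_args"
  # Step 3: combine the normal calls into the panel of normals.
  echo "gatk CreateSomaticPanelOfNormals -R $ref --germline-resource af-only-gnomad.vcf.gz -V gendb://pon_db -O pon.vcf.gz"
}

pon_commands reference.fasta intervals.interval_list normal1 normal2 normal3
```

For the three sample names shown, this prints five commands: three Mutect2 calls, one GenomicsDBImport, and one CreateSomaticPanelOfNormals. Once the list looks right, it could be saved to a file and executed with `sh`.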