Sentieon Somatic Variant Detection Pipeline -- Series 2 (c


Author: chSNP | Published: 2020-12-19 15:02

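Introduction

This article covers two somatic variant calling pipelines:

  • TNscope: uses an algorithm specific to Sentieon, offering faster computation and higher accuracy; it is particularly well suited to clinical genetic-diagnostic samples;
  • TNhaplotyper2: matches Mutect2 results (currently matched to version 4.1.9) while running more than 10 times faster.

The complete TNscope and TNhaplotyper2 scripts are available at: https://github.com/Sentieon/sentieon-scripts/tree/master/example_pipelines/somatic
Sentieon software download: https://www.insvast.com/sentieon

The workflow below is aimed mainly at ctDNA and other high-depth sequencing samples (2000-5000x depth, AF > 0.3%).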
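To get a feel for what this operating range implies, the expected number of ALT-supporting reads at a site is roughly depth times allele fraction. The sketch below is plain arithmetic and not part of the Sentieon scripts; the function name `expected_alt_reads` is made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: expected ALT-supporting reads = depth * allele fraction.
expected_alt_reads() {
    awk -v d="$1" -v f="$2" 'BEGIN { printf "%.1f\n", d * f }'
}

expected_alt_reads 2000 0.003   # low end of the stated depth range
expected_alt_reads 5000 0.003   # high end of the stated depth range
```

At AF = 0.3% this works out to about 6 supporting reads at 2000x and about 15 at 5000x, so at the low end only a handful of reads carry the variant.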

Step 1: Alignment
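The commands in this and the following steps rely on shell variables that the article never defines. A minimal setup sketch follows. Every path and value in it is a placeholder assumption, and `require_vars` is a hypothetical helper, not part of Sentieon.

```shell
#!/usr/bin/env bash
# Placeholder values: replace every one of these with your own data.
nt=16                                        # number of threads
platform="ILLUMINA"                          # PL tag for the read groups
tumor="tumor_sample"                         # tumor sample name (ID and SM tags)
normal="normal_sample"                       # normal sample name (ID and SM tags)
fasta="/path/to/reference.fasta"
tumor_fastq_1="/path/to/tumor_R1.fastq.gz"
tumor_fastq_2="/path/to/tumor_R2.fastq.gz"
normal_fastq_1="/path/to/normal_R1.fastq.gz"
normal_fastq_2="/path/to/normal_R2.fastq.gz"
BED="/path/to/panel.bed"
dbsnp="/path/to/dbsnp.vcf.gz"
known_Mills_indels="/path/to/Mills_indels.vcf.gz"
known_1000G_indels="/path/to/1000G_indels.vcf.gz"
TUMOR_SM="$tumor"                            # must equal the tumor SM tag

# Hypothetical helper: fail early if any named variable is unset or empty.
require_vars() {
    missing=0
    for name in "$@"; do
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}

require_vars nt platform tumor normal fasta BED dbsnp TUMOR_SM && echo "variables OK"
```

Setting `TUMOR_SM` from `$tumor` keeps the sample name passed to the variant caller in step 4 identical to the SM tag written during alignment.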

# ****************************************** 
# 1a. Mapping reads with BWA-MEM, sorting for tumor sample 
# ****************************************** 
( sentieon bwa mem -M -R "@RG\tID:$tumor\tSM:$tumor\tPL:$platform" \
-t $nt -K 10000000 $fasta $tumor_fastq_1 $tumor_fastq_2 || \
echo -n 'error' ) | \
sentieon util sort -o tumor_sorted.bam -t $nt --sam2bam -i -

# ******************************************
# 1b. Mapping reads with BWA-MEM, sorting for normal sample
# ******************************************
( sentieon bwa mem -M -R "@RG\tID:$normal\tSM:$normal\tPL:$platform" \
-t $nt -K 10000000 $fasta $normal_fastq_1 $normal_fastq_2 || \
echo -n 'error' ) | \
sentieon util sort -o normal_sorted.bam -t $nt --sam2bam -i -
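The `|| echo -n 'error'` in the alignment commands appears intended to make a failed alignment visible downstream: if `bwa mem` exits non-zero, the word `error` is written into the pipe, so the sorter receives input that is not valid SAM instead of a silently truncated stream. The sketch below reproduces the mechanism with stand-ins. `fake_aligner` is hypothetical and `cat` takes the place of `sentieon util sort`.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for an aligner that emits a SAM header, then fails.
fake_aligner() {
    printf '@HD\tVN:1.6\n'
    return 1
}

# Same shape as the article's pipeline: on failure, append 'error' to the stream.
stream=$( ( fake_aligner || echo -n 'error' ) | cat )
printf '%s\n' "$stream"

# The last line of the stream is the sentinel, not a SAM record.
last_line=$(printf '%s\n' "$stream" | tail -n 1)
echo "last line: $last_line"
```

A real SAM consumer would be expected to reject that trailing line, which is what turns an upstream failure into a downstream one.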

Step 2: PCR Duplicate Removal (Skip for Amplicon)

# ******************************************
# 2a. Remove duplicate reads for tumor sample.
# ******************************************
# Pass 1: collect duplicate score information.
sentieon driver -t $nt -i tumor_sorted.bam \
      --algo LocusCollector \
      --fun score_info \
      tumor_score.txt
# Pass 2: deduplicate using the collected scores.
sentieon driver -t $nt -i tumor_sorted.bam \
      --algo Dedup \
      --score_info tumor_score.txt \
      --metrics tumor_dedup_metrics.txt \
      tumor_deduped.bam
# ******************************************
# 2b. Remove duplicate reads for normal sample.
# ******************************************
# Pass 1: collect duplicate score information.
sentieon driver -t $nt -i normal_sorted.bam \
     --algo LocusCollector \
     --fun score_info \
     normal_score.txt
# Pass 2: deduplicate using the collected scores.
sentieon driver -t $nt -i normal_sorted.bam \
     --algo Dedup \
     --score_info normal_score.txt \
     --metrics normal_dedup_metrics.txt \
     normal_deduped.bam
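Steps 2a and 2b run the same two passes (LocusCollector to gather duplicate scores, then Dedup to apply them) and differ only in the sample prefix. One way to avoid the repetition is a small wrapper. `dedup_sample`, `run`, and the `DRY_RUN` switch are illustrative names, not Sentieon features. With `DRY_RUN=1`, the default here, commands are only printed.

```shell
#!/usr/bin/env bash
nt=${nt:-16}            # thread count; placeholder default
DRY_RUN=${DRY_RUN:-1}   # 1 = print commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Two-pass duplicate handling for one sample prefix (e.g. "tumor").
dedup_sample() {
    prefix=$1
    run sentieon driver -t "$nt" -i "${prefix}_sorted.bam" \
        --algo LocusCollector --fun score_info "${prefix}_score.txt" &&
    run sentieon driver -t "$nt" -i "${prefix}_sorted.bam" \
        --algo Dedup --score_info "${prefix}_score.txt" \
        --metrics "${prefix}_dedup_metrics.txt" "${prefix}_deduped.bam"
}

for sample in tumor normal; do
    dedup_sample "$sample" || exit 1
done
```

Chaining the two passes with `&&` means Dedup never runs against a score file from a failed LocusCollector pass.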

Step 3: Base Quality Score Recalibration (Skip for Small Panel)

# ******************************************
# 3a. Base recalibration for tumor sample
# ******************************************
sentieon driver -r $fasta -t $nt -i tumor_deduped.bam --interval $BED \
    --algo QualCal \
    -k $dbsnp \
    -k $known_Mills_indels \
    -k $known_1000G_indels \
    tumor_recal_data.table
# ******************************************
# 3b. Base recalibration for normal sample
# ******************************************
sentieon driver -r $fasta -t $nt -i normal_deduped.bam --interval $BED \
     --algo QualCal \
     -k $dbsnp \
     -k $known_Mills_indels \
     -k $known_1000G_indels \
     normal_recal_data.table
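The heading says to skip recalibration for small panels, but the source gives no size threshold. One input to that decision is the total number of target bases in `$BED`. The sketch below sums interval lengths with awk, assuming a standard BED file (start in column 2, end in column 3) with non-overlapping intervals. The demo file and the function name `bed_total_bp` are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: total bases covered by a BED file (end - start per line).
# Skips track/browser header lines and comments; assumes intervals do not overlap.
bed_total_bp() {
    awk '!/^(track|browser|#)/ && NF >= 3 { total += $3 - $2 } END { print total + 0 }' "$1"
}

# Tiny made-up BED file for demonstration only.
demo_bed=$(mktemp)
printf 'chr1\t100\t250\nchr2\t1000\t1100\n' > "$demo_bed"

bed_total_bp "$demo_bed"   # 150 + 100 bases
rm -f "$demo_bed"
```

On a real panel the call would be `bed_total_bp "$BED"`; what counts as small is left to the reader, since the article does not say.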

Step 4: Variant Calling (Tumor Only)

# $TUMOR_SM must match the SM value of the tumor read group set in step 1a ($tumor).
# Optional: to use a panel of normals, add "--pon panel_of_normal.vcf" as a
# TNscope option before the output file name.
sentieon driver -r $fasta -t $nt -i tumor_deduped.bam --interval $BED --interval_padding 10 \
     --algo TNscope \
     --tumor_sample $TUMOR_SM \
     --dbsnp $dbsnp \
     --disable_detector sv \
     --min_tumor_allele_frac 3e-3 \
     --filter_t_alt_frac 3e-3 \
     --clip_by_minbq 1 \
     --min_init_tumor_lod 3.0 \
     --min_tumor_lod 3.0 \
     --assemble_mode 4 \
     --resample_depth 100000 \
     output_tnscope.pre_filter.vcf.gz
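The panel of normals (`--pon`) is optional in this step. One way to handle an optional argument in a script is to build the option list conditionally. The sketch below only prints the resulting command. The `PON` variable and the placeholder defaults are assumptions, and only a few of the TNscope options from the article are repeated for brevity.

```shell
#!/usr/bin/env bash
# Placeholder defaults so the sketch runs on its own; replace with real values.
fasta=${fasta:-/path/to/reference.fasta}
nt=${nt:-16}
BED=${BED:-/path/to/panel.bed}
TUMOR_SM=${TUMOR_SM:-tumor_sample}
PON=${PON:-}    # assumed variable: path to a panel-of-normals VCF, empty to skip

# Print the TNscope command line, adding --pon only when PON is non-empty.
build_tnscope_cmd() {
    set -- sentieon driver -r "$fasta" -t "$nt" -i tumor_deduped.bam \
        --interval "$BED" --interval_padding 10 \
        --algo TNscope --tumor_sample "$TUMOR_SM"
    if [ -n "$PON" ]; then
        set -- "$@" --pon "$PON"
    fi
    set -- "$@" output_tnscope.pre_filter.vcf.gz
    echo "$*"
}

build_tnscope_cmd
```

Keeping the output file as the last element appended means the optional flag always lands among the algorithm options, where the article's command places it.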

Step 5: Variant Filtration (Tumor Only)

bcftools annotate -x "FILTER/triallelic_site" output_tnscope.pre_filter.vcf.gz | \
   bcftools filter -m + -s "low_qual" -e "QUAL < 10" | \
   bcftools filter -m + -s "short_tandem_repeat" -e "RPA[0]>=10" | \
   bcftools filter -m + -s "read_pos_bias" -e "FMT/ReadPosRankSumPS[0] < -5" | \
   bcftools norm -f $fasta -m +any | \
   sentieon util vcfconvert - output_tnscope.filtered.vcf.gz
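Each `bcftools filter -m + -s NAME -e EXPR` stage is a soft filter: records matching EXPR stay in the file and get NAME appended to their FILTER column. To show the effect without needing bcftools, the sketch below imitates only the `QUAL < 10` stage with awk on three made-up records. It is an illustration, not a replacement for bcftools, and it leaves passing records untouched.

```shell
#!/usr/bin/env bash
# Imitate: bcftools filter -m + -s low_qual -e "QUAL < 10" (illustration only).
# In a VCF record, column 6 is QUAL and column 7 is FILTER.
soft_filter_low_qual() {
    awk 'BEGIN { FS = OFS = "\t" }
         /^#/ { print; next }                       # header lines pass through
         $6 != "." && $6 + 0 < 10 {                 # failing record: add the tag
             $7 = ($7 == "PASS" || $7 == ".") ? "low_qual" : ($7 ";low_qual")
             print; next
         }
         { print }                                  # passing record: unchanged
        '
}

# Made-up records: passing, failing, and failing with an existing filter tag.
printf 'chr1\t100\t.\tA\tT\t50\tPASS\t.\nchr1\t200\t.\tG\tC\t5\tPASS\t.\nchr1\t300\t.\tC\tG\t3\tother_filter\t.\n' |
    soft_filter_low_qual | cut -f 2,6,7
```

The third record shows the point of `-m +`: an existing tag is kept and `low_qual` is added after a semicolon instead of overwriting it.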


Link to this article: https://www.haomeiwen.com/subject/nfhcnktx.html