exomiser-cli-13.2.0/examples/exome-analysis.yml
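- Note: do not change the structure of the file; indentation matters, and YAML requires spaces rather than tabs.
- Workflow: annotation, filtering, prioritisation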
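Since YAML does not allow tabs for indentation, it can help to scan an analysis file for them before running. The following is a minimal sketch, not part of Exomiser; the function name `lines_with_tab_indent` and the sample text are made up for illustration.

```python
def lines_with_tab_indent(text):
    """Return the 1-based numbers of lines whose leading whitespace contains a tab."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove from the left.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            bad.append(number)
    return bad


# Hypothetical snippet: line 2 is indented with a tab, line 3 with spaces.
sample = "analysis:\n\tgenomeAssembly: hg19\n    vcf:\n"
print(lines_with_tab_indent(sample))
```

Any line number reported here should be re-indented with spaces before the file is used.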
## Exomiser Analysis Template.
# These are all the possible options for running exomiser. Use this as a template for your own set-up.
---
analysis:
# hg19 or hg38 - ensure that the application has been configured to run the specified assembly otherwise it will halt.
genomeAssembly: hg19
vcf:
ped: # the first six columns of the PED file hold the pedigree information
proband: CY0619_02
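The six pedigree columns are, in the usual PED layout, family ID, individual ID, father ID, mother ID, sex and phenotype. Below is a rough sketch of reading them, assuming whitespace-separated fields. The family ID `FAM1`, the parent IDs and the proband's sex are hypothetical; only the proband ID `CY0619_02` comes from the config.

```python
from typing import NamedTuple


class PedRecord(NamedTuple):
    family_id: str
    individual_id: str
    father_id: str   # "0" means the father is not in the pedigree
    mother_id: str   # "0" means the mother is not in the pedigree
    sex: str         # "1" = male, "2" = female, anything else = unknown
    phenotype: str   # "1" = unaffected, "2" = affected, "0" or "-9" = missing


def parse_ped_line(line):
    """Parse the first six whitespace-separated columns of one PED line."""
    fields = line.split()
    if len(fields) < 6:
        raise ValueError(f"expected at least 6 columns, got {len(fields)}: {line!r}")
    return PedRecord(*fields[:6])


# Hypothetical trio: only the proband ID is taken from the analysis file.
ped_text = """\
FAM1 FATHER 0 0 1 1
FAM1 MOTHER 0 0 2 1
FAM1 CY0619_02 FATHER MOTHER 1 2
"""

records = [parse_ped_line(l) for l in ped_text.splitlines() if l.strip()]
affected = [r.individual_id for r in records if r.phenotype == "2"]
print(affected)
```

The value given for `proband` should match an individual ID in the PED file and a sample name in the VCF.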
hpoIds: ['HP:0000407', 'HP:0001181', 'HP:0001249', 'HP:0001531', 'HP:0001824', 'HP:0002014', 'HP:0002017', 'HP:0002019', 'HP:0002027', 'HP:0002251', 'HP:0004322', 'HP:0005214', 'HP:0012719', 'HP:0100031', 'HP:0100806', 'HP:0200008']
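HPO identifiers follow the pattern `HP:` plus seven digits, and in a long list a missing colon (for example `HP0002017`) is easy to overlook. A small check along these lines can catch such slips; it is only a format check and does not confirm that a term actually exists in the ontology. The function name is an illustrative choice.

```python
import re

# "HP:" followed by exactly seven digits.
HPO_ID_PATTERN = re.compile(r"HP:\d{7}")


def find_malformed_hpo_ids(hpo_ids):
    """Return the entries that do not look like HP:0000000."""
    return [h for h in hpo_ids if not HPO_ID_PATTERN.fullmatch(h)]


ids = ["HP:0000407", "HP:0001181", "HP0002017", "HP:0002019"]
print(find_malformed_hpo_ids(ids))
```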
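Inheritance modes: if the disease's mode of inheritance is known in advance, keep only that mode.
Autosomal dominant (AD) and autosomal recessive (AR); the recessive modes are split into homozygous-alt (HOM_ALT) and compound-heterozygous (COMP_HET).
Each value is the maximum MAF (minor allele frequency), in percent.
That is, a variant can be a causative candidate only if its minor allele frequency is not too high.
Variants in the VCF whose MAF exceeds the cut-off are filtered out, which reduces false positives.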
# These are the default settings, with values representing the maximum minor allele frequency in percent (%)
# permitted for an allele to be considered as a causative candidate under that mode of inheritance.
# If you just want to analyse a sample under a single inheritance mode, delete/comment-out the others.
# For AUTOSOMAL_RECESSIVE or X_RECESSIVE ensure *both* relevant HOM_ALT and COMP_HET modes are present.
# In cases where you do not want any cut-offs applied an empty map should be used e.g. inheritanceModes: {}
inheritanceModes: {
AUTOSOMAL_DOMINANT: 0.1,
AUTOSOMAL_RECESSIVE_HOM_ALT: 0.1,
AUTOSOMAL_RECESSIVE_COMP_HET: 2.0,
X_DOMINANT: 0.1,
X_RECESSIVE_HOM_ALT: 0.1,
X_RECESSIVE_COMP_HET: 2.0,
MITOCHONDRIAL: 0.2
}
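To make the cut-offs concrete, the sketch below lists the inheritance modes under which a variant with a given frequency would still count as a candidate. It uses the default values from the map above. It is a simplification: the real analysis also checks genotypes against the pedigree, and treating the boundary as inclusive (`<=`) is an assumption here.

```python
# Maximum minor allele frequency in percent, copied from inheritanceModes.
INHERITANCE_MODE_MAX_MAF = {
    "AUTOSOMAL_DOMINANT": 0.1,
    "AUTOSOMAL_RECESSIVE_HOM_ALT": 0.1,
    "AUTOSOMAL_RECESSIVE_COMP_HET": 2.0,
    "X_DOMINANT": 0.1,
    "X_RECESSIVE_HOM_ALT": 0.1,
    "X_RECESSIVE_COMP_HET": 2.0,
    "MITOCHONDRIAL": 0.2,
}


def modes_within_cutoff(freq_percent, cutoffs=INHERITANCE_MODE_MAX_MAF):
    """Return the modes whose maximum MAF is not exceeded by freq_percent."""
    # Inclusive comparison is an assumption of this sketch.
    return [mode for mode, max_maf in cutoffs.items() if freq_percent <= max_maf]


print(modes_within_cutoff(0.5))   # a 0.5% allele only fits the COMP_HET modes
print(modes_within_cutoff(0.05))  # a rare allele fits every mode
```

With an empty map (`inheritanceModes: {}`) there is nothing to compare against, so no frequency cut-off is applied per mode.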
#FULL or PASS_ONLY
#PASS_ONLY keeps only the variants that pass the filters
analysisMode: PASS_ONLY
- Database sources for variant frequency annotation
# Possible frequencySources:
# Thousand Genomes project - http://www.1000genomes.org/ (THOUSAND_GENOMES)
# TOPMed - https://www.nhlbi.nih.gov/science/precision-medicine-activities (TOPMED)
# UK10K - http://www.uk10k.org/ (UK10K)
# ESP project - http://evs.gs.washington.edu/EVS/ (ESP_)
# ESP_AFRICAN_AMERICAN, ESP_EUROPEAN_AMERICAN, ESP_ALL,
# ExAC project http://exac.broadinstitute.org/about (EXAC_)
# EXAC_AFRICAN_INC_AFRICAN_AMERICAN, EXAC_AMERICAN,
# EXAC_SOUTH_ASIAN, EXAC_EAST_ASIAN,
# EXAC_FINNISH, EXAC_NON_FINNISH_EUROPEAN,
# EXAC_OTHER
# gnomAD - http://gnomad.broadinstitute.org/ (GNOMAD_E, GNOMAD_G)
frequencySources: [
THOUSAND_GENOMES,
TOPMED,
UK10K,
ESP_AFRICAN_AMERICAN, ESP_EUROPEAN_AMERICAN, ESP_ALL,
EXAC_AFRICAN_INC_AFRICAN_AMERICAN, EXAC_AMERICAN,
EXAC_SOUTH_ASIAN, EXAC_EAST_ASIAN,
EXAC_FINNISH, EXAC_NON_FINNISH_EUROPEAN,
EXAC_OTHER,
GNOMAD_E_AFR,
GNOMAD_E_AMR,
# GNOMAD_E_ASJ,
GNOMAD_E_EAS,
GNOMAD_E_FIN,
GNOMAD_E_NFE,
GNOMAD_E_OTH,
GNOMAD_E_SAS,
GNOMAD_G_AFR,
GNOMAD_G_AMR,
# GNOMAD_G_ASJ,
GNOMAD_G_EAS,
GNOMAD_G_FIN,
GNOMAD_G_NFE,
GNOMAD_G_OTH,
GNOMAD_G_SAS
]
- Pathogenicity database sources
# Possible pathogenicitySources: (POLYPHEN, MUTATION_TASTER, SIFT), (REVEL, MVP), CADD, REMM
# REMM is trained on non-coding regulatory regions
# *WARNING* if you enable CADD or REMM ensure that you have downloaded and installed the CADD/REMM tabix files
# and updated their location in the application.properties. Exomiser will not run without this.
pathogenicitySources: [ REVEL, MVP ]
This is the standard Exomiser order; all steps are optional.
Filter by chromosomal interval: intervalFilter
Filter by quality: qualityFilter
Filter by variant effect [INTERGENIC_VARIANT, ...]: variantEffectFilter
Filter out known variants: knownVariantFilter
Filter by MAF: frequencyFilter
...
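One way to picture the filter steps listed above is as a chain of keep/drop tests applied in order. The sketch below is a toy model, not Exomiser code: the variant dictionaries, their field names and the reading of failedVariantFilter as "FILTER column is PASS or ." are all assumptions. Only the step names, the removed effects (a subset) and the 2.0% frequency limit are taken from this template.

```python
# Subset of the effects removed by variantEffectFilter in this template.
REMOVED_EFFECTS = {"INTERGENIC_VARIANT", "UPSTREAM_GENE_VARIANT", "DOWNSTREAM_GENE_VARIANT"}
MAX_FREQUENCY = 2.0  # percent, as in frequencyFilter: {maxFrequency: 2.0}


def run_steps(variants, steps):
    """Apply each (name, keep) step in order; return survivors and per-step removal counts."""
    removed = {}
    for name, keep in steps:
        survivors = [v for v in variants if keep(v)]
        removed[name] = len(variants) - len(survivors)
        variants = survivors
    return variants, removed


# Hypothetical variants with made-up field names.
variants = [
    {"id": "v1", "vcf_filter": "PASS", "effect": "MISSENSE_VARIANT", "max_freq": 0.01},
    {"id": "v2", "vcf_filter": "LowQual", "effect": "MISSENSE_VARIANT", "max_freq": 0.0},
    {"id": "v3", "vcf_filter": "PASS", "effect": "INTERGENIC_VARIANT", "max_freq": 0.0},
    {"id": "v4", "vcf_filter": "PASS", "effect": "STOP_GAINED", "max_freq": 5.0},
]

steps = [
    ("failedVariantFilter", lambda v: v["vcf_filter"] in ("PASS", ".")),
    ("variantEffectFilter", lambda v: v["effect"] not in REMOVED_EFFECTS),
    ("frequencyFilter", lambda v: v["max_freq"] <= MAX_FREQUENCY),
]

kept, removed = run_steps(variants, steps)
print([v["id"] for v in kept])
print(removed)
```

Dropping failing variants at each step corresponds to the PASS_ONLY analysis mode; this sketch does not model FULL mode.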
steps: [
#intervalFilter: {interval: 'chr10:123256200-123256300'},
# or for multiple intervals:
#intervalFilter: {intervals: ['chr10:123256200-123256300', 'chr10:123256290-123256350']},
# or using a BED file - NOTE this should be 0-based, Exomiser otherwise uses 1-based coordinates in line with VCF
#intervalFilter: {bed: /full/path/to/bed_file.bed},
#genePanelFilter: {geneSymbols: ['FGFR1','FGFR2']},
failedVariantFilter: { },
#qualityFilter: {minQuality: 50.0},
variantEffectFilter: {
remove: [
FIVE_PRIME_UTR_EXON_VARIANT,
FIVE_PRIME_UTR_INTRON_VARIANT,
THREE_PRIME_UTR_EXON_VARIANT,
THREE_PRIME_UTR_INTRON_VARIANT,
NON_CODING_TRANSCRIPT_EXON_VARIANT,
NON_CODING_TRANSCRIPT_INTRON_VARIANT,
CODING_TRANSCRIPT_INTRON_VARIANT,
UPSTREAM_GENE_VARIANT,
DOWNSTREAM_GENE_VARIANT,
INTERGENIC_VARIANT,
REGULATORY_REGION_VARIANT
]
},
#knownVariantFilter: {}, #removes variants represented in the database
frequencyFilter: {maxFrequency: 2.0},
pathogenicityFilter: {keepNonPathogenic: true},
#inheritanceFilter and omimPrioritiser should always run AFTER all other filters have completed.
#They will analyse genes according to the inheritanceModes specified above; UNDEFINED will not be analysed.
inheritanceFilter: {},
#omimPrioritiser isn't mandatory.
omimPrioritiser: {},
#priorityScoreFilter: {minPriorityScore: 0.4},
#Other prioritisers: Only combine omimPrioritiser with one of these.
#Don't include any if you only want to filter the variants.
hiPhivePrioritiser: {},
# or run hiPhive in benchmarking mode:
#hiPhivePrioritiser: {runParams: 'mouse'},
#phivePrioritiser: {}
#phenixPrioritiser: {}
#exomeWalkerPrioritiser: {seedGeneIds: [11111, 22222, 33333]}
]
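The intervalFilter comments in the steps above point out that BED files are 0-based while Exomiser's own intervals are 1-based, as in VCF. A BED record has a 0-based start and an exclusive end, so converting means adding 1 to the start and leaving the end alone. A short sketch of the conversion in both directions follows; the function names are illustrative.

```python
def bed_to_interval(bed_line):
    """Convert 'chrom<ws>start<ws>end' (0-based, end-exclusive) to 'chrom:start-end' (1-based, inclusive)."""
    chrom, start, end = bed_line.split()[:3]
    return f"{chrom}:{int(start) + 1}-{int(end)}"


def interval_to_bed(interval):
    """Convert 'chrom:start-end' (1-based, inclusive) to a tab-separated BED line."""
    chrom, span = interval.split(":")
    start, end = span.split("-")
    return f"{chrom}\t{int(start) - 1}\t{int(end)}"


# The interval used in the commented-out intervalFilter example.
print(interval_to_bed("chr10:123256200-123256300"))
print(bed_to_interval("chr10\t123256199\t123256300"))
```

Forgetting this shift moves the start of every region by one base, which matters for variants at the region boundary.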
- Output options
outputOptions:
outputContributingVariantsOnly: false
#numGenes options: 0 = all or specify a limit e.g. 500 for the first 500 results
numGenes: 0
# Path to the desired output directory. Will default to the 'results' subdirectory of the exomiser install directory
#outputDirectory: results
# Filename for the output files. Will default to {input-vcf-filename}-exomiser
outputFileName: Pfeiffer-hiphive-exome-PASS_ONLY
#out-format options: HTML, JSON, TSV_GENE, TSV_VARIANT, VCF (default: HTML)
outputFormats: [HTML, JSON, TSV_GENE, TSV_VARIANT, VCF]