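Once we have the genomic coordinates of a variant, we often want its transcript (cDNA) or amino-acid coordinates so that it can be compared with known variants. transvar makes this coordinate conversion easy.
Some examples follow: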
Deletion
Suppose we have the following variant:
Single-base deletion
13 32954022 CA C
That is, an A is deleted at position 32954023 on chromosome 13.
It can be queried with either of the following commands:
transvar ganno -i 'chr13:g.32954022_32954023delinsC' --ucsc
or
transvar ganno -i 'chr13:g.32954023del' --ucsc
The query results are as follows:
chr13:g.32954022_32954023delinsC NM_000059 (protein_coding) BRCA2 + chr13:g.32954030delA/c.9097delA/p.T3033Lfs*29 inside_[cds_in_exon_23] CSQN=Frameshift;left_align_gDNA=g.32954023delA;unaligned_gDNA=g.32954023delA;left_align_cDNA=c.9090delA;unalign_cDNA=c.9090delA;source=UCSCRefGene
chr13:g.32954023del NM_000059 (protein_coding) BRCA2 + chr13:g.32954030delA/c.9097delA/p.T3033Lfs*29 inside_[cds_in_exon_23] CSQN=Frameshift;left_align_gDNA=g.32954023delA;unaligned_gDNA=g.32954023delA;left_align_cDNA=c.9090delA;unalign_cDNA=c.9090delA;source=UCSCRefGene
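One thing worth noticing in this output: the query was g.32954023del, but the reported coordinate is g.32954030delA, while the left_align_gDNA field keeps g.32954023delA. The most likely reason is that the deleted A sits in a run of A's and transvar reports the 3'-most equivalent position; this is an interpretation, not something the output states. To compare the aligned and unaligned forms, the last column can be split into key/value pairs. The sketch below is a minimal helper, assuming the field is always a `;`-separated list of `key=value` items as in the rows above; `parse_info` is a hypothetical name, not part of transvar.

```python
def parse_info(info):
    """Split a transvar info string such as 'k1=v1;k2=v2' into a dict.

    Assumption: items are separated by ';' and each item is 'key=value'.
    """
    fields = {}
    for item in info.split(";"):
        if not item:
            continue  # tolerate a trailing ';'
        key, _, value = item.partition("=")
        fields[key] = value
    return fields


# Info column copied from the single-base deletion result above.
info = (
    "CSQN=Frameshift;left_align_gDNA=g.32954023delA;"
    "unaligned_gDNA=g.32954023delA;left_align_cDNA=c.9090delA;"
    "unalign_cDNA=c.9090delA;source=UCSCRefGene"
)
fields = parse_info(info)
print(fields["CSQN"])             # Frameshift
print(fields["left_align_gDNA"])  # g.32954023delA
```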
Multi-base deletion
13 32912089 CTG C
transvar ganno -i 'chr13:g.32912089_32912091delinsC' --ucsc
or
transvar ganno -i 'chr13:g.32912090_32912091del' --ucsc
Insertion
13 32937354 T TA
transvar ganno -i 'chr13:g.32937354_32937354delinsTA' --ucsc
or
transvar ganno -i 'chr13:g.32937354_32937355insA' --ucsc
SNV
17 41246245 C A
transvar ganno -i 'chr17:g.41246245C>A' --ucsc
or
transvar ganno -i 'chr17:g.41246245_41246245delinsA' --ucsc
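The conversions above, from a VCF-style record (CHROM, POS, REF, ALT) to the identifier passed to `transvar ganno -i`, can be sketched as a small helper. This is only an illustration under a few assumptions: the record is left-anchored the way VCF indels usually are (REF and ALT share their leading base), chromosome names take a `chr` prefix to match `--ucsc`, and anything that is not a plain SNV, deletion, or insertion falls back to the delins form. `vcf_to_hgvs_g` is a hypothetical name, not a transvar function.

```python
def vcf_to_hgvs_g(chrom, pos, ref, alt):
    """Build an HGVS-style genomic identifier from one VCF-style record."""
    chrom = chrom if chrom.startswith("chr") else "chr" + chrom
    ref, alt = ref.upper(), alt.upper()

    # SNV: a single base replaced by a single base.
    if len(ref) == 1 and len(alt) == 1:
        return f"{chrom}:g.{pos}{ref}>{alt}"

    # Strip the leading bases shared by REF and ALT (the VCF anchor).
    shared = 0
    while shared < min(len(ref), len(alt)) and ref[shared] == alt[shared]:
        shared += 1
    ref_rest, alt_rest = ref[shared:], alt[shared:]
    start = pos + shared  # first position after the shared prefix

    if ref_rest and not alt_rest:
        # Pure deletion of ref_rest.
        end = start + len(ref_rest) - 1
        if start == end:
            return f"{chrom}:g.{start}del"
        return f"{chrom}:g.{start}_{end}del"

    if alt_rest and not ref_rest:
        # Pure insertion between the last shared base and the next one.
        return f"{chrom}:g.{start - 1}_{start}ins{alt_rest}"

    # Fallback: replace the whole REF span with ALT.
    end = pos + len(ref) - 1
    return f"{chrom}:g.{pos}_{end}delins{alt}"


print(vcf_to_hgvs_g("13", 32954022, "CA", "C"))   # chr13:g.32954023del
print(vcf_to_hgvs_g("13", 32912089, "CTG", "C"))  # chr13:g.32912090_32912091del
print(vcf_to_hgvs_g("13", 32937354, "T", "TA"))   # chr13:g.32937354_32937355insA
print(vcf_to_hgvs_g("17", 41246245, "C", "A"))    # chr17:g.41246245C>A
```

For the two-base deletion the helper emits the range form, so that both deleted bases (32912090 and 32912091) are covered.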
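Batch conversion
Create a sites.txt file like the following; the fifth column (id) holds the variant identifier, which is the column selected by -m 5 in the command below: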
CHROM POS REF ALT id
1 46714092 C T chr1:g.46714092C>T
1 46714198 TC T chr1:g.46714199del
1 46714231 C T chr1:g.46714231C>T
1 46714263 G A chr1:g.46714263G>A
1 46714267 A G chr1:g.46714267A>G
1 46714272 T C chr1:g.46714272T>C
1 46714273 G A chr1:g.46714273G>A
1 46714274 A T chr1:g.46714274A>T
1 46714275 G T chr1:g.46714275G>T
transvar ganno -l sites.txt -m 5 --ucsc > transvar_result.bed
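The result file looks like the following (note that these example rows are for sites other than those in the sites.txt shown above):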
chr2:g.47607092C>A NM_002354 (protein_coding) EPCAM + chr2:g.47607092C>A/c.842C>A/p.A281D inside_[cds_in_exon_7] CSQN=Missense;codon_pos=47607091-47607092-47607093;ref_codon_seq=GCT;source=UCSCRefGene
chr22:g.29095861T>C NM_001005735 (protein_coding) CHEK2 - chr22:g.29095861T>C/c.1102A>G/p.K368E inside_[cds_in_exon_10] CSQN=Missense;codon_pos=29095859-29095860-29095861;ref_codon_seq=AAG;source=UCSCRefGene
chr22:g.29095861T>C NM_001257387 (protein_coding) CHEK2 - chr22:g.29095861T>C/c.310A>G/p.K104E inside_[cds_in_exon_10] CSQN=Missense;codon_pos=29095859-29095860-29095861;ref_codon_seq=AAG;source=UCSCRefGene
chr22:g.29095861T>C NM_007194 (protein_coding) CHEK2 - chr22:g.29095861T>C/c.973A>G/p.K325E inside_[cds_in_exon_9] CSQN=Missense;codon_pos=29095859-29095860-29095861;ref_codon_seq=AAG;source=UCSCRefGene
chr22:g.29095861T>C NM_145862 (protein_coding) CHEK2 - chr22:g.29095861T>C/c.973A>G/p.K325E inside_[cds_in_exon_9] CSQN=Missense;codon_pos=29095859-29095860-29095861;ref_codon_seq=AAG;source=UCSCRefGene
chr11:g.108200964A>T NM_000051 (protein_coding) ATM + chr11:g.108200964A>T/c.7331A>T/p.E2444V inside_[cds_in_exon_50] CSQN=Missense;codon_pos=108200963-108200964-108200965;ref_codon_seq=GAG;source=UCSCRefGene
chr17:g.59761513A>G NM_032043 (protein_coding) BRIP1 - chr17:g.59761513A>G/c.2906-12T>C/. inside_[intron_between_exon_19_and_20] CSQN=IntronicSNV;source=UCSCRefGene
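To work with these rows downstream, each one can be split into its columns and the combined gDNA/cDNA/protein field broken apart. The sketch below assumes the real output file is tab-separated with seven columns (input, transcript, gene, strand, coordinates, region, info) and that a missing protein change is written as `.`, as in the intronic BRIP1 row. `Row` and `parse_row` are hypothetical names introduced for illustration.

```python
from collections import namedtuple

Row = namedtuple(
    "Row", "query transcript gene strand gdna cdna protein region info"
)


def parse_row(line):
    """Parse one transvar result line (assumed tab-separated, 7 columns)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 7:
        raise ValueError(f"expected 7 tab-separated columns, got {len(cols)}")
    query, transcript, gene, strand, coords, region, info = cols

    # The coordinates column looks like 'gDNA/cDNA/protein'; '.' means absent.
    parts = coords.split("/")
    parts += ["."] * (3 - len(parts))
    gdna, cdna, protein = [None if p == "." else p for p in parts[:3]]
    return Row(query, transcript, gene, strand, gdna, cdna, protein, region, info)


# The intronic BRIP1 row from above, rebuilt with tabs between columns.
example = "\t".join([
    "chr17:g.59761513A>G",
    "NM_032043 (protein_coding)",
    "BRIP1",
    "-",
    "chr17:g.59761513A>G/c.2906-12T>C/.",
    "inside_[intron_between_exon_19_and_20]",
    "CSQN=IntronicSNV;source=UCSCRefGene",
])
row = parse_row(example)
print(row.gene, row.cdna, row.protein)  # BRIP1 c.2906-12T>C None
```

Because one site can map to several transcripts (the CHEK2 site above yields four rows), keeping the transcript column alongside the cDNA and protein changes avoids mixing up coordinates from different transcripts.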