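I have been reviewing HGVS nomenclature lately, so treat this as my personal translation notes.
HGVS stands for Human Genome Variation Society.
This week's translation covers part two, Duplication. The original is at http://varnomen.hgvs.org/recommendations/DNA/variant/Duplication/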
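[Duplication]
Definition: compared with the reference sequence, a copy of one or more nucleotides is inserted directly downstream (3') of the original copy.
Format: prefix (the type of reference sequence used) + "." + duplicated position (or range) + "dup", e.g. g.123_345dup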
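Key points:
① Accepted reference sequence prefixes are g., m., c. and n.
② The positions before and after "_" in a range must be two different positions, e.g. 123_126, not 123_123.
③ The positions before and after "_" must be listed in 5' to 3' order, e.g. 123_126, not 126_123.
④ By definition, dup may only be used when the additional copy sits directly 3' of the original copy; if it does not follow directly, describe the change as an insertion. Inverted duplications are also described as insertions, e.g. g.234_235ins123_234inv.
⑤ When several copies are arranged in tandem directly 3' of the original copy, the Repeated sequences format can be used, e.g. [3] (triplication), [4] (quadruplication).
⑥ The 3' rule applies, with exceptions (e.g. when the duplication lies at an exon/intron boundary and the bases on both sides of the junction are identical).
⑦ [Under discussion] Curly braces "{ }" can be used to describe any change within the duplicated copy relative to the original sequence, e.g. g.123_345dup{234A>G}.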
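Points ① to ③ are mechanical enough to check in code. The following is a minimal sketch, not an official validator: `check_simple_dup` and `SIMPLE_DUP` are hypothetical names, and the pattern only covers plain numeric positions (no intronic offsets, uncertain ranges or "{ }" suffixes).

```python
import re

# Hypothetical pattern: prefix g/m/c/n, ".", one position or a range, "dup".
SIMPLE_DUP = re.compile(r"^([gmcn])\.(\d+)(?:_(\d+))?dup$")


def check_simple_dup(description):
    """Return a list of problems; an empty list means points 1-3 are satisfied."""
    match = SIMPLE_DUP.match(description)
    if match is None:
        return ["not a simple g./m./c./n. duplication"]
    problems = []
    start = int(match.group(2))
    if match.group(3) is not None:
        end = int(match.group(3))
        if start == end:
            # Point 2: a range needs two different positions.
            problems.append("range must use two different positions")
        elif start > end:
            # Point 3: positions are listed 5' to 3'.
            problems.append("range must run 5' to 3'")
    return problems
```

For instance, `check_simple_dup("g.126_123dup")` reports the 5' to 3' ordering problem, while `check_simple_dup("g.123_345dup")` returns an empty list.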
Examples:
g.7dup (one nucleotide)
the duplication of a T at position g.7 in the sequence ACTTACTGCC to ACTTACTTGCC
NOTE: it is allowed to describe the variant as g.7dupT
NOTE: it is not allowed to describe the variant as g.6_7insT (see prioritisation); duplication takes priority over insertion
g.6_8dup (several nucleotides)
a duplication from position g.6 to g.8 in the sequence ACAATTGCC to ACAATTGCTGCC
NOTE: it is allowed to describe the variant as g.6_8dupTGC
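The two examples above can be reproduced with a few lines of code. This is a minimal sketch assuming plain 1-based positions on a linear sequence; `apply_dup` is a hypothetical helper, not part of any HGVS tool.

```python
def apply_dup(sequence, start, end=None):
    """Return the sequence after duplicating positions start..end (1-based, inclusive).

    The copy is placed directly 3' of the original, as the definition requires.
    """
    if end is None:
        end = start  # single-nucleotide duplication such as g.7dup
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("positions must lie within the sequence, 5' to 3'")
    copied = sequence[start - 1:end]
    return sequence[:end] + copied + sequence[end:]


print(apply_dup("ACTTACTGCC", 7))    # g.7dup
print(apply_dup("ACAATTGCC", 6, 8))  # g.6_8dup
```

The two calls print ACTTACTTGCC and ACAATTGCTGCC, matching the variant sequences given in the examples.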
c.120_123+48dup
a duplication of nucleotides c.120 to c.123+48 (coding DNA reference sequence), crossing an exon/intron border
c.123dup
based on the sequence of a genomic DNA sample, a duplication of the A nucleotide c.123 in the sequence CAAgt…/..agAAG to CAAAgt…/..agAAG, i.e. the duplication of the last nucleotide of an exon (see Question below)
NOTE: when RNA is sequenced and the variant does not alter splicing, the description at the RNA level based on a coding RNA reference sequence is r.125dup (the 3' rule still needs to be applied)
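Why the RNA-level description lands on position 125 can be shown with a small shifting routine. This is a sketch under assumptions: `shift_dup_3prime` is a hypothetical helper, and the toy transcript uses 120 filler bases so that the exon-exon junction CAA/AAG sits at positions 121 to 126; the real transcript sequence is not given in the source.

```python
def shift_dup_3prime(sequence, start, end=None):
    """Shift a duplication of positions start..end (1-based) as far 3' as possible.

    Duplicating start..end and duplicating start+1..end+1 give the same variant
    sequence exactly when the base after the range equals the first base in it.
    """
    if end is None:
        end = start
    while end < len(sequence) and sequence[end] == sequence[start - 1]:
        start += 1
        end += 1
    return start, end


# Toy spliced transcript (assumption): 120 filler bases, then CAA|AAG.
transcript = "G" * 120 + "CAAAAG"
print(shift_dup_3prime(transcript, 123))
```

On the spliced transcript the A at position 123 is followed by two more A's, so the routine moves the duplication to position 125. On genomic DNA the intron (gt...) follows c.123 directly, so no shift is possible there.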
c.4072-1234_5146-246dup
a duplication of nucleotides c.4072-1234 to c.5146-246 duplicating exon 30 (starting at position c.4072) to exon 36 (ending at position c.5145) of the DMD-gene.
NOTE: the size of the duplication should not be appended to the description, as in c.4072-1234_5146-246dupXXXXX (where XXXXX would be the size)
c.(4071+1_4072-1)_(5145+1_5146-1)dup
a duplication of exon 30 (starting at position c.4072) to exon 36 (ending at position c.5145) of the DMD-gene. The duplication break point has not been sequenced. Exons 29 (ending at c.4071) and 37 (starting at nucleotide c.5146) have been tested and shown to be not duplicated. The duplication therefore starts in intron 29 (position c.4071+1 to c.4072-1) and ends in intron 36 (position c.5145+1 to c.5146-1).
In other words, the first breakpoint lies in intron 29 and the second in intron 36; the exact positions are unknown, so each is given as a range covering the whole intron.
NOTE : this description is part of proposal SVD-WG003 (undecided).
NOTE: previously, the suggestion was made to describe such duplications using the format c.4072-?_5154+?dup. However, since c.4072-? indicates "to an unknown position 5' of c.4072" and c.5154+? "to an unknown position 3' of c.5154", this description is not correct when it is known that exons 29 and 37 are not involved.
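The uncertain-breakpoint form is built entirely from the four exon boundaries, which a short sketch can make explicit. `uncertain_exon_dup` is a hypothetical helper that assumes positive coding positions only, and the format it emits is the proposed (undecided) SVD-WG003 notation quoted above.

```python
def uncertain_exon_dup(prev_exon_end, first_dup_start, last_dup_end, next_exon_start):
    """Build a c. description for an exon duplication with unsequenced breakpoints.

    prev_exon_end   : last coding position of the upstream exon shown NOT duplicated
    first_dup_start : first coding position of the first duplicated exon
    last_dup_end    : last coding position of the last duplicated exon
    next_exon_start : first coding position of the downstream exon shown NOT duplicated
    """
    # Each breakpoint is only known to lie somewhere in the flanking intron.
    five_prime = f"({prev_exon_end}+1_{first_dup_start}-1)"
    three_prime = f"({last_dup_end}+1_{next_exon_start}-1)"
    return f"c.{five_prime}_{three_prime}dup"


print(uncertain_exon_dup(4071, 4072, 5145, 5146))
```

With the exon 29/30 and 36/37 boundaries from the example, this prints c.(4071+1_4072-1)_(5145+1_5146-1)dup.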
c.(4071+1_4072-1)_(5145+1_5146-1)[3]
a triplication of exon 30 (starting at position c.4072) to exon 36 (ending at position c.5145) of the DMD-gene (break points not sequenced).
NOTE: this description should only be used when the two additional copies are in tandem with the original copy. When they are not, there is no specific recommendation yet on how to describe the change, but following current recommendations the format would be something like c.?ins(4071+1_4072-1)_(5145+1_5146-1)[2] ([2] since 2 additional copies have been inserted somewhere in the genome).
c.(?_-30)_(12+1_13-1)dup
a duplication starting somewhere upstream of a gene, last position tested duplicated c.-29, and ending in the intron between nucleotides c.12+1 and c.13-1 (intron 1).
c.(?_-1)_(*1_?)dup
a duplication of the entire protein coding region of a gene based on a coding DNA reference sequence.
NOTE: when more details are available regarding the duplication, based on the probes tested to determine its location, the description can be specified like c.(?_-189)_(*884_?)dup, meaning the duplication starts 5’ of c.-189 and extends 3’ of c.*884.
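[Question]
How should the change from ATCGATCGATCGATCGAGGGTCCC to ATCGATCGATCGATCGAATCGATCGATCGGGGTCCC be described? Can dup be used?
Answer: by definition, a dup must sit directly 3' of the original copy. Here an "A" (position 17) separates the original copy (positions 5 to 16) from the new copy, so the definition is not met and dup cannot be used. The change should be described as g.17_18ins5_16.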
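The dup-versus-ins decision in this answer can be sketched in code. `describe_inserted_copy` is a hypothetical helper; it assumes the insertion point is already given at its most 3' position, and choosing the nearest upstream copy as the source of the inserted sequence is also an assumption made for illustration.

```python
def describe_inserted_copy(reference, after_pos, inserted):
    """Describe extra sequence placed after 1-based position after_pos.

    Returns a dup description only when the inserted bases equal the
    reference bases that end exactly at after_pos (a direct tandem copy).
    """
    n = len(inserted)
    start = after_pos - n + 1
    if start >= 1 and reference[start - 1:after_pos] == inserted:
        return f"g.{after_pos}dup" if n == 1 else f"g.{start}_{after_pos}dup"
    # Not directly 3' of an identical copy: describe as an insertion,
    # pointing at the nearest upstream copy if one exists (assumption).
    idx = reference.rfind(inserted, 0, after_pos)
    if idx != -1:
        return f"g.{after_pos}_{after_pos + 1}ins{idx + 1}_{idx + n}"
    return f"g.{after_pos}_{after_pos + 1}ins{inserted}"


print(describe_inserted_copy("ATCGATCGATCGATCGAGGGTCCC", 17, "ATCGATCGATCG"))
```

For the sequence in the question this prints g.17_18ins5_16, because positions 6 to 17 (TCGATCGATCGA) do not match the inserted ATCGATCGATCG. For the earlier examples, a T placed after position 7 of ACTTACTGCC yields g.7dup.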
For more, follow the WeChat public account 何艺新筛手札.