Introduction to the Genome Annotation File Formats (GTF/GFF)

Author: bioinfo2011 | Published: 2017-10-31 11:33

    An introduction to the GTF/GFF formats used for genome annotation files

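    The formats developed in the order GFF 2 -> GTF -> GFF 3. The Ensembl documentation describes the GTF (General Transfer Format) as identical to GFF version 2, so for practical purposes GTF is GFF version 2.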

    Each line has 9 tab-separated columns (each number below is a column number):

    1. seqname - name of the chromosome or scaffold; chromosome names can be given with or without the 'chr' prefix.

    2. source - name of the program that generated this feature, or the data source (database or project name).

    3. feature - feature type name (transcript, exon, intron, etc.), e.g. Gene, Variation, Similarity.

    4. start - start position of the feature, with sequence numbering starting at 1.

    5. end - end position of the feature, with sequence numbering starting at 1.

    6. score - a floating point value.

    7. strand - defined as + (forward) or - (reverse).

    8. frame - one of '0', '1' or '2'. '0' indicates that the first base of the feature is the first base of a codon, '1' that the second base is the first base of a codon, and so on.

    9. attribute - a semicolon-separated list of tag-value pairs providing additional information about each feature (for example, the encoded protein).
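
The nine columns above map naturally onto a small record type. The following is a minimal Python sketch, not a reference parser. The names `GtfRecord` and `parse_gtf_line` are hypothetical, the input line is made up for illustration, and the code assumes that columns are tab-separated, that `.` marks an empty value, and that start and end are both inclusive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GtfRecord:
    """One annotation line, with one field per column (hypothetical type)."""
    seqname: str
    source: str
    feature: str
    start: int               # 1-based
    end: int                 # 1-based, assumed inclusive
    score: Optional[float]   # None when the column is "."
    strand: Optional[str]    # "+", "-", or None when the column is "."
    frame: Optional[int]     # 0, 1, 2, or None when the column is "."
    attribute: str           # kept as the raw string in this sketch

    @property
    def length(self) -> int:
        # With 1-based inclusive coordinates the length is end - start + 1.
        return self.end - self.start + 1


def parse_gtf_line(line: str) -> GtfRecord:
    """Split one tab-separated line into its 9 columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        raise ValueError(f"expected 9 tab-separated columns, got {len(fields)}")
    seqname, source, feature, start, end, score, strand, frame, attribute = fields
    return GtfRecord(
        seqname=seqname,
        source=source,
        feature=feature,
        start=int(start),
        end=int(end),
        score=None if score == "." else float(score),
        strand=None if strand == "." else strand,
        frame=None if frame == "." else int(frame),
        attribute=attribute,
    )


# Made-up exon line, for illustration only.
line = "\t".join(
    ["chr1", "example_source", "exon", "100", "250", ".", "+", "0", 'gene_id "G1";']
)
record = parse_gtf_line(line)
print(record.feature, record.start, record.end, record.length)  # exon 100 250 151
```

Converting start and end to integers up front makes coordinate arithmetic such as `length` straightforward, while leaving the attribute column unparsed keeps the sketch independent of any particular attribute syntax.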

    Example:

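    transcribed_pseudogene ------> gene ------> 11869 ------> 14409 ------> . ------> + ------> . ------> gene_id "ENSG00000223972"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";

    (Each ------> stands for a tab. The line as shown has only 8 fields: it starts at column 2, source, and the seqname column is not shown.)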
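
The attribute column of the example can be split into its tag-value pairs. Below is a minimal Python sketch, assuming every value is wrapped in double quotes and contains no quotes itself, as in the line above; `parse_attributes` is a hypothetical helper, not part of any library.

```python
import re

# Matches one pair such as: gene_id "ENSG00000223972";
_PAIR = re.compile(r'(\S+)\s+"([^"]*)"\s*;')


def parse_attributes(attribute: str) -> dict:
    """Return the tag-value pairs of an attribute column as a dict.

    If a tag occurs more than once, the last value wins in this sketch.
    """
    return {tag: value for tag, value in _PAIR.findall(attribute)}


attribute = (
    'gene_id "ENSG00000223972"; gene_name "DDX11L1"; '
    'gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene";'
)
attrs = parse_attributes(attribute)
print(attrs["gene_id"])    # ENSG00000223972
print(attrs["gene_name"])  # DDX11L1
```

A regular expression is used instead of a plain split on `;` so that the surrounding quotes are stripped in the same step; unquoted values would need a different pattern.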

    References:

    https://www.biostars.org/p/99462/

    http://www.ensembl.org/info/website/upload/gff.html
