Determining Which Illumina Sequencing Platform Produced Your NGS Data


Author: wo_monic | Published: 2020-12-04 21:10

    https://raw.githubusercontent.com/10XGenomics/supernova/master/tenkit/lib/python/tenkit/illumina_instrument.py
    For the latest classification, consult the file at the link above.

    Instrument ID pattern    Sequencing platform
    HWI-M[0-9]{4}$           MiSeq
    HWUSI                    Genome Analyzer IIx
    M[0-9]{5}$               MiSeq
    HWI-C[0-9]{5}$           HiSeq 1500
    C[0-9]{5}$               HiSeq 1500
    HWI-D[0-9]{5}$           HiSeq 2500
    D[0-9]{5}$               HiSeq 2500
    J[0-9]{5}$               HiSeq 3000
    K[0-9]{5}$               HiSeq 3000 (rarely used now), HiSeq 4000
    E[0-9]{5}$               HiSeq X
    NB[0-9]{6}$              NextSeq
    NS[0-9]{6}$              NextSeq
    MN[0-9]{5}$              MiniSeq
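
The table above can be applied mechanically. The following is a minimal Python sketch, not the code from the linked file: the dictionary copies the patterns from the table, while the function name `platforms_from_instrument` and the use of `re.match` are assumptions made for illustration.

```python
import re

# Patterns copied from the table above; each maps to the platforms it may indicate.
INSTRUMENT_IDS = {
    r"HWI-M[0-9]{4}$": ["MiSeq"],
    r"HWUSI": ["Genome Analyzer IIx"],
    r"M[0-9]{5}$": ["MiSeq"],
    r"HWI-C[0-9]{5}$": ["HiSeq 1500"],
    r"C[0-9]{5}$": ["HiSeq 1500"],
    r"HWI-D[0-9]{5}$": ["HiSeq 2500"],
    r"D[0-9]{5}$": ["HiSeq 2500"],
    r"J[0-9]{5}$": ["HiSeq 3000"],
    r"K[0-9]{5}$": ["HiSeq 3000", "HiSeq 4000"],
    r"E[0-9]{5}$": ["HiSeq X"],
    r"NB[0-9]{6}$": ["NextSeq"],
    r"NS[0-9]{6}$": ["NextSeq"],
    r"MN[0-9]{5}$": ["MiniSeq"],
}


def platforms_from_instrument(instrument_id):
    """Return the candidate platforms for an instrument ID, or [] if none match."""
    for pattern, platforms in INSTRUMENT_IDS.items():
        # re.match anchors at the start of the ID, so a pattern ending in "$"
        # has to cover the whole ID, while "HWUSI" acts as a plain prefix.
        if re.match(pattern, instrument_id):
            return platforms
    return []


# Instrument ID taken from the first field of a read header.
print(platforms_from_instrument("E00552"))
```

An empty list means the ID is not covered by this table; in that case the linked file is the place to look.
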
    Flow cell ID classification
            {"C[A-Z,0-9]{4}ANXX$" : (["HiSeq 1500", "HiSeq 2000", "HiSeq 2500"], "High Output (8-lane) v4 flow cell"),
             "C[A-Z,0-9]{4}ACXX$" : (["HiSeq 1000", "HiSeq 1500", "HiSeq 2000", "HiSeq 2500"], "High Output (8-lane) v3 flow cell"),
             "H[A-Z,0-9]{4}ADXX$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v1 flow cell"),
             "H[A-Z,0-9]{4}BCXX$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v2 flow cell"),
             "H[A-Z,0-9]{4}BCXY$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v2 flow cell"),
             "H[A-Z,0-9]{4}BBXX$" : (["HiSeq 4000"], "(8-lane) v1 flow cell"),
             "H[A-Z,0-9]{4}BBXY$" : (["HiSeq 4000"], "(8-lane) v1 flow cell"),
             "H[A-Z,0-9]{4}CCXX$" : (["HiSeq X"], "(8-lane) flow cell"),
             "H[A-Z,0-9]{4}CCXY$" : (["HiSeq X"], "(8-lane) flow cell"),
             "H[A-Z,0-9]{4}ALXX$" : (["HiSeq X"], "(8-lane) flow cell"),
             "H[A-Z,0-9]{4}BGXX$" : (["NextSeq"], "High output flow cell"),
             "H[A-Z,0-9]{4}BGXY$" : (["NextSeq"], "High output flow cell"),
             "H[A-Z,0-9]{4}BGX2$" : (["NextSeq"], "High output flow cell"),
             "H[A-Z,0-9]{4}AFXX$" : (["NextSeq"], "Mid output flow cell"),
             "A[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq flow cell"),
             "B[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq flow cell"),
             "D[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq nano flow cell"),
             "G[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq micro flow cell"),
             "H[A-Z,0-9]{4}DMXX$" : (["NovaSeq"], "S2 flow cell")}
    

    Example 1: use zless to view the raw sequencing file.
    zless sample.fastq.gz | head -5

    @E00552:40:H23NGCCXY:5:1101:1154:1520 1:N:0:NCAGTG
    NTTTGCTAAACGGAAGGACTAAAGTAGGAACTGATTGGCTTTAGTCTCTAGTCTCTCACATGGGTGCTAAAAGGGACTAGAGGGTAACATTTACTCCAATTGCCTTTGCCTAGAGTTGGAATATAATATAAGTGAATTGTCCACCTTCTT
    +
    #AAFAFJAJJ-FFFJJJ7JJJFJJJJJFJJJJ<FFFAJJJJFJJJJJJJJJJFAJ<AJJFJJJJ-FF7FJJJJJJJJF<FJJJJAFAJFFFJJJJJJJFJ-FJJJJFJ<J-FJFF-7AF7FJF7FJJ7FAFJ-<<7<-AAJJJ<JA-F<-
    @E00552:40:H23NGCCXY:5:1101:2777:1520 1:N:0:NCAGTG
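
The read headers above appear to follow the layout `@instrument:run:flowcell:lane:tile:x:y`, so the instrument ID and the flow cell ID are the first and third colon-separated fields. The following is a minimal Python sketch under that assumption; older header formats without a flow cell field are not handled, and the function names `read_first_header` and `parse_header` are hypothetical.

```python
import gzip


def read_first_header(path):
    """Return the first line of a gzipped FASTQ file without the trailing newline."""
    with gzip.open(path, "rt") as handle:
        return handle.readline().rstrip("\n")


def parse_header(header):
    """Extract instrument, run, flow cell and lane from a read header.

    Assumes the layout '@instrument:run:flowcell:lane:...' seen in the
    example output; anything after the first space is ignored.
    """
    name = header[1:] if header.startswith("@") else header
    fields = name.split(" ")[0].split(":")
    if len(fields) < 4:
        raise ValueError("unexpected header layout: " + header)
    instrument, run, flowcell, lane = fields[:4]
    return {
        "instrument": instrument,
        "run": run,
        "flowcell": flowcell,
        "lane": lane,
    }


# Parse the header shown in the example output.
info = parse_header("@E00552:40:H23NGCCXY:5:1101:1154:1520 1:N:0:NCAGTG")
print(info["instrument"], info["flowcell"])
```

For a real file, `parse_header(read_first_header("sample.fastq.gz"))` would give the same fields without going through zless.
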
    

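    The instrument ID clearly starts with E (E00552), which indicates HiSeq X. The flow cell ID H23NGCCXY ends in CCXY, which matches the HiSeq X (8-lane) flow cell.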
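
The flow cell ID can be checked against the flow cell patterns listed earlier in the same way. The following is a minimal Python sketch: it copies only a subset of those patterns to stay short, and the function name `classify_flowcell` is hypothetical. It uses `re.match` rather than `re.search` as a deliberate assumption, because a search for `G[A-Z,0-9]{4}$` would also hit the last five characters of H23NGCCXY and wrongly report a MiSeq micro flow cell.

```python
import re

# Subset of the flow cell patterns listed earlier (not the full table).
FLOWCELL_IDS = {
    r"H[A-Z,0-9]{4}BBXY$": (["HiSeq 4000"], "(8-lane) v1 flow cell"),
    r"H[A-Z,0-9]{4}CCXX$": (["HiSeq X"], "(8-lane) flow cell"),
    r"H[A-Z,0-9]{4}CCXY$": (["HiSeq X"], "(8-lane) flow cell"),
    r"H[A-Z,0-9]{4}BGXY$": (["NextSeq"], "High output flow cell"),
    r"G[A-Z,0-9]{4}$": (["MiSeq"], "MiSeq micro flow cell"),
    r"H[A-Z,0-9]{4}DMXX$": (["NovaSeq"], "S2 flow cell"),
}


def classify_flowcell(fcid):
    """Return (platforms, description) for a flow cell ID, or None if no pattern matches."""
    for pattern, (platforms, description) in FLOWCELL_IDS.items():
        # Anchored at the start, so short MiSeq patterns cannot match
        # the tail of a longer flow cell ID.
        if re.match(pattern, fcid):
            return platforms, description
    return None


# Flow cell ID taken from the third field of the read header in Example 1.
print(classify_flowcell("H23NGCCXY"))
```

A result of `None` only means the ID is absent from this subset; the full list in the linked file should be checked before drawing a conclusion.
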

    Example 2: zless sample2.fastq.gz | head -5

    @A00262:358:HTG2NDSXX:2:1101:1127:1031 1:N:0:GTTATA+GTTATAC
    GNCTACATTTACCTAGCATTTTTCTTCTATCTTACATAGTTTTTGGGTAAACATACTATCCTTATGAGCATTGGGTGTAATGTTTGTTGTTTTATGTTGATTGCTTATTTGGGTAGAAATGACTAACCTATGCTTCATTCCTGCGGATGG
    +
    F#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,FFFF,FFFF,F:FFF:FFF,FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
    @A00262:358:HTG2NDSXX:2:1101:1181:1031 1:N:0:GTTATA+GTTATAC

    Here the instrument ID A00262 and the flow cell ID HTG2NDSXX match none of the patterns listed above, so consult the linked file for the current classification.
    
