https://raw.githubusercontent.com/10XGenomics/supernova/master/tenkit/lib/python/tenkit/illumina_instrument.py
For the latest classification, see the link above.
Instrument ID pattern | Sequencing platform |
---|---|
HWI-M[0-9]{4}$ | MiSeq |
HWUSI | Genome Analyzer IIx |
M[0-9]{5}$ | MiSeq |
HWI-C[0-9]{5}$ | HiSeq 1500 |
C[0-9]{5}$ | HiSeq 1500 |
HWI-D[0-9]{5}$ | HiSeq 2500 |
D[0-9]{5}$ | HiSeq 2500 |
J[0-9]{5}$ | HiSeq 3000 |
K[0-9]{5}$ | HiSeq 3000 (rarely used now), HiSeq 4000 |
E[0-9]{5}$ | HiSeq X |
NB[0-9]{6}$ | NextSeq |
NS[0-9]{6}$ | NextSeq |
MN[0-9]{5}$ | MiniSeq |
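The table lookup can be sketched in Python. This is a minimal sketch, not the tenkit implementation: the names `INSTRUMENT_PATTERNS` and `platforms_for_instrument` are hypothetical, the patterns are copied from the table above, and anchoring each pattern at the start of the ID with `re.match` is an assumption made here. `K00123` is a made-up ID used only for illustration.

```python
import re

# Instrument-ID patterns copied from the table above.
INSTRUMENT_PATTERNS = {
    r"HWI-M[0-9]{4}$": ["MiSeq"],
    r"HWUSI": ["Genome Analyzer IIx"],
    r"M[0-9]{5}$": ["MiSeq"],
    r"HWI-C[0-9]{5}$": ["HiSeq 1500"],
    r"C[0-9]{5}$": ["HiSeq 1500"],
    r"HWI-D[0-9]{5}$": ["HiSeq 2500"],
    r"D[0-9]{5}$": ["HiSeq 2500"],
    r"J[0-9]{5}$": ["HiSeq 3000"],
    r"K[0-9]{5}$": ["HiSeq 3000", "HiSeq 4000"],
    r"E[0-9]{5}$": ["HiSeq X"],
    r"NB[0-9]{6}$": ["NextSeq"],
    r"NS[0-9]{6}$": ["NextSeq"],
    r"MN[0-9]{5}$": ["MiniSeq"],
}


def platforms_for_instrument(instrument_id):
    """Return candidate platforms for an instrument ID, or [] if none match."""
    found = []
    for pattern, platforms in INSTRUMENT_PATTERNS.items():
        # re.match anchors at the start (an assumption of this sketch);
        # the trailing $ in most patterns anchors the end.
        if re.match(pattern, instrument_id):
            for name in platforms:
                if name not in found:
                    found.append(name)
    return found


print(platforms_for_instrument("E00552"))  # ID taken from example 1
print(platforms_for_instrument("K00123"))  # hypothetical ID
```

An ID that matches no pattern returns an empty list, which is a hint to check the linked file for newer entries.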
Classification by flow cell ID
FCIDs = {"C[A-Z,0-9]{4}ANXX$" : (["HiSeq 1500", "HiSeq 2000", "HiSeq 2500"], "High Output (8-lane) v4 flow cell"),
"C[A-Z,0-9]{4}ACXX$" : (["HiSeq 1000", "HiSeq 1500", "HiSeq 2000", "HiSeq 2500"], "High Output (8-lane) v3 flow cell"),
"H[A-Z,0-9]{4}ADXX$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v1 flow cell"),
"H[A-Z,0-9]{4}BCXX$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v2 flow cell"),
"H[A-Z,0-9]{4}BCXY$" : (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v2 flow cell"),
"H[A-Z,0-9]{4}BBXX$" : (["HiSeq 4000"], "(8-lane) v1 flow cell"),
"H[A-Z,0-9]{4}BBXY$" : (["HiSeq 4000"], "(8-lane) v1 flow cell"),
"H[A-Z,0-9]{4}CCXX$" : (["HiSeq X"], "(8-lane) flow cell"),
"H[A-Z,0-9]{4}CCXY$" : (["HiSeq X"], "(8-lane) flow cell"),
"H[A-Z,0-9]{4}ALXX$" : (["HiSeq X"], "(8-lane) flow cell"),
"H[A-Z,0-9]{4}BGXX$" : (["NextSeq"], "High output flow cell"),
"H[A-Z,0-9]{4}BGXY$" : (["NextSeq"], "High output flow cell"),
"H[A-Z,0-9]{4}BGX2$" : (["NextSeq"], "High output flow cell"),
"H[A-Z,0-9]{4}AFXX$" : (["NextSeq"], "Mid output flow cell"),
"A[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq flow cell"),
"B[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq flow cell"),
"D[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq nano flow cell"),
"G[A-Z,0-9]{4}$" : (["MiSeq"], "MiSeq micro flow cell"),
"H[A-Z,0-9]{4}DMXX$" : (["NovaSeq"], "S2 flow cell")}
Use zless to view the raw sequencing file.
Example 1: zless sample.fastq.gz|head -5
@E00552:40:H23NGCCXY:5:1101:1154:1520 1:N:0:NCAGTG
NTTTGCTAAACGGAAGGACTAAAGTAGGAACTGATTGGCTTTAGTCTCTAGTCTCTCACATGGGTGCTAAAAGGGACTAGAGGGTAACATTTACTCCAATTGCCTTTGCCTAGAGTTGGAATATAATATAAGTGAATTGTCCACCTTCTT
+
#AAFAFJAJJ-FFFJJJ7JJJFJJJJJFJJJJ<FFFAJJJJFJJJJJJJJJJFAJ<AJJFJJJJ-FF7FJJJJJJJJF<FJJJJAFAJFFFJJJJJJJFJ-FJJJJFJ<J-FJFF-7AF7FJF7FJJ7FAFJ-<<7<-AAJJJ<JA-F<-
@E00552:40:H23NGCCXY:5:1101:2777:1520 1:N:0:NCAGTG
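The instrument ID clearly starts with E, which indicates HiSeq X. The flow cell ID H23NGCCXY ends in CCXY, i.e. the HiSeq X (8-lane) flow cell.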
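The fields read off the header in this example can also be extracted in code. The sketch below assumes the Illumina header layout in which the instrument ID, run number, flow cell ID and lane are the first four colon-separated fields. `ReadHeader` and `parse_header` are hypothetical names introduced only for illustration.

```python
from typing import NamedTuple


class ReadHeader(NamedTuple):
    instrument: str
    run_number: int
    flowcell_id: str
    lane: int


def parse_header(line):
    """Split a FASTQ header line into its first four colon-separated fields."""
    if not line.startswith("@"):
        raise ValueError("not a FASTQ header line: %r" % line)
    # The part before the first space holds the colon-separated read location.
    parts = line[1:].split()
    if not parts:
        raise ValueError("empty FASTQ header line")
    fields = parts[0].split(":")
    if len(fields) < 4:
        raise ValueError("unexpected header layout: %r" % line)
    return ReadHeader(fields[0], int(fields[1]), fields[2], int(fields[3]))


header = parse_header("@E00552:40:H23NGCCXY:5:1101:1154:1520 1:N:0:NCAGTG")
print(header.instrument, header.flowcell_id)
```

The `instrument` field is what gets compared against the instrument ID table, and `flowcell_id` against the flow cell patterns.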
Example 2: zless sample2.fastq.gz|head -5
@A00262:358:HTG2NDSXX:2:1101:1127:1031 1:N:0:GTTATA+GTTATAC
GNCTACATTTACCTAGCATTTTTCTTCTATCTTACATAGTTTTTGGGTAAACATACTATCCTTATGAGCATTGGGTGTAATGTTTGTTGTTTTATGTTGATTGCTTATTTGGGTAGAAATGACTAACCTATGCTTCATTCCTGCGGATGG
+
F#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF:,FFFF,FFFF,F:FFF:FFF,FFFFFFFF:FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
@A00262:358:HTG2NDSXX:2:1101:1181:1031 1:N:0:GTTATA+GTTATAC
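The flow cell patterns listed earlier can be applied to the flow cell IDs from both examples. This is a rough sketch under stated assumptions, not the upstream code: `FLOWCELL_PATTERNS` and `flowcell_candidates` are hypothetical names, the patterns are copied from the list in this post, and applying them with `re.search` is an assumption about how they are meant to be used.

```python
import re

# Flow cell ID patterns copied from the list earlier in this post.
FLOWCELL_PATTERNS = {
    "C[A-Z,0-9]{4}ANXX$": (["HiSeq 1500", "HiSeq 2000", "HiSeq 2500"], "High Output (8-lane) v4 flow cell"),
    "C[A-Z,0-9]{4}ACXX$": (["HiSeq 1000", "HiSeq 1500", "HiSeq 2000", "HiSeq 2500"], "High Output (8-lane) v3 flow cell"),
    "H[A-Z,0-9]{4}ADXX$": (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v1 flow cell"),
    "H[A-Z,0-9]{4}BCXX$": (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v2 flow cell"),
    "H[A-Z,0-9]{4}BCXY$": (["HiSeq 1500", "HiSeq 2500"], "Rapid Run (2-lane) v2 flow cell"),
    "H[A-Z,0-9]{4}BBXX$": (["HiSeq 4000"], "(8-lane) v1 flow cell"),
    "H[A-Z,0-9]{4}BBXY$": (["HiSeq 4000"], "(8-lane) v1 flow cell"),
    "H[A-Z,0-9]{4}CCXX$": (["HiSeq X"], "(8-lane) flow cell"),
    "H[A-Z,0-9]{4}CCXY$": (["HiSeq X"], "(8-lane) flow cell"),
    "H[A-Z,0-9]{4}ALXX$": (["HiSeq X"], "(8-lane) flow cell"),
    "H[A-Z,0-9]{4}BGXX$": (["NextSeq"], "High output flow cell"),
    "H[A-Z,0-9]{4}BGXY$": (["NextSeq"], "High output flow cell"),
    "H[A-Z,0-9]{4}BGX2$": (["NextSeq"], "High output flow cell"),
    "H[A-Z,0-9]{4}AFXX$": (["NextSeq"], "Mid output flow cell"),
    "A[A-Z,0-9]{4}$": (["MiSeq"], "MiSeq flow cell"),
    "B[A-Z,0-9]{4}$": (["MiSeq"], "MiSeq flow cell"),
    "D[A-Z,0-9]{4}$": (["MiSeq"], "MiSeq nano flow cell"),
    "G[A-Z,0-9]{4}$": (["MiSeq"], "MiSeq micro flow cell"),
    "H[A-Z,0-9]{4}DMXX$": (["NovaSeq"], "S2 flow cell"),
}


def flowcell_candidates(flowcell_id):
    """Return every (platforms, description) entry whose pattern matches."""
    return [entry for pattern, entry in FLOWCELL_PATTERNS.items()
            if re.search(pattern, flowcell_id)]


for fcid in ("H23NGCCXY", "HTG2NDSXX"):
    print(fcid, flowcell_candidates(fcid))
```

For H23NGCCXY from example 1, the sketch returns two entries: the HiSeq X (8-lane) flow cell and, because the last five characters are GCCXY, the short MiSeq micro pattern as well. The instrument ID starting with E settles it as HiSeq X. For HTG2NDSXX from example 2 the list is empty: none of the patterns listed in this post match it, which is the case where the upstream file linked at the top is worth checking.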