1. Count the total sequence length of a FASTA file
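[train@MiWiFi-R3P-srv out]$ perl -e 'while (<>) { $x = <>; $num += length ($x) - 1;} print "$num\n";' contig.fa # Each loop iteration reads two lines (header, then sequence); length($x) - 1 subtracts the trailing newline. This assumes every sequence sits on a single line.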
7510245
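The one-liner above only gives the right answer when every record is exactly two lines long. As a minimal sketch, assuming the file may instead wrap sequences over several lines, the same count can be done in Python by skipping header lines and summing everything else. The function name `fasta_total_length` is hypothetical and chosen only for this illustration.

```python
from typing import Iterable


def fasta_total_length(lines: Iterable[str]) -> int:
    """Sum the sequence length over all records, ignoring header lines."""
    total = 0
    for line in lines:
        line = line.strip()  # drops "\n" or "\r\n" and surrounding whitespace
        if not line or line.startswith(">"):
            continue  # skip blank lines and headers
        total += len(line)
    return total


# Tiny in-memory example; the second record is wrapped over two lines.
example = ">seq1\nACGT\n>seq2\nACG\nTT\n"
print(fasta_total_length(example.splitlines()))  # prints 9
```

To run it on a real file, pass the open file handle, for example `with open("contig.fa") as fh: print(fasta_total_length(fh))`.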
2. Compute genome assembly statistics
[train@MiWiFi-R3P-srv out]$ genome_statistic.pl scaffold.fa
==>>scaffold.fa<<==
the genome scaffolds number is 907
the genome contigs number is 912
the longest length is 166629
the shortest length is 101
the genome scaffolds size is 7509212
the genome contig size is 7509076
the rate of N is 1.81110880875383e-05
the rate of GC is 0.584690579773064
the scaffold N50 is 47014
the scaffold L50 is 45
the contig N50 is 47014
the contig L50 is 45
the scaffold N90 is 4028
the scaffold L90 216
the contig N90 is 4009
the contig L90 218
the number of sequences >= 1kb is 498 total length is 7283874
the number of sequences >= 2kb is 310 total length is 7021242
the number of sequences >= 3kb is 250 total length is 6879227
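The source of genome_statistic.pl is not shown here, so how it computes these figures is unknown. As a rough sketch under the commonly used definition, N50 is the length of the sequence at which the running total, taken from longest to shortest, first reaches 50% of the assembly size, and L50 is the number of sequences needed to get there; N90 and L90 use 90% instead. The function name `nx_lx` is hypothetical, and the script may handle ties or rounding differently.

```python
from typing import Iterable, Tuple


def nx_lx(lengths: Iterable[int], fraction: float = 0.5) -> Tuple[int, int]:
    """Return (Nx, Lx) for the given fraction, e.g. 0.5 for N50/L50."""
    ordered = sorted(lengths, reverse=True)  # longest sequence first
    threshold = sum(ordered) * fraction
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= threshold:
            return length, count
    return 0, 0  # empty input


lengths = [10, 8, 6, 4, 2]   # total 30
print(nx_lx(lengths, 0.5))   # prints (8, 2): 10 + 8 = 18 >= 15
print(nx_lx(lengths, 0.9))   # prints (4, 4): 10 + 8 + 6 + 4 = 28 >= 27
```

Feeding the function the per-scaffold or per-contig lengths of an assembly gives values comparable to the N50/L50 and N90/L90 lines in the output above.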