Crack4 - A detailed guide: using the genome sequencing report for the bacterial genome Genome

Author: RashidinAbdu | Published: 2021-05-23 12:22

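Background:

  • Before uploading sequenced bacterial genome data to NCBI, you need to calculate the genome coverage. It can be computed from the values in the genome sequencing report.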

1. Concept:

To calculate the genome coverage, divide the number of bases sequenced by the estimated genome size, then multiply by the percentage of reads placed in contigs, as in the following example:

1,514,603,088 / 2,100,000 x (96% of reads placed) = 692x 
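
Written as a formula, with the arithmetic of the example spelled out. The symbols C (coverage), N (bases sequenced), G (estimated genome size) and p (fraction of reads placed in contigs) are labels introduced here for illustration only:

```latex
C = \frac{N}{G} \times p,
\qquad
\frac{1{,}514{,}603{,}088}{2{,}100{,}000} \times 0.96
\approx 721.24 \times 0.96
\approx 692
```

The result is read as 692x coverage.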

2. How to calculate:
  • I wrote a short Python program for this. From now on, just fill in the three variables number_of_bases_sequenced, estimated_genome_size and reads_placed_in_contigs, run it, and it prints the genome coverage.
# Input values taken from the genome sequencing report
number_of_bases_sequenced = 1377024000
estimated_genome_size = 4109798
reads_placed_in_contigs = 1317332892 / 1377024000  # fraction, not percent

# Genome coverage, formatted to 2 decimal places
genome_coverage = "{:.2f}".format((number_of_bases_sequenced / estimated_genome_size) * reads_placed_in_contigs)

# Percentage of reads placed in contigs, to 2 decimal places
print("%reads_placed_in_contigs=", "{:.2f}".format(reads_placed_in_contigs * 100), "%")
# The resulting genome coverage
print("genome_coverage=", genome_coverage)

This gives:

[Image: program output]

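3. So the question is: how do you find the corresponding values in the genome report?

They are located as follows:

[Images: screenshots of the genome sequencing report showing the corresponding values]

The calculation based on these values is: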


# To calculate the genome coverage, divide the number of bases sequenced
# by the estimated genome size, then multiply by % reads placed in contigs.
# e.g.: 1,514,603,088 / 2,100,000 x (96% of reads placed) = 692x

# Input values taken from the genome sequencing report
number_of_bases_sequenced = 1377024000
estimated_genome_size = 4109798
reads_placed_in_contigs = 1317332892 / 1377024000  # fraction, not percent

# Genome coverage, formatted to 2 decimal places
genome_coverage = "{:.2f}".format((number_of_bases_sequenced / estimated_genome_size) * reads_placed_in_contigs)

# Percentage of reads placed in contigs, to 2 decimal places
print("%reads_placed_in_contigs=", "{:.2f}".format(reads_placed_in_contigs * 100), "%")
print("genome_coverage=", genome_coverage, "x")
  • So the final Genome Coverage is 320.53x
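
The same calculation can also be wrapped in a small reusable function. This is only a minimal sketch, not part of the original program: the function name `genome_coverage` and its parameter names are hypothetical, and it assumes the placed-reads value is passed as a fraction between 0 and 1 rather than as a percentage.

```python
def genome_coverage(bases_sequenced, genome_size, fraction_placed):
    """Return bases_sequenced / genome_size * fraction_placed.

    Hypothetical helper for illustration; the names are not taken from
    NCBI or from the sequencing report.
    fraction_placed is a fraction in [0, 1], not a percentage.
    """
    if bases_sequenced <= 0 or genome_size <= 0:
        raise ValueError("bases_sequenced and genome_size must be positive")
    if not 0 <= fraction_placed <= 1:
        raise ValueError("fraction_placed must be between 0 and 1")
    return bases_sequenced / genome_size * fraction_placed


# Values used in the article's calculation
coverage = genome_coverage(1377024000, 4109798, 1317332892 / 1377024000)
print(f"genome_coverage= {coverage:.2f} x")  # genome_coverage= 320.53 x
```

Note that the number of bases sequenced cancels against the denominator of the placed fraction here, so this particular example reduces to 1317332892 / 4109798.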
