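Needless to say, this post got locked again, and I still don't know why. Is it because too much translated text was included? Or because it contains too many URLs and was flagged as a violation? Better to read the post with the same title on the WeChat official account instead...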
Fast algorithms for large-scale genome alignment and comparison (2002)
This one was also published in Nucleic Acids Research.
ABSTRACT
We describe a suffix-tree algorithm that can align the entire genome sequences of eukaryotic and prokaryotic organisms with minimal use of computer time and memory. The new system, MUMmer 2, runs three times faster while using one-third as much memory as the original MUMmer system. It has been used successfully to align the entire human and mouse genomes to each other, and to align numerous smaller eukaryotic and prokaryotic genomes. A new module permits the alignment of multiple DNA sequence fragments, which has proven valuable in the comparison of incomplete genome sequences. We also describe a method to align more distantly related genomes by detecting protein sequence homology. This extension to MUMmer aligns two genomes after translating the sequence in all six reading frames, extracts all matching protein sequences and then clusters together matches. This method has been applied to both incomplete and complete genome sequences in order to detect regions of conserved synteny, in which multiple proteins from one organism are found in the same order and orientation in another. The system code is being made freely available by the authors.
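To make the "all six reading frames" step in the abstract concrete, here is a minimal Python sketch of six-frame translation. It is only an illustration and not MUMmer's actual code: the function names (`reverse_complement`, `translate`, `six_frame_translate`) are hypothetical, it assumes the standard genetic code, and it marks any codon containing a non-ACGT character as `X`.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (assumption: the standard nuclear code applies to the genome).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translate(seq: str) -> dict:
    """Map frame (+1, +2, +3, -1, -2, -3) to its protein translation."""
    seq = seq.upper()
    rev = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])     # forward strand
        frames[-(offset + 1)] = translate(rev[offset:])  # reverse strand
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translate("ATGGCCTAA").items()):
        print(frame, protein)
```

In the approach the abstract describes, the translated frames of both genomes would then be searched for matching protein sequences and the matches clustered. This sketch covers only the translation step.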