The quoted passages are from the article "Species Tree Estimation and the Impact of Gene Loss Following Whole-Genome Duplication".
Immediately following WGD, most genes are present in two copies as paralogs (i.e., paralogs or sometimes referred to as ohnologs to honor Ohno and his contribution to this area; Ohno 1970). Due to this redundancy, one copy of a paralog pair often undergoes pseudogenization and is eventually lost (Lynch and Conery 2000; Langham et al. 2004; Aury et al. 2006; Makino and McLysaght 2012).
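Whole-genome duplication (WGD) events are very common across species, especially in plants. WGD creates gene redundancy, so one copy of a paralog pair often becomes a pseudogene and is eventually lost.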
Along these lines, if speciation occurs shortly after WGD and subsequent loss of paralogs is restricted to one major paralog subclade, single-copy genes should include only one-to-one orthologs and be relatively straightforward to analyze phylogenetically (Fig. 1a). In contrast, when both copies of a paralog pair within post-WGD species are equally likely to be lost, paralogous gene copies may be erroneously grouped as orthologs (i.e., pseudoorthologs) and lead to incorrect gene tree estimation (Fig. 1b) (Salichos and Rokas 2011; Struck 2013; Smith and Hahn 2022).
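Gene loss, however, is unlikely to be confined to one paralog subclade; copies in either subclade can be lost. If the species that diverge after a WGD lose different paralogs, each of them still ends up with a single copy. When single-copy orthologs are then identified across these species, such genes are also classified as single-copy orthologs, even though they are really paralogs (pseudoorthologs). A gene tree built from such a gene is likely to disagree with the species tree, so using these genes for species tree estimation may distort the result.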
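To make the pseudoortholog problem concrete, here is a minimal sketch for three species. It is not taken from the article; it assumes a species tree ((A,B),C), a WGD that predates both speciation events, and that each species independently keeps exactly one of the two paralog copies. The function name `induced_sister_pair` and the encoding of copies as 1 and 2 are made up for this illustration.

```python
from itertools import product

SPECIES = ("A", "B", "C")  # assumed species tree: ((A,B),C)


def induced_sister_pair(retained):
    """Return the two species that end up as sisters in the gene tree
    induced by the retained copies.

    `retained` maps species -> 1 or 2 (the paralog subclade that survived).
    The full post-WGD gene tree is assumed to be
    (((A1,B1),C1),((A2,B2),C2)).
    """
    by_copy = {1: [], 2: []}
    for sp in SPECIES:
        by_copy[retained[sp]].append(sp)
    for members in by_copy.values():
        if len(members) == 3:
            # All survivors sit in one subclade: true orthologs.
            return frozenset({"A", "B"})
        if len(members) == 2:
            # Two species share a subclade; the third is separated from
            # them by the WGD node, so the two sharing a copy are sisters.
            return frozenset(members)


# Enumerate all 2^3 retention patterns and flag the discordant ones.
patterns = {}
for copies in product((1, 2), repeat=3):
    retained = dict(zip(SPECIES, copies))
    patterns[copies] = induced_sister_pair(retained) == frozenset({"A", "B"})

discordant = [c for c, concordant in patterns.items() if not concordant]
print(f"{len(discordant)} of {len(patterns)} retention patterns give a discordant topology")
```

Under these assumptions, the topology is wrong whenever A and B keep different copies. Patterns such as (1, 1, 2) keep the correct topology, but the split between C and the other two then dates to the WGD rather than to the speciation event.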
Concatenation methods (i.e., the maximum likelihood tree inferred from the concatenated sequences across loci) have been commonly employed for species tree estimation, which implicitly assumes that all genes have the same or very similar evolutionary histories. Coalescent-based methods, in contrast, permit gene trees to have different evolutionary histories (Liu et al. 2009a).
Concatenation methods assume by default that all genes share the same evolutionary history, whereas coalescent-based methods allow different genes to have different evolutionary histories.
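The concatenation step itself can be sketched as follows. This is only an illustration of how per-locus alignments are joined into one supermatrix before a single tree is inferred; the function `concatenate`, the gap-filling rule for missing taxa, and the toy sequences are assumptions of this sketch and not part of the article.

```python
def concatenate(alignments, missing="-"):
    """Join per-locus alignments into one supermatrix.

    `alignments` is a list of dicts mapping taxon -> aligned sequence.
    A taxon absent from a locus is padded with `missing` characters.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a locus must have equal length")
        (n_sites,) = lengths
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, missing * n_sites))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy data: two loci, species B missing from the second one.
loci = [
    {"A": "ACGT", "B": "ACGA", "C": "ACTA"},
    {"A": "GG", "C": "GA"},
]
supermatrix = concatenate(loci)
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Once the loci are joined like this, a single tree is estimated from the whole matrix, which is where the shared-history assumption enters. A coalescent-based summary method would instead keep the loci separate and infer one gene tree per locus.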
Some of these methods, including *BEAST (Heled and Drummond 2010), BEST (Liu 2008), and BPP (Flouri et al. 2018), simultaneously estimate gene trees and the species tree from multilocus sequence data. These alignment-based methods have outstanding accuracy, but they are computationally intensive (Leaché and Rannala 2011; Bayzid and Warnow 2013; Mirarab et al. 2016). Other coalescent-based methods infer the species tree from a set of gene trees using likelihood functions, for example, MP-EST (Liu and Yu 2010), STELLS (Wu 2012; Pei and Wu 2017), and STEM (Kubatko et al. 2009). In addition, recently developed methods, including ASTRAL (Mirarab et al. 2014; Mirarab and Warnow 2015; Zhang et al. 2018), STAR (Liu et al. 2009b), and STEAC (Liu et al. 2009b), estimate the species tree from gene trees using summary statistics.
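There are also many coalescent-based methods: some work directly on the sequence alignments, some apply likelihood functions to a set of gene trees, and some rely on summary statistics. I will add the basic principles of these methods later, after reading more.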
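As a placeholder for the summary-statistic idea, here is a toy sketch: a majority vote over rooted three-taxon gene tree topologies. It is not how ASTRAL, STAR, or STEAC are implemented; it only shows the general shape of "gene trees in, species tree out". The function `summarize_triplets` and the input gene trees are invented for this example.

```python
from collections import Counter


def summarize_triplets(gene_tree_sisters):
    """Pick the most frequent sister pair across rooted three-taxon gene trees.

    Each gene tree is represented only by its sister pair, so ("A", "B")
    stands for the topology ((A,B),C).
    """
    counts = Counter(frozenset(pair) for pair in gene_tree_sisters)
    best_pair, _ = counts.most_common(1)[0]
    return best_pair, counts


# Hypothetical gene trees from five loci.
gene_trees = [("A", "B"), ("B", "A"), ("A", "C"), ("A", "B"), ("B", "C")]
best, counts = summarize_triplets(gene_trees)
print(sorted(best), counts[best])
```

If many of the input gene trees come from pseudoorthologs, a vote like this inherits their discordance, which is why gene loss after WGD matters for this family of methods too.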