Tools | Homologous Gene Analysis with OrthoFinder

Author: biogeeker | Published: 2020-12-02 09:36

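Introduction

The comparison in the figure above shows that OrthoFinder is considerably faster and more accurate than other tools. A run of OrthoFinder also does more than identify homologous genes: it builds gene trees and a species tree as well. The figure below introduces gene trees and species trees: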

Fig. 2 Schematic example of the gene/species tree reconciliation.

The gene tree and species tree are not compatible. Reconciliation methods resolve the incongruence between the two by inferring speciation, duplication, and loss events on the gene tree. The reconciled tree indicates the most parsimonious history of this gene, constrained to the species tree. The simple representation (bottom right) suggests that the human and frog genes are orthologs and that they are both paralogous to the dog gene.

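The figure above shows that the human gene and the frog gene are orthologs (circle), and that both are paralogs of the dog gene (star). In the species tree, however, human and dog are the more closely related pair, so the gene tree and the species tree are not fully congruent.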
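The orthology/paralogy reading of the reconciled tree can be sketched in a few lines of Python. This is only an illustration of the standard rule (two genes are orthologs if their last common ancestor node is a speciation, and paralogs if it is a duplication); the nested-tuple encoding, the `RECONCILED` topology, and the function names are assumptions made for this example and are not part of OrthoFinder.

```python
# Reconciled gene tree from the figure, written as nested tuples.
# Internal node: (event, left, right); leaf: the species carrying the gene.
# Assumed topology: a duplication at the root, with the dog gene on one
# duplicate lineage and a speciation separating the human and frog genes
# on the other. Gene losses are not modelled in this sketch.
RECONCILED = ("duplication", "dog", ("speciation", "human", "frog"))


def leaves(node):
    """Return the leaf labels below a node."""
    if isinstance(node, str):
        return [node]
    _, left, right = node
    return leaves(left) + leaves(right)


def classify_pairs(node, out=None):
    """Label every gene pair by the event at its last common ancestor."""
    if out is None:
        out = {}
    if isinstance(node, str):
        return out
    event, left, right = node
    relation = "ortholog" if event == "speciation" else "paralog"
    # Genes on opposite sides of this node have this node as their LCA.
    for a in leaves(left):
        for b in leaves(right):
            out[frozenset((a, b))] = relation
    classify_pairs(left, out)
    classify_pairs(right, out)
    return out


pairs = classify_pairs(RECONCILED)
for pair, relation in pairs.items():
    print(sorted(pair), relation)
```

Under the assumed topology, the human/frog pair is labelled as orthologs and both pairs involving the dog gene are labelled as paralogs, matching the caption.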

Installation

Download and extract: OrthoFinder

Usage

$ python orthofinder.py -f dir/  # dir/ holds the protein sequences of two or more species
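Before launching a run, it can help to confirm that the input folder really contains at least two proteome files. The sketch below is a hypothetical pre-flight check, not part of OrthoFinder: the helper names are invented for this example, and the list of accepted file extensions is an assumption that should be checked against the OrthoFinder documentation for the installed version.

```python
from pathlib import Path

# Assumption: file extensions treated as protein FASTA input.
FASTA_EXTENSIONS = {".fa", ".faa", ".fasta", ".fas", ".pep"}


def list_proteomes(input_dir):
    """Return the FASTA-like files found directly inside input_dir."""
    return sorted(
        p for p in Path(input_dir).iterdir()
        if p.is_file() and p.suffix.lower() in FASTA_EXTENSIONS
    )


def build_basic_command(input_dir):
    """Build the argument list for `python orthofinder.py -f <dir>`."""
    proteomes = list_proteomes(input_dir)
    if len(proteomes) < 2:
        raise ValueError(
            f"expected proteomes for 2 or more species, found {len(proteomes)}"
        )
    return ["python", "orthofinder.py", "-f", str(input_dir)]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the directory that contains `orthofinder.py`.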

Parameter details:

$ python orthofinder.py -h

OrthoFinder version 2.4.1 Copyright (C) 2014 David Emms

SIMPLE USAGE:
Run full OrthoFinder analysis on FASTA format proteomes in <dir>
  orthofinder [options] -f <dir>

Add new species in <dir1> to previous run in <dir2> and run new analysis
  orthofinder [options] -f <dir1> -b <dir2>

OPTIONS:
 -t <int>        Number of parallel sequence search threads [Default = 24]
 -a <int>        Number of parallel analysis threads [Default = 1]
 -d              Input is DNA sequences
 -M <txt>        Method for gene tree inference. Options 'dendroblast' & 'msa'
                 [Default = dendroblast]
 -S <txt>        Sequence search program [Default = diamond]
                 Options: blast, diamond, blast_gz, mmseqs, blast_nucl
 -A <txt>        MSA program, requires '-M msa' [Default = mafft]
                 Options: mafft, muscle
 -T <txt>        Tree inference method, requires '-M msa' [Default = fasttree]
                 Options: fasttree, raxml, raxml-ng, iqtree
 -s <file>       User-specified rooted species tree
 -I <int>        MCL inflation parameter [Default = 1.5]
 -x <file>       Info for outputting results in OrthoXML format
 -p <dir>        Write the temporary pickle files to <dir>
 -1              Only perform one-way sequence search
 -X              Don't add species names to sequence IDs
 -y              Split paralogous clades below root of a HOG into separate HOGs
 -n <txt>        Name to append to the results directory
 -o <txt>        Non-default results directory
 -h              Print this help text

WORKFLOW STOPPING OPTIONS:
 -op             Stop after preparing input files for BLAST
 -og             Stop after inferring orthogroups
 -os             Stop after writing sequence files for orthogroups
                 (requires '-M msa')
 -oa             Stop after inferring alignments for orthogroups
                 (requires '-M msa')
 -ot             Stop after inferring gene trees for orthogroups 

WORKFLOW RESTART COMMANDS:
 -b  <dir>         Start OrthoFinder from pre-computed BLAST results in <dir>
 -fg <dir>         Start OrthoFinder from pre-computed orthogroups in <dir>
 -ft <dir>         Start OrthoFinder from pre-computed gene trees in <dir>

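-S: the sequence search uses diamond by default, which is fast; -A: the multiple sequence alignment program defaults to mafft and can be changed to muscle (per the help text, -A requires '-M msa'); all other parameters can be left at their defaults.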
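The option constraints in the help text can be encoded in a small helper so that invalid combinations fail before a long run starts. This is a minimal sketch: `build_options` and its parameter names are hypothetical, and it only covers the `-S`, `-M`, `-A`, and `-t` flags with the values listed in the version 2.4.1 help output above.

```python
# Allowed values copied from the help output above (version 2.4.1).
SEARCH_PROGRAMS = {"blast", "diamond", "blast_gz", "mmseqs", "blast_nucl"}
TREE_METHODS = {"dendroblast", "msa"}
MSA_PROGRAMS = {"mafft", "muscle"}


def build_options(search="diamond", tree_method="dendroblast",
                  msa_program=None, threads=None):
    """Return OrthoFinder option arguments, checking the help-text rules."""
    if search not in SEARCH_PROGRAMS:
        raise ValueError(f"unknown search program: {search}")
    if tree_method not in TREE_METHODS:
        raise ValueError(f"unknown gene tree method: {tree_method}")
    args = ["-S", search, "-M", tree_method]
    if msa_program is not None:
        if msa_program not in MSA_PROGRAMS:
            raise ValueError(f"unknown MSA program: {msa_program}")
        # The help text states that -A requires '-M msa'.
        if tree_method != "msa":
            raise ValueError("-A requires '-M msa'")
        args += ["-A", msa_program]
    if threads is not None:
        args += ["-t", str(threads)]
    return args


# Example: diamond search, MSA-based gene trees aligned with muscle, 8 threads.
options = build_options(tree_method="msa", msa_program="muscle", threads=8)
command = ["python", "orthofinder.py", "-f", "dir/"] + options
print(" ".join(command))
```

Keeping the defaults explicit (`-S diamond -M dendroblast`) makes the resulting command self-documenting, at the cost of a slightly longer argument list.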

References

[1]. How to find homologous genes - OrthoFinder
[2]. OrthoFinder2: fast and accurate phylogenomic orthology analysis from gene sequences. bioRxiv
[3]. Inferring Orthology and Paralogy
