Paml分析过程中根据Othogroups信息和物种树生成对应的

作者: 徐诗芬 | 来源:发表于2022-01-05 17:44 被阅读0次

Paml分析过程中根据Othogroups信息和物种树生成对应的
django项目--菜单管理
webpack基础学习
用PowerBI将制作数据报告变成复制粘贴的简单工作
crash之野指针
安卓DataBinding报错：android.content.
Hibernate之配置信息
双向数据绑定demo
webpack小栗子
webpack 初体验

首先根据系统发育分析我们得到一个物种树文件，对每个单拷贝基因进行正选择或适应性趋同分析，需要输入基因树。因此，我们根据genelist和物种树文件，生成一系列基因树文件。

首先得到genelist很简单，用grep提取，以下两个文件都在orthofinder结果文件里：

grep -f Orthogroups_SingleCopyOrthologues.txt Orthogroups.tsv >single_copy_gene_list.txt #生成single_copy_gene_list.txt后还要添加首行物种名

然后再运行以下脚本：

#!/usr/bin/python
# -*- coding: utf-8 -*-
# conda activate python3
"""
    作者：徐诗芬
    内容：1. 打开树文件，读取树文件信息，并赋值tree；
         2. 打开single_copy_gene_list.txt文件，读取第一行生成一个species list；
         3. 读取剩下的行，读取OG的id以及生成该id的基因列表gene list,
         并且将species和gene列表并列起来生成一个字典；
         4.循环species list，根据物种匹配替换基因名，
         并创建一个该OG的文件，写入替换好的gene tree。

    日期：2022.1.5
"""
import sys
import subprocess
import re
from itertools import islice

def usage():
    print('Usage: python makertree.py [IQ_TREE.treefile] [single_copy_gene_list.txt] [Out_dir]')

def main():
    tree_file = open('IQ_TREE.treefile', 'rt')  # sys.argv[1]
    name_list = open('single_copy_gene_list.txt', 'rt')  # sys.argv[2]
    out_dir = 'TREE'  # sys.argv[3]  # windows系统自己创建
    # subprocess.call(['mkdir', out_dir])  # linux系统自动创建output文件夹
    spe_tree = tree_file.read()
    species_list = []
    for head in islice(name_list, 0, 1):   # 输出第一行title
        species_list = head.strip().split("\t")[1:]
    for line in islice(name_list, 0, None):    
        OG_id = line.strip().split("\t")[0]
        gene_list = line.strip().split("\t")[1:]
        pattern_dict = dict(zip(species_list, gene_list))
        new_tree = spe_tree        # 给new_tree一个初始字符串
        for sp in species_list:
            gene = pattern_dict[sp]
            new_tree = re.sub(sp, gene, new_tree)  # 循环一一替换new_tree里面的物种名
        
        gene_tree = new_tree
        with open(f'{out_dir}/{OG_id}.tree', 'w+') as f:
            f.write(gene_tree)

    tree_file.close()
    name_list.close()

try:
    main()
except IndexError:
    usage()