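1: Prepare the nucleotide sequences and the tree file
## Nucleotide sequences (if the lengths differ, align them and remove the gaps)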
5 3372
FJ973621Lhes
GCTCAAGCACAGGCAGAAGCAGCAGCCCGTGCTCAGGCACAAGCAGAAGCCAGAGCTAAAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCAGAAGCCAGAGCTAAAGCTGAAGCCACAGCTCGTGCCAAAGCACAAGCGGAAGCAGCAGCACGTGCTCAAGCACAAGCAGAAGCCAGAGCTATAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCAGAAGCTAGAGCTAAAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCCATAGCGAGAGCTGAAGCTGCTGCGCGTGCTCAGGCTGAAGCAGAGGCGCGTGCATATGCTGAAGCTTTAGCAAGAGTTCAAGCTGAAGCTGCTGCTAGAGCTCAAGCACAAACTCAATCCAGAACTCAAGCTGAAACACATTCTCAAGCACATAGTGCTTCTCATGCATCTTCTCAAGCAACCTCTGAGACTCACGTTGAAAGTACTGCCCATACTGCTACAGAAACGCATGAACATACGTCTTCTAACTCTCAAGCTGCCAGTCACACCCAAGCAGCTTCTCACAGTCAAGCAAAAGCTCAAAGTGAAGCACATACGTATTCTCAAGCTCAATCAGCCGCACATACTATTGCCGCAGCTCAAGCACAAGCTGAAACCAGAGCCAGAGCTGAAGCTGTTGCTAGAGCGCAAGCACAAGCGGAGGCCAGAGTGAGAGCAGAAGCAGCAGCCCGTGCTCAAGCACAAGCCGAAGCAGCAGCCCGTGCTCAAGCACAAGCCGAAGCAGCAGCCCGTGCTCAAGCACAAGCCGAAGCAGCAGCCCGTGCTCAAGCACAAGCCGAAGCAGCAGCCCGTGCTCAAGCACAAGCCGAAGCAGCAGCCCGTGCTCAAGCACAAGCCGAAGCAGCAGCCCGTGCTCAATCACAATCTGAAGCAGCCGCACGTGCTCAAGCTCAAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGTTGAAGCAGCAGCCCGTGCTCAAGCACAAGCGGAAGCTGCAGCCCGTGCTCAGGCACAAGCCGAAGCCAGAGCTAAAGCTGAAGCAGCAGTCCGTGCTCAAGCTCAAGTTGAAGCAGCAGCCCGTGCTCAAGCACAAGTGGAAGCTGCAGCCCGTGCTCAAGCACAGGCAGAAGCAGCAGCCCGTGCTCAGGCACAAGCAGAAGCCAGAGCTAAAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCAGAAGCCAGAGCTAAAGCTGAAGCCACAGCTCGTGCCAAAGCACAAGCGGAAGCAGCAGCACGTGCTCAAGCACAAGCAGAAGCCAGAGCTATAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCAGAAGCTAGAGCTAAAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCCATAGCGAGAGCTGAAGCTGCTGCGCGTGCTCAGGCTGAAGCAGAGGCGCGTGCATATGCTGAAGCTTTAGCAAGAGTTCAAGCTGAAGCTGCTACTAGAGCTCAAGCACAAACTCAATCCAGAACTCAAGCTGTAACACATTCTCATGCCCATAGTGCTTCTCATGCATCTTCTCAAGCATCCTCTGAGACTTACGCAGAAAGTACTGCTCATACTGCTACAGAAACGCATGAACATACGTCTTCTCACTCTCAAACTGCCAGTCACAGCCAAGCAGCTTCTCACAGTAAAGCAAAAGCTCATACTGAAGCAGATACGTATTCTCAAGCTCAATCAGCCGCACATACTATTGCCGCTGCTCAAGCACAAGCGGAAACTAGAGCCAGAGCTGAAGCTGTTGCTAGAGCGCAAGCACAAGCAGAAGCCAGAGTTAGAGCAGAAGCAGCAGCTCGTGCTCAAGCACAAGCTGAAGCAGCAGCCCGTGCTCATGCACAAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCGGAAGCAGCAGCCCGTGCTCAAGCACAAGCGGAAGCAGCAGCCCGTGC
TCAAGCACAAGCGGAAGCAGCAGCCCGTGCTCAAGCACAAGCGGAAGCCAGAGCTAAAGCTGAAGCAGCAGCCCGCGCTCAAGCACAGGCAGAAGCCAGAGCTAAAGCTGAAGCAGCAGCCCGTGCTCAAGCACAAGCAGAAGCCAGAGCGAGAGCTGAAGCAGTAGCCCGTGCTCAAGCACAAGCGGAAGCAGCAGCTCGTGCTCAAGCACAAGCGGAAGCAGCAGCCCGTGCTCAAGCACAAGCGGAAGCAGCAGCCCGTGCTCAAGCACAAGCGGAAGCAGCAGCCCGTGCTCAAGCACAAGCGGAAGCAGCAGCCCGTGCTCAAGCACAAGCAGAAGCCAGAGCGAGAGCAGAAACTGCTGCGCGAGCTCAGGCTGAAGCAGAGGCGCGTGCATATGCTGAAGCTTTAGCAAGAGTTCAAGCTGAAGCTGCTGCTAGAGCTCAAGCACAATCTCAATCCAGAATTCAAGCTGAAACACAGTCTCATGCACATAGTGCTTCTCATGCATCTTCTCAAGCATTCTCTGAGACTCACGCGGAAAGTGCTGCCCATACTGCTACAGAAACGCATGAACAGACGTCTTCTCACTCACAAGCTGCCAGTCGAAGTCAAGCAGCTTCTCACAGTCAAGCAGAAGCTCATACTGAAGCACATACGTATTCTCAAGCTCAGTCAGCCGCACATACTATTGCCGCAGCTCAAGCACAAGCGGAAACAAGAGCCAGAGCTGAAGCTGCTGCTCGAGCGCAAGCACATGCTCAAGCTGAAGCTGTTGCACGAGCTCGAGCAGAAGCAGCTGCCAGAGCCAAAGCACAGGCGGAAGCACGTGCTGATGCAGAAGCTGCCGCAAAAGCTCAAGCTGAAGCTGCAGCATTAGCTCATGCACAAGCTGTTGCTCGTGCTCAAGCAGAAGCTGCTGCCAAAGCTAAAATAGAAGAAGAAGCACGTGCTCAAGCTGAAGCGGCAATCAGGTCTCAAGTAGAAGCTGCAGTTAGAGCTCAAGCTGAAGCACATTCTCAAGCAAAATCCGAAGCAAGCACTCAAACGCAAACTGCGGCATATTCGAGCAGTGAAAGTGCTTCCTCCTCTGAAGCTGAATCTTCTTCATACGCACAATCCTTCCAGTTTTCACTCACATACACGCTGCATTAACGTCTTCAGCTCATCAGTTGGTCTCTGCAGCAGCTAAACGCAGAATTGCTTCGCTATCGCAAGCTATGTCTTCTGTTATCTCCGGAGGTGCGTTAACTACGCAGCTCTTTCCAGCTCTTTATCTGGCTTAGCGAGTGAAATCCAAAATGAATCCAACTTATCAAAAACAGAAGTTCTCGTCGAAGCTTTACTGGAAACATTGTCAGCTCTTTTGGA
KY398016Aarg
GCTAGAGCTGTAGCCCAGTCCCTCGGATTGTCACAAGGGTCAGTTCAAAATATAATGAGCCAACAATTGAGCAGCATAGGCTCTGGAGCTTCCACATCATCCCTCTCCCAGGCGATAGCAAATGCCGTATCTTCCGCAGTTCAAGGATCACAGGCAGCAGCTCCAGGACAGGAACAATCTATTGCACAAAGAGTAAATTCAGCCATTTCCTCCGCTTTCGCACAATTGATTTCCCAGAAAACCGCACCGGCTCCGGCCCCGAGACCCAGACCAGCTCCNTTGCCTGCTCCAGCTCCAAGGCCCAGACCAGCACCTGCTCCACGACCAGCACCAGTTTATGCACCAGCGCCAGTTGCTTCGCAATTTCAGGCGTCTGCTTCCAGTCAATCTTCGGCTCAAGAGAATTCCTTCACTCAGTCATCAGTTGCTCAGCAATCAGCAGTTGCCCAACAATCCTCAGTTTCTCAACAATCCTCAGCTGCTCAACAGTCATCAGTTGCTCAATCGCAACAAACATCTTACTCTGCAGCAACAAATGCCGGTTCGAGTGTCTCGCAGTCTCAAGCTATTGTCTCAAGTGCCCCTGTGTACTTCAACTCGCAAACTTTGACAAACAACTTGGCTTCCTCTCTGCAATCACTGAATGCTCTTAATTACGTATCGAATGGTCAATTGAGTTCCTCGGATGTCGCTTCCACTGTTGCTAGAGCTGTAGCCCAGTCCCTCGGATTGTCACAAGGGTCAGTTCAAAATATAATGAGCCAACAATTGAGCAGCATAGGCTCTGGAGCTTCCACATCATCCCTCTCCCAGGCGATAGCAAATGCCGTATCTTCCGCAGTTCAAGGATCACAGGCAGCAGCTCCAGGACAGGAACAATCTATTGCACAAAGAGTAAATTCAGCCATTTCCTCCGCTTTCGCACAATTGATTTCCCAGAGAACCGCACCGGCTCCGGCCCCGAGACCCAGACCAGCTCCATTGCCTGCTCCAGCTCCAAGGCCCAGACCAGCACCTGCTCCACGACCAGCACCANTTTATGCACCAGCGCCAGTTGCTTCGCAATTTCAGGCGTCTGCTTCCAGTCAATCTTCGGCTCAACAGAATTCCTTCACTCAGTCATCAGTTGCTCAGCAATCAGCAGTTGCCCAACAATCCTCAGTTTCTCAACAATCCTCAGCTGCTCAACAGTCATCAGTTGCTCAATCGCAACAAACATCTTACTCTGCAGCAACAAATGCCGGTTCGAGTGTCTCGCAGTCTCAAGCTATTGTCTCAAGTGCCCCTGTGTACTTCAACTCGCAAANTTTGACAAACAACTTGGCTTCCTCTCTGCAATCACTGAATGCTCTTAATTACGTATCGAATGGTCAATTGAGTTCCTCGGATGTCGCTTCCACTGTTGCTAGAGCTGTAGCCCAGTCCCTCGGATTGTCACAAGGGTCAGTTCAAAATATAATGAGCCAACAATTGAGCAGCATAGGCTCTGGAGCTTCCACATCATCCCTCTCCCAGGCGATAGCAAATGCCGTATCTTCCGCAGTTCAAGGATCACAGGCAGCAGCTCCAGGACAGGAACAATCTATTGCACAAAGAGTAAATTCAGCCATTTCCTCCGCTTTCGCACAATTGATTTCCCAGAGAACCGCACCGGCTCCGGCCCCGAGACCCAGACCAGCTCCATTGCCTGCTCCAGCTCCAAGGCCCAGACCAGCACCTGCTCCACGACCAGCACCAGTTTATGCACCAGCGCCAGTTGCTTCGCAATTTCAGGCGTCTGCTTCCAGTCAATCTTCGGCTCAAGAGAATTCCTTCACTCAGTCATCAGTTGCTCAGCAATCAGCAGTTGCCCAACAATCCTCAGTTTCTCAACAATCCTCAGCTGCTCAACAGTCATCAGTTGCTCAATCACAACAAACATCTTACTCTGCAGCAACAAATGCCGGTTCGAGTGTCTCGCAGTCTCAAGCTATTGTCTCAAGTGCCCCTGTGTACTTCAA
CTCGCAAACTTTGACAAACAACTTGGCTTCCTCTCTGCAATCACTGAATGCTCTTAATTACGTATCGAATGGTCAATTGAGTTCCTCGGATGTCGCTTCCACTGTTGCTAGAGCTGTAGCCCAGTCCCTCGGATTGTCACAAGGGTCAGTTCAAAATATAATGAGCCAACAATTGAGCAGCATAGGCTCTGGAGCTTCCACATCATCCCTCTCCCAGGCGATAGCAAATGCCGTATCTTCCGCAGTTCAAGGATCACAGGCAGCAGCTCCAGGACAGGAACAATCTATTGCACAAAGAGTAAATTCAGCCATTTCCTCCGCTTTCGCACAATTGATTTCCCAGAGAACCGCACCGGCTCCGGCCCCGAGACCCAGACCAGCTCCATTGCCTGCTCCAGCTCCAAGGCCCAGACCAGCACCTGCTCCACGACCAGCACCAGTTTATGCACCAGCGCCAGTTGCTTCGCAATTTCAGGCGTCTGCTTCCAGTCAATCTTCGGCTCAAGAGAATTCCTTCACTCAGTCATCAGTTGCTCAGCAATCAGCAGTTGCCCAACAATCCTCAGTTTCTCAACAATCCTCAGCTGCTCAACAGTCATCAGTTGCTCAATCGCAACAAACATCTTACTCTGCAGCAACAAATGCCGGTTCGAGTGTCTCGCAGTCTCAAGCTATTGTCTCAAGTGCCCCTGTGTACTTCAACTCGCAAACTTTGACAAACAACTTGGCTTCCTCTCTGCAATCACTGAATGCTCTTAATTACGTATCGAATGGTCAATTGAGTTCCTCGGATGTCGCTTCCACTGTTGCTAGAGCTGTAGCCCAGTCCCTCGGATTGTCACAAGGGTCAGTTCAAAATATAATGAGCCAACAATTGAGCAGCATAGGCTCTGGAGCTTCCACATCATCCCTCTCCCAGGCGATAGCAAATGCCGTATCTTCCGCAGTTCAAGGATCACAGGCAGCAGCTCCAGGACAGGAACAATCTATTGCACAAAGAGTAAATTCAGCCATTTCCTCCGCTTTCGCACAATTGATTTCCCAGAGAACCGCACCGGCTCCGGCCCCGAGACCCAGACCAGCTCCATTGCCTGCTCCAGCTCCAAGGCCCAGACCAGCACCTGCTCCACGACCAGCACCAGTTTATGCACCAGCGCCAGTTGCTTCGCAATTTCAGGCGTCTGCTTCCAGTCAATCTTCGGCTCAAGAGAATTCCTTCACTCAGTCCCAACAATCCTCAGTTTCTCAACCTCAACAGTCATCAGTTGCTCAATCGCAACAAACATCTTACTCTGCAGCAACAAATGCCGGTTCGAGTGTCTCGCAGTCTCAAGCTATTGTCTCAAGTGCCCCTGTGTACTTCAACTCGCAA
LC570228Osyb
GCTAATTCATACAGGGCCAATTACAATACTTTCCAGCAGTCTGTAGCATCTGCTTTTGCAACTTCGCGTTCGTTTAGTTCTTTGAACACGCAAGTAGTCAGAACGGAAGATGTCAGAAACGTATTAAGTAGCGTTCTGCAGAGAAGAGGCATTTCATCTTCCGCTATCCAAAGTGCAATTAGTAGAATTAATCTAAGCGCTGGATCGTCTGTCGGAGCTTATTCTCAATCTATTTCTTCGGCATTAACGGCAGCAATGCAACAGAGCAGTATGCTATCTTCTGGACAGGAGCAAAGCATGGCAAGCTCGATTGCAACCGAAGTGATGCAAAGTTTGCTTCACATATCAACGCAGAAATCTCGTCCAGCAGCACCACGTCCCGCTCCCCTTCCCAGACCAGCCCCGAGACCTATGCCCCGTCCAATGCCTGCACCGATGCCAGTACAACAAACCCAAGTTATGCAATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACCGCATCACAGGCTACTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACCGCATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACTGCATCACAGGTCTCTTCCACAAGTCAAGTATCTTCAGGATCTAGCTCTTTGAGTGCTAATTCATACAGGGCCAATTACAATACTTTCCAGCAGTCTGTAGCATCTGCTTTTGCAACTTCGCGTTCGTTTAGTTCTTTGAACACGCAAGTAGTCAGAACGGAAGATGTCAGAAACGTATTAAGTAGCGTTCTGCAGAGAAGAGGCATTTCATCTTCCGCTATCCAAAGTGCAATTAGTAGAATTAATCTAAGCGCTGGATCGTCTGTCGGAGCTTATTCTCAATCTATTTCTTCGGCATTAACGGCAGCAATGCAACAGAGCAGTATGCTATCTTCTGGACAGGAGCAAAACATGGGAGGCATGATTGCAACCGAAGTGATGCAAAGTTTGCTTCAAATATCAACGCAGAAATCTCGTCCAGCAGCACCACGTCCCGCTCCCCTTCCCAGACCAGCCCCGAGACCTATGCCCCGTCCAATGCCTGCACCGATGCCAGTACAACAAACCCAAGTTATGCAATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACCGCATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACTGCATCACAGATCTCTTCCACAAGTCAAGTATCTTCAGGATCTAGCTCTTTGAGTGCTAATTCATACAGGGCCAATTACAATACTTTCCAGCAGTCTGTAGCATCTGCTTTTGCAACTTCGCGTTCGTTTAGTTCTTTGAACACGCAAGTAGTCAGAACGGAAGATGTCAGAAACGTATTAAGTAGCGTTCTGCAGAGAAGAGGCATTTCATCTTCCGCTATCCAAAGTGCAATTAGTAGAATTAATCTAAGCGCTGGATCGTCTGTCGGAGCTTATTCTCAATCTATTTCTTCGGCATTAACGGCAGCAATGCAACAGAGCAGTATGCTATCTTCTGGACAGGAGCAAAGCATGGCAAGCTCGATTGCAACCGAAGTGATGCAAAGTTTGCTTCACATATCAACGCAGAAATCTCGTCCAGCAGCACCACGTCCCGCTCCCCTTCCCAGACCAGCCCCGAGACCTATGCCCCGTCCAATGCCTGCACCGATGCCAGTACAACAAACCCAAGTTATGCAATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACCGCATCACAGGCTACTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACCGCATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACTGCATCACAGGTCTCTTCCACAAGTCAAGTATCTTCAGGATCTAGCTCTTTGAGTGCTAATTCATACAGGGCCAA
TTACAATACTTTCCAGCAGTCTGTAGCATCTGCTTTTGCAACTTCGCGTTCGTTTAGTTCTTTGAACACGCAAGTAGTCAGAACGGAAGATGTCAGAAACGTATTAAGTAGCGTTCTGCAGAGAAGAGGCATTTCATCTTCCGCTATCCAAAGTGCAATTAGTAGAATTAATCTAAGCGCTGGATCGTCTGTCGGAGCTTATTCTCAATCTATTTCTTCGGCATTAACGGCAGCAATGCAACAGAGCAGTATGCTATCTTCTGGACAGGAGCAAAACATGGGAGGCATGATTGCAACCGAAGTGATGCAAAGTTTGCTTCAAATATCAACGCAGAAATCTCGTCCAGCAGCACCACGTCCCGCTCCCCTTCCCAGACCAGCCCCGAGACCTATGCCCCGTCCAATGCCTGCACCGATGCCAGTACAACAAACCCAAGTTATGCAATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACCGCATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACTGCATCACAGGTCTCTTCCACAAGTCAAGTATCTTCAGGATCTAGCTCTTTGAGTGCTAATTCATACAGGGCCAATTACAATACTTTCCAGCAGTCTGTAGCATCTGCTTTTGCAACTTCGCGTTCGTTTAGTTCTTTGAACACGCAAGTAGTCAGAACGGAAGATGTCAGAAACGTATTAAGTAGCGTTCTGCAGAGAAGAGGCATTTCATCTTCCGCTATCCAAAGTGCAATTAGTAGAATTAATCTAAGCGCTGGATCGTCTGTCGGAGCTTATTCTCAATCTATTTCTTCGGCATTAACGGCAGCAATGCAACAGAGCAGTATGCTATCTTCTGGACAGGAGCAAAGCATGGCAAGCTCGATTGCAACCGAAGTGATGCAAAGTTTGCTTCACATATCAACGCAGAAATCTCGTCCAGCAGCACCACGTCCCGCTCCCCTTCCCAGACCAGCCCCGAGACCTATGCCCCGTCCAATGCCTGCACCGATGCCAGTACAACAAACCCAAGTTATGCAATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACCGCATCACAGGCTACTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACCGCATCACAGGCTGCTGCAGCATCATACCAAAGTGCCGCAACTTCTGCATCGACTGCATCACAGGTCTCTTCCACAAGTCAAGTATCTTCAGGATCTAGCTCTTTGAGTGCTAATTCATACAGGGCCAATTACAATACTCAGTCTGTAGCATCTGCTTTTGCAACTTCGCGTTCCAGAACGGAAGATGTCAGAAACGTT
MH376748Aven
GCGAGTGCGGTAGCACCGTCACTTGGAGTGTCCCAAGCGTCGGTTCAAAATAGTATAAGCCAACAGTTGAGAAGCGTAGGGCCCGGATCTTCCACGTCCTCTGTCGCTCAAGCAATAGCAAATGGAGTGGCTAACGCAGTTGGAGCATCAGGAACAGGAGTTGCAGGACAAGAACAATCTATTTCACAATCCATATATACTTCAGTTTCCACTGCTCTATCTCAACTGGCAGCACCGGCTCCAGCACCTGCACCTAGACCAGCTCCTCGACCACTACCAGCCCCAATTCAAGCCCCAAGACCAGCACCCGCACCACAACCTGCACCGGTTTACGCACCAGCCCCAGTCGTTTCACAAGTTCAGGCAACTTCTTCCTCTCAAGCCTCGGCTCAACAGAGTGCCTTCGCACAGTCCCAACAATCTTCAGTTGTTCAATCTCAACAAAGCTCAAACGCTTATTCTGCAGCTTCATCTGTAGGCTCAAGTTTTTCGCAGTCTCAGGGGACTGTCCCAAGCGCTCCTGTTTATTTTAACACGCAAACTTTAAGTAGCAGCCTGTCTTCTTCTCTGCAATCACTCAGTGCACTCAATTCGATAGCGAGTGGTCAACTGAGCTCCTCGAATGCCGCTTCTATTATAGCGAGTGCAGTGGCACGGTCACTTGGAGTGTCCCAAGCGTCGGTTCAAAATAGTATAAGCCAACAGTTGAGAAGCGTAGGGCCCGGATCTTCCACGTCCTCTGTCGCTCAAGCAATAGCAAATGGAGTGGCTAACGCAGTTGGAGCATCAGGAACTGGAGTTGCAGGACAAGAACAATCTATTTCACAATCCATATATACTTCAATTTCCACTGCTCTTTCTCAACTGGCAGCACCGGCTCCAGCACCTGCACCGAGACTTGCTCCTAGACCACTACCAGCCCCAATTCAAGCCCCAAGACCAGCACCCGCACCACAACCTGCACCGGTTTACGCACCAGCCCCAGTCGTTTCACAAGTTCAGGCAACTTCTTCCTCTCAAGCTTCGGCTCAACAGAGTGCCTTCGCACAGTCCCAGCAATCTTCAGTTGCACAATCTCAACAAAGCTCAAACGTTGATTCTGCAGCTTCATCTGTAGGCTCAAGTTTTTCGCAGTCTCAGGGGACTGTCCCAAGCGCTCCTGTTTATTTTAACACGCAAACTTTAAGTAGCAGCCTGTCTTCTTCTCTGCAATCACTCAGTGCACTCAATTCGATAGCGAGTGGTCAACTGAGCTCATCGTATGCCGATTCTATTTTAGCGAGTGCAGTGGCTCGGTCTCTTGGAGTTTCCCAAGCGTCGGTTCTAAATAGTATAAGCCAACAGTTGAGAAGCGTAGGGCCCGGATCTTCCACGTCCTCTGTCGCTCAAGCAATAGCAAATGGAGTGGCTAACGCAGTTGGAGCATCAGGAACAGGAGTTGCAGGACAAGAACAATCTATTTCACAATCCATATATACTTCAGTTTCCACTGCTCTTTCTCAACTGGCAGCACCGGCTCCAGCACCTGCACCGAGACTTGCTCCTAGACCACTACCAGCCCCAATTCAAGCCCCAAGACCAGCACCCGCACCACAACCTGCACCGGTTTACGCACCAGCCCCAGTCGTTTCACAAGTTCAGGCAACTTCTTCCTCTCAAGCCTCGGCTCAACAGAGTGCCTTCGCACAGTCCCAACAATCTTCAGTTGTTCAATCTCAACAAAGTTCAAACGCTTATTCTGCAGCATCAACTGCCGGTTCAAGTGTGTCGCAATCTCAGGCGATTGTCTCAAGCGCTCCTGATTATTTTAACACGCAAACTTTAAGTAGCAGCCTGTCTTCTTCTCAGCAATCACTCAGTGCACTCAATTCGATAGCGAGTGGTCAACTGAGCTCCTCGAATGCCGCTTCTATTTTAGCGAGTGCAGTGGCACGGTCACTTGTAGTGTCCCACGCGTCGGTTCAAAATAGTATAAGCCAACAGTTGAGAAGCGTAGGGCC
CGGATCTTCCACGTCCTCTGTCGCTCAAGCAATAGCAAATGGAGTGGCTAACGCAGTTGGAGCATCAGGAACTGGAGTTGCAGGACAAGAACAATCTATTTCACAATCCATATATACTTCAATTTCCACTGCTCTTTCTCAACTGGCAGCACCGGCTCCAGCACCTGCACCGAGACTTGCTCCTAGACCACTACCAGCCCCAATTCAAGCCCCAAGACCAGCACCCGCACCACAACCTGCACCGGTTTACGCACCAGCCCCAGTCGTTTCACAAGTTCAGGCAACTTCTTCCTCTCAAGCCTCGGCTCAACAGAGTGCCTTCGCACAGTCCCAACAATCTTCAGTTGTTCAATCTCAACAAAGTTCAAACGCTTATTCTGCAGCATCAACTGCCGGTTCAAGTGTGTCGCAATCTCAGGCGATTGTCTCAAGCGCTCCTGATTATTTTAACACGCAAACTTTAAGTAGCAGCCTGTCTTCTTCTCAGCAATCACTCAGTGCACTCAATTCGATAGCGAGTGGTCAACTGAGCTCCTCGAATGCCGCTTCTATTTTAGCGAGTGCAGTGGCACGGTCACTTGGAGTGTCCCAAGCGTCGGTTCAAAATAGTATAAGCCAACAGTTGAGAAGCGTAGGGCCCGGATCTTCCACGTCCTCTGTCGCTCAAGCAATAGCAAATGGAGTGGCTAACGCAGTTGGAGCATCAGGAACTGGAGTTGCAGGACAAGAACAATCTATTTCACAATCCATATATACTTCAATTTCCACTGCTCTTTCTCAACTGGCAGCACCGGCTCCAGCACCTGCACCGAGACTTGCTCCTAGACCACTACCAGCCCCAATTCAAGCCCCAAGACCAGCACCCGCACCACAACCTGCACCGGTTTACGCACCAGCCCCAGTCGTTTCACAAGTTCAGGCAACTTCTTCCTCTCAAGCCTCGGCTCAACAGAGTGCCTTCGCACAGTCCCAACAATCTTCAGTTGTTCAATCTCAACAAAGTTCAAACGCTTATTCTGCAGCATCAACTGCCGGTTCAAGTGTGTCGCAATCTCAGGCGATTGTCTCAAGCGCTCCTGATTATTTTAACACGCAAACTTTAAGTAGCAGCCTGTCTTCTTCTCAGCAATCACTCAGTGCACTCAATTCGATAGCGAGTGGTCAACTGAGCTCCTCGAATGCCGCTTCTATTTTAGCGAGTGCAGTGGCACGGTCACTTGGAGTGTCCCAAGCGTCGGTTCAAAATAGTATAAGCCAACAGTTGAGAAGCGTAGGGCCCGGATCTTCCACGTCCTCTGTCGCTCAAGCAATAGCAAATGGAGTGGCTAACGCAGTTGGAGCATCAGGAACTGGAGTTGCAGGACAAGAACAA
MN704282Aven_2
GCACAACGGGCAGCACAGGAAGCAGCACAACGCGCAGCTCAGCAAGCATCAGCACAACGGGCAGCTCAAGAAGCAGCAGCACAATTGGCAGCTCAGCAAGCAGCAGCACAAAGGGCAGCACAGGAAGCAGCAGCACAACGCGCAGCTCAGCAAGCAGCACAACGCGCAGCACAGCAAGCAGCAGCACAACGGGCCGCTCAGGAAGCGGCAGCTCAACGGGCTGCACAACAAGCAGCAGCTCAAAGGGCAGCGCAGCAAGCAGCAGCTCAAAGAGCAGCTCAGCAAGCAGCAGCTCAAAGCGCAGCTCAGCAAGCAGCAGCTCAACAAGCATCATCCTTCGCACAGTCACAACAATCCTCTGTTGTTCAATCTCAACAAAGTTCAAATGCTTACTCTGCAGCATCAACCTCGGGTTCCAGTGTGTCTCAGTCTCAAGCAATTGTCTCTAGCGCACCTACATATTCCAACACGCAGACGGTCAGTAGCAGCCTGTATTCTTCCCTGCAATCGCAAGCAAGCAGTGCACTAAATTTAATATCGACCGGTCAAGTGAGCTCAACCAGTGCCGCTTCCGCTATAGCGAGTGCCATTGCGCAGTCACTTGGAATTTCCCAGTCAACAGCACAAAATATCATTAGCCAACAATTGAGCAACGTACGAGTCGGATCCTCTACCTCGGCTATAGCTCAAGCCTTATCAAGTGCAATATCTTCCGTAATTGCATCATCAGGATCTTATGTTGCAGGACAGGAGCAATCTATTTCGCAAACCGTATCATCTGCGATTTCTTCTGCTCTATCCCAAATATCAGGACCGGCTCCAGCGCCATCACCGATTTTGGCACCTCGACCACTACCTGCTCCAATACCTGCACCTGCTCCCAGACCAGTCCTCGCACCAGCCGTTTCACAATCTCAGGCAATTTCTGCTTCTTCTTCTTCTTCTCAGGCCACGGCTCAACAAAGTTCTTTCGCACAATCCCAACAATCCTCTGTTGTTCAATCCCAACAATCCTCTGTTATATCTCAACAAAGTTCAAACGCTTACACTGCAGCATCAACCTCGGGTTCCAGTGTGTCTCAGTCTCAAGCAACTGTCTCGAGCGCACCTACATATTCCAACACGCAGACGGTCAGTAGCAGCCTGTATTCTTCCCTTCAATCGCAAGCAAGCAGTGCACTGAATTTAATATCGACCGGTCAAGTGAGCTCAACCAGTGCCGCTTCCGCTATAGCGAGTGCCATTGCGCAGTCACTTGGAATTTCCCAGTCAACAGCACAAAATATCATTAGCCAGCAATTGAGCAACGTACGAGTCGGATCCTCTACCTCGGCTATAGCTCAAGCCTTATCAAGTGCAATATCTTCCGTAATTGCATCGTCAGGATCTTATTCTGCAGGACAGGAGCAATCTATTTCACAAACCGTATCATCTGCGATTTCCTCTGCTCTATCCCAAATATCAGGACCGGCTCCAGCGCCTGCACCGATTTCGGCTCTTGCACCACAACCAACTCCAATTTACGCACCAGCCCCAGTAATTTCCCAAGTTCAGGCAACTTCCTCTTCTTCTCAAACTTTGGCTGAACAGAGTTCTATCACACAGTCTCAGCAGTCTTCCTTTGCTCAAACTCAACAAAGTGCAAACGCTTATTCTGCAGCAGCAGCTCAACAGGCAGCAGCACAACAGGCAGCAGCTCAACAGGCAGCAGCACAACAGGCAGCAGCTCAACAGGCAGCAGCTCAACAGGCAGCAGCACAACAGGCAGCAGCACAACGGGCAGCTCAGCAAGCAGCGGCACAGCAGGCAGCACAACAAGCAGCAGCACAAAGGGCGGCTCAACAAGCAGCAGCACAAAGGGCGGCTCAACAAGCAGCAGCACAAAAAGCGGCTCAACAAGCAGCAGCACAAAGGGCGGCTCAACAAGCAGCAGCACAAAGGGCGGCTCAACAAGCAGCAGCACAAAGGGCGGCTCAACAAGCAGCAGCACAAAGGGCGGC
TCAACAAGCAGCAGCACAAAGGGCGGCTCAACAAGCAGCAGCACAAAGGGCTGCTCAACAAGCAGCAGCACAACGGGCAGCTCAGGAAGCAGCAGCTCAAAGGGCAGCTCAGGAAGCAGCAGCTCAAAGGGCTGCTCAGCAAGCAGCGGCACAAAGGGCAGCACAACAAGCAGCGGCTCAAAGGGCAGCACAACAAGCAGCAGCTCAAAGGGCAGCTCAACAAGCAGCAGCTCAAAGGGCAGCCCAGCAGGCGGCAGCTCAACAGGCAGCACAACAAGCAGCAGCACAAAGGGCAGCGCAGCAGGCGGCCGCTCAAAGGGCAGCGCAGCAAGCAGCAGCTCAAAGAGCAGCTCAGCGAGCAGCAGCTCAACAAGCATCATCCTTCGCACAGTCACAACAATCCTCTGTTGTTCAATCTCAACAAAGTTCAAATGCTTACTCTGCAGCATCAACCTCGGGTTCCAGTGTGTCTCAGTCTCAAGCAATTGTCTCTAGCGCACCTACATATTCCAACACGCAGACAGTCAGTAGCAGCCTGTATTCTTCCCTGCAATCGCAAGCAAGCAGTGCACTAAATTTAATATCGACCGGTCAAGTGAGCTCAACCAGTGCCGCTTCCGCTATAGCGAGTGCCATTGCGCAGTCACTTGGAATTTCCCAGTCAACAGCACAAAATATCATTAGCCAACAATTGAGCAACGTACGAGTCGGATCCTCTACCTCGGCTATAGCTCAAGCCTTATCAAGTGCAATATCTTCCGTAATTGCATCATCAGGATCTTATGTTGCAGGACAGGAGCAATCTATTTCGCAAACCGTATCATCTGCGATTTCTTCTGCTCTATCCCAAATATCAGGACCGGCTCCAGCGCCATCACCGATTTCGGCACCTCGACCACTACCTGCTCCAATACCTGCACCTGCTCCACGACCAGTCCTCGCACCACAAACTGCTTCAGTCACTAGTTACGCACCAGTTGTTTCACAATCTCAGGCAATTTCTTCTTCTCAGGCCTCGGCTCAACAGAGTTCTTTCGCACAATCCCAACAGTCCTCTGTTGTTCAATCCCAACAATCCTCTGTTCAATCTCAACAAAGTTCAAACGCGTATTCTGCAGCATCAACCTCGGGTTCCAGTGTGTCTCAGTCTCAAGCAATTGTCTCTAGCGCACCTACATATTCCAACACGCAGACAGTCAGTAGCAGCCTGTATTCTTCCCTGCAATCGCAAGCAAGCAGTGCACTAAATTTAATATCGACCGGTCAAGTGAGCTCAACCAGTGCCGCTTCCGCTATAGCGAGTGCCATTGCGCAGTCACTTGGAATTTCCCAGTCAACAGCACAAAATATCATTAGCCAACAATTGAGCAAC
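Before handing the alignment to codeml, it can help to confirm that the header matches the data. The following is a minimal sketch, assuming the layout used above (a header line with the sequence count and site count, then each name on its own line followed by one or more sequence lines). The function names `read_sequential_phylip` and `check_alignment` are made up for this illustration and are not part of PAML.

```python
def read_sequential_phylip(text):
    """Parse a PHYLIP-like file in which each name sits on its own line."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    n_seq, n_sites = (int(x) for x in lines[0].split()[:2])
    seqs, name = {}, None
    for ln in lines[1:]:
        # A new record starts once the previous sequence has reached n_sites.
        if name is None or len(seqs[name]) >= n_sites:
            name = ln
            seqs[name] = ""
        else:
            seqs[name] += ln.replace(" ", "")
    return n_seq, n_sites, seqs


def check_alignment(text):
    """Return a list of human-readable problems (empty if none were found)."""
    n_seq, n_sites, seqs = read_sequential_phylip(text)
    problems = []
    if len(seqs) != n_seq:
        problems.append(f"expected {n_seq} sequences, found {len(seqs)}")
    for name, seq in seqs.items():
        if len(seq) != n_sites:
            problems.append(f"{name}: length {len(seq)} != {n_sites}")
    # Codon models (seqtype = 1) read the alignment in triplets.
    if n_sites % 3 != 0:
        problems.append(f"{n_sites} sites is not a multiple of 3")
    return problems


if __name__ == "__main__":
    toy = "2 6\nA\nATGAAA\nB\nATG\nCCC\n"  # toy data, not the real alignment
    print(check_alignment(toy))
```

To check the real file, read `nuc.txt` into a string and pass it to `check_alignment`.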
2: Null hypothesis
## Tree file
5 1
((MH376748Aven,KY398016Aarg),(LC570228Osyb,FJ973621Lhes) ,MN704282Aven_2);
codeml.ctl
seqfile = /home/spider/project/yuantao/test3/10_positive_selection/nuc.txt * sequence data filename
treefile = /home/spider/project/yuantao/test3/10_positive_selection/frist.tree * tree structure file name
outfile = /home/spider/project/yuantao/test3/10_positive_selection/result0.txt * main result file name
noisy = 9 * 0,1,2,3,9: how much rubbish on the screen
verbose = 1 * 0: concise; 1: detailed, 2: too much
runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic
* 3: StepwiseAddition; (4,5):PerturbationNNI; -2: pairwise
seqtype = 1 * 1:codons; 2:AAs; 3:codons-->AAs
CodonFreq = 2 * 0:1/61 each, 1:F1X4, 2:F3X4, 3:codon table
* ndata = 10
clock = 0 * 0:no clock, 1:clock; 2:local clock; 3:CombinedAnalysis
aaDist = 0 * 0:equal, +:geometric; -:linear, 1-6:G1974,Miyata,c,p,v,a
aaRatefile = dat/jones.dat * only used for aa seqs with model=empirical(_F)
* dayhoff.dat, jones.dat, wag.dat, mtmam.dat, or your own
model = 0
* models for codons:
* 0:one, 1:b, 2:2 or more dN/dS ratios for branches
* models for AAs or codon-translated AAs:
* 0:poisson, 1:proportional, 2:Empirical, 3:Empirical+F
* 6:FromCodon, 7:AAClasses, 8:REVaa_0, 9:REVaa(nr=189)
NSsites = 0 * 0:one w;1:neutral;2:selection; 3:discrete;4:freqs;
* 5:gamma;6:2gamma;7:beta;8:beta&w;9:beta&gamma;
* 10:beta&gamma+1; 11:beta&normal>1; 12:0&2normal>1;
* 13:3normal>0
icode = 0 * 0:universal code; 1:mammalian mt; 2-10:see below
Mgene = 0
* codon: 0:rates, 1:separate; 2:diff pi, 3:diff kapa, 4:all diff
* AA: 0:rates, 1:separate
fix_kappa = 0 * 1: kappa fixed, 0: kappa to be estimated
kappa = 2 * initial or fixed kappa
fix_omega = 0 * 1: omega or omega_1 fixed, 0: estimate
omega = .4 * initial or fixed omega, for codons or codon-based AAs
fix_alpha = 1 * 0: estimate gamma shape parameter; 1: fix it at alpha
alpha = 0. * initial or fixed alpha, 0:infinity (constant rate)
Malpha = 0 * different alphas for genes
ncatG = 8 * # of categories in dG of NSsites models
getSE = 0 * 0: don't want them, 1: want S.E.s of estimates
RateAncestor = 1 * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)
Small_Diff = .5e-6
cleandata = 1 * remove sites with ambiguity data (1:yes, 0:no)?
* fix_blength = 1 * 0: ignore, -1: random, 1: initial, 2: fixed, 3: proportional
method = 0 * Optimization method 0: simultaneous; 1: one branch a time
* Genetic codes: 0:universal, 1:mammalian mt., 2:yeast mt., 3:mold mt.,
* 4: invertebrate mt., 5: ciliate nuclear, 6: echinoderm mt.,
* 7: euplotid mt., 8: alternative yeast nu. 9: ascidian mt.,
* 10: blepharisma nu.
* These codes correspond to transl_table 1 to 11 of GENEBANK.
3: Alternative hypothesis
5 1
((MH376748Aven,KY398016Aarg),(LC570228Osyb,FJ973621Lhes) $1,MN704282Aven_2);
seqfile = /home/spider/project/yuantao/test3/10_positive_selection/nuc.txt * sequence data filename
treefile = /home/spider/project/yuantao/test3/10_positive_selection/frist.tree * tree structure file name
outfile = /home/spider/project/yuantao/test3/10_positive_selection/result2.txt * main result file name
noisy = 9 * 0,1,2,3,9: how much rubbish on the screen
verbose = 1 * 0: concise; 1: detailed, 2: too much
runmode = 0 * 0: user tree; 1: semi-automatic; 2: automatic
* 3: StepwiseAddition; (4,5):PerturbationNNI; -2: pairwise
seqtype = 1 * 1:codons; 2:AAs; 3:codons-->AAs
CodonFreq = 2 * 0:1/61 each, 1:F1X4, 2:F3X4, 3:codon table
* ndata = 10
clock = 0 * 0:no clock, 1:clock; 2:local clock; 3:CombinedAnalysis
aaDist = 0 * 0:equal, +:geometric; -:linear, 1-6:G1974,Miyata,c,p,v,a
aaRatefile = dat/jones.dat * only used for aa seqs with model=empirical(_F)
* dayhoff.dat, jones.dat, wag.dat, mtmam.dat, or your own
model = 2
* models for codons:
* 0:one, 1:b, 2:2 or more dN/dS ratios for branches
* models for AAs or codon-translated AAs:
* 0:poisson, 1:proportional, 2:Empirical, 3:Empirical+F
* 6:FromCodon, 7:AAClasses, 8:REVaa_0, 9:REVaa(nr=189)
NSsites = 0 * 0:one w;1:neutral;2:selection; 3:discrete;4:freqs;
* 5:gamma;6:2gamma;7:beta;8:beta&w;9:beta&gamma;
* 10:beta&gamma+1; 11:beta&normal>1; 12:0&2normal>1;
* 13:3normal>0
icode = 0 * 0:universal code; 1:mammalian mt; 2-10:see below
Mgene = 0
* codon: 0:rates, 1:separate; 2:diff pi, 3:diff kapa, 4:all diff
* AA: 0:rates, 1:separate
fix_kappa = 0 * 1: kappa fixed, 0: kappa to be estimated
kappa = 2 * initial or fixed kappa
fix_omega = 0 * 1: omega or omega_1 fixed, 0: estimate
omega = .4 * initial or fixed omega, for codons or codon-based AAs
fix_alpha = 1 * 0: estimate gamma shape parameter; 1: fix it at alpha
alpha = 0. * initial or fixed alpha, 0:infinity (constant rate)
Malpha = 0 * different alphas for genes
ncatG = 8 * # of categories in dG of NSsites models
getSE = 0 * 0: don't want them, 1: want S.E.s of estimates
RateAncestor = 1 * (0,1,2): rates (alpha>0) or ancestral states (1 or 2)
Small_Diff = .5e-6
cleandata = 1 * remove sites with ambiguity data (1:yes, 0:no)?
* fix_blength = 1 * 0: ignore, -1: random, 1: initial, 2: fixed, 3: proportional
method = 0 * Optimization method 0: simultaneous; 1: one branch a time
* Genetic codes: 0:universal, 1:mammalian mt., 2:yeast mt., 3:mold mt.,
* 4: invertebrate mt., 5: ciliate nuclear, 6: echinoderm mt.,
* 7: euplotid mt., 8: alternative yeast nu. 9: ascidian mt.,
* 10: blepharisma nu.
* These codes correspond to transl_table 1 to 11 of GENEBANK.
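The two control files above differ only in `outfile` (result0.txt vs. result2.txt) and `model` (0 vs. 2), so one can be derived from the other instead of being edited by hand. The following is a minimal sketch of that idea. The helper `set_ctl_options` is hypothetical and not part of PAML, and it assumes each setting sits on one `key = value` line, with anything after the value (such as a `*` comment) kept unchanged.

```python
import re


def set_ctl_options(ctl_text, **options):
    """Return ctl_text with the given 'key = value' settings replaced."""
    out = []
    for line in ctl_text.splitlines():
        # Lines starting with '*' are comments and never match this pattern.
        m = re.match(r"^(\s*)(\w+)(\s*=\s*)(\S+)(.*)$", line)
        if m and m.group(2) in options:
            indent, key, eq, _old, rest = m.groups()
            line = f"{indent}{key}{eq}{options[key]}{rest}"
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # A shortened stand-in for the null-hypothesis control file.
    null_ctl = (
        "outfile = result0.txt * main result file name\n"
        "model = 0\n"
        "NSsites = 0\n"
    )
    print(set_ctl_options(null_ctl, model=2, outfile="result2.txt"))
```

Writing each variant to its own file name keeps both settings on disk, so either run can be repeated later.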
4: Run the ctl files for the null and alternative hypotheses
/home/spider/project/yuantao/soft/paml4.9j/bin/codeml /home/spider/project/yuantao/soft/paml4.9j/codeml.ctl
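This produces two result files, result0.txt and result2.txt. The key values to look at are the log-likelihood (lnL), which is used for the likelihood ratio test, and omega (w), as shown below:
Null hypothesis (one omega value):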
lnL0(ntime: 7 np0: 9): -21384.472055 +0.000000
omega (dN/dS) = 0.86618
Alternative hypothesis (two omega values):
lnL2(ntime: 7 np2: 10): -21376.996759 +0.000000
w (dN/dS) for branches: 0.05555 3.45230
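These values can also be pulled out of the result files with a script rather than by eye. The following is a minimal sketch, assuming the result files contain lines shaped like the excerpts above. The regular expressions also accept the `lnL0`/`np0` style labels used in this post, and the function names `parse_lnl` and `parse_omegas` are hypothetical.

```python
import re

# Matches e.g. "lnL(ntime: 7 np: 9): -21384.472055" and the "lnL0 ... np0" variant.
LNL_RE = re.compile(
    r"lnL\w*\(ntime:\s*(\d+)\s+np\w*:\s*(\d+)\):\s*(-?\d+\.\d+)"
)


def parse_lnl(text):
    """Extract ntime, np and lnL from the first lnL line in the text."""
    m = LNL_RE.search(text)
    if m is None:
        raise ValueError("no lnL line found")
    return {"ntime": int(m.group(1)), "np": int(m.group(2)), "lnL": float(m.group(3))}


def parse_omegas(text):
    """Return the omega estimates as a list (one value, or one per branch class)."""
    m = re.search(r"w \(dN/dS\) for branches:\s*(.+)", text)
    if m:
        return [float(x) for x in m.group(1).split()]
    m = re.search(r"omega \(dN/dS\)\s*=\s*(\S+)", text)
    return [float(m.group(1))] if m else []


if __name__ == "__main__":
    null_out = "lnL0(ntime: 7 np0: 9): -21384.472055 +0.000000\nomega (dN/dS) = 0.86618\n"
    print(parse_lnl(null_out), parse_omegas(null_out))
```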
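The lnL values can be compared with a likelihood ratio test (a chi-square test) in R.
The test statistic 2 × (lnL2 − lnL0) approximately follows a chi-square distribution with df = np2 − np0, which gives the significance P value.
pchisq(2 * (-21376.996759 + 21384.472055), 10 - 9, lower.tail = FALSE)
The statistic is 14.950592 and the resulting P value is about 1.1e-04, well below the 10% significance level, so the null hypothesis is rejected in favor of the alternative hypothesis.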
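The same test can be reproduced outside R. The following is a minimal sketch using only the Python standard library, assuming one degree of freedom (np2 − np0 = 10 − 9 = 1) and the standard likelihood ratio statistic, which is twice the difference in log-likelihood. For one degree of freedom the chi-square upper tail equals erfc(sqrt(x / 2)); other values of df would need a statistics library. The function name `lrt_df1` is hypothetical.

```python
import math


def lrt_df1(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models that differ by one parameter.

    Returns (statistic, p_value), where statistic = 2 * (lnL_alt - lnL_null).
    """
    stat = 2.0 * (lnl_alt - lnl_null)
    # Upper tail of the chi-square distribution with df = 1.
    p_value = math.erfc(math.sqrt(max(stat, 0.0) / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # lnL values taken from result0.txt and result2.txt above.
    stat, p = lrt_df1(-21384.472055, -21376.996759)
    print(f"2*dlnL = {stat:.6f}, p = {p:.3g}")
```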
w (dN/dS) for branches: 0.05555 3.45230
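So, from the omega values under the alternative hypothesis, the dN/dS of PiSp2 is greater than that of PiSp1: PiSp1 < 1 indicates negative (purifying) selection, and PiSp2 > 1 indicates positive selection.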