Author: 火卫控 | Source: published 2023-07-26 17:04

Reverse complement and translation of nucleotide sequences with Perl on Linux, set up as a command

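For example, to reverse-complement and translate the sequence ATCGCGATCGATCGT:
First run ./tranfa.pl -i
Then type the nucleotide sequence ATCGCGATCGATCGT and press Enter
Finally press Ctrl+D to signal the end of input; the results are then printed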

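We can use the alias command to define tfa as a shortcut for running the tranfa.pl script:
alias tfa='perl /mnt/g/Linux_docu/tranfa.pl'
The tfa command can then be used from any directory in the current shell session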

A sample run:

(base) root@DESKTOP-727JVLV:/mnt/g/Linux_docu/testpl/perl-scripts-master# tfa -i
Please input the nucleotide sequence,and end by ctrl+D.

ATCGCGATCGATCGT

###rc###
ACGATCGATCGCGAT

###protein###
ORF1:
IAIDR
ORF2:
SRSI
ORF3:
RDRS

###Length###
15
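
The three ORF lines in this run correspond to the three forward reading frames of the input sequence. As a small illustrative sketch (the helper name `codons_in_frame` is an assumption made here and is not part of tranfa.pl), the following splits the example sequence into complete codons for each frame. An incomplete trailing codon is dropped, which is also what the script's `trans` routine does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the complete codons of $seq, starting at $offset.
sub codons_in_frame {
    my ($seq, $offset) = @_;
    my $frame = substr($seq, $offset);
    # Cut into chunks of up to 3 letters, then keep only full codons.
    return grep { length($_) == 3 } unpack('(A3)*', $frame);
}

my $seq = 'ATCGCGATCGATCGT';
for my $offset (0 .. 2) {
    my @codons = codons_in_frame($seq, $offset);
    printf "ORF%d: %s\n", $offset + 1, join(' ', @codons);
}
```

For the example sequence this lists ATC GCG ATC GAT CGT for ORF1, TCG CGA TCG ATC for ORF2 and CGC GAT CGA TCG for ORF3, which translate to IAIDR, SRSI and RDRS as in the run above.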

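An alias defined on the command line is lost when the shell session ends (for example after Linux is restarted), so the alias line needs to be written into the bashrc file:
vi ~/.bashrc
Then make it take effect with source ~/.bashrc
After that, the tfa command is available in every new shell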
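
Instead of editing the file by hand, the alias line can also be appended by a small script. The following is only a sketch under stated assumptions: the helper name `ensure_line` is made up for illustration, the target file is passed as the first command-line argument, and the file is assumed to end with a newline. It adds the line only if an identical line is not already present, so running it twice does not duplicate the alias.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: append $line to $file unless an identical line exists.
# Returns 1 if the line was added, 0 if it was already present.
sub ensure_line {
    my ($file, $line) = @_;
    if (open my $in, '<', $file) {
        while (my $existing = <$in>) {
            chomp $existing;
            if ($existing eq $line) {
                close $in;
                return 0;
            }
        }
        close $in;
    }
    open my $out, '>>', $file or die "Cannot append to $file: $!";
    print {$out} "$line\n";
    close $out or die "Cannot close $file: $!";
    return 1;
}

# Nothing is modified unless a target file is given on the command line.
if (@ARGV) {
    my $alias = q{alias tfa='perl /mnt/g/Linux_docu/tranfa.pl'};
    my $added = ensure_line($ARGV[0], $alias);
    print $added ? "alias added to $ARGV[0]\n"
                 : "alias already present in $ARGV[0]\n";
}
```

The script would be called with the path of the bashrc file as its argument; source ~/.bashrc is still needed afterwards for the current shell to pick up the alias.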

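The Perl code is as follows.
The first line (the shebang) tells the system to run the script with perl; the -w switch enables warnings.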

#! /usr/bin/perl -w
use strict;
use Getopt::Long;
my ($i,$r,$p,$l);
GetOptions(
    "i!"=>\$i,
    "r!"=>\$r,
    "p!"=>\$p,
    "l!"=>\$l,
);
my $usage = "\nUsage: tfa -i [-r] [-p] [-l]   (the sequence is read from STDIN)\n";
die "$usage\n" unless $i;
print "Please input the nucleotide sequence,and end by ctrl+D.\n\n";
unless($r || $p || $l){
    ($r,$p,$l)=(1,1,1);
}
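# Slurp all of STDIN at once, then remove every whitespace character (including newlines).
my $fa = do { local $/; <STDIN> };
$fa = '' unless defined $fa;
$fa =~ s/\s+//g;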
die "$usage\n" unless $fa;

if($r){
    my $faout = reverse_complement($fa);
    $faout = out_fasta($faout,50);
    print "\n###rc###\n$faout\n";
}
if($p){
    my @fa_arr = cds2pep($fa);
    print "\n###protein###\n";
    $fa_arr[0] = out_fasta($fa_arr[0],50);
    print "ORF1:\n$fa_arr[0]\n";
    $fa_arr[1] = out_fasta($fa_arr[1],50);
    print "ORF2:\n$fa_arr[1]\n";
    $fa_arr[2] = out_fasta($fa_arr[2],50);
    print "ORF3:\n$fa_arr[2]\n";
}
if($l){
    my $len = length $fa;
    print "\n###Length###\n$len\n";
}   
#####################
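sub out_fasta{
        # Wrap $seq into lines of $num letters each.
        my ($seq,$num) = @_;
        my $len = length $seq;
        $seq =~ s/([A-Za-z]{$num})/$1\n/g;
        # If the length is an exact multiple of $num, drop the trailing newline.
        chop($seq) unless $len % $num;
        return $seq;
}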
#####################
sub reverse_complement{
        my ($seq)=shift;
        $seq=reverse$seq;
        $seq=~tr/AaGgCcTt/TtCcGgAa/;
        return $seq;
}
#####################
sub cds2pep{
    # Translate the three forward reading frames of $seq.
    my $seq=shift;
    ##phase0
    my $str0 = $seq;
    $str0 = trans($str0);
    ##phase1
    my $str1 = substr($seq,1);
    $str1 = trans($str1);
    ##phase2
    my $str2 = substr($seq,2);
    $str2 = trans($str2);
    return ($str0,$str1,$str2);
}
#####################
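sub trans{
    # Translate $seq codon by codon; codons missing from the table become "X".
    my $seq = shift;
    my $p = code();
    my $out = '';    # start from an empty string so short inputs do not return undef
    for(my $i=0;$i<length$seq;$i+=3){
        my $codon=uc(substr($seq,$i,3));
        last if (length$codon <3);
        $out.= exists $p->{"standard"}{$codon} ? $p->{"standard"}{$codon} : "X";
    }
    return $out;
}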
#####################
sub code{
        my $p={
                "standard" =>
                        {       
                                'GCA' => 'A', 'GCC' => 'A', 'GCG' => 'A', 'GCT' => 'A',                               # Alanine
                                'TGC' => 'C', 'TGT' => 'C',                                                           # Cysteine
                                'GAC' => 'D', 'GAT' => 'D',                                                           # Aspartic acid
                                'GAA' => 'E', 'GAG' => 'E',                                                           # Glutamic acid
                                'TTC' => 'F', 'TTT' => 'F',                                                           # Phenylalanine
                                'GGA' => 'G', 'GGC' => 'G', 'GGG' => 'G', 'GGT' => 'G',                               # Glycine
                                'CAC' => 'H', 'CAT' => 'H',                                                           # Histidine
                                'ATA' => 'I', 'ATC' => 'I', 'ATT' => 'I',                                             # Isoleucine
                                'AAA' => 'K', 'AAG' => 'K',                                                           # Lysine
                                'CTA' => 'L', 'CTC' => 'L', 'CTG' => 'L', 'CTT' => 'L', 'TTA' => 'L', 'TTG' => 'L',   # Leucine
                                'ATG' => 'M',                                                                         # Methionine
                                'AAC' => 'N', 'AAT' => 'N',                                                           # Asparagine
                                'CCA' => 'P', 'CCC' => 'P', 'CCG' => 'P', 'CCT' => 'P',                               # Proline
                                'CAA' => 'Q', 'CAG' => 'Q',                                                           # Glutamine
                                'CGA' => 'R', 'CGC' => 'R', 'CGG' => 'R', 'CGT' => 'R', 'AGA' => 'R', 'AGG' => 'R',   # Arginine
                                'TCA' => 'S', 'TCC' => 'S', 'TCG' => 'S', 'TCT' => 'S', 'AGC' => 'S', 'AGT' => 'S',   # Serine
                                'ACA' => 'T', 'ACC' => 'T', 'ACG' => 'T', 'ACT' => 'T',                               # Threonine
                                'GTA' => 'V', 'GTC' => 'V', 'GTG' => 'V', 'GTT' => 'V',                               # Valine
                                'TGG' => 'W',                                                                         # Tryptophan
                                'TAC' => 'Y', 'TAT' => 'Y',                                                           # Tyrosine
                                'TAA' => 'U', 'TAG' => 'U', 'TGA' => 'U'                                              # Stop (written as 'U' here; '*' is the more common symbol)
                         }
                ## more translate table could be added here in future
        };
        return $p;
}
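
The `out_fasta` routine above wraps a sequence with a regex substitution and then removes the trailing newline when the length is an exact multiple of the line width. As an alternative sketch of the same wrapping, the version below uses `unpack` and `join`, so no trailing newline is produced in the first place. The name `wrap_seq` is assumed for illustration and is not part of the script; the sketch also assumes the sequence contains no whitespace, which holds in the script because whitespace is stripped right after reading.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical alternative to out_fasta: split $seq into chunks of $width
# characters and join them with newlines (no trailing newline is added).
sub wrap_seq {
    my ($seq, $width) = @_;
    return '' unless defined $seq && length $seq;
    return join("\n", unpack("(A$width)*", $seq));
}

# Wrap the reverse complement from the sample run at 6 letters per line.
print wrap_seq('ACGATCGATCGCGAT', 6), "\n";
```

With a width of 6 the 15-letter sequence is printed as ACGATC, GATCGC and GAT on three lines.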


Article link: https://www.haomeiwen.com/subject/knvkpdtx.html