美文网首页
signalp工具+ 我的脚本

signalp工具+ 我的脚本

作者: keaidelele | 来源:发表于2017-04-13 10:39 被阅读154次

    之前出现了一个bug
    http://stackoverflow.com/questions/38605052/signalp-error-message-cant-locate-fasta-pm-in-inc

    使用环境:ubuntu
    下载signalp:http://www.cbs.dtu.dk/download/EBC48D58-1F4D-11E7-A2FF-F5A645117D33/
    接着解压之后,查看readme,使用说明如下:

    1. Test SignalP on the 10 eukaryotic sequences shipped with the package: go to
      the signalp-4.1 directory on your system and issue the following commands:
        > ./signalp -t euk -f short test/euk10.fsa > euk10.fsa.short_out
        > ./signalp -t euk -f long test/euk10.fsa > euk10.fsa.long_out
        > ./signalp -t euk -f all test/euk10.fsa > euk10.fsa.all_out
        > ./signalp -t euk -f summary test/euk10.fsa > euk10.fsa.summary_out
    

    然后我们要修改一下singalp的配置,打开signalp,编辑第14行,改成你自己文件的目录

    BEGIN {
        #$ENV{SIGNALP} = '/usr/cbs/bio/src/signalp-4.1';
         $ENV{SIGNALP} = '/home/huangle/signalp-4.1';   
    }
    

    我的脚本是批处理序列的脚本,统计cutoff:

    import re
    import os
    filename = 'test.fas'
    directory = '/home/huangle/signalp-4.1/'
    filepath = directory + filename
    wfilepath = directory + filename+'.out'
    pfilepath =directory + filename + '.short_out'
    seq ={}#key: 'cazyme_name' value:  seq
    cutoff ={}#key: 'cazyme_name' value:  cutoff
    #shell exec
    
    os.system('./signalp -t euk -f short '+ filename +' > ' + pfilepath)
    
    
    #calculate the length of cazyme
    f = open(filepath)
    line = f.readline()
    lenn = 0
    line1 = re.split('\s',line)
    line1 = line1[0].split('>')
    key = line1[1]
    tag =0
    while line:
    
          line = f.readline()
          
          if line and line[0] == '>':
         seq[key] = lenn
         lenn=0
         line1 = re.split('\s',line)
         line1 = line1[0].split('>')
         key = line1[1]
          else:
               if line:
              lenn += len(line)-1
    
    seq[key] = lenn
    #print seq
    f.close()
    #calculate cutoff
    f = open(pfilepath)
    line = f.readline()
    line = f.readline()
    line = f.readline()
    while line:
        line = re.split('\s\s+',line)
        key = line[0]
        n1 = int(line[2])
        n2 = int(line[4])
        n3 = int(line[6])
        MAX = -1
        if(n1>n2):
            if(n1>n3):
                MAX = n1
            else:
                MAX = n3
        else:
            if(n2>n3):
                MAX = n2
            else:
                MAX = n3    
        cutoff[key] = MAX-1
        line = f.readline()
    #print cutoff
    f.close()
    #output
    fw =open(wfilepath,'w')
    title = "#cazyme_name   cayzme_length   Cleavage Site\n"
    fw.write(title)
    for (key,value) in seq.items():
        fw.write(key + '    ' + str(value) + '  ' + str(cutoff[key]) + '\n')
    fw.close()
    print "success"
    os.system('gedit ' + wfilepath)
    
    

    相关文章

      网友评论

          本文标题:signalp工具+ 我的脚本

          本文链接:https://www.haomeiwen.com/subject/wgfbattx.html