Paper translation helper: calling the clipboard and Google Translate from Python 3

Author: 铁佛爷 | Published: 2018-10-24 20:50


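My English is poor and reading papers is a struggle, so Google Translate and the Eudic dictionary (欧陆词典) are my good friends.
Copying a paragraph from a PDF into Google Translate is the thing I do most often.
But deleting the line breaks by hand every time is tedious.
So I wrote a small tool in Python.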

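Function: read the copied text from the Windows clipboard, clean up its formatting, call the Google Translate API, and return the result.
Environment: Windows 10, Python 3.5.2 |Anaconda 4.2.0 (64-bit)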

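The main program is Clipboard.py; run the tool from here. It reads and writes the clipboard and formats the text.
First, handle exceptions carefully when using the clipboard. Once opened, the clipboard must be closed with cb.CloseClipboard(), otherwise copy and paste stop working in other programs (if that happens, closing Python fixes it).
Second, mind the encoding. In Python 3 every str is Unicode, so read the clipboard with the format win32con.CF_UNICODETEXT, not win32con.CF_TEXT. The latter returns bytes, and converting those to str causes all sorts of problems.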
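The "always close the clipboard" rule can also be packaged as a context manager. This is only a sketch: `opened_clipboard` is a hypothetical helper name, and `cb` is assumed to be the win32clipboard module or any object that exposes the same OpenClipboard() and CloseClipboard() functions.

```python
from contextlib import contextmanager


@contextmanager
def opened_clipboard(cb):
    """Open the clipboard and guarantee that it is closed again.

    `cb` is assumed to be win32clipboard (or an object with the same
    OpenClipboard/CloseClipboard functions); this helper is hypothetical.
    """
    cb.OpenClipboard()
    try:
        yield cb
    finally:
        # Runs on normal exit and when the with-body raises.
        cb.CloseClipboard()


# Intended use on Windows (not executed here):
#     import win32clipboard, win32con
#     with opened_clipboard(win32clipboard) as cb:
#         text = cb.GetClipboardData(win32con.CF_UNICODETEXT)
```

With this shape, a failed read can no longer leave the clipboard locked, because the close call sits in a finally clause.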

# -*- coding: utf-8 -*-
"""
Created on Fri Oct 19 10:48:45 2018

@author: BigFly
"""
import win32clipboard as cb
import win32con
from translate import google_translate

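def gettext():
    """Return the clipboard text, or None if the clipboard holds no text."""
    cb.OpenClipboard()
    try:
        return cb.GetClipboardData(win32con.CF_UNICODETEXT)
    except TypeError:
        # Raised when the clipboard holds no data in the requested format.
        print("There is no text in the clipboard.")
        return None
    finally:
        # Always release the clipboard, even on error.
        cb.CloseClipboard()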

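def settext(aString):
    """Replace the clipboard contents with aString."""
    cb.OpenClipboard()
    try:
        cb.EmptyClipboard()
        cb.SetClipboardData(win32con.CF_UNICODETEXT, aString)
    except Exception as exc:
        print("Error in settext():", exc)
    finally:
        # Close the clipboard even if setting the data failed.
        cb.CloseClipboard()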
    
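# Remove parenthesised citations such as "(Smith et al., 2017)"
def deletBracket(source, flags, pad_sym=chr(0)):
    code = {"(": 1, ")": -1}
    index = [i for i in range(len(source)) if source[i] in "()"]
    # match: current nesting depth; start: index of the outermost open "("
    match, start = 0, -1
    for i in index:
        match += code[source[i]]
        if start < 0 and match == 1:
            start = i
        if match == 0:
            concent = source[start: i+1]
            # Drop the group only if it contains one of the citation markers.
            if any(flag in concent for flag in flags):
                # Pad with pad_sym so the indices collected above stay valid.
                source = source.replace(concent, pad_sym*len(concent), 1)
            start = -1
    return source.replace(pad_sym, "")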
    
source= gettext()
if source:
    source= source.replace(chr(0),"")
    # join wrapped lines: drop the line breaks copied from the PDF
    source=source.replace("\r","")
    source=source.replace("\n"," ")
    # split into sentences; shield abbreviations so their periods do not end a sentence
    pad_sym=chr(0)
    source=source.replace("e.g. ","e.g."+pad_sym)
    source=source.replace("i.e. ","i.e."+pad_sym)
    source=source.replace("Eq. ","Eq."+pad_sym)
    source=source.replace("Mr. ","Mr."+pad_sym)
    
    source=source.replace(". ",". \r\n")
    source=source.replace(pad_sym," ")
    # remove parenthesised citations
    source=deletBracket(source,["et al.", ", 201", ", 200", ", 199"],pad_sym)
    source=source.replace("  "," ")
    
    settext(source)
    print(source)
    print("[ %d ]"%(len(source)))
    print(google_translate(source))

'''

Our architectures
will have only one representation at one resolution besides
the pooling layers and the convolutional layers that initialize
the needed numbers of channels. Take the architecture in
Table 1 as an example. There are two processes for each
resolution. The first one is the transition process, which
computes the initial features with the dimensions of the next
resolution, then down samples it to 1=4 using a 2×2 average
pooling. A convolutional operation is needed here because
F is assumed to have the same input and output sizes. The
next process is using GUNN to update this feature space
gradually. Each channel will only be updated once, and all
channels will be updated after this process. Unlike most of
the previous networks, after this two processes, the feature
transformations at this resolution are complete. There will
be no more convolutional layers or blocks following this feature representation, i.e., one resolution, one representation.
Then, the network will compute the initial features for the
next resolution, or compute the final vector representation of
the entire image by a global average pooling. By designing
networks in this way, SUNN networks usually have about
20 layers before converting to GUNN-based networks.
'''
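
The line-joining and sentence-splitting steps of Clipboard.py can be tried without a clipboard by wrapping them in a pure function. The sketch below follows the same idea (shield known abbreviations with a placeholder character, split on "period + space", then restore the spaces). The names `format_paragraph`, `ABBREVIATIONS` and `PLACEHOLDER` are made up for this example, and it breaks lines with a plain "\n" instead of the " \r\n" used in the script.

```python
ABBREVIATIONS = ("e.g. ", "i.e. ", "Eq. ", "Mr. ")
PLACEHOLDER = chr(0)  # a character that should not occur in copied text


def format_paragraph(text):
    """Join wrapped PDF lines, then put each sentence on its own line."""
    text = text.replace(PLACEHOLDER, "")
    # Join lines: drop carriage returns, turn newlines into spaces.
    text = text.replace("\r", "").replace("\n", " ")
    # Shield the space after known abbreviations from the sentence split.
    for abbr in ABBREVIATIONS:
        text = text.replace(abbr, abbr[:-1] + PLACEHOLDER)
    # Break after every remaining "period + space".
    text = text.replace(". ", ".\n")
    # Restore the shielded spaces.
    return text.replace(PLACEHOLDER, " ")


sample = "We use pooling, e.g. average\npooling. It is\nsimple. See Eq. 3."
print(format_paragraph(sample))
# We use pooling, e.g. average pooling.
# It is simple.
# See Eq. 3.
```

Abbreviations that are not in the list (for example "Fig. ") would still trigger a line break, so the tuple needs extending for other kinds of text.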

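The program that calls Google Translate is ready-made code found online, slightly modified.
Original: https://blog.csdn.net/yingshukun/article/details/53470424

translate.py
I changed how the returned data is handled:
The response is a list of length 9. Its element [0] holds the translation; the later elements carry alternative translations and other things we do not need.
Element [0] is itself a list whose length is the number of lines or sentences plus 1; its last entry is the pinyin of the translation.
Joining the translated text of every entry except the last (result[:-1] in the code, where result is already that inner list) gives exactly what we want.
This file can be run directly to test the translation.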
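
The response handling described above can be exercised offline. In this sketch, `join_translation` is a hypothetical helper and `fake_response` is hand-made stand-in data that only mimics the described layout (a list of segments followed by a pinyin entry); it is not real API output and is shorter than the 9-element reply.

```python
def join_translation(response):
    """Concatenate the translated segments of a decoded JSON reply.

    Layout assumed from the description above: response[0] is a list of
    segments, each segment starts with its translated text, and the last
    entry carries only the pinyin.
    """
    segments = response[0]
    return "".join(segment[0] for segment in segments[:-1])


# Hand-made stand-in for a reply, NOT real API output.
fake_response = [
    [
        ["你好。", "Hello.", None, None, 1],
        ["再见。", "Goodbye.", None, None, 1],
        [None, None, "Nǐ hǎo. Zàijiàn."],
    ],
    None,
    "en",
]
print(join_translation(fake_response))  # 你好。再见。
```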

# -*- coding: utf-8 -*-
"""
Created on Tue Oct 23 18:58:26 2018

@author: BigFly
"""

import requests  
from HandleJs import Py4Js    

js=Py4Js()

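def google_translate(content):
    # Refuse overly long input (4891 is the limit used by the original code).
    if len(content) > 4891:
        print("The text to translate exceeds the length limit!")
        return None
    tk = js.getTk(content)
    param = {'tk': tk, 'q': content}
    # Build the URL from adjacent string literals so that it contains no
    # stray newlines or indentation.
    url = ("http://translate.google.cn/translate_a/single?client=t&sl=en"
           "&tl=zh-CN&hl=zh-CN&dt=at&dt=bd&dt=ex&dt=ld&dt=md&dt=qca&dt=rw&dt=rm&dt=ss"
           "&dt=t&ie=UTF-8&oe=UTF-8&clearbtn=1&otf=1&pc=1&srcrom=0&ssel=0&tsel=0&kc=2")
    # The reply is JSON that parses to a nested list; [0] holds the translated segments.
    result = requests.get(url, params=param).json()[0]
    # The last entry is the pinyin, so join everything before it.
    return "".join([text[0] for text in result[:-1]])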

if __name__ == "__main__":    
    content = """An old woman had a cat. 
The cat was very old; she could not run quickly, and she could not bite, because she was so old. 
One day the old cat saw a mouse; she jumped and caught the mouse. 
But she could not bite it; so the mouse got out of her mouth and ran away, because the cat could not bite it.
Then the old woman became very angry because the cat had not killed the mouse. 
She began to hit the cat. The cat said, "Do not hit your old servant. 
I have worked for you for many years, and I would work for you still, but I am too old. 
Do not be unkind to the old, but remember what good work the old did when they were young."""
    print(google_translate(content))
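
google_translate() simply gives up on input longer than 4891 characters. One possible workaround, which is not part of the original tool, is to split the formatted text into chunks on line boundaries and translate them one by one. `split_for_translation` and `MAX_CHARS` are names invented for this sketch.

```python
MAX_CHARS = 4891  # same limit as the check in google_translate()


def split_for_translation(text, limit=MAX_CHARS):
    """Group the lines of `text` into chunks of at most `limit` characters.

    Lines are never split, so a single line longer than `limit` still
    becomes its own (oversized) chunk.
    """
    chunks, current = [], ""
    for line in text.splitlines(keepends=True):
        # Start a new chunk when adding this line would exceed the limit.
        if current and len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks


print(split_for_translation("aaaa\nbbbb\ncccc\n", limit=10))
# ['aaaa\nbbbb\n', 'cccc\n']
```

Because Clipboard.py already places each sentence on its own line, the chunks end at sentence boundaries, and the translated pieces can be joined back together in order.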

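HandleJs.py
This part uses JavaScript to generate the tk code. The tk value is computed from the text submitted for translation; it seems to act as a kind of checksum, but I do not know the details.
Note that the execjs module is installed under the name PyExecJS: pip install PyExecJS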

# -*- coding: utf-8 -*-
"""
Created on Tue Oct 23 18:57:54 2018

@author: BigFly
"""
import execjs
 
class Py4Js():
    def __init__(self):
        self.ctx = execjs.compile("""
        function TL(a) {
        var k = "";
        var b = 406644;
        var b1 = 3293161072;
        
        var jd = ".";
        var $b = "+-a^+6";
        var Zb = "+-3^+b+-f";
    
        for (var e = [], f = 0, g = 0; g < a.length; g++) {
            var m = a.charCodeAt(g);
            128 > m ? e[f++] = m : (2048 > m ? e[f++] = m >> 6 | 192 : (55296 == (m & 64512) && g + 1 < a.length && 56320 == (a.charCodeAt(g + 1) & 64512) ? (m = 65536 + ((m & 1023) << 10) + (a.charCodeAt(++g) & 1023),
            e[f++] = m >> 18 | 240,
            e[f++] = m >> 12 & 63 | 128) : e[f++] = m >> 12 | 224,
            e[f++] = m >> 6 & 63 | 128),
            e[f++] = m & 63 | 128)
        }
        a = b;
        for (f = 0; f < e.length; f++) a += e[f],
        a = RL(a, $b);
        a = RL(a, Zb);
        a ^= b1 || 0;
        0 > a && (a = (a & 2147483647) + 2147483648);
        a %= 1E6;
        return a.toString() + jd + (a ^ b)
    };
    function RL(a, b) {
        var t = "a";
        var Yb = "+";
        for (var c = 0; c < b.length - 2; c += 3) {
            var d = b.charAt(c + 2),
            d = d >= t ? d.charCodeAt(0) - 87 : Number(d),
            d = b.charAt(c + 1) == Yb ? a >>> d: a << d;
            a = b.charAt(c) == Yb ? a + d & 4294967295 : a ^ d
        }
        return a
    }
    """)
        
    def getTk(self,text):
        return self.ctx.call("TL",text)

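Demo:

Select the text in the PDF and copy it.

Run Clipboard.py, and both the English and the Chinese results appear. Each sentence is on its own line and the parenthesised citations are gone, which reads much cleaner.

The formatted English text is also put back on the clipboard, so it can be pasted directly elsewhere (this is handy for making slides):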

Deep neural networks have become the state-of-the-art systems for image recognition as well as other vision tasks .
The architectures keep going deeper, e.g., from five convolutional layers to 1001 layers .
The benefit of deep architectures is their strong learning capacities because each new layer can potentially introduce more non-linearities and typically uses larger receptive fields .
In addition, adding certain types of layers will not harm the performance theoretically since they can just learn identity mapping.
This makes stacking up layers more appealing in the network designs.
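
The output above still has a space before the period wherever a citation was removed. A regular expression is one alternative to the bracket-matching function that also swallows that space. This is a swapped-in technique, not the author's method: it handles only non-nested parentheses, it treats any group containing "et al." or ", 19xx" / ", 20xx" as a citation, and the names `CITATION` and `strip_citations` are invented here.

```python
import re

# An optional run of whitespace, then a non-nested "(...)" group that contains
# "et al." or a comma followed by a four-digit year starting with 19 or 20.
CITATION = re.compile(r"\s*\([^()]*(?:et al\.|, (?:19|20)\d\d)[^()]*\)")


def strip_citations(text):
    """Remove parenthesised citations together with the space before them."""
    return CITATION.sub("", text)


print(strip_citations("Layers keep growing (He et al., 2016). Depth helps (see Fig. 2)."))
# Layers keep growing. Depth helps (see Fig. 2).
```

Parenthesised remarks without a citation marker, such as "(see Fig. 2)", are left untouched.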

Well, I should still study English properly and not rely on this.
