The GET request is fairly simple, so I won't walk through it; here is a code snippet as a refresher.
GET request
from urllib import parse, request
import random
url = 'http://www.baidu.com/s'
keyword = input('Enter a search keyword: ')
wd = {'wd': keyword}
encoded_wd = parse.urlencode(wd)
new_url = url + '?' + encoded_wd
print(new_url)
req = request.Request(new_url)
# Imitate a browser so the site is less likely to block our IP
ua_list = [
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1",
"Mozilla/5.0 (Windows NT 6.1; rv2.0.1) Gecko/20100101 Firefox/4.0.1",
"Opera/9.80 (Macintosh; Intel Mac OS X 10.6.8; U; en) Presto/2.8.131 Version/11.11",
"Opera/9.80 (Windows NT 6.1; U; en) Presto/2.8.131 Version/11.11",
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_0) AppleWebKit/535.11 (KHTML, like Gecko) Chrome/17.0.963.56 Safari/535.11"
]
# Pick one User-Agent at random from the list
user_agent = random.choice(ua_list)
req.add_header('User-Agent', user_agent)
response = request.urlopen(req)
print(response.read().decode('utf-8'))
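To see what urlencode does to the keyword without sending any request, here is a minimal sketch. The helper name build_search_url is made up for illustration and is not part of the original script; it only builds the URL string.

```python
from urllib import parse

base_url = 'http://www.baidu.com/s'


def build_search_url(keyword):
    """Return base_url with keyword percent-encoded as the wd parameter.

    Hypothetical helper for illustration; it does not contact the site.
    """
    return base_url + '?' + parse.urlencode({'wd': keyword})


ascii_url = build_search_url('python')
chinese_url = build_search_url('中文')  # non-ASCII text is UTF-8 percent-encoded
print(ascii_url)
print(chinese_url)
```

Passing the resulting string to request.Request is what makes the keyword actually reach the server.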
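POST request
Note: this example relies on the fact that the Youdao Translate web page uses a POST request when translating a word.
1. Use a packet-capture tool to get the URL
url = 'http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule'
2. Build the form body
- The form data captured by the tool is: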
i=smart%0A&from=AUTO&to=AUTO&smartresult=dict&client=fanyideskweb&salt=1528635335125&sign=df74f2062975d33318777e2cdae0af41&doctype=json&version=2.1&keyfrom=fanyi.web&action=FY_BY_CLICKBUTTION&typoResult=false
Split and tidied up (with the salt and sign fields left out), it becomes:
formbody = {
"i": "smart",
"from": "AUTO",
"to": "AUTO",
"smartresult": "dict",
"client": "fanyideskweb",
"doctype": "json",
"version": "2.1",
"keyfrom": "fanyi.web",
"action": "FY_BY_CLICKBUTTION",
"typoResult": "false"
}
Sublime Text's regex find-and-replace is handy here, once each key=value pair is on its own line:
Find: ^(.*)=(.*)$
Replace with: "\1":"\2",
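As an alternative to the editor regex, the captured string can be split in Python itself. This is only a sketch: the variable names raw, captured and formbody_sketch are mine, and dropping salt and sign simply mirrors the hand-written dict above.

```python
from urllib.parse import parse_qsl

# The form data exactly as captured by the packet-capture tool.
raw = ("i=smart%0A&from=AUTO&to=AUTO&smartresult=dict&client=fanyideskweb"
       "&salt=1528635335125&sign=df74f2062975d33318777e2cdae0af41"
       "&doctype=json&version=2.1&keyfrom=fanyi.web"
       "&action=FY_BY_CLICKBUTTION&typoResult=false")

# parse_qsl splits on & and =, and decodes percent-escapes such as %0A.
captured = dict(parse_qsl(raw))

# %0A is a newline, so the word arrives as "smart\n"; strip it.
captured["i"] = captured["i"].strip()

# Leave out salt and sign, as the hand-written dict does.
formbody_sketch = {k: v for k, v in captured.items() if k not in ("salt", "sign")}
print(formbody_sketch)
```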
- The rest of the code is essentially the same as for the GET request.
data = urllib.parse.urlencode(formbody)
request = urllib.request.Request(url, data=data, headers=headers)
print(urllib.request.urlopen(request).read().decode('utf-8'))
At this point I hit the first error:
TypeError: POST data should be bytes, an iterable of bytes, or a file object. It cannot be of type str.
- A search turned up this fix, which solved it: https://blog.csdn.net/peade/article/details/49977515
data = urllib.parse.urlencode(formbody).encode(encoding='utf-8')
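Judging from the error and the fix, encode() converts the str into bytes. The encoding itself was never the problem: urlopen() only accepts POST data as bytes, and str.encode() is what produces them.
- There is another way to solve the same problem: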
data=bytes(urllib.parse.urlencode(formbody),encoding='utf-8')
This one clearly converts the string to bytes.
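The two fixes can be compared offline. The sketch below uses a shortened form dict of my own (not the full form body) and never calls urlopen; it only checks the types and the HTTP method that Request would use.

```python
from urllib import parse, request

# Shortened, illustrative form body.
form = {"i": "smart", "doctype": "json"}

encoded_text = parse.urlencode(form)             # str
data_a = encoded_text.encode(encoding='utf-8')   # bytes via str.encode()
data_b = bytes(encoded_text, encoding='utf-8')   # bytes via the bytes() constructor

print(type(encoded_text).__name__, type(data_a).__name__, data_a == data_b)

# A Request that carries data is sent as POST; nothing is sent here.
req = request.Request('http://fanyi.youdao.com/translate', data=data_a)
print(req.get_method())
```

Both spellings produce the same bytes object, so choosing between them is a matter of taste.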
With that fixed, running the code returns:
{"errorCode":50}
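Another search showed that someone had already stepped on this mine too: http://bbs.fishc.com/thread-96638-1-1.html. Following the change described there solved the problem.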
import urllib.request, urllib.parse
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv2.0.1) Gecko/20100101 Firefox/4.0.1"
url = 'http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule'
headers = {'User-Agent': user_agent}
formbody = {
"i": "smart",
"from": "AUTO",
"to": "AUTO",
"smartresult": "dict",
"client": "fanyideskweb",
"doctype": "json",
"version": "2.1",
"keyfrom": "fanyi.web",
"action": "FY_BY_CLICKBUTTION",
"typoResult": "false"
}
#data = urllib.parse.urlencode(formbody).encode(encoding='utf-8')
data=bytes(urllib.parse.urlencode(formbody),encoding='utf-8')
request = urllib.request.Request(url, data=data, headers=headers)
print(urllib.request.urlopen(request).read().decode('utf-8'))
The result is:
{"type":"EN2ZH_CN","errorCode":0,"elapsedTime":10,"translateResult":[[{"src":"smart","tgt":"聪明的"}]]}
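Since the response is JSON, the translated text can be pulled out with the json module. This is a minimal sketch that works on the sample response above; extract_translations is a name I made up, and treating any non-zero errorCode as a failure is my assumption based on the two responses seen in this post.

```python
import json

# The response printed above, used here as offline sample data.
sample = ('{"type":"EN2ZH_CN","errorCode":0,"elapsedTime":10,'
          '"translateResult":[[{"src":"smart","tgt":"聪明的"}]]}')


def extract_translations(text):
    """Return the list of translated segments from a response string.

    Hypothetical helper; raises ValueError when errorCode is not 0
    (assumed to mean failure, as with {"errorCode":50}).
    """
    result = json.loads(text)
    if result.get("errorCode") != 0:
        raise ValueError("translation failed: errorCode=%s" % result.get("errorCode"))
    # translateResult is a list of paragraphs, each a list of segments.
    return [segment["tgt"]
            for paragraph in result["translateResult"]
            for segment in paragraph]


print(extract_translations(sample))
```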
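As for the second problem, I don't know why removing _o makes it work. One guess, not verified here, is that the translate_o endpoint checks the salt and sign fields, which the form body leaves out. It is still worth thinking about: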
http://fanyi.youdao.com/translate_o?smartresult=dict&smartresult=rule
Opening it directly in the browser shows:
{"errorCode":50}
whereas http://fanyi.youdao.com/translate?smartresult=dict&smartresult=rule
redirects to the Youdao page. A reminder to myself: always try opening a URL in the browser first.