Python Review (2) -- urllib2 and requests: common…

Author: 只喝白开水a | Published: 2018-07-25 23:18

I. urllib2/urllib

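1. Requests and responses
The simplest way to fetch data from a URL: urllib2.urlopen(URL)

Separating the request from the response:
Request: request = urllib2.Request(URL)
Response: response = urllib2.urlopen(request, timeout=...)

A request with headers and POST data (headers belong to the Request object; urlopen itself only accepts url, data and timeout):
data = urllib.urlencode(post_data)
request = urllib2.Request(URL, data, headers)
response = urllib2.urlopen(request)

Adding a specific header:
request.add_header('User-Agent', '....')

The response object:
response.read()      # response body
response.headers     # response headers
response.getcode()   # HTTP status code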
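
The request-building steps above can be tried without sending anything over the network. The following is a minimal sketch that assumes Python 3, where urllib2's Request and urlopen live in urllib.request and urlencode lives in urllib.parse. The URL, form fields and User-Agent string are placeholders and do not come from the original post.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint and form fields, used only for illustration.
URL = "http://example.com/login"
post_data = {"user": "alice", "lang": "en"}

# urlencode() returns str; Python 3 requires the request body to be bytes.
data = urllib.parse.urlencode(post_data).encode("ascii")

# Headers are attached to the Request object, not passed to urlopen().
request = urllib.request.Request(URL, data=data, headers={"Accept": "text/html"})
request.add_header("User-Agent", "demo-client/0.1")

print(request.get_method())              # POST, because data is set
print(request.data)
print(request.get_header("User-agent"))  # header names are stored capitalized

# Sending it would be: response = urllib.request.urlopen(request, timeout=10)
```

Because a body is attached, the request method switches from GET to POST automatically.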

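2. Exception handling

try:
    response = urllib2.urlopen(request)
except urllib2.HTTPError as e:   # subclass of URLError, so it must be caught first
    print e.code, e.reason
except urllib2.URLError as e:    # no HTTP response at all; only e.reason is available
    print e.reason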
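
To make the difference between the two exception types concrete, here is a small sketch that assumes Python 3, where the exceptions live in urllib.error. The helper fetch_status and its open_func parameter are hypothetical names introduced for this example; open_func only exists so that the two failure modes can be simulated without a network connection.

```python
import urllib.error
import urllib.request

def fetch_status(url, open_func=urllib.request.urlopen):
    """Describe the outcome of fetching url as a short string."""
    try:
        response = open_func(url, timeout=5)
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status (4xx/5xx).
        return "HTTP error %d: %s" % (e.code, e.reason)
    except urllib.error.URLError as e:
        # No HTTP response at all: DNS failure, refused connection, ...
        return "URL error: %s" % e.reason
    return "OK %d" % response.getcode()

# Stand-ins for urlopen() that fail in the two possible ways.
def fake_not_found(url, timeout=None):
    raise urllib.error.HTTPError(url, 404, "Not Found", None, None)

def fake_unreachable(url, timeout=None):
    raise urllib.error.URLError("connection refused")

print(fetch_status("http://example.com/missing", fake_not_found))
print(fetch_status("http://example.com/", fake_unreachable))
```

In real use, fetch_status(url) with the default open_func performs an actual request.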

3. Cookie handling

import urllib2
import cookielib
# Create a CookieJar instance to hold the cookies
cookie = cookielib.CookieJar()
# Wrap it in urllib2's HTTPCookieProcessor to get a cookie handler
handler = urllib2.HTTPCookieProcessor(cookie)
# Build an opener from the handler
opener = urllib2.build_opener(handler)
# opener.open() works like urllib2.urlopen() and also accepts a Request object
response = opener.open('http://www.baidu.com')
for item in cookie:
    print 'Name = ' + item.name
    print 'Value = ' + item.value
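
What the cookie jar does behind the opener can be shown offline. This is a rough sketch assuming Python 3, where cookielib is named http.cookiejar. FakeResponse is a hypothetical stand-in for a server response, and the cookie name, value and URLs are made up; a real program would simply call opener.open() and let the handler fill the jar.

```python
import email.message
import http.cookiejar
import urllib.request

class FakeResponse:
    """Hypothetical stand-in for a server response; only info() is needed."""
    def __init__(self, headers):
        self._headers = headers

    def info(self):
        return self._headers

jar = http.cookiejar.CookieJar()
# In real use the jar is attached to an opener, which fills it automatically.
opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))

# Simulate a response that sets one cookie (made-up name and value).
headers = email.message.Message()
headers["Set-Cookie"] = "sessionid=abc123; Path=/"
first = urllib.request.Request("http://www.example.com/")
jar.extract_cookies(FakeResponse(headers), first)

for item in jar:
    print("Name = " + item.name)
    print("Value = " + item.value)

# The jar adds the stored cookie to the next request for the same site.
second = urllib.request.Request("http://www.example.com/profile")
jar.add_cookie_header(second)
print(second.get_header("Cookie"))
```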

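4. Proxy settings

import urllib2
# ProxyHandler takes a dict that maps a URL scheme to a proxy address
proxy = urllib2.ProxyHandler({'http': '127.0.0.1'})
opener = urllib2.build_opener(proxy)  # urllib2.install_opener(opener) would replace the global opener instead
response = opener.open('....')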
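
A Python 3 version of the same proxy setup might look like the sketch below, assuming urllib.request in place of urllib2. The proxy address, including port 8080, is a hypothetical local proxy chosen for illustration.

```python
import urllib.request

# Hypothetical local proxy; replace host and port with a real one.
proxy = urllib.request.ProxyHandler({"http": "http://127.0.0.1:8080"})

# An opener that routes http:// requests through the proxy.
opener = urllib.request.build_opener(proxy)

print(proxy.proxies)

# Use it for individual requests ...
#   response = opener.open("http://example.com/")
# ... or make every later urllib.request.urlopen() call use it:
#   urllib.request.install_opener(opener)
```

Keeping the opener local affects only the calls made through it, while install_opener changes the behaviour of urlopen for the whole process.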

II. Requests

1. Requests (get, post, put, delete, head, options)

r = requests.get('...', headers=headers)

r = requests.post("...", data=post_data)

r = requests.put("...")

r = requests.delete("...")

r = requests.head("...")

r = requests.options("...")

Query-string parameters are passed as a dict through params:

payload = {'key1': 'value1', 'key2': 'value2'}

r = requests.get("...", params=payload)
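
To see what params does to the URL, the request can be prepared without being sent. A minimal sketch, assuming the requests package is installed; http://example.com/get is a placeholder address.

```python
import requests

payload = {"key1": "value1", "key2": "value2"}

# Build the request without sending it, to inspect the final URL.
# http://example.com/get is a placeholder address.
prepared = requests.Request("GET", "http://example.com/get", params=payload).prepare()
print(prepared.url)
```

The dict is URL-encoded and appended as the query string, which is the same URL that requests.get("...", params=payload) would request.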

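Certificate verification (verify=True is the default): r = requests.get('https://kyfw.12306.cn/otn/', verify=True)

2. Responses and response encoding

Body as bytes: r.content

Body as text: r.text

Status code: r.status_code

Response headers: r.headers

Cookies: r.cookies

Body parsed as JSON: r.json()

Final URL of the response: r.url

Redirect history: r.history

Timeout: r = requests.get("...", timeout=..)

Reading the body as a raw byte stream:
r = requests.get("...", stream=True)
r.raw
r.raw.read(10)

Encoding guessed from the HTTP headers: r.encoding

Detecting the encoding with chardet:

import chardet
import requests

r = requests.get("...")
# Override the header-based guess with what chardet detects from the body
r.encoding = chardet.detect(r.content)['encoding']
print r.text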

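3. Exception handling
r.raise_for_status(): raises requests.exceptions.HTTPError when the status code is 4xx or 5xx; for any other status (such as 200) it simply returns None.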
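
The behaviour of raise_for_status() can be checked without a server. In this sketch, which assumes the requests package is installed, a Response object is built by hand purely for demonstration; check is a hypothetical helper name, and real code would receive the Response from requests.get() or a similar call.

```python
import requests

def check(status_code):
    """Return None for a good status, or the error message for a bad one."""
    # Build a Response by hand so the sketch runs without a network;
    # real code gets the Response from requests.get() and friends.
    r = requests.Response()
    r.status_code = status_code
    try:
        return r.raise_for_status()   # returns None when nothing is wrong
    except requests.exceptions.HTTPError as e:
        return str(e)

print(check(200))
print(check(404))
print(check(503))
```

A 4xx status is reported as a client error and a 5xx status as a server error, both through the same exception class.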

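4. Cookie handling
Custom cookies are sent directly with the request; there is no need to build an opener as with urllib2:

r = requests.post("....", cookies=cookies)

Printing every cookie in the response as a name/value pair:

for key in r.cookies.keys():
    print key, ':', r.cookies.get(key)

5. Proxy settings

Pass proxies as a parameter for a single request: r = requests.get("....", proxies=proxies)

6. Session
Create a session:
s = requests.Session()
Set a cookie:
s.cookies.set(cookie['name'], cookie['value'])
Update headers:
s.headers.update({'x-test': 'true'})  # an existing key is overwritten, a missing key is added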
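
How a session contributes its cookies and headers to each request can be inspected by preparing a request instead of sending it. A minimal sketch, assuming the requests package is installed; the cookie name and value, the extra x-test2 header and the URL are all made up for illustration.

```python
import requests

s = requests.Session()
s.cookies.set("sessionid", "abc123")      # made-up cookie
s.headers.update({"x-test": "true"})      # added on top of the default headers

# Prepare (but do not send) a request to see what the session contributes.
request = requests.Request("GET", "http://example.com/", headers={"x-test2": "true"})
prepared = s.prepare_request(request)

print(prepared.headers["x-test"])         # from the session
print(prepared.headers["x-test2"])        # from this request only
print(prepared.headers.get("Cookie"))     # session cookie, sent automatically
```

Headers given on a single request are merged with the session headers for that request only and are not stored on the session.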

Article link: https://www.haomeiwen.com/subject/qkdwpftx.html
