Basic Usage of the Requests Library

Author: YoYoYoo | Published: 2019-08-27 10:07


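I. What Is the Requests Library?

Requests is an HTTP library written in Python. It is built on urllib3 and released under the Apache 2.0 open-source license.

Compared with the standard-library urllib, Requests is much more convenient: it saves a great deal of boilerplate and covers the needs of everyday HTTP testing.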

II. Installation

pip install requests

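III. Basic Usage of Requests in Detail

1. A first example

import requests

r = requests.get('http://www.baidu.com')  # call requests' get function
print(type(r))         # the Response type
print(r.status_code)   # the HTTP status code
print(type(r.text))    # type of the response body (str)
print(r.text)          # the response body, already decoded to text
print(r.cookies)       # the cookies set by the server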

2. Request methods (HTTP test site: http://httpbin.org/)

import requests

requests.post('http://httpbin.org/post')
requests.put('http://httpbin.org/put')
requests.delete('http://httpbin.org/delete')
requests.head('http://httpbin.org/get')
requests.options('http://httpbin.org/get')

3. GET requests

A basic GET request

import requests

r = requests.get('http://httpbin.org/get')
print(r.text)  # unlike urllib, no manual decode() is needed

Adding two parameters, for example name=germey and age=22

import requests

# Appended directly to the URL: this works, but is awkward to build and read
r = requests.get('http://httpbin.org/get?name=germey&age=22')
print(r.text)

Adding them with the params argument

import requests

data = {
    'name': 'germey',
    'age': 22
}
r = requests.get('http://httpbin.org/get', params=data)
print(r.text)
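To see what params does before anything is sent, a request can be built and prepared without touching the network. The sketch below relies on requests.Request(...).prepare(), which is part of the library but is not used elsewhere in this article; the second query value ('a b&c') is made up for illustration. It suggests that the dictionary ends up as the same query string written by hand earlier, with escaping handled for us.

```python
from urllib.parse import urlsplit, parse_qs

import requests

data = {'name': 'germey', 'age': 22}

# Build and prepare the request; nothing is sent over the network.
prepared = requests.Request('GET', 'http://httpbin.org/get', params=data).prepare()
print(prepared.url)  # http://httpbin.org/get?name=germey&age=22

# A value containing a space and '&' (hypothetical) is escaped automatically.
escaped = requests.Request('GET', 'http://httpbin.org/get',
                           params={'q': 'a b&c'}).prepare()
print(escaped.url)

# Decoding the query string again recovers the original value.
round_trip = parse_qs(urlsplit(escaped.url).query)
print(round_trip)  # {'q': ['a b&c']}
```

Building the dictionary first keeps the parameters readable and avoids hand-escaping mistakes in the URL.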

Parsing JSON

import requests

r = requests.get('http://httpbin.org/get')
print(type(r.text))    # str
print(r.json())        # parse the JSON response body into a dict
print(type(r.json()))  # dict

Fetching and saving binary data (images, video, etc.)

import requests

r = requests.get('https://github.com/favicon.ico')
# print(type(r.text))
# print(type(r.content))
# print(r.content)  # bytes, so there is no garbled text from a wrong decoding
with open('favicon.ico', 'wb') as f:
    f.write(r.content)

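Adding headers
The request headers carry information that is sent along with every request. Some sites inspect them to decide whether a request comes from a crawler, so we may need to set the headers ourselves to get past this kind of anti-crawling check.
User-Agent identifies the client that makes the request on our behalf. If it names the requests library, which is the default, the site can tell at once that the request is scripted and may refuse it.
For more detail, see: https://zhuanlan.zhihu.com/p/35625779

import requests

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'
}
# Put simply: without these headers the request may be refused
r = requests.get('https://www.zhihu.com/explore', headers=headers)
print(r.text)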
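The effect of the headers argument can be checked locally, without contacting any site. This is a minimal sketch, assuming requests.utils.default_user_agent() and Session.prepare_request() (both exist in the library but are not used in the article); it compares the User-Agent that would be sent by default with the one supplied by hand.

```python
import requests

# The User-Agent requests uses when none is given, e.g. "python-requests/2.x".
default_ua = requests.utils.default_user_agent()
print(default_ua)

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/73.0.3683.86 Safari/537.36'
}

session = requests.Session()
url = 'https://www.zhihu.com/explore'

# Prepare both variants; nothing is sent over the network.
plain = session.prepare_request(requests.Request('GET', url))
custom = session.prepare_request(requests.Request('GET', url, headers=headers))

print(plain.headers['User-Agent'])   # same as default_ua
print(custom.headers['User-Agent'])  # the browser string from headers
```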

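4. POST requests

POST is the other common request method.

Example

import requests

data = {'name': 'germey', 'age': '22'}
# Note the difference from GET: data= is sent in the request body, not in the URL
r = requests.post('http://httpbin.org/post', data=data)
print(r.text)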
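What data= actually puts on the wire can be inspected by preparing the request locally. The sketch below also shows the json= argument for comparison; json= is not covered in the article and is included here only as an illustration of an alternative body encoding.

```python
import json

import requests

data = {'name': 'germey', 'age': '22'}
url = 'http://httpbin.org/post'

# data= produces a form-encoded body, like an HTML form submission.
form = requests.Request('POST', url, data=data).prepare()
print(form.headers['Content-Type'])  # application/x-www-form-urlencoded
print(form.body)                     # name=germey&age=22

# json= serialises the same dictionary as a JSON document instead.
as_json = requests.Request('POST', url, json=data).prepare()
print(as_json.headers['Content-Type'])  # application/json
decoded = json.loads(as_json.body)
print(decoded)
```

httpbin echoes form-encoded fields under "form" and a JSON body under "json", so the two variants are easy to tell apart in the response.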

5. Responses

import requests

# Commonly used attributes of a response
response3 = requests.get('http://www.jianshu.com')
print(type(response3.status_code), response3.status_code)
print(type(response3.headers), response3.headers)
print(type(response3.cookies), response3.cookies)
print(type(response3.url), response3.url)
print(type(response3.history), response3.history)

IV. Advanced Usage of Requests

1. File upload

import requests

# Open the GitHub icon saved earlier in binary mode and upload it
with open('favicon.ico', 'rb') as f:
    response = requests.post('http://httpbin.org/post', files={'file': f})
print(response.text)
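The files argument turns the request into a multipart/form-data upload. A minimal offline sketch follows; it assumes an in-memory byte string (fake_icon, a made-up stand-in for favicon.ico) so that no file or network access is needed, and uses the (filename, content, content type) tuple form that requests accepts for each file.

```python
import requests

# Hypothetical stand-in for the bytes of favicon.ico.
fake_icon = b'\x00\x00\x01\x00 not a real icon'

prepared = requests.Request(
    'POST',
    'http://httpbin.org/post',
    files={'file': ('favicon.ico', fake_icon, 'image/x-icon')},
).prepare()

# The Content-Type carries a generated boundary that separates the parts.
content_type = prepared.headers['Content-Type']
print(content_type)

# The file name and the raw bytes both appear in the encoded body.
print(b'filename="favicon.ico"' in prepared.body)  # True
print(fake_icon in prepared.body)                  # True
```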

2. Getting cookies

import requests

r = requests.get('http://www.baidu.com')
print(r.cookies)  # the cookies returned by the server

# Iterate over the cookies and print each name and value
for key, value in r.cookies.items():
    print(key + '=' + value)

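3. Staying logged in

Taking Zhihu as an example: log in to Zhihu in the browser, then copy the Cookie value from the request headers.

import requests

headers = {
    'Cookie': 'the Cookie you copied',
    'Host': 'www.zhihu.com',
    'User-Agent': 'copy this from the browser as well'
}

r = requests.get('https://www.zhihu.com', headers=headers)
print(r.text)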

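4. Keeping a session

A Session lets several requests behave as one browsing session, so cookies are carried over automatically. It is typically used to continue with further steps after a simulated login has succeeded.

import requests

s = requests.Session()
s.get('http://httpbin.org/cookies/set/number/123456789')  # the server sets a cookie
r = s.get('http://httpbin.org/cookies')                   # the same session sends it back
print(r.text)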
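The difference between a Session and a one-off request can be shown offline. In this sketch the cookie is seeded by hand with s.cookies.set(...) as a stand-in for what the server's Set-Cookie reply would do in the example above; the requests are only prepared, never sent.

```python
import requests

s = requests.Session()

# Stand-in for the cookie that /cookies/set/number/123456789 would store.
s.cookies.set('number', '123456789', domain='httpbin.org', path='/')

# A request prepared through the session picks up the stored cookie.
in_session = s.prepare_request(
    requests.Request('GET', 'http://httpbin.org/cookies'))
print(in_session.headers.get('Cookie'))  # number=123456789

# A request prepared on its own knows nothing about the session's cookies.
standalone = requests.Request('GET', 'http://httpbin.org/cookies').prepare()
print(standalone.headers.get('Cookie'))  # None
```

This is why two separate requests.get() calls would not share the cookie, while two calls on the same Session do.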

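5. Certificate verification

import requests
import urllib3

# With verify=False alone, an InsecureRequestWarning is still printed;
# this call suppresses it.
urllib3.disable_warnings()

# Requests checks the server certificate by default. Passing verify=False
# skips the check, so an invalid certificate no longer raises an error.
response = requests.get('https://www.12306.cn', verify=False)
print(response.status_code)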

6. Proxy settings

import requests

proxies = {
    'http': 'http://127.0.0.1:1080/pac?auth=HgT2fpms98njlh9QGpsP&t=201803030916114202',
    'https': 'https://127.0.0.1:1080/pac?auth=HgT2fpms98njlh9QGpsP&t=201803030916114202',
}

response = requests.get('http://www.taobao.com', proxies=proxies)
print(response.status_code)

Note: besides plain HTTP proxies, requests also supports proxies that use the SOCKS protocol.
First install the SOCKS extra: pip install requests[socks]

import requests

proxies = {
    'http': 'socks5://127.0.0.1:1080/pac?auth=HgT2fpms98njlh9QGpsP&t=201803030916114202',
    'https': 'socks5://127.0.0.1:1080/pac?auth=HgT2fpms98njlh9QGpsP&t=201803030916114202',
}

response = requests.get('http://www.taobao.com', proxies=proxies)
print(response.status_code)

7. Timeout settings

import requests

# Raises an exception if the server does not respond within 1 second
r = requests.get('https://www.taobao.com', timeout=1)
print(r.status_code)
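A timeout can be tried out without depending on a slow public site. The sketch below is an illustration under stated assumptions: a local listening socket stands in for an unresponsive server (it accepts the connection but never answers), and timeout is given as a (connect, read) tuple, a form requests supports but the article does not use.

```python
import socket

import requests

# Local stand-in for an unresponsive site: the OS completes the TCP
# handshake, but no HTTP response is ever written.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(('127.0.0.1', 0))  # port 0 lets the OS pick a free port
server.listen(1)
url = 'http://127.0.0.1:%d/' % server.getsockname()[1]

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables

try:
    # timeout=(connect, read): 1 s to connect, 0.5 s to wait for data
    session.get(url, timeout=(1, 0.5))
    outcome = 'response received'
except requests.exceptions.ConnectTimeout:
    outcome = 'connect timeout'
except requests.exceptions.ReadTimeout:
    outcome = 'read timeout'
finally:
    server.close()

print(outcome)  # read timeout
```

Both exception classes derive from requests.exceptions.Timeout, so a single except clause on Timeout is enough when the distinction does not matter.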

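8. Authenticated login

Some pages show an authentication prompt asking for a username and password when you visit them. The credentials can be passed with the auth argument:

import requests
from requests.auth import HTTPBasicAuth

r = requests.get('http://localhost:5000', auth=HTTPBasicAuth('username', 'password'))
# Shorthand: r = requests.get('http://localhost:5000', auth=('username', 'password'))

print(r.status_code)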
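What HTTPBasicAuth adds to the request can be checked by preparing it locally. This minimal sketch, which sends nothing, suggests that the explicit form and the tuple shorthand produce the same Authorization header: the word "Basic" followed by the Base64 encoding of "username:password".

```python
import base64

import requests
from requests.auth import HTTPBasicAuth

url = 'http://localhost:5000'

# Prepare both variants without sending them.
explicit = requests.Request(
    'GET', url, auth=HTTPBasicAuth('username', 'password')).prepare()
shorthand = requests.Request(
    'GET', url, auth=('username', 'password')).prepare()

# Build the expected header value by hand for comparison.
expected = 'Basic ' + base64.b64encode(b'username:password').decode('ascii')

print(explicit.headers['Authorization'])
same = (explicit.headers['Authorization']
        == shorthand.headers['Authorization']
        == expected)
print(same)  # True
```

Base64 is an encoding, not encryption, so Basic authentication is normally combined with HTTPS.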

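9. Exception handling

If a request raises an exception, look it up in the official documentation:
http://docs.python-requests.org/en/master/api/#exceptions
and handle it with a try/except block:

try:
    ...
except ...:
    ...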
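One possible shape for such a try/except block is sketched below. The helper name fetch_status and its return strings are made up for illustration; the exception classes are the ones listed in the requests documentation. More specific exceptions are caught first, with RequestException, the base class, as the final fallback.

```python
import requests


def fetch_status(url, timeout=3):
    """Return the status code, or a short description of what went wrong."""
    try:
        r = requests.get(url, timeout=timeout)
        r.raise_for_status()  # turn 4xx/5xx responses into HTTPError
        return r.status_code
    except requests.exceptions.Timeout:
        return 'timeout'
    except requests.exceptions.HTTPError as e:
        return 'http error %d' % e.response.status_code
    except requests.exceptions.ConnectionError:
        return 'connection error'
    except requests.exceptions.RequestException:
        # Base class of all requests exceptions, e.g. a malformed URL
        return 'request error'


# A URL without "http://" is rejected before any network access.
print(fetch_status('httpbin.org/get'))  # request error
```

Timeout is listed before ConnectionError because a connect timeout belongs to both families and would otherwise be reported as a plain connection error.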


Article link: https://www.haomeiwen.com/subject/eerfectx.html
