Basic Usage of the Requests Library

Author: IT的咸鱼 | Published: 2018-10-18 21:44


Installation

pip3 install requests

Under the hood, requests is built on urllib3.

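Source code: https://github.com/kennethreitz/requests
Chinese documentation and API reference: http://docs.python-requests.org/zh_CN/latest/index.html

Requests calls itself "HTTP for Humans", a nod to how simple and convenient it is to use.

In the words of its own documentation, Requests is the only Non-GMO HTTP library for Python, safe for human consumption.

Requests offers all of the features of urllib. It supports HTTP keep-alive and connection pooling, cookie-based session persistence, file uploads, automatic detection of the response content encoding, and automatic encoding of international URLs and POST data.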

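(1) The most basic GET request: call the get method directly

import requests

response = requests.get("http://www.baidu.com/")
# Equivalent form, passing the HTTP method as a string
response = requests.request("get", "http://www.baidu.com/")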

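Commonly used attributes of response:

response.text (the response body decoded to str)

response.content (the raw bytes of the server's response; use it to save images and other binary files)

response.status_code (the HTTP status code)

response.request.headers (the headers that were sent with the request)

response.headers (the headers returned by the server)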
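The attributes above can be tried out without depending on an external site. The following is a minimal sketch that assumes a throwaway local server: HelloHandler, the loopback address, and the response body are all made up for illustration and are not part of the original article. The session has trust_env disabled so that proxy settings in the environment do not interfere with the loopback request.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical local handler that returns a small UTF-8 text body."""

    def do_GET(self):
        body = "你好, requests".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


# Port 0 asks the operating system for any free port.
server = HTTPServer(("127.0.0.1", 0), HelloHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

session = requests.Session()
session.trust_env = False  # ignore proxy environment variables for this local demo
url = "http://127.0.0.1:%d/" % server.server_port
response = session.get(url, timeout=5)

print(response.status_code)                    # status code sent by the server
print(response.headers["Content-Type"])        # a response header
print(response.request.headers["User-Agent"])  # a header requests sent
print(response.content)                        # raw bytes
print(response.text)                           # bytes decoded with response.encoding

server.shutdown()
server.server_close()
```

Because this server declares charset=utf-8 in its Content-Type header, response.text and response.content.decode("utf-8") give the same string here.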

(2) Adding headers and query parameters (params)

To add request headers, pass a dict as the headers argument. To send parameters in the URL, use the params argument.

import requests

kw = {'wd': '长城'}
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36"}
# params accepts a dict or a string of query parameters; a dict is URL-encoded automatically, so urlencode() is not needed
response = requests.get("http://www.baidu.com/s?", params=kw, headers=headers)
# Response body; response.text returns str (Unicode) data
print(response.text)
# Response body; response.content returns raw bytes
print(response.content)
# Full URL of the request
print(response.url)
# Character encoding used to decode response.text (taken from the response headers)
print(response.encoding)
# Status code
print(response.status_code)

Output

...... (omitted)

...... (omitted)

'http://www.baidu.com/s?wd=%E9%95%BF%E5%9F%8E'

'utf-8'

200
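To see how params and headers end up in the outgoing request without sending anything, the request can be built and prepared first. This is a minimal sketch using requests.Request and its prepare() method; the User-Agent value is a placeholder invented for illustration.

```python
import requests

kw = {"wd": "长城"}
headers = {"User-Agent": "my-crawler/0.1 (hypothetical)"}  # placeholder value

# Build the request and prepare it; nothing is sent over the network.
request = requests.Request("GET", "http://www.baidu.com/s", params=kw, headers=headers)
prepared = request.prepare()

print(prepared.url)                    # the dict has been URL-encoded into the query string
print(prepared.headers["User-Agent"])  # the custom header is attached as given
```

The prepared URL matches the response.url value shown in the output above: the Chinese characters are encoded as UTF-8 and percent-escaped.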

Example

# Fetch the Sina home page with requests
import requests

response = requests.get("http://www.sina.com")
print(response.request.headers)
print(response.content.decode('utf-8'))  # prints the page source correctly
print('*' * 100)
print(response.text)  # comes out garbled

Output

{'User-Agent': 'python-requests/2.12.4', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}
<!DOCTYPE html>
<html>
<head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <title>新浪首页</title>
    <meta name="keywords" content="新浪,新浪网,SINA,sina,sina.com.cn,新浪首页,门户,资讯" />
    ...
*********************************************************************************************************
    <!DOCTYPE html>
    <!-- [ published at 2017-06-09 15:18:10 ] -->
    <html>
    <head>
        <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
        <meta http-equiv="X-UA-Compatible" content="IE=edge" />
        <title>æ–°æµªé¦–é¡µ</title>
        <meta name="keywords" content="æ–°æµª,æ–°æµªç½‘,SINA,sina,sina.com.cn,æ–°æµªé¦–é¡µ,é—¨æˆ·,èµ„è®¯" />
        <meta name="description" content="æ–°æµªç½‘ä¸ºå…¨ç�ƒç”¨æˆ·24å°�æ—¶æ��ä¾›å…¨é�¢å�Šæ—¶çš„ä¸æ–‡èµ„è®¯ï¼Œå†…å®¹è¦†ç›–å›½å†…å¤–çª�å�‘æ–°é—»äº‹ä»¶ã€�ä½“å�›èµ›äº‹ã€�å¨±ä¹�æ—¶å°šã€�äº§ä¸šèµ„è®¯ã€�å®žç”¨ä¿¡æ�¯ç‰ï¼Œè®¾æœ‰æ–°é—»ã€�ä½“è‚²ã€�å¨±ä¹�ã€�è´¢ç»�ã€�ç§‘æŠ€ã€�æˆ¿äº§ã€�æ±½è½¦ç‰30å¤šä¸ªå†…å®¹é¢‘é�“ï¼Œå�Œæ—¶å¼€è®¾å�šå®¢ã€�è§†é¢‘ã€�è®ºå�›ç‰è‡ªç”±äº’åŠ¨äº¤æµ�ç©ºé—´ã€‚" />
        <link rel="mask-icon" sizes="any" href="//www.sina.com.cn/favicon.svg" color="red">

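Analysis of the cause

A first guess is that compression is to blame: requests sends Accept-Encoding: gzip, deflate by default, so Sina may well return a compressed page.

That cannot be the cause, though. Requests decompresses gzip and deflate bodies automatically, which is why response.content decodes without any problem.

The real cause is the encoding. When a response arrives, Requests chooses the encoding it will use to decode the body when you access response.text. It first looks for a charset in the HTTP Content-Type header. If the header declares a text type but no charset, Requests falls back to ISO-8859-1, which is the likely source of the garbled output above, since the page itself is UTF-8. Only when the headers give no encoding at all does it guess from the body with chardet-style detection (exposed as response.apparent_encoding), and that guess can be wrong.

When you know the page's encoding, decoding the bytes yourself with response.content.decode() is the more reliable choice.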
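The effect of decoding with the wrong encoding can be reproduced with the standard library alone. The sketch below assumes the body is UTF-8 and that the wrong decoder is ISO-8859-1; the sample string is taken from the page title in the output above.

```python
# UTF-8 bytes, as a server would send them for a UTF-8 page.
raw = "新浪首页".encode("utf-8")

# Decoding with the wrong encoding does not fail; it silently yields garbage.
garbled = raw.decode("iso-8859-1")

# Decoding with the correct encoding restores the original text.
correct = raw.decode("utf-8")

print(len(raw), len(garbled), len(correct))
print(correct)
```

ISO-8859-1 maps every byte to exactly one character, so the garbled string has one character per byte and no error is raised. With a real response, the same fix can be applied either by calling response.content.decode("utf-8") or by setting response.encoding = "utf-8" before reading response.text.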

Getting the dimensions of an online image with requests

from io import BytesIO

import requests
from PIL import Image

img_url = "http://imglf1.ph.126.net/pWRxzh6FRrG2qVL3JBvrDg==/6630172763234505196.png"
response = requests.get(img_url)
f = BytesIO(response.content)  # wrap the raw bytes in an in-memory file object
img = Image.open(f)
print(img.size)  # (width, height) in pixels

Output:

(500, 262)

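BytesIO and StringIO

Data does not always have to be read from or written to a file; it can also be read and written in memory.

StringIO, as the name suggests, reads and writes str in memory.

BytesIO reads and writes binary data of type bytes in memory.
If the example used StringIO instead, that is, f = StringIO(response.text), it would fail with a "cannot identify image file" error, because image data is binary rather than text.

Of course, the example could also save the image to disk first and then open it with Image to get its dimensions.

Exercise: use requests to fetch the pages of any Baidu Tieba forum whose name is entered by the user, and save them locally.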
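The difference between the two buffer types can be seen in a few lines of standard-library code. This is a small sketch; the byte string is simply the eight-byte PNG file signature, used here as sample binary data.

```python
from io import BytesIO, StringIO

# StringIO: an in-memory text file that holds str.
text_buf = StringIO()
text_buf.write("hello ")
text_buf.write("world")
text_value = text_buf.getvalue()

# BytesIO: an in-memory binary file that holds bytes.
bytes_buf = BytesIO(b"\x89PNG\r\n\x1a\n")
signature = bytes_buf.read(4)  # read the first four bytes, like reading a file

# The two are not interchangeable: BytesIO rejects str.
try:
    BytesIO().write("text")
    accepts_str = True
except TypeError:
    accepts_str = False

print(text_value)
print(signature)
print(accepts_str)
```

This is why the image example wraps response.content, which is bytes, in BytesIO rather than wrapping response.text in StringIO.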

Link to this article: https://www.haomeiwen.com/subject/svlrzftx.html
