Data types returned by different ways of opening a web page in Python

Author: 刘年 | Published: 2020-03-15 22:38

1. requests

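With requests, resp.text is a str that has already been decoded with the encoding requests chose for the response. To decode with a different encoding, use resp.content.decode('utf-8') (or another codec): resp.content is the raw bytes, and decode() turns those bytes into a str.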

import requests
resp = requests.get("https://www.baidu.com")
print(type(resp.content))                  # raw response body
print(type(resp.text))                     # body decoded by requests
print(type(resp.content.decode('utf-8')))  # bytes -> str
print(type(resp.text.encode('utf-8')))     # str -> bytes

Output

<class 'bytes'>
<class 'str'>
<class 'str'>
<class 'bytes'>
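
Why the choice of encoding matters can be shown without any network access. The sketch below is only an illustration: the byte string stands in for resp.content, and the variable names are made up for this example. It decodes the same bytes once as UTF-8 and once as ISO-8859-1, which is the encoding requests falls back to for text responses whose Content-Type header carries no charset.

```python
# Stand-in for resp.content: the UTF-8 bytes of a short Chinese string.
raw = "百度一下".encode("utf-8")

# Decoding with the right codec recovers the original text.
as_utf8 = raw.decode("utf-8")

# Decoding with the wrong codec does not fail, it silently produces garbage:
# ISO-8859-1 maps every single byte to one character.
as_latin1 = raw.decode("iso-8859-1")

# Because ISO-8859-1 is a lossless byte-to-character mapping, the damage
# can be undone by encoding back to bytes and decoding again as UTF-8.
repaired = as_latin1.encode("iso-8859-1").decode("utf-8")

print(type(raw), len(raw))
print(as_utf8)
print(len(as_latin1), as_latin1 == as_utf8)
print(repaired == as_utf8)
```

If resp.text looks garbled in practice, this is usually the situation: the bytes are fine and only the codec is wrong, so decoding resp.content explicitly is enough.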

2. urllib.request

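With urllib.request, the page content has to be fetched with read(), and the result is bytes. The response behaves like a stream, so the body can only be read once: a second read() returns empty bytes. Read it once and keep the result in a variable.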

from urllib import request
resp = request.urlopen("https://www.baidu.com")
body = resp.read()  # read once; a second read() would return b''
print(type(body))
print(type(body.decode('utf-8')))

Output

<class 'bytes'>
<class 'str'>
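
The read-once behaviour can be checked offline. This is a minimal sketch, not part of the original notes: it opens a data: URL (which urllib.request handles without touching the network) in place of a real page, and DEMO_URL is a made-up name. The charset is taken from the response headers, falling back to UTF-8 as an assumption.

```python
from urllib import request

# A data: URL stands in for a real page so the sketch runs offline.
# The payload is the percent-encoded UTF-8 bytes of "百度".
DEMO_URL = "data:text/plain;charset=utf-8,%E7%99%BE%E5%BA%A6"

with request.urlopen(DEMO_URL) as resp:
    first = resp.read()    # the whole body, as bytes
    second = resp.read()   # the stream is exhausted, so this is empty
    # Use the charset declared in Content-Type; fall back to UTF-8 (assumption).
    charset = resp.headers.get_content_charset() or "utf-8"

text = first.decode(charset)

print(type(first), len(first))
print(second)
print(type(text), text)
```

Calling type(resp.read().decode('utf-8')) after an earlier read() still prints str, because an empty bytes object decodes to an empty str; the type is right but the content is gone.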

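3. Opening the page with selenium

With selenium, driver.page_source is already a str, so no decode() is needed; call encode() only when bytes are required.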

from selenium import webdriver
driver = webdriver.Firefox()
driver.get("https://www.baidu.com")
print(type(driver.page_source.encode('utf-8')))                  # str -> bytes
print(type(driver.page_source.encode('utf-8').decode('utf-8')))  # and back to str
print(type(driver.page_source))                                  # str as returned
driver.quit()  # close the browser when done

Output

<class 'bytes'>
<class 'str'>
<class 'str'>
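
A common reason to turn page_source into bytes is saving the page to disk. The following is a rough sketch under stated assumptions: save_page is a hypothetical helper, and the html string stands in for driver.page_source so the example runs without a browser.

```python
import tempfile
from pathlib import Path


def save_page(html: str, path: Path, encoding: str = "utf-8") -> int:
    """Encode an HTML string, write it to path, and return the byte count."""
    data = html.encode(encoding)  # str -> bytes
    path.write_bytes(data)
    return len(data)


# Stand-in for driver.page_source (hypothetical content).
html = "<html><body>百度一下</body></html>"

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "page.html"
    written = save_page(html, target)
    # Reading the file back as bytes and decoding restores the same str.
    restored = target.read_bytes().decode("utf-8")

print(len(html), written)
print(restored == html)
```

The character count and the byte count differ because each Chinese character takes three bytes in UTF-8, which is a reminder that len() on a str and on bytes measure different things.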

4. Summary

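Keep track of which attribute or method each approach uses to return the page data, and which type it returns.
Also keep the conversions between str and bytes straight:
str ------ encode() -----> bytes
bytes ------ decode() -----> str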
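
The two conversions above can be wrapped so that downstream code does not care which of the three approaches produced the data. This is one possible sketch; to_text and to_bytes are hypothetical helper names, and UTF-8 as the default encoding is an assumption.

```python
from typing import Union


def to_text(data: Union[str, bytes], encoding: str = "utf-8") -> str:
    """Return data as str, decoding it first if it is bytes."""
    if isinstance(data, bytes):
        return data.decode(encoding)
    return data


def to_bytes(data: Union[str, bytes], encoding: str = "utf-8") -> bytes:
    """Return data as bytes, encoding it first if it is str."""
    if isinstance(data, str):
        return data.encode(encoding)
    return data


from_read = "百度".encode("utf-8")   # like resp.read() or resp.content
from_source = "百度"                 # like resp.text or driver.page_source

print(to_text(from_read) == to_text(from_source))
print(to_bytes(from_read) == to_bytes(from_source))
```

Both comparisons hold because encode() and decode() with the same codec are inverses of each other.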

Article link: https://www.haomeiwen.com/subject/qndiehtx.html
