The 7 main methods of the Requests library
![](https://img.haomeiwen.com/i1028505/e6c93d34a001cc06.png)
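As a rough sketch, and assuming the seven methods meant here are the module-level functions `requests.request`, `get`, `head`, `post`, `put`, `patch` and `delete`, the snippet below checks that they exist and builds (without sending) the request each verb would produce. The URL is a placeholder and the `prepare` helper is a hypothetical name used only for illustration.

```python
import requests

URL = "https://www.example.com/"  # placeholder URL, not from the original notes

# The six verb helpers; each one forwards to requests.request(method, url, **kwargs).
VERB_HELPERS = ("get", "head", "post", "put", "patch", "delete")


def prepare(method: str, url: str = URL) -> requests.PreparedRequest:
    """Build the request that would be sent for `method`, without sending it."""
    return requests.Request(method.upper(), url).prepare()


# Example (no network traffic is generated):
#   for name in VERB_HELPERS:
#       p = prepare(name)
#       print(p.method, p.url)
```

Nothing is sent over the network here; `requests.Request(...).prepare()` only assembles the method, URL and headers.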
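1. The get method
r = requests.get(url)
Response / Request: get() builds a Request that is sent to the server and returns a Response holding the server's reply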
![](https://img.haomeiwen.com/i1028505/7544dbe3fa18c5fd.png)
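To make the Response / Request pairing concrete, here is a minimal sketch that collects the attributes most often inspected on a Response, including the Request that produced it. The `summarize` helper is a hypothetical name, and the URL in the usage comment is a placeholder.

```python
import requests


def summarize(r: requests.Response) -> dict:
    """Collect commonly inspected attributes of a Response object."""
    return {
        "status_code": r.status_code,              # HTTP status, e.g. 200 or 404
        "encoding": r.encoding,                    # encoding taken from the headers (may be None)
        "apparent_encoding": r.apparent_encoding,  # encoding guessed from the body
        "length": len(r.content),                  # body size in bytes
        "request_method": r.request.method,        # the Request that led to this Response
        "request_url": r.request.url,
    }


# Example (performs a real network request):
#   r = requests.get("https://www.example.com", timeout=30)
#   print(summarize(r))
```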
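General code framework
import requests

def getHtmlText(url):
    try:
        r = requests.get(url, timeout=30)
        r.raise_for_status()  # raise HTTPError on a 4xx/5xx status
        r.encoding = r.apparent_encoding  # use the encoding guessed from the content
        return r.text
    except requests.RequestException:
        return ""  # any request failure yields an empty string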
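The framework relies on `raise_for_status()` to turn a bad HTTP status into an exception. The sketch below isolates that behaviour. Constructing a `Response` by hand is done only for illustration (real code gets one back from `requests.get`), and `status_outcome` is a hypothetical helper name.

```python
import requests


def status_outcome(status_code: int) -> str:
    """Report whether raise_for_status() accepts the given status code."""
    r = requests.Response()       # hand-built Response, for illustration only
    r.status_code = status_code
    try:
        r.raise_for_status()      # raises requests.HTTPError for 4xx and 5xx codes
        return "ok"
    except requests.HTTPError:
        return "error"


# Example:
#   for code in (200, 404, 500):
#       print(code, status_outcome(code))
```

Because `HTTPError` is a subclass of `requests.RequestException`, catching the latter also covers timeouts and connection failures.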
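The HTTP protocol
Hypertext Transfer Protocol: based on a request/response model, and stateless (each request is handled independently of earlier ones)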
![](https://img.haomeiwen.com/i1028505/9daeae838d7349f8.png)
![](https://img.haomeiwen.com/i1028505/f7096d159d4fd280.png)
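One way to see the request/response model in action is a throwaway local server, sketched here with the standard library only. The handler class, the paths `/a` and `/b`, and the `fetch_twice` helper are all hypothetical. The handler keeps no data between requests, so each reply depends only on the request that triggered it.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Answer every GET with a body derived only from that request's path."""

    def do_GET(self):
        body = f"you asked for {self.path}".encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def fetch_twice():
    """Start a local server, send two independent requests, return the replies."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)  # port 0: let the OS pick one
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        results = []
        for path in ("/a", "/b"):
            conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
            conn.request("GET", path)          # the request
            resp = conn.getresponse()          # the response
            results.append((resp.status, resp.read().decode("utf-8")))
            conn.close()
        return results
    finally:
        server.shutdown()
        server.server_close()


# Example:
#   print(fetch_twice())
```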
Scale of web crawlers
![](https://img.haomeiwen.com/i1028505/93028cfafe30324e.png)
The Robots protocol
Robots Exclusion Standard; compliance is recommended
-> sites restrict crawlers by checking the User-Agent
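A crawler can check these rules before fetching a page. The sketch below uses the standard library's `urllib.robotparser`. The robots.txt content, the agent names `BadBot` and `MyCrawler`, and the URLs are invented for illustration and do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one crawler is banned outright,
# everyone else is only kept out of /private/.
ROBOTS_TXT = """\
User-agent: BadBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    rp = RobotFileParser()
    rp.parse(text.splitlines())
    return rp


# Example:
#   rp = build_parser(ROBOTS_TXT)
#   rp.can_fetch("MyCrawler", "https://www.example.com/index.html")
#   rp.can_fetch("BadBot", "https://www.example.com/index.html")
```

In a real crawler, `RobotFileParser.set_url(...)` followed by `read()` would fetch the site's own robots.txt instead of a hard-coded string.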