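Reference blog: 爬虫入门系列 (Web Crawler Beginner Series)
Brief introduction:
1. Python libraries used:
requests: sends HTTP requests and retrieves the web page content
BeautifulSoup: parses the retrieved HTML and extracts data from it
2. A simple example: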
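import requests
from bs4 import BeautifulSoup

url = "https://movie.douban.com/cinema/later/chengdu/"
# Fetch the page; raise_for_status() raises an exception on 4xx/5xx responses
response = requests.get(url, timeout=10)
response.raise_for_status()
html = response.content.decode('utf-8')
print(html)

# Parse the HTML (the 'lxml' parser requires the lxml package to be installed)
soup = BeautifulSoup(html, 'lxml')
# Locate the <div> whose id is "showing-soon"
all_movie = soup.find('div', id="showing-soon")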
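Once the container has been located, the next step is usually to walk its child elements and pull out the fields of interest. The following is only a minimal sketch: the sample markup, the "item" class name, and the extract_movies helper are all hypothetical and stand in for the real page, whose structure may differ. It uses the built-in "html.parser" so that it runs without lxml or network access.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real Douban page structure may differ.
SAMPLE_HTML = """
<div id="showing-soon">
  <div class="item"><h3><a href="https://example.com/movie/1">Movie A</a></h3></div>
  <div class="item"><h3><a href="https://example.com/movie/2">Movie B</a></h3></div>
</div>
"""


def extract_movies(html):
    """Return a list of {'name', 'link'} dicts, one per item in the container."""
    soup = BeautifulSoup(html, "html.parser")
    container = soup.find("div", id="showing-soon")
    if container is None:  # find() returns None when nothing matches
        return []
    movies = []
    # "item" is an assumed class name used only for this illustration
    for item in container.find_all("div", class_="item"):
        link = item.find("a")
        if link is None:
            continue
        movies.append({
            "name": link.get_text(strip=True),
            "link": link.get("href"),
        })
    return movies


movies = extract_movies(SAMPLE_HTML)
print(movies)
```

Checking for None before calling find_all matters in practice: if the page layout changes or the request is blocked, soup.find returns None and any further call on it would raise AttributeError.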
3. Data storage:
The scraped data can be saved to files such as CSV or TXT.
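As a rough illustration of the storage step, the sketch below writes a few rows to both a CSV file and a tab-separated TXT file using only the standard library. The rows, the field names ("name", "date"), the file names, and the save_csv / save_txt helpers are hypothetical placeholders, not part of the original example.

```python
import csv
import os
import tempfile

# Hypothetical rows standing in for scraped results.
rows = [
    {"name": "Movie A", "date": "01-01"},
    {"name": "Movie B", "date": "01-02"},
]


def save_csv(rows, path):
    # newline="" lets the csv module control line endings;
    # utf-8-sig adds a BOM so spreadsheet tools can detect the encoding
    with open(path, "w", newline="", encoding="utf-8-sig") as f:
        writer = csv.DictWriter(f, fieldnames=["name", "date"])
        writer.writeheader()
        writer.writerows(rows)


def save_txt(rows, path):
    # One record per line, fields separated by a tab
    with open(path, "w", encoding="utf-8") as f:
        for row in rows:
            f.write(f"{row['name']}\t{row['date']}\n")


# Write into a temporary directory so the sketch leaves no files behind in the project
out_dir = tempfile.mkdtemp()
csv_path = os.path.join(out_dir, "movies.csv")
txt_path = os.path.join(out_dir, "movies.txt")
save_csv(rows, csv_path)
save_txt(rows, txt_path)
```

CSV is the more convenient choice when the data will be opened in a spreadsheet or loaded back with a library, while plain TXT is enough for quick inspection.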