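Installing Python and PyCharm is not covered here.
Goals
1. Learn to write web crawlers in Python
2. Learn to use pip in PyCharm to install the extension packages Python needs
3. Scrape images
4. Save the scraped data to a local folder or a database
5. Scrape the news list from a news site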
Open the terminal and update pip, for example with: python -m pip install --upgrade pip
pip install requests
pip install beautifulsoup4
pip install jupyter
Installation succeeded.
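One way to confirm the installs without relying on the terminal output is to ask Python whether each package can be found. The sketch below is only an illustration: the helper name missing_packages is made up here, and the import names are the ones these packages are normally imported under (beautifulsoup4 is imported as bs4, and the jupyter command is provided by the jupyter_core module).

```python
import importlib.util


def missing_packages(module_names):
    """Return the names from module_names that cannot be located for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Import names, not pip names: beautifulsoup4 -> bs4, jupyter -> jupyter_core.
required = ["requests", "bs4", "jupyter_core"]
print("Missing:", missing_packages(required) or "none")
```

If the list is not empty, the usual cause is that the terminal's pip belongs to a different interpreter than the one the PyCharm project uses.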
Below is an example that scrapes the gold price.
# coding=UTF-8
__author__ = 'Administrator'

import os
import re
import time

from selenium import webdriver

# Path to the local chromedriver executable
chromedriver = "C:/Users/Administrator/AppData/Local/Google/Chrome/Application/chromedriver.exe"
os.environ["webdriver.chrome.driver"] = chromedriver
# Selenium 3 style call; Selenium 4 expects webdriver.Chrome(service=Service(chromedriver))
browser = webdriver.Chrome(chromedriver)
url = "https://www.jin10.com/"

# Reload the page 9 times, one second apart, and print the gold price each time
for i in range(1, 10):
    browser.get(url)
    result = browser.page_source
    try:
        gold_price = re.findall(r'<div id="XAUUSD_B" class="jin-price_value" style=".*?">(.*?)</div>', result)[0]
        gold_price_change = re.findall(r'<div id="XAUUSD_P" class="jin-price_value" style=".*?">(.*?)</div>', result)[0]
    except IndexError:
        # The element was not found in the page source
        gold_price = "------"
        gold_price_change = "------"
    print(gold_price)
    print(gold_price_change)
    time.sleep(1)

# Close the browser window when done
browser.quit()
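The extraction step can be tried without launching a browser, and the result can be stored locally as goal 4 describes. The following is a minimal sketch under several assumptions: SAMPLE_HTML is a hand-written stand-in for browser.page_source whose markup shape is inferred from the regular expressions above, the numbers in it are made up, and the helper names extract_value and append_row are hypothetical.

```python
import csv
import io
import re
from datetime import datetime

# Static stand-in for browser.page_source. The values are invented and the
# markup shape is assumed from the regular expressions in the script.
SAMPLE_HTML = (
    '<div id="XAUUSD_B" class="jin-price_value" style="color: red;">1234.56</div>'
    '<div id="XAUUSD_P" class="jin-price_value" style="color: red;">+0.12%</div>'
)


def extract_value(html, element_id, default="------"):
    """Return the text of the jin-price_value div with this id, or default."""
    pattern = (
        r'<div id="' + re.escape(element_id)
        + r'" class="jin-price_value" style=".*?">(.*?)</div>'
    )
    match = re.search(pattern, html)
    return match.group(1) if match else default


def append_row(csv_file, price, change):
    """Write one timestamped reading to an already open CSV file object."""
    writer = csv.writer(csv_file)
    writer.writerow([datetime.now().isoformat(timespec="seconds"), price, change])


price = extract_value(SAMPLE_HTML, "XAUUSD_B")
change = extract_value(SAMPLE_HTML, "XAUUSD_P")
missing = extract_value(SAMPLE_HTML, "NO_SUCH_ID")  # falls back to "------"

# An in-memory buffer keeps the sketch free of side effects.
buffer = io.StringIO()
append_row(buffer, price, change)
print(buffer.getvalue())
```

To keep the readings on disk, the same append_row call can be given a real file instead of the buffer, for example one opened with open("gold_prices.csv", "a", newline="") where the file name is only a placeholder. Using re.search with a default avoids the IndexError that indexing an empty re.findall result would raise.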
[Figure: returned result]
[Figure: page display]