Python in Practice, Assignment 1-4: Fetching Data from Dynamic Web Pages

Author: 浮生只言片语 | Published: 2017-05-24 21:58 | Views: 11

Task:

Collect the image links from the first 20 pages of https://knewone.com/discover?page= and download the images to a local folder.

Result:

Snip20170524_1.png

Code:

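import os
import urllib.request
from urllib.parse import urlparse

import requests
from bs4 import BeautifulSoup

# Local folder for the downloaded images; created if it does not exist yet.
folderPath = '/Users/FS/Desktop/test/'
os.makedirs(folderPath, exist_ok=True)

# The task covers pages 1 through 20, so the range has to end at 21.
urls = ['https://knewone.com/discover?page={}'.format(i) for i in range(1, 21)]

imageUrls = []
for url in urls:
    print(url)
    wb_data = requests.get(url)
    soup = BeautifulSoup(wb_data.text, 'lxml')
    images = soup.select('#wrapper > div > section > div > div.hits_group-things.clearfix > article > header > a > img')
    for image in images:
        src = image.get('src')  # separate name so the page URL is not overwritten
        if not src:
            continue
        # Keep only the part before '!' (presumably a thumbnail-size suffix).
        imageUrls.append(src.split('!')[0])
        print(src)

for imageUrl in imageUrls:
    # Name the file after the last path segment instead of the last 10 characters,
    # which could contain a '/' or collide with another image.
    fileName = os.path.basename(urlparse(imageUrl).path)
    urllib.request.urlretrieve(imageUrl, os.path.join(folderPath, fileName))
    print('Done')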
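
The download loop above stops at the first request that fails, and it fetches the same image again if a link shows up on more than one page. The following is a minimal sketch of a more tolerant variant, not the author's method. The function name `download_images` and its `retrieve` parameter are hypothetical and were introduced only for illustration. The sketch assumes that every entry is a full URL whose last path segment is a usable file name.

```python
import os
import urllib.request
from urllib.parse import urlparse


def download_images(image_urls, folder, retrieve=urllib.request.urlretrieve):
    """Download each distinct URL into folder; return (saved_paths, failures).

    retrieve is a hypothetical hook: any callable taking (url, target_path).
    It defaults to urllib.request.urlretrieve.
    """
    os.makedirs(folder, exist_ok=True)
    saved, failed = [], []
    # dict.fromkeys drops duplicate URLs while keeping first-seen order.
    for image_url in dict.fromkeys(image_urls):
        # Last path segment as file name; 'image' is an assumed fallback.
        name = os.path.basename(urlparse(image_url).path) or 'image'
        target = os.path.join(folder, name)
        try:
            retrieve(image_url, target)
            saved.append(target)
        except OSError as exc:
            # urllib's URLError and HTTPError are subclasses of OSError,
            # so one failed image does not abort the whole run.
            failed.append((image_url, exc))
    return saved, failed
```

With the names from the script above, it could be called as `saved, failed = download_images(imageUrls, folderPath)`, after which `failed` lists the URLs that may be worth retrying.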
