Python Code Collection

Author: 好好他爸爸 | Published: 2020-06-16 00:10

Scraping every image from a Bilibili uploader's album in 24 lines of code

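import os
import re

import requests
from lxml import etree
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'}
url = "https://space.bilibili.com/430654092/album"
# Raw string, so that \( reaches the regex engine as an escaped parenthesis
pa = re.compile(r'style="background-image: url\("(.*?)@')

driver = webdriver.Chrome()
try:
    driver.get(url)
    # The album is rendered by JavaScript, so wait until a title link exists
    WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.XPATH, '//a[@class="title"]')))
    text = driver.page_source
finally:
    driver.quit()  # always close the browser, even if loading fails

pic_url_list = pa.findall(text)
pic_url_list = pic_url_list[1:]  # drop the first match (presumably not an album image)
html = etree.HTML(text)
titles = html.xpath('//a[@class="title"]/text()')
print(str(len(titles)) + ' pictures in total')
os.makedirs('图片', exist_ok=True)  # open() fails if the output folder is missing
for pic_url, title in zip(pic_url_list, titles):
    print('Downloading', title)
    if pic_url.startswith('//'):  # requests needs a scheme on protocol-relative URLs
        pic_url = 'https:' + pic_url
    content = requests.get(pic_url, headers=headers).content
    # Strip characters that would break the file path
    title = title.replace('\n', '').replace('/', '')
    with open('图片/' + title + '.jpg', 'wb') as f:
        f.write(content)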
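
The title cleanup in the loop above removes only newlines and forward slashes, so a title containing a character such as `?` or `:` could still fail to save on Windows. The sketch below shows one possible way to tighten this. `safe_filename` and its `fallback` parameter are hypothetical names introduced here for illustration and are not part of the original script.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters
# (the control range also covers the newline the original script strips).
_ILLEGAL = re.compile(r'[\\/:*?"<>|\x00-\x1f]')


def safe_filename(title, fallback="untitled"):
    """Return a version of `title` that can be used as a file name.

    Hypothetical helper for illustration; not part of the original script.
    """
    cleaned = _ILLEGAL.sub("", title).strip()
    # Windows also rejects names that end with a dot or a space
    cleaned = cleaned.rstrip(". ")
    # If nothing usable is left, fall back to a placeholder name
    return cleaned or fallback


if __name__ == "__main__":
    print(safe_filename("2020/06/16 photo\n"))
    print(safe_filename('what? "a" <b>'))
```

In the download loop this could replace the two `replace` calls, for example `open('图片/' + safe_filename(title) + '.jpg', 'wb')`. Note that two albums with the same title would still overwrite each other; the helper does not address that.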

Article link: https://www.haomeiwen.com/subject/vxzdxktx.html