Python Hands-On Plan Study Notes 1.4: Scraping http://wehea

Author: 折青颜 | Published 2016-08-05 16:22

Code:

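    from bs4 import BeautifulSoup
    import requests
    import time
    import urllib.request
    
    # Folder the images are saved into; it must already exist.
    path = "C:\\Users\\album\\Desktop\\tylor\\"
    # The page number is appended as the value of the "page" query parameter.
    base_url = 'http://weheartit.com/inspirations/taylorswift?page='
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.106 Safari/537.36'
    }
    
    def get_links(num):
        """Collect the thumbnail URLs from the first `num` pages."""
        photo_links = []
        for page_num in range(1, num + 1):
            full_url = base_url + str(page_num)
            # Send the browser-like User-Agent defined above.
            wb_data = requests.get(full_url, headers=headers)
            soup = BeautifulSoup(wb_data.text, 'lxml')
            time.sleep(2)  # pause between pages to go easy on the server
            images = soup.select('img.entry-thumbnail')
            for image in images:
                src = image.get('src')
                if src:  # skip <img> tags without a src attribute
                    photo_links.append(src)
    
        return photo_links
    
    
    def dl_image(url):
        """Download one image; the file name joins the last two URL segments."""
        urllib.request.urlretrieve(url, path + url.split('/')[-2] + url.split('/')[-1])
        print('Done')
    
    
    for url in get_links(4):
        dl_image(url)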

Summary:

urllib.request.urlretrieve() downloads each image URL straight to a local file.
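
To make the download step easier to try in isolation, here is a minimal sketch that runs without network access. The helper names `filename_from_url` and `download`, the `sample.jpg` file, and the temporary folders are assumptions made for illustration and are not part of the original notes. A local `file://` URL stands in for a real image link, since `urlretrieve()` accepts it in the same way.

```python
import os
import tempfile
import urllib.request
from pathlib import Path


def filename_from_url(url):
    """Hypothetical helper: join the last two URL segments, as dl_image does."""
    parts = url.split('/')
    return parts[-2] + parts[-1]


def download(url, folder):
    """Hypothetical helper: save the resource at url into folder, return the local path."""
    os.makedirs(folder, exist_ok=True)  # create the target folder if it is missing
    target = os.path.join(folder, filename_from_url(url))
    # urlretrieve returns a (local_filename, headers) tuple
    local_path, _headers = urllib.request.urlretrieve(url, target)
    return local_path


if __name__ == '__main__':
    # Assumption: a throwaway local file plays the role of a remote image.
    src = Path(tempfile.mkdtemp()) / 'sample.jpg'
    src.write_bytes(b'fake image bytes')
    saved = download(src.as_uri(), tempfile.mkdtemp())
    print(saved)
```

Creating the folder up front avoids the error `urlretrieve()` raises when the target directory does not exist, which is easy to hit with a hard-coded desktop path.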
