[Crawler Series]: Scraping Beauty Photos from an Image Website

Author: dataheart | Published 2016-05-15 22:43, read 270 times

Result: harmonized (not shown here), NO PROCLAMATION.

The code is as follows:

# -*- coding: utf-8 -*-
"""
Created on Sun May 8 13:22:55 2016
Crawl MM images from consecutive list pages (20 pages in the original run),
using a proxy IP and saving each downloaded image under its page title.
@author: xings
"""

from bs4 import BeautifulSoup   # note: no safeguards for the images during saving
import requests
import urllib.request
import time

# list pages 21-49 of the gallery
useurl = ['http://www.27270.com/ent/meinvtupian/list_11_{}.html'.format(str(i)) for i in range(21, 50, 1)]
headers = {'User-Agent': 'Mozilla/5.0 (iPad; CPU OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1'}
# the proxy address belongs in a separate proxies dict, not inside headers
proxies = {"http": "http://111.206.81.248:80"}
url = 'http://www.27270.com/ent/meinvtupian/list_11_1.html'

def get_photo(url, data=None):
    mm_poto = requests.get(url, headers=headers, proxies=proxies)
    soup = BeautifulSoup(mm_poto.text, 'lxml')
    mm_down = []
    mm_names = []
    folder_path = 'H:\\beatufulgril\\'

    # each <img> in the list items carries the image URL in src and its title in alt
    for mpoto in soup.select('body > div.w960.yh > div.MeinvTuPianBox > ul > li > a.MMPic > i > img'):
        mm_link = mpoto.get('src')
        mm_nick = mpoto.get('alt')
        mm_down.append(mm_link)
        mm_names.append(mm_nick)

    # len(mm_down) (or a counter i += 1) gives the image count;
    # folder_path + name + '.jpg' builds the target path and file extension
    for gril, name in zip(mm_down, mm_names):
        urllib.request.urlretrieve(gril, folder_path + name + '.jpg')
        print(name)

for mm_url in useurl:
    time.sleep(1)   # pause one second between list pages
    get_photo(mm_url)
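
One caveat: urllib.request.urlretrieve does not send the custom User-Agent or use the proxy defined above, so some image hosts may reject those requests. Below is a minimal alternative sketch that downloads the image bytes through requests with the same headers and proxy (the download_image helper name is mine, not from the original post):

import requests

def download_image(img_url, save_path, headers=None, proxies=None):
    # fetch the raw image bytes with the same headers/proxy as the page requests
    resp = requests.get(img_url, headers=headers, proxies=proxies, timeout=10)
    resp.raise_for_status()
    # write the bytes to disk
    with open(save_path, 'wb') as f:
        f.write(resp.content)

# usage inside the download loop, replacing urlretrieve:
# download_image(gril, folder_path + name + '.jpg', headers=headers, proxies=proxies)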
