Scraping Web Page Resources and Downloading Them

Author: 香蕉小黄人 | Published: 2017-04-01 17:16

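    # coding=utf-8
    # Requires Python 3 (uses urllib.request).
    import re
    import urllib.request


    def getHtml(url):
        """Fetch a page and return its HTML as text."""
        page = urllib.request.urlopen(url)
        html = page.read()
        # read() returns bytes; decode so the str pattern in getImg can match
        return html.decode('utf-8', errors='ignore')


    def getImg(html, page):
        """Download every .jpg referenced by a src attribute and return the URL list."""
        # reg = r'src="(.+?\.jpg)" pic_ext'
        reg = r'src="(.+?\.jpg)"'
        imgre = re.compile(reg)
        imglist = imgre.findall(html)
        x = 0
        for imgurl in imglist:
            # Raw string keeps the Windows backslashes literal; the directory must already exist.
            # The page number is part of the file name so later pages do not overwrite earlier ones.
            urllib.request.urlretrieve(imgurl, r'D:\WWW\demo\python\curl\img5\%s_%s.jpg' % (page, x))
            x += 1
        return imglist


    a = 10  # first page to fetch
    url = "https://www.zhihu.com/topic/19552207/top-answers?page="
    for y in range(3):  # fetches pages 10, 11 and 12
        html = getHtml('%s%s' % (url, a))
        print(getImg(html, a))
        a += 1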
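
The extraction step in getImg can be tried out without touching the network. The following is a minimal sketch, assuming Python 3; the names `extract_img_urls` and `target_path`, the sample HTML, and the `example.com` hosts are all hypothetical and not part of the original script. It tightens the pattern to `[^"]+?` so a match cannot run past the closing quote of a non-jpg `src`, resolves relative and protocol-relative links with `urljoin`, and puts the page number in the file name because the counter `x` restarts at 0 on each call.

```python
import os
import re
from urllib.parse import urljoin

# [^"] stops the match at the attribute's closing quote
IMG_RE = re.compile(r'src="([^"]+?\.jpg)"')


def extract_img_urls(html, base_url):
    """Return absolute .jpg URLs from src attributes, in page order, without duplicates."""
    seen = set()
    urls = []
    for src in IMG_RE.findall(html):
        absolute = urljoin(base_url, src)  # handles "/path" and "//host/path" forms
        if absolute not in seen:
            seen.add(absolute)
            urls.append(absolute)
    return urls


def target_path(out_dir, page, index):
    """Build a file name that stays unique across pages, e.g. <out_dir>/10_0.jpg."""
    return os.path.join(out_dir, '%s_%s.jpg' % (page, index))


# Hypothetical page fragment: one png (ignored), two distinct jpgs, one repeat
sample_html = (
    '<img src="https://img.example.com/c.png">'
    '<img src="//pic.example.com/a.jpg" alt="">'
    '<img src="/static/b.jpg">'
    '<img src="//pic.example.com/a.jpg">'
)
urls = extract_img_urls(sample_html, 'https://www.example.com/topic?page=10')
for i, u in enumerate(urls):
    print(u, '->', target_path('img5', 10, i))
```

Each resulting URL and path pair could then be handed to the download call; keeping extraction separate from downloading makes the pattern easy to check against saved HTML first.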

Article link: https://www.haomeiwen.com/subject/raaiottx.html