Scraping Meme Images with Python

Author: Rain师兄 | Published 2020-10-31 12:23

    This script scrapes meme images from the 斗图啦 (Doutula) website, https://www.doutula.com/.

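    import os
    import time

    import requests
    from lxml import etree

    # Send a Referer and a browser-like User-Agent so the requests look like normal page visits.
    headers = {
        'Referer': 'https://www.doutula.com/',
        'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.111 Safari/537.36',
    }

    # open() does not create missing directories, so make sure imgs/ exists first.
    os.makedirs('imgs', exist_ok=True)

    for i in range(1, 6):  # list pages 1 to 5
        url = 'https://www.doutula.com/article/list/?page={}'.format(i)
        resp = requests.get(url, headers=headers)
        html = etree.HTML(resp.text)
        # Collect the data-original attribute of every <img>; lazy-loading pages
        # typically keep the real image URL there.
        srcs = html.xpath('//img/@data-original')
        for src in srcs:
            filename = src.split('/')[-1]  # last URL segment becomes the file name
            img = requests.get(src, headers=headers)
            # 'wb' opens the file for binary writing; img.content is the raw response bytes.
            with open('imgs/' + filename, 'wb') as file:
                file.write(img.content)
            print(src, filename)
        time.sleep(1)  # pause briefly between pages to avoid hammering the site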
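
The XPath step in the script above can be tried offline before pointing it at the real site. The sketch below is only an illustration: `sample_html` and the `example.com` URLs are made up, and `extract_image_urls` is a hypothetical helper name, not something from the original script. It assumes the page marks lazy-loaded images with a `data-original` attribute, as the script's XPath expression suggests.

```python
from lxml import etree

# Hypothetical page fragment; the real Doutula markup may differ.
sample_html = '''
<div>
  <img src="loading.gif" data-original="https://example.com/a/b/cat.jpg">
  <img src="loading.gif" data-original="https://example.com/a/b/dog.gif">
  <img src="logo.png">
</div>
'''


def extract_image_urls(page_source):
    """Return the data-original value of every <img> that has one."""
    tree = etree.HTML(page_source)
    # '//img/@data-original' selects the attribute itself, so the result
    # is a list of strings rather than a list of elements.
    return tree.xpath('//img/@data-original')


urls = extract_image_urls(sample_html)
print(urls)
```

The third `<img>` has no `data-original` attribute, so it contributes nothing to the result. This is why the script picks up the lazy-loaded images without also collecting placeholder or logo images.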

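    New things in this script: the Referer request header; the 'wb' (binary write) file mode; building the save path with 'imgs/'+filename; img.content, the response body as bytes; srcs = html.xpath('//img/@data-original'), which collects the image URLs; and filename = src.split('/')[-1], which takes the last URL segment as the file name.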
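
The file-handling pieces listed above (`src.split('/')[-1]`, the `'wb'` mode, and writing bytes) can be exercised without any network access. This is a minimal sketch under stated assumptions: `filename_from_url` and `save_bytes` are hypothetical helper names, the URL is invented, and `fake_image` merely stands in for `img.content`.

```python
import os
import tempfile


def filename_from_url(src):
    """Use the last '/'-separated segment of the URL as the file name."""
    return src.split('/')[-1]


def save_bytes(directory, filename, data):
    """Write raw bytes to directory/filename and return the full path."""
    os.makedirs(directory, exist_ok=True)  # open() will not create the folder itself
    path = os.path.join(directory, filename)
    with open(path, 'wb') as file:  # 'wb' = write in binary mode
        file.write(data)
    return path


# Stand-in for img.content: a few arbitrary bytes, not a valid image.
fake_image = b'GIF89a\x01\x00'
demo_dir = os.path.join(tempfile.mkdtemp(), 'imgs')

name = filename_from_url('https://example.com/images/2020/cat.gif')
saved = save_bytes(demo_dir, name, fake_image)
print(name, os.path.getsize(saved))
```

One caveat with the split approach: if an image URL carries a query string, the query string ends up in the file name as well, so such URLs would need extra cleanup.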


    Article link: https://www.haomeiwen.com/subject/twjsvktx.html