Basic Python Crawler (HTML Downloader)

Author: 原来不语 | Published: 2017-12-12 19:48 | Views: 0
# -*- coding: utf-8 -*-
import urllib.request


class HtmlDownloader(object):
    """Download the HTML source of a page."""

    def download(self, url):
        if url is None:
            return None
        print(url)
        # Send a browser-like User-Agent instead of urllib's default one.
        headers = ("User-Agent",
                   "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36")
        opener = urllib.request.build_opener()
        opener.addheaders = [headers]
        # install_opener makes this opener the global default used by urlopen.
        urllib.request.install_opener(opener)
        # Assumes the page is UTF-8 encoded.
        r = urllib.request.urlopen(url).read().decode("utf-8")
        # Earlier requests-based variant, kept for reference:
        # if r.status_code == 200:
        #     print("First page fetched successfully")
        #     r.encoding = 'utf-8'
        #     return r.text
        return r
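
The downloader above installs a global opener and assumes every page is UTF-8. As a rough alternative sketch, not the author's method, the same fetch can attach the User-Agent to a single request, apply a timeout, and return None on failure. The names `download_html`, `USER_AGENT`, and the 10-second default timeout are assumptions made for illustration.

```python
import urllib.request

# Hypothetical constant: the same browser-like User-Agent string used above.
USER_AGENT = ("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 "
              "(KHTML, like Gecko) Chrome/27.0.1453.94 Safari/537.36")


def download_html(url, timeout=10):
    """Return the decoded body of `url`, or None if it cannot be fetched."""
    if url is None:
        return None
    try:
        # Attach the header to this request only; the global opener is untouched.
        request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Prefer the charset declared in the response; assume UTF-8 otherwise.
            charset = response.headers.get_content_charset() or "utf-8"
            return response.read().decode(charset, errors="replace")
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and socket timeouts;
        # ValueError covers malformed URLs.
        return None
```

Returning None on any failure mirrors the `url is None` branch of the original, so a caller can treat "no URL" and "fetch failed" the same way; a crawler that needs to tell them apart would log or re-raise instead.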

Article link: https://www.haomeiwen.com/subject/afanixtx.html