
Python Crawler Intro 04: the waves behind push on the waves ahead, the Requests library puts u…

Author: 919b0c54458f | Published 2019-05-10 20:20

    Next up we're going to play with a new library

    Its name is

    Requests

    This library is a notch above the urllib we talked about last time

    After all, Requests is built on top of urllib

    It lets us mimic browser behavior

    with far less code

    Life is short

    so let's get right to it

    Requests isn't one of Python's built-in libraries

    so we need to install it first

    Just use pip:

    pip install requests

    Once it's installed you're ready to go
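
    A quick way to confirm the install worked (the version number you see will differ):

```python
import requests

# sanity check: if this prints a version string, requests is installed
print(requests.__version__)
```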

    Now let's get a feel for requests

    Import the requests module:

    import requests

    A GET request in one line:
    r = requests.get('https://api.github.com/events')

    A POST request in one line:
    r = requests.post('https://httpbin.org/post', data = {'key':'value'})

    And all the other assorted HTTP methods:

    >>> r = requests.put('https://httpbin.org/put', data = {'key':'value'})
    >>> r = requests.delete('https://httpbin.org/delete')
    >>> r = requests.head('https://httpbin.org/get')
    >>> r = requests.options('https://httpbin.org/get')
    

    Want to pass query parameters?

    >>> payload = {'key1': 'value1', 'key2': 'value2'}
    >>> r = requests.get('https://httpbin.org/get', params=payload)
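
    You can see exactly what URL those params turn into without sending anything, by preparing the request first (a sketch, reusing the httpbin.org URL from above):

```python
import requests

# build the request locally to inspect the encoded URL; nothing is sent
req = requests.Request('GET', 'https://httpbin.org/get',
                       params={'key1': 'value1', 'key2': 'value2'})
prepared = req.prepare()
print(prepared.url)  # https://httpbin.org/get?key1=value1&key2=value2
```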
    

    Pretending you're a browser:

    >>> url = 'https://api.github.com/some/endpoint'
    >>> headers = {'user-agent': 'my-app/0.0.1'}
    >>> r = requests.get(url, headers=headers)
    

    Getting the response body as text:

    >>> import requests
    >>> r = requests.get('https://api.github.com/events')
    >>> r.text
    '[{"repository":{"open_issues":0,"url":"https://github.com/...
    >>> r.encoding
    'utf-8'
    

    Getting the response body as bytes:

    >>> r.content
    b'[{"repository":{"open_issues":0,"url":"https://github.com/...
    

    Getting the status code:

    >>> r = requests.get('https://httpbin.org/get')
    >>> r.status_code
    200
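
    Note that a 4xx or 5xx code doesn't raise anything by itself; if you'd rather get an exception, call raise_for_status() (a sketch using httpbin's /status endpoint):

```python
import requests

r = requests.get('https://httpbin.org/status/404')
print(r.status_code)      # 404
print(r.ok)               # False for 4xx/5xx codes

try:
    r.raise_for_status()  # raises HTTPError because the code is 404
except requests.exceptions.HTTPError as e:
    print('request failed:', e)
```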
    

    Getting the response headers:

    >>> r.headers
    {    
        'content-encoding': 'gzip',    
        'transfer-encoding': 'chunked',  
        'connection': 'close',    
        'server': 'nginx/1.0.4',    
        'x-runtime': '148ms',    
        'etag': '"e1ca502697e5c9317743dc078f67693f"',   
        'content-type': 'application/json'
    
    }
    

    Getting the response as JSON:

    >>> import requests
    >>> r = requests.get('https://api.github.com/events')
    >>> r.json()
    [{'repository': {'open_issues': 0, 'url': 'https://github.com/...
    

    Getting the raw socket stream of the response:

    >>> r = requests.get('https://api.github.com/events', stream=True)
    >>> r.raw
    <urllib3.response.HTTPResponse object at 0x101194810>
    >>> r.raw.read(10)
    b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03'
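
    With stream=True the body isn't downloaded until you ask for it; iter_content is the usual way to pull it down in chunks (a sketch, again against httpbin.org):

```python
import requests

# stream=True defers the download; the with-block closes the connection
with requests.get('https://httpbin.org/bytes/1024', stream=True) as r:
    total = 0
    for chunk in r.iter_content(chunk_size=256):  # read 256 bytes at a time
        total += len(chunk)

print(total)  # 1024
```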
    

    POST requests

    When you want one key to carry multiple values:

    >>> payload_tuples = [('key1', 'value1'), ('key1', 'value2')]
    >>> r1 = requests.post('https://httpbin.org/post', data=payload_tuples)
    >>> payload_dict = {'key1': ['value1', 'value2']}
    >>> r2 = requests.post('https://httpbin.org/post', data=payload_dict)
    >>> print(r1.text)
    {
      ...
      "form": {
        "key1": [
          "value1",
          "value2"
        ]
      },
      ...
    }
    >>> r1.text == r2.text
    True
    

    Sending JSON as the request payload:

    >>> url = 'https://api.github.com/some/endpoint'
    >>> payload = {'some': 'data'}
    >>> r = requests.post(url, json=payload)
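
    Under the hood, json= serializes the dict and sets the Content-Type header for you. You can verify that locally by preparing the request (nothing is sent; the URL is the placeholder from above):

```python
import json
import requests

# prepare (but don't send) a POST built with the json= shortcut
req = requests.Request('POST', 'https://api.github.com/some/endpoint',
                       json={'some': 'data'})
p = req.prepare()
print(p.headers['Content-Type'])  # application/json
print(json.loads(p.body))         # the serialized payload round-trips
```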
    

    Want to upload a file?

    >>> url = 'https://httpbin.org/post'
    >>> files = {'file': open('report.xls', 'rb')}
    >>> r = requests.post(url, files=files)
    >>> r.text
    {
      ...
      "files": {
        "file": "<censored...binary...data>"
      },
      ...
    }
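
    One nit on the snippet above: open() without a matching close() leaks the file handle. A with-block fixes that; this sketch uploads a throwaway temp file instead of report.xls:

```python
import os
import tempfile
import requests

# make a throwaway file to upload (a stand-in for report.xls)
fd, path = tempfile.mkstemp()
os.write(fd, b'hello')
os.close(fd)

with open(path, 'rb') as f:  # the with-block closes the handle for us
    r = requests.post('https://httpbin.org/post', files={'file': f})

print(r.json()['files']['file'])  # hello
os.remove(path)
```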
    

    Reading cookie info from a response:

    >>> url = 'http://example.com/some/cookie/setting/url'
    >>> r = requests.get(url)
    >>> r.cookies['example_cookie_name']
    'example_cookie_value'
    

    Sending cookie info with a request:

    >>> url = 'https://httpbin.org/cookies'
    >>> cookies = dict(cookies_are='working')
    >>> r = requests.get(url, cookies=cookies)
    >>> r.text
    '{"cookies": {"cookies_are": "working"}}'
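
    If you want cookies to stick around between requests, the way a browser keeps them, use a Session (a sketch patterned on httpbin's cookie endpoints):

```python
import requests

# a Session carries cookies from one request to the next
s = requests.Session()
s.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
r = s.get('https://httpbin.org/cookies')
print(r.json())  # {'cookies': {'sessioncookie': '123456789'}}
```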
    

    Setting a timeout:

    >>> requests.get('https://github.com/', timeout=0.001)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    requests.exceptions.Timeout: HTTPSConnectionPool(host='github.com', port=443): Request timed out. (timeout=0.001)
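
    In real code you'd catch that instead of letting it crash; a sketch:

```python
import requests

try:
    r = requests.get('https://github.com/', timeout=0.001)
    print(r.status_code)
except requests.exceptions.Timeout:
    print('request timed out')
except requests.exceptions.RequestException as e:
    # Timeout is a subclass of RequestException, the base for all
    # requests errors, so this catches everything else (DNS, refused, ...)
    print('request failed:', e)
```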
    

    Besides "awesome"

    what else is there to say??



        Original link: https://www.haomeiwen.com/subject/cyezoqtx.html