Fetching Baidu's Real-Time Epidemic Data with a Python Crawler

Author: 弃暗投明 | Published: 2022-04-02 13:58

    I live in Shanghai, and the one site I open every day is Baidu's real-time epidemic data dashboard (百度实时疫情大数据).

    [Screenshot: 2022-04-02 13.37.19.png]

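    The page reports statistics along almost every dimension. So how do we pull out the figures for every region of the country and track how they change?

    Step 1: locate the data, that is, analyze the data source.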

    [Screenshot: 2022-04-02 13.38.49.png]

    Step 2: request the data.
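The request step can be sketched as a small helper. This is only a sketch, not the author's code: the name `fetch_page`, its `session` parameter, and the 10-second timeout are assumptions added for illustration. The URL and User-Agent are the ones used in the complete listing.

```python
import requests

# URL and User-Agent as used in the article's complete listing.
URL = 'https://voice.baidu.com/act/newpneumonia/newpneumonia/?from=osari_aladin_banner'
HEADERS = {
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
                  '(KHTML, like Gecko) Chrome/99.0.4844.82 Safari/537.36'
}


def fetch_page(url=URL, session=None, timeout=10):
    """Return the page source as text (hypothetical helper).

    Raises requests.HTTPError on a 4xx/5xx response instead of
    handing an error page to the parsing step.
    """
    if session is None:
        session = requests.Session()
    response = session.get(url, headers=HEADERS, timeout=timeout)
    response.raise_for_status()
    return response.text
```

Calling `data_html = fetch_page()` would then play the role of `response.text` in the listing. The timeout keeps the script from waiting indefinitely if the server does not answer.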

    Step 3: parse the data.
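To make the parsing step concrete, here is a minimal sketch that runs the same regular expression against a tiny, made-up page. `SAMPLE_HTML` is a hypothetical stand-in, far simpler than the real page source, and `extract_case_list` is an illustrative helper name, not something from the article. Wrapping the captured text in brackets before `json.loads` is an added precaution for the case where the `component` array holds more than one object.

```python
import json
import re

# Hypothetical, heavily simplified stand-in for the real page source.
SAMPLE_HTML = (
    '<script type="application/json">'
    '{"page": {"component":[{"caseList": [{"area": "上海", "confirmed": "100"}]}],"version": 1}}'
    '</script>'
)


def extract_case_list(html):
    """Pull the embedded "component" JSON out of the HTML and return its caseList."""
    match = re.search(r'"component":\[(.*)\],', html)
    if match is None:
        raise ValueError('no "component" array found in the page source')
    # The capture group holds the array's contents without its brackets,
    # so put them back and parse the result as a JSON list.
    components = json.loads('[' + match.group(1) + ']')
    return components[0]['caseList']


cases = extract_case_list(SAMPLE_HTML)
print(cases[0]['area'], cases[0]['confirmed'])  # prints: 上海 100
```

Raising an explicit error when the pattern is missing makes a page-layout change easier to diagnose than the `IndexError` that `re.findall(...)[0]` would produce.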

    Step 4: save the data.
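One possible way to structure the save step is sketched below. It differs from the article's listing in three deliberate ways, all of them assumptions on my part: it writes a header row, it overwrites the file (`mode='w'`) rather than appending, and it uses the `utf-8-sig` encoding so that Excel detects the file as UTF-8. `save_cases` and `FIELDS` are illustrative names; the field names are a subset of the keys read in the listing.

```python
import csv

# A subset of the keys the article reads from each case entry.
FIELDS = ['area', 'confirmed', 'curConfirm', 'asymptomatic', 'crued', 'died']


def save_cases(cases, path, fields=FIELDS):
    """Write one CSV row per case, preceded by a header row (hypothetical helper).

    Missing keys are written as empty strings. Returns the number of rows written.
    """
    with open(path, mode='w', encoding='utf-8-sig', newline='') as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        for case in cases:
            writer.writerow({name: case.get(name, '') for name in fields})
    return len(cases)
```

For example, `save_cases(caseList, './data.csv')` would produce a file whose first line names the columns. Overwriting avoids the duplicate rows that accumulate when an append-mode script is run more than once.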

    Complete code

    # Load modules
    import requests
    import re
    import json
    import csv

    # Spoof a browser User-Agent (not strictly necessary here)
    headers = {
        'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.82 Safari/537.36'
    }

    # Request URL
    url = 'https://voice.baidu.com/act/newpneumonia/newpneumonia/?from=osari_aladin_banner'

    # Send the request
    response = requests.get(url=url, headers=headers)

    # Parse the data
    data_html = response.text

    # re.findall returns a list; [0] takes the first match as a str
    json_str = re.findall(r'"component":\[(.*)\],', data_html)[0]

    # Convert the JSON string to a dict
    json_dict = json.loads(json_str)
    caseList = json_dict['caseList']
    
    # Write to the CSV file: open it once, then add one row per region
    with open('./data.csv', mode='a', encoding='utf-8', newline='') as f:
        csv_writer = csv.writer(f)
        for case in caseList:
            area = case['area']
            confirmed = case['confirmed']
            curConfirm = case['curConfirm']
            asymptomatic = case['asymptomatic']
            crued = case['crued']
            died = case['died']
            confirmedRelative = case['confirmedRelative']
            diedRelative = case['diedRelative']
            curedRelative = case['curedRelative']
            asymptomaticRelative = case['asymptomaticRelative']
            nativeRelative = case['nativeRelative']
            overseasInputRelative = case['overseasInputRelative']
            row = [area, confirmed, curConfirm, confirmedRelative, nativeRelative,
                   overseasInputRelative, asymptomatic, asymptomaticRelative,
                   crued, curedRelative, died, diedRelative]
            # Print to check
            print(*row)
            csv_writer.writerow(row)
    

    Finally, compare the output with the web page: the figures match.


    [Screenshot: 2022-04-02 13.39.21.png]
    [Screenshot: 2022-04-02 13.49.07.png]
