Python Crawler in Practice 3

Author: python小哥哥2020 | Published 2022-03-16 21:26

Hi everyone, I'm 天空之城. Today I have a small treat for you: a short scraper that pulls an author's article list from Zhihu's web API and saves the results to an Excel workbook.

import requests
import openpyxl

# Request headers: send a Referer and a browser User-Agent so the Zhihu API
# treats the request like an ordinary page visit
header = {
      'Referer': 'https://www.zhihu.com/people/zhang-jia-wei/posts?page=3',
      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:46.0) Gecko/20100101 Firefox/46.0'}
# Create a workbook
wb = openpyxl.Workbook()
# Get the workbook's active worksheet
sheet = wb.active
# Rename the worksheet
sheet.title = 'articles'

sheet['A1'] = 'Title'    # header row: A1 holds the article title
sheet['B1'] = 'Excerpt'  # header row: B1 holds the excerpt
sheet['C1'] = 'Link'     # header row: C1 holds the article link


# Page through the article list 20 items at a time (offsets 0, 20, ..., 180, i.e. up to 200 articles)
for i in range(0, 200, 20):
    url = 'https://www.zhihu.com/api/v4/members/zhang-jia-wei/articles?include=data%5B*%5D.comment_count%2Csuggest_edit%2Cis_normal%2Cthumbnail_extra_info%2Cthumbnail%2Ccan_comment%2Ccomment_permission%2Cadmin_closed_comment%2Ccontent%2Cvoteup_count%2Ccreated%2Cupdated%2Cupvoted_followees%2Cvoting%2Creview_info%2Cis_labeled%2Clabel_info%3Bdata%5B*%5D.author.badge%5B%3F(type%3Dbest_answerer)%5D.topics&offset={}&limit=20&sort_by=created'.format(str(i))
    # Request one page and parse the JSON body
    res = requests.get(url, headers=header)
    json_blog = res.json()
    # 'data' is the list of articles on this page
    list_blog = json_blog['data']
    for blog in list_blog:
        name = blog['title']        # article title
        content = blog['excerpt']   # article excerpt
        link = blog['url']          # article link
        # Write one row per article and echo it to the console
        sheet.append([name, content, link])
        print('Title: ' + name + '\n' + 'Excerpt: ' + content + '\n' + 'Link: ' + link)

# Finally, save the workbook under this file name
wb.save('zhangjiawei3.xlsx')
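
The fixed range above assumes the author has at most 200 articles (ten pages of 20 items). As an optional refinement, here is a sketch (not part of the original post) that lets the API decide when to stop: Zhihu's v4 list responses normally include a paging block with an is_end flag, and checking res.status_code guards against a failed request. The paging/is_end field names and the output file name are assumptions for illustration; everything else mirrors the code above.

import requests
import openpyxl

header = {
    'Referer': 'https://www.zhihu.com/people/zhang-jia-wei/posts?page=3',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; rv:46.0) Gecko/20100101 Firefox/46.0'}

# Same endpoint as above, with the offset left as a placeholder
url_template = ('https://www.zhihu.com/api/v4/members/zhang-jia-wei/articles'
                '?include=data%5B*%5D.comment_count%2Csuggest_edit%2Cis_normal%2Cthumbnail_extra_info'
                '%2Cthumbnail%2Ccan_comment%2Ccomment_permission%2Cadmin_closed_comment%2Ccontent'
                '%2Cvoteup_count%2Ccreated%2Cupdated%2Cupvoted_followees%2Cvoting%2Creview_info'
                '%2Cis_labeled%2Clabel_info%3Bdata%5B*%5D.author.badge%5B%3F(type%3Dbest_answerer)%5D.topics'
                '&offset={}&limit=20&sort_by=created')

wb = openpyxl.Workbook()
sheet = wb.active
sheet.title = 'articles'
sheet.append(['Title', 'Excerpt', 'Link'])  # header row

offset = 0
while True:
    res = requests.get(url_template.format(offset), headers=header)
    if res.status_code != 200:
        # Stop on a failed request instead of crashing on res.json() later
        print('Request failed with status', res.status_code)
        break
    json_blog = res.json()
    for blog in json_blog['data']:
        sheet.append([blog['title'], blog['excerpt'], blog['url']])
    # 'paging'/'is_end' are the assumed pagination fields in the v4 response;
    # default to True so the loop still terminates if they are missing
    if json_blog.get('paging', {}).get('is_end', True):
        break
    offset += 20

# Output file name for this variant is arbitrary
wb.save('zhangjiawei3_all.xlsx')

On the happy path this writes the same rows as the fixed-range loop, but it stops as soon as the API reports the last page instead of relying on a hard-coded article count.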
