The Python Crawler Mindset

Author: 任我笑笑 | Published 2018-05-04 20:38

URL

https://www.bilibili.com/video/av12721444
The instructor in this video really has the makings of an internet-celebrity teacher.

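The focus is on the approach

First check whether the page source already contains the information you need. If it does not,
use Chrome to trace and analyze the URLs that the page loads through JS.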
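The first step of that check can be sketched in a few lines. This is only an illustration: the function name `found_in_static_source` and the sample HTML are hypothetical, not taken from the video.

```python
def found_in_static_source(html: str, marker: str) -> bool:
    """Return True if the marker text already appears in the raw page source."""
    return marker in html


# Hypothetical page source: the comment list is empty and is filled in later by JS.
sample_html = (
    "<html><body><div id='comments'></div>"
    "<script src='/load.js'></script></body></html>"
)

if found_in_static_source(sample_html, "first comment"):
    print("Parse the HTML directly.")
else:
    print("Not in the source; trace the JS requests in Chrome's Network panel.")
```

If the marker is missing from the raw source, the data is most likely fetched by a separate request, and that request's URL is what you want to find.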

Everything after the 120-minute mark is superfluous.

Anti-scraping

The most basic countermeasure is to add a User-Agent header.
The video still uses Python 2's urllib2.
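As a rough Python 3 equivalent (the video itself uses Python 2's urllib2), the header can be attached through `urllib.request.Request`. The URL and the User-Agent string below are placeholders chosen for illustration.

```python
from urllib.request import Request

# Hypothetical target URL and User-Agent string, for illustration only.
URL = "https://example.com/"
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0 Safari/537.36"
)


def build_request(url: str, user_agent: str) -> Request:
    """Create a request that identifies itself with the given User-Agent."""
    return Request(url, headers={"User-Agent": user_agent})


request = build_request(URL, USER_AGENT)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```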

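For a multi-threaded crawler, you can give each thread its own User-Agent; searching for "user-agent 大全" (user-agent list) turns up plenty of candidates.
(That said, I used to think that disguising the IP through a proxy server was the better approach.)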

headers() is a custom function that picks a User-Agent at random.
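The original implementation is only shown in a screenshot, so the following is a guess at what such a helper might look like. The `USER_AGENTS` list and its entries are assumptions made for this sketch.

```python
import random

# Hypothetical pool of User-Agent strings; a real crawler would use a longer list.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/66.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/11.1 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:59.0) Gecko/20100101 Firefox/59.0",
]


def headers() -> dict:
    """Return request headers with a randomly chosen User-Agent."""
    return {"User-Agent": random.choice(USER_AGENTS)}


print(headers()["User-Agent"])
```

Each thread can call `headers()` once at startup, or once per request, depending on how much variation you want.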

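Fetch the JSON and process it

After the request runs, extract the JSON you need. Because the response is GBK-encoded, it has to be converted with decode('gbk').encode('utf-8').

Convert the JSON into a dict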
from json import loads
loads(xxxx)
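A small Python 3 sketch of the two steps together. In Python 3, `decode('gbk')` already yields a `str`, so the extra `encode('utf-8')` from the Python 2 version is not needed. The payload here is made up to stand in for a real response body.

```python
from json import loads

# Hypothetical GBK-encoded response body, standing in for what the server returns.
raw_bytes = '{"title": "爬虫", "page": 1}'.encode("gbk")

# Decode the bytes using the encoding the server actually used.
text = raw_bytes.decode("gbk")

# Parse the JSON text into a Python dict.
data = loads(text)
print(data["title"], data["page"])
```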

Analyze the URL, then build URLs by concatenation
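Once the pattern of the URL is understood, the varying parts can be filled in programmatically. The base URL and the `page` and `size` parameters below are hypothetical; the real names come from whatever you observe in Chrome.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; replace it with the URL found while tracing requests.
BASE_URL = "https://example.com/api/list"


def build_url(base: str, **params) -> str:
    """Append URL-encoded query parameters to the base URL."""
    return f"{base}?{urlencode(params)}"


# Build the URLs for the first three pages.
urls = [build_url(BASE_URL, page=page, size=20) for page in range(1, 4)]
for url in urls:
    print(url)
```

Using `urlencode` rather than plain string concatenation keeps special characters in parameter values correctly escaped.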
