Activate the virtual environment and run the following command:
(scrapy) shanghaimei@shanghaimei:~$ scrapy shell "https://book.douban.com/"
```
[s] shelp() Shell help (print this help)
[s] view(response) View response in a browser
In [1]:
```
The response comes back with status 403:
[s] response <403 https://movie.douban.com>
Passing a User-Agent on the command line (via the USER_AGENT setting) is enough to get a normal response:
scrapy shell "https://movie.douban.com" -s USER_AGENT='Mozilla/5.0'
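The `-s USER_AGENT=...` flag only affects that one shell session. Inside a Scrapy project, the same setting can be placed in the project's settings module instead. A minimal sketch, assuming a standard project layout with a `settings.py`; the value simply mirrors the command above:

```python
# settings.py (assumed to be the Scrapy project's settings module)
# Same override as passing `-s USER_AGENT='Mozilla/5.0'` on the command line.
USER_AGENT = "Mozilla/5.0"
```

With this in place, spiders in the project, and a `scrapy shell` started from the project directory, should send this User-Agent without the extra flag.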
Next, fetch the data. Find the data API endpoint and run:
scrapy shell "https://movie.douban.com/j/search_subjects?type=tv&tag=%E7%83%AD%E9%97%A8&page_limit=50&page_start=0" -s USER_AGENT='Mozilla/5.0'
In [1]: view(response)
Out[1]: True
# Opens the JSON response as a .txt file
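Instead of viewing the file, the JSON can be parsed directly, and the query string can be rebuilt for other pages. The sketch below is only an illustration: `build_search_url` and `extract_subjects` are hypothetical helper names, the `subjects`, `title` and `rate` keys are assumptions about the payload shape, and `sample_body` is made-up data standing in for `response.text`.

```python
import json
from urllib.parse import urlencode

# Endpoint taken from the scrapy shell command above.
BASE_URL = "https://movie.douban.com/j/search_subjects"


def build_search_url(page_start=0, page_limit=50, tag="热门", kind="tv"):
    """Rebuild the search URL; urlencode percent-encodes the tag as UTF-8."""
    query = urlencode({
        "type": kind,
        "tag": tag,
        "page_limit": page_limit,
        "page_start": page_start,
    })
    return f"{BASE_URL}?{query}"


def extract_subjects(body):
    """Return (title, rate) pairs; the key names are assumed, not confirmed."""
    data = json.loads(body)
    return [(s.get("title"), s.get("rate")) for s in data.get("subjects", [])]


# Hypothetical payload; in the Scrapy shell, pass response.text instead.
sample_body = '{"subjects": [{"title": "Example Show", "rate": "8.5"}]}'

print(build_search_url())
print(extract_subjects(sample_body))
```

In the shell session itself, the equivalent call would be `extract_subjects(response.text)`, and `build_search_url(page_start=50)` gives the URL for the next batch, assuming the endpoint pages by `page_start`.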