- Error message:
requests.exceptions.SSLError: ("bad handshake: Error([('SSL routines', 'tls_process_server_certificate', 'certificate verify failed')],)",)
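When scraping with Python, some sites (12306, for example) fail certificate verification in requests.
I searched Baidu for the error message; the Stack Overflow answer explained it best: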
cafile = 'cacert.pem'  # http://curl.haxx.se/ca/cacert.pem
r = requests.get(url, verify=cafile)
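The snippet above hands requests a CA bundle path. As a minimal sketch of how that choice could be wrapped, the helper below uses the bundle when the file exists and otherwise falls back to requests' default verification rather than turning verification off. The names choose_verify and fetch, the default path, and the 10-second timeout are assumptions for illustration, not part of requests.

```python
import os


def choose_verify(ca_path):
    """Return the value to pass as the ``verify`` argument of requests.get.

    Uses ca_path when that file exists; otherwise returns True so that
    requests verifies against its default CA bundle instead of skipping
    verification.
    """
    if ca_path and os.path.isfile(ca_path):
        return ca_path
    return True


def fetch(url, ca_path="cacert.pem"):
    # Hypothetical wrapper. requests is imported here so that
    # choose_verify stays usable without requests installed.
    import requests

    return requests.get(url, verify=choose_verify(ca_path), timeout=10)
```

Returning True rather than False in the fallback means a missing bundle produces the SSLError again instead of silently disabling the check.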
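The official requests documentation covers this as well.
In short, add one parameter: verify=<path to certificate>, or verify=False.
In my tests, the latter triggers a warning about the security risk: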
InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings InsecureRequestWarning)
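If verify=False is a deliberate choice (say, a throwaway test against a known host), the warning can be filtered with the standard warnings module. This is only a sketch: the helper name is hypothetical, and it matches on the start of the message text shown above rather than importing urllib3's InsecureRequestWarning class.

```python
import warnings


def silence_insecure_request_warning():
    """Ignore warnings whose message starts with the urllib3 text above."""
    # The message argument is a regex matched against the start of the
    # warning message; other warnings are left untouched.
    warnings.filterwarnings("ignore", message="Unverified HTTPS request")
```

Call it once before making requests. It hides the warning only; the connection is still unverified.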
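The biggest problem with using a certificate is where to get one. After some digging, I found a solution: view the site's certificate in the browser, then save it to a file.
Baidu Jingyan post: downloading a security certificate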
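Browsers can export a certificate in either DER (binary) or PEM (Base64 text) form, and the verify parameter expects PEM. Assuming the saved file might be DER, a small standard-library sketch that produces a PEM copy could look like this; ensure_pem and its arguments are hypothetical names, and the function does not validate the certificate contents.

```python
import ssl

PEM_HEADER = b"-----BEGIN CERTIFICATE-----"


def ensure_pem(cert_path, pem_path):
    """Write a PEM copy of cert_path to pem_path and return pem_path."""
    with open(cert_path, "rb") as f:
        data = f.read()
    if data.lstrip().startswith(PEM_HEADER):
        # Already PEM: copy the text through unchanged.
        pem_text = data.decode("ascii")
    else:
        # Treat anything else as DER and wrap it as Base64 PEM text.
        pem_text = ssl.DER_cert_to_PEM_cert(data)
    with open(pem_path, "w", encoding="ascii", newline="\n") as f:
        f.write(pem_text)
    return pem_path
```

The returned path can then be passed as verify=..., for example verify=ensure_pem("E:/SRCA.crt", "E:/SRCA.pem"), where the second path is an assumed output location.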
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# @Author : 西瓜,2017/4/29 0:30
# @File : 12306测试.py
# Python version: 3.5
my_header = {
'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:52.0) Gecko/20100101 Firefox/52.0',
'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
'Accept-Encoding': 'gzip, deflate',
'Referer': 'http://www.baidu.com',
'Connection': 'keep-alive',
'Cache-Control': 'max-age=0',
'Host':None
}
import requests
from bs4 import BeautifulSoup
geturl ="https://kyfw.12306.cn/otn/leftTicket/query?leftTicketDTO.\
train_date=2017-04-29&leftTicketDTO.from_station=WHN&leftTicketDTO.\
to_station=SZN&purpose_codes=ADULT"
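# Test URL
# geturl = "http://blog.csdn.net/wangming520liwei/article/details/53896964"
# Default verification: raised the SSLError shown at the top for the 12306 URL
# res = requests.get(geturl,headers=my_header)
# Skip verification: works, but emits InsecureRequestWarning
res = requests.get(geturl,headers=my_header,verify=False)
# Verify against the certificate saved from the browser
# res = requests.get(geturl,headers=my_header,verify="E:/SRCA.crt")
res.encoding = 'utf-8'
soup = BeautifulSoup(res.text,'html5lib')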
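The query string in geturl is assembled by hand with backslash line continuations, which is easy to break with a stray space. As a sketch, the same URL can be built from a dict with the standard library; build_query_url and its parameter names are hypothetical, while the keys and values come from the script above.

```python
from urllib.parse import urlencode

BASE_URL = "https://kyfw.12306.cn/otn/leftTicket/query"


def build_query_url(train_date, from_station, to_station, purpose="ADULT"):
    """Build the ticket query URL used in the script above."""
    # Dict insertion order is preserved, so the parameters appear in
    # the same order as in the hand-written URL.
    params = {
        "leftTicketDTO.train_date": train_date,
        "leftTicketDTO.from_station": from_station,
        "leftTicketDTO.to_station": to_station,
        "purpose_codes": purpose,
    }
    return BASE_URL + "?" + urlencode(params)
```

Alternatively, requests accepts the dict directly via requests.get(BASE_URL, params=params) and does the encoding itself.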