(1) In urllib:
import urllib.request

proxy = urllib.request.ProxyHandler({'http': proxy_addr})
opener = urllib.request.build_opener(proxy)
urllib.request.install_opener(opener)
data = urllib.request.urlopen(url).read().decode('utf-8')
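As a self-contained sketch of the steps above (the proxy address is a placeholder, not a live proxy, so no request is actually sent here):

```python
import urllib.request

# Placeholder proxy address -- substitute a real proxy before fetching.
proxy_addr = 'http://127.0.0.1:8888'

proxy = urllib.request.ProxyHandler({'http': proxy_addr})
opener = urllib.request.build_opener(proxy)
# install_opener() makes this opener the module-wide default, so every
# subsequent urllib.request.urlopen() call is routed through the proxy.
urllib.request.install_opener(opener)
```

Note that install_opener() affects the whole process; to proxy a single request without changing the global default, call opener.open(url) directly instead.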
(2) In requests:
import requests

proxy_dict = {"https": proxy_url}
response = requests.get(url, proxies=proxy_dict)
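A minimal sketch of the proxies mapping; the proxy URL is a placeholder and the actual request call is left commented out. In requests, each dict key names the scheme of the target URL that the proxy should handle, so it is common to cover both 'http' and 'https':

```python
# Placeholder proxy URL; a real deployment would point at a live proxy.
proxy_url = 'http://127.0.0.1:8888'

# Keys are the scheme of the *target* URL, not of the proxy itself.
proxy_dict = {
    'http': proxy_url,
    'https': proxy_url,
}

# With the requests library installed, the call would look like:
# response = requests.get(url, proxies=proxy_dict, timeout=10)
```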
(3) In Scrapy:
Set up a downloader middleware:
class ProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta['proxy'] = 'http://proxy.yourproxy:8001'
Or set the proxy directly when instantiating the Request:
yield scrapy.Request(url=url, callback=self.parse, meta={'proxy': 'http://proxy.yourproxy:8001'})
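For the middleware variant to take effect, Scrapy also needs it registered in the project's settings.py. The module path below is hypothetical and must match wherever the ProxyMiddleware class actually lives in your project:

```python
# settings.py -- 'myproject.middlewares' is a hypothetical module path.
DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.ProxyMiddleware': 543,
}
```

The number (543 here) is the middleware's ordering priority; lower values run closer to the engine, higher values closer to the downloader.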