1. Scrapy
Install Scrapy:
sudo pip install virtualenv
virtualenv scrapyenv
cd scrapyenv
source bin/activate
pip install Scrapy
Documentation: https://docs.scrapy.org/en/latest/intro/tutorial.html
2. Implementation using urllib2 and regular expressions
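import urllib2  # Python 2 only; Python 3 moved this to urllib.request
import re
response = urllib2.urlopen('http://www.baidu.com/')
html = response.read()
# re.match only matches at the start of the string, so use re.search to find 'dev' anywhere
match = re.search('dev', html)
if match:  # search returns None when nothing matches
    print(match.span())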
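The snippet above relies on urllib2, which exists only in Python 2. As a rough sketch, the same fetch-and-search idea could look like this in Python 3, assuming the page can be decoded as UTF-8. The helper names `find_span` and `fetch_html` are made up for illustration and do not come from any library.

```python
import re
import urllib.request


def find_span(pattern, text):
    """Return the (start, end) span of the first match of pattern in text, or None."""
    match = re.search(pattern, text)
    return match.span() if match else None


def fetch_html(url, timeout=10):
    """Download url and decode the body as UTF-8 (an assumption), replacing bad bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

To reproduce the original example, call `print(find_span('dev', fetch_html('http://www.baidu.com/')))`. Keeping the search in its own function means it can be checked without network access.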