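Sometimes we need to collect all the URLs linked from a website's page so they can be used in other tests.
Taking Sogou WeChat (weixin.sogou.com) as an example: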
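import re
import urllib.request
# Fetch the page and decode the raw bytes into text (assuming UTF-8)
response = urllib.request.urlopen("https://weixin.sogou.com/")
html = response.read().decode("utf-8", errors="replace")
# Capture absolute URLs (scheme://...) from double-quoted href attributes
tag = re.findall(r'<a href="([a-zA-Z]+://[^"\s]*)"', html)
print(tag)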
![](https://img.haomeiwen.com/i13136383/c838246ff19c7625.png)
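
The regex approach only picks up links whose `href` is the first attribute, is double-quoted, and holds an absolute URL. As a rough alternative sketch (not the method used above), the standard library's `html.parser` can collect every `<a href>` and `urllib.parse.urljoin` can resolve relative paths. The names `LinkCollector` and `extract_links` and the sample HTML string are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects the href of every <a> tag."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # urljoin turns relative paths into absolute URLs
                self.links.append(urljoin(self.base_url, value))


def extract_links(html_text, base_url):
    """Return all link targets found in html_text, resolved against base_url."""
    parser = LinkCollector(base_url)
    parser.feed(html_text)
    return parser.links


# Made-up sample input; in practice html_text would be the decoded page body
sample = '<a href="/about">About</a> <a href=\'https://example.com/x\'>X</a>'
print(extract_links(sample, "https://weixin.sogou.com/"))
```

Because the parser reads attributes rather than matching raw text, single-quoted and relative links are included as well.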
A recommended online regular expression tester: http://tool.oschina.net/regex/#
![](https://img.haomeiwen.com/i13136383/162063ab31de6200.png)
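
The same kind of check can also be done locally without a network request. The snippet below is a minimal sketch: it runs the link pattern (with the character class spelled `a-zA-Z` and quotes excluded from the URL part) against a made-up HTML string, so `sample_html` and its URLs are assumptions for illustration only.

```python
import re

# Link pattern: absolute URL inside a double-quoted href that directly follows "<a "
pattern = re.compile(r'<a href="([a-zA-Z]+://[^"\s]*)"')

# Made-up sample markup: two absolute links and one relative link
sample_html = (
    '<p><a href="https://weixin.sogou.com/a">A</a> '
    '<a href="/relative">B</a> '
    '<a href="http://example.com/b?x=1" class="c">C</a></p>'
)

matches = pattern.findall(sample_html)
print(matches)
```

The relative link is skipped because the pattern requires a scheme followed by `://`.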