Crawler Demo 01


Author: JaedenKil | Published: 2019-01-15 16:04
from bs4 import BeautifulSoup
from urllib.request import urlopen

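# Download the demo page and decode the raw bytes as UTF-8.
html = urlopen("https://morvanzhou.github.io/static/scraping/basic-structure.html").read().decode('utf-8')
# print(html)
# Parse with the lxml backend (requires the third-party lxml package).
soup = BeautifulSoup(html, features='lxml')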
# print("\n")
# print(soup.h1)
# print("\n")
# print(soup.p)
# Collect every <a> tag on the page.
all_href = soup.find_all('a')
print(all_href)
# [<a href="https://morvanzhou.github.io/">莫烦Python</a>,
#  <a href="https://morvanzhou.github.io/tutorials/data-manipulation/scraping/">爬虫教程</a>]
# Keep only the href attribute of each tag.
all_href = [link['href'] for link in all_href]
print('\n', all_href)
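
The demo above indexes each tag with `['href']`, which raises a `KeyError` for an `<a>` tag that has no `href`, and it returns relative links unchanged. The sketch below is not part of the original demo. It shows the same link extraction using only the standard library, skipping anchors without an `href` and resolving relative links with `urljoin`. The sample markup, the `LinkCollector` class, and the `extract_links` helper are hypothetical names introduced for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical sample markup standing in for a downloaded page.
SAMPLE_HTML = """
<html><body>
<h1>Demo</h1>
<a href="https://morvanzhou.github.io/">home</a>
<a href="/tutorials/">tutorials</a>
<a name="top">anchor without href</a>
</body></html>
"""


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against base_url."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:  # skip anchors that carry no href attribute
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url):
    """Return all resolved link targets found in the html string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links


links = extract_links(SAMPLE_HTML, "https://morvanzhou.github.io/static/scraping/")
print(links)
```

With BeautifulSoup itself, a comparable guard is to read the attribute with `tag.get('href')`, which returns `None` when the attribute is missing, rather than `tag['href']`.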


Article link: https://www.haomeiwen.com/subject/nquydqtx.html