Crawler Demo 01
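from urllib.request import urlopen
from bs4 import BeautifulSoup
# Download the demo page and decode the raw bytes as UTF-8.
html = urlopen("https://morvanzhou.github.io/static/scraping/basic-structure.html").read().decode('utf-8')
# print(html)
# Parse the HTML with the lxml parser (requires the lxml package).
soup = BeautifulSoup(html, features='lxml')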
# Uncomment to inspect the first <h1> and <p> tags:
# print(soup.h1)
# print(soup.p)
# Collect every <a> tag on the page.
all_href = soup.find_all('a')
print(all_href)
# [<a href="https://morvanzhou.github.io/">莫烦Python</a>,
# <a href="https://morvanzhou.github.io/tutorials/data-manipulation/scraping/">爬虫教程</a>]
# Keep only the URL stored in each tag's href attribute.
all_href = [link['href'] for link in all_href]
print('\n', all_href)
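
The list comprehension above indexes each tag with ['href'], which raises KeyError for any <a> tag that has no href attribute, and it returns relative links unchanged. A minimal sketch of a more defensive variant follows. It is an illustration, not part of the original demo: SAMPLE_HTML and extract_links are hypothetical names, the page is inlined so the sketch runs without network access, and it uses Python's built-in 'html.parser' instead of lxml.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Hypothetical sample page, inlined so the sketch runs offline.
SAMPLE_HTML = """
<html><body>
<h1>Sample page</h1>
<p>Two links and one anchor without an href:</p>
<a href="https://morvanzhou.github.io/">home</a>
<a href="/tutorials/">tutorials</a>
<a name="top">no href here</a>
</body></html>
"""


def extract_links(html, base_url):
    """Return absolute URLs for every <a> tag that has an href attribute."""
    # 'html.parser' ships with Python, so lxml is not needed for this sketch.
    soup = BeautifulSoup(html, features="html.parser")
    # href=True skips anchors that lack the attribute entirely;
    # urljoin resolves relative links against base_url.
    return [urljoin(base_url, a["href"]) for a in soup.find_all("a", href=True)]


links = extract_links(SAMPLE_HTML, "https://morvanzhou.github.io/")
print(links)
```

Passing href=True to find_all filters at selection time, so the comprehension never touches a tag without the attribute.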