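This is my second assignment for the hands-on Python course. This time I learned how to study by looking things up in the documentation.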
Final result
爬取商品信息结果.png (screenshot of the scraped product information)
My code
__author__ = 'Reborn'

import re

from bs4 import BeautifulSoup

# Parse the local copy of the homework page.
with open(r"E:/study/Workspaces/pycharm/source/1_2answer_of_homework/index.html", 'r') as ht_data:
    Soup = BeautifulSoup(ht_data, 'lxml')

# One CSS selector per field; each returns a flat list covering all products.
picadds = Soup.select('body > div > div > div.col-md-9 > div > div > div > img')
titles = Soup.select('body > div > div > div.col-md-9 > div > div > div > div.caption > h4 > a')
prices = Soup.select('body > div > div > div.col-md-9 > div > div > div > div.caption > h4.pull-right')
rates = Soup.select('body > div > div > div.col-md-9 > div > div > div > div.ratings > p > span')
nums = Soup.select('body > div > div > div.col-md-9 > div > div > div > div.ratings > p.pull-right')

# Map every star span to a symbol; an empty star has "empty" in its markup.
star = []
for rate in rates:
    if re.search("empty", str(rate)) is not None:
        star.append('☆')
    else:
        star.append('★')

# Each product owns five consecutive star spans, so advance through `star` five at a time.
flag = 0
for picadd, title, price, num in zip(picadds, titles, prices, nums):
    data = {
        'picadd': picadd.get("src"),
        'title': title.get_text(),
        'price': price.get_text(),
        'star': ''.join(star[flag:flag + 5]),
        'num': num.get_text()
    }
    flag += 5
    print(data)
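The star-rating step above walks a flat list of span tags and cuts it into groups of five, one group per product. Here is a minimal sketch of just that grouping on plain strings, so it can be tried without the HTML file. It assumes every product has exactly five star spans and that an empty star has "empty" in its class name, as the check in my code suggests. The function `stars_per_product` and the sample class names are made up for illustration.

```python
def stars_per_product(span_classes, stars_each=5):
    """Turn a flat list of span class strings into one star string per product.

    Assumption: every product contributes exactly `stars_each` spans,
    and an empty star has "empty" somewhere in its class string.
    """
    symbols = ['☆' if 'empty' in cls else '★' for cls in span_classes]
    # Slice the flat list into consecutive chunks of `stars_each` symbols.
    return [
        ''.join(symbols[start:start + stars_each])
        for start in range(0, len(symbols), stars_each)
    ]


# Hypothetical class names for two products (4 filled stars, then 2 filled stars).
sample = (
    ['glyphicon glyphicon-star'] * 4 + ['glyphicon glyphicon-star-empty']
    + ['glyphicon glyphicon-star'] * 2 + ['glyphicon glyphicon-star-empty'] * 3
)
ratings = stars_per_product(sample)
print(ratings)
```

Slicing by a fixed step avoids keeping a separate counter in sync with the loop, which is where an off-by-one slips in most easily.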
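My takeaways
- I have only now learned to actually use the documentation and get real value out of it.
- My fundamentals are weak; I need to go back through the textbook a few more times and read more code.
- Keep thinking independently and solve problems on my own.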