Untitled Article

Author: Alexhan1989 | Published: 2016-07-05 14:52

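    Python Practical Plan 1_2

    I had to watch the video several times before I understood it, and I read the BeautifulSoup documentation several times as well, but I finally got it working.

    Result: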

    Library/Frameworks/Python.framework/Versions/3.5/bin/python3.5 "/Applications/PyCharm CE.app/Contents/helpers/pycharm/utrunner.py" /Users/apple/PycharmProjects/untitled/beautifulsouptest.py true

    Testing started at 12:47 PM ...

    {'price': '$24.99', 'name': 'EarPod', 'star': 5, 'image': 'img/pic_0000_073a9256d9624c92a05dc680fc28865f.jpg', 'rate': '65 reviews'}

    {'price': '$64.99', 'name': 'New Pocket', 'star': 4, 'image': 'img/pic_0005_828148335519990171_c234285520ff.jpg', 'rate': '12 reviews'}

    {'price': '$74.99', 'name': 'New sunglasses', 'star': 4, 'image': 'img/pic_0006_949802399717918904_339a16e02268.jpg', 'rate': '31 reviews'}

    {'price': '$84.99', 'name': 'Art Cup', 'star': 3, 'image': 'img/pic_0008_975641865984412951_ade7a767cfc8.jpg', 'rate': '6 reviews'}

    {'price': '$94.99', 'name': 'iphone gamepad', 'star': 4, 'image': 'img/pic_0001_160243060888837960_1c3bcd26f5fe.jpg', 'rate': '18 reviews'}

    {'price': '$214.5', 'name': 'Best Bed', 'star': 4, 'image': 'img/pic_0002_556261037783915561_bf22b24b9e4e.jpg', 'rate': '18 reviews'}

    {'price': '$500', 'name': 'iWatch', 'star': 4, 'image': 'img/pic_0011_1032030741401174813_4e43d182fce7.jpg', 'rate': '35 reviews'}

    {'price': '$15.5', 'name': 'Park tickets', 'star': 4, 'image': 'img/pic_0010_1027323963916688311_09cc2d7648d9.jpg', 'rate': '8 reviews'}

    My code:

    from bs4 import BeautifulSoup

    info = []

    # Local copy of the homework page
    path = '/Users/apple/Downloads/Plan-for-combating-master/week1/1_2/1_2answer_of_homework/index.html'

    with open(path, 'r') as wb_data:
        soup = BeautifulSoup(wb_data, 'lxml')
        images = soup.select('body > div > div > div.col-md-9 > div > div > div > img')
        prices = soup.select('body > div > div > div.col-md-9 > div > div > div > div.caption > h4.pull-right')
        names = soup.select('body > div > div > div.col-md-9 > div > div > div > div.caption > h4 > a')
        rates = soup.select('body > div > div > div.col-md-9 > div > div > div > div.ratings > p.pull-right')
        # Find the paragraph that holds the star icons
        stars = soup.select('body > div > div > div.col-md-9 > div > div > div > div.ratings > p:nth-of-type(2)')

    for i in range(len(stars)):
        # Find the solid stars in each paragraph and count them;
        # len() also works when a product has no solid stars at all
        stars[i] = len(stars[i].find_all(class_='glyphicon-star'))

    for name, price, rate, image, star in zip(names, prices, rates, images, stars):
        data = {
            'name': name.string,
            'price': price.string,
            'rate': rate.string,
            'image': image.get('src'),
            'star': star
        }
        info.append(data)
        print(data)
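The star count is the least obvious step in the code above, so here is a minimal sketch of it in isolation. The HTML string is a hypothetical fragment modeled on the ratings block of the homework page (I am assuming empty stars carry the class `glyphicon-star-empty`), and it uses the built-in `html.parser` instead of `lxml` so that nothing beyond `bs4` is needed.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on one product's ratings block (an assumption,
# not copied from the real index.html)
html = """
<div class="ratings">
  <p class="pull-right">18 reviews</p>
  <p>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star-empty"></span>
  </p>
</div>
"""

soup = BeautifulSoup(html, 'html.parser')

# The second <p> inside div.ratings is the one that holds the star icons
star_paragraph = soup.select('div.ratings > p:nth-of-type(2)')[0]

# class_ matches whole class names, so 'glyphicon-star-empty' is not counted
solid_stars = star_paragraph.find_all(class_='glyphicon-star')
star_count = len(solid_stars)
print(star_count)
```

Because `class_` compares against each complete class name, the empty star is skipped and only the solid ones are counted.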

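    Summary:

    1. Safari's source view has no CSS selector option, so I could only use Chrome to inspect the page.

    2. After copying the CSS selector path, delete every :nth-child() part so that the selector matches all of the products rather than a single element. This is the path as copied from Chrome: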

    body > div:nth-child(2) > div > div.col-md-9 > div:nth-child(2) > div:nth-child(1) > div > div.ratings > p:nth-child(2) > span:nth-child(1)
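Deleting the `:nth-child()` parts by hand is easy to get wrong, so here is a rough sketch of doing it with a regular expression. The helper name `strip_nth_child` is my own invention for illustration and is not part of BeautifulSoup.

```python
import re


def strip_nth_child(selector):
    """Remove every :nth-child(...) pseudo-class from a CSS selector string."""
    return re.sub(r':nth-child\([^)]*\)', '', selector)


# The selector exactly as copied from Chrome
copied = ('body > div:nth-child(2) > div > div.col-md-9 > div:nth-child(2) > '
          'div:nth-child(1) > div > div.ratings > p:nth-child(2) > span:nth-child(1)')

generalized = strip_nth_child(copied)
print(generalized)
# body > div > div > div.col-md-9 > div > div > div > div.ratings > p > span
```

The cleaned selector matches the corresponding element in every product instead of only the first one. Where a position still matters, as with the star paragraph, it can be added back selectively, as the code above does with `p:nth-of-type(2)`.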
