Week 2, Lesson 2

Author: 采矿 | Published: 2016-05-29 17:23 | Views: 14
[Figure: run results]

    # Full code
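    import time

    import pymongo
    import requests
    from bs4 import BeautifulSoup

    # Connect to the local MongoDB instance and pick the database and collection.
    client = pymongo.MongoClient('localhost', 27017)
    phonenumber = client['phonenumber']
    phonenumber_sheet = phonenumber['phonenumber_sheet']


    def get_urls(page_num):
        """Scrape one listing page and store each item's link and title."""
        web_url = 'http://bj.58.com/shoujihao/1/pn{}/'.format(page_num)
        web_data = requests.get(web_url)
        time.sleep(1)  # pause between requests to avoid hammering the site
        soup = BeautifulSoup(web_data.text, 'lxml')
        detail_urls = soup.select('li > a.t')
        titles = soup.select('a.t > strong')
        # zip pairs links and titles by position; this assumes every
        # matched link contains exactly one <strong> title.
        for detail_url, title in zip(detail_urls, titles):
            data = {
                'detail_url': detail_url.get('href'),
                'title': title.get_text()
            }
            phonenumber_sheet.insert_one(data)
            print(data)


    # Crawl listing pages 1 through 11.
    for page in range(1, 12):
        get_urls(page)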
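
The script above selects links and titles as two separate lists and pairs them by position with zip. If any matched link has no `<strong>` child, every later title shifts onto the wrong link. One possible alternative, shown here only as a minimal sketch, reads the title from inside each link so the pair always comes from the same element. The sample HTML, the example.com URLs, and the `extract_items` helper are hypothetical and not part of the original script. The sketch also uses the built-in `html.parser` instead of `lxml` so it runs without the extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, assumed to resemble the structure targeted
# by the selectors 'li > a.t' and 'a.t > strong' in the script above.
SAMPLE_HTML = """
<ul>
  <li><a class="t" href="http://example.com/item/1"><strong>138-0000-0001</strong></a></li>
  <li><a class="t" href="http://example.com/item/2">no strong tag here</a></li>
  <li><a class="t" href="http://example.com/item/3"><strong>139-0000-0003</strong></a></li>
</ul>
"""


def extract_items(html):
    """Return one dict per 'li > a.t' link, taking the title from that same link."""
    soup = BeautifulSoup(html, 'html.parser')
    items = []
    for link in soup.select('li > a.t'):
        # Look only at direct children, mirroring the 'a.t > strong' selector.
        strong = link.find('strong', recursive=False)
        if strong is None:
            continue  # skip links without a title instead of shifting later pairs
        items.append({
            'detail_url': link.get('href'),
            'title': strong.get_text(strip=True),
        })
    return items


items = extract_items(SAMPLE_HTML)
for item in items:
    print(item)
```

With the sample above, the second link is skipped and the third link keeps its own title. Each returned dict has the same keys as the `data` dict in the original script, so it could be passed to `insert_one` unchanged.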
    
    
