2018-12-29 Scraping the Dangdang Book Bestseller List with Python

Author: 大靓的小本本 | Published 2018-12-29 15:07


Code:

import requests
from bs4 import BeautifulSoup


# Optional request headers. To use them, uncomment this block and pass
# headers=header to requests.get() below.
#header={
#    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0',
#    'Connection':'keep-alive'
#    }
def page_link():
    # range(1, 25) requests pages 1 through 24 of the 24-hour bestseller ranking.
    for item in range(1, 25):
        url = 'http://bang.dangdang.com/books/bestsellers/01.00.00.00.00.00-24hours-0-0-1-{}'.format(item)
        response = requests.get(url)
        # print(response.encoding)  # GB2312
        soup = BeautifulSoup(response.text, 'lxml')

        # Append this page's books to the output file; the with block closes
        # the file on exit, so no explicit f.close() is needed.
        with open(r'C:\Users\Administrator\Desktop\test.txt', 'a+', encoding='utf-8') as f:
            # Each <li> under .bang_list_mode is one book entry.
            for book in soup.select('.bang_list_mode > li'):
                name = book.select('.name')[0].text
                href = book.select('.name > a')[0].attrs['href']
                star = book.select('.bang_list li .level ')  # selected but not used below
                comment = book.select('.star > a')[0].attrs['href']
                comment_num = book.select('.star > a')[0].text
                author = book.select('.publisher_info')[0].text
                price = book.select('.price .price_n')[0].text
                # Named book_info so it does not shadow the HTTP response.
                book_info = {
                    'title': name,
                    'url': href,
                    'author': author,
                    'comments': comment_num,
                    'comment_link': comment,
                    'price': price
                    }
                print(book_info)
                print('*' * 100)

                f.write(name + href + '\n' + author + price + '\n' + comment_num + comment + '\n')
                f.write('*' * 100 + '\n')


if __name__ == '__main__':
    page_link()
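
The script above writes each book's fields back to back, with no separator between them, which makes test.txt hard to parse later. As one possible alternative (not part of the original script), the sketch below writes the same fields as CSV using only the standard library. The function name write_books, the FIELDS column order, and the sample record are assumptions made for illustration; the column names simply mirror the variables in the script.

```python
import csv
import io

# Hypothetical column order; the names mirror the variables collected by the script.
FIELDS = ['name', 'href', 'author', 'price', 'comment_num', 'comment']


def write_books(rows, fileobj):
    """Write a header plus one CSV row per book, stripping stray whitespace."""
    writer = csv.DictWriter(fileobj, fieldnames=FIELDS)
    writer.writeheader()
    for row in rows:
        # Missing fields become empty strings instead of raising KeyError.
        writer.writerow({key: str(row.get(key, '')).strip() for key in FIELDS})


if __name__ == '__main__':
    # Made-up sample record, used only to show the shape of the output.
    sample = [{
        'name': ' Example Book ',
        'href': 'http://example.com/book/1',
        'author': 'Example Author',
        'price': '10.00',
        'comment_num': '100',
        'comment': 'http://example.com/book/1/comments',
    }]
    buffer = io.StringIO()
    write_books(sample, buffer)
    print(buffer.getvalue())
```

To write to a real file, open it with newline='' and encoding='utf-8' and pass the file object in place of the StringIO buffer. The csv module quotes any field that contains a comma, so titles with commas still land in a single column.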

Result:

[Screenshot of the output: 捕获.PNG]