Scrapy Crawler Framework (5): pipelines

Author: 猛犸象和剑齿虎 | Published: 2019-05-30 06:38



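Running scrapy crawl musicspide -o mu.json is one data storage method that the framework provides out of the box, but more often we want custom processing. Once the spider has scraped the data, saving and post-processing it is handed over to the item pipelines.
Write the following in pipelines.py: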

# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://doc.scrapy.org/en/latest/topics/item-pipeline.html

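# The pipeline is responsible for saving items after they are scraped
class MyspiderPipeline(object):
    def __init__(self):  # Optional in general; used here to open the output file
        # Explicit utf-8 so non-ASCII text (e.g. Chinese) is written correctly on any platform
        self.file = open('music.txt', 'a', encoding='utf-8')

    # Called every time the pipeline receives an item
    def process_item(self, item, spider):
        content = str(item) + '\n'
        self.file.write(content)
        return item

    # Called when the spider finishes crawling
    def close_spider(self, spider):
        self.file.close()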
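
The order in which Scrapy calls the pipeline hooks can be checked without starting a crawl. The following is only a minimal sketch, not the pipeline from this post: JsonLinesPipeline, the music.jl file name, the demo() helper and the sample items are all hypothetical names made up for illustration. It writes one JSON object per line instead of str(item), and opens the file in open_spider rather than __init__. Passing spider=None is just a stand-in for the spider object Scrapy would normally supply.

```python
import json
import os
import tempfile


class JsonLinesPipeline:
    """Hypothetical pipeline: writes each item as one JSON line."""

    def __init__(self, path="music.jl"):
        # Assumed default file name; Scrapy creates the pipeline with no arguments.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called once when the spider starts: open the output file.
        self.file = open(self.path, "a", encoding="utf-8")

    def process_item(self, item, spider):
        # Called for every item; ensure_ascii=False keeps Chinese text readable.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        return item  # hand the item on to the next pipeline, if any

    def close_spider(self, spider):
        # Called once when the spider finishes: release the file handle.
        self.file.close()


def demo():
    """Drive the pipeline by hand, in the same order Scrapy calls the hooks."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "music.jl")
        pipeline = JsonLinesPipeline(path)
        pipeline.open_spider(spider=None)
        for item in ({"title": "歌曲一"}, {"title": "歌曲二"}):
            pipeline.process_item(item, spider=None)
        pipeline.close_spider(spider=None)
        # Read the file back so the result can be inspected.
        with open(path, encoding="utf-8") as f:
            return [json.loads(line) for line in f]


if __name__ == "__main__":
    print(demo())
```

In a real project a pipeline only runs after it is registered in the ITEM_PIPELINES setting in settings.py, as the template comment at the top of pipelines.py reminds us. That setting is a dict mapping the class path to an integer priority, for example ITEM_PIPELINES = {'myproject.pipelines.MyspiderPipeline': 300}, where myproject is a placeholder for the actual project package name.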