Python: Errors Encountered While Learning Scrapy

Author: 我的小小笔尖 | Published 2023-10-14 08:34

    Error when creating a Scrapy project

    Command: scrapy startproject python123demo
    Error: ImportError: cannot import name 'PseudoElement' from 'cssselect.parser'

    Traceback (most recent call last):
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\Scripts\scrapy.exe\__main__.py", line 4, in <module>
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\__init__.py", line 12, in <module>
        from scrapy.http import FormRequest, Request
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\http\__init__.py", line 10, in <module>
        from scrapy.http.request.form import FormRequest
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\http\request\form.py", line 19, in <module>
        from parsel.selector import create_root_node
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\parsel\__init__.py", line 16, in <module>
        from parsel.selector import Selector, SelectorList  # NOQA
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\parsel\selector.py", line 31, in <module>
        from .csstranslator import GenericTranslator, HTMLTranslator
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\parsel\csstranslator.py", line 8, in <module>
        from cssselect.parser import Element, FunctionalPseudoElement, PseudoElement
    ImportError: cannot import name 'PseudoElement' from 'cssselect.parser' (C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\cssselect\parser.py)
    

    The cssselect and parsel versions conflict: parsel 1.8.1 imports PseudoElement from cssselect, which cssselect 1.1.0 does not yet provide. Upgrading cssselect (to 1.2.0 here) resolves the import error.

    Check version: pip show cssselect
    Version: 1.1.0

    Check version: pip show parsel
    Version: 1.8.1

    Upgrade: pip install cssselect --upgrade
    Check version: pip show cssselect
    Version: 1.2.0
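One way to check this kind of version mismatch programmatically is to compare dotted version strings numerically rather than lexicographically (naive string comparison would wrongly rank "1.10.0" below "1.2.0"). A minimal sketch; the `version_at_least` helper is our own illustration, not part of cssselect or parsel:

```python
def version_at_least(installed: str, required: str) -> bool:
    """Compare dotted version strings numerically, e.g. '1.10.0' >= '1.2.0'."""
    to_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return to_tuple(installed) >= to_tuple(required)

# parsel 1.8.1 needs a cssselect that provides PseudoElement (1.2.0 or newer):
print(version_at_least("1.1.0", "1.2.0"))   # False -> must upgrade
print(version_at_least("1.2.0", "1.2.0"))   # True  -> fixed
print(version_at_least("1.10.0", "1.2.0"))  # True  (string comparison would say False)
```

In real projects this check is usually left to pip or a `packaging.version` comparison, but the tuple trick is handy for quick diagnostics.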


    Error when running a spider

    Command: scrapy crawl demo
    Error: AttributeError: 'AsyncioSelectorReactor' object has no attribute '_handleSignals'

    2023-10-10 19:01:25 [scrapy.utils.log] INFO: Scrapy 2.9.0 started (bot: python123demo)
    2023-10-10 19:01:25 [scrapy.utils.log] INFO: Versions: lxml 4.6.4.0, libxml2 2.9.5, cssselect 1.2.0, parsel 1.8.1, w3lib 1.22.0, Twisted 23.8.0, Python 3.7.4 (tags/v3.7.4:e09359112e, Jul  8 2019, 20:34:20) [MSC v.1916 64 bit (AMD64)], pyOpenSSL 22.0.0 (OpenSSL 1.1.1m  14 Dec 2021), cryptography 36.0.1, Platform Windows-10-10.0.19041-SP0
    2023-10-10 19:01:25 [scrapy.crawler] INFO: Overridden settings:
    {'BOT_NAME': 'python123demo',
     'FEED_EXPORT_ENCODING': 'utf-8',
     'NEWSPIDER_MODULE': 'python123demo.spiders',
     'REQUEST_FINGERPRINTER_IMPLEMENTATION': '2.7',
     'ROBOTSTXT_OBEY': True,
     'SPIDER_MODULES': ['python123demo.spiders'],
     'TWISTED_REACTOR': 'twisted.internet.asyncioreactor.AsyncioSelectorReactor'}
    2023-10-10 19:01:25 [asyncio] DEBUG: Using selector: SelectSelector
    2023-10-10 19:01:25 [scrapy.utils.log] DEBUG: Using reactor: twisted.internet.asyncioreactor.AsyncioSelectorReactor
    2023-10-10 19:01:25 [scrapy.utils.log] DEBUG: Using asyncio event loop: asyncio.windows_events._WindowsSelectorEventLoop
    2023-10-10 19:01:25 [scrapy.extensions.telnet] INFO: Telnet Password: f8ddf2dd6160e967
    2023-10-10 19:01:25 [scrapy.middleware] INFO: Enabled extensions:
    ['scrapy.extensions.corestats.CoreStats',
     'scrapy.extensions.telnet.TelnetConsole',
     'scrapy.extensions.logstats.LogStats']
    2023-10-10 19:01:26 [scrapy.middleware] INFO: Enabled downloader middlewares:
    ['scrapy.downloadermiddlewares.robotstxt.RobotsTxtMiddleware',
     'scrapy.downloadermiddlewares.httpauth.HttpAuthMiddleware',
     'scrapy.downloadermiddlewares.downloadtimeout.DownloadTimeoutMiddleware',
     'scrapy.downloadermiddlewares.defaultheaders.DefaultHeadersMiddleware',
     'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware',
     'scrapy.downloadermiddlewares.retry.RetryMiddleware',
     'scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware',
     'scrapy.downloadermiddlewares.httpcompression.HttpCompressionMiddleware',
     'scrapy.downloadermiddlewares.redirect.RedirectMiddleware',
     'scrapy.downloadermiddlewares.cookies.CookiesMiddleware',
     'scrapy.downloadermiddlewares.httpproxy.HttpProxyMiddleware',
     'scrapy.downloadermiddlewares.stats.DownloaderStats']
    2023-10-10 19:01:26 [scrapy.middleware] INFO: Enabled spider middlewares:
    ['scrapy.spidermiddlewares.httperror.HttpErrorMiddleware',
     'scrapy.spidermiddlewares.offsite.OffsiteMiddleware',
     'scrapy.spidermiddlewares.referer.RefererMiddleware',
     'scrapy.spidermiddlewares.urllength.UrlLengthMiddleware',
     'scrapy.spidermiddlewares.depth.DepthMiddleware']
    2023-10-10 19:01:26 [scrapy.middleware] INFO: Enabled item pipelines:
    []
    2023-10-10 19:01:26 [scrapy.core.engine] INFO: Spider opened
    2023-10-10 19:01:26 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
    2023-10-10 19:01:26 [scrapy.extensions.telnet] INFO: Telnet console listening on 127.0.0.1:6023
    Traceback (most recent call last):
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\Scripts\scrapy.exe\__main__.py", line 7, in <module>
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\cmdline.py", line 158, in execute
        _run_print_help(parser, _run_command, cmd, args, opts)
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\cmdline.py", line 111, in _run_print_help
        func(*a, **kw)
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\cmdline.py", line 166, in _run_command
        cmd.run(args, opts)
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\commands\crawl.py", line 30, in run
        self.crawler_process.start()
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\crawler.py", line 383, in start
        install_shutdown_handlers(self._signal_shutdown)
      File "C:\Users\winds\AppData\Local\Programs\Python\Python37\lib\site-packages\scrapy\utils\ossignal.py", line 19, in install_shutdown_handlers
        reactor._handleSignals()
    AttributeError: 'AsyncioSelectorReactor' object has no attribute '_handleSignals'
    

    The Twisted version conflicts with this Scrapy version: Twisted 23.8.0 removed the private reactor._handleSignals method that Scrapy 2.9's shutdown-handler code still calls. Uninstall Twisted and install an older release (alternatively, upgrading Scrapy to a release that supports Twisted 23.8 should also work).

    Check version: pip show Twisted
    Version: 23.8.0

    Uninstall: pip uninstall Twisted
    Install: pip install -i http://pypi.douban.com/simple --trusted-host pypi.douban.com Twisted==22.10.0
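For context, the failing call sits in Scrapy's install_shutdown_handlers, which only wires OS signals to a shutdown callback. The same effect can be achieved with the standard signal module alone, without the private Twisted method. A minimal sketch of the idea, not Scrapy's actual implementation (the handler name is ours):

```python
import signal

def shutdown_handler(signum, frame):
    # In Scrapy this would stop the crawler gracefully; here we just report it.
    print(f"received signal {signum}, shutting down")

# Registering SIGINT via the stdlib avoids depending on reactor._handleSignals,
# the private API that Twisted 23.8.0 removed.
previous = signal.signal(signal.SIGINT, shutdown_handler)
print(signal.getsignal(signal.SIGINT) is shutdown_handler)  # True
```

This is essentially why a newer Scrapy no longer trips over the Twisted change: signal registration does not have to go through the reactor at all.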

    Original article: https://www.haomeiwen.com/subject/srnibdtx.html