Scrapy dynamic data

Author: 方方块 | Published: 2017-07-18 02:34

    Dynamic data: the page content is loaded via AJAX, as on Jianshu.
    Solution: use Selenium together with Scrapy.

    https://stackoverflow.com/questions/17975471/selenium-with-scrapy-for-dynamic-page

    What is Selenium?

    Selenium is a tool that automates web browsers, mainly for testing web applications. It needs a browser-specific WebDriver (for example, chromedriver for Chrome) to start.

    • It can be used on its own or together with Scrapy. Pairing it with Scrapy is better, since Scrapy is faster and more lightweight

    Steps

    • Selenium drives the browser and runs the page's JavaScript, which loads the AJAX content
      • Install: pip install selenium && brew install chromedriver
      • The spider overrides start_requests, which returns a request
      • It uses a selector to extract the data
      • It returns an HTTP request with a URL and a callback

    Last words

    Using it to scrape is fairly repetitive. To reproduce the AJAX GET or POST requests a page makes, you mostly just call click(), a jQuery-like wrapper, on the element that triggers them.
