Getting the page source with webdriver's browser.page_source

Author: 沫明 | Published: 2019-02-21 15:39

Use webdriver's browser.page_source to get the page's HTML source, then extract the data you need with XPath.

import time

from lxml import etree
from selenium import webdriver


def danwei2():
    # Selenium 3 style: the driver path is passed directly to the constructor.
    browser = webdriver.Ie(r'D:\driver\IEDriverServer.exe')
    # browser = webdriver.Firefox(executable_path=r'D:\driver\geckodriver')
    # browser = webdriver.Chrome(r'D:\driver\chromedriver.exe')
    # browser = webdriver.PhantomJS(r'D:\phantomjs-2.1.1-windows\bin\phantomjs')
    url = 'http://search.gjsy.gov.cn:9090/queryAll/listFrame2?page=2&districtCode=130300&checkYear=2017&sydwName=&selectPage=0'

    browser.get(url)
    time.sleep(0.5)  # brief pause so the page can finish loading

    res = browser.page_source  # HTML source of the current page
    print(res)  # print the source
    rs1 = etree.HTML(res)  # parse the HTML string into an lxml element tree
    num = rs1.xpath('//*[@id="searchForm"]/table/tbody/tr/td/table/tbody/tr[2]/td/table/tbody/tr[9]/td/table/tbody/tr[2]/td/table/tbody/tr[2]/td[2]/a/text()')
    unit = rs1.xpath('//*[@id="searchForm"]/table/tbody/tr/td/table/tbody/tr[2]/td/table/tbody/tr[9]/td/table/tbody/tr[2]/td/table/tbody/tr[2]/td[3]/a/text()')
    print(num)
    print(unit)
    browser.quit()  # close the browser and end the driver session
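
The parsing step can be tried without launching a browser. The sketch below is only an illustration: SAMPLE_HTML is a made-up stand-in for browser.page_source (the real page's markup is not shown here and may differ), and extract_rows is a hypothetical helper name. It assumes lxml is installed. It also uses a shorter XPath anchored on the searchForm id instead of the long absolute path, which tends to be less fragile when the page layout shifts.

```python
from lxml import etree

# Hypothetical stand-in for browser.page_source; the real page's markup may differ.
SAMPLE_HTML = """
<html><body>
<form id="searchForm">
  <table><tbody>
    <tr><td>No.</td><td>Code</td><td>Name</td></tr>
    <tr><td>1</td><td><a href="#">A-001</a></td><td><a href="#">Example Unit One</a></td></tr>
    <tr><td>2</td><td><a href="#">A-002</a></td><td><a href="#">Example Unit Two</a></td></tr>
  </tbody></table>
</form>
</body></html>
"""


def extract_rows(html):
    """Parse an HTML string and return (code, name) pairs from the table rows."""
    tree = etree.HTML(html)  # build an element tree from the HTML string
    pairs = []
    # Skip the first row of each table, then read the link text in columns 2 and 3.
    for row in tree.xpath('//*[@id="searchForm"]//tr[position() > 1]'):
        code = row.xpath('./td[2]/a/text()')
        name = row.xpath('./td[3]/a/text()')
        if code and name:  # ignore rows that do not contain both links
            pairs.append((code[0].strip(), name[0].strip()))
    return pairs


if __name__ == "__main__":
    print(extract_rows(SAMPLE_HTML))
```

With a live session, the same helper could be called as extract_rows(browser.page_source), provided the XPath is adjusted to the actual page structure.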


Article link: https://www.haomeiwen.com/subject/fwaryqtx.html