webdriver: getting the page source via browser.page_source
Author: 沫明 | Published: 2019-02-21 15:39
Use webdriver's browser.page_source to get the source code of the rendered page, then extract the data you need with XPath.
import time

from lxml import etree
from selenium import webdriver


def danwei2():
    # Selenium 3 style calls: the first positional argument is the driver path.
    browser = webdriver.Ie(r'D:\driver\IEDriverServer.exe')
    # Firefox takes a profile as its first positional argument, so name the path explicitly.
    # browser = webdriver.Firefox(executable_path=r'D:\driver\geckodriver')
    # browser = webdriver.Chrome(r'D:\driver\chromedriver.exe')
    # browser = webdriver.PhantomJS(r'D:\phantomjs-2.1.1-windows\bin\phantomjs')
    url = 'http://search.gjsy.gov.cn:9090/queryAll/listFrame2?page=2&districtCode=130300&checkYear=2017&sydwName=&selectPage=0'
    try:
        browser.get(url)
        time.sleep(0.5)  # crude fixed wait for the page to finish rendering
        res = browser.page_source  # HTML of the page as currently rendered
        print(res)  # print the source
    finally:
        browser.quit()  # always close the browser, even if loading fails
    rs1 = etree.HTML(res)  # parse the HTML string into an lxml element tree
    num = rs1.xpath('//*[@id="searchForm"]/table/tbody/tr/td/table/tbody/tr[2]/td/table/tbody/tr[9]/td/table/tbody/tr[2]/td/table/tbody/tr[2]/td[2]/a/text()')
    unit = rs1.xpath('//*[@id="searchForm"]/table/tbody/tr/td/table/tbody/tr[2]/td/table/tbody/tr[9]/td/table/tbody/tr[2]/td/table/tbody/tr[2]/td[3]/a/text()')
    print(num)
    print(unit)
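
The XPath step can be tried without a browser. The following is a minimal sketch, assuming a made-up HTML string in place of browser.page_source; the markup in SAMPLE_HTML and the helper name extract_rows are hypothetical and do not reproduce the real page. It also uses a shorter relative XPath instead of the full absolute path, which is a substitution for illustration and may need adjusting for the real page.

```python
from lxml import etree

# Hypothetical stand-in for browser.page_source; the real page's markup
# is not shown in the article, so this structure is an assumption.
SAMPLE_HTML = """
<html><body>
<form id="searchForm">
  <table>
    <tbody>
      <tr><td>No.</td><td>Code</td><td>Unit</td></tr>
      <tr><td>1</td><td><a href="#">A001</a></td><td><a href="#">Example Unit One</a></td></tr>
      <tr><td>2</td><td><a href="#">A002</a></td><td><a href="#">Example Unit Two</a></td></tr>
    </tbody>
  </table>
</form>
</body></html>
"""


def extract_rows(page_source):
    """Return (code, unit) pairs from every table row that contains links."""
    tree = etree.HTML(page_source)  # parse the HTML string into an element tree
    # Select rows under the form that have at least one linked cell.
    rows = tree.xpath('//form[@id="searchForm"]//tr[td/a]')
    results = []
    for row in rows:
        num = row.xpath('./td[2]/a/text()')   # link text in the second cell
        unit = row.xpath('./td[3]/a/text()')  # link text in the third cell
        if num and unit:
            results.append((num[0].strip(), unit[0].strip()))
    return results


if __name__ == "__main__":
    for num, unit in extract_rows(SAMPLE_HTML):
        print(num, unit)
```

Iterating over rows and querying each one with a relative path keeps the code and the unit name paired, whereas two separate absolute queries return two unrelated lists.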