xpath lxml.etree

Author: chliar | Published: 2018-08-03 10:38

Get part of the HTML, including child elements

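from lxml import etree

# driver is assumed to be an already-created Selenium WebDriver
driver.get('https://index.baidu.com/#/')
html = etree.HTML(driver.page_source)
# Serialize the first matching div together with its children (returns bytes)
print(etree.tostring(html.xpath('//div[@class="index-logo"]')[0]))
# Or, for any element i, decode the result to get a str:
etree.tostring(i, encoding='utf8').decode('utf8')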

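Fuzzy matching:
# j holds the substring to look for in the id attribute
mytree = tree.xpath('//div[contains(@id,"%s")]' % j)
# Match ids containing "nav-main-" but not "nav-main-past"
tree.xpath('//div[contains(@id,"nav-main-") and not(contains(@id,"nav-main-past"))]')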
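A small self-contained sketch of the two fuzzy-matching queries, run against made-up markup (the ids below are illustrative, not taken from any real page). Instead of `%` string formatting, it passes the substring as an XPath variable, which lxml accepts as a keyword argument to `xpath()`.

```python
from lxml import etree

# Hypothetical sample markup
doc = etree.HTML('''
<div id="nav-main-home">Home</div>
<div id="nav-main-past-2017">Archive</div>
<div id="nav-main-news">News</div>
<div id="footer">Footer</div>
''')

# $p is an XPath variable; lxml binds it from the keyword argument p
prefix = "nav-main-"
matched = doc.xpath('//div[contains(@id, $p)]', p=prefix)

# Combine contains() with not() to exclude a subset of the matches
current = doc.xpath(
    '//div[contains(@id, "nav-main-") and not(contains(@id, "nav-main-past"))]'
)

ids_all = [d.get("id") for d in matched]
ids_current = [d.get("id") for d in current]
print(ids_all)
print(ids_current)
```

Using a variable sidesteps quoting problems when the substring itself contains quote characters.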



soup = etree.HTML(data)
print(soup.xpath('//ul[@class="text-list"]/li/a[contains(text(),"如何")]/text()'))
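One thing worth knowing about `contains(text(), ...)`: in XPath 1.0, which lxml implements, a node-set passed to a string function is converted using only its first node, so only the first direct text node of each `a` is tested. The sketch below, on made-up list data, compares it with `contains(., ...)`, which tests the element's full string value.

```python
from lxml import etree

# Hypothetical sample data; the second link splits its text around a <b> tag
data = '''
<ul class="text-list">
  <li><a href="/1">如何学习 XPath</a></li>
  <li><a href="/2">教程：<b>如何</b>安装 lxml</a></li>
  <li><a href="/3">其他</a></li>
</ul>
'''
soup = etree.HTML(data)

# Tests only the first direct text node of each <a>
first_text_only = soup.xpath(
    '//ul[@class="text-list"]/li/a[contains(text(),"如何")]/@href'
)

# Tests the concatenated text of the <a> and all of its descendants
full_string = soup.xpath(
    '//ul[@class="text-list"]/li/a[contains(.,"如何")]/@href'
)

print(first_text_only)
print(full_string)
```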

# Get the text of all descendant tags
soup.xpath('//ul[@class="text-list"]/li/a[contains(text(),"如何")]')[0].xpath('string(.)')
# Scrapy selector equivalent (node is a Scrapy Selector, not an lxml element)
c_info = node.xpath('./*//span[@class="c-info"]').xpath('string(.)').extract_first()
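To illustrate what `string(.)` adds over `text()`, here is a minimal lxml-only sketch on a made-up snippet: `text()` returns just the element's direct text nodes, while `string(.)` concatenates the text of the element and all of its descendants.

```python
from lxml import etree

# Hypothetical markup with text split across a nested tag
node = etree.HTML(
    '<div><p><span class="c-info">Price: <b>42</b> USD</span></p></div>'
)
span = node.xpath('//span[@class="c-info"]')[0]

direct = span.xpath('text()')       # direct text nodes only, as a list
all_text = span.xpath('string(.)')  # one string covering all descendants
joined = ''.join(span.itertext())   # lxml-native way to get the same text

print(direct)
print(all_text)
```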

# Get tags that have a given attribute
soup.xpath('//ul[@class]')  # ul tags that have a class attribute

XPath syntax

Get child content including HTML tags (partial source):
response.xpath('//div[@class="index-logo"]/node()').extract()  # response is a Scrapy response

//title[@lang]    Selects all title elements that have an attribute named lang.
//title[@lang='eng']  Selects all title elements whose lang attribute has the value eng.
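The two attribute predicates can be tried on a small made-up XML document (the bookstore data below is purely illustrative). The sketch also shows `not(@lang)`, the complementary test for elements that lack the attribute.

```python
from lxml import etree

# Hypothetical sample document
xml = b'''<bookstore>
  <book><title lang="eng">Harry Potter</title></book>
  <book><title lang="fr">Le Petit Prince</title></book>
  <book><title>Untitled</title></book>
</bookstore>'''
root = etree.fromstring(xml)

with_lang = root.xpath('//title[@lang]/text()')       # attribute present
english = root.xpath("//title[@lang='eng']/text()")   # attribute equals "eng"
without = root.xpath('//title[not(@lang)]/text()')    # attribute absent

print(with_lang)
print(english)
print(without)
```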

Article link: https://www.haomeiwen.com/subject/exkyvftx.html
