Commonly used XPath expressions

Author: tkpy | Published: 2018-08-01 17:28

    Fuzzy (substring) matching in XPath

    //div[contains(@class, "history-loadmore") and not(contains(@class, "history-loadmore hide"))]
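
A minimal sketch of how `contains()` and `not()` combine, assuming lxml is installed. The markup is invented for illustration: two "load more" blocks, one of which carries an extra `hide` class.

```python
from lxml import etree

# Hypothetical markup, not taken from any real page.
page = etree.HTML("""
<div class="history-loadmore">Load more</div>
<div class="history-loadmore hide">Load more (hidden)</div>
<div class="footer">Footer</div>
""")

# Keep divs whose class contains "history-loadmore",
# but drop those whose class contains "history-loadmore hide".
visible = page.xpath(
    '//div[contains(@class, "history-loadmore") '
    'and not(contains(@class, "history-loadmore hide"))]'
)
texts = [div.text for div in visible]
print(texts)
```

Because `contains()` is a plain substring test, it also matches class values such as `history-loadmore-old`; a stricter match needs an exact comparison on `@class`.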
    

    Selecting sibling nodes

    # the next sibling (first <a> after the <span>)
    //div[@class='listpage']/span/following-sibling::a[1]
    # the previous sibling (nearest <tr> before the matched <tr>)
    //div[@class='address-row']/table/tbody/tr[@id='submitTime']/preceding-sibling::tr[1]
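
The sibling axes can be sketched on a small pager, assuming lxml. The markup below is made up; the point is that `[1]` on either axis means "nearest to the context node", so on `preceding-sibling` it counts backwards.

```python
from lxml import etree

# Hypothetical pager markup for illustration only.
page = etree.HTML("""
<div class="listpage">
  <a href="/page/0">first</a>
  <a href="/page/1">prev</a>
  <span>2</span>
  <a href="/page/3">next</a>
  <a href="/page/9">last</a>
</div>
""")

# Nearest <a> after the <span>.
next_link = page.xpath(
    "//div[@class='listpage']/span/following-sibling::a[1]/@href")
# Nearest <a> before the <span> (the reverse axis counts backwards).
prev_link = page.xpath(
    "//div[@class='listpage']/span/preceding-sibling::a[1]/@href")

print(next_link, prev_link)
```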
    

    Getting the parent node

    //div[@class='page-box house-lst-page-box']/parent::div
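
A short sketch of the `parent::` axis, assuming lxml and an invented wrapper `div` with `id="wrapper"`. The abbreviated form `..` selects the same parent without the element-name test.

```python
from lxml import etree

# Hypothetical markup: the id "wrapper" is an assumption for the demo.
page = etree.HTML("""
<div id="wrapper">
  <div class="page-box house-lst-page-box">1 2 3</div>
</div>
""")

# Explicit axis: the parent, but only if it is a <div>.
parent = page.xpath("//div[@class='page-box house-lst-page-box']/parent::div")
# Abbreviated form: the parent, whatever its tag.
short = page.xpath("//div[@class='page-box house-lst-page-box']/..")

parent_id = parent[0].get("id")
short_id = short[0].get("id")
print(parent_id, short_id)
```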
    

    Selecting by position in XPath

    # position greater than 1 (every <li> except the first)
    //li[position()>1]
    # the last one
    //li[last()]
    # the second to last
    //li[last()-1]
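
The positional predicates are evaluated per parent, which is easy to miss. A small sketch with two invented lists, assuming lxml, makes this visible; wrapping the path in parentheses switches to document-wide positions.

```python
from lxml import etree

# Two hypothetical lists so that per-parent counting is visible.
page = etree.HTML(
    "<ul><li>a</li><li>b</li><li>c</li></ul>"
    "<ul><li>x</li><li>y</li></ul>"
)

after_first = page.xpath("//li[position()>1]/text()")   # skips the first li of each list
last_each = page.xpath("//li[last()]/text()")           # last li of each list
second_last = page.xpath("//li[last()-1]/text()")       # second-to-last li of each list
last_overall = page.xpath("(//li)[last()]/text()")      # last li in the whole document

print(after_first, last_each, second_last, last_overall)
```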
    

    Filtering list items by date

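    # translate() strips the characters 更新时间 ("updated at") and "-", so "更新时间2017-12-05" becomes 20171205
    //span[@class='light' and number(translate(text(),'更新时间-',''))>20171204]/../../../../h3/a/@href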
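
To see how the date filter works end to end, here is a sketch assuming lxml and an invented list layout in which the date `span` sits four levels below the `li` that also holds the `h3` link. `translate()` deletes the label characters and hyphens, `number()` turns the remainder into a number, and the four `..` steps climb back to the list item.

```python
from lxml import etree

# Hypothetical listing: the nesting depth is chosen to match the four ".." steps.
page = etree.HTML("""
<ul>
  <li>
    <h3><a href="/item/1">Item 1</a></h3>
    <div><div><p><span class="light">更新时间2017-12-05</span></p></div></div>
  </li>
  <li>
    <h3><a href="/item/2">Item 2</a></h3>
    <div><div><p><span class="light">更新时间2017-12-01</span></p></div></div>
  </li>
</ul>
""")

# "更新时间2017-12-05" -> "20171205" -> 20171205, compared against 20171204.
recent = page.xpath(
    "//span[@class='light' and "
    "number(translate(text(),'更新时间-',''))>20171204]"
    "/../../../../h3/a/@href"
)
print(recent)
```

The comparison only works while the dates are zero-padded in year-month-day order; other date formats would need different handling.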
    

    Getting an element's HTML with XPath

        from lxml import etree

        # html is an lxml tree, e.g. html = etree.HTML(page_source)
        content_html = html.xpath("//div[@class='show-content-free']")
        content_html = etree.tostring(content_html[0], encoding='UTF-8', pretty_print=False, method='html')
        content_html = content_html.decode()
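
A self-contained version of the same steps, assuming lxml and a made-up HTML fragment, might look like this. `etree.tostring` returns bytes when given a byte encoding such as `UTF-8`, hence the final `decode()`.

```python
from lxml import etree

# Hypothetical fragment standing in for a downloaded page.
page = etree.HTML(
    '<div class="show-content-free"><p>Hello <b>XPath</b></p></div>'
)

node = page.xpath("//div[@class='show-content-free']")[0]

# Serialize the element, markup included, then decode the bytes to str.
content_html = etree.tostring(
    node, encoding="UTF-8", pretty_print=False, method="html"
).decode()

# Passing encoding="unicode" makes lxml return a str directly.
content_html_str = etree.tostring(node, encoding="unicode", method="html")

print(content_html)
```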
    

    The XPath string() function

    # in lxml, string() returns a str, so the result is not indexed with [0]
    content_text = html.xpath("string(//div[@class='show-content-free'])")
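
A brief sketch of what `string()` yields in lxml, using an invented fragment. The result is a single string that concatenates every text node under the first matching element, so indexing it gives one character rather than one element.

```python
from lxml import etree

# Hypothetical fragment for illustration.
page = etree.HTML(
    '<div class="show-content-free"><p>Hello <b>XPath</b></p><p>world</p></div>'
)

content_text = page.xpath("string(//div[@class='show-content-free'])")

print(content_text)       # all descendant text, markup removed
print(content_text[0])    # a single character, not an element
```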
    

    Getting an element's HTML from a response

            # response is assumed to be an lxml tree here; etree.tostring needs an lxml element
            content_html = response.xpath("//div[@class='txt_con']")
            content_html = etree.tostring(content_html[0], encoding='UTF-8', pretty_print=False, method='html')
            content_html = content_html.decode()
    

    Getting all the text inside an element (requests)

            content_text = response.xpath("//div[@id='ctrlfscont']")
            # string(.) already returns a str, so strip() is enough
            content_text = content_text[0].xpath('string(.)').strip()
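
The snippet above raises an `IndexError` when nothing matches. One way to guard against that is a small helper; `first_text` is a hypothetical name introduced here, and the sketch assumes an lxml tree built from a made-up fragment rather than a live response.

```python
from lxml import etree


def first_text(tree, expr):
    """Return the stripped text of the first node matching expr, or None."""
    nodes = tree.xpath(expr)
    if not nodes:
        return None
    # string(.) joins all descendant text nodes of the matched element.
    return nodes[0].xpath("string(.)").strip()


# Hypothetical fragment standing in for a downloaded page.
page = etree.HTML(
    '<div id="ctrlfscont">\n  <p>First line</p>\n  <p>Second line</p>\n</div>'
)

found = first_text(page, "//div[@id='ctrlfscont']")
missing = first_text(page, "//div[@id='no-such-id']")
print(repr(found), missing)
```

`strip()` only trims the two ends; whitespace between the paragraphs is kept as it appears in the source.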
    


Article link: https://www.haomeiwen.com/subject/qsmrvftx.html