Python lxml: Summary of Parsing Issues

Author: tafanfly | Published: 2019-03-01 10:06

    Environment: Python 3

    (1) Parsing XML raises ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.

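    • lxml refuses to parse a str (Unicode string) that carries an encoding declaration, for example XML that begins with <?xml version="1.0" encoding="UTF-8"?>. Pass the content as bytes instead, or remove the declaration. A declaration without an encoding attribute is accepted.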
    from lxml import etree
    
    # Case 1: str input, declaration has no encoding attribute -> parses fine
    content = '''<?xml version="1.0"?>
    <response version="1.0">
    <code>200</code>
    <message>Hello</message>
    </response>'''
    
    etree.fromstring(content)
    Out[12]: <Element response at 0x7f2fad8d4c08>
    
    
    
    # Case 2: str input with an encoding declaration -> ValueError
    content = '''<?xml version="1.0" encoding="utf-8"?>
    <response version="1.0">
    <code>200</code>
    <message>Hello</message>
    </response>'''
    
    etree.fromstring(content)
    ---------------------------------------------------------------------------
    ValueError: Unicode strings with encoding declaration are not supported. Please use bytes input or XML fragments without declaration.
    
    
    
    # Case 3: bytes input with an encoding declaration -> parses fine
    content = b'''<?xml version="1.0" encoding="utf-8"?>
    <response version="1.0">
    <code>200</code>
    <message>Hello</message>
    </response>'''
    
    xml = etree.fromstring(content)
    print(xml)
    <Element response at 0x7f1a1248f288>
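When the input may arrive as either str or bytes, the conversion described above can be wrapped in a small helper. The following is only a minimal sketch: `parse_xml` is a hypothetical name, not part of lxml, and it assumes that any encoding declaration in a str input says UTF-8 (or that there is no encoding attribute at all), so encoding the text to UTF-8 bytes stays consistent with the declaration.

```python
from lxml import etree


def parse_xml(content):
    """Parse XML supplied as str or bytes (hypothetical helper).

    Assumption: a str input either has no encoding declaration or
    declares UTF-8, so encoding it to UTF-8 bytes matches the declaration.
    """
    if isinstance(content, str):
        # lxml rejects str input that carries an encoding declaration,
        # so hand it bytes instead.
        content = content.encode("utf-8")
    return etree.fromstring(content)


text_xml = '''<?xml version="1.0" encoding="utf-8"?>
<response version="1.0">
<code>200</code>
<message>Hello</message>
</response>'''

root = parse_xml(text_xml)
print(root.tag, root.findtext("code"))
```

If the str declares some other encoding, encoding it to UTF-8 would contradict the declaration; in that case it is safer to encode with the declared encoding or to strip the declaration before parsing.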
    

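    etree.fromstring returns an Element object. To turn it back into a string, call etree.tostring. By default it returns bytes; pass encoding='unicode' to get a str.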

    etree.tostring(xml)
    Out[7]: b'<response version="1.0">\n<code>200</code>\n<message>Hello</message>\n</response>'
    
    etree.tostring(xml, encoding='unicode')
    Out[8]: '<response version="1.0">\n<code>200</code>\n<message>Hello</message>\n</response>'
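As a complement to the two calls above, here is a minimal sketch of serializing to bytes with the XML declaration included, which is convenient when the result is written to a file or sent over the network. The sample document and the variable names are illustrative only; `encoding` and `xml_declaration` are keyword arguments of `etree.tostring`.

```python
from lxml import etree

root = etree.fromstring(
    b'<response version="1.0"><code>200</code><message>Hello</message></response>'
)

# bytes output that includes the XML declaration with the chosen encoding
as_bytes = etree.tostring(root, encoding="utf-8", xml_declaration=True)

# str output: no declaration, suitable for printing or logging
as_text = etree.tostring(root, encoding="unicode")

# The bytes form can be parsed again directly, declaration and all
reparsed = etree.fromstring(as_bytes)

print(type(as_bytes).__name__, type(as_text).__name__, reparsed.findtext("message"))
```

Because the declared form is produced as bytes, it can be passed straight back to `etree.fromstring` without triggering the ValueError discussed in (1).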
    


    Article link: https://www.haomeiwen.com/subject/wgohuqtx.html