使用Python读写xml文件

作者: cws | 来源:发表于2017-09-15 11:43 被阅读139次

使用Python读写xml文件
python 数据分析基础 day5－读写csv文件
Python对txt、csv以及xml文件读写数据
Python学习记录之：IO编程
使用C#代码读写XML文件
python xml文件的读写
如何使用python读写文件？
xml编程
14、文件的内建函数
文件和异常处理

原文链接：http://wyb0.com/posts/python-read-and-write-xml/

0x00 解析XML的方法

SAX (simple API for XML)
python 标准库包含SAX解析器，SAX用事件驱动模型，通过在解析XML的过程中触发一个个的事件并调用用户定义的回调函数来处理XML文件。
DOM(Document Object Model)
将XML数据在内存中解析成一个树，通过对树的操作来操作XML。
ElementTree(元素树)
ElementTree就像一个轻量级的DOM，具有方便友好的API。代码可用性好，速度快，消耗内存少。

我在这里使用ElementTree

0x01 Element对象的属性

每个Element对象都具有以下属性：

tag：string对象，表示数据代表的种类
attrib：dictionary对象，表示附有的属性
text：string对象，表示element的内容
tail：string对象，表示element闭合之后的尾迹
若干子元素（child elements）

>>> from xml.etree import ElementTree as ET
>>> xml = """<books>
...   <book id='37476'>aaaa</book>
...   <book id='83727'>bbbb</book>
... </books>"""
>>> root = ET.fromstring(xml)
>>> root.tag
'books'
>>> child = root.getchildren()
>>> child
[<Element 'book' at 0x106f59410>, <Element 'book' at 0x106f59450>]
>>> child[0].tag
'book'
>>> child[0].attrib
{'id': '37476'}
>>> child[0].text
'aaaa'

0x02 文件内容

<?xml version='1.0' encoding='UTF-8'?>
<books>
  <book>
    <name>Python黑帽子</name>
    <date>2015</date>
    <price>37￥</price>
    <description>用python写一些程序</description>
  </book>
  <book>
    <name>Web安全深度剖析</name>
    <date>2014</date>
    <price>39￥</price>
    <description>讲述web渗透的基础知识</description>
  </book>
  <book>
    <name>白帽子讲web安全</name>
    <date>2013</date>
    <price>44￥</price>
    <description>道哥力作</description>
  </book>
</books>

0x03 读取xml节点

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from xml.etree import ElementTree as ET

tree = ET.parse('test.xml')
root = tree.getroot()
# root = ET.fromstring(country_data_as_string) #通过字符串导入,直接获取根
childs = root.getchildren()

books = []
for child0 in childs:
    book = {}
    for child00 in child0.getchildren():
        # print child00.tag #标签名，即name、date、price、description
        # print child00.text
        book[child00.tag] = child00.text
    books.append(book)

print books
"""
books = [
    {'name': 'Python黑帽子','date': '2015','price': '37￥','description': '用python写一些程序'},
    {'name': 'Web安全深度剖析','date': '2014','price': '39￥','description': '讲述web渗透的基础知识'},
    {'name': '白帽子讲web安全','date': '2013','price': '44￥','description': '道哥力作'}        
]
"""

0x04 写入xml文件

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from xml.etree.ElementTree import Element,ElementTree

books = [
    {
        'name': u'Python黑帽子',
        'date': '2015',
        'price': u'37￥',
        'description': u'用python写一些程序'
    },
    {
        'name': u'Web安全深度剖析',
        'date': '2014',
        'price': u'39￥',
        'description': u'讲述web渗透的基础知识'
    },
    {
        'name': u'白帽子讲web安全',
        'date': '2013',
        'price': u'44￥',
        'description': u'道哥力作'
    }        
]

def indent(elem, level=0):
    """美化写入文件的内容"""
    i = "\n" + level*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for elem in elem:
            indent(elem, level+1)
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i


root = Element('books')
tree = ElementTree(root)

for book in books:
    child0 = Element('book')
    root.append(child0)

    for k,v in book.items():
        child00 = Element(k)
        child00.text = v
        child0.append(child00)

indent(root,0)
tree.write('aa.xml', 'UTF-8')