Example: data.xml. The data in the file can be viewed as a tree: annotation is the root node; folder, filename, path, source, size, segmented, and object are children of annotation; database is a child of source; and so on.
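To make the tree view concrete, here is a minimal sketch that parses a small annotation string with the standard library and prints each tag indented by its depth. The trimmed-down XML string and the helper name `walk` are assumptions for illustration; they are not taken from data.xml itself.

```python
import xml.etree.ElementTree as ET

# A trimmed-down stand-in for data.xml (assumed content, for illustration only).
SAMPLE = """<annotation>
  <folder>jpg</folder>
  <source>
    <database>Unknown</database>
  </source>
</annotation>"""


def walk(node, depth=0):
    """Return (depth, tag) pairs for node and all its descendants, in document order."""
    pairs = [(depth, node.tag)]
    for child in node:  # iterating an Element yields its direct children
        pairs.extend(walk(child, depth + 1))
    return pairs


root = ET.fromstring(SAMPLE)
tree_pairs = walk(root)
for depth, tag in tree_pairs:
    print("  " * depth + tag)
```

Each extra level of indentation in the printed result corresponds to one step down from the root.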
1. Creating the XML file
import os

from lxml.etree import Element, SubElement, tostring


def csvtoxml(image_name, bbox, label, save_dir='Annotations', width=640, height=480, channel=3):
    '''
    Write an annotation XML file for one image.

    :param image_name: image file name
    :param bbox: bounding boxes of the image, each unpacked as (x, y, h, w)
    :param label: class label written to the name node of every object
    :param save_dir: directory the XML file is saved to (it must already exist)
    :param width: image width; the dataset used here has a fixed size, so a default is set
    :param height: image height; the dataset used here has a fixed size, so a default is set
    :param channel: number of image channels; fixed for this dataset, so a default is set
    :return: None
    '''
    node_root = Element('annotation')

    node_folder = SubElement(node_root, 'folder')
    node_folder.text = 'jpg'

    node_filename = SubElement(node_root, 'filename')
    node_filename.text = image_name

    node_size = SubElement(node_root, 'size')
    node_width = SubElement(node_size, 'width')
    node_width.text = '%s' % width
    node_height = SubElement(node_size, 'height')
    node_height.text = '%s' % height
    node_depth = SubElement(node_size, 'depth')
    node_depth.text = '%s' % channel

    for x, y, h, w in bbox:
        # (x, y) is the top-left corner; w and h are the width and height of the box.
        left, top, box_w, box_h = x, y, w, h

        node_object = SubElement(node_root, 'object')
        node_name = SubElement(node_object, 'name')
        node_name.text = label
        node_pose = SubElement(node_object, 'pose')
        node_pose.text = 'Unspecified'
        node_difficult = SubElement(node_object, 'difficult')
        node_difficult.text = '0'

        node_bndbox = SubElement(node_object, 'bndbox')
        node_xmin = SubElement(node_bndbox, 'xmin')
        node_xmin.text = '%s' % int(left)
        node_ymin = SubElement(node_bndbox, 'ymin')
        node_ymin.text = '%s' % int(top)
        node_xmax = SubElement(node_bndbox, 'xmax')
        node_xmax.text = '%s' % int(left + box_w)
        node_ymax = SubElement(node_bndbox, 'ymax')
        node_ymax.text = '%s' % int(top + box_h)

    # tostring() returns bytes, so the file is opened in binary mode.
    xml = tostring(node_root, pretty_print=True)

    # Swap only the file extension, so a 'jpg' elsewhere in the name is left alone.
    save_xml = os.path.join(save_dir, os.path.splitext(image_name)[0] + '.xml')
    with open(save_xml, 'wb') as f:
        f.write(xml)
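The result looks like this (the sample also contains path, source, segmented, and truncated nodes, which the function above does not write):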
<annotation>
  <folder>jpg</folder>
  <filename>A000844.jpg</filename>
  <path>Annotations/A000844.jpg</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>cat</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>300</xmin>
      <ymin>74</ymin>
      <xmax>344</xmax>
      <ymax>347</ymax>
    </bndbox>
  </object>
</annotation>
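The loop in the function above turns each box, given as a top-left corner plus a size, into the corner coordinates stored under bndbox. A minimal sketch of that arithmetic follows. The helper name `to_corners` is hypothetical, and the input values are chosen so that they reproduce the bndbox in the sample above.

```python
def to_corners(x, y, w, h):
    """Convert a top-left corner plus width/height into (xmin, ymin, xmax, ymax)."""
    return int(x), int(y), int(x + w), int(y + h)


# A box whose top-left corner is (300, 74), 44 wide and 273 tall.
corners = to_corners(300, 74, 44, 273)
print(corners)  # (300, 74, 344, 347)
```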
2. Reading the XML file to be modified:
import xml.etree.ElementTree as ET
tree = ET.parse('data.xml')
root = tree.getroot()
3. Searching
(1) List the child nodes of the root node
[child.tag for child in root]
## output: ['folder', 'filename', 'path', 'source', 'size', 'segmented', 'object']
(2) Read the value of a leaf node: parent.find('leaf').text, where parent is the leaf's parent node
print(root.find('folder').text)
print(root.find('filename').text)
print(root.find('path').text)
print(root.find('size')[2].text)
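find() returns only the first matching child, so a file with several object nodes needs findall(). The sketch below collects the name and box of every object; the inline XML string, including the second "dog" object, is an assumption made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed sample with two objects (the second one is invented for this example).
SAMPLE = """<annotation>
  <object>
    <name>cat</name>
    <bndbox><xmin>300</xmin><ymin>74</ymin><xmax>344</xmax><ymax>347</ymax></bndbox>
  </object>
  <object>
    <name>dog</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>220</ymax></bndbox>
  </object>
</annotation>"""

root = ET.fromstring(SAMPLE)

boxes = []
for obj in root.findall('object'):      # all direct children tagged 'object'
    name = obj.findtext('name')         # text of the first <name> child
    bnd = obj.find('bndbox')
    coords = tuple(int(bnd.findtext(key)) for key in ('xmin', 'ymin', 'xmax', 'ymax'))
    boxes.append((name, coords))

print(boxes)
```

Looking nodes up by tag name, rather than by position as in root.find('size')[2], keeps the code working if the order of the children changes.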
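4. Modifying:
(1) Change the content of a leaf node: Element.text
(2) Add or modify an attribute: Element.set()
(3) Add a new child node: Element.append()
(4) Remove a child node: Element.remove()
(5) Finally, call ElementTree.write() to save the additions and changes.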
import os

root[0].text = 'JPEGImages'                                # folder
root[1].text = root[1].text.replace('jpg', 'txt')          # filename
root[2].text = os.path.splitext(root[2].text)[0] + '.txt'  # path
root[4][2].text = str(1)                                   # size/depth
# root.set('updated', 'yes')
tree.write('output.xml')
The output.xml file after modification:
<annotation>
  <folder>JPEGImages</folder>
  <filename>A000844.txt</filename>
  <path>Annotations/A000844.txt</path>
  <source>
    <database>Unknown</database>
  </source>
  <size>
    <width>640</width>
    <height>480</height>
    <depth>1</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>cat</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <bndbox>
      <xmin>300</xmin>
      <ymin>74</ymin>
      <xmax>344</xmax>
      <ymax>347</ymax>
    </bndbox>
  </object>
</annotation>
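The code above only changes node text. Items (2) to (4) of the list could be sketched as follows, working on a small in-memory tree so that no file is needed. The inline XML string and the note element are hypothetical and serve only as illustration.

```python
import xml.etree.ElementTree as ET

# Assumed minimal tree, for illustration only.
SAMPLE = "<annotation><folder>jpg</folder><segmented>0</segmented></annotation>"
root = ET.fromstring(SAMPLE)

# (2) Add or modify an attribute on the root node.
root.set('updated', 'yes')

# (3) Append a new child node (the 'note' tag is a made-up example).
note = ET.Element('note')
note.text = 'checked'
root.append(note)

# (4) Remove a child node; remove() takes the Element itself, not a tag name.
root.remove(root.find('segmented'))

result = ET.tostring(root, encoding='unicode')
print(result)
```

With a tree loaded through ET.parse(), the same calls would be followed by tree.write() to save the result to disk.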