Research background
This post catalogs and summarizes the datasets commonly used in deep learning.
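1. COCO dataset
COCO (Common Objects in Context) is a large-scale dataset for image recognition, segmentation, and captioning. It supports several competition tracks. The one most relevant here is the detection track, part of which is devoted to the segmentation problem.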
COCO2014 category summary
COCO stores its object detection annotations as JSON. The content is essentially a dictionary whose values are nested dictionaries and lists.
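To make the "dictionary of lists and dictionaries" layout concrete, here is a minimal sketch. The top-level keys (`images`, `annotations`, `categories`) and the field names follow the COCO instance annotation format; the sample values and the helper `index_annotations` are made up for illustration and do not come from the real annotation files.

```python
import json

# Hypothetical, hand-written sample that mimics the layout of an
# instances_*.json file; the values are invented for illustration.
sample = {
    "images": [
        {"id": 1, "file_name": "example.jpg", "width": 640, "height": 480},
    ],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 18,
         "bbox": [100.0, 120.0, 50.0, 80.0],  # [x, y, width, height]
         "area": 4000.0, "iscrowd": 0},
    ],
    "categories": [
        {"id": 18, "name": "dog", "supercategory": "animal"},
    ],
}


def index_annotations(coco_dict):
    """Group (category name, bbox) pairs by image id."""
    id_to_name = {c["id"]: c["name"] for c in coco_dict["categories"]}
    by_image = {}
    for ann in coco_dict["annotations"]:
        by_image.setdefault(ann["image_id"], []).append(
            (id_to_name[ann["category_id"]], ann["bbox"]))
    return by_image


# Round-trip through JSON text to show that the structure is plain JSON.
loaded = json.loads(json.dumps(sample))
by_image = index_annotations(loaded)
print(by_image[1])
```

For a real file, `sample` would be replaced by the result of `json.load` on the annotation file.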
COCO category ids and names: there are 80 categories in total, but the ids run up to 90 because some ids are unused. The categories are grouped by supercategory below; the number after # is the number of categories in the group.
person # 1
{person}
vehicle # 8
{bicycle, car, motorcycle, airplane, bus, train, truck, boat}
outdoor # 5
{traffic light, fire hydrant, stop sign, parking meter, bench}
animal # 10
{bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe}
accessory # 5
{backpack, umbrella, handbag, tie, suitcase}
sports # 10
{frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, skateboard, surfboard, tennis racket}
kitchen # 7
{bottle, wine glass, cup, fork, knife, spoon, bowl}
food # 10
{banana, apple, sandwich, orange, broccoli, carrot, hot dog, pizza, donut, cake}
furniture # 6
{chair, couch, potted plant, bed, dining table, toilet}
electronic # 6
{tv, laptop, mouse, remote, keyboard, cell phone}
appliance # 5
{microwave, oven, toaster, sink, refrigerator}
indoor # 7
{book, clock, vase, scissors, teddy bear, hair drier, toothbrush}
coco_id_name_map={1: 'person', 2: 'bicycle', 3: 'car', 4: 'motorcycle', 5: 'airplane',
6: 'bus', 7: 'train', 8: 'truck', 9: 'boat', 10: 'traffic light',
11: 'fire hydrant', 13: 'stop sign', 14: 'parking meter', 15: 'bench',
16: 'bird', 17: 'cat', 18: 'dog', 19: 'horse', 20: 'sheep', 21: 'cow',
22: 'elephant', 23: 'bear', 24: 'zebra', 25: 'giraffe', 27: 'backpack',
28: 'umbrella', 31: 'handbag', 32: 'tie', 33: 'suitcase', 34: 'frisbee',
35: 'skis', 36: 'snowboard', 37: 'sports ball', 38: 'kite', 39: 'baseball bat',
40: 'baseball glove', 41: 'skateboard', 42: 'surfboard', 43: 'tennis racket',
44: 'bottle', 46: 'wine glass', 47: 'cup', 48: 'fork', 49: 'knife', 50: 'spoon',
51: 'bowl', 52: 'banana', 53: 'apple', 54: 'sandwich', 55: 'orange',
56: 'broccoli', 57: 'carrot', 58: 'hot dog', 59: 'pizza', 60: 'donut',
61: 'cake', 62: 'chair', 63: 'couch', 64: 'potted plant', 65: 'bed', 67: 'dining table',
70: 'toilet', 72: 'tv', 73: 'laptop', 74: 'mouse', 75: 'remote', 76: 'keyboard',
77: 'cell phone', 78: 'microwave', 79: 'oven', 80: 'toaster', 81: 'sink',
82: 'refrigerator', 84: 'book', 85: 'clock', 86: 'vase', 87: 'scissors',
88: 'teddy bear', 89: 'hair drier', 90: 'toothbrush'}
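Because the ids in coco_id_name_map are sparse (80 categories spread over 1 to 90), detection code usually remaps them to contiguous indices 0 to 79 before training a classifier head. The following is a minimal sketch of that remapping. The set of unused ids is read off the gaps in the map above, and the names `UNUSED_IDS`, `coco_id_to_index`, and `index_to_coco_id` are illustrative, not part of any COCO API.

```python
# Ids that never appear as keys in coco_id_name_map (read off its gaps).
UNUSED_IDS = {12, 26, 29, 30, 45, 66, 68, 69, 71, 83}

# The 80 ids that are actually used, in ascending order.
coco_ids = [i for i in range(1, 91) if i not in UNUSED_IDS]

# Sparse COCO id -> contiguous index (0..79), and the inverse mapping.
coco_id_to_index = {cid: idx for idx, cid in enumerate(coco_ids)}
index_to_coco_id = {idx: cid for cid, idx in coco_id_to_index.items()}

print(len(coco_ids))
print(coco_id_to_index[1], coco_id_to_index[13], coco_id_to_index[90])
```

The inverse mapping is needed when writing predictions back in COCO format, since the evaluation expects the original sparse ids.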
COCO2017 category summary
COCO2017 contains 80 object categories, listed here together with a background label (id 0): ['background = 0','person=1', 'bicycle=2', 'car=3', 'motorcycle=4', 'airplane=5', 'bus=6', 'train=7', 'truck=8', 'boat=9', 'traffic light=10', 'fire hydrant=11', 'stop sign=13', 'parking meter=14', 'bench=15', 'bird=16', 'cat=17', 'dog=18', 'horse=19', 'sheep=20', 'cow=21', 'elephant=22', 'bear=23', 'zebra=24', 'giraffe=25', 'backpack=27', 'umbrella=28', 'handbag=31', 'tie=32', 'suitcase=33', 'frisbee=34', 'skis=35', 'snowboard=36', 'sports ball=37', 'kite=38', 'baseball bat=39', 'baseball glove=40', 'skateboard=41', 'surfboard=42', 'tennis racket=43', 'bottle=44', 'wine glass=46', 'cup=47', 'fork=48', 'knife=49', 'spoon=50', 'bowl=51', 'banana=52', 'apple=53', 'sandwich=54', 'orange=55', 'broccoli=56', 'carrot=57', 'hot dog=58', 'pizza=59', 'donut=60', 'cake=61', 'chair=62', 'couch=63', 'potted plant=64', 'bed=65', 'dining table=67', 'toilet=70', 'tv=72', 'laptop=73', 'mouse=74', 'remote=75', 'keyboard=76', 'cell phone=77', 'microwave=78', 'oven=79', 'toaster=80', 'sink=81', 'refrigerator=82', 'book=84', 'clock=85', 'vase=86', 'scissors=87', 'teddy bear=88', 'hair drier=89', 'toothbrush=90'].
It also has 91 stuff categories, followed by a catch-all 'other' label (id 183): ['banner=92', 'blanket=93', 'branch=94', 'bridge=95', 'building-other=96', 'bush=97', 'cabinet=98', 'cage=99', 'cardboard=100', 'carpet=101', 'ceiling-other=102', 'ceiling-tile=103', 'cloth=104', 'clothes=105', 'clouds=106', 'counter=107', 'cupboard=108', 'curtain=109', 'desk-stuff=110', 'dirt=111', 'door-stuff=112', 'fence=113', 'floor-marble=114', 'floor-other=115', 'floor-stone=116', 'floor-tile=117', 'floor-wood=118', 'flower=119', 'fog=120', 'food-other=121', 'fruit=122', 'furniture-other=123', 'grass=124', 'gravel=125', 'ground-other=126', 'hill=127', 'house=128', 'leaves=129', 'light=130', 'mat=131', 'metal=132', 'mirror-stuff=133', 'moss=134', 'mountain=135', 'mud=136', 'napkin=137', 'net=138', 'paper=139', 'pavement=140', 'pillow=141', 'plant-other=142', 'plastic=143', 'platform=144', 'playingfield=145', 'railing=146', 'railroad=147', 'river=148', 'road=149', 'rock=150', 'roof=151', 'rug=152', 'salad=153', 'sand=154', 'sea=155', 'shelf=156', 'sky-other=157', 'skyscraper=158', 'snow=159', 'solid-other=160', 'stairs=161', 'stone=162', 'straw=163', 'structural-other=164', 'table=165', 'tent=166', 'textile-other=167', 'towel=168', 'tree=169', 'vegetable=170', 'wall-brick=171', 'wall-concrete=172', 'wall-other=173', 'wall-panel=174', 'wall-stone=175', 'wall-tile=176', 'wall-wood=177', 'water-other=178', 'waterdrops=179', 'window-blind=180', 'window-other=181', 'wood=182', 'other=183'].
The dataset provides 118,287 training images, 5,000 validation images, and 40,670 test images. Because of its scale, it is now very widely used and has been important to the progress of the field. The challenge results are announced at ECCV or ICCV workshops, in some years together with the ImageNet results. Its main features are:
1) Object segmentation
2) Recognition in context
3) Superpixel stuff segmentation
4) 330K images (more than 200K labeled)
5) 1.5 million object instances
6) 80 object categories
7) 91 stuff categories
8) 5 captions per image
9) 250,000 people with keypoints
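COCO annotations cover not only category and location information but also natural-language descriptions of each image. The public release of COCO has driven major progress in image segmentation and semantic understanding in recent years, and it has become close to a standard benchmark for evaluating image semantic understanding algorithms. Note that the API for the stuff (semantic segmentation) annotations must be downloaded from: https://github.com/nightrome/cocostuffapi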
Code: extract the COCO captions (each image has 5 text descriptions)
import os

from pycocotools.coco import COCO

dataDir = './coco2017'
dataType = 'val2017'  # or 'train2017'
outDir = './' + dataType  # one .txt file per image is written here

# Initialize the COCO API for caption annotations.
annFile = '{}/annotations/captions_{}.json'.format(dataDir, dataType)
coco_caps = COCO(annFile)

imgIdsall = coco_caps.getImgIds()
print(len(imgIdsall))

os.makedirs(outDir, exist_ok=True)
for imgId in imgIdsall:
    img = coco_caps.loadImgs([imgId])[0]
    print(img)
    # Drop the extension: '<name>.jpg' -> '<name>'.
    stem = os.path.splitext(img['file_name'])[0]
    path = os.path.join(outDir, stem + '.txt')

    # Load the caption annotations of this image and write one per line.
    annIds = coco_caps.getAnnIds(imgIds=img['id'])
    anns = coco_caps.loadAnns(annIds)
    with open(path, 'w') as f:
        for ann in anns:
            print(ann['caption'])
            f.write(ann['caption'].strip() + '\n')
    print(' ')
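The same grouping can be done without pycocotools, since the caption file is ordinary JSON. The sketch below assumes the caption annotation layout (`images` entries with `id` and `file_name`, `annotations` entries with `image_id` and `caption`). The sample data is made up, and `captions_by_image` is a hypothetical helper, not part of the COCO API.

```python
from collections import defaultdict


def captions_by_image(captions_dict):
    """Map each image file name to the list of its caption strings."""
    id_to_file = {img["id"]: img["file_name"] for img in captions_dict["images"]}
    grouped = defaultdict(list)
    for ann in captions_dict["annotations"]:
        grouped[id_to_file[ann["image_id"]]].append(ann["caption"].strip())
    return dict(grouped)


# Hypothetical miniature of a captions_*.json file (invented values).
sample = {
    "images": [{"id": 7, "file_name": "example.jpg"}],
    "annotations": [
        {"id": 1, "image_id": 7, "caption": "A dog runs on grass. "},
        {"id": 2, "image_id": 7, "caption": "A brown dog outdoors."},
    ],
}

captions = captions_by_image(sample)
for line in captions["example.jpg"]:
    print(line)
```

With the real file, `sample` would come from `json.load` applied to the captions annotation file.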
Code: read file names from a list file, then copy those files from a source directory into a target directory
# -*- coding: utf-8 -*-
import os
import shutil


def re_mycopyfile(srcfile, dstdir):
    """Copy srcfile into dstdir, naming the copy after its 12-character id."""
    # The path is assumed to end in '<12-character id>.txt'.
    newname = srcfile[-16:-4]
    if not os.path.isfile(srcfile):
        print("%s not exist!" % srcfile)
        return
    os.makedirs(dstdir, exist_ok=True)  # create the target directory if needed
    dstfile = os.path.join(dstdir, newname + '.txt')
    shutil.copyfile(srcfile, dstfile)  # copy the file
    print("copy %s -> %s" % (srcfile, dstfile))


if __name__ == '__main__':
    path1 = "/home/henry/Files/ICCV2019/cocostuffapi/PythonAPI/trainls.txt"  # list of files to copy
    path2 = "/home/henry/Files/ICCV2019/cocostuffapi/PythonAPI/train2017all/"  # source directory
    path3 = "/home/henry/Files/ICCV2019/cocostuffapi/PythonAPI/train2017/"  # target directory
    path4 = "/home/henry/Files/ICCV2019/cocostuffapi/PythonAPI/trainnew.txt"  # rewritten list

    # Pass 1: copy every listed file into the target directory.
    count = 0
    with open(path1, 'r') as f:
        for line in f:
            name = line.strip()
            if not name:
                continue
            count += 1
            print(count, name)
            re_mycopyfile(os.path.join(path2, name), path3)

    # Pass 2: write a new list in which each file name is replaced by a
    # zero-padded running counter.
    name_long = 6
    count = 0
    with open(path1, 'r') as f, open(path4, 'a+') as fp:
        for line in f:
            count += 1
            out_words = line.strip().split('/')
            out_words[-1] = str(count).zfill(name_long - 1) + '.txt'
            fp.write("/".join(out_words) + "\n")
2. VOC2007 dataset
Category summary
aeroplane
bicycle
bird
boat
bottle
bus
car
cat
chair
cow
diningtable
dog
horse
motorbike
person
pottedplant
sheep
sofa
train
tvmonitor
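Several of the 20 VOC class names above are spelled differently from the corresponding COCO category names (for example, VOC uses sofa where COCO uses couch). When converting COCO annotations for a VOC-style pipeline, a small lookup keeps the names consistent. This is a sketch: `COCO_TO_VOC_NAME` and `coco_name_to_voc` are illustrative names, not part of either dataset's tooling.

```python
VOC_CLASSES = [
    "aeroplane", "bicycle", "bird", "boat", "bottle", "bus", "car", "cat",
    "chair", "cow", "diningtable", "dog", "horse", "motorbike", "person",
    "pottedplant", "sheep", "sofa", "train", "tvmonitor",
]

# Only the names that differ between the two datasets need an entry.
COCO_TO_VOC_NAME = {
    "airplane": "aeroplane",
    "dining table": "diningtable",
    "motorcycle": "motorbike",
    "potted plant": "pottedplant",
    "couch": "sofa",
    "tv": "tvmonitor",
}


def coco_name_to_voc(name):
    """Return the VOC spelling of a COCO name, or None if VOC has no such class."""
    voc_name = COCO_TO_VOC_NAME.get(name, name)
    return voc_name if voc_name in VOC_CLASSES else None


print(coco_name_to_voc("couch"), coco_name_to_voc("person"), coco_name_to_voc("zebra"))
```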
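- Converting the MSCOCO dataset format to the VOC dataset format
Reference: "Converting the COCO dataset to the VOC dataset format"
First, generate the COCO_train.json file. The category list can be edited to keep only the classes you need.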
#-*- coding:utf-8-*-
import json
className = {  # 80 categories in total; the ids are sparse (1 to 90)
1:'person',
2:'bicycle',
3:'car',
4:'motorcycle',
5:'airplane',
6:'bus',
7:'train',
8:'truck',
9:'boat',
10:'traffic light',
11:'fire hydrant',
13:'stop sign',
14:'parking meter',
15:'bench',
16:'bird',
17:'cat',
18:'dog',
19:'horse',
20:'sheep',
21:'cow',
22:'elephant',
23:'bear',
24:'zebra',
25:'giraffe',
27:'backpack',
28:'umbrella',
31:'handbag',
32:'tie',
33:'suitcase',
34:'frisbee',
35:'skis',
36:'snowboard',
37:'sports ball',
38:'kite',
39:'baseball bat',
40:'baseball glove',
41:'skateboard',
42:'surfboard',
43:'tennis racket',
44:'bottle',
46:'wine glass',
47:'cup',
48:'fork',
49:'knife',
50:'spoon',
51:'bowl',
52:'banana',
53:'apple',
54:'sandwich',
55:'orange',
56:'broccoli',
57:'carrot',
58:'hot dog',
59:'pizza',
60:'donut',
61:'cake',
62:'chair',
63:'couch',
64:'potted plant',
65:'bed',
67:'dining table',
70:'toilet',
72:'tv',
73:'laptop',
74:'mouse',
75:'remote',
76:'keyboard',
77:'cell phone',
78:'microwave',
79:'oven',
80:'toaster',
81:'sink',
82:'refrigerator',
84:'book',
85:'clock',
86:'vase',
87:'scissors',
88:'teddy bear',
89:'hair drier',
90:'toothbrush',
}
# Category ids to keep; edit this list to select a subset of classes.
classNum = sorted(className)
cocojson = "/home/ouc/data1/liuhongzhi/AttnGAN/dataset/coco2014/annotations/instances_train2014.json"


def writeNum(records):
    # Overwrite instead of appending so the file always holds valid JSON.
    with open("COCO_train.json", "w") as f:
        json.dump(records, f)


inputfile = []
cnt = 0
with open(cocojson, "r") as f:
    allData = json.load(f)
data = allData["annotations"]
print(data[1])
print("read ready")
for i in data:
    if i['category_id'] in classNum:
        inner = {
            "filename": str(i["image_id"]).zfill(12),
            "name": className[i["category_id"]],
            "bndbox": i["bbox"],  # COCO format: [x, y, width, height]
        }
        inputfile.append(inner)
    cnt = cnt + 1
    if cnt % 10000 == 0:
        print("id : " + str(cnt))
writeNum(inputfile)
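Each record written to COCO_train.json keeps the box in COCO's [x, y, width, height] form, whereas VOC annotations store corner coordinates. The conversion is a one-liner, sketched here with a hypothetical helper name (`coco_bbox_to_voc`). Truncating to integers is one possible choice; rounding would be another.

```python
def coco_bbox_to_voc(bbox):
    """Convert COCO [x, y, width, height] to VOC (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    # int() truncates toward zero, which drops the sub-pixel part.
    return int(x), int(y), int(x + w), int(y + h)


print(coco_bbox_to_voc([100.5, 120.0, 50.0, 80.25]))
```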
Next, copy the images that contain the selected categories into a target directory to obtain the training images.
# -*- coding: utf-8 -*-
# @Time : 2018/03/09 10:46
# @Author : SyGoing
# @Site :
# @File : getimagesbyID.py
# @Software: PyCharm
import json
import os
import cv2

with open("COCO_train.json", "r") as f:
    data = json.load(f)
print("read ready")

# Collect the unique image file names referenced by the annotations.
nameStr = set()
for i in data:
    imgName = "COCO_train2014_" + str(i["filename"]) + ".jpg"
    nameStr.add(imgName)
print(len(nameStr))

path = "/home/ouc/data1/liuhongzhi/AttnGAN/dataset/coco2014/images/train2014/"
savePath = "/home/ouc/data1/liuhongzhi/yolo2-pytorch/datasets/COCO/VOC2007/JPEGImages/"
os.makedirs(savePath, exist_ok=True)
count = 0
for file in nameStr:
    print(path + file)
    img = cv2.imread(path + file)
    if img is None:  # cv2.imread returns None when the file cannot be read
        print('skip (unreadable): ' + path + file)
        continue
    cv2.imwrite(savePath + file, img)
    count = count + 1
    print('num: ' + str(count) + ' ' + file)
Then, for each selected image, generate a VOC-style XML file in the Annotations folder.
#-*- coding:utf-8-*-
import xml.dom
import xml.dom.minidom
import os
import cv2
import json
# Conventions for the XML files
_IMAGE_PATH = '/home/ouc/data1/liuhongzhi/yolo2-pytorch/datasets/COCO/VOC2007/JPEGImages/'
_INDENT = ' ' * 4
_NEW_LINE = '\n'
_FOLDER_NODE = 'COCO2014'
_ROOT_NODE = 'annotation'
_DATABASE_NAME = 'LOGODection'
_ANNOTATION = 'COCO2014'
_AUTHOR = 'SyGoing_CSDN'
_SEGMENTED = '0'
_DIFFICULT = '0'
_TRUNCATED = '0'
_POSE = 'Unspecified'
_ANNOTATION_SAVE_PATH = '/home/ouc/data1/liuhongzhi/yolo2-pytorch/datasets/COCO/VOC2007/Annotations/'
# Create an element node that wraps a single text node
def createElementNode(doc, tag, attr):
    element_node = doc.createElement(tag)
    # Create the text node
    text_node = doc.createTextNode(attr)
    # Attach the text node as a child of the element node
    element_node.appendChild(text_node)
    return element_node


# Create an element node and append it to parent_node
def createChildNode(doc, tag, attr, parent_node):
    child_node = createElementNode(doc, tag, attr)
    parent_node.appendChild(child_node)


# The object node is a special case because it nests a bndbox node
def createObjectNode(doc, attrs):
    object_node = doc.createElement('object')
    midname = attrs['name']
    # To collapse every non-person class into 'car', uncomment the two
    # lines below; left commented out, all categories are kept.
    # if midname != 'person':
    #     midname = 'car'
    createChildNode(doc, 'name', midname, object_node)
    createChildNode(doc, 'pose', _POSE, object_node)
    createChildNode(doc, 'truncated', _TRUNCATED, object_node)
    createChildNode(doc, 'difficult', _DIFFICULT, object_node)
    # attrs['bndbox'] is COCO-style [x, y, width, height]
    x, y, w, h = attrs['bndbox']
    bndbox_node = doc.createElement('bndbox')
    createChildNode(doc, 'xmin', str(int(x)), bndbox_node)
    createChildNode(doc, 'ymin', str(int(y)), bndbox_node)
    createChildNode(doc, 'xmax', str(int(x + w)), bndbox_node)
    createChildNode(doc, 'ymax', str(int(y + h)), bndbox_node)
    object_node.appendChild(bndbox_node)
    return object_node
# Write the document to an XML file
def writeXMLFile(doc, filename):
    xml_text = doc.toprettyxml(indent=' ' * 4, newl='\n')
    # Drop the first line (the XML declaration added by default) and any
    # blank lines.
    lines = [line for line in xml_text.split('\n')[1:] if line.strip()]
    with open(filename, 'w') as fout:
        fout.write('\n'.join(lines) + '\n')
if __name__ == "__main__":
    # Read the image list
    img_path = "/home/ouc/data1/liuhongzhi/yolo2-pytorch/datasets/COCO/VOC2007/JPEGImages/"
    fileList = os.listdir(img_path)
    if len(fileList) == 0:
        raise SystemExit("no images found in " + img_path)
    with open("COCO_train.json", "r") as f:
        ann_data = json.load(f)
    # Group the annotations by image name once, so that each image does not
    # require a scan over the full annotation list.
    anns_by_image = {}
    for ann in ann_data:
        imgName = "COCO_train2014_" + str(ann["filename"])
        anns_by_image.setdefault(imgName, []).append(ann)
    if not os.path.exists(_ANNOTATION_SAVE_PATH):
        os.makedirs(_ANNOTATION_SAVE_PATH)
    for imageName in fileList:
        # Use splitext: str.strip(".jpg") strips characters, not a suffix.
        saveName = os.path.splitext(imageName)[0]
        print(saveName)
        xml_file_name = os.path.join(_ANNOTATION_SAVE_PATH, saveName + '.xml')
        img = cv2.imread(os.path.join(img_path, imageName))
        print(os.path.join(img_path, imageName))
        if img is None:  # unreadable or non-image file
            continue
        height, width, channel = img.shape
        print(height, width, channel)
        my_dom = xml.dom.getDOMImplementation()
        doc = my_dom.createDocument(None, _ROOT_NODE, None)
        # Root node
        root_node = doc.documentElement
        # folder node
        createChildNode(doc, 'folder', _FOLDER_NODE, root_node)
        # filename node
        createChildNode(doc, 'filename', saveName + '.jpg', root_node)
        # source node and its children
        source_node = doc.createElement('source')
        createChildNode(doc, 'database', _DATABASE_NAME, source_node)
        createChildNode(doc, 'annotation', _ANNOTATION, source_node)
        createChildNode(doc, 'image', 'flickr', source_node)
        createChildNode(doc, 'flickrid', 'NULL', source_node)
        root_node.appendChild(source_node)
        # owner node and its children
        owner_node = doc.createElement('owner')
        createChildNode(doc, 'flickrid', 'NULL', owner_node)
        createChildNode(doc, 'name', _AUTHOR, owner_node)
        root_node.appendChild(owner_node)
        # size node
        size_node = doc.createElement('size')
        createChildNode(doc, 'width', str(width), size_node)
        createChildNode(doc, 'height', str(height), size_node)
        createChildNode(doc, 'depth', str(channel), size_node)
        root_node.appendChild(size_node)
        # segmented node
        createChildNode(doc, 'segmented', _SEGMENTED, root_node)
        # object nodes: one per annotation that belongs to this image
        for ann in anns_by_image.get(saveName, []):
            object_node = createObjectNode(doc, ann)
            root_node.appendChild(object_node)
        # Write the XML file
        print(xml_file_name)
        writeXMLFile(doc, xml_file_name)
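For comparison, the same kind of annotation can be built with the standard library's xml.etree.ElementTree, which needs less boilerplate than the DOM API used above. This is a reduced sketch, not a drop-in replacement: it writes only the filename, size, and object nodes, and `build_voc_xml` and its arguments are hypothetical names chosen for illustration.

```python
import xml.etree.ElementTree as ET


def build_voc_xml(filename, width, height, depth, objects):
    """Return a reduced VOC-style annotation as an XML string.

    objects is a list of (name, (xmin, ymin, xmax, ymax)) tuples.
    """
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    for tag, value in (("width", width), ("height", height), ("depth", depth)):
        ET.SubElement(size, tag).text = str(value)
    for name, box in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bndbox, tag).text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


xml_text = build_voc_xml("example.jpg", 640, 480, 3, [("dog", (100, 120, 150, 200))])
parsed = ET.fromstring(xml_text)
print(parsed.find("object/name").text, parsed.find("object/bndbox/xmax").text)
```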
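Finally, produce train.txt, which lists the names of all training images. The command below writes each name with its path and extension; both must be removed afterwards so that only the bare image names remain.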
find ./JPEGImages -name '*.jpg' > train.txt
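The path and extension can also be stripped while the list is being written. The sketch below does this in Python; `write_image_stems` is a hypothetical helper, and the directory and output names in the usage comment are placeholders.

```python
import os


def write_image_stems(image_dir, list_path):
    """Write the extension-less names of all .jpg files in image_dir, one per line."""
    stems = sorted(
        os.path.splitext(name)[0]
        for name in os.listdir(image_dir)
        if name.lower().endswith(".jpg")
    )
    with open(list_path, "w") as f:
        for stem in stems:
            f.write(stem + "\n")
    return stems


# Example call (placeholder paths):
# write_image_stems("./JPEGImages", "train.txt")
```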
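3. Cityscapes dataset
Cityscapes, backed mainly by Mercedes-Benz (Daimler), provides an image segmentation dataset for autonomous driving scenes and is used to evaluate how well vision algorithms understand the semantics of urban street scenes. It is also commonly used by image-to-image translation algorithms such as Pix2pix and CycleGAN.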
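Cityscapes contains street scenes from 50 European cities, covering different scenes, backgrounds, and seasons, annotated with 33 object classes (ids 1 to 33, with id 0 reserved for 'unlabeled'): {'unlabeled'=0 , 'ego vehicle'=1 , 'rectification border'=2 , 'out of roi'=3 , 'static'=4 , 'dynamic'=5 , 'ground'=6 , 'road'=7 , 'sidewalk'=8 , 'parking'=9 , 'rail track'=10 , 'building'=11 , 'wall'=12 , 'fence'=13 , 'guard rail'=14 , 'bridge'=15 , 'tunnel'=16 , 'pole'=17 , 'polegroup'=18 , 'traffic light'=19 , 'traffic sign'=20 , 'vegetation'=21 , 'terrain'=22 , 'sky'=23 , 'person'=24 , 'rider'=25 , 'car'=26 , 'truck'=27 , 'bus'=28 , 'caravan'=29 , 'trailer'=30 , 'train'=31 , 'motorcycle'=32 , 'bicycle'=33 }. Of these 33 classes, only 19 are used for evaluation. The 33 classes are therefore mapped to 19 for training, and predictions must be mapped from the 19 classes back to the original 33-class ids before they are uploaded to the evaluation server. Downloading the data requires a registered account.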
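The 33-to-19 mapping and its inverse can be sketched as two lookup tables. The 19 evaluated ids below are the commonly used training set (road, sidewalk, building, wall, fence, pole, traffic light, traffic sign, vegetation, terrain, sky, person, rider, car, truck, bus, train, motorcycle, bicycle). Treat the list, the ignore value 255, and the helper names `encode` and `decode` as assumptions to verify against the official label definitions before submitting results.

```python
IGNORE = 255  # train id assumed for classes that are excluded from evaluation

# Original label ids of the 19 evaluated classes, in train-id order (0..18).
EVAL_IDS = [7, 8, 11, 12, 13, 17, 19, 20, 21, 22, 23,
            24, 25, 26, 27, 28, 31, 32, 33]

# Original id (0..33) -> train id; everything not evaluated maps to IGNORE.
ID_TO_TRAIN_ID = {label_id: IGNORE for label_id in range(34)}
ID_TO_TRAIN_ID.update({label_id: t for t, label_id in enumerate(EVAL_IDS)})

# Train id -> original id, used before uploading predictions.
TRAIN_ID_TO_ID = dict(enumerate(EVAL_IDS))


def encode(label_rows):
    """Map a 2D list of original ids to train ids (for training)."""
    return [[ID_TO_TRAIN_ID[v] for v in row] for row in label_rows]


def decode(pred_rows):
    """Map a 2D list of predicted train ids back to original ids."""
    return [[TRAIN_ID_TO_ID[v] for v in row] for row in pred_rows]


labels = [[7, 26, 0], [33, 24, 4]]
print(encode(labels))
print(decode([[0, 13], [18, 11]]))
```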
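Cityscapes has two evaluation settings, fine and coarse. The former provides 5,000 finely annotated images; the latter provides those 5,000 finely annotated images plus 20,000 coarsely annotated images. Algorithm performance is measured with the PASCAL VOC intersection-over-union (IoU) score. The 5,000 finely annotated images are split into 2,975 training images, 500 validation images, and 1,525 test images. The test annotations are not public, so predictions must be uploaded to the evaluation server to compute the mIoU.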
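For local validation, the IoU score can be computed directly from predicted and ground-truth labels: per class, IoU = TP / (TP + FP + FN), and mIoU is the mean over classes. The sketch below works on flat label sequences and skips pixels whose ground truth equals an ignore value. It is a simplified illustration and not the official evaluation code. The function name `mean_iou` and the choice to average only over classes that occur are assumptions.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that occur in pred or target.

    pred and target are equal-length sequences of class indices;
    every pred value is assumed to lie in [0, num_classes).
    """
    tp = [0] * num_classes  # true positives per class
    fp = [0] * num_classes  # false positives per class
    fn = [0] * num_classes  # false negatives per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # pixels without a valid label do not count
        if p == t:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        if denom > 0:
            ious.append(tp[c] / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Class 0: TP=1, FP=1, FN=0 -> 0.5; class 1: TP=1, FP=0, FN=1 -> 0.5.
print(mean_iou([0, 0, 1, 1], [0, 1, 1, 255], num_classes=2))
```

To score a whole dataset, the TP, FP, and FN counts should be accumulated over all images before dividing, rather than averaging per-image scores.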