Exploring Paddle's Built-in CIFAR-10 Dataset


Author: LabVIEW_Python | Published: 2021-03-13 10:52

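Before building a neural network model, exploring the dataset is an essential step. This includes checking:

  1. The data type and format of the dataset
  2. The mean and standard deviation of the dataset

Example 1: Check the data type, format, and channel order
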
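import paddle
from paddle.vision.datasets import Cifar10

# Load the training split; with backend='cv2' each image is returned as a NumPy array
data_train = Cifar10(mode='train', backend='cv2')
# Each sample is an (image, label) pair
img, label = data_train[0]
# The shape (32, 32, 3) shows a height x width x channel (HWC) layout
print(type(img), img.shape, type(label), label)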

Output:
<class 'numpy.ndarray'> (32, 32, 3) <class 'numpy.ndarray'> 0
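
The shape `(32, 32, 3)` reads as height, width, channel (HWC). As a minimal sketch of how one might inspect a sample a little further, the snippet below summarizes the dtype and value range and converts HWC to CHW. It uses a synthetic array, `fake_img`, in place of a real CIFAR-10 sample so that it runs without downloading anything; `fake_img`, its assumed `float32` dtype and 0..255 range, and the helper `describe` are all hypothetical and not part of Paddle.

```python
import numpy as np

def describe(img):
    """Summarize an image array: dtype, shape, and value range."""
    return {
        "dtype": str(img.dtype),
        "shape": img.shape,
        "min": float(img.min()),
        "max": float(img.max()),
    }

# Synthetic stand-in for one sample (assumption: HWC layout, values in 0..255)
rng = np.random.default_rng(0)
fake_img = rng.integers(0, 256, size=(32, 32, 3)).astype("float32")

info = describe(fake_img)
print(info)

# Move the channel axis to the front: HWC -> CHW
chw = fake_img.transpose(2, 0, 1)
print(chw.shape)
```

With a real dataset, `fake_img` would be replaced by `data_train[0][0]`. Many models expect CHW input, so knowing which layout the dataset returns helps avoid silently feeding transposed data.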

Example 2: Compute the dataset's mean and standard deviation

import numpy as np  # needed for np.array, np.mean, np.std
import paddle
from paddle.vision.datasets import Cifar10

data_train = Cifar10(mode='train', backend='cv2')

# Stack every training image into one array of shape (N, H, W, C)
imgs = np.array([data_train[i][0] for i in range(len(data_train))])
print(imgs.shape)

# Compute the mean and standard deviation of each channel (index 0, 1, 2 = r, g, b)
for idx, name in enumerate("rgb"):
    channel = imgs[:, :, :, idx]
    print(f"{name} mean:{np.mean(channel)}, std:{np.std(channel)}")

Output:
(50000, 32, 32, 3)
r mean:125.30689239501953, std:62.99320983886719
g mean:122.9505386352539, std:62.0887565612793
b mean:113.86553955078125, std:66.70484924316406
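
The same per-channel statistics can also be obtained in one call by reducing over the batch, height, and width axes at once. The following is a minimal sketch on a small synthetic array, `fake_imgs` (a hypothetical stand-in for the stacked `(50000, 32, 32, 3)` array), so the numbers it prints are not the CIFAR-10 values.

```python
import numpy as np

# Synthetic stand-in for the stacked training set (assumption: N x H x W x C)
rng = np.random.default_rng(0)
fake_imgs = rng.integers(0, 256, size=(100, 32, 32, 3)).astype("float64")

# Reduce over batch, height, and width; only the channel axis remains
channel_mean = fake_imgs.mean(axis=(0, 1, 2))
channel_std = fake_imgs.std(axis=(0, 1, 2))
print(channel_mean.shape, channel_std.shape)

# Cross-check against slicing one channel at a time
for c in range(3):
    assert np.isclose(channel_mean[c], fake_imgs[:, :, :, c].mean())
    assert np.isclose(channel_std[c], fake_imgs[:, :, :, c].std())
```

Passing a tuple to `axis` avoids repeating the slicing code for each channel and returns the three values as one array, which is convenient to hand to a normalization step.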

Rounding these values to two decimal places gives the dataset's mean and standard deviation:
mean = [125.31, 122.95, 113.87]
std = [62.99, 62.09, 66.70]
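
These statistics are typically used to standardize each channel before training. The snippet below is a minimal sketch of that step in plain NumPy, assuming HWC images in R, G, B order; the helper `normalize_hwc` and the synthetic `fake_img` are hypothetical and only illustrate the arithmetic.

```python
import numpy as np

# Per-channel statistics rounded from the printed output above (R, G, B order)
mean = np.array([125.31, 122.95, 113.87], dtype="float32")
std = np.array([62.99, 62.09, 66.70], dtype="float32")

def normalize_hwc(img, mean, std):
    """Standardize an HWC image channel by channel via broadcasting."""
    return (img.astype("float32") - mean) / std

# Synthetic image in which every pixel equals the channel mean (illustration only)
fake_img = np.broadcast_to(mean, (32, 32, 3)).copy()
out = normalize_hwc(fake_img, mean, std)
print(out.shape, float(np.abs(out).max()))
```

Because `mean` and `std` have shape `(3,)`, NumPy broadcasts them across the last axis of the `(32, 32, 3)` image. In a Paddle pipeline the same values could presumably be passed to `paddle.vision.transforms.Normalize` with `data_format='HWC'`, though that is not shown in the original article.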
