PyTorch Study Notes (2): Building Dataset Classes, Image Preprocessing, and Saving/Loading Models

Author: cuiyr123 | Published 2018-04-16 20:20

    2. Reading data in PyTorch

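    You can read data with NumPy and then convert it to a torch tensor with torch.from_numpy. PyTorch also comes with the torchvision package, which can load common image datasets such as CIFAR10 and MNIST and provides simple transforms for these images.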

    import torchvision.datasets
    from torch.utils.data import Dataset, DataLoader  # these are classes, not modules
    import torchvision.transforms as transforms
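    As a minimal sketch of the NumPy route mentioned above, the snippet below converts an array to a tensor with torch.from_numpy. The array contents are made up for illustration; the point is that the tensor shares memory with the array rather than copying it.

```python
import numpy as np
import torch

# A small array standing in for data loaded with NumPy (values are made up).
arr = np.arange(6, dtype=np.float32).reshape(2, 3)

# from_numpy wraps the existing buffer; no copy is made.
t = torch.from_numpy(arr)

# Because memory is shared, an in-place change to the array shows up in the tensor.
arr[0, 0] = 100.0

print(t.dtype, tuple(t.shape))
print(t[0, 0].item())
```

    If an independent copy is needed, calling t.clone() afterwards is one option.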
    

    Commonly used packages for reading images include OpenCV, scikit-image, and matplotlib.

    import cv2
    import matplotlib.pyplot as plt
    from skimage import io, transform  # io.imread and transform.resize are used below
    

    A commonly used package for reading files is pandas (for parsing CSV files)...

    import pandas as pd
    

    2.1 The dataset class

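    In PyTorch, a dataset is described by the torch.utils.data.Dataset class. To use your own data, subclass it and override two methods, __len__ and __getitem__. Each sample is usually wrapped in a dict. Taking the face landmarks dataset as an example: the constructor __init__ reads the image index (the CSV annotation file), __len__ returns the size of the dataset, and __getitem__ reads one specific image and packs it, together with its landmarks, into a dict.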

    import os

    class FaceLandmarksDataset(Dataset):
        """Face Landmarks dataset."""
    
        def __init__(self, csv_file, root_dir, transform=None):
            """
            Args:
                csv_file (string): Path to the csv file with annotations.
                root_dir (string): Directory with all the images.
                transform (callable, optional): Optional transform to be applied
                    on a sample.
            """
            self.landmarks_frame = pd.read_csv(csv_file)
            self.root_dir = root_dir
            self.transform = transform
    
        def __len__(self):
            return len(self.landmarks_frame)
    
        def __getitem__(self, idx):
            img_name = os.path.join(self.root_dir,
                                    self.landmarks_frame.iloc[idx, 0])
            image = io.imread(img_name)
            # as_matrix() was removed from pandas; to_numpy() is its replacement
            landmarks = self.landmarks_frame.iloc[idx, 1:].to_numpy()
            landmarks = landmarks.astype('float').reshape(-1, 2)
            sample = {'image': image, 'landmarks': landmarks}
    
            if self.transform:
                sample = self.transform(sample)
    
            return sample
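
    To try the __len__/__getitem__ pattern without the face images on disk, here is a minimal sketch with a purely hypothetical in-memory dataset (ToyLandmarksDataset and its synthetic arrays are assumptions for illustration, not part of the original notes). It follows the same structure as the class above.

```python
import numpy as np
from torch.utils.data import Dataset


class ToyLandmarksDataset(Dataset):
    """Hypothetical in-memory dataset; images and landmarks are synthetic."""

    def __init__(self, num_samples=5, num_points=3, transform=None):
        rng = np.random.default_rng(0)
        # Fake 8x8 RGB images and one flat row of (x, y) pairs per sample.
        self.images = rng.random((num_samples, 8, 8, 3))
        self.rows = rng.random((num_samples, num_points * 2))
        self.transform = transform

    def __len__(self):
        # Called by len(dataset).
        return len(self.images)

    def __getitem__(self, idx):
        # Called by dataset[idx]; reshape the flat row into (num_points, 2).
        landmarks = self.rows[idx].astype('float').reshape(-1, 2)
        sample = {'image': self.images[idx], 'landmarks': landmarks}
        if self.transform:
            sample = self.transform(sample)
        return sample


toy = ToyLandmarksDataset()
first = toy[0]
print(len(toy), first['image'].shape, first['landmarks'].shape)
```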
    

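    Once the dataset class is defined, reading data only requires instantiating it. The following instantiates the class and displays the first four images: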

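    # Helper used by the loop below; the original notes call it without
    # defining it, so this is a minimal assumed version.
    def show_landmarks(image, landmarks):
        """Show an image with its landmarks overlaid."""
        plt.imshow(image)
        plt.scatter(landmarks[:, 0], landmarks[:, 1], s=10, marker='.', c='r')

    face_dataset = FaceLandmarksDataset(csv_file='faces/face_landmarks.csv',
                                        root_dir='faces/')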
    
    fig = plt.figure()
    
    for i in range(len(face_dataset)):
        sample = face_dataset[i]
    
        print(i, sample['image'].shape, sample['landmarks'].shape)
    
        ax = plt.subplot(1, 4, i + 1)
        plt.tight_layout()
        ax.set_title('Sample #{}'.format(i))
        ax.axis('off')
        show_landmarks(**sample)
    
        if i == 3:
            plt.show()
            break
    

    2.2 Transforming image data

    Resizing an image mainly relies on the transform module of the skimage package:

    img = transform.resize(old_image, (new_h, new_w))
    

    Here we can use transform classes that others have already written:

    import numpy as np
    import torch

    class Rescale(object):
        """Rescale the image in a sample to a given size.
    
        Args:
            output_size (tuple or int): Desired output size. If tuple, output is
                matched to output_size. If int, smaller of image edges is matched
                to output_size keeping aspect ratio the same.
        """
    
        def __init__(self, output_size):
            assert isinstance(output_size, (int, tuple))
            self.output_size = output_size
    
        def __call__(self, sample):
            image, landmarks = sample['image'], sample['landmarks']
    
            h, w = image.shape[:2]
            if isinstance(self.output_size, int):
                if h > w:
                    new_h, new_w = self.output_size * h / w, self.output_size
                else:
                    new_h, new_w = self.output_size, self.output_size * w / h
            else:
                new_h, new_w = self.output_size
    
            new_h, new_w = int(new_h), int(new_w)
    
            img = transform.resize(image, (new_h, new_w))
    
            # h and w are swapped for landmarks because for images,
            # x and y axes are axis 1 and 0 respectively
            landmarks = landmarks * [new_w / w, new_h / h]
    
            return {'image': img, 'landmarks': landmarks}
    
    
    class RandomCrop(object):
        """Crop randomly the image in a sample.
    
        Args:
            output_size (tuple or int): Desired output size. If int, square crop
                is made.
        """
    
        def __init__(self, output_size):
            assert isinstance(output_size, (int, tuple))
            if isinstance(output_size, int):
                self.output_size = (output_size, output_size)
            else:
                assert len(output_size) == 2
                self.output_size = output_size
    
        def __call__(self, sample):
            image, landmarks = sample['image'], sample['landmarks']
    
            h, w = image.shape[:2]
            new_h, new_w = self.output_size
    
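            # + 1 because randint's upper bound is exclusive; this also
            # avoids a ValueError when the image is already the crop size
            top = np.random.randint(0, h - new_h + 1)
            left = np.random.randint(0, w - new_w + 1)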
    
            image = image[top: top + new_h,
                          left: left + new_w]
    
            landmarks = landmarks - [left, top]
    
            return {'image': image, 'landmarks': landmarks}
    
    
    class ToTensor(object):
        """Convert ndarrays in sample to Tensors."""
    
        def __call__(self, sample):
            image, landmarks = sample['image'], sample['landmarks']
    
            # swap color axis because
            # numpy image: H x W x C
            # torch image: C x H x W
            image = image.transpose((2, 0, 1))
            return {'image': torch.from_numpy(image),
                    'landmarks': torch.from_numpy(landmarks)}
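
    The arithmetic inside the three classes above can be checked by hand on a synthetic sample. This is only a sketch: the image size, landmark coordinates, and the fixed crop offsets (used instead of random ones so the result is reproducible) are all made-up values.

```python
import numpy as np

# Synthetic sample: a 100 x 200 (H x W) RGB image and two (x, y) landmarks.
image = np.zeros((100, 200, 3), dtype=np.float64)
landmarks = np.array([[50.0, 20.0], [150.0, 80.0]])
h, w = image.shape[:2]

# Rescale with an int: the shorter edge becomes output_size, aspect ratio is kept.
output_size = 50
if h > w:
    new_h, new_w = output_size * h / w, output_size
else:
    new_h, new_w = output_size, output_size * w / h
new_h, new_w = int(new_h), int(new_w)

# Landmarks are (x, y), so x scales with the width and y with the height.
scaled = landmarks * [new_w / w, new_h / h]

# RandomCrop: moving the crop origin to (left, top) shifts landmarks by the same amount.
top, left = 5, 10  # fixed offsets instead of random ones, for reproducibility
cropped = scaled - [left, top]

# ToTensor: H x W x C -> C x H x W.
chw = image.transpose((2, 0, 1))

print((new_h, new_w), scaled.tolist(), cropped.tolist(), chw.shape)
```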
    

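    Changing the layout of the image data mainly relies on NumPy's transpose. Note that NumPy stores images as H x W x C, whereas torch stores them as C x H x W. In addition, transforms.Compose from torchvision can chain a series of transforms, so the transformed dataset can be instantiated as: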

    transformed_dataset = FaceLandmarksDataset(csv_file='faces/face_landmarks.csv',
                                               root_dir='faces/',
                                               transform=transforms.Compose([
                                                   Rescale(256),
                                                   RandomCrop(224),
                                                   ToTensor()
                                               ]))
    
    for i in range(len(transformed_dataset)):
        sample = transformed_dataset[i]
    
        print(i, sample['image'].size(), sample['landmarks'].size())
    
        if i == 3:
            break
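
    Conceptually, Compose just calls each transform in turn and feeds the output of one into the next. The sketch below mimics that with a hypothetical apply_in_order helper and two made-up transforms (center_crop_4 and to_chw are illustrative names, not torchvision APIs), so the chaining can be seen on a dict sample without any image files.

```python
import numpy as np


def apply_in_order(transforms_list, sample):
    """Hypothetical stand-in for Compose: feed each output into the next callable."""
    for t in transforms_list:
        sample = t(sample)
    return sample


def center_crop_4(sample):
    # Keep the central 4x4 window of an H x W x C image (illustrative transform).
    img = sample['image']
    top = (img.shape[0] - 4) // 2
    left = (img.shape[1] - 4) // 2
    return {'image': img[top:top + 4, left:left + 4]}


def to_chw(sample):
    # H x W x C -> C x H x W, as in the ToTensor class above.
    return {'image': sample['image'].transpose((2, 0, 1))}


sample = {'image': np.zeros((8, 6, 3))}
out = apply_in_order([center_crop_4, to_chw], sample)
print(out['image'].shape)
```

    The order matters: the crop assumes an H x W x C layout, so it has to run before the axis swap, just as Rescale and RandomCrop come before ToTensor above.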
    

    Of course, torchvision also provides ready-made transforms, as follows:

    import torch
    from torchvision import transforms, datasets
    
    data_transform = transforms.Compose([
            transforms.RandomResizedCrop(224),  # RandomSizedCrop is the old, deprecated name
            transforms.RandomHorizontalFlip(),
            transforms.ToTensor(),
            transforms.Normalize(mean=[0.5, 0.5, 0.5],
                                 std=[0.5, 0.5, 0.5])
        ])
    hymenoptera_dataset = datasets.ImageFolder(root='hymenoptera_data/train',
                                               transform=data_transform)
    dataset_loader = torch.utils.data.DataLoader(hymenoptera_dataset,
                                                 batch_size=4, shuffle=True,
                                                 num_workers=4)
    
    

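    This style of transform pipeline is the most common one.
    transforms.Compose chains the individual transforms together.
    transforms.RandomResizedCrop(224) (named RandomSizedCrop in older torchvision releases) crops a random region of the image and resizes it to 224 x 224.
    Most important of all, transforms.ToTensor() converts a PIL image or a uint8 numpy.ndarray of shape [H, W, C] with values in [0, 255] into a torch.FloatTensor of shape [C, H, W] with values in [0, 1.0].
    When restoring an image for display, clamp the values back into range with torch.clamp(x, 0, 1).
    Finally, transforms.Normalize is the most common normalization step; with a mean of 0.5 and a std of 0.5 per channel, it maps the image from [0, 1] to [-1, 1].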
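
    The value ranges described above can be verified with plain tensor arithmetic. This sketch reproduces what ToTensor and Normalize do numerically on a made-up one-pixel image, without calling torchvision itself, and then undoes the normalization with a clamp.

```python
import torch

# A fake uint8 image in H x W x C layout holding the extreme and a middle value.
pixels = torch.tensor([[[0, 128, 255]]], dtype=torch.uint8)  # shape (1, 1, 3)

# What ToTensor does numerically: move channels first and scale to [0, 1].
x = pixels.permute(2, 0, 1).float() / 255.0

# Normalize with mean=0.5, std=0.5 per channel: (x - mean) / std.
mean = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(3, 1, 1)
normalized = (x - mean) / std

# Undo the normalization and clamp to guard against values just outside [0, 1].
restored = torch.clamp(normalized * std + mean, 0, 1)

print(normalized.flatten().tolist())
print(torch.allclose(restored, x))
```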

    2.3 Loading data

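    We could write a loop that fetches one sample from the dataset at a time and feeds it to the network, but that ignores batching, shuffling, and multiprocessing, all of which are standard in deep learning. PyTorch's DataLoader takes care of these. Import it with: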

    from torch.utils.data import DataLoader
    

    DataLoader takes a Dataset instance and lets you set the batch size, whether to shuffle, and the number of worker processes:

    dataloader = DataLoader(transformed_dataset, batch_size=4, shuffle=True, num_workers=4)
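
    To see what a DataLoader yields, the sketch below iterates over a small synthetic TensorDataset (the tensors are placeholders, not the face data). It uses num_workers=0 so that loading happens in the main process, which is an assumption made here only to keep the example portable.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Ten fake samples: 3 x 8 x 8 "images" with an integer label each.
images = torch.zeros(10, 3, 8, 8)
labels = torch.arange(10)
dataset = TensorDataset(images, labels)

# num_workers=0 loads in the main process, which keeps this sketch portable.
loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)

batch_shapes = []
seen = []
for batch_images, batch_labels in loader:
    batch_shapes.append(tuple(batch_images.shape))
    seen.extend(batch_labels.tolist())

print(batch_shapes)
print(sorted(seen))
```

    The last batch is smaller because 10 is not a multiple of 4; passing drop_last=True would discard it instead.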
    

    2.4 Saving and loading models

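    A trained model can be saved either as its parameters only or as the whole model. Let model be an instance of some model; its parameters can be saved with: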

    torch.save(model.state_dict(), './pretrained/model.pth')
    

    To load them back:

    model.load_state_dict(torch.load('./pretrained/model.pth'))
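
    A full save/load round trip might look like the sketch below. The nn.Linear model and the temporary directory are stand-ins chosen for illustration; the important point is that load_state_dict needs an already constructed model of the same architecture to receive the parameters.

```python
import os
import tempfile

import torch
import torch.nn as nn

# A tiny stand-in model; any nn.Module is handled the same way.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'model.pth')

    # Save only the parameters.
    torch.save(model.state_dict(), path)

    # Build a fresh model of the same architecture and load the parameters into it.
    restored = nn.Linear(4, 2)
    restored.load_state_dict(torch.load(path))

restored.eval()  # switch to inference mode before evaluating
same = all(torch.equal(a, b) for a, b in
           zip(model.state_dict().values(), restored.state_dict().values()))
print(same)
```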
    

        Article link: https://www.haomeiwen.com/subject/hmxykftx.html