On Few-Shot Generation

Author: 雪俏 | Published: 2019-05-15 21:42


-paper1:Matching Networks for One Shot Learning (a Google DeepMind paper)
-paper2:DATA AUGMENTATION GENERATIVE ADVERSARIAL NET
-paper3:MetaGAN: An Adversarial Approach to Few-Shot Learning（NIPS2018）

Matching Networks for One Shot Learning

Here is a link to a summary that I think is quite good.

Abstract

In this work, we employ ideas from metric learning based on deep neural features and from recent advances that augment neural networks with external memories. Our framework learns a network that maps a small labelled support set and an unlabelled example to its label, obviating the need for fine-tuning to adapt to new class types.

Matching Networks classify from only a few samples, so a trained model can classify categories that never appeared during training, without any fine-tuning.

Introduction

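Deep learning: learning is slow and relies on large datasets (many weight updates using stochastic gradient descent). This is mostly due to the parametric aspect of the model, in which training examples need to be slowly learnt by the model into its parameters. The samples are also discarded once they have been used.
Non-parametric model: the authors contrast this with the nearest-neighbour (NN) classifier. For NN, samples are stored exactly as they are input and no training is needed, so it can learn quickly.
The paper aims to combine parametric and non-parametric models.
The paper has two novelties: Matching Nets (which learn a representation for each sample, i.e. encode it) and a new task-based training and testing procedure.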
Main novelty：We propose Matching Nets (MN), a neural network which uses recent advances in attention and memory that enable rapid learning.

Model

Model Architecture

Contribution: one-shot learning within the set-to-set framework
Simplest form of model:
$\hat{y}=\sum_{i=1}^ka(\hat{x},x_i)y_i\tag{1}$

where $x_i,y_i$ come from the support set $S=\{(x_i,y_i)\}_{i=1}^k$, and $a$ can be viewed as an attention kernel.
As a function, the model can be written $prediction=f(support\_set, test\_example)$; as a probability, $P(\hat{y}|\hat{x},S)$. For a given unseen input example $\hat{x}$, our predicted output class would be $\arg \max_yP(y|\hat{x},S)$, where $P$ is a parametric neural network.

Attention Kernel

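Use the softmax over the cosine distance $c$ between the embeddings (that is, a cosine score followed by a softmax gives the attention weights):
$a(\hat{x},x_i)=\frac{e^{c(f(\hat{x}),g(x_i))}}{\sum_{j=1}^ke^{c(f(\hat{x}),g(x_j))}}$

where f and g are the two embedding (encoding) functions, as shown in Figure 1 of the paper.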
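
To make equation (1) and the attention kernel concrete, here is a minimal sketch in plain Python. It assumes the embeddings $f(\hat{x})$ and $g(x_i)$ have already been computed (the 2-D vectors below are made up for illustration) and that labels are integer class indices standing in for one-hot vectors. The helper names (`cosine`, `matching_predict`) are my own, not from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matching_predict(query_emb, support_embs, support_labels, num_classes):
    """y_hat = sum_i a(x_hat, x_i) * y_i, with y_i treated as one-hot."""
    attention = softmax([cosine(query_emb, s) for s in support_embs])
    probs = [0.0] * num_classes
    for weight, label in zip(attention, support_labels):
        probs[label] += weight  # add the attention mass to that sample's class
    return probs

# Toy 3-way one-shot episode with hand-made (hypothetical) embeddings.
support_embs = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
support_labels = [0, 1, 2]
probs = matching_predict([0.9, 0.1], support_embs, support_labels, 3)
predicted = max(range(3), key=lambda c: probs[c])  # arg max_y P(y | x_hat, S)
print(probs, predicted)
```

Because the support set is only read, never fitted, swapping in a support set with brand-new classes needs no weight update, which is the non-parametric half of the model.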

Full Context Embeddings

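(I did not fully understand the internals; that would take a deeper understanding of LSTMs. What follows is a high-level reading.)
Full Context Embeddings f
$f(\hat{x},S)=\mathrm{attLSTM}(f'(\hat{x}),g(S),K)$

where
$\hat{h}_k,c_k=\mathrm{LSTM}(f'(\hat{x}),[h_{k-1},r_{k-1}],c_{k-1})$
$h_k=\hat{h}_k+f'(\hat{x})$
$r_{k-1}=\sum_{i=1}^{|S|}a(h_{k-1},g(x_i))g(x_i)$
$a(h_{k-1},g(x_i))=\mathrm{softmax}(h_{k-1}^Tg(x_i))$

This is easiest to follow alongside the model figure. K is the number of LSTM time steps (in the paper a fixed number of unrolling steps, not tied to the number of images in the support set). Every support image yields one embedding, and at each step the current encoding of the test sample is compared with every one of them, so the test sample's own encoding is refined iteratively K times. The remaining quantities are internal to the LSTM.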
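
The "read" over the support set at each step can be sketched as below. This assumes the read-out has the form $r=\sum_i \mathrm{softmax}_i(h^Tg(x_i))\,g(x_i)$. To stay short, the LSTM cell is replaced by a plain additive update $h_k=f'(\hat{x})+r_{k-1}$, which is NOT what the paper does; it only illustrates the K-step refinement loop. All function names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def read_support(h, support_embs):
    """Content-based read: attention-weighted average of support embeddings."""
    scores = [sum(a * b for a, b in zip(h, g)) for g in support_embs]
    weights = softmax(scores)
    return [sum(w * g[d] for w, g in zip(weights, support_embs))
            for d in range(len(h))]

def refine_query(f_prime, support_embs, num_steps):
    """Refine the query encoding for K steps.

    Stand-in update (assumption): h_k = f'(x_hat) + r_{k-1}.
    The paper uses an LSTM cell with a skip connection instead.
    """
    h = list(f_prime)
    for _ in range(num_steps):
        r = read_support(h, support_embs)
        h = [a + b for a, b in zip(f_prime, r)]
    return h

support_embs = [[1.0, 0.0], [0.0, 1.0]]
r = read_support([2.0, 0.0], support_embs)
h = refine_query([2.0, 0.0], support_embs, num_steps=3)
print(r, h)
```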
Full Context Embeddings g

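On fully conditional embeddings: besides shaping the g network, the support-set samples should also be able to shape the f network that encodes the test sample. Letting the embedding functions take both the support set and the test sample into account removes the variability caused by random selection. To let information flow between the support-set samples, the concrete steps are:

Bidirectional LSTM: learns the embeddings of the training (support) set, so that each training sample's embedding is a function of the other training samples;
Attention LSTM: embeds the test sample, so that each test sample's embedding is a function of the training-set embeddings.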

Training Strategy

a task T as distribution over possible label sets L (the possible combinations of label sets).
$L \sim T$ : a label set L sampled from a task T
$S \sim L, B \sim L$ : use L to sample the support set S and a batch B
One B contains several tasks, and each task has one S and one test example. For one-shot learning, exactly one sample in the support set belongs to the same class as the test example.
The Matching Net is then trained to minimise the error predicting the labels in the batch B conditioned on the support set S. In other words, this is a form of meta-learning: learn to learn from a given support set to minimize a loss over a batch.
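
The sampling scheme above can be sketched as a small episode sampler. This is only an illustration under assumptions: the dataset is a plain dict from class label to a list of examples, and the names `sample_episode`, `n_way`, `k_shot` and `queries_per_class` are mine, not the paper's.

```python
import random

def sample_episode(dataset, n_way, k_shot, queries_per_class, rng):
    """Sample one episode: label set L, support set S and batch B.

    dataset: dict mapping class label -> list of examples (assumed layout).
    """
    label_set = rng.sample(sorted(dataset), n_way)            # L ~ T
    support, batch = [], []
    for label in label_set:
        # Draw disjoint examples of this class for S and B.
        examples = rng.sample(dataset[label], k_shot + queries_per_class)
        support += [(x, label) for x in examples[:k_shot]]    # S ~ L
        batch += [(x, label) for x in examples[k_shot:]]      # B ~ L
    return label_set, support, batch

rng = random.Random(0)
toy = {c: [f"{c}_{i}" for i in range(10)] for c in "abcdefgh"}
labels, support, batch = sample_episode(
    toy, n_way=5, k_shot=1, queries_per_class=2, rng=rng)
print(labels, support, batch)
```

Training then minimises the prediction error on `batch` given `support`, and test episodes are drawn the same way from classes held out during training.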

DAGAN

Introduction

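Figure 1 roughly says: a manifold learnt well on the source domain can be used to enable and effectively improve Matching Networks on a few-shot target domain. A DAGAN can augment the data for matching networks and related models (by generating the most relevant comparison points for each class from the DAGAN); this relates to the concept of tangent distance. The key goal of the DAGAN is to learn distances between manifolds.
Figure 2 roughly introduces data shift: how covariate shift plays out across multiple domains. (For one-shot learning the class distribution changes in an extreme way: the two distributions share no common support. We therefore have to assume that the class-conditional distributions have something in common, so that information can transfer from the source domain to the one-shot target domain and new data can be generated.)
The paper then reviews the idea behind typical data augmentation: transform the data within a class to exploit known invariances. This leads to the DAGAN idea, which is roughly to train a GAN on a different source domain and thereby learn a much larger space of invariances. The trained DAGAN does not depend on the classes themselves; it captures cross-class transformations that move a data point to other points of the same class.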

Contribution

Using a GAN to learn a representation and process for data augmentation.
Generating augmented samples from a single novel data point.
Preserving generalisation on the task even when data is scarce.
Applying DAGAN in the meta-learning space, where it outperforms all previous general meta-learning models.
To our knowledge, this is the first paper to demonstrate state-of-the-art performance on meta-learning via novel data augmentation strategies.

Background

Transfer Learning and Dataset Shift: The term dataset shift (Storkey, 2009) generalises the concept of covariate shift (this part explains covariate shift).
Data Augmentation: Almost all cases of data augmentation are from a priori known invariance (invariances known in advance).
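
For contrast with a learnt augmenter, classic augmentation hard-codes invariances we already know. A minimal sketch, assuming images are small 2-D lists of pixel values and that the label survives a horizontal flip or a one-pixel shift (the function names are illustrative only):

```python
def horizontal_flip(image):
    """Mirror each row; assumes the class label is flip-invariant."""
    return [row[::-1] for row in image]

def shift_right(image, pixels, fill=0):
    """Shift every row right by `pixels`, padding the left with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]

def augment(image):
    """Return the original image plus two hand-crafted variants."""
    return [image, horizontal_flip(image), shift_right(image, 1)]

image = [[1, 2, 3],
         [4, 5, 6]]
variants = augment(image)
print(variants)
```

Each such transformation has to be chosen by hand per domain, which is the limitation a learnt augmentation model is meant to lift.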

Models

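Here g is the generative model, and f is a neural network that takes the representation r and the random z as inputs. (The text and the model figure seem to contradict each other a little here; I would mainly follow the figure and treat g as an encoder and f as a decoder.) Given an arbitrary …, we can …

The main model is as follows (see the architecture figure in the paper):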

Learning

It is important to also give D the original data point; this prevents the GAN from simply autoencoding the current data point.
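
One way to picture this is through the pairs D is shown. The sketch below assumes D receives the source point $x_i$ together with either another real point $x_j$ of the same class or a generated point $x_g$. `toy_generator` is a hypothetical stand-in (it just adds noise), not the DAGAN generator.

```python
import random

def toy_generator(x, z):
    """Hypothetical stand-in for the generator (decoder applied to g(x) and z)."""
    return [v + z for v in x]

def make_discriminator_pairs(class_examples, rng):
    """Build one real pair and one fake pair that share the same source point."""
    x_i, x_j = rng.sample(class_examples, 2)  # two different real points, same class
    z = rng.gauss(0.0, 1.0)                   # random latent input
    x_g = toy_generator(x_i, z)
    real_pair = (x_i, x_j)  # D should label this pair as real
    fake_pair = (x_i, x_g)  # D should label this pair as fake
    return real_pair, fake_pair

rng = random.Random(0)
class_examples = [[0.0, 1.0], [0.2, 0.9], [0.1, 1.1]]
real_pair, fake_pair = make_discriminator_pairs(class_examples, rng)
print(real_pair, fake_pair)
```

Since D sees $x_i$ in both pairs, a generator that merely reproduces its input yields pairs of the form $(x_i, x_i)$, which are easy to tell apart from real pairs of two different same-class points.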

Architecture

G: a combination of a UNet and ResNet (UResNet)

D: a DenseNet discriminator, using layer normalization instead of batch normalization (the latter would break the assumptions of the WGAN objective function).
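
One common reading of that remark is that batch normalization makes D's output for one sample depend on the other samples in the batch, whereas layer normalization works on each sample separately. The sketch below (plain Python, no learnt scale or shift, training-mode statistics) only demonstrates that dependence; it is not the DAGAN implementation.

```python
import math

def layer_norm(sample, eps=1e-5):
    """Normalise one sample across its own features."""
    mean = sum(sample) / len(sample)
    var = sum((v - mean) ** 2 for v in sample) / len(sample)
    return [(v - mean) / math.sqrt(var + eps) for v in sample]

def batch_norm(batch, eps=1e-5):
    """Normalise each feature across all samples in the batch."""
    n, dims = len(batch), len(batch[0])
    means = [sum(s[d] for s in batch) / n for d in range(dims)]
    variances = [sum((s[d] - means[d]) ** 2 for s in batch) / n
                 for d in range(dims)]
    return [[(s[d] - means[d]) / math.sqrt(variances[d] + eps)
             for d in range(dims)] for s in batch]

x = [1.0, 2.0, 3.0]
batch_a = [x, [0.0, 0.0, 0.0]]
batch_b = [x, [10.0, -5.0, 7.0]]  # same x, different batch companion

ln_a = layer_norm(batch_a[0])
ln_b = layer_norm(batch_b[0])
bn_a = batch_norm(batch_a)[0]
bn_b = batch_norm(batch_b)[0]
print(ln_a == ln_b, bn_a == bn_b)
```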

Conclusions

DAGANs improve performance of classifiers even after standard data-augmentation.
The generality of data augmentation across all models and methods means that a DAGAN could be a valuable addition to any low data setting.

MetaGAN

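I could not find any analysis of this paper online. As I understand it, the paper is fairly theoretical and proposes applying GANs to meta-learning. It borrows the meta-learning training procedure, and overall it looks a lot like a semi-supervised GAN.

Core idea

Adversarial training makes the discriminator learn a sharper decision boundary.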

Introduction

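Problem: adapt to new tasks within a few steps and with scarce data.
Solve: meta-learning: train an adaptation strategy on a distribution of similar tasks, trying to extract transferable patterns useful for many tasks.
For an overview of current few-shot learning methods, I recommend reading 当小样本遇上机器学习 fewshot learning.
Many current few-shot learning models consider how to do supervised learning from a few samples, whereas the MetaGAN framework combines supervised and semi-supervised learning: through adversarial learning it uses the fake data generated by G to learn sharper decision boundaries, for both sample-level and task-level.
For intuition about the sharper decision boundary, see the corresponding figure in the paper.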

BACKGROUND

Few-Shot Learning Def

Approach

Increase the dimension of the classifier output from N to N + 1, to model the probability that input data is fake. (An extra output is added to the classifier; this is why I said the idea is similar to semi-supervised GANs.)
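
A minimal sketch of the N + 1 output, in the style of semi-supervised GAN classifiers and not necessarily MetaGAN's exact objective: the last logit is treated as the "fake" class, real labelled samples are pushed toward their true class, and generated samples toward the extra class. The function names and the example logits are assumptions for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def real_loss(logits, true_class):
    """Cross-entropy for a real labelled sample over all N + 1 outputs."""
    return -math.log(softmax(logits)[true_class])

def fake_loss(logits):
    """Cross-entropy pushing a generated sample toward the extra 'fake' output."""
    return -math.log(softmax(logits)[-1])

# 5-way task: indices 0..4 are real classes, index 5 is "fake".
logits = [2.0, 0.0, 0.0, 0.0, 0.0, -1.0]
probs = softmax(logits)
p_fake = probs[-1]

# The real-sample loss splits into "which class, given real" + "is it real".
class_given_real = probs[0] / (1.0 - p_fake)
decomposed = -math.log(class_given_real) - math.log(1.0 - p_fake)
print(p_fake, real_loss(logits, 0), decomposed, fake_loss(logits))
```

The decomposition shows why a single extra output is enough: one softmax handles both classifying the sample and judging whether it is real.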

Basic Algorithm

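Choice of discriminator
In theory the choice is unrestricted; the paper uses:

MAML: representing models based on learning to fine-tune quickly
Relation Networks: representing models based on learning a shared embedding and metric
The appendix at the end of the paper also gives pseudocode based on MAML.
Choice of generator
Conditional generative model
See the paper for the specific choices of G and D; I will not analyse them further here.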

WHY DOES METAGAN WORK?

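Finally, the authors analyse why MetaGAN works. The intuition is that figure; of course the authors are not that casual and prove it with a good deal of mathematics, which I found hard to follow, so I will not attempt to explain it here.

Experiments

Sample-level
Task-level
The results are good on both.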

Special thanks to @ewanlee
