Higher Library

Author: 四碗饭儿 | Published 2019-12-19 02:19

Higher is an open-source meta-learning framework from FAIR, aimed mainly at gradient-based meta-learning. Gradient-based meta-learning algorithms typically involve a two-level optimization (bi-level optimization / nested optimization). Taking gradient-based hyper-parameter optimization as an example:

  • The first (lower) level, the inner loop, is training: with the hyper-parameters \varphi fixed, optimize the model parameters \theta.
  • The second (upper) level, the outer loop, is meta-training: optimize the hyper-parameters \varphi (the nested problem is written out right after this list).
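
Written out as a nested problem (a rough sketch only; the loss symbols L^{train} and L^{val} are my own shorthand, not notation taken from the Higher paper):

\theta^{*}(\varphi) = \arg\min_{\theta} L^{train}(\theta, \varphi), \qquad \varphi^{*} = \arg\min_{\varphi} L^{val}\big(\theta^{*}(\varphi)\big)

The inner \arg\min is ordinary training under a fixed \varphi (the inner loop); the outer \arg\min over \varphi is meta-training (the outer loop).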

Each model-update step carries out both of the optimization sub-steps above. The two levels may look unremarkable on their own, no different from ordinary model training, and indeed they are; what determines how well the final algorithm works is how the two levels are cascaded. The inner loop has to "prepare" something for the outer loop to use. The paper accompanying the Higher library summarizes this family of inner-loop/outer-loop cascaded algorithms.
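
In the unrolled (backprop-through-training) setting that Higher targets, what the inner loop "prepares" is a differentiable record of its own updates. As a minimal sketch, assuming a single SGD inner step with learning rate \alpha and that \varphi enters only through the training loss:

\theta_1 = \theta_0 - \alpha \nabla_{\theta} L^{train}(\theta_0, \varphi), \qquad \frac{d L^{val}(\theta_1)}{d \varphi} = \left(\frac{\partial \theta_1}{\partial \varphi}\right)^{\top} \nabla_{\theta} L^{val}(\theta_1)

so the outer loop obtains its hypergradient by back-propagating the validation gradient through the inner update itself.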

Figure 1: The algorithmic framework of the Higher library

As shown in Figure 1, the inputs are the model parameters \theta_t and the meta-parameters \varphi_{\tau}; I is the number of meta-parameter updates and J is the number of inner-loop unrolled steps (the number of steps looking ahead). Take I = J = 0; then:

Lines 2-6 of the algorithm describe the virtual update (a small PyTorch sketch of these steps follows the walkthrough below):

Line 2: obtain the current hyper-parameters \varphi^{opt}_0 and \varphi^{loss}_0.
Line 3: make copies to get the virtual model \theta_0' = \theta_t and the virtual optimizer opt'_0 = opt_t.
Line 4: the inner loop.
Line 5: compute the virtual gradient G_0 = \nabla_{\theta_0'} l_{t+0}^{train}(\theta_0', \varphi_0^{loss}), keeping the state of the gradient graph (the gradients are not cleared with zero_grad).
Line 6: the virtual update \theta_{1}' = opt'_0(\theta_0', \varphi_0^{opt}, G_0).
Line 8: initialize A_0.
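
To make the "virtual" copy, gradient, and update concrete, here is a minimal PyTorch sketch of one such step written by hand (this is not the Higher implementation itself; the tiny linear model, the weight-decay hyper-parameter wd, and the fixed learning rate lr are illustrative placeholders):

import torch

model = torch.nn.Linear(3, 1)
xs, ys = torch.randn(8, 3), torch.randn(8, 1)
wd = torch.tensor(0.01, requires_grad=True)   # hyper-parameter phi
lr = 0.1                                      # inner-loop learning rate

# Line 3: virtual copies of the parameters, so the real model stays untouched
theta = [p.detach().clone().requires_grad_(True) for p in model.parameters()]

# Line 5: virtual gradient; create_graph=True keeps the gradient graph alive
# so the outer loop can later differentiate through this step w.r.t. wd
pred = torch.nn.functional.linear(xs, theta[0], theta[1])
loss = ((pred - ys) ** 2).mean() + wd * sum((p ** 2).sum() for p in theta)
grads = torch.autograd.grad(loss, theta, create_graph=True)

# Line 6: virtual update; an out-of-place operation, so theta_next keeps a
# grad_fn and can itself be back-propagated through, all the way to wd
theta_next = [p - lr * g for p, g in zip(theta, grads)]

A validation loss built from theta_next can then be differentiated with respect to wd, which is exactly the kind of "gradient of gradients" the outer loop needs; this is the bookkeeping that Higher automates.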

Let's look at how the library is used; a sketch of a full meta-training step follows the example.

import torch
import higher

model = MyModel()
opt = torch.optim.Adam(model.parameters())

# When you want to branch from the current state of your model and unroll
# optimization, follow this example. This context manager gets a snapshot of the
# current version of the model and optimizer at the point where you want to
# start unrolling, and creates a functional version `fmodel` which executes the
# forward pass of `model` with implicit fast weights which can be read by doing
# `fmodel.parameters()`, and a differentiable optimizer `diffopt` which ensures
# that at each step, gradient of `fmodel.parameters()` with regard to initial
# fast weights `fmodel.parameters(time=0)` (or any other part of the unrolled
# model history) is defined.

with higher.innerloop_ctx(model, opt) as (fmodel, diffopt):
    for xs, ys in data:
        logits = fmodel(xs)  # modified `params` can also be passed as a kwarg
        loss = loss_function(logits, ys)  # no need to call loss.backward()
        diffopt.step(loss)  # note that `step` must take `loss` as an argument!
        # The line above gets P[t+1] from P[t] and loss[t]. `step` also returns
        # these new parameters, as an alternative to getting them from
        # `fmodel.fast_params` or `fmodel.parameters()` after calling
        # `diffopt.step`.

        # At this point, or at any point in the iteration, you can take the
        # gradient of `fmodel.parameters()` (or equivalently
        # `fmodel.fast_params`) w.r.t. `fmodel.parameters(time=0)` (equivalently
        # `fmodel.init_fast_params`). i.e. `fast_params` will always have
        # `grad_fn` as an attribute, and be part of the gradient tape.

    # At the end of your inner loop you can obtain these e.g. ...
    grad_of_grads = torch.autograd.grad(
        meta_loss_fn(fmodel.parameters()), fmodel.parameters(time=0))
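
The gradient-of-gradients computed above is what the outer loop consumes. As a hedged sketch of what one complete meta-training step might look like in the MAML style (here the meta-parameters are the model's initial weights; meta_opt, meta_batches and the support/query split are illustrative placeholders, not names prescribed by the library):

meta_opt = torch.optim.SGD(model.parameters(), lr=1e-3)

for support_batch, (xq, yq) in meta_batches:   # placeholder iterable of tasks
    meta_opt.zero_grad()
    # copy_initial_weights=False keeps fmodel's initial fast weights tied to
    # model.parameters(), so gradients flow back to the real model
    with higher.innerloop_ctx(model, opt,
                              copy_initial_weights=False) as (fmodel, diffopt):
        for xs, ys in support_batch:           # inner loop: training
            diffopt.step(loss_function(fmodel(xs), ys))
        meta_loss = loss_function(fmodel(xq), yq)   # outer loop: meta-training
        meta_loss.backward()   # back-propagates through the unrolled inner loop
    meta_opt.step()            # one update of the meta-parameters

The same pattern carries over to hyper-parameter optimization: whatever tensor the meta-loss depends on through the unrolled inner loop will receive a gradient.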
