
Generalization and Regularization in DQN

Author: 朱小虎 XiaohuZhu | Published 2018-10-09 18:03

Jesse Farebrother, Marlos C. Machado, Michael Bowling
University of Alberta

ABSTRACT

Deep reinforcement learning (RL) algorithms have shown an impressive ability to learn complex control policies in high-dimensional environments.

However, despite the ever-increasing performance on popular benchmarks like the Arcade Learning Environment (ALE), policies learned by deep RL algorithms can struggle to generalize when evaluated in remarkably similar environments.

These results are unexpected given the fact that, in supervised learning, deep neural networks often learn robust features that generalize across tasks.

In this paper, we study the generalization capabilities of DQN in order to aid in understanding this mismatch between generalization in deep RL and supervised learning methods.

We provide evidence suggesting that DQN overspecializes to the domain it is trained on.

We then comprehensively evaluate the impact of traditional methods of regularization from supervised learning, l_2 and dropout, and of reusing learned representations to improve the generalization capabilities of DQN.
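For readers who want a concrete picture, here is a minimal PyTorch sketch (not the paper's code) of how dropout and an l_2 penalty (weight decay) can be attached to a DQN-style Q-network; the convolutional torso follows the standard DQN architecture, while the dropout rate and weight-decay coefficient below are illustrative assumptions.

```python
# Minimal sketch (not the paper's implementation): dropout and L2 regularization
# on a DQN-style Q-network in PyTorch. Hyperparameter values are illustrative.
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    def __init__(self, num_actions: int, dropout_p: float = 0.1):
        super().__init__()
        # Standard DQN convolutional torso for 4 stacked 84x84 Atari frames.
        self.features = nn.Sequential(
            nn.Conv2d(4, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Dropout(p=dropout_p),              # dropout regularization
            nn.Linear(512, num_actions),
        )

    def forward(self, x):
        return self.head(self.features(x / 255.0))

q_net = QNetwork(num_actions=18)
# weight_decay adds an L2 penalty on the parameters to the usual TD loss.
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4, weight_decay=1e-4)
```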

We perform this study using different game modes of Atari 2600 games, a recently introduced modification for the ALE which supports slight variations of the Atari 2600 games used for benchmarking in the field.
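As an assumed illustration of these game-mode variants (based on the ale-py bindings, not on the paper's code), the modes and difficulties of a game can be enumerated and selected as follows; the ROM path is a placeholder, and method names may differ slightly between ALE versions.

```python
# Minimal sketch, assuming the ale-py package: listing and selecting the
# alternative game modes/difficulties of an Atari 2600 game.
from ale_py import ALEInterface

ale = ALEInterface()
ale.loadROM("freeway.bin")  # placeholder ROM path

modes = ale.getAvailableModes()
difficulties = ale.getAvailableDifficulties()
print("modes:", modes, "difficulties:", difficulties)

# Train on the default flavour, then evaluate (or fine-tune) on a variant.
ale.setMode(modes[-1])
ale.setDifficulty(difficulties[0])
ale.reset_game()
```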

Despite regularization being largely underutilized in deep RL, we show that it can, in fact, help DQN learn more general features.

These features can then be reused and fine-tuned on similar tasks, considerably improving the sample efficiency of DQN.
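One plausible way to realize such reuse, sketched here under assumptions rather than as the paper's exact procedure, is to freeze the convolutional features trained on the default flavour and fine-tune only the fully connected head on the new variant (using the hypothetical QNetwork from the earlier sketch and a placeholder checkpoint path).

```python
# Sketch only: reuse a pretrained torso and fine-tune the head on a new game mode.
import torch

pretrained = QNetwork(num_actions=18)                           # from the sketch above
pretrained.load_state_dict(torch.load("dqn_default_mode.pt"))   # placeholder checkpoint

for p in pretrained.features.parameters():
    p.requires_grad = False        # reuse the learned representation as-is

# Only the fully connected head is updated on the target game mode; fine-tuning
# this small part of the network is one way the learned features can be reused.
optimizer = torch.optim.Adam(pretrained.head.parameters(), lr=1e-4)
```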
