VALUE PROPAGATION NETWORKS

Post author: 朱小虎XiaohuZhu | Published: 2017-10-31 16:59

Anonymous authors
Paper under double-blind review
ABSTRACT
We present Value Propagation (VProp), a parameter-efficient differentiable planning
module built on Value Iteration. It can be trained with reinforcement learning
to solve unseen tasks, can generalize to larger map sizes, and can learn to navigate
in dynamic environments. We evaluate on configurations of MazeBase grid-worlds,
with randomly generated environments of several different sizes. Furthermore, we
show that the module and its variants provide a simple way to learn to plan when
adversarial agents are present and the environment is stochastic, providing a
cost-efficient learning system to build low-level size-invariant planners for a
variety of interactive navigation problems.
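
For readers unfamiliar with Value Iteration, the algorithm the module builds on, the following is a minimal sketch of the classic tabular version on a toy grid-world. It is not the VProp module and is not taken from the paper: the grid layout, the reward values, the discount factor, and the helper names (`step`, `backup`, `value_iteration`, `greedy_path`) are all assumptions chosen for illustration.

```python
# Classic tabular value iteration on a toy grid-world (NOT the VProp module).
# Grid layout, rewards, and discount are illustrative assumptions.

MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(grid, pos, move):
    """Return the next cell; moving into a wall or off the map leaves the agent in place."""
    r, c = pos[0] + move[0], pos[1] + move[1]
    if 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#":
        return (r, c)
    return pos


def backup(grid, values, pos, goal, gamma, step_reward, goal_reward):
    """One Bellman backup at pos: return (best value, best next cell)."""
    best_value, best_next = float("-inf"), pos
    for move in MOVES:
        nxt = step(grid, pos, move)
        reward = goal_reward if nxt == goal else step_reward
        value = reward + gamma * values[nxt[0]][nxt[1]]
        if value > best_value:
            best_value, best_next = value, nxt
    return best_value, best_next


def value_iteration(grid, goal, gamma=0.9, step_reward=-1.0, goal_reward=10.0, iters=50):
    """Repeat Bellman backups over every free cell for a fixed number of sweeps."""
    values = [[0.0] * len(grid[0]) for _ in grid]
    for _ in range(iters):
        new_values = [[0.0] * len(grid[0]) for _ in grid]
        for r, row in enumerate(grid):
            for c, cell in enumerate(row):
                if cell == "#" or (r, c) == goal:
                    continue  # walls and the terminal goal keep value 0
                new_values[r][c] = backup(
                    grid, values, (r, c), goal, gamma, step_reward, goal_reward
                )[0]
        values = new_values
    return values


def greedy_path(grid, values, start, goal, gamma=0.9, step_reward=-1.0,
                goal_reward=10.0, max_steps=100):
    """Follow the greedy policy implied by the value map from start toward goal."""
    path = [start]
    while path[-1] != goal and len(path) <= max_steps:
        path.append(
            backup(grid, values, path[-1], goal, gamma, step_reward, goal_reward)[1]
        )
    return path


GRID = [
    "....",
    ".##.",  # '#' marks a wall
    "....",
]

if __name__ == "__main__":
    values = value_iteration(GRID, goal=(0, 3))
    print(greedy_path(GRID, values, start=(2, 0), goal=(0, 3)))
```

The sketch uses a fixed, hand-written transition model and reward table. A differentiable planning module such as the one the abstract describes would instead learn these quantities from data, which this example does not attempt.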

Article link: https://www.haomeiwen.com/subject/yxgbpxtx.html