A Formal Solution to the Grain o

Author: 朱小虎XiaohuZhu | Published: 2018-12-12 00:48

Jan Leike, Jessica Taylor, Benya Fallenstein

Abstract

A Bayesian agent acting in a multi-agent environment learns to predict the other agents’ policies if its prior assigns positive probability to them (in other words, its prior contains a grain of truth). Finding a reasonably large class of policies that contains the Bayes-optimal policies with respect to this class is known as the grain of truth problem. Only small classes are known to have a grain of truth and the literature contains several related impossibility results. In this paper we present a formal and general solution to the full grain of truth problem: we construct a class of policies that contains all computable policies as well as Bayes-optimal policies for every lower semicomputable prior over the class. When the environment is unknown, Bayes-optimal agents may fail to act optimally even asymptotically. However, agents based on Thompson sampling converge to play ε-Nash equilibria in arbitrary unknown computable multi-agent environments. While these results are purely theoretical, we show that they can be computationally approximated arbitrarily closely.
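To make the "grain of truth" idea concrete, here is a toy Python sketch. It is not the construction from the paper, which covers all computable policies. It assumes a finite class of three hypothetical opponent policies, each a fixed probability of playing action 1, and a hypothetical payoff in which the agent scores when it matches the opponent's action. Because the true policy is in the class and receives positive prior weight, the posterior concentrates on it, and a Thompson-sampling agent (sample one hypothesis from the posterior, then best-respond to it) ends up best-responding to the truth.

```python
import random


def bayes_update(posterior, hypotheses, observed_action):
    """Apply one step of Bayes' rule over a finite class of opponent policies.

    Each hypothesis p is the probability that the opponent plays action 1.
    """
    weights = [
        w * (p if observed_action == 1 else 1.0 - p)
        for w, p in zip(posterior, hypotheses)
    ]
    total = sum(weights)
    return [w / total for w in weights]


def thompson_best_response(posterior, hypotheses, rng):
    """Sample one hypothesis from the posterior and best-respond to it.

    Toy payoff (an assumption): the agent scores 1 when it matches the
    opponent's action, so it plays the action the sampled policy favors.
    """
    sampled_p = rng.choices(hypotheses, weights=posterior, k=1)[0]
    return 1 if sampled_p >= 0.5 else 0


def run(true_p=0.8, hypotheses=(0.2, 0.5, 0.8), steps=500, seed=0):
    """Simulate the agent against an opponent whose true policy is in the class."""
    rng = random.Random(seed)
    posterior = [1.0 / len(hypotheses)] * len(hypotheses)  # uniform prior
    matches = 0
    for _ in range(steps):
        my_action = thompson_best_response(posterior, hypotheses, rng)
        opp_action = 1 if rng.random() < true_p else 0
        matches += int(my_action == opp_action)
        posterior = bayes_update(posterior, hypotheses, opp_action)
    return posterior, matches / steps


posterior, match_rate = run()
print("posterior over hypotheses:", posterior)
print("fraction of rounds matched:", match_rate)
```

If the true policy were left out of `hypotheses`, the prior would contain no grain of truth and the posterior could only settle on the closest wrong hypothesis. In this toy setting the opponent is fixed; in the setting the abstract describes, the other agents are themselves learners, which is what makes finding a suitable class hard.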


Article link: https://www.haomeiwen.com/subject/wenyhqtx.html