AlphaZero Paper Collection


Author: 阿甘run | Published: 2017-12-11 15:57

Nature papers

Mastering the game of Go without human knowledge

Nature 550(7676): 354-359 (2017). doi:10.1038/nature24270

Authors: David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel & Demis Hassabis

URL: https://www.nature.com/nature/journal/v550/n7676/full/nature24270.html

Download the PDF to read the full paper.

Mastering the game of Go with deep neural networks and tree search

David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Vedavyas Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis: Nature 529(7587): 484-489 (2016)

Papers

Mastering the Game of Go without Human Knowledge

https://deepmind.com/documents/119/agz_unformatted_nature.pdf

Human-level control through deep reinforcement learning

http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

Playing Atari with Deep Reinforcement Learning

https://www.cs.toronto.edu/%7Evmnih/docs/dqn.pdf

Prioritized experience replay

https://arxiv.org/pdf/1511.05952v2.pdf

Dueling Network Architectures for Deep Reinforcement Learning (Dueling DQN)

https://arxiv.org/pdf/1511.06581v3.pdf

Deep Reinforcement Learning with Double Q-learning

https://arxiv.org/abs/1509.06461

Continuous Deep Q-Learning with Model-based Acceleration (NAF)

https://arxiv.org/pdf/1603.00748v1.pdf

Deterministic Policy Gradient Algorithms

http://jmlr.org/proceedings/papers/v32/silver14.pdf

Continuous control with deep reinforcement learning (DDPG)

https://arxiv.org/pdf/1509.02971v5.pdf

Asynchronous Methods for Deep Reinforcement Learning

https://arxiv.org/abs/1602.01783

Policy distillation

https://arxiv.org/abs/1511.06295

Control of Memory, Active Perception, and Action in Minecraft

https://arxiv.org/pdf/1605.09128v1.pdf

Unifying Count-Based Exploration and Intrinsic Motivation

https://arxiv.org/pdf/1606.01868v2.pdf

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

https://arxiv.org/pdf/1507.00814v3.pdf

Action-Conditional Video Prediction using Deep Networks in Atari Games

https://arxiv.org/pdf/1507.08750v2.pdf

Control of Memory, Active Perception, and Action in Minecraft

https://web.eecs.umich.edu/~baveja/Papers/ICML2016.pdf

PathNet

https://arxiv.org/pdf/1701.08734.pdf

Papers for NLP

Coarse-to-Fine Question Answering for Long Documents

https://homes.cs.washington.edu/~eunsol/papers/acl17eunsol.pdf

A Deep Reinforced Model for Abstractive Summarization

https://arxiv.org/pdf/1705.04304.pdf

Reinforcement Learning for Simultaneous Machine Translation

https://www.umiacs.umd.edu/~jbg/docs/2014_emnlp_simtrans.pdf

Dual Learning for Machine Translation

https://papers.nips.cc/paper/6469-dual-learning-for-machine-translation.pdf

Learning to Win by Reading Manuals in a Monte-Carlo Framework

http://people.csail.mit.edu/regina/my_papers/civ11.pdf

Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning

http://people.csail.mit.edu/regina/my_papers/civ11.pdf

Deep Reinforcement Learning with a Natural Language Action Space

http://www.aclweb.org/anthology/P16-1153

Deep Reinforcement Learning for Dialogue Generation

https://arxiv.org/pdf/1606.01541.pdf

Reinforcement Learning for Mapping Instructions to Actions

http://people.csail.mit.edu/branavan/papers/acl2009.pdf

Language Understanding for Text-based Games using Deep Reinforcement Learning

https://arxiv.org/pdf/1506.08941.pdf

End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning

https://arxiv.org/pdf/1606.01269v1.pdf

End-to-End Reinforcement Learning of Dialogue Agents for Information Access

https://arxiv.org/pdf/1609.00777v1.pdf

Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning

https://arxiv.org/pdf/1702.03274.pdf

Deep Reinforcement Learning for Mention-Ranking Coreference Models

https://arxiv.org/abs/1609.08667

Selected articles

Wikipedia: Reinforcement learning

https://en.wikipedia.org/wiki/Reinforcement_learning

Deep Reinforcement Learning: Pong from Pixels

http://karpathy.github.io/2016/05/31/rl/

CS294: Deep Reinforcement Learning

http://rll.berkeley.edu/deeprlcourse/

Reinforcement Learning Series, Part 1: Markov Decision Processes

http://www.algorithmdog.com/%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0-%E9%A9%AC%E5%B0%94%E7%A7%91%E5%A4%AB%E5%86%B3%E7%AD%96%E8%BF%87%E7%A8%8B

Reinforcement Learning Series, Part 9: Deep Q Network (DQN)

http://www.algorithmdog.com/drl

Reinforcement Learning Series, Part 3: Model-Free Policy Evaluation

http://www.algorithmdog.com/reinforcement-learning-model-free-evalution

[Notes] Reinforcement Learning and MDPs

http://www.cnblogs.com/mo-wang/p/4910855.html

Introduction to Reinforcement Learning, with Implementation Code

http://www.jianshu.com/p/165607eaa4f9

Deep Reinforcement Learning Series (2): Reinforcement Learning

http://blog.csdn.net/ikerpeng/article/details/53031551

Atari demo using a deep Q-network:

Nature paper on the deep Q-network (DQN)

http://www.nature.com/articles/nature14236

Lecture slides (PDF) used in David Silver's videos

https://pan.baidu.com/s/1nvqP7dB

What is reinforcement learning?

http://www.cnblogs.com/geniferology/p/what_is_reinforcement_learning.html

David Silver's paper on deterministic policy gradient (DPG)

http://www.jmlr.org/proceedings/papers/v32/silver14.pdf

Nature paper on AlphaGo

http://www.nature.com/articles/nature16961

AlphaGo resources

http://deepmind.com/research/alphago/

What's the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/

Deep Learning in a Nutshell: Reinforcement Learning

https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/

Bellman equation

https://en.wikipedia.org/wiki/Bellman_equation

Reinforcement learning

https://en.wikipedia.org/wiki/Reinforcement_learning

Mastering the Game of Go without Human Knowledge

https://deepmind.com/documents/119/agz_unformatted_nature.pdf

Reinforcement Learning (RL) for Natural Language Processing (NLP)

https://github.com/adityathakker/awesome-rl-nlp

Video tutorials

Reinforcement learning tutorial (莫烦, Morvan Zhou)

https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/

Reinforcement learning course by David Silver

https://www.bilibili.com/video/av8912293/?from=search&seid=1166472326542614796

CS234: Reinforcement Learning

http://web.stanford.edu/class/cs234/index.html

What is reinforcement learning?

https://www.youtube.com/watch?v=NVWBs7b3oGk

What is Q-learning? (reinforcement learning)

https://www.youtube.com/watch?v=HTZ5xn12AL4

Reinforcement learning, by 莫烦 (Morvan Zhou)

https://morvanzhou.github.io/tutorials/machine-learning/ML-intro/

David Silver's deep reinforcement learning course, Lecture 1: Introduction (Chinese subtitles)

https://www.bilibili.com/video/av9831889/

David Silver's open course video series (YouTube)

https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT

David Silver's open course video series (Bilibili)

http://www.bilibili.com/video/av9831889/?from=search&seid=17387316110198388304

Deep Reinforcement Learning

http://videolectures.net/rldm2015_silver_reinforcement_learning/

Tutorial

Reinforcement Learning for NLP

http://www.umiacs.umd.edu/~jbg/teaching/CSCI_7000/11a.pdf

ICML 2016 Deep Reinforcement Learning tutorial

http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf

DQN tutorial

https://medium.com/@awjuliani/simple-reinforcement-learning-with-tensorflow-part-4-deep-q-networks-and-beyond-8438a3e2b8df#.28wv34w3a

Code

OpenAI Gym

https://github.com/openai/gym

Google DeepMind's deep Q-network (DQN) source code

http://sites.google.com/a/deepmind.com/dqn/

ReinforcementLearningCode

https://github.com/halleanwoo/ReinforcementLearningCode

reinforcement-learning

https://github.com/dennybritz/reinforcement-learning

DQN

https://github.com/devsisters/DQN-tensorflow

DDPG

https://github.com/stevenpjg/ddpg-aigym

A3C (1)

https://github.com/miyosuda/async_deep_reinforce

A3C (2)

https://github.com/openai/universe-starter-agent


Source: https://www.haomeiwen.com/subject/lmgiixtx.html
