AlphaZero Paper Collection


Author: 阿甘run | Published: 2017-12-11 15:57 | Views: 65

     

    Nature papers

    Mastering the game of Go without human knowledge

    Nature 550, 7676 (2017). doi:10.1038/nature24270

    Authors: David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel & Demis Hassabis

    URL: https://www.nature.com/nature/journal/v550/n7676/full/nature24270.html

    Download the PDF to read the full paper.

    Mastering the game of Go with deep neural networks and tree search

    David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Vedavyas Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis: Nature 529(7587): 484-489 (2016)

    Papers

    Mastering the Game of Go without Human Knowledge

    https://deepmind.com/documents/119/agz_unformatted_nature.pdf

    Human-level control through deep reinforcement learning

    http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html

    Playing Atari with Deep Reinforcement Learning

    https://www.cs.toronto.edu/%7Evmnih/docs/dqn.pdf

    Prioritized experience replay

    https://arxiv.org/pdf/1511.05952v2.pdf

    Dueling DQN

    https://arxiv.org/pdf/1511.06581v3.pdf

    Deep Reinforcement Learning with Double Q-learning

    https://arxiv.org/abs/1509.06461

    Deep Q learning with NAF

    https://arxiv.org/pdf/1603.00748v1.pdf

    Deterministic policy gradient

    http://jmlr.org/proceedings/papers/v32/silver14.pdf

    Continuous control with deep reinforcement learning (DDPG)

    https://arxiv.org/pdf/1509.02971v5.pdf

    Asynchronous Methods for Deep Reinforcement Learning

    https://arxiv.org/abs/1602.01783

    Policy distillation

    https://arxiv.org/abs/1511.06295

    Control of Memory, Active Perception, and Action in Minecraft

    https://arxiv.org/pdf/1605.09128v1.pdf

    Unifying Count-Based Exploration and Intrinsic Motivation

    https://arxiv.org/pdf/1606.01868v2.pdf

    Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models

    https://arxiv.org/pdf/1507.00814v3.pdf

    Action-Conditional Video Prediction using Deep Networks in Atari Games

    https://arxiv.org/pdf/1507.08750v2.pdf

    Control of Memory, Active Perception, and Action in Minecraft

    https://web.eecs.umich.edu/~baveja/Papers/ICML2016.pdf

    PathNet

    https://arxiv.org/pdf/1701.08734.pdf

    Papers for NLP

    Coarse-to-Fine Question Answering for Long Documents

    https://homes.cs.washington.edu/~eunsol/papers/acl17eunsol.pdf

    A Deep Reinforced Model for Abstractive Summarization

    https://arxiv.org/pdf/1705.04304.pdf

    Reinforcement Learning for Simultaneous Machine Translation

    https://www.umiacs.umd.edu/~jbg/docs/2014_emnlp_simtrans.pdf

    Dual Learning for Machine Translation

    https://papers.nips.cc/paper/6469-dual-learning-for-machine-translation.pdf

    Learning to Win by Reading Manuals in a Monte-Carlo Framework

    http://people.csail.mit.edu/regina/my_papers/civ11.pdf

    Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning

    http://people.csail.mit.edu/regina/my_papers/civ11.pdf

    Deep Reinforcement Learning with a Natural Language Action Space

    http://www.aclweb.org/anthology/P16-1153

    Deep Reinforcement Learning for Dialogue Generation

    https://arxiv.org/pdf/1606.01541.pdf

    Reinforcement Learning for Mapping Instructions to Actions

    http://people.csail.mit.edu/branavan/papers/acl2009.pdf

    Language Understanding for Text-based Games using Deep Reinforcement Learning

    https://arxiv.org/pdf/1506.08941.pdf

    End-to-end LSTM-based dialog control optimized with supervised and reinforcement learning

    https://arxiv.org/pdf/1606.01269v1.pdf

    End-to-End Reinforcement Learning of Dialogue Agents for Information Access

    https://arxiv.org/pdf/1609.00777v1.pdf

    Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning

    https://arxiv.org/pdf/1702.03274.pdf

    Deep Reinforcement Learning for Mention-Ranking Coreference Models

    https://arxiv.org/abs/1609.08667

    Selected articles

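    Wiki

    https://en.wikipedia.org/wiki/Reinforcement_learning

    Deep Reinforcement Learning: Pong from Pixels

    http://karpathy.github.io/2016/05/31/rl/

    CS294: Deep Reinforcement Learning

    http://rll.berkeley.edu/deeprlcourse/

    Reinforcement Learning Series, Part 1: Markov Decision Processes (强化学习系列之一:马尔科夫决策过程)

    http://www.algorithmdog.com/%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0-%E9%A9%AC%E5%B0%94%E7%A7%91%E5%A4%AB%E5%86%B3%E7%AD%96%E8%BF%87%E7%A8%8B

    Reinforcement Learning Series, Part 9: Deep Q Network (DQN) (强化学习系列之九:Deep Q Network (DQN))

    http://www.algorithmdog.com/drl

    Reinforcement Learning Series, Part 3: Model-Free Policy Evaluation (强化学习系列之三:模型无关的策略评价)

    http://www.algorithmdog.com/reinforcement-learning-model-free-evalution

    [Notes] Reinforcement Learning and MDPs (【整理】强化学习与MDP)

    http://www.cnblogs.com/mo-wang/p/4910855.html

    Introduction to Reinforcement Learning, with Implementation Code (强化学习入门及其实现代码)

    http://www.jianshu.com/p/165607eaa4f9

    Deep Reinforcement Learning Series (2): Reinforcement Learning (深度强化学习系列(二):强化学习)

    http://blog.csdn.net/ikerpeng/article/details/53031551

    Atari demo using a deep Q-network: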

    Nature paper on deep Q-networks (DQN)

    http://www.nature.com/articles/nature14236

    Lecture slides (PDF) used in David's videos

    https://pan.baidu.com/s/1nvqP7dB

    What Is Reinforcement Learning? (什么是强化学习?)

    http://www.cnblogs.com/geniferology/p/what_is_reinforcement_learning.html

    David Silver's paper on deterministic policy gradient (DPG)

    http://www.jmlr.org/proceedings/papers/v32/silver14.pdf

    Nature paper on AlphaGo

    http://www.nature.com/articles/nature16961

    AlphaGo resources

    http://deepmind.com/research/alphago/

    What’s the Difference Between Artificial Intelligence, Machine Learning, and Deep Learning?

    https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-intelligence-machine-learning-deep-learning-ai/

    Deep Learning in a Nutshell: Reinforcement Learning

    https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-reinforcement-learning/

    Bellman equation

    https://en.wikipedia.org/wiki/Bellman_equation

    Reinforcement learning

    https://en.wikipedia.org/wiki/Reinforcement_learning

    Mastering the Game of Go without Human Knowledge

    https://deepmind.com/documents/119/agz_unformatted_nature.pdf

    Reinforcement Learning (RL) for Natural Language Processing (NLP)

    https://github.com/adityathakker/awesome-rl-nlp

    Video tutorials

    Reinforcement learning tutorial by 莫烦 (强化学习教程)

    https://morvanzhou.github.io/tutorials/machine-learning/reinforcement-learning/

    Reinforcement learning course by David Silver

    https://www.bilibili.com/video/av8912293/?from=search&seid=1166472326542614796

    CS234: Reinforcement Learning

    http://web.stanford.edu/class/cs234/index.html

    What Is Reinforcement Learning? (什么是强化学习?)

    https://www.youtube.com/watch?v=NVWBs7b3oGk

    What Is Q Learning? (什么是 Q Learning)

    https://www.youtube.com/watch?v=HTZ5xn12AL4

    Reinforcement learning by 莫烦 (强化学习-莫烦)

    https://morvanzhou.github.io/tutorials/machine-learning/ML-intro/

    David Silver's deep reinforcement learning course, Lecture 1: Introduction (Chinese subtitles)

    https://www.bilibili.com/video/av9831889/

    David Silver's video lecture series (YouTube)

    https://www.youtube.com/watch?v=2pWv7GOvuf0&list=PL7-jPKtc4r78-wCZcQn5IqyuWhBZ8fOxT

    David Silver's video lecture series (Bilibili)

    http://www.bilibili.com/video/av9831889/?from=search&seid=17387316110198388304

    Deep Reinforcement Learning

    http://videolectures.net/rldm2015_silver_reinforcement_learning/

    Tutorial

    Reinforcement Learning for NLP

    http://www.umiacs.umd.edu/~jbg/teaching/CSCI_7000/11a.pdf

    ICML 2016, Deep Reinforcement Learning tutorial

    http://icml.cc/2016/tutorials/deep_rl_tutorial.pdf

    DQN tutorial

    https://medium.com/@awjuliani/simple-reinforcement-learning-with-tensorflow-part-4-deep-q-networks-and-beyond-8438a3e2b8df#.28wv34w3a

    Code

    OpenAI Gym

    https://github.com/openai/gym

    Google DeepMind's deep Q-network (DQN) source code

    http://sites.google.com/a/deepmind.com/dqn/

    ReinforcementLearningCode

    https://github.com/halleanwoo/ReinforcementLearningCode

    reinforcement-learning

    https://github.com/dennybritz/reinforcement-learning

    DQN

    https://github.com/devsisters/DQN-tensorflow

    DDPG

    https://github.com/stevenpjg/ddpg-aigym

    A3C 01

    https://github.com/miyosuda/async_deep_reinforce

    A3C 02

    https://github.com/openai/universe-starter-agent

