1 |
Robust Adversarial Reinforcement Learning |
2 |
Mastering the game of Go with deep neural networks and tree search |
3 |
Mastering the game of Go without human knowledge |
4 |
Continuous Control With Deep Reinforcement Learning |
5 |
Benchmarking deep reinforcement learning for continuous control |
6 |
Deep Reinforcement Learning for Mention-Ranking Coreference Models |
7 |
Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning |
8 |
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access |
9 |
Deep Reinforcement Learning for Dialogue Generation |
10 |
Online Reinforcement Learning in Stochastic Games |
11 |
Self-critical Sequence Training for Image Captioning |
12 |
Improved Image Captioning via Policy Gradient optimization of SPIDEr |
13 |
Safe and Nested Subgame Solving for Imperfect-Information Games |
14 |
Learning to Collaborate: Multi-Scenario Ranking via Multi-Agent Reinforcement Learning |
15 |
Neural Adaptive Video Streaming with Pensieve |
16 |
ReasoNet: Learning to Stop Reading in Machine Comprehension |
17 |
Dual learning for machine translation |
18 |
Reinforcement Mechanism Design |
19 |
Tuning Recurrent Neural Networks with Reinforcement Learning |
20 |
Curriculum Learning for Heterogeneous Star Network Embedding via Deep Reinforcement Learning |
21 |
Designing neural network architectures using reinforcement learning |
22 |
Neural Architecture Search with Reinforcement Learning |
23 |
Task-Oriented Query Reformulation with Reinforcement Learning |
24 |
Ask the Right Questions: Active Question Reformulation with Reinforcement Learning |
25 |
Go for a Walk and Arrive at the Answer: Reasoning over Paths in Knowledge Bases using Reinforcement Learning |
26 |
Real-Time Bidding by Reinforcement Learning in Display Advertising |
27 |
Dynamic Scholarly Collaborator Recommendation via Competitive Multi-Agent Reinforcement Learning |
28 |
DRN: A Deep Reinforcement Learning Framework for News Recommendation |
29 |
Reinforcement Learning for Relation Classification from Noisy Data |
30 |
Resource Management with Deep Reinforcement Learning |
31 |
Model-Based Reinforcement Learning in Continuous Environments using real-time constrained optimization |
32 |
End-to-End Training of Deep Visuomotor Policies |
33 |
Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates |
34 |
Learning To Route |
35 |
Learning Structured Representation for Text Classification via Reinforcement Learning |
36 |
Generating Text with Deep Reinforcement Learning |
37 |
A Deep Reinforced Model for Abstractive Summarization |
38 |
Experience-driven Networking: A Deep Reinforcement Learning based Approach |
39 |
Coordinated deep reinforcement learners for traffic light control |
40 |
Playing FPS Games with Deep Reinforcement Learning |
41 |
A Deep Hierarchical Approach to Lifelong Learning in Minecraft |
42 |
Playing Atari with Deep Reinforcement Learning |
43 |
Learning to act by predicting the future |
44 |
Active Neural Localization |
45 |
Asynchronous methods for deep reinforcement learning |
46 |
Deep Attention Recurrent Q-Network |
47 |
Reinforcement Learning Neural Turing Machines - Revised |
48 |
Learning Deep Neural Network Policies with Continuous Memory States |
49 |
Dueling Network Architectures for Deep Reinforcement Learning |
50 |
Evolution Strategies as a Scalable Alternative to Reinforcement Learning |
51 |
#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning |
52 |
Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models |
53 |
Curiosity-driven Exploration by Self-supervised Prediction |
54 |
Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning |
55 |
Deep Exploration via Bootstrapped DQN |
56 |
The option-critic architecture |
57 |
FeUdal Networks for Hierarchical Reinforcement Learning |
58 |
Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation |
59 |
Deep reinforcement learning in large discrete action spaces |
60 |
Learning To Reinforcement Learn |
61 |
Reinforcement Learning under Model Mismatch |
62 |
Continuous Deep Q-Learning with Model-based Acceleration |
63 |
Safe Model-based Reinforcement Learning with Stability Guarantees |
64 |
Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning |
65 |
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning |
66 |
Learning to Communicate with Deep Multi-Agent Reinforcement Learning |
67 |
Learning multiagent communication with back-propagation |
68 |
Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning |
69 |
Learning values across many orders of magnitude |
70 |
Deep Reinforcement Learning in Parameterized Action Space |
71 |
Reinforcement Learning with Parameterized Actions |
72 |
Value iteration networks |
73 |
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor |
74 |
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning |
75 |
Sample Efficient Actor-Critic with Experience Replay |
76 |
Policy Shaping: Integrating Human Feedback with Reinforcement Learning |
77 |
Interpolated policy gradient: Merging on-policy and off-policy gradient estimation for deep reinforcement learning |
78 |
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation |
79 |
Proximal Policy Optimization Algorithms |
80 |
Simple Random Search Provides a Competitive Approach to Reinforcement Learning |
81 |
Trust Region Policy Optimization |
82 |
Generative Adversarial Imitation Learning |
83 |
Where to Add Actions in Human-in-the-Loop Reinforcement Learning |
84 |
Maximum Entropy Deep Inverse Reinforcement Learning |
85 |
Cooperative inverse reinforcement learning |
86 |
Reinforcement Learning from Demonstration through Shaping |
87 |
Hybrid Reward Architecture for Reinforcement Learning |
88 |
Deep Reinforcement Learning from Human Preferences |
89 |
Optimistic posterior sampling for reinforcement learning: worst-case regret bounds |
90 |
Distral: Robust Multitask Reinforcement Learning |
91 |
Scalable Multitask Policy Gradient Reinforcement Learning |
92 |
Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning |
93 |
Transfer Reinforcement Learning with Shared Dynamics |
94 |
Successor Features for Transfer in Reinforcement Learning |
95 |
Massively Parallel Methods for Deep Reinforcement Learning |
96 |
Deep reinforcement learning with double Q-learning |
97 |
Human-level control through deep reinforcement learning |
98 |
Multi-step Off-policy Learning Without Importance Sampling Ratios |
99 |
Weighted importance sampling for off-policy learning with linear function approximation |
100 |
Off-policy learning based on weighted importance sampling with linear computational complexity |
101 |
Safe and Efficient Off-policy Reinforcement Learning |
102 |
Universal Value Function Approximators |
103 |
Linear feature encoding for reinforcement learning |
104 |
Imagination-Augmented Agents for Deep Reinforcement Learning |