Reinforcement Learning Papers at Top NLP Conferences, 2019-20: A Roundup


Author: BoringFantasy | Published: 2020-02-18 20:31

A tally of reinforcement learning papers at top natural language processing conferences, 2019-20

ACL 2019

Long Papers: http://www.acl2019.org/EN/program/papers.xhtml#papers01
Short Papers: http://www.acl2019.org/EN/program/papers.xhtml#papers02

Long Papers: 
 Reinforced Training Data Selection for Domain Adaptation 

Aspect Sentiment Classification Towards Question-Answering with Reinforced Bidirectional Attention Network 

A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer 

Neural Keyphrase Generation via Reinforcement Learning with Adaptive Rewards 

What Should I Ask? Using Conversationally Informative Rewards for Goal-oriented Visual Dialog.

Reinforced Dynamic Reasoning for Conversational Question Generation 


Short Papers:
 A Deep Reinforced Sequence-to-Set Model for Multi-Label Classification 

Historical Text Normalization with Delayed Rewards 

Rewarding Smatch: Transition-Based AMR Parsing with Reinforcement Learning 

End-to-end Deep Reinforcement Learning Based Coreference Resolution 

EMNLP 2019

https://www.emnlp-ijcnlp2019.org/program/accepted/

Answers Unite! Unsupervised Metrics for Reinforced Summarization Models
Thomas Scialom, Sylvain Lamprier, Benjamin Piwowarski and Jacopo Staiano

Clickbait? Sensational Headline Generation with Auto-tuned Reinforcement Learning
Peng Xu, Chien-Sheng Wu, Andrea Madotto and Pascale Fung

Deep Reinforcement Learning-based Text Anonymization against Private-Attribute Inference
Ahmadreza Mosallanezhad, Ghazaleh Beigi and Huan Liu

Hierarchical Text Classification with Reinforced Label Assignment
Yuning Mao, Jingjing Tian, Jiawei Han and Xiang Ren

Human-Like Decision Making: Document-level Aspect Sentiment Classification via Hierarchical Reinforcement Learning
Jingjing Wang, Changlong Sun, Shoushan Li, Jiancheng Wang, Luo Si, Min Zhang, Xiaozhong Liu and Guodong Zhou

Incorporating Graph Attention Mechanism into Knowledge Graph Reasoning Based on Deep Reinforcement Learning
Heng Wang, Shuangyin Li, Rong Pan and Mingzhi Mao

Learning the Extraction Order of Multiple Relational Facts in a Sentence with Reinforcement Learning
Xiangrong Zeng, Shizhu He, Daojian Zeng, Kang Liu, Shengping Liu and Jun Zhao

Lexical-Based Adversarial Reinforcement Training for Robust Sentiment Classification
Jingjing Xu, Liang Zhao, Hanqi Yan, Qi Zeng, Yun Liang and Xu Sun

Reinforced Product Metadata Selection for Helpfulness Assessment of Customer Reviews
Miao Fan, Chao Feng, Mingming Sun and Ping Li

An Empirical Comparison on Imitation Learning and Reinforcement Learning for Paraphrase Generation
Wanyu Du and Yangfeng Ji

Deep Reinforcement Learning with Distributional Semantic Rewards for Abstractive Summarization
Siyao Li, Deren Lei, Pengda Qin and William Yang Wang

Neural Topic Model with Reinforcement Learning
Lin Gui, Jia Leng, Gabriele Pergola, Yu Zhou, Ruifeng Xu and Yulan He

Better Rewards Yield Better Summaries: Learning to Summarise Without References
Florian Böhm, Yang Gao, Christian M. Meyer, Ori Shapira, Ido Dagan and Iryna Gurevych

Guided Dialog Policy Learning: Reward Estimation for Multi-Domain Task-Oriented Dialog
Ryuichi Takanobu, Hanlin Zhu and Minlie Huang

Rewarding Coreference Resolvers for Being Consistent with World Knowledge
Rahul Aralikatte, Heather Lent, Ana Valeria Gonzalez, Daniel Hershcovich, Chen Qiu, Anders Sandholm, Michael Ringgaard and Anders Søgaard

Collaborative Policy Learning for Open Knowledge Graph Reasoning
Cong Fu, Tong Chen, Meng Qu, Woojeong Jin and Xiang Ren

Modeling Multi-Action Policy for Task-Oriented Dialogues
Lei Shu, Hu Xu, Bing Liu and Piero Molino

AAAI 2020

Google Research Football: A Novel Reinforcement Learning Environment
Karol Kurach (Google Brain)*; Anton Raichuk (Google); Piotr Stańczyk (Google Brain); Michał Zając (Google Brain); Olivier Bachem (Google Brain); Lasse Espeholt (DeepMind); Carlos Riquelme (Google Brain); Damien Vincent (Google Brain); Marcin Michalski (Google); Olivier Bousquet (Google); Sylvain Gelly (Google Brain)

Reinforcing an Image Caption Generator using Off-line Human Feedback
Paul Hongsuck Seo (POSTECH)*; Piyush Sharma (Google Research); Tomer Levinboim (Google); Bohyung Han (Seoul National University); Radu Soricut (Google)

Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance
Xiaojian Ma (University of California, Los Angeles)*; Mingxuan Jing (Tsinghua University); Wenbing Huang (Tsinghua University); Chao Yang (Tsinghua University); Fuchun Sun (Tsinghua); Huaping Liu (Tsinghua University); Bin Fang (Tsinghua University)

Proximal Distilled Evolutionary Reinforcement Learning
Cristian Bodnar (University of Cambridge)*; Ben Day (University of Cambridge); Pietro Lió (University of Cambridge)

Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video
Jie Wu (Sun Yat-sen University)*; Guanbin Li (Sun Yat-sen University); Si Liu (Beihang University); Liang Lin (DarkMatter AI)

Attractive or Faithful? Popularity-Reinforced Learning for Inspired Headline Generation
YunZhu Song (National Chiao Tung University)*; Hong-Han Shuai (National Chiao Tung University); Sung-Lin Yeh (National Tsing Hua University); Yi-Lun Wu (National Chiao Tung University); Lun-Wei Ku (Academia Sinica); Wen-Chih Peng (National Chiao Tung University)

RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning
Nan Jiang (Tsinghua University)*; Sheng Jin (Tsinghua University); Zhiyao Duan (University of Rochester); Changshui Zhang (Tsinghua University)

Mastering Complex Control in MOBA Games with Deep Reinforcement Learning
Deheng Ye (Tencent)*; Zhao Liu (Tencent); Mingfei Sun (Tencent); Bei Shi (Tencent AI Lab); Peilin Zhao (Tencent AI Lab); Hao Wu (Tencent); Hongsheng Yu (Tencent); Shaojie Yang (Tencent); Xipeng Wu (Tencent); Qingwei Guo (Tsinghua University); Qiaobo Chen (Tencent); Yinyuting Yin (Tencent); Hao Zhang (Tencent); Tengfei Shi (Tencent); Liang Wang (Tencent); Qiang Fu (Tencent AI Lab); Wei Yang (Tencent AI Lab); Lanxiao Huang (Tencent)

Partner Selection for the Emergence of Cooperation in Multi-Agent Systems using Reinforcement Learning
Nicolas Anastassacos (The Alan Turing Institute)*; Steve Hailes (University College London); Mirco Musolesi (UCL)

Reinforcing Neural Network Stability with Attractor Dynamics
Hanming Deng (Shanghai Jiao Tong University); Yang Hua (Queen's University Belfast); Tao Song (Shanghai Jiao Tong University)*; Zhengui Xue (Shanghai Jiao Tong University); Ruhui Ma (Shanghai Jiao Tong University); Neil Robertson (Queen's University Belfast); Haibing Guan (Shanghai Jiao Tong University)

Uncertainty-Aware Action Advising for Deep Reinforcement Learning Agents
Felipe Leno da Silva (University of Sao Paulo)*; Pablo Hernandez-Leal (Borealis AI); Bilal Kartal (Borealis AI); Matthew Taylor (Borealis AI)

MetaLight: Value-based Meta-reinforcement Learning for Traffic Signal Control
Xinshi Zang (Shanghai Jiao Tong University)*; Huaxiu Yao (Pennsylvania State University); Guanjie Zheng (Pennsylvania State University); Nan Xu (University of Southern California); Kai Xu (Shanghai Tianrang Intelligent Technology Co., Ltd); Zhenhui (Jessie) Li (Penn State University)

Adaptive Quantitative Trading: an Imitative Deep Reinforcement Learning Approach
Yang Liu (University of Science and Technology of China)*; Qi Liu (University of Science and Technology of China); Hongke Zhao (Tianjin University); Zhen Pan (University of Science and Technology of China); Chuanren Liu (The University of Tennessee Knoxville)

Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning
Hangyu Mao (Peking University)*; Wulong Liu (Huawei Noah's Ark Lab); Jianye Hao (Tianjin University); Jun Luo (Huawei Technologies Canada Co. Ltd.); Dong Li (Huawei Noah's Ark Lab); Zhengchao Zhang (Peking University); Jun Wang (UCL); Zhen Xiao (Peking University)

SMIX($\lambda$): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning
Chao Wen (Nanjing University of Aeronautics and Astronautics)*; Xinghu Yao (Nanjing University of Aeronautics and Astronautics); Yuhui Wang (Nanjing University of Aeronautics and Astronautics, China); Xiaoyang Tan (Nanjing University of Aeronautics and Astronautics, China)

Unpaired Image Enhancement Featuring Reinforcement-Learning-Controlled Image Editing Software
Satoshi Kosugi (The University of Tokyo)*; Toshihiko Yamasaki (The University of Tokyo)

Crowdfunding Dynamics Tracking: A Reinforcement Learning Approach
Jun Wang (University of Science and Technology of China)*; Hefu Zhang (University of Science and Technology of China); Qi Liu (University of Science and Technology of China); Zhen Pan (University of Science and Technology of China); Hanqing Tao (University of Science and Technology of China (USTC))

Model and Reinforcement Learning for Markov Games with Risk Preferences
Wenjie Huang (Shenzhen Research Institute of Big Data)*; Hai Pham Viet (Department of Computer Science, School of Computing, National University of Singapore); William Benjamin Haskell (Supply Chain and Operations Management Area, Krannert School of Management, Purdue University)

Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning
Liang Tong (Washington University in Saint Louis)*; Aron Laszka (University of Houston); Chao Yan (Vanderbilt University); Ning Zhang (Washington University in St. Louis); Yevgeniy Vorobeychik (Washington University in St. Louis)

Toward A Thousand Lights: Decentralized Deep Reinforcement Learning for Large-Scale Traffic Signal Control
Chacha Chen (Pennsylvania State University)*; Hua Wei (Pennsylvania State University); Nan Xu (University of Southern California); Guanjie Zheng (Pennsylvania State University); Ming Yang (Shanghai Tianrang Intelligent Technology Co., Ltd); Yuanhao Xiong (Zhejiang University); Kai Xu (Shanghai Tianrang Intelligent Technology Co., Ltd); Zhenhui (Jessie) Li (Penn State University)

Towards Fine-Grained Temporal Network Representation via Time-Reinforced Random Walk
Zhining Liu (University of Electronic Science and Technology of China)*; Dawei Zhou (University of Illinois at Urbana-Champaign); Yada Zhu (IBM); Jinjie Gu (Ant Financial); Jingrui He (University of Illinois at Urbana-Champaign)

Deep Reinforcement Learning for Active Human Pose Estimation
Erik Gärtner (Lund University)*; Aleksis Pirinen (Lund University); Cristian Sminchisescu (Lund University)

Be Relevant, Non-redundant, Timely: Deep Reinforcement Learning for Real-time Event Summarization
Min Yang (Chinese Academy of Sciences)*; Chengming Li (Chinese Academy of Sciences); Fei Sun (Alibaba Group); Zhou Zhao (Zhejiang University); Ying Shen (Peking University Shenzhen Graduate School); Chenglin Wu (fuzhi.ai)

A Tale of Two-Timescale Reinforcement Learning with the Tightest Finite-Time Bound
Gal Dalal (Technion)*; Balazs Szorenyi (Yahoo Research); Gugan Thoppe (Duke University)

Reinforcement Learning with Perturbed Rewards
Jingkang Wang (University of Toronto); Yang Liu (UCSC); Bo Li (University of Illinois at Urbana–Champaign)*

Exploratory Combinatorial Optimization with Reinforcement Learning
Thomas Barrett (University of Oxford)*; William Clements (Unchartech); Jakob Foerster (Facebook AI Research); Alexander Lvovsky (Oxford University)

Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction
Vishal Jain (Mila, McGill University)*; Liam Fedus (Google); Hugo Larochelle (Google); Doina Precup (McGill University); Marc G. Bellemare (Google Brain)

Spatiotemporally Constrained Action Space Attacks on Deep Reinforcement Learning Agents
Xian Yeow Lee (Iowa State University)*; Sambit Ghadai (Iowa State University); Kai Liang Tan (Iowa State University); Chinmay Hegde (New York University); Soumik Sarkar (Iowa State University)

Modelling Sentence Pairs via Reinforcement Learning: An Actor-Critic Approach to Learn the Irrelevant Words
Mahtab Ahmed (The University of Western Ontario)*; Robert Mercer (The University of Western Ontario)

Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States
Yunan Ye (Zhejiang University)*; Hengzhi Pei (Fudan University); Boxin Wang (University of Illinois at Urbana-Champaign); Pin-Yu Chen (IBM Research); Yada Zhu (IBM Research); Jun Xiao (Zhejiang University); Bo Li (University of Illinois at Urbana–Champaign)

Deep Reinforcement Learning for General Game Playing
Adrian Goldwaser (University of New South Wales)*; Michael Thielscher (University of New South Wales)

Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning
Jianwen Sun (Nanyang Technological University)*; Tianwei Zhang (Nanyang Technological University); Xiaofei Xie (Nanyang Technological University); Lei Ma (Kyushu University); Yan Zheng (Tianjin University); Kangjie Chen (Tianjin University); Yang Liu (Nanyang Technology University, Singapore)

LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games
Leonard Adolphs (ETHZ)*; Thomas Hofmann (ETH Zurich)

Induction of Subgoal Automata for Reinforcement Learning
Daniel Furelos-Blanco (Imperial College London)*; Mark Law (Imperial College London); Alessandra Russo (Imperial College London); Krysia Broda (Imperial College London); Anders Jonsson (UPF)

MRI Reconstruction with Interpretable Pixel-Wise Operations Using Reinforcement Learning
Wentian Li (Tsinghua University)*; Xidong Feng (Department of Automation, Tsinghua University); Haotian An (Tsinghua University); Xiang Yao Ng (Tsinghua University); Yu-Jin Zhang (Tsinghua University)

Explainable Reinforcement Learning Through a Causal Lens
Prashan Madumal (University of Melbourne)*; Tim Miller (University of Melbourne); Liz Sonenberg (University of Melbourne); Frank Vetere (University of Melbourne)

Reinforcement Learning based Meta-path Discovery in Large-scale Heterogeneous Information Networks
Guojia Wan (Wuhan University); Bo Du (School of Computer Science, Wuhan University)*; Shirui Pan (Monash University); Reza Haffari (Monash University, Australia)

Reinforcement Learning When All Actions are Not Always Available
Yash Chandak (University of Massachusetts Amherst)*; Georgios Theocharous (Adobe Research, USA); Blossom Metevier (University of Massachusetts, Amherst); Philip Thomas (University of Massachusetts Amherst)

Dialog State Tracking with Reinforced Data Augmentation
Yichun Yin (Noah's Ark Lab of Huawei)*; Lifeng Shang (Noah's Ark Lab); Xin Jiang (Huawei Noah's Ark Lab); Xiao Chen (Huawei Noah's Ark Lab); Qun Liu (Huawei Noah's Ark Lab)

Reinforcement Mechanism Design: With Applications to Dynamic Pricing in Sponsored Search Auctions
Weiran Shen (Carnegie Mellon University)*; Binghui Peng (Columbia University); Hanpeng Liu (Tsinghua University); Michael Zhang (Chinese University of Hong Kong); Ruohan Qian (Baidu Inc.); Yan Hong (Baidu Inc.); Zhi Guo (Baidu Inc.); Zongyao Ding (Baidu Inc.); Pengjun Lu (Baidu Inc.); Pingzhong Tang (Tsinghua University)

Metareasoning in Modular Software Systems: On-the-Fly Configuration Using Reinforcement Learning with Rich Contextual Representations
Aditya Modi (Univ. of Michigan Ann Arbor)*; Debadeepta Dey (Microsoft); Alekh Agarwal (Microsoft); Adith Swaminathan (Microsoft Research); Besmira Nushi (Microsoft Research); Sean Andrist (Microsoft Research); Eric Horvitz (MSR)

Joint Entity and Relation Extraction with a Hybrid Transformer and Reinforcement Learning Based Model
Ya Xiao (Tongji University)*; Chengxiang Tan (Tongji University); Zhijie Fan (The Third Research Institute of the Ministry of Public Security); Qian Xu (Tongji University); Wenye Zhu (Tongji University)

Reinforced Curriculum Learning on Pre-trained Neural Machine Translation Models
Mingjun Zhao (University of Alberta)*; Haijiang Wu (Tencent); Di Niu (University of Alberta); Xiaoli Wang (Tencent)

Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes
Tomas Brazdil (Masaryk University); Krishnendu Chatterjee (IST Austria); Petr Novotný (Masaryk University)*; Jiří Vahala (Masaryk University)

Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization
Qi Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Jie Wang (University of Science and Technology of China)*

Reinforcement Learning with Non-Markovian Rewards
Maor Gaon (Ben-Gurion University); Ronen Brafman (BGU)*

Modular Robot Design Synthesis with Deep Reinforcement Learning
Julian Whitman (Carnegie Mellon University)*; Raunaq Bhirangi (Carnegie Mellon University); Matthew Travers (CMU); Howie Choset (Carnegie Mellon University)

BAR - A Reinforcement Learning Agent for Bounding-Box Automated Refinement
Morgane Ayle (American University of Beirut - AUB)*; Jimmy Tekli (BMW Group / Université de Franche-Comté - UFC); Julia Zini (American University of Beirut - AUB); Boulos El Asmar (BMW Group / Karlsruher Institut für Technologie - KIT); Mariette Awad (American University of Beirut - AUB)

Hierarchical Reinforcement Learning for Open-Domain Dialog
Abdelrhman Saleh (Harvard University)*; Natasha Jaques (MIT); Asma Ghandeharioun (MIT); Judy Hanwen Shen (MIT); Rosalind Picard (MIT Media Lab)

Copy or Rewrite: Hybrid Summarization with Hierarchical Reinforcement Learning
Liqiang Xiao (Artificial Intelligence Institute, SJTU)*; Lu Wang (Khoury College of Computer Science, Northeastern University); Hao He (Shanghai Jiao Tong University); Yaohui Jin (Artificial Intelligence Institute, SJTU)

Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning
Xiang Ni (IBM Research); Jing Li (NJIT); Wang Zhou (IBM Research); Mo Yu (IBM T. J. Watson)*; Kun-Lung Wu (IBM Research)

Actor Critic Deep Reinforcement Learning for Neural Malware Control
Yu Wang (Microsoft)*; Jack Stokes (Microsoft Research); Mady Marinescu (Microsoft Corporation)

Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning
Kristopher De Asis (University of Alberta)*; Alan Chan (University of Alberta); Silviu Pitis (University of Toronto); Richard Sutton (University of Alberta); Daniel Graves (Huawei)

Sequence Generation with Optimal-Transport-Enhanced Reinforcement Learning
Liqun Chen (Duke University)*; Ke Bai (Duke University); Chenyang Tao (Duke University); Yizhe Zhang (Microsoft Research); Guoyin Wang (Duke University); Wenlin Wang (Duke University); Ricardo Henao (Duke University); Lawrence Carin (Duke CS)

Scaling All-Goals Updates in Reinforcement Learning Using Convolutional Neural Networks
Fabio Pardo (Imperial College London)*; Vitaly Levdik (Imperial College London); Petar Kormushev (Imperial College London)

Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning
Tian Tan (Stanford University)*; Zhihan Xiong (Stanford University); Vikranth Dwaracherla (Stanford University)

Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning
Sanket Shah (Singapore Management University)*; Arunesh Sinha (Singapore Management University); Pradeep Varakantham (Singapore Management University); Andrew Perrault (Harvard University); Milind Tambe (Harvard University)


IJCAI 2019

https://www.ijcai19.org/accepted-papers.html

A Dual Reinforcement Learning Framework for Unsupervised Text Style Transfer: Fuli Luo, Peng Li, Jie Zhou, Pengcheng Yang, Baobao Chang, Xu Sun, Zhifang Sui

A Restart-based Rank-1 Evolution Strategy for Reinforcement Learning: Zefeng Chen, Yuren Zhou, Xiao-yu He, Siyu Jiang

An Actor-Critic-Attention Mechanism for Deep Reinforcement Learning in Multi-view Environments: Elaheh Barati, Xuewen Chen

An Atari Model Zoo for Analyzing, Visualizing, and Comparing Deep Reinforcement Learning Agents: Felipe Such, Vashisht Madhavan, Rosanne Liu, Rui Wang, Pablo Samuel Castro, Yulun Li, Jiale Zhi, Ludwig Schubert, Marc G. Bellemare, Jeff Clune, Joel Lehman

Automatic Successive Reinforcement Learning with Multiple Auxiliary Rewards: Zhao-Yang Fu, De-Chuan Zhan, Xin-Chun Li, Yi-Xing Lu

Autoregressive Policies for Continuous Control Deep Reinforcement Learning: Dmytro Korenkevych, Ashique Rupam Mahmood, Gautham Vasan, James Bergstra

Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces: Haotian Fu, Hongyao Tang, Jianye Hao, Zihan Lei, Yingfeng Chen, Changjie Fan

Dynamic Electronic Toll Collection via Multi-Agent Deep Reinforcement Learning with Edge-Based Graph Convolutional Network Representation: Wei Qiu, Haipeng Chen, Bo An

Energy-Efficient Slithering Gait Exploration for a Snake-Like Robot Based on Reinforcement Learning: Zhenshan Bing, Christian Lemke, Zhuangyi Jiang, Kai Huang, Alois Knoll

Explaining Reinforcement Learning to Mere Mortals: An Empirical Study: Andrew Anderson, Jonathan Dodge, Amrita Sadarangani, Zoe Juozapaitis, Evan Newman, Jed Irvine, Souti Chattopadhyay, Alan Fern, Margaret Burnett

Incremental Learning of Planning Actions in Model-Based Reinforcement Learning: Alvin Ng, Ron Petrick

Hybrid Actor-Critic Reinforcement Learning in Parameterized Action Space: Zhou Fan, Rui Su, Weinan Zhang, Yong Yu

Interactive Reinforcement Learning with Dynamic Reuse of Prior Knowledge from Human/Agent's Demonstration: Zhaodong Wang, Matt Taylor

Interactive Teaching Algorithms for Inverse Reinforcement Learning: Parameswaran Kamalaruban, Rati Devidze, Volkan Cevher, Adish Singla

Large-Scale Home Energy Management Using Entropy-Based Collective Multiagent Deep Reinforcement Learning: Yaodong Yang, Jianye Hao, Yan Zheng, Chao Yu

Meta Reinforcement Learning with Task Embedding and Shared Policy: Lin Lan, Zhenguo Li, Xiaohong Guan, Pinghui Wang

Metatrace Actor-Critic: Online Step-Size Tuning by Meta-gradient Descent for Reinforcement Learning Control: Kenny Young, Baoxiang Wang, Matthew E. Taylor

Multi-scale Information Diffusion Prediction with Reinforced Recurrent Networks: Cheng Yang, Jian Tang, Maosong Sun, Ganqu Cui, Zhiyuan Liu

Mutually Reinforced Spatio-Temporal Convolutional Tube for Human Action Recognition: Haoze Wu, Jiawei Liu, Zheng-Jun Zha, Zhenzhong Chen, Xiaoyan Sun

Playing FPS Games With Environment-Aware Hierarchical Reinforcement Learning: Shihong Song, Jiayi Weng, Hang Su, Dong Yan, Haosheng Zou, Jun Zhu

Playing Card-Based RTS Games with Deep Reinforcement Learning: Tianyu Liu, Zijie Zheng, Hongchang Li, Kaigui Bian, Lingyang Song

Reinforced Negative Sampling for Recommendation with Exposure Data: Jingtao Ding, Yuhan Quan, Xiangnan He, Yong Li, Depeng Jin

Reinforcement Learning Experience Reuse with Policy Residual Representation: WenJi Zhou, Yang Yu, Yingfeng Chen, Kai Guan, Tangjie Lv, Changjie Fan, Zhi-Hua Zhou

Reward Learning for Efficient Reinforcement Learning in Extractive Document Summarisation: Yang Gao, Christian Meyer, Mohsen Mesgar, Iryna Gurevych

Sharing Experience in Multitask Reinforcement Learning: Tung-Long Vuong, Do-Van Nguyen, Tai-Long Nguyen, Cong-Minh Bui, Hai-Dang Kieu, Viet-Cuong Ta, Quoc-Long Tran, Thanh-Ha Le

SlateQ: A Tractable Decomposition for Reinforcement Learning with Recommendation Sets: Eugene Ie, Vihan Jain, Jing Wang, Sanmit Narvekar, Ritesh Agarwal, Rui Wu, Heng-Tze Cheng, Tushar Chandra, Craig Boutilier

Soft Policy Gradient Method for Maximum Entropy Deep Reinforcement Learning: Wenjie Shi, Shiji Song, Cheng Wu

Solving Continual Combinatorial Selection via Deep Reinforcement Learning: HyungSeok Song, Hyeryung Jang, Hai H. Tran, Se-eun Yoon, Kyunghwan Son, Donggyu Yun, Hyoju Chung, Yung Yi

Successor Options: An Option Discovery Framework for Reinforcement Learning: Rahul Ramesh, Manan Tomar, Balaraman Ravindran

Transfer of Temporal Logic Formulas in Reinforcement Learning: Zhe Xu, Ufuk Topcu

Using Natural Language for Reward Shaping in Reinforcement Learning: Prasoon Goyal, Scott Niekum, Raymond Mooney

Value Function Transfer for Deep Multi-Agent Reinforcement Learning Based on N-Step Returns: Yong Liu, Yujing Hu, Yang Gao, Yingfeng Chen, Changjie Fan

Failure-Scenario Maker for Rule-Based Agent using Multi-agent Adversarial Reinforcement Learning and its Application to Autonomous Driving: Akifumi Wachi

LTL and Beyond: Formal Languages for Reward Function Specification in Reinforcement Learning: Alberto Camacho, Rodrigo Toro Icarte, Toryn Q. Klassen, Richard Valenzano, Sheila McIlraith

A Survey of Reinforcement Learning Informed by Natural Language: Jelena Luketina, Nantas Nardelli, Gregory Farquhar, Jakob Foerster, Jacob Andreas, Edward Grefenstette, Shimon Whiteson, Tim Rocktäschel

Leveraging Human Guidance for Deep Reinforcement Learning Tasks: Ruohan Zhang, Faraz Torabi, Lin Guan, Dana H. Ballard, Peter Stone

CRSRL: Customer Routing System using Reinforcement Learning: Chong Long, Zining Liu, Xiaolu Lu, Zehong Hu, Yafang Wang
    
Deep Reinforcement Learning for Ride-sharing Dispatching and Repositioning: Zhiwei (Tony) Qin, Xiaocheng Tang, Yan Jiao, Fan Zhang, Chenxi Wang

This tally may be incomplete: the papers were found by searching for keywords such as "Reinforcement Learning".
