Deep Reinforcement Learning Papers

A list of recent papers on deep reinforcement learning.

The papers are organized under manually defined bookmarks, so a paper may appear under several bookmarks.

Within each bookmark, papers are sorted by date, most recent first.

Any suggestions and pull requests are welcome.

Bookmarks

All Papers

Value

Policy

Discrete Control

Continuous Control

Text Domain

Visual Domain

Robotics

Games

Monte-Carlo Tree Search

Inverse Reinforcement Learning

Multi-Task and Transfer Learning

Improving Exploration

Multi-Agent

Hierarchical Learning

All Papers

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.

Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.

Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.

Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.

Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.

Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.

Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.

Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.

Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.

Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.

Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.

Value Iteration Networks, A. Tamar et al., arXiv, 2016.

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.

Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.

Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.

Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.

Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.

Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.

Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.

MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.

Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv, 2015.

Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.

Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.

Policy Distillation, A. A. Rusu et al., ICLR, 2016.

Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.

Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.

Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.

Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.

Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.

ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.

Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.

Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.

Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv, 2015.

Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.

Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.

Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.

Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.

Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.

Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv, 2015.

Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.

Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv, 2015.

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.

Maximum Entropy Deep Inverse Reinforcement Learning, M. Wulfmeier et al., arXiv, 2015.

High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.

End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.

DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS, 2015.

Universal Value Function Approximators, T. Schaul et al., ICML, 2015.

Deterministic Policy Gradient Algorithms, D. Silver et al., ICML, 2014.

Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.

Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.

Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Value

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.

Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.

Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.

Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.

Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.

Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.

Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.

Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.

Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.

Value Iteration Networks, A. Tamar et al., arXiv, 2016.

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.

Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.

Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.

Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.

Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.

Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.

Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv, 2015.

Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.

Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.

Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.

Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.

Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.

Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.

Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.

Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv, 2015.

Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.

Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.

Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.

Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.

Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.

Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.

Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Policy

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.

Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.

Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.

Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.

Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.

Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.

MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.

ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.

Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.

Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.

High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.

End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.

Deterministic Policy Gradient Algorithms, D. Silver et al., ICML, 2014.

Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Discrete Control

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.

Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.

Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.

Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.

Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.

Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.

Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.

Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.

Value Iteration Networks, A. Tamar et al., arXiv, 2016.

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.

Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.

Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.

Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.

Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.

Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.

Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv, 2015.

Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.

Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.

Policy Distillation, A. A. Rusu et al., ICLR, 2016.

Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.

Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.

Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.

Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.

Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.

ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.

Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.

Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.

Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv, 2015.

Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.

Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.

Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.

Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.

Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv, 2015.

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.

Universal Value Function Approximators, T. Schaul et al., ICML, 2015.

Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.

Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.

Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Continuous Control

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.

Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.

Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.

Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.

Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.

Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.

Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.

Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.

Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.

Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv, 2015.

High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.

End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.

DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS, 2015.

Deterministic Policy Gradient Algorithms, D. Silver et al., ICML, 2014.

Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Text Domain

Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.

MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.

Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.

Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.

Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.

Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv, 2015.

Visual Domain

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.

Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.

Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.

Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.

Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.

Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.

Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.

Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.

Value Iteration Networks, A. Tamar et al., arXiv, 2016.

Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.

Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.

Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.

Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.

Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.

Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.

Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.

Policy Distillation, A. A. Rusu et al., ICLR, 2016.

Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.

Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.

Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.

Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.

Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.

Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.

Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.

Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.

Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.

Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.

High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.

End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.

Universal Value Function Approximators, T. Schaul et al., ICML, 2015.

Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.

Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.

Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Robotics

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.

Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.

Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.

Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.

Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.

Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.

Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.

Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.

Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv, 2015.

High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.

End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.

DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS, 2015.

Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Games

Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.

Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.

Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.

Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.

Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.

Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.

Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.

Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.

Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.

Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.

Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.

How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.

Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.

MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.

Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.

Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.

Policy Distillation, A. A. Rusu et al., ICLR, 2016.

Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.

Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.

Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.

Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.

Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.

Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.

Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.

Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.

Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.

Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.

Universal Value Function Approximators, T. Schaul et al., ICML, 2015.

Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.

Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.

Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Monte-Carlo Tree Search

Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.

Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.

Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.

Inverse Reinforcement Learning

Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.

Maximum Entropy Deep Inverse Reinforcement Learning, M. Wulfmeier et al., arXiv, 2015.

Multi-Task and Transfer Learning

Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.

Policy Distillation, A. A. Rusu et al., ICLR, 2016.

ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.

Universal Value Function Approximators, T. Schaul et al., ICML, 2015.

Improving Exploration

Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.

Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.

Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.

Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.

Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.

Multi-Agent

Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.

Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.

Hierarchical Learning

Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.

Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
