awesome-rl-papers

Block MDP

Provable RL with Exogenous Distractors via Multistep Inverse Dynamics (ICLR2022 oral) arxiv [no code]
Provably efficient RL with Rich Observations via Latent State Decoding (ICML2019) arxiv code
Provable Rich Observation Reinforcement Learning with Combinatorial Latent States (ICLR2021) arxiv [no code]
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning (ICML2020) arxiv [no code]
Efficient Reinforcement Learning in Block MDPs: A Model-free Representation Learning Approach (arxiv) arxiv code
Exploiting Action Impact Regularity and Exogenous State Variables for Offline Reinforcement Learning arxiv [no code]
On Reinforcement Learning with Adversarial Corruption and Its Application to Block MDP (ICML2021) arxiv pdf
Block Contextual MDPs for Continual Learning (ICLR2022 withdrawn) openreview
Learning Domain Invariant Representations in Goal-conditioned Block MDPs arxiv code

Lifelong Learning, Continual Learning

Modular Lifelong Reinforcement Learning via Neural Composition (ICLR2022) arxiv [no code]
Generalisation in Lifelong Reinforcement Learning through Logical Composition (ICLR2022) arxiv [code bug]
Continual Learning via Local Module Composition (NIPS2021) arxiv code
Gradient Projection Memory for Continual Learning (ICLR2021 oral) arxiv code
Policy and value transfer in lifelong reinforcement learning (ICML2018) arxiv [no code]
Lipschitz Lifelong Reinforcement Learning (AAAI2021) arxiv code
Towards Continual Reinforcement Learning: A Review and Perspectives arxiv
Fast reinforcement learning with generalized policy updates (PNAS) arxiv
Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement (ICML2018) arxiv [no code]
Lifelong Policy Gradient Learning of Factored Policies for Faster Training Without Forgetting (NIPS2020) arxiv
Policy Consolidation for Continual Reinforcement Learning (ICML2019) arxiv code
Continual Reinforcement Learning with Complex Synapses arxiv [no code]
Continuous Coordination As a Realistic Scenario for Lifelong Learning arxiv code1 code2
Lifelong Incremental Reinforcement Learning with Online Bayesian Inference (TNNLS) pdf code
Is Model-Free Learning Nearly Optimal for Non-Stationary RL? (ICML2021) arxiv [no code]

Generalization

Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL (ICLR2022) arxiv code
Why Generalization in RL is Difficult: Epistemic POMDPs and Implicit Partial Observability (NIPS2021) arxiv
Constructing a Good Behavior Basis for Transfer using Generalized Policy Updates (ICLR2022) arxiv [no code]
Environment Generation for Zero-Shot Compositional Reinforcement Learning (NIPS2021) arxiv code?
Reinforcement Learning with Prototypical Representations (ICML2021) arxiv code
Deep Reinforcement Learning amidst Continual Structured Non-Stationarity (ICML2021) arxiv
K-level Reasoning for Zero-Shot Coordination in Hanabi (NIPS2021) arxiv
Source tasks selection for transfer deep reinforcement learning: a case of study on Atari games
The Distracting Control Suite -- A Challenging Benchmark for Reinforcement Learning from Pixels pdf code
AdaRL: What, Where, and How to Adapt in Transfer Reinforcement Learning (ICLR2022 spotlight) arxiv code
Improving zero-shot generalization in offline reinforcement learning using generalized similarity functions (ICLR2022 reject) openreview code
DARLA: Improving Zero-Shot Transfer in Reinforcement Learning (ICML2017) arxiv code
Case-based reasoning for better generalization in textual reinforcement learning (ICLR2022 poster) arxiv [no code]
Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks (ICML2021) arxiv code
Grounding Language to Entities and Dynamics for Generalization in Reinforcement Learning (ICML2021) arxiv code
On the Generalization of Representations in Reinforcement Learning (AISTATS2022) arxiv code
Policy Architectures for Compositional Generalization in Control arxiv code
Leveraging procedural generation to benchmark reinforcement learning
Quantifying generalization in reinforcement learning
Decoupling value and policy for generalization in reinforcement learning code
On overfitting and asymptotic bias in batch reinforcement learning with partial observability
Improving generalization in reinforcement learning with mixture regularization (NIPS2020) code
Observational overfitting in reinforcement learning
Assessing generalization in deep reinforcement learning
Neuro-algorithmic Policies enable Fast Combinatorial Generalization (ICML2021) [no code]
Self-supervised Visual Reinforcement Learning with Object-centric Representations (ICLR2021 spotlight) code
Transient Non-stationarity and Generalisation in Deep Reinforcement Learning (ICLR2021)
Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals (NIPS2020) (GNN)
SECANT: Self-Expert Cloning for Zero-Shot Generalization of Visual Policies (ICML2021) code
Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion (AAAI2021) code
Planning to Explore via Self-Supervised World Models (ICML2020) arxiv code

Transfer learning

Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers (ICLR2021) arxiv code
REPAINT: Knowledge Transfer in Deep Reinforcement Learning (ICML2021)

Multi-Task

Multi-Task Reinforcement Learning with Context-based Representations (ICML2021) arxiv code

Abstraction, Logical

Compositional Reinforcement Learning from Logical Specifications (NIPS2021) arxiv code
Learning Markov State Abstractions for Deep Reinforcement Learning (NIPS2021) arxiv code
R5: Rule Discovery with Reinforced and Recurrent Relational Reasoning (ICLR2022) arxiv
A Theory of Abstraction in Reinforcement Learning pdf thesis
Model-Invariant State Abstractions for Model-Based Reinforcement Learning arxiv [no code]

Symbolic

Emergent Symbols through Binding in External Memory (ICLR2021) arxiv code
Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients (ICLR2021) arxiv code
Discovering symbolic policies with deep reinforcement learning (ICML2021) arxiv
Iterated learning for emergent systematicity in VQA (ICLR2021 oral) arxiv

Auto RL

Evolving Reinforcement Learning Algorithms (ICLR2021 oral) arxiv
Discovering Reinforcement Learning Algorithms (NIPS2020) arxiv
CARL: A Benchmark for Contextual and Adaptive Reinforcement Learning (NIPS2021w) arxiv

Evolutionary RL

Transform2Act: Learning a Transform-and-Control Policy for Efficient Agent Design (ICLR2022 oral) arxiv code

Graph

Graph Convolutional Reinforcement Learning (ICLR2020) arxiv pytorch tf
Graph Policy Gradients for Large Scale Robot Control (CoRL2019 oral) arxiv code
Actor-Attention-Critic for Multi-Agent Reinforcement Learning (ICML2019) arxiv code
Symbolic Relational Deep Reinforcement Learning based on Graph Neural Networks
Efficient and Interpretable Robot Manipulation with Graph Neural Networks
Towards practical multi-object manipulation using relational reinforcement learning
Neural task graphs: Generalizing to unseen tasks from a single video demonstration (CVPR2019)

MARL

Multi-Agent Inverse Reinforcement Learning: Suboptimal Demonstrations and Alternative Solution arxiv
Multi-Agent Generative Adversarial Imitation Learning (NIPS2018) arxiv
Social Neuro AI: Social Interaction as the "dark matter" of AI arxiv
Emergent Social Learning via Multi-agent Reinforcement Learning (ICML2021) arxiv [no code]
Meta-brain Models: biologically-inspired cognitive agents arxiv
Learning Meta Representations for Agents in Multi-Agent Reinforcement Learning arxiv [no code]
An Efficient Transfer Learning Framework for Multiagent Reinforcement Learning (NIPS2021) arxiv code
Option-Critic in Cooperative Multi-agent Systems arxiv code
Bayesian Action Decoder for Deep Multi-Agent Reinforcement Learning (ICML2019) arxiv code
Joint Policy Search for Collaborative Multi-agent Imperfect Information Games (NIPS2020) arxiv code
Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction arxiv [no code]
Cooperative Exploration for Multi-Agent Deep Reinforcement Learning (ICML2021) arxiv code
A Policy Gradient Algorithm for Learning to Learn in Multiagent Reinforcement Learning (ICML2021) arxiv code
Coach-Player Multi-agent Reinforcement Learning for Dynamic Team Composition (ICML2021)
Learning Fair Policies in Decentralized Cooperative Multi-Agent Reinforcement Learning (ICML2021)
Reinforcement Learning under a Multi-agent Predictive State Representation Model: Method and Theory (ICLR2022)
Tensor Decomposition for Multi-agent Predictive State Representation
Learning with Opponent-Learning Awareness (AAMAS2018) arxiv code
Model-based Multi-agent Policy Optimization with Adaptive Opponent-wise Rollouts (IJCAI2021) arxiv code
Communication in multi-agent reinforcement learning: Intention sharing (ICLR2021)
Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation arxiv [no code]
Agent Modelling under Partial Observability for Deep Reinforcement Learning (NIPS2021) code

Auxiliary task, Representation learning

state-representaton-learning-rl blog
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning (ICLR2021 oral) arxiv code
Learning Invariant Representations for Reinforcement Learning without Reconstruction arxiv code
Decoupling Representation Learning from Reinforcement Learning (ICML2021) arxiv code
Dealing with Non-Stationarity in MARL via Trust-Region Decomposition (ICLR2022) arxiv [no code]

Exploration

A Tutorial on Thompson Sampling pdf
When should agents explore? (ICLR2021 spotlight) arxiv
Principled Exploration via Optimistic Bootstrapping and Backward Induction (ICML2021)

Offline

NeoRL: A Near Real-World Benchmark for Offline Reinforcement Learning arxiv code

Low Rank MDP

Representation Learning for Online and Offline RL in Low-rank MDPs (ICLR2022 spotlight) arxiv
A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning (ICLR2022 reject) openreview
Agnostic Reinforcement Learning with Low-Rank MDPs and Rich Observations (NIPS2021) arxiv

Sample Efficiency

Sample Efficient Deep Reinforcement Learning via Uncertainty Estimation (ICLR2022 spotlight) arxiv

Interpretability

Programmatic Reinforcement Learning without Oracles (ICLR2022 spotlight) openreview [no code]

Sim2Real

Understanding Domain Randomization for Sim-to-real Transfer (ICLR2022 spotlight) arxiv

Hierarchical

Possibility Before Utility: Learning And Using Hierarchical Affordances (ICLR2022 spotlight) arxiv code
Hierarchical Reinforcement Learning: A Comprehensive Survey pdf
Hierarchical Multi-Agent Reinforcement Learning pdf
Hierarchical Cooperative Multi-Agent Reinforcement Learning with Skill Discovery (AAMAS2020) arxiv code
Graph-Based Skill Acquisition For Reinforcement Learning pdf
Compositional Reinforcement Learning from Logical Specifications (NIPS2021) code (Dijkstra)

POMDP

Deep Variational Reinforcement Learning for POMDPs (ICLR2018) arxiv code
Structured World Belief for Reinforcement Learning in POMDP (ICML2021) arxiv [no code]
An Efficient, Expressive and Local Minima-free Method for Learning Controlled Dynamical Systems (AAAI2018) arxiv code
Learning Latent Dynamics for Planning from Pixels (ICML2019) arxiv code
Recurrent Model-Free RL is a Strong Baseline for Many POMDPs
Reinforcement Learning in Rich-Observation MDPs using Spectral Methods
On Improving Deep Reinforcement Learning for POMDPs code
Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model (NIPS2020)
Planning from Pixels using Inverse Dynamics Models (ICLR2021) [no code]

Constrained RL

Density Constrained Reinforcement Learning (ICML2021) arxiv

Evolution

Trust Region Evolution Strategies (AAAI2019) pdf

Model based

Model-Based Reinforcement Learning via Latent-Space Collocation (ICML2021)

Binding

Towards Nonlinear Disentanglement in Natural Data with Temporal Sparse Coding (ICLR2021 oral) code

Review

Deep Reinforcement Learning: Opportunities and Challenges pdf
Reinforcement Learning in Robotics: A Survey pdf
Model-based Multi-agent Reinforcement Learning: Recent Progress and Prospects arxiv
A Survey of Generalisation in Deep Reinforcement Learning arxiv
Approximation Methods for Partially Observed Markov Decision Processes (POMDPs)

Tutorials

Notes on Theoretical Foundations of Reinforcement Learning pdf
Exploration blog
https://github.com/yangyutu/EssentialMath/

Misc

Contextual Decision Processes with Low Bellman Rank are PAC-Learnable (ICML2017) arxiv
Is the Policy Gradient a Gradient? pdf
Bayesian Reinforcement Learning: A Survey arxiv
Learning Good State and Action Representations via Tensor Decomposition arxiv
On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning (ICLR2022 spotlight) arxiv
Constrained Policy Optimization via Bayesian World Models openreview
Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution (ICML2017) pdf
A bayesian approach to problems in stochastic estimation and control pdf
On Proximal Policy Optimization's Heavy-tailed Gradients (ICML2021) arxiv
Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning (ICML2021) arxiv code
Muesli: Combining Improvements in Policy Optimization (ICML2021) arxiv code
Low-Precision Reinforcement Learning: Running Soft Actor-Critic in Half Precision (ICML2021) arxiv [no code]
Controlling Graph Dynamics with Reinforcement Learning and Graph Neural Networks (ICML2021) arxiv [no code]
Spectral Normalisation for Deep Reinforcement Learning: An Optimisation Perspective (ICML2021)
Temporal Predictive Coding For Model-Based Planning In Latent Space (ICML2021)
Neural codes: Firing rates and beyond (beta distribution and spike coding)
Analysis and Improvement of Policy Gradient Estimation (NIPS2011) (variance of policy gradient estimator is inversely proportional to $\sigma^2$)
Recurrent predictive state policy networks
A recurrent latent variable model for sequential data
