This repository contains the source code used to produce the results of our 2023 IFAC World Congress submission (an extended version is available here).
In this work, we propose a straightforward yet effective algorithm for ensuring safety in Reinforcement Learning (RL) with Model Predictive Control (MPC) as function approximation. The unknown constraints encoding safety are learnt from observed MPC trajectories via Gaussian Process (GP) regression and are then enforced on the RL agent, guaranteeing that the MPC controller is safe with high probability.
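For intuition, the snippet below is a minimal, self-contained sketch of this idea; it is not the repository's implementation and uses scikit-learn's GP regressor instead of the code in this repo. The data, kernel, and confidence level are illustrative assumptions only.

```python
# Minimal, illustrative sketch (NOT the repository's implementation): learn an unknown
# constraint function h(x) from states visited by the MPC controller via GP regression,
# then check a high-probability upper bound on h at a new candidate state.
# All data, the kernel, and the confidence level are assumptions for illustration.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
X_observed = rng.uniform(-1.0, 1.0, size=(50, 2))                       # visited states
h_observed = np.sin(X_observed[:, 0]) + 0.1 * rng.standard_normal(50)   # noisy h(x) samples

# fit a GP to the observed constraint values
gp = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(), normalize_y=True)
gp.fit(X_observed, h_observed)

# a candidate state is deemed safe with high probability if the GP's upper
# confidence bound on h is non-positive, i.e., mean + beta * std <= 0
x_candidate = np.array([[0.2, -0.4]])
mean, std = gp.predict(x_candidate, return_std=True)
beta = 1.96  # roughly a 95% confidence level
upper_bound = float(mean[0] + beta * std[0])
print(f"upper bound on h(x): {upper_bound:.3f}, safe: {upper_bound <= 0.0}")
```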
If you find the paper or this repository helpful in your publications, please consider citing it.
@article{airaldi20235759,
title = {Learning safety in model-based Reinforcement Learning using MPC and Gaussian Processes},
journal = {IFAC-PapersOnLine},
volume = {56},
number = {2},
pages = {5759-5764},
year = {2023},
note = {22nd IFAC World Congress},
doi = {https://doi.org/10.1016/j.ifacol.2023.10.563},
author = {Filippo Airaldi and Bart De Schutter and Azita Dabiri},
}
The code was created with Python 3.9.5. To access it, clone the repository
git clone https://github.com/FilippoAiraldi/learning-safety-in-mpc-based-rl.git
cd learning-safety-in-mpc-based-rl
and then install the required packages by, e.g., running
pip install -r requirements.txt
The repository code is structured as follows:
- agents: contains the RL algorithms used within the paper:
  - the Perfect-Knowledge agent, a non-learning agent with exact information on the quadrotor drone dynamics
  - the LSTD Q-learning agent, in both its safe and unsafe variants, i.e., with and without our proposed algorithm, respectively
- envs: contains the quadrotor environment (in OpenAI's gym style) used in the numerical experiment
- mpc: contains the implementation (based on CasADi) of the MPC optimization scheme
- resouces: contains media and other miscellaneous resources
- sim: contains pickle-serialized simulation results of the different agents
- util: contains utility classes and functions for, e.g., plotting, I/O, exceptions, etc.
- train.py: launches simulations for the different agents
- visualization.py: visualizes the simulation results
Training simulations can easily be launched from the command line. The default arguments are already set to yield the results found in the paper. To reproduce the simulation results, run the following command with one of the 3 available agent flags
python train.py (--pk | --lstdq | --safe_lstdq)
Note that only one agent can be simulated at a time. Results will be saved under the filename ${runname}.pkl.
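If you want to peek at a saved result outside of visualization.py, a generic pickle load is enough to get started. The snippet below is only a sketch: the filename is hypothetical and the structure of the loaded object depends on the repository's serialization code.

```python
# Generic sketch for inspecting a saved simulation result; the exact structure of
# the stored object depends on how the repository serializes it, so this only shows
# the standard pickle round-trip (the filename is hypothetical).
import pickle

with open("lstdq.pkl", "rb") as f:
    results = pickle.load(f)

print(type(results))  # inspect the top-level object before digging further
```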
To visualize simulation results, simply run
python visualization.py ${runname1}.pkl ... ${runnameN}.pkl
You can additionally pass --papermode, which will cause the paper figures to be created (in this case, the simulation results filepaths are hardcoded).
The repository is provided under the GNU General Public License. See the LICENSE file included with this repository.
Filippo Airaldi, PhD Candidate [f.airaldi@tudelft.nl | filippoairaldi@gmail.com]
Delft Center for Systems and Control, Delft University of Technology
This research is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (Grant agreement No. 101018826 - CLariNet).
Copyright (c) 2023 Filippo Airaldi.
Copyright notice: Technische Universiteit Delft hereby disclaims all copyright interest in the program “learning-safety-in-mpc-based-rl” (Learning safety in model-based Reinforcement Learning using MPC and Gaussian Processes) written by the Author(s). Prof. Dr. Ir. Fred van Keulen, Dean of 3mE.