RecurrentMaskablePPO is a custom implementation of the Proximal Policy Optimization (PPO) algorithm for environments that need both recurrent state and action masking. It is based on the stable-baselines3-contrib repository, which extends the popular reinforcement learning library stable-baselines3.
- Compatible with environments that have recurrent states and require masking of certain actions.
- Built on top of the stable-baselines3 library, inheriting its modularity and ease of use.
- Efficient and scalable implementation for complex tasks.
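To make the "masking of certain actions" feature concrete, here is a minimal, library-free sketch of how an action mask is commonly applied to a policy's logits. It is an illustration only, not this package's code: the function name `masked_action_probs` and the example values are hypothetical.

```python
import numpy as np


def masked_action_probs(logits, action_mask):
    """Softmax over logits where invalid actions (mask == False) get probability 0."""
    logits = np.asarray(logits, dtype=np.float64)
    mask = np.asarray(action_mask, dtype=bool)
    if not mask.any():
        raise ValueError("at least one action must be valid")
    # Invalid actions are set to -inf so that exp(-inf) == 0 in the softmax.
    masked = np.where(mask, logits, -np.inf)
    # Subtract the largest valid logit for numerical stability.
    shifted = masked - masked[mask].max()
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical example: three actions, the second one is not allowed.
probs = masked_action_probs([1.0, 2.0, 0.5], [True, False, True])
```

The masked action can never be sampled because its probability is exactly zero, while the remaining probabilities are renormalized over the valid actions.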
To install RecurrentMaskablePPO, follow the steps below:
- Make sure you have Python 3.7 or later installed on your system. You can download the latest version from the official Python website.
- Clone this repository:
git clone https://github.com/yourusername/RecurrentMaskablePPO.git
- Navigate to the cloned repository and install stable-baselines3-contrib from requirements.txt:
cd recurrent_maskable
pip install -r requirements.txt
- Install the package in editable mode:
pip install -e .
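After installing, a quick sanity check can confirm that the prerequisites are in place. The sketch below is an optional helper, not part of this repository. It assumes the dependencies are importable as `stable_baselines3` and `sb3_contrib` (the import names used by those libraries), and the function name `check_environment` is hypothetical.

```python
import sys
from importlib.util import find_spec


def check_environment(min_python=(3, 7), modules=("stable_baselines3", "sb3_contrib")):
    """Return a dict mapping each requirement to True (satisfied) or False."""
    report = {"python": sys.version_info >= min_python}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = find_spec(name) is not None
    return report


if __name__ == "__main__":
    for requirement, ok in check_environment().items():
        print(f"{requirement}: {'ok' if ok else 'MISSING'}")
```

If any entry is reported as missing, re-run the corresponding installation step above.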