This repository contains code and files to train the ANYmal robot in simulation using reinforcement learning (RL). Besides the raw training code, convenience functions are provided for logging training progress in BRAX. To benefit from the gradients provided by the fully differentiable simulation, applications of Analytical Policy Gradients (APG) are explored.
DISCLAIMER: This code is not official research code, but was instead developed as part of a course project at ETH (course: Digital Humans). No guarantees for its completeness or correctness can be provided.
The given code is made publicly available for the benefit of the robotics/simulation communities, to have a trainable model of a real legged robot.
Authors: Turcan Tuna, Julian Nubert, Jonas Frey, Sahana Betschen
Supervisor: Miguel Zamora
In order to get started, clone this repository together with its submodule(s).
git clone --recurse-submodules https://github.com/leggedrobotics/anymal_brax.git
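If you have already cloned the repository without this flag, you can fetch the submodule(s) afterwards:

git submodule update --init --recursive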
In order to run our code, the following dependencies need to be installed; the main ones are:
- Torch
- JAX
- BRAX
We provide a conda environment file to simplify this installation. Execute:
conda env create -f conda/anymal-brax-3.11.yml
Next, activate the environment (the environment name does not include the .yml extension):

conda activate anymal-brax-3.11
The most brittle part of the setup is usually installing JAX with working GPU support. To verify that the installation is correct, run the following script that we provide:
python3 bin/test_jax_gpu.py
If this outputs something like
Fast took 0.03966069221496582
Slow took 0.11339068412780762
1.0
-1.0
-1.0
you are good to go.
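For a quick manual sanity check without the repository script, a minimal sketch along the following lines can also be used; it only assumes a working jax installation and mirrors the idea behind the timings above:

```python
import time

import jax
import jax.numpy as jnp

# Should list accelerator devices, e.g. [cuda(id=0)].
# If this prints CPU devices only, the GPU install did not take.
print(jax.devices())

x = jnp.ones((4000, 4000))
f = jax.jit(lambda a: (a @ a).sum())

# Warm-up: the first call includes JIT compilation.
f(x).block_until_ready()

start = time.time()
f(x).block_until_ready()  # block_until_ready waits for the async computation
print("took", time.time() - start)
```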
As a next step, you can install our environment. This will also install the BRAX library at release v0.9.1 for you.
pip install -e submodules/brax
pip install -e .
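To double-check that the editable install picked up the pinned BRAX release, a quick version query with standard tooling can help (it should report 0.9.1):

python3 -c "import importlib.metadata as md; print(md.version('brax'))"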
For monitoring the training runs we provide an MLflow interface, which lets you navigate through the training runs as shown in this example:
In order to do so, you have to start an MLflow server in the root directory of this repository. This will then visualize the runs in the mlruns folder, which is used for logging during training.
To start the server, run:
mlflow server
This should output something like:
[2023-06-15 18:35:07 +0200] [576067] [INFO] Starting gunicorn 20.1.0
[2023-06-15 18:35:07 +0200] [576067] [INFO] Listening at: http://127.0.0.1:5000 (576067)
[2023-06-15 18:35:07 +0200] [576067] [INFO] Using worker: sync
[2023-06-15 18:35:07 +0200] [576068] [INFO] Booting worker with pid: 576068
[2023-06-15 18:35:07 +0200] [576069] [INFO] Booting worker with pid: 576069
[2023-06-15 18:35:07 +0200] [576070] [INFO] Booting worker with pid: 576070
[2023-06-15 18:35:07 +0200] [576071] [INFO] Booting worker with pid: 576071
Hold Ctrl and click on the link (http://127.0.0.1:5000) to open the browser interface of the training runs.
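By default, mlflow server listens on 127.0.0.1:5000 and reads from the local mlruns folder. If the port clashes with another service, both settings can be passed explicitly:

mlflow server --backend-store-uri ./mlruns --host 127.0.0.1 --port 5001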
On the left you will be able to see all the experiments (such as apg_reacher, ppo_anymal, etc.).
Models and videos of the training run can be seen for each experiment run at the bottom in the section "Artifacts".
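For orientation, the snippet below illustrates roughly how such metrics and artifacts end up in MLflow; it uses the plain MLflow Python API with made-up values and is not necessarily the exact interface our training code wraps:

```python
import mlflow

# Point the client at the local server started above.
mlflow.set_tracking_uri("http://127.0.0.1:5000")
mlflow.set_experiment("ppo_anymal")  # appears in the experiment list on the left

with mlflow.start_run():
    mlflow.log_param("num_envs", 2048)
    # During training, log whatever the progress callback reports.
    for step, reward in [(0, 1.2), (100, 3.4), (200, 7.8)]:
        mlflow.log_metric("episode_reward", reward, step=step)
    # Saved models and rendered videos show up under "Artifacts", e.g.:
    # mlflow.log_artifact("policy_params.pkl")
```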
In order to train walking motions for the quadrupedal robot ANYmal, a new environment first had to be created; it can be found in this folder. To execute the training, simply run:
train_anymal_ppo.py
The expected outcome of the training will look something like this:
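Under the hood this relies on BRAX's PPO trainer. The condensed sketch below shows what such a training call looks like in BRAX v0.9.1; the environment name 'anymal' and all hyperparameters are illustrative placeholders, not the values used in train_anymal_ppo.py:

```python
from brax import envs
from brax.training.agents.ppo import train as ppo

# 'anymal' is a placeholder: the custom environment has to be registered
# with brax.envs under whatever name the repository uses.
env = envs.get_environment('anymal', backend='generalized')

def progress(step, metrics):
    # Called periodically by the trainer; MLflow logging can be hooked in here.
    print(step, metrics.get('eval/episode_reward'))

make_inference_fn, params, _ = ppo.train(
    environment=env,
    num_timesteps=30_000_000,
    episode_length=1000,
    num_envs=2048,
    batch_size=1024,
    num_minibatches=16,
    unroll_length=10,
    num_updates_per_batch=4,
    discounting=0.97,
    learning_rate=3e-4,
    entropy_cost=1e-2,
    seed=0,
    progress_fn=progress,
)
```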
The reacher can be trained using PPO:
train_reacher_ppo.py
The result can be found in MLflow under the Experiments section as ppo_reacher. It should look something like:
As one of our main motivations for using BRAX was the availability of gradients, thanks to the differentiable nature of BRAX and JAX, we experimented with the simplest method of its kind: the Analytical Policy Gradient (APG). As it is not clear how well direct gradient learning scales, due to issues such as
- local optima and
- vanishing/exploding gradients in longer trajectories,
we first applied APG to the simpler 2-dimensional reacher example (a minimal sketch of the idea is given at the end of this section). To run the training, simply run:
train_reacher.py
With this we are able to achieve the following results:
So far we have not been able to achieve walking behavior for ANYmal using APG alone. However, we successfully used APG to achieve stand-up motions; the training code for this will follow shortly.
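To make the APG idea concrete, here is a minimal, hand-rolled sketch on the built-in BRAX reacher environment: the total reward of a differentiable rollout is treated as a loss, and jax.grad backpropagates through the simulation steps. It is deliberately simplified (linear policy, single rollout, plain gradient descent, episode termination ignored) and is not our training code:

```python
import jax
import jax.numpy as jnp
from brax import envs

# The 'generalized' backend is differentiable end-to-end.
env = envs.get_environment('reacher', backend='generalized')

def policy(params, obs):
    # Tiny linear policy, purely illustrative.
    return jnp.tanh(obs @ params['w'] + params['b'])

def rollout_loss(params, rng, horizon=50):
    state = env.reset(rng)

    def step(state, _):
        action = policy(params, state.obs)
        next_state = env.step(state, action)
        return next_state, next_state.reward

    _, rewards = jax.lax.scan(step, state, None, length=horizon)
    return -rewards.sum()  # negative return, so gradient descent maximizes reward

params = {
    'w': jnp.zeros((env.observation_size, env.action_size)),
    'b': jnp.zeros(env.action_size),
}
grad_fn = jax.jit(jax.grad(rollout_loss))
rng = jax.random.PRNGKey(0)
for _ in range(100):
    grads = grad_fn(params, rng)
    params = jax.tree_util.tree_map(lambda p, g: p - 1e-3 * g, params, grads)
```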