This codebase contains the implementation of the algorithms and environments evaluated in A Comparison of Imitation Learning Algorithms for Bimanual Manipulation.
Install the conda environment. Depending on your platform, install the correct <arch> = x86 or <arch> = arm environment, located in the requirements folder. The installation commands for the corresponding algorithms are given below:
conda env create -f torch_<arch>.yml
for: [ACT, Diffusion] (and experimental: BC)
conda env create -f tensorflow_<arch>.yml
for: [IBC]
conda env create -f theano_x86.yml
for: [GAIL, DAgger, BC] (x86 only)
Activate the conda environment. The conda environments for the corresponding algorithms are given below:
conda activate irl_torch
for: [ACT, Diffusion] (and experimental: BC)
conda activate irl_tensorflow
for: [IBC]
conda activate irl_theano
for: [GAIL, DAgger, BC]
Install bimanual_imitation. Change to the main repo directory and run the following to install the bimanual_imitation library:
pip install -e .
To train the policies, go to the bimanual_imitation/algorithms folder and run:
conda activate <conda_env_name>
python imitate_<algorithm>.py
where:
- <algorithm> is replaced with the method you intend to use: for example, diffusion
- <conda_env_name> is replaced with the matching environment from Step 2 (activating the conda environment) in the section above: for example, irl_torch
Note: you can pass --help to the script to view the available arguments. You can also modify the default configs as needed.
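The pairing between algorithm and conda environment described above can be captured in a small lookup. The following is a minimal sketch and not part of the repository: the helper name training_commands is hypothetical, and the lowercase algorithm keys other than diffusion and act are assumptions, so check the actual imitate_<algorithm>.py script names in the algorithms folder before relying on them.

```python
from __future__ import annotations

# Hypothetical helper (not part of the repository): maps an algorithm to the
# conda environment listed in the setup steps and builds the shell commands.
# The keys "ibc", "gail", "dagger", and "bc" are assumed spellings.
CONDA_ENV_FOR_ALGORITHM = {
    "act": "irl_torch",
    "diffusion": "irl_torch",
    "ibc": "irl_tensorflow",
    "gail": "irl_theano",
    "dagger": "irl_theano",
    "bc": "irl_theano",  # BC is also listed as experimental under irl_torch
}


def training_commands(algorithm: str) -> list[str]:
    """Return the two shell commands used to train `algorithm`."""
    key = algorithm.lower()
    if key not in CONDA_ENV_FOR_ALGORITHM:
        raise ValueError(f"Unknown algorithm: {algorithm!r}")
    return [
        f"conda activate {CONDA_ENV_FOR_ALGORITHM[key]}",
        f"python imitate_{key}.py",
    ]


if __name__ == "__main__":
    # Print the commands rather than executing them.
    for command in training_commands("diffusion"):
        print(command)
```

The sketch only builds command strings; run them yourself from the bimanual_imitation/algorithms folder.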
To visualize the gymnasium environment used for training the policies, go to the irl_environments folder and run:
python bimanual_quad_insert.py
You can modify the bottom of the script to run different environments, export gifs, and plot the features.
Environments are named as quad_insert_<action_noise><observation_noise>, where:
- <action_noise> is replaced with: a0 (None), aL (Low), aM (Medium/High)
- <observation_noise> is replaced with: o0 (None), oL (Low), oM (Medium/High)
For example: quad_insert_a0o0, quad_insert_aLoL, and quad_insert_aMoM.
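The naming scheme above can be enumerated and decoded programmatically. This is a minimal sketch, not code from the repository: the function names all_env_names and describe are hypothetical, and it assumes every action/observation combination is a valid environment, which the text above does not confirm beyond the three examples.

```python
from itertools import product

# Noise-level codes and their meanings, taken from the naming scheme above.
ACTION_NOISE = {"a0": "None", "aL": "Low", "aM": "Medium/High"}
OBSERVATION_NOISE = {"o0": "None", "oL": "Low", "oM": "Medium/High"}
PREFIX = "quad_insert_"


def all_env_names():
    """Enumerate every <action_noise><observation_noise> combination.

    Assumption: all nine combinations exist; only three are shown as examples.
    """
    return [f"{PREFIX}{a}{o}" for a, o in product(ACTION_NOISE, OBSERVATION_NOISE)]


def describe(env_name):
    """Return the (action noise, observation noise) labels for a name."""
    if not env_name.startswith(PREFIX):
        raise ValueError(f"Not a quad_insert environment: {env_name!r}")
    suffix = env_name[len(PREFIX):]
    action_code, observation_code = suffix[:2], suffix[2:]
    if action_code not in ACTION_NOISE or observation_code not in OBSERVATION_NOISE:
        raise ValueError(f"Unrecognized noise codes in: {env_name!r}")
    return ACTION_NOISE[action_code], OBSERVATION_NOISE[observation_code]


if __name__ == "__main__":
    for name in all_env_names():
        print(name, describe(name))
```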
To train the policies on the cluster, go to the bimanual_imitation folder and run:
python pipeline.py --phase <phase_name> --alg <alg_name> --env_name <gym_env_name>
Example Usage:
python pipeline.py --phase 3_train --alg act --env_name quad_insert_aLoL
Note: you can pass --help to the script to view the available arguments. To run locally, pass the --run_local argument. Modify the slurm configuration according to your available resources.
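When sweeping over several environments, it can help to generate the pipeline invocations rather than type them by hand. The sketch below is illustrative only: pipeline_command is a hypothetical helper, it uses just the flags mentioned above (--phase, --alg, --env_name, --run_local), and it assumes --run_local is a switch that takes no value. Confirm the real arguments with --help.

```python
import shlex


def pipeline_command(phase, alg, env_name, run_local=False):
    """Build a pipeline.py command line from the documented flags.

    Assumption: --run_local is a boolean switch with no value.
    """
    args = [
        "python", "pipeline.py",
        "--phase", phase,
        "--alg", alg,
        "--env_name", env_name,
    ]
    if run_local:
        args.append("--run_local")
    # shlex.join quotes arguments safely for a POSIX shell.
    return shlex.join(args)


if __name__ == "__main__":
    # Print one command per example environment; nothing is executed here.
    for env_name in ["quad_insert_a0o0", "quad_insert_aLoL", "quad_insert_aMoM"]:
        print(pipeline_command("3_train", "act", env_name, run_local=True))
```

Printing the commands first makes it easy to review them before pasting into a shell or a job script.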
After running the pipeline, you can generate rollouts and videos of the policy by running:
python generate_trajs.py
The expert datasets (stored as protobuf files) are located in the irl_data/expert_trajectories folder. You can extract these to a trajectory class using the load_trajs function inside of the proto_logger.
Please refer to the algorithms folder, where you can find work from the original authors of the compared methods.