Learning Transferrable and Adaptive Control Policies

This repository contains implementations of transfer learning algorithms described in the following papers:

Learning Fast Adaptation with Meta Strategy Optimization, ICRA 2020

Policy Transfer with Strategy Optimization, ICLR 2019

Prepare for the Unknown: Learning a Universal Policy with Online System Identification, RSS 2017

Prerequisites

To use this code you need to install OpenAI Baselines, Dart and PyDart2.

You can find detailed instructions for installing OpenAI Baselines here. For installing Dart and PyDart2, you can follow the installation instructions here.

Note that the environments also depend on OpenAI Gym, which should be installed along with Baselines.
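As a quick sanity check before installing this package, the short script below reports which prerequisites cannot be found on the Python import path. It is a minimal sketch, not part of this repository: the import names `baselines`, `pydart2` and `gym` are assumed to be the ones the packages above install under, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Import names assumed for the prerequisites listed above.
REQUIRED_MODULES = ["baselines", "pydart2", "gym"]


def missing_modules(names):
    """Return the module names that cannot be located on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing prerequisites:", ", ".join(missing))
    else:
        print("All prerequisites found.")
```

The check only locates each module and does not import it, so it will not catch a broken Dart build behind PyDart2.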

Installation

Run the following command from the project directory:

pip install -e .

How to use

SO-CMA

SO-CMA has two stages: training a universal policy and strategy optimization.

To train a universal policy, use the code in ppo. For the strategy optimization part, use the code in test_socma.
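To illustrate the idea behind the second stage, the toy sketch below searches over the strategy input of a fixed policy to maximize return in a target environment with unknown dynamics. It is not the code in test_socma: the one-dimensional system, `universal_policy`, `rollout_return` and `optimize_strategy` are all hypothetical, and a simple elitist Gaussian search stands in for CMA-ES.

```python
import random


def universal_policy(state, strategy):
    """Toy stand-in for a trained universal policy conditioned on a strategy vector."""
    return -strategy[0] * state


def rollout_return(strategy, target_gain=0.5, steps=20):
    """Run one rollout in a toy target system x' = x + target_gain * action.

    The return is the negative sum of squared states, so driving x to zero
    quickly scores best. target_gain plays the role of the unknown dynamics.
    """
    x = 1.0
    total = 0.0
    for _ in range(steps):
        total -= x * x
        action = universal_policy(x, strategy)
        x = x + target_gain * action
    return total


def optimize_strategy(initial, iterations=30, population=8, sigma=0.3, seed=0):
    """Elitist Gaussian search over the strategy (a simple substitute for CMA-ES)."""
    rng = random.Random(seed)
    best = list(initial)
    best_return = rollout_return(best)
    for _ in range(iterations):
        for _ in range(population):
            candidate = [v + rng.gauss(0.0, sigma) for v in best]
            candidate_return = rollout_return(candidate)
            if candidate_return > best_return:
                best, best_return = candidate, candidate_return
    return best, best_return


if __name__ == "__main__":
    strategy, ret = optimize_strategy([0.5])
    print("strategy:", strategy, "return:", ret)
```

Only the strategy input changes during the search; the policy itself stays fixed, which is what keeps the number of target-environment rollouts small.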

An example that transfers a policy from the Dart hopper to the MuJoCo hopper can be found in examples. To train the universal policy, run:

examples/socma_hopper_5d_train.sh

The training results will be saved to data/.

To perform strategy optimization, run:

examples/socma_hopper_5d_test.sh

You can also use test_policy.py to test individual policies.

UP-OSI

Training UP-OSI involves two steps: training a universal policy and training an online system identification model.

To train a universal policy, use the code in ppo. To train the online system identification model, use the code in train_osi.
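The role of the system identification step can be sketched on a toy problem: estimate the dynamics parameter from recent transitions, then hand the estimate to a policy that is conditioned on it. This is not the code in train_osi. The one-dimensional system and the helpers `collect_transitions` and `estimate_gain` are hypothetical, and a closed-form least-squares fit stands in for the learned identification model.

```python
def collect_transitions(true_gain, policy_gain=1.0, steps=10):
    """Roll out a toy system x' = x + true_gain * action and record (x, action, x')."""
    x = 1.0
    transitions = []
    for _ in range(steps):
        action = -policy_gain * x
        x_next = x + true_gain * action
        transitions.append((x, action, x_next))
        x = x_next
    return transitions


def estimate_gain(transitions):
    """Least-squares estimate of the gain b in x' - x = b * action.

    Returns None when every recorded action is zero, since b is then unobservable.
    """
    numerator = sum(a * (x_next - x) for x, a, x_next in transitions)
    denominator = sum(a * a for _, a, _ in transitions)
    if denominator == 0.0:
        return None
    return numerator / denominator


if __name__ == "__main__":
    history = collect_transitions(true_gain=0.5)
    estimated = estimate_gain(history)
    # A policy conditioned on the estimate could, for example, use gain 1 / estimated.
    print("estimated gain:", estimated)
```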

An example training script for the hopper environment is available in examples. Use the following command to run it:

examples/uposi_hopper_2d_train.sh

The training results will be saved to data/.

To test the resulting controller, run:

examples/uposi_hopper_2d_test.sh

and follow the prompt in the terminal. After each rollout a plot of the estimated model parameters and true model parameters is shown.

ODE Internal Error

If you see errors like: ODE INTERNAL ERROR 1: assertion "d[i] != dReal(0.0)" failed in _dLDLTRemove(), try downloading lcp.cpp and replacing the one in dart/external/odelcpsolver/ with it. Recompile Dart and PyDart2 afterward and the issue should be gone.

Additional feedback

Please contact Wenhao Yu (stacormed@gmail.com) if you have any feedback or questions about this work.
