# Ready Policy One (RP1)

Code to complement "Ready Policy One: World Building through Active Learning".

The code trains an agent inside an ensemble of learned dynamics models.

## General instructions

Configuration files for all experiments in the paper live in the `args_yml` directory. On the machines we trained on, we could run 5 seeds concurrently, so the top-level script `run_experiments.py` launches 5 at once, with a binary flag toggling between seeds 0-4 and seeds 5-9.

To run the HalfCheetah Ready Policy One experiments for seeds 5-9, type the following:

```shell
python run_experiments.py --yaml ./args_yml/main_exp/halfcheetah-rp1.yml --seeds5to9
```

## Citation

```bibtex
@article{rpone2020,
  title={Ready Policy One: World Building Through Active Learning},
  author={Ball, Philip and Parker-Holder, Jack and Pacchiano, Aldo and Choromanski, Krzysztof and Roberts, Stephen},
  journal={Proceedings of the 37th International Conference on Machine Learning},
  year={2020}
}
```

## FAQs

**Why is model-free running so slowly?**

Two reasons: 1) it is not parallelised; 2) the code tries to use GPUs where possible. Try forcing it to run on the CPU.
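One way to force CPU execution — assuming the code relies on PyTorch's standard CUDA discovery, which is an assumption about this codebase — is to hide all GPUs via the `CUDA_VISIBLE_DEVICES` environment variable when launching:

```shell
# Hide all GPUs before launching: with CUDA_VISIBLE_DEVICES set to an
# empty string, torch.cuda.is_available() returns False and PyTorch
# falls back to CPU (assumes standard PyTorch device discovery).
CUDA_VISIBLE_DEVICES="" python run_experiments.py --yaml ./args_yml/main_exp/halfcheetah-rp1.yml
```

If the code selects devices through its own logic rather than `torch.cuda.is_available()`, you may need to edit the device-selection code directly instead.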

## Acknowledgements

The authors acknowledge Nikhil Barhate for his PPO-PyTorch repo. The `ppo.py` file here is a heavily modified version of that code.