# Ready Policy One (RP1)

Code to complement "Ready Policy One: World Building through Active Learning".

The code trains an agent inside an ensemble of learned dynamics models.

## General instructions

Configuration files for all experiments in the paper live in the `args_yml` directory. On the machines we trained on, we could run 5 seeds concurrently, so the top-level script `run_experiments.py` launches 5 at once, with a binary flag toggling between seeds 0-4 and seeds 5-9.

To run the HalfCheetah Ready Policy One experiments for seeds 5-9, type the following:

```shell
python run_experiments.py --yaml ./args_yml/main_exp/halfcheetah-rp1.yml --seeds5to9
```

## Citation

```bibtex
@article{rpone2020,
  title={Ready Policy One: World Building Through Active Learning},
  author={Ball, Philip and Parker-Holder, Jack and Pacchiano, Aldo and Choromanski, Krzysztof and Roberts, Stephen},
  journal={Proceedings of the 37th International Conference on Machine Learning},
  year={2020}
}
```

## FAQs

**Why is model-free running so slowly?**

Two reasons: 1) it is not parallelised; 2) the code tries to use GPUs where possible. Try forcing it to run on the CPU.
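One way to force CPU execution — assuming the code relies on PyTorch's standard CUDA discovery, which is an assumption about this codebase — is to hide all GPUs via the `CUDA_VISIBLE_DEVICES` environment variable when launching:

```shell
# Hide all GPUs before launching: with CUDA_VISIBLE_DEVICES set to an
# empty string, torch.cuda.is_available() returns False and PyTorch
# falls back to CPU (assumes standard PyTorch device discovery).
CUDA_VISIBLE_DEVICES="" python run_experiments.py --yaml ./args_yml/main_exp/halfcheetah-rp1.yml
```

If the code selects devices through its own logic rather than `torch.cuda.is_available()`, you may need to edit the device-selection code directly instead.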

## Acknowledgements

The authors acknowledge Nikhil Barhate for his PPO-PyTorch repo. The `ppo.py` file here is a heavily modified version of that code.