HyperAgent

Authors: Yingru Li, Jiawei Xu, Lei Han, Zhi-Quan Luo

This repository contains the official implementation of the HyperAgent algorithm, introduced in our ICML 2024 paper Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent.

To integrate the Generative Pre-trained Transformer (GPT) with HyperAgent, see szrlee/GPT-HyperAgent, which targets adaptive foundation models for online decision-making.

HyperAgent Performance

  • Data Efficient ✅: HyperAgent reaches human-level performance (1 IQM) within 1.5M interactions, only 15% of the data required by Double-DQN (DDQN, 2016, DeepMind).
  • Computation Efficient ✅: HyperAgent uses just 5% of the model parameters of the 2023 state-of-the-art algorithm (BBF, DeepMind).
  • Ensemble+ Comparison: Ensemble+ achieves only 0.22 IQM under 1.5M interactions and requires double the parameters of HyperAgent.
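As a quick sanity check on the data-efficiency figure, the arithmetic can be sketched as below. It assumes the 1.5M interactions are HyperAgent's own budget and that "15%" is relative to the DDQN budget. The implied DDQN number is derived here for illustration and is not stated in this README.

```python
# Figures quoted in the bullets above.
hyperagent_interactions = 1_500_000
data_fraction_of_ddqn = 0.15  # HyperAgent uses 15% of DDQN's data

# Implied DDQN budget (an inference from the two figures, not a quoted result).
implied_ddqn_interactions = round(hyperagent_interactions / data_fraction_of_ddqn)
print(f"Implied DDQN budget: {implied_ddqn_interactions:,} interactions")
```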

Installation

cd HyperAgent
pip install -e .

If you encounter an error related to ROMs when using Atari, follow these steps:

  1. Download Roms.rar from the Atari 2600 VCS ROM Collection.

  2. Extract the .rar file to a directory of your choice.

  3. Run the following command:

    python -m atari_py.import_roms <path to extracted folder>

This command will import the ROMs and print their names as they are processed. The ROMs will be copied to your atari_py installation directory.
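Before running the import, it can help to confirm that the extracted folder actually contains ROM files. The following is a minimal sketch using only the standard library. The helper name `find_rom_candidates` and the set of file extensions are assumptions made for illustration and are not part of atari_py.

```python
from pathlib import Path

# Extensions treated as ROM files or ROM archives; an assumption for illustration.
ROM_SUFFIXES = {".bin", ".zip"}


def find_rom_candidates(folder):
    """Return sorted paths under `folder` that look like ROM files or archives."""
    root = Path(folder)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in ROM_SUFFIXES
    )
```

Calling `find_rom_candidates` on the extracted folder and getting an empty list suggests the archive was extracted to a different location, or that an inner archive still needs extracting.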

For detailed instructions on using ROMs, please refer to the official documentation.

Usage

To reproduce the results for Atari (e.g., Pong):

sh experiments/start_atari.sh Pong

To reproduce the results for DeepSea (e.g., size 20):

sh experiments/start_deepsea.sh 20
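To queue several runs, the two commands above can be wrapped in a small driver. This is a hedged sketch and not a script shipped with the repository: `build_command` and `run_all` are hypothetical helpers, and only the script paths and arguments shown above are taken from this README. It defaults to a dry run that prints the commands without executing them.

```python
import subprocess


def build_command(script, argument):
    """Build the argv list for one experiment script invocation."""
    return ["sh", script, str(argument)]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True raises CalledProcessError if a run fails.
            subprocess.run(cmd, check=True)


commands = [
    build_command("experiments/start_atari.sh", "Pong"),
    build_command("experiments/start_deepsea.sh", 20),
]
run_all(commands)  # dry run: prints the commands only
```

Pass `dry_run=False` from the repository root to launch the runs one after another.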

Citation

If you find this work useful in your research, please cite our paper:

@inproceedings{li2024hyperagent,
  title         = {{Q-Star Meets Scalable Posterior Sampling: Bridging Theory and Practice via HyperAgent}},
  author        = {Li, Yingru and Xu, Jiawei and Han, Lei and Luo, Zhi-Quan},
  booktitle     = {Forty-first International Conference on Machine Learning},
  year          = {2024},
  series        = {Proceedings of Machine Learning Research},
  eprint        = {2402.10228},
  archiveprefix = {arXiv},
  primaryclass  = {cs.LG},
  url           = {https://arxiv.org/abs/2402.10228}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.