yashonwu/Grounding_LLMs_with_online_RL

Grounding Large Language Models with Online Reinforcement Learning

This repository contains the code used for our paper Grounding Large Language Models with Online Reinforcement Learning. We perform functional grounding of LLMs' knowledge in BabyAI-Text:

[Figure: Main schema]

We then perform an in-depth analysis of the generalization abilities of our trained agents:

[Figure: Generalization schema]

We release our BabyAI-Text environment along with the code to perform our experiments (both training agents and evaluating their performance). We rely on the Lamorel library to use LLMs.

Our repository is structured as follows:

📦 Grounding_LLMs_with_online_RL
┣ 📂 babyai-text -- our BabyAI-Text environment
┣ 📂 experiments -- code for our experiments
┃ ┣ 📂 agents -- implementation of all our agents
┃ ┃ ┣ 📂 bot -- bot agent leveraging BabyAI's bot
┃ ┃ ┣ 📂 random_agent -- agent playing uniformly at random
┃ ┃ ┣ 📂 drrn -- DRRN agent from here
┃ ┃ ┣ 📂 ppo -- agents using PPO
┃ ┃ ┃ ┣ 📜 symbolic_ppo_agent.py -- SymbolicPPO adapted from BabyAI's PPO
┃ ┃ ┃ ┗ 📜 llm_ppo_agent.py -- our LLM agent grounded using PPO
┃ ┣ 📂 configs -- Lamorel configs for our experiments
┃ ┣ 📂 slurm -- utility scripts to launch our experiments on a SLURM cluster
┃ ┣ 📂 campaign -- SLURM scripts used to launch our experiments
┃ ┣ 📜 train_language_agent.py -- train agents using BabyAI-Text (LLMs and DRRN) -> contains our implementation of PPO loss for LLMs as well as additional heads on top of LLMs
┃ ┣ 📜 train_symbolic_ppo.py -- train SymbolicPPO on BabyAI (with BabyAI-Text's tasks)
┃ ┣ 📜 post-training_tests.py -- generalization tests of trained agents
┃ ┣ 📜 test_results.py -- utils to format results
┃ ┗ 📜 clm_behavioral-cloning.py -- code to perform Behavioral Cloning on an LLM using trajectories
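
The tree above notes that train_language_agent.py contains a PPO loss for LLMs. As a rough illustration of what a PPO policy loss computes, here is a minimal sketch of the standard clipped surrogate objective in plain Python. It is not the code from this repository: the function name, argument names, and the default clipping value of 0.2 are assumptions made for the example, and the actual implementation (which works on LLM outputs and additional heads) will differ.

```python
import math
from typing import Sequence


def ppo_clipped_policy_loss(
    new_logprobs: Sequence[float],
    old_logprobs: Sequence[float],
    advantages: Sequence[float],
    clip_eps: float = 0.2,  # assumed value, not taken from the repository
) -> float:
    """Standard PPO clipped surrogate loss, averaged over samples.

    Hypothetical helper for illustration only. Each entry corresponds to one
    (state, action) pair: the action's log-probability under the current
    policy, under the policy that collected the data, and its advantage.
    """
    if not (len(new_logprobs) == len(old_logprobs) == len(advantages)):
        raise ValueError("all inputs must have the same length")
    if len(new_logprobs) == 0:
        raise ValueError("inputs must not be empty")

    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the current and the data-collecting policy.
        ratio = math.exp(new_lp - old_lp)
        # Restrict the ratio to [1 - clip_eps, 1 + clip_eps].
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (lower) bound of the unclipped and clipped objectives.
        total += min(ratio * adv, clipped_ratio * adv)

    # The objective is maximized, so the loss is its negated mean.
    return -total / len(new_logprobs)
```

When the current and old log-probabilities are equal, every ratio is 1 and the loss reduces to the negated mean advantage. Clipping only takes effect once the policy has moved away from the one that collected the data.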

Installation steps

  1. Create conda env
conda create -n dlp python=3.10.8; conda activate dlp
  2. Install PyTorch
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch
  3. Install packages required by our package
pip install -r requirements.txt
  4. Install BabyAI-Text: see the installation details in the babyai-text package
  5. Install Accelerate
cd v0.13.2/accelerate-0.13.2; pip install -e .; cd ../..
  6. Install Lamorel
git clone https://github.com/ClementRomac/lamorel.git; cd lamorel/lamorel; pip install -e .; cd ../..
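
After the installation steps, a quick sanity check can confirm that the main packages are importable in the active environment. The snippet below is a minimal sketch and not part of this repository: the helper name `check_modules` is hypothetical, and the module names passed to it (`torch`, `accelerate`, `lamorel`, `babyai_text`) are assumptions about how the installed packages are exposed, so adjust them if your setup differs.

```python
import importlib.util


def check_modules(names: list[str]) -> dict[str, bool]:
    """Report, for each top-level module name, whether it can be found.

    Hypothetical helper: it only looks the modules up and does not import
    them, so it does not verify versions or GPU support.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed module names; they may not match the installed packages exactly.
    expected = ["torch", "accelerate", "lamorel", "babyai_text"]
    for name, found in check_modules(expected).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

A module reported as missing usually means the corresponding step was run outside the `dlp` conda environment or that the editable install did not complete.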

Launch

Please use Lamorel along with our configs (the configs directory). You can find examples of our training scripts in the campaign directory.
