This repository contains the code for the ICLR 2024 paper Zero-Shot Robustification of Zero-Shot Models (RoboShot).
- WILDS datasets (Waterbirds, CelebA): The code enables automatic download of WILDS datasets (thanks to the amazing WILDS benchmark package!). No extra steps needed here!
- DomainBed datasets (PACS, VLCS): Download the datasets from the DomainBed suite.
- CXR:
- Create new conda environment
conda create -n roboshot python=3.7
conda activate roboshot
- Install required packages
bash env.sh
- Put the absolute path where your datasets should be downloaded in utils/sys_const.py, under the DATA_DIR constant.
- We provide cached ChatGPT concepts that you can use directly without calling the API. However, if you wish to run the full pipeline from scratch and get fresh concepts from ChatGPT, you should:
- Get an OpenAI API key
- Create api_key.py in the utils directory
- Paste the following code:
API_KEY = [your API key string here]
- If you wish to use LLaMA, download its weights here and follow the instructions from HuggingFace here. Then, put the absolute path to your LLaMA weights in utils/sys_const.py, under the LLAMA_PATH constant.
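As a rough sketch of the two path settings above, utils/sys_const.py might end up looking like the following. Only the DATA_DIR and LLAMA_PATH names come from this README; the example paths and the check_absolute helper are hypothetical, and the real file may define other constants that should be left untouched.

```python
import os

# Placeholder values (assumptions): replace with absolute paths on your machine.
DATA_DIR = "/abs/path/to/datasets"
LLAMA_PATH = "/abs/path/to/llama/weights"  # only needed when using LLaMA


def check_absolute(name, path):
    """Hypothetical helper: fail early if a path constant is not absolute."""
    if not os.path.isabs(path):
        raise ValueError("{} must be an absolute path, got: {!r}".format(name, path))
    return path


check_absolute("DATA_DIR", DATA_DIR)
check_absolute("LLAMA_PATH", LLAMA_PATH)
```

A relative path such as data/ would raise a ValueError here, which is easier to diagnose than a failed download or a missing-weights error later in the run.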
Now we are ready to run the code!
python run.py -d=waterbirds -reuse
Flags:
- -d: select dataset (waterbirds/celebA/pacs/cxr/vlcs)
- -clip: select CLIP model (align/alt/openclip_vitl14/openclip_vitb32/openclip_vith14)
- -lm: select LLM to extract insights (chatgpt/llama/gpt2/flan-t5)
- -reuse: reuse the cached ChatGPT output
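
To run several datasets in a row, the flags above can be assembled programmatically. The snippet below is a minimal sketch and not part of the repository: build_command and run_all are hypothetical helper names, and it assumes that -clip and -lm accept the same -flag=value form that the README shows for -d.

```python
import subprocess

# Dataset names as listed under the -d flag.
DATASETS = ["waterbirds", "celebA", "pacs", "cxr", "vlcs"]


def build_command(dataset, clip=None, lm=None, reuse=True):
    """Assemble the argument list for one run.py invocation."""
    cmd = ["python", "run.py", "-d={}".format(dataset)]
    if clip is not None:
        # Assumption: -clip takes its value in the same form as -d.
        cmd.append("-clip={}".format(clip))
    if lm is not None:
        # Assumption: -lm takes its value in the same form as -d.
        cmd.append("-lm={}".format(lm))
    if reuse:
        cmd.append("-reuse")
    return cmd


def run_all(datasets=DATASETS, **kwargs):
    """Run each dataset in turn, stopping at the first failing run."""
    for dataset in datasets:
        subprocess.run(build_command(dataset, **kwargs), check=True)
```

Calling build_command("waterbirds") reproduces the example command from this README as an argument list; run_all() would need to be called from the repository root so that run.py is found.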