This repository provides an implementation of Rank-One Model Editing (ROME) on auto-regressive transformers (GPU-only). We currently support OpenAI's GPT-2 XL (1.5B) and EleutherAI's GPT-J (6B). The release of a 20B GPT-like model from EleutherAI is expected soon; we hope to support it ASAP.
Feel free to open an issue if you find any problems; we are actively developing this repository and will monitor tickets closely.
- Installation
- Causal Tracing
- Rank-One Model Editing (ROME)
- CounterFact Dataset
- Evaluation
- How to Cite
We recommend conda for managing Python, CUDA, and PyTorch-related dependencies, and pip for everything else. To get started, simply install conda and run:

```bash
./scripts/setup_conda.sh
```
`notebooks/causal_trace.ipynb` demonstrates Causal Tracing, which can be modified to apply tracing to the processing of any statement.
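The core idea behind Causal Tracing can be sketched without a real transformer: run a clean input and cache each hidden state, corrupt the input with noise, then re-run while restoring one cached state at a time and see how much of the clean output comes back. The toy stack below is an illustration only (the notebook does this on an actual GPT via forward hooks; none of these names come from the repo):

```python
import numpy as np

# Toy stand-in for a transformer stack: four nonlinear "layers".
rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 8)) for _ in range(4)]

def forward(x, restore_layer=None, cached=None):
    """Run the stack; optionally overwrite one layer's output with a
    cached clean hidden state -- the patching step of causal tracing."""
    states, h = [], x
    for i, W in enumerate(weights):
        h = np.tanh(W @ h)
        if restore_layer == i:
            h = cached[i]          # patch in the clean hidden state
        states.append(h)
    return h, states

x_clean = rng.standard_normal(8)
out_clean, clean_states = forward(x_clean)          # (1) clean run, cache states
x_corrupt = x_clean + 3.0 * rng.standard_normal(8)  # (2) corrupt the input
out_corrupt, _ = forward(x_corrupt)

# (3) Restoring each layer in turn: recovery near 1 means that layer's
# hidden state carries the information destroyed by the corruption.
for i in range(len(weights)):
    out_i, _ = forward(x_corrupt, restore_layer=i, cached=clean_states)
    recovery = 1 - np.linalg.norm(out_i - out_clean) / np.linalg.norm(out_corrupt - out_clean)
    print(f"restore layer {i}: recovery {recovery:.2f}")
```

On a real model the restored quantity is the probability of the correct completion rather than an output-vector distance, but the patch-and-measure loop is the same.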
`notebooks/rome.ipynb` demonstrates ROME. The API is simple: you just specify a requested rewrite of the following form:
```python
request = {
    "prompt": "{} plays the sport of",
    "subject": "LeBron James",
    "target_new": {
        "str": "football"
    }
}
```
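The `{}` placeholder in `prompt` is the slot where the subject is substituted (a quick illustration of the template convention, not a call into the repo's code):

```python
request = {
    "prompt": "{} plays the sport of",
    "subject": "LeBron James",
    "target_new": {"str": "football"},
}

# The subject fills the `{}` slot to form the full prompt text.
full_prompt = request["prompt"].format(request["subject"])
print(full_prompt)  # LeBron James plays the sport of
```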
Several similar examples are included in the notebook.
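The "rank-one" in ROME's name refers to the closed-form weight update from the paper: a rank-one matrix is added to one MLP projection so that a chosen key vector k* maps to a new value v*, while other stored associations are perturbed as little as possible under the key covariance C = KKᵀ. A minimal numpy sketch of that update, with toy dimensions and random data standing in for real model activations:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, n = 6, 4, 100
K = rng.standard_normal((d_in, n))      # keys already stored in the layer
W = rng.standard_normal((d_out, d_in))  # original projection weight
C = K @ K.T                             # uncentered key covariance, C = K K^T

k_star = rng.standard_normal(d_in)      # key selecting the edited fact
v_star = rng.standard_normal(d_out)     # value encoding the new fact

# Rank-one update: W' = W + (v* - W k*) (C^{-1} k*)^T / (k*^T C^{-1} k*)
u = np.linalg.solve(C, k_star)          # C^{-1} k*
W_new = W + np.outer(v_star - W @ k_star, u) / (u @ k_star)

# The edited weight now maps k* exactly to v*.
print(np.allclose(W_new @ k_star, v_star))  # True
```

The division by `u @ k_star` is safe here because C is positive definite (n > d_in), so k*ᵀC⁻¹k* > 0.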
Description coming soon!
We compare ROME against several state-of-the-art model editors, each implemented in its own folder under `baselines/`. The implementations are not our own; they are adapted slightly to plug into our evaluation system.
- Knowledge Neurons (KN): Dai et al. [Code] [Paper]
- Knowledge Editor (KE): De Cao et al. [Code] [Paper]
- Model Editor Networks with Gradient Decomposition (MEND): Mitchell et al. [Code] [Paper]
Description coming soon!
```bibtex
@article{meng2022locating,
  title={Locating and Editing Factual Knowledge in GPT},
  author={Kevin Meng and David Bau and Alex Andonian and Yonatan Belinkov},
  journal={arXiv preprint arXiv:2202.05262},
  year={2022}
}
```