This repository contains the code for reproducing the results reported in the following paper:
Orhan AE (2023) Recognition, recall, and retention of few-shot memories in large language models. arXiv:2303.17557.
The repository contains three Python files, `train.py`, `test.py`, and `generate.py` (all modified from the Huggingface causal language modeling example here), used to train (or finetune) a model, to run a recognition test, and to run a recall test, respectively. Some usage examples for these files are given below.
- Finetune a `gpt-j-6B` model with the study sentences in `seen_data_0.json` for 1 epoch (1 exposure) on 4 GPUs (with a total batch size of 4x4=16 sentences) using the Huggingface Accelerate framework (see the example config file here); a sketch of the assumed training-file format follows the command:
```bash
accelerate launch --config_file accelerate_config.yaml --num_cpu_threads_per_process 4 train.py \
    --model_name_or_path "EleutherAI/gpt-j-6B" \
    --train_file "data/llm-experiment-data/expt1/seen_data_0.json" \
    --per_device_train_batch_size 4 \
    --learning_rate 0.00001 \
    --output_dir OUTPUT_DIR \
    --save_prefix INFORMATIVE_SAVE_PREFIX \
    --block_size 128 \
    --num_train_epochs 1 \
    --overwrite_cache
```
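The training file is assumed here to be a JSON Lines file in which each record stores its sentence under a `text` field, as in the Huggingface causal language modeling example these scripts were adapted from; the files in `data/llm-experiment-data` may use a different schema, so treat this as a sketch rather than the definitive format:

```python
# Minimal sketch: write a few study sentences to a JSON Lines file with a
# "text" field (assumed format; check the repository's data files for the
# exact schema expected by train.py).
import json

study_sentences = [
    "The quick brown fox jumps over the lazy dog.",
    "A second made-up study sentence goes here.",
]

with open("seen_data_example.json", "w") as f:
    for sentence in study_sentences:
        f.write(json.dumps({"text": sentence}) + "\n")
```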
- Run a recognition test on a model with the study sentences in `seen_data_0.json` and foils in `unseen_data_0.json`; the idea behind the test is sketched after the command:
```bash
python -u test.py \
    --model_name_or_path MODEL_PATH \
    --seen_file "data/llm-experiment-data/expt1/seen_data_0.json" \
    --unseen_file "data/llm-experiment-data/expt1/unseen_data_0.json" \
    --per_device_eval_batch_size 1 \
    --output_dir OUTPUT_DIR \
    --save_prefix INFORMATIVE_SAVE_PREFIX \
    --block_size 128 \
    --overwrite_cache
```
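Conceptually, a recognition test compares how well the model predicts the studied sentences versus the unseen foils. The sketch below is only an illustration of that idea and not the procedure implemented in `test.py` (consult that file for the actual evaluation); it scores each sentence by its average token-level loss under the model and treats lower loss as evidence of prior exposure. The model path and the JSON Lines schema are assumptions.

```python
# Illustrative sketch of a loss-based recognition test (not the exact
# procedure in test.py): lower average loss on a sentence is taken as
# evidence that the model has seen it during finetuning.
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "MODEL_PATH"  # placeholder for a finetuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).eval()

def sentence_loss(sentence):
    """Average causal-LM loss of a sentence under the model."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, labels=inputs["input_ids"])
    return out.loss.item()

def load_sentences(path):
    with open(path) as f:
        return [json.loads(line)["text"] for line in f]  # assumed schema

seen = load_sentences("data/llm-experiment-data/expt1/seen_data_0.json")
unseen = load_sentences("data/llm-experiment-data/expt1/unseen_data_0.json")

seen_losses = [sentence_loss(s) for s in seen]
unseen_losses = [sentence_loss(s) for s in unseen]

# Score a seen/foil pair as correct if the seen sentence gets the lower loss.
correct = sum(s < u for s, u in zip(seen_losses, unseen_losses))
print(f"Pairwise recognition accuracy: {correct / len(seen_losses):.3f}")
```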
- Run a recall test on a model with the study sentences in `seen_data_0.json`; the idea behind the test is sketched after the command:
```bash
python -u generate.py \
    --model_name_or_path MODEL_PATH \
    --seen_file "data/llm-experiment-data/expt1/seen_data_0.json" \
    --per_device_eval_batch_size 1 \
    --output_dir OUTPUT_DIR \
    --save_prefix INFORMATIVE_SAVE_PREFIX \
    --block_size 128 \
    --overwrite_cache
```
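Conceptually, a recall test prompts the model with the beginning of each study sentence and checks whether it regenerates the remainder. The sketch below is again only an illustration of that idea (with an assumed JSON Lines schema, a fixed three-word cue, and greedy decoding), not the procedure implemented in `generate.py`; the model path is a placeholder.

```python
# Illustrative sketch of a prompted recall test (not the exact procedure in
# generate.py): cue the model with the first few words of a study sentence,
# greedily generate a continuation, and compare it with the original.
import json
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "MODEL_PATH"  # placeholder for a finetuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(model_path).eval()

with open("data/llm-experiment-data/expt1/seen_data_0.json") as f:
    sentences = [json.loads(line)["text"] for line in f]  # assumed schema

for sentence in sentences[:5]:
    prompt = " ".join(sentence.split()[:3])  # first three words as the cue
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=40,
            do_sample=False,  # greedy decoding
            pad_token_id=tokenizer.eos_token_id,
        )
    completion = tokenizer.decode(output_ids[0], skip_special_tokens=True)
    print("PROMPT:    ", prompt)
    print("GENERATED: ", completion)
    print("ORIGINAL:  ", sentence)
```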
The `scripts` folder contains SLURM scripts for reproducing all experiments reported in the paper, using these three files. The `data` folder contains all the experimental data used in the experiments. The `utils` folder contains utility functions that were used to generate the experimental data. The results of all recognition, recall, and retention experiments reported in the paper are available from this Huggingface dataset repository.