Experiments with expert mimicry via QLoRA on interview data
This repository contains the code for the paper "Experiment on Fact Memorization in LLM Using QLoRA and Quality Assessment of Such Memorization"
To install the required packages, run the following command:
pip install -r requirements.txt
In addition, you need to install the unsloth package. Its installation depends on your hardware; please refer to the official unsloth documentation for details.
This code was tested with Python 3.11 on two GPUs: an NVIDIA GeForce RTX 3090 on a local Ubuntu machine and an NVIDIA L4 on the LightningStudio platform.
Before running the code, you also need to fill in the '.env' file with the necessary environment variables. You can use the '.env.example' file as a template.
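For illustration, a minimal '.env' might contain just the Hugging Face token mentioned later in this README (check '.env.example' for the full list of required variables; the value below is a placeholder):

```
HUGGINGFACE_TOKEN=hf_your_token_here
```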
The repository includes two datasets:
- ChatML dataset - a dataset of interviews in the ChatML chat format. Files whose names start with an underscore should be treated as the test set.
- Fact Q/A dataset - a dataset of facts, and questions about them, extracted from the three newest interviews (those containing "2023" in the filename).
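For reference, ChatML delimits each conversation turn with special tokens. The snippet below shows the general shape of the format; the actual contents and storage layout of the repository's files may differ:

```
<|im_start|>user
Interviewer question goes here<|im_end|>
<|im_start|>assistant
Interviewee answer goes here<|im_end|>
```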
The style experiment consists of three steps:
- Train the LLM model on the ChatML dataset:
python3 style_experiment/us_finetune_musk.py
This script trains the models and pushes the results to the Hugging Face Hub (a QLoRA training sketch is shown after this list).
- Generate the responses for the test set:
python3 style_experiment/predict_test_dataset.py
This script generates the responses for the test set and saves them to the style_experiment/data/musk/*.jsonl files.
- To evaluate the responses (pairwise comparison), run the following command for every pair you want to compare:
python3 style_calculator.py {path_to_file_a.jsonl} {path_to_file_b.jsonl}
This script generates a report for the pairwise comparison of the responses. Each report consists of two files:
- a CSV with per-sample results for debugging
- a TXT with the final result
Reports are saved in the folder of the first file, using the name pattern
{file_a_name}_vs_{file_b_name}.*
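For orientation, here is a minimal sketch of a QLoRA setup using the standard transformers/peft/bitsandbytes stack. Note that the repository's training scripts use unsloth instead, and the base model name and hyperparameters below are illustrative placeholders, not the paper's settings:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base_model = "mistralai/Mistral-7B-v0.1"  # placeholder; the real names are in the scripts

# QLoRA: load the frozen base model with 4-bit NF4 quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(base_model, quantization_config=bnb_config)
tokenizer = AutoTokenizer.from_pretrained(base_model)

# Attach small trainable LoRA adapters on top of the quantized weights
lora_config = LoraConfig(
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable
```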
The fact memorization experiment also consists of three steps:
- Train the LLM model on the Fact Q/A dataset:
python3 fact_memo_experiment/us_finetune_musk.py
This script trains the models and pushes the results to the Hugging Face Hub.
- Generate the responses for the Fact Q/A dataset:
python3 fact_memo_experiment/predict_qa_dataset.py
- To evaluate the responses (pairwise comparison), run the following command for every pair you want to compare:
python3 facts_calculator.py {path_to_file_a.jsonl} {path_to_file_b.jsonl}
This script generates reports for the pairwise comparison of the responses. Each report consists of two files with the same names as in the style experiment; however, the reports contain different metrics, as described in the paper.
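If you have many prediction files to compare, a small driver script can run the calculator over every pair. This is a convenience sketch, not part of the repository; the folder path and calculator choice are assumptions to adapt:

```python
import itertools
import subprocess
from pathlib import Path

# Assumed location of the generated predictions; adjust to your setup.
files = sorted(Path("style_experiment/data/musk").glob("*.jsonl"))

# Run the pairwise comparison for every unordered pair of files.
# Use facts_calculator.py instead for the fact memorization experiment.
for file_a, file_b in itertools.combinations(files, 2):
    subprocess.run(
        ["python3", "style_calculator.py", str(file_a), str(file_b)],
        check=True,
    )
```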
The trained models are available on the Hugging Face Hub: https://huggingface.co/atepeq
The model names are hardcoded in the prediction scripts (see the second step of the style and fact memorization experiments).
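For example, a trained adapter can be loaded from the Hub with peft on top of its base model. Both names below are placeholders; the actual pairs are hardcoded in the prediction scripts:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_name = "mistralai/Mistral-7B-v0.1"      # placeholder base model
adapter_name = "atepeq/placeholder-adapter"  # placeholder adapter repo

base = AutoModelForCausalLM.from_pretrained(base_name)
model = PeftModel.from_pretrained(base, adapter_name)  # attach the fine-tuned LoRA weights
tokenizer = AutoTokenizer.from_pretrained(base_name)
```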
If you want to reproduce the training process, you'll need to generate your own Hugging Face API token and replace the HUGGINGFACE_TOKEN value in the .env file. In addition, you'll need to change the full model names in the scripts.