Leveraging RAG-Assisted Teacher Models in Knowledge-Distillation for Enhanced Domain-Specific Question Answering

Code for our CS577 Natural Language Processing project. The src directory contains the source code for the majority of our experiments. We also experimented with existing KD frameworks such as MiniLLM and Distilling Step-by-Step. All models and tokenizers are pulled from the Hugging Face Hub.

Data Preprocessing

We use the SQuAD dataset as our primary benchmark and distillation dataset. Our preprocessing script preprocess.py converts the raw SQuAD data into a uniform format to be passed to the model. The two eval scripts, eval.py and eval2.py, generate predictions for the dev set, and we use SQuAD's official evaluation script (squad_eval.py in our code) to compute metrics over the gold outputs and our predictions.
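For reference, a minimal sketch of flattening SQuAD into a uniform (context, question, answer) format, assuming the Hugging Face `datasets` SQuAD loader; the exact field names and output layout used by preprocess.py may differ.

```python
# Sketch only: flatten SQuAD examples into simple records for QA training/eval.
import json
from datasets import load_dataset

def flatten_squad(split: str = "validation"):
    squad = load_dataset("squad", split=split)
    records = []
    for ex in squad:
        records.append({
            "id": ex["id"],
            "context": ex["context"],
            "question": ex["question"],
            # SQuAD stores answers as parallel lists; keep the first gold answer.
            "answer": ex["answers"]["text"][0] if ex["answers"]["text"] else "",
        })
    return records

if __name__ == "__main__":
    dev = flatten_squad("validation")
    with open("squad_dev_flat.json", "w") as f:
        json.dump(dev, f, indent=2)
```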

Knowledge Bases

We use Cohere's Wikipedia embeddings as our generic knowledge base. The knowledge.py script generates context for the teacher models using Cohere's API.
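As an illustration, here is a minimal retrieval sketch assuming the Cohere Python SDK's embed endpoint, the multilingual-22-12 embedding model, and the Cohere/wikipedia-22-12-simple-embeddings dataset on the Hugging Face Hub; knowledge.py may use different models, datasets, or search logic.

```python
# Sketch only: dense retrieval over precomputed Cohere Wikipedia embeddings.
import cohere
import numpy as np
from datasets import load_dataset

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# Load a small slice of the precomputed passage embeddings for illustration.
docs = load_dataset("Cohere/wikipedia-22-12-simple-embeddings", split="train[:10000]")
doc_embs = np.asarray(docs["emb"], dtype=np.float32)

def retrieve_context(question: str, top_k: int = 3):
    # Embed the query with the same model used to embed the passages.
    q_emb = np.asarray(
        co.embed(texts=[question], model="multilingual-22-12").embeddings[0],
        dtype=np.float32,
    )
    # Dot-product similarity against the passage embeddings.
    scores = doc_embs @ q_emb
    top = np.argsort(-scores)[:top_k]
    return [docs[int(i)]["text"] for i in top]
```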

Knowledge Distillation Training

We have different scripts corresponding to the different experiments we ran throughout the project. finetune.py fine-tunes a given model from the Hugging Face Hub on a given dataset, and distill.py distills knowledge into the student using the logits from the teacher model. Additionally, student.py and teacher.py contain the student and teacher definitions for one iteration of experiments, and dataloader.py contains the dataloader used in training.
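For intuition, a minimal sketch of a logit-based distillation loss (temperature-softened KL divergence plus the standard cross-entropy term); the exact loss weighting and temperature used in distill.py may differ.

```python
# Sketch only: standard soft-label knowledge-distillation loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    # KL divergence between temperature-softened teacher and student distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Ordinary cross-entropy against the gold labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```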

By Harmya, Sarthak
