Leveraging RAG-Assisted Teacher Models in Knowledge-Distillation for Enhanced Domain-Specific Question Answering

Code for our CS577 Natural Language Processing project. The src directory contains the source code for the majority of our experiments. We also experimented with existing KD frameworks such as MiniLLM and Distilling Step-by-Step. All models and tokenizers are pulled from the Hugging Face Hub.

Data Preprocessing

We use the SQuAD dataset as our primary benchmark and distillation dataset. Our preprocessing script preprocess.py converts the raw SQuAD data into a uniform format to be passed to the model. The two eval scripts, eval.py and eval2.py, generate predictions for the dev set, and we use SQuAD's official evaluation script (squad_eval.py in our code) to compute metrics over the gold outputs and our predictions.
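For reference, a minimal sketch of flattening SQuAD into a uniform (context, question, answer) format, assuming the Hugging Face `datasets` SQuAD loader; the exact field names and output layout used by preprocess.py may differ.

```python
# Sketch only: flatten SQuAD examples into simple records for QA training/eval.
import json
from datasets import load_dataset

def flatten_squad(split: str = "validation"):
    squad = load_dataset("squad", split=split)
    records = []
    for ex in squad:
        records.append({
            "id": ex["id"],
            "context": ex["context"],
            "question": ex["question"],
            # SQuAD stores answers as parallel lists; keep the first gold answer.
            "answer": ex["answers"]["text"][0] if ex["answers"]["text"] else "",
        })
    return records

if __name__ == "__main__":
    dev = flatten_squad("validation")
    with open("squad_dev_flat.json", "w") as f:
        json.dump(dev, f, indent=2)
```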

Knowledge Bases

We use Cohere's Wikipedia embeddings as our generic knowledge base. The knowledge.py script generates context for the teacher models using Cohere's API.
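As an illustration, here is a minimal retrieval sketch assuming the Cohere Python SDK's embed endpoint, the multilingual-22-12 embedding model, and the Cohere/wikipedia-22-12-simple-embeddings dataset on the Hugging Face Hub; knowledge.py may use different models, datasets, or search logic.

```python
# Sketch only: dense retrieval over precomputed Cohere Wikipedia embeddings.
import cohere
import numpy as np
from datasets import load_dataset

co = cohere.Client("YOUR_API_KEY")  # placeholder key

# Load a small slice of the precomputed passage embeddings for illustration.
docs = load_dataset("Cohere/wikipedia-22-12-simple-embeddings", split="train[:10000]")
doc_embs = np.asarray(docs["emb"], dtype=np.float32)

def retrieve_context(question: str, top_k: int = 3):
    # Embed the query with the same model used to embed the passages.
    q_emb = np.asarray(
        co.embed(texts=[question], model="multilingual-22-12").embeddings[0],
        dtype=np.float32,
    )
    # Dot-product similarity against the passage embeddings.
    scores = doc_embs @ q_emb
    top = np.argsort(-scores)[:top_k]
    return [docs[int(i)]["text"] for i in top]
```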

Knowledge Distillation Training

We have different scripts corresponding to the different experiments we ran throughout the project. finetune.py fine-tunes a given model from the Hugging Face Hub on a given dataset, and distill.py distills knowledge into the student using the logits from the teacher model. Additionally, student.py and teacher.py contain the student and teacher definitions for one iteration of experiments, and dataloader.py contains the dataloader used in training.
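For intuition, a minimal sketch of a logit-based distillation loss (temperature-softened KL divergence plus the standard cross-entropy term); the exact loss weighting and temperature used in distill.py may differ.

```python
# Sketch only: standard soft-label knowledge-distillation loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature: float = 2.0, alpha: float = 0.5):
    # KL divergence between temperature-softened teacher and student distributions.
    kd = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Ordinary cross-entropy against the gold labels.
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```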

By Harmya, Sarthak
