YooEunseok/TF-KD-DPR

A dense passage retriever model using a knowledge distillation technique that uses the student model itself as a teacher model.

Effects of Teacher-free Self-training Knowledge Distillation in Dense Text Retrieval for Open Domain Q&A (2023) (KCI)

Published in Korean (KCI, 2023): https://www.dbpia.co.kr/journal/articleDetail?nodeId=NODE11646268

Recent dense retrieval studies have shown that a more effective retriever can be obtained within the existing two-stage framework by distilling knowledge from the ranker model into the retriever model. However, such knowledge distillation techniques have limitations: the teacher model must be trained separately in advance, and finding the teacher model best suited to distilling knowledge into the student model takes considerable time and effort. In this paper, we propose a dense retriever model that uses a teacher-free self-training knowledge distillation technique, in which the student model itself serves as the teacher model. In the first training stage, negative log-likelihood is used as the loss function; in the next training stage, a loss function based on the teacher-free distillation technique is used. Unlike self-regularization, another teacher-free knowledge distillation technique that does not use a teacher model, and unlike label smoothing regularization, which creates soft labels by assigning the same value to every non-gold document, our method builds soft labels from the predictions of the trained model, so that documents judged similar to the gold document receive higher values than documents that are not. In experiments, we demonstrate the effectiveness of the proposed method by showing improved performance over existing dense passage retrieval models.
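
The following is a minimal PyTorch sketch of the two training losses described above, assuming in-batch candidate scoring. The function names, the `teacher_scores` argument (scores produced by the stage-1 model acting as its own teacher), and the `alpha`/`temperature` hyperparameters are illustrative assumptions, not this repository's actual interface.

```python
# Hypothetical sketch of the two-stage training losses; names are illustrative.
import torch
import torch.nn.functional as F


def stage1_nll_loss(scores: torch.Tensor, gold_idx: torch.Tensor) -> torch.Tensor:
    """Stage 1: negative log-likelihood of the gold passage over in-batch candidates.

    scores:   (batch, n_docs) query-passage similarity scores
    gold_idx: (batch,)        index of the gold passage for each query
    """
    log_probs = F.log_softmax(scores, dim=-1)
    return F.nll_loss(log_probs, gold_idx)


def stage2_self_distill_loss(
    scores: torch.Tensor,
    gold_idx: torch.Tensor,
    teacher_scores: torch.Tensor,
    alpha: float = 0.5,
    temperature: float = 1.0,
) -> torch.Tensor:
    """Stage 2: teacher-free self-training distillation.

    teacher_scores come from the stage-1 model itself (its own teacher), so
    passages it judges similar to the gold passage receive more soft-label
    mass than unrelated passages, unlike uniform label smoothing.
    """
    # Hard-label term: same NLL as stage 1.
    log_probs = F.log_softmax(scores, dim=-1)
    nll = F.nll_loss(log_probs, gold_idx)

    # Soft-label term: KL divergence to the self-teacher's softened distribution.
    with torch.no_grad():
        soft_targets = F.softmax(teacher_scores / temperature, dim=-1)
    student_log_probs = F.log_softmax(scores / temperature, dim=-1)
    kd = F.kl_div(student_log_probs, soft_targets, reduction="batchmean") * (temperature ** 2)

    return alpha * nll + (1.0 - alpha) * kd
```

The weighting between the hard-label and soft-label terms and the temperature are the usual knobs in this kind of distillation objective; the exact values used in the paper are not restated here.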
