youssef-mansor/CLIR-using-BERT


Cross-lingual Information Retrieval (CLIR) using BERT

About The Project

This project implements cross-lingual information retrieval (CLIR) for the English-Turkish language pair using BERT (Bidirectional Encoder Representations from Transformers). It compares three approaches: Latent Semantic Indexing (LSI), LSI with translation, and a BERT-based method.

The main goal is to evaluate how well each technique retrieves information across English and Turkish. The project explores:

  • Latent Semantic Indexing (LSI)
  • LSI with Translation
  • BERT-based approach
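
To make the LSI baseline concrete, here is a minimal sketch using only NumPy. It assumes the common cross-lingual LSI setup in which each training document merges an English text with its Turkish counterpart, so terms from both languages share one latent space. The repository's actual pipeline is not shown in this README, so the function names (`fit_lsi`, `rank_documents`) and the toy vocabulary are hypothetical and for illustration only.

```python
import numpy as np

def fit_lsi(term_doc, k):
    """Truncated SVD of a (terms x docs) count matrix.

    Returns the top-k left singular vectors U_k (terms x k) and the
    documents projected into the k-dimensional latent space (docs x k).
    """
    U, _, _ = np.linalg.svd(term_doc, full_matrices=False)
    U_k = U[:, :k]
    doc_vecs = term_doc.T @ U_k  # same projection that is applied to queries
    return U_k, doc_vecs

def rank_documents(query_vec, U_k, doc_vecs):
    """Fold a term-count query into latent space and rank docs by cosine."""
    q = query_vec @ U_k
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(q) + 1e-12
    sims = (doc_vecs @ q) / norms
    return np.argsort(-sims), sims

# Hypothetical toy data: a shared English + Turkish vocabulary, with each
# column holding the term counts of one merged (English + Turkish) document.
vocab = ["cat", "dog", "kedi", "köpek"]
term_doc = np.array([
    [1, 0, 1],  # cat
    [0, 1, 1],  # dog
    [1, 0, 1],  # kedi
    [0, 1, 1],  # köpek
], dtype=float)

U_k, doc_vecs = fit_lsi(term_doc, k=2)

# A Turkish-only query ("kedi") is ranked against all documents.
query = np.array([0, 0, 1, 0], dtype=float)
order, sims = rank_documents(query, U_k, doc_vecs)
print(order, np.round(sims, 3))
```

Projecting documents and queries with the same matrix `U_k` keeps their latent coordinates directly comparable. The "LSI with translation" variant would presumably translate the query or the documents before this step, but that is an assumption rather than something this README states.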

For each approach, retrieval is scored with four similarity metrics:

  • Cosine Similarity
  • Jaccard Similarity
  • Dice Similarity
  • Overlap Similarity
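
The four metrics above can be sketched in plain Python using their standard definitions. This assumes cosine similarity is computed on numeric vectors and the other three on sets of tokens. The README does not say how the repository applies them (for example, to embeddings versus token sets), so the function names and inputs here are illustrative.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def jaccard_similarity(a, b):
    """|A ∩ B| / |A ∪ B| for two collections treated as sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def dice_similarity(a, b):
    """2 |A ∩ B| / (|A| + |B|) for two collections treated as sets."""
    a, b = set(a), set(b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0

def overlap_similarity(a, b):
    """|A ∩ B| / min(|A|, |B|) for two collections treated as sets."""
    a, b = set(a), set(b)
    smaller = min(len(a), len(b))
    return len(a & b) / smaller if smaller else 0.0

# Hypothetical token sets for a query and a document.
query_tokens = {"kedi", "köpek", "kuş"}
doc_tokens = {"kedi", "köpek", "balık", "at"}

print(jaccard_similarity(query_tokens, doc_tokens))
print(dice_similarity(query_tokens, doc_tokens))
print(overlap_similarity(query_tokens, doc_tokens))
print(cosine_similarity([1.0, 2.0, 0.0], [2.0, 4.0, 0.0]))
```

Jaccard, Dice, and Overlap differ only in how they normalize the size of the intersection, so they rank pairs similarly but penalize length mismatch to different degrees. Overlap is the most lenient when one set is much smaller than the other.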

Results

The BERT-based method shows a significant improvement over both traditional LSI and LSI with translation.

Tech Stack

The project uses the following technologies:

  • Python
  • NumPy
  • scikit-learn
  • Transformers
  • Matplotlib
  • PyTorch
