Implementing Vector Database on CoNaLa dataset to retrieve program snippets relevant to user queries. This is a very simple simulation of a Vector Database.

swastikmaiti/Vector_Database

Vector-Database with Qdrant Library and Embedding with Sentence Transformers

Simulating a Vector Database on CoNaLa dataset.

Dataset

  • CoNaLa: the Code/Natural Language Challenge dataset, used here to retrieve program snippets relevant to user queries.

Frameworks

  • Vector Database: in-memory vector database using Qdrant library.
  • Embeddings: Sentence Transformer (all-MiniLM-L6-v2).

Files

  • prepare_data.ipynb: Notebook to view the data and perform a simple analysis of the dataset.
  • embeddings.ipynb: Contains the full code to create embeddings using sentence-transformers, build the vector database using qdrant, and then retrieve snippets based on cosine similarity.
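
For readers unfamiliar with cosine similarity, the following standard-library sketch shows what the retrieval step computes. It is a toy stand-in, not the code from embeddings.ipynb: the function names, the two-dimensional vectors, and the stored labels are invented for illustration, and real embeddings from all-MiniLM-L6-v2 have 384 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=2):
    """Rank stored (label -> vector) entries by similarity to the query."""
    scored = [(cosine_similarity(query_vec, vec), label) for label, vec in stored.items()]
    return sorted(scored, reverse=True)[:k]


# Toy 2-d "embeddings"; the labels stand in for code snippets.
stored = {
    "sorted(xs, reverse=True)": [0.9, 0.1],
    "open(path).read()": [0.1, 0.9],
}
ranking = top_k([1.0, 0.0], stored, k=2)
```

Because cosine similarity depends only on direction, not vector length, the first stored entry ranks highest for the query `[1.0, 0.0]` even though neither vector is normalized.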

If you find the repo helpful, please drop a ⭐
