This project is based on Cocktail Recommendation System, which utilizes the Retrieval-Augmented Generation (RAG) approach to provide users with personalized cocktail recommendations based on their queries. Leveraging state-of-the-art machine learning and natural language processing techniques, the system delivers accurate and relevant recommendations. The system employs MongoDB for data storage and retrieval, Hugging Face's Datasets library for dataset management, OpenAI for text embeddings and chat completion, and Google Colab as the development environment.
You can access the live web application here.
-
Libraries Installation
-
Data Preparation
- Load Dataset: Utilize Hugging Face's Datasets library to load the cocktail recipes dataset.
- Data Cleaning and Preparation: Clean the dataset by processing ingredients and removing missing values.
- Create Embeddings with OpenAI: Generate embeddings for cocktail ingredients using OpenAI's text embedding model.
-
Vector Database Setup and Data Ingestion
- Connect to MongoDB: Establish connection to MongoDB Atlas using the provided URI.
- Data Ingestion: Ingest cleaned data into MongoDB collection for vector search.
- Hugging Face's Datasets library
- pandas
- numpy
- OpenAI
- pymongo
- Google Colab (for development environment) [optional]
- streamlit
- python-dotenv