A Multi-Modal Search Engine developed using CLIP by OpenAI, with a Flask API for the backend and HTML/CSS for the frontend web application. It uses GPU acceleration and image embeddings to search images from your gallery.

ahmedembeddedxx/multimodal-search-engine

Multimodal Search Engine

This project is a Multi-Modal Search Engine developed using CLIP by OpenAI, with Flask API for backend and HTML/CSS for the frontend web application.

Introduction

This project provides a web interface where users enter text queries and the system retrieves the images that best match the description. Retrieval is based on the CLIP architecture (read the paper).
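The retrieval step can be illustrated with a minimal sketch. It assumes the text query and every image have already been embedded into the same vector space by CLIP, and that images are ranked by cosine similarity to the query, which is the usual choice for CLIP embeddings but is not confirmed by this README. The function names and the toy 3-dimensional vectors are hypothetical; real CLIP embeddings have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction; treat it as unrelated
    return dot / (norm_a * norm_b)


def rank_images(query_embedding, image_embeddings, top_k=3):
    """Return up to top_k (filename, score) pairs, best match first."""
    scored = [
        (name, cosine_similarity(query_embedding, emb))
        for name, emb in image_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy vectors standing in for real CLIP embeddings (hypothetical data).
image_embeddings = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "city.jpg": [0.0, 0.2, 0.9],
}
query_embedding = [0.8, 0.2, 0.1]  # stand-in for an embedded text query

for name, score in rank_images(query_embedding, image_embeddings, top_k=2):
    print(f"{name}: {score:.3f}")
```

Because image embeddings do not depend on the query, they can be computed once and stored, so only the text query has to be embedded at search time.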

Take a look

Screenshot-2024-04-10-at-11-02-46-PM

Screenshot-2024-04-10-at-11-03-23-PM

Screenshot-2024-04-10-at-11-03-51-PM

Screenshot-2024-04-10-at-11-04-14-PM

Demo Video

Watch the YouTube video

  • This video demonstrates how to use our project's main feature.

How to use with your own images

  • Sample data of 130 images is included with the project.
  • To use your own images, follow the video or the steps below:
  • Place your images in src/minidata.
  • Run the notebook src/image-processor.
  • Move the files in src/image_embeddings to flaskapp/image_embeddings, and the files in src/minidata to flaskapp/static (caution: transfer the contents, not the directories themselves).
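The final transfer step can also be scripted. The sketch below is one possible approach, not part of the project: the helper name `copy_files` is hypothetical, it should be run from the repository root, and it copies rather than moves so the originals stay in place. Only regular files are transferred, which matches the caution about not transferring the directories themselves.

```python
import shutil
from pathlib import Path


def copy_files(src_dir, dst_dir):
    """Copy every regular file in src_dir into dst_dir; subdirectories are skipped."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for item in sorted(src.iterdir()):
        if item.is_file():
            shutil.copy2(item, dst / item.name)
            copied.append(item.name)
    return copied


if __name__ == "__main__":
    # Source and destination paths taken from the steps above.
    for src, dst in [
        ("src/image_embeddings", "flaskapp/image_embeddings"),
        ("src/minidata", "flaskapp/static"),
    ]:
        if Path(src).is_dir():
            print(f"{src} -> {dst}: {len(copy_files(src, dst))} files copied")
        else:
            print(f"{src} not found; skipping")
```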

Features

  • Multi-Modal Search: Users can input textual descriptions of images to retrieve relevant images.
  • Intuitive Web Interface: The frontend is built with HTML/CSS to provide a user-friendly experience.
  • Scalable Backend: Flask API serves as the backend, handling requests and interacting with the CLIP model.

Clone the repository:

git clone https://github.com/ahmedembeddedxx/multimodal-search-engine.git

Usage

Start the backend server:

cd flaskapp/
flask run

Access the web application in your browser at http://127.0.0.1:5000/.

Stack

  • CLIP, developed by OpenAI, for the embedding model.
  • Flask for the backend framework.

Future Expectations

  • Migrate the frontend to React
  • Use ImageBind by Meta AI
  • More accurate model evaluation
  • Integrate audio and video functionality
