Status: In Progress
This project uses ElasticSearch with the k-NN Plugin for image similarity search. It applies the VGG16 model to extract image features, enabling efficient retrieval based on feature similarity and fuzzy search.
- First Steps
- Setting Up ElasticSearch and k-NN Plugin
- Index Mapping & Data Loading
- Performing k-NN Search
- Fuzzy Search and Dimensionality Reduction
- Testing
- References
- Extract features from images using VGG16. These features represent high-level patterns and semantics of the images, enabling similarity-based retrieval.
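The extraction step can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes TensorFlow/Keras is installed and that features are taken from the pooled convolutional output of VGG16 (512 dimensions). The source does not say which layer is used, and the function names `build_model`, `extract_features`, and `l2_normalize` are hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so cosine similarity reduces to a dot product."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


def build_model():
    """Load VGG16 without its classifier head (assumed feature layer)."""
    # Imported lazily so l2_normalize works without TensorFlow installed.
    from tensorflow.keras.applications.vgg16 import VGG16

    # include_top=False drops the dense classifier; pooling="avg" averages the
    # last convolutional feature map into a single 512-dimensional vector.
    return VGG16(weights="imagenet", include_top=False, pooling="avg")


def extract_features(model, image_path):
    """Return an L2-normalized feature vector (list of floats) for one image."""
    import numpy as np
    from tensorflow.keras.applications.vgg16 import preprocess_input
    from tensorflow.keras.utils import img_to_array, load_img

    img = load_img(image_path, target_size=(224, 224))  # VGG16 input size
    batch = preprocess_input(np.expand_dims(img_to_array(img), axis=0))
    features = model.predict(batch, verbose=0)[0]
    return l2_normalize(features.tolist())
```

Building the model once with `build_model()` and passing it to `extract_features` for every image avoids reloading the weights on each call.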
To perform k-NN (k-nearest neighbor) search on the image features, set up ElasticSearch with the k-NN Plugin. The plugin allows efficient similarity search using vector data.
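- Docker Command to Set Up ElasticSearch:
Run the following Docker command to start a single-node ElasticSearch container with security disabled. The --net elas option assumes a Docker network named elas already exists; if it does not, create it first with docker network create elas.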
docker run --name elasticsearch --net elas -e "discovery.type=single-node" -e "xpack.security.enabled=false" -p 9200:9200 -p 9300:9300 docker.elastic.co/elasticsearch/elasticsearch:8.15.2
- Install k-NN Plugin:
Once ElasticSearch is running, install the k-NN plugin to enable vector-based search:
sudo bin/elasticsearch-plugin install "org.elasticsearch.plugin:knn:8.15.2"
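Before creating an index, it can help to confirm that the node is reachable. The sketch below assumes the container from the Docker command above is listening on http://localhost:9200 with security disabled. The helper names are hypothetical, and only the Python standard library is used.

```python
import json
import urllib.request


def parse_root_response(body):
    """Extract the cluster name and version from the JSON returned by GET /."""
    info = json.loads(body)
    return info["cluster_name"], info["version"]["number"]


def check_elasticsearch(url="http://localhost:9200"):
    """Fetch the root endpoint and return (cluster_name, version).

    Raises urllib.error.URLError if the node is not reachable.
    """
    with urllib.request.urlopen(url, timeout=5) as response:
        return parse_root_response(response.read().decode("utf-8"))


# Example (requires a running node):
# name, version = check_elasticsearch()
# print(f"Connected to {name}, version {version}")
```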
Create an ElasticSearch index with a mapping that stores image features as vectors. The k-NN search runs against this vector field.
- Define the vector field for storing image features.
Use Logstash or a standalone Python script to load the image feature vectors (extracted using VGG16) into your ElasticSearch index.
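The mapping and loading steps might look like the sketch below. It makes several assumptions that are not in the original: it uses the `dense_vector` field type built into Elasticsearch 8.x rather than a plugin-specific type, it assumes 512-dimensional vectors, and the index name `images` and field names `image_vector`, `path`, and `tags` are hypothetical. The code only builds the request bodies and does not send them.

```python
import json

INDEX_NAME = "images"          # hypothetical index name
VECTOR_FIELD = "image_vector"  # hypothetical vector field name
VECTOR_DIMS = 512              # assumes pooled VGG16 features


def build_index_body(dims=VECTOR_DIMS):
    """Return the body for PUT /<index> with an indexed vector field."""
    return {
        "mappings": {
            "properties": {
                VECTOR_FIELD: {
                    "type": "dense_vector",
                    "dims": dims,
                    "index": True,           # build a structure for approximate k-NN
                    "similarity": "cosine",
                },
                "path": {"type": "keyword"},
                "tags": {"type": "keyword"},
            }
        }
    }


def build_bulk_payload(docs, dims=VECTOR_DIMS, index_name=INDEX_NAME):
    """Build an NDJSON body for POST /_bulk.

    docs is an iterable of (doc_id, path, tags, vector) tuples.
    """
    lines = []
    for doc_id, path, tags, vector in docs:
        if len(vector) != dims:
            raise ValueError(f"{doc_id}: expected {dims} dims, got {len(vector)}")
        # Each document takes two lines: the action, then the source.
        lines.append(json.dumps({"index": {"_index": index_name, "_id": doc_id}}))
        lines.append(json.dumps({"path": path, "tags": tags, VECTOR_FIELD: list(vector)}))
    # The bulk API requires every line, including the last, to end with a newline.
    return "".join(line + "\n" for line in lines)
```

The index body would be sent with `PUT /images`, and the bulk payload with `POST /_bulk` using the `Content-Type: application/x-ndjson` header.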
Perform k-NN search to retrieve images similar to a given query. The search is based on the feature vectors that are stored in the ElasticSearch index.
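- Approximate k-NN: Searches an index structure (such as an HNSW graph) instead of comparing the query against every vector. This is fast, but it may miss some of the true nearest neighbors.
- Precise k-NN: Compares the query against every candidate vector. The results are exact, but query time grows with the number of indexed images.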
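The two search modes above could be expressed as request bodies for `POST /images/_search`, as in this sketch. It assumes the native Elasticsearch 8.x `knn` search option and `script_score` query. The field name `image_vector` and the default values for `k` and `num_candidates` are illustrative choices, not values taken from the project.

```python
def approximate_knn_body(query_vector, k=10, num_candidates=100, field="image_vector"):
    """Body for an approximate k-NN search using the top-level knn option."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            # Candidates examined per shard; higher is slower but more accurate.
            "num_candidates": num_candidates,
        }
    }


def exact_knn_body(query_vector, k=10, field="image_vector"):
    """Body for an exact (brute-force) search that scores every matching document."""
    return {
        "size": k,
        "query": {
            "script_score": {
                "query": {"match_all": {}},
                "script": {
                    # Adding 1.0 keeps the score non-negative, as script_score requires.
                    "source": f"cosineSimilarity(params.query_vector, '{field}') + 1.0",
                    "params": {"query_vector": list(query_vector)},
                },
            }
        },
    }
```

Raising `num_candidates` trades speed for recall in the approximate mode. The exact mode is mainly practical on small indexes or after a restrictive filter.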
Reduce the dimensionality of the feature vectors with Principal Component Analysis (PCA) before indexing: shorter vectors use less memory and are faster to compare.
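To show what PCA does to a feature vector, here is a dependency-free sketch that finds the principal directions by power iteration and projects vectors onto them. It is for illustration only. In practice a library such as scikit-learn would be the more likely choice, and the function names here are hypothetical.

```python
import math
import random


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _top_component(rows, iterations=100, seed=0):
    """Estimate the leading principal direction of centered rows by power iteration."""
    rng = random.Random(seed)
    dims = len(rows[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(dims)]
    norm = math.sqrt(_dot(v, v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        # Multiply by X^T X without building the dims x dims covariance matrix.
        scores = [_dot(row, v) for row in rows]
        w = [sum(s * row[j] for s, row in zip(scores, rows)) for j in range(dims)]
        norm = math.sqrt(_dot(w, w))
        if norm < 1e-12:
            break  # no variance left in the data
        v = [x / norm for x in w]
    return v


def fit_pca(vectors, n_components):
    """Return (mean, components) learned from a list of equal-length vectors."""
    count, dims = len(vectors), len(vectors[0])
    mean = [sum(v[j] for v in vectors) / count for j in range(dims)]
    residual = [[x - m for x, m in zip(v, mean)] for v in vectors]
    components = []
    for i in range(n_components):
        direction = _top_component(residual, seed=i)
        components.append(direction)
        # Deflation: remove the variance already explained by this direction.
        deflated = []
        for row in residual:
            score = _dot(row, direction)
            deflated.append([x - score * d for x, d in zip(row, direction)])
        residual = deflated
    return mean, components


def transform(vector, mean, components):
    """Project one vector onto the learned components."""
    centered = [x - m for x, m in zip(vector, mean)]
    return [_dot(centered, c) for c in components]
```

The same `mean` and `components` must be applied to both the indexed vectors and every query vector. Otherwise the two live in different spaces and their similarities are meaningless.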
Use quantization to convert the floating-point vectors into byte vectors, which helps in saving memory while still allowing for efficient similarity search.
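A simple form of this is symmetric scalar quantization, sketched below. The scheme is an assumption, because the source does not say which quantization method the project uses. The resulting integers fit a signed byte, so they could be stored in a `dense_vector` field mapped with `element_type` set to `byte`.

```python
import math


def quantize_to_int8(vector, max_abs=None):
    """Map floats to integers in [-127, 127] using one scale factor.

    If max_abs is None, the scale is derived from this vector alone. That is
    fine for cosine similarity, which ignores vector length. For dot-product
    or Euclidean distance, pass one shared max_abs for the whole dataset.
    """
    if max_abs is None:
        max_abs = max((abs(x) for x in vector), default=0.0)
    if max_abs == 0.0:
        return [0] * len(vector)
    scale = 127.0 / max_abs
    # Clamp so values beyond a shared max_abs still fit in a signed byte.
    return [max(-127, min(127, round(x * scale))) for x in vector]


def cosine(a, b):
    """Cosine similarity, used here to check how much quantization distorts a vector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```

Comparing `cosine(original, quantize_to_int8(original))` against 1.0 on a sample of real feature vectors gives a quick sense of how much precision is lost.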
You can add a filter on a field such as tags (e.g., image categories) to restrict the search to matching images, which reduces the number of vectors that have to be compared.
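A filtered search might be built as in this sketch, assuming the native Elasticsearch 8.x `knn` option, which accepts a `filter` clause written in the query DSL. The field names `image_vector` and `tags` are hypothetical, and `tags` is assumed to be a keyword field.

```python
def filtered_knn_body(query_vector, tags, k=10, num_candidates=100,
                      field="image_vector", tag_field="tags"):
    """Body for an approximate k-NN search limited to documents with any given tag."""
    return {
        "knn": {
            "field": field,
            "query_vector": list(query_vector),
            "k": k,
            "num_candidates": num_candidates,
            # Only documents matching this filter are eligible as neighbors.
            "filter": {"terms": {tag_field: list(tags)}},
        }
    }
```

For example, `filtered_knn_body(vector, ["cat", "dog"])` searches only images tagged `cat` or `dog`.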
To test the search functionality, follow these steps:
- Run ElasticSearch with the connection details mentioned above.
- Launch the application by running:
python app.py
This will start the search interface where you can test the image similarity search and semantic search features.