Skip to content

yusufM03/Elastic_Search_image_VGG16

Repository files navigation

Elastic Search Image VGG16 Similarity

Status: In Progress

This project uses ElasticSearch with the k-NN Plugin for image similarity search. It applies the VGG16 model to extract image features, enabling efficient retrieval based on feature similarity and fuzzy search.

Table of Contents

  1. First Steps
  2. Setting Up ElasticSearch and k-NN Plugin
  3. Index Mapping & Data Loading
  4. Performing k-NN Search
  5. Fuzzy Search and Dimensionality Reduction
  6. Testing
  7. References

First Steps

Feature Extraction from Images

  • Extract features from images using VGG16. These features represent high-level patterns and semantics of the images, enabling similarity-based retrieval.

Set Up ElasticSearch and k-NN Plugin

To perform k-NN (k-nearest neighbor) search on the image features, set up ElasticSearch with the k-NN Plugin. The plugin allows efficient similarity search using vector data.


Setting Up ElasticSearch and k-NN Plugin

  1. Docker Command to Set Up ElasticSearch:

    Run the following Docker command to set up ElasticSearch:

    docker run --name elasticsearch --net elas -e "discovery.type=single-node" -e "xpack.security.enabled=false" -p 9200:9200 -p 9300:9300 docker.elastic.co/elasticsearch/elasticsearch:8.15.2
  2. Install k-NN Plugin:

    Once ElasticSearch is running, install the k-NN plugin to enable vector-based search:

    sudo bin/elasticsearch-plugin install "org.elasticsearch.plugin:knn:8.15.2"

Index Mapping & Data Loading

Define the Index Mapping

Create an ElasticSearch index with appropriate mapping to store image features as vectors. This will support the k-NN search functionality.

  • Define the vector field for storing image features.

Load CSV Data into ElasticSearch

Use Logstash or a direct Python script to load image feature vectors (extracted using VGG16) into your ElasticSearch index.


Performing k-NN Search

Perform k-NN search to retrieve images similar to a given query. The search is based on the feature vectors that are stored in the ElasticSearch index.

Different Methods of k-NN Plugin

  • Approximate k-NN: This method is fast but less accurate.

  • Precise k-NN: More accurate, but slower.


Fuzzy Search and Dimensionality Reduction

Dimensionality Reduction using PCA

It is recommended to reduce the dimensionality of the feature vectors using Principal Component Analysis (PCA) for better performance and reduced memory usage.

Quantization for Memory Saving

Use quantization to convert the floating-point vectors into byte vectors, which helps in saving memory while still allowing for efficient similarity search.

Filter Option (e.g., Tags)

You can add a filter option such as tags (e.g., image categories) to narrow down the search results, reducing the number of images to search.


Testing

To test the search functionality, follow these steps:

  1. Run ElasticSearch with the connection details mentioned above.

  2. Launch the application by running:

    python app.py

This will start the search interface where you can test the image similarity search and semantic search features.


References

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published