Images often hold important memories, saved as JPG, JPEG, PNG, and other formats. However, we often end up with several copies of the same image saved under different names on our devices, and manually sorting through them is tedious, to say the least. To solve this, I created a web application called 'Redundancy Remover' using Python, Flask, HTML, CSS, and JavaScript. It takes several images as input, searches for similar images among them, discards the duplicate copies, and produces a zipped file containing a single copy of each uploaded image.
To detect similar images, I've used a hashing method developed specifically for images: Average Hashing.
The steps for identifying similar images using the Average Hashing algorithm are as follows:
- Reduce the image's size.
- Convert the image to grayscale.
- Calculate the mean pixel value of the whole image.
- Convert the entire image into binary bits, using the mean pixel value as the threshold.
- Construct the hash value from those bits.
- Compare the hash values of all the uploaded images.
- If similar hash values are found, a duplicate is detected; otherwise, the image is unique and is passed on to the user as output.
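The steps above can be sketched in a few lines of Python. This is a minimal illustration and not necessarily the code used in this repository: the function names (`load_gray_pixels`, `average_hash`, `hamming_distance`, `unique_names`), the 8x8 thumbnail size, and the `max_distance` threshold are all assumptions made for the example, and the image-loading helper assumes the Pillow library is installed.

```python
from typing import Dict, List, Sequence

HASH_SIZE = 8  # assumption: an 8x8 thumbnail, which yields a 64-bit hash


def load_gray_pixels(path: str, size: int = HASH_SIZE) -> List[int]:
    """Steps 1-2: shrink the image and convert it to grayscale (needs Pillow)."""
    from PIL import Image  # imported here so the rest runs without Pillow

    with Image.open(path) as img:
        small = img.convert("L").resize((size, size))
        return list(small.tobytes())  # mode "L" stores one byte (0-255) per pixel


def average_hash(pixels: Sequence[int]) -> int:
    """Steps 3-5: threshold each pixel against the mean and pack the bits."""
    mean = sum(pixels) / len(pixels)
    bits = 0
    for value in pixels:
        # Append a 1 bit for pixels brighter than the mean, otherwise a 0 bit.
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def unique_names(hashes: Dict[str, int], max_distance: int = 0) -> List[str]:
    """Steps 6-7: keep the first image of every group of matching hashes."""
    kept: List[str] = []
    for name, value in hashes.items():
        # An image is unique if it is far enough from every image kept so far.
        if all(hamming_distance(value, hashes[k]) > max_distance for k in kept):
            kept.append(name)
    return kept
```

With this sketch, calling `average_hash(load_gray_pixels(path))` for every uploaded file and passing the resulting name-to-hash dictionary to `unique_names` would return the file names to keep. Raising `max_distance` above 0 treats near-identical hashes as duplicates too, at the cost of more false matches.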
- After downloading this repository and opening it in any IDE, activate the Python virtual environment by typing the following in the terminal (the path shown is the Windows form):
env\Scripts\activate
By doing this, you won't need to install any external modules or packages, as everything required to run the program is available in the environment. These modules can be viewed in the env folder.
- Once you've done this, the environment name (in this case 'env') should be visible at the beginning of each new terminal line. Now, run the Python program to start the server:
python main.py
- Open your browser and type the following in the address bar:
localhost:5000
or
http://127.0.0.1:5000/