-
-
Notifications
You must be signed in to change notification settings - Fork 176
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Why top 8x8 pixels of the Matrix are selected at PerceptualHash? #52
Comments
I don't think only the first 8x8 is taken. The perceptual hash one is not fully ready yet. The others work pretty well. |
What I understand is: The whole 32x32 matrix has to be calculated *(NOT TRUE, findings in next post) to produce good results, and the first 8x8 also is gives the best results, according to my empiric testing, not because I understand how the matrix is done (I'm still reading about Discrete Cosine Transformations to figure it out). I was trying to figure out if it’s worth to store more "hash", from the matrix, if the pixels beyond the 8x8 could be relevant, since they are already calculated. What’s lacking to the perceptual hash to be ready? I believe according to my tests that is the best performer. I'm probably going to use it, actually I might integrate it to my app within a couple of days. |
I was not seeing why the whole 32*32 matrix was calculated, and before I checked incorrectly that was necessary (at least for the code to produce the same results), so I tried again. In short, now I'm producing only a 8x8 matrix, which gives the same exact final hash result, AND I reduced the total execution time by a +75%, from 210ms to 50ms for an image, or 12 sec to 2,8 sec for my entire benchmark test. I'm just doing : After this findings I'm not confident on how the DCT matrix is done. UPDATE: I can confirm the DCT is correct, I test it with an exemple ->" "on wikipedia and I got the correct results, exemple column A I got (you have to add colors r+g+b raw):
|
Well I did some heat maps of the matrix and it showed expected result, so I'm guessing the DCT is working ok. Its supposed to give a top-let to bottom-right in diagonal result. Also note that the first bit seems to be always 1, I don't know if theoretically but in practice, so it can be discarded, or changed by a 0 making it easier to handle on software without unsigned 64bit integer, or add the bit "65", so move it one place. As I said the order of the matrix is relevant, it is not the most optimal to just pick the first 8x8 from the 32x32 image, the reason for the 8x8 in the first place, is to get the lower frequencies of the DCT matrix (maybe discarding the super low) because the higher ones are not good for our intensions. So with that in mind, if the matrix frequency advances diagonally it only makes sense to order it diagonally, this will provide:
The green and yellow heatmap 32x32 image, is the one I made. |
|
My initial version of the phash was a port of some existing code I found. Looks like you got quite a good understanding of how the algorithm works! Do you see code improvements you could do? |
Apart from the already commented (the unnecessary calculations, the diagonal order of the final hash, and discarding the first bit), now I'm looking at the median/average method: I’m trying to think of solution that would provide more entropy without damaging the efficacy of finding similar images. That or use average, but for that I will need to increase the size of my benchmark to get more solid decisions. |
Well I integrated it and I realized it was slower than I thought, when I looked it in comparison with the other image manipulations that my application does, it was a lot. |
PD: finally discarted |
@smithtek, could you please share the code and explain how to use it? |
Hi, which part? I finally discarted the "inverse multiplication table", I was not sure of it, and on new benchmaks it performed worse. If you are talking about the "pre-calculated", yes I could share, but its just calculating in advance to gain efficiency. Other than that with the fragments I shared you can make the same system, I guess I could paste it all toghether, but keep in mind I changed some things from your code. |
Here is a compilation of everything, you may need to make some small changes to work with your existing code. At function __construct() you add:
You precalcule "manually" this array with the following, you don't need it after that, I would add it as a comment, in case somebody wants to calculate 16 or 64:
Then on the for loops, keep in mind I'm using imagick directly.
Then the two functions you need, the diagonalMatrix() is already pasted on a previous comment, and the calculateDCT():
|
@smithtek please review #58 if you have time. I applied all your suggestions and made a separate implementation w/ them. |
Generated DCTs looks good, I checked. I probably won't have time to test your implementation. |
…rdex See issue jenssegers#52 Squashed commit of the following: commit 39baec8 Author: Anatoly Pashin <anatoly.pashin@gmail.com> Date: Thu Jul 9 12:45:26 2020 +1000 Fix cs commit aa31683 Author: Anatoly Pashin <anatoly.pashin@gmail.com> Date: Thu Jul 9 12:45:17 2020 +1000 Fix cs commit df4555d Author: Anatoly Pashin <anatoly.pashin@gmail.com> Date: Thu Jul 9 12:40:40 2020 +1000 Fix typo commit b422ea4 Author: Anatoly Pashin <anatoly.pashin@gmail.com> Date: Thu Jul 9 12:39:20 2020 +1000 Add PerceptualHash2
@smithtek Sorry, but why didn't you just create a pull request with all the changes and instead describe all your steps here? |
I'm currently testing the PR made by @b1rdex using the code from @smithtek, I will report any progress here. |
FYI, we use #58 in production since I created the PR for 2-5M images. The number is dependent on the currently active goods number available to buy. We haven't detected any issues with it. |
#58 seems to work really well, it seems to find more meaningfull duplicates up until a distance of 10 and sometimes even more. Take for example: Still, here's a funny duplicate: At higher distances, it still finds some similar pictures but consistency tends to drop around 11~12: I think #58 is really good as-is, no need for additional improvements. Additional thoughts in the discussion concerning collisions here #27. |
The matrix is 32x32, are the first 8x8 ordered in any relevant way?
Also which hash is performing better for you?
The text was updated successfully, but these errors were encountered: