LeaderBoard for various CBIR models
Content-based Image Retrieval models.
Suggestions for new benchmark datasets or missing models are welcome.
Most entries are based on publicly published (peer-reviewed) results or reproducible results (with source code).
If your model is not yet published or open-sourced, I'll note that in the etc column.
Rank is based on Oxford105k (mid-scale image retrieval). :trophy:
model | oxf5k | par6k | oxf105k | par106k | holidays | yymm | ref | etc |
---|---|---|---|---|---|---|---|---|
🏆GeM(Res) | 87.8 | 92.7 | 84.6 | 86.9 | 93.9 | 17.11 | [GeM17] | |
GeM(VGG) | 87.9 | 87.7 | 83.3 | 81.3 | 89.5 | 17.11 | [GeM17] | |
R-MAC(Res,E2E) | 86.1 | 94.5 | 82.8 | 90.6 | 94.8 | 16.10 | [DeepIR16] | |
BoW(200k)+VV | 80.1 | 73.4 | 74.5 | 64.9 | - | 16.xx | [VV16] | HesAff+RootSIFT, HE, VBW, Top1000, 1-to-1 |
BoW(200k) | 76.2 | 71.2 | 66.4 | 60.2 | - | 16.xx | [VV16] | |
BoW(16M,L16)+FSM | 74.2 | 74.9 | 67.4 | 67.5 | 74.9 | 12.xx | [VW16M12] | HesAff+SIFT, 15 alt.words |
BoW(1M)+FSM | 66.4 | - | 54.1 | - | - | 07.xx | [FSM07] | HesAff+SIFT |
BoW(1M) | 61.8 | - | 49.0 | - | - | 07.xx | [FSM07] | |
- These results do not use Query Expansion (QE), Database Augmentation (DBA), or Spatial Verification.
- For BoW-based image retrieval systems, spatial verification is necessary to take spatial information into account, so I explicitly add the spatial verification method after a `+` symbol (e.g., FSM, VV).
- HesAff: Hessian-Affine keypoint detector. See [HesAff09]
- HE: Hamming Embedding (mitigates the quantization error of visual words). See [HE08]
- RootSIFT: a practical tip that gives a better representation for the L2 distance measure. See [RootSIFT12]
- VBW: Visual Burstiness Weighting (mitigates the dominance of repetitive patterns). See [VBW09]
- TopXXX: rerank the top XXX results with spatial verification.
- 1-to-1: enforces 1-to-1 correspondence with keypoint geometry. See [PGM15]
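
To make the RootSIFT note above concrete, here is a minimal sketch of the transform as commonly described: L1-normalize a (non-negative) SIFT descriptor, then take the element-wise square root. The function names `root_sift` and `l2_distance` and the `eps` parameter are illustrative assumptions, not code from [RootSIFT12] or from any of the benchmarked systems.

```python
import math


def root_sift(descriptor, eps=1e-12):
    """Map one non-negative SIFT descriptor to its RootSIFT form.

    Hypothetical helper for illustration: L1-normalize, then take the
    element-wise square root. `eps` only guards against division by
    zero for an all-zero descriptor.
    """
    total = sum(descriptor) + eps
    return [math.sqrt(v / total) for v in descriptor]


def l2_distance(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


if __name__ == "__main__":
    # Toy 4-D "descriptors"; real SIFT descriptors are 128-D.
    d1 = root_sift([1.0, 3.0, 0.0, 0.0])
    d2 = root_sift([2.0, 2.0, 0.0, 0.0])
    print(l2_distance(d1, d2))
```

Because each RootSIFT vector has unit L2 norm, comparing them with the ordinary L2 distance is equivalent to comparing the original descriptors with the Hellinger kernel, which is why the existing L2-based pipeline (k-means, nearest-neighbor search) can stay unchanged.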
[GeM17]: Fine-tuning CNN Image Retrieval with No Human Annotation by Filip Radenović, Giorgos Tolias, Ondřej Chum https://arxiv.org/abs/1711.02512
[DeepIR16]: End-to-end Learning of Deep Visual Representations for Image Retrieval by Albert Gordo, Jon Almazan, Jerome Revaud, Diane Larlus https://arxiv.org/abs/1610.07940
[VV16]: A Vote-and-Verify Strategy for Fast Spatial Verification in Image Retrieval by Johannes Lutz Schönberger, True Price, Torsten Sattler, Jan-Michael Frahm, Marc Pollefeys https://github.com/vote-and-verify/vote-and-verify
[RootSIFT12]: Three things everyone should know to improve object retrieval by Relja Arandjelović, Andrew Zisserman https://www.robots.ox.ac.uk/~vgg/publications/2012/Arandjelovic12/arandjelovic12.pdf
[PGM15]: Pairwise Geometric Matching for Large-scale Object Retrieval by Xinchao Li, Martha Larson, Alan Hanjalic https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Li_Pairwise_Geometric_Matching_2015_CVPR_paper.pdf
[VW16M12]: Learning Vocabularies over a Fine Quantization by Andrej Mikulík, Michal Perdoch, Ondřej Chum, Jiří Matas http://cmp.felk.cvut.cz/~perdom1/papers/mikulik_ijcv12.pdf
[HesAff09]: Efficient Representation of Local Geometry for Large Scale Object Retrieval by M. Perdoch, O. Chum, J. Matas http://cmp.felk.cvut.cz/~perdom1/hesaff/
[VBW09]: On the burstiness of visual elements by Herve Jegou, Matthijs Douze, Cordelia Schmid http://ieeexplore.ieee.org/abstract/document/5206609/
[HE08]: Hamming embedding and weak geometric consistency for large scale image search by Herve Jegou, Matthijs Douze, Cordelia Schmid https://hal.inria.fr/inria-00316866/document/
[FSM07]: Object retrieval with large vocabularies and fast spatial matching by James Philbin, Ondrej Chum, Michael Isard, Josef Sivic, Andrew Zisserman http://ieeexplore.ieee.org/document/4270197/
by Liang Zheng, Yi Yang, Qi Tian https://arxiv.org/abs/1608.01807
Image from "Efficient Diffusion on Region Manifolds: Recovering Small Objects with Compact CNN Representations" by Ahmet Iscen, Giorgos Tolias, Yannis Avrithis, Teddy Furon, Ondrej Chum
https://arxiv.org/abs/1611.05113
[ ] Add post-processed versions that include QE and diffusion.