Spatial-Semantic Image Search by Visual Feature Synthesis

  • The paper attempts to rethink text-based and image-based search by proposing a novel Region-Based Image Retrieval (RBIR) methodology.
  • RBIR handles queries composed of semantic concepts (i.e., object categories) together with their spatial layout.
  • The motivation is that users often want to constrain their retrieval both semantically and spatially.
  • The method therefore leverages not only which elements appear in the query but also how they are arranged spatially.
  • Users can create, manipulate, and annotate bounding boxes on a canvas to specify their search intent, and a neural model then retrieves relevant images automatically.

Visual Feature Synthesis

  • Represent database images using pre-trained deep visual features
  • Train a ConvNet model to synthesize visual representation from the query
  • Use the synthesized feature to retrieve the database images
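
The last step above (retrieving database images with the synthesized feature) can be sketched as a nearest-neighbour lookup. This is a minimal illustration, not the paper's implementation: the use of cosine similarity, the function names `l2_normalize` and `retrieve`, the feature dimension, and the random stand-in features are all assumptions made for the example.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)

def retrieve(query_feature, database_features, top_k=5):
    """Rank database images by cosine similarity to the query feature.

    query_feature:     (D,) vector, standing in for the synthesized feature.
    database_features: (N, D) matrix, standing in for pre-computed image features.
    Returns the indices of the top_k matches and their similarity scores.
    """
    q = l2_normalize(query_feature)
    db = l2_normalize(database_features)
    scores = db @ q                       # (N,) cosine similarities
    order = np.argsort(-scores)[:top_k]   # highest similarity first
    return order, scores[order]

# Toy data: random vectors replace real deep features (assumption).
rng = np.random.default_rng(0)
database_features = rng.normal(size=(100, 64))
# Pretend the network synthesized a feature very close to image 42.
query_feature = database_features[42] + 0.01 * rng.normal(size=64)

indices, scores = retrieve(query_feature, database_features, top_k=3)
print(indices, scores)
```

Because the database features are computed once, only the query feature has to be synthesized at search time; the ranking itself is a single matrix-vector product.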

Specifically, the method transforms the user's canvas query into a spatial-semantic representation in which each spatial location is associated with the semantic word vector (word2vec) of the concept placed there. A Convolutional Neural Network (CNN) then synthesizes the appropriate visual feature from this representation. The network is trained with three loss functions: a similarity loss, a discriminative loss, and a ranking loss.
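
To make the spatial-semantic representation concrete, here is a minimal sketch of how a canvas query could be turned into a grid of word vectors. It is an illustration under stated assumptions, not the paper's code: the function name `build_query_canvas`, the grid resolution, the normalized box coordinates, the averaging of overlapping boxes, and the 3-dimensional toy vectors (in place of real word2vec embeddings) are all hypothetical.

```python
import numpy as np

def build_query_canvas(boxes, word_vectors, grid_size=8):
    """Rasterize labelled bounding boxes into a (grid, grid, dim) tensor.

    boxes:        list of (label, (x0, y0, x1, y1)) with coordinates in [0, 1].
    word_vectors: dict mapping each label to its embedding vector.
    Cells covered by a box hold that label's vector; cells covered by several
    boxes hold the average (an assumption); uncovered cells stay zero.
    """
    dim = len(next(iter(word_vectors.values())))
    canvas = np.zeros((grid_size, grid_size, dim), dtype=np.float32)
    counts = np.zeros((grid_size, grid_size, 1), dtype=np.float32)
    for label, (x0, y0, x1, y1) in boxes:
        # Convert normalized coordinates to grid cell ranges.
        c0, c1 = int(np.floor(x0 * grid_size)), int(np.ceil(x1 * grid_size))
        r0, r1 = int(np.floor(y0 * grid_size)), int(np.ceil(y1 * grid_size))
        canvas[r0:r1, c0:c1] += word_vectors[label]
        counts[r0:r1, c0:c1] += 1
    return canvas / np.maximum(counts, 1)

# Toy embeddings standing in for word2vec vectors (assumption).
word_vectors = {
    "dog": np.array([1.0, 0.0, 0.0], dtype=np.float32),
    "person": np.array([0.0, 1.0, 0.0], dtype=np.float32),
}
# Query: a dog in the bottom-left quarter, a person filling the right half.
boxes = [
    ("dog", (0.0, 0.5, 0.5, 1.0)),
    ("person", (0.5, 0.0, 1.0, 1.0)),
]
canvas = build_query_canvas(boxes, word_vectors, grid_size=4)
print(canvas.shape)
```

A tensor of this shape is the kind of input a CNN could consume to synthesize the visual feature, since it keeps both what each concept is (the vector) and where it sits (the grid position).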

Quantitative evaluation on the Visual Genome and MS-COCO datasets shows that the method outperforms the other methods it is compared against.