- The paper rethinks text-based and image-based search by proposing a novel Region-Based Image Retrieval (RBIR) methodology.
- RBIR handles queries composed of semantic concepts (i.e., object categories) together with their spatial layout, addressing the relationship between what is queried and where it should appear.
- This matters because users often want to constrain their retrieval both semantically and spatially; the method leverages how the queried elements are arranged in space.
- Users can create, manipulate, and annotate bounding boxes on a canvas to specify their search intent, and a neural agent then automatically retrieves relevant images.
- Represent database images using pre-trained deep visual features
- Train a ConvNet model to synthesize a visual representation from the query
- Use the synthesized feature to retrieve the database images (a minimal retrieval sketch follows this list)
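A minimal sketch of the retrieval step under these assumptions: database features come from a pre-trained CNN, the query feature is produced by the trained synthesis network, and ranking uses cosine similarity. The similarity measure is an assumption, and `extract_features`, `synthesis_net`, and `canvas_to_grid` are hypothetical placeholders, not the paper's API.

```python
import numpy as np

def cosine_rank(query_feat, db_feats, top_k=10):
    """Rank database images by cosine similarity to the synthesized feature.

    query_feat : (D,)   feature synthesized from the canvas query.
    db_feats   : (N, D) pre-extracted deep features of the database images.
    """
    q = query_feat / np.linalg.norm(query_feat)
    db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
    sims = db @ q                       # (N,) cosine similarities
    order = np.argsort(-sims)[:top_k]   # indices of the best matches
    return order, sims[order]

# Usage (names are illustrative):
# db_feats   = extract_features(database_images)     # offline, pre-trained CNN
# query_feat = synthesis_net(canvas_to_grid(query))  # trained ConvNet
# top_ids, scores = cosine_rank(query_feat, db_feats)
```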
Specifically, the method first transforms the user's canvas query into a spatial-semantic representation in which each spatial location is associated with the semantic word vector (word2vec) of the concept placed there (a sketch of this transform follows). A Convolutional Neural Network (CNN) then synthesizes the appropriate visual feature from this representation.
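A sketch of this canvas-to-grid transform, assuming boxes arrive with normalized coordinates, an 8×8 grid, and 300-d word2vec embeddings; the grid size, embedding dimension, and `word_vec` lookup are illustrative assumptions, not values from the paper.

```python
import numpy as np

def canvas_to_grid(boxes, word_vec, grid_size=8, emb_dim=300):
    """Turn a canvas query into a spatial-semantic representation.

    boxes    : list of (x0, y0, x1, y1, label), coordinates in [0, 1].
    word_vec : dict mapping a concept label to its word2vec embedding.
    Returns a (grid_size, grid_size, emb_dim) array in which every cell
    covered by a box holds that concept's word vector (zeros elsewhere).
    """
    grid = np.zeros((grid_size, grid_size, emb_dim), dtype=np.float32)
    for x0, y0, x1, y1, label in boxes:
        c0, c1 = int(x0 * grid_size), int(np.ceil(x1 * grid_size))
        r0, r1 = int(y0 * grid_size), int(np.ceil(y1 * grid_size))
        grid[r0:r1, c0:c1] = word_vec[label]  # broadcast over covered cells
    return grid

# e.g. "sky on top, a car below":
# grid = canvas_to_grid([(0.0, 0.0, 1.0, 0.4, "sky"),
#                        (0.3, 0.6, 0.7, 0.9, "car")], word_vec)
```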
The network is trained with three loss functions: a similarity loss, a discriminative loss, and a ranking loss.
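One plausible instantiation of the three losses, assuming the similarity loss is an L2 distance between the synthesized and target image features, the discriminative loss is a cross-entropy term from a concept-classifier head, and the ranking loss is a triplet margin over matching and non-matching images. These specific forms are assumptions; the paper's exact formulations may differ.

```python
import torch.nn.functional as F

def spatial_semantic_losses(synth, pos, neg, logits, labels, margin=0.5):
    """Assumed forms of the three training losses (see caveat above).

    synth  : (B, D) features synthesized by the CNN from canvas queries.
    pos    : (B, D) features of matching database images.
    neg    : (B, D) features of non-matching database images.
    logits : (B, C) concept predictions from a classifier head on synth.
    labels : (B,)   ground-truth concept indices.
    """
    # Similarity: pull the synthesized feature toward the target image feature.
    sim_loss = F.mse_loss(synth, pos)
    # Discriminative: the synthesized feature should still predict its concepts.
    disc_loss = F.cross_entropy(logits, labels)
    # Ranking: matching images must score higher than non-matching ones.
    rank_loss = F.triplet_margin_loss(synth, pos, neg, margin=margin)
    return sim_loss + disc_loss + rank_loss
```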
Quantitative evaluation on the Visual Genome and MS-COCO datasets shows that the method outperforms the compared baseline methods.