Word Spotting by Example

Our algorithm relies on Document-oriented Local Features (DoLF) [Zagoris2017, Zagoris2014], which capture information around representative keypoints, and on a matching process that incorporates spatial context in a local proximity search without using any training data. It also introduces a distance algorithm that incorporates spatial context and is employed under both segmentation-based and segmentation-free scenarios.

The main novelties of the above approach are:

  1. Use of local features that take into account the particularities of handwritten documents. As a result, the method detects the meaningful points of the characters in a document independently of their scale.
  2. Consistency across different handwriting variations.
  3. Use of the same operational pipeline in both segmentation-based and segmentation-free scenarios.
  4. Incorporation of spatial context in the local search of the matching process.
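
To make item 4 more concrete, the Python sketch below shows one way a local proximity search can use spatial context: each query keypoint is compared only against document keypoints that lie near its expected position relative to a candidate origin. It is an illustration under stated assumptions, not the DoLF implementation. The function name, the tuple layout, the Euclidean descriptor distance, and the `radius` parameter are all hypothetical.

```python
import math


def proximity_match_score(query_points, doc_points, doc_origin, radius=10.0):
    """Score a candidate origin by matching keypoints only in a local neighbourhood.

    query_points: list of ((dx, dy), descriptor), offsets relative to the query origin.
    doc_points:   list of ((x, y), descriptor) in document coordinates.
    doc_origin:   (x, y) candidate position of the query origin in the document.
    Returns (mean_descriptor_distance, matched_count); the mean is math.inf
    when no query keypoint finds a neighbour within `radius`.
    """
    total, matched = 0.0, 0
    for (dx, dy), q_desc in query_points:
        # Spatial context: where this keypoint should be if the word starts here.
        expected = (doc_origin[0] + dx, doc_origin[1] + dy)
        best = None
        for pos, d_desc in doc_points:
            if math.dist(pos, expected) <= radius:
                # Assumed similarity measure: Euclidean distance between descriptors.
                dist = math.dist(q_desc, d_desc)
                if best is None or dist < best:
                    best = dist
        if best is not None:
            total += best
            matched += 1
    if matched == 0:
        return math.inf, 0
    return total / matched, matched
```

A lower mean distance with a higher matched count suggests a better candidate. How the two are combined into a final ranking is left open here.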

The segmentation-free operational pipeline:

  1. The query image with its local points, the central location (shown in magenta), and its nearest keypoint (shown in orange), alongside the document image.
  2. The candidate local points for the document coordinate origin, and the multiple instances of word boundaries around each candidate coordinate origin.
  3. The multiple word detections and the final result (green denotes the most similar word).

An implementation of the proposed keyword spotting method as a recommender system for a transcription process is available at http://vc.ee.duth.gr/ws [Zagoris2015].

A more efficient matching procedure that uses the DoLF local points is available at http://orpheus.ee.duth.gr/word-spotting-demonstrator/

Usage

WordSpottingByExample 1.0.0.0


  indexing     Indexing Directory

  retrieval    Retrieve Word

  help         Display more information on a specific command.

  version      Display version information.

Indexing Verb Usage

WordSpottingByExample 1.0.0.0
USAGE:
Indexing example:
  WordSpottingByExample indexing --imageformat jpg C:\WordImages .\dataset.sqlite

  -i, --imageformat                      (Default: png) Image Extension

  --help                                 Display this help screen.

  --version                              Display version information.

  ImagesDirectory (pos. 0)               Required. The directory path that contains the document images for indexing

  Output SQLite Dataset File (pos. 1)    Required. The output SQLite file that contains the dataset info
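
The layout of the output SQLite file is not documented here. One way to explore it without assuming any table names is to list its tables and row counts with Python's standard `sqlite3` module. The helper name `list_tables` is hypothetical, and the file is opened read-only so that a mistyped path does not create an empty database.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: row_count} for every table in the SQLite file."""
    # Open read-only through a file: URI; no schema is assumed.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
```

Calling `list_tables("dataset.sqlite")` after an indexing run gives a quick check that the dataset was populated.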

Retrieval Verb Usage

WordSpottingByExample 1.0.0.0
USAGE:
Locate a word:
  WordSpottingByExample retrieval .\query-word.png .\dataset.sqlite Results.xml

  -i, --imageformat          (Default: png) Image Extension

  --help                     Display this help screen.

  --version                  Display version information.

  QueryImagePath (pos. 0)    Required. The query word image path. It must be a directory or a file

  Dataset (pos. 1)           Required. The SQLite database file

  Results (pos. 2)           Required. The XML Retrieval Results File. It follows the H-KWS2014 XML Format. Download
                             Evaluation Tool from <https://vc.ee.duth.gr/H-KWS2014/#VCGEval>
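
Both verbs can also be driven from a script. The Python sketch below only builds argument lists from the options shown in the help text above. It assumes the executable is reachable on `PATH` under the name `WordSpottingByExample`, and the helper names are hypothetical.

```python
import subprocess

EXE = "WordSpottingByExample"  # assumption: the executable is on PATH under this name


def indexing_command(images_dir, dataset_path, image_format="png"):
    """Argument list for the `indexing` verb, mirroring the usage example."""
    return [EXE, "indexing", "--imageformat", image_format,
            str(images_dir), str(dataset_path)]


def retrieval_command(query_path, dataset_path, results_path, image_format=None):
    """Argument list for the `retrieval` verb; --imageformat is added only if given."""
    command = [EXE, "retrieval"]
    if image_format is not None:
        command += ["--imageformat", image_format]
    return command + [str(query_path), str(dataset_path), str(results_path)]


def run(command):
    """Run one command and raise CalledProcessError on a non-zero exit code."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Index a directory of jpg images, then locate one query word in it.
    run(indexing_command(r"C:\WordImages", r".\dataset.sqlite", "jpg"))
    run(retrieval_command(r".\query-word.png", r".\dataset.sqlite", "Results.xml"))
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.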

References

[Zagoris2014] K. Zagoris, I. Pratikakis and B. Gatos, "Segmentation-Based Historical Hand-written Word Spotting Using Document-Specific Local Features," 2014 14th International Conference on Frontiers in Handwriting Recognition, Heraklion, 2014, pp. 9-14.

[Zagoris2015] K. Zagoris, I. Pratikakis, and B. Gatos, "A framework for efficient transcription of historical documents using keyword spotting," in Historical Document Imaging and Processing (HIP'15), 3rd International Workshop on, August 2015, pp. 9–14.

[Zagoris2017] K. Zagoris, I. Pratikakis, and B. Gatos, "Unsupervised Word Spotting in Historical Handwritten Document Images using Document-oriented Local Features," Transactions on Image Processing, 2017. Under review.
