DBpedia Spotlight looks in text for mentions of about 3.5 million things, either of unknown type or of one of about 320 known types, and tries to link each mention to its globally unique identifier in DBpedia.
Go to our Demonstration page, copy+paste some text and play with the parameters to see how it works.
You can use our demonstration Web Service directly from your application.
curl http://spotlight.sztaki.hu:2222/rest/annotate \
--data-urlencode "text=President Obama called Wednesday on Congress to extend a tax break
for students included in last year's economic stimulus package, arguing
that the policy provides more generous assistance." \
--data "confidence=0.35"
To get the response as JSON, add an Accept header:
curl http://spotlight.sztaki.hu:2222/rest/annotate \
--data-urlencode "text=President Obama called Wednesday on Congress to extend a tax break
for students included in last year's economic stimulus package, arguing
that the policy provides more generous assistance." \
--data "confidence=0.35" \
-H "Accept: application/json"
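For applications that do not shell out to curl, the same call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: the function names are hypothetical, and the response keys read in `extract_resources` ("Resources", "@surfaceForm", "@URI") are an assumption about the JSON layout that should be checked against what your server actually returns.

```python
import json
import urllib.parse
import urllib.request

# Endpoint taken from the curl examples above.
DEMO_ENDPOINT = "http://spotlight.sztaki.hu:2222/rest/annotate"


def build_annotate_request(text, confidence=0.35, endpoint=DEMO_ENDPOINT):
    """Build (but do not send) a form-encoded POST mirroring the JSON curl call."""
    body = urllib.parse.urlencode({"text": text, "confidence": confidence})
    return urllib.request.Request(
        endpoint,
        data=body.encode("utf-8"),
        headers={"Accept": "application/json"},
        method="POST",
    )


def extract_resources(payload):
    """Return (surface form, URI) pairs from a decoded JSON response.

    Assumption: annotations are listed under "Resources", each with
    "@surfaceForm" and "@URI" keys. A payload without that list yields [].
    """
    resources = payload.get("Resources", [])
    return [(r.get("@surfaceForm"), r.get("@URI")) for r in resources]


def annotate(text, confidence=0.35, endpoint=DEMO_ENDPOINT, timeout=30):
    """Send the request and return the extracted (surface form, URI) pairs."""
    request = build_annotate_request(text, confidence, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_resources(json.load(response))
```

Calling `annotate("President Obama called Wednesday on Congress ...")` sends the request over the network. Building the request separately from sending it makes it easy to point the same code at a different endpoint by passing `endpoint=...`.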
If you need service reliability and lower response times, you can run DBpedia Spotlight on your own in-house server. Just download a model and Spotlight from here to get started.
wget http://spotlight.sztaki.hu/downloads/dbpedia-spotlight-latest.jar
wget http://spotlight.sztaki.hu/downloads/latest_models/en.tar.gz
tar xzf en.tar.gz
java -jar dbpedia-spotlight-latest.jar en http://localhost:2222/rest
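A script that starts the server and then sends requests may want to wait until the port is accepting connections. The following is a minimal sketch of such a check, assuming the server listens on localhost port 2222 as in the command above. The helper name `wait_for_port` is hypothetical, and an open TCP port is only a rough signal: it does not prove that the REST endpoint is ready to answer.

```python
import socket
import time


def wait_for_port(host="localhost", port=2222, timeout=60.0, interval=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Refused or timed out: retry until the deadline passes.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    if wait_for_port():
        print("Port 2222 is open; the server appears to be up.")
    else:
        print("Gave up waiting for port 2222.")
```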
Models and raw data for most languages are available here.
If you use DBpedia Spotlight in your research, please cite the following paper:
@inproceedings{isem2013daiber,
title = {Improving Efficiency and Accuracy in Multilingual Entity Extraction},
author = {Joachim Daiber and Max Jakob and Chris Hokamp and Pablo N. Mendes},
year = {2013},
booktitle = {Proceedings of the 9th International Conference on Semantic Systems (I-Semantics)}
}
All the original code produced for DBpedia Spotlight is licensed under the Apache License, 2.0. Some modules have dependencies on LingPipe under the Royalty Free License. Some of our original code (currently) depends on GPL-licensed or LGPL-licensed code and is therefore also GPL or LGPL, respectively. We are currently cleaning up the dependencies to release two builds, one purely GPL and one purely Apache License, 2.0.
The documentation on this website is shared under the Creative Commons Attribution-ShareAlike 3.0 Unported License.
More information on citation and how to cite the deprecated Lucene version can be found here.
More documentation is available from the DBpedia Spotlight wiki.
Check the FAQ here.