
DBpedia Spotlight

DBpedia Spotlight is a tool for annotating mentions of DBpedia resources in text, which provides a way to link unstructured information sources to the Linked Open Data cloud through DBpedia. It works on text only and offers three functions: annotate, disambiguate and candidates (find candidate DBpedia resources).

DBpedia Spotlight can be accessed in two ways. The web application, which is a demo client, lets you enter text and creates an HTML version of it that includes the DBpedia annotations. Alternatively, the Scala/Java API or the REST web service can be used to annotate and/or disambiguate entities in text. Two further APIs are available: the Annotation Java/Scala API, which exposes the underlying logic that performs the annotation and disambiguation, and the Indexing Java/Scala API, which executes the data processing needed to enable the annotation and disambiguation algorithms. More technical information can be found in the User Manual or the Technical Documentation.

A local installation on one's own web server is also possible; the necessary downloads can be found on DBpedia Spotlight's main page. The program can be used under the terms of the Apache License, 2.0. Part of the code uses LingPipe under the Royalty Free License, so that license also applies to the output of the currently deployed web service.
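To make the REST access route more concrete, the following is a minimal Python sketch of an annotate call. It is not taken from the DBpedia Spotlight documentation cited here: the endpoint URL, the query parameter names (text, confidence, support) and the JSON field names (Resources, @surfaceForm, @URI) are assumptions about the public service and should be checked against the User Manual of the deployment you use. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumed endpoint; replace it with the URL of the deployment you actually use.
ENDPOINT = "https://api.dbpedia-spotlight.org/en/annotate"


def build_annotate_request(text, confidence=0.4, support=20, endpoint=ENDPOINT):
    """Build (but do not send) a GET request for the annotate function.

    The parameter names are assumptions about the service interface.
    """
    query = urlencode({"text": text, "confidence": confidence, "support": support})
    return Request(f"{endpoint}?{query}", headers={"Accept": "application/json"})


def extract_resources(payload):
    """Return (surface form, DBpedia URI) pairs from a decoded JSON response.

    The field names are assumed; an empty or unexpected payload yields [].
    """
    return [(r["@surfaceForm"], r["@URI"]) for r in payload.get("Resources", [])]


def annotate(text, **kwargs):
    """Send the request and return the annotated resources (needs network access)."""
    with urlopen(build_annotate_request(text, **kwargs), timeout=30) as response:
        return extract_resources(json.load(response))
```

Separating request construction from sending keeps the sketch testable offline; only annotate() touches the network.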

We tested the DBpedia Spotlight demo with the English demo text about the Berlin Cathedral, as the demo only works in English. We set the parameters as described in the user's manual: confidence 0.4; contextual score 0.0; prominence (support) 20; no common words; document-centric; and show n-best candidates. The result is shown in Figure 6. Interestingly, the tool found more concepts than are linked in Wikipedia, but not all of them were identified correctly. For example, Wikipedia recognizes the Evangelical Church of Berlin-Brandenburg-Silesian Upper Lusatia as one concept, whereas the demo splits it into separate parts. When the support parameter is changed, fewer entities are detected, but the precision of the Wikipedia links is still not reached.
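The effect of the confidence and support thresholds can be illustrated with a small Python sketch. It assumes that both parameters act as simple lower bounds on each candidate, which is a simplification and not the tool's actual implementation. The candidate records, their URIs and their numbers are hypothetical placeholders, not results from our test.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    surface_form: str  # text span that was spotted
    uri: str           # placeholder resource URI (hypothetical)
    support: int       # prominence of the resource (hypothetical value)
    score: float       # disambiguation confidence in [0, 1] (hypothetical value)


def filter_candidates(candidates, min_support=20, min_confidence=0.4):
    """Keep only candidates that meet both thresholds (assumed semantics)."""
    return [
        c for c in candidates
        if c.support >= min_support and c.score >= min_confidence
    ]


# Hypothetical input, chosen only to show the filtering behaviour.
candidates = [
    Candidate("A", "http://example.org/resource/A", support=150, score=0.9),
    Candidate("B", "http://example.org/resource/B", support=5000, score=0.6),
    Candidate("C", "http://example.org/resource/C", support=15, score=0.5),
    Candidate("D", "http://example.org/resource/D", support=300, score=0.3),
]

default_result = filter_candidates(candidates)                     # keeps A and B
stricter_result = filter_candidates(candidates, min_support=1000)  # keeps only B
```

Under these assumptions, raising the support threshold can only shrink the result set, which matches the observation that fewer entities are detected. It does not by itself repair a concept that was split into parts.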

Figure 6. DBpedia Spotlight Demo

Currently, a web service for DBpedia Spotlight is offered only in English. However, since the tool is based on Wikipedia, the DBpedia Spotlight software could be used to build a service for any language that has a Wikipedia. The most basic features of the tool require only minor changes; the more NLP-intensive features require a few more. The DBpedia Spotlight team has also planned and prototyped a function that uses graph-structured metadata for disambiguation, but it is not yet finished.

References:
Mendes, P. (2012). Personal communication with Marlies Olensky.
Mendes, P.N., Jakob, M., García-Silva, A. & Bizer, C. (2011). DBpedia Spotlight: Shedding Light on the Web of Documents. In Proceedings of the 7th International Conference on Semantic Systems (I-Semantics). Graz, Austria, 7–9 September 2011. Retrieved from http://www.wiwiss.fu-berlin.de/en/institute/pwo/bizer/research/publications/Mendes-Jakob-GarciaSilva-Bizer-DBpediaSpotlight-ISEM2011.pdf (19.01.2012)
Mendes, P.N., Jakob, M., Daiber, J. & Bizer, C. (2011). DBpedia Spotlight. Retrieved from http://dbpedia.org/spotlight (03.02.2012)
