ThoughtLab: Think with us

Enriching metadata

MIMO-DB: MIMO Aggregator BackOffice - The MIMO-DB was developed as part of the MIMO Project (Musical Instrument Museums Online). This service is used by all partners to monitor the harvesting and enrichment of their metadata. It allows for multilingual search. The MIMO-DB is a professional front-end to an XML database using the new LIDO museum metadata model.

MIMO-DB features:

  • OAI harvesting management
  • metadata enrichment before ingestion, i.e. automated linking to
    • instrument keywords
    • instrument maker authorities
    • geographical locations using Geonames
  • enrichment ‘monitoring’ using reports
  • search and retrieval of LIDO-based musical instrument records
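
The OAI harvesting mentioned in the feature list follows the OAI-PMH protocol, in which a harvester issues ListRecords requests and follows resumption tokens until the list is complete. The sketch below only shows how such request URLs could be built. The base URL, the token value and the `lido` metadata prefix are assumptions for illustration, not the actual MIMO-DB configuration.

```python
from urllib.parse import urlencode

# Hypothetical repository endpoint; not a real MIMO partner URL.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In OAI-PMH, resumptionToken is an exclusive argument: when it is
    present, metadataPrefix must not be sent again.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    elif metadata_prefix is not None:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    else:
        raise ValueError("need metadata_prefix or resumption_token")
    return base_url + "?" + urlencode(params)


# First request names the metadata format; follow-up requests carry only the token.
first_page = list_records_url(BASE_URL, metadata_prefix="lido")
next_page = list_records_url(BASE_URL, resumption_token="token-123")
```

A real harvester would fetch each URL, read the resumption token from the response, and stop when the token is empty.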

See the MIMO-DB prototype.


For more information, or to give feedback, please contact: Rodolphe Bailly.


EuropeanaConnect Gazetteer - A rich data resource including over 9m geographic names, co-ordinates, and boundaries.

The EuropeanaConnect Gazetteer is a Geographical Information Service (GIS). This rich data resource gives service providers access to over 9 million geographic names, co-ordinates, and boundaries.

By enriching Europeana's metadata with these geographic references, it is possible to identify features such as continents, countries, cities, monuments and rivers contained in the objects on Europeana. The service also has a multilingual aspect. Users can search for a single term such as ‘London’ and retrieve results for objects marked with ‘Londres’ or ‘Londyn’.
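
The multilingual behaviour described above can be pictured as a lookup from any known name of a place to all of its other names. The following is a minimal sketch under that assumption. The data, the identifier `place:london` and the function names are hypothetical and do not reflect the Gazetteer's actual data model or interface.

```python
# Toy gazetteer: each place has one identifier and names in several languages.
# The identifier and the entries are illustrative, not real Gazetteer data.
PLACES = {
    "place:london": {"en": "London", "fr": "Londres", "pl": "Londyn"},
}


def build_name_index(places):
    """Map every case-folded name to the identifiers of the places that carry it."""
    index = {}
    for place_id, names in places.items():
        for name in names.values():
            index.setdefault(name.casefold(), set()).add(place_id)
    return index


def expand_query(term, places, index):
    """Return every known name of the places that match the search term."""
    variants = set()
    for place_id in index.get(term.casefold(), ()):
        variants.update(places[place_id].values())
    return sorted(variants)


INDEX = build_name_index(PLACES)
variants = expand_query("London", PLACES, INDEX)
```

A search front-end could then query for every variant, so that a search for one name also finds objects labelled with the others.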

Information in the EuropeanaConnect Gazetteer has been collected from free data sources, which means there are no legal constraints to its use and re-use. Information from additional data sources is also continuously integrated, ensuring the service is kept up to date.

See the prototype: The Europeana Gazetteer Prototype

For more information, or to give feedback, please contact: André Soares and Gilberto Pedrosa.

EuropeanaConnect Geoparser - A tool to extract place names from text.

This tool works with the EuropeanaConnect Gazetteer.

EuropeanaConnect Geoparser enables users to enter unstructured or partially structured text, such as metadata records, and then receive a list of the geographic features referred to in that text. The results can be returned as XML or HTML.
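
To illustrate the idea of turning free text into a list of geographic features, here is a minimal sketch that matches text against a small dictionary of place names and serialises the hits as XML. It is plain dictionary lookup, which is an assumption made for brevity and not the Geoparser's actual method. The gazetteer entries, element names and attribute names are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Toy gazetteer entries; names and coordinates are illustrative.
GAZETTEER = {
    "Lisbon": (38.72, -9.14),
    "Vienna": (48.21, 16.37),
}


def find_places(text, gazetteer):
    """Return (name, start offset) for each gazetteer name found in text."""
    hits = []
    for name in gazetteer:
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, text):
            hits.append((name, match.start()))
    # Report hits in the order they appear in the text.
    return sorted(hits, key=lambda hit: hit[1])


def to_xml(hits, gazetteer):
    """Serialise the hits as a small XML document (hypothetical schema)."""
    root = ET.Element("places")
    for name, offset in hits:
        lat, lon = gazetteer[name]
        ET.SubElement(root, "place", name=name, offset=str(offset),
                      lat=str(lat), lon=str(lon))
    return ET.tostring(root, encoding="unicode")


record = "Violin made in Vienna, later sold in Lisbon."
hits = find_places(record, GAZETTEER)
xml_output = to_xml(hits, GAZETTEER)
```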

See the prototype: The EuropeanaConnect Geoparser Prototype

For more information, or to give feedback, please contact: Nuno Freire.

MyStoryPlayer - MyStoryPlayer is an audiovisual annotation tool developed by the project ECLAP, the e-library for performing arts.

This tool has been developed to offer new possibilities for educational and ‘infotainment’ purposes.

MyStoryPlayer allows a user to interact with multimedia objects by annotating them with other audiovisual objects through temporal and logical relationships. A user can therefore connect different multimedia resources around a topic or an event by creating explicit relationships between them and play them synchronously within the same interface.

Each annotation created within the tool contains:

  • a text description, as in any other annotation tool
  • a link between two different media, which are related through a time relation.

The user can annotate an object using a resource as a whole (e.g. an image) or by using only a part of this resource (e.g. a few minutes of a video).
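
The annotation structure described above (a text description plus a time-related link between two media, each used as a whole or in part) could be modelled roughly as follows. This is a sketch only. The class names, field names, URIs and the relation label are hypothetical and are not taken from MyStoryPlayer itself.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MediaRef:
    """A whole resource, or a segment of it when start and end are given (seconds)."""
    uri: str
    start: Optional[float] = None
    end: Optional[float] = None

    def is_segment(self) -> bool:
        return self.start is not None and self.end is not None


@dataclass(frozen=True)
class Annotation:
    description: str   # free-text description
    source: MediaRef   # the media being annotated
    target: MediaRef   # the media attached to it
    relation: str      # time relation label (hypothetical vocabulary)


# Annotate three minutes of a video with a whole image.
note = Annotation(
    description="Same scene in a later staging",
    source=MediaRef("urn:example:video-1", start=60.0, end=180.0),
    target=MediaRef("urn:example:image-7"),
    relation="during",
)
```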

Once an object has been completely annotated, the user can play the different media in parallel and start comparison work. Users can play several related annotations, creating their own story by navigating among them. The stories are personal experiences that can be shared among users, which is useful both for educational purposes and for leisure.

MyStoryPlayer is based on a semantic database using an RDF ontology and can be used as a web application.

MyStoryPlayer tool page is available at: A stand alone version is accessible on

Videos providing more details on the tool can be played on ECLAP at:

For more information, or to give feedback, please contact Paolo Nesi

Stanbol - The Apache Stanbol project was initiated by the European R&D project IKS (Interactive Knowledge Stack).

Stanbol is designed to provide semantic services for existing content management systems. It offers the following features:

  • Content enhancement services that add semantic information to ‘non-semantic’ pieces of content.
  • Reasoning services that produce additional semantic information about the content based on the data retrieved via content enhancement.
  • Knowledge model services that are used to define and manipulate the data models (e.g., ontologies) for storing the semantic information.
  • Persistence services that store (or cache) semantic data (enhanced content, entities, facts) and make it searchable.
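
As an illustration of the content enhancement service, the sketch below prepares an HTTP request that posts plain text and asks for RDF in return. It only builds the request and does not send it. The host, port and `/enhancer` path describe a locally running instance and are assumptions here, as is the choice of Turtle as the response format. The sample sentence is illustrative.

```python
import urllib.request

# Assumed address of a locally running Stanbol instance.
ENHANCER_URL = "http://localhost:8080/enhancer"


def build_enhancer_request(text, url=ENHANCER_URL):
    """Prepare (but do not send) a POST request carrying plain text."""
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=utf-8",
            "Accept": "text/turtle",  # ask for the enhancements as RDF
        },
        method="POST",
    )


request = build_enhancer_request("Paris is the capital of France.")
# Sending would be: urllib.request.urlopen(request).read()
```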

More information on the technical details

Two prototypes using the Europeana data are available.
Note: we recommend the use of Firefox, Safari or Google Chrome to view the demos.


To use this demo, enter some text in the Stanbol Enhancer input frame. The text will be linked with the labels of pictures available from the Austrian National Library (ONB). The linking is based on the dataset from the ONB available in the Europeana Linked Open Data pilot.

Note that the demo only works with German text. As examples you can use the following texts:


The Hallo tool has been developed by the team responsible for VIE (Vienna IKS Editables – a JavaScript library). This demo uses text written in an HTML5 Rich Text Editor, where users can accept suggested annotations, triggering the insertion of pictures below the editable region.

You first have to click on the text and then on the ‘annotate’ icon. Click on one of the underlined words and choose your reference from Europeana. The image selected will bring you to the corresponding record on

For more information, or to give feedback, please contact Rupert Westenthaler.

Pundit - Pundit is an open source semantic annotation tool developed by Net7 that allows users to create structured data by annotating the web. Pundit not only enables users to annotate web pages in various ways (comments, citations, bookmarks…) but also to convert these annotations into semantically structured data that can later be integrated into the "Semantic Web".

The framework to represent the annotations is based on the Open Annotation model. The ability to express semantic relationships between resources relies on the use of ontologies and vocabularies that can be configured according to the needs of the communities using Pundit. The re-use of resources already available on the web provides solutions to challenges related to multilinguality or disambiguation. In Pundit each annotation is described by data structured in RDF which connects the annotated object (a manuscript, a video…) to other objects or entities available as Linked Data. Semantic relationships based on URIs are created between the different entities identified in a given web page. Pundit stores annotations on an RDF-based server which exposes APIs to consume annotations along with their structured content.
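
To make the idea of an annotation as RDF more concrete, here is a minimal sketch that expresses one annotation as three triples using Open Annotation terms (`oa:Annotation`, `oa:hasTarget`, `oa:hasBody`) and prints them in an N-Triples-like form. The annotation and target URIs are hypothetical, the Linked Data URI is only an example, and the layout is a simplification rather than Pundit's actual storage format.

```python
OA = "http://www.w3.org/ns/oa#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def annotation_triples(annotation_uri, target_uri, body_uri):
    """Describe one annotation linking a target (the annotated object) to a body."""
    return [
        (annotation_uri, RDF_TYPE, OA + "Annotation"),
        (annotation_uri, OA + "hasTarget", target_uri),
        (annotation_uri, OA + "hasBody", body_uri),
    ]


def to_ntriples(triples):
    """Serialise URI-only triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


triples = annotation_triples(
    "urn:example:annotation-1",                     # hypothetical annotation URI
    "urn:example:manuscript-42",                    # hypothetical annotated object
    "http://dbpedia.org/resource/Dante_Alighieri",  # example Linked Data entity
)
serialised = to_ntriples(triples)
```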

Pundit is also a collaborative platform which allows users to team up during the annotation creation process, to share their results and also re-use annotations created by other users. It therefore participates in the dialogue between scholars and in the creation of new knowledge. For more technical details on Pundit, see


The DM2E (Digitised Manuscripts to Europeana) project further develops Pundit in the context of humanities research based on the Europeana content and metadata. Users are now able to annotate and contextualise their objects by using the Europeana content and metadata. You can test the demos and install the bookmarklet at

For more information, or to give feedback, please contact Simone Fonda or the Pundit team

PATHSenrich – A web service prototype developed by the PATHS project allowing independent content providers to enrich their cultural heritage at item level.

The aim of the PATHS project is to enable exploration and discovery within cultural heritage collections. In order to support this, the project developed a range of enrichment techniques which augment these collections with additional information to enhance users' browsing experience. The semantic enrichment techniques developed in PATHS are summarised in this document, which details how these techniques are applied to the Europeana data. The demonstration system developed for PATHS that uses the content from Europeana is available here.

The full enrichment functionality is currently available for internal use only, but the University of the Basque Country (UPV/EHU) has developed a web service which provides a selected subset of it. At present, the service enriches any item provided to Europeana and described using the Europeana Data Model (EDM) with two types of information:

  • links to related items in the PATHS collection (a subset of Europeana)
  • links to related Wikipedia articles

    To use this web service, open the prototype and enter one cultural heritage item represented following the EDM, in the JSON format exported by the Europeana API v2.07 (a sample record is provided in the interface). Then click "Process" to get the output. The enrichment is performed by analysing the title and descriptions in the metadata associated with the item. The output is returned both as a list of human-readable URLs and as XML.
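
Since the enrichment works on the title and descriptions of an item, the first step can be pictured as pulling those fields out of the JSON record and joining them into one text. The sketch below assumes a simplified record with `title` and `dcDescription` lists. These field names and the sample values are assumptions for illustration and may not match the actual export of the Europeana API or the input expected by PATHSenrich.

```python
import json

# Simplified, hypothetical record; a real EDM JSON export is much richer.
SAMPLE = '{"title": ["Violin"], "dcDescription": ["Made in Cremona", "Spruce top"]}'


def text_for_enrichment(record_json):
    """Join the title and description values of one record into a single string."""
    record = json.loads(record_json)
    parts = record.get("title", []) + record.get("dcDescription", [])
    # Skip empty values and trim surrounding whitespace.
    return " ".join(part.strip() for part in parts if part.strip())


text = text_for_enrichment(SAMPLE)
```

The resulting text is what an enrichment step would then analyse to propose related items or Wikipedia articles.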

    Note that the service is currently a proof of concept and is not intended for extensive use. The LoCloud project is currently developing the web service further. The new project will aim at improving the quality of the enrichment (e.g. links to Wikipedia) and at providing additional functionality and alternative input/output formats.

    URL for the prototype:

    A paper presented at TPDL 2013 providing more details on the tool can be found at:

    For more information, or to give feedback, please contact Arantxa Otegi

    Contact us for Research and Development questions

    Antoine Isaac

    +31 70 3140979

    Robina Clayphan

    Valentine Charles

    +31 70 3140179