Europeana enriches its data with the AAT

By Valentine Charles and Cécile Devarenne, Europeana Foundation

Last March, the Getty Research Institute announced the release of its Art and Architecture Thesaurus (AAT) as Linked Open Data. This release opened up many opportunities for Europeana.

AAT is a rich, structured and multilingual vocabulary including terms, descriptions, and other information for generic concepts related to art, architecture, other types of cultural heritage and conservation.

AAT has always been an important resource for Europeana's data providers, especially museums. However, until now Europeana was not in a position to exploit it: firstly because the vocabulary was not openly available, and secondly because Europeana did not have the technical means to do so.

The implementation of the Europeana Data Model (EDM) was the first step towards the re-use of widespread vocabularies such as AAT. EDM embraces the principles of the Semantic Web and can therefore be integrated seamlessly with a network of vocabularies at a semantic level. EDM supports contextual resources, the so-called ‘semantic layer’, including concepts from ‘value vocabularies’ such as thesauri, authority lists and classifications, coming either from the network of Europeana's providers or from third-party data sources. Since EDM is geared towards re-using existing semantic resources, the publication of AAT as Linked Open Data was an opportunity to seize.

Published as Linked Open Data, AAT offers:

  • an unambiguous reference to a controlled, trustable subject representation
  • machine-readable access to multilingual labels
  • machine-readable access to semantic relationships

Europeana developed a small in-house enrichment tool to ‘dereference’ the AAT URIs, i.e. to fetch all the multilingual and semantic data attached to AAT concepts from the centralised open Getty linked data service. This is straightforward, because the AAT linked data are represented with SKOS, which is also the model EDM re-uses for describing concept data (cf. EDM documentation for more details). In the meantime, several Europeana partners were contacted and asked to re-submit their data in EDM, replacing their old AAT labels (provided as simple text strings) with the new AAT URIs in the EDM fields. These URIs are identifiers that give access, over the web, to the data representing the AAT concepts. Each AAT URI consists of a base to which the identifier of the concept is appended. The AAT URIs were dereferenced when the objects referring to them were ingested into Europeana, bringing in additional AAT data such as label translations.
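As a rough illustration of this dereferencing step, the sketch below builds a concept URI from a base and an identifier, then collects the language-tagged skos:prefLabel values from an N-Triples document. It is not Europeana's enrichment tool. The base `http://vocab.getty.edu/aat/`, the identifier `300000000` and the inline sample triples are assumptions made for illustration, and no network request is performed; a real tool would retrieve the RDF from the Getty service and would likely use a full RDF parser.

```python
import re

# Assumed base for AAT concept URIs (illustrative; check the Getty documentation).
AAT_BASE = "http://vocab.getty.edu/aat/"
SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel"

# Matches an N-Triples line whose object is a language-tagged literal.
_LABEL_TRIPLE = re.compile(
    r'^<([^>]+)>\s+<([^>]+)>\s+"((?:[^"\\]|\\.)*)"@([A-Za-z0-9-]+)\s*\.\s*$'
)


def aat_uri(concept_id):
    """Append the concept identifier to the base to form the concept URI."""
    return AAT_BASE + str(concept_id)


def pref_labels(ntriples, concept_uri):
    """Return {language: label} for the skos:prefLabel triples of one concept."""
    labels = {}
    for line in ntriples.splitlines():
        match = _LABEL_TRIPLE.match(line.strip())
        if not match:
            continue  # not a language-tagged literal (e.g. a link to another concept)
        subject, predicate, text, lang = match.groups()
        if subject == concept_uri and predicate == SKOS_PREF_LABEL:
            labels[lang.lower()] = text
    return labels


# Hypothetical identifier and hand-written sample data standing in for the
# response of the linked data service.
SAMPLE_ID = "300000000"
SAMPLE_NT = "\n".join([
    f'<{aat_uri(SAMPLE_ID)}> <{SKOS_PREF_LABEL}> "astronomy"@en .',
    f'<{aat_uri(SAMPLE_ID)}> <{SKOS_PREF_LABEL}> "astronomie"@nl .',
    f'<{aat_uri(SAMPLE_ID)}> <http://www.w3.org/2004/02/skos/core#broader> '
    f'<{aat_uri("300000001")}> .',
])

labels = pref_labels(SAMPLE_NT, aat_uri(SAMPLE_ID))
print(labels)
```

The skos:broader triple in the sample is skipped by the label extraction, but it is the kind of semantic relationship that dereferencing also brings in alongside the translations.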

The following collections were re-published in the Europeana portal:

In Europeana, enrichments are visible in the portal display, as shown in the object below. The AAT URIs can be seen in certain metadata fields such as Format and Type. All multilingual labels fetched from the AAT linked data service are displayed in the foldout ‘Auto-generated tags’ area.

The enrichment of Europeana data with AAT data offers strong potential for the development of multilingual services (cf. Final report of the EuropeanaTech Task Force on Multilingual and Semantic Enrichment Strategy). It nicely complements the data of other open and multilingual vocabularies that Europeana encourages providers to link to, such as GND, Iconclass, VIAF or any domain vocabulary following the EDM recommendations for metadata on contextual resources (see these examples of objects enriched with the MIMO or the Partage Plus vocabularies or GND or Iconclass or VIAF). In particular, Europeana is able to translate terms into the interface language selected by a user, based on the translations provided by the vocabularies. In the example above, if the language of the interface is switched to Dutch, the term ‘astronomy’ is translated to ‘astronomie’ based on the labels of

Note that the benefits of using AAT also extend to cases where objects have been described with an ad hoc, local vocabulary, provided that this vocabulary has been mapped to AAT by its creator or a third party. Europeana would then exploit semantic relations and translations across a network of vocabularies, pooling all the contextual data together.
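The idea can be sketched as follows: a term from a local vocabulary is first resolved to its mapped AAT concept, and the label is then chosen according to the interface language. This is only a minimal sketch under stated assumptions, not Europeana's implementation. The URIs, the mapping table, the function names and the fallback to English are all hypothetical; in real data the mapping would typically be expressed with a SKOS mapping property such as skos:exactMatch.

```python
from typing import Dict, Optional

# Hypothetical labels, as they might look after dereferencing an AAT concept.
AAT_LABELS: Dict[str, Dict[str, str]] = {
    "http://vocab.getty.edu/aat/300000000": {
        "en": "astronomy",
        "nl": "astronomie",
    },
}

# Hypothetical mapping from a local vocabulary term to an AAT concept
# (the kind of link skos:exactMatch would express).
LOCAL_TO_AAT: Dict[str, str] = {
    "http://example.org/vocab/sterrenkunde": "http://vocab.getty.edu/aat/300000000",
}


def resolve_concept(uri: str) -> str:
    """Follow the local-to-AAT mapping if one exists, else keep the URI."""
    return LOCAL_TO_AAT.get(uri, uri)


def display_label(uri: str, ui_lang: str, fallback: str = "en") -> Optional[str]:
    """Pick the label for the interface language, falling back to `fallback`."""
    labels = AAT_LABELS.get(resolve_concept(uri), {})
    return labels.get(ui_lang) or labels.get(fallback)


print(display_label("http://example.org/vocab/sterrenkunde", "nl"))
print(display_label("http://vocab.getty.edu/aat/300000000", "fr"))
```

With this sample data, the local term resolves to the Dutch label when the interface is in Dutch, and a request for a language with no label falls back to English.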

Europeana also performs automatic metadata enrichment with other external value vocabularies and datasets by creating links to objects in Europeana, as described in the earlier Europeana Blog post ‘A Multilingual and Semantic Enrichment Strategy’.

The increased dependency of Europeana services, such as query translation, on semantic and multilingual enrichment illustrates the need for rich vocabularies that are:

  • technically available (through Linked Data or in dedicated repositories), properly documented, and openly accessible
  • well connected to one another, e.g. equivalent elements in other vocabularies are indicated
  • multilingual

Europeana was involved as an external reviewer of the AAT release, and will continue its efforts in demonstrating the potential of Linked Open vocabularies.

We thank Antoine Isaac, Yorgos Mamakis, Georgios Markakis, David Haskiya, Péter Kiraly, Andrew MacLean and Dean Birkett for their work and the Europeana data providers for their data contributions.