This page provides information on how the metadata served by data.europeana.eu is organised. It assumes that the reader has extensive knowledge of Linked Data technology. The project has also been described in a technical paper presented at the Dublin Core 2011 conference.
Disclaimer: Many aspects documented in this page are experimental and thus remain open for discussion and change. As for the rest of this Linked Open Data pilot, we welcome all comments that would help us improve the service!
Background - The Europeana Data Model (EDM)
The Europeana Data Model was designed to replace the Europeana Semantic Elements (ESE). EDM will gradually make Europeana fit within a networked data environment. It is a much more flexible and precise model than ESE, and offers the opportunity to attach every statement to the specific resource it applies to, and to reflect some basic form of data provenance. The main EDM requirements include:
- distinguishing between a 'provided item' (painting, book) and its digital representations
- distinguishing between an item and the metadata record describing it
- allowing the ingestion of multiple records for the same item, which may contain contradictory statements about it
As a consequence of EDM having to meet these requirements, EDM data has a level of complexity above that which Europeana currently maintains. This level of complexity is comparable to what can be found in the data of many Europeana providers, and thus, we argue, it enables better exploitation of that data. Note also that, as much as possible, EDM re-uses elements coming from already-established vocabularies, such as Dublin Core, OAI-ORE, SKOS and CIDOC-CRM, thus lowering the cost of its creation and, hopefully, its adoption.
For more information on EDM, see the EDM Definitions and EDM Primer on Europeana's technical documents page. The EDM OWL ontology is accessible through content negotiation, and it is also directly available. Please be aware that both data.europeana.eu and those documents are under constant revision, so there may be some (minor) discrepancies between them.
Generating EDM data for Europeana
Currently, Europeana does not harvest metadata in the EDM format. We thus had to convert legacy ESE data into EDM. This entails creating resources for the main EDM classes, and distributing ESE metadata fields over these various resources, as presented in this mapping. The resulting data does not realise the full potential offered by EDM but it allows us to make some distinctions, which we believe are useful for data consumers.
Additionally, data.europeana.eu includes semantic connections to external (linked data) sources. We serve links to other linked data services already maintained by Europeana providers, currently only the Swedish cultural heritage aggregator (SOCH). The vast majority of external links come from semantic enrichment carried out at the Europeana Office, connecting Europeana items to places (as provided by GeoNames), concepts (from the GEMET thesaurus), persons (from DBpedia) and time periods (from an ad hoc time period vocabulary).
A walk through the resources served at data.europeana.eu
The core EDM classes, together with the properties we expect to be applied to their instances, are presented in these templates. Of course it is unrealistic that all of those properties would be available for any single object exposed in Europeana. This is especially true for items that are described using the simple legacy ESE format. In particular, Europeana does not currently harvest any description for contextual entities, such as concepts, agents and places. Still, the ESE data harvested from Europeana providers, as well as the enrichment work by Europeana, allows us to create and describe a network of EDM resources for every Europeana object, as shown in this big picture example. The following explains in more detail the data that can be found for every class of resource served by data.europeana.eu:
Item (Provided Cultural Heritage Object): Item resources represent objects (painting, book, etc.) for which institutions provide digital representations to be accessed through Europeana. Provided Cultural Heritage Object (CHO) URIs (for example, http://data.europeana.eu/item/92037/25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03, raw data) are the main entry points in data.europeana.eu. A provided CHO is the hub of the network of relevant resources (cf. below). When applicable, the Europeana URIs for these objects also link, via 'owl:sameAs' statements, to other linked data resources about the same object, for example this item (raw data). In this Linked Data pilot, no descriptive metadata (creator, subject, etc.) is directly attached to item URIs. It is instead attached to the proxies that represent a view of the object, from a specific institution's perspective (either a Europeana provider or Europeana itself). Depending on the feedback received during this pilot, we may change this and duplicate all the descriptive metadata at the level of the item URI. Such an option is costly in terms of data verbosity, but it would enable easier access to metadata for data consumers less concerned about provenance.
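As a minimal sketch of how a client might ask for the raw data behind an item URI, the Python snippet below builds an HTTP request that carries an RDF media type in its Accept header. It assumes that item URIs honour content negotiation in the usual Linked Data way and that application/rdf+xml is among the formats offered; neither detail is confirmed on this page. The function name build_rdf_request is illustrative only, and the request is constructed but deliberately not sent.

```python
from urllib.request import Request

# Example item URI taken from the text above.
ITEM_URI = ("http://data.europeana.eu/item/92037/"
            "25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03")


def build_rdf_request(uri, media_type="application/rdf+xml"):
    """Return an unsent GET request asking for an RDF serialisation of `uri`.

    The default media type is an assumption, not a documented guarantee.
    """
    return Request(uri, headers={"Accept": media_type})


request = build_rdf_request(ITEM_URI)
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing the resulting object to urllib.request.urlopen would perform the actual lookup; whether the service answers, and in which formats, would need to be checked against the live endpoint.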
Provider's proxy: These resources (for example, http://data.europeana.eu/proxy/provider/92037/25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03, raw data) are used as subjects of the descriptive statements (creator, subject, date of creation, etc.) that a Europeana provider contributes for the item. In the OAI-ORE model, proxies enable the separation of different views of the same resource, in the context of different aggregations. This allows us to distinguish the original metadata for the object from the metadata that is created by Europeana, an important requirement for us. Descriptive properties that apply to these proxies mostly come from Dublin Core: view an example. Proxies are connected, using the 'ore:proxyFor' property, to the item of which they represent a facet. They are attached to the aggregation that contextualises them using 'ore:proxyIn'. Note to the reader: given the lack of support for named graphs (aka 'quadruples') in the RDF standard, ORE introduced proxies in order to support referencing resources in the context of a graph. Eventually, named graphs may be natively supported by RDF, which would make the proxy construct obsolete.
Provider's aggregation: These resources (e.g., http://data.europeana.eu/aggregation/provider/92037/25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03, raw data) provide data related to a Europeana provider's gathering of digitised representations and descriptive metadata for an item. As shown in this data, they are related to digital resources about the item, be they files directly representing it ('edm:object' and 'edm:isShownBy') or webpages showing the object in context ('edm:isShownAt'). They may also provide controlled rights information applying to these resources ('edm:rights'). Other statements provided in the same ESE record as the descriptive metadata for the item – but that do not always clearly apply to it – are also attached to aggregations. Finally, provenance data is given in statements using 'edm:provider' (the direct provider to Europeana in the data aggregation chain) or 'edm:dataProvider' (the cultural institution that curates the object). The aggregation is connected to the item resource using the 'edm:aggregatedCHO' property.
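The links between an item, its provider's proxy and its provider's aggregation can be sketched as three triples. The Python snippet below represents them as plain tuples, without any RDF library. The property names and example URIs come from the descriptions above; the expanded namespace URIs for 'ore:' and 'edm:' are written from general knowledge of these vocabularies rather than taken from this page, and the helper name objects is illustrative.

```python
# Namespace URIs: assumed expansions of the 'ore:' and 'edm:' prefixes.
ORE = "http://www.openarchives.org/ore/terms/"
EDM = "http://www.europeana.eu/schemas/edm/"

BASE = "http://data.europeana.eu"
RECORD = "92037/25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03"

item = f"{BASE}/item/{RECORD}"
proxy = f"{BASE}/proxy/provider/{RECORD}"
aggregation = f"{BASE}/aggregation/provider/{RECORD}"

# (subject, predicate, object) triples for the provider's view of the item.
provider_triples = [
    (proxy, ORE + "proxyFor", item),             # the proxy stands for the item
    (proxy, ORE + "proxyIn", aggregation),       # within this aggregation
    (aggregation, EDM + "aggregatedCHO", item),  # which aggregates the item
]


def objects(triples, subject, predicate):
    """Return every object found for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


print(objects(provider_triples, proxy, ORE + "proxyFor"))
```

Descriptive statements such as a creator or subject would be added as further triples with the proxy, not the item, as their subject.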
Europeana's proxy: The second type of proxy served at data.europeana.eu is the Europeana proxy (e.g., http://data.europeana.eu/proxy/europeana/92037/25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03, raw data), which provides access to the metadata created by Europeana for a given item, distinct from the metadata supplied by the provider. Here one can find 'edm:year' statements, indicating a normalised date associated with the object. We also serve millions of generic 'edm:hasMet' enrichments, created by Europeana from a range of ESE descriptive fields (read documentation). These statements connect a Europeana proxy to places from GeoNames, concepts from the GEMET thesaurus, persons from DBpedia and periods from an ad hoc time vocabulary. Finally, a proxy is connected to the item it represents a facet of, using the 'ore:proxyFor' property, as well as to the aggregation that contextualises it, using 'ore:proxyIn'.
Europeana's aggregation: a Europeana aggregation (for example, http://data.europeana.eu/aggregation/europeana/92037/25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03, raw data) bundles together the result of all data creation and aggregation efforts for a given item. It aggregates the provider's aggregation (using ore:aggregates), which in turn will connect to the provider's proxy. Next to the provider aggregation, one can find the digitised resources Europeana.eu serves for the item, i.e., an object page ('edm:landingPage') and a thumbnail (using a combination of 'edm:hasView' and 'foaf:thumbnail'). The Europeana proxy is also connected to this aggregation, as mentioned above.
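To illustrate how the two proxies and two aggregations fit together, the sketch below lists the connecting triples described above and groups the proxies of an item by the aggregation that contextualises them. It is a simplified illustration using plain tuples; the 'ore:' namespace URI is an assumed expansion of the prefix, and the helper names uri and proxies_by_aggregation are hypothetical.

```python
from collections import defaultdict

ORE = "http://www.openarchives.org/ore/terms/"  # assumed 'ore:' expansion
BASE = "http://data.europeana.eu"
RECORD = "92037/25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03"


def uri(kind):
    """Build the example URI for one kind of resource, e.g. 'proxy/provider'."""
    return f"{BASE}/{kind}/{RECORD}"


triples = [
    (uri("aggregation/europeana"), ORE + "aggregates", uri("aggregation/provider")),
    (uri("proxy/provider"), ORE + "proxyFor", uri("item")),
    (uri("proxy/provider"), ORE + "proxyIn", uri("aggregation/provider")),
    (uri("proxy/europeana"), ORE + "proxyFor", uri("item")),
    (uri("proxy/europeana"), ORE + "proxyIn", uri("aggregation/europeana")),
]


def proxies_by_aggregation(triples, item):
    """Group the proxies of `item` by the aggregation they belong to."""
    proxies = {s for s, p, o in triples if p == ORE + "proxyFor" and o == item}
    grouped = defaultdict(list)
    for s, p, o in triples:
        if p == ORE + "proxyIn" and s in proxies:
            grouped[o].append(s)
    return dict(grouped)


for aggregation, proxies in proxies_by_aggregation(triples, uri("item")).items():
    print(aggregation, "->", proxies)
```

A consumer interested only in the provider's original metadata would keep the proxy grouped under the provider's aggregation and ignore the other.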
Resource map: OAI-ORE Resource maps are constructs for indicating meta-level statements about the creation and publication of ORE data (ORE aggregations and their aggregated resources). We are exploring their use as a contextualisation mechanism for the Europeana aggregation. Maps (for example, http://data.europeana.eu/rm/europeana/92037/25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03, raw data) are connected to the item they are about using 'foaf:primaryTopic', and to the corresponding Europeana aggregation using 'ore:describes'. They sum up the provenance of data using 'dc:creator' and 'dc:contributor' statements. Crucially, they also indicate, in a machine-readable way, that the (RDF) data served at data.europeana.eu is provided under the CC0 Public Domain Dedication.
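The example URIs given for the six classes of resource share the same two trailing path segments, so all of them can be derived from one item identifier. The sketch below does this and then adds the two resource map links described above. The path prefixes are read off the example URIs on this page; the parameter names collection_id and record_id are hypothetical labels for the two segments, and the namespace URIs are assumed expansions of the 'ore:' and 'foaf:' prefixes.

```python
BASE = "http://data.europeana.eu"
ORE = "http://www.openarchives.org/ore/terms/"  # assumed 'ore:' expansion
FOAF = "http://xmlns.com/foaf/0.1/"             # assumed 'foaf:' expansion

# Path prefix for each class of resource, as seen in the example URIs.
PATHS = {
    "item": "item",
    "provider_proxy": "proxy/provider",
    "provider_aggregation": "aggregation/provider",
    "europeana_proxy": "proxy/europeana",
    "europeana_aggregation": "aggregation/europeana",
    "resource_map": "rm/europeana",
}


def resource_uris(collection_id, record_id):
    """Return the URI of every resource class for one item."""
    return {
        name: f"{BASE}/{path}/{collection_id}/{record_id}"
        for name, path in PATHS.items()
    }


uris = resource_uris("92037", "25F9104787668C4B5148BE8E5AB8DBEF5BE5FE03")

# Links from the resource map to the item and to the Europeana aggregation.
map_triples = [
    (uris["resource_map"], FOAF + "primaryTopic", uris["item"]),
    (uris["resource_map"], ORE + "describes", uris["europeana_aggregation"]),
]

for name, value in uris.items():
    print(f"{name}: {value}")
```

This derivation reflects the pattern visible in the examples only; the page does not state that the pattern holds for every record.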
Namespaces used in data.europeana.eu
The following RDF namespace abbreviations are currently used in data.europeana.eu: