Provider and Aggregator names
Is there a publicly available list of provider names?
You can find a full list of Europeana's data providers here. The name of an institution only appears on this list once we have ingested its data and there is something to display on the portal. Institutions that have not had their data ingested do not appear on this list.
What does Europeana do if the aggregator is only temporary (e.g. for the duration of a project)?
Even if the aggregator is only temporary, the aggregator name should still be entered in the metadata under edm:provider.
There is a system for tracking and adjusting changes to projects and providers as part of the ingestion process.
Europeana stipulates there should be only one 'click' between the portal and the object. If content providers use an unusual image format which requires a plug-in, with an intermediate page to give users this information, this adds an extra 'click' between the portal and the content. Is this acceptable?
If a second click is required in order to alert a user to the need for a plug-in, that is acceptable.
I have supplied metadata in one language. Which language will appear during search of the Europeana Portal?
In Europeana, Providers can search for their own metadata in the language of their own ingested metadata.
The metadata found in Europeana reflects the language of the country of the Data Provider. It is ingested and indexed in the language of submission. So for example if the title is provided in Greek, it will be indexed in Greek and a query in Greek will find that term in Greek.
Where countries have more than one language, Providers may send metadata in more than one language if they wish; for example, Wales has both English and Welsh. Therefore, the number of languages that can be used to search specific records within Europeana is the number of languages represented in the ingested metadata. This is currently 37 languages.
How does Europeana count the number of objects? For example, does a digitised book of 300 pages count as one object or 300?
Europeana counts the number of records. So if a book has 300 pages that you can navigate on the local site, but there is only one record for the book in Europeana, it will count as one object.
What is the best level of granularity for objects to be submitted to Europeana?
The granularity should reflect the level of description of the object.
The granularity should be a) at a level that is meaningful for a user and b) it should have a metadata record that relates to that level and should provide context for material. For example, when providing a link to a page in a book, if there is no context provided in the accompanying metadata, the record will make no sense.
Granularity, in the case of hierarchical objects, is easier to express using EDM.
What is a 'digital object'? For example, museum descriptions are so rich they could be regarded as digital objects in their own right even though they have no digital 'image' to accompany the description. They are essentially very full metadata records. What is the policy about including such objects?
Europeana focuses on giving access to the digital version of physical objects held in institutions rather than just abstract digital information about these objects. Therefore such catalogue descriptions are not considered as digital objects in their own right within the context of Europeana. This definition includes digitised catalogue cards as well, as they function as finding aids and not as objects.
What kind of content are you looking for?
We evaluate new sources of data to best fill in gaps in material and provide a unique value to the Europeana data services. At the moment we are looking for content that falls within the following list. Of course, this list is not comprehensive.
- Romany material
- 1989 and subsequent event material
- Silent film
- News and documentary
- Languages other than English, French, German and Spanish
- Music (jazz, contemporary and classical)
- Wildlife sounds
- Ethnographic recordings
- Languages other than English, French and German
- Early history to 14th century
- Contemporary content
- Music scores and lyrics
- Performing arts and film texts
- Political documents
- Medical pre-1950s
- Biology pre-1950s
- Economy pre-1950s
Europeana Data Model (EDM)
Why did you change from ESE to EDM?
EDM is a richer data model that will improve the way metadata can be provided and used in Europeana and beyond. ESE is a flat data model that provides the lowest common denominator for ensuring interoperability between metadata standards. The disadvantage of this data model is that the richness provided in some metadata sets is lost in the process. EDM has several advantages over ESE:
- It allows a greater degree of granularity in describing objects, distinguishing:
- the original object and its digital representation
- objects that are composed of other objects
- objects within a collection while retaining context
- It will also support the aggregation of representations of the same object from different sources with different and possibly contradictory statements in the metadata, e.g. the Mona Lisa can be accessed under a number of Data Providers
- ESE elements have been incorporated into the EDM schema
To exploit the richness of data from many cultural heritage domains, EDM adopts a cross-domain semantic web-based framework that will also allow enrichment of data from third party sources by linking. It is a more open and transparent data model than ESE and has benefits to both provider and user.
Technically how should we deliver our metadata to Europeana?
Original metadata is best delivered via OAI-PMH. If this is not possible, a mapping file can be delivered via FTP or emailed to us; however, OAI-PMH is the recommended form of delivery.
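As a rough illustration of OAI-PMH delivery, the sketch below builds the kind of ListRecords request a harvester sends to a provider's repository. The argument names (verb, metadataPrefix, set) come from the OAI-PMH protocol; the base URL, the set name and the choice of the oai_dc prefix are placeholders for illustration, not Europeana's actual harvesting configuration.

```python
from typing import Optional
from urllib.parse import urlencode


def list_records_url(base_url: str, metadata_prefix: str,
                     set_spec: Optional[str] = None) -> str:
    """Build an OAI-PMH ListRecords request URL for a repository."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        params["set"] = set_spec  # optional: restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name, for illustration only.
url = list_records_url("https://repository.example.org/oai", "oai_dc", "photographs")
print(url)
```

A provider can open such a URL in a browser to check what their repository would return before agreeing a harvest with the Ingestion Team.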
In what format should we deliver metadata to Europeana?
The objective is for providers to supply their metadata in the richest XML format possible. We accept three types of metadata: EDM – EDM, ESE – EDM or Original Format – EDM (provided that the right mandatory elements and vocabularies are supplied, we can manually map original metadata to the EDM format in house). The Europeana Ingestion team will carry out the transformation of the data and ensure material is enriched and portal ready.
Is data in Europeana expressed in XML or RDF?
At the moment, we are using an XML schema to represent EDM data.
How can a provider update their data?
Updates can be done by modifying the mapping, or by a redelivery of a specific dataset.
What if I have changed project/aggregator?
If you as a data provider have changed the means by which you submit, please contact the Ingestion Team, and we will discuss the best way to ensure that the risk of duplicate records and broken links is minimised.
As a data provider, how can I indicate that my content has been generated by users, and how can other users find it?
The metadata element 'europeana:UGC' (also 'edm:UGC') has been introduced to allow providers to indicate that content has been sourced from the public. Digitised and born-digital resources contributed by the general public are collected by Europeana through crowd-sourcing initiatives and projects. Any objects that fit this definition should have this element provided with the value 'TRUE' in uppercase so it can be identified in the portal. (See the metadata documentation for further details.)
If this value is present in the metadata then an icon representing user-generated content and a 'Content created by user' label will be displayed in search results. It will also be used as a filter to allow portal users to include or exclude such content from a search.
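The marking described above can be sketched as follows. This is a minimal example, assuming the namespace URI shown for the edm prefix and a hypothetical wrapper element called record; the exact element placement should be taken from the metadata documentation.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for the edm prefix; check the metadata documentation.
EDM_NS = "http://www.europeana.eu/schemas/edm/"
ET.register_namespace("edm", EDM_NS)


def mark_user_generated(record: ET.Element) -> ET.Element:
    """Append an edm:UGC element carrying the uppercase value 'TRUE'."""
    ugc = ET.SubElement(record, f"{{{EDM_NS}}}UGC")
    ugc.text = "TRUE"
    return record


def is_user_generated(record: ET.Element) -> bool:
    """Return True only if the record has edm:UGC with the exact value 'TRUE'."""
    return record.findtext(f"{{{EDM_NS}}}UGC") == "TRUE"


record = mark_user_generated(ET.Element("record"))  # hypothetical wrapper element
print(ET.tostring(record, encoding="unicode"))
print(is_user_generated(record))
```

The check is deliberately case-sensitive, because the value is expected in uppercase.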
How frequently does Europeana harvest data? Data providers may receive a few hundred new records a day.
Regardless of how many records a provider receives a day, Europeana harvests and publishes on a monthly basis.
Europeana has a dedicated ingestion team who are responsible for data harvesting. For entirely new datasets and providers it is important to submit data as early as possible to ensure the team have enough time to provide you with feedback on your datasets.
The ingestion team harvest material on a monthly basis, and there is a submission deadline for all datasets of the 21st of the month. This gives us time to process and enrich your datasets for publication.
A provider/aggregator should establish a regular import pattern and ingestion plan with the Ingestion Team relating to the frequency of harvest of sets, and volume of records within cycles, e.g. Provider X wants Europeana to harvest every second month and expects to submit c.10000 records in 4 datasets each cycle. The provision of plans of this kind allows Europeana to optimise ingestion and ensure as many records are published as possible. If you do not have an exact ingestion plan, a rough estimation of the number of records in total will suffice.
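For planning purposes, the monthly deadline of the 21st can be worked out with a few lines of code. This is only a sketch: the function name is hypothetical, and it assumes that a submission made on the 21st itself still meets that month's deadline.

```python
from datetime import date

SUBMISSION_DAY = 21  # deadline day of the month, as stated above


def next_submission_deadline(today: date) -> date:
    """Return the next 21st-of-the-month deadline on or after `today`."""
    if today.day <= SUBMISSION_DAY:
        return today.replace(day=SUBMISSION_DAY)
    # This month's deadline has passed: roll over to the following month.
    if today.month == 12:
        return date(today.year + 1, 1, SUBMISSION_DAY)
    return date(today.year, today.month + 1, SUBMISSION_DAY)


print(next_submission_deadline(date(2013, 3, 10)))
print(next_submission_deadline(date(2013, 12, 25)))
```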
What kind of thumbnails should text, video and audio have? Should it be a scan of the cover of the publication (book cover, record cover etc.)? Or should audio and video have a preview that would play when the mouse pointer is hovered over it?
The thumbnails are a smaller version of the image of the digital object, and directly affect the user click-through rate. It is therefore in your interest to provide the best quality image of the digital object to Europeana. We can take a large image provided in the metadata and our algorithms will convert it into an appropriate size for thumbnails in the portal. If small images are provided, the quality of the resulting thumbnails may be affected.
At the moment, all thumbnails in the portal are images and they do not trigger any functionality. The image provided for the thumbnail can be anything you choose that best represents the digital object. In the case of a book, the cover may be suitable or, where there is no title on the cover, a photograph of the title page: whatever will be most interesting and discoverable to the end user.
What should we do if we do not have an image for the object e.g. because it is a sound?
If no image can be provided to use as a thumbnail, then a default image will be used based on the type of object (image, sound, text or video).
These default images do not give a very good end user experience, particularly when several appear on the results screen. If at all possible, an image representing the object should be provided.
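The fallback behaviour described above amounts to a simple lookup, sketched here. The function name and the default file names are hypothetical stand-ins; only the four object types come from the text.

```python
from typing import Optional

# Hypothetical file names standing in for the portal's default images.
DEFAULT_THUMBNAILS = {
    "image": "default-image.png",
    "sound": "default-sound.png",
    "text": "default-text.png",
    "video": "default-video.png",
}


def choose_thumbnail(object_type: str, provided_image: Optional[str]) -> str:
    """Prefer the provider's image; otherwise fall back to the type's default."""
    if provided_image:
        return provided_image
    return DEFAULT_THUMBNAILS[object_type.lower()]


print(choose_thumbnail("SOUND", None))
print(choose_thumbnail("SOUND", "https://example.org/record-cover.jpg"))
```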
What size and format should thumbnails be?
Details of the thumbnails can be found in the Europeana Portal Image Policy document.
Is it acceptable to provide a thumbnail image that has a digital watermark on it?
In order to offer a good user experience, Europeana does not accept thumbnail images that contain a visible digital watermark.
Is Europeana coordinating a system of identifiers for submitted publications that could be used in place of, for example, Digital Object Identifiers (DOI)? If not, is such a system being planned for the future?
Europeana uses a custom system of identifiers for ingested data.
Note that Europeana does not ingest publications, only the metadata, so the identifier is for the record, not the publication.
How does Europeana enrich metadata?
Europeana attempts to enrich data by cleaning it and adding standard multilingual terms and references to answer ‘who’, ‘what’, ‘where’ and ‘when’ questions. This is carried out by applying terms from thesauri or controlled vocabularies using the AnnoCultor tool. The added terms and values are not copied back into the provider metadata but stored alongside it and displayed under separate ‘Autogenerated tags’ in the portal.
Person names are matched to DBpedia (http://dbpedia.org/About) entries to give a unique identifier for the name. This links to additional information about the person, and enables the display of multilingual versions of the name in the portal.
Concepts are matched to the terms in the GEMET thesaurus (http://www.eionet.europa.eu/gemet), which provides a unique reference for the concept. It also enables the display of the term in many languages and a display of the references and labels of broader terms.
Places are matched to place names in GeoNames (http://www.geonames.org/). This gives a unique reference for the place name, enables the display of multilingual versions of the names, the display of geographic coordinates and broader geographic areas associated with the place.
Time periods are enriched from a vocabulary developed using the AnnoCultor tool (http://annocultor.eu/), which can now be found at http://semium.org/time.html. This establishes a unique reference for the time period with start and end dates and can also connect to the names of historical periods.
Why is multilingual metadata important?
Europeana provides access to cultural heritage from different countries and in different languages. The language a user searches in is the language in which documents are retrieved. Providing your metadata in more than one language ensures that your objects are visible across language borders, which will result in more traffic.
How should metadata in different languages be labelled?
We encourage you to include the XML attribute ‘xml:lang’ in all metadata elements which you can provide in several languages. Here is an example:
<dc:description xml:lang="fr">végétation des montagnes de France</dc:description>
This allows us to use this information in multilingual functionalities. So far, Europeana cannot adequately display this information but we are expecting to integrate this functionality in the near future.
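For providers who generate their XML programmatically, the labelled element shown above could be produced along these lines. This is a sketch using Python's standard xml.etree.ElementTree module; the helper name is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# ElementTree writes this qualified attribute name out as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
ET.register_namespace("dc", DC_NS)


def description(text: str, lang: str) -> ET.Element:
    """Create a dc:description element labelled with its language."""
    elem = ET.Element(f"{{{DC_NS}}}description", {XML_LANG: lang})
    elem.text = text
    return elem


elem = description("végétation des montagnes de France", "fr")
print(ET.tostring(elem, encoding="unicode"))
```

The serialised element also carries an xmlns:dc declaration, which would normally sit on the root element of a complete record.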
How should multiple records in different languages for the same object be submitted?
If metadata records for the same object exist in several languages they should not be submitted as separate records. This would result in duplicate objects in the portal with no way to cross-link them. The metadata should be submitted as one record with repeated elements for each value occurring in multiple language versions (ideally also with an ‘xml:lang’ attribute). This ensures that multilingual values are identified and ready for display in the portal.
If ‘xml:lang’ attributes have been used then future functionality could display these different versions by labelling the different languages. For example:
Subject: [es] baño termal, cura, recuperación, viaje; [en] cure, recovery, travel; [fr] cure, rétablissement, station thermale
Note that the ‘Title’ element is a special case and is explained in a separate FAQ.
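The single-record approach can be illustrated with a short sketch that reads one record containing repeated dc:subject elements and groups the values by language. The wrapper element record and the fallback code "und" for unlabelled values are choices made for this example, not Europeana conventions.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

DC_NS = "http://purl.org/dc/elements/1.1/"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# One record for one object, with the subject repeated per language.
RECORD = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:subject xml:lang="es">baño termal</dc:subject>
  <dc:subject xml:lang="en">cure</dc:subject>
  <dc:subject xml:lang="en">recovery</dc:subject>
  <dc:subject xml:lang="fr">station thermale</dc:subject>
</record>"""


def subjects_by_language(record_xml: str) -> dict:
    """Group dc:subject values by their xml:lang attribute."""
    root = ET.fromstring(record_xml)
    grouped = defaultdict(list)
    for subject in root.iter(f"{{{DC_NS}}}subject"):
        # "und" (undetermined) is used here for values with no language label.
        grouped[subject.get(XML_LANG, "und")].append(subject.text)
    return dict(grouped)


for lang, values in subjects_by_language(RECORD).items():
    print(f"[{lang}] " + ", ".join(values))
```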
How does Europeana handle titles in different languages?
This is a special case because at the moment only one instance of ‘dc:title’ can be displayed in the portal. However, we would still suggest that you provide translations in repeated elements using the ‘xml:lang’ attribute, as this functionality is expected to be implemented in future. For now, to ensure translated titles are displayed, please put the alternative versions in the ‘dcterms:alternative’ element as well.
For EDM data, it is possible to submit several ‘dc:title’ properties with ‘xml:lang’ attributes, which would indicate that an object has a title in several languages.
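The title workaround can be sketched as a small helper that lists which elements to emit. It assumes, purely for illustration, that the first title supplied is the one to be displayed; the function name and the sample titles are hypothetical.

```python
from typing import List, Tuple


def title_elements(titles: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (element, language, text) triples for (language, title) pairs.

    Every title becomes a repeated dc:title with its language label; every
    title after the first is also repeated as dcterms:alternative so that it
    can be displayed while only one dc:title is shown.
    """
    elements = [("dc:title", lang, text) for lang, text in titles]
    elements += [("dcterms:alternative", lang, text) for lang, text in titles[1:]]
    return elements


for element, lang, text in title_elements([("fr", "La Joconde"), ("en", "Mona Lisa")]):
    print(f'<{element} xml:lang="{lang}">{text}</{element}>')
```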
My objects are already ingested in Europeana and I have now translated all my metadata into another language. How do I add these additional translations?
If already-submitted records have been enriched with values in multiple languages after ingestion, then the data sets can be resubmitted to replace the earlier monolingual versions.
We have multilingual vocabularies, how can we submit them to Europeana and how are they used?
Once EDM is implemented, providers can submit multilingual versions of their vocabularies. These will be ingested together with their data to support multilingual functionality and data enrichment. Examples of such vocabularies are those created by MIMO, further explained in their EDM case study (http://pro.europeana.eu/mimo-edm). The instrument keywords are listed in several languages together with broader and related terms.
We also encourage you to submit relevant multilingual vocabularies to the Europeana Tech mailing list so they can be discussed with the community.
Europeana Semantic Elements (ESE)
Should data still be provided to Europeana in ESE now that EDM has been published?
ESE is a subset of EDM and several of its elements are critical to the functioning of the Europeana portal. Improving the quality of ESE data now will therefore have a positive effect on future EDM implementation.
ESE will still be accepted as a metadata format for Europeana. It will be manually converted to EDM for use in the portal. However, we will assist you in making the transition to EDM, as this will customise your data and display it to its fullest potential in the Europeana Portal.
How can content providers use ESE to give an indication of the broader historical context of an object? For example: a) to indicate that a particular instrument represented the shift towards the use of high technology in chemistry or b) that a particular essay by James Joule was fundamental to the work of Helmholtz in developing his concept of energy.
ESE does not easily allow the expression of the semantics given in these examples in the metadata. EDM is more suitable for this kind of data.
Until EDM is fully deployed this kind of historical information could be put into a ‘dc:description’ element. Alternatively (or indeed, additionally), a ‘dc:relation’ element could be used to provide the URL of the related object from the described object. This would create the link but not state the nature of the connection.
The Rights Guidelines say that Europeana will supply some tools to support selection of licences, rights statements and public domain values for the ‘europeana:rights’ element.
Where can these tools be found?
This tool will help you pick a rights statement for your digital objects. There is also a Public Domain Calculator, which will step through the process of deciding if an object is in the Public Domain. It can be found at http://www.outofcopyright.eu/. However, if you need further assistance picking an appropriate rights statement for your collection, the Operations team can assist you, and you can contact them here.
Why have you added a Europeana type of ‘3D'?
Some providers have 3D objects to supply to Europeana and need to be able to indicate this in the metadata. The value ‘3D’ can now be entered in the ‘europeana:type’ element (also ‘edm:type’). The portal will use this value to create a new facet in the search results. In addition, if the value ‘3D-PDF’ is entered in the ‘dc:format’ element, a PDF icon will be displayed alongside the object in the search result so users will know what application is needed to view the object. (See the metadata documentation for further details.)
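The two display rules above can be summarised in a short sketch. The function name and the returned keys are hypothetical; only the values ‘3D’ and ‘3D-PDF’ and the elements they belong to come from the text.

```python
def display_hints(edm_type: str, dc_format: str = "") -> dict:
    """Summarise how a record's type and format affect its portal display."""
    return {
        "type_facet": edm_type,                # e.g. '3D' appears as a search facet
        "pdf_icon": dc_format == "3D-PDF",     # icon shown only for this format value
    }


print(display_hints("3D", "3D-PDF"))
print(display_hints("3D"))
```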