We want people to be able to search for, find and use cultural heritage material online. That’s easier - and a much smoother experience for everyone - when the content files (the images, text documents or audio/video clips) and accompanying metadata (the information about what the item is and where it comes from) are of good quality.
Through the Europeana Strategy 2020-2025, we will continue to work with aggregators and data providers to invest in resources, activities and technologies - like machine learning and other enrichment services - that improve our metadata and content.
A focus on quality
Working through existing aggregators’ networks, and with the support of EU Member States, we’ll help institutions understand why good data is important and support them in producing higher-quality content and metadata. We’ll develop and use the Europeana Publishing Framework to guide how institutions work with, produce and improve the material they share with us.
We’ll showcase high-quality content through Europeana’s editorial content and campaigns, and we’ll develop the platform itself to put more emphasis on our partner institutions and make sure that good results are visible to all.
Cosmina Berta, from the German Digital Library and a member of Europeana’s Data Quality Committee, says, ‘The biggest challenges for metadata practitioners are defining what data quality is and implementing data quality metrics, especially because data usage scenarios change over time. I hope through this strategy we can reach more of a consensus regarding defining and measuring data quality and define a clearer concept about its implementation. I am a big fan of standardisation and in my ideal world, standardisation will play a bigger role in defining and achieving data quality.
‘If we define the goals we want to achieve - the ‘what’ and the ‘to what end’ - then the ‘how’ to get there can be more easily outlined, implemented and standardised. Aggregators are important here. Institutions can then better reach their audience and advance their overall research and education purposes.’
Getting better connected
When cultural content and metadata are prepared in a standardised way, no matter which institution creates them, they can be used in and across a wide range of systems, not just the Europeana platform. Cultural heritage institutions can benefit from, for example, interoperability with collections from other institutions or connections to international initiatives such as Wikidata.
Here, standard linked data formats, coupled with improvements to multilingualism, will better connect Europeana’s collections with other platforms and services.
Henning Scholz, Europeana Foundation’s Partner and Operations Manager, says, ‘With the release of the Europeana Publishing Framework metadata component in summer 2019, we made multilingualism integral to our concept of data quality. Making it clear what language the metadata is provided in will facilitate machine translation of metadata, making our heritage accessible in all EU languages. We still have a long way to go, but if we can focus on tagging key metadata fields with the correct language, we can make good progress over the next two years.’
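To make the idea of tagging key metadata fields with the correct language a little more concrete, here is a minimal sketch of how the share of language-tagged values could be measured. It is an illustration only: the record structure, the choice of `dc:title` and `dc:description` as the key fields, and the `language_tag_coverage` function are assumptions made for this example, not the Europeana Publishing Framework's own scoring or any Europeana API.

```python
from typing import Optional

# Hypothetical record shape: field name -> list of (text, language tag or None).
Record = dict[str, list[tuple[str, Optional[str]]]]

# Illustrative choice of "key" fields; not taken from the Publishing Framework.
KEY_FIELDS = ("dc:title", "dc:description")


def language_tag_coverage(
    records: list[Record], fields: tuple[str, ...] = KEY_FIELDS
) -> float:
    """Return the share of values in the given fields that carry a language tag."""
    tagged = 0
    total = 0
    for record in records:
        for field in fields:
            for _text, lang in record.get(field, []):
                total += 1
                if lang:  # None or an empty string counts as untagged
                    tagged += 1
    return tagged / total if total else 0.0


# Made-up sample records, for demonstration only.
sample = [
    {"dc:title": [("Mona Lisa", "en"), ("La Gioconda", "it")]},
    {
        "dc:title": [("Stadtansicht", None)],
        "dc:description": [("Ansicht der Altstadt", "de")],
    },
]
coverage = language_tag_coverage(sample)
print(f"{coverage:.0%} of key field values are language-tagged")
```

A figure like this could be tracked over time per data provider, though which fields count as key and what threshold is acceptable would need to be agreed with aggregators.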
Using tech - and people power - to enrich data
Manually improving the metadata quality of millions of records from different sources requires a huge amount of time and resources. Applying artificial intelligence tools and machine learning, combined with human knowledge provided by both domain experts and crowdsourcing campaigns (think EnrichEuropeana and CrowdHeritage), offers a remarkable opportunity to improve the quality of metadata.
We will work on ways to enrich metadata and perform data-related tasks automatically, semi-automatically or by using the strength of the crowd.