The Metis Sandbox - or finding joy in working with data
Metis Sandbox is a functional application that enhances the overall data aggregation workflow from data providers to Europeana. In this post, we take a look at what the pilot, developed as part of the Europeana Common Culture project, has achieved.
The Europeana Common Culture project aims to develop a harmonised and coordinated environment for Europeana’s national aggregators, as well as improve content and metadata quality to increase user satisfaction. As part of the project, three pilots (Metis Sandbox, the Linked Open Data Aggregator and 3D Content in Europeana) experimented with novel approaches for aggregation which were tested by national aggregators (NAs) and validated with cultural heritage institutions (CHIs). The Metis Sandbox pilot was led by Deutsche Digitale Bibliothek (DDB) and delivered in close collaboration with the Europeana Foundation.
As a long-time aggregator for Europeana, the DDB brought hands-on observations and experience to the pilot. Getting DDB data published in Europeana often took a long time (sometimes up to six months), for various reasons: the data was not valid according to the Europeana Data Model (EDM) and needed to be corrected by the DDB or by the cultural heritage institution or intermediate provider; or the data quality turned out to be poor, so that data or mapping corrections were needed. All this was accompanied by back-and-forth communication between Europeana’s Data Publishing Services (DPS) team, the aggregator, sometimes an intermediate aggregator, and the CHI. This loop had to be repeated several times until everything was correct, which took a vast amount of time.
This led to the question: what if aggregators could see how their data looked in Europeana without involving the DPS team? They would be able to make corrections before even sending the data to Europeana. This had the potential to benefit all aggregators by reducing the number of steps necessary to publish data through Europeana, as well as the to-and-fro communication!
And so the Metis Sandbox was born.
The Metis Sandbox processes sample datasets according to the Metis workflow. From data import through validation, transformation, normalisation and enrichment to publication, it reproduces step by step the processes applied within Metis, Europeana’s core infrastructure for data aggregation. The system issues a report for each step in which an error occurs and, at the end of the process, provides a link to a preview environment where users can see their data in a copy of Europeana. The data output by the system contains all the automatic enrichments and technical metadata generated by Europeana, as well as the quality tiers for content and metadata.
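For readers who think in code, the step-by-step idea can be pictured with a small sketch. This is purely illustrative and is not the Metis Sandbox implementation or its API: the function `run_workflow`, the `StepReport` class and the two toy steps are hypothetical names invented here, and records are assumed to be plain dictionaries. Only the idea of ordered steps with a report for each step that hits an error comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types for illustration only; not part of Metis.
Record = dict
StepFn = Callable[[Record], Record]


@dataclass
class StepReport:
    """Errors collected for one workflow step."""
    step: str
    errors: list = field(default_factory=list)


def run_workflow(records, steps):
    """Pass records through the steps in order.

    A record that raises ValueError in a step is dropped and the error
    is added to that step's report. Steps without errors get no report.
    """
    reports = {}
    surviving = list(records)
    for name, step_fn in steps:
        next_batch = []
        for record in surviving:
            try:
                next_batch.append(step_fn(record))
            except ValueError as exc:
                reports.setdefault(name, StepReport(name)).errors.append(str(exc))
        surviving = next_batch
    return surviving, list(reports.values())


# Two toy steps standing in for real validation and normalisation.
def validate(record):
    if "title" not in record:
        raise ValueError("record has no title")
    return record


def normalise(record):
    # Trim surrounding whitespace from every string value.
    return {k: v.strip() if isinstance(v, str) else v for k, v in record.items()}


if __name__ == "__main__":
    sample = [{"title": "  Mona Lisa "}, {"creator": "unknown"}]
    processed, reports = run_workflow(
        sample, [("validation", validate), ("normalisation", normalise)]
    )
    print(processed)
    for report in reports:
        print(report.step, report.errors)
```

In this toy version a failed record simply drops out of the later steps; how the real system treats such records is not described in this post.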
The DDB was the first to test the pilot. And oh what a joy it was! It’s not often that the words 'data' and 'joy' are to be found in the same context, but being able to see the data along with the content and metadata tiers meant that this was the case. From the moment it was tested, the Metis Sandbox had a direct impact on the DDB’s data workflow and the quality of the submitted data, improving both.
It was then time to involve other aggregators from the Europeana Common Culture project and get more feedback about the Metis Sandbox. Several national aggregators in the project took part in testing and evaluation, and the results were very encouraging: 83% of the participants found it useful and signalled that they would be happy to use the Metis Sandbox for future deliveries to Europeana. The most appreciated features of the tool were content and metadata tier calculation, data validation, and the detailed error and warning reporting.
So what is in store for the future of the Metis Sandbox?
The Metis Sandbox is, next to Metis, one of the pillars of Europeana’s Aggregation Strategy published in October. Building on the pilot developed during the project, the Europeana Foundation will extend it further so that, together with Metis, it speeds up the publication process in Europeana, supports the digital transformation of aggregators and cultural heritage institutions, and improves data quality. The public release of the Metis Sandbox will be available in spring 2021.