Maria Eskevich presents "Bringing Europeana and CLARIN together" at DI4R 2017
On Thursday 30 November, Maria Eskevich will present the joint paper by Europeana and CLARIN, "Bringing Europeana and CLARIN together: Dissemination and exploitation of cultural heritage data in a research infrastructure", at Digital Infrastructures for Research 2017 (DI4R2017). Under the theme “Connecting the building blocks for Open Science”, the 2017 edition of the DI4R conference will showcase the policies, processes, best practices, data and services that build on current national, regional, European and international initiatives to form the building blocks of the European Open Science Cloud and European Data Infrastructure.
The paper was a joint effort by Nuno Freire and Clemens Neudecker from Europeana, and Twan Goosen and Maria Eskevich from CLARIN. Nuno and Twan also wrote an excellent blog post for our Pro website on bridging the gap between CLARIN and Europeana.
From the presentation description: "The potential inclusion of many new CH resources by ‘harvesting’ metadata from Europeana, opens up new applications for CLARIN’s processing tools. CLARIN and Europeana do not share a common metadata model, and therefore a semantic and structural mapping had to be defined, and a conversion implemented on basis of this. CLARIN’s ingestion pipeline was then extended to retrieve a set of selected collections from Europeana and apply this conversion in the process. Several infrastructure components had to be adapted to accommodate the significant increase in the amount of data to be handled and stored. Currently about 775 thousand Europeana records can be found in the VLO, with several times more records expected in the foreseeable future. Currently, about 10 thousand are technically suitable for processing via the LRS. Relatively straightforward improvements to the metadata on the side of Europeana and/or its data providers could substantially increase this number. CLARIN is working with Europeana to implement such improvements. More tools are also expected to be connected to the LRS in the short to mid-term, which is also expected to lead to an increased ‘coverage’. As a next step, CLARIN can extend and refine the selection of included resources, and Europeana can adapt their data and metadata to optimally serve the research community. CLARIN’s experience and potentially part of its implementation work can be applied to integrate Europeana with other resource infrastructures."
Read the full presentation description and sign up for the DI4R conference below.