Approaches to handling multilingualism in digital cultural heritage

A strategy for multilingualism

At Europeana, we have been working for a number of years to enable multilingual access to digital cultural heritage, for example, by exploiting multilingual linked open data to enrich metadata. It has been a community effort: Europeana Foundation plays its part, but the work of data providers is crucial too. This requires an appropriate long-term strategy, and a transparent approach to sharing goals and ongoing efforts.

In 2016, EuropeanaTech published a whitepaper on best practices for multilingual access, which outlined challenges in this area and presented ideas to tackle them. Under the Europeana DSI project in 2020, we built on this whitepaper to publish a new multilingual strategy and roadmap. This document lays out proposals for handling multilingualism in the context of three general usage scenarios: search, navigate and read. It describes how Europeana's object metadata, full-text content, editorial content and user interface can be enriched with multilingual resources and/or automatically translated to provide a better user experience, and it proposes a roadmap of experiments and implementations.

Both the 2016 whitepaper and an early version of this strategy were shared with the EuropeanaTech community for feedback. The strategy also benefited from discussions during an event on multilingualism in digital cultural heritage organised by the Finnish Presidency of the Council of the EU (see full report here).

EuropeanaTech Insight and plans for the future

Today, we want to bring these topics to your attention through a new issue of EuropeanaTech Insight dedicated to multilingualism in digital cultural heritage. This issue contains three papers that present the problems and key technologies that Europeana and the wider cultural heritage sector should investigate in order to better connect collections and users across languages. The papers cover recent progress in automatic translation technology and how this technology can be used for search in cultural heritage repositories. Read EuropeanaTech Insight.

In the coming months, we will continue these efforts, focusing on how Europeana can employ automatic translation systems like eTranslation, which benefit from recent progress in AI-powered language technology. A project on automatic translation and Europeana has recently been selected for co-funding in the 2020 CEF Telecom Call on Automated Translation, and we will keep you posted through Europeana Pro news!

Get involved

As with all systems based on machine learning, the quality of the eTranslation service provided by the European Commission depends on the quality of the data used to train it. The more cultural heritage data is made available for its training, the better it will work for our sector, so contributions of data are always welcome. Our previous call for multilingual data for eTranslation remains open, and we invite you to submit proposals to the recently launched EuropeanaTech Challenge, which calls for datasets that are appropriate for use in Artificial Intelligence / Machine Learning research and development (which naturally includes efforts in language technology).

To exchange ideas, receive news and network with experts on this topic, and on research and development in the cultural heritage sector more broadly, join the EuropeanaTech community.