Posted on Tuesday September 25, 2018

Updated on Wednesday September 26, 2018

Member States
Harry Verwayen

General Director, Europeana Foundation

Could AI and data mining technologies overcome issues in cultural heritage?

Today, Europeana Executive Director Harry Verwayen spoke at the EBU event 'Cultural heritage for the future: the role of media innovation'. In his presentation, which follows, Harry illustrates the challenges facing digital audiovisual archives and the potential of new technologies, including AI, to overcome them.

Neurones connecting, artwork, Stephen Magrath, Wellcome Collection, United Kingdom, Public Domain

Good news for digital innovation

So the good news is that Europe is a world leader in digital heritage innovation. Europeana, the European Union (EU) digital cultural initiative, consists of a thriving network of close to 4,000 memory institutions (libraries, museums, archives etc.) that have shared over 50 million digitised objects in a standardised format on the digital platform. This model, based on open standards and an inclusive, distributive approach, is being adopted in countries from the US and Japan to Brazil and Canada.

Ten years ago, the EU made a bold decision about our cultural heritage: it deemed it too important to leave to market forces alone. Two weeks ago, Europeana’s public evaluation showed that the initiative is still very relevant to the challenges we are facing today in Europe, and the EU has recommitted its support to it.

A challenge for audiovisual archives

However, we are facing some serious challenges and, unfortunately, these affect audiovisual (AV) archives more than any other creative medium.

Here is the issue: we have currently digitised around 10% of all our heritage. Of that 10% (which represents around 300 million objects), only about one third is available online, and of that, only 7% is available for reuse. At Europeana, we work very hard to improve this equation and, as you can see, almost 20% of the material in Europeana can be shared and adapted (all with full respect for copyright, of course), while the other 80% can at least be viewed online.

Unfortunately, AV is trailing behind the pack in this competition for access: on Europeana Collections, we provide access to around 1 million videos and 700,000 sound recordings. However, only 6,000 (0.5%) are explicitly labelled as available for reuse.

So what can be done about this? How do we use a powerful agent like AV archives to have an impact on the big societal issues that Europe faces?

A large part of this issue lies, of course, in copyright. We encourage institutions to adopt open access policies in which they commit to publishing the collections they own the copyright to under an open licence. But at this point in time, it is more important that we work together with our partners to develop technological solutions that improve access to content within the current copyright framework.

Think, for example, of embeddable players that allow image archives to stream their material in different contexts while staying within the copyright framework.

The potential of AI and data mining archives

But we should think much further than that: what if we could make full use of the capabilities of AI and machine learning to mine these archives, at scale? We’d be able to make all kinds of connections, discover patterns and uncover the past in unprecedented ways!

A prototype version of the Time Machine project has been experimenting with deep mining city archives to extract intelligence about cities of the past: trade routes, social mobility patterns, and the way knowledge flowed through the city over time. Mining this data opens up endless possibilities to improve education, enrich tourist experiences and give meaning to the past.

Imagine doing this at scale across Europe, also mining AV archives using speech recognition, image recognition and scene partitioning. This would allow us to bypass the most apparent limitations, making it easy to translate subtitles and to find the relevant sections of otherwise very long and unwieldy videos.

That is exactly what the Time Machine FET flagship initiative proposes: the development of a large-scale (robotised) digitisation and computing infrastructure to extract intelligence using AI.

When we talk about ‘digital transformation’ we should ask ourselves: ‘what do we want to be transformed into?’ Do we want that transformation to be dictated by large corporate platforms? Or do we want to use the strengths of our networked infrastructure and set the terms ourselves?

I would like to encourage the European Union to be bold and support large-scale, ambitious innovation projects such as Time Machine. These projects have the potential to turn one of our biggest assets, cultural heritage, into an instrument of social and economic innovation.

The above is a transcription of the presentation.