Q&A: Partner Profile - The European Library

The European Library is the coordinator of the Europeana Cloud project and one of three partners who will initially test our new Cloud infrastructure. Recently we caught up with Director Louise Edwards, who told us why Europeana Cloud is so valuable for The European Library, for researchers and for the cultural heritage community as a whole.

What is The European Library and what role does it play within the Europeana Cloud project?

The European Library takes in metadata from Europe’s libraries and forwards it to Europeana. We have multiple roles within the Europeana Cloud project. We are the project coordinator, responsible for making sure that all partners pull together strategically and that the project works as a whole.

We are also one of the three main aggregators of the project, along with the Polish Digital Libraries Federation, a national aggregator in Poland, and Europeana, the super-aggregator for all of Europe’s cultural metadata. Together we’re working to develop a shared infrastructure that all three of us can use. Once the project is finished, this infrastructure will be broadened and opened up to other aggregators.

How will this new infrastructure change the way that The European Library works?

At the moment each aggregator is running different hardware and software, and our data sits in different places. We can’t share data without going through a time-consuming process of importing and exporting it. It can take time to change formats, and it can take time to pass the data from A to B.

With the Cloud, we’ll have a shared infrastructure where it’s much easier to get hold of the data, to share it and to use it in different ways. Imagine a shared network drive within an office, which allows you to access anyone else’s data within the office and to take that data and use or share it for any purpose. At a very simple level, that’s what we’re doing within Europeana Cloud, although there will be greater controls on how different users can use the data.

What value will this new shared infrastructure have for The European Library?

For a start, the Cloud should help to reduce our costs. We currently spend a lot of time forwarding data to Europeana, sending it from our system to their system. There’s a real friction cost involved in passing this data around. If we are running the same system, we will spend less time on the technical exchange with Europeana. It will be much easier for them to access and harvest our data.

It will also allow The European Library to better enrich that data, transform it and allow other people to build tools on top of that data so it becomes much more open. As an example, one of the things we’re really interested in is a tool for entity extraction. That’s where you have computers recognise names and locations within data. If someone were to build this tool within the Cloud, we could use it to enrich all or some of the data and then feed the data back into our own portal or share it in other ways to benefit users.

The key here is that the Cloud will give us a two-way process of updating and enriching content and then creating subsets that are useful for different types of end user groups.

Why is it important to divide data up into different categories or subsets in the first place?

Europeana provides access to 30 million records which you can search. That volume of data is wonderful but it also makes it difficult to find material related to precise topics or to locate types of data such as sounds. What the Cloud will allow people to do is to pre-select various parts of the data and to build tools on top of those separate indexes of data.

Let me give you an example. Imagine a researcher who discovers a collection of letters from a famous politician. Under the current system, the researcher can’t download the data and the only tools he can use to analyse the collection are the ones provided by the aggregator (in this case Europeana). Europeana Cloud offers a platform where he can get access to the letters and build a tool to transcribe them. He could then return the transcriptions to the Cloud so that everyone can have access to that new and enhanced data. By everyone, we mean not only other researchers but also the original institution that provided the data.

Are there similar Cloud projects to this one at the moment or is it unique?

This is quite unique in terms of cultural heritage. There are similar projects in different fields. EUDAT, for example, is building a distributed structure for sharing scientific research data related to subject areas such as climate modeling and human physiology. They need much more complex controls related to authorisation and authentication because there are data protection issues, but the concept of sharing data over a common infrastructure with different groups is essentially the same as Europeana Cloud.

When will people be able to access the Cloud?

The first experience many people will have of the Cloud is through the prototype tools we are building for specific communities. We already have a tool for early modern philosophy and this year we’re using data in Europeana to build a specific tool for people working in musicology.

By the end of the project, we’ll have APIs or data dumps that will allow people to access the data in the Cloud, take away what they’re interested in and build tools using the data.

Through all of these ways, we’re showing how Europeana Cloud can be valuable in segmenting data for specific communities, rather than putting the focus on millions of pieces of data which nobody can really search through effectively.

To learn more about Europeana Cloud, follow us on Twitter or subscribe to our newsletter.