Seven years and still going - how we're improving the accuracy of rights statements in Europeana data

Julia Fallon, Europeana Foundation’s Senior Policy Adviser, talks about what we’ve learned in the past seven years and how we’re bringing this into our work throughout 2019.

In 2012, research conducted by Kennisland - our trusted partner since 2010 and architects of the Europeana Licensing Framework - revealed that 54% of the 24 million objects found through Europeana Collections had no standardised rights information at all. Thanks to the introduction, midway through 2012, of a new Data Exchange Agreement which requires all digital objects to be labelled with a standardised rights statement, this percentage dropped to 34%.

Of the remaining 66%, the research estimated that around half were correctly labelled and around half were incorrectly labelled. To respond to this, in 2013 we launched the rights labelling campaign to improve the presence and accuracy of the statements. We, rather naively, gave ourselves a three-month deadline to achieve around 95% labelling and accuracy.

That's already quite a lot to take in. But this was our starting point.

Fast forward to 2019: we’ve more than doubled the number of digital objects, from 24 to 58 million.

We can proudly say that since July 2014, every single object published through Europeana Collections is labelled with a rights statement. And research published by Kennisland in 2018 showed that 62% of objects were labelled by data partners with an accurate rights statement. So some significant progress was made, but our gut feeling was that there was much more that could be done.

In 2019, a third research report by Kennisland assessed the accuracy of rights information in the highest quality images shared through Europeana - those qualifying for Tiers 3 and 4 of the Publishing Framework. It showed a very different picture: just 38% of labels were accurate. The remainder were a combination of mixed, inaccurate or impossible to determine. The main causes centred around probable misuse of Creative Commons tools and licences, in two main scenarios: either it’s unlikely the rights holder gave permission for a particular licence, or no copyright subsists in the work so a licence cannot be applied.

A legacy of the 2013 rights labelling campaign, starting in the same year, was the embedding in the data ingestion process of checks and balances to detect and address the most significant issues raised in the 2012 research. These included issues such as claiming copyright on a public domain work, which goes against the principle embedded in the Europeana Public Domain Charter that digitisation should not create any new rights. But the recent research has shone a light on a different set of issues: the misuse - a harsh but accurate term - of Creative Commons tools and licences.

So we have seen first-hand over these years that cultural heritage institutions struggle with applying accurate rights statements to their digital objects, and that this challenge does not seem to be going anywhere soon. At this week’s Creative Commons Summit, Lisette Kalshoven - who in previous roles at Kennisland co-authored all of the research into accuracy of rights in Europeana data - will be addressing this very issue: how do we get more accurate rights statements from GLAMs? (Take a look - the Summit is jam-packed with great sessions.)

We can also see that it’s a conversation that is wider than Europeana, our data partners and Network Association members. For us it’s a familiar conversation, but perhaps it’s new to you; so let’s take a look back on what we have learned in seven years, how we bring that experience forward throughout 2019 and perhaps give you an insight into why we took 18 months to finish that very first rights campaign.

#1 Consistency is our best friend

My colleagues in the Data Partner Services Team are responsible for working with you, our data partners, to review, refine and publish your data. It’s also their role to raise issues of accuracy. Over the years, the amount of data they handle and the number of data partners have increased significantly, and the people in the team have changed, but the criteria for achieving accurate rights information have always been consistent. Our Publishing Guide - first developed in 2014 - aligns our working practices with the standards that we use, such as the Licensing and Publishing Frameworks, establishing the acceptance criteria for data.

In 2019 we continue our conversations around rights but with a new focus on the accuracy of how Creative Commons tools and licences are used, to address the inaccuracies that stem from their misuse. To support the Data Partner Services Team, who received copyright training from Kennisland earlier this year, the Publishing Guide will be updated to more clearly define how we approach improving accuracy with Creative Commons tools and licences.

#2 Our priorities are not (often) your priorities, we must narrow that gap

I think we can agree, no one actually wants rights information for their digital objects and collections to be inaccurate. However, our ambitions over the years to address this haven’t always aligned with the priorities of our data partners. Because our data partners are often challenged by a lack of resources, time and access to expertise, we have learned to anticipate that they will need (a lot of) time to respond to rights queries.

What we think is a simple query to solve often takes months to work its way through organisational processes. But it’s important to get it right, so in 2019, we’re going to explore ways to better match our ambition to address inaccuracies in rights information with your priorities, such as aligning rights queries with updates to your datasets. We also have some incentives for partners wanting to improve their rights information, as accuracy of the rights statement becomes a factor in determining where and how we can share your high-quality data.

#3 You’re great, you just don’t know it

We sometimes overestimate, and sometimes underestimate, what you know about copyright and rights statements. It’s a difficult subject to master in depth, but 99% of you that I meet underestimate how much you know and understand. And whilst talking about copyright sometimes elicits groans from the people around us, I’m always really impressed with how open and eager you are to learn more about this complex topic (and actually, how many of you want to do this).

As advocates of better copyright rules for cultural heritage institutions, we have the privilege of working with experts in the field of copyright. In 2019, we’ll continue to work with experts in the field, but I also want to hear more from you. I’ve read and heard lots of great examples of copyright and rights statement workflows, about challenges you’ve overcome, and workshops and courses you’ve run. Through the copyright community, you have the opportunity to share this with a captive audience, and to learn from each other.

#4 Being transparent is really hard, we must do better

We really want to tell you how you are doing. In previous years we’ve experimented with dashboards to give context to your collections amongst the wider dataset available through Europeana Collections. But automating the assessment of rights information is a long, long way off. For one thing, we don’t have the data that we need - such as consistent date information for each digital object - to make automated calculations. If you take the public domain calculator - a decision-making process so complex that, printed out in very small print, it fills a wall from floor to ceiling - you'll see how critical this information is, and how useful it would be to have (you can download the flowchart at the end of this article).

But as we make progress in the coming year towards improving the accuracy of rights information, we will experiment with different ways of showing overall progress - that is, how far we’ve come in improving the accuracy of rights statements from that original 38% accuracy of the highest quality data. And we want to share more examples of the routes our partners have taken to improve the accuracy of their statements, so that you can learn from us, and from each other.

Download: Public Domain Calculator Poster.pdf (PDF, 23.91 MiB)

If you want to follow these conversations, sign up to the copyright community and receive its newsletter, and follow @EuropeanaIPR on Twitter.

Remember to save the date for this year’s Europeana annual event, where the copyright community will contribute to a range of panel and themed sessions throughout the three-day conference.