Transcriptions, subtitles and enrichments: sharing our copyright approach

Europeana increasingly relies on content generated by users or machines which adds value to the cultural heritage data we share, and makes it accessible and reusable. This content raises new questions, including ones related to copyright: what can we enrich and transcribe, and how do we label, or license, the results? In this post, we share our thoughts and ask our partners, networks and data providers for feedback.

What user generated content does Europeana work with, and why?

Generally, the user or machine generated content that Europeana works with consists of annotations and enrichments, including semantic enrichment and transcriptions or subtitles. For instance, through the CrowdHeritage project, Europeana worked with partners to facilitate the creation and integration of enrichments and annotations on content in Europeana, and through the EnrichEuropeana project, several handwritten items in Europeana have been transcribed.

User generated content offers many promising possibilities. It enriches collections, helps us make them more interoperable, brings us closer to having more translated content, and contributes to accessibility.

What rights statements do we suggest applying to the content generated by users?

To continue fostering the usability of content in Europeana, we are defining an approach to user generated content that defends the principles of openness. We want to align how we treat user generated content from a copyright perspective with the conditions we apply to metadata and digital objects provided by our data partners. More specifically, we are exploring the following possibilities:

1) Enrichments and annotations

In our view, enrichments and annotations should be treated in the same way as the metadata provided by our data partners. As such, the enrichments and annotations contributed by a user should be made available under the terms of the Creative Commons CC0 1.0 Universal Public Domain Dedication, which allows anyone to use these with no restrictions.

2) Transcriptions and subtitles

Through our terms, we are planning to ask people who contribute transcriptions and subtitles not to claim any intellectual property rights that could, in some exceptional cases, arise from their act of transcribing or subtitling. This avoids adding further layers of rights that would hamper later uses of transcriptions and subtitles, for instance to support accessibility or multilinguality.

However, to respect any intellectual property rights that might exist in the content or digital objects from which the transcriptions or subtitles are derived, Europeana will make all transcriptions and subtitles contributed by users available under the rights statement assigned to that source content.
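The two proposals in this section can be summarised as a small rule. The sketch below is only an illustration of that rule as we currently describe it, not an implementation used by Europeana: the names ContributionKind and contribution_rights are hypothetical, and rights statements are represented as plain labels.

```python
from enum import Enum


class ContributionKind(Enum):
    """Hypothetical labels for the kinds of contribution discussed in this post."""
    ENRICHMENT = "enrichment"
    ANNOTATION = "annotation"
    TRANSCRIPTION = "transcription"
    SUBTITLE = "subtitle"


def contribution_rights(kind: ContributionKind, source_rights: str) -> str:
    """Return the rights label proposed for a user contribution.

    Enrichments and annotations are treated like partner metadata (CC0).
    Transcriptions and subtitles take the rights statement of the source
    content or digital object they were made from.
    """
    if kind in (ContributionKind.ENRICHMENT, ContributionKind.ANNOTATION):
        return "CC0"
    return source_rights
```

For example, under this sketch an annotation on an in-copyright object would still be labelled CC0, while subtitles for a CC BY-SA video would be labelled CC BY-SA.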

What is our internal approach to what can be enriched, transcribed or subtitled?

Copyright and the rights statements chosen by Europeana data partners determine the extent to which digital objects in Europeana can be used. We should base our decision to transcribe and subtitle digital content on the terms these indicate and on our willingness to encourage the use of openly licensed digital objects. We are therefore considering not supporting the transcription or subtitling of any digital objects marked with CC BY-ND, CC BY-NC-ND, NoC-OKLR, InC, InC-EDU, InC-EU-OW or CNE.

In addition, even though digital objects marked as CC BY-NC-SA and NoC-NC would in principle allow us to create a literal transcription of, for example, a handwritten letter, challenges would arise if we eventually wanted to promote its use by anyone, on any platform. We therefore consider it best to focus our transcription and subtitling efforts on digital objects marked as PDM, CC0, CC BY or CC BY-SA.
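To make the grouping above easier to check, here is a minimal sketch of it as a lookup. It is an illustration under assumptions rather than a tool we run: the function name transcription_status and the status strings are hypothetical, and any rights statement this post does not mention is deliberately returned as "unclassified" instead of being guessed.

```python
# Rights statements grouped exactly as listed in this post.
NOT_SUPPORTED = frozenset({
    "CC BY-ND", "CC BY-NC-ND", "NoC-OKLR", "InC", "InC-EDU", "InC-EU-OW", "CNE",
})
POSSIBLE_NOT_PRIORITISED = frozenset({"CC BY-NC-SA", "NoC-NC"})
FOCUS = frozenset({"PDM", "CC0", "CC BY", "CC BY-SA"})


def transcription_status(rights: str) -> str:
    """Classify a rights label for transcription or subtitling purposes."""
    if rights in FOCUS:
        return "focus"
    if rights in POSSIBLE_NOT_PRIORITISED:
        return "possible in principle, not prioritised"
    if rights in NOT_SUPPORTED:
        return "not supported"
    # Statements not named in the post are left undecided on purpose.
    return "unclassified"
```

Under this sketch, transcription_status("CC BY-SA") gives "focus" and transcription_status("InC") gives "not supported".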

We can encourage the creation of metadata enrichments regardless of the rights statement applied to the content, and will be diligent in ensuring that this metadata is differentiated from the metadata provided by Europeana data partners.

What are the next steps?

To make this approach clear to users who contribute content, we need to update our Terms for User Contributions. These terms set the conditions that apply to the content individuals and institutions contribute to Europeana, and have been developed so that there is clarity on how this content can be used. They now need to be updated to accommodate the many exciting projects which bring in annotations, metadata enrichments, transcriptions, subtitles and other types of user or machine generated content.

However, before we do that, we would like to collect feedback, ideas, thoughts and concerns around this topic from our partners, data providers, members of our networks and the broader public, so please reach out to us with your comments. For more information on our general approach to copyright, check out the Europeana Licensing Framework.