J-Ark (European Jewish Community Archive) is a Generic Services project funded under CEF Telecom eArchiving. The project is developing long-term preservation solutions for Jewish heritage archives, based on the European eArchiving standards, which it has integrated into the aggregation flow of Judaica, a Europeana thematic aggregator for Jewish heritage (operated by the Jewish Heritage Network).
The project is unique in using software components from three different initiatives funded by the European Commission (eArchiving, Europeana, eTranslation) to build a comprehensive end-to-end solution for small and mid-size cultural heritage institutions, an important group of stakeholders for European infrastructures.
Enhanced aggregation workflow
J-Ark has brought together the open-source, eArchiving-compliant, long-term preservation solution RODA (provided by KEEP SOLUTIONS, Portugal) and services for machine translation and anonymisation of heritage content (provided by Pangeanic, Spain) as a new, integrated solution for Judaica. This means that the aggregator now offers a service which ingests digital objects and metadata via an orchestrated workflow, starting with a web-based or file-based pipeline for submitting digital objects (as E-ARK SIPs, the eArchiving specification for submitting an archival package). Metadata can be added manually through the user-friendly CMS offered by the aggregator or uploaded in a spreadsheet file.
The project explored a direct integration with a CMS that
stores the original metadata in order to test and to showcase potential
issues and possibilities. We built a custom integration with dLibra,
a digital library system in active use in Poland, which allows
cultural heritage institutions using dLibra to upload a collection of
objects from dLibra to the archival service in a few clicks.
After digital objects and metadata are submitted, the metadata is
automatically translated into five European languages (English, French,
Spanish, German, Italian) and anonymised to address the privacy
requirements of preserved archival collections. A representation of the
metadata in the Europeana Data Model (EDM, a Linked Data description
format) is stored by RODA (the preservation system), which has now added
EDM to its list of supported formats.
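The metadata steps described above (anonymise, translate, represent in
EDM) can be illustrated with a minimal sketch. This is not the project's
code: translate() and anonymise() are hypothetical placeholders rather
than the actual translation and anonymisation services, the identifier
and title are invented, and only a single dc:title field is handled. The
namespaces are the published RDF, EDM and Dublin Core ones.

```python
import re
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EDM = "http://www.europeana.eu/schemas/edm/"
DC = "http://purl.org/dc/elements/1.1/"
XML = "http://www.w3.org/XML/1998/namespace"

# Use readable prefixes when the record is serialised.
for prefix, uri in (("rdf", RDF), ("edm", EDM), ("dc", DC)):
    ET.register_namespace(prefix, uri)

# The five target languages named in the article.
TARGET_LANGUAGES = ["en", "fr", "es", "de", "it"]

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def anonymise(text):
    """Hypothetical placeholder: masks e-mail addresses only."""
    return EMAIL.sub("[REDACTED]", text)


def translate(text, lang):
    """Hypothetical placeholder: tags the text instead of translating it."""
    return f"[{lang}] {text}"


def to_edm(identifier, title, source_lang):
    """Build a minimal EDM record with one dc:title per language."""
    root = ET.Element(f"{{{RDF}}}RDF")
    cho = ET.SubElement(
        root, f"{{{EDM}}}ProvidedCHO", {f"{{{RDF}}}about": identifier}
    )
    clean = anonymise(title)  # anonymise before anything is translated
    original = ET.SubElement(cho, f"{{{DC}}}title", {f"{{{XML}}}lang": source_lang})
    original.text = clean
    for lang in TARGET_LANGUAGES:
        if lang == source_lang:
            continue
        translated = ET.SubElement(cho, f"{{{DC}}}title", {f"{{{XML}}}lang": lang})
        translated.text = translate(clean, lang)
    return root


# Invented example record; the result is what would be handed to storage.
record = to_edm("urn:example:object-1", "List od anna@example.org", "pl")
edm_xml = ET.tostring(record, encoding="unicode")
```

Anonymising before translating means personal data never reaches the
translation step, which is one possible ordering; the article does not
state which order the project uses.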
The project has been piloted on the data sets of two partners: Brama
Grodzka - Teatr NN, a Jewish heritage centre from Lublin, Poland, and
the Jewish Community of Lithuania. Content partners explored different
metadata and digital content submission workflows during hackathons,
which were then implemented by the project.
These pilots highlighted the importance of two fundamental aspects
when integrating a preservation solution into an existing digital
environment. First, the solution needs to integrate harmoniously with
the current workflows (such as content publishing and metadata
production) and systems (CMS, digital asset management) around digital
objects and metadata. Second, it is necessary to find the right strategy
to decide how the complexity and specifics of the data structure at
source are reflected on the preservation end. For example, harvesting
Brama Grodzka - Teatr NN data (based on the dLibra system) required a
custom
harvester to map a complex hierarchical data structure to more "flat"
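To illustrate what mapping a hierarchy onto flat records can involve,
here is a minimal sketch. It is not the project's harvester: the nested
dictionary layout, the field names (id, title, children, path) and the
sample collection are all assumptions made for the example.

```python
def flatten(node, ancestors=()):
    """Yield one flat record per leaf item, keeping its ancestry as a path.

    A node without children (or with an empty list) is treated as a leaf.
    """
    children = node.get("children", [])
    if not children:
        yield {
            "id": node["id"],
            "title": node["title"],
            # The hierarchy survives only as a readable path string.
            "path": " / ".join(ancestors),
        }
        return
    for child in children:
        yield from flatten(child, ancestors + (node["title"],))


# Invented sample hierarchy: collection -> group -> items.
collection = {
    "id": "c1",
    "title": "Oral history",
    "children": [
        {
            "id": "g1",
            "title": "Interviews",
            "children": [
                {"id": "i1", "title": "Interview A"},
                {"id": "i2", "title": "Interview B"},
            ],
        },
        {"id": "i3", "title": "Project leaflet"},
    ],
}

flat_records = list(flatten(collection))
```

Keeping the ancestry as a path string is one way to retain context that
a flat structure cannot express directly; which parts of the source
hierarchy are worth keeping is the strategic decision described above.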
If you are a cultural heritage institution interested in a long-term
preservation solution compatible with aggregation flows, please get in touch.
Future perspectives on preserving data within data spaces
Over the course of the J-Ark project, it has become clear that
connecting the different services of European digital infrastructures
will be a core requirement of the next generation of European ecosystems
for data - the data spaces. Developing the common European data spaces
is a new flagship initiative of the European Commission aiming to
support the growth of the digital economy in strategic sectors and
domains of public interest. Interoperable data spaces cover domains from
manufacturing and health to energy and agriculture, and ensure both
public and private sector organisations and research institutions can
make available and exchange data in a trustworthy and secure manner.
The project’s experiences connecting digital preservation to the
aggregation flow highlight the importance of considering long-term
preservation of cultural heritage assets in the broader perspective of
data space design. They also raise some of the bigger questions about
the future of digital preservation and how they can be addressed by the
common European data spaces. How long is ‘long-term’ preservation: 10,
50 or maybe 300 years? How do we make it truly inclusive by preserving
not only mainstream collections, usually safeguarded by institutions
well-equipped with preservation means (or at least an established
Content Management System for objects and metadata), but also archives
of small organisations, communities, and even personal archives? How do
we save on costs by looking at the ecosystem as a whole rather than at
specific use cases?
If you are interested in exploring some of these questions, the
project has organised a free online event on Wednesday 22nd February
2023, and we’d love to see you there. The event will include updates
from the project, the European Commission and the Europeana Foundation
and a round-table that looks at the future of digital preservation and
the common European data spaces. Find out more and book your space.