iPRES: 12th International Conference on Digital Preservation
University of North Carolina at Chapel Hill
Alex Thirifays, Danish National Archives (DNA)
E-ARK European Archival Records and Knowledge Preservation
Towards a Common Approach for Access to Digital Archival Records in Europe
The E-ARK project is co-funded by the European Commission under the ICT-PSP Programme
www.eark-project.eu
What’s the ambition of E-ARK?
Overall goal: Create an open source, full-fledged digital archive with
• Common workflows and terminology
• Common formats (SAD-IP)
• Common tools
• Solution will be: scalable, computational, modular, robust, and adaptable
Common methods
• Common framework using international standards, e.g. OAIS, PREMIS, METS, PAIS…
• Reuse of existing software (e.g. ICA-AtoM) and formats (e.g. SIARD)
• Open Source, GitHub, etc.
Different content types
• Databases, geodata, Electronic Records Management Systems (ERMS), individual computer files, and Online Analytical Processing (OLAP)
Who and what?
These designated communities…
• Producers
• Archives
• Consumers
Need…
• Everything but images (e.g. database archiving, geodata)
• User friendliness
• Uniformity; reduction of the number of tools. Savings!
• Exchange. Is E-ARK the first step of a common European infrastructure? What’s next?
Get…
• The Reference Implementation, which is:
[Figure: E-ARK reference implementation. Components shown: Archival records; Content and Records Management Systems; SIP Creation Tools; E-ARK SIP; Ingest; SIP–AIP Conversion; E-ARK AIP; Archival Storage; Digital preservation systems; CMIS Interface; Data Mining Interface; Scalable Computation; AIP–DIP Conversion; E-ARK DIP; Access; Archival Search, Access and Display Tools; Content and Records Management Systems; Data Mining Showcase.]
Scope
• SIP: Package prepared by Pre-Ingest (WP3)
• AIP: Package created for long-term archive (WP4)
• DIP: Package created for access (WP5, Danish National Archives)
‘Access’ work package: main working areas and method
• Access Tools
• User needs
• Requirements specification & DIP format
• Best practices
• Search Interface
• Order Management
• AIP-DIP transformation
• DIP modification
• End-user access to requested archives
The GAP analysis
• Examine the landscape of current access solutions
• Examine user needs for access solutions
• Compare those and create a GAP analysis
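The comparison step above can be pictured as a set difference: whatever users need that no examined solution offers is a gap. The following is only a minimal sketch of that idea. The function name, the tool names and the capability labels are hypothetical placeholders, not E-ARK survey data.

```python
# Hypothetical illustration of a gap analysis as a set difference.
# Needs, tool names and capabilities are placeholders, not project findings.

def gap_analysis(user_needs, current_solutions):
    """Return the needs that no examined solution covers, sorted by name."""
    covered = set()
    for capabilities in current_solutions.values():
        covered |= set(capabilities)
    return sorted(set(user_needs) - covered)


needs = {"database access", "integrated search", "image viewing"}
solutions = {
    "tool_a": {"image viewing"},
    "tool_b": {"image viewing", "integrated search"},
}

gaps = gap_analysis(needs, solutions)
print(gaps)
```

With these placeholder inputs the only need left uncovered is "database access".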
Findings from GAP analysis
User requirements
Overall users’ needs are not met very well!
• Content data type coverage (databases!): Must bridge!
• Integration of Access services: Must bridge!
• Metadata and search quality: Must bridge!
• Usability (& exploitation): Must bridge!
DIP & tool requirements
Req. no | Requirement description | Use Case | MoSCoW
23 | The DIP must allow for the inclusion of any descriptive metadata from the AIP | UC4.2 | M
24 | The DIP must allow for the inclusion of any relevant descriptions of access conditions and restrictions | UC4.2 | M
25 | The DIP must allow for the inclusion of any relevant technical metadata about its content | UC4.2 | M
26 | The DIP must allow to use any relevant metadata standards within it | UC4.2 | M
27 | The DIP must include the date and time of the creation | UC4.2 | M
28 | The DIP must allow to include data in any type or format within it | UC4.2 | M
29 | The DIP must include information which allows its validation and authentication by the user | UC4.2 | M
30 | The DIP should include relevant information about the context and provenance of the package (i.e. the position in the archival hierarchy, reference to the creator and archives) | UC4.2 | S
31 | The DIP should allow for including / logging information about any changes done to the IP during ingest (SIP), preservation (AIP) or access preparation (DIP) | UC4.3 | S
32 | The DIP should include information about its current status in the DIP preparation workflow (as an example, whether the DIP is ready for delivery or still being modified) | UC4.3 | S
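To make the difference between "must include", "must allow" and "should include" concrete, the requirements above can be sketched as a small data structure with a checker. This is an illustrative model only, not the E-ARK DIP format: the class `Dip`, its field names, the use of checksums for requirement 29 and the example date are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Dip:
    """Hypothetical in-memory model of a DIP; not the E-ARK format itself."""
    # Req. 27 (M): date and time of creation must be included.
    created: Optional[datetime] = None
    # Req. 29 (M): validation/authentication information, modelled here
    # as a file-path -> checksum mapping (an assumption of this sketch).
    checksums: dict = field(default_factory=dict)
    # Req. 23-26 (M): the format must *allow* these, so they may stay empty.
    descriptive_metadata: list = field(default_factory=list)
    access_conditions: list = field(default_factory=list)
    technical_metadata: list = field(default_factory=list)
    # Req. 28 (M): data in any type or format, modelled as path -> bytes.
    data: dict = field(default_factory=dict)
    # Req. 30 (S): context and provenance of the package.
    provenance: Optional[str] = None
    # Req. 31 (S): the format should *allow* logging of changes.
    change_log: list = field(default_factory=list)
    # Req. 32 (S): current status in the DIP preparation workflow.
    status: Optional[str] = None


def check(dip):
    """Return (errors, warnings) as lists of requirement numbers.

    Only "include" requirements can fail on an instance; "allow" requirements
    are met by the mere existence of the corresponding field.
    """
    errors, warnings = [], []
    if dip.created is None:
        errors.append(27)
    if not dip.checksums:
        errors.append(29)
    if not dip.provenance:
        warnings.append(30)
    if not dip.status:
        warnings.append(32)
    return errors, warnings


# Arbitrary example values, chosen only for illustration.
draft = Dip(created=datetime(2016, 1, 1, tzinfo=timezone.utc),
            status="being modified")
errors, warnings = check(draft)
print(errors, warnings)
```

Here the draft package breaks mandatory requirement 29 (no validation information yet) and triggers a warning for requirement 30 (no provenance), which mirrors the M and S priorities in the table.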
Metadata requirements
Examination of metadata standards:
• Categorization of metadata elements to enable comparison of different standards
• Quantification of elements to produce a detailed impression of the coverage of each standard
Result: METS, PREMIS, EAD, EAC-CPF, INSPIRE, SIARD, MoReq
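The two examination steps above (categorize the elements, then count them per standard) can be sketched as follows. This is a hedged illustration only: the standards `standard_x` and `standard_y`, their elements and the three categories are invented placeholders and do not reproduce the project's actual categorization or figures.

```python
from collections import Counter

# Hypothetical categories; the project's real category scheme is not shown here.
CATEGORIES = ["descriptive", "technical", "access"]


def coverage(categorised_elements):
    """Count how many elements of one standard fall into each category."""
    return Counter(category for _element, category in categorised_elements)


# Placeholder standards with (element, category) pairs, purely illustrative.
standards = {
    "standard_x": [("title", "descriptive"),
                   ("creator", "descriptive"),
                   ("checksum", "technical")],
    "standard_y": [("format", "technical"),
                   ("rights", "access")],
}

# One row per standard, one column per category; missing categories count as 0.
table = {name: [coverage(elements)[c] for c in CATEGORIES]
         for name, elements in standards.items()}

for name, row in table.items():
    print(name, dict(zip(CATEGORIES, row)))
```

Putting the counts side by side shows at a glance which categories a standard covers and where it is silent, which is the kind of impression the quantification step aims for.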
Take-up and sustainability…
• Access attracts increasing attention/funding. For example, public authorities need access to their own records. This is why the national archives of Sweden and Norway are in the process of creating so-called ‘middle archives’ that cater for these needs.
• Archives need database archiving. Over the coming 5 years, the Danish National Archives will ingest around 100 TB of data per year, most of which is databases. There is no reason to believe that public authorities in other countries generate less data.
• Data mining. Exploitation of data is sought after. However, there’s a conflict between confidentiality and access. Will it be solved by EU initiatives like Scrutiny?
Take-up and sustainability…
• The common IP format will
– facilitate exchange of information packages and standardize the search for them and within them
– also reduce the number of tools needed in the archival community, and thus their development and maintenance costs
• Pilots will prove the concept, which is the main strength of the E-ARK project regarding take-up
• Flexible workflows, micro-services and open source will cater for adaptability to local needs and longevity
…and a glimpse into the future
• Finalisation of the DIP format (January 2016)
• Pilot release of E-ARK Access tools (April 2016)
• Functionality show-off at iPRES, Bern (2016)
• Final release of E-ARK Access tools (January 2017)
Beyond the E-ARK project there’s the possibility of building a common, international archival infrastructure based on the E-ARK IP formats. It would allow people everywhere to search in all kinds of archival repositories, exploiting data in new ways and opening the doors for new research, clever journalism, more efficient public administration and, not least, new business possibilities for private companies.
No, it’s not Linked Open Data, and never will be, but it smells like it