Bring out yer SIPs: An Introduction to Digital Preservation with Archivematica iSkills Workshop February 9, 2018 Grant Hurley, Digital Preservation Librarian, Scholars Portal

Bring out yer SIPs: An Introduction to Digital Preservation with Archivematica

iSkills Workshop

February 9, 2018

Grant Hurley, Digital Preservation Librarian, Scholars Portal

Agenda

- Basic concepts in digital preservation
- Introduction to Archivematica
- Preparing transfers + Demo
- Processing transfers + Demo
- Looking at AIPs
- Thinking about DIPs
- Processing activity

What’s this “digital preservation” thing?

Uh oh

● Digital objects (both born digital and digitized) need active management to ensure ongoing access

● Quickly-changing technological norms create risks that must be managed from the object’s creation

● Digital preservation is a set of theories and practices that work to keep digital objects authentic, available and reliable over time.

Identity: what it is; format identification, descriptive information, provenance, etc.

Integrity: establishing that a file remains unaltered over time

Identity: File formats

filename : '/Users/hurleyg/Documents/Teaching/iSkills/CheckYourBits.jpg'
filesize : 582231
modified : 2018-01-24T15:50:08-05:00
errors   :
matches  :
  - ns      : 'pronom'
    id      : 'fmt/43'
    format  : 'JPEG File Interchange Format'
    version : '1.01'
    mime    : 'image/jpeg'
    basis   : 'extension match jpg; byte match at [[[0 14]] [[582229 2]]]'
    warning :

File format identifications/descriptions in PRONOM (UK National Archives); ID = PRONOM identifier

Archivematica uses Siegfried or FIDO
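The "basis" line in the output above appears to report two byte matches: one at the very start of the file and one covering its last two bytes. As a rough illustration of signature-based identification, the sketch below checks only the JPEG start-of-image and end-of-image markers. It is a simplified assumption, not how Siegfried or FIDO are implemented, and real PRONOM signatures are considerably more detailed. The function name is hypothetical.

```python
# Minimal sketch of signature-based format identification.
# Assumption: checking only the JPEG start and end markers, which is far
# simpler than the PRONOM signatures that Siegfried or FIDO evaluate.

JPEG_START = b"\xff\xd8\xff"  # start-of-image marker plus first byte of the next marker
JPEG_END = b"\xff\xd9"        # end-of-image marker


def looks_like_jpeg(data: bytes) -> bool:
    """Return True if the bytes begin and end with the JPEG markers."""
    return data.startswith(JPEG_START) and data.endswith(JPEG_END)
```

Matching on bytes rather than on the file extension alone is what lets a tool flag a file whose extension does not agree with its content.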

Integrity: The almighty checksum

md5 checksum = 2c93b97c3d7e53dab9161e389c98465c

md5 checksum = 1148058955697062ca583d0cc0474322
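A fixity check of this kind can be sketched with Python's standard `hashlib` module. The helper names `md5_checksum` and `is_unaltered` are hypothetical and are not part of Archivematica; the sketch only shows the general idea of recording a checksum once and comparing against it later.

```python
import hashlib


def md5_checksum(path, chunk_size=65536):
    """Compute the MD5 checksum of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_unaltered(path, recorded_checksum):
    """Compare the file's current checksum with one recorded earlier."""
    return md5_checksum(path) == recorded_checksum
```

Changing even a single byte of a file produces a completely different checksum, which is why a mismatch is a reliable signal that the file has been altered.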

The even more almighty OAIS

Other important concepts

Identification: determining a particular file’s format and version

Characterization: extracting metadata related to the file’s intrinsic properties. For example, audio sample rate, channels, etc. for an MP3 file.

Validation: determining if a file is well-formed and valid according to its specification.

Normalization: converting a file from a source format to a standardized format.

What is Archivematica?

What it does

- Creates well-formed data packages for long-term preservation and access

- Takes a pre-structured transfer from a data source

- Makes a Submission Information Package (SIP)

- Transforms the SIP into an Archival Information Package (AIP)

- Also can create a Dissemination Information Package (DIP) for access

- Each of these functions has configurable tasks associated

What it does

- Stores and applies preservation policies for normalization, access copies, etc.

- Allows access to, and deletion of, AIPs

- Assists in ingest of descriptive metadata, rights information

- Manages data flows in and out of system through separate Storage Service module

- Can connect to access systems for DIP deposit (mostly just AtoM)

- Can be fully automated

Where it came from

- Standards for digital preservation developed in late 1990s and early 2000s, but no easy way of applying them

- UNESCO released 2007 report advocating for open source digital preservation system

- Artefactual Systems started up by creating Access to Memory (AtoM) system for archival description

- Various small open source tools were also being developed by others for particular tasks

- Artefactual developed Archivematica beginning in 2008

- Beta release in 2012; current release is 1.6.1 (2017)

What it is

- Modular workflow created using a microservices design pattern

- Data follows a structured, chained pathway, where the result of one step triggers the next step.

- Components can be replaced or turned off/on.

- Accessible through the browser

- Requires a virtual machine to run on (Ubuntu or CentOS)

- Runs in LAMP environment (Linux, Apache, MySQL, PHP)

- Open source, developed by Artefactual Systems staff
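The chained pathway described on this slide can be illustrated with a small sketch in which each step receives the previous step's result and individual steps can be switched off. This is an assumption-laden toy model, not Archivematica's actual code: the step functions, the `CHAIN` list, and `run_chain` are all hypothetical names.

```python
# Toy model of a chained workflow; every name here is hypothetical.

def assign_checksums(package):
    return {**package, "checksums_assigned": True}


def identify_formats(package):
    return {**package, "formats_identified": True}


def normalize(package):
    return {**package, "normalized": True}


# Each entry: (step name, step function, enabled flag).
CHAIN = [
    ("assign_checksums", assign_checksums, True),
    ("identify_formats", identify_formats, True),
    ("normalize", normalize, False),  # a component that has been turned off
]


def run_chain(steps, package):
    """Run enabled steps in order, passing each result to the next step."""
    history = []
    for name, step, enabled in steps:
        if not enabled:
            continue  # skipped steps leave the package untouched
        package = step(package)
        history.append(name)
    return package, history
```

Calling `run_chain(CHAIN, {"name": "demo-transfer"})` returns the updated package together with the list of steps that actually ran, which makes it easy to see the effect of turning a component off.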

What it isn’t

- A storage system

- An access system

- Easy to install or maintain in production

- User friendly

- A complete digital archives workflow

Who uses it

Largely, memory institutions (libraries, archives, galleries, museums) with digital collections that need preserving

- Libraries:
  - Digitized/born-digital content in institutional repositories
  - Research data management (several current projects trying to develop Archivematica’s capacity in this domain)
  - Digital collections (books, journals, maps, etc.)

- Archives:
  - Digitized collections (photographs, audio-visual materials, etc.)
  - Born digital donations (all sorts of stuff)
    - Private papers/collections
    - Records from corporate bodies, institutions, etc.

The Workflow

Pre-Transfer*

- Selection of objects to preserve
- Metadata preparation
- Packaging for transfer

Transfer

- Generates METS file to be written to
- Virus scan
- File ID, characterization, validation

Backlog

- You can send something here if you don’t want to continue processing it

Appraisal

- File format view/analysis
- Selection for retention
- ID sensitive data

Ingest

- Normalize files
- Create & store AIP/DIP

Storage & Access*

- Store in location
- Send access copies to other systems

*Not in Archivematica

*Linked to by Archivematica

Preparing transfers

Steps

- Determining content and structure (1 SIP = 1 AIP = fonds, series, item? Or section of one of these?)

- Gather and structure metadata (next slide)

- Gather submission documentation (not in demo)

- Package and structure for ingest

- All data needs to be in a directory, at minimum

Metadata

Descriptive metadata

- Uses simple Dublin Core as key standard; other information is recorded as ‘Custom’

- Transfer level can be added through interface or imported

- Item level must be imported via csv file

Rights metadata

- Mapped to PREMIS

- Same import structure as above
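As an illustration of item-level descriptive metadata in csv form, the sketch below builds a small file with a `filename` column followed by simple Dublin Core columns. The column layout follows the general pattern of Archivematica's metadata import, but the specific filenames, values, and the `build_metadata_csv` helper are made up for illustration and should be checked against the documentation for the version in use.

```python
import csv
import io

# Assumed column layout: a 'filename' column, then Dublin Core fields.
FIELDNAMES = ["filename", "dc.title", "dc.creator", "dc.date"]


def build_metadata_csv(rows):
    """Return csv text with one row of descriptive metadata per object."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical example rows; paths are relative to the transfer directory.
EXAMPLE_ROWS = [
    {"filename": "objects/photo1.jpg", "dc.title": "Campus view",
     "dc.creator": "Unknown", "dc.date": "2018"},
    {"filename": "objects/photo2.jpg", "dc.title": "Library entrance",
     "dc.creator": "Unknown", "dc.date": "2018"},
]
```

`build_metadata_csv(EXAMPLE_ROWS)` yields text that could be saved as a csv file alongside the objects before transfer.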

Demo

- Set of photos + metadata csv file

- Bagging using Python script
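The script used in the demo is not reproduced in the slides. As a stand-in, here is a minimal sketch of what bagging involves, using only the Python standard library: the content is copied into a `data/` directory, a `bagit.txt` declaration is written, and an MD5 manifest lists each payload file. The function name `make_minimal_bag` is hypothetical, and a production workflow would more likely rely on an established BagIt library.

```python
import hashlib
import shutil
from pathlib import Path


def make_minimal_bag(source_dir, bag_dir):
    """Copy source_dir into bag_dir/data and write bagit.txt plus an MD5 manifest."""
    bag = Path(bag_dir)
    payload = bag / "data"
    # copytree creates bag_dir and data/; it fails if data/ already exists.
    shutil.copytree(Path(source_dir), payload)

    # Bag declaration required by the BagIt format.
    (bag / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )

    # One manifest line per payload file: checksum, then path relative to the bag.
    lines = []
    for path in sorted(payload.rglob("*")):
        if path.is_file():
            checksum = hashlib.md5(path.read_bytes()).hexdigest()
            lines.append(f"{checksum}  {path.relative_to(bag).as_posix()}\n")
    (bag / "manifest-md5.txt").write_text("".join(lines), encoding="utf-8")
    return bag
```

The manifest is what lets the receiving system confirm that every file arrived intact, tying the packaging step back to the checksum concept introduced earlier.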

Processing transfers

Demo

- Same materials as before

- Uploaded to transfer source on Ontario Library Research Cloud

- Process using standard workflow and settings

- Briefly demo backlog/appraisal tabs

- Store AIP on OLRC

- No DIP

Looking at AIPs

AIP Contents

- METS file

- Originals + normalized copies in ‘objects’ folder

- Materials that made up original transfer

- Logs
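One way to explore an AIP is to read its METS file and list the files it references. The sketch below uses Python's standard XML parser and the standard METS and XLink namespaces. The helper name `list_mets_files` is hypothetical, and the `USE` values in the sample (`original`, `preservation`) are assumptions about how file groups are labelled; a real METS file should be inspected to confirm.

```python
import xml.etree.ElementTree as ET

NS = {"mets": "http://www.loc.gov/METS/"}
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def list_mets_files(mets_xml):
    """Return (file group USE, file path) pairs found in a METS document."""
    root = ET.fromstring(mets_xml)
    results = []
    for group in root.iterfind(".//mets:fileGrp", NS):
        use = group.get("USE")
        for flocat in group.iterfind("mets:file/mets:FLocat", NS):
            results.append((use, flocat.get(XLINK_HREF)))
    return results


# Small hand-written sample, not taken from a real AIP.
SAMPLE_METS = """
<mets:mets xmlns:mets="http://www.loc.gov/METS/"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <mets:fileSec>
    <mets:fileGrp USE="original">
      <mets:file ID="file-1">
        <mets:FLocat LOCTYPE="OTHER" xlink:href="objects/photo1.jpg"/>
      </mets:file>
    </mets:fileGrp>
    <mets:fileGrp USE="preservation">
      <mets:file ID="file-2">
        <mets:FLocat LOCTYPE="OTHER" xlink:href="objects/photo1-normalized.tif"/>
      </mets:file>
    </mets:fileGrp>
  </mets:fileSec>
</mets:mets>
"""
```

Running `list_mets_files(SAMPLE_METS)` pairs each referenced file with its group, which gives a quick view of originals versus normalized copies.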

Thinking about DIPs

DIPs

- Set of normalized files for access, created with access policies in preservation planning module

- Archivematica can connect to AtoM for DIP deposit to existing description

- Can transfer over some metadata, so description work can be lessened, but only at transfer/item level

Activity time!