Access Across Time: How the NAA Preserves Digital Records

Andrew Wilson, Assistant Director, Preservation

Post on 04-Jan-2016

Page 1: Access Across Time: How the NAA Preserves Digital Records

Access Across Time: How the NAA Preserves Digital Records

Andrew Wilson

Assistant Director, Preservation

Page 2: Access Across Time: How the NAA Preserves Digital Records

What I will talk about

• NAA Context

• Some Concepts

• NAA Implementation

• NAA Process flow

• Preservation Software Platform (Xena)

Page 3: Access Across Time: How the NAA Preserves Digital Records

National Archives of Australia

• Established 1946 as part of National Library

• Independent since 1960

• Legislation: Archives Act 1983

• Approx. 420 staff in 12 locations

• Budget ca. A$65 million

• 350 shelf kilometres of records

• Separate preservation funding since 2001

Page 4: Access Across Time: How the NAA Preserves Digital Records

Digital Preservation Project

• Started in 2001/2 FY

• Cost to date approx. A$2 million (30 months)

• Aim: to develop a viable approach to the preservation of ‘born digital’ records for long-term accessibility and use

• Deadline: July 2003

Page 5: Access Across Time: How the NAA Preserves Digital Records

Note

• NOT a digital archive but an approach to digital preservation

• Well-developed archival processes that can be applied to records irrespective of format:

- appraisal/selection

- transfer

- description

- retrieval and access

• Project purely about preservation

Page 6: Access Across Time: How the NAA Preserves Digital Records

Some Definitions

• Records: recorded information created or received and maintained by an organisation in the transaction of business

• Digital Records: records in digital form processed by computers

• Not: systems or working applications

Page 7: Access Across Time: How the NAA Preserves Digital Records

The preservation problem

• Technological obsolescence

– Hardware

– Software

• Restrictions on the use of technology

Page 8: Access Across Time: How the NAA Preserves Digital Records

Traditionally

Researcher directly experiences the record through its source object

Preserve the object and you preserve the record

Object → Researcher

Page 9: Access Across Time: How the NAA Preserves Digital Records

But… digital records are performances

Researcher experiences the record through a performance

Preserve the performance and you preserve the record

Source → Process → Performance → Researcher

Page 10: Access Across Time: How the NAA Preserves Digital Records

A Two Part Solution

1. Keep a master copy of every source we accept into custody

- Passive Access

- Researcher gets the 'Zeros and Ones', not the performance

2. Active Intervention to recreate the performance

- Replace the source and process

- Active Access to the 'essence' of the performance

- Based on experience with Audiovisual material

Page 11: Access Across Time: How the NAA Preserves Digital Records

The essence of the record

• What we want to preserve out of the performance

- What aspects are essential to the record's value?

- What aspects are incidental to the record's value?

Page 12: Access Across Time: How the NAA Preserves Digital Records

Our preservation approach

• Select open and well documented data formats

• Migrate records into these formats (‘normalisation’)

• Support open source software tools that can read these formats

Page 13: Access Across Time: How the NAA Preserves Digital Records

Preservation System

• 3 separate components

1. Quarantine

2. Preservation

3. Storage

• All components physically separated from each other and all other NAA networks

• Access to hardware restricted to digital preservation staff

Page 14: Access Across Time: How the NAA Preserves Digital Records

Quarantine server

Records Transfer written to server

• Checksums verified

• Objects undergo virus check

• Transport medium stored on repository shelf for at least 4 weeks

• Objects then re-checked for viruses using new virus definitions

Checked objects written to transport medium

Digital Preservation recorder (DPr) captures information about actions on each digital object

For the technically minded:

- Dell PowerEdge 2600 server

- 2 x 2GHz processors

- 0.7 TB disk store

- independent UPS
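The checksum and quarantine checks on this slide can be sketched in a few lines of Python. This is an illustration only, not the NAA's implementation: the slides do not name a checksum algorithm, so SHA-256 is an assumption, and the manifest format and the names `file_checksum`, `verify_transfer` and `quarantine_release_date` are hypothetical.

```python
import hashlib
from datetime import date, timedelta


def file_checksum(path, algorithm="sha256", chunk_size=65536):
    """Return the hex digest of a file, read in chunks to bound memory use.

    Assumption: SHA-256; the slides do not say which algorithm is used.
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_transfer(manifest):
    """Check each transferred file against the checksum supplied with it.

    `manifest` maps file path -> expected hex digest (hypothetical format).
    Returns the paths whose current checksum does not match.
    """
    return [path for path, expected in manifest.items()
            if file_checksum(path) != expected.lower()]


def quarantine_release_date(received, hold_weeks=4):
    """Earliest date for the second virus check (at least 4 weeks on the shelf)."""
    return received + timedelta(weeks=hold_weeks)
```

An empty list from `verify_transfer` means every object arrived intact; any path it returns would be a candidate for re-transfer before the virus checks begin.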

Page 15: Access Across Time: How the NAA Preserves Digital Records

Preservation server

Transport medium is attached to preservation server

Preservation software platform (Xena) processes digital objects

Xena outputs two new objects and calculates new checksums for each:

1. Wrapped bitstream

2. ‘Normalised version’

Output sets written to new transport media

Digital Preservation recorder (DPr) captures information about actions on each digital object

For the technically minded:

- Dell PowerEdge 2600 server

- 2 x 2GHz processors

- 0.7 TB disk store

- independent UPS
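One way to picture the 'wrapped bitstream' output is the original bytes carried unchanged inside an XML envelope, alongside a checksum. The sketch below is an assumption-laden illustration, not Xena's actual output format: the element names (`wrapped-bitstream`, `original-name`, `checksum`, `content`), the use of base64 and the choice of SHA-256 are all hypothetical.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def wrap_bitstream(data, original_name):
    """Wrap raw bytes in a small XML envelope (element names are hypothetical)."""
    root = ET.Element("wrapped-bitstream")
    ET.SubElement(root, "original-name").text = original_name
    # Checksum of the original bytes, so the payload can be verified on unwrap.
    ET.SubElement(root, "checksum", algorithm="sha256").text = (
        hashlib.sha256(data).hexdigest())
    # base64 keeps arbitrary binary data safe inside an XML text node.
    ET.SubElement(root, "content", encoding="base64").text = (
        base64.b64encode(data).decode("ascii"))
    return ET.tostring(root, encoding="unicode")


def unwrap_bitstream(xml_text):
    """Recover the original bytes and confirm they match the stored checksum."""
    root = ET.fromstring(xml_text)
    data = base64.b64decode(root.findtext("content"))
    if hashlib.sha256(data).hexdigest() != root.findtext("checksum"):
        raise ValueError("checksum mismatch: wrapped content was altered")
    return data
```

A checksum of the wrapper file itself could be calculated in the same way once it is written to the transport medium.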

Page 16: Access Across Time: How the NAA Preserves Digital Records

Digital Repository

Transport media are attached to repository server

2 copies on RAID storage

- Configured as RAID 10

- Automated, regular, frequent verification of checksums

Third copy on digital tape which is stored offsite

Simple management application to allow access to digital objects (e.g. DSpace)

Copies written to new media for access

Digital Preservation recorder (DPr) captures information about actions on each digital object

For the technically minded:

- Dell PowerEdge 2600 server

- 2 x 2GHz processors

- 0.7 TB disk store

- fibre channel between server and RAID

- independent UPS
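The regular checksum verification across the stored copies can be sketched as a simple audit loop. This is a minimal illustration under stated assumptions: SHA-256 as the checksum, each copy represented as a directory, and a `recorded` dictionary of expected digests; the names `sha256_of` and `audit_copies` are hypothetical and do not describe the NAA's repository software.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_copies(recorded, copy_roots):
    """Verify every stored copy against the recorded checksums.

    recorded:   relative file name -> expected hex digest
    copy_roots: directories that each hold one full copy of the objects
    Returns a list of (root, name, problem) tuples; empty means all copies agree.
    """
    problems = []
    for root in copy_roots:
        for name, expected in recorded.items():
            candidate = Path(root) / name
            if not candidate.is_file():
                problems.append((str(root), name, "missing"))
            elif sha256_of(candidate) != expected:
                problems.append((str(root), name, "checksum mismatch"))
    return problems
```

Keeping more than one copy is what makes such an audit useful: a copy that fails the check can be replaced from one that still matches its recorded checksum.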

Page 17: Access Across Time: How the NAA Preserves Digital Records

NAA Implementation

1. Follows Open Archival Information System framework

2. Non-proprietary, open source solution

3. Based on the Extensible Markup Language (XML)

Page 18: Access Across Time: How the NAA Preserves Digital Records

xena = XML Electronic Normalising of Archives

Page 19: Access Across Time: How the NAA Preserves Digital Records

xena

• File-based

• Java/Swing application

• Runs on Java 1.3+

• Packaged as an executable .jar file

• Modular

• Multiple document interface

Page 20: Access Across Time: How the NAA Preserves Digital Records

xena functionality:

A core module plus ‘plug-in’ modules which do:

• File format guessing

• File ‘normalisation’

• XML encapsulation

• Process and data verification

• File viewing

Page 21: Access Across Time: How the NAA Preserves Digital Records

Core module

The core consists of:

I. Graphical User Interface components

II. Plug-in management components

III. Generic validation components

Page 22: Access Across Time: How the NAA Preserves Digital Records

Plug-in modules

• Plug-ins are created for identified data types that are to be processed. Each plug-in consists of:

– A guesser component

– One or more input format type components

– A normalised format type

– One or more normalisation modules

– One or more view components

– Sorting functionality

– Validation functionality

– Printing functionality

– GUI interaction methods
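The division of labour between a core and its plug-ins can be sketched as a small registry. Xena itself is a Java/Swing application and its real interfaces are not shown in these slides, so everything below is hypothetical: the `PlugIn` and `Core` classes, the two example plug-ins, and the choice of Python for brevity. Only the guesser and normalisation parts of a plug-in are modelled.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the fixed 8-byte header of every PNG file


@dataclass
class PlugIn:
    name: str
    guess: Callable[[bytes], bool]       # guesser component
    normalise: Callable[[bytes], bytes]  # normalisation module


def _is_utf8_text(data: bytes) -> bool:
    """Guess 'plain text' when the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def _unify_line_endings(data: bytes) -> bytes:
    """Toy normalisation: convert CRLF and bare CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


class Core:
    """Plug-in management: keeps the plug-ins and dispatches objects to them."""

    def __init__(self) -> None:
        self._plugins: List[PlugIn] = []

    def register(self, plugin: PlugIn) -> None:
        self._plugins.append(plugin)

    def guess(self, data: bytes) -> Optional[PlugIn]:
        # The first plug-in whose guesser accepts the data wins.
        for plugin in self._plugins:
            if plugin.guess(data):
                return plugin
        return None

    def process(self, data: bytes) -> Tuple[Optional[str], bytes]:
        plugin = self.guess(data)
        if plugin is None:
            return None, data  # unknown type: leave the bytes untouched
        return plugin.name, plugin.normalise(data)


core = Core()
core.register(PlugIn("png", lambda d: d.startswith(PNG_SIGNATURE), lambda d: d))
core.register(PlugIn("plain-text", _is_utf8_text, _unify_line_endings))
```

Because the core only knows the plug-in interface, support for a new data type is added by registering another plug-in rather than by changing the core.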

Page 23: Access Across Time: How the NAA Preserves Digital Records

DEMONSTRATION OF

XENA

Page 24: Access Across Time: How the NAA Preserves Digital Records

Contacts

Andrew Wilson

Project Manager

AtoR Digital Preservation Project

+61 2 6212 [email protected]

Web: http://www.naa.gov.au/recordkeeping/preservation/summary.html

Page 25: Access Across Time: How the NAA Preserves Digital Records

THANK YOU

ANY QUESTIONS?