digital destiny
DESCRIPTION
Presentation on electronic records management and archival issues. Originally presented at the Fall 2008 meeting of the Southeastern Wisconsin Archivists Group.
TRANSCRIPT
Brad Houston
University Records Archivist
University of Wisconsin-Milwaukee
More records created since 1945 than during the 3000 years before that
90% of all of these new records are born-digital
Electronic records are ephemeral!
Electronic records are intangible!
Electronic records are decentralized! (multiple creators)
Result: massive challenges for institutional archivists
This presentation will not:
Give detailed instructions on setting up an electronic records program at your institution
Endorse or explain specific Records Management Applications (RMAs)
Tell you that your electronic records management system is WRONG
Pretend to be the last word on electronic records
This presentation will:
Provide an overview of electronic records challenges
Examine how the “functions” of archival practice (appraisal, processing, access, etc.) change in light of electronic records
Provide some (not all) vital characteristics to look for in archival e-recs systems
Allow for lots of discussion and feedback at the end
UWM Archival Collection 200: University Communications and Media Relations
Mixed media (mostly photographs), including digital photos
First ‘born-digital’ accession of material in UWM Archives
Processed Spring 2008 by UWM SOIS fieldworker
Mistakes were made by professional staff: here’s how to avoid them!
Data or information that has been captured and fixed for storage and manipulation in an automated system and that requires the use of the system to render it intelligible by a person. (Glossary of Archival and Records Terminology)
“Strictly speaking, it is not possible to preserve an electronic record…” (Luciana Duranti)
Digitized: records scanned into the system for access or preservation purposes
“Behaves” like analog records: usually discrete, usually separable for RM purposes
Born Digital: records created ‘on-line’ for electronic use, disposition
Often linked to other records (e.g. in a relational database), making it hard to separate/schedule
Requires unique identity of records
Dates made, transmitted, received, filed
Names of authors, addressee, recipient, creators
Requires provable integrity of records
Name of handling office
Name of records custodian
Indication of annotations, modification actions
Indication of technical modifications
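The identity and integrity elements listed above could be captured as a simple structured record. What follows is a minimal Python sketch and not part of the original presentation; the class name `RecordMetadata`, its field names, and the helper `missing_elements` are hypothetical, chosen only to mirror the list.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class RecordMetadata:
    """Hypothetical container for the identity and integrity elements."""
    # Identity: dates and names
    date_made: Optional[str] = None
    date_transmitted: Optional[str] = None
    date_received: Optional[str] = None
    date_filed: Optional[str] = None
    authors: List[str] = field(default_factory=list)
    addressee: Optional[str] = None
    recipient: Optional[str] = None
    creator: Optional[str] = None
    # Integrity: who handled the record and what was changed
    handling_office: Optional[str] = None
    records_custodian: Optional[str] = None
    annotations: List[str] = field(default_factory=list)
    technical_modifications: List[str] = field(default_factory=list)


# Assumption: a record may legitimately have no annotations or modifications,
# so these two elements are allowed to be empty.
OPTIONAL_ELEMENTS = {"annotations", "technical_modifications"}


def missing_elements(meta: RecordMetadata) -> List[str]:
    """Return the names of required elements that are empty or unset."""
    return [
        f.name
        for f in fields(meta)
        if f.name not in OPTIONAL_ELEMENTS and not getattr(meta, f.name)
    ]
```

Running `missing_elements` at transfer time would give a quick list of what to ask the creating office for before accessioning.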
Message: Is the content of the document adequately preserved?
Media: Is the storage medium durable enough to retain its integrity over time?
Metadata: Is there enough supplementary info to contextualize and prove authenticity of the document?
If you lose even one of these components of an electronic record, you have not adequately preserved it.
Make sure your records management program is up to date!
Records Schedules!
Electronic Document Management!
Electronic Records Management!
▪ Note: EDMS and ERMS are NOT the same!
Training, Workshops, Outreach!
▪ Fundamental message: E-recs are records too!
Includes rules which govern:
Which documents are eligible for inclusion
Who inputs/removes records (“trusted custodian”)
How long records remain (Classification system)
How to remove expired records (retention scheduling)
Key function: guaranteeing ongoing authenticity of records
DoD 5015.2: U.S. standard for TRSs (trusted recordkeeping systems)
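The retention-scheduling rule above can be illustrated with a small calculation. This is a hedged Python sketch only: the classifications and retention periods in `RETENTION_YEARS` are invented for illustration and do not come from any UWM schedule or from DoD 5015.2.

```python
from datetime import date

# Hypothetical retention schedule: classification -> years to retain.
# Real values come from the institution's approved records schedules.
RETENTION_YEARS = {
    "routine-correspondence": 2,
    "financial-report": 7,
}


def disposition_date(classification: str, filed: date) -> date:
    """Return the date on which a record's retention period ends."""
    years = RETENTION_YEARS[classification]
    try:
        return filed.replace(year=filed.year + years)
    except ValueError:
        # Filed on 29 February and the target year is not a leap year
        return filed.replace(year=filed.year + years, day=28)


def is_expired(classification: str, filed: date, today: date) -> bool:
    """True once the retention period for the record has run out."""
    return today >= disposition_date(classification, filed)
```

An expired record is only a candidate for removal; whether it is destroyed or transferred to the archives is still a scheduling decision.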
Electronic records are tricky to deal with without RM, but by no means impossible
Coordinate metadata collection/transfer procedures with organization records manager
Work with targeted creators directly to encourage file organization
Talk to your IT dept., administration about RM utility, functional requirements
Received directly from Univ. Communications and Media Relations; maintained case file
Chain of custody, context of digital photos
Submitted on CD-Rs along with analog photos
Immutability of format suggests authenticity
Problem: no trusted recordkeeping system at UWM
Authenticity is presumed, but not demonstrated
None active yet, but we’re working on it…
UWM uses Xythos for shared file storage
EDMS capabilities should be enabled within year
ERMS capabilities are DoD 5015.2 compliant, but have not been discussed in detail yet
Importance of “getting in on the ground floor”
Talk to your IT dept. about TRS requirements!
Knowledge
Information Ecosystem
Information Studies
Documentary Forms in the Digital Environment
Skills
Management Skills
Technical Skills
Soft Skills
How to survey mass quantities of e-records?
How to appraise series of interrelated e-recs?
How to prepare records for accession?
“Post-custodial era?”
Increased role of questionnaires/surveys
Przybyla and Huth: we should become partners and students of creators and IT staff
Swain: Appraise at series level, NOT item or folder level
Key point: understand how systems work to document transactions, and appraisal follows naturally
Moving records from recordkeeping to preservation system
Based on work Wilczek and Glick did at Tufts and Yale
Involves creation of SIPs (Submission Information Packages) and AIPs (Archival Information Packages)
Content information
Preservation Description Information
Establish relationship, define project, collect information
Creation of XML schema for records
Assess value, record types, formats, identification, copyright, access rights
Create or modify policies for each of above
Assess recordkeeping system
Assess feasibility of submission project
Finalize submission agreement
Create and Transfer Submission Info Packs
Includes: metadata, digital signatures, transformation audit trail
Validate SIPs and transform metadata
Formulate Archival Info Packs and create configuration rules
Assess AIPs
Formally accession
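The "create" and "validate" steps for a SIP can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Tufts/Yale procedure: plain SHA-256 checksums stand in for the digital signatures mentioned above, and the function names `build_manifest` and `validate_sip` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(sip_dir: Path) -> Dict[str, str]:
    """Map each file's path (relative to the SIP directory) to its checksum."""
    return {
        path.relative_to(sip_dir).as_posix(): sha256_of(path)
        for path in sorted(sip_dir.rglob("*"))
        if path.is_file()
    }


def validate_sip(sip_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Return relative paths that are missing or whose checksum has changed."""
    problems = []
    for rel_path, expected in manifest.items():
        target = sip_dir / rel_path
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(rel_path)
    return problems
```

The producer would build the manifest before transfer and the archives would run the validation on receipt; an empty list means every listed file arrived unchanged.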
Photos were transferred in 2006, so preliminary appraisal did not occur
Photos were grouped within CD-R by subject
Within directories, often many shots of same event from different angles, different lighting, etc.
Fieldworker grouped photographs within subject by event pictured, then sampled
Lessons learned: encourage better use of metadata by producers
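The group-then-sample approach described for the case study might look like the following in Python. This is a rough sketch with loud assumptions: the presentation does not say how the fieldworker sampled, so random selection is used here purely for illustration, and each photo's parent directory is treated as its event. The function names and example paths are hypothetical.

```python
import random
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Dict, List


def group_by_event(paths: List[str]) -> Dict[str, List[str]]:
    """Group photo paths by parent directory, treated here as the event."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for p in paths:
        groups[str(PurePosixPath(p).parent)].append(p)
    return dict(groups)


def sample_events(
    groups: Dict[str, List[str]], per_event: int = 2, seed: int = 0
) -> Dict[str, List[str]]:
    """Keep at most `per_event` randomly chosen photos from each event."""
    rng = random.Random(seed)  # fixed seed so the selection is repeatable
    return {
        event: sorted(rng.sample(files, min(per_event, len(files))))
        for event, files in groups.items()
    }
```

In practice an archivist would more likely choose the best shots by eye; the point of the sketch is only that grouping first keeps every event represented in the sample.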
Do traditional arrangement schemes apply?
What constitutes an electronic “series”?
Is “folder-level description” meaningful?
Is “item-level description” practical?
Search engine technology?
File name issues?
File Format issues?
What do we do with this metadata?
In some cases, directories = folders
Item-level description possible for small collections; directory-level will be more common
In other cases, entire database must be described and made available
Access to Archival Databases (NARA)
Post-Custodial effect: encourage standardization among active users
File naming, directory structure
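As one example of what a file-naming standard could mean in practice, here is a small Python sketch that normalizes names to lower-case ASCII with hyphens. The convention itself is an assumption made for illustration; the presentation does not prescribe one, and `normalize_filename` is a hypothetical helper.

```python
import re
import unicodedata


def normalize_filename(name: str) -> str:
    """Lower-case a file name and replace runs of unsafe characters with hyphens."""
    stem, dot, ext = name.rpartition(".")
    if not dot:  # no extension present
        stem, ext = name, ""
    # Strip accents, then drop anything that is not plain ASCII
    ascii_stem = (
        unicodedata.normalize("NFKD", stem).encode("ascii", "ignore").decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into a single hyphen
    cleaned = re.sub(r"[^a-z0-9]+", "-", ascii_stem.lower()).strip("-")
    return f"{cleaned}.{ext.lower()}" if ext else cleaned
```

A rule this simple is easy to explain to active users, which matters more for adoption than covering every edge case.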
Digital photos were treated the same as analog photos in description
Directory structure was already present for arrangement; file names inconsistent
Less of a problem because of thumbnails
Arranged digital photos as separate series because of access issues
Very ad hoc process: should institute policy for next processing project
Problems with hardware and software obsolescence
Problems with file format obsolescence
Physical storage necessities
Of the three, probably the least pressing problem
“There is a much greater assurance that 20 or 30 years from now, you’ll be able to find records from the Civil War than you will from anything that’s going on today.” (Amy Moran)
Can you still read these?
1. Migration: Moving files to new systems on periodic basis
2. Emulation: creating programs to read original datastreams
CAMiLEON project, Univ. of Michigan
3. Encapsulation: providing a framework to read files within a discrete XML ‘wrapper’
The best solution, but also the most difficult
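To make the idea of an XML ‘wrapper’ concrete, the following Python sketch bundles a file's bytes with a little descriptive metadata in one XML document. It is a toy illustration, not a standard: the element names (`record`, `metadata`, `content`) and the functions `encapsulate` and `extract` are hypothetical, and a production system would follow an established packaging schema.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET


def encapsulate(filename: str, content: bytes, media_type: str) -> str:
    """Wrap a file's bytes and basic metadata in a single XML document."""
    root = ET.Element("record")
    meta = ET.SubElement(root, "metadata")
    ET.SubElement(meta, "filename").text = filename
    ET.SubElement(meta, "mediaType").text = media_type
    ET.SubElement(meta, "sha256").text = hashlib.sha256(content).hexdigest()
    # Binary content is base64-encoded so it can travel inside XML text
    body = ET.SubElement(root, "content", encoding="base64")
    body.text = base64.b64encode(content).decode("ascii")
    return ET.tostring(root, encoding="unicode")


def extract(wrapper_xml: str) -> bytes:
    """Recover the original bytes, checking them against the stored checksum."""
    root = ET.fromstring(wrapper_xml)
    content = base64.b64decode(root.findtext("content"))
    if hashlib.sha256(content).hexdigest() != root.findtext("metadata/sha256"):
        raise ValueError("content does not match recorded checksum")
    return content
```

Because the metadata and the checksum travel with the content, the wrapper keeps message and metadata together, which is what makes encapsulation attractive despite the extra work.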
Why reformat?
The usefulness (or not) of standards
Are native formats viable? “It depends” (h/t Susan Davis)
A good stopgap solution, but should not replace creation of preservation system
Consider usability of new format
XML is ideal, but again requires most work
Formats for textual records
Text File (UNICODE encoding)
Open Document Format (ODF)
PDF and PDF/A
Formats for image records
TIFF
JPEG 2000
Formats for other A/V records
AAF (Advanced Authoring Format)
After sampling, photos for preservation were converted to TIFF
TIFF preservation files currently stored on UWM Archives Dept. Server
Probably not the best solution, but acceptable
Copies were made and converted to JPEG for access copy
Numerous mass-converters on market to do this quickly
How will your users discover the files?
Finding Aid as normal?
Digital collection page?
How will your users get to the files?
Web access vs. In-house access
Direct access vs. access copies
How will e-recs access reorient your reference process?
“Reading-room only access to digital content is not the desired or expected access.” (Tim Pyatt)
Feasibility considerations of online access
Access copies
Greater usability
Short term: faster load time, familiar interface
Long term: use as a backup if data is lost
Reference will shift from searching-oriented to research-oriented questions
Finding aid notes in Use Restrictions field that access copies are available
Patrons are referred to CD on which access copy is found to view photos
CD is for reading room use only
Volume of photos
No web access… yet
May add some to our Digital Collections in future
We treat these as analog for access; may not be as useful down the road
Coordinate with your records management program before even THINKING archives
Encourage donors/creators to practice good arrangement processes with active files
Work with administration, IT dept. EARLY to develop requirements for recordkeeping
Use producers’ knowledge of file schemes to inform appraisal decisions
Develop policies to standardize process, add authority to solicitation
Consider digital preservation environment (Emulation? Migration? Transformation?)
Rethink concept of archival series: not necessarily analogous, esp. for born-digital!
Outreach, Outreach, Outreach! Did I mention outreach?
InterPARES project http://www.interpares.org/
Open Archival Information System Reference Guide http://public.ccsds.org/publications/archive/650x0b1.pdf
CAMiLEON project (Univ. of Michigan and Univ. of Leeds) http://www.si.umich.edu/CAMILEON/
Fedora Project Ingest Guide http://dca.lib.tufts.edu/features/nhprc/reports/ingest/index.html
New Skills for a Digital Era: proceedings and case studies http://rpm.lib.az.us/NewSkills/index.asp
PDF/A Competence Center http://www.pdfa.org
DoD 5015.2 RMA design criteria standard http://www.dtic.mil/whs/directives/corres/pdf/501502std.pdf
Slides of this presentation will be available on the UWM Records Management website http://www.uwm.edu/Libraries/arch/recordsmgt/education.html
Any other questions? Contact me: [email protected] 414-229-6979