1
A Short Introduction to OAIS
Open Archival Information System
2
Agenda
PART 1
• OAIS
• The concept of an archive
• The concept of information
• The processes
• Related standards
3
History
• The NASA National Space Science Data Center is NASA's first digital archive (since 1966)
• The Consultative Committee for Space Data Systems (CCSDS) has taken part in ISO work since 1990
• “Blue book” from CCSDS in 2002
• Now a CCSDS “Recommendation” and an ISO Standard
4
OAIS
DECLARATION OF CONTENTS:
• Information concept 40%
• Process descriptions 40%
• Interfaces 10%
• Miscellaneous + filler 10%
5
A “Reference model”
• Abstract, top-down
• Understanding how things fit together: archivists, managers, users
• A framework model for further standardization
• Focus on electronic records, but also covers non-electronic ones
6
Who uses it?
• “Everyone”
• Digital libraries
• Paper archives
• Scientific data centres
7
What is an archive?
• OAIS: The main task of an archive is to preserve a collection of information and make it available, in an understandable and usable form, to a “Designated Community” (Danish: “udpeget målgruppe”?)
8
Model View of an OAIS Environment
Producer is the role played by those persons, or client systems, who provide the information to be preserved
Management is the role played by those who set overall OAIS policy as one component in a broader policy domain
Consumer is the role played by those persons, or client systems, who interact with OAIS services to find and acquire preserved information of interest
[Diagram: the OAIS (archive) surrounded by Management, Producer, and Consumer]
9
Open Archival Information System: Six Functional Entities
[Diagram: the six functional entities Ingest, Archival Storage, Data Management, Administration, Preservation Planning, and Access, placed between PRODUCER, MANAGEMENT, and CONSUMER. Labelled flows: SIP, AIP, DIP, Descriptive Info., queries, result sets, orders.]
SIP = Submission Information Package
AIP = Archival Information Package
DIP = Dissemination Information Package
10
OAIS Information Definition
Information is always expressed (i.e., represented) by some type of data
Data interpreted using its Representation Information yields Information
Information Object preservation requires clear identification and understanding of the Data Object and its associated Representation Information
[Diagram: Data Object, interpreted using its Representation Information, yields Information Object]
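As a loose illustration of this definition (not taken from the OAIS standard), the Python sketch below treats a byte string as the Data Object and a small record as its Representation Information. The class names, the fields `encoding` and `field_names`, and the sample values are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RepresentationInformation:
    """Hypothetical stand-in: how to decode and label the bits."""
    encoding: str        # structure: character encoding of the bytes
    field_names: tuple   # semantics: what each comma-separated value means


@dataclass(frozen=True)
class DataObject:
    bits: bytes


def interpret(data: DataObject, rep: RepresentationInformation) -> dict:
    """Data interpreted using its Representation Information yields Information."""
    values = data.bits.decode(rep.encoding).split(",")
    return dict(zip(rep.field_names, values))


data = DataObject(bits=b"1966,NSSDC")
rep = RepresentationInformation(encoding="ascii", field_names=("year", "archive"))
information = interpret(data, rep)
print(information)  # {'year': '1966', 'archive': 'NSSDC'}
```

Without `rep`, the bytes `b"1966,NSSDC"` are only data; the archive therefore has to preserve both halves.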
11
Information Package Definition
An Information Package is a conceptual container holding two types of information
– Content Information
– Preservation Description Information (PDI)
[Diagram: an Information Package containing Content Information and Preservation Description Information]
12
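Information Object
[Diagram: an Information Object is a Data Object interpreted using Representation Information (multiplicity 1+). A Data Object is either a Physical Object or a Digital Object, and a Digital Object consists of 1+ Bit Sequences. Representation Information may itself be interpreted using further Representation Information.]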
13
Recursive Nature of Representation Information
[Diagram: Representation Information comprises Structure Information, Semantic Information, and Other Representation Information. Semantic Information adds meaning to Structure Information, and Representation Information may itself be interpreted using further Representation Information (multiplicities 1 and * in the original figure).]
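The recursion can be pictured as a dependency walk: to interpret one item you may need further Representation Information, which in turn needs more. The sketch below is a minimal illustration under that reading. The `REP_NETWORK` dictionary and its entries are invented for the example and do not come from the standard.

```python
# Hypothetical representation network: each item lists the further
# Representation Information needed to interpret it.
REP_NETWORK = {
    "survey.csv": ["CSV format description", "column dictionary"],
    "CSV format description": ["ASCII"],
    "column dictionary": ["English"],
    "ASCII": [],
    "English": [],
}


def required_representation_info(item, network, seen=None):
    """Collect, depth-first, everything needed to interpret `item`."""
    if seen is None:
        seen = []
    for dependency in network.get(item, []):
        if dependency not in seen:  # guard against cycles and repeats
            seen.append(dependency)
            required_representation_info(dependency, network, seen)
    return seen


needed = required_representation_info("survey.csv", REP_NETWORK)
print(needed)
# ['CSV format description', 'ASCII', 'column dictionary', 'English']
```

In practice the walk has to stop somewhere; in this toy network it stops at items with no further dependencies.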
14
Types of Information Used in OAIS
[Diagram: Information Object, specialized into Content Information, Packaging Information, Preservation Description Information, Descriptive Information, ...]
15
Content Information
• The information which is the primary object of preservation
• An instance of Content Information is the information that an archive is tasked to preserve
• Deciding what is the Content Information may not be obvious and may need to be negotiated with the Producer
• The Data Object in the Content Information may be either a Digital Object or a Physical Object (e.g., a physical sample, microfilm)
16
Preservation Description Information
• Provenance Information
– Describes the source of Content Information, who has had custody of it, what is its history
• Context Information
– Describes how the Content Information relates to other information outside the Information Package
• Reference Information
– Provides one or more identifiers, or systems of identifiers, by which the Content Information may be uniquely identified
• Fixity Information
– Protects the Content Information from undocumented alteration
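A minimal sketch of how the four PDI categories might sit together in one record, with a checksum serving as Fixity Information. The `PDI` class, its field contents, and the choice of SHA-256 are assumptions for illustration; OAIS does not prescribe a particular algorithm or layout.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PDI:
    """Hypothetical record holding the four PDI categories."""
    provenance: str  # source, custody, history
    context: str     # relation to information outside the package
    reference: str   # identifier for the Content Information
    fixity: str      # here: hex SHA-256 digest of the content bits


def make_pdi(content: bytes, provenance: str, context: str, reference: str) -> PDI:
    """Build the PDI, computing the fixity digest from the content."""
    return PDI(provenance, context, reference, hashlib.sha256(content).hexdigest())


def is_unaltered(content: bytes, pdi: PDI) -> bool:
    """Fixity check: recompute the digest and compare with the stored one."""
    return hashlib.sha256(content).hexdigest() == pdi.fixity


content = b"example content bits"
pdi = make_pdi(content, "received from Producer X", "part of series Y", "id-0001")
print(is_unaltered(content, pdi))         # True
print(is_unaltered(content + b"!", pdi))  # False
```

A digest of this kind detects alteration but does not document it; the change history itself belongs under Provenance.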
17
PDI Examples
Columns: Content Information Type | Reference | Provenance | Context | Fixity

Space Science Data
– Reference: Object identifier; Journal reference; Mission, instrument, title, attribute set
– Provenance: Instrument description; Processing history; Sensor description; Instrument; Instrument mode; Decommutation map; Software interface specification
– Context: Calibration history; Related data sets; Mission; Funding history
– Fixity: CRC; Checksum; Reed-Solomon coding

Digital Library Collections
– Reference: Bibliographic description; Persistent identifier
– Provenance: For scanned collections: metadata about the digitisation process, pointer to master version. For born-digital publications: pointer to the digital original. Metadata about the preservation process: pointers to earlier versions of the collection item, change history
– Context: Pointers to related documents in original environment at the time of publication
– Fixity: Digital signature; Checksum; Authenticity indicator

Software Package
– Reference: Name; Author/Originator; Version number; Serial number
– Provenance: Revision history; License holder; Registration; Copyright
– Context: Help file; User guide; Related software; Language
– Fixity: Certificate; Checksum; Encryption; CRC
18
Descriptive Information
Contains the data that serve as input to documents or applications called Access Aids.
Access Aids can be used by a consumer to locate, analyze, retrieve, or order information from the OAIS.
19
Packaging Information
Information which, either actually or logically, binds and relates the components of the package into an identifiable entity on specific media
Examples of Packaging Information include tape marks, directory structures and filenames
20
OAIS Archival Information Package
[Diagram: an Archival Information Package (AIP) and its parts]
• Content Information, e.g.,
– Hardcopy document
– Document as an electronic file together with its format description
– Scientific data set consisting of image file, text file, and format descriptions file describing the other files
• Preservation Description Information (PDI), which further describes the Content Information, e.g.,
– How the Content Information came into being, who has held it, how it relates to other information, and how its integrity is assured
• Packaging Information, which delimits the package, e.g.,
– How to find Content Information and PDI on some medium
• Package Description, derived from the package, e.g.,
– Information supporting customer searches for AIP
21
AIP Types
• Archival Information Unit (AIU) contains a single Data Object as the Content Object
• Archival Information Collection (AIC) contains multiple AIPs in its Content Object
– Each member of an AIC is an AIP containing Content Information and PDI
– The AIC contains unique PDI on the collection process
[Diagram: Archival Information Package, specialized into Archival Information Unit and Archival Information Collection]
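Because an AIC's members are themselves AIPs, collections can nest. The sketch below models that as a small recursive type. The class layout, field names, and the helper `count_units` are illustrative assumptions, not definitions from the standard.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class AIU:
    """Archival Information Unit: a single Data Object as content."""
    identifier: str
    data_object: bytes
    pdi: str


@dataclass
class AIC:
    """Archival Information Collection: its content is other AIPs."""
    identifier: str
    pdi: str  # PDI about the collection process itself
    members: List["AIP"] = field(default_factory=list)


AIP = Union[AIU, AIC]  # an AIP is either a unit or a collection


def count_units(aip: AIP) -> int:
    """Count the AIUs reachable from an AIP, descending into collections."""
    if isinstance(aip, AIU):
        return 1
    return sum(count_units(member) for member in aip.members)


letters = AIC("coll-1", "collected 2004", [AIU("u1", b"a", "p1"), AIU("u2", b"b", "p2")])
holdings = AIC("coll-0", "top level", [letters, AIU("u3", b"c", "p3")])
print(count_units(holdings))  # 3
```

Each level carries its own PDI: the units describe their content, while the collection describes how and why it was assembled.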
22
Package Descriptions and Access Aids
• Package Descriptions are needed by an OAIS to provide visibility and access to the OAIS holdings
• Package Descriptions contain one or more Associated Descriptions, which describe the AIP Content Information from the point of view of a single Access Aid
• Some examples of Access Aids include:
– Finding Aids: assist the consumer in locating information of interest
– Ordering Aids: allow the consumer to discover the cost of and order AIUs of interest
– Retrieval Aids: enable authorized users to retrieve the AIU described by the Unit Descriptor from Archival Storage
23
Information Model Summary
• Presented a model of information objects as containing data objects and representation objects
• Classified information required for long-term archiving into 4 classes: Content Information, PDI, Packaging Information and Descriptive Information
• Described how these classes would be aggregated and related in an AIP to fully describe an instance of Content Information
• Presented information needed for Access, in addition to that needed for long-term preservation
• Put the access-oriented structures in the context of the other data needed to operate an OAIS
24
Detailed Models
Functional View
25
General Principles
• Highlight the major functional areas important to digital archiving
• Use functional decomposition to clarify the range of functionality that might be encountered
– Don't decompose beyond two levels to avoid becoming too implementation dependent
– Provide a useful set of terms and concepts
– Do not imply that all archives need to implement all the sub-functions
• Identify some common services which are likely to be needed, and are assumed to be available, as underlying support
26
Open Archival Information System: Six Functional Entities
[Diagram repeated from slide 9: Ingest, Archival Storage, Data Management, Administration, Preservation Planning, and Access, with the SIP, AIP, and DIP flows between PRODUCER, MANAGEMENT, and CONSUMER]
27
Functional Entities in an OAIS
• Ingest: This entity provides the services and functions to accept Submission Information Packages (SIPs) from Producers and prepare the contents for storage and management within the archive
• Archival Storage: This entity provides the services and functions for the storage, maintenance and retrieval of Archival Information Packages
• Data Management: This entity provides the services and functions for populating, maintaining, and accessing both descriptive information which identifies and documents archive holdings and internal archive administrative data
• Administration: This entity manages the overall operation of the archive system
• Preservation Planning: This entity monitors the environment of the OAIS and provides recommendations to ensure that the information stored in the OAIS remains accessible to the Designated User Community over the long term, even if the original computing environment becomes obsolete
• Access: This entity supports consumers in determining the existence, description, location and availability of information stored in the OAIS, and allows consumers to request and receive information products
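To make the division of labour concrete, here is a toy sketch of the SIP to AIP to DIP path through Ingest, Archival Storage, Data Management, and Access. It is only an illustration: the `Archive` class, its method names, and the dictionary-based storage are assumptions, and Administration and Preservation Planning are left out entirely.

```python
from dataclasses import dataclass


@dataclass
class SIP:
    producer: str
    title: str
    content: bytes


@dataclass
class AIP:
    aip_id: str
    content: bytes
    provenance: str


@dataclass
class DIP:
    aip_id: str
    content: bytes


class Archive:
    def __init__(self):
        self.archival_storage = {}  # aip_id -> AIP
        self.data_management = {}   # aip_id -> descriptive info
        self._next_id = 1

    def ingest(self, sip: SIP) -> str:
        """Ingest: accept a SIP, generate an AIP and its descriptive info."""
        aip_id = f"aip-{self._next_id}"
        self._next_id += 1
        provenance = f"submitted by {sip.producer}"
        self.archival_storage[aip_id] = AIP(aip_id, sip.content, provenance)
        self.data_management[aip_id] = {"title": sip.title}
        return aip_id

    def query(self, word: str) -> list:
        """Access: find holdings whose title contains `word`."""
        return [aip_id for aip_id, info in self.data_management.items()
                if word.lower() in info["title"].lower()]

    def order(self, aip_id: str) -> DIP:
        """Access: retrieve the AIP and deliver it as a DIP."""
        aip = self.archival_storage[aip_id]
        return DIP(aip.aip_id, aip.content)


archive = Archive()
aip_id = archive.ingest(SIP("Agency A", "Census 1990", b"census data"))
hits = archive.query("census")
dip = archive.order(hits[0])
print(aip_id, hits, dip.content)  # aip-1 ['aip-1'] b'census data'
```

The consumer never touches `archival_storage` directly: queries go to the descriptive information, and content leaves the archive only as a DIP.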
29
Ingest Data Flow Diagram
NESTOR/Grunberger, Rev f, 2-25-99
[Diagram: the Ingest functions Receive Submission, Quality Assurance, Generate AIP, Generate Descriptive Info, and Co-ordinate Updates, connected to PRODUCER, Administration, Archival Storage, and Data Management. Labelled flows: SIP, [Updated] SIP, QA results, AIP, Descriptive info., Storage confirmation, Report request, Report, Format & doc. stds.]
34
Digital Migration Approaches
• Four primary types of digital migration in response to motivators, ordered by increasing risk of information loss:
– Refreshment: media replacement with no bit changes
– Replication: no change to Packaging Information or Content Information bits
– Repackaging: some bit changes in Packaging Information
– Transformation
  • Reversible: bit changes in Content Information are reversible by an algorithm
  • Non-reversible: bit changes in Content Information are not reversible by an algorithm
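The ordering by risk lends itself to a small decision sketch: classify a planned migration by which bits it touches. The enum values follow the slide's ordering. The `classify` function and its `other_bits_changed` flag are assumptions; the flag is one reading of the difference between "no bit changes" and "no change to Packaging Information or Content Information bits".

```python
from enum import IntEnum


class Migration(IntEnum):
    """Values follow the ordering by increasing risk of information loss."""
    REFRESHMENT = 1
    REPLICATION = 2
    REPACKAGING = 3
    TRANSFORMATION = 4


def classify(content_bits_changed: bool,
             packaging_bits_changed: bool,
             other_bits_changed: bool = False) -> Migration:
    """Return the riskiest migration type implied by the changes made."""
    if content_bits_changed:
        return Migration.TRANSFORMATION
    if packaging_bits_changed:
        return Migration.REPACKAGING
    if other_bits_changed:  # assumption: bits outside packaging and content
        return Migration.REPLICATION
    return Migration.REFRESHMENT


plan = classify(content_bits_changed=False, packaging_bits_changed=True)
print(plan.name)                        # REPACKAGING
print(plan < Migration.TRANSFORMATION)  # True
```

Using `IntEnum` makes the risk ordering comparable, so a policy such as "nothing above Repackaging without review" becomes a single comparison.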
35
Related standards (General)
• International Council on Archives: Handbook No. 7, 1988: Dictionary of Archival Terminology: English and French… and 5 other languages
• Forthcoming from Research Libraries Group (RLG): certification requirements for archives
36
Metadata schemas
• For libraries: METS, MARC, MODS, MIX, TEI, EAD, ONIX
• For archives: OAI-MARC and others, but
• Many are “extensible”
• None has really caught on
37
Related standards (Representation Information)
• EAST: ISO standard for data description (syntactic description)
• Data Entity Dictionary Specification Language (DEDSL), Abstract Syntax (semantic description)
• ISO standards from Technical Committee (TC) 211: reference model for geodata
38
Related standards (Ingest)
• CCSDS, 2004: Producer-Archive Interface Methodology Abstract Standard
• Forthcoming from CCSDS: XML-based SIPs
39
Possible benefits of OAIS
• A recognized “Best Practice” for archives
• Easier to communicate internationally
• Easier to communicate with non-archive staff
40
Ideas 1
• Adapt our terminology and way of thinking to OAIS so that we can communicate better, and otherwise rely on the OAIS information concept and processes.
• Adapt our Danish conceptual framework to OAIS.
41
Ideas 2
Adjust our information concept and:
• Systematically designate the target community for new submissions
• Let the target community form the basis for the requirements for Representation Information and Preservation Description Information (“context”) in each individual case
• Group metadata according to OAIS
42
Ideas 3
Get commitment to OAIS from the whole of RA
• An OAIS archive must have “custody” of the entire archival record; this includes, e.g., the search description in DAISY
43
Ideas 4
• Review our processes against OAIS and the “Producer-Archive Interface Methodology Abstract Standard” (“PAIMAS”?) so that we can confirm that we have thought of everything.
44
Ideas 5
Consider joining an archive collaboration:
• Shared finding aids (across archives)
• Shared “Archival Storage” (the preservation part): a shared digital preservation centre
• Shared submission format and/or dissemination format