CASPAR: Early Results and Future Goals
TRANSCRIPT
-
8/8/2019 CASPAR : Early Results and Future Goals
1/30
CODATA 2006, Beijing, China 23-25 Oct 2006 1
CASPAR: Early results and future goals
David Giaretta
-
CASPAR aims
Produce tools and techniques to support digital preservation and make it easier to share the cost
  must be relatively easy to use
  must have a low buy-in in terms of effort required for adoption
  must avoid requiring wholesale change of everyone else's systems
  must be decentralised and reproducible so that it can live on after the formal end of the CASPAR project
  must be preservable
  must be open: open source, open standards
Cannot do everything but should do something broadly useful
Working closely with the UK Digital Curation Centre
-
Digital Preservation
Easy to do, as long as you can provide money forever
Easy to test claims about tools, as long as you live a long time
-
Validation
Demonstrate theoretical basis
Accelerated lifetime tests
  Changes in hardware
  Changes in environment
  Changes in Designated Community
Demonstrate increased trustworthiness
  Measured using draft Certification Standard
-
Digital Preservation
Need to preserve information & knowledge, not just the bits
  Documents, videos are rendered: simple?
  Data must be processed: harder
Need to manage knowledge to keep archives alive through time
Preservation is a process, not a one-time event
Preservation is expensive: costs need to be shared
  The alternative is money: endless supplies of money
Open Archival Information System Reference Model (ISO 14721) provides a general conceptual framework
-
Immediate benefits of Digital Preservation: Use of Unfamiliar Data
Global Cyber-Infrastructures allow users to find and try to use data from many sources
  Some sources will be familiar
  Most available sources will be unfamiliar
How can one be sure that the unfamiliar data is used correctly?
  Garbage in, garbage out
Need to be able to deal with unfamiliar data whether it is contemporary or old (preserved)
-
OAIS Reference Model
ISO 14721: Reference Model for an Open Archival Information System (OAIS). http://public.ccsds.org/publications/archive/650x0b1.pdf
An OAIS is an archive, consisting of an organization of people and systems, that has accepted the responsibility to preserve information and make it available for a Designated Community.
Long Term Preservation: The act of maintaining information, in a correct and Independently Understandable form, over the Long Term.
Long Term is long enough to be concerned with the impacts of changing technologies, including support for new media and data formats, or with a changing user community.
Designated Community: An identified group of potential Consumers who should be able to understand a particular set of information. The Designated Community may be composed of multiple user communities.
Independently Understandable: Has sufficient documentation to allow the information to be understood and used by the Designated Community without having to resort to special resources not widely available, including named individuals.
OASIS OAIXX
-
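OAIS Information Model
[Figure: an Information Object is a Data Object interpreted using Representation Information (marked 1+), and Representation Information is itself interpreted using further Representation Information. A Data Object is either a Physical Object or a Digital Object; a Digital Object is made up of Bit Sequences (marked 1+).]
Recursion ends at the KNOWLEDGE BASE of the DESIGNATED COMMUNITY (this knowledge will change over time and region)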
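The recursive structure of the information model can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the function names and the example chain are hypothetical and do not come from OAIS or CASPAR. The recursion stops as soon as an item is already part of the Designated Community's knowledge base.

```python
from dataclasses import dataclass, field


@dataclass
class RepInfo:
    """One item of Representation Information (hypothetical class name)."""
    name: str
    # Further RepInfo needed to interpret this item.
    interpreted_using: list["RepInfo"] = field(default_factory=list)


@dataclass
class InformationObject:
    """A Data Object together with the RepInfo used to interpret it."""
    data_object: bytes
    rep_info: list[RepInfo]


def is_understandable(item: RepInfo, knowledge_base: set[str]) -> bool:
    """True if every chain starting at `item` ends inside the knowledge base."""
    if item.name in knowledge_base:
        return True  # recursion ends at the community's knowledge base
    if not item.interpreted_using:
        return False  # unknown item with nothing to explain it: a gap
    return all(is_understandable(r, knowledge_base) for r in item.interpreted_using)


def object_is_understandable(obj: InformationObject, knowledge_base: set[str]) -> bool:
    """True if all RepInfo attached to the object can be understood."""
    return all(is_understandable(r, knowledge_base) for r in obj.rep_info)


# Illustrative chain; the names are examples, not entries in a real registry.
unicode_spec = RepInfo("Unicode specification")
xml_spec = RepInfo("XML specification", [unicode_spec])
dictionary = RepInfo("data dictionary", [xml_spec])
obj = InformationObject(b"\x00\x01", [dictionary])
```

A community that already knows XML can use the object as it stands; a community that knows none of the items cannot, because the chain never reaches its knowledge base.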
-
Rep. Info. Classification
-
[Figure: Representation Information network for a FITS file. Node labels: FITS FILE, FITS STANDARD, PDF STANDARD, PDF s/w, FITS JAVA s/w, JAVA VM, FITS DICTIONARY, DICTIONARY SPECIFICATION, XML SPECIFICATION, UNICODE SPECIFICATION]
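A network like this one can be walked to work out which RepInfo an archive would have to hold for a given Designated Community. The sketch below is illustrative only: the arrows of the figure did not survive extraction, so the edges in `REP_INFO_NETWORK` are an assumed wiring of the node labels, and the function name and the example knowledge bases are hypothetical.

```python
# Assumed edges: "X: [Y, Z]" means X is interpreted using Y and Z.
# The slide's arrows were lost, so this wiring is a guess for illustration.
REP_INFO_NETWORK = {
    "FITS file": ["FITS standard", "FITS Java software", "FITS dictionary"],
    "FITS standard": ["PDF standard", "PDF software"],
    "FITS Java software": ["Java VM"],
    "FITS dictionary": ["Dictionary specification"],
    "Dictionary specification": ["XML specification"],
    "XML specification": ["Unicode specification"],
}


def rep_info_to_hold(start, network, knowledge_base):
    """Collect the RepInfo reachable from `start`, stopping at known items."""
    needed = set()
    stack = list(network.get(start, []))
    while stack:
        node = stack.pop()
        if node in needed or node in knowledge_base:
            continue  # already collected, or the community knows it already
        needed.add(node)
        stack.extend(network.get(node, []))
    return sorted(needed)


# Hypothetical knowledge bases for one community at two points in time.
today = {"PDF standard", "PDF software", "Java VM", "XML specification"}
later = {"PDF standard", "PDF software"}

needed_today = rep_info_to_hold("FITS file", REP_INFO_NETWORK, today)
needed_later = rep_info_to_hold("FITS file", REP_INFO_NETWORK, later)
```

With these assumptions the list for `later` is longer than the list for `today`: once the community no longer knows the Java VM or XML, the archive has to hold RepInfo for them as well.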
-
Representation Information
The Data Object is interpreted using the Representation Information (RepInfo)
The Reference Model is designed to ensure that an OAIS is not set the impossible task of having to provide all possible RepInfo immediately
Hence:
  Take account of the Designated Community and its associated Knowledge Base
  The amount of RepInfo is not fixed
  Additional RepInfo will be needed over time
-
Early Results
High level architecture for sharing cost and access to Representation Information
Detailed examinations of specific datasets to understand what is really needed to keep them understandable and usable
-
Rep. Info. Use and maintenance
-
CASPAR information flow architecture
[Figure: architecture diagram; recovered label: Rep Info]
-
CASPAR Testbeds
Three testbeds
  Cultural: UNESCO
  Performing Arts: INA, IRCAM
  Scientific: ESA and CCLRC
Complex, multi-source, multifaceted data
Many common preservation, evaluation and validation issues
Some specific requirements on preservation (technical, delivery, legal)
Specific user communities / Knowledge bases
Also test the OAIS model
-
Science: CCLRC example
Ionosonde data
World map of ionosondes
-
Some Issues
Difficult to derive physical quantities from data
Can be analysed in multiple ways
Raises fundamental questions about Representation Information
Common automated method is proprietary
Data structure also proprietary
Paper documentation - restricted access
Provenance and trust
-
ESA example
GOME: Global Ozone Monitoring Experiment on ERS-2
-
GOME data processing
-
GOME Level 4 product: Integration of GOME, other data and models
GOME Level 3 product: Integration of time and space data
GOME Level 2 product: Ozone profile at given location
-
Some Issues
Provenance and Context of processed data: relationship to
  Representation Information of raw data, and
  Knowledge base of Designated Community
-
UNESCO examples
DATA:
  Scanned documents and maps
  Aerial and close range photography (Digital photogrammetry)
  Monument measurements (Laser scanning)
  Satellite images (Remote sensing and image processing)
  Multi-scale digital cartography (Geographic information systems (GIS) and CAD)
  3D models, virtual tours (Computer visualization)
Mandatory Documentation:
  Identification of property
  Description of property
  Justification of inscription
  State of conservation and factors affecting the property
  Protection and Management
  Monitoring
  Documentation
  Contact information of responsible authorities
  Signature on behalf of the State Party(ies)
World Heritage List
-
Performing Arts examples
Examples:
  Score
  MAX/MSP patches
  Additional instructions
-
Some Issues
What is Preservation of performability?
Composer's intention
Authenticity
Proprietary software and hardware
Copyright
Digital Rights Management
-
Shared Infrastructure
Registries of Representation Information
Persistent Identifier name resolvers
  DOI? ARK? URL? none are guaranteed
Interfaces support preservation and interoperability
Standards
Preservation Description Information
  Fixity, Provenance, Reference, Context
Accreditation/Certification for repositories
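The four parts of Preservation Description Information named on this slide can be pictured as a small record kept alongside the data. The sketch below is a minimal illustration, not a CASPAR or OAIS interface: the class, the function names and the example values are hypothetical, and SHA-256 is an assumed choice of fixity algorithm.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PreservationDescription:
    """Hypothetical PDI record with the four parts named on the slide."""
    reference: str               # identifier for the content
    provenance: tuple[str, ...]  # history of custody and processing
    context: str                 # relationship to other information
    fixity: str                  # SHA-256 hex digest (algorithm is an assumption)


def compute_fixity(data: bytes) -> str:
    """Return a hex digest used to detect later changes to the bits."""
    return hashlib.sha256(data).hexdigest()


def describe(data, reference, provenance, context):
    """Build a PDI record for `data`, computing fixity at ingest time."""
    return PreservationDescription(
        reference=reference,
        provenance=tuple(provenance),
        context=context,
        fixity=compute_fixity(data),
    )


def verify_fixity(data, pdi):
    """True if the stored bits still match the recorded digest."""
    return compute_fixity(data) == pdi.fixity


# Example values are made up for illustration.
payload = b"example data object"
pdi = describe(
    payload,
    reference="example:0001",
    provenance=["ingested from source archive"],
    context="part of an example collection",
)
```

Fixity only shows that the bits are unchanged. Whether those bits can still be understood remains a matter for Representation Information.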
-
Knowledge at the heart of preservation
Knowledge driven approach
Knowledge management to support long-term preservation of concepts/information including:
  Single, complex, on demand, interactive objects
  DRM
  Authenticity
  Access
  Storage
  Designated Community descriptions
Knowledge base definition
  ontologies
-
WHEN
Component architecture and prototypes: by month 12
Framework architecture: month 18
Component integration: months 24-30
Testbed implementations: months 30-36
Project completion: month 42
-
www.casparpreserves.eu
-
Conclusions
Science Data and Knowledge needs more than just storing the bits
Understanding and being able to process the vast amount of unfamiliar data which is available is hard
It is expensive
  Costs must be shared
So far the Open Archival Information System Reference Model is OK
  Many similarities can be exploited
  Many subtleties need to be explored
Watch this space