roda: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation...

86
RODA: digital preservation for the portuguese public administration José Carlos Ramalho [email protected] Miguel Ferreira [email protected] Luis Faria [email protected] Rui Castro [email protected] Francisco Barbedo [email protected] Cecília Henriques ch [email protected] Glória Santos gloria @iantt.pt Luis Corujo lcorujo @iantt.pt 1

Upload: others

Post on 05-Jun-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 2: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Context

Digitarq (2003-now)• metadata management (EAD based)• digital object management (NISO MIX)

RODA (2006-2008)• metadata management (EAD based)• digital object management (...)• digital preservation protocols and policies

CRAV: Readers Virtual Room (2006-2007)• request management• document workflow

2

Page 3: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Context

Digitarq (2003-now)• metadata management (EAD based)• digital object management (NISO MIX)

RODA (2006-2008)• metadata management (EAD based)• digital object management (...)• digital preservation protocols and policies

CRAV: Readers Virtual Room (2006-2007)• request management• document workflow

Partners/Contracters:

• National Directory Board of Archives

• Photography National Archive

• Oporto’s county Archive

• Some city hall archives (can grow exponencially)

2

Page 4: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

RODA: Motivation

• Today History is being made in the digital world;• Digital Object production grows everyday;• There are no structures to support incorporation,

management and long-term preservation of digital objects;

• We have to preserve the digital memory, heritage and testimonials of public organizations. • Example: SGU work

3

Page 5: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Some Requisites/Questions?

• How do we achieve Authenticity?

• How do we describe and classify DO?

• How can we implement digital preservation?

4

Page 6: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Authenticity

“O Codex 632” by José Rodrigues dos Santos

Subject: Who really was Cristophoros Colombus?

Was he italian? Spanish? Or a portuguese belonging to a jewish family?

5

Page 7: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Authenticity

We must trust our sources: in ancient History there are no direct speech or evidence.

EX: the bible

6

Page 8: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Authenticity

We must trust our sources: in ancient History there are no direct speech or evidence.

EX: the bible

How do we become trustful?

6

Page 9: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Authenticity

We must trust our sources: in ancient History there are no direct speech or evidence.

EX: the bible

How do we become trustful?

• Reputation

• Documenting every action taken upon DOs

6

Page 10: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Digital Object Classes

7

Page 11: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

8

DO Anatomy

DatabaseText Doc.Still Image

SQL Server Hard DiscAccess

PDF Doc.

PNG image

Ms Word Doc.

Tape

...

Conceptuallevel

Logicallevel

Physicallevel

8

Page 12: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

8

DO Anatomy

DatabaseText Doc.Still Image

SQL Server Hard DiscAccess

PDF Doc.

PNG image

Ms Word Doc.

Tape

...

Conceptuallevel

Logicallevel

Physicallevel

If one of these levels becomes obsolete we

loose access to the DO

8

Page 13: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

DO Preservation Strategies• Focusing the physical/logical object

o Centered in preserving information in her logical format or/and physical support

o Uses original technology associated to these objects to ensure the access to them

o Technology preservation

• Focusing the conceptual object

o Centered in preserving the object core properties in a way that is independent from hardware and software

o Conceptual object preservation9

Page 14: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Emulation

10

Page 15: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Emulator: application capable of reproducing the behaviour of an hardware/software platform. Ex: ZX Spectrum, GBA, ...

Emulation

10

Page 16: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Emulator: application capable of reproducing the behaviour of an hardware/software platform. Ex: ZX Spectrum, GBA, ...

Emulation

• Advantageso Original technological context recriation o Object’s look & feel preservation

• Disadvantageso Emulators also become obsoleteo Users have to operate obsolete systemso Creating emulators is a complex task o Copyright problemso To preserve a complete operating system to be able to visualize a

single document may be overwhelmingo Information reuse in not guaranteed

10

Page 17: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Encapsulation

11

Page 18: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Preserving the original bit stream together with enough metadata capable of ensuring its future interpretation and access

Encapsulation

11

Page 19: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Preserving the original bit stream together with enough metadata capable of ensuring its future interpretation and access

Encapsulation

• Advantageso It allows the postponement of preservation

responsibilitieso Targeted for objects that will be accessed in a far futureo Emulator and visualizer developement is delayed

• Disadvantageso Complex objects have complex specificationso An incomplete specification can have nasty effects

11

Page 20: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Conceptual object preservation

Migration: periodic DO transfer from one hw/sw configuration into an updated one (centered in preserving significant properties other then preserving the original bit stream).

Advantages– DO are disseminated in formats known to users– No need to preserve the original hw/sw platform– Most used strategy and the only that has worked so far

Disadvantages– Possible loss of information during conversion– Continued maintenance is needed – In the longterm perspective costs are high

12

Page 21: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Conceptual object preservation

Migration: periodic DO transfer from one hw/sw configuration into an updated one (centered in preserving significant properties other then preserving the original bit stream).

Advantages– DO are disseminated in formats known to users– No need to preserve the original hw/sw platform– Most used strategy and the only that has worked so far

Disadvantages– Possible loss of information during conversion– Continued maintenance is needed – In the longterm perspective costs are high

What are the significant properties?

12

Page 22: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Preservation Services

t1

t4

t2t3

t5

t6 t7

<.9, .8, .95, .1>

<.5, .3, .95, .6>

<.5, .3, .95, 1>

<.9, .6, .9, .7>

<.3, .6, .9

5, .1>

<.7, .5, .65, .1>

<.9, .8, .6, .1>13

Page 23: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Preservation Services

t1

t4

t2t3

t5

t6 t7

<.9, .8, .95, .1>

<.5, .3, .95, .6>

<.5, .3, .95, 1>

<.9, .6, .9, .7>

<.3, .6, .9

5, .1>

<.7, .5, .65, .1>

<.9, .8, .6, .1>

CRiB project: http://crib.dsi.uminho.pt

13

Page 24: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Open Archival Information System

ISO 14721

14

Page 25: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

OAIS (Functional Components)

•Ingestion• Reception, validation, transformation/

normalization, description of the whole package submitted by the producer

•Storage• Ensures information preservation at physical/

logical level (e.g. refreshing, migration, integrity checks, disaster recovery, etc.)

•Metadata management• Responsible for the management of stored DOs

15

Page 26: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

OAIS (Information Packages

• Submission Information Package (SIP)✴ Digital Object✴ Metadata created by producer

‣ too open...• Archival Information Package (AIP)

✴ Digital Object to be stored✴ Metadata: enough to ensure DO’s preservation

and access‣ model defined by PREMIS

• Dissemination Information Package (DIP)• DO transformed into the format that will be

delivered to the consumer• Metadata

16

Page 27: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Ingestion

17

Page 28: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Ingestion

Submission Contract• SIP specification• Ingestion workflow specification

17

Page 29: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (example)

one still image

18

Page 30: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (example)

one still image

criation properties:

- date- hardware- ...

18

Page 31: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (example)

one still image

criation properties:

- date- hardware- ...

Technical Metadata:- color- dimensions- ...

18

Page 32: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (example)

one still image

criation properties:

- date- hardware- ...

Technical Metadata:- color- dimensions- ...

Descriptive Metadata:- producer

- colection- ...

18

Page 33: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (example)

one still image

criation properties:

- date- hardware- ...

Technical Metadata:- color- dimensions- ...

Descriptive Metadata:- producer

- colection- ...

Manifest

18

Page 34: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (example)

Com

pres

sed

Fileone still image

criation properties:

- date- hardware- ...

Technical Metadata:- color- dimensions- ...

Descriptive Metadata:- producer

- colection- ...

Manifest

18

Page 35: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (+complex)

19

Page 36: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (+complex)

1001001011010100100101

19

Page 37: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (+complex)

1001001011010100100101

DO = Image+

19

Page 38: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (+complex)

1001001011010100100101

DO = Image+ Properties

19

Page 39: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (+complex)

1001001011010100100101

DO = Image+ Properties Technical Metadata

19

Page 40: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (+complex)

1001001011010100100101

DO = Image+ Properties Technical MetadataDescriptive Metadata

19

Page 41: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (+complex)

1001001011010100100101

DO = Image+ Properties Technical MetadataDescriptive Metadata

Manifest

19

Page 42: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Structure (+complex)

1001001011010100100101

DO = Image+ Properties Technical MetadataDescriptive Metadata

Manifest

Compressed File

19

Page 43: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP AIP

• integrity check• virus check• generation of preservation metada (PREMIS)• conversion to a normalized format

• generation of technical metadata• generation of preservation metadata (PREMIS)

Ingestion Workflow

20

Page 44: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

AIP Storage

<EAD>

<PREMIS> <PREMIS> Metadata

DOs

21

Page 45: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Normalization

22

Page 46: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Data Model

23

Page 47: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Stages

• Analysis and Planning

• Prototyping

• Testing and Dissemination

24

Page 48: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Planning and Analysis

25

Page 49: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Requisites• Graphical Interface for Ingestion process

• Producer registry

• SIP production tool

• Ingestion feedback

• Partial Ingestion

• “Quarantine” zone: cache, ingestion buffer

• SIP validation

• Error reporting

• Persistent identifiers

• PREMIS event generation

• DIP digital signature

• ...

26

Page 50: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Development framework

27

Page 51: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Requisites based comparaison

IngestionAIP

Management

Dissemination

28

Page 52: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Matching data models

DSpace

29

Page 53: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Matching data models

Fedora

30

Page 54: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Roda Data Model

31

Page 55: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Roda Data Model

Description Objects

31

Page 56: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Roda Data Model

Description Objects

Representation Objects

31

Page 57: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Roda Data Model

Description Objects

Representation Objects

Preservation Objects

31

Page 58: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Architecture

32

Page 59: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

RODA Schemas

33

Page 60: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Prototyping

34

Page 61: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

8

DatabaseText Doc.Still Image

SQL Server Hard DiscAccess

PDF Doc.

PNG image

Ms Word Doc.

Tape

...

Conceptuallevel

Logicallevel

Physicallevel

Preserving Conceptual Object

35

Page 62: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Text Documents and Still Images

36

Page 63: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Text Documents and Still Images

• EAD elements capture most of the significant properties: provenance, producer history, context, ...

• Content is kept in a normalized format: PDF and uncompressed TIFF.

36

Page 64: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Text Documents and Still Images

• EAD elements capture most of the significant properties: provenance, producer history, context, ...

• Content is kept in a normalized format: PDF and uncompressed TIFF.

<EAD>

36

Page 65: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Text Documents and Still Images

• EAD elements capture most of the significant properties: provenance, producer history, context, ...

• Content is kept in a normalized format: PDF and uncompressed TIFF.

<EAD>

<EAD>

36

Page 66: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Databases

• Data?

• Structure?

• Views?

• Reports?

• Stored Procedures?

• ...

37

Page 67: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Databases

• Data?

• Structure?

• Views?

• Reports?

• Stored Procedures?

• ...

First prototype:

• Data

• Structure

37

Page 68: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP Builder

38

Page 69: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

DBML

• Platform and RDBMS independent

• Stores the DB structure and information

• BLOBs are exported and preserved as standalone files in the representation

• Transformations to SQL and back are defined

39

Page 70: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

DBML

• Platform and RDBMS independent

• Stores the DB structure and information

• BLOBs are exported and preserved as standalone files in the representation

• Transformations to SQL and back are defined

<TABLE NAME="Districts"> <COLUMNS> <COLUMN NAME="code" TYPE="int" NULL="no"/> ... </COLUMNS> <KEYS> <PKEY TYPE="simple"> <FIELD NAME=""/> </PKEY> <PKEY TYPE="compound"> <FIELD NAME=""/> <FIELD NAME=""/> </PKEY> <KEY NAME="" REF=""/> ... </KEYS> </TABLE>

39

Page 71: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

DBML

• Platform and RDBMS independent

• Stores the DB structure and information

• BLOBs are exported and preserved as standalone files in the representation

• Transformations to SQL and back are defined

<TABLE NAME="Districts"> <COLUMNS> <COLUMN NAME="code" TYPE="int" NULL="no"/> ... </COLUMNS> <KEYS> <PKEY TYPE="simple"> <FIELD NAME=""/> </PKEY> <PKEY TYPE="compound"> <FIELD NAME=""/> <FIELD NAME=""/> </PKEY> <KEY NAME="" REF=""/> ... </KEYS> </TABLE>

... <DATA> <products> <products-REG> <code> a122 </code> <description> milk </description> ... </products-REG> <products-REG> ... </products-REG> </products> ... </DATA>...

39

Page 72: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

DBML

• Platform and RDBMS independent

• Stores the DB structure and information

• BLOBs are exported and preserved as standalone files in the representation

• Transformations to SQL and back are defined

<TABLE NAME="Districts"> <COLUMNS> <COLUMN NAME="code" TYPE="int" NULL="no"/> ... </COLUMNS> <KEYS> <PKEY TYPE="simple"> <FIELD NAME=""/> </PKEY> <PKEY TYPE="compound"> <FIELD NAME=""/> <FIELD NAME=""/> </PKEY> <KEY NAME="" REF=""/> ... </KEYS> </TABLE>

... <DATA> <products> <products-REG> <code> a122 </code> <description> milk </description> ... </products-REG> <products-REG> ... </products-REG> </products> ... </DATA>...

DB SIP composition:

• METS file for packaging and organizing

• EAD file describing intellectual properties

• DBML file(s)

• DO for each found BLOB

• METS file + MIX for each DO

39

Page 73: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

SIP -> AIP

• Check and validation ...

• Generate SQL file

• Generate PREMIS

40

Page 74: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Dissemination

• Abstract Database Creation: a database of databases... Ingests databases from DBML (DBML-->SQLadb);

• Specific Database Creation: execute the SQL file in the selected RDMS

41

Page 75: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Dissemination

42

Page 76: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

DB Abstract Schema

Dat

a

Structure

43

Page 77: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Browser

44

Page 78: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Browser

44

Page 79: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Browser

44

Page 80: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Browser

44

Page 81: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Search Engine

45

Page 82: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Final thoughts

“Data Preservation is a people problem”Michael Lesk

46

Page 83: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Final thoughts

“Data Preservation is a people problem”Michael Lesk

• People need to be trained to save data in a proper way.• What to preserve? Data, Structure, Semantics...• Preservation is for future users but only today users vote on budget• We need to make data collecting people have preservation concerns• Preservation is fault tolerance. All systems are imperfect

46

Page 84: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Look and see how our brothers are working to transfer all our writings into CDROM format.

47

Page 85: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

RODA Homepage

48

Page 86: RODA: digital preservation for the portuguese public ... · 23/11/2007  · digital preservation for the portuguese public administration José Carlos Ramalho jcr@di.uminho.pt Miguel

Let’s Preserve Tomorrow’s History...

Questions?

49