On the Effective Manipulation of Digital Objects
A Prototype-based Instantiation Approach
Kostas Saidis, George Pyrounakis, Mara Nikolaidou
Libraries Computer Center, Department of Informatics & Telecommunications, University of Athens
9th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2005)


2 / 32 September 19, ECDL 2005

Outline
• Motivation – The University of Athens (UoA) DL
• Digital Objects (DOs)
  • Encoding & Storage
  • Manipulation (DL Application Logic)
  • Manual Handling of DO Type variations
• Digital Object Prototypes
  • Automatic DO Type conformance
  • Digital Object Dictionary
  • A 3-tier DL Architecture
• Collection Management & Scope of Prototypes
• Open Issues & Future Work

The UoA DL Project
• Over 1 million objects originating from 8 different collections
  • Folklore notebooks, Ancient papyri, Historical archive’s folders & documents, Byzantine music manuscripts, Theatrical photos & brochures, Informatics research papers and dissertations, Medical images, Press articles
• Heterogeneous & (mostly) digitized material
• We are developing a Web-based DL System for all material, using FEDORA as a digital object repository

Motivation
• Increase productivity (strict time limits)
  • Simplify & speed up the cataloging process
  • Provide effective Web-based cataloging interfaces (the cataloging personnel are not librarians)
• Decrease development time (small team)
  • Avoid custom coding for each content variation
  • Elaborate on reusable and configurable DL modules
  • Treat content variations in a unified manner

Digital Objects
• A Digital Object is a human-generated artifact consisting of the digital content and related information
  • Digital Content (files)
  • Metadata (descriptive, administrative, etc.)
  • Structure & Reference information
  • Behaviors (DO-related functionality)


Abstract Representation of a DO

Encoding & Storage of DOs
• Several XML-based standards support various forms of digital content & metadata (METS, FOXML, MPEG-21, RDF…)
• METS
  • Sections, Behaviors, Profiles
• FEDORA Digital Object Model
  • METS variant in version 1.x
  • Fedora Object XML (FOXML) in 2.x
  • Datastreams, Disseminators, Content Models
• Focus on how each DO part is encoded & stored

Manipulation of DOs
• In the context of DL Application Logic, DOs should be manipulated at a higher level of abstraction
• Focus on the overall behavior of the DO (what the DO parts are and how they behave)
• DO manipulation depends on the nature of the DO: the DO reflects the underlying “real world” object


2-tier DL Architecture

DL Application Logic
• A DL Module performs the following steps:
  1. Loads the DO and its required parts
  2. Parses XML and puts the data in the appropriate in-memory data structures
  3. Performs operations on the data
  4. Serializes the data in XML format
  5. Saves the DO and its parts to the repository
• Steps 1, 2, 4, 5 require different implementations for each DO Type
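As a rough illustration of why steps 2 and 4 multiply per DO type, the following Python sketch shows a 2-tier module that must branch on the type itself. The XML layouts, element names and function names are hypothetical and are not the METS or FOXML encodings; only the per-type branching is the point.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored encodings; the element names are illustrative only.
PHOTO_XML = "<photo><dc><title>act one</title></dc></photo>"
ALBUM_XML = "<album><play><name>antigone</name></play></album>"


def parse(do_type: str, xml_text: str) -> dict:
    """Step 2: a separate parsing branch is needed for every DO type."""
    root = ET.fromstring(xml_text)
    if do_type == "photo":
        return {"title": root.findtext("dc/title")}
    if do_type == "album":
        return {"title": root.findtext("play/name")}
    raise ValueError(f"no parser written for DO type {do_type!r}")


def serialize(do_type: str, data: dict) -> str:
    """Step 4: the same per-type branching repeats when writing XML."""
    if do_type == "photo":
        root = ET.Element("photo")
        ET.SubElement(ET.SubElement(root, "dc"), "title").text = data["title"]
    elif do_type == "album":
        root = ET.Element("album")
        ET.SubElement(ET.SubElement(root, "play"), "name").text = data["title"]
    else:
        raise ValueError(f"no serializer written for DO type {do_type!r}")
    return ET.tostring(root, encoding="unicode")


def capitalize_title(do_type: str, xml_text: str) -> str:
    """Step 3 is the only type-independent part of the module."""
    data = parse(do_type, xml_text)
    data["title"] = data["title"].title()
    return serialize(do_type, data)
```

Adding a new DO type to this sketch means editing both `parse` and `serialize` in every module that handles DOs, which is the duplication the slide points at.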

DO Types?
• Do we capture, express and use DO Typing information in an effective manner?
• METS “Profiles” & FEDORA “Content Models” model DO Types…
• … but their goal is to be used by humans as a guide and not by the DL System as a DO type specification
• We resolve DO Typing issues manually

Manual Handling of DO Types
• Developers generate ad-hoc, custom, non-reusable implementations of the DO types’ variations in behavior
• Catalogers carry out manual XML editing at a low level of abstraction, with overly technical, complex and over-detailed semantics
• DL modules exhibit limited evolution and configuration capabilities (due to scattered code and strong couplings & interdependencies)

Position
The DL System should resolve DO Typing issues automatically (in a manner transparent to the DL Application Logic)

An example – Theatrical Collection
• What is an Album DO?
  • A container of photos accompanied by theatrical play metadata
• What is a Photo DO?
  • A digital image
    • stored in various formats (e.g. high quality, www quality, thumbnail)
    • accompanied by the metadata required for describing the picture
• How can we make Album and Photo DOs behave as such, automatically?


By Drawing on the notions of OO

The OO Viewpoint
• In the OO model an object is itself aware of its “nature” and behaves accordingly
• Objects are conceived as instances of a type, automatically conforming to the type’s definitions & specifications
• OO types are separate entities (named either classes or prototypes)

Digital Object Prototypes
• A DO Prototype is a DO Type Specification, a separate entity that defines the DO’s:
  • Constituent parts: metadata sets, files, structure, etc.
  • Private behaviors: DO internal operations such as serializations, validations, assignment of default values, content conversions, etc.
  • Public behaviors (behavior schemes): the DO external interface, consisting of high-level operations such as Detail View, Browse View, Edit View, etc.
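One way to picture such a type specification is as a plain data structure. The Python sketch below is an assumption for illustration only: the slides do not show how prototypes are actually encoded, and the class name `Prototype`, its field names and the `detail_view` function are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Prototype:
    """Hypothetical in-memory form of a DO type specification."""
    name: str
    # Constituent parts: metadata set name -> field names, and file variants.
    metadata_sets: Dict[str, List[str]]
    file_formats: List[str]
    # Private behavior (one example): default values assigned on instantiation.
    defaults: Dict[str, str] = field(default_factory=dict)
    # Public behaviors: behavior name -> operation on a DO instance.
    behaviors: Dict[str, Callable[[dict], str]] = field(default_factory=dict)


def detail_view(do: dict) -> str:
    """A toy public behavior: render the DO's title for display."""
    return f"Photo: {do['DC']['title']}"


PHOTO = Prototype(
    name="photo",
    metadata_sets={"DC": ["title", "creator"]},
    file_formats=["high", "web", "thumbnail"],
    defaults={"creator": "unknown"},
    behaviors={"detailView": detail_view},
)
```

Because the specification is data rather than code inside a module, a module can look up `PHOTO.behaviors["detailView"]` without knowing anything photo-specific.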


OO Encapsulation


Photo Prototype & Instances

Digital Object Instances
• The process of generating a DO from a Prototype is called instantiation
• The resulting object is an instance of the prototype
• A DO instance automatically conforms to the Prototype’s specifications
• Stored DOs vs DO instances
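A minimal Python sketch of instantiation, under the assumption that a prototype lists the fields of each metadata set together with default values. The dictionary layout and the `instantiate` function are hypothetical; they only illustrate how a stored DO (raw values) differs from a DO instance (values shaped by the prototype).

```python
from typing import Dict, Optional

# Hypothetical prototype: fields per metadata set plus default values.
PHOTO_PROTOTYPE = {
    "type": "photo",
    "fields": {"DC": ["title", "creator"]},
    "defaults": {"DC": {"creator": "unknown"}},
}


def instantiate(prototype: dict,
                stored: Optional[Dict[str, Dict[str, str]]] = None) -> dict:
    """Build a DO instance that carries every field the prototype specifies."""
    stored = stored or {}
    instance = {"type": prototype["type"]}
    for set_name, field_names in prototype["fields"].items():
        defaults = prototype["defaults"].get(set_name, {})
        values = stored.get(set_name, {})
        unknown = set(values) - set(field_names)
        if unknown:
            raise ValueError(f"fields not in prototype: {sorted(unknown)}")
        # Stored values win; otherwise fall back to the default, then to empty.
        instance[set_name] = {
            name: values.get(name, defaults.get(name, "")) for name in field_names
        }
    return instance
```

Calling `instantiate(PHOTO_PROTOTYPE)` yields a new, empty photo with its defaults filled in, while passing stored values yields an instance of an existing DO.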


3-tier DL Architecture

Digital Object Dictionary
• The runtime environment in which DO instances and Prototypes operate
• The DO Dictionary
  • Instantiates a DO based on the prototype specifications (loads & parses XML, assigns default values, etc.)
  • Exposes the public behaviors of DOs through a high-level, uniform API (for use by DL Modules)
  • Saves the DO instance (serializes data structures to XML, performs validations, etc.)
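The three responsibilities above can be sketched as a small Python class. This is a toy under stated assumptions, not the authors' implementation: a plain dict stands in for the repository, XML handling is omitted, and the method names `acquire_object`, `execute_behavior` and `save_object`, the `demo:` id prefix and the prototype layout are all hypothetical.

```python
import itertools
from typing import Callable, Dict


class Dictionary:
    """Toy DO Dictionary: a dict stands in for the repository (an assumption)."""

    def __init__(self, prototypes: Dict[str, dict]):
        self.prototypes = prototypes
        self.repository: Dict[str, dict] = {}
        self._ids = itertools.count(1)

    def acquire_object(self, key: str) -> dict:
        """Return a stored DO by id, or a fresh instance if key names a type."""
        if key in self.repository:
            stored = self.repository[key]
            return {**stored, "fields": dict(stored["fields"])}  # work on a copy
        proto = self.prototypes[key]
        return {"id": None, "type": key, "fields": dict(proto["defaults"])}

    def execute_behavior(self, do: dict, name: str) -> str:
        """Dispatch a public behavior through the DO's prototype."""
        behavior: Callable[[dict], str] = self.prototypes[do["type"]]["behaviors"][name]
        return behavior(do)

    def save_object(self, do: dict) -> str:
        """Validate against the prototype, assign an id if needed, and store."""
        required = self.prototypes[do["type"]]["required"]
        missing = [f for f in required if not do["fields"].get(f)]
        if missing:
            raise ValueError(f"missing required fields: {missing}")
        if do["id"] is None:
            do["id"] = f"demo:{next(self._ids)}"
        self.repository[do["id"]] = do
        return do["id"]


PROTOTYPES = {
    "photo": {
        "defaults": {"title": "", "creator": "unknown"},
        "required": ["title"],
        "behaviors": {
            "detailView": lambda do: f"{do['fields']['title']} ({do['fields']['creator']})"
        },
    }
}
```

A module using this sketch never branches on the DO type: it acquires, operates and saves, and the prototype supplies the defaults, validation and behavior.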

3-tier DL Architecture: Separation of Concerns
• Storage
• DO Typing & Instantiation
• Composition of DO behavior

DL Application Logic Revisited
• A DL Module performs the following steps:
  1. Acquires the DO instance
       do = dictionary.acquireObject("type")
       do = dictionary.acquireObject("uoadl:1024")
  2. Performs operations on its data
       do.getMDSet("DC").getField("title")
       dictionary.executeBehavior(do, "editView")
  3. Stores the DO in the repository
       dictionary.saveObject(do)

Scope of Prototypes
• Should we have global DO Types?
• Collection-pertinent types: A DO Prototype is defined in the context of a Collection
  • Support fine-grained definition of collection-specific kinds of material
• Hierarchical naming scheme for types
  • Theatrical Collection Photo: dl.theatre.photo
  • Medical Collection Photo: dl.medical.photo
  • Avoid type collisions
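The naming scheme can be sketched as a registry keyed by the full hierarchical name. In this Python sketch the class `PrototypeRegistry`, its methods and the metadata field names are hypothetical; only the two type names `dl.theatre.photo` and `dl.medical.photo` come from the slide.

```python
from typing import Dict, List


class PrototypeRegistry:
    """Maps hierarchical type names such as 'dl.theatre.photo' to prototypes."""

    def __init__(self) -> None:
        self._prototypes: Dict[str, dict] = {}

    def register(self, collection: str, local_name: str, prototype: dict) -> str:
        """Register a prototype under '<collection>.<local_name>'."""
        full_name = f"{collection}.{local_name}"
        if full_name in self._prototypes:
            raise KeyError(f"type already defined: {full_name}")
        self._prototypes[full_name] = prototype
        return full_name

    def resolve(self, full_name: str) -> dict:
        """Look up a prototype by its full hierarchical name."""
        return self._prototypes[full_name]

    def types_in(self, collection: str) -> List[str]:
        """List the type names defined anywhere under a collection."""
        prefix = collection + "."
        return sorted(n for n in self._prototypes if n.startswith(prefix))


registry = PrototypeRegistry()
# Both collections define a "photo" type; the metadata fields are made up.
registry.register("dl.theatre", "photo", {"metadata": ["play", "actors"]})
registry.register("dl.medical", "photo", {"metadata": ["modality", "organ"]})
```

Two collections can each define their own `photo` without colliding, because the collection path is part of the type name.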


Album Prototype & Instances

Collection Management
• DL = Hierarchy of DO instances
  • Collections are also DOs, conforming to the Collection Prototype
  • The DL itself is a DO, representing the “super-collection” (the collection of all the collections)
• All content is modeled in a unified manner
• All content can be characterized
• Easily add new collections & sub-collections
• Allow the DL designer to work out the details of each collection independently, yet in a uniform manner
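A brief Python sketch of the "everything is a DO" hierarchy, assuming a simple parent/children tree. The `DO` class, the node names and the type name `dl.theatre.album` are hypothetical; the slides name only the Collection Prototype and the photo types.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class DO:
    """A node in the DL hierarchy; collections are DOs that have children."""
    name: str
    type_name: str
    children: List["DO"] = field(default_factory=list)

    def add(self, child: "DO") -> "DO":
        """Attach a child DO and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def walk(self, path: str = "") -> Iterator[str]:
        """Yield the slash-separated path of this DO and all its descendants."""
        here = f"{path}/{self.name}" if path else self.name
        yield here
        for child in self.children:
            yield from child.walk(here)


dl = DO("dl", "collection")  # the "super-collection"
theatre = dl.add(DO("theatre", "collection"))
album = theatre.add(DO("album-1", "dl.theatre.album"))
album.add(DO("photo-1", "dl.theatre.photo"))
dl.add(DO("medical", "collection"))
```

Since the DL, its collections and their items share one node type, adding a sub-collection is the same operation as adding any other DO.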


DL as a Hierarchy of DO instances

Open Issues & Future Work
• OO Inheritance for DO Prototypes (e.g. the Notebook type derives from the Book type)
• OO Polymorphism for DO instances (e.g. the DO “uoadl:1234” is both a Notebook & a Book)
• Generalize the notion of behavior schemes & investigate relations with FEDORA behaviors
• Supply general-purpose linking capabilities that exceed structural relations
• Deliver on schedule…