On the Effective Manipulation of Digital Objects
A Prototype-based Instantiation Approach
Kostas Saidis, George Pyrounakis, Mara Nikolaidou
Libraries Computer Center
Department of Informatics & Telecommunications
University of Athens
9th European Conference on Research and Advanced Technology for Digital Libraries (ECDL 2005)
2 / 32September 19, ECDL 2005
Outline
Motivation: The University of Athens (UoA) DL
Digital Objects (DOs)
• Encoding & Storage
• Manipulation (DL Application Logic)
• Manual Handling of DO Type variations
Digital Object Prototypes
• Automatic DO Type conformance
• Digital Object Dictionary
• A 3-tier DL Architecture
Collection Management & Scope of Prototypes
Open Issues & Future Work
The UoA DL Project
Over 1 million objects originating from 8 different collections
• Folklore notebooks, Ancient papyri, Historical archive’s folders & documents, Byzantine music manuscripts, Theatrical photos & brochures, Informatics research papers and dissertations, Medical images, Press articles
Heterogeneous & (mostly) digitized material
We are developing a Web-based DL System for all material, using FEDORA as a digital object repository
Motivation
Increase productivity (strict time limits)
• Simplify & speed up the cataloging process
• Provide effective Web-based cataloging interfaces (the cataloging personnel are not librarians)
Decrease development time (small team)
• Avoid custom coding for each content variation
• Elaborate on reusable and configurable DL modules
• Treat content variations in a unified manner
Digital Objects
A Digital Object is a human-generated artifact consisting of the digital content and related information
• Digital Content (files)
• Metadata (descriptive, administrative, etc.)
• Structure & Reference information
• Behaviors (DO-related functionality)
Encoding & Storage of DOs
Several XML-based standards support various forms of digital content & metadata (METS, FOXML, MPEG-21, RDF…)
METS
• Sections, Behaviors, Profiles
FEDORA Digital Object Model
• METS variant in version 1.x
• Fedora Object XML (FOXML) in 2.x
• Datastreams, Disseminators, Content Models
Focus on how each DO part is encoded & stored
Manipulation of DOs
In the context of DL Application Logic, DOs should be manipulated at a higher level of abstraction
Focus on the overall behavior of the DO (what the DO parts are and how they behave)
DO manipulation depends on the nature of the DO: the DO reflects the underlying “real world” object
DL Application Logic
A DL Module performs the following steps:
1. Loads the DO and its required parts
2. Parses XML and puts the data in the appropriate memory data structures
3. Performs operations on the data
4. Serializes the data in XML format
5. Saves the DO and its parts to the repository
Steps 1, 2, 4 and 5 require different implementations for each DO Type
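To make the cost of the per-type steps concrete, here is a minimal sketch of steps 2 to 4 for two DO types. Everything in it is an assumption made for illustration: the XML layouts, the function names and the sample values are hypothetical and are not the METS or FOXML encodings used in the UoA DL. Loading and saving (steps 1 and 5) are replaced by in-memory strings.

```python
import xml.etree.ElementTree as ET

# Hypothetical stored forms: each DO type lays out its XML differently.
PHOTO_XML = "<photo><dc><title>Stage left</title></dc></photo>"
ALBUM_XML = "<album><play title='Antigone'/><members>uoadl:1 uoadl:2</members></album>"


def parse_photo(xml_text):
    """Step 2 for photos: the title lives in a nested element."""
    root = ET.fromstring(xml_text)
    return {"title": root.findtext("dc/title")}


def serialize_photo(data):
    """Step 4 for photos: rebuild the nested element layout."""
    root = ET.Element("photo")
    dc = ET.SubElement(root, "dc")
    ET.SubElement(dc, "title").text = data["title"]
    return ET.tostring(root, encoding="unicode")


def parse_album(xml_text):
    """Step 2 for albums: the title is an attribute, members are a list."""
    root = ET.fromstring(xml_text)
    return {
        "title": root.find("play").get("title"),
        "members": root.findtext("members").split(),
    }


def serialize_album(data):
    """Step 4 for albums: rebuild the attribute and the member list."""
    root = ET.Element("album")
    ET.SubElement(root, "play", title=data["title"])
    ET.SubElement(root, "members").text = " ".join(data["members"])
    return ET.tostring(root, encoding="unicode")


def retitle(data, new_title):
    """Step 3: the only step that is identical for every type."""
    data["title"] = new_title
    return data


photo = retitle(parse_photo(PHOTO_XML), "Stage right")
album = retitle(parse_album(ALBUM_XML), "Medea")
```

Only `retitle` is shared. Every new DO type adds another parse/serialize pair, which is the hand-written duplication the following slides set out to remove.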
DO Types?
Do we capture, express and use DO Typing information in an effective manner?
METS “Profiles” & FEDORA “Content Models” model DO Types…
… but their goal is to be used by humans as a guide and not by the DL System as a DO type specification
We resolve DO Typing issues manually
Manual Handling of DO Types
Developers generate ad-hoc, custom & non-reusable implementations of DO types’ variations of behavior
Catalogers carry out manual XML editing at a low level of abstraction, with overly technical, complex & detailed semantics
DL modules exhibit limited evolution and configuration capabilities (due to scattered code and strong couplings & interdependencies)
Position
The DL System should resolve DO Typing issues automatically (in a manner transparent to the DL Application Logic)
An example: Theatrical Collection
What is an Album DO?
A container of photos accompanied by theatrical play metadata
What is a Photo DO?
A digital image
• stored in various formats (e.g. high quality, www quality, thumbnail)
• accompanied by the metadata required for describing the picture
How can we make Album and Photo DOs behave as such, automatically?
The OO Viewpoint
In the OO model an object is itself aware of its “nature” and behaves accordingly
Objects are conceived as instances of a type, automatically conforming to the type’s definitions & specifications
OO types are separate entities (named either classes or prototypes)
Digital Object Prototypes
A DO Prototype is a DO Type Specification, a separate entity that defines the DO’s:
• Constituent parts: metadata sets, files, structure, etc.
• Private behaviors: DO internal operations such as serializations, validations, assignment of default values, content conversions, etc.
• Public behaviors (behavior schemes): the DO external interface, consisting of high-level operations such as Detail View, Browse View, Edit View, etc.
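One way to picture such a type specification is as plain data plus named callables. The sketch below is illustrative only: the `Prototype` class, its field names, the sample behaviors and the dictionary shape used for a DO are all assumptions, not the format used by the authors' system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Prototype:
    """Hypothetical DO type specification."""
    name: str
    # Constituent parts: metadata set name -> field name -> default value.
    metadata_sets: Dict[str, Dict[str, str]]
    # Constituent parts: names of the file formats the DO carries.
    files: List[str]
    # Private behaviors: internal operations such as validation.
    private_behaviors: Dict[str, Callable[[dict], None]] = field(default_factory=dict)
    # Public behaviors: the external, high-level interface.
    public_behaviors: Dict[str, Callable[[dict], str]] = field(default_factory=dict)


def require_title(do):
    """Private behavior: reject a DO whose DC title is empty."""
    if not do["metadata"]["DC"]["title"]:
        raise ValueError("title is required")


def detail_view(do):
    """Public behavior: a one-line textual detail view."""
    return f"{do['metadata']['DC']['title']} ({len(do['files'])} formats)"


photo_prototype = Prototype(
    name="dl.theatre.photo",
    metadata_sets={"DC": {"title": "", "type": "photo"}},
    files=["high", "www", "thumbnail"],
    private_behaviors={"validate": require_title},
    public_behaviors={"detailView": detail_view},
)
```

Because the specification is data, a DL module can look up `public_behaviors["detailView"]` without knowing which type it is dealing with.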
Digital Object Instances
The process of generating a DO from a Prototype is called instantiation
The resulting object is an instance of the prototype
A DO instance automatically conforms to the Prototype’s specifications
Stored DOs vs DO instances
Digital Object Dictionary
The runtime environment in which DO instances and Prototypes operate
The DO Dictionary
• Instantiates a DO based on the prototype specifications (loads & parses XML, assigns default values, etc.)
• Exposes the public behaviors of DOs in a high-level, uniform API (for use by DL Modules)
• Saves the DO instance (serializes data structures in XML, performs validations, etc.)
3-tier DL Architecture
[Figure: tiers shown are Storage and DO Typing & Instantiation; side label: Separation of Concerns]
3-tier DL Architecture
[Figure: tiers shown are Storage, DO Typing & Instantiation, and Composition of DO behavior; side label: Separation of Concerns]
DL Application Logic Revisited
A DL Module performs the following steps:
1. Acquires the DO Instance
   do = dictionary.acquireObject("type")
   do = dictionary.acquireObject("uoadl:1024")
2. Performs operations on its data
   do.getMDSet("DC").getField("title")
   dictionary.executeBehavior(do, "editView")
3. Stores the DO in the repository
   dictionary.saveObject(do)
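The three steps above can be sketched as a toy, in-memory dictionary. The method names `acquireObject`, `getMDSet`, `getField`, `executeBehavior` and `saveObject` come from the slide. Everything else is an assumption: the class layout, the `setField` helper, the identifier counter, and the reading that `acquireObject` creates a new instance when given a type name and loads a stored one when given an identifier. A Python dict stands in for the FEDORA repository.

```python
class MDSet:
    """A metadata set: a named group of fields."""

    def __init__(self, fields):
        self._fields = dict(fields)  # copy so prototype defaults stay untouched

    def getField(self, name):
        return self._fields[name]

    def setField(self, name, value):  # hypothetical helper, not on the slide
        self._fields[name] = value


class DOInstance:
    def __init__(self, prototype_name, md_sets, identifier=None):
        self.prototype_name = prototype_name
        self.identifier = identifier
        self._md_sets = {name: MDSet(fields) for name, fields in md_sets.items()}

    def getMDSet(self, name):
        return self._md_sets[name]


class Dictionary:
    def __init__(self, prototypes):
        self._prototypes = prototypes  # type name -> {"defaults", "behaviors"}
        self._store = {}               # stands in for the repository
        self._next_id = 1024           # arbitrary starting point

    def acquireObject(self, key):
        # Assumption: a type name yields a fresh instance with default values,
        # any other key is treated as the identifier of a stored DO.
        if key in self._prototypes:
            return DOInstance(key, self._prototypes[key]["defaults"])
        type_name, md_sets = self._store[key]
        return DOInstance(type_name, md_sets, identifier=key)

    def executeBehavior(self, do, behavior):
        # Behaviors are resolved through the instance's prototype.
        return self._prototypes[do.prototype_name]["behaviors"][behavior](do)

    def saveObject(self, do):
        if do.identifier is None:
            do.identifier = f"uoadl:{self._next_id}"
            self._next_id += 1
        snapshot = {name: dict(s._fields) for name, s in do._md_sets.items()}
        self._store[do.identifier] = (do.prototype_name, snapshot)
        return do.identifier


prototypes = {
    "dl.theatre.photo": {
        "defaults": {"DC": {"title": "Untitled"}},
        "behaviors": {
            "editView": lambda do: "Edit: " + do.getMDSet("DC").getField("title"),
        },
    }
}
dictionary = Dictionary(prototypes)
do = dictionary.acquireObject("dl.theatre.photo")
do.getMDSet("DC").setField("title", "Act I")
pid = dictionary.saveObject(do)
again = dictionary.acquireObject(pid)
```

The calling module never parses or serializes anything itself; a real dictionary would do the XML work behind `acquireObject` and `saveObject`.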
Scope of Prototypes
Should we have global DO Types?
Collection-pertinent types: A DO Prototype is defined in the context of a Collection
• Support fine-grained definition of collection-specific kinds of material
Hierarchical naming scheme for types
• Theatrical Collection Photo: dl.theatre.photo
• Medical Collection Photo: dl.medical.photo
• Avoid type collisions
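A small sketch of how collection-scoped names keep two "photo" types apart. The `PrototypeRegistry` class and its methods are hypothetical, and the specs are placeholder dicts. Only the names `dl.theatre.photo` and `dl.medical.photo` come from the slide.

```python
class PrototypeRegistry:
    """Hypothetical registry keyed by hierarchical type names."""

    def __init__(self):
        self._types = {}

    def register(self, collection, local_name, spec):
        # The qualified name is the collection path plus the local type name.
        qualified = f"{collection}.{local_name}"
        if qualified in self._types:
            raise ValueError(f"type already defined: {qualified}")
        self._types[qualified] = spec
        return qualified

    def lookup(self, qualified):
        return self._types[qualified]

    def types_in(self, collection):
        # All types defined in the context of one collection.
        prefix = collection + "."
        return sorted(name for name in self._types if name.startswith(prefix))


registry = PrototypeRegistry()
registry.register("dl.theatre", "photo", {"metadata": ["DC", "play"]})
registry.register("dl.medical", "photo", {"metadata": ["DC", "diagnosis"]})
```

Both collections define a local type called `photo`, yet the two specs never collide because each lookup goes through the qualified name.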
Collection Management
DL = Hierarchy of DO instances
• Collections are also DOs, conforming to the Collection Prototype
• The DL itself is a DO, representing the “super-collection” (the collection of all the collections)
All content is modeled in a unified manner
All content can be characterized
Easily add new collections & sub-collections
Allow the DL designer to work out the details of each collection independently, yet in a uniform manner
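The hierarchy described above can be sketched as a tree in which every node, including the DL itself, is a DO. The `DO` class, the identifiers, and the type names `dl.collection` and `dl.theatre.album` are assumptions made for illustration. Only `dl.theatre.photo` appears in the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DO:
    """Hypothetical DO node: collections and items share one shape."""
    identifier: str
    prototype: str
    children: List["DO"] = field(default_factory=list)

    def add(self, child):
        self.children.append(child)
        return child

    def walk(self, depth=0):
        # Depth-first traversal that treats every level uniformly.
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# The DL is the "super-collection"; collections and items hang below it.
dl = DO("uoadl:root", "dl.collection")
theatre = dl.add(DO("uoadl:theatre", "dl.collection"))
album = theatre.add(DO("uoadl:album-1", "dl.theatre.album"))
album.add(DO("uoadl:photo-1", "dl.theatre.photo"))
```

Adding a collection or sub-collection is just another `add` call, and code that walks the tree does not need a special case for collections versus items.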
Open Issues & Future Work
OO Inheritance for DO Prototypes (e.g. the Notebook type derives from the Book type)
OO Polymorphism for DO instances (e.g. the DO “uoadl:1234” is both a Notebook & a Book)
Generalize the notion of behavior schemes & investigate relations with FEDORA behaviors
Supply general-purpose linking capabilities that exceed structural relations
Deliver on schedule…