ETANA-ADD: An Interactive Tool for Integrating
Archaeological DL Collections
JCDL 2006, Chapel Hill, NC, June 13, 2006
Naga Srinivas Vemuri, Rao Shen, Sameer Tupe, Weiguo Fan,
Edward A. Fox
http://fox.cs.vt.edu
Acknowledgements (Selected)
• Sponsors: NSF grant ITR-0325579, ASOR, CWRU, ETANA, Vanderbilt U., Virginia Tech
• VT Students: Vidhya Vijayaraghavan, other DLRL members
• Others: Umm el-Jimal Dig Team
Acknowledgements (Selected, Cont.)
• Karen Borstad, MPP
• Giorgio Buccellati, UCLA
• Douglas Clark, Walla Walla College
• Joanne Eustis, CWRU
• Nick Fischio, CWRU
• Israel Finkelstein, Tel-Aviv University
• Paul Gherman, Vanderbilt U.
• Andrew Graham, U. Toronto
• Tim Harrison, U. Toronto
• Larry Herr, Canadian University College
• Christopher Holland, LRP
• Paul Jacobs, Mississippi State U.
• Douglas Knight, Vanderbilt U.
• Stan LaBianca, Andrews U.
• David McCreery, Willamette U.
• Eric Meyers, Duke U.
• Adam Porter, Illinois College
• Jack Sasson, Vanderbilt U.
• Tom Schaub, Indiana U. of Penn.
• Randall Younker, Andrews U.
Introduction
ETANA, ETANA-DL, 5S
What are the issues involved in integrating new collections into a DL, with evolving metadata schema (i.e., bottom up schema evolution)?
Can we partially automate the process of integrating new collections in such situations?
ETANA-DL
Heterogeneity: 8 archaeological sites, 13 different artifact types
Example artifact types: Bone, Burial, Figurine, Locus, Pottery, Seed, etc.
Union services: Multidimensional Browsing, Searching, Recommendation, Annotation, etc.
ETANA-DL (Cont.)
Individual (archaeological) site approach
• Local conventions for metadata
• Custom-built services
ETANA-DL
• Provides union services across sites
• A global schema based on an incremental approach
The Mapping Process in ETANA
Mapping process: The global schema defines the collections (metadata) in the system using an incremental approach.
Adding a new artifact collection:
• If the artifact type is already defined, perform the mapping.
• If the artifact type is not defined, extend the global schema and then perform the mapping.
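The two cases above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not the ETANA-ADD implementation: the function name `integrate_collection`, the dictionary-of-sets representation of the global schema, and the sample field names are all assumptions made for the example.

```python
def integrate_collection(global_schema, artifact_type, local_to_global):
    """Map a local collection onto the global schema (illustrative only).

    global_schema:   dict of artifact type -> set of global element names
    local_to_global: dict of local field name -> proposed global element name
    Returns (mapping, unmapped_local_fields, schema_was_extended).
    """
    # Case 2: unknown artifact type, so extend the global schema first.
    extended = artifact_type not in global_schema
    if extended:
        global_schema[artifact_type] = set(local_to_global.values())

    # Both cases: perform the mapping against the (possibly extended) schema.
    known = global_schema[artifact_type]
    mapping = {loc: glob for loc, glob in local_to_global.items() if glob in known}
    unmapped = sorted(loc for loc in local_to_global if loc not in mapping)
    return mapping, unmapped, extended


# Hypothetical data: Pottery is already defined, Burial is new.
schema = {"Pottery": {"ware", "period"}}
pottery = integrate_collection(schema, "Pottery", {"WARE_TYPE": "ware", "GLAZE": "glaze"})
burial = integrate_collection(schema, "Burial", {"SKEL_POS": "position"})
```

In this sketch, local fields with no global counterpart in an already defined type are only reported back; a real tool would presumably let the user decide how to handle them.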
The Whole Integration Process
Conversion process: custom DB to XML format
• Needs to identify metadata elements in the DB
• Results in local XML data and a local XML schema
Mapping process
• Needs to perform schema mapping
• Results in global schema extension and the evolution of a new global collection
Integration process
• New site is "published" as an OAI provider
• OAI harvesting results in integration of the new collection
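As a rough illustration of the conversion step, the sketch below dumps one database table to XML with Python's standard library. It is not the DB2XML component itself: the in-memory SQLite database, the `burial` table, its columns, and the `table_to_xml` helper are all hypothetical, and column names are assumed to be valid XML element names.

```python
import sqlite3
import xml.etree.ElementTree as ET


def table_to_xml(conn, table, record_tag="record"):
    """Return an XML tree with one <record_tag> element per row of `table`."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    # The column names play the role of the identified metadata elements.
    columns = [desc[0] for desc in cur.description]
    root = ET.Element(table)
    for row in cur:
        rec = ET.SubElement(root, record_tag)
        for name, value in zip(columns, row):
            if value is not None:  # omit empty fields from the local XML data
                ET.SubElement(rec, name).text = str(value)
    return root


def demo():
    """Build a tiny hypothetical site database and convert it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE burial (id INTEGER, position TEXT, depth_cm REAL)")
    conn.execute("INSERT INTO burial VALUES (1, 'flexed', 120.5)")
    conn.execute("INSERT INTO burial VALUES (2, NULL, 98.0)")
    return table_to_xml(conn, "burial")


root = demo()
xml_text = ET.tostring(root, encoding="unicode")
```

The list of column names gathered here is the kind of information a local XML schema would be derived from; generating that schema is left out of the sketch.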
The Integration Problem
Problem: The integration process requires both technical and domain expertise.
Proposal: Partially automate the process to minimize the need for technical skills
Solution: ETANA-ADD Tool
Related Work
Gatherer: A tool used in Greenstone for adding new collections
• Tightly coupled with Greenstone
• Its ability to handle an evolving schema and its content is not known
Database to OAI Provider: OAICat and OAI PMH2 Perl
• Doesn't accommodate the mapping process
Related Work (Cont.)
OCHRE proposed archaeoML to define DL collections
• Doesn't automate the integration process
• Ability to handle heterogeneous data is not known
Altova MapForce
• Doesn't support incremental mapping
ETANA-ADD Tool
An interactive tool for end users
• Partially automates the integration process
• Minimizes the need for technical skills
• Reuses existing tools to some extent, by providing an easy GUI wrapper on top of them
ETANA-ADD Tool (Cont.)
The process flow when using the tool:
DB2XML -> Schema Mapper -> OAI XML File Data Provider
An Integration Scenario
Adding a burial artifacts collection to ETANA-DL
• Perform the DB2XML process using ETANA-ADD
• Perform schema mapping
• Publish the burial collection as an OAI data provider
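Once the burial collection is published as an OAI data provider, a harvester retrieves it with standard OAI-PMH requests. The sketch below only builds a `ListRecords` request URL; the base URL and the `burial` set name are placeholders, not the actual ETANA endpoints, and the helper name `list_records_url` is an assumption for this example.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build an OAI-PMH ListRecords request URL for a data provider."""
    # `verb` and `metadataPrefix` are required by OAI-PMH; `set` is optional.
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and set name, for illustration only.
url = list_records_url("http://example.org/oai", set_spec="burial")
```

Fetching the URL and following resumption tokens are omitted here; they would be the harvester's job.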
Results
Integrated the Umm el-Jimal site with the help of ETANA-ADD
• Collections: Bone, Burial, Locus, Miscellaneous Artifact, Pottery, Pottery Bucket
• No additional code written
A comparison with an earlier integrated site, Megiddo (7 artifact collections)
Conclusions
Target users: Administrators handling archaeology data (to be invited in fall for usability studies)
Developed ETANA-ADD to minimize technical expertise in integrating new archaeological collections
Willing to share our software, which may be applicable to other domains with similar problems (i.e., with evolving global schema)
References
Bainbridge, D., Thompson, J., and Witten, I. H. Assembling and enriching digital library collections. In Proc. JCDL 2003: 323-334.
Raghavan, A., Vemuri, N. S., Shen, R., Gonçalves, M.A., Fan, W. and Fox, E.A. Incremental, Semi-automatic, Mapping-Based Integration of Heterogeneous Collections into Archaeological Digital Libraries: Megiddo Case Study. In Proc. ECDL 2005: 139-150.
Ravindranathan, U., Shen, R., Gonçalves, M.A., Fan, W., Fox, E.A., and Flanagan, J.W. ETANA-DL: a digital library for integrating heterogeneous archaeological data. In Proc. JCDL 2004: 76-77.
Suleman, H. Open Digital Libraries, Ph.D. Dissertation, Dept. Comp. Sci., Virginia Tech, http://scholar.lib.vt.edu/theses/available/etd-11222002-155624, 2002.