JHOVE2 Needs Assessment and Functional Requirements

Stephen Abrams, California Digital Library; Evan Owens, Portico; Tom Cramer, Stanford University

Digital Library Federation Fall Forum, Providence, November 12-14, 2008

Page 1

Digital Library Federation Fall Forum

Providence, November 12-14, 2008

JHOVE2 Needs Assessment and Functional Requirements

Stephen Abrams, California Digital Library

Evan Owens, Portico

Tom Cramer, Stanford University

Page 2

Agenda

• Project goals, deliverables, and schedule

• New terminology and concepts

• Functional requirements

Page 3

JH VE2 project

• A next-generation architecture for format-aware object characterization

– Three-fold goals:

• Re-factor the existing architecture to achieve higher performance, simplify system integration, and encourage third-party enhancement

• Provide significant new function

• Implement modules

• Collaborative project of CDL, Portico, and Stanford University

– Funded by Library of Congress/NDIIPP

– Open source BSD license

Page 4

New function

• Complex object data model

• Generic plug-in interface

• Common data structure passed between modules to enable stateful processing

• Identification de-coupled from validation

• Standardized handling of format profiles and error reporting

• Symbolic display of binary formats

• API-level support for editing
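
As an illustration of three items on this slide (a generic plug-in interface, a common data structure passed between modules, and identification de-coupled from validation), here is a minimal Python sketch. It is not the JHOVE2 API: the names `Source`, `Module`, `Identifier`, `PdfValidator`, and `characterize` are hypothetical, and the PDF checks are toy stand-ins for real format logic.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Source:
    """Hypothetical shared record handed from module to module."""
    name: str
    data: bytes
    formats: list = field(default_factory=list)      # presumptive formats
    messages: list = field(default_factory=list)     # errors and notes
    properties: dict = field(default_factory=dict)   # reported properties


class Module(ABC):
    """Hypothetical plug-in interface: each module reads and updates the Source."""

    @abstractmethod
    def process(self, source: Source) -> None:
        ...


class Identifier(Module):
    """Identification only: records a presumptive format, validates nothing."""

    def process(self, source: Source) -> None:
        if source.data.startswith(b"%PDF-"):
            source.formats.append("PDF")


class PdfValidator(Module):
    """Separate validation step that relies on state left by the identifier."""

    def process(self, source: Source) -> None:
        if "PDF" not in source.formats:
            return  # not identified as PDF, so nothing to validate
        # Toy check only; a real validator would parse the whole file.
        valid = b"%%EOF" in source.data
        if not valid:
            source.messages.append("PDF: missing %%EOF marker")
        source.properties["pdf_valid"] = valid


def characterize(source: Source, modules: list) -> Source:
    """Run the modules in order; state accumulates in the shared Source."""
    for module in modules:
        module.process(source)
    return source
```

Because every module sees the same `Source`, a later module can depend on what an earlier one recorded, which is one way to read "stateful processing" here.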

Page 5

Format support

• Based on project partner requirements and budgetary constraints

– Image: JPEG 2000, TIFF

– Audio: WAVE

– Text: SGML, UTF-8, XML

– Document: PDF

– GIS: Shapefile

– Color: ICC

– And their well-known variants, e.g. TIFF/IT, TIFF/EP, GeoTIFF, EXIF, DNG, …

• Unfortunately precluding some JHOVE-supported formats

– AIFF, GIF, HTML, JPEG

Page 6

Schedule

• Months 1-6 Outreach, design, and prototyping

• Months 7-9 Core APIs and framework

• Months 10-24 Module implementation

Page 7

For more information

• Wiki: confluence.ucop.edu/display/JHOVE2Info/Home

• Mailing lists (subscribe via the wiki):

– JHOVE2-Announce-L

– JHOVE2-Techtalk-L

Page 8

Terminology

• What is it?

– Identification: Determining presumptive format through signature matching

• What is it, really?

– Validation: Determining conformance to commonly-accepted normative requirements

• What about it?

– Feature extraction: Reporting intrinsic properties significant to preservation planning and action

• What should you do with it?

– Assessment: Determining acceptability for a given purpose on the basis of locally-defined policies
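
The four terms above can be read as stages of one pipeline. The following Python sketch is a toy under stated assumptions, not JHOVE2 code: the function names, the report layout, and the `policy` keys are hypothetical, and validation covers only the 8-byte TIFF header. The signature bytes themselves are the standard TIFF and PDF magic numbers.

```python
import struct

# Leading-byte signatures; a match gives only a presumptive format.
SIGNATURES = {
    b"II*\x00": "TIFF",   # little-endian TIFF
    b"MM\x00*": "TIFF",   # big-endian TIFF
    b"%PDF-": "PDF",
}


def identify(data):
    """Identification: signature matching on the leading bytes."""
    for magic, fmt in SIGNATURES.items():
        if data.startswith(magic):
            return fmt
    return None


def validate_tiff(data):
    """Validation (toy): the first IFD offset must point inside the file."""
    if len(data) < 8:
        return ["header shorter than 8 bytes"]
    endian = "<" if data[:2] == b"II" else ">"
    (ifd_offset,) = struct.unpack(endian + "I", data[4:8])
    if not 8 <= ifd_offset < len(data):
        return ["first IFD offset outside file"]
    return []


def extract_features(data):
    """Feature extraction: report intrinsic properties, with no judgement."""
    order = "little-endian" if data[:2] == b"II" else "big-endian"
    return {"byte_order": order, "size": len(data)}


def assess(report, policy):
    """Assessment: apply a locally defined policy to the earlier results."""
    reasons = []
    if report["format"] not in policy["accepted_formats"]:
        reasons.append("format not accepted locally")
    if policy["require_valid"] and report["errors"]:
        reasons.append("validation errors present")
    return reasons


def characterize(data, policy):
    report = {"format": identify(data), "errors": [], "features": {}}
    if report["format"] == "TIFF":
        report["errors"] = validate_tiff(data)
        report["features"] = extract_features(data)
    report["reasons"] = assess(report, policy)
    report["acceptable"] = not report["reasons"]
    return report
```

Only `assess` consults the policy, so two institutions running the same identification, validation, and feature extraction could still reach different conclusions about acceptability.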

Page 9

Objects, not files

[Figure: source units mapped to reportable units. The single file abcd.tif yields a TIFF unit with embedded ICC and XMP units; the files 1234.shp (SHP), 1234.shx (SHX), and 1234.dbf (dBASE IV) together form one Shapefile unit.]
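
The multi-file half of this slide's "objects, not files" idea can be sketched in Python as follows. This is an assumption-laden illustration, not JHOVE2 behavior: `reportable_units` is a hypothetical helper that groups files by shared base name and treats a complete .shp/.shx/.dbf set as one Shapefile. Units embedded inside a single file, such as ICC and XMP data within a TIFF, would require parsing the file and are not covered.

```python
from collections import defaultdict
from pathlib import PurePath

# The three mandatory components of a Shapefile.
SHAPEFILE_EXTENSIONS = {".shp", ".shx", ".dbf"}


def reportable_units(filenames):
    """Group source files into (label, members) units by shared base name."""
    by_stem = defaultdict(dict)
    for name in filenames:
        path = PurePath(name)
        by_stem[path.stem][path.suffix.lower()] = name

    units = []
    for stem, parts in sorted(by_stem.items()):
        remaining = dict(parts)
        if SHAPEFILE_EXTENSIONS <= set(parts):
            # A complete set is reported as one aggregate object.
            members = sorted(remaining.pop(ext) for ext in SHAPEFILE_EXTENSIONS)
            units.append(("Shapefile", members))
        for ext in sorted(remaining):
            # Everything else stays a single-file unit.
            units.append(("file", [remaining[ext]]))
    return units
```

For the file names shown on the slide, `reportable_units(["abcd.tif", "1234.dbf", "1234.shp", "1234.shx"])` returns one Shapefile unit with three members and one single-file unit for abcd.tif.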

Page 10

Functional requirements

• Given the use of new terminology to define characterization, is the conceptual basis for the project clear?

• What does "assessment" mean in the context of your existing or planned preservation workflows?

• Ambiguous statements in format specifications can be dealt with in two ways:

– Configurable validation criteria, or

– The application of assessment rules to the results of prescribed validation criteria

Which seems preferable?
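
To make the two options concrete, here is a small Python sketch. The "empty title" criterion is an invented stand-in for an ambiguous specification statement, and all function names are hypothetical; the sketch only contrasts where the local choice is applied and does not suggest which option the project favors.

```python
def validate_configurable(record, allow_empty_title):
    """Option 1: the validation criterion itself is configurable."""
    if record.get("title", "") == "" and not allow_empty_title:
        return ["empty title"]
    return []


def validate_prescribed(record):
    """Option 2, step 1: a fixed criterion always reports the finding."""
    if record.get("title", "") == "":
        return ["empty title"]
    return []


def assess(findings, tolerated):
    """Option 2, step 2: a local rule decides which findings are acceptable."""
    return all(finding in tolerated for finding in findings)


record = {"title": ""}

# Option 1: the finding disappears when the configuration allows it.
lenient = validate_configurable(record, allow_empty_title=True)

# Option 2: the finding is always reported; local policy decides afterwards.
findings = validate_prescribed(record)
acceptable = assess(findings, tolerated={"empty title"})
```

In this sketch the visible difference is that option 2 keeps the finding in the validation result even when local policy tolerates it, whereas option 1 leaves no trace of it.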

Page 11

Functional requirements

• Given a potential trade-off between:

– More substantive assessment capabilities

– More supported formats

what would your prioritization be?

• Regarding the functional requirements, what is missing? What is most important?

• Would you be interested in pursuing third-party development, or co-development, of missing functionality?

Page 12

Functional requirements

• Do you have interesting test data that you can contribute to a JHOVE2 testbed?

• Any other questions or suggestions?