Tutorial – Semantic Digital Libraries, May 9, 2007, WWW 2007. Copyright 2006-2007, DERI NUI Galway, University of Vienna, Fraunhofer IPSI, Cornell University


Tutorial – Semantic Digital Libraries

– Existing Semantic Digital Libraries Solutions –

FEDORA

Sandy Payette, Director, Fedora Project, Cornell University

Dean Krafft, PI, NSDL, Cornell University


Fedora Semantic Digital Libraries enable …

• Scholarly and Scientific Workbenches
• “Web 2.0” Collaborative Repositories
• Museum Exhibits with Lesson Plans
• Linking Data and Publications

[Figure labels: Data, Annotation, Article, blog and wiki]


Fedora - Technology Integration

• Semantic
– Traverse graph
– Relate
– Contextualize
– Inference
– Query

• Process
– Workflow
– Messaging
– Transactions

• Repository
– Digital Objects
– Manage
– Access
– Versioning
– Storage

• Preservation
– Integrity
– Monitoring
– Alerting
– Migration
– Replication


The Fedora Project

• Fedora: Flexible Extensible Digital Object Repository Architecture

• History
– Cornell Research (1997-2002)
  – DARPA and NSF-funded research and reference implementations
  – Distributed, Interoperable Repositories (experiments with CNRI)
– Open Source Project (2002-present)
  – Andrew W. Mellon Foundation (2002-2009)
  – Joint development by Cornell University and University of Virginia
  – Transitioning into non-profit organization (Fedora Commons 501c3)


Fedora Macro Roadmap

[Timeline figure. Time axis: 2005, Now, Q4 2007, Q2 2009, 2010, 2011 onward. Phases: Fedora Phase 2, Fedora Enterprise, Fedora Commons]

• Semantic Technologies
• Service Framework
• Workflow Engine and Supporting Tools
• Message-Oriented Middleware and ESB
• Distributed Transactions

• Technical: Evolution of Semantic-Repo-Service Integrated Platform
• Community Building: Foster Development and Outreach
• Business Model: Tapping ongoing sources of funding


Relevant Technology Orientation for Fedora

• Service-oriented architecture

• Web 2.0

• Semantic Technologies

[Figure labels: SOA, Web 2.0, RDF, OWL]


Fedora Service Framework

[Diagram (SOA). Layers: Apps, Clients, Fedora Services, Fedora Repository Service. Box labels: Preservation Integrity, Preservation Monitoring, External Workflow, Basic Workflow, JHOVE, GDFR, Administrator, PROAI (OAI Provider), Directory Ingest, Institutional Repository, Federation, PID Resolution, Messaging, Search, Other, Other Service, Fez and FIRE, DirIngest, PolicyBuilder]


Fedora Technology – Enabling Position

Semantic technologies integrated with repositories. Enables many different applications.

• Collaborative Applications
• Web 2.0 Applications
• Semantic Digital Libraries


Motivations: Fedora and Semantic Technologies (RDF)

• A natural model for exposing repository as network of objects
– Object-to-object relationships
– Relationships to external entities
– Query the graph; traversal to discover related stuff

• Indexing based on generalizable data model
– Graph-based data model is a common reduction
– Avoid fixed schema problems and metadata mud wrestling

• Extensible enrichment of object descriptions
– Keep overlaying statements from multiple ontologies
– Organic evolution

• Powerful queries and inference for repository management
– Transitive relationships among objects
– Dependency analysis
– Detection/Extraction of sub-graphs
– Provenance of disseminations
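
The "transitive relationships among objects" point can be illustrated with a small sketch. This is not Fedora code: the triples, the `demo:` identifiers, and the function name are hypothetical, and a plain breadth-first traversal stands in for whatever inference a real triplestore would provide.

```python
from collections import deque

# Hypothetical triples (subject, predicate, object); identifiers are illustrative only.
TRIPLES = [
    ("demo:article-1", "isMemberOf", "demo:physics"),
    ("demo:physics", "isMemberOf", "demo:science"),
    ("demo:science", "isMemberOf", "demo:library"),
    ("demo:dataset-7", "isMemberOf", "demo:physics"),
]


def transitive_targets(triples, start, predicate):
    """Return every node reachable from `start` by following `predicate` edges."""
    edges = {}
    for s, p, o in triples:
        if p == predicate:
            edges.setdefault(s, []).append(o)

    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:  # guards against cycles in the graph
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    # Every collection the article belongs to, directly or indirectly.
    print(sorted(transitive_targets(TRIPLES, "demo:article-1", "isMemberOf")))
```

The same traversal, run over a dependency predicate instead of membership, gives a simple form of the dependency analysis listed above.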


RDF in the Fedora Digital Object Model


Digital Objects contain their RDF assertions

• Assert relationships from Fedora base ontology
– Collection – member
– Whole – part
– Equivalence
– Description Of
– More …

• Assert relationships/properties from community ontologies
– isAnnotationOf
– isRecommendedBy
– isCertifiedBy
– More …


Example: Digital Object with “compositional semantics”


RDF “Relationships” Datastream

<foxml:datastream ID="RELS-EXT" CONTROL_GROUP="X">
  <foxml:datastreamVersion ID="RELS-EXT.0" MIMETYPE="text/xml" LABEL="RDF">
    <foxml:xmlContent>
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" ….>
        <rdf:Description rdf:about="info:fedora/nsdl:100">
          <fedora:isMemberOf rdf:resource="info:fedora/nsdl:nvo-49"/>
          <fedora:isMemberOf rdf:resource="info:fedora/nsdl:physics-48"/>
          <nsdl:reviewedBy rdf:resource="info:fedora/nsdl:ev-120"/>
          <nsdl:hasDataComponent rdf:resource="info:fedora/nsdl:nvo-11"/>
        </rdf:Description>
      </rdf:RDF>
    </foxml:xmlContent>
  </foxml:datastreamVersion>
</foxml:datastream>
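
To make the datastream concrete, the sketch below reads the RDF portion of such a RELS-EXT datastream and lists its assertions as (subject, predicate, object) triples, using only the Python standard library. The slide elides the `fedora` and `nsdl` namespace declarations, so the URIs used here are assumptions: `FEDORA_NS` is believed to be the Fedora external-relations namespace, and `NSDL_NS` is a made-up placeholder. The function name is hypothetical as well.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# ASSUMPTION: the slide elides these declarations; both URIs are supplied for illustration.
FEDORA_NS = "info:fedora/fedora-system:def/relations-external#"
NSDL_NS = "http://example.org/nsdl#"  # hypothetical placeholder, not the real NSDL namespace

RELS_EXT = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:fedora="{FEDORA_NS}" xmlns:nsdl="{NSDL_NS}">
  <rdf:Description rdf:about="info:fedora/nsdl:100">
    <fedora:isMemberOf rdf:resource="info:fedora/nsdl:nvo-49"/>
    <fedora:isMemberOf rdf:resource="info:fedora/nsdl:physics-48"/>
    <nsdl:reviewedBy rdf:resource="info:fedora/nsdl:ev-120"/>
    <nsdl:hasDataComponent rdf:resource="info:fedora/nsdl:nvo-11"/>
  </rdf:Description>
</rdf:RDF>
"""


def rels_ext_triples(rdf_xml):
    """Extract (subject, predicate, object) triples from simple RDF/XML.

    Handles only the pattern shown on the slide: rdf:Description elements
    whose children point at other resources via rdf:resource.
    """
    root = ET.fromstring(rdf_xml.strip())
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            target = prop.get(f"{{{RDF_NS}}}resource")
            if target is not None:
                # ElementTree tags look like "{namespace}local"; join them into one URI.
                predicate = prop.tag[1:].replace("}", "", 1)
                triples.append((subject, predicate, target))
    return triples


if __name__ == "__main__":
    for triple in rels_ext_triples(RELS_EXT):
        print(triple)
```

A full RDF parser would be the better choice for anything beyond this flat pattern; the point here is only that each child element of `rdf:Description` is one assertion about the digital object.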


Fedora RDF-based Resource Index (RI)

• NOT the core object store - RI is an index of the repository
• Automatic, incremental indexing into triplestore
• Search/query the repository via Fedora RI Query Interface

[Figure: the Digital Object Store feeds the RDF Index of Repository from three sources: RELS-EXT datastream, Fedora model properties, DC datastream]
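
The idea of an index that is derived from the object store and updated incrementally can be sketched as follows. This is a toy in-memory illustration, not the Fedora RI or its query interface: the class, its method names, and the `demo:` identifiers are all hypothetical.

```python
class ResourceIndex:
    """Toy triple index derived from an object store (illustrative only)."""

    def __init__(self):
        # object id -> set of triples contributed by that object
        self._by_object_id = {}

    def index_object(self, object_id, triples):
        """Replace the triples for one object; called when it is ingested or modified."""
        self._by_object_id[object_id] = set(triples)

    def remove_object(self, object_id):
        """Drop all triples contributed by a purged object."""
        self._by_object_id.pop(object_id, None)

    def query(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None acts as a wildcard."""
        pattern = (s, p, o)
        results = []
        for triples in self._by_object_id.values():
            for triple in triples:
                if all(want is None or want == got for want, got in zip(pattern, triple)):
                    results.append(triple)
        return sorted(results)


if __name__ == "__main__":
    ri = ResourceIndex()
    ri.index_object("demo:1", [("demo:1", "isMemberOf", "demo:coll-a")])
    ri.index_object("demo:2", [("demo:2", "isMemberOf", "demo:coll-a")])
    # Modifying demo:1 re-indexes only that object's triples.
    ri.index_object("demo:1", [("demo:1", "isMemberOf", "demo:coll-b")])
    print(ri.query(p="isMemberOf", o="demo:coll-a"))
```

Because the index holds nothing that is not also in the objects themselves, it can be discarded and rebuilt from the object store, which is the sense in which it is "NOT the core object store".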


RI Graph view (abbreviated) …


RI Implementation: The Triplestore Challenge

• Scalability
– Few triplestores perform well for 100M+ triples
– Kowari: we tested to 180M triples
– MPTStore: we tested to 250M triples

• Performance
– Jena: easy to get out of memory
– Sesame Native: slow for complex queries
– Kowari
  – Fast queries and full-featured query language (iTQL)
  – Instability and corruption problems
– MPTStore
  – Very fast for SPO queries (limited support for complex queries)
  – Add/modify significantly faster than Kowari
– Mulgara
  – Fork of Kowari; complex queries; models; inference
  – Major bug fixes for stability and corruption problems
  – New features planned


Use Case: scholarly objects and annotation in the humanities

[Figure: graph linking scholarly objects, museum objects, and commercial web content (Amazon e-commerce). Node labels: PID-1, PID-2, PID-3, PID-10, PID-11, URI-55, URI-100, Service. Edge labels: hasPartDiagram, hasPartLetter, annotationOf, providesContext, xx:recommends, yy:certifies]


Use Case: scientific publication and collaboration

[Figure: object graph with edge labels hasDataComponent, hasReview, hasDS]


Use Case: Object-Centered Sociality


What is NSDL committed to?

NSDL 2.0 as a platform for developing digital library tools

Support for communities across the full range of science, technology, engineering and mathematics research, learning and education

The library as a shared, collaborative, contributory space

Supporting the creation of context around library resources to enhance discovery, use, and understanding


NSDL Semantic Digital Library repository requirements

Supports storing both content and metadata

Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation

Accessible through web service architecture of remixable data sources and transformations


NSDL Data Repository (NDR)

Implemented in Fedora 2.2 with MPTStore and journalling

Moderately large: 4.7 million digital objects, 250 million RDF triples

Digital objects (D.O.s): resources, metadata, agents, metadata providers, aggregators

A REST API to allow authenticated access by other applications

In production at nsdl.org


NSDL as Semantic Digital Library: collaboration, context, and contribution

The NDR and services provide the platform, but we still need the applications

Solution 1: Leverage the existing successful models: blogs, wikis, bookmarking/tagging

Solution 2: Leverage the existing software: WordPress, MediaWiki, Connotea, Sakai

Solution 3: Engage with partners and the broader community to build applications to the platform


Expert Voices

The NSDL Blogosphere, live at http://expertvoices.nsdl.org

Topic-based discussions (e.g. forensics) linked to related library resources

A way for NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, metadata

WordPress-based multi-user multi-blog application (open source, plug-in architecture)

Owner controls publication of entries as NSDL resources and visibility of comments

Entries can contain linked references to NSDL resources, references to URLs that should become resources, and new resource metadata


OurNSDL: NDR-integrated Wiki

Community of approved contributors (e.g. teachers, librarians, scientists) are granted edit access on OurNSDL wiki

New resources and metadata are created as wiki pages and reflected into the NDR

Non-wiki-based NDR resources and metadata are displayed as read-only wiki pages, subject to comment and linking, with links reflected back into RDF relationships in NDR

User and project pages organize NDR resources, again reflected back into repository as RDF

Now implementing MediaWiki extensions; beta release expected 2Q07
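
As a rough illustration of "links reflected back into RDF relationships", the sketch below turns MediaWiki-style internal links in a page's wikitext into triples. It is not the OurNSDL extension: the function name, the `demo:linksTo` predicate, and the page titles are hypothetical, and only the basic `[[Target]]`, `[[Target|label]]`, and `[[Target#section]]` link forms are handled.

```python
import re

# Matches [[Target]], [[Target|label]] and [[Target#section]]; captures only the target title.
WIKI_LINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def links_to_triples(page_title, wikitext, predicate="demo:linksTo"):
    """Return one (page, predicate, target) triple per distinct link target, in order."""
    triples = []
    for match in WIKI_LINK.finditer(wikitext):
        target = match.group(1).strip()
        triple = (page_title, predicate, target)
        if target and triple not in triples:
            triples.append(triple)
    return triples


if __name__ == "__main__":
    text = "See [[Colloids]] and [[Polymer physics|polymers]]; also [[Colloids#History]]."
    for triple in links_to_triples("Demo Page", text):
        print(triple)
```

In a real integration the page titles would first be resolved to repository identifiers, and the predicate would come from an agreed ontology rather than a placeholder.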


NDR Entry for Soft Matter Wiki

[Figure: NDR object graph. Node labels: Wiki Entry, New Metadata, New Audience MD, Referenced New Resource 1, Referenced Existing Resource 2, Metadata Provider (two nodes), Existing Collection, Soft Matter Wiki. Edge labels: Annotates, Metadata for, Member of, Inferred relationship between resources]


NSDL 2.0 Ecosystem

Protocols: OAI-PMH, HTTP, REST, NDR API

[Figure labels: STEM Collections, Search Service, Archive Service]

Fedora-based NDR


NSDL 2.0 and the Semantic Web

NSDL 2.0 applications situate resources in context, aiding both discovery and use

Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library

Fedora-based semantic web technology organizes resources, ties context to content, maintains provenance, enables discovery, empowers the user, and powers the library