Tutorial – Semantic Digital Libraries, May 9, 2007 WWW 2007 Copyright 2006-2007, DERI NUI Galway, University of Vienna, Fraunhofer IPSI, Cornell University
Tutorial – Semantic Digital Libraries
- Existing Semantic Digital Libraries Solutions –
FEDORA
Sandy Payette
Director, Fedora Project
Cornell University
Dean Krafft,
PI, NSDL
Cornell University
Fedora Semantic Digital Libraries enable …
• Scholarly and Scientific Workbenches
• “Web 2.0” Collaborative Repositories
• Museum Exhibits with Lesson Plans
• Linking Data and Publications
Figure labels: Data, Annotation, Article, blog and wiki
Fedora - Technology Integration
• Semantic: Traverse graph, Relate, Contextualize, Inference, Query
• Process: Workflow, Messaging, Transactions
• Repository: Digital Objects, Manage, Access, Versioning, Storage
• Preservation: Integrity, Monitoring, Alerting, Migration, Replication
The Fedora Project
• Fedora: Flexible, Extensible, Digital Object Repository Architecture
• History
  – Cornell Research (1997-2002)
    – DARPA and NSF-funded research and reference implementations
    – Distributed, Interoperable Repositories (experiments with CNRI)
  – Open Source Project (2002-present)
    – Andrew W. Mellon Foundation (2002-2009)
    – Joint development by Cornell University and University of Virginia
    – Transitioning into non-profit organization (Fedora Commons 501c3)
Fedora Macro Roadmap
Timeline: 2005, Now, Q4 2007, Q2 2009, 2010, 2011 onward
Phases: Fedora Phase 2, Fedora Enterprise, Fedora Commons
Roadmap items:
• Semantic Technologies
• Service Framework
• Workflow Engine and Supporting Tools
• Message-Oriented Middleware and ESB
• Distributed Transactions
Tracks:
• Technical: Evolution of Semantic-Repo-Service Integrated Platform
• Community Building: Foster Development and Outreach
• Business Model: Tapping ongoing sources of funding
Relevant Technology Orientation for Fedora
• Service-oriented architecture
• Web 2.0
• Semantic Technologies
Figure labels: SOA, Web 2.0, RDF, OWL
Fedora Service Framework
Figure labels (SOA):
• Fedora Repository Service
• Fedora Services: Preservation Integrity, Preservation Monitoring, JHOVE, GDFR, Basic Workflow, External Workflow, PROAI (OAI Provider), Directory Ingest, Federation, PID Resolution, Messaging, Search, Other Service
• Apps and Clients: Administrator, Institutional Repository, Fez and FIRE, DirIngest, PolicyBuilder, Other
Fedora Technology – Enabling Position
Semantic technologies integrated with repositories. Enables many different applications:
• Collaborative Applications
• Web 2.0 Applications
• Semantic Digital Libraries
Motivations: Fedora and Semantic Technologies (RDF)
• A natural model for exposing repository as network of objects
  – Object-to-object relationships
  – Relationships to external entities
  – Query the graph; traversal to discover related stuff
• Indexing based on generalizable data model
  – Graph-based data model is a common reduction
  – Avoid fixed schema problems and metadata mud wrestling
• Extensible enrichment of object descriptions
  – Keep overlaying statements from multiple ontologies
  – Organic evolution
• Powerful queries and inference for repository management
  – Transitive relationships among objects
  – Dependency analysis
  – Detection/Extraction of sub-graphs
  – Provenance of disseminations
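The "transitive relationships among objects" point can be illustrated with a small graph traversal. This is only a sketch over an in-memory list of triples; the identifiers and the `isPartOf` / `isMemberOf` predicate names below are hypothetical examples, not taken from a Fedora repository, and a real deployment would run such queries against a triplestore.

```python
from collections import deque

# Hypothetical triples (subject, predicate, object), for illustration only.
TRIPLES = [
    ("obj:page-1", "isPartOf", "obj:chapter-1"),
    ("obj:chapter-1", "isPartOf", "obj:book-1"),
    ("obj:book-1", "isMemberOf", "obj:collection-1"),
    ("obj:page-2", "isPartOf", "obj:chapter-1"),
]


def transitive_targets(triples, start, predicate):
    """Return every node reachable from `start` by following `predicate` edges."""
    # Build an adjacency list restricted to the chosen predicate.
    edges = {}
    for s, p, o in triples:
        if p == predicate:
            edges.setdefault(s, []).append(o)

    # Breadth-first traversal; `seen` guards against cycles in the graph.
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order
```

Calling `transitive_targets(TRIPLES, "obj:page-1", "isPartOf")` follows the part-of chain from the page up to the book and stops there, because the book's only outgoing edge uses a different predicate.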
RDF in the Fedora Digital Object Model
Digital Objects contain their RDF assertions
• Assert relationships from Fedora base ontology
  – Collection – member
  – Whole – part
  – Equivalence
  – Description Of
  – More…
• Assert relationships/properties from community ontologies
  – isAnnotationOf
  – isRecommendedBy
  – isCertifiedBy
  – More…
Example: Digital Object with “compositional semantics”
RDF “Relationships” Datastream
<foxml:datastream ID="RELS-EXT" CONTROL_GROUP="X">
  <foxml:datastreamVersion ID="RELS-EXT.0" MIMETYPE="text/xml" LABEL="RDF">
    <foxml:xmlContent>
      <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" ….>
        <rdf:Description rdf:about="info:fedora/nsdl:100">
          <fedora:isMemberOf rdf:resource="info:fedora/nsdl:nvo-49"/>
          <fedora:isMemberOf rdf:resource="info:fedora/nsdl:physics-48"/>
          <nsdl:reviewedBy rdf:resource="info:fedora/nsdl:ev-120"/>
          <nsdl:hasDataComponent rdf:resource="info:fedora/nsdl:nvo-11"/>
        </rdf:Description>
      </rdf:RDF>
    </foxml:xmlContent>
  </foxml:datastreamVersion>
</foxml:datastream>
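To see what assertions such a datastream carries, the RDF/XML payload can be read into (subject, predicate, object) triples. The sketch below uses Python's standard `xml.etree.ElementTree` and only the `rdf:RDF` part of the datastream. The slide elides the namespace declarations, so both the `fedora` and `nsdl` namespace URIs here are assumptions made to get a parseable document; the `nsdl` one in particular is a placeholder.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The fedora and nsdl namespace URIs are assumptions; the slide elides them.
RELS_EXT = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:fedora="info:fedora/fedora-system:def/relations-external#"
         xmlns:nsdl="http://example.org/nsdl#">
  <rdf:Description rdf:about="info:fedora/nsdl:100">
    <fedora:isMemberOf rdf:resource="info:fedora/nsdl:nvo-49"/>
    <fedora:isMemberOf rdf:resource="info:fedora/nsdl:physics-48"/>
    <nsdl:reviewedBy rdf:resource="info:fedora/nsdl:ev-120"/>
    <nsdl:hasDataComponent rdf:resource="info:fedora/nsdl:nvo-11"/>
  </rdf:Description>
</rdf:RDF>
"""


def rels_ext_triples(rdf_xml):
    """Extract (subject, predicate, object) triples from simple RDF/XML.

    Handles only the shape shown above: rdf:Description elements whose
    children are properties pointing at an rdf:resource.
    """
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; join the two
            # parts to obtain the full predicate URI.
            ns, local = prop.tag[1:].split("}")
            obj = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, ns + local, obj))
    return triples
```

`rels_ext_triples(RELS_EXT)` yields four triples, all with `info:fedora/nsdl:100` as subject: two collection memberships, one review link, and one data-component link.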
Fedora RDF-based Resource Index (RI)
• NOT the core object store: the RI is an index of the repository
• Automatic, incremental indexing into triplestore
• Search/query the repository via Fedora RI Query Interface
Figure labels: RDF Index of Repository, Digital Object Store, RELS-EXT datastream, Fedora model properties, DC datastream
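The distinction between the object store and the index can be sketched in a few lines. This toy example is not Fedora's implementation: the `ResourceIndex` class, the `rebuild` helper, and the sample triples are all hypothetical. It shows only the two properties the slide names, namely that the index is updated incrementally per object and that it can be regenerated from the store because the store remains the source of truth.

```python
class ResourceIndex:
    """Toy triple index derived from, and rebuildable from, an object store."""

    def __init__(self):
        # object id -> set of (subject, predicate, object) triples it contributed
        self._by_object = {}

    def update(self, object_id, triples):
        """Incrementally replace the triples contributed by one digital object."""
        self._by_object[object_id] = set(triples)

    def remove(self, object_id):
        """Drop all triples contributed by a purged object."""
        self._by_object.pop(object_id, None)

    def match(self, s=None, p=None, o=None):
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t
            for triples in self._by_object.values()
            for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


def rebuild(store):
    """Regenerate the whole index from the object store."""
    index = ResourceIndex()
    for object_id, triples in store.items():
        index.update(object_id, triples)
    return index


# Hypothetical object store: object id -> triples held in its datastreams.
STORE = {
    "nsdl:100": [
        ("nsdl:100", "isMemberOf", "nsdl:nvo-49"),
        ("nsdl:100", "dc:title", "Example"),
    ],
    "nsdl:101": [("nsdl:101", "isMemberOf", "nsdl:nvo-49")],
}
```

A query such as `rebuild(STORE).match(p="isMemberOf", o="nsdl:nvo-49")` lists the members of a collection. Modifying one object calls `update` for that object alone, leaving the rest of the index untouched.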
RI Graph view (abbreviated) …
RI Implementation: The Triplestore Challenge
• Scalability
  – Few triplestores perform well for 100M+ triples
  – Kowari: we tested to 180M triples
  – MPTStore: we tested to 250M triples
• Performance
  – Jena: easy to run out of memory
  – Sesame Native: slow for complex queries
  – Kowari
    – Fast queries and full-featured query language (iTQL)
    – Instability and corruption problems
  – MPTStore
    – Very fast for SPO queries (limited support for complex queries)
    – Add/modify significantly faster than Kowari
  – Mulgara
    – Fork of Kowari; complex queries; models; inference
    – Major bug fixes for stability and corruption problems
    – New features planned
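The slides do not describe how MPTStore is built. One plausible reason a store can be fast for simple subject-predicate-object (SPO) patterns is a relational layout with one table per predicate, so that a pattern with a known predicate touches a single indexed table. The sketch below assumes that layout (an assumption, not a statement from the slides) and uses SQLite from the Python standard library; the `PredicateTableStore` class and its methods are hypothetical.

```python
import sqlite3


class PredicateTableStore:
    """Toy triple store that keeps one (s, o) table per predicate."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.tables = {}  # predicate -> generated table name

    def _table_for(self, predicate):
        """Return the table for a predicate, creating it on first use."""
        if predicate not in self.tables:
            # Table names are generated, so no caller-supplied text is ever
            # interpolated into an SQL identifier.
            name = f"t{len(self.tables)}"
            self.db.execute(f"CREATE TABLE {name} (s TEXT, o TEXT)")
            self.db.execute(f"CREATE INDEX {name}_s ON {name} (s)")
            self.db.execute(f"CREATE INDEX {name}_o ON {name} (o)")
            self.tables[predicate] = name
        return self.tables[predicate]

    def add(self, s, p, o):
        """Insert one triple into the table that belongs to its predicate."""
        self.db.execute(f"INSERT INTO {self._table_for(p)} VALUES (?, ?)", (s, o))

    def objects(self, s, p):
        """Answer the pattern (s, p, *) with a single-table indexed lookup."""
        if p not in self.tables:
            return []
        rows = self.db.execute(
            f"SELECT o FROM {self.tables[p]} WHERE s = ? ORDER BY o", (s,)
        )
        return [row[0] for row in rows]
```

Under this layout, a pattern with a fixed predicate never scans triples of other predicates, while queries that join across many predicates need one join per table, which is consistent with the slide's note about limited support for complex queries.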
Use Case: scholarly objects and annotation in the humanities
Figure labels:
• Nodes: PID-1, PID-2, PID-3, PID-10, PID-11, URI-55, URI-100, Service, Diagram, Letter
• Relationships: hasPart, annotationOf, providesContext, xx:recommends, yy:certifies
• Domains: scholarly objects, museum objects, commercial web content, Amazon e-commerce
Use Case: scientific publication and collaboration
Figure labels: hasDataComponent, hasReview, hasDS
Use Case: Object-Centered Sociality
What is NSDL committed to?
NSDL 2.0 as a platform for developing digital library tools
Support for communities across the full range of science, technology, engineering and mathematics research, learning and education
The library as a shared, collaborative, contributory space
Supporting the creation of context around library resources to enhance discovery, use, and understanding
NSDL Semantic Digital Library repository requirements
Supports storing both content and metadata
Allows arbitrary relationships among resource and metadata objects: organization, annotation, citation
Accessible through web service architecture of remixable data sources and transformations
NSDL Data Repository (NDR)
Implemented in Fedora 2.2 with MPTStore and journalling
Moderately large: 4.7 million digital objects, 250 million RDF triples
Digital objects: resources, metadata, agents, metadata providers, aggregators
A REST API to allow authenticated access by other applications
In production at nsdl.org
NSDL as Semantic Digital Library: collaboration, context, and contribution
The NDR and services provide the platform, but we still need the applications
Solution 1: Leverage the existing successful models: blogs, wikis, bookmarking/tagging
Solution 2: Leverage the existing software: WordPress, MediaWiki, Connotea, Sakai
Solution 3: Engage with partners and the broader community to build applications to the platform
Expert Voices
The NSDL Blogosphere, live at http://expertvoices.nsdl.org
Topic-based discussions (e.g. forensics) linked to related library resources
A way for NSDL community members to become NSDL contributors: of resources, questions, reviews, annotations, metadata
WordPress-based multi-user, multi-blog application (open source, plug-in architecture)
Owner controls publication of entries as NSDL resources and visibility of comments
Entries can contain linked references to NSDL resources, references to URLs that should become resources, and new resource metadata
OurNSDL: NDR-integrated Wiki
A community of approved contributors (e.g. teachers, librarians, scientists) is granted edit access on the OurNSDL wiki
New resources and metadata are created as wiki pages and reflected into the NDR
Non-wiki-based NDR resources and metadata are displayed as read-only wiki pages, subject to comment and linking, with links reflected back into RDF relationships in NDR
User and project pages organize NDR resources, again reflected back into repository as RDF
Now implementing MediaWiki extensions; beta release expected 2Q07
NDR Entry for Soft Matter Wiki
Figure labels:
• Nodes: Wiki Entry, New Metadata, New Audience MD, Referenced New Resource 1, Referenced Existing Resource 2, Metadata Provider, Existing Collection, Soft Matter Wiki
• Relationships: Annotates, Metadata for, Member of, Inferred relationship between resources
…
NSDL 2.0 Ecosystem
Protocols: OAI-PMH, HTTP, REST, NDR API
Figure labels: STEM Collections, Search Service, Archive Service, Fedora-based NDR
NSDL 2.0 and the Semantic Web
NSDL 2.0 applications situate resources in context, aiding both discovery and use
Users become contributors, adding new resources, ratings, annotations, and organizational structure – frequently as a side effect of using the library
Fedora-based semantic web technology organizes resources, ties context to content, maintains provenance, enables discovery, empowers the user, and powers the library