Post on 18-Dec-2015
![Page 1: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/1.jpg)
The DRIVER initiative for networking repositories
Wolfram Horstmann
Universität Bielefeld
![Page 2: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/2.jpg)
DRIVER motivation
Scholarly communication changes towards distributed provision of text, data and services
Repositories are thought of as a saviour in this development, building such a distributed system
An infrastructure supporting distributed repositories and services is needed
(and reactions)
(needs explanation)
![Page 3: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/3.jpg)
Some observations on repositories
They represent a shift towards …
open internet exposure as opposed to closed databases (‚graveyards‘)
content orientation as opposed to mere technical orientation (‚web-servers‘)
distributed systems: centralized structures are not immediately required nowadays
![Page 4: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/4.jpg)
„Everybody can be a publisher“
Common description standards, e.g. Dublin Core Metadata Initiative
Many subject-specific standards
Common transfer protocols e.g. OAI-PMH, but also FTP, XML-RPC, WS, etc.
Searchability is possible!
Still: many results are lost to re-use/remix
Closed: too sensitive, weakly described, unimportant (???)
Missing service frameworks / infrastructures
Problems: Data and service interoperability
Solution: „Infrastructure“
Repositories can solve access problem
![Page 5: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/5.jpg)
What infrastructures are: DRIVER terms
Not an infrastructure
Single repository
Single application for search and retrieval (e.g. BASE)
Only local operation
Backwards causation on repositories is missing
Maybe an infrastructure
Distributed repository landscape as a whole
As a capacity for emergent properties, e.g. quality and quantity incentive for data population
Nurturing development of service providers
Definitely an infrastructure
Many service providers in one organisational and technical context (e.g. run-time environment)
Enabling re-use and remix of data and services
![Page 6: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/6.jpg)
DRIVER Objectives
Organisational structure for repositories
e.g. the „Confederation“
Improving quality and standards in local repositories
e.g. validation procedures
Building a distributed runtime system
e.g. service and data sharing
Target Groups
Repository Managers
Service Providers
Information System Executives
![Page 7: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/7.jpg)
The DRIVER approach is incremental
Start with publication metadata
Existing distributed system, somehow connected
Considerable homogeneity and formats: OAI-PMH
Extend geographical coverage
From 5 countries, to 10, to 27, to ???
Extend towards other contents
From publication metadata to enhanced publications, i.e. representations of „texts + data“
Learn about subject specificity
Data bring in disciplinary requirements
![Page 8: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/8.jpg)
The DRIVER Initiative
DRIVER-I 6/2006 – 11/2007
Organisational Models and Technical Test-Bed
DRIVER-II 12/2007 – 11/2009
Running Organisation and Production Infrastructure
DRIVER-Confederation 2010ff
Operations Office and Technical Deployment
NB: DRIVER is not an authoritative body; it is a liberal bottom-up initiative of stakeholders
![Page 9: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/9.jpg)
DRIVER partners and related projects
Networking, Support, Policy, Studies
Göttingen, Nottingham, SURF, Ghent, Ljubljana, Minho, Copenhagen
Technical development and deployment
Athens, Bielefeld, Pisa, Warsaw
Partners make links to many other things
OA-services: Sherpa-ROMEO, OpenDOAR, BASE…
Projects: Europeana, PEER, DELOS, DL.org, D4Science, PARSE-Insight, NESTOR…
Orgs: DINI, JISC, LIBER, SPARC, KE …
Platforms: DSPACE/FEDORA/OPUS/ePrints
![Page 10: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/10.jpg)
DRIVER-II Midterm Review, January 30, 2009 - Pisa
Project structure
Networking
Research
Service
Running Infrastructure: Content & Functionality
Construction of Services: ideas, design, development
Technical
Management
Advocacy: attracting users, content and service providers
Discovery: technology watch, EPs requirements
![Page 11: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/11.jpg)
Some results
![Page 12: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/12.jpg)
Some Results: Studies
![Page 13: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/13.jpg)
Some Results: A Portal
![Page 14: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/14.jpg)
Some Results: A Search
![Page 15: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/15.jpg)
Some Results: Repository Registration
![Page 16: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/16.jpg)
Some Results: Guidelines
Build on knowledge from past & current IR projects (EU)
26 actively involved contributors (experts and repository managers) from 8 countries.
Practical answers on how to:
Improve full-text access
Standardize metadata quality
Create a reliable infrastructure for permanent identification, resolution, traceability and storage
Resolve semantic and classification issues
![Page 17: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/17.jpg)
Some Results: Support structures
![Page 18: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/18.jpg)
Some Results: Repositories
185+ harvested repositories
21 countries
856,264+ documents
![Page 19: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/19.jpg)
Some Results: Service-Oriented-Arch.
9 hosting nodes
25+ functionality typologies (services)
36 service instances
3 applications: DRIVER Main, Belgium, Spain-Recolecta
![Page 20: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/20.jpg)
Some Results: Runtime-System & Hosting
Enabling Layer
Data Layer
EU Open Access Repositories
Functionality Layer
Administrators
End users
Advanced User Interfaces
National portals
Project Applications
![Page 21: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/21.jpg)
Another Compulsory Design Diagram
![Page 22: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/22.jpg)
Some Results: Software
Meant for large service providers only!
![Page 23: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/23.jpg)
Technicalities
![Page 24: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/24.jpg)
DRIVER and standards
Service Resources are implemented as Web Services and accessed through the corresponding Web Service Interface
Call parameters are enveloped in SOAP messages
The Enabling Services are also compatible with REST
XML is the lingua franca for the whole system
Resource internal status, i.e. Resource profiles
Profiles in the Information Service use the eXist XML engine
Vocabularies
Names of Languages: ISO 639-2 (three letters, B/T)
Names of Countries: ISO 3166 (two letters)
Date format: ISO 8601:1988 (E)
DRIVER Aggregation
Harvesting according to the OAI-PMH protocol
Adopting OAI-Provenance best practice (OAI-about)
To be extended to other object models and harvesting protocols
Queries to Search and Index conform to the SRW/CQL standard
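To make the SRW/CQL point concrete, here is a minimal sketch of how a client might assemble a CQL query string before sending it to a search service. The helper names are hypothetical, and the indexes `dc.title` and `dc.language` are assumptions; the slides do not say which indexes the DRIVER Search and Index services expose. The language value uses a three-letter ISO 639-2 code, in line with the vocabularies listed above.

```python
def cql_term(value: str) -> str:
    """Quote a search term, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def cql_clause(index: str, relation: str, value: str) -> str:
    """Build one CQL search clause of the form: index relation "term"."""
    return f"{index} {relation} {cql_term(value)}"


def cql_and(*clauses: str) -> str:
    """Combine clauses with the CQL boolean operator 'and'."""
    return " and ".join(f"({clause})" for clause in clauses)


# Hypothetical query: titles containing "open access", language English.
query = cql_and(
    cql_clause("dc.title", "=", "open access"),
    cql_clause("dc.language", "=", "eng"),
)
print(query)
```

Quoting every term keeps the builder simple and avoids problems with terms that contain spaces or reserved words.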
![Page 25: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/25.jpg)
Enabling Layer Developments

| Service | Function | Task | Partner | Status | D-NET |
|---|---|---|---|---|---|
| IS-Store | Resource profile store | Enhanced; Port (PERL > JAVA) | CNR | RC | 1.1 |
| IS-S&N | W3C S&N/Topics | Enhanced; Port (PERL > JAVA) | CNR | RC | 1.1 |
| IS-Lookup | Resource discovery | Enhanced; Port (PERL > JAVA) | CNR | RC | 1.1 |
| IS-Registry | Resource registration/de-registration/update | Enhanced; Port (PERL > JAVA) | CNR | RC | 1.1 |
| Manager | Orchestration of DRIVER Info Space | Enhanced; Port (PERL > JAVA) | CNR | RC | 1.1 |
| Authn&Authz | Service-2-Service secure interaction/multiple applications | Enhanced Service (JAVA) | ICM | Proto | 2.0 |
| Monitoring | Admin User Interface and autonomic administration | Novel Service (JAVA) | CNR | RC | 1.2 |
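The registration, update, de-registration and discovery functions listed for the Information Service can be pictured with a small in-memory sketch. This is not the D-NET API: the class, method names and profile fields are hypothetical, chosen only to show how a registry of resource profiles and a lookup over them fit together.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ResourceProfile:
    """Hypothetical stand-in for a resource profile (XML in the real system)."""
    resource_type: str
    properties: Dict[str, str] = field(default_factory=dict)


class InformationService:
    """Toy profile store combining registry and lookup roles."""

    def __init__(self) -> None:
        self._profiles: Dict[str, ResourceProfile] = {}
        self._ids = itertools.count(1)

    def register(self, profile: ResourceProfile) -> str:
        """Store a profile and return its newly assigned identifier."""
        resource_id = f"res-{next(self._ids)}"
        self._profiles[resource_id] = profile
        return resource_id

    def update(self, resource_id: str, **properties: str) -> None:
        """Overwrite or add properties on a registered profile."""
        self._profiles[resource_id].properties.update(properties)

    def deregister(self, resource_id: str) -> None:
        """Remove a profile from the store."""
        del self._profiles[resource_id]

    def lookup(self, resource_type: str, **criteria: str) -> List[str]:
        """Return sorted ids of profiles matching the type and all criteria."""
        return sorted(
            rid
            for rid, profile in self._profiles.items()
            if profile.resource_type == resource_type
            and all(profile.properties.get(k) == v for k, v in criteria.items())
        )


service = InformationService()
first = service.register(ResourceProfile("IndexService", {"host": "node-1"}))
second = service.register(ResourceProfile("IndexService", {"host": "node-2"}))
print(service.lookup("IndexService", host="node-2"))
```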
![Page 26: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/26.jpg)
Data-Layer Developments

| Service | Function | Task | Partner | Status | D-NET |
|---|---|---|---|---|---|
| Harvester | Collects arbitrary formats | Port (PERL > JAVA) | UniBi/CNR | Alpha | 2.0 |
| Transformator | Eases arbitrary mappings | Novel service (JAVA) | UniBi/CNR | Alpha | 2.0 |
| Feature Extraction | Executes transformations and utilities | Novel service (JAVA) | UniBi | Alpha | 2.0 |
| Text-Engine | Utilities, e.g. language detection, full-text extraction | Novel service (JAVA) | UniBi | Alpha | 1.1 |
| MD-Store | Support special MD operations | Port (PERL > JAVA) | UniBi | Alpha | 1.1 |
| Store | Generic store for binaries | Novel service (JAVA) | UniBi/ICM/CNR | Proto | 2.0 |
| Index | Lookup table for stored information | Adapt from YADDA | ICM/UniBi | Prod. | 1.0 |
| OAI-ORE Publisher | Exposure of stored information | Novel service (JAVA) | CNR | Spec. | 2.0 |
| OAI-PMH Publisher | Exposure of stored information | -- | CNR | Prod. | 1.0 |
| Content Service | Managing complex objects | Novel service (JAVA) | CNR | Proto | 2.0 |
| Access Service | Generic service for using remote objects | Novel service (JAVA) | CNR | Proto | 2.0 |
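As an illustration of what a harvester does at the protocol level, the following sketch builds an OAI-PMH `ListRecords` request URL and extracts record identifiers and the resumption token from a response. It is not the DRIVER Harvester: the function names, the base URL and the sample response are made up for the example, and it parses a canned document instead of contacting a repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of OAI-PMH 2.0 response documents.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request URL for the first or a follow-up page."""
    if resumption_token:
        # resumptionToken is an exclusive argument: send it without metadataPrefix.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (record identifiers, resumption token or None) from a response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    has_token = token_element is not None and token_element.text
    return identifiers, (token_element.text if has_token else None)


# Made-up response from a hypothetical repository.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

identifiers, token = parse_list_records(SAMPLE)
next_url = list_records_url("http://repo.example.org/oai", resumption_token=token)
print(identifiers, next_url)
```

A real harvester would repeat the request with each returned token until the response carries an empty or missing `resumptionToken`.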
![Page 27: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/27.jpg)
Functional Layer Developments

| Service | Function | Task | Partner | Status | D-NET |
|---|---|---|---|---|---|
| AID | Enhanced Publications management | Novel Service (JAVA) | NKUA | Spec. | 2.0 |
| Advanced search | Optimized Search / Similarity Search | Enhanced Service (JAVA) / Novel Service (JAVA) | NKUA / ICM | Spec. / Spec. | 2.0 / 2.0 |
| User Services | Advanced personalization | Enhanced Service (JAVA) | NKUA | Spec. | 2.0 |
| Community Service | Advanced Community management | Enhanced Service (JAVA) | NKUA | Spec. | 2.0 |
| Web Interface | Generic to data model and services / Enhanced UIs | Enhanced Service (JAVA) | NKUA | Spec. / Spec. | 1.2 / 2.0 |
![Page 28: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/28.jpg)
Current Work: DRIVER-II
Networking
Confederation with who-is-who advisory board
Outreach: LIBER, SPARC, US, JAPAN etc…
Consolidation
DRIVER-I Services packaged and performing in production quality
Enhancement
DRIVER-I Services: Improved indexing and data aggregation functionalities
DRIVER-II Services: D-NET v2.0, enhanced publication management and functionality
![Page 29: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/29.jpg)
DRIVER II – D-NET v2.0
Studies
What are „Enhanced Publications“? >> PDF
Technologies for „Enhanced Publications“ >> PDF
Long-Term Preservation of „Enhanced Publications“
„Technology Watch“: the Future >> PDF
Demonstrators
„Enhanced Publications“ >> Live
„Enhanced Publications“ Long-Term Preserv. >> Film
Infrastructure
Specs. ready, Development in progress >> WIKI
D-NET v1.1: Java-Porting & Build-System
D-NET v1.2: New Aggregator, Installer (, Contracts)
D-NET v2.0: Compound Object Management
![Page 30: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/30.jpg)
Outlook: Enhanced Publications
![Page 31: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/31.jpg)
Outlook: Enhanced Publications
Based on OAI-ORE
![Page 32: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/32.jpg)
The Web-Capable Model – OAI-ORE
http://www.openarchives.org/ore/
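As a rough sketch of the OAI-ORE idea, an enhanced publication can be described by a Resource Map that describes an Aggregation, which in turn aggregates the text and its data. The snippet below emits these statements as N-Triples using the ORE vocabulary terms `ResourceMap`, `Aggregation`, `describes` and `aggregates`. All `example.org` URIs and the function names are hypothetical; how DRIVER itself serializes enhanced publications is not stated on the slides.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ORE = "http://www.openarchives.org/ore/terms/"


def resource_map_triples(rem_uri, aggregation_uri, aggregated_uris):
    """Return (subject, predicate, object) URI triples for one Resource Map."""
    triples = [
        (rem_uri, RDF_TYPE, ORE + "ResourceMap"),
        (rem_uri, ORE + "describes", aggregation_uri),
        (aggregation_uri, RDF_TYPE, ORE + "Aggregation"),
    ]
    # One ore:aggregates statement per part of the enhanced publication.
    triples += [(aggregation_uri, ORE + "aggregates", uri) for uri in aggregated_uris]
    return triples


def to_ntriples(triples):
    """Serialize URI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


# Hypothetical enhanced publication: one article plus one dataset.
triples = resource_map_triples(
    "http://example.org/rem/1",
    "http://example.org/aggregation/1",
    ["http://example.org/article.pdf", "http://example.org/dataset.csv"],
)
print(to_ntriples(triples))
```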
![Page 33: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/33.jpg)
The Document Model for DRIVER
![Page 34: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/34.jpg)
The Object Model – Internal Processing
Primitives: Types, Sets and Objects
Object: atoms, descriptions, relations
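One way to read these primitives is as plain data structures: a type labels an object, a set groups objects, and each object carries atoms (payload), descriptions (metadata) and relations (links to other objects). The sketch below is an interpretation, not the D-NET implementation; the class names, field types and the `hasDataset` relation are assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ObjectType:
    """A named type, e.g. 'article' or 'dataset' (names are illustrative)."""
    name: str


@dataclass
class DriverObject:
    object_id: str
    object_type: ObjectType
    atoms: List[str] = field(default_factory=list)  # payload references, e.g. file URLs
    descriptions: Dict[str, str] = field(default_factory=dict)  # metadata fields
    relations: List[Tuple[str, str]] = field(default_factory=list)  # (relation, target id)


@dataclass
class ObjectSet:
    name: str
    members: Dict[str, DriverObject] = field(default_factory=dict)

    def add(self, obj: DriverObject) -> None:
        self.members[obj.object_id] = obj

    def related(self, object_id: str, relation: str) -> List[DriverObject]:
        """Follow one relation from an object to other members of this set."""
        source = self.members[object_id]
        return [
            self.members[target]
            for name, target in source.relations
            if name == relation and target in self.members
        ]


# Hypothetical enhanced publication: an article linked to its dataset.
publications = ObjectSet("enhanced-publications")
publications.add(DriverObject("obj-2", ObjectType("dataset"), atoms=["http://example.org/data.csv"]))
publications.add(
    DriverObject(
        "obj-1",
        ObjectType("article"),
        atoms=["http://example.org/article.pdf"],
        descriptions={"title": "An example article"},
        relations=[("hasDataset", "obj-2")],
    )
)
print([obj.object_id for obj in publications.related("obj-1", "hasDataset")])
```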
![Page 35: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/35.jpg)
The DRIVER-application
![Page 36: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/36.jpg)
Compound Object Management
Object Instances DRIVER Processing DRIVER Application
Web-Representation
Web-Processing
![Page 37: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/37.jpg)
Conclusion
![Page 38: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/38.jpg)
Lessons learnt
Distributed data infrastructure requires links between organisational and technical concepts
Data specialists, computer scientists, service providers
Guidelines / content policies as a „glue“
In distributed data provision, quality and access measures are the most ‚expensive‘ tasks
Distributed service operation (not data provision) can be solved but raises novel questions (SLAs)
„Infrastructure“ for novel paradigms of scholarly communication is hard to get across ;-)
![Page 39: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/39.jpg)
Summary
DRIVER tackles the data infrastructure challenge from the text-repository side (mostly OAI-PMH)
DRIVER handshakes with primary & secondary data through „enhanced publications“
DRIVER isn't only a project but also a forum for information specialists
‚Products‘ include: Studies, Infrastructure run-time-system in production, software, support …
DRIVER has addressed many problems of data and service interoperability in a distributed repository environment and found some solutions
![Page 40: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/40.jpg)
But…
How could DRIVER link to serious processing of unstructured data?
![Page 41: The DRIVER initiative for networking repositories Wolfram Horstmann Universität Bielefeld](https://reader038.vdocuments.us/reader038/viewer/2022110207/56649d235503460f949fa0f9/html5/thumbnails/41.jpg)
Thanks