Digital library and MLE integration - where are we now and where do we want to be?
A presentation to the UCISA TLIG-SDG User Support Conference 2004, Exeter.
Andy Powell, UKOLN, University of Bath
www.bath.ac.uk
a centre of expertise in digital information management
www.ukoln.ac.uk
UCISA TLIG-SDG, 2004, Exeter 2
Contents
• current developments in five institutional ‘service areas’
– external information services
– library
– computing services / Web support
– e-Learning
– MIS, registry, student records, finance, etc.
• standards bodies/activities
• technical options for joining stuff together
• broad and shallow!
– but highlighting RSS and OpenURL
Institutional service areas…
[Diagram: five service areas: e-Learning; Library; Computing Services / Web support team; MIS/Registry/Finance/etc.; External content]
External content
• wide range of information and other content coming into the institution from outside
– made available thru JISC ‘site licensing’ deals, national data centres, elsewhere
• primary focus of the JISC’s ‘information environment’ activity
The ‘problem’…
• end-user often has access to large number of heterogeneous collections - full-text, A&I, images, video, data, etc.
• however, experience of these collections is less than optimal:
– end-users not aware of available content
– end-user has to interact with (search or browse) multiple different Web sites to work across range of content
– content ‘discovery’ services not joined-up with ‘delivery’ services
Simple scenario
• consider a researcher searching for material to inform a research paper on HIV and/or AIDS
• he or she searches for ‘hiv aids’ using:
– the RDN, to discover Internet resources
– ZETOC, to discover recent journal articles
• (and, of course, he or she may use a whole range of other search strategies using other services as well)
Issues
• different user interfaces
– look-and-feel
– subject classification, metadata usage
• everything is HTML – human-oriented
– difficult to merge results, e.g. combine into a list of references
– difficult to build a reading list to pass on to students
– need to manually copy-and-paste search results into HTML page or MS-Word document or desktop reference manager or …
The problem space…
• from perspective of ‘data consumer’
– need to interact with multiple collections of stuff - bibliographic, full-text, data, image, video, etc.
– delivered thru multiple Web sites
– few good cross-collection discovery services (apart from Google, but much of the licensed content is part of the invisible Web and therefore not available to Google)
• from perspective of ‘data provider’
– few agreed mechanisms for disclosing availability of content
UK JISC IE context…
• 206 collections and counting… (Hazel Woodward, e-ICOLC, Helsinki, Nov 2001)
– Books: 10,000+
– Journals: 5,000+
– Images: 250,000+
– Discovery tools: 50+ (A&I databases, COPAC, RDN, …)
– National mapping data & satellite imagery
• plus institutional content (e-prints, research data, library content, learning resources, etc.)
• plus content made available thru projects – 5/99, FAIR, X4L, …
• plus …
A solution… the JISC IE
• an ‘information environment’
• framework of machine-oriented services allowing the end-user to
– discover, access, use, publish resources across a range of content providers
– move away from lots of stand-alone Web sites...
• content providers expose metadata for
– searching, harvesting, alerting
• develop end-user services and tools that bring stuff together…
• …based on open ‘standards’
End-user services and tools
• generally means ‘portals’…
– ‘library’ portals or metasearch tools (e.g. Encompass, MetaLib or ZPortal)
– ‘subject’ portals developed within academia
– ‘institutional’ portals (uPortal)
• …but also other stuff
– reading list and other tools in VLE (possibly externally hosted, e.g. Sentient Discover)
– commercial/publisher services (ISI Web of Knowledge, ingenta, Bb Resource Center, etc.)
– OpenURL resolvers (e.g. SFX)
– personal desktop reference manager (e.g. Endnote)
Link resolvers
• ‘discovery’ is only part of the problem…
• in the case of books, journals, journal articles, end-user wants access to the most appropriate copy
• need to join up discovery services with access/delivery services (local library OPAC, ingenta Journals, Amazon, etc.)
• need localised view of available services
• linking services that provide access to the most appropriate copy
– user and institutional preferences, cost, access rights, location, etc.
Technologies
• global, standards-based, cross-domain solutions – Web services…
• cross-searching
– Z39.50 – Bath Profile, a profile of Z39.50
– SRW (Search and Retrieve Web-service) (SOAP implementation of Z39.50)
• harvesting
– OAI-PMH - Open Archives Initiative Protocol for Metadata Harvesting
• alerting
– RSS - RDF/Rich Site Summary
• linking
– OpenURL
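To make the harvesting option a little more concrete: an OAI-PMH request is an ordinary HTTP GET carrying a `verb` and a few arguments. The sketch below only builds such a request URL (it does not contact anything). The repository address and the helper name `oai_list_records_url` are made up for illustration; `ListRecords`, `metadataPrefix`, `set` and `from` are the protocol's own argument names.

```python
from urllib.parse import urlencode


def oai_list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec      # restrict the harvest to one set
    if from_date:
        params["from"] = from_date    # only records changed since this date
    return base_url + "?" + urlencode(params)


# Hypothetical eprint archive address, used purely as an example.
harvest_url = oai_list_records_url("http://eprints.example.ac.uk/oai", from_date="2004-01-01")
print(harvest_url)
```

A harvester would fetch this URL on a schedule and parse the XML response; the point here is only that the 'machine-oriented' interface is a plain, predictable URL.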
Library
• management of collection
– migration from hardcopy journals to externally held e-collection
• management of institutional assets - eprint archives, edata archives
– role in cataloguing assets
• catalogue - gateway/portal to external (and internal) content
• ‘managing agent’ for ATHENS
• preservation
• link resolvers
Linking within the ‘collection’
• the context
– distributed information environment (e.g. the JISC IE)
– multiple A&I and other discovery services
– rapidly growing e-journal collection
– need to interlink available resources
• the problem
– links controlled by external info services
– links not sensitive to user’s context (appropriate copy problem)
– links dependent on vendor agreements
– links don’t cover complete collection
The problem
• the context
– distributed information environment (e.g. the JISC IE)
– multiple A&I and other discovery services
– rapidly growing e-journal collection
– need to interlink available resources
• the REAL problem (a library perspective!)
– libraries have no say in linking
– libraries losing core part of ‘organising information’ task
– expensive collection not used optimally
– users not well served
The solution…
• do NOT hardwire a link to a single service on the referenced item (e.g. a link from an A&I service to the corresponding full-text)
• BUT rather
– provide a link that transports metadata about the referenced item (an OpenURL)
– to another service that is better placed to provide appropriate links (an OpenURL resolver, or link server)
Non-OpenURL linking
[Diagram: the link source (e.g. an A&I service) resolves the metadata of a reference into a link (typically a URL) to the referenced work at a single link destination (e.g. a document delivery service)]
OpenURL linking
[Diagram: the link source (e.g. an A&I service) provides an OpenURL for a reference; the OpenURL transports metadata & identifiers to a user-specific (institution) OpenURL resolver, which performs context-sensitive resolution of metadata & identifiers into services, i.e. links to several link destinations (e.g. a document delivery service)]
Example 1
• journal article
• from Web of Science to ingenta Journals
[Screenshot: button indicating OpenURL ‘link’ is available]
[Screenshot: OpenURL resolver offering context-sensitive links, including link to ingenta]
[Screenshot: also links to other services such as Google search for related information]
Example 2
• book
• from University of Bath OPAC to Amazon
[Screenshot: button indicating OpenURL ‘link’ is available]
[Screenshot: OpenURL resolver offering context-sensitive links, including link to Amazon]
[Screenshot: also links to other services such as Google search for related information]
Summary…
[Diagram: OpenURL sources (ISI Web of Science, University of Bath OPAC) link via the OpenURL resolver to OpenURL targets (ingenta, Amazon)]
OpenURL summary
• standard for linking ‘discovery’ services to ‘delivery’ services
• supports linking from OpenURL ‘source’ to OpenURL ‘target’ via OpenURL ‘resolver’
[Diagram: End-user → source (e.g. Web of Science) → resolver → target (e.g. ingenta)]
• example OpenURL (the BASEURL is http://www.bath.ac.uk/openurl):
http://www.bath.ac.uk/openurl?genre=article&atitle=Information%20gateways:%20collaboration%20on%20content&title=Online%20Information%20Review&issn=1468-4527&volume=24&spage=40&epage=45&artnum=1&aulast=Heery&aufirst=Rachel
http://www.niso.org/committees/committee_ax.html
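A minimal sketch of what an OpenURL 'source' has to do: take the metadata it already holds about a reference and append it, URL-encoded, to the resolver's base URL. The field names and values are those of the example OpenURL on this slide; the helper name `build_openurl` is invented here, and real sources will differ in which fields they send.

```python
from urllib.parse import urlencode, quote, urlsplit, parse_qs


def build_openurl(base_url, **metadata):
    """Return an OpenURL: the resolver's base URL plus metadata as a query string."""
    query = urlencode(metadata, quote_via=quote)  # quote() encodes spaces as %20
    return f"{base_url}?{query}"


link = build_openurl(
    "http://www.bath.ac.uk/openurl",
    genre="article",
    atitle="Information gateways: collaboration on content",
    title="Online Information Review",
    issn="1468-4527", volume="24", spage="40", epage="45",
    aulast="Heery", aufirst="Rachel",
)
print(link)

# The resolver does the reverse: it recovers the metadata from the query string.
fields = {key: values[0] for key, values in parse_qs(urlsplit(link).query).items()}
print(fields["aulast"], fields["issn"])
```

Because only the base URL is institution-specific, the same source can serve every institution by swapping in that institution's resolver address.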
Comp. Serv. / Web support
• Content Management System (CMS)
– use of XML
– blogs, wikis, …
• development/delivery of portal - uPortal
• campus infrastructure
– search engine, but often with fairly narrow coverage (home grown (ht://Dig) vs. commercial vs. externally hosted (Google))
– internal AAA (authentication, authorisation and accounting) infrastructure
– network, shared filestore, groupware, etc.
uPortal
• framework for building institutional portal
– single sign-on
– integrated and personalised access to multiple ‘channels’
– portlet model
– multiple standards for ‘portlets’ but some trend towards use of WSRP (Web Services for Remote Portlets)
– RSS often used to carry portlet content
http://www.uportal.org/
What is RSS?
• simple XML application for sharing (syndicating) ‘news’ feeds on the Web
• RDF Site Summary or Rich Site Summary (depending on who you ask)
• ‘news’ can be interpreted quite loosely, e.g. new items added to database, ‘to do’ lists, timetable/lists of meetings, etc.
• uses ‘channel’ and ‘item’ terminology
• a ‘channel’ is an XML document that is made available on a Web-site
– to update the channel, simply update the XML
http://www.eevl.ac.uk/rss_primer/
What is RSS? (2)
• each ‘item’ has simple metadata (title, description) and URL link to resource (news story or whatever)
• RSS also provides channel branding (logo, etc.)
• fairly widespread usage
• easy to use within ‘portals’ (e.g. uPortal)
• lots of software and toolkits available
• some experimental use as encoding format for ‘reading/resource lists’ in e-Learning
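To show how little is involved in consuming a channel, here is a small sketch that reads a channel and its items using only the Python standard library. The feed content, the example.ac.uk URLs and the helper name `read_channel` are invented for illustration, and the feed uses the plain (non-RDF) flavour of RSS; an RDF-based RSS 1.0 feed would need namespace-aware lookups.

```python
import xml.etree.ElementTree as ET

# A made-up channel with two items, in the plain (non-RDF) RSS style.
FEED = """<?xml version="1.0"?>
<rss version="0.91">
  <channel>
    <title>Library news (example)</title>
    <link>http://www.example.ac.uk/library/</link>
    <description>New items added to the catalogue</description>
    <item>
      <title>New e-journal titles</title>
      <link>http://www.example.ac.uk/library/news/1</link>
      <description>Forty titles added this month.</description>
    </item>
    <item>
      <title>Opening hours over Easter</title>
      <link>http://www.example.ac.uk/library/news/2</link>
      <description>Reduced hours apply.</description>
    </item>
  </channel>
</rss>"""


def read_channel(xml_text):
    """Return the channel title and a list of item dicts (title, link, description)."""
    channel = ET.fromstring(xml_text).find("channel")
    items = [
        {
            "title": item.findtext("title"),
            "link": item.findtext("link"),
            "description": item.findtext("description"),
        }
        for item in channel.findall("item")
    ]
    return channel.findtext("title"), items


channel_title, items = read_channel(FEED)
print(channel_title)
for entry in items:
    print("-", entry["title"], entry["link"])
```

A portal channel or a reading-list tool would do essentially this, then render the items in its own look-and-feel.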
e-Learning
• delivery of VLE
• development of ‘managed learning environment’ MLE (integration with MIS/registry, etc.)
• separation of content into backend learning object repositories
• development of whole range of supporting standards, primarily thru IMS
IMS
• global consortium defining open standards for delivery of online distributed learning activities:
– Accessibility
– Competency Definitions
– Content Packaging
– Digital Repositories
– Enterprise
– Learner Information Package
– Meta-data
– Question & Test
– Simple Sequencing
http://www.imsglobal.org/
IMS DRI Specifications
• IMS Digital Repositories Interoperability specifications
• define protocols for interoperability between systems
• machine2machine (i.e. how software components talk to each other over the network)
• how data (and metadata) is transferred
• between VLE and back-end repositories
– learning object repositories and other services that make content available
Why is IMS DRI important?
• VLE (in some shape or form) likely to become one of key ‘presentation’ services in institutional context
• IMS DRI spec’s fill space between ‘information providers’ and VLEs
• same/similar set of ‘digital library’ standards as endorsed by the JISC IE architecture
• VLE-vendors relatively mainstream
• pushes digital library technologies to new and wider audience
e-Learning
• recognition that e-Learning systems are complex objects
– trend towards ‘service oriented architectures’
– breaking the VLE into lots of smaller functional components
• JISC Learning Framework (CETIS), MIT’s OKI, Sakai (alignment of OKI and uPortal)
Service oriented architecture
• analysis of functional components within large-scale institutional services (library, VLE, student records, etc.)
– delivery of separate, small-scale functional ‘services’
– rationalisation of ‘shared’ common services (middleware)
[Diagram: Portal; VLE; Library mgt system; Student record / MIS]
[Diagram: the VLE, Library mgt system and Student record / MIS, each containing its own functional components, with several repeated across systems: Content Management, Discover, Collaboration, Cataloguing, Course management, Authentication, Authorisation, Packaging, Assessment, Grading]
[Diagram: the same components redrawn as separate services (Discover, Collaboration, Assessment, Cataloguing, Grading, Content Management, Packaging, Authentication, Authorisation, Course management) alongside the VLE, Library mgt system and Student record / MIS]
Implementation of SOA
• implementation choices
– J2EE (JavaBeans, etc.)
– JINI (Java APIs)
– .Net
– Web Services (SOAP)
• I think(!)…
– OKI, uPortal, Sakai tend to lean towards Java-based solutions – J2EE
– whereas SOAP perhaps provides a more open, robust and language independent solution?
Web services
• machine-to-machine (m2m) interfaces between services on the Web
• underpin many e-commerce activities and the Grid
• a whole new set of acronyms – SOAP, WSDL, UDDI, WSRP
• based on HTTP and XML (i.e. mainstream Web pedigree)
• Google and Amazon APIs
• support both informational (e.g. search) and transactional (e.g. billing) types of service
But… 2 notes of caution?
• is all this added complexity worthwhile and/or realistic?
– will the market-place support it?
• where does application logic sit in a service oriented architecture?
– need ‘open’ role/process layer to coordinate use of services
MIS, registry, finance, etc.
• source of data for other ‘service’ areas
• recipient of data from other ‘service’ areas
• ‘data glue’ that holds everything else together
• interest in developing a ‘portal’ to MIS-related functions
– this group has driven uPortal in US
– but in UK?
Bringing it all together…
[Diagram: the five institutional service areas: e-Learning; Library; Computing Services / Web support team; MIS/Registry/Finance/etc.; External content]
Knowledge management?
• end-user is faced with a range of different human-oriented Web interfaces within the institution
• might need to perform same search (e.g. ‘hiv aids’) against
– Web site search engine
– library OPAC
– VLE
– institutional eprint archive
– personal email archives
• and manually collate results from each, etc.
HTTP links…
[Diagram: the five institutional service areas joined by http://… URLs]
HTTP links
• easy for end-users to create
• bit of a mess…
• …but deep-links (e.g. direct to book record in the library catalogue) will reduce need for end-user to repeat searches
• ongoing maintenance problem – checking for dead links
• hard-wired links are same for everyone
Robot/search engine…
[Diagram: a Web robot / search engine indexing the five institutional service areas over HTTP/HTML]
Robot/search engine
• easily deployed technology
• lots of choice in the market-place (Open source, commercial, externally hosted, etc.)
• not everything is visible to the robot – even within institution (invisible Web)
• can’t easily index some formats (images, video, data, etc.)
• limited search functionality (because of lack of metadata)
• well understood by the end-user
OpenURL links…
[Diagram: the five institutional service areas joined by OpenURL links (URLs)]
OpenURL links
• context sensitive – OpenURL resolver can offer different set of links to each user
• harder to create links
• more persistent – change resolver, rather than each link
• up-front investment in OpenURL resolver software (but typically from the library budget!)
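The 'context sensitive' and 'more persistent' points can be illustrated with a toy resolver. This is only a sketch of the idea, not how SFX or any real resolver is implemented: the user groups, target names, example.* URLs and the `TARGETS` table are all hypothetical. The same OpenURL yields different links depending on who is asking, and changing a target means editing the resolver's configuration rather than every link.

```python
from urllib.parse import urlsplit, parse_qs, quote

# Hypothetical institutional configuration: which targets each user group is offered.
TARGETS = {
    "staff": ["publisher full text", "library catalogue", "web search"],
    "visitor": ["library catalogue", "web search"],
}


def resolve(openurl, user_group):
    """Turn the metadata carried by an OpenURL into a list of (label, url) links."""
    meta = {key: values[0] for key, values in parse_qs(urlsplit(openurl).query).items()}
    links = []
    for target in TARGETS.get(user_group, []):
        if target == "publisher full text" and "issn" in meta:
            links.append((target, "http://fulltext.example.com/issn/" + meta["issn"]))
        elif target == "library catalogue" and ("issn" in meta or "isbn" in meta):
            ident = meta.get("issn") or meta.get("isbn")
            links.append((target, "http://opac.example.ac.uk/search?id=" + quote(ident)))
        elif target == "web search" and "atitle" in meta:
            links.append((target, "http://search.example.com/?q=" + quote(meta["atitle"])))
    return links


# One hard-wired-looking link, two different sets of services.
same_link = ("http://resolver.example.ac.uk/openurl"
             "?genre=article&issn=1468-4527&atitle=Information%20gateways")
staff_links = resolve(same_link, "staff")
visitor_links = resolve(same_link, "visitor")
for label, url in staff_links:
    print(label, url)
```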
Portal – Web services…
[Diagram: uPortal framework presenting aggregated content (distributed search) from the five institutional service areas]
Portal – Web services
• aggregate content from multiple sources using distributed search and/or metadata harvesting – SOAP, Z39.50, OAI-PMH
• typically adopted by ‘library’ portals
• likely to be approach adopted by external information suppliers
• likely to be supported by ‘repository’ software within institution
• but may require some in-house development in short term
• learning curve to overcome
Portal – portlets…
[Diagram: uPortal framework presenting an aggregated user-interface (WSRP) from the five institutional service areas]
Portal – portlets
• aggregate pieces of user-interface from multiple sources
• Web services for Remote Portlets (WSRP)
• RSS
• uPortal appears to be technology leader
Summary
• all approaches listed above likely to be used/adopted in some form
• RSS and OpenURL appear to be gaining wide adoption
• fairly widespread interest in uPortal
• standardisation happening in multiple arenas (W3C, NISO, IMS, IEEE, OASIS) – tracking standards not easy
• deployment happening in multiple places within institutions – so needs strategic overview of joined up service provision
Questions?