The Archives Hub ~ Interoperability, Spokes and the Distributed Model

Upload: jane-stevenson

Post on 22-Apr-2015

Page 1: Hub Distributed Model 2009

The Archives Hub ~ Interoperability, Spokes and the Distributed Model

Page 2: Hub Distributed Model 2009

The Hub in a Nutshell

• Based at Mimas, University of Manchester

• In service since 2000

• Over 23,000 collection descriptions

• 170 repositories

• JISC funded

• Management and service team at Manchester

• Development team at Liverpool

• Cheshire software

• Cheshire for Archives – works with EAD descriptions

• Distributed system

Page 3: Hub Distributed Model 2009

Hub Workshop 2009

Content and contributors

• Strategic aim: build and enhance content

• Meeting the needs of the UK research community

• Meeting the needs of the wider community

• Archives for education and research

Flickr cc licence: eileenaway's photostream

The success of the Hub is a reflection of the rich content available from Hub contributors

Page 4: Hub Distributed Model 2009

Current contributors

• Higher/Further Education

• Consortium contributions

• Institutions with a research agenda

• Others on a case-by-case basis

• We encourage institutions to contact us

John Rylands Library, Manchester

Page 5: Hub Distributed Model 2009

Collection or lower-level…?

• Originally funded for collection-level

• Software/searches effective with both

• Complementary approaches

• Researchers ask for detail

• Images useful at item level

Flickr cc licence: Muffet’s photostream

Flickr cc licence: soylentgreen23’s photostream

Page 6: Hub Distributed Model 2009

JISC Information Environment

… a vast and sometimes bewildering range of potential sources of electronic information. Each source of information has its own name, its own interface, features and search facilities. Little wonder, then, that many users remain unaware of their existence or fail to discover their value for their own learning, teaching or research.

A key challenge is therefore to achieve a managed, coherent and shared information environment that will overcome these obstacles.

Being able to cross-search and use customised, value added and other services will considerably simplify users’ interactions with online resources. This should encourage take-up and greatly improve means of accessing these resources.

…these activities need to be based on standards for the creation, access, use, preservation and interoperability of networked resources.

http://www.jisc.ac.uk/index.cfm?name=ie_home

Page 7: Hub Distributed Model 2009

JISC Information Environment

Most content providers will already offer a Web site through which end-users can access their content. To be a part of the JISC-IE, content providers also need to support machine oriented interfaces to their resources.

1. Support searching using Z39.50/SRW

2. Support metadata harvesting using OAI-PMH

Andy Powell

5 step guide to becoming a content provider in the JISC Information Environment

http://www.ariadne.ac.uk/issue33/info-environment
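
As a rough illustration of the second machine interface, the sketch below builds OAI-PMH `ListRecords` request URLs. The `verb`, `metadataPrefix`, `from` and `resumptionToken` arguments come from the OAI-PMH protocol; the endpoint address and the function name are assumptions made for this example, not part of the JISC-IE guidance.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration; not taken from the slides.
BASE_URL = "https://example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token is not None:
        # resumptionToken is an exclusive argument: it is sent with the
        # verb and nothing else.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date is not None:
            # Selective harvesting: only records changed since this date.
            params["from"] = from_date
    return base_url + "?" + urlencode(params)


first_page = list_records_url(BASE_URL, from_date="2009-01-01")
next_page = list_records_url(BASE_URL, resumption_token="abc123")
print(first_page)
print(next_page)
```

A harvester would request the first URL, then keep following resumption tokens until the provider stops returning one.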

Page 8: Hub Distributed Model 2009

E-GIF, open source and open standards

e-GIF version 6.1 (18th March 2005)

– The e-Government Interoperability Framework (e-GIF) sets out the government’s technical policies and specifications for achieving interoperability… across the public sector.

– There is a strategic decision to adopt XML and XSL as the core standards for data integration and management.

– It is a pragmatic strategy that aims to reduce cost and risk for government systems while aligning them to the global Internet revolution.

http://www.govtalk.gov.uk/documents/eGIF%20v6_1(1).pdf

Open Source, Open Standards and Re-Use: Government Action Plan

http://www.netvibes.com/cabinetoffice#Open_Source

Page 9: Hub Distributed Model 2009

Isn’t technology brilliant?!!

• Technical know-how

• XML

• Data creation/editing template

• Web interface

• Machine interfaces

• Distributed model

• Web 2.0

• Dissemination

+ understanding users

= Satisfying user experience

Page 10: Hub Distributed Model 2009

Hub Data Flow

• Sustainable model

• Data held as XML

• Efficient search mechanism

• Flexible access

• Easy to become a Spoke

Page 11: Hub Distributed Model 2009

The Distributed Hub

Flickr cc licence: Thomas Hawk

The main goal of a distributed computing system is to connect users and resources in a transparent, open, and scalable way. Ideally this arrangement is drastically more fault tolerant and more powerful than many combinations of stand-alone computer systems.

[Wikipedia]

• Administration interface

• Customisable web front-end

• Machine-to-machine interfaces

• Data Creation Template

• Local control

• Technical support locally

• Hub team support

Page 12: Hub Distributed Model 2009

Spokes software

• Offers a means of storing and sharing archival descriptions in XML

• Provides machine-to-machine access to the descriptions: Z39.50 and SRU (Search/Retrieve via URL) for searching, and OAI-PMH for harvesting records

• Provides a customisable Web search interface

• Is open source and based on open standards

• Includes a data creation and editing template
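
To make the SRU interface concrete, here is a minimal sketch of how another application might build a `searchRetrieve` request. The parameter names (`operation`, `version`, `query`, `startRecord`, `maximumRecords`) are standard SRU; the endpoint address, the version value and the function name are assumptions for illustration, since a real Spoke publishes its own base URL.

```python
from urllib.parse import urlencode

# Hypothetical Spoke endpoint; a real Spoke publishes its own SRU base URL.
SRU_BASE = "https://spoke.example.org/sru/ead"


def sru_search_url(base_url, cql_query, start_record=1, maximum_records=10):
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",  # assumed; use the version the server advertises
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    # urlencode percent-encodes the CQL query so it is safe inside a URL.
    return base_url + "?" + urlencode(params)


url = sru_search_url(SRU_BASE, 'dc.title = "papers"')
print(url)
```

Because the whole request is a URL, it can be tested in a browser before being wired into another service.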

Page 13: Hub Distributed Model 2009

Anatomy of a Spoke

[Diagram: anatomy of a Spoke]

• EAD XML files

• Cheshire indexes of EAD data

• Web search interface (HTTP)

• Z39.50 and SRU: direct searching access for other applications through standards-based machine-to-machine protocols, including the central Hub!

Page 14: Hub Distributed Model 2009

Spokes indexes

The database will provide indexes based on the following standards:

Data standard          Data field(s)
cql.anywhere           full text
dc.description         unittitle, controlaccess, and scopecontent fields
dc.title               collection title (titleproper)
dc.creator             creator of the collection
dc.identifier          eadid
dc.date                unitdate
dc.subjects            controlaccess fields
bath.name              personal, family, corporate and geographic names
bath.personalName      personal names
bath.corporateName     corporate names
bath.geographicName    geographic names
bath.genreForm         genre
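
The index names above are what a client would use in a CQL query. The sketch below assembles such a query from individual clauses; the helper names and the search terms are invented for illustration, and whether a given Spoke supports a relation such as `any` on a particular index is an assumption to check against the server.

```python
def cql_clause(index, term, relation="="):
    """Return one CQL search clause with the term safely quoted."""
    # Escape backslashes first, then double quotes, as CQL expects.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'{index} {relation} "{escaped}"'


def cql_and(*clauses):
    """Join clauses with the CQL boolean operator 'and'."""
    return " and ".join(f"({clause})" for clause in clauses)


# Illustrative search: a personal name plus either of two description words.
query = cql_and(
    cql_clause("bath.personalName", "Gaskell"),
    cql_clause("dc.description", "letters correspondence", relation="any"),
)
print(query)
```

The resulting string is what goes into the `query` parameter of an SRU request.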

Page 15: Hub Distributed Model 2009

Administration Interface

http://spoke.mimas.archiveshub.ac.uk/ead/admin/

Page 16: Hub Distributed Model 2009
Page 17: Hub Distributed Model 2009

Page 18: Hub Distributed Model 2009

Liverpool Spoke

Page 19: Hub Distributed Model 2009

John Rylands Spoke

Page 20: Hub Distributed Model 2009

Agreement with Spokes

Page 21: Hub Distributed Model 2009

Hosted Spokes

• Spokes at Manchester

– Configuration

– Agreement between parties

• Manchester team undertake agreed level of support

• Institution still responsible for the data

Page 22: Hub Distributed Model 2009

Being an Archives Hub Spoke…

• Gives you control over your own EAD files

– Allows you to update and add new files when you need to

• Exposes your EAD to other applications which need to cross-search the descriptions

– Using standards-compliant methods

• Means you benefit from using software that has been developed with the Archives Hub community

Page 23: Hub Distributed Model 2009

Collaboration & Sharing

• Networks and communities – the National Archives Network

• Cross-service and cross-domain collaboration

– Copac

– Intute

– Digitisation Projects

• Expand and share content

– import/export/M2M

• Links to other archive services

– NRA

Page 24: Hub Distributed Model 2009

The National Archives Network

‘Our vision of the future of British archives is of a flow of archival information which takes account of all the opportunities offered by digital networks and offers opportunity for exploration - historical, personal, social - to the broadest possible range of people wherever they can use it - in the home, the classroom or the office.’

British Archives: The Way Forward (NCA, 2000)

A comprehensive national resource discovery mechanism

Page 25: Hub Distributed Model 2009

The importance

‘There can be no higher priority for archives than the creation of this collaborative electronic network, overcoming the limitations of geography, crossing the many archival sectors and creating a truly unified digital directory or encyclopaedia of British historical documents.’

British Archives: The Way Forward (NCA, 2000)

Page 26: Hub Distributed Model 2009

National Archives Network

Page 27: Hub Distributed Model 2009

The opportunity

‘Outreach has been a developing preoccupation for archives in recent years, but the arrival of the internet age provides the opportunities to take archives, as never before, to the doorstep of the community at large.’

British Archives: The Way Forward (NCA, 2000)

Page 28: Hub Distributed Model 2009

Progress of the NAN

• Many archives took part in this drive towards a national archives network

• …many are still taking part

• The importance of recognised standards

• Intention to create collection level catalogues of all substantial collections within a defined timeframe

Page 29: Hub Distributed Model 2009

Success of the NAN

• Strands of the national archives network provide access to archives that were previously inaccessible

• The HLF has played a major role in enabling access and online discovery

• Users of archives have benefited enormously

• Data standards have become of central importance

Page 30: Hub Distributed Model 2009

Shortcomings of the NAN

• We don’t have a single national network

• Differences in data structure; content; search capabilities; look and feel

• Strands are not fully interoperable

• Politics, funding and willpower may not combine in favour of this approach

• The landscape has changed substantially since 2000 – maybe this solution is no longer appropriate?

Page 31: Hub Distributed Model 2009

The NAN today

• Many ‘strands’

• Only a few use EAD (support EAD export)

• Lack of funding for a joint solution

Key is interoperability and machine-to-machine interfaces:

• NAN as a community, sharing knowledge and experiences

• NAN as a promoter of standards and facilitator for data sharing

• NAN strands as promoters of flexible and open approaches

Page 32: Hub Distributed Model 2009

The Interoperable Hub

The ability of software and hardware on different machines to share data

• Content standards

• Structural standards

• Validation of content

• Data Editor

• Training and awareness

• Contributor responsibility

• Networking and community building

Page 33: Hub Distributed Model 2009

Page 34: Hub Distributed Model 2009

Machine-to-machine interfaces

• Web access is just one means of access to the data

• Machine access provides flexible access, so people can set their own agendas

– Z39.50

– SRU

– OAI-PMH (harvester)

• Need to provide semantic data – properly marked up, well-structured
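
One reason well-structured data matters is that a harvesting program, not a person, reads the response. The sketch below pulls record identifiers and the resumption token out of an OAI-PMH `ListRecords` response. The namespace is the standard OAI-PMH 2.0 one; the sample XML is hand-written and trimmed, and its identifiers are made up for illustration.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response, trimmed for illustration; identifiers are made up.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:rec-1</identifier>
        <datestamp>2009-03-01</datestamp>
      </header>
    </record>
    <record>
      <header>
        <identifier>oai:example.org:rec-2</identifier>
        <datestamp>2009-03-02</datestamp>
      </header>
    </record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return (identifiers, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        el.text
        for el in root.findall(".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    # An absent or empty token means the list is complete.
    token = token_el.text if token_el is not None and token_el.text else None
    return identifiers, token


identifiers, token = parse_list_records(SAMPLE)
print(identifiers, token)
```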

Page 35: Hub Distributed Model 2009

Pilot project for SRU: Genesis portal for Women’s Studies

• Hub hosts data

• Genesis searches the Hub using SRU

• Implications for data – how to search just for appropriate descriptions?

• Possible issues with search speeds
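
One possible way to search just for appropriate descriptions is for the portal to wrap every user query in a fixed scoping clause before sending it over SRU. The sketch below shows that idea and reads the hit count from a response. It is not a description of how the Genesis pilot actually works: the scoping clause, the sample response and the function names are all assumptions, and only the SRU response namespace and the `numberOfRecords` element come from the standard.

```python
import xml.etree.ElementTree as ET

SRW_NS = {"srw": "http://www.loc.gov/zing/srw/"}

# Hand-written sample SRU response, trimmed for illustration.
SAMPLE_RESPONSE = """<srw:searchRetrieveResponse xmlns:srw="http://www.loc.gov/zing/srw/">
  <srw:version>1.1</srw:version>
  <srw:numberOfRecords>3</srw:numberOfRecords>
</srw:searchRetrieveResponse>"""


def scope_query(user_query, scope_clause):
    """Restrict a user's CQL query to descriptions matching scope_clause."""
    return f"({user_query}) and ({scope_clause})"


def number_of_records(response_xml):
    """Read the total hit count from an SRU searchRetrieve response."""
    root = ET.fromstring(response_xml)
    text = root.findtext("srw:numberOfRecords", default="0", namespaces=SRW_NS)
    return int(text)


# Hypothetical scope: only descriptions indexed with a relevant subject term.
scoped = scope_query('cql.anywhere all "suffrage petition"', 'dc.subjects any "women"')
print(scoped)
print(number_of_records(SAMPLE_RESPONSE))
```

A scoping clause like this depends on the descriptions carrying consistent subject terms, which is the data implication the slide raises.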

Page 36: Hub Distributed Model 2009

Persistent Identifiers

• All Hub descriptions have their own identifiers – a unique reference

• Gives them their own web address – can point to any description

• Facilitates linking, e.g. from National Register of Archives

• Enables bookmarking of content

http://www.archiveshub.ac.uk/arch/glossary.shtml#identifier

Page 37: Hub Distributed Model 2009

Challenges (of which there are many)

• Understanding our users

• Encouraging item-level descriptions

• Encouraging images/links to content

• Which technology?

• Perceptions of relevancy

• Understanding Impact

• Sustainability

Flickr cc licence: hoodwink’s photostream

Page 38: Hub Distributed Model 2009

Moving Forward

• Increasing content and contributors

• Branding and new Website

• More engagement with users / user generated content

• Continuing to be standards-based, open and interoperable