Accessing Multiple Resources via Z39.50

Paul Miller

Interoperability Focus
UK Office for Library & Information Networking (UKOLN)

P.Miller@ukoln.ac.uk
http://www.ukoln.ac.uk/

UKOLN is funded by the Library and Information Commission, the Joint Information Systems Committee (JISC) of the Higher Education Funding Councils, as well as by project funding from the JISC and the European Union.

UKOLN also receives support from the Universities of Bath and Hull where staff are based.

Outline

• What is Z39.50?

• Some gory details
– Attribute Sets, Profiles, and all…

• Maintenance and development

• What’s wrong with Z39.50?

• The Bath Profile

• The New Attribute Architecture

• How it’s used

• Tools, registries, etc.

See http://www.ariadne.ac.uk/issue21/z3950/

What is Z39.50?

• ANSI/NISO Z39.50–1995, Information Retrieval (Z39.50): Application Service Definition and Protocol Specification

• ISO 23950:1998, Information and Documentation — Information Retrieval (Z39.50) — Application Service Definition and Protocol Specification.

See http://lcweb.loc.gov/z3950/agency/1995doce.html

What is Z39.50?

“This standard specifies a client/server based protocol for Information Retrieval. It specifies procedures and structures for a client to search a database provided by a server, retrieve database records identified by a search, scan a term list, and sort a result set. Access control, resource control, extended services, and a ‘help’ facility are also supported. The protocol addresses communication between corresponding information retrieval applications, the client and server (which may reside on different computers); it does not address interaction between the client and the end-user.”

(Z39.50–1995, page 0).

See http://lcweb.loc.gov/z3950/agency/1995doce.html

Some gory details…

• Z39.50 follows the client/server model

• The client is called the Origin and the server the Target

[Diagram labels: Client/origin, Server/target]

Client/Server architecture

[Figures: two client/server architecture diagrams]

Some gory details…

• Z39.50–1995 is divided into eleven ‘Facilities’
– Initialization
– Search
– Retrieval
– Result–set–delete
– Browse
– Sort
– Access Control
– Accounting
– Explain
– Extended Services
– Termination.

See http://www.ariadne.ac.uk/issue21/z3950/

Facilities and Services

• Each Facility comprises at least one Service

• A Service facilitates a particular interaction between Origin and Target

• The three key services are Init, Search, and Present.

See http://www.ariadne.ac.uk/issue21/z3950/

Init

• The only Service of the Initialization Facility

• Origin–initiated

• Used to start a ‘Z–association’

• Origin requests a number of parameters under which the searches will be conducted

• Target responds, either accepting offered parameters or proposing others if necessary.
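
The offer-and-response step can be sketched as a toy model in Python. This is not the wire protocol: the parameter names (`preferred_message_size`, `maximum_record_size`) and the rule that the Target caps each offered value at its own limit are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InitParams:
    """Parameters an Origin offers when opening a Z-association (illustrative names)."""
    preferred_message_size: int
    maximum_record_size: int


def negotiate(offered: InitParams, target_limits: InitParams) -> InitParams:
    """Target side: accept each offered value, or propose a smaller one if it exceeds a limit."""
    return InitParams(
        preferred_message_size=min(offered.preferred_message_size,
                                   target_limits.preferred_message_size),
        maximum_record_size=min(offered.maximum_record_size,
                                target_limits.maximum_record_size),
    )


# The Origin's offer and the Target's own limits (made-up numbers).
offer = InitParams(preferred_message_size=65536, maximum_record_size=1048576)
limits = InitParams(preferred_message_size=32768, maximum_record_size=2097152)

agreed = negotiate(offer, limits)
print(agreed)
```

Here the Target accepts the offered record size unchanged but proposes a smaller message size, which is the "accepting or proposing others" behaviour in miniature.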

Search

• The only Service of the Search Facility

• Origin–initiated

• Used to actually conduct a search

• Origin specifies databases to be searched, attribute combinations, and query

• Target responds, identifying the number of matching results.

Present

• Main Service of the Retrieval Facility (along with Segment)

• Origin–initiated
– although Target can initiate a Segment request if the result set is very large

• Used to return records to the user.

Init for dummies

Origin: Hello. Do you speak English?

Target: Hello. Yes, I do. Let’s talk.

Search for dummies

Origin: Cool. Can I have anything you’ve got on a place called “Bristol”?

Target: I’ve got 25 records matching your request, and here’s the first five. As you didn’t specify anything else, I’ve sent them to you in MARC, so I hope that’s OK.

Present for dummies

Origin: 25, eh? Can I have the first ten, please? Oh, and I really don’t like MARC. If you can send Dublin Core that would be great, and if not I’ll settle for some SUTRS.

Target:
DC:Creator – blah
DC:Title – blah
…
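
The three-step conversation (Init, then Search, then Present) can be mimicked with a small in-memory stand-in for a Target. Everything here is hypothetical: the `ToyTarget` class, its method names, and the sample records are invented for illustration and say nothing about how a real Z39.50 server is built. Unlike the conversation above, this sketch does not send any records back with the Search response.

```python
class ToyTarget:
    """In-memory stand-in for a Z39.50 Target (not a real protocol implementation)."""

    def __init__(self, records):
        self.records = records      # list of dicts with 'title' and 'creator'
        self.result_set = []        # filled by search(), read by present()

    def init(self, language):
        # Init: agree to talk only if the Target supports what the Origin offers.
        return language == "English"

    def search(self, term):
        # Search: build a result set and report how many records matched.
        self.result_set = [r for r in self.records
                           if term.lower() in r["title"].lower()]
        return len(self.result_set)

    def present(self, start, count, syntax="SUTRS"):
        # Present: return a slice of the result set in the requested syntax.
        chosen = self.result_set[start - 1:start - 1 + count]
        if syntax == "DC":
            return [{"DC:Creator": r["creator"], "DC:Title": r["title"]}
                    for r in chosen]
        # Fall back to plain text, one line per record.
        return [f'{r["title"]} / {r["creator"]}' for r in chosen]


# 25 records about Bristol plus one that should not match.
records = [{"title": f"Bristol survey {n}", "creator": "blah"} for n in range(1, 26)]
records.append({"title": "Bath guide", "creator": "blah"})
target = ToyTarget(records)

accepted = target.init("English")
hits = target.search("Bristol")
first_ten = target.present(1, 10, syntax="DC")
print(accepted, hits, len(first_ten), first_ten[0])
```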

Now it gets hairy…

• To communicate successfully, Origin and Target need to use the same Attribute Set.

• An Attribute Set like Bib–1 defines six forms of Attribute:
– Use
– Relation
– Truncation
– Completeness
– Position
– Structure.
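
In Bib–1 each of the six attribute types has a numeric identifier (Use 1, Relation 2, Position 3, Structure 4, Truncation 5, Completeness 6), and a search term carries a list of type=value pairs. The sketch below writes such a list in Prefix Query Format, a textual notation used by the YAZ toolkit that the slides themselves do not mention. The helper name `pqf` is hypothetical, and the term is quoted naively (embedded quotes are not escaped).

```python
# Bib-1 attribute types and their numeric identifiers.
BIB1_TYPES = {
    "use": 1,
    "relation": 2,
    "position": 3,
    "structure": 4,
    "truncation": 5,
    "completeness": 6,
}


def pqf(term: str, **attributes: int) -> str:
    """Render one search term with its Bib-1 attributes as a PQF string."""
    parts = []
    for name, value in attributes.items():
        parts.append(f"@attr {BIB1_TYPES[name]}={value}")
    # Quote the term so that a multi-word phrase stays a single operand.
    parts.append(f'"{term}"')
    return " ".join(parts)


# Title (use=4), equal (relation=3), any position in field (position=3),
# phrase (structure=1), do not truncate (truncation=100),
# incomplete subfield (completeness=1).
query = pqf("Bristol", use=4, relation=3, position=3,
            structure=1, truncation=100, completeness=1)
print(query)
```

Spelling out all six attributes, rather than relying on whatever a Target assumes by default, is the habit the Bath Profile later asks for.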

Use Attributes

• Define the ‘access points’ on which a search takes place
– Title, author, subject, etc.

See http://lcweb.loc.gov/z3950/agency/defns/bib1.html

Relation Attributes

• Define the relationship between the search term and values stored in the database/index
– Less than, greater than, equal to, phonetically matched, etc.

Truncation Attributes

• Define which part of the stored value is to be searched on
– Beginning of any word, end of any word, etc.
– With right truncation, ‘Smith’ finds ‘Smithsonian’ but not ‘Wordsmith’; with left truncation, the reverse.
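
The ‘Smith’ example can be checked with a small matcher. This is a simplified sketch that compares single words only; the mode names `right`, `left` and `none` are labels chosen here, not Bib–1 attribute values.

```python
def matches(term: str, stored: str, truncation: str) -> bool:
    """Compare a search term with a stored word under a truncation rule."""
    term, stored = term.lower(), stored.lower()
    if truncation == "right":   # the stored word may continue after the term
        return stored.startswith(term)
    if truncation == "left":    # the stored word may have text before the term
        return stored.endswith(term)
    if truncation == "none":    # the stored word must equal the term
        return stored == term
    raise ValueError(f"unknown truncation: {truncation}")


words = ("Smith", "Smithsonian", "Wordsmith")
for mode in ("right", "left", "none"):
    print(mode, [w for w in words if matches("Smith", w, mode)])
```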

Completeness Attributes

• Define how much of the stored index term must be in the search term
– ‘Smith’ finds ‘Smith’, but not ‘Smithsonian’ or ‘the Smith’, etc.
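
A minimal sketch of the difference, assuming two simplified modes: a "complete" match, where the term must equal the whole stored field, and an "incomplete" match, where it only has to be one of the words in the field. The function names are hypothetical.

```python
def complete_match(term: str, field: str) -> bool:
    """Complete: the term must equal the whole stored field."""
    return term.lower() == field.lower()


def incomplete_match(term: str, field: str) -> bool:
    """Incomplete: the term only needs to be one of the words in the field."""
    return term.lower() in field.lower().split()


fields = ["Smith", "Smithsonian", "the Smith"]
complete_hits = [f for f in fields if complete_match("Smith", f)]
incomplete_hits = [f for f in fields if incomplete_match("Smith", f)]
print(complete_hits, incomplete_hits)
```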

Position Attributes

• Define where in the index the search term should be located
– At the start of the field, anywhere, etc.
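
As a rough illustration, a position rule can be modelled as a check on where the term falls among the words of a stored field. The mode names `first-in-field` and `any` are labels used in this sketch only.

```python
def position_match(term: str, field: str, position: str) -> bool:
    """Check whether the term occurs at the required position in the field."""
    words = field.lower().split()
    term = term.lower()
    if position == "first-in-field":   # term must be the first word
        return bool(words) and words[0] == term
    if position == "any":              # term may be any word in the field
        return term in words
    raise ValueError(f"unknown position: {position}")


print(position_match("Smith", "Smith of Bristol", "first-in-field"))
print(position_match("Smith", "the Smith", "first-in-field"))
print(position_match("Smith", "the Smith", "any"))
```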

Structure Attributes

• Specify the form to be searched for
– Word, phrase, date, etc.

Record Syntaxes

• Record Syntaxes define the structure in which results are returned to the Origin.
– This does not mean that Targets need to store data in these formats

• MARC
– UKMARC, USMARC/MARC21, DANMARC, MARB, UNIMARC…

• SUTRS
– Simple Unstructured Text Record Syntax

• GRS–1
– Generic Record Syntax

• XML.
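
The point that storage and record syntax are independent can be shown with a sketch in which a Target keeps a record as a plain Python dict and renders it only when asked. The element names, the `present` helper, and the idea of passing an ordered list of preferred syntaxes are assumptions made for this example; the list simply mirrors the "Dublin Core, or failing that SUTRS" request in the earlier conversation.

```python
from xml.etree import ElementTree as ET

# Internal storage format: a plain dict, unrelated to any record syntax.
RECORD = {"title": "Bristol: a history", "creator": "blah"}


def to_sutrs(record: dict) -> str:
    """Render as unstructured text, one 'label: value' line per field."""
    return "\n".join(f"{key}: {value}" for key, value in record.items())


def to_xml(record: dict) -> str:
    """Render as a small XML document with one element per field."""
    root = ET.Element("record")
    for key, value in record.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


RENDERERS = {"SUTRS": to_sutrs, "XML": to_xml}


def present(record: dict, preferred: list) -> tuple:
    """Return (syntax, rendering) for the first requested syntax that is supported."""
    for syntax in preferred:
        if syntax in RENDERERS:
            return syntax, RENDERERS[syntax](record)
    raise LookupError("none of the requested record syntaxes is supported")


print(present(RECORD, ["MARC", "XML", "SUTRS"]))
```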

Profiles

• Groupings of Attribute Sets, Record Syntaxes, etc. to meet specific needs

• Disciplinary
– Cultural Heritage (CIMI)
– Geospatial (GEO)

• Geographic/Cultural/National
– Texas Profile
– OPAC Network for Europe (ONE)
– Conference of European National Librarians (CENL)

• Functional
– Collections Profile

• Etc.

Maintenance and Development

• Z39.50 Maintenance Agency
– Based at Library of Congress, and officially responsible for upkeep of the standard

• ZIG
– Z39.50 Implementor’s Group
– Informal grouping of vendors, users and implementors who work to progress new areas of the standard
– Next meeting in Texas in January
– Likely to be at UKOLN in 2001.

See http://www.loc.gov/z3950/agency/

What’s wrong with Z39.50?

• Profiles for each discipline
– Defeats interoperability?

• Vendor interpretation of the standard

• Bib–1 bloat

• Largely invisible to the user

• Seen as complicated, expensive and old–fashioned

• Surely no match for XML/RDF/whatever.

The Bath Profile

• System vendors implement areas of the Z39.50 standard differently

• Regional, National, and disciplinary Profiles have appeared over previous years, many of which have basic functions in common

• Users wish to search across national/regional boundaries, and between vendors.

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

Learning from the past

• The Bath Profile is heavily influenced by
– ATS–1
– CENL
– DanZIG
– MODELS
– ONE
– Z Texas
– vCUC

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

Doing the work

• ZIP–PIZ–L mailing list, hosted by National Library of Canada

• Meeting face–to–face
– The UK’s Joint Information Systems Committee (JISC) supported a face–to–face meeting in Bath over the summer

• A draft, being widely circulated for comment.

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

What we proposed

• Minimisation of ‘defaults’
– Where possible, every attribute is defined in the Profile (Use, Relation, Position, Structure, Truncation, Completeness)

• Three Functional Areas
– Basic Bibliographic Search & Retrieval
– Bibliographic Holdings Search & Retrieval
– Cross–Domain Search & Retrieval

• Three or more Levels of Conformance in each Area.

See http://www.ukoln.ac.uk/interop-focus/activities/z3950/int_profile/bath/

What we proposed

• SUTRS and one of UNIMARC or MARC21 for Bibliographic Search results
– Or all three at Level 1?

• SUTRS and Dublin Core (in XML) for Cross–Domain results

• Other record syntaxes also permitted, but conformant tools must support at least these.
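
The record syntax requirements as proposed above translate into a simple set check. The sketch below is only a reading of the draft proposal on this slide: the function names and the `DC-XML` label for "Dublin Core (in XML)" are hypothetical, and the open question about requiring all three syntaxes at Level 1 is not modelled.

```python
def meets_bibliographic(supported: set) -> bool:
    """SUTRS plus at least one of UNIMARC or MARC21."""
    return "SUTRS" in supported and bool(supported & {"UNIMARC", "MARC21"})


def meets_cross_domain(supported: set) -> bool:
    """SUTRS plus Dublin Core carried in XML (labelled 'DC-XML' here)."""
    return {"SUTRS", "DC-XML"} <= supported


# Other syntaxes (GRS-1 here) are allowed but do not count towards conformance.
tool = {"SUTRS", "MARC21", "GRS-1"}
print(meets_bibliographic(tool), meets_cross_domain(tool))
```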

The new Attribute Architecture

• Recognition of existing problems

• Probably 2–3 years away in mainstream implementations?

• Deals with Bib–1 bloat by identifying key attributes of value to multiple applications, and grouping them together
– Utility Attribute Set (description of records)
– Cross–Domain Attribute Set (description of resources, and closely related to Dublin Core element set)
– Bib–2
– etc.

The new Attribute Architecture

New Attribute Type          Relation to Bib–1 Attributes
Access Point                Use
Semantic Qualifier          new
Language                    new
Content Authority           new
Expansion/Interpretation    Truncation and some of Relation
Normalized Weight           new

The new Attribute Architecture

New Attribute Type          Relation to Bib–1 Attributes
Hit Count                   new
Comparison                  Most of Relation and part of Completeness
Format/Structure            Structure
Occurrence                  Completeness
Indirection                 new
Functional Qualifier        new

Using Z39.50

• Z39.50 widely deployed in the library sector and elsewhere, although often invisibly
– The Origin can be either a human user or a second Origin computer, e.g. Z39.50 portals, summing resources from multiple targets

• Users access Z39.50 Targets using proprietary clients or, increasingly, via web interfaces
– e.g. WinWillow, ZNavigator, many WOPACs.
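
A portal that sums resources from multiple Targets is, in outline, a fan-out: send the same query to every Target and merge the answers. The sketch below uses plain Python functions as stand-ins for remote Targets; the target names, the sample titles, and `portal_search` are all hypothetical, and no Z39.50 traffic is involved.

```python
from concurrent.futures import ThreadPoolExecutor


def make_target(name, titles):
    """Return a search function standing in for one remote Target."""
    def search(term):
        return [(name, t) for t in titles if term.lower() in t.lower()]
    return search


TARGETS = [
    make_target("library-a", ["Bristol maps", "Bath abbey"]),
    make_target("library-b", ["A history of Bristol"]),
    make_target("archive-c", ["Census data"]),
]


def portal_search(term, targets=TARGETS):
    """Send the same query to every Target in parallel and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        # map() returns results in the order the targets were listed.
        per_target = list(pool.map(lambda search: search(term), targets))
    return [hit for hits in per_target for hit in hits]


results = portal_search("bristol")
print(results)
```

A real portal would also have to cope with slow or unavailable Targets and with differing attribute support, which is where a shared profile earns its keep.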

[Figures: four ‘Using Z39.50’ screenshot slides; two © Arts & Humanities Data Service, two © University of California]

Building the DNER

• Distributed National Electronic Resource

• Policy aspiration of the Joint Information Systems Committee

• Intended to provide greater access to JISC’s Current Content Collection
– RDN
– AHDS
– MIMAS
– EDINA
– The Data Archive
– EDUSERVE
– eLib projects
– etc.

Building the DNER

• Construction of Bath Profile–conformant Z39.50 Targets at data sources

• Construction of various Portals to facilitate access
– ‘JISC Portal’?
– Data Centre Portals
– Subject Portals
– Data Type Portals
– Institutional Portals
– Personal Portals?

Building the DNER

• Remaining challenges

• Authentication hell
– Move from endless authentication to single authentication

• Alignment of different data types
– Ordnance Survey maps at Edinburgh
– Satellite imagery in Manchester
– Electronic journal articles in many formats, etc.
– Census data at the Data Archive
– Survey data in Manchester
– Chemical structures in Manchester

• Collection Level Description.