Access 2004
Resource Sharing Initiatives in Atlantic Canada:
The Standards Dilemma
Slavko Manojlovich
Assistant to the University Librarian for Systems and Planning
Memorial University of Newfoundland
Email: [email protected]
Outline
• Standards – a formal view
• Standards – the view from the trenches
• Standards – “gotchas” over the past 6 months
What is a standard?
A NISO standard, developed through consensus, identifies model methods, materials, or practices for libraries, bibliographic and information services, and publishers.
Library Related Standards
• Standard Identifiers (ISBN, DOI)
• Z39.50
• NCIP
• OpenURL
• MetaSearch (in progress)
……
ISBИ ∑ИSIPΘPИЦRҐ
ZXXXIX.L
Atlantic Scholarly Information Network
• Resource Discovery and Management Service:
– Federated searching (Sirsi SingleSearch)
– OpenURL Resolver (Sirsi Resolver)
– Document delivery (Relais)
– Context management system (Sirsi Rooms)
80,000+ users from 17 institutions will have access to 1,000s of licensed/public resources within an infrastructure modeled on the library pathfinder.
Official launch planned for 2005 APLA conference.
ASIN and Standards
• Atlantic Scholarly Information Network Resource Discovery and Management Service
Search (Z39.50; OAI-PMH; MetaSearch; Authentication)
Retrieve (MARC21; XML – MARC, Holdings, Dublin Core schemas)
Request (OpenURL; Standard Identifiers; ILL; Z39.50)
Receive (NCIP; ILL)
Standards on their own do not ensure interoperability
Threats to interoperability include:
• Interpretation
• Cultural/historical practice
• Technological change
• Localization
• Undocumented practices and procedures
Threats to Interoperability
• Interpretation
Standards are often defined using terminology with imprecise semantics (e.g. index, keyword, phrase, exact match, user address, etc.). Standards and associated profiles require a reality check.
• Cultural/Historical Practice
Different cataloguing rules and associated formats underlie MARC21, UNIMARC, danMARC2, etc.
See: http://liber.library.uu.nl/publish/articles/000066/index.html
Threats to Interoperability
• Technological Change
Memorial’s ILS history includes: UTLAS/CLSI, SPIRES/CLSI, SIRSI
Retrofitting ILS to conform to new standards. Some standards die (e.g. Z39.58)
• Localization
“Though cataloging rules are intended to be objective and formats are inflexible, catalogers are human, and much of what they enter is a personal decision, often subjective rather than objective.”
Henry L. Snyder, UC Riverside
Threats to Interoperability
• Undocumented practices and procedures
– Vendor and version of Z39.50 server
– Supported Z39.50 attribute combinations and record syntaxes
– Extent of available holdings and holdings data elements including the Resolver’s knowledgebase
– Limitations of the server: truncation, stopwords, etc.
– Indexing policies, character set encoding
(This also applies to web-based search services)
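One way to make these undocumented behaviours explicit is to record them per target in the client configuration. The following is a minimal sketch only: the class, its field names and the sample values are hypothetical and do not come from any vendor's documentation.

```python
from dataclasses import dataclass, field


@dataclass
class ServerProfile:
    """Locally maintained notes on one Z39.50 target (all fields hypothetical)."""
    name: str
    vendor: str = "unknown"
    record_syntaxes: tuple = ("USMARC",)
    isbn_needs_hyphens: bool = False
    supports_exact_match: bool = True
    stopwords: frozenset = field(default_factory=frozenset)
    charset: str = "MARC-8"

    def quirks(self) -> list:
        """List the behaviours a client must work around for this target."""
        notes = []
        if self.isbn_needs_hyphens:
            notes.append("send ISBNs with hyphens")
        if not self.supports_exact_match:
            notes.append("no exact-match search")
        if self.stopwords:
            notes.append("strip stopwords: " + ", ".join(sorted(self.stopwords)))
        return notes


# Illustrative entry only; the values are placeholders, not test results.
example = ServerProfile(
    name="Example target",
    stopwords=frozenset({"the", "with"}),
    supports_exact_match=False,
)
print(example.quirks())
```

A profile like this only helps if it is filled in from actual testing against each server, since the behaviours it records are not documented.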
ASIN and Known Item Searching
• Focus on known item searching through the use of:
– Standard Identifiers
– Precision searches (exact, first in field)
– Scan (browse)
• Reduce reliance on keyword and phrase searches
– Low precision
– Extra work for staff
– Adds unnecessary processing for patrons (similar to Google results)
– Extra processing for client and server software
Standard Identifiers In Catalogues
                          Memorial            Acadia
Total No. of titles       1,417,683           365,847
LCCN OR ISBN OR ISSN      69% (983,415)       69% (253,676)
LCCN and ISBN/ISSN        39% (550,439)       27% (98,280)
Only ISBN/ISSN            16% (226,345)       24% (87,931)
Only LCCNs                14% (206,631)       18% (67,465)
ISBN
• The International Standard Book Number System
• Always consists of 10 digits divided into four parts of variable length which must be clearly separated by hyphens or spaces.
ISBN (continued)
• For the purposes of data processing the 10-digit string is used without hyphens or spaces.
• Examples:
0 571 08989 5 = 0571089895
90-70002-04-3 = 9070002043
• Note: you cannot reconstruct a hyphenated number from the 10-digit string alone.
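The processing form described above can be sketched in a few lines of Python. This is an illustration, not part of the presentation: the function names are made up, and the check uses the standard ISBN-10 modulus-11 rule.

```python
import re


def normalize_isbn10(raw: str) -> str:
    """Drop hyphens and spaces, leaving the 10-character processing form."""
    return re.sub(r"[\s-]", "", raw).upper()


def is_valid_isbn10(isbn: str) -> bool:
    """Check the ISBN-10 modulus-11 check digit (X stands for 10)."""
    if not re.fullmatch(r"\d{9}[\dX]", isbn):
        return False
    values = [10 if ch == "X" else int(ch) for ch in isbn]
    # Weights run from 10 down to 1; a valid number sums to a multiple of 11.
    total = sum(weight * value for weight, value in zip(range(10, 0, -1), values))
    return total % 11 == 0


for printed in ("0 571 08989 5", "90-70002-04-3"):
    compact = normalize_isbn10(printed)
    print(printed, "=", compact, is_valid_isbn10(compact))
```

Normalizing is a one-way step: the function throws away the group boundaries, which is why the hyphenated form cannot be rebuilt from its output alone.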
Library of Congress ISBN
AMICUS ISBN
Université du Québec ISBN
Biblioteca Nacional España ISBN
ISBN Search (no hyphens)
ISBN Search (with hyphens)
Gotcha!! ISBN search and hyphens
Library of Congress Control Number
• LCCN Structure A (1898-2000)
Alphabetic prefix (3), Year (2), Serial Number (6), Supplement Number (1), Suffix (variable), Rev Date (variable)
• LCCN Structure B (2001-)
Alpha prefix (2), Year (4), Serial Number (6)
Serial numbers less than 6 digits are right justified and unused positions contain zeros.
The hyphen which separates the year and serial number… is not carried in the MARC record. For example, the number 85-2 is carried as 85000002 in a record.
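The zero-fill rule can be sketched as follows. This is a simplified illustration with a hypothetical function name: it covers only the year and serial number and ignores the prefix blanks, supplement number, suffix and revision date of Structure A.

```python
def normalize_lccn(printed: str) -> str:
    """Zero-fill the serial number to six digits and drop the hyphen."""
    value = printed.replace(" ", "")
    if "-" not in value:
        # Already in the hyphen-free form; nothing to do.
        return value
    head, serial = value.split("-", 1)
    if not serial.isdigit() or len(serial) > 6:
        raise ValueError(f"unexpected serial number in {printed!r}")
    return head + serial.zfill(6)


print(normalize_lccn("85-2"))       # Structure A: two-digit year
print(normalize_lccn("2001-1234"))  # Structure B: four-digit year
```

A client would apply this before sending an LCCN search so that the term matches the form carried in the record.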
Université du Québec LCCN
WorldCat LCCN Z39.50 Search
• Use attribute 9 (LCCN)
Search for: 95002537 returns 11 records
95002537 (Silicon Snake Oil)
95025377 (The Industrial Revolution…)
95025378 (Avoiding Nuclear Anarchy…)
..
sn 95002537 (My little bedtime stories)
• Use attribute 1007 (Standard Identifier)
Search for: 95002537 returns 2 records
95002537 (Silicon Snake Oil)
950025372 (Apercu de genealogie et d'histoire des familles Beaulieu du Grand Madawaska)
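Until the server behaviour changes, a client can filter the returned records itself. The sketch below assumes each record has already been reduced to a dictionary with an `lccn` key; that shape and the function name are hypothetical, and the sample data simply repeats the numbers listed above.

```python
def exact_lccn_hits(query: str, records: list) -> list:
    """Keep only records whose LCCN, with blanks removed, equals the query."""
    wanted = query.replace(" ", "")
    return [rec for rec in records if rec["lccn"].replace(" ", "") == wanted]


returned = [
    {"lccn": "95002537", "title": "Silicon Snake oil"},
    {"lccn": "95025377", "title": "The Industrial Revolution…"},
    {"lccn": "95025378", "title": "Avoiding Nuclear Anarchy…"},
    {"lccn": "sn 95002537", "title": "My little bedtime stories"},
    {"lccn": "950025372", "title": "Apercu de genealogie et d'histoire des familles Beaulieu du Grand Madawaska"},
]

for rec in exact_lccn_hits("95002537", returned):
    print(rec["lccn"], rec["title"])
```

Post-filtering costs an extra pass over every result set, so it is a workaround rather than a fix.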
Gotcha!! WorldCat LCCN Search
RPN Query
• Reverse Polish Notation
Used to express boolean queries in Z39.50 without using brackets. Process from right to left.
RPN query: Orwell:NOT:Down:AND:Paris:London:AND
Means: (Paris AND London AND Down) NOT Orwell
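A stack evaluator shows why a postfix query needs no brackets. Note the assumptions: this sketch uses conventional left-to-right postfix order, which differs in token order from the colon-delimited string above, and it runs against an invented toy index rather than a Z39.50 server.

```python
def evaluate_rpn(tokens: list, index: dict) -> set:
    """Evaluate a postfix boolean query against a term -> record-id index."""
    operators = {
        "AND": lambda left, right: left & right,
        "OR": lambda left, right: left | right,
        "NOT": lambda left, right: left - right,  # binary "and-not"
    }
    stack = []
    for token in tokens:
        if token in operators:
            # An operator consumes the two most recent operands.
            right = stack.pop()
            left = stack.pop()
            stack.append(operators[token](left, right))
        else:
            stack.append(set(index.get(token, ())))
    if len(stack) != 1:
        raise ValueError("malformed query")
    return stack[0]


# Toy index: the record ids are invented for the illustration.
index = {
    "Paris": {1, 2, 3},
    "London": {1, 2, 4},
    "Down": {1, 2},
    "Orwell": {2},
}
query = ["Paris", "London", "AND", "Down", "AND", "Orwell", "NOT"]
print(evaluate_rpn(query, index))
```

The query list expresses (Paris AND London AND Down) NOT Orwell; the grouping is carried entirely by token order.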
RPN Query: Silicon Snake
RPN Query: Silicon Snake Oil
RPN Query: Explanation
Compare:
Case 1: (term3) and (term2 and term1)
with
Case 2: (term3 and term2) and (term1)
Both should return the same result, but one vendor’s server drops term1 in Case 1.
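The two cases are the same AND chain grouped differently, so a client can regroup a query before sending it. The sketch below represents a query as nested tuples and serializes it in the prefix query notation used by the YAZ toolkit; the tuple representation and function names are hypothetical, and choosing the Case 2 shape as the target is an assumption.

```python
def and_terms(node) -> list:
    """Flatten a nested AND tree into its terms, left to right."""
    if isinstance(node, str):
        return [node]
    left, right = node
    return and_terms(left) + and_terms(right)


def left_deep(node):
    """Rebuild the tree as ((t1 and t2) and t3) ..., i.e. the Case 2 shape."""
    terms = and_terms(node)
    tree = terms[0]
    for term in terms[1:]:
        tree = (tree, term)
    return tree


def to_pqf(node) -> str:
    """Serialize an AND tree as a prefix query string."""
    if isinstance(node, str):
        return node
    left, right = node
    return f"@and {to_pqf(left)} {to_pqf(right)}"


case1 = ("term3", ("term2", "term1"))
case2 = (("term3", "term2"), "term1")
print(to_pqf(case1))             # @and term3 @and term2 term1
print(to_pqf(left_deep(case1)))  # @and @and term3 term2 term1
```

Because AND is associative, the regrouped query is logically equivalent; only the shape sent to the server changes.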
Gotcha!! RPN Queries
Character Sets
• Question: How do I search the Taiwan National Library Z server in Chinese?
• Character set negotiation
UNICODE, MARC-8, CCCII, EACC and other character sets.
• MARC-8 to Unicode conversion rules for double width diacritics
Use rules set by the MARC 21 community or the Unicode Consortium?
• JACKPHY (Japanese, Arabic, Chinese…) records from LC are improperly coded.
• Without character set negotiation how do you encode foreign language characters in a Z39.50 search?
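The encoding problem is easy to demonstrate. The sketch below uses UTF-8 and Big5 only because Python ships codecs for them; MARC-8, CCCII and EACC are not in the standard library, so Big5 stands in as an assumed example of a second encoding. The search term is also just an example.

```python
import unicodedata

term = "中文"
utf8_bytes = term.encode("utf-8")
big5_bytes = term.encode("big5")
# The same search term produces different bytes on the wire.
print(utf8_bytes.hex(), big5_bytes.hex())

# A server that guesses the wrong encoding cannot recover the term.
try:
    misread = utf8_bytes.decode("big5")
except UnicodeDecodeError:
    misread = None
print(misread == term)

# Diacritics: precomposed and combining forms differ until normalized.
precomposed = "Qu\u00e9bec"
combining = "Que\u0301bec"
print(precomposed == combining)
print(unicodedata.normalize("NFC", combining) == precomposed)
```

The last two lines touch on the conversion question above: two strings that display identically can still fail an exact-match search unless client and server agree on one normalized form.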
Taiwan National Library Catalogue
Gotcha!! Non-roman language search
Stopwords
• Some servers do not index high-posting and/or reserved terms, including articles, boolean operators, etc.
• Use of stopwords in a search may generate an “error message” and/or return “0” results.
• Stopwords are not documented.
Université du Québec
• If you include a stopword in a Z39.50 keyword search you will receive the following error message:
Gone with the wind BIB 4 stopword error
Gone with wind BIB 4 stopword error
Gone wind returns 22 records
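A client-side workaround is to drop known stopwords before sending a keyword search. In this sketch the stopword list is an assumption inferred from the three searches above, since the server's real list is not documented, and the function name is hypothetical.

```python
def strip_stopwords(query: str, stopwords: frozenset) -> str:
    """Remove stopwords (case-insensitively) from a keyword query."""
    kept = [word for word in query.split() if word.lower() not in stopwords]
    # If every word is a stopword, fall back to the original query.
    return " ".join(kept) or query


# Assumed list: the server's actual stopwords are not documented.
ASSUMED_STOPWORDS = frozenset({"the", "with"})

print(strip_stopwords("Gone with the wind", ASSUMED_STOPWORDS))
```

The list would have to be maintained per server, which is the cost of the stopwords being undocumented.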
Nova Scotia Provincial Library
Nova Scotia Provincial Library
Gotcha!! Stopwords in searches
“Any” Search
• According to the Bath Profile:
Uses: Searches for complete word in data elements that are commonly used as access points (as defined by the server). Any searches comprising more than one keyword are interpreted in such a way that the terms may exist in the same or different attributes.
Example: a search on "Dickens AND Twist" might conceivably find "Dickens" in the Author Use Attribute (1003) and "Twist" in the Title Use Attribute (4).
Melvyl Author/Title Search
Melvyl “Any” Search
Melvyl “Any” Search Explanation
• Melvyl supports searching for authors, titles, subjects, etc. using the “Any” search but the terms must all be from the same heading (e.g. author, title, etc.)
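The difference between the two readings of “Any” can be shown on a single toy record. This is a sketch of the semantics only: the record layout and function names are hypothetical and nothing here reflects how either server is implemented.

```python
import re


def words_of(value: str) -> set:
    """Lower-cased words of a heading, ignoring punctuation."""
    return set(re.findall(r"\w+", value.lower()))


def any_match_across_fields(record: dict, terms: list) -> bool:
    """Bath-style reading: each term may occur in any access point."""
    words = set()
    for value in record.values():
        words |= words_of(value)
    return all(term.lower() in words for term in terms)


def any_match_same_field(record: dict, terms: list) -> bool:
    """Restricted reading: all terms must occur within one heading."""
    return any(
        all(term.lower() in words_of(value) for term in terms)
        for value in record.values()
    )


record = {"author": "Dickens, Charles", "title": "Oliver Twist"}
print(any_match_across_fields(record, ["Dickens", "Twist"]))
print(any_match_same_field(record, ["Dickens", "Twist"]))
```

The first function finds the record and the second does not, which is the gap a metasearch client has to allow for when it sends the same “Any” query to several servers.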
Gotcha!! Melvyl “Any” Search
Scan (browse)
• Browse is an essential component of a library catalogue interface, especially for known item searching.
• View a list of author/title/subject headings and retrieve records associated with the desired heading.
• There are widespread problems with the follow-up Z39.50 search associated with a browse list.
MUN Keyword Title Search (American History)
MUN Browse Title (American History)
MUN Follow-Up Browse Title Search (American History)
WorldCat Browse (use attribute alone returns keywords)
WorldCat Browse (use and structure phrase attributes return headings – just subfield “a”)
WorldCat Browse (use/structure/completeness attributes return complete headings)
Follow-up Search From a Browse List
• Not well defined by the profiles. Implementor agreement is confusing.
• Ideally, a follow-up search should be an “exact match”, just as it is in the catalogue; however, “exact match” is not supported by all servers.
• Heading punctuation must be included for some servers and removed for others.
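A client can prepare both forms of a heading and try them in turn. The sketch below builds query strings in the prefix query notation used by the YAZ toolkit, with Bib-1 attributes for first in field (3=1), phrase (4=1), no truncation (5=100) and complete field (6=3). Treating that combination as “exact match” is an assumption, as is the function name; headings containing double quotes are not handled.

```python
def followup_queries(heading: str, use_attr: int = 4) -> list:
    """Return exact-match query strings, with and without trailing punctuation."""
    attrs = f"@attr 1={use_attr} @attr 3=1 @attr 4=1 @attr 5=100 @attr 6=3"
    variants = [heading]
    # Some servers want the heading without its trailing punctuation.
    stripped = heading.rstrip(" .,;:/")
    if stripped and stripped != heading:
        variants.append(stripped)
    return [f'{attrs} "{variant}"' for variant in variants]


for query in followup_queries("American history."):
    print(query)
```

Which variant a given server accepts still has to be found by testing, since the profiles do not settle it.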
Gotcha!! Follow-up Browse search
NCIP
• Mapping of data elements to/from circulation/ILL to NCIP is not a trivial process.
• Circulation systems have a long history and reflect “local” policies.
• One example: Sirsi’s Unicorn system supports two logins for a user, bar code and altid, both tied to the same PIN. NCIP and the Relais ILL system support a single loginid per user.
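One possible cross-walk is a lookup that maps either login to a single canonical id. The sketch below is hypothetical throughout: the class, the sample user and the choice of the bar code as the canonical id are assumptions, not the behaviour of Unicorn, NCIP or Relais.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UnicornUser:
    """Hypothetical view of a user with two logins tied to one PIN."""
    barcode: str
    altid: str
    pin: str


def build_login_index(users: list) -> dict:
    """Map every accepted login (bar code or altid) to one canonical id."""
    index = {}
    for user in users:
        for login in (user.barcode, user.altid):
            if login in index and index[login] != user.barcode:
                raise ValueError(f"login {login!r} is ambiguous")
            # Assumption: the bar code serves as the single loginid.
            index[login] = user.barcode
    return index


users = [UnicornUser(barcode="B0001", altid="jsmith", pin="0000")]
login_index = build_login_index(users)
print(login_index["jsmith"], login_index["B0001"])
```

Whichever id is chosen, the mapping has to be kept in step with the circulation system, which is part of why the cross-walk is not trivial.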
Gotcha!! NCIP/Circulation/ILL Cross-walk
Dealing With The Gotchas
• ISBN Search
– Identify servers which require hyphens in searches and lobby vendors and Bath/NISO Profile committees to support ISBN search without hyphens
– Strip out hyphens from a search when required
– Reconstruct hyphenated ISBN when required (?)
Dealing With The Gotchas
• WorldCat LCCN Search
– Contact OCLC regarding the problem.
• RPN Queries
– Given that the problem is associated with the largest installed base of Z39.50 servers, lobby the Z client developers to send RPN queries based on the OCLC way, which appears to work with all Z servers.
Dealing With The Gotchas
• Non-roman language search
– Strive for an end-to-end Unicode solution which includes the metasearching client, XML record display and ILL document request service.
• Stopwords
– Identify servers with stopword limitations and configure client to strip out common stopwords from keyword searches and/or lobby Bath/NISO US profile committees for a solution. For Université du Québec, use “phrase” search.
Dealing With The Gotchas
• Melvyl “Any” Search
– Contact Melvyl
• Follow-up Browse Search
– Lobby Bath/NISO US profile committees for a solution. Ask OCLC to update documentation on Browse.
• NCIP/CIRC/ILL cross-walk
– Communicate problems to NCIP committee. Will need to register additional schemas to accommodate local needs.
Final Words Of Advice
• Test, test, and test some more.
• Don’t take standards for granted.
• Listen to your users’ concerns.
• Communicate/share problems with others.
• Test, test, and test some more.