MCNC/CNIDR & A/WWW Enterprises
Introduction to CNIDR’s Isite
Jim Fullton - MCNC/CNIDR
Archie Warnock - A/WWW Enterprises
What is Isite?
A freely available implementation of the Z39.50 search/retrieval protocol
It includes a Unix-based server, a WWW gateway, a command-line client and a sophisticated text search engine
ftp://ftp.cnidr.org/pub/NIDR.tools/Isite
http://vinca.cnidr.org/software/Isite/Isite.html
What is Isearch?
Isearch is the successor to freeWAIS
Isearch is a sophisticated full-text search and retrieval system
Isearch is a component of Isite, an implementation of the NISO standard protocol Z39.50 for information search and retrieval
ftp://ftp.cnidr.org/pub/NIDR.tools/Isearch
http://vinca.cnidr.org/software/Isearch/Isearch.html
System Components - I
Iindex, the Text Indexer - builds searchable version of the document collection
  Implements fast word-based searching
  Document parser - recognize start/end of individual documents
  Field parser - recognize start/end of fields within individual documents
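
The indexing steps above can be illustrated with a minimal sketch. It is not Iindex's actual code (Iindex is part of a C++ system); the input format, the `---` separator, and all function names here are assumptions made for illustration. It shows a document parser, a field parser, and a word-based inverted index built from them.

```python
import re
from collections import defaultdict

# Hypothetical input format: documents separated by a line containing "---",
# fields written as "Name: value", one per line.
DOC_SEPARATOR = "---"


def parse_documents(text):
    """Document parser: find where each individual document starts and ends."""
    docs, current = [], []
    for line in text.splitlines():
        if line.strip() == DOC_SEPARATOR:
            if current:
                docs.append("\n".join(current))
            current = []
        else:
            current.append(line)
    if current:
        docs.append("\n".join(current))
    return docs


def parse_fields(doc):
    """Field parser: map each field name to its text within one document."""
    fields = {}
    for line in doc.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            fields[name.strip().lower()] = value.strip()
    return fields


def build_index(text):
    """Build an inverted index: (field, word) -> set of document numbers."""
    index = defaultdict(set)
    for doc_id, doc in enumerate(parse_documents(text)):
        for field, value in parse_fields(doc).items():
            for word in re.findall(r"[a-z0-9]+", value.lower()):
                index[(field, word)].add(doc_id)
    return index


collection = """Title: Sparse matrix methods
Author: Smith
---
Title: Search engines
Author: Jones
"""
index = build_index(collection)
print(sorted(index[("title", "matrix")]))  # documents whose title contains "matrix"
```

Looking a word up is then a single dictionary access, which is the basic reason word-based searching can be fast.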
System Components - II
Isearch, the Search engine - searches a document collection based on user-supplied query
  Command line search
    Primarily used for testing
  WWW gateway (using CGI)
    End-user interface using forms
  Z39.50 gateway
Isearch Capabilities
Fast full-text search
  US AIDS Patent Collection - can search ~250,000 patents in < 1 second
Fielded search
  Can restrict searches to title, author, abstract, other fields
Relevance ranking
  Search “hits” are assigned scores & sorted
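
As a rough illustration of fielded search and relevance ranking, the sketch below scores each document by how often the query words occur in it and sorts the hits by score. The slides do not describe Isearch's scoring formula, so the term-count score, the `rank` function, and the sample records are all assumptions.

```python
import re


def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(documents, query, field=None):
    """Return (score, doc_id) pairs for matching documents, best first.

    documents: list of dicts mapping field name -> text.
    field: restrict the search to one field, or None to search all fields.
    """
    terms = tokenize(query)
    hits = []
    for doc_id, doc in enumerate(documents):
        texts = [doc.get(field, "")] if field else list(doc.values())
        words = [word for text in texts for word in tokenize(text)]
        # Assumed scoring: total number of occurrences of the query terms.
        score = sum(words.count(term) for term in terms)
        if score > 0:
            hits.append((score, doc_id))
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return hits


docs = [
    {"title": "Vaccine delivery", "abstract": "A vaccine carrier and a vaccine dose."},
    {"title": "Antiviral compound", "abstract": "Compound for use with a vaccine."},
    {"title": "Test kit", "abstract": "Diagnostic assay."},
]
print(rank(docs, "vaccine"))                 # [(3, 0), (1, 1)]
print(rank(docs, "vaccine", field="title"))  # [(1, 0)]
```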
Isearch Capabilities
Word truncation
  search for “matri*” matches “matrix” and “matrices”
Boolean functions
  AND, OR and ANDNOT combinations of different fields
Customized presentation of results
Phrase searching (coming soon)
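
Truncation and the Boolean functions can be sketched with ordinary set operations. This is only an illustration under assumed data structures: the index layout, the `match_term` helper, and the sample postings are hypothetical, and Isearch's own query syntax is not shown in the slides.

```python
def match_term(index, field, term):
    """Return the set of document ids matching a term within one field.

    A trailing '*' requests right truncation, i.e. a prefix match.
    """
    if term.endswith("*"):
        prefix = term[:-1]
        result = set()
        for (indexed_field, word), doc_ids in index.items():
            if indexed_field == field and word.startswith(prefix):
                result |= doc_ids
        return result
    return set(index.get((field, term), set()))


# Hypothetical inverted index: (field, word) -> set of document ids.
index = {
    ("title", "matrix"): {0, 2},
    ("title", "matrices"): {1},
    ("author", "smith"): {0, 1},
    ("author", "jones"): {2},
}

title_matri = match_term(index, "title", "matri*")
author_smith = match_term(index, "author", "smith")

print(sorted(title_matri & author_smith))  # AND    -> [0, 1]
print(sorted(title_matri | author_smith))  # OR     -> [0, 1, 2]
print(sorted(title_matri - author_smith))  # ANDNOT -> [2]
```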
Isearch Customization
What’s needed to customize Isearch?
  Isearch is written in C++
  Documents are C++ objects - data & procedures
    Already have SGML & HTML, among others
  Object technology allows code reusability, customizing only where differences from existing objects occur
Isearch Customization
What’s needed to make arbitrary documents searchable?
  Code to parse documents
  Code to parse fields
  Code to build brief and full result records
Yes, it requires programming
But, many of these are derived from existing procedures
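
The idea of deriving a new document type from an existing one can be sketched as follows. Isearch itself is written in C++; this Python sketch only mirrors the structure, and the class names, method names, and the mail-like format are all hypothetical rather than taken from the Isearch source. The subclass overrides only what differs and inherits the rest.

```python
class DocType:
    """Base document type: the whole input is one document with no fields."""

    def parse_documents(self, text):
        return [text]

    def parse_fields(self, doc):
        return {}

    def brief(self, doc):
        # Default brief record: the first non-empty line of the document.
        for line in doc.splitlines():
            if line.strip():
                return line.strip()
        return ""

    def full(self, doc):
        # Default full record: the document as stored.
        return doc


class MailLikeDoc(DocType):
    """Hypothetical format: 'Header: value' lines, documents separated by blank lines."""

    def parse_documents(self, text):
        return [block.strip() for block in text.split("\n\n") if block.strip()]

    def parse_fields(self, doc):
        fields = {}
        for line in doc.splitlines():
            name, sep, value = line.partition(":")
            if sep:
                fields[name.strip().lower()] = value.strip()
        return fields

    def brief(self, doc):
        # Use the subject as the brief record, falling back to the base behaviour.
        return self.parse_fields(doc).get("subject", super().brief(doc))

    # full() is inherited unchanged from DocType.


text = "Subject: Isite release\nFrom: jim\n\nSubject: Z39.50 question\nFrom: archie\n"
doctype = MailLikeDoc()
docs = doctype.parse_documents(text)
print([doctype.brief(doc) for doc in docs])  # ['Isite release', 'Z39.50 question']
```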
Introduction to Z39.50
Developed for search and retrieval
Networked, client/server environment
Tested by working information scientists (Z39.50 Implementor’s Group)
Commercial & public domain support (Isite from CNIDR)
http://www.ds.internic.net/z3950/z3950.html
Attribute Sets
Attributes define how the query is specified
  Use: field names
  Relation: comparisons
  Position: location in field
  Structure: word/phrase/key/etc
  Truncation: left/right/none/etc
  Completeness: subfield/field
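
To make the six attribute types concrete, the sketch below attaches BIB-1 attributes to a search term and prints them in a prefix notation of the form `@attr type=value`. That notation is the one used by the YAZ toolkit (PQF) and is not mentioned in the slides; it is used here only as a compact way to write the attributes down. The type numbers 1 to 6 and the values shown (Use 4 = title, Structure 2 = word, Truncation 1 = right truncation) follow the BIB-1 attribute set; the `to_pqf` helper is hypothetical.

```python
from enum import IntEnum


class AttrType(IntEnum):
    """BIB-1 attribute types, numbered as in the attribute set."""
    USE = 1
    RELATION = 2
    POSITION = 3
    STRUCTURE = 4
    TRUNCATION = 5
    COMPLETENESS = 6


def to_pqf(term, attributes):
    """Render a term and its attributes as a prefix-notation query string."""
    parts = [f"@attr {int(attr_type)}={value}"
             for attr_type, value in sorted(attributes.items())]
    # Quote the term if it contains spaces so it stays a single operand.
    parts.append(f'"{term}"' if " " in term else term)
    return " ".join(parts)


# Title search for words beginning with "matri".
query = to_pqf("matri", {
    AttrType.USE: 4,         # Use: title
    AttrType.STRUCTURE: 2,   # Structure: word
    AttrType.TRUNCATION: 1,  # Truncation: right
})
print(query)  # @attr 1=4 @attr 4=2 @attr 5=1 matri
```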
Attributes & Element Sets
Supported Attribute Sets
  BIB-1
  GILS
  GEO
  STAS
Element Sets define retrievable sets of use attributes
  Brief record
  Full record
  Summary record (GEO)
Record Syntaxes
Z39.50 allows specification of a “Preferred Record Syntax” for results
  SUTRS (unstructured text)
  HTML
  USMARC
  GRS-1 (tagged, generalized syntax)
Profiles - GEO and Otherwise
Profiles define allowed attributes and element sets
Usually domain specific - ATS-1, GILS, WAIS, GEO, Digital Collections, Museum Collections
Supported by external agreement between client & server (currently)
  i.e., a GEO client talks to a GEO server
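
One way to picture a profile is as a list of what a request may contain. The sketch below checks a request against such a list. The `Profile` class, the `check_request` helper, and the attribute and element set names in the example are placeholders chosen for illustration; they are not taken from the GEO profile or from Isite.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A profile as a set of allowed use attributes and element sets."""
    name: str
    use_attributes: frozenset
    element_sets: frozenset


def check_request(profile, use_attributes, element_set):
    """Return a list of problems; an empty list means the request fits the profile."""
    problems = []
    for attribute in use_attributes:
        if attribute not in profile.use_attributes:
            problems.append(f"use attribute not in {profile.name} profile: {attribute}")
    if element_set not in profile.element_sets:
        problems.append(f"element set not in {profile.name} profile: {element_set}")
    return problems


# Placeholder contents, for illustration only.
example_profile = Profile(
    name="example",
    use_attributes=frozenset({"title", "author", "bounding-box"}),
    element_sets=frozenset({"brief", "full", "summary"}),
)

print(check_request(example_profile, ["title", "bounding-box"], "summary"))  # []
print(check_request(example_profile, ["isbn"], "brief"))
```

Because the agreement is external, a check like this would have to be configured on both sides ahead of time rather than negotiated over the connection.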
FGDC Enhancements
Search Engine (Iindex/Isearch)
  Field types (text, numeric, date, others)
  Search in nested fields
  Search in numeric fields
  Date & Date Range Searching
  Spatial Searching
FGDC Enhancements
Z39.50 Implementation (ZDist)
  Support for GEO attributes & element sets
  GRS-1 record syntax
  Support for additional (non-Isearch) search engines
  Syntax to support nested query
Outstanding Issues
User Interface
  What fields are searchable and how does the user indicate them?
  How complex can the geographic queries be?
    Bounding box only?
    Complex regions?
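
For reference, the bounding-box-only case is simple to state in code: two boxes match when they overlap in both longitude and latitude. The sketch below is an assumption about how such a test could look, not a description of the Isearch implementation. The `BoundingBox` class is hypothetical, the coordinates are approximate illustrative values, and boxes that cross the 180° meridian are not handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Rectangle in decimal degrees; assumes west <= east and south <= north."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other):
        """True if the two boxes share at least one point."""
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)


# Approximate extent of a dataset covering North Carolina (illustrative values).
dataset = BoundingBox(west=-84.3, south=33.8, east=-75.5, north=36.6)

query_inside = BoundingBox(west=-80.0, south=35.0, east=-78.0, north=36.0)
query_far_west = BoundingBox(west=-125.0, south=32.0, east=-114.0, north=42.0)

print(dataset.overlaps(query_inside))    # True
print(dataset.overlaps(query_far_west))  # False
```

Complex regions would need a polygon intersection test in place of the four comparisons above, which is part of why the question is listed as open.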