Next Generation Library Catalogs: Local developments and research opportunities
Derek Rodriguez, TRLN
September 26, 2008
Center for Research and Development of Digital Libraries UNC Chapel Hill, School of Information and Library Science, September 26, 2008
Overview
Introduction to TRLN
Scope and goals of the TRLN Endeca Project
Project challenges and system architecture
Discussion of opportunities for SILS / TRLN collaboration
Triangle Research Libraries Network
An Academic Research Library Consortium
  Duke University Libraries
  NC Central Libraries
  NC State Libraries
  UNC Chapel Hill Libraries
Founded in 1977, cooperation dates to 1930s
TRLN: Oversight and Governance
Governing Board
Executive Committee
Council of Directors
TRLN Councils
  Collection Development
  Human Resources
  Services
  Technology
Committees and Task groups as needed
TRLN Staff (Director, two Program Officers, and an Administrative Assistant)
Triangle Research Libraries Network
Program focus
Collaborative collection development
  Print and electronic
Services to extend access to materials to affiliated patrons
  Document Delivery and Reciprocal Borrowing
Digital TRLN
Endeca project
Single Copy Archive
TRLN Management Academy
Next Gen Catalogs
Generations of OPACs
  Online ‘Card’ Catalog
  + exact-match Boolean keyword searching
  + Web enabled catalogs offered search OR browsing of pre-coordinated headings
Next Generation Library Catalogs
  Not bound to legacy ILS
  Take advantage of latest web technologies
  Should be ‘metadata agnostic’
  Exhibit an open system architecture
TRLN Endeca Project: Phase 1
Why Endeca?
  Support for integrated search and browse
  ‘Forgiving’ features (spell correction, term suggestion, etc.)
  Can ingest ‘metadata of all types’
  NCSU’s previous experience with Endeca
Phase 1 Goals
  Search and request from one interface
  Support scoped interfaces at the campuses
Timeline: June 2007 – August 2008
TRLN Endeca Project: Phase 1
Steering Committee and four task groups
  User Interface
  Data and Indexing
  Document Delivery
  Usability
Distributed Implementation Team
  Programmers, designers, data specialists
Commercial partners
  Endeca professional services
  MCNC
  Syndetic Solutions
Phase 1 Timeline
March 2007 – Contract with Endeca is signed
April 2007 – Task Groups are formed
May 2007 – UITG and DITG begin work
June 2007 – Staff attend Endeca training
July 2007 – Requirements Definition Meeting
August 2007 – Order servers
September 2007 – Complete wireframes
October 2007 – Complete data and indexing recommendations, install servers, conduct focus group interviews
Phase 1 Timeline cont’d
November 2007 – Milestone: TRLN collections are indexed in Endeca (~10 million records)
December 2007 – Begin testing Requests feature
January 2008 – Build Search TRLN User Interface
February – Conduct usability testing, refine data extracts, complete Requests, refine user interface, begin contracts with Syndetic Solutions
March – Final testing, phase 1 launch of Search TRLN
May – Refine relevance ranking / performance tuning
June – August – Duke, UNC, and NCSU interfaces launch
July 2008 – New oversight group convenes
Project Challenges
Coordinating the efforts of four task groups, an implementation team, and relationships with three vendors
Gather user perspective to support design
Create a data model to fit application logic
Create extract and data harvesting routines
Define indexes and relevance ranking
Build 4 user interfaces
Integrate with 9 local document delivery offices
Performance tuning
Ongoing governance and evaluation
Create a data model
Data model should …
  Support needed functionality
  Accommodate but not reproduce MARC
  Be flexible and accepting of multiple metadata types
  Respect local needs where possible
Stakeholders represent 4 institutions, 10 libraries, and numerous cataloging units
Task: Define fields / facets, MARC to Endeca mappings, extract rules, and indexing rules
Endeca Record: Properties
Endeca Properties (like database fields)
  Can be indexed, displayed, or used for sorting
  Can serve as unique ids or keys for joins
  Can support linking via hyperlinks or calls to external services
Examples:
  Main Author: Traver, Robert, 1903-1991.
  Main Title: Trout Magic
  Location: Davis Library
  Call Number: SH687 .T73 1989
  UniqueId: UNCb2260544
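
To make the "like database fields" comparison concrete, here is a minimal sketch that holds the example record as a flat set of named properties. The property names and values come from the slide. The helper functions are hypothetical illustrations and do not correspond to any real Endeca API.

```python
# Hypothetical sketch: an Endeca-style record as named properties.
# Property names and values are from the slide; the helpers are illustrative only.

record = {
    "UniqueId": "UNCb2260544",
    "Main Author": "Traver, Robert, 1903-1991.",
    "Main Title": "Trout Magic",
    "Location": "Davis Library",
    "Call Number": "SH687 .T73 1989",
}


def brief_display(rec):
    """Format the properties that a brief results screen might display."""
    return f'{rec["Main Title"]} / {rec["Main Author"]} ({rec["Location"]}, {rec["Call Number"]})'


def index_by_id(records):
    """Use the unique id property as a key, e.g. for joining item data to a title."""
    return {rec["UniqueId"]: rec for rec in records}


by_id = index_by_id([record])
print(brief_display(by_id["UNCb2260544"]))
```

The same dictionary can feed display (`brief_display`), sorting (sort on `"Main Title"`), or joins (`index_by_id`), which is the range of uses the slide lists for properties.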
Endeca Record: Dimensions
Endeca Dimensions (Facets)
  Values can be pre-set or data-driven
  Hierarchies and ranges are possible
Example:
  Format
    Book
      Record id 1
      Record id 2
    Journal
      Record id 3
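
The Format example above can be sketched as a grouping of record ids under each dimension value. This is a toy illustration of data-driven facet values under assumed names (`build_dimension`, `refine`), not a description of how Endeca stores dimensions internally.

```python
from collections import defaultdict

# Toy records: the ids and formats mirror the slide's example.
records = [
    {"id": 1, "Format": "Book"},
    {"id": 2, "Format": "Book"},
    {"id": 3, "Format": "Journal"},
]


def build_dimension(recs, name):
    """Group record ids under each value of one dimension (data-driven values)."""
    values = defaultdict(list)
    for rec in recs:
        values[rec[name]].append(rec["id"])
    return dict(values)


def refine(recs, name, value):
    """Narrow a result set to the records tagged with one dimension value."""
    return [rec for rec in recs if rec[name] == value]


format_dim = build_dimension(records, "Format")
print(format_dim)
print(refine(records, "Format", "Journal"))
```

Counting the ids under each value gives the per-facet hit counts a faceted interface typically shows next to each refinement.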
Endeca Record: Dimensions
Facet | Source
Subject | 6xx $a or $x
Genre | 6xx $v / 008
Time Period | 6xx $y
Region | 6xx $z
Medical Subject: Topic | MESH
Author | Author fields
Language | MARC 008
Format | Item record / MARC 008
Publication Year | MARC 008
Call Number Range | LC and NLM Call Numbers
Location (Library) | Item Record
Availability | Item status
New Titles | Date cataloged
Mapping and extract policies (examples)
MARC 245 -> Endeca Main Title
MARC 100, 110, 111 -> Endeca Main Author
MARC 100, 110, 111, 700, 800 -> Author Facet
MARC 6xx -> Pre-coordinated Subjects
  e.g. Library schools -- North Carolina -- Chapel Hill -- History
MARC 6xx -> Subject Facets
  Subject (Topic): Library schools
  Subject (Topic): History
  Region: North Carolina
  Region: Chapel Hill
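
The split of one 6xx heading into a pre-coordinated string and separate facet values can be sketched as follows. The subfield-to-facet mapping follows the facet table in these slides ($a and $x to Subject, $v to Genre, $y to Time Period, $z to Region). The subfield coding of the example heading is an assumption, since the slide shows only the display string, and the function names are hypothetical.

```python
# Subfield-to-facet mapping, following the facet table in the slides.
SUBFIELD_TO_FACET = {
    "a": "Subject (Topic)",
    "x": "Subject (Topic)",
    "v": "Genre",
    "y": "Time Period",
    "z": "Region",
}


def precoordinated(subfields):
    """Join subfield values into the pre-coordinated display heading."""
    return " -- ".join(value for _, value in subfields)


def facets(subfields):
    """Distribute subfield values across facets, keeping order and dropping repeats."""
    out = {}
    for code, value in subfields:
        facet = SUBFIELD_TO_FACET.get(code)
        if facet is None:
            continue  # subfield does not feed any facet in this sketch
        values = out.setdefault(facet, [])
        if value not in values:
            values.append(value)
    return out


# Assumed subfield coding for the heading shown on the slide.
heading = [
    ("a", "Library schools"),
    ("z", "North Carolina"),
    ("z", "Chapel Hill"),
    ("x", "History"),
]

print(precoordinated(heading))
print(facets(heading))
```

Keeping both outputs lets the interface offer browsing of the full heading alongside post-coordinated facet refinement.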
Mapping and extract policies (examples)
MARC 008 (bytes 35-37) -> Language facet
MARC 008 (bytes 33-34) -> Genre facet
Item Data
  Library -> Location property (for display)
  Library -> Location facet
  Status -> Status property (for display)
  Status -> Availability facet
  Call Number -> Call number and call number sort properties
  Call Number -> Classification facet
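
Because the 008 is a fixed-length field, the byte ranges above translate directly into string slices. The sketch below assumes a 40-character 008 and uses hypothetical helper names. The Language and Genre positions are the ones given on the slide. Reading the publication year from positions 07-10 (Date 1 in MARC 21) is an assumption here, since the slides name the 008 as the source without giving positions. How the raw codes map to display labels is not covered by the slides and is left out.

```python
def make_008(date1="    ", language="   ", bytes_33_34="  "):
    """Build a 40-character 008 with only the positions used below filled in."""
    field = [" "] * 40
    field[7:11] = date1            # Date 1, positions 07-10 (assumption, see lead-in)
    field[33:35] = bytes_33_34     # positions 33-34, source of the Genre facet
    field[35:38] = language        # positions 35-37, source of the Language facet
    return "".join(field)


def parse_008(field):
    """Slice the raw codes out of an 008; label lookup is left to the caller."""
    if len(field) != 40:
        raise ValueError("008 must be exactly 40 characters")
    return {
        "Language": field[35:38],
        "Genre code": field[33:35],
        "Publication Year": field[7:11],
    }


sample = make_008(date1="1989", language="eng", bytes_33_34="0 ")
print(parse_008(sample))
```

Python slices exclude the end index, so bytes 35-37 become `field[35:38]`.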
Mapping and extract policies (examples)
MARC Field | Extract rule | Endeca Label | Behavior
245 abfghknps | Strip trailing / | Property: Main Title | Index as title and keyword, display on brief and full record screens
245 abfghknps | Strip non-filing characters | Property: Title Sort | Do not index or display, enable sort
6xx subfield a or x | Where first indicator = 0, 1, 2, 3, 4, or 7. Trim punctuation. | Facet: Subject Facet |
Item status | Create pipe delimited field of all item statuses attached to title | Property: Statuses | Used for display
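
The extract rules in the table can be sketched as small string transformations. This is an illustrative sketch only: the function names, the sample titles and statuses, and the exact set of punctuation trimmed are assumptions. Taking the non-filing character count from the 245 second indicator follows MARC 21 convention.

```python
import re


def main_title(subfields):
    """Join the selected 245 subfields and strip a trailing ' /'."""
    title = " ".join(subfields)
    return re.sub(r"\s*/\s*$", "", title)


def title_sort(title, nonfiling):
    """Drop the leading non-filing characters (count from the 245 second indicator)."""
    return title[nonfiling:]


def trim_punctuation(value):
    """Trim trailing punctuation from a facet value (character set is an assumption)."""
    return value.rstrip(" .,;:/")


def statuses(item_statuses):
    """Pipe-delimit the statuses of all items attached to a title."""
    return "|".join(item_statuses)


print(main_title(["Trout magic /"]))
print(title_sort("The compleat angler", 4))
print(trim_punctuation("History."))
print(statuses(["Available", "Checked out", "Available"]))
```

Keeping the display title and the sort key as separate properties is what lets the Title Sort property stay out of the index and the display while still driving sort order.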
System architecture
MARC data from ILSs (Duke, UNC, NCCU, NCSU): 4 institutions, 10 libraries, 3 vendors, 11 million titles
Non-MARC data: “ICE” ToCs, EAD ?, Dublin Core ??, Other ?
TRLN Index
Features
  - Full ILS extracts occur as needed
  - Cataloging changes are updated daily
  - Circulation status is updated every 30 minutes
  - ICE Table of Contents data is updated weekly
  - Index is rebuilt every night
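
The update cadence listed above can be written down as a small schedule table. The intervals are the ones on the slide. The job names, the `due_jobs` helper, and the sample timestamps are hypothetical, and this sketch says nothing about how the production jobs are actually triggered. Full ILS extracts are omitted because they run as needed rather than on an interval.

```python
from datetime import datetime, timedelta

# Intervals from the slide; job names are hypothetical labels.
SCHEDULE = {
    "circulation_status": timedelta(minutes=30),
    "cataloging_changes": timedelta(days=1),
    "index_rebuild": timedelta(days=1),
    "ice_table_of_contents": timedelta(weeks=1),
}


def due_jobs(last_run, now):
    """Return the jobs whose interval has elapsed since their last run."""
    return sorted(
        job for job, interval in SCHEDULE.items()
        if now - last_run[job] >= interval
    )


now = datetime(2008, 9, 26, 12, 0)
last_run = {
    "circulation_status": now - timedelta(minutes=45),
    "cataloging_changes": now - timedelta(hours=10),
    "index_rebuild": now - timedelta(hours=10),
    "ice_table_of_contents": now - timedelta(days=8),
}
print(due_jobs(last_run, now))
```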
One index, many interfaces
[Diagram. Components: TRLN Index (mirrored on 3 servers); Apache/Tomcat/Java UI (4 instances); Duke PHP UI. Labeled flows: HTML to browser; RSS; client side calls for content; XML to Duke PHP UI; XML to other devices/systems; XML from Web Services.]
Some usage statistics
Search TRLN only, Sept 1 – 21, 2008
7437 sessions
  7146 (97%) performed text search
  1461 (19%) performed facet navigation
  3702 (50%) began as local catalog searches
Searches by index
  Keyword 48.40%
  Title 33.99%
  Author 8.90%
  Subject 3.88%
  ISBN 3.21%
  Multi-index queries 1.62%
Some usage statistics
Other interesting data points
  18,833 record views
    3801 sessions (51%) included at least one record view
  1956 item requests
    1124 sessions (15%) included an item request
Future Analysis
  Most popular facets
    Subject, Location, and Format seem most popular so far
  Session Level analysis
  Establish benchmarks for considering and evaluating enhancements
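
Session-level figures such as "sessions that included at least one record view" can be derived from transaction logs by collapsing events per session before counting. The sketch below uses a made-up log format of (session id, action) rows and hypothetical action names. It is not the format of the Search TRLN logs, and the toy data is not drawn from the statistics above.

```python
from collections import defaultdict

# Hypothetical log rows: (session id, action). Toy data for illustration only.
events = [
    ("s1", "text_search"), ("s1", "record_view"), ("s1", "item_request"),
    ("s2", "text_search"), ("s2", "facet_navigation"),
    ("s3", "text_search"), ("s3", "record_view"),
    ("s4", "facet_navigation"),
]


def session_shares(rows):
    """For each action, the fraction of sessions that performed it at least once."""
    actions_by_session = defaultdict(set)
    for session_id, action in rows:
        actions_by_session[session_id].add(action)
    total = len(actions_by_session)
    counts = defaultdict(int)
    for actions in actions_by_session.values():
        for action in actions:
            counts[action] += 1
    return {action: n / total for action, n in counts.items()}


shares = session_shares(events)
for action, share in sorted(shares.items()):
    print(f"{action}: {share:.0%}")
```

Collapsing to a set per session is what separates "sessions with a record view" from the raw count of record views, the same distinction the slide draws between 18,833 views and 3801 sessions.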
TRLN – SILS Collaboration
Cory Lown, SILS MSLS ’08, Brad Hemminger
  Transaction Log Analysis with NCSU Endeca, http://hdl.handle.net/1901/488
Dre Orphanides, SILS MSLS ’08
  Field Experience, Spring ’08
Tessa Sullivan, SILS PhD Student
  Transaction Log Analysis, Fall ’08
Brad Hemminger, Associate Professor SILS and Sara Ramdeen, SILS PhD Student
  End-user study, Fall ’08
Kathy Wisser, Tessa Sullivan, Derek Rodriguez (SILS PhD Students)
  Metadata Quality Project
Projects under consideration (next 6 – 12 months)
Indexing Non-MARC data (EAD, DC, etc.) in Endeca
  Data modeling
  Data harvesting
  User interface issues
Creating a user interface for NC Central
  User needs assessment
  User interface design and implementation
  Evaluation
Evaluation (ongoing)
  Log analysis / Usability Testing
Research Opportunities
Name Authorities
  Can we integrate LC Name authorities into our Endeca interfaces?
Faceted Access to Subject Terminology (FAST)
  Simplified schema based on LCSH to better support post-coordinated retrieval and simplify heading assignment
  See http://www.oclc.org/research/projects/fast/
FRBR / Record Rollup
Research Opportunities
Find similar / recommender services
  How do we leverage metadata about a record to help the user find similar items?
  Can facets drive ‘find similar’ functionality?
  Can we use circulation / usage data to recommend other titles to users?
Personalization
  Do users want a personalized experience with a system like this?
  What about tagging?
Support for small devices like the iPhone, etc.
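
One possible way to explore the question of whether facets can drive "find similar" is to score records by the overlap of their facet values. The sketch below uses Jaccard similarity over (facet, value) pairs. This is one candidate technique chosen for illustration, not something the project has adopted, and the records and facet values are made up.

```python
def facet_set(record):
    """Flatten a record's facet values into a set of (facet, value) pairs."""
    return {(facet, value) for facet, values in record["facets"].items() for value in values}


def jaccard(a, b):
    """Shared pairs divided by all distinct pairs; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def find_similar(target, candidates, top_n=3):
    """Rank other records by facet overlap with the target, dropping zero scores."""
    target_set = facet_set(target)
    scored = [
        (jaccard(target_set, facet_set(c)), c["id"])
        for c in candidates
        if c["id"] != target["id"]
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(rid, round(score, 2)) for score, rid in scored[:top_n] if score > 0]


# Made-up records for illustration.
catalog = [
    {"id": "r1", "facets": {"Subject": ["Trout fishing"], "Region": ["Michigan"], "Format": ["Book"]}},
    {"id": "r2", "facets": {"Subject": ["Trout fishing"], "Format": ["Book"]}},
    {"id": "r3", "facets": {"Subject": ["Library schools"], "Region": ["North Carolina"], "Format": ["Book"]}},
    {"id": "r4", "facets": {"Subject": ["Jazz"], "Format": ["Music CD"]}},
]

print(find_similar(catalog[0], catalog))
```

A score built this way weights every facet equally, so a shared Format counts as much as a shared Subject. Weighting facets differently, or blending in circulation data, would be among the design questions for the research.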
Possible forms for collaboration
Field experiences for Master’s students
  Examples
    Shadowing a project
    System building
Independent studies
  Examples
    Writing a literature review / background paper
    Transaction log analysis, user studies
Joint research and development projects
Discussion and Questions
Derek Rodriguez, TRLN
919-962-8022
derek@trln [email protected]
http://www.trln.org
Useful links:
Search TRLN: http://search.trln.org
Endeca: http://www.endeca.com