Endeca @ NCSU Libraries, Kristin Antelman, NCSU Libraries, June 24, 2006


Upload: shanon-andrews

Post on 31-Dec-2015


Page 1: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Endeca @ NCSU Libraries

Kristin Antelman

NCSU Libraries

June 24, 2006

Page 2: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Overview

The problem
Quick demo
Technical overview
Implementation process
Use data
Assessment data
Next steps

Page 3: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Why did we do this?

Existing catalogs are hard to use:
– known item searching works pretty well, but …
– users often do keyword searching on topics and get large result sets returned in system sort order
– catalogs are unforgiving on spelling errors, stemming

NO RELEVANCY!

Page 4: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Catalog value is buried

Subject headings are not leveraged in searching
– they should be browsed or linked from, not searched

Data from the item record is not leveraged
– should be able to filter by item type, location, circulation status, popularity

Page 5: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

What does the Endeca software do?

Provides search software for ecommerce companies

Faceted browse of structured metadata; goal is to expose the ontology
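To make "faceted browse of structured metadata" concrete, here is a minimal sketch in plain Python. It is not Endeca code: the records, the field names, and the helper functions `facet_counts` and `refine` are all hypothetical, chosen only to show how each dimension can be exposed as a list of values with counts and then used to narrow a result set.

```python
from collections import Counter

# Hypothetical catalog records; field names and values are illustrative only.
RECORDS = [
    {"title": "Intro to Statistics", "format": "Book", "language": "English"},
    {"title": "Statistics on Film", "format": "DVD", "language": "English"},
    {"title": "Statistique appliquee", "format": "Book", "language": "French"},
]

def facet_counts(records, dimension):
    """Count how many records fall under each value of one dimension."""
    return Counter(r[dimension] for r in records if dimension in r)

def refine(records, dimension, value):
    """Keep only the records whose dimension matches the chosen value."""
    return [r for r in records if r.get(dimension) == value]

# Show the Format facet, then narrow to books and show the Language facet.
books = refine(RECORDS, "format", "Book")
print(facet_counts(RECORDS, "format"))
print(facet_counts(books, "language"))
```

Each refinement recomputes the counts on the narrowed set, which is what lets a user see the remaining values of every other dimension at each step.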

Page 20: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Endeca technical overview

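Data flow: Raw MARC data -> NCSU exports and reformats -> Flat text files -> Data Foundry (parse text files) -> Indices -> MDEX Engine

Request path: Client browser -> HTTP -> NCSU Web Application -> HTTP -> MDEX Engine

Endeca Information Access Platform: Data Foundry, Indices, MDEX Engine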
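The "exports and reformats" step can be pictured with a small sketch. Everything here is an assumption made for illustration: the records are already parsed into a dictionary of MARC tag to values, and the property names, the `|` and `;` separators, and the `to_flat_line` helper are hypothetical, not the file layout NCSU or Endeca actually used.

```python
# Hypothetical, already-parsed records: MARC tag -> list of field values.
PARSED_RECORDS = [
    {
        "001": ["ocm0001"],
        "245": ["Faceted search"],
        "650": ["Information retrieval", "Catalogs"],
    },
]

# Assumed mapping from MARC tags to flat-file property names.
# 001 = control number, 245 = title statement, 650 = topical subject.
FIELD_MAP = {"001": "id", "245": "title", "650": "subject_topic"}

def to_flat_line(record, field_sep="|", value_sep=";"):
    """Render one record as a single delimited line of name=value pairs."""
    parts = []
    for tag, name in FIELD_MAP.items():
        values = record.get(tag, [])
        parts.append(f"{name}={value_sep.join(values)}")
    return field_sep.join(parts)

FLAT_LINES = [to_flat_line(r) for r in PARSED_RECORDS]
print(FLAT_LINES[0])
```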

Page 21: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Integrating Endeca - Enhancements

MarcAdapter plugin for raw MARC data.
– Eliminate need for external MARC 21 translation and file merging

Partial Updates
– Update circulation data multiple times throughout the day
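The idea behind partial updates can be sketched as follows: only the properties that changed (here, circulation status) are overwritten, and the rest of the index is left alone. The in-memory `index` dictionary and the `apply_partial_update` helper are hypothetical stand-ins, not the Endeca update mechanism.

```python
# Hypothetical in-memory index: record id -> indexed properties.
index = {
    "rec1": {"title": "Faceted search", "availability": "Available"},
    "rec2": {"title": "Library catalogs", "availability": "Checked out"},
}

def apply_partial_update(index, changes):
    """Overwrite only the listed properties of records already in the index.

    Returns the number of records that were updated; unknown ids are skipped.
    """
    updated = 0
    for record_id, props in changes.items():
        if record_id in index:
            index[record_id].update(props)
            updated += 1
    return updated

# A circulation feed reports that rec2 was returned; rec9 is not in the index.
count = apply_partial_update(
    index,
    {"rec2": {"availability": "Available"}, "rec9": {"availability": "Available"}},
)
print(count, index["rec2"])
```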

Page 22: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Implementation process

Timeline
– License / negotiation: Spring 2005
– Acquire: Summer 2005
– Implementation: August 2005 – January 12, 2006

7 representative team members
– functional requirements, metadata, interface issues (total of 40-60 hours)
– project manager: approximately 10 hours per week for 20 weeks

Java-trained librarian (30-40 hrs/wk for 14 weeks)

It doesn’t have to be perfect!

Page 23: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Key decision points

Search interface

Page 24: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Main search page

Endeca

Web2

Page 25: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Advanced search

Page 26: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

A few major issues

Search interface

Selecting dimensions and their order

Page 27: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Dimensions

1. Subject: Topic
2. Subject: Genre
3. Format
4. Library
5. Subject: Region
6. Subject: Era
7. Language
8. Author
9. Availability
10. Library of Congress Classification

Page 28: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

A few major issues

Search interface

Selecting dimensions and their order

Defining the relevance algorithm

Page 29: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Relevance defined

Relevance ranking in Endeca – select from a variety of modules and order them based on importance

At NCSU…
1. Original query term(s) (no thesaurus, stemming, spell correction)
2. Exact phrase match
3. Field ranking (Title higher than Author higher than Table of Contents, etc.)
4. Number of fields that contain term(s)
…
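One way to picture "order the modules based on importance" is a sort key built as a tuple, where earlier positions outweigh later ones. The sketch below covers only the four modules listed above, in plain Python. It is not the Endeca relevance engine: the field weights, the whole-word matching rule, and the names `relevance_key` and `rank` are assumptions for illustration.

```python
# Assumed field weights: Title ranks above Author, which ranks above Table of Contents.
FIELD_WEIGHT = {"title": 3, "author": 2, "toc": 1}

def relevance_key(record, query):
    """Build a tuple whose positions follow the order of the four modules."""
    terms = query.lower().split()
    phrase = " " + " ".join(terms) + " "
    # Normalize each field to lowercase words separated by single spaces.
    fields = {name: " ".join(record.get(name, "").lower().split()) for name in FIELD_WEIGHT}
    # Fields that contain every original query term as a whole word.
    matching = [name for name, text in fields.items()
                if all(term in text.split() for term in terms)]
    has_terms = 1 if matching else 0                      # 1. original query terms
    exact_phrase = 1 if any(phrase in " " + fields[name] + " "
                            for name in matching) else 0  # 2. exact phrase match
    best_field = max((FIELD_WEIGHT[name] for name in matching), default=0)  # 3. field ranking
    return (has_terms, exact_phrase, best_field, len(matching))  # 4. number of fields

def rank(records, query):
    """Sort best first; tuples compare position by position, so earlier modules dominate."""
    return sorted(records, key=lambda r: relevance_key(r, query), reverse=True)

# Hypothetical records used only to exercise the ordering.
SAMPLE = [
    {"id": "b", "title": "War and civil society", "author": "", "toc": ""},
    {"id": "c", "title": "Gardening", "author": "Civil War Press", "toc": ""},
    {"id": "a", "title": "Civil War history", "author": "", "toc": "The American Civil War"},
]
ORDER = [r["id"] for r in rank(SAMPLE, "civil war")]
print(ORDER)
```

In this sample, record "c" outranks record "b" even though "b" matches in the title, because the exact phrase module is consulted before field ranking.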

Page 30: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Use data

Page 31: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Some search statistics (March - May 2006)

Requests by Search Type

Search 51%
Search -> Navigation 29%
Navigation 20%

Page 32: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Sorting statistics (March – May 2006)

Sorting Requests

Most Popular 19%
Title A-Z 13%
Pub Date 53%
Author A-Z
Call Number

Page 33: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Some navigation statistics (March - May 2006)

Navigation Requests by Dimension

Dimension            Requests
Author               70,516
Language             38,074
Subject: Era         38,605
Subject: Region      59,248
Library              87,221
Format               74,985
Subject: Genre       65,545
Subject: Topic       155,856
LC Classification    169,249
Availability         23,848

Page 34: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Assessment

Page 35: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Some user reaction

“The new Endeca system is incredible. It would be difficult to exaggerate how much better it is than our old online card catalog (and therefore that of most other universities). I've found myself searching the catalog just for fun, whereas before it was a chore to find what I needed.”

- NCSU Undergrad, Statistics

“The new library catalog search features are a big improvement over the old system. Not only is the search extremely fast, but seemingly it's much more intelligent as well.”

- NCSU faculty, Psychology

Page 36: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Topical searching tasks

Topical Task Success: Web2

Easy 36%
Medium 7%
Hard 23%
Failed 34%

Topical Task Success: Endeca

Easy 58%
Medium 17%
Hard 3%
Failed 22%

Page 37: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Average topical task duration

Page 38: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Testing relevance

Are search results in Endeca more likely to be relevant to a user’s query than search results in Web2 OPAC?

100 topical user searches from 1 month in fall 2005

How many of top 5 results relevant?
– 40% relevant in Web2 OPAC
– 68% relevant in Endeca catalog
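A measure of this kind can be computed as a pooled share of relevant results among the top 5 of each search. The sketch below assumes that is how the percentages were tallied, which the slide does not state. The judgments are made-up placeholders, not the study data, and `precision_at_k` is a hypothetical helper name.

```python
def precision_at_k(judgments, k=5):
    """Share of relevant results among the top k of every search, pooled.

    judgments: one list of booleans per search, in result order
    (True means the result was judged relevant).
    """
    relevant = sum(sum(1 for j in per_search[:k] if j) for per_search in judgments)
    shown = sum(len(per_search[:k]) for per_search in judgments)
    return relevant / shown if shown else 0.0

# Made-up judgments for two searches, for illustration only.
SAMPLE_JUDGMENTS = [
    [True, True, False, False, True],
    [False, True, False, False, False],
]
print(precision_at_k(SAMPLE_JUDGMENTS))
```

Averaging a per-search percentage instead of pooling would give the same figure whenever every search returns at least 5 results.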

Page 41: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Future plans

FRBR-ized displays

FAST (Faceted Application of Subject Terminology) instead of LCSH

Enrich records with supplemental content

More integration with website search

Use Endeca to index local collections

Page 42: Endeca @ NCSU Libraries Kristin Antelman NCSU Libraries June 24, 2006

Thank you

project page: www.lib.ncsu.edu/endeca