Smart Subjects: Application Independent Subject Recommendations Tito Sierra NCSU Libraries Code4Lib 2007


Smart Subjects: Application Independent

Subject Recommendations

Tito Sierra

NCSU Libraries

Code4Lib 2007

Outline

• Concept

• Motivation

• Smart Subjects Applications

• How it Works

• Strengths and Weaknesses

• Future Plans

Smart Subjects Concept

Input:

• User search query

Output:

• A list of related library subjects

Basically a subject recommendation engine.

Example 1

Input:

music therapy

Output:

• Music

• Curriculum & Instruction

• Education

• Communication & Media

• Psychology

• Biochemistry

Example 2

Input:

asymptotic stability

Output:

• Bioinformatics & Biomathematics

• Statistics

• Mathematics, Science & Technology Education

• Mathematics

• Computer Science

• Aerospace Engineering

Example 3

Input:

illegal immigration

Output:

• Criminology

• Political Science

• Public Administration

• Biology

• Zoology?

• Industrial Engineering

Motivation

Search log analysis:

• Lots of topical subject queries in the long tail!

standard, international economic development, fines, dissertation abstracts, music therapy, ACM, wolfcopy, Oxford English Dictionary, audio, illegal immigration, schedule, interlibrary, datamonitor, chemistry, JAMA, CRC, photography, vision, wiley, ciation builder, job, academic search elite, ria, film studies, career development, sanborn maps, citation index, iee, history, industry analysis, scholarly journals, ethics, spss, petition, animal behavior, psych info, repository, ENR, diabetes, data, lrl, cancer, textbooks, wharton, Christian Science Monitor, ITTC, blah, PubMed, time magazine, nutrition, DVD, questia, conductive heat transfer, sage, newspaper

Motivation

Existing work:

• Subject Browse portal at NCSU

Subject Browse at NCSU

• Locally developed subject classification launched in Fall 2005

• 100 subject nodes in 12 top-level categories

• Subject nodes influenced by the university curriculum (e.g. Crop Science)

Smart Subjects Applications

• Quick Search integration

• OpenSearch interface

Quick Search Integration

OpenSearch Interface
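
The original slide showed only a screenshot, so the exact response format is not recoverable from this transcript. As a rough sketch of what an OpenSearch-style interface could return, the block below serializes a list of recommended subjects as an RSS 2.0 feed carrying the OpenSearch 1.1 response elements. The function name, the `base_url` value, and the link scheme are all hypothetical, not taken from the project.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Namespace of the OpenSearch 1.1 response elements.
OPENSEARCH_NS = "http://a9.com/-/spec/opensearch/1.1/"


def subjects_to_opensearch_rss(query, subjects, base_url="http://example.org/subjects/"):
    """Serialize subject recommendations as an RSS 2.0 feed with OpenSearch elements.

    base_url is a placeholder; the real service URLs are not known from the slides.
    """
    ET.register_namespace("opensearch", OPENSEARCH_NS)
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"Smart Subjects: {query}"
    ET.SubElement(channel, "link").text = base_url
    ET.SubElement(channel, "description").text = f"Subject recommendations for {query}"

    # OpenSearch paging metadata: everything is returned on a single page here.
    ET.SubElement(channel, f"{{{OPENSEARCH_NS}}}totalResults").text = str(len(subjects))
    ET.SubElement(channel, f"{{{OPENSEARCH_NS}}}startIndex").text = "1"
    ET.SubElement(channel, f"{{{OPENSEARCH_NS}}}itemsPerPage").text = str(len(subjects))

    # One RSS item per recommended subject, in ranked order.
    for subject in subjects:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = subject
        ET.SubElement(item, "link").text = base_url + quote(subject)

    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    print(subjects_to_opensearch_rss("music therapy", ["Music", "Psychology"]))
```

Returning a standard feed format is what would let any OpenSearch-aware client consume the recommendations without knowing how they were produced.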

How it Works

1. Harvest available institutional data

  • Course catalog descriptions

  • Faculty publication citations

2. Create “text extract” representations for each academic department on campus

3. Index the text extracts

4. Retrieval interface queries indices

5. Retrieval algorithm crosswalks academic departments to library subject classification
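
The five steps above can be sketched end to end in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: the sample data, the crosswalk table, the TF-IDF style scoring, and every function name are hypothetical, and a plain in-memory index stands in for the real indexer.

```python
import math
import re
from collections import Counter, defaultdict

# Step 1 (hypothetical sample data): harvested course descriptions and
# publication citations, grouped by academic department.
HARVESTED = {
    "Music": [
        "Introduction to music theory and performance",
        "Music therapy in clinical practice",
    ],
    "Psychology": [
        "Cognitive psychology and behavior therapy",
        "Music perception and cognition in adults",
    ],
    "Mathematics": [
        "Asymptotic stability of nonlinear differential equations",
    ],
}

# Hypothetical crosswalk from academic departments to library subject nodes.
CROSSWALK = {
    "Music": ["Music"],
    "Psychology": ["Psychology"],
    "Mathematics": ["Mathematics", "Statistics"],
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(harvested):
    """Steps 2-3: merge each department's snippets into one text extract
    and index it as a bag of term counts."""
    return {
        dept: Counter(tokenize(" ".join(snippets)))
        for dept, snippets in harvested.items()
    }


def score_departments(index, query):
    """Step 4: score each department against the query (assumed TF-IDF weighting)."""
    scores = defaultdict(float)
    for term in tokenize(query):
        matching = [dept for dept, counts in index.items() if term in counts]
        if not matching:
            continue
        idf = math.log(1 + len(index) / len(matching))
        for dept in matching:
            scores[dept] += index[dept][term] * idf
    return dict(scores)


def recommend_subjects(index, crosswalk, query, limit=6):
    """Step 5: crosswalk department scores to library subjects and rank them.
    Each subject keeps the best score of any department that maps to it."""
    subject_scores = {}
    for dept, score in score_departments(index, query).items():
        for subject in crosswalk.get(dept, []):
            subject_scores[subject] = max(score, subject_scores.get(subject, 0.0))
    ranked = sorted(subject_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [subject for subject, _ in ranked[:limit]]


if __name__ == "__main__":
    index = build_index(HARVESTED)
    print(recommend_subjects(index, CROSSWALK, "music therapy"))
```

Because the index is built from institutional text rather than from any one catalog or database, the same `recommend_subjects` call could sit behind any search box; a query whose terms appear in no extract simply returns an empty list.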

Technology Used

• SWISH-E for indexing

• PHP for retrieval processing/scoring

Strengths

• Application and collection independent

• Subject recommendations can be integrated in any library search application

• Encourages broader, serendipitous resource discovery

Weaknesses

• False positives (bad recommendations)

• Zero hits (no recommendations)

Future Plans

• Deploy new uses of Smart Subjects tool

  • Database Advisor

• Increase the size of subject indices

  • Article table of contents data

  • Backlog of course descriptions

• Gauge interest for a community subject recommendation platform

More Information

Project Site: http://www.lib.ncsu.edu/dli/projects/smartsubjects

Thanks!

Tito Sierra

NCSU Libraries

[email protected]