October 21st, 2010 www.gbv.de Current trends in library search: From electronic card boxes to large scale, aggregated search engines Till Kinstler, kinstler@gbv.de, Verbundzentrale des GBV (VZG)

Current trends in library search:

From electronic card boxes to large scale,

aggregated search engines

Till Kinstler, kinstler@gbv.de, Verbundzentrale des GBV (VZG)

Taken from Avram, Henriette D.: The MARC Pilot Project. Final Report. Library of Congress, Washington, DC., 1968, http://www.eric.ed.gov/ERICWebPortal/detail?accno=ED029663

01410nam a2200349 i 4500

001 177062495

003 DE-601

005 20100706235250.0

008 950105s1957 xxu 000 0 eng d

016 7 $a 452218721 $2 DE-101

040 $a GyGoGBV $b ger $e rakwb

041 0 $a eng

044 $a xxu $a xxk

245 00 $a Information systems in documentation : $b based on the Symposium on Systems for Information Retrieval held at Western Reserve University, Cleveland, Ohio, in April, 1957 / $c ed.: Jesse H. Shera; A. Kent; J. W. Perry.

260 $a New York, NY [u.a.] : $b Interscience Publ., $c 1957.

300 $a XV, 639 S : $b Ill., graph. Darst ; $c gr. 8.

490 0 $a Advances in documentation and library science ; $v 2

653 0 $a Information storage and retrieval systems

700 1 $a Shera, Jesse H., $0 (DE-601)400477432.

700 1 $a Kent, A..

700 1 $a Perry, J. W..

710 2 $a Symposium on Systems for Information Retrieval $c (1957, Cleveland, Ohio)

711 2 $a Symposium on Systems for Information Retrieval $d (1957.04. : $c Cleveland, Ohio)

830 $v 2 $w (DE-601)129356271

950 $a Literaturrecherche $a Kongreß $2 GBV

050 0 $a Z695.92

060 0 $a Z 1008

082 00 $a 029.75

084 $a 06.74 ; Informationssysteme $2 bcl

084 $a 35.99 ; Chemie: Sonstiges $2 bcl

900 $a GBV $b SUB+Uni Göttingen <7> $d !FMAG! ZA 18582:2 $x L $z LC

954 $a 40 $b 742655784 $c 01 $d ZA 18582:2 $e u $x 0007
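
The field lines of the record above follow a regular pattern: a three-digit tag, optional indicators, then subfields introduced by `$` and a one-character code. The following is a minimal sketch of splitting one such line, assuming the display notation shown on this slide (not the binary ISO 2709 exchange format). The function name `parse_field` and the returned dictionary layout are made up for illustration, and the leader line is not handled.

```python
import re

# A subfield is "$" + one-character code + text up to the next subfield or end of line.
SUBFIELD = re.compile(r"\$(\w)\s+(.*?)(?=\s+\$\w\s|$)")

def parse_field(line):
    """Split one displayed MARC field line into tag, indicators and subfields."""
    tag, _, rest = line.strip().partition(" ")
    if not (len(tag) == 3 and tag.isdigit()):
        raise ValueError(f"not a field line: {line!r}")
    if tag < "010":
        # Control fields (001-009) carry a plain value without subfields.
        return {"tag": tag, "indicators": "", "subfields": [], "value": rest.strip()}
    # Whatever precedes the first "$" is treated as the indicators.
    indicators = rest.partition("$")[0].strip()
    return {
        "tag": tag,
        "indicators": indicators,
        "subfields": SUBFIELD.findall(rest),
        "value": None,
    }

field = parse_field("260 $a New York, NY [u.a.] : $b Interscience Publ., $c 1957.")
print(field["subfields"])
# [('a', 'New York, NY [u.a.] :'), ('b', 'Interscience Publ.,'), ('c', '1957.')]
```

Even this small example shows why indexing such data needs care: punctuation like the trailing `:` and `,` is part of the stored subfield values.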

GBV (Common Library Network)

~ 400 (mainly academic) libraries

~ 120 million data records describing books, articles, digital objects, ... available through these libraries

What do we do with this data?

„In addition, we have also found that the poor usability, high complexity, and lack of integration of many electronic resource discovery systems, have raised the entry threshold of information technology literacy. This acts as a barrier to information search and retrieval. […]

Users find database structures hinder. They have to learn the procedural knowledge for using a particular database as well as have some basic knowledge of how the data table is organised and what subject matter the built-in thesauri refers to; both have limited transferability. The participants did not appear to lack information technology or digital literacy, as they had demonstrated they were able to use other internet-based search and retrieval tools.“

(Wong, W. ; Stelmaszewska, H. ; Barn, B. ; Bhimani, N. ; Barn, S.: JISC User Behaviour Observational Study: User Behaviour in Resource Discovery. Final Report / JISC. Version: November 2009. http://www.jisc.ac.uk/media/documents/publications/programme/2010/ubirdfinalreport.pdf)

Search Engine Index

Suchkiste goals:

Provide a single access point to the DFG Nationallizenzen collection (~150 million digital resources)

Make better use of the data with a search engine built on information retrieval technology → Solr

User experience based on web standards

Open up library data silos to the web
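
To illustrate what searching via Solr looks like from a client, here is a minimal sketch that only builds a request URL for Solr's standard `/select` handler with the `q`, `rows`, `wt`, `facet` and `facet.field` parameters. The host, the core name `suchkiste` and the field names `format` and `language` are assumptions for illustration and are not taken from the actual Suchkiste setup.

```python
from urllib.parse import urlencode

def solr_select_url(base_url, query, facet_fields=(), rows=10):
    """Build a URL for a Solr /select request with optional field facets."""
    params = [("q", query), ("rows", rows), ("wt", "json")]
    if facet_fields:
        params.append(("facet", "true"))
        # facet.field may be repeated, once per field to facet on.
        params.extend(("facet.field", name) for name in facet_fields)
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"

# Hypothetical host, core and field names.
url = solr_select_url(
    "http://localhost:8983/solr/suchkiste",
    "information retrieval",
    facet_fields=["format", "language"],
)
print(url)
```

Because the whole query is a plain HTTP GET, results can be fetched, cached and linked with ordinary web tooling.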

Challenges: data collection

Where to get it? Legal questions?

Coverage?

How to get/transfer it? OAI-PMH? RSS? ftp? „Dumps“? „XML“? Tapes?

Updates?

Processing? Normalisation? Deduplication/Clustering? Variety of data (formats), structure (implicit and explicit), documentation?, messy data, errors in data (encoding, structure...)

Storage and management of large data sets

→ lots of manual(!) work
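
As an illustration of the normalisation and deduplication/clustering step named above, the following is a minimal sketch that groups records by a crude matching key built from title and year. The record layout, the identifiers and the choice of key are assumptions for illustration; real aggregated data typically needs far more careful matching.

```python
import re
import unicodedata
from collections import defaultdict

def match_key(record):
    """Build a crude matching key: normalised title plus year."""
    text = unicodedata.normalize("NFKD", record["title"])
    # Drop combining marks so that accented and unaccented forms compare equal.
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase and collapse everything except letters and digits into spaces.
    text = re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()
    return (text, record.get("year"))

def cluster(records):
    """Return lists of record ids that share the same matching key."""
    groups = defaultdict(list)
    for record in records:
        groups[match_key(record)].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical records from two different sources.
records = [
    {"id": "a1", "title": "Information systems in documentation", "year": "1957"},
    {"id": "b7", "title": "Information Systems in Documentation.", "year": "1957"},
    {"id": "c3", "title": "The MARC pilot project", "year": "1968"},
]
print(cluster(records))  # [['a1', 'b7']]
```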

Challenges: Search Engines

How to index structured library data?

Relevance ranking?

Factors? (TF/IDF?, popularity?, availability?, „freshness“?, „context“?, …)

Use structure and content of (library) metadata?

Mixing „metadata“ and „fulltext“?

Mixing (data on) different media?

Minor issues: search suggestions, stemming
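
To make the ranking questions above concrete, here is a minimal sketch of TF/IDF scoring in which a match in the title counts more than a match in the fulltext. The toy documents, the field weights and the simple `log(1 + N/df)` weighting are assumptions for illustration; this is not the formula used by Solr/Lucene.

```python
import math
from collections import Counter

# Toy collection; record ids and texts are made up.
DOCS = {
    "rec1": {"title": "information retrieval systems",
             "fulltext": "symposium on systems for information retrieval"},
    "rec2": {"title": "library catalogues",
             "fulltext": "card catalogues and information desks"},
    "rec3": {"title": "chemistry handbook",
             "fulltext": "laboratory methods"},
}
FIELD_WEIGHTS = {"title": 2.0, "fulltext": 1.0}  # assumed boosts

def tokens(text):
    return text.lower().split()

def idf(term):
    """Rare terms weigh more: log(1 + N/df), where df = documents containing the term."""
    df = sum(1 for doc in DOCS.values()
             if any(term in tokens(value) for value in doc.values()))
    return math.log(1 + len(DOCS) / df) if df else 0.0

def score(doc, query):
    """Sum of field weight * term frequency * idf over all fields and query terms."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        counts = Counter(tokens(doc[field]))
        total += sum(weight * counts[term] * idf(term) for term in tokens(query))
    return total

def search(query):
    scored = {rid: score(doc, query) for rid, doc in DOCS.items()}
    ranked = sorted(scored, key=scored.get, reverse=True)
    return [rid for rid in ranked if scored[rid] > 0]

print(search("information retrieval"))  # ['rec1', 'rec2']
```

Factors such as popularity, availability or freshness would enter as additional terms in `score`, which is where most of the open questions lie.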

Challenges: User Interfaces

Overall user experience

Single search box ↔ advanced search making use of data structure

Browsing, Visualisation, Refining, Facets?

Making it part of the web

Search suggestions, spelling corrections, ...
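
Facets and refining, as mentioned above, boil down to counting field values in a result set and filtering on a chosen value. A minimal sketch, assuming a small in-memory result list with made-up `format` and `language` fields (a search engine would normally compute these counts itself):

```python
from collections import Counter

# Hypothetical result set of a single-search-box query.
RESULTS = [
    {"id": "r1", "format": "book", "language": "eng"},
    {"id": "r2", "format": "article", "language": "eng"},
    {"id": "r3", "format": "book", "language": "ger"},
    {"id": "r4", "format": "book", "language": "eng"},
]

def facet_counts(results, field):
    """Count how often each value of a field occurs, most frequent first."""
    return Counter(r[field] for r in results).most_common()

def refine(results, field, value):
    """Keep only the results whose field has the selected facet value."""
    return [r for r in results if r[field] == value]

print(facet_counts(RESULTS, "format"))   # [('book', 3), ('article', 1)]
books = refine(RESULTS, "format", "book")
print(facet_counts(books, "language"))   # [('eng', 2), ('ger', 1)]
```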

Next:

One central index of all GBV data (~120 million records)

Beyond opening HTML record views to the web, opening data for use on the web: linked data
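
As a rough idea of what opening data as linked data could mean, the following sketch serialises a few values of the example record as N-Triples using Dublin Core terms (`title`, `issued`, `contributor`). The subject URI under `example.org` and the mapping from catalogue fields to properties are assumptions for illustration, not GBV's actual data model, and literal escaping is reduced to backslashes and quotes.

```python
DCTERMS = "http://purl.org/dc/terms/"

def literal(value):
    """Quote a plain string literal (only backslash and quote are escaped here)."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def to_ntriples(subject_uri, record):
    """Emit one N-Triples line per property value of the record."""
    lines = []
    for prop, values in record.items():
        for value in values:
            lines.append(f"<{subject_uri}> <{DCTERMS}{prop}> {literal(value)} .")
    return "\n".join(lines)

record = {
    "title": ["Information systems in documentation"],
    "issued": ["1957"],
    "contributor": ["Shera, Jesse H.", "Kent, A.", "Perry, J. W."],
}
# Hypothetical record URI.
print(to_ntriples("http://example.org/record/177062495", record))
```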