Co-Cited Author Maps as Interfaces to Digital Libraries: Kohonen and PFNet Displays for the Humanities Howard D. White Jan Buzydlowski Xia Lin College of Information Science and Technology Drexel University, Philadelphia, PA

Post on 12-Jan-2016

Co-Cited Author Maps as Interfaces to Digital Libraries: Kohonen and PFNet Displays for the Humanities

Howard D. White
Jan Buzydlowski
Xia Lin

College of Information Science and Technology
Drexel University, Philadelphia, PA

Co-Citation Analysis

Co-citation is the mentioning of any two earlier documents in the bibliographic references of a later third document.

The count of mentions may grow over time as new writings appear. Thus, co-citation counts can reflect citers’ changing perceptions of documents as more or less strongly related.

Documents shown to be related by their co-citation counts can be mapped as proximate in intellectual space.

[Diagram: three documents labeled Doc 1, Doc 2, and Doc 3]

Co-Citation Analysis

Lin, Xia. 1997. Map Displays for Information Retrieval. Journal of the American Society for Information Science 48: 40-54.

Chen, Chaomei. 1998. Bridging the Gap: The Use of Pathfinder Networks in Visual Navigation. Journal of Visual Languages and Computing 9: 267-286.

Document co-citation counts the number of times two papers are cited together.

Author co-citation counts the number of times two authors, e.g., Lin and Chen, are cited together.

Journal co-citation counts the number of times two journals are cited together.
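
The three kinds of counts differ only in the unit being paired. A minimal sketch in Python, assuming each citing paper is available as a list of (author, document) pairs; the sample data and the names `citing_papers` and `cocitation_counts` are hypothetical and are not part of the system described in these slides:

```python
from collections import Counter
from itertools import combinations

# Hypothetical data: one list of (cited author, cited document) pairs per citing paper.
citing_papers = [
    [("Lin", "Lin 1997"), ("Chen", "Chen 1998")],
    [("Lin", "Lin 1997"), ("Chen", "Chen 1998"), ("Chen", "another Chen paper")],
    [("Lin", "Lin 1997"), ("White", "a White paper")],
]

def cocitation_counts(reference_lists, key):
    """Count the citing papers in which each unordered pair of units appears together."""
    counts = Counter()
    for refs in reference_lists:
        # A unit is counted once per citing paper, however often it is cited there.
        units = sorted({key(ref) for ref in refs})
        for pair in combinations(units, 2):
            counts[pair] += 1
    return counts

# Document co-citation pairs whole references; author co-citation pairs author names.
doc_counts = cocitation_counts(citing_papers, key=lambda ref: ref)
author_counts = cocitation_counts(citing_papers, key=lambda ref: ref[0])

print(author_counts[("Chen", "Lin")])
```

Journal co-citation would follow the same pattern with a key that extracts the journal name from each reference.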

Co-Citation Analysis

Data on co-citation are readily obtainable from databases of the Institute for Scientific Information (ISI) in Philadelphia, PA:
• Scisearch (Science Citation Index)
• Social Scisearch (Social Sciences Citation Index)
• Arts & Humanities Search (Arts & Humanities Citation Index)

These databases are searchable online through, e.g., the Dialog Corporation.

Author Co-Citation Analysis (ACA)

Detects patterns in the frequency with which any works by any two authors are jointly cited in later works.

Only recurrent co-citation is significant: the more times authors are cited together, the more strongly related they are in the eyes of citers.

Author Co-Citation Analysis

If Ben Shneiderman and Shakespeare are cited together in one article, it probably means little.

If Ben Shneiderman and Stuart Card are cited together in 205 articles,* it means a lot: their names have jointly come to symbolize something like “interactive interfaces for digital libraries.” Possibly no subject heading captures this concept.

In a cited-author (CA) search on Dialog,

    SELECT CA=SHNEIDERMAN B AND CA=CARD SK

would retrieve the 205 citing articles.

*Actual count, 7/10/00
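
Conceptually, the AND in such a search amounts to intersecting the sets of articles that cite each author. The sketch below illustrates that idea only; it is not how Dialog is implemented, and the index contents, article IDs, and the function name `cited_together` are hypothetical:

```python
# Hypothetical inverted index: cited author -> IDs of the articles that cite that author.
cited_author_index = {
    "SHNEIDERMAN B": {1, 2, 3, 5, 8},
    "CARD SK": {2, 3, 8, 9},
    "SHAKESPEARE W": {4, 8},
}

def cited_together(index, *authors):
    """Return the IDs of articles that cite every one of the given authors (Boolean AND)."""
    postings = [index.get(author, set()) for author in authors]
    return set.intersection(*postings) if postings else set()

hits = cited_together(cited_author_index, "SHNEIDERMAN B", "CARD SK")
print(sorted(hits), len(hits))
```

The size of the intersection is the co-citation count for the pair, and its members are the citing articles a user would retrieve.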

Underlying Database and Software

ISI gave our college 10 years’ worth of data from the Arts & Humanities Citation Index (AHCI 1988-1997) as a research grant. The data set has 1.26 million bibliographic records on articles and other items from humanities journals.

For retrievals from AHCI, we bought BRS Search, an industrial-strength engine, from Dataware, Inc.

Buzydlowski and Lin have written several special programs in Java and C to implement our system on top of the BRS Search software.

Our Project

Produces co-cited author maps in real time (a few seconds) on a Web site.

Low cognitive load: User merely has to enter name of a single author of interest as a “seed.”
• E.g., Dickinson-E for Emily Dickinson

System responds with the top authors co-cited with that seed—about 25 names ranked by frequency of co-occurrence.
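
The seed-and-rank step can be sketched as follows. This is an illustration under stated assumptions, not the BRS Search based implementation: each record is assumed to be the list of authors cited by one article, and the records, the name `top_cocited`, and the alphabetical tie-break are hypothetical.

```python
from collections import Counter

def top_cocited(reference_lists, seed, n=25):
    """Rank the authors most often cited in the same articles as the seed author."""
    counts = Counter()
    for cited_authors in reference_lists:
        authors = set(cited_authors)
        if seed in authors:
            counts.update(authors - {seed})
    # Sort by descending count; break ties alphabetically so the ranking is stable.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))[:n]

# Hypothetical records: the authors cited by each of four citing articles.
records = [
    ["DICKINSON-E", "WHITMAN-W", "EMERSON-RW"],
    ["DICKINSON-E", "WHITMAN-W"],
    ["WHITMAN-W", "EMERSON-RW"],
    ["DICKINSON-E", "PLATH-S"],
]

print(top_cocited(records, "DICKINSON-E", n=2))
```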

Quick Visualizations of a Database

User can choose to display the top 25 as either a Kohonen feature map (SOM, self-organizing map) or a Pathfinder network map (PFNET).

User can use either map as:
• An aid to retrieving articles from AHCI 1988-97 that cite authors in various combinations. Combinations are made through drag-and-drop.
• Reproducible artwork in a new study, such as a review of a literature or a commentary on the author used as “seed.”

Maps in the Humanities

We are able to produce maps of authors in the humanities with high face validity.
• Can build maps around great names in literature, philosophy, history, religion, the fine arts. E.g., Dante, Picasso, D. H. Lawrence, Martin Luther, Edward Gibbon, Emily Dickinson, Plato, Vladimir Nabokov.
• Can also build maps around noted scholars, critics, or commentators. E.g., Simon Schama, Garry Wills, Elaine Showalter, Camille Paglia, Derek de Solla Price.
• System will work with authors in other ISI databases in the natural and social sciences. Also with other kinds of co-occurring terms: journal names, descriptors, etc.

Advantages of Maps

Ranked list of top 25 co-cited authors often contains names not previously known to user.

Both Kohonen maps and PFNETs show interconnections of the 25 authors not apparent in the one-dimensional ranking of a simple list.

Interpretation of Maps

Kohonen maps show high co-citation counts of authors by placing them closer in space.

PFNETs show highest co-citation counts of authors directly, as links between nodes bearing authors’ names. The counts themselves can be made to appear above the links.

Kohonen Feature Maps

Are a variety of neural network.

Are produced by an algorithm for unsupervised computer learning in which data points “compete” for the position on the output grid that best represents their numeric weights (co-citation counts) relative to all other points.
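
The competitive step can be illustrated with a very small self-organizing map. This is a generic textbook-style sketch, not the project's C procedures; the grid size, learning-rate and radius schedules, zero diagonal, normalization, and the four-author matrix are all assumptions made for the example.

```python
import math
import random

def best_unit(grid, vector):
    """Return the grid cell whose weight vector is closest to the input (the winner)."""
    return min(grid, key=lambda cell: sum((w - x) ** 2 for w, x in zip(grid[cell], vector)))

def train_som(vectors, rows=3, cols=3, epochs=100, seed=0):
    """Train a small self-organizing map; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    for t in range(epochs):
        remaining = 1.0 - t / epochs
        rate = 0.5 * remaining                              # learning rate decays over time
        radius = 0.5 + max(rows, cols) / 2.0 * remaining    # neighborhood shrinks over time
        for vector in rng.sample(vectors, len(vectors)):    # present inputs in random order
            win = best_unit(grid, vector)
            for cell, weights in grid.items():
                dist2 = (cell[0] - win[0]) ** 2 + (cell[1] - win[1]) ** 2
                pull = rate * math.exp(-dist2 / (2.0 * radius ** 2))
                for k in range(dim):
                    weights[k] += pull * (vector[k] - weights[k])
    return grid

# Hypothetical co-citation matrix for four authors (row i = author i's co-citation profile).
authors = ["A", "B", "C", "D"]
counts = [[0, 9, 1, 0], [9, 0, 1, 1], [1, 1, 0, 8], [0, 1, 8, 0]]
peak = max(max(row) for row in counts)
profiles = [[c / peak for c in row] for row in counts]

grid = train_som(profiles)
positions = {a: best_unit(grid, p) for a, p in zip(authors, profiles)}
print(positions)
```

Each author is placed at the cell that wins the competition for that author's profile, so authors with similar profiles tend to land in the same or neighboring cells.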

PFNETs

Are algorithmically connected graphs based on finding the “minimum-cost” path between any two nodes.

In ACA, this is generally the highest single co-citation count between author pairs (all pairs are examined).

Results in useful simplification of graph.

Use spring embedder algorithm to produce layout.
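
One common way to do this pruning is sketched below. It assumes the Pathfinder parameters r = ∞ and q = n − 1, which the slides do not state, and works directly on co-citation counts (similarities) rather than costs: a direct link survives only if no indirect path has a stronger weakest link. The function name `pfnet_edges` and the four-author matrix are hypothetical.

```python
def pfnet_edges(sim):
    """Keep the links of a symmetric co-citation matrix that survive Pathfinder pruning.

    Assumes PFNET(r = infinity, q = n - 1): a path is as strong as its weakest link,
    and a direct link is dropped if some indirect path between the pair is stronger.
    """
    n = len(sim)
    best = [row[:] for row in sim]  # best[i][j]: strongest path found so far from i to j
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = min(best[i][k], best[k][j])
                if via_k > best[i][j]:
                    best[i][j] = via_k
    return [(i, j, sim[i][j])
            for i in range(n) for j in range(i + 1, n)
            if sim[i][j] > 0 and sim[i][j] >= best[i][j]]

# Hypothetical co-citation counts among four authors (indices 0 to 3).
sim = [
    [0, 9, 4, 2],
    [9, 0, 6, 1],
    [4, 6, 0, 7],
    [2, 1, 7, 0],
]

print(pfnet_edges(sim))
```

Here six possible links reduce to three. The retained links would then be handed to a layout routine such as a spring embedder, which is not shown.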

PFNETs

Make sense as pictures of relations in databases!

Independent observers have found them highly intelligible:
• Xia Lin on Chinese philosophers
• Kate McCain on historians of science & technology
• Howard White on various literary figures and artists

Buzydlowski’s research will test the interpretability of PFNETs and Kohonen maps as interfaces for domain experts and naïve users.

Interface Design Considerations

Link interface to valuable digital libraries (ISI citation databases and the journal literatures they lead to).

Focus on intellectual content: meaningful words, meaningfully presented.

Stress quick and flexible presentations over long-term displays.

Evidence We’re on Right Track

US Patent 6,038,574: “Method and Apparatus for Clustering Collection of Linked Documents Using Co-Citation Analysis”

Filed: March 18, 1998
Awarded: March 14, 2000
Inventors: James E. Pitkow, Peter L. Pirolli, Jock D. Mackinlay, Stuart K. Card, all of Xerox PARC

[Figure: PFNET of authors co-cited with F. Schleiermacher in AHCI, 1988-1997 (Biblical and literary hermeneutics). Node labels: SCHLEIERMACHER F, GADAMER HG, KANT I, HEGEL GWF, BARTH K, DILTHEY W, HEIDEGGER M, PLATO, BIBLE, ARISTOTLE, HABERMAS J, DERRIDA J, RICOEUR P, GOETHE JWV, BULTMANN R, FRANK M, NIETZSCHE F, TILLICH P, FICHTE JG, PANNENBERG W, TROELTSCH E, SCHELLING FWJ, SCHLEGEL FV, LUTHER M, EBELING G]

AuthorLink System Structure

[Diagram components: Web Interface (Java Applet); Web Server; Application Server (Java Servlets); Kohonen Mapping Procedures in C; PFNET Mapping Procedures in C; …….. Procedures; BRS Search Engine / ISI Data; connector label: cgi]
