
Spring 2008 Progress Report
SPR2008PR

David Gleich and Ying Wang (with Margot Gerritsen and Amin Saberi too!)

Library of Congress, May 27th or May 28th

Alternate Titles

Why LCSH is better than Wikipedia

Matching stuff to fluff

A novel quadratic programming framework for the network alignment problem.

Outline

The matching problem and its myriad uses

Parsing Wikipedia and LCSH for all of the data

Theories on subject ontologies (you probably know better)

Last fall

Ying, Jeremy, Vinayak, and I spoke to a few of you about the similarities between LCSH and Wikipedia categories.

We started working on ways of comparing these databases.

From MARC to GRAPH

1. Concatenate subfields of 1xx tags for nodes.
2. Use 550 and 551 tags for edges.
3. Use 450 and 451 tags for alternate names.

...
150 0 _aKlingon (Artificial language)
450 0 _atlhIngan (Artificial language)
550 0 _wg _aLanguages, Artificial
...

Klingon (Artificial language) -> Languages, Artificial
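The three rules above can be sketched in a few lines of Python. This is only an illustration on the text form of the record shown on the slide, not the parser we actually used. The names `parse_field`, `heading`, and `record_to_graph` are hypothetical, joining subfields with `--` is an assumption, and so is skipping the `_w` control subfield when building a name.

```python
import re

# Matches one subfield such as "_aKlingon (Artificial language)" or "_wg".
SUBFIELD = re.compile(r"_(\w)\s*([^_]*)")

def parse_field(line):
    """Split '550 0 _wg _aLanguages, Artificial' into (tag, [(code, value), ...])."""
    tag, rest = line.split(None, 1)
    subfields = [(code, value.strip()) for code, value in SUBFIELD.findall(rest)]
    return tag, subfields

def heading(subfields, skip=("w",)):
    """Concatenate subfield values into one name (assumption: '_w' is control data)."""
    return "--".join(value for code, value in subfields if code not in skip and value)

def record_to_graph(lines):
    """Return (node, edges, alternate_names) for one authority record."""
    node, targets, alt_names = None, [], []
    for line in lines:
        tag, subfields = parse_field(line)
        name = heading(subfields)
        if tag.startswith("1"):          # rule 1: 1xx tags name the node
            node = name
        elif tag in ("550", "551"):      # rule 2: 550/551 tags give edges
            targets.append(name)
        elif tag in ("450", "451"):      # rule 3: 450/451 tags give alternate names
            alt_names.append(name)
    edges = [(node, target) for target in targets]
    return node, edges, alt_names

record = [
    "150 0 _aKlingon (Artificial language)",
    "450 0 _atlhIngan (Artificial language)",
    "550 0 _wg _aLanguages, Artificial",
]
node, edges, alt_names = record_to_graph(record)
```

On the Klingon record this gives one node, one edge to "Languages, Artificial", and one alternate name.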

LCSH Overview (largest connected component)

Privacy
Privacy, Right of
Privacy (Jewish Law)
Privacy (Islamic Law)
Privacy (Canon Law)

Wikipedia to GRAPH

see also

narrower term

Determinants Linear algebra

Wikipedia ideas

Evaluate LCSH graph vs. WC graph

Try and match LCSH with WC

... many more ideas …

MATCHING

Matching means taking a node in LCSH and finding only one node in WC that is a good pair.

Most famous matching problem: stable marriage.
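For readers who have not seen stable marriage before, the classic Gale-Shapley proposal algorithm is sketched below. The slides do not spell out an algorithm, so treat this as background only. The function name `stable_marriage` and the people `A`, `B`, `X`, `Y` are made up for the example, and the sketch assumes both sides have the same size and rank everyone on the other side.

```python
def stable_marriage(proposer_prefs, receiver_prefs):
    """Gale-Shapley: proposers propose in preference order, receivers keep the best offer."""
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {r: {p: pos for pos, p in enumerate(prefs)}
            for r, prefs in receiver_prefs.items()}
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver each proposer will try
    engaged = {}                                   # receiver -> current proposer
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in engaged:
            engaged[r] = p
        elif rank[r][p] < rank[r][engaged[r]]:
            free.append(engaged[r])   # r trades up, the old partner is free again
            engaged[r] = p
        else:
            free.append(p)            # r rejects p, who tries the next choice later
    return {p: r for r, p in engaged.items()}

pairs = stable_marriage(
    {"A": ["X", "Y"], "B": ["X", "Y"]},
    {"X": ["B", "A"], "Y": ["A", "B"]},
)
```

Here both proposers want `X`, `X` prefers `B`, so `A` ends up with `Y`.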

Stable Marriage

[Figure: stable marriage example with preference rankings numbered 1 to 6; names shown: Angelina Jolie, Brad Pitt, David Gleich, Laura Bofferding]

Slide approved by Laura Bofferding

2008 May 27

Matching WC and LCSH

• LCSH and WC (Wikipedia categories) have short text labels; use the labels to come up with a set of potential links.

Graph A Graph B

Linear algebra Linear algebra

Linear functions

Algebra
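One simple way to get potential links from labels is to link two nodes whenever their labels share a word. The slides do not say which text rule was used, so the shared-word rule and the name `candidate_links` are assumptions for this sketch.

```python
def candidate_links(labels_a, labels_b):
    """Return sorted (i, j) index pairs whose labels share at least one word."""
    # Inverted index: word -> set of positions in labels_b containing it
    index = {}
    for j, label in enumerate(labels_b):
        for word in set(label.lower().split()):
            index.setdefault(word, set()).add(j)
    links = set()
    for i, label in enumerate(labels_a):
        for word in set(label.lower().split()):
            for j in index.get(word, ()):
                links.add((i, j))
    return sorted(links)

links = candidate_links(
    ["Linear algebra"],
    ["Linear algebra", "Linear functions", "Algebra", "Privacy"],
)
```

With the labels from the figure, "Linear algebra" in Graph A gets potential links to "Linear algebra", "Linear functions", and "Algebra" in Graph B, but not to "Privacy".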

Matching with links

• How?

Graph A Graph B

Matching without links

• Bipartite matching problem/stable marriage
• Maximize the cardinality (number of pairs)

Graph A Graph B
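Maximizing cardinality over the potential links is ordinary bipartite matching. A minimal augmenting-path sketch follows. The name `max_cardinality_matching` is hypothetical, nodes of Graph A are assumed to be numbered `0..n_a-1`, and a production code would use a faster method on graphs this large.

```python
def max_cardinality_matching(links, n_a):
    """Augmenting-path bipartite matching; links are (i, j) pairs, i in Graph A."""
    adj = [[] for _ in range(n_a)]
    for i, j in links:
        adj[i].append(j)
    match_b = {}  # node j in Graph B -> matched node i in Graph A

    def try_augment(i, seen):
        # Try to give i a partner, re-routing earlier matches if needed.
        for j in adj[i]:
            if j in seen:
                continue
            seen.add(j)
            if j not in match_b or try_augment(match_b[j], seen):
                match_b[j] = i
                return True
        return False

    for i in range(n_a):
        try_augment(i, set())
    return sorted((i, j) for j, i in match_b.items())

matching = max_cardinality_matching([(0, 0), (0, 1), (1, 0)], n_a=2)
```

In the example, node 0 first takes partner 0, then gives it up for partner 1 so that node 1 can also be matched.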

Matching with squares

• Enumerate squares
• Maximize cardinality and squares

Graph A Graph B

[Figure: a square formed by nodes i, i', j, j']
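Enumerating squares can be sketched as below, under the assumption that a square is a pair of potential links (i, i') and (j, j') where (i, j) is an edge in Graph A and (i', j') is an edge in Graph B. The name `enumerate_squares` is hypothetical, edges are treated as undirected, and the double loop is quadratic in the number of links, so it only illustrates the definition.

```python
def enumerate_squares(edges_a, edges_b, links):
    """Return pairs (p, q) of link indices that close a square."""
    ea = {frozenset(e) for e in edges_a}   # undirected edges of Graph A
    eb = {frozenset(e) for e in edges_b}   # undirected edges of Graph B
    squares = []
    for p, (i, i_prime) in enumerate(links):
        for q in range(p + 1, len(links)):
            j, j_prime = links[q]
            if frozenset((i, j)) in ea and frozenset((i_prime, j_prime)) in eb:
                squares.append((p, q))
    return squares

squares = enumerate_squares(
    edges_a=[(0, 1)],
    edges_b=[("x", "y")],
    links=[(0, "x"), (1, "y"), (0, "y")],
)
```

Only links 0 and 1 close a square here: they connect the endpoints of an edge in Graph A to the endpoints of an edge in Graph B.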

Matching with squares

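Bipartite matching (polynomial):

$$
\begin{aligned}
\underset{x}{\text{maximize}} \quad & e^T x \\
\text{subject to} \quad & Ax \le 1, \quad x_i \in \{0, 1\}
\end{aligned}
$$

Square matching (NP-complete):

$$
\begin{aligned}
\underset{x}{\text{maximize}} \quad & e^T x + x^T S x \\
\text{subject to} \quad & Ax \le 1, \quad x_i \in \{0, 1\}
\end{aligned}
$$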
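To make the square-matching objective concrete, here is a tiny brute-force sketch. It assumes x has one 0/1 entry per potential link, that Ax <= 1 means each node is used by at most one chosen link, and that S is symmetric with a 1 for each pair of links that forms a square. The names `objective`, `is_matching`, and `brute_force` are hypothetical, and trying every x is only feasible for toy inputs, which is consistent with the problem being NP-complete.

```python
from itertools import product

def objective(x, squares):
    """e^T x + x^T S x; S is symmetric, so each active square is counted twice."""
    return sum(x) + 2 * sum(x[p] * x[q] for p, q in squares)

def is_matching(x, links):
    """Check Ax <= 1: no node of either graph appears in two chosen links."""
    used_a, used_b = set(), set()
    for (i, i_prime), chosen in zip(links, x):
        if chosen:
            if i in used_a or i_prime in used_b:
                return False
            used_a.add(i)
            used_b.add(i_prime)
    return True

def brute_force(links, squares):
    """Try every 0/1 vector; exponential in the number of links."""
    feasible = (x for x in product((0, 1), repeat=len(links)) if is_matching(x, links))
    best = max(feasible, key=lambda x: objective(x, squares))
    return best, objective(best, squares)

links = [(0, "x"), (1, "y"), (0, "y")]
best_x, best_value = brute_force(links, squares=[(0, 1)])
```

In this toy case the best choice picks links 0 and 1: two pairs plus one square, counted twice through the symmetric S.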