
Page 1: TANGO (RPI, June 2009)

TANGO (RPI, June 2009)

George Nagy, Mukkai Krishnamoorthy,

Sharad Seth

Raghav Padmanabhan,

Ramana C. Jandhyala,

Sean Kelley

Max Muthalathu,

William Silversmith

Page 2: TANGO (RPI, June 2009)

June 15, 2009 TANGO PROGRESS REPORT

Completed Stuff

• WNT (Piyushee, MS May 2008)

• TAT (Raghav, MS May 2009)

Pubs:

ICPR08, WNT, PJ & GN, Dec. 2008

ICPR08, QBT, RP & GN, Dec. 2008

MKM09, Tessellations, RJ, RP, MK, GN, SS, WS, July 2009

GREC09, TAT results, RP, RJ, MK, GN, SS, WS, July 2009

Page 3: TANGO (RPI, June 2009)

Software

• TAT (demo)

• EX2XY, XY2EX (Ramana)

• OO2XY, XY2OO (Sean, in progress)

• XY2LN (SS, MK)

• XY2WN (Bill)

• TAT stat analysis (RB & GN, in progress)

Page 4: TANGO (RPI, June 2009)

Partial grammar for X-Y trees (MK & SS)

Example table header:

Employment Status

Unemployed | Employed

Education

High School or Less | College | High School or Less | College

(under each College) BS/BA | Graduate Degree

SXY = { c [ c c ] c [ c { c [ c c ] } c { c [ c c ] } ] }

Grammar G1 for parsing all layout-equivalent tessellations of this kind is:

S := A

A := { B }

B := c [ X ] B | c [ X ]

X := c X | A X | A | c
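A minimal sketch of how G1 could be checked mechanically, written as a recursive-descent recognizer. It assumes tokens are separated by whitespace, as on the slide; the function name `accepts_g1` is hypothetical and not part of any TANGO tool.

```python
def accepts_g1(text):
    """Return True if the whitespace-separated token string is derivable from G1.

    Grammar (from the slide):
        S := A
        A := { B }
        B := c [ X ] B | c [ X ]
        X := c X | A X | A | c
    """
    tokens = text.split()
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(tok):
        nonlocal pos
        if peek() != tok:
            raise ValueError(f"expected {tok!r} at token {pos}")
        pos += 1

    def parse_a():
        expect("{")
        parse_b()
        expect("}")

    def parse_b():
        # B is one or more repetitions of: c [ X ]
        while True:
            expect("c")
            expect("[")
            parse_x()
            expect("]")
            if peek() != "c":
                break

    def parse_x():
        # X is one or more items, each either a cell c or a nested A
        count = 0
        while peek() in ("c", "{"):
            if peek() == "c":
                expect("c")
            else:
                parse_a()
            count += 1
        if count == 0:
            raise ValueError("X needs at least one c or A")

    try:
        parse_a()
        if pos != len(tokens):
            raise ValueError("trailing tokens")
    except ValueError:
        return False
    return True


SXY = "{ c [ c c ] c [ c { c [ c c ] } c { c [ c c ] } ] }"
print(accepts_g1(SXY))
```

Because every alternative of B and X starts with a distinct token, one token of lookahead is enough, which is why a plain recursive-descent recognizer suffices here.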

Page 5: TANGO (RPI, June 2009)

A’ and A’’ table formats

Two different table formats. All possible combinations may exist.

[Figure: example header layouts labeled A’, A’’, and Hybrid, built from categories A (A1, A2), B (B1, B2), C (C1, C2), and D (D1, D2)]

Page 6: TANGO (RPI, June 2009)

Appearance-based distance (WS?)

Each table cell is described by a vector: width, type size, typeface, indent, justification, alpha/num, color, #_of_chars, …

Compute differences between horizontally and vertically adjacent cells.

From the resulting “gradient map”, determine the row header, column header, and delta cell regions.

(Show GN’s Excel example)
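The gradient-map idea above can be sketched as follows. This is an illustration only: the two-component feature vector (bold flag, numeric flag), the L1 difference, and the function names are all assumptions, not the appearance features or distance actually used in TANGO.

```python
def cell_distance(u, v):
    """L1 difference between two cell feature vectors (an assumed choice of metric)."""
    return sum(abs(a - b) for a, b in zip(u, v))


def gradient_maps(grid):
    """Return (horizontal, vertical) difference maps for a rectangular grid of vectors.

    horizontal[r][c] compares cell (r, c) with its right neighbor (r, c + 1);
    vertical[r][c] compares cell (r, c) with the cell below it (r + 1, c).
    """
    rows, cols = len(grid), len(grid[0])
    horizontal = [
        [cell_distance(grid[r][c], grid[r][c + 1]) for c in range(cols - 1)]
        for r in range(rows)
    ]
    vertical = [
        [cell_distance(grid[r][c], grid[r + 1][c]) for c in range(cols)]
        for r in range(rows - 1)
    ]
    return horizontal, vertical


# Hypothetical 3x3 table: (is_bold, is_numeric) per cell.
HEADER, STUB, DATA = (1, 0), (0, 0), (0, 1)
grid = [
    [HEADER, HEADER, HEADER],  # column header row
    [STUB, DATA, DATA],        # row header followed by data cells
    [STUB, DATA, DATA],
]
horizontal, vertical = gradient_maps(grid)
print(horizontal)  # large values mark the row-header / data boundary
print(vertical)    # large values mark the column-header / data boundary
```

In this toy grid the only nonzero vertical differences lie between the first and second rows, and the only nonzero horizontal differences lie between the first and second columns of the data rows, which is where the header boundaries are.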

Page 7: TANGO (RPI, June 2009)

Prediction of TAT-time

Multiple regression of interaction time from:

• Size of table (#cols, #rows, or # cells)

• Number of aggregates

• Number of footnotes

• Number of units

• Other?

(GN has tried it with 20 tables – have Excel ‘GN_Data_Analysis’)
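A minimal sketch of the multiple regression described above, using ordinary least squares via the normal equations in plain Python. The six data rows are synthetic placeholders invented for illustration; they are not the 20 tables in the Excel analysis, and the choice of three predictors (cells, aggregates, footnotes) is an assumption.

```python
def solve_linear(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(features, times):
    """Return [intercept, coef_1, ..., coef_k] minimizing squared error."""
    x = [[1.0] + list(row) for row in features]  # prepend intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, times)) for i in range(k)]
    return solve_linear(xtx, xty)


# Synthetic example data: (number of cells, aggregates, footnotes) per table.
features = [
    (20, 0, 0),
    (45, 1, 0),
    (60, 0, 2),
    (96, 2, 1),
    (100, 1, 3),
    (30, 3, 1),
]
# Synthetic interaction times in seconds, constructed to be exactly linear.
times = [20.0, 36.5, 46.0, 69.0, 73.0, 40.0]

coefficients = fit_least_squares(features, times)
print([round(c, 3) for c in coefficients])
```

With real TAT logs the fit would not be exact, and the residuals would indicate whether additional predictors such as the number of units are worth adding.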

Page 8: TANGO (RPI, June 2009)

Table similarity

• May be useful to determine similar edit sequences.

• Tree distance between X-Y representations (symmetry?)

• Edit distance between linear P-notation for X-Y trees

• Metric for parse sequences??

• Tree distance between Wang category forests? (new)
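One of the candidates above, edit distance between linear notations, can be sketched with the standard Levenshtein recurrence over token sequences. The example strings use the bracket notation from the grammar slide as a stand-in; treating that as the P-notation, and using unit costs for insert, delete, and substitute, are assumptions.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences with unit costs."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(
                min(
                    prev[j] + 1,             # delete x
                    cur[j - 1] + 1,          # insert y
                    prev[j - 1] + (x != y),  # substitute, free if tokens match
                )
            )
        prev = cur
    return prev[-1]


tree_a = "{ c [ c c ] }".split()
tree_b = "{ c [ c c c ] }".split()
print(edit_distance(tree_a, tree_b))
```

With unit costs this distance is symmetric, so it would not capture any asymmetry between turning table A into table B and the reverse; unequal insert and delete costs would be needed for that.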

Page 9: TANGO (RPI, June 2009)

Learning ???

• Retain edit sequences from TAT

• Make X-Y tree from each imported but not edited table

• Find distance of X-Y tree from new table to all previous

• Execute edit sequences of nearest neighbor(s)

• Check algorithmically if resulting X-Y tree corresponds to correct WN

• Check visually if table corresponding to resulting X-Y tree is equivalent to original table.

• If not, edit

• Concatenate further edit and associate with X-Y tree of new table, then add to reference set
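The retrieval step in the list above (find the nearest stored X-Y tree and reuse its edit sequence) can be sketched as a nearest-neighbor lookup. Everything named here is hypothetical: the reference set contents, the edit names, and the distance, which uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for whichever tree or string distance is eventually chosen.

```python
from difflib import SequenceMatcher


def distance(tree_a, tree_b):
    """Stand-in dissimilarity in [0, 1] between two linear X-Y tree strings."""
    return 1.0 - SequenceMatcher(None, tree_a.split(), tree_b.split()).ratio()


def nearest_neighbor(new_tree, reference_set):
    """Return the (tree, edit_sequence) pair closest to new_tree."""
    return min(reference_set, key=lambda item: distance(new_tree, item[0]))


def add_to_reference(reference_set, new_tree, reused_edits, further_edits):
    """Store the concatenated edit sequence under the new table's X-Y tree."""
    reference_set.append((new_tree, list(reused_edits) + list(further_edits)))


# Hypothetical reference set: X-Y tree string -> edit sequence retained from TAT.
reference_set = [
    ("{ c [ c c ] }", ["hypothetical_edit_a"]),
    ("{ c [ c c ] c [ c c c ] }", ["hypothetical_edit_b", "hypothetical_edit_c"]),
]

new_tree = "{ c [ c c ] c [ c c ] }"
matched_tree, edits = nearest_neighbor(new_tree, reference_set)
print(matched_tree, edits)

# If the replayed edits needed a manual correction, record the combined sequence.
add_to_reference(reference_set, new_tree, edits, ["hypothetical_manual_fix"])
```

The algorithmic and visual checks in the list are left out of the sketch; they would sit between the lookup and the call that adds the new entry.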

Page 10: TANGO (RPI, June 2009)

Discussion Items

• Lists & Ordering

• XML format and verification

• Augmentations (spotting and processing)

• Open Office

• Table ontology

• XY tree to WN via lexical parse (checks?)

• Use of parse trees for XY2WN

• Learning?

• Overall TANGO evaluation for final report

• Critique draft slides for GREC and MKM

• Tools: RPI: OO, VBA, Matlab, Python; BYU: ??

• Other RPI projects: PERFECT, CERVITOR, CAVIAR

Page 11: TANGO (RPI, June 2009)

Survival Plans

• NSF TANGO Final Report!

• New NSF proposal (Maria)

• Other possible sponsors?

• Confs

• Archival Journals

• Collaborators

• Demos and dissemination

• Next visit