Transforming Arbitrary Tables into F-Logic Frames with TARTAR
![Page 1: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/1.jpg)
TARTAR | Information Extraction
Transforming Arbitrary Tables into F-Logic Frames with TARTAR
Aleksander Pivk, York Sure, Philipp Cimiano, Matjaz Gams, Vladislav Rajkovic, Rudi Studer
Presented by Stephen Lynn
![Page 2: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/2.jpg)
Information Extraction
- Free-form Text
  - Linguistic/NLP approaches
- Tabular Structures
  - Table comprehension task
    - html, excel, pdf, text, etc.
  - Semantic interpretation task
  - More effort???
![Page 3: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/3.jpg)
TARTAR Architecture
![Page 4: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/4.jpg)
Semantic Representation
- Frame Logic (F-Logic)
  - Model-theoretic semantics
  - Complete resolution-based proof theory
  - Expressive power of logic
  - Availability of efficient reasoning tools
![Page 5: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/5.jpg)
F-Logic Frame
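
As a rough illustration of what a frame looks like as data, the sketch below renders a class with typed methods into a one-line textual form that approximates F-Logic signature notation (`=>`). The `Frame` and `Method` classes, the `Hotel` example, and the type names are assumptions made for illustration; they are not taken from the slide or from TARTAR itself.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    """One frame method: a name, optional parameters, and a range type."""
    name: str
    range_type: str
    params: tuple = ()


@dataclass
class Frame:
    """A class name plus the methods that describe it."""
    name: str
    methods: list = field(default_factory=list)

    def to_flogic(self) -> str:
        # Render each method as name(params) => RangeType.
        parts = []
        for m in self.methods:
            args = f"({', '.join(m.params)})" if m.params else ""
            parts.append(f"{m.name}{args} => {m.range_type}")
        return f"{self.name}[{'; '.join(parts)}]"


# Hypothetical frame for a hotel price table (all names are illustrative).
frame = Frame("Hotel", [
    Method("Name", "STRING"),
    Method("Price", "NUMBER", params=("Season", "RoomType")),
])
print(frame.to_flogic())
# Hotel[Name => STRING; Price(Season, RoomType) => NUMBER]
```

A parameterized method such as `Price(Season, RoomType)` is how a two-dimensional table cell, indexed by a row header and a column header, could be expressed in a frame.
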
![Page 6: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/6.jpg)
Table Comprehension
- Dimensions: a grouping of cells representing similar entities
![Page 7: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/7.jpg)
Table Comprehension
- Stub: dimension with headers used to index elements in body
![Page 8: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/8.jpg)
Table Comprehension
- Box head: column headers (often nested)
![Page 9: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/9.jpg)
Table Comprehension
- Body: data values
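
The three parts named on these slides (stub, box head, body) can be pictured as a small data structure in which the headers index the data values. This is a minimal sketch for a simple two-dimensional table; the `Table` class, its `lookup` method, and the sample values are hypothetical and ignore nested headers.

```python
from dataclasses import dataclass


@dataclass
class Table:
    box_head: list   # column headers
    stub: list       # row headers that index the body
    body: list       # data values, one row per stub entry

    def lookup(self, row_header: str, col_header: str) -> str:
        """Return the body value indexed by a stub entry and a box-head entry."""
        r = self.stub.index(row_header)
        c = self.box_head.index(col_header)
        return self.body[r][c]


# Hypothetical price table: seasons in the stub, room types in the box head.
table = Table(
    box_head=["Single", "Double"],
    stub=["Low season", "High season"],
    body=[["40", "60"], ["55", "80"]],
)
print(table.lookup("High season", "Double"))  # 80
```
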
![Page 10: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/10.jpg)
Table Classes
- 1D, 2D, Complex
![Page 11: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/11.jpg)
Methodology
![Page 12: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/12.jpg)
Cleaning & Canonicalization
- Clean DOM tree
  - CyberNeko HTML Parser
- Rowspan/Colspan expansion
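
Rowspan/colspan expansion turns a table with merged cells into a plain rectangular grid. The sketch below shows one way to do it, assuming each cell has already been parsed into a `(text, rowspan, colspan)` tuple and that a merged cell's text is copied into every position it covers. The function name and input format are assumptions; the slide names CyberNeko (a Java parser) for the HTML cleaning step, which this Python sketch does not reproduce.

```python
def expand_spans(rows):
    """Expand (text, rowspan, colspan) cells into a rectangular grid.

    The text of a spanning cell is copied into every grid position it covers.
    """
    grid = {}  # (row, col) -> text
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            # Skip positions already filled by a rowspan from an earlier row.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    if not grid:
        return []
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    # Positions never covered by any cell become empty strings.
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


# Hypothetical table: "Hotel" spans two rows, "Price" spans two columns.
rows = [
    [("Hotel", 2, 1), ("Price", 1, 2)],
    [("Single", 1, 1), ("Double", 1, 1)],
    [("Alpha", 1, 1), ("40", 1, 1), ("60", 1, 1)],
]
for line in expand_spans(rows):
    print(line)
```

After expansion every row has the same number of columns, which is what the later structure-detection steps would need in order to compare cells by position.
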
![Page 13: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/13.jpg)
Structure Detection
- Token Type Hierarchy
- Assign Functional Types and Probabilities
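
To make the idea of a token type hierarchy concrete, the sketch below classifies tokens with regular expressions and finds the most specific type two tokens share. The hierarchy, the type names, and the patterns are invented for illustration and are not the hierarchy used by TARTAR; assigning functional types and probabilities is not covered here.

```python
import re

# Hypothetical hierarchy: child type -> parent type.
PARENT = {
    "INTEGER": "NUMERIC",
    "FLOAT": "NUMERIC",
    "NUMERIC": "TOKEN",
    "ALPHA": "TOKEN",
    "ALPHANUMERIC": "TOKEN",
}

# Patterns are tried in order, most specific first.
PATTERNS = [
    ("INTEGER", re.compile(r"[+-]?\d+")),
    ("FLOAT", re.compile(r"[+-]?\d*\.\d+")),
    ("ALPHA", re.compile(r"[A-Za-z]+")),
    ("ALPHANUMERIC", re.compile(r"[A-Za-z0-9]+")),
]


def token_type(token):
    """Return the first leaf type whose pattern matches the whole token."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(token):
            return name
    return "TOKEN"


def ancestors(type_name):
    """Return the type followed by its ancestors up to the root."""
    chain = [type_name]
    while type_name in PARENT:
        type_name = PARENT[type_name]
        chain.append(type_name)
    return chain


def common_type(a, b):
    """Most specific type shared by two tokens."""
    b_chain = set(ancestors(token_type(b)))
    for t in ancestors(token_type(a)):
        if t in b_chain:
            return t
    return "TOKEN"


print(token_type("42"), token_type("3.5"), token_type("Room12"))
print(common_type("42", "3.5"))
```

Two cells whose common type sits low in the hierarchy (for example `NUMERIC`) are more alike than two cells that only meet at the root, which gives a simple notion of cell similarity.
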
![Page 14: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/14.jpg)
Structure Detection
- Detect Logical Table Orientation
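
One plausible way to detect orientation is to check whether neighbouring cells look more alike down the columns or across the rows. The sketch below is a simplified heuristic under that assumption, not the measure used by TARTAR: `cell_type`, `consistency`, and the labels "vertical" (same kind of value runs down a column) and "horizontal" (same kind of value runs along a row) are all hypothetical.

```python
def cell_type(text):
    """Crude two-way typing: anything float() accepts is NUMERIC."""
    try:
        float(text)
        return "NUMERIC"
    except ValueError:
        return "TEXT"


def consistency(lines):
    """Mean share of adjacent cell pairs in each line that have the same type."""
    scores = []
    for line in lines:
        pairs = list(zip(line, line[1:]))
        if pairs:
            same = sum(cell_type(a) == cell_type(b) for a, b in pairs)
            scores.append(same / len(pairs))
    return sum(scores) / len(scores) if scores else 0.0


def orientation(grid):
    """Compare type consistency along columns with consistency along rows."""
    rows = grid
    cols = [list(col) for col in zip(*grid)]
    return "vertical" if consistency(cols) >= consistency(rows) else "horizontal"


# Hypothetical body region: each row is one hotel, each column one attribute.
grid = [["Alpha", "40", "60"], ["Beta", "55", "80"]]
print(orientation(grid))
transposed = [list(col) for col in zip(*grid)]
print(orientation(transposed))
```
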
![Page 15: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/15.jpg)
Structure Detection
- Discover and Level Regions
  - Logical Units
![Page 16: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/16.jpg)
FTM Building
- Functional Table Model (FTM)
  - Arrange regions into a tree
  - Leaf nodes are data
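
The slide says only that regions are arranged into a tree whose leaf nodes are data. The sketch below shows one possible arrangement for a simple two-dimensional table: column headers under the root, row headers under each column header, and the data value as the leaf. The `Node` class, `build_tree`, and this particular nesting order are assumptions for illustration, not the FTM construction from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)

    def is_leaf(self):
        return not self.children


def build_tree(box_head, stub, body):
    """Build root -> column header -> row header -> data leaf."""
    root = Node("root")
    for c, col in enumerate(box_head):
        col_node = Node(col)
        for r, row in enumerate(stub):
            col_node.children.append(Node(row, [Node(body[r][c])]))
        root.children.append(col_node)
    return root


def leaf_paths(node, prefix=()):
    """Yield the label path from the root to every leaf."""
    path = prefix + (node.label,)
    if node.is_leaf():
        yield path
    for child in node.children:
        yield from leaf_paths(child, path)


# Hypothetical table values.
tree = build_tree(
    box_head=["Single", "Double"],
    stub=["Low", "High"],
    body=[["40", "60"], ["55", "80"]],
)
for path in leaf_paths(tree):
    print(" / ".join(path))
```

Each root-to-leaf path reads as the sequence of headers needed to reach one data value, which is the access pattern a frame method with parameters would later express.
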
![Page 17: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/17.jpg)
Semantic Enriching of FTM
- Labeling
  - WordNet and GoogleSets
- Map FTM to a frame
![Page 18: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/18.jpg)
Evaluation
- Crawl, extract, filter web tables
  - 135 tables
  - 85.4% success rate
  - Mostly problems with complex tables
- Compare auto-generated frames with human-generated frames
  - 14 people transformed 3 tables each
  - 21 total tables (each done twice)
  - Syntactic/Semantic correctness (Strict and Soft)
![Page 19: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/19.jpg)
Results
- Inter-annotator agreement
- System-annotator agreement
![Page 20: Transforming Arbitrary Tables into F-Logic Frames with TARTAR](https://reader033.vdocuments.us/reader033/viewer/2022051821/5681623a550346895dd26c68/html5/thumbnails/20.jpg)
Benefits
- Fully automated knowledge formalization
  - Arbitrary tables
  - Independent of domain knowledge
  - Independent of document type
- Explicit semantics of generated frames
- Query answering over heterogeneous tables