Joint Inference in Information Extraction
Hoifung Poon
Dept. Computer Science & Eng.
University of Washington
(Joint work with Pedro Domingos)
Problems of Pipeline Inference
AI systems typically use a pipeline architecture
  Inference is carried out in stages
  E.g., information extraction, natural language processing, speech recognition, vision, robotics
Easy to assemble & low computational cost, but …
  Errors accumulate along the pipeline
  No feedback from later stages to earlier ones
Worse: Often process one object at a time
We Need Joint Inference
S. Minton Integrating heuristics for constraint satisfaction problems: A case study. In AAAI Proceedings, 1993.
Minton, S(1993 b). Integrating heuristics for constraint satisfaction problems: A case study. In: Proceedings AAAI.
[Figure: the two citations above, annotated with Author and Title field labels]
We Need Joint Inference
A topic of keen interest
  Natural language processing [Sutton et al., 2006]
  Statistical relational learning [Getoor & Taskar, 2007]
  Etc.
However …
  Often very complex to set up
  Computational cost can be prohibitive
  Joint inference may hurt accuracy, e.g., [Sutton & McCallum, 2005]
State of the art:
  Fully joint approaches are still rare
  Even partly-joint ones require much engineering
Joint Inference in Information Extraction: State of the Art
Information Extraction (IE): Segmentation + Entity Resolution (ER)
Previous approaches are not fully joint:
  [Pasula et al., 2003]: Citation matching
    Perform segmentation in pre-processing
  [Wellner et al., 2004]: Citation matching
    Only one-time feedback from ER to segmentation
Both Pasula and Wellner require much engineering
Joint Inference in Information Extraction: Our Approach
We develop the first fully joint approach to IE
  Based on Markov logic
  Performs segmentation and entity resolution in a single inference process
Applied to citation matching domains
  Outperforms previous approaches
  Joint inference improves accuracy
  Requires far less engineering effort
Outline
Background: Markov logic
MLNs for joint citation matching
Experiments
Conclusion
Markov Logic
A general language capturing complexity and uncertainty
A Markov Logic Network (MLN) is a set of pairs (F, w) where
  F is a formula in first-order logic
  w is a real number
Together with constants, it defines a Markov network with
  One node for each ground predicate
  One feature for each ground formula F, with the corresponding weight w

$$P(x) = \frac{1}{Z} \exp\left( \sum_i w_i f_i(x) \right)$$
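To make the log-linear distribution an MLN defines, P(x) = (1/Z) exp(Σ_i w_i f_i(x)), more concrete, here is a minimal sketch that enumerates every world of a toy domain by brute force. The two atoms, the single formula "A => B" and its weight 1.5 are assumptions chosen for illustration and do not come from the talk. Enumeration is only feasible for tiny domains, which is why approximate inference is used in practice.

```python
import itertools
import math

# Hypothetical toy domain: two ground atoms, A and B.
ATOMS = ["A", "B"]

# Each ground formula is a (weight, feature) pair; the feature returns 1
# when the formula is satisfied in world x and 0 otherwise.
# Assumed example: the formula "A => B" with weight 1.5.
GROUND_FORMULAS = [
    (1.5, lambda x: 1 if (not x["A"]) or x["B"] else 0),
]

def unnormalized(x):
    """exp(sum_i w_i * f_i(x)) for one world x."""
    return math.exp(sum(w * f(x) for w, f in GROUND_FORMULAS))

def all_worlds():
    """Yield every truth assignment to the ground atoms."""
    for values in itertools.product([False, True], repeat=len(ATOMS)):
        yield dict(zip(ATOMS, values))

def probability(x):
    """P(x) = unnormalized(x) / Z, with Z summed over all worlds."""
    z = sum(unnormalized(world) for world in all_worlds())
    return unnormalized(x) / z

for world in all_worlds():
    print(world, round(probability(world), 4))
```

The only world that violates the formula (A true, B false) gets the lowest probability, but not zero: a finite weight makes the formula a soft constraint.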
Markov Logic
Open-source package: Alchemy (alchemy.cs.washington.edu)
Inference: MC-SAT [Poon & Domingos, 2006]
Weight Learning: Voted perceptron [Singla & Domingos, 2005]
Citation Matching
Extract bibliographic records from citation lists
Merge co-referent records
Standard application of information extraction
Here: Extract authors, titles, venues
Types and Predicates
Types:
  token = {Minton, Hoifung, and, Pedro, ...}
  field = {Author, Title, Venue}
  citation = {C1, C2, ...}
  position = {0, 1, 2, ...}
Input predicates:
  Token(token, position, citation)
  HasPunc(citation, position)
  ……
Output predicates:
  InField(position, field, citation)
  SameCitation(citation, citation)
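As a rough illustration of how the input predicates could be grounded from raw text, the sketch below turns one citation string into Token and HasPunc atoms. The tokenization (whitespace splitting, lowercasing) and the rule that a position "has punctuation" when its raw token ends in a punctuation character are assumptions made here; the talk does not specify them.

```python
import string

def evidence_atoms(citation_id, text):
    """Return Token and HasPunc ground atoms for one citation.

    Assumption: whitespace tokenization; a position "has punctuation"
    when its raw token ends with a punctuation character.
    """
    atoms = []
    for position, raw in enumerate(text.split()):
        token = raw.strip(string.punctuation).lower()
        if token:
            atoms.append(("Token", token, position, citation_id))
        if raw[-1] in string.punctuation:
            atoms.append(("HasPunc", citation_id, position))
    return atoms

for atom in evidence_atoms("C1", "S. Minton Integrating heuristics"):
    print(atom)
```

InField and SameCitation atoms are not produced here because they are the query predicates that inference is meant to fill in.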
MLNs for Citation Matching
Isolated segmentation
Entity resolution
Joint segmentation: Jnt-Seg
Joint segmentation: Jnt-Seg-ER
Isolated Segmentation
Essentially a hidden Markov model (HMM)
  Token(+t, p, c) ⇒ InField(p, +f, c)
  InField(p, +f, c) ⇒ InField(p+1, +f, c)
Add boundary detection:
  InField(p, +f, c) ∧ ¬HasPunc(c, p) ⇒ InField(p+1, +f, c)
Other rules: Start with author, comma, etc.
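The HMM-like behaviour of these rules can be sketched with a small Viterbi-style decoder: one weight per (token, field) pair plays the role of the token rule, and one weight per field rewards staying in the same field unless punctuation intervenes. This is a simplification under stated assumptions (exactly one field per position, hypothetical weights, no other rules), not the MLN from the talk, which is learned and run with Alchemy.

```python
FIELDS = ["Author", "Title", "Venue"]

def segment(tokens, has_punc, emit_w, stay_w):
    """Highest-scoring field per position under the simplified scoring.

    tokens:   list of token strings
    has_punc: set of positions that carry punctuation
    emit_w:   dict (token, field) -> weight, standing in for the token rule
    stay_w:   dict field -> weight, standing in for the transition rule
    """
    def emit(p, f):
        return emit_w.get((tokens[p], f), 0.0)

    # best[f] = (score, labels) of the best labeling of 0..p that ends in f
    best = {f: (emit(0, f), [f]) for f in FIELDS}
    for p in range(1, len(tokens)):
        new_best = {}
        for f in FIELDS:
            candidates = []
            for g in FIELDS:
                score, labels = best[g]
                # Reward staying in a field only when no punctuation intervenes.
                if g == f and (p - 1) not in has_punc:
                    score += stay_w.get(f, 0.0)
                candidates.append((score + emit(p, f), labels + [f]))
            new_best[f] = max(candidates, key=lambda c: c[0])
        best = new_best
    return max(best.values(), key=lambda c: c[0])[1]

# Hypothetical weights, for illustration only.
tokens = ["minton", "integrating", "heuristics"]
emit_w = {("minton", "Author"): 2.0, ("integrating", "Title"): 1.0}
stay_w = {f: 1.5 for f in FIELDS}

print(segment(tokens, {0}, emit_w, stay_w))    # punctuation after "minton"
print(segment(tokens, set(), emit_w, stay_w))  # no punctuation at all
```

With these toy weights, the punctuation after position 0 lets the decoder switch to Title; without it, the stay weight keeps all three tokens in Author, which is the kind of error boundary detection is meant to prevent.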
Entity Resolution
Obvious starting point: [Singla & Domingos, 2006]
  MLN for citation entity resolution
  Assume pre-segmented citations
However …
  Poor results from merging this and segmentation
  Entity resolution misleads segmentation
  InField(p, f, c) ∧ Token(t, p, c) ∧ InField(p', f, c') ∧ Token(t, p', c') ⇒ SameCitation(c, c')
Example of how joint inference can hurt
Entity Resolution
Devise specific predicate and rule to pass information between stages
SimilarTitle(citation, position, position, citation, position, position)
  Same beginning trigram and ending unigram
  No punctuation inside
  Meet certain boundary conditions
SimilarTitle(c, p1, p2, c', p1', p2') ∧ SimilarVenue(c, c') ⇒ SameCitation(c, c')
Suffices to outperform the state of the art
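A minimal sketch of the first two SimilarTitle conditions (same beginning trigram and ending unigram, no punctuation inside) might look as follows. The function name, the argument layout and the representation of punctuation as a set of positions are assumptions; the boundary conditions are not spelled out on the slide and are therefore left out.

```python
def similar_title(tokens_c, punc_c, p1, p2, tokens_d, punc_d, q1, q2):
    """Check two SimilarTitle conditions for spans [p1, p2] and [q1, q2].

    tokens_c, tokens_d: normalized token lists of the two citations
    punc_c, punc_d:     sets of positions that carry punctuation
    The unspecified boundary conditions are not checked.
    """
    span_c = tokens_c[p1:p2 + 1]
    span_d = tokens_d[q1:q2 + 1]
    if len(span_c) < 3 or len(span_d) < 3:
        return False
    same_start = span_c[:3] == span_d[:3]   # same beginning trigram
    same_end = span_c[-1] == span_d[-1]     # same ending unigram
    # "Inside" is taken to mean every position except the last one.
    no_punc_c = not any(p in punc_c for p in range(p1, p2))
    no_punc_d = not any(q in punc_d for q in range(q1, q2))
    return same_start and same_end and no_punc_c and no_punc_d

c = ["minton", "integrating", "heuristics", "for", "csp", "aaai"]
d = ["integrating", "heuristics", "for", "hard", "csp", "1993"]
print(similar_title(c, {0, 4}, 1, 4, d, {4}, 0, 4))
```

Because only the ends of the spans are compared, two candidate titles can match even when their middles differ, which makes the predicate cheap and tolerant of noise.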
Joint Segmentation: Jnt-Seg
Leverage well-delimited citations to help more ambiguous ones
JointInferenceCandidate(citation, position, citation)
S. Minton Integrating heuristics for constraint satisfaction problems: A case study. In AAAI Proceedings, 1993.
Minton, S(1993 b). Integrating heuristics for constraint satisfaction problems: A case study. In: Proceedings AAAI.
Joint Segmentation: Jnt-Seg
Augment the transition rule:
  InField(p, +f, c) ∧ ¬HasPunc(c, p) ∧ ¬(∃c' JointInferenceCandidate(c, p, c')) ⇒ InField(p+1, +f, c)
Introduces virtual boundary based on similar citations
Joint Segmentation: Jnt-Seg-ER
Jnt-Seg could introduce wrong boundary in “dense” citation lists
Entity resolution can help:
  Only consider co-referent pairs
  InField(p, +f, c) ∧ ¬HasPunc(c, p) ∧ ¬(∃c' JointInferenceCandidate(c, p, c') ∧ SameCitation(c, c')) ⇒ InField(p+1, +f, c)
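The difference between the Jnt-Seg and Jnt-Seg-ER transition rules can be sketched as a check on whether the rule's antecedent fails at a position, that is, whether a real or virtual boundary is present. The function name and the set-based encoding of ground atoms are assumptions. SameCitation is treated here as a given set purely for illustration, whereas in the joint model it is inferred together with the segmentation.

```python
def blocks_transition(c, p, has_punc, candidates, same_citation=None):
    """Return True if the transition rule's antecedent fails at (c, p).

    has_punc:      set of (citation, position) atoms
    candidates:    set of (c, p, c2) JointInferenceCandidate atoms
    same_citation: optional set of frozenset({c, c2}) pairs; when given,
                   only co-referent candidates count (Jnt-Seg-ER variant)
    """
    if (c, p) in has_punc:
        return True  # real boundary from punctuation
    for (c1, p1, c2) in candidates:
        if c1 == c and p1 == p:
            if same_citation is None or frozenset((c, c2)) in same_citation:
                return True  # virtual boundary from a similar citation
    return False

has_punc = {("C1", 0)}
candidates = {("C1", 2, "C2"), ("C1", 5, "C3")}
coreferent = {frozenset(("C1", "C2"))}

print(blocks_transition("C1", 5, has_punc, candidates))              # Jnt-Seg
print(blocks_transition("C1", 5, has_punc, candidates, coreferent))  # Jnt-Seg-ER
```

At position 5 the candidate comes from C3, which is not co-referent with C1 in this toy setup, so the Jnt-Seg variant introduces a virtual boundary while the Jnt-Seg-ER variant does not.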
MLNs for Joint Citation Matching
Jnt-Seg + ER or Jnt-Seg-ER + ER
18 predicates, 22 rules
State-of-the-art system in citation matching:
  Best entity resolution results to date
  Both MLNs outperform isolated segmentation
Datasets
CiteSeer
  1563 citations, 906 clusters
  Four-fold cross-validation
  Largest fold: 0.3 million atoms, 0.4 million clauses
Cora
  1295 citations, 134 clusters
  Three-fold cross-validation
  Largest fold: 0.26 million atoms, 0.38 million clauses
Pre-Processing
Standard normalizations: Case, stemming, etc.
Token matching:
  Two tokens match if they differ by at most one character
  A bigram matches the unigram formed by concatenating its tokens
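The two token-matching criteria can be sketched as follows. Reading "differ by at most one character" as one insertion, deletion or substitution is an assumption made here; the slide does not say whether insertions and deletions count. The function names are hypothetical.

```python
def within_one_edit(a, b):
    """True if a and b differ by at most one insertion, deletion or substitution."""
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) > len(b):
        a, b = b, a  # ensure len(a) <= len(b)
    # Skip the common prefix.
    i = 0
    while i < len(a) and a[i] == b[i]:
        i += 1
    if len(a) == len(b):
        return a[i + 1:] == b[i + 1:]  # at most one substitution
    return a[i:] == b[i + 1:]          # one extra character in b

def bigram_matches_unigram(first, second, unigram):
    """True if concatenating the bigram's tokens yields the unigram."""
    return first + second == unigram

print(within_one_edit("proceedings", "proceeding"))
print(bigram_matches_unigram("data", "base", "database"))
```

Both checks run in time linear in the token length, which matters when every pair of tokens across many citations has to be compared.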
Entity Resolution: CiteSeer (Cluster Recall)
[Bar chart: cluster recall (y-axis 90 to 98); x-axis: Constr., Face, Reason., Reinfor.; series: Pasula, Wellner, Jnt MLN]
Entity Resolution: Cora
| Approach | Pairwise Recall | Pairwise Precision | Cluster Recall |
|---|---|---|---|
| Fellegi-Sunter | 78.0 | 97.7 | 62.7 |
| Joint MLN | 94.3 | 97.0 | 78.1 |
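For readers unfamiliar with these entity resolution metrics, the sketch below computes pairwise recall, pairwise precision and cluster recall from a predicted and a true clustering. It follows the usual definitions as understood here (pairs of co-referent citations; fraction of true clusters reproduced exactly), which is an assumption, since the talk does not define them. The toy clusters are made up and unrelated to the reported numbers.

```python
from itertools import combinations

def pairs(clusters):
    """All unordered co-referent pairs implied by a clustering."""
    return {
        frozenset(pair)
        for cluster in clusters
        for pair in combinations(sorted(cluster), 2)
    }

def pairwise_scores(predicted, truth):
    """Return (recall, precision) over co-referent pairs."""
    pred_pairs, true_pairs = pairs(predicted), pairs(truth)
    correct = len(pred_pairs & true_pairs)
    recall = correct / len(true_pairs) if true_pairs else 1.0
    precision = correct / len(pred_pairs) if pred_pairs else 1.0
    return recall, precision

def cluster_recall(predicted, truth):
    """Fraction of true clusters reproduced exactly."""
    predicted_sets = {frozenset(c) for c in predicted}
    return sum(frozenset(c) in predicted_sets for c in truth) / len(truth)

truth = [{"a", "b", "c"}, {"d"}]
predicted = [{"a", "b"}, {"c"}, {"d"}]
print(pairwise_scores(predicted, truth))
print(cluster_recall(predicted, truth))
```

Cluster recall is the stricter measure: a single missing member makes the whole cluster count as wrong, which is consistent with it being the lowest column in the table.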
Segmentation: CiteSeer (F1)
[Bar chart, Potential Records: F1 (y-axis 82 to 96); x-axis: Author, Title, Venue; series: Isolated, Jnt-Seg, Jnt-Seg-ER]
Segmentation: CiteSeer (F1)
[Bar chart, All Records: F1 (y-axis 91.5 to 95.5); x-axis: Author, Title, Venue; series: Isolated, Jnt-Seg, Jnt-Seg-ER]
Segmentation: Cora (F1)
[Bar chart, Potential Records: F1 (y-axis 93 to 100); x-axis: Author, Title, Venue; series: Isolated, Jnt-Seg, Jnt-Seg-ER]
Segmentation: Cora (F1)
[Bar chart, All Records: F1 (y-axis 96 to 100); x-axis: Author, Title, Venue; series: Isolated, Jnt-Seg, Jnt-Seg-ER]
Segmentation: Error Reduction by Joint Inference
| Cora | Author | Title | Venue |
|---|---|---|---|
| All | 24% | 7% | 6% |
| Potential | 67% | 56% | 17% |

| CiteSeer | Author | Title | Venue |
|---|---|---|---|
| All | 6% | 3% | 0% |
| Potential | 35% | 44% | 8% |
Conclusion
Joint inference attracts great interest today
Yet very difficult to perform effectively
We propose the first fully joint approach to information extraction, using Markov logic
Successfully applied to citation matching
  Best results to date on CiteSeer and Cora
  Joint inference consistently improves accuracy
  Use Markov logic & its general efficient algorithms
  Far less engineering than Pasula and Wellner
Future Work
Learn features and rules for joint inference (e.g., statistical predicate invention, structural learning)
General principles for joint inference
Apply joint inference to other NLP tasks