Page 1
Learning to Construct and Reason with a Large KB of Extracted Information

William W. Cohen
Machine Learning Dept and Language Technology Dept

joint work with: Tom Mitchell, Ni Lao, William Wang, Kathryn Rivard Mazaitis, Richard Wang, Frank Lin, Estevam Hruschka, Jr., Burr Settles, Partha Talukdar, Derry Wijaya, Edith Law, Justin Betteridge, Jayant Krishnamurthy, Bryan Kisiel, Andrew Carlson, Weam Abu Zaki, Bhavana Dalvi, Malcolm Greaves, Lise Getoor, Jay Pujara, Hui Miao, …
Page 2
Outline
• Background: information extraction and NELL
• Key ideas in NELL
  – Coupled learning
  – Multi-view, multi-strategy learning
• Inference in NELL
  – Inference as another learning strategy
    • Learning in graphs
    • Path Ranking Algorithm
    • ProPPR
  – Promotion as inference
• Conclusions & summary
Page 3
But first….some backstory
Page 4
..and an unrelated project…
Page 5
..called SimStudent…
Page 6
SimStudent learns rules for solving a problem step-by-step, and can then guide a student through solving problems step-by-step
Page 7
Quinlan’s FOIL
Page 8
Summary of SimStudent
• It is possible for a human author (e.g., a middle-school teacher) to build an ITS system
  – by building a GUI, then demonstrating problem solving and having the system learn how from the examples
• The rules learned by SimStudent can be used to construct a “student model”
  – with parameter tuning this can predict how well individual students will learn
  – better than the state of the art in some cases!
• AI problem solving with a cognitively predictive model … and ILP is a key component!
Page 9
Information Extraction
• Goal: extract facts about the world automatically by reading text
• IE systems are usually based on learning how to recognize facts in text
  – … and then (sometimes) aggregating the results
• Latest-generation IE systems need not require large amounts of training data
• … and IE does not necessarily require subtle analysis of any particular piece of text
Page 10
Never-Ending Language Learning (NELL)
• NELL is a broad-coverage IE system
  – Simultaneously learning 500–600 concepts and relations (person, celebrity, emotion, acquiredBy, locatedIn, capitalCityOf, …)
  – Starting point: containment/disjointness relations between concepts, types for relations, and O(10) examples per concept/relation
  – Uses a 500M-web-page corpus + live queries
  – Running (almost) continuously for over three years
  – Has learned over 50M beliefs, over 1M of them high-confidence
    • about 85% of high-confidence beliefs are correct
Page 12
NELL Screenshots
Page 13
NELL Screenshots
Page 14
NELL Screenshots
Page 15
More examples of what NELL knows
Page 16

Page 17
Outline
• Background: information extraction and NELL
• Key ideas in NELL
  – Coupled learning
  – Multi-view, multi-strategy learning
• Inference in NELL
  – Inference as another learning strategy
    • Learning in graphs
    • Path Ranking Algorithm
    • ProPPR
  – Promotion as inference
• Conclusions & summary
Page 18
Bootstrapped SSL learning of lexical patterns

Task: extract cities. Given four seed examples of the class “city” (Paris, Pittsburgh, Seattle, Cupertino), bootstrapping alternates between finding patterns that match known cities (“mayor of arg1”, “live in arg1”, “arg1 is home of”, “traits such as arg1”) and promoting new instances matched by those patterns (San Francisco, Austin, Berlin, … but also denial, anxiety, selfishness). The drift to non-cities shows the problem: it’s underconstrained!!
Page 19
Example: “Krzyzewski coaches the Blue Devils.”

Classifying a single noun phrase NP in isolation, e.g. coach(NP), is a hard (underconstrained) semi-supervised learning problem. Jointly learning many coupled categories (person, coach, athlete, team, sport) and relations (coachesTeam(c,t), playsForTeam(a,t), playsSport(a,s), teamPlaysSport(t,s)) over noun-phrase pairs (NP1, NP2) is a much easier (more constrained) semi-supervised learning problem.

One Key to Accurate Semi-Supervised Learning
1. It is easier to learn many interrelated tasks than one isolated task
2. It is also easier to learn using many different types of information
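The coupling idea can be illustrated with a toy promotion rule (all names and scores below are hypothetical; NELL’s actual coupler is far richer): a candidate instance is promoted for a category only when no mutually exclusive category fits it better.

```python
def coupled_promote(candidates, scores, mutex):
    """Toy coupled-promotion rule: keep a candidate instance for a
    category only if no mutually exclusive category scores it at least
    as high.  `candidates` maps category -> proposed instances, `scores`
    maps (instance, category) -> classifier confidence, and `mutex`
    maps category -> mutually exclusive categories."""
    promoted = {}
    for cat, items in candidates.items():
        promoted[cat] = [
            x for x in items
            if all(scores.get((x, other), 0.0) < scores.get((x, cat), 0.0)
                   for other in mutex.get(cat, []))
        ]
    return promoted
```

With the semantic-drift example above, a high-confidence “emotion” score blocks “anxiety” from being promoted as a “city”, even though a pattern proposed it.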
Page 20
Outline
• Background: information extraction and NELL
• Key ideas in NELL
  – Coupled learning
  – Multi-view, multi-strategy learning
• Inference in NELL
  – Inference as another learning strategy
    • Learning in graphs
    • Path Ranking Algorithm
    • ProPPR
  – Promotion as inference
• Conclusions & summary
Page 21
Another key idea: use multiple types of information

[Architecture diagram: an ontology and populated KB at the center, fed from the Web by several learned components: CBL (text extraction patterns), SEAL (HTML extraction patterns), PRA (learned inference rules), and Morph (a morphology-based extractor), with their proposals combined by evidence integration.]
Page 22
Outline
• Background: information extraction and NELL
• Key ideas in NELL
  – Coupled learning
  – Multi-view, multi-strategy learning
• Inference in NELL
  – Inference as another learning strategy
    • Background: Learning in graphs
    • Path Ranking Algorithm
    • ProPPR
  – Promotion as inference
• Conclusions & summary
Page 23
Background: Personal Info Management as Similarity Queries on a Graph
[SIGIR 2006, EMNLP 2008, TOIS 2010]
Einat Minkov, Univ. Haifa

[Example graph: email, person, term, and date nodes (William, proposal, CMU, NSF, graph, 6/17/07, 6/18/07) connected by edges such as Sent-To and Term-In-Subject.]
Learning about graph similarity
• Personalized PageRank, aka Random Walk with Restart:
  – a similarity measure for nodes in a graph, analogous to TF-IDF for text in a WHIRL database
  – a natural extension of PageRank
  – amenable to learning the parameters of the walk (gradient search, with various optimization metrics): Toutanova, Manning & Ng, ICML 2004; Nie et al., WWW 2005; Xi et al., SIGIR 2005
  – or: reranking, etc.
  – queries: Given type t* and node x, find y: T(y)=t* and y~x; given type t* and node set X, find y: T(y)=t* and y~X
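The Random Walk with Restart similarity above can be sketched in a few lines of Python: a toy power-iteration implementation over a small adjacency dict (the graph format and parameter values are illustrative, not from the slides).

```python
def personalized_pagerank(graph, seeds, alpha=0.15, iters=60):
    """Random Walk with Restart: with probability alpha, restart at the
    seed set X; otherwise follow a random out-edge.  The resulting
    distribution scores every node y by its similarity to X.
    graph: dict node -> list of out-neighbors (toy adjacency)."""
    restart = {s: 1.0 / len(seeds) for s in seeds}
    p = dict(restart)
    for _ in range(iters):
        # restart mass goes back to the seeds...
        nxt = {n: alpha * w for n, w in restart.items()}
        # ...and the rest of the walk mass spreads over out-edges
        for u, mass in p.items():
            nbrs = graph.get(u, [])
            for v in nbrs:
                nxt[v] = nxt.get(v, 0.0) + (1 - alpha) * mass / len(nbrs)
        p = nxt
    return p  # p[y] ~ similarity of node y to the seed set
```

Answering the queries above then amounts to ranking nodes y with T(y)=t* by p[y].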
Page 25
Many tasks can be reduced to similarity queries

• Person name disambiguation: query [term “andy”, file msgId], target type “person”
• Threading: what are the adjacent messages in this thread? (a proxy for finding “more messages like this one”): query [file msgId], target type “file”
• Alias finding: what are the email-addresses of Jason? Query [term Jason], target type “email-address”
• Meeting attendees finder: which email-addresses (persons) should I notify about this meeting? Query [meeting mtgId], target type “email-address”
Page 26
Learning about graph similarity: the next generation
• Personalized PageRank aka Random Walk with Restart:
  – Given type t* and node set X, find y: T(y)=t* and y~X
• Ni Lao’s thesis (2012): new, better learning methods
  – richer parameterization
  – faster PPR inference
  – structure learning
• Other tasks:
  – relation-finding in parsed text
  – information management for biologists
  – inference in large noisy knowledge bases
Page 27
Recommending papers to cite in a paper being prepared

Lao: a learned random-walk strategy is a weighted set of random-walk “experts”, each of which is a walk constrained by a path (i.e., a sequence of relations). Example experts:
• papers co-cited with on-topic papers
• approx. standard IR retrieval
• papers cited during the past two years
• papers published during the past two years
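A minimal sketch of this idea in Python (the toy KB format and helper names are my own, not from PRA’s implementation): each path “expert” contributes the probability that its constrained walk reaches the candidate, and the score is the weighted sum.

```python
def path_walk_prob(start, path, kb):
    """Distribution over end nodes of a random walk from `start` that is
    constrained to follow the relation sequence `path`.
    kb: dict relation -> dict node -> list of neighbors (toy format)."""
    dist = {start: 1.0}
    for rel in path:
        nxt = {}
        for node, prob in dist.items():
            nbrs = kb.get(rel, {}).get(node, [])
            for nb in nbrs:
                # split this node's probability uniformly over its edges
                nxt[nb] = nxt.get(nb, 0.0) + prob / len(nbrs)
        dist = nxt
    return dist

def pra_score(query, candidate, experts, kb):
    """PRA-style score: weighted sum over path 'experts'."""
    return sum(w * path_walk_prob(query, path, kb).get(candidate, 0.0)
               for path, w in experts)
```

For example, the expert (AthletePlaysForTeam, TeamPlaysInLeague) routes a walk from an athlete through his team to a league.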
Page 28
Another study: learning inference rules for a noisy KB (Lao, Cohen, Mitchell 2011; Lao et al., 2012)

[Example: to answer AthletePlaysInLeague(HinesWard, ?), one learned path follows AthletePlaysForTeam(HinesWard, Steelers) and then TeamPlaysInLeague(Steelers, NFL); other learned paths use IsA, isa-1, and PlaysIn edges, e.g. walking through synonyms of the query team, or through “American”.]

The random-walk interpretation is crucial: i.e., 10–15 extra points in MRR.
Page 29
Another key idea: use multiple types of information

[Architecture diagram: an ontology and populated KB at the center, fed from the Web by several learned components: CBL (text extraction patterns), SEAL (HTML extraction patterns), PRA (learned inference rules), and Morph (a morphology-based extractor), with their proposals combined by evidence integration.]
Page 30
Outline
• Background: information extraction and NELL
• Key ideas in NELL
• Inference in NELL
  – Inference as another learning strategy
    • Background: Learning in graphs
    • Path Ranking Algorithm
    • PRA + FOL: ProPPR and joint learning for inference
  – Promotion as inference
• Conclusions & summary
Page 31
How can you extend PRA to…
• non-binary predicates?
• paths that include constants?
• recursive rules?
• …?

Current direction: using ideas from PRA in a general first-order logic: ProPPR
Page 32
A limitation
• Paths are learned separately for each relation type, and one learned rule can’t call another
• PRA can learn this…

athletePlaySportViaRule(Athlete,Sport) :- onTeamViaKB(Athlete,Team), teamPlaysSportViaKB(Team,Sport).
teamPlaysSportViaRule(Team,Sport) :- memberOfViaKB(Team,Conference), hasMemberViaKB(Conference,Team2), playsViaKB(Team2,Sport).
teamPlaysSportViaRule(Team,Sport) :- onTeamViaKB(Athlete,Team), athletePlaysSportViaKB(Athlete,Sport).
Page 33
A limitation
• Paths are learned separately for each relation type, and one learned rule can’t call another
• But PRA can’t learn this…

athletePlaySport(Athlete,Sport) :- onTeam(Athlete,Team), teamPlaysSport(Team,Sport).
athletePlaySport(Athlete,Sport) :- athletePlaySportViaKB(Athlete,Sport).
teamPlaysSport(Team,Sport) :- memberOf(Team,Conference), hasMember(Conference,Team2), plays(Team2,Sport).
teamPlaysSport(Team,Sport) :- onTeam(Athlete,Team), athletePlaysSport(Athlete,Sport).
teamPlaysSport(Team,Sport) :- teamPlaysSportViaKB(Team,Sport).
Page 34
Solution: a major extension of PRA to include a large subset of Prolog

athletePlaySport(Athlete,Sport) :- onTeam(Athlete,Team), teamPlaysSport(Team,Sport).
athletePlaySport(Athlete,Sport) :- athletePlaySportViaKB(Athlete,Sport).
teamPlaysSport(Team,Sport) :- memberOf(Team,Conference), hasMember(Conference,Team2), plays(Team2,Sport).
teamPlaysSport(Team,Sport) :- onTeam(Athlete,Team), athletePlaysSport(Athlete,Sport).
teamPlaysSport(Team,Sport) :- teamPlaysSportViaKB(Team,Sport).
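To see what the mutual recursion buys, here is a toy forward-chaining sketch in Python of two of the rules above (ProPPR itself scores proofs with random walks rather than chaining to a fixpoint; the constants and helper names are illustrative):

```python
def forward_chain(rules, facts, max_rounds=10):
    """Toy forward chaining: repeatedly apply rules to derive new ground
    facts until nothing new appears.  Each rule is a function from the
    current fact set to a set of derivable facts."""
    kb = set(facts)
    for _ in range(max_rounds):
        new = set()
        for rule in rules:
            new |= rule(kb) - kb
        if not new:
            break
        kb |= new
    return kb

def r_athlete_sport(kb):
    # athletePlaySport(A,S) :- onTeam(A,T), teamPlaysSport(T,S)
    return {("athletePlaysSport", a, s)
            for (p1, a, t) in kb if p1 == "onTeam"
            for (p2, t2, s) in kb if p2 == "teamPlaysSport" and t2 == t}

def r_team_sport(kb):
    # teamPlaysSport(T,S) :- onTeam(A,T), athletePlaysSport(A,S)
    return {("teamPlaysSport", t, s)
            for (p1, a, t) in kb if p1 == "onTeam"
            for (p2, a2, s) in kb if p2 == "athletePlaysSport" and a2 == a}
```

One athlete’s known sport propagates to his team, and from the team to his teammates: exactly the kind of inference where one learned rule must call another.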
Page 35
Sample ProPPR program….

[Figure: Horn rules, each annotated with features of the rule (variables from the head are OK).]
Page 36
.. and search space…
D’oh! This is a graph!
Page 37
• The score for a query solution (e.g., “Z=sport” for “about(a,Z)”) depends on the probability of reaching a ☐ node*
• learn the transition probabilities based on features of the rules
• implicit “reset” transitions with probability p ≥ α back to the query node
• the effect: we are looking for answers supported by many short proofs

“Grounding” size is O(1/αε), i.e., independent of DB size: fast approximate incremental inference (Andersen, Chung & Lang, 2008)

Learning: a supervised variant of personalized PageRank (Backstrom & Leskovec, 2011)

*Exactly as in Stochastic Logic Programs [Cussens, 2001]
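The O(1/αε) grounding bound comes from approximating personalized PageRank with a local “push” procedure; here is a toy sketch in the general style of Andersen, Chung & Lang (the data format and names are mine, not ProPPR’s actual code):

```python
def approx_ppr_push(graph, seed, alpha=0.15, eps=1e-4):
    """Approximate personalized PageRank via 'push': keep an estimate p
    and a residual r, and push residual mass only from nodes whose
    residual exceeds eps per out-edge.  The total number of pushes is
    O(1/(alpha * eps)), independent of the size of the graph, which is
    what bounds the grounding.
    graph: dict node -> list of out-neighbors (toy format)."""
    p, r = {}, {seed: 1.0}
    queue = [seed]
    while queue:
        u = queue.pop()
        deg = len(graph.get(u, [])) or 1
        if r.get(u, 0.0) < eps * deg:
            continue  # stale queue entry; nothing left to push here
        ru = r.pop(u)
        p[u] = p.get(u, 0.0) + alpha * ru       # keep the alpha fraction
        for v in graph.get(u, []):
            r[v] = r.get(v, 0.0) + (1 - alpha) * ru / deg
            queue.append(v)                      # v may now need a push
    return p  # underestimates true PPR by at most eps per edge
```

Only the nodes actually touched by pushes ever enter the grounding, so a query against a huge KB still builds a small proof graph.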
Page 38
Sample Task: Citation Matching
• Task: citation matching (Alchemy: Poon & Domingos)
• Dataset: CORA, 1295 citations of 132 distinct papers
  – Training set: sections 1–4
  – Test set: section 5
• ProPPR program:
  – translated from the corresponding Markov logic network (dropping non-Horn clauses)
  – number of rules: 21
Page 39
Task: Citation Matching
Page 40
Time: Citation Matching vs. Alchemy
“Grounding” is independent of DB size
Page 41
Accuracy: Citation Matching
AUC scores: 0.0 = low, 1.0 = high; w=1 is before learning
UW rules
Our rules
Page 42
It gets better…
• Learning uses many example queries
  – e.g.: sameCitation(c120,X) with X=c123+, X=c124-, …
• Each query is grounded to a separate small graph (for its proofs)
• The goal is to tune the weights on these edge features to optimize RWR on the query-graphs
• We can do SGD and run RWR separately on each query-graph
  – the graphs do share edge features, so some synchronization is needed
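A schematic of this parallelization (all names are hypothetical stand-ins; the real gradient comes from running RWR on each grounded query-graph): workers process disjoint query-graphs and synchronize only on the shared feature-weight vector.

```python
import threading
from collections import defaultdict

# Sketch: each training query grounds to its own small proof graph, so
# workers can run SGD on disjoint query-graphs in parallel, sharing only
# the edge-feature weight vector w.
w = defaultdict(float)
w_lock = threading.Lock()

def grad_on_query_graph(query_graph):
    # Placeholder: in ProPPR this gradient comes from running RWR on the
    # grounded graph and differentiating a ranking loss over its answers.
    return query_graph["toy_grad"]

def worker(queries, lr=0.1):
    for q in queries:
        g = grad_on_query_graph(q)
        with w_lock:  # the only synchronization point: shared features
            for feat, gval in g.items():
                w[feat] -= lr * gval

shards = [[{"toy_grad": {"f1": 1.0}}],
          [{"toy_grad": {"f1": 1.0, "f2": -2.0}}]]
threads = [threading.Thread(target=worker, args=(s,)) for s in shards]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because each query-graph is small and independent, the only contention is on the shared weights; lock-free (Hogwild-style) updates are a common variant of this design.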
Page 43
Learning can be parallelized by splitting on the separate “groundings” of each query
Page 44
Back to NELL……

[Architecture diagram, as before: an ontology and populated KB at the center, fed from the Web by CBL (text extraction patterns), SEAL (HTML extraction patterns), PRA (learned inference rules), and Morph (a morphology-based extractor), with their proposals combined by evidence integration.]
Page 45
Experiment:
• Take the top K paths for each predicate learned by Lao’s PRA
  – (I don’t know how to do structure learning for ProPPR yet)
• Convert them to a mutually recursive ProPPR program
• Train weights on the entire program

athletePlaySport(Athlete,Sport) :- onTeam(Athlete,Team), teamPlaysSport(Team,Sport).
athletePlaySport(Athlete,Sport) :- athletePlaySportViaKB(Athlete,Sport).
teamPlaysSport(Team,Sport) :- memberOf(Team,Conference), hasMember(Conference,Team2), plays(Team2,Sport).
teamPlaysSport(Team,Sport) :- onTeam(Athlete,Team), athletePlaysSport(Athlete,Sport).
teamPlaysSport(Team,Sport) :- teamPlaysSportViaKB(Team,Sport).
Page 46
More details
• Train on NELL’s KB as of iteration 713
• Test on new facts from later iterations
• Try three “subdomains” of NELL:
  – pick a seed entity S
  – pick the top M entities in a (simple, untyped) RWR from S
  – project the KB to just these M entities
  – look at three subdomains and six values of M
Page 47

Page 48
Outline
• Background: information extraction and NELL
• Key ideas in NELL
  – Coupled learning
  – Multi-view, multi-strategy learning
• Inference in NELL
  – Inference as another learning strategy
    • Learning in graphs
    • Path Ranking Algorithm
    • ProPPR
  – Promotion as inference
• Conclusions & summary
Page 49
More detail on NELL
• For iteration i = 1, …, 715, …:
  – For each view (lexical patterns, …, PRA):
    • Distantly train for that view using KB_i
    • Propose new “candidate beliefs” based on the learned view-specific classifier
  – Heuristically find the “best” candidate beliefs and “promote” them into KB_i+1

It is not obvious how to promote in a principled way…
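The loop above can be sketched as follows (all class and function names are hypothetical stand-ins for NELL’s actual components):

```python
class PatternView:
    """Stand-in for one NELL view (e.g. lexical patterns, SEAL, PRA)."""
    def __init__(self, proposals):
        self.proposals = proposals

    def train(self, kb):
        # Distant supervision: a real view fits a classifier using KB_i.
        return self

    def propose(self):
        # A real view proposes candidate beliefs scored by its classifier.
        return self.proposals

def nell_iteration(kb, views, promote):
    """One NELL iteration (schematic): each view is distantly trained
    from KB_i and proposes candidate beliefs; a heuristic promotion step
    selects the 'best' candidates for KB_{i+1}."""
    candidates = []
    for view in views:
        model = view.train(kb)
        candidates.extend(model.propose())
    return kb | promote(candidates)
```

The hard part, as the slide notes, is the `promote` step: choosing it heuristically works, but a principled version requires joint reasoning over all the candidates.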
Page 50
Promotion: identifying new correct extractions from a pool of noisy extractions
• Many types of noise are possible:
  – co-referent entities
  – missing or spurious labels
  – missing or spurious relations
  – violations of the ontology (e.g., an athlete that is not a person)
• Identifying true extractions requires joint reasoning, e.g.:
  – pooling information about co-referent entities
  – enforcing mutual exclusion of labels and relations

Problem: how can we integrate extractions from multiple sources, in the presence of ontological constraints, at the scale of millions of extractions?
Page 51
An example

Ontology:
  Dom(hasCapital, country)
  Mut(country, bird)
Sample extractions:
  Lbl(Kyrgyzstan, bird)
  Lbl(Kyrgyzstan, country)
  Lbl(Kyrgyz Republic, country)
  Rel(Kyrgyz Republic, Bishkek, hasCapital)
Entity resolution:
  SameEnt(Kyrgyz Republic, Kyrgyzstan)

A knowledge graph view of NELL’s extractions: nodes for Kyrgyzstan, Kyrgyz Republic, Bishkek, country, and bird, connected by Lbl, Rel(hasCapital), and SameEnt edges, with Dom and Mut constraints from the ontology.

What you want: Kyrgyzstan and the Kyrgyz Republic resolved to one entity, labeled country (not bird), with a Rel(hasCapital) edge to Bishkek.
Page 52
Representation as a noisy knowledge graph: the extracted Lbl, Rel(hasCapital), and SameEnt edges over Kyrgyzstan, Kyrgyz Republic, Bishkek, country, and bird, plus the Dom and Mut ontology constraints.

After knowledge graph identification: one resolved entity (Kyrgyzstan / Kyrgyz Republic), labeled country, with Rel(hasCapital) to Bishkek.

Knowledge graph identification: Lise Getoor, Jay Pujara, and Hui Miao @ UMD
Page 53
Graph Identification as Joint Reasoning: Probabilistic Soft Logic (PSL)
• A templating language for hinge-loss MRFs; much more scalable!
• A model is specified as a collection of logical formulas
  – formulas are ground by substituting literal values
  – truth values of atoms are relaxed to the [0,1] interval
  – truth values of formulas are derived from the Lukasiewicz t-norm
• Each ground rule r has a weighted potential φ_r corresponding to a distance to satisfaction
• PSL defines a probability distribution over atom truth-value assignments I:

  P(I) = (1/Z) exp( − Σ_{r ∈ R} w_r φ_r(I) )

• Most-probable-explanation (MPE) inference is convex
• Running time scales linearly with the number of grounded rules (|R|)
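The Lukasiewicz relaxation and the distance to satisfaction are simple to state in code (a minimal sketch of the standard PSL definitions over soft truth values in [0,1]):

```python
def luk_and(a, b):
    """Lukasiewicz t-norm: conjunction of soft truth values in [0,1]."""
    return max(0.0, a + b - 1.0)

def luk_or(a, b):
    """Lukasiewicz t-conorm: disjunction of soft truth values."""
    return min(1.0, a + b)

def distance_to_satisfaction(body, head):
    """A ground rule body => head has truth min(1, 1 - body + head);
    its distance to satisfaction is how far that falls short of 1,
    i.e. max(0, body - head).  This is the hinge phi_r in P(I)."""
    return max(0.0, body - head)
```

Because each φ_r is a hinge (piecewise-linear and convex) in the atom truth values, minimizing Σ w_r φ_r(I) for MPE inference is a convex problem, which is where PSL’s scalability comes from.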
Page 54
PSL Representation of Heuristics for Promotion
• Promote any candidate
• Promote “hints” (the old promotion strategy)
• Be consistent about labels for duplicate entities
Page 55
PSL Representation of Ontological Rules
• Be consistent with constraints from the ontology (adapted from Jiang et al., ICDM 2012)
• Too expressive for ProPPR
Page 56
Datasets & Results
• Evaluation on the NELL dataset from iteration 165:
  – 1.7M candidate facts
  – 70K ontological constraints
• Predictions on 25K facts from a 2-hop neighborhood around the test data
• Beats other methods, and runs in just 10 seconds!

                  F1    AUC
Baseline         .828   .873
NELL             .673   .765
MLN (Jiang, 12)  .836   .899
KGI-PSL          .853   .904
Page 57
Summary
• Background: information extraction and NELL
• Key ideas in NELL
  – Coupled learning
  – Multi-view, multi-strategy learning
• Inference in NELL
  – Inference as another learning strategy
    • Learning in graphs
    • Path Ranking Algorithm
    • ProPPR
  – Promotion as inference
• Conclusions & summary