TRANSCRIPT
Markov Logic
Parag Singla, Dept. of Computer Science
University of Texas, Austin
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
The Interface Layer
Interface Layer
Applications
Infrastructure
Networking
Interface Layer
Applications
Infrastructure
Internet
Routers  Protocols
WWW  Email
Databases
Interface Layer
Applications
Infrastructure
Relational Model
Query Optimization
Transaction Management
ERP
OLTP
CRM
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
First-Order Logic?
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
Graphical Models?
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
Statistical + Logical AI
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
Markov Logic
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
Markov Networks: Undirected graphical models
Cancer
Cough  Asthma
Smoking
Potential functions defined over cliques
Smoking  Cancer  Φ(S,C)
False    False   4.5
False    True    4.5
True     False   2.7
True     True    4.5
P(x) = \frac{1}{Z} \prod_{c} \Phi_c(x_c)

Z = \sum_{x} \prod_{c} \Phi_c(x_c)
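The product-of-potentials definition can be traced concretely with a toy sketch in Python, using the slide's two-variable table as the only clique (a minimal illustration, not a general implementation):

```python
from itertools import product

# Clique potential over {Smoking, Cancer}, taken from the slide's table
phi = {
    (False, False): 4.5,
    (False, True):  4.5,
    (True,  False): 2.7,
    (True,  True):  4.5,
}

# Partition function: Z = sum over all worlds of the clique potential
Z = sum(phi[s, c] for s, c in product([False, True], repeat=2))

def prob(s, c):
    """P(x) = (1/Z) * product over cliques (a single clique here)."""
    return phi[s, c] / Z
```

With these numbers, the least likely world is the smoker without cancer, since its potential (2.7) is the smallest.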
Markov Networks: Undirected graphical models
Log-linear model:
P(x) = \frac{1}{Z} \exp\Big( \sum_i w_i f_i(x) \Big)

w_i: weight of feature i;  f_i: feature i

f_1(\text{Smokes}, \text{Cancer}) = 1 if ¬Smokes ∨ Cancer holds, 0 otherwise
w_1 = 1.5

[Graph: Smoking, Cough, Asthma, Cancer]
First-Order Logic
Constants, variables, functions, predicates Anna, x, MotherOf(x), Friends(x,y)
Grounding: Replace all variables by constants, e.g., Friends(Anna, Bob)
Formula: Predicates connected by operators, e.g., Smokes(x) ⇒ Cancer(x)
Knowledge Base (KB): A set of formulas Can be equivalently converted into a clausal form
World: Assignment of truth values to all ground predicates
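The grounding step can be sketched directly, assuming the predicates of the running example (names and encoding are illustrative):

```python
from itertools import product

constants = ["Anna", "Bob"]
arities = {"Smokes": 1, "Cancer": 1, "Friends": 2}  # predicate -> arity

# Grounding: every predicate applied to every tuple of constants
ground_atoms = [
    "%s(%s)" % (p, ",".join(args))
    for p, k in arities.items()
    for args in product(constants, repeat=k)
]

# A world is a truth assignment to all ground atoms
world = dict.fromkeys(ground_atoms, False)
world["Smokes(Anna)"] = True
```

With 2 constants there are 2 + 2 + 4 = 8 ground atoms, so 2^8 possible worlds.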
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
Markov Logic
A logical KB is a set of hard constraints on the set of possible worlds
Let's make them soft constraints: when a world violates a formula, it becomes less probable, not impossible
Give each formula a weight (Higher weight ⇒ Stronger constraint)

P(\text{world}) \propto \exp\Big( \sum \text{weights of formulas it satisfies} \Big)
Definition
A Markov Logic Network (MLN) is a set of pairs (F, w) where F is a formula in first-order logic w is a real number
Together with a finite set of constants, it defines a Markov network with:
One node for each grounding of each predicate in the MLN
One feature for each grounding of each formula F in the MLN, with the corresponding weight w
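A minimal sketch of the template idea: each formula, paired with a weight, yields one binary feature per grounding. The formula and weight follow the slide's running example, but the encoding (formulas as Python functions) is an assumption for illustration:

```python
from itertools import product

constants = ["A", "B"]

# One MLN formula, encoded as a function of a world and its free variable:
# Smokes(x) => Cancer(x), weight 1.5 (the deck's running example)
def smokes_implies_cancer(world, x):
    return (not world["Smokes(%s)" % x]) or world["Cancer(%s)" % x]

mln = [(smokes_implies_cancer, 1.5, 1)]  # (formula, weight, #free variables)

def ground_features(world):
    """One weighted binary feature per grounding of each formula."""
    return [
        (w, 1.0 if f(world, *args) else 0.0)
        for f, w, k in mln
        for args in product(constants, repeat=k)
    ]
```

With two constants, the one-variable formula yields two ground features, each carrying the formula's weight.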
Example: Friends & Smokers

1.5  ∀x Smokes(x) ⇒ Cancer(x)
1.1  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Example: Friends & Smokers

Two constants: Ana (A) and Bob (B)

1.5  ∀x Smokes(x) ⇒ Cancer(x)
1.1  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Example: Friends & Smokers

Two constants: Ana (A) and Bob (B)

1.5  ∀x Smokes(x) ⇒ Cancer(x)
1.1  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))

Ground network nodes:
Friends(A,A)  Friends(A,B)  Friends(B,A)  Friends(B,B)
Smokes(A)  Smokes(B)  Cancer(A)  Cancer(B)
State of the world: a {0,1} assignment to the nodes
Markov Logic Networks MLN is template for ground Markov networks Probability of a world x:
One feature for each ground formula
P(x) = \frac{1}{Z} \exp\Big( \sum_{k \in \text{ground formulas}} w_k f_k(x) \Big)

f_k(x) = 1 if the kth ground formula is satisfied in x, 0 otherwise
Markov Logic Networks
MLN is template for ground Markov nets Probability of a world x:
P(x) = \frac{1}{Z} \exp\Big( \sum_{i \in \text{MLN formulas}} w_i n_i(x) \Big)

w_i: weight of formula i;  n_i(x): number of true groundings of formula i in x
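On the two-constant example this distribution can be computed exactly by brute force; a sketch with a single formula for brevity (the formula and weight are the deck's running example):

```python
import math
from itertools import product

constants = ["A", "B"]
atoms = ["%s(%s)" % (p, c) for p in ("Smokes", "Cancer") for c in constants]
w1 = 1.5  # weight of Smokes(x) => Cancer(x)

def n1(world):
    """n_1(x): number of true groundings of Smokes(x) => Cancer(x)."""
    return sum((not world["Smokes(%s)" % c]) or world["Cancer(%s)" % c]
               for c in constants)

# Enumerate all 2^4 worlds and normalize exactly
worlds = [dict(zip(atoms, vals))
          for vals in product([False, True], repeat=len(atoms))]
Z = sum(math.exp(w1 * n1(x)) for x in worlds)

def prob(world):
    return math.exp(w1 * n1(world)) / Z
```

Worlds that satisfy more groundings of the formula get exponentially more probability mass; a world violating both groundings is e^{2 w_1} times less likely than one satisfying both.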
Relation to Statistical Models
Special cases: Markov networks, Markov random fields, Bayesian networks, log-linear models, exponential models, max. entropy models, Gibbs distributions, Boltzmann machines, logistic regression, hidden Markov models, conditional random fields
Obtained by making all predicates zero-arity
Markov logic allows objects to be interdependent (non-i.i.d.)
Relation to First-Order Logic
Infinite weights ⇒ first-order logic
Satisfiable KB, positive weights ⇒ satisfying assignments = modes of distribution
Markov logic allows contradictions between formulas
Markov logic in infinite domains
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
Inference
Cancer(A)
Smokes(A)?  Friends(A,A)
Friends(B,A)
Smokes(B)?
Friends(A,B)
Cancer(B)?
Friends(B,B)
Most Probable Explanation (MPE) Inference
Cancer(A)
Smokes(A)?  Friends(A,A)
Friends(B,A)
Smokes(B)?
Friends(A,B)
Cancer(B)?
Friends(B,B)
What is the most likely state of Smokes(A), Smokes(B), Cancer(B)?
MPE Inference
Problem: Find most likely state of world given evidence
\arg\max_y P(y \mid x)

y: query;  x: evidence
MPE Inference
Problem: Find most likely state of world given evidence
\arg\max_y \frac{1}{Z_x} \exp\Big( \sum_i w_i n_i(x, y) \Big)
MPE Inference
Problem: Find most likely state of world given evidence
\arg\max_y \sum_i w_i n_i(x, y)
MPE Inference
Problem: Find most likely state of world given evidence
This is just the weighted MaxSAT problem Use weighted SAT solver
(e.g., MaxWalkSAT [Kautz et al. 97] )
\arg\max_y \sum_i w_i n_i(x, y)
Computing Probabilities: Marginal Inference
Cancer(A)
Smokes(A)?  Friends(A,A)
Friends(B,A)
Smokes(B)?
Friends(A,B)
Cancer(B)?
Friends(B,B)
What is the probability Smokes(B) = 1?
Marginal Inference
Problem: Find the probability of query atoms given evidence
P(y \mid x)

y: query;  x: evidence
Marginal Inference
Problem: Find the probability of query atoms given evidence
P(y \mid x) = \frac{1}{Z_x} \exp\Big( \sum_i w_i n_i(x, y) \Big)
Computing Zx takes exponential time!
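Because Z_x is intractable in general, sampling is a standard workaround. A Gibbs-sampling sketch for the toy one-formula MLN from the running example (an illustration of the idea, not Alchemy's sampler):

```python
import math
import random

w1 = 1.5                     # weight of Smokes(x) => Cancer(x)
constants = ["A", "B"]

def n1(world):
    return sum((not world["Smokes(%s)" % c]) or world["Cancer(%s)" % c]
               for c in constants)

def gibbs_marginal(query, evidence, n_samples=4000, seed=0):
    """Approximate P(query = True | evidence) by Gibbs sampling."""
    rng = random.Random(seed)
    atoms = ["%s(%s)" % (p, c) for p in ("Smokes", "Cancer") for c in constants]
    free = [a for a in atoms if a not in evidence]
    world = dict(evidence)
    for a in free:
        world[a] = rng.random() < 0.5
    hits = 0
    for _ in range(n_samples):
        for a in free:  # resample each non-evidence atom from its conditional
            world[a] = True
            pt = math.exp(w1 * n1(world))
            world[a] = False
            pf = math.exp(w1 * n1(world))
            world[a] = rng.random() < pt / (pt + pf)
        hits += world[query]
    return hits / n_samples
```

On this tiny model the answer can be checked analytically: given Smokes(A) = True, P(Cancer(A)) = e^{1.5} / (e^{1.5} + 1) ≈ 0.82, and the sample average converges to it.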
Belief Propagation
Bipartite network of nodes (variables) and features
In Markov logic: Nodes = Ground atoms, Features = Ground clauses
Exchange messages until convergence
Messages: current approximation to node marginals
Belief Propagation
Nodes (x)
Features (f)
Smokes(Ana)    Smokes(Ana) ∧ Friends(Ana,Bob) ⇒ Smokes(Bob)    Smokes(Bob)
Belief Propagation
Nodes (x)
Features (f)
\mu_{x \to f}(x) = \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)
Belief Propagation
Nodes (x)
Features (f)
\mu_{x \to f}(x) = \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)

\mu_{f \to x}(x) = \sum_{\sim \{x\}} \Big( e^{w f(\mathbf{x})} \prod_{y \in nb(f) \setminus \{x\}} \mu_{y \to f}(y) \Big)
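The two message equations can be traced on a tiny graph. The sketch below uses a single ground clause and treats Friends(Ana, Bob) as fixed evidence so that only two variables remain; on such a tree-structured graph one message pass is exact (an illustration, not a general BP engine):

```python
import math

w = 1.1  # weight of the (simplified) ground clause Smokes(Ana) => Smokes(Bob)

def f(sa, sb):
    return 1.0 if (not sa) or sb else 0.0

def factor(sa, sb):
    # Factor value e^{w f(x)} for each joint state
    return math.exp(w * f(sa, sb))

# Variable-to-factor messages: product over the variable's *other*
# neighboring factors; with no other factors they are uniform.
msg_sa_to_f = {True: 1.0, False: 1.0}
msg_sb_to_f = {True: 1.0, False: 1.0}

# Factor-to-variable message: sum out every variable except the target.
msg_f_to_sa = {
    sa: sum(factor(sa, sb) * msg_sb_to_f[sb] for sb in (True, False))
    for sa in (True, False)
}

# Belief at Smokes(Ana): normalized product of incoming messages.
Z = sum(msg_f_to_sa.values())
belief_sa = {v: m / Z for v, m in msg_f_to_sa.items()}
```

Since the graph is a tree, the belief matches the brute-force marginal exactly; on loopy ground networks BP gives only an approximation.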
Lifted Belief Propagation

\mu_{x \to f}(x) = \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)

\mu_{f \to x}(x) = \sum_{\sim \{x\}} \Big( e^{w f(\mathbf{x})} \prod_{y \in nb(f) \setminus \{x\}} \mu_{y \to f}(y) \Big)

Nodes (x)
Features (f)
Lifted Belief Propagation

\mu_{x \to f}(x) = \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)

\mu_{f \to x}(x) = \sum_{\sim \{x\}} \Big( e^{w f(\mathbf{x})} \prod_{y \in nb(f) \setminus \{x\}} \mu_{y \to f}(y) \Big)

Message exponents: functions of edge counts

Nodes (x)
Features (f)
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
Learning Parameters

Three constants: Ana, Bob, John

w1 = ?  ∀x Smokes(x) ⇒ Cancer(x)
w2 = ?  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Learning Parameters

Three constants: Ana, Bob, John

w1 = ?  ∀x Smokes(x) ⇒ Cancer(x)
w2 = ?  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))

Database:
Smokes:  Smokes(Ana), Smokes(Bob)
Cancer:  Cancer(Ana), Cancer(Bob)
Friends: Friends(Ana, Bob), Friends(Bob, Ana), Friends(Ana, John), Friends(John, Ana)

Closed World Assumption: Anything not in the database is assumed false.
Learning Parameters
Data is a relational database
Closed world assumption (if not: EM)
Learning parameters (weights): generatively or discriminatively
Learning Parameters (Discriminative)
Given training data
Maximize conditional likelihood of query (y) given evidence (x)
Use gradient ascent
No local maxima
\frac{\partial}{\partial w_i} \log P_w(y \mid x) = n_i(x, y) - E_w[n_i(x, y)]

n_i(x, y): number of true groundings of clause i in the data
E_w[n_i(x, y)]: expected number of true groundings according to the model
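A sketch of gradient ascent on this objective, with the expectation computed exactly on a tiny invented dataset (three constants, a single formula; the smoking/cancer values are assumptions for illustration):

```python
import math
from itertools import product

constants = ["Ana", "Bob", "John"]
smokes = {"Ana": True, "Bob": True, "John": True}   # evidence x (assumed)
cancer = {"Ana": True, "Bob": True, "John": False}  # training query y (assumed)

def n(cancer_world):
    """n(x, y): true groundings of Smokes(c) => Cancer(c)."""
    return sum((not smokes[c]) or cancer_world[c] for c in constants)

def expected_n(w):
    """E_w[n(x, y)]: exact expectation over all 2^3 query worlds y."""
    worlds = [dict(zip(constants, v))
              for v in product([False, True], repeat=len(constants))]
    ws = [math.exp(w * n(y)) for y in worlds]
    Z = sum(ws)
    return sum(n(y) * wt for y, wt in zip(worlds, ws)) / Z

w, lr = 0.0, 0.5
for _ in range(300):
    w += lr * (n(cancer) - expected_n(w))  # gradient of log P_w(y | x)
```

Here every constant smokes and two of three have cancer, so the fixed point satisfies E_w[n] = 2, i.e. the per-grounding satisfaction probability is 2/3 and w converges to ln 2.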
Learning Parameters (Discriminative)
Requires inference at each step (slow!)
Approximate expected counts by counts in MAP state of y given x
\frac{\partial}{\partial w_i} \log P_w(y \mid x) = n_i(x, y) - E_w[n_i(x, y)]

n_i(x, y): number of true groundings of clause i in the data
E_w[n_i(x, y)]: expected number of true groundings according to the model
Learning Structure
Smokes
Smokes(Ana)
Smokes(Bob)
Cancer
Cancer(Ana)
Cancer(Bob)
Friends
Friends(Ana, Bob)
Friends(Bob, Ana)
Friends(Ana, John)
Friends(John, Ana)
Three constants: Ana, Bob, John
Can we learn the set of the formulas in the MLN?
Learning Structure
w1 = ?  ∀x Smokes(x) ⇒ Cancer(x)
w2 = ?  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Smokes
Smokes(Ana)
Smokes(Bob)
Cancer
Cancer(Ana)
Cancer(Bob)
Friends
Friends(Ana, Bob)
Friends(Bob, Ana)
Friends(Ana, John)
Friends(John, Ana)
Three constants: Ana, Bob, John
Can we refine the set of the formulas in the MLN?
Learning Structure
w1 = ?  ∀x Smokes(x) ⇒ Cancer(x)
w2 = ?  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Smokes
Smokes(Ana)
Smokes(Bob)
Cancer
Cancer(Ana)
Cancer(Bob)
Friends
Friends(Ana, Bob)
Friends(Bob, Ana)
Friends(Ana, John)
Friends(John, Ana)
w3 = ?  ∀x,y Friends(x,y) ⇒ Friends(y,x)
Three constants: Ana, Bob, John
Learning Structure
Learn the MLN formulas given data
Search in the space of possible formulas
Exhaustive search not possible! Need to guide the search
Initial state: unit clauses or hand-coded KB
Operators: add/remove literal, flip sign
Evaluation function: likelihood + structure prior
Search: top-down, bottom-up, structural motifs
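The search operators can be sketched as a neighbor generator over clauses; the literal pool below is a toy assumption, and scoring each neighbor (likelihood plus structure prior) is left out:

```python
# Candidate literals over one variable (illustrative pool)
literals = ["Smokes(x)", "!Smokes(x)", "Cancer(x)", "!Cancer(x)"]

def flip(lit):
    """Flip the sign of a literal, using '!' for negation."""
    return lit[1:] if lit.startswith("!") else "!" + lit

def neighbors(clause):
    """Candidate clauses reachable by one search operator."""
    out = []
    # add a literal (skip ones already present in either sign)
    out += [clause + [l] for l in literals
            if l not in clause and flip(l) not in clause]
    # remove a literal
    out += [[l for l in clause if l != r] for r in clause]
    # flip the sign of a literal
    out += [[flip(l) if l == r else l for l in clause] for r in clause]
    return out
```

Beam or greedy search then repeatedly expands the best-scoring neighbors until the evaluation function stops improving.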
Overview
Motivation
Markov logic
Inference
Learning
Applications
Conclusion & Future Work
Applications
Web mining
Collective classification
Link prediction
Information retrieval
Entity resolution
Activity recognition
Image segmentation & de-noising
Social network analysis
Computational biology
Natural language processing
Robot mapping
Planning and MDPs
More...
Entity Resolution
Data Cleaning and Integration – first step in the data-mining process
Merging of data from multiple sources results in duplicates
Entity Resolution: Problem of identifying which of the records/fields refer to the same underlying entity
Entity Resolution
Parag Singla and Pedro Domingos, "Memory-Efficient Inference in Relational Domains" (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficent inference in relatonal domains. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, "Sound and Efficient Inference with Probabilistic and Deterministic Dependencies", in Proc. AAAI-06, Boston, MA, 2006.

Poon H. (2006). Efficent inference. In Proceedings of the Twenty-First National Conference on Artificial Intelligence.
Author
Title
Venue
Predicates & Formulas

Predicates:
HasToken(field, token), HasField(record, field), SameField(field, field), SameRecord(record, record)

HasToken(f1,t) ∧ HasToken(f2,t) ⇒ SameField(f1,f2)
(Similarity of fields based on shared tokens)

HasField(r1,f1) ∧ HasField(r2,f2) ∧ SameField(f1,f2) ⇒ SameRecord(r1,r2)
(Similarity of records based on similarity of fields and vice-versa)

SameRecord(r1,r2) ∧ SameRecord(r2,r3) ⇒ SameRecord(r1,r3)
(Transitivity)
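The transitivity formula is what turns pairwise match decisions into clusters of records referring to the same entity; a hypothetical post-processing sketch using union-find:

```python
def find(parent, r):
    """Find the cluster representative of record r, with path compression."""
    while parent[r] != r:
        parent[r] = parent[parent[r]]
        r = parent[r]
    return r

def cluster(records, same_pairs):
    """Merge pairwise SameRecord(r1, r2) decisions into transitive clusters."""
    parent = {r: r for r in records}
    for a, b in same_pairs:
        parent[find(parent, a)] = find(parent, b)
    groups = {}
    for r in records:
        groups.setdefault(find(parent, r), []).append(r)
    return sorted(groups.values())
```

For example, the decisions SameRecord(1,2) and SameRecord(2,3) over records {1, 2, 3, 4} yield the clusters {1, 2, 3} and {4}, with SameRecord(1,3) implied by transitivity.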
Winner of Best Paper Award!
Discovery of Social Relationships in Consumer Photo Collections
Find out all the photos of my child’s friends!
MLN Rules
Hard Rules
  Two people share a unique relationship
  Parents are older than their children
  ...
Soft Rules
  Children's friends are of similar age
  A young adult appearing with a child shares a parent-child relationship
  Friends and relatives are clustered together
  ...
Models Compared
Model       Description
Random      Random prediction (uniform)
Prior       Prediction based on priors
Hard        Hard constraints only
Hard-prior  Combination of hard and prior
MLN         Hand-constructed MLN
Results (7 Different Relationships)
Alchemy
Open-source software including:
  Full first-order logic syntax
  MPE inference, marginal inference
  Generative & discriminative weight learning
  Structure learning
  Programming language features
alchemy.cs.washington.edu