markov logic parag singla dept. of computer science university of texas, austin

75
Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Upload: hillary-dennis

Post on 28-Dec-2015

219 views

Category:

Documents


3 download

TRANSCRIPT

Page 1: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Markov Logic

Parag SinglaDept. of Computer Science

University of Texas, Austin

Page 2: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Overview

Motivation Background Markov logic Inference Learning Applications

Page 3: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

The Interface Layer

Interface Layer

Applications

Infrastructure

Page 4: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Networking

Interface Layer

Applications

Infrastructure

Internet

RoutersProtocols

WWWEmail

Page 5: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Databases

Interface Layer

Applications

Infrastructure

Relational Model

QueryOptimization

TransactionManagement

ERP

OLTP

CRM

Page 6: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Artificial Intelligence

Interface Layer

Applications

Infrastructure

Representation

Learning

Inference

NLP

Planning

Multi-AgentSystemsVision

Robotics

Page 7: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Artificial Intelligence

Interface Layer

Applications

Infrastructure

Representation

Learning

Inference

NLP

Planning

Multi-AgentSystemsVision

Robotics

First-Order Logic?

Page 8: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Artificial Intelligence

Interface Layer

Applications

Infrastructure

Representation

Learning

Inference

NLP

Planning

Multi-AgentSystemsVision

Robotics

Graphical Models?

Page 9: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Artificial Intelligence

Interface Layer

Applications

Infrastructure

Representation

Learning

Inference

NLP

Planning

Multi-AgentSystemsVision

Robotics

Statistical + Logical AI

Page 10: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Artificial Intelligence

Interface Layer

Applications

Infrastructure

Representation

Learning

Inference

NLP

Planning

Multi-AgentSystemsVision

Robotics

Markov Logic

Page 11: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Overview

Motivation Background Markov logic Inference Learning Applications

Page 12: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Markov Networks Undirected graphical models

Cancer

CoughAsthma

Smoking

Potential functions defined over cliques

Smoking Cancer Ф(S,C)

False False 4.5

False True 4.5

True False 2.7

True True 4.5

c

cc xZ

xP )(1

)(

x c

cc xZ )(

Page 13: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Markov Networks Undirected graphical models

Log-linear model:

Weight of Feature i Feature i

otherwise0

CancerSmokingif1)CancerSmoking,(1f

5.11 w

Cancer

CoughAsthma

Smoking

iii xfw

ZxP )(exp

1)(

Page 14: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

First-Order Logic

Constants, variables, functions, predicates Anna, x, MotherOf(x), Friends(x,y)

Grounding: Replace all variables by constants Friends (Anna, Bob)

Formula: Predicates connected by operators Smokes(x) Cancer(x)

Knowledge Base (KB): A set of formulas Can be equivalently converted into a clausal form

World: Assignment of truth values to all ground predicates

Page 15: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Overview

Motivation Background Markov logic Inference Learning Applications

Page 16: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Markov Logic

A logical KB is a set of hard constraintson the set of possible worlds

Let’s make them soft constraints:When a world violates a formula,It becomes less probable, not impossible

Give each formula a weight(Higher weight Stronger constraint)

satisfiesit formulas of weightsexpP(world)

Page 17: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Definition

A Markov Logic Network (MLN) is a set of pairs (F, w) where F is a formula in first-order logic w is a real number

Together with a finite set of constants,it defines a Markov network with One node for each grounding of each predicate in

the MLN One feature for each grounding of each formula F

in the MLN, with the corresponding weight w

Page 18: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Example: Friends & Smokers

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

Page 19: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Example: Friends & Smokers

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

1.1

5.1

Page 20: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Example: Friends & Smokers

Two constants: Ana (A) and Bob (B)

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

1.1

5.1

Page 21: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Example: Friends & Smokers

Cancer(A)

Smokes(A) Smokes(B)

Cancer(B)

Two constants: Ana (A) and Bob (B)

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

1.1

5.1

Page 22: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Example: Friends & Smokers

Cancer(A)

Smokes(A)Friends(A,A)

Friends(B,A)

Smokes(B)

Friends(A,B)

Cancer(B)

Friends(B,B)

Two constants: Ana (A) and Bob (B)

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

1.1

5.1

Page 23: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Example: Friends & Smokers

Cancer(A)

Smokes(A)Friends(A,A)

Friends(B,A)

Smokes(B)

Friends(A,B)

Cancer(B)

Friends(B,B)

Two constants: Ana (A) and Bob (B)

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

1.1

5.1

Page 24: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Example: Friends & Smokers

Cancer(A)

Smokes(A)Friends(A,A)

Friends(B,A)

Smokes(B)

Friends(A,B)

Cancer(B)

Friends(B,B)

Two constants: Ana (A) and Bob (B)

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

1.1

5.1

Page 25: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Example: Friends & Smokers

Cancer(A)

Smokes(A)Friends(A,A)

Friends(B,A)

Smokes(B)

Friends(A,B)

Cancer(B)

Friends(B,B)

Two constants: Ana (A) and Bob (B)

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

1.1

5.1

State of the World {0,1} Assignment to the nodes

Page 26: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Markov Logic Networks MLN is template for ground Markov networks Probability of a world x:

One feature for each ground formula

formulasgroundkkk xfw

ZxP

)(exp1

)(

otherwise 0

x given satisfied isformula if1)(

kth xfk

Page 27: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Markov Logic Networks

MLN is template for ground Markov nets Probability of a world x:

Weight of formula i No. of true groundings of formula i in x

formulas MLN

)(exp1

)(i

ii xnwZ

xP

formulasgroundkkk xfw

ZxP

)(exp1

)(

Page 28: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Relation to Statistical Models

Special cases: Markov networks Markov random fields Bayesian networks Log-linear models Exponential models Max. entropy models Gibbs distributions Boltzmann machines Logistic regression Hidden Markov models Conditional random fields

Obtained by making all predicates zero-arity

Markov logic allows objects to be interdependent (non-i.i.d.)

Page 29: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Relation to First-Order Logic

Infinite weights First-order logic Satisfiable KB, positive weights

Satisfying assignments = Modes of distribution Markov logic allows contradictions between

formulas Markov logic in infinite domains

Page 30: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Overview

Motivation Background Markov logic Inference Learning Applications

Page 31: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Inference

Cancer(A)

Smokes(A)?Friends(A,A)

Friends(B,A)

Smokes(B)?

Friends(A,B)

Cancer(B)?

Friends(B,B)

Page 32: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Most Probable Explanation (MPE)Inference

Cancer(A)

Smokes(A)?Friends(A,A)

Friends(B,A)

Smokes(B)?

Friends(A,B)

Cancer(B)?

Friends(B,B)

What is the most likely state of Smokes(A), Smokes(B), Cancer(B)?

Page 33: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

MPE Inference

Problem: Find most likely state of world given evidence

)|(maxarg xyPy

Query Evidence

Page 34: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

MPE Inference

Problem: Find most likely state of world given evidence

i

iixy

yxnwZ

),(exp1

maxarg

Page 35: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

MPE Inference

Problem: Find most likely state of world given evidence

i

iiy

yxnw ),(maxarg

Page 36: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

MPE Inference

Problem: Find most likely state of world given evidence

This is just the weighted MaxSAT problem Use weighted SAT solver

(e.g., MaxWalkSAT [Kautz et al. 97] )

i

iiy

yxnw ),(maxarg

Page 37: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Computing Probabilities: Marginal Inference

Cancer(A)

Smokes(A)?Friends(A,A)

Friends(B,A)

Smokes(B)?

Friends(A,B)

Cancer(B)?

Friends(B,B)

What is the probability Smokes(B) = 1?

Page 38: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Marginal Inference

Problem: Find the probability of query atoms given evidence

)|( xyP

Query Evidence

Page 39: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Marginal Inference

Problem: Find the probability of query atoms given evidence

iii

x

yxnwZ

xyP ),(exp1

)|(

Computing Zx takes exponential time!

Page 40: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Belief Propagation

Bipartite network of nodes (variables)and features

In Markov logic: Nodes = Ground atoms Features = Ground clauses

Exchange messages until convergence Messages

Current approximation to node marginals

Page 41: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Belief Propagation

Nodes (x)

Features (f)

Smokes(Ana)Smokes(Ana) Friends(Ana, Bob)

Smokes(Bob)

Page 42: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Belief Propagation

Nodes (x)

Features (f)

}\{)(

)()(fxnh

xhfx xx

Page 43: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Belief Propagation

Nodes (x)

Features (f)

}\{)(

)()(fxnh

xhfx xx

}{~ }\{)(

)( )()(x xfny

fyxwf

xf yex

Page 44: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Lifted Belief Propagation

}\{)(

)()(fxnh

xhfx xx

}{~ }\{)(

)( )()(x xfny

fyxwf

xf yex

Nodes (x)

Features (f)

Page 45: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Lifted Belief Propagation

}\{)(

)()(fxnh

xhfx xx

}{~ }\{)(

)( )()(x xfny

fyxwf

xf yex

Nodes (x)

Features (f)

Page 46: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Lifted Belief Propagation

}\{)(

)()(fxnh

xhfx xx

}{~ }\{)(

)( )()(x xfny

fyxwf

xf yex

, :Functions of edge counts

Nodes (x)

Features (f)

Page 47: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Overview

Motivation Background Markov logic Inference Learning Applications

Page 48: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Parameters

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

?w

?w

2

1

Three constants: Ana, Bob, John

Page 49: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Parameters

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

?w

?w

2

1

Smokes

Smokes(Ana)

Smokes(Bob)

Three constants: Ana, Bob, John

Page 50: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Parameters

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

?w

?w

2

1

Smokes

Smokes(Ana)

Smokes(Bob)

Closed World Assumption: Anything not in the database is assumed false.

Three constants: Ana, Bob, John

Page 51: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Parameters

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

?w

?w

2

1

Smokes

Smokes(Ana)

Smokes(Bob)

Closed World Assumption: Anything not in the database is assumed false.

Three constants: Ana, Bob, John

Cancer

Cancer(Ana)

Cancer(Bob)

Page 52: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Parameters

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

?w

?w

2

1

Smokes

Smokes(Ana)

Smokes(Bob)

Closed World Assumption: Anything not in the database is assumed false.

Three constants: Ana, Bob, John

Cancer

Cancer(Ana)

Cancer(Bob)

Friends

Friends(Ana, Bob)

Friends(Bob, Ana)

Friends(Ana, John)

Friends(John, Ana)

Page 53: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Parameters

Data is a relational database Closed world assumption (if not: EM) Learning parameters (weights)

Generatively Discriminatively

Page 54: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Parameters (Discriminative)

Given training data Maximize conditional likelihood of query (y)

given evidence (x) Use gradient ascent No local maxima

No. of true groundings of clause i in data

Expected no. true groundings according to model

),(),()|(log yxnEyxnxyPw iwiw

i

Page 55: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Parameters (Discriminative)

Requires inference at each step (slow!)

Approximate expected counts by counts in MAP state of y given x

No. of true groundings of clause i in data

Expected no. true groundings according to model

),(),()|(log yxnEyxnxyPw iwiw

i

Page 56: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Structure

Smokes

Smokes(Ana)

Smokes(Bob)

Cancer

Cancer(Ana)

Cancer(Bob)

Friends

Friends(Ana, Bob)

Friends(Bob, Ana)

Friends(Ana, John)

Friends(John, Ana)

Three constants: Ana, Bob, John

Can we learn the set of the formulas in the MLN?

Page 57: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Structure

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

?w

?w

2

1

Smokes

Smokes(Ana)

Smokes(Bob)

Cancer

Cancer(Ana)

Cancer(Bob)

Friends

Friends(Ana, Bob)

Friends(Bob, Ana)

Friends(Ana, John)

Friends(John, Ana)

Three constants: Ana, Bob, John

Can we refine the set of the formulas in the MLN?

Page 58: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Structure

)()(),(,

)()(

ySmokesxSmokesyxFriendsyx

xCancerxSmokesx

?w

?w

2

1

Smokes

Smokes(Ana)

Smokes(Bob)

Cancer

Cancer(Ana)

Cancer(Bob)

Friends

Friends(Ana, Bob)

Friends(Bob, Ana)

Friends(Ana, John)

Friends(John, Ana)

),(),(, xyFriendsyxFriendsyx ?w3

Three constants: Ana, Bob, John

Page 59: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Learning Structure

Learn the MLN formulas given data Search in the space of possible formulas

Exhaustive search not possible! Need to guide the search

Initial state: Unit clauses or hand-coded KB Operators: Add/remove literal, flip sign Evaluation function:

likelihood + structure prior Search: top-down, bottom-up, structural motifs

Page 60: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Overview

Motivation Markov logic Inference Learning Applications Conclusion & Future Work

Page 61: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Applications

Web-mining Collective Classification Link Prediction Information retrieval Entity resolution Activity Recognition Image Segmentation &

De-noising

Social Network Analysis

Computational Biology Natural Language

Processing Robot mapping Planning and MDPs More..

Page 62: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Entity Resolution

Data Cleaning and Integration – first step in the data-mining process

Merging of data from multiple sources results in duplicates

Entity Resolution: Problem of identifying which of the records/fields refer to the same underlying entity

Page 63: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Entity Resolution

Parag Singla and Pedro Domingos, “Memory-EfficientInference in Relational Domains” (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficentinference in relatonal domains. In Proceedings of theTwenty-First National Conference on Artificial Intelligence(pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, Sound and Efficient Inferencewith Probabilistic and Deterministic Dependencies”, inProc. AAAI-06, Boston, MA, 2006.

Poon H. (2006). Efficent inference. In Proceedings of theTwenty-First National Conference on Artificial Intelligence.

Author

Title

Venue

Page 64: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Entity Resolution

Parag Singla and Pedro Domingos, “Memory-EfficientInference in Relational Domains” (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficentinference in relatonal domains. In Proceedings of theTwenty-First National Conference on Artificial Intelligence(pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, Sound and Efficient Inferencewith Probabilistic and Deterministic Dependencies”, inProc. AAAI-06, Boston, MA, 2006.

Poon H. (2006). Efficent inference. In Proceedings of theTwenty-First National Conference on Artificial Intelligence.

Author

Title

Venue

Page 65: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Entity Resolution

Parag Singla and Pedro Domingos, “Memory-EfficientInference in Relational Domains” (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficentinference in relatonal domains. In Proceedings of theTwenty-First National Conference on Artificial Intelligence(pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, Sound and Efficient Inferencewith Probabilistic and Deterministic Dependencies”, inProc. AAAI-06, Boston, MA, 2006.

Poon H. (2006). Efficent inference. In Proceedings of theTwenty-First National Conference on Artificial Intelligence.

Author

Title

Venue

Page 66: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

HasToken(field,token) HasField(record,field) SameField(field,field) SameRecord(record,record)

Predicates

Page 67: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

HasToken(field,token) HasField(record,field) SameField(field,field) SameRecord(record,record)

HasToken(f1,t) HasToken(f2,t) SameField(f1,f2) (Similarity of fields based on shared tokens)

Predicates & Formulas

Page 68: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

HasToken(field,token) HasField(record,field) SameField(field,field) SameRecord(record,record)

HasToken(f1,t) HasToken(f2,t) SameField(f1,f2) (Similarity of fields based on shared tokens)

HasField(r1,f1) HasField(r2,f2) SameField(f1,f2) SameRecord(r1,r2)

(Similarity of records based on similarity of fields and vice-versa)

Predicates & Formulas

Page 69: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

HasToken(field,token) HasField(record,field) SameField(field,field) SameRecord(record,record)

HasToken(f1,t) HasToken(f2,t) SameField(f1,f2) (Similarity of fields based on shared tokens)

HasField(r1,f1) HasField(r2,f2) SameField(f1,f2) SameRecord(r1,r2)

(Similarity of records based on similarity of fields and vice-versa)

SameRecord(r1,r2) ^ SameRecord(r2,r3) SameRecord(r1,r3)

(Transitivity)

Predicates & Formulas

Page 70: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

HasToken(field,token) HasField(record,field) SameField(field,field) SameRecord(record,record)

HasToken(f1,t) HasToken(f2,t) SameField(f1,f2) (Similarity of fields based on shared tokens)

HasField(r1,f1) HasField(r2,f2) SameField(f1,f2) SameRecord(r1,r2)

(Similarity of records based on similarity of fields and vice-versa)

SameRecord(r1,r2) ^ SameRecord(r2,r3) SameRecord(r1,r3)

(Transitivity)

Predicates & Formulas

Winner of Best Paper Award!

Page 71: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Discovery of Social Relationships in Consumer Photo Collections

Find out all the photos of my child’s friends!

Page 72: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

MLN Rules

Hard Rules Two people share a unique relationship Parents are older than their children …

Soft Rules Children’s friends are of similar age Young adult appearing with a child shares a parent-

child relationship Friends and relatives are clustered together …

Page 73: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Models Compared

Model Description

Random Random prediction (uniform)

Prior Prediction based on priors

Hard Hard constraints only

Hard-prior Combination of hard and prior

MLN Hand constructed MLN

Page 74: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Results (7 Different Relationships)

Page 75: Markov Logic Parag Singla Dept. of Computer Science University of Texas, Austin

Alchemy

Open-source software including: Full first-order logic syntax MPE inference, marginal inference Generative & discriminative weight learning Structure learning Programming language features

alchemy.cs.washington.edu