TRANSCRIPT
Markov Logic
Parag Singla, Dept. of Computer Science
University of Texas, Austin
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
The Interface Layer
Interface Layer
Applications
Infrastructure
Networking
Interface Layer
Applications
Infrastructure
Internet
Routers  Protocols
WWW  Email
Databases
Interface Layer
Applications
Infrastructure
Relational Model
Query Optimization
Transaction Management
ERP
OLTP
CRM
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
First-Order Logic?
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
Graphical Models?
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
Statistical + Logical AI
Artificial Intelligence
Interface Layer
Applications
Infrastructure
Representation
Learning
Inference
NLP
Planning
Multi-Agent Systems
Vision
Robotics
Markov Logic
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
Markov Networks: Undirected graphical models
Cancer
Cough  Asthma
Smoking
Potential functions defined over cliques
Smoking  Cancer  Φ(S,C)
False    False   4.5
False    True    4.5
True     False   2.7
True     True    4.5
P(x) = \frac{1}{Z} \prod_{c} \Phi_c(x_c)

Z = \sum_{x} \prod_{c} \Phi_c(x_c)
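The product-of-potentials definition can be traced concretely with a toy sketch in Python, using the slide's two-variable table as the only clique (a minimal illustration, not a general implementation):

```python
from itertools import product

# Clique potential over {Smoking, Cancer}, taken from the slide's table
phi = {
    (False, False): 4.5,
    (False, True):  4.5,
    (True,  False): 2.7,
    (True,  True):  4.5,
}

# Partition function: Z = sum over all worlds of the clique potential
Z = sum(phi[s, c] for s, c in product([False, True], repeat=2))

def prob(s, c):
    """P(x) = (1/Z) * product over cliques (a single clique here)."""
    return phi[s, c] / Z
```

With these numbers, the least likely world is the smoker without cancer, since its potential (2.7) is the smallest.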
Markov Networks: Undirected graphical models
Log-linear model:
P(x) = \frac{1}{Z} \exp\Big( \sum_i w_i f_i(x) \Big)

w_i: weight of feature i;  f_i: feature i

f_1(\text{Smokes}, \text{Cancer}) = 1 if ¬Smokes ∨ Cancer holds, 0 otherwise
w_1 = 1.5

[Graph: Smoking, Cough, Asthma, Cancer]
First-Order Logic
Constants, variables, functions, predicates Anna, x, MotherOf(x), Friends(x,y)
Grounding: Replace all variables by constants, e.g., Friends(Anna, Bob)
Formula: Predicates connected by operators, e.g., Smokes(x) ⇒ Cancer(x)
Knowledge Base (KB): A set of formulas Can be equivalently converted into a clausal form
World: Assignment of truth values to all ground predicates
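The grounding step can be sketched directly, assuming the predicates of the running example (names and encoding are illustrative):

```python
from itertools import product

constants = ["Anna", "Bob"]
arities = {"Smokes": 1, "Cancer": 1, "Friends": 2}  # predicate -> arity

# Grounding: every predicate applied to every tuple of constants
ground_atoms = [
    "%s(%s)" % (p, ",".join(args))
    for p, k in arities.items()
    for args in product(constants, repeat=k)
]

# A world is a truth assignment to all ground atoms
world = dict.fromkeys(ground_atoms, False)
world["Smokes(Anna)"] = True
```

With 2 constants there are 2 + 2 + 4 = 8 ground atoms, so 2^8 possible worlds.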
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
Markov Logic
A logical KB is a set of hard constraints on the set of possible worlds
Let's make them soft constraints: when a world violates a formula, it becomes less probable, not impossible
Give each formula a weight (Higher weight ⇒ Stronger constraint)

P(\text{world}) \propto \exp\Big( \sum \text{weights of formulas it satisfies} \Big)
Definition
A Markov Logic Network (MLN) is a set of pairs (F, w) where F is a formula in first-order logic w is a real number
Together with a finite set of constants, it defines a Markov network with:
One node for each grounding of each predicate in the MLN
One feature for each grounding of each formula F in the MLN, with the corresponding weight w
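A minimal sketch of the template idea: each formula, paired with a weight, yields one binary feature per grounding. The formula and weight follow the slide's running example, but the encoding (formulas as Python functions) is an assumption for illustration:

```python
from itertools import product

constants = ["A", "B"]

# One MLN formula, encoded as a function of a world and its free variable:
# Smokes(x) => Cancer(x), weight 1.5 (the deck's running example)
def smokes_implies_cancer(world, x):
    return (not world["Smokes(%s)" % x]) or world["Cancer(%s)" % x]

mln = [(smokes_implies_cancer, 1.5, 1)]  # (formula, weight, #free variables)

def ground_features(world):
    """One weighted binary feature per grounding of each formula."""
    return [
        (w, 1.0 if f(world, *args) else 0.0)
        for f, w, k in mln
        for args in product(constants, repeat=k)
    ]
```

With two constants, the one-variable formula yields two ground features, each carrying the formula's weight.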
Example: Friends & Smokers

1.5  ∀x Smokes(x) ⇒ Cancer(x)
1.1  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Example: Friends & Smokers

Two constants: Ana (A) and Bob (B)

1.5  ∀x Smokes(x) ⇒ Cancer(x)
1.1  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Example: Friends & Smokers

Two constants: Ana (A) and Bob (B)

1.5  ∀x Smokes(x) ⇒ Cancer(x)
1.1  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))

Ground network nodes:
Friends(A,A)  Friends(A,B)  Friends(B,A)  Friends(B,B)
Smokes(A)  Smokes(B)  Cancer(A)  Cancer(B)
State of the world: a {0,1} assignment to the nodes
Markov Logic Networks MLN is template for ground Markov networks Probability of a world x:
One feature for each ground formula
P(x) = \frac{1}{Z} \exp\Big( \sum_{k \in \text{ground formulas}} w_k f_k(x) \Big)

f_k(x) = 1 if the kth ground formula is satisfied in x, 0 otherwise
Markov Logic Networks
MLN is template for ground Markov nets Probability of a world x:
P(x) = \frac{1}{Z} \exp\Big( \sum_{i \in \text{MLN formulas}} w_i n_i(x) \Big)

w_i: weight of formula i;  n_i(x): number of true groundings of formula i in x
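On the two-constant example this distribution can be computed exactly by brute force; a sketch with a single formula for brevity (the formula and weight are the deck's running example):

```python
import math
from itertools import product

constants = ["A", "B"]
atoms = ["%s(%s)" % (p, c) for p in ("Smokes", "Cancer") for c in constants]
w1 = 1.5  # weight of Smokes(x) => Cancer(x)

def n1(world):
    """n_1(x): number of true groundings of Smokes(x) => Cancer(x)."""
    return sum((not world["Smokes(%s)" % c]) or world["Cancer(%s)" % c]
               for c in constants)

# Enumerate all 2^4 worlds and normalize exactly
worlds = [dict(zip(atoms, vals))
          for vals in product([False, True], repeat=len(atoms))]
Z = sum(math.exp(w1 * n1(x)) for x in worlds)

def prob(world):
    return math.exp(w1 * n1(world)) / Z
```

Worlds that satisfy more groundings of the formula get exponentially more probability mass; a world violating both groundings is e^{2 w_1} times less likely than one satisfying both.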
Relation to Statistical Models
Special cases: Markov networks, Markov random fields, Bayesian networks, log-linear models, exponential models, max. entropy models, Gibbs distributions, Boltzmann machines, logistic regression, hidden Markov models, conditional random fields
Obtained by making all predicates zero-arity
Markov logic allows objects to be interdependent (non-i.i.d.)
Relation to First-Order Logic
Infinite weights ⇒ first-order logic
Satisfiable KB, positive weights ⇒ satisfying assignments = modes of distribution
Markov logic allows contradictions between formulas
Markov logic in infinite domains
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
Inference
Cancer(A)
Smokes(A)?  Friends(A,A)
Friends(B,A)
Smokes(B)?
Friends(A,B)
Cancer(B)?
Friends(B,B)
Most Probable Explanation (MPE) Inference
Cancer(A)
Smokes(A)?  Friends(A,A)
Friends(B,A)
Smokes(B)?
Friends(A,B)
Cancer(B)?
Friends(B,B)
What is the most likely state of Smokes(A), Smokes(B), Cancer(B)?
MPE Inference
Problem: Find most likely state of world given evidence
\arg\max_y P(y \mid x)

y: query;  x: evidence
MPE Inference
Problem: Find most likely state of world given evidence
\arg\max_y \frac{1}{Z_x} \exp\Big( \sum_i w_i n_i(x, y) \Big)
MPE Inference
Problem: Find most likely state of world given evidence
\arg\max_y \sum_i w_i n_i(x, y)
MPE Inference
Problem: Find most likely state of world given evidence
This is just the weighted MaxSAT problem Use weighted SAT solver
(e.g., MaxWalkSAT [Kautz et al. 97] )
\arg\max_y \sum_i w_i n_i(x, y)
Computing Probabilities: Marginal Inference
Cancer(A)
Smokes(A)?  Friends(A,A)
Friends(B,A)
Smokes(B)?
Friends(A,B)
Cancer(B)?
Friends(B,B)
What is the probability Smokes(B) = 1?
Marginal Inference
Problem: Find the probability of query atoms given evidence
P(y \mid x)

y: query;  x: evidence
Marginal Inference
Problem: Find the probability of query atoms given evidence
P(y \mid x) = \frac{1}{Z_x} \exp\Big( \sum_i w_i n_i(x, y) \Big)
Computing Zx takes exponential time!
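Because Z_x is intractable in general, sampling is a standard workaround. A Gibbs-sampling sketch for the toy one-formula MLN from the running example (an illustration of the idea, not Alchemy's sampler):

```python
import math
import random

w1 = 1.5                     # weight of Smokes(x) => Cancer(x)
constants = ["A", "B"]

def n1(world):
    return sum((not world["Smokes(%s)" % c]) or world["Cancer(%s)" % c]
               for c in constants)

def gibbs_marginal(query, evidence, n_samples=4000, seed=0):
    """Approximate P(query = True | evidence) by Gibbs sampling."""
    rng = random.Random(seed)
    atoms = ["%s(%s)" % (p, c) for p in ("Smokes", "Cancer") for c in constants]
    free = [a for a in atoms if a not in evidence]
    world = dict(evidence)
    for a in free:
        world[a] = rng.random() < 0.5
    hits = 0
    for _ in range(n_samples):
        for a in free:  # resample each non-evidence atom from its conditional
            world[a] = True
            pt = math.exp(w1 * n1(world))
            world[a] = False
            pf = math.exp(w1 * n1(world))
            world[a] = rng.random() < pt / (pt + pf)
        hits += world[query]
    return hits / n_samples
```

On this tiny model the answer can be checked analytically: given Smokes(A) = True, P(Cancer(A)) = e^{1.5} / (e^{1.5} + 1) ≈ 0.82, and the sample average converges to it.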
Belief Propagation
Bipartite network of nodes (variables) and features
In Markov logic: Nodes = Ground atoms, Features = Ground clauses
Exchange messages until convergence
Messages: current approximation to node marginals
Belief Propagation
Nodes (x)
Features (f)
Smokes(Ana)    Smokes(Ana) ∧ Friends(Ana,Bob) ⇒ Smokes(Bob)    Smokes(Bob)
Belief Propagation
Nodes (x)
Features (f)
\mu_{x \to f}(x) = \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)
Belief Propagation
Nodes (x)
Features (f)
\mu_{x \to f}(x) = \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)

\mu_{f \to x}(x) = \sum_{\sim \{x\}} \Big( e^{w f(\mathbf{x})} \prod_{y \in nb(f) \setminus \{x\}} \mu_{y \to f}(y) \Big)
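The two message equations can be traced on a tiny graph. The sketch below uses a single ground clause and treats Friends(Ana, Bob) as fixed evidence so that only two variables remain; on such a tree-structured graph one message pass is exact (an illustration, not a general BP engine):

```python
import math

w = 1.1  # weight of the (simplified) ground clause Smokes(Ana) => Smokes(Bob)

def f(sa, sb):
    return 1.0 if (not sa) or sb else 0.0

def factor(sa, sb):
    # Factor value e^{w f(x)} for each joint state
    return math.exp(w * f(sa, sb))

# Variable-to-factor messages: product over the variable's *other*
# neighboring factors; with no other factors they are uniform.
msg_sa_to_f = {True: 1.0, False: 1.0}
msg_sb_to_f = {True: 1.0, False: 1.0}

# Factor-to-variable message: sum out every variable except the target.
msg_f_to_sa = {
    sa: sum(factor(sa, sb) * msg_sb_to_f[sb] for sb in (True, False))
    for sa in (True, False)
}

# Belief at Smokes(Ana): normalized product of incoming messages.
Z = sum(msg_f_to_sa.values())
belief_sa = {v: m / Z for v, m in msg_f_to_sa.items()}
```

Since the graph is a tree, the belief matches the brute-force marginal exactly; on loopy ground networks BP gives only an approximation.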
Lifted Belief Propagation

\mu_{x \to f}(x) = \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)

\mu_{f \to x}(x) = \sum_{\sim \{x\}} \Big( e^{w f(\mathbf{x})} \prod_{y \in nb(f) \setminus \{x\}} \mu_{y \to f}(y) \Big)

Nodes (x)
Features (f)
Lifted Belief Propagation

\mu_{x \to f}(x) = \prod_{h \in nb(x) \setminus \{f\}} \mu_{h \to x}(x)

\mu_{f \to x}(x) = \sum_{\sim \{x\}} \Big( e^{w f(\mathbf{x})} \prod_{y \in nb(f) \setminus \{x\}} \mu_{y \to f}(y) \Big)

Message exponents: functions of edge counts

Nodes (x)
Features (f)
Overview
Motivation
Background
Markov logic
Inference
Learning
Applications
Learning Parameters

Three constants: Ana, Bob, John

w1 = ?  ∀x Smokes(x) ⇒ Cancer(x)
w2 = ?  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Learning Parameters

Three constants: Ana, Bob, John

w1 = ?  ∀x Smokes(x) ⇒ Cancer(x)
w2 = ?  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))

Database:
Smokes:  Smokes(Ana), Smokes(Bob)
Cancer:  Cancer(Ana), Cancer(Bob)
Friends: Friends(Ana, Bob), Friends(Bob, Ana), Friends(Ana, John), Friends(John, Ana)

Closed World Assumption: Anything not in the database is assumed false.
Learning Parameters
Data is a relational database
Closed world assumption (if not: EM)
Learning parameters (weights): generatively or discriminatively
Learning Parameters (Discriminative)
Given training data
Maximize conditional likelihood of query (y) given evidence (x)
Use gradient ascent
No local maxima
\frac{\partial}{\partial w_i} \log P_w(y \mid x) = n_i(x, y) - E_w[n_i(x, y)]

n_i(x, y): number of true groundings of clause i in the data
E_w[n_i(x, y)]: expected number of true groundings according to the model
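A sketch of gradient ascent on this objective, with the expectation computed exactly on a tiny invented dataset (three constants, a single formula; the smoking/cancer values are assumptions for illustration):

```python
import math
from itertools import product

constants = ["Ana", "Bob", "John"]
smokes = {"Ana": True, "Bob": True, "John": True}   # evidence x (assumed)
cancer = {"Ana": True, "Bob": True, "John": False}  # training query y (assumed)

def n(cancer_world):
    """n(x, y): true groundings of Smokes(c) => Cancer(c)."""
    return sum((not smokes[c]) or cancer_world[c] for c in constants)

def expected_n(w):
    """E_w[n(x, y)]: exact expectation over all 2^3 query worlds y."""
    worlds = [dict(zip(constants, v))
              for v in product([False, True], repeat=len(constants))]
    ws = [math.exp(w * n(y)) for y in worlds]
    Z = sum(ws)
    return sum(n(y) * wt for y, wt in zip(worlds, ws)) / Z

w, lr = 0.0, 0.5
for _ in range(300):
    w += lr * (n(cancer) - expected_n(w))  # gradient of log P_w(y | x)
```

Here every constant smokes and two of three have cancer, so the fixed point satisfies E_w[n] = 2, i.e. the per-grounding satisfaction probability is 2/3 and w converges to ln 2.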
Learning Parameters (Discriminative)
Requires inference at each step (slow!)
Approximate expected counts by counts in MAP state of y given x
\frac{\partial}{\partial w_i} \log P_w(y \mid x) = n_i(x, y) - E_w[n_i(x, y)]

n_i(x, y): number of true groundings of clause i in the data
E_w[n_i(x, y)]: expected number of true groundings according to the model
Learning Structure
Smokes
Smokes(Ana)
Smokes(Bob)
Cancer
Cancer(Ana)
Cancer(Bob)
Friends
Friends(Ana, Bob)
Friends(Bob, Ana)
Friends(Ana, John)
Friends(John, Ana)
Three constants: Ana, Bob, John
Can we learn the set of the formulas in the MLN?
Learning Structure
w1 = ?  ∀x Smokes(x) ⇒ Cancer(x)
w2 = ?  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Smokes
Smokes(Ana)
Smokes(Bob)
Cancer
Cancer(Ana)
Cancer(Bob)
Friends
Friends(Ana, Bob)
Friends(Bob, Ana)
Friends(Ana, John)
Friends(John, Ana)
Three constants: Ana, Bob, John
Can we refine the set of the formulas in the MLN?
Learning Structure
w1 = ?  ∀x Smokes(x) ⇒ Cancer(x)
w2 = ?  ∀x,y Friends(x,y) ⇒ (Smokes(x) ⇔ Smokes(y))
Smokes
Smokes(Ana)
Smokes(Bob)
Cancer
Cancer(Ana)
Cancer(Bob)
Friends
Friends(Ana, Bob)
Friends(Bob, Ana)
Friends(Ana, John)
Friends(John, Ana)
w3 = ?  ∀x,y Friends(x,y) ⇒ Friends(y,x)
Three constants: Ana, Bob, John
Learning Structure
Learn the MLN formulas given data
Search in the space of possible formulas
Exhaustive search not possible! Need to guide the search
Initial state: unit clauses or hand-coded KB
Operators: add/remove literal, flip sign
Evaluation function: likelihood + structure prior
Search: top-down, bottom-up, structural motifs
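The search operators can be sketched as a neighbor generator over clauses; the literal pool below is a toy assumption, and scoring each neighbor (likelihood plus structure prior) is left out:

```python
# Candidate literals over one variable (illustrative pool)
literals = ["Smokes(x)", "!Smokes(x)", "Cancer(x)", "!Cancer(x)"]

def flip(lit):
    """Flip the sign of a literal, using '!' for negation."""
    return lit[1:] if lit.startswith("!") else "!" + lit

def neighbors(clause):
    """Candidate clauses reachable by one search operator."""
    out = []
    # add a literal (skip ones already present in either sign)
    out += [clause + [l] for l in literals
            if l not in clause and flip(l) not in clause]
    # remove a literal
    out += [[l for l in clause if l != r] for r in clause]
    # flip the sign of a literal
    out += [[flip(l) if l == r else l for l in clause] for r in clause]
    return out
```

Beam or greedy search then repeatedly expands the best-scoring neighbors until the evaluation function stops improving.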
Overview
Motivation
Markov logic
Inference
Learning
Applications
Conclusion & Future Work
Applications
Web mining
Collective classification
Link prediction
Information retrieval
Entity resolution
Activity recognition
Image segmentation & de-noising
Social network analysis
Computational biology
Natural language processing
Robot mapping
Planning and MDPs
More...
Entity Resolution
Data Cleaning and Integration – first step in the data-mining process
Merging of data from multiple sources results in duplicates
Entity Resolution: Problem of identifying which of the records/fields refer to the same underlying entity
Entity Resolution
Parag Singla and Pedro Domingos, "Memory-Efficient Inference in Relational Domains" (AAAI-06).

Singla, P., & Domingos, P. (2006). Memory-efficent inference in relatonal domains. In Proceedings of the Twenty-First National Conference on Artificial Intelligence (pp. 500-505). Boston, MA: AAAI Press.

H. Poon & P. Domingos, "Sound and Efficient Inference with Probabilistic and Deterministic Dependencies", in Proc. AAAI-06, Boston, MA, 2006.

Poon H. (2006). Efficent inference. In Proceedings of the Twenty-First National Conference on Artificial Intelligence.
Author
Title
Venue
Predicates & Formulas

Predicates:
HasToken(field, token), HasField(record, field), SameField(field, field), SameRecord(record, record)

HasToken(f1,t) ∧ HasToken(f2,t) ⇒ SameField(f1,f2)
(Similarity of fields based on shared tokens)

HasField(r1,f1) ∧ HasField(r2,f2) ∧ SameField(f1,f2) ⇒ SameRecord(r1,r2)
(Similarity of records based on similarity of fields and vice-versa)

SameRecord(r1,r2) ∧ SameRecord(r2,r3) ⇒ SameRecord(r1,r3)
(Transitivity)
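The transitivity formula is what turns pairwise match decisions into clusters of records referring to the same entity; a hypothetical post-processing sketch using union-find:

```python
def find(parent, r):
    """Find the cluster representative of record r, with path compression."""
    while parent[r] != r:
        parent[r] = parent[parent[r]]
        r = parent[r]
    return r

def cluster(records, same_pairs):
    """Merge pairwise SameRecord(r1, r2) decisions into transitive clusters."""
    parent = {r: r for r in records}
    for a, b in same_pairs:
        parent[find(parent, a)] = find(parent, b)
    groups = {}
    for r in records:
        groups.setdefault(find(parent, r), []).append(r)
    return sorted(groups.values())
```

For example, the decisions SameRecord(1,2) and SameRecord(2,3) over records {1, 2, 3, 4} yield the clusters {1, 2, 3} and {4}, with SameRecord(1,3) implied by transitivity.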
Winner of Best Paper Award!
Discovery of Social Relationships in Consumer Photo Collections
Find out all the photos of my child’s friends!
MLN Rules
Hard Rules
  Two people share a unique relationship
  Parents are older than their children
  ...
Soft Rules
  Children's friends are of similar age
  A young adult appearing with a child shares a parent-child relationship
  Friends and relatives are clustered together
  ...
Models Compared
Model       Description
Random      Random prediction (uniform)
Prior       Prediction based on priors
Hard        Hard constraints only
Hard-prior  Combination of hard and prior
MLN         Hand-constructed MLN
Results (7 Different Relationships)
Alchemy
Open-source software including:
  Full first-order logic syntax
  MPE inference, marginal inference
  Generative & discriminative weight learning
  Structure learning
  Programming language features
alchemy.cs.washington.edu