approximate inference on lpads (rcra 2009)

1
Stefano Bragaglia DEIS Università di Bologna [email protected] Fabrizio Riguzzi ENDIF Università di Ferrara [email protected] Approximate Inference for Logic Programs with Annotated Disjunctions Experiments & Results 0 2 4 6 8 10 12 200 600 1000 1400 1800 2200 2600 3000 3400 3800 4200 Answers Size (edges) Biological Graphs: number of successes Standard Best K Monte Carlo 0,001 0,01 0,1 1 10 100 1000 10000 100000 200 400 600 800 1000 1200 1400 1600 1800 2000 2200 2400 2600 2800 3000 3200 3400 3600 3800 4000 4200 Time (log s) Size (edges) Biological Graphs: CPU times averaged on successes Standard Best K Monte Carlo 0.493 0.665 0.602 0.523 0.643 0.567 0.713 0.621 EntrezGene_11803 PubMed_12653 HGNC_620 PubMed_1741124 PubMed_157782 EntrezProtein_8147603 PubMed_15529 HGNC_1505 EntrezProtein_7891032 Biological Graphs 0.3 0.3 0.3 0.3 0.3 0.3 0,000001 0,00001 0,0001 0,001 0,01 0,1 1 10 100 1000 1 19 37 55 73 91 109 127 145 163 181 199 217 235 253 271 289 Time (log s) Size (steps) Parachutes Graphs: CPU times Standard Best K Monte Carlo 0.3 0.3 0.3 0.3 0.3 0.3 0,000001 0,00001 0,0001 0,001 0,01 0,1 1 10 100 1000 1 19 37 55 73 91 109 127 145 163 181 199 217 235 253 271 289 Time (log s) Size (steps) Lanes Graphs: CPU times Standard Best K Monte Carlo 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0.3 0,000001 0,00001 0,0001 0,001 0,01 0,1 1 10 100 1000 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Time (log s) Size (steps) Branches Graphs: CPU times Standard Best K Monte Carlo Artificial Graphs Details and Notes Datasets: the first dataset is a graph that describes a real case where nodes, edges and probabilities respectively represent the biological entities responsible of Alzheimer’s disease, the relationships between them and their strengths; we considered 10 samples composed by 50 subgraphs of increasing size each. The other graphs are artificial. Queries: compute the probability that a path exists between two given nodes of each graph. Tests: performed on Linux machines equipped with Intel Core 2 Duo E6550 (2333 MHz) and 4 Gb RAM with a 24 hours time limit. Inference & Approximation Standard Inference The cplint system (http://www.ing.unife.it/software/cplint/ ) can perform inference on LPADs in two steps by computing: 1. the explanations of the query with a meta-interpreter that performs resolution and keeps current set of choices; 2. the probability of the query by making explanations mutually exclusive with a dynamic programming algorithm that traverses their equivalent Binary Decision Diagram (BDD). :- flu(david). :- hayfever(david). ?- sneezing(david). (C 1 ;{X 1 /david};0) (C 2 ;{X2/david};0) Explanations found: [ (C 1 ; {X 1 /david}; 0) ] [ (C 2 ; {X 2 /david}; 0) ] Explanations of Medical Diagnosis :- edge(a,Node), path(Node,c). :- path(c,c). :- path(b,c). ?- path(a,c). (C 3 ;{Node 1 /c};0) (C 4 ;{Node 2 /b};0) :- edge(b,Node), path(Node,c). :- path(c,c). (C 5 ;{Node 3 /c};0) Explanations found: [ (C 3 ; {Node 1 /c}; 0) ] [ (C 4 ; {Node 2 /b}; 0), (C 5 ; {Node 3 /c}; 0) ] Explanations of Path Problem X C1 X C2 0 1 0.7 0.3 0.6 0.4 Probability of Medical Diagnosis P = 0.3×0.6 + 0.7 = 0.88 X a,c X a,b X b,c 0 1 0.7 0.3 0.5 0.5 0.8 0.2 Probability of Path Problem P = 0.3 + 0.7×0.5×0.8 = 0.58 Approximated Inference In some domains, standard inference may be impossible, therefore we considered the following approximate algorithms. Monte Carlo Algorithm This is a stochastic algorithm based on a Monte Carlo approach. Instances of the problem are sampled repeatedly from the LPAD. The sampling is efficient because it only considers the clauses that are relevant for the query. The algorithm also uses a meta-interpreter that stochastically chooses head atoms of resolving clauses. The fraction of sampled instances where the query is true gives the probability of the query. Note: this method does not require BDDs to accomplish the computation of the probability of the query, which may be too demanding in the evaluation of large problems. Best K Algorithm This is a deterministic algorithm based on a branch and bound technique. It builds the set of explanations incrementally by keeping only the k most probable explanations. (A typical value for k is 64.) The chosen explanations are used to build a BDD and to compute their probability. Thus the algorithm provides a lower bound on the probability of the query. Note: considering only a limited number of proofs allows a better control of the overall complexity of inference, which is fundamental in case of evaluation of large problems. Sintax & Semantic of LPADs Motivations Increasing interest in combining logic and probability. Many formalisms such as Markov Logic Networks, ProbLog and Logic Programs with Annotated Disjunctions (LPADs) enrich logic with probabilistic information. LPAD is the most interesting one because: - It can effectively express cause-effect relationships among events; - It can communicate well the possible effects of a single cause; - It can represent the contemporary contribution of more causes to the same effect in a very natural way. LPAD: An Example The following is a simple medical diagnosis application: C 1 : sneezing(X): 0.7 v not_sneezing(X): 0.3 flu(X). C 2 : sneezing(X): 0.6 v not_sneezing(X): 0.4 hayfever(X). C 3 : flu(andrew). C 4 : flu(david). C 5 : hayfever(david). C 6 : hayfever(robert). It means that flu and hayfever may both be cause of sneezing with the given probability values. The information available let us determine the probability of sneezing in case of: flu ex.: ?- sneezing(andrew). attended value: 0.7 hayfever ex.: ?- sneezing(robert). attended value: 0.6 both ex.: ?- sneezing(david). attended value: 0.88 Semantic of LPAD It is based on the concept of instance that is a normal logic program obtained by choosing a logical atom from the head of each grounding of every clause of the LPAD. The probability of an instance is computed by multiplying the probability values of all the atom choosen for that instance. The probability of a query is given by the sum of the probabilities of each instance where the query is true. Probability of the Queries The above considerations lead to the following results: Medical Diagnosis Query: ?- sneezing(david). Instances where query is true / available instances: 3/4 P = 0.7×0.6 + 0.7×0.4 + 0.3×0.6 = 0.88 Path Problem Query: ?- path(a,c). Instances where query is true / available instances: 5/8 P = 0.7×0.5×0.8 + 0.3×0.5×0.2 + 0.3×0.5×0.8 + 0.3×0.5×0.2 + 0.3×0.5×0.8 = 0.58 Syntax of LPAD 0.3 0.5 0.8 b c a C 1 : path(Node,Node). C 2 : path(Src,Dst) edge(Src,Node), path(Node,Dst). C 3 : edge(a,c): 0.3. C 4 : edge(a,b): 0.5. C 5 : edge(b,c): 0.8. An LPAD is a set of disjunctive clauses. Each atom in their head is annotated with a probability value between 0 and 1. The atoms of the head represent the mutually exclusive and exhaustive set of effects of the event represented by the body. The sum of the probabilities associated to the head atoms must be 1 (if not, another atom with the missing probability value is understood). The above example depicts a typical path problem.

Upload: stefano-bragaglia

Post on 07-May-2015

46 views

Category:

Documents


2 download

DESCRIPTION

This poster, presented at RCRA 2009, summarises my work for the MSc thesis. It introduces some fundamentals on LPADs, the standard inference mechanisms with a choice of two approximate algorithms and some graphs showing our findings.

TRANSCRIPT

Page 1: Approximate Inference on LPADs (RCRA 2009)

Stefano BragagliaDEIS — Università di [email protected]

Fabrizio RiguzziENDIF — Università di Ferrara

[email protected]

Approximate Inference for

Logic Programs with Annotated Disjunctions

Exp

eri

me

nts

& R

es

ult

s

0

2

4

6

8

10

12

200 600 1000 1400 1800 2200 2600 3000 3400 3800 4200

An

swe

rs

Size (edges)

Biological Graphs: number of successes

Standard

Best K

Monte Carlo

0,001

0,01

0,1

1

10

100

1000

10000

100000

20

0

40

0

60

0

80

0

10

00

12

00

14

00

16

00

18

00

20

00

22

00

24

00

26

00

28

00

30

00

32

00

34

00

36

00

38

00

40

00

42

00

Tim

e (

log

s)

Size (edges)

Biological Graphs: CPU times averaged on successes

Standard

Best K

Monte Carlo

0.493

0.665

0.6020.523

0.643

0.567 0.713 0.621

EntrezGene_11803

PubMed_12653 HGNC_620

PubMed_1741124

PubMed_157782EntrezProtein_8147603

PubMed_15529

HGNC_1505

EntrezProtein_7891032

Biological Graphs

0.3

0.3

0.3

0.3

0.3

0.3

0,000001

0,00001

0,0001

0,001

0,01

0,1

1

10

100

1000

1

19

37

55

73

91

10

9

12

7

14

5

16

3

18

1

19

9

21

7

23

5

25

3

27

1

28

9

Tim

e (

log

s)

Size (steps)

Parachutes Graphs: CPU times

Standard

Best K

Monte Carlo

0.3

0.3

0.3

0.3

0.3

0.3

0,000001

0,00001

0,0001

0,001

0,01

0,1

1

10

100

1000

1

19

37

55

73

91

10

9

12

7

14

5

16

3

18

1

19

9

21

7

23

5

25

3

27

1

28

9

Tim

e (

log

s)

Size (steps)

Lanes Graphs: CPU times

Standard

Best K

Monte Carlo

0.3

0.3

0.3

0.3

0.3

0.3

0.3

0.3

0.3

0.3

0.3

0,000001

0,00001

0,0001

0,001

0,01

0,1

1

10

100

1000

1 2 3 4 5 6 7 8 9 1011121314151617181920

Tim

e (

log

s)

Size (steps)

Branches Graphs: CPU times

Standard

Best K

Monte Carlo

Artificial Graphs

Details and Notes• Datasets: the first dataset is a graph that describes a real case where nodes, edges and probabilities respectively represent the biological entities responsible of Alzheimer’s disease, the relationships betweenthem and their strengths; we considered 10 samples composed by 50 subgraphs of increasing size each. The other graphs are artificial.• Queries: compute the probability that a path exists between two given nodes of each graph.• Tests: performed on Linux machines equipped with Intel Core 2 Duo E6550 (2333 MHz) and 4 Gb RAM with a 24 hours time limit.

Infe

ren

ce

& A

pp

roxi

ma

tio

nStandard InferenceThe cplint system (http://www.ing.unife.it/software/cplint/) can perform inference on LPADs in two steps by computing:1. the explanations of the query with a meta-interpreter that performs resolution and keeps current set of choices;2. the probability of the query by making explanations mutually exclusive with a dynamic programming algorithm that

traverses their equivalent Binary Decision Diagram (BDD).

:- flu(david).

:- hayfever(david).

?- sneezing(david).

(C1;{X1/david};0) (C2;{X2/david};0)

Explanations found:• [ (C1; {X1/david}; 0) ]• [ (C2; {X2/david}; 0) ]

Explanations of Medical Diagnosis

:- edge(a,Node), path(Node,c).

:- path(c,c).

:- path(b,c).

?- path(a,c).

(C3;{Node1/c};0) (C4;{Node2/b};0)

:- edge(b,Node), path(Node,c).

:- path(c,c).(C5;{Node3/c};0)

Explanations found:• [ (C3; {Node1/c}; 0) ]• [ (C4; {Node2/b}; 0), (C5; {Node3/c}; 0) ]

Explanations of Path Problem

XC1

XC2

0

10.7

0.3

0.6

0.4

Probability of Medical Diagnosis

P = 0.3×0.6 + 0.7 = 0.88

Xa,c

Xa,b

Xb,c

0

1

0.7

0.3

0.5

0.5

0.8

0.2

Probability of Path Problem

P = 0.3 + 0.7×0.5×0.8 = 0.58

Approximated InferenceIn some domains, standard inference may be impossible, therefore weconsidered the following approximate algorithms.

Monte Carlo AlgorithmThis is a stochastic algorithm based on a Monte Carlo approach.• Instances of the problem are sampled repeatedly from the LPAD. Thesampling is efficient because it only considers the clauses that are relevantfor the query.• The algorithm also uses a meta-interpreter that stochastically chooses headatoms of resolving clauses.• The fraction of sampled instances where the query is true gives theprobability of the query.Note: this method does not require BDDs to accomplish the computation ofthe probability of the query, which may be too demanding in the evaluationof large problems.

Best K AlgorithmThis is a deterministic algorithm based on a branch and bound technique.• It builds the set of explanations incrementally by keeping only the k mostprobable explanations. (A typical value for k is 64.)• The chosen explanations are used to build a BDD and to compute theirprobability.• Thus the algorithm provides a lower bound on the probability of the query.Note: considering only a limited number of proofs allows a better control ofthe overall complexity of inference, which is fundamental in case ofevaluation of large problems.

Sin

tax

& S

em

an

tic

of

LPA

Ds Motivations

• Increasing interest in combining logic and probability.• Many formalisms such as Markov Logic Networks, ProbLog and LogicPrograms with Annotated Disjunctions (LPADs) enrich logic with probabilisticinformation. LPAD is the most interesting one because:

- It can effectively express cause-effect relationships among events;- It can communicate well the possible effects of a single cause;- It can represent the contemporary contribution of more causes to the

same effect in a very natural way.

LPAD: An ExampleThe following is a simple medical diagnosis application:

C1: sneezing(X): 0.7 v not_sneezing(X): 0.3 ← flu(X).C2: sneezing(X): 0.6 v not_sneezing(X): 0.4 ← hayfever(X).C3: flu(andrew).C4: flu(david).C5: hayfever(david).C6: hayfever(robert).

It means that flu and hayfever may both be cause of sneezing with the givenprobability values. The information available let us determine the probabilityof sneezing in case of:• flu ex.: ?- sneezing(andrew). attended value: 0.7• hayfever ex.: ?- sneezing(robert). attended value: 0.6• both ex.: ?- sneezing(david). attended value: 0.88

Semantic of LPAD• It is based on the concept of instance that is a normal logic program obtained by choosing a logical atom from the head ofeach grounding of every clause of the LPAD.• The probability of an instance is computed by multiplying the probability values of all the atom choosen for that instance.• The probability of a query is given by the sum of the probabilities of each instance where the query is true.

Probability of the QueriesThe above considerations lead to the following results:

Medical DiagnosisQuery: ?- sneezing(david).

Instances where query is true / available instances: 3/4P = 0.7×0.6 + 0.7×0.4 + 0.3×0.6 = 0.88

Path ProblemQuery: ?- path(a,c).

Instances where query is true / available instances: 5/8P = 0.7×0.5×0.8 + 0.3×0.5×0.2 + 0.3×0.5×0.8 +

0.3×0.5×0.2 + 0.3×0.5×0.8 = 0.58

Syntax of LPAD0.3

0.5 0.8

b

ca C1: path(Node,Node).C2: path(Src,Dst) ← edge(Src,Node), path(Node,Dst).C3: edge(a,c): 0.3.C4: edge(a,b): 0.5.C5: edge(b,c): 0.8.

• An LPAD is a set of disjunctiveclauses.• Each atom in their head isannotated with a probability valuebetween 0 and 1.• The atoms of the head represent the mutually exclusive and exhaustive set of effects of the event represented by the body.• The sum of the probabilities associated to the head atoms must be 1 (if not, another atom with the missing probabilityvalue is understood).The above example depicts a typical path problem.