approximate inference on lpads (rcra 2009)
DESCRIPTION
This poster, presented at RCRA 2009, summarises my work for the MSc thesis. It introduces some fundamentals on LPADs, the standard inference mechanisms with a choice of two approximate algorithms and some graphs showing our findings.TRANSCRIPT
Stefano BragagliaDEIS — Università di [email protected]
Fabrizio RiguzziENDIF — Università di Ferrara
Approximate Inference for
Logic Programs with Annotated Disjunctions
Exp
eri
me
nts
& R
es
ult
s
0
2
4
6
8
10
12
200 600 1000 1400 1800 2200 2600 3000 3400 3800 4200
An
swe
rs
Size (edges)
Biological Graphs: number of successes
Standard
Best K
Monte Carlo
0,001
0,01
0,1
1
10
100
1000
10000
100000
20
0
40
0
60
0
80
0
10
00
12
00
14
00
16
00
18
00
20
00
22
00
24
00
26
00
28
00
30
00
32
00
34
00
36
00
38
00
40
00
42
00
Tim
e (
log
s)
Size (edges)
Biological Graphs: CPU times averaged on successes
Standard
Best K
Monte Carlo
0.493
0.665
0.6020.523
0.643
0.567 0.713 0.621
EntrezGene_11803
PubMed_12653 HGNC_620
PubMed_1741124
PubMed_157782EntrezProtein_8147603
PubMed_15529
HGNC_1505
EntrezProtein_7891032
Biological Graphs
0.3
0.3
0.3
0.3
0.3
0.3
0,000001
0,00001
0,0001
0,001
0,01
0,1
1
10
100
1000
1
19
37
55
73
91
10
9
12
7
14
5
16
3
18
1
19
9
21
7
23
5
25
3
27
1
28
9
Tim
e (
log
s)
Size (steps)
Parachutes Graphs: CPU times
Standard
Best K
Monte Carlo
0.3
0.3
0.3
0.3
0.3
0.3
0,000001
0,00001
0,0001
0,001
0,01
0,1
1
10
100
1000
1
19
37
55
73
91
10
9
12
7
14
5
16
3
18
1
19
9
21
7
23
5
25
3
27
1
28
9
Tim
e (
log
s)
Size (steps)
Lanes Graphs: CPU times
Standard
Best K
Monte Carlo
0.3
0.3
0.3
0.3
0.3
0.3
0.3
0.3
0.3
0.3
0.3
0,000001
0,00001
0,0001
0,001
0,01
0,1
1
10
100
1000
1 2 3 4 5 6 7 8 9 1011121314151617181920
Tim
e (
log
s)
Size (steps)
Branches Graphs: CPU times
Standard
Best K
Monte Carlo
Artificial Graphs
Details and Notes• Datasets: the first dataset is a graph that describes a real case where nodes, edges and probabilities respectively represent the biological entities responsible of Alzheimer’s disease, the relationships betweenthem and their strengths; we considered 10 samples composed by 50 subgraphs of increasing size each. The other graphs are artificial.• Queries: compute the probability that a path exists between two given nodes of each graph.• Tests: performed on Linux machines equipped with Intel Core 2 Duo E6550 (2333 MHz) and 4 Gb RAM with a 24 hours time limit.
Infe
ren
ce
& A
pp
roxi
ma
tio
nStandard InferenceThe cplint system (http://www.ing.unife.it/software/cplint/) can perform inference on LPADs in two steps by computing:1. the explanations of the query with a meta-interpreter that performs resolution and keeps current set of choices;2. the probability of the query by making explanations mutually exclusive with a dynamic programming algorithm that
traverses their equivalent Binary Decision Diagram (BDD).
:- flu(david).
:- hayfever(david).
?- sneezing(david).
(C1;{X1/david};0) (C2;{X2/david};0)
Explanations found:• [ (C1; {X1/david}; 0) ]• [ (C2; {X2/david}; 0) ]
Explanations of Medical Diagnosis
:- edge(a,Node), path(Node,c).
:- path(c,c).
:- path(b,c).
?- path(a,c).
(C3;{Node1/c};0) (C4;{Node2/b};0)
:- edge(b,Node), path(Node,c).
:- path(c,c).(C5;{Node3/c};0)
Explanations found:• [ (C3; {Node1/c}; 0) ]• [ (C4; {Node2/b}; 0), (C5; {Node3/c}; 0) ]
Explanations of Path Problem
XC1
XC2
0
10.7
0.3
0.6
0.4
Probability of Medical Diagnosis
P = 0.3×0.6 + 0.7 = 0.88
Xa,c
Xa,b
Xb,c
0
1
0.7
0.3
0.5
0.5
0.8
0.2
Probability of Path Problem
P = 0.3 + 0.7×0.5×0.8 = 0.58
Approximated InferenceIn some domains, standard inference may be impossible, therefore weconsidered the following approximate algorithms.
Monte Carlo AlgorithmThis is a stochastic algorithm based on a Monte Carlo approach.• Instances of the problem are sampled repeatedly from the LPAD. Thesampling is efficient because it only considers the clauses that are relevantfor the query.• The algorithm also uses a meta-interpreter that stochastically chooses headatoms of resolving clauses.• The fraction of sampled instances where the query is true gives theprobability of the query.Note: this method does not require BDDs to accomplish the computation ofthe probability of the query, which may be too demanding in the evaluationof large problems.
Best K AlgorithmThis is a deterministic algorithm based on a branch and bound technique.• It builds the set of explanations incrementally by keeping only the k mostprobable explanations. (A typical value for k is 64.)• The chosen explanations are used to build a BDD and to compute theirprobability.• Thus the algorithm provides a lower bound on the probability of the query.Note: considering only a limited number of proofs allows a better control ofthe overall complexity of inference, which is fundamental in case ofevaluation of large problems.
Sin
tax
& S
em
an
tic
of
LPA
Ds Motivations
• Increasing interest in combining logic and probability.• Many formalisms such as Markov Logic Networks, ProbLog and LogicPrograms with Annotated Disjunctions (LPADs) enrich logic with probabilisticinformation. LPAD is the most interesting one because:
- It can effectively express cause-effect relationships among events;- It can communicate well the possible effects of a single cause;- It can represent the contemporary contribution of more causes to the
same effect in a very natural way.
LPAD: An ExampleThe following is a simple medical diagnosis application:
C1: sneezing(X): 0.7 v not_sneezing(X): 0.3 ← flu(X).C2: sneezing(X): 0.6 v not_sneezing(X): 0.4 ← hayfever(X).C3: flu(andrew).C4: flu(david).C5: hayfever(david).C6: hayfever(robert).
It means that flu and hayfever may both be cause of sneezing with the givenprobability values. The information available let us determine the probabilityof sneezing in case of:• flu ex.: ?- sneezing(andrew). attended value: 0.7• hayfever ex.: ?- sneezing(robert). attended value: 0.6• both ex.: ?- sneezing(david). attended value: 0.88
Semantic of LPAD• It is based on the concept of instance that is a normal logic program obtained by choosing a logical atom from the head ofeach grounding of every clause of the LPAD.• The probability of an instance is computed by multiplying the probability values of all the atom choosen for that instance.• The probability of a query is given by the sum of the probabilities of each instance where the query is true.
Probability of the QueriesThe above considerations lead to the following results:
Medical DiagnosisQuery: ?- sneezing(david).
Instances where query is true / available instances: 3/4P = 0.7×0.6 + 0.7×0.4 + 0.3×0.6 = 0.88
Path ProblemQuery: ?- path(a,c).
Instances where query is true / available instances: 5/8P = 0.7×0.5×0.8 + 0.3×0.5×0.2 + 0.3×0.5×0.8 +
0.3×0.5×0.2 + 0.3×0.5×0.8 = 0.58
Syntax of LPAD0.3
0.5 0.8
b
ca C1: path(Node,Node).C2: path(Src,Dst) ← edge(Src,Node), path(Node,Dst).C3: edge(a,c): 0.3.C4: edge(a,b): 0.5.C5: edge(b,c): 0.8.
• An LPAD is a set of disjunctiveclauses.• Each atom in their head isannotated with a probability valuebetween 0 and 1.• The atoms of the head represent the mutually exclusive and exhaustive set of effects of the event represented by the body.• The sum of the probabilities associated to the head atoms must be 1 (if not, another atom with the missing probabilityvalue is understood).The above example depicts a typical path problem.