Jiří Vomlel - avcr.czstaff.utia.cas.cz/vomlel/slides/presentace-karny.pdf · 2005-09-02 ·...
Some applications of Bayesian networks
Jiří Vomlel
Institute of Information Theory and Automation
Academy of Sciences of the Czech Republic
This presentation is available at http://www.utia.cas.cz/vomlel/
Contents
• Brief introduction to Bayesian networks
• Typical tasks that can be solved using Bayesian networks
• 1: Medical diagnosis (a very simple example)
• 2: Decision making maximizing expected utility (another simple example)
• 3: Adaptive testing (a case study)
• 4: Decision-theoretic troubleshooting (a commercial product)
Bayesian network
• a directed acyclic graph $G = (V, E)$
• each node $i \in V$ corresponds to a random variable $X_i$ with a finite set $\mathcal{X}_i$ of mutually exclusive states
• $pa(i)$ denotes the set of parents of node $i$ in graph $G$
• to each node $i \in V$ corresponds a conditional probability table $P(X_i \mid (X_j)_{j \in pa(i)})$
• the DAG implies conditional independence relations between $(X_i)_{i \in V}$
• d-separation (Pearl, 1986) can be used to read the CI relations from the DAG
Using the chain rule we have that:
$$P((X_i)_{i \in V}) = \prod_{i \in V} P(X_i \mid X_{i-1}, \ldots, X_1)$$
Assume an ordering of $X_i$, $i \in V$, such that if $j \in pa(i)$ then $j < i$. From the DAG we can read conditional independence relations
$$X_i \perp\!\!\!\perp X_k \mid (X_j)_{j \in pa(i)} \quad \text{for } i \in V \text{ and } k < i \text{ and } k \notin pa(i)$$
Using the conditional independence relations from the DAG we get
$$P((X_i)_{i \in V}) = \prod_{i \in V} P(X_i \mid (X_j)_{j \in pa(i)}) .$$
It is the joint probability distribution represented by the Bayesian network.
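Example:
[Figure: DAG with nodes X1, ..., X9 and edges X1 → X3, X1 → X5, X2 → X4, X3 → X6, X4 → X6, X5 → X7, X6 → X8, X7 → X8, X6 → X9. The nodes carry the tables P(X1), P(X2), P(X3 | X1), P(X4 | X2), P(X5 | X1), P(X6 | X3, X4), P(X7 | X5), P(X8 | X7, X6), P(X9 | X6).]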
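$$\begin{aligned}
P(X_1, \ldots, X_9) &= P(X_9 \mid X_8, \ldots, X_1) \cdot P(X_8 \mid X_7, \ldots, X_1) \cdot \ldots \cdot P(X_2 \mid X_1) \cdot P(X_1) \\
&= P(X_9 \mid X_6) \cdot P(X_8 \mid X_7, X_6) \cdot P(X_7 \mid X_5) \cdot P(X_6 \mid X_4, X_3) \\
&\quad \cdot P(X_5 \mid X_1) \cdot P(X_4 \mid X_2) \cdot P(X_3 \mid X_1) \cdot P(X_2) \cdot P(X_1)
\end{aligned}$$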
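The factorization can be checked numerically. The sketch below is only an illustration: it assumes binary variables and fills the local tables with random numbers (the slides give no values), and the names `PARENTS`, `random_cpts` and `joint` are made up for this example. Only the parent sets are taken from the factorization above.

```python
import itertools
import random

# Parent sets read off the example factorization (node k stands for X_k).
PARENTS = {1: (), 2: (), 3: (1,), 4: (2,), 5: (1,),
           6: (3, 4), 7: (5,), 8: (7, 6), 9: (6,)}


def random_cpts(parents, seed=0):
    """Hypothetical binary CPTs: cpts[i][parent_values] = P(X_i = 1 | parents)."""
    rng = random.Random(seed)
    return {i: {pv: rng.uniform(0.05, 0.95)
                for pv in itertools.product((0, 1), repeat=len(pa))}
            for i, pa in parents.items()}


def joint(x, parents, cpts):
    """P(X_1 = x[1], ..., X_9 = x[9]) as the product of the local tables."""
    p = 1.0
    for i, pa in parents.items():
        p_one = cpts[i][tuple(x[j] for j in pa)]
        p *= p_one if x[i] == 1 else 1.0 - p_one
    return p


CPTS = random_cpts(PARENTS)

# Summing the product over all 2**9 configurations should give 1.
TOTAL = sum(joint(dict(zip(PARENTS, values)), PARENTS, CPTS)
            for values in itertools.product((0, 1), repeat=len(PARENTS)))

# Number of free parameters stored in the local tables.
N_PARAMS = sum(len(table) for table in CPTS.values())
print(round(TOTAL, 12), N_PARAMS)
```

With binary variables the nine local tables hold 20 free numbers, whereas an unstructured joint table over nine binary variables would need 2^9 − 1 = 511.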
Typical use of Bayesian networks
• to model and explain a domain.
• to update beliefs about states of certain variables when some other variables were observed, i.e., computing conditional probability distributions, e.g., $P(X_{23} \mid X_{17} = yes, X_{54} = no)$.
• to find most probable configurations of variables
• to support decision making under uncertainty
• to find good strategies for solving tasks in a domain with uncertainty.
Simplified diagnostic example
We have a patient.
Possible diagnoses: tuberculosis, lung cancer, bronchitis.
We don’t know anything about the patient.
Patient is a smoker.
Patient is a smoker.
... and he complains about dyspnoea
Patient is a smoker and complains about dyspnoea
... and his X-ray is positive
Patient is a smoker and complains about dyspnoea and his X-ray is positive
... and he visited Asia recently
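The slides show these updates only as screenshots. The following sketch reproduces the same kind of belief updating by brute-force enumeration of the joint distribution. The network structure and all numbers are the commonly quoted textbook values for the "Asia" (chest clinic) example and are an assumption here, since the slides do not list them. The variable names are illustrative.

```python
import itertools

VARS = ("asia", "smoke", "tub", "lung", "bronc", "either", "xray", "dysp")


def p_true(var, a):
    """P(var = True | parents) under the assumed textbook-style parameters."""
    if var == "asia":
        return 0.01
    if var == "smoke":
        return 0.5
    if var == "tub":
        return 0.05 if a["asia"] else 0.01
    if var == "lung":
        return 0.10 if a["smoke"] else 0.01
    if var == "bronc":
        return 0.60 if a["smoke"] else 0.30
    if var == "either":  # deterministic OR of tuberculosis and lung cancer
        return 1.0 if (a["tub"] or a["lung"]) else 0.0
    if var == "xray":
        return 0.98 if a["either"] else 0.05
    if var == "dysp":
        if a["either"]:
            return 0.90 if a["bronc"] else 0.70
        return 0.80 if a["bronc"] else 0.10
    raise KeyError(var)


def joint(a):
    """Joint probability of a full assignment as a product of local tables."""
    p = 1.0
    for var in VARS:
        pt = p_true(var, a)
        p *= pt if a[var] else 1.0 - pt
    return p


def posterior(query, evidence):
    """P(query = True | evidence) by summing the joint over all assignments."""
    num = den = 0.0
    for values in itertools.product((False, True), repeat=len(VARS)):
        a = dict(zip(VARS, values))
        if any(a[k] != v for k, v in evidence.items()):
            continue
        p = joint(a)
        den += p
        if a[query]:
            num += p
    return num / den


# Add the observations in the order used on the slides.
EVIDENCE_STEPS = [{}, {"smoke": True}, {"dysp": True}, {"xray": True}, {"asia": True}]
evidence = {}
for step in EVIDENCE_STEPS:
    evidence.update(step)
    beliefs = {d: round(posterior(d, evidence), 3) for d in ("tub", "lung", "bronc")}
    print(evidence, beliefs)
```

Enumeration is exponential in the number of variables, so it is only practical for toy networks such as this one. Tools like Hugin use more efficient inference algorithms.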
Application 2: Decision making
The goal: maximize expected utility
Hugin example: mildew4.net
Fixed and Adaptive Test Strategies
[Figure: a fixed test strategy, in which questions are asked in a predetermined order, next to an adaptive test strategy drawn as a tree, in which the next question (from Q1 to Q10) depends on whether each previous answer was correct or wrong.]
[Figure: strategy tree whose nodes are labelled X1, X2 and X3.]
For all nodes n of a strategy s we have defined:
• evidence $e_n$, i.e. outcomes of steps performed to get to node n,
• probability $P(e_n)$ of getting to node n, and
• utility $f(e_n)$ being a real number.
Let $L(s)$ be the set of terminal nodes of strategy s.
Expected utility of strategy s is
$$E_f(s) = \sum_{\ell \in L(s)} P(e_\ell) \cdot f(e_\ell).$$
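The definition of expected utility can be sketched in a few lines. The tree below is hypothetical: the class `Node`, the probabilities and the utilities are made up for illustration. Only the formula, a sum of P(e_ℓ) · f(e_ℓ) over terminal nodes, comes from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node n of a strategy: P(e_n), f(e_n), and children (empty for a leaf)."""
    prob: float
    utility: float = 0.0
    children: list = field(default_factory=list)


def leaves(node):
    """Yield the terminal nodes L(s) below the given node."""
    if not node.children:
        yield node
    else:
        for child in node.children:
            yield from leaves(child)


def expected_utility(root):
    """E_f(s): sum of P(e_l) * f(e_l) over the terminal nodes."""
    return sum(leaf.prob * leaf.utility for leaf in leaves(root))


# Hypothetical two-step strategy; leaf probabilities sum to 1.
strategy = Node(1.0, children=[
    Node(0.6, children=[Node(0.45, 2.0), Node(0.15, 1.0)]),
    Node(0.4, children=[Node(0.10, 1.5), Node(0.30, 0.5)]),
])
print(expected_utility(strategy))
```

Only the leaves contribute to the sum. The utilities of inner nodes matter when steps are chosen one at a time, as in a myopic strategy.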
[Figure: strategy tree whose nodes are labelled X1, X2 and X3.]
Strategy $s^\star$ is optimal iff it maximizes its expected utility.
Strategy s is myopically optimal iff each step of strategy s is selected so that it maximizes expected utility after the selected step is performed (one step look ahead).
Application 3: Adaptive test of basic operations with fractions
Examples of tasks:
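T1: $\left(\frac{3}{4} \cdot \frac{5}{6}\right) - \frac{1}{8} = \frac{15}{24} - \frac{1}{8} = \frac{5}{8} - \frac{1}{8} = \frac{4}{8} = \frac{1}{2}$
T2: $\frac{1}{6} + \frac{1}{12} = \frac{2}{12} + \frac{1}{12} = \frac{3}{12} = \frac{1}{4}$
T3: $\frac{1}{4} \cdot 1\frac{1}{2} = \frac{1}{4} \cdot \frac{3}{2} = \frac{3}{8}$
T4: $\left(\frac{1}{2} \cdot \frac{1}{2}\right) \cdot \left(\frac{1}{3} + \frac{1}{3}\right) = \frac{1}{4} \cdot \frac{2}{3} = \frac{2}{12} = \frac{1}{6}$.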
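Elementary and operational skills

| Label | Skill | Example |
|---|---|---|
| CP | Comparison (common numerator or denominator) | $\frac{1}{2} > \frac{1}{3}$, $\frac{2}{3} > \frac{1}{3}$ |
| AD | Addition (comm. denom.) | $\frac{1}{7} + \frac{2}{7} = \frac{1+2}{7} = \frac{3}{7}$ |
| SB | Subtract. (comm. denom.) | $\frac{2}{5} - \frac{1}{5} = \frac{2-1}{5} = \frac{1}{5}$ |
| MT | Multiplication | $\frac{1}{2} \cdot \frac{3}{5} = \frac{3}{10}$ |
| CD | Common denominator | $\left(\frac{1}{2}, \frac{2}{3}\right) = \left(\frac{3}{6}, \frac{4}{6}\right)$ |
| CL | Cancelling out | $\frac{4}{6} = \frac{2 \cdot 2}{2 \cdot 3} = \frac{2}{3}$ |
| CIM | Conv. to mixed numbers | $\frac{7}{2} = \frac{3 \cdot 2 + 1}{2} = 3\frac{1}{2}$ |
| CMI | Conv. to improp. fractions | $3\frac{1}{2} = \frac{3 \cdot 2 + 1}{2} = \frac{7}{2}$ |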
Misconceptions
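| Label | Description | Occurrence |
|---|---|---|
| MAD | $\frac{a}{b} + \frac{c}{d} = \frac{a+c}{b+d}$ | 14.8% |
| MSB | $\frac{a}{b} - \frac{c}{d} = \frac{a-c}{b-d}$ | 9.4% |
| MMT1 | $\frac{a}{b} \cdot \frac{c}{b} = \frac{a \cdot c}{b}$ | 14.1% |
| MMT2 | $\frac{a}{b} \cdot \frac{c}{b} = \frac{a+c}{b \cdot b}$ | 8.1% |
| MMT3 | $\frac{a}{b} \cdot \frac{c}{d} = \frac{a \cdot d}{b \cdot c}$ | 15.4% |
| MMT4 | $\frac{a}{b} \cdot \frac{c}{d} = \frac{a \cdot c}{b+d}$ | 8.1% |
| MC | $a\frac{b}{c} = \frac{a \cdot b}{c}$ | 4.0% |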
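To make two of these rules concrete, the sketch below applies MAD and MMT3 to the fractions used in tasks T2 and T1 and compares them with the correct results. The function names are made up for this illustration. The rules themselves are the ones listed in the table.

```python
from fractions import Fraction


def mad(a, b, c, d):
    """Misconception MAD: a/b + c/d computed as (a + c) / (b + d)."""
    return Fraction(a + c, b + d)


def mmt3(a, b, c, d):
    """Misconception MMT3: a/b * c/d computed as (a * d) / (b * c)."""
    return Fraction(a * d, b * c)


# Task T2: 1/6 + 1/12
correct_sum = Fraction(1, 6) + Fraction(1, 12)
wrong_sum = mad(1, 6, 1, 12)

# First step of task T1: 3/4 * 5/6
correct_product = Fraction(3, 4) * Fraction(5, 6)
wrong_product = mmt3(3, 4, 5, 6)

print(correct_sum, wrong_sum, correct_product, wrong_product)
```

A student with misconception MAD would answer 1/9 instead of 1/4 on T2, which is the kind of observable difference the test can exploit.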
Student model
[Figure: Bayesian network of the student model with nodes CP, ACD, HV1, HV2, AD, SB, CMI, CIM, CL, CD, MT, MMT1, MMT2, MMT3, MMT4, MC, MAD, MSB, ACMI, ACIM and ACL.]
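Evidence model for task T1
$$\left(\frac{3}{4} \cdot \frac{5}{6}\right) - \frac{1}{8} = \frac{15}{24} - \frac{1}{8} = \frac{5}{8} - \frac{1}{8} = \frac{4}{8} = \frac{1}{2}$$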
T1 ⇔ MT & CL & ACL & SB & ¬MMT3 & ¬MMT4 & ¬MSB
[Figure: evidence model network with nodes CL, MMT4, MSB, SB, MMT3, ACL, MT, T1 and X1; the table P(X1 | T1) is attached to X1.]
Hugin: model-hv-2.net
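The logical rule for T1 can be written down directly. In the sketch below, T1 is read as "the student is able to solve task T1" and X1 as the observed answer, which is an interpretation of the slide rather than something it states. The two numbers in `P_CORRECT_GIVEN_T1` are hypothetical stand-ins for the table P(X1 | T1).

```python
NEEDED_SKILLS = ("MT", "CL", "ACL", "SB")
EXCLUDED_MISCONCEPTIONS = ("MMT3", "MMT4", "MSB")


def t1_holds(state):
    """T1 <=> MT & CL & ACL & SB & not MMT3 & not MMT4 & not MSB."""
    has_skills = all(state[s] for s in NEEDED_SKILLS)
    has_misconception = any(state[m] for m in EXCLUDED_MISCONCEPTIONS)
    return has_skills and not has_misconception


# Hypothetical P(X1 = correct | T1): leaves room for slips and lucky guesses.
P_CORRECT_GIVEN_T1 = {True: 0.9, False: 0.05}


def p_correct(state):
    """Probability of a correct answer to task T1 for a given student state."""
    return P_CORRECT_GIVEN_T1[t1_holds(state)]


able = {"MT": True, "CL": True, "ACL": True, "SB": True,
        "MMT3": False, "MMT4": False, "MSB": False}
confused = dict(able, MMT3=True)
print(p_correct(able), p_correct(confused))
```

Keeping T1 deterministic and putting all the noise into P(X1 | T1) means only two numbers per task have to be specified.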
Using information gain as the utility function
“The lower the entropy of a probability distribution the more we know.”
$$H(P(X)) = -\sum_{x} P(X = x) \cdot \log P(X = x)$$
[Figure: entropy of a binary distribution plotted against probability; both axes run from 0 to 1.]
Information gain in a node n of a strategy
$$IG(e_n) = H(P(S)) - H(P(S \mid e_n))$$
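A minimal sketch of these two formulas, together with a one-step look-ahead choice of the next question, is given below. It assumes base-2 logarithms (the slide writes only "log") and a single binary skill variable S. The prior, the answer models and the question names are hypothetical.

```python
import math


def entropy(dist):
    """Shannon entropy (base 2 assumed) of a list of probabilities."""
    return -sum(p * math.log2(p) for p in dist if p > 0.0)


def information_gain(prior, posterior):
    """IG(e_n) = H(P(S)) - H(P(S | e_n))."""
    return entropy(prior) - entropy(posterior)


def expected_information_gain(prior, likelihood):
    """Expected IG of a question, where likelihood[a][s] = P(answer a | S = s)."""
    total = 0.0
    for row in likelihood:
        p_answer = sum(lik * p for lik, p in zip(row, prior))
        if p_answer == 0.0:
            continue
        posterior = [lik * p / p_answer for lik, p in zip(row, prior)]
        total += p_answer * information_gain(prior, posterior)
    return total


PRIOR = [0.5, 0.5]  # hypothetical P(S): skill absent, skill present
QUESTIONS = {       # hypothetical answer models; rows are wrong / correct
    "easy": [[0.3, 0.1], [0.7, 0.9]],
    "discriminating": [[0.9, 0.1], [0.1, 0.9]],
}

# Myopic choice: the question with the largest expected information gain.
BEST = max(QUESTIONS, key=lambda q: expected_information_gain(PRIOR, QUESTIONS[q]))
print(BEST)
```

Using information gain as the utility f(e_n) turns the one-step look-ahead rule into "ask the question whose answer is expected to reduce the entropy of the skill distribution the most".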
Skill Prediction Quality
[Figure: quality of skill predictions (y-axis, 74 to 92) against the number of answered questions (x-axis, 0 to 20) for four question orderings: adaptive, average, descending and ascending.]
Application 4: Troubleshooting
Dezide Advisor customized to a specific portal, seen from the user’s perspective through a web browser.
Application 4: Troubleshooting - Light print problem
[Figure: troubleshooting model with a Problem node, fault nodes F1 to F4, action nodes A1 to A3 and question node Q1.]
• Problems: F1 Distribution problem, F2 Defective toner, F3 Corrupted dataflow, and F4 Wrong driver setting.
• Actions: A1 Remove, shake and reseat toner, A2 Try another toner, and A3 Cycle power.
• Questions: Q1 Is the configuration page printed light?
Troubleshooting strategy
[Figure: troubleshooting strategy tree. The root is question Q1. After Q1 = no the strategy performs A1 and, if A1 = no, A2. After Q1 = yes it performs A2 and, if A2 = no, A1.]
The task is to find a strategy $s \in S$ minimising expected cost of repair
$$ECR(s) = \sum_{\ell \in L(s)} P(e_\ell) \cdot \left( t(e_\ell) + c(e_\ell) \right) .$$
Expected cost of repair for a given strategy
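[Figure: the troubleshooting strategy tree. The root is question Q1. After Q1 = no the strategy performs A1 and, if A1 = no, A2. After Q1 = yes it performs A2 and, if A2 = no, A1.]
$$\begin{aligned}
ECR(s) ={} & P(Q_1 = \text{no}, A_1 = \text{yes}) \cdot \left(c_{Q_1} + c_{A_1}\right) \\
& + P(Q_1 = \text{no}, A_1 = \text{no}, A_2 = \text{yes}) \cdot \left(c_{Q_1} + c_{A_1} + c_{A_2}\right) \\
& + P(Q_1 = \text{no}, A_1 = \text{no}, A_2 = \text{no}) \cdot \left(c_{Q_1} + c_{A_1} + c_{A_2} + c_{CS}\right) \\
& + P(Q_1 = \text{yes}, A_2 = \text{yes}) \cdot \left(c_{Q_1} + c_{A_2}\right) \\
& + P(Q_1 = \text{yes}, A_2 = \text{no}, A_1 = \text{yes}) \cdot \left(c_{Q_1} + c_{A_2} + c_{A_1}\right) \\
& + P(Q_1 = \text{yes}, A_2 = \text{no}, A_1 = \text{no}) \cdot \left(c_{Q_1} + c_{A_2} + c_{A_1} + c_{CS}\right)
\end{aligned}$$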
Demo: www.dezide.com, Products / Demo / "Try out expert mode"
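The six-term sum for ECR(s) can be evaluated mechanically once the leaf probabilities and step costs are known. In the sketch below all numbers are hypothetical, and the term c_CS is read as the cost incurred when both actions fail (for example, calling service), which is an assumption about the slide's notation.

```python
# Hypothetical step costs, including the assumed give-up cost "CS".
COST = {"Q1": 1.0, "A1": 2.0, "A2": 5.0, "CS": 50.0}

# One entry per terminal node: (P(e_l), steps paid for on the way there).
# The probabilities are made up and sum to 1.
LEAVES = [
    (0.30, ("Q1", "A1")),              # Q1 = no,  A1 = yes
    (0.15, ("Q1", "A1", "A2")),        # Q1 = no,  A1 = no,  A2 = yes
    (0.05, ("Q1", "A1", "A2", "CS")),  # Q1 = no,  A1 = no,  A2 = no
    (0.35, ("Q1", "A2")),              # Q1 = yes, A2 = yes
    (0.10, ("Q1", "A2", "A1")),        # Q1 = yes, A2 = no,  A1 = yes
    (0.05, ("Q1", "A2", "A1", "CS")),  # Q1 = yes, A2 = no,  A1 = no
]


def expected_cost_of_repair(leaves, cost):
    """Sum over terminal nodes of P(e_l) times the accumulated cost."""
    return sum(p * sum(cost[step] for step in steps) for p, steps in leaves)


ECR = expected_cost_of_repair(LEAVES, COST)
print(round(ECR, 6))
```

Comparing this value across alternative orderings of Q1, A1 and A2 is what the search for a strategy minimising ECR amounts to.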
Commercial applications of Bayesian networks in educational testing and troubleshooting
• Hugin Expert A/S.
  software product: Hugin - a Bayesian network tool.
  http://www.hugin.com/
• Educational Testing Service (ETS)
  the world’s largest private educational testing organization
  Research unit doing research on adaptive tests using Bayesian networks: http://www.ets.org/research/
• SACSO Project
  Systems for Automatic Customer Support Operations - research project of Hewlett Packard and Aalborg University.
  The troubleshooter offered as DezisionWorks by Dezide Ltd.
  http://www.dezide.com/