graphical) models) bayesian)networks) seman&cs)&)...
TRANSCRIPT
![Page 1: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/1.jpg)
Daphne Koller
Seman&cs)&)Factoriza&on)
Probabilis&c)Graphical)Models) Bayesian)Networks)
Representa&on)
![Page 2: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/2.jpg)
Daphne Koller
• Grade • Course Difficulty • Student Intelligence • Student SAT • Reference Letter
P(G,D,I,S,L)
![Page 3: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/3.jpg)
Daphne Koller
Intelligence Difficulty
Grade
Letter
SAT
![Page 4: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/4.jpg)
Daphne Koller
Intelligence Difficulty
Grade
Letter
SAT
0.3 0.08 0.25
0.4 g2(B)
0.02 0.9 i1,d0 0.7 0.05 i0,d1
0.5
0.3 g1(A) g3(C)
0.2 i1,d1
0.3 i0,d0
l1 l0
0.99 0.4 0.1 0.9 g1
0.01 g3 0.6 g2
0.2 0.95
s0 s1
0.8 i1 0.05 i0
0.4 0.6 d1 d0
0.3 0.7 i1 i0
![Page 5: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/5.jpg)
Daphne Koller
Intelligence Difficulty
Grade
Letter
SAT
P(D) P(I)
P(G|I,D) P(S|I)
P(L|G)
Chain Rule for Bayesian Networks
P(D,I,G,S,L) = P(D) P(I) P(G|I,D) P(S|I) P(L|G)
Distribution defined as a product of factors!
![Page 6: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/6.jpg)
Daphne Koller
Intelligence Difficulty
Grade
Letter
SAT
0.3 0.08 0.25
0.4 g2
0.02 0.9 i1,d0 0.7 0.05 i0,d1
0.5
0.3 g1 g3
0.2 i1,d1
0.3 i0,d0
l1 l0
0.99 0.4 0.1 0.9 g1
0.01 g3 0.6 g2
0.2 0.95
s0 s1
0.8 i1 0.05 i0
0.4 0.6 d1 d0
0.3 0.7 i1 i0
P(d0, i1, g3, s1, l1) =
![Page 7: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/7.jpg)
Daphne Koller
Bayesian Network • A Bayesian network is: – A directed acyclic graph (DAG) G whose
nodes represent the random variables X1,…,Xn – For each node Xi a CPD P(Xi | ParG(Xi))
• The BN represents a joint distribution via the chain rule for Bayesian networks
P(X1,…,Xn) = Πi P(Xi | ParG(Xi))
![Page 8: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/8.jpg)
Daphne Koller
BN Is a Legal Distribution: P ≥ 0
![Page 9: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/9.jpg)
Daphne Koller
BN Is a Legal Distribution: ∑ P = 1 ∑D,I,G,S,L P(D,I,G,S,L) = ∑D,I,G,S,L P(D) P(I) P(G|I,D) P(S|I) P(L|G)
= ∑D,I,G,S P(D) P(I) P(G|I,D) P(S|I) ∑L P(L|G)
= ∑D,I,G,S P(D) P(I) P(G|I,D) P(S|I)
= ∑D,I,G P(D) P(I) P(G|I,D) ∑S P(S|I)
= ∑D,I P(D) P(I) ∑G P(G|I,D)
![Page 10: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/10.jpg)
Daphne Koller
P Factorizes over G • Let G be a graph over X1,…,Xn. • P factorizes over G if
P(X1,…,Xn) = Πi P(Xi | ParG(Xi))
![Page 11: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/11.jpg)
Daphne Koller
Genetic Inheritance
Homer
Bart
Marge
Lisa Maggie
Clancy Jackie
Selma
Genotype
Phenotype
AA, AB, AO, BO, BB, OO
A, B, AB, O
![Page 12: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/12.jpg)
Daphne Koller
BNs for Genetic Inheritance
GHomer
GBart
GMarge
GLisa GMaggie
GClancy GJackie
GSelma
BClancy BJackie
BSelma BHomer BMarge
BBart BLisa BMaggie
![Page 13: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/13.jpg)
Daphne'Koller'
Reasoning'Pa1erns'
Probabilis3c'Graphical'Models' Bayesian'Networks'
Representa3on'
![Page 14: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/14.jpg)
Daphne'Koller'
The Student Network
Intelligence Difficulty
Grade
Letter
SAT
0.3 0.08 0.25
0.4 g2
0.02 0.9 i1,d0 0.7 0.05 i0,d1
0.5
0.3 g1 g3
0.2 i1,d1
0.3 i0,d0
l1 l0
0.99 0.4 0.1 0.9 g1
0.01 g3 0.6 g2
0.2 0.95
s0 s1
0.8 i1 0.05 i0
0.4 0.6 d1 d0
0.3 0.7 i1 i0
![Page 15: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/15.jpg)
Daphne'Koller'
Intelligence Difficulty
Grade
Letter
SAT
Causal Reasoning
P(l1) ≈ 0.5
P(l1 | i0 ) ≈
P(l1 | i0 , d0) ≈
Difficulty Intelligence
0.39
0.51
![Page 16: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/16.jpg)
Daphne'Koller'
Intelligence Difficulty
Grade
Letter
SAT
Evidential Reasoning P(d1) = 0.4 P(i1) = 0.3 P(d1 | g3) ≈ P(i1 | g3) ≈
Student gets a C
0.3 0.08 0.25
0.4 g2
0.02 0.9 i1,d0 0.7 0.05 i0,d1
0.5
0.3 g1 g3
0.2 i1,d1
0.3 i0,d0
0.63 0.08
![Page 17: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/17.jpg)
Daphne'Koller'
Intelligence Difficulty
Grade
Letter
SAT
Intercausal Reasoning
Class is hard!
Student gets a C
P(i1 | g3, d1) ≈ 0.11
P(d1) = 0.4 P(i1) = 0.3 P(d1 | g3) ≈ 0.63 P(i1 | g3) ≈ 0.08
![Page 18: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/18.jpg)
Daphne'Koller'
Intercausal Reasoning Explained
X2 X1
Y OR
X1 X2 Y' Prob(
0' 0' 0 0.25
0' 1' 1 0.25
1' 0' 1 0.25
1' 1 1 0.25
![Page 19: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/19.jpg)
Daphne'Koller'
Difficulty Intelligence Difficulty
Grade
Letter
SAT
Intercausal Reasoning II
Class is hard!
Student gets a B
P(i1) = 0.3
P(i1 | g2, d1) ≈ 0.34 P(i1 | g2) ≈ 0.175
g2'
![Page 20: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/20.jpg)
Daphne'Koller'
Student Aces the SAT • What happens to the posterior
probability that the class is hard? Intelligence
Grade
Letter
SAT
Student gets a C
Difficulty
Student aces the SAT
![Page 21: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/21.jpg)
Daphne'Koller'
Intelligence
Grade
Letter
SAT
Student Aces the SAT
P(i1 | g3, s1) ≈ 0.58 P(d1 | g3, s1) ≈ 0.76 Difficulty
Student aces the SAT
P(d1) = 0.4 P(i1) = 0.3 P(d1 | g3) ≈ 0.63 P(i1 | g3) ≈ 0.08
Student gets a C
![Page 22: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/22.jpg)
Daphne'Koller'
Flow'of''Probabilis3c'Influence'
Probabilis3c'Graphical'Models' Bayesian'Networks'
Representa3on'
![Page 23: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/23.jpg)
Daphne'Koller'
When can X influence Y? • X � Y • X � Y • X � W � Y • X � W � Y • X � W � Y • X � W � Y
Intelligence Difficulty
Grade
Letter
SAT
![Page 24: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/24.jpg)
Daphne'Koller'
Active Trails • A trail X1 ─ … ─ Xn is active if: it has no v-structures Xi-1 � Xi � Xi+1
![Page 25: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/25.jpg)
Daphne'Koller'
When can X influence Y Given evidence about Z
• X � Y • X � Y
• X � W � Y • X � W � Y • X � W � Y • X � W � Y
Intelligence Difficulty
Grade
Letter
SAT
W ∉ Z W ∈ Z
![Page 26: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/26.jpg)
Daphne'Koller'
When can X influence Y given evidence about Z
• S ― I ― G ― D allows influence to flow when: Intelligence Difficulty
Grade
Letter
SAT
![Page 27: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/27.jpg)
Daphne'Koller'
Active Trails • A trail X1 ─ … ─ Xn is active given Z if:
– for any v-structure Xi-1 � Xi � Xi+1 we have that Xi or one of its descendants ∈Z
– no other Xi is in Z
![Page 28: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/28.jpg)
Daphne Koller
Preliminaries*
Probabilis-c*Graphical*Models* Independencies*
Representa-on*
![Page 29: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/29.jpg)
Daphne Koller
Independence • For events α, β, P α ⊥ β if: – P(α, β) = P(α) P(β) – P(α|β) = P(α) – P(β|α) = P(β)
• For random variables X,Y, P X ⊥ Y if: – P(X, Y) = P(X) P(Y) – P(X|Y) = P(X) – P(Y|X) = P(Y)
![Page 30: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/30.jpg)
Daphne Koller
Independence
I D Prob i0 d0 0.42 i0 d1 0.18 i1 d0 0.28 i1 d1 0.12
I Prob i0 0.6 i1 0.4
D Prob d0 0.7 d1 0.3
P(I,D)
I D G Prob. i0 d0 g1 0.126 i0 d0 g2 0.168 i0 d0 g3 0.126 i0 d1 g1 0.009 i0 d1 g2 0.045 i0 d1 g3 0.126 i1 d0 g1 0.252 i1 d0 g2 0.0224 i1 d0 g3 0.0056 i1 d1 g1 0.06 i1 d1 g2 0.036 i1 d1 g3 0.024
I D
G
![Page 31: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/31.jpg)
Daphne Koller
Conditional Independence • For (sets of) random variables X,Y,Z P (X ⊥ Y | Z) if: – P(X, Y|Z) = P(X|Z) P(Y|Z) – P(X|Y,Z) = P(X|Z) – P(Y|X,Z) = P(X|Z) – P(X,Y,Z) ∝ φ1(X,Y) φ2(Y,Z)
![Page 32: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/32.jpg)
Daphne Koller
Conditional Independence
X2 X1
Coin
![Page 33: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/33.jpg)
Daphne Koller
Conditional Independence I S G Prob. i0 s0 g1 0.114 i0 s0 g2 0.1938 i0 s0 g3 0.2622 i0 s1 g1 0.006 i0 s1 g2 0.0102 i0 s1 g3 0.0138 i1 s0 g1 0.252 i1 s0 g2 0.0224 i1 s0 g3 0.0056 i1 s1 g1 0.108 i1 s1 g2 0.0096 i1 s1 g3 0.0024
S Prob s0 0.95 s1 0.05
P(S,G | i0)
S G Prob. s0 g1 0.19 s0 g2 0.323 s0 g3 0.437 s1 g1 0.01 s1 g2 0.017 s1 g3 0.023
G Prob. g1 0.2 g2 0.34 g3 0.46
I
G S
![Page 34: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/34.jpg)
Daphne Koller
Conditioning can Lose Independences I D G Prob. i0 d0 g1 0.126 i0 d0 g2 0.168 i0 d0 g3 0.126 i0 d1 g1 0.009 i0 d1 g2 0.045 i0 d1 g3 0.126 i1 d0 g1 0.252 i1 d0 g2 0.0224 i1 d0 g3 0.0056 i1 d1 g1 0.06 i1 d1 g2 0.036 i1 d1 g3 0.024
I D Prob. i0 d0 0.282 i0 d1 0.02 i1 d0 0.564 i1 d1 0.134
P(I,D | g1)
![Page 35: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/35.jpg)
Daphne Koller
Bayesian(Networks(
Probabilis2c(Graphical(Models( Independencies(
Representa2on(
![Page 36: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/36.jpg)
Daphne Koller
Independence & Factorization
• Factorization of a distribution P implies independencies that hold in P
• If P factorizes over G, can we read these independencies from the structure of G?
P(X,Y,Z) ∝ φ1(X,Z) φ2(Y,Z) P(X,Y) = P(X) P(Y) X,Y independent
(X ⊥ Y | Z)
![Page 37: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/37.jpg)
Daphne Koller
Flow of influence & d-separation
Definition: X and Y are d-separated in G given Z if there is no active trail in G between X and Y given Z
I D
G
L
S
Notation: d-sepG(X, Y | Z)
![Page 38: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/38.jpg)
Daphne Koller
Factorization Independence: BNs Theorem: If P factorizes over G, and d-sepG(X, Y | Z) then P satisfies (X ⊥ Y | Z)
I D
G
L
S
![Page 39: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/39.jpg)
Daphne Koller
Any node is d-separated from its non-descendants given its parents Intelligence Difficulty
Grade
Letter
SAT
Job
Happy
Coherence
If P factorizes over G, then in P, any variable is independent of its non-descendants given its parents
![Page 40: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/40.jpg)
Daphne Koller
I-maps • d-separation in G P satisfies
corresponding independence statement
• Definition: If P satisfies I(G), we say that G is an I-map (independency map) of P
I(G) = {(X ⊥ Y | Z) : d-sepG(X, Y | Z)}
![Page 41: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/41.jpg)
Daphne Koller
I-maps I D Prob i0 d0 0.42 i0 d1 0.18 i1 d0 0.28 i1 d1 0.12
I D Prob. i0 d0 0.282 i0 d1 0.02 i1 d0 0.564 i1 d1 0.134
G1
P2
I D I D
P1
G2
![Page 42: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/42.jpg)
Daphne Koller
Factorization Independence: BNs Theorem: If P factorizes over G, then G is
an I-map for P
![Page 43: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/43.jpg)
Daphne Koller
Independence Factorization Theorem: If G is an I-map for P, then P
factorizes over G
I D
G
L
S
![Page 44: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/44.jpg)
Daphne Koller
Summary Two equivalent views of graph structure: • Factorization: G allows P to be represented • I-map: Independencies encoded by G hold in P
If P factorizes over a graph G, we can read from the graph independencies that must hold in P (an independency map)
![Page 45: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/45.jpg)
Daphne'Koller'
Naïve'Bayes'
Probabilis5c'Graphical'Models' Bayesian'Networks'
Representa5on'
![Page 46: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/46.jpg)
Daphne'Koller'
Naïve'Bayes'Model'
Class
X1 X2 Xn .'.'.'(Xi ⊥ Xj | C) for all Xi, Xj
![Page 47: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/47.jpg)
Daphne'Koller'
Naïve'Bayes'Classifier'
Class
X1 X2 Xn .'.'.'
![Page 48: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/48.jpg)
Daphne'Koller'
Bernoulli'Naïve'Bayes'for'Text'
cat dog buy .'.'.'Document Label
P(“cat” appears | Label)
Financial cat dog buy sell …
Pets 0.001 0.001 0.2 0.3 … 0.3 0.4 0.02 0.0001 …
![Page 49: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/49.jpg)
Daphne'Koller'
Mul5nomial'Naïve'Bayes'for'Text'
W1 W2 Wn .'.'.'Document Label Financial
cat dog buy sell …
Pets 0.001 0.001 0.02 0.02 … 0.02 0.03 0.003 0.001 …
![Page 50: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/50.jpg)
Daphne'Koller'
Summary • Simple approach for classification
– Computationally efficient – Easy to construct
• Surprisingly effective in domains with many weakly relevant features
• Strong independence assumptions reduce performance when many features are strongly correlated
![Page 51: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/51.jpg)
Daphne Koller
Applica'on:+Diagnosis+
Probabilis'c+Graphical+Models+ Bayesian+Networks+
Representa'on+
![Page 52: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/52.jpg)
Daphne Koller
Medical Diagnosis: Pathfinder (1992) • Help pathologist diagnose lymph node
pathologies (60 different diseases) • Pathfinder I: Rule-based system • Pathfinder II used naïve Bayes and got
superior performance
Heckerman et al.
![Page 53: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/53.jpg)
Daphne Koller
Medical Diagnosis: Pathfinder (1992) • Pathfinder III: Naïve Bayes with better
knowledge engineering • No incorrect zero probabilities • Better calibration of conditional probabilities – P(finding | disease1) to P(finding | disease2) – Not P(finding1 | disease) to P(finding2 | disease)
Heckerman et al.
![Page 54: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/54.jpg)
Daphne Koller
Medical Diagnosis: Pathfinder (1992) • Pathfinder IV: Full Bayesian network – Removed incorrect independencies – Additional parents led to more accurate
estimation of probabilities • BN model agreed with expert panel in 50/53
cases, vs 47/53 for naïve Bayes model • Accuracy as high as expert that designed
the model Heckerman et al.
![Page 55: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/55.jpg)
Daphne Koller
CPCS
# of parameters: 21000 to 133,931,430 to 8254
M. Pradhan , G. Provan , B. Middleton , M.Henrion, UAI 94
![Page 56: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/56.jpg)
Daphne Koller
Medical Diagnosis (Microsoft)
Thanks to: Eric Horvitz, Microsoft Research
![Page 57: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/57.jpg)
Daphne Koller
- Medical Diagnosis (Microsoft)
Thanks to: Eric Horvitz, Microsoft Research
![Page 58: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/58.jpg)
Daphne Koller
Fault Diagnosis • Microsoft troubleshooters
![Page 59: Graphical) Models) Bayesian)Networks) Seman&cs)&) …spark-university.s3.amazonaws.com/stanford-pgm/slides/... · 2017-05-17 · Graphical) Models) Bayesian)Networks) Representa&on)](https://reader033.vdocuments.us/reader033/viewer/2022060312/5f0b290d7e708231d42f2547/html5/thumbnails/59.jpg)
Daphne Koller
Fault Diagnosis • Many examples: – Microsoft troubleshooters – Car repair
• Benefits: – Flexible user interface – Easy to design and maintain