belief networks conditional independence syntax and semantics exact inference approximate inference...

CS 460, Belief Networks 1

Belief networks

Conditional independenceSyntax and semanticsExact inferenceApproximate inference

Mundhenk and Itti 2008. Based on material from S Russel and P Norvig


Independence


Conditional independence


Conditional Independence

Cavity

Toothache Catch

Other interactions may exist, but they are either insignificant, unknown or irrelevant. We leave them out.



Cavity

Toothache Catch

We assume that a “catch” is not influenced by a toothache and visa versa.



Cavity

Toothache

Catch

(1a) Since Catch does not affect Toothache P(Toothache|Catch,Cavity) = P(Toothache|Cavity)

Cavity

Toothache

=



Cavity

Toothache Catch

(1b) Algebraically these statements are equivalent P(Toothache,Catch|Cavity) = P(Toothache|Cavity)P(Catch|Cavity)

Cavity

Toothache Catch

Cavity

=


Belief networks


Example

Probabilities derivedfrom prior observations


Typical Bayesian Network

Burglary Earthquake

Alarm

JohnCalls MaryCalls

P(B)

0.001

P(E)

0.002

B E P(A)

T T 0.95

T F 0.94

F T 0.29

F F 0.001

A P(J)

T 0.90

F 0.05

A P(M)

T 0.70

F 0.01

Here we see both the Topology and the Conditional Probability Tables (CPT).


Semantics


Markov blanket


Constructing belief networks


Burglary Earthquake

Alarm

JohnCalls MaryCalls

P(B)

0.001

P(E)

0.002

B E P(A)

T T 0.95

T F 0.94

F T 0.29

F F 0.001

A P(J)

T 0.90

F 0.05

A P(M)

T 0.70

F 0.01

What is the probability that the alarm has sounded but neither a burglary nor earthquake has occurred and both John and Mary call? – P j m a b e

Example: Full Joint Distribution


Burglary Earthquake

Alarm

JohnCalls MaryCalls

P(B)

0.001

P(E)

0.002

B E P(A)

T T 0.95

T F 0.94

F T 0.29

F F 0.001

A P(J)

T 0.90

F 0.05

A P(M)

T 0.70

F 0.01

0.9 0 0.990.7

0.00

.0 0.9980

2

1 9

06

0 0

PPP m aP j bP j a P eb em aa b e

CS 460, Belief Networks 19What if we find a new variable?

Burglary Earthquake

Alarm

JohnCalls MaryCalls

P(B)

0.001

P(E)

0.002

B S E P(A)

T T T 0.98

T T F 0.94

T F T 0.96

T F F 0.95

F T T 0.45

F T F 0.25

F F T 0.29

F F F 0.001

A P(J)

T 0.90

F 0.05

A P(M)

T 0.70

F 0.01

We find that storms can also set off alarms. We add that into our CPT. Notice that JohnCalls and MaryCalls stay the same since Storms were always there but were just unaccounted for. John and Mary did not change! However, we have better precision at P(A).

StormP(S)

0.1


What if we cause a new variable? (or a new variable just occurs)

Burglary Earthquake

Alarm

JohnCalls MaryCalls

P(B)

0.001

P(E)

0.002

B C E P(A)

T T T 0.98

T T F 0.94

T F T 0.96

T F F 0.95

F T T 0.45

F T F 0.25

F F T 0.29

F F F 0.001

A P(J)

T 0.90

F 0.05

A P(M)

T 0.70

F 0.01

What if we inject a new cause that was not there before. We pay a crazy guy to set off the alarm frequently, JohnCalls and MaryCalls may no longer be valid since we may have changed the behaviors. For instance the alarm goes off so often now that John and Mary are more likely to ignore it.

CrazyGuyP(C)

0.25


What if we cause a new variable?

Burglary Earthquake

Alarm

JohnCalls MaryCalls

P(B)

0.001

P(E)

0.002

B C E P(A)

T T T 0.98

T T F 0.94

T F T 0.96

T F F 0.95

F T T 0.45

F T F 0.25

F F T 0.29

F F F 0.001

A P(J)

T 0.90

F 0.05

A P(M)

T 0.70

F 0.01

If the introduced variable is highly erratic, it can invalidate even more of the CPT than we would like.

CrazyGuyP(C)

0.25


What if we cause a new variable?

Burglary Earthquake

Alarm

JohnCalls MaryCalls

P(B)

0.001

P(E)

0.002

B C E P(A)

T T T 0.98

T T F 0.94

T F T 0.96

T F F 0.95

F T T 0.45

F T F 0.25

F F T 0.29

F F F 0.001

A P(J)

T 0.90

F 0.05

A P(M)

T 0.70

F 0.01

However some changes to the CPT may be absurd so we may never have to worry about them.

CrazyGuy P(C)

0.25


Things in the model can change

We can account for change in the model over time (a more advanced topic). • John and Mary may be more or less likely to call at certain

times. Cyclical repetition may be not too difficult to model. • People may become tired of their job and be less likely call

over longer periods. This may be easy or difficult to model.• Crime picks up. If the trend is slow enough, the model may

be able to adjust online even if we have never observed crime picking up before. However, this may easily and totally throw our model off.• However, keep in mind, we may get good enough results for

our model even without accounting for changes over time.


How would we apply this to Robotics?

Tree in PathRock in

Path

Obstacle Detect

Execute Stop

ExecuteTurn

P(T)

0.25

P(R)

0.4

T R P(O)

T T 0.95

T F 0.45

F T 0.29

F F 0.001

O P(S)

T 0.90

F 0.05

O P(U)

T 0.70

F 0.01

The CPT can describe other events and probabilities such as action success given observations.


Exact Inference in a Bayesian Network

What if we want to make an inference such as: what is the probability of a tree in the path given that the robot has stopped and turned. This might be useful to a robot which can judge if there is a tree in a path based on the behavior of another robot. So if robot A sees robot B turn or stop it might infer that there is a tree in the path.

, ,r o

P t s u P t P r P o t r P s o P u o

, ,, ,

, , , ,

P t s u P t s uT s u

P t s u P t s u P t s u P t s u

P

Normalized we get:


Bayesian Inference cont’

0.9

0.9

0.7

0.

0.25

0.25

0

7

0.3

0.3

.25

0.6

0.4

0

0.45

0.95

0.5 0.6

0

.1

0.1.4

5

0.025 50.

P r

P r

P

P o t r

P o t r

P

P s o

P s o

P s o

P s o

P

o tr

P r

P t

P t

u o

P u o

P u o

P

r

P

P t

ut oo t rP

We compute P of tree and P of not tree and normalize. This is essentially an enumeration of all situations for tree and not tree.

P t

P t

P t

P r

P r

P r

P o t r

P o t r

P o t r

P s o

P s o

P

P u o

P u o

P u o

P o t r P u o

s o

P sPt orP

0.9

0.9

0.7

0.

0.75

0.75

0

7

0.3

0.3

.75

0.6

0.4

0.6

0.

0.001

0.29

0. 0.1

4 0.10.7

9 9

0.5

9

71

,P t s u

,P t s u


Bayesian Inference cont’

Finishing:

,

0.051975

0.00315

0.002025

0.00285

P t s u

,

0.0002835

0.05481

0.0134865

0.00639

P t s u

0.06

0.07497

0.06 0.07497,

0.06 0.07497 0.06 0.07497

0.4445,0.5555

, ,r o

P t s u P t P r P o t r P s o P u o

, ,, ,

, , , ,

P t s u P t s uT s u

P t s u P t s u P t s u P t s u

P

The P of a tree in the path is 0.4445


What about hidden variables?


Example: car diagnosis


Example: car insurance


Compact conditional distributions


Compact conditional distributions

Know

Infer


Hybrid (discrete+continuous) networks

If subsidy was a hidden variable would it be discrete and discreet?


Continuous child variables


Discrete variable w/ continuous parents


Discrete variable


Inference in belief networks

Exact inference by enumerationExact inference by variable eliminationApproximate inference by stochastic

simulationApproximate inference by Markov chain

Monte Carlo (MCMC)

belief networks conditional independence syntax and semantics exact inference approximate inference...

Documents

belief networks cs

belief networks9 slide

belief networks3 slide

belief networks6 slide

belief networks2 slide

belief networks10 slide

belief networks16 slide

belief networks14 slide