An introduction to Bayesian networks
Stochastic Processes Course
Hossein Amirkhani
Spring 2011


Page 1: Title slide

Page 2: Outline

Introduction

Bayesian Networks

Probabilistic Graphical Models

Conditional Independence

I-equivalence

Page 3: Introduction

Our goal is to represent a joint distribution over some set of n random variables.

Even in the simplest case, where these variables are binary-valued, a joint distribution requires the specification of 2^n − 1 numbers.

The explicit representation of the joint distribution is unmanageable from every perspective: computationally, cognitively, and statistically.
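A one-line calculation makes this blow-up concrete (a quick sketch; the function name is ours, not from the slides):

```python
# Independent parameters needed for a full joint distribution over n
# binary variables: 2**n outcomes whose probabilities sum to 1,
# leaving 2**n - 1 free numbers.
def full_joint_params(n: int) -> int:
    return 2 ** n - 1

for n in (5, 10, 20, 30):
    print(n, full_joint_params(n))
```

Already at n = 30 the table needs more than a billion numbers.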

Page 4: Bayesian Networks

Bayesian networks exploit conditional independence properties of the distribution in order to allow a compact and natural representation.

They are a specific type of probabilistic graphical model: BNs are directed acyclic graphs (DAGs).
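To illustrate the compactness claim, the following sketch counts parameters for a hypothetical five-node DAG over binary variables (the structure and the counting convention are ours, invented for illustration):

```python
# Each binary node with k binary parents needs 2**k free numbers
# (one probability per parent configuration). Hypothetical DAG:
parents = {
    "A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["D"],
}

bn_params = sum(2 ** len(p) for p in parents.values())
full_joint = 2 ** len(parents) - 1
print(bn_params, full_joint)  # 11 vs 31; the gap widens rapidly with n
```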

Page 5: Probabilistic Graphical Models

Nodes are the random variables in our domain. Edges correspond, intuitively, to the direct influence of one node on another.

[Figure: three model families: Factor Graph, Markov Random Field, Bayesian Network]

Page 6: Probabilistic Graphical Models

Graphs are an intuitive way of representing and visualising the relationships between many variables.

A graph allows us to abstract out the conditional independence relationships between the variables from the details of their parametric forms. Thus we can answer questions like "Is A dependent on B given that we know the value of C?" just by looking at the graph.

Graphical models allow us to define general message-passing algorithms that implement probabilistic inference efficiently.

Graphical models = statistics × graph theory × computer science.

Page 7: Bayesian Networks (figure only)

Page 8: Bayesian Networks (figure only)

Page 9: Conditional Independence: Example 1

tail-to-tail at c

Page 10: Conditional Independence: Example 1 (figure only)

Page 11: Conditional Independence: Example 1

[Figure: Smoking is a common cause, with edges to Lung Cancer and to Yellow Teeth]
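With hypothetical CPT numbers (ours, not from the slides), one can check the tail-to-tail behaviour numerically: Lung Cancer and Yellow Teeth are independent given Smoking, but dependent marginally:

```python
import itertools

# Hypothetical CPTs (invented numbers) for the tail-to-tail example:
# Smoking (S) is a common cause of Lung Cancer (C) and Yellow Teeth (T).
pS = {1: 0.3, 0: 0.7}     # P(S = s)
pC = {1: 0.2, 0: 0.01}    # P(C = 1 | S = s)
pT = {1: 0.6, 0: 0.05}    # P(T = 1 | S = s)

def joint(s, c, t):
    pc = pC[s] if c else 1 - pC[s]
    pt = pT[s] if t else 1 - pT[s]
    return pS[s] * pc * pt

# tail-to-tail at S: observing S blocks the path, so C and T decouple
for s in (0, 1):
    for c, t in itertools.product((0, 1), repeat=2):
        p_ct = joint(s, c, t) / pS[s]
        pc = pC[s] if c else 1 - pC[s]
        pt = pT[s] if t else 1 - pT[s]
        assert abs(p_ct - pc * pt) < 1e-12

# With S unobserved, the path is unblocked and C, T are dependent:
pCT = sum(joint(s, 1, 1) for s in (0, 1))
pC1 = sum(joint(s, 1, t) for s in (0, 1) for t in (0, 1))
pT1 = sum(joint(s, c, 1) for s in (0, 1) for c in (0, 1))
print(pCT, pC1 * pT1)   # 0.03635 vs 0.014405: not equal
```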

Page 12: Conditional Independence: Example 2

head-to-tail at c

Page 13: Conditional Independence: Example 2 (figure only)

Page 14: Conditional Independence: Example 2

[Figure: chain Type of Car → Speed → Amount of Speeding Fine]
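The same kind of numeric check works for the head-to-tail chain (CPT numbers again invented for illustration): once Speed is observed, Type of Car and the Fine are independent:

```python
# Hypothetical chain over binary variables: Type of Car (T) -> Speed (S)
# -> Amount of Speeding Fine (F). All numbers are invented.
pT = {1: 0.4, 0: 0.6}     # P(T = t)
pS = {1: 0.8, 0: 0.2}     # P(S = 1 | T = t)
pF = {1: 0.5, 0: 0.05}    # P(F = 1 | S = s)

def joint(t, s, f):
    ps = pS[t] if s else 1 - pS[t]
    pf = pF[s] if f else 1 - pF[s]
    return pT[t] * ps * pf

# head-to-tail at S: observing S blocks the path
for s in (0, 1):
    p_s = sum(joint(t, s, f) for t in (0, 1) for f in (0, 1))
    for t in (0, 1):
        for f in (0, 1):
            p_tf = joint(t, s, f) / p_s
            p_t = sum(joint(t, s, f2) for f2 in (0, 1)) / p_s
            p_f = sum(joint(t2, s, f) for t2 in (0, 1)) / p_s
            assert abs(p_tf - p_t * p_f) < 1e-12
print("T and F are independent given S")
```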

Page 15: Conditional Independence: Example 3

head-to-head at c

v-structure

Page 16: Conditional Independence: Example 3 (figure only)

Page 17: Conditional Independence: Example 3

[Figure: Ability of team A → Outcome of A vs. B game ← Ability of team B]
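The head-to-head case behaves in the opposite way, which a short computation can show (the win probabilities below are invented): the two abilities are marginally independent, but observing the outcome couples them, the "explaining away" effect:

```python
# Hypothetical v-structure: abilities a, b of the two teams are fair,
# independent coin flips; w = 1 means team A wins. Numbers are invented.
pW = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.9, (1, 1): 0.5}  # P(w=1 | a, b)

def joint(a, b, w):
    p = pW[(a, b)]
    return 0.25 * (p if w else 1 - p)   # 0.25 = P(a) * P(b)

# Marginally, a and b are independent (the unobserved collider blocks):
p_ab = sum(joint(1, 1, w) for w in (0, 1))
assert abs(p_ab - 0.25) < 1e-12

# Conditioning on the outcome w = 1 makes them dependent:
p_w = sum(joint(a, b, 1) for a in (0, 1) for b in (0, 1))
p_a = sum(joint(1, b, 1) for b in (0, 1)) / p_w
p_b = sum(joint(a, 1, 1) for a in (0, 1)) / p_w
p_joint = joint(1, 1, 1) / p_w
print(p_joint, p_a * p_b)   # 0.25 vs 0.21: dependent given the outcome
```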

Page 18: D-separation

A, B, and C are non-intersecting subsets of nodes in a directed graph.

A path from A to B is blocked if it contains a node such that either

a) the arrows on the path meet either head-to-tail or tail-to-tail at the node, and the node is in the set C, or

b) the arrows meet head-to-head at the node, and neither the node nor any of its descendants is in the set C.

If all paths from A to B are blocked, A is said to be d-separated from B by C.

If A is d-separated from B by C, the joint distribution over all the variables in the graph satisfies A ⊥ B | C.
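The blocking rules above can be transcribed almost directly into code. This is a brute-force sketch for small graphs (the example DAG is ours), enumerating all skeleton paths and testing each intermediate node:

```python
# A direct transcription of the blocking rules for small DAGs.
# `dag` maps each node to its set of parents (a hypothetical example).
dag = {"a": set(), "b": set(), "c": {"a", "b"}, "d": {"c"}}

def descendants(dag, x):
    out, stack = set(), [x]
    while stack:
        n = stack.pop()
        for m in dag:
            if n in dag[m] and m not in out:   # m is a child of n
                out.add(m)
                stack.append(m)
    return out

def paths(dag, a, b, path=None):
    # all undirected (skeleton) paths from a to b without repeated nodes
    path = path or [a]
    if path[-1] == b:
        yield path
        return
    last = path[-1]
    neighbours = dag[last] | {m for m in dag if last in dag[m]}
    for n in neighbours:
        if n not in path:
            yield from paths(dag, a, b, path + [n])

def blocked(dag, path, C):
    for i in range(1, len(path) - 1):
        prev, node, nxt = path[i - 1], path[i], path[i + 1]
        head_to_head = prev in dag[node] and nxt in dag[node]
        if head_to_head:
            # rule b): blocked unless the node or a descendant is in C
            if node not in C and not (descendants(dag, node) & C):
                return True
        elif node in C:
            # rule a): head-to-tail or tail-to-tail node that is observed
            return True
    return False

def d_separated(dag, a, b, C):
    return all(blocked(dag, p, C) for p in paths(dag, a, b))

print(d_separated(dag, "a", "b", set()))   # True: the v-structure blocks
print(d_separated(dag, "a", "b", {"d"}))   # False: a descendant of c observed
```

Path enumeration is exponential in general; practical implementations use a reachability procedure instead, but this version mirrors the definition one-to-one.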

Page 19: I-equivalence

Let P be a distribution over X. We define I(P) to be the set of independence assertions of the form (X ⊥ Y | Z) that hold in P.

Two graph structures K1 and K2 over X are I-equivalent if I(K1) = I(K2).

The set of all graphs over X is partitioned into a set of mutually exclusive and exhaustive I-equivalence classes.

Page 20: The skeleton of a Bayesian network

The skeleton of a Bayesian network graph K over X is an undirected graph over X that contains an edge {X, Y} for every edge (X → Y) in K.

Page 21: Immorality

A v-structure X → Z ← Y is an immorality if there is no direct edge between X and Y.

Page 22: Relationship between immorality, skeleton and I-equivalence

Let K1 and K2 be two graphs over X. Then K1 and K2 have the same skeleton and the same set of immoralities if and only if they are I-equivalent.

We can use this theorem to recognize whether or not two BNs are I-equivalent.

In addition, this theorem can be used for learning the structure of the Bayesian network related to a distribution. We can construct the I-equivalence class for a distribution by determining its skeleton and its immoralities from the independence properties of the given distribution. We then use both of these components to build a representation of the equivalence class.
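The theorem suggests a direct test for I-equivalence; here is a sketch (the parent-set encoding and the example graphs are ours):

```python
# Checking I-equivalence via skeleton + immoralities.
# Each graph maps node -> set of parents.
def skeleton(dag):
    return {frozenset((x, y)) for y in dag for x in dag[y]}

def immoralities(dag):
    out, skel = set(), skeleton(dag)
    for z in dag:
        ps = sorted(dag[z])
        for i, x in enumerate(ps):
            for y in ps[i + 1:]:
                if frozenset((x, y)) not in skel:   # no X-Y edge
                    out.add((frozenset((x, y)), z))
    return out

def i_equivalent(g1, g2):
    return skeleton(g1) == skeleton(g2) and immoralities(g1) == immoralities(g2)

g1 = {"x": set(), "y": {"x"}, "z": {"y"}}        # x -> y -> z
g2 = {"x": {"y"}, "y": {"z"}, "z": set()}        # x <- y <- z
g3 = {"x": set(), "y": {"x", "z"}, "z": set()}   # x -> y <- z (immorality)
print(i_equivalent(g1, g2))  # True: same skeleton, no immoralities
print(i_equivalent(g1, g3))  # False: g3 has the immorality at y
```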

Page 23: Identifying the Undirected Skeleton

The basic idea is to use independence queries of the form (X ⊥ Y | U) for different sets of variables U.

If X and Y are adjacent in the graph, we cannot separate them with any set of variables.

Conversely, if X and Y are not adjacent in the graph, we would hope to be able to find a set of variables that makes these two variables conditionally independent: we call this set a witness of their independence.

Page 24: Identifying the Undirected Skeleton

Let G be an I-map of a distribution P, and let X and Y be two variables that are not adjacent in G. Then either (X ⊥ Y | Pa(X)) or (X ⊥ Y | Pa(Y)) holds in P.

Thus, if X and Y are not adjacent in G, then we can find a witness of bounded size.

In particular, if we assume that G has bounded indegree, say less than or equal to d, then we do not need to consider witness sets larger than d.
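A sketch of the resulting procedure, with `indep` standing in for an independence oracle on the distribution (both the oracle and the toy chain below are invented for illustration):

```python
from itertools import combinations

# For each pair X, Y, search for a witness U with |U| <= d among the
# remaining variables; keep the edge only if no witness exists.
def build_skeleton(variables, indep, d):
    edges = set()
    for x, y in combinations(variables, 2):
        others = [v for v in variables if v not in (x, y)]
        has_witness = any(
            indep(x, y, frozenset(u))
            for k in range(d + 1)
            for u in combinations(others, k)
        )
        if not has_witness:
            edges.add(frozenset((x, y)))
    return edges

# Toy oracle for the chain a -> b -> c: only (a ⊥ c | {b}) holds.
def indep(x, y, u):
    return {x, y} == {"a", "c"} and "b" in u

print(build_skeleton(["a", "b", "c"], indep, d=1))  # edges a-b and b-c
```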

Page 25: (figure only)

Page 26: Identifying Immoralities

At this stage we have reconstructed the undirected skeleton. Now, we want to reconstruct edge directions.

Our goal is to consider potential immoralities in the skeleton and, for each one, determine whether it is indeed an immorality.

A triplet of variables X, Z, Y is a potential immorality if the skeleton contains the edges X–Z and Z–Y but does not contain an edge between X and Y.

A potential immorality is an immorality if and only if Z is not in the witness set(s) for X and Y.
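This criterion is mechanical once the witness sets from the skeleton phase are kept around; a sketch (data structures and the example are ours):

```python
# A potential immorality X - Z - Y (edges X-Z and Z-Y in the skeleton,
# no edge X-Y) is a real immorality iff Z appears in no witness set for
# the pair X, Y. `witnesses` maps each non-adjacent pair to the witness
# sets found while building the skeleton.
def find_immoralities(skeleton, witnesses):
    out = set()
    nodes = {n for e in skeleton for n in e}
    for z in nodes:
        nbrs = [n for n in nodes if frozenset((n, z)) in skeleton]
        for i, x in enumerate(nbrs):
            for y in nbrs[i + 1:]:
                if frozenset((x, y)) in skeleton:
                    continue  # X and Y adjacent: not a potential immorality
                if all(z not in w for w in witnesses[frozenset((x, y))]):
                    out.add((frozenset((x, y)), z))
    return out

skel = {frozenset(("a", "c")), frozenset(("b", "c"))}
# For a -> c <- b, the witness separating a and b is the empty set.
wits = {frozenset(("a", "b")): [frozenset()]}
print(find_immoralities(skel, wits))   # the triple (a, b) with collider c
```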

Page 27: (figure only)

Page 28: Representing Equivalence Classes

An acyclic graph containing both directed and undirected edges is called a partially directed acyclic graph or PDAG.

Page 29: Representing Equivalence Classes

Let G be a DAG. A chain graph K is the class PDAG of the equivalence class of G if K shares the same skeleton as G, and contains a directed edge X → Y if and only if all DAGs G′ that are I-equivalent to G contain the edge X → Y.

If the edge is directed, then all the members of the equivalence class agree on the orientation of the edge.

If the edge is undirected, there are two DAGs in the equivalence class that disagree on the orientation of the edge.

Page 30: Representing Equivalence Classes

Is the output of Mark-Immoralities the class PDAG?

Clearly, edges involved in immoralities must be directed in K. The obvious question is whether K can contain directed edges that are not involved in immoralities. In other words, can there be additional edges whose direction is necessarily the same in every member of the equivalence class?

Page 31: Rules (figure only)

Page 32: (figure only)

Page 33: Example (figure only)

Page 34: References

D. Koller and N. Friedman: Probabilistic Graphical Models. MIT Press, 2009.

C. M. Bishop: Pattern Recognition and Machine Learning. Springer, 2006.

Page 35: THANKS