Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

M.Sc. Project UCSD 2013 Clifford Champion <cchampio@cs> Adviser: Prof. Charles Elkan VISUALIZING INFERENCE IN LARGE BAYESIAN NETWORKS December 10th, 2013


DESCRIPTION

In this project we address the challenge of viewing and using Bayesian networks as their structural size and complexity grows. We introduce two new visualization methods, inference diffs, and relevance filtering, to enable visual analysis of information flow in these networks, and to enable direct comparison of two evidence configurations simultaneously. We implement and discuss the performance of these visualization methods on two modestly large networks, built from real-world data.

TRANSCRIPT

Page 1: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

M.Sc. Project UCSD 2013

Clifford Champion

<cchampio@cs>

Adviser: Prof. Charles Elkan

VISUALIZING INFERENCE

IN LARGE BAYESIAN

NETWORKS

December 10th, 2013

Page 2: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

What and Why

Designing the Visualization

Implementation and Results

B-Vis, F-AI

Traffic and Census data sets

Conclusion and Q&A

OUTLINE

Page 3: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

WHAT AND WHY

Page 4: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

2002: the indexed size of the internet was about 167 TB.

2002: > 330 TB of human-generated email was created.

2010: 50 billion user photos stored on Facebook.

2010: 130 TB of logs generated daily by Facebook.

2010: 2.5 PB of Walmart customer and transaction data.

2013: Over 50 GB of Tweets created daily on Twitter.

2013: eBay stores 40 PB dedicated to “deep” analysis.

“BIG DATA”

Page 5: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

HOW DO WE USE ALL THIS DATA?

Image credit: http://commons.wikimedia.org/wiki/User:Shervinafshar

Page 6: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

To quote Edward Tufte:

“often the most effective way to describe, explore, and summarize a set of numbers – even a very large set – is to look at pictures of those numbers” (emphasis added)

“data graphics can be both the simplest [and] most powerful of methods”

Visualizations help reveal interesting facts and abstract relationships

Impossible or inefficient if using tabular data alone

In software applications, visualizations are a navigational tool

DATA VISUALIZATION WILL BE ESSENTIAL

Page 7: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Bayesian networks can be an important tool for “big data”

“Information flow” in Bayesian networks can be an opaque concept

D-separation is not useful enough

Is there more beneath the surface?

Visualizing Bayesian networks well has been a largely neglected goal

WHY THIS PROJECT?

Page 8: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

A graphical model of random variables

E.g., on a scale of 0 to 1, how likely is rain today?

The edges of the graph define conditional (in)dependencies between variables (nodes)

Can represent statistical, causal, and/or latent variables

What is the life expectancy of a non-smoker living in South America?

A car that won’t start can be caused by a dead battery. But being late to work won’t cause a car to not start.

HMMs and topic clustering

Queryable: evidence goes in, new beliefs come out

If we know Winter=TRUE, what do we believe of Rain=TRUE?

BAYESIAN NETWORKS:

IN A NUTSHELL

Page 9: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Every random variable Y has a conditional probability distribution P(Y | X1, ..., Xm(Y)), given its m(Y) parents.

For our purposes, stored as a conditional probability table (CPT).

If Y has no parents, its probability distribution simplifies to P(Y).

Marginal distributions, e.g. P(Y) or P(Z), can be recovered by summing out the other variables.

To create a Bayesian network you must train (machine learning) and/or hand-craft (expert interviews)

BAYESIAN NETWORKS:

IN A NUTSHELL
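As a rough illustration of the CPT idea above, here is a minimal Python sketch of the Winter/Rain example. The two-node structure (Winter → Rain) and every probability in it are invented for illustration; they are not taken from the project.

```python
# Toy two-node network Winter -> Rain. All numbers are made up.
p_winter = {True: 0.25, False: 0.75}       # P(Winter): a root node, no parents
cpt_rain = {                                # P(Rain | Winter): one row per parent value
    True:  {True: 0.60, False: 0.40},
    False: {True: 0.20, False: 0.80},
}

def marginal_rain():
    """P(Rain), obtained by summing the joint distribution over Winter."""
    return {
        r: sum(p_winter[w] * cpt_rain[w][r] for w in p_winter)
        for r in (True, False)
    }

def posterior_winter_given_rain(rain):
    """P(Winter | Rain=rain) via Bayes' rule: evidence goes in, beliefs come out."""
    joint = {w: p_winter[w] * cpt_rain[w][rain] for w in p_winter}
    z = sum(joint.values())
    return {w: v / z for w, v in joint.items()}
```

With these made-up numbers, the prior belief P(Rain=TRUE) is 0.30, and knowing Winter=TRUE raises it to the CPT entry 0.60.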

Page 10: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

VISUAL DESIGN

Page 11: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Top-down causal ordering

Regularly the unstated choice for small networks

Difficult to satisfy in large/complex networks

Edge crossings are avoided

Also difficult to satisfy in large/complex networks

THE STATE OF THE ART

Image credit: Koller, Daphne, and Nir Friedman. Probabilistic graphical models: principles and techniques. The MIT Press, 2009.

Page 12: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

THE STATE OF THE ART

Conditional probability tables

Michele Cossalter, Ole J. Mengshoel, and Ted Selker. “Visualizing and Understanding Large-Scale Bayesian Networks.” The AAAI-11 Workshop on Scalable Integration of Analytics and Visualization. 2011.

Page 13: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

CPT heatmaps

Natural representation only for nodes with exactly one parent.

THE STATE OF THE ART

Chiang, Chih-Hung, et al. Visualizing Graphical Probabilistic Models. Technical Report 2005-017, UML CS, 2005.

Page 14: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Marginal distributions via embedded bar charts

THE STATE OF THE ART

Bayes Server (http://bayesserver.com)

Page 15: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Marginal distributions via shading

Binary variables only

Parent influence via hue blending

At most 2 parents, maybe 3

THE STATE OF THE ART

Zapata-Rivera, Juan-Diego, Eric Neufeld, and Jim E. Greer. "Visualization of Bayesian belief networks." Proceedings of IEEE Visualization ’99, Late Breaking Hot Topics. 1999.

Williams, Lloyd, and Robert St Amant. "A Visualization Technique for Bayesian Modeling." Proc. of IUI. Vol. 6. 2006.

Page 16: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Partition and fish-eye

User-driven

THE STATE OF THE ART

Sundararajan, Priya Krishnan, Ole J. Mengshoel, and Ted Selker. "Multi-focus and multi-window techniques for interactive network exploration." IS&T/SPIE Electronic Imaging. International Society for Optics and Photonics, 2013.

Page 17: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Design before code

Wireframes and mockups produce high-quality “what-ifs”

General principles:

Maximize the “data-ink ratio” (Tufte)

Don’t distort the data, don’t mislead the viewer (Tufte)

Maximize readability and cleanliness (me)

Goals specific to Bayesian networks

Present the “basics” clearly and conveniently

The effects of evidence should be stupidly obvious

Should scale to large networks

Shoshin (初心)

PHILOSOPHIES & OBJECTIVES

Photo credit: zeze57@flickr

Page 18: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

What are the variables of the model?

What is the structure of the model?

STRUCTURE

Page 19: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

STRUCTURE

Page 20: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Low contrast (no information beyond strokes/shading)

Single capital letter for variable names; subscripts if needed

Legend is ordered to match the layout's vertical order

Structural view is zoomable, scrollable

STRUCTURE

Page 21: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

What are the event spaces?

EVENT SPACES

Page 22: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Each event space receives a color mapping

Categorical spaces jump between contrasting hues

Ordered spaces step through similar hues

Legend is augmented accordingly

EVENT SPACES
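One possible way to realize such a color mapping, sketched in Python with the standard colorsys module. The function name, the golden-ratio hue step for categorical spaces, and the narrow hue band for ordered spaces are assumptions made for this sketch, not the scheme used in B-Vis.

```python
import colorsys

GOLDEN_STEP = 0.618034  # assumed hue step; keeps consecutive categorical hues far apart

def event_space_colors(n_values, ordered, base_hue=0.6, band=0.15):
    """Return one RGB tuple (floats in 0..1) per value of an event space."""
    colors = []
    for i in range(n_values):
        if ordered:
            # Ordered space: walk through a narrow band of neighboring hues.
            frac = i / (n_values - 1) if n_values > 1 else 0.0
            hue = (base_hue + band * frac) % 1.0
        else:
            # Categorical space: jump around the hue wheel.
            hue = (base_hue + GOLDEN_STEP * i) % 1.0
        colors.append(colorsys.hls_to_rgb(hue, 0.5, 0.7))
    return colors
```

For example, `event_space_colors(4, ordered=True)` could color the four traffic-flow levels from low to high, while `ordered=False` suits a categorical attribute.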

Page 23: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

What are the probability distributions captured in the model?

DISTRIBUTIONS

Page 24: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

DISTRIBUTIONS

Page 25: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Distributions are embedded into each node

Pie chart slices are proportional to marginal probability masses

DISTRIBUTIONS

P (A)

P (V)

P (T1)

P (X)

P (T2)

Page 26: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

What about evidence?

What about the effects of evidence?

SEEING INFERENCE

Page 27: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

SEEING INFERENCE

Page 28: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Evidence nodes receive a black border

Query (non-evidence) nodes’ embedded distributions are updated to reflect posterior distributions

SEEING INFERENCE

Let V=v

P (T1|V=v, A=a)

P (X|V=v, A=a)

P (T2|V=v, A=a)

Let A=a

Page 29: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

What just happened? Difficult to see the change.

We need a way to perform a direct comparison.

Let E1 and E2 be evidence sets*, e.g. E1 = (A=a) and E2 = (A=b, V=v)

Compute the posterior distributions separately

Visualize the posterior distributions together

Inspired by code diffs in software engineering

* The word “set” is being abused here.

SEEING INFERENCE
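The two-posterior idea can be sketched as follows. This is a toy illustration that uses exact enumeration on a hypothetical three-variable chain (A → B → C) with invented probabilities; the project itself reports using approximate inference, and the names `posteriors` and `inference_diff` are introduced only for this sketch.

```python
from itertools import product

# Hypothetical binary chain A -> B -> C. All probabilities are invented.
VARS = ["A", "B", "C"]
PARENTS = {"A": [], "B": ["A"], "C": ["B"]}
CPT = {  # CPT[var][parent values] = [P(var=0 | parents), P(var=1 | parents)]
    "A": {(): [0.7, 0.3]},
    "B": {(0,): [0.9, 0.1], (1,): [0.2, 0.8]},
    "C": {(0,): [0.8, 0.2], (1,): [0.3, 0.7]},
}

def joint(assign):
    """Joint probability of a full assignment: product of one CPT entry per node."""
    p = 1.0
    for v in VARS:
        key = tuple(assign[u] for u in PARENTS[v])
        p *= CPT[v][key][assign[v]]
    return p

def posteriors(evidence):
    """Exact P(X | evidence) for every variable X, by enumerating all assignments."""
    totals = {v: [0.0, 0.0] for v in VARS}
    z = 0.0
    for values in product([0, 1], repeat=len(VARS)):
        assign = dict(zip(VARS, values))
        if any(assign[k] != val for k, val in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(assign)
        z += p
        for v in VARS:
            totals[v][assign[v]] += p
    return {v: [t / z for t in totals[v]] for v in VARS}

def inference_diff(e1, e2):
    """Pair up the posteriors under two evidence configurations, per variable."""
    p1, p2 = posteriors(e1), posteriors(e2)
    return {v: (p1[v], p2[v]) for v in VARS}
```

Calling `inference_diff({}, {"A": 1})` returns, for each variable, the posterior under E1 next to the posterior under E2, which is the data an inference diff needs to draw.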

Page 30: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

AN “INFERENCE DIFF”

Page 31: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Inner “pie” is posterior for E1

Outer “ring” is posterior for E2

Seeing the difference

Consistent event space coloring

Consistent event space ordering

Changes in area and color mass easy to spot

Evidence in E1 and E2 distinguished by black borders around pie and/or ring

AN “INFERENCE DIFF”

Evidence

in E1

Evidence

in E2

P ( X | E2 ) P ( X | E1 )
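A minimal sketch of the geometry behind such a glyph, assuming slices are laid out clockwise from 12 o'clock in a fixed event-space order. The function names and the angle convention are assumptions for illustration, not the B-Vis rendering code.

```python
def wedge_angles(dist, start_deg=90.0):
    """Return a (start, end) angle pair in degrees for each probability in dist.

    Slices run clockwise from start_deg, in the order given, and each sweep
    is proportional to its probability mass.
    """
    angles = []
    cursor = start_deg
    for p in dist:
        sweep = 360.0 * p
        angles.append((cursor, cursor - sweep))
        cursor -= sweep
    return angles

def diff_glyph(post_e1, post_e2):
    """Inner pie uses the E1 posterior, outer ring the E2 posterior.

    Both use the same value order, so a slice that grows or shrinks lines up
    visually with its counterpart.
    """
    return {
        "inner_pie": wedge_angles(post_e1),
        "outer_ring": wedge_angles(post_e2),
    }
```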

Page 32: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

What if there are too many variables?

What about when I don’t know which variables to look for?

SCALING TO LARGER NETWORKS

Page 33: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

RELEVANCE FILTERING

Page 34: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

RELEVANCE FILTERING

Emphasize the variables with most change; diminish the rest

Use KL divergence to quantify change, with values in [0, +Inf)

Call the top C% most changed variables the “relevant” variables

Shrink & fade nodes of irrelevant variables

Shorten and fade edges with irrelevant variables

Reduces the canvas size needed

Facilitates discovery in large models
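A small Python sketch of this filtering step. The direction of the divergence (KL of the E2 posterior from the E1 posterior), the smoothing constant, and the function names are assumptions; the slides only state that KL divergence ranks the variables and that the top C% are kept.

```python
import math

def kl_divergence(p, q, eps=1e-9):
    """KL(p || q) for two discrete distributions over the same ordered event space.

    eps is an assumed smoothing constant that avoids division by zero.
    """
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

def relevant_variables(post_e1, post_e2, top_percent=20.0):
    """Rank variables by how much their posterior changed and keep the top C%."""
    scores = {v: kl_divergence(post_e2[v], post_e1[v]) for v in post_e1}
    ranked = sorted(scores, key=scores.get, reverse=True)
    k = max(1, math.ceil(len(ranked) * top_percent / 100.0))
    return ranked[:k], scores
```

Variables outside the returned list would be the ones to shrink and fade in the view.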

Page 35: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

IMPLEMENTATION AND

RESULTS

Page 36: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Structure and CPT learning (F#)

Structure space search using edge operations and BIC scoring

Dynamic programming algorithm, memoizing local scores

OO Design: Types defined for network, random variable, distribution, event space, etc.

Immutable type design made life easier and computations faster

Inference (F#)

Approximate inference using Markov Chain Monte Carlo / Gibbs sampling

Visualization Tool (C#)

Adopted a variation of the Model-View-Controller paradigm

Independent threads for learning, layout, inference, and UI/rendering

Used Microsoft WPF for vector graphics and user-input handling

Used open-source Graph# for Sugiyama graph layout

All source shared at https://github.com/duckmaestro/F-AI

SOFTWARE IMPLEMENTATION
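To make the inference step concrete, here is a minimal Gibbs sampler in Python (the project's own code is in F#). It runs on a hypothetical binary chain A → B → C with invented CPTs; the sample counts, burn-in, and function names are assumptions for illustration and do not mirror the F-AI source.

```python
import random

# Hypothetical binary chain A -> B -> C. All probabilities are invented.
VARS = ["A", "B", "C"]
PARENTS = {"A": [], "B": ["A"], "C": ["B"]}
CHILDREN = {"A": ["B"], "B": ["C"], "C": []}
CPT = {  # CPT[var][parent values] = [P(var=0 | parents), P(var=1 | parents)]
    "A": {(): [0.7, 0.3]},
    "B": {(0,): [0.9, 0.1], (1,): [0.2, 0.8]},
    "C": {(0,): [0.8, 0.2], (1,): [0.3, 0.7]},
}

def local_prob(var, state):
    """CPT entry for var's current value given its parents' current values."""
    key = tuple(state[u] for u in PARENTS[var])
    return CPT[var][key][state[var]]

def conditional(var, state):
    """P(var | everything else): var's own CPT entry times its children's entries."""
    weights = []
    for value in (0, 1):
        trial = dict(state)
        trial[var] = value
        w = local_prob(var, trial)
        for child in CHILDREN[var]:
            w *= local_prob(child, trial)
        weights.append(w)
    z = sum(weights)
    return [w / z for w in weights]

def gibbs_posteriors(evidence, n_samples=20000, burn_in=1000, seed=0):
    """Estimate P(X | evidence) for every X by resampling one free variable at a time."""
    rng = random.Random(seed)
    state = {v: evidence.get(v, rng.randint(0, 1)) for v in VARS}
    free = [v for v in VARS if v not in evidence]
    counts = {v: [0, 0] for v in VARS}
    for step in range(burn_in + n_samples):
        for v in free:
            state[v] = 1 if rng.random() < conditional(v, state)[1] else 0
        if step >= burn_in:
            for v in VARS:
                counts[v][state[v]] += 1
    return {v: [c / n_samples for c in counts[v]] for v in VARS}
```

Evidence variables are clamped and never resampled, so their estimated distribution puts all mass on the observed value.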

Page 37: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Traffic flow measurements from the San Francisco bay area highway system

32 sensor locations

4 discretized values of traffic flow amount from low to high

4415 examples

Acquired by Krause and Guestrin; reprocessed by Shahaf et al.

Model Training

Entire data set used

Uniform Dirichlet prior

Parent limit of 2

EXAMPLE: SAN FRANCISCO TRAFFIC

NETWORK
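As a sketch of what training CPTs under a uniform Dirichlet prior can look like: add the same pseudo-count to every cell before normalizing. The pseudo-count value (alpha = 1), the data layout, and the function name are assumptions; the slides do not state the prior strength used.

```python
from collections import Counter
from itertools import product

def estimate_cpt(rows, child, parents, arity, alpha=1.0):
    """Estimate P(child | parents) from data with a uniform Dirichlet prior.

    rows   : list of dicts mapping variable name -> integer value
    arity  : dict mapping variable name -> number of values
    alpha  : pseudo-count added to every cell (assumed value)
    Returns a dict: parent configuration tuple -> list of probabilities.
    """
    counts = Counter((tuple(r[p] for p in parents), r[child]) for r in rows)
    cpt = {}
    for config in product(*(range(arity[p]) for p in parents)):
        total = sum(counts[(config, v)] for v in range(arity[child]))
        total += alpha * arity[child]
        cpt[config] = [
            (counts[(config, v)] + alpha) / total for v in range(arity[child])
        ]
    return cpt
```

A parent configuration that never appears in the data falls back to the uniform distribution instead of producing a division by zero.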

Page 38: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

TRAFFIC NETWORK

“Traffic” Bayesian network visualized in B-Vis.

Must zoom out a bit to see entire model of this size.

Pictured with no evidence configured.

Page 39: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

INFERENCE DIFF OF TRAFFIC NETWORK

E1 = (empty)

E2 = (A4=‘medium’)

Relevance filtering: top 20%.

Reduction in overall space requirement allows us to see entire structure even while zoomed in.

The impact of this evidence diminishes as it propagates in this model

Page 40: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

1990 U.S. Census

68 attributes

Each attribute has categorical or discretized values

Each attribute has between 2 and 17 values

2.4 million examples

Discretized by Meek et al.; hosted by UCI Machine Learning Repository

Model Training

First 10,000 randomly chosen examples

Uniform Dirichlet prior

Parent limit of 3

EXAMPLE: 1990 U.S. CENSUS
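The parent limit above bounds the size of each family that the structure search has to score. A rough Python sketch of a decomposable BIC score with memoized local scores follows; the toy data, the names, and the use of functools.lru_cache are assumptions for illustration, and the edge-operation search itself is not shown.

```python
import math
from collections import Counter
from functools import lru_cache

# Made-up data: X and Y always agree, Z is unrelated to both.
DATA = [
    {"X": 0, "Y": 0, "Z": 0}, {"X": 0, "Y": 0, "Z": 1},
    {"X": 0, "Y": 0, "Z": 0}, {"X": 0, "Y": 0, "Z": 1},
    {"X": 1, "Y": 1, "Z": 0}, {"X": 1, "Y": 1, "Z": 1},
    {"X": 1, "Y": 1, "Z": 0}, {"X": 1, "Y": 1, "Z": 1},
]
ARITY = {"X": 2, "Y": 2, "Z": 2}

@lru_cache(maxsize=None)
def local_bic(child, parents):
    """BIC contribution of one family: max log-likelihood minus a size penalty.

    parents must be a sorted tuple so that equal parent sets share a cache entry.
    """
    n = len(DATA)
    family = Counter((tuple(r[p] for p in parents), r[child]) for r in DATA)
    parent_cfg = Counter(tuple(r[p] for p in parents) for r in DATA)
    loglik = sum(c * math.log(c / parent_cfg[cfg]) for (cfg, _), c in family.items())
    n_params = (ARITY[child] - 1) * math.prod(ARITY[p] for p in parents)
    return loglik - 0.5 * math.log(n) * n_params

def network_bic(structure):
    """Total score of a structure given as a dict: child -> list of parents."""
    return sum(
        local_bic(child, tuple(sorted(parents)))
        for child, parents in structure.items()
    )
```

Adding or removing one edge changes only one family, so only that local score is recomputed; the others are served from the cache.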

Page 41: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

CENSUS NETWORK

“Census” Bayesian network visualized in B-Vis.

Must zoom quite far out to see entire model on screen

Page 42: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

INFERENCE DIFF OF CENSUS NETWORK

E1 = (empty)

E2 = (‘income4’=‘yes’)

‘income4’

Page 43: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

TOP 20% RELEVANT VARIABLES

‘income4’ := Interest, dividends, or rental income in prior year.

Relevant variables (in no special order):

year of immigration ('immigr')

place of birth ('pob')

Hispanic heritage ('hispanic')

relationship to the homeowner ('relat2')

whether living in a subfamily ('subfam1')

number of subfamilies ('subfam2')

whether working on a farm ('income3')

whether the individual served in the military during no major war or conflict ('othrserv')

their ancestry ('ancstry2')

their means of transportation to work ('means')

their status in the job market ('avail')

employment status of parents ('remplpar')

‘income4’

Page 44: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

LET’S DRILL DOWN

Let’s inspect ‘means’:

Increased likelihood of no daily commute

Decreased likelihood of bike, train, and other non-auto means of commute

Page 45: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

INFORMATION FLOW SURPRISE

Parents of ‘income4’ were irrelevant in this inference diff: ‘income6’, ‘rpincome’, ‘rearning’.

Relevance filtering reveals that in general the greatest impact of evidence can be “far away”. Snowball effect.

‘income4’

‘rpincome’

‘income6’

‘rearning’

Page 46: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

PARTING THOUGHTS

Page 47: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Finding the greatest difference between two medical treatments: a Bayesian network as a causal model

E1 = ( Age=38, HasConditionX=True, do(MedicationA=True), do(MedicationB=False) )

E2 = ( Age=38, HasConditionX=True, do(MedicationA=False), do(MedicationB=True) )

An inference diff with relevance filtering could clearly and visually present the greatest expected differences in prognosis and side-effects.

OTHER USES OF INFERENCE DIFFS

Page 48: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Layout stability is important

Better layout algorithms exist, and may be customizable with relevance filtering in mind

Unused visual modalities have untapped potential

Node shape, additional evidence set rings, etc.

Large event spaces / continuous event spaces

Adaptive color space folding? / Density pie chart?

Alternative measures of “relevance”

User-specifiable event space value importance

Other graphical models

Dynamic Bayesian networks? Conditional random fields?

CHALLENGES & FUTURE WORK

Page 49: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Visualizations can help reveal insights

Visualizations can communicate dense information efficiently

We introduced inference diffs for direct comparisons of posterior beliefs in Bayesian networks

We extended inference diffs with relevance filtering

for assisting users in locating interesting phenomena in large networks

IN CONCLUSION

Page 50: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

Thanks! Q & A

Page 51: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

APPENDIX

Page 52: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

UX question: is there an easy way to assign evidence?

Radial drag-drop menu

Keeps with pie chart motif

Drag outside inward to assign evidence

Drag inside outward to remove evidence

ASSIGNING EVIDENCE

Dropping on the ring assigns evidence in E2

Dropping on the center assigns evidence in E1

Page 53: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

CONDITIONAL PROBABILITY TABLES

Use vertical space, not horizontal space

Event space color mapping reused for probability masses and for parent permutations

Page 54: Visualizing Inference in Large Bayesian Networks (UCSD M.Sc. Project)

DISCOVERY AND NAVIGATION

Goldfarb, Doron, et al. "Art History on Wikipedia, a Macroscopic Observation." arXiv preprint arXiv:1304.5629 (2013).