Learning and Inference in
Probabilistic Graphical Models
CSCI 2950-P: Special Topics in Machine Learning
Spring 2010
Prof. Erik Sudderth
Learning from Structured Data
Hidden Markov Models (HMMs)
“Conditioned on the present, the past and future are statistically independent”
Visual Tracking
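The quoted Markov property can be checked numerically on a toy chain; a minimal sketch with a hypothetical two-state HMM (all probabilities invented):

```python
import numpy as np

# Hypothetical 2-state Markov chain x1 -> x2 -> x3 (toy numbers).
pi = np.array([0.6, 0.4])                 # p(x1)
A = np.array([[0.7, 0.3], [0.2, 0.8]])    # p(x_{t+1} | x_t)

# Joint p(x1, x2, x3) via the chain factorization pi * A * A.
joint = pi[:, None, None] * A[:, :, None] * A[None, :, :]

# Condition on the "present" x2 = 0 and check that the
# "past" x1 and "future" x3 are independent given x2.
slice2 = joint[:, 0, :]
cond = slice2 / slice2.sum()              # p(x1, x3 | x2 = 0)
p_x1 = cond.sum(axis=1)                   # p(x1 | x2 = 0)
p_x3 = cond.sum(axis=0)                   # p(x3 | x2 = 0)
assert np.allclose(cond, np.outer(p_x1, p_x3))
```

The conditional joint is an outer product precisely because x2 blocks every path between x1 and x3.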
Kinematic Hand Tracking
Kinematic Prior
Structural Prior
Dynamic Prior
Nearest-Neighbor Grids
Legend: unobserved or hidden variables, each with a local observation.
Low Level Vision
• Image denoising
• Stereo
• Optical flow
• Shape from shading
• Superresolution
• Segmentation
Wavelet Decompositions
• Bandpass decomposition of images into multiple scales & orientations
• Dense features which simplify statistics of natural images
Hidden Markov Trees
• Hidden states model evolution of image patterns across scale and location
Validation: Image Denoising
Original Image: Barbara
Corrupted by Additive White Gaussian Noise (PSNR = 24.61 dB)
Denoising Results: Barbara
Noisy Input (24.61 dB) HDP-HMT (32.10 dB)
• Posterior mean of wavelet coefficients averages samples with varying numbers of states (model averaging)
Denoising: Input
24.61 dB
Denoising: Binary HMT
Crouse, Nowak, & Baraniuk, 1998 (29.35 dB)
Denoising: HDP-HMT
32.10 dB
Visual Object Recognition
Can we transfer knowledge from one object category to another?
Describing Objects with Parts
Pictorial Structures: Fischler & Elschlager, 1973
Constellation Model: Perona et al., 2000 to present
Generalized Cylinders: Marr & Nishihara, 1978
Recognition by Components: Biederman, 1987
A Graphical Model for Object Parts
(Plate diagram: global distribution G0 with priors H and R; per-category distributions Gj, j = 1, …, J; appearance w and position v for each of N image features.)
3D Scenes
G0, H, R (global density): object category, part size & shape, transformation prior
Gj, j = 1, …, J (transformed densities): object category, part size & shape, transformed locations
w, v (2D image features, N per image): appearance descriptors, 2D pixel coordinates
o, u (3D scene features): object category, 3D location
Stereo Test Image
Many Other Applications
• Speech recognition & speaker diarization
• Natural language processing: parsing, topic models, …
• Robotics: mapping, navigation & control, …
• Error correcting codes & wireless communications
• Bioinformatics
• Nuclear test monitoring
• ………
Undirected Graphical Models
An undirected graph G = (V, E) is defined by:
• V: a set of nodes, each associated with a random variable
• E: a set of edges connecting nodes
Graph Separation ⇔ Conditional Independence
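Graph separation is easy to test mechanically: a set S separates A from B if removing S disconnects them. A minimal sketch (hypothetical adjacency lists):

```python
from collections import deque

def separated(adj, A, B, S):
    """True if every path from A to B passes through S (graph separation)."""
    seen = set(A) - set(S)
    frontier = deque(seen)
    while frontier:
        u = frontier.popleft()
        for v in adj[u]:
            if v in S or v in seen:
                continue          # conditioning nodes block the path
            if v in B:
                return False      # reached B without crossing S
            seen.add(v)
            frontier.append(v)
    return True

# Chain 1 - 2 - 3: node 2 separates 1 from 3, so x1 and x3
# are conditionally independent given x2.
adj = {1: [2], 2: [1, 3], 3: [2]}
assert separated(adj, {1}, {3}, {2})
assert not separated(adj, {1}, {3}, set())
```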
Inference in Graphical Models
Observations are implicitly encoded via compatibility functions.
Maximum a Posteriori (MAP) Estimates
Posterior Marginal Densities
• Provide both estimators and confidence measures
• Sufficient statistics for iterative parameter estimation
Why the Partition Function?
Statistical Physics
• Sensitivity of physical systems to external stimuli
Hierarchical Bayesian Models
• Marginal likelihood of observed data
• Fundamental in hypothesis testing & model selection
Cumulant Generating Function
• For exponential families, derivatives with respect to parameters provide marginal statistics
PROBLEM: Computing Z exactly in general graphs is NP-hard
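Both the brute-force cost of Z and the cumulant-generating property can be seen on a toy Ising model; a sketch with invented couplings:

```python
import itertools
import math

# Partition function of a tiny Ising model on a cycle:
# p(x) = exp(J * sum_edges x_i x_j) / Z, with x_i in {-1, +1}.
# The 2^N-term sum is exactly why computing Z is hard for large graphs.
def partition_function(edges, n, J=0.5):
    return sum(math.exp(J * sum(x[i] * x[j] for i, j in edges))
               for x in itertools.product([-1, 1], repeat=n))

def expected_energy(edges, n, J=0.5):
    # E[sum_edges x_i x_j] by direct enumeration.
    Z = partition_function(edges, n, J)
    tot = 0.0
    for x in itertools.product([-1, 1], repeat=n):
        s = sum(x[i] * x[j] for i, j in edges)
        tot += s * math.exp(J * s)
    return tot / Z

edges = [(0, 1), (1, 2), (2, 0)]  # 3-node cycle
Z = partition_function(edges, 3)

# Cumulant property: d(log Z)/dJ equals E[sum_edges x_i x_j].
eps = 1e-6
dlogZ = (math.log(partition_function(edges, 3, 0.5 + eps))
         - math.log(partition_function(edges, 3, 0.5 - eps))) / (2 * eps)
# dlogZ and expected_energy(edges, 3) agree to finite-difference accuracy.
```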
What do you want to learn about?
Graphical Models
(Three graphs on the variables x1, …, x5 illustrate the main formalisms.)
Directed Bayesian Network
Factor Graph
Undirected Graphical Model
Exact Inference
MESSAGES: sum-product or belief propagation (BP) algorithm
Computational cost, with N nodes and K discrete states for each node:
• Brute force: O(K^N)
• Belief propagation: O(N K^2) on trees
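A minimal sum-product sketch on a three-node chain, with toy random potentials, checked against brute-force enumeration:

```python
import numpy as np

# Sum-product on a 3-node chain: p(x) ∝ psi12(x1, x2) psi23(x2, x3).
# BP costs O(N K^2); the brute-force check below costs O(K^N).
K = 4
rng = np.random.default_rng(0)
psi12 = rng.random((K, K)) + 0.1   # toy pairwise potentials
psi23 = rng.random((K, K)) + 0.1

# Messages into node 2, via matrix-vector products.
m1to2 = psi12.T @ np.ones(K)       # sum_x1 psi12(x1, x2)
m3to2 = psi23 @ np.ones(K)         # sum_x3 psi23(x2, x3)
belief2 = m1to2 * m3to2
belief2 /= belief2.sum()           # marginal p(x2)

# Brute-force check over all K^3 joint states.
joint = psi12[:, :, None] * psi23[None, :, :]
brute = joint.sum(axis=(0, 2))
brute /= brute.sum()
assert np.allclose(belief2, brute)
```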
Continuous Variables
Discrete State Variables
Messages are finite vectors
Updated via matrix-vector products
Gaussian State Variables
Messages are mean & covariance
Updated via information Kalman filter
Continuous Non-Gaussian State Variables
Closed parametric forms unavailable
Discretization can be intractable even with 2- or 3-dimensional states
Variational Inference: An Example
• Choose a tractable family of approximating distributions. The simplest example: fully factorized (mean field) distributions
• Define a distance to measure the quality of different approximations. One possibility: the Kullback-Leibler (KL) divergence
• Find the approximation minimizing this distance
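The three steps above, sketched for a toy two-variable discrete distribution, with a fully factorized q(x1) q(x2) and coordinate updates that minimize KL(q ‖ p):

```python
import numpy as np

# Mean-field sketch: approximate a toy p(x1, x2) by q(x1) q(x2),
# minimizing KL(q || p) by coordinate ascent (all numbers invented).
rng = np.random.default_rng(1)
p = rng.random((3, 3)) + 0.1
p /= p.sum()
logp = np.log(p)

q1 = np.full(3, 1.0 / 3.0)
q2 = np.full(3, 1.0 / 3.0)
for _ in range(50):
    q1 = np.exp(logp @ q2)        # log q1 = E_q2[log p] + const
    q1 /= q1.sum()
    q2 = np.exp(logp.T @ q1)      # log q2 = E_q1[log p] + const
    q2 /= q2.sum()

def kl(a, b, p):
    # KL divergence between factorized q = a ⊗ b and the true p.
    q = np.outer(a, b)
    return float((q * (np.log(q) - np.log(p))).sum())
```

Each coordinate update can only decrease KL(q ‖ p), so the final divergence is no worse than the uniform initialization's.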
Advanced Variational Methods
• Exponential families
• Mean field methods: naïve and structured
• Variational EM for parameter estimation
• Loopy belief propagation (BP)
• Bethe and Kikuchi entropies
• Generalized BP, fractional BP
• Convex relaxations and bounds
• MAP estimation and linear programming
• ………
Markov Chain Monte Carlo
Metropolis-Hastings, Gibbs sampling, Rao-Blackwellization, …
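A minimal Gibbs sampling sketch for a bivariate Gaussian (correlation value invented, not from the slides):

```python
import math
import random

# Gibbs sampler for a bivariate Gaussian with unit variances and
# correlation rho: each conditional p(x | y) is N(rho * y, 1 - rho^2).
rho = 0.8
random.seed(0)
x = y = 0.0
samples = []
for t in range(20000):
    x = random.gauss(rho * y, math.sqrt(1 - rho ** 2))
    y = random.gauss(rho * x, math.sqrt(1 - rho ** 2))
    if t >= 1000:                 # discard burn-in
        samples.append((x, y))

n = len(samples)
corr = sum(a * b for a, b in samples) / n   # should approach rho
```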
Sequential Monte Carlo: Particle Filters, Condensation, Survival of the Fittest, …
Sample-based density estimate
Weight by observation likelihood
Resample & propagate by dynamics
• Nonparametric approximation to optimal BP estimates
• Represent messages and posteriors using a set of samples, found by simulation
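The three steps above can be sketched as a bootstrap particle filter for a hypothetical 1D random-walk model (all noise levels and observations invented):

```python
import numpy as np

# Bootstrap particle filter for a toy 1D random-walk state
# observed with Gaussian noise.
rng = np.random.default_rng(2)
P = 500
particles = rng.normal(0.0, 1.0, P)       # sample-based density estimate
observations = [0.5, 1.0, 1.5]

for y in observations:
    # 1. Weight by observation likelihood p(y | x) = N(y; x, 0.5^2).
    w = np.exp(-0.5 * ((y - particles) / 0.5) ** 2)
    w /= w.sum()
    # 2. Resample in proportion to the weights.
    particles = rng.choice(particles, size=P, p=w)
    # 3. Propagate by the dynamics x' = x + noise.
    particles = particles + rng.normal(0.0, 0.2, P)

estimate = particles.mean()               # tracks the drifting state
```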
Nonparametric Belief Propagation
Belief Propagation
• General graphs
• Discrete or Gaussian
Particle Filters
• Markov chains
• General potentials
Nonparametric BP
• General graphs
• General potentials
Nonparametric Bayes
Nonparametric ≠ No Parameters
• Model complexity grows as data are observed:
Small training sets give simple, robust predictions
Reduced sensitivity to prior assumptions
• Literature showing attractive asymptotic properties
Flexible but Tractable
• Leads to simple, effective computational methods
• Avoids challenging model selection issues
Dirichlet process mixture model
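The Dirichlet process behind such mixtures can be sketched via its stick-breaking construction (concentration parameter alpha chosen arbitrarily):

```python
import random

# Stick-breaking sketch of a Dirichlet process: draw v_k ~ Beta(1, alpha)
# and assign weight_k = v_k * prod_{j<k} (1 - v_j).  The number of
# sizable components grows with alpha, without fixing it in advance.
random.seed(0)
alpha = 2.0
remaining, weights = 1.0, []
while remaining > 1e-3:           # truncate once the stick is nearly gone
    v = random.betavariate(1.0, alpha)
    weights.append(remaining * v)
    remaining *= 1.0 - v

total = sum(weights)              # approaches 1 as the stick is exhausted
```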
Prereq: Intro Machine Learning
• Discrete, supervised: classification or categorization
• Discrete, unsupervised: clustering
• Continuous, supervised: regression
• Continuous, unsupervised: dimensionality reduction
• Bayesian and frequentist estimation
• Model selection, cross-validation, overfitting
• Expectation-Maximization (EM) algorithm
Textbook & Readings
• Variational tutorial by Wainwright and Jordan (2008)
• Background chapter of Prof. Sudderth’s thesis
• Many classic and contemporary research articles…
Grading
Class Participation: 30%
• Attend class and participate in discussions
• Prepare summary overview presentation, and lead class discussion, for ~2 papers (Prof. Sudderth will lecture 50% of the time)
• Upload comments about the assigned reading before each lecture (due at 9am)
Final Project: 70%
• Proposal: 1-2 pages, due in March (10%)
• Presentation: ~10 minutes, during finals week (10%)
• Conference-style technical report (50%)
Reading Comments
The Good: 1-2 sentences
• What is the most exciting or interesting model, idea, or technique described here? Why is it important?
• Don't just copy the abstract: what do you think?
The Bad: 1-2 sentences
• No method is perfect, and many are far from it!
• What is the biggest weakness of this model or approach?
• Problems could be a lack of empirical validation, missing theory, unacknowledged assumptions, …
The Ugly: 1-2 sentences
• Poorly written or unclear sections of the paper: terse explanations, steps you didn't follow, etc.
• What would you like to have explained in class?
Final Projects
• Propose a new family of graphical models suitable for a particular application, try baseline learning algorithms
• Propose, develop, and experimentally test an extension of some existing learning or inference algorithm
• Experimentally compare different models or algorithms on an interesting, novel dataset
• Survey the latest advances in a particular application area, or for a particular type of learning algorithm
• …
Key Requirements: Novelty, use of graphical models
Best case: application of course material to your own area of research
Administration
Mailing List: E-mail [email protected] with
• Your name
• Your CS account username
• Your department, major, and year
• Your experience in machine learning: if you took CS195-F in Fall 2009, just say so; otherwise, 1-2 sentences about previous exposure
Readings for Monday:
• Introductory chapters of Koller & Friedman; specific sections announced via e-mail
• No comments required for Monday's lecture