![Page 1: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/1.jpg)
Approximate Inference: Variational Inference
CMSC 678, UMBC
![Page 2: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/2.jpg)
Outline
Recap of graphical models & belief propagation
Posterior inference (Bayesian perspective)
Math: exponential family distributions
Variational Inference
  Basic Technique
  Example: Topic Models
![Page 3: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/3.jpg)
Recap from last time…
![Page 4: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/4.jpg)
Graphical Models
Directed Models (Bayesian networks)
$$p(x_1, x_2, x_3, \ldots, x_N) = \prod_i p(x_i \mid \pi(x_i))$$
Undirected Models (Markov random fields)
$$p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_C)$$
![Page 5: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/5.jpg)
Markov Blanket
The Markov blanket of a node x is its parents, children, and children's parents: the set of nodes needed to form the complete conditional for a variable x_i.
$$p(x_i \mid x_{j \neq i}) = \frac{p(x_1, \ldots, x_N)}{\int p(x_1, \ldots, x_N)\, dx_i}$$
factorization of graph:
$$= \frac{\prod_k p(x_k \mid \pi(x_k))}{\int \prod_k p(x_k \mid \pi(x_k))\, dx_i}$$
factor out terms not dependent on x_i:
$$= \frac{\prod_{k:\, k = i \text{ or } i \in \pi(x_k)} p(x_k \mid \pi(x_k))}{\int \prod_{k:\, k = i \text{ or } i \in \pi(x_k)} p(x_k \mid \pi(x_k))\, dx_i}$$
![Page 6: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/6.jpg)
Markov Random Fields with Factor Graph Notation
x: original pixel/state
y: observed (noisy) pixel/state
factor nodes are added according to maximal cliques
factor graphs are bipartite
Figure labels: variable, unary factor, binary factor
![Page 7: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/7.jpg)
Two Problems for Undirected Models
Finding the normalizer
$$Z = \sum_x \prod_c \psi_c(x_c)$$
Computing the marginals
$$Z_n(v) = \sum_{x :\, x_n = v} \prod_c \psi_c(x_c)$$
Sum over all variable combinations, with the x_n coordinate fixed
Example: 3 variables, fix the 2nd dimension
$$Z_2(v) = \sum_{x_1} \sum_{x_3} \prod_c \psi_c(x = (x_1, v, x_3))$$
Q: Why are these difficult?
A: Many different combinations
Belief propagation algorithms
• sum-product (forward-backward in HMMs)
• max-product/max-sum (Viterbi)
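To make the "many different combinations" point concrete, here is a minimal brute-force sketch of Z and Z_n(v). The toy model (three binary variables on a chain, with an illustrative pairwise potential `psi`) is an assumption made for this example and does not come from the slides.

```python
from itertools import product

# Hypothetical toy model: three binary variables on a chain x1 - x2 - x3,
# with one pairwise potential per edge (values chosen only for illustration).
def psi(a, b):
    return 2.0 if a == b else 1.0

def unnormalized(x):
    x1, x2, x3 = x
    return psi(x1, x2) * psi(x2, x3)

def normalizer():
    # Z: sum over every joint configuration (2^3 terms here, 2^N in general)
    return sum(unnormalized(x) for x in product([0, 1], repeat=3))

def unnormalized_marginal(n, v):
    # Z_n(v): same sum, restricted to configurations with coordinate n fixed to v
    return sum(unnormalized(x) for x in product([0, 1], repeat=3) if x[n] == v)

Z = normalizer()
Z2 = [unnormalized_marginal(1, v) for v in (0, 1)]  # index 1 is the 2nd variable
print(Z, Z2, [z / Z for z in Z2])
```

The number of terms doubles with every added binary variable, which is why exhaustive enumeration does not scale and message passing is used instead.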
![Page 8: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/8.jpg)
Sum-Product
From variables to factors
$$q_{n \to m}(x_n) = \prod_{m' \in M(n) \setminus m} r_{m' \to n}(x_n)$$
M(n): set of factors in which variable n participates
default value of 1 if empty product
From factors to variables
$$r_{m \to n}(x_n) = \sum_{\boldsymbol{w}_m \setminus n} f_m(\boldsymbol{w}_m) \prod_{n' \in N(m) \setminus n} q_{n' \to m}(x_{n'})$$
N(m): set of variables that the mth factor depends on
sum over configuration of variables for the mth factor, with variable n fixed
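The two message types can be traced by hand on a very small graph. The sketch below assumes a hypothetical chain with one unary factor `g` on x1 and a shared pairwise factor `pair`; the factor values are illustrative only.

```python
# Hypothetical chain factor graph: g -- x1 -- f_a -- x2 -- f_b -- x3
STATES = (0, 1)

def g(x1):
    # unary factor on x1 (illustrative values)
    return (1.0, 3.0)[x1]

def pair(a, b):
    # pairwise factor shared by f_a(x1, x2) and f_b(x2, x3)
    return 2.0 if a == b else 1.0

def marginal_x2():
    # variable -> factor: product of messages from the variable's other factors
    q_1_to_a = {x1: g(x1) for x1 in STATES}  # x1's only other factor is g
    q_3_to_b = {x3: 1.0 for x3 in STATES}    # leaf variable: empty product = 1
    # factor -> variable: sum out the factor's other variables
    r_a_to_2 = {x2: sum(pair(x1, x2) * q_1_to_a[x1] for x1 in STATES)
                for x2 in STATES}
    r_b_to_2 = {x2: sum(pair(x2, x3) * q_3_to_b[x3] for x3 in STATES)
                for x2 in STATES}
    # unnormalized marginal of x2: product of all incoming factor messages
    return [r_a_to_2[v] * r_b_to_2[v] for v in STATES]

Z2 = marginal_x2()
print(Z2, [z / sum(Z2) for z in Z2])
```

On a tree-structured graph such as this chain, the product of incoming messages matches the brute-force sum over all configurations with x2 fixed.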
![Page 9: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/9.jpg)
Outline
Recap of graphical models & belief propagation
Posterior inference (Bayesian perspective)
Math: exponential family distributions
Variational Inference
  Basic Technique
  Example: Topic Models
![Page 10: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/10.jpg)
Goal: Posterior Inference
Hyperparameters α
Unknown parameters Θ
Data
Likelihood model: p( data | Θ )
Goal: pα( Θ | data )
we’re going to be Bayesian (perform Bayesian inference)
![Page 11: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/11.jpg)
Posterior Classification vs. Posterior Inference
“Frequentist” methods: pα,w( y | data )
prior over labels (maybe), not weights
Bayesian methods: pα( Θ | data )
Θ includes weight parameters
![Page 12: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/12.jpg)
(Some) Learning Techniques
MAP/MLE: Point estimation, basic EM (what we’ve already covered)
Variational Inference: Functional Optimization (today)
Sampling/Monte Carlo (next class)
![Page 13: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/13.jpg)
Outline
Recap of graphical models & belief propagation
Posterior inference (Bayesian perspective)
Math: exponential family distributions
Variational Inference
  Basic Technique
  Example: Topic Models
![Page 14: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/14.jpg)
Exponential Family Form
Support function
• Formally necessary, in practice irrelevant
Distribution parameters
• Natural parameters
• Feature weights
Feature function(s)
• Sufficient statistics
Log-normalizer
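For reference, the four labelled pieces fit together as follows in the standard textbook notation. The symbol names h, η, f and A are assumptions chosen here and are not taken from the slides.

```latex
p_\eta(x) = h(x)\,\exp\!\left(\eta^{\top} f(x) - A(\eta)\right),
\qquad
A(\eta) = \log \int h(x)\,\exp\!\left(\eta^{\top} f(x)\right)\,dx
```

Here h is the support function, η the natural parameters (feature weights), f the feature functions (sufficient statistics), and A the log-normalizer that makes the density integrate to one.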
![Page 20: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/20.jpg)
Why? Capture Common Distributions
• Discrete (finite distributions)
• Dirichlet (distributions over (finite) distributions)
• Gaussian (figure: https://kanbanize.com/blog/wp-content/uploads/2014/07/Standard_deviation_diagram.png)
• Gamma, Exponential, Poisson, Negative-Binomial, Laplace, log-Normal, …
![Page 24: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/24.jpg)
Why? “Easy” Gradients
Gradient of the log-likelihood = observed feature counts − expected feature counts
Observed feature counts: count w.r.t. empirical distribution
Expected feature counts: count w.r.t. current model parameters
(we’ve already seen this with maxent models)
![Page 25: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/25.jpg)
Why? “Easy” Expectations
expectation of the sufficient statistics = gradient of the log normalizer
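This identity can be checked numerically on a small case. The sketch below assumes a Bernoulli written in natural-parameter form, with sufficient statistic f(x) = x and log-normalizer A(η) = log(1 + e^η); the finite-difference gradient is an approximation used only for illustration.

```python
import math

def log_normalizer(eta):
    # A(eta) for a Bernoulli in natural-parameter form: log(1 + e^eta)
    return math.log1p(math.exp(eta))

def expected_stat(eta):
    # E[f(x)] with f(x) = x, summing over the support {0, 1}
    return sum(x * math.exp(eta * x - log_normalizer(eta)) for x in (0, 1))

def grad_log_normalizer(eta, h=1e-6):
    # central finite-difference approximation to dA/d(eta)
    return (log_normalizer(eta + h) - log_normalizer(eta - h)) / (2 * h)

eta = 0.7
print(expected_stat(eta), grad_log_normalizer(eta))
```

Both quantities equal the sigmoid of η, so the expectation is obtained by differentiation rather than by summing or integrating over x.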
![Page 26: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/26.jpg)
Why? “Easy” Posterior Inference
p is the conjugate prior for q
Posterior p has same form as prior p
All exponential family models have a conjugate prior (in theory)

| Posterior | Likelihood | Prior |
| --- | --- | --- |
| Dirichlet (Beta) | Discrete (Bernoulli) | Dirichlet (Beta) |
| Normal | Normal (fixed var.) | Normal |
| Gamma | Exponential | Gamma |
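As a small illustration of the first table row, a Beta prior combined with Bernoulli observations gives a Beta posterior whose parameters are the prior parameters plus the observed counts. The function name and the example data below are hypothetical.

```python
def beta_bernoulli_posterior(a, b, observations):
    # Beta(a, b) prior + Bernoulli likelihood -> Beta(a + #ones, b + #zeros)
    ones = sum(observations)
    zeros = len(observations) - ones
    return a + ones, b + zeros

# Uniform Beta(1, 1) prior, then three ones and one zero are observed
post_a, post_b = beta_bernoulli_posterior(1.0, 1.0, [1, 0, 1, 1])
posterior_mean = post_a / (post_a + post_b)
print(post_a, post_b, posterior_mean)
```

No integration is needed: the posterior stays in the same family, so the update is just addition of counts.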
![Page 31: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/31.jpg)
Outline
Recap of graphical models & belief propagation
Posterior inference (Bayesian perspective)
Math: exponential family distributions
Variational Inference
  Basic Technique
  Example: Topic Models
![Page 34: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/34.jpg)
Variational Inference
Posterior p(θ | x): difficult to compute
q(θ): controlled by parameters λ; easy(ier) to compute
Minimize the “difference” between the two by changing λ
![Page 37: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/37.jpg)
Variational Inference: A Gradient-Based Optimization Technique
Set t = 0
Pick a starting value λ_t
Until converged:
1. Get value y_t = F(q(•; λ_t))
2. Get gradient g_t = F’(q(•; λ_t))
3. Get scaling factor ρ_t
4. Set λ_{t+1} = λ_t + ρ_t * g_t
5. Set t += 1
![Page 39: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/39.jpg)
Variational Inference: The Function to Optimize
Find the best distribution (calculus of variations)
p(θ | x): posterior of desired model (θ: parameters for desired model)
q(θ): any easy-to-compute distribution (λ: variational parameters for θ)
KL-Divergence (expectation)
$$D_{\mathrm{KL}}\big(q(\theta) \,\|\, p(\theta \mid x)\big) = \mathbb{E}_{q(\theta)}\left[\log \frac{q(\theta)}{p(\theta \mid x)}\right]$$
![Page 45: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/45.jpg)
Exponential Family Recap: “Easy” Expectations
Exponential Family Recap: “Easy” Posterior Inference
p is the conjugate prior for π
![Page 46: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/46.jpg)
Variational Inference
Find the best distribution
When p and q have the same exponential family form, the variational update for q(θ) is (often) computable in closed form
![Page 47: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/47.jpg)
Variational Inference: A Gradient-Based Optimization Technique
Set t = 0
Pick a starting value λ_t
Let F(q(•; λ_t)) = KL[q(•; λ_t) || p(•)]
Until converged:
1. Get value y_t = F(q(•; λ_t))
2. Get gradient g_t = F’(q(•; λ_t))
3. Get scaling factor ρ_t
4. Set λ_{t+1} = λ_t + ρ_t * g_t
5. Set t += 1
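The loop can be sketched on a case where everything is available in closed form. The example below assumes, purely for illustration, that both q and the target p are one-dimensional Gaussians, so the KL divergence and its gradient are exact; in real variational inference the posterior is not known like this. Because F is a KL divergence to be minimized, the sketch steps against the gradient (a minus sign in step 4), and the fixed step size `rho` is an assumed choice.

```python
import math

def kl_gauss(m, log_s, mu, sigma):
    # KL[N(m, s^2) || N(mu, sigma^2)] in closed form, with s = exp(log_s)
    s = math.exp(log_s)
    return math.log(sigma / s) + (s ** 2 + (m - mu) ** 2) / (2 * sigma ** 2) - 0.5

def grad_kl(m, log_s, mu, sigma):
    # gradient of the KL with respect to the variational parameters (m, log_s)
    s = math.exp(log_s)
    return (m - mu) / sigma ** 2, -1.0 + s ** 2 / sigma ** 2

def fit(mu, sigma, steps=500, rho=0.1):
    m, log_s = 0.0, 0.0                 # starting value lambda_0
    for _ in range(steps):              # fixed iteration budget instead of a convergence test
        g_m, g_s = grad_kl(m, log_s, mu, sigma)
        m -= rho * g_m                  # minus sign: we are minimizing the KL
        log_s -= rho * g_s
    return m, log_s

m, log_s = fit(mu=2.0, sigma=0.5)
print(m, math.exp(log_s), kl_gauss(m, log_s, 2.0, 0.5))
```

Optimizing log s instead of s keeps the standard deviation positive without adding a constraint.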
![Page 48: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/48.jpg)
Variational Inference: Maximization or Minimization?
![Page 49: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/49.jpg)
Evidence Lower Bound (ELBO)
$$\log p(x) = \log \int p(x, \theta)\, d\theta$$
$$= \log \int p(x, \theta)\, \frac{q(\theta)}{q(\theta)}\, d\theta$$
$$= \log \mathbb{E}_{q(\theta)}\left[\frac{p(x, \theta)}{q(\theta)}\right]$$
$$\geq \mathbb{E}_{q(\theta)}\left[\log p(x, \theta)\right] - \mathbb{E}_{q(\theta)}\left[\log q(\theta)\right] = \mathcal{L}(q)$$
The last step is Jensen's inequality: log is concave, so the log of an expectation is at least the expectation of the log.
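The bound can be checked numerically. The sketch below assumes a hypothetical discrete θ with three values and a made-up joint table p(x, θ) for one fixed observation x. It relies on the standard identity log p(x) = ℒ(q) + KL[q(θ) || p(θ | x)], so maximizing the ELBO and minimizing the KL divergence are the same problem.

```python
import math

# Hypothetical joint p(x, theta) for one fixed x and three values of theta
joint = [0.10, 0.25, 0.05]

def log_evidence(joint):
    # log p(x) = log sum_theta p(x, theta)
    return math.log(sum(joint))

def elbo(q, joint):
    # L(q) = E_q[log p(x, theta)] - E_q[log q(theta)]
    return sum(qi * (math.log(pi) - math.log(qi))
               for qi, pi in zip(q, joint) if qi > 0)

posterior = [p / sum(joint) for p in joint]  # exact p(theta | x)
uniform = [1.0 / 3.0] * 3                    # an arbitrary other choice of q

print(log_evidence(joint), elbo(uniform, joint), elbo(posterior, joint))
```

The bound is strict for the uniform q and tight when q equals the exact posterior.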
![Page 53: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/53.jpg)
Outline
Recap of graphical models & belief propagation
Posterior inference (Bayesian perspective)
Math: exponential family distributions
Variational Inference
  Basic Technique
  Example: Topic Models
![Page 54: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/54.jpg)
Bag-of-Items Models
Three people have been fatally shot, and five people, including a mayor, were seriously wounded as a result of a Shining Path attack today against a community in Junin department, central Peruvian mountain region. …
p( document ) = pφ,ω( unigram counts )
Unigram counts: Three: 1, people: 2, attack: 2, …
Global (corpus-level) parameters interact with local (document-level) parameters
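A bag-of-items view keeps only how often each item occurs. The sketch below counts unigrams in the visible excerpt only, using a simple letters-only tokenizer that is an assumption of this example, so its counts can differ from the slide's, which cover the full (elided) document.

```python
import re
from collections import Counter

excerpt = (
    "Three people have been fatally shot, and five people, including a mayor, "
    "were seriously wounded as a result of a Shining Path attack today against "
    "a community in Junin department, central Peruvian mountain region."
)

# Tokenize on runs of letters and discard word order entirely
counts = Counter(re.findall(r"[A-Za-z]+", excerpt))
print(counts["Three"], counts["people"], counts.most_common(3))
```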
![Page 56: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/56.jpg)
Latent Dirichlet Allocation (Blei et al., 2003)
Per-document (unigram) word counts ~ Multinomial (entry (i, j): count of word j in document i)
Per-document (latent) topic usage ~ Dirichlet
Per-topic word usage ~ Dirichlet (K topics)
(regularize/place priors)
![Page 66: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/66.jpg)
Variational Inference: LDirA
Topic usage, per-document (unigram) word counts, topic words
p: True model
$$\phi_k \sim \mathrm{Dirichlet}(\boldsymbol{\beta}), \qquad w^{(d,n)} \sim \mathrm{Discrete}(\phi_{z^{(d,n)}})$$
$$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha}), \qquad z^{(d,n)} \sim \mathrm{Discrete}(\theta^{(d)})$$
q: Mean-field approximation
$$\phi_k \sim \mathrm{Dirichlet}(\boldsymbol{\lambda}_k)$$
$$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\gamma}_d), \qquad z^{(d,n)} \sim \mathrm{Discrete}(\psi^{(d,n)})$$
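The true model p can be read as a recipe for generating documents. The sketch below follows that recipe using only the standard library; the function names, symmetric hyperparameter values, and corpus sizes are all assumptions made for illustration.

```python
import random

def sample_dirichlet(alpha, rng):
    # A Dirichlet draw is a vector of independent Gamma(alpha_k, 1) draws, normalized
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(num_docs, doc_len, vocab_size, num_topics,
                    alpha=0.5, beta=0.5, seed=0):
    rng = random.Random(seed)
    # phi_k ~ Dirichlet(beta): one word distribution per topic
    phi = [sample_dirichlet([beta] * vocab_size, rng) for _ in range(num_topics)]
    docs = []
    for _ in range(num_docs):
        # theta^(d) ~ Dirichlet(alpha): topic usage for this document
        theta = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        for _ in range(doc_len):
            z = rng.choices(range(num_topics), weights=theta)[0]   # z ~ Discrete(theta)
            w = rng.choices(range(vocab_size), weights=phi[z])[0]  # w ~ Discrete(phi_z)
            words.append(w)
        docs.append(words)
    return phi, docs

phi, docs = generate_corpus(num_docs=4, doc_len=10, vocab_size=6, num_topics=2)
print(docs[0])
```

In the mean-field q, each of φ_k, θ^(d) and z^(d,n) instead gets its own independent distribution with its own variational parameter.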
![Page 69: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/69.jpg)
Variational Inference: LDirA
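p: True model
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha})$, $z^{(d,n)} \sim \mathrm{Discrete}(\theta^{(d)})$
q: Mean-field approximation
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\gamma}_d)$, $z^{(d,n)} \sim \mathrm{Discrete}(\psi^{(d,n)})$
$\mathbb{E}_{q(\theta^{(d)})}\left[\log p(\theta^{(d)} \mid \alpha)\right]$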
![Page 70: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/70.jpg)
Variational Inference: LDirA
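p: True model
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha})$, $z^{(d,n)} \sim \mathrm{Discrete}(\theta^{(d)})$
q: Mean-field approximation
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\gamma}_d)$, $z^{(d,n)} \sim \mathrm{Discrete}(\psi^{(d,n)})$
$\mathbb{E}_{q(\theta^{(d)})}\left[\log p(\theta^{(d)} \mid \alpha)\right] = \mathbb{E}_{q(\theta^{(d)})}\left[(\alpha - 1)^T \log \theta^{(d)}\right] + C$
Exponential family form of the Dirichlet:
$p(\theta) = \dfrac{\Gamma\left(\sum_k \alpha_k\right)}{\prod_k \Gamma(\alpha_k)} \prod_k \theta_k^{\alpha_k - 1}$
params = $(\alpha_k - 1)_k$
suff. stats. = $(\log \theta_k)_k$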
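Taking the log of the Dirichlet density makes the constant C explicit. The lines below are a short worked step; the symbol A for the log normalizer matches its use later in the derivation, and writing it as a function of α rather than of the natural parameter α − 1 is a notational shortcut assumed here.

```latex
\log p(\theta \mid \alpha)
  = \sum_k (\alpha_k - 1) \log \theta_k - A(\alpha)
  = (\alpha - 1)^T \log \theta - A(\alpha),
\qquad
A(\alpha) = \sum_k \log \Gamma(\alpha_k) - \log \Gamma\Big(\sum_k \alpha_k\Big)
```

So C = −A(α), which depends on neither θ nor the variational parameters and can therefore move outside the expectation under q.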
![Page 71: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/71.jpg)
Variational Inference: LDirA
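p: True model
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha})$, $z^{(d,n)} \sim \mathrm{Discrete}(\theta^{(d)})$
q: Mean-field approximation
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\gamma}_d)$, $z^{(d,n)} \sim \mathrm{Discrete}(\psi^{(d,n)})$
$\mathbb{E}_{q(\theta^{(d)})}\left[\log p(\theta^{(d)} \mid \alpha)\right] = \mathbb{E}_{q(\theta^{(d)})}\left[(\alpha - 1)^T \log \theta^{(d)}\right] + C$
This is an expectation of the sufficient statistics of the q distribution:
params = $(\gamma_k - 1)_k$
suff. stats. = $(\log \theta_k)_k$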
![Page 72: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/72.jpg)
Variational Inference: LDirA
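p: True model
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha})$, $z^{(d,n)} \sim \mathrm{Discrete}(\theta^{(d)})$
q: Mean-field approximation
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\gamma}_d)$, $z^{(d,n)} \sim \mathrm{Discrete}(\psi^{(d,n)})$
$\mathbb{E}_{q(\theta^{(d)})}\left[\log p(\theta^{(d)} \mid \alpha)\right] = \mathbb{E}_{q(\theta^{(d)})}\left[(\alpha - 1)^T \log \theta^{(d)}\right] + C = (\alpha - 1)^T\, \mathbb{E}_{q(\theta^{(d)})}\left[\log \theta^{(d)}\right] + C$
The expectation of the sufficient statistics is the gradient of the log normalizer.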
![Page 73: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/73.jpg)
Variational Inference: LDirA
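p: True model
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha})$, $z^{(d,n)} \sim \mathrm{Discrete}(\theta^{(d)})$
q: Mean-field approximation
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\gamma}_d)$, $z^{(d,n)} \sim \mathrm{Discrete}(\psi^{(d,n)})$
$\mathbb{E}_{q(\theta^{(d)})}\left[\log p(\theta^{(d)} \mid \alpha)\right] = \mathbb{E}_{q(\theta^{(d)})}\left[(\alpha - 1)^T \log \theta^{(d)}\right] + C = (\alpha - 1)^T\, \nabla_{\gamma_d} A(\gamma_d - 1) + C$
The expectation of the sufficient statistics is the gradient of the log normalizer.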
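The identity used in this step can be checked numerically. The sketch below compares a finite-difference gradient of the Dirichlet log normalizer with a Monte Carlo estimate of the expected log θ under q. It uses only the standard library; the value of γ_d, the sample count, and the helper names are assumptions chosen for the example.

```python
import math
import random

def log_normalizer(gamma):
    """A(gamma) = sum_k log Gamma(gamma_k) - log Gamma(sum_k gamma_k)."""
    return sum(math.lgamma(g) for g in gamma) - math.lgamma(sum(gamma))

def grad_log_normalizer(gamma, eps=1e-5):
    """Central finite-difference approximation to the gradient of A."""
    grad = []
    for k in range(len(gamma)):
        up = list(gamma)
        up[k] += eps
        down = list(gamma)
        down[k] -= eps
        grad.append((log_normalizer(up) - log_normalizer(down)) / (2 * eps))
    return grad

def monte_carlo_expected_log_theta(gamma, num_samples=20000, seed=0):
    """Estimate E_q[log theta_k] for theta ~ Dirichlet(gamma) by sampling."""
    rng = random.Random(seed)
    totals = [0.0] * len(gamma)
    for _ in range(num_samples):
        # Normalized Gamma(gamma_k, 1) draws give one Dirichlet(gamma) sample.
        draws = [rng.gammavariate(g, 1.0) for g in gamma]
        norm = sum(draws)
        for k, x in enumerate(draws):
            totals[k] += math.log(x / norm)
    return [total / num_samples for total in totals]

gamma_d = [2.0, 3.0, 5.0]  # assumed toy variational parameter
grad_A = grad_log_normalizer(gamma_d)
expected_log_theta = monte_carlo_expected_log_theta(gamma_d)
```

The two lists should agree up to Monte Carlo noise, which is the sense in which the expectation of the sufficient statistics equals the gradient of the log normalizer. Differentiating with respect to γ_d or with respect to the natural parameter γ_d − 1 gives the same vector, since the two differ by a constant shift.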
![Page 74: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/74.jpg)
Variational Inference: LDirA
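p: True model
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha})$, $z^{(d,n)} \sim \mathrm{Discrete}(\theta^{(d)})$
q: Mean-field approximation
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\gamma}_d)$, $z^{(d,n)} \sim \mathrm{Discrete}(\psi^{(d,n)})$
$\mathbb{E}_{q(\theta^{(d)})}\left[\log p(\theta^{(d)} \mid \alpha)\right] = (\alpha - 1)^T\, \nabla_{\gamma_d} A(\gamma_d - 1) + C$
$\mathcal{L}(\gamma_d) = (\alpha - 1)^T\, \nabla_{\gamma_d} A(\gamma_d - 1) + M(\gamma_d)$
There's more math to do!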
![Page 75: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/75.jpg)
Variational Inference: A Gradient-Based Optimization Technique
Set t = 0
Pick a starting value λ_t
Let F(q(•; λ_t)) = KL[q(•; λ_t) || p(•)]
Until converged:
1. Get value y_t = F(q(•; λ_t))
2. Get gradient g_t = F′(q(•; λ_t))
3. Get scaling factor ρ_t
4. Set λ_{t+1} = λ_t + ρ_t * g_t
5. Set t += 1
![Page 76: Approximate Inference: Variational InferenceVariational Inference CMSC 678 UMBC Outline Recap of graphical models & belief propagation Posterior inference (Bayesian perspective) Math:](https://reader034.vdocuments.us/reader034/viewer/2022042613/5f9e3ddedd399f0be92a03b5/html5/thumbnails/76.jpg)
Variational Inference: LDirA
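p: True model
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\alpha})$, $z^{(d,n)} \sim \mathrm{Discrete}(\theta^{(d)})$
q: Mean-field approximation
$\theta^{(d)} \sim \mathrm{Dirichlet}(\boldsymbol{\gamma}_d)$, $z^{(d,n)} \sim \mathrm{Discrete}(\psi^{(d,n)})$
$\mathcal{L}(\gamma_d) = (\alpha - 1)^T\, \nabla_{\gamma_d} A(\gamma_d - 1) + M(\gamma_d)$
$\nabla_{\gamma_d} \mathcal{L}(\gamma_d) = (\alpha - 1)^T\, \nabla^2_{\gamma_d} A(\gamma_d - 1) + \nabla_{\gamma_d} M(\gamma_d)$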
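For the Dirichlet, the derivatives of the log normalizer have closed forms, which makes the first term of this gradient concrete. The lines below are a sketch of that computation; Ψ and Ψ′ denote the digamma and trigamma functions (capitalized to avoid a clash with the variational parameter ψ), $\gamma_{d,k}$ is the k-th component of $\gamma_d$, and A is written as a function of $\gamma_d$ directly. The remaining term $\nabla_{\gamma_d} M(\gamma_d)$ is not expanded here because the slides leave M unspecified.

```latex
\frac{\partial A}{\partial \gamma_{d,k}}
  = \Psi(\gamma_{d,k}) - \Psi\Big(\sum_i \gamma_{d,i}\Big),
\qquad
\frac{\partial^2 A}{\partial \gamma_{d,j}\, \partial \gamma_{d,k}}
  = \delta_{jk}\, \Psi'(\gamma_{d,k}) - \Psi'\Big(\sum_i \gamma_{d,i}\Big)

\Big[ (\alpha - 1)^T \nabla^2_{\gamma_d} A \Big]_k
  = (\alpha_k - 1)\, \Psi'(\gamma_{d,k})
    - \Psi'\Big(\sum_i \gamma_{d,i}\Big) \sum_j (\alpha_j - 1)
```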