Poisson Gamma Dynamical Systems
Aaron Schein UMass Amherst
Mingyuan Zhou Univ. Texas at Austin
Hanna Wallach Microsoft Research
Joint work with:
Sequential multivariate count data
Days in 2003
Sequential multivariate count data
Days in 2003
bursty
Sequential multivariate count data
Days in 2003
bursty
Sequentially observed count vectors
It is everywhere.
Sequential multivariate count data
International relations
RUS
PRK
JAP
KOR
International relations
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
6/28
June 28, 2003
RUS
PRK
JAP
KOR
Sequentially observed count vectors
6/28
32
2
0
0
0
5
Sequentially observed count vectors
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
June 28, 2003
RUS
PRK
JAP
KOR
Sequentially observed count vectors
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
6/28
32
2
0
0
0
5
June 29, 2003
RUS
PRK
JAP
KOR
40
0
0
0
0
0
6/29
Sequentially observed count vectors
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
6/28
32
2
0
0
0
5
June 30, 2003
RUS
PRK
JAP
KOR
40
0
0
0
0
0
6/29
7
0
2
5
6
10
6/30
Sequentially observed count vectors
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
6/28
32
2
0
0
0
5
July 1, 2003
RUS
PRK
JAP
KOR
40
0
0
0
0
0
6/29
7
0
2
5
6
10
6/30
3
0
0
0
0
13
7/1
Sequentially observed count vectorsJuly 2, 2003
RUS
PRK
JAP
KOR
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
6/28
32
2
0
0
0
5
40
0
0
0
0
0
6/29
7
0
2
5
6
10
6/30
3
0
0
0
0
13
7/1
3
0
21
0
0
0
7/2
Sequentially observed count vectors
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
6/28
32
2
0
0
0
5
40
0
0
0
0
0
6/29
7
0
2
5
6
10
6/30
3
0
0
0
0
13
7/1
3
0
21
0
0
0
7/2
T
V
Sequentially observed count vectors
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
6/28
32
2
0
0
0
5
40
0
0
0
0
0
6/29
7
0
2
5
6
10
6/30
3
0
0
0
0
13
7/1
3
0
21
0
0
0
7/2
sparseV = 6,197
only 4% non-zero!
T
VT = 365
Sequentially observed count vectors
y(1), . . . ,y(T )
y(t) =0
2
0
0
1
7V
Sequentially observed count vectors
y(1), . . . ,y(T )
y(t) =0
2
0
0
1
7V
bursty
RUS-KOR
RUS-PRK
KOR-PRK
JAP-RUS
JAP-PRK
JAP-KOR
6/28
32
2
0
0
05
40
0
0
0
0
0
6/29
7
02
5
6
10
6/30
3
0
0
0
013
7/1
3
021
0
0
0
7/2
sparse
Especially for large V!
Modeling sequential count data
• Prediction (forecasting, smoothing) • Explanation • Exploration (interpretability!)
Many reasons to want a probabilistic model
Modeling sequential count data
• Autoregressive models
• Hidden Markov models
• Dynamic topic models
• Hawkes process models
Many probabilistic models for sequential counts
Modeling sequential count data
• Expressive
• Parsimonious
• Gaussian*
Linear dynamical systems
unnatural for count datamisspecified likelihood… …or non-conjugate
Modeling sequential count data
The case for naturalness
Natural models
• Statistically and computationally efficient • the right inductive bias constrains the hypothesis space • the right inductive bias supplements a lack of data • quantities of interest available in closed form
• Important for measurement
✓ likelihood matches the support of the data ✓ conjugate priors
Our contributions:• Novel generative model
• Matches the form of linear dynamical systems • Natural for count data
• Auxiliary variable-based MCMC inference
• Superior forecasting and smoothing performance
(not size of data matrix, like the Gaussian LDS)
Benefits of the model:• Inference scales with the number of non-zeros
• Highly interpretable latent structure
Poisson gamma dynamical systems
y(t)v ⇠ Pois
!
· · ·
(natural likelihood for counts)
Poisson gamma dynamical systems
y(t)v
how active event type v is in component k
KX
k=1
�kv✓(t)k⇠ Pois
!
Poisson gamma dynamical systems
y(t)v
how active component k is at time step t
KX
k=1
�kv✓(t)k⇠ Pois
!
Poisson gamma dynamical systems
y(t)v
KX
k=1
�kv✓(t)k
h i=
Poisson gamma dynamical systems
⌧0Gam
,
!
⇠✓(t)k
KX
k0=1
⇡kk0✓(t�1)k0⌧0 · · ·
(conjugate prior to Poisson)
Poisson gamma dynamical systems
⌧0Gam
,
!
⇠✓(t)k
how active component k’ is at time step t-1
KX
k0=1
⇡kk0✓(t�1)k0⌧0
Poisson gamma dynamical systems
⌧0Gam
,
!
⇠✓(t)k
KX
k0=1
⇡kk0✓(t�1)k0⌧0
the probability of transitioning from component k’ into k
KX
k=1
⇡kk0 = 1
Poisson gamma dynamical systems
⌧0Gam
,
!
⇠✓(t)k
KX
k0=1
⇡kk0✓(t�1)k0⌧0
concentration hyperparameter
Poisson gamma dynamical systems
⌧0✓(t)k
KX
k0=1
⇡kk0✓(t�1)k0⌧0h i
=
Poisson gamma dynamical systems
✓(t)k
KX
k0=1
⇡kk0✓(t�1)k0
h i=
Poisson gamma dynamical systems
h i= ⇧✓(t�1)✓(t)
h i=y(t) �✓(t)
(matches linear dynamical systems)
Poisson gamma dynamical systems
Inferred latent structure
International relations data from 2003
Inferred latent structure
�kv
The top event types for component k=1
Inferred latent structure
✓(1)k , . . . , ✓(T )k
�kv
Inferred latent structure
Inferred latent structure
Inferred latent structure
Inferred latent structure
Inferred latent structure
Inferred latent structure
Inferred latent structure
Inferred latent structure
Technical challenge LegendPoisson/MultinomialGamma/Dirichlet
⇧
✓(1)
y(1) y(2) y(3)
✓(2) ✓(3)
⌫
�
Technical challenge LegendPoisson/MultinomialGamma/Dirichlet
⇧
✓(1)
y(1) y(2) y(3)
✓(2) ✓(3)
⌫
�
� ⇠(conditional posterior has closed form)
P (⇧ |Y,⇥,⌫)⇧ ⇠
P (⇥ |Y,⇧,⌫)⇥ ⇠
P (� |Y )
Augment and Conquer LegendPoisson/MultinomialGamma/Dirichlet
⇧
✓(1)
y(1) y(2) y(3)
✓(2)✓(3)
⌫
�
Original model⇧
y(1) y(2) y(3)
⌫
l(4)
m(3)
l(3)
m(2)
l(2)
m(1)
l(1)
�
Alternative model
⇧ ⇠ P (⇧ | A, Y,⌫)
Three rules
m
�↵
l
m
�↵
l
✓
m
�↵
m
�↵
y
✓ l
y
✓ l
m
LegendPoisson/MultinomialGamma/Dirichlet
Three rules
y
✓ l
y
✓ l
m
LegendPoisson/MultinomialGamma/Dirichlet
Relationship between Poisson and Multinomial
Steel (1953)
Three rules
m
�↵
l
m
�↵
l
✓
m
�↵
m
�↵
y
✓ l
y
✓ l
m Poisson-multinomial relationship
LegendPoisson/MultinomialGamma/Dirichlet
Three rules
✓
m
�↵
m
�↵
LegendPoisson/MultinomialGamma/DirichletNegative binomial
Negative binomial as gamma-PoissonGreenwood & Yule (1920)
Three rules
m
�↵
l
m
�↵
l
✓
m
�↵
m
�↵
y
✓ l
y
✓ l
m Poisson-multinomial relationship
LegendPoisson/MultinomialGamma/Dirichlet
Negative binomial definition
Negative binomial
Three rules
m
�↵
l
m
�↵
l
LegendPoisson/MultinomialGamma/DirichletNegative binomial
P (l,m |↵,�)
Magic bivariate count distributionZhou & Carin (2012)
Three rules
m
�↵
l
m
�↵
l
LegendPoisson/MultinomialGamma/DirichletNegative binomial
P (l,m |↵,�)
Magic bivariate count distributionZhou & Carin (2012)
Three rules
m
�↵
l
m
�↵
l
LegendPoisson/MultinomialGamma/DirichletNegative binomial
P (l,m |↵,�)
Magic bivariate count distributionZhou & Carin (2012)
P (m |↵,�)=
Three rules
m
�↵
l
m
�↵
l
LegendPoisson/MultinomialGamma/DirichletNegative binomial
P (l,m |↵,�)
Magic bivariate count distribution
CRT
Zhou & Carin (2012)
P (m |↵,�)=
P (l |m,↵)
Three rules
m
�↵
l
m
�↵
l
LegendPoisson/MultinomialGamma/DirichletNegative binomial
P (l,m |↵,�)
Magic bivariate count distribution
CRT
Zhou & Carin (2012)
P (m |↵,�)=
P (l |m,↵) P (l |↵,�)=
Three rules
m
�↵
l
m
�↵
l
LegendPoisson/MultinomialGamma/DirichletNegative binomial
P (l,m |↵,�)
Magic bivariate count distribution
CRTSumLog
Zhou & Carin (2012)
P (m |↵,�)=
P (l |m,↵) P (l |↵,�)=
P (m | l,�)
Three rules
m
�↵
l
m
�↵
l
LegendPoisson/MultinomialGamma/DirichletNegative binomial
P (l,m |↵,�)
P (m | l,�)P (l |↵,�)=
P (l |m,↵)P (m |↵,�)=
Magic bivariate count distribution
CRTSumLog
Zhou & Carin (2012)
Three rules
m
�↵
l
m
�↵
l
✓
m
�↵
m
�↵
y
✓ l
y
✓ l
m Poisson-multinomial relationship
LegendPoisson/MultinomialGamma/Dirichlet
Negative binomial definition
Negative binomialCRTSumLog
Magic bivariate count distribution
Augment and Conquer
⇧
✓(1)
y(1) y(2) y(3)
✓(2) ✓(3)
⌫
�
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
Augment and Conquer
⇧
✓(1)
y(1) y(2) y(3)
✓(2) ✓(3)
⌫
l(4)
�
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
⇧
✓(1)
y(1) y(2) y(3)
✓(2)
⌫
l(4)
m(3)
✓(3)
Augment and Conquer
�
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
⇧
✓(1)
y(1) y(2) y(3)
✓(2)
⌫
l(4)
m(3)
Augment and Conquer
�
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
⇧
✓(1)
y(1) y(2) y(3)
✓(2)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
�
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
⇧
✓(1)
y(1) y(2) y(3)
✓(2)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
�
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
⇧
✓(1)
y(1) y(2) y(3)
✓(2)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
�
⇧
✓(1)
y(1) y(2) y(3)
✓(2)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
m(2)
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
�
⇧
✓(1)
y(1) y(2) y(3)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
m(2)
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
�
⇧
✓(1)
y(1) y(2) y(3)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
m(2)
l(2)
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
�
⇧
✓(1)
y(1) y(2) y(3)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
m(2)
l(2)
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
�
⇧
✓(1)
y(1) y(2) y(3)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
m(2)
l(2)
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
�
Augment and Conquer LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog⇧
✓(1)
y(1) y(2) y(3)
⌫
l(4)
m(3)
l(3)
m(2)
l(2)
m(1)
�
⇧
y(1) y(2) y(3)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
m(2)
l(2)
m(1)
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
�
⇧
y(1) y(2) y(3)
⌫
l(4)
m(3)
Augment and Conquer
l(3)
m(2)
l(2)
m(1)
l(1)
LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
�
Augment and Conquer LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog⇧
y(1) y(2) y(3)
⌫
l(4)
m(3)
l(3)
m(2)
l(2)
m(1)
l(1)
�
Augment and Conquer LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog⇧
y(1) y(2) y(3)
⌫
l(4)
m(3)
l(3)
m(2)
l(2)
m(1)
l(1)
�
⇧ ⇠ P (⇧ | A, Y,⌫)
Augment and Conquer LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog⇧
y(1) y(2) y(3)
⌫
l(4)
m(3)
l(3)
m(2)
l(2)
m(1)
l(1)
�
Augment and Conquer LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog⇧
✓(1)
y(1) y(2) y(3)
⌫
l(4)
m(3)
l(3)
m(2)
l(2)
m(1)
�
✓(1) ⇠ P (✓(1) | A, Y,⇧,⌫)
Augment and Conquer LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
✓(2) ⇠ P (✓(2) |✓(1),A, Y,⇧,⌫)
⇧
✓(1)
y(1) y(2) y(3)
✓(2)
⌫
l(4)
m(3)
l(3)
m(2)
�
Augment and Conquer LegendPoisson/MultinomialGamma/DirichletNegative binomialCRTSumLog
✓(3) ⇠ P (✓(3) |✓(2),✓(1),A, Y,⇧,⌫)
⇧
✓(1)
y(1) y(2) y(3)
✓(2)
⌫
l(4)
m(3)
✓(3)
�
Solution
Forward sampling ✓(t) ⇠ P (✓(t) |✓(t�1),A, Y,⇧,⌫)
Backward filtering A(t) ⇠ P (A(t) | A(t+1), Y,⇥,⇧,⌫)
Backward filtering forward sampling (BFFS)
Conclusion
Elegant inference in the natural model
and relies on a novel auxiliary variable scheme
that scales with the number of non-zeros
Come to poster #193 tonight!
https://github.com/aschein/pgds