![Page 1: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/1.jpg)
Shohei Shimizu
Shiga University / Osaka University, Japan
Launching department of data science in 2017!
A non-Gaussian approach for causal
structure learning in the presence of
hidden common causes
CRM Workshop Statistical Causal Inference and its Applications to Genetics
![Page 2: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/2.jpg)
Illustrating the problem
![Page 3: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/3.jpg)
Strong correlation between chocolate
consumption and number of Nobel
laureates (Messerli12NEJM)
[Figure: scatter plot, 2002-2011. x-axis: chocolate consumption (kg/yr/capita); y-axis: num. Nobel laureates per 10 million pop.]
Corr. 0.791
P-value < 0.001
![Page 4: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/4.jpg)
Does eating more chocolate increase
the number of Nobel laureates?
• Three candidate models (Messerli12NEJM; Maurage+13JNutrition)
[Figure: three candidate causal graphs over Chclt (chocolate) and Nobel, each involving GDP as a hidden common cause; the arrow directions did not survive extraction. Inset scatter plot of Nobel against Chocolate: Corr. 0.791, P-value < 0.001. Annotation: "Manage this gap!"]
![Page 5: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/5.jpg)
Divided into two parts
1. Estimation of causal direction with no temporal information being used
2. Coping with hidden common causes
[Figure: part 1 asks whether x1 → x2 or x1 ← x2; part 2 asks the same question when a hidden common cause f1 affects both x1 and x2. Candidate edges are labeled b21 and b12.]
Once a direction has been estimated, the connection strength $b_{21}$ or $b_{12}$ can be computed.
![Page 6: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/6.jpg)
Basic non-Gaussian model
with no hidden common cause
S. Shimizu, P. O. Hoyer, A. Hyvärinen
and A. Kerminen
Journal of Machine Learning Research
2006
[Figure: x1 → x2 or x1 ← x2?]
![Page 7: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/7.jpg)
Linear Non-Gaussian Acyclic
Model (LiNGAM) (Shimizu+06JMLR)
• Identifiable: causal directions and coefficients
• Various extensions including nonlinear (Hoyer+08NIPS,
Zhang+09UAI) and cyclic (Lacerda+08UAI) models
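$$x_i = \sum_{j \neq i} b_{ij} x_j + e_i$$
[Figure: example graph over x1, x2, x3 with edge coefficients b21, b23, b13 and error terms e1, e2, e3.]
• Linearity
• Acyclicity
• Non-Gaussian errors $e_i$
• Independence of errors $e_i$ (no hidden common causes)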
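As a minimal sketch of what these assumptions mean in practice, the following generates data from a three-variable LiNGAM. The causal order (x3 → x1 → x2, plus x3 → x2) is read off the coefficient labels b13, b21, b23; the coefficient values, the uniform error distribution, and the function name `simulate_lingam` are assumptions made for illustration only.

```python
import math
import random


def simulate_lingam(n, b13=0.5, b21=0.8, b23=-0.4, seed=0):
    """Draw n samples of (x1, x2, x3) from a three-variable LiNGAM.

    Assumed causal order: x3 -> x1 -> x2, plus x3 -> x2.
    The coefficient values are illustrative, not taken from the slides.
    """
    rng = random.Random(seed)
    half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has variance 1
    samples = []
    for _ in range(n):
        # Independent, non-Gaussian (uniform) errors.
        e1 = rng.uniform(-half_width, half_width)
        e2 = rng.uniform(-half_width, half_width)
        e3 = rng.uniform(-half_width, half_width)
        # Each variable is a linear function of its parents plus its own error.
        x3 = e3
        x1 = b13 * x3 + e1
        x2 = b21 * x1 + b23 * x3 + e2
        samples.append((x1, x2, x3))
    return samples


data = simulate_lingam(1000)
print(data[0])
```

Acyclicity is what allows the variables to be generated one after another in causal order; independence of the errors is the part that hidden common causes would break.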
![Page 8: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/8.jpg)
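Different directions give different data distributions
Model 1:
$$x_1 = e_1, \qquad x_2 = 0.8\,x_1 + e_2$$
Model 2:
$$x_1 = 0.8\,x_2 + e_1, \qquad x_2 = e_2$$
$$E(e_1) = E(e_2) = 0, \qquad \operatorname{var}(x_1) = \operatorname{var}(x_2) = 1$$
[Figure: scatter plots of (x1, x2) under Model 1 and Model 2, for Gaussian errors and for non-Gaussian (ex. uniform) errors.]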
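The asymmetry can be checked numerically. The sketch below simulates x1 = e1, x2 = 0.8 x1 + e2 with unit variances, regresses in both directions, and correlates the squared residual with the squared regressor. This statistic is only an illustrative dependence check chosen here, not the estimator of Shimizu+06JMLR; the helper names and the sample size are assumptions.

```python
import math
import random


def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)


def residual_dependence(regressor, target):
    """OLS of target on regressor, then corr(residual^2, regressor^2)."""
    n = len(regressor)
    mr, mt = sum(regressor) / n, sum(target) / n
    slope = (sum((r - mr) * (t - mt) for r, t in zip(regressor, target))
             / sum((r - mr) ** 2 for r in regressor))
    resid = [(t - mt) - slope * (r - mr) for r, t in zip(regressor, target)]
    return corr([u * u for u in resid], [(r - mr) ** 2 for r in regressor])


def simulate_x1_causes_x2(n, gaussian, seed=0):
    """x1 = e1, x2 = 0.8 * x1 + e2, scaled so that var(x1) = var(x2) = 1."""
    rng = random.Random(seed)

    def noise(sd):
        if gaussian:
            return rng.gauss(0.0, sd)
        w = sd * math.sqrt(3.0)  # uniform on [-w, w] has standard deviation sd
        return rng.uniform(-w, w)

    x1 = [noise(1.0) for _ in range(n)]
    x2 = [0.8 * a + noise(0.6) for a in x1]  # var(e2) = 0.36
    return x1, x2


x1, x2 = simulate_x1_causes_x2(20000, gaussian=False)
forward = residual_dependence(x1, x2)    # regress effect on cause
backward = residual_dependence(x2, x1)   # regress cause on effect

g1, g2 = simulate_x1_causes_x2(20000, gaussian=True)
gauss_forward = residual_dependence(g1, g2)
gauss_backward = residual_dependence(g2, g1)

print(forward, backward, gauss_forward, gauss_backward)
```

With uniform errors the residual is independent of the regressor only in the true direction, so `backward` is clearly non-zero while `forward` is near zero. With Gaussian errors both values are near zero, which is why the direction cannot be recovered in that case.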
![Page 9: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/9.jpg)
LiNGAM with hidden
common causes
P. O. Hoyer, S. Shimizu, A. Kerminen,
and M. Palviainen
Int. J. Approximate Reasoning
2008
[Figure: x1 → x2 or x1 ← x2?, with a hidden common cause f1 of x1 and x2.]
![Page 10: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/10.jpg)
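LiNGAM with hidden
common causes (Hoyer+08IJAR)
• Extension to incorporate non-Gaussian hidden
common causes $f_q$:
$$x_i = \sum_{q=1}^{Q} \lambda_{iq} f_q + \sum_{j \neq i} b_{ij} x_j + e_i$$
where $f_q$ $(q = 1, \dots, Q)$ are independent (without loss of generality)
Example with two observed variables:
$$x_1 = \sum_{q=1}^{Q} \lambda_{1q} f_q + e_1$$
$$x_2 = \sum_{q=1}^{Q} \lambda_{2q} f_q + b_{21} x_1 + e_2$$
[Figure: x1 → x2 with errors e1, e2 and hidden common causes f1, f2.]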
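To see why hidden common causes matter, here is a small sketch that simulates the two-variable model with a single hidden common cause (Q = 1) and fits a plain regression of x2 on x1. All numeric values, the uniform distributions, and the function names are illustrative assumptions.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate_confounded(n, b21=0.8, lam1=1.0, lam2=1.0, seed=0):
    """x1 = lam1 * f + e1, x2 = lam2 * f + b21 * x1 + e2 with one hidden f."""
    rng = random.Random(seed)
    w = 3 ** 0.5  # uniform on [-w, w] has variance 1
    x1, x2 = [], []
    for _ in range(n):
        f = rng.uniform(-w, w)   # hidden common cause, never observed
        e1 = rng.uniform(-w, w)
        e2 = rng.uniform(-w, w)
        a = lam1 * f + e1
        x1.append(a)
        x2.append(lam2 * f + b21 * a + e2)
    return x1, x2


x1, x2 = simulate_confounded(20000)
slope = ols_slope(x1, x2)
print(slope)
```

Under these settings the population regression slope is b21 + lam1 * lam2 * var(f) / (lam1^2 * var(f) + var(e1)) = 0.8 + 0.5 = 1.3, so ignoring the hidden common cause overstates the connection strength.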
![Page 11: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/11.jpg)
Our proposal:
A Bayesian LiNGAM
approach
S. Shimizu and K. Bollen.
Journal of Machine Learning Research,
2014
and something extra
![Page 12: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/12.jpg)
Key idea (1/2)
• Transform the model to a model with
no hidden common causes
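[Figure: left, LiNGAM with hidden common causes: x1 → x2 with coefficient b21, errors e1, e2 and hidden common causes f1, …, fQ. Right, LiNGAM with no hidden common causes but with possibly different intercepts over obs.: for each observation m, x1^(m) → x2^(m) with the same coefficient b21, errors e1^(m), e2^(m) and observation-specific intercepts μ1^(m), μ2^(m).]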
![Page 13: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/13.jpg)
Key idea (2/2)
• Include the sums of hidden common causes as
the model parameters, i.e., observation-specific
intercepts:
m-th obs.:
$$x_2^{(m)} = \underbrace{\sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)}}_{\text{observation-specific intercept } \mu_2^{(m)}} + b_{21} x_1^{(m)} + e_2^{(m)}$$
• Hidden common causes are not modeled explicitly
– No need to specify the number of hidden
common causes Q or to estimate the coefficients $\lambda_{2q}$
![Page 14: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/14.jpg)
Bayesian model selection
• Compare the marginal likelihoods of the two models, with the data standardized
Model 3 ($x_1 \to x_2$):
$$x_1^{(m)} = \mu_1^{(m)} + e_1^{(m)}$$
$$x_2^{(m)} = \mu_2^{(m)} + b_{21} x_1^{(m)} + e_2^{(m)}$$
Model 4 ($x_1 \leftarrow x_2$):
$$x_1^{(m)} = \mu_1^{(m)} + b_{12} x_2^{(m)} + e_1^{(m)}$$
$$x_2^{(m)} = \mu_2^{(m)} + e_2^{(m)}$$
• Many obs.-specific intercepts $\mu_i^{(m)}$ $(i = 1, 2;\ m = 1, \dots, n)$
– Similar to mixed models and multi-level models
– Informative prior
• Model $p(e_i)$ by a generalized Gaussian with a shape
parameter (hyperparameter selection: empirical Bayes)
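For reference, a generalized Gaussian error density can be sketched as follows. The parameterization (a shape parameter plus a standard deviation) is one common convention and is an assumption here; the paper's exact parameterization may differ. The sketch covers only the error density, not the marginal likelihood, which additionally requires integrating over the intercepts and the other parameters.

```python
import math


def gen_gaussian_logpdf(e, shape, sd=1.0):
    """Log-density of a zero-mean generalized Gaussian with standard deviation sd.

    p(e) = shape / (2 * alpha * Gamma(1 / shape)) * exp(-(|e| / alpha) ** shape),
    where alpha is chosen so that the variance equals sd ** 2.
    shape = 2 gives the Gaussian, shape = 1 the Laplace distribution.
    """
    alpha = sd * math.sqrt(math.gamma(1.0 / shape) / math.gamma(3.0 / shape))
    log_norm = math.log(shape) - math.log(2.0 * alpha) - math.lgamma(1.0 / shape)
    return log_norm - (abs(e) / alpha) ** shape


def log_likelihood(residuals, shape, sd=1.0):
    """Sum of log-densities over residuals treated as independent."""
    return sum(gen_gaussian_logpdf(r, shape, sd) for r in residuals)


# Hypothetical residuals, scored under a Gaussian and a flatter-than-Gaussian shape.
residuals = [0.1, -0.4, 1.2, -1.5, 0.9]
print(log_likelihood(residuals, shape=2.0), log_likelihood(residuals, shape=4.0))
```

Shapes below 2 give heavier tails than the Gaussian and shapes above 2 give lighter tails, so a single shape parameter covers both super- and sub-Gaussian errors.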
![Page 15: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/15.jpg)
Prior for the observation-specific
intercepts
• Motivation: Central limit theorem
– Sums of independent variables tend to be more
Gaussian
• Approximate the density by a bell-shaped
curve dist.
– Dependent due to hidden common causes
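$$\mu_1^{(m)} = \sum_{q=1}^{Q} \lambda_{1q} f_q^{(m)}, \qquad \mu_2^{(m)} = \sum_{q=1}^{Q} \lambda_{2q} f_q^{(m)}$$
$(\mu_1^{(m)}, \mu_2^{(m)}) \sim$ t-distribution with sd $\tau_1, \tau_2$, correlation $\sigma_{12}$, and DOF $\nu$ (here, 8)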
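A draw from such a prior can be sketched with the standard construction of a multivariate t variate as a correlated Gaussian divided by an independent chi-square mixing term. The function name, the parameter names (`tau1`, `tau2`, `corr12`, `dof`), and the numeric values are assumptions for illustration; this is not the paper's implementation.

```python
import math
import random


def sample_intercepts(n, tau1, tau2, corr12, dof=8, seed=0):
    """Draw n pairs (mu1, mu2) from a bivariate t-distribution.

    The pairs have standard deviations tau1 and tau2, correlation corr12,
    and dof degrees of freedom (dof > 2 so that the variance exists).
    """
    rng = random.Random(seed)
    # A t variate has variance dof / (dof - 2); rescale to the requested sd.
    rescale = math.sqrt((dof - 2) / dof)
    pairs = []
    for _ in range(n):
        # Correlated standard normals via a 2x2 Cholesky factor.
        z1 = rng.gauss(0.0, 1.0)
        z2 = corr12 * z1 + math.sqrt(1.0 - corr12 ** 2) * rng.gauss(0.0, 1.0)
        # Shared chi-square mixing variable: chi2(dof) = Gamma(dof / 2, scale 2).
        w = rng.gammavariate(dof / 2.0, 2.0)
        scale = rescale / math.sqrt(w / dof)
        pairs.append((tau1 * scale * z1, tau2 * scale * z2))
    return pairs


intercepts = sample_intercepts(5, tau1=1.0, tau2=0.5, corr12=0.6)
print(intercepts)
```

Sharing one mixing variable between the two coordinates is what makes the intercepts dependent even when the correlation is zero, which matches the note that they are dependent due to hidden common causes.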
![Page 16: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/16.jpg)
Experiment on artificial
datasets
[Figure: the two candidates to distinguish, x1 → x2 or x1 ← x2, each with hidden common causes f1, …, fQ and errors e1, e2.]
![Page 17: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/17.jpg)
Direction estimation
Total: 240 trials

Precisions

| N. obs | 2logBF>0 | 2logBF>2 | 2logBF>6 | 2logBF>10 |
|---|---|---|---|---|
| 50 | 0.62 | 0.63 | 0.70 | 0.59 |
| 100 | 0.64 | 0.68 | 0.72 | 0.84 |
| 200 | 0.66 | 0.69 | 0.74 | 0.81 |

N. Decisions

| N. obs | 2logBF>0 | 2logBF>2 | 2logBF>6 | 2logBF>10 |
|---|---|---|---|---|
| 50 | 240 | 163 | 60 | 17 |
| 100 | 240 | 194 | 118 | 62 |
| 200 | 240 | 213 | 153 | 105 |

Strong evidence (Kass & Raftery, 1995)
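The thresholds on 2logBF follow the evidence scale of Kass & Raftery (1995). A minimal sketch of how a decision could be derived from two log marginal likelihoods is given below; the function names and the input values are hypothetical, and computing the marginal likelihoods themselves is not shown.

```python
def two_log_bayes_factor(log_ml_a, log_ml_b):
    """2 * log Bayes factor of model A against model B."""
    return 2.0 * (log_ml_a - log_ml_b)


def evidence_label(two_log_bf):
    """Kass & Raftery (1995) categories for the size of 2 * log BF."""
    strength = abs(two_log_bf)
    if strength > 10:
        return "very strong"
    if strength > 6:
        return "strong"
    if strength > 2:
        return "positive"
    return "not worth more than a bare mention"


def decide_direction(log_ml_x1_to_x2, log_ml_x2_to_x1, threshold=0.0):
    """Return the preferred direction, or None if the evidence is at or below threshold."""
    stat = two_log_bayes_factor(log_ml_x1_to_x2, log_ml_x2_to_x1)
    if abs(stat) <= threshold:
        return None  # no decision is made at this threshold
    return "x1 -> x2" if stat > 0 else "x1 <- x2"


# Hypothetical log marginal likelihoods for the two directions.
stat = two_log_bayes_factor(-100.0, -104.0)
print(stat, evidence_label(stat), decide_direction(-100.0, -104.0, threshold=6.0))
```

Raising the threshold trades the number of decisions for precision, which is the pattern the two tables report.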
![Page 18: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/18.jpg)
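Connection strength estimation
[Figure: scatter plot of estimated against true connection strengths (x-axis: True; y-axis: Estimated), with points marked as "Direction correctly estimated" or "Direction wrongly estimated".]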
![Page 19: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/19.jpg)
What should come next?
Identifiable models for continuous
and discrete variables
(with hidden common causes)
![Page 20: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/20.jpg)
LiNGAM + Logistic model?
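• Continuous effect and discrete cause:
$$x_i^{\mathrm{cntns}} = \sum_{q=1}^{Q} \lambda_{iq} f_q^{\mathrm{cntns}} + \sum_{r=1}^{R} \lambda_{ir} f_r^{\mathrm{dscrt}} + b_{il} x_l^{\mathrm{dscrt}} + e_i$$
• Discrete effect and continuous (discrete)
cause:
$$x_i^{\mathrm{dscrt}} = f_i\left(\{f_q^{\mathrm{cntns}}\}, \{f_r^{\mathrm{dscrt}}\}, x_j^{\mathrm{cntns}}\ (\text{or } x_j^{\mathrm{dscrt}}), e_i\right)$$
• $f_i$ satisfies:
$$x_i^{\mathrm{dscrt}} \sim \mathrm{logistic}\left(\sum_{q=1}^{Q} \lambda_{iq} f_q^{\mathrm{cntns}} + \sum_{r=1}^{R} \lambda_{ir} f_r^{\mathrm{dscrt}} + b_{ij} x_j^{\mathrm{cntns}} \text{ or } b_{ij} x_j^{\mathrm{dscrt}}\right)$$
• Difficulty: Not closed under marginalization? Prior?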
![Page 21: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/21.jpg)
Conclusion
![Page 22: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/22.jpg)
Conclusion
• Estimation of causal direction in the
presence of hidden common causes is a
major challenge in causal discovery
• Proposed a semi-parametric approach
– LiNGAM + mixed-model
• Open problem: Identifiable models for
continuous and discrete variables (and
simple estimation algorithms for the
models)
![Page 23: A non-Gaussian approach for causal structure learning in ... · Estimation of causal effects using linear non-gaussian causal models with hidden variables. International Journal of](https://reader034.vdocuments.us/reader034/viewer/2022042920/5f64c1530075480c2d320011/html5/thumbnails/23.jpg)
References
• S. Shimizu, P. O. Hoyer, A. Hyvärinen and A. Kerminen. A linear
non-gaussian acyclic model for causal discovery. Journal of
Machine Learning Research, 7(Oct): 2003--2030, 2006.
• P. O. Hoyer, S. Shimizu, A. Kerminen and M. Palviainen.
Estimation of causal effects using linear non-gaussian causal
models with hidden variables. International Journal of
Approximate Reasoning, 49(2): 362--378, 2008.
• S. Shimizu and K. Bollen. Bayesian estimation of causal direction
in acyclic structural equation models with individual-specific
confounder variables and non-Gaussian distributions. Journal
of Machine Learning Research, 15(Aug): 2629--2652, 2014.
• A collection of related papers:
https://sites.google.com/site/sshimizu06/home/lingampapers