TRANSCRIPT
J. Daunizeau
Institute of Empirical Research in Economics, Zurich, Switzerland
Brain and Spine Institute, Paris, France
Bayesian inference
Overview of the talk
1 Probabilistic modelling and representation of uncertainty
1.1 Bayesian paradigm
1.2 Hierarchical models
1.3 Frequentist versus Bayesian inference
2 Numerical Bayesian inference methods
2.1 Sampling methods
2.2 Variational methods (ReML, EM, VB)
3 SPM applications
3.1 aMRI segmentation
3.2 Decoding of brain images
3.3 Model-based fMRI analysis (with spatial priors)
3.4 Dynamic causal modelling
Degree of plausibility desiderata:
- should be represented using real numbers (D1)
- should conform with intuition (D2)
- should be consistent (D3)
Bayesian paradigm: probability theory basics
• normalization: ∫ p(x) dx = 1
• marginalization: p(x) = ∫ p(x, y) dy
• conditioning (Bayes rule): p(x|y) = p(x, y) / p(y) = p(y|x) p(x) / p(y)
Bayesian paradigm: deriving the likelihood function
- Model of data with unknown parameters: y = f(θ); e.g., GLM: f(θ) = Xθ
- But data are noisy: y = f(θ) + ε
- Assume noise/residuals are 'small': p(ε) ∝ exp(−ε²/(2σ²)), i.e., large deviations are rare (e.g., P(|ε| > 4) ≈ 0.05 for σ = 2)
→ Distribution of data, given fixed parameters (the likelihood): p(y|θ) ∝ exp(−(y − f(θ))²/(2σ²))
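As a minimal numerical sketch of the point above (a toy example assumed for illustration, not SPM code): under Gaussian noise, the likelihood of a GLM is just a squared-residual score, so the true parameters score higher than a wrong guess.

```python
import numpy as np

# Toy GLM y = X theta + eps with Gaussian noise (assumed example).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), np.linspace(0.0, 1.0, 50)])  # design matrix
theta_true = np.array([1.0, 2.0])
sigma = 0.5
y = X @ theta_true + sigma * rng.normal(size=50)               # noisy data

def log_likelihood(theta, y, X, sigma):
    """Gaussian log-likelihood ln p(y|theta), up to an additive constant."""
    resid = y - X @ theta
    return -0.5 * np.sum(resid**2) / sigma**2

# The generating parameters explain the data better than theta = 0.
print(log_likelihood(theta_true, y, X, sigma) >
      log_likelihood(np.array([0.0, 0.0]), y, X, sigma))       # True
```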
Bayesian paradigm: likelihood, priors and the model evidence
Likelihood: p(y|θ, m)
Prior: p(θ|m)
Bayes rule: p(θ|y, m) = p(y|θ, m) p(θ|m) / p(y|m)
→ together, likelihood and prior specify the generative model m
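Bayes rule can be carried out numerically on a grid: multiply likelihood by prior and normalize by the evidence. A minimal sketch, assuming a toy conjugate example (one observation y = θ + noise, Gaussian prior) whose exact posterior mean is known:

```python
import numpy as np

# Bayes rule on a grid: posterior = likelihood * prior / evidence.
theta = np.linspace(-5.0, 5.0, 1001)            # grid over the parameter
dtheta = theta[1] - theta[0]

prior = np.exp(-0.5 * theta**2 / 4.0)           # Gaussian prior, variance 4
prior /= prior.sum() * dtheta                   # normalize so that sum p(theta) dtheta = 1

y, sigma = 1.5, 1.0                             # one observation, y = theta + noise
likelihood = np.exp(-0.5 * (y - theta)**2 / sigma**2)

evidence = (likelihood * prior).sum() * dtheta  # p(y|m): marginalize theta out
posterior = likelihood * prior / evidence       # Bayes rule

post_mean = (theta * posterior).sum() * dtheta  # analytic answer: y * 4/(4+1) = 1.2
print(round(post_mean, 2))                      # 1.2
```

The evidence p(y|m) falls out of the same computation as the normalizing constant, which is why it is available "for free" whenever the posterior is computed exactly.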
Bayesian paradigm: forward and inverse problems
forward problem: the likelihood p(y|θ, m)
inverse problem: the posterior distribution p(θ|y, m)
Bayesian paradigm: model comparison
Principle of parsimony: « plurality should not be assumed without necessity »
Model evidence: p(y|m) = ∫ p(y|θ, m) p(θ|m) dθ
"Occam's razor": [figure: the model evidence p(y|m) plotted over the space of all data sets; a flexible model spreads its evidence over many possible data sets, so a simpler model that fits the observed data attains the higher evidence]
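The Occam's razor effect can be seen directly in the evidence integral. A sketch, assuming a toy setup (one data point, model flexibility encoded as prior width): when the data are simple, the wider (more flexible) prior dilutes the evidence.

```python
import numpy as np

# Model evidence p(y|m) by numerical marginalization over theta.
theta = np.linspace(-20.0, 20.0, 4001)
dt = theta[1] - theta[0]
sigma = 1.0
y = 0.5                             # a "simple" observation near zero

def evidence(prior_sd):
    """p(y|m) for a model whose flexibility is its prior sd over theta."""
    prior = np.exp(-0.5 * (theta / prior_sd)**2)
    prior /= prior.sum() * dt
    lik = np.exp(-0.5 * (y - theta)**2 / sigma**2) / np.sqrt(2*np.pi*sigma**2)
    return (lik * prior).sum() * dt

# The narrow-prior (parsimonious) model wins on simple data.
print(evidence(prior_sd=1.0) > evidence(prior_sd=10.0))  # True
```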
Hierarchical models: principle
[figure: a hierarchy of hidden causes generating the data; arrows encode causality]
Hierarchical models: directed acyclic graphs (DAGs)
Hierarchical models: univariate linear hierarchical model
[figure: prior densities and posterior densities at each level of the hierarchy]
Frequentist versus Bayesian inference: a (quick) note on hypothesis testing

classical SPM:
• define the null, e.g.: H0: θ = 0
• estimate parameters (obtain test stat.): t = t(Y)
• apply decision rule, i.e.: if P(t > t*|H0) ≤ α then reject H0 (using the null distribution p(t|H0))

Bayesian PPM:
• define the null, e.g.: H0: θ > 0
• invert model (obtain posterior pdf p(θ|y))
• apply decision rule, i.e.: if P(H0|y) ≥ α then accept H0
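The two decision rules can be put side by side on the same data summary. A sketch with assumed numbers (known noise sd, flat prior on θ), in which case the two quantities are exactly complementary:

```python
import math

# One sample of size n from N(theta, sigma^2); both analyses use ybar.
n, sigma = 10, 1.0
ybar = 0.8                          # sample mean (assumed value)
se = sigma / math.sqrt(n)           # standard error

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# classical test of H0: theta = 0
t = ybar / se
p_value = 1.0 - Phi(t)              # P(t >= observed | H0)

# Bayesian statement under a flat prior: theta | y ~ N(ybar, se^2)
p_positive = 1.0 - Phi((0.0 - ybar) / se)   # P(theta > 0 | y)

# With a flat prior: P(theta > 0 | y) = 1 - p_value, exactly.
print(round(p_value + p_positive, 6))       # 1.0
```

With informative priors the two numbers come apart, which is the point of the bilateral-test discussion that follows.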
Frequentist versus Bayesian inference: what about bilateral tests?
• define the null and the alternative hypothesis in terms of priors, e.g.:
H0: p(θ|H0) = 1 if θ = 0, 0 otherwise (a point mass at zero)
H1: p(θ|H1) = N(0, Σ)
• apply decision rule, i.e.: if P(H1|y) / P(H0|y) ≥ 1 then reject H0
[figure: the densities p(Y|H0) and p(Y|H1) over the space of all datasets]
• Savage-Dickey ratios (nested models, i.i.d. priors):
p(y|H0) / p(y|H1) = p(θ = 0|y, H1) / p(θ = 0|H1)
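The Savage-Dickey identity can be checked numerically. A sketch assuming the simplest Gaussian case (y ~ N(θ, σ²), prior θ ~ N(0, s²) under H1, θ = 0 under H0), where both sides of the identity have closed forms:

```python
import math

def npdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean)**2 / var) / math.sqrt(2 * math.pi * var)

y, sigma2, s2 = 1.3, 1.0, 4.0       # assumed values

# Left-hand side: ratio of model evidences (marginal under H1 is N(0, sigma2+s2)).
bf01_evidence = npdf(y, 0.0, sigma2) / npdf(y, 0.0, sigma2 + s2)

# Right-hand side: posterior vs prior density of theta at 0, under H1.
post_mean = y * s2 / (s2 + sigma2)
post_var = s2 * sigma2 / (s2 + sigma2)
bf01_savage_dickey = npdf(0.0, post_mean, post_var) / npdf(0.0, 0.0, s2)

print(abs(bf01_evidence - bf01_savage_dickey) < 1e-12)  # True
```

The identity makes the Bayes factor of a nested model available from the full model's posterior alone, with no separate inversion of H0.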
Sampling methods: MCMC example (Gibbs sampling)
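A minimal Gibbs sampler, using the standard textbook target (a bivariate Gaussian with correlation ρ, assumed here for illustration): each coordinate is redrawn from its full conditional in turn, and the resulting chain reproduces the joint distribution.

```python
import numpy as np

# Gibbs sampling from a bivariate Gaussian with unit variances and corr rho.
rng = np.random.default_rng(1)
rho = 0.8
n_samples = 20000

x = np.zeros(2)
samples = np.empty((n_samples, 2))
for i in range(n_samples):
    # full conditionals: x0 | x1 ~ N(rho*x1, 1-rho^2), and symmetrically
    x[0] = rng.normal(rho * x[1], np.sqrt(1 - rho**2))
    x[1] = rng.normal(rho * x[0], np.sqrt(1 - rho**2))
    samples[i] = x

# After discarding burn-in, the sample correlation is close to rho = 0.8.
corr = np.corrcoef(samples[2000:].T)[0, 1]
print(corr)
```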
Variational methods: VB / EM / ReML
→ VB: maximize the free energy F(q) w.r.t. the "variational" posterior q(θ) under some (e.g., mean field, Laplace) approximation
[figure: the mean-field marginals q(θ1) and q(θ2) approximate the exact marginals p(θ1|y, m) and p(θ2|y, m) of the joint posterior p(θ1, θ2|y, m)]
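The mean-field idea can be made concrete with the classic example of a Gaussian with unknown mean μ and precision τ (this is the standard textbook VB exercise, assumed here for illustration; it is not SPM's implementation). Factorize q(μ, τ) = q(μ) q(τ) and iterate the coordinate updates that maximize F(q):

```python
import numpy as np

# Mean-field VB for x_i ~ N(mu, 1/tau), conjugate priors:
# mu | tau ~ N(mu0, 1/(lam0*tau)),  tau ~ Gamma(a0, b0).
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.0, size=100)   # data: true mu=2, tau=1
n, xbar = x.size, x.mean()

mu0, lam0, a0, b0 = 0.0, 1e-3, 1e-3, 1e-3      # vague priors

E_tau = 1.0                                    # initial guess
for _ in range(50):
    # q(mu) = N(muN, 1/lamN)
    muN = (lam0 * mu0 + n * xbar) / (lam0 + n)
    lamN = (lam0 + n) * E_tau
    # q(tau) = Gamma(aN, bN), using expectations under q(mu)
    aN = a0 + 0.5 * (n + 1)
    bN = b0 + 0.5 * (np.sum((x - muN) ** 2) + n / lamN
                     + lam0 * ((muN - mu0) ** 2 + 1.0 / lamN))
    E_tau = aN / bN                            # feeds back into q(mu)

print(muN, 1.0 / E_tau)   # close to the sample mean and variance
```

Each update is analytic because of conjugacy; this is the same coordinate-ascent structure that EM and ReML instantiate as special cases.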
[figure: the SPM analysis pipeline: realignment, smoothing, normalisation (to a template), general linear model, and statistical inference via Gaussian field theory (p < 0.05); Bayesian counterparts: segmentation and normalisation, posterior probability maps (PPMs), multivariate decoding, dynamic causal modelling]
aMRI segmentation: mixture of Gaussians (MoG) model
[figure: tissue classes: grey matter, white matter, CSF]
The i-th voxel value yi is drawn from the class indexed by its label ci ∈ {1, …, k}; the model parameters are the class frequencies λ1, …, λk, the class means μ1, …, μk and the class variances σ1², …, σk².
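A 1-D toy analogue of this model (two assumed classes standing in for tissue types; SPM's actual segmentation is richer) can be fit with EM: responsibilities in the E-step, then re-estimation of frequencies, means and variances in the M-step.

```python
import numpy as np

# Toy "voxel intensities" from two Gaussian classes.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(0.0, 1.0, 300),    # class 1
                    rng.normal(5.0, 1.0, 700)])   # class 2

k = 2
lam = np.full(k, 1.0 / k)          # class frequencies
mu = np.array([-1.0, 1.0])         # class means (rough init)
var = np.ones(k)                   # class variances

for _ in range(100):
    # E-step: responsibilities p(c_i = j | y_i)
    dens = np.exp(-0.5 * (y[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    r = lam * dens
    r /= r.sum(axis=1, keepdims=True)
    # M-step: re-estimate frequencies, means and variances
    nj = r.sum(axis=0)
    lam = nj / y.size
    mu = (r * y[:, None]).sum(axis=0) / nj
    var = (r * (y[:, None] - mu) ** 2).sum(axis=0) / nj

print(np.sort(mu))   # close to the true class means (0 and 5)
```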
Decoding of brain images: recognizing brain states from fMRI
[figure: paradigm: fixation cross (+), paced response (>>)]
[figure: log-evidence of X-Y sparse mappings: effect of lateralization; log-evidence of X-Y bilateral mappings: effect of spatial deployment]
fMRI time series analysis: spatial priors and model comparison
[figure: hierarchical model for the fMRI time series: GLM coefficients with a prior variance, a prior variance of the data noise, and AR coefficients (correlated noise); two design matrices (X): short-term memory and long-term memory]
PPM: regions best explained by the short-term memory model
PPM: regions best explained by the long-term memory model
Dynamic Causal Modelling: network structure identification
[figure: four candidate models m1–m4; each is a network of V1, V5 and PPC, driven by stim and modulated by attention]
[figure: the models' marginal likelihoods ln p(y|m) (bar plot, scale 0–15)]
[figure: estimated effective synaptic strengths for the best model (m4): 1.25, 0.13, 0.46, 0.39, 0.26, 0.26, 0.10]
DCMs and DAGs: a note on causality
[figure: a cyclic network of three coupled nodes, unfolded over time steps t0, t0 + Δt, … , becomes a directed acyclic graph; inputs u drive the nodes]
Evolution equation: ẋ = f(x, u, θ)
Dynamic Causal Modelling: model comparison for group studies
[figure: differences in log-model evidences ln p(y|m1) − ln p(y|m2) across subjects]
fixed effect: assume all subjects correspond to the same model
random effect: assume different subjects might correspond to different models
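Under the fixed-effect assumption the group-level comparison is simple: subjects are conditionally independent, so the group log Bayes factor is the sum of subject-level log-evidence differences. A sketch with assumed (made-up) log-evidence values:

```python
import numpy as np

# ln p(y_s | m) for 5 subjects (rows) x 2 models (columns); assumed values.
log_ev = np.array([
    [-120.0, -123.1],
    [-98.5,  -97.9],
    [-110.2, -113.0],
    [-105.7, -108.4],
    [-99.9,  -101.2],
])

# Fixed effect: all subjects share one model, so log Bayes factors add.
group_log_bf = (log_ev[:, 0] - log_ev[:, 1]).sum()
print(round(group_log_bf, 1))  # 9.3, favouring m1
```

A random-effect analysis would instead treat the model identity as varying across subjects (e.g., estimating model frequencies in the population), which is more robust to outlier subjects.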
I thank you for your attention.
A note on statistical significance: lessons from the Neyman-Pearson lemma
• Neyman-Pearson lemma: the likelihood ratio (or Bayes factor) test
p(y|H1) / p(y|H0) ≥ u
is the most powerful test of size α = P( p(y|H1)/p(y|H0) ≥ u | H0 ) for testing the null.
[figure: ROC analysis (error I rate versus 1 − error II rate): MVB (Bayes factor), u = 1.09, power = 56%; CCA (F-statistic), F = 2.20, power = 20%]
• what is the threshold u above which the Bayes factor test yields an error I rate of 5%?
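That question can be answered by simulating the test statistic under the null. A sketch, assuming the simplest Gaussian case (H0: y ~ N(0,1); H1: θ ~ N(0, s²), so marginally y ~ N(0, 1 + s²)): calibrate u on null simulations, then read off the power under H1.

```python
import numpy as np

rng = np.random.default_rng(0)
s2 = 4.0

def log_bf10(y):
    """ln [ p(y|H1) / p(y|H0) ] for the assumed Gaussian example."""
    return (-0.5 * y**2 / (1 + s2) - 0.5 * np.log(2*np.pi*(1 + s2))) \
         - (-0.5 * y**2 - 0.5 * np.log(2*np.pi))

# Calibrate u: 95th percentile of the statistic under H0 (error I rate 5%).
y0 = rng.normal(0.0, 1.0, size=200_000)
u = np.quantile(log_bf10(y0), 0.95)

# Check the achieved error I rate, and estimate the power under H1.
err1 = np.mean(log_bf10(y0) > u)
y1 = rng.normal(0.0, np.sqrt(1 + s2), size=200_000)
power = np.mean(log_bf10(y1) > u)
print(err1, power)   # error I rate close to 0.05 by construction
```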