TRANSCRIPT
Signal Processing for Functional Brain Imaging: General Linear Model (2)
Dimitri Van De Ville, Medical Image Processing Lab, EPFL/[email protected]
March 7, 2013
Overview

- GLM method (part 1, last week 28.02.13)
  - intuitive explanation
  - matrix algebra explanation
  - model generation, parameter estimation, hypothesis testing
- GLM method (part 2, today 07.03.13)
  - hypothesis testing continued: t-test and F-test
  - multiple comparisons
  - enriching the model
  - accounting for imaging artifacts, physiological noise
  - from single-subject to group-level analysis
From GLM fitting to hypothesis testing

Pipeline: get data → fit model → get effect size, get error → significant?

Null hypothesis H0: c^T β = 0. Then c^T β̂ is asymptotically normal,

    c^T β̂ ~ N(c^T β, σ² c^T (X^T X)^{-1} c),

and

    t = c^T β̂ / sqrt(σ̂² c^T (X^T X)^{-1} c)

follows a Student t-distribution with N − L degrees of freedom.

Example (single voxel): fitted parameters β̂ = [0.83, 0.16, 2.98]^T; contrast c^T = [1 0 0]; t = 6.42.
Hypothesis testing: t-test

- Null hypothesis expresses "no effect" (i.e., the true c^T β is 0): H0: c^T β = 0
- t = c^T β̂ / sqrt(σ̂² c^T (X^T X)^{-1} c) follows a Student t-distribution assuming H0
- Reject H0 if t ≥ T, where the α-level is the acceptable false-positive rate: α = P(t0 ≥ T) (one-sided t-test)
- The p-value indicates the assessment of t assuming H0: p = P(t0 ≥ t)
- Specificity: risk of false positives (type I errors); sensitivity: risk of false negatives (type II errors)
- Accepting/rejecting the null hypothesis controls specificity only
- Useful as "evidence of presence", not "evidence of absence" (neurosurgeon!)

[Table: decision outcomes — reject H0 vs. H0 true / H0 false]
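The t-test above can be sketched numerically. A minimal, illustrative example: the regressors, effect sizes, and noise level below are invented for the sketch, not the lecture's actual data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N = 100                                    # number of scans
task = (np.arange(N) // 10) % 2            # toy boxcar task regressor
drift = np.arange(N) / N                   # toy linear drift regressor
X = np.column_stack([task, drift, np.ones(N)])
L = X.shape[1]

y = X @ np.array([2.0, 0.5, 10.0]) + rng.normal(0, 1.0, N)

beta_hat = np.linalg.pinv(X) @ y           # least-squares estimate
e = y - X @ beta_hat
sigma2_hat = (e @ e) / (N - L)             # unbiased noise-variance estimate

c = np.array([1.0, 0.0, 0.0])              # contrast: task effect
t = (c @ beta_hat) / np.sqrt(sigma2_hat * (c @ np.linalg.inv(X.T @ X) @ c))
p = stats.t.sf(t, df=N - L)                # one-sided p-value under H0
print(t, p)
```

Note the degrees of freedom N − L: they enter both the variance estimate and the reference t-distribution.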
F-test: putting the same question...

- Fit the full model: β̂ = [0.83, 0.16, 2.98]^T, giving residual sum of squares e^T e
- Fit the reduced model: β̂0 = [0.25, 3.40]^T, giving residual sum of squares e0^T e0
- F = 41.2
Hypothesis testing: F-test

- Partitioning into two blocks of regressors
- Consider a reduced model given by design matrix X0:
    y = Xβ + e
    y0 = X0β0 + e0
- The null hypothesis H0 expresses "no improvement of X over X0"
- F = [(e0^T e0 − e^T e) / (L − L0)] / [e^T e / (N − L)] follows an F-distribution F(L − L0, N − L) assuming H0
- Reject H0 if F ≥ T, where the α-level is the acceptable false-positive rate
- Acts as a two-sided test if the reduced model removes one regressor
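A sketch of this full-vs-reduced comparison in code; the regressors and effect size are synthetic, chosen only to make the improvement of X over X0 visible.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
N = 60
x1 = rng.normal(size=N)                    # regressor kept in both models
x2 = rng.normal(size=N)                    # regressor tested by the F-test
X = np.column_stack([x1, x2, np.ones(N)])  # full model
X0 = np.column_stack([x1, np.ones(N)])     # reduced model
L, L0 = X.shape[1], X0.shape[1]

y = 1.0 * x1 + 0.8 * x2 + 5.0 + rng.normal(0, 1.0, N)

def rss(A, y):
    """Residual sum of squares e^T e after a least-squares fit of A."""
    e = y - A @ (np.linalg.pinv(A) @ y)
    return e @ e

ete, e0te0 = rss(X, y), rss(X0, y)
F = ((e0te0 - ete) / (L - L0)) / (ete / (N - L))
p = stats.f.sf(F, L - L0, N - L)           # H0: no improvement of X over X0
print(F, p)
```

Since the reduced model is nested in the full one, e0^T e0 ≥ e^T e always holds, so F is non-negative.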
... or putting more general questions

- Fit model: β̂ = [0.74, −0.71, 2.86]^T
- Fit reduced model: β̂0 = [2.87]
- F-contrast: C^T = [1 0 0; 0 1 0]
- F(2,57) = 60.8
Hypothesis testing: F-test (2)

- More flexible via a contrast matrix: the reduced model can be made up of linear combinations of regressors, avoiding reparametrization of the model
- F = [(e0^T e0 − e^T e) / (L − L0)] / [e^T e / (N − L)]
    = (y^T M y / y^T R y) · (N − L)/(L − L0)
    = (β̂^T X^T M X β̂ / y^T R y) · (N − L)/(L − L0) ~ F(L − L0, N − L)
- Model to remove (specified by C): Xc = XC
- Reduced model: X0 = XC0, where C0 = I_L − CC^+ (residual-forming of the contrast matrix)
- R = I_N − XX^+, R0 = I_N − X0X0^+, M = R0 − R
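The projector formulation can be sketched directly with pseudoinverses. This is an illustrative check (synthetic design matrix and data) that the F-value built from M and R agrees with the one built from explicit full/reduced residuals.

```python
import numpy as np

rng = np.random.default_rng(2)
N, L = 50, 3
X = np.column_stack([rng.normal(size=N), rng.normal(size=N), np.ones(N)])
y = X @ np.array([1.0, -0.5, 3.0]) + rng.normal(0, 1.0, N)

C = np.array([[1.0, 0.0, 0.0]]).T          # contrast: test the first regressor
C0 = np.eye(L) - C @ np.linalg.pinv(C)     # residual-forming of the contrast matrix
X0 = X @ C0                                # reduced model, no hand reparametrization
L0 = np.linalg.matrix_rank(X0)

R = np.eye(N) - X @ np.linalg.pinv(X)      # residual-forming projector of X
R0 = np.eye(N) - X0 @ np.linalg.pinv(X0)
M = R0 - R

F = (y @ M @ y) / (y @ R @ y) * (N - L) / (L - L0)

# cross-check against the explicit full/reduced residuals
e, e0 = R @ y, R0 @ y
F_direct = ((e0 @ e0 - e @ e) / (L - L0)) / ((e @ e) / (N - L))
print(F, F_direct)
```

Because y^T M y = e0^T e0 − e^T e by construction (M = R0 − R), the two computations coincide.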
Hypothesis testing: F-test (3)

[Diagram: the data y and design matrices X, X0, Xc with the projections Ry, R0y, and My; annotations: "can be computed efficiently", "only needs to be computed once"]
Use of F-test

- Interested in activation for "faces or objects"; some arrogant voxel is activating during "faces", deactivating during "objects"
  - t-contrast: c^T = [1 1 0]
  - F-contrast: C^T = [1 0 0; 0 1 0]
- Any difference between three conditions (like ANOVA)
  - F-contrast: C^T = [1 −1 0; 1 0 −1]
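A sketch of the "arrogant voxel": the two opposite effects cancel in the t-contrast [1 1 0], while the F-contrast tests them jointly. The condition timecourses and effect sizes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 120
faces = np.tile([1.0]*10 + [0.0]*20, 4)    # toy alternating conditions
objects = np.tile([0.0]*10 + [1.0]*10 + [0.0]*10, 4)
X = np.column_stack([faces, objects, np.ones(N)])
L = X.shape[1]

# +2 during faces, -2 during objects: opposite signs
y = 2.0 * faces - 2.0 * objects + rng.normal(0, 1.0, N)

beta = np.linalg.pinv(X) @ y
e = y - X @ beta
s2 = (e @ e) / (N - L)

c = np.array([1.0, 1.0, 0.0])              # t-contrast: sums the two effects away
t = (c @ beta) / np.sqrt(s2 * (c @ np.linalg.inv(X.T @ X) @ c))

# F-contrast on both condition regressors = compare with a constant-only model
X0 = np.ones((N, 1))
L0 = 1
e0 = y - X0 @ (np.linalg.pinv(X0) @ y)
F = ((e0 @ e0 - e @ e) / (L - L0)) / ((e @ e) / (N - L))
print(t, F)
```

The t-value hovers near zero while the F-value is large: the voxel is clearly modulated by the conditions even though the summed effect vanishes.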
Multiple comparisons

- Mass univariate testing (V = 10K–100K intracranial voxels)
- E[FP] = αV, so false positives should be controlled adequately!
- Family-wise error rate: α_FWE = P(∪_{k=1..V} {t0_k ≥ T})
- Bonferroni correction, assuming independent observations: α_FWE = 1 − (1 − α)^V ≈ αV; to obtain α_FWE, use α_FWE/V at the individual tests
- High specificity, low sensitivity since it neglects spatial correlation; therefore too conservative
- Can be applied locally (if an ROI is chosen a priori)
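A quick illustrative simulation of why E[FP] = αV is a problem and how Bonferroni controls it; the voxel count and experiment count are arbitrary, and the "voxels" are independent null tests.

```python
import numpy as np

rng = np.random.default_rng(4)
V = 10_000                                 # "voxels", all under H0
alpha = 0.05
n_experiments = 200

fwe_uncorrected = 0
fwe_bonferroni = 0
for _ in range(n_experiments):
    p = rng.uniform(size=V)                # null p-values are uniform on [0, 1]
    fwe_uncorrected += (p < alpha).any()   # expect alpha*V = 500 false positives
    fwe_bonferroni += (p < alpha / V).any()

print(fwe_uncorrected / n_experiments, fwe_bonferroni / n_experiments)
```

Uncorrected, essentially every experiment yields at least one false positive; with the α/V threshold the family-wise rate drops to roughly 1 − (1 − α/V)^V ≈ α.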
Glance at Gaussian random field theory

[Figure: a 1.8 mm × 1.8 mm voxel lattice smoothed with an FWHM = 6 mm kernel]

- Consider the contrast as a lattice representation of a continuous Gaussian random field
- Spatially smooth the data with a 3D Gaussian kernel, typical full-width-at-half-maximum (FWHM) about 6–12 mm
- Euler characteristic χ_T: topological measure, #blobs − #holes
- Assuming H0 and high T, we have
    P(∪_{k=1..V} {t_k > T}) = P(max_k t_k > T)
    = P(one or more blobs)
    ≈ P(χ_T ≥ 1)   (no holes)
    ≈ E[χ_T]       (one blob)
- Can be further approximated assuming (sufficient) spatial smoothness
Gaussian random field theory

Advantages:
- Increased sensitivity
- Decreased inter-subject variability (group studies!)

Limitations:
- Requires sufficient smoothness: typically FWHM of 3–4× the voxel size
- Smoothness needs to be estimated (bias if not sufficiently smooth)
- Several approximations in cascade (high T)
[Figure: thresholded activation maps at 5% corrected — SPM FWE vs. Bonferroni; VDV et al., IEEE JSTSP, 2008]
To correct or not to correct?
Model specification

- From stimuli to modeled BOLD response: blocks (epochs) and events
- Convolution is performed in microtime (see exercise)

[Figure: stimulus timecourse ⊗ hemodynamic response h(t; w) = modeled BOLD response]
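A minimal sketch of the stimulus ⊗ HRF convolution. The double-gamma shape below (peak near 5–6 s, undershoot near 16 s, ratio 1/6) uses common default parameters assumed here, not the lecture's values, and the microtime subsampling step is omitted for brevity.

```python
import numpy as np
from scipy.stats import gamma

dt = 0.1                                   # time step in seconds
t = np.arange(0, 30, dt)
# double-gamma HRF: positive peak minus a scaled late undershoot
hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0
hrf /= hrf.sum()                           # normalize to unit sum

stim = np.zeros(600)                       # 60 s timeline
stim[100:200] = 1.0                        # one 10 s block starting at 10 s
bold = np.convolve(stim, hrf)[:len(stim)]  # modeled BOLD response
print(np.argmax(bold) * dt)                # the peak lags the stimulus onset
```

The convolved regressor rises after stimulus onset, peaks several seconds late, and dips below baseline afterwards (the undershoot), which is exactly what the raw boxcar cannot capture.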
Enriching the model

- Hemodynamic variations: subject-dependent, regional changes, habituation and anticipation effects
- Approximate h(t + δt; w + δw), where w is the dispersion:
    h(t + δt; w + δw) ≈ h(t; w) + δt ∂h/∂t + δw ∂h/∂w
- More involved techniques use Volterra kernels
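The first-order expansion above amounts to adding temporal- and dispersion-derivative regressors to the design matrix. A toy sketch with a single-gamma HRF (the shape and the finite-difference step are assumptions for illustration):

```python
import numpy as np
from scipy.stats import gamma

dt = 0.1
t = np.arange(0, 30, dt)

def hrf(t, w=1.0):
    """Toy single-gamma HRF with a dispersion (width) parameter w."""
    return gamma.pdf(t / w, 6) / w

h = hrf(t)
dh_dt = np.gradient(h, dt)                     # temporal-derivative regressor
dh_dw = (hrf(t, 1.01) - hrf(t, 0.99)) / 0.02   # dispersion-derivative regressor

# a slightly shifted HRF is captured by the first-order expansion
delta_t = 0.5
h_shifted = hrf(t + delta_t)
approx = h + delta_t * dh_dt
err_naive = np.abs(h_shifted - h).max()        # ignoring the shift
err_taylor = np.abs(h_shifted - approx).max()  # first-order correction
print(err_naive, err_taylor)
```

In a GLM, the coefficients fitted to dh_dt and dh_dw absorb small per-voxel shifts δt and width changes δw without refitting the nonlinear HRF.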
Enriching the model (2)

- Low-frequency components
  - Truncated DCT basis, which acts like a high-pass filter
  - Scanner drifts, physiological fluctuations (aliased), intrinsic brain activity
- Add nuisance regressors
  - Realignment parameters
  - Spike regressors to "cancel" bad scans
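A sketch of the truncated DCT basis acting as a high-pass filter. The 128 s cutoff period is a commonly used default, assumed here rather than stated in the lecture.

```python
import numpy as np

N, TR = 200, 2.0                           # scans and repetition time (s)
cutoff = 128.0                             # assumed high-pass cutoff period (s)
K = int(2 * N * TR / cutoff) + 1           # number of DCT regressors (incl. constant)

n = np.arange(N)
dct = np.column_stack(
    [np.cos(np.pi * k * (2 * n + 1) / (2 * N)) for k in range(K)]
)  # DCT-II basis; k = 0 is the constant term, columns are orthogonal

# projecting out the DCT columns suppresses slow drifts
drift = np.linspace(0.0, 5.0, N)           # toy linear scanner drift
filtered = drift - dct @ (np.linalg.pinv(dct) @ drift)
print(np.std(drift), np.std(filtered))
```

Adding these columns to the design matrix is equivalent to high-pass filtering both data and model, so the drift variance is soaked up instead of inflating the residual.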
Parameter estimation (revisited) [Bullmore et al., 1996]

- GLM with correlated noise; introduce a filter S:
    y = Xβ + e, where e ~ N(0, σ² V)
    Sy = SXβ + Se = SXβ + e', where e' ~ N(0, σ² S V S^T)
- Normal equations: (SX)^T Sy = (SX)^T (SX) β
- Estimate: β̂ = (SX)^+ Sy
- Assume a known covariance matrix V = KK^T (e.g., a parametric noise model: 1/f noise, autoregressive model)
- Then the BLUE is obtained for S = K^{-1}
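A prewhitening sketch for a known AR(1) covariance; the AR coefficient and design are synthetic assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)
N, rho = 200, 0.4                          # assumed AR(1) coefficient
X = np.column_stack([rng.normal(size=N), np.ones(N)])

# AR(1) covariance V_ij = rho^|i-j| / (1 - rho^2), factored as V = K K^T
lags = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
V = rho ** lags / (1 - rho ** 2)
K = np.linalg.cholesky(V)

y = X @ np.array([2.0, 1.0]) + K @ rng.normal(0, 1.0, N)  # correlated noise

S = np.linalg.inv(K)                       # whitening filter S = K^{-1}
beta_gls = np.linalg.pinv(S @ X) @ (S @ y) # BLUE under V = K K^T
beta_ols = np.linalg.pinv(X) @ y           # ignores the correlation
print(beta_gls, beta_ols)
```

Whitening gives S V S^T = I, so the filtered model satisfies the ordinary GLM assumptions and the usual t/F machinery applies with the filtered residuals.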
From single-subject analysis...

- Analysis of (many) timecourses over time
... to group-level analysis

- Fixed effects analysis
  - Concatenate data and design matrices of subjects
  - Inference on the observed group
- Random effects analysis
  - Estimate the contrast of interest for individual subjects (1st level)
  - Enter the contrasts into a "basic model" and re-estimate (2nd level)
  - Inference on the population from which the group is sampled
... to group-level analysis (RFX)

- Analysis of (many) subjects
- One-sample t-test on each voxel's intensity across subjects: significant?
[Design matrices of "basic" 2nd-level models: one-sample t-test, two-sample t-test, paired t-test]
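The two-level RFX scheme for one voxel can be sketched in a few lines; the subject count, true effect, and between-subject variability below are invented for the sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n_subjects = 16
# 1st level: per-subject contrast estimates at one voxel (simulated
# around a true population effect of 1.5, between-subject sd 1.0)
con = 1.5 + rng.normal(0.0, 1.0, n_subjects)

# 2nd level: one-sample t-test of the contrasts against zero,
# i.e. a GLM whose design matrix is a single constant column
t, p_two_sided = stats.ttest_1samp(con, 0.0)
p = p_two_sided / 2 if t > 0 else 1 - p_two_sided / 2  # one-sided
print(t, p)
```

Because only one number per subject enters the second level, the test's variance reflects between-subject variability, which is what licenses inference about the population rather than the observed group.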
Conclusion

- General linear model
  - Many ways of testing the fitted parameters: t-test, F-test
  - Conceptually simple, yet powerful and flexible
  - Many tricks to "enrich" the model; generalizes "basic" models
- Multiple comparisons problem
  - Gaussian smoothing is state-of-the-art: degrades spatial resolution, improves sensitivity, reduces inter-subject variability
- Alternatives
  - FP rates (e.g., false discovery rate)
  - Spatial modeling (e.g., wavelets, ...)
  - Bayesian inference (alternative hypothesis made explicit)