non-parametric statistical testing with clusters - fieldtrip · 2019. 2. 5. · non-parametric...

http://www.fieldtriptoolbox.org

Non-parametricstatisticaltestingwithclusters

TzvetanPopov

CentralInstituteofMentalHealth,Mannheim,Germany


Talkoutline

InferentialstatisticsChannel-levelstatistics

parametricnon-parametricclustering

Source-levelstatistics


Inferentialparametricstatistics

YoumakeNobservationandwanttofindwhethersomehypothesisH1istrue

Step1:Gatheringdata

Observation Value0 2.51 -3.2

N 2.4

.

.

. coun

t

value



YoumakeNobservationandwanttofindwhethersomehypothesisH1istrue

Step2:Statisticaltesting

DetermineprobabilityoftunderH0

Iftsufficientlyunlikely,rejectH0


N 2.4

.

.

. coun

t

value

μ

σ

t = µ −µH 0σ N

μH0



{x1, x2, x3, x4, …} {y1, y2, y3, y4, …}

t-statistic

distribution of x

Observationsin condition 1:

Observationsin condition 2:

distribution of y


Parametricstatisticaltesting

YoumakeNobservationandwanttofindwhethersomehypothesisH1istrue.

Thefirstproblemisthatthisrequiresaknowndistribution oftheteststatistic.


N 2.4

.

.

. coun

t

value

μ

σσ


Thesecondproblemisthatofmultiplecomparisons

16*30time-frequencytiles,i.e.480comparisons.

t-testwithα =0.05(chanceoffalsealarmrate)

Chanceofonefalsealarmin480tests:99.99…%Or:480*0.05=24falsealarmsexpected


Themultiplecomparisonproblem

Whole-brainanalysis

306channels100timepoints50frequencies

1.530.000statisticaltests

5%chanceoffalsealarmineverytest76.500falsealarms


SolutionstocontroltheFWER

BonferronicorrectionUsethefalsediscoveryrateUseaMonteCarloapproximationofthe

randomizationdistributionofthemaximumstatisticcfg = [];cfg.method = ‘analytic’cfg.correctm = ‘bonferroni’ERPstats = ft_timelockstatistics(cfg, ERP);

cfg = [];cfg.method = ‘analytic’cfg.correctm = ‘fdr’ERPstats = ft_timelockstatistics(cfg, ERP);

cfg = [];cfg.method = ‘montecarlo’cfg.correctm = ‘max’ERPstats = ft_timelockstatistics(cfg, ERP);

cfg = [];cfg.method = ‘montecarlo’cfg.correctm = ‘max’TFRstats = ft_freqstatistics(cfg, TFR);


Randomizationtest:generalprinciple

- Independentvariable:condition- Dependentvariable:data

H0:thedataisindependentfromtheconditioninwhichitwasobserved

Thedatainthetwoconditionsisnot different


aa

a a

aa

bb

bb

b

b

analyze

analyze

difference

Randomizationapproach

Xorg

e.g. compute mean and variance

e.g. compute t-score


aa

b b

ab

ba

ba

a

b

analyze

analyze

difference


X1


b a

b a

ba

ab

ab

a

b

analyze

analyze

difference


X2


b a

b a

ba

ab

ab

a

b


X2


Distributionof“x”cantakeanyshape

2.5% 97.5% 2.5% 97.5% 2.5% 97.5%

Xorg


Non-parametricstatistics

Randomizationofindependentvariable

Hypothesisisaboutdata,notaboutthespecificparameter

Randomizationdistributionofthestatisticofinterest“x”isapproximatedusingMonte-Carloapproach

H0istestedbycomparingtheobservedstatisticagainsttherandomizationdistribution


Avoidthemultiplecomparisonproblem

Thestatistic“x”canbeanything

Ratherthantestingeverything,onlytestthemostextremeobservation(i.e.themaxstatistic)

Computetherandomizationdistributionforthemostextremestatistic

Notethatoftenwecomputetwo extrema,oneforeachtail

2.5% 97.5%

Xorg


Increasingthesensitivity

ConventionalisunivariateparametricOurapproachistoconsiderthedata

Manychannels,timepoints,frequenciesMassiveunivariateMultiplecomparisonproblem

Thereisquitesomestructureinthedata


Increasingthesensitivity

channel/time/frequencypointsarenotindependent

neighbouringchannel/time/frequencypointsareexpectedtoshowsimilarbehaviour

combineneighbouringsamplesintoclusters->“accumulatetheevidence”=cluster-basedstatistics

avoidtheMCPbycomparingthelargestobservedclusterversustherandomizationdistributionofthelargestclusters


Clusteringintime


Clusteringintimeandfrequency


channels

Clusteringintime,frequencyandspace


Toy example


Toy example: Original observation

Condition AS1_aS2_aS3_aS4_aS5_aS6_aS7_aS8_aS9_aS10_a

Condition BS1_bS2_bS3_bS4_bS5_bS6_bS7_bS8_bS9_bS10_b

null hypothesis: condition A = condition B


Toy example: 1st permutation

Condition AS1_aS2_bS3_aS4_aS5_bS6_bS7_aS8_aS9_aS10_b

Condition BS1_bS2_aS3_bS4_bS5_aS6_aS7_bS8_bS9_bS10_a



Toy example: 2nd permutation

Condition AS1_bS2_aS3_bS4_aS5_aS6_bS7_aS8_bS9_bS10_a

Condition BS1_aS2_bS3_aS4_bS5_bS6_aS7_bS8_aS9_aS10_b



Toy example: Original observation

statistic = 17 Observation 17



statistic = 13 Observation 17



statistic = 13 Observation 17Permutation 1 13


Toy example: 2nd permutation

statistic = 11Observation 17Permutation 1 13Permutation 2 11


Toy example: 3rd permutation

statistic = 12 Observation 17Permutation 1 13Permutation 2 11Permutation 3 12


Toy example: Nth permutation

statistic = … Observation 17Permutation 1 13Permutation 2 11Permutation 3 12


Assessthelikelihoodoftheobservedmaxclustersizegiventherandomizationdistribution

2.5% 97.5% 2.5% 97.5% 2.5% 97.5%

Xorg


Summarystatistics

Parametricstatisticaltestforallchannel-time-frequencypointsprobabilityforH0oneH0foreachchannel-time-frequencymultiplecomparisonproblem

Non-parametricapproachforestimatingprobabilityrandomizationorpermutationprobabilityofH0forarbitrarystatisticincorporatepriorknowledgeinstatisticavoidMCPusingmaxstatistic


Source-levelstatistics

Sameprinciplesaschannel-levelstatistics

BeamformingSwapdatabetweenconditions:usecommonfilters


Inferentialstatistics:parametric

x1, x2, x3, x4, … x1, x2, x3, x4, …

t-statistic

distribution of x


Inferentialstatistics:distributeddataatsourcelevel

x1, x2, x3, x4, … x1, x2, x3, x4, …[ ]

dim = [x, y, z]

dim = [x*y*z, 1]


Inferentialstatistics:parametric

x1, x2, x3, x4, … x1, x2, x3, x4, …[ ] t-statistic


Inferentialstatistics:permutationapproach

“s” statistic


Permutationdistributionof“s”cantakeanyshape

2.5% 97.5% 2.5% 97.5% 2.5% 97.5%

Sorg


Cluster-basedpermutationtestonsource-level

a-priori thresholdcluster neighbouring voxelscompute sum over cluster

2.5% 97.5%


Returningtobeamforming

one trial

one condition

t-map

H0: same distribution


Commonfiltersforbeamforming

W = ( GT C-1 G )-1 GT C-1

C = M MTP = X XT = W C WT

X(t) = W M(t)

Ci = Mi MiT trial 1, 2, 3, …

C = ( C1 + C2 + C3 + … )/n

M(t) = G X(t) + N

Pi = W Ci WT

^

^ ^


Commonfiltersforbeamforming

one trial

one condition

t-map

H0: same distribution


Summarystatisticsonsource-level

Sameprinciplesasforchannel-levelstatisticsAveragecovarianceoveralldataforspatialfilterestimateOnespatialfilterpervoxel

commontobothconditionssingle-trialestimates:simplemultiplication

Permutationtestnotaffectedexchangeabilityofdataoverconditionsdoesnot

changetheoptimalfilterunderH0computationallyfast


Generalsummary

Aformalhypothesiscanbetestedwithrandomizationtestcontrolthechanceoffalsepositivesreducethefalsenegativerate

Multiplecomparisonproblemonehypothesisperchannel-time-frequencyonehypothesisforalldata

Increasesensitivityusingclusterstocapturethestructureinthedata

non-parametric statistical testing with clusters - fieldtrip · 2019. 2. 5. · non-parametric...

Documents