non-parametric statistical testing with clusters - fieldtrip · 2019. 2. 5. · non-parametric...
TRANSCRIPT
-
http://www.fieldtriptoolbox.org
Non-parametricstatisticaltestingwithclusters
TzvetanPopov
CentralInstituteofMentalHealth,Mannheim,Germany
-
http://www.fieldtriptoolbox.org
Talkoutline
InferentialstatisticsChannel-levelstatistics
parametricnon-parametricclustering
Source-levelstatistics
-
http://www.fieldtriptoolbox.org
Inferentialparametricstatistics
YoumakeNobservationandwanttofindwhethersomehypothesisH1istrue
Step1:Gatheringdata
Observation Value0 2.51 -3.2
N 2.4
.
.
. coun
t
value
-
http://www.fieldtriptoolbox.org
Inferentialparametricstatistics
YoumakeNobservationandwanttofindwhethersomehypothesisH1istrue
Step2:Statisticaltesting
DetermineprobabilityoftunderH0
Iftsufficientlyunlikely,rejectH0
Observation Value0 2.51 -3.2
N 2.4
.
.
. coun
t
value
μ
σ
t = µ −µH 0σ N
μH0
-
http://www.fieldtriptoolbox.org
Inferentialparametricstatistics
{x1, x2, x3, x4, …} {y1, y2, y3, y4, …}
t-statistic
distribution of x
Observationsin condition 1:
Observationsin condition 2:
distribution of y
-
http://www.fieldtriptoolbox.org
Parametricstatisticaltesting
YoumakeNobservationandwanttofindwhethersomehypothesisH1istrue.
Thefirstproblemisthatthisrequiresaknowndistribution oftheteststatistic.
Observation Value0 2.51 -3.2
N 2.4
.
.
. coun
t
value
μ
σσ
-
http://www.fieldtriptoolbox.org
Thesecondproblemisthatofmultiplecomparisons
16*30time-frequencytiles,i.e.480comparisons.
t-testwithα =0.05(chanceoffalsealarmrate)
Chanceofonefalsealarmin480tests:99.99…%Or:480*0.05=24falsealarmsexpected
-
http://www.fieldtriptoolbox.org
Themultiplecomparisonproblem
Whole-brainanalysis
306channels100timepoints50frequencies
1.530.000statisticaltests
5%chanceoffalsealarmineverytest76.500falsealarms
-
http://www.fieldtriptoolbox.org
SolutionstocontroltheFWER
BonferronicorrectionUsethefalsediscoveryrateUseaMonteCarloapproximationofthe
randomizationdistributionofthemaximumstatisticcfg = [];cfg.method = ‘analytic’cfg.correctm = ‘bonferroni’ERPstats = ft_timelockstatistics(cfg, ERP);
cfg = [];cfg.method = ‘analytic’cfg.correctm = ‘fdr’ERPstats = ft_timelockstatistics(cfg, ERP);
cfg = [];cfg.method = ‘montecarlo’cfg.correctm = ‘max’ERPstats = ft_timelockstatistics(cfg, ERP);
cfg = [];cfg.method = ‘montecarlo’cfg.correctm = ‘max’TFRstats = ft_freqstatistics(cfg, TFR);
-
http://www.fieldtriptoolbox.org
Randomizationtest:generalprinciple
- Independentvariable:condition- Dependentvariable:data
H0:thedataisindependentfromtheconditioninwhichitwasobserved
Thedatainthetwoconditionsisnot different
-
http://www.fieldtriptoolbox.org
aa
a a
aa
bb
bb
b
b
analyze
analyze
difference
Randomizationapproach
Xorg
e.g. compute mean and variance
e.g. compute t-score
-
http://www.fieldtriptoolbox.org
aa
b b
ab
ba
ba
a
b
analyze
analyze
difference
Randomizationapproach
X1
-
http://www.fieldtriptoolbox.org
b a
b a
ba
ab
ab
a
b
analyze
analyze
difference
Randomizationapproach
X2
-
http://www.fieldtriptoolbox.org
b a
b a
ba
ab
ab
a
b
Randomizationapproach
X2
-
http://www.fieldtriptoolbox.org
Distributionof“x”cantakeanyshape
2.5% 97.5% 2.5% 97.5% 2.5% 97.5%
Xorg
-
http://www.fieldtriptoolbox.org
Non-parametricstatistics
Randomizationofindependentvariable
Hypothesisisaboutdata,notaboutthespecificparameter
Randomizationdistributionofthestatisticofinterest“x”isapproximatedusingMonte-Carloapproach
H0istestedbycomparingtheobservedstatisticagainsttherandomizationdistribution
-
http://www.fieldtriptoolbox.org
Avoidthemultiplecomparisonproblem
Thestatistic“x”canbeanything
Ratherthantestingeverything,onlytestthemostextremeobservation(i.e.themaxstatistic)
Computetherandomizationdistributionforthemostextremestatistic
Notethatoftenwecomputetwo extrema,oneforeachtail
2.5% 97.5%
Xorg
-
http://www.fieldtriptoolbox.org
Increasingthesensitivity
ConventionalisunivariateparametricOurapproachistoconsiderthedata
Manychannels,timepoints,frequenciesMassiveunivariateMultiplecomparisonproblem
Thereisquitesomestructureinthedata
-
http://www.fieldtriptoolbox.org
Increasingthesensitivity
channel/time/frequencypointsarenotindependent
neighbouringchannel/time/frequencypointsareexpectedtoshowsimilarbehaviour
combineneighbouringsamplesintoclusters->“accumulatetheevidence”=cluster-basedstatistics
avoidtheMCPbycomparingthelargestobservedclusterversustherandomizationdistributionofthelargestclusters
-
http://www.fieldtriptoolbox.org
Clusteringintime
-
http://www.fieldtriptoolbox.org
Clusteringintimeandfrequency
-
http://www.fieldtriptoolbox.org
channels
Clusteringintime,frequencyandspace
-
http://www.fieldtriptoolbox.org
Toy example
-
http://www.fieldtriptoolbox.org
Toy example: Original observation
Condition AS1_aS2_aS3_aS4_aS5_aS6_aS7_aS8_aS9_aS10_a
Condition BS1_bS2_bS3_bS4_bS5_bS6_bS7_bS8_bS9_bS10_b
null hypothesis: condition A = condition B
-
http://www.fieldtriptoolbox.org
Toy example: 1st permutation
Condition AS1_aS2_bS3_aS4_aS5_bS6_bS7_aS8_aS9_aS10_b
Condition BS1_bS2_aS3_bS4_bS5_aS6_aS7_bS8_bS9_bS10_a
null hypothesis: condition A = condition B
-
http://www.fieldtriptoolbox.org
Toy example: 2nd permutation
Condition AS1_bS2_aS3_bS4_aS5_aS6_bS7_aS8_bS9_bS10_a
Condition BS1_aS2_bS3_aS4_bS5_bS6_aS7_bS8_aS9_aS10_b
null hypothesis: condition A = condition B
-
http://www.fieldtriptoolbox.org
Toy example: Original observation
statistic = 17 Observation 17
-
http://www.fieldtriptoolbox.org
Toy example: 1st permutation
statistic = 13 Observation 17
-
http://www.fieldtriptoolbox.org
Toy example: 1st permutation
statistic = 13 Observation 17Permutation 1 13
-
http://www.fieldtriptoolbox.org
Toy example: 2nd permutation
statistic = 11Observation 17Permutation 1 13Permutation 2 11
-
http://www.fieldtriptoolbox.org
Toy example: 3rd permutation
statistic = 12 Observation 17Permutation 1 13Permutation 2 11Permutation 3 12
-
http://www.fieldtriptoolbox.org
Toy example: Nth permutation
statistic = … Observation 17Permutation 1 13Permutation 2 11Permutation 3 12
-
http://www.fieldtriptoolbox.org
Assessthelikelihoodoftheobservedmaxclustersizegiventherandomizationdistribution
2.5% 97.5% 2.5% 97.5% 2.5% 97.5%
Xorg
-
http://www.fieldtriptoolbox.org
Summarystatistics
Parametricstatisticaltestforallchannel-time-frequencypointsprobabilityforH0oneH0foreachchannel-time-frequencymultiplecomparisonproblem
Non-parametricapproachforestimatingprobabilityrandomizationorpermutationprobabilityofH0forarbitrarystatisticincorporatepriorknowledgeinstatisticavoidMCPusingmaxstatistic
-
http://www.fieldtriptoolbox.org
Source-levelstatistics
Sameprinciplesaschannel-levelstatistics
BeamformingSwapdatabetweenconditions:usecommonfilters
-
http://www.fieldtriptoolbox.org
Inferentialstatistics:parametric
x1, x2, x3, x4, … x1, x2, x3, x4, …
t-statistic
distribution of x
-
http://www.fieldtriptoolbox.org
Inferentialstatistics:distributeddataatsourcelevel
x1, x2, x3, x4, … x1, x2, x3, x4, …[ ]
dim = [x, y, z]
dim = [x*y*z, 1]
-
http://www.fieldtriptoolbox.org
Inferentialstatistics:parametric
x1, x2, x3, x4, … x1, x2, x3, x4, …[ ] t-statistic
-
http://www.fieldtriptoolbox.org
Inferentialstatistics:permutationapproach
“s” statistic
-
http://www.fieldtriptoolbox.org
Permutationdistributionof“s”cantakeanyshape
2.5% 97.5% 2.5% 97.5% 2.5% 97.5%
Sorg
-
http://www.fieldtriptoolbox.org
Cluster-basedpermutationtestonsource-level
a-priori thresholdcluster neighbouring voxelscompute sum over cluster
2.5% 97.5%
-
http://www.fieldtriptoolbox.org
Returningtobeamforming
one trial
one condition
t-map
H0: same distribution
-
http://www.fieldtriptoolbox.org
Commonfiltersforbeamforming
W = ( GT C-1 G )-1 GT C-1
C = M MTP = X XT = W C WT
X(t) = W M(t)
Ci = Mi MiT trial 1, 2, 3, …
C = ( C1 + C2 + C3 + … )/n
M(t) = G X(t) + N
Pi = W Ci WT
^
^ ^
-
http://www.fieldtriptoolbox.org
Commonfiltersforbeamforming
one trial
one condition
t-map
H0: same distribution
-
http://www.fieldtriptoolbox.org
Summarystatisticsonsource-level
Sameprinciplesasforchannel-levelstatisticsAveragecovarianceoveralldataforspatialfilterestimateOnespatialfilterpervoxel
commontobothconditionssingle-trialestimates:simplemultiplication
Permutationtestnotaffectedexchangeabilityofdataoverconditionsdoesnot
changetheoptimalfilterunderH0computationallyfast
-
http://www.fieldtriptoolbox.org
Generalsummary
Aformalhypothesiscanbetestedwithrandomizationtestcontrolthechanceoffalsepositivesreducethefalsenegativerate
Multiplecomparisonproblemonehypothesisperchannel-time-frequencyonehypothesisforalldata
Increasesensitivityusingclusterstocapturethestructureinthedata