![Page 1: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/1.jpg)
Sequential Multiple Decision Procedures (SMDP)
for Genome Scans
Q.Y. Zhang and M.A. Province
Division of Statistical Genomics, Washington University School of Medicine
Statistical Genetics Forum, April 2006
![Page 2: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/2.jpg)
References
R.E. Bechhofer, J. Kiefer, M. Sobel. 1968. Sequential identification and ranking procedures. The University of Chicago Press, Chicago.
M.A. Province. 2000. A single, sequential, genome-wide test to identify simultaneously all promising areas in a linkage scan. Genetic Epidemiology, 19:301-332.
Q.Y. Zhang, M.A. Province. 2005. Simplified sequential multiple decision procedures for genome scans. 2005 Proceedings of the American Statistical Association, Biometrics Section: 463-468.
![Page 3: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/3.jpg)
SMDP
Sequential Multiple Decision Procedures
Sequential test
Multiple hypothesis test
![Page 4: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/4.jpg)
Idea 1: Sequential
Start from a small sample size $n_0$
Increase sample size, sequential test at each stage (SPRT)
Stop when stopping rule is satisfied
Stages: $n_0$, $n_0+1$, $n_0+2$, …, $n_0+i$, …
Experiment in next stage
Extra data for validation
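The sequential idea on this slide can be illustrated with a minimal sketch of Wald's SPRT. It assumes a normal mean with known variance and two simple hypotheses. The function name `sprt_normal_mean` and the parameters `mu0`, `mu1`, `sigma`, `alpha`, `beta` are illustrative assumptions, not taken from the slides.

```python
import math

def sprt_normal_mean(data, mu0=0.0, mu1=1.0, sigma=1.0, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: mean = mu0 vs. H1: mean = mu1 (known sigma).

    Returns (decision, n_used); decision is "H0", "H1", or "continue"
    if the data run out before a boundary is crossed.
    """
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this bound
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this bound
    llr = 0.0
    for n, x in enumerate(data, start=1):
        # Log-likelihood ratio contribution of one new observation.
        llr += (mu1 - mu0) * (x - (mu0 + mu1) / 2) / sigma ** 2
        if llr >= upper:
            return "H1", n
        if llr <= lower:
            return "H0", n
    return "continue", len(data)
```

The test is re-evaluated after every new observation, so sampling stops as soon as the evidence is strong enough, and any remaining data stay available for validation.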
![Page 5: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/5.jpg)
Idea 2: Multiple Decision
Independent test (binary hypothesis test): test 1, test 2, …, test n, one for each of SNP1, SNP2, …, SNPn
test-wise error and experiment-wise error
p value correction
Simultaneous test (multiple hypothesis test): SNP1, SNP2, …, SNPn are divided into a signal group and a noise group
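To make the test-wise versus experiment-wise error distinction concrete, here is a small sketch. It assumes the n tests are independent and uses the Bonferroni correction as one common choice of p value correction; the slides do not name a specific correction, and both function names are hypothetical.

```python
def experimentwise_error(alpha, n_tests):
    """Probability of at least one false positive across n independent
    tests, each run at test-wise level alpha (independence is assumed)."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the experiment-wise error at or
    below alpha (Bonferroni correction, assumed here for illustration)."""
    return alpha / n_tests
```

With thousands of SNPs the corrected per-test threshold becomes very small, which is the cost that a single simultaneous decision is meant to avoid.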
![Page 6: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/6.jpg)
Binary Hypothesis Test
SNP1, SNP2, SNP3, SNP4, SNP5, SNP6, …, SNPn: one test per SNP
test 1 H0: Eff.(SNP1)=0 vs. H1: Eff.(SNP1)≠0
test 2 H0: Eff.(SNP2)=0 vs. H1: Eff.(SNP2)≠0
test 3 ……
test 4 ……
test 5 ……
test 6 ……
test n H0: Eff.(SNPn)=0 vs. H1: Eff.(SNPn)≠0
![Page 7: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/7.jpg)
Multiple Hypothesis Test
SNP1, SNP2, SNP3, SNP4, SNP5, SNP6, …, SNPn
H1: SNP1,2,3 are truly different from the others
H2: SNP1,2,4 are truly different from the others
H3 ……
H4 ……
H5: SNP4,5,6 are truly different from the others
H6 ……
……
Hu: SNPn,n-1,n-2 are truly different from the others
H: any t SNPs are truly different from the other n-t SNPs
u = number of all possible combinations of t out of n
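As a small illustration of how the hypotheses are enumerated, the sketch below lists every subset of t SNPs out of n and counts them. The values n = 6 and t = 3 are chosen only for illustration, and the variable names are hypothetical.

```python
from itertools import combinations
from math import comb

snps = [f"SNP{i}" for i in range(1, 7)]   # n = 6, illustrative only
t = 3

# Each hypothesis says: "these t SNPs are truly different from the others".
hypotheses = list(combinations(snps, t))
u = comb(len(snps), t)                    # number of hypotheses

print(u, hypotheses[0], hypotheses[-1])
```

Each tuple in `hypotheses` corresponds to one candidate signal group, so the multiple hypothesis test chooses among u alternatives at once.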
![Page 8: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/8.jpg)
SMDP
Sequential test Multiple hypothesis test
Sequential Multiple Decision Procedure
![Page 9: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/9.jpg)
Koopman-Darmois (K-D) Populations (Bechhofer et al., 1968)
The freq/density function of a K-D population can be written in the form:
f(x)=exp{P(x)Q(θ)+R(x)+S(θ)}
A. The normal density function with unknown mean and known variance;
B. The normal density function with unknown variance and known mean;
C. The exponential density function with unknown scale parameter and known location parameter;
D. The Bernoulli distribution with unknown probability of “success” on a single trial;
E. The Poisson distribution with unknown mean;
……
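The distance between two K-D populations is defined as:
$$\Delta_{i,j} = \left| Q(\theta_i) - Q(\theta_j) \right|$$
For case B:
$$\Delta_{i,j} = \left| \frac{1}{2\sigma_i^2} - \frac{1}{2\sigma_j^2} \right|$$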
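For case B (normal density, known mean, unknown variance), writing the density in K-D form with P(x) = (x - mean)^2 gives Q(θ) = -1/(2σ²). The sketch below computes Q and the distance between two such populations. Taking the distance as an absolute difference is an assumption made here, and the function names are hypothetical.

```python
def q_normal_unknown_variance(sigma2):
    """Q(theta) for a normal population with known mean and variance sigma2,
    when the density is written as exp{P(x)Q(theta) + R(x) + S(theta)}
    with P(x) = (x - mean)**2."""
    return -1.0 / (2.0 * sigma2)

def kd_distance(q_i, q_j):
    """Distance between two K-D populations, taken here as the absolute
    difference of their Q values (an assumption of this sketch)."""
    return abs(q_i - q_j)

d = kd_distance(q_normal_unknown_variance(1.0), q_normal_unknown_variance(2.0))
```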
![Page 10: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/10.jpg)
SMDP (Bechhofer et al., 1968)
Selecting the t best of M K-D populations
Sequential sampling: stages 1, 2, …, h, h+1, …
Populations: Pop. 1, Pop. 2, …, Pop. t-1, Pop. t | Pop. t+1, Pop. t+2, …, Pop. M, with distance D between Pop. t and Pop. t+1
Statistics at stage h: $Y_{1,h}, Y_{2,h}, \dots, Y_{i,h}, \dots, Y_{M,h}$
U possible combinations of t out of M:
$$U = \frac{M!}{t!\,(M-t)!}$$
For each combination u:
$$Y^{(t)}_{u,h} = \sum_{k=1}^{t} Y_{i_k,h}$$
Ordered sums:
$$Y^{(t)}_{[1],h} \le Y^{(t)}_{[2],h} \le \dots \le Y^{(t)}_{[U-1],h} \le Y^{(t)}_{[U],h}$$
Stopping rule:
$$W_{[U],h} = \frac{\exp\left(D^{*} Y^{(t)}_{[U],h}\right)}{\sum_{j=1}^{U} \exp\left(D^{*} Y^{(t)}_{[j],h}\right)} \ge P^{*}$$
Prob. of correct selection (PCS) > P*, whenever D > D*
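The stopping rule on this slide can be sketched by brute force for small M: form every sum of t statistics, then compare the weight of the largest sum with P*. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names are hypothetical, and the exponentials are shifted by the largest sum only for numerical stability.

```python
import math
from itertools import combinations

def smdp_top_probability(y, t, d_star):
    """Return (W, best): W is exp(D* * top sum) divided by the sum of
    exp(D* * sum) over all combinations of t out of len(y); best is the
    index tuple of the combination with the largest sum."""
    sums = {c: sum(y[i] for i in c) for c in combinations(range(len(y)), t)}
    best = max(sums, key=sums.get)
    top = sums[best]
    # Dividing numerator and denominator by exp(d_star * top) avoids overflow.
    denom = sum(math.exp(d_star * (s - top)) for s in sums.values())
    return 1.0 / denom, best

def should_stop(y, t, d_star, p_star):
    """Stop sampling when W reaches P*; also report the selected populations."""
    w, best = smdp_top_probability(y, t, d_star)
    return w >= p_star, best
```

For example, `should_stop([10.0, 9.0, 1.0, 0.0], 2, 1.0, 0.95)` stops and selects the first two populations, because their sum dominates every other pair.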
![Page 11: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/11.jpg)
SMDP: P*, t, D*
P*: arbitrary, 0.95
t: fixed or varied
D*: indifference zone
Populations: Pop. 1, Pop. 2, …, Pop. t-1, Pop. t | Pop. t+1, Pop. t+2, …, Pop. M, with distance D between Pop. t and Pop. t+1
SMDP stopping rule:
$$W_{[U],h} = \frac{\exp\left(D^{*} Y^{(t)}_{[U],h}\right)}{\sum_{j=1}^{U} \exp\left(D^{*} Y^{(t)}_{[j],h}\right)} \ge P^{*}$$
Prob. of correct selection (PCS) > P* whenever D > D*
Correct selection: populations with $Q(\theta) > Q(\theta_t) + D^{*}$ are selected
Figure axis: $Q(\theta_t)$, $Q(\theta_t) + D^{*}$, $Q(\theta_t) + D$
![Page 12: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/12.jpg)
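SMDP: Computational Problem
Sequential stage: 1, 2, 3, …, h, h+1, …, N
Statistics at stage h: $Y_{1,h}, Y_{2,h}, \dots, Y_{t,h}, Y_{t+1,h}, Y_{t+2,h}, \dots, Y_{M,h}$
$$Y^{(t)}_{[1],h} \le Y^{(t)}_{[2],h} \le \dots \le Y^{(t)}_{[U-1],h} \le Y^{(t)}_{[U],h}$$
$$W_{[U],h} = \frac{\exp\left(D^{*} Y^{(t)}_{[U],h}\right)}{\sum_{j=1}^{U} \exp\left(D^{*} Y^{(t)}_{[j],h}\right)} \ge P^{*}$$
U sums of U possible combinations of t out of M; each sum contains t members of $Y_{i,h}$
$$U = \frac{M!}{t!\,(M-t)!}$$
Computer time?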
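A quick way to see why computer time is a concern is to count how many sums the full stopping rule needs. The sketch below prints the number of digits of U for a few (M, t) pairs. The last pair uses 5841 and 72, the SNP counts reported later in the deck, purely as an illustration, and the helper name is hypothetical.

```python
from math import comb

def n_combinations(m, t):
    """U = M! / (t! (M - t)!), the number of sums in the full stopping rule."""
    return comb(m, t)

for m, t in [(10, 3), (100, 5), (1000, 10), (5841, 72)]:
    u = n_combinations(m, t)
    # Print the digit count rather than U itself, which gets very long.
    print(f"M={m}, t={t}: U has {len(str(u))} digits")
```

Enumerating every combination is therefore feasible only for small M, which motivates a stopping rule that uses just the top few sums.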
![Page 13: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/13.jpg)
Simplified Stopping Rule (Bechhofer et al., 1968)
$$Y^{(t)}_{[1],h} \le Y^{(t)}_{[2],h} \le \dots \le Y^{(t)}_{[S],h} \le Y^{(t)}_{[S+1],h} \le \dots \le Y^{(t)}_{[U-1],h} \le Y^{(t)}_{[U],h}$$
$$W^{[U-S]}_{[U],h} = \frac{\exp\left(D^{*} Y^{(t)}_{[U],h}\right)}{(S-1)\exp\left(D^{*} Y^{(t)}_{[S],h}\right) + \sum_{j=S}^{U} \exp\left(D^{*} Y^{(t)}_{[j],h}\right)} \ge P^{*}$$
$$W^{[1]}_{[U],h} \le W^{[2]}_{[U],h} \le \dots \le W^{[U-2]}_{[U],h} \le W^{[U-1]}_{[U],h} = W_{[U],h}$$
U-S+1 = Top Combination Number (TCN)
TCN=2 (i.e. S=U-1, U-S=1) => the simplest stopping rule:
$$Y_{[M-t+1],h} - Y_{[M-t],h} \ge \frac{1}{D^{*}} \ln\left\{ \frac{(U-1)P^{*}}{1-P^{*}} \right\}$$
TCN=U (i.e. S=1, U-S=U-1) => the original stopping rule
How to choose TCN? Balance between computational accuracy and computational time
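The simplified rule can be sketched as follows: keep only the top U-S+1 sums and replace each of the S-1 dropped sums by the smallest kept one, which gives a conservative lower bound on W. This is a toy sketch with hypothetical function names. It takes the full sorted list of sums only so the bound can be checked against the exact value; a practical implementation would generate just the top sums.

```python
import math

def simplified_w(sorted_sums, s, d_star):
    """Lower bound on W from the top U-S+1 combination sums.

    sorted_sums: all U combination sums in ascending order (toy input).
    s: 1-based index S, so that TCN = U - S + 1; s = 1 gives the exact W.
    """
    top = sorted_sums[-1]
    kept = sorted_sums[s - 1:]                      # Y_[S], ..., Y_[U]
    # The S-1 dropped sums are each replaced by the smallest kept sum.
    denom = (s - 1) * math.exp(d_star * (kept[0] - top))
    denom += sum(math.exp(d_star * (v - top)) for v in kept)
    return 1.0 / denom

def tcn2_stop(y, t, d_star, p_star):
    """Simplest rule (TCN = 2): compare the gap between the t-th and
    (t+1)-th largest single statistics with a fixed threshold."""
    m = len(y)
    u = math.comb(m, t)
    ranked = sorted(y)
    gap = ranked[m - t] - ranked[m - t - 1]         # Y_[M-t+1] - Y_[M-t]
    return gap >= math.log((u - 1) * p_star / (1 - p_star)) / d_star
```

With TCN = 2 no combination sums are needed at all, at the price of a more conservative (later) stop; larger TCN values move the bound closer to the exact W.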
![Page 14: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/14.jpg)
![Page 15: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/15.jpg)
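SMDP Combined With Regression Model (M.A. Province, 2000, pages 320-321)
Data pairs for a marker: $(Z_1, X_1), (Z_2, X_2), (Z_3, X_3), \dots, (Z_h, X_h), (Z_{h+1}, X_{h+1}), \dots, (Z_N, X_N)$
$$Z_{h+1} = \alpha + \beta X_{h+1} + \varepsilon$$
$$r_{h+1} = Z_{h+1} - \left(\hat{\alpha}^{(h)} + \hat{\beta}^{(h)} X_{h+1}\right)$$
$$V_{h+1} = r_{h+1} \sqrt{\frac{h \sum_{j=1}^{h} \left(X_j - \bar{X}^{(h)}\right)^2}{(h+1) \sum_{j=1}^{h} \left(X_j - \bar{X}^{(h)}\right)^2 + h \left(X_{h+1} - \bar{X}^{(h)}\right)^2}} \sim N(0, \sigma^2)$$
Sequential sum of squares of regression residuals:
$$Y_{h+1} = \sum_{j=1}^{h+1} V_j^2$$
$Y_{i,h}$ denotes Y for marker i at stage h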
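The sequential residuals for one marker can be sketched in plain Python. The sketch assumes simple linear regression of Z on X, refits on the first h pairs, predicts pair h+1, and scales the prediction residual so that it has constant variance under the model. The function names are hypothetical. The first two pairs yield no residual because a line cannot be checked against the points that define it.

```python
import math
from itertools import accumulate

def recursive_residuals(z, x):
    """Scaled one-step-ahead residuals V_{h+1} for h = 2, ..., N-1."""
    v = []
    for h in range(2, len(z)):
        xs, zs = x[:h], z[:h]
        x_bar, z_bar = sum(xs) / h, sum(zs) / h
        sxx = sum((xj - x_bar) ** 2 for xj in xs)
        if sxx == 0:
            # E.g. the first h genotypes are all identical: no slope estimate.
            raise ValueError("X has no variation in the first %d pairs" % h)
        beta = sum((xj - x_bar) * (zj - z_bar) for xj, zj in zip(xs, zs)) / sxx
        alpha = z_bar - beta * x_bar
        r = z[h] - (alpha + beta * x[h])            # prediction residual
        d2 = (x[h] - x_bar) ** 2
        v.append(r * math.sqrt(h * sxx / ((h + 1) * sxx + h * d2)))
    return v

def running_sum_of_squares(v):
    """Y after each stage: cumulative sum of squared scaled residuals."""
    return list(accumulate(e * e for e in v))
```

A useful sanity check on the scaling is that the final running sum equals the residual sum of squares of the ordinary least-squares fit to all N pairs.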
![Page 16: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/16.jpg)
Combine SMDP With Regression Model (M.A. Province, 2000, page 319)
$$Z_{h+1} = \alpha + \beta X_{h+1} + \varepsilon$$
$$r_{h+1} = Z_{h+1} - \left(\hat{\alpha}^{(h)} + \hat{\beta}^{(h)} X_{h+1}\right)$$
$V_{h+1}$ is the scaled residual $r_{h+1}$, with $V_{h+1} \sim N(0, \sigma^2)$
Case B: the normal density function with unknown variance and known mean
$$Y_{i,h} = \sum_{j=1}^{h} V_{i,j}^2$$
![Page 17: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/17.jpg)
Simplified Stopping Rule (M.A. Province, 2000, pages 321-322)
![Page 18: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/18.jpg)
A Real Data Example (M.A. Province, 2000, page 310)
![Page 19: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/19.jpg)
A Real Data Example (M.A. Province, 2000, page 308)
![Page 20: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/20.jpg)
Simulation Results (1) (M.A. Province, 2000, page 312)
![Page 21: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/21.jpg)
Simulation Results (2) (M.A. Province, 2000, page 313)
![Page 22: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/22.jpg)
![Page 23: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/23.jpg)
Simplified SMDP (Bechhofer et al., 1968)
$$Y^{(t)}_{[1],h} \le Y^{(t)}_{[2],h} \le \dots \le Y^{(t)}_{[S],h} \le Y^{(t)}_{[S+1],h} \le \dots \le Y^{(t)}_{[U-1],h} \le Y^{(t)}_{[U],h}$$
$$W^{[U-S]}_{[U],h} = \frac{\exp\left(D^{*} Y^{(t)}_{[U],h}\right)}{(S-1)\exp\left(D^{*} Y^{(t)}_{[S],h}\right) + \sum_{j=S}^{U} \exp\left(D^{*} Y^{(t)}_{[j],h}\right)} \ge P^{*}$$
$$W^{[1]}_{[U],h} \le W^{[2]}_{[U],h} \le \dots \le W^{[U-2]}_{[U],h} \le W^{[U-1]}_{[U],h} = W_{[U],h}$$
U-S+1 = Top Combination Number (TCN)
How to choose TCN?
Balance between computational accuracy and computational time
![Page 24: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/24.jpg)
Data

| Sample size | Genotype | Phenotype |
| --- | --- | --- |
| 85 cell lines | 5841 SNPs (category: 0, 1, 2) | ViabFu7 (continuous) |
![Page 25: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/25.jpg)
Relation of W and t (h=50, D*=10)
Effective Top Combination Number (ETCN)
Zhang & Province, 2005, page 465
![Page 26: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/26.jpg)
ETCN Curve
Zhang & Province, 2005, page 466
![Page 27: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/27.jpg)
t = ?
Zhang & Province, 2005, page 466
![Page 28: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/28.jpg)
Zhang & Province, 2005, page 467
P*=0.95, D*=10, TCN=10000
72 SNPs
P<0.01
![Page 29: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/29.jpg)
SMDP Summary
Advantages:
Test and identify all signals simultaneously, no multiple comparisons
Use “minimal” N to find significant signals: efficient
Tight control of statistical errors (Type I, II): powerful
Save the rest of N for validation: reliable
Further studies:
Computer time
Extension to more methods/models
Extension to non-K-D distributions
![Page 30: Sequential Multiple Decision Procedures (SMDP) for Genome Scans Q.Y. Zhang and M.A. Province Division of Statistical Genomics Washington University School](https://reader035.vdocuments.us/reader035/viewer/2022062321/56649f165503460f94c2ceae/html5/thumbnails/30.jpg)
Thanks!