Applying False Discovery Rate (FDR) Control in Detecting Future Climate Changes
ZongBo Shang
SIParCS Program, IMAGe, NCAR
August 4, 2009
North American Regional Climate Change Assessment Program (NARCCAP): Predicted Changes in Future Winter Temperature (°C)
[Figure: CRCM+CGCM3 Changes in Winter Temperature. Axes: Longitude, Latitude. Color scale: -2 to 8.]
Note: This figure shows the difference between mean future (2040–2069) winter temperature and mean current (1970–1999) winter temperature.
Can We Trust What We See?
Note: These two figures each show the mean of 10 replicate random fields generated from the same Matérn semi-variogram model, but with different random seeds.
[Figures: two panels, each the mean of 10 replicate random fields. Axes: x, y. Color scale: -2 to 2.]
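To make the point of this slide concrete, here is a minimal sketch of how such replicate means could be simulated. It assumes a Matérn covariance with smoothness ν = 3/2 (which has a closed form), a hypothetical range and sill, a small 8 × 8 grid, and a hand-written Cholesky factorization; none of these settings come from the talk.

```python
import math
import random


def matern32(h, sigma2=1.0, rho=3.0):
    """Matern covariance with smoothness nu = 3/2 at distance h (assumed parameters)."""
    a = math.sqrt(3.0) * h / rho
    return sigma2 * (1.0 + a) * math.exp(-a)


def cholesky(a):
    """Lower-triangular L with L L^T = a, for a symmetric positive-definite a."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_mean_field(n_side=8, n_rep=10, seed=0):
    """Mean of n_rep zero-mean Gaussian fields on an n_side x n_side grid."""
    rng = random.Random(seed)
    pts = [(x, y) for x in range(n_side) for y in range(n_side)]
    cov = [[matern32(math.dist(p, q)) for q in pts] for p in pts]
    for i in range(len(pts)):
        cov[i][i] += 1e-8  # small jitter for numerical stability
    low = cholesky(cov)
    mean = [0.0] * len(pts)
    for _ in range(n_rep):
        z = [rng.gauss(0.0, 1.0) for _ in pts]
        for i in range(len(pts)):
            # field value at point i is row i of L times the white-noise vector
            mean[i] += sum(low[i][k] * z[k] for k in range(i + 1)) / n_rep
    return mean


# Same model, different seeds: the two mean fields differ by chance alone.
field_a = simulate_mean_field(seed=1)
field_b = simulate_mean_field(seed=2)
```

Both fields have zero true mean everywhere, so any apparent pattern in their difference is sampling noise.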
What’s the Problem with Pointwise Two-sample t Tests?
[Figures: Two sample t statistic (color scale: -4 to 4) and Pointwise p-value (color scale: 0.00 to 0.10). Axes: x, y.]
Null hypothesis at each grid point: $H_0: \mu_1 = \mu_2$
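The problem can be illustrated with a short simulation: when the null hypothesis holds at every grid point, pointwise tests at the 5% level still reject at roughly 5% of the points. The sketch below assumes independent Gaussian noise at each point (a simplification, not the spatially correlated fields of the talk), 10 replicates per group, and a 50 × 50 grid; the function names are illustrative.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with 18 degrees of freedom
# (10 + 10 - 2); it must be changed if n_rep changes.
T_CRIT_DF18 = 2.101


def two_sample_t(a, b):
    """Pooled-variance two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * statistics.variance(a)
              + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(
        pooled * (1.0 / na + 1.0 / nb))


def false_rejection_rate(n_points=2500, n_rep=10, seed=0):
    """Share of grid points rejected at the 5% level when H0 holds everywhere."""
    rng = random.Random(seed)
    rejected = 0
    for _ in range(n_points):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_rep)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_rep)]
        if abs(two_sample_t(a, b)) > T_CRIT_DF18:
            rejected += 1
    return rejected / n_points


rate = false_rejection_rate()
print(f"falsely rejected: {rate:.1%} of 2500 grid points")
```

On a 2500-point grid this amounts to on the order of a hundred spurious "signals", which is the motivation for controlling an error rate across all tests at once.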
False Discovery Rate (FDR) Control
• FDR controls the expected proportion of incorrectly rejected null hypotheses (type I errors) among the rejected null hypotheses.
• Less conservative than Bonferroni procedures, with greater power than Familywise Error Rate (FWER) control, at a cost of increasing the likelihood of obtaining type I errors.
Applications of FDR in Gene Expression and Microarrays
Applications of FDR in Functional Magnetic Resonance Imaging
Definition of False Discovery Rate
                            Declared non-significant   Declared significant   Total
                            (fail to reject)           (reject)
True null hypotheses        U                          V                      m₀
Non-true null hypotheses    T                          S                      m − m₀
Total                       m − R                      R                      m
Let Q = V / (V + S) define the proportion of errors committed by falsely rejecting null hypotheses. Notice Q is an unobservable random variable. Define the FDR to be the expectation of Q:
$$Q_e = E[Q] = E\left[\frac{V}{V+S}\right] = E\left[\frac{V}{R}\right], \qquad R = V + S$$
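As an illustration of the definition, the sketch below implements the standard Benjamini-Hochberg step-up procedure and computes one realization of Q for a small set of p-values. The p-values and the true/false labels are made up for the example, and Q is set to 0 when nothing is rejected, the usual convention. In practice the labels are unknown, which is why Q is unobservable.

```python
def benjamini_hochberg(pvalues, q):
    """Return a list of booleans: True where BH rejects at FDR level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0  # largest rank whose p-value is at or below its BH threshold
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            k = rank
    reject = [False] * m
    for idx in order[:k]:
        reject[idx] = True
    return reject


def false_discovery_proportion(reject, is_true_null):
    """Q = V / R, with Q = 0 when nothing is rejected."""
    r = sum(reject)
    v = sum(1 for rej, null in zip(reject, is_true_null) if rej and null)
    return v / r if r > 0 else 0.0


# Hypothetical p-values and hypothetical knowledge of which nulls are true.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216]
truth = [False, True, False, False, True, True, True, True, True, True]
reject = benjamini_hochberg(pvals, q=0.05)
print(sum(reject), false_discovery_proportion(reject, truth))
```

The FDR is the expectation of this proportion over repeated experiments, so a single data set yields only one draw of Q.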
False Discovery Rates for Spatial Signals
• Testing on clusters rather than individual locations
• Procedure 1: Weighted Benjamini & Hochberg (BH) procedure
• Procedure 2: Weighted two-stage procedure
• Procedure 3: Hierarchical testing procedure
– Testing stage: control FDR on clusters
– Trimming stage: control FDR on selected points
Reference: Benjamini, Y. and Heller, R. 2007. False discovery rates for spatial signals. Journal of the American Statistical Association. 102:1272-1281.
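A weighted BH step on clusters can be sketched as follows. This is one common form of the weighted procedure, with weights taken proportional to cluster size as an assumption; the exact weighting and cluster statistics in the cited paper may differ, and the p-values and sizes below are hypothetical.

```python
def weighted_bh(pvalues, weights, q):
    """Weighted BH step-up.

    Rejects the k smallest p-values, where k is the largest rank j with
    p_(j) <= q * (sum of the weights of the j smallest p-values) / (sum of all weights).
    Returns the indices of the rejected hypotheses in ascending order.
    """
    total = float(sum(weights))
    order = sorted(range(len(pvalues)), key=lambda i: pvalues[i])
    cum, k = 0.0, 0
    for rank, idx in enumerate(order, start=1):
        cum += weights[idx]
        if pvalues[idx] <= q * cum / total:
            k = rank
    return sorted(order[:k])


cluster_p = [0.004, 0.030, 0.020, 0.600]  # hypothetical cluster p-values
cluster_size = [40, 10, 30, 20]           # hypothetical cluster sizes (grid points)
rejected_clusters = weighted_bh(cluster_p, cluster_size, q=0.05)
print(rejected_clusters)
```

With equal weights the rule reduces to the ordinary BH procedure, so the weights only change how much each cluster counts toward the threshold.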
Simulation Studies
• 1. Random Fields
• 2. Random Field + Block Trend
• 3. Random Field + Gradient Trend
[Figures: six panels, three titled "10 Replicates Average for Setting I" and three titled "10 Replicates Average for Setting II". Axes: x, y.]
Simulation Study I: Two Random Fields
Note: These two figures each show the mean of 10 replicate random fields generated from the same Matérn semi-variogram model, but with different random seeds.
[Figures: two panels, each the mean of 10 replicate random fields. Axes: x, y. Color scale: -2 to 2.]
Pre-defined Clusters
[Figure: pre-defined clusters on the simulation grid.]
Simulation Study I: Pointwise vs. False Discovery Rate Control
[Figures: Pointwise p-value and Rejection at q-value. Axes: x, y. Color scale: 0.00 to 0.10.]
9 Repeats on Simulation Study I
[Figures: nine panels, one per repeat. Axes: x, y. Color scale: 0.00 to 0.10.]
Simulation Study II: Pre-defined Block Trend
[Figures: two views of the pre-defined block trend. Axes: x, y, trend. Color scale: -10 to 10. Block labels: 4, -10, 10, -4, 2, -2.]
Simulation Study II: Average of 10 Replicates
[Figures: Mean of 10 Replicates from Setting I, Random Field (Matérn, σ = 0.4); Mean of 10 Replicates from Setting II, Random Field (Matérn, σ = 0.4) + Block Trends. Axes: x, y. Color scale: -10 to 10. Block labels: 4, -10, 10, -4, 2, -2.]
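The two settings can be sketched as a noise field with and without a piecewise-constant trend added. In the sketch below the block positions are hypothetical (the slides give only the block magnitudes), only four of the labelled magnitudes are used, and independent Gaussian noise with σ = 0.4 stands in for the Matérn random field.

```python
import random


def block_trend(n_side=50, blocks=None):
    """Piecewise-constant trend: zero everywhere except inside rectangular blocks."""
    if blocks is None:
        # (x0, x1, y0, y1, value); positions are hypothetical placeholders
        blocks = [(5, 15, 5, 15, 4.0), (30, 45, 5, 20, -10.0),
                  (5, 20, 30, 45, 10.0), (30, 40, 30, 40, -4.0)]
    trend = [[0.0] * n_side for _ in range(n_side)]
    for x0, x1, y0, y1, value in blocks:
        for y in range(y0, y1):
            for x in range(x0, x1):
                trend[y][x] = value
    return trend


def add_noise(trend, sigma=0.4, seed=0):
    """Add independent Gaussian noise (a stand-in for a correlated random field)."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in trend]


trend = block_trend()
zero = [[0.0] * 50 for _ in range(50)]
setting_1 = add_noise(zero, seed=1)    # random field only
setting_2 = add_noise(trend, seed=2)   # random field + block trend
```

Comparing the two settings point by point then has a known answer: the null hypothesis is false exactly inside the blocks.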
Simulation Study II: Pointwise vs. False Discovery Rate Control
[Figures: Pointwise p-value and Rejection at q-value. Axes: x, y. Color scale: 0.00 to 0.10.]
9 Repeats on Simulation Study II
[Figures: the block trend (color scale: -10 to 10), followed by nine panels, one per repeat (color scale: 0.00 to 0.10). Axes: x, y.]
Simulation Study III: Pre-defined Gradient Trend
[Figures: two views of the pre-defined gradient trend. Axes: x, y, trend. Color scale: -10 to 10.]
Simulation Study III: Average of 10 Replicates
[Figures: 10 Replicates Average for Setting I, Random Field (Matérn, σ = 2); 10 Replicates Average for Setting II, Random Field (Matérn, σ = 2) + Gradient Trends. Axes: x, y. Color scale: -10 to 10.]
Simulation Study III: Pointwise vs. False Discovery Rate Control
[Figures: Pointwise p-value and Rejection at q-value. Axes: x, y. Color scale: 0.00 to 0.10.]
9 Repeats on Simulation Study III
[Figures: the gradient trend (color scale: -10 to 10), followed by nine panels, one per repeat (color scale: 0.00 to 0.10). Axes: x, y.]
Applying FDR Control for Detecting Future Climate Changes
• Download climate datasets from NARCCAP program
• Calculate seasonal average
• Construct clusters from EPA Eco-regions
• Conduct two-sample t test on temperature/precipitation
• Pointwise p-values and corresponding z scores
• Build semi-variogram model to estimate spatial autocorrelation
• Calculate z score and p-value by cluster
• Reject clusters based on FDR control
http://www.epa.gov/wed/pages/ecoregions/na_eco.htm
GIS: Vector Dataset, Lambert Equal-Area Projection
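The step from pointwise z scores to a cluster-level z score and p-value can be sketched as below. The sketch assumes each pointwise z score has unit variance and that the correlation between two grid points decays exponentially with distance, with a hypothetical range parameter standing in for whatever the fitted semi-variogram implies; the function names are illustrative and not taken from the talk.

```python
import math
from statistics import NormalDist


def cluster_z(z_scores, coords, corr_range=5.0):
    """Standardized sum of correlated z scores within one cluster.

    Var(sum of z_i) equals the sum of all pairwise correlations, so dividing
    by its square root gives a statistic that is N(0, 1) under the null.
    """
    total_corr = sum(math.exp(-math.dist(p, q) / corr_range)
                     for p in coords for q in coords)
    return sum(z_scores) / math.sqrt(total_corr)


def two_sided_p(z):
    """Two-sided p-value of a standard normal statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Hypothetical cluster of four neighbouring grid points.
coords = [(0, 0), (0, 1), (1, 0), (1, 1)]
z_scores = [1.2, 1.5, 0.9, 1.8]
z_cluster = cluster_z(z_scores, coords)
print(round(z_cluster, 3), round(two_sided_p(z_cluster), 4))
```

Ignoring the spatial correlation would treat the four points as independent evidence and overstate the cluster z score; the resulting cluster p-values are what the FDR procedure then operates on.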
61 regions rejected at q=0.25 level
56 regions rejected at q=0.1 level
54 regions rejected at q=0.05 level
51 regions rejected at q=0.01 level
H0: Future Winter Temperature Increase by 3 °C
[Figures: p-value plot (x from 0 to 150, p-value from 0.0 to 0.5) with legend Reject at q=0.25, Reject at q=0.1, Reject at q=0.05, Reject at q=0.01; map of Rejection at q-value (color scale: 0.00 to 0.25); map of CRCM+CGCM3 Changes in Winter Temperature (color scale: -5 to 5). Map axes: Longitude, Latitude.]
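Counts of rejected regions at several q levels can be read off BH-adjusted p-values, that is, the smallest q at which each region would be rejected. The sketch below uses the plain (unweighted) BH adjustment and made-up region p-values, so it illustrates the bookkeeping only and does not reproduce the counts reported here.

```python
def bh_adjusted(pvalues):
    """BH-adjusted p-values: the smallest q at which each hypothesis is rejected."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


region_p = [0.0004, 0.003, 0.012, 0.04, 0.09, 0.30]  # hypothetical region p-values
adj = bh_adjusted(region_p)
counts = {q: sum(a <= q for a in adj) for q in (0.25, 0.1, 0.05, 0.01)}
for q, n in counts.items():
    print(f"{n} regions rejected at q={q} level")
```

Because the adjusted values do not depend on q, the number of rejections can only shrink as q decreases.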
FDR Tests on Winter Temperature
[Figures: six maps, one per null hypothesis: H0: Winter Temperature ↑ 1 °C, ↑ 2 °C, ↑ 3 °C, ↑ 4 °C, ↑ 5 °C, ↑ 6 °C. Axes: Longitude, Latitude. Color scale: 0.00 to 0.25.]
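Each null hypothesis in this series fixes the future-minus-current change at a given amount, which only shifts the numerator of the two-sample t statistic. The sketch below assumes a pooled-variance t statistic and a two-sided test, and uses a normal approximation for the p-value (an exact p-value would use Student's t with n1 + n2 − 2 degrees of freedom). The temperature values are made up, and the sample is kept short for readability rather than to match the 30-year periods.

```python
import math
import statistics
from statistics import NormalDist


def shifted_t(future, current, delta):
    """Two-sample t statistic for H0: mean(future) - mean(current) = delta."""
    nf, nc = len(future), len(current)
    pooled = ((nf - 1) * statistics.variance(future)
              + (nc - 1) * statistics.variance(current)) / (nf + nc - 2)
    se = math.sqrt(pooled * (1.0 / nf + 1.0 / nc))
    return (statistics.fmean(future) - statistics.fmean(current) - delta) / se


def approx_two_sided_p(t):
    """Normal approximation to the two-sided p-value of a t statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t)))


current = [-5.0, -4.2, -6.1, -4.8, -5.5, -5.2]  # made-up winter means, deg C
future = [-2.1, -1.5, -3.0, -2.4, -1.9, -2.6]
for delta in (1, 2, 3, 4):
    t = shifted_t(future, current, delta)
    print(delta, round(t, 2), round(approx_two_sided_p(t), 4))
```

The statistic is smallest in magnitude when delta is close to the observed difference in means, so a region is least likely to be rejected for the hypothesized change that best matches its data.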
FDR Tests on Winter Precipitation
[Figures: eight maps, one per null hypothesis: H0: Winter Prec ↓ 20 kg/m², ↓ 10 kg/m², ↑ 10 kg/m², ↑ 20 kg/m², ↑ 30 kg/m², ↑ 50 kg/m², ↑ 75 kg/m², ↑ 100 kg/m². Axes: Longitude, Latitude. Color scale: 0.00 to 0.25.]
Acknowledgement
• Dr. Steve Sain, IMAGe, NCAR
• Drs. Douglas Nychka, Tim Hoar, IMAGe, NCAR
• Dr. Armin Schwartzman, Harvard University
• University of Wyoming
• SIParCS, IMAGe, NCAR
• NARCCAP