Detection Limits, Upper Limits, and Confidence Intervals in High-Energy Astrophysics · 2006-08-21



Page 1:

Detection Limits, Upper Limits,

and Confidence Intervals

in High-Energy Astrophysics

David van Dyk

Department of Statistics, University of California, Irvine

Joint work with Vinay Kashyap, The CHAS Collaboration, and SAMSI SaFeDe Working Group

Page 2:

Outline of Presentation

This talk has two parts.

1. Soap Box

• What do short Confidence Intervals Mean?

• Is there an all-purpose statistical solution?

• Why compute Confidence Intervals at all?

2. Illustrations from High-Energy Astro-statistics

• Detection and Detection Limits

• Upper Limits for undetected sources

• Quantifying uncertainty for detected sources

• Nuisance parameters

Page 3:

What is a Frequency Confidence Interval?

Background contaminated Poisson

N ∼ Poisson(µ + b),

with b = 2.88 and µ = 1.25.

• Confidence Interval:

{µ : N ∈ I(µ)},

where Pr(N ∈ I(µ)|µ) ≥ 95%.

• Values of µ with a propensity to generate the observed N.

[Figure: Sampling Distribution of the 95% CI — cumulative probability (0.0–1.0) against the confidence interval for µ (0–15), with one curve for each of N = 0, 1, …, 8 and N > 8.]

The CI gives values of µ that are plausible given the observed data.
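This construction can be sketched numerically. A minimal illustration, not from the slides — the grid, the 95% level, and the use of central Poisson acceptance regions are my assumptions:

```python
# Invert central acceptance regions I(mu) to get {mu : N in I(mu)} for
# N ~ Poisson(mu + b), with known background b = 2.88 (as on the slide).
import numpy as np
from scipy.stats import poisson

b = 2.88
mu_grid = np.linspace(0.0, 15.0, 1501)   # candidate values of mu

def acceptance_region(mu, level=0.95):
    """Central interval I(mu) of counts with Pr(N in I(mu) | mu) >= level."""
    lam = mu + b
    lo = poisson.ppf((1 - level) / 2, lam)       # at most 2.5% of mass below lo
    hi = poisson.ppf(1 - (1 - level) / 2, lam)   # at most 2.5% of mass above hi
    return lo, hi

def confidence_interval(N, level=0.95):
    """Endpoints of {mu on the grid : N in I(mu)}; None if the set is empty."""
    keep = [mu for mu in mu_grid
            if acceptance_region(mu, level)[0] <= N <= acceptance_region(mu, level)[1]]
    return (min(keep), max(keep)) if keep else None

for N in range(5):
    print(N, confidence_interval(N))
```

For N = 0 the confidence set is a short interval pinned near zero, which previews the behavior discussed on the next slide.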

Page 4:

Short or Empty Frequency Confidence Intervals

What do they mean?

There are few values of µ that are plausible given the observed data.

What do they not mean?

The experimental uncertainty is small. (Std Error or Risk of µ̂??)

What if CI are repeatedly short or empty?

Short intervals should be an uncommon occurrence. If they are not, the statistical model is likely wrong, regardless of the strength of the subjective prior belief in the model.

Page 5:

Pre-Data and Post-Data Probability

• A Frequency Confidence Interval has (at least) a 95% chance of covering the true value of µ.

• What is the chance that ∅ contains µ?

There is a 95% chance that Confidence Intervals

computed in this way contain the true value of µ.

• Frequency-based probability says nothing about a particular interval.

• Bayesian methods are better suited to quantify this type of probability.

• Precise probability statements may not be the most relevant probability statements.

Our intuition leads us to condition on the observed data.

Page 6:

Leaning on the Science Rather than on the Statistics

Desirable Properties of Confidence Intervals Include (Mandelkern, 2002):

1. Based on well-defined principles that are neither arbitrary nor subjective

2. Do not depend on prior or outside knowledge about the parameter

3. Equivariant

4. Convey experimental uncertainty

5. Correspond to precise probability statements

But....

1. Mightn’t there be physical reasons to prefer one parameterization over another?

2. Why so much concern over using outside information regarding the values of the parameter but none over the form of the statistical model?

Is it easier to make a final decision on the choice of CI than on the

choice of parameterization?

Page 7:

Say What You Mean, Mean What You Say

• Confidence Intervals are not designed to represent experimental uncertainty.

• Rather than trying to use Confidence Intervals for this purpose, an estimator that is designed to represent experimental uncertainty should be used.

• For example, we might estimate the risk of an estimator.

Risk(µ, µ̂) = Eµ[ Loss(µ, µ̂) ] = ∑_{N=0}^∞ Loss(µ, µ̂(N)) f(N | µ)
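As a concrete instance of this formula, the infinite sum can be truncated numerically. The squared-error loss and the estimator µ̂(N) = max(N − b, 0) below are illustrative assumptions, not choices made in the talk:

```python
# Risk(mu, mu_hat) = E_mu[Loss(mu, mu_hat)] = sum_N Loss(mu, mu_hat(N)) f(N | mu)
# for background-contaminated Poisson data, N ~ Poisson(mu + b).
import numpy as np
from scipy.stats import poisson

b = 2.88

def mu_hat(N):
    # A simple background-subtracted, truncated-at-zero estimator (hypothetical).
    return max(N - b, 0.0)

def risk(mu, N_max=200):
    Ns = np.arange(N_max + 1)
    losses = np.array([(mu - mu_hat(N)) ** 2 for N in Ns])   # squared-error loss
    probs = poisson.pmf(Ns, mu + b)                          # f(N | mu)
    return float(np.sum(losses * probs))                     # truncated infinite sum

print(risk(1.25))   # risk at the mu used earlier on the slides
```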

Page 8:

Why Use Confidence Intervals?

• The Gaussian distribution can be fully represented by two parameters, the mean and the standard deviation.

• No information is lost if we report a simple central 95% probability interval.

• The CLT ensures that for large samples many likelihoods and posterior distributions are approximately Gaussian.

[Figure: a Gaussian density annotated with its mean and SD.]

• The bell shape of the Gaussian distribution gives a simple interpretation to such intervals.

• Analogous reasoning yields a large-sample interpretation for Frequency Confidence Intervals.

The 95% Confidence Interval:

A story of Gaussian sampling and posterior distributions.

Page 9:

But what if....

• How do we summarize this distribution?

• Why do we want to? What is the benefit?

• With more complex distributions, simple summaries are not available.

Solution: Do Not Summarize!!

Page 10:

Multiple Modes

[Figure: six multimodal posterior probability density functions of Energy (keV), plotted over 0–6 keV, for obs-id 47, 62, 69, 70, 71, and 1269.]

An Interval is not a helpful summary!

Page 11:

A False Sense of Confidence

[Figure: a plotted distribution] versus [2.03, 7.87]

• Confidence Intervals appear to give a definite range of plausible values.

• Plots seem wishy-washy.

• But the likelihood (or posterior distribution) contains full information.

Confidence Intervals Give a False Sense of Confidence.

Page 12:

Outline of Presentation

This talk has two parts.

1. Soap Box

• What do short Confidence Intervals Mean?

• Is there an all-purpose statistical solution?

• Why compute Confidence Intervals at all?

2. Illustrations from High-Energy Astro-statistics

• Detection and Detection Limits

• Upper Limits for undetected sources

• Quantifying uncertainty for detected sources

• Nuisance Parameters

Page 13:

A Survey

• We observe a field of optical sources with the Chandra X-ray Observatory.

• There are a total of n = 1377 potential X-ray sources.

• Some correspond to multiple optical sources.

• All observations are background contaminated.

• An independent background-only observation is available.

Model:

Y^S_i ∼ Poisson( κ^S_i (λ^S_i + λ^B_j(i)) ),

Y^B_j ∼ Poisson( κ^B_j λ^B_j ),

λ^S_i ∼ Gamma(α, β).

We specify non-informative prior distributions for (λ^B_1, . . . , λ^B_J, α, β).
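A minimal forward simulation of this hierarchical model — the hyperparameter values, the single shared background rate (collapsing the λ^B_j(i) structure), and unit exposures κ^S = κ^B = 1 are hypothetical choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1377                    # number of potential X-ray sources (from the slide)
alpha, beta = 1.0, 0.5      # hypothetical Gamma(shape, rate) hyperparameters
kappa_S = kappa_B = 1.0     # exposures, set to 1 for illustration
lam_B = 2.0                 # one shared background rate (simplifying lam_B_j(i))

lam_S = rng.gamma(alpha, 1.0 / beta, size=n)   # NumPy's gamma takes scale = 1/rate
Y_S = rng.poisson(kappa_S * (lam_S + lam_B))   # background-contaminated counts
Y_B = rng.poisson(kappa_B * lam_B, size=n)     # independent background observations

print(Y_S[:5], Y_B[:5])
```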

Page 14:

Detection Limits and P-values

How large must y^S_i be in order to conclude there is an X-ray source?

Procedure:

1. Suppose λ^S_i = 0.

2. Use Y^B_j to compute the prior (gamma) distribution of λ^B_j(i).

3. Compute the 95th percentile of the marginal (negative-binomial) distribution of Y^S_i.

4. Set the detection limit, y*_i, to this percentile.

5. If y^S_i > y*_i, conclude there is an X-ray source.

A similar strategy can be used to compute a p-value.
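Steps 1–4 reduce to a negative-binomial percentile, since a Poisson count whose rate is Gamma-distributed is marginally negative binomial. A sketch, with a hypothetical Gamma(3, 1) distribution for λ^B standing in for the update from Y^B in step 2:

```python
from scipy.stats import nbinom

def detection_limit(a, r, kappa_S=1.0, level=0.95):
    """95th percentile of Y^S under lambda_S = 0 and lambda_B ~ Gamma(a, rate=r).

    Poisson(kappa_S * lambda_B) mixed over the Gamma is negative binomial
    with size a and success probability p = r / (r + kappa_S).
    """
    p = r / (r + kappa_S)
    return int(nbinom.ppf(level, a, p))

y_star = detection_limit(a=3.0, r=1.0)
print(y_star)
# Step 5: conclude there is an X-ray source if the observed y^S exceeds y_star.
```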

Page 15:

Illustrating Detection Limits

[Figure: the detection limit y* (0–15) as a function of λ^B (0–10).]

Here κ^S = κ^B = 1, and λ^B is treated as known.

Page 16:

Upper Limits

If there is no detection, how large might λ^S_i be?

Procedure:

1. Suppose Y^S_i ∼ Poisson( κ^S_i (λ^S_i + λ^B_i) ).

2. We detect a source if y^S_i > y*_i.

3. Let the upper limit, U^S_i, be the smallest λ^S_i such that Pr(Y^S_i > y*_i | λ^S_i) > 95%.

4. Note: Y^S_i | λ^S_i is the convolution of a Poisson and a negative binomial.
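The search in step 3 can be sketched by scanning λ^S upward. To keep the example simple I treat λ^B as known (λ^B = 2, κ^S = 1), so Y^S is plain Poisson rather than the Poisson–negative-binomial convolution of step 4; the detection limit y* = 7 and the grid step are illustrative choices:

```python
import numpy as np
from scipy.stats import poisson

lam_B, y_star, level = 2.0, 7, 0.95

def detection_prob(lam_S):
    """Pr(Y^S > y_star | lam_S) with Y^S ~ Poisson(lam_S + lam_B)."""
    return 1.0 - poisson.cdf(y_star, lam_S + lam_B)

def upper_limit():
    """Smallest lam_S on the grid whose detection probability exceeds level."""
    for lam_S in np.arange(0.0, 30.0, 0.01):   # scan upward on a fine grid
        if detection_prob(lam_S) > level:
            return round(float(lam_S), 2)
    return None

print(upper_limit())
```

Because the detection probability is monotone in λ^S, the first grid point past the 0.95 crossing is the upper limit to grid precision.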

Page 17:

Illustrating Upper Limits

[Figure: Power or Probability of Detection — power = prob(detection) (0.0–1.0) as a function of λ^S (0–10); the curve crosses 0.95 at the upper limit.]

Here we assume λ^B = 2.

The 95% Upper Limit computed with a 95% Detection Limit.

Page 18:

The Effect of Choice of Detection Limit on Upper Limits

[Figure, left panel: Probability of Detection — power = prob(detection) against λ^S (0–15), one curve per Detection Limit. The probabilities of Type I error, printed on the curves, are 0.143, 0.053, 0.017, 0.005, and 0.001; the Detection Limits are 3, 4, 5, 6, and 7, respectively.]

[Figure, right panel: 50% Upper Limits computed with the various Detection Limits: 1.65, 2.65, 3.65, 4.65, and 5.65.]

Page 19:

The Effect of Upper Limit Probability on Upper Limits

[Figure, left panel: 50% Upper Limits computed with the various Detection Limits — power = prob(detection) against λ^S (0–15); the upper limits are 1.65, 2.65, 3.65, 4.65, and 5.65.]

[Figure, right panel: 95% Upper Limits computed with the same Detection Limits; the upper limits are 5.75, 7.15, 8.53, 9.95, and 11.15.]

Page 20:

Interval Estimates

For detected or non-detected sources, we may wish to quantify uncertainty for λ^S_i.

Three typical posterior distributions, computed using an MCMC sampler:

[Figure: three posterior densities for λ^S. Left: Detection = FALSE, Upper Limit = 1.03 (λ^S ≈ 0–0.8). Middle: Detection = TRUE, Upper Limit = NA (λ^S ≈ 0–8). Right: Detection = TRUE, Upper Limit = NA (λ^S ≈ 1.5–4.5).]

Confidence Intervals Do Not Adequately Represent All Distributions.

Page 21:

Nuisance Parameters

Typical Strategy:

• Average (or perhaps optimize) over nuisance parameters.

• Check frequency properties via simulation studies.

• If necessary, adjust procedures and priors to improve frequency properties.

Sometimes (partially) ancillary statistics are available:

T = Y^S + Y^B ∼ Poisson(λ^S + 2λ^B) = Poisson( (ρ + 1) λ^B )

Y^S | T ∼ Binomial( T, (λ^S + λ^B) / (λ^S + 2λ^B) ) = Binomial( T, ρ / (1 + ρ) )

With ρ = (λ^S + λ^B)/λ^B and κ^S = κ^B = 1.

We use the Binomial Dist’n to compute Detection Limits and Upper Limits.

1. Both procedures appear to detect exactly the same sources.

2. The Binomial Procedure produces more conservative Upper Limits (for ρ).

• If the Detection Limit is T, the Upper Limit for ρ is +∞.
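The conditional detection rule can be sketched as a one-sided binomial test of ρ = 1 (equivalently λ^S = 0), using Y^S | T ∼ Binomial(T, ρ/(1 + ρ)) with κ^S = κ^B = 1 as on the slide; the test level α = 0.05 and the example counts are my choices:

```python
from scipy.stats import binom

def binomial_detection(y_S, y_B, alpha=0.05):
    """Detect a source if Pr(Y^S >= y_S | T, rho = 1) < alpha."""
    T = y_S + y_B
    # Under lambda_S = 0 we have rho = 1, so Y^S | T ~ Binomial(T, 1/2).
    p_value = 1.0 - binom.cdf(y_S - 1, T, 0.5)
    return bool(p_value < alpha), p_value

print(binomial_detection(9, 1))   # strong excess over background
print(binomial_detection(3, 3))   # consistent with background only
```

Conditioning on T removes the nuisance parameter λ^B from the test entirely, which is what makes the statistic (partially) ancillary.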

Page 22:

Concluding Recommendations

1. Plot the (profile) likelihood or (marginal) posterior distribution!

2. Don’t expect too much of confidence intervals.

3. Remember frequency confidence intervals & p-values use pre-data probability.

• Rejecting may not mean the alternative is more likely than the null.

4. Remember that subjective model choices can be just as influential as a subjective prior distribution.

5. Be willing to rethink subjective model choices based on prior information.

6. Be willing to use your data to assess the likely values of nuisance parameters. Use methods that take advantage of this information.
