
Post on 24-Dec-2015


Terminating Statistical Analysis

By Dr. Jason Merrick


Simulation with Arena — Intermediate Modeling and Terminating Statistical Analysis C6/2

Statistical Analysis of Output Data: Terminating Simulations

• Random input leads to random output (RIRO)

• Run a simulation (once): what does it mean?

– Was this run “typical” or not?

– Variability from run to run (of the same model)?

• Need statistical analysis of output data

• Time frame of simulations

– Terminating: Specific starting, stopping conditions

– Steady-state: Long-run (technically forever)

– Here: Terminating


Point and Interval Estimation

• Suppose we are trying to estimate an output measure $E[Y] = \theta$ based upon a simulated sample $Y_1, \ldots, Y_n$

• We come up with an estimate $\hat{\theta}$

– For instance, the sample mean

$$\hat{\theta} = \bar{Y} = \frac{1}{n} \sum_{i=1}^{n} Y_i$$

• How good is this estimate?

– Unbiased

– Low variance (possibly minimum variance)

– Consistent

– Confidence interval


T-distribution

• The t-statistic is given by

$$t = \frac{\hat{\theta} - \theta}{\hat{\sigma}(\hat{\theta})}$$

– If the $Y_1, \ldots, Y_n$ are normally distributed and $\hat{\theta} = \bar{Y}$, then the t-statistic is t-distributed

– If the $Y_1, \ldots, Y_n$ are not normally distributed, but $\hat{\theta} = \bar{Y}$, then the t-statistic is approximately t-distributed thanks to the Central Limit Theorem

  • This requires a reasonably large sample size n

– We require an estimate of the variance of $\hat{\theta}$, denoted $\hat{\sigma}^2(\hat{\theta})$


T-distribution Confidence Interval

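• An approximate $100(1-\alpha)\%$ confidence interval for $\theta$ is then

$$\left[\hat{\theta} - t_{1-\alpha/2,\,f}\,\hat{\sigma}(\hat{\theta}),\ \hat{\theta} + t_{1-\alpha/2,\,f}\,\hat{\sigma}(\hat{\theta})\right]$$

– The center of the confidence interval is $\hat{\theta}$

– The half-width of the confidence interval is $t_{1-\alpha/2,\,f}\,\hat{\sigma}(\hat{\theta})$

– $t_{1-\alpha/2,\,f}$ is the $100(1-\alpha/2)\%$ percentile of a t-distribution with f degrees of freedom.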

[Figure: plot of Sample Repetition against Parameter Value (axis from 0 to 30)]


T-distribution Confidence Interval

• Case 1: $Y_1, \ldots, Y_n$ are independent

– This is the case when you are making n independent replications of the simulation

  • Terminating simulations

  • Try to force this with steady-state simulations

– Compute your estimate $\hat{\theta}$ and then compute the sample variance

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (Y_i - \hat{\theta})^2$$

– $s^2$ is an unbiased estimator of the population variance, so $s^2/n$ is an unbiased estimator of $\sigma^2(\hat{\theta})$, with f = n-1 degrees of freedom
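As an illustration of Case 1, the following minimal Python sketch computes the point estimate, the sample variance, and the t-based half-width from a set of independent replication outputs. The function name `t_confidence_interval`, the sample data, and the choice to pass the critical value in as an argument (read from a t-table) are assumptions made for this example, not part of the original slides.

```python
import math
import statistics


def t_confidence_interval(replication_outputs, t_crit):
    """Return (center, half_width) of a t-based confidence interval.

    t_crit is the critical value t_{1-alpha/2, n-1}, supplied by the caller
    (for example, read from a t-table).
    """
    n = len(replication_outputs)
    if n < 2:
        raise ValueError("need at least two replications")
    theta_hat = statistics.fmean(replication_outputs)  # point estimate (sample mean)
    s2 = statistics.variance(replication_outputs)      # sample variance, divides by n - 1
    half_width = t_crit * math.sqrt(s2 / n)             # t * estimated std. dev. of theta_hat
    return theta_hat, half_width


# Hypothetical outputs from n = 10 independent replications.
outputs = [12.1, 9.8, 11.4, 10.7, 13.0, 10.2, 11.9, 9.5, 12.6, 10.8]
T_CRIT_95_DF9 = 2.262  # t_{0.975, 9} from standard tables
center, half_width = t_confidence_interval(outputs, T_CRIT_95_DF9)
print(f"{center:.3f} +/- {half_width:.3f}")
```

Passing the critical value explicitly keeps the sketch within the standard library; a statistics package could supply it instead.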


T-distribution Confidence Interval

• Case 2: $Y_1, \ldots, Y_n$ are not independent

– This is the case when you are using data generated within a single simulation run

  • Sequences of observations in long-run steady-state simulations

– $s^2/n$ is a biased estimator of $\sigma^2(\hat{\theta})$

– $Y_1, \ldots, Y_n$ is an auto-correlated sequence or a time-series

– Suppose that our point estimator for $\theta$ is $\bar{Y}$; a general result from mathematical statistics is

$$\sigma^2(\hat{\theta}) = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} \mathrm{cov}(Y_i, Y_j)$$


T-distribution Confidence Interval

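• Case 2: $Y_1, \ldots, Y_n$ are not independent

– For n observations there are $n^2$ covariances to estimate

– However, most simulations are covariance stationary, that is

$$\mathrm{cov}(Y_i, Y_{i+k}) = \mathrm{cov}(Y_j, Y_{j+k}) \quad \text{for all } i, j \text{ and } k$$

– Recall that k is the lag, so for a given lag, the covariance remains the same throughout the sequence

– If this is the case then there are n-1 lagged covariances to estimate. Writing $\sigma^2$ for the common variance of the $Y_i$ and $\rho_k$ for the lag-k autocorrelation (the lag-k covariance divided by $\sigma^2$),

$$\sigma^2(\hat{\theta}) = \frac{\sigma^2}{n} \left[ 1 + 2 \sum_{k=1}^{n-1} \left( 1 - \frac{k}{n} \right) \rho_k \right]$$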
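The step from the general double sum to the lagged form can be checked numerically. The sketch below assumes a covariance-stationary sequence with common variance `sigma2` and an illustrative autocorrelation function `rho(k) = phi ** k`; these names and values are hypothetical choices for the example only.

```python
def var_mean_double_sum(n, sigma2, rho):
    """(1/n^2) * sum_i sum_j cov(Y_i, Y_j), with cov(Y_i, Y_j) = sigma2 * rho(|i - j|)."""
    total = sum(sigma2 * rho(abs(i - j)) for i in range(n) for j in range(n))
    return total / n**2


def var_mean_lag_form(n, sigma2, rho):
    """(sigma2 / n) * [1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho(k)]."""
    bracket = 1 + 2 * sum((1 - k / n) * rho(k) for k in range(1, n))
    return sigma2 / n * bracket


phi = 0.6  # assumed lag-1 autocorrelation for illustration


def rho(k):
    """Assumed autocorrelation function; rho(0) == 1."""
    return phi ** k


n, sigma2 = 50, 4.0
by_double_sum = var_mean_double_sum(n, sigma2, rho)
by_lag_form = var_mean_lag_form(n, sigma2, rho)
print(by_double_sum, by_lag_form, sigma2 / n)
```

Both functions should agree up to rounding, and with positive autocorrelation both exceed the independent-case value `sigma2 / n`.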


Time-Series Examples

[Figure: four plots of Observed Value against Time or Observations (1 to 101). Panel titles: Positively correlated sequence with lag 1; Positively correlated sequence with lags 1 & 2; Negatively correlated sequence with lag 1; Positively correlated, covariance non-stationary sequence.]


T-distribution Confidence Interval

• Case 2: $Y_1, \ldots, Y_n$ are not independent

– What is the effect of this bias term?

$$B = \frac{E[s^2/n]}{\sigma^2(\hat{\theta})} = \frac{n/c - 1}{n - 1}, \qquad c = 1 + 2 \sum_{k=1}^{n-1} \left( 1 - \frac{k}{n} \right) \rho_k$$

where $\rho_k$ is the lag-k autocorrelation

– For primarily positively correlated sequences B < 1, so the half-width of the confidence interval will be too small

  • Overstating the precision => make conclusions you shouldn’t

– For primarily negatively correlated sequences B > 1, so the half-width of the confidence interval will be too large

  • Understating the precision => don’t make conclusions you should
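To get a feel for the size of the bias, the sketch below evaluates c and B for an assumed autocorrelation function `rho(k) = phi ** k`, with a positive and a negative `phi`. The function names and the chosen values are illustrative assumptions, not taken from the slides.

```python
def correlation_factor(n, rho):
    """c = 1 + 2 * sum_{k=1}^{n-1} (1 - k/n) * rho(k)."""
    return 1 + 2 * sum((1 - k / n) * rho(k) for k in range(1, n))


def bias_factor(n, rho):
    """B = (n/c - 1) / (n - 1), the ratio E[s^2/n] / var(sample mean)."""
    c = correlation_factor(n, rho)
    return (n / c - 1) / (n - 1)


n = 100
b_positive = bias_factor(n, lambda k: 0.6 ** k)       # positively correlated
b_negative = bias_factor(n, lambda k: (-0.6) ** k)    # alternating-sign correlation
b_independent = bias_factor(n, lambda k: 0.0)         # no correlation at lags k >= 1
print(b_positive, b_negative, b_independent)
```

Under these assumptions the positive case gives B well below 1 (s²/n understates the true variance), the negative case gives B above 1, and the independent case recovers B = 1.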


Strategy for Terminating Simulations

• For terminating case, make IID replications

– Simulate module: Number of Replications field

– Check both boxes for Initialization Between Reps.

– Get multiple independent Summary Reports

– Different random seeds for each replication

• How many replications?

– Trial and error (now)

– Approximate no. for acceptable precision

– Sequential sampling

• Save summary statistics (e.g. average, variance) across replications

– Statistics Module, Outputs Area, save to files
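The replication strategy above can be sketched outside Arena with a toy model. The Python example below is a stand-in, not Arena's Simulate or Statistics modules: it assumes a single-server queue that starts empty and stops after a fixed number of customers, and every name and parameter in it (`run_replication`, `num_customers`, the rates, the seeds) is hypothetical.

```python
import random
import statistics


def run_replication(seed, num_customers=200, arrival_rate=0.9, service_rate=1.0):
    """One terminating run: start empty, stop after num_customers.

    Returns the average waiting time in queue for this replication.
    """
    rng = random.Random(seed)  # separate, reproducible random stream per replication
    wait = 0.0                 # the first customer finds an empty system
    total_wait = 0.0
    for _ in range(num_customers):
        total_wait += wait
        service = rng.expovariate(service_rate)
        interarrival = rng.expovariate(arrival_rate)
        # Lindley recursion: waiting time of the next customer.
        wait = max(0.0, wait + service - interarrival)
    return total_wait / num_customers


# Each replication restarts from the same initial state with a different seed,
# so each one contributes a single summary observation.
seeds = range(1, 11)
outputs = [run_replication(seed) for seed in seeds]
mean_wait = statistics.fmean(outputs)
var_wait = statistics.variance(outputs)
print(mean_wait, var_wait)
```

The list `outputs` plays the role of the saved across-replication summary statistics, which then feed the confidence-interval calculation.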


Half Width and Number of Replications

• Prefer smaller confidence intervals (precision)

• Notation:

– n = no. replications

– $\bar{Y}$ = sample mean

– s = sample standard deviation

– $t_{n-1,\,1-\alpha/2}$ = critical value from tables

• Confidence interval: $\bar{Y} \pm t_{n-1,\,1-\alpha/2}\, s/\sqrt{n}$

• Half-width = $t_{n-1,\,1-\alpha/2}\, s/\sqrt{n}$

– Want this to be “small,” say < h where h is prespecified


Half Width and Number of Replications

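$$\begin{array}{c|cccc|c}
\text{Replication} & \multicolumn{4}{c|}{\text{Observations}} & \text{Average} \\ \hline
1 & Y_{11}, & Y_{12}, & \ldots, & Y_{1 m_1} & \bar{Y}_1 \\
2 & Y_{21}, & Y_{22}, & \ldots, & Y_{2 m_2} & \bar{Y}_2 \\
\vdots & & & & & \vdots \\
n & Y_{n1}, & Y_{n2}, & \ldots, & Y_{n m_n} & \bar{Y}_n
\end{array}$$

Across replications, these are summarized by $\bar{Y}$ and $s^2$.

• To improve the half-width, we can

– Increase the length of each simulation run and so increase the $m_i$

– What does increasing the run length do?

– Increase the number of replications

Half-width = $t_{n-1,\,1-\alpha/2}\, s/\sqrt{n}$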


Half Width and Number of Replications (cont’d.)

• Set half-width = h, solve for

$$n = t_{n-1,\,1-\alpha/2}^2 \, \frac{s^2}{h^2}$$

• Not really solved for n (t, s depend on n)

• Approximation:

– Replace t by z, corresponding normal critical value

– Pretend that current s will hold for larger samples

– Get

$$n \cong z_{1-\alpha/2}^2 \, \frac{s^2}{h^2}$$

where s = sample standard deviation from “initial” number $n_0$ of replications

• Easier but different approximation:

$$n \cong n_0 \, \frac{h_0^2}{h^2}$$

where $h_0$ = half width from “initial” number $n_0$ of replications

• n grows quadratically as h decreases.
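Both approximations are simple to evaluate once an initial set of replications is available. The sketch below is one possible Python rendering; the function names and the pilot values (`n0 = 10`, `s = 1.2`, `h0 = 0.86`, target `h = 0.4`) are hypothetical, and results are rounded up because the number of replications must be a whole number.

```python
import math
from statistics import NormalDist


def replications_normal_approx(s, h, alpha=0.05):
    """n ~= z_{1-alpha/2}^2 * s^2 / h^2, rounded up."""
    z = NormalDist().inv_cdf(1 - alpha / 2)  # standard normal critical value
    return math.ceil((z * s / h) ** 2)


def replications_from_half_width(n0, h0, h):
    """n ~= n0 * h0^2 / h^2, rounded up."""
    return math.ceil(n0 * (h0 / h) ** 2)


# Hypothetical pilot study: n0 = 10 replications, s = 1.2, half-width h0 = 0.86.
n_by_z = replications_normal_approx(s=1.2, h=0.4)
n_by_h0 = replications_from_half_width(n0=10, h0=0.86, h=0.4)
n_by_h0_halved = replications_from_half_width(n0=10, h0=0.86, h=0.2)
print(n_by_z, n_by_h0, n_by_h0_halved)
```

With these inputs the second approximation asks for more replications than the first, because `h0` was computed with the larger t critical value rather than z. Halving the target half-width roughly quadruples the requirement.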