vi. evaluation of model fit part 2: graphical analysis of model fit and related statistics

VI. Evaluation of Model Fit, Part 2: Graphical Analysis of Model Fit and Related Statistics

1. Graphs of weighted residuals versus unweighted simulated values, and minimum, maximum, and average weighted residuals

2. Graphs of weighted observations versus weighted simulated values and the correlation coefficient R

3. Graphics using independent variables and the runs statistic

4. Normal probability plots and the correlation coefficient RN2

5. Determining acceptable deviations from independent normal weighted residuals

Use GW_Chart for plotting

1. Weighted Residuals vs Simulated Values (_ws in UCODE_2005) (Book, p. 100 – 104)

Two of the requirements for a valid regression are that the weighted residuals are random and have a mean of zero.

Use a graph of the weighted residuals versus simulated values to evaluate the weighted residuals.

The weighted residuals should be evenly distributed about zero for all simulated values, and should display no trends with the simulated values.

Trends or unequal variance indicate model bias.
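The visual check described above can be backed by two simple numbers. The following is a minimal sketch, not part of UCODE_2005 or GW_Chart: the function name `trend_and_spread` is hypothetical, and the choice of a least-squares slope plus a variance ratio between the lower and upper halves of the simulated values is an illustrative assumption rather than a test from the book.

```python
def trend_and_spread(simulated, weighted_residuals):
    """Return (slope, variance_ratio) for weighted residuals vs. simulated values.

    slope          : least-squares slope of residuals on simulated values
                     (near zero when there is no linear trend).
    variance_ratio : variance of residuals for the upper half of simulated
                     values divided by that of the lower half (near one when
                     the spread is even).
    """
    n = len(simulated)
    mean_x = sum(simulated) / n
    mean_y = sum(weighted_residuals) / n
    sxx = sum((x - mean_x) ** 2 for x in simulated)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(simulated, weighted_residuals))
    slope = sxy / sxx

    # Sort by simulated value and split into lower and upper halves.
    pairs = sorted(zip(simulated, weighted_residuals))
    half = n // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]

    def variance(values):
        m = sum(values) / len(values)
        return sum((v - m) ** 2 for v in values) / (len(values) - 1)

    return slope, variance(high) / variance(low)
```

A slope far from zero or a variance ratio far from one is only a prompt to look at the graph more closely; neither number replaces the graph.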

Examples of randomly and non-randomly distributed weighted residuals:

Figures 6-1 and 6-2 of Hill and Tiedeman (pages 102-103).

Figure 6-7A (page 116) of Hill and Tiedeman shows graph for the steady-state problem.

Change to the _ws file from the book and previous codes

The book uses WEIGHTED simulated values on the horizontal axis of these graphs.

There is a statistical reason for using weighted simulated values, but in practice it is more confusing and doesn’t add much. The _ws data-exchange file now lists weighted residuals and unweighted simulated values. This is a change from the _ws file in the previous codes UCODE, MODFLOW-2000, and MODFLOWP.

If there is a very wide range in the simulated values, you can make the horizontal axis log-transformed, or use the weighted simulated values from the _ww file.

The weighted residuals also can be plotted against observed values.

1. Weighted Residuals vs. Simulated Values

[Figure 6-1 (page 102) and Figure 6-2 (page 103) of Hill and Tiedeman: panels of weighted residual (vertical axis) versus weighted simulated value (horizontal axis), including panels A and B.]

Define scale of weighted residual axis using the standard error of the regression.

(These figures use weighted simulated values, but the results are the same)
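As a sketch of how the axis scale could be derived, the snippet below computes the standard error of the regression, s, as the square root of the sum of squared weighted residuals divided by the degrees of freedom, and places ticks at whole multiples of s. The function names are hypothetical, and the list passed in is assumed to hold every weighted residual used in the regression (observations plus any prior information).

```python
import math


def standard_error_of_regression(weighted_residuals, n_params):
    """s = sqrt(S / (N - NP)).

    S  : sum of squared weighted residuals
    N  : number of weighted residuals (observations plus prior information)
    NP : number of estimated parameters
    """
    s_objective = sum(r * r for r in weighted_residuals)
    degrees_of_freedom = len(weighted_residuals) - n_params
    return math.sqrt(s_objective / degrees_of_freedom)


def residual_axis_ticks(s, n_multiples=3):
    """Tick positions at -n*s, ..., -s, 0, s, ..., n*s."""
    return [k * s for k in range(-n_multiples, n_multiples + 1)]
```

With ticks at multiples of s, weighted residuals beyond about three standard errors stand out immediately on the graph.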

1. Minimum, Maximum, and Average Weighted Residual (Book, p. 100)

Minimum and maximum weighted residuals display the range of weighted residuals. Examination of these values can help reveal:

areas where the fit to the observed data is especially poor, areas where data have been incorrectly interpreted, and data input errors.

The average weighted residual in nonlinear regression should be close to zero. In linear regression, the average residual is always exactly zero.

Caution: the average weighted residual in nonlinear regression can be close to zero even if there are other problems with the model or regression.
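A minimal sketch of these three summary values follows. The function name and the idea of carrying observation names alongside the residuals are illustrative assumptions; UCODE_2005 reports these statistics itself, so this is only meant to show what is being computed.

```python
def weighted_residual_summary(names, weighted_residuals):
    """Return the minimum, maximum, and average weighted residual.

    The minimum and maximum are returned with their observation names so
    that the locations of especially poor fit can be looked up.
    """
    pairs = list(zip(names, weighted_residuals))
    smallest = min(pairs, key=lambda pair: pair[1])
    largest = max(pairs, key=lambda pair: pair[1])
    average = sum(weighted_residuals) / len(weighted_residuals)
    return {"min": smallest, "max": largest, "average": average}
```

Returning the observation name with each extreme makes it easier to check whether the largest misfits come from a data input error or from a particular part of the model.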

DO EXERCISE 6.2a: Evaluate graph of weighted residuals versus weighted simulated values. Evaluate the minimum, maximum, and average weighted residuals.

1. Weighted Residuals vs. Weighted Simulated Values (can use unweighted simulated values)

[Figure 6-7A of Hill and Tiedeman (page 116): weighted residual versus simulated value, with separate symbols for heads, flows, and prior information. Plotted from the _ws data-exchange file.]

2. Graphs of Weighted Observed vs. Weighted Simulated Values (Book, p. 105)

Values on the graph of weighted observed versus weighted simulated values should plot close to a line with slope = 1.0.

Generally, for assessing model bias or desired properties of weighted residuals, these graphs are not as useful as graphs of weighted residuals vs. weighted simulated values.

This is partly because a large range in magnitudes of weighted observations can obscure trends in the differences between the weighted observations and the weighted simulated values.

See Figure 6-3 (page 105) of Hill and Tiedeman.

2. Weighted Observed vs. Weighted Simulated Values

[Figure 6-3 of Hill and Tiedeman (page 105): panels A and B, weighted observed value (vertical axis) versus weighted simulated value (horizontal axis).]

The data have the same problems as in the graphs with weighted residuals (repeated here), but these graphs do not reveal the problems as clearly.

[Figures with weighted residuals (panels A and B) repeated beside the figures with weighted observations. Data-exchange files: _ww and _ws.]

In UCODE_2005, the _ws file lists simulated values instead of weighted simulated values.

2. Correlation Coefficient R (Book, p. 106)

R is the correlation between the weighted simulated values and the weighted observations.

This summary statistic reflects how well the trends in the weighted simulated values match the trends in the weighted observations.

A value of R greater than 0.90 generally indicates a good match of the trends. However, R is not too useful for assessing model fit, because of the same drawbacks as for the graph of weighted observed vs. weighted simulated values.
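For reference, R as described here is an ordinary correlation coefficient between two lists. The sketch below computes it with the standard library only; the function name is hypothetical, and the inputs are assumed to be paired values such as those listed in the _ww file.

```python
import math


def correlation_r(weighted_observations, weighted_simulated):
    """Correlation coefficient between weighted observations and
    weighted simulated values."""
    n = len(weighted_observations)
    mean_obs = sum(weighted_observations) / n
    mean_sim = sum(weighted_simulated) / n
    cov = sum((o - mean_obs) * (s - mean_sim)
              for o, s in zip(weighted_observations, weighted_simulated))
    var_obs = sum((o - mean_obs) ** 2 for o in weighted_observations)
    var_sim = sum((s - mean_sim) ** 2 for s in weighted_simulated)
    return cov / math.sqrt(var_obs * var_sim)
```

Because R is dominated by the overall range of the values, a data set spanning a wide range can give R above 0.90 even when the weighted residuals show a clear trend.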

DO EXERCISE 6.2b: Graph weighted observations versus weighted simulated values and examine the correlation coefficient R.

Figure 6-7b and c (page 116) of Hill and Tiedeman show graphs of weighted and unweighted simulated versus observed values for the steady-state problem.

Weighted Observed vs. Weighted Simulated Values

[Figure 6-7b of Hill and Tiedeman (page 116), shown with panel A (weighted residual) and panel B (weighted observed value); symbols distinguish heads, flows, and prior information. Data-exchange files: _ww and _ws.]

3. Graphs Using Independent Variables (Book, p. 106)

Graphs using independent variables include plots of weighted residuals on maps of the model area or versus time.

These plots should appear random and show no obvious patterns.

Lack of randomness can be indicative of model error – for example if weighted residuals are all positive in a particular region of the model.
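The runs statistic named in the outline gives a numerical check on the same idea: too few sign changes in a sequence of weighted residuals suggests a pattern. The sketch below uses the textbook runs test with a normal approximation; the function name is hypothetical, zero residuals are counted as negative for simplicity, and the exact form printed by UCODE_2005 (for example, any continuity correction) may differ.

```python
import math


def runs_statistic(ordered_weighted_residuals):
    """Return (runs, expected_runs, z) for the signs of the residuals.

    The residuals must already be in a meaningful order, for example
    by time or by position along a section.
    """
    signs = [r > 0 for r in ordered_weighted_residuals]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg

    # A new run starts wherever the sign changes.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    expected = 2.0 * n_pos * n_neg / n + 1.0
    variance = (2.0 * n_pos * n_neg * (2.0 * n_pos * n_neg - n)
                / (n * n * (n - 1.0)))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z
```

A strongly negative z means fewer runs than expected for random signs, which is the numerical counterpart of seeing clusters of same-signed residuals on a map or time series.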

Graph weighted residuals on maps of the model layers. (Exercise 6.2c)

Figure 6-9 (page 117) of Hill and Tiedeman shows these graphs for the steady-state problem.

3. Graphs Using Independent Variables

[Figure 6-9 of Hill and Tiedeman (page 117).]

Plotted using the _w file and an .xyzt file or other source of location information.

4. Normal Probability Graphs (Book, p. 108-111)

Normality of the true errors, and therefore the weighted residuals, is not an assumption required for the regression to be valid.

However, computation of some inferential statistics, such as parameter confidence intervals, does require that the weighted residuals be normally distributed.

If weighted residuals are independent and normally distributed, they should plot on a straight line on a normal probability graph.
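A sketch of how the points on such a graph can be built: sort the weighted residuals, assign each a cumulative probability, and convert that probability to a standard normal statistic. The plotting position (k - 0.5)/n used here is a common convention and is an assumption; the function name is hypothetical.

```python
from statistics import NormalDist


def normal_probability_points(weighted_residuals):
    """Return (ordered weighted residual, standard normal statistic) pairs."""
    ordered = sorted(weighted_residuals)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for k, residual in enumerate(ordered, start=1):
        cumulative_probability = (k - 0.5) / n
        statistic = standard_normal.inv_cdf(cumulative_probability)
        points.append((residual, statistic))
    return points
```

Plotting the pairs returned here gives the normal probability graph; points that follow a straight line are consistent with independent, normally distributed weighted residuals.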

[Two example graphs of 250 normally distributed random numbers. First graph, "Cumulative Distribution of 250 Normally Distributed Random Numbers": cumulative probability versus ordered value. Second graph, "Normal Probability Plot of 250 Normally Distributed Random Numbers": standard normal statistic versus ordered value.]

4. Correlation Coefficient RN2 (Book, p. 110)

The summary statistic used to test for independence and normality of weighted residuals is RN2, the correlation coefficient between the ordered weighted residuals and the order statistics from a standard normal probability distribution function.

The hypothesis tested with RN2 is that the weighted residuals are independent and normally distributed. Critical values of RN2 for significance levels of 0.05 and 0.10 are used to test this hypothesis. The significance level is the probability of rejecting the hypothesis when it is actually true.

For example, suppose we choose a significance level of 0.05 and its critical value:

If RN2 > critical value, we accept the hypothesis that the weighted residuals are independent and normally distributed.

If RN2 < critical value, we reject the hypothesis that the weighted residuals are independent and normally distributed, and there is a 5 percent probability that we are wrong.
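The calculation and the decision rule can be sketched as follows. Several details are assumptions: the squared form of the correlation (suggested by the notation RN2), the approximation of the normal order statistics by quantiles at (k - 0.5)/n, and both function names. The critical value is not computed here; it has to be read from the output file for the chosen significance level.

```python
from statistics import NormalDist


def rn2(weighted_residuals):
    """Squared correlation between ordered weighted residuals and
    approximate standard normal order statistics."""
    ordered = sorted(weighted_residuals)
    n = len(ordered)
    mean = sum(ordered) / n
    standard_normal = NormalDist()
    tau = [standard_normal.inv_cdf((k - 0.5) / n) for k in range(1, n + 1)]

    cross = sum((e - mean) * t for e, t in zip(ordered, tau))
    sum_sq_resid = sum((e - mean) ** 2 for e in ordered)
    sum_sq_tau = sum(t * t for t in tau)
    return cross * cross / (sum_sq_resid * sum_sq_tau)


def accept_independent_normal(rn2_value, critical_value):
    """Decision rule from the text: accept when RN2 exceeds the critical value."""
    return rn2_value > critical_value
```

Values of RN2 close to 1.0 correspond to points that fall nearly on a straight line in the normal probability graph.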


The critical values for significance levels of 0.05 and 0.10 are printed in the UCODE and MF2K output files. Because this is a strict test for independence and normality, it is generally adequate to use a significance level of 0.05.

RN2 is calculated in two ways:

1. using only the weighted residuals for the dependent-variable observations.

2. using the dependent-variable and the prior information weighted residuals.

Large differences in RN2 between these two data sets indicate that the data sets are not statistically similar.

For commonly used sample sizes, this test is more powerful than other statistics used to test normality, such as Kolmogorov-Smirnov.

DO EXERCISE 6.2d: Prepare normal probability graphs and evaluate the correlation coefficient RN2.

Figure 6-11 (page 119) of Hill and Tiedeman shows the graph for the steady-state problem.

4. Normal Probability Graph

[Figure 6-11 of Hill and Tiedeman (page 119): normal probability graph with axes weighted residual and standard normal statistic; symbols distinguish heads, flows, and prior information. Plotted from the _nm data-exchange file.]

5. Deviations from Independence and Normality (Book, p. 111-113)

It is possible that the weighted residuals might fail the tests for independence and normality because:

There are too few residuals, or

The weighted residuals are normal, but they are correlated due to the fitting process of the regression, instead of being independent.

The regression methodology can result in the weighted residuals being correlated, because the regression fits the errors in the data. This correlation becomes more significant if the number of observations is small compared to the number of parameters.

Cooley and Naff (1990) developed a method for testing whether too few residuals or correlations between residuals might be the cause of the failure of the tests for independence and normality.


Steps of the method developed by Cooley and Naff are:

1. Generate normally distributed random numbers with and without the regression-induced correlation.

2. Compare the normal probability graph of the weighted residuals with those of the generated independent normally distributed random numbers (d’s). If similar deviations from a straight line are detected in the two types of graphs, then the deviations could result from too few residuals.

3. Compare the normal probability graph of the weighted residuals with those of the generated correlated normally distributed random numbers (g’s). If similar deviations from a straight line are detected in the two types of graphs, then the deviations could result from regression-induced correlation.

UCODE and RESAN-2000 can be used to generate the independent and correlated normally distributed random numbers.
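To make step 1 concrete, the sketch below generates one set of independent numbers (d) and the matching correlated set (g). It relies on the linear-regression result that residuals equal (I - H) times the errors, where H = X (X^T X)^-1 X^T and X is the weighted sensitivity matrix; applying (I - H) to d is the same as taking the residuals of a least-squares fit of d on the columns of X. Whether this matches the exact procedure in UCODE or RESAN-2000 is an assumption, and all function names are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def generate_d_and_g(weighted_sensitivities, s, rng):
    """Return (d, g): independent normal numbers with standard deviation s,
    and the same numbers with regression-induced correlation."""
    x = weighted_sensitivities
    n_obs, n_par = len(x), len(x[0])
    d = [rng.gauss(0.0, s) for _ in range(n_obs)]

    # Normal equations (X^T X) b = X^T d for the least-squares fit of d on X.
    xtx = [[sum(x[i][j] * x[i][k] for i in range(n_obs))
            for k in range(n_par)] for j in range(n_par)]
    xtd = [sum(x[i][j] * d[i] for i in range(n_obs)) for j in range(n_par)]
    b = solve_linear_system(xtx, xtd)

    # g = d - X b, which equals (I - H) d.
    g = [d[i] - sum(x[i][j] * b[j] for j in range(n_par))
         for i in range(n_obs)]
    return d, g
```

Calling `generate_d_and_g` several times with a seeded `random.Random` gives repeatable sets of d's and g's whose normal probability graphs can be compared with that of the weighted residuals, as in steps 2 and 3.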

Assess deviations from independence and normality (EXERCISE 6.2e)

Figure 6-13 (p. 121) of Hill and Tiedeman shows graphs of generated numbers versus weighted simulated values; Figure 6-14 (p. 122) shows normal probability graphs of the generated numbers.

Batch file: Exer\ex6.2e-residual-analysis.bat

5. Deviations in normal probability graphs

[Figure 6-14A: normal probability graphs of independent random numbers, Sets 1 to 4, with symbols for heads, flows, and prior information. Figure 6-14B: the same for correlated random numbers, Sets 1 to 4. Figure 6-11 (weighted residuals for the steady-state model) is repeated for comparison. Data-exchange files: _nm, _rd, _rg.]

Run residual_analysis to get the _rd and _rg files. Plot with gw_chart. (gw_chart reverses the axes compared to the graphs here.)
