Analysis of Uncertainty in River
Water Quality Modelling
by
Neil McIntyre
A thesis submitted for the degree of Doctor of Philosophy from
the University of London
2004
Department of Civil and Environmental Engineering, Imperial
College London
Abstract
The case is given for attention to the evaluation of uncertainty in water quality modelling,
in the contexts of new demands for assessment of risk to water quality status, and the
typical paucity of supporting data. A framework for the modelling of water quality is outlined
and presented as a potentially valuable component of broader risk assessment
methodologies, and potentially useful methods of numerical uncertainty analysis are
reviewed and demonstrated. A selective library of dynamic models and numerical tools
for model solving and uncertainty analysis is compiled into novel software for model
uncertainty analysis and risk-based decision-support. This software is applied to a series
of case studies in an exploration of the underlying numerical problems and their relevance
to modelling and management objectives using relatively sparse data sets. Issues
examined in some detail are the importance of reconciling numerical solution tolerances
with overall model precision; relative effects of numerical approximations, data and
model structural biases on optimal design of field experiments and on prediction
reliability; and the value and limitations of extending established methods of uncertainty
analysis to decision-support. These investigations lead to discussions about priorities for
the water quality modelling research community, in the face of contemporary and
emerging numerical, technological and management problems. The main conclusion is
that the current generation of modelling software can make only a very limited contribution
to risk-based decision support, due to the general absence of formal uncertainty analysis
capabilities. This restriction is becoming more important due to new, ambitious spatial
and ecological management challenges. Further research into numerical issues is needed
to provide tools that allow these new challenges to be met, as well as to resolve persistent
deficiencies in modelling capability. A more pressing concern, however, is that
practitioners and their clients begin to confront issues of uncertainty, and create a demand
for software that facilitates risk-based planning.
Declaration
The work presented in this Thesis is my own except where otherwise
acknowledged.
Neil McIntyre
Acknowledgements
Without diminishing the role of all my colleagues, friends and family in supporting my
work on this PhD dissertation, here I give specific acknowledgement to a special few.
I have benefited from working with outstandingly talented and dedicated hydrologists
whose enthusiasm and high professional standards have been a constant source of
inspiration. I owe special thanks to my supervisor Howard Wheater for the confidence he
has had in my potential, allowing me to conduct the research in my own time and in my
own way, and for his intelligent management of my work-load. I am also indebted to my
colleague Adrian Butler on this latter point, as well as for his mentoring and friendship
through challenging times. Steve Chapra was the key motivating figure behind my
decision to begin a PhD - his enthusiasm taught me that an academic career was not such
a bad idea. I had the pleasure to work alongside Thorsten Wagener and Luis Camacho for
three years, and I thank them for making the day-to-day PhD experience rewarding and
fun. I thank my co-authors on the papers which have emanated from the PhD research for
their expert advice – Howard Wheater, Thorsten Wagener, Steve Chapra, Matthew Lees,
Beth Jackson, and Zeng Siyu. My thanks also to Angela Frederick for her zealous
assistance in putting this and related documents together.
My strongest thanks are for my parents, Anne and Donald McIntyre. Only thirty-four
years of nurturing and encouragement from them have permitted the effort that has
resulted in this document. Finally, I acknowledge with gratitude the role of my partner
during the PhD years, Juliet Vuong, who pushed me out of a few low points and
celebrated my milestones.
Preface
Six of the chapters in this dissertation are based on works either published by Neil
McIntyre or awaiting publication (as listed below). Although these works have co-
authors, the substantial intellectual, research and writing effort was that of the first author,
and any specific contributions of co-authors have either not been presented in this
dissertation or have been duly acknowledged.
Chapter 1
McIntyre, N.R., Wagener, T., Wheater, H.S. and Zeng, S. 2003. Uncertainty and risk in
water quality modelling and management. Journal of Hydroinformatics 5(4), 259-274.
Chapter 2
McIntyre, N.R., Wheater, H.S. and Lees, M.J. 2002. Estimation and propagation of
parametric uncertainty in environmental models. Journal of Hydroinformatics 4(3),
177-198.
McIntyre, N.R., Lees, M.J. and Wheater, H.S. 2001. A review and demonstration of
methods of uncertainty analysis in numerical environmental modelling. Proceedings of
the 8th Europia International Conference - Advances in Design Sciences and
Technology, Delft, The Netherlands, 183-196.
Chapter 4
McIntyre, N.R. and Wheater, H.S. 2003. A tool for risk-based analysis of surface water
quality. Environmental Modelling and Software (In press).
Chapter 5
McIntyre, N.R., Jackson, B., Wheater, H.S. and Chapra, S. 2003. Numerical efficiency in
Monte Carlo simulations - a case study of a river thermodynamic model. ASCE
Journal of Environmental Engineering (In press).
Chapter 6
McIntyre, N.R. and Wheater, H.S. 2003. Calibration of an in-river phosphorus model:
prior evaluation of data needs and model uncertainty. Journal of Hydrology (In press).
Chapter 7
McIntyre, N.R., Wagener, T., Wheater, H.S. and Chapra, S.C. 2003. Risk-based
modelling of surface water quality - A case study of the Charles River, Massachusetts.
Journal of Hydrology 274, 225-247.
Table of Contents
1. Introduction
   1.1 Introduction to analysis of uncertainty in surface water quality modelling
      1.1.1 Motivation
      1.1.2 Causes of uncertainty
      1.1.3 Analysis of uncertainty
      1.1.4 Surface water quality modelling applications
   1.2 Introducing a framework for risk-based surface water quality modelling
      1.2.1 Risk in context
      1.2.2 A framework outline
      1.2.3 Technical considerations
      1.2.4 A tool for risk-based management of water quality
   1.3 Background to the case studies
      1.3.1 The Hun River characteristics
      1.3.2 The Charles River characteristics
   1.4 Explanation of the structure of the remainder of this dissertation
2. Estimation and propagation of parametric uncertainty in environmental models
   2.1 Introduction
      2.1.1 Background and scope of chapter
      2.1.2 The sources of uncertainty and their representation in the model
   2.2 Approaches to uncertainty-based model calibration
      2.2.1 Definitions
      2.2.2 Objective functions and likelihood functions
      2.2.3 The significance of model structure errors and data bias
      2.2.4 Possibility theory and the HSY method
      2.2.5 Generalised Likelihood Uncertainty Estimation
      2.2.6 Model output versus data uncertainty
      2.2.7 Multiple objective analysis
   2.3 Sampling and global optimisation techniques
      2.3.1 Monte Carlo simulation
      2.3.2 Metropolis algorithm
      2.3.3 Genetic algorithms
   2.4 Example of calibration
      2.4.1 The model and data
      2.4.2 Sampling the data error distribution
      2.4.3 GLUE using a likelihood function as an objective function
      2.4.4 Metropolis using weighted squared errors as an objective function
      2.4.5 GLUE using a subjective GLUE likelihood as an objective function
      2.4.6 HSY using a possibilistic objective function
      2.4.7 Summary of this demonstration of calibration
   2.5 Uncertainty propagation
      2.5.1 Monte Carlo methods
      2.5.2 First order and point estimate approximations
      2.5.3 Possibility theory
   2.6 Propagation of the Streeter-Phelps model parameters
   2.7 Summary
3. An overview of river water quality modelling theory and commonly used modelling tools
   3.1 Introduction
   3.2 Developments
   3.3 The components of a river water quality model
      3.3.1 Hydraulic and routing models
      3.3.2 Solute transport models
      3.3.3 Thermodynamics
      3.3.4 Water quality processes
         3.3.4.1 Carbon and dissolved oxygen
         3.3.4.2 Photosynthesis
         3.3.4.3 Nitrogen and phosphorus cycles
         3.3.4.4 Organic toxins and oils
         3.3.4.5 Suspended solids
   3.4 Summary
4. Water quality risk analysis tool (WaterRAT)
   4.1 Introduction
   4.2 The concept and structure of WaterRAT
   4.3 Spatial and temporal resolution
   4.4 Boundary conditions, initial conditions and model parameters
   4.5 Calibration and optimisation
   4.6 Multi-objective analysis
   4.7 Sensitivity analysis
   4.8 Prediction uncertainty
   4.9 Output
   4.10 WaterRAT review
      4.10.1 General limitations
      4.10.2 Critical comparison with alternative modelling tools
   4.11 Summary
5. Numerical efficiency in Monte Carlo simulations – a case study of a river thermodynamic model
   5.1 Introduction
   5.2 The thermodynamic model
   5.3 Monte Carlo simulation
   5.4 Numerical methods
   5.5 Results
   5.6 Discussion
   5.7 Summary
6. Identification of a phosphorus mobilisation model: prior evaluation of data needs
   6.1 Introduction
   6.2 Model Description
   6.3 The data
   6.4 The calibration, prediction and performance evaluation procedures
   6.5 Results, discussion and supplementary experiments
   6.6 Summary
7. Risk-based modelling of surface water quality: a case study of the Charles River, Massachusetts
   7.1 Introduction
      7.1.1 Motivation
      7.1.2 Scope and objectives
      7.1.3 The case study
   7.2 Model Structure and Methods
      7.2.1 Specification of the model structure
      7.2.2 Specification of prior parameters and uncertainty in observed data
      7.2.3 Multi-objective model conditioning
      7.2.4 Graphical model evaluation
      7.2.5 Regional sensitivity analysis
      7.2.6 Risk-based appraisal of intervention strategies
   7.3 Results and discussion
      7.3.1 Preliminary model evaluation
      7.3.2 Sensitivity analysis (1996 conditions)
      7.3.3 Sensitivity analysis (eutrophication reduction)
      7.3.4 Appraisal of intervention strategies
   7.4 Summary
8. Conclusions
   8.1 Summary
   8.2 Review of GLUE
   8.3 Prior and posterior knowledge
   8.4 Monte Carlo sampling
   8.5 Review of WaterRAT
   8.6 Review of Chapter 5: numerical issues
   8.7 Review of Chapter 6: prior identification of data needs and assessment of model capability
   8.8 Review of Chapter 7: a framework for model conditioning, sensitivity and risk analysis
   8.9 The Hun River case study
   8.10 A look to the future
References
Notation
1. Introduction
The aim of this Thesis is to assert a philosophy for river quality modelling, deliver and
demonstrate associated software, and set an agenda for future research and improved
modelling practice. This aim is pursued through eight chapters which 1) formulate the
motivation for the research, 2) describe the tools and software, 3) explore the key
modelling issues using case studies and a set of numerical analyses, and 4) critically
discuss the work’s significance. The focus is on modelling problems that are supported by
typically sparse data sets, and this is reflected in the chosen case studies. The dissertation
is written with a view that the reader has limited prior knowledge of numerical modelling
and statistical methods.
In this first chapter, the case is presented for increased attention to the evaluation of
uncertainty in water quality modelling practice. A framework for the modelling of water
quality is outlined and presented as a potentially valuable component of broader risk
assessment methodologies. Technical considerations for the successful implementation of
the modelling framework are introduced. The primary arguments presented are as
follows: 1) For a large number of practical applications, deterministic use of complex
water quality models is not well supported by the available data and/or human resources,
and is not warranted by the limited information contained in the results. Modelling tools
should be flexible enough to be employed at levels of complexity which suit the
modelling task, the data and the available resources. 2) Monte Carlo simulation has
largely untapped potential for evaluation of model performance, estimation of model
uncertainty and identification of factors (including pollution sources, environmental
influences, and ill-defined objectives) contributing to the risk of failing water quality
objectives. 3) For practical application of Monte Carlo methods, attention needs to be
given to numerical efficiency, and for successful communication of results, effective
interfaces are required. This chapter finishes with an introduction to the case studies used
in later chapters, and with an overview of the content of the dissertation.
1.1 Introduction to analysis of uncertainty in surface water quality
modelling
1.1.1 Motivation
In the European Community, the recently introduced Water Framework Directive (CEC
2000) requires that member states formulate River Basin Management Plans which
identify objectives for achieving good water quality status on a catchment-wide basis.
Similar standards apply in much of the world: for example, catchment management in the
United States has been guided by the Environmental Protection Agency’s Water Quality
Criteria and Standards Plan (US EPA 1998), in Australia by the National Water Quality
Management Strategy (DAFF 2000) and in China by the Environmental Quality Standard
for Surface Water (PRC SEPA 1999). Simulation models are a central part of these basin
management plans because they can apply best available scientific knowledge,
conditioned by historical evidence, to predict water quality responses to changing
controls. For example, the development of the integrated catchment model BASINS is an
explicit part of basin management plans in the United States (US EPA 1998, 1999), and
the UK government has recently commissioned new tools for diffuse source modelling
(DEFRA 2002: p76).
The new high expectations for the aquatic environment, incorporated into the current
wave of directives and regulations, are prompting additional complexity with regard to
modelling spatial variability, micro-pollutants and ecological indicators (Somlyody et al.
1998, Thomann 1998). Facilitated by improved computational resources, there is a trend
for spatial discretisation to be increased, multi-media and multi-constituent models to be
developed (e.g. Havnø et al. 1995, Cole and Wells 2000, Neitsch et al. 2002), and for
traditional water quality determinants to be broken down into constituent species (Chapra
1999, Shanahan et al. 2001). As a consequence, the typical number of modelled
components has risen exponentially in recent years, and this growth is expected to
continue (Thomann 1998).
Despite the increasing expectations placed upon water quality models, contemporary
deterministic models, when audited, frequently fail to predict the most local and basic
biological indicators with a reasonable degree of precision (e.g. Jorgensen et al. 1986).
Even when models are claimed to be ‘reliable’ following audits, a very significant margin
of error is allowed (e.g. Hartigan et al. 1983). The application of modelling to the new era
of high ecological standards presents severe challenges, especially given that our
modelling experience is with relatively stressed ecological systems (Beck 1997, Shanahan
et al. 1998), and that the economic implications of model errors may be relatively serious
(Chapra 1997). While additional model complexity might be expected to improve the
precision of model results, this expectation has proven unfounded in a variety of studies (e.g.
Gardner et al. 1980, Van der Perk 1997, Lees et al. 2000, also see Young et al. 1996).
Furthermore, future driving forces such as climate (Hulme et al. 2002) and distributed
pollution sources (Shepherd et al. 1999) are poorly defined and themselves cannot be
modelled with much precision. Clearly, identification of suitable water quality policy
must take account of the uncertainties associated with both the validity of the models and
the driving forces. However, as increased model complexity hinders the formal evaluation
of uncertainty, due to the large number of uncertain model components to be
simultaneously analysed, there is a danger that our ability to evaluate uncertainty will
decrease.
The challenges facing water quality modellers should be contemplated in the wider
perspective of risk-based decision support. Firstly, a high degree of model uncertainty is
not necessarily an undesirable outcome, and undoubtedly is preferable to no indication of
reliability at all. Secondly, uncertainty in environmental models should be viewed as a
source of risk, as is traditional in other fields of engineering (e.g. Tung 1996), and should
be used to establish and achieve an acceptable failure probability in terms of water quality
status, rather than be used to decry the modelling approach (Beven 2000a). Given that
risk is a concept that can be used to integrate external criteria such as economics and
safety, as well as integrating the model result over the relevant model responses,
expressing results as risk is potentially attractive and seems inevitable. Thirdly, it is worth
noting that, in the context of decision-support, we are not justified in investing resources
in modelling (including the identification of prediction uncertainty) unless this will be
instrumental to the decisions that need to be made (Beven 1993). Therefore, we should
keep sight of the modelling task, and accept that (very) approximate solutions may be
appropriate.
To allow intelligent use of complex simulation models, and to allow informed
interpretation and application of model predictions, it is essential that a new generation of
tools is developed and disseminated. These should be directed at evaluation of model
uncertainty, as well as its minimisation, with respect to the modelling tasks. For results to
be justified and interpreted properly, methods used for uncertainty analysis must be
theoretically or intuitively well-founded, and transparent to the modeller. For methods to
be practical for day to day use, they should be relatively easy and fast to implement.
These requirements are challenges which will be addressed in the rest of this dissertation,
beginning in this chapter with a review of the factors contributing to uncertainty, a brief
review of current practices in the water quality community, and a proposed outline of a
framework for risk-based water quality modelling. A tool for modelling river and lake
water quality where supporting resources are restricted is then introduced.
1.1.2 Causes of uncertainty
Uncertainty in a water quality simulation model is inevitable due to the difficulty of
identifying a single model (including grid-scale, process formulations and parameter
values) which can accurately represent the water quality under all required model tasks
(see the discussions of Beck 1987, Van Straten 1998, and Adams and Reckhow 2001).
Although we have extensive knowledge about water quality processes from laboratory
experiments, extrapolation of this knowledge to models of the real environment has
consistently proven to be difficult. This is partly because the modelling scale is different
to the laboratory scale, and the diversity of species and heterogeneity found in natural
environments must (to some degree) be modelled approximately using lumped state
variables. This means that formulations and parameter values identified at laboratory
scale can only be used as a starting point for model design, rather than as a definitive end
result. Nor is there yet any basis for regionalisation of water quality models. Therefore,
models identified for one case study cannot be used with any confidence for another.
The literature describing established formulations and parameter values (Bowie et al.
1985, Thomann and Mueller 1987, Chapra 1997) is evidence of the wide range of models
which are equally justified prior to observing a system’s behaviour in detail, and of the
extremely large uncertainty associated with modelling water quality on the basis of prior
knowledge.
Given that it is desirable to evaluate the performance of models with respect to observed
water quality data, the accuracy, frequency and relevance of the available data dictates the
attainable degree of certainty in the model. Unfortunately, water quality data can be
expensive to collect and analyse, often requiring special handling and analysis in
laboratories. This means that data to support model identification are generally sparse,
often coming from sampling programmes which are fixed in frequency and location for
regulation purposes, rather than designed to capture the system’s dynamic responses as
required for successful model identification (Berthouex and Brown 1994). Also, water
quality data are susceptible to noise and bias due to sampling, handling and measurement
procedures (see Keith 1990). In addition, information about model boundary conditions,
such as sources of pollution, often suffers from the same shortcomings, especially for
distributed variables which are difficult to measure (pollution runoff, sediment quality,
etc). In summary, lack of good quality data to support model identification is a major
cause of model uncertainty.
Closely related to the issue of data quality is model equifinality, whereby different
models appear equally justified at the model design stage, but may give widely different
realisations of the future. Equifinality is caused by interactions between model
parameters, and by the near-equivalence of different model structures at the stage of
model identification. This means that the same (or effectively the same within the context
of the data errors) response can be achieved using different models. Clearly, the problem
magnifies as both the number of interacting parameters increases, and as the precision of
the data decreases. The use of parsimonious models, i.e. models which only include
parameters which can be uniquely identified from the data, is one approach to avoiding
equifinality. A parsimonious model implies that model components that are inactive
during model identification are left out, and that strongly interacting components are
combined into one (Young et al. 1996, Wagener et al. 2001). The inevitable omission of
model components which are potentially relevant means that parsimonious models may
seriously underestimate the uncertainty in model forecasts (Reichert and Omlin 1996).
When the aim of the modelling is to investigate risks associated with proposed water
quality interventions or other disturbances, it is essential that the uncertainty arising from
previously unobserved behaviour is adequately allowed for, and so parsimonious models
may be inappropriate. Thus, it may be said that identifying a single optimal model may
not be a justifiable approach.
The problem of equifinality and uncertainty in modelling environmental systems is
inevitable and model predictions based on a single ‘optimal’ model will, in general, be
rather arbitrary, and of very limited value. For this reason, a number of investigators have
devoted their attention to rationalising the modelling problem, and redefining it as
essentially stochastic whereby a population of feasible models (and by implication, a
population of model predictions) are identified (e.g. Hornberger and Spear 1980, Van
Straten and Keesman 1991, Beven and Binley 1992, Reichert and Omlin 1996, Yapo et
al. 1998, Melching and Bauwens 2001).
1.1.3 Analysis of uncertainty
Identification of a population of feasible models can include both identification of
alternative model structures (grid-scales and process formulations) and corresponding
parameter distributions. Model structures should be of a complexity consistent with the
difficulty and scale of the modelling task, and the supporting information and resources.
They should be consistent with prior knowledge of how best to represent system
processes at the scale and complexity in question. Given adequate supporting data, they
can be assessed and amended using various identification techniques (e.g. Beck 1983,
Qian 1997, Wagener et al. 2002a,b).
If one structure can be demonstrated as the most suitable for a particular modelling task
(that is, for the particular system, and the particular information which the modeller aims
to retrieve) then it would be reasonable to use this structure exclusively. On the other
hand, if there are justified alternatives then ideally, from an analytical point of view, the
implications of these also should be considered (e.g. Gardner et al. 1980, van der Perk
1997). This raises two issues. Firstly, it may be that no structures can be identified as
‘suitable’. Then (as will be expanded upon in Chapter 2) either an improved structure
should be developed, or the stringency of the model assessment should be reviewed and
the parameter uncertainty increased. Secondly, analysis of more than one structure may
not be feasible given the available resources - such analysis will be costly, perhaps
requiring purchase of additional software. Even using tools which offer some flexibility
in the choice of water quality model structure, such as DESERT (Ivanov et al. 1996) or
RWQM1 (Vanrolleghem et al. 2001) exploring candidate structures can significantly add
to the burden on human and computer resources. In such a case (and this tends to be the
case) all the significant model uncertainty must be represented, as far as possible, as
parameter uncertainty within a single suitable structure. From a mathematical point of
view, this has implications for the reliability of predictions (Draper 1995), but in a
management context it is justifiable if it has relatively little bearing on the decisions being
supported. In summary, investigating the sensitivity of decisions to different model
structures is commendable, but may be neither viable due to resource constraints, nor
worthwhile due to over-riding uncertainty in boundary conditions and parameter values.
Given a model structure, the identification of feasible sets of parameter values can be
approached by conditioning (constraining) the prior population of parameter sets so that a
specified modelling objective is better achieved. The modelling objective at this stage is
generally to simulate observed data, and is generally expressed objectively as a function
of the model residuals (the distances between the model result and the observed data). In
traditional deterministic modelling, the response of this objective function (OF) to
changes in the model parameters is used to estimate an optimum set of model parameters.
This is achieved by manual perturbations of the parameters or, more suitably for complex
models, by automatic algorithms. For uncertainty analysis, a joint distribution of
parameters is identified rather than a single optimum, by recording the response of the OF
across the parameter space. Depending partly on the algorithm which has been used, this
joint distribution may be represented as a variance-covariance matrix, or as a discrete
distribution (point estimates of probability mass over the parameter space), or as a
population of feasible parameter sets. Methods of parametric uncertainty analysis for
environmental simulation modelling are reviewed in Chapter 2.
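To make this concrete, the following minimal Python sketch (illustrative only: the observations, prior ranges and boundary load are hypothetical, and the classical Streeter-Phelps oxygen-deficit formula stands in for a full water quality model) samples a two-parameter space, records the OF response, and retains a population of feasible parameter sets.

```python
import numpy as np

rng = np.random.default_rng(42)

def streeter_phelps(t, kd, ka, L0, D0=0.0):
    """Classical Streeter-Phelps dissolved oxygen deficit (mg/L) at time t (days)."""
    return (kd * L0 / (ka - kd)) * (np.exp(-kd * t) - np.exp(-ka * t)) \
        + D0 * np.exp(-ka * t)

# Hypothetical observed deficits at five sampling times (days, mg/L).
t_obs = np.array([0.5, 1.0, 2.0, 3.0, 5.0])
d_obs = np.array([2.3, 3.5, 4.1, 3.6, 2.1])

# Uniform prior ranges for the two uncertain rate parameters (1/day),
# chosen so that ka > kd and the denominator above cannot vanish.
n = 10_000
kd = rng.uniform(0.1, 0.5, n)
ka = rng.uniform(0.6, 2.0, n)

# Record the objective function (sum of squared residuals) across the
# sampled parameter space.
of = np.array([np.sum((streeter_phelps(t_obs, kd[i], ka[i], L0=15.0) - d_obs) ** 2)
               for i in range(n)])

# One representation of the joint distribution: the population of parameter
# sets whose OF falls below a (subjective) behavioural threshold, here the
# best 10% of samples.
feasible = of <= np.quantile(of, 0.10)
print(f"{feasible.sum()} feasible parameter sets out of {n}")
```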
Selecting an objective function to use for the conditioning of an environmental model is a
difficult issue which involves a degree of speculation and subjectivity. This is because
statistically-based identification of the parameter uncertainty requires knowledge of the
combined error structure of the model, the data and the boundary conditions. However,
especially when data are sparse or unreliable and the model structure is complex, there is
little or no theoretical basis for estimation of the error structure (see Chapter 2). While
parameter conditioning is often based on statistical likelihood functions (e.g. van Straten
1983), the result is dependent on the simplifying assumptions made about the error
structure. As well as being difficult to justify from prior information, such assumptions
can lead to significant mis-representation of model uncertainty (Beven et al. 2001), in
which case the model will fail to adequately explain the real system. In particular, the
common assumption that the model and/or data are unbiased can lead to a serious
underestimation of parameter and prediction uncertainties (Chapters 2 and 6).
As an alternative to statistical measures, the conditioning of the model can be based on
subjectively derived rules, for example, “if the parameter set returns a model result that is
highly consistent with my belief of true system behaviour then I will associate a relatively
high weighting”, or some objective expression of this, for example, “the relative
probability of each parameter set will be equal to the proportion of the variance of the
observed data explained by the model”. Given that it is subjectively based, such an
approach allows some freedom in achieving a satisfactory description of uncertainty,
without the encumbrance of statistical rules and the long list of associated simplifying
assumptions. Such conditioning of an environmental model, with the OF transformed to a
probability without necessarily being related objectively to the error structure, was
promoted by Beven and Binley (1992) in the context of their Generalised Likelihood
Uncertainty Estimation.
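As a minimal illustration of the second quoted rule, the earlier Streeter-Phelps sketch can be continued (reusing its arrays `of` and `d_obs`) by transforming the OF into a subjective GLUE likelihood equal to the proportion of observed variance explained:

```python
# Continuing the earlier sketch: subjective GLUE likelihood equal to the
# proportion of observed variance explained by each parameter set (negative
# values truncated to zero), normalised into relative probabilities. This
# assumes at least some parameter sets explain part of the variance.
ss_tot = np.sum((d_obs - d_obs.mean()) ** 2)         # total sum of squares
likelihood = np.clip(1.0 - of / ss_tot, 0.0, None)   # 'variance explained'
weights = likelihood / likelihood.sum()
```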
Once the uncertainty in the model is estimated, it can be propagated to give predictions.
Methods of uncertainty propagation which are relevant to simulation modelling can be
classified as variance propagation methods, point estimate methods, and Monte Carlo
methods. Tung (1996) gives an overview of these methods, and a review and
demonstration is included in the next chapter. The choice of method partly depends on the
description of the parameter uncertainty, and partly on the computational resources, with
the Monte Carlo methods generally (but not always) being more reliable and
computationally demanding.
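Continuing the same illustrative sketch, Monte Carlo propagation amounts to running every sampled parameter set forward to an unobserved condition and summarising the weighted ensemble of predictions:

```python
# Weighted 90% uncertainty band for the deficit at an unobserved time
# (t = 4 days), using the GLUE weights computed above.
pred = streeter_phelps(4.0, kd, ka, L0=15.0)
order = np.argsort(pred)
cdf = np.cumsum(weights[order])
lower = pred[order][np.searchsorted(cdf, 0.05)]
upper = pred[order][np.searchsorted(cdf, 0.95)]
print(f"90% band at t = 4 d: {lower:.2f} to {upper:.2f} mg/L")
```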
1.1.4 Surface water quality modelling applications
There is a variety of literature promoting understanding and application of uncertainty
analysis in surface water quality modelling (e.g. Beck 1983, Beck 1987, Reckhow 1994,
Adams and Reckhow 2001). However, the application of uncertainty analysis to surface
water quality modelling seems to be relatively scarce, especially in practical decision-
support.
In the most widely used river water quality models, formal investigation of model
uncertainty is very rare. Uncertainty identification in many contemporary models such as
WASP5 (Ambrose et al. 1993), MIKE11 (Havnø et al. 1995), CE-QUAL-W2 (Cole and
Wells 2000) and ISIS (Wallingford Software 2002) is difficult because they are relatively
complex, and often linked to computationally intensive hydrodynamic, among other,
modules. Although these models are well founded in theory and well established in
practice (see Ambrose et al. 1996), their usefulness is arguably limited by their high
demand on resources, and the unknown uncertainty in their predictions. The large number
of decision-support applications of these models which do not include analysis of
uncertainty (amongst many others, Gunduz et al. 1998 and Warwick et al. 1999) is
evidence of this. It is reasonable to assume that unpublished commercial applications of
such models also under-represent the significance of uncertainty.
The popular modelling tool QUAL2E-UNCAS (Brown and Barnwell 1987), which is a
river modelling component of the US EPA’s BASINS tool, has a built-in uncertainty
analysis option. Reckhow (1994) recognises QUAL2E-UNCAS as an especially useful
development, not only because it allows formal uncertainty analysis, but the associated
documentation promotes uncertainty analysis amongst a large body of decision-makers.
QUAL2E-UNCAS relies on estimation of prediction uncertainty through specification of
feasible parameter and boundary condition ranges, and does not include a tool for
conditioning the input uncertainties on observed data. Nor does the model allow
covariance of inputs to be considered, meaning that uncertainty may be significantly over-
or under-estimated (Reckhow 1994, Brown 2002).
Further to his commentary on QUAL2E-UNCAS, Reckhow (1994) notes that regulators
in the USA tend to favour relatively simple water quality models, as complex models are
too demanding on human resources, in addition to their high data demands. The UK
Environment Agency have developed the relatively simple steady-state SIMCAT model
to support regulation of river water quality (UK Environment Agency 2001a). SIMCAT
is based on the recognition that model prediction uncertainties stem mainly from
limitations in the calibration and pollution load data, rather than from the assumptions
implicit to the model equations. SIMCAT was arguably a major step forward in the
practice of river water quality modelling, in that parameter uncertainty can be identified
from data sampling error by optimising the model parameters against different
realisations of the data. As the model formulations used in SIMCAT are simple and easily
solved, it is practical to use the computationally intensive sampling method. At the same
time, the simplicity of the model structure makes the model less suitable for some tasks,
such as extrapolation to changed boundary conditions, or simulation of dynamic events,
when the effects of model structural error are more likely to be significant.
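The following Python sketch illustrates the general idea described above, not SIMCAT's actual implementation: a simple first-order decay model with hypothetical data and an assumed sampling error is re-optimised against many realisations of the calibration data, and the spread of the optima is taken as the parameter uncertainty arising from data sampling error.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
x = np.array([0.0, 5.0, 10.0, 20.0])        # distance downstream (km)
c_obs = np.array([8.0, 5.8, 4.2, 2.2])      # observed concentration (mg/L)
sigma = 0.4                                  # assumed sampling error (mg/L)

optima = []
for _ in range(500):
    # One realisation of the data, perturbed by the assumed sampling error.
    c_real = c_obs + rng.normal(0.0, sigma, c_obs.size)
    # Re-optimise the first-order decay rate against this realisation.
    sse = lambda k: np.sum((c_real[0] * np.exp(-k * x) - c_real) ** 2)
    optima.append(minimize_scalar(sse, bounds=(0.01, 1.0), method="bounded").x)

print(f"decay rate k: mean {np.mean(optima):.3f}, "
      f"std {np.std(optima):.3f} per km")
```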
The decision-support role of relatively simple models coupled with uncertainty analysis is
evident from the continuing practices of both the UK and US environmental regulators.
This contrasts with the popularity of complex, resource-intensive models such as
WASP5, MIKE11 and CE-QUAL. Accepting that both modelling approaches may have a
role depending on the degree of detail sought and the resources available, there is
arguably a benefit in providing tools that include a hierarchy of models. Supplementing
this with uncertainty analysis facilities allows the limitations of both approaches to be
evaluated for specific modelling tasks.
DESERT (Ivanov et al. 1996, also see Somlyody 1997) is a tool for catchment
management optimisation which provides a framework in which the user can design his
own one-dimensional river water quality model. DESERT allows parameter conditioning,
although the effect of parameter interactions cannot be included in application of the
conditioned model. Based on dynamic programming, DESERT identifies all the sets of
model inputs which conform to a series of constraints, which can include cost constraints
for pollution control interventions, as well as in-river water quality criteria. In these
respects, DESERT has the capacity for uncertainty analysis and flexibility of model
design which will be needed for future water quality management problems, and is a
valuable precedent for future developments.
1.2 Introducing a framework for risk-based surface water quality
modelling
Following this introduction to the driving forces behind water quality modelling, the inherent
problems in this discipline, and previously proposed directions for addressing these
problems, an outline of a modelling framework (which will be developed in Chapter 4) is
now proposed and some desirable facets of a potential modelling tool are discussed.
Beforehand, it is worth reviewing the significance of the term ‘risk’ in the context of
surface water quality modelling.
1.2.1 Risk in context
In the present context, risk may be usefully defined as “a combined measure of the degree
of detriment to society or the aquatic ecosystem caused by a defined event (or
combination of events), and the probability of that event occurring”. Traditionally, in
surface water quality management, the degree of detriment is simplified to a series of
pass-fail criteria, each criterion representing a class of water quality (e.g. UK
Environment Agency 1998). Risk can then be evaluated as probability of failure to
achieve the target class. Modelling, then, has at least two potentially valuable roles – to
extrapolate point measurements of water quality so that spatial and temporal criteria can
be used in water quality classification rather than discrete, localised measurements of
concentration; and to predict the response of risk to changing controls, to allow objective
risk management.
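As a minimal numerical illustration of this pass-fail formulation (the numbers are hypothetical, not drawn from any case study), the modelled risk is the total probability weight of Monte Carlo realisations that breach the criterion:

```python
import numpy as np

rng = np.random.default_rng(1)
do_sim = rng.normal(6.0, 1.2, 5000)                 # simulated DO (mg/L) at a
weights = np.full(do_sim.size, 1.0 / do_sim.size)   # compliance point; equal weights

# Pass-fail criterion for the target class: DO must not fall below 5 mg/L.
risk = weights[do_sim < 5.0].sum()
print(f"Modelled probability of failing the target class: {risk:.2f}")
```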
This brief introduction to the role of modelling in risk-based water quality management
raises a few issues. Firstly, it is important to differentiate between the frequency of failure
that will actually occur due to system variability, and the modelled probability of failure,
which includes (or should include) the influence of the uncertainty in the model and in the
estimates of future boundary conditions. That is, there is a risk that any water quality
intervention will fail to achieve its objectives due to the limitations of the modelling
employed at planning stage. Consequently, where a modelling study implies a
management option to be high-risk, this may be mainly due to the limited information and
resources available for model and boundary condition identification, and a clear
management priority would be to invest in more research. Also, there may be
considerable risk associated with ill-defined objectives – that is, a water quality
intervention may fail to be successful because at the time of planning the objectives were
under-researched or impossible to clearly define. For example, while it is reasonable to
suggest that there will be lengthy debate over local and regional definitions of ‘good
ecological status’ (Definition 22 in CEC 2000), the planning required to achieve such
questionable status is already underway (e.g. UK Environment Agency 2001b). Finally
on the point of associating risk, it is useful to distinguish between the risk stemming from
anthropogenic system variabilities (for example diurnal variations in effluents) which are
generally manageable, and risk stemming from ‘natural’ system variabilities (for example
those due to meteorological influences) which are less manageable. In particular, if the
risk of failure is predominantly due to unmanageable natural processes then reviewing the
targets would be a logical way forward. With the capability of exploring reasons for risk,
modelling has an essential role not only in appraising pollution intervention options, but
also in identifying sensible precursors to intervention.
1.2.2 A framework outline
Figure 1.1 outlines a general framework for risk-based modelling of water quality that
will be further justified, developed, demonstrated and reviewed in the course of this
dissertation. Using such a framework it is intended that water quality managers have
access to risk-based evaluation of surface water quality, and be able to respond to and
develop this evaluation by:
• Identification of the principal factors affecting risk to water quality status.
• Evaluation of risk associated with alternative pollution control strategies,
potentially with integration of external criteria, such as social and economic costs
of water quality improvements.
• Consideration of alternative modelling criteria, in terms of identifying feasible
water quality targets, and identifying acceptable compromises between non-
commensurate criteria (e.g. between water quality status and need for water
abstractions).
• Consideration of different models for forecasting water quality response to
pollution interventions (to reduce and evaluate risk associated with model
structure uncertainty).
• Establishing priorities for collecting more data with which to improve model
identification (reducing risk associated with data uncertainty).
Using modelling in this manner is consistent with more general risk assessment
guidelines and frameworks used by environmental regulators. For example, UK
environmental regulators (DETR et al. 2000) encourage proactive risk management using
a tiered framework of quantitative risk assessments, whereby models, monitoring, and
management options are reviewed as the analysis moves from risk screening to the
advanced stages. This includes analysis of how the different sources of uncertainty
contribute to the final risk estimate, and review of costs and benefits. Such a tiered
approach to risk assessment has been recommended for implementing the requirements of
the Water Framework Directive (UK Environment Agency 2002). In applying this
general risk assessment framework to management of water quality and aquatic ecology,
there is clearly scope for iterative, model-based risk analyses, such as that promoted by
Figure 1.1.
1.2.3 Technical considerations
In pursuit of a practical modelling tool that provides such a capacity for risk evaluation,
the following tool features are considered essential:
1. Accessibility (ease of use), flexibility and extensibility (to cover a range of
modelling tasks).
2. Efficiency of numerical techniques (to achieve the maximum benefit from Monte
Carlo simulation).
3. Sensitivity analysis and risk evaluation capabilities.
Although these three stipulations are common goals in the design and development
of modelling tools in general, there are important implications in the water quality
modelling context which deserve further discussion.
The need for accessibility, flexibility and extensibility
Accessibility of results is an important issue, as major management decisions usually
must be supported using visually insightful reports, hence the benefit of an adequate
interface for the graphical reporting of results. The value of advanced modelling
techniques, for example Monte Carlo simulation, should not be diminished by perceptions
that they are not transparent to decision-makers and stakeholders; effective interfaces may
go a long way towards avoiding or resolving this concern. Furthermore, investigation of a variety of
potential sources of risk, possibly including a large number of pollution sources and other
system characteristics, requires careful attention to the thoroughness of model input
specification. This draws attention to the value of an effective interface for model
specification and data input.
[Figure 1.1 (diagram) shows an iterative loop: the modelling task, monitoring data, external considerations, and pollution load and regulation scenarios feed into specification of model structure, grid scale and prior parameter ranges; followed by model conditioning, sensitivity analysis and model evaluation; then prediction and further sensitivity analysis; and finally risk evaluation.]
Figure 1.1 A framework for risk-based modelling of water quality
The requirement for flexibility is applicable to a number of aspects of a risk-based water
quality modelling tool. Firstly, unavoidable subjectivity in estimating model uncertainty
means that some choice of estimator should be provided, as illustrated in studies by
Freer et al. (1996) and Franks and Beven (1997). Application of multi-objective
optimisation and sensitivity analysis (e.g. Bastidas et al. 1999) also requires flexibility in
specification of model performance criteria. Central to the modelling procedure illustrated
in Figure 1.1 is the capacity to explore different model structures, depending on the
modelling task, data and computational resources available. If the model uncertainty is to
be adequately represented by parameter uncertainty, the modeller should have the
opportunity to identify a model structure which best allows this. In particular, the
modelling grid scale (the spatial and temporal resolution of the model) must be selected
according to the water quality problem. Uncertainty introduced by spatial and temporal
aggregations should be explored. Extensibility is essential so that new model structures
and water quality determinands can be incorporated, and so that the tool can be linked to
new databases and other conjunctive software. In particular, as the directives driving
water quality modelling promote integrated catchment management, and as the challenge
of diffuse pollution management gathers pace, the increased use of Geographical
Information Systems (GIS) as interfaces and platforms for water quality models is
inevitable, and this might be borne in mind at the development stage, whatever the
immediate modelling applications.
The need for numerical efficiency
Monte Carlo simulation provides us with the capability to retrieve a large amount of
information about the sensitivity of model results to model inputs, which is extremely
advantageous given the current limitations in the practice of water quality modelling.
Although computational costs continue to diminish, the value of a Monte Carlo
simulation will always depend on how well the continuum of possible model
inputs/outputs is represented by a finite number of realisations. This would be especially
relevant, for example, in catchment-scale distributed GIS-based modelling, due to the
large amount of computation involved as well as the large number of spatially distributed
model inputs which may be included in the analysis. There is therefore a need to either
maximise the number of realisations achievable at a given computational cost, for
example by implementing efficient numerical solvers and specifying numerical tolerances
that are consistent with the overall reliability of the analysis, or to reduce the number of
realisations required for an adequate representation by using variance reduction
techniques (Cochran 1977). For example, a variance reduction technique which has been
found useful in water quality modelling applications is Latin hypercube sampling (LHS;
McKay et al. 1979). LHS is, in the current context, designed to thoroughly sample the
univariate distribution of each model input while leaving the sampling of interactions to
chance. While some water quality modellers (e.g. Melching and Bauwens 2001) have
successfully employed LHS to enormously improve the efficiency of sensitivity analysis,
Press et al. (1988) note “if there is an important interaction between the design
parameters, then Latin hypercube sampling gives no particular advantage (over simple
random sampling)”.
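A minimal sketch of LHS as characterised here, assuming inputs are sampled on [0, 1) and mapped to their actual ranges afterwards:

```python
import numpy as np

def latin_hypercube(n_samples, n_inputs, rng):
    # One uniform draw inside each of n_samples equal-probability strata,
    # per input...
    u = (np.arange(n_samples)[:, None]
         + rng.random((n_samples, n_inputs))) / n_samples
    # ...then shuffle the strata independently for each input, so that
    # interactions between inputs are left to chance.
    for j in range(n_inputs):
        u[:, j] = u[rng.permutation(n_samples), j]
    return u

rng = np.random.default_rng(7)
samples = latin_hypercube(100, 3, rng)       # 100 samples of 3 model inputs
ka = 0.2 + samples[:, 0] * (2.0 - 0.2)       # e.g. a rate in [0.2, 2.0] 1/day
```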
Notwithstanding the merits of efficient sampling and solution schemes, more fundamental
precursors to successful Monte Carlo analysis are: 1) appropriate limitation of model
complexity, and 2) minimisation of the number of inputs to be sampled. Again, this draws
attention to the need to match the model complexity to the specific modelling task, and
the need to provide tools that offer some flexibility in model structure choice.
The need for sensitivity analysis and risk evaluation capabilities
Monte Carlo-based approaches to sensitivity analysis such as those implemented by
Hornberger and Spear (1980), Beven and Binley (1992) and Kuczera and Parent (1998)
have found wide application in environmental modelling, including a limited number of
applications to surface water quality modelling, as reviewed in Chapter 2. Incorporation
of these methods into water quality modelling tools is an essential part of implementing
the framework outlined in Figure 1.1. Firstly, they allow evaluation of the suitability of a
model, in terms of reviewing the ability of the model and the associated parameter
uncertainty to explain observed data. Thereafter, uncertainty in model forecasts can be
estimated (e.g. Van Straten and Keesman 1991), avoiding the need for unqualified ‘best
estimate’ forecasts. Monte Carlo methods not only have the potential to produce summary
statistics of model sensitivities (e.g. Spear and Hornberger 1980, Wade et al. 2001), but
can be used to evaluate risk to water quality status due to individual pollution sources and
system properties, and can be extended to incorporate uncertainties in water quality
criteria (see Chapter 7). Such evaluation has clear potential for risk-based decision
making, particularly under conditions where data for identification of model and
boundary conditions are limited. It also has the potential to be extended to simulating
ecological risks, including spatial and temporal exposure as well as probability of
occurrence.
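In Monte Carlo terms, such a risk evaluation reduces to the (probability-weighted) proportion of realisations in which the simulated water quality breaches the criterion. A minimal sketch, with illustrative names and an assumed dissolved oxygen standard:

import numpy as np

def risk_of_failure(model_outputs, target, weights=None):
    # Risk estimated as the (optionally weighted) fraction of Monte Carlo
    # realisations in which the simulated quality breaches the target.
    # weights: posterior probability of each realisation (uniform if None).
    model_outputs = np.asarray(model_outputs)
    fails = model_outputs < target          # e.g. DO below the standard
    if weights is None:
        return fails.mean()
    weights = np.asarray(weights) / np.sum(weights)
    return float(np.sum(weights[fails]))

# Example: 5000 simulated dissolved oxygen minima, standard of 5 mgO/l
do_min = np.random.default_rng(1).normal(6.0, 1.2, 5000)
print(risk_of_failure(do_min, target=5.0))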
Emphasis has been put on the value of Monte Carlo simulation because it is a relatively
straightforward way of analysing how water quality objective functions respond over all
feasible combinations of model inputs. This can be supplemented by alternative,
computationally less demanding techniques of sensitivity analysis and uncertainty
propagation. Using first order sensitivity analysis, the effect on a model response of
perturbing each input variable around a specified value, while keeping the values of all
other inputs fixed, is calculated. This has the advantage of allowing for the response
components to be associated with individual inputs in a simple manner (e.g. Melching
and Bauwens 2001). However, the interactions between inputs are not explored and non-
linear responses are not estimated, so there is very restricted scope for exploring response
surfaces, and effects (for example on risk) of low-probability values of model inputs are
likely to be misrepresented. Also, the result will generally be dependent on the value
around which the input is perturbed, as well as on the fixed values of all the other inputs,
which may be quite arbitrary given the problem of model equifinality.
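For reference, first order (one-at-a-time) sensitivity analysis can be sketched as a central-difference derivative about a nominal point. The listing below is illustrative only and, as just discussed, yields a purely local measure that depends on the chosen nominal values:

import numpy as np

def first_order_sensitivity(model, x0, rel_step=0.01):
    # Central-difference sensitivity of a scalar model response to each
    # input, with all other inputs held at their nominal values x0.
    # Returns d(response)/d(x_j) at x0: a local measure that ignores
    # interactions and non-linearity.
    x0 = np.asarray(x0, dtype=float)
    grads = np.empty_like(x0)
    for j in range(x0.size):
        h = rel_step * (abs(x0[j]) if x0[j] != 0 else 1.0)
        up, dn = x0.copy(), x0.copy()
        up[j] += h
        dn[j] -= h
        grads[j] = (model(up) - model(dn)) / (2 * h)
    return grads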
1.2.4 A tool for risk-based management of water quality
As part of a European Commission Project investigating the role of computational
methods in management of surface water quality in developing countries, where
supporting data are unavoidably sparse, a modelling tool called WaterRAT (Water quality
Risk Analysis Tool) has been developed. This tool is built around the methods and
principles outlined above, and is designed to be employed in the manner illustrated by
Figure 1.1. WaterRAT allows exploration of the uncertainties arising from all sources of
prediction error – field data, model parameters, boundary and initial conditions, model
structure, scale and numerical approximations. Model parameters, boundary and initial
conditions can all be input as distributions, and can be conditioned to field data or other
designed objectives using built-in algorithms. Four simultaneous objectives can be
specified, and Pareto-optimal trade-offs can be identified. Regional sensitivity analysis
using Latin hypercube sampling is complemented by factorial sensitivity methods.
WaterRAT allows the effects of output uncertainties to be evaluated in terms of risk of
failing water quality targets, and will plot risk of failure against any one input variable,
supporting, for example, risk-based management of pollution control. Additionally, the
water quality targets can themselves be assigned uncertainty, thus incorporating risk due
to poorly defined objectives. Dynamic models are solved using an adaptive time-step
procedure, with the temporal numerical tolerances pre-specified by the user. A
description of the WaterRAT tool is provided in Chapter 4.
1.3 Background to the case studies
Two case studies are used in the development of this Thesis. They are the Hun River in
Liaoning, northeast China, and the Charles River in Massachusetts, northeast USA. The
Hun River was the focus of the European Community-funded project, Total Pollution
Load Estimation and Management (TOPLEM). As a whole, the TOPLEM project was
aimed at developing pollution management decision-support software suitable for use in
developing countries, where data and resources are especially limited. The Imperial
College role was development of suitable river and lake water quality modelling tools,
which resulted in the research constituting this dissertation. The degree to which the
research could be based on the Hun River was limited due to unforeseen difficulties in
accessing the promised data and models, notably river flow data and pollution load
models. While unfortunate, the especial limitations affecting the Hun River study are
extremely relevant to the Thesis, and provide points for discussion in Chapter 8. Due to
the failure of the TOPLEM project to deliver quality data, the Charles River was
introduced as a study for which suitable information is available (with gratitude to Steve
Chapra of Tufts University, and Camp Dresser and McKee Inc. of Cambridge,
Massachusetts).
1.3.1 The Hun River characteristics
The Hun River is one of three major rivers in the Liao River Basin in Liaoning Province,
illustrated in Figure 1.2. Liaoning is a centre for heavy industry. It is rich in deposits of
oil, coal and iron, all of which are mined heavily. Other important industries are power
generation, production of steel, oil refining and petrochemical production. The largest
city in the Liao Basin, with a population of 5 million, is Shenyang. 70km up the Hun
River from Shenyang is another important industrial centre called Fushun, with a
population of approximately 3 million. As well as its industrial importance, Liaoning
produces substantial quantities of sorghum, soya and rice, among other crops. The Hun
River originates in the northeast of Liaoning, and for the first 110 km the river descends
southwest through highlands, before entering the large Dahuofang reservoir. Following
the reservoir the Hun enters flat lowland territory, passing through Fushun and Shenyang.
After Shenyang, the river enters its lower catchment which is intensively used for
agriculture and oil drilling. 250km downstream of the reservoir, the Hun joins the larger
Taizi River.
Liaoning’s climate is monsoonal, with a hot wet season which generally includes July,
August and September, and a freezing, dry winter from November until March. The
remaining months are termed ‘mid-season’ with variable climate. Figure 1.3 shows a
typical time-series of ambient air temperature and precipitation at Shenyang (for the year
1999-2000). The flow regime of the lower Hun River is dominated by a number of
artificial controls, starting with the dam of the large Dahuofang reservoir. Due to the
threat of drought, little or no compensation flow is released from the reservoir except
during especially wet periods. While the 90 percentile low flow upstream of the reservoir
is over 10 m3s-1, at the lower end of the river it is 2 m3s-1 (Montgomery Watson 2001a).
The locations of the significant point pollution loads (and where quality and flow data
were collected from September 1999 until September 2000 as part of the TOPLEM
project) are illustrated in Figure 1.4. The uses of the lower Hun River (i.e. downstream of
the Dahuofang reservoir) are severely restricted due to high pollution levels. Insight into
pollution management problems in the Hun catchment can be found in Ma (2001) and the
references therein.
[Map: Liaoning Province in northeast PR China, showing the Hun, Taizi and Liao Rivers, the cities of Fushun and Shenyang, the Liaotung Gulf and the Yellow Sea, and the neighbouring provinces of Hebei, Jilin and Inner Mongolia, with North Korea to the east.]
Figure 1.2 Location Plan for Hun River
[Time-series plot of daily precipitation (mm) and ambient air temperature (°C) at Shenyang, July 1999 to June 2000.]
Figure 1.3 Air temperature and precipitation during 1999-2000 at Shenyang
[Schematic of the simulated river, with distances in river kilometres (RK) from the Dahuofang Reservoir: Fushun sewer (RK 19); upstream boundary of simulation (RK 44); Yinguan River (RK 52); five large mixed-source Shenyang sewer outfalls (RK 58-66); Bataipo River (RK 70); Xi River (RK 125); Pu River (RK 160); downstream boundary of simulation (RK 185); monitoring sections marked.]
Figure 1.4 The main sources of pollution to the Hun River

While the TOPLEM project considered the full extent of the Hun River and its catchment, the work presented in Chapters 5 and 6 focuses on the river reaches at Shenyang and immediately upstream of the Dahuofang reservoir.
1.3.2 The Charles River characteristics
The headwater of the Charles River is located in the hills of eastern Massachusetts in the
USA. The river flows approximately 130km through the state, through numerous towns
and over a succession of dams, before discharging into Boston harbour. Water quality
problems associated with the Charles River in previous decades were industrial pollution
and combined sewer overflows which led, among other unwelcome effects, to nutrient
enrichment and eutrophication. Storm-water interceptions and other interventions in the
1990s have greatly improved the overall ecology and amenity value of the river, although
they have failed to control eutrophication satisfactorily. Further measures are currently
being implemented by installing phosphorus stripping facilities at a number of wastewater
treatment plants (CRWA 2000). The study in Chapter 7 looks at the 40km length of the
Upper Charles River, between the Populatic Pond in Medway County and the Cochrane
Dam in Dover County. Figure 1.5 shows the path of the Charles, and the location of the
main point sources of pollution in this stretch.
[Map of the Upper Charles River showing the basin boundary, the model boundaries, monitored sections A-I (A Headwater; B downstream of Mill River; C Mill River + 4.5 km; D upstream of Medfield; E upstream of Sewall Brook; F Sewall Brook + 4 km; G South Natick Dam; H upstream of Trout Brook; I Cochrane Dam) and point sources 1-9 (1 CRPCD WWTP; 2 Mill River; 3 Stop River; 4 Medfield WWTP; 5 Bogastow Brook; 6 Sewall Brook; 7 Indian Brook; 8 Waban Brook; 9 Trout Brook), with Boston Harbour downstream.]
Figure 1.5 Charles River model boundaries, point sources and monitoring locations
1.4 Explanation of the structure of the remainder of this
dissertation
The general objective of this dissertation is to progress the science and practice of
simulation modelling to reflect the needs and resource constraints of surface water quality
managers. More specifically, the dissertation aims to develop methodologies and tools
that will assist in identification of river water quality management priorities, through
evaluation of the risk that various uncertainties pose to the decision-making procedure.
The wide scope of this aim became apparent early in the course of the work and, rather
than pursue a very specialised track, the following chapters reflect different but inter-
related aspects of the challenge. The contribution offered by the dissertation is, therefore,
not only individual theses contained within the chapters, but the recognition of the need
for integrated examination of the issues, and the delivery of a framework to do so.
Chapter 1 has justified the concept of the Thesis and introduced the principal methods of
uncertainty and sensitivity analysis that will be employed. In Chapter 2, the importance of
parameter uncertainty estimation and propagation is recognised, and approaches
previously used in environmental simulation modelling are reviewed in some detail, and
demonstrated using a simple water quality model of a hypothetical system. Some relevant
comparisons and contrasts between the different methods are drawn, and their role in the
Thesis is discussed. Chapter 3 is a review of the state-of-the-art of river water quality
modelling, focusing on review of the types of formulation used in the various models that
are introduced in Chapter 4. Chapter 4 expands upon the introduction to the WaterRAT
modelling tool given earlier in Chapter 1, describing the components and capabilities of
the tool. Although the shortest, Chapter 4 represents the main development effort, and
this is reflected in the cited, more comprehensive WaterRAT descriptions. Chapter 5
reflects some of the first difficulties that were encountered in modelling the Hun River,
i.e. achieving numerical stability when simulating high order systems, while reconciling
the numerical precision with the overall modelling uncertainty and the computational
demands of Monte Carlo simulation. The issues that Chapter 5 explores are
fundamentally important, and seem to be under-reported in previous literature. Chapter 6
again tackles an important aspect of uncertainty analysis which was raised early in the
TOPLEM project – the prior identification of data needs and the dependent issues of the
nature of input data, expectations of output data, perceptions of structural adequacy,
limitations of the calibration algorithms and cost constraints. Chapter 7 uses the Charles
River study to test the strengths and limitations of the WaterRAT tool and the methods it
employs, in using Monte Carlo methods to identify the various factors affecting decision-
making risk, and provides a basis for risk-based water quality management. Thus, the
case study chapters look in turn at 1) the numerical challenges, 2) the data collection and
experimental design challenges, and 3) the decision-support application challenges,
pertaining to the aim and objective of the dissertation. Chapter 8 critically reviews the
success of the case study chapters in achieving their aims and exposes the gaps that they
have left for further research. Chapter 8 also includes discussion of the partial failure of
the TOPLEM project, and cursory consideration of approaches that can be used when
data are especially sparse and unreliable. Finally, current and future directions for the
field of uncertainty analysis in water quality modelling are reviewed in light of the
dissertation.
2. Estimation and propagation of parametric uncertainty in
environmental models
The use of statistically-based likelihood functions as a basis for representing model
parameter uncertainty is introduced, and the difficulties of their application when
unknown model structure error and/or data bias may be significant are discussed. More
subjective approaches to estimating model uncertainty (GLUE and possibility theory),
which attempt to allow representation of the effects of model and data biases in the
parameter uncertainty, are described. An alternative to simple random sampling of
parameter distributions is described (the Metropolis algorithm), and the significance of
uncertainty derived using a multi-objective approach is compared with the traditional
method of lumping all data into one objective function. The chapter goes on to
demonstrate the estimation of uncertainty using a data error sampling approach, GLUE
and Metropolis using the Streeter-Phelps model of stream dissolved oxygen. It is also
shown that the three calibration methods converge the parameter distributions to
practically the same end result if consistent objective functions are employed, although
the significance of the objective function (whether it is a statistically-based likelihood
function or a more subjective GLUE ‘likelihood’) is evident. Methods of propagation of
parameter uncertainty are also reviewed. Rosenblueth’s two-point method, first order
variance propagation, Monte Carlo sampling and possibility theory are applied to the
Streeter-Phelps example. The first three methods are shown to be capable of producing
practically equivalent confidence limits on the model result, but again the significance of
the objective function is evident. The relative merits of the methods for more complex
modelling problems are discussed.
2.1 Introduction
2.1.1 Background and scope of chapter
Producing a reliable set of confidence limits on a model result is not difficult given ideal
circumstances. For example, to fit a linear model to observations which are normally and
independently distributed with constant variance requires standard regression techniques,
and derived confidence limits are theoretically sound (see Berthouex and Brown 1994).
However, the natural environment is very much non-linear and this biases parameter
estimates (e.g. Tellinghuisen 2000). Also, data generally carry sampling and
measurement errors, and are often unreliable, and, no matter how well behaved the data
are, if the structure of the model is fundamentally wrong then standard regression
techniques are flawed. Clearly then, extrapolation of the model into the future also
complicates the analysis, as the reliability of the model under new conditions is always in
question. The problem of model equifinality means that many different proposed models
may appear equally adequate when compared to the data but may give significantly
different results when extrapolated to new conditions.
This chapter is a review of methods of uncertainty analysis in environmental modelling.
This subject area has previously been reviewed elsewhere (Beck 1983, Beck 1987,
Melching 1995, Tung 1996, McIntyre et al. 2001, Adams and Reckhow 2001, Kavetski et
al. 2002) and the reader is directed to this literature for additional background and
discussion. This chapter complements these previous works by taking a demonstrative
approach to the review, aiming to give insightful comparisons between the methods using
simple examples and theory. As such, it is intended to be a practical guide to the available
methods, and to enable and encourage the modeller to implement them with forethought,
and to interpret the results properly. Notably, this review excludes methods of recursive
parameter estimation (see Beck 1987). The utility of those methods is evident when the
modelling objectives are relatively well defined by observations of the environmental
system (e.g. Whitehead and Hornberger 1984). Without diminishing the importance of
recursive parameter estimation, this chapter (and dissertation) is principally concerned
with methods most used for analysis of systems for which supporting observations are
relatively sparse.
2.1.2 The sources of uncertainty and their representation in the model
A definition of uncertainty analysis is ‘the means of calculating and representing the
certainty with which the model results represent reality’. The difference between a
deterministic model result and reality will arise from,
- model parameter error,
- model structure error (where the model structure is the set of numerical equations which define the uncalibrated model),
- numerical errors: truncation errors, rounding errors and typographical mistakes in the numerical implementation,
- boundary condition uncertainties.

As reality can only be approximated by field data, data error analysis is a fundamental part of the uncertainty analysis. Data errors arise from,

- sampling errors (i.e. the data not representing the required spatial and temporal averages),
- measurement errors (e.g. due to methods of handling and laboratory analysis),
- human reliability.
Realising that an error-free model would equate to the error-free observations, the
relationship between the actual model result M and the actual observations O can be
summarised by,
$M - \varepsilon_1 - \varepsilon_2 - \varepsilon_3 - \varepsilon_4 = O - \varepsilon_5 - \varepsilon_6 - \varepsilon_7$   (2.1)
where ε1 to ε4 represent the model error arising from the four sources in the order listed
above, and ε5 to ε7 represent the data error arising from the sources listed above.
Representing the overall error on either side of Equation 2.1 is not generally a simple task
of adding the error variances together, as might be implied by the equation. This is
because the errors may be unknown, and/or not of a random nature (see below), and/or
the model output may be interdependent on the various sources of error in a manner that
precludes their simple addition.
It is the goal of the modeller to achieve, to within an arbitrary tolerance, an error-free
model by removal of ε1 to ε4. However, the modeller is generally neither in control of
model structure errors ε2, nor numerical errors ε3, nor boundary condition errors ε4.
Commonly, only the values of the model parameters are under the direct control of the
modeller. The aim would then become one of compensating as far as possible for ε2 to ε4
by identification of optimum effective parameter values. Central to this Thesis is the
argument that there is always some ambiguity in the optimum effective parameter values
caused by the unknown natures of, and inseparability of, ε2 to ε7, and that this ambiguity
can be represented by parametric uncertainty. As such, the model parameters are used as
error-handling variables, and are identified according to their ability to mathematically
explain ε2 to ε7. In most environmental modelling problems, significant bias in one or
more of these errors will inevitably lead to biased parameter estimates. While the ideal
solution would be to eliminate bias, for example by compensatory adjustments to data or
by model structure refinement, such measures are often not practical and never
comprehensive. In recognition of this, the potential importance of biased model
calibration will be illustrated in this chapter, and significant attention is given to methods
of uncertainty analysis which aim to deliver some robustness to bias.
The difficult task of identifying parameter uncertainty is generally approached using
methods of calibration which derive, from the pre-calibration (a priori) parameter
distributions, calibrated (a posteriori) distributions. In hydrological modelling, due to
lack of prior knowledge, the a priori distributions are often taken as uniform and
independent (e.g. Hornberger and Spear 1980). On the other hand, the a posteriori
distributions, constrained by the data, may be multi-modal and non-linearly inter-
dependent (Sorooshian and Gupta 1995). Inter-dependency arises when the model result
is simultaneously significantly affected by two or more parameters, such that the
distribution of each parameter must be regarded as conditional on the value of all inter-
dependent parameters. Therefore, it is necessary to refer to the joint parameter
distribution which is defined by a continuous function of all the parameters, and to
sampled parameter sets rather than individual parameter values.
2.2 Approaches to uncertainty-based model calibration
Calibration is the process of tuning the model by optimisation of the set of model
parameters. In traditional deterministic modelling, a single optimum parameter set is
found such that model results fit the data as closely as possible. The closeness of fit is
quantified by one or more objective functions (OFs), and a variety of automated
optimisation procedures are available (see Sorooshian and Gupta 1995). In an
uncertainty-based calibration, the modeller is interested in the response of the OF over the
entire a priori parameter space, i.e. the OF response surface (see Berthouex and Brown
26
1994). Definition and sampling of the OF, and interpretation and analysis of the
subsequent response surface are the means of deriving calibrated parameter distributions.
This discussion will critically review some different approaches to these tasks.
2.2.1 Definitions
Before beginning this review, some terminology must be defined. As stated above, an
objective function (OF) is a general term for a quantative measure of how closely a model
result fits to corresponding observed data, which may or may not have a probabilistic
basis. The term likelihood is used in the statistical sense, i.e. the probability of a set of
data given a model, while a likelihood function is a function measuring this probability,
i.e. a particular type of objective function. For use in the context of uncertainty estimation
using GLUE (see section 2.2.5), a GLUE likelihood measure is the measure of probability
of a sampled model (i.e. which may be either a subjective perception of probability or
based on a likelihood function).
2.2.2 Objective functions and likelihood functions
The method of maximum likelihood (see Ang and Tang 1975) is the traditional
mathematical route to model parameter calibration, and it is a necessary starting point for
this discussion. If the OF is defined as a likelihood function of the model then for each
trial model,
[ ] resi
ii NiPPP ,,4,3;)()()(OF 121121 KK =∩∩∩= ∏ −εεεεεεε (2.2)
where εi is the ith of Nres model residuals (i.e. the difference between the ith of Nres
available data points and the corresponding model result), P(εi) is the probability
density of εi, and P(ε2 | ε1) signifies the probability of ε2 assuming ε1 has already
happened. If the Nres residuals are assumed to be independent and normally distributed
with zero mean and constant variance σ2, and there are Npar degrees of freedom (i.e.
parameters to be calibrated), then Equation 2.2 becomes,
$\text{OF} = \prod_{i=1}^{N_{res}} \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{\varepsilon_i^2}{2\sigma^2}\right) = \frac{1}{(2\pi)^{N_{res}/2}\,\sigma^{N_{res}}}\exp\left(-\frac{1}{2\sigma^2}\left(\varepsilon_1^2 + \varepsilon_2^2 + \dots + \varepsilon_{N_{res}}^2\right)\right)$   (2.3)
If it can be assumed that $\left(\varepsilon_1^2 + \varepsilon_2^2 + \dots + \varepsilon_{N_{res}}^2\right)/\left(N_{res} - N_{par}\right)$ is equal to $\sigma^2$ then,

$\text{OF} = \frac{1}{\left(2\pi\sigma^2\right)^{N_{res}/2}}\exp\left(-0.5\left(N_{res} - N_{par}\right)\right)$   (2.4)
Nres and Npar being constant during a model calibration, Equation 2.4 is reduced to,
$\text{OF} = \frac{K}{\left(\sigma^2\right)^{N_{res}/2}}$   (2.5)
where K is a constant. Therefore, assuming that the sum of the squared residuals divided
by an appropriate constant is equal to σ 2, the least sum of squared residuals maximises
the likelihood (Box and Jenkins 1970). Usually, one or more of the assumptions used in
the derivation of Equation 2.5 is not valid. For example, the assumption that the error
variance σ 2 can be accurately estimated using the sum of the squared residuals is not
tenable when the number of residuals is small, and Equations 2.3 to 2.5 would give a very
approximate likelihood. If more than one modelled variable is being included in the OF
then σ 2 cannot generally be taken as constant for all variables, and Equation 2.2 (from
Sorooshian and Dracup 1980) becomes,
$\text{OF} = K\prod_{r=1}^{N_{var}} \left(\sigma_r^2\right)^{-N_{res}/2}$   (2.6)
where Nvar is the number of included variables each modelled and measured at Nres
locations (in time and/or space). For finding the maximum likelihood, this is equivalent to
minimisation of the sum of weighted squared residuals, assuming the responses are
independent. Similarly, if the variance changes in time and/or space with the magnitude
of the response then an appropriate weighting scheme may be used (Sorooshian and
Dracup 1980). For autocorrelated residuals, Romanowicz et al. (1994) describe a suitable
likelihood function.
The OFs in Equations 2.2 to 2.6 give the probability of a data set sample (say data
samplek) occurring given the model result. If this model result is defined by a set of
parameters (αi) sampled from the a priori joint parameter distribution, applied to a chosen
model structure (say model structurej),
$P\left[\text{data sample}_k \mid \alpha_i,\ \text{model structure}_j\right] = \text{OF}_i$   (2.7)
Let us assume for now that model structure j correctly and uniquely describes the
modelled system, so that it drops out of the equation. Equation 2.7 may be manipulated
using Bayes theorem (see Ang and Tang 1975), to give the probability of the parameter
set given the data sample,
$P\left[\alpha_i \mid \text{data sample}_k\right] \times P\left(\text{data sample}_k\right) = \text{OF}_i\,P(\alpha_i)$   (2.8)
If only one data sample is considered, then P(data samplek) = 1. Furthermore, if it is
considered that all of Nsam sampled parameter sets have equal a priori probability so that
P(αi ) is equal to 1/ Nsam;
$P\left[\alpha_i \mid \text{data sample}_k\right] = \frac{\text{OF}_i}{N_{sam}}$   (2.9)
The standardised objective function (so that all the discrete OFs total unity) is then an
estimate of probability mass from the posterior joint parameter distribution,
$P\left[\alpha_i \mid \text{data sample}_k\right] = \frac{\text{OF}_i}{\sum_{l=1}^{N_{sam}} \text{OF}_l}$   (2.10)
In practice, the evaluated probability of the model is conditional on the data sample
employed for calibration. This is generally important in river quality modelling because
subsequent data sets from the field, apparently drawn under the same conditions, often
result in quite different parameter distributions. Results using more than one data set can
be integrated into the joint parameter distribution. If it is assumed that alternative data
sets are sampled independently,
$P(\alpha_i) = \sum_{k=1}^{N_{dat}} \left(P\left[\alpha_i \mid \text{data sample}_k\right] \times P\left[\text{data sample}_k\right]\right)$   (2.11)
where Ndat is the number of sampled data sets, ( )iP α here is the probability of iα given
these alternative data sets, and [ ]kP sample data would generally be considered constant
for all k. If it is considered that the error associated with the variability between data sets
(rather than the error variance within them) is the main source of uncertainty, then only
the maximum likelihood model for each data set realisation may be considered, with Equation 2.11 reduced to,

$P(\alpha_k') = P\left[\text{data sample}_k\right]$   (2.12)
where $\alpha_k'$ is the maximum likelihood parameter set for the kth data set realisation.
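A minimal Python sketch of Equations 2.3 and 2.10 may be helpful (names are illustrative; working in logarithms to avoid numerical underflow is an implementation detail, not part of the derivation above):

import numpy as np

def gaussian_log_likelihood(residuals):
    # Log of Equation 2.3, with sigma^2 estimated from the residuals.
    eps = np.asarray(residuals, dtype=float)
    n = eps.size
    sigma2 = np.sum(eps**2) / n
    return -0.5 * n * np.log(2 * np.pi * sigma2) - np.sum(eps**2) / (2 * sigma2)

def posterior_mass(log_ofs):
    # Equation 2.10: standardise the sampled OFs so they sum to unity.
    log_ofs = np.asarray(log_ofs)
    w = np.exp(log_ofs - log_ofs.max())   # subtract max for numerical stability
    return w / w.sum()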
2.2.3 The significance of model structure errors and data bias
In process-based river quality modelling, some structural error is inevitable because the
complexity of the aquatic environment (its physics, chemistry and biochemistry) has
always surpassed our ability to observe, understand and numerically represent it. In
particular, the behaviour of water quality systems is liable to shift when boundary
conditions are substantially altered (van Straten 1998), and so model structures that
appear to perform well during calibration may be structurally flawed when considering
intervention options. To what extent our model structures need to be correct is introduced
in Chapter 1, and a nominal example of the significance of structural error is given later
in Chapter 6. In this and the next two sections the implications of structural error for
objective function design and uncertainty analysis philosophy are discussed.
Equations 2.7 to 2.12 assume the correct model structure, and therefore the derived
parameter response surface becomes less relevant as the structural error becomes larger.
A particular danger that arises from structural error is that the parameter estimates
become biased and their uncertainty is underestimated, potentially causing misleading
predictions. Therefore, the structural error should somehow be confronted or integrated
into the OF response surface.
Confronting the error generally would require some inference about the nature of the
error from statistical or visual analysis of the model residuals. Confronting the error could
involve re-conceptualising the modelled system (perhaps calling upon new theoretical
knowledge), or by making empirical adjustments to the model output (although this
adjustment would be specific to conditions under which the model is assessed and
therefore potentially of less use for predictive purposes). Using recursive parameter
estimation techniques is another possible route to adjusting the model empirically, or
making inferences about faults in the model structure. An additional problem with trying
to confront the error by improving the model structure is that there are generally a large
number of “improvements” that would result in a better model fit, and so a “correct”
model (i.e. a single model that best describes the system) would still not be found.
Another reason why modifying the model structure may be problematic is the tendency of
water quality data to be significantly biased due to unknown sampling and measurement
errors. This means that it is often impossible to distinguish between residuals caused by
data errors and those caused by model inadequacy. In any case, as stated at the outset of
this chapter, we are interested in the more common case that data are not of good enough
quality to embark upon model structure modifications that can generally be inferred from
residuals. Therefore, we turn to methods of integrating the structural error into the
response surface.
The first method to consider is calibrating a response surface for each of a number of
alternative proposed model structures, so that the response surface of each would
integrate to give the a posteriori probability of that structure, and all response surfaces
together would integrate to 1. Model application would then consist of applying all
structures with non-zero probability. This is hardly ideal, as there is no way of knowing
that the proposed models are a representative sample from the population of plausible
models, and it would not be easy to assign prior probabilities to them. However, this is a
tractable and transparent method of averaging over competing, justifiable models.
Alternative methods aim to allow one prescribed model structure to be used, and the
additional uncertainty caused by model structure error to be represented notionally by
increased parameter uncertainty. Referring back to the discussion in section 2.1.2, this is
often the only viable approach in practice because modellers rarely have time to consider
changing the model code or using different model structures, even if they do have the
necessary expertise and code access. Two such approaches are reviewed next.
2.2.4 Possibility theory and the HSY method
One approach to improving robustness of the modelling exercise to sparse and possibly
biased data is to use possibility theory (Zadeh 1978, also see Wierman 1996). A possibility
distribution describes the perceived possibility of an event where the maximum
possibility is 1 and the minimum is 0. In possibility theory, the rules of union and
intersection differ from those in probability theory. For example, for independent model
residuals ε1 and ε2,
$\text{Possibility}(\varepsilon_1 \cap \varepsilon_2) = \text{Minimum}\left[\text{Possibility}(\varepsilon_1),\ \text{Possibility}(\varepsilon_2)\right]$   (2.13a)

$\text{Possibility}(\varepsilon_1 \cup \varepsilon_2) = \text{Maximum}\left[\text{Possibility}(\varepsilon_1),\ \text{Possibility}(\varepsilon_2)\right]$   (2.13b)
Applying possibility theory to model calibration requires a subjective measure of the
possibility of the outcome of each candidate model. Using Equation 2.13(a), for example,
the possibility of any model result is the model residual (out of all Nres model residuals)
perceived to be the least likely. Although the significance of the remaining Nres-1
residuals (apart from not being the largest) would be lost, the robustness to data bias
would be increased by avoidance of the multiplicative likelihood function.
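As an illustration of Equation 2.13(a), the sketch below assigns each residual a triangular possibility (the shape of this function is a subjective choice, assumed here purely for demonstration) and takes the model's possibility as the minimum over its residuals:

import numpy as np

def residual_possibility(eps, tolerance):
    # Illustrative triangular possibility for a single residual:
    # 1 at eps = 0, falling linearly to 0 at |eps| = tolerance.
    return np.clip(1.0 - np.abs(eps) / tolerance, 0.0, 1.0)

def model_possibility(residuals, tolerance):
    # Equation 2.13(a): the possibility of the model result is the
    # minimum possibility over all of its residuals.
    return float(np.min(residual_possibility(np.asarray(residuals), tolerance)))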
Another particular appeal of applying possibility theory to model calibration is that it
provides a convenient basis for calibrating the model using subjectively defined support
criteria. While such reasoning can be based on interpretation of data it may also be
knowledge-based. That is, the possibility of any candidate model can be judged on the
basis of non-numeric (even non-documented) knowledge about the environmental system
rather than by “hard” data. This is important in water quality studies where there is often
more useful information in qualitative observations of water quality (e.g. observations of
algal blooms, fish kills and discolouration) than in a sparse set of spot samples. However,
model results will reflect the modeller’s subjective interpretation of the evidence and its
translation into possibility distributions (as well as any judgements used in formulating
the model itself).
Hornberger and Spear (1980) suggested a groundbreaking approach to calibration of
environmental models which has distinct parallels with possibility theory. In their
method, an a priori parameter set, applied to a given model structure, is considered to be
a possible model of the system if the corresponding model result lies wholly within a set
of characteristic system behaviour. The characteristic behaviour is defined by subjective
reasoning which may include analysis of available data. The result of this approach to
calibration is an a posteriori sample of equally possible parameter sets and a
complementary sample of impossible parameter sets. Van Straten and Keesman (1991)
demonstrate how the a posteriori sample of possible parameters can be propagated to a
range of possible results. Statistical comparison of the contents of these parameter sets
can robustly quantify model sensitivity to individual parameters (e.g. Spear and
Hornberger 1980, Chen and Wheater 1999), and so the method is often referred to as
Regional (or Global) Sensitivity Analysis. After Beck (1987), the method is referred to in
this dissertation as the Hornberger-Spear-Young (HSY) algorithm and a Monte Carlo-
based algorithm for implementation of this method is illustrated in Figure 2.1.
[Flow chart: define the upper and lower limits which bound the set of 'characteristic system behaviour'; randomly sample parameters from the a priori distributions and run the model; if the result lies wholly within the characteristic set, the possibility of that parameter sample is 1, otherwise 0; repeat until the a posteriori pdf converges; then form an a posteriori parameter pdf from all possible parameter samples, ready to propagate uncertainty.]
Figure 2.1 HSY calibration procedure
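A minimal sketch of the loop in Figure 2.1, assuming the behaviour bounds and a prior sampler are supplied by the user (all names illustrative):

import numpy as np

def hsy_calibration(model, prior_sampler, lower, upper, n_samples=10000):
    # HSY (regional sensitivity analysis) calibration of Figure 2.1.
    # A parameter set is 'possible' (possibility 1) if the whole model
    # result lies inside the behaviour bounds, otherwise 'impossible' (0).
    behavioural, non_behavioural = [], []
    for _ in range(n_samples):
        theta = prior_sampler()            # draw from the a priori distributions
        result = model(theta)              # run the model
        if np.all((result >= lower) & (result <= upper)):
            behavioural.append(theta)      # wholly within the characteristic set
        else:
            non_behavioural.append(theta)
    return np.array(behavioural), np.array(non_behavioural)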
2.2.5 Generalised Likelihood Uncertainty Estimation
Beven and Binley (1992) developed the HSY method into their Generalised Likelihood
Uncertainty Estimation (GLUE), so that every possible model was weighted with a
probability, called a likelihood in GLUE terminology. The GLUE likelihood measures are
interpreted as estimates of probability mass from the posterior joint parameter distribution
for that model, and predictions from alternative model structures with their own joint
parameter distributions can be combined. The GLUE methodology allows any (re-scaled)
objective function to be treated as the relative probability of the model if the modeller
considers this to meet the ends of the uncertainty analysis. Therefore, the modeller is able
to define how parameter uncertainty is measured and how it and model output uncertainty
should be interpreted. For example, some applications of GLUE use GLUE likelihoods
which are re-scaled values of the Nash-Sutcliffe efficiency (Nash and Sutcliffe 1970; see
Beven et al. 2001), with an arbitrary threshold defining what models are considered
behavioural. In that case, the modeller is making the statement that probability of the
model is equal to belief in the model, which is proportional to the measured relative
success of the model, without relating probability to the statistical properties of the
residuals. Therefore, it is emphasised that the estimated uncertainty depends largely on
the user’s design of the objective function, and the likelihood measure should not be
interpreted as a likelihood function, as used in Equations 2.2 to 2.6, unless it is
specifically designed as such (e.g. Romanowicz et al. 1994).
The obvious disadvantage of converting an objective function into a model probability
without relating it to the statistical properties of the residuals is that the model uncertainty
has no statistical significance, and model results will be arbitrary to some degree.
However, when model structural errors and data biases are unknown, and unknowable
given resource constraints, then the statistical assumptions underlying likelihood
functions will always be questionable, and the explicitly subjective judgement involved in
GLUE becomes more attractive.
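By way of illustration, the sketch below implements one common (and, as stressed above, subjective) GLUE choice: Nash-Sutcliffe efficiencies rescaled above a behavioural threshold so that they sum to unity. This is one possible likelihood measure, not the GLUE methodology itself:

import numpy as np

def nash_sutcliffe(sim, obs):
    # Nash-Sutcliffe efficiency of a simulated series against observations.
    obs = np.asarray(obs)
    return 1.0 - np.sum((np.asarray(sim) - obs)**2) / np.sum((obs - obs.mean())**2)

def glue_likelihoods(sims, obs, threshold=0.0):
    # Rescale efficiencies above a behavioural threshold so that they sum
    # to one; models below the threshold receive zero GLUE likelihood.
    eff = np.array([nash_sutcliffe(s, obs) for s in sims])
    w = np.where(eff > threshold, eff - threshold, 0.0)
    total = w.sum()
    return w / total if total > 0 else w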
2.2.6 Model output versus data uncertainty
If a likelihood function has been used, by definition the derived parameter uncertainty is
the uncertainty in the maximum likelihood parameter estimates. Hence model output
uncertainty will represent the uncertainty in the maximum likelihood result, rather than
the variance of the data. In practice, particularly for regulation purposes, the modeller
may want to predict the uncertainty in future measurements of water quality, and so he
would have to add the predicted data error variance onto the predicted model output
variance.
The GLUE framework allows the modeller to define the objective function (and the
behavioural threshold), so that the model is forced to encompass a satisfactory number of
the model residuals, without separately having to add the data error variance. For
example, the modeller could prescribe a value of Nres (Equations 2.3 to 2.6) which is less
than the number of data points, so increasing the parameter variance (e.g. Franks and
Beven 1997) to a visually satisfactory extent. This approach would have no statistical
basis, and arguably would lead to a cosmetic representation of model uncertainty, and can
only be justified when the objective function reflects the modeller’s belief in the sampled
models.
2.2.7 Multiple objective analysis
Using multiple OFs to measure model performance can add to the understanding of
uncertainty (by looking at what OFs are non-commensurate), and can provide an
alternative definition of uncertainty (by quantifying the disparity between equally
relevant modelling objectives). A well established method of multi-objective analysis,
which is emerging in hydrological modelling (e.g. Yapo et al. 1998), is to identify Pareto
fronts (Goldberg 1989). The Pareto front is the set of OF values (and corresponding
parameter values) whereby no further improvement can be made to any of the OFs
without unnecessary detriment to one or more of the others.
[Three panels: (a) OF2 plotted against OF1, with the Pareto front highlighted; (b) OF1 and OF2 plotted against parameter x; (c) the combined likelihood OF1 × OF2 plotted against parameter x.]
Figure 2.2 Demonstration of the significance of the Pareto set

The significance of the Pareto front is demonstrated in Figure 2.2 based on a trivial
example. Two alternative performance criteria (measured by OF1 and OF2) exist for a
one-parameter (x) model – let these OFs be likelihood functions, and give the parameter
probability distributions shown in Figure 2.2(b). Figure 2.2(a) plots the relationship
between OF1 and OF2 and highlights the Pareto front. For this simple example, this
translates to all values of x between the two distribution peaks on Figure 2.2(b). All these
values of x are equally viable compromises between OF1 and OF2 and might therefore be
interpreted as a uniform distribution, shown on Figure 2.2(c) along with the distribution
obtained by multiplying OF1 and OF2 together, i.e. joint probability assuming
independence. Clearly, the difference between the two distributions in Figure 2.2(c)
increases as the peaks of OF1 and OF2 move closer together and/or the distribution
variances increase. If the objective functions have identical optima, the Pareto solution
has no uncertainty at all. This is perfect for applications where the performance criteria
are individually well defined and uncertainty arises only from conflicts of objective. Used
incorrectly in numerical modelling, the Pareto solution may imply certain predictions
irrespective of the magnitude of data and other errors. Also the Pareto front, by definition,
gives equal weight to each objective function, irrespective of the relative importance of
the objective function and of the quality and quantity of the contributing data. Additional
insight into Pareto optimisation is given in the less trivial example in McIntyre et al.
(2001).
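For two objective functions that are to be maximised, the Pareto front of a Monte Carlo sample can be identified by a simple dominance check, sketched below (a quadratic-cost illustration, adequate for the sample sizes used in this dissertation):

import numpy as np

def pareto_front(of1, of2):
    # Indices of the Pareto-optimal samples when both objective functions
    # are to be maximised: a sample is on the front if no other sample is
    # at least as good in both OFs and strictly better in one.
    of1, of2 = np.asarray(of1), np.asarray(of2)
    on_front = np.ones(of1.size, dtype=bool)
    for i in range(of1.size):
        dominated = (of1 >= of1[i]) & (of2 >= of2[i]) & \
                    ((of1 > of1[i]) | (of2 > of2[i]))
        if dominated.any():
            on_front[i] = False
    return np.flatnonzero(on_front)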
2.3 Sampling and global optimisation techniques

2.3.1 Monte Carlo simulation
Almost all of the work in this dissertation is based on Monte Carlo simulation, where a
random sample of a set of input variables is taken from a known (or assumed) joint
distribution, and the corresponding sample of the model output is deterministically
simulated. Following a large number of samples of inputs, the joint distribution of model
outputs can be approximated. In the context of model calibration, the posterior
distribution of model parameters can also be approximated, using the objective functions
described in preceding sections. This use of Monte Carlo sampling of the prior parameter
space is a fundamental step in the GLUE and HSY calibration procedures.
One appealing feature of GLUE and HSY is that the potentially complex nature of the
response surface (including multiple local optima and non-linear dependencies between
parameters) is implicitly recognised by the large number of parameter samples and
associated probabilities, and this may be kept intact when propagating uncertainty to
model predictions (see section 2.7), rather than summarising the response surface as a
covariance matrix. However, the number of parameter samples is fundamental to the
adequacy of the approximation (Cochran 1977, Kuczera and Parent 1998). The required
number of samples to achieve a certain quality of approximation can be mitigated by
numerous variance reduction techniques (Cochran 1977), for example Latin hypercube
sampling and other stratified sampling methods (e.g. McKay et al. 1979).
As well as simulating the effect of uncertainty in model input variables, Monte Carlo
simulation can be used to simulate the effect of different realisations of data sampled
from a known (or assumed) distribution. This will be demonstrated later, where we use
this method to estimate parameter uncertainty arising from data sampling error, using the
algorithm summarised by the flow chart in Figure 2.3. The limitations of this approach
will also be discussed later.
[Flow chart: find the maximum likelihood model result for the available field data; calculate the distributional properties of the residuals; using these properties, generate an alternative realisation of the available field data; optimise the model parameters with respect to the generated data and find the maximum likelihood parameters; repeat until the a posteriori pdf converges; then form an a posteriori parameter pdf from all realisations of the maximum likelihood parameters, ready to propagate uncertainty.]
Figure 2.3 Estimating parameter uncertainty by Monte Carlo sampling from a distribution of data errors

2.3.2 Metropolis algorithm
Using Monte Carlo simulation of the parameters, a large number of sampled objective
function values can be used to approximate the continuous response surface. However,
many thousands of parameter samples may be required for an adequate approximation to
be made (e.g. Kuczera and Parent 1998). To improve the efficiency of the calibration,
attempts have been made to adapt the a priori distribution to an a posteriori form using
Monte Carlo Markov Chain algorithms (see Brooks 1998).
Here, a Monte Carlo Markov Chain model proposed by Metropolis et al. (1953) is
described. The algorithm uses a Markov Chain process (see Rutenbar 1989, Brooks 1998)
which, in essence, assumes that the current state of a system dictates the probability of
moving to any proposed new state. The Metropolis algorithm was originally developed to
simulate the stochastic behaviour of a system of particles at thermal equilibrium. Applied
to model calibration, it adapts the population of parameters until the OF (in this case to be
minimised) is sufficiently described by the distribution,
$P(\alpha_i) = \frac{1}{K}\exp\left(-\text{OF}_i / K_{M1}\right)$   (2.14)
where K is a standardisation constant such that the total of all P(αi) is unity, KM1 is a case-
dependent constant and αi is the ith parameter set in the derived population. While the
distribution of the accepted OFs converges to the Gaussian form of Equation 2.14, the
distribution of the accepted parameter sets depends upon the relationship between the
model parameters and the OF. The algorithm starts from an arbitrary location in the a
priori parameter space. From then on, the probability of any sampled parameter set αi
being accepted into the population depends entirely on comparison of OFi with that of the
last accepted set, OFi-1. This probability is defined by Equations 2.15(a) and 2.15(b),
$P(\alpha_{i-1} \to \alpha_i) = \frac{P(\alpha_i)}{P(\alpha_{i-1})} = \exp\left(\frac{\text{OF}_{i-1} - \text{OF}_i}{K_{M1}}\right) \quad \text{for } \text{OF}_i > \text{OF}_{i-1}$   (2.15a)

$P(\alpha_{i-1} \to \alpha_i) = 1 \quad \text{for } \text{OF}_i \le \text{OF}_{i-1}$   (2.15b)
Each parameter set is sampled at a random distance and direction from the previously
added set, subject to the a priori constraints and a specified maximum distance, KM2. The
result of the Metropolis algorithm (in this context) is a large sample of parameter sets
from the posterior response surface (note the distinction from the output of GLUE which
produces a large sample of parameter sets from the prior distribution each assigned a
posterior probability).
An implementation of the Metropolis algorithm is suggested in Figure 2.4. This algorithm
could be refined by allowing constants KM1 and KM2 to be updated at intervals, thereby
gradually increasing focus on the optima.
Mailhot et al. (1997) find the Metropolis algorithm to be useful for uncertainty analysis of
a storm sewer model. Kuczera and Parent (1998) compare the performance of the
Metropolis algorithm with GLUE for estimation of rainfall-runoff model parameter
uncertainty (see comments in Section 8.3 of this dissertation).
[Flow chart: define the objective function OF, the constant KM1 and the maximum step KM2; sample an initial parameter set α0 in the a priori parameter space, run the model and calculate OF0; then, repeatedly: take a random sample αi of the a priori parameter space within KM2 of αi-1, run the model and calculate OFi; the probability P of adding αi to the parameter population is Minimum(1, exp((OFi-1 - OFi)/KM1)), simulated by comparison with a random number P′ between 0 and 1; if accepted, add αi to the population of a posteriori parameter sets, otherwise reject αi and add another αi-1; continue until the a posteriori pdf converges, ready to propagate uncertainty.]
Figure 2.4 A Metropolis calibration procedure
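The accept/reject step of Figure 2.4 can be sketched as follows (illustrative names; a uniform proposal within KM2 of the last accepted set is assumed, and the OF is to be minimised as in Equations 2.15):

import numpy as np

def metropolis(of, theta0, k_m1, k_m2, lower, upper, n_steps=20000, rng=None):
    # Metropolis sampling of the OF response surface (Figure 2.4).
    # of: objective function to be minimised, of(theta) -> float
    # k_m1: the case-dependent constant K_M1 in Equations 2.15
    # k_m2: maximum step length K_M2; lower, upper: a priori bounds
    rng = rng or np.random.default_rng()
    chain = [np.asarray(theta0, dtype=float)]
    of_last = of(chain[-1])
    for _ in range(n_steps):
        # propose within K_M2 of the last accepted set, inside prior bounds
        proposal = np.clip(chain[-1] + rng.uniform(-k_m2, k_m2, chain[-1].size),
                           lower, upper)
        of_new = of(proposal)
        # Equations 2.15: accept if not worse, else with probability
        # exp((OF_{i-1} - OF_i) / K_M1)
        if of_new <= of_last or rng.random() < np.exp((of_last - of_new) / k_m1):
            chain.append(proposal)
            of_last = of_new
        else:
            chain.append(chain[-1])   # reject: add another copy of the last set
    return np.array(chain)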
2.3.3 Genetic algorithms
Genetic algorithms (Holland 1975) are global optimisation procedures that are commonly
used in hydrological modelling (e.g. Duan et al. 1993, Mulligan and Brown 1998) to find
a single optimum parameter set. Unless they are designed to converge to a meaningful
distribution, for example by introducing a Markov Chain element (Vrugt et al. 2003a) or
by defining a Pareto set of parameters (Fonseca and Fleming 1995, Vrugt et al. 2003b),
then they are of limited value for uncertainty-based calibration. A good introduction to
genetic algorithms is given by Beasley et al. (1998).
2.4 Example of calibration
This example aims to demonstrate some of the above approaches to the estimation of a
joint a posteriori parameter distribution using Monte Carlo simulation, and elucidate
some similarities and contrasts between them. The importance of the objective function
design, with respect to the interpretation of the output uncertainty, is illustrated. To make
the demonstration manageable, the model is simple and the data are idealised. Attention is
drawn to the last paragraph in this section which discusses the limitations of the example
in the context of more complex and practical problems.
2.4.1 The model and data
A steady state model of biodegradable organic carbon (Cc) decay and dissolved oxygen
(Cox) in a river can be described by the Streeter-Phelps equations (Streeter and Phelps
1925, also see Chapter 3),
$C_c(x) = C_c(0)\exp\left(-\frac{k_{oc}\,x}{u}\right)$   (2.16a)

$C_{ox}(x) = C_{os} - \frac{k_{oc}\,C_c(0)}{k_{ra} - k_{oc}}\left(\exp\left(-\frac{k_{oc}\,x}{u}\right) - \exp\left(-\frac{k_{ra}\,x}{u}\right)\right) - \left(C_{os} - C_{ox}(0)\right)\exp\left(-\frac{k_{ra}\,x}{u}\right)$   (2.16b)
where koc is the Cc decay rate, kra is the oxygen aeration rate, x is the distance downstream
from a point pollution source, and Cox(0) and Cc(0) are the respective concentrations in
the river at x = 0, u is the average transport velocity and Cos is the uniform concentration
of Cox at saturation. Synthetic data are generated by the model using the parameter values
and boundary conditions in Table 2.1, and random errors are introduced in Cox (=εox)
from an N(0, 22) population, and in Cc (=εc) from an independent N(0, 102) population.
With 20 data locations spaced at 5km intervals along a 100km stretch of river, the
synthetic data are illustrated in Figure 2.5.
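For concreteness, the listing below sketches Equations 2.16 and the generation of the synthetic data. One caveat: Table 2.1 lists the rate constants in s-1, but a physically plausible oxygen sag over 100 km requires interpreting them per day, so the sketch converts u to m/day; this unit interpretation is an assumption of the sketch, not a statement about the original calculations.

import numpy as np

def streeter_phelps(x, koc, kra, cc0, cox0, cos, u):
    # Equations 2.16a/b; x, u and the rate constants must use consistent
    # units (here: metres, m/day and day^-1).
    cc = cc0 * np.exp(-koc * x / u)
    sag = koc * cc0 / (kra - koc) * (np.exp(-koc * x / u) - np.exp(-kra * x / u))
    cox = cos - sag - (cos - cox0) * np.exp(-kra * x / u)
    return cc, cox

# Synthetic data at 20 locations spaced 5 km apart (Table 2.1 values;
# rate constants interpreted as day^-1, an assumption as noted above)
rng = np.random.default_rng(0)
x = np.arange(1, 21) * 5000.0           # m downstream of the point source
u = 0.5 * 86400.0                       # 0.5 m/s expressed in m/day
cc_true, cox_true = streeter_phelps(x, koc=1.0, kra=5.0,
                                    cc0=75.0, cox0=12.0, cos=12.0, u=u)
cc_data = cc_true + rng.normal(0, 10.0, x.size)   # N(0, 10^2) error in Cc
cox_data = cox_true + rng.normal(0, 2.0, x.size)  # N(0, 2^2) error in Cox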
In the following demonstrations, a joint distribution of parameters koc and kra is derived
from this model and data set (and variations of it). The other parameters are fixed at the
values in Table 2.1.
2.4.2 Sampling the data error distribution
The first method uses the available data set together with the corresponding maximum
likelihood model output to approximate the data error distribution. The available 20
observations of Cox and Cc are used to find the maximum likelihood parameter set (koc,
kra) from 1000 random samples from within the bounds koc = 0.4-1.6 and kra = 2.0-8.0.
The statistics of the residuals (mean and standard deviation) around this model output are
given in Table 2.2, together with the uncertainty in these statistics, derived using the
equations in Ang and Tang (1975, p232 & p248).
Using the algorithm summarised in Figure 2.3, 200 alternative realisations of data are
drawn from the sample distributions described by Table 2.2, and the maximum likelihood
model (out of 1000 random samples of (koc, kra ) taken from the above-stated ranges) for
each is found. A posterior joint distribution of (koc, kra ) is built up from these alternative
realisations of the maximum likelihood.
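Continuing the sketch above (and reusing its streeter_phelps function, data grid x and velocity u), the resampling loop of Figure 2.3 might look like the following; a weighted least squares objective stands in for the likelihood function of Equation 2.6:

import numpy as np

def data_error_calibration(x, cc_data, cox_data, n_realisations=200,
                           n_samples=1000, rng=None):
    # Figure 2.3 applied to the Streeter-Phelps example: resample the data
    # errors and keep the maximum likelihood (koc, kra) for each realisation.
    rng = rng or np.random.default_rng(0)

    def best_fit(cc_obs, cox_obs):
        # random search over the stated prior bounds
        koc = rng.uniform(0.4, 1.6, n_samples)
        kra = rng.uniform(2.0, 8.0, n_samples)
        sse = np.empty(n_samples)
        for i in range(n_samples):
            cc, cox = streeter_phelps(x, koc[i], kra[i], 75.0, 12.0, 12.0, u)
            # residuals weighted by the assumed error standard deviations
            sse[i] = np.sum(((cc - cc_obs) / 10.0)**2) + \
                     np.sum(((cox - cox_obs) / 2.0)**2)
        j = np.argmin(sse)
        return koc[j], kra[j]

    # residual statistics around the best fit to the available data
    koc0, kra0 = best_fit(cc_data, cox_data)
    cc_fit, cox_fit = streeter_phelps(x, koc0, kra0, 75.0, 12.0, 12.0, u)
    s_cc = np.std(cc_data - cc_fit, ddof=1)
    s_cox = np.std(cox_data - cox_fit, ddof=1)

    # regenerate data and recalibrate, building up the posterior sample
    posterior = [best_fit(cc_fit + rng.normal(0, s_cc, x.size),
                          cox_fit + rng.normal(0, s_cox, x.size))
                 for _ in range(n_realisations)]
    return np.array(posterior)

posterior = data_error_calibration(x, cc_data, cox_data)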
Table 2.1 'True' parameter values for Streeter-Phelps example

Parameter | Value | Unit
koc       | 1     | s-1
kra       | 5     | s-1
Cc(x=0)   | 75    | mgO/l
Cox(x=0)  | 12    | mgO/l
Cos       | 12    | mgO/l
u         | 0.5   | m/s
Table 2.2 Distributional properties of the Cc and Cox residuals

Variable | Property of residuals | Population | Sample   | Standard deviation of sampled property
Cc       | Mean                  | µ = 0      | m = 0    | Std(m) = 2.17
Cc       | Standard deviation    | σ = 10.00  | s = 9.69 | Std(s) = 5.39
Cox      | Mean                  | µ = 0      | m = 0    | Std(m) = 0.43
Cox      | Standard deviation    | σ = 2.00   | s = 2.17 | Std(s) = 0.87
This exercise is repeated with different quantities of synthetic data (i.e. varying the 20
locations shown in Figure 2.5), with the data error population distribution kept the same.
The comparison of calibrated marginal distributions is shown in Figure 2.6. Figure 2.7
gives a similar comparison of the marginal distributions, this time changing the data
quality (i.e. varying the population standard deviations shown in Table 2.2). Note that the
posterior distributions of koc and kra are correlated (correlation coefficient = 0.31),
meaning that the model must be defined by the bi-variate distribution of koc and kra as
opposed to the marginal distributions shown in Figures 2.6 and 2.7.
[Two panels: Cc (mgO/l) and Cox (mgO/l) plotted against distance from the point source (0 to 100,000 m), showing the synthetic data and the maximum likelihood model result.]
Figure 2.5 Synthetic data for Streeter-Phelps model
Note also from Figures 2.6 and 2.7 that the ‘true’ values of koc and kra (1 and 5
respectively) do not necessarily correspond to the identified maximum likelihood values
(see especially the result for 5 data locations in Figure 2.6). This is because the available
data, upon which this calibration is founded, are only a sample of the true water quality.
The fact that the realisations of data error are not independent samples from the true error
distribution, but from an approximation based on one sample, means that the estimates of
parameter uncertainty are of limited significance. Ideally, a large number of independent
realisations of data error would be built up from the error population (e.g. by repeatedly
sampling the river when it is under the same boundary conditions), although doing this
42
effectively would be difficult in practice due to resource constraints and the difficulty of
achieving independent measurement errors.
Figure 2.7 shows that this method of calibration gives a significantly uncertain value for
the parameter kra despite perfect data, which is contrary to intuition. This implies that
adequate convergence of the joint parameter distribution has not been achieved using 0.2
million model runs. Whether this is primarily due to the inefficiency of the random
sampling as an optimisation procedure, or due to the limited number of realisations of the
data, is not investigated here. However, it is clear that the difficulty of achieving
convergence, even for a relatively simple problem such as this, contributes to the
approximate nature of the solution.
It is common in environmental monitoring that data are biased descriptors of the true state
of the environment (Keith 1990, Jarvie and Neal 2002). This may be because of
heterogeneity which is not recognised in the sampling programme, or because of repeated
laboratory errors, or simply because of physical constraints such as a lower bound of
zero. To explore the effect of this, the Cox data are raised by a random amount between
zero and 5 mgO/l, constrained to a minimum of zero and a maximum of Cos. The calibration is done
as before, with 20 data locations, and the calibrated parameter distributions are shown in
Figure 2.8. This shows that where significant bias is present but unquantified, this
approach to calibration fails. Note that the implied parameter uncertainty associated with kra
is significantly reduced, contrary to what we would desire. The effect of
model structure error is similar to that of data bias (at least in this case), in that it biases
parameter estimates and causes inappropriate reduction in parameter uncertainty. In
practice, model structure error is particularly relevant to the Streeter-Phelps model,
because it neglects many of the complexities of pollution transport and decay.
2.4.3 GLUE using a likelihood function as an objective function
The preceding method has shown how Monte Carlo sampling of data error can be used to
derive calibrated parameter distributions. Now it is shown that using a likelihood function
within the GLUE framework offers the opportunity to reduce the computation required by
not explicitly accounting for the data sampling error. Here, GLUE is applied to the
previous Streeter-Phelps example using the data sample illustrated by Figure 2.5. The
likelihood function defined in Equation 2.6 is applied (whereby, for now, we are opting
not to explore the full generality of GLUE, instead maintaining a likelihood function
without any behavioural threshold) along with Equation 2.10 to derive posterior estimates
of probability for each sample of (koc, kra) using a total of 2000 random samples.
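A sketch of this GLUE calculation is given below, reusing objects from the earlier sketches. Equation 2.6 is taken here to be a Gaussian likelihood with the known error standard deviations, and Equation 2.10 the rescaling of the likelihoods to unit total probability mass; both are assumptions about equations presented earlier in the chapter.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000
k_oc = rng.uniform(0.4, 1.6, n)   # random samples of the parameter space
k_ra = rng.uniform(2.0, 8.0, n)

log_lik = np.empty(n)
for i in range(n):
    cc_m, cox_m = streeter_phelps(x, k_oc[i], k_ra[i], u=0.5,
                                  cc0=75.0, cox0=12.0, c_os=12.0)
    log_lik[i] = (-0.5 * np.sum(((cc_obs - cc_m) / 10.0) ** 2)
                  - 0.5 * np.sum(((cox_obs - cox_m) / 2.0) ** 2))

post = np.exp(log_lik - log_lik.max())  # likelihood function, no threshold
post /= post.sum()                      # posterior probability of each sample
```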
[Figure 2.6 Calibrated distributions with different sample sizes and constant residual variance: marginal probability densities of koc (0.4-1.6) and kra (2-8) for 100, 20 and 5 data locations.]
[Figure 2.7 Calibrated distributions with different residual variances and 20 data locations: marginal probability densities of koc and kra for std Cox = 4, 2, 1 and 0, with std Cc = 20, 10, 5 and 0 respectively.]
[Figure 2.8 Effect of Cox error bias (20 data locations): marginal probability densities of koc and kra for biased and unbiased data.]
The probability equi-potentials of the derived point estimates are shown in Figure 2.9. For
comparison with the results of the previous method (Section 2.4.2), the marginal
distributions of koc and kra are illustrated in Figure 2.10. Repeated for other data scenarios,
the results are summarised in terms of calibrated parameter variances in Figure 2.11.
[Figure 2.9 Equi-potentials of point estimate probabilities using GLUE with a likelihood function: probability contours (0.0001 to 0.008) in the (koc, kra) plane, koc = 0.4-1.6, kra = 2-8.]
The similarity of the results from GLUE and those from the data error sampling (Figures
2.10, 2.11) is striking, considering that the GLUE method does not explicitly account for
data sampling error, and has reduced the computation from 0.2 million to 2000 model
runs. The theoretical basis for the similarity can be demonstrated at a simple level.
Equation 2.3 is re-expressed as,
L_{GLUE} = K_{GLUE} \left[ 2\pi \left( \sigma_m^2 + \delta^2 \right) \right]^{-0.5 N_{res}} \exp\left[ -0.5 \left( N_{res} - N_{par} \right) \right] \qquad (2.17)
where LGLUE is the probability of any parameter set, δ² is the variance of the
corresponding model result around the maximum likelihood result, σm² is the error
variance around the maximum likelihood result, and KGLUE is a standardisation constant.
In the data error sampling method, LR is the probability of any parameter set, but δ² is the
variance of the maximum likelihood result for any data realisation around the result for
the available data sample. Approximating the standard error of the maximum likelihood
as normally distributed with variance σm²/Nres (assuming that σm is accurate, see Ang and
Tang 1975) gives,
L_R = K_R \left( \frac{N_{res}}{2\pi \sigma_m^2} \right)^{0.5} \exp\left[ -0.5 \, \frac{\delta^2}{\sigma_m^2 / N_{res}} \right] \qquad (2.18)
As the integrals of Equations 2.17 and 2.18 are both known to be unity, proving that
they give the same result for all parameter samples only requires proving that the ratio LGLUE : LR
is the same for all δ,
\frac{L_{GLUE}}{L_R} = \frac{K_{GLUE}}{K_R} \cdot \frac{\left[ 2\pi \left( \sigma_m^2 + \delta^2 \right) \right]^{-0.5 N_{res}} \exp\left[ -0.5 \left( N_{res} - N_{par} \right) \right]}{\left( N_{res} / 2\pi \sigma_m^2 \right)^{0.5} \exp\left[ -0.5 N_{res} \delta^2 / \sigma_m^2 \right]} \qquad (2.19)
Amalgamating all terms which are independent of δ into one constant K,
\frac{L_{GLUE}}{L_R} = K \left[ \frac{\exp\left( \delta^2 / \sigma_m^2 \right)}{\sigma_m^2 + \delta^2} \right]^{0.5 N_{res}} \qquad (2.20)
Expanding the exponential term into a Maclaurin series, and neglecting terms higher
than quadratic, gives,
\frac{L_{GLUE}}{L_R} \approx K \left[ \frac{\left( \sigma_m^2 + \delta^2 \right) / \sigma_m^2}{\sigma_m^2 + \delta^2} \right]^{0.5 N_{res}} = \frac{K}{\sigma_m^{N_{res}}} \qquad (2.21)
which is constant for all δ. Thus it is shown that Equations 2.17 and 2.18 are describing
the same probability distribution if δ⁴/σm⁴ and higher order terms can be neglected. These
terms may not be negligible if Nres is very low, but in such cases the assumptions
underlying Equations 2.17 and 2.18 are not justifiable anyway. Nevertheless, the theory
presented here supports the experimental results in Figures 2.10 and 2.11, and suggests
that using the likelihood function replicates the data error sampling results by neglecting
higher order uncertainties.
[Figure 2.10 Comparison of calibrated parameters using GLUE with a likelihood function and data error sampling: marginal probability densities of koc and kra for the two methods.]
[Figure 2.11 Comparison of calibrated parameter variances using GLUE, data error sampling and Metropolis: standard deviations of koc and kra plotted against the standard deviation of the Cc data population (0-20) and against the number of data locations (5-95).]
2.4.4 Metropolis using weighted squared errors as an objective function
The Metropolis algorithm (Figure 2.4) is observed to further increase the efficiency (in
terms of time for convergence of (koc, kra) covariance matrix) of the Streeter-Phelps
calibration by up to 60%. The OF is defined as the sum of the variance-weighted squared
errors, i.e.
\mathrm{OF} = \sum_{i=1}^{N_{res}} \frac{1}{\sigma_{m,C_{ox}}^2} \, \varepsilon_{i,C_{ox}}^2 + \sum_{i=1}^{N_{res}} \frac{1}{\sigma_{m,C_c}^2} \, \varepsilon_{i,C_c}^2 \qquad (2.22)
where σm,Cox and σm,Cc are the error population standard deviations (from Table 2.2), and
εi,Cox and εi,Cc are the ith residuals of Cox and Cc respectively, and Nres is 20 as before.
Then, the probability of selecting parameter set αi pursuant to αi-1 is given by Equation
2.15. KM1 is specified as 2, and the maximum permitted step, KM2 is specified individually
for koc and kra as (KM2, koc = 0.05, KM2, kra = 0.25). The data set illustrated by Figure 2.5 is
used. The converged koc and kra distributions are almost identical to those obtained using
the data error sampling method (Figure 2.10) and Figure 2.11 supports this result under a
range of data conditions. From Equation 2.14 it is clear that the Metropolis result is
sensitive to KM1, and it is not a coincidence that this choice of KM1 almost replicates the
data error sampling result. Idealising Equation 2.22 by considering a single response, and
using the definitions for Equations 2.14 and 2.17,
L_{MET} = K_{MET} \exp\left( -\frac{\mathrm{OF}}{K_{M1}} \right) = K_{MET} \exp\left[ -\frac{N_{res} \left( \sigma_m^2 + \delta^2 \right)}{K_{M1} \sigma_m^2} \right] = K_{MET} \exp\left( -\frac{N_{res}}{K_{M1}} \right) \exp\left( -\frac{N_{res} \delta^2}{K_{M1} \sigma_m^2} \right) \qquad (2.23)
Equating this with Equation 2.18 gives,
K_{MET} \exp\left( -\frac{N_{res}}{K_{M1}} \right) \exp\left( -\frac{N_{res} \delta^2}{K_{M1} \sigma_m^2} \right) = K_R \left( \frac{N_{res}}{2\pi \sigma_m^2} \right)^{0.5} \exp\left( -0.5 \, \frac{N_{res} \delta^2}{\sigma_m^2} \right) \qquad (2.24)
and equating the exponents with the δ² terms gives KM1 = 2. The specification of KM1 and
the OF used here is generally applicable to approximation of the standard error of a
maximum likelihood model result assuming a large number of independent Gaussian
errors. As σm is not generally known a priori, updating of KM1 within the algorithm may
be useful. While Metropolis is an adaptive search, and therefore potentially superior to
GLUE for finding the maximum likelihood and variance, the number of samples it retains
from extreme values is, by definition, relatively small.
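The following sketch outlines the random walk used here, again reusing the earlier objects. The acceptance rule is a standard Metropolis step with the OF of Equation 2.22 and KM1 = 2; the reflection details of Figure 2.4 are reduced to simple clipping at the parameter bounds, and the starting point is illustrative.

```python
import numpy as np

rng = np.random.default_rng(3)

def of(k):
    """Variance-weighted sum of squared errors (Equation 2.22)."""
    cc_m, cox_m = streeter_phelps(x, k[0], k[1], u=0.5,
                                  cc0=75.0, cox0=12.0, c_os=12.0)
    return (np.sum((cox_obs - cox_m) ** 2) / 2.0 ** 2
            + np.sum((cc_obs - cc_m) ** 2) / 10.0 ** 2)

k_m1 = 2.0
step = np.array([0.05, 0.25])              # K_M2 for k_oc and k_ra
lo, hi = np.array([0.4, 2.0]), np.array([1.6, 8.0])
k = np.array([1.0, 5.0])                   # illustrative starting point
of_k = of(k)
chain = []
for _ in range(5000):
    cand = np.clip(k + rng.uniform(-step, step), lo, hi)
    of_cand = of(cand)
    # accept with probability min(1, exp(-(OF_cand - OF_k) / K_M1))
    if rng.random() < np.exp(min(0.0, -(of_cand - of_k) / k_m1)):
        k, of_k = cand, of_cand
    chain.append(k.copy())                 # retained sample of (k_oc, k_ra)
```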
2.4.5 GLUE using a subjective GLUE likelihood as an objective function
A simple demonstration is now given of how the parameter uncertainty can be increased,
to safeguard against underestimating uncertainty due to the presence of structural or data
bias, using a subjective GLUE likelihood. The Nash-Sutcliffe
efficiency (Nash and Sutcliffe 1970) is a widely used objective function that measures the
proportion of the variance of the data about the data mean that is explained by the
model, and it may be regarded as a sensible (although subjective) measure of the
relative belief in alternative models. In this example, the average value of the Nash-
Sutcliffe efficiency for the Cc data and that for the Cox data is used as the objective
function,
\mathrm{OF} = 0.5 \left[ \left( 1 - \frac{\sum_{i=1}^{N_{res}} \varepsilon_{i,C_{ox}}^2}{\sum_{i=1}^{N_{res}} \left( \bar{C}_{ox,i} - \mu_{\bar{C}_{ox}} \right)^2} \right) + \left( 1 - \frac{\sum_{i=1}^{N_{res}} \varepsilon_{i,C_c}^2}{\sum_{i=1}^{N_{res}} \left( \bar{C}_{c,i} - \mu_{\bar{C}_c} \right)^2} \right) \right] \qquad (2.25)
where Cox and Cc represent the model results for dissolved oxygen and oxygen demand
respectively, C̄ox and C̄c the corresponding observations (with μ denoting their means), and Nres = 20 as before.
All values of this objective function (evaluated for 2000 random samples) equal to or below 0.4 are considered
non-behavioural and are discounted. The values greater than 0.4 are divided by a constant
so that they sum to one, and these are considered to be values of relative probability.
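A sketch of this behavioural-threshold variant is given below, reusing the 2000 parameter samples of the previous GLUE sketch.

```python
import numpy as np

def nse(obs, mod):
    """Nash-Sutcliffe efficiency (Nash and Sutcliffe 1970)."""
    return 1.0 - np.sum((obs - mod) ** 2) / np.sum((obs - obs.mean()) ** 2)

of_ns = np.empty(n)
for i in range(n):
    cc_m, cox_m = streeter_phelps(x, k_oc[i], k_ra[i], u=0.5,
                                  cc0=75.0, cox0=12.0, c_os=12.0)
    of_ns[i] = 0.5 * (nse(cc_obs, cc_m) + nse(cox_obs, cox_m))  # Equation 2.25

glue_post = np.where(of_ns > 0.4, of_ns, 0.0)  # OF <= 0.4 is non-behavioural
glue_post /= glue_post.sum()                   # rescale to relative probability
```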
The equi-potentials of probability in the parameter space are shown in Figure 2.12. These
indicate that the parameter uncertainty is substantially greater than that in Figure 2.9
derived using the likelihood function. The 0.4 threshold is used arbitrarily in this
example; higher values reduce parameter uncertainty (e.g. a value of 0.75 resulted in less
parameter uncertainty than the likelihood function method) and lower values increase it
(e.g. a value of zero meant that almost the whole a priori parameter space was
behavioural).
In the Metropolis algorithm, to produce a similarly contrived posterior distribution, KM1
and the OF could be modified from the values used in Section 2.4.4.
[Figure 2.12 Equi-potentials of point estimate probabilities using GLUE with a GLUE likelihood based on the Nash-Sutcliffe efficiency: contours (0.0000 to 0.0008) in the (koc, kra) plane. Note: the zero contour is the boundary between behavioural and non-behavioural parameter sets.]
2.4.6 HSY using a possibilistic objective function
Now consider the HSY method of Hornberger and Spear (1980). A set of characteristic
system responses is defined, and a sampled parameter set is given a possibility of 1 (P(δ)
= 1) if the model result falls wholly within pre-specified lower and upper limits. For the
Streeter-Phelps example, those limits are on Cox and Cc (Coxl, Coxu and Ccl, Ccu respectively),
i.e.
P(\delta) = 1 \quad \text{if } C_{oxl} < C_{ox} < C_{oxu} \,\cap\, C_{cl} < C_c < C_{cu} \text{ at all } N_{res} \text{ locations} \qquad (2.26a)
P(\delta) = 0 \quad \text{for all other results} \qquad (2.26b)
For example, if the upper and lower limits are taken to be the 90% confidence limits of the
data sample (i.e. 1.28 × the standard deviation in Table 2.2) around its maximum
likelihood model result (denoted here by C'ox and C'c) then, for all Nres data points,
C_{oxl} = C'_{ox} - 1.28 \times 2.17 \qquad (2.27a)
C_{oxu} = C'_{ox} + 1.28 \times 2.17 \qquad (2.27b)
C_{cl} = C'_c - 1.28 \times 9.69 \qquad (2.27c)
C_{cu} = C'_c + 1.28 \times 9.69 \qquad (2.27d)
The corresponding possible set of (koc, kra) is represented in Figure 2.13. This was derived
using 10000 random samples from the parameter space. Note that, as opposed to Figures
2.9 and 2.12, the set limits defined in Figure 2.13 are not smooth due to the discontinuous
nature of Equation 2.26. The HSY method is potentially more robust to model error and
data bias than likelihood functions because results such as those illustrated in Figure 2.8
can be avoided by increasing parameter uncertainty using appropriate specification of the
upper and lower bounds of characteristic response. Of course, improvement in robustness
is at the expense of a less informative, more subjective description of uncertainty.
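The sketch below implements this possibilistic classification (Equations 2.26 and 2.27), reusing the maximum likelihood results and sample standard deviations from the data error sampling sketch.

```python
import numpy as np

rng = np.random.default_rng(4)
cox_l, cox_u = cox_ml - 1.28 * s_ox, cox_ml + 1.28 * s_ox   # Equations 2.27a,b
cc_l, cc_u = cc_ml - 1.28 * s_c, cc_ml + 1.28 * s_c         # Equations 2.27c,d

possible = []   # the possible set of (k_oc, k_ra), as in Figure 2.13
for _ in range(10_000):
    k = (rng.uniform(0.4, 1.6), rng.uniform(2.0, 8.0))
    cc_m, cox_m = streeter_phelps(x, *k, u=0.5, cc0=75.0, cox0=12.0, c_os=12.0)
    # P = 1 only if the result lies within the limits at all N_res locations
    if np.all((cox_l < cox_m) & (cox_m < cox_u)
              & (cc_l < cc_m) & (cc_m < cc_u)):
        possible.append(k)
```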
[Figure 2.13 The set of possible (koc, kra): the possible region in the (koc, kra) plane, koc = 0.4-1.6, kra = 2-8.]
2.4.7 Summary of this demonstration of calibration
Notwithstanding its demonstrative limitations (see below), this example compares data
error sampling, GLUE and Metropolis and shows that these methods are not
fundamentally different in so far as they can produce the same calibration results, given
consistent objective functions. It has also been shown that using more subjectively-
founded objective functions, specifically possibilistic measures and the Nash-Sutcliffe
efficiency, leads to a different magnitude and nature of uncertainty from that achieved using
likelihood functions, and that those results are sensitive to the subjective judgements
employed.
With regard to more complex and more realistic environmental modelling, the above
example has several important limitations. Firstly, it only has two inter-dependent
parameters, while many models have significantly more. In such cases converging the
posterior joint distribution would be expected to be much more difficult, perhaps
requiring many thousands of model runs (e.g. Thyer et al. 1999) depending on the
strength and nature of the interactions. Secondly, the response surface, which is illustrated
by Figure 2.9, is well behaved. Many practical problems involve multi-modal responses
together with discontinuities derived from the discontinuities in the model structure, again
increasing the difficulty of convergence. Thirdly, Equation 2.16 is an analytical solution
to the Streeter-Phelps model that can be evaluated easily and quickly, facilitating Monte
Carlo methods. Models of the environment are more often in the form of systems of
differential equations to which approximate numerical solutions are required and
computational demands are relatively high (see the discussion in Chapter 5). While
computer power is continuously increasing and parallel processing facilities are available,
computation time remains a limitation in model calibration and uncertainty analysis.
Lastly, the data have been synthesised from a normal population of residuals which are
uncorrelated and have zero mean. Only a nominal look at the effects of data bias has been
included.
2.5 Uncertainty propagation
Uncertainty propagation in this context means propagating the calibrated parameter joint
distribution to a stochastic result. Methods of propagating probability distributions can be
classified as variable transformation methods, sampling methods, point estimation
methods and variance propagation methods. An alternative to probability theory is the
theory of possibility (Zadeh 1978). Each of these approaches except variable
transformations (although see the demonstration of the Mellin transform by McIntyre
2000) is discussed here.
2.5.1 Monte Carlo methods
Monte Carlo (MC) simulation applied to uncertainty propagation means generating
discrete parameter sets according to their probability or possibility distribution, and
running a simulation using each set. Alternatively, the parameter set samples and
associated probability masses which were derived during calibration can be recalled,
thereby avoiding the need for assumptions regarding the form of the distribution. The
results of multiple simulations give a close approximation to the analytical form of the
probability density function (PDF) using frequency analysis, and any model can be easily
included in such a framework with minimal input from the modeller. For these reasons,
MC is a well-used method of uncertainty propagation. The main disadvantage of MC is
that a great number of model runs may be required to reliably represent all probable
results, especially when there are many random variables. Methods of estimating a
preferred number of samples are available (e.g. Cochran 1977), although this also
depends on the convergence or divergence during propagation and therefore is case
specific (Tellinghuisen 2000). Stratified random sampling and Latin hypercube sampling
(see MacKay et al. 1979) are often used to improve efficiency.
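As an illustration, the sketch below recalls the GLUE samples and probability masses from the earlier calibration sketches and propagates them to weighted confidence limits on Cox; the weighted-quantile helper is written for this example.

```python
import numpy as np

x_pred = np.linspace(0.0, 100_000.0, 201)
cox_runs = np.empty((n, x_pred.size))
for i in range(n):
    _, cox_runs[i] = streeter_phelps(x_pred, k_oc[i], k_ra[i], u=0.5,
                                     cc0=75.0, cox0=12.0, c_os=12.0)

def weighted_quantile(values, weights, q):
    """Quantile of a discrete distribution of values with probability masses."""
    order = np.argsort(values)
    cum = np.cumsum(weights[order])
    return values[order][np.searchsorted(cum, q)]

mean_cox = post @ cox_runs                       # probability-weighted mean
lower = np.array([weighted_quantile(cox_runs[:, j], post, 0.05)
                  for j in range(x_pred.size)])  # 90% confidence limits
upper = np.array([weighted_quantile(cox_runs[:, j], post, 0.95)
                  for j in range(x_pred.size)])
```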
2.5.2 First order and point estimate approximations
First-order variance propagation is the most common method of uncertainty propagation
(Beck 1987). If a function Y=f(X), where Y= y1, y2, … yNvar and X=x1, x2, … xNpar, is
approximated by a first-order Taylor series expansion around the expected X, µX, then,
\mu_Y = f(\mu_X) \qquad (2.28a)
\sigma_Y^2 = \Delta(Y)^T \, \Psi(X) \, \Delta(Y) \qquad (2.28b)
where ∆(Y) is the Npar × Nvar matrix of derivatives of Y with respect to X; ψ(X) is the Npar
× Npar covariance matrix of X, and µY and σY2 are the Nvar × 1 vectors of mean and
variance of Y. This is a linear approximation of uncertainty propagation which is only
completely reliable for linear models. The accuracy of this method for non-linear models
can be improved by using a higher order Taylor series expansion, but this becomes
computationally demanding, especially if the derivative values are calculated
numerically. Variance propagation is a useful method for models which can be
approximated by quasi-linearisation (e.g. Kitanidis and Bras 1980), i.e. a series of
localised linear functions.
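A sketch of Equation 2.28 with numerically evaluated derivatives follows; the mean vector and covariance matrix are illustrative placeholders standing in for the calibrated (koc, kra) statistics, and streeter_phelps() and x are reused from the earlier sketches.

```python
import numpy as np

mu = np.array([1.0, 5.0])                        # illustrative posterior mean
cov = np.array([[0.010, 0.005],
                [0.005, 0.160]])                 # illustrative covariance

def model(k):
    """Model output vector Y: Cox at the data locations."""
    return streeter_phelps(x, k[0], k[1], u=0.5,
                           cc0=75.0, cox0=12.0, c_os=12.0)[1]

mu_y = model(mu)                                 # Equation 2.28a
grad = np.empty((2, mu_y.size))                  # Delta(Y): derivatives of Y
for p in range(2):
    dk = np.zeros(2)
    dk[p] = 1e-4 * mu[p]                         # central finite differences
    grad[p] = (model(mu + dk) - model(mu - dk)) / (2.0 * dk[p])
var_y = np.einsum('pi,pq,qi->i', grad, cov, grad)   # Equation 2.28b
```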
Rosenblueth’s point estimation method for symmetric and non-symmetric variable
distributions (Rosenblueth 1981) aims to reduce the computational demands of variance
propagation by eliminating the calculation of derivatives. The PDF of each random input
variable is represented by a number Np of discrete points, located according to the first,
second and third moments of the PDF. The joint PDF of Npar parameters is represented by
the array of projected points. Therefore, Np^Npar points are used. Each point is assigned a
mass according to the third moment and the correlation matrix. All points are propagated
discretely to Np^Npar solutions and the first moment is the weighted average; the second
moment, that of the squares; and the third moment, that of the cubes. Most usually a 2-
point scheme is used, whereby 2^Npar points are required. For symmetrical distributions for
Npar > 2, the number of evaluations can be reduced to 2Npar by using Harr’s point
estimation method (Harr 1989). Harr’s method is a useful improvement on Rosenblueth’s,
but is limited by the necessity of symmetrical distributions. A similar approach which
allows for skewed distributions but not correlations is described by Hong (1998).
Protopapas and Bras (1990) have applied Rosenblueth’s 2-point method to a rainfall-
runoff model and Yeh et al. (1997) have similarly applied Harr’s method, and a useful
review of all these point estimate methods is given by Christian and Baecher (1999).
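For two correlated, symmetrically distributed parameters, the 2-point scheme reduces to four projected points with masses (1 ± ρ)/4, sketched below using the illustrative mean, covariance and model() helper of the preceding first-order sketch.

```python
import numpy as np
from itertools import product

sd = np.sqrt(np.diag(cov))
rho = cov[0, 1] / (sd[0] * sd[1])

m1 = 0.0   # first moment: weighted average of the propagated points
m2 = 0.0   # second moment: weighted average of the squares
for s1, s2 in product((-1, 1), repeat=2):      # 2^Npar = 4 points at mu +/- sd
    mass = (1.0 + s1 * s2 * rho) / 4.0         # mass from the correlation
    y = model(mu + np.array([s1, s2]) * sd)
    m1 += mass * y
    m2 += mass * y ** 2
var_y_rb = m2 - m1 ** 2                        # variance of the model output
```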
It is worth noting that some Monte Carlo-based methods of calibration, for example
GLUE (Beven and Binley 1992), are randomised point estimate methods. In GLUE,
many random samples of the a priori parameter space are assigned GLUE likelihoods,
then these become point estimates for the uncertainty propagation stage. Unlike
Rosenblueth’s method, it is generally assumed that there are enough points to derive the
model output PDF, not just the lower moments.
2.5.3 Possibility theory
Possibility theory (Zadeh 1978) offers a robust alternative to propagation of probability
distributions. To illustrate this, let f(x1, x2) be a function which is strictly increasing or
decreasing with respect to both variables x1 and x2, and let x1 and x2 be independent and have
possibility distributions which rise internally to a single peak or plateau. Then, using the
rules of union in Equation 2.13(b), the two values of x1 and the two of x2 with possibility
P (called the P-level α-cut of x1 and x2) define the two values of f(x1, x2) with possibility
P. For example, if df/dx1 is positive for all x1 and df/dx2 is negative for all x2, the
upper and lower bounds of the function, with P = 0, are calculated from,
f(x_1, x_2)_u = f(x_{1u}, x_{2l}) \qquad (2.29a)
f(x_1, x_2)_l = f(x_{1l}, x_{2u}) \qquad (2.29b)
This method can be extended to problems with many uncertain parameters so long as the
aforementioned ‘increasing-decreasing’, ‘single peak’ and independence conditions are
met. While an infinite number of α-cuts are required for the exact solution to a non-linear
problem (Wierman 1996), an approximation of the propagated possibility distribution can
be made with a small number of computations. The associated difficulties and limitations
should be recognised. Firstly, special attention must be given to the method of calibration
in order to derive meaningful parameter possibility distributions. Secondly, the possibility
is greater than probability at all points (Zadeh 1978), and so the former is a less specific
descriptor of uncertainty. Thirdly, if parameter α-cuts are to be used, prior knowledge of
the sensitivity of results to the parameters is required. Lastly, there remains the problem
of parameter inter-dependence which, as in probability theory, complicates the analysis,
generally requiring that the output possibility distribution be defined by taking a large
sample of parameter sets.
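The sketch below propagates α-cuts through a toy function satisfying the stated conditions; the triangular possibility distributions are an illustrative choice, not taken from the thesis.

```python
def alpha_cut(peak, lo, hi, p):
    """Interval of a triangular possibility distribution at level p in [0, 1]."""
    return lo + p * (peak - lo), hi - p * (hi - peak)

def f(x1, x2):
    return x1 / x2   # toy function: df/dx1 > 0 and df/dx2 < 0 everywhere

for p in (0.0, 0.5, 1.0):                  # a small number of alpha-cuts
    x1_l, x1_u = alpha_cut(1.0, 0.4, 1.6, p)
    x2_l, x2_u = alpha_cut(5.0, 2.0, 8.0, p)
    f_u = f(x1_u, x2_l)                    # Equation 2.29a: upper bound
    f_l = f(x1_l, x2_u)                    # Equation 2.29b: lower bound
    print(f"possibility level {p}: f in [{f_l:.3f}, {f_u:.3f}]")
```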
2.6 Propagation of the Streeter-Phelps model parameters
The joint (koc, kra) distribution previously identified using GLUE with the likelihood
function (Figure 2.9) is propagated to give spatially varying distributions of Cc and Cox.
Again, the boundary conditions defined in Table 2.1 are used. Firstly, each of the 2000
samples of (koc, kra) and corresponding probability mass is propagated through the
Streeter-Phelps model, then the first-order variance and Rosenblueth 2-point methods are
applied, using the covariance matrix of (koc, kra) derived from the same parameter set
probabilities. It is observed that the three alternative methods give (practically) identical
results for the first three moments and the same 90% confidence limits on the output of Cc
and Cox against x (Figure 2.14). This similarity indicates the numerical efficiency of the
first-order variance and Rosenblueth 2-point methods, despite the apparent non-linearity
of the model with respect to koc and kra. In fact, the model is only significantly non-linear
at low values of Cox, and the performance of the first-order method deteriorates with
either increased Cc loading or increased data uncertainty.
[Figure 2.14 Propagated uncertainty using GLUE, first order analysis, and Rosenblueth’s two-point method (all give the same 90% confidence intervals): Cc and Cox (mgO/l) against distance from the point source, showing the data, the modelled 90% confidence limits, and the 90% confidence limit on data error variance around the maximum likelihood. The uncertainty in the model parameters was derived using a likelihood function.]
Using the likelihood function of Equation 2.6 has meant the modelled 90% confidence
limits in Figure 2.14 represent the uncertainty in the maximum likelihood solution, and
not the variance of the data error. Therefore, the posterior models derived from the
likelihood function do not reproduce the derived 90% confidence limits on the data error,
illustrated in Figure 2.14. Figure 2.15 shows the different modelled 90% confidence
limits that are obtained when using the Nash-Sutcliffe efficiency GLUE likelihood (i.e.
the set shown in Figure 2.12), and Figure 2.16 shows the limits of possibility obtained
using the possible set (shown in Figure 2.13). It is seen that the GLUE
likelihood/possibility measures used to obtain Figures 2.15 and 2.16 can produce wider
confidence limits than those based on likelihood functions, and the modeller might
consider contriving such objective functions to allow for the additional uncertainty due to
structural error or data biases. For example, the objective function could be designed to
force the modelled confidence limits to include a visually satisfactory proportion of the
data. The effect of calibration data bias on predicted confidence intervals is illustrated by
McIntyre et al. (2001), and the effect of model structural error is examined further in
Chapter 6 of this dissertation.
Also, it may be noted from Figures 2.14, 2.15 and 2.16 that model results are constrained
by the fixed boundary conditions (e.g. Cox(0) = 12 mgO/l), irrespective of parameter
uncertainty. Therefore, if the boundary conditions are not precisely known, the modeller
must treat them, as well as the parameters, as random variables.
[Figure 2.15 Propagated uncertainty using GLUE: as Figure 2.14, but with the uncertainty in the model parameters derived using a GLUE likelihood based on the Nash-Sutcliffe efficiency.]
The example of the Streeter-Phelps model has illustrated that alternative methods of
propagation of parametric uncertainty can lead to practically the same result. However,
the example is too simple to fully show the limitations of the reviewed methods. There
may be strong non-linear dependency of parameters which must be approximated by a
covariance or correlation coefficient in Rosenblueth’s method and the first-order variance
method, leading to a poor approximation of prediction uncertainty. Practical
environmental models often include cyclic, non-continuous, systems of ODEs, or
otherwise non-linear mathematics which will test all methods more severely than was
attempted here.
[Figure 2.16 Propagated uncertainty using possibilistic combination of parameter sets: Cc and Cox (mgO/l) against distance from the point source, showing the data, the modelled limits of possibility, and the 90% confidence limit on data error variance around the maximum likelihood.]
2.7 Summary
Imprecision in environmental modelling stems from the approximate nature of the
models, and from the inevitable difficulty of identifying a single ‘best’ model given the
limitations in our prior knowledge and in the information retrievable from field data. In
general, it may be said that the natural environment is too complex, with too many
heterogeneities and apparently random influences, to be usefully described without
including some estimation of uncertainty. The inclusion of uncertainty analysis adds to a
conventional modelling exercise in two main ways. Firstly, the calibration of model
parameters involves identification of parameter distributions rather than single parameter
values. Secondly, the parameter distributions (and alternative model structures if used)
are propagated to stochastic rather than deterministic results.
Estimation of parameter uncertainty is traditionally achieved through Bayesian
manipulation of likelihood functions which measure the probability of a data sample
occurring given a model. This is an attractive approach as it gives a result based on
assumptions about error distributions that can be clearly defined and are often statistically
auditable. However, in river quality modelling the inevitable presence of model structure
error and data biases complicates the analysis. When these errors are unknown and
neglected due to the lack of prior knowledge, lack of field data and lack of resources to
improve upon this situation, then using likelihood functions will generally underestimate
the true model uncertainty. In that case, alternative, more subjective measures of
uncertainty become attractive, including possibilistic measures (related to fuzzy set
theory) and Generalised Likelihood Uncertainty Estimation (GLUE). These tools allow
an objective function to be designed as a measure of the relative belief in alternative
models. The subjectivities involved mean that estimated uncertainty may be seen as
specific to the modeller rather than specific to the modelled system. It is therefore
important that the chosen objective function, as a basis for the estimated uncertainty, is
made explicit so that it is open to review and results can be properly interpreted.
This chapter has introduced these methods of uncertainty-based model calibration, and
has demonstrated their various similarities, differences and limitations with a simple
model of dissolved oxygen with synthetic data. This demonstration and discussion has
provided a background which will allow later results (Chapters 5, 6 and 7), which are
largely based on GLUE, to be properly interpreted and critically reviewed.
3. An overview of river water quality modelling theory and commonly used
modelling tools
The main components of water quality models are identified as transport models,
thermodynamic models, and water quality process models. For each, the theories and
alternative concepts used in model development are summarised, and their application to
some commonly used modelling tools is noted. A compendium of previous reviews on
modelling developments, tools and theory is given.
3.1 Introduction
The purpose of this chapter is to give context for the design of the water quality models
used in this Thesis. The current state of the art of water quality modelling, and its
development throughout the 20th century, are reviewed by Orlob (1992), Ambrose et al.
(1996), Chapra (1997) and Rauch et al. (1998). Currently applied theory and
implementations are described in detail by Bowie et al. (1985), Thomann and Mueller
(1987) and Chapra (1997). Somlyody et al. (1998) and Thomann (1998) review how the
state of the art may be developed in the future. Here, a brief history of the subject and a
summary review of current theory and practice, as pertinent to this Thesis, are given.
3.2 Developments
Early in the 20th century, two fundamental realisations were made with regard to human
impact on river water quality: (1) a reasonable quality of life for the metropolitan
population cannot be sustained without some formal wastewater treatment; and (2) the optimal
design of wastewater treatment installations depends heavily on the natural assimilative
capacity of the receiving environment. With this design-orientated motivation, Streeter
and Phelps (1925) produced the first significant model of dissolved oxygen and organic
carbon in rivers. This model is based on the assumption that the organic carbon
decomposes aerobically at a first order rate koc, so that the concentration of easily
biodegradable organic carbon Cc (in units of oxygen demand) within any volume of water
(which has no flux of organic carbon across its boundaries), is described by the
differential equation,
\frac{dC_c}{dt} = -k_{oc} C_c \qquad (3.1)
As the aerobic decomposition consumes oxygen, there is a corresponding decrease in the
concentration of dissolved oxygen. It is also assumed that there is oxygen exchange with
the atmosphere at a rate proportional to the oxygen deficit, where the deficit is the
saturation concentration Cos minus the modelled concentration Cox,
\frac{dC_{ox}}{dt} = -k_{oc} C_c + k_{ra} \left( C_{os} - C_{ox} \right) \qquad (3.2)
where kra is the reaeration rate. These equations can be solved analytically to give the
Streeter-Phelps equation for river dissolved oxygen (Equation 2.16(b)). The fundamental
assumption used in the derivation of this model is that there is no flux of organic carbon
across the boundaries of each unit volume of water, nor any of oxygen except that due to
re-aeration at rate kra. Firstly, this implies that there are no sources of Cc or Cox at any non-zero
x, which means the model is limited to a single point source. Secondly, this means
that flow is steady and uniform, with no dispersion in the direction of x. Another
important assumption is that koc and kra are constant parameters, uninfluenced by dynamic
environmental conditions.
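For a parcel of water moving downstream, Equations 3.1 and 3.2 can also be integrated numerically, as sketched below with scipy; the parameter values repeat the earlier synthetic example, with rates again assumed per day, and the travel time t = x/u maps the solution back to distance.

```python
import numpy as np
from scipy.integrate import solve_ivp

k_oc, k_ra, c_os = 1.0, 5.0, 12.0     # 1/day, 1/day, mgO/l

def rhs(t, y):
    cc, cox = y
    dcc = -k_oc * cc                          # Equation 3.1
    dcox = -k_oc * cc + k_ra * (c_os - cox)   # Equation 3.2
    return [dcc, dcox]

# integrate over ~2.3 days, the travel time of a 100 km reach at u = 0.5 m/s
sol = solve_ivp(rhs, (0.0, 2.3), [75.0, 12.0], dense_output=True)
cc_end, cox_end = sol.y[:, -1]
```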
Despite the limitations of the original Streeter-Phelps model, its simple analytical solution
and proven validity as an approximate model, allowed it to be usefully applied for many
decades. However, since the 1920s (at least in ostensibly developed countries),
motivations arose for models which address the limitations of Streeter-Phelps, and which
give more widely applicable and more accurate results:
- New pollutants emerged, and new knowledge about their social and environmental significance meant that they could not be neglected. Consequently, the standards for river water quality in developed countries became more specific.
- Wastewater treatment technology became available to meet such standards, but at a price which had to be justified before the investment by the expected environmental or social improvement.
- Drinking water standards increased, as did the sophistication and expense of water treatment plants. This meant new demand for models which could predict the water quality at the abstraction point as a function of upstream loading.
- Improvements in monitoring methods meant that a direct measure of performance of pollution control, and of water quality models, was possible.
- National bodies were formed and made responsible for the regulation of river water quality, including the setting of discharge consents which recognised the natural assimilative capacity of the river system.
In the early 1960s, computer technology reached a stage where numerical solutions were
feasible. Orlob (1992) reviews the developments from that point onwards. The Streeter-Phelps
model was extended to decay rates which varied spatially and with temperature. Non-linear
differentials using Monod kinetics were introduced, along with heat exchange models and the
coupling of hydrodynamic models. In the early 1970s, the US EPA funded a model called
QUAL2, which could simulate systems of rivers at steady or unsteady flow, and allowed
for nitrogen oxygen demand. The 1970s also brought numerous attempts to model the
eutrophication process, after the environmental state of many lakes was severely damaged
by excessive algae growth. Since then, eutrophication has continued to be a major
problem in developed countries. The challenges of eutrophication modelling (e.g. food
chain interactions; heterogeneity in space, time and species; the importance of non-point
pollution sources; and the inevitable error in measurement procedures) have played a
major role in motivating water quality modelling research in the last 20 years.
In the 1980s and 1990s, the utility of water quality models has been fully recognised and
applied by governmental bodies and commercial organisations. Improved graphical and
menu-driven user interfaces have made models more marketable, and modelling tools for
a variety of specific applications are available e.g. QUAL2E (Brown and Barnwell 1987),
WASP5 (Ambrose et al. 1993), DESERT (Ivanov et al. 1996), QUASAR (Whitehead et
al. 1997a), OTIS (Runkel 1998), SIMCAT (UK Environment Agency 2001a), MIKE11
(DHI 2000), RWQM1 (Shanahan et al. 2000), ISIS (Wallingford Software 2002) and CE-
QUAL-W2 (Cole and Wells 2000). A comparative review of most of these models, and
some others not mentioned here, is given by Ambrose et al. (1996). Although they have
common elements, each of these models has specific features aimed at developing the
state of the art of water quality modelling. Separate modelling developments, not
generally integrated into models, have included sediment-water interactions (e.g. Di Toro
and Fitzpatrick 1993), micro-pollutant modelling (e.g. Chapra 1991), oil slick modelling
(e.g. Shen and Yapa 1993), inclusion of river ice processes (e.g. Shen and Chaing 1984,
Lal and Shen 1993), and inclusion of the higher food chain (e.g. Thomann 1989). Such
progress has been achieved through research into the physical processes occurring in the
river, and numerical representation of these processes in mechanistic models. As a result
of elaborated methods, increasing demands on the models and increased computer power,
the number of dependent variables and the spatial refinement of the models have increased
dramatically since the 1960s. This is illustrated in Figure 3.1 (based on a number of
modelling exercises in the USA, adapted from Thomann 1998).
[Figure 3.1 Increasing sophistication of water quality models (from Thomann 1998): the number of dependent state variables (A), the number of spatial compartments (B) and the number of interactive components (A × B), on a logarithmic scale from 1 to 10,000,000, plotted against year from 1920 to 2000.]
A somewhat contrary development since the late 1970s was the realisation that the
multitude of physical processes which potentially affect the water quality cannot be
comprehensively identified and measured, and that resources to support use of complex
models are often in practice not available (Reckhow 1994). Therefore, as argued in
Chapter 1, a more realistic approach is to aggregate the many complex processes into a
limited number of model equations and parameters which represent the modeller’s
concept of a simplified river environment.
The remainder of this chapter reviews the physical and conceptual representations of the
river environment which are presently in use in river water quality modelling.
3.3 The components of a river water quality model
It is recognised that river water quality is strongly dependent on the river flow, the water
depth and the water temperature (Thomann and Mueller 1987). Therefore, in general,
river water quality models have the following distinct sub-models:
- the hydraulic model,
- the thermodynamic model,
- the water quality process model.
The three sub-models can be idealised as forming a serial structure with the water quality
model at the end (affected by both the thermodynamic and hydraulic models), the
thermodynamic model in the middle (affected by only the hydraulic model), and the
hydraulic model at the start (physically independent of the other two). This is illustrated
in Figure 3.2. Possible exceptions to this one-way series of dependencies are the
calculation of evaporation as a function of water temperature, and the interaction of ice
growth with the hydraulic regime (Ashton 1986).
[Figure 3.2 The basic sub-models which make up a water quality model (adapted from Thomann and Mueller 1987): primary interactions, forming the basis of water quality models, run from the hydraulic model to the thermodynamic model to the water quality model; secondary interactions in the reverse directions are generally neglected.]
3.3.1 Hydraulic and routing models
An extensive review of the state of the art of river hydraulic and solute transport models
is given by Camacho (2000).
An adequate characterisation of the hydraulic state of the river is fundamental to the
success of the water quality model. The hydraulic state of the river strongly affects
various thermodynamic and kinetic processes. Examples of important hydraulic variables
are listed below:
- flow rate (affects dilution),
- solute and solids retention time (affects mass loss or gain due to various processes),
- water surface velocity and area (affect aeration and heat exchange),
- infiltration rates (affect mass losses),
- turbulence (affects dispersion),
- river bed shear velocity (affects sediment resuspension).
Models generally divide the water body into segments. In estuary, offshore and some lake
applications, where there is a significant flow element in more than one direction, it can
be valuable to use two or three dimensional segmentation (Watanabe et al. 1983). Rivers
tend to be relatively well-mixed over their depth and width, and the commonly adopted
approach is to segregate the river only lengthwise, i.e. to use a one-dimensional model.
An exception is when a sewage discharge or river confluence is to be studied in some
detail, for which near-field mixing models are available (see Rutherford 1994).
[Figure 3.3 Options for modelling the longitudinal flow and solute transport in a river. Flow modelling options: transfer function (empirical); routing through control volumes (conceptual), either quasi-steady, linear or non-linear; flow and momentum balance (physically-based), as kinematic wave, diffusion wave or St. Venant equations. Solute transport options: pure advection, advection-dispersion, aggregated dead zone, transient storage, or fully mixed cells in series.]
Options for modelling the longitudinal flow and solute transport in a river are summarised
in Figure 3.3. Modelling the hydraulics on a physical basis, assuming that the only forces
affecting the hydraulics are in the longitudinal dimension, requires a discretised
solution of the mass and momentum balance equations known as the Saint Venant
equations. Such solutions allow unsteady flow conditions to be accurately simulated if
there is adequate supporting data and/or measurements of the channel characteristics.
These solutions are most useful when the simulation of unsteady flow conditions is
essential to the success of the water quality model, e.g. in pollution spills or in dynamic
runoff events. However, such models are not computationally easy to solve especially
when there are complex boundary conditions (for example hydraulic structures within the
studied length of river), or when numerical stability and accuracy criteria require a very
refined discretisation. Depending on the nature of the problem, the acceleration terms
may be neglected, giving the diffusion wave, kinematic wave, or gradually varied flow
equations (Chapra 1997: 251). Recent work has shown that unsteady flow conditions can
be adequately simulated using more efficient conceptual routing models (Camacho 2000).
Using these, the river is divided into a series of reaches, with each reach generally sub-
divided into cells1. This is similar to the spatial discretisation used for the Saint Venant
solution. However, instead of using momentum balance and mass balance to route the
flow, the conceptual models employ one or both of the following methods;
1. lag the flow between cells by some time α,
Q_{(i,j)} = Q_{(i-1,\, j-\alpha)} \qquad (3.3)
(where Q(i,j) is the flow in cell number i at time step j; this convention of subscripts i
and j for cell number and time-step is used throughout this dissertation) then calculate
the water depth, velocity and hydraulic retention time using either a physically-based
formula, e.g. Manning’s equation (Chow 1959), or an empirical stage-discharge
relationship.
2. attenuate the flow by regarding the cells as reservoirs with the flow out of them
dependent on the water volume,
V_{(i,j)} = V_{(i,j-1)} + Q_{(i-1,j-1)} \Delta t - Q_{(i,j-1)} \Delta t \qquad (3.4a)
Q_{(i,j)} = q_1 V_{(i,j)}^{q_2} \qquad (3.4b)
jiji VqQ = (3.4b)
(where V(i,j) is the volume of water in cell i at time j and ∆t is the specified time-step,
and q1 and q2 are empirical routing parameters) then calculate the water depth, velocity
and hydraulic retention time directly from Q and V, together with the channel shape
parameters.
In the first option, the simplest case is not to explicitly apply a lag, but use that which is
implicit to the numerical method. For example, in a forward time, backward space finite
difference solution (see Appendix 1) this would simply be one model calculation time-
step for each cell in the reach. If this time-step is very small, any change in the boundary
condition flow is more or less instantly applied to all downstream reaches, and so the
hydraulic model can be described as quasi-steady-state. The second case is founded on
established methods of hydrological flood routing, which attempt to simulate the
1 The division of the river into static cells is also called the “method of control volumes” whereby the states in neighbouring cells are treated as boundary conditions rather than as fully interacting states.
attenuation of the flood wave as it proceeds downstream. The combination of the two
methods provides a numerically efficient routing model, which can be shown, in theory
and practice, to match the accuracy of the Saint Venant solution if backwater effects are
not significant. Camacho and Lees (1999) describe a ‘discrete lag cascade’ model which
applies this method.
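A sketch of one such conceptual reach, combining the lag of Equation 3.3 with the non-linear reservoir of Equations 3.4a and 3.4b, is given below; the lag, q1, q2 and initial storage are illustrative values, not calibrated ones.

```python
import numpy as np

def route_reach(q_in, lag_steps, q1, q2, dt, v0):
    """Lag the inflow (Equation 3.3), then attenuate it through a reservoir."""
    q_lag = np.concatenate([np.full(lag_steps, q_in[0]), q_in])[:q_in.size]
    v, q_out = v0, np.empty_like(q_in)
    for j, q_up in enumerate(q_lag):
        q_out[j] = q1 * v ** q2          # Equation 3.4b: outflow from storage
        v += (q_up - q_out[j]) * dt      # Equation 3.4a: water volume balance
    return q_out

dt = 3600.0                                             # 1 h time-step (s)
t = np.arange(96) * dt
inflow = 5.0 + 20.0 * np.exp(-(((t - 10 * dt) / (5 * dt)) ** 2))  # flood wave
outflow = route_reach(inflow, lag_steps=3, q1=1e-4, q2=1.0, dt=dt, v0=5.0e4)
```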
A flow routing sub-model is necessary to accurately simulate the short-term response of
water quality to periods of high flow. However, in many modelling applications the
hydrodynamic response of the river is far faster than the response period of interest. For
example, the modeller may be interested in the week to week variability of the water
quality, whereas a normal flood event may pass down the river in a few hours. While the
water quality is likely to have some ‘memory’ of the flood, the justification of an accurate
routing model (and the associated data requirements) becomes increasingly dubious as the
disparity in time-scales increases. Flow dynamics which occur on, for example, a weekly
time-scale can arguably be described as ‘gradually varied flow’ and modelled using
quasi-steady methods (i.e. simplification of the Saint Venant solution to neglect
acceleration and pressure terms and assume that energy gradient equals bed slope) which
are far more computationally efficient2. Because the case-specific time-scale determines
the optimum method, some modelling tools allow a choice of steady, quasi-steady and
unsteady hydraulic models (e.g. DESERT, ISIS and MIKE11).
3.3.2 Solute transport models
Solute transport in rivers is widely recognised as not simply a case of advection according
to the average water velocity. There are four common explanations for this:
1. Non-uniform velocity over the width and depth of the river. This causes downstream sections to respond to a pulse of solute sooner than would be predicted using the average velocity, and the solute to take longer to pass.
2. Fickian-type dispersion, where solute disperses at a rate directly proportional to its spatial concentration gradient due to turbulent eddies.
3. Dead zones, where eddies behind obstructions entrap solutes, causing the pollutograph peak to lag behind the hydrograph peak and the pollutograph tail to lengthen.
4. Transient storage zones, where temporary solute sorption to plants and sediments has a similar effect to dead zones.
2 This is not necessarily to say that the modeller can neglect shorter periods of high flow, rather that he may be justified in representing them with a quasi-steady model.
Solute transport models generally represent one or two of these mechanisms,
conceptually encompassing the influence of them all. The traditional method is to neglect
(1), (3) and (4), and to represent mechanism (2) through discretisation of the advection-
dispersion equation (Taylor 1954, from Camacho 2000),
\frac{dC}{dt} = -u \frac{dC}{dx} + D \frac{d^2 C}{dx^2} \qquad (3.5)
where C is the concentration of an arbitrary solute, u is the average water velocity, D is
dispersion, t is time and x is distance downstream. This can be incorporated into any of
the hydraulic models mentioned previously. However, the advection-dispersion model
has been shown to have limited success in describing solute transport in natural channels,
particularly the tails of pollutographs (Young and Wallis 1993). For this reason, transient
storage models have been added (e.g. Bencala and Walters 1983, Lees et al. 2000),
\frac{dC}{dt} = -u \frac{dC}{dx} + D \frac{d^2 C}{dx^2} + D_x' \frac{C' - C}{dx'} \qquad (3.6)
where the additional term represents Fickian dispersion Dx’ to a conceptual off-stream
store of solute of concentration C’ over a mixing length dx’, illustrated in Figure 3.4.
[Figure 3.4 The transient storage concept: the main channel (cross-sectional area Ac, concentration C, velocity u = Q/Ac) exchanges solute by dispersion D' over a mixing length dx' with an off-stream store of concentration C', which does not contribute to flow and causes solute to lag the flow.]
While the transient storage model has proven useful, there are a total of four parameters
(u, D, Dx’ and dx’ or equivalent) per river reach, which must be estimated or calibrated
from the data. Beer and Young (1983) developed the aggregated dead zone model (ADZ)
as a parsimonious alternative. In continuous time form, the ADZ model is,
\frac{dC_{(i)}}{dt} = \frac{1}{T} \left( C_{(i-1,\, j-\alpha)} - C_{(i,j)} \right) \qquad (3.7)
where α is the solute time delay (i.e. the travel time of the leading edge of the
pollutograph from cell i-1 to cell i) and T is the dead zone residence time. Thus, by
assuming that Fickian dispersion, transient storage effects and non-uniform velocity
effects are negligible compared to the dead zone effect, the total number of parameters is
reduced to two (T and α) per river reach.
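A discrete-time sketch of a single ADZ reach follows (an explicit-Euler reading of Equation 3.7; the pulse input, time delay and residence time are illustrative).

```python
import numpy as np

def adz_reach(c_in, alpha_steps, t_res, dt):
    """Aggregated dead zone: first-order mixing of the delayed upstream input."""
    c_del = np.concatenate([np.full(alpha_steps, c_in[0]), c_in])[:c_in.size]
    c = np.empty_like(c_in)
    c[0] = c_in[0]
    for j in range(1, c.size):
        c[j] = c[j - 1] + dt / t_res * (c_del[j - 1] - c[j - 1])  # Equation 3.7
    return c

dt = 60.0                                                # time-step (s)
t = np.arange(600) * dt
pulse = np.where((t > 600.0) & (t < 1800.0), 10.0, 0.0)  # upstream pollutograph
downstream = adz_reach(pulse, alpha_steps=15, t_res=900.0, dt=dt)
```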
While modelling the lag and attenuation of solute in a river has primary application in
unsteady solute loading conditions, it has also been shown to be important in steady-state
applications. For example, Chapra and Runkel (1998) show that allowance for dead zones
and transient storage significantly affects the location of an oxygen sag downstream of a
steady-state point source of BOD. However, the ADZ model cannot simulate upstream
dispersion of solute which is common in estuaries and slow-moving rivers. While there is
an extensive knowledge-base and empirical formulae for the estimation of the ADE
parameters (i.e. the dispersion coefficient D and Manning’s n or equivalent), there is less
basis for uncalibrated estimates of ADZ and TS parameters (although recent work by
Camacho (2000) and Lees et al. (2000) has addressed this limitation). The use of standard
water quality data is generally insufficient for their calibration because of the interaction
effect of pollutant decay, and carefully planned and executed conservative tracer tests are
generally required (Wagner and Harvey 1997).
3.3.3 Thermodynamics
The thermal regime of a river affects the water quality because:
- the water temperature affects the saturation concentration of oxygen and multi-phase pollutants,
- rates of biological and chemical activity generally depend upon temperature,
- evaporation losses can be significant,
- river ice affects water temperature, aeration, atmospheric heat exchange and flow rates.
Water temperature is a variable in all but the most simple water quality models. Water
temperature might be prescribed by the modeller as a constant (e.g. HERMES), or as a
constant for each reach (e.g. an option in QUAL2E) or, for dynamic models, as a time-
series of data (e.g. WaterRAT; see Chapter 4). This is appropriate if it gives an
approximation of temperature which is adequate for the task at hand. For example, such a
method would not be appropriate if seasonally averaged temperature data were used when the model is intended for diurnal studies, nor if extrapolations of climate or
thermal loads were to be investigated. Alternatively, the temperature can be implicitly
calculated using a thermodynamic model. Modelling the thermal regime of rivers is
covered in detail by Ashton (1986) and Bras (1990). An overview is given here.
The temperature of the river may be significantly affected by:
- bulk transfer by advection, dispersion, extractions and pollution sources, fb (J s-1),
- long wave radiation to / from the atmosphere, sky and surrounding land, fl (J m-2 s-1),
- effective short-wave radiation from the sun, fs (J m-2 s-1),
- convection to and from the atmosphere, fc (J m-2 s-1),
- evaporation losses and condensation gains, fe (J m-2 s-1),
- conduction to and from the river bed, fsw (J m-2 s-1),
- rainfall and snowfall, fp (J m-2 s-1),
- conduction to and from ice, fiw (J m-2 s-1).
The first five of these processes are usually included in dynamic water quality models
(including options in QUAL2E, WASP5 and MIKE11, although in QUASAR only the
bulk transfer is accounted for). River bed conduction and friction effects are included in
some more specialist thermodynamic models (e.g. Evans et al. 1998). The first seven
processes are illustrated in Figure 3.5 and their implementation is summarised by
Equation 3.8, in which it is assumed that there is no ice cover.
$$\frac{dJ}{dt} = f_b + \left(f_s + f_l + f_e + f_c + f_p + f_{sw}\right)A_s \qquad (3.8)$$
where As is the surface area of the water (m2) and J is the total heat in the river reach
(Nm), linked to the water temperature Tw (oC) by,
$$T_w = \frac{J}{V\,d_w\,s_w} \qquad (3.9)$$
where V is the volume of water in the reach, dw is the density of water and sw is the specific heat capacity of water. Note that
Equation 3.8 is not analytically soluble, with each of the terms non-linearly dependent on
Tw. The use of relatively complex, physically-based thermodynamic models for the
derivation of the terms in Equation 3.8 is often justifiable because the theory is well
founded and validated in application to river modelling, and does not require a large
number of prior assumptions. Air temperature, humidity and daylight hours, which are
normally the primary factors affecting the water temperature (Ashton 1986), are generally
reliable and available on a daily basis. Lastly, water temperature is relatively easily
and accurately measured. Therefore, calibration and verification of the thermodynamic
model is not necessarily complicated by large data error.
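Because each flux term in Equation 3.8 depends, generally non-linearly, on Tw, the heat balance must be stepped forward numerically. A minimal sketch of one such step, with constants and flux forms that are purely illustrative, uses Equation 3.9 to convert between heat and temperature:

```python
RHO_W = 1000.0   # density of water d_w (kg/m3)
CP_W = 4186.0    # specific heat capacity of water s_w (J/kg/oC)

def step_temperature(Tw, dt, V, As, flux_terms, f_bulk=0.0):
    """Advance water temperature by one Euler step of Eq. 3.8, converting
    total heat J to temperature with Eq. 3.9.

    flux_terms : callables f(Tw) returning surface fluxes (J/m2/s)
    f_bulk     : bulk advective heat gain f_b (J/s)
    """
    J = Tw * V * RHO_W * CP_W                  # Eq. 3.9 rearranged
    dJdt = f_bulk + As * sum(f(Tw) for f in flux_terms)
    return (J + dt * dJdt) / (V * RHO_W * CP_W)

# Illustrative fluxes: constant short-wave gain, linearised convective loss
f_s = lambda Tw: 150.0                  # short-wave radiation (J/m2/s)
f_c = lambda Tw: -20.0 * (Tw - 15.0)    # convection towards 15 oC air
Tw = 10.0
for _ in range(24):                     # 24 hourly steps
    Tw = step_temperature(Tw, 3600.0, V=5e4, As=1e4, flux_terms=[f_s, f_c])
```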
Figure 3.5 The main processes affecting the water temperature in a river: advection and dispersion; bulk pollution sources; extractions and bed leakage; surface convection; short wave radiation; long wave radiation; evaporation; conduction through the bed (between the water temperature and the deep bed temperature); and, where ice is present, conduction through ice (between the air temperature and the water at T = 0).
The presence of ice on a river signifies a local water temperature of close to zero³.
However, thermal loads such as cooling water discharges or sewage discharge can cause
large heterogeneity which will raise the average water temperature significantly above
zero, despite the widespread presence of ice. Various approaches have been employed for
the modelling of ice on rivers. The simplest is the degree-day method (Shen and Chaing
1984) where ice thickness and cumulative degree-days below freezing are linked by a
simple empirical formula,
$$H_i = K\,Z^{0.5} \qquad (3.10)$$
³ While a somewhat specialised field of river water quality simulation, the modelling of ice is given significant attention here due to its perceived importance in the Hun river study.
where K is an empirical constant and Z is the cumulative degree-days below freezing. A
degree-day is determined by calculating the mean daily air temperature for the day and
subtracting it from a base temperature, in this case zero. The main limitation of this
approach in dynamic river water quality modelling is the assumption that ice thickness depends only on the air temperature, thereby neglecting the influences of thermal
pollution, radiation, sediment heat and the retardation effect of flow turbulence. Shen and
Chaing (1984) suggest a heat balance approach which assumes linear heat gradients
between the air and the ice, the ice and the water, and the water and the river-bed, as well
as the usual heat exchanges (first five in the previous list). Neglecting the terms already
presented in Equation 3.8, the water heat gain can be modelled by (adapted from Shen
and Chaing 1984 and Lal and Shen 1993),
$$\frac{dJ}{dt} = k_{iw}\left(T_i - T_w\right)A_f + k_{sw}\left(T_s - T_w\right)A_s \qquad (3.11a)$$

$$\frac{dH_i}{dt} = \frac{1}{d_i\,l_i}\left[\left(\frac{1}{k_{ai}} + \frac{H_i}{k_i}\right)^{-1}\left(T_i - T_a\right) - k_{iw}\left(T_w - T_i\right)\right] \qquad (3.11b)$$
where kiw, ksw and kai are the heat transfer coefficients between the water and the ice, the
water and the sediment, and the ice and the air respectively (thereby, for each, the thermal
conductivity and boundary layer thickness are combined into one parameter); Ti is the ice
temperature at the water-ice interface which equals zero; Ts is the sediment temperature;
di and li are the density and latent heat of ice respectively; and ki is the thermal
conductivity of ice. This model simulates the lag between Tw and Ta. However, the
accuracy of results is limited because the response of the terms in Equation 3.8 to the
presence of ice is necessarily simplified due to lack of present knowledge. Particular
sources of error are the insulative effect of snow cover on the ice and the reduction in
evaporative losses during periods of ice cover. In arctic regions, it is often important to
model the ice progression and melt in detail. Shen (1979) suggests a framework for
incorporating the physical processes of frazil and floe formation, illustrated in Figure 3.6.
Numerical implementations of these processes are described in Maunula (1992) and Lal
and Shen (1993).
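At the simple end of this spectrum, the degree-day relationship of Equation 3.10 reduces to a few lines of code; the sketch below (the value of K is purely illustrative) accumulates degree-days below freezing and returns the corresponding ice thickness series:

```python
def ice_thickness_degree_day(daily_mean_air_temp, K=0.02):
    """Degree-day ice growth (Eq. 3.10): H_i = K * Z**0.5, where Z is the
    cumulative degree-days below freezing and K is an empirical constant."""
    Z = 0.0
    thickness = []
    for Ta in daily_mean_air_temp:
        Z += max(0.0, -Ta)             # accumulate degree-days below 0 oC
        thickness.append(K * Z**0.5)
    return thickness

# Ten days at -5 oC gives Z = 50 degree-days and H_i = K * 50**0.5 ~ 0.14 m
print(ice_thickness_degree_day([-5.0] * 10)[-1])
```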
Figure 3.6 A framework for an ice model (Ashton, 1979): cooling of water to 0 oC leads, where velocity u < 0.6 m/s, to ice sheets forming on the surface and developing into floes which accumulate until a sheet bridges the river or, where u > 0.6 m/s, to frazil production, with frazil floating and flocculating into frazil pans; the cover then grows by thermal growth and undercover deposition.
Water quality models which allow for the effects of ice are rare. This may be because, in
Europe and the USA, the critical water quality periods are generally in the warm and dry
seasons, when dilution is low and oxygen-depleting decay rates are high. However, in
some climates, the cold season is also the dry season. Ranjie and Huiman (1987) simulate
the dissolved oxygen during winter in a river in northern China, and suggest that
decreased aeration due to the ice cover significantly affects the dissolved oxygen levels.
Their model is a steady-state model which assumes that the river is fully covered with ice
and that the average water temperature is zero. The QUAL2E documentation (Brown and
Barnwell 1987) recommends a factor of between 0 and 1 to adjust oxygen re-aeration
rates to allow for ice cover but does not allow dynamic representation of ice.
Table 3.1 Classes of water quality determinands commonly modelled

| Class of determinand | Example model variables | Significance |
| --- | --- | --- |
| Carbon and oxygen | Organic and inorganic carbon, dissolved oxygen | ecology, aesthetics, WT |
| Nitrogen | Nitrate, ammonia, organic nitrogen | eutrophication, toxicity, WT |
| Phosphorus | Orthophosphates | eutrophication |
| Organic toxins | Phenols, pesticides | toxicity, WT |
| Oils | Petrol, lubricants, fats | toxicity, nuisance, WT |
| Suspended solids | Inorganic and organic solids | aesthetics, WT, sorption |
| Metals | Mn, Fe, Hg, Ca | toxicity, REDOX reactions |
| Pathogens | E. coli, giardia | disease, WT |

WT = water treatment costs
3.3.4 Water quality processes
“Water quality processes” refers here to all the physical, chemical and biochemical
transformations of the water quality determinands. Due to the wide range of application
of water quality models, a variety of determinands may be included as state variables, and
these are broadly classified in Table 3.1. Thomann and Mueller (1987) and Chapra (1997)
review the approaches to modelling the processes transforming the state of these
determinands. Chapra also reviews recent methods of simulation of sediment-water
interactions, toxic substances, sorption, phytoplankton stoichiometry and bio-
accumulation. Apart from these subject areas, and some specialist fields (e.g. oil slick
modelling, Lal and Shen 1993), the basics of process modelling have not significantly
changed in the last 20 years, as it is recognised that the utility of the models is limited by
other factors (e.g. the difficulties of representing heterogeneities and of model
identification). Here, an overview is given of the state-of-the-art methods which are pertinent to this Thesis, covering only the first six classes in Table 3.1: carbon, nitrogen, phosphorus, organic toxins, oils and suspended solids.
3.3.4.1 Carbon and dissolved oxygen
The traditional significance of organic carbon in a river is the impact that it has on
dissolved oxygen. Therefore, it is generally measured using its oxygen demand, most
commonly the chemical oxygen demand (COD) or carbonaceous biochemical oxygen
demand (BOD). These determinands are reasonably convenient to measure, and because
the methods are well-practised, they allow regional and global comparisons. This means
that the data available for model calibration tends to be BOD and/or COD, and either the
model must be parameterised to give comparable results or the data must be adjusted. The
previously mentioned models are parameterised to simulate the carbon oxygen demand
via either the ultimate BOD or the 5-day BOD. Carbon has a fundamental role in water
quality processes which cannot be identified by BOD alone (Connolly and Coffin 1995,
Chapra 1999). Firstly, the fraction of carbon which will settle depends (among other
things) on the fraction which is solid. Without such data, settlement and sediment
processes cannot be modelled mechanistically, and the modeller must assume, neglect or
calibrate effective settling velocities of BOD. Secondly, the lumped approach neglects the
individual fates of the different fractions of carbon so, for example, there can be no
mechanistic model of hydrolysis of complex carbohydrates and long-chain hydrocarbons.
Thirdly, a mechanistic model of toxin sorption processes is not possible without an
estimate of the particulate and dissolved fractions of carbon (e.g. Tye et al. 1996).
In the reviewed models, carbon–dissolved oxygen processes are based on numerical
solutions of the original Streeter-Phelps model, plus one or more of the following improvements:
- implicit calculation of aeration rate (QUASAR, QUAL2E, CE-QUAL-W2, RWQM1),
- allowance for anaerobic conditions (QUASAR, RWQM1),
- temperature effects for both BOD decay and aeration (ISIS, QUASAR, QUAL2E, WASP5, MIKE11, CE-QUAL-W2, RWQM1, SIMCAT),
- some representation of sediment oxygen demand (ISIS, DESERT, QUASAR, QUAL2E, WASP5, CE-QUAL-W2, RWQM1),
- phytoplankton photosynthesis (ISIS, QUASAR, QUAL2E, WASP5, MIKE11, CE-QUAL-W2, RWQM1),
- unsteady transport models (OTIS, ISIS, WASP5, MIKE11, DESERT, CE-QUAL-W2, RWQM1).
The aeration rate is widely recognised to be directly proportional to the dissolved oxygen
deficit, i.e. the saturation level minus the actual level. There are a variety of formulae (see
Bowie et al. 1985) for the estimation of the dissolved oxygen (Cox) saturation level Cos;
one of the most commonly used is that derived experimentally in Greenberg et al. (1992)
(from Chapra 1997) which relates it to water temperature and salinity (Csa in g/l),
$$C_{os} = \exp\left\{-139.34411 + \frac{1.575701\times10^{5}}{T_w+273} - \frac{6.642308\times10^{7}}{\left(T_w+273\right)^{2}} + \frac{1.243800\times10^{10}}{\left(T_w+273\right)^{3}} - \frac{8.621949\times10^{11}}{\left(T_w+273\right)^{4}} - C_{sa}\left[1.7674\times10^{-2} - \frac{1.0754\times10^{1}}{T_w+273} + \frac{2.1407\times10^{3}}{\left(T_w+273\right)^{2}}\right]\right\} \qquad (3.12)$$
In most freshwater applications, Csa can be taken as zero. The aeration rate kra (m s-1) is
widely recognised to also depend on the water temperature through an Arrhenius type
relationship,
$$k_{ra} = k_{ra20}\,\theta^{\left(T_w - 20\right)} \qquad (3.13)$$
where kra20 is kra at Tw = 20oC and θ is called (at least in this work) the Arrhenius constant
which for aeration is typically taken as 1.024 (Chapra 1997). kra20 is generally calibrated,
although in QUAL2E and QUASAR, for example, it can be calculated using an empirical
or physically-based formula (see Cole and Wells 2000: Appendix B) which relates it to
the hydraulic state of the river. Cole and Wells also list formulae which model aeration at
hydraulic controls, and Gulliver et al. (1998) evaluate a number of such models.
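Equations 3.12 and 3.13 translate directly into code. The sketch below (function names illustrative) reproduces the saturation and temperature-correction calculations:

```python
import math

def do_saturation(Tw, Csa=0.0):
    """Dissolved oxygen saturation (mg/l) from Eq. 3.12; Tw in oC, Csa
    salinity in g/l (zero for most freshwater applications)."""
    Ta = Tw + 273.15                    # absolute temperature
    ln_cos = (-139.34411 + 1.575701e5 / Ta - 6.642308e7 / Ta**2
              + 1.243800e10 / Ta**3 - 8.621949e11 / Ta**4
              - Csa * (1.7674e-2 - 1.0754e1 / Ta + 2.1407e3 / Ta**2))
    return math.exp(ln_cos)

def aeration_rate(kra20, Tw, theta=1.024):
    """Temperature-corrected aeration rate from Eq. 3.13."""
    return kra20 * theta ** (Tw - 20.0)

print(do_saturation(20.0))        # about 9.1 mg/l in fresh water at 20 oC
print(aeration_rate(2.0, 25.0))   # about 13% above kra20 at 25 oC
```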
Allowance for anaerobic conditions is made in QUASAR by setting the organic carbon
decay rate koc to zero when the Cox level reaches zero. This is not allowed for in many
models, making them potentially unreliable for highly polluted rivers. Strictly, in well
mixed rivers at Cox = 0, the aerobic koc should not exceed the aeration rate (Chapra 1997).
Alternatively, it is suggested that a Michaelis-Menten relationship,

$$k_{oc} \propto \frac{C_{ox}}{C_{ox} + k_{ochs}} \qquad (3.14)$$
where kochs is the dissolved oxygen half saturation constant for organic carbon decay, may
be more accurate (Chapra 1999), as it simulates inhibition of aerobic bacteria during
anoxia (i.e. Cox < 2 mg/l). The Michaelis-Menten relationship is important in water quality
modelling generally because it is applied to limitation of microbial growth by, for
example, Cox, minerals, prey, or light.
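As a short sketch of Equation 3.14 (the half-saturation value below is illustrative), the limitation factor scales a maximum first-order decay rate smoothly to zero as the river goes anoxic:

```python
def oxygen_limited_decay(koc_max, Cox, kochs=0.5):
    """First-order organic carbon decay rate limited by dissolved oxygen
    through a Michaelis-Menten factor (Eq. 3.14); kochs in mg/l."""
    return koc_max * Cox / (Cox + kochs)

for Cox in (8.0, 2.0, 0.5, 0.0):
    print(Cox, oxygen_limited_decay(0.3, Cox))
```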
Carbonaceous sediment oxygen demand (CSOD) is commonly represented as a zero
order process (Chapra 1997: 452) - for example in QUAL2E CSOD is input as a constant
by the user. This has been shown to be justified in some cases (e.g. Chen et al. 1999). An
alternative assumption is some simple quasi-steady relationship (for example, in
QUASAR, CSOD is directly proportional to the water Cox) or that CSOD is directly
proportional to the rate of settlement of organic carbon. CSOD involves degradation and
mixing processes which can cause Cox to lag the organic carbon load by days or weeks
(Harremoes 1982, Boyle and Scott 1984). The main processes leading to this effect are
illustrated in a simplified manner in Figure 3.7. To represent these processes
mechanistically (notwithstanding the problems of spatial heterogeneity and sediment
transport) requires a large number of differential equations and parameters (e.g. McIntyre
1998) although such formulations have been shown to be effective where supporting data
from the sediment are available (e.g. Di Toro and Fitzpatrick 1993). Simpler
conceptualisations have also been shown to simulate the effect on overlying water quality
(e.g. Li and Chen 1994).
Figure 3.7 Schematic illustration of sediment-water carbon interactions: solid, dissolved and mineral carbon and dissolved oxygen are exchanged between the river water, an aerobic sediment layer and an underlying anaerobic sediment layer.
3.3.4.2 Photosynthesis
Photosynthesis can significantly raise the Cox levels and decay of phytoplankton can
significantly reduce it, causing marked spatial and temporal variations in river Cox. A
multitude of case-specific models have been developed to simulate photosynthesis. While
there are exceptions (Reckhow and Chapra 1983a: 201-314, Whitehead and Hornberger
1984, Whitehead et al. 1997b), the general approach is to mechanistically model the mass
transport and sedimentation, together with the limiting effects of light, temperature and
nutrients on phytoplankton growth. Excessive organic carbon load or presence of
toxicants can inhibit photosynthesis and so, in heavily polluted rivers, additional
inhibition factors may be required. Also, grazing and predatory zooplankton can be
included if first order phytoplankton death is not a useful assumption. In rivers, transport
of phytoplankton significantly affects the spatial distribution of photosynthetic oxygen
production. Therefore, it is often useful to distinguish between phytoplankton and fixed
photosynthesisers, and a number of modellers have done this (e.g. Howarth et al. 1996,
McIntyre 1998, Park and Lee 2002, Wade et al. 2002). For a detailed description of
numerical modelling of eutrophication processes and for reviews of alternative
approaches, refer to Reckhow and Chapra (1983a,b), Bowie et al. (1985: 279-365),
and Chapra (1997: 519-621).
The popular models ISIS, QUAL2E, MIKE11, WASP5, CE-QUAL-W2 and RWQM1
can model the role of phytoplankton in the carbon and oxygen cycles. MIKE11 also has
the option to model macrophytes. QUASAR simulates photosynthetic oxygen production
but requires that the user inputs chlorophyll-a concentrations. The transport-orientated
OTIS neglects photosynthesis, while DESERT allows a user-defined representation.
3.3.4.3 Nitrogen and phosphorus cycles
Typically, the main inorganic forms of nitrogen (ammonia, ammonium and nitrate) are used as indices of nitrogen pollution. Partly this is for measurement
convenience, partly because there are particular health risks associated with ammonia and
nitrates (WHO 1996), and partly because they are often important controls on
eutrophication. Nitrogen modelling, however, requires some representation of the organic
state of nitrogen as this generally makes up a significant part of the pollution load
(Metcalf and Eddy Inc 1991) and some fraction of it will mineralise in the river. Clearly
there is the same problem of lumping the solid and aqueous fractions of organic nitrogen
together that is encountered in modelling carbon. In cases without data for organic nitrogen, the modeller must approximate concentrations via a knowledge-based relationship with the inorganic forms (e.g. Metcalf and Eddy Inc 1991). Ammonia and
ammonium are generally modelled together. They exist in an equilibrium which depends
on pH and temperature (see the description of Whitehead et al. 1997a), so can only be
modelled separately when pH is also an input or modelled variable. Nitrite is generally
lumped into nitrate, as it exists in much smaller quantities. A notable exception is
QUAL2E which models nitrites explicitly. For the models used later in this dissertation
the following notation for nitrogen concentrations is used: organic nitrogen (Cns);
ammonia plus ammonium (Cna); nitrite plus nitrate (Cni).
The fundamental difference between carbon modelling and nitrogen modelling is that the
inorganic forms of nitrogen are of interest whereas those of carbon generally are not. The
mechanisms of the transformations of nitrogen are well documented (e.g. Sawyer et al.
1994), and are generally modelled using similar methods to those used for the decay of
organic carbon. That is, first order decay rates are used; the Arrhenius equation is used to
account for temperature variable decay rates; and oxygen limitation models such as
Equation 3.14 are used in the oxidation reactions. Perhaps the most important feature of
nitrogen (and phosphorus) modelling is the importance of the loads from distributed
sources. For example, an estimated 70% of the nitrate load in the UK is from distributed
sources (DEFRA 2002) because of nitrates in the rainfall, and the runoff of animal
detritus and excess chemical fertilisers. The importance of distributed sources has
motivated integrated runoff-water quality models, e.g. HSPF (Bicknell et al. 1997) and
INCA (Whitehead et al. 1998).
Denitrifying bacteria (responsible for the loss of nitrogen mass to the atmosphere) are
known to flourish on the aerobic surface layer of the sediment (e.g. Kusuda et al. 1994).
Therefore, for models without implicit representation of the active sediment area and
sediment nitrogen, there is a problem of how to conceptualise denitrification. QUASAR
assumes that denitrification is directly proportional to the nitrate concentration in the
water column (with a temperature correction), while QUAL2E neglects denitrification
altogether.
Although certain pesticides containing phosphorus are known to be toxic (Sawyer et al.
1994), there is no significant oxygen demand associated with phosphorus, nor any direct
detriment to the environment or to human health of commonly found inorganic forms.
Therefore phosphorus modelling is motivated by eutrophication, and is included in all of
the reviewed river eutrophication models. Modelling the phosphorus cycle is complicated
by the numerous and inter-related fractions of phosphorus which are relevant to the
modeller (see Figure 3.8 adapted from Chapra 1997; note that sediment fractions are
lumped together and sediment-water interactions are not shown). The various fractions
are generally represented by two mutually exclusive conceptual fractions, for example
organic P (Cps) and dissolved P (Cpo) (e.g. in QUAL2E). Alternatively, only one fraction
is assumed to be relevant at the time-scale in question (e.g. orthophosphates in WASP5).
An additional complication in phosphorus modelling is that inorganic forms tend to adsorb to solids, which can make the phosphorus unavailable as a nutrient (e.g. Tate et al. 1995). Of particular importance are the level of iron hydroxide in the sediment, because it strongly adsorbs orthophosphates, and the dissolved oxygen level, which reduces the
modelled successfully under certain conditions, it is clear that adequate sediment data are
required.
Figure 3.8 Fractions of phosphorus relevant in a photosynthesis model (adapted from Chapra 1997): particulate and non-particulate unavailable organic P, particulate and non-particulate unavailable inorganic P, and soluble reactive phosphorus, classified by availability to photosynthesisers and by solid or attached versus dissolved state. Sediment fractions are lumped together and sediment-water interactions are not shown.
As with nitrogen, distributed runoff of fertilisers and animal waste is a major source of
phosphorus in many rivers. For example, an estimated 40% of the phosphorus load to UK
rivers originates from distributed sources (DEFRA 2002).
3.3.4.4 Organic toxins and oils
Organic toxins pose some special difficulties for the modeller:
- They tend to accumulate and persist in sediments, and associate readily with other organic material (Chapra 1999). This means that concentrations are sensitive to the amount and nature of the organics present in the water and sediments.
- There are a large number of types, subject to a large number of kinetic processes (biodegradation, volatilisation, evaporation, photolysis, hydrolysis, sedimentation; see Chapra 1997). This means that physically-based models are complex, and that lumped conceptual models will give predictive results of limited accuracy, especially when the predictive task differs markedly from the conditions of calibration.
- Small concentrations may be hazardous, which increases the need for accurate models, and increases the significance of spatial and temporal heterogeneity in the river.
However, the seriousness of their impacts means that organic toxins have been relatively well studied. As
they are largely man-made, there is interest in the topic from the producers as well as
from environmentalists, regulators and engineers. Therefore, given supporting data, there
is motivation and scope for detailed mechanistic toxic models (Thomann 1998, also see
Chapra 1997 for a review of research and numerical implementations). Models of organic
toxins (e.g. those in WASP, MIKE11 and the specialist steady-state model EXAMS
(Burns 2000)) are generally complex with a large number of state variables. Simplified
approaches, which assume zero decay or first order decay, are available and may be
useful where the transport is considered the primary factor (e.g. applications of the OTIS
model).
The term ‘oil’ generally includes a wide variety of organic substances which are
soluble in certain solvents (Sawyer et al. 1994: 603). This includes hydrocarbons,
esters, natural oils and fats, waxes and long-chain fatty acids. Modelling oil poses the
same difficulties as organic toxins with the specific problem that some fractions of oil are
hydrophobic and some of these are less dense than water. This means that many oils will
not disperse evenly in the water, instead forming globules which may float on the water
surface and these will tend to form distinct pools due to wind effects. Such heterogeneity
is difficult to represent in lumped models, in particular the transport processes of the
surface layer of oil are different to those assumed using previously mentioned methods.
Traditionally, river oil models are directed at modelling the transient effect of single-
source spills under gradually varied or steady flow, and this allows Lagrangian type
numerical schemes to be applied (a review of river oil models is given in Yapa and Shen
1995). Another feature of existing oil models is that the fraction of spilled oil is known,
and physically-based parameters are known with confidence. Modelling multiple sources
of oil for purposes of management of day to day pollution requires an alternative
modelling approach. In particular, industrial and municipal pollution sources include
numerous fractions of oil which are lumped together using standard methods of
measurement (Greenberg et al. 1992:5-25), therefore there is no scope to model the
fractions individually. Consequently some average, conceptual parameter values must be
used to represent the oil processes (e.g. Sherif 2000).
3.3.4.5 Suspended solids
The concentration and nature of suspended solids in a river are important because:
- they control the transport and fate of adsorbed pollutants, especially toxins,
- suspended solids are a nuisance in water treatment, causing erosion of plant and requiring pre-treatment sedimentation tanks,
- sedimentation and erosion can be a nuisance, e.g. siltation behind weirs and erosion of bridge piers,
- sediment nature and distribution affects sediment-water interactions,
- turbid water can be unaesthetic.
In lakes and deep rivers, a common conceptualisation of suspended solid behaviour is to
lump the solids into one variable, Css, and to assume an effective settling velocity, vss,
which is calibrated (e.g. Chapra 1991);
$$\frac{dC_{ss}}{dt} = -\frac{v_{ss}\,C_{ss}}{H_w} \qquad (3.15)$$

where Hw is the water depth.
A problem with this model is that suspended solids will not be uniformly distributed over
the water depth (for example, hindered settling will occur towards the bottom) and so the
sedimentation will not generally be first order with respect to the average suspended
solids concentration. Also in such systems, the total suspended solids may be dominated
by bio-production and this also is a function of depth (due to light extinction) and time
(due to the changing physiology of the microbes). On the other hand, in shallow rivers,
which are relatively well mixed over their depth, the concentration of suspended solids is
dominated by local eddies, turbulence and sediment resuspension characteristics (Yang
and Molinas 1982); therefore the above simple model is also of limited value. To
overcome this limitation, specialist solids transport models have been developed.
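For reference, Equation 3.15 implies simple exponential decay of the lumped suspended solids concentration; a sketch of its analytical solution (values illustrative) is:

```python
import numpy as np

def settle(Css0, vss, Hw, t):
    """Analytical solution of Eq. 3.15: Css(t) = Css0 * exp(-vss * t / Hw),
    with effective settling velocity vss and water depth Hw."""
    return Css0 * np.exp(-vss * t / Hw)

t = np.linspace(0.0, 86400.0, 25)                    # one day, hourly
Css = settle(Css0=50.0, vss=1e-5, Hw=2.0, t=t)       # vss ~ 0.86 m/day
```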
The modelling of suspended solids in rivers can be regarded as comprising seven
elements: loading, sedimentation, bed-transport, flow-transport, resuspension, bio-
accumulation and bio-degradation. Sediment transport models are traditionally aimed at
the problems of erosion and excessive sedimentation, where the effect of organic solids is
relatively small. Therefore, sediment transport models generally neglect bio-degradation
and bio-accumulation. Even neglecting these processes, sediment transport is notoriously
difficult to simulate. This is shown by Nakato (1990), who compares the results of eleven sediment transport models, finding large differences between them; the empirical models are the least reliable, and the most successful are those which conceptually relate sediment
transport (and the concentration of suspended solids) to the hydraulic energy dissipation
and the size classification of the sediment.
It may be concluded that accurate modelling of suspended solids and their effect on other
water quality variables requires model sophistication and data which are beyond the
resources of most water quality studies. Many standard water quality models (e.g.
QUASAR and QUAL2E) do not attempt to model the sediment, thus are limited in their
representation of sediment-water interaction. Other models, which include such
interaction, adopt the simple approach of Equation 3.15 (e.g. Ivanov et al. 1996).
Recognising the limitation of this, MIKE11 incorporates a choice of suspended solids
models (e.g. Engelund and Fredsoe 1976) for cohesive and cohesionless sediment
transport, and these have been applied with some success in data-rich studies (e.g.
Enggrob 1997). However, McNeil et al. (1996) note that such models are not reliable
under many high flow conditions because they do not account for heterogeneity in the
depth of the sediment.
3.4 Summary
1. Numerous river water quality models are freely available and are being widely
applied for design, management and research purposes.
2. Arguably, all water quality models can be described as conceptual because they
represent the river environment in a meaningful but greatly simplified way, and use
model parameters which generally cannot be measured, but must be estimated using
expert knowledge or calibrated using reference data.
3. Pollutant transport models are a fundamental part of a river water quality model as
they dictate the magnitude and location of pollution. Hydrodynamic models are
computationally expensive, and more efficient routing or quasi steady-state models
are used for most applications.
4. Water temperature models are a fundamental part of a river water quality model as
they dictate the rate of most water quality processes, and therefore the pollution
levels. Physically-based thermodynamic models can simulate the water temperature
in a well-mixed river if the air temperature and various other atmospheric conditions
are known. Ice coverage is more difficult to simulate accurately, and ice presence is
likely to reduce the accuracy of the water temperature model during the thaw.
5. The fate of pollutants in a river is simulated using the principle of conservation of
elemental mass. That is, nitrogen, carbon, etc. cannot be destroyed, only transformed
in location and in species. In essence, the task of the dynamic model is to simulate the
location and species at all times. This is complicated by the various interacting
species of nitrogen, carbon and phosphorus (among others) of potential interest.
Furthermore, the nitrogen, carbon and phosphorus fates are all inter-linked by the
critical role of oxygen in the aquatic life-cycle.
6. Phytoplankton have an important role in the carbon and nutrients cycles, especially
because they effectively fix carbon (and sometimes nitrogen) from the atmosphere.
They are difficult to model because of spatial and temporal heterogeneity in their
growth, due to the important influence of the higher food chain, and due to their
sensitivity to nutrients, light, water temperature and other environmental variables
that are not easy to accurately model or measure.
7. ‘Traditional’ pollutants are regarded as organic carbon and nutrients. Recently, the
value of water quality models in simulating toxic substances has been recognised.
There are various difficulties in simulating toxics which mean that toxic models are
necessarily more complex than those of the traditional pollutants.
8. Sediments have a recognised role in river pollution as they can store pollutants and
rapidly release them during scour, or slowly release them diffusively. However,
accurately modelling sediment transport is extremely difficult and is outwith the
scope of most models.
9. Different modelling tasks require different models because of the number of
potentially relevant pollutants, and the supporting expertise and data which is
available. This raises a difficulty with application of available models. Some of these
try to cover more pollutants with add-in modules, while others encourage the user to
specify his own differential equations for the model state variables if necessary.
4. Water quality risk analysis tool (WaterRAT)
This chapter summarises the river modelling and uncertainty analysis components of
WaterRAT (Water quality Risk Analysis Tool), developed by the author. WaterRAT
includes a number of methods of evaluation of model and prediction uncertainty, and a
library of river and lake models of different complexities to suit the predictive task, the
characteristics of the natural system, the available data and computational resources. This
analytical capability is designed to encourage the modeller to explore prediction
uncertainty fully, and hence make properly informed recommendations for water quality
management (including water quality objectives, interventions, monitoring and model
development).
4.1 Introduction
The WaterRAT software implements the modelling framework proposed in Chapter 1
using methods and ideas introduced in Chapters 2 and 3. In summary, WaterRAT aims to
allow:
1. Responsiveness of modelling approach to the user’s data and resource constraints.
2. Indication of the principal factors affecting various determinands of water quality,
under current and speculated conditions.
3. Evaluation of the risk of failure associated with alternative pollution control
strategies, and indication of the key sources of decision-making risk.
4. Identification of trade-offs and viable compromises between non-commensurate
modelling and management objectives.
5. Iterative review of management objectives, modelling approaches, and database
and model development priorities.
These aims are pursued by including the following characteristics:
1. A choice of model structures and modelled determinands, and a framework
within which additional models may easily be added.
2. A framework within which (almost) all model inputs may be treated as uncertain,
and optimised to any chosen combination of target outputs.
3. Incorporation of GLUE for model uncertainty estimation, supplemented by a
deterministic genetic algorithm and Pareto (multi-objective) optimisation.
4. A selection of objective functions allowing flexibility in derivation of parameter
uncertainty using the GLUE methodology.
5. Efficient numerical methods allowing Monte Carlo methods to be as effective as
possible, and first order approximations to supplement Monte Carlo methods.
6. An easy-to-use interface for model specification and result analysis.
As well as pursuing a generic modelling framework, WaterRAT has been tailored to the
specific objectives of the TOPLEM project, and in particular this dictated the modelled
determinands of water quality, and the user interfaces.
Descriptions of the equations used in the simulation models, and in the conditioning and
analysis modules have been omitted from this chapter, as they are lengthy and not all
immediately relevant. Instead, reference is made to the WaterRAT documentation
(McIntyre and Zeng 2002), and specific descriptions are given where needed in Chapters
5, 6 and 7. This chapter concentrates on describing the functionality of WaterRAT in
terms of its general structure, its inputs, model library, analytical modules and outputs.
Following these descriptions, the novelty and limitations of WaterRAT are reviewed.
4.2 The concept and structure of WaterRAT
WaterRAT is a spreadsheet-based modelling tool that includes a library of surface water
quality models of varying complexity, presently including a choice of one-dimensional
river models and two-dimensional lake models, as described further below. Alternative
numerical solution methods are provided, plus a limited choice of pre-processing models
for filling in missing boundary condition data. There is flexibility over spatial scale, and
the input-output time-step may be anything above one minute. With the exception of
meteorological boundary conditions, all values of input data and parameters may be
treated as uncertain variables, and their effects included in calibration and reliability
analysis. The currently included models are limited to individual lake and river water
bodies plus interacting sediments, rather than of the wider catchment, and lake-river
systems cannot presently be modelled as a whole. However, additional models to suit new
problems can be added to the library.
WaterRAT is built within MS Excel 2000, so that WaterRAT’s own data processing
modules can be supplemented by those of Excel. The input and output is via a series of
Excel spreadsheets and model specifications are made via Visual Basic (VB) menus and
dialogue boxes. The library of simulation models comprises a series of Dynamic Link
Libraries (DLLs), which minimises processing time and allows Monte Carlo techniques
to be efficiently applied. The interactions between the user interface, VB modules and the
core DLLs are illustrated in Figure 4.1.
WaterRAT has a library of model structures for its pollution transport, water temperature
and water quality modules. This library gives the modeller a choice of different model
complexities and of the determinands to be modelled. This includes the capacity to
model total organic carbon, biochemical oxygen demand, chlorophyll-a, dissolved
oxygen, various nutrients, a toxic substance, floating and suspended oil, and total
suspended solids. This is supported by sediment models which include biochemically and
physically-driven sediment-water interactions. A thermodynamic model is available
which models heat fluxes from the atmosphere and sediment, and simulates ice thickness
and cover. This thermodynamic model may be bypassed by prescribing a time-series of
water temperature. For the river models, pollution transport is modelled using the one-
dimensional advection-dispersion equation supported by two alternative hydraulic models
(a quasi-steady friction formula and a non-linear store). The lake models use the
advection-dispersion model supported by level-pool routing, with the option of including
prescribed periods and strengths of thermal stratification. The model components
presently available are summarised in Figure 4.2. The transport, sources, losses and
changes of state of mass and energy are represented by systems of differential equations,
a detailed description of which is found in McIntyre and Zeng (2002).
Figure 4.1 WaterRAT's software components: the Excel interface (data input and model configuration; result display) communicates through Visual Basic modules (data processor; result processor) with the Dynamic Link Libraries containing the river and lake simulation models (routing models, an optional thermodynamic model, water quality models and an optional sediment quality model).
Figure 4.2 Model component options currently available in WaterRAT. The core model comprises routing (flow, water depth), thermodynamics (water temperature) and water quality (total carbon, total nitrogen, total phosphorus, total solids). Optional components comprise ice thickness and ice cover (thermodynamics); dissolved oxygen, phytoplankton, COD and BOD, organic N, nitrates, ammonia, organic P, inorganic P, a toxin, and floating and suspended oil (water quality); and COD and BOD, organic N, nitrates, ammonia, organic P, inorganic P, toxin and oil (sediment quality). P = phosphorus; N = nitrogen; COD = chemical oxygen demand; BOD = biochemical oxygen demand.
4.3 Spatial and temporal resolution
Appropriate spatial and temporal modelling scales depend on the resolution of the data
and boundary conditions available for model identification, the time available for
achieving results, and the scale at which model forecasts are required.
For river modelling, the river is represented as a series of well-mixed control volumes
(called ‘cells’) between which pollution transport processes are simulated. This concept is
illustrated in Figure 4.3. Each cell must be prescribed certain spatially-varying parameters
which depend on the transport model selected. For this purpose, adjacent cells may be
grouped together into reaches, within which the spatially-varying parameters are taken to
be constant. The lengths of cells and reaches, and their properties, are input to a
spreadsheet. The downstream boundary of each cell is specified in terms of kilometres
downstream from a datum. The lake models work on the same control-volume principle,
except that they can be two dimensional, able to represent the vertical variation in water
quality due to effects of thermal stratification as well as length-wise variations (Figure
4.4). The lake models’ spatially varying parameters are specified for each cell.
The output time-step is defined by the user, and may be anything greater than one minute.
The available input data will be automatically interpolated to this time-scale, using either
linear interpolation, a cubic-spline or a step-function (whereby the input will not change
until that time when the next data point is available), as chosen by the user. In general,
whatever interpolation model is used, the temporal resolution of the output will be
restricted by that of the input data. The time-domain of the simulation is specified by the
modeller, constrained only by computer memory and the available time-domain of the
boundary conditions.
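The three interpolation options can be sketched as follows (invented data; WaterRAT itself implements these in its Visual Basic and DLL code rather than Python):

```python
import numpy as np
from scipy.interpolate import CubicSpline

t_data = np.array([0.0, 6.0, 12.0, 18.0, 24.0])   # observation times (h)
c_data = np.array([5.0, 7.5, 9.0, 6.5, 5.5])      # observed values
t_out = np.arange(0.0, 24.01, 0.25)               # 15 min model time-step

linear = np.interp(t_out, t_data, c_data)         # linear interpolation
spline = CubicSpline(t_data, c_data)(t_out)       # cubic spline
idx = np.searchsorted(t_data, t_out, side='right') - 1
step = c_data[np.clip(idx, 0, len(c_data) - 1)]   # hold-last-value step
```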
The numerical integration in the time domain uses a Fehlberg adaptive time-step scheme
(see Chapra and Canale 1998). This ensures near-optimum speed of computation and a
numerical error in the time domain which is guaranteed to be below a specified
maximum. This is an important feature in the Monte Carlo simulation, where randomly
sampled inputs lead to numerical stability and accuracy criteria which can vary widely,
both over the time-domain and from one model realisation to the next (e.g. Chapter 5).
The user can vary the numerical tolerance so that precision is not inordinately high given
the overall model uncertainty, and so that the solution speed is not inordinately low given
the computational constraints. Spatial numerical errors and numerical dispersion are not
handled automatically - see the discussion in section 5.4.
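The principle of reconciling solver tolerance with overall model precision can be illustrated with any adaptive Runge-Kutta solver. The sketch below uses scipy's RK45 (a Dormand-Prince pair rather than Fehlberg's, but with the same style of step-size control) on an illustrative BOD-dissolved oxygen pair:

```python
from scipy.integrate import solve_ivp

def rhs(t, y, kd=0.3 / 86400.0, ka=0.6 / 86400.0, Cos=9.1):
    """Streeter-Phelps style BOD (L) and dissolved oxygen (C) kinetics,
    standing in for one cell model."""
    L, C = y
    return [-kd * L, -kd * L + ka * (Cos - C)]

# rtol plays the role of the user-set tolerance: loosening it trades
# accuracy for speed, which is defensible only while the numerical error
# stays well below the overall model uncertainty.
for rtol in (1e-3, 1e-6):
    sol = solve_ivp(rhs, (0.0, 5 * 86400.0), [20.0, 9.1],
                    method='RK45', rtol=rtol, atol=1e-8)
    print(rtol, sol.y[1, -1], sol.nfev)   # final DO and rhs evaluations
```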
Figure 4.3 Cells-in-series concept of a river with sediment interactions: a headwater feeds a chain of well-mixed cells linked by advection-dispersion, each subject to sources and losses and sediment interaction, with additional cells continuing to the downstream boundary.

Figure 4.4 Cells-in-series concept of a stratified lake with sediment interactions: segments 1 to 4 in series, each divided vertically into an epilimnion and a hypolimnion, with the same advection-dispersion, source and loss, and sediment interaction processes.
4.4 Boundary conditions, initial conditions and model parameters
Dynamic boundary conditions include the meteorological, source and abstraction data.
All meteorology time-series (rainfall, evaporation, dew-point, air temperature, wind
speed, and surface light intensity are needed as inputs to various alternative models) are
input on a single spreadsheet, and are assumed to be uniform over the river or lake. Any
number of pollution/flow sources can be input (subject to computer memory). Each is
input on a separate spreadsheet with the river kilometer (for river models) or the cell
number (for lake models) specified. The format of this input is illustrated in Figure 4.5
(where the flow, BOD and ammonium are being treated as uncertain – see below). If
negative flows are entered, then sources are taken as losses, and any associated pollution
loads are neglected. For the river models, distributed sources may also be specified,
whereby the loads are evenly distributed between specified upstream and downstream
river kilometers.
Static boundary conditions are specified for each cell. For the river models these are:
channel cross-section shape; a leakage rate that is specified as constant or proportional to
water volume in that cell; sediment oxygen demand or active sediment area (depending
on whether the sediment-water interaction module is being used); and hydraulic or
routing parameters depending on which solute transport model is being used. For the lake
models, the bathymetry is defined by a volume-level relationship for the lake, plus the
ratios of volumes (assumed to be constant) of the lake’s conceptual cells. Hydraulic or
routing parameters are not needed for the level-pool lake routing method.
Initial conditions can either be entered via a spreadsheet as a model input, or they can
be estimated using a specified ‘warm-up’ period. During this period the dynamic
boundary conditions are assumed steady-state at those of the specified start time of the
simulation. For systems where the response time is significantly smaller than the
specified output time-step, this latter option is likely to be a sufficient approximation.
Model parameter values are entered on another spreadsheet. Templates are given for each
alternative model structure, making it clear which parameters are relevant for each option.
Figure 4.5 A typical format of time-series data input.
All model parameters, initial conditions, static boundary conditions, and point and
distributed loads can be considered as uncertain inputs. Prior to running the model, the
user signifies that an input is uncertain by specifying a maximum and minimum value
instead of an assumed value. For the time-series inputs, all entries in each time-series are
assumed to have the same level of uncertainty, and this may be specified as either
absolute or relative. For example, in Figure 4.5, all entries in the flow time-series have
prior uncertainties of ±30%, and all entries in the BOD time-series have prior
uncertainties of ±25%. Specified maximum and minimum values define uniform prior
distributions which are assumed independent of each other. Each distribution is
propagated to prediction uncertainty (Section 4.7), or included in the calibration or
sensitivity analysis. This means that the model calibration and predictions need not be
conditional on the precision and reliability of input data, and that the relative significance
of uncertainties in parameters and in other inputs can be revealed through sensitivity
analysis (Section 4.6).
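The uniform-prior convention is easily reproduced. As a sketch (function name and values illustrative), each uncertain input is sampled between its nominal value plus or minus its relative error, and the samples are then run through the model to propagate the uncertainty:

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_uniform_priors(nominal, rel_error, n):
    """Draw n joint samples from independent uniform priors defined by a
    nominal value and a relative error per input (e.g. 0.30 for +/-30%)."""
    nominal = np.asarray(nominal, dtype=float)
    rel_error = np.asarray(rel_error, dtype=float)
    lo = nominal * (1.0 - rel_error)
    hi = nominal * (1.0 + rel_error)
    return rng.uniform(lo, hi, size=(n, len(nominal)))

# e.g. a flow of 2 m3/s (+/-30%) and a BOD input of 15 mg/l (+/-25%)
samples = sample_uniform_priors([2.0, 15.0], [0.30, 0.25], n=1000)
```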
4.5 Calibration and optimisation
WaterRAT has a number of options for automatic model calibration. There are three
alternative algorithms – Monte Carlo using Latin Hypercube or stratified random
sampling (see MacKay et al. 1979), a Monte Carlo Markov Chain algorithm (based on the
Metropolis algorithm -see Chapter 2), and a genetic algorithm (based on the descriptions
of Beaseley et al. 1993). Each of these alternative methods has advantages depending on
the nature of the calibration task, and depending on whether uncertainty or sensitivity
analysis is required.
An important feature of WaterRAT is its capability to estimate the uncertainty in the
calibrated, optimal parameter set by defining a posterior parameter response surface (i.e.
values of probability mass for a large number of parameter set samples), using the Monte
Carlo algorithms. From this response surface, marginal probability distributions of
parameters and their co-variance matrix can easily be derived, and used for regional
sensitivity analysis and risk evaluations (e.g. Portielje et al. 2000). Figures 4.6a and 4.6b
give examples of bi-variate response surfaces of parameters n and Hs, and ks and τcr from
the phosphorus model calibration in Chapter 6.
In addition to the choice of calibration algorithms, WaterRAT provides flexibility in
choice of the objective function used to define the likelihood response surface. This
includes selection of which determinands, which cells and which time-periods are to be
incorporated into the objective function. Objective function definition also includes
alternative likelihood functions (see Sorooshian and Gupta 1995) based on assumptions
about the nature of the data errors, and more subjective estimators of likelihood based on
the HSY method of Hornberger and Spear (1980) and the GLUE methodology of Beven
and Binley (1992).
Figure 4.6 Example of bi-variate response surface output of WaterRAT (taken from the calibration described in section 6.4): (a) Manning's n against sediment depth Hs; (b) scour rate ks against critical shear stress τcr. The contours are interpolations of point values of probability mass.
Ideally, the objective function used to calculate the GLUE likelihood should reflect the
perceived data and model structure error so that, for example, a higher data error will lead
to an objective function which discriminates less between sets of factors. As discussed in
Chapter 1, achieving this objectively is difficult due to the complex and largely unknown
nature of the errors and, whatever approach is taken, some post-conditioning appraisal of
the OF specification is needed. This appraisal should take account of the number of sets
of factors which the OF has defined to be successful, and the performance of the
conditioned model with respect to the available data (i.e. are enough of the data explained
by the estimated uncertainty in the results?). On both these accounts, the adequacy of the
OF is dependant on the adequacy of the model structure and the data, and review of these
is a parallel part of uncertainty description and reduction. On this basis, an iterative
approach to model structure choice, data interpretation and OF design is suggested in
Figure 4.7.
Figure 4.7 An iterative approach to model conditioning with WaterRAT. Where no calibration data are available, knowledge-based output constraints may be used to synthesise some data; failing that, no conditioning is performed. Otherwise the objective function (OF) is designed and the data interpreted, and GLUE is applied. If conditioning yields an inadequate number of valid parameter sets, or an inadequate stochastic description of the data, then either a retrievable fault in the model structure is identified and the model structure redesigned, or the OF design and data interpretation are revisited; the procedure ends when both checks are satisfied.
In WaterRAT, calibration data (and their error bounds if these are entered) from the
monitoring points are entered into spreadsheets (one spreadsheet per monitoring point).
During automatic calibration, WaterRAT will search these entries for data that is included
in the specified objective functions. The relevant observed data (plus any specified error
bounds) are included in the graphical report of calibrated model output, so that the
success of the calibration can be visualised.
Using the calibration algorithms introduced above, pollution sources and other boundary
conditions can be optimised to meet water quality targets, defined as “pass or fail”
objective functions. Using the uncertainty analysis capability allows the risk of failing to
meet these targets (due to uncertainty in the other inputs) to be evaluated for different
intervention options. This is demonstrated in Chapter 7.
The observed water quality data on which the model is to be conditioned are input as
time-series in a similar manner to the pollution load inputs (although no synchronisation
of this data is required). To allow for the uncertainties introduced by sampling error, and
other errors, the data uncertainty can be specified. It is specified individually for each
data point as a relative or absolute error bound, so that the errors are taken as uniform and
independent. This uncertainty can then be employed in the calculation of the objective
function used to condition the model (see sections 7.2.2-7.2.3).
Unless a series of sample values has been provided in the appropriate spreadsheet, the
modeller is required to prescribe distributions or mean values for the model parameters.
The parameters pertinent to the chosen model structure are listed in a reference
spreadsheet together with default upper and lower bounds which are based on the
modelling literature (see McIntyre and Zeng 2002).
4.6 Multi-objective analysis
WaterRAT allows up to four objective functions to be simultaneously calculated during
calibration. Multi-objective analysis also allows the sensitivities of different modelling
criteria to be simultaneously evaluated and compared (e.g. Bastidas et al. 1999). Also, it
has been shown that using multiple objective functions for calibration can indicate model
equation error and the resulting prediction uncertainty (e.g. Gupta et al. 1998). For
example, in calibration of a model representing interactions between biochemical oxygen
demand (BOD) and dissolved oxygen (DO), this could reveal the disparity between the
optimum model of DO, and that of BOD (McIntyre et al. 2001). Such a disparity would
indicate a fault in the model’s representation of the interactions between the two.
Similarly, a marked difference between optimal parameter values identified using
seasonally-exclusive objective functions, would indicate a misrepresentation of seasonal
dynamics.
Following calibration using multiple objective functions, WaterRAT can filter out the
Pareto-optimal solutions. These are the solutions that provide a valid compromise
between the alternative objectives, and the variability of the Pareto-optimal solutions
represents the disparity of the objectives. If the objectives should be commensurate (as in
the BOD-DO model calibration example above) then disparity indicates an error either in
the data or in the model equations. On the other hand, assuming errors are relatively
minor, WaterRAT can be used to expose the necessary trade-offs between different
management objectives, and to explore acceptable compromises. For example, the trade-
off between maximising abstractions and minimising risk to downstream chemical status
(both of which can be formulated in a simple manner into objective functions) can be
assessed. WaterRAT, therefore, has potential application to catchment management
planning (e.g. UK Environment Agency 2001b, 2002).
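Filtering for Pareto-optimality is a small computation in itself; a sketch (assuming all objective functions are to be minimised) is:

```python
import numpy as np

def pareto_filter(objectives):
    """Return a boolean mask of Pareto-optimal rows: a solution is kept
    unless some other solution is at least as good in every objective and
    strictly better in at least one (all objectives minimised)."""
    F = np.asarray(objectives, dtype=float)
    keep = np.ones(len(F), dtype=bool)
    for i in range(len(F)):
        dominated = np.all(F <= F[i], axis=1) & np.any(F < F[i], axis=1)
        if dominated.any():
            keep[i] = False
    return keep

# e.g. two non-commensurate OFs (BOD fit, DO fit) over Monte Carlo samples
F = np.random.default_rng(2).random((500, 2))
front = F[pareto_filter(F)]
```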
4.7 Sensitivity analysis
Sensitivity analysis is implicit in model conditioning (in which the sensitivities of the OFs
to the inputs are explored). However, using sensitivity analysis to its full benefit, for
example identifying the key causes of pollution, requires that measures of sensitivity are
evaluated and reported explicitly. Also, sensitivity analysis can be used as a screening-
level approach, whereby the parameter responses need not be rigorously evaluated, but
can be used to give useful indications of the main driving forces of the system.
WaterRAT contains a number of complementary approaches to sensitivity analysis.
The simplest option is first order sensitivity analysis. Using this, the effect of perturbing
each uncertain input between its specified upper and lower bound is reported for each
output variable as an absolute value, displayed in tabular form for any chosen date and
cell number within the domain of the simulation. Since the results are absolute measures
they give an approximation of the variations which might be observed given the input
perturbations, although this result is local to the chosen mean values of all other inputs.
To make the analysis more robust to local effects, it can be extended to a factorial
analysis (Henderson-Sellers and Henderson-Sellers 1993) which allows for two-factor
interactions.
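As a sketch of the underlying calculation (not WaterRAT's implementation; the model function and bounds below are hypothetical stand-ins), a first order one-at-a-time analysis perturbs each input between its bounds with all other inputs held at their mean values:

```python
import numpy as np

def first_order_sensitivity(model, lower, upper):
    """One-at-a-time perturbation: move each input between its bounds while
    holding the others at their mid-range values, and report the absolute
    change in the model output."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    mean = 0.5 * (lower + upper)
    effects = np.empty(len(mean))
    for k in range(len(mean)):
        low, high = mean.copy(), mean.copy()
        low[k], high[k] = lower[k], upper[k]
        effects[k] = abs(model(high) - model(low))
    return effects

# Stand-in model: a toy dissolved oxygen output driven by three inputs.
toy = lambda x: 8.0 - 0.5 * x[0] * np.exp(-x[1]) + 0.1 * x[2]
print(first_order_sensitivity(toy, [1.0, 0.1, 0.0], [3.0, 0.5, 2.0]))
```

As the text notes, this result is local to the chosen mean values; the factorial extension repeats the perturbations over combinations of the other inputs.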
The Monte Carlo-based calibration methods evaluate the response of an objective
function over the possible combinations of values of the input variables. Therefore these
methods can be used to report sensitivities which take account of the high order
interactions between variables which the factorial methods cannot, and which are not
centred around an arbitrary point in the parameter space. Furthermore, the data upon
which the objective function is based may be synthesised, so that sensitivity of
speculative or regulation-based objectives to the various model inputs can be evaluated.
For example, the Monte Carlo methods can be used to give an indication of which input
uncertainties are most likely to cause failure of future chemical status objectives.
As part of the Monte Carlo procedure, values of the in-river (or in-lake) calibration data
can be randomly sampled from within their specified error bounds. In essence, this means
that WaterRAT can measure the sensitivity of the objective function value to how well
that objective has been defined by the target data. As well as indicating priorities for
collection of calibration data (e.g. Chapter 7), this allows the significance of uncertainty
in future water quality objectives to be evaluated. For example, should there be
uncertainty in the level of dissolved oxygen needed to secure good ecological status, this
could be represented as error bounds on that target. The sensitivity analysis would then
indicate whether refining the definition of this objective (i.e. improving the chemico-
ecological model) would be a research priority.
One option for reporting the results of Monte Carlo-based sensitivity analysis is through
comparison of the post-calibration covariance matrix with the uncalibrated equivalent
(where it would generally be expected that a parameter to which the objective function
was sensitive would reduce in variance during calibration). WaterRAT also allows the
user to summarise the results of Monte Carlo-based calibration by the Kolmogorov-
Smirnov (KS) statistic. In this context, the KS statistic is the maximum distance
separating the calibrated marginal cumulative distributions of the factor values, and the
uncalibrated, uniform marginal distribution.
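A minimal sketch of this summary statistic, assuming the calibrated marginal distribution is carried as posterior weights attached to the sampled values (the names and numbers are illustrative, not from WaterRAT):

```python
import numpy as np

def ks_statistic(samples, weights, lower, upper):
    """Maximum distance between the weighted (calibrated) marginal CDF of one
    parameter and the uniform CDF implied by its prior bounds."""
    order = np.argsort(samples)
    x = samples[order]
    post_cdf = np.cumsum(weights[order]) / np.sum(weights)
    prior_cdf = (x - lower) / (upper - lower)
    return np.max(np.abs(post_cdf - prior_cdf))

# Illustrative use: a parameter to which the objective function is sensitive
# acquires non-uniform posterior weights, giving a large KS statistic.
rng = np.random.default_rng(0)
theta = rng.uniform(20.0, 200.0, 5000)            # stand-in prior range
w = np.exp(-0.5 * ((theta - 80.0) / 30.0) ** 2)   # stand-in posterior weights
print(ks_statistic(theta, w, 20.0, 200.0))
```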
It should be noted that review of KS statistics and variance-covariance matrices gives a
summary of the results of a Monte Carlo-based sensitivity analysis, which does not make
full use of the information available. To supplement this summary, scatter plots
(projections of the point estimates of OF onto a plane) can be used to view in detail the
variation of the OF over the range of each factor (e.g. Beven and Binley 1992, Freer et al.
1996). For example, this can be used to review skewness, peakedness and multi-modality
of the univariate response. Additionally, bi-variate plots of OF values allow response
surfaces to be visualised, potentially highlighting the non-Gaussian natures of posterior
distributions and non-linear dependencies between parameters. As examples, Figures 4.6a
and 4.6b show bi-variate distributions of parameters n and Hs, and ks and τcr from the
phosphorus model calibration in Chapter 6.
4.8 Prediction uncertainty
Whereas sensitivity analysis can highlight which uncertain inputs are most likely to
influence the model results, prediction of space and time-series is needed to show where
and when this influence is significant. WaterRAT offers three basic methods of
propagating uncertainty to model predictions, each of which can be applied to either the
uncalibrated or calibrated distributions of inputs. Using the first order-second moment
method (see Tung 1996) and Rosenblueth’s two-point estimation method (Rosenblueth
1981, see Tung 1996), the propagation is based on the prior or calibrated covariance
matrix and mean input values. Using Monte Carlo sampling, a specified number of
samples are taken either from the prior uniform distributions of inputs, or from the sets
pre-sampled during calibration. In this latter case, the relative probability of the model
result obtained from each sample is the relative probability of that sample (as calculated
during calibration). In the case that multiple measures of posterior probability have been
used during calibration, one may be chosen to define the uncertainty propagated to
predictions, or two or more may be combined in a Bayesian or possibilistic manner.
Alternatively, all Pareto-optimal solutions may be regarded as equally likely and
propagated on this basis.
The aforementioned methods of uncertainty propagation use probability theory to derive a
probability distribution of each determinand at each time-step and cell. Using the first
order-second moment method, only the first and second moments are computed, and
confidence limits are then calculated assuming either a normal or log-normal distribution.
Using Rosenblueth’s method, the same distributional assumptions are employed, although
the first three moments are computed, and if the skewness is found to be negative, then an
inverted log-normal assumption may be used. An obvious limitation of these assumptions
is that estimates of the extreme percentiles may be unreliable, and that the constraint of
non-negative concentration is neglected. Using Monte Carlo-derived results, no
assumptions need to be made about the probability distribution of results. Instead, a
histogram approximating the true shape of the probability distribution at each time-step
and cell is derived, and used to compute percentiles (although, depending on available
computer memory, there may be limitations on the number of time-steps and cells at
which the full set of Monte Carlo output data can be stored).
The use of probability theory for the derivation of output confidence limits assumes that
the calibration has produced valid estimates of probability mass for each sampled set of
model inputs. This may be in doubt, considering the inevitable subjectivity and
assumptions used in defining the likelihood function or the GLUE likelihood measure,
and limitations in sampling frequency (especially true for the Pareto-optimal set of
solutions which may be a very sparse sample of the Pareto-optimal population – see
Fonseca and Fleming 1995). Given these limitations, a less ambitious and more liberal
representation of uncertainty may be preferred. WaterRAT offers this, using the rules of
possibility (Zadeh 1977, Wierman 1996) applied within the Monte Carlo procedure.
Using these, only a possible range of outputs at each time-step and cell are reported,
defined by the maximum and minimum values (for each time-step and cell) recorded
during the Monte Carlo-based uncertainty propagation. In general, this gives much more
significance to extreme values. Again, this method may be used pre or post calibration.
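The two ways of summarising the Monte Carlo output can be sketched as follows. This is an illustrative fragment, assuming the realisations of one determinand at one cell and time-step are held in an array with the calibration weights alongside; the names are hypothetical.

```python
import numpy as np

def weighted_percentiles(outputs, weights, q=(5.0, 95.0)):
    """Percentiles of a Monte Carlo output sample, weighting each realisation
    by its posterior probability from calibration."""
    order = np.argsort(outputs)
    cdf = np.cumsum(weights[order]) / np.sum(weights)
    return np.interp(np.asarray(q) / 100.0, cdf, outputs[order])

def possible_range(outputs):
    """Possibilistic bounds: the extremes recorded over all realisations."""
    return float(outputs.min()), float(outputs.max())

# Illustrative use: realisations of one determinand at one cell and time-step.
rng = np.random.default_rng(2)
conc = rng.lognormal(mean=1.0, sigma=0.4, size=6561)
w = rng.random(6561)
print("90% confidence limits:", weighted_percentiles(conc, w))
print("possible range:", possible_range(conc))
```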
4.9 Output
Output data are stored in text files, processed by WaterRAT’s Visual Basic modules and
viewed in a series of spreadsheets and graphs. Results can be displayed as:
• Time-series for any state variable at any cell, showing the mean, and upper and
lower percentiles (or possible ranges) plus any relevant observed data.
• Spatial variation of any state variable on any date, showing the mean and upper
and lower percentiles (or possible ranges) plus any relevant observed data (Figure
4.8 is an example from a river modelling study).
• The estimated probability density function and cumulative density function for
any variable, date and cell.
• For each determinand, the modelled probability of failing to meet a specified
water quality target within a specified stretch of river (integrating the uncertainty
and variability over time).
• For any determinand, the modelled probability of failing to meet a specified water
quality target, plotted against the value of any uncertain input.
• A list of the sampled values of all uncertain inputs following calibration, with
associated relative probabilities. This list can be used to illustrate parameter
response surfaces (e.g. Figures 4.7a and 4.7b).
• The covariance matrix of calibrated inputs derived using one selected objective
function.
• A list of the KS statistic, defined by up to four different objective functions, for
all uncertain inputs.
• A list of the Pareto-optimal solutions following Monte Carlo-based calibration
using multiple objective functions.
[Figure omitted: spatial variation of nitrate, Cni (mgN/L), on 15th October, with probability density and cumulative density distributions at chainages 5 km, 15 km and 25 km; each panel shows the mean, the 90% confidence limits and the observed data.]
Figure 4.8 Example of graphical output of WaterRAT
4.10 WaterRAT review
4.10.1 General limitations
The overriding limitation of WaterRAT is that it is restricted to evaluation of single rivers
and lakes rather than systems of rivers and the wider catchment. Groups of pollution
sources must be represented by a point or distributed load at the river boundary, rather
than discriminating between types of origin, and so scope for pollution management is
restricted. For example, without an integrated pollution load model, only the relative
significance of different sewers and tributaries can be assessed rather than the pollution
load components. While various pollution load models were developed as part of
TOPLEM (Qinghua University 2001), they were not integrated with WaterRAT, so that
the model identification and sensitivity analysis methods described above could not be
applied to the ‘whole’ system.
A further limitation of WaterRAT is the restricted number of river water quality models
available. While the provided selection of model structures gives some flexibility in
approach and allows model structure uncertainty to be explored somewhat, and the choice
of determinands allows a variety of tasks to be considered, there is a range of river
models that could be added to extend and strengthen these virtues. This could include
both more complex formulations, for example to look at new transient transport models
(e.g. Lees et al. 2000, Sincock and Lees 2002), and empirical formulations, for example
statistical models, static regression models (e.g. Robson and Neal 1997) and time-series
models (e.g. Whitehead et al. 1997b). For application to UK catchment management
problems, new state variables would be justified, for example coliforms and pesticides
(see DEFRA 2002). Even for the Hun River study, around which the tool was designed,
the importance of metals and organic toxins (Xianxin and Yongjiu 1991) is not reflected
in the nominal toxic substance model. The framework has been developed envisaging that
the library of models will be extended to include improved models – for example, a priority
is to include improved pollutant transport using the aggregated dead zone model and/or
transient storage model (see Young and Wallis 1993) – and to meet the demands of
future modelling tasks, for example higher dimensionality, coupling of groundwater
models and GIS interfaces.
Another potentially valuable improvement to WaterRAT would be the provision of a
more powerful genetic algorithm, which could be used to estimate parameter
uncertainties (e.g. Vrugt et al. 2003), as opposed to the existing version which is designed
to converge to a global optimum with no significant representation of uncertainty.
Development of an uncertainty analysis capability for WaterRAT's genetic algorithm is
considered by Lai (2002). Although WaterRAT includes the Markov chain Monte Carlo
method that was demonstrated in Chapter 2, its value relative to GLUE was not proven
within this research.
4.10.2 Critical comparison with alternative modelling tools
WaterRAT is now compared with three other, prominent tools for water quality
modelling, model uncertainty analysis and risk-based pollution management: QUAL2E-
UNCAS (Brown and Barnwell 1987), SIMCAT (UK Environment Agency 2001a) and
DESERT (Ivanov et al. 1996, de Marchi et al. 1999). These tools were introduced in
Chapters 1 and 3. All of them are more advanced than WaterRAT in that they have
relatively advanced Windows interfaces, and are able to simulate systems of rivers. All are
the same as WaterRAT in that they do not extend to modelling catchment runoff,
groundwater or urban sewerage systems but represent their effects as point or distributed
sources to the river(s). All use Monte Carlo simulation to represent uncertainty in some
model parameters and pollution loads.
QUAL2E-UNCAS has various model structure options, including two choices of
hydraulic model (similar to those in WaterRAT), light extinction models and aeration
formulae (for both of which there is no choice in WaterRAT). It does not have sediment-
water interaction, oil or ice modelling options but has all the other state variables of
WaterRAT, and in addition it models nitrites and coliforms. It allows a slightly more
restricted representation of channel cross-sectional shape than WaterRAT (see Chapter 5).
QUAL2E-UNCAS can only model steady-state loads, although it can model water quality
dynamics due to meteorological diurnal variability. Although it allows propagation of
uncertainty using Monte Carlo or first order methods, it does not have a facility for
automatic sensitivity analysis, conditioning or optimisation algorithms. It does not allow
task-specific models to be developed and inserted into its framework, nor does it allow
user-specified numerical tolerance.
SIMCAT has no choices for model structure, instead using fixed, relatively simple
formulations. It models carbonaceous BOD, ammonia, a conservative substance and
dissolved oxygen under steady-state conditions. Effects of sediment are not explicitly
considered. It has an advanced calibration routine which automatically balances flow in
each reach by adding distributed sources/losses, and which identifies parameter values on
a reach-by-reach basis. At each reach it uses sampled observations to identify samples of
parameter values, thus representing uncertainty in parameters due to the effects of data
sampling error. The key assumption is that all the uncertainty in the model can be
represented by the parameter uncertainty arising from the calibration data sampling error.
It also estimates uncertainty in calculated percentiles. Like WaterRAT, SIMCAT allows
pollution load variability to be represented by sampling from distributions (log-normal or
normal in SIMCAT; uniform in WaterRAT), or as a series of samples. Like QUAL2E-
UNCAS, SIMCAT does not enable automatic optimisation of pollution sources, nor allow
user-specified models to be used, nor a user-specified numerical tolerance.
DESERT has taken the extreme approach to providing choice of water quality model
structure and determinands: an interface in which the modeller can write the
formulations. A number of pre-written models are supplied, together with an extensive
choice of hydrodynamic/transport modules, ranging in complexity from fully mixed
steady-state reaches to a diffusion wave model. The modeller can also write
cost functions for pollution load reductions and optimise these against water quality
constraints using dynamic programming. Automatic calibration in DESERT is performed
using the HSY algorithm introduced in Chapter 2 (i.e. the same as WaterRAT’s
possibilistic use of GLUE), although uncertainty in boundary conditions is not
represented during the calibration, the calibration must be done in a steady-state period,
and the dependencies between parameter distributions are not considered. It is not clear
from the documentation whether the optimisation using dynamic programming takes into
account parameter uncertainty.
It may be said that each of these tools has its own merits – QUAL2E-UNCAS has ready-
made mechanistic models and basic uncertainty analysis modules, SIMCAT focuses on
ease of use and robustness to data uncertainty, and DESERT has extensive modelling
flexibility and optimisation capabilities. WaterRAT was an attempt to integrate such
features in a manner consistent with the remit of the TOPLEM project and the particular
needs of the end-users in Shenyang. The discussion of possible developments of
WaterRAT in Chapter 8 will draw from the examples set by these other tools.
4.11 Summary
A surface water quality modelling tool has been described which provides a framework
for the extensive analysis of uncertainty and associated risk. This includes established
methods of uncertainty estimation including first order methods, Regional Sensitivity
Analysis, Generalised Likelihood Uncertainty Estimation, and multi-objective
optimisation. Using these methods, the model can be conditioned to include the effect of
all sources of error, and the uncertainty can be propagated to stochastic forecasts.
Additionally, the sensitivity of model outputs to inputs can be explored thoroughly to
indicate the key driving forces (e.g. pollution sources) behind water quality. This can be
done using first order, factorial or Monte Carlo methods, at a screening level or part of a
more rigorous scenario investigation. Given a stochastic forecast, the modeller can
explore the risk of failing to achieve regulatory water quality criteria.
The tool includes a library of semi-distributed one-dimensional river models allowing
some flexibility in specifying model structure, and some capacity to explore the
uncertainty associated with model structure. Within the library, the tool has the capability
to model organic pollution, chlorophyll-a, dissolved oxygen, various nutrients, a toxic
substance, floating and suspended oil, and total suspended solids. This is supported by
sediment models which include biochemical and physical sediment-water interactions. A
thermodynamic model is available which models heat fluxes from the atmosphere and
sediment, and includes simulation of ice. Pollution transport is modelled using the
advection-dispersion equation supported by two alternative hydraulic models (a quasi-
steady friction formula and a non-linear store). The framework has been developed
envisaging that these libraries will be extended to meet the demands of future modelling
tasks, for example higher dimensionality, coupling of groundwater models, dead-zone
analysis etc.
An important limitation of this tool is that the measurement of uncertainty, and therefore
the risk that should be associated with using a chosen model, is based partly on the
subjective judgement of the modeller with respect to the relative reliability of the model
and the observed data. This is because unknown and irresoluble model structure error and
data bias always exist to some extent in practical water quality modelling problems.
Giving general guidelines for uncertainty definition does not seem possible given the
variety of prospective case studies.
5. Numerical efficiency in Monte Carlo simulations – a case study of a
river thermodynamic model
Trade-offs between precision of numerical solutions to deterministic models of the
environment, and the number of model realisations achievable within a framework of
Monte Carlo simulation, are investigated and discussed. A case study of a model of river
thermodynamics is employed. It is shown that the tractability of Monte Carlo simulation
relies on adaptation of the numerical solution time-step, giving results with a guaranteed
error in the time domain as well as near-optimum speed of calibration under any chosen
accuracy criteria. Time-step control is implemented using two adaptive Runge-Kutta
methods - a second order scheme with first order error estimator, and an embedded
fourth-fifth order scheme. In the case study, where the effects of sparse and imprecise
data dominate the overall modelling error, both the schemes appear adequate. However,
the higher order scheme is concluded to be generally more reliable and efficient, and has
wide potential to improve the value of applying the Monte Carlo method to
environmental simulation. The problem of reconciling spatial error with the specified
temporal error is discussed.
5.1 Introduction
Despite the increasing availability of parallel processing facilities, computing capacity
remains an important limitation on the practice of Monte Carlo simulation, due to the large
number of model and data realisations which may be required. Consequently, the
numerical efficiency of the simulation can be a key factor in the overall feasibility and
value of the modelling exercise. In particular, the trade-off between numerical error (due
to the numerical approximations used in the simulation) and computing time should not
be arbitrary; rather it should be made objectively and in light of the particular modelling
task. Arguably, Monte Carlo simulation loses much of its analytical value if insufficient
attention is paid to this trade-off.
This chapter focuses on the need for effective Monte Carlo simulation when model
parameters must be calibrated to a sparse set of data, and on the benefits to be gained by
awareness of underlying numerical issues. This is pursued through studying dynamic
river water temperature simulation of the Hun River. This study provides a good setting
for the investigation because of the high a priori uncertainty in many of the model
parameters, the limited amount of supporting data, and because of the arduousness of the
numerical solution to the model’s partial differential equations. Numerical efficiency of
the Monte Carlo-based calibration procedure is investigated through comparison of
alternative solution schemes with respect to spatial and temporal discretisation.
As part of the Hun River water quality modelling programme, a thermodynamic model is
required to simulate the day to day fluctuations in water temperature. The parameters of
this model are to be calibrated to water temperature data. The water temperature in the
Hun River was measured at four river cross-sections on one day every month from
October 1999 until March 2000 and again in June 2000. The frequency of measurement
was thus restricted due to resource constraints and difficult field circumstances. The four
sections, marked on Figure 1.4, are at river kilometers 44 (i.e. 44 km downstream of the
reservoir dam), 72, 135 and 185. For each measurement location and time, the
temperature was taken at one-third of the water depth at each of the quartiles of the river
width, then these three measurements were averaged. To represent the diurnal
temperature variation, the width-averaged temperature at river kilometer 72 was
measured every 4 hours on each monthly sampling day. For this numerical investigation,
the most interesting and challenging period is the winter period when warm wastewater
discharging to the Shenyang reach contrasts with sub-zero air temperatures, causing
exchange of heat which is rapid and of high numerical order.
5.2 The thermodynamic model
The Hun model is a one-dimensional dynamic model; that is, the water temperature is
assumed constant over the depth and width of the river, and only the longitudinal and
temporal variations are simulated. The following heat transfer processes are considered
(see Ashton 1986, Chapra 1997): advection and dispersion of the river flow; point
pollution sources; long wave radiation to and from the atmosphere, sky and surrounding
land; short-wave radiation from the sun; convection and conduction to and from the
atmosphere; conduction to and from the river bed; conduction to and from ice;
evaporation losses and condensation gains.
The transport of heat in the river is modelled using the control volume approach (Chapra
1997: p.192) whereby the river is conceptualised as a series of instantaneously mixed
cells, each of uniform temperature, flow and depth. The flow in the ith cell at time-step j,
Q(i,j), is assumed equal to the flow entering it from the upstream cell, Q(i-1,j), plus the
concurrent sources, Qs(i,j), minus the concurrent losses, Ql(i,j), and evaporation, Qev(i,j), in
that cell,
$$Q_{(i,j)} = Q_{(i-1,j)} + Q_{s(i,j)} - Q_{l(i,j)} - Q_{ev(i,j)} \qquad (5.1)$$
in which all terms have units m3s-1. The temperature of the water leaving each cell is
assumed equal to that within the cell, and so the rate of increase in temperature, due to
advective transport of heat only, is given by,
$$\left[\frac{d(VT_w)}{dt}\right]_{(i,j)} = Q_{(i-1,j)}\,T_{w(i-1,j)} - Q_{(i,j)}\,T_{w(i,j)} + Q_{s(i,j)}\,T_{p(i,j)} - Q_{l(i,j)}\,T_{w(i,j)} \qquad (5.2)$$
where the specific heat capacity and the density of the river and the pollution sources are
assumed constant; V is the volume of water in the cell (m3); Tw is the water temperature
(oC) in the cell; and Tp is the temperature of the pollution source (oC). The dispersive and
convective heat transfer between cells is assumed to be directly proportional to the
difference in temperature,
$$\left[\frac{d(VT_w)}{dt}\right]_{(i,j)} = D'_{(i,j)}\left(T_{w(i-1,j)} - T_{w(i,j)}\right) + D'_{(i+1,j)}\left(T_{w(i+1,j)} - T_{w(i,j)}\right) \qquad (5.3)$$
where D’ (m3s-1) is a dispersion coefficient, calculated as a function of water velocity and
depth (Chapra 1997: 245).
The other heat exchange processes are modelled to act uniformly across each cell, and are
based on the descriptions of Bras (1990) and Chapra (1997). The short-wave radiation
reaching the water (or ice) surface, fs (Wm-2) is calculated by,
$$f_s = s \cdot a_s\left(1 - 0.65\,c^2\right) \qquad (5.4)$$
where s (Wm-2) is the component of the daily average of the short-wave radiation which
is incident on the outside of the atmosphere above the subject site in a direction radial to
the earth; c is the effective cloud cover as a ratio of the area of open sky; and as is an
atmospheric transmission coefficient, the calculation of which is described by Bras (1990:
p.35). The calculation of long-wave radiation reaching the water (or ice) surface, fl (Wm-2),
is based on the Stefan-Boltzmann law,

$$f_l = B\left(a_l + 0.031\sqrt{e_{air}}\right)\left(T_a + 273\right)^4 - B\,m\left(T_w + 273\right)^4 \qquad (5.5)$$
where B is the Stefan-Boltzmann constant (Wm-2K-4); eair is the vapour pressure of the air
(mmHg); al is a transmission coefficient (dimensionless); Ta is the air temperature some
meters above river level (oC); m is a coefficient of emissivity of water = 0.97
(dimensionless); and Tw is the average water temperature (oC). The net heat gain due to
conduction and convection at the water-air interface, fc (Wm-2) is calculated as,
$$f_c = b\,k_W\left(T_a - T_w\right) \qquad (5.6)$$
where b is Bowen’s coefficient (0.47 mmHgK-1) and kW is a wind dependent heat transfer
coefficient (Wm-2mmHg-1) which is assumed to be described by the empirical
relationship,
$$k_W = 19 + 0.95\,W^2 \qquad (5.7)$$
where W is the average wind speed (ms-1) measured 7m above the water surface. Heat
loss due to evaporation (or gain due to condensation), fe (Wm-2) is calculated by Dalton’s
law, or when the air temperature is below 0 oC, by the Russian winter formula (Ashton
1986),
$$f_e = k_W\left(e_{airs} - e_{air}\right) \qquad \text{for } T_a > 0 \qquad (5.8a)$$
$$f_e = \left(6.04 + 0.263\left(T_w - T_a\right) + 2.95\,W\right)\left(e_{airs} - e_{air}\right) \qquad \text{for } T_a \le 0 \qquad (5.8b)$$
from which Qev in Equation 5.1 is calculated as,
$$Q_{ev} = \frac{f_e}{d_w\,l_w} \qquad (5.9)$$
where eairs is the saturation vapour pressure of the air (mmHg); dw is the density of water
(kgm-3); and lw is the latent heat of evaporation of water (Jkg-1). Conduction from the ice
surface fiw (Wm-2) is calculated using an effective heat transfer parameter, kiw (W m-2K-1)
(from Ashton 1986),
$$f_{iw} = k_{iw}\left(T_i - T_w\right) \qquad (5.10)$$
where Ti is the temperature of the ice at the water-ice interface = 0oC. Calculation of the
conduction from the sediment fsw (Wm-2) conceptualises a layer of sediment through
which the temperature varies linearly from that of the underlying sediment, Ts, to that of
the water, Tw. This concept is a simplification of the more theoretical sediment heat
gradient through a homogeneous sediment (see Hondzo and Stefan 1994). The heat flux
from the sediment is,
$$f_{sw} = k_{sw}\left(T_s - T_w\right) \qquad (5.11)$$
where ksw (Wm-2K-1) is an effective heat transfer parameter lumping together the
sediment-water transfer, the sediment thermal conductivity and the conceptual sediment
layer thickness.
Exchanges of heat directly between the atmosphere and the water are assumed to occur
only over the area of open water in the river cell. That is, in frozen conditions the
water is insulated from the air, and the ice reflects or absorbs all radiation. Conversely,
the exchange between the ice and the water only occurs over the iced area in that cell.
Thus, the rate of temperature increase by cell i at discrete time j is,
$$\left[\frac{d(VT_w)}{dt}\right]_{(i,j)} = \frac{1}{d_w s_w}\left[\left(\left(1-R_w\right)\left(f_s+f_l\right)-f_c-f_e\right)\left(1-A_i\right)A_w + f_{iw}\,A_i\,A_w + f_{sw}\,A_w\right]$$
$$\quad + \left[Q_{(i-1,j)}\,T_{w(i-1,j)} - Q_{(i,j)}\,T_{w(i,j)} + Q_{s(i,j)}\,T_{p(i,j)} - Q_{l(i,j)}\,T_{w(i,j)}\right]$$
$$\quad + \left[D'_{(i,j)}\left(T_{w(i-1,j)}-T_{w(i,j)}\right) + D'_{(i+1,j)}\left(T_{w(i+1,j)}-T_{w(i,j)}\right)\right] \qquad (5.12)$$
where Aw is the surface area of the water; Ai is the area of ice as a fraction of Aw; sw is the
specific heat capacity of water (Jkg-1K-1); and Rw is the water reflectance (dimensionless)
which is assumed the same for both short and long-wave radiation.
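For illustration, the sketch below condenses the right-hand side of Equation 5.12 for a single interior cell. It assumes all fluxes and geometric quantities have already been evaluated; the hypothetical names simply mirror the symbols in the text, and this is not the WaterRAT source.

```python
def cell_heat_balance(Tw_up, Tw, Tw_down, Tp, Q_up, Q, Qs, Ql,
                      D_up, D_down, fs, fl, fc, fe, fiw, fsw,
                      Aw, Ai, Rw, dw=1000.0, sw=4182.0):
    """d(V*Tw)/dt for one interior cell, after Equation 5.12.

    Surface fluxes (W m-2) act over the open-water fraction (1 - Ai) of the
    surface area Aw, ice-water exchange over the iced fraction Ai, and
    sediment exchange over the whole bed; advection and dispersion couple
    the cell to its upstream and downstream neighbours.
    """
    surface = (((1.0 - Rw) * (fs + fl) - fc - fe) * (1.0 - Ai) * Aw
               + fiw * Ai * Aw + fsw * Aw) / (dw * sw)
    advection = Q_up * Tw_up - Q * Tw + Qs * Tp - Ql * Tw
    dispersion = D_up * (Tw_up - Tw) + D_down * (Tw_down - Tw)
    return surface + advection + dispersion
```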
To predict the periods, locations and extent of ice cover, a simple ice model was included.
This is based on a heat balance approach which assumes linear heat gradients between the
air and the ice, and the ice and the water. Adapted from Shen and Chaing (1984) and Lal
and Shen (1991), the rate of increase of ice thickness is modelled as,
$$\frac{dH_i}{dt} = \frac{1}{d_i\,l_i}\left[\frac{T_i - T_a}{\dfrac{1}{k_{ai}} + \dfrac{H_i}{k_i}} - k_{iw}\left(T_w - T_i\right) - \left(1 - R_i\right)\left(f_s + f_l\right)\right] \qquad (5.13)$$
where Hi is the thickness of ice (m); di is the density of ice (kgm-3); li is the latent heat
of melting of ice (Jkg-1); ki is the conductivity of ice (Wm-1K-1); kai is an air-ice heat
exchange rate (Wm-2K-1); and Ri is the reflectance of the ice. Ice growth is assumed not to
be initiated in cells with flow velocity above 0.6 ms-1 (Ashton 1979). This simple ice
model does not explicitly represent a number of physical ice processes such as ice floe
transport, and frazil ice formation and deposition (see Ashton 1986). In particular, the
latter omission means that the modelled water temperature may fall slightly below zero,
as heat loss to the sub-zero atmosphere due to Equation 5.12 is not instantly transformed
into frazil ice production; rather there is a lag determined by parameter kiw through which
it is gradually transformed into increased ice thickness. The spread of ice across each
river cell cannot be represented conceptually because of the one-dimensional limitation,
so instead an empirical relationship is proposed. For Hi > 0,
$$A_i = \exp\left(-\frac{k_{Aice}}{H_i}\right) \qquad (5.14)$$
where kAice (m) is an empirical parameter, so that Ai rises to a maximum of unity as the ice
thickness becomes high.
5.3 Monte Carlo simulation
The model parameters included in Equations 5.4 to 5.14 may be classified according to
knowledge of their values prior to model calibration:
1. physically based parameters of which the values are known, or can be measured with
adequate certainty (includes m, b, B, di, li, ki, dw, sw, lw, al);
2. conceptual or spatially averaged parameters of which the values cannot be precisely
measured, but for which some range of possible values, based on measurement or prior
modelling experience, can be used to constrain the a priori parameter space (includes c,
Rw, Ri);
3. conceptual or empirical parameters of which there is no prior knowledge, and of which
the prior ranges should be made wide enough that they do not constrain the a posteriori
parameter space (includes kai, kiw, ksw, kAice).
According to this classification, the parameters and their assumed values or ranges are
listed in Table 5.1 (from Ashton 1986, Lal and Shen 1993, Chapra 1997). For the purpose
of this study, it is assumed that the reach characteristics used for calculating water
velocity v and water surface area Aw, and the formula used for calculating dispersion D'
(Chapra 1997: 245), are accurate.
A simple automatic method of model calibration and uncertainty analysis is random
sampling of the a priori parameter ranges, i.e. Monte Carlo simulation. Using this
method, the 'goodness of fit' (of model results to observed system response) which is
associated with each sampled parameter set is measured according to a pre-specified
objective function. If this function is suitably specified, the array of objective function
values obtained from the calibration can be regarded as point values of probability mass
from the a posteriori parameter distribution, and of the associated model result.
Confidence limits on the model parameters and the model outputs can then be derived.
Such an approach to model uncertainty estimation is justified by Beven and Binley (1992)
in the context of their Generalised Likelihood Uncertainty Estimation (GLUE). The
GLUE approach is used in the current study, using the objective function defined in
Equation 5.15:
$$L_k = K\left[\sum_{m=1}^{63}\left(T_w - \hat{T}_w\right)^2_{m,k}\right]^{-K_r} \qquad (5.15)$$
Table 5.1 Thermodynamic parameter values and a priori ranges

Parameter   Description                           Value or range   Units       Ref.
Parameter classification 1 - Values assumed known with certainty
m           emissivity of water                   0.97             none        C
B           Stefan-Boltzmann constant             56.63 × 10-9     Wm-2K-4     C
b           Bowen's coefficient                   0.47             mmHgK-1     C
di          density of ice                        917              kgm-3       L
li          latent heat of melting of ice         0.334 × 10^6     Jkg-1       L
ki          conductivity of ice                   2.24             Wm-1K-1     L
dw          density of water                      1000             kgm-3       L
sw          specific heat capacity of water       4182             Jkg-1K-1    L
lw          latent heat of evaporation of water   2.5 × 10^6       Jkg-1       L
al          long-wave attenuation coefficient     0.6              none        C
Parameter classification 2 - Possible ranges supposed on some prior basis
c           cloudiness                            0 - 1            none
Rw          reflectivity of water                 0 - 0.1          none        A
Ri          reflectivity of ice                   0.2 - 0.75       none        A
Parameter classification 3 - Possible ranges supposed without prior basis
kai         transfer from air to ice              5 - 50           Wm-2K-1
kiw         transfer from ice to water            20 - 200         Wm-2K-1
ksw         transfer from sediment to water       20 - 200         Wm-2K-1
kAice       ice coverage coefficient              0.05 - 0.15      m

C = Chapra 1997; L = Lal and Shen 1993; A = Ashton 1986
where Lk is the posterior probability of the kth sampled parameter set; $(T_w - \hat{T}_w)_{m,k}$ is the
mth of the 63 residuals between observed water temperature $T_w$ and modelled water
temperature $\hat{T}_w$ obtained from the kth sampled parameter set; Kr is a root constant which
defines the variance of the a posteriori distribution (taken as equal to 1 for this
application), and K is a standardisation constant such that the sum of all Lk values is equal
to unity. Equation 5.15 defines a statistically-based likelihood function which assumes
that the data errors are unbiased, normally distributed, constant and uncorrelated over
time and space; and that the model equation errors are small in comparison. The latter
assumption significantly simplifies the uncertainty analysis because we can neglect the
hypothesis that Equations 5.1 to 5.14 are wrong in structure, and define a series of trial
models by sampling from the a priori joint parameter distribution. There is no allowance
for numerical truncation error in the uncertainty analysis, as our aim is to manage this
error to be as large as possible without having overall significance to model reliability.
The a priori uncertainty in seven model parameters is defined by their ranges in Table
5.1; no assumption is made about their correlation. 6561 samples from this distribution
are taken using stratified random sampling (McKay et al. 1979) ($6561 = 3^8$, with three
stratifications for each parameter), and for the purpose of this investigation it is assumed
that this gives adequate coverage of the parameter space.
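As a sketch of this procedure (illustrative Python, not the thesis software; `simulate`, the bounds and the toy model are hypothetical stand-ins), the stratified sampling and the likelihood of Equation 5.15 with Kr = 1 might be coded as:

```python
import numpy as np

def glue_calibration(simulate, observed, lower, upper, strata=3, seed=0):
    """Stratified Monte Carlo sampling of the prior parameter space, with
    each sampled set weighted by the likelihood of Equation 5.15 (Kr = 1)."""
    n_par = len(lower)
    # One cell per combination of strata: strata**n_par parameter sets.
    cells = np.stack(np.meshgrid(*[np.arange(strata)] * n_par,
                                 indexing="ij"), axis=-1).reshape(-1, n_par)
    rng = np.random.default_rng(seed)
    u = (cells + rng.random(cells.shape)) / strata   # one draw per cell
    thetas = lower + u * (upper - lower)
    sse = np.array([np.sum((observed - simulate(t)) ** 2) for t in thetas])
    weights = sse ** -1.0                            # L_k = K * (SSE)^(-Kr)
    return thetas, weights / weights.sum()           # K normalises to unity

# Toy use: a two-parameter stand-in model and synthetic observations.
obs = np.linspace(5.0, 15.0, 63)
toy = lambda t: t[0] + t[1] * np.linspace(0.0, 1.0, 63)
thetas, w = glue_calibration(toy, obs, np.array([0.0, 0.0]),
                             np.array([10.0, 20.0]))
print(thetas[np.argmax(w)])   # highest-likelihood parameter set
```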
5.4 Numerical methods
For a finite difference model which includes both temporal and spatial variability (such as
our river model), accuracy and efficiency depend on the temporal and spatial grid sizes,
and the use of appropriate integration schemes. The simplest and most common approach
to implementing a finite difference solution is to specify a fixed grid size prior to the
simulation. However, consistently adequate accuracy demands that the fixed grid size be
designed such that the approximate solution is sufficiently close to the actual solution in
its most varying part, irrespective of the smoother regions where a much larger grid size
would achieve the required accuracy. Therefore, within any realisation of a model, the
preferred grid size is likely to vary significantly over the domain of integration. The
problem is compounded when using the Monte Carlo method, as the dynamics of the
simulation depend largely on the randomly sampled parameter values, and so the
preferred grid sizes are inherently random from one realisation to the next. To ensure
convergence of the solution using a single grid size for all realisations, the grid size will
need to be inordinately small. This often leads to a demand on computational resources
that is several orders of magnitude above the optimum. Such inefficiency is likely to be
compounded if there are discontinuities in the model structure (e.g. Equation 5.8), or in
the driving forces, since these generally require minute grid sizes to maintain the required
accuracy. Furthermore, since it is not usually known a priori, determination of an
appropriate fixed grid size may require additional human and computer effort. An
informal trial and error approach to this can leave us in some doubt as to the reliability of
final results.
4 The text of much of this section (5.4) is adapted from the work of Bethanna Jackson, in McIntyre et al. (2003).
All of these difficulties can be overcome by the use of an adaptive scheme. Such a
scheme automatically reduces the grid size when the truncation error is undesirably high,
and increases it when this error is unnecessarily low. A good implementation of an
adaptive scheme will not only guarantee a solution to a specified accuracy, but also
achieve this in a near-optimum time period. The step control mechanism should recognise
and handle any discontinuity in time, since crossing one will register as a large
error. It then recursively lowers the grid size until it lands sufficiently close to the
discontinuity. We note that automatic step size control has been identified by many
numerical analysts (e.g. Gear 1971, Gustafsson 1993) as the single most important
means of making an integration method efficient.
Since the accuracy of the river model is dependent on both the spatial and temporal
discretisation, a completely adaptive scheme should monitor and vary the grid in both
time and space. However, the difficulties associated with producing reliable and
representative measurements over the space dimensions for processes such as this make
adapting the spatial grid-lengths problematic. A second, more general concern is the
complex inter-relationships between different points in the spatial mesh. These tend to
produce errors of higher magnitude when a grid varies in space (Carver 1976). Due to
these complexities, it seems practical to have a predetermined spatial grid, and apply a
method that is adaptive in time only. This is achieved through transformation of the
system of partial differential equations to a system of ordinary differential equations
which step forward in time. This approach is generally known as the “method of lines”
(Berezin and Zhidkov 1965), commonly applied in the context of river modelling as the
“method of control volumes” (Chapra 1997: p.192). One complication with the method of
control volumes, solved using a backward-space difference method (implicit in Equation
5.12), is the presence of numerical dispersion, which tends to smooth out differences in
concentrations or temperature from one cell to the next (Chapra 1997: p.201). In this
example, we have chosen neither to calibrate a dispersion parameter nor to adjust the
calculated dispersion D’ to allow for numerical dispersion. Therefore it would be
expected that a compensation for numerical dispersion would be reflected in the
distribution of calibrated parameters.
To pre-determine an efficient solution scheme, we need a measure of computational cost
for comparative purposes. A convenient measure of this is the total number of function
evaluations, which, for the thermodynamic river model, is a constant quantity per time-
step. Therefore, the obvious aim is to minimise the number of time-steps. The maximum
permissible error (tolerance) must be specified and suitably related to the step size and
actual error per step. This relation gives rise to an expression we refer to as the step-
controller. Its aim is to monitor the error over each step, and estimate a new step length to
carry the integration as far forward in time as is possible within accuracy constraints.
The checks and step size predictions rely on manipulation of the lowest order error terms.
Since a numerical method of order p integrates a solution exactly up to the pth order, the
error term contributed over any step is a combination of all terms of higher order, that is,
the truncated terms in the Taylor’s series expansion of the function (Butcher 1987). If we
take one step forward to find the value of the dependent variable β at time $t_{(j)} = t_{(j-1)} + \Delta t_{(j)}$,
given a solution at β(j-1), the exact error contribution ζ(j) over the jth step (the local error)
is,

$$\zeta_{(j)} = \sum_{i=p+1}^{\infty} \frac{\Omega_i}{i!}\,\beta^{(i)}_{(j-1)}\,\Delta t_{(j)}^{\,i} \qquad (5.16)$$

where each Ωi is a problem-dependent error constant, and $\beta^{(i)}_{(j-1)}$ denotes the ith derivative
of β at time-step j-1.
If the step size ∆t is sufficiently small, the lowest order term remaining will dominate the
higher order terms to such an extent that they can be treated as negligible. This first term
is called the principal truncation error, and is directly proportional to the (p+1)th
derivative of the solution β, and the (p+1)th power of the step size. The local error can
therefore be approximated by,

$$\lambda_{(j)} = \Omega\,\beta^{(p+1)}_{(j-1)}\,\Delta t_{(j)}^{\,p+1} \qquad (5.17)$$
where $\Omega = \Omega_{p+1}/(p+1)!$ from Equation 5.16. Taking an arbitrary value for the maximum
permissible error over each step (denoted by ξ), the optimum step size at time $t_{(j)}$ is
approximately that for which $\lambda_{(j)} = \xi$. Denoting this "near-optimum" step size by $\Delta t'_{(j)}$, we
find from Equation 5.17 that
$$\xi = \Omega\,\beta^{(p+1)}_{(j-1)}\left(\Delta t'_{(j)}\right)^{p+1} = \frac{\lambda_{(j)}}{\Delta t_{(j)}^{\,p+1}}\left(\Delta t'_{(j)}\right)^{p+1} \;\Rightarrow\; \Delta t'_{(j)} = \Delta t_{(j)}\left(\frac{\xi}{\lambda_{(j)}}\right)^{\frac{1}{p+1}} \qquad (5.18)$$
The step controller takes the error estimate λ at the end of each step, and checks that this
is within the specified tolerance bounds (-ξ, ξ). If the estimate is outside these bounds, the
step is rejected and recalculated with a lower step size. On acceptance, the error estimate
is assumed to be equal to the principal truncation error. It is then assumed that β(p+1) does
not change significantly between t(j) and t(j+1). The optimal value of ∆t can then be
estimated by calculating the step size giving $\lambda_{(j)} = \xi$ subject to these assumptions. The
step controller becomes,
$$\Delta t_{(j)} = \Delta t_{(j-1)}\left(\frac{\xi}{\lambda_{(j-1)}}\right)^{\frac{1}{p+1}} \qquad (5.19)$$
This is generally multiplied by a safety factor η to reduce the chance of overestimating
the maximum permissible step size, and performing a rejected step. The resultant step
controller is
$$\Delta t_{(j)} = \eta\,\Delta t_{(j-1)}\left(\frac{\xi}{\lambda_{(j-1)}}\right)^{\frac{1}{p+1}} \qquad (5.20)$$
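As a hypothetical worked example: with $p = 1$, $\eta = 0.9$, $\xi = 0.2$ and $\lambda_{(j-1)} = 0.4$, Equation 5.20 gives $\Delta t_{(j)} = 0.9\,(0.2/0.4)^{1/2}\,\Delta t_{(j-1)} \approx 0.64\,\Delta t_{(j-1)}$, roughly halving the step; had the error estimate instead been $\lambda_{(j-1)} = 0.05$, the step would grow by a factor of about 1.8.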
Since the necessity for a numerical solution precludes ready availability of the true
solution, a cheap, sufficiently accurate estimation of the truncation error is sought.
Generally, this means finding a second numerical solution which is considerably more
accurate than the first, so that the difference between the two is a good approximation of
the truncation error. A viable approach involves using methods of order p and (p+1)
which share the same function evaluation points. The cost then reduces to the difference
between obtaining the order (p) solution alone, and calculating the (p+1) solution along
with it. This error estimate can be combined with a suitable tolerance, and used by the
step-controller to govern the solution.
It should be noted that the tolerance should be carefully chosen such that misleading
accuracy is not generated. With an exact spatial representation, a robust adaptive scheme
should be capable of integrating a solution to arbitrarily high accuracy (subject only to the
computer’s machine precision). However, our time integration is along a solution path
which may have been significantly perturbed by spatial errors. Since the adaptive
temporal scheme is following the perturbed solution path rather than the exact one, any
precision surplus to that of the spatial representation places a purposeless burden on
computational resources.
[Figure omitted: schematic of the spatial grid, showing river reaches 1-17 with boundaries at river kilometers 44, 48, 55, 60, 65, 68, 72, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170 and 185, the point thermal loads, and the upstream and downstream boundaries.]
Figure 5.1 Spatial grid
For the Hun River study, the spatial grid is made up of 17 reaches of river shown in
Figure 5.1 and each reach has four cells, giving 68 control volumes in series. Following
the discussion above, the grid will be automatically adapted only in the time domain. The
derivatives defined by Equations 5.12 and 5.13 are numerically integrated using a first-
second order adaptive scheme and a fourth-fifth order adaptive scheme (henceforth
referred to as (1,2) and (4,5) schemes respectively) for comparison. The second order
approximation is commonly known as Heun’s method (see Chapra and Canale 1998:
p.688),
$$T_{w(i,j)} = T_{w(i,j-1)} + 0.5\left(\left[\frac{dT_w}{dt}\right]_{(i,j-1)} + \left[\frac{dT_w}{dt}\right]'_{(i,j)}\right)\Delta t_{(j-1)} \qquad (5.21)$$

where $\Delta t_{(j-1)}$ is the adapted step size computed for time-step j-1; the first derivative is
defined by Equations 5.12 and 5.13 evaluated at time-step j-1, and the second is similarly
evaluated at time-step j using the approximation $T_{w(i,j)} = T'_{w(i,j)}$ given by,

$$T'_{w(i,j)} = T_{w(i,j-1)} + \left[\frac{dT_w}{dt}\right]_{(i,j-1)}\Delta t_{(j-1)} \qquad (5.22)$$
The truncation error over any step is then estimated by the magnitude of the difference of
Equations 5.21 and 5.22. This is divided by Tw and maximised over all i to give the worst
relative error, $\lambda_{(j)}$, over all N river control volumes at time-step j,

$$\lambda_{(j)} = \max_{i=1,N}\left[\mathrm{ABS}\left(\frac{T_w - T'_w}{T_w}\right)_{(i,j)}\right] \qquad (5.23)$$
λj is then used as a basis for adapting ∆t using the integral controller described previously,
and the calculations of Equations 5.21, 5.22 and 5.23 are repeated until the desired
tolerance ξ is achieved. Note that the estimated truncation error is that of the first order
solution, and we are guaranteed improved accuracy because the second order evaluation
of Tw is adopted. The (1,2) algorithm is illustrated in Figure 5.2.
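The complete (1,2) procedure can be sketched in a few lines. This is an illustrative fragment rather than the thesis code: `dTdt` stands in for the derivative evaluations of Equations 5.12 and 5.13, the controller follows Equation 5.20, and the 0.1 oC floor on the relative-error denominator is taken from Section 5.5.

```python
import numpy as np

def integrate_heun_adaptive(dTdt, T, t, t_end, dt, tol, eta=0.9, p=1):
    """Adaptive (1,2) scheme: Euler predictor, Heun corrector, and the
    step controller of Equation 5.20 acting on the relative error of
    Equation 5.23."""
    while t < t_end:
        dt = min(dt, t_end - t)
        k1 = dTdt(T, t)
        T_euler = T + k1 * dt                              # Equation 5.22
        k2 = dTdt(T_euler, t + dt)
        T_heun = T + 0.5 * (k1 + k2) * dt                  # Equation 5.21
        lam = np.max(np.abs((T_heun - T_euler) /
                            np.maximum(np.abs(T_heun), 0.1)))  # Equation 5.23
        if lam <= tol:                                     # accept the step
            T, t = T_heun, t + dt
        # Equation 5.20: shrink a rejected step, stretch an accepted one.
        lam = max(lam, 1e-12)                              # guard zero error
        dt *= eta * (tol / lam) ** (1.0 / (p + 1))
    return T

# Toy use: exponential cooling towards 2 degC from 10 degC.
cool = lambda T, t: -0.5 * (T - 2.0)
print(integrate_heun_adaptive(cool, np.array([10.0]), 0.0, 10.0, 0.1, 0.2))
```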
[Figure omitted: flowchart of the first-second order adaptive time-step algorithm, showing the Euler predictor (Equation 5.22), the Heun corrector (Equation 5.21), the relative error check λ(j) ≤ ξ of Equation 5.23, and the step-size updates of Equation 5.20.]
Figure 5.2 First-second order adaptive time-step algorithm
The (4,5) scheme is an embedded Runge-Kutta, or Fehlberg method developed by Cash
and Karp (1990) (also see Chapra and Canale 1998: p.713). Methods of this type are
widely accepted as among the most competitive methods for many differential equation
systems. In principle, it is similar to the method of the (1,2) scheme, using two separate
numerical approximations to gain an error estimate: in this case, the absolute difference
between fourth and fifth order solutions of Tw. Since both estimates share the same
function evaluation points, the second estimate is of negligible additional cost. This (4,5)
method has the benefit of providing a more reliable estimator of the true temporal
truncation error than the lower order scheme. Its added reliability along high-order
solution paths makes larger time-steps permissible, but each of these steps requires four
more intermediate derivative evaluations. The preferred scheme in this application
depends on the relative importance of the high order processes in Equations 5.12 and
5.13, which will be investigated by experiment.
5.5 Results
The model calibration is first done using the (4,5) scheme using a tolerance, or specified
maximum temporal truncation error, of 0.2. This value is a ratio of the absolute estimated
truncation error to either the high order evaluation of temperature or 0.1oC, whichever is
higher (the lower limit avoids convergence to inappropriately low tolerances at
temperatures near zero). While this specified maximum error may seem relaxed, it is
quite justifiable for various reasons, as will be seen later. The posterior probability
associated with each of the 6561 parameter sets was used to derive confidence limits for
the water temperature at the Hun Gate (river kilometer 72), shown in Figure 5.3. These
confidence intervals were derived from the variance of the 6561 model results at each
time-step assuming a normal distribution. Although Figure 5.3 does not validate the
model (because the same data were used for calibration), it shows how the uncertainty
analysis is used to represent the variability in the data.
[Figure omitted: time-series of water temperature (oC) from 20/10/99 to 20/06/00, showing the observed water temperature, the 90% confidence limits and the maximum likelihood result.]
Figure 5.3 Time-series of modelled and observed water temperature at the Hun Gate
Consider the significance of the adaptive numerical scheme in obtaining these results.
Figure 5.4 is a scatter-plot of the simulation run-times during the calibration using the
higher order scheme (the cyclical trend in the run-times is due to the procedural ordering
of the stratifications used in the sampling procedure). Figure 5.5 shows the time-series of
daily average time-steps for the most time-consuming simulations.
[Figure omitted: scatter of the time taken for each simulation (secs) against simulation number, over the 6561 simulations.]
Figure 5.4 Scatter of run-times during calibration by stratified sampling
[Figure omitted: time-series of day-averaged time-steps over the simulation period (20/10/99 to 20/06/00) for the first-second and fourth-fifth order schemes.]
Figure 5.5 Profile of time-steps during the most time-demanding simulation
These illustrations suggest that, due to the range of time-steps required, the use of an
adaptive scheme is fundamental to the feasibility of the calibration. For example, if the
minimum time-step from Figure 5.5 of 0.006 days were used (and this may be regarded as
a liberal estimate because it is a day-averaged value, not the actual minimum) the 6561
simulations would have taken over 10 days instead of under 12 hours. Alternatively, if a
feasible constant time-step of 0.1 days is used, which is considered practical for
performing 6561 simulations, then the results of a large number of realisations are ruined
by numerical instability, mainly during the numerically onerous period of freeze. If a
constant time-step of 0.006 days is used, but the number of parameter set samples is reduced to
500, then stability is achieved, but comparing repeated results with those obtained using
more comprehensive sampling implies that 500 samples is inadequate to give reliable
confidence limits. Again, it is noted that the adequacy of 6561 samples is not investigated
in this chapter.
[Figure omitted: (a) achieved truncation error and (b) time taken for simulation (secs), each plotted against the specified maximum relative truncation error for the first-second and fourth-fifth order schemes.]
Figures 5.6(a) and 5.6(b) Comparison of the performance of the alternative adaptive schemes
Using the (1,2) scheme with the same specified tolerance, there was no significant
difference in the derived confidence limits, nor in the total time required for the
calibration. Although smaller time-steps were generally required for the (1,2) scheme,
only two derivative evaluations are required per time-step as opposed to six in the higher
order scheme. However, close inspection of the performance of the schemes under a
variety of tolerance criteria implies that the higher order scheme is potentially much more
efficient. Figures 5.6(a) and 5.6(b) show how the achieved tolerance and the time taken
for the simulation vary against the specified tolerance. These results are based on a single
simulation using the set of maximum likelihood parameter values, and use a specified
maximum tolerance of 0.0001 to approximate the numerically ‘true’ solution. In terms of
accuracy, the (4,5) scheme outperforms the (1,2) scheme by an order of magnitude for all
specified tolerances. The former is also faster for all specified tolerances above 0.2. Note
that the actual errors are up to 20 times lower than the specified tolerances for the higher
order scheme, implying that processes higher than fifth order are numerically
insignificant. In contrast, for the (1,2) scheme, the actual error is in some cases larger than the
specified tolerance, highlighting the scheme’s limitations, and the importance of the step-
controller, both in terms of the robustness of the control mechanism itself, and its reliance
on a good error estimate. Since the error estimate of the (1,2) scheme is only first-order
accurate, it is substantially more vulnerable to changes in the smoothness of the solution
than the Cash-Karp estimate. The flattening of the curves in Figure 5.6(a) is because of
the upper limit to the computation time-step of 1 day.
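To make the step-control mechanism concrete, the sketch below implements an embedded first-second order (Euler/Heun) pair with a proportional step-size controller and the 1-day cap described above. This is a minimal illustration, not the thesis code (which is implemented in the WaterRAT software); the function name, safety factor and the example tolerance are illustrative.

```python
import numpy as np

def integrate_rk12(f, y0, t0, t1, rel_tol, dt0=0.01, dt_max=1.0, safety=0.9):
    """Adaptive embedded (1,2) integration of dy/dt = f(t, y).

    The difference between the first-order (Euler) and second-order
    (Heun) estimates of each step provides a relative truncation
    error estimate, which the controller uses to shrink or grow the
    time-step, capped at dt_max (1 day here, as in the text).
    """
    t, y, dt = t0, np.asarray(y0, dtype=float), dt0
    while t < t1:
        dt = min(dt, dt_max, t1 - t)
        k1 = f(t, y)
        k2 = f(t + dt, y + dt * k1)
        y_low = y + dt * k1                # first-order estimate
        y_high = y + 0.5 * dt * (k1 + k2)  # second-order estimate
        err = np.max(np.abs(y_high - y_low) / (np.abs(y_high) + 1e-12))
        if err <= rel_tol:                 # accept the step
            t, y = t + dt, y_high
        # the embedded error estimate is O(dt^2), hence exponent 1/2
        dt *= safety * (rel_tol / max(err, 1e-12)) ** 0.5
    return y

# e.g. a Newtonian-cooling toy problem integrated over 30 days:
# integrate_rk12(lambda t, T: -0.5 * (T - 10.0), [4.0], 0.0, 30.0, rel_tol=0.2)
```

A rejected step is simply retried with the reduced time-step; a higher order pair such as Cash-Karp differs only in using six derivative evaluations per step and a smaller controller exponent.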
Such exploration of numerical performance has been extremely valuable in improving the
functionality of the river temperature model, and in allowing reliable calibration and
uncertainty analysis to be performed. For example, there is clearly no justification for specifying the tolerance as anything less than 0.2 when using the (4,5) scheme. Any lower tolerance would generate excess accuracy in the time domain, overshadowed by the
errors generated by the spatial grid, and those due to the variability and sparseness of the
calibration data. This accuracy demand is comparatively relaxed, and accordingly we find
that the (1,2) scheme appears to give adequate results at this tolerance when compared to
the overall uncertainty in model results. This (1,2) scheme can be regarded as a simple
and easily implemented method of avoiding numerical instability but does not give any
useful guarantee of truncation error. It is also expected to become increasingly less
competitive as other errors are reduced, for example by extension to the data set or by
refinement of the spatial grid, and increased demands are made on the temporal precision.
The higher order scheme is more robust and reliable in general, at least as fast, and has
wider applicability in modelling of high order processes.
In the river model described here, and many dynamic environmental models, accuracy is
limited by spatial as well as temporal truncation errors. In order to achieve a broad
understanding of the model’s numerical error, the spatial discretisation should be subject
to critical investigation. The spatial grid used in this study is variable in resolution along
the length of the river. In theory, this variation could be automated in a similar manner to
the temporal scheme, but the computation required to estimate the truncation error over
the whole time domain is not practical. Instead, an appropriate grid is identified prior to
calibration by the modeller by successively sub-dividing the grid and running individual
tests. For example, consider how the grid used above (4 cells per reach in Figure 5.1 = 68
cells in series) performs in comparison to a less refined grid (one cell per reach = 17 cells
in series) and a more refined grid (16 cells per reach = 272 cells in series). The model
124
parameter values which required the smallest time-step are used. The spatial model
results (up to the Hun Gate) for the water temperature on the 20th January are shown in
Figure 5.7(a), and Figure 5.7(b) shows the reach-averaged water temperatures. At the
Hun Gate it is found that the refinement of the spatial grid did not make a significant
difference to results. On the other hand, from the same figures, the spatial truncation error
is significant in the reaches further upstream, which contain the Shenyang point sources
of heat. While the model can assign confidence limits to all results, these are based on the
performance of the model at locations significantly downstream of the thermal loads,
where truncation error is low (e.g. the Hun Gate), and cannot reliably account for high
spatial truncation errors elsewhere. Thus, the stochastic model in its 68-cell form is not a
reliable predictor of the temperature profile within the Shenyang reaches during the
winter. However, even in these reaches, the adaptive temporal scheme is useful in
ensuring maximum efficiency given the spatial grid, and allows investigation of spatial
error without the complication of error interactions. While the model gives a reasonable
prediction (numerically at least) of the reach-averaged value in all reaches, whether or not
this is adequate for the task of supporting the Hun River water quality model is a matter
for further research.
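The successive sub-division test just described is simple to automate outside the calibration loop. The sketch below is a minimal illustration under stated assumptions: run_model is a hypothetical stand-in that runs the river model with a given number of cells per reach and returns reach-averaged temperatures (one value per reach, so that results are comparable across grids).

```python
import numpy as np

def grid_refinement_test(run_model, cells_per_reach=(1, 4, 16)):
    """Compare reach-averaged results as the spatial grid is refined.

    Returns the maximum absolute change in reach-averaged water
    temperature between successive refinements; the coarsest grid
    whose further refinement changes results negligibly (relative to
    the overall model uncertainty) would be adopted.
    """
    results = [np.asarray(run_model(n)) for n in cells_per_reach]
    return [float(np.max(np.abs(b - a)))
            for a, b in zip(results, results[1:])]
```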
[Figures: water temperature (°C, -1 to 6) against river kilometre (44 to 72): (a) cell-specific Tw and (b) reach-averaged Tw, each for grids of 1, 4 and 16 cells per reach.]
Figures 5.7(a) and 5.7(b) The limitation of the spatial accuracy in the Shenyang reaches.
5.6 Discussion
The relevance and benefits of attention to numerical efficiency have been illustrated by
this case study, specifically in the derivation of the temperature time-series shown in
Figure 5.3 through use of an adaptive numerical grid. The derivation of the confidence
limits in Figure 5.3 employed random sampling of the wide range of parameter values
listed in Table 5.1, resulting in a numerical grid requirement which was highly
inconsistent over the domain of the sampling, illustrated by Figure 5.5. Thus it is argued
that the adaptive grid is beneficial, if not necessary. Furthermore, Figure 5.6 clearly
illustrates that the numerical accuracy should be specified conservatively with respect to
the overall model uncertainty, otherwise the simulation is much more likely to be
prohibitively computationally expensive. While this chapter has focused on the calibration
stage, the benefit of the specification of the adaptive grid and the required accuracy
applies also to predictive application of Monte Carlo simulation.
It is argued for the case study that the high a priori uncertainty in the model parameters,
as well as the high order dynamics of the modelled system, have led to the high
variability in required time-step during the model calibration. However, if the overall
model uncertainty is relatively low, a higher numerical accuracy is justified, and
variability of the time-step may be just as high or, noting the exponential rise in
computation time for low truncation errors in Figure 5.6(b), even higher. This would be
the case when using an a posteriori parameter distribution which has been significantly
constrained by the calibration, i.e. in relatively well-defined environmental systems.
Furthermore, it may be noted that the results presented here have significance beyond
simple Monte Carlo simulation. A variety of evolutionary methods of parameter
calibration are in common use in environmental modelling, for example genetic
algorithms (e.g. Mulligan and Brown 1998), Markov chain modelling and shuffled
complex evolution (e.g. Thyer et al. 1999). All these employ an automatic semi-random
exploration of the parameter space which leads to unpredictable numerical behaviour.
The derivation of reliable stochastic model results is an extremely challenging task,
especially in the field of complex environmental modelling. For the conciseness of this
work, it has been necessary to make assumptions which, in general, would be detrimental
to the reliability of results. One assumption, already mentioned, is that 6561 stratified random samples give adequate coverage of the parameter space. Other limiting
assumptions are that the set of Equations 5.1 to 5.14 are correct, that the model boundary
conditions (e.g. air temperature) are errorless, and that the data errors are independent,
unbiased and normally distributed. However, inspection of Figure 5.3 shows that on 20th
March, all 6 data points lie to one side of the model result (and the lower 90% confidence
limit), implying that their errors are not independent or that the model structure is
incorrect. This present work is regarded as an essential preliminary step in addressing
such difficulties.
For deterministic environmental models that are considerably larger than the case study
model, perhaps with 2 or 3 spatial dimensions and hundreds of parameters and state
variables, the need for attention to efficiency in Monte Carlo simulation is especially
relevant. Whereas it might intuitively be expected that more complex models should
provide more accurate deterministic predictions, hence reducing the motivation for Monte
Carlo simulation, many investigations imply that this is untrue and that best-estimate
results of such models can be ambiguous (see, for example, the discussions of O’Connell
and Todini 1996, Reichert and Omlin 1996). Given that important decisions and policies
are supported using complex environmental models, one of the primary challenges facing
numerical modellers is the improved representation of uncertainty. As such models are
relatively costly to solve, a fundamental aspect of this challenge is the application of
efficient solution schemes and tolerances that are consistent with the overall prediction
accuracy. While, in some cases, the pursuit of this will be less straightforward than it was
for the case study, and the achievable computational cost will be higher, this study has
sought to demonstrate the degree of improvement that may be possible.
5.7 Summary
Monte Carlo-based methods of uncertainty estimation are central to the value of
environmental simulation modelling because data are generally too sparse and imprecise
to usefully identify a single representative model. Using a study of river water
temperature simulation, it has been shown that the value of Monte Carlo analysis, in
terms of its ability to explore the feasible models, depends heavily upon the selection of
appropriate numerical solution schemes and tolerances. In particular, the implementation
of an efficient adaptive time-step procedure has considerable benefits in handling the
variability of the time-step requirements over the time domain of individual realisations,
and in handling the inherent randomness of this variability from one model realisation to
the next. For the case study, identification of model parameter uncertainty was carried out
overnight as opposed to the several days it would have required with a fixed time-step
for practically equivalent numerical precision. The case study also illustrated the
potentially large benefits in conservative specification of the numerical precision as the
achieved precision may be significantly more than that specified, and may be completely
adequate given the uncertainty stemming from limitations in supporting data. The spatial
truncation error arising from the fixed spatial grid is noted as restricting the value of
adaptive time-step schemes in the solution of space-time partial differential equation problems, such as the case study. Further work is needed to rationalise spatial errors in
the context of overall modelling error.
6. Identification of a phosphorus mobilisation model: prior evaluation
of data needs
A simulation model of in-stream phosphorus mobilisation and transport, which was
developed to predict monthly phosphorus export from the upper Hun River, is described.
In order to evaluate alternative programmes for collection of in-river calibration data, a
set of a priori computational experiments are devised. Initially assuming that the model
and boundary condition data are error-free, daily in-river phosphorus concentration data
are synthesised, representing an idealised system response. Scenarios of error in the data
and model structure are then improvised, as a hypothetical representation of the
conditions under which the model will actually be calibrated and evaluated. The model is
calibrated under the various conditions of error, and its predictive reliability is tested
using an independent idealised data set. These controlled experiments allow evaluation of
sampling programme design, of other controls on model reliability, and of needs for
additional a priori investigations. The results indicate that the value of the calibration
data is seriously compromised by the presence of error in the pollution load data, error in
the model structure, and inherent parameter equifinality. While these controls were not
very detrimental to model reliability under calibration conditions, in some cases they caused
serious misrepresentation of forecast phosphorus export. For the case study, it is
recommended that sampling should initially be restricted to the first major storm event of
the year, and that other resources are directed at collecting improved data on phosphorus
sources to reduce model input error. Also, in light of the limited resources, expectations
of model performance should be reviewed, and a more robust approach to model
uncertainty estimation adopted.
6.1 Introduction
The phosphorus budgets of rural catchments are often dominated by runoff-induced
mobilisation driven by rainfall events which occur on hourly or daily time-scales
(Kronvang et al. 1997). Such system responses are too localised in time to be
comprehensively observed and formulated into a model using typically available data - where monitoring is undertaken at all, weekly or monthly sampling of in-river phosphorus is the norm. As
illustrated later, the resulting structural error and bias in estimated parameter values may
lead to very misleading predictions of phosphorus export, even when they are moderated
by uncertainty analysis. In such circumstances, investment in supplementary field
experiments is clearly important. However, it is unreasonable that investment should be
made without prior cost-benefit analysis. Even if a modelling exercise has a guaranteed
budget, the wide range of expenditure options (e.g. land-use surveys, collection of rainfall
data, in-river phosphorus data, and sediment data) means that some objective
prioritisation of data requirements and resource allocation is beneficial, if not essential.
The design of a field experiment is not easy prior to detailed understanding of the system
– nevertheless a working hypothesis of the system is needed for preliminary appraisal of
options. This leads to the view that some preliminary a priori modelling is a necessary
starting point for field experiment design. The objective of this chapter is to explore the
frequency and quality of data needed to identify, with a desired degree of reliability, the
structure and parameters of a one-dimensional conceptual model of river sediment
phosphorus mobilisation and transport.
One aim of the TOPLEM project was to estimate the monthly nutrient budgets of the Hun
River (under both current conditions and proposed intervention strategies) into the
Dahuofang reservoir. In the present study, we look at an upper reach of the river, from the
river-reservoir boundary to the Beikouqian gauging station, 40km upstream (Figure 6.1).
The catchment of this reach covers 2750km2, approximately 60% of which is forested,
15% is arable, 6% is pastoral, and the remainder is either urban or unfarmed moorland. It
is estimated that, in an average year, 800 tonnes of phosphorus from agricultural sources
is washed through this reach of river (Qinghua University 2001). The hydrograph for
1998 at Beikouqian, shown in Figure 6.2, illustrates that the wet season from July until
September dominates the flow regime. The maximum headwater flow is 239 m3s-1 on the
24th of August, with three other distinct runoff peaks, on the 16th of July, the 6th of August
and the 12th of August. The average headwater flow is 19m3s-1 and the mode is 2m3s-1.
The nutrient budget is known to be dominated by these storm events (Qinghua University
2001), ostensibly due to the mobilisation of nutrients stored in the soil (e.g. Kronvang et
al. 1997, Daldorph et al. 2000), and the scour of the river sediments and subsequent
release of nutrients (e.g. House and Denison 1998).
[Map: the modelled length of the Hun River past Shenyang City and Fushun City into the Dahuofang reservoir (draining southeast to the Yellow Sea), with the Beikouqian gauging station marked; inset locates the study area in northeast China, near Beijing, Korea and Japan. Scale bar 25 km.]
Figure 6.1 Location of modelled length of Hun River
As the processes controlling phosphorus fluxes occur predominantly in the wet season,
intensive monitoring of river phosphorus concentrations throughout this period is
expected to return valuable information about an appropriate model structure and set of
parameter values. However, as in most research projects, the Hun River study has
resource constraints, and sampling more frequently than weekly would be at the expense
of other important project goals. Therefore, rationalisation of the benefits and costs of
alternative phosphorus sampling programmes is needed. Similarly, spatial sampling errors
and other measurement errors could be reduced by focussing resources, but with benefits
which are not easy to justify a priori. In planning event-based monitoring there is also a
need to offset the risk of missing the key features of the event against the cost of
improving response times of personnel and equipment (at least in studies like that of the
Hun River, where automatic sampling is not practicable). This chapter describes
preliminary computational experiments which lead to hypotheses about the potential
value of investing in extra and/or improved data collection in the Hun River.
The computational experiments are based on incomplete, prior knowledge of the system
dynamics, so it cannot be expected that an optimum, or even near-optimum, monitoring
plan will be achieved. Rather, the objective is to indicate the risks of wasting resources by
either over-investing or under-investing in data collection, given the limitations of prior
knowledge and inevitable model identification problems. These indications will allow this
risk to be managed by considering strategies where scope for losses (unreturned
investment) is limited, while offsetting this conservatism against the desire for rapid
results. For example, a minimalist sampling programme could be used as a preliminary
study of system response leading, in conjunction with additional modelling experiments,
to refinement of the monitoring programme design before the next wet season. Promoting
interactions between the monitoring and data analysis procedures in such a way is widely
regarded as fundamental to progress in water quality modelling and management
(Somlyody 1995). This study is restricted to looking at frequency and quality of
measurements at one river cross-section (at the entry-point of the river to the Dahuofang
reservoir). In other cases, the approach presented here could be extended to spatio-temporal investigations.
At all stages in the investigation, synthetic phosphorus data are used so that errors in data
and in the model structure are known a priori. Thus, the purpose is not to validate the
particular model employed. Instead, controlled experiments are used to elucidate the
significance of hypothetical yet credible scenarios of errors. Firstly, an error-free model
and error-free data given at the same frequency as the model output (i.e. daily) are
employed, whereby the only sources of parameter error are, 1) the limitations of the
automatic calibration algorithm in identifying optimum parameter values (see below), and
2) the interactions between parameters leading to equifinality of parameter sets (Beven
1993). The effects of these errors on the ability of the model to replicate the calibration
data, and to replicate a second independent set of error-free data, are illustrated. The
calibration data are thinned out to two-daily, then to the baseline frequency of weekly, so
that the bias introduced to parameters and model predictions can be reviewed, and the
value of using event data (i.e. from only the distinct storm events within the wet season)
is also evaluated. Unbiased Gaussian error is introduced to the calibration data and to the
phosphorus load, and then a component of the model is simplified and the significance of
this known structural error (as a nominal representative of other unknown structural
errors) is evaluated. Finally, the value of including sediment phosphorus data is
examined.
6.2 Model Description
The description of the model given below is brief, giving the minimum information
needed in the current context - a comprehensive model description is given in McIntyre
and Zeng (2002).
A key concept employed in the proposed model is that phosphorus exchange between the
sediments and the main body of river water (which we will refer to simply as the ‘water’)
is driven by two mechanisms. First, there is a diffusive exchange, whereby flux of
phosphorus per unit wetted area of sediment Fsp (gm-2s-1) is proportional to the difference
between the reach-averaged sediment concentration (Sp/Hs) and the reach-averaged water
concentration Cp,
−= p
s
pdsp C
HS
kF (6.1)
where Sp (gm-2) is the sediment concentration, Hs (m) is the average depth of the
responsive sediment layer, and kd (ms-1) is the diffusion rate. Secondly, there is net
release Rsp (gm-2s-1) comprised of resuspension which is initiated if a critical shear stress
at the channel bed τcr (gm-1s-2) is exceeded, and a first order sedimentation term,
\[ R_{sp} = k_s S_p \left( \frac{\tau}{\tau_{cr}} - 1 \right) - v_p C_p \qquad \text{for } \tau > \tau_{cr} \tag{6.2a} \]
\[ R_{sp} = - v_p C_p \qquad \text{for } \tau \leq \tau_{cr} \tag{6.2b} \]
where τ is shear stress at the channel bed (gm-1s-2), ks is the scour rate (g-1m2s) and vp is
the sedimentation velocity (ms-1). This is based on the resuspension model proposed by
Lick (1982), also see Blom and Aalderink (1998). The shear stress is calculated using
Manning’s friction formula (Chow 1954: p.201),
\[ \tau = \frac{d_w\, g\, n^2 u^2}{r^{1/3}} \tag{6.3} \]
where dw is the density of water (1000 gm-3); g is the gravitational constant (9.8 ms-2); u is
the water velocity at the sediment-water interface (ms-1), which is assumed equal to the
average water velocity over the river cross-section; n (sm-1/3) is Manning’s coefficient and
r (m) is the hydraulic radius.
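For concreteness, these exchange terms can be sketched as below. This is a minimal illustration, not the model's actual code; the parameter defaults are the "true" values of Table 6.1 (the diffusion rate kd is left as an argument, since Table 6.1 does not prescribe it), and units follow the definitions in the text.

```python
def shear_stress(u, r, n=0.1, dw=1000.0, g=9.8):
    """Bed shear stress from Manning's friction formula (Eq. 6.3);
    dw and g as stated in the text, Manning's n from Table 6.1."""
    return dw * g * n**2 * u**2 / r**(1.0 / 3.0)

def diffusive_flux(Sp, Cp, kd, Hs=0.1):
    """Diffusive sediment-water exchange flux Fsp (Eq. 6.1)."""
    return kd * (Sp / Hs - Cp)

def net_release(Sp, Cp, tau, tau_cr=0.01, ks=20.0, vp=1.0):
    """Net release Rsp (Eqs. 6.2a/b): scour once the critical shear
    stress is exceeded, first-order sedimentation at all times."""
    if tau > tau_cr:
        return ks * Sp * (tau / tau_cr - 1.0) - vp * Cp
    return -vp * Cp
```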
The in-river transport model is based on the one dimensional advection-dispersion
equation, with water quality averaged over the river’s width and depth. The river is split
longitudinally into a series of reaches between which the P advects and disperses. Then,
the rate of change of phosphorus in the water in reach i at time-step j is given by⁵,

\[ \frac{d\left( V C_p \right)_{(i,j)}}{dt} = Q_{(i-1,j)} \cdot C_{p(i-1,j)} - Q_{(i,j)} \cdot C_{p(i,j)} + D'_{(i-1,j)} \left[ C_{p(i-1,j)} - C_{p(i,j)} \right] + D'_{(i,j)} \left[ C_{p(i+1,j)} - C_{p(i,j)} \right] + \left( F_{sp} + R_{sp} \right)_{(i,j)} A_{w(i,j)} + \Phi_{p(i,j)} \tag{6.4} \]
where V (m3) is the time-variable volume of water in each reach, Q (m3s-1) is the flow, Aw
(m2) is the surface area of water in the reach, Φp (gs-1) represents external loads of P, and
D’ (m3s-1) is a river diffusion coefficient calculated as a function of the water velocity and
depth (see Chapra 1997: p.245). The cross-section of the river channel is represented by a
composite section with sloping sides (Figure 6.3), so that the water surface area is a
function of the flow, allowing reasonable simulation of Aw over the wide range of flows
observed in the Hun River. Backwater calculations are based on a quasi-steady model,
where flow reaches steady-state at every computational time-step, hence transient effects
on phosphorus transport and resuspension are not fully accounted for.
[Figure: daily time-series, 01-Jan-98 to 01-Jan-99: measured flow (cumecs, 0 to 350) and simulated total phosphorus concentration (0.00 to 0.15).]
Figure 6.2 Beikouqian 1998 daily flow (measured) and total phosphorus (simulated) used as inputs to the river model.
Intuitively, and evidently from Equation 6.2, the flux of phosphorus from the sediment to
the water depends on the concentration of phosphorus stored in the sediment, Sp, which is
also simulated.
⁵ In all other differential equations in this chapter, all terms should be taken to have subscripts i and j.
\[ A_s \frac{dS_p}{dt} = -\left( F_{sp} + R_{sp} \right) A_w \tag{6.5} \]
where Aw (m2) is the area of submerged sediment. The distinction between the area of
sediment As (time-invariant) and the area of submerged sediment Aw (time-variable) is
non-trivial. Sediment-water interactions will occur over Aw, while the overall sediment
store of phosphorus extends over a potentially much larger area (illustrated in Figure 6.3).
This allows dry-bed storage to be accounted for in an approximate manner.
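Combining these exchange terms with the reach mass balances, the right-hand sides of Equations 6.4 and 6.5 for an internal reach i can be sketched as follows; the array arguments are illustrative stand-ins for the model's states and inputs, and the boundary reaches would need their own treatment.

```python
def dVCp_dt(i, Cp, Q, D, Fsp, Rsp, Aw, phi):
    """Rate of change of V*Cp in reach i (Eq. 6.4): advection in and
    out, dispersion across both interfaces, sediment-water exchange
    and the external load."""
    advection = Q[i - 1] * Cp[i - 1] - Q[i] * Cp[i]
    dispersion = (D[i - 1] * (Cp[i - 1] - Cp[i])
                  + D[i] * (Cp[i + 1] - Cp[i]))
    return advection + dispersion + (Fsp[i] + Rsp[i]) * Aw[i] + phi[i]

def dSp_dt(i, Fsp, Rsp, Aw, As):
    """Rate of change of sediment concentration Sp in reach i (Eq. 6.5):
    the sediment store loses exactly what the water column gains,
    scaled by the ratio of wetted to total sediment area."""
    return -(Fsp[i] + Rsp[i]) * Aw[i] / As[i]
```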
[Figure: composite channel cross-section (b1 = 10 m, b2 = 100 m, d1 = 1 m, d2 = 5 m), showing water surfaces at low flow (net sedimentation) and high flow (net resuspension) above a sediment store of phosphorus with constant volume and dynamic concentration.]
Figure 6.3 Channel cross-section and sediment-water interaction concepts. The composite cross-section, defined by effective parameters b1, b2, d1 and d2, allows a wide range of flow conditions to be simulated with restricted model complexity. The sediment store concept aims at representing dry-bed storage, but only wet-bed sediment-water interactions.
For this investigation, the 40km length of the Hun River is divided into ten reaches each
of 4000m length. The set of ordinary differential equations given by Equations 6.4 and
6.5 is solved using the Fehlberg scheme that was described in Chapter 5.
In summary, this model is based on the concepts that 1) sedimentation is directly
proportional to the concentration of total phosphorus in the water, 2) resuspension will
occur at a rate proportional to the submerged area of sediment, the energy (above a
threshold) provided by the movement of water, and the concentration of phosphorus
within the sediment, and 3) the availability of sediment phosphorus diminishes in times of
net resuspension, and increases in times of net sedimentation. Clearly, the realism of the
model is flawed, not least because the parameters u, ks, and τcr are assumed to be spatially and/or temporally constant, and the variables Cp, Sp and τ are lumped into reaches. Among
other simplifications, the responsive area As and depth Hs of sediments do not respond to
the resuspension process (instead the concentration of Cp within a conceptual sediment
volume decreases) and the model in no way discriminates between the fates and
interactions of different fractions of phosphorus (see for example Boers et al. 1998).
There is also the issue of whether the quasi-steady hydraulic model is an adequate
replacement for a transient approach (e.g. Zeng and Beck 2001). Such conceptualisations
have allowed a reasonably elegant and intuitive a priori model of phosphorus
mobilisation in the Hun River to be formed – but one which, due to its limited physical
basis, relies heavily on calibration for identification of representative parameter values.
6.3 The data
The average channel slope (0.1%) and cross-sectional shape are applied to each of the ten
reaches that make up the river model. The downstream boundary of the modelled length
of river (i.e. the reservoir) is represented as a constant water level of 4m above bed level.
The upstream boundary is defined by a daily time-series of Q and Cp (Qu and Cpu) which,
for 1998, are illustrated in Figure 6.2. These P data are simulated using a modified
version of the integrated catchment phosphorus runoff model of Daldorph et al. (2000)
(the land-use and daily rainfall data used in this runoff model were estimated from the
records of Shenyang Institute of Environmental Science, Liaoning). The use of daily Qu
and Cpu data dictates that our idealised system response is daily. The effect of uncertainty
in the Qu and Cpu data on the value of the downstream calibration data will be
investigated. As well as the upstream boundary, a diffuse load of Q and P (Qd and Cpd),
distributed uniformly over the 40km length, is applied. The total distributed load is
assumed to be one half of the headwater contribution, on the basis of the relative
catchment area of the 40km length of river.
Table 6.1 Prescribed feasible ranges of parameters and true values

Parameter                Notation  Units   Range        True value
Scour rate (Eq. 6.2)     ks        ms-1    10 - 30      20
Scour rate (Eq. 6.10)    ks        ms-1    0.005 - 0.5  0.3
Critical shear stress    τcr       Nm-2    0.001 - 0.1  0.01
Critical velocity        ucr       ms-1    0.1 - 0.7    0.2
Sediment depth           Hs        m       0.05 - 0.15  0.1
Sedimentation velocity   vp        ms-1    0 - 2        1
Manning's coefficient¹   n         sm-1/3  0.05 - 0.15  0.1
Initial sediment conc.¹  Sp0       %       ±25²         0

¹ Same value is used for each river reach. ² Deviation from simulated value on 30th June.
[Figure: total phosphorus concentration CP (gm-3, 0 to 0.08) at river kilometre 40, 2 July to 31 August 1998, showing the "true" model result, the error-free calibration data and a sample of calibration data with error, for (a) daily, (b) 2-daily, (c) weekly, (d) daily event (all events) and (e) daily event (first event) sampling.]
Figure 6.4 Calibration data and “true” model result for 1998. Only one of five realisations of data error is shown.
As a baseline for studying the value of data, error-free in-river data are generated using
the phosphorus transport model together with assumed true parameter values, listed in
Table 6.1. These parameter values are the mid-points of ranges of values perceived to be
feasible (also see Table 6.1) so that the generated data represent a feasible, although
hypothetical and initially idealised system response. The data which are used for
calibration are those generated for the last river reach (river kilometre 40) between the 1st
July and the 31st of August, initially at a sampling frequency of 1 day (Figure 6.4(a)).
This period is chosen because it contains the significant rainfall events of both the studied
years. The calibration is repeated using the same data with increased sampling intervals
of 2 days, which still captures the storm events (Figure 6.4(b)), and 7 days, which
includes only one data point during each of the events (Figure 6.4(c)). Two subsets of the
daily calibration data set are also created (Figures 6.4(d) and 6.4(e)). The first of these contains the data within all three of the significant 1998 storm events, and the second contains only those data within the first event.
Following a series of model calibrations (described below) using these error-free data,
noise is introduced to the calibration data;
\[ C'_{p,i} = \mathrm{Norm}\!\left( \bar{C}_{p,i},\; 0.25\,\bar{C}_{p,i} \right) \qquad i = 1, \ldots, N_{res} \tag{6.6} \]

where C̄p,i is the ith sample of error-free data, C'p,i is the ith sample of noisy data, Norm(μ, σ) signifies a random sample from a normal distribution with mean μ and standard deviation σ, and Nres is the number of data points available (listed in Table 6.2).
Thus, the introduced error is Gaussian, unbiased, not autocorrelated, and has a time-
variable standard deviation proportional to the magnitude of the associated error-free data
point. Figures 6.4(a) to 6.4(e) include samples of the noisy time-series. Noise is also
introduced to the 62 points of daily data which describe the load from upstream (i.e.
Cpu), and to that from the distributed source (Cpd), according to Equations 6.7(a) and
6.7(b).
\[ C'_{pu,i} = \mathrm{Norm}\!\left( C_{pu,i},\; 0.25\,C_{pu,i} \right) \qquad i = 1, \ldots, 62 \tag{6.7a} \]
\[ C'_{pd,i} = \mathrm{Norm}\!\left( C_{pd,i},\; 0.25\,C_{pd,i} \right) \qquad i = 1, \ldots, 62 \tag{6.7b} \]
This is to examine the effect on the calibration of possible error in these loads (which are
themselves the output of a simulation model). In looking at effects of randomly generated
noise, it is expected that results will depend upon the particular realisation of noise used.
To gauge this dependency, five alternative realisations are used, for each investigated
scenario of calibration data frequency.
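A minimal sketch of this noise generation, assuming NumPy and an arbitrary seed, is:

```python
import numpy as np

def add_noise(series, rng, cv=0.25):
    """Impose unbiased, uncorrelated Gaussian noise whose standard
    deviation is 25% of each error-free value (Eqs. 6.6 and 6.7)."""
    series = np.asarray(series, dtype=float)
    return rng.normal(loc=series, scale=cv * series)

rng = np.random.default_rng(seed=1)  # seed is illustrative
# five alternative realisations for one calibration data set:
# realisations = [add_noise(cp_true, rng) for _ in range(5)]
```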
To define the predictive task of the calibrated models, the upstream boundary condition is
changed to simulated data from July and August 1999, providing a significantly different
time-series of flow and phosphorus at Beikouqian, and a new set of idealised data are
generated using the true model parameters. The resulting true model result at 40km
downstream becomes the data to which the calibrated models are tested in forecasting
mode.
6.4 The calibration, prediction and performance evaluation
procedures
The parameters which, throughout the investigation, are considered uncertain and to be
calibrated are kd, ks, τcr, Hs, vp, and n. The initial condition of the sediment phosphorus
concentration Sp0 in each reach is also considered as an uncertain parameter, as only
sparse measurements are available. The ranges considered feasible prior to calibration are
given in Table 6.1, and during calibration these ranges are considered to be uniform and
independent distributions. Latin hypercube sampling (LHS) is used as the calibration
procedure. 5000 sample sets of parameters are taken in each calibration, and a simulation
is performed using each sampled set. The degree to which each of the corresponding
model results matches the calibration data is measured using the following sum-of-
squared-residuals objective function,
\[ OF_k = \sum_{i=1}^{N_{res}} \left( \bar{C}_p - C_p \right)_{i,k}^{2} \tag{6.8} \]

where OFk is the kth of 5000 sampled objective function values, (C̄p − Cp)i,k is the ith of the Nres residuals between the true phosphorus concentration C̄p and the modelled concentration Cp (C̄p is replaced by C'p in the case of noisy data) obtained from the kth sampled parameter set, and Nres is the number of data points in that particular data set (see
Table 6.2). For example, a high objective function value implies that the model result
gives a poor fit to the data, and the optimum parameter set is that which minimises the
objective function value. The main limitation of LHS as an optimisation procedure is that
the possible combinations of parameter values are sampled relatively sparsely. This
means that successive calibrations to one data set give alternative realisations of the
optimum parameter set, whereby the LHS procedure introduces its own source of error
into model results. The significance of this error compared with those arising from data
limitations is analysed as part of the discussion below.
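A minimal sketch of this LHS calibration loop is given below, assuming SciPy's Latin hypercube sampler and a hypothetical run_model function that returns the modelled Cp series at river kilometre 40; the ranges are those of Table 6.1, with kd and the reach-wise Sp0 values omitted for brevity.

```python
import numpy as np
from scipy.stats import qmc

ranges = {"ks": (10.0, 30.0), "tau_cr": (0.001, 0.1),
          "Hs": (0.05, 0.15), "vp": (0.0, 2.0), "n": (0.05, 0.15)}

def lhs_calibrate(run_model, cp_obs, n_samples=5000, seed=0):
    """Draw stratified parameter samples and return the set that
    minimises the sum-of-squared-residuals objective (Eq. 6.8)."""
    sampler = qmc.LatinHypercube(d=len(ranges), seed=seed)
    lo, hi = zip(*ranges.values())
    samples = qmc.scale(sampler.random(n_samples), lo, hi)
    of = np.array([np.sum((cp_obs - run_model(dict(zip(ranges, p)))) ** 2)
                   for p in samples])
    best = int(np.argmin(of))
    return dict(zip(ranges, samples[best])), float(of[best])
```

Repeating this with ten different seeds reproduces the ten alternative 'optimum' parameter sets whose spread is used below as an estimate of result uncertainty.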
The experimental procedure is summarised in Figure 6.5. For each calibration data set, a
series of ten independent calibrations are done, thereby identifying ten alternative
'optimum' parameter sets and corresponding time-series for July to August, 1998. These
parameter sets are then applied to modelling the 1999 response. The variation in the ten
optimum modelled time-series is an estimate of result uncertainty, and visual comparison
indicates the degree to which this estimated uncertainty explains the idealised data.
[Flowchart: five monitoring programmes (daily, 2-daily, weekly, all events, 1st event), each with up to five data realisations, feed calibration (10 x 5000 runs), then prediction (10 x 1 run), then performance evaluation (1. standard error, 2. failure rate, 3. bias, 4. time-series comparison).]
Figure 6.5 Experimental procedure. Five data realisations shown for the case of noisy calibration data. Only one data realisation is used for the case of error-free data.
At this stage, the model’s original task should be recalled – to estimate monthly
phosphorus loads to the Dahuofang reservoir. While confidence in the identified model is
related to comparisons of true and modelled time-series, it is not adequate to evaluate
proposed monitoring schemes (or our scenarios of data and model errors) solely on that
basis. In particular, the modelling purpose does not necessarily require that the low flow
phosphorus concentrations are simulated accurately, but that the monthly integration of
flow and concentration over time is adequate. Therefore, an analysis of errors in modelled
monthly exports also needs to be done. The modelled export is defined as,
\[ E_p = 86.4 \sum Q \cdot C_p \tag{6.9} \]
where, in this context, Q (m3s-1) and Cp (gm-3) are daily modelled flow and phosphorus at
the downstream boundary (river kilometre 40), Ep (kg) is the modelled monthly
phosphorus export, and 86.4 effects the necessary unit conversions. Three measures of the
error in Ep are applied. Firstly, the standard error of Ep, for each calibration data scenario
separately, is calculated as the standard deviation of Ep for the corresponding set of ten
‘optimum’ modelled time-series. This is expressed as a percentage of the mean modelled
Ep, and represents the uncertainty in Ep due to difficulty of identifying a single optimum
model from the data. Secondly (again for each individual calibration data scenario), the
difference between the mean Ep and the known true value is expressed as a percentage of
the latter, representing the actual bias in Ep. The third measure of error is the percentage
of realisations of Ep which fail to fall within a specified tolerance of the true value (from
here on, such realisations are referred to simply as ‘failed results’). For the current study,
this tolerance is specified as ±25%, which is considered necessary to drive a useful
model of the Dahuofang Reservoir. This proportion of realisations that lead to failure of
the performance criteria may be considered a measure of the probability of obtaining an
unhelpful outcome, should we choose to employ a single ‘optimum’ model.
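A minimal sketch of the export calculation and these three error measures, assuming NumPy, is:

```python
import numpy as np

def monthly_export(Q, Cp):
    """Monthly phosphorus export Ep in kg (Eq. 6.9); Q (m3 s-1) and
    Cp (g m-3) are daily values and 86.4 effects the unit conversion."""
    return 86.4 * np.sum(np.asarray(Q) * np.asarray(Cp))

def export_error_measures(ep, ep_true, tol=0.25):
    """Standard error (% of mean Ep), bias (% of the true Ep) and the
    percentage of 'failed results' outside the +/-25% tolerance."""
    ep = np.asarray(ep, dtype=float)
    std_error = 100.0 * ep.std(ddof=1) / ep.mean()
    bias = 100.0 * (ep.mean() - ep_true) / ep_true
    failed = 100.0 * np.mean(np.abs(ep - ep_true) > tol * ep_true)
    return std_error, bias, failed
```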
6.5 Results, discussion and supplementary experiments
Under the various experimental conditions of data and model structure errors, Tables 6.2
to 6.4 list the three alternative evaluations of error in monthly exports, as defined above.
In Tables 6.3 and 6.4, where data error is involved, the range of results over the five data
error realisations is given. The results in parenthesis are those obtained using a genetic
algorithm for calibration (see below).
Figure 6.6 shows the ten optimum time-series of Cp identified using the trial sets of error-
free calibration data shown in Figures 6.4(a) to 6.4(d). Figures 6.6(a), 6.6(c), 6.6(e) and
6.6(g) show the 1998 results (i.e. performance under calibration conditions) and Figures
6.6(b), 6.6(d), 6.6(f) and 6.6(h) show the 1999 results (i.e. performance in predictive
mode). The respective wet season hydrographs are included in Figures 6.6(i) and 6.6(j).
Regarding Figure 6.6(a), the limitations of the LHS calibration procedure have caused
some uncertainty in the optimum model result (note that the model is known to be
capable of perfectly fitting the true data). While this uncertainty is quite minimal under
calibration conditions, Figure 6.6(b) indicates that there is significant divergence of
solutions in predictive mode. This indicates that the information retrieved during
calibration has lost relevance under the changed boundary conditions.
While it might be supposed that an improved search procedure will reduce the prediction
error, the degree to which this is true depends also on the significance of equifinality of
parameter sets under calibration conditions. That is, even if a more efficient global search
algorithm is employed, a unique optimum parameter set, which allows a unique forecast
of the future, still may not be found. To investigate the significance of this effect, the
experiment using the error-free data was repeated, this time using a genetic algorithm that
is part of the WaterRAT tool (McIntyre and Zeng 2002). This algorithm evolves a sample
of parameter sets until their objective function values (from Equation 6.8) converge to
within a specified tolerance of the sample optimum (a sample of 100 and a tolerance of
0.1% were used in this case). The average number of simulations required as part of this
procedure was approximately 8100.
[Figure: optimum modelled phosphorus concentration (mg/l, 0 to 0.08), 2 July to 31 August: (a)/(b) daily, (c)/(d) 2-daily, (e)/(f) weekly and (g)/(h) event daily calibration data, with 1998 results (calibration) on the left and 1999 results (prediction) on the right; (i)/(j) the 1998 and 1999 wet-season hydrographs (flow, m3/s, 0 to 500).]
Figure 6.6 The optimum model results retrieved from the calibrations using an error-free model and error-free data at different frequencies
Table 6.2 Model performance with no data or structural errors. The results in parentheses are those obtained using a genetic algorithm for calibration.

Year  Calibration data frequency  Nres  Standard error (%)  Bias (%)  No. failed (%)
1998  Daily                        62   4 (3)               2 (1)     0 (0)
      2-daily                      31   8 (5)               4 (3)     0 (0)
      Weekly                        9   17 (16)             11 (6)    30 (25)
      Daily event (all events)     11   6 (4)               1 (6)     0 (0)
      Daily event (first event)     4   5 (6)               9 (6)     0 (0)
1999  Daily                        62   15 (4)              10 (13)   20 (15)
      2-daily                      31   13 (5)              21 (18)   35 (20)
      Weekly                        9   14 (7)              28 (22)   45 (40)
      Daily event (all events)     11   17 (4)              16 (16)   25 (20)
      Daily event (first event)     4   19 (14)             23 (28)   30 (40)

Table 6.3 Model performance with data error and no structural error

Year  Calibration data frequency  Nres  Standard error (%)  Bias (%)  No. failed (%)
1998  Daily                        62   4 - 8               2 - 3     0
      2-daily                      31   5 - 7               1 - 5     0
      Weekly                        9   10 - 15             18 - 23   0 - 40
      Daily event (all events)     11   5 - 9               3 - 5     0
      Daily event (first event)     4   6 - 8               5 - 8     0 - 10
1999  Daily                        62   9 - 20              13 - 27   20 - 45
      2-daily                      31   17 - 19             10 - 35   20 - 50
      Weekly                        9   12 - 21             27 - 56   40 - 55
      Daily event (all events)     11   17 - 21             9 - 24    25 - 40
      Daily event (first event)     4   15 - 27             22 - 45   25 - 70

Table 6.4 Model performance with data error and structural error

Year  Calibration data frequency  Nres  Standard error (%)  Bias (%)   No. failed (%)
1998  Daily                        62   5 - 9               5 - 9      0 - 5
      2-daily                      31   6 - 8               5 - 8      0
      Weekly                        9   6 - 7               5 - 7      0 - 15
      Daily event (all events)     11   7 - 11              4 - 9      0 - 10
      Daily event (first event)     4   7 - 11              5 - 8      0 - 10
1999  Daily                        62   12 - 13             17 - 170   40 - 55
      2-daily                      31   10 - 19             26 - 157   20 - 50
      Weekly                        9   9 - 12              13 - 188   30 - 55
      Daily event (all events)     11   9 - 38              10 - 81    30 - 55
      Daily event (first event)     4   8 - 19              12 - 103   15 - 50
The result was that, although the genetic algorithm led to the average standard error of the
1998 monthly export using daily data reducing from 4% to 3% and the bias from 2% to
1% (see Table 6.2), the bias in the predicted 1999 export increased from 10% to 13%.
Similar observations can be made for the other calibration data frequencies, with the
genetic algorithm generally improving performance under calibration conditions, but
providing no significant improvement in the forecasts of export. This indicates that using
a more powerful search algorithm during calibration will reduce result uncertainty under
idealised calibration conditions but, due to the equifinality caused by over-
parameterisation, will not necessarily improve the quality of model forecasts.
The step-down to 2-daily data, from Table 6.2, leads to a consistent deterioration in
performance from that achieved with daily data. Reducing the calibration data to weekly
further worsens the reliability of results (Figures 6.6(e) and 6.6(f), and Table 6.2), with a
45% (40% using the genetic algorithm) rate of failed results in predictive mode, and an
average 28% (22%) bias in the monthly export compared to the standard error of 14%
(7%). (The increase in bias is because the July P fluxes are more consistently under-
estimated when the less frequent calibration data are employed. Although not obvious in
Figure 6.6, the error in concentration is compounded when multiplied by high flows). The
estimated standard error is a poor measure of the reliability of the model, as it is often
significantly lower than the actual bias, and it may be concluded that, even in these
idealised conditions, a more robust estimator of model uncertainty is needed.
The use of daily event calibration data (11 samples covering all 3 main events of the 1998
wet season; and 4 samples covering only the first event) leads to better reliability than
achieved using weekly sampling. For the 4-sample option this improvement is not
significant, for example reducing average bias from 28% to 23% using LHS, and
increasing it from 22% to 28% using the genetic algorithm. While there are potential
economies to be made in pursuing event-based monitoring, small-sample results are
likely to be particularly sensitive to the introduction of model structural errors (Chatfield
1995), as evaluated below.
The use of error-free data has allowed us to gauge the effects of parameter equifinality
and search algorithm limitations. It provides an indication of the limit to which improved
data quality, and model structural quality, can improve export forecasts at various
calibration data frequencies. The results exemplify the fundamental reliability problems
in predictive modelling, due to equifinality and the biases introduced by restricting
calibration data to a frequency below that of the model response time (daily, in this case).
The results in Table 6.2 show that biases, which we would wish to assume much smaller
than the estimated standard errors, are likely to become larger as calibration data are thinned to typically sparse frequencies.
The introduction of data error (both in loads Cpu and Cpd, and in calibration data C̄p)
provides a more realistic basis upon which to review the benefit of increased sampling
frequency. For this experiment, the 1998 calibration and loading data sets have five
realisations of Gaussian error imposed, all described by Equations 6.6 and 6.7. The 1999
loading data are left error-free, allowing us to concentrate on examining errors introduced
at calibration stage. Table 6.3 shows the range of performances obtained over the five realisations of data noise. These ranges are wide, with, for example, the
number of prediction failures associated with the one-event data ranging from 5 to 14,
implying that the predictive performance depends largely on the particular realisation of
data noise. It should be noted that, in general practice, only one realisation of calibration data is available, and that estimating the noise distribution is itself a fundamental part of the calibration problem (see Chapter 1).
So far, in reviewing the relative effects of the calibration algorithm limitations,
equifinality of parameter sets, data errors and data sparseness, the important issue of
model structure error has been deliberately set aside. However, it is clear that the model
proposed in Equations 6.1 to 6.6 is a great simplification of the real Hun River system,
and that it has a number of significant structural faults that will in practice further
complicate the quest for prediction reliability. To review the implications of the inevitable
structural error on the value of calibration data, a manufactured structural error is
imposed upon the model. It is supposed that evaluating the performance of this erroneous
model will be indicative of more general structural problems, which may affect the
planning of the monitoring programme. The newly contrived model is based on the
resuspension model described by Blom and Aalderink (1998), in which the resuspension
rate is proportional to velocity-over-threshold (rather than the shear stress relationship
used in Equation 6.2),
\[ R_{sp} = k_s S_p \left( u - u_{cr} \right) - v_p C_p \qquad \text{for } u > u_{cr} \tag{6.10a} \]
\[ R_{sp} = - v_p C_p \qquad \text{for } u \leq u_{cr} \tag{6.10b} \]
where ucr is the water velocity at which resuspension is initiated, and the other terms are
already defined. The model is calibrated using the noisy data that were previously
generated by the error-free model structure. Again, ten LHS calibrations are done for each
of the five realisations of data noise.
The calibrated (1998) and predictive (1999) results for the daily, 2-daily, weekly and first
event data are shown in Figure 6.7, and Table 6.4 summarises the model performances.
The model appears to have had general success in simulating the 1998 resuspension
events, but inspection of the event recessions reveals consistent misrepresentation as
evidence of the structural fault. Nevertheless, Table 6.4 shows that under calibration
conditions the model performs no less impressively than its error-free alternative. On the
other hand, the 1999 prediction results indicate deteriorated performance, with, for
example, percentage bias of up to 170% using the daily data set. Differences in the
number of failed results are not so significant, because the large percentage bias is caused
by a small number of very biased results (again it is worth emphasising that in a
conventional, deterministic calibration, there would be no indication that such a result
was so misrepresentative). Another important result in Table 6.4 is that using event-based
calibration data has led to the generally less biased results. While not conclusive, this
indicates that the value of additional information in the calibration data (which was
evident in Table 6.2) is compromised by the presence of model error; moreover, it
indicates that the additional information contained in the low-flow conditions has become
an impediment, at least in some realisations. In the presence of structural error, the
objective function has become a poorer evaluator of the sampled model, identifying
optimum parameter sets which compensate for the structural fault under calibration
conditions. The preferred data set (and the preferred objective function) are those which
limit the adverse nature of this compensatory effect, with respect to the predictive task of
the model. For example, this study implies that neglecting the low flow data may improve
estimates of monthly export. Generally speaking, the utility of data cannot be judged
without consideration of the complexity of the modelling problem - the interactions
between the model structure, data reliability and objective function. More specifically, the
calibration data should be restricted to those which are more relevant to the intended
application of the model.
As a final experiment, sediment data were generated to explore the benefits which might
arise from their inclusion in the calibration. Again, nominal noise is introduced into the
data using Equation 6.6, and the calibration objective function is modified to Equation
6.11.
[Figure: optimum modelled phosphorus concentration (mg/l, 0 to 0.08), 2 July to 31 August: (a)/(b) daily, (c)/(d) 2-daily, (e)/(f) weekly and (g)/(h) event daily calibration data, with 1998 results (calibration) on the left and 1999 results (prediction) on the right.]
Figure 6.7 The optimum model results retrieved from the calibrations using a model with structural fault and data with Gaussian error at different frequencies
\[ OF_k = \left[ \sum_{i=1}^{N_{res}} \left( \bar{C}_p - C_p \right)_i^2 + \sum_{j=1}^{N'_{res}} \left( \bar{S}_p - S_p \right)_j^2 \right]_k \tag{6.11} \]

where S̄p is the reference sediment data, N'res is the number of points of sediment data included (= Nres for all experiments), and the other variables are as previously defined.
The new calibration was performed using the same frequencies of Cp and Sp data as in the
previous calibration, with both the true and erroneous model structures, using the
experimental procedure outlined in Figure 6.5. The ‘true’ Sp result along with the error-
free data and one realisation of erroneous data are illustrated in Figure 6.8, and Tables 6.5
- 6.7 summarise the performance in predicting monthly export (see the end of section 6.4
for definitions of performance measures). Comparing Tables 6.5 and 6.6 with Tables 6.2
and 6.3 indicates that the additional information within the Sp data is generally valuable,
in terms of reducing bias and failed results, where there is no structural error. With the
introduction of structural error (compare Tables 6.4 and 6.7), while the extreme instances
of bias in Table 6.4 have been avoided, the number of failed results is generally slightly
higher. Thus the value of the extra data seems to be more questionable when model
structural error is present but not addressed prior to calibration. Another issue with using
sediment data would be the practical difficulty of achieving representative Sp
measurements, especially considering the conceptual nature of the modelled sediment
phosphorus store.
It is notable from the results in Tables 6.3, 6.4, 6.6 and 6.7 that, when data and structural
error are allowed to affect the calibration process, the intuitive expectation that model
performance is proportional to calibration data frequency (and prior perceptions of
information content) is not at all borne out. Far more important controls on model
performance, as measured by percentage bias, are data noise and model structural error.
This leads to the suggestion that resources should be initially directed at collecting short
periods of high quality data, and evaluating and developing these data in a quest to resolve
the structural error. Specifically, in the Hun River study, quality data from the first event
would be recommended. However, it is naive to suggest that a single event’s data may be
adequate for structural identification, especially given the inevitable structural change that
arises in water quality systems (Van Straten 1998; also see Section 2.2.3 of this
dissertation). There are also the practical limitations of sampling and measurement
precision – the size of the residuals in Figure 6.7 indicates that visually identifying the
structural fault would be problematic, should there be even a small amount of error.
Furthermore, the error in the phosphorus loading data seems bound to contribute
significantly to the identification problem, irrespective of the quality of the in-river
calibration data. Improving the quality of the loading data is an onerous task, as it is a
function of rainfall and diffuse loads (among other inputs) from the large and sparsely
gauged upper Hun River catchment (Suttamanutwong 2001). Also, while the 1999 load
data were assumed to be precise in the experiments described above (to allow focus on
the value of calibration data), their errors would directly and significantly add to
prediction errors. Commitment of resources to in-river sampling should not, therefore, be
undertaken without a broader review of data requirements, especially with regard to
identification and modelling of diffuse sources.
Table 6.5 Performance with no data or structural errors (with sediment data)

Year  Calibration data frequency   Nres   Standard error (%)   Bias (%)   No. failed (%)
1998  Daily                          62            8               3             0
      2-daily                        31            9               8             5
      Weekly                          9            9               2             0
      Daily event (all events)       11            5               7             0
      Daily event (first event)       4            7               3             0
1999  Daily                          62           12               4            10
      2-daily                        31           14               6            10
      Weekly                          9           12               2             5
      Daily event (all events)       11           13               2            15
      Daily event (first event)       4           14               9            20
Table 6.6 Performance with data error and no structural error (with sediment data)

Year  Calibration data frequency   Nres   Standard error (%)   Bias (%)   No. failed (%)
1998  Daily                          62            9            4 - 10        0 - 5
      2-daily                        31         7 - 11           2 - 6            0
      Weekly                          9         9 - 12           2 - 5        0 - 5
      Daily event (all events)       11         8 - 11          4 - 10       0 - 10
      Daily event (first event)       4         6 - 11          2 - 13        0 - 5
1999  Daily                          62         1 - 19          2 - 14       0 - 15
      2-daily                        31         9 - 11           2 - 4            0
      Weekly                          9        12 - 17           2 - 5       5 - 10
      Daily event (all events)       11         9 - 33          2 - 23       5 - 35
      Daily event (first event)       4         9 - 14          3 - 25       0 - 40
Table 6.7 Performance with data error and structural error (with sediment data)

Year  Calibration data frequency   Nres   Standard error (%)   Bias (%)   No. failed (%)
1998  Daily                          62         6 - 9          17 - 18       5 - 10
      2-daily                        31         5 - 6          19 - 21      15 - 25
      Weekly                          9         4 - 6          16 - 20       5 - 20
      Daily event (all events)       11         8 - 13         13 - 16      10 - 15
      Daily event (first event)       4         4 - 8          13 - 22       5 - 45
1999  Daily                          62        13 - 25         24 - 26      30 - 50
      2-daily                        31         6 - 9           8 - 34      33 - 50
      Weekly                          9         4 - 7          32 - 34      45 - 50
      Daily event (all events)       11        29 - 39         24 - 35      50 - 60
      Daily event (first event)       4        13 - 31         25 - 98     65 - 100
[Figure: five panels of sediment phosphorus SP (g m-2) against date (2 July to 31 August): (a) daily calibration data; (b) 2-daily calibration data; (c) weekly calibration data; (d) daily event calibration data (all events); (e) daily event calibration data (first event). Each panel shows the error-free calibration sediment data, a sample of calibration sediment data with error, and the “true” model result.]
Figure 6.8 Sediment calibration data and “true” model result for 1998. Only one of five realisations of data error is shown.
The results have clearly shown that the various sources of error frequently cause the
calculated standard error of monthly exports of P to be lower than the actual (and in
practice unknown) bias of the result. The modelled uncertainty has, in this study, merely
represented the sampling error in the calibration procedure along with the equifinality of
parameter sets, and is not an adequate indicator of total model uncertainty. Even if the
error in the calibration data could be estimated, and some suitable allowance made
(Chapters 2, 7), this does not allow for the important influence of the unknown but
inevitable structural error. Intelligent judgement of the scope for model and data errors is
necessary, together with a less restricted estimator of model parameter uncertainty.
Established approaches to doing so are Bayesian parameter estimation (e.g. Reichert and
Omlin 1998), set-theoretic methods (Van Straten and Keesman 1991), the use of a
number of candidate model structures (Chatfield 1995), and an integrated framework of
these called Generalised Likelihood Uncertainty Estimation (GLUE; Beven and Binley
1992). While these methods cannot solve the uncertainty estimation problem (i.e. cannot
reliably estimate the scope for biases in model predictions), they provide a more
reasonable alternative to objective methods which inevitably over-simplify the
complexity and extent of the issue. Where permitted by constraints of available modelling
resources, established (e.g. Beck 1983) and novel (e.g. Wagener et al. 2002b) system
identification techniques should be considered as means of reducing model structural
error.
6.6 Summary
A conceptual model of river phosphorus mobilisation and transport has been proposed,
and applied to the upper Hun River in Liaoning, China over the wet seasons of 1998 and
1999. The aim was to investigate the benefits associated with alternative programmes of
in-river calibration data collection, in terms of the achievable reliability of model
predictions.
Alternative programmes of in-river data collection to support model calibration were
evaluated. Using synthesised, but reasonable scenarios of data error and model structural
error, the model performance was tested in terms of the reliability and precision of its
predictions of monthly phosphorus exports. The experiments show that, when data and
structural errors are present, decreasing the total number of samples in a 2-month period
from 62 to 4 led to only marginal deterioration in performance. This is partly because the information content of the data varied markedly within the 2-month period, with low-flow data being information-poor or, in some cases, detrimental. Model predictive performance was strongly degraded by the investigated scenarios of data noise and model structural error, despite reasonable performance under calibration conditions.
Failure to identify a unique optimum set of parameters caused significant variability
amongst calibration results, even using an error-free model and data set. The limitations
of the Latin Hypercube Sampling calibration procedure in finding a unique global
optimum parameter set were examined by comparisons with results of a genetic
algorithm. While the latter method reduced uncertainty under calibration conditions (i.e.
improved convergence of the calibration objective function), it led to no significant and
consistent improvement in model predictions (i.e. no improvement in convergence of
the calibrated model). While the alternative realisations of optimum parameter sets were
carried forward to give an estimate of standard error in the modelled phosphorus exports,
this estimate was, in many cases, under-representative of the actual error. More liberal,
robust estimators of uncertainty are needed, which aim to limit under-representation of
the scope for prediction error. The pollution load errors are thought to be another limiting
control on model performance, and it is reasonable to suggest that improved nutrient load
models should take priority over the in-river transport model. Further study of the
sensitivity of nutrient export to a wider range of inputs (to include, for example, rainfall
and fertiliser application) is needed.
In summary, the experiments indicate a high risk that in-river data collection will fail to
support the sought modelling capability, given the wider modelling problems that are
perceived to be present. Therefore it is recommended that a minimalist approach to
sampling is initially taken, which concentrates on the first major storm event where data
information content is considered to be relatively high and robust to structural error.
However, recommendations cannot be restricted to the in-river sampling regime, but
should encompass the broader modelling strategy, expectations, and general data
priorities.
7. Risk-based modelling of surface water quality: a case study of the
Charles River, Massachusetts
A model of chlorophyll-a, dissolved oxygen and nutrients is presented and applied to the
Charles River, Massachusetts within a framework of Monte Carlo simulation. The model
parameters are conditioned using data from eight sampling stations along a 40km stretch
of the Charles River, during a (supposed) steady-state period in the summer of 1996, and
the conditioned model is evaluated using data from later in the same year. Regional multi-
objective sensitivity analysis is used to identify the parameters and pollution sources most
affecting the various model outputs under the conditions observed during that summer.
The effects of Monte Carlo sampling error are included in this analysis, and the
observations which have least contributed to model conditioning are indicated. It is
shown that the sensitivity analysis can be used to speculate about the factors responsible
for undesirable levels of eutrophication, and to speculate about the risk of failure of
nutrient reduction interventions at a number of strategic control sections. The analysis
indicates that phosphorus stripping at the CRPCD wastewater treatment plant on the
Charles River would be a high-risk intervention, especially for controlling eutrophication
at the control sections further downstream. However, as the risk reflects the perceived
scope for model error, it can only be recommended that more resources are invested in
data collection and model evaluation. Furthermore, as the risk is based solely on water
quality criteria, rather than broader environmental and economic objectives, the results
need to be supported by detailed and extensive knowledge of the Charles River problem.
7.1 Introduction
7.1.1 Motivation
The reasons for and significance of uncertainty in predictions of water quality are widely
recognised and documented elsewhere (e.g. Hornberger and Spear 1980, Beck 1983,
Beck 1987, Reckhow 1994, Reichert and Omlin 1996, Van Straten 1998, Adams and
Reckhow 2001, McIntyre et al. 2001). The inevitability of significant uncertainty, and the
need to account for it in water quality management, has been recognised in the
development of some decision-support tools, for example QUAL2E-UNCAS (Brown and
Barnwell 1987), DESERT (Ivanov et al. 1996), SIMCAT (UK Environment Agency
2001a) and WaterRAT (McIntyre and Zeng 2002). An alternative modelling philosophy
is to aim to reduce uncertainty to an insignificant level through refinement of scale and
process representation (Young et al. 1996, Beck 1999). While such refined models have
proven valuable in a number of applications (see the review of Ambrose et al. 1996),
there are four reasons why this modelling philosophy seems to be of restricted value in
practice. 1) Field data are not usually adequate to identify the boundary conditions and
parameter values of such models (Beck 1997). 2) In any case, more data do not
necessarily lead to better models or system understanding (Chatfield 1995, Beck 1999).
3) Human resource constraints often preclude intricate, data-intensive, modelling
exercises (Reckhow 1994, van Straten 1998). 4) Results of complex models are more
easily mis-interpreted, while not necessarily being any more reliable (e.g. Gardner et al.
1980, Van der Perk 1997). Therefore, there remains a need to promote uncertainty
estimation, and to continue to develop models and analytical tools which permit such
analysis, and which reflect the resource constraints of users.
This chapter will exhibit and review the utility of the WaterRAT tool for water quality
management, using the Charles River, Massachusetts as a case study.
7.1.2 Scope and objectives
The case study is approached in a way which addresses the difficulty of applying a water
quality model to decision support on the basis of limited supporting data and modelling
resources. This is done in the context of five tasks which are set for the study.
• To condition the model using the water quality data observed at various control
sections of the Charles River on the 20th August 1996.
• To evaluate the conditioned model with respect to its success in representing the
water quality observed on the 8th October of 1996.
• To identify the principal factors affecting water quality on 20th August 1996.
• To support the appraisal of options for reducing eutrophication in the Charles
River, specifically limiting chlorophyll-a to less than 10mgm-3 at a number of
control sections. Proposed water quality interventions will be evaluated on the
basis of associated risk of failure to achieve the specified target.
• To identify ways to reduce the element of risk which stems from model
prediction uncertainty.
Arguably, confronting these tasks rigorously would require careful consideration of
different modelling tools, and selection of one or more approaches. Furthermore, a critical
review of data reliability including evaluation of sources of sampling and measurement
errors would normally be recommended, followed by iterative model structure
adjustments and parameter calibrations. However, within the scope of this chapter, none
of these practices are adopted. Instead, this investigation starts from the premise that
modelling resources are limited so that only one model structure can be used, and that the
readily available data must be interpreted without researching quality control issues.
Additionally, the view is taken that the in-river data are too sparse to be usefully analysed
using traditional maximum likelihood techniques. Instead, this investigation is founded
on the methodologies of Regional Sensitivity Analysis (RSA; Hornberger and Spear
1980) and Generalised Likelihood Uncertainty Evaluation (GLUE; Beven and Binley
1992), whereby qualitatively derived constraints are used to supplement the information
in the sparse data set. In adopting these approaches, we set out to address the question “If
human resources and observed data are limited, as they typically are, what degree of
support can be given to strategic management of water quality?”
The study does not take account of a large number of factors presently affecting policies
for management of the Charles River, nor of many observations outwith the 1996 study.
The study is primarily a demonstration of methods, and all results should be seen in this
context.
7.1.3 The case study
Water quality problems associated with the Charles River in previous decades were
industrial pollution and combined sewer overflows which led, among other unwelcome
effects, to nutrient enrichment and eutrophication. Storm-water interceptions and other
interventions in the 1990s have greatly improved the overall ecology and amenity value
of the river, although they have failed to control eutrophication satisfactorily. Further
measures are currently being implemented by installing phosphorus stripping facilities at
a number of wastewater treatment plants (CRWA 2000). However, such control of point
sources will not necessarily solve the problem, as phosphorus from non-point sources
may enter the river directly or via tributaries. Furthermore the phytoplankton may be
resilient to low phosphorus concentrations. There is therefore a need for decision support
tools which can estimate the residual eutrophication given various options for point and
non-point interventions.
This study looks at the 40km length of the Upper Charles River, between the Populatic
Pond in Medway County and the Cochrane Dam in Dover County, on two days in the
summer to autumn of 1996 (the 20th August and the 8th October), when data were
collected at nine sections along the river (CDM 1997), shown in Figure 1.5. The
determinands measured include,
• chlorophyll-a (Ca),
• dissolved oxygen (Cox),
• 5-day biochemical oxygen demand (Ccf),
• organic phosphorus (Cps) and orthophosphates (Cpo),
• organic nitrogen (Cns), nitrates (Cni) and ammonium (Cna),
• flow (Q), water depth (Hw) and water temperature (Tw).
These measurements were made three times on each day to estimate a daily mean and an
expected diurnal range. In addition, the major point sources to the river (two wastewater
discharges and seven tributaries) were monitored, and daily pollution and hydraulic loads
were estimated. An additional pollution and hydraulic load was assumed to be evenly
distributed along the studied length of river, based on measurements at a number of minor
inlets to the river. The dates were chosen from periods when the river was
considered to be near steady-state, and a steady-state assumption is maintained in this
exercise.
7.2 Model Structure and Methods
7.2.1 Specification of the model structure
We begin with the premise that a model structure which is adequate for the tasks can be
adopted, and all the uncertainty in the model structure can be represented by parameter
uncertainty. While not analytically ideal, this premise is consistent with the constraints of
time and resources which are normal in practice, and the adequacy of the model structure
will be reviewed and discussed as part of the model evaluation. The philosophy of
parsimonious modelling has been rejected, as one important task is to explore the risk
stemming from model components which are not identifiable during conditioning, but are
relevant to future scenarios. Importantly, it should be noted that the selected structure,
summarised below, is not intended to represent our full prior knowledge of phytoplankton
dynamics and nutrient cycling due to the unmanageable complexity this would entail
without expected improvements in predictive performance, but is a simplification based
on our prior experience of what the principal components of the system are likely to be.
Notwithstanding the simplifications, the model is of a similar complexity to the
commonly used QUAL2E model (Brown and Barnwell 1987).
The selected model structure includes all the monitored determinands previously listed.
The in-river nutrient and oxygen cycling processes are represented by the set of
differential equations presented below, and the interactions of the water quality
determinands are summarised in Figure 7.1. The descriptions below are purposefully brief
- a more in-depth description and discussion of the formulations are given in McIntyre
and Zeng (2002) and Chapra (1997).
Phytoplankton
$$ \frac{dC_a}{dt} = k_{ga}C_a - k_{da}C_a + \Phi(C_a) + \Delta(C_a) \qquad (7.1) $$
where t is time (in units of s); Ca is phytoplankton concentration measured as
Chlorophyll-a (gm-3); ∆(Ca) represents the advective and dispersive transport of
phytoplankton to and from adjacent control volumes of water; Φ(Ca) represents the
external load of phytoplankton; kga (s-1) is the net photosynthesis rate of phytoplankton
(includes effect of phytoplankton respiration); and kda (s-1) is the death rate of
phytoplankton. kga is a function of water temperature Tw (oC), light availability I (Jm-2 s-1),
nutrient availability and the maximum net photosynthesis rate at Tw = 20oC, kga20.
Similarly, kda is a function of water temperature and the maximum death rate at Tw =
20oC, kda20;
$$ k_{ga} = f_I(I)\, f_N(C_{na}, C_{ni}, C_{po})\, f_T(T_w)\, k_{ga20} \qquad (7.2) $$

$$ k_{da} = f_T(T_w)\, k_{da20} \qquad (7.3) $$
where the functions fI, fN and fT are respectively based on the Steele model of light
limitation, the Michaelis-Menten equation of nutrient limitation (where fN is defined by
the minimum of nitrogen and phosphorus limitation), and the Arrhenius formula of
temperature effect (see Chapra 1997: p40, p607, p610). The coefficient θ which defines
the relationship between reaction rates and temperature (Equation 3.13) is, for this study,
assumed to have a common value over all the model components.
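For concreteness, these limitation functions can be sketched as below, using the standard Steele, Michaelis-Menten and Arrhenius forms from Chapra (1997); treating kgahsl as the Steele optimum intensity and total inorganic nitrogen (Cna + Cni) as the nitrogen availability are assumptions about the implementation, and the function names are illustrative rather than taken from the WaterRAT source.

```python
import numpy as np

def f_T(Tw, theta):
    """Arrhenius temperature correction relative to 20 degC."""
    return theta ** (Tw - 20.0)

def f_I(I, k_gahsl):
    """Steele light limitation, treating k_gahsl as the optimum intensity."""
    return (I / k_gahsl) * np.exp(1.0 - I / k_gahsl)

def f_N(C_na, C_ni, C_po, k_gahsn, k_gahsp):
    """Michaelis-Menten nutrient limitation: the minimum of the nitrogen
    and phosphorus limitation terms (total inorganic N assumed)."""
    f_n = (C_na + C_ni) / (k_gahsn + C_na + C_ni)
    f_p = C_po / (k_gahsp + C_po)
    return min(f_n, f_p)

def k_ga(I, C_na, C_ni, C_po, Tw, theta, k_ga20, k_gahsl, k_gahsn, k_gahsp):
    """Net photosynthesis rate, Equation 7.2."""
    return (f_I(I, k_gahsl) * f_N(C_na, C_ni, C_po, k_gahsn, k_gahsp)
            * f_T(Tw, theta) * k_ga20)
```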
[Figure: schematic of a control volume of water exchanging with the atmosphere (aeration) and the sediment (sedimentation, sediment oxygen demand, denitrification), showing the interactions between Ca, Cox, Ccf, Ccs, Cns, Cna, Cni, Cps and Cpo through photosynthesis, respiration, phytoplankton death, hydrolysis, nitrification, fixation and the associated losses from the control volume.]
Figure 7.1 Schematic of conceptual processes and state interactions
Dissolved oxygen
Dissolved oxygen Cox (gm-3) is gained or lost through exchange with the atmosphere
which occurs in direct proportion to the oxygen deficit (Cos - Cox) at a rate kra (ms-1),
where Cos is the dissolved oxygen concentration in equilibrium with the atmosphere
calculated as a function of water temperature (Equation 1.15). Cox is gained due to
photosynthesis, at a rate of one unit of oxygen to roa units of chlorophyll-a produced; lost
due to bacterial respiration at a rate koc (s-1); lost due to nitrification at the rate of
nitrification kon multiplied by the oxygen demand of a unit mass of ammonium nitrogen
ron; and finally dissolved oxygen is lost due to sediment oxygen demand SOD (gm-2s-1)
which is a spatially varying model parameter.
$$ \frac{dC_{ox}}{dt} = \frac{k_{er}k_{ra}}{H_w}\left(C_{os} - C_{ox}\right) + r_{oa}k_{ga}C_a - k_{oc}C_{cf} - r_{on}k_{on}C_{na} - \frac{SOD}{H_w} + \Phi(C_{ox}) + \Delta(C_{ox}) \qquad (7.4) $$
The aeration rate kra (ms-1) is calculated through relationships with water velocity, Hw
and Tw (see Chapra 1997: 377), and the effect of a weir or sluice on the aeration rate is
modelled using an additional empirical relationship (see Chapra 1997: 380). Error in
these aeration formulae is allowed for by assuming that any deviation from the
relationship increases linearly with kra at a rate given by parameter ker.
Non-living organic carbon
Non-living organic carbon (i.e. excluding that in phytoplankton) is modelled using two
conceptual fractions, both expressed in units of equivalent oxygen demand. The first is 5-
day biochemical oxygen demand Ccf representing the fast-decaying dissolved organic
carbon, and the second is Ccs representing the slow-decaying particulate detritus, which
slowly hydrolyses into Ccf.
$$ \frac{dC_{cf}}{dt} = k_{hc}C_{cs} - k_{oc}C_{cf} - r_{onp}k_{dn}C_{ni} + \Phi(C_{cf}) + \Delta(C_{cf}) \qquad (7.5) $$
$$ \frac{dC_{cs}}{dt} = -k_{hc}C_{cs} - \frac{v_{cs}}{H_w}C_{cs} + k_{da}r_{oa}C_a + \Phi(C_{cs}) + \Delta(C_{cs}) \qquad (7.6) $$
where khc (s-1) is the hydrolysis rate of Ccs which is dependent on Tw (as already described
for kda); koc (s-1) is the decay rate of Ccf which is similarly dependent on Tw and is also
limited by Cox using the previously mentioned Michaelis-Menten formulation; vcs (ms-1) is
the sedimentation rate of Ccs; kdn (s-1) is the rate of denitrification of nitrate Cni, a process
which consumes ronp units of Ccf for each unit of Cni (see Chapra 1997: 478).
Nitrogen
Nitrogen is included in the model as non-living organic nitrogen Cns, ammonium (plus
ammonia) Cna, and nitrate Cni. Nitrite is omitted under the assumption that the conversion
of ammonium to nitrite is the rate-limiting process (Chapra 1997: 422). Nitrogen is
allowed to be lost by sedimentation of Cns at a rate vns and by denitrification of Cni at a rate
kdn (s-1) which is a function of Tw as previously described (while denitrification generally
occurs only in anoxic waters, the kdn term represents the effect of denitrification processes
in the sediments – see Whitehead and Toms 1993). Similarly, the rate of nitrification of
Cna to Cni is dependent on Tw, and is also limited by Cox using the previously mentioned
Michaelis-Menten formulation. Both ammonium and nitrate are assimilated at a rate kga in
proportion to the nitrogen-chlorophyll-a ratio rna in the phytoplankton. The phytoplankton
consume Cni and Cna in a proportion (which is defined by coefficient kna) to the relative
availability of these two nutrients.
$$ \frac{dC_{na}}{dt} = -k_{ga}r_{na}k_{na}C_a - k_{on}C_{na} + k_{hn}C_{ns} + \Phi(C_{na}) + \Delta(C_{na}) \qquad (7.7) $$

$$ \frac{dC_{ni}}{dt} = -k_{ga}r_{na}\left(1 - k_{na}\right)C_a + k_{on}C_{na} - k_{dn}C_{ni} + \Phi(C_{ni}) + \Delta(C_{ni}) \qquad (7.8) $$

$$ \frac{dC_{ns}}{dt} = k_{da}r_{na}C_a - k_{hn}C_{ns} - \frac{v_{ns}}{H_w}C_{ns} + \Phi(C_{ns}) + \Delta(C_{ns}) \qquad (7.9) $$
Phosphorus
Phosphorus is represented by non-living organic phosphorus Cps and inorganic
phosphorus Cpo.
$$ \frac{dC_{po}}{dt} = -k_{ga}r_{pa}C_a + k_{hp}C_{ps} + \Phi(C_{po}) + \Delta(C_{po}) \qquad (7.10) $$

$$ \frac{dC_{ps}}{dt} = k_{da}r_{pa}C_a - k_{hp}C_{ps} - \frac{v_{ps}}{H_w}C_{ps} + \Phi(C_{ps}) + \Delta(C_{ps}) \qquad (7.11) $$
where rpa is the ratio of phosphorus to chlorophyll-a and vps is the effective sedimentation
rate of organic phosphorus.
The pollution transport model (through which the ∆ terms above are calculated) is a one-
dimensional control volume solution of the advection-dispersion equation (see Chapra
1997: 215), i.e. a different application of Equation 6.4. The control volumes are defined
by a series of 44 cells (average length of 910m) which make up the full length of the
river. Adjacent cells with similar hydro-geometric properties are grouped together into
reaches, giving the 12 reaches illustrated in Figure 7.2. The flow routing model is a non-
linear store whereby the flow out of each cell Q (m3s-1) is proportional to a power q2 of
the average depth in that cell Hw (m), with a constant of proportionality q1 (m^(3-q2) s-1),

$$ Q = q_1 H_w^{q_2} \qquad (7.12) $$
The rate of change of water volume V (m3) in each cell is simply the balance of flow from
the upstream cell Qup, flow out of the cell Q, and external sources Φ(Q),
$$ \frac{dV}{dt} = Q_{up} - Q + \Phi(Q) \qquad (7.13) $$
Hw is a function of V and the geometric properties of the cell. Each cell is conceptualised
as having a symmetrical trapezoidal cross-section, so these properties are the cell length,
the base width and the side-slope. The dispersion between cells is calculated from an
empirical relationship with water velocity (Chapra 1997: p245). The water temperature is
prescribed on the basis of observations.
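A minimal sketch of the cell-by-cell routing implied by Equations 7.12 and 7.13 is given below, using a simple explicit Euler step rather than the Fehlberg scheme actually employed (Chapter 5); the geometry helper and all names are illustrative assumptions.

```python
import numpy as np

def depth_from_volume(V, length, base_width, side_slope):
    """Water depth for a symmetrical trapezoidal cell, from
    V = length * (base_width * H + side_slope * H**2).
    Assumes a non-zero side-slope."""
    a = side_slope * length
    b = base_width * length
    # positive root of a*H**2 + b*H - V = 0
    return (-b + np.sqrt(b * b + 4.0 * a * V)) / (2.0 * a)

def route_step(V, Q_in, phi, q1, q2, geom, dt):
    """One explicit Euler step of Equation 7.13 for all cells in sequence.

    V     : array of cell volumes (m3)
    Q_in  : inflow to the most upstream cell (m3/s)
    phi   : external source inflow to each cell (m3/s)
    q1,q2 : routing parameters of Equation 7.12
    geom  : (length, base_width, side_slope) arrays for the cells
    """
    V_new = V.copy()
    Q_up = Q_in
    for i in range(len(V)):
        H = depth_from_volume(V[i], *[g[i] for g in geom])
        Q_out = q1[i] * H ** q2[i]                 # Equation 7.12
        V_new[i] += dt * (Q_up - Q_out + phi[i])   # Equation 7.13
        Q_up = Q_out                               # outflow feeds the next cell
    return V_new
```

With q2 fixed at 1, as in this study, Equation 7.12 reduces to the linear relationship Q = q1 Hw.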
[Figure: plan schematic of the modelled river: 44 cells numbered 1-44 grouped into 12 reaches, with chainages from 0 m at the headwater to 39,266 m at the Cochrane Dam, and point sources marked at the CRPCD WWTW, Mill River, Stop River, Medfield WWTW, Bogastow Brook, Waban Brook, Trout Brook, Sewall Brook and Indian Brook.]
Figure 7.2 River reaches, spatial grid and point sources
The set of ordinary differential equations described above is integrated using the Fehlberg
scheme described in Chapter 5. The integration was performed during the 12 days leading
up to the observed days, on the assumption that the daily average pollution loads and
other boundary conditions were constant during this period. 12 days was found to be
more than sufficient to allow the water quality to reach steady-state.
7.2.2 Specification of prior parameters and uncertainty in observed data
The model includes 24 biochemical parameters which are all considered to be uncertain.
The advective and dispersive terms (∆) in Equations 7.1 to 7.11 are assumed not to carry
uncertainty. The dispersion is time-variable as a function of velocity (Chapra 1997:
p245), and the q1 parameter is calibrated independently of water quality, on a reach-by-
reach basis, with q2 fixed at a value of 1. The ranges of possible values of the biochemical
parameters, prior to model conditioning, are taken from various reviews (Brown and
Barnwell 1987, Bowie et al. 1987, Thomann and Mueller 1982, Chapra 1997). All prior
parameter ranges together with references are listed in Table 7.1. All values within these
ranges are taken to be independent and equally likely prior to conditioning.
Table 7.1 Prior parameter ranges
Model component   Parameter   Unit          Min.     Max. [a]   Ref [b]
Slow carbon       khc20       s-1           0.001    0.1        Bo
                  vsc         m.s-1         0.05     1          Br, C
Fast carbon       koc20       s-1           0.1      3          T
                  kochs       mgO.L-1       0.1      1          Bo
Oxygen            ker         %             -100     100        C
                  kwe         -             0.0325   1.26       C
                  SOD         g.m-2 s-1     0.2      1          C
Nitrogen          khn20       s-1           0.001    0.4        Br, Bo
                  vsn         m.s-1         0.05     1          (as vsc)
                  kon20       s-1           0.1      1          Br
                  ron         gO.gN-1       4.57     4.57       C
                  konhs       mgO.L-1       0.1      1          (as kochs)
                  kdn20       s-1           0        0.1        Bo
                  ronp        gO.gN-1       2.86     2.86       C
Phosphorus        khp20       s-1           0.01     0.7        Br
                  vsp         m.s-1         0.05     1          (as vsc)
Phytoplankton     kga20       s-1           1        2.5        Br, Bo
                  kgahsn      mgN.L-1       0.01     0.03       Br, Bo
                  kna         mgN.L-1       0        1          Br, Bo
                  kgahsp      mgP.L-1       0.001    0.05       Br, Bo
                  kgahsl      W.m-2         3.8      24         Br, Bo
                  kda20       s-1           0.003    2          Bo
                  rna         gN.gChla-1    3.6      18         C
                  roa         gO.gChla-1    55       280        C
                  rpa         gP.gChla-1    0.5      2.5        C
Common            θ           -             1.024    1.15       Br, Bo
Notes: [a] If Max. = Min. then the parameter is taken as known with certainty. [b] Br = Brown and Barnwell (1987); Bo = Bowie et al. (1987); T = Thomann and Mueller (1982); C = Chapra (1997). Parenthesised entries indicate that the range has been assumed the same as that of another parameter.
As well as the high number of parameters, the sources of pollution are also not known
precisely. To make justified inferences about the parameters, this additional source of
uncertainty must be allowed for during the model conditioning. As is often the case, the
available data for sources of pollution are daily means with no supporting quality control
information to indicate the scope for procedural errors. Therefore we initially assume that
all sources are described as uniform distributions with ranges ±30% around the given
mean value. If there is evidence in the data to suggest that this assumption is important
and unreasonable, then this will become evident during model evaluation and sensitivity
analysis.
The in-river water quality data consist of three measurements per determinand per
monitoring section (see Figure 1.5) on both of the monitored days – one of the three
measurements was taken early in the morning, one at mid-day and one in early evening.
Therefore, the variability observed within each set of three will consist of errors (e.g.
sampling errors) and diurnal variations. For this study, it is appropriate to use the mean of
the three measurements to condition the model, as the model is assumed steady-state at a
daily time scale, and is driven by mean daily inputs. The uncertainty in the estimation of
the daily mean water quality is difficult to estimate due to the limited statistical
significance of such small sets of data, and due to unknown procedural biases. The
problem of describing data error structures using typically available data sets is a major
obstacle to objectivity in water quality model uncertainty analysis, and some degree of
simplification and subjectivity is required. For the purpose of this investigation, the
simplifying assumption is made that the uncertainty bounds on the observed mean water
quality are represented by the maximum and minimum daily values - all values between
these bounds are perceived to be equally likely. Exceptions to this rule are when the
concentrations are below the detection limits, in which case the daily mean is taken as
anywhere between zero and the detection limit, which occurs frequently for Cna.
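A sketch of how one realisation of the "true" daily mean might be drawn under these assumptions follows; the function and the example values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_observation(daily_min, daily_max, detection_limit=None):
    """Draw one realisation of the daily mean at a monitoring section:
    uniform between the minimum and maximum of the three within-day
    measurements, or between zero and the detection limit when all
    measurements were censored (as occurs frequently for Cna)."""
    if detection_limit is not None:
        return rng.uniform(0.0, detection_limit)
    return rng.uniform(daily_min, daily_max)

# e.g. three dissolved oxygen readings spanning 7.2 - 8.1 mgO/L:
dox = sample_observation(7.2, 8.1)
# e.g. three Cna readings all below the 0.1 mgN/L detection limit:
cna = sample_observation(None, None, detection_limit=0.1)
```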
Therefore, as well as having a prior estimation of the uncertainty in all the model’s input
variables (24 parameters plus 7 types of pollution for each of 9 point sources plus the
headwater), we have an estimation of the uncertainty in the data (7 determinands at each
of 8 downstream monitoring sections). These data are central to the first of our modelling
tasks – to condition the model using the observations made on the 20th August 1996.
Clearly, there is a very large number of factors to be taken into account in confronting this
task.
The following calibration samples realisations of both the model inputs and the
observations of river water quality, as well as the model parameters. The aim is to
integrate the effects of input and output observation uncertainty into the calibrated
parameter distributions, and to test the relative sensitivity of the objective function to the
perceived uncertainty in the observations. The logic behind the method is that the
objective function is regarded as a function with two uncertain inputs; 1) the output of the
model (itself a function of uncertain parameters and observed pollution load data) and 2)
the observations of water quality. It is then reasonable, within the Monte Carlo analysis,
to sample realisations of both the model output and the observations of water quality for
the calculation of the objective function. It follows that the posterior joint parameter
distribution will be derived by integrating across all sampled values of the observed
pollution load and river water quality data sets. Similarly, posterior joint distributions of
observed pollution load, and of observed river water quality can be derived. ‘Factor’ is
used through the remainder of this chapter to describe all three types of uncertain input to
the objective function.
This approach is in contrast to many previous applications which do not explicitly
represent scope for error in the observed calibration data. Instead, it is usually considered
that the objective function implicitly integrates the data error into the posterior parameter
response surface. This is valid using likelihood functions that are designed explicitly
recognising a data error distribution (this is supported by the experimental and theoretical
comparisons in section 2.4.3), but the usefulness is not clear when subjective measures of
performance are being employed. It is proposed that this new approach will improve
robustness of the estimated parameter uncertainty, and will indicate to what extent the
perceived error in the data is controlling the value of the calibration exercise. The primary
limitation is the simplistic representation of prior distributions of data error, although
arguably this is no more an issue than the assumption of uniform independent priors for
the parameters.
7.2.3 Multi-objective model conditioning
The initial objectives of the model are to adequately replicate the observed concentrations
of the eight determinands observed on the 20th August, 1996. The degrees to which the
eight objectives are met are measured by eight objective functions (OFs), one for each
determinand:
OF1 ≡ replicate the observations of Ca
OF2 ≡ replicate the observations of Cox
OF3 ≡ replicate the observations of Ccf
OF4 ≡ replicate the observations of Cns
OF5 ≡ replicate the observations of Cni
OF6 ≡ replicate the observations of Cna
OF7 ≡ replicate the observations of Cps
OF8 ≡ replicate the observations of Cpo
(7.14a-h)
To condition the model to meet each of these objectives individually, the values of the
OFs are calculated for each of a number of randomly sampled sets of factors α within a
Monte Carlo procedure. This involves running a simulation of the Charles River for each
of the sampled sets and calculating the corresponding OF values for all 8 determinands.
For the kth of Nsam sampled sets of factors αk and the mth determinand, OFk,m is defined as the inverse of the sum of the squared residuals for that determinand,

$$ OF_{k,m} = \left[ \sum_{n=2}^{9} \left( \hat{C} - C \right)_{k,m,n}^{2} \right]^{-1} \qquad k=1,\dots,N_{sam};\; m=1,\dots,8 \qquad (7.15) $$
where $(\hat{C} - C)_{k,m,n}$ is the residual between the observed value $\hat{C}$ of the mth determinand and the corresponding model result $C$ at the nth of the nine monitoring sections (the headwater section, n = 1, is not included), obtained from the kth sampled set of factors.
Therefore, if the model result closely replicates the data then OFk,m will be high and it will
be nearly zero if the result is far from the data. Nsam sets of factor values and
corresponding OF values are obtained. These OF values are multiplied by the prior
probability Lp of α and normalised so that they sum to unity,
$$ L_{k,m} = \frac{L_p \cdot OF_{k,m}}{\sum_{k} \left[ L_p \cdot OF_{k,m} \right]} \qquad k=1,\dots,N_{sam};\; m=1,\dots,8 \qquad (7.16) $$
Then, all Lk,m can be regarded as values of probability mass from a conditioned (posterior)
joint distribution of factors. In this case Lp is constant at 1/Nsam for all k and m, and its
terms in Equation 7.16 cancel each other. However, the analysis will later incorporate
Bayesian updating of a model (i.e. incorporating new data or knowledge into an existing
model using probability theory) whereby the old posterior becomes the new prior (L →
Lp), and in the general case Lp would become an essential term in Equation 7.16.
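In practice the conditioning of Equations 7.15 and 7.16 reduces to a few array operations over the Monte Carlo sample; a minimal sketch, with assumed array shapes and names (not the WaterRAT implementation), is given below. With the uniform prior used here the Lp terms cancel, as noted above.

```python
import numpy as np

def condition(model_results, obs_realisations, L_prior):
    """Posterior probability mass for each sampled factor set and
    determinand (Equations 7.15 and 7.16).

    model_results    : (Nsam, 8, 8) model outputs per sample,
                       determinand and downstream section (n = 2..9)
    obs_realisations : (Nsam, 8, 8) sampled realisations of the observed
                       water quality at the same sections
    L_prior          : (Nsam,) prior probabilities of the factor sets
    """
    ssr = np.sum((obs_realisations - model_results) ** 2, axis=2)
    OF = 1.0 / ssr                              # Equation 7.15
    weighted = L_prior[:, None] * OF
    return weighted / weighted.sum(axis=0)      # Equation 7.16: L_{k,m}
```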
The result of Equation 7.16 defines the conditioned distribution of model parameters,
representing the uncertainty in the model. It should be noted that this uncertainty has been
subjectively defined, because Equation 7.15 is not a likelihood function which is based on
statistical evidence or probability theory. Also, in this study there are eight independently
conditioned models, associated with the eight determinands, which may or may not be
complementary. That is to say, although the concept of the model is founded on the
speculated interactions between these eight determinands, our model identification
criteria neglect this. Evaluation of the uncertainty in model predictions which this may
introduce can be approached using multi-objective processing, and this is discussed later.
7.2.4 Graphical model evaluation
The justification of the model as a tool for sensitivity and scenario analysis depends on
the model structure being a good conceptual representation of the actual water quality
processes under all conditions of interest. To some extent, the structure is known to be
reasonable a priori because the formulations (Equations 7.1 to 7.13) are based on
common knowledge of the principal processes affecting water quality. On the other hand,
some processes known to have the potential to affect water quality are not represented.
For example, sediment-water nutrient interactions are not represented explicitly, but
might become important following nutrient load reduction, and therefore their omission
casts doubt on the model’s reliability for appraising intervention scenarios. Ideally, the
model structure would be evaluated under a wide range of conditions to identify the
importance of these and other “missing” structural components. Additional discussion of
approaches to model structure evaluation is given in Wagener et al. (2001).
The traditional criterion for evaluating a model structure is its ability to achieve a
satisfactory distribution of model residuals (i.e. the residuals between the model result
and the corresponding observed data point) (see Kuczera 1983, Beck 1987). This
approach is rejected here because the observations in this study (and many other
distributed water quality studies) are not of sufficient quality and quantity to allow any
useful statistical inference about the residuals. It is more useful to interpret observations
subjectively alongside the prior knowledge contained in the model result. Arguably, in
cases of unknown data reliability, it may be equally justifiable to invalidate the data on
the basis of an a priori hypothesis of the model structure as vice-versa (see the discussion
of Chatfield (1995) on data-mining).
For this study, the model is evaluated using the Charles River data from the 20th August
and the 8th October 1996. The spatial variations in water quality on these two dates are
stochastically modelled using the conditioned values of probability mass, with each
determinand m modelled using its own OFm (see Equations 7.14 to 7.16). For the 20th
August, the variations modelled along the river have been conditioned on the same day’s
observed data and therefore, excepting serious structural error, would be expected to
explain the data relatively well. For the later day, an additional Monte Carlo analysis is
needed which incorporates the changed boundary conditions - pollution loads, hydraulic
loads, water temperature and light intensity. Graphically comparing the modelled spatial
variations with the in-river data indicates, on a subjective basis, the suitability of the
model structure and the suitability of the definition of uncertainty given by Equations
7.15 and 7.16. That is, this evaluation seeks to answer the question “are the observations
sufficiently described by the modelled confidence limits?” (see Section 7.3.1 for results).
7.2.5 Regional sensitivity analysis
Regional Sensitivity Analysis (RSA) identifies the factors which have significant
probability of influencing the degree of achievement of each of the objectives. It is an
essentially probabilistic sensitivity analysis, as opposed to traditional derivative-based
methods. This means that the measure of regional sensitivity assigned to each factor is not
only dictated by the sensitivity of the model outcome to a unit perturbation in that factor,
but by the relative responses due to all the factors, and due to the relative uncertainties in
all the factors.
The employed RSA procedure is founded on the behavioural analysis described by
Hornberger and Spear (1980) and Spear and Hornberger (1981), together with the
Generalised Likelihood Uncertainty Estimation (GLUE) procedure of Beven and Binley
(1992). The methods of sensitivity analysis are employed in multi-objective mode, using
a similar approach to that described by Bastidas et al. (1999) and Meixner et al. (1999).
The procedure may be summarised as follows. The marginal frequency distribution of
each of the factors for all eight objectives is derived. The marginal distribution function
Pm(γ) indicates the probability that objective m will be met across the range of factor γ,
given the uncertainty in all the other factors. Pm(γ) is derived by totalling the values of L
(for objective m) within each of a number of equally sized bins of γ, which is illustrated
schematically in Figures 7.3(a) and (b). The difference between the cumulative marginal
distribution and the factor’s prior distribution (which is shown as uniform in Figure 7.3)
is summarised by the Kolmogorov-Smirnov statistic, KSm(γ) (see Ang and Tang 1975,
pp. 277-280). The significance of this statistic is illustrated in Figure 7.3(c).
[Figure: (a) scatter-plot of OFi over the range of factor γ; (b) marginal probability distribution of factor γ; (c) conditioned and unconditioned cumulative distributions of factor γ, whose maximum separation is the KS statistic KSi(γ).]
Figure 7.3 Derivation of K-S statistic
Figure 7.3(a) draws attention to the fact that different posterior probabilities are found for
any one value of γ - the probability is not solely a function of γ, but also of the values of
all the other interacting factors. This is why joint and marginal distributions must be
reported rather than uni-variate distributions. High correlation between factors will tend
to diminish their regional sensitivity, and so the results of the RSA should be interpreted
in conjunction with the factor covariance matrix.
Note from Equation 7.15 that different realisations of the in-river data are being used as
part of the sensitivity analysis. This is done to improve robustness of the sensitivity
analysis to data uncertainty, and to indicate the importance of the data uncertainty
compared to that of the a priori parameter uncertainty. For example, referring again to
Equation 7.15, if the OF is shown to be more sensitive to the $\hat{C}$ terms than the $C$ terms
then it may be argued that the model cannot usefully be conditioned given the quality of
the available data (i.e. the task of replicating the observations is too badly defined). The
penalty for sampling the data errors is that there are more interacting factors so that less
information is retrieved about the effect on the posterior marginals of high order factor
interactions. This is an issue of sampling adequacy, which is discussed later.
For this sensitivity analysis, 10000 samples are taken from the prior ranges using Latin
hyper-cube sampling (McKay et al. 1979). Latin hyper-cube sampling ensures optimum
coverage of the individual ranges, and with 10000 samples gives relatively
comprehensive representation of two-, three-, and four-factor interactions, but sampling
of higher level interactions is sparse. In theory, this means that if the overall significance
of a factor is dependent on the simultaneous value of more than three other factors, then
its evaluation will not be reliable. In practice, however, it tends to be only the lower
interactions that affect the results of a sensitivity analysis (Henderson-Sellers and
Henderson-Sellers 1993). Therefore, 10000 samples are assumed to be sufficient. Due to
the semi-random nature of Latin hyper-cube sampling, values of the KS statistic beneath
an arbitrary level will not be significant. While significance levels can easily be
calculated as a function of the number of samples for fully random uni-variate
experiments (see Ang and Tang 1975, pp. 278), this is not valid in the context of Latin
hyper-cube sampling. Also, the KS statistic refers to the difference between the marginal
posterior and prior distributions, but the sampling is from the multivariate prior
distribution. While this does not preclude the KS test, it makes analytical derivation of the
significance level very difficult. To address this issue of identification of meaningful
significance levels for the KS statistic, a number of control factors – factors which are
known not to have any significance – are included. Significant factors can then be
identified as those whose KS statistics are clearly above those of the control factors.
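A sketch of a Latin hypercube sampler with appended control factors is given below; the stratified-permutation construction shown is one common implementation, not necessarily that of McKay et al. (1979) as coded in WaterRAT, and the control-factor count is illustrative.

```python
import numpy as np

def latin_hypercube(n_samples, n_factors, n_controls=3, seed=0):
    """Latin hypercube sample on [0, 1]^(n_factors + n_controls).

    Each factor's range is split into n_samples equal strata; one value
    is drawn per stratum and the strata are independently permuted per
    factor. The control factors are sampled identically but never passed
    to the model, so any apparent sensitivity they show calibrates the
    significance level of the KS statistic."""
    rng = np.random.default_rng(seed)
    total = n_factors + n_controls
    u = (np.arange(n_samples)[:, None]
         + rng.random((n_samples, total))) / n_samples
    for j in range(total):
        rng.shuffle(u[:, j])   # permute the strata of each factor
    return u[:, :n_factors], u[:, n_factors:]

# e.g. 10000 samples of the 24 biochemical parameters (still to be
# rescaled from [0, 1] to the prior ranges of Table 7.1)
factors, controls = latin_hypercube(10000, 24)
```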
Although the KS statistic is a potentially insightful summary of model sensitivity, it can
diminish the importance of local effects, especially at extreme values. Where significant
factors are reported by the KS statistic, or where significant factors are expected but not
reported, the sensitivity associated with that factor can be visualised by the scatter-plot of
the probability mass or the marginal distribution (Figure 7.3). This isolates important
local sensitivities not identifiable from summary statistics.
The ultimate task of the Charles River model is to identify pollution management
scenarios which lead to an acceptable risk of failing to limit phytoplankton concentrations
(Ca) to below 10mgm-3 of chlorophyll-a. It is not safe to assume that the model
components which are important under this new objective are the same as those identified
for the objective of replicating the 1996 data, and another sensitivity analysis is
warranted. The same algorithm is used, but in this case it is known precisely what the
objectives are (<10mgm-3 at each monitoring section), and so there is no need to treat
$\hat{C}_a$ as a randomly sampled variable (although there may be practical cases where
regulatory objectives are not so exact, with the qualitative objectives set by the European
Community’s Water Framework Directive (CEC 2000) being a prime example).
The previously conditioned parameter sets and associated probabilities {α1,k, L1,k : k = 1,…,Nsam} are employed to define the prior distribution for this second sensitivity analysis.
This is based on the premise that model parameters represent the principal physical
components of the system and will not change under future pollution load scenarios (as
opposed to the pollution loads, for which the preferred changes are under investigation).
The probabilities are translated to the prior probabilities for this second sensitivity
analysis,
$$ L_{1,k} \rightarrow L_{p,k} \qquad k=1,\dots,N_{sam} \qquad (7.17) $$
The sampled pollution loads within {α1,k, L1,k : k = 1,…,Nsam}, which previously represented
the perceived range of errors in the historic pollution load data, are overwritten by
random samples from feasible ranges of future pollution reductions. These ranges are
defined as from zero up to the average pollution loads measured in the summer-autumn of
1996 (i.e. from 100% reduction to no substantial change). This provides the basis for
investigating phytoplankton responses to pollution interventions.
Nine objectives are defined – the constraint Ca < 10mgm-3 at each of nine control sections
(Sections A to I in Figure 1.5). The constraints are defined by lower and upper bounds $C_{a_l}$ and $C_{a_u}$ (in this case, 0 and 10mgm-3 respectively), and are imposed on the model results using Boolean objective functions,

$$ OF_{k,m} = \begin{cases} 1 & \text{for } C_{a_l} \le C_a \le C_{a_u} \\ 0 & \text{otherwise} \end{cases} \qquad k=1,\dots,N_{sam};\; m=1,\dots,9 \qquad (7.18) $$
Equation 7.16 is then applied and Pm(γ) and KSm(γ) are derived as before (although this
time the prior marginals of the parameters are not uniform), and factors which will dictate
our ability to achieve the chlorophyll-a management objectives can be speculated upon.
7.2.6 Risk-based appraisal of intervention strategies
Although the results of the second sensitivity analysis will indicate the pollution sources
most affecting our ability to achieve the objective of eutrophication reduction, a useful
question to ask is “on the evidence of the model (and given all the uncertainties involved)
what is the probability that a specified intervention will produce the desired effect?”. This
question can be answered by further processing of the results of the RSA. Following
application of Equations 7.18 then 7.16, and derivation of the factor’s marginal
distribution as in Figure 7.3, Pm(γ) is the probability of factor value γ given that Ca <
10mgm-3 at the mth control section. To derive the probability that Ca < 10mgm-3 given a
value of γ is a simple application of Bayes Theorem,
( ) ( ) ( )( )γ
γγ
p
mamma L
CPPCP
10.10
<=< m=1,9 (7.19)
where $L_p(\gamma)$ is the prior probability of γ and $P_m(C_a < 10)$ is the overall probability of success at control section m,

$$ P_m\left(C_a < 10\right) = \sum_{k} OF_{k,m}\, L_{p,k} \qquad k=1,\dots,N_{sam};\; m=1,\dots,9 \qquad (7.20) $$
$P_m(C_a < 10 \mid \gamma)$ is not conditional on the value of any factor other than γ, and will reflect
the risk that the uncertainty in the other factors will cause non-achievement of the target
across the range of γ. For example, the risk of not achieving the target water quality at
each control section can be plotted against the degree of a proposed pollution load
intervention. This uni-variate report of risk could be extended into a bi-variate plot,
although more realisations may be required to identify a usefully smooth risk surface. For
this study, each potentially important pollution load intervention is analysed individually.
The other pollution loads, as well as the model parameters, are kept as conditioned by the
August 1996 observations.
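For illustration, Equations 7.19 and 7.20 can be evaluated by binning the Monte Carlo sample on the intervention factor; the following is a minimal sketch with assumed array names, not the WaterRAT implementation.

```python
import numpy as np

def risk_curve(gamma, OF_m, L_prior, n_bins=10):
    """Probability of meeting the target (Ca < 10 mg m-3) at control
    section m conditional on intervention factor gamma (Equations 7.19
    and 7.20), estimated by binning the sample on gamma.

    gamma   : (Nsam,) sampled intervention values, scaled to [0, 1]
    OF_m    : (Nsam,) Boolean objective of Equation 7.18 (1 = target met)
    L_prior : (Nsam,) prior probability of each sample
    """
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(gamma, bins) - 1, 0, n_bins - 1)
    p_success = np.zeros(n_bins)
    for b in range(n_bins):
        in_bin = idx == b
        Lp_bin = L_prior[in_bin].sum()               # L_p(gamma) in the bin
        # P_m(gamma) * P_m(Ca < 10) = mass in the bin where the target is met
        joint = (L_prior[in_bin] * OF_m[in_bin]).sum()
        p_success[b] = joint / Lp_bin if Lp_bin > 0 else np.nan
    return bins, p_success   # the risk of failure is 1 - p_success
```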
7.3 Results and discussion
7.3.1 Preliminary model evaluation
The modelled spatial variation and 90% confidence limits of all 8 determinands for the
20th August are shown in Figures 7.4(a) to 7.4(h), together with the data of the observed
water quality and their error bounds. It is seen that the model, with isolated exceptions, is
successfully representing spatial variations in water quality on that date.
[Figure: eight panels of modelled and observed spatial variation on 20/8/96 against distance downstream of the headwater (0 to 37.5 km): (a) phytoplankton Ca (mgChl-a/L); (b) dissolved oxygen Cox (mgO/L); (c) biochemical oxygen demand Ccf (mgO/L); (d) organic nitrogen Cns (mgN/L); (e) nitrate Cni (mgN/L); (f) ammonia Cna (mgN/L, marked *); (g) organic phosphorus Cps (mgP/L); (h) inorganic phosphorus Cpo (mgP/L).]
Figure 7.4 Modelled and observed spatial variations in water quality 20th August 1996 (mean value and 90% confidence limits shown). *error bounds on Cna data signify that all observations were below the detection limit of 0.1mgN/L.
The estimated confidence limits for the phytoplankton concentration Ca are interesting, as
they are not constrained by the observations as much as might be expected relative to the
other determinands. In particular, the lower confidence limit at the downstream reach
diverges from the observed data. It is speculated that this is due to the high order nature
of the phytoplankton model (evident in Equations 7.1 to 7.3), together with the non-
discriminating nature of the phytoplankton OF defined by Equation 7.15. For example, a
high numerical order would mean that the Ca result could be a good replication of the
observed data until the downstream stretch when it could swing rapidly to a poor
replication. As the OF defined by Equation 7.15 is aggregated over all of the monitored
sections (except the headwater), such a result would be given significant probability of
occurrence. Hence the lower confidence limit in Figure 7.4(a) is not reflecting the
observed data at monitoring section (I). This is a case where, arguably, the discrimination
between alternative sets of factors is not high enough, and Equation 7.15 should be re-
designed so that the model uncertainty is reduced, and the confidence limits are narrower.
On the other hand, more discrimination would mean that the estimation of uncertainty is
less robust to structural error, data bias, and inadequate sampling of the prior ranges of
factors (the value of Nsam). That is, narrower confidence limits are less likely to explain
the effects of these sources of error. For example, as one point of data on Figure 7.4(g)
and another on 7.4(h) are not explained by the combined estimates of model and data
error, it may be argued that the confidence limits on the Cps and Cpo models are not wide
enough. Essentially, this issue needs to be addressed using the judgement of the modeller
– to achieve a balance between robustness (presumed higher model uncertainty), and
precision (presumed lower uncertainty) which is appropriate to,
• the modeller’s judgement of data precision and reliability,
• the modeller’s judgement of model structure validity,
• the answers the modeller requires from the model, and
• the available modelling resources.
As the observed data in Figure 7.4 were used to condition the model, this result does not
demonstrate the model’s predictive capability. To attempt to do so, the conditioned model
is applied to prediction of the water quality on the 8th October, and the results are shown
in Figures 7.5(a) to 7.5(h). The confidence limits are generally wider than in Figure 7.4, indicating that the principal processes affecting water quality have changed from August to October, so that the result is more dependent on the poorly identified model
parameters. This is clear for Ca, for which the upper confidence limit diverges greatly
from the observed data. Clearly, we are not able to accurately predict chlorophyll-a
concentrations under the October conditions, given the information retrieved from the
conditioning. Notwithstanding the lack of precision, we are able to predict with
confidence that the Ca values on the 8th October are less than the target of 10mgm-3.
[Figure: eight panels as in Figure 7.4, (a) phytoplankton Ca, (b) dissolved oxygen Cox, (c) biochemical oxygen demand Ccf, (d) organic nitrogen Cns, (e) nitrate Cni, (f) ammonia Cna, (g) organic phosphorus Cps and (h) inorganic phosphorus Cpo, each plotted against distance downstream of headwater (km).]
Figure 7.5 Modelled and observed spatial variations in water quality, 8th October 1996 (mean value and 90% confidence limits shown). *Error bounds on Cna data signify that all observations were below the detection limit of 0.1mgN/L.
The purpose of Figures 7.4 and 7.5 is not to evaluate the model structure alone, but to
jointly evaluate the structure, the GLUE likelihood measure defined by Equation 7.15 and
the data. The inseparability of these three facets of model evaluation, which is endemic in
modelling water quality using sparse and/or unreliable data, must be approached using
subjectively orientated visualisation.
The approach was taken that the model should be conditioned individually for each of the
eight determinands, so that there were eight joint distributions of model parameters. This
is justified because it maximises the information retrieved about the sensitivities of the
determinands, as individual entities, to all the factors. However, arguably such an
approach fails to rigorously estimate the uncertainty in the model because, for every determinand, the information contained in the data of the other determinands is neglected.
In fact, if the model had been required to explain the variations in all the determinands
simultaneously as a function of one joint distribution of factors, the confidence limits
would have inevitably been significantly wider. A related observation was made with
respect to the Ca spatial variation in Figure 7.4. In that case, the model was expected to
explain the variations in Ca at the upstream sections simultaneously with explaining those
downstream, leading to confidence limits which were not intuitive from the observations.
This prompts a discussion of how to use multiple objectives to robustly represent model
uncertainty, and to diagnose why and in what respects the model is failing to achieve all
its objectives simultaneously. Such discussion is not pursued here, but the reader is
referred, for in-depth discussion, to Gupta et al. (1998) and Wagener et al. (2001). For
current purposes, it is proposed that the conditioned model sufficiently explains the
spatial variation and error in the observed data, and that, in the context of limited
resources and high uncertainty, the model conditioned by the chlorophyll-a observations
may usefully be applied to the remaining tasks of this study.
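To make the mechanics of this conditioning concrete, the following is a minimal sketch (in Python, purely illustrative; it reproduces neither WaterRAT's implementation nor the likelihood of Equation 7.15) of GLUE-type conditioning of a single determinand: parameter sets are drawn from uniform priors, each simulation is assigned a likelihood weight, and weighted percentiles of the simulated concentrations give confidence limits of the kind plotted in Figures 7.4 and 7.5. The model function and the exponential-of-SSE likelihood are assumptions for illustration only.

```python
import numpy as np

def glue_condition(model, priors, obs, obs_sigma, n_sam=10000, seed=1):
    """GLUE-type conditioning sketch: uniform sampling of the priors,
    likelihood weighting, and weighted 5th/95th percentiles (90% limits).
    `model(theta)` is a placeholder returning simulated concentrations at
    the monitored sections; the likelihood form is not Equation 7.15."""
    rng = np.random.default_rng(seed)
    lo, hi = priors[:, 0], priors[:, 1]
    thetas = rng.uniform(lo, hi, size=(n_sam, len(lo)))   # prior samples
    sims = np.array([model(t) for t in thetas])           # n_sam x n_sections
    sse = np.sum(((sims - obs) / obs_sigma) ** 2, axis=1)
    weights = np.exp(-0.5 * (sse - sse.min()))            # avoid underflow
    weights /= weights.sum()                              # posterior weights
    limits = []
    for j in range(sims.shape[1]):                        # weighted percentiles
        order = np.argsort(sims[:, j])
        cw = np.cumsum(weights[order])
        limits.append((sims[order, j][np.searchsorted(cw, 0.05)],
                       sims[order, j][np.searchsorted(cw, 0.95)]))
    return thetas, weights, np.array(limits)
```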
7.3.2 Sensitivity analysis (1996 conditions)
Rather than tabulating the values of the KS statistic, the model sensitivities are reported
graphically, in Figure 7.6. Within this figure, there are six graphs which report the model
sensitivities measured as described in Section 7.2.5 using the Cna, Cni, Ca, Cpo, Cox and Ccf
data (only the Cns and Cps results are not illustrated). The common x-axis of these graphs
is the series of factors, comprising model parameters, point loads and observed in-river
data. On the y-axes, the value of the KS statistic corresponding to each of these factors is
plotted, and these points are joined to give a trajectory, the significantly high peaks of
which indicate significant factors. The significance level, which is shown on Figure 7.6 as
the horizontal dashed line, is defined by the maximum KS statistic identified for the
control factors (factors to which the specific OFs are known to be independent). For
clarity, only the evidently significant factors are labelled.
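The computation behind such a plot can be sketched as follows (a hypothetical Python illustration, not the thesis code): the Monte Carlo sample is split into behavioural and non-behavioural sets on the objective function, and the two-sample Kolmogorov-Smirnov statistic is computed for each factor's marginal distribution; a dummy control factor, known to be independent of the objective, sets the significance level.

```python
import numpy as np
from scipy.stats import ks_2samp

def regional_sensitivity(factors, of_values, threshold, names, seed=0):
    """Regional sensitivity analysis sketch: split the sample into
    behavioural / non-behavioural sets on the objective function
    (assumption: lower OF = behavioural), then compute the two-sample KS
    statistic for each factor's marginal distribution. A dummy control
    factor sets the significance level (the dashed line in Figure 7.6)."""
    rng = np.random.default_rng(seed)
    control = rng.uniform(size=len(of_values))     # independent dummy factor
    factors = np.column_stack([factors, control])
    names = list(names) + ["control"]
    behav = of_values <= threshold
    ks = {}
    for j, name in enumerate(names):
        stat, _ = ks_2samp(factors[behav, j], factors[~behav, j])
        ks[name] = stat
    return ks          # factors with KS well above ks["control"] are significant
```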
In Figure 7.6, the mean observed water quality for determinand m at the nth control section (from Figure 1.5) is signified by dn,m. For example, d2,na signifies the mean observation of ammoniacal nitrogen at the second control section, i.e. downstream of Mill River, whereas
d8,a signifies the mean observation of chlorophyll-a at the section upstream of the Trout
Brook confluence. Similarly, the point load of determinant m at the nth point source (from
Figure 1.5) is signified by wn,m. The special case of n = 0 signifies the point load from the
headwater.
[Figure: six panels of the KS statistic plotted against the factors (model parameters, pollution sources, data error): OF1 (phytoplankton, Ca), OF2 (dissolved oxygen, Cox), OF3 (biochemical oxygen demand, Ccf), OF5 (nitrate, Cni), OF6 (ammonia, Cna) and OF8 (orthophosphate, Cpo).]
Figure 7.6 Value of the Kolmogorov-Smirnov statistic across the model factors for objectives of replicating August 1996 observed data. The common x-axis of these six graphs is the series of factors, comprising model parameters, point loads and observed in-river data. The significance level of the KS statistic is shown as the horizontal dashed line. For clarity, only the evidently significant factors are labelled.
It is seen that the Cna objective has been affected largely by the data uncertainty (perhaps
predictable because ammonium was below the detection limit for all but one of the
monitored sections). Lack of more precise measurements of Cna has restricted the
information that can be retrieved about lesser factors such as the model parameters.
Nevertheless, four parameters (khn20, kon20, kda20, and kgahsn) are identified as significantly
affecting the value of the Cna objective function. Considering the role of these four
parameters in the model concept (Figure 7.1), all three arrows leading to or from the Cna
box may be considered “active” components of the model. There is no evidence,
therefore, upon which to reduce the complexity of the Cna representation. Returning to
Figure 7.6, three point sources (organic nitrogen from the headwater, ammoniacal
nitrogen from the CRPCD wastewater treatment plant and ammoniacal nitrogen from the
Medfield wastewater treatment plant) are clearly most affecting the modelled Cna
pollution under the conditions of the 20th August, 1996.
Figure 7.6 suggests that the objective of replicating the Cox data is dominated by parameters (kga20, kda20, roa, koc20, khc20, kon20 and ker). Thus, with the exception of the sediment oxygen demand component, all the model components affecting Cox are implied to be "active", and there is no justification contained in Figure 7.6 for removing them from
the model. That Cox should be largely affected by phytoplankton dynamics is somewhat
predictable (at least for a water quality expert) from general knowledge of the issues
affecting the Charles river (e.g. CRWA 2000). However, that the uncertainty in the
phytoplankton’s chemical composition (parameter roa) may be more important to our
success in modelling Cox (and hence the broader ecological status of the river) than, for
example, improved knowledge of the organic pollution loads, is an example of the insight
which the sensitivity analysis can offer.
The result for Ccf indicates the dependency of Ccf on nitrogen and phosphorus
concentrations, which can only be caused by the role of Ca in the model of the carbon,
nitrogen and phosphorus cycles. Why, then, is Ccf significantly influenced by the 3rd point source of phosphates (w3,po) when Ca is not? This is because there
is some information contained in the Ccf data which has improved the chance of w3,po
standing out as an individual factor, and apparently less such information in the Ca data.
This demonstrates that no inferences should be made about individual factors without
taking into account all the available evidence, and that the most informative data are not
always where they might be expected.
177
If the current objectives were limited to the replication of the nitrate data, Figure 7.6
indicates that barely two parameters would be justified, and only one point source
(CRPCD wastewater treatment plant) would be significant in effect. The sensitivity of the
Cni objective to d2,ni, the Cni data at monitoring section 2 (shortly downstream of Mill
River), is because this point of data is the most uncertain of all the Cni data. This is
evident from Figure 7.4. The significance of the point source from the CRPCD
wastewater treatment plant (w1,ni) is predictable because this is such an intense source of
nitrate (six times the average from the other point sources). It should be noted that a
perceived uncertainty of ±30% rather than a statistically identified distribution was
applied for this source, and the value of the KS statistic depends on this assumption.
However, inspection of the scatter plot of posterior probability against w1,ni, Figure 7.7,
implies significant responses even for example in the ±5% bracket.
[Figure: scatter-plot; x-axis: point source 1 (CRPCD WWTW nitrate % deviation from measured average, -30% to +30%); y-axis: likelihood.]
Figure 7.7 Scatter-plot of posterior probability for the error in nitrate load from CRPCD
Finally, from Figure 7.6, it is seen that the data uncertainty at the eighth analysed
monitoring section (section I in Figure 1.5) is repeatedly implied to significantly affect
the ability of the model to meet the objectives. In other words, the "goalposts" are being
moved too much, through sampling of C in Equation 7.15. This suggests that the
uncertainty in the data at that monitoring section is limiting the information retrievable
about the other factors, and that this section is a priority for more data collection and/or
more precise measurement techniques. This suggestion may equally well be applied to the
data shortcomings previously identified specifically for the Cna and Cni models. Thus, it is
proposed that application of regional sensitivity analysis in the described manner has
potential value in designing and updating field monitoring programmes, and in the
objective management of sources of data error. Clearly, significant investment would not
be justified solely on the basis of a preliminary analysis, like that presented here. Instead,
for example, the analysis could be repeated to incorporate hypothesised reductions in
sampling and measurement error, to evaluate associated reductions in model uncertainty
and improvements in decision-making.
7.3.3 Sensitivity analysis (eutrophication reduction)
Figure 7.8 shows the superimposed trajectories of the KS statistic for each of the
eutrophication reduction objectives (keeping chlorophyll-a concentrations below 10mgm-3 at each of the nine control sections, A-I). The overwhelming
implication of the analysis is that the concentration of chlorophyll-a in the headwater w0,a
is mainly responsible for the occurrences of failing to achieve the objective. For the
sections further down the river, the growth and death rates of phytoplankton become
more significant, as the phytoplankton have had more time to grow, or to die.
[Figure: KS statistic plotted against the factors (model parameters and pollution sources); labelled factors include kga20, kda20, khp20, khn20, kdn20, vsn, vsp, rpa, kgahsi, kgahsp, w0,a, w1,po, w3,po and w4,po.]
Figure 7.8 Plot of Kolmogorov-Smirnov statistic across the model factors for objectives of reducing eutrophication at each of nine control sections
Various other phytoplankton parameters are implied to be relevant, so that the uncertainty
in the phytoplankton properties is a source of risk which may lead to nutrient reduction
interventions being ineffective. For example, this suggests that it is important to know the
dominant species of phytoplankton (e.g. blue-greens, diatoms, etc) in identifying low-risk
interventions for eutrophication management, as species are known to have individual
maximum growth rates, carbon to chlorophyll-a ratios, etc. Both the phytoplankton
phosphorus and light half-saturation constants (kgahsp and kgahsl) are included in the
significant parameters, implying that there is insufficient evidence to predict whether the
system would be phosphorus or light limited, although there is evidence that it would not
be nitrogen limited. The evidence for phosphorus limitation is corroborated by the
observation that three point sources of phosphorus have significant KS values - at the
CRPCD wastewater treatment plant, at the Stop River and at the Medfield treatment
plant.
7.3.4 Appraisal of intervention strategies
Before appraising options for reduction of summer eutrophication, it is worth noting from
Figure 7.4(a) that a "do nothing" strategy has little chance of working, assuming that 1996 conditions are typical. It is seen that all the observed chlorophyll-a concentrations are well above the target chlorophyll-a concentration of 10mgm-3, and the lower limit of the model's 90% confidence band just dips below the target at the downstream end of the studied reach (although, as previously argued, this limit is an outcome of the
limitations of the GLUE likelihood measure, and may be overly optimistic).
Firstly, the risk associated with reducing the phosphate load from the CRPCD treatment
plant was investigated. Even with 100% reduction in this load, results indicated
insignificant probability of achieving the target at any of the control sections. Further to
this, the effect of reducing the total phosphorus load from the CRPCD treatment plant
was investigated (the organic fraction of the total phosphorus load was lumped into the
inorganic fraction). Again, this was ineffectual. The results are strong evidence that
phosphorus stripping at the CRPCD treatment plant alone is unlikely to adequately
control eutrophication. The residual phosphorus in the river is estimated as adequate to
sustain undesirable phytoplankton growth under almost all feasible summer conditions.
Additional or alternative measures are required. These results also emphasise the lack of
detail given by the sensitivity analysis results in Figure 7.8. Figure 7.8 indicates the
factors for which the successful factor values are significantly different from the
unsuccessful values, not the level of success.
[Figure: probability of achieving objective (0 to 1) plotted against reduction (-100% to 0%), one curve per control section (A) to (I).]
Figure 7.9 Risk associated with reduction in chlorophyll-a in headwater. (A) = probability of achieving objective at control section (A); (B) = probability of achieving objective at control section (B); etc.
The third scenario involved leaving all phosphorus loads at their 1996 levels, and
investigating the effect of reducing the concentration of chlorophyll-a in the headwater.
The results are presented in Figure 7.9 for the sections (A) to (I) indicated on Figure 1.5.
These suggest, for example, that reducing the headwater Ca by 60% is guaranteed to satisfactorily reduce Ca at control section (A) (although this is a trivial result, as section (A) is the headwater); has an 80% chance of success at control section (B); a 30% chance at (C); a 5% chance at (D); and no chance at any of the further downstream sections.
Alternatively, 100% reduction in the headwater Ca is more or less guaranteed to be
effectual upstream of South Natick Dam, and even has a 30% chance of success at the
Cochrane dam.
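A sketch of the computation behind such risk curves (hypothetical Python, with a placeholder model function; not the WaterRAT implementation) is: for each trial reduction, re-run the conditioned, weighted ensemble and report the posterior-weighted fraction of realisations in which Ca meets the target at each control section.

```python
import numpy as np

def risk_curves(model, thetas, weights, reductions, target=0.010):
    """Scenario-analysis sketch behind Figure 7.9: for each trial reduction
    of the headwater chlorophyll-a, re-run the weighted ensemble and report
    the posterior-weighted probability that Ca is at or below the target at
    each control section. `model(theta, reduction)` is a placeholder that
    returns Ca (mg/L) at the control sections; 10mgm-3 = 0.010 mg/L."""
    prob = {}
    for r in reductions:                               # e.g. 0.0, -0.25, ... -1.0
        ca = np.array([model(t, r) for t in thetas])   # n_sam x n_sections
        prob[r] = weights @ (ca <= target).astype(float)  # success probability
    return prob
```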
For the fourth scenario, the concentration of Ca in the headwater was fixed at 50% of the
average value observed in the summer of 1996, and the risk associated with reducing the
CRPCD total phosphorus concentration was investigated. The risk associated with
reducing the CRPCD total phosphorus load under these new conditions is plotted in
Figure 7.10. Firstly, it is noted that the Ca is almost bound to be below the target at
section (A) which is consistent with the result in Figure 7.9. In general, Figure 7.10 shows
that even after reducing Ca in the headwater, phosphorus stripping at the CRPCD plant is,
by itself, a high-risk intervention. For example, 95% reduction in phosphorus loading is
suggested as having a 40% chance of adequately reducing chlorophyll-a at section (D), a
25% chance at section (E) and no chance of success further downstream, following the
influences of Medfield WWTP, Sewall Brook and Bogastow Brook. However, this
intervention is likely to be effective at sections (B) and (C). Clearly, the specific objective
of the planners is crucial in this case.
This demonstration has highlighted the considerable degree of uncertainty, or risk,
attached to any water quality intervention. Importantly, the derived risk should not be
interpreted as the expected frequency of failure. This would imply that the model
parameters behave randomly as described by their conditioned joint distribution when, in
fact, we simply do not know what their statistical properties or time-variance should be.
Although predicted fluctuations in the water quality can be allowed for in the measure of
risk (for example by using a dynamic simulation or, as done here, by treating diurnal
variations as random effects), a large part of the risk stems from the low reliability of
model predictions. Therefore, interventions which can be modelled with relative precision
- which are not affected by highly uncertain components of the model - will be identified
as preferable. For example, the reduction in chlorophyll-a concentration which would be
achieved by doubling the flow in the Charles River would be identified as a low risk
intervention, as it does not rely on highly uncertain biochemical properties. However,
given that much of the risk comes from the model’s limitations, it is sensible and
probably economical to explore methods of improving the reliability of the model, rather
than opt for a low-risk intervention.
[Figure: probability of achieving objective (0 to 1) plotted against reduction (-100% to 0%), one curve per control section (A) to (E).]
Figure 7.10 Risk associated with reduction in phosphorus load from CRPCD. (A) = probability of achieving objective at control section (A); (B) = probability of achieving objective at control section (B); etc.
A primary motivation for risk-based modelling is that management decisions should not
be restricted to those which prioritise water quality interventions. A judicious decision
may recognise that a sufficiently low-risk intervention is unattainable, and instead call for
model improvements, review of water quality targets or further data collection. Further
data collection has been identified as a priority for this investigation of the Charles River,
due to the restrictions which the employed data (and their assumed errors) imposed upon
model reliability. By implication, the reliability of any decisions about preferred
interventions is degraded, and this is represented in the high risk levels reported in
Figures 7.9 and 7.10. A useful extension to this investigation, therefore, would be to
evaluate how the attainable quality of decision (reduction in risk) would respond to
proposed investments in data collection. This could simply involve repeating the above
conditioning, sensitivity and risk analyses with respecified data errors – in particular,
Figure 7.8 indicates that reducing errors associated with the headwater chlorophyll-a may significantly improve the reliability of the decisions which need to be
made.
7.4 Summary
A Monte Carlo-based framework of sensitivity analysis and risk evaluation has the
capacity to support risk-based management of surface water quality through,
• calculating the response of model outputs to uncertain or stochastic model inputs
(boundary conditions, pollution loads and model parameters), allowing the
distributions of model outputs to be reported at all points within the model’s
space and time domains,
• calculating the response required of model inputs to meet constraints imposed
upon the model output, allowing the posterior distributions of model inputs to be
reported. The imposed constraints are either defined by observed water quality
(with the purpose of conditioning the model, and exploring the sensitivities of the
system), or by prescribed water quality targets (with the purpose of appraising
intervention scenarios, and identifying realistic targets).
This has been demonstrated in this chapter using the upper Charles River, Massachusetts
as a case study. A model structure has been selected based on a prior hypothesis of the
principal water quality processes. The model has been conditioned using observations of water quality made on 20th August 1996 at a number of control sections along the
river. The model was conditioned with respect to each of eight objectives, one for each of
the eight key water quality determinands, and was evaluated through visualisation of the
spatial variation of modelled water quality and observations (Figures 7.4 and 7.5). The
superposition of confidence limits (estimated uncertainty in the model) and error bars
(estimated uncertainty in the observed data) allowed informed judgement upon the
relative reliabilities of the model structure and the measured data.
In pursuit of rigorous evaluation of risk, all uncertain factors potentially affecting the
achievement of water quality objectives should be included in the sensitivity analyses.
This may include model parameters, pollution and hydraulic sources, boundary
conditions, and perhaps the water quality objectives themselves. It was shown (Figure
7.6) that treating the water quality objectives as random variables allows the relative
importance of data error to be indicated. This feature also has potential for integrating the
effects of uncertain regulations into the appraisal of scenarios (e.g. which scenario is
safest given qualitative or otherwise uncertain water quality criteria?), and in offsetting
different targets against the cost of conforming (i.e. what are viable targets?).
Figures 7.6 and 7.8 are examples of how the factors significantly affecting water quality
are identified using the Kolmogorov-Smirnov statistic to summarise the results of a
regional sensitivity analysis. These results should be seen as indicators of the factors most
likely to be affecting the respective objectives given the knowledge embodied in the
model structure and parameter ranges. It is a useful approach to screening the model for
unexpected sensitivities, and for identifying factors to be taken forward to more detailed
analysis. For example, Figure 7.8 clearly identified that reducing the headwater Ca and the
Cpo load from the CRPCD treatment plant were priorities for further investigation –
because there is indication that the former might be an effective way forward, and that the
latter might be less so. Further investigation resulted in Figures 7.9 and 7.10 which
confirmed these suspicions in terms of risk of failure associated with various magnitudes
of intervention.
The overriding and unavoidable limitation to the methods employed here, at least in
applications where data are typically sparse and imprecise, is that results are partly
subjective. This is because model evaluation must be based on judgement of the relative
unbiasedness of the model structure and the observed data. Tools such as WaterRAT
(McIntyre and Zeng 2002) are needed to support the modeller in these judgements, and
allow justifiable use of results in a decision support role, within the practical constraints
of data and modelling resources.
8. Conclusions
The path that this dissertation has followed so far may be summarised as: establishment
of motivation (Chapter 1); review of previous models and methodologies (Chapters 2 and
3); description of the developed tool (Chapter 4); insight into numerical issues
encountered in the development (Chapter 5); a priori exploration of model identification
issues (Chapter 6); and application to a management problem (Chapter 7). Each chapter,
especially the latter three, has raised issues relating to the Thesis which need further
discussion, and Chapter 8 picks up these issues, expands upon them and integrates them
into a set of conclusions. Additional discussions of the practical obstructions encountered
in the Hun River modelling study, and of current trends in the field of water quality
modelling in the UK are also included.
8.1 Summary
The principal achievements of this Thesis have been;
• A statement of the increasing motivation for risk-based modelling of water
quality and continuing relevance of model uncertainty analysis.
• Instructive inter-comparison between uncertainty estimation methods.
• Provision of software for model uncertainty analysis and risk-based decision-
support.
• Identification and exploration of numerical issues that control formal risk
evaluation, with limited guidance on how to deal with these issues. This included
the relevance of numerical approximations, field experiment design, model
structural errors, boundary and initial condition errors, and calibration data
information content.
• A framework for establishing links between model uncertainty, data quality and
decision-making risk.
• An agenda for further research and development needed to cope with persisting
and emerging problems (this chapter).
Thus, this dissertation may be regarded as a valuable resource for any modeller charged
with estimating, reporting and reducing uncertainty in model predictions, and for anybody
wishing to rigorously evaluate sources of risk to water quality status and in associated
decisions.
8.2 Review of GLUE
Arguably GLUE is now the most used and best recognised approach to the evaluation of
model uncertainty in the field of hydrological modelling. This may be because of the
relatively simple concepts and theory upon which it is based compared, for example, to
Markov chain Monte Carlo and recursive parameter estimation methods. It is also due
partly to the relative ease with which GLUE can be implemented using plain Monte Carlo
sampling, and extended to regional sensitivity analysis and Bayesian analysis as
presented in Chapter 7.
The GLUE methodology is a framework within which Bayesian methods are applied to
parameter and model output uncertainty estimation, and regional sensitivity analysis. It
may also be seen as an extension of the set-based method introduced by Hornberger and
Spear (1980) and Spear and Hornberger (1980), introducing Bayesian or fuzzy
discrimination between behavioural models and promoting scope for Bayesian updating
of models. The originality of GLUE, therefore, is not in providing a new analytical
method, but in providing an analytical methodology that, rather than promoting specific
measures of uncertainty, requires modellers to define uncertainty according to their
perceptions of the model and data errors (as introduced in Chapter 1), and the proposed
use of the model. Thus, the role of the modeller’s judgement is recognised, and his
subjective input is objectively (i.e. numerically) expressed so that it is open to audit
(Beven and Freer 2001). This is consistent with the view expressed, in the wider context
of Bayesian modelling, by West and Harrison (1997) – that likelihood functions have a
subjective foundation for which modellers should be accountable.
It may be said that the research reported within this dissertation, although limited in a
variety of regards, is one of the more extensive explorations of the application of GLUE to
water quality modelling. This included the introductory demonstrations of Chapter 2, the
integration into WaterRAT in Chapter 4, identification of the need for a GLUE-type
approach in Chapter 5, the estimation of confidence limits for river water temperature in
Chapter 6 and the sensitivity analysis and extension to risk analysis in Chapter 7.
The primary difficulty with using GLUE is justifying the measure of likelihood. Beck
(1987) points out that set-based method of Hornberger and Spear (1980), which he calls
the HSY method, may be preferred for this reason, simply because the origin of the
estimated model uncertainty can be relatively intuitive. Early in this dissertation, it was
suggested that we are only warranted in investing resources in uncertainty estimation to
the degree that it could affect management decisions. Given the innumerable contributing
political, management, and economic factors affecting decisions, whether discrimination
within the behavioural models is influential in practice, and worth the extra complexity
and degree of subjective input that GLUE may introduce, is open to argument and study.
On the other hand, it may be argued that it is intuitive to give more weight to the
behavioural models that appear to better fit the data, and that not doing so is neglecting
relevant information.
8.3 Prior and posterior knowledge
Following from the arguments about the value of GLUE, the question must be raised “in
what cases is it a waste of time attempting to condition the model?”, irrespective of the
conditioning algorithm used. For example, this can easily be argued when the information
content in the data is too poor and/or the uncertainty in the model boundary and initial
conditions is too high to allow that information to be extracted. A considerable amount of
this research has focussed on model conditioning – yet in the motivating case study of the
Hun River, reducing uncertainty through conditioning proved to be futile (see Section
8.9). Even with the relatively reliable data used in the Charles River study, the effect of
model conditioning was limited by the perceived scope for error in the loading and in-
river data. An interesting extension to Chapter 7 would be to investigate the effects that the conditioning stage has on the report to the decision-maker (e.g. Figures 7.9 and 7.10), given the data limitations. While model conditioning is arguably important in
data-rich studies (although, even in such cases, the effect on ultimate decisions appears
not to have been demonstrated in the literature) this study has raised the question of its
importance in studies with severe or typical data-limitations.
Notwithstanding the popularity of QUAL2E-UNCAS and SIMCAT, one of the general
challenges for proponents of the Monte Carlo method remains achieving its widespread
application in ‘mainstream’ water quality modelling. If the data are sufficient, model
conditioning means reduced risk in decision-making, but is a stage that inevitably adds to
the perceived complexity of the Monte Carlo analysis, and draws on resources of the
investigators. Arguably, therefore, model conditioning is a secondary challenge to
achieving recognition and formal incorporation of a priori uncertainty.
Given that in some studies, due to data limitations and/or the philosophy or resources of
the modeller, the uncertainty analysis will be limited to propagation of a priori model
uncertainty, what degree of model complexity is justified? Should we be reverting to the
most basic a priori knowledge of mass balance (in the Hun River study even this proved
difficult – see Section 8.9), or should we be maximising the prior knowledge contained in
the model? One argument is that, if a process is known to be potentially significant, and
the prior knowledge and computational resources are available, then there is no reason
not to include it; while the opposing view is that the inclusion of processes that cannot be
identified from the data means unwarranted assumptions and complexity potentially
leading to unjustified modelling effort and ill-founded conclusions, and increases the
danger of ‘over-fitting’ noisy data. Reducing model complexity to suit the data makes the
model more specific to the conditions of calibration, and so the reliability of extrapolation
is more questionable, while including freedom which is redundant under conditions of
calibration generally leads to equifinality and higher parameter uncertainty. While, then,
it can be argued that a combination of a complex model and a Monte Carlo uncertainty
analysis which thoroughly explores the a priori parameter space is the preferred solution,
the human and computational effort required to do so successfully is unlikely to be
available. In particular, if the sampling error associated with the Monte Carlo method,
which is inevitably more significant when applied to models with a large number of
uncertain parameters, becomes the prevalent limitation to the justification of results, the
object of the uncertainty analysis is defeated, and the modeller must take the
responsibility of avoiding this.
The reliability of model conditioning and/or uncertainty propagation using Monte Carlo
analysis depends upon the validity of the model structure, its numerical implementation,
the representation of data uncertainty, the adequacy of the number of Monte Carlo
samples of model inputs, and the likelihood measure that was used to condition the
model. The modeller is faced with a difficult and important task in balancing the effort
directed at researching and resolving each of these issues, and decisions about the model
structure (either towards making it more complex or more simple) should consider the
relative importance of the other contributing factors, as well as resource constraints and
the actual answers that are sought from the model output. This dissertation has proposed a
framework and associated modelling tool through which this task can be approached.
8.4 Monte Carlo sampling
The comparisons of GLUE with MCMCs and Monte Carlo sampling of data provided in
Chapter 2 have shown that these are different approaches to exploring a response surface,
and assuming that the theoretical definition of this response surface is consistent, the
methods can be expected to produce consistent results (the inconsistencies being due to
the mathematical approximations used in the OF definition, and the sampling efficiency
of the exploration). This result provides a basis for further, case-study comparisons of the
efficacy of the alternative methods, but the wider objectives of this research meant that
this was not pursued. While Kuczera and Parent (1998) found that MCMC is significantly
more efficient than importance sampling at analysis of extreme values, this apparently
contradicts the observation in Chapter 2 that GLUE using uniform prior distributions is
better suited to analysis of extremes. However, these investigators assumed normal and
uniform posterior distributions rather than the posterior distributions obtained from
GLUE, and their conclusions seem somewhat misleading. Clearly, some additional
research into the issue of extreme value analysis using GLUE and other Monte Carlo
methods is warranted.
While in Chapters 2 and 5 stratified sampling was used, Chapters 6 and 7 go on to use
Latin hypercube sampling (LHS) (McKay et al. 1979 review the differences). Yu et al.
(2001) extol the value of the LHS method for evaluating parameter uncertainty in rainfall-
runoff modelling, noting that it reduced the required number of samples by 90%.
However, as Press et al. (1988) note, LHS can only be expected to be advantageous when
the sampling is so sparse that plain random sampling may not even identify univariate
effects, and it cannot provide an advantage in identifying factor interactions (see also
Section 8.5). In particular, LHS is most valuable where the uncertainty is predominantly
arising from univariate effects of some of the factors, but of which factors is unknown a
priori. Therefore, it is speculated that the conclusion of Yu et al. (2001) arose
because of the limited significance of interactions in their particular study, and it should
not be seen as a general result. Furthermore, it may be argued that the use of LHS in
Chapters 6 and 7 of this dissertation, rather than plain sampling, is unlikely to have added
considerable significance to results.
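For reference, a minimal sketch of Latin hypercube sampling (Python, illustrative only) shows why it guarantees uniform one-dimensional coverage while offering no corresponding control over factor interactions: each factor's range is stratified into Nsam equal-probability intervals, one point is drawn per interval, and the strata are then permuted independently per factor.

```python
import numpy as np

def latin_hypercube(n_sam, bounds, seed=0):
    """Latin hypercube sampling sketch: each factor's range is split into
    n_sam equal-probability strata, one point is drawn per stratum, and the
    strata are permuted independently per factor. This guarantees full
    univariate coverage but not coverage of factor interactions."""
    rng = np.random.default_rng(seed)
    d = len(bounds)
    # one stratified uniform deviate in [0, 1) per stratum per factor
    u = (rng.random((n_sam, d)) + np.arange(n_sam)[:, None]) / n_sam
    for j in range(d):                     # decouple the factors
        rng.shuffle(u[:, j])
    lo = np.array([b[0] for b in bounds])
    hi = np.array([b[1] for b in bounds])
    return lo + u * (hi - lo)

# e.g. latin_hypercube(100, [(0.5, 3.0), (0.01, 0.2)]) for two parameters
```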
It is important to note that, while a number of studies have been done comparing
sampling methods and response surface exploration algorithms (e.g. Kuczera and Parent
1998, Thyer et al. 1999, Portielje et al. 2000, Melching and Bauwens 2001, Yu et al.
2001), these are in the context of relatively simple water quality and rainfall-runoff
models, with considerably fewer parameters than the thermodynamic models and water
quality models employed in Chapters 5 and 7. Furthermore, these other studies have not
been done in the context of possibly large data and model structure errors, and perhaps
therefore emphasise the overall importance of sampling efficiency more than is
appropriate in water quality modelling using typical data. The exploration of the
multivariate response surface of parameters of the Chapter 5 and Chapter 7 models is
inevitably very sparse. As raised in Chapter 7, the relevant question is whether the higher
order interactions that are not sampled extensively are significant to the modelling task;
and (returning us to the discussion above), if higher interactions are not important, then
does the model need to be so complex? Far-reaching audits of decisions and associated
costs and models are required to answer this.
8.5 Review of WaterRAT
Further to the discussion in 8.3, a priority development of WaterRAT would be to make
the selection of the GLUE likelihood measure easier, or at least to make the significance
of the measure more transparent to the user. At present, there is a menu which gives a
variety of options for the likelihood measure, and the user must refer to the manual to
view the definitions (see McIntyre and Zeng 2002). In addition, behavioural zones can be
specified by entering limits on spreadsheets, and the determinands, and the time and
space domain to be included in the objective function is defined on another menu. To
simplify the GLUE likelihood specification, various alternatives could be removed,
reducing the GLUE to its parent HSY format. The Excel interface provides an especially
useful means of ‘drawing’ behavioural zones. On the other hand, for those potential users
who insist on the discrimination allowed by GLUE, an improvement would be to require
the user to write the GLUE likelihood definition themselves, rather than offer a
prescribed choice of functions. This is more in line with the philosophy of GLUE, forcing
the modeller to be aware of and responsible for that definition.
Statistical analysis of residuals is not presently included in WaterRAT, and some
functions for looking at this (e.g. residual autocorrelation, heteroscedasticity, outliers and
non-normality) would allow some assessment of the presence of model structure error
and data bias (see section 2.2.3). Should these errors be small, analysis of residuals would
also allow data error variance to be added to model result variance obtained on the basis
of likelihood functions (see section 2.2.6).
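A hypothetical sketch of such residual diagnostics (illustrative only; these functions are not part of the delivered WaterRAT) might look as follows:

```python
import numpy as np
from scipy import stats

def residual_diagnostics(obs, sim):
    """Sketch of the residual checks proposed above: lag-1 autocorrelation,
    a rank-correlation indicator of heteroscedasticity, and a normality
    test, all computed on the residuals obs - sim."""
    e = np.asarray(obs, dtype=float) - np.asarray(sim, dtype=float)
    lag1 = np.corrcoef(e[:-1], e[1:])[0, 1]        # residual autocorrelation
    hetero, _ = stats.spearmanr(np.abs(e), sim)    # |residual| vs magnitude
    _, normal_p = stats.shapiro(e)                 # small p => non-normality
    return {"lag1_autocorrelation": lag1,
            "heteroscedasticity_rho": hetero,
            "shapiro_p": normal_p}
```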
The version of WaterRAT delivered to the European Commission did not include the
MCMC algorithm, as its value remains to be tested. Furthermore, the genetic algorithm is
at a relatively undeveloped stage, unable to make any justifiable estimation of parameter
uncertainty. It would seem particularly valuable to develop a multiple-objective capability
of this algorithm, to avoid the especially computationally demanding task of identifying
Pareto optimal parameters using random sampling and the Pareto filter, as currently used
in WaterRAT (refer to McIntyre and Zeng 2002).
Another component of WaterRAT which was not included in the delivered version is
factorial sensitivity analysis (Henderson-Sellers and Henderson-Sellers 1993) which
allows for two-factor interactions between model inputs. A two-factor factorial analysis
may be regarded as the antithesis of Latin hypercube sampling – in the factorial analysis
the prior marginal distributions of individual factors are represented by just two points so
that two-factor interactions can be explored rigorously, while Latin hypercube sampling
relegates exploration of interactions in favour of a much more thorough sampling of the
univariate form. For model screening, two-factor factorial analysis may be preferred to
Monte Carlo-based methods of analysis, as it can be used to concentrate on the
sensitivities associated with extreme values (Kleijnen 1997) (although it still cannot
identify non-linearities in the response surface). Use of factorial sensitivity analysis of
factors affecting oil pollution in the Hun River was undertaken by Sherif (2000) using
WaterRAT.
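As an illustration of the method (a sketch under simple assumptions, not Sherif's or WaterRAT's implementation), a two-level factorial analysis runs the model at every corner of the factor hypercube and estimates main and two-factor interaction effects from the standard +1/-1 contrasts:

```python
import itertools
import numpy as np

def two_level_factorial(model, lows, highs):
    """Two-level (2**k) factorial sensitivity sketch: the model is run at
    every corner of the factor hypercube, and main effects and two-factor
    interaction effects are estimated from the +1/-1 design contrasts."""
    k = len(lows)
    signs = np.array(list(itertools.product([-1.0, 1.0], repeat=k)))
    x = np.where(signs > 0, highs, lows)           # factor values per run
    y = np.array([model(row) for row in x])        # 2**k model runs
    main = {i: 2.0 * np.mean(signs[:, i] * y) for i in range(k)}
    interactions = {(i, j): 2.0 * np.mean(signs[:, i] * signs[:, j] * y)
                    for i in range(k) for j in range(i + 1, k)}
    return main, interactions
```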
The limited choice of model structures within WaterRAT could limit the applicability of
the tool. For example, the lack of a hydrodynamic model together with the need for direct
estimation of shear friction to drive the sediment resuspension model meant that the
quasi-steady friction formula model was employed in Chapter 6, where a fully dynamic
model would have been helpful, at least to prove or disprove the sufficiency of the quasi-
steady approximation. Furthermore, the capability of WaterRAT to model sediment-water
interactions is not fully realised due to the absence of a choice of sorption models, for
example for modelling toxins, and absence of a model of redox status that may control
phosphorus release. In these regards, and with regard to conceptualisation of other
perceived important processes, there is much to be said for the framework adopted for
DESERT (Ivanov et al. 1996) whereby the user can specify his own model structure,
giving ultimate flexibility. Of course, this would require advanced modelling expertise
and added time needed to apply the tool, and certainly would be inappropriate for the
TOPLEM application of WaterRAT where, ultimately, minimal end user expertise is
available.
8.6 Review of Chapter 5: numerical issues
The numerical study presented in Chapter 5 was born out of necessity, when it became
clear that Monte Carlo simulation involving solution of systems of space-time partial
differential equations was not viable without careful attention to solution schemes and
tolerances. Not only was the time required to maintain numerical stability using the
originally devised fixed grid integration scheme too large, but the differences in
numerical precision from one Monte Carlo realisation to the next were a considerable and
unwanted complication in the overall analysis. The solution of the river
thermodynamic/ice growth model incorporated significant variations in time-step
requirements within and between realisations, and this was also observed in the water
quality studies of Chapters 6 and 7, although to a lesser extent.
Chapter 5 had a number of outcomes worthy of discussion. The river ice model was an
interesting stage of development – the work involved was justified by the perceived
dominant effect of the ice on the pollution transport and fate during the critical winter dry
season on the Hun River. Unfortunately, this was another aspect that could not be
developed within this project due to data limitations. More relevant to the Thesis, the
chapter draws attention to the benefits to be gained from reconciling numerical tolerances
with the overall task and achievable precision of the model. There is clearly more work to
be done in this regard, as the chapter took only a cursory look at what tolerance was
actually appropriate – the study did not attempt to identify the relative benefits of
increasing the number of samples, and reducing numerical precision. In the wider context,
given the inevitable limitations caused by errors in the model structure, calibration and
boundary condition data, and given the subjective nature of the GLUE likelihood, how
many samples at what numerical precision are really justified? It may be argued that the
constantly increasing power of desktop computers is removing numerical restrictions, and
makes such questions less relevant. However, we are clearly very far from achieving
satisfactorily fast Monte Carlo simulation on personal computers, and far from knowing
what is satisfactory given a certain model and its various uncertainties. In addition,
various demands for the additional computer power are evolving (such as GIS interfaces),
which are in competition with improved application of Monte Carlo methods.
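The trade-off can be illustrated with a toy experiment (hypothetical Python; a first-order decay equation stands in for the space-time PDE models of Chapter 5, and the prior on the decay rate is an arbitrary assumption): the same computational budget can buy many samples at a loose solver tolerance or few samples at a tight one, and if the reported percentiles agree to within the overall model precision, the looser tolerance is justified.

```python
import numpy as np
from scipy.integrate import solve_ivp

def decay(t, c, k):
    return -k * c                  # toy surrogate for a water quality model

def mc_percentiles(n_sam, rtol, seed=0):
    """Monte Carlo percentiles of the end-of-reach concentration, with the
    solver relative tolerance (rtol) exposed so that numerical precision
    can be traded against the number of samples."""
    rng = np.random.default_rng(seed)
    ks = rng.uniform(0.1, 1.0, n_sam)              # uncertain decay rate (1/day)
    end = [solve_ivp(decay, (0.0, 5.0), [1.0], args=(k,),
                     rtol=rtol, atol=1e-8).y[0, -1] for k in ks]
    return np.percentile(end, [5, 50, 95])

# If mc_percentiles(10000, 1e-3) agrees with mc_percentiles(1000, 1e-8) to
# within the overall model precision, the looser tolerance is justified.
```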
The numerical investigation also raised the issue of design of the spatial grid, as, while
the truncation errors in the time domain were automatically controlled, the spatial errors
were managed manually. Not only should the modeller think of what scale he wishes to
be modelling on (e.g. at the reach scale, or at the sub-reach scale, referring to Figure 5.7),
but ideally either he would know a priori what spatial grid-size is needed to achieve
adequate precision at this scale, or the grid would adapt in interaction with that in the time
domain. While the spatial truncation errors may be lumped into the overall error and the
effects included in parameter estimates and their uncertainty, this is not ideal as it
constitutes another loss of rigour and reliability in extrapolation to changed modelling
tasks. The automatic generation of spatial grids is a field which has not been touched
upon in this work, and seems to be relatively untouched in water quality modelling.
8.7 Review of Chapter 6: prior identification of data needs and assessment of model capability
Much of the review in Chapter 2 was based on rainfall-runoff modelling which typically
uses daily input-output data, and in which it is generally assumed that the input data are
error free (notable exceptions are Xu and Vandewiele 1994, Shah et al. 1996). Water
quality data are considerably more costly to collect than streamflow data, and may be
considerably less accurate both in terms of sampling errors and measurement errors.
Adding to the problem is the general desire, whether well-founded or not, to use models
that are mathematically over-parameterised. Chapter 6 recognised this fundamental
problem and set out to explore what we might expect to achieve out of a data-limited
water quality modelling study using a priori modelling experiments. At the same time,
that chapter covered one of the practical problems encountered in allocating TOPLEM
resources to monitoring.
Chapter 6 is limited in scope. It is hypothetical, based on synthetic data and untested
models, and so only gives an indication of the extent of difficulties that can arise from
data limitations. It also assumes that some important data (namely the upstream flow and
phosphorus concentrations, and the diffuse sources) were available at the desired
frequency, and while effects of random error in these inputs were explored, serious biases
were not. However, the investigation exposed the degree of difficulty arising from model
and data limitations in a way which can only be done on a synthetic basis. In particular,
the significance of a (relatively minor) structural fault on both the preferred monitoring
programme and the achievable reliability of forecasts is revealing. In ‘real’ studies, the
effects of various sources of error (e.g. data error and model structural error) cannot be
separated, and therefore little insight can be gained into controls on model uncertainty
and reliability.
The large number of sources of error in real modelling studies have effects on model
reliability that are inextricable from each other, and therefore leave us no option but to
use devices such as GLUE that agglomerate errors in a manner that is not easy to justify
either to an objective-minded mathematician or to a ‘layman’ decision-maker, and will
always leave doubt about the usefulness of forecast error estimates. While semi-synthetic
experimental modelling will never eradicate errors or remove the need to subjectively
estimate uncertainties in forecasts, it can at least be used to indicate the likely significance
of individual error sources, as in Chapter 6. Extending the investigation to evaluating the
ability of GLUE and multiple-objective optimisation to improve the robustness of the
error forecasts, is a matter for further research.
Following the conclusions of Chapter 6, together with recognition of its limited scope, it
is arguable that, in general, extensive numerical studies are warranted before embarking
on costly monitoring programmes to support predictive modelling. These should be
both of the type in Chapter 6, whereby speculative modelling is used to enhance the
efficiency of data collection, and of the recursive modelling-monitoring type, where
model and monitoring programme developments feed off each other (Somlyody 1995,
Van Straten 1998).
8.8 Review of Chapter 7: a framework for model conditioning, sensitivity and risk analysis
Firstly, it is noted that, for the purpose of Chapter 7, the arguments for parsimonious
modelling have been rejected. That study aimed to relate model sensitivities to reasonably
meaningful model parameters, and reducing the physical meaning by lumping processes
into a parsimonious form was considered contrary to this aim. Also, parameters to which
the algae output showed no sensitivity under calibration conditions were indicated as
important when investigating future scenarios.
Using multiple objectives for the definition of parameter uncertainty would be an
interesting development, which seemingly has not yet been done in water quality
applications, apart from the simple demonstrations given by McIntyre et al. (2001). While
Chapter 7 measured sensitivity simultaneously with respect to multiple objectives, it did
not go so far as to define parameter and model output uncertainty this way, in the manner
of Yapo et al. (1998) for example. Clearly, following the discussion in Chapter 2 (see also
the discussions of Beven (2000b) and Kavetski et al. (2002) in the rainfall-runoff
modelling context), the Pareto definition of uncertainty is quite different from that which
might be identified using GLUE, and there seems to be some interesting scope for
extending the generality of GLUE to encompass multi-objective considerations (e.g. the
fuzzy union of two equally likely response surfaces).
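To illustrate the distinction, a Pareto definition of parameter uncertainty, in the manner of Yapo et al. (1998), retains every sampled parameter set that is not dominated in all objectives simultaneously. The sketch below uses invented objective values rather than anything computed in Chapter 7.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical: two error objectives (e.g. low-flow and high-flow fit)
# evaluated for 1000 Monte Carlo parameter sets (minimisation).
f = rng.random((1000, 2))

def pareto_mask(objs):
    """True where no other set is at least as good in both objectives
    and strictly better in at least one."""
    n = objs.shape[0]
    mask = np.ones(n, dtype=bool)
    for i in range(n):
        dominated = np.any(np.all(objs <= objs[i], axis=1) &
                           np.any(objs < objs[i], axis=1))
        mask[i] = not dominated
    return mask

front = f[pareto_mask(f)]
print(f"{front.shape[0]} Pareto-optimal sets out of {f.shape[0]}")
```

The spread of the Pareto set is one measure of parameter uncertainty; a GLUE analysis over the same sample would generally retain a different, likelihood-weighted set.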
Multi-objective conditioning has recently been extended by Wagener et al. (2002a, b) to a
dynamic form of conditioning, which they call DYNIA. Using DYNIA, the objective function is
defined by the residuals within a window (of a predefined size) which moves over a
whole time-series of data. Therefore, instead of the distinct multiple objectives used for
example in Chapter 7, there is a ‘continuum’ of objectives over the time domain. This
allows the relative importance of each model parameter and its optimum value to be
assessed over the time domain, thus identifying the system response modes that are most
relevant to each parameter, and indicating structural weaknesses from time-variations in
optimum parameter values. In principle, DYNIA could be applied to evaluation of water
quality models either over the time-domain or, in the same manner, over the space
domain. However, it would be expected that, applied to studies such as that in Chapter 7
where data are quite sparse and potentially imprecise, the results of DYNIA would be
very sensitive to data errors. Like other ‘dynamic’ methods of parameter estimation, such
as Kalman filtering (Beck 1983), DYNIA is suited to relatively intense and good quality
data sets.
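The core of the moving-window idea can be sketched as follows. The one-parameter model, window size and data are hypothetical, and this is only the windowed-objective calculation rather than the full DYNIA procedure of Wagener et al. (2002a, b).

```python
import numpy as np

rng = np.random.default_rng(2)
t = np.arange(200)
obs = 5 + 2 * np.sin(2 * np.pi * t / 50) + 0.2 * rng.standard_normal(t.size)

# Hypothetical one-parameter model and a Monte Carlo sample of that parameter.
def model(a):
    return a * np.sin(2 * np.pi * t / 50) + 5

params = rng.uniform(0.0, 4.0, 500)
sims = np.stack([model(a) for a in params])   # (n_sets, n_times)
sq_res = (sims - obs) ** 2

half = 10                                     # window half-width
best_a = np.empty(t.size)
for j in range(t.size):
    lo, hi = max(0, j - half), min(t.size, j + half + 1)
    ssw = sq_res[:, lo:hi].sum(axis=1)        # windowed objective per set
    best_a[j] = params[np.argmin(ssw)]        # optimum parameter in this window
# Systematic time-variation in best_a would indicate structural weakness.
print(best_a[:5])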
The study in Chapter 7 may be regarded as innovative in that all the factors affecting the
objective function value are considered as uncertain and treated the same way in the
Monte Carlo analysis. This is an advance on the common approach of estimating
parameter uncertainty (e.g. Whitehead and Hornberger 1984, Freer et al. 1996, McIntyre
et al. 2001, Wagener et al. 2002b) with no explicit recognition of data uncertainty, or
boundary and initial condition uncertainty. For some objectives, the relative importance
of in-river data error was highlighted, whilst for others it was the significance of
uncertainty in pollution load estimates, indicating the impediments to successful model
conditioning. Chapter 7 also tried to present the application of GLUE from its Bayesian
foundation, and thereby to develop the regional sensitivity analysis results into estimates
of probability of failing prescribed objectives (where the objective would be some water
quality target or targets). It was argued that the risks of failure of environmental
management arise largely from the assumptions and approximations employed at the
modelling stage, and so risks can largely be managed by investment in the monitoring and
modelling process, rather than in pollution control. For this to be pursued requires that all
significant assumptions and approximations may be represented as uncertainties. It was
emphasised in Chapter 7 that these may include the uncertainty in the objective itself.
This raises opportunities for exploring questions such as “is it the hydro-chemical model
and its input uncertainty that leads to risk in making decisions about aquatic ecosystem
management (i.e. the basis for this dissertation), or is it our inability to link the chemistry
to the ecosystem?”, and “what relative efforts should be put into improving the hydro-chemical model versus the chemico-ecological model?”
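The translation of GLUE output into risk estimates can be sketched as below. The predictions, likelihood weights and target are all hypothetical; the second estimate illustrates treating the objective itself as uncertain, as advocated above.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical GLUE output: a predicted summer-mean concentration and the
# normalised likelihood weight of each retained simulation.
pred = rng.lognormal(mean=1.0, sigma=0.4, size=2000)
weights = rng.random(2000)
weights /= weights.sum()

target = 4.0                                   # hypothetical target (mg/l)
risk = weights[pred > target].sum()            # likelihood-weighted failure rate
print(f"P(fail fixed target)     = {risk:.3f}")

# The target itself may be uncertain (e.g. an ecological threshold):
targets = rng.normal(4.0, 0.5, size=2000)
risk_u = (weights * (pred > targets)).sum()
print(f"P(fail uncertain target) = {risk_u:.3f}")
```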
Chapter 7 also introduced the concept that evaluation of uncertainty is not restricted to
identifying and reducing uncertainties in the model, but, in cases of sparse and unreliable
data, may be used to identify data of dubious quality. The visualisation methods used in
the Charles River study (Figures 7.4 and 7.5) aim to assess data using simulation
modelling and GLUE-based confidence limits. While, of course, any conclusions about
data reliability must be made in the context of the limitations of the model and its
uncertainty estimation, the same is equally true for well-established methods of data
quality control using time-series methods (e.g. Tsakalias and Koutsoyiannis 1999).
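The underlying calculation in those figures amounts to comparing each observation with likelihood-weighted prediction limits, along the following lines. This is a minimal sketch with invented ensemble output and observations, not the Charles River data.

```python
import numpy as np

def weighted_quantile(values, weights, q):
    """Approximate quantile of `values` under normalised `weights` (0 < q < 1)."""
    order = np.argsort(values)
    v, w = values[order], weights[order]
    return np.interp(q, np.cumsum(w), v)

rng = np.random.default_rng(4)
n_sets, n_sites = 1000, 8
sims = rng.normal(6.0, 1.0, (n_sets, n_sites))   # hypothetical ensemble output
weights = rng.random(n_sets)
weights /= weights.sum()
obs = np.array([6.1, 5.8, 6.4, 9.5, 6.0, 5.7, 6.2, 2.1])  # two dubious values

for j in range(n_sites):
    lo = weighted_quantile(sims[:, j], weights, 0.05)
    hi = weighted_quantile(sims[:, j], weights, 0.95)
    flag = "" if lo <= obs[j] <= hi else "  <- outside 90% limits"
    print(f"site {j}: obs {obs[j]:4.1f}, limits [{lo:.1f}, {hi:.1f}]{flag}")
```

Observations falling outside the limits are candidates for scrutiny, subject to the caveat in the text that the limits themselves depend on the model and its uncertainty estimation.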
The subjective input to practical modelling studies is evident in the work of Chapter 7.
Not only is the structure judged to represent the main water quality processes, but the
prior ranges of parameter values are taken from the literature, and the prior uncertainty in
point pollution sources is based on perceived possible error in the daily load estimates.
While it might be argued that the variability in point sources can be measured, in practice
generally only the inputs to sewage treatment works are monitored, and measuring the
discharges as part of a modelling exercise is costly. In practice, daily discharges are often
approximated by assumed reductions to the input, or by assuming direct proportion to the
served population (e.g. Daldorph et al. 2000). The estimation and consideration of
pollution load uncertainties, therefore, is needed in order to qualify any results, for
example by expressing uncertainty as confidence limits or as risk associated with a
decision. The subjective nature of this, and of all other subjective aspects of the
uncertainty analysis (the GLUE likelihood measure has already been highlighted), must be
emphasised by the modeller, and laid open to scrutiny and review.
When estimating scope for error in data, the modeller needs to have some knowledge of
how the data were collected. Quality control is essential, and ideally the modeller would
observe, and have some expertise in, the monitoring and measurement processes.
Unfortunately, in practice, modelling, monitoring and measuring tend to be separate skills
and disciplines (Somlyody 1995). Fundamental communication problems (as encountered
in the Hun River study) and the tendency for data to be freely available without
supporting quality control documentation (as with the Charles River data) mean that the
modeller may have insufficient insight into the significance of the data being used.
Modellers have no chance of communicating the significance of model results to the
decision-maker if they themselves lack basic appreciation of how the results originated.
This basic problem has limited this dissertation to an exploration of issues and
methodologies rather than case-specific conclusions about water quality and/or suitable
models.
8.9 The Hun River case study
The Hun River case study was intended to provide a basis for developing and testing the
modelling framework and associated methodology, and it largely failed in that respect.
Discussion of why it failed allows broader understanding of uncertainty, and allows
comment upon the worth of modelling and uncertainty evaluation in difficult practical
circumstances, which might be expected in developing country studies.
The summarised outcomes of the river and lake water quality modelling work packages
of the TOPLEM project are numbered below (from McIntyre 2002a), and each is followed
by additional discussion.
1) Within the limitations of the Hun River modelling study, there is minimal
evidence that the COD, ammonia and nitrate pollution levels in the Hun River
are significantly affected by in-river processes. There is somewhat stronger
evidence that oil and phenol concentrations are significantly affected by in-river
mass losses6. This leads to the opinion that very simple models are appropriate
for immediate strategic decision-support tasks.
The hypothesis that the water quality was sensitive to in-river processes was tested using
models of lesser complexity (see McIntyre 2002a) than that used in the Charles River
study, together with the objective function, regional sensitivity analysis and the KS
significance test as described for the Chapter 7 study. For each factor, this test poses the
question “given the overall variation in the model output (i.e. the objective function
value) due to all the input uncertainties and their interactions, and given the sampling
error that is part of the Monte Carlo method, is there significant evidence that this factor
is affecting the model output?”. That the answer, for COD, nitrate and ammonia, was
negative indicates that, given the limitations of the modelling method and the perceived
data uncertainties, the inclusion of water quality processes in the model was arguably
superfluous. The message regarding these pollutants is simple; that the managers of the
Hun River quality need not be accounting for in-river processes in current pollution
management strategy, at least not before better models can be identified. Furthermore, a
clear message to the Hun River modellers is that, with currently available knowledge and
data-bases, and for currently relevant modelling tasks, including in-river processes in the
models of COD, nitrate and ammonia is a waste of human and computer resources
(except as a demonstration of this same point).

6 Chemical oxygen demand (COD), ammonia, nitrate, phenol and oil were the five Hun River pollutants investigated.
2) As the key sources of pollution are known prior to any modelling and no other
key factors affecting water quality can be identified from the available evidence,
arguably computer models have a limited role at this stage in management of the
Hun River water quality. However, their role in providing graphic reports of
spatially and temporally varying water quality with which to support decisions is
undoubtedly of value, as is their role in identifying key information which needs to be
collected in order to advance modelling/management strategy.
The value of water quality models for studying rivers badly polluted by known discharges
lies largely in reporting expected improvements following interventions, rather than in
their ability to represent environmental processes. In particular, decision-makers and other
stake-holders may have little conception of causes and effects of spatial and temporal
water quality variations (even though they may appear obvious to a modeller), which can
be effectively reported using model interfaces. For example, Montgomery Watson
(2001b) have employed simple models with GIS interfaces for effective communication
of the Hun River (and wider Liaoning water pollution) situation. Similarly graphic reports
of uncertainty would provide a higher level of information for decision-making purposes.
3) As there is no evidence in the available data of seasonal variation in the pollution loads, the
loads were regarded as stationary inputs to the model, represented by the raw
data collected from the Hun River tributary sewers and rivers. The very high
variability in each pollution load was represented in the model as a stochastic
process, and was the major cause of uncertainty in model results. For this
reason, the model parameters could not be successfully conditioned by in-river
observed water quality. Therefore, a modelling priority is to reduce uncertainty
through coupling the river models with the TOPLEM load estimation model,
which was not achieved within this (TOPLEM) project.
As the in-river water quality observations allowed conditioning of the pollution loads but
not the model parameters, it can be said that the conditioning failed, and all risk
evaluations were done effectively using the a priori model. As in Chapter 6, the value of
surface water quality models and hence the risk of poor decisions was shown to be
controlled by lack of knowledge of model boundary conditions especially pollution loads,
together with the uncertainty in the in-river water quality data. Effective management of
modelling projects must involve suitable allocation of resources to modelling and
monitoring of boundary conditions, and effective communication between modelling
parties. A modelling issue that follows on from (3) is the justification of Monte Carlo
simulation - if the predominant future uncertainty is in the pollution load, then
propagation of uncertainty through the water quality model is a first order task, for which
the first order second moment method (see Chapter 2) may be more suitable.
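A sketch of that alternative is given below: the first order second moment (FOSM) method propagates the mean and variance of the dominant uncertain input through a linearisation of the model about the mean. The decay model and the load statistics are hypothetical.

```python
import numpy as np

def downstream_conc(load, k=0.3, x=10.0, velocity=0.5):
    """Hypothetical steady-state model: first-order decay of a point load."""
    return load * np.exp(-k * x / velocity)

mu_load, sd_load = 10.0, 3.0    # assumed load mean and standard deviation

# FOSM: linearise about the mean input (central-difference gradient).
h = 1e-4 * mu_load
grad = (downstream_conc(mu_load + h) - downstream_conc(mu_load - h)) / (2 * h)
mean_c = downstream_conc(mu_load)
sd_c = abs(grad) * sd_load
print(f"FOSM: mean = {mean_c:.2f}, sd = {sd_c:.2f}")

# Brute-force Monte Carlo check (exact here, as the model is linear in load).
rng = np.random.default_rng(6)
mc = downstream_conc(rng.normal(mu_load, sd_load, 100000))
print(f"MC:   mean = {mc.mean():.2f}, sd = {mc.std():.2f}")
```

Because the model above is linear in the load, FOSM reproduces the Monte Carlo moments at a tiny fraction of the cost; its attraction fades as model non-linearity grows.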
4) All results presented in this report (i.e. McIntyre 2002a) are conditional on the
precision of the flow data, and the unconditioned hydraulic model. These
components were not modelled as uncertain inputs due to the difficulty in
interpreting available data and quality control information. Had they been input
to the model with nominal uncertainty, this would have led to the simple (but
valid and important) conclusion that flow is a principal factor affecting water
quality. Therefore, useful practical modelling of the Hun River quality must be
preceded by (at least) daily, well-documented flow gauging at all key sections
and/or improved communication of existing data.
A flaw of the TOPLEM modelling project was that the political/institutional/cultural
obstacles to delivery of key data (especially flow and stage data), and to delivery of
supporting quality control documentation, were neither fully foreseen nor overcome. This
aspect of modelling cannot be completely neglected, as it seems that modellers, and the
reliability of their models, will always rely on the efficiency of other parties in delivery of
adequate data and supporting documentation. The success of the modelling (including
uncertainty estimation and minimisation) was ultimately controlled by the project
management issues, more than by the limitations of modelling methodology.
Management and communication are elements of uncertainty analysis which have been
almost completely outwith the scope of this dissertation, but their general importance
should be emphasised.
Following (4), it is noted that the severe pollution problem in the Hun River, as in many
developing country situations, is as much due to water shortage as to lack of
wastewater treatment. Therefore, some estimation of the uncertainty in flow data, which
could be expected to be shown by the sensitivity analysis to be amongst the most
significant factors, would be advisable.
8.10 A look to the future
This dissertation has been non-committal about the relative merits of parsimonious and
more complex models, and has argued that the model design should depend upon the case
rather than any predetermined mindset. Increased numbers of uncertain inputs introduce
extra difficulty in conducting and interpreting sensitivity analyses, model conditioning
and uncertainty propagation; on the other hand, parsimony leaves extra doubt about the
reliability of the model structure. While some modellers call for identification of simple
cause-effect relationships (e.g. Beck 1997), others justify increased parameterisation and
numbers of state variables (Shanahan et al. 2001). While some have identified the
principal components of interacting parameters in order to reduce the dimensionality of
the parameter space (Kuczera 1990), others have the view that dimensionality should not
be reduced but handled using Monte Carlo analysis (Reichert and Omlin 1996). There is
no conflict; simply differing needs and resources (e.g. data, human and computer
resources). The challenge is to allow modellers, through provision of suitable modelling
toolkits and guidance, to recognise the implications of their needs and resource
limitations.
It is argued that there is nothing wrong with over-parameterised models per se, so long
as 1) at calibration, the extra degrees of freedom are not used to explain what are, for all
intents and purposes, data errors; 2) increased model elaboration is not, without evidence,
taken to imply increased prediction accuracy; and 3) the additional demands on human,
computer and field resources do not draw attention from more constructive, relevant
tasks. On the first point, a priori constraints on ranges of parameters and a priori
specification of perceived data error bounds will avoid the extreme consequences of
over-fitting data. On the second, while the more complex model may be inaccurate in a
deterministic sense, it is likely to be a better tool for exploring the set of possible futures,
and a more powerful decision-making tool if parameter uncertainties could be
comprehensively managed. If they could be (there is no evidence that they are), it would be
difficult to find a purely mathematical argument to support a simple conceptual model
over a more physically based model. The third point of reservation, relating to complex
models being an unjustified use of resources, is the heart of the matter. Is a complex
model, with or without adequate supporting data, with or without adequate uncertainty
analysis, likely to be cost effective in improving quality of decisions? While there are
(very few) reports of model post-audits in terms of fitting data, there are (apparently)
none that audit the cost-effectiveness of the modelling exercise in terms of quality of
decisions.
Returning to the objective of providing frameworks for uncertainty analysis, it seems that
a valuable development in water quality modelling would be toolkits that provide
flexibility in selecting model structures. This would encourage exploration and
recognition of structural uncertainty. WaterRAT provides some capacity to test
alternative structures and modelling scales; DESERT (Ivanov et al. 1996) seems to
provide more choice of model structure, with a loss of ease of use. Ease of use is a
paramount consideration, and recent software advances (e.g. SIMULINK, Mathworks
2000) may provide a platform from which multiple structures can easily be devised and
tested. It is easy to say that toolkits could automatically combine results from alternative
models into an ‘ensemble’ estimation of uncertainty, but in practice this idea may be
premature, considering that few modellers yet even consider parameter uncertainty.
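In its simplest form, such an ensemble might just pool equally weighted Monte Carlo runs from alternative structures, as sketched below with two invented structures for the same prediction task.

```python
import numpy as np

rng = np.random.default_rng(8)
x = 10.0  # hypothetical prediction location (km)

def structure_a(k, load):
    """First-order decay of a point load (hypothetical structure A)."""
    return load * np.exp(-k * x)

def structure_b(k, load):
    """As A, plus a constant background concentration (hypothetical structure B)."""
    return load * np.exp(-k * x) + 1.0

# Pooling equally weighted runs folds the structural choice into the
# uncertainty estimate alongside the parameter uncertainty.
preds = []
for f in (structure_a, structure_b):
    k = rng.uniform(0.05, 0.3, 2000)
    load = rng.uniform(5.0, 15.0, 2000)
    preds.append(f(k, load))
pooled = np.concatenate(preds)
print(f"ensemble 5-95% bounds: [{np.percentile(pooled, 5):.2f}, "
      f"{np.percentile(pooled, 95):.2f}]")
```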
Apart from uncertainty analysis, current developments in water quality modelling include
models which represent runoff-groundwater-river-lake interactions (e.g. Daldorph et al.
2000), integrated urban sewerage–river water quality models (e.g. Lau 2002), and new
extensions of hydro-chemical models to include hydro-ecology (e.g. Wade et al. 2001). A
more general development is towards distributed or semi-distributed catchment-wide
models, often using geographical information system (GIS) databases and interfaces (e.g.
Crosetto et al. 2000). The computational demands of the GIS itself, and the increased
spatial dimension of the problem, pose additional problems for the implementation of
uncertainty analysis. Apparently, there has been almost no research into how the benefits
of increased spatial representation can be reconciled with the high and unknown
uncertainty in results. Cerati (2002) demonstrates that, using established Monte Carlo
methods and current desktop hardware, rigorous sensitivity analysis of such models is
difficult, and maybe impossible.
Surface water quality modelling has traditionally been the field of civil engineers (Chapra
1997). Increasing relevance of groundwater, ecology and GIS has meant that the
collaboration of hydrologists, ecologists and geographers is now essential. Also, now
more than ever before, the skills of mathematicians are needed to develop efficient and
reliable algorithms for model solving, uncertainty analysis and optimisation. As was
stated near the outset of this dissertation, the value of Monte Carlo methods and
importance of uncertainty analysis should not be diminished by the perceived inefficiency
of the numerical implementations. The efforts directed at numerical efficiency within
WaterRAT, in particular the work reported in Chapter 5, have merely scratched the
surface of the numerical modelling issues. Furthermore, the full potential of the MCMC
and the genetic algorithm in application to uncertainty estimation can best be attained
with in-depth understanding of the mathematics. Recent initiatives (NERC 2003) point
the way forward for improved cross-collaboration between mathematicians and
environmental modellers.
In the context of systems identification of wastewater treatment models, Beck (1999)
observes that better technology for data collection does not necessarily permit improved
understanding of system processes, as key responses will always remain unobserved. In
the context of prior perceptions of environmental models, Beven (2002c) notes that the
key system characteristics may be “essentially unknowable”. While the contexts of these
two references are different - the latter is referring to identification of inputs to a priori
models, while the former refers to measurement of the output state variables – both imply
that, despite research and emerging technology, a complete set of information with which
to develop an accurate model will never be available. There are two pragmatic responses
to this concern. Firstly, for decision-support purposes, process understanding at the
fundamental scale is not essential if justified uncertainty analysis is undertaken, and
neither is a precise model. Secondly, the future of environmental modelling relies on
improved monitoring technology to facilitate affordable, justifiable uncertainty analysis.
At least in developing country applications where continuous monitoring is not
practicable, data quantity will remain low and modelling methods must strive to account
for the associated uncertainty. It is the opinion of the author of this Thesis that the future
of uncertainty analysis in modelling data poor environments, as far as uncertainty
estimation is concerned, lies in the well-established set-based method introduced to
environmental modelling by Hornberger and Spear (1980), which was later applied to
uncertainty forecasting by Van Straten and Keesman (1991). This method provides a
basis for incorporating alternative structural assumptions, as well as parameter, and initial
and boundary condition uncertainties. It provides a transparent method, which uses a
minimum number of clearly defined assumptions, for conditioning, sensitivity testing and
uncertainty propagation. The use of more elaborate methods of response surface analysis
(such as might be applied using data sampling, MCMCs or GLUE as in Chapter 2)
involves extra assumptions that are likely to obstruct the acceptance of the result rather
than to fortify it, due to their subjective and sometimes quite complex basis. The primary
objective is to achieve recognition of parameter and structural uncertainty, their effects on
decisions and the reasons for them; the best way to achieve such recognition is to keep
the analysis as simple as possible.
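A minimal sketch of the set-based approach follows: parameter sets sampled from the prior are accepted or rejected against clearly stated error bounds on the observations, and forecast limits are taken as the range over the accepted set. The model, data and bounds are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(7)

def model(k, load, x):
    """Hypothetical decay model used for both conditioning and forecasting."""
    return load * np.exp(-k * x)

x_obs = np.array([2.0, 5.0, 10.0])
obs = np.array([7.0, 5.0, 2.5])
tol = 1.5                                   # clearly stated prior error bound

# Sample the prior and keep every set consistent with all observations.
k = rng.uniform(0.05, 0.5, 5000)
load = rng.uniform(5.0, 15.0, 5000)
sims = model(k[:, None], load[:, None], x_obs)          # (n_sets, n_obs)
accepted = np.all(np.abs(sims - obs) <= tol, axis=1)
print(f"{accepted.sum()} of {k.size} parameter sets accepted")

# Forecast bounds at an unmonitored site: the range over the accepted set.
fc = model(k[accepted], load[accepted], 15.0)
print(f"forecast at x=15: [{fc.min():.2f}, {fc.max():.2f}]")
```

The only assumptions are the prior ranges and the error bound, which is the transparency argued for above; alternative model structures could be included simply by pooling their accepted sets.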
In developed country applications, where water quality constituents and hydrodynamics
might be effectively monitored on a time scale of minutes (at least in the case of laterally
mixed rivers) to give a truly dynamic picture of the hydro-chemistry, there may be scope
in the future for exploring the model-data error structure. Using time-varying
identification of optimum model parameter values (e.g. Beck 1983, Wagener et al. 2002b)
allows combinations of structural and data error to be hypothesised and potentially
resolved, or represented as uncertainty. Of course, the success of such research depends
on adequate boundary condition identification, which may be problematic in models of
diffuse-sourced runoff. In cases of surface waters where two- and three-dimensional
variations in water quality exist, the same methods could be aspired to, but acquiring
adequate data on the system is obviously much more difficult. Looking at the wider
picture, specifically the growing relevance of ecological modelling, there will inevitably
be strong competition for monitoring and modelling resources. Again, the question to ask
is “where is the risk of not achieving our water quality management objectives coming
from?” and that is where resources should be directed. Furthermore, there is going to be a
need for significant trade-offs between protection of aquatic ecology, and the resulting
social and economic costs. A future modelling problem then becomes not only identifying
the risks of failing different criteria, but establishing acceptable compromises between the
risks.
The overall impression gained from the development of this Thesis is that, at the moment,
further research into improving river water quality models (and perhaps environmental
models in general) for decision-support is not warranted. Real improvements in the
practical value of models lie in the willingness of modellers to seriously promote and
confront the problem of uncertainty in research and tool development; and for decision-
makers to accept and use the outcomes, and integrate uncertainty in results into risk
assessments. The complexity of commonly used models far surpasses the complexity of
thought given to using the results properly. This view might be extended to say that
further development of modelling methods (possibly including development of tools for
uncertainty analysis) is not warranted, given the limited inclination of decision-makers to
use them properly. After all, a minority of modellers have been promoting the need for
such analysis and tools since at least 1970 (see Beck 1987) without proportional uptake of
the advice. However, there are signs of changing views on how models should be used
(e.g. Cipra 2000; Beven 2000a), with a number of important policy documents and codes
of practice now referring to integration of model uncertainty into risk management
(DETR et al. 2000; UK Environment Agency 2002). Additionally, at least in Europe, the
stakes are currently being raised by the Water Framework Directive - the potential cost of
sufficient water quality management is high; the probability of unsuccessful investment is
high; worst case scenarios cannot be considered; the necessary data to reliably inform
decisions will not be available.
References

Adams, B. and Reckhow, K.H. 2001. An examination of the scientific basis for
mechanisms and parameters in water quality models. Unpublished paper. Available
from second author at www2.ncsu.edu/ncsu/CIL/WRRI/AdamsReckhow.pdf
Ambrose, R.B., Wool, T.A. and Martin, J.L. 1993. The water quality analysis simulation
program WASP5 Version 5.10 Part A: Model Documentation. Environmental
Research Laboratory, Office of Research and Development, US EPA, Athens,
Georgia, USA.
Ambrose, R.B., Barnwell, T.O., McCutcheon, S.C. and Williams, J.R. 1996. Computer
models for water quality analysis. In Water Resources Handbook, Mays, L.W. (Ed.),
McGraw-Hill, pp 14.1-14.28.
Ang, A.H.S. and Tang, W.H. 1975. Probability Concepts in Engineering Planning and
Design, Vol.1 - Basic principles, John Wiley, pp 56, 222, 232-248.
Ang, A.H.S. and Tang, W.H. 1984. Probability Concepts in Engineering Planning and
Design, Vol.2 - Decision, risk and reliability, John Wiley, pp 274.
Ashton, G. (Ed.). 1986. River and Lake Ice Engineering. Book Crafters, Inc. MI. USA.
Ashton, G. 1979. Modelling of ice in rivers. In Modelling of Rivers, Shen, H.W. (Ed.),
John Wiley.
Bastidas, L.A., Gupta, H.V., Sorooshian, S., Shuttleworth, W.J. and Yang, Z.L. 1999.
Sensitivity analysis of a land surface scheme using multicriteria methods. Journal of
Geophysical Research 104(D16), 19481-19490.
Beasley, D., Bull, D.R. and Martin, R. 1993. An overview of genetic algorithms: Part 1,
Fundamentals. University computing 15(2), 58-69.
Beck, M.B. 1983. Uncertainty, system identification, and the prediction of water quality.
In Uncertainty and forecasting of water quality, Beck, M.B. and Van Straten, G.
(Eds.), Springer-Verlag, pp3-68.
Beck, M.B. 1987. Uncertainty in water quality models. Water Resources Research 23(8),
1393.
Beck, M.B. 1997. Applying systems analysis in managing the water environment:
Towards a new agenda. Water Science and Technology 36(5), 1-17.
Beck, M. B. 1999. Coping with ever larger problems, models, and data bases. Water
Science and Technology 39(4), 1-11.
Beer, T. and Young, P.C. 1983. Longitudinal dispersion in natural streams. ASCE Journal
of Environmental Engineering 109(5), 1049-1067.
Bencala, K.E. and Walters, R.A. 1983. Simulation of solute transport in a mountain pool
and riffle stream: a transient storage model. Water Resources Research 19(3), 732-
738.
Berezin, I.S. and Zhidkov, N.P. 1965. Computing Methods, Pergamon Press.
Berthouex, P.M. and Brown, L.C. 1994. Statistics for environmental engineers, CRC
Press.
Beven, K.J. and Binley, A.M. 1992. The future of distributed models; model calibration
and predictive uncertainty. Hydrological Processes 6, 279-298.
Beven, K.J. 1993. Prophecy, reality and uncertainty in distributed hydrological
modelling. Advances in Water Resources 16(1), 41-51.
Beven, K.J. 1998. Generalised Likelihood Uncertainty Estimation (GLUE) User Manual.
Lancaster University, Lancashire, UK.
Beven, K.J. 2000a. On modelling uncertainty, risk and decision making. Hydrological
Processes 14, 2605-2606.
Beven, K.J. 2000b. Rainfall-runoff modelling: The Primer, John Wiley.
Beven, K. J., 2002c. Towards a coherent philosophy for environmental modelling,
Proceedings of the Royal Society of London (A) 458, 2465-2484.
Beven, K.J. and Freer, J. 2001. Equifinality, data assimilation, and uncertainty estimation
in mechanistic modelling of complex environmental systems using the GLUE
methodology. Journal of Hydrology 249(1-4), 11-29.
Beven, K.J., Freer, J., Hankin, B. and Schulz, K. 2001. The use of generalised likelihood
measures in high order models of environmental systems. In Non-linear and non-
stationary signal processing, Fitzgerald, W.S., Smith, R.C., Walden, A.T., Young,
P.C. (Eds.), Cambridge University Press.
Bicknell, B.R., Imhoff, J.C., Kittle, J.L., Donigian, A.S. and Johanson, R.C. 1997.
Hydrologic simulation program – Fortran. Users manual for version 11. EPA/600/R-
97/080.
Bierman, V.J. and Dolan, D.M. 1986. Modeling of phytoplankton in Saginaw Bay: II.
Post-audit phase. ASCE Journal of Environmental Engineering 112(2), 415-429.
Binley, A.M., Beven, K.J., Calver, A. and Watts, L.G. 1991. Changing responses in
hydrology: assessing the uncertainty in physically based model predictions. Water
Resources Research 27, 1253-1261.
Blom, G. and Aalderink, H.R. 1998. Calibration of three resuspension/sedimentation
models. Water Science and Technology 37(3), 41–49.
Boers, P.C.M., Van Raaphorst, W. and Van der Molen, D.T. 1998. Phosphorus retention
in sediments. Water Science and Technology 37(3), 31–39.
Bowie, G.L., Williams, B.M., Porcella, D.B., Campbell, C.L., Pagenkopf, J.R., Rupp,
G.L., Johnson, K.M., Chen, P.W.H. and Gherini, S.A. 1985. Rates, Constants and
Kinetics formulations in Surface Water Quality (2nd Edition). Report no. EPA/600/3-
85/040, US EPA, Athens, Georgia, USA.
Boyle, J.D. and Scott, J.A. 1984. The role of benthic films in an East Devon River. Water
Research 18, 1089-1099.
Box, G.E.P. and Jenkins, G.M. 1970. Time series analysis: forecasting and control,
Holden-Day, pp 208.
Bras, R.L. 1990. Hydrology, Addison-Wesley.
Brooks, S. 1998. Markov chain Monte Carlo method and its application. Journal of the
Royal Statistical Society, Series D 47(1), 69-100.
Brown, L.C. 2002. Modeling uncertainty – QUAL2E-UNCAS: A case study. Water
Environment Federation. Unpublished. www.wef.org/pdffiles/TMDL/Brown.pdf.
Brown, L.C. and Barnwell, T.O. 1987. The enhanced stream water quality models
QUAL2E and QUAL2E-UNCAS: Documentation and user manual. Report no.
EPA/600/3-87/007, US EPA, Athens, Georgia, USA.
Burns, L.A. 2000. Exposure analysis modeling system: User’s guide and system
documentation. National exposure research laboratory, EPA/600/R-00/081. US EPA.
Butcher, J.C. 1987. The numerical analysis of ordinary differential equations, John
Wiley.
Camacho, L.A. 2000. A hierarchical modelling framework for solute transport in rivers
under unsteady flow conditions. PhD Dissertation. Imperial College of Science,
Technology and Medicine.
Camacho, L.A. and Lees, M.J. 1999. Multilinear discrete lag-cascade model for channel
routing. Journal of Hydrology 226(1-2), 30-47
Carver, M.B. 1976. The choice of algorithms in automated method of lines solution of
partial differential equations. In Numerical methods for differential systems, L.
Lapidus and W.E. Schiesser (Eds.), Academic Press.
Cash, J.R. and Karp, A.H. 1990. A variable order Runge-Kutta method for initial value
problems with rapidly varying right-hand sides. ACM Transactions on Mathematical
Software 16, 201-222.
CDM. 1997. Charles River Pollution Control District/Town of Holliston, Massachusetts:
Upper Charles River Wasteload Allocation Study, Camp, Dresser and McKee Inc.,
Cambridge, MA, USA.
CEC. 2000. Directive 2000/60/EC of the European Parliament and of the Council of 23rd
October 2000 establishing a framework for Community action in the field of water
policy. Council of the European Community, Brussels.
Cerati, M. 2002. Phosphorus export at the catchment scale – analysis and mapping of
sensitivities using GIS. MSc Project Dissertation. Imperial College London,
September 2002.
Chapra, S.C. 1991. A toxicant loading concept for organic contaminants in lakes. ASCE
Journal of Environmental Engineering 117(5), pp656-677.
Chapra, S.C. 1997. Surface Water-Quality Modeling, McGraw-Hill.
Chapra, S.C. 1999. Organic carbon and surface water quality modelling. Progress in
Environmental Science 1(1), 49-70.
Chapra, S.C. and Canale, R.P. 1998. Numerical Methods for Engineers (3rd Edition),
McGraw-Hill, 710-718.
Chapra, S.C. and Runkel, R.L. 1998. Modelling impact of storage zones on stream
dissolved oxygen. ASCE Journal of Environmental Engineering 125(5), 415-419.
Chatfield, C. 1995. Model uncertainty, data mining and statistical inference. Journal of
the Royal Statistical Society (A) 158(3), 419-466.
Chen, G.H., Leong, I.M., Liu, J. and Huang, J.C. 1999. Study of oxygen uptake by tidal
river sediment. Water Research 33(13), 2905-2912.
Chen, J. and Wheater, H. 1999. Identification and uncertainty analysis of soil water
retention models using lysimeter data. Water Resources Research 35(8), 2401-2414.
Chow, V.T. 1959. Open Channel Hydraulics, McGraw-Hill.
Christian, J.T. and Baecher, G.B. 1999. Point-estimate method as numerical quadrature.
Journal of Geotechnical And Geoenvironmental Engineering 125(9), 779-786.
Cipra, B. 2000. Revealing uncertainties in computer models. Science 287, 960-961.
Cochran, W.G. 1977. Sampling techniques (3rd edition). John Wiley, pp 72.
Cole, T.M. and Wells, S.A. 2000. CE-QUAL-W2: A two-dimensional, laterally-averaged
hydrodynamic and water quality model, Version 3.0. User manual. U.S. Army Corps
of Engineers. Instruction Report EL-00-1.
Connolly, J.P. and Coffin, R.B. 1995. Model of carbon cycling in planktonic food webs.
ASCE Journal of Environmental Engineering 121(10), 682-690.
Crosetto, M., Tarantola, S. and Saltelli, A. 2000. Sensitivity and uncertainty analysis in
spatial modelling using GIS. Agriculture, Ecosystems and Environment 81, 71-79.
CRWA, 2000. Charles River Watershed Association Annual Report 2000. (Available
from www.crwa.org).
Daldorph, P.W.G., Lees, M.J., Wheater, H.S. and Chapra, S. 2000. Integrated Lake and
Catchment Phosphorus Model: A Eutrophication Management Tool: I Model Theory.
Journal of The Chartered Institution of Water and Environmental Management 15(3),
174-181.
DAFF. 2000. National Water Quality Management Strategy, 9, Rural Land Uses and
Water Quality. Australian Department of Agriculture Fisheries and Forestry,
Canberra, Australia.
DEFRA. 2002. The Government’s strategic review of diffuse water pollution from
agriculture in England: Paper 1: Agriculture and water: a diffuse pollution review.
Chapter 4 - Current status of waters in England and Wales, DEFRA June 2002.
DETR, Environment Agency and Institute for Environment and Health. 2000. Guidelines
for environmental risk assessment and management. HMSO, London.
De Marchi, C., Ivanov, P., Jolma, A., Masliev, I., Smith, M.G. and Somlyody, L. 1999.
Innovative tools for water quality management and policy analysis DESERT and
STREAMPLAN. Water Science and Technology 40(10), 103-110.
DHI, 2000. MIKE 11: A modelling system for rivers and channels: User guide. DHI
Software.
Di Toro, D.M. and Fitzpatrick, J.J. 1993. Chesapeake Bay Sediment Flux Model. U.S.
Army Corps of engineers, Waterways Experiment Station, Technical Report EL-93-2.
Draper, D. 1995 Assessment and propagation of model uncertainty. Journal of the Royal
Statistical Society, Series B 57(1), 45-97.
Duan, Q., Gupta, V.K. and Sorooshian, S. 1993. A shuffled complex evolution approach
for effective and efficient global minimization. Journal of Optimization Theory and
Applications 76(3), 501-521.
Engelund, F. and Fredsoe, J. 1976. A sediment transport model for straight alluvial
channels. Nordic Hydrology 7, 239-307.
Enggrob, H. 1997. Danubian Lowland - Ground Water Model: Part 3 - Sediment
Transport Modelling. DHI user conference proceedings 1997.
Evans, E.C., McGregor, G.R. and Petts, G.E. 1998. River energy budgets with special
reference to river bed processes. Hydrological Processes 12(4) 575-595.
Fonseca, C.M. and Fleming, P.J. 1995. An overview of evolutionary algorithms in multi-
objective optimization. Evolutionary computing 3(1), 1-16.
Franks, S. and Beven, K.J. 1997. Bayesian estimation of uncertainty in land-surface-
atmosphere flux predictions, Journal of Geophysical Research 102(D20), 23991-
23999.
Franks, S.W., Beven, K.J. and Gash, J.H.C. 1999. Multi-objective conditioning of a
simple SVAT model. Hydrology And Earth System Sciences 3(4), 477- 489.
Freer, J., Beven, K. and Ambroise, B. 1996. Bayesian estimation of uncertainty in runoff
prediction and the value of data: an application of the GLUE approach. Water
Resources Research 32(7), 2161.
Gardner, R.H., O’Neill, R.V., Mankin, J.B. and Kumar, D. 1980. Comparative error
analysis of six predator-prey models. Ecology 61(2), 323-332.
Gear, C.W. 1971. Numerical initial value problems in ordinary differential equations,
Prentice-Hall.
Goldberg, D.E. 1989. Genetic Algorithms in Search, Optimization and Machine
Learning, Addison-Wesley, Reading, MA.
Greenberg, A.E., Clesceri, L.S. and Eaton, A.D. (Eds.) 1992. Standard methods for the
examination of water and wastewater (18th Edition), APHA-AWWA-WEF.
Gulliver J.S., Wilhelms S.C. and Parkhill K.L. 1998. Predictive capabilities in oxygen
transfer at hydraulic structures. ASCE Journal of Hydraulic Engineering 124(7), 664-
671.
Gunduz, O., Soyupak, S. and Yurteri, C. 1998. Development of water quality
management strategies for the proposed Isikli reservoir. Water Science and
Technology 37(2), 369-376.
Gupta, H.V., Sorooshian, S. and Yapo, P.O. 1998. Toward improved calibration of
hydrologic models: Multiple and noncommensurable measures of information. Water
Resources Research 34(4), 751-763.
Gustafsson, K. 1993. Control of error and convergence in ODE solvers. PhD Thesis,
Lund Institute of Technology, Sweden.
Hankin, B.G. and Beven, K.J. 1998. Modelling dispersion in complex open channel flow;
2. Fuzzy calibration. Stochastic Hydrology and Hydraulics 12, 397-411.
Harr. M.E. 1989. Probabilistic estimates for multi-variate analyses. Applied Mathematical
Modeling 13(5), 313-318.
Harremoes, P. 1982. Immediate and delayed oxygen reaction in rivers. Water Research
16, 1093-1098.
Hartigan, J.P., Friedman, J.A. and Southerland, E. 1983. Post-audit of lake model used for
NPS management. ASCE Journal of Environmental Engineering 109(6), 1354-1370.
Havnø, K., Madsen, M.N. and Dørge, J. 1995. MIKE 11 - A Generalized River Modelling
Package. In Computer Models of Watershed Hydrology, V.P. Singh (Ed.), Water
Resources Publications, pp733-782.
Henderson-Sellers, B. and Henderson-Sellers, A. 1993. Factorial techniques for testing
environmental model sensitivity. In Modelling Changes in Environmental Systems,
Jakeman, A.J., Beck, M.B. and McAleer, M.J. (Eds.), John Wiley, pp. 59-76.
Holland, J.H. 1975. Adaptation in Natural and Artificial Systems, University of Michigan Press, Ann Arbor.
Hondzo, M. and Stefan, H.G. 1994. River-bed heat conduction prediction. Water
Resources Research 30(5), 1503-1513.
Hong, H.P. 1998. An efficient point estimate method for probabilistic analysis. Reliability
Engineering and System Safety 59(3), 261-267.
Hornberger, G.M. and Spear, R.C. 1980. Eutrophication in Peel Inlet, 1. Problem-
defining behaviour and a mathematical model for the phosphorus scenario. Water
Research 14, 29-42.
House, W.A. and Denison, F.H. 1998. Phosphorus Dynamics in a Lowland River. Water
Research 32(6), 1819-1830.
Howarth, R.W., Schneider, R. and Swaney, D. 1996. Metabolism and organic carbon
fluxes in the tidal freshwater Hudson River. Estuaries 19(4), 848-865.
Hulme, M., Jenkins, G.J., Lu, X., Turnpenny, J.R., Mitchell, T.D., Jones, R.G., Lowe, J.,
Murphy, J.M., Hassell, D., Boorman, P., McDonald, R. and Hill, S. 2002. Climate
change scenarios for the United Kingdom: The UKCIP02 Scientific Report. Tyndall
Centre for Climate Change Research, School of Environmental Sciences, University
of East Anglia.
Ivanov, P., Masliev, I., Kularathna, M., De Marchi, C. and Somlyódy, L. 1996. DESERT
User Manual. International Institute for Applied Systems Analysis, Laxenburg,
Austria / Institute for Water and Environmental Problems, Barnaul, Russia.
Jarvie, H.P., Withers, P.J.A. and Neal, C., 2002. Review of robust measurement of
phosphorus in river water; sampling, storage, fractionation and sensitivity. Hydrology
and Earth System Sciences 6(1), 113-132.
Jorgensen, S.E., Kamp-Nielsen, L., Christensen, T., Windolf-Nielsen, J. and Westergaard,
B. 1986. Validation of a prognosis based upon a eutrophication model. Ecological
Modelling 32, 165-182.
Kavetski, D., Franks, S.W. and Kuczera, G. 2002. Confronting uncertainty in
environmental modelling: the case for a systems perspective. Advances in calibration
of watershed models.
Keith, L.H. 1990. Environmental Sampling: A Summary. Environmental Science and
Technology 24(5), 610-617.
Kitanidis, P.K. and Bras, R.L. 1980. Real-time forecasting with a conceptual hydrologic
model: 1. Analysis of uncertainty. Water Resources Research. 16(6), 1025-1033.
Kleijnen, J.P.C. 1997. Sensitivity analysis and related analyses: a review of some
statistical techniques. Journal of Statistical Computation and Simulation 57(1-4),
111-142.
Kronvang, B. Laubel, A. and Grant, R. 1997. Suspended sediment and particulate
phosphorus transport and delivery pathways in an arable catchment, Gelbæk Stream,
Denmark. Hydrological Processes 11(6), 627-642.
Kuczera, G., 1983. Improved parameter inference in catchment models 1. Evaluating
parameter uncertainty. Water Resources Research 19(5), 1151-1162.
Kuczera, G. 1990. Assessing hydrologic model nonlinearity using response surface plots.
Journal of Hydrology 118(1-4), 143-161.
Kuczera, G. and Parent, E. 1998. Monte Carlo assessment of parameter uncertainty in
conceptual catchment models. Journal of Hydrology 211(1), 69-85.
Kusuda, T., Futawatari, T. and Oishi, K. 1994. Simulation of nitrification and
denitrification processes in a tidal river. Water Science and Technology 30(2), 43-52.
Lai, C.H. 2002. Use of genetic algorithms in calibration of water quality models. MSc
Project Dissertation. Imperial College London, September 2002.
Lal, A.M. and Shen, H.W. 1993. Mathematical model for river ice processes. ASCE
Journal of Hydraulic Engineering 119(11), 1231.
Lau, K.T. 2002. Optimising storage within the integrated urban wastewater system. PhD
Thesis, Imperial College London, March 2002.
Lees, M.L., Camacho, L.A. and Chapra, S.C. 2000. On the relationship of transient
storage and aggregated dead zone models of longitudinal transport in streams. Water
Resource Research 36(1), 213-224.
Li, S.Y. and Chen, G.H. 1994. Modelling the organic removal and oxygen consumption
by biofilms in an open channel flow. Water Science and Technology 30(2), 53-61.
Lick, W. 1982. Entrainment, transport and deposition of the fine grained sediments in
lakes. Hydrobiologia 91, 31-40.
Ma, R. 2001. An option appraisal for Hun River pollution control using a stochastic water
quality simulation model. MSc Project Dissertation. Imperial College London,
September 2001.
McKay, M.D., Beckman, R.J. and Conover, W.J. 1979. A comparison of three methods
for selecting values of input variables in the analysis of output from a computer code.
Technometrics 21(2), 239-245.
Mailhot, A., Gaume, E. and Villeneuve, J.P. 1997. Uncertainty analysis of calibrated
parameter values of an urban storm water quality model using Metropolis Monte
Carlo algorithm. Water Science and Technology 36(5), 141-148.
Mathworks 2000. SIMULINK: Dynamic system simulation, The Mathworks Inc. Product
description available from www.mathworks.com/products/simulink.
Maunula, M. 1992. Finnish experience with river ice in China. Proceedings of the 9th
International Northern Research Basins Symposium/Workshop, NHRI Symposium
No. 10, Canada 1992.
McIntyre, N. 1998. A eutrophication model incorporating sediment-water interactions
and macrophyte-phytoplankton competition with the Venice Lagoon as a case study.
MSc. dissertation. Imperial College London, September 1998.
McIntyre, N. 2000. Estimation and propagation of parametric uncertainty in
environmental models (first draft). Unpublished.
McIntyre, N. Lees, M.J. and Wheater, H.S. 2001. A review and demonstration of methods
of uncertainty analysis in numerical environmental modelling. Proceedings of the 8th
Europia International Conference - Advances in Design Sciences and Technology,
Delft, The Netherlands, 183-196.
McIntyre, N. and Zeng, S. 2002. River and lake water quality models – Final Report:
WaterRAT User Manual. A report to the European Commission for the 4th
Framework Project “TOPLEM”. Report no. PL972722 D4.3(10). Department of Civil
and Environmental Engineering, Imperial College, London, UK.
McIntyre, N. 2002a. River and lake water quality models – Final Report: Hun River case
study. A report to the European Commission for the 4th Framework Project
“TOPLEM”. Report no. PL972722 D4.3(7). Department of Civil and Environmental
Engineering, Imperial College, London, UK.
McIntyre, N., Jackson, B., Wheater, H.S. and Chapra, S. 2003. Numerical efficiency in
Monte Carlo simulations - a case study of a river thermodynamic model. ASCE
Journal of Environmental Engineering (Accepted).
McNeil, J., Taylor, C. and Lick, W. 1996. Measurements of erosion of undisturbed
bottom sediments with depth. ASCE Journal of Hydraulic Engineering 122(6), 316-
324.
Meixner, T., Gupta, H.V., Bastidas, L.A. and Bales, R.C. 1999. Sensitivity analysis using
mass flux and concentration. Hydrological Processes 13, 2233-2244.
Melching, C. 1995. Reliability estimation. In Computer Models of Watershed Hydrology,
Singh, V.P. (Ed.), Water Resource Publications.
Melching, C.S. and Bauwens, W. 2001. Uncertainty in coupled nonpoint source and
stream water-quality models. ASCE Journal of Water Resources Planning and
Management 127(6), 403-419.
Metcalf and Eddy Inc. 1991. Wastewater engineering – Treatment disposal and reuse 3rd
Edition. McGraw-Hill.
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H. and Teller, E. 1953.
Equation of state calculations by fast computing machines. Journal of Chemical
Physics 21, 1087.
Montgomery Watson. 2001a. Liaoning Integrated Environmental Programme. Water
quality modelling report Annex 4. Unpublished Report.
Montgomery Watson. 2001b. Liaoning Integrated Environmental Programme. Liao River
Basin Plan: Executive Summary. Montgomery Watson, Shenyang, March 2001.
Mulligan, A.E. and Brown, L.C. 1998. Genetic algorithms for calibrating water quality
models. ASCE Journal of Environmental Engineering 124(3), 202-211.
Nakato, T. 1990. Tests of selected sediment-transport formulas. ASCE Journal of
Hydraulic Engineering 116(3), 362-379.
Nash, J.E. and Sutcliffe, J.V. 1970. River flow forecasting through conceptual models
Part 1 – A discussion of principles. Journal of Hydrology 10(3), 282-290.
Neitsch, S.L., Arnold, J.G., Kiniry, J.R., Williams, J.R. and King, K.W. 2002. Soil and
water assessment tool theoretical documentation. GSWRL Report 02-01. Grassland,
Soil and Water Research Laboratory, Temple, Texas.
NERC. 2003. Environmental mathematics and statistics. Thematic Programme. NERC
(www.nerc.ac.uk/funding/interdisciplinary)
Orlob, G. T. 1992. Water-quality modeling for decision making. Journal of Water
Resources Planning and Management 118(3), 295-307.
Park, S. and Lee, Y.S. 2002. A water quality modeling study of the Nakdong River,
Korea. Ecological Modelling 152(1), 65-75.
Portielje, R., Hvitved-Jacobsen, T. and Schaarup-Jensen, K. 2000. Risk analysis using
stochastic reliability methods applied to two cases of deterministic water quality
models. Water Research 34(1), 153-170.
PRC SEPA. 1999. Environmental quality standard for surface water. Regulation GHZB1-
1999. State Environmental Protection Administration, Beijing, People’s Republic of
China.
Press, W.H., Teukolsky, S.A., Flannery, B.P. and Vetterling, W.T. 1988. Numerical
Recipes in C: The Art of Scientific Computing. Cambridge University Press, pp 308.
Protopapas, A.L. and Bras, R.L. 1990. Uncertainty propagation with numerical models
for flow and solute transport in the unsaturated zone. Water Resources Research
26(10), 2463-2474.
Qian, S.S. 1997. An illustration of model structure identification. Journal of American
Water Resources Association 33(4), 811-824.
Qinghua University. 2001. Pollution load estimation methodology: The estimation of
pollution load of the river sources of Dahuofang Reservoir. A report to the European
Commission for the 4th Framework Project “TOPLEM”. Report no. PL972722
D3.3(10). Qinghua University, Department of Environmental Science, Beijing,
China.
Ranjie, H. and Huimin, L. 1987. Modelling of BOD-DO Dynamics in an ice-covered
river in northern China. Water Research 21(3), 247-251.
Rauch, W., Henze, M., Koncsos, L., Reichert, P., Shanahan, P., Somlyody, L. and
Vanrolleghem, P. 1998. River water quality modelling: I. State of the art. Water
Science and Technology 38(11), 237-244.
Reckhow, K.H. 1994. Water quality simulation modeling and uncertainty analysis for risk
assessment and decision making. Ecological Modelling 72(1-2), 1-20.
Reckhow, K.H. and Chapra, S.C. 1983a. Engineering approaches for lake management.
Vol.1: Data analysis and empirical modeling. Butterworth.
Reckhow, K.H. and Chapra, S.C. 1983b. Engineering approaches for lake management.
Vol.2: Mechanistic modeling. Butterworth.
Reichert, P. and Omlin, M. 1996. On the usefulness of over-parameterised ecological
models. Ecological Modelling 95, 289-299.
Robson, A.J. and Neal, C. 1997. Regional water quality of the river Tweed. Science Of
The Total Environment 194, 173-192
Romanowicz, R., Beven, K.J. and Tawn, A. 1994. Evaluation of predictive uncertainty in
non-linear hydrological models using a Bayesian approach. In Barnett, V. &
Turkman, K.F. (eds), Statistics for the environment 2. John Wiley.
Rosenblueth, E. 1981. Two-point estimates in probabilities. Applied Mathematical
Modelling 5, 329-335.
Runkel, R.L. 1998. One-dimensional transport with inflow and storage (OTIS): A solute
transport model for streams and rivers. Water Resources Investigation Report 98-
4018, USGS.
Rubinstein, R.Y. 1981. Simulation and the Monte Carlo Method, Wiley, pp 240.
Rutherford, J.C. 1994. River Mixing, Wiley.
Rutenbar, R.A. 1989. Simulated annealing algorithms: an overview. IEEE Circuits and
Devices Magazine Jan, 1989, 19-26.
Sawyer, C.N., McCarty, P.L. and Parkin, G.F. 1994. Chemistry for environmental
engineering; 4th edition, McGraw-Hill.
Shah, S.M.S., O’Connell, P.E. and Hosking, J.R.M. 1996. Modelling the effects of spatial
variability on catchment responses. 2. Experiments with distributed and lumped
models. Journal of Hydrology 175, 89-111.
Shanahan, P., Henze, M., Koncsos, L., Rauch, W., Reichert, P., Somlyody, L. and
Vanrolleghem, P. 1998. River water quality modelling: II. Problems of the art. Water
Science and Technology 38(11), 245-252.
Shanahan, P., Borchardt, D., Henze, M., Rauch, W., Reichert, P., Somlybody, L. and
Vanrolleghem, P. 2001. River water quality model no. 1 (RWQM1): I. Modelling
approach. Water Science and Technology 43(5), 1-9.
Shao, J. and Tu, D. 1995. The jackknife and bootstrap. Springer-Verlag, pp 2.
Shao, J. 1996. Bootstrap model selection. Journal of the American Statistical Association
91(434), 655-665.
Shen, H.T. and Chaing, L.A. 1984. Simulation of growth and decay of river ice cover.
ASCE Journal of Hydraulic Division 110(7), 958-971.
Shen, H.T. and Yapa, P.D. 1988. Oil slick transport in rivers. ASCE Journal of Hydraulic
Engineering, 114(5), 529-543.
Shepherd, B., Harper, D. and Millington, A. 1999. Modelling catchment-scale nutrient
transport to watercourses in the UK. Hydrobiologia 396, 227-237.
Sherif, E. 2000. An oil model for polluted rivers with the Hun River, China as a case
study. MSc Project Dissertation. Imperial College London, September 2000.
Sincock, A.M. and Lees, M.J. 2000. Extension of the QUASAR river water quality model
to unsteady flow conditions, Journal of The Chartered Institution of Water and
Environmental Management 16(1), 12-17.
Somlyody, L. 1995. Water quality management: can we improve integration to face
future problems. Water Science and Technology 31(8), 249-259.
Somlyody, L. 1997. Use of optimization models in river basin water quality planning.
Water Science and Technology 36(5), 209-218.
Somlyody, L., Henze, M., Koncsos, L., Rauch, W., Reichert, P., Shanahan, P. and
Vanrolleghem, P. 1998. River water quality modelling: III. Future of the art. Water
Science and Technology 38(11), 253-260.
Sorooshian, S. and Dracup, J.A. 1980. Stochastic parameter estimation procedures for
hydrologic rainfall-runoff models: correlated and heteroscedastic error cases. Water
Resources Research 16(2), 430-442.
Sorooshian, S. and Gupta, V.K. 1995. Model calibration. In Computer Models of
Watershed Hydrology, Singh, V.P. (Ed.), Water resources Publications.
Spear, R. C. and Hornberger, G. M., 1980. Eutrophication in peel inlet - 2. Identification
of critical uncertainties via generalized sensitivity analysis. Water Research 14(1),
43-49.
Streeter, H.W. and Phelps, E.B. 1925. A Study of the Pollution and Natural Purification
of the Ohio River, III, Factors Concerned in the Phenomena of Oxidation and
Reaeration. U.S. Pub. Health Serv., Pub. Health Bulletin, 146, pp 75.
Suttamanutwong, W. 2001. A comparison of empirical nitrogen runoff models in
application to an agricultural catchment in China. MSc Project Dissertation, Imperial
College London, September 2002.
Tate, C.M., Broshears, R.E. and McKnight, D.M. 1995. Phosphate dynamics in an acidic
mountain stream: interactions involving algal uptake, sorption by iron oxide and
photoreduction. Limnology and Oceanography 40(5), 938.
Taylor, G.I. 1954. The dispersion of matter in turbulent flow through a pipe. Proceedings
of the Royal society of London, Series A 223, 446-468.
Tellinghuisen, J.A. 2000. Monte Carlo study of precision, bias, inconsistency, and non-
Gaussian distributions in non-linear least squares. Journal of Physical Chemistry
104(12), 2834-2844.
Thomann, R.V. and Mueller, J.A. 1987. Principles of surface water quality modeling and
control. Addison-Wesley.
Thomann, R.V. 1989. Bioaccumulation model of organic chemical distribution in aquatic
food chains. Environmental Science and Technology 23, 699-707.
Thomann, R.V. 1998. The future "golden age" of predictive models for surface water
quality and ecosystem management. ASCE Journal of Environmental Engineering
124(2), 94-103.
Thyer, M., Kuczera, G. and Bates, B.C. 1999. Probabilistic optimization for
conceptual rainfall-runoff models: A comparison of the shuffled complex evolution
and simulated annealing algorithms. Water Resources Research 35(3), 767-773.
Tsakalias, G. and Koutsoyiannis, D. 1999. A comprehensive system for the exploration
and analysis of hydrological data. Water Resources Management 13, 269-302.
Tung, Y.K. 1996. Uncertainty and reliability analysis. In Water Resources Handbook,
Mays, L.W. (Ed.), McGraw-Hill.
Tye, R., Jepsen, R. and Lick, W. 1996. Effects of colloids, flocculation, particle size, and
organic matter on the adsorption of hexachlorobenzene to sediments. Environmental
Toxicology and Chemistry 15(5), 643-651.
UK Environment Agency. 1998. The State of the Environment of England and Wales:
Fresh Waters. The Stationery Office, London.
UK Environment Agency. 2001a. SIMCAT 7.6: A Guide and Reference for Users. UK
Environment Agency, Bristol, UK.
UK Environment Agency. 2001b. Water resources for the future – A summary of the
strategy for England and Wales. UK Environment Agency, Bristol, UK.
UK Environment Agency. 2002. The Water Framework Directive: Guiding Principles on
the Technical Requirements, Chapter 7: Review of the impacts of human activity. UK
Environment Agency, Bristol, UK.
US EPA. 1998. Water quality criteria and standards plan – priorities for the future. US
EPA 822-R-98-003.
US EPA. 1999. BASINS (Better Assessment Science Integrating Point and Nonpoint
Sources) Version 2 User's Manual. Report no. EPA 823-B-98-006, US EPA,
Athens, Georgia, USA.
Valdes, J.B., Rodriguez-Iturbe, I. and Vicens, G.J. 1980. Choosing among hydrologic
regression models 2. Extensions to the standard model. Water Resources Research
16(3), 507-516.
Van der Perk, M. 1997. Effect of model structure on the accuracy and uncertainty of
results from water quality models. Hydrological Processes 11(3), 227-239.
Van Straten, G. 1983. Maximum likelihood estimation for phytoplankton models. In
Uncertainty and forecasting of water quality, Beck, M.B. and Van Straten, G (Eds.),
Springer-Verlag.
Van Straten, G. and Keesman, K.J. 1991. Uncertainty propagation and speculation on
projective forecasts of environmental change: a lake eutrophication example. Journal
of Forecasting 10, 163-190.
Van Straten, G. 1998. Models for water quality management: the problem of structural
change. Water Science and Technology 37(3), 103-111.
Vanrolleghem, P., Borchardt, D., Henze, M., Rauch, W., Reichert, P., Shanahan, P. and
Somlyody, L. 2001. River water quality model no. 1 (RWQM1): III. Biochemical
submodel selection. Water Science and Technology 43(5), 31-40.
Vrugt, J.A., Gupta, H.V., Bouten, W. and Sorooshian, S. 2003a. A shuffled complex
evolution Metropolis algorithm for optimization and uncertainty assessment of
hydrologic model parameters. Water Resources Research 39(8), SWC1:1-1:16.
Vrugt, J.A., Gupta, H.V., Bastidas, L., Bouten, W. and Sorooshian, S. 2003b. Effective
and efficient algorithm for multi-objective optimisation of hydrologic models. Water
Resources Research 39(8), SWC5:1-5:16.
Wade, A.J., Hornberger, G.M., Whitehead, P.G. and Flynn, N. 2001. On modeling the
mechanisms that control in-stream phosphorus, macrophyte and epiphyte dynamics:
An assessment of a new model using general sensitivity analysis. Water Resources
Research 37(11), 2777-2792.
Wagener, T., Boyle, D.P., Lees, M.J., Wheater, H.S., Gupta, H.V. and Sorooshian, S.
2001. A framework for the development and application of hydrological models.
Hydrology and Earth System Sciences 5(1), 13-26.
Wagener, T., Camacho, L.A., Lees, M.J. and Wheater, H.S. 2002a. Dynamic parameter
identifiability analysis of a solute transport model. Journal of Hydroinformatics 4(3),
199-212.
Wagener, T., McIntyre, N., Lees, M.J., Wheater, H.S. and Gupta, H.V. 2002b. Reducing
uncertainty in conceptual rainfall-runoff modelling: Dynamic identifiability analysis.
Hydrological Processes 17(2), 455-476.
Wagner, B.J. and Harvey, J.W. 1997. Experimental design for estimating parameters of
rate-limited mass transfer: analysis of stream tracer studies. Water Resources
Research 33(7), 1731-1741.
Wallingford Software. 2002. ISIS Technical Review. Wallingford Software Ltd, UK.
(Available from www.wallingfordsoftware.com/products/PDF/ISIS_tech_review.pdf).
Warwick, J.J., Cockrum, D. and McKay, A. 1999. Modelling the impact of subsurface
nutrient flux on water quality in the lower Truckee river, Nevada. Journal of
American Water Resources Association 35(4), 837-851.
Watanabe, M., Harleman, D.R.F. and Vasiliev, O.F. 1983. Two- and three-dimensional
mathematical models for lakes and reservoirs. In Mathematical modelling of water
quality, Orlob, G.T. (Ed.), Wiley.
Weber, E.J. 1996. Iron-mediated reductive transformations: Investigation of reaction
mechanism. Environmental Science and Technology 30(2), 716-719.
Weglarczyk, S. 1998. Interdependence and applicability of some statistical quality
measures for hydrological models. Journal of Hydrology 206(1-2), 98-103.
West, M. and Harrison, J. 1997. Bayesian forecasting and dynamic models (2nd edition),
Springer-Verlag, New York.
Wheater, H.S., Bishop, K.H. and Beck, M.B. 1986. The identification of conceptual
hydrological models for surface water acidification. Hydrological Processes 1, 89-109.
Whitehead, P.G. and Hornberger, G.M. 1984. Modelling algal behaviour in the River
Thames. Water Research 18(8), 945-953.
Whitehead, P.G. and Toms, I.P. 1993. Dynamic modelling of nitrate in reservoirs and
lakes. Water Research 27(8), 1377-1384.
Whitehead, P.G., Williams, R.J. and Lewis, D.R. 1997a. Quality simulation along river
systems (QUASAR): model theory and development. The Science of the Total
Environment 194-195, 447-456.
Whitehead, P.G., Howard, A. and Arulmani, C. 1997b. Modelling algal growth and
transport in rivers: a comparison of time-series analysis, dynamic mass balance and
neural network techniques. Hydrobiologia 349, 39-46.
Whitehead, P.G., Wilson, E.J. and Butterfield, D. 1998. A semi-distributed Integrated
Nitrogen model for multiple source assessment in Catchments (INCA): Part I - model
structure and process equations. The Science of the Total Environment 210(1-6), 547-
558.
WHO. 1996. Guidelines for drinking-water quality: health criteria and other supporting
information, 2nd Edition. World Health Organization.
Wierman, M.J. 1996. Assessing fuzzy sets and the potential of possibility theory.
Information Sciences 88(1-4), 247-261.
Xianxin, C. and Yongjiu, B. 1991. Water pollution and control measures in Liaoning
Province. Waterlines 10(1), 21-23.
Xu, C.Y. and Vandewiele, G.L. 1994. Sensitivity of monthly rainfall-runoff models to
input errors and data length. Hydrological Sciences Journal 39(2), 157-176.
Yang, C.T. and Molinas, A. 1982. Sediment transport and unit stream power function.
ASCE Journal of the Hydraulics Division 108(6), 774-793.
Yapa, P.D. and Shen, H.T. 1995. Modelling river oil spills: a review. Journal of
Hydraulic Research 32(5), 765-782.
Yapo, P.O., Gupta, H.V. and Sorooshian, S. 1998. Multi-objective global optimization for
hydrological models. Journal of Hydrology 204(1-4), 83-97.
Yeh, K., Yang, J.C. and Tung, Y.K. 1997. Regionalization of unit hydrograph
parameters; 2. Uncertainty analysis. Stochastic Hydrology and Hydraulics 11, 173-192.
Young, P.C. and Wallis, S.G. 1993. Solute transport and dispersion in channels. In
Channel Network Hydrology, Beven, K.J. and Kirkby, M.J. (Eds.), John Wiley,
pp. 129-174.
Young, P., Parkinson, S. and Lees, M. 1996. Simplicity out of complexity in
environmental modelling: Occam's razor revisited. Journal of Applied Statistics
23(2), 165-210.
Yu, P-S., Yang, T-C. and Chen, S-J. 2001. Comparison of uncertainty analysis methods
for a distributed rainfall-runoff model. Journal of Hydrology 244(1-2), 43-59.
Zadeh, L.A. 1978. Fuzzy sets as a basis for the theory of possibility. Fuzzy Sets and
Systems 1, 3-28.
Zeng, W. and Beck, M.B. 2001. Development and evaluation of a mathematical model
for the study of sediment-related water quality issues. Water Science and Technology
43(7), 47-54.
Notation

al  Atmospheric transmission of long-wave
as  Atmospheric transmission of short-wave
Ac  Cross-section area of water
Ai  Coverage of ice as fraction of Aw
As  Surface area of sediment
Aw  Surface area of water
b  Bowen’s coefficient
B  Stefan-Boltzmann constant
c  Cloud cover
C  Concentration of arbitrary pollutant
C’  Concentration of arbitrary pollutant in transient storage zone
Cag  Concentration of phytoplankton
Cc  Concentration of biodegradable carbon
Ccf  Concentration of 5-day BOD
Ccs  Concentration of slow reacting carbon
Cna  Concentration of ammonium plus ammonia
Cni  Concentration of nitrate plus nitrite
Cns  Concentration of organic nitrogen
Cos  Saturated concentration of dissolved oxygen
Cox  Concentration of dissolved oxygen
Cp  Concentration of total phosphorus
Cpd  Concentration of total phosphorus in distributed load
Cpo  Concentration of inorganic phosphorus
Cps  Concentration of organic phosphorus
Cpu  Concentration of total phosphorus at upstream boundary
Csa  Salinity
Css  Concentration of suspended solids
di  Density of ice
dw  Density of water
D  Diffusivity (units length²/time)
D’  Diffusion coefficient (units length³/time)
Dx’  Diffusion to transient storage zone (units length/time)
eair  Vapour pressure of air
eairs  Vapour pressure of air at saturation
EP  Monthly export of phosphorus
fb  Bulk heat exchange due to flow and dispersion
fc  Convective heat exchange
fe  Heat exchange due to evaporation
fi  Light limitation factor
fiw  Ice-water heat exchange
fl  Long-wave exchange
fp  Heat input due to precipitation
fs  Short-wave heat exchange
fsw  Sediment-water heat exchange
F()  Mass flux due to transport processes
Fsp  Diffusive flux of phosphorus from sediment
g  Gravitational constant
Hi  Thickness of ice
Hs  Sediment depth
Hw  Water depth
I  Light intensity
J  Heat
m  Coefficient of emissivity
kai  Air-ice heat exchange coefficient
kAice  Empirical coefficient relating Aice to Hi
kd  Diffusion rate between sediment and water (units length/time)
kda (kda20)  Phytoplankton death rate (max. rate at 20°C)
kdn (kdn20)  Inorganic nitrogen denitrification rate (max. rate at 20°C)
ker  Error in aeration formula
kga (kga20)  Phytoplankton growth rate (max. rate at 20°C)
kgahsl  Phytoplankton light half-saturation constant
kgahsn  Phytoplankton nitrogen half-saturation constant
kgahsp  Phytoplankton phosphorus half-saturation constant
khc (khc20)  Slow carbon hydrolysis rate (max. rate at 20°C)
khn (khn20)  Organic nitrogen hydrolysis rate (max. rate at 20°C)
khp (khp20)  Organic phosphorus hydrolysis rate (max. rate at 20°C)
ki  Conductivity of ice
kiw  Heat transfer coefficient (ice-water)
kna  Ammonium preference coefficient
koc (koc20)  Fast carbon oxidation rate (max. rate at 20°C)
kochs  Fast carbon oxygen half-saturation constant
kon (kon20)  Inorganic nitrogen nitrification rate (max. rate at 20°C)
konhs  Inorganic nitrogen oxygen half-saturation constant
kra  Reaeration rate
ks  Scour rate
ksw  Heat transfer coefficient (sediment-water)
kwe  Weir aeration coefficient
kW  Convective heat transfer coefficient
K  Defined constant
KM1  Constant used in Metropolis algorithm
KM2  Constant used in Metropolis algorithm
KR  Root constant in GLUE likelihood
KS  Kolmogorov-Smirnov statistic
li  Latent heat of melting of ice
lw  Latent heat of evaporation
L  Posterior likelihood
Lp  Prior likelihood
M  Notional model result
n  Manning’s coefficient
N  Defined integer
Ndat  Number of samples of data sets
Npar  Number of model parameters
Nres  Number of residuals
Nsam  Number of parameter samples
Nvar  Number of model responses
O  Notional observation
OF  Objective function
p  Model order
P  Probability
P’  Random number
q1  Linear flow residence time parameter
q2  Non-linear flow residence time parameter
Q  Flow
Qev  Evaporation rate
Ql  Flow loss
Qs  Flow source
Qu  Flow at upstream boundary
Qup  Flow from upstream cell
r  Hydraulic radius
rcn  Carbon demand of denitrification
rna  Nitrogen:Chl-a ratio of phytoplankton
rN  Nutrient limitation factor
roa  Oxygen:Chl-a ratio
ron  Oxygen demand of nitrification
rpa  Phosphorus:Chl-a ratio
rT  Temperature limitation factor
Ri  Reflectance of ice
Rsp  Resuspension of P from sediment
Rw  Reflectance of water
s  Short-wave radiation reaching outside of atmosphere
sw  Specific heat capacity of water
Sp  Concentration of phosphorus in sediment
SOD  Sediment oxygen demand
t  Time
T  Residence time
Ta  Ambient air temperature
Tp  Water temperature of pollution source
Ts  Temperature of deep sediment
Tw  Water temperature
T̄w  Measured water temperature
Tw’ and Tw’’  Intermediate evaluations of Tw
u  Water velocity
ucr  Critical water velocity
vp  Total phosphorus settling velocity
vsc  Slow carbon settling velocity
vsn  Organic nitrogen settling velocity
vsp  Organic phosphorus settling velocity
vss  Suspended solids settling velocity
V  Volume of water in control volume
W  Wind speed
x  Distance downstream, or arbitrary parameter
X  Vector of arbitrary model outputs
Y  Vector of arbitrary model inputs
Z  Degree-days below freezing
∆()  Derivative matrix
Ω  Problem-dependent error constant
α  A sample set of factors
β  Arbitrary dependent variable
δ²  Variance of model result around optimum model result
ε  Model residual
η  Step adaptation safety constant
γ  Arbitrary factor
λ  Approximated local truncation error
µ  Mean value
σ²  Variance of observed data around model result
σm²  Variance of observed data around optimum model result
θ  Arrhenius coefficient for all reactions
ξ  Specified tolerance
Φ  Pollution or hydraulic load
τcr  Critical shear stress at bed
τ  Shear stress at bed
ζ  Local truncation error