basin scale hydrology support vector machines
8/14/2019 Basin Scale Hydrology support vector machines
ABSTRACT: Water scarcity in the Sevier River Basin in south-cen-
tral Utah has led water managers to seek advanced techniques for
identifying optimal forecasting and management measures. To
more efficiently use the limited quantity of water in the basin, bet-
ter methods for control and forecasting are imperative. Basin scale
management requires advanced forecasts of the availability of water. Information about long term water availability is important
for decision making in terms of how much land to plant and what
crops to grow; advanced daily predictions of streamflows and
hydraulic characteristics of irrigation canals are of importance for
managing water delivery and reservoir releases; and hourly fore-
casts of flows in tributary streams to account for diurnal fluctua-
tions are vital to more precisely meet the day-to-day expectations of
downstream farmers. A priori streamflow information and exoge-
nous climate data have been used to predict future streamflows and
required reservoir releases at different timescales. Data on snow
water equivalent, sea surface temperatures, temperature, total
solar radiation, and precipitation are fused by applying artificial
neural networks to enhance long term and real time basin scale
water management information. This approach has not previously
been used in water resources management at the basin scale and could be valuable to water users in semi-arid areas to more efficiently utilize and manage scarce water resources.
(KEY TERMS: artificial neural networks; multi-sensor data; irriga-
tion; water management; multi-time scale forecasting; streamflow.)
Khalil, Abedalrazq F., Mac McKee, Mariush Kemblowski, and Tirusew Asefa,
2005. Basin Scale Water Management and Forecasting Using Artificial Neural
Networks. Journal of the American Water Resources Association (JAWRA)
41(1):195-208.
INTRODUCTION
Forecasting of streamflow at different temporal scales is of practical importance to several disciplines. Techniques for predicting seasonal, daily, and hourly streamflows are utilized in this paper to address the need for accurate information about water deliveries on a short term scale and to formulate long term or seasonal plans for allocation of water and related resources. Streamflow prediction is used in applications as diverse as agricultural planning, reservoir, and watershed management. Ames (1998) has discussed the financial returns to agriculture and industry that could be derived from successful extended range streamflow forecasts.
Short term and real time forecasts of flows in rivers and tributaries, and near real time recommendations for required operational decisions for canal diversions and reservoir releases, can provide additional opportunities for improving system level water use efficiencies. These information needs for long term and real time streamflow forecasts and near real time reservoir releases require a substantial investment in acquisition and analysis of a wide range of temporally and spatially disparate data. These information needs are very much the case for the highly regulated Sevier River Basin of south-central Utah, which has been heavily instrumented in recent years and which provides both the motivation and case study area for this paper.
Physically based hydrologic and hydraulic mathematical modeling approaches have been proposed for streamflow predictions, but complexities in these modeling processes and difficulties associated with obtaining the data that such models would require have limited the scope and applicability of these
1Paper No. 03202 of the Journal of the American Water Resources Association (JAWRA) (Copyright 2005). Discussions are open until August 1, 2005.
2Respectively, Graduate Research Assistant, Professors of Civil and Environmental Engineering, and Graduate Research Assistant, Department of Civil and Environmental Engineering, Utah Water Research Laboratory, Utah State University, Logan, Utah 84322-8200 (E-Mail/Khalil: [email protected]).
JOURNAL OF THE AMERICAN WATER RESOURCES ASSOCIATION 195 JAWRA
JOURNAL OF THE AMERICAN WATER RESOURCES ASSOCIATION
FEBRUARY AMERICAN WATER RESOURCES ASSOCIATION 2005
BASIN SCALE WATER MANAGEMENT AND FORECASTING USING ARTIFICIAL NEURAL NETWORKS1
Abedalrazq F. Khalil, Mac McKee, Mariush Kemblowski, and Tirusew Asefa2
traditional methods. As a result, there is a need for the development of modeling approaches that capture the behavior of the system utilizing available data, are computationally robust, and could be used in real applications. One such approach is presented in this paper.
The goals of the work reported in this paper are to:
1. Provide analyses that can be used to improve decisions in river basin management through exploiting the wealth of available, diverse data regarding canals and streamflows, irrigation water orders, climate information, and earth and sea surface satellite imagery.
2. Provide decision relevant information that facilitates the on-farm management of water in both the short and long term.
STUDY AREA: SEVIER RIVER BASIN
The Sevier River Basin in rural south-central Utah is one of the state's major drainages (Figure 1). A closed river basin, it encompasses 12.5 percent of the state's total area. From the headwaters 250 miles (402 km) south of Salt Lake City, the river flows north and then west 255 miles (410 km) before reaching Sevier Lake (Berger et al., 2002). The Sevier River Basin has five subwatersheds and is divided into two major divisions, the upper and lower basins, for the administration of water rights. The dividing point between the upper and lower basins is the Vermillion Diversion Dam. Average annual precipitation varies around 13.0 inches (33 cm), and the growing season ranges from 60 to 178 days (Berger et al., 2002; Utah Board of Water Resources, 2001). Most of the surface water runoff comes from snowmelt during the spring and early summer months. The primary use of water in the basin is for irrigation. The average annual
Figure 1. The Sevier River Basin in South-Central Utah.
amount of water diverted for cropland irrigation is 903,460 acre-feet (1,114 million cubic meters, mcm). Of this amount, approximately 135,000 acre-feet (166.5 mcm) are pumped from ground water. About 40 percent of the diversions are return flows from upstream use (Berger et al., 2002). For a detailed description of the basin and much of the real time database utilized in this research, refer to Sevier River Water Users Association (2004).
BACKGROUND
Real time integrated management of river basins can be important in achieving optimum allocation of scarce water resources. Researchers have proposed physical and stochastic approaches for prediction of streamflow at different time scales for management purposes. Complexities in the underlying physical processes and difficulties in acquiring needed data limit the utility of these approaches. The main functions of an integrated real time water resources management system are: water resources real time monitoring and data collection, information and knowledge mining, and prediction and real time decision support. Real time water resources management requires a heavily instrumented basin to monitor precipitation, runoff, climatic indices, and streamflow. The Sevier River Basin has been heavily instrumented with gages that measure all the aforementioned factors. Measurements of flows at several locations on the mainstem of the Sevier River, tributary flows and canal diversions, various meteorological data, reservoir volumes and releases, and other data are reported hourly, stored in a database, and made available via the internet. While the managers of the Sevier River water systems have utilized these data in raw form to improve overall system operations, much more could be done with these data to develop and implement advanced tools for forecasting and real-time management. The Sevier River is therefore a suitable study area to test tools that are not physically based, but that "let the data speak." The emphasis of this manuscript is on integration of the available data by artificial neural networks to obtain decision-relevant predictions of flows and reservoir operation recommendations at different time scales. These models will ultimately be integrated into a water resources information management system to be delivered to the operators of the reservoir and canal systems in the Sevier River Basin.
ARTIFICIAL NEURAL NETWORKS
In this paper, artificial neural network (ANN) learning methods are used to develop basin scale management models. Artificial neural networks are practical information processing systems that provide methods for learning functions from observations. An ANN roughly replicates the behavior of the organic brain by emulating the operations and connectivity of biological neurons. This emulation, of course, is done in a mathematical form that is greatly simplified from the biological prototype. The advantage of ANNs in engineering and practical applications lies in their ability to learn and capture information from data that describe the behavior of a real system (Govindaraju and Rao, 2000; Hayken, 1994).
An interesting property of ANNs is that they often work well even when the training data sets contain noise and measurement errors (Hammerstrom, 1993). Moreover, they have the capability of representing complex behaviors of nonlinear systems (Maier and Dandy, 2000).
Artificial neural networks are characterized by their architecture, an activation function, and the learning rule and learning parameter set used in their construction. A common architecture is one embodied in feed forward backpropagation ANNs, which consists of layers of neurons in the network and a different number of neurons in each layer (Skapura, 1995). It is composed of a sequence of layers that are classified as input, hidden, and output layers. Each layer consists of a set of one or more nodes, or "neurons." The nodes in the input layer receive information from the outside world, process this information, and send output to the next layer of neurons in the network. Each neuron is connected to neurons in the preceding layer, from which it receives inputs, and to the neurons in the subsequent layer, to which it passes its output.
The learning rule specifies the way in which weights will be determined during the training process, and this depends on the input, output, and activation values of the model. Each neuron has an activation function, which can be a continuous, linear, or nonlinear function [e.g., a monotonic nonlinear function that saturates at finite values, like sgm() and tanh()]. The output signal that passes from one neuron to another in a subsequent layer is transformed by a "weight," or connection strength, that modifies the signal before it reaches the receiving neuron. Thus, the output of a node in any layer is determined by applying a nonlinear transformation (the activation function) to the sum of the weighted inputs it receives from the neurons of the previous layer. Figure 2 shows an ANN model that takes input
values $x_1, x_2, \ldots, x_l$ and generates an output signal $y_1, y_2, \ldots, y_K$. A multi-layer ANN is described as feedforward when the connections are directed from the input layer, forward through the network, to the output layer.
The activation functions are evaluated through two steps. First, the activation, $u$, is calculated as the inner product of the input vector, $x = [x_1, x_2, \ldots, x_l]^T$, and the weight vector, $w$. Second, the output, $y$, is evaluated as a function $f(u)$ of the activation. Optimal values for the weight vector are determined by minimization of an objective function that measures the error between the model's output and the measured behavior of the real system ("Empirical Risk Minimization"). Typically, the error for query $t$ may be defined as the difference between the observed or measured target response, $T(t)$, and the model's response, $y(t)$. Generally, a method called backpropagation (Rumelhart et al., 1986) is used for training ANNs, by which $w$ is modified iteratively to find the set of weights that minimizes the error.
For details about ANNs, interested readers are referred to Govindaraju and Rao (2000) and Schalkoff (1997).
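The two-step evaluation and the error-minimizing weight update described above can be sketched for a single neuron. This is a minimal illustration under stated assumptions, not the networks built in this study: the logistic activation, the learning rate, the epoch count, and the toy training pairs are all hypothetical, and the update shown is plain gradient descent on the squared error, which is the building block that backpropagation applies layer by layer.

```python
import math


def sigmoid(u):
    """Logistic activation; saturates at 0 and 1 for large |u|."""
    return 1.0 / (1.0 + math.exp(-u))


def neuron_output(w, x):
    """Step 1: activation u = sum_i w_i * x_i. Step 2: output y = f(u)."""
    u = sum(wi * xi for wi, xi in zip(w, x))
    return sigmoid(u)


def train_neuron(samples, n_inputs, rate=2.0, epochs=5000):
    """Gradient descent on 0.5 * (T - y)^2 for one sigmoid neuron."""
    w = [0.0] * n_inputs
    for _ in range(epochs):
        for x, target in samples:
            y = neuron_output(w, x)
            # Negative gradient of the error with respect to the activation u.
            delta = (target - y) * y * (1.0 - y)
            w = [wi + rate * delta * xi for wi, xi in zip(w, x)]
    return w


# Toy data (hypothetical); the second input is a constant 1.0 acting as a bias.
samples = [([0.0, 1.0], 0.1), ([1.0, 1.0], 0.9)]
weights = train_neuron(samples, n_inputs=2)
predictions = [neuron_output(weights, x) for x, _ in samples]
```

After training, `predictions` should lie close to the two targets, since two weights are enough to fit two training pairs exactly.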
RELEVANT DATA SETS
The development of the predictive learning model requires the precise identification of the relevant data. In the next sections, a brief description of the relevant data sets will be provided. The relevancy evaluation is judged subjectively. In other words, this paper utilizes the available data that could be related to the given model from a hydrologic perspective.
Streamflow
Streamflow is the result of interactions between many hydrologic events, such as precipitation, snowmelt, evapotranspiration, infiltration, and ground water recharge, with anthropogenic influences, such as irrigation activities.
Continuous historical streamflow data were obtained for different sites. Data appropriate for use in seasonal streamflow predictions are available in the form of average daily flows from 1976 to 2002. Short term predictions can be supported by streamflow data that are available in both daily and hourly form from 2000 to 2003.
Irrigation Demands
Irrigation demands represent the quantities of water that farmers request be delivered to their headgates. Such requests are made one day in advance of the expected time of delivery. Data on irrigation demands for various canals in the Sevier River Basin are available for the years 1952 through 2002.
Temperature
Temperature can directly affect the rate of snowmelt, which in turn contributes to streamflow. The inclusion of temperature data as a predictor can enhance the model. Historic daily and hourly temperature data are available at many SnoTel and weather stations.
Figure 2. Typical ANN Structure.
In Figure 2, the weight vector is $w = [w_1, w_2, \ldots, w_l]^T$ and the activation is $u = \sum_{i=1}^{l} w_i x_i$.
Sea Surface Temperature Anomaly
Satellite derived measurements of sea surface temperature anomaly (SSTA) data can be useful in making seasonal predictions of streamflow. Sea surface temperature influences continental precipitation patterns, and hence provides information about the quantity of water that will become available for storage in reservoirs. Incorporation of SSTA measurements over a broad temporal scale can therefore be relevant to the study of basin-scale water management issues. A long, statistically homogeneous record of sea surface temperature anomalies is available (Kaplan et al., 1998) on a 5-degree-by-5-degree grid covering the majority of the world's oceans for the period 1856 to present. Details of the statistical development of these data are beyond the scope of this paper. Readers are referred to Kaplan et al. (1997) for a description of the methodology.
Snow Water Equivalent
Information about snow can be critical for forecasting spring runoff and water levels in streams. Snow serves as storage of water supplies at the beginning of the season. Daily data on snow water equivalent (SWE), which is the equivalent depth of water obtained when the snow is completely melted, are available from several SnoTel sites in the Sevier River Basin, including the three shown in Figure 1.
Precipitation
Daily precipitation measurements are available at different locations across the Sevier River Basin. The precipitation data used in this manuscript were obtained from the Kimberly Mine SnoTel station, and the Richfield airport weather station, as it is the nearest station to the locations at which streamflow predictions are desired.
MODEL FORMULATION AND APPLICATION
Developing an ANN model for a particular application requires designing the network architecture for capturing the dynamical characteristics of the system being simulated from data that are available to describe the problem domain. The structure of an ANN requires identification of the input and output vectors. It also requires selection of the number of hidden layers and specification of the number of neurons in each hidden layer, which is usually accomplished through a trial-and-error process. Finally, the resulting ANN model must be evaluated, or "tested," in terms of the quality of its predictions.
Seasonal Streamflow Prediction Model
Seasonal predictions of future streamflow and reservoir volumes can play a vital role in planning and decision making in river basins. In the case of the Sevier River Basin, ranchers must make decisions to purchase livestock early in the year, well before information is available about how much water will be supplied in the summer and fall for irrigation and production of feed for those livestock. Financial commitments made early in the water year can result in substantial economic losses if the winter snowpack and resulting spring runoff do not subsequently supply enough irrigation water. Seasonal predictions were made in this study for flows on the Sevier River at the Hatch gage, which is high in the upper basin. The quantity of water that flows through this gage represents a large portion of the total water available to the basin. The streamflow at this gage changes from season to season due to the interactions of a multitude of factors. Regional and local meteorological conditions and snowpack in the mountains will obviously influence streamflows. Previous work has shown that ANNs are appropriate to capture the complex nonlinear relationships among these phenomena. For a more complete review of the uses of ANNs in water resources applications, refer to Maier and Dandy (2000) and Govindaraju and Rao (2000).
The approach adopted here in building a model for forecasting seasonal streamflow quantities is based upon a multi-sensor data driven approach that uses an ANN as a learning machine. Inputs to the model consist of previous seasonal streamflows, SSTA data, and SWE data from the SnoTel stations at Harris Flat and Midway Valley. The cumulative quantity of water that flows past the Hatch gage in a season provides information on the overall status of the basin with respect to water availability and the response of basin hydrology to climatic forcings. The SWE input to the model is the average of the monthly SWE over the previous 12 months. Sea surface temperature anomaly data are input to the model in the form of the 12 previous monthly average SSTA values. The relationship between inputs and outputs of the seasonal ANN model, then, can be expressed as
$$Q_{t+6} = \mathcal{F}(\mathbf{I}) \qquad (1)$$
where $Q_{t+6}$ is the expected quantity of water (cfs) coming to the basin through the Hatch gage for six
months from time $t$, $\mathbf{I}$ is the vector of inputs to the ANN, and $\mathcal{F}$ is the ANN nonlinear transformation of inputs to outputs. The input vector can be expressed as
$$\mathbf{I} = [Q_{t-6} \;\; S_{t-12} \;\; \mathbf{T}_{t-12}]^T \qquad (2)$$
where $Q_{t-6}$ is the total quantity of water (cfs) flowing past the Hatch gage in the last six months, $S_{t-12}$ is the average SWE (in) calculated over the 12 months prior to time $t$ for each SnoTel station, and $\mathbf{T}_{t-12}$ represents a vector of average monthly SSTAs (°C) for the previous 12 months. The SSTA data were obtained for six different stations (see Figure 3). Therefore, six ANN models were built using one individual SSTA station at a time. Detailed descriptions of the model performance for the SSTA station that proved to be the most significant are presented in the results section.
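As an illustration of how the seasonal input vector of Equation (2) might be assembled, the sketch below concatenates the previous six-month flow, one 12-month SWE average per SnoTel station, and 12 monthly SSTA values. The function name, the dictionary layout, and all numbers are hypothetical, and the resulting length of 15 follows only from the assumption of two SnoTel stations and one SSTA station per model; it is not a figure reported here.

```python
def seasonal_input(q_prev_six_months, swe_by_station, ssta_monthly):
    """Assemble I = [Q_{t-6}, S_{t-12}, T_{t-12}]^T as a flat list of floats."""
    if len(ssta_monthly) != 12:
        raise ValueError("expected 12 monthly SSTA values")
    swe_means = []
    for name in sorted(swe_by_station):  # fixed order keeps inputs aligned
        monthly = swe_by_station[name]
        if len(monthly) != 12:
            raise ValueError("expected 12 monthly SWE values for " + name)
        swe_means.append(sum(monthly) / 12.0)
    return [float(q_prev_six_months)] + swe_means + [float(v) for v in ssta_monthly]


# Hypothetical values, for illustration only.
swe = {"Harris Flat": [6.0] * 12, "Midway Valley": [12.0] * 12}
ssta = [0.1 * month for month in range(12)]
vector = seasonal_input(250.0, swe, ssta)
```

Building one such vector per SSTA station would give the inputs for the six candidate models mentioned above.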
Daily Reservoir Release Prediction Model
The need for daily prediction is of great importance to manage irrigation canals and reservoir releases in river basins. Piute Reservoir was selected to test the applicability of ANNs for supplying information for daily reservoir management (see the Middle Sevier portion of Figure 1). Each day, the operator of the Piute Reservoir must set releases at a level that will be sufficient to meet the needs of nine irrigation canals that divert water from the river downstream of the reservoir; these canals all lie between the Clear Creek confluence and the Vermillion Diversion Dam (see Figure 1).
If too little water is released, it is likely that the lower canals will not receive enough water. If too much water is released, some might be spilled to the lower basin; water that is spilled is considered "lost" by the users in the upper basin, who, in accordance with the complicated system of water rights on the Sevier, are entitled to it. Vermillion Diversion Dam, shown in Figure 1, is the administrative dividing point between the upper and lower Sevier River. Efficient daily management decisions about the operation of the reservoir, then, can result in reduction of water losses and improved deliveries to users. This will translate into increased overall farm production for the upper basin. Modeling all the climatic, hydrologic, and hydraulic physical processes involved to provide near real time forecasts of river and canal flows and, ultimately, required reservoir releases would involve solution of a complex system of nonlinear, partial differential equations. Implementation of such a model would need a substantial amount of data, a skilled modeler, and powerful computing devices.
There is uncertainty involved in the reservoir releases owing to the variations in the influencing processes throughout the season and the travel times from the reservoir to the last demand that range from
Figure 3. Significant Sea Surface Temperature Anomaly Measurement Locations.
two to three days depending on the quantity of flow in the river and on antecedent flow conditions. In the face of uncertainty, the Piute Reservoir operator needs a tool to help decide on a near real time basis how much water to release to meet water orders to canal operators located downstream of the reservoir. In other words, a common requirement for managing the reservoir that is operated on an "on-demand" basis is the anticipation of the quantity of water that must be released while accounting for losses and travel time. The Piute Reservoir operators would like to set the diversion gates once per day and maintain a constant flow into the canal over the following 24-hour period. Therefore, the desired output of the ANN model is simply the daily quantity of water that should be released from the Piute Reservoir. The information that should be made available to the ANN model through the neurons in the input layer should include the data that describe current, and perhaps recent historical, flow conditions in the river and canals. This information is readily available from the on-line database maintained by the Sevier River Water Users Association (2004). Input to the ANN should also include the orders that have been received by the canal managers for water deliveries along the length of the river. The relationship between inputs and outputs of the daily ANN model, then, can be expressed as
$$OD_t = \mathcal{F}(\mathbf{I}) \qquad (3)$$
where $OD_t$ is the flow rate of water (cfs) to be released on day $t$, $\mathbf{I}$ is the vector of inputs to the ANN, and $\mathcal{F}$ is the ANN nonlinear transformation of inputs to outputs. The input vector can be expressed as
$$\mathbf{I} = [D_{t-1} \;\; \mathbf{Q}_{t-1} \;\; \mathbf{O}_t]^T \qquad (4)$$
where $D_{t-1}$ is the average release flow (cfs) from the previous day, $\mathbf{Q}_{t-1}$ is a vector composed of the average flows (cfs) from the previous day at the flow gages along the river, and $\mathbf{O}_t$ is a vector of water orders (cfs) to be delivered during day $t$. The use of previous day canal flow information and orders for next day water deliveries produces an input layer with 14 neurons.
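The daily input vector of Equation (4) can be sketched in the same way. The text gives only the total of 14 input neurons and the number of canals (nine); the split used below, four river gages plus nine canal orders plus the previous release, is an assumption chosen solely so that the total comes to 14, and the function name and flow values are hypothetical.

```python
def daily_input(prev_release, prev_gage_flows, orders_today, expected_size=14):
    """Assemble I = [D_{t-1}, Q_{t-1}, O_t]^T and check its length."""
    vector = [float(prev_release)]
    vector += [float(q) for q in prev_gage_flows]
    vector += [float(o) for o in orders_today]
    if len(vector) != expected_size:
        raise ValueError(
            "input has %d entries, expected %d" % (len(vector), expected_size)
        )
    return vector


# Hypothetical flows in cfs. Assumed split: 4 gages and 9 canal orders.
gage_flows = [120.0, 95.0, 60.0, 30.0]
orders = [10.0, 0.0, 25.0, 15.0, 5.0, 20.0, 0.0, 12.0, 8.0]
daily_vector = daily_input(110.0, gage_flows, orders)
```

The length check is a small safeguard: a missing gage reading or canal order would otherwise silently shift every later input by one position.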
Hourly Streamflow Predictions
In some situations, unregulated tributary streams can cause flows to fluctuate in the main river over a diurnal pattern that is difficult to predict and that causes management problems in planning for diversions in locations downstream of the tributary. Clear Creek, a tributary of the Sevier River, is an example of an uncontrolled tributary stream that discharges into the river in such a way that its diurnal fluctuations make downstream water management more difficult.
In the spring and early summer, snowmelt in the Clear Creek watershed can produce runoff quantities with substantial diurnal fluctuations. The irrigators in the upper basin are entitled to capture and use flows from Clear Creek, but they have limited capabilities to do so. Instead, they must let flows from Clear Creek enter the mainstem of the Sevier, and then divert these waters downstream. If they fail to do so, the excess flows received at Vermillion that cannot be diverted and locally used will be spilled from the Vermillion Diversion Dam and lost from the upper basin. Clearly, capture of Clear Creek waters will require coordination of releases from Piute Reservoir, upstream, with diversion of irrigation water into canals, downstream. This coordination will be best facilitated with advanced forecasts about likely diurnal fluctuations in Clear Creek flows.
The design of an appropriate hourly prediction model requires the use of data that reflect the physical forces that cause streamflow in these tributary streams to fluctuate throughout the day. These include hourly total solar radiation, previous day streamflow, precipitation, and air temperature. An hourly model is required to provide information on the diurnal fluctuations in the river flows due to tributary inflows. The nonlinear mapping equations used to capture the relationships between inputs and outputs of an hourly ANN model can be expressed as
$$Q_t = \mathcal{F}(\mathbf{I}) \qquad (5)$$
where $Q_t$ is the rate of flow (cfs) past the Clear Creek gage for the coming 24 hours, and $t = (1, 2, \ldots, 24)$. $\mathbf{I}$ is the vector of inputs to the ANN, and $\mathcal{F}$ is the ANN nonlinear transformation of inputs to outputs. The input vector can be expressed as
$$\mathbf{I} = [\mathbf{Q}_{t-24} \;\; \mathbf{T}_{t-24} \;\; \mathbf{R}_{t-24} \;\; \mathbf{S}_{t-24} \;\; \mathbf{P}_{t-24}]^T \qquad (6)$$
where $\mathbf{Q}_{t-24}$, $\mathbf{T}_{t-24}$, and $\mathbf{R}_{t-24}$ are averages of the vectors of hourly streamflow (cfs) at the Clear Creek gage, air temperature (°C), and solar radiation (kW/m²), respectively, for the 24 hours previous to the prediction time; $\mathbf{S}_{t-24}$ and $\mathbf{P}_{t-24}$ are averages of the vectors of wind speed (mph) and precipitation (in), respectively, for a period of 24 hours before time $t$. Precipitation data were provided from the Kimberly Mine SnoTel station (see Figure 1).
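One reading of Equation (6) is that each of the five inputs is a single 24-hour average of an hourly series. The sketch below follows that reading, which is an assumption about how the averages are formed; the function names and the hourly values are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def hourly_input(flow, temperature, radiation, wind, precipitation):
    """Assemble I = [Q, T, R, S, P]_{t-24}^T from five 24-hour hourly series."""
    series = [flow, temperature, radiation, wind, precipitation]
    for s in series:
        if len(s) != 24:
            raise ValueError("each series must hold 24 hourly values")
    return [mean(s) for s in series]


# Hypothetical hourly records for the 24 hours before the prediction time.
flow = [40.0 + hour for hour in range(24)]   # cfs
temperature = [10.0] * 24                    # degrees C
radiation = [0.0] * 12 + [0.5] * 12          # kW/m2
wind = [4.0] * 24                            # mph
precipitation = [0.0] * 24                   # in
hourly_vector = hourly_input(flow, temperature, radiation, wind, precipitation)
```

Under this reading the network maps five averaged inputs to 24 hourly flow values, one for each hour of the coming day.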
RESULTS AND DISCUSSION
Model Specifications
Obtaining an optimal level of performance for any learning machine entails a considerable number of design choices, especially for ANN learning. The characteristics of an optimal architecture are a model that produces acceptable predictions, has good generalization abilities, and requires a minimal number of calibrated parameters (i.e., degrees of freedom). The approach for selecting an optimal architecture benefits from a rigorous statistical analysis and expert knowledge. Splitting the data into two sets, where the machine is trained on one and tested on the other to avoid underestimating the true error, has a twofold disadvantage: the problem of having sufficient data for training, and the possibility of statistical dependence between the two subsets (Blum et al., 1999). Moreover, since the available data are scarce, k-fold cross-validation can be used to overcome these deficiencies. In k-fold cross-validation, the data set is partitioned into k mutually disjoint folds (subsets), $S_j$, $j \in \{1, 2, \ldots, k\}$ (Shakhnarovich et al., 2001). For each $S_j$, the model is trained on all folds except $S_j$. The final error is estimated as the average over the k folds (Equation 7), where $Q(S_j, X \setminus S_j)$ is the statistic of interest for evaluation of an ANN model trained using $X \setminus S_j$ and tested on $S_j$. In this paper, a set aside sample of data is used (i.e., validation data set) to test the model plausibility. To avoid data splitting, the training data sets were used in a cross-validation context to build the ANN model.
The problem of choosing a suitable architecture for ANNs lies in specifying the activation function and the number of neurons in the hidden layer. Trial-and-error analysis resulted in selection of a suitable activation function for each model. Selection of the number of hidden nodes in ANNs is a most difficult but important step. The root mean square error (RMSE) from five-fold cross-validation was used to select the optimal number of hidden nodes (Rivals and Personnaz, 2000). The number of hidden nodes was increased, starting from only one, and the five-fold cross-validation error (mean and variance) was evaluated at each step. The optimal number of hidden nodes was selected at the point where the decrease in the five-fold error becomes insignificant (see Figure 4 for the case of the hourly model). Table 1 provides a summary of the characteristics of the seasonal, daily, and hourly models that have been discussed. The table shows the number of neurons in each layer, the optimal transfer function, and the learning rule used for each model.
The ANN model is constructed once the model structure is selected. Construction of an ANN model involves training the network with known input/output data available from the real system, and then testing the resulting model against other data not used in training (the withheld sample). In this manuscript, ANNs are developed using NeuralWorks Professional II/Plus (NeuralWare, Inc., 2000) and the Matlab toolbox NetLab (Bishop, 1995; Nabney, 2001).
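The cross-validation procedure and the rule for choosing the number of hidden nodes can be sketched as follows. This is an illustration under stated assumptions rather than the procedure as implemented in the software named above: the `fit` callable, the interleaved fold assignment, the tolerance `tol`, and the error values are hypothetical, and a constant-mean model stands in for ANN training so that the example stays self-contained.

```python
import math


def k_folds(n_samples, k):
    """Partition indices 0..n-1 into k mutually disjoint folds."""
    return [list(range(j, n_samples, k)) for j in range(k)]


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def cv_error(xs, ys, k, fit):
    """Average over folds of Q(S_j, X minus S_j), with Q taken as RMSE."""
    total = 0.0
    for fold in k_folds(len(xs), k):
        held_out = set(fold)
        train = [i for i in range(len(xs)) if i not in held_out]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        total += rmse([model(xs[i]) for i in fold], [ys[i] for i in fold])
    return total / k


def pick_size(errors_by_size, tol):
    """Smallest size after which the CV error drops by less than tol."""
    sizes = sorted(errors_by_size)
    for current, larger in zip(sizes, sizes[1:]):
        if errors_by_size[current] - errors_by_size[larger] < tol:
            return current
    return sizes[-1]


def fit_mean(train_x, train_y):
    """Stand-in for ANN training: always predicts the training mean."""
    m = sum(train_y) / len(train_y)
    return lambda x: m


folds = k_folds(10, 5)
constant_error = cv_error(list(range(10)), [2.0] * 10, 5, fit_mean)
# Hypothetical five-fold RMSE values indexed by number of hidden nodes.
chosen = pick_size({1: 0.90, 2: 0.50, 3: 0.48, 4: 0.47}, tol=0.05)
```

With these hypothetical errors the rule selects two hidden nodes, because adding a third lowers the error by less than the tolerance. In practice the tolerance would be judged against the variance of the fold errors, as the confidence bounds in Figure 4 suggest.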
Performance Criteria
The objective of the training phase in building an ANN is to produce a set of connection weights that causes the outputs of the ANN, $y(t)$, to match as closely as possible the observed system outputs, $T(t)$, for every set of training patterns. Achievement of this objective is typically measured by the correlation coefficient, $R^2$, defined in Equation (8), where $\bar{y}$ and $\bar{T}$ are the means of $y$ and $T$, respectively. The correlation coefficient is not a measure of the predictive capabilities of the model since it is sensitive to
Cross-validation error over the folds $S_j$, $j \in \{1, 2, \ldots, k\}$ (Shakhnarovich et al., 2001):
$$Err_{CV} = \frac{1}{k} \sum_{j=1}^{k} Q(S_j, X \setminus S_j) \qquad (7)$$
Figure 4. RMSE (five-fold cross-validation) and 95 Percent Confidence Bounds as a Function of Number of Hidden Nodes.
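$$R^2 = \frac{\left[\sum_t \left(y(t) - \bar{y}\right)\left(T(t) - \bar{T}\right)\right]^2}{\sum_t \left(y(t) - \bar{y}\right)^2 \, \sum_t \left(T(t) - \bar{T}\right)^2} \qquad (8)$$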
outliers and spurious data. Therefore, the coefficient
of efficiency,E, has been widely used, defined as
A model withE = 0.9 has a mean square error of 10percent of the variance of the observed data. It is,
however, sensitive to significant outliers. To overcomethe susceptibility to extreme values, the Index of
Agreement, d, can be used. It is defined as follows
It is less sensitive to large values. To quantify theerror in terms of the units of the variable, one could
use the RMSE. It is defined as
Bias and mean absolute error are also physical mea-
sures. Bias is the average of the differences betweenobserved and predicted values, while mean absoluteerror is the average of the absolute of the residuals.
For more details about goodness-of-fit measures, seeDavid and Gregory (1999).
A complete assessment of the model should also include scatterplots with error bounds. The performance of the ANN model is evaluated during the ANN testing phase using scatterplots of y(t) versus T(t). The magnitude of the scatter of [T(t), y(t)] about a 45 degree line can be examined using error bounds to assess the deviation of predicted outputs from measured system behavior.
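The goodness-of-fit measures discussed in this section can be computed together from paired observed and predicted series. The following is a minimal Python sketch, not the authors' implementation: the function name `goodness_of_fit` and the dictionary keys are hypothetical, the bias is taken as observed minus predicted, and the index of agreement is written in its absolute-value form, which is an assumption about the intended definition (a squared form is also in common use).

```python
import numpy as np

def goodness_of_fit(T, y):
    """Goodness-of-fit statistics for observed outputs T and ANN outputs y."""
    T = np.asarray(T, dtype=float)
    y = np.asarray(y, dtype=float)
    resid = T - y                    # observed minus predicted
    T_dev = T - T.mean()             # deviations from the observed mean
    y_dev = y - y.mean()             # deviations from the predicted mean

    # Squared linear correlation between predictions and observations
    r2 = np.sum(y_dev * T_dev) ** 2 / (np.sum(y_dev ** 2) * np.sum(T_dev ** 2))
    # Coefficient of efficiency: 1 minus error sum of squares over observed variability
    E = 1.0 - np.sum(resid ** 2) / np.sum(T_dev ** 2)
    # Index of agreement (absolute-value form; an assumption, see lead-in)
    d = 1.0 - np.sum(np.abs(resid)) / np.sum(np.abs(y - T.mean()) + np.abs(T_dev))
    # Measures expressed in the units of the variable
    rmse = np.sqrt(np.mean(resid ** 2))
    bias = np.mean(resid)
    mae = np.mean(np.abs(resid))
    return {"R2": r2, "E": E, "d": d, "RMSE": rmse, "Bias": bias, "MAE": mae}

if __name__ == "__main__":
    # Illustrative numbers only, not data from the Sevier River Basin
    stats = goodness_of_fit([1.0, 2.0, 3.0, 4.0], [2.0, 2.0, 3.0, 3.0])
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

Reporting the dimensionless measures (R², E, d) next to the measures in physical units (RMSE, bias, mean absolute error) mirrors the way the model statistics are summarized in Table 2.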
Seasonal Streamflow Prediction Model Performance
Figure 5 illustrates the relationship between the model predictions and the actual data. Using SSTA data from the East Atlantic station produced the optimal performance. The correlation coefficient of the model is 0.88 and the coefficient of efficiency is 0.76. A seasonal ANN of this adequacy could be very useful to water users in making decisions about basin operations. Figure 6 provides a scatterplot, together with 20 percent error bounds, of model predictions versus actual system behavior. It should be noted that the total training data set corresponds to the period 1981 to 2002, and the small number of patterns could be a direct reason for the ANN exhibiting relatively poor predictions at some points (i.e., the peak flows). The poor predictions could also be attributed to a lack of sufficient data included in the inputs to the ANN model to fully represent the hydrology of the watershed for these events. It is worth mentioning that the lack of accuracy of ANNs in predicting peaks and valleys in hydrologic time series is one of the major concerns facing users of ANN technology in the hydrologic community. For techniques to improve peak flow estimation in ANNs, readers are referred to Sudheer et al. (2003).
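The use of 20 percent error bounds on a scatterplot can be made quantitative by counting how many predictions fall inside the bounds. The sketch below assumes the bounds are the lines at plus and minus 20 percent of the observed value about the 45 degree line; the paper does not spell out this construction, and the function name `fraction_within_bounds` is hypothetical.

```python
import numpy as np

def fraction_within_bounds(T, y, rel_bound=0.20):
    """Fraction of predictions y lying within +/- rel_bound of the observed T."""
    T = np.asarray(T, dtype=float)
    y = np.asarray(y, dtype=float)
    edge_a = (1.0 - rel_bound) * T
    edge_b = (1.0 + rel_bound) * T
    # Taking min/max keeps the test valid if an observed value is negative
    lower = np.minimum(edge_a, edge_b)
    upper = np.maximum(edge_a, edge_b)
    inside = (y >= lower) & (y <= upper)
    return float(inside.mean())

if __name__ == "__main__":
    # Illustrative numbers only
    observed = [100.0, 100.0, 100.0, 100.0]
    predicted = [85.0, 115.0, 125.0, 70.0]
    print(fraction_within_bounds(observed, predicted))
```

A low fraction concentrated at the largest observed values would point to the peak flow problem described above.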
Successful seasonal forecasts of water quantity should help answer difficult questions such as, "Will there be sufficient water to meet competing demands in the Sevier River Basin?" and "How far will one be able to stretch the water that will become available?"
Daily Prediction of Required Reservoir Releases
Figure 7 presents a time series plot comparing the ANN model release forecasts and the actual diversions for the irrigation seasons of 2000, 2001, and 2002. This figure shows good model performance in predicting the required releases from the reservoir.
TABLE 1. Model Structures Summary.
Model            Input Layer   Hidden Layer 1   Hidden Layer 2   Output Layer   Transfer Function   Learning Rule*
Seasonal Model   15            10               -                01             sig(.)              δ-Rule
Daily Model      14            14               3                01             tanh(.)             NCD
Hourly Model     07            06               -                24             sig(.)              δ-Rule
*The learning rules are the delta rule (δ-Rule) and the normalized cumulative delta (NCD). A discussion of these rules can be found in NeuralWare (2000).
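The layer sizes in Table 1 determine how many connection weights each network has to learn. The sketch below counts them under two assumptions that the paper does not state: the layers are fully connected, and every hidden and output node carries one bias term. The layer sizes are read from the flattened table (an empty second hidden layer for the seasonal and hourly models is itself an assumption), and the function name `count_parameters` is hypothetical.

```python
def count_parameters(layer_sizes):
    """Count weights and biases of a fully connected feed-forward network.

    layer_sizes lists the node count of each layer from input to output;
    layers with zero nodes (an unused hidden layer) are skipped.
    """
    sizes = [n for n in layer_sizes if n > 0]
    total = 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        total += (n_in + 1) * n_out  # n_in weights plus one bias for each node
    return total

# Layer sizes as read from Table 1 (assumed reading of the table)
structures = {
    "seasonal": [15, 10, 1],
    "daily": [14, 14, 3, 1],
    "hourly": [7, 6, 24],
}

if __name__ == "__main__":
    for name, sizes in structures.items():
        print(name, count_parameters(sizes))
```

Under these assumptions the counts are 171, 259, and 216 adjustable parameters for the seasonal, daily, and hourly models, respectively. Comparing such a count with the number of available training patterns gives a rough sense of how tightly the data constrain each model.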
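$$E = 1 - \frac{\sum_{t}\left[T(t)-y(t)\right]^2}{\sum_{t}\left[T(t)-\bar{T}\right]^2} \tag{9}$$
$$d = 1 - \frac{\sum_{t}\left|T(t)-y(t)\right|}{\sum_{t}\left(\left|y(t)-\bar{T}\right| + \left|T(t)-\bar{T}\right|\right)} \tag{10}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left[T(t)-y(t)\right]^2}, \qquad t = 1, 2, \ldots, N \tag{11}$$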
Figure 8 provides a scatterplot, together with 20 percent error bounds, of model predictions versus measured releases for the validation data used in the 2000, 2001, and 2002 irrigation seasons. The correlation coefficient for this scatterplot had a value of R² = 0.98 and the coefficient of efficiency was E = 0.95. To utilize the model in near real time, the predicted reservoir releases can be provided to the reservoir operator, and then it is possible for the operator and experts to analyze, judge, and evaluate the results of the ANN model according to their own knowledge and experience.
Figure 5. Time Series Performance of the ANN Model in Predicting Seasonal Quantity of Water.
Figure 6. Scatterplot of Model Predictions Versus Actual Flows.
The results indicate that the model forecast can be used to address the conflicting goals of satisfying downstream demands with high certainty while at the same time conserving water in the reservoir for use later in the season.
Hourly Streamflow Prediction Model Results
An hourly streamflow prediction model has been built to forecast the substantial diurnal fluctuation for the Clear Creek watershed. The total data available for building this model are from the spring runoff
Figure 7. Time Series Performance of the ANN Model in Predicting the 2000, 2001, and 2002 Irrigation Season Releases.
Figure 8. Scatter Plot of Model Predictions Versus Measured Releases for the 2000, 2001, and 2002 Irrigation Seasons.
periods of 2000 through 2003. As shown in Figure 9, it is possible to predict 2003 hourly flows at Clear Creek during the first months of the irrigation season when diurnal fluctuations play a strong role in determining flows in the creek.
On average, the linear correlation between the actual and the predicted flow is 0.97 and the RMSE is 9.43 cfs (0.27 m³/sec). Trials with different data sets showed that the hourly predictions would not be as good unless all the relevant data (previous streamflow, total solar radiation, air temperature, and precipitation) were employed.
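The paired units reported for the error statistics can be cross-checked with a simple conversion. The sketch below uses the exact definition of the international foot (0.3048 m); the function names are hypothetical, and reading "mcf" and "mcm" in Table 2 as million cubic feet and million cubic meters is an assumption.

```python
FT_TO_M = 0.3048              # exact length of the international foot in meters
FT3_TO_M3 = FT_TO_M ** 3      # cubic feet to cubic meters

def cfs_to_cms(flow_cfs):
    """Convert a flow from cubic feet per second to cubic meters per second."""
    return flow_cfs * FT3_TO_M3

def mcf_to_mcm(volume_mcf):
    """Convert a volume from million cubic feet to million cubic meters (assumed units)."""
    return volume_mcf * FT3_TO_M3

if __name__ == "__main__":
    print(round(cfs_to_cms(9.43), 2))    # hourly testing RMSE, prints 0.27
    print(round(cfs_to_cms(38.13), 2))   # daily testing RMSE, prints 1.08
    print(round(mcf_to_mcm(20.59), 2))   # seasonal testing RMSE, prints 0.58
```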
As shown in Figure 10, the predicted flows versus the actual flows illustrate very good model performance. Hourly streamflow predictions provide useful management information for the Sevier River Basin managers and farmers in dealing with diurnal fluctuations of tributary streams. It is worth mentioning here that the model was able to accurately simulate the rapid rise in streamflow that occurs at sunrise, as well as other diurnal fluctuations in flow.
SUMMARY AND CONCLUSIONS
To improve water management for the Sevier River Basin, an extensive, basin wide automated system has been installed that records and stores data on an hourly basis to enable real time information processing. Moreover, Internet based communications and control systems are in place to allow managers to remotely manipulate all reservoir releases and canal diversion gates at will. Operators of the basin wide system and water users alike have begun to view the resulting information and control system as an integrated tool for basin wide management (Berger et al., 2002; Bret et al., 2002).
In most river basins, and particularly in the Sevier, water supply is managed at different temporal and spatial scales, and decisions made by different managers are not always well coordinated. This is
Figure 9. Model Performance Evaluated Using Coefficient
of Efficiency and RMSE (2003 irrigation season).
Figure 10. Daily Predicted Versus Actual Flow at 10 p.m. for Clear Creek.
particularly difficult given long travel times and uncertainties in system behavior. The models described in this manuscript represent a first attempt to exploit the real time database available on the Sevier River to address the range in information needs of stakeholders and managers. A seasonal model provides prediction of future water availability in the upper basin to reduce the vulnerability of water users to unforeseen water shortages. This information will help them avoid financial commitments that must be made early in the water year but that could result in substantial economic losses if future water supplies become limited. A daily reservoir release model was designed to improve on-demand flexibility in reservoir operation. Efficient daily management decisions about reservoir releases reduce water losses and improve deliveries to downstream irrigators. An hourly model of uncontrolled tributary flows allows water managers to accurately anticipate diurnal flow conditions and consequently integrate upstream reservoir releases with numerous downstream canal diversions. These models exploit the real time database with the coordinated input of water demand information by diverse canal and reservoir operators to provide both short term and long term decision relevant information. In these functions, they constitute a foundation of an integrated framework for basin-scale management of the available scarce water resources.
The ANN model was able to successfully transform measured input vectors into reasonably accurate forecasts of outputs for the three models. Large amounts of data, including multi-sensor data in the form of meteorological and streamflow data, were integrated into an ANN framework to develop useful models for water management problems. The adequacy of the ANN models is demonstrated by the quality of their forecasts. This shows that construction of real time monitoring and management systems can be accomplished to provide more efficient utilization of the basin's water resources. This paper demonstrates the applicability of ANNs to learn relationships among easy to measure streamflow, meteorological, and satellite data to enhance basin scale management. The performance of ANN techniques in extracting useful information is satisfactory (see Table 2).
Overall, the resulting models are easily used and have been found to provide useful and efficient forecasts without resorting to the development and application of complex, computationally demanding physically based models that require expensive data collection efforts to support them. In the future, such models could also contribute substantially to computer controlled basin automation by linking them to the basin database. This is being considered in the Sevier River Basin, and, if implemented, might reduce the cost of management and more fully exploit the available database for the basin. This leads us to optimistically share the view voiced by one of the water users in the Sevier River Basin that: "when something goes down and I have to go back to the old way of doing things, it is like being blind after being able to see" (Berger et al., 2002, p. 25-11).
ACKNOWLEDGMENTS
The authors wish to thank Dr. Roger Hansen of the U.S. Bureau of Reclamation, Provo, Utah, for the extremely valuable contributions he has made to the work reported in this paper. Thanks are also due to Dr. Luis Bastidas and Connely K. Baldwin for their valuable insights and help. The authors are grateful to the Sevier River Water Users Association, the U.S. Bureau of Reclamation, and the Utah Water Research Laboratory at Utah State University for providing funding in partial support of the work reported here. Thanks are also due to anonymous reviewers for their insightful comments.
TABLE 2. Key Statistics of Model Performance in the Training and Testing Phases.
                            Seasonal                  Daily                     Hourly
Statistics                  Training     Testing      Training     Testing      Training     Testing
Correlation Coefficient     0.93         0.88         0.99         0.98         0.99         0.97
Coefficient of Efficiency   0.86         0.76         0.98         0.95         0.98         0.91
Index of Agreement          0.96         0.94         0.99         0.99         0.99         0.97
RMSE                        19.58 mcf    20.59 mcf    20.16 cfs    38.13 cfs    4.25 cfs     9.43 cfs
                            0.55 mcm     0.58 mcm     0.57 cms     1.08 cms     0.12 cms     0.27 cms
Bias                        -3.44 mcf    -4.24 mcf    0.00 cfs     1.84 cfs     0.00 cfs     -3.84 cfs
                            -0.097 mcm   -0.12 mcm    0.00 cms     0.05 cms     0.00 cms     -0.1 cms
Mean Absolute Error         14.00 mcf    15.67 mcf    13.25 cfs    27.38 cfs    2.86 cfs     5.91 cfs
                            0.4 mcm      0.44 mcm     0.38 cms     0.78 cms     0.081 cms    0.17 cms
LITERATURE CITED
Ames, D., 1998. Seasonal to Interannual Streamflow Forecasts Using Nonlinear Timeseries Methods and Climate Information. Master of Science Thesis, Utah State University, Logan, Utah.
Berger, B., R. Hansen, and A. Hilton, 2002. Using the World-Wide-Web as a Support System to Enhance Water Management. The 18th ICID Congress and 53rd IEC Meeting, Montréal, Canada, pp. 25-1 to 25-12.
Bishop, C.M., 1995. Neural Networks for Pattern Recognition. Oxford University Press.
Blum, A., A. Kalai, and J. Langford, 1999. Beating the Holdout: Bounds for k-Fold and Progressive Cross-Validation. Proceedings of the 12th Annual Conference on Computational Learning Theory, pp. 203-208.
Bret, B., H. Rogers, and R. Jensen, 2002. Sevier River Basin System Description. Available at http://www.sevierriver.org/sys_desc/t1.html. Accessed on April 20, 2004.
David, R.L. and M.J. Gregory, 1999. Evaluating the Use of Goodness-of-Fit Measures in Hydrologic and Hydroclimatic Model Validation. Water Resources Research 35(1):233-241.
Govindaraju, R.S. and A.R. Rao, 2000. Artificial Neural Networks in Hydrology. Kluwer Academic Publishers, Amsterdam, The Netherlands.
Hammerstrom, D., 1993. Working With Neural Networks. IEEE Spectrum, July, pp. 46-53.
Hayken, S., 1994. Neural Networks: A Comprehensive Foundation. IEEE Press, McMillan College Publishing, New York, New York.
Kaplan, A., M. Cane, Y. Kushnir, A. Clement, M. Blumenthal, and B. Rajagopalan, 1998. Analyses of Global Sea Surface Temperature 1856-1991. Journal of Geophysical Research 103:18,567-18,589.
Kaplan, A., Y. Kushnir, M. Cane, and M. Blumenthal, 1997. Reduced Space Optimal Analysis for Historical Datasets: 136 Years of Atlantic Sea Surface Temperatures. Journal of Geophysical Research 102:27,835-27,860.
Maier, H.R. and G.C. Dandy, 2000. Neural Networks for the Prediction and Forecasting of Water Resources Variables: A Review of Modeling Issues and Applications. Environmental Modeling and Software 15:101-124.
Nabney, I., 2001. Netlab: Algorithms for Pattern Recognition. Springer, New York, New York.
NeuralWare, Inc., 2000. Neural Computing, NeuralWorks Professional II/PLUS. Carnegie, Pennsylvania.
Rivals, I. and L. Personnaz, 2000. A Statistical Procedure for Determining the Optimal Number of Hidden Neurons of a Neural Model. Second International Symposium on Neural Computation, Berlin, Germany.
Rumelhart, D.E., G.E. Hinton, and R.J. Williams, 1986. Learning Internal Representations by Error Propagation. In: Parallel Distributed Processing: Explorations in the Microstructure of Cognition, D.E. Rumelhart and J.L. McClelland (Editors). MIT Press, Cambridge, Massachusetts, Vol. 1, Chapter 8, pp. 318-362.
Schalkoff, R.J., 1997. Artificial Neural Networks. McGraw-Hill, New York, New York.
Sevier River Water Users Association, 2004. Sevier River Water Users Association: Real-time Water/Weather Data. Available at http://www.sevierriver.org/. Accessed on December 8, 2004.
Shakhnarovich, G., R. El-Yaniv, and Y. Baram, 2001. Smoothed Bootstrap and Statistical Data Cloning for Classifier Evaluation. Proceedings of International Conference on Machine Learning, pp. 521-528.
Skapura, D.M., 1995. Building Neural Networks. Addison-Wesley Publishing Company, Boston, Massachusetts.
Sudheer, K.P., P.C. Nayak, and K.S. Ramasastri, 2003. Improving Peak Flow Estimates in Artificial Neural Network River Flow Models. Hydrological Processes 17:677-686.
Utah Board of Water Resources, 2001. Utah's Water Resources Planning for the Future. Division of Water Resources Publications, Salt Lake City, Utah. Available at http://www.water.utah.gov/waterplan/uwrpff/TOC.htm. Accessed on May 21, 2001.