cost-effective sampling network design for contaminant plume
TRANSCRIPT
Journal of Contaminant Hydrology 77 (2005) 41–65
www.elsevier.com/locate/jconhyd
Cost-effective sampling network design for
contaminant plume monitoring under
general hydrogeological conditions
Jianfeng Wua,b, Chunmiao Zhenga,*, Calvin C. Chienc
aDepartment of Geological Sciences, University of Alabama, Tuscaloosa, AL, United StatesbDepartment of Earth Sciences, Nanjing University, Nanjing, China
cCorporate Remediation, DuPont Company, Wilmington, DE, United States
Received 1 October 2003; received in revised form 10 November 2004; accepted 16 November 2004
Abstract
A new simulation–optimization methodology is developed for cost-effective sampling network
design associated with long-term monitoring of large-scale contaminant plumes. The new
methodology is similar in concept to the one presented by Reed et al. (Reed, P.M., Minsker, B.S.,
Valocchi, A.J., 2000a. Cost-effective long-term groundwater monitoring design using a genetic
algorithm and global mass interpolation. Water Resour. Res. 36 (12), 3731–3741) in that an
optimization model based on a genetic algorithm is coupled with a flow and transport simulator and a
global mass estimator to search for optimal sampling strategies. However, this study introduces the
first and second moments of a three-dimensional contaminant plume as new constraints in the
optimization formulation, and demonstrates the proposed methodology through a real-world
application. The new moment constraints significantly increase the accuracy of the plume
interpolated from the sampled data relative to the plume simulated by the transport model. The
plume interpolation approaches employed in this study are ordinary kriging (OK) and inverse
distance weighting (IDW). The proposed methodology is applied to the monitoring of plume
evolution during a pump-and-treat operation at a large field site. It is shown that potential cost
savings up to 65.6% may be achieved without any significant loss of accuracy in mass and moment
estimations. The IDW-based interpolation method is computationally more efficient than the OK-
based method and results in more potential cost savings. However, the OK-based method leads to
0169-7722/$ -
doi:10.1016/j.
* Corresp
E-mail add
see front matter D 2004 Elsevier B.V. All rights reserved.
jconhyd.2004.11.006
onding author. Tel.: +1 205 348 0579; fax: +1 205 348 0818.
ress: [email protected] (C. Zheng).
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6542
more accurate mass and moment estimations. A comparison of the sampling designs obtained with
and without the moment constraints points to their importance in ensuring a robust long-term
monitoring design that is both cost-effective and accurate in mass and moment estimations.
Additional analysis demonstrates the sensitivity of the optimal sampling design to the various
coefficients included in the objective function of the optimization model.
D 2004 Elsevier B.V. All rights reserved.
Keywords: Contaminant transport; Monitoring network design; Interpolation method; Moment analysis; Genetic
algorithm; Massachusetts Military Reservation (MMR)
1. Introduction
During the past two decades, models that solve the governing groundwater flow
and/or solute transport equations in conjunction with an optimization technique have
been increasingly used as aquifer management tools (e.g., Gorelick, 1983; Ahlfeld and
Mulvey, 1988; Wagner and Gorelick, 1989; Culver and Shoemaker, 1992; Karatzas
and Pinder, 1993; Tiedeman and Gorelick, 1993; Rizzo and Dougherty, 1996; Minsker
and Shoemaker, 1998; Zheng and Wang, 1999a; Mayer et al., 2001). The combined
simulation and optimization model is appealing because it can account for the complex
behavior of the groundwater system and identify the best management strategy to
achieve a given set of objectives under prescribed constraints (Wagner, 1995a).
Simulation–optimization models have been developed for a variety of applications. A
representative simulation–optimization model is one that seeks to identify the least-cost
strategy to meet specified constraints. Recently the least-cost strategies are often
related to groundwater remediation (McKinney and Lin, 1994; Huang and Mayer,
1997; Bear and Sun, 1998; Aly and Peralta, 1999; Smalley et al., 2000; Zheng and
Wang, 2002).
Groundwater remediation is associated with numerous costs and often has time
horizons of up to 30 years or more. Installation of a cleanup system does not mean the
end of a groundwater remediation project. Instead, long-term monitoring of the system’s
performance is needed to ensure that the remediation objectives are being achieved and
the risks to human health and environment are being properly managed (Chien et al.,
2002). Over-sampling is a common problem encountered in groundwater quality
monitoring, while data collection and analysis of long-term monitoring are very
expensive. For a typical contaminant site, several hundred samples may be collected and
analyzed each year that cost hundreds of thousands of dollars. Applying simulation–
optimization models to long-term monitoring network design can achieve substantial
cost savings by eliminating unnecessary samples (Chien et al., 2002). This paper
presents a new methodology that builds on the work of Reed et al. (2000a) for cost-
effective groundwater monitoring design using a genetic algorithm and global mass and
moment estimation.
The optimization of long-term monitoring network design can be accomplished using a
variety of approaches. Selecting an appropriate method involves numerous criteria, the
most important of which include the site-specific long-term performance objectives and
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 43
the amount and type of available data. Although monitoring network design has been
studied intensively in the past, most studies have only focused on the theoretical
development and simple hypothetical examples or small-scale field problems (e.g.,
Loaiciga, 1989; McKinney and Loucks, 1992; Hudak and Loaiciga, 1993; Christakos and
Killam, 1993; Andricevic, 1993, 1996; James and Gorelick, 1994; Meyer et al., 1994,
1995; Cieniawski et al., 1995; Wagner, 1995b; Bogaert and Russo, 1999; Montas et al.,
2000; Reed et al., 2000a). Often viable approaches for sampling network design in large-
scale applications are variance-based methods (Rouhani, 1985; Rouhani and Hall, 1988;
Hudak and Loaiciga, 1992; Ridley et al., 1995). For example, Rouhani (1985) did a
seminal work on groundwater sampling network design based on the variance reduction
analysis in which the variances of the estimation errors obtained from kriging interpolation
were sequentially decreased by adding more sampling points to a regional-scale water
table monitoring network until the variance reduction became insignificant. Rouhani and
Hall (1988) improved this variance reduction analysis and applied it to groundwater
network design for contaminant plume monitoring.
Two recent studies are of particular interest to this work. Reed et al. (2000a) presented a
methodology for cost-effective monitoring network design that is used to examine the
feasibility of reducing long-term monitoring costs under conditions similar to those at the
Hill Air Force Base, Utah with a comparatively small study area (approximately only 0.02
km2). As discussed in more detail in Section 4, it would be difficult to achieve a
satisfactory result applying the methodology developed by Reed et al. (2000a) to a large
field site such as the Massachusetts Military Reservation in Cape Cod, Massachusetts.
Montas et al. (2000) developed a direct partial enumeration method coupled with a
stochastic flow and transport model to characterize a hypothetical contaminant plume by
minimizing the spatial moments subject to sampling cost constraints. Although the
methodology proposed by Montas et al. (2000) was successful in generating near-optimal
sampling networks that satisfied all imposed constraints, it is not suitable for minimization
of monitoring costs. In addition, whether the proposed methodology is applicable to real-
world sites still needs to be verified further.
This paper is related to the work of Zheng and Wang (2002) who demonstrated the
application of optimization modeling to the design of a pump-and-treat system for the
containment and cleanup of a large trichloroethylene (TCE) plume in a shallow,
unconfined sandy aquifer at the Massachusetts Military Reservation in Cape Cod,
Massachusetts. The current work involves the design of a long-term sampling network to
monitor the plume evolution during the remedial operation in a most cost-effective
manner using a genetic algorithm and global plume estimation. The aim of this study is
to minimize the monitoring costs by eliminating data redundancy without significantly
affecting the accuracy and adequacy of sampled data. The methodology presented in this
study is similar in concept to one presented by Reed et al. (2000a) in that an
optimization model coupling a genetic algorithm and global mass estimation is used to
search for sampling plans. The interpolation methods used in this study for global plume
estimates are still ordinary kriging and inverse distance weighted interpolation. The most
notable difference for the methodology presented in this paper is that it adds and
emphasizes the plume moment constraints on the optimization procedure, leading to
more accurate sampling designs.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6544
2. Methodology
The simulation–optimization (S/O) model developed in this study for cost-effective
monitoring network design includes four primary components: (1) a groundwater flow and
transport simulator, (2) a global mass estimator, (3) a plume moment estimator, and (4) an
optimal search technique based on a genetic algorithm (GA). Components (1)–(3) are used
to evaluate various constraints for all potential sampling designs, while component (4) is
used to identify the optimal or near-optimal sampling design. The framework of the S/O
model is similar to that proposed by Reed et al. (2000a). The most important difference is
the plume moment constraints introduced and emphasized in the current study.
A prerequisite for applying the monitoring network design model to any field site is the
existence of a calibrated flow and transport model for the study site. The flow and
transport code is used to simulate the contaminant plume from the initial time t0 to the
expected time tm under any given remedial scenario. The simulated plume is assumed to
represent the future conditions to be monitored. If a model node is selected as a potential
monitoring well location, the simulated concentration value at that location is considered
known. The known concentrations at all potential monitoring well locations are used to
construct a new plume through interpolation. The interpolated plume is then compared, in
terms of both mass and first and second moments, with the simulated plume output from
the transport model. If enough nodes were selected as potential monitoring well locations,
the difference between the simulated and interpolated plumes would be minimal. On the
other hand, the more model nodes are selected as potential monitoring well locations, the
higher the capital and sampling costs would be. Thus there is a tradeoff between the
accuracy of the interpolated plume based on monitoring data and the cost-effectiveness of
the monitoring network.
The objective function for each potential sampling design is evaluated in terms of total
monitoring costs while considering the accuracy of estimated global mass and spatial
plume moments based on sampling data. Two interpolation methods, ordinary kriging
(OK) and inverse distance weighting (IDW), are applied to estimate contaminant
concentrations at all unsampled nodes within the model domain. For each potential
design the global contaminant mass and spatial moments of the plume along different
directions are computed using the respective global mass estimator and plume moment
estimator. Moreover, the fitness value of each potential sampling design is expressed as the
combination of objective function (capital and sampling costs) and penalty costs due to the
violations of both global mass and plume moment estimation errors. Then fitness values
are evaluated by a GA to determine which individual sampling designs are allowed to
reproduce and evolve to the next generation. As the generation evolves, the optimal or
near-optimal solution (sampling design) will be found. Fig. 1 shows the framework used in
this study for implementing the cost-effective groundwater sampling network design.
2.1. Flow and transport simulation
In this study, it is assumed that the flow and transport models are based on the three-
dimensional finite-difference flow code, MODFLOW (McDonald and Harbaugh, 1988),
and its solute transport companion, MT3DMS (Zheng and Wang, 1999b). MODFLOW
Fig. 1. Flowchart for monitoring network design using GA and global plume estimation.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 45
has a modular structure that allows it to be easily adapted for a particular application.
MT3DMS also has a modular structure and has a comprehensive set of options and
capabilities for simulating contaminant transport, including advection, dispersion/
diffusion, and chemical reactions under general hydrogeological conditions. Note that
the contaminant plume simulated by the flow and transport models is assumed to represent
the future conditions to be monitored. Thus, the flow and transport models should have
been calibrated before it can be used in sampling network design for monitoring future
migration and remediation.
2.2. Global plume estimation
The global plume estimation includes two parts: global mass estimation and spatial
moment estimations. For each potential sampling design the coordinates and concen-
trations of sampled points are used to estimate the contaminant concentrations at all
unsampled nodes within the model domain using OK or IDW interpolation. The
interpolated concentrations are then used to estimate the global mass (zeroth moment) and
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6546
the first and second moments. The degree to which the estimates of global mass and higher
moments agree with those directly output from the transport model serves as the criterion
for evaluating the accuracy of individual sampling designs.
2.2.1. Ordinary kriging (OK) method
Kriging is a best, linear unbiased estimator (BLUE). It possesses some unique
properties including conditional unbiasedness, smoothing effect, additivity, and exact
interpolation. Kriging, as an interpolation technique, has been applied widely to estimate
hydrogeologic variables at unsampled locations from scatter data points, such as
transmissivity and hydraulic conductivity (e.g., Lavenue and Pickens, 1992; McKinney
and Loucks, 1992; Eggleston et al., 1996; Fabbri, 1997) and contaminant concentrations
(e.g., Zhu et al., 1997; Reed et al., 2000a).
The ordinary kriging (OK) method filters the mean from the simple kriging estimator by
requiring that the kriging weights sum to one (for more details see Isaaks and Srivastava,
1989; Deutsch and Journel, 1998; Olea, 1999). Note that the contaminant concentration
data can be converted to be normally distributed, i.e., normalized (e.g., by changing the data
into logarithm transform). Even in the non-normal case, the kriging method still gives the
best approximation to the conditional expectation (Vogely et al., 1978).
In this study the kriging packages modified from the Geostatistical Software Library
(GSLIB) (Deutsch and Journel, 1998) are used to estimate the plume concentrations at all
unsampled nodes within the study area. Reed et al. (2000a) mentioned two potential
shortcomings of using kriging for interpolation: (1) computational complexity; and (2)
difficulty of getting the semivariogram. While the first can be overcome with the rapid
advent of computing powers, the second makes it necessary to have considerable expertise
to apply this interpolation approach to field-scale problems.
2.2.2. Inverse distance weighted (IDW) interpolation
The IDW interpolation is one of the most commonly used techniques for interpolation
of scatter points. It is based on the assumption that the interpolation should be influenced
more by the nearby points and less by the more distant points. The interpolated value is a
weighted average of the scatter points and the weight assigned to each scatter point
diminishes as the distance from the interpolation point to the scatter point increases
(Vucetic et al., 2000).
The interpolated concentration, cinterp, at an unsampled location within the model
domain can be expressed as
cinterp ¼Xnri¼1
wicSi ð1Þ
where nr is the number of sampling points within the search radius from the interpolation
point, cSi is the concentration at sampling points, and wi is the weight function assigned to
each sampling point. The classical form of the weight function is:
wi ¼r�piXn
j¼1
r�pj
ð2Þ
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 47
where p is an arbitrary positive real number called power parameter (typically, p=2), and riis the distance from the ith sampling point to the interpolation point.
The key feature of IDW interpolation is its numerical simplicity. However, the
assumption that the interpolation should be influenced more by the points nearby and less
by the points far away may result in considerable estimation errors, especially for the case
where the number of interpolated nodes is much larger than that of total sampled nodes.
2.2.3. Spatial moment estimation
Spatial moment estimation has often been used to define the characteristics of a
contaminant plume versus time (Freyberg, 1986; Garabedian et al., 1991; Adams and
Gelhar, 1992; Srivastava and Brusseau, 1996; Ezzedine and Rubin, 1997; Montas et al.,
2000; Vereecken et al., 2000). The zeroth moment measures the global mass of the plume.
The first moment provides the centroid of the plume, while the second moments provide a
measure of the spread of the plume around its centroid (Ezzedine and Rubin, 1997). This
study utilizes all three moments for the global plume estimation.
The global mass of the plume, mass, at some time t is defined by
mass tð Þ ¼ZX
gc u; tð ÞdX ð3Þ
where c(u, t) is the plume concentration at point u=(u1,. . .,und) and at time t within the
model domain, nd denotes the space dimension and is equal to 2 or 3, depending on the
type of the flow and transport model, g is the effective porosity, X denotes the aquifer
domain. The coordinate location of the center of the plume mass at time t is define by the
first moment about the origin (Freyberg, 1986; Ezzedine and Rubin, 1997)
luii tð Þ ¼ 1
mass
ZX
gc u; tð ÞuidX; ia 1;: : :; ndf g ð4Þ
where l1u1 denotes the first moment (coordinate) about the origin along the ui axis. For a
three-dimensional problem, l1u1, l1
u2, and l1u3 denote the first moments along the x, y, and
z directions, respectively, in the Cartesian coordinate system. For a two-dimensional
problem, there are only l1u1 and l1
u2, representing the first moments along the x, and y
directions, respectively.
The second moment about the center of mass, describing a measure of the spread of the
plume around its centroid, defines a spatial covariance tensor
X ¼lu1u12
: : : lu1und2
v : : : vlundu12
: : : lundund2
35
24 ð5Þ
The component terms of the spatial covariance tensor, X, at time t can be expressed as
luiuj2 tð Þ ¼ 1
mass
ZX
gc u; tð ÞuiujdX � lui1 luj
1 ; i; ja 1;: : :; ndf g ð6Þ
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6548
where l2uiuj denotes the second moment (spatial covariance) along the uiuj cross direction.
In a three-dimensional groundwater system, there are nine component terms for the spatial
covariance tensor, including u2xx, u2
xy, u2xz, u2
yx, u2yy, u2
yz, u2zx, u2
zy, and u2zz. Due to the
symmetry the second spatial moment tensor actually includes only six unknown
component terms (i.e., u2xy=u2
yx, u2xz=u2
zx, and u2yz=u2
zy). In a two-dimensional groundwater
system, the second spatial moment tensor has only three unknown component terms: u2xx,
u2xy, and u2
yy.
The components of the spatial covariance tensor are physically related to the spread of
the plume concentration distribution about its center of mass. In applications, the
magnitudes of the moments must be estimated from a discrete set of specified point
concentrations. In this study, the interpolated concentrations described above can be used
to estimate the global mass and the first and second moments for comparison with those
determined from the concentration distributions directly output from the transport model
without any interpolation.
2.3. Monitoring network design model
The goal of this study is to minimize long-term monitoring costs while preserving the
accuracy of estimated contaminant mass and plume moments based on the sampled data.
The objective function is the summation of total installation/drilling and sampling costs at
all sampled locations. The monitoring problem can be expressed as follows:
Minimize J ¼ C1
Xni¼1
xili þ C2
Xni¼1
yidi ð7Þ
subject to errormassVemass ð8Þ
errormomentVemoment ð9Þ
where, in the objective function as expressed in Eq. (7), J is the management objective in
terms of the total costs for sampling and well installation/drilling, n is the total number of
potential monitoring wells; C1 is the cost for each sampling, xi is a binary variable
indicating whether sampling takes place at well i (yes if xi=1; no if xi=0), li is the number
of sampling at different elevations for well i. If sampling takes place at well i, well i is
considered to be sampled at multiple (i.e., li) elevations. C2 is the fixed capital cost for
installation/drilling per unit depth of well i, di denotes the depth of borehole associated
with well i, and yi is a binary variable indicating whether well i is drilled (yes if yi=1; no if
yi=0).
Eqs. (8) and (9) are the constraints on the global mass estimation error and plume
moment estimation errors, respectively. Eq. (8) specifies that the error term errormass in the
estimation of the total dissolved contaminant mass must be less than or equal to the
prescribed acceptable tolerance emass. The global mass estimation error errormass can be
obtained by
errormass ¼masscal �massj
masscal
�������� ð10Þ
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 49
where masscal is the contaminant mass in the area of interest as determined by the transport
model, and massj is the mass estimate as determined by the global mass estimator based on
the sampling design j. As mentioned above, we assume that the simulated plume
represents the future conditions to be monitored. Then, masscal obtained from the transport
model is considered the btrueQ mass of contaminants dissolved in the area of interest; and
Eq. (10) defines the absolute relative error between masscal and the estimated mass, massj,
based on the interpolated plume for the sampling design j.
Eq. (9) is the first and second moment constraints for the plume along all directions,
which requires that the center and shape of the interpolated plume agrees as closely as
possible with that output from the transport model. Similar to Eq. (8), it specifies that the
weighted sum of moment estimation errors, errormoment, must be less than or equal to the
prescribed acceptable tolerance emoment. The moment error term errormoment in Eq. (9) can
be stated as
errormoment ¼X2i¼1
Xk
lki;cal � lk
i;j
lki;cal
����������weightki ð11Þ
where lcal is the first or second moment of the simulated plume as output from the
transport model, lj is the estimated moment based on the sampling design j, i denotes the
order of moment, k is the direction along which the moment is computed, and weightik is
the weight assigned to the ith moment along the k direction. The larger a weight is
assigned to a moment, the more consistent is the moment determined by the transport
model with that based on the sampling design along a specific direction. In this study, all
weights assigned to both first and second moments along different directions are set
identically to 1. As a result, the effect of different coefficients on the sampling network
design can be equivalently compared as described below in Section 4.
The goal of the monitoring network design model is to find the optimal sampling plan
from among many alternatives. Eqs. (7)–(9) define an integer-programming problem
whose solution could be obtained using the branch-and-bound or exhaustive enumeration
scheme. Unfortunately, either of these schemes would require prohibitive computational
efforts. For example, if 100 potential monitoring locations exist at a field site, there will be
2100 possible sampling designs to be evaluated. Moreover, constraints (8) and (9) involve
the interpolation procedure for global mass estimation and moment evaluations, which
makes the computational burden even more excessive. Thus, to be practically feasible, an
optimization algorithm should only require that the objective function be evaluated for a
fraction of all possible monitoring network designs before the optimal or near-optimal
solution is obtained. Lee and Ellis (1996) compared eight algorithms for nonlinear integer
optimization of two applications and considered the heuristic algorithms promising for
solving realistic groundwater monitoring network design problems. In this study, a
heuristic search technique, GA, is used for this purpose.
2.4. Implementation of the genetic algorithm
The GA is a random search technique that is designed to mimic some of the key
features of natural selection and natural genetics in order to identify the optimal or near-
optimal solutions in a specified search space. Detailed descriptions of GA can be found in
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6550
many references (e.g., Goldberg, 1989; Sen and Stoffa, 1995; Pham and Karaboga, 2000).
In recent years, GA has been shown to be a valuable tool for solving complex optimization
problems in broad fields, including the field of water resources in papers by McKinney
and Lin (1994), Ritzel et al. (1994), Cieniawski et al. (1995), Wagner (1995b), Wang and
Zheng (1997), Aly and Peralta (1999), Cai et al. (2001), Karpouzos et al. (2001), Wang
and Jamieson (2002), and Zheng and Wang (2002).
The GA used here consists of three operators: selection, crossover, and mutation. For
any monitoring network design problem, there are n potential locations. The GA considers
each sampling alternative to be a string (chromosome) consisting of n zero–one variables,
where a value of 1 in the ith digit (bit) represents sampling from the ith location, and 0 no
sampling. The total number of nonzero digits in the string denotes the number of sampling
locations used in the current design.
The first stage of GA is selection. Individual strings (sampling alternatives) are selected
and reproduced on the basis of their fitness values. Strings with high fitness will have a
higher probability of reproducing and contributing offspring to the next generation. For the
monitoring network design problem proposed in Eqs. (7)–(9), the goal is to minimize the
sampling costs while considering the accuracy of global mass estimation and moment
estimation. Thus the fitness function must account for the constraints. To accomplish that,
the fitness measure can be modified by adding the amount of any constraint violation to
the objective function as a penalty:
F ¼ J þ V1 þ V2 ð12Þ
and
V1 ¼ a1errormass � emass
emass
þ a2Nunestimate; emassN0 ð13Þ
V2 ¼ a3 errormoment � emomentð Þ ð14Þ
where F is the fitness value, V1 and V2 are the amounts of constraint violation with respect
to the global mass and moment estimation errors, respectively; Nunestimate is the number of
points at which concentration is not estimated as a result of no sampling data point within
the specified search radius, and ai (i=1, 2, 3) are penalty coefficients.
The selection procedure chooses an interim population representing possible solutions
via a random number generator. This procedure begins by picking out four individual
strings with respective fitness values of Fi (i=1, 2, 3, 4) at random from the initial
population. It then compares the bfitnessQ between F1 and F2, and between F3 and F4,
respectively. Next a copy of the two winners in this process of btournament selectionQ isplaced in a temporary bmating poolQ and subject to the crossover operator described below.The tournament selection is repeated 1
2M times to reproduce a new interim population
where M is the total number of individual strings in the population (i.e., the population
size). As a result, the size of the new interim population is the same as that of the initial
population. The selection process is a key step in the GA that embodies the idea of bfittestsurvivalQ in nature. Accordingly, the selection operator ensures that the best individual
bhas probability 1.0 of prevailing in the competition while the worst and middle-ranking
individuals have probabilities 0 and 0.5, respectivelyQ (McKinney and Lin, 1994).
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 51
Note that a string can be selected more than once and placed in the mating pool. The
next step in a GA is crossover. The crossover operator is performed on each selected pair
described above from the mating pool with a certain probability, referred to as crossover
probability, Pc. There are several ways of doing this. Some commonly used are one-point
crossover, two-point crossover, cycle crossover, and uniform crossover (Pham and
Karaboga, 2000). Since the task of sampling network design is to decide whether a
potential sampling location should be selected or not (i.e., 1 or 0), uniform crossover is
most appropriate for this study. With uniform crossover, a selected bit at a specific position
of one string is exchanged with its corresponding bit at the same position of the other
string. After crossover, a new generation is formed with the same number of strings as in
the previous generation.
The last and final genetic operator is mutation. In this procedure, all strings in the new
population are checked bit-by-bit and the bit values are randomly reversed according to a
specified probability Pm. Mutation can be carried out during the crossover operation. The
new generation formed after selection, crossover and mutation, is expected to contain
individual sampling designs with improved fitness values.
These three operators are conceptually simple. The selection seeks to retain those
feasible sampling designs with high information content. The crossover operator
contributes to improve sampling efficiency by combining the highly informative sampling
plans, and the mutation operator aims to prevent from the irrecoverable loss of information
(Wagner, 1995b). As the generation evolves, the optimal or near-optimal solution
(sampling design) is obtained. Here the number of generations a GA needs to evolve
depends on the number of potential sampling locations, the interpolation method and the
required accuracy of global mass and moment estimates.
Note that the methodology presented in this study identifies the optimal or near-optimal
sampling design for one monitoring period between times tn and tn+1. It can also be used
to solve monitoring network design problems with multiple monitoring periods. For each
monitoring period, the procedures outlined previously can be repeated for different output
times of the flow and transport model. If the ith potential monitoring well is selected for a
particular monitoring period, the capital (installation and drilling) cost is incurred, i.e., the
value of yi in Eq. (7) is 1, for that period. For all other periods, the value of yi will remain 0
since the capital cost should be counted only once. Because the fixed capital cost is
generally higher than the sampling cost at any monitoring location, the sampling design
for one monitoring period affects those of other periods. In other words, the optimal
monitoring network designs for different monitoring periods are not independent.
3. Application to a large-scale field site
The Massachusetts Military Reservation (MMR), located near the town of Falmouth in
Cape Cod, Massachusetts, was established in the early 20th century (Fig. 2). Many
activities included operation of aircraft runaways, aircraft and vehicle maintenance,
landfilling of waste materials and firefighter training have contributed to soil and
groundwater contamination on the MMR. Since MMR was added to the U.S. Environ-
mental Protection Agency’s National Priorities List as a Superfund Site in 1989, more than
Fig. 2. The elements of a pump-and-treat (PAT) system for the CS-10 TCE plume and the interpolation subdomain at the MMR in Cape Cod, Massachusetts. The gray-
scale-filled concentrations represent the peak value of all vertical layers at each horizontal location at the start of the PAT system. The solid dots represent the optimal
pumping well locations. The extracted water, after treatment, is reinjected into the infiltration trenches. The triangles along Sandwich Road denote an existing remedial
well fence (after Zheng and Wang, 2002).
J.Wuet
al./JournalofContaminantHydrology77(2005)41–65
52
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 53
US$200 million had been spent up to 1999 for soil and groundwater cleanup. The cost
estimate for entire cleanup exceeds US$800 million. Projected costs in 1999 dollars for
groundwater remediation alone are evaluated to be greater than US$300 million (Zheng
and Wang, 2002).
The Chemical-Spill 10 (CS-10) site is located in the southeast corner of MMR and was
used by the U.S. Air National Guard for vehicle maintenance and storage. Trichloro-
ethylene (TCE) is the primary contaminant at the CS-10 site. The pump-and-treat (PAT)
remedy was selected as the most feasible approach for groundwater remediation at this site
(Zheng and Wang, 2002). An optimal remedial design for the PAT system was presented
by Zheng and Wang (2002). Fig. 2 illustrates the location of the study site and the elements
of the optimal PAT system for the CS-10 plume. The goal of this study is to develop the
most cost-effective sampling network to monitor the performance of the PAT system as
designed by Zheng and Wang (2002). A total of 160 potential monitoring locations are
assumed for this study (Fig. 3). The total mass and the first and second moments for the
interpolated plume based on the 160 potential monitoring locations are in reasonably close
agreement with those calculated directly from the output of the transport model using all
nodal points within the study area.
Fig. 3. The plan view of all potential sampling locations within the interpolation domain. The concentrations
represent the peak values from model layers 12 to 16 at each horizontal location at the end of the first monitoring
period. Only those concentrations greater than the cleanup level of 5 ppb are shown. Sampling locations outside
the main plume are needed for accurate estimation of the global mass and the first and second spatial moments.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6554
The flow and transport model for the CS-10 site is based on the MODFLOW code
(McDonald and Harbaugh, 1988) and MT3DMS (Zheng and Wang, 1999b). The model
consists of 159 columns, 161 rows and 21 layers, covering an area of approximately 57
km2 (22 square miles). The horizontal grid spacing is 34 m (110 ft) in the vicinity of the
TCE plume, increasing gradually toward the boundaries. The vertical layer thickness
ranges from less than 1.5 m (5 ft) to more than 15 m (50 ft). Key input data used in the
flow and transport model are listed in Table 1. More details of the flow and transport
model for the CS-10 site can be found in Zheng and Wang (2002).
Note that the computational complexity of kriging and IDW interpolation are O(mns3)
and O(mns2), respectively (Reed et al., 2000a), where m is the number of nodes requiring
interpolation and ns is the number of sampled TCE concentrations within the model
domain. Further note that m is much larger than ns. To reduce the computational effort, a
subdomain of the original numerical grid was selected for performing global mass
estimation and moment evaluation. The horizontal subdomain consists of an 81 by 122
grid as shown in Fig. 3. Each grid cell is a square of a uniform spacing, 34 m (110 ft),
along columns and rows. Thus, for each layer, the size of the computational grid was
reduced from 25,599 to 9882 nodes within the interpolation subdomain. In this study the
interpolation subdomain was vertically restricted in five model layers (12–16 from top
down) that contained most of the TCE plume. This field application example only
considers the first monitoring period, assumed to cover the first 5 years since the
beginning of the PAT remediation.
To ensure a normal distribution, the original TCE concentrations were transformed into
logarithmic form. The type of the semivariogram structure for TCE concentrations was
approximately fitted to an exponential model (Deutsch and Journel, 1998):
r hð Þ ¼ c 1� exp � 3h
a
� �ð15Þ
where c is the positive variance contribution value or sill value, a is the effective range (a/
3 is the integral range), and h is the variable denoting the spatial distance between data
along some direction. For this study, the contaminant plume was determined to have a
longer correlated range in the longitudinal direction (along the y axis) than in the
transverse direction (along the x axis). Thus the semivariogram model used for the TCE
plume was anisotropic. The parameter c was determined to be 0.8, and the parameter a
Table 1
Primary input data used in the CS-10 TCE transport modela
Parameter Value
Porosity 0.3
Longitudinal dispersivity 11.0 m (35 ft)
Ratio of horizontal transverse to longitudinal dispersivity 0.1
Ratio of vertical transverse to longitudinal dispersivity 0.01
Effective molecular diffusion coefficient 0
Retardation factor for linear sorption 1.23
First-order decay coefficient for both dissolved and sorbed phases (half-life) 3.16�10�5 day�1 (60 years)
a Adapted from Zheng and Wang (2002).
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 55
was determined to be 1829 m (6000 ft) along the y axis and 1219 m (4000 ft) along the x
axis, respectively.
Note that the use of normal or lognormal transformation implies that the concentrations
are multi-Gaussian distributed. This assumption will result in a bdestructurizationQ effect,which means that the extreme concentrations (near-zero concentrations and plume-center
concentrations) are less spatially correlated (Reed et al., 2004). If the Gaussian assumption
is inappropriate, the assumption of less correlation between the extreme concentrations can
significantly affect the uncertainty evaluation and misrepresent the real site conditions
(Reed et al., 2004).
According to the effective range selected above and the rule recommended by Reed et
al. (2000a), the maximum anisotropic search radii for kriging were defined to be 1829 m
(6000 ft) along the y axis and 1219 m (4000 ft) along the x axis, respectively. Meanwhile,
the search radii for IDW interpolation were set equal to those for kriging to facilitate the
comparison between the two interpolation techniques. In this study, empirically the
slightly high Pm works well even though Pm was recommended as the inverse of the
population size (1/M) (Reed et al., 2000b). All of the GA operators and parameters were
set identically in this study: M=800, Pc=0.85, and Pm=0.08. The number of generations
was set to 80 for all optimization runs except for those mentioned below. In the following
section, the results will be presented and the effects of using different coefficients in
constraints (8) and (9) will be discussed.
4. Results and discussion
In this study, it is assumed that the cost of each sampling (C1li) is US$500, and the
fixed cost (installation/drilling) per well (C2di) is US$20,000. The total mass of
dissolved TCE, masscal, within the model subdomain from layer 12 to layer 16, was
583.7 kg at the end of the first 5-year monitoring period according to the transport
model. Note that the current analysis does not address the uncertainty in the flow and
transport model. The model prediction is assumed to represent the future condition in the
field. Fig. 3 shows the peak concentration value of TCE of model layers 12–16 at each
horizontal location at the end of the first monitoring period.
4.1. Comparison of OK and IDW methods
Fig. 4a shows the plan view of the near-optimal sampling design based on the OK
interpolation scheme.Theparameters for the fitness evaluation functiongiven inEqs. (12)–(14)
are set to a1=5.0�104, a2=5.0, a3=3.0�105, emass = 0.05, and emoment = 0 (parameter set 1,
or PM1). The boptimalQ design includes 55 monitoring wells, and the estimated TCE mass
within the subdomain of interest is 566.3 kg, yielding an estimation error of 2.96%. The
average estimation error for the TCE plume within the subdomain of interest is 0.43% for
the first moments and 34.35% for the second moments. Compared with the initial design
using all 160 potential sampling locations the optimization model reduces the total costs
by 65.6% within the prescribed mass estimation error. The contours of the interpolation
plume for model layer 14 based on the 55 sampling locations are shown in Fig. 4a.
Fig. 4. The comparison of the optimal sampling designs based on OK and IDW methods for parameter set PM1:
(a) OK-based design; and (b) IDW-based design. The solid triangles represent potential monitoring locations. The
circle dots denote the optimal sampling design. The solid line contours show the interpolated concentration
distribution in model layer 14 at the end of the first monitoring period, while the gray-scale contour map
represents the corresponding concentration distribution output by the transport model.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6556
Fig. 4b shows the boptimalQ sampling design of 47 monitoring wells based on the IDW
interpolation scheme; all other parameters are the same as those for kriging. Under the
IDW scheme, the boptimalQ sampling design by the GA reduces the total costs by 70.6%.
However, the total TCE mass interpolated from the 47 wells is equal to 792.1 kg, yielding
an estimation error of 35.7% relative to the calculated mass, masscal, by the transport
model. The average estimation error of the TCE plume within the subdomain of interest is
4.03% for the first moments and up to 128.31% for the second moments. The contours of
the IDW interpolated plume for model layer 14 are also shown in Fig. 4b. From a
comparison of the results for parameter set PM1 in Table 2, it can be seen that the IDW
interpolation method results in greater cost savings, however, the global plume estimation
based on the OK method is more accurate.
The amount of cost savings and accuracy of mass and moment estimation are affected
by the choice of various cost coefficients in the fitness function. When the penalty
coefficients become large enough, the monitoring costs will increase dramatically. Fig. 5
shows that the near-optimal sampling designs obtained under increased penalty
coefficients, a1=5.0�107, a2=5.0, a3=3.0�108 (PM2). Also shown in Fig. 5a and b are
the interpolated TCE plumes for model layer 14 based on the kriging and IDW schemes,
respectively. Under the OK scheme, the global mass estimated for the subdomain of
Table 2
Comparisons of the sampling designs based on OK and IDW interpolation methods and the effects of various
coefficients on optimal sampling designs
Parameter seta Interpolation No. of optimal Potential cost Global plume estimation errors (%) b
method sampling wells savings (%)errormass errormoment
1st errormoment2nd
PM1 OK 55 65.6 2.96 0.43 34.35
IDW 47 70.6 35.70 4.03 128.31
PM2 (a1=5.0�107
and a3=3.0�108)
OK 90 43.8 3.75 0.42 2.10
IDW 71 55.6 3.38 1.59 83.60
PM3 (a3=0) OK 52 67.5 2.89 1.31 61.02
PM4 (emass=0.10) OK 55 65.6 2.96 0.43 34.35
PM5 (emass=0.10
and a3=1.5�105)
OK 47 70.6 12.13 2.15 34.58
PM6 (a2=0) OK 55 65.6 2.96 0.43 34.35
PM7 (a1=0) OK 55 65.6 2.96 0.43 34.35
a PM1 is set to a1=5.0�104, a2=5.0, a3=3.0�105, emass = 0.05, and emoment = 0; and the others (PM2–PM7)
are the same as PM1 except for the specified value in the parentheses. Note that the tolerance emass is specified as
fraction not percentage.b errormoment
1st and errormoment2nd represent the average estimation errors for the first and second moments,
respectively.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 57
interest is 561.8 kg, resulting in an estimation error of 3.75%. The boptimalQ design
requires 90 monitoring wells, reducing the total costs by 43.8% relative to the total 160
potential wells. The average estimation error of the TCE plume is 0.42% for the first
moments and 2.10% for the second moments. In contrast, the global mass estimated under
the IDW scheme is 564.0 kg, yielding an estimation error of 3.38%. Only 71 monitoring
wells are required, reducing the total costs by 55.6%. However, the average estimation
errors of the TCE plume within the subdomain of interest are 1.59% and 83.61%,
respectively, for the first and second moments.
From the viewpoint of economics, the sampling network design obtained using the
IDW method is more effective than that using kriging. However, the comparison of Fig. 5a
and b indicates that the TCE plume interpolated by kriging (solid lines) agrees with that
output by the transport model (gray-scale-filled), significantly better than that interpolated
by IDW. The reason that the IDW interpolation scheme results in a greater mass estimation
error (Fig. 4b) and less accurate plume configuration (Fig. 5b) is that IDW is a lower-
accuracy interpolation scheme compared to kriging. The cost-effectiveness of a sampling
design should not be attained at the expense of less accuracy in the global mass and
moment estimates. Thus, for large-scale field applications, IDW is not suitable for use as
an interpolation scheme in optimal sampling design. The next section will discuss the
effects of fitness function parameters on the sampling design based on the OK method.
4.2. Effect of fitness function parameters on optimal sampling designs
To compare the results with and without the moment constraints, the penalty
coefficient of moment violations in PM1 is changed to a3=0 (PM3). Without the moment
constraints, the boptimalQ design by the GA includes 52 monitoring wells, and the
estimated TCE mass dissolved within the subdomain of interest is 600.6 kg, yielding an
Fig. 5. The comparison of the optimal sampling designs based on OK and IDW methods for parameter set PM2:
(a) OK-based design; and (b) IDW-based design. All symbols are the same as defined in Fig. 4.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6558
estimation error of 2.89%. Compared with the initial design containing all 160 potential
sampling locations, the near-optimal design reduces the total monitoring costs by 67.5%
within the prescribed mass estimation error. In view of this, there seems no obvious
difference between the designs with and without moment constraints. However, without
moment constraints, the average estimation error of the TCE plume within the study
subdomain is 1.31% for the first moments and 61.02% for the second moments. The
latter is nearly twice the average error of 34.35% with moment constraints. The contours
of the interpolated TCE plume for model layer 14 based on the 52 sampling locations
obtained without moment constraints are shown in Fig. 6. Compared with Fig. 4a, it is
clear that the addition of moment constraints leads to significantly closer agreement
between the interpolated plume (solid lines) and the calculated plume directly output
from the transport model (gray-scale-filled), particularly for the outer contour of 5 ppb.
Thus it is important to add the moment constraints to the optimal sampling design model.
This can be further proved as follows.
If the value of mass estimation error tolerance emass is changed from 0.05 in PM1 to 0.10
in PM4, we get an interesting result showing no difference between the two cases, which can
be seen by comparing Figs. 7a and 4a. But if the penalty coefficient for moment violations
(a3) is decreased from 3.0�105 in PM4 to 1.5�105 in PM5, there are significant differences
between the results for PM 4 and PM5. Fig. 7b shows the near-optimal sampling design
using PM5 and the interpolated TCE plume for model layer 14 as comparedwith that directly
Fig. 6. The optimal OK-based sampling design for parameter set PM3 (without moment constraints). All symbols
are the same as defined in Fig. 4.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 59
output from the transport model. The boptimalQ sampling design by the GA consists of 47
monitoring wells and the global mass estimate using the 47 sampling locations is 512.8 kg.
This yields an estimation error of 12.13%, which is larger than 2.96% for PM4 (and also
PM1). For PM5 the average estimation error of TCE plume is 2.15% for first moments and
34.58% for second moments. Recall that for PM4 (and also PM1) the average errors for the
first and second moments are 0.43% and 34.35%, respectively. While the average moment
errors for PM4 and PM5 are quite similar, the difference between the global mass estimation
errors is rather substantial due to different penalty coefficients for moment violations. The
effect of such a difference can be clearly seen by comparing Fig. 7a and b.
Furthermore, if the penalty coefficient a2 for PM1 is set to zero in PM6 (i.e., no penalty
for unestimated nodes) and the penalty coefficient a1 for PM1 is set to zero in PM7 (i.e., no
mass constraint), the boptimalQ designs by the GA for both PM6 and PM7 are identical to
that for PM1. The moment constraints, therefore, are necessary and more important than the
mass constraint for the purpose of reducing the monitoring costs while ensuring the
accuracy of contaminant mass and spatial moments based on the sampled data. Conversely,
if proper penalty coefficients for moment violations are selected, the optimal or near-
optimal design can also be obtained without imposing constraints on the mass estimate or
the number of unestimated nodes.
Fig. 7. The comparison of the optimal OK-based sampling designs with the penalty coefficient for moment
violations (a3): (a) sampling design using a3=3.0�105 (as in PM4); and (b) sampling design using a3=1.5�105
(as in PM5). All symbols are the same as defined in Fig. 4.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6560
The result for PM2 demonstrates that the optimization model can find a sampling design
that is accurate in terms of global mass estimation and moment estimation if the penalty
coefficients are several orders of magnitude (approximately three) larger than the capital
and sampling costs of a monitoring well. However, if the penalty coefficients are reduced to
approximately the same order as the capital and sampling costs, the resulting sampling
design will be more cost-effective while preserving the accuracy of global mass and
moment estimates except for the second moments (Fig. 4a). Whether the sampling design
resulting from PM1 or PM2 is used, it is up to the groundwater manager who considers the
trade-off between additional monitoring costs and more accurate mass and plume
estimations. The sampling design of 55 monitoring wells as shown in Fig. 4a seems to
possess the right balance between the cost-effectiveness and the accuracy of the mass and
moment estimations. The design as shown in Fig. 5a is more accurate in terms of the mass
and moment estimation, but it is likely too expensive to be practically useful. The effects of
objective function parameters on OK-based sampling designs are summarized in Table 2.
4.3. Discussion of computational efforts
All computations were carried out on a desktop personal computer (PC) with a 2.20-
GHz Pentium 4 processor and the Windows XP operating system. Given a flow and
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 61
transport model, the run time for a sampling design optimization problem depends on two
primary factors: one is the speed of GA convergence and the other is the choice of
interpolation method.
It is difficult to know when the optimal or near-optimal solution is attained using GA.
Few guidelines have been suggested in the literature for determining the stopping criterion
for GA. Reed et al. (2000a) pointed out that two conditions must be met before the GA
converges. The first condition is that a single subset of the potential sampling locations
must be selected by about 90% of the individuals within the last generation. The second
one requires that all of remaining sampling locations could not be sampled by more than
10% of the individuals within the last generation. Reed et al. (2000b) presented a simple
three-step method for the GA to determine the number of control parameters. They also
proposed the relationships for the population size, convergence rates and genetic drift, and
demonstrated their proposed methodology through a long-term groundwater monitoring
application (Reed et al., 2000a,b). However, these suggested rules are not directly
applicable to this study because a different GA is used. For our study, we have checked
that for PM1 the OK-based sampling design has no improvement in the objective function
from generation 58 to generation 100. For the IDW-based design, the objective function
stops improving even sooner. Thus, to ensure mature convergence, the number of
generations is set to 80 for all optimization runs in this study. Fig. 8 shows the evolutions
of the OK-based and IDW-based objective functions versus the number of generations,
respectively, for the PM1 run. Increasing the population size from 800 to 2000 does not
affect the objective function, but is much more time-consuming.
The second factor is the choice of interpolation method. Using kriging it takes an
average of 50 min for each generation to carry out 800 estimates of global mass, and first
and second moments. Completion of an optimization run requires a total run time of 67 h.
Conversely, the speed of IDW interpolation is so fast that each generation is evaluated in
an average of 7 min and an optimization run is completed in 9 h. In any case, the run time
of less than 70 h for the OK-based model is considered acceptable for real-world
Fig. 8. Evolution of the fitness function values over GA generations using OK and IDW methods under parameter
set PM1.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6562
applications, and the run time can be expected to decrease quickly as desktop computers
become more and more powerful.
5. Conclusions
A new simulation–optimization methodology is developed in this study that
combines a numerical flow and transport simulator, a global mass estimator, a plume
moment estimator, and a genetic algorithm to design cost-effective long-term
monitoring networks under general field conditions. Application of the methodology
to monitoring of plume evolution during a pump-and-treat operation at the MMR site
indicates that potential cost savings up to 65.6% may be achieved without significant
loss of accuracy in global mass and plume moment estimations. The methodology can
also be used to consider the trade-off between the conflicting demands for reduced
monitoring costs and increased accuracy in mass and plume moment estimations.
For the monitoring network design model presented in this paper, the IDW-based
interpolation method is computationally more efficient than the OK-based method and
results in more cost-effective monitoring designs. However, the OK-based method
leads to more accurate mass and plume moment estimations. The plume interpolated
by IDW based on sampled data can deviate significantly from the plume output by the
transport model. Thus the IDW-based method is not recommended for solving the
groundwater monitoring network design problem under general field conditions.
Although the optimization model using the OK-based interpolation method is more
time-consuming, it is still very feasible for today’s desktop PCs.
This study shows that the first and second moment constraints are more important
than the mass (i.e., zeroth moment) constraint. Without the moment constraints, the
plume interpolated from the sampled data cannot sufficiently match the plume output
by the transport model, thus failing the very purpose of plume monitoring. If the
penalty coefficient for the moment constraints is set appropriately, an optimal or near-
optimal design can be reached without the mass constraint. The effects of various
coefficients for the objective function on sampling design indicate that the penalty costs
can be set approximately 5–20 times the expected real monitoring costs to obtain an
optimal or near-optimal sampling design that is both cost-effective and sufficiently
accurate in terms of mass and moment estimations. Also, it is necessary to set the error
tolerances for the plume moment constraints as small as possible. Because the number
of sampled data is always much smaller than the total number of model nodes in a
numerical simulation model, it is rather difficult to reduce the second moment
estimation errors. Finally, it should be pointed out that considerable expertise and time
may be required to determine the plume semivariogram needed for kriging-based
interpolation.
It is noteworthy that the solutions to the sampling design optimization model for this
study are based on a single-objective GA. One run of the single-objective GA model can
only produce a unique sampling network design that is the most cost-effective subject to
a prescribed set of constraints. Reed et al. (2001) proposed the use of a multi-objective
GA such as the Non-dominated Sorting Genetic Algorithm (NSGA) to determine a trade-
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 63
off curve between the optimal cost-effectiveness and the reduced global plume estimation
error. The advantage of a multi-objective GA is appealing, but its applicability to a large
field-scale problem should be tested and evaluated in future studies.
Further research should consider the sampling frequency as well as other techniques of
plume interpolation. Future work will also investigate the uncertainties in the flow and
transport model so that robust monitoring strategies can be developed for the range of
possible conditions existing in the subsurface. In addition, contaminant source
identification using an existing network of monitoring wells will be addressed. Important
issues, such as whether the existing monitoring wells are adequate or where new wells
should be added, will be explored.
Acknowledgments
This study is supported in part by DuPont Company and the National Natural Science
Foundation of China (Nos. 40002022 and 40472130). We are grateful to Gaisheng Liu of
the University of Alabama who provided the initial source code for computing the spatial
moments of a three-dimensional contaminant plume. We also thank Albert Valocchi,
Patrick Reed, and two anonymous reviewers whose constructive comments have led to
significant improvement of this manuscript.
References
Adams, E.E., Gelhar, L.W., 1992. Field study of dispersion in a heterogeneous aquifer: 2. Spatial moments
analysis. Water Resour. Res. 28 (12), 3293–3307.
Ahlfeld, D.P., Mulvey, J.M, Pinder, G.F., 1988. Contaminated groundwater remediation design using simulation,
optimization, and sensitivity theory, 2. Analysis of a field site. Water Resour. Res. 24 (5), 443–452.
Aly, A.H., Peralta, R.C., 1999. Optimal design of aquifer cleanup systems under uncertainty using a neural
network and a genetic algorithm. Water Resour. Res. 35 (8), 2523–2532.
Andricevic, R., 1993. Coupled withdrawal and sampling designs for groundwater supply models. Water Resour.
Res. 29 (1), 5–16.
Andricevic, R., 1996. Evaluation of sampling in the subsurface. Water Resour. Res. 32 (4), 863–874.
Bear, J., Sun, Y., 1998. Optimization of pump-treat-inject (PTI) design for the remediation of a contaminated
aquifer: multi-stage design with chance constraints. J. Contam. Hydrol. 29, 225–244.
Bogaert, P., Russo, D., 1999. Optimal spatial sampling design for the estimation of the variogram based on a
squares approach. Water Resour. Res. 35 (4), 1275–1289.
Cai, X., Mckinney, D.C., Lasdon, L.S., 2001. Solving nonlinear water management models using a combined
genetic algorithm and linear programming approach. Adv. Water Resour. 24, 667–676.
Chien, C.C., Medina Jr., M.A., Pinder, G.F., Reible, D.R., Sleep, B.E., Zheng, C. (Eds.), 2002. Environmental
Modeling and Management: Theory, Practice and Future Directions. Today Media.
Christakos, G., Killam, B.R., 1993. Sampling design for classifying contamination level using annealing search
algorithm. Water Resour. Res. 29 (12), 4063–4076.
Cieniawski, S.E., Eheart, J.W., Ranjithan, S., 1995. Using genetic algorithm to solve a multiobjective
groundwater monitoring problem. Water Resour. Res. 31 (2), 399–409.
Culver, T.B., Shoemaker, C.A., 1992. Dynamic optimal control for groundwater remediation with flexible
management periods. Water Resour. Res. 28 (3), 629–641.
Deutsch, C.V., Journel, A.G., 1998. GSLIB: Geostatistical Software Library and User’s Guide, 2nd ed. Oxford
Univ. Press, New York.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–6564
Eggleston, J.R., Rojstaczer, S.A., Peirce, J.J., 1996. Identification of hydraulic conductivity structure in sand and
gravel aquifer: Cape Cod data set. Water Resour. Res. 32 (5), 1209–1222.
Ezzedine, S., Rubin, Y., 1997. Analysis of the Cape Cod tracer data. Water Resour. Res. 33 (1), 1–11.
Fabbri, P., 1997. Transmissivity in the geothermal Euganean basin: a geostatistical analysis. Ground Water 35 (5),
881–887.
Freyberg, D.L., 1986. A natural gradient experiment on solute transport in a sand aquifer: 2. Spatial moments and
the advection and dispersion of nonreactive tracers. Water Resour. Res. 22 (13), 2031–2046.
Garabedian, S.P., LeBlanc, D.R., Gelhar, L.W., Celia, M.A., 1991. Large-scale natural gradient tracer test in sand
and gravel, Cape Cod, Massachusetts: 2. Analysis of spatial moments for a nonreactive tracer. Water Resour.
Res. 27 (5), 911–924.
Goldberg, D.E., 1989. Genetic Algorithm in Search, Optimization and Machine Learning. Addison Wesley,
Reading, MA.
Gorelick, S.M., 1983. A review of distributed parameter groundwater management modeling method. Water
Resour. Res. 19 (2), 305–319.
Huang, C., Mayer, A.S., 1997. Pump-and-treat optimization using well locations and pumping rates as decision
variables. Water Resour. Res. 33 (5), 1001–1012.
Hudak, P.F., Loaiciga, H.A., 1992. A location modeling approach for groundwater monitoring network
augmentation. Water Resour. Res. 28 (3), 643–649.
Hudak, P.F., Loaiciga, H.A., 1993. An optimization method for monitoring network design in multilayered
groundwater flow system. Water Resour. Res. 29 (8), 2835–2845.
Isaaks, E.H., Srivastava, R.M., 1989. An Introduction to Applied Geostatistics. Oxford Univ. Press, New York.
James, B.R., Gorelick, S.M., 1994. When enough is enough: the worth of monitoring data in aquifer remediation
design. Water Resour. Res. 30 (12), 3499–3513.
Karatzas, G.P., Pinder, G.F., 1993. Groundwater management using numerical simulation and the outer-
approximation method for global optimization. Water Resour. Res. 25 (10), 2245–2258.
Karpouzos, D.K., Delay, F., Katsifarakis, K.L., de Marsily, G., 2001. A multipopulation genetic algorithm to
solve the inverse problem in hydrogeology. Water Resour. Res. 37 (9), 2291–2302.
Lavenue, A.M., Pickens, J.F., 1992. Application of a coupled adjoin sensitivity and kriging approach to calibrate a
groundwater flow model. Water Resour. Res. 28 (6), 1543–1569.
Lee, Y.-M., Ellis, J.H., 1996. Comparison of algorithms for nonlinear integer optimization: application to
monitoring network design. J. Environ. Eng. 122 (6), 524–531.
Loaiciga, H., 1989. An optimization approach for groundwater quality monitoring network design. Water Resour.
Res. 25 (8), 1771–1780.
Mayer, A.S., Kelley, C.T., Miller, C.T., 2001. Optimal design for problems involving flow and transport
phenomena in saturated subsurface systems. Adv. Water Resour. 25, 1233–1256.
McDonald, M.G., Harbaugh, A.W., 1988. A modular three-dimensional finite-difference ground water flow
model. USGS Techniques of Water Resources Investigations, Book vol. 6.
McKinney, D.C., Lin, M.D., 1994. Genetic algorithm solution of groundwater management models. Water
Resour. Res. 30 (6), 1897–1906.
McKinney, D.C., Loucks, D.P., 1992. Network design for predicting groundwater contamination. Water Resour.
Res. 28 (1), 133–147.
Meyer, P.D., Valocchi, A.J., Eheart, J.W., 1994. Monitoring network design to provide initial detection of
groundwater contamination. Water Resour. Res. 30 (9), 2647–2659.
Meyer, P.D., Eheart, J.W., Ranjithan, S., Valocchi, A.J., 1995. Design of groundwater monitoring networks for
landfills. In: Kundzewicz, Z.W. (Ed.), Proceedings of the International Workshop on New Uncertainty
Concepts in Hydrology and Water Resources. Cambridge Univ. Press, pp. 190–196.
Minsker, B.S., Shoemaker, C.A., 1998. Dynamic optimal control of in-situ bioremediation of ground water.
J. Water Resour. Plan. Manage. 124 (3), 149–161.
Montas, H.J., Mohtar, R.H., Hassan, A.E., AlKhal, F.A., 2000. Heuristic space–time design of
monitoring wells for contaminant plume characterization in stochastic flow fields. J. Contam. Hydrol.
43, 271–301.
Olea, R.A., 1999. Geostatistics for Engineers and Earth Sciences. Kluwer Academic Publishers, Boston.
Pham, D.T., Karaboga, D., 2000. Intelligent Optimization Techniques: Genetic Algorithms, Tabu Search,
Simulated Annealing and Neural Networks. Springer-Verlag, New York.
J. Wu et al. / Journal of Contaminant Hydrology 77 (2005) 41–65 65
Reed, P.M., Minsker, B.S., Valocchi, A.J., 2000a. Cost-effective long-term groundwater monitoring design using
a genetic algorithm and global mass interpolation. Water Resour. Res. 36 (12), 3731–3741.
Reed, P.M., Minsker, B.S., Goldberg, D.E., 2000b. Designing a competent simple genetic algorithm for search
and optimization. Water Resour. Res. 36 (12), 3757–3761.
Reed, P.M., Minsker, B.S., Goldberg, D.E., 2001. A multiobjective approach to cost effective long-term
groundwater monitoring using an elitist nondominated sorted genetic algorithm with historical data. J.
Hydroinform. 3 (2), 71–89.
Reed, P.M., Ellsworth, T.R., Minsker, B.S., 2004. Spatial interpolation methods for nonstationary plume data.
Ground Water 42 (2), 190–202.
Ridley, M.N., Johnson, V.M., Tuckfield, R.C., 1995. Cost-Effective Sampling of Groundwater Monitoring Wells.
Lawrence Livermore National Laboratory, Livermore, CA. UCRL-JC-118909.
Ritzel, B.J., Eheart, J.W., Ranjithan, S., 1994. Using genetic algorithms to solve a multiple-objective groundwater
pollution containment problem. Water Resour. Res. 30 (5), 1589–1603.
Rizzo, D.M., Dougherty, D.E., 1996. Design optimization for multiple management period groundwater
remediation. Water Resour. Res. 32 (8), 2549–2561.
Rouhani, S., 1985. Variance reduction analysis. Water Resour. Res. 21 (6), 837–846.
Rouhani, S., Hall, T.J., 1988. Geostatistical schemes for groundwater sampling. J. Hydrol. 103, 85–120.
Sen, M., Stoffa, P.L., 1995. Global Optimization Methods in Geophysical Inversion. Elesevier Science
Publishers B.V.
Smalley, J.B., Minsker, B., Goldberg, D.E., 2000. Risk-based in situ bioremediation design using a noisy genetic
algorithm. Water Resour. Res. 36 (10), 3043–3052.
Srivastava, R., Brusseau, M.L., 1996. Nonideal transport of reactive solutes in heterogeneous porous media: 1.
Numerical model development and moment analysis. J. Contam. Hydrol. 24, 117–173.
Tiedeman, C., Gorelick, S.M., 1993. Analysis of uncertainty in optimal groundwater contaminant capture design.
Water Resour. Res. 29 (7), 2139–2154.
Vereecken, H., Doring, U., Hardelauf, H., Jaekel, U., Hashagen, U., Neuendorf, O., Schwarze, H., Seidemann, R.,
2000. Analysis of solute transport in a heterogeneous aquifer: the Krauthausen field experiment. J. Contam.
Hydrol. 45, 329–358.
Vogely, W.A., Sani, E., Monzon, P.G., 1978. The application of kriging methods to oil and gas resource
estimation. Dept. Mineral Economics, Pennsylvania State Univ., submitted to the U.S. Dept. Energy.
Vucetic, S., Fiez, T., Obradovic, Z., 2000. Examination of the influence of data aggregation and sampling density
on spatial estimation. Water Resour. Res. 36 (12), 3721–3730.
Wagner, B.J., 1995a. Recent advances in simulation–optimization groundwater management modeling. U.S.
national report to international union of geodesy and geophysics 1991–1994, Review of Geophysics,
pp. 1021–1028. Supplement.
Wagner, B.J., 1995b. Sampling design methods for groundwater modeling under uncertainty. Water Resour. Res.
31 (10), 2581–2591.
Wagner, B.J., Gorelick, S.M., 1989. Reliable aquifer remediation in the presence of spatially variable hydraulic
conductivity: from data to design. Water Resour. Res. 25 (10), 2211–2225.
Wang, C.G., Jamieson, D.G., 2002. An objective approach to regional wastewater treatment planning. Water
Resour. Res. 38 (3).
Wang, M., Zheng, C., 1997. Optimal remediation policy selection under general conditions. Ground Water 35 (5),
757–764.
Zheng, C., Wang, P.P., 1999a. An integrated global and local optimization approach for remediation system
design. Water Resour. Res. 35 (1), 137–148.
Zheng, C., Wang, P.P., 1999b. MT3DMS: a modular three-dimensional multispecies transport model for
simulation if advection, dispersion and chemical reactions of contaminants in ground water systems:
documentation and user’s guide. Contract Report SERDP-99-1, U.S. Army Engineer Research and
Development Center, Vicksburg, Mississippi (available at http://www.hydro.geo.ua.edu/mt3d).
Zheng, C., Wang, P.P., 2002. A field demonstration of the simulation–optimization approach for remediation
system design. Ground Water 40 (3), 258–265.
Zhu, X.Y., Xu, S.H., Zhu, J.J., Zhou, N.Q., 1997. Study on the contamination of fracture-karst water in Boshan
District, China. Ground Water 35 (3), 538–545.