predicting groundwater level changes using grace … · predicting groundwater level changes using...

13
Predicting groundwater level changes using GRACE data Alexander Y. Sun 1 Received 31 December 2012 ; revised 7 July 2013 ; accepted 13 July 2013. [1] The purpose of this work is to investigate the feasibility of downscaling Gravity Recovery and Climate Experiment (GRACE) satellite data for predicting groundwater level changes and, thus, enhancing current capability for sustainable water resources management. In many parts of the world, water management decisions are traditionally informed by in situ observation networks which, unfortunately, have seen a decline in coverage in recent years. Since its launch, GRACE has provided terrestrial water storage change (DTWS) data at global and regional scales. The application of GRACE data for local-scale groundwater resources management has been limited because of uncertainties inherent in GRACE data and difficulties in disaggregating various TWS components. In this work, artificial neural network (ANN) models are developed to predict groundwater level changes directly by using a gridded GRACE product and other publicly available hydrometeorological data sets. As a feasibility study, ensemble ANN models are used to predict monthly and seasonal water level changes for several wells located in different regions across the US. Results indicate that GRACE data play a modest but significantly role in the performance of ANN ensembles, especially when the cyclic pattern of groundwater hydrograph is disrupted by extreme climate events, such as the recent Midwest droughts. The statistical downscaling approach taken here may be readily integrated into local water resources planning activities. Citation : Sun, A. Y. (2013), Predicting groundwater level changes using GRACE data, Water Resour. Res., 49, doi:10.1002/ wrcr.20421. 1. Introduction [2] Groundwater is the primary source of drinking and irrigation water supplies in many parts of the world. Shal- low groundwater influences 22–32% of global land area [Fan et al., 2013]. In the US, groundwater accounts for 20% of the total water withdrawal, providing more than 90% of water used for rural domestic supplies and 40% used for public supplies [Barber, 2009]. Freshwater aqui- fers are increasingly stressed by population growth and cli- mate extremes, which are of particular concern for arid and semiarid regions that rely exclusively on groundwater. Worldwide, the pace of groundwater depletion has more than doubled during the last several decades, caused pri- marily by increasing water demand [Konikow, 2011; Wada et al., 2011, 2012]. Thus, to manage the limited ground- water resources in a way that will ensure equitable, sustain- able, and economically prudent decisions, we must enhance our existing capability to monitor and predict water availability. Although in situ monitoring networks provide high-resolution estimates, remotely sensed data constitute the only source of information for assessing water resources in sparsely monitored regions [Alsdorf et al., 2007; National Research Council, 2008]. For this reason, the Gravity Recovery and Climate Experiment (GRACE) satellite mission has attracted much attention since its launch over a decade ago. GRACE observes temporal variations of Earth’s gravitational potential which, after removing atmospheric and oceanic effects, is mostly caused by changes in terrestrial water storage or DTWS [Tapley et al., 2004]. [3] Remote sensing of groundwater storage changes can- not be done directly using the current technologies [Becker, 2006; Brunner et al., 2007]. In the case of the GRACE sat- ellite, it only tracks DTWS but not changes of individual hydrologic components (e.g., surface water, soil moisture, and groundwater). By far, the most popular approach for inferring groundwater storage changes from GRACE DTWS has been to isolate and remove contributions of all other TWS components using either auxiliary information or land surface models. The foci of many previous studies were on comparison between the groundwater storage change obtained by disaggregating GRACE data and that by using in situ data, with the main purpose of validating the accuracy of GRACE data at either regional or continen- tal scales [e.g., Henry et al., 2011; Leblanc et al., 2009; Rodell et al., 2009, 2007; Scanlon et al., 2012a; Swenson et al., 2008; Syed et al., 2008; Yeh et al., 2006]. Practical application of GRACE data for local water resources man- agement, especially nowcasting and forecasting, has been limited. A fundamental issue is related to the low spatial Additional supporting information may be found in the online version of this article. 1 Jackson School of Geosciences, Bureau of Economic Geology, Univer- sity of Texas at Austin, Austin, Texas, USA. Corresponding author: A. Y. Sun, Jackson School of Geosciences, Bureau of Economic Geology, University of Texas at Austin, University Station, Austin, TX 78713, USA. ([email protected]) ©2013. American Geophysical Union. All Rights Reserved. 0043-1397/13/10.1002/wrcr.20421 1 WATER RESOURCES RESEARCH, VOL. 49, 1–13, doi :10.1002/wrcr.20421, 2013

Upload: lytruc

Post on 06-Sep-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

Predicting groundwater level changes using GRACE data

Alexander Y. Sun1

Received 31 December 2012; revised 7 July 2013; accepted 13 July 2013.

[1] The purpose of this work is to investigate the feasibility of downscaling GravityRecovery and Climate Experiment (GRACE) satellite data for predicting groundwater levelchanges and, thus, enhancing current capability for sustainable water resourcesmanagement. In many parts of the world, water management decisions are traditionallyinformed by in situ observation networks which, unfortunately, have seen a decline incoverage in recent years. Since its launch, GRACE has provided terrestrial water storagechange (DTWS) data at global and regional scales. The application of GRACE data forlocal-scale groundwater resources management has been limited because of uncertaintiesinherent in GRACE data and difficulties in disaggregating various TWS components. In thiswork, artificial neural network (ANN) models are developed to predict groundwater levelchanges directly by using a gridded GRACE product and other publicly availablehydrometeorological data sets. As a feasibility study, ensemble ANN models are used topredict monthly and seasonal water level changes for several wells located in differentregions across the US. Results indicate that GRACE data play a modest but significantlyrole in the performance of ANN ensembles, especially when the cyclic pattern ofgroundwater hydrograph is disrupted by extreme climate events, such as the recent Midwestdroughts. The statistical downscaling approach taken here may be readily integrated intolocal water resources planning activities.

Citation: Sun, A. Y. (2013), Predicting groundwater level changes using GRACE data, Water Resour. Res., 49, doi :10.1002/wrcr.20421.

1. Introduction

[2] Groundwater is the primary source of drinking andirrigation water supplies in many parts of the world. Shal-low groundwater influences 22–32% of global land area[Fan et al., 2013]. In the US, groundwater accounts for20% of the total water withdrawal, providing more than90% of water used for rural domestic supplies and 40%used for public supplies [Barber, 2009]. Freshwater aqui-fers are increasingly stressed by population growth and cli-mate extremes, which are of particular concern for arid andsemiarid regions that rely exclusively on groundwater.Worldwide, the pace of groundwater depletion has morethan doubled during the last several decades, caused pri-marily by increasing water demand [Konikow, 2011; Wadaet al., 2011, 2012]. Thus, to manage the limited ground-water resources in a way that will ensure equitable, sustain-able, and economically prudent decisions, we mustenhance our existing capability to monitor and predictwater availability. Although in situ monitoring networks

provide high-resolution estimates, remotely sensed dataconstitute the only source of information for assessingwater resources in sparsely monitored regions [Alsdorfet al., 2007; National Research Council, 2008]. For thisreason, the Gravity Recovery and Climate Experiment(GRACE) satellite mission has attracted much attentionsince its launch over a decade ago. GRACE observestemporal variations of Earth’s gravitational potentialwhich, after removing atmospheric and oceanic effects, ismostly caused by changes in terrestrial water storage orDTWS [Tapley et al., 2004].

[3] Remote sensing of groundwater storage changes can-not be done directly using the current technologies [Becker,2006; Brunner et al., 2007]. In the case of the GRACE sat-ellite, it only tracks DTWS but not changes of individualhydrologic components (e.g., surface water, soil moisture,and groundwater). By far, the most popular approach forinferring groundwater storage changes from GRACEDTWS has been to isolate and remove contributions of allother TWS components using either auxiliary informationor land surface models. The foci of many previous studieswere on comparison between the groundwater storagechange obtained by disaggregating GRACE data and thatby using in situ data, with the main purpose of validatingthe accuracy of GRACE data at either regional or continen-tal scales [e.g., Henry et al., 2011; Leblanc et al., 2009;Rodell et al., 2009, 2007; Scanlon et al., 2012a; Swensonet al., 2008; Syed et al., 2008; Yeh et al., 2006]. Practicalapplication of GRACE data for local water resources man-agement, especially nowcasting and forecasting, has beenlimited. A fundamental issue is related to the low spatial

Additional supporting information may be found in the online version ofthis article.

1Jackson School of Geosciences, Bureau of Economic Geology, Univer-sity of Texas at Austin, Austin, Texas, USA.

Corresponding author: A. Y. Sun, Jackson School of Geosciences,Bureau of Economic Geology, University of Texas at Austin, UniversityStation, Austin, TX 78713, USA. ([email protected])

©2013. American Geophysical Union. All Rights Reserved.0043-1397/13/10.1002/wrcr.20421

1

WATER RESOURCES RESEARCH, VOL. 49, 1–13, doi:10.1002/wrcr.20421, 2013

(>150,000 km2) and temporal (>10 days) resolution ofGRACE [Rodell et al., 2007]. Another compounding issueis the quantification of uncertainties arising from postpro-cessing and disaggregation GRACE signals.

[4] The raw GRACE data are sets of spherical harmoniccoefficients describing monthly variations of Earth’s grav-ity field. Postprocessing needs to be done to remove sys-tematic errors (i.e., destriping) and random errors in higher-degree spherical harmonic coefficients not removed by des-triping [Landerer and Swenson, 2012]. The postprocessingoperations, however, also cause signal degradation. Disag-gregation of GRACE data into contributions of differenthydrological components tends to introduce additionaluncertainties, as summarized in Seo et al. [2006]. Quantifi-cation of the uncertainties arising from aggregating (in lat-eral directions) and disaggregating (in the verticaldirection) different TWS components is challenging, espe-cially when large-scale hydrological models are used orwhen in situ observations are averaged over large areas. Inthe former case, rigorous uncertainty analyses wouldrequire probability distributions of inputs, which are hardto assess in practice. Simple techniques for estimationunder uncertainty might be useful. For example, Sun et al.[2010] applied a robust least squares approach to estimateaquifer specific yield, which requires only bounds of theDTWS components.

[5] At the present time, incorporating GRACE datadirectly into local water management decision makingactivities, such as groundwater allocation and drought man-agement, is unlikely to be fruitful for aforementioned chal-lenges. But can we use the GRACE product indirectly toinform water management at subgrid scales? This questionis equivalent to testing the feasibility of downscalingGRACE data.

[6] Dynamic downscaling techniques are model based[Fowler et al., 2007]. Recently, Houborg et al. [2012]assimilated GRACE data into the Catchment Land SurfaceModel using ensemble Kalman smoother and forcing datafrom North American and Global Land Data AssimilationSystems Phase 2 (NLDAS-2). NLDAS (1/8� and hourly)consists of several land surface models that cover centralNorth America and is constructed from gauge-basedobserved precipitation, bias-corrected shortwave radiation,and surface meteorology analysis fields. The results ofHouborg et al. [2012] show statistically significantimprovement in drought prediction skills over many partsof the continental US, highlighting the potential value ofGRACE data. Sun et al. [2012] imposed GRACE observa-tions as constraints when recalibrating a regional-scalegroundwater model such that the model-simulated DTWSshould agree with the GRACE-derived DTWS. In theircase, the regional aquifer model was originally developedfor long-term water management and transforming themodel for transient simulation at monthly intervals waschallenging, requiring significant more spatial and temporalinformation on groundwater use, recharge rates, and bound-ary fluxes.

[7] As an alternative to model-based downscaling,statistical downscaling techniques explore empiricalrelationships between large-scale variables (predictors) andlocal-scale variables of interest (predictands). The empiri-cal relationship is usually given in the form of a nonpara-

metric model or, transfer function, that is ‘‘learned’’ byperforming linear or nonlinear regression on training datasets. Once trained, the transfer function provides a func-tional mapping between predictors and predictands that canbe used to facilitate cross-scale information exchange.Compared to dynamic downscaling, statistical downscalingis data driven and is much less time consuming to develop[Fowler et al., 2007].

[8] The main hypothesis of this study is that the GRACEDTWS may serve as a useful predictor for water levelchanges, in the absence of continuous in situ water levelobservations. Revolving around this hypothesis, the objec-tives of this study are to (a) develop statistical downscalingmodels for incorporating GRACE DTWS to predict waterlevel changes and (b) validate the usefulness of statisticaldownscaling, or the ‘‘data worth’’ of GRACE, for wellslocated in different geographic and climatic regions. Heredata worth refers to the contribution of additional observa-tions in reducing predictive uncertainty [Brunner et al.,2012]. In addition to local water resources managementneeds, this study is also motivated by concerns over thedwindling coverage of in situ monitoring networks, whichhave been traditionally used to gauge the wellbeing of aqui-fers, as well as to provide support for local water manage-ment decisions (observation wells that support aquifermanagement decision making are widely known as indexwells). The spatiotemporal coverage of in situ monitoringnetworks depends critically on dispensable local resources.Currently, many water agencies are facing increasing con-straints because of budget cuts [Nelson, 2012]. Moreover,even with in situ well networks in operation, holisticgroundwater assessments can be difficult to perform on asystematic basis because political and aquifer boundariestypically do not coincide with each other [Rodell et al.,2009]. Thus, insights gained from this analysis by using thefreely available GRACE product will be of broad interestto the groundwater community.

[9] The remainder of the paper is organized as follows.Section 2 describes data used and data processing techni-ques. Section 3 discusses methodologies used for statisticaldownscaling and forecasting. Section 4 shows results fromtwo sets of downscaling tests. Test I looks at three wellslocated in different aquifers and climate regions in the U.S.Test II zooms into a single GRACE pixel and further vali-dates the feasibility of GRACE downscaling for multiplesites at subgrid scale. Finally, section 5 summarizes mainfindings of this study.

2. Data and Data Processing

[10] The gridded GRACE product (1� or �110 km) usedin this study is based on Release-5 of GRACE Level-2data, which has improved accuracy over its precursors dueto new data processing algorithms [Landerer and Swenson,2012]. As mentioned before, filtering and truncation ofGRACE spherical harmonic coefficients result in signalattenuation and leakage, which need to be correctedthrough leakage and bias correction. Landerer and Swenson[2012] used scaling to restore signal amplitudes. The spa-tially distributed scaling field is obtained based on simula-tion outputs from the Global Land Data AssimilationSystem (GLDAS). The scaling approach does not seek to

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

2

match GRACE measurements to GLDAS model ampli-tudes; rather, it uses synthetic model patterns to determinerelative signal attenuation based on the ratio of true and fil-tered signal amplitudes. The availability of this griddedproduct will enable users who do not possess necessarybackground in GRACE data processing to extract and useDTWS time series. The data set is being updated regularlyand can be downloaded from Jet Propulsion Laboratory(JPL)’s ftp site [JPL, 2012]. Unless otherwise noted, theperiod of study spans from March 2003 (the beginningmonth of the gridded data set) to August 2012 in monthlyintervals.

[11] Two tests are designed to validate the concept andsignificance of GRACE-based downscaling. Test I exam-ines three observation wells located in Texas (Well-TX),Nebraska (Well-HP), and Illinois (Well-IL), respectively(Figure 1). The wells were chosen because (a) all of themhave continuous water level records during the study periodand (b) at the regional scale, good correlation (>0.45) hasbeen found between GRACE-derived groundwater storagechanges and those estimated based on in situ water leveldata for all three aquifers [Scanlon et al., 2012b; Sun et al.,2010; Yeh et al., 2006]. Further background informationpertaining to the three wells is provided below.

[12] Well-TX is located in Reeves County, Texas, and iscompleted in the unconfined Pecos Valley alluvial aquifer.The average water level elevation is about 140 ft [1ft¼ 0.3048 m] below land surface. According to TexasWater Development Board (TWDB), more than 80% ofgroundwater pumped from the Pecos Valley aquifer is usedfor irrigation and the rest is withdrawn for municipal andindustrial uses [TWDB, 2012b]. The groundwater level datawere queried from TWDB’s groundwater database [TWDB,2012a]. Both Well-HP and Well-IL belong to the US Geo-logic Survey (USGS)’s Active Groundwater Level network[USGS, 2013]. The former is located in Dundy County,Nebraska, and is completed in the High Plains aquifer,while the latter is located in Vermilion County, Illinois,

and is completed in a shallow glacial aquifer. Climate ofthe High Plains is semiarid. The unconfined High Plains aq-uifer is the largest freshwater aquifer in the US. Extensiveirrigation in recent decades has led to significant decreasein regional groundwater levels from the predevelopmentera (1950s) [Scanlon et al., 2012b; Sophocleous, 2005].The average water level of Well-HP is 133 ft below landsurface during the study period. In comparison, water tableof the unconfined Illinois aquifer is relatively shallow andthe average water level of Well-IL is only 22 ft below landsurface. Rodell and Famiglietti [2001] showed that thegroundwater storage change in Illinois is equal in magni-tude to soil moisture change, and the two TWS componentsaccount for most of the TWS variations in Illinois relativeto snow and surface water.

[13] All three Test I wells exhibit a certain degree ofseasonality and mild nonstationarity (see supportinginformation S1, Figure S1). To quantify the correlationbetween GRACE DTWS and groundwater level changes,the high-pass, Hodrick-Prescott filter (HP filter) was appliedto remove long-term trends from water level data (seesection A).

[14] Test II further looks into seven continuously moni-tored wells that are all located within a 1� GRACE pixel at101–102�W and 40–41�N (Figure 2). The area, which is inthe Republican River Basin, encompasses several Nebraskacounties, including Dundy County where Well-HP (i.e.,Benkelman Well in Figure 2) is located. Nebraska is under-lain by the largest volume of water-in-storage of the HighPlains aquifer [McGuire et al., 2012]. Most of the wellsselected for Test II are completed in the Ogallala Formationof the High Plains aquifer. One exception is Haigler Well,which is completed in a relatively shallow sand-and-gravelaquifer. The area is covered by a relatively dense in situwell network and, thus, offers a unique opportunity to fur-ther validate the usefulness of GRACE downscaling formultiple sites located within the same GRACE pixel. Allwell hydrographs were downloaded from USGS web site

Figure 1. Locations of wells considered in the first test. The background map shows the total ground-water withdrawal (in 106 gals/d) estimated by USGS based on 2005 water use data in the US (Source:National Atlas, http://www.nationalatlas.gov).

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

3

[USGS, 2013]. More information on Test II well locationsand completion depths is provided in Table S1.

[15] In addition to GRACE DTWS, other hydrometeoro-logical predictors used in this study include the Parameter-elevation Regressions on Independent Slopes Model(PRISM) total monthly precipitation and average minimumand maximum temperature data (4 km resolution) [PRISMClimate Group, 2012], which were extracted for all welllocations for the study period.

3. Methodology

[16] The statistical downscaling method adopted in thisstudy is artificial neural network (ANN), which has beenwidely used in streamflow forecasting and water resourcesmanagement [e.g., ASCE, 2000a, 2000b; Chang and Chang,2006; House-Peters and Chang, 2011; Hsu et al., 1995;Maier et al., 2010; Moradkhani et al., 2004]. A majorstrength of the ANN lies in its universal approximation prop-erty—an ANN model with a single hidden layer can betrained to approximate the causal relationship of any nonlin-ear dynamic system without a priori assumptions about theunderlying physical system. Another important attribute ofANN is its built-in capability to adapt to changes from theoriginal environment [Haykin, 2009]. Thus, trained ANNmodels are relatively robust to data noise and are well suitedto support real-time decision making.

[17] In the context of water level prediction, Coulibalyet al. [2001] developed ANNs to predict the groundwater

level by using in situ hydrometeorological observationsand antecedent groundwater levels from the same well asinputs. Coppola et al. [2003, 2005a] applied ANNs to pre-dict groundwater levels in response to variable pumpingand weather conditions. Feng et al. [2008] developedANNs to simulate regional groundwater levels affectedby human activities by incorporating population andmonthly irrigation data as additional input variables. Chuand Chang [2009] and Coppola et al. [2007] integratedANN in an optimization framework for groundwaterresources planning, in which ANN models were trained asa surrogate of process models for predicting groundwaterlevels under different pumping conditions. The modelsdeveloped in this study differ from previous studies inthat the GRACE DTWS is used in lieu of antecedentgroundwater levels as a predictor for 1 month ahead pre-diction. In other words, the models are developed for sit-uations in which in situ monitoring is disrupted. Becauseof the GRACE data, the ANN models predict water levelchanges, instead of absolute water levels. Water agenciesare often interested in the former quantity for water avail-ability analysis [McGuire et al., 2012], although estima-tion of the long-term mean can also be accomplishedusing a number of existing regression methods [e.g.,Hengl et al., 2004] and typically has less requirement onspatiotemporal frequency of data. For completeness, theANN algorithm is briefly reviewed below. More detailson ANN algorithms can be found in Maier et al. [2010]and Haykin [2009].

Figure 2. Map of wells considered in Test II, where all wells are located in the same GRACE pixel inNebraska. Well names are labeled above the wells.

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

4

3.1. Multilayer Perceptron (MLP) Networks

[18] Time series prediction problems seek to learn afunctional mapping between a set of predictors x and thetarget variable y (assumed here as a scalar variable)

y ¼ f xð Þ þ E; ð1Þ

where f is the mapping and E is process noise. In this work,MLP networks are used to learn the functional mapping.MLP network is a type of feedforward ANN consisting ofan input layer, one or more hidden layers, and an outputlayer, with each layer consisting of one of more neurons.All MLPs developed in this work are single-hidden-layerMLPs. Construction of an MLP proceeds by connectinglayers one-by-one through a series of transformations. Inthe first step, connections between the hidden layer andinput variables are established. Let xif gM

i¼1 denote a set ofM predictors. The hidden layer consists of K hidden neu-rons and each is a weighted sum of predictors [Bishop,2006]:

ak ¼XMi¼1

w 1ð Þki xi þ w 1ð Þ

k0 ; k ¼ 1; :::;K; ð2Þ

in which ak is a hidden neuron, w1ð Þ

ki

n oM

i¼1are unknown

weights associated with each input neuron, and w1ð Þ

k0 is anunknown bias term used for correcting the estimation bias.The superscripts in equation (2) denote the layer number.In the second step, equation (2) is passed to a transfer func-tion to yield outputs from hidden neurons

zk ¼ akð Þ; k ¼ 1; :::;K; ð3Þ

where zk are outputs, and is the transfer function. A com-monly used transfer function for the hidden layer is thelogistic sigmoid function, which ranges in [0, 1] and acts asa compressing function. Finally, connections from the hid-den layer to the output layer are established via a lineartransfer function

yj ¼XK

k¼1

w 2ð Þjk zk þ w 2ð Þ

j0 ; ð4Þ

where yj j ¼ 1; :::; Jð Þ are output neurons (i.e., model pre-

dictions), and w 2ð Þjk

n oK

k¼1and w 2ð Þ

j0 are the unknown weights

and the bias term of the output layer, respectively. Thenumber of output neurons is always one for the currentstudy. In the training phase, the unknowns in equations (2)and (4) are solved through backpropagation, which is a pro-cess of propagating fitting errors backward through the net-work to obtain the optimal weights in each layer. TheMatlab Neural Network Toolbox [Demuth et al., 2008] wasused to develop and train all MLP networks for this study.

3.2. Performance Measures

[19] Performance of the developed MLPs is quantifiedusing several criteria. Root mean square error (RMSE)measures the global fitness of a predictive model

RMSE ¼ 1

Nv

XNv

i¼1

yi � oið Þ2 !1=2

; ð5Þ

where y and o are predicted and observed values, respec-tively; and Nv is the number of target data used for testing.The scaled RMSE, R�, is defined as the ratio betweenRMSE and the standard deviation of observations �o

R� ¼ RMSE

�o; ð6Þ

which can vary between zero and a large positive value.The correlation coefficient (R) is a measure of how futureoutcomes are likely to be predicted by the model and isequivalent to the sample cross correlation between pre-dicted and observed values

R ¼

XNv

i¼1

yi � yð Þ oi � oð ÞffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiXNv

i¼1

yi � yð Þ2vuut

ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiXNv

i¼1

oi � oð Þ2vuut

; ð7Þ

where the overbar denotes mean values. The maximumabsolute error (MAE) is defined as

MAE ¼ 1

Nv

XNv

i¼1

jyi � oij: ð8Þ

[20] Finally, the Nash-Sutcliff efficiency (NSE), rangingfrom �1 to 1, measures the predictive skill of a model rel-ative to the mean of observations,

NSE ¼ 1�

XNv

i¼1

yi � oið Þ2

XNv

i¼1

oi � oð Þ2: ð9Þ

[21] In the literature, a predictive model is said toachieve (a) very good performance if the resulting NSE isgreater than 0.75 and R� is less than 0.50, (b) good per-formance if NSE is greater than 0.65 and R� is less than0.6, and (c) satisfactory performance if NSE is greater than0.5 and R� is less than 0.70 [Moriasi et al., 2007]. Theseranking criteria are used here as a convenient measure forcomparing the relative performance of different models,although readers should be aware that they are context andapplication dependent.

3.3. ANN Model Structure

[22] Figure 3a shows a representative MLP networkstructure for 1 month ahead prediction. The input variablesx includes six variables and the target variable y is waterlevel change, Dh. Both Well-TX and Well-HP MLPs usethe network structure shown in Figure 3a. For Well-IL, anadditional antecedent series, G t � 2ð Þ, is included. TheMLP structures were determined to maintain a balancebetween model parsimony and performance. Recall that themain purpose of this work is to test the data worth of

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

5

GRACE for fulfilling the role of in situ data. Therefore, theantecedent Dh is not included as a predictor when formulat-ing 1 month ahead prediction MLPs. More details on thedesign and training of MLPs are provided in section 4.

[23] Water managers are often interested in water levelchanges at the seasonal scale for operational planning. Astrategy proposed by Chang et al. [2007] is used here formultimonth forecast. The prediction proceeds 1 month at atime, and each time the intermediate output is used toinform the next prediction. This multistep prediction strat-egy is referred to as serial propagation by Chang et al.[2007], who found that inclusion of outputs from interme-diate steps not only improved the accuracy, but also reli-ability of predictions. Thus, output from 1 month aheadprediction is used as a predictor when performing 2 monthahead prediction (Figure 3b). Similarly, both 1 and 2 monthahead Dh predictions are used as inputs for the 3 monthahead prediction. A separate MLP network is developedand trained for each additional prediction step.

3.4. Relative Importance of Input Variables

[24] In the literature, several metrics for quantifying therelative importance of ANN input variables exist [Gevreyet al., 2003; Olden and Jackson, 2002; Olden et al., 2004],some of which have been applied in previous water levelprediction studies. For example, Coppola et al. [2003] usedthe backward stepwise elimination, in which the model per-formance between excluding and including a certain predic-tor is compared. However, this classical stepwise method(either forward addition or backward elimination) is limited

because the ‘‘necessity to use a new model for each variableselection skews the results’’ [Gevrey et al., 2003]. A compar-ative study of Olden et al. [2004] examined the performanceof nine relative importance metrics on a synthetic data setwith known correlation structure. They concluded that a con-nection weight method which uses raw input-hidden andhidden-output connection weights in the trained ANN pro-vides the best methodology for accurately quantifying vari-able importance. The application of the connection weightmethod proceeds in three steps. Let the input-hidden connec-tion weight matrix be denoted as IWK�M and the hidden-output connection weight vector as lw1�K , where K is thenumber of hidden neurons, and M is the number of input var-iables (see Figure 3a). Each column in IW is first multipliedby lw (element-wise multiplication) to give a product matrixCK�M . Summing over rows of C gives an importance vector.The relative importance of the ith input variable is thendefined as follows

reli ¼jvijX

i

jvij; i ¼ 1; � � �K; ð10Þ

where vif gKi¼1 are elements of v. The connection weight

method is straightforward to implement and will be usedhere to rank the relative importance of input variables.

3.5. Ensemble Generation

[25] Ensemble techniques provide a means for quantify-ing uncertainty and improving generalization and stability

Figure 3. Structure of MLP used for (a) 1 month ahead and (b) 2 month ahead predictions. In Figure3a, the input layer consists of six neurons, which are antecedent precipitation (P), minimum and maxi-mum temperature (Tmin and Tmax), and GRACE DTWS (G). In Figure 3b, the input layer consists ofseven neurons. The predictand is water level change (Dh) ; t and tþ 1 indicate 1 and 2 month ahead; IWand lw denote connection weights; and represents the sigmoid transfer function.

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

6

of ANN models, which becomes important when the size oftraining set is small such as in the current problem. Severalmethods exist for generating ANN ensembles, includingbagging [Breiman, 1996], boosting [Freund and Schapire,1995], and randomization of initial weights [Dietterich,2000b; Opitz and Maclin, 1999]. Bagging is based on boot-strapping original training sets to generate different realiza-tions; a network model is then trained for each realization.Boosting is a technique for producing a series of predictorstrained with a different distribution of the original trainingdata. Initial weight randomization simply uses different ini-tial weights to train and generate an ensemble of networks.The difference in ensemble members reflects uncertaintycaused by the limited training sample size and the local na-ture of ANN training algorithms. Previous studies suggestthat the initial weight randomization method can yieldresults as accurate as the more sophisticated bagging andboosting methods [Opitz and Maclin, 1999; Shu and Burn,2004]. Thus, it is used to create ensembles in this work.

[26] After the ensemble models are generated, they arecombined to generate ensemble outputs. Common methodsinclude simple averaging, weighted averaging, and stacking,a survey of which is provided in Zhou [2012]. Because thecurrent problem involves ensemble models of similar per-formances (i.e., homogeneous learners), the computationallyefficient simple averaging is used, in which all members areassigned equal weights. Regardless of the method used to

construct and combining the ensembles, the generalizationability of a network ensemble is often better than that of asingle model [Dietterich, 2000a; Zhou, 2012].

4. Results and Discussion

4.1. Test I

[27] Correlation analysis shows a positive correlationbetween GRACE DTWS (G tð Þ) and water level change(Dh tð Þ) for all three wells (Figures 4a–4c), with a Spear-man’s correlation coefficient of 0.33, 0.57, and 0.73,respectively, for Well-TX, Well-HP, and Well-IL. The cor-relation is the strongest for Well-IL because of its relativelyshallow depth, while intermittent temporal lags betweenG tð Þ and Dh tð Þ can be observed in the case of the other twowells, especially Well-TX. The lags largely result from dif-ferent water table depths, the vadose zone soil structure,and shallow aquifer geology which, in turn, all affectrecharge patterns. In the case of TX, G tð Þ shows no obviousseasonal pattern, while distinct seasonal patterns can beobserved in the other two wells. Precipitation and tempera-ture, which have been included in many previous studies aspredictors [e.g., Coppola et al., 2005b; Coulibaly et al.,2001], exhibit strong seasonal patterns as expected (FigureS2). Results of autocorrelation analysis on predictors,which is part of the ANN model selection process, are pro-vided in supporting information S3 and Figure S3.

Figure 4. (left axis) GRACE DTWS and (right axis) groundwater level time series extracted for(a) Well-TX, (b) Well-HP, and (c) Well-IL. Correlation coefficient, R, is labeled in the upper right cornerof each subplot.

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

7

4.1.1. One Month Ahead Prediction[28] Input data were split into three sequential parts :

60% was used for training, 20% for validation, and the restfor testing the performance of trained models. Before train-ing, all input and target data were standardized by scalinglinearly to the range 0:01; 0:99½ � using the maximum andminimum values of each series. Scaling is necessary toensure that all variables receive equal consideration duringtraining of the ANN. The number of hidden neurons is criti-cal to the performance of ANN. If too many hidden neuronsare used, the network may overfit and, thereby, reducing itsgeneralization capability and increasing training time [Xuand Li, 2002]. On the other hand, using too few hidden neu-rons may result in underfitting. A rule-of-thumb is that thenumber of hidden neurons should be about half the numberof predictors and should never be more than twice as large[Berry and Linoff, 2004; Minns and Hall, 1996]. In prac-tice, the optimal number is determined by trial-and-error, inwhich hidden neurons are added gradually until the RMSEof training starts to increase. Following this general proce-dure, the number of hidden neurons used for Well-TX,Well-HP, and Well-IL models was found to be 3, 3, and 6,respectively. The algorithm chosen for backpropagation isLevenberg-Marquardt, a gradient-based algorithm. Train-ing was stopped when either the performance goal wasreached or when the error on validation subset increasedconsecutively for six iterations. Early stopping is a mecha-nism for preventing overfitting [Demuth et al., 2008].

[29] For each well and each lead time, an ensemble of 50MLPs were generated and combined according to the pro-cedure described in section 3.5. The performance metricsof the ensemble models are reported in Table 1, in whichthe right three columns were calculated based on the testingdata. On the basis of NSE, the performance of 1 monthahead ensemble falls into either good or very good catego-ries (see section 3.2 for definition of performance catego-ries). The fit for Well-HP is the best and that for Well-IL isthe worst. Same conclusion can be drawn on the basis ofR�. Figures 5a–5c compare ensemble prediction of Dh to insitu data. On each subplot, the two vertical lines mark thedivision of training, validation, and testing subsets.

[30] The relatively poor performance of the Well-ILmodel is surprising because Well-IL shows the strongestDh versus G correlation among the three wells. The poorperformance may be contributed to both a sharp decline in

water levels during the testing period and a change in sea-sonal variation patterns from a subdued (2003–2008) to amore pronounced one (2008–2012) (see Figure 4c). In thecase of Well-TX, the 2006 drought event was covered bythe training period, which helped model training to some

Table 1. Test I Ensemble Performance Metrics

Lead Month Architecturea Rtrain Rval Rtest R� NSE MAE (ft)

Well-TX1 6-3-1 0.88 0.85 0.83 0.55 0.68 1.612 7-4-1 0.89 0.86 0.81 0.66 0.55 1.913 8-4-1 0.91 0.85 0.80 0.64 0.57 1.82

Well-HP1 6-3-1 0.94 0.94 0.92 0.36 0.86 0.372 7-5-1 0.95 0.94 0.91 0.45 0.78 0.473 8-4-1 0.91 0.89 0.87 0.60 0.62 0.60

Well-IL1 7-6-1 0.86 0.76 0.80 0.58 0.65 0.822 8-6-1 0.90 0.69 0.77 0.60 0.62 0.883 9-4-1 0.88 0.73 0.75 0.67 0.54 0.97

aThe three numbers represent the number of input, hidden, and outputneurons in each layer.

Figure 5. One step ahead Dh prediction for (a) Well-TX,(b) Well-HP, and (c) Well-IL. The dotted vertical linesdenote boundaries among training, validation, and test sub-sets. Gray hairlines are predictions of 50 ensemble mem-bers, ensemble averages are thick solid lines, and data arelabeled by circles.

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

8

degree, although the developed models need to be furtherimproved to fully capture the extreme values. For the latterpurpose, it is necessary to include as many extreme eventsas training patterns if the corresponding in situ data are alsoavailable. It is also important to realize that GRACE moni-tors the average large-scale impact caused by natural andanthropogenic events. Thus, if an observation well is signif-icantly affected by local pumping activities, GRACE datamay have limited value for it. In contrast to Well-TX andWell-IL, Well-HP shows persistent cyclic patterns through-out the study period, which makes its Dh the easiest topredict.

[31] The relative importance of each predictor was quan-tified using the connection weight method described undersection 3.4 and the results are shown in Figure 6a. For easeof visualization, relative importance belonging to the samevariable group (e.g., Tmax t � 1ð Þ and Tmax t � 2ð Þ) is com-bined. The relative importance of G in 1 month ahead pre-diction is 0.15, 0.08, and 0.18, respectively, for Well-TX,Well-HP, and Well-IL. In all cases, the contribution ofGRACE is similar to or slightly greater than that of the pre-cipitation group, but both are dominated by the temperaturegroup which represents the most important predictor groupfor 1 month ahead prediction. The relative importance ofGRACE appears to be higher when the Dh pattern is lesscyclic, such as in the case of Well-TX and Well-IL. Tem-perature is a good surrogate for seasonality and, thus, italso reflects potential irrigation water uses. The roles oftemperature and precipitation as observed here are consist-

ent with those reported elsewhere in previous studies [Cop-pola et al., 2005b; Feng et al., 2008].4.1.2. Multimonth Ahead Prediction

[32] Two and three month ahead MLP models weredeveloped according to descriptions in section 3.3 and Fig-ure 3. The ensemble performance metrics are summarizedin Table 1. As mentioned before, the intermediate waterlevel change prediction, Dh tð Þ, is used as feedback in 2month ahead prediction (i.e., at t þ 1), and intermediatepredictions Dh tð Þ and Dh t þ 1ð Þ are used for 3 month pre-diction (i.e., at t þ 2).

[33] The performance of the 2 month ahead predictionspans from satisfactory to very good category, while theperformance of the 3 month ahead predictions all falls intothe satisfactory category. Figures 6b and 6c plot the rela-tive importance of input variables. For the 2 month aheadprediction, the relative importance of GRACE data staysalmost the same as that in the 1 month ahead prediction.The relative importance of intermediate output, Dh tð Þ, issecond only to the temperature group. For the 3 monthahead prediction, the intermediate water level change pre-dictions become the most important group. This is becauseof the strong autocorrelation of water level [Coppola et al.,2005a]. Thus, in the absence of in situ Dh data, the role ofGRACE in 1 month ahead prediction is seen as helping tojumpstart a serial propagation process. The outputs thenprovide feedbacks to subsequent multimonth ahead predic-tion. This way, the cross-scale information exchange isfacilitated.

Figure 6. Relative importance of input variables in (a) 1 month, (b) 2 month, and (c) 3 month aheadprediction for Test I, where Dh represents water level change from intermediate steps.

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

9

[34] Overall, Test I results are encouraging. Downscalingof GRACE data leads to modest but significant improve-ment in Dh. Only one representative well in each differentGRACE pixel was selected for testing. This leads naturallyto the question about the value of downscaling GRACE formultiple wells at the subgrid level. Test II is designed toaddress this question.

4.2. Test II

[35] The 1� GRACE pixel used for Test II was selectedmainly because of the abundance of in situ data that can beused for validation. As can be seen from Figure 2, the wellsare relatively distributed across the pixel. Correlation anal-ysis shows that a positive correlation between G and Dhexists for all seven wells in Test II (Figure 7), with Cham-pion Well showing the strongest (0.61) and Haigler Wellthe weakest (0.42) correlation. Recall that Haigler Well iscompleted in a different formation than other wells. Never-theless, the phases of Dh for all wells are quite similar,although the magnitudes of variations differ among wells.Grant South Well exhibits the largest variations, while var-iations in other wells are more subdued. For testing pur-pose, only 1 month ahead MLPs were developed, whichadopt the same model structure as that used for Well-HP(i.e., Benkelman Well). Figure 7 also suggests that GRACEdata show a slightly upward trend during 2008–2011,which corresponds to a slowdown in groundwater depletion

Figure 7. Comparison between GRACE (G) and Dh for all Test II wells. Correlation coefficients arelabeled in each subplot.

Table 2. One Month Ahead Performance Metrics for Test IIWells

Well Name R� NSE MAE (ft)

Haigler 0.51 0.73 0.72Benkelman 0.36 0.86 0.37Enders 0.41 0.83 0.39Champion 0.51 0.73 0.84Lamar 0.56 0.67 1.82Grainton 0.48 0.76 0.95Grant South 0.44 0.80 2.42

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

10

(Figure S1). The performance metrics of all ensemble mod-els are reported in Table 2.

[36] The NSE values of all models are in either good orvery good category. Benkelman and Enders Well are rela-tively close, and the ANN models of which show similarperformance. Lamar Well and Champion Well, which arelocated in higher elevations, have lower NSE than others.Figure 8 shows the relative importance of each predictorgroup in the 1 month ahead MLPs. Again, GRACE hassimilar relative importance as the precipitation group, andboth are dominated by the temperature groups.

[37] Thus, Test II indicates that downscaling of GRACEto multiple wells at the subgrid scale is feasible and hascertain merits in this case. This is mainly because most ofthe wells are completed in the same formation. Of course,if wells are completed in different and disconnected forma-tions, downscaling may have mixed results. Only one pixelwas tested here. If data are available, further validationtests for other regions around the world can yield additionalinsights about the usefulness of GRACE downscaling.

5. Conclusion

[38] GRACE has supported many advances in TWSmonitoring since its launch. To realize the full potential ofGRACE for hydrological applications, however, monthlyTWS anomalies from GRACE must be downscaled inspace and time and extrapolated to the present, therebymeeting the specificity, timeliness, and high spatial resolu-tion requirements of local water resources management[Houborg et al., 2012]. This study provides a relativelystraightforward nonparametric approach for tapping intothe predictive power of GRACE.

[39] A newly released, gridded GRACE product wastested for its value in serving as a surrogate for continu-ously monitored in situ water level measurements, which

have seen a great decline in coverage decline. Groundwateragencies, especially those in semiarid regions, are inter-ested in tracking groundwater level changes to manageaquifers in a sustainable manner [McGuire et al., 2012].The nonparametric ANN models used in this study weredeveloped using PRISM monthly precipitation, maximumand minimum temperatures, and GRACE DTWS as inputs.The target variable of interest is the groundwater levelchange, Dh. The wells tested are located in different geo-graphic and climatic regions in the US. Main findingsinclude:

[40] 1. At 1 month prediction interval, the trained ensem-ble MLPs gave good to very good performance. The relativeimportance of GRACE to Dh prediction ranges from 8% tonearly 20%. The relative importance of GRACE tends to begreater when the water level patterns are less cyclic.

[41] 2. The performance of the ensemble models is satis-factory at multimonth lead times.

[42] 3. Results support the main hypothesis that GRACEDTWS can be downscaled to infer or predict Dh when con-tinuous in situ measurements are not available.

[43] 4. The approach developed here can be applied toforce multiple ANNs developed for a network of wells, theoutputs of which can then be combined via spatial interpo-lation techniques.

[44] 5. The ANN approach taken here is well established,although its training and validation can be time consump-tion. However, any other statistical learning method can beused in its place if necessary.

Appendix A[45] The Hodrick–Prescott filter [Hodrick and Prescott,

1997] or HP filter is used to obtain Dh from absolutewater level measurements. HP filter is a high-pass filterthat removes a smooth trend from a time series by solving[Ravn and Uhlig, 2002]

Figure 8. Relative importance of predictors used in Test II.

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

11

minXT

t¼1

yt � ytð Þ2 þ � ytþ1 � yt

� �� yt � yt�1ð Þ

� �2n o; ðA1Þ

where yt is the trend term and � is a smoothing parameter.If the frequency of raw data is monthly, the recommendedperiodicity parameter of the HP filter is 14,400 [Ravn andUhlig, 2002]. The Matlab function hpfilter was used in thisstudy.

[46] Acknowledgments. The author is grateful to the Associate Editorand three anonymous reviewers for their careful review and constructivecomments, which have significantly improved the original manuscript. Theauthor also wishes to thank Sean Swenson at the National Center forAtmospheric Research for his constructive comments on the originalmanuscript.

ReferencesAlsdorf, D. E., E. Rodr�ıguez, and D. P. Lettenmaier (2007), Measuring sur-

face water from space, Rev. Geophys., 45, RG2002, doi:10.1029/2006RG000197.

ASCE (2000a), Committee on the application of ANNs in hydrology artifi-cial neural networks in hydrology. I: Preliminary concepts, J. Hydrol.Eng., 5(2), 115–123.

ASCE (2000b), Committee on the application of ANNs in hydrology artifi-cial neural networks in hydrology. II: Hydrologic application, J. Hydrol.Eng., 5(2), 124–137.

Barber, N. L. (2009), Summary of estimated water use in the United States in2005: U.S. Geological Survey Fact Sheet 2009–3098, 2 pp., U.S. Geol.Surv., Reston, Va.

Becker, M. W. (2006), Potential for satellite remote sensing of groundwater, Ground Water, 44(2), 306–318.

Berry, M. J., and G. S. Linoff (2004), Data Mining Techniques: For Mar-keting, Sales, and Customer Relationship Management, John Wiley, In-dianapolis, IN.

Bishop, C. M. (2006), Pattern Recognition and Machine Learning, 738 p.,Springer, New York.

Breiman, L. (1996), Bagging predictors, Mach. Learning, 24(2), 123–140.Brunner, P., H.-J. H. Franssen, L. Kgotlhang, P. Bauer-Gottwein, and W.

Kinzelbach (2007), How can remote sensing contribute in groundwatermodeling?, Hydrogeol. J., 15(1), 5–18.

Brunner, P., J. Doherty, and C. T. Simmons (2012), Uncertainty assessmentand implications for data acquisition in support of integrated hydrologicmodels, Water Resour. Res., 48, W07513, doi:10.1029/2011WR011342.

Chang, F. J., and Y. T. Chang (2006), Adaptive neuro-fuzzy inference sys-tem for prediction of water level in reservoir, Adv. Water Resour., 29(1),1–10.

Chang, F. J., Y.-M. Chiang, and L.-C. Chang (2007), Multi-step-ahead neu-ral networks for flood forecasting, Hydrol. Sci. J., 52(1), 114–130.

Chu, H. J., and L. C. Chang (2009), Optimal control algorithm and neuralnetwork for dynamic groundwater management, Hydrol. Processes,23(19), 2765–2773.

Coppola, E. A., F. Szidarovszky, M. Poulton, and E. Charles (2003), Artifi-cial neural network approach for predicting transient water levels in amultilayered groundwater system under variable state, pumping, and cli-mate conditions, J. Hydrol. Eng., 8(6), 348–360.

Coppola, E. A., A. J. Rana, M. M. Poulton, F. Szidarovszky, and V. W. Uhl(2005a), A neural network model for predicting aquifer water level ele-vations, Ground Water, 43(2), 231–241.

Coppola, E. A., C. F. McLane, M. M. Poulton, F. Szidarovszky, and R. D.Magelky (2005b), Predicting conductance due to upconing using neuralnetworks, Ground Water, 43(6), 827–836.

Coppola, E. A., F. Szidarovszky, D. Davis, S. Spayd, M. M. Poulton, and E.Roman (2007), Multiobjective analysis of a public wellfield using artifi-cial neural networks, Ground Water, 45(1), 53–61.

Coulibaly, P., F. Anctil, R. Aravena, and B. Bobee (2001), Artificial neuralnetwork modeling of water table depth fluctuations, Water Resour. Res.,37(4), 885–896.

Demuth, H., M. Beale, and M. Hagan (2008), Neural Network ToolboxTM

User’s Guide, 907 pp., The MathWorks Inc., Natick, Mass.Dietterich, T. G. (2000a), Ensemble methods in machine learning, in Multi-

ple Classifier Systems, edited Kittler J. and F. Roli, pp. 1–15, Springer,Heidelberg.

Dietterich, T. G. (2000b), An experimental comparison of three methodsfor constructing ensembles of decision trees: Bagging, boosting, and ran-domization, Mach. Learning, 40(2), 139–157.

Fan, Y., H. Li, and G. Miguez-Macho (2013), Global patterns of ground-water table depth, Science, 339(6122), 940–943.

Feng, S., S. Kang, Z. Huo, S. Chen, and X. Mao (2008), Neural networks tosimulate regional ground water levels affected by human activities,Ground Water, 46(1), 80–90.

Fowler, H., S. Blenkinsop, and C. Tebaldi (2007), Linking climate changemodelling to impacts studies: Recent advances in downscaling techni-ques for hydrological modelling, Int. J. Climatol., 27(12), 1547–1578.

Freund, Y., and R. Schapire (1995), A desicion-theoretic generalization ofon-line learning and an application to boosting, in Computational Learn-ing Theory, vol. 904, pp. 23–37, Springer, Berlin.

Gevrey, M., I. Dimopoulos, and S. Lek (2003), Review and comparison ofmethods to study the contribution of variables in artificial neural networkmodels, Ecol. Modell., 160(3), 249–264.

Haykin, S. S. (2009), Neural Networks and Learning Machines, 3rd ed.,906 p., Prentice Hall, New York.

Hengl, T., G. B. M. Heuvelink, and A. Stein (2004), A generic frameworkfor spatial prediction of soil variables based on regression-kriging, Geo-derma, 120(1–2), 75–93.

Henry, C. M., D. M. Allen, and J. Huang (2011), Groundwater storage vari-ability and annual recharge using well-hydrograph and GRACE satellitedata, Hydrogeol. J., 19(4), 741–755.

Hodrick, R. J., and E. C. Prescott (1997), Postwar US business cycles: Anempirical investigation, J. Money Credit Banking, 28, 1–16.

Houborg, R., M. Rodell, B. Li, R. Reichle, and B. F. Zaitchik (2012),Drought indicators based on model-assimilated Gravity Recovery andClimate Experiment (GRACE) terrestrial water storage observations,Water Resour. Res., 48, W07525, doi:10.1029/2011WR011291.

House-Peters, L. A., and H. Chang (2011), Urban water demand modeling:Review of concepts, methods, and organizing principles, Water Resour.Res., 47, W05401, doi:10.1029/2010WR009624.

Hsu, K. L., H. V. Gupta, and S. Sorooshian (1995), Artificial neural-network modeling of the rainfall-runoff process, Water Resour. Res.,31(10), 2517–2530.

JPL (2012), GRACE product, Pasadena, CA [Available at http://podaac-ftp.jpl.nasa.gov/allData/tellus/L3/land_mass/RL05.]

Konikow, L. F. (2011), Contribution of global groundwater depletion since1900 to sea-level rise, Geophys. Res. Lett., 38, L17401, doi:10.1029/2011GL048604.

Landerer, F. W., and S. C. Swenson (2012), Accuracy of scaled GRACEterrestrial water storage estimates, Water Resour. Res., 48, W04531,doi:10.1029/2011WR011453.

Leblanc, M. J., P. Tregoning, G. Ramillien, S. O. Tweed, and A. Fakes(2009), Basin-scale, integrated observations of the early 21st centurymultiyear drought in southeast Australia, Water Resour. Res., 45,W04408, doi:10.1029/2008WR007333.

Maier, H. R., A. Jain, G. C. Dandy, and K. P. Sudheer (2010), Methodsused for the development of neural networks for the prediction of waterresource variables in river systems: Current status and future directions,Environ. Modell. Software, 25(8), 891–909.

McGuire, V. L., K. D. Lund, and B. K. Densmore (2012), Saturated thick-ness and water in storage in the High Plains aquifer, 2009, and water-level changes and changes in water in storage in the High Plains aquifer,1980 to 1995, 1995 to 2000, 2000 to 2005, and 2005 to 2009, USGS Sci.Invest. Rep. 2012–5177, 28 pp., Reston, Va.

Minns, A., and M. Hall (1996), Artificial neural networks as rainfall-runoffmodels, Hydrol. Sci. J., 41(3), 399–417.

Moradkhani, H., K.-L. Hsu, H. V. Gupta, and S. Sorooshian (2004),Improved streamflow forecasting using self-organizing radial basis func-tion artificial neural networks, J. Hydrol., 295(1–4), 246–262.

Moriasi, D. N., J. G. Arnold, M. W. VanLiew, R. L. Bingner, R. D. Harmel,and T. L. Veith (2007), Model evaluation guidelines for systematic quan-tification of accuracy in watershed simulations, Trans. Am. Soc. Agric.Biol. Eng., 50(3), 885–900.

National Research Council (2008), Integrating Multiscale Observations ofU.S. Waters, 210 pp. The National Academies Press, Washington, D. C.

Nelson, R. L. (2012), Assessing local planning to control groundwaterdepletion: California as a microcosm of global issues, Water Resour.Res., 48, W01502, doi:10.1029/2011WR010927.

Olden, J. D., and D. A. Jackson (2002), Illuminating the ‘‘black box’’: Arandomization approach for understanding variable contributions in arti-ficial neural networks, Ecol. Modell., 154, 135–150.

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

12

Olden, J. D., M. K. Joy, and R. G. Death (2004), An accurate comparison ofmethods for quantifying variable importance in artificial neural networksusing simulated data, Ecol. Modell., 178(3–4), 389–397.

Opitz, D., and R. Maclin (1999), Popular ensemble methods: An empiricalstudy, J. Artif. Intel. Res., 11, 169–198.

PRISM Climate Group (2012), PRISM, Corvallis, OR [Available at http://prism. oregonstate.edu.]

Ravn, M. O., and H. Uhlig (2002), On adjusting the Hodrick-Prescott filterfor the frequency of observations, Rev. Econ. Stat., 84(2), 371–376.

Rodell, M., and J. S. Famiglietti (2001), Terrestrial water storage variationsover Illinois: Analysis of observations and implications for Gravity Re-covery and Climate Experiment (GRACE), Water Resour. Res., 37(5),1327–1340.

Rodell, M., J. L. Chen, H. Kato, J. S. Famiglietti, J. Nigro, and C. R. Wilson(2007), Estimating groundwater storage changes in the Mississippi Riverbasin (USA) using GRACE, Hydrogeol. J., 15(1), 159–166.

Rodell, M., I. Velicogna, and J. S. Famiglietti (2009), Satellite-based esti-mates of groundwater depletion in India, Nature, 460(7258), 999–1002.

Scanlon, B. R., L. Longuevergne, and D. Long (2012a), Ground referencingGRACE satellite estimates of groundwater storage changes in the Cali-fornia Central Valley, USA, Water Resour. Res., 48, W04520,doi:10.1029/2011WR011312.

Scanlon, B. R., C. C. Faunt, L. Longuevergne, R. C. Reedy, W. M. Alley,V. L. McGuire, and P. B. McMahon (2012b), Groundwater depletion andsustainability of irrigation in the US High Plains and Central Valley,Proc. Natl. Acad. Sci., 109(24), 9320–9325.

Seo, K., C. Wilson, J. Famiglietti, J. Chen, and M. Rodell (2006), Terres-trial water mass load changes from Gravity Recovery and ClimateExperiment (GRACE), Water Resour. Res., 42, W05417, doi:10.1029/2005WR004255.

Shu, C., and D. H. Burn (2004), Artificial neural network ensembles andtheir application in pooled flood frequency analysis, Water Resour. Res.,40, W09301, doi:10.1029/2003WR002816.

Sophocleous, M. (2005), Groundwater recharge and sustainability in theHigh Plains aquifer in Kansas, USA, Hydrogeol. J., 13, 351–365.

Sun, A. Y., R. Green, M. Rodell, and S. Swenson (2010), Inferring aquiferstorage parameters using satellite and in situ measurements: Estimation

under uncertainty, Geophys. Res. Lett., 37, L10401, doi:10.1029/2010GL043231.

Sun, A. Y., R. Green, S. Swenson, and M. Rodell (2012), Toward calibra-tion of regional groundwater models using GRACE data, J. Hydrol.,422–423, 1–9.

Swenson, S., J. Famiglietti, J. Basara, and J. Wahr (2008), Estimating pro-file soil moisture and groundwater variations using GRACE and Okla-homa Mesonet soil moisture data, Water Resour. Res., 44, W01413,doi:10.1029/2007WR006057.

Syed, T. H., J. S. Famiglietti, M. Rodell, J. Chen, and C. R. Wilson (2008),Analysis of terrestrial water storage changes from GRACE and GLDAS,Water Resour. Res., 44, W02433, doi:10.1029/2006WR005779.

Tapley, B. D., S. Bettadpur, M. Watkins, and C. Reigber (2004), The grav-ity recovery and climate experiment: Mission overview and early results,Geophys. Res. Lett., 31, L09607, doi:10.1029/2004GL019920.

TWDB (2012a), Groundwater database, Austin, TX. [Available at http://www.twdb.state.tx.us/groundwater/data/gwdbrpt.asp.].

TWDB (2012b), Pecos valley aquifer. [Available at http://www.twdb.state.tx.us/groundwater/aquifer/majors/pecos-valley.asp.].

USGS (2013), USGS active groundwater level network, Reston, Va. [Avail-able at http://groundwaterwatch.usgs.gov.]

Wada, Y., L. VanBeek, D. Viviroli, H. H. D€urr, R. Weingartner, and M. F.P. Bierkens (2011), Global monthly water stress: 2. Water demand andseverity of water stress, Water Resour. Res., 47, W07518, doi:10.1029/2010WR009792.

Wada, Y., L. P. H. vanBeek, F. C. Sperna Weiland, B. F. Chao, Y.-H. Wu,and M. F. P. Bierkens (2012), Past and future contribution of globalgroundwater depletion to sea-level rise, Geophys. Res. Lett., 39, L09402,doi:10.1029/2012GL051230.

Xu, Z., and J. Li (2002), Short-term inflow forecasting using an artificialneural network model, Hydrol. Processes, 16(12), 2423–2439.

Yeh, P. J. F., S. C. Swenson, J. S. Famiglietti, and M. Rodell (2006),Remote sensing of groundwater storage changes in Illinois using theGravity Recovery and Climate Experiment (GRACE), Water Resour.Res., 42, W12203, doi:10.1029/2006WR005374.

Zhou, Z.-H. (2012), Ensemble Methods: Foundations and Algorithms, vol.14, 222 p., Taylor and Francis, Boca Raton, Fla.

SUN: PREDICTING GROUNDWATER LEVEL CHANGES USING GRACE DATA

13