two methods to calibrate the total travel demand and ... · computer-aided civil and infrastructure...

18
Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand and Variability for a Regional Traffic Network Tao Wen* School of Civil and Environmental Engineering, University of New South Wales, Sydney, Australia and Data61, CSIRO, Sydney, Australia Lauren Gardner, Vinayak Dixit & S. Travis Waller School of Civil and Environmental Engineering, University of New South Wales, Sydney, Australia Chen Cai & Fang Chen Data61, CSIRO, Sydney, Australia Abstract: This article proposes a novel methodology that uses the bi-level programming formulation to cali- brate the expected total demand and the corresponding demand variability of traffic networks. In the bi-level for- mulation the upper-level is either a new maximum likeli- hood estimation method or a least squares method and the lower-level is the strategic user equilibrium assign- ment model (StrUE) which accounts for the day-to-day demand volatility. The maximum likelihood method pro- posed in this article has the ability to utilize information from day-to-day observed link flows to provide a unique estimation of the total demand distribution, whereas the least squares method is capable of capturing link flow variations. The lower-level StrUE model can take the to- tal demand distribution as input, and output a set of link flow distributions which can then be compared to the link-level observations. The mathematical proof demon- strates the convexity of the model, and the sensitivity to the prediction error is analytically derived. Numerical analysis is conducted to illustrate the efficiency and sen- sitivity of the proposed model. Some possible future re- search is discussed in the conclusion. 1 INTRODUCTION Enhanced origin–destination (O–D) matrix estimation methodologies could prove useful for transportation planning. Traditionally, the O–D matrix is obtained by To whom correspondence should be addressed. E-mail: tao. [email protected]. trip generation and distribution module using data from plate surveys, household surveys, or roadside surveys (Castillo et al., 2014). Such survey activities may suffer from limited response and sampling coverage. As an al- ternative, there are some statistical approaches for esti- mating or calibrating an O–D matrix from observed link flows and some prior knowledge of the O–D demand. However, it is difficult to infer a unique O–D matrix directly using these approaches because the number of O–D pairs is much larger than the number of links; thus statistical assumptions or some prior information on O– D matrix are necessary to guarantee a unique solution. For example, prior O–D matrix may be used as a regu- larization which ensures the objective function to be a convex one (Menon et al., 2015). Unlike the traditional O–D estimation, this article proposes a supplementary approach for calibrating the aggregated O–D demand and corresponding variability for a regional traffic net- work using day-to-day observed link counts, assuming that the demand proportions are known a priori infor- mation. The proposed method explicitly considers the inherent volatility in demand which is commonly ob- served in traffic networks, in conjunction with the link flow fluctuation “caused” by demand volatility. In this article, we present two methods to calibrate the total demand distribution. Both methods can be rep- resented as a bi-level programming model, which differ only in the upper-level problem. The lower-level prob- lem for both methods is the strategic user equilibrium assignment model (StrUE), which accounts for day-to- day demand volatility. The upper-level model is either C 2017 Computer-Aided Civil and Infrastructure Engineering. DOI: 10.1111/mice.12278

Upload: others

Post on 27-Mar-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299

Two Methods to Calibrate the Total Travel Demandand Variability for a Regional Traffic Network

Tao Wen*

School of Civil and Environmental Engineering, University of New South Wales, Sydney, Australia and Data61, CSIRO,Sydney, Australia

Lauren Gardner, Vinayak Dixit & S. Travis Waller

School of Civil and Environmental Engineering, University of New South Wales, Sydney, Australia

Chen Cai & Fang Chen

Data61, CSIRO, Sydney, Australia

Abstract: This article proposes a novel methodologythat uses the bi-level programming formulation to cali-brate the expected total demand and the correspondingdemand variability of traffic networks. In the bi-level for-mulation the upper-level is either a new maximum likeli-hood estimation method or a least squares method andthe lower-level is the strategic user equilibrium assign-ment model (StrUE) which accounts for the day-to-daydemand volatility. The maximum likelihood method pro-posed in this article has the ability to utilize informationfrom day-to-day observed link flows to provide a uniqueestimation of the total demand distribution, whereas theleast squares method is capable of capturing link flowvariations. The lower-level StrUE model can take the to-tal demand distribution as input, and output a set of linkflow distributions which can then be compared to thelink-level observations. The mathematical proof demon-strates the convexity of the model, and the sensitivity tothe prediction error is analytically derived. Numericalanalysis is conducted to illustrate the efficiency and sen-sitivity of the proposed model. Some possible future re-search is discussed in the conclusion.

1 INTRODUCTION

Enhanced origin–destination (O–D) matrix estimationmethodologies could prove useful for transportationplanning. Traditionally, the O–D matrix is obtained by

∗To whom correspondence should be addressed. E-mail: [email protected].

trip generation and distribution module using data fromplate surveys, household surveys, or roadside surveys(Castillo et al., 2014). Such survey activities may sufferfrom limited response and sampling coverage. As an al-ternative, there are some statistical approaches for esti-mating or calibrating an O–D matrix from observed linkflows and some prior knowledge of the O–D demand.However, it is difficult to infer a unique O–D matrixdirectly using these approaches because the number ofO–D pairs is much larger than the number of links; thusstatistical assumptions or some prior information on O–D matrix are necessary to guarantee a unique solution.For example, prior O–D matrix may be used as a regu-larization which ensures the objective function to be aconvex one (Menon et al., 2015). Unlike the traditionalO–D estimation, this article proposes a supplementaryapproach for calibrating the aggregated O–D demandand corresponding variability for a regional traffic net-work using day-to-day observed link counts, assumingthat the demand proportions are known a priori infor-mation. The proposed method explicitly considers theinherent volatility in demand which is commonly ob-served in traffic networks, in conjunction with the linkflow fluctuation “caused” by demand volatility.

In this article, we present two methods to calibratethe total demand distribution. Both methods can be rep-resented as a bi-level programming model, which differonly in the upper-level problem. The lower-level prob-lem for both methods is the strategic user equilibriumassignment model (StrUE), which accounts for day-to-day demand volatility. The upper-level model is either

C© 2017 Computer-Aided Civil and Infrastructure Engineering.DOI: 10.1111/mice.12278

Page 2: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 283

a maximum likelihood estimation method or the leastsquares method. Each of the two calibration methods isdefined as follows:

(1) The maximum likelihood method for O–D matrixestimation with StrUE which is hereby referred toas the MLOD method;

(2) The least squares method for O–D matrix estima-tion with StrUE hereby referred to as LSOD.

The novelty of both methods evolves from the in-corporation of the strategic user equilibrium model (re-ferred to as StrUE in this article) (Dixit et al., 2013). TheStrUE model is defined such that “at strategic user equi-librium all used paths have equal and minimal expectedcost.” For each user present in the network, they makea route choice decision based on their knowledge of de-mand distribution, then this chosen route is followed re-gardless of the realized travel demand on a given day(i.e., given demand scenario). Therefore, under StrUE,the link flows will not necessarily result in an equilib-rium state for any particular demand realization; in-stead, equilibrium exists stochastically over all demandrealizations. The StrUE model was proposed to capturethe impact of day-to-day demand volatility on networkperformance, and eventually route choice within thestatic user equilibrium framework. The StrUE modelcan take the total demand distribution as input (esti-mated at the upper level), and output a set of link flowdistributions, which can then be compared to the link-level observations.

In the StrUE model, each O–D pair demand is as-sumed to be a fixed proportion of the total demand inthe system; hence each O–D pair demand varies ac-cording to the change in total demand. Therefore, theobjective of this work is to estimate the total demand,and perhaps more importantly, the parameters whichdefine the total trip demand distribution. Given the es-timated demand distribution, the proposed method alsoprovides the variance of link flows (from StrUE), whichcan be used as a measurement of reliability for planningpurposes.

The bi-level programming method is proposed toeliminate the impact of strongly biased prior estimates,where the upper-level provides information about thetotal demand distribution to the lower-level StrUEmodel, and the results from the StrUE model canprovide link flow distributions back to the upper level.A benefit of the proposed model includes the incor-poration of observed day-to-day link flows, insteadof aggregated or averaged values. Additionally, theperformance of both MLOD and LSOD methods canbe assessed by comparing the estimated link flow distri-butions (which are a direct output of the StrUE model

based on the estimated total demand distribution) withthe simulated day-to-day link flows.

The association of link flow variables to the total de-mand in StrUE allows for the use of day-to-day linkflows (which in return provide actual link flow distri-butions) to calibrate the total demand distribution. Thecalibration is accomplished by implementing the follow-ing methods: (1) MLOD method, in which we maximizethe joint probability of observing the entire link flowswithin a time period and (2) LSOD method, in which weminimize the sum of the residuals of mean and standarddeviation of link flows. The main difference betweenthe two methods is that the MLOD method considersevery observation of link flow, and seeks to find a dis-tribution that fits the observed link flow best, whereasthe LSOD method uses only the mean and standarddeviation of link flow as the inputs. The LSOD findsthe O–D demand that can produce a mean and stan-dard deviation of link flows similar to the observed ones.When data sets are typically small or moderate in size,extensive simulation studies show that in small sampledesigns where there are only a few failures or errors,the maximum likelihood estimation is better than theleast squares method (Maus et al., 2001; Genschel andMeeker, 2010).

An additional contribution of this research addressesthe issue of low coverage rate or failure of loop de-tectors, which can lead to error in observed link flows(Zhou and List, 2010). One way to mitigate the low cov-erage rate issue was investigated by various sensor lo-cation problem models (Gan et al., 2005; Ehlert et al.,2006; Yang et al., 2006; Larsson et al., 2010; Gentiliand Mirchandani, 2012). In this work, sensitivity anal-ysis is conducted to demonstrate the model’s robustnessagainst varying levels of detector error. In addition, thesensitivity to error in observed link flows is analyticallyderived for the proposed model, and the results are val-idated using simulation.

The remainder of this article includes a literature re-view of previous research in Section 2. Section 3 de-fines the mathematical model and includes a derivationfor the analytical solution to the total demand estima-tion. Numerical analysis is demonstrated in Section 4where the estimated results of both methods are com-pared; conclusions, limitations of the model, and futureresearch are presented in Section 5.

2 LITERATURE REVIEW

Historically, O–D matrix estimation and its calibrationmainly relied upon statistical approaches using loopcount data. O–D estimation research has since been ex-panded to include a wide range of methods such as the

Page 3: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

284 Wen et al.

generalized least square method (Cascetta, 1984; Bell,1991), the maximum likelihood (Spiess, 1987), bi-levelprogramming approach (Yang et al., 1992), Bayesianapproaches (Maher, 1983; Tebaldi and West, 1998),and maximum entropy (Fisk, 1988). Integration of themethods mentioned above was also of recent interest(Aerde et al., 2003; Castillo et al., 2014). Statistically, themaximum likelihood method estimates a set of param-eters for a probability density function to fit the ob-servations, that is, it maximizes the joint probability ofobserving the existing data, and a preassumed distribu-tion is normally required. The method of least squaresis a standard approach to approximate the solution ofoverdetermined systems, that is, sets of equations inwhich there are more equations than unknowns, suchas linear regression. In this article, these two traditionalmethods, generalized least square method and maxi-mum likelihood, are modified and applied to calibratethe total network demand.

In many literatures, estimation of the O–D matrix isa once-off procedure, that is, calibration/estimation ofO–D matrix is only performed once. This may lead tosome issues when the network is congested, the prior in-formation on O–D matrix is inaccurate, or data noise isnot insignificant. In the bi-level programming approachproposed in this article, the upper level is an O–D ma-trix estimation problem and the lower level is the as-signment of O–D trips. Yang et al. (1992) first intro-duced the convex bi-level optimization problem, whichwas later extended to account for link flow correlation(Yang, 1995). The heuristic algorithm, which is a globaloptimum technique, was applied to solve the upper levelin Yang (1995), Kim et al. (2001), and Stathopoulos andTsekeris (2004); however, the solution was not provedto be optimal mathematically. Codina et al. (2006) pro-posed two algorithms under the bi-level programmingframework: one sought for an approximation of thesteepest descent direction for the upper level and onelinearized the lower-level assignment problem. In ad-dition, an iterative column generation algorithm wasdemonstrated based on the characteristics of path costfunction continuity, and the solution was proved to be alocal minimum (Garcia-Rodenas and Verastegui-Rayo,2008). However, the aforementioned algorithms wereonly applied on a small network. In this article the pro-posed bi-level programming approach is proved to beapplicable to medium-sized networks such as the Ana-heim network.

Typically, the objective of O–D matrix estimation isto optimize an objective function (which may vary basedon model requirements) subject to a set of constraints(typically the flow conservation and the link-path in-cidence relationship). However, the problem is oftenchallenging due to that the number of monitored links

in a traffic network is often much smaller than the num-ber of O–D pairs to be estimated; therefore it may notbe possible to obtain a unique solution from a singleset of link counts alone. Furthermore, the problem hasbeen extended to account for the stochastic nature offlows (Lo et al., 1996; Lo et al., 1999), and the time-dependent characteristics of the network (Bierlaire andCrittin, 2004; Frederix et al., 2011). Some computer-aided heuristic algorithms are also applied to this prob-lem (Stathopoulos and Tsekeris, 2004). The genetic al-gorithm, which is a heuristic search method, plays animportant role in the O–D estimation problem (Kimet al., 2001; Baek et al., 2004; Yun and Park, 2005;Kattan and Abdulhai, 2006). The main advantage of thegenetic algorithm is its capability of solving nonconvex,complex optimization, whereas the drawback is that thesolution is not guaranteed to be optimal. In addition,path flow estimator techniques also provide O–D spe-cific flows, which could be used in the estimation of O–D matrix (Sherali et al., 1994; Chootinan et al., 2005;Chen et al., 2009) but this model either needs path enu-meration (Sherali et al., 2003) or information on the setof shortest paths (Nie and Lee, 2002; Nie et al., 2005).Some other methods have also been proposed by re-searchers to enhance the model applicability, such asmulti-class O–D estimation (Baek et al., 2004; Wonget al., 2005), fuzzy-based approach (Xu and Chan, 1993;Reddy and Chakroborty, 1998; Foulds et al., 2013) andneutral network-based approach (Gong, 1998). How-ever, issues regarding computation complexity and theapplication to large-scale networks still remain a chal-lenge.

On the other hand, higher order information of a net-work, such as the variance and covariance of observedlink flows, can potentially provide more constraints tothe optimization problem. This is considered as net-work tomography problems in statistics and computerscience literature (Vardi, 1996; Cao et al., 2000; Airoldiand Blocker, 2013; Hazelton, 2015), but its applicationin transportation models is yet to be fully explored. Cre-mer and Keller demonstrated that aggregating or aver-aging link count data collected over a sequence of timeperiods may result in the loss of important informa-tion (Cremer and Keller, 1987). Hazelton (2003) pro-posed a weighted least squares method to account forthe covariance of links and assumed a parameter to ex-plain the circumstances when the variance exceeds themean if a Poisson distribution is used. Bell (1983) pro-posed a maximum likelihood method and found the an-alytical solution to the covariance of O–D matrix byusing a Taylor approximation. However, the assump-tions made in these studies may limit the model appli-cability. For example, the O–D demand was assumedto follow the Poisson or multinomial distribution, which

Page 4: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 285

stipulates certain relationships between the mean andvariance of the O–D demand. In monitored networks,loop detectors can easily provide link counts on a day-to-day basis; therefore, it is important to consider thevariation of link flows and the distribution of O–D totaldemand as effective information to calibrate the O–Dtrip matrix (Wen et al., 2015). In this study, the StrUEtraffic assignment model assumes the O–D demand fol-lows a lognormal distribution, which allows the meanand variance of total demand to be different from eachother, and assures the nonnegativity of the demand. Theproposed model in this article, therefore, uses link flowvariation to provide a robust estimation of total demandwhile maintaining computation efficiency.

Estimation of the O–D trip matrix also requiresa proper assignment model. When applying the as-signment model to a large network, realism andcomputational complexity are both critical in determin-ing a model’s practical applicability. Further, a majorcomplication in transportation modeling is the ability toproperly account for the inherent uncertainties regard-ing demand (Bellei et al., 2006; Kim et al., 2009; Szetoet al., 2011) and capacity levels (Brilon et al., 2005; Wuet al., 2010). Additionally, as has been noted, uncer-tainty regarding these variables directly affects routechoice behavior (Uchida and Iida, 1993) and traffic pre-dictions (Duthie et al., 2011). It is, therefore, necessaryto incorporate these stochastic elements into modelsto ensure robust planning capabilities, but to do so in amanner that maintains computational tractability. Thestrategic user equilibrium (Dixit et al., 2013) effectivelyaccounts for the impact of demand uncertainty subjectto Wardrop’s UE conditions, and under the static userequilibrium, the computation tractability and simplicityare preserved. The model was extended to the dy-namic traffic assignment (Waller et al., 2013), roadpricing scheme (Duell et al., 2014), and independentlydistributed O–D demands (Wen et al., 2016).

The contribution of this study can be summarized asfollows:

(1) We apply the strategic user equilibrium model,which explicitly accounts for demand volatility inusers’ routing mechanism. It can also provide linkflow fluctuation based on demand volatility.

(2) We assume the total demand follows a lognormaldistribution, a lognormal distribution allows thevariance and the mean of the total demand to beindependent, and nonnegative demand estimationis guaranteed.

(3) Given the day-to-day link flow fluctuations, we es-timate the total network travel demand distribu-tion using two different methods: (i) the MLODmethod and (ii) LSOD method. The performance

of both methods is evaluated and compared for amedium-sized network and large-sized network.

(4) A bi-level formulation is proposed, which reducesthe impact of biased initial estimates. Both up-per level and lower level are proven to be strictlyconvex.

(5) O–D estimation results from the proposed meth-ods are presented and compared for both analyti-cal and simulated analysis.

3 PROBLEM FORMULATION

This section defines the mathematical concept of themodel. Table 1 defines the notations used in this article.

In this article, two assumptions are made to guaranteeconsistency, uniqueness, and computation simplicity:

(1) Each O–D pair demand is proportional to the to-tal demand and the demand proportions are fixed.That is, the objective of this article is to scaleeach O–D demand distribution while preservingthe demand proportions. Statistically, this meanswe have more confidence in the demand propor-tions but not the exact number of trips. Similar tothe traditional O–D estimation approaches whereeven a full prior O–D matrix is assumed to beknown (not necessarily, but in several models theprior O–D matrix guarantees uniqueness of the so-lutions), the demand proportions can be regardedas prior information, which is obtained from othertransportation techniques such as gravity model orcensus data. These techniques can provide com-paratively reliable demand proportions, and theuse of loop detector data in this article provides away to calibrate the number of trips. Note that thisassumption of known priori O–D proportions maylimit the model’s applicability, but it also guaran-tees the model’s uniqueness. In some scenarios,such an assumption may be relaxed if uniquenessis not considered critical.

(2) The trip demand is assumed to follow a certainstatistical distribution; previously a lognormal dis-tribution has been used (Kamath and Pakkala,2002; Zhao and Kockelman, 2002; Duell et al.,2014; Wen et al., 2014). Under the assumption ofa log-normally distributed demand, this article fo-cuses on estimating the demand distribution pa-rameters. Note that other distributions can also beused if they do not change the convexity of the ob-jective function, and preserve the positiveness. Inaddition, for the StrUE model a log-normal dis-tributed total trip demand allows for a closed form

Page 5: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

286 Wen et al.

Table 1Summary of notations

Symbol Definition

N Link (index) set.Krs Path set for O–D pair rs.A Node (index) set.pn Proportion of total demand on link n; P =

(p1, . . . , pn).tn() Travel time on link n; t = ( . . . ,tn , . . . ).tn f Free flow travel time on link n.qrs Fraction of total demand that are between O–D

pair r-s;∑∀rs

qrs = 1.

T Random variable for total demand withprobability distribution g(T ).

g(T ) Lognormal probability density function of thetotal demand.

xni Observed flow on link n, for day i.ln Flow on link n.hrs

k The proportion of flow on path k, connecting O–Dpair r-s.

Cn The capacity on link n.δm

n,k Indicator variable

δrsn,k = {1 i f n is included in path k

0 otherwise, which

indicates if link n is on path k for O–D pair rs.(�rs)n,k = δrs

n,k; � = (. . . , �rs, . . .)μ The location parameter for a lognormal

distribution, which is also mean of thecorresponding normal distribution.

σ 2 The scale parameter for a lognormal distribution,which is also the variance of the correspondingnormal distribution.

mT The expected total demand.vT The variance of the lognormal distribution.mn The expected link flow on link n.vn The variance of link flow on link n.sT The standard deviation of the total demand.sn The standard deviation of flow on link n.cov The coefficient of variation, which is defined as the

ratio of the standard deviation to the mean of avariable.

Std() The standard deviation of a variable.E() The mean/expectation of a variable.

expression of the probability density function,which helps to construct the model analytically.

3.1 The Strategic user equilibrium assignment model(StrUE)

The StrUE model is used as the assignment model forthe O–D estimation, and details of the StrUE model canbe found in Dixit et al. (2013). In summary, the StrUEmodel has the following objective function:

Minimize : z ([p]) =∑nεN

pn∫0

+∞∫−∞

tn ( f T ) g (T ) dT d f, (1)

Subject to: ∑k

hrsk = qrs, ∀k, r, s, (2)

hrsk ≥ 0, ∀k, r, s, (3)

pn =∑

r

∑s

∑k

hrsk δrs

n,k, ∀k, r, s. (4)

In this formulation, users’ strategic choice is denotedas the link proportions pn , that is, the link flow di-vided by the total demand T; [p] represents a vec-tor of all the link proportions; and f is a dummyvariable for integration. Each O–D demand is thefixed demand proportion qrs multiplied by the totaldemand T, that is, each O–D demand is proportionalto the total demand, and the proportions are fixed con-stants. This implies that all O–D demands are perfectlycorrelated. The total demand is assumed to follow a cer-tain distribution. The objective function is the sum ofthe integrals of the expected value of the link cost func-tions; the link proportions are proved to be unique un-der the assumption of fixed demand proportions. Notethat in StrUE users will stick to their link choice pro-portions pn regardless of the realized total demandday-to-day (i.e., once decided, pn is a constant in theformulation), whereas link flow does vary every day cor-responding to the realized total demand. In many O–Ddemand calibration literatures, the users’ route choiceis assumed to be either a known a priori or obtainedfrom a traditional assignment model. In this article, theStrUE explicitly accounts for demand uncertainty andits impact on users’ route choice while maintaining thecomputation simplicity, and is integrated into the totaldemand calibration context, which has rarely been donein the past literature.

The link travel time function for the StrUE model isdefined by the U.S. Bureau of Public Roads (U.S., 1964)cost function due to its widespread use in transport plan-ning models:

tn (ln) = tn f

[1 + α

(ln

Cn

)β]

, (5)

where α and β are the parameters for the BPR function.The fraction of the total demand between O–D pair r-s, namely qrs, can be obtained from the prior estimates,that is, census data, gravity model. The total demand isassigned to each link by the link proportions:

ln = pn T pn ∈ P, n ∈ N . (6)

Page 6: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 287

The link proportions are obtained from the StrUE as-signment model, therefore it is known constant in theO–D matrix estimation problem and total demand is thestochastic variable; from Equation (6), each link flowis the link proportions multiplied by the total demandvariable. According to the properties of lognormaldistribution, if the total demand variable follows alognormal distribution, then the link flow representedin Equation (6) also follows a lognormal distribution,which is related to the total demand distribution by thefollowing equations:

mn = pn mT , (7)

vn = p2n vT . (8)

3.2 The maximum likelihood O–D estimationincorporating the strategic user equilibrium(MLOD)

In MLOD, we first find the parametric expression ofthe probability density function of link flow distribution.The parameters for the link flow distribution can be ob-tained by the definition of lognormal distribution:

μn = ln mn − 12

ln(

1 + vn

m2n

), (9)

σ 2n = ln

(1 + vn

m2n

). (10)

Substitute Equations (7) and (8) into Equations (9)and (10), we have the transformation of the total de-mand distribution to link flow distribution:

σn = σT , (11)

μn = lnpn + μT . (12)

It is important to realize that μ and σ 2, which ap-pear in the equations of the lognormal distribution, donot denote the mean and the variance of the lognor-mal distribution, but of the corresponding parametersof the normal distribution. The mean and the varianceof the lognormal distribution are indicated in the follow-ing discussion by m and v. Since each link flow follows alognormal distribution, the probability density functionof observing xni trips on link n is

P (xni ) = 1

xniσn

√2π

e− (ln xni −μn)2

2σ2n n ∈ N , (13)

where, xni is the observed flow on link n for a fixed timeperiod (e.g., the flow rate during morning peak hours)on day i. Here the observed link flows are indicated by

an n-by-i matrix, where n is the number of links and i isthe number of observations:

Xni =

⎡⎢⎣

x11 · · · x1i...

. . ....

xn1 · · · xni

⎤⎥⎦ . (14)

The maximum likelihood method here is to maximizethe joint probability of observing all sets of link flows,in order to reduce the effect of noise and observationfailure. The joint probability density function is givenby the following equation:

j (Xni ) =i∏1

n∏1

1

xniσn

√2π

e− (ln xni −μn)2

2σ2n . (15)

Conventionally, we maximize the logarithm of thejoint probability, because taking the log of the func-tion will not change its convexity. By plugging Equa-tions (11) and (12) into Equation (15) and changing thesigns, the objective function becomes:

min : J (μT , σT ) =i∑1

n∑1

[ln(xniσT

√2π)

+(

ln xnipn

− μT

)2

2σT2

, (16)

subject to: σT > 0.The Hessian matrix of the objective function is posi-

tive definite, hence the function is strictly convex, whichguarantees the solution is globally optimal and unique.Note that convexity is critical for transport planners,because if there exist multiple optimal solutions to theobjective function it would be difficult to perform thesubsequent analysis which is based on the estimated O–D matrix. The optimal solutions can be found by takingthe first derivative with respect to mean and variance oftotal demand:

μT =∑i

1

∑n1 ln xni

pn

ni, (17)

σ 2T =

∑i1

∑n1

(ln xni

pn− μT

)2

ni(18)

Sensitivity is a measurement of a model’s robustness,in the proposed model, observed loop counts may be er-roneous due to loop detector failure, and here the errorterm is expressed as a matrix that has the same dimen-sion as the observed loop counts:

Eni =

⎡⎢⎣

e11 · · · e1i...

. . ....

en1 · · · eni

⎤⎥⎦ . (19)

Page 7: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

288 Wen et al.

Note that in lots of previous literature the error termis assumed to be Gaussian, that is, the error term isa variable which follows a normal distribution. Moregenerally, in this article the error term is modelled asa variable, and thus no distributional assumptions arerequired. The loop count matrix with error term is:

Rni = Xni + Eni =

⎡⎢⎣

x11 + e11 · · · x1i + e1i...

. . ....

xn1 + en1 · · · xni + eni

⎤⎥⎦ . (20)

The corresponding estimated μT and σ 2T based on ob-

served loop counts with error term are represented asRμT and Rσ 2

T :

R μT =∑i

1

∑n1 ln xni +eni

pn

ni, (21)

R σ 2T =

∑i1

∑n1

(ln xni +eni

pn− RμT

)2

ni. (22)

From the definition of lognormal distribution, thecorresponding expected total demand with and withouterror term are expressed as RmT and mT , respectively:

mT = eμT +0.5σ 2T = e

∑i1

∑n1 ln

xnipn +0.5

∑i1

∑n1 (ln

xnipn −μT )2

ni , (23)

RmT = eRμT +0.5Rσ 2T = e

∑i1

∑n1 ln

xni +enipn +0.5

∑i1

∑n1 (ln

xni +enipn −RμT )2

ni .

(24)

The sensitivity function of the expected total demandis

RmT − mT = e

∑i1

∑n1 ln

xni +enipn +0.5

∑i1

∑n1 (ln

xni +enipn −RμT )2

ni

−e

∑i1

∑n1 ln

xnipn +0.5

∑i1

∑n1 (ln

xnipn −μT )2

ni . (25)

Also, the analytical solution of the standard deviationof the total demand with and without error term can beexpressed as RsT and sT , respectively:

sT = eμT +0.5σ 2T

√eσ 2

T − 1 = e

∑i1

∑n1 ln

xnipn +0.5

∑i1

∑n1 (ln

xnipn −μT )2

ni

×√

e∑i

1∑n

1 (lnxnipn −μT )2

ni − 1,

(26)

R sT = eRμT +0.5Rσ 2T

√eRσ 2

T − 1

= e

∑i1

∑n1 ln

xni +enipn +0.5

∑i1

∑n1 (ln

xni +enipn −RμT )2

ni

×√

e∑i

1∑n

1 (lnxni +eni

pn −RμT )2

ni − 1. (27)

The sensitivity function of the standard deviation oftotal demand is:

RsT − sT =

⎡⎢⎢⎣e

∑i1

∑n1 ln

xni +enipn +0.5

∑i1

∑n1

⎛⎝ln

xni +enipn −

∑i1

∑n1 ln

xni +enipn

ni

⎞⎠

2

ni

×

√e

∑i1

∑n1

⎛⎝ln

xni +enipn −

∑i1

∑n1 ln

xni +enipn

ni

⎞⎠

2

ni − 1

⎤⎥⎥⎦

−⎡⎣e

∑i1

∑n1 ln

xnipn +0.5

∑i1

∑n1 (ln

xnipn −μT )2

ni ×√

e∑i

1∑n

1 (lnxnipn −μT )2

ni − 1

⎤⎦ .

(28)

Note that although Equations (17) and (18) alreadyprovide the formula for the estimator and hence alsothe dependence on the noise terms, the complication inEquations (25) and (28) mainly stem from convertingμT and σT to the mean and variance of the total de-mand due to that showing the mean and variance couldbe more intuitive than showing parameters themselves.

The analytical expression of the sensitivity functionindicates some characteristics of the estimated results ifwe design the sensitivity analysis as following:

(1) The specific error: the proportion of the error termeni to the actual flow xni on link n:

eni = kxni , n ∈ N . (29)

(2) The systematic error: the number of links that areunder a specific error:

epi = kx pi , p ∈ P, P ∪ Q = N , (30)

eqi = 0, q ∈ Q, P ∪ Q = N . (31)

where P is the set of links with error and Q is the set oflinks without error. The expected total demand is mono-tonically increasing with respect to the systematic errorand specific error, but the estimated standard deviationof total demand is not monotonic with respect to the sys-tematic error, because from the analytical expression inEquation (28) we can see it is determined by several fac-tors including link proportion, link flow and the specific

Page 8: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 289

error. The combined effect of these factors does not sat-isfy monotonicity. Some discrete scenarios are designedin Table 3 to quantify the impact of error in a simplifiedway.

3.3 The least squares O–D estimation incorporatingthe strategic user equilibrium (LSOD)

Using the similar notations in the MLOD method buta different statistical method, the LSOD method mini-mizes the sum of the squared residual in mean and stan-dard deviation of link flow:

min : Obj (mT , sT ) =n∑1

(pnmT − mn)2

+n∑1

(pnsT − sn)2. (32)

Comparing to maximum likelihood method, the leastsquares method does not require distributional assump-tions on link flow. Here the link flow is still assumed tobe a random variable, with mean mT and variance sT .It is shown that the Hessian matrix is strictly positive,therefore the objective function has unique optimal so-lution, which can be found when the first derivative ofthe objective function is equal to zero:

∂Obj (mT , sT )∂mT

= 0 → mT =∑n

1 pnmn∑n1 p2

n

, (33)

∂Obj (mT , sT )∂sT

= 0 → sT =∑n

1 pnsn∑n1 p2

n

. (34)

Similar to the maximum likelihood method, the es-timated mean and standard deviation of total demandwith error term can be expressed as RmT and RsT , re-spectively:

R mT =∑n

1 pn

∑i1(eni +xni )

i∑n1 p2

n

, (35)

R sT =∑n

1 pn Rsn

i × ∑n1 p2

n

. (36)

Therefore, the sensitivity expression of estimated re-sults can be obtained:

RmT − mT =∑n

1 pn

∑i1(eni +xni )

i∑n1 p2

n

−∑n

1 pn

∑i1 xni

i∑n1 p2

n

=∑n

1 pn∑i

1 eni

i × ∑n1 p2

n

, (37)

RsT − sT =∑n

1 pn (Rsn − sn)i × ∑n

1 p2n

=

∑n1 pn

⎛⎜⎝

√∑i1

[xni +eni −

∑i1(xni +eni )

i

]2

i −

√∑i1

[xni −

∑i1 xnii

]2

i

⎞⎟⎠

i × ∑n1 p2

n

.

(38)

In LSOD method, the sensitivity function of boththe mean and standard deviation of total demand illus-trates its monotonicity with respect to systematic errorand specific error. The sensitivity analysis will prove themonotonicity in both methods in Section 4.

If we compare the optimal solution of both MLODand LSOD methods, since the logarithm calculus inMLOD method (see Equations (17) and (18)) is onlydefined for strictly positive numbers, loop counts dataneeds to be filtered out when either loop counts or linkproportions equals zero. Therefore, MLOD may not beused if a network has too many links with zero loopcount.

3.4 The bi-level iterative process

Assuming that the StrUE model represents the routechoice behavior, we can formulate a bi-level program-ming problem, where the upper level is either MLODor LSOD, and the lower level is the StrUE assignmentproblem. The input to the upper level is the fixed de-mand proportions, link proportions, and loop detec-tor data, the output is the total demand distribution.The total demand distribution, in conjunction with thefixed demand proportions, will then be used as inputto the lower-level problem, which produces the linkproportions. The objective functions of both the up-per and the lower level are strictly convex, the con-straints are also convex, which means the optimiza-tion model always have feasible solutions. However,it should be noted that the bi-level programming op-timization program may not be convex, because theoptimization variable of the upper level is the param-eters of the demand distribution whereas that of thelower level is the link proportions, and if the Karush-Kuhn-Tucker (KKT) conditions of lower-level opti-mization problem were plugged in as the constraintsfor the upper-level problem, they are no longer con-vex with respect to the upper-level’s optimization vari-able (the demand distribution parameters). Therefore,the iterative process may not converge. Despite this,when the initial solution is close to the actual one, the

Page 9: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

290 Wen et al.

solution will very likely be the global optimal becausethe search domain will be limited to a certain neigh-borhood of the optimal solution. Experiences haveshown that in some cases the results are encourag-ing (Yang et al., 1992). In this article, a 4-step so-lution algorithm has been proposed to the bi-levelprogramming:

(1) Initialization: k = 0. Start from the prior O–D ma-trix; obtain the demand proportions qrs and initialvalues for the mean and the variance of the to-tal demand. Using this total demand distributionas the input and implement the StrUE model toget a set of link proportions [p]k (which providesthe proportions of users choosing each link). Notethat qrs will be kept invariant over the bi-level it-erations, whereas μk

T and σ kT will be calibrated.

(2) Optimization: Substituting the link-flow propor-tion matrix [p]k , solve the upper-level to obtainμk+1

T + and σ k+1T of the total demand.

(3) Simulation: Using μk+1T and σ k+1

T , apply theStrUE model to produce a new set of link flowproportions [p]k+1.

(4) Convergence test: Calculate the deviation betweenanalytically estimated and observed link flows, andthe deviation between analytically estimated to-tal demand distributions of two consecutive iter-ations, when both of them have met the stoppingcriterions (the relative change is smaller than acritical value), stop.

4 NUMERICAL RESULTS AND ANALYSIS

4.1 Example of a moderate-scale network: the SiouxFalls network

The objective of the analysis is to test if the MLODand LSOD can effectively estimate the total demanddistribution from day-to-day simulated link flows. Thesimulation approach consists of artificially determiningthe total demand distribution (In Sioux Falls network,the expected total demand is provided, the simulatedstandard deviation of total demand is assumed here.)and generating random link flow samples accordingly.The simulated link flows are used to represent day-to-day observed link flows discussed in the previousparts. The estimated total demand distribution shouldclosely approximate the simulated total demand distri-bution; the link flow distribution produced by the StrUEmodel should also closely match the simulated linkflows. The analysis reveals both proposed estimationmethods will reproduce the desired total demand distri-bution from the random samples with perturbed prior

Fig. 1. The Sioux Falls network.

estimates. The analysis also reflects the scalability ofthe both MLOD and LSOD to networks of substantialcomplexity.

Numerical tests are conducted on the Sioux Fallsnetwork (24 nodes and 76 links). The network prop-erties are predefined in Bar-Gera (2012a,b) (seeFigure 1). The notations used in this section are definedin Table 1. Each O–D demand is specified as a propor-tion of the total network demand, therefore the demandfor a given O–D pair is the O–D proportion multipliedby the total demand. The BPR function parameters α

and β are set to 0.15 and 4.0, respectively.The simulated link flows are generated by the follow-

ing way:

(1) The parameters μT , σT are determined for thesimulated total demand.

(2) We implement the StrUE based on the simulatedtotal demand distribution and obtain a set of linkproportions.

(3) We generate 100 samples of the total demand fromthe lognormal distribution using μT , σT as param-eters and each sampled total demand is assigned tothe network using the precalculated link propor-tions.

Note that by doing this we assume that StrUE canrepresent users’ route choice behavior. In the realworld, users’ travel behavior is extremely complicated

Page 10: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 291

Table 2Different scenarios of initial estimation of the O–D matrix

Scenario Scenario description mT sT

1 mT = 0.8m A and cov = 0.1 288,480 28,8482 mT = 0.8m A and cov = 0.3 288,480 86,5443 mT = 1.2m A and cov = 0.1 432,720 43,2724 mT = 1.2m A and cov = 0.3 432,720 129,8165 mT = 1.5m A and cov = 0.1 540,900 54,0906 mT = 1.5m A and cov = 0.3 540,900 162,270Simulated m A = 360, 600 and cov = 0.2

and is still an open question, which is beyond the scopeof this article. The simulated expected total demand ofthe Sioux Falls network is m A = 360, 600, and the co-efficient of variation (cov) is equal to 0.2, that is, thestandard deviation is 20% of the expected total demand.In Table 2, scenarios 1–6 represent different initialestimates of the total demand distribution. From thesimulation all the links are used.

In Figures 2 and 3, the x-axis represents the numberof iterations of the bi-level program. Figures 2 and 3illustrate the estimated mean and standard deviationof total demand in each iteration for the MLOD andLSOD methods, respectively. Each series representsan initial demand scenario. Both figures show that theestimated results converge to the simulated ones withinthree iterations. This indicates the model’s robustperformance against biased initial estimates anddemonstrates the efficiency in arriving at convergence.The estimated results of the first iteration in bothfigures are very different from the simulated ones. Thisis because the link proportions of the first iteration areobtained based on the initial demand scenario specified.The initial estimates in scenarios 1 and 2 are very biased,and as a result, the first iteration results are inaccurate.Both MLOD and LSOD provide a similar estimationof E(T) that is approximately equal to the simulatedexpected total demand. Note that MLOD alwaysprovide an overestimation in Std(T) as long as initialestimates are biased, but this overestimation is elimi-nated after the second or third iteration. It is, therefore,necessary to incorporate the bi-level process to reducethe impact of biased initial estimates, especially dueto the difficulty in obtaining the standard deviation ofdemand.

Figures 4 and 5 compare the performance of the esti-mation methods at the link level for the simulated andanalytical results. In Figure 4, the x-axis indicates thesimulated expected link flow, whereas the y-axis repre-sents the estimated expected link flow. The estimatedlink flows are analytically produced by the StrUE modelbased on the total demand distribution after the conver-

gence criterion has been met. The estimated expectedlink flows and the corresponding simulated expectedlink flows are sorted from the smallest to the largest.The R2 values of both methods are 0.9837 and 0.9917,respectively, which are very close to 1, it indicates thatthe estimated results closely approximate the simulatedexpected link flows.

A major strength of the proposed estimation meth-ods, which is an artefact of using the StrUE model fortraffic assignment, is the estimation of link flow varia-tion. Since the total demand distribution is calibratedbased on day-to-day simulated link flows, it is thereforenecessary to compare the estimated standard deviationof link flow to the simulated one. In Figure 5, the esti-mated standard deviation of link flow is produced by theStrUE model based on the total demand distributionafter the bi-level convergence criterion has been met.The x-axis denotes the simulated standard deviation oflink flow, whereas the y-axis indicates the estimatedone. It is illustrated in Figure 5 that despite the factthat the R2 value is smaller than that of the expectedlink flow analysis; the R2 values of both methods stillsuggest a satisfying goodness of fit. Note that if thestandard deviation of link flow is very high, the esti-mated results may be more than 20% different fromthe simulated ones. Such heteroscedasticity is due tothat the corresponding link proportions are high. How-ever, both methods can reproduce the link flow distribu-tion if provided an estimated total demand distribution,which indicates its applicability to traffic assignmentmodel.

4.2 Model sensitivity to error in loop detector data forthe Sioux Falls network

Loop detector data is prone to error, and the error canbe far more complicated in reality, for simplicity, herewe design the systematic error and specific error as inEquations (29)–(31) to test the model’s sensitivity. Therobustness of the proposed estimation methods with re-spect to both error types is explored using sensitivityanalysis. The specific error indicates the significance ofthe error such as the failure of loop detectors or lack ofinformation on a link; the latter one measures the scaleof the error, that is, which links have an error. In thisanalysis, the specific error is set at 10%, 20%, and 30%respectively, and systematic error is represented as a setof links that have an error, which is shown in Table 3.The impact of different systematic error and specific er-ror is illustrated, by providing the estimated mean andstandard deviation of the total demand. Note that asthis is a sensitivity analysis, the link proportions derivedfrom the prior demand distribution are assumed to beidentical to the simulated ones to eliminate the effect of

Page 11: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

292 Wen et al.

Fig. 2. (a) Estimated expected total demand of MLOD under different scenarios of initial estimation; results of 10 bi-leveliterations are presented. (b) Estimated standard deviation of the total demand of MLOD under different scenarios of initial

estimation; results of 10 bi-level iterations are presented.

biased prior estimates, that is, link proportions are fixed,since we only focus on the impact of erroneous link flowobservations in this part.

In Figure 6, the x-axis represents the systematic er-ror, from 0% (no link has an error) to 100% with anincrement of 20%. Each series indicates a different spe-cific error, from 10% to 30% inflation with an increment

of 10%. The y-axis starts from the estimated expectedtotal demand without error. It is demonstrated that theestimated expected total demand of both methods riseswith the increase in systematic error and specific error;these two error categories have a moderate impact onthe estimated expected total demand in both methods.Additionally, LSOD provides estimation closer to the

Page 12: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 293

Fig. 3. (a) Estimated expected total demand of LSOD under different scenarios of initial estimation; results of 10 bi-leveliterations are presented. (b) Estimated standard deviation of the total demand of LSOD under different scenarios of initial

estimation; results of 10 bi-level iterations are presented.

simulated one under low systematic error (20%). There-fore, both systematic error and specific error should betreated equally, and under low systematic error, LSODcan potentially provide a better estimation of estimatedexpected total demand.

Interestingly, in Figure 7, there is a drop of Std[T]when the systematic error is high in the results ofMLOD. This can be explained by the nonlinearity char-acteristics in the analytical expression of sensitivity. Be-cause from the analytical expression in Equation (28),

Page 13: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

294 Wen et al.

Fig. 4. (a) The estimated and simulated expected link flowcomparison of MLOD method, estimated results are

produced by the StrUE model based on estimated demanddistribution. (b) The estimated and simulated expected link

flow comparison of LSOD method, estimated results areproduced by the StrUE model based on estimated demand

distribution.

we can see it is determined by several factors includ-ing link proportion, link flow, and the specific error.The combined effect of these factors does not satisfymonotonicity. In LSOD method, Std[T] monotonicallyincrease with systematic error and specific error, and itprovides a better estimated Std[T] under all scenariosbecause of its advantage in the estimation of an overde-termined system.

Fig. 5. (a) The estimated and simulated standard deviation oflink flow comparison of MLOD method, estimated results

are produced by the StrUE model based on estimateddemand distribution. (b) The estimated and simulated

standard deviation of link flow comparison of LSOD method,estimated results are produced by the StrUE model based on

estimated demand distribution.

4.3 Example of a medium-scale network: the Anaheimnetwork

To demonstrate the proposed model’s scalability, a nu-merical analysis is also conducted on the Anaheim net-work, which consists of 38 zones, 416 nodes, and 914links. The demand proportions and network propertiescan be found in Bar-Gera (2012a,b). Units used in thenetwork are: length is in feet; free flow travel time in

Page 14: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 295

Table 3Different systematic error and specific error

Links with errorterm (systematicerror)

Total numberof samples with

error term Specific error

1-76 7,600 10%, 20%, 30%16-76 6,100 10%, 20%, 30%31-76 4,600 10%, 20%, 30%46-76 3,100 10%, 20%, 30%61-76 1,600 10%, 20%, 30%

minutes; speed in feet per minute. Parameters for BPRfunction can be found in Table 4. A similar method asin example one is used to generate the simulated linkflows, the actual expected total demand is computed byaggregating the trip table. Figure 8 presents a map ofthe Anaheim network.

Results of both methods are illustrated in Table 5;same performance measures are depicted as in example

Fig. 6. (a) The estimated expected demand of MLODmethod under different systematic error and specific error.

(b) The estimated expected demand of LSOD method underdifferent systematic error and specific error.

Fig. 7. (a) The estimated standard deviation of the totaldemand of MLOD method under different systematic errorand specific error. (b) The estimated standard deviation of

the total demand of LSOD method under differentsystematic error and specific error.

Table 4Parameters of BPR function

BPR function

tn (ln) = tn f [1 + 0.15 ∗ tn f ( lnCn

)4]

one. In addition, to demonstrate the computation effi-ciency, computation time is recorded (Code is writtenin MATLAB, the computer used here has the followingconfigurations: Cpu: Intel i7-3770 3.4G Hz quad core,Ram size: 16 GB, Software: Windows 7 enterpriseversion, note that computation time may vary depend-ing on computer configuration, software version, andother factors). The estimated expected total demandof both methods is statistically indifferent to the actualexpected demand. MLOD method slightly underesti-mates the standard deviation of total demand; althoughLSOD overestimates the standard deviation of totaldemand, but the relative error is rather small (relativeerror is calculated by dividing absolute error by actual

Page 15: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

296 Wen et al.

Fig. 8. The Anaheim network.

Table 5Performance measures of the Anaheim network

Performance measures MLOD LSOD

Estimated mT 104,730 104,890Relative error of

estimated mT

0.03% 0.19%

Estimated sT 20,304 21,433Relative error of

estimated sT

3.03% 2.36%

R2 value of estimated andsimulated expected linkflow

0.991 0.993

R2 value of estimated andsimulated standarddeviation of link flow

0.984 0.989

Computation time 179 seconds 156 secondsActual demand distribution mA = 104690

and sA = 20939

value). R2 values on the link level demonstrate that theestimated total demand distribution can reproduce a setof link flow close to the simulated ones. Computationtime is the time of running the model till convergenceand both methods have a short computation time(less than 3 minutes), which proves the model’s ap-plicability on larger-scale networks. The computationburden is mainly on the F-W algorithm of the StrUEassignment because each iteration of the F-W algo-rithm implements the Dijkstra’s algorithm to solve theshortest path problem, in the worst-case scenario thecomputation complexity of Dijkstra’s algorithm willbe n2, the total demand calibration process is compu-tationally simple. MLOD takes a moderately longercomputation time, due to that some links have no flow

on it, and some link proportions may become zeroduring the bi-level process. In this case, these data needto be filtered out, which increases the computationburden to some extent.

5 CONCLUSION

This article proposes two methods (MLOD and LSOD)to estimate the total traffic demand distribution (triptable) based on day-to-day observed link flows. Themodel considers link flow variation when estimatingthe total demand distribution and the StrUE modelaccounts for the impact of demand volatility in users’route choice decision. A bi-level programming methodis included to reduce the impact of biased initial es-timates of the total demand distribution; both upper-level and lower-level problems are proved to be strictlyconvex. A sensitivity analysis is conducted for bothMLOD and LSOD methods. A numerical analysis isconducted on the Sioux Falls network and Anaheimnetwork, results for the system level and the link levelare similar, and demonstrated the robustness of bothMLOD and LSOD methods. In general, both MLODand LSOD demonstrate scalability with computation ef-ficiency while providing a satisfying estimation of to-tal demand distribution. However, the MLOD methodrequires nonzero link flow and link proportions, whichmay limit its applicability in some cases.

Sensitivity analysis shows that the impact of error ispredictable. Under the proposed model the problem isoverdetermined (unlike the traditional O–D estimationproblem), that is, the number of variables to be esti-mated is far smaller than the number of known con-straints, therefore the lack of information on severallinks will not interfere with the model’s implementation.In addition, two consequences can be derived: (1) theadvantage of the LSOD method is that it is less sensitiveto detector error, which is due to that only the mean andstandard deviation of link flows are considered; the im-pact of errors or outliers will be averaged and thus theestimation tends to be less sensitive to error and (2) de-spite different objective functions for the two proposedmethods, the estimated results without error will be sim-ilar because in the proposed models the number of con-straints greatly exceeds the parameters to be calibrated,namely the total demand distribution.

Finally, it should be noted that the assumption offixed O–D demand proportions may limit the modelapplicability. Therefore, one future research effort willbe the generalization of the proposed model, especiallyon estimating O–D demand independently for each O–D pair. This requires extending the strategic user equi-librium for the case of independently distributed O–D

Page 16: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 297

demands; the key contribution will be to account fordemand volatility in the assignment model while con-sidering observed link flow variation in O–D demandestimation. This article already provides an insight inlight of this. Also, it is valuable to investigate the use ofthe covariance of loop counts. This can potentially pro-vide much more information than only the link flow dis-tribution. Additionally, dynamic traffic assignment maybe integrated into the proposed model. Generally, sincethe OD estimation problem is a combination of a statis-tical optimization model and a traffic assignment model,an improvement in either process warrants furtherresearch.

REFERENCES

Aerde, M., Rakha, H. & Paramahamsan, H. (2003),Estimation of origin-destination matrices: relationshipbetween practical and theoretical considerations, Trans-portation Research Record: Journal of the TransportationResearch Board, 1831, 122–30.

Airoldi, E. M. & Blocker, A. W. (2013), Estimating latentprocesses on a network from indirect measurements, Jour-nal of the American Statistical Association, 108(501), 149–64.

Baek, S., Kim, H. & Lim, Y. (2004), Multiple-vehicle origin-destination matrix estimation from traffic counts using ge-netic algorithm, Journal of Transportation Engineering,130(3), 339–47.

Bar-Gera, H. (2012a), Transportation Test Problems ofAnaheim Network. Available at: http://www.bgu.ac.il/˜bargera/tntp/, accessed December 2015.

Bar-Gera, H. (2012b), Transportation Test Problems ofSioux Falls Network. Available at: http://www.bgu.ac.il/˜bargera/tntp/, accessed December 2015.

Bell, M. G. H. (1983), The estimation of an origin-destinationmatrix from traffic counts, Transportation Science, 17(2),198–217.

Bell, M. G. H. (1991), The estimation of origin-destinationmatrices by constrained generalised least squares, Trans-portation Research Part B: Methodological, 25(1), 13–22.

Bellei, G., Gentile, G., Meschini, L. & Papola, N. (2006), A de-mand model with departure time choice for within-day dy-namic traffic assignment, European Journal of OperationalResearch, 175(3), 1557–76.

Bierlaire, M. & Crittin, F. (2004), An efficient algorithm forreal-time estimation and prediction of dynamic OD tables,Operations Research, 52(1), 116–27.

Brilon, W., Geistefeldt, J. & Regler, M. (2005), Reliabilityof freeway traffic flow: a stochastic concept of capacity, inProceedings of the 16th International Symposium on Trans-portation and Traffic Theory, vol. 125143, College Park,MD.

Cao, J., Davis, D., Vander Wiel, S. & Yu, B. (2000), Time-varying network tomography: router link data, Journal ofthe American Statistical Association, 95(452), 1063–75.

Cascetta, E. (1984), Estimation of trip matrices from trafficcounts and survey data: a generalized least squares esti-mator, Transportation Research Part B: Methodological, 18(4–5), 289–99.

Castillo, E., Menendez, J. M., Sanchez-Cambronero, S.,Calvino, A. & Sarabia, J. M. (2014), A hierarchical opti-mization problem: estimating traffic flow using gamma ran-dom variables in a Bayesian context, Computers & Opera-tions Research, 41, 240–51.

Chen, A., Chootinan, P. & Recker, W. (2009), Norm approx-imation method for handling traffic count inconsistenciesin path flow estimator, Transportation Research Part B:Methodological, 43(8), 852–72.

Chootinan, P., Chen, A. & Recker, W. (2005), Improvedpath flow estimator for origin-destination trip tables, Trans-portation Research Record: Journal of the TransportationResearch Board, 1923, 9–17.

Codina, E., Garcıa, R. & Marın, A. (2006), New algorith-mic alternatives for the O-D matrix adjustment problemon traffic networks, European Journal of Operational Re-search, 175(3), 1484–1500.

Cremer, M. & Keller, H. (1987), A new class of dynamic meth-ods for the identification of origin-destination flows, Trans-portation Research Part B: Methodological, 21(2), 117–32.

Dixit, V., Gardner, L. M. & Waller, S. T. (2013), Strategic userequilibrium assignment under trip variability, in Proceed-ings of Transportation Research Board 92nd Annual Meet-ing, vol. 9.

Duell, M., Gardner, L., Dixit, V. & Waller, S. (2014), Evalua-tion of a strategic road pricing scheme accounting for day-to-day and long-term demand uncertainty, TransportationResearch Record: Journal of the Transportation ResearchBoard, 2467, 12–20.

Duthie, J. C., Unnikrishnan, A. & Waller, S. T. (2011), Influ-ence of demand uncertainty and correlations on traffic pre-dictions and decisions, Computer-Aided Civil and Infras-tructure Engineering, 26(1), 16–29.

Ehlert, A., Bell, M. G. & Grosso, S. (2006), The optimisationof traffic count locations in road networks, TransportationResearch Part B: Methodological, 40(6), 460–79.

Fisk, C. S. (1988), On combining maximum entropy trip ma-trix estimation with user optimal assignment, Transporta-tion Research Part B: Methodological, 22(1), 69–73.

Foulds, L. R., do Nascimento, H. A. D., Calixto, I. C. A. C.,Hall, B. R. & Longo, H. (2013), A fuzzy set-based approachto origin–destination matrix estimation in urban traffic net-works with imprecise data, European Journal of Opera-tional Research, 231(1), 190–201.

Frederix, R., Viti, F. & Tampere, C. M. J. (2011), Dynamicorigin–destination estimation in congested networks: theo-retical findings and implications in practice, Transportmet-rica A: Transport Science, 9(6), 494–513.

Gan, L., Yang, H. & Wong, S. C. (2005), Traffic counting lo-cation and error bound in origin-destination matrix esti-mation problems, Journal of Transportation Engineering,131(7), 524–34.

Garcia-Rodenas, R. & Verastegui-Rayo, D. (2008), A col-umn generation algorithm for the estimation of origin–destination matrices in congested traffic networks, Euro-pean Journal of Operational Research, 184(3), 860–78.

Genschel, U. & Meeker, W. Q. (2010), A comparison of max-imum likelihood and median-rank regression for Weibullestimation, Quality Engineering, 22(4), 236–55.

Gentili, M. & Mirchandani, P. (2012), Locating sensors ontraffic networks: models, challenges and research opportu-nities, Transportation Research Part C: Emerging Technolo-gies, 24, 227–55.

Page 17: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

298 Wen et al.

Gong, Z. (1998), Estimating the urban OD matrix: a neuralnetwork approach, European Journal of Operational Re-search, 106(1), 108–15.

Hazelton, M. L. (2003), Some comments on origin–destinationmatrix estimation, Transportation Research Part A: Policyand Practice, 37(10), 811–22.

Hazelton, M. L. (2015), Network tomography for integer-valued traffic, The Annals of Applied Statistics, 9(1), 474–506.

Kamath, K. R. & Pakkala, T. P. M. (2002), A Bayesian ap-proach to a dynamic inventory model under an unknowndemand distribution, Computers & Operations Research,29(4), 403–22.

Kattan, L. & Abdulhai, B. (2006), Noniterative approach todynamic traffic origin-destination estimation with parallelevolutionary algorithms, Transportation Research Record:Journal of the Transportation Research Board, 1964, 201–10.

Kim, H., Baek, S. & Lim, Y. (2001), Origin-destination ma-trices estimated with a genetic algorithm from link traf-fic counts, Transportation Research Record: Journal of theTransportation Research Board, 1771, 156–63.

Kim, J., Kurauchi, F. & Uno, N. (2009), Analysis of variationin demand and performance of urban expressways usingdynamic path flow estimation, Transportmetrica, 7(1), 63–84.

Larsson, T., Lundgren, J. T. & Peterson, A. (2010), Allo-cation of link flow detectors for origin-destination matrixestimation—a comparative study, Computer-Aided Civiland Infrastructure Engineering, 25(2), 116–31.

Lo, H., Zhang, N. & Lam, W. (1999), Decomposition algo-rithm for statistical estimation of OD matrix with randomlink choice proportions from traffic counts, TransportationResearch Part B: Methodological, 33(5), 369–85.

Lo, H., Zhang, N. & Lam, W. H. (1996), Estimation of anorigin-destination matrix with random link choice propor-tions: a statistical approach, Transportation Research PartB: Methodological, 30(4), 309–24.

Maher, M. (1983), Inferences on trip matrices from obser-vations on link volumes: a Bayesian statistical approach,Transportation Research Part B: Methodological, 17(6),435–47.

Maus, M., Cotlet, M., Hofkens, J., Gensch, T., De Schryver, F.C., Schaffer, J. & Seidel, C. (2001), An experimental com-parison of the maximum likelihood estimation and non-linear least-squares fluorescence lifetime analysis of singlemolecules, Analytical Chemistry, 73(9), 2078–86.

Menon, A. K., Cai, C., Wang, W., Wen, T. & Chen, F.(2015), Fine-grained OD estimation with automated zoningand sparsity regularisation, Transportation Research Part B:Methodological, 80, 150–72.

Nie, Y. & Lee, D.-H. (2002), Uncoupled method forequilibrium-based linear path flow estimator for origin-destination trip matrices, Transportation Research Record:Journal of the Transportation Research Board, 1783,72–79.

Nie, Y., Zhang, H. & Recker, W. (2005), Inferring origin–destination trip matrices with a decoupled GLS path flowestimator, Transportation Research Part B: Methodological,39(6), 497–518.

Reddy, K. H. & Chakroborty, P. (1998), A fuzzy inference-based assignment algorithm to estimate OD matrix fromlink volume counts, Computers, Environment and UrbanSystems, 22(5), 409–23.

Sherali, H. D., Narayanan, A. & Sivanandan, R. (2003), Esti-mation of origin–destination trip-tables based on a partialset of traffic link volumes, Transportation Research Part B:Methodological, 37(9), 815–36.

Sherali, H. D., Sivanandan, R. & Hobeika, A. G. (1994),A linear programming approach for synthesizing origin-destination trip tables from link traffic volumes, Trans-portation Research Part B: Methodological, 28(3), 213–33.

Spiess, H. (1987), A maximum likelihood model for estimatingorigin-destination matrices, Transportation Research PartB: Methodological, 21(5), 395–412.

Stathopoulos, A. & Tsekeris, T. (2004), Hybrid meta-heuristicalgorithm for the simultaneous optimization of the O–Dtrip matrix estimation, Computer-Aided Civil and Infras-tructure Engineering, 19(6), 421–35.

Szeto, W., Jiang, Y. & Sumalee, A. (2011), A cell-basedmodel for multi-class doubly stochastic dynamic traffic as-signment, Computer-Aided Civil and Infrastructure Engi-neering, 26(8), 595–611.

Tebaldi, C. & West, M. (1998), Bayesian inference on networktraffic using link count data, Journal of the American Statis-tical Association, 93(442), 557–73.

Uchida, T. & Iida, Y. (1993), Risk assignment. A new trafficassignment model considering the risk of travel time vari-ation, in Proceedings of the 12th International Symposiumon the Theory of Traffic Flow and Transportation, ElsevierScience Publishers B.V., Berkeley, CA, United States, pp.89–105.

U.S. (1964), Department of Commerce. Bureau of PublicRoads. Traffic Assignment Manual Urban Planning Divi-sion, U.S. Department of Commerce, Washington DC.

Vardi, Y. (1996), Network tomography: estimating source-destination traffic intensities from link data, Journal of theAmerican Statistical Association, 91(433), 365–77.

Waller, S. T., Fajardo, D., Duell, M. & Dixit, V. (2013), Lin-ear programming formulation for strategic dynamic trafficassignment, Networks and Spatial Economics, 1–17.

Wen, T., Cai, C., Gardner, L., Dixit, V., Waller, S. T. & Chen,F. (2015), A maximum likelihood estimation of trip tablesfor the strategic user equilibrium model, in Proceedings ofTransportation Research Board 94th Annual Meeting, No.15–3826.

Wen, T., Cai, C., Gardner, L., Dixit, V., Waller, S. T. & Chen,F. (2016), A strategic user equilibrium for independentlydistributed origin-destination demands, in Proceedings ofthe Transportation Research Board 95th Annual Meeting,No. 16–3161.

Wen, T., Gardner, L., Dixit, V., Duell, M. & Waller, S.T. (2014), A strategic user equilibrium model incorporat-ing both demand and capacity uncertainty, in Proceed-ings of the Transportation Research Board 93rd AnnualMeeting.

Wong, S., Tong, C., Wong, K., Lam, W. K., Lo, H., Yang, H.& Lo, H. (2005), Estimation of multiclass origin-destinationmatrices from traffic counts, Journal of Urban Planning andDevelopment, 131(1), 19–29.

Wu, X., Michalopoulos, P. & Liu, H. X. (2010), Stochastic-ity of freeway operational capacity and chance-constrainedramp metering, Transportation Research Part C: EmergingTechnologies, 18(5), 741–56.

Xu, W. & Chan, Y. (1993), Estimating an origin-destinationmatrix with fuzzy weights: Part II: case studies, Transporta-tion Planning and Technology, 17(2), 145–63.

Page 18: Two Methods to Calibrate the Total Travel Demand and ... · Computer-Aided Civil and Infrastructure Engineering 33 (2018) 282–299 Two Methods to Calibrate the Total Travel Demand

Two methods to calibrate 299

Yang, H. (1995), Heuristic algorithms for the bilevel origin-destination matrix estimation problem, Transportation Re-search Part B: Methodological, 29(4), 231–42.

Yang, H., Sasaki, T., Iida, Y. & Asakura, Y. (1992), Estima-tion of origin-destination matrices from link traffic countson congested networks, Transportation Research Part B:Methodological, 26(6), 417–34.

Yang, H., Yang, C. & Gan, L. (2006), Models and Algorithmsfor the screen line-based traffic-counting location problems,Computers & Operations Research, 33(3), 836–58.

Yun, I. & Park, B. B. (2005), Estimation of dynamic origindestination matrix: a genetic algorithm approach, IntelligentTransportation Systems, Proceedings, 2005 IEEE, pp. 522–27.

Zhao, Y. & Kockelman, K. M. (2002), The propagation ofuncertainty through travel demand models: an exploratoryanalysis, The Annals of Regional Science, 36(1), 145–63.

Zhou, X. & List, G. F. (2010), An information-theoretic sensorlocation model for traffic origin-destination demand esti-mation applications, Transportation Science, 44(2), 254–73.