a modular modelling framework for hypotheses testing in ...in this paper, we present a modelling...

21
A modular modelling framework for hypotheses testing in the simulation of urbanisation Working Paper A modular modelling framework for hypotheses testing in the simulation of urbanisation Clémentine Cottineau 1,2 * ,Romain Reuillon 2 ,P aul Chapron 2,3 , Sébastien Rey-Coyrehourcq 2 ,Denise Pumain 2,4 1 Centre for Advanced Spatial Analysis, University College London, UK 2 UMR 8504 Géographie-cités and The Institute of Complex Systems Paris, France 3 University of Lausanne, Switzerland, 4 Université Paris 1 Panthéon-Sorbonne, France Abstract In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large territories over long time spans. Building on geographical theories of urban evolution, we rely on agent-based models to 1/ formalise complementary and alternative hypotheses of urbanisation and 2/ explore their ability to simulate observed patterns in a virtual laboratory. The paper is therefore divided into two sections : an overview of the mechanisms implemented to represent competing hypotheses used to simulate urban evolution; and an evaluation of the resulting model structures in their ability to simulate - efficiently and parsimoniously - a system of cities (the Former Soviet Union) over several periods of time (before and after the crash of the USSR). We do so using a modular framework of model-building and evolutionary algorithms for the calibration of several model structures. This project aims at tackling equifinality in systems dynamics by confronting different mechanisms with similar evaluation criteria. It enables the identification of the best-performing models with respect to the chosen criteria by scanning automatically the parameter space along with the space of model structures. I. Introduction Simulation models of urban systems were first developed in the 1950s and 1960s as a way to understand the complexity of cities and to forecast trends and consequences of planning policies. Several formalisms (thermodynamics, general systems theory, synergetic, microsimu- lation) were used [Batty, 2008], following fash- ions as well as opportunities arisen from ac- cess to new technologies [Heppenstall et al., 2012, Pumain and Sanders, 2013]. Each of the methods cited have their specific advantages and drawbacks, that we will not discuss here, but all of them provide a same opportunity and challenge that is linked to simulation and has not changed as the formalisms evolved. The opportunity that we focus on in this paper is the function of virtual laboratory that a computerised simulation model enables [Carley, 1999], which is of paramount in- terest for human and social sciences in which in vivo experiments are impossible or difficult. By allowing to implement competing and/or comple- mentary hypotheses into generative mechanisms 1 in a model testable against empirical data, this function of virtual laboratory makes the model a framework for the evaluation of the plausibility of different theories. However, simulation as a method and an epistemological way of testing theories gives way to a limitation known since Bertalanffy * Corresponding author: [email protected] 1 that is, a set of interaction activities performed by entities (cities) [Hedström and Ylikoski, 2010] producing emergent patterns. 1 arXiv:1509.01200v1 [physics.soc-ph] 25 Aug 2015

Upload: others

Post on 22-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

A modular modelling framework forhypotheses testing in the simulation of

urbanisationClémentine Cottineau

1,2∗, Romain Reuillon2, Paul Chapron

2,3,Sébastien Rey-Coyrehourcq

2, Denise Pumain2,4

1 Centre for Advanced Spatial Analysis, University College London, UK2 UMR 8504 Géographie-cités and The Institute of Complex Systems Paris, France

3 University of Lausanne, Switzerland, 4 Université Paris 1 Panthéon-Sorbonne, France

Abstract

In this paper, we present a modelling experiment developed to study systems of cities and processesof urbanisation in large territories over long time spans. Building on geographical theories of urbanevolution, we rely on agent-based models to 1/ formalise complementary and alternative hypotheses ofurbanisation and 2/ explore their ability to simulate observed patterns in a virtual laboratory. The paper istherefore divided into two sections : an overview of the mechanisms implemented to represent competinghypotheses used to simulate urban evolution; and an evaluation of the resulting model structures in theirability to simulate - efficiently and parsimoniously - a system of cities (the Former Soviet Union) overseveral periods of time (before and after the crash of the USSR). We do so using a modular framework ofmodel-building and evolutionary algorithms for the calibration of several model structures. This projectaims at tackling equifinality in systems dynamics by confronting different mechanisms with similarevaluation criteria. It enables the identification of the best-performing models with respect to the chosencriteria by scanning automatically the parameter space along with the space of model structures.

I. Introduction

Simulation models of urban systems were firstdeveloped in the 1950s and 1960s as a wayto understand the complexity of cities and toforecast trends and consequences of planningpolicies. Several formalisms (thermodynamics,general systems theory, synergetic, microsimu-lation) were used [Batty, 2008], following fash-ions as well as opportunities arisen from ac-cess to new technologies [Heppenstall et al., 2012,Pumain and Sanders, 2013]. Each of the methodscited have their specific advantages and drawbacks,that we will not discuss here, but all of themprovide a same opportunity and challenge that

is linked to simulation and has not changed asthe formalisms evolved. The opportunity that wefocus on in this paper is the function of virtuallaboratory that a computerised simulation modelenables [Carley, 1999], which is of paramount in-terest for human and social sciences in which invivo experiments are impossible or difficult. Byallowing to implement competing and/or comple-mentary hypotheses into generative mechanisms1

in a model testable against empirical data, thisfunction of virtual laboratory makes the model aframework for the evaluation of the plausibility ofdifferent theories. However, simulation as a methodand an epistemological way of testing theoriesgives way to a limitation known since Bertalanffy

∗Corresponding author: [email protected] is, a set of interaction activities performed by entities (cities) [Hedström and Ylikoski, 2010] producing emergent patterns.

1

arX

iv:1

509.

0120

0v1

[ph

ysic

s.so

c-ph

] 2

5 A

ug 2

015

Page 2: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

[von Bertalanffy, 1968] as equifinality. It describesthe fact that even when a model is performing well,one cannot infer that the underlying combinationof mechanisms is the one operating "in real life",because several models can lead to the same results(many-to-one), and the same process can lead toseveral patterns (one-to-many) [O’Sullivan, 2004].The problem of adequacy assessment of the modelwith real life (also known as ontological adequacytesting, cf. [Rossiter et al., 2010]) is twofold. First,many effective (or "real") processes can lead tothe same observed situation within the targetsystem that we aim to model. Second, inade-quate mechanisms implemented in the model cansimulate in a satisfying way the situation understudy. This causal challenge of identification ofthe generative mechanisms is a recurrent prob-lem in social simulation for explanatory purposes[Grüne-Yanoff, 2009, Elsenbroich, 2012], but onethat is usually overlooked at the stage of results’analysis.

We have tried in a previous study[Chérel et al., 2015] to tackle the one-to-many partof the challenge within the model, using an evo-lutionary algorithm to look for the maximum di-versity in patterns produced by a given set ofmechanisms in a model of systems of cities. Whatwe present here relies to the multiplicity of possiblecauses leading to the single historical trajectoryobserved empirically. We present a multimodellingframework that allows to combine different mech-anisms into a modular model evaluated against aunique set of evaluation criteria, enabling for thecomparison of the performance of different modelstructures to simulate urbanisation and the evolu-tion of a system of cities.

More precisely, we are interested in simulatingthe co-evolution of cities encompassed in a terri-torial system (typically a nation or a continent),and to reproduce the regular patterns or organisedstructures that are observed empirically in vari-ous systems of cities: their hierarchy, spacing andfunctional differentiation [Pumain, 1997]. Section II

presents the patterns we aim to reproduce and thecatalogue of theories and mechanisms on which wedraw to compose the model. Section III describesthe multi-model and its implementation in ourstudy case, as well as its exploration through multi-calibration. This exploration aims to explore theperformance of different hypotheses in explainingthe evolution of Soviet and post-Soviet cities. Sec-tion IV concludes on the study case and the method.

II. A catalogue of possible mechanisms of

urbanisation

"It may also be useful to think of com-plex geographical models as extensions ofthought experiments, where the necessaryand contingent implications of theories canbe examined. Further, admitting that ’allmodels are wrong’ is akin to the realisationin post-structural social science that multi-ple competing accounts of the same settingsare possible, and that faced with a diversityof accounts the context and intent of eachmust be an important element in the evalu-ation process." [O’Sullivan, 2004, p.291]

A large bunch of theories in geography, eco-nomics and natural sciences have tried to providean account of the regularity of urbanisation pro-cesses and the structuration of a system of cities,through models and narratives. Keeping in mindthe equifinality challenge, this means that we al-ready have a strong theoretical basis and severalcausal model candidates to confront with empiricalregularities. Without being exhaustive on these the-ories, we try to provide an overview of the mecha-nisms that have been proposed in the literature. Wethen expose the kind of results that are achieved bystatistical models using empirical data, and presentthe particularity of our study case.

II.1. Competing theories

Out of the three stylised facts describing the struc-ture of a system of city: the hierarchy of cities bysize, their regular spacing and the functional differ-entiation [Pumain, 1997], the former has fostered

2

Page 3: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

the larger body of research, and will be our maincriterion for stating the performance of a simula-tion model, whereas the latter two remain hard toformalise and to compare over time and space withrespect to the availability of urban information ata local level. Consequently, we present in largerdetails the competing theories aiming at explain-ing the regularity of city size distribution and itsevolution over time, and quickly review theories oflocation and functional specialisation.

The hierarchical organisation of city sizes rep-resents a "mystery" [Krugman, 1996c] that hasintrigued many researchers, because of its reg-ularity and simplicity of description (the rank-size "rule") despite the complexity of urbanfunctioning and interactions (for reviews, cf.[Cheshire, 1999, Pumain, 2006]). [Auerbach, 1913,Lotka, 1925, Singer, 1936] are known to be thefirst to formalise this regularity, leading to thefamous rule that the size of a city is a powerlaw function of its rank in the hierarchy, a ruleparticularized by [Zipf, 1949] arguing that thePareto exponent is expected to be -1. Alterna-tive mathematical descriptions have been pro-posed, for instance the lognormal distribution[Gibrat, 1931, Berry, 1961, Eeckhout, 2004]. Thetwo distributions however are close and tend tocoincide in the middle-upper part of the hierarchy[Mitzenmacher, 2004, Clauset et al., 2009].

Because of this regularity, generative mod-els of power laws and lognormal distributionshave logically been considered as candidate expla-nations [Mitzenmacher, 2004, Clauset et al., 2009].The most famous one is Gibrat’s law [Gibrat, 1931],which generates a lognormal distribution by aprocess of "proportional effect". This model of mul-tiplicative growth allocates randomly growth rates2

from independent distributions, at each short timestep, to cities independently from their size. It isthus considered close to a random walk. It resultsin amplifications of urban growth and decline lead-

ing to a lognormal distribution. The Simon modelof preferential attachment [Simon, 1955] applied tocities generates power laws of city sizes by simu-lating an incremental creation of new urban blocksattached to existing urban clusters with a proba-bility dependent on their size [Krugman, 1996a].Those two models are elegantly simple enough togenerate hierarchical distributions, but they lack anexplaining power to help understand urbanisationprocesses: "Scaling laws often are viewed as over-identified: they can be generated by a wide rangeof distinct models. It is essential to select specifica-tions that integrate in the model a significant partof the existing knowledge about towns and cities"[Pumain, 2004, p. 1]. Some attempts were made tocharacterise them in terms of exogenous shocks andcontingent local policies (for example in Gabaix’smodel [Gabaix, 1999]), but random growth modelsseem to miss the causal power of mechanism-basedsimulation models [Hedström and Ylikoski, 2010].

Other accounts of urban growth models lead-ing to the observed stylized hierarchical struc-ture and evolution (deterministic in Dimou andSchaffar’s typology [Dimou and Schaffar, 2011],by opposition to random models) range intotwo broad categories: equilibrium modelsof externalities (such as [Henderson, 1974,Krugman, 1996b]) and evolutionary models of spa-tial interactions (such as [Allen and Sanglier, 1981,Weidlich and Haag, 1983, Bura et al., 1996]). Theformer type of models formalizes a balance of cen-tripetal (sharing, matching, learning in the labourmarket for example) and centrifugal forces (pollu-tion and congestion) resulting in optimal sizes forcities given the current technology. The latter typeof models rely on attractivity and spatial interac-tions of cities to explain the dynamic of competitionand cooperation resulting in a regular hierarchy inthe urban system.

Location theories of cities typically build on twoconcepts to explain the regular spacing and empir-

2If not stated otherwise, growth rates in this paper refer to the ratios of population variation during a time period by the totalpopulation at the beginning of the period.

3

Page 4: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

ical distribution of cities in space: site and/or sit-uation advantages [Christaller, 1933, Ullman, 1941,Krugman, 1996b, Pumain, 1997]. Site advantagesrefer to natural features available at an absolute lo-cation (natural resources, harbour conditions, etc.),whereas situation advantages refer to centrality ina transportation network or an interface relativeposition for example. All the theories mentionedabove rely necessarily on relations with (consis-tent) territories providing resources or distanceconstraints for instance, in order to explain urbancoevolution.

Theories of urban specialisation inherit fromtrade theories (via comparative and competitiveadvantages) and theories of product and innova-tion cycles. They state that cities have differentfeatures (size, situation, site, former specialisation)that make them akin to be more competitive tospecialise in one economic sector in comparisonwith other cities, or to adopt an innovation sooneror later than their neighbours. Depending on thecurrent cycle of the product, those specialisationsaffect not only the sectoral composition of the ac-tive population and its economic output, but alsoconfers an advantage to early adopters and de-fines further options for specialisation and growth[Duranton and Puga, 2001, Pumain et al., 2006].

Finally, political factors and singular policiesare usually considered necessary to explain dif-ferentiated urban growth (through administrativefunctions or investment policies targeting specificcities at a given moment in time).

We compose our catalogue with mechanismsthat formalize the theories cited above. They fallinto five broad classes of generative mechanismspotentially accounting for the emergence of a struc-tured system of cities:

• Spatial Interactions and diffusion allow forthe exchange of information, money, goods andpeople. It thus makes cities co-evolve over timeand adapt collectively to changing economic and

innovation cycles through competition and cooper-ation, resulting in some complementarity of theirspecialisation. These local interactions and theirconsequences on the regular organisation of thesystem as a whole under spatial constraints couldbe thought of as "complex systems effects".• Size effects like agglomeration economies andurbanisation externalities illustrate a very directand self-reinforcing cause for hierarchical differen-tiation.• Site effects explain the spatial location of growthprocesses around resource-rich areas for the relatedinnovation cycle.• Situation effects illustrate the importance ofthe neighbouring relational environment (potentialfield, network accessibility, etc.) on a city’s patternof growth.• Territorial effects account for some exogenous(policy) shocks and the solidarity of urban tra-jectories in a common political space (throughredistributive processes for example).

We look for statistical evidences of these factorsof urbanisation in the empirical literature beforeturning to our case study experiment, where wepropose an implementation and evaluate the powerof each of the five mechanisms into an agent-based(multi-) model.

II.2. Empirical Results from the Literature

Spatial Interactions are tricky to measure be-cause of the variety and non-commensurability offlows circulating between cities at various tempo-ralities. Until recently, the diffusion of innova-tions (agricultural techniques [Hägerstrand, 1968],telephone lines [Robson, 1973] or newspapers[Pred, 1973]) served as a proxy for these interac-tions. Since the development of various volumes ofhigh velocity data, actual interactions (like phonecalls [Krings et al., 2009]) have confirmed for exam-ple the relevance of the gravity model to describeinter-city interactions.

Size effects in urban growth and differentia-tion were revealed by a persistent empirical cor-

4

Page 5: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

relation between growth rates and city sizes overlong periods of time. All over the 19th century,[Robson, 1973, p. 79] measured a positive co-efficient between the log of English and Welshcities’ population and their ten-year growth rates(from a minimum of +1.47 between 1861 and 1871to a maximum of +8.53 between 1821 and 1831).This correlation is found for French cities as well[Guérin-Pace, 1995]. The size effect finally relatesto the stability of the rank position of large cities (bycomparison with the fluctuations of smaller cities).

Site effects were classically approached byestimating the surplus of growth associated witha localised resource (typically, deposits of naturalmaterials, such as coal or gas). In the Soviet urbansystem of the 1920s-1930s, the location of a cityon a coal deposit was associated with a surplus of1.15 points in percentage of average demographicgrowth per annum, everything else being equal. Asurplus over 0.5 point is observed nowadays (1989-2010) for oil and gas deposits [Cottineau, 2014a].In the USA, [Black and Henderson, 2003] estimatethat the coastal location (e.g. a resource for tourism)is associated with a significantly higher ten-yeargrowth rates of 3 to 5 points.

Situation effects can be revealed by the spa-tial autocorrelation of growth or the coevolutionbetween transportation networks and urban net-works. In the first case, [Hernando et al., 2015]found a characteristic distance for spatial autocorre-lation of growth rates of 215 km for American coun-ties and of 80 km for Spanish cities. As for trans-portation dynamics, [Bretagnolle, 2003] measuredthe correlation between accessibility and growthrates for French cities in the last two centuries. Shefinds that cities that were weakly connected (by anytransportation network: road, rail or air) in 1900and stayed isolated in 2002 grew slower (0.94% onaverage per annum between 1900 and 2002) thancities that became motorway nodes (1.21%) or mul-timodal hubs (1.69%). Likewise, well connectedcities at the beginning of the period tended to growfaster than the first category.

Territorial effects can be approached empir-ically by relating political statuses to dynamics ofgrowth. In developing countries, regional capitalswere found to grow significantly faster by 0.5 to 1point of annual average growth rate in the 1960s[Preston, 1979] and the 1990s [Brockerhoff, 1999].In the Former Soviet Union, the regional statusof capital has proven important to predict urbangrowth [Harris, 1970], the coefficient regressedagainst growth rates over time ranges from +0.24point between 1989 and 2002 to +1.88 between 1926and 1939 [Cottineau, 2014a]. Besides, cities that be-long to the same territory have shown an increasedpattern of synchronicity in their growth and declinetrajectories from the 1980s on, suggesting evidenceof both political shocks and territorial solidarity inthe spatial distribution of urban growth.

Despite the corroboration of theories providedby the numbers cited above, statistical correlationsdo not prove any causal chain and fail to explain theprocesses at work. Moreover, they are not adaptedto model dynamic, non-linear and complex evolu-tions and interactions. This is a recurrent problemthat has led social scientists to promote a combina-tion of statistical methods with generative (simula-tion) modelling [Byrne, 1998, Goldthorpe, 2001].

II.3. A Case Study : the Former Soviet Union

We applied such a mixed methodology to thecase study of the urban Soviet Union of the last50 years [Cottineau, 2014a, Cottineau et al., 2015].This system is mainly interesting to us becauseit has the reputation of having been a controlledeconomic, political and social system aiming atreaching equalisation (of city size for example) andyet, while observing its evolution with generic ur-ban models, the urban trends appear very classic.By this we mean for instance that the evolutionof the percentage of urban population follows avery classical logistic function of urban transition,that the rank-size slope is close to the value ex-pected with respect to the time of settlement ofthe different parts of the territory, and that it hasincreased over time, indicating an increase of in-

5

Page 6: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

equality of city sizes (despite anti-urban politicaldiscourses [Cottineau, 2014a]). The monographicexplanation of urban growth in the Former SovietUnion has been dominant in urban geography, andwe argue that it might have hidden very genericprocesses of urbanisation in this system of cities[Cottineau and Slepukhina, forthcoming].

Our modelling experiment aims at testing thegenericity of Soviet and post-Soviet urbanisation bysimulating generic mechanisms and testing themagainst harmonised historical data. The databaseused in this study consists of open demographic,territorial and site informations on 1929 agglomera-tions of the Former Soviet Union between 1840 and2010 [Cottineau, 2014b].

We analyse and compare the explaining powerof generic mechanisms in a simulation of the periodbefore (1959-1989) and the period after (1989-2010)the dislocation of the Soviet Union. The agent-based model framework of this experiment is calledMARIUS3. In this model, cities interact as collec-tive agents [Bura et al., 1996] and we evaluate everysimulations with three evaluation criteria. The firsttwo criteria are boolean indicators of the "realism"of the microdynamics of the simulation : we onlyanalyse simulations in which the number of citieswith a nil wealth and the number of cities pro-ducing more output in one year than their stockof wealth is 0. Once these controlling criteria aremet, we try to minimise the distance δ between thesimulated population Ps and the observed popula-tion Po of each city i at each date t we have censusinformation for, as stated in equation (1).

δ = ∑t(∑

i(log(Po,i,t)− log(Ps,i,t))

2) (1)

This criterion ensures that a "perfect" model(achieved for δ = 0) would predict the right amountof urban growth and its exact location over time.In order to compare different time periods, we nor-malise δ by the number of cities and the number ofcensuses used in a given period.

III. Modular multimodelling experiment

The incentive to implement competing and comple-mentary theories into different models evaluatedagainst one another is a recurrent plea in the simu-lation literature [Openshaw, 1983, O’Sullivan, 2004,Batty and Torrens, 2005, Thiele and Grimm, 2015,Batty, 2015]. It reveals how tricky its implementa-tion and automatic evaluation might be, besidesthe epistemological challenge of equifinality andthe kind of conclusions one can draw from thisconfrontation. Indeed, thirty years after the "Auto-mated Modeling System to Explore a Universe ofSpatial Interaction Models" by [Openshaw, 1983],there are no standard tools nor formal methodologyfor theory testing with simulation models. Indeed,Openshaw’s automated way to explore model struc-tures, being a pure optimisation way of discoveringmodel structures, is impressive methodologicallybut it does not suit our goal of theory testing in avirtual laboratory, because it can result in optimalmodels that are impossible to interpret.

Instead, we think that the first step shouldbe to gather a catalogue of theoretical pro-cesses and mechanistic hypotheses working aspotential explanations. This usually precedesthe mixed-modelling step [Contractor et al., 2000,Auchincloss et al., 2008] and prevents fromendlessly building models "from scratch"[Thiele and Grimm, 2015]; it allows to build onprevious work and experience.

In a second step, we have built a baseline modelof urban interactions and implemented hypothesesas blocks of mechanisms that can be activated ordiscarded to compose a family of simulation mod-els (section III.1). By automatically calibrating allmodel structures against a similar evaluation ob-jective (section III.2), we were able to compare thepower of several groups of hypotheses in the caseof the Soviet and post-Soviet urbanisation (sectionIII.3).

3Models of Agglomerations in the perimeter of Imperial Russia and the former Soviet Union. For further details, see[Cottineau et al., 2015]

6

Page 7: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

III.1. Implementing mechanisms as buildingblocks

The multi-model is composed of the baseline modeland modules of mechanisms that override the se-quence of agents’ rules when they are activated.In the following sections, we detail the baselinemodel and the modular blocks of mechanisms thatare added incrementally to the original equations.For further informations, the baseline model andthe first two additional mechanisms were describedand evaluated in details in [Cottineau et al., 2015].

III.1.1 The Baseline Model

The baseline model relies on the assumption thatpopulation and wealth are the basic descriptors ofcities and the engine of their coevolution. It there-fore models exclusively size effects of populationon wealth and spatial interactions.

At initialization, each city i of the Former SovietUnion is setup with its historical population Pi atthe beginning date of simulation, and located atits empirical coordinates, enabling site, situationand distance to play in the same geometry as inthe target system. An estimated value of wealth Wi(expressed in a fictive unit) is determined for eachcity with respect to its size4, following equation (2).

Wi = PpopulationToWealthi (2)

populationToWealth ≥ 1 (3)

Time is modelled as discrete steps, each ofwhich represents a time period of one year, duringwhich interactions occur in a synchronous way. Ateach step:

Each city i updates its economic variables: aglobal supply Si (4) and a global demand Di (7),according to its population Pi and three parame-

ters (economicMultiplier, sizeE f f ectOnSupply andsizeE f f ectOnDemand).

Si = economicMultiplier× PsizeE f f ectOnSupplyi (4)

economicMultiplier > 0 (5)

sizeE f f ectOnSupply ≥ 1 (6)

Di = economicMultiplier× PsizeE f f ectOnDemandi (7)

sizeE f f ectOnDemand ≥ 1 (8)

Each city interacts with other cities accord-ing to the intensity of their potential IP (9). Fortwo distinct cities i and j, the computation of theinteraction potential IPij consists in confronting thesupply of i (11) to the demand of j with an equationborrowed to the gravity model (12).

IPij =Si × Dj

ddistanceDecayij

(9)

distanceDecay ≥ 0 (10)

with dij a measure of distance between i and j.

Interactions of cities i and j based on their po-tential IPij result in a transaction Tij from i to j(13).

Sij = Si ×IPij

∑k IPik(11)

Dij = Di ×IPji

∑k IPki(12)

Tij = min[Sij, Dji] (13)

4This first mechanism is a first possibility of implementing the theoretical hypothesis of agglomeration economies. Wealth isindeed distributed superlinearly for each value of populationToWealth significantly greater than 1.

7

Page 8: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

Each city updates its wealth Wi according tothe results of its transactions T (unsold supply USi(15) and unsatisfied demand UDi (16)) in which itwas committed (14).

Wi,t = Wi,t−1 + Si − Di −USi + UDi (14)

USi = Si −∑j

Tij (15)

UDi = Di −∑j

Tji (16)

A simulation step ends when each city up-dates its population according to its new result-ing wealth (17) :

Pi,t = Pi,t−1 +WwealthToPopulation

i,t −WwealthToPopulationi,t−1

economicMultiplier(17)

III.1.2 Mechanism Increments

The mechanism that accounts best for inter-actions benefits at the intercity level is the onecalled bonus. It

"[...] features a non-zero sum game[...], rewarding cities who effectively in-teract with others rather than internally.We assume that the exchange of any unitof value is more profitable when it isdone with another city, because of thepotential spillovers of technology andinformation." [Cottineau et al., 2015]

This bonus Bi depends on the volume of trans-actions and the diversity of partners Ji the city i hasexchanged with (18). It is added to the wealth atthe end of each step (19), following equation (14):

Bi = bonusMultiplier×(∑j Tij + ∑j Tji)× Ji

n(18)

n being the total number of cities in the system (i.e. 1145for a simulation beginning in 1959, 1822 for a simulationbeginning in 1989), and Ji the number of partners withwhich i has engaged in the current simulation step.

Wi,t = Wi,t + Bi (19)

A mechanism related to situation advantagesis called fixed costs. It ensures that the situation ofeach city in the system is taken into account in itsinteractions with other cities.

"Every interurban exchange gener-ates a fixed cost (the value of whichis described by the free parameterf ixedCost). This implies two featuresthat make the model more realistic: first,no exchange will take place betweentwo cities if the potential transactedvalue is under a certain threshold ; sec-ond, cities will select only profitablepartners and not exchange with everyother cities. This mechanism plays therole of a condition before the exchange."[Cottineau et al., 2015]

The interaction potential between city i and cityj will be positive only if the potential value that i iswilling to sell to j is superior to the fixed value itcosts it to negotiate, prepare and send the transac-tion (20):

IPij =

{IPij if Sij > f ixedCost,

0 otherwise.(20)

Therefore, each transaction of a city i gives wayto a fixed cost. Their sum is subtracted from thewealth of city i at the end of each step (21), follow-ing equation (14):

Wi,t = Wi,t − Ji × f ixedCost (21)

8

Page 9: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

Site effects are targeted by the resource mech-anism: site advantages are particularised in thismodel by natural resource deposits (more specifi-cally: coal deposits C on the one hand, and oil andgas deposits O on the other hand). The assump-tion is made that if the city i is located on somecoal or oil deposits (Ci = 1 or Oi = 1), the citybenefits from the advantage granted by the extrac-tion activity. The capacity of extraction depends onthe capital (wealth) of the city and takes the formof a wealth multiplier for each resource (22) afterequation (14):

Wi,t = Wi,t × (1 +

{coalE f f ect if Ci = 1,

oilAndGasE f f ect if Oi = 1)

(22)

Territorial and political effects are for-malised by the redistribution mechanism. It al-lows for a redistribution of wealth between citiesof the same territory R (region or State). To do so,territorial taxes ttk are collected in each city kR, asa proportion territorialTaxes of their wealth. Thetotal amount of taxes collected is TTR(23):

TTR = ∑i

tti,R = ∑i

territorialTaxes×Wi,R (23)

From this taxes, the administrative status of theterritory R (denoted by CCi,R, set to 1 if i is the cap-ital city of the region and 0 otherwise) allows thecapital city to take a share CS for its administrationneeds (24):

CSR = capitalShareO f Taxes× TTR (24)

The rest of the taxes is redistributed to citiesof region. Each city iR receives a share tri,R that isproportionate to its population (25):

tri,R = (TTR − CSR)×Pi,R

∑k Pk,R(25)

The balance of the territorial redistribution isadded to the wealth of a city (26) after equation

(14):

Wi,t = Wi,t − tti,R + tri,R +

{CSR if CCi,R = 1,

0 otherwise(26)

Finally, territorial and situation explanationsare mixed in the urban transition mechanism. Toaccount for the different opportunities of cities toattract rural migrants in the different regions, wemodel the evolution of the urban transition curvesover time. As shown empirically [Cottineau, 2014a],100 out of the 108 regions of the Former SovietUnion have followed the scheme of the urban tran-sition. It means that their urbanisation rate UR (in%) has followed a logistic function over time t (27):

UR,t =100

1 + e−urbanisationSpeed×t (27)

We have estimated the parameterurbanisationSpeed on empirical data and nor-malised the time units in order to position everyregion in a single urbanisation curve with respectto their time lags based on their urbanisation rate.The consequence of this initialisation process isthat each region will move one step further on theurbanisation curve (leading eventually to 100%) ateach simulation step, but that the rural potential tomigrate in cities will depend on its current positionon the urban transition curve (high potential forweakly urbanised regions, small potential for re-gions already very urban). The migration potentialof each city i in territory R is built as a multiplierTMR specific to each region for each time step (28):

TMR,t = 1 + (1−UR,t × ruralMultiplier) (28)

This extra population growth is added (29) after(17) and the new urbanisation rates of regions areupdated for t + 1 (27):

Pi,R,t = Pi,R,t × (1 + TMR,t) (29)

Because we want to evaluate the contributionof each theoretical mechanism to the simulation ofurbanisation and its itneraction with other mecha-nisms, we need a model that enables modules to be

9

Page 10: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

activated and de-activated. The technical implemen-tation of MARIUS permits the modular functioningof the model, i.e. with or without any of its buildingblocks.

III.1.3 Technical modular implementation

In order to achieve a modular implementation ofthe model and its mechanisms, we leveraged themixin-methods system of the Scala programminglanguage [Schärli et al., 2003]. The mixins werefirst proposed at the beginning of the 90s in the Jig-saw programming language [Bracha, 1992]. Theyare now adopted in mainstream languages suchas Scala. It has been established as a powerfulway of to perform type-safe dependency injectionframework [Wampler and Payne, 2009], which isa powerful paradigm to achieve modularity. Themixin pattern allows to achieve type safe depen-dency injection. Mixins are therefore a suited toolto achieve modularity in model implementations,which we use to implement MARIUS in Scala lan-guage5.

The mixin-methods concept [Steyaert et al., 1993,Lucas and Steyaert, 1994] generalizes object-oriented programming to enable feature-orientedprogramming [Prehofer, 1997]. It makes thedefinition of class hierarchy more flexible[Steyaert and Meuter, 1995]. Variations of isolatedfeatures (such as the different update functionsof city wealth and population in our model) aredefined in separate modules and mixed-in witheach other at the object instantiation point, alsocalled mixin application. The advantage of mixins (ortrait) over classical object-oriented programming isthe possibility of defining numerous variations ofseveral features without increasing exponentiallythe number of specialized class.

For instance, let’s consider a class C implement-ing two methods a and b6. To define alternativeimplementations of those methods in the classical

object-oriented paradigm, one would implementsubclasses of C and override the implementationsof a and b in each subclass. For instance in thelisting 1 the class C1 specialises C and defines im-plementations for the methods a and b.

Listing 1: Object oriented specialisation1 abstract class C extends A with B {2 def a(x: Double): Double3 def b(x: Double): Double4 def c(x: Double) = /* Compute something

using a and b */5 }67 class C1 extends C {8 def a(x: Double) = /* Some

implementation of a */9 def b(x: Double) = /* Some

implementation of b */10 }

This pattern achieves a very low level of codereusability. Let a and b have 10 possible implemen-tations, then 100 specialised implementations of Cwould be required. The mixins method solves theproblems of combinatorial explosion of the numberof implementations by delaying the entanglementof the class components at the instantiation site.Listing 2 exposes an implementation based on mix-ins providing alternative implementations of A andB and the corresponding parameter specifications.The implementation choice is delayed until the in-stantiation point (last lines of listing 2) at which amixin is defined.

Listing 2: Mixin in Scala1 trait A {2 def a(x: Double): Double3 }45 trait B {6 def b(x: Double): Double7 }89 trait C extends A with B {

10 def c(x: Double) = /* Compute somethingusing a and b */

5 The source code of the model is available under open-licensing here: https://github.com/ISCPIF/simpuzzle6 We could think of class C as the interaction potential function, a representing the basic implementation of equation 9 and b

the fixed cost selection of equation 20.

10

Page 11: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

11 }1213 // Implementation 1 of trait A14 trait A1 extends A {15 def a(x: Double) = /* Some

implementation */16 }1718 // Implementation 2 of trait A19 trait A2 extends A {20 // Parameter p0 used in this version of

a21 def p0: Double22 def a(x: Double) = /* Some

implementation using p0 */23 }2425 // Implementation 1 of trait B26 trait B1 extends B {27 def b(x: Double) = /* Some

implementation */28 }2930 // Implementation 2 of trait B31 trait B2 extends B {32 def b(x: Double) = /* Some

implementation */33 }3435 val instance1 =36 new C with A1 with B2 {}3738 val instance2 =39 new C with A2 with B2 {40 // Value for parameter p0 of trait A241 def p0 = 1.042 }

Scala traits expressiveness can be leveraged toimplement modular evolutive type-safe modellingframeworks proposing alternative model features,feature composability and the formalisation of thefeature dependencies. The implementation of al-ternative behaviours in several traits provides theisolation of model component implementations andexplicit dependencies between these components.Each component defines free parameters that haveto be set at the model instantiation site, otherwiseit won’t compile.

In this experiment we defined each alternativemodel mechanism in a particular trait. The exe-

cutable model has then be composed by picking thetraits we wanted to test. In order to evaluate concur-rently all the alternative mechanisms, we generatedall the possible models (or combination of mech-anisms). It was achieved using a code generationalgorithm which produces all the possible modelsimplementations by generating a Scala source codecontaining all the possible traits combinations, suchas the one shown on Listing 3.

Listing 3: Example of generated code12 def model(index: Int , parameters: Seq[

Double ]) =3 index match {4 case 0 =>5 new Model with T11 with T21 with

... {6 def p0 = parameters (0)7 def p1 = parameters (1)8 ...9 }

10 case 1 =>11 new Model with T11 with T22 with

... {12 def p0 = parameters (0)13 def p1 = parameters (1)14 ...15 }16 case 2 => ...17 }

This generated source code implements a singlefunction encapsulating all the model alternatives.This function takes two arguments: an index ofthe implementation that shall be executed and avector of parameters to set for the model (for thiswork we calibrated only double precision floatingpoints values). Note that the vector has a fixed sizewhich does not depends on the model instantiated.A given model implementation generally does notuse all the parameters: instead it will use only someof them and ignore the others.

III.2. Calibrating a multi-model

"Whatever changes occur in the institu-tional, political and social context of com-putational models, the question of how tolearn from models remains. It is clear that

11

Page 12: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

assessment of the accuracy of a model as arepresentation must rest on argument abouthow competing theories are represented inits workings, with calibration and fittingprocedures acting as a check on reasoning."[O’Sullivan, 2004, p.291]

In order to evaluate the capacity of the modelsto reproduce the historical trajectory of urbanisa-tion in the Former Soviet Union, we rely on anautomated calibration. This procedure is part ofthe model evaluation [Grimm and Railsback, 2012,Thiele et al., 2014, Schmitt et al., 2015]. Its aim is tofind the values of parameters for which the modelresults match the fitness criteria (here: δ, the lowestpossible distance to the data, the realistic criteriahaving been met). If the model can be calibrated,then it is shown that the mechanisms includedin the model are sufficient to simulate the urbantrajectory (not necessary though). If there are noparameter value for which the fitness criteria ismet, then the combination of mechanisms is notsufficient to model urbanisation under the currentimplementation.

In order to calibrate all the models at once,we designed a variant of the NSGAII geneticalgorithm that includes a niching mechanism[Mahfoud, 1995]. Niching methods aim at pre-serving suboptimal solutions to preserve diversity.Our niching algorithm divided the population intosub-populations, each sub-population containingone model alternative. The genome of each individ-ual7 contains two parts. The first part is an integervalue that corresponds to the index of the modelalternative on which the genome is evaluated. Thesecond part is a vector of double values containingthe values of all the parameters for the model.

In order to evaluate a genome, we designeda fitness function. This function calls the gener-ated function described above, runs the model andevaluates its dynamics using the fitness functiondescribed in section II.3 (i.e. the criteria of realism

of micro-dynamics and the distance between simu-lated and observed population data for each city ateach census date).

In NGSAII, the elitism operation preserves thebest individuals among the whole population. Theevaluation algorithm we applied has the exact sameelitism strategy for each sub-population. No globalelitism strategy was performed, instead we keptthe 50 best-performing individuals in each sub-population (or model combination of mechanisms).In order to speed up the convergence of the algo-rithm, we also tweaked the mutation operation: ithad a 10% chance of mutating the "model index"part of the genome. The new value for the "modelindex" was drawn uniformly among the possiblemodel indexes. This allowed to periodically teston other models, some parameter values that wereperforming well on a given model.

We then distributed this algorithm on the Eu-ropean grid EGI using the technique known as the"island model" in the same way as we describedin [Schmitt et al., 2015]. We ran 200 000 jobs of 2hours.

III.3. Hypothesis testing to explain urbanisa-tion in the Former Soviet Union

After the multicalibration procedure, we end upwith the best performing simulations of 64 differentstructures of models8. The analysis of calibra-tion results consists in relating the performanceof these calibrated models (in terms of distancebetween simulated and observed growth hier-archically and spatially) to their structure andthe values of parameters associated with eachactivated mechanism. This analysis was madeusing an interactive application available online:http://shiny.parisgeo.cnrs.fr/VARIUS.

We present the results of this exploration in theform of three questions at the macro, meso and

7An individual correspond to a vector of parameter values and structure index that defines the model under evaluation8For example: baseline model, baseline model + fixed cost, baseline model + urban transition + resources, baseline model +

redistribution + urban transition, baseline model + fixed cost + bonus + resources, etc.

12

Page 13: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

micro scale of the city-system.

1/ Which is the most parsimonious model tosimulate the evolution of cities before and afterthe collapse of the Soviet Union? For simula-tions starting in 1959 up to 19899, the best modelwith only one additional mechanism is composedof the baseline model plus the mechanism of urbantransition (cf. Table 1).

It is characterised by significant economies ofagglomeration (sizeE f f ectOnSupply = 1.05) but alinear function of demand with size. The rural mul-tiplier is equal to 3.5% and allows to simulate fasturbanising regions of Siberia and Central Asia (cf.figure 1). However, the population of a majorityof cities is under-estimated in the simulation, es-pecially in the upper part of the hierarchy, aroundMoscow and in eastern Ukraine10.

Table 1: Best simple model before the transition

Parameter name Value MechanismeconomicMultiplier 0.002193758 GenericpopulationToWealth 1.000184755 GenericsizeEffectOnSupply 1.053943022 Generic

sizeEffectOnDemand 1.000000000 GenericwealthToPopulation 0.203567639 Generic

distanceDecay 1.872702086 GenericruralMultiplier 0.034975771 UrbanTransitionNormalized δ n cities Time steps

0.01423387 1145 30

After the transition, the best performing modelwith one additional mechanism includes site ad-vantages for cities. This model has two additionalparameters and meets the evaluation criteria (percity per census) twice better than the best modelfor the previous period (0.005 vs. 0.01, cf. tables 1and 2).

The analysis of the parameters fits the empiricalobservations of faster growing cities located on oiland gas deposits (oilAndGasE f f ect = 0.02), and

declining cities in the Donbass and Kuzbass coalregions (coalE f f ect = −0.01). This model’s speci-fications include very low size effects and a veryuneven wealth distribution amongst cities at initial-isation (populationToWealth = 1.12). The residualsare distributed roughly symmetrically (figure 1),but without any mechanism of urban transition, themodel clearly underestimates the post-1989 growthof all the rapidly growing cities of Central Asia.

Table 2: Best simple model after the transition

Parameter name Value MechanismeconomicMultiplier 0.502616330 GenericpopulationToWealth 1.124963276 GenericsizeEffectOnSupply 1.002982515 Generic

sizeEffectOnDemand 1.000808442 GenericwealthToPopulation 0.699943763 Generic

distanceDecay 1.475836151 GenericoilAndGazEffect 0.017066495 Resources

coalEffect -0.011792670 ResourcesNormalized δ n cities Time steps0.005180008 1822 21

To summarise, situation effects and territorialeffects seem to be the dominant candidates for ex-plaining the specific part of the trajectories of citiesin the Soviet era, while site effects seem to havetaken over since 1991, the transition to capitalismand the rise of oil prices in the world markets. More-over, the better fit of the latter model could be anindication of the "normalization" of the urban pro-cesses in the post-Soviet space, compared to a moresingular pattern of Soviet urbanisation.

2/ Which are the mechanisms (and mecha-nisms’ interactions) that are essential to modelthe Soviet and post-Soviet urbanisation patterns?We statistically analyse the results of the multicali-bration to evaluate the contribution of each mech-anism (everything else being equal in the modelstructure) to the reduction of distance between sim-ulated and observed demographic data for eachcity.

9The Soviet Union actually came to an end in Decembre 1991. In this study, we rely on historical data and therefore considerthe last Census date of the Soviet Union in 1989 as the transition point between the two economic, political and territorial regimes.

10or more generally in the Western part of the Former Soviet Union. See question 3 for an hypothesis as to why that might be.

13

Page 14: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

Figure 1: Residuals of the parsimonious models

Urban Transition model. Simulation 1959-1989 | Resource Extraction model. Simulation 1989-2010

Urban Transition model 1959-1989 (|residuals| ≥ 0.3)

Resource Extraction model 1989-2010 (|residuals| ≥ 0.3)

14

Page 15: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

Figure 2: Contribution of mechanisms to a simulation thatreduces the distance to observed data

N.B.: Estimated coefficients are considered significantfor p-values inferior to 0.005

We find that on average, the bonus, fixed costsand urban transition mechanisms tend to reducethe simulation error significantly (cf. figure 2). Thetransition and bonus mechanisms are identifiedto be the most effective ones. By contrast, the re-source mechanism is correlated significantly witha change in the evaluation criteria, but tends toincrease the error when it is activated (comparedwith model structures without this mechanism).This counter-intuitive result might be linked tothe weak influence of resource extraction for thefirst period of simulation, when there is a diversityof urban trajectories within resource-rich regions.The redistribution mechanisms does not appearsignificant on average in this analysis, as it plays aminor role in the reduction of error when activated.

Finally, we confirm the observation that ourmodels work better to simulate the urban evolutionfollowing the dismantlement of the Soviet Union.Indeed, as shown in figure 2, the value of normal-ized δ is greater by 0.01 point on average when themodel is specified for the first period. This could bean expression of "normalisation" or simplificationof the urban processes in the post-Soviet space afterthe political and economic transition of the 1990s.

3/ What are the cities that resist modelling? Inother words, what are the cities that are too spe-cific to be modelled by any of the mechanismsimplemented? To answer this last question, westatistically analyse the difference between the logof the population observed and the log of the popu-lation simulated at the last evaluation Census datefor cities included in the simulation, with respectto their locational and functional attributes. Themodels for which we present the results belowcontains all the implemented mechanisms, and areapplied to the two periods of enquiry.

For the two periods, we find that our modelspersistently and significantly under-estimate thegrowth of the largest and most western cities ofthe (Former) Soviet Union, everything else beingequal (cf. figure 3). Moreover, capital cities appearto have grown less historically than what we canpredict with a complete model of the period 1959-1989. The other urban attributes included in theregressions (natural resources and mono-functionalspecialisation) do not seem associated with anysystematic pattern of over- or under-estimation.

The difficulty to reproduce the trajectory ofthe largest cities has been encountered for a com-parable model of system of cities (Simpop2, see[Bretagnolle and Pumain, 2010]) and solved by theexogenous introduction of innovations to accountfor the creative features and higher probability ofadoption of new technology and functions by thelargest cities.

The under-estimation of growth in the westernpart of the territory might be due to its integrationwithin a larger area (the Eastern Europe) duringthe periods under study: our hypothesis here isthat the centrality of western (post-)Soviet citieswould then be minored in our model because itdoes not take into account the interactions witheast-european cities (Warsaw, Prague, Bratislava,etc.) which formed altogether an economic system(even though the integration was always strongerwithin the FSU).

15

Page 16: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

Figure 3: Profiles of residuals

Complete model. Simulation 1959-1989 | Complete model. Simulation 1989-2010

N.B.: Estimated coefficients are considered significant for p-values inferior to 0.01

Table 3: Observed and Simulated populations of urban out-liers in 1989

Positive ResidualsCity Observed Simulated

Naberezhnye Tchelny 500 000 30 000Volgodonsk 191 000 36 000Chajkovskij 86 000 19 000

Toljatti 685 000 158 000Bratsk 285 000 73 000

Balakovo 197 000 52 000Tihvin 71 000 20 000

Chervonograd 72 000 21 000Obninsk 111 000 32 000

Staryjoskol 174 000 53 000

.

Negative ResidualsCity Observed Simulated

Zaozernyj 16 000 54 000Gremjachnsk 21 000 56 000

Atakent / Ilitch 15 000 38 000Kizel 37 000 88 000

Cheremhovo 74 000 172 000Ilanskij 18 000 42 000

Gornoaltajsk 46 000 102 000Volchansk 15 000 32 000Zujevka 16 000 35 000

Taldykorgai 138 000 296 000

Table 4: Observed and Simulated populations of urban out-liers in 2010

Positive ResidualsCity Observed Simulated

Mirnyja 41 000 12 000Sertolovo 48 000 16 000

Beineu 32 000 11 000Govurdak 76 000 28 000

Serdar / Gyzylarbat 98 000 37 000Bayramaly 131 000 53 000

Sarov 92 000 39 000Turkmenabat / Tchardjou 427 000 185 000

Astana / tselinograd 613 000 278 000Dashougouz 275 000 126 000

.

Negative ResidualsCity Observed Simulated

Sovetabad 11 000 33 000Zhanatas 21 000 50 000

Krasnozavodsk 13 000 31 000Gagra 11 000 25 000

Nevelsk 12 000 26 000Arkalyk 28 000 59 000

Chyatura 14 000 28 000Aleksandrovsk Sahalinsk 11 000 21 000

Uglegorsk 10 000 20 000Baikonyr 36 000 67 000

16

Page 17: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

Finally, some individual cities appear as clearoutlyers of the model, and could correspond to aprofile that is too specific to be modelled by anygeneric mechanism. For the first period, (cf. table3), the most obvious examples of singular trajec-tories are the cities which grew much faster thanwhat was expected from their site, situation or in-teraction attributes. Indeed, Naberezhnye Tchelny,Volgodonsk, Toljatti or Bratsk owe their suddendevelopment to political decisions to implementflagship projects: automobile industry mega-plantsin Naberezhnye Tchelny (trucks) and Toljatti (cars),energy production sites in Volgodonsk (atomicpower) and Bratsk (hydroelectric power station).These economic policies of the 1950s and 1960s ledthose cities to be four times as populated 30 yearslates than what was expected from their interac-tions, resource or regional characteristics.

For the second period, (cf. table 4), Astana isa good example of a similar singular trajectorythat we would not aim to simulate with genericurbanisation mechanisms, as it owes its boom-ing growth to the decision of the Kazakh newlyindependent State to locate its headquarters inthis city more central to the country (comparedto Almaty). On the contrary, Baikonyr, also inKazakhstan, has suffered from the cuts in thespace industry (non-predictable at the urban levelof our mechanisms). Other shrinking cities likeAleksandrovsk-Sahalinsk, Krasnozavodsk or Ugle-gorsk would require more detailed mechanisms ofdemographics (lack of birth and emigration) andeconomic cycles to be simulated adequately.

To summarise, there are particular types of ur-ban trajectories that are not simulated well by themodel because of its simplicity, and trajectories thatare too specific to be modelled. We find that theexploration of our models, their calibration and theanalysis of residuals has helped to identify thosecities and to suggest some missing mechanisms.

IV. Conclusion

"Despite the fact that the experience of in-dividual cities has become more varied in-ternationally (at least within what mightbe called the mature economies) there isstronger evidence of a predictable pattern ofchange, determined by common causal fac-tors, than might be expected given the diver-sity and variety of cities." [Cheshire, 1999,p. 1342]

Systems of cities have attracted a lot of attentionfrom social modellers because of the regularity oftheir patterns. Instead of regarding the profusionof competing theories as a source of confusion (orsuspicion), we proposed a framework to integratecomplementary accounts of urbanisation processesinto a modular agent-based model. This work ofsynthesis and testing within a virtual laboratoryhas been made possible by its automation and theextensive use of computation resources to calibra-tion sixty-four models structures with empiricaldata on almost two thousands cities in the FormerSoviet Union over the last fifty years. The modelprovides a basis for comparison of the different the-ories that can be augmented by new or alternativeimplementations of mechanisms. It could also beapplied to different systems of cities (in space ortime).

In the present study, we showed that the multi-modelling approach has helped identify and orderthe mechanisms that most probably generated theurban pattern or Soviet and post-Soviet urbanisa-tion: situation effects of urban transition before1991 and resource extraction afterwards. The in-tuitive mechanism of redistribution however hasproved unsignificant. This method was finally use-ful to spot the most singular trajectories of citiesthat we interpret with empirical and monographicknowledge to refine the model.

To conclude, modelling experiments performa radical compression of reality and an extremesimplification of individual trajectories, events andpersons into synthetic aggregates. The ontological

17

Page 18: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

adequacy of the model to real life is therefore neces-sarily evaluated at an aggregated level that createsthe problem of equifinality. No simulation modelcan elude this question, but everything should betried to reduce its impact on what can be learnedfrom the modelling experience.

Acknowledgments

This work has been funded by the ERC Advanced GrantGeodivercity, the University Paris 1 Panthéon-Sorbonneand the Institut des Systèmes Complexes Paris Île-de-France. The results obtained in this paper were com-puted on the biomed and the vo.complex-system.euvirtual organization of the European Grid Infrastructure(http://www.egi.eu). We thank the European Grid In-frastructure and its supporting National Grid Initiatives(France-Grilles in particular) for providing the technicalsupport and infrastructure.

References

[Allen and Sanglier, 1981] Allen P. M., Sanglier M.,1981. "Urban evolution, self-organization, and deci-sionmaking", Environment and Planning A 13 (2):167?83.

[Auchincloss et al., 2008] Auchincloss A. H., Roux A. V.D., 2008. "A New Tool for Epidemiology: The Use-fulness of Dynamic-Agent Models in Understand-ing Place Effects on Health.", American Journal ofEpidemiology, 168(1), 1-8.

[Auerbach, 1913] Auerbach F., 1913, "Das Gesetzder Bevolkerungskonzentration.", Petermanns Ge-ographische Mitteilungen LIX, 73-76.

[Batty and Torrens, 2005] Batty M., Torrens P. M., 2005,"Modelling and Prediction in a Complex World.",Futures, 37(7), 745-766.

[Batty, 2008] Batty M., 2008, "Fifty years of urban mod-elling: Macro statics to micro dynamics.", in S. Al-bevrio, D. Andrey, P. Giordano, A. Vancheri (Eds.),The dynamics of complex urban systems: An inter-disciplinary approach, Heidelberg, Physica. 1-20.

[Batty et al., 2011] Batty, M., Vargas-Ruiz C., SmithD., Serras J., Reades J., Johansson A., 2015,"Visually-Intelligible Land Use Transportation Mod-els for the Rapid Assessment of Urban Futures",CASA Working Papers 163, University College

London. http://www.casa.ucl.ac.uk/working_papers/paper163.pdf

[Batty, 2015] Batty, M., 2015, "Editorial. Models again:their role in planning and prediction". Environmentand Planning B: Planning and Design, 42(2), 191-194.

[von Bertalanffy, 1968] von Bertalanffy L., 1968, GeneralSystem Theory, Georges Brazziler, Inc. new York

[Berry, 1961] Berry B. J., 1961, "City size distributionsand economic development.", Economic develop-ment and cultural change, 573-588.

[Black and Henderson, 2003] Black D., Henderson V.,2003, ?Urban evolution in the USA?, Journal ofEconomic Geography, 3, 343-372

[Bracha, 1992] Bracha G., 1992, The programming lan-guage jigsaw: mixins, modularity and multipleinheritance. PhD thesis, The University of Utah.

[Bretagnolle, 2003] Bretagnolle A., 2003, "Vitesse et pro-cessus de sélection hiérarchique dans le systèmedes villes françaises", in: D. Pumain, MF. Mattéi(eds.), Données urbaines, tome, 4.

[Bretagnolle and Pumain, 2010] Bretagnolle A., PumainD., 2010, "Simulating urban networks through mul-tiscalar space-time dynamics (Europe and UnitedStates, 17th-20th centuries)", Urban Studies, 47(13),2819-2839.

[Brockerhoff, 1999] Brockerhoff M., 1999, n Urbangrowth in developing countries : a review of pro-jection and predictions z, Population and develop-ment review, 25(4), 757-778.

[Bura et al., 1996] Bura S., Guérin-Pace F., Mathian H.,Pumain D., Sanders L., 1996, Multi-agents systemand the dynamics of a settlement system, Geo-graphical Analysis, 28(2), 161-178

[Byrne, 1998] Byrne D., 1998, Complexity theory andthe social sciences. An introduction, London, NewYork, Routeledge, 206p.

[Carley, 1999] Carley K. M., 1999, "On generating hy-potheses using computer simulations.", http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA458065.

[Chérel et al., 2015] Chérel G., Cottineau C., Reuillon R.,2015, "Beyond corroboration: strengthening modelvalidation by looking for unexpected patterns.",PLoS ONE

[Cheshire, 1999] Cheshire P., 1999, "Trends in sizes andstructures of urban areas. Handbook of regionaland urban economics", 3, 1339-1373.

18

Page 19: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

[Christaller, 1933] Christaller, W., 1933, Die zentralenOrte in Süddeutschland: eine ökonomisch-geographische Untersuchung über die Gesetzmäs-sigkeit der Verbreitung und Entwicklung der Sied-lungen mit städtischen Funktionen.

[Clauset et al., 2009] Clauset A., Rohilla Shalizi C.,Newman M.E.J. 2009, Power-law distributions inempirical data. SIAM review, 51(4), 661-703.

[Contractor et al., 2000] Contractor, N., Whitbred R.,Fonti F., Hyatt A., O’Keefe B., Jones P., 2000, "Struc-turation Theory and Self-Organizing Networks.",Organization Science Winter Conference, Keystone,CO.

[Cottineau, 2014a] Cottineau, C., 2014, L’évolution desvilles dans l’espace post-soviétique. Observation etmodélisations, Doctoral dissertation.

[Cottineau, 2014b] Cottineau C., 2014, "DARIUSDatabase", Harmonised database of cities in thepost-Soviet space, 1840-2010, Figshare, http://dx.doi.org/10.6084/m9.figshare.1108081

[Cottineau et al., 2015] Cottineau C., Chapron P., Reuil-lon R., 2015, "Growing models from the bottomup. An evaluation-based incremental modellingmethod (EBIMM) applied to the simulation of sys-tems of cities", Journal of Artificial Societies andSocial Simulation (JASSS), 18(4).

[Cottineau and Slepukhina, forthcoming] Cottineau C.,Slepukhina I., forthcoming, "The Russian urban sys-tema: evolution under transition?, in D. Pumain, C.Rozenblat, Velasquez E., International and Transna-tional Perspectives on Urban Systems, MIT Press.

[Dimou and Schaffar, 2011] Dimou, M., Schaffar, A.,2011, Les théories de la croissance urbaine. Revued’économie politique, 121(2), 179-207.

[Duranton and Puga, 2001] Duranton G., Puga D., 2001,"Nursery cities: Urban diversity, Process innovationand the Life cycle of products", the American Eco-nomic Review, 91(5), 1454-1477

[Eeckhout, 2004] Eeckhout J., 2004, "Gibrat’s law for (all)cities", American Economic Review, 1429-1451.

[Elsenbroich, 2012] Elsenbroich C., 2012, "Explanationin agent-based modelling: Functions, causality ormechanisms?", Journal of Artificial Societies andSocial Simulation, 15(3), 1.

[Gabaix, 1999] Gabaix X., 1999, "Zipf?s Law for Cities:An Explanation.", Quarterly Journal of Economics,114, 739-767.

[Gibrat, 1931] Gibrat, R., 1931, Les inégalitéséconomiques. Recueil Sirey.

[Goldthorpe, 2001] Goldthorpe J. H., 2001, "Causation,statistics, and sociology.", European SociologicalReview, 17(1), 1-20.

[Grimm and Railsback, 2012] Grimm V., Railsback S.F., 2012. "Pattern-Oriented Modelling: A ’multi-Scope’ for Predictive Systems Ecology." Philosophi-cal Transactions of the Royal Society B: BiologicalSciences, 367 (1586), 298-310.

[Grüne-Yanoff, 2009] Grüne-Yanoff T., 2009, "The ex-planatory potential of artificial societies", Synthese,169(3), 539-555.

[Guérin-Pace, 1995] Guérin-Pace F., 1995, "Rank-sizedistribution and the process of urban growth", Ur-ban Studies, 32(3), 551-562.

[Hägerstrand, 1968] Hägerstrand T., 1968, Innovationdiffusion as a spatial process, University of ChicagoPress, 334p.

[Harris, 1970] Harris C. D., 1970, Cities of the SovietUnion, Studies in their Functions, Size, Densityand Growth, Chicago, Assciation of Amercian Ge-ographers, 484p.

[Hedström and Ylikoski, 2010] Hedström P., Ylikoski P.,2010, "Causal mechanisms in the social sciences",Annual Review of Sociology, 36, 49-67.

[Heppenstall et al., 2012] Heppenstall, A. J., A. T.Crooks, L. M. See, M. Batty, eds. 2012. Agent-basedmodels of geographical systems. Springer Science& Business Media

[Henderson, 1974] Henderson J. V., 1974, "The sizes andtypes of cities", The American Economic Review,640-656.

[Hernando et al., 2015] Hernando A., Hernando R.,Plastino A., Zambrano E., 2015, "Memory-endowedUS cities and their demographic interactions",Journal of The Royal Society Interface, 12(102),20141185.

[Krings et al., 2009] Krings G., Calabrese F., Ratti C.,Blondel V. D., 2009, "Urban gravity: a modelfor inter-city telecommunication flows", Journalof Statistical Mechanics: Theory and Experiment,2009(07), L07003.

[Krugman, 1996a] Krugman P. , 1996, The self-organizing economy. Oxford: Blackwell.

19

Page 20: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

[Krugman, 1996b] Krugman P., 1996, "Urban concentra-tion: the role of increasing returns and transportcosts", International Regional Science Review, 19(1-2), 5-30.

[Krugman, 1996c] Krugman P., 1996, "Confronting themystery of urban hierarchy". Journal of theJapanese and International economies, 10(4), 399-418.

[Livet et al., 2014] Livet P., Phan D., Sanders, L., 2014,"Diversité et complémentarité des modèles multi-agents en sciences sociales", Revue française desociologie, 55(4), 689-729.

[Lotka, 1925] Lotka A. J., 1925, Elements of physicalbiology, William & Wilkins.

[Lucas and Steyaert, 1994] Lucas C., Steyaert P., 1994,"Modular inheritance of objects through mixin-methods". Proceedings of the 1994 Joint ModularLanguages Conference (JMLC), 273?282.

[Mahfoud, 1995] Mahfoud S. W., 1995, "Niching meth-ods for genetic algorithms", Urbana, 51(95001), 62-94.

[Mitzenmacher, 2004] Mitzenmacher, M., 2004, "A briefhistory of generative models for power law and log-normal distributions", Internet mathematics, 1(2),226-251.

[Openshaw, 1983] Openshaw S., 1983, "From DataCrunching to Model Crunching: The Dawn of aNew Era.", Environment and Planning A, 15(8),1011-1012.

[Openshaw, 1988] Openshaw S., 1988, "Building an Au-tomated Modeling System to Explore a Universe ofSpatial Interaction Models.", Geographical Analy-sis, 20(1), 31-46.

[O’Sullivan, 2004] O’Sullivan D., 2004, "Complexity sci-ence and human geography." Transactions of theInstitute of British Geographers, 29(3), 282-295.

[Pred, 1973] Pred A. R., 1973, "Systems of cities and in-formation flows", Lund Studies in Geography, SerieB.

[Prehofer, 1997] Prehofer C., 1997, "Feature-orientedprogramming: A fresh look at objects",ECOOP?97?Object-Oriented Programming,Springer, 419?443.

[Preston, 1979] Preston S. H., 1979, "Urban growth indeveloping countries : a demographic reappraisal",Population and development review, 5(2),195-215.

[Pumain, 1997] Pumain D., 1997, "Pour une théorie évo-lutive des villes", L’Espace géographique, 2, 119-134.

[Pumain, 2004] Pumain D., 2004, "Scaling laws and ur-ban systems", Santa Fe Institute, Working Paper n04-02

[Pumain, 2006] Pumain D. (ed.), 2006, Hierarchy in theSocial Sciences, Springer, Methodos Series, 243p.

[Pumain et al., 2006] Pumain, D., Paulus, F., Vacchiani-Marcuzzo, C., Lobo, J., 2006, "An evolutionary the-ory for interpreting urban scaling laws", Cybergeo:European Journal of Geography, document 343.

[Pumain and Sanders, 2013] Pumain D., Sanders L.,2013, "Theoretical principles in inter-urban simu-lation models: a comparison". Environment andPlanning A, 45(9), 2243-2260.

[Robson, 1973] Robson B.T., 1973, Urban growth, an ap-proach. London: Methuen.

[Rossiter et al., 2010] Rossiter S., Noble J., Bell K. R.W.,2010, "Social simulations: Improving interdisci-plinary understanding of scientific positioning andvalidity.", Journal of Artificial Societies and SocialSimulation, 13(1), 10.

[Schärli et al., 2003] Schärli N., Ducasse S., NierstraszO., Black A. P., 2003, "Traits: Composable unitsof behaviour", ECOOP 2003?Object-Oriented Pro-gramming, Springer, 248?274.

[Schmitt et al., 2015] Schmitt C., Rey-Coyrehourcq S.,Reuillon, Pumain D., 2015. "Half a billion simu-lations: evolutionary algorithms and distributedcomputing for calibrating the SimpopLocal geo-graphical model." Environment and Planning B:Planning and Design 42 (2): 300?315

[Simon, 1955] Simon H. A., 1955, "On a class of skewdistribution functions". Biometrika, 425-440.

[Singer, 1936] Singer H. W., 1936, "The ’courbe des pop-ulations.’ A parallel to Pareto’s Law." The EconomicJournal, 254-263.

[Steyaert et al., 1993] Steyaert P., Codenie W., D?HondtT., Hondt K., Lucas C., Limberghen M., 1993,"Nested mixin-methods in agora", in O. M. Nier-strasz (ed), ECOOP? 93 ? Object-Oriented Program-ming, volume 707 of Lecture Notes in ComputerScience, Springer Berlin Heidelberg, 197-219

20

Page 21: A modular modelling framework for hypotheses testing in ...In this paper, we present a modelling experiment developed to study systems of cities and processes of urbanisation in large

A modular modelling framework for hypotheses testing in the simulation of urbanisation • Working Paper

[Steyaert and Meuter, 1995] Steyaert P., Meuter W.,1995, "A marriage of class- and object- based in-heritance without unwanted children" in M. Tokoroand R. Pareschi (eds), ECOOP?95 ? Object-OrientedProgramming, volume 952 of Lecture Notes in Com-puter Science, Springer Berlin Heidelberg, 127?144.

[Thiele et al., 2014] Thiele J. C., Kurth W., Grimm V.,2014, "Facilitating Parameter Estimation and Sen-sitivity Analysis of Agent-Based Models: A Cook-book Using NetLogo and?R?.", Journal of ArtificialSocieties and Social Simulation 17 (3)

[Thiele and Grimm, 2015] Thiele J. C., Grimm V., 2015,"Replicating and Breaking Models: Good for Youand Good for Ecology", Oikos.

[Ullman, 1941] Ullman E., 1941, "A theory of locationfor cities". American Journal of sociology, 853-864.

[Wampler and Payne, 2009] Wampler D., Payne A.,2009, "Programming Scala: Scalability = FunctionalProgramming + Objects, O’Reilly Media.

[Weidlich and Haag, 1983] Weidlich W., Haag G., 1983,Concepts and Models of a Quantitative Sociology,The Dynamics of Interacting Populations. SpringerSeries in Synergetics, Berlin.

[Zipf, 1949] Zipf G., 1949, Human Behavior and thePrinciple of Least Effort. Cambridge, MA: Addison-Wesley

21