Synthesizing Agents and Relationships for Land Use / Transportation Modelling

David Pritchard, Civil Engineering, University of Toronto

September 12, 2008

Lecture Outline

● Introduction
● Previous Work
● Data
● New Methods
● Results

Introduction

● How would land use, transportation patterns and emissions react to...
  ● High congestion charge?
  ● Greenbelt policy?
  ● “Do nothing” while population grows
  ● Major transportation projects
● Major extrapolations from current behaviour
● Too hard to predict conventionally

Introduction

Traditional 4-stage

Introduction

Integrated Land Use/Transportation Environment (ILUTE) model

Introduction

● We can’t build such a complicated model using conventional methods

● Instead, preferred approach is microsimulation model

● What is microsimulation?

Introduction

Conventional Model

Simulation Model

Introduction

● Microsimulation = Simulation + Agents
● Models the state of agents
● Combined behaviour of agents yields system state
● 1. Begin with initial population in start year
● 2. Update population, year by year
  ● age persons, change family structures
  ● change jobs, move homes
  ● use this to predict annual travel patterns
● 3. Obtain travel patterns in forecast year
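The three steps above can be sketched in a few lines of Python. This is only an illustration of the loop structure, not the ILUTE implementation: the `Person` class, the "everyone aged 18-64 commutes" rule and the `commuters` summary are hypothetical stand-ins for the behavioural models a real microsimulation would use.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Person:
    """A hypothetical agent with a minimal state."""
    person_id: int
    age: int
    employed: bool


def update_one_year(population):
    """Step 2: advance every agent by one year (toy rules, assumed for illustration)."""
    updated = []
    for p in population:
        new_age = p.age + 1
        # Hypothetical rule: persons aged 18-64 are employed and commute.
        updated.append(replace(p, age=new_age, employed=18 <= new_age < 65))
    return updated


def commuters(population):
    """System state derived from the combined state of the agents."""
    return sum(1 for p in population if p.employed)


def simulate(initial_population, start_year, forecast_year):
    population = list(initial_population)          # Step 1: initial population
    history = {start_year: commuters(population)}
    for year in range(start_year + 1, forecast_year + 1):
        population = update_one_year(population)   # Step 2: update year by year
        history[year] = commuters(population)      # annual travel pattern
    return population, history                     # Step 3: forecast-year result


initial = [Person(1, 15, False), Person(2, 40, True), Person(3, 63, True)]
final_population, history = simulate(initial, 1986, 1989)
```

The point of the sketch is that the forecast is never computed directly: it falls out of the agent states after the last yearly update.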

Introduction

● Need an initial population in the start year
● List of agents and their attributes - e.g.,
  ● Number of persons, and their ages
  ● Number of vehicles
  ● Type of dwelling
  ● etc.
● But - complete list is unknown
● “Population Synthesis” used instead
  ● Use known data to create initial agents
  ● Result has known statistical properties
  ● Best estimate from limited data

Introduction

My results:
● Improved method for population synthesis
  ● Allows more attributes for each agent
● New method for relationship synthesis
  ● Allows correct set of agents and correct set of relationships
● Created a synthetic population for ILUTE
  ● Persons, families, households and dwellings
  ● Complete 1986 population for GTHA

Previous Work

Two representations of set of agents
● List of agents and their attributes (as categories)
● Contingency table
  ● One cell for each combination of attributes
  ● Cell contains count of number of agents
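As a small illustration of why the two representations are interchangeable, the sketch below converts a list of agents into a contingency table and back. The attribute names and categories are made up for the example.

```python
from collections import Counter

# List representation: one record per agent, attributes as categories.
agents = [
    {"age": "0-17", "sex": "F"},
    {"age": "18-64", "sex": "F"},
    {"age": "18-64", "sex": "M"},
    {"age": "18-64", "sex": "F"},
    {"age": "65+", "sex": "M"},
]
attributes = ("age", "sex")


def to_table(agent_list, attrs):
    """Contingency table: one cell per attribute combination, holding a count."""
    return Counter(tuple(agent[a] for a in attrs) for agent in agent_list)


def to_list(table, attrs):
    """Expand each cell back into that many identical agents."""
    return [
        dict(zip(attrs, cell))
        for cell, count in sorted(table.items())
        for _ in range(count)
    ]


table = to_table(agents, attributes)
round_trip = to_table(to_list(table, attributes), attributes)
```

Only the order of the agents is lost in the round trip; the counts in every cell are preserved.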

Previous Work

Data Limitations
● Patchwork of partial data
● Mostly, we have one-way margins
  ● Breakdown of a single attribute into a few categories
● Example: look at how we can use one-way margins

Previous Work

Iterative Proportional Fitting
● e.g., “Biproportional Updating” of O/D tables
● Exactly satisfies target margins
● Also minimizes discrimination information relative to source population
  ● Information theory: maximum entropy
  ● Resulting PDF satisfies the constraints without assuming any information we do not possess
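A minimal two-dimensional sketch of iterative proportional fitting follows, written in plain Python for illustration (the talk's own implementation was in R and is not shown here). The source table and the target margins are invented numbers; the function name `ipf_2d` is an assumption.

```python
def ipf_2d(source, row_targets, col_targets, iterations=100, tol=1e-10):
    """Scale rows and columns of `source` in turn until both margins match."""
    table = [row[:] for row in source]
    n_rows, n_cols = len(table), len(table[0])
    for _ in range(iterations):
        # Fit the row margin: scale each row to its target total.
        for i in range(n_rows):
            total = sum(table[i])
            if total > 0:
                factor = row_targets[i] / total
                table[i] = [v * factor for v in table[i]]
        # Fit the column margin: scale each column to its target total.
        for j in range(n_cols):
            total = sum(table[i][j] for i in range(n_rows))
            if total > 0:
                factor = col_targets[j] / total
                for i in range(n_rows):
                    table[i][j] *= factor
        # Columns now fit exactly; stop once the rows also fit.
        row_error = max(abs(sum(table[i]) - row_targets[i]) for i in range(n_rows))
        if row_error < tol:
            break
    return table


# Invented source sample (e.g. from a small microdata sample) and target margins.
source = [[10.0, 20.0], [30.0, 40.0]]
fitted = ipf_2d(source, row_targets=[400.0, 600.0], col_targets=[300.0, 700.0])
```

Because every step only multiplies whole rows or whole columns, the association structure of the source table (its cross-product ratios) carries through to the fitted table while the margins move to the targets.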

Previous Work

Many options for margins in 3D

Previous Work

Beckman, Baggerly & McKay (1996)
● State-of-the-art application of IPF for census data
● Geography attribute gets special treatment
  ● Due to nature of data in PUMS and census tables
  ● Two approaches: zone-by-zone, or all zones at once
● Treats final table as a PMF
  ● Monte Carlo draws used to integerize
  ● Hurts fit to target margins
● Limited number of attributes
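The integerization step can be sketched as follows: the fitted (fractional) table is treated as a PMF and whole agents are drawn from it. The cell values, the seed and the function name are assumptions made for this example, not details from Beckman et al.

```python
import random
from collections import Counter


def monte_carlo_integerize(fitted_table, n_agents, seed=0):
    """Draw n_agents cells with probability proportional to the fitted counts."""
    rng = random.Random(seed)
    cells = list(fitted_table)
    weights = [fitted_table[c] for c in cells]
    draws = rng.choices(cells, weights=weights, k=n_agents)
    return Counter(draws)


# Invented fractional IPF result over (age, sex).
fitted = {
    ("0-17", "F"): 12.4,
    ("0-17", "M"): 13.1,
    ("18+", "F"): 37.7,
    ("18+", "M"): 36.8,
}
n_agents = round(sum(fitted.values()))
synthetic = monte_carlo_integerize(fitted, n_agents)

# Compare one margin before and after the draw.
female_fitted = sum(v for (age, sex), v in fitted.items() if sex == "F")
female_synthetic = sum(v for (age, sex), v in synthetic.items() if sex == "F")
```

The drawn population has the right total, but `female_synthetic` will generally differ from `female_fitted`: the random draw is the reason the fit to the target margins gets worse.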

Previous Work

Williamson, Birkin and Rees (1998)
● Not IPF: “Combinatorial Optimisation”
● List-based, instead of tables
● Pros:
  ● good fit to target margins
  ● may handle more attributes
● Cons:
  ● no guarantees about relationship with source sample
  ● not entropy maximizing
  ● slow

Data

Summary Tables
● Usually one attribute, by zone (2D margin)
● Contingency table
● Large sample: 20% or 100%
● Sometimes 2-3 attributes by zone
● Used as Target Margins

Public Use Microdata Sample (PUMS)
● List; almost all attributes, except zones
● Small sample (1-2%)
● Canada: defined for each large Census Metropolitan Area (CMA)
● Used as Source Sample

Canadian Census includes three PUMS
● Persons
● Census families
● Households & Dwellings

Also summary tables related to each

New Methods: Sparsity

Beckman et al.’s approach doesn’t work well with many attributes
● Computation becomes hard
  ● Huge memory requirement
  ● Slow
● Thirteen attributes on family agent:
  ● Beckman Zone-by-Zone needs 1.4 GB memory
  ● Beckman Multizone needs 1,036 GB memory

Number of cells in multiway table grows exponentially with number of attributes (dimensions)
● Large number of bins
● Most bins are zero
● Number of bins is larger than sample!
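The growth in table size is easy to check with arithmetic. The sketch below assumes, purely for illustration, 13 attributes with 4 categories each, 8 bytes per cell and a sample of 30,000 records; these are not the category counts or sample sizes behind the memory figures quoted above.

```python
from math import prod

# Hypothetical: 13 attributes with 4 categories each.
categories_per_attribute = [4] * 13

# Cells in the complete table after adding each attribute in turn.
cells_by_dimension = [
    prod(categories_per_attribute[:k])
    for k in range(1, len(categories_per_attribute) + 1)
]
total_cells = cells_by_dimension[-1]

# Dense storage with one 8-byte float per cell.
dense_bytes = total_cells * 8

# Hypothetical sample size: at most this many cells can be non-zero.
sample_size = 30_000
max_nonzero_cells = min(total_cells, sample_size)
zero_fraction = 1 - max_nonzero_cells / total_cells
```

Under these assumptions the complete table has about 67 million cells, while the sample can populate at most 30,000 of them, so well over 99.9% of the table is guaranteed to be zero.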

Is it meaningful to use many attributes?
● Tentatively, yes
  ● Not a meaningful 13-way distribution
  ● But, a link between many statistically valid low-order distributions (e.g., 3-way)
● If acceptable, can we do better than standard IPF?

Yes - use a sparse data structure instead of a complete array to represent table
● Store only non-zero cells in table
● Same representation as Williamson’s “Combinatorial Optimisation”
● But, uses IPF algorithm
  ● Maximum entropy guarantee; fast
● Can implement either zone-by-zone or multizone IPF using sparse data structure
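One way to picture the sparse approach is IPF run over a weighted list of records instead of a full array: only cells that occur in the source sample are stored, and each fitting step rescales record weights. The sketch below is an assumption-laden illustration in Python, not the R implementation from the talk; the records, margins and the name `sparse_ipf` are invented.

```python
from collections import defaultdict


def margin_totals(records, weights, attr):
    """Sum the weights of all records sharing each category of one attribute."""
    totals = defaultdict(float)
    for rec, wt in zip(records, weights):
        totals[rec[attr]] += wt
    return totals


def sparse_ipf(records, weights, margins, iterations=50):
    """IPF over non-zero cells only.

    records: list of attribute tuples (the non-zero cells of the table)
    margins: list of (attribute_index, {category: target_total}) pairs
    """
    w = list(weights)
    for _ in range(iterations):
        for attr, targets in margins:
            totals = margin_totals(records, w, attr)
            w = [
                wt * targets[rec[attr]] / totals[rec[attr]]
                for rec, wt in zip(records, w)
            ]
    return w


# Invented source sample over (age, sex, activity); cells not listed are zero.
records = [
    ("0-17", "F", "student"),
    ("18+", "F", "employed"),
    ("18+", "M", "employed"),
    ("18+", "M", "student"),
    ("0-17", "M", "student"),
]
margins = [
    (0, {"0-17": 30.0, "18+": 70.0}),
    (1, {"F": 52.0, "M": 48.0}),
]
fitted_weights = sparse_ipf(records, [1.0] * len(records), margins)
age_fit = margin_totals(records, fitted_weights, 0)
sex_fit = margin_totals(records, fitted_weights, 1)
```

Memory use scales with the number of records rather than with the product of category counts, and zero cells stay zero exactly as they would in dense IPF.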

New Methods: Relationships

Land use/transportation models have more types of agents
● Agents: Persons, families, households, business establishments
● Objects: Vehicles, dwellings

Need to synthesize correct relationships
● Examples:
  ● Which persons are married? Opposite sex, similar ages - usually
  ● Which household owns/rents a given dwelling? Number of rooms and number of persons should be correlated
● Earlier methods could guarantee correct PDF for one agent type, but not all simultaneously

Family PUMS contains information about persons in family
● husband/wife ages; child ages

Can synthesize “family” agent
● Include some “person” attributes in family

Then, conditionally synthesize persons on family attributes

IPF result is a joint probability mass function

P(AGE, EDU, INCOME, OCCUP, SEX, ...)

Can convert to a conditional PMF

P(EDU, INCOME, OCCUP, ... | AGE, SEX)

Synthesize, repeating for husband, wife, children

Guarantees good fit for both agent types
● Correct Family PDF
● Correct Person PDF

Simple, data-driven
● No rules
● No special data sources, models

Provided that attributes can be aligned between agents
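The conditional synthesis described above can be sketched as follows: the joint person PMF is converted into a PMF conditional on the attributes already fixed by the family (here age and sex), and the remaining attributes are drawn from it for each family member. The joint PMF values, the single remaining attribute (education) and the function names are illustrative assumptions.

```python
import random
from collections import defaultdict


def conditional_pmf(joint, given_idx):
    """Turn {cell: P(cell)} into {given values: {remaining values: P(rest | given)}}."""
    grouped = defaultdict(dict)
    for cell, p in joint.items():
        given = tuple(cell[i] for i in given_idx)
        rest = tuple(v for i, v in enumerate(cell) if i not in given_idx)
        grouped[given][rest] = grouped[given].get(rest, 0.0) + p
    conditional = {}
    for given, dist in grouped.items():
        total = sum(dist.values())  # marginal probability of the given values
        conditional[given] = {rest: p / total for rest, p in dist.items()}
    return conditional


def synthesize_person(conditional, age, sex, rng):
    """Draw the remaining person attributes given the age and sex set by the family."""
    dist = conditional[(age, sex)]
    options = list(dist)
    (edu,) = rng.choices(options, weights=[dist[o] for o in options], k=1)[0]
    return {"age": age, "sex": sex, "edu": edu}


# Invented joint person PMF over (AGE, SEX, EDU), e.g. the output of IPF.
joint = {
    ("30-39", "M", "degree"): 0.10,
    ("30-39", "M", "no degree"): 0.15,
    ("30-39", "F", "degree"): 0.15,
    ("30-39", "F", "no degree"): 0.10,
    ("0-9", "M", "no degree"): 0.25,
    ("0-9", "F", "no degree"): 0.25,
}
conditional = conditional_pmf(joint, given_idx=(0, 1))

# A synthesized family agent carrying the person attributes it shares.
family = {"husband": ("30-39", "M"), "wife": ("30-39", "F"), "children": [("0-9", "F")]}
rng = random.Random(0)
husband = synthesize_person(conditional, *family["husband"], rng)
wife = synthesize_person(conditional, *family["wife"], rng)
children = [synthesize_person(conditional, age, sex, rng) for age, sex in family["children"]]
```

This only works because age and sex are defined with the same categories on both the family and the person side, which is the alignment condition noted above.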

Results

Programmed in R
● A statistical programming platform
● Dynamic language, fast prototyping
● Good support for categorical data, contingency tables

Toronto CMA: 1.1 million households, 1.0 million families, 3.3 million persons
● Run time: 2 hours, 7 minutes on older 1.5GHz computer
● Repeated for Hamilton and Oshawa CMAs

Results

Experiment
● Is there value in using really rich input data?
● Or does PUMS + 1D tables give enough?

Calculated fit against all available data
● SRMSE and G² information theoretic statistics
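For readers unfamiliar with the two statistics, the sketch below uses their common textbook forms: SRMSE as the root mean squared cell error divided by the mean observed cell count, and G² as twice the sum of observed times log(observed / fitted). Whether the talk used exactly these normalizations is an assumption, and the cell counts are invented.

```python
from math import log, sqrt


def srmse(observed, fitted):
    """Standardized RMSE: RMSE over cells divided by the mean observed cell count."""
    n = len(observed)
    rmse = sqrt(sum((f - o) ** 2 for o, f in zip(observed, fitted)) / n)
    return rmse / (sum(observed) / n)


def g_squared(observed, fitted):
    """Likelihood-ratio statistic G^2 = 2 * sum(O * ln(O / E)); empty observed cells add 0."""
    return 2.0 * sum(o * log(o / f) for o, f in zip(observed, fitted) if o > 0)


# Invented cell counts: a validation table and the matching synthetic table.
observed = [10.0, 20.0, 30.0, 40.0]
synthetic = [12.0, 18.0, 33.0, 37.0]

fit_srmse = srmse(observed, synthetic)
fit_g2 = g_squared(observed, synthetic)
```

Both statistics are zero for a perfect fit and grow as the synthetic table departs from the validation table.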

Results

● Improvement of result with additional data evident
  ● However, no statistical tests possible
● Monte Carlo stage causes some error
● My conditional synthesis introduces small amount of additional error
● Little difference between zone-by-zone and multizone methods

Questions?
