
An inquiry into the form and function of Zipf’s Law

Thomas Wagner with John Nystuen

China Data Center

University of Michigan

Updated on 21st February 2003

• In 1950: 29% of 2.5 billion people lived in cities and towns.

• By 2005: 50% of 6.5 billion people will live in urban areas.

• By 2025: 60% of 8.3 billion people will live in urban areas.

What determines urban patterns?

US Urbanization 1790-2000

What is Zipf’s Law?

• Within a network of cities, the size of a city’s population is inversely proportional to its rank on an ordered list: e.g. the largest city is #1, the second largest is #2 with ½ the population of #1, the third largest city is #3 with 1/3 the population of #1, etc.

• The populations of urban areas follow consistent power law relationships within very narrow limits with an exponent close to 1.

• This relationship applies to large urban systems around the world and over long periods of time.
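The rank-size rule described above can be illustrated with a short sketch. The function name `zipf_populations` and the 12 million figure for the largest city are assumptions chosen for illustration, not values from the slides.

```python
def zipf_populations(largest, n_cities, exponent=1.0):
    """Population expected at each rank under an ideal rank-size rule."""
    return [largest / rank ** exponent for rank in range(1, n_cities + 1)]

# Hypothetical largest city of 12 million people (illustrative only).
sizes = zipf_populations(12_000_000, 4)
for rank, size in enumerate(sizes, start=1):
    print(rank, round(size))
```

With an exponent of exactly 1, the second city has 1/2 and the third city 1/3 of the largest city's population, as in the first bullet.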

Why do we want to talk about Zipf’s Law?

• It’s remarkably consistent

• It’s frequently referenced

• Its causes are unknown

• Its variations are many

• Its applications are growing

• It leads to interesting questions

Who was George Kingsley Zipf (1902-1950)?

• Prof. & Lecturer, Harvard Univ., 1930-50.

• Proposed many linguistic and social power law relationships.

• Authored “Human Behavior & The Principle of Least Effort” (1949).

• Not well known in his time.

• Not the inventor of the “rank-size rule”.

Zipf footnote (p374):

“The first person to my knowledge to note the rectilinear distribution of communities in a country was Felix Auerbach in 1913 who, however, generalized incorrectly upon the value, p = q = 1, and was quoted therein by A. J. Lotka. In 1931 R. Gibrat reported the rectilinearity of the large communities of Europe with a value of p less than 1.”

Some power law relationships noted by Zipf

Zipf’s Law

$P \cdot r^{q} = K$

r = rank order

P = population of an urban community

K = population of the largest community

q ≈ 1

from Human Behavior and the Principle of Least Effort (1949)

Zipf’s urban unit definition

“Since our argument referred to the natural boundaries of communities, as opposed to their political boundaries, we shall use the data for the populations of Metropolitan Districts in the US in 1940, although… we shall find that politically bounded regions follow the same equation.” (p 375)

Growth of US urban populations: 1790 - 1930

Changes in the urban areas of other countries, e.g. India

Zipf’s Law (updated)

$R = A \cdot S^{-\alpha}$

R = number of cities with a population of S or more

A = constant

S = population of a city

α = scaling exponent

Size-rank of the 135 largest cities in the US (2000) is remarkably linear on a log-log scale (left)

Size-rank of the 135 largest US cities (2000) has a power law exponent close to 1
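A power law exponent like the one quoted above is commonly estimated as the least-squares slope of a log-log plot. The sketch below assumes the regression is of log(rank) on log(population), so the slope is negative, like the regional exponents reported later in the deck. The slides do not say how their fit was done, and the function name and synthetic data are illustrative only.

```python
import math

def fit_zipf_exponent(populations):
    """Least-squares slope and intercept of log(rank) against log(population)."""
    ranked = sorted(populations, reverse=True)
    xs = [math.log(p) for p in ranked]
    ys = [math.log(r) for r in range(1, len(ranked) + 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic check: 135 cities that follow the rule exactly with exponent 1.
# The 8 million figure is hypothetical, not census data.
synthetic = [8_000_000 / r for r in range(1, 136)]
slope, intercept = fit_zipf_exponent(synthetic)
print(round(slope, 6))
```

On real census data the slope would differ from -1, and ordinary least squares on ranked data is only one of several estimators in use.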

Zipf’s Law doesn’t explain why, but if true…

• All urban areas are nodes within a network of urban areas within a region

• Urban networks are scale-free

– relations within subsets apply to the whole

– transitioning from chaos to order

Questions about Zipf’s Law

• Does it work? How consistent is it?

• What does it tell us about urban networks or their growth?

• What’s the right urban unit? region?

– minimum size? maximum size?

– do they change over time?

• Why population? How about other things?

• What are its applications?

Who else is involved?

• Vilfredo Pareto (1898)

• Felix Auerbach (1913)

• R. Gibrat (1931)

• Herbert Simon (1955)

• Paul Krugman (1994, 1996)

• X. Gabaix (1999)

• Laszlo Barabasi (1999)

• ~ 300 other people – see Li @ rockefeller.edu

applications of Gibrat’s Law

• The probability distribution of the growth process does not depend on the initial size of a city’s population or economy.

• All cities grow and shrink stochastically, e.g. with common means and variances that are independent of city size.

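$\frac{S_{t+1} - S_t}{S_t} = \varepsilon_{t+1}$, where $S_t$ is a city’s size at time $t$ and $\varepsilon_{t+1}$ is a random growth rate drawn independently of $S_t$.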
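The growth process described in the bullets above can be sketched as a simulation in which every city draws its growth rate from one shared distribution, regardless of its size. The mean, standard deviation, starting population, and number of steps are assumptions for illustration, and `gibrat_step` is a hypothetical helper name.

```python
import random

def gibrat_step(sizes, rng, mean=0.01, sd=0.05):
    """Grow every city by a random rate drawn from one shared distribution."""
    return [s * (1.0 + rng.gauss(mean, sd)) for s in sizes]

rng = random.Random(0)       # fixed seed so the run is repeatable
cities = [50_000.0] * 1000   # hypothetical: every city starts at the same size
for _ in range(200):
    cities = gibrat_step(cities, rng)

spread = max(cities) / min(cities)
print(f"largest / smallest after 200 steps: {spread:.1f}")
```

Even with identical starting sizes and size-independent growth, a wide spread of city sizes emerges. This sketch does not attempt to reproduce a Zipf exponent, which would need further assumptions that the slide does not state.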

Herbert Simon says

• Probability of (a firm’s) growth is related to its size.

• “the rich get richer” e.g. Pareto’s law.

• Big firms have preferential growth

• time is required to converge to Zipf’s Law

[Equation garbled in extraction: a relation between dS, dn, S, n and a constant a]

maybe Spatial Hierarchies (Christaller & Losch)?

from G. William Skinner’s “The City in Late Imperial China” (1977)

China’s different economic zones

Economist J. V. Henderson says:

Cities differ in size because:

• they specialize in the production of goods with different scale economies.

• Firms seek to avoid diseconomies of population sizes and qualities.

• The mismatch of population diseconomies and firm scale economies results in Zipf’s Law.

what about resource availability?

• Cities are open systems for in-flows of people, water, and energy

• Big cities use people, water, and energy more efficiently than smaller cities, and far more than rural areas.

• Different efficiencies result in different patterns.

there’s Gabaix’s shocks

• Population growth comes from migration

• Need city-specific shocks (changes in taxes, pollution, floods, civil unrest, etc.) to generate common urban growth means and variances (Gibrat’s law).

• Shocks overcome migration costs and cause populations to move to new cities based on opportunities that are independent of size.

from “Zipf’s Law of Cities: An Explanation”, Quarterly Journal of Economics, August 1999

thoughts from Barabasi* (regarding power laws, not urban networks)

“As long as we thought networks were random, we modeled them as static graphs. The scale-free model reflects our awakening to the reality that networks are dynamic systems that change constantly through the addition of new nodes and links.” (p.106)

“Normally nature hates power laws… But all that changes if the system is forced to undergo a phase transition. Then power laws emerge – nature’s unmistakable sign that chaos is departing in favor of order.” (p.77)

* Linked (2002), Perseus Pub., Cambridge, MA

What about the US?

Decadal Census data are available by

– Civil divisions, census tracts, e.g. incorporated cities (detailed local data including most data before 1950)

– Metropolitan Area units (PMSAs, CMSAs), which incorporate county-size areas.

– Urbanized Areas (UAs) & Urban Clusters (UCs)

• newly created for the 2000 Census

• includes urban populations in Puerto Rico and Guam

• 464 UAs incorporate 194,323,824 people on 71,693 sq. miles (~2% of US)

• 3175 UCs incorporate 30,036,715 people on 20,672 sq. miles

• UAs and UCs have 79% of total US population
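As a quick arithmetic check on the figures above, the sketch below adds the UA and UC populations and derives average densities. Only the four input numbers come from the slide. The variable names are illustrative.

```python
# Figures quoted on the slide for the 2000 Census.
ua_people, ua_sq_miles = 194_323_824, 71_693
uc_people, uc_sq_miles = 30_036_715, 20_672

combined_people = ua_people + uc_people
ua_density = ua_people / ua_sq_miles   # people per square mile
uc_density = uc_people / uc_sq_miles

print(f"combined UA + UC population: {combined_people:,}")
print(f"UA density: {ua_density:,.0f} per sq. mile")
print(f"UC density: {uc_density:,.0f} per sq. mile")
```

Urbanized Areas come out at roughly 2,700 people per square mile and Urban Clusters at roughly 1,450, so the larger units are also the denser ones on average.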

Standard Metropolitan Areas (MAs) cover large areas and include non-urban land

Urbanized Areas include only urban land uses (448 UAs for the lower 48 states shown here)

Comparison of an Urbanized Area (blue) map with a civil division/census tract map of Ann Arbor, Michigan

Comparison of Urbanized Area patterns for western US (left) & eastern US (right)

Zipf’s exponent for 48 states

Western ½ = -1.07, Eastern ½ = -1.13

• Western cities = 50.3M people (26% of urban population)

• Eastern cities = 140.0M people (74% of urban population)
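The percentage split quoted above follows directly from the two population totals. A minimal check, using only the slide's own figures:

```python
west_millions, east_millions = 50.3, 140.0  # from the slide
total_millions = west_millions + east_millions

west_share = west_millions / total_millions
east_share = east_millions / total_millions

print(f"west: {west_share:.1%}, east: {east_share:.1%}")
```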

US Urban Patterns: Pacific Coast (upper-left), East & Gulf Coast (lower-left), Great Plains & Mountains (upper-right), Middle America (lower-right)

Regional Zipf’s exponents:

• Pacific Coast = -1.26

• Great Plains & Mountains = -1.19

• East & Gulf Coast = -1.27

• Middle and South = -1.07

interpreting Zipf’s graph lines

• Distance from origin = total urban population

• Slope: an integrated scaling factor

• Curves (violate Zipf’s Law)

– concave

– convex

• Tails (a problem for power laws)

– Upper (a few big cities)

– Lower “fat” (many, many small cities)

What does the exponent mean?

• Slope of the line (amount of drop for 1 unit on the horizontal axis)

• e.g. 1.0 = $x$

• e.g. 2.0 = $x^{2}$

• high = even distribution of urban populations among all cities

• low = urban population concentrated in certain size cities
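To see why a higher exponent means a more even distribution, the sketch below compares the largest city with the tenth-largest under two exponents. It assumes the form used on the "Zipf’s Law (updated)" slide, where the number of cities of size S or more scales as S raised to the negative exponent, so that size scales as rank raised to the power -1/exponent. The helper name `size_ratio` is illustrative.

```python
def size_ratio(rank_a, rank_b, exponent):
    """Population at rank_a divided by population at rank_b,
    assuming rank is proportional to size ** (-exponent)."""
    return (rank_b / rank_a) ** (1.0 / exponent)

ratio_exp_1 = size_ratio(1, 10, 1.0)  # largest vs. 10th largest, exponent 1
ratio_exp_2 = size_ratio(1, 10, 2.0)  # same comparison, exponent 2

print(ratio_exp_1, round(ratio_exp_2, 2))
```

With an exponent of 1 the largest city is ten times the tenth city. With an exponent of 2 it is only about three times as large, so populations are spread more evenly across cities.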

What does the curve tell us?

US Census 2000 data, n=448

• Convex (see right): Big cities may be limited in size, with population concentrated in many medium-size cities

• Concave: Few medium-size cities; big cities or small cities may account for a large share of the urban population
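One common way to test for curvature is to add a squared term to the log-log regression and inspect its sign. The sketch below fits log(rank) as a quadratic in log(population) using plain normal equations. The function names and synthetic data are assumptions, and which sign corresponds to the slide's "convex" or "concave" depends on how the axes of the original plot were oriented.

```python
import math

def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for k in range(col, 4):
                a[r][k] -= f * a[col][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (a[r][3] - sum(a[r][k] * x[k] for k in range(r + 1, 3))) / a[r][r]
    return x

def loglog_curvature(populations):
    """Quadratic coefficient of log(rank) fitted against log(population)."""
    ranked = sorted(populations, reverse=True)
    xs = [math.log(p) for p in ranked]
    ys = [math.log(r) for r in range(1, len(ranked) + 1)]
    mean_x = sum(xs) / len(xs)
    xs = [x - mean_x for x in xs]  # centering improves numerical conditioning
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    m = [[s[0], s[1], s[2]], [s[1], s[2], s[3]], [s[2], s[3], s[4]]]
    _, _, c = solve3(m, t)
    return c

# Synthetic data: an exact power law, and one whose largest cities are
# smaller than the power law would predict (both hypothetical).
exact = [8_000_000 / r for r in range(1, 136)]
capped = [8_000_000 / (r + 5) for r in range(1, 136)]
c_exact = loglog_curvature(exact)
c_capped = loglog_curvature(capped)
print(c_capped < 0)
```

A coefficient near zero indicates a straight line. In this setup a negative coefficient appears when the largest cities fall short of the straight-line prediction.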

from Rosen and Resnick, 1980 (44 countries, 50 largest cities, 1970 census)

• mean exponent = 1.136, sd = 0.196

• exponent > 1 in 32 countries

• Australia: exponent = 1.963

• central city larger than metro area

• 30 countries had concave (upward) curves

• Large cities are growing faster than small cities

• Lack of a rigorous theoretical model

* “Size distribution of Cities”, J of Urban Economics 8 (p165-186)

Should we care?

• Urban robustness and vulnerabilities

• Monitoring urban changes

• Monitoring global change

• Many possible futures

Urban robustness and vulnerabilities

• Do “hub” cities provide network robustness, e.g. resistance to disruption?

• Are urban networks vulnerable to particular types of shocks, e.g.

– power-grid failures or fuel cutoffs?

– abnormal weather events, e.g. droughts, excessive heat, excessive cold, floods, hurricanes?

– earthquakes?

– war?

Monitoring changes in urban networks

• What other things can we measure?

– areas (size and patterns)

– light emissions (energy use)

– housing stocks (nos. of residences)

– firms (size and location of production units)

– impervious (built surfaces)

US population has a correlation with area at the R² = 0.9 level

Night lights observed by satellite provide urban patterns for US

Land use maps (left) or satellite-observed impervious surfaces (right) relate to urban population patterns. Detroit metropolitan area shown.

Conclusions:

• Zipf’s law has an exponent of close to 1 for the 135 largest US cities, but greater than 1 (e.g. 1.13) for 448 cities over 50,000 population.

• Regional Zipf exponents range from -1.07 to -1.27, suggesting differences in regional patterns within a relatively mature urban network.

• The log rank-size line for the US is convex (downward curving) and suggests under-representation of large cities or over-representation of medium-size cities.

• Perhaps Zipf’s law is not “a law” with a rigorous theory at all, but rather “a rule” expressing a natural (power law) tendency – more like the better-known “normal distribution”.
