An inquiry into the form and function of Zipf’s Law
Thomas Wagner with John Nystuen
China Data Center
University of Michigan
Updated on 21st February 2003
• In 1950: 29% of 2.5 billion people lived in cities and towns.
• By 2005: 50% of 6.5 billion people will live in urban areas.
• By 2025: 60% of 8.3 billion people will live in urban areas.
What determines urban patterns?
US Urbanization 1790-2000
What is Zipf’s Law?
• Within a network of cities, the size of a city’s population is inversely proportional to its rank on an ordered list: e.g. the largest city is #1, the second largest is #2 with ½ the population of #1, the third largest city is #3 with 1/3 the population of #1, etc.
• The populations of urban areas follow consistent power law relationships within very narrow limits with an exponent close to 1.
• This relationship applies to large urban systems around the world and over long periods of time.
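As a small illustration of the rule stated above, the sketch below lists the populations an ideal rank-size system would have when the exponent is exactly 1. The function name and the figure of 1,200,000 for the largest city are hypothetical, chosen only for illustration.

```python
def rank_size_populations(largest, n_cities):
    """Populations of an ideal rank-size system with exponent 1.

    The city of rank r gets largest / r, so rank 2 has one half and
    rank 3 has one third of the largest city's population.
    """
    return [largest / rank for rank in range(1, n_cities + 1)]


# Hypothetical largest city of 1,200,000 people.
for rank, pop in enumerate(rank_size_populations(1_200_000, 5), start=1):
    print(rank, round(pop))
```

Real urban systems only approximate this ideal list, which is why the exponent is described as "close to 1" rather than exactly 1.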
Why do we want to talk about Zipf’s Law?
• It’s remarkably consistent
• It’s frequently referenced
• Its causes are unknown
• Its variations are many
• Its applications are growing
• It leads to interesting questions
Who was George Kingsley Zipf (1902-1950)?
• Prof. & Lecturer, Harvard Univ., 1930-50.
• Proposed many linguistic and social power law relationships.
• Authored “Human Behavior & The Principle of Least Effort” (1949).
• Not well known in his time.
• Not the inventor of the “rank-size rule”.
Zipf footnote (p374):
“The first person to my knowledge to note the rectilinear distribution of communities in a country was Felix Auerbach in 1913 who, however, generalized incorrectly upon the value, p = q = 1, and was quoted therein by A. J. Lotka. In 1931 R. Gibrat reported the rectilinearity of the large communities of Europe with a value of p less than 1.”
Some power law relationships noted by Zipf
Zipf’s Law
$r \cdot P^{q} = K$
where
r = rank order
P = population of an urban community
K = population of the largest community
q ≈ 1
from Human Behavior and the Principle of Least Effort (1949)
Zipf’s urban unit definition
“Since our argument referred to the natural boundaries of communities, as opposed to their political boundaries, we shall use the data for the populations of Metropolitan Districts in the US in 1940, although… we shall find that politically bounded regions follow the same equation.” (p 375)
Growth of US urban populations: 1790 - 1930
Changes in the urban areas of other countries, e.g. India
Zipf’s Law (updated)
$R = A \, S^{-\zeta}$
where
R = number of cities with a population of S or more
A = constant
S = population of city
ζ = scaling exponent
Size-rank of the 135 largest cities in the US (2000) is remarkably linear on a log-log scale (left)
Size-rank of the 135 largest US cities (2000) has a power law exponent close to 1
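One simple way to estimate the exponent from a list of city populations is an ordinary least-squares line through the points (log size, log rank). The sketch below assumes that approach; it is not necessarily the method used for these slides, and the function name and synthetic data are hypothetical.

```python
import math


def fit_zipf_exponent(populations):
    """Least-squares fit of log(R) = log(A) - exponent * log(S).

    R is the rank of a city when sizes are sorted from largest to
    smallest, i.e. the number of cities with population S or more
    (assuming no ties). Returns (exponent, A).
    """
    sizes = sorted(populations, reverse=True)
    xs = [math.log(s) for s in sizes]
    ys = [math.log(rank) for rank in range(1, len(sizes) + 1)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return -slope, math.exp(intercept)


# Synthetic data that follow the rule exactly: S = 1,000,000 / rank.
synthetic = [1_000_000 / rank for rank in range(1, 136)]
exponent, a = fit_zipf_exponent(synthetic)
print(round(exponent, 3), round(a))
```

With real census figures the fitted exponent depends on how many cities are included, which is one reason the choice of minimum city size matters.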
Zipf’s Law doesn’t explain why, but if true…
• All urban areas are nodes within a network of urban areas within a region
• Urban networks are scale-free
– relations within subsets apply to the whole
– transitioning from chaos to order
Questions about Zipf’s Law
• Does it work? How consistent is it?
• What does it tell us about urban networks or their growth?
• What’s the right urban unit? region?
– minimum size? maximum size?
– do they change over time?
• Why population? How about other things?
• What are its applications?
Who else is involved?
• Vilfredo Pareto (1898)
• Felix Auerbach (1913)
• R. Gibrat (1931)
• Herbert Simon (1955)
• Paul Krugman (1994, 1996)
• X. Gabaix (1999)
• Laszlo Barabasi (1999)
• ~ 300 other people – see Li @ rockerfeller.edu
applications of Gibrat’s Law
• The probability distribution of the growth process does not depend on the initial size of a city’s population or economy.
• All cities grow and shrink stochastically, e.g. with common means and variances that are independent of city size.
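$\gamma_{t+1} = (S_{t+1} - S_t) / S_t$
where S_t is a city’s size at time t and γ_{t+1} is its growth rate over the following period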
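The growth process described above can be sketched as a simulation in which every city draws its growth rate from one common distribution, regardless of its current size. The normal distribution, the mean of 1%, the standard deviation of 5%, and the starting sizes are all assumptions made for illustration only.

```python
import random


def gibrat_step(sizes, rng, mean=0.01, sd=0.05):
    """One period of growth under Gibrat's law.

    Every city draws its growth rate from the same normal distribution,
    whatever its current size: S_next = S * (1 + rate).
    """
    return [s * (1.0 + rng.gauss(mean, sd)) for s in sizes]


rng = random.Random(0)  # fixed seed so the run is repeatable
cities = [1_000.0, 10_000.0, 100_000.0]  # hypothetical starting sizes
for _ in range(50):
    cities = gibrat_step(cities, rng)
print([round(c) for c in cities])
```

Because the rate distribution never looks at the city's size, small and large cities have the same expected percentage growth in every period.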
Herbert Simon says
• Probability of (a firm’s) growth is related to its size.
• “the rich get richer” e.g. Pareto’s law.
• Big firms have preferential growth
• time is required to converge to Zipf’s Law
[Equation: a relation among dn/dS, n, S, and a; garbled in the transcript]
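A common textbook form of Simon's "rich get richer" process can be sketched as follows: residents arrive one at a time, and each either founds a new city or joins an existing one chosen with probability proportional to its size. This is an illustrative sketch, not a reconstruction of the slide's equation; the function name, the 5% new-city probability, and the number of steps are hypothetical.

```python
import random


def simon_model(steps, new_city_prob=0.05, seed=0):
    """Grow a system of cities one resident at a time.

    With probability new_city_prob the newcomer founds a new city;
    otherwise the newcomer joins the city of a resident picked at
    random, so a city is chosen with probability proportional to its
    size ("the rich get richer").
    """
    rng = random.Random(seed)
    sizes = [1]        # sizes[i] = population of city i
    residents = [0]    # one entry per resident: index of their city
    for _ in range(steps):
        if rng.random() < new_city_prob:
            sizes.append(1)
            residents.append(len(sizes) - 1)
        else:
            city = rng.choice(residents)
            sizes[city] += 1
            residents.append(city)
    return sorted(sizes, reverse=True)


sizes = simon_model(20_000)
print(len(sizes), sizes[:5])
```

Picking a random resident, rather than a random city, is what makes large cities grow preferentially; many steps are needed before the size distribution settles, in line with the point that time is required to converge.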
maybe Spatial Hierarchies (Christaller & Losch)?
from G. William Skinner’s “The City in Late Imperial China” (1977)
China’s different economic zones
Economist J. V. Henderson says:
Cities differ in size because:
• they specialize in the production of goods with different scale economies.
• Firms seek to avoid diseconomies of population sizes and qualities.
• The mismatch of population diseconomies and firm scale economies results in Zipf’s Law.
what about resource availability?
• Cities are open systems for in-flows of people, water, and energy
• Big cities use people, water, and energy more efficiently than smaller cities, and far more efficiently than rural areas.
• Different efficiencies result in different patterns.
there’s Gabaix’s shocks
• Population growth comes from migration
• Need city-specific shocks (changes in taxes, pollution, floods, civil unrest, etc.) to generate common urban growth means and variances (Gibrat’s law).
• Shocks overcome migration costs and cause populations to move to new cities based on opportunities that are independent of size.
from “Zipf’s Law of Cities: An Explanation”, Quarterly Journal of Economics, August 1999
thoughts from Barabasi* (regarding power laws, not urban networks)
“As long as we thought networks were random, we modeled them as static graphs. The scale-free model reflects our awakening to the reality that networks are dynamic systems that change constantly through the addition of new nodes and links.” (p.106)
“Normally nature hates power laws… But all that changes if the system is forced to undergo a phase transition. Then power laws emerge – nature’s unmistakable sign that chaos is departing in favor of order.” (p.77)
* Linked (2002), Perseus Pub., Cambridge, MA
What about the US?
Decadal Census data are available by
– Civil divisions, census tracts, e.g. incorporated cities (detailed local data including most data before 1950)
– Metropolitan Area units (PMSAs, CMSAs), which incorporate county-size areas
– Urbanized Areas (UAs) & Urban Clusters (UCs)
  • newly created for the 2000 Census
  • includes urban populations in Puerto Rico and Guam
  • 464 UAs incorporate 194,323,824 people on 71,693 sq. miles (~2% of the US)
  • 3175 UCs incorporate 30,036,715 people on 20,672 sq. miles
  • UAs and UCs have 79% of total US population
Standard Metropolitan Areas (MAs) cover large areas and include non-urban land
Urbanized Areas include only urban land uses (448 UAs for the lower 48 states shown here)
Comparison of an Urbanized Area (blue) map with a civil division/census tract map of Ann Arbor, Michigan
Comparison of Urbanized Area patterns for western US (left) & eastern US (right)
Zipf’s exponent for 48 states
Western ½ = -1.07, Eastern ½ = -1.13
• Western cities = 50.3M people (26% of urban population)
• Eastern cities = 140.0M people (74% of urban population)
US Urban Patterns: Pacific Coast (upper-left), East & Gulf Coast (lower-left), Great Plains & Mountains (upper-right), Middle America (lower-right)
Regional Zipf’s exponents:
Pacific Coast = -1.26
Great Plains & Mountains = -1.19
East & Gulf Coast = -1.27
Middle and South = -1.07
interpreting Zipf’s graph lines
• Distance from origin = total urban population
• Slope: an integrated scaling factor
• Curves (violate Zipf’s Law)
– concave
– convex
• Tails (a problem for power laws)
– Upper (a few big cities)
– Lower “fat” (many, many small cities)
What does the exponent mean?
• Slope of the line (amount of drop for 1 unit on the horizontal axis)
• e.g. 1.0 = x
• e.g. 2.0 = 2x
• high = even distribution of urban populations among all cities
• low = urban population concentrated in certain size cities
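To make the "high = even, low = concentrated" reading concrete, the sketch below computes the share of total urban population held by the largest city in an idealized system. It assumes rank and size obey R = A · S^(-exponent) exactly, which real data do not; the function name and the exponent values tried are illustrative choices.

```python
def largest_city_share(exponent, n_cities):
    """Share of total urban population held by the largest city.

    Assumes rank R and size S obey R = A * S**(-exponent) exactly, so
    the city of rank r has size proportional to r ** (-1 / exponent).
    """
    sizes = [rank ** (-1.0 / exponent) for rank in range(1, n_cities + 1)]
    return sizes[0] / sum(sizes)


for exponent in (0.8, 1.0, 1.2, 2.0):
    print(exponent, round(largest_city_share(exponent, 448), 3))
```

In this idealized setting the largest city's share falls as the exponent rises, which is the sense in which a high exponent means population is spread more evenly across cities.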
What does the curve tell us?
US Census 2000 data, n=448
• Convex (see right): Big cities may be limited in size, population in many medium size cities
• Concave: Few medium size cities, big cities or small cities may account for large share of urban population
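One rough way to check whether a rank-size line bends is to fit separate log-log slopes to the larger and the smaller half of the cities and compare them. The sketch below assumes that split-sample approach, which is only one of several possible curvature checks; the function names and synthetic data are hypothetical.

```python
import math


def ols_slope(xs, ys):
    """Slope of the least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def split_slopes(populations):
    """Log-log rank-size slopes for the larger and smaller half of cities.

    Returns (slope_large, slope_small), with log(rank) regressed on
    log(size). Equal slopes mean a straight line; a gap means curvature.
    """
    sizes = sorted(populations, reverse=True)
    log_size = [math.log(s) for s in sizes]
    log_rank = [math.log(r) for r in range(1, len(sizes) + 1)]
    half = len(sizes) // 2
    return (ols_slope(log_size[:half], log_rank[:half]),
            ols_slope(log_size[half:], log_rank[half:]))


# Exact rank-size data give the same slope in both halves.
straight = [1_000_000 / rank for rank in range(1, 101)]
print(split_slopes(straight))
```

A clear gap between the two slopes signals that a single exponent does not describe the whole size range, which is the situation the convex and concave cases describe.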
from Rosen and Resnick, 1980 (44 countries, 50 largest cities, 1970 census)
• mean exponent = 1.136, sd = 0.196
• 32 countries had an exponent greater than 1
• Australia had the largest exponent, 1.963
• exponents larger for central cities than for metro areas
• 30 countries had concave (upward) curves
• Large cities are growing faster than small cities
• Lack of a rigorous theoretical model
* “Size distribution of Cities”, J. of Urban Economics 8 (pp. 165-186)
Should we care?
• Urban robustness and vulnerabilities
• Monitoring urban changes
• Monitoring global change
• Many possible futures
Urban robustness and vulnerabilities
• Do “hub” cities provide network robustness, e.g. resistance to disruption?
• Are urban networks vulnerable to particular types of shocks, e.g.
– power-grid failures or fuel cutoffs?
– abnormal weather events, e.g. droughts, excessive heat, excessive cold, floods, hurricanes?
– earthquakes?
– war?
Monitoring changes in urban networks
• What other things can we measure?
– areas (size and patterns)
– light emissions (energy use)
– housing stocks (nos. of residences)
– firms (size and location of production units)
– impervious (built) surfaces
US population has a correlation with area at the R² = 0.9 level
Night lights observed by satellite provide urban patterns for US
Land use maps (left) or satellite-observed impervious surfaces (right) relate to urban population patterns. Detroit metropolitan area shown.
Conclusions:
• Zipf’s law has an exponent close to 1 for the 135 largest US cities, but greater than 1 (e.g. 1.13) for 448 cities over 50,000 population.
• Regional Zipf exponents range from -1.07 to -1.27, suggesting differences in regional patterns within a relatively mature urban network.
• The log rank-size line for the US is convex (downward curving) and suggests under-representation of large cities or over-representation of medium size cities.
• Perhaps Zipf’s law is not “a law” with a rigorous theory at all, but rather “a rule” expressing a natural (power law) tendency, much like the better-known “normal distribution”.