Local Political Business Cycles
Evidence from Philippine Municipalities∗
Julien Labonne†
Oxford University
November 2013
Draft
Abstract
In this paper, I test for the presence of political business cycles in Philippine municipalities over the period 2003-2009, a context where according to the literature such cycles are likely to be observed. I find robust evidence for the presence of political business cycles. This effect is only present when I use quarterly data and vanishes when I aggregate the data at the yearly level. The difference is not merely driven by a decline in statistical power due to aggregation: point estimates for the overall effects are 7 times larger when I use quarterly data than when I use yearly data. This discrepancy can be explained by a drop in employment post-election that dilutes the yearly effects. Specifically, using data from 26 nationally representative quarterly labor force surveys, I construct a balanced panel of more than 1,100 municipalities and show that the share of the working-age population that is employed increases by 0.87 percentage points in the two quarters before elections. In the two post-election quarters, it is 0.48 percentage points lower than what it would have been without the elections. Results are robust to the inclusion of a number of control variables, time trends and to two-way clustering of the residuals along both time and geographic dimensions.
∗I am grateful to Marcel Fafchamps, Simon Franklin, Clement Imbert and Simon Quinn for useful discussions while working on this paper. APPC kindly shared their electoral data. Financial support from the CSAE and Oxford Economic Papers Fund is gratefully acknowledged. I thank Jacobus Cilliers, Paul Niehaus, Yasuhiko Matsuda and participants in the CSAE Research Workshop, CSAE Conference 2013 and UPSE Friday Seminar for comments. All remaining errors are mine.
†email: [email protected]
1 Introduction
In this paper I examine whether employment levels are affected by election timing. Economists
and political scientists have been interested in analyzing so-called political business cycles,
the fluctuations of employment around elections, and political budget cycles, the expected
increase in government expenditures before elections, as they provide insights into voter and
politician behavior.1 To date, the empirical evidence is not as strong as theory and anecdotal
evidence would suggest.2
A number of explanations have been proposed for the challenges in identifying political
budget and business cycles in some contexts and for the weakness of the estimated effects when
cycles are identified. Researchers have argued that cycles are a function of, among others, the
structure of fiscal federalism (Jones, Meloni, and Tommasi 2012), institutional constraints
on politicians (Shi and Svensson 2006) and the degree of control politicians have over the
economy (Duch and Stevenson 2008). To take one example, Duch and Stevenson (2008)
argue that in an open economy politicians have less control over the state of the economy
and as such it provides a weaker signal of incumbent quality. Voters are more reluctant to
use the signal when deciding whether to re-elect the incumbent and, in turn, she has fewer
incentives to attempt to improve the economy before elections. Similarly, Jones, Meloni, and
Tommasi (2012) show that in a decentralized setting increased spending acts as a signal for the
1Recognizing that voters use retrospective information about the economy to decide whether to re-elect the incumbent, Nordhaus (1975) developed a model where politicians have incentives to decrease unemployment and increase inflation ahead of elections. While the model assumed myopic voters, it can be modified to incorporate rational agents. For example, shifting the focus from employment to budget spending, Rogoff (1990) proposed a model where it is optimal for voters to use information about public spending when deciding whether to re-elect the incumbent or not as it provides information about the incumbent’s private type. If voters observe an increase in public spending they rationally attribute some of the improvements to the incumbent’s actions.
2Political business and budget cycles have been identified in some countries but not in others (Peltzman (1992); Franzese (2002); Besley (2006); Shi and Svensson (2006)). More recently, the empirical literature has moved from cross-country to subnational analyses and has tested whether local government spending follows a different pattern in election years. Evidence from countries as diverse as Argentina, Brazil, India, Indonesia, Israel, Portugal and Russia suggests that local governments increase spending and/or reduce taxes before elections (Akhmedov and Zhuravskaya (2004); Khemani (2004); Brender (2003); Jones, Meloni, and Tommasi (2012); Sakurai and Menezes-Filho (2010); Aidt, Veiga, and Veiga (2011); Sjahrir, Kis-Katos, and Schulze (2013)). In addition, elections appear to affect spending composition, with local governments allocating more resources to visible investments (Drazen and Eslava 2010). This is consistent with the prediction of Rogoff (1990) that, ahead of elections, there is a re-allocation of government spending towards easily observable expenditures. More recently electoral cycles have also been identified in foreign aid (Faye and Niehaus 2012), in cement consumption in India (Kapur and Vaishnav 2011) and in prices paid to farmers for cane (Sukhtankar 2012).
incumbent’s ability to extract resources from the center.
In this paper, I test for political business cycles in Philippine municipalities over the
period 2003-2009, a context where according to the literature strong cycles are likely to be
observed. First, local politicians often campaign on their ability to secure resources from
higher levels of government. Second, Philippine municipalities are often headed by strong
mayors with significant discretionary powers over budget spending. Third, mayors sometimes
act as employment brokers in both the public and private sectors.
I find robust evidence for the presence of political business cycles. This effect is only present
when I use quarterly data and vanishes when I aggregate the data at the yearly level. The
difference is not merely driven by a decline in statistical power due to aggregation: point
estimates for the overall effects are 7 times larger when I use quarterly data than when I use
yearly data. This discrepancy can be explained by a drop in employment post-election that
dilutes the yearly effects. A potential explanation for this decline is that, in situations where
local governments are unable to borrow, local incumbents can increase spending ahead of
elections by shifting some of their planned post-election spending to the pre-election period.3
This within-year effect is not captured by analyses using aggregated data.
The main findings can be summarized as follows. Using a unique balanced panel dataset
of about 1,140 cities and municipalities, I show that the share of working-age population that
is employed increases by 0.87 percentage-points in the two quarters before elections. This
is equivalent to an additional 470,000 jobs in April 2004 compared with what employment
would have been without the May 2004 elections. Employment as a share of the working-age
population in the two post-election quarters is 0.48 percentage-point lower than it would have
been without the elections. Results are robust to the inclusion of a number of control vari-
ables, time trends and to two-way clustering of the residuals along both time and geographic
dimensions. In addition, results are similar when I control for lagged values of the dependent
variable and estimate the model using the generalized method of moments.
The paper provides a simple explanation for the challenges in identifying political business
cycles when using yearly data and for the weakness of the estimated effects when cycles are
identified. In addition, the paper expands the literature along three main dimensions. First, I
3Importantly, in the Philippines, the calendar and fiscal years coincide.
return to the literature’s origin and carry out one of the first political business cycle analyses at
the subnational level.4 Indeed, to date subnational analyses have focused on budget cycles,
capturing only part of the potential distortions to labor markets. In contexts where local
politicians have significant business interests and/or strong connections with local
businessmen, incumbents might be able to influence employment levels above and beyond what
one would expect from budget spending alone. For example, Bertrand, Kramarz, Schoar, and
Thesmar (2007) provide convincing evidence that, in France, firms with political connections
tend to delay firing employees until after the elections.
Second, I focus on a setting where election timing is exogenous to the outcome of interest.
Strategic election timing is a common concern in the literature but, in the Philippines, it was
decided as part of the 1987 Constitution and since then all elections have been organized as
planned. Put differently, politicians are de facto unable to organize elections in good years.
Third, results discussed in this paper contribute to the literature on clientelism. The
increase in employment in the public sector before the elections is concentrated on long-term
contracts (i.e. non-casual). This is in agreement with the model of clientelism developed
by Robinson and Verdier (2013) as it allows politicians to align bureaucrats’ incentives with
their own electoral objectives as their job tenure is implicitly tied to the incumbent’s electoral
success. In addition, the size of the effect suggests that incumbents are not merely providing
jobs to secure additional votes. The observed effects do not appear large enough to affect
election results. Rather, results are consistent with the argument that incumbents attempt to
provide their constituents with benefits that will last until after the elections. The increase in
employment in the construction sector is expected to lead to an increase in either maintenance
of existing public goods or the construction of new infrastructure.
Results presented in this paper suggest that, to the extent possible, future analyses of
political business cycles should use monthly or quarterly data. A similar point is implicit in
results presented by Akhmedov and Zhuravskaya (2004) for the analysis of political budget
cycles, but it appears that data constraints have prevented researchers from doing so.
The remainder of the paper is organized as follows. Section 2 provides some background
4To the best of my knowledge, the only exceptions are Coelho, Veiga, and Veiga (2006) who test for political business cycles in Portuguese municipalities, using yearly data, over the period 1985-2000 and Dahlberg and Mork (2011) who test for electoral cycles in public employment in Finnish and Swedish municipalities, using yearly data, over the period 1985-2002.
on local elections in the Philippines. Section 3 introduces the data. The estimation strategy
is presented in Section 4 and results are discussed in Section 5. Section 6 concludes.
2 Setting
In this section, I briefly present the local political context, highlighting factors that make
Philippine municipalities particularly well-suited to study political business cycles, especially
over the period 2003-2009. I start with a brief description of the institutional context, followed
by a discussion of local elected officials’ behavior and ability to affect local labor markets.
First, carrying out subnational analyses allows me to control for the specificities of the institutional
context. As pointed out by Drazen and Eslava (2010) in their study of the effects
of elections on local government spending in Colombia, variations in institutions make the
interpretation of cross-country regressions challenging. This is especially relevant over the
period 2003-2009 in the Philippines as the same president, Gloria Macapagal-Arroyo, was in
office throughout. Unobserved links with central government officials which could affect local
politicians’ ability to access state resources will be stable throughout the period and can thus
be controlled for by using municipal fixed-effects.
Second, since the fall of Marcos, elections have followed a pre-established calendar set out
in the 1987 Constitution. This rules out concerns that election timing was endogenous to
the outcome of interest as local governments were unable to control election timing. The
literature has recognized that most governments have some control over election timing and
that observed correlations between elections and the outcome of interest could be driven by
incumbents’ ability to time elections as they please (Khemani 2004; Shi and Svensson 2006).
To deal with those concerns, researchers often provide evidence that results are robust to
restricting the sample to elections that were implemented according to the constitutionally
mandated schedule, the idea being that in such cases the timing of elections is exogenous
to the outcome of interest. However, if incumbents have the power to either call for elections
early or postpone them and have used it in the past, then the decision not to do so might
be correlated with the outcome of interest as well. As a result, the assumption that election
timing is exogenous is more likely to be valid when local politicians are de facto unable to
affect when elections take place, as is the case in the Philippines.
Third, Capuno (2012) and Sidel (1999) provide credible evidence that municipalities are
headed by strong mayors, often referred to as local bosses and perceived as such by the
population. One would expect economic voting to be especially strong in such a setting.
Duch and Stevenson (2008) argue that economic voting is stronger when voters perceive
incumbents to be in control of the local economy. When voters expect elected officials to
control the local economy, they are more likely to use the state of the economy as a signal of
the incumbent’s quality. Incumbents therefore have more incentives to distort the economy ahead
of the elections.
As shown by Hutchcroft (2012), most decisions regarding municipal budgets are made
by mayors who use available funds with very little oversight. This is despite the fact that
the 1991 Local Government Code established municipal councils and gave them decision-
making powers. Mayors control both how the budgets are spent and public sector employment
(Hodder 2009; Hutchcroft 2012). In addition, there is evidence consistent with the argument
that they are able to staff the bureaucracy with their relatives (Fafchamps and Labonne 2013),
thus ensuring that bureaucrats have incentives closely aligned with their electoral objectives
(Robinson and Verdier 2013). In clientelistic systems where local elected officials are often
assessed on their ability to respond to citizens’ requests, one would expect incumbents to use
their discretionary powers more frequently ahead of elections. For example, in a case study of
Naga City in the Philippines, Kawanaka (2002) notes that resident requests greatly increase
as elections near. The city government intensifies its services to reflect favorably on Robredo’s
[the mayor] leadership.
Fourth, a number of municipalities are characterized by the presence of family dynasties
which have been in power for decades. It is common for one of the incumbent’s family members
to run for the position when the incumbent has reached the 3-term limit introduced in the 1987
Constitution (Coronel, Chua, Rimban, and Cruz 2004; Querubin 2011). The relevant
unit of analysis in Philippine politics is the family rather than the individual politician or the
political party (McCoy 2009). As described by Fegan (2009): “A family is a more effective
political unit than an individual because it has a permanent identity as a named unit, making
its reputation, loyalties, and alliances transferable from members who die or retire to its new
standard bearer.” Assuming that citizens learn about politicians’ quality during their time in
office, the set-up allows me to test whether political business cycles are more pronounced in
areas where the incumbent’s quality is more uncertain.
Finally, available qualitative evidence suggests that strong political business cycles are
likely to be identified around municipal elections in the Philippines. Incumbents attempt to
increase public spending, especially on visible projects, in the pre-election period. This is
similar to findings from Colombia where incumbents have been shown to allocate a greater
share of their budget to construction projects in election years (Drazen and Eslava 2010).
Thus one would expect an increase in employment in the construction sector ahead of the
elections. Given that most municipalities face strict budget constraints due to their reliance
on fixed fiscal transfers from the central government and their de facto inability to borrow,
the pre-election increase in spending might be followed by a post-election decline as fiscal
resources for the fiscal year have been depleted.
In addition, local incumbents’ electoral strategies might affect private sector employment
directly. Local politicians often use the power of their office to increase their business holdings
(Sidel 1999). They are then able to provide their constituents with jobs. Further, in a number
of Philippine municipalities, mayors act as employment brokers, helping their constituents find
jobs. For example, in a province surrounding Manila, job applicants in local factories were
required to provide letters of recommendation from local officials (Sidel 1999, pp 76-77). There
is qualitative evidence that this role intensifies before elections as voters have more bargaining
power (Kawanaka 2002).
3 Data
3.1 Employment
I use data from Labor Force Surveys (LFS) collected by the National Statistics Office of
the Philippines. The surveys are implemented four times a year (January, April, July and
October) and I have access to all 26 surveys in the period July 2003 - October 2009. Each
survey has a sample size of approximately 200,000 individuals in 50,000 households.5 Data
from the surveys are used to compute official employment statistics. A person is considered
employed if he reported at work for at least an hour during the week prior to the survey. In
5More information on the survey design is available at: http://www.census.gov.ph/data/technotes/notelfs new.html (visited on March 26, 2012).
addition, information is collected on the total number of hours worked during the past week,
the sector of employment and the daily wage. I use the available data to build a balanced
panel of about 1,140 cities and municipalities, out of 1,634 in the country.
For each municipality/survey wave, I compute the share of the working age population
(above 15 years old) that is employed. It is not possible to compute the employment rate as a
share of the economically active population consistently across survey waves as the definition
of the economically active population changed in April 2005. The information required to
adjust past series is not available. However, the definition of employment has not changed
and I compute the employment ratio as a share of the working age population rather than as a
share of the economically active population. As a result, estimates presented in this paper
combine the effects of elections on the decision to enter/exit the labor force and of getting a
job for those in the labor force.
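As a minimal sketch of the computation described above, the employment share for each municipality/survey wave could be built from individual records as follows. The field names, the toy records and the use of individual survey weights inside the ratio are assumptions made for illustration; they are not taken from the LFS files.

```python
from collections import defaultdict

# Hypothetical individual-level records; field names are illustrative only.
records = [
    {"municipality": "A", "wave": "2004-04", "age": 14, "employed": 0, "weight": 1.0},
    {"municipality": "A", "wave": "2004-04", "age": 30, "employed": 1, "weight": 2.0},
    {"municipality": "A", "wave": "2004-04", "age": 45, "employed": 0, "weight": 1.0},
    {"municipality": "B", "wave": "2004-04", "age": 22, "employed": 1, "weight": 1.5},
    {"municipality": "B", "wave": "2004-04", "age": 67, "employed": 1, "weight": 0.5},
]

def employment_share(rows, min_age=15):
    """Weighted share of the working-age population (older than min_age) employed."""
    employed = defaultdict(float)
    total = defaultdict(float)
    for r in rows:
        if r["age"] <= min_age:
            continue  # not part of the working-age population
        key = (r["municipality"], r["wave"])
        total[key] += r["weight"]
        employed[key] += r["weight"] * r["employed"]
    return {key: employed[key] / total[key] for key in total}

shares = employment_share(records)
```

Because the denominator is the whole working-age population, a change in the share can reflect movements into or out of the labor force as well as changes in job-finding.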
3.2 Political environment
In accordance with the 1987 Constitution, elections have been organized every three years.
The two elections of interest in the sample period are the May 2004 and May 2007 elections. I
distinguish between pre-election months (January and April waves) and post-election months
(July and October waves).
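The wave classification above can be sketched in a few lines. The function name and the representation of survey waves as dates are assumptions for illustration; only the election years and the January/April versus July/October split come from the text.

```python
from datetime import date

# Election years named in the text (May 2004 and May 2007).
ELECTION_YEARS = {2004, 2007}

def election_dummies(wave):
    """Return (pre, post) indicators for a survey wave given as a date."""
    in_election_year = wave.year in ELECTION_YEARS
    pre = int(in_election_year and wave.month in (1, 4))
    post = int(in_election_year and wave.month in (7, 10))
    return pre, post

# The 26 quarterly waves from July 2003 to October 2009.
waves = [
    date(year, month, 1)
    for year in range(2003, 2010)
    for month in (1, 4, 7, 10)
    if (year, month) >= (2003, 7)
]
dummies = {w: election_dummies(w) for w in waves}
```

With two elections in the sample, four waves are flagged as pre-election and four as post-election.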
A number of Philippine municipalities are controlled by so-called political dynasties which
have often been in power for decades and I expect political business cycles to be different
in those municipalities. To control for that, I use lists of elected officials at the municipal
and provincial levels for the period 1987-2007 and compute for the 2004 and 2007 elections,
the number of terms the incumbent’s family has been in office in the same municipality since
1988.6 I consider that an incumbent is related to an earlier official if they share the same
last name.7 In addition, for each municipality, I compute the number of family links between
the mayor and either the provincial governor, vice-governor or congressmen using the same
6Municipal elections were organized in 1988. In accordance with transitory provisions of the 1987 Constitution (Article XVIII) the next municipal elections were organized in 1992. Elections have been organized every 3 years ever since.
7It is possible to do so due to naming conventions imposed by Spanish colonial officials in the 19th century. Cruz and Schneider (2013), Fafchamps and Labonne (2013) and Querubin (2013) use a similar strategy with data from the Philippines. I am only able to match on last names and not on middle names and, as such, it is likely that I underestimate links.
matching procedure.
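The last-name matching can be illustrated with a short sketch. The record layout, the names and the helper function are hypothetical; the only element taken from the text is that two officials are treated as related when they share a last name.

```python
# Hypothetical records: (municipality, election_year, mayor_last_name).
winners = [
    ("Town X", 1988, "Reyes"),
    ("Town X", 1992, "Reyes"),
    ("Town X", 1995, "Santos"),
    ("Town X", 1998, "Reyes"),
    ("Town X", 2001, "Reyes"),
]

def family_terms(records, municipality, last_name, before_year):
    """Count earlier terms won in the municipality by mayors sharing the last name."""
    target = last_name.strip().lower()
    return sum(
        1
        for muni, year, name in records
        if muni == municipality
        and year < before_year
        and name.strip().lower() == target
    )

terms_2004 = family_terms(winners, "Town X", "Reyes", 2004)
```

Matching on last names alone misses relatives with different surnames, which is why the resulting counts are best read as a lower bound on family links.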
I also use yearly data on municipal budgets from the Department of Budget and Manage-
ment.8 The data are all expressed in 2000 Pesos using regional Consumer Price Indices.
3.3 Descriptive statistics
The average municipal employment rate in January and April in election years is 59.6 percent
while it is only 58.7 percent in non-election years (Figure 1). Simple tests of equality of means
suggest that the observed differences are statistically significant (t = 3.5, p-value = 0.0004). In
addition, it appears that not only are the means of the two distributions different but the two
distributions are also different. This is confirmed by results from a Kolmogorov-Smirnov test
of equality of distributions (p-value = 0.003).
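For readers who want to run the same kind of comparison, the two tests can be sketched with SciPy. The data below are synthetic draws chosen only to resemble the reported means; the sample sizes, the dispersion and the unequal-variance option are assumptions, so the output does not reproduce the statistics reported above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic municipal employment rates in percent; NOT the paper's data.
election = rng.normal(loc=59.6, scale=8.0, size=2000)
non_election = rng.normal(loc=58.7, scale=8.0, size=4000)

# Test of equality of means (Welch version, which allows unequal variances).
t_stat, t_pvalue = stats.ttest_ind(election, non_election, equal_var=False)

# Two-sample Kolmogorov-Smirnov test of equality of distributions.
ks_stat, ks_pvalue = stats.ks_2samp(election, non_election)
```

The t-test only compares means, whereas the Kolmogorov-Smirnov test is sensitive to any difference between the two empirical distributions.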
Overall patterns of employment rates in July and October in election and non-election
years are also consistent with the argument that the pre-election increase in employment
levels is followed by a post-electoral decline (Figure 2). The average employment rate in
the July and October waves is 59.3 percent in non-election years while it is 58.9 percent in
election years. As above, the differences are statistically significant (t = 2.93, p-value = 0.0034)
and this is confirmed by results from a Kolmogorov-Smirnov test of equality of distributions
(p-value = 0.013).
Despite the prohibition on political dynasties and the three-term limit that were introduced
in the 1987 Constitution, a number of mayors come from families that have been in power
more than three times since 1987.9 About 9.1 percent of mayors elected during the 1998
elections were from families that had been elected at least 3 times since 1987. The numbers
then increased to 22.9 percent in 2001, 30.0 percent in 2004 and to 36.2 percent in 2007. Such
families are better able to stay in power. For example, in the 2007 elections, a new family came
to power in 44.2 percent of municipalities where the incumbent had been in power three times
or less. The rate of turnover was only 26.7 percent in municipalities where the incumbent
had been in power four times or more. The difference between the two groups is statistically
significant at the usual levels (χ2 = 30.4). While such descriptive
statistics are interesting, they do not provide a credible estimate of incumbency advantage as
8The data are available from: http://www.blgf.gov.ph/# (visited on March 26, 2012).
9This is consistent with findings at the provincial and congressional levels reported by Querubin (2011).
they could merely reflect a selection effect with more competent political families more likely
to be elected to begin with.
As expected, incumbent political families that have links with provincial politicians are
more likely to be re-elected to municipal office. For example during the 2004 elections, a
new family came to power in only 17.3 percent of the municipalities where the incumbent
family had links with provincial politicians. In municipalities where the incumbent had no such
links, the proportion was 28.0 percent. The difference was even stronger in 2007, with a new
family coming to power in 24.2 percent of the municipalities with provincial links and in 39.8
percent of municipalities where politicians had no such family links. In both cases, the differences
are statistically significant at the usual levels (for 2007:
χ2 = 6.4). Again, the statistics discussed above do not provide any evidence on the value of
connections as competent political families might be better able to be elected to provincial
office.
4 Estimation strategy
In this section, I present the empirical strategy. First, I describe how I test for the presence of
political business cycles, the fluctuations of employment around elections, including various
robustness checks. Second, I discuss channels that could explain the strength of political
business cycles. Finally, I test for the presence of political budget cycles, the expected increase
in government expenditures before elections.
4.1 Local political business cycles
I start by estimating equations of the form:
$$Y_{ijt} = \alpha E_t + \beta X_{ijt} + u_{ij} + w_{ijt} \qquad (1)$$
where Yijt is the outcome of interest in municipality i in province j at time t, Et is a vector
of electoral variables, Xijt is a vector of municipal characteristics that vary across time, uij is a
municipality-specific unobservable and wijt is the usual idiosyncratic term. Each observation
is weighted using the sum of individual survey weights in the municipality at that time period.
I also present results from unweighted regressions (Solon, Haider, and Wooldridge 2013).
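A stripped-down version of the weighted fixed-effects estimation of equation (1) can be sketched with NumPy. Everything here is simulated: the panel dimensions, the stylized election indicator, the weights and the true coefficient are assumptions, and the control vector Xijt is omitted to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(1)
n_muni, n_periods, alpha_true = 50, 26, 0.87

muni = np.repeat(np.arange(n_muni), n_periods)   # municipality index
period = np.tile(np.arange(n_periods), n_muni)   # time index
E = (period % 12 < 2).astype(float)              # stylized election indicator
u = rng.normal(0.0, 5.0, n_muni)[muni]           # municipality-specific effects
w = rng.uniform(0.5, 2.0, muni.size)             # stand-in for summed survey weights
y = alpha_true * E + u + rng.normal(0.0, 1.0, muni.size)

def weighted_demean(x, groups, weights):
    """Subtract the weighted group mean from each observation."""
    num = np.bincount(groups, weights=weights * x)
    den = np.bincount(groups, weights=weights)
    return x - (num / den)[groups]

# Weighted within transformation, then weighted least squares on the result.
y_dm = weighted_demean(y, muni, w)
E_dm = weighted_demean(E, muni, w)
alpha_hat = np.sum(w * E_dm * y_dm) / np.sum(w * E_dm ** 2)
```

Demeaning with weighted group means is numerically equivalent to including a full set of municipality dummies in a weighted regression; setting all weights to one gives the unweighted estimator.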
I estimate equation (1) where Yijt is either the share of the working age population that
is employed, the average number of hours worked over the past week (for those with a job) or
the average log daily wage (for those with a job).10 I estimate equation (1) separately for the
public and private sector. The differences, if any, will provide information on the channels
through which the effects operate.
As discussed in Section 2, I argue that election timing in this context is exogenous. Indeed,
election timing was decided when the constitution was drafted in 1987 and all elections since
then, including the two elections of interest (May 2004 and May 2007), were implemented as
planned. To check that the results are not driven by one particular election, I also estimate
the model with separate dummies for each election and test whether the coefficients are equal.
In line with the literature, I start by estimating equation (1) with annual data. In addition,
to test my argument that granularity in data explains variations in electoral cycles, I also
estimate equation (1) with quarterly data and introduce two dummy variables, one capturing
the two pre-election quarters and one capturing the two post-election quarters.
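The dilution argument can be made concrete with a back-of-the-envelope calculation. The two effect sizes are the quarterly estimates quoted in the introduction; the assumption that a yearly election dummy simply averages the four quarterly effects is a simplification of mine, so the result need not match the regression estimates obtained with controls and fixed effects.

```python
import numpy as np

pre_effect, post_effect = 0.87, -0.48  # percentage points, as quoted in the text

# One stylized election year: two pre-election and two post-election quarters.
quarterly_effects = np.array([pre_effect, pre_effect, post_effect, post_effect])

# Averaging the four quarters mimics what a single election-year dummy picks up.
yearly_effect = quarterly_effects.mean()
```

In this stylized case the yearly average is well below the pre-election effect, even though employment moves substantially within the year.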
Given the data structure, error terms are not independent and are likely to be correlated
within municipalities, provinces and time periods. Standard errors need to be corrected
to account for the specific structure of the error term. Failure to do so would lead to downward
biased standard errors and to over-rejection of the null hypothesis of no effect. For example,
in Monte Carlo simulations reported by Cameron, Gelbach, and Miller (2011), the null hy-
pothesis of no effect is rejected twice as often when clustering is done along one dimension as
when it is done along two dimensions. When clustering levels are nested, as is the case here
with municipalities and provinces, clustering needs to be done at the most aggregated level
(Cameron, Gelbach, and Miller 2008). As a result, I use a method developed by Cameron,
Gelbach, and Miller (2011) and cluster standard errors across both time and provinces.
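For reference, the two-way cluster-robust variance estimator of Cameron, Gelbach, and Miller (2011) combines three one-way estimators. The notation below is mine and is meant only as a reminder of the structure of the estimator:

$$\hat{V}_{\text{two-way}} = \hat{V}_{\text{province}} + \hat{V}_{\text{time}} - \hat{V}_{\text{province} \cap \text{time}}$$

The first two terms are the usual cluster-robust variance matrices obtained by clustering on provinces and on time periods respectively. The third term clusters on the intersection of the two (province-period cells) and is subtracted to avoid double counting.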
The vector Xijt includes controls for average age (and its square) in the municipality (for
those older than 15), education levels (for those older than 15), the share of women in the
sample, all computed using the LFS data. I also control for population levels and per capita
10As the surveys were designed to provide representative estimates at the regional level, the municipal-level estimates used in the paper are likely measured with error. As a result, the estimates have larger variances and the null hypothesis of no effect will tend to be under-rejected.
fiscal transfers. I include either region-specific or province-specific quadratic time trends. Due
to the seasonal nature of employment in the Philippines, I include quarter-specific dummies
in all regressions. I also estimate equation (1) with quarter/municipality fixed effects. This
allows me to control for potential variations in the degree of seasonality across municipalities.
I check whether results are robust to alternative specifications and estimation strategies. I
assume a dynamic model of the form:
$$Y_{ijt} = \sum_{p=1}^{P} \alpha_p Y_{ijt-p} + \beta X_{ijt} + u_{ij} + w_{ijt} \qquad (2)$$
I estimate equation (2) for various values of P using the fixed effects estimator. As an
additional robustness check, I eliminate uij by taking the first difference and use lagged values
of Yijt as instruments for ∆Yijt−p. This leads to:
$$\Delta Y_{ijt} = \sum_{p=1}^{P} \alpha_p \Delta Y_{ijt-p} + \gamma \Delta X_{ijt} + \Delta w_{ijt} \qquad (3)$$
When using yearly data, I estimate equation (2) using the Generalized Method of Moments
with all the available moment conditions (Arellano and Bond 1991; Blundell and Bond
1998). However, when using quarterly data, given the length of the panel, using the full set
of moment conditions might lead to over-fitting and to possible small sample bias (Roodman
2009). As a result, I only use Yijt−2 and Yijt−3 as instruments for ∆Yijt−1 when estimating
equation (3) with P = 1 and Yijt−3 and Yijt−4 as instruments for ∆Yijt−1 and ∆Yijt−2 when
estimating equation (3) with P = 2.
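The reasoning behind this choice of instruments can be sketched for P = 1, under the assumption (standard in this literature, though not stated explicitly above) that the idiosyncratic term is serially uncorrelated. The differenced lag and the differenced error share a common component:

$$\Delta Y_{ijt-1} = Y_{ijt-1} - Y_{ijt-2}, \qquad \Delta w_{ijt} = w_{ijt} - w_{ijt-1}$$

Since Yijt−1 depends on wijt−1, the regressor ∆Yijt−1 is correlated with ∆wijt and has to be instrumented. Levels dated t−2 or earlier are predetermined with respect to both wijt and wijt−1, which gives the moment conditions:

$$E\left[ Y_{ijt-s} \, \Delta w_{ijt} \right] = 0 \quad \text{for } s \geq 2$$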
I then provide evidence on which politicians attempt to increase employment ahead of the
elections and which sectors of the economy are the most affected.
First, I expect political business cycles to be stronger in municipalities with relatively
inexperienced incumbents as they represent a way of signaling quality to voters. Since voters
face difficulties when trying to distinguish between the incumbent’s ability and the overall economic
environment, incumbents try to improve economic conditions ahead of the elections
which voters will then partly attribute to the incumbent. Assuming that voters learn about
politicians’ ability during their time in office, one would expect uncertainty about ability to
decrease with the number of terms the incumbent, or one of his family members, has been
in office. To test whether uncertainty about the incumbent’s ability explains the presence of
political business cycles, I estimate:
$$Y_{ijt} = \alpha E_t + \beta X_{ijt} + \gamma Z_{ijt} + \delta E_t \cdot (Z_{ijt} - \bar{Z}_{ijt}) + u_{ij} + w_{ijt} \qquad (4)$$
where Zijt is a variable capturing the number of times a member of the incumbent’s family
has been elected mayor since 1987. Since there is no particular reason to expect that the
relationship between length in office and the strength of the political cycle is linear, and given
the wealth of data available, I also estimate the effect separately for each level of political
experience. I am unable to interpret the estimates of δ causally as municipalities that are
controlled by dynasties might be different from municipalities with inexperienced incumbents
along dimensions that could affect labor markets. However, they provide useful information
on the relationship between political experience and political business cycles and they might
allow me to rule out alternative explanations as well.
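To see how the coefficients in equation (4) should be read, assume that the interaction term is centered on the sample mean of political experience, written here with a bar (the centering is my reading of the specification). The implied effect of an election indicator in a municipality with experience Zijt is then:

$$\alpha + \delta \, (Z_{ijt} - \bar{Z}_{ijt})$$

Under that assumption, α is the election effect for a municipality with average political experience, and a negative δ would indicate weaker cycles where the incumbent’s family has been in office longer.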
4.2 Local political budget cycles
In addition, I test for political budget cycles using annual data.11 Specifically, I estimate
equations of the form:
$$B_{ijt} = \sum_{p=1}^{P} \alpha_p B_{ijt-p} + \beta E_t + \gamma X_{ijt} + u_{ij} + w_{ijt} \tag{5}$$
where Bijt is the outcome of interest in municipality i in province j in year t, Et is
an indicator equal to one if elections took place in year t, Xijt is a vector of municipal
characteristics that vary across time, uij is a municipality-specific unobservable, and wijt is
the usual idiosyncratic term.
The budget data are available for all municipalities and cities in the country over the period
2001-2009. Because the panel is short and the specification includes lagged dependent variables,
the fixed effects estimator might be biased, so I also report results of
GMM estimation for P = 1, 2. The specification is consistent with previous contributions to
the political budget cycle literature (Drazen and Eslava 2010).
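The concern about the fixed effects estimator in a short dynamic panel can be illustrated with simulated data. The sketch below is not the paper's GMM estimator: it compares the within estimator with a simpler just-identified instrumental-variable estimator in the spirit of Anderson and Hsiao, which uses a lagged level as an instrument for the lagged first difference. The data-generating process, sample sizes, parameter values and function names are all assumptions made for illustration.

```python
import random

def simulate_panel(n_units, n_periods, rho, seed=0, burn_in=50):
    """Simulate y_it = rho * y_i,t-1 + u_i + e_it with a unit-specific effect u_i."""
    rng = random.Random(seed)
    panel = []
    for _ in range(n_units):
        u = rng.gauss(0.0, 1.0)
        y = u / (1.0 - rho)  # start at the unit's long-run mean
        series = []
        for t in range(burn_in + n_periods):
            y = rho * y + u + rng.gauss(0.0, 1.0)
            if t >= burn_in:
                series.append(y)
        panel.append(series)
    return panel

def within_estimate(panel):
    """Fixed-effects (within) estimate of rho: demean y_t and y_t-1 by unit."""
    num = den = 0.0
    for s in panel:
        cur, lag = s[1:], s[:-1]
        cur_mean = sum(cur) / len(cur)
        lag_mean = sum(lag) / len(lag)
        for c, l in zip(cur, lag):
            num += (l - lag_mean) * (c - cur_mean)
            den += (l - lag_mean) ** 2
    return num / den

def lagged_level_iv_estimate(panel):
    """IV estimate of rho in first differences, with y_t-2 instrumenting dy_t-1."""
    num = den = 0.0
    for s in panel:
        for t in range(2, len(s)):
            dy_t = s[t] - s[t - 1]
            dy_lag = s[t - 1] - s[t - 2]
            num += s[t - 2] * dy_t
            den += s[t - 2] * dy_lag
    return num / den

TRUE_RHO = 0.5  # hypothetical persistence parameter
panel = simulate_panel(n_units=3000, n_periods=9, rho=TRUE_RHO)
fe_rho = within_estimate(panel)
iv_rho = lagged_level_iv_estimate(panel)
print(f"true rho = {TRUE_RHO}, within = {fe_rho:.3f}, lagged-level IV = {iv_rho:.3f}")
```

With nine periods per unit, as in a 2001-2009 annual panel, the within estimate should fall noticeably below the true value while the instrumented estimate should stay close to it, which is the motivation for reporting GMM results alongside fixed effects.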
11As such, I am unable to test whether elections affect within-year allocations of resources.
5 Results
In this section, I first present results on the presence of local political business cycles in
Philippine municipalities. I then highlight potential channels which might affect the strength
of the cycles. Finally, I discuss results on whether municipal budgets are affected by the
timing of elections.
5.1 Local political business cycles
5.1.1 Main results
Analyses carried out with yearly data fail to identify political business cycles in Philippine
municipalities over the sample period. The proportion of the working-age population that is
employed is neither higher nor lower in election years (Panel A of Table 2).12 The conclusions
are unchanged when employment in the public and private sectors is analyzed separately
(Panels B and C of Table 2). To account for the fact that I only have data for the July and
October waves in 2003, I re-estimate the model on the 2004-2009 sample. This does not affect
the results (Table A.1).
Once I use quarterly data, I find robust evidence of the presence of political business cycles.
The differences with the yearly results are not merely driven by an increase in statistical power
but rather by a post-election decline in employment which dilutes the overall effect when using
yearly data. Point estimates for the overall effects are 7 times larger when I use quarterly
data than when I use yearly data.
First, results, available in Panel A of Table 3, indicate that there is an increase in em-
ployment in the two quarters preceding the elections.13 Results are robust to the inclusion of
a number of control variables and region-specific or province-specific quadratic time trends.
The point estimates suggest that the share of the working-age population that is employed
12When I estimate the model with a separate dummy for each election, I am unable to reject the null that the coefficients are equal (χ2 = 0.56, p-value=0.454).
13In some cases, standard errors are 200 percent larger than standard errors obtained without clustering and 100 percent larger than standard errors obtained with one way clustering. This confirms the importance of computing standard errors with two-way clustering. An added complication arises from the fact that I only have 26 time periods and, as a result, standard errors are downward biased and lead to over-rejection of the null hypothesis of no effect (Cameron, Gelbach, and Miller 2008). As I have 6 time invariant regressors, a solution is to use critical values from a T distribution with 20 degrees of freedom. Results are robust to using such critical values.
increases by 0.87 percentage-points in the two quarters before elections. This is a large effect
as it translates into 470,000 more individuals employed in April 2004 than would have
been the case without the elections. The effects do not appear to be driven by one specific
election. When I estimate the model with separate dummies for each election, the 2004 coef-
ficient is 0.77 and the 2007 coefficient is 0.96. The differences are not statistically significant
at the usual levels of confidence (p-value=0.75).
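As a rough consistency check, the two magnitudes quoted above imply a particular working-age population base. The sketch below is a back-of-envelope calculation, not a figure reported in the paper; it assumes the 0.87 percentage-point effect applies uniformly to the national working-age population, and the variable names are illustrative.

```python
# Back-of-envelope check: what working-age population is implied when a
# 0.87 percentage-point effect corresponds to 470,000 individuals?
effect_pp = 0.87          # pre-election effect in percentage points (from the text)
extra_employed = 470_000  # additional individuals employed (from the text)

# Assumption: the effect applies uniformly to the whole working-age population.
implied_working_age_pop = extra_employed / (effect_pp / 100)
print(f"Implied working-age population: {implied_working_age_pop / 1e6:.1f} million")
```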
Second, results indicate that employment in the two post-election quarters is 0.48 percentage-points
lower than it would have been without the elections. Again, the effects do not
appear to be driven by one specific election. When I estimate the model with separate
dummies for each election, the 2004 coefficient is -0.41 and the 2007 coefficient is -0.55. The
differences are not statistically significant at the usual levels of confidence (p-value=0.73).
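The small-sample correction mentioned in footnote 13 can be illustrated with the rounded estimates from Column 4 of Panel A of Table 3. This is a sketch rather than the paper's procedure: the critical values are standard two-sided t-table entries for 20 degrees of freedom, and the ratios use rounded coefficients and standard errors, so borderline cases could differ from the exact results.

```python
# Two-sided critical values of the t distribution with 20 degrees of freedom
# (standard t-table entries), ordered from least to most demanding.
T20_CRITICAL = {"10%": 1.725, "5%": 2.086, "1%": 2.845}

# Rounded estimates from Table 3, Panel A, Column 4: (coefficient, standard error).
estimates = {
    "pre-election quarters": (0.8760, 0.300),
    "post-election quarters": (-0.4864, 0.201),
}

def strictest_level(coef, se, critical=T20_CRITICAL):
    """Return the strictest level at which |coef / se| exceeds the critical value."""
    t_stat = abs(coef / se)
    passed = [level for level, value in critical.items() if t_stat > value]
    return passed[-1] if passed else None

for name, (coef, se) in estimates.items():
    print(name, round(abs(coef / se), 2), strictest_level(coef, se))
```

With these rounded inputs, the pre-election coefficient clears the 1% threshold and the post-election coefficient clears the 5% threshold, in line with the significance stars reported in Table 3.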
The observed effects are consistent with the argument that they are driven by within-
year reallocation of resources. I am unable to reject the null hypothesis that the sum of the
coefficients on the pre-election and the post-election dummies is equal to zero (χ2 = 0.97, p-
value=0.325). More specifically, while data constraints prevent me from testing this directly,
results are consistent with the view that incumbents shift some of their planned end-of-year
budget spending before the elections which would explain the drop in employment in the two
post-elections quarters. This suggests that the failure to identify political cycles in a number
of countries could be partly driven by aggregation issues as the yearly effect captures both
the pre-election increase and the post-election decline.
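The aggregation argument can be made concrete with the reported point estimates. The sketch below assumes that the four survey waves of an election year split into two pre-election and two post-election quarters with equal weight, and that the test on the sum of coefficients has one degree of freedom; both are assumptions for illustration, and the function name is hypothetical.

```python
import math

def chi2_1df_pvalue(stat):
    """Upper-tail p-value of a chi-squared statistic with one degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Point estimates from Column 2 of Tables 2 and 3 (percentage points).
pre_election = 0.8725     # two pre-election quarters
post_election = -0.4843   # two post-election quarters
yearly_estimate = 0.1241  # election-year coefficient with yearly data

# Assumption: two pre-election and two post-election waves, equally weighted.
implied_yearly = (2 * pre_election + 2 * post_election) / 4
ratio = pre_election / yearly_estimate

print(f"implied yearly effect: {implied_yearly:.3f}")
print(f"quarterly / yearly point estimate: {ratio:.1f}")
print(f"p-value for chi2 = 0.97: {chi2_1df_pvalue(0.97):.3f}")
```

Averaging the quarterly effects over the year gives roughly 0.19 percentage points, the same order of magnitude as the small yearly estimates in Table 2, and the recomputed p-value matches the one reported for the test on the sum of coefficients.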
If the argument is correct, the strength of political business cycles detected with yearly
data will depend on the timing of elections and the overlap between the fiscal and calendar
years. If elections are early in the fiscal year, incumbents might not have enough time to
increase spending before elections. Similarly if elections are late in the fiscal year, incumbents
do not have enough resources to bring forward before elections. As a result, we would expect
the effects to be stronger for elections organized in the middle of the fiscal year. In addition,
the larger the overlap between the fiscal and calendar years, the more likely it is that the pre-election
increase is offset by the post-election decrease. Findings from the literature tend to
support this argument. For example, Coelho, Veiga, and Veiga (2006) test for political
business cycles in Portuguese municipalities, using yearly data, over the period 1985-2000. In
Portugal, elections take place at the end of the fiscal year which coincides with the calendar
year and, as such, effect sizes are expected to be small. The estimated effects indicate that total
municipal employment increases by 2.5 jobs in election years (Coelho, Veiga, and Veiga 2006).
In a similar context, Dahlberg and Mork (2011) find that, in Sweden and Finland, there are 0.6
more full-time public employees per 1,000 inhabitants in election years. For the average municipality
in Sweden, this translates into 15.6 additional public-sector jobs.
I further test whether the cycles are present in both public sector and private sector
employment. Results, available in Panels B and C of Table 3, indicate that most of the
variations are concentrated in the private sector. Elections lead to a 0.15 percentage-point
increase in public sector employment and to a 0.73 percentage-point increase in private sector
employment in the two pre-election quarters. That is, 83 percent of the increase in employment
levels occurs in the private sector. Given that only 7.8 percent of those employed work in
the public sector, the relative increase is larger in the public sector than it is in the private
sector. Specifically, the public sector effect represents about 3.3 percent of the mean value
of public sector employment while the private sector effect represents only about 1.3 percent
of the mean value of private sector employment. There is no evidence of a decline in public
sector employment in the two post-election quarters (point estimate: -0.002) but private
sector employment is 0.48 percentage-points lower than it would be if elections did not take
place.
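The shares quoted in this paragraph follow from the rounded point estimates and the sample means in Table 1. The short calculation below reproduces them; variable names are illustrative, and small differences can arise because the inputs are rounded.

```python
# Rounded figures from the text and Table 1 (percentage points of the
# working-age population).
public_effect, private_effect = 0.15, 0.73
public_mean, private_mean, overall_mean = 4.61, 54.53, 59.14

private_share_of_increase = private_effect / (public_effect + private_effect)
public_share_of_employment = public_mean / overall_mean
public_relative_effect = public_effect / public_mean
private_relative_effect = private_effect / private_mean

print(f"share of the increase in the private sector: {private_share_of_increase:.0%}")
print(f"share of the employed in the public sector: {public_share_of_employment:.1%}")
print(f"public effect relative to its mean: {public_relative_effect:.1%}")
print(f"private effect relative to its mean: {private_relative_effect:.1%}")
```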
Interestingly, there is no evidence that elections affect either the number of working hours
or wages of employed individuals (Tables A.2 and A.3). This suggests that, along those two
dimensions, jobs created during the political business cycles are neither superior nor inferior
to average jobs in the municipalities. They provide similar working hours and similar pay.
5.1.2 Channels and robustness checks
A potential concern with the above results is that they might simply be capturing the fact
that elections require labor, which would automatically lead to a pre-election increase
unrelated to political business cycles. Evidence discussed above seems to suggest that this is not
the case. First, the increase is larger for private sector employment than it is for public sector
employment.
Second, as a further check, I use data on the type of employment contracts to compare the
effects of elections on public sector employment in long-term (i.e. non-casual) and short-term
(i.e. casual) contracts. If the results were driven by hiring of election workers, we would
expect an increase in employment on short-term contracts but no increase in employment
on long-term contracts. Available results suggest that there is an increase in public sector
employment on long-term contracts but no effect on employment on short-term contracts in
the two pre-election quarters (Columns 1 and 2 of Table 4). Again, these results are not in
agreement with the view that effects are driven by hiring of election workers. In addition, this
increases the likelihood that local bureaucrats’ incentives are aligned with the incumbent’s
electoral objectives as their job tenure is implicitly linked with the incumbent’s electoral
success (Robinson and Verdier 2013; Iyer and Mani 2012).14
Further, the scale of the increase is difficult to reconcile with the number of jobs that are
actually needed to organize the elections. Taking 2004 as an example, the point estimates
suggest that the pre-election increase in employment represents 1.1 percent of the population that
was registered to vote. It seems unlikely that so many individuals would have to be hired
given that a number of the required activities are carried out by existing civil servants, usually
public school teachers, and do not require additional hiring. In addition, the last pre-election
surveys are carried out about a month prior to the elections, that is before the final few weeks
of campaigning and election day which are likely to be the most labor-intensive.
In light of evidence from other countries (Drazen and Eslava 2010), incumbents might
attempt both to increase the level of government spending and to target spending on visible
projects ahead of elections. Given that in the Philippines most government-financed construction
work is done through private contractors, one would expect the increase in private
sector employment to be concentrated on short-term contracts rather than on long-term contracts.
Firms might be reluctant to hire employees on long-term contracts in response to a short-term
increase in government spending. To test whether those channels explain the strength of the
political business cycles, I estimate equation (1) where Yijt is either the share of the working
age population that is employed on short-term contracts in the private sector or the share employed
on long-term contracts in the private sector. Results available in Columns 3 and 4 of Table 4 suggest
14As cited by Fafchamps and Labonne (2013), Hodder (2009) quotes a lawyer for the Civil Service Commission: We can even go so far as saying that you cannot be appointed in local government if you do not know the appointing authority or, at least, if you do not have any [political] recommendation....And even once in place, the civil servant’s position is not secure: when the new mayor [comes], he just tells them ‘resign or I’ll file a case against you.’
that elections affect employment in the private sector on short-term contracts but not on
long-term contracts. In addition, consistent with findings from the literature on elections and
firm investment decisions, while the coefficient on the pre-election quarters dummy is not
significant in the long-term contract regression, the point estimate is negative (-0.855). Firms
are reluctant to invest before the uncertainty surrounding the elections is resolved (Julio and
Yook 2012).
In addition, one would expect private sector employment to be more responsive to elections
in sectors where local governments can invest in visible projects. To this end, I test whether
short-term employment in the construction sector is affected by elections. There is evidence
that employment in the construction sector increases by 0.20 percentage-points in the two
pre-election quarters (Column 5 of Table 4). This represents about 4 percent of the mean
value of employment in the construction sector. It is expected to lead to an increase in
either maintenance of existing public goods or the construction of new infrastructure. As
above, this seems to suggest that incumbents attempt to provide lasting benefits to their
constituents ahead of the elections.
I now present results from a number of robustness checks. First, further results suggest that
the main conclusions are robust to the inclusion of lagged values of the dependent variables
(Tables A.4-A.6). As expected, point estimates differ slightly as one needs to account for
the additional effect through the lagged values but the results are of the same sign and still
statistically significant. This reinforces confidence that results discussed above are capturing
the presence of political business cycles in Philippine municipalities.
Second, I assess whether the results are robust to the exclusion of outliers. I estimate
equation (1) on a number of sub-samples where I exclude the top and bottom one, two, three
and four percent in the distribution of employment levels. Results, available in Table A.7, are
consistent with the ones obtained previously.
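The sample restriction described above can be sketched as follows. The cut-off rule (dropping an equal number of observations from each tail of the sorted distribution) and the data are assumptions for illustration; the paper does not spell out how the percentiles are computed.

```python
def trim_tails(values, percent):
    """Drop the top and bottom `percent` percent of observations by value."""
    ordered = sorted(values)
    cut = int(len(ordered) * percent / 100)
    return ordered[cut:len(ordered) - cut]

# Hypothetical employment rates for 200 municipality-quarters (illustrative only).
employment_rates = [40 + 0.2 * i for i in range(200)]

for percent in (1, 2, 3, 4):
    kept = trim_tails(employment_rates, percent)
    print(percent, len(kept), round(min(kept), 1), round(max(kept), 1))
```

Equation (1) would then be re-estimated on each trimmed sample and the coefficients compared with the full-sample estimates.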
Third, as indicated in Section 4, I also estimate equation (1) with quarter/municipality
fixed effects. Results are available in Panel A of Table A.8. Overall, results are consistent with
the ones obtained above. The point estimates are similar but with slightly larger standard
errors and I can still reject the null hypothesis of no effect for overall employment, employment
in the private sector and long-term employment in the public sector at the 5%
significance level or better. The only exceptions are for employment levels in the public sector and
in the construction sector. The point estimate for the public sector is as before (0.15) but
the standard error increases from 0.086 to 0.109, with the results going from being marginally
significant at the 10% level to being marginally insignificant at the 10% level. When it comes
to the construction sector, the point estimates are stable but the standard errors increase by
about 20 percent and I am no longer able to reject the null of no effect.
Fourth, I also present results from unweighted regressions (Solon, Haider, and Wooldridge
2013). Results, available in Panel B of Table A.8, are consistent with those obtained previously.
There are two exceptions. Employment on long-term contracts in the private sector is
now 1.1 percentage points lower in the two pre-election quarters. This is in line with previous
findings on firm investment ahead of elections (Julio and Yook 2012). In addition, while the
coefficient is still positive, I am unable to reject the null hypothesis that employment in short-
term contracts in the construction sector is not affected by elections. This seems to indicate
that the increase in the construction sector is concentrated in more populous municipalities.
5.1.3 Heterogeneity
As explained in Section 4, I test whether political business cycles are dependent upon the
municipality’s political environment. More specifically, I compare electoral cycles in munici-
palities with established incumbents and in municipalities with relatively new incumbents. If
electoral cycles serve to signal the incumbent’s quality to voters, they are likely to be concentrated
in municipalities where incumbents are from politically inexperienced families.
Results available in Panel A of Table 5 provide mixed support for the argument that
political business cycles arise as incumbents attempt to signal their types to voters. Once
I interact the electoral dummies with the number of times the incumbent’s family has been
elected mayor in the same municipality between 1988 and the current election, the increase in
public sector employment ahead of elections is lower in municipalities where the incumbent
or one of his family members has been elected more often.15
An alternative explanation for the result that employment rates in the private sector
are lower in the post-election months is that incoming officials are learning how to use the
15This set of results provides further evidence that results are not merely capturing the fact that organizing elections requires labor as there would be no reason to expect different effects by number of terms. Estimating political business cycles separately for each level of political experience does not yield additional insights (Table A.9).
bureaucracy which might delay government spending. To test if that is the case, I compare
employment rates in municipalities where the incumbent was re-elected and in municipalities
where she lost. If learning about bureaucratic procedures explains the negative impact on
employment in the post-election period it should be concentrated in municipalities where a
challenger won. Results, available in Panel B of Table 5, indicate that the effects of the post-electoral
period on employment in the public and private sectors are similar regardless of
whether the incumbent won or lost. The only exception is on employment on long-term
contracts in the public sector. It is 0.19 percentage-points lower in the two post-election
quarters in municipalities where a challenger won than in municipalities where the incumbent
won. A potential explanation is that, consistent with Fafchamps and Labonne (2013), some
employees affiliated with the previous incumbent leave their positions just after the elections
and, through time, are replaced by employees affiliated with the new mayor. In addition, this
finding is in line with the model of clientelism developed by Robinson and Verdier (2013) who
argue that jobs are a credible way of redistributing resources from politicians to bureaucrats
as they have incentives to ensure that the incumbent is re-elected.
5.2 Local political budget cycles
I now test for the presence of budget cycles in Philippine municipalities. Results are available
in Table 6. Overall, spending does not seem to increase but tax collection effort is lower in
election years.
Depending on the number of lags included in the regression and the estimation method,
per capita revenue is between 2 and 5 percent lower in election years than in non-election
years. Local governments appear to collect fewer taxes during election years.16 In addition,
fiscal transfers from the central government are lower in election years. Given that fiscal
transfers from the central government are computed using tax collection three years before,
this suggests that overall tax collection effort by the central government is lower in election
years as well.
However, incumbents do not increase spending in election years, which is consistent with
the argument that local governments are de facto unable to borrow and, as such, are unable
16This is consistent with findings from India (Khemani 2004).
to increase spending in election years. It is possible that incumbents alter their within-year
spending allocation but I am unable to directly test for it as the data are aggregated yearly.
As above, a potential explanation for the drop in local tax collection is that, in munici-
palities where the incumbent lost, the new administration might need time to adjust to its
new role which could decrease tax collection effort. To test for that, I estimate equation (5)
and interact the election year dummy with a dummy equal to one if the incumbent lost the
election. There is no evidence of an additional drop in tax collection in municipalities where
a challenger won the election. The coefficients on the election year dummy are of similar
order of magnitude as above. However, there is limited evidence that spending is lower in
election years in municipalities where the incumbent lost. As quarterly or monthly data are
unavailable, this could either be due to the fact that incumbents who did not spend enough
money in the pre-election months tend to lose elections or to the fact that spending slows
down after the recently elected mayor takes office. Previous results suggest that the latter is
unlikely (Panel B of Table 5).
6 Conclusion
In this paper, using a balanced panel of about 1,140 municipalities over 26 quarters, I have
identified political business cycles in the Philippines. In election years, the share of the
working-age population that is employed is higher in the two pre-election quarters and lower
in the two post-election quarters than what it would have been without elections.
The results have methodological implications for the literature on political business cycles.
Once analyses are carried out with yearly data, I am unable to identify political business cycles.
The difference is not merely driven by a decline in statistical power due to aggregation: point
estimates for the overall effects are 7 times larger when I use quarterly data than when I use
yearly data. This discrepancy can be explained by a drop in employment post-election that
dilutes the yearly effects. Researchers interested in estimating political business cycles should
use either monthly or quarterly data.
References
Aidt, T., F. Veiga, and L. Veiga (2011): “Election results and opportunistic policies: A new test of the rational political business cycle model,” Public Choice, 148(1), 21–44.
Akhmedov, A., and E. Zhuravskaya (2004): “Opportunistic Political Cycles: Test in A Young Democracy Setting,” Quarterly Journal of Economics, 119(4), 1301–1338.
Arellano, M., and S. Bond (1991): “Some Tests of Specification for Panel Data: Monte Carlo Evidence and An Application to Employment Equations,” Review of Economic Studies, 58(2), 277–297.
Bertrand, M., F. Kramarz, A. Schoar, and D. Thesmar (2007): “Politicians, Firms and the Political Business Cycle: Evidence from France,” mimeo.
Besley, T. (2006): Principled Agents? The Political Economy of Good Government, The Lindahl Lectures. Oxford University Press, Oxford, UK.
Blundell, R. W., and S. R. Bond (1998): “Initial Conditions and Moment Restrictions in Dynamic Panel Data Models,” Journal of Econometrics, 87(1), 115–143.
Brender, A. (2003): “The effect of Fiscal Performance on local government election results in Israel: 1989-1998,” Journal of Public Economics, 87(9-10), 2187–2205.
Cameron, C., J. Gelbach, and D. Miller (2008): “Bootstrap-based improvements for inference with clustered errors,” Review of Economics and Statistics, 90(3), 414–427.
Cameron, C., J. Gelbach, and D. Miller (2011): “Robust Inference with Multiway Clustering,” Journal of Business and Economic Statistics, 29(2), 238–249.
Capuno, J. (2012): “The PIPER Forum on 20 Years of Fiscal Decentralization: A Synthesis,” Philippine Review of Economics, 49(1), 191–202.
Coelho, C., F. Veiga, and L. Veiga (2006): “Political business cycles in local employment: Evidence from Portugal,” Economics Letters, 93(1), 82–87.
Coronel, S., Y. Chua, L. Rimban, and B. Cruz (2004): The rulemakers: how the wealthy and well-born dominate Congress. Philippine Center for Investigative Journalism, Quezon City.
Cruz, C., and C. Schneider (2013): “The (Unintended) Electoral Effects of Multilateral Aid Projects,” University of California - San Diego, mimeo.
Dahlberg, M., and E. Mork (2011): “Is There an Election Cycle in Public Employment? Separating Time Effects from Election Year Effects,” CESifo Economic Studies, 57(3), 480–498.
Drazen, A., and M. Eslava (2010): “Electoral manipulation via voter-friendly spending: Theory and evidence,” Journal of Development Economics, 92(1), 39–52.
Duch, R. M., and R. T. Stevenson (2008): The Economic Vote. How Political and Economic Institutions Condition Election Results. Cambridge University Press, Cambridge, UK.
Fafchamps, M., and J. Labonne (2013): “Do Politicians’ Relatives Get Better Jobs? Evidence from Municipal Elections in the Philippines,” University of Oxford, mimeo.
Faye, M., and P. Niehaus (2012): “Political Aid Cycles,” American Economic Review, 102(7), 3516–30.
Fegan, B. (2009): “Entrepreneurs in Votes and Violence: Three Generations of a Peasant Political Family,” in An Anarchy of Families: State & Family in the Philippines, ed. by A. McCoy, pp. 33–108. University of Wisconsin Press, Madison, WI.
Franzese, R. J. (2002): “Electoral and Partisan Cycles in Economic and Policy Outcomes,” Annual Review of Political Science, 5, 369–421.
Hodder, R. (2009): “Political Interference in the Philippine Civil Service,” Environment and Planning C: Government and Policy, 27(5), 766–782.
Hutchcroft, P. (2012): “Re-slicing the pie of patronage: the politics of internal revenue allotment in the Philippines, 1991-2010,” Philippine Review of Economics, 49(1), 109–134.
Iyer, L., and A. Mani (2012): “Traveling Agents: Political Change and Bureaucratic Turnover in India,” Review of Economics and Statistics, 94(3), 723–739.
Jones, M., O. Meloni, and M. Tommasi (2012): “Voters as Fiscal Liberals: Incentives and Accountability in Federal Systems,” Economics & Politics, 24(2), 135–156.
Julio, B., and Y. Yook (2012): “Political Uncertainty and Corporate Investment Cycles,” The Journal of Finance, 67(1), 45–84.
Kapur, D., and M. Vaishnav (2011): “Quid Pro Quo: Builders, Politicians, and Election Finance in India,” Center for Global Development, Working Paper 276.
Kawanaka, T. (2002): “Power in a Philippine City,” IDE Occasional Papers Series, 38.
Khemani, S. (2004): “Political cycles in a developing economy: effect of elections in the Indian States,” Journal of Development Economics, 73(1), 125–154.
McCoy, A. (2009): “An Anarchy of Families: The Historiography of State and Family in the Philippines,” in An Anarchy of Families: State & Family in the Philippines, ed. by A. McCoy, pp. 1–32. University of Wisconsin Press, Madison, WI.
Nordhaus, W. (1975): “The Political Business Cycle,” Review of Economic Studies, 42(2), 169–190.
Peltzman, S. (1992): “Voters as Fiscal Conservatives,” Quarterly Journal of Economics, 107(2), 325–345.
Querubin, P. (2011): “Political Reform and Elite Persistence: Term Limits and Political Dynasties in the Philippines,” mimeo, MIT.
Querubin, P. (2013): “Family and Politics: Dynastic Incumbency Advantage in the Philippines,” mimeo, NYU.
Robinson, J., and T. Verdier (2013): “The Political Economy of Clientelism,” Scandinavian Journal of Economics, 115(2), 260–291.
Rogoff, K. (1990): “Equilibrium Political Budget Cycles,” American Economic Review, 80(1), 21–36.
Roodman, D. (2009): “A Note on the Theme of Too Many Instruments,” Oxford Bulletin of Economics and Statistics, 71(1), 135–158.
Sakurai, S., and N. Menezes-Filho (2010): “Opportunistic and partisan election cycles in Brazil: new evidence at the municipal-level,” Public Choice, 148(1-2), 233–247.
Shi, M., and J. Svensson (2006): “Political budget cycles: Do they differ across countries and why?,” Journal of Public Economics, 90(8-9), 1367–1389.
Sidel, J. (1999): Capital, Coercion, and Crime: Bossism in the Philippines, Contemporary Issues in Asia and Pacific. Stanford University Press, Stanford, CA.
Sjahrir, B. S., K. Kis-Katos, and G. G. Schulze (2013): “Political budget cycles in Indonesia at the district level,” Economics Letters, 120(2), 342–345.
Solon, G., S. Haider, and J. Wooldridge (2013): “What are we Weighting For?,” NBER Working Paper 18859.
Sukhtankar, S. (2012): “Sweetening the Deal? Political Connections and Sugar Mills in India,” American Economic Journal: Applied Economics, 4(3), 43–63.
[Figure 1: Municipal-level employment rate in election and non-election years. Density (y-axis) of the employment rate in January/April (x-axis), shown separately for election years and non-election years.]
[Figure 2: Municipal-level employment rate in election and non-election years. Density (y-axis) of the employment rate in July/October (x-axis), shown separately for election years and non-election years.]
Table 1: Descriptive statistics
                                                  (1)          (2)
                                                 Mean       Std. Dev.
Share working-age population with a job in:
  Overall                                       59.14         (9.57)
  Public Sector                                  4.61         (3.15)
  Private Sector                                54.53        (10.14)
  Public Sector (short-term)                     0.55         (0.91)
  Public Sector (long-term)                      4.06         (2.88)
  Private Sector (short-term)                   44.34        (11.25)
  Private Sector (long-term)                    10.19         (7.96)
  Construction (short-term)                      5.05         (3.32)
Other variables:
  No Education                                   2.16         (4.60)
  Some Primary                                  13.94        (10.86)
  Primary Graduate                              14.50         (7.73)
  Some Secondary                                16.93         (5.53)
  Secondary Graduate                            24.27         (9.34)
  Some College +                                28.19        (13.68)
  Population                                   82,347      (207,659)
  Share female                                   0.50         (0.04)
  Age                                           35.82         (2.23)
Observations                                   29,715
Table 2: Yearly political business cycles: Employment levels
                           (1)         (2)         (3)         (4)
Panel A - All sectors
Election Year             0.1608      0.1241      0.1209      0.1209
                         (0.128)     (0.135)     (0.160)     (0.140)
Additional Controls        No          Yes         Yes         Yes
Time Trend                 No          Yes        Region     Province
Observations              8,004       7,896       7,896       7,896
R-squared                 0.821       0.827       0.842       0.842
Panel B - Public sector
Election Year            -0.0338      0.0561      0.0591      0.0591
                         (0.139)     (0.052)     (0.061)     (0.060)
Additional Controls        No          Yes         Yes         Yes
Time Trend                 No          Yes        Region     Province
Observations              8,004       7,896       7,896       7,896
R-squared                 0.807       0.839       0.847       0.847
Panel C - Private sector
Election Year             0.1947      0.0680      0.0618      0.0618
                         (0.179)     (0.170)     (0.186)     (0.190)
Additional Controls        No          Yes         Yes         Yes
Time Trend                 No          Yes        Region     Province
Observations              8,004       7,896       7,896       7,896
R-squared                 0.831       0.843       0.857       0.857
Notes: Results from fixed-effects regressions. The dependent variable is the yearly average of the share of the working age population with a job in the week before the survey (Panel A), with a job in the public sector in the week before the survey (Panel B) and, with a job in the private sector in the week before the survey (Panel C). Regressions in Columns 2-4 include controls for average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population and, per capita fiscal transfers. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table 3: Quarterly political business cycles: Employment levels
                            (1)          (2)          (3)          (4)
Panel A - All sectors
Pre-election quarters      0.8949***    0.8725***    0.8740***    0.8760***
                          (0.268)      (0.281)      (0.276)      (0.300)
Post-election quarters    -0.4619**    -0.4843**    -0.4859**    -0.4864**
                          (0.199)      (0.191)      (0.231)      (0.201)
Additional Controls         No           Yes          Yes          Yes
Quadratic Time Trend        No           Yes         Region      Province
Observations               29,715       29,283       29,283       29,283
R-squared                  0.626        0.636        0.641        0.649
Panel B - Public sector
Pre-election quarters     -0.0068       0.1483*      0.1488*      0.1487*
                          (0.077)      (0.085)      (0.087)      (0.086)
Post-election quarters    -0.0909      -0.0011      -0.0017      -0.0018
                          (0.099)      (0.034)      (0.043)      (0.036)
Additional Controls         No           Yes          Yes          Yes
Quadratic Time Trend        No           Yes         Region      Province
Observations               29,715       29,283       29,283       29,283
R-squared                  0.548        0.610        0.611        0.618
Panel C - Private sector
Pre-election quarters      0.9017***    0.7242***    0.7252***    0.7273***
                          (0.270)      (0.275)      (0.277)      (0.281)
Post-election quarters    -0.3711      -0.4832**    -0.4843**    -0.4846**
                          (0.243)      (0.201)      (0.224)      (0.205)
Additional Controls         No           Yes          Yes          Yes
Quadratic Time Trend        No           Yes         Region      Province
Observations               29,715       29,283       29,283       29,283
R-squared                  0.641        0.661        0.666        0.673
Notes: Results from fixed-effects regressions. The dependent variable is the yearly average of the share of the working age population with a job in the week before the survey (Panel A), with a job in the public sector in the week before the survey (Panel B) and, with a job in the private sector in the week before the survey (Panel C). All regressions include controls for survey quarter. Regressions in Columns 2-4 include controls for average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population and, per capita fiscal transfers. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table 4: Quarterly political business cycles: Channels

                            (1)         (2)         (3)         (4)         (5)
                            Public      Public      Private     Private     Construction
                            ST          LT          ST          LT          ST
Pre-election quarters       0.0276      0.1211**    1.5826*     -0.8553     0.2035*
                            (0.048)     (0.053)     (0.847)     (0.803)     (0.119)
Post-election quarters      0.0334      -0.0352     -0.4466     -0.0380     -0.1039
                            (0.048)     (0.044)     (0.651)     (0.767)     (0.092)
Observations                29,283      29,283      29,283      29,283      29,283
R-squared                   0.270       0.595       0.412       0.550       0.337

Notes: Results from fixed-effects regressions. The dependent variable is the share of the working age population with a short-term job in the public sector in the week before the survey (Column 1) with a long-term job in the public sector in the week before the survey (Column 2), with a short-term job in the private sector in the week before the survey (Column 3), with a long-term job in the private sector in the week before the survey (Column 4) and with a short-term job in the construction sector in the week before the survey (Column 5). All regressions include controls for survey quarters, average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population per capita fiscal transfers, a dummy for whether or not the previous municipal election led to a change in local leadership and province-specific quadratic time trends. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table 5: Quarterly political business cycles: Heterogeneity

                          (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
                          Full       Public     Private    Public     Public     Private    Private    Construction
                                                           ST         LT         ST         LT         ST
Panel A - Incumbent’s experience
Pre-election quarters     0.9058***  0.1437*    0.7621***  0.0255     0.1182**   1.5685*    -0.8064    0.1998*
                          (0.303)    (0.087)    (0.285)    (0.049)    (0.053)    (0.849)    (0.795)    (0.119)
Post-election quarters    -0.5059**  -0.0021    -0.5038**  0.0350     -0.0371    -0.4319    -0.0718    -0.1017
                          (0.201)    (0.037)    (0.204)    (0.047)    (0.044)    (0.647)    (0.755)    (0.093)
Pre-election quarters X   0.0088     -0.0707**  0.0795     0.0000     -0.0708**  0.1440     -0.0645    -0.0008
Nb terms                  (0.098)    (0.035)    (0.087)    (0.012)    (0.033)    (0.162)    (0.177)    (0.034)
Post-election quarters X  0.0447     -0.0097    0.0544     -0.0012    -0.0085    -0.0478    0.1022     -0.0100
Nb terms                  (0.094)    (0.024)    (0.094)    (0.015)    (0.031)    (0.093)    (0.088)    (0.025)
R-squared                 0.649      0.618      0.673      0.270      0.595      0.412      0.550      0.337

Panel B - Incumbent Lost
Pre-election quarters     0.8636***  0.1602*    0.7033**   0.0323     0.1279**   1.5904*    -0.8870    0.2094*
                          (0.303)    (0.087)    (0.284)    (0.048)    (0.053)    (0.850)    (0.803)    (0.099)
Post-election quarters    -0.4996**  0.0513     -0.5509*** 0.0287     0.0226     -0.5108    -0.0401    0.0207
                          (0.198)    (0.052)    (0.198)    (0.052)    (0.058)    (0.701)    (0.776)    (0.087)
Post-election quarters X  0.0703     -0.1939    0.2641     0.0039     -0.1978*   0.1841     0.0800     0.0630
Change                    (0.277)    (0.127)    (0.335)    (0.050)    (0.103)    (0.475)    (0.569)    (0.063)
R-squared                 0.649      0.618      0.673      0.270      0.595      0.412      0.550      0.337

Notes: Results from fixed-effects regressions. The dependent variable is the share of the working age population with a job in the week before the survey (Column 1), with a job in the public sector in the week before the survey (Column 2), with a job in the private sector in the week before the survey (Column 3), with a short-term job in the public sector in the week before the survey (Column 4) with a long-term job in the public sector in the week before the survey (Column 5), with a short-term job in the private sector in the week before the survey (Column 6), with a long-term job in the private sector in the week before the survey (Column 7) and with a short-term job in the construction sector in the week before the survey (Column 8). All regressions include controls for survey quarters, average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population per capita fiscal transfers, a dummy for whether or not the previous municipal election led to a change in local leadership and province-specific quadratic time trends. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level. Observations: 29,283.
Table 6: Political budget cycles

                            (1)         (2)         (3)         (4)         (5)
Number of lags              0           1           1           2           2
                            FE          FE          GMM         FE          GMM
Panel A - log per capita revenues
Election Year               -0.0593*    -0.0546**   -0.0313***  -0.0462*    -0.0207***
                            (0.031)     (0.025)     (0.003)     (0.024)     (0.003)
Observations                14,107      12,578      12,578      10,981      10,981
R-squared                   0.911       0.923                   0.928

Panel B - log per capita local tax collection
Election Year               -0.0266**   -0.0235**   -0.0170***  -0.0236**   -0.0166***
                            (0.011)     (0.011)     (0.004)     (0.011)     (0.004)
Observations                14,092      12,554      12,554      10,953      10,953
R-squared                   0.938       0.948                   0.952

Panel C - log per capita transfers from National gvt
Election Year               -0.0666     -0.0601*    -0.0385***  -0.0439*    -0.0120***
                            (0.043)     (0.032)     (0.002)     (0.025)     (0.003)
Observations                14,103      12,574      12,574      10,977      10,977
R-squared                   0.939       0.953                   0.962

Panel D - log per capita spending
Election Year               -0.0131     -0.0156     -0.0075**   -0.0181     -0.0021
                            (0.021)     (0.024)     (0.004)     (0.023)     (0.004)
Observations                14,105      12,575      12,575      10,977      10,977
R-squared                   0.873       0.890                   0.897

Panel E - Share municipal budget spent
Election Year               4.0713**    2.8342***   2.4552***   2.1065**    2.6537***
                            (1.713)     (0.966)     (0.259)     (0.973)     (0.266)
Observations                14,107      12,578      12,578      10,981      10,981
R-squared                   0.169       0.178                   0.196

Notes: Results from fixed-effects and GMM regressions. All regressions include in Columns 2-5 controls for population, the number of times the incumbent’s family has been elected in the municipality since 1987, a dummy capturing family links between the mayor and provincial officials, a dummy capturing change in mayor and, a simple time trend (Columns 1 and 2), region-specific time trends (Column 3), province-specific time trends (Column 4) and, municipality-specific time trends (Column 5). The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level. R-squared is reported for the fixed-effects columns only.
Table 7: Political budget cycles: heterogeneity

                            (1)         (2)         (3)         (4)         (5)
Nb Lags                     0           1           1           2           2
                            FE          FE          GMM         FE          GMM
Panel A - log per capita revenues from local sources
Election Year               -0.0289**   -0.0188     -0.0127**   -0.0201     -0.0123**
                            (0.012)     (0.012)     (0.006)     (0.013)     (0.006)
Election Year X             0.0082      -0.0140     -0.0134     -0.0106     -0.0132
Change                      (0.024)     (0.017)     (0.012)     (0.022)     (0.012)
Observations                14,092      12,554      12,554      10,953      10,953
R-squared                   0.938       0.948                   0.952

Panel B - log per capita spending
Election Year               -0.0111     -0.0102     0.0001      -0.0124     0.0027
                            (0.021)     (0.023)     (0.005)     (0.021)     (0.005)
Election Year X             -0.0075     -0.0164     -0.0231***  -0.0173     -0.0145*
Change                      (0.013)     (0.013)     (0.009)     (0.017)     (0.009)
Observations                14,105      12,575      12,575      10,977      10,977
R-squared                   0.873       0.890                   0.897

Notes: Results from fixed-effects and GMM regressions. All regressions include in Columns 2-5 controls for population, the number of times the incumbent’s family has been elected in the municipality since 1987, a dummy capturing family links between the mayor and provincial officials, a dummy capturing change in mayor and, a simple time trend (Columns 1 and 2), region-specific time trends (Column 3), province-specific time trends (Column 4) and, municipality-specific time trends (Column 5). The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level. R-squared is reported for the fixed-effects columns only.
Table A.1: Yearly political business cycles: Employment Levels (excluding 2003)

                            (1)         (2)         (3)         (4)
Panel A - All sectors
Election Year               0.1957      0.1417      0.1411      0.1393
                            (0.154)     (0.151)     (0.166)     (0.159)
Additional Controls         No          Yes         Yes         Yes
Time Trend                  No          Yes         Region      Province
Observations                6,860       6,752       6,752       6,752
R-squared                   0.857       0.863       0.868       0.874

Panel B - Public sector
Election Year               -0.0698     0.0656      0.0657      0.0655
                            (0.108)     (0.056)     (0.065)     (0.082)
Additional Controls         No          Yes         Yes         Yes
Time Trend                  No          Yes         Region      Province
Observations                6,860       6,752       6,752       6,752
R-squared                   0.837       0.867       0.868       0.872

Panel C - Private sector
Election Year               0.2655      0.0761      0.0754      0.0738
                            (0.197)     (0.196)     (0.227)     (0.208)
Additional Controls         No          Yes         Yes         Yes
Time Trend                  No          Yes         Region      Province
Observations                6,860       6,752       6,752       6,752
R-squared                   0.864       0.875       0.879       0.885

Notes: Results from fixed-effects regressions. The dependent variable is the yearly average of the share of the working age population with a job in the week before the survey (Panel A), with a job in the public sector in the week before the survey (Panel B) and, with a job in the private sector in the week before the survey (Panel C). Regressions in Columns 2-4 include controls for average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population and, per capita fiscal transfers. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table A.2: Quarterly political business cycles: Average hours worked

                            (1)         (2)         (3)         (4)
Panel A - Employment
Pre-election quarters       -0.5615     -0.5702     -0.5711     -0.5755
                            (0.523)     (0.501)     (0.510)     (0.585)
Post-election quarters      0.1341      0.1526      0.1551      0.1544
                            (0.281)     (0.325)     (0.325)     (0.308)
Additional Controls         No          Yes         Yes         Yes
Quadratic Time Trend        No          Yes         Region      Province
Observations                29,715      29,283      29,283      29,283
R-squared                   0.706       0.709       0.712       0.718

Panel B - Public sector employment
Pre-election quarters       -0.8382     -1.1972     -1.1923     -1.1968*
                            (0.691)     (0.771)     (0.772)     (0.695)
Post-election quarters      0.2837      0.0387      0.0408      0.0466
                            (0.280)     (0.378)     (0.390)     (0.369)
Additional Controls         No          Yes         Yes         Yes
Quadratic Time Trend        No          Yes         Region      Province
Observations                25,442      25,083      25,083      25,083
R-squared                   0.305       0.309       0.311       0.318

Panel C - Private sector employment
Pre-election quarters       -0.5433     -0.5132     -0.5142     -0.5188
                            (0.521)     (0.493)     (0.502)     (0.569)
Post-election quarters      0.1152      0.1651      0.1676      0.1666
                            (0.289)     (0.324)     (0.325)     (0.307)
Additional Controls         No          Yes         Yes         Yes
Quadratic Time Trend        No          Yes         Region      Province
Observations                29,715      29,283      29,283      29,283
R-squared                   0.700       0.703       0.706       0.712

Notes: Results from fixed-effects regressions. The dependent variable is the average number of hours worked in the week before the survey for those who have a job in the week before the survey (Panel A), with a job in the public sector in the week before the survey (Panel B) and with a job in the private sector in the week before the survey (Panel C). All regressions include controls for survey quarter. Regressions in Columns 2-4 include controls for average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population and, per capita fiscal transfers. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table A.3: Quarterly political business cycles: Log wage

                            (1)         (2)         (3)         (4)
Panel A - Employment
Pre-election quarters       -0.0746**   -0.0147*    -0.0147     -0.0147
                            (0.035)     (0.009)     (0.009)     (0.009)
Post-election quarters      -0.0307     0.0047      0.0047      0.0048
                            (0.040)     (0.008)     (0.008)     (0.009)
Additional Controls         No          Yes         Yes         Yes
Quadratic Time Trend        No          Yes         Region      Province
Observations                28,818      28,403      28,403      28,403
R-squared                   0.700       0.758       0.761       0.765

Panel B - Public sector employment
Pre-election quarters       -0.0705**   -0.0223     -0.0224     -0.0226
                            (0.035)     (0.014)     (0.015)     (0.015)
Post-election quarters      -0.0400     -0.0167     -0.0167     -0.0167
                            (0.031)     (0.020)     (0.021)     (0.020)
Additional Controls         No          Yes         Yes         Yes
Quadratic Time Trend        No          Yes         Region      Province
Observations                23,324      23,002      23,002      23,002
R-squared                   0.260       0.289       0.292       0.302

Panel C - Private sector employment
Pre-election quarters       -0.0684*    -0.0098     -0.0100     -0.0101
                            (0.035)     (0.010)     (0.010)     (0.011)
Post-election quarters      -0.0263     0.0120      0.0122      0.0121
                            (0.041)     (0.008)     (0.008)     (0.008)
Additional Controls         No          Yes         Yes         Yes
Quadratic Time Trend        No          Yes         Region      Province
Observations                28,104      27,699      27,699      27,699
R-squared                   0.768       0.805       0.809       0.812

Notes: Results from fixed-effects regressions. The dependent variable is the average log wage for those who have a job in the week before the survey (Panel A), with a job in the public sector in the week before the survey (Panel B) and with a job in the private sector in the week before the survey (Panel C). All regressions include controls for survey quarter. Regressions in Columns 2-4 include controls for average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population and, per capita fiscal transfers. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table A.4: Yearly political business cycles: Alternative specification

                            (1)         (2)         (3)         (4)         (5)         (6)
                            Full        Full        Public      Public      Private     Private
Number of lags              1           2           1           2           1           2
Panel A: Fixed effects
Election Year               0.1606      0.1990      0.0684      0.0190      0.0909      0.1758
                            (0.188)     (0.206)     (0.058)     (0.054)     (0.209)     (0.253)
Observations                6,751       5,607       6,751       5,607       6,751       5,607
R-squared                   0.873       0.887       0.872       0.880       0.884       0.896

Panel B: GMM
Election Year               0.1745      0.2684*     0.0550*     0.0450      0.0925      0.2179
                            (0.128)     (0.153)     (0.031)     (0.039)     (0.127)     (0.161)
Observations                6,751       5,607       6,751       5,607       6,751       5,607

Notes: Results from Fixed-Effects and GMM regressions. The dependent variable is the yearly average of the share of the working age population with a job in the week before the survey (Columns 1-2), with a job in the public sector in the week before the survey (Columns 3-4) and, with a job in the private sector in the week before the survey (Columns 5-6). All regressions include controls for survey quarters, average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population, per capita fiscal transfers and region-specific time trend. The dependent variable is lagged once in Columns 1, 3 and 5 and twice in Columns 2, 4 and 6. The standard errors (in parentheses) account for potential correlation within time period and province (Panel A), and within province (Panel B). * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table A.5: Yearly political business cycles: Alternative specification (Excluding 2003)

                            (1)         (2)         (3)         (4)         (5)         (6)
                            Full        Full        Public      Public      Private     Private
Number of lags              1           2           1           2           1           2
Panel A: Fixed effects
Election Year               0.1957      0.2512      0.0276      0.0037      0.1687      0.2420
                            (0.205)     (0.172)     (0.057)     (0.045)     (0.256)     (0.185)
Observations                5,607       4,464       5,607       4,464       5,607       4,464
R-squared                   0.886       0.901       0.875       0.894       0.895       0.908

Panel B: GMM
Election Year               0.1406      0.2711*     0.0529      0.0314      0.1758      0.2238
                            (0.129)     (0.152)     (0.036)     (0.038)     (0.132)     (0.159)
Observations                5,607       4,464       5,607       4,464       5,607       4,464

Notes: Results from Fixed-Effects and GMM regressions. The dependent variable is the yearly average of the share of the working age population with a job in the week before the survey (Columns 1-2), with a job in the public sector in the week before the survey (Columns 3-4) and, with a job in the private sector in the week before the survey (Columns 5-6). All regressions include controls for survey quarters, average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population, per capita fiscal transfers and region-specific time trend. The dependent variable is lagged once in Columns 1, 3 and 5 and twice in Columns 2, 4 and 6. The standard errors (in parentheses) account for potential correlation within time period and province (Panel A), and within province (Panel B). * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table A.6: Quarterly political business cycles: Alternative specification

                            (1)         (2)         (3)         (4)         (5)         (6)
                            Full        Full        Public      Public      Private     Private
Number of lags              1           2           1           2           1           2
Panel A: Fixed effects
Pre-election quarters       0.7388**    0.7991***   0.1452      0.1386      0.6013      0.6648*
                            (0.294)     (0.271)     (0.098)     (0.093)     (0.446)     (0.361)
Post-election quarters      -0.5921***  -0.5750***  -0.0071     -0.0210     -0.5838***  -0.5506**
                            (0.192)     (0.217)     (0.042)     (0.051)     (0.218)     (0.264)
Observations                28,132      26,984      28,132      26,984      28,132      26,984
R-squared                   0.655       0.657       0.615       0.613       0.678       0.679

Panel B: IV
Pre-election quarters       1.2551***   1.3011***   0.1939***   0.1641**    1.0612***   1.1176***
                            (0.230)     (0.201)     (0.065)     (0.079)     (0.240)     (0.166)
Post-election quarters      -0.1882     -1.2071**   0.0581      -0.1130     -0.2526     -1.1795**
                            (0.206)     (0.480)     (0.041)     (0.142)     (0.219)     (0.509)
Observations                25,837      24,690      25,837      24,690      25,837      24,690

Notes: Results from Fixed-Effects and IV regressions. The dependent variable is the share of the working age population with a job in the week before the survey (Columns 1-2), with a job in the public sector in the week before the survey (Columns 3-4) and, with a job in the private sector in the week before the survey (Columns 5-6). All regressions include controls for survey quarters, average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population, per capita fiscal transfers and province-specific quadratic time trend. The dependent variable is lagged once in Columns 1, 3 and 5 and twice in Columns 2, 4 and 6. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table A.7: Quarterly political business cycles: Excluding outliers

                          (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
                          Full       Public     Private    Public     Public     Private    Private    Construction
                                                           (ST)       (LT)       (ST)       (LT)       (ST)
Panel A - Exclude top and bottom 1% (n=28,710)
Pre-election quarters     0.8661***  0.1452*    0.7208***  0.0285     0.1168**   1.5623**   -0.8415    0.2098
                          (0.269)    (0.087)    (0.257)    (0.048)    (0.055)    (0.769)    (0.680)    (0.160)
Post-election quarters    -0.4489**  -0.0020    -0.4469**  0.0342     -0.0362    -0.4556    0.0086     -0.1042
                          (0.198)    (0.037)    (0.205)    (0.050)    (0.045)    (0.649)    (0.761)    (0.091)

Panel B - Exclude top and bottom 2% (n=28,123)
Pre-election quarters     0.8041***  0.1386     0.6656***  0.0268     0.1117**   1.5656*    -0.9001    0.2142
                          (0.266)    (0.085)    (0.251)    (0.047)    (0.057)    (0.808)    (0.751)    (0.158)
Post-election quarters    -0.4228**  0.0020     -0.4248**  0.0334     -0.0315    -0.4653    0.0405     -0.1025
                          (0.197)    (0.038)    (0.204)    (0.051)    (0.044)    (0.652)    (0.754)    (0.091)

Panel C - Exclude top and bottom 3% (n=27,539)
Pre-election quarters     0.8100***  0.1359     0.6741***  0.0251     0.1108**   1.5703*    -0.8962    0.2157*
                          (0.264)    (0.083)    (0.243)    (0.045)    (0.053)    (0.834)    (0.642)    (0.125)
Post-election quarters    -0.3909**  0.0043     -0.3952    0.0321     -0.0278    -0.4629    0.0678     -0.1005
                          (0.199)    (0.038)    (0.292)    (0.049)    (0.070)    (0.680)    (0.781)    (0.088)

Panel D - Exclude top and bottom 4% (n=26,952)
Pre-election quarters     0.8154***  0.1358     0.6797***  0.0249     0.1109**   1.5462**   -0.8665    0.2177*
                          (0.259)    (0.083)    (0.246)    (0.046)    (0.053)    (0.761)    (0.634)    (0.120)
Post-election quarters    -0.3477    0.0000     -0.3477    0.0289     -0.0289    -0.4435    0.0958     -0.1006
                          (0.214)    (0.039)    (0.217)    (0.050)    (0.049)    (0.649)    (0.759)    (0.089)

Notes: Results from fixed-effects regressions. The dependent variable is the share of the working age population with a job in the week before the survey (Column 1), with a job in the public sector in the week before the survey (Column 2), with a job in the private sector in the week before the survey (Column 3), with a short-term job in the public sector in the week before the survey (Column 4) with a long-term job in the public sector in the week before the survey (Column 5), with a short-term job in the private sector in the week before the survey (Column 6), with a long-term job in the private sector in the week before the survey (Column 7) and with a short-term job in the construction sector in the week before the survey (Column 8). All regressions include controls for survey quarters, average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population per capita fiscal transfers, a dummy for whether or not the previous municipal election led to a change in local leadership and province-specific quadratic time trends. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table A.8: Quarterly political business cycles: Additional robustness checks

                          (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
                          Full       Public     Private    Public     Public     Private    Private    Construction
                                                           ST         LT         ST         LT         ST
Panel A - Municipality*quarter fixed effects
Pre-election quarters     0.8723***  0.1469     0.7254**   0.0276     0.1193**   1.5746     -0.8492    0.2028
                          (0.314)    (0.109)    (0.290)    (0.060)    (0.060)    (0.977)    (0.731)    (0.146)
Post-election quarters    -0.4848**  -0.0011    -0.4837**  0.0338     -0.0349    -0.4472    -0.0364    -0.1056
                          (0.224)    (0.048)    (0.229)    (0.052)    (0.051)    (0.683)    (0.805)    (0.102)
Observations              29,283     29,283     29,283     29,283     29,283     29,283     29,283     29,283
R-squared                 0.702      0.687      0.723      0.376      0.668      0.484      0.610      0.438

Panel B - No weights
Pre-election quarters     0.7965***  0.1842*    0.6122*    0.0490     0.1353**   1.7107**   -1.0985*   0.1568
                          (0.307)    (0.106)    (0.327)    (0.060)    (0.058)    (0.738)    (0.644)    (0.100)
Post-election quarters    -0.7115*** 0.0187     -0.7302*** 0.0151     0.0036     -0.5142    -0.2160    -0.0891
                          (0.268)    (0.055)    (0.273)    (0.042)    (0.045)    (0.678)    (0.828)    (0.091)
Observations              29,283     29,283     29,283     29,283     29,283     29,283     29,283     29,283
R-squared                 0.603      0.585      0.642      0.215      0.561      0.366      0.528      0.299

Notes: Results from fixed-effects regressions. The dependent variable is the share of the working age population with a job in the week before the survey (Column 1), with a job in the public sector in the week before the survey (Column 2), with a job in the private sector in the week before the survey (Column 3), with a short-term job in the public sector in the week before the survey (Column 4) with a long-term job in the public sector in the week before the survey (Column 5), with a short-term job in the private sector in the week before the survey (Column 6), with a long-term job in the private sector in the week before the survey (Column 7) and with a short-term job in the construction sector in the week before the survey (Column 8). All regressions include controls for survey quarters, average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population per capita fiscal transfers, a dummy for whether or not the previous municipal election led to a change in local leadership and province-specific quadratic time trends. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level.
Table A.9: Quarterly political business cycles: Heterogeneity

                          (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
                          Full       Public     Private    Public     Public     Private    Private    Construction
                                                           (ST)       (LT)       (ST)       (LT)       (ST)
Pre-election quarters X
  First term              0.9595**   0.2755**   0.6840*    0.0037     0.2718***  1.1990     -0.5150    0.1866
                          (0.413)    (0.108)    (0.363)    (0.050)    (0.095)    (1.067)    (1.152)    (0.141)
  Second term             0.6908*    0.2173**   0.4735     0.0648     0.1524**   1.5678*    -1.0943    0.2346*
                          (0.358)    (0.104)    (0.356)    (0.062)    (0.076)    (0.837)    (0.786)    (0.120)
  Third term              1.0389***  0.0857     0.9532***  0.0265     0.0592     1.4673*    -0.5141    0.1655
                          (0.314)    (0.141)    (0.320)    (0.056)    (0.117)    (0.826)    (0.797)    (0.126)
  Fourth term             1.2127*    -0.0663    1.2790*    -0.0277    -0.0386    1.8920     -0.6130    0.2027
                          (0.626)    (0.171)    (0.678)    (0.055)    (0.148)    (1.152)    (1.148)    (0.221)
  Fifth term              0.5005     0.0004     0.5001     0.0246     -0.0242    1.5103*    -1.0102    0.1858
                          (0.466)    (0.130)    (0.448)    (0.109)    (0.095)    (0.897)    (0.921)    (0.188)
Post-election quarters X
  First term              -0.7884*** -0.0171    -0.7713**  0.0099     -0.0270    -0.9749    0.2036     -0.1523
                          (0.302)    (0.074)    (0.306)    (0.061)    (0.077)    (0.686)    (0.758)    (0.116)
  Second term             -0.2804    0.0142     -0.2946    0.0592     -0.0450    -0.1483    -0.1463    -0.0282
                          (0.268)    (0.069)    (0.255)    (0.054)    (0.091)    (0.919)    (1.053)    (0.125)
  Third term              -0.4559    0.0592     -0.5151    0.0413     0.0179     -0.0043    -0.5108    -0.1142
                          (0.348)    (0.102)    (0.385)    (0.074)    (0.072)    (0.554)    (0.754)    (0.119)
  Fourth term             -0.3573    -0.0215    -0.3358    -0.0022    -0.0193    -0.4867    0.1509     -0.1520
                          (0.459)    (0.119)    (0.469)    (0.049)    (0.112)    (0.812)    (0.855)    (0.187)
  Fifth term              -0.4892    -0.0520    -0.4372    0.0520     -0.1041    -0.2294    -0.2078    -0.0426
                          (0.454)    (0.093)    (0.447)    (0.079)    (0.111)    (0.694)    (0.744)    (0.122)
R-squared                 0.649      0.618      0.673      0.271      0.596      0.413      0.551      0.338

Notes: Results from fixed-effects regressions. The dependent variable is the share of the working age population with a job in the week before the survey (Column 1), with a job in the public sector in the week before the survey (Column 2), with a job in the private sector in the week before the survey (Column 3), with a short-term job in the public sector in the week before the survey (Column 4) with a long-term job in the public sector in the week before the survey (Column 5), with a short-term job in the private sector in the week before the survey (Column 6), with a long-term job in the private sector in the week before the survey (Column 7) and with a short-term job in the construction sector in the week before the survey (Column 8). All regressions include controls for survey quarters, average age (and its square) in the municipality (for those older than 15), education levels (for those older than 15), the share of women, population per capita fiscal transfers, a dummy for whether or not the previous municipal election led to a change in local leadership and province-specific quadratic time trends. The standard errors (in parentheses) account for potential correlation within time period and province. * denotes significance at the 10%, ** at the 5% and, *** at the 1% level. Observations: 29,283.
Teacher Absenteeism and the Salience of Local Ethnic Diversity: Evidence from African Districts
Eoin F. McGuirk∗
CEGA, University of California, Berkeley
Abstract
The rate of teacher absenteeism is over five times higher in Uganda than it is in New York. In India, it is two and a half times higher than the rate of absenteeism for private sector factory workers. One potential explanation for these observations is that, in the presence of weak formal institutions, such as those found in many less developed countries, the likelihood of punishment for absent teachers may be lower. In these settings, other forms of local collective action are often required to produce public goods and prevent free-riding. However, a growing literature has shown that local collective action outcomes are often adversely affected by ethnic divisions. In this paper, I identify the impact of a new measure of ethnic divisions on teacher absenteeism using two datasets: one collected from random, unannounced school visits in Uganda, and another collected from over 20,000 survey respondents in 16 sub-Saharan African countries. In light of growing empirical support for constructivist theories of ethnicity, I allow the effect of diversity to vary by the salience of ethnic identification in each district. I find that, at high levels of ethnic salience, a one standard deviation increase in ethnic diversity increases the observed absenteeism rate in Uganda by between 3.8 and 9.3 percentage points, or 0.08 and 0.21 standard deviations. In the multi-country survey data, the same change increases perceived absenteeism by 0.08 standard deviations. At low levels of ethnic salience, diversity has no positive effect on absenteeism in either dataset. Consistent with the recent literature on the limitations of participatory programs on public service delivery, I provide suggestive evidence that social capital in the form of within-school teacher networks, rather than community-level monitoring, may explain the findings. The results offer one explanation for why substantial recent investment in education does not seem to be leading to improved test-score outcomes for children in many poor and ethnically diverse countries. The analysis also has implications for the measurement of ethnic divisions.
∗Email: [email protected] or [email protected]. I am grateful in particular to Pedro Vicente and Ted Miguel for their invaluable guidance, support and suggestions. I thank James Habyarimana and Halsey Rogers for providing me with access to data, and also Gani Aldashev, Pierre Bachas, Joachim von Braun, David Berger, Daniel Gilligan, Guy Grossman, John Hoddinott, Philip Lane, Janet Lewis, Jeremy Magruder, Lucy Martin, Julia Anna Matz, Fergal McCann, Mark McGovern, Ben Morse, Carol Newman, Laura Ralston, Gerard Roland, Bilal Siddiqi and participants at the Development Economics Lunch Seminar at University of California, Berkeley, and the Working Group in African Political Economy (WGAPE) for their helpful input. I gratefully acknowledge funding from the Irish Research Council and the Fulbright Commission. All errors are my own.
1 Introduction
Despite the unprecedented expansion of primary school access over the past decade, standardized
test results in many less developed countries reveal that basic numeracy and literacy skills are not
improving.1 Evidence from Chaudhury et al. (2006) and Duflo et al. (2012) strongly suggests that
teacher absenteeism may be significantly contributing to this observation. The former study found
that 19% of teachers were absent during unannounced visits to nationally representative samples
of schools in Bangladesh, Ecuador, India, Indonesia, Peru and Uganda in 2002 and 2003,2 while
Duflo et al. (2012) show that a reduction of the absenteeism rate in Udaipur, India, from 36% to
18% led to an improvement in test scores of 0.17 standard deviations.
In this analysis, I show that ethnic divisions at the district and school levels are associated with
significantly higher rates of teacher absenteeism. I design a new measure of ethnic divisions to
capture what are sometimes called the ‘evolutionary’ and the ‘constructivist’ components of ethnic
identity.3 Traditional measures of ethnic diversity, or ‘ethnolinguistic fractionalization’, are based
on the composition of ethnic groups in a given area. This is to a large extent the product of long-term
cultural drift, itself caused by historical settlement duration (Ashraf and Galor, 2013; Ahlerup and
Olsson, 2012) and geographic variability (Michalopoulos, 2012). However, the extent to which this
diversity is manifested in collective action problems or political cleavages depends on the salience of
ethnic identification, which can vary across countries and over time due to nation-building policies
(Miguel, 2004), political competition (Posner, 2004a; Eifert et al., 2010) or other historical and
contextual factors (Dunning and Harrison, 2010; Glennerster et al., 2012). In order to capture
these ‘constructivist’ conditions, I create a district-level term that represents ethnic divisions by
interacting ethnic diversity with the salience of ethnic identification. The diversity component is
based on a Herfindahl concentration index of Afrobarometer survey respondents, while the salience
component is based on respondents’ propensity to identify themselves along ethnic lines rather than
national lines.
1 See Uwezo (2011, 2012) for East African cases, and Pratham (2006) for Indian cases.
2 This is a conservative estimate. The inclusion of ‘tea-drinkers’ (teachers who were present but were not teaching as scheduled) increases the figure from 25% to almost 50% for India alone.
3 These are often described as ‘primordial’ and ‘instrumentalist’ respectively.
I combine this new interpretation of ethnic divisions with data on (i) observed teacher absences
collected from random visits to almost 100 schools in 10 Ugandan districts;4 and (ii) perceived
teacher absenteeism amongst over 20,000 Afrobarometer survey respondents in 16 African countries.
At high levels of ethnic salience,5 I find that an increase of one standard deviation in local ethnic
diversity increases the rate of observed teacher absenteeism in the Ugandan dataset by between 3.8
and 9.3 percentage points, or 0.08 and 0.21 standard deviations. In the multi-country survey data,
the comparable increase in perceived teacher absenteeism amongst respondents is 0.08 standard
deviations.6 At low levels of ethnic salience, however, I find no positive effect of ethnic diversity
on absenteeism in either dataset: in the multi-country data there is no significant effect; while
in the Ugandan data there is a significantly negative effect, which I suggest may be explained
by residential sorting. In addition, I find that including the diversity component alone, and by
implication ignoring the role of salience, would lead to the erroneous conclusion that ethnicity does
not have any significant effect on absenteeism.
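One way to read these heterogeneous effects is through a standard linear interaction specification. The sketch below is illustrative only: the symbols $A_{id}$ (absenteeism), $D_d$ (diversity), $S_d$ (salience) and the coefficients are assumptions introduced here, not the paper's estimating equation.

```latex
\begin{align*}
A_{id} &= \beta_0 + \beta_1 D_d + \beta_2 S_d + \beta_3 (D_d \times S_d) + \varepsilon_{id}, \\
\frac{\partial A_{id}}{\partial D_d} &= \beta_1 + \beta_3 S_d, \\
\Delta A \,\big|_{\text{high (low) salience}} &= \sigma_D \left[ \beta_1 + \beta_3 \left( \bar{S} \pm \sigma_S \right) \right].
\end{align*}
```

Under this reading, the effect of a one standard deviation increase in diversity is evaluated at the mean of salience plus (minus) one standard deviation. If $\beta_3 > 0$ and $\beta_1 < 0$, the effect can be positive at high salience and negative at low salience, so a model that includes $D_d$ alone may average the two and return an estimate close to zero.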
I also replicate the Ugandan analysis using several school-level alternatives to the district-level
measure of ethnic divisions. In each of the 94 schools, head teachers are asked to estimate the
shares of the three most commonly spoken mother tongues amongst pupils in that school. I use this
information to construct a measure of ethnic diversity, and interact it with the salience measure
from the Afrobarometer. In addition, I create a range of teacher-specific proxies based on their
linguistic, ancestral and regional origins. Across a broad range of specifications, the results are
consistent with those reported above.
Having robustly established the reduced-form link between ethnic divisions and teacher absen-
teeism, I take a step towards identifying the channels of causation that may explain the relationship.
This is facilitated by the richness of the Ugandan data, which includes information on teacher-level
characteristics as well as school-level infrastructural and management characteristics. In order to
4 These data were collected for a World Bank project led by Habyarimana (2010) and Chaudhury et al. (2006).
5 High (low) ethnic salience is defined as the mean plus (minus) one standard deviation.
6 This is a subjective measure based on a four-point scale, hence there is no meaningful percentage point change to report for comparison.
establish a conceptual framework for this exercise, I draw on the literature on both ethnic diversity
and teacher incentives. In simple terms, ethnic diversity is most commonly purported to affect
social capital and cooperation in three ways. First: through a ‘taste’ for discrimination between
coethnics and non-coethnics (Becker 1957, 1974; Hjort 2012). This may increase absenteeism if
‘outsider’ teachers simply care less about non-coethnic students, parents or other teachers. Second:
through its impact on the effectiveness of social sanctions (Miguel and Gugerty, 2005; Habyarimana
et al., 2007, 2009). In this case, ethnic divisions may lead to higher absenteeism for two reasons:
(i) it may reduce the capacity for collective action necessary to form local monitoring institutions,
such as parent-teacher associations; and (ii) outsiders may not face the same credible threats of
informal sanctioning by parents, head teachers or other teachers that apply to coethnics. Third:
ethnic divisions may have a negative impact on group formation and social participation (Alesina
and La Ferrara, 2000). This could affect the cooperation of teachers within schools, which may
ultimately impact attendance decisions.
Against this background, I conduct three groups of tests to explain the association between
ethnic divisions and teacher absenteeism. First, using a range of measures, I reject the altruism
channel by identifying no statistical difference between ‘native’ and ‘outsider’ teachers’ attendance
decisions. Second, I find that neither parent-teacher associations, school inspections nor the sanctioning
history of head teachers can explain the main result. This is consistent with a growing
literature on teacher incentives in developing countries.7 Evidence from Banerjee et al. (2010),
de Laat et al. (2008), and Duflo et al. (2012) indicates that participatory programs designed to
empower the beneficiaries of public services—in this case parents—by providing them with infor-
mation and access to educational authorities are unlikely to substantially affect teacher attendance.
This is largely due to a combination of weak demand amongst parents and their relative lack of
power to enforce accountability mechanisms. Moreover, evidence from Kremer and Chen (2001)
suggests that head teachers are also unlikely to incentivize attendance.8
7 In addition to Banerjee and Duflo (2006), Kremer and Holla (2009) present a particularly comprehensive overview.
8 The successful intervention evaluated by Duflo et al. (2012) was based on an objective and external monitoring and reward system that was facilitated by tamper-proof cameras.
Instead, I find that the strength of social networks between teachers within schools, characterized
by their social activities outside of official hours, can explain most of the reduced-form
relationship. The finding is consistent with Alesina and La Ferrara (2000), who show theoretically
and empirically how social participation is lower in heterogeneous communities. The result suggests
that non-pecuniary incentives to attend work may partially derive from a teacher’s colleagues rather
than the wider local community.
The analysis contributes to a growing literature on the negative association between ethnic
diversity and the local provision of public goods in sub-Saharan Africa (Miguel and Gugerty, 2005;
Habyarimana et al., 2007, 2009) and elsewhere (Alesina et al., 1999; Vigdor, 2004). This association
has particularly acute consequences for countries that lack the strong formal institutions required
to implement many government policies—like India, Uganda and most of the 16 countries under
analysis—where informal collective action methods at the local level are required instead to provide
public goods. To illustrate, Chaudhury et al. (2006) show that teachers in their sample of less
developed countries are rarely sanctioned formally for not attending school. In India, for example,
only one head teacher in a sample of nearly 3,000 public schools reported a case in which a teacher
was dismissed for absenteeism, despite an absenteeism rate of 25%; in the Ugandan data, the
comparable figure is just under 1.5% of head teachers, despite an absenteeism rate of 28%.9 Indeed,
as the authors venture (p. 93):
[...] the mystery for economists may not be why absence from work is so high, but
why anyone shows up at all. For many providers, the answer must be that important
intrinsic and non-pecuniary motivations - such as professional pride or concern for the
regard of peers - affect attendance decisions.
9 Accordingly, absenteeism is rarely as grave an issue either in countries with strong formal institutions or in private sector industries with high monitoring: administrative data from New York school districts in the mid-1980s revealed a teacher absenteeism rate of around 5% (Ehrenberg et al., 1991); while the Indian Ministry of Labour Industrial Survey 2001-2002 shows that absenteeism amongst factory workers is 10.5%, despite the existence of rigid labour laws. As I note above, the rate of teacher absenteeism in India is estimated (conservatively) by Chaudhury et al. (2006) to be 25%.
This sentiment is reflected by Duflo et al. (2012), who find a large role for the non-pecuniary costs
of absenteeism in the decision-making process of teachers using a structural model. Identifying the
nature and source of these costs is an open area of research;10 the literature suggests that, in this
clear absence of formal monitoring and enforcement, ethnic diversity may well provide a partial
explanation.11
Of course, the characterization of ethnic divisions that I present brings with it considerable
challenges for the empirical estimation strategy. Ethnic diversity is not a random accident, nor,
especially, is the salience of ethnic identity. To account for the potential effect of omitted variables
on teacher absenteeism, I construct an extensive set of controls for inclusion in the econometric
models. For the multi-country sample, I include controls for a wide set of respondent- and district-
level characteristics, as well as fixed effects for regions, ethnicity, and pre-colonial ethnic boundaries.
I also control for the temporal and spatial proximity to recent armed conflict events and fatalities
by combining information on the geographic coordinates of each district in the Afrobarometer
with those in the Armed Conflict Location Event Database (ACLED). In addition, I show that
the inclusion of controls for endogenous sorting (based on pre-colonial ethnic boundaries) and
historical settlement patterns have no effect on the model in the presence of such an extensive set
of fixed effects. For the Ugandan teacher-level dataset, I can control comprehensively for teacher
characteristics and school-level covariates based on the specification of Kremer et al. (2005), who
use an almost identical dataset in their study of the determinants of teacher absence in India.12
10 In their review article on teacher absenteeism, Banerjee and Duflo (2006) reach the conclusion that “most attempts to boost the presence of teachers [...] have not been particularly successful.”
11 Revisiting the Chaudhury et al. (2006) data with this in mind, it is interesting to note that Bangladesh has only the fourth highest absenteeism rate, despite being the second poorest of the six countries. This is likely to be at least partially explained by the fact that it has by far the most homogeneous population, as measured by both ethnolinguistic and cultural diversity from Fearon (2003). Absenteeism in Bangladesh is lower than in Indonesia, which is over twice as wealthy but three times more diverse.
12 Indeed both are component datasets for the six-country Chaudhury et al. (2006) study. As such, they are based on very similar methodologies. The Ugandan dataset is analyzed separately by Habyarimana (2010).
An additional methodological challenge inherent in the multi-country section of the analysis is
the reliability of subjective assessments of teacher absenteeism. While it is likely to be measured
with some stochastic error, Olken (2009) also suggests that an error component may be specifically
correlated with ethnic diversity. He finds that people in ethnically diverse villages tended to
overestimate significantly the level of corruption in a road-building project in Indonesia. The im-
plication for this study is that ethnic diversity may be associated with disproportionately negative
perceptions of public goods delivery in general, as a result in part of feedback mechanisms over
time. Although the obvious mitigation of this concern lies in the replication with objective data
from Uganda, I also run a large set of falsification tests to show that the results are highly unlikely
to be driven by this systematic bias. These include testing for the effects of ethnic divisions on per-
ceptions of other aspects of school quality and of national-level governance issues, as well as holding
school-level characteristics constant in order to analyse variation in the error component alone. I
also offer an explanation for why the mechanism at play in the Indonesian setting is unlikely to be
applicable in this context.
This analysis contributes to the literature in three ways. First, it provides new evidence of a
significant determinant of teacher absenteeism that can partially explain such high rates in poor and
ethnically divided areas. Second, it introduces a new measure of ethnic divisions that is consistent
with the heterogeneous effects of ethnic diversity on a variety of outcomes found in the literature.
Moreover, the ‘constructivist’ component of the measure leaves room for a policy response. Finally,
it presents evidence that ethnic divisions do not affect absenteeism through community sanctioning
institutions or discriminatory altruism towards beneficiaries; instead, it is the erosion of social
capital between teachers within a school that appears most likely to undermine the provision of
public education. This casts a new light on the study of teacher incentives in developing countries.
The paper is organised as follows. In the next section I discuss briefly the analysis of ethnic
diversity in the literature, including methodological challenges. I then introduce the data and
discuss measurement issues, before presenting the reduced form estimation results for both the
multi-country and Ugandan analyses. I subsequently offer an explanation for the reduced form
results by testing for competing mechanisms. Finally, I conclude.
2 Analysing Ethnic Diversity
Scholars have long highlighted the deleterious impact of ethnic diversity on economic and political
development, particularly in poor and institutionally weak countries (Easterly and Levine, 1997;
Alesina et al., 2003; Alesina and La Ferrara, 2005). The cross-country evidence that characterized
the early stages of the literature has since been complemented by a series of micro-level studies
that have made progress in uncovering the channels through which ethnic diversity affects particular
outcomes.
Of particular importance for this study are analyses that increase our understanding of how
divisions lead to collective action problems in sub-Saharan Africa. Miguel and Gugerty (2005)
provide evidence that parents in Kenya contribute to school funding more in homogeneous areas
due to the credible threat of social sanctions for non-cooperation. In a seminal study, Habyari-
mana et al. (2007, 2009) provide laboratory evidence for this social sanctioning channel amongst
residents of Kampala, Uganda. It is an especially compelling explanation in these settings, where
people are more reliant on within-group networks to organise the provision of public goods that
effective governments would otherwise provide.13 However, Hjort (2012) also points to a ‘taste’
based discriminatory mechanism, showing that floriculture plant workers in Kenya weight the util-
ity of coethnics ahead of non-coethnics. He finds that non-coethnics were even willing to incur a
cost to display this discrimination amid the heightened ethnic tensions associated with the 2007
presidential election.
This wave of micro-level studies has also shed light on the conditions under which ethnic diver-
sity may not have the expected effect on certain political and economic outcomes. Miguel (2004),
for example, shows that ethnic diversity has heterogeneous effects on school funding and water well
maintenance in districts that straddle either side of the Kenya-Tanzania border. In Kenya, moving
from a homogeneous area to one with a mean level of diversity lowers local school fundraising by
25%; whereas in the neighbouring Tanzanian district, the same change has no significant effect.
These contrasting outcomes are put down to the well-known nation-building efforts made in post-
13 La Ferrara (2003) also analyzes the role of kin groups in the functioning of informal credit markets in Ghana.
independence Tanzania, characterized, amongst other policies, by the promotion of one common
language (Kiswahili) and a strong emphasis on national unity throughout public school curricula.
In Kenya, if anything, the opposite course was followed by a succession of politicians who were
demonstrably willing to use ethnic diversity as a vehicle for their own political ends.14 These diver-
gent policies are set against a background of broadly similar colonial and historical characteristics,
providing an empirical basis, also found in Hjort (2012), for the suggestion that the salience of
ethnic diversity is politically malleable.
Direct evidence of this is presented by Eifert et al. (2010), who show using Afrobarometer
data that the salience of ethnic identity, measured as the likelihood that respondents identify
themselves along ethnic lines when faced with an open question on self identification, increases
significantly with the proximity of competitive elections. Posner (2004a) also finds variation in the
salience of ethnic cleavages across political contexts. Like Miguel (2004), he exploits the arbitrary
determination of a national border—this time between Zambia and Malawi—and observes that
the Chewa and Tumbuka groups are more likely to perceive each other as allies in Zambia than
they are in neighbouring villages on the Malawian side of the border, where they view each other
with considerable antagonism and are less likely to inter-marry. This is explained by the political
landscape in each country: in Zambia, neither group is large enough to form the basis of a viable
political coalition; whereas in Malawi, by contrast, they each form large political blocs that vie for
power.
Another example of the importance of context when analyzing the effects of ethnic diversity
comes from Glennerster et al. (2012), who find that variation in ethnic diversity within Sierra
Leone does not affect the provision of local public goods. This is despite being a poor and highly
diverse country that has recently experienced major civil conflict. The authors explain the result by
documenting Sierra Leone’s unique combination of colonial history, tribal organization and language
composition which together prevent the collective action failures one may expect to find in such
settings. Similarly, Dunning and Harrison (2010) use an experimental approach to show that cross-cutting
cleavages in the form of ‘cousinage’ dominate the role of ethnicity in the formation of political
preferences in Mali.
14 This culminated in a wave of violence following the disputed 2007 presidential elections.
2.1 Methodological Concerns: Measurement
Taken together, these studies highlight the incontrovertible role for constructivist explanations of
ethnic identity, which stress the importance of context and time in shaping both the formation
and salience of ethnic identification.15 So what, if any, are the implications of these accounts for
the measurement of ethnic diversity? Early cross-country studies used a measure of ethnolinguistic
fractionalization that was calculated using the Herfindahl concentration formula on a dataset com-
piled by Soviet anthropologists in the Atlas Narodov Mira (1964). In a pointed critique, Laitin and
Posner (2001) bemoan its once ubiquitous use in the cross-country economic literature, noting, for
example, that it is akin to using the rate of inflation in 1945 as a measure for a country’s prosperity
today. This is because, firstly, it is a static measure of a changing phenomenon. Identities and
cultures change over time in response to economic and political climates. They cite, for example,
the reorganization of identities in Somalia since independence, where Isaaqs and Hawiyes would
once have considered themselves part of a shared linguistic group. Today, Isaaqs conspicuously
differentiate their speech in an effort to justify secession attempts. Second, the group categories
on which the measure is based may have no discernible meaning in the context of political
or economic cooperation. Ethnic identities have multiple dimensions in every country, and there is
no way for a researcher to know ex-ante which ones are salient in which contexts. A third point
rests with the concept of salience itself. As the literature above shows, ethnic identities in general
may have very few implications for cooperation in some countries, while having a significant effect
in others, be it at the dyadic level (Posner, 2004a) or in general (Miguel, 2004).
In response to these and similar critiques, several researchers have compiled new measures that
incorporate better the multi-faceted nature of ethnic diversity. Alesina et al. (2003) create a
new index that includes linguistic and religious fractionalization; Laitin (2000) and Fearon (2003)
15 Chandra (2012) provides a contemporary discussion of constructivist theories of ethnicity.
create measures that incorporate the concept of linguistic ‘distance’ between groups; Posner (2004b)
develops an index based on groups who engage in political competition, called the PREG index
(for Politically Relevant Ethnic Groups); and Baldwin and Huber (2010) highlight the importance
of accounting for economic inequality between groups.
In this paper, I use a new measure of ethnic divisions that overcomes the main pitfalls listed
above. It consists of an interaction term between district-level ethnic diversity, measured by applying
the familiar variant of the Herfindahl concentration formula to the self-reported ethnicities of
Afrobarometer respondents, and a district-level average of the respondents’ answers to a question
on the salience of their ethnic identity compared to their national identity. It is based on data that
is concurrent with the outcome variable; it allows the subjects to choose their own ethnic identity;
and it explicitly accounts for salience. A crucial added advantage is that it is measured at the
district level, which allows me to control for regional (and, by implication, country) fixed effects
throughout the analysis. The main implication, in light of the literature outlined above, is that it
measures relevant diversity, and is thus a more accurate tool for identifying the types of problems
that are synonymous with local heterogeneity in Africa and elsewhere.
2.2 Methodological Concerns: Estimation
While these constructivist findings have clear implications for the appropriate measurement of ethnic
divisions, they also point to the need for a more careful approach to identifying valid estimation
strategies. The ethnic divisions interaction term I use in this analysis is likely to be endogenous to
a multitude of political and economic outcomes through factors as diverse as colonial history and
current-day political competition. This calls for more comprehensive econometric specifications than
those typically found in much of the early literature, which generally treats ethnic diversity as an
exogenous phenomenon. Moreover, recent contributions from Ashraf and Galor (2013), Ahlerup and
Olsson (2012) and Michalopoulos (2012) have provided empirical bases for ‘evolutionary’ theories,
which describe diversity as a function of long-term settlement patterns. Specifically, Ashraf and
Galor (2013) and Ahlerup and Olsson (2012) show that the duration of human settlement is a
significant determinant of modern day diversity across countries. This is owing to cultural drift
that happens over time in response to the need for peripheral groups to provide their own public
goods. Michalopoulos (2012) provides more evidence of cultural drift, this time due to geographic
variability, such as the variation in soil quality, which led to the development of non-transferable
human capital and, eventually, the formation of new linguistic groups.
Taken together, these strands of literature necessitate the inclusion of an extensive set of polit-
ical, economic, geographic and historic controls in the analysis. I describe my data and estimation
strategy in Sections 3 and 4 respectively.
3 Data
I use two main sources of data in the analysis. The first dataset comes from the 2005 round of
Afrobarometer, a series of nationally representative surveys based on standardised interviews of a
random sample of either 1,200 or 2,400 individuals in 16 sub-Saharan African countries: Benin,
Botswana, Ghana, Kenya, Lesotho, Madagascar, Malawi, Mali, Mozambique, Namibia, Nigeria,
Senegal, South Africa, Tanzania, Uganda, and Zambia. Figure 1 presents the location of every
district in which interviews were conducted on a map of Africa.16 The second dataset comes
from Habyarimana (2010), and is a constituent dataset of the Chaudhury et al. (2006) survey of
teacher absenteeism. It consists of data from two visits to each of almost 100 schools in 10 Ugandan
districts, which are shown on a map in Figure 2. In each school, up to 20 teachers are selected at
random from the roster, and their attendance is recorded at each visit. In addition, a rich set of
characteristics for each teacher—present or otherwise—is recorded, as well as information on the
head teacher, the school’s facilities, its pupils and its structures of governance and management.
16 I omit Cape Verde and Zimbabwe, as those respective samples do not have information on ethnicity and certain individual characteristics necessary for the analysis.
3.1 Data: Ethnic Divisions
Afrobarometer multi-country sample
For the Afrobarometer sample, I measure ethnic divisions as the product of ethnic salience and
ethnic diversity. The measure for ethnic salience is recorded as the district-level mean of the
following survey question:
Let us suppose that you had to choose between being a [Ghanaian/Kenyan/etc.] and
being a ________ [respondent’s ethnic identity group]. Which of these two groups
do you feel most strongly attached to?
I ascribe a value of 1 to respondents if the answer is “only [group]” or “more [group],” and 0 if
the answer is “equal,” “more [country]” or “only [country]”.17 In Table 1, I present some external
validation that the question is in fact measuring the concept of salience that I discuss in the
previous section. Recall that Posner (2004a) found Chewas and Tumbukas to be salient adversaries
in Malawi, but not in Zambia. This was due to the political landscape in each country, as Chewas
and Tumbukas were each large political groups vying for power in Malawi, whereas in Zambia they
were too small to form the basis of any competitive coalition. In the top panel of Table 1, we can
see that Chewas and Tumbukas are significantly more likely to identify themselves along ethnic
lines in Malawi than they are in Zambia. While this is an imperfect test for their animosity towards
each other, it is nonetheless illustrative of the fact that political competition can lead to salient
sub-national identification.
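The coding rule for salience described above can be sketched as follows. This is a minimal illustration rather than the author's code: the answer labels, the function name `district_salience`, and the choice to skip any other answer are assumptions made for the example.

```python
from collections import defaultdict

# Answers coded 1 ("ethnic" identification); the three others are coded 0.
ETHNIC_ANSWERS = {"only group", "more group"}
VALID_ANSWERS = ETHNIC_ANSWERS | {"equal", "more country", "only country"}


def district_salience(responses):
    """Return {district: mean of the 0/1 salience indicator}.

    `responses` is an iterable of (district, answer) pairs. Answers outside
    VALID_ANSWERS are skipped; this handling is an assumption, since the
    text does not say how such responses are treated.
    """
    totals = defaultdict(int)
    ethnic = defaultdict(int)
    for district, answer in responses:
        if answer not in VALID_ANSWERS:
            continue
        totals[district] += 1
        if answer in ETHNIC_ANSWERS:
            ethnic[district] += 1
    return {d: ethnic[d] / totals[d] for d in totals}
```

The district-level mean of a 0/1 indicator is simply the share of respondents who place their ethnic group ahead of their nationality.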
In the second panel, we also see consistency with the findings from Miguel (2004). Recall again
that ethnic diversity had adverse effects on local collective action in Kenya, but not in Tanzania.
This was due to serious nation-building efforts in Tanzania that were designed to inculcate a sense
17 Using the full five-point scale instead of this dichotomous interpretation does not qualitatively change the results. This is also the case when “equal” takes on a value of 1.
of common national identity ahead of sub-national ethnic attachments. Accordingly, ethnic salience
is two and a half times higher in Kenya than it is in Tanzania, despite similar levels of ethnic diversity
and comparable colonial and precolonial backgrounds in both countries.
I measure ethnic diversity using the following Herfindahl concentration formula:
$$ELF_d = 1 - \sum_{g=1}^{n} s_g^2,$$
where $s_g$ is the share of self-reported ethnic group $g \in \{1, \dots, n\}$ in each of the 1207 sample
districts $d$. It reflects simply the likelihood that two randomly drawn individuals in a district $d$
report different ethnicities. In addition to the 2005 Afrobarometer sample, I include respondents
to the 2008 round in order to increase the power of the Herfindahl statistic. The median district
sample size for the variable is 47.18
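As a concrete illustration of the Herfindahl-based index and its interaction with salience, the following minimal sketch computes both for a single district. The function names `elf` and `ethnic_divisions` are hypothetical and the input is a plain list of self-reported ethnicities; it is not the author's implementation.

```python
from collections import Counter


def elf(ethnicities):
    """Fractionalization: one minus the sum of squared group shares."""
    counts = Counter(ethnicities)
    n = sum(counts.values())
    if n == 0:
        raise ValueError("need at least one respondent")
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def ethnic_divisions(ethnicities, salience):
    """Interaction term: district diversity times district salience."""
    return elf(ethnicities) * salience
```

A district split evenly between two groups has an index of 0.5, while a fully homogeneous district has an index of 0, so its divisions term is 0 whatever the level of salience.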
I present non-parametric density functions of both interaction components in Figure 3, and a
scatter plot of their country mean values in Figure 4. A cursory look at the scatter plot reveals very
clearly the importance of accounting for both components in the analysis. By ignoring the salience
of ethnic identities, researchers would erroneously conclude that highly heterogeneous Lesotho is
more likely to suffer the consequences of ethnic divisions than relatively homogeneous (at the district
level) Nigeria. This does not take into account, however, the cultural closeness of groups in Lesotho
nor the history of ethnic violence in Nigeria, which may have contrasting effects on collective action.
I argue that these are captured by the measure of ethnic salience.19
While I have thus far pointed to the clear need for including a ‘salience’ component of ethnicity
in my measure of ethnic divisions, it is worth remembering that, in the context of local collective
action, it is obviously necessary to include a measure of diversity as an interaction component, for
it is unlikely that highly salient ethnic identification will lead to adverse public goods outcomes
18 There is no measure of teacher absenteeism in the 2008 sample.
19 This may also be reflected in the degree of residential sorting in each country. In Figure A1 in the appendix, I plot ethnic salience against diversity at the country level rather than the aggregated district level measure in Figure 4. The large difference between ELF at the district and country levels for Nigeria (and the relatively minute difference between them for Lesotho) suggests that coethnics may sort into districts in countries with high ethnic salience. This again highlights the importance of accounting for both diversity and salience, where otherwise highly diverse districts could be misinterpreted as highly divided districts.
within a homogeneous community.
Ugandan school visits
In the Ugandan sample, I use two measures of ethnic diversity. First, I match the district-level
Afrobarometer measure to all ten districts covered by the survey, namely: Arua, Bugiri, Bushenyi,
Jinja, Kamuli, Kisoro, Luwero, Mpigi, Tororo and Yumbe. As this gives only ten data points, I
also construct a value of ethnic divisions for each of the 94 schools in the sample. During the first
random visit to each school, head teachers were asked to list the three most commonly spoken
mother languages amongst the pupils, and to estimate the corresponding share of pupils for whom
this is the case. Using these shares, I create another Herfindahl-based concentration index for
each school and interact it with the district-level mean ethnic salience from Afrobarometer. The
kernel density function for school-level diversity is estimated in Figure 5. All equations described
in Section 5 are estimated using both measures, which have a correlation coefficient of 0.48.
In Section 6, I introduce a number of additional proxy measures based on the diversity of the
teaching staff within each school. This is presented in support of the hypothesis that ethnicity
affects teacher absenteeism through its impact on social networks between teachers within schools.
3.2 Data: Teacher Absenteeism
Afrobarometer multi-country sample
In the multi-country Afrobarometer dataset, I base the measure of teacher absenteeism on responses
to the following survey question:
Have you encountered any of these problems with your local public schools during the
past 12 months: Absent teachers?
0=Never, 1=Once or twice, 2=A few times, 3=Often, 7=No experience with public
schools in the past twelve months, 9=Don’t Know,
I code the responses on a four-point scale from 0 to 3, omitting respondents who choose the remaining
categories. This lowers the potential sample size from 21,598 to 14,100. Descriptive statistics are
presented in Table 2, showing mean values for ELFd, district-level ethnic salience, and a selection
of individual and village level variables for each response category of the question.
As is frequently the case when variables are based on subjective opinions, the major concern in
this part of the analysis is the potential for systematic bias caused by non-random measurement
error. While it is likely that the survey question picks up at least some noise, so that TAid =
TA∗sd + ui, where TAid is subjective teacher absenteeism reported by individual i in district d,
TA∗sd is actual teacher absenteeism at school s, and ui is stochastic measurement error, the danger
is that TAid = TA∗sd + ui + vi, where the error component vi is correlated with ethnic diversity. If
this is the case, the observed coefficient that describes the relationship between the outcome variable
and ethnic divisions may be driven by the error component, rather than a fundamental association
between ethnic divisions and true absenteeism. Olken (2009) suggests that this should be treated as
a genuine concern. He finds that people in ethnically heterogeneous villages in Indonesia are more
likely to overestimate the level of corruption associated with a road building project than villagers
in homogeneous areas.
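A small simulation may help fix ideas about why the second error component matters. It is a sketch under invented assumptions: true absenteeism is generated independently of diversity, and the reporting bias v is set to rise linearly with diversity at a hypothetical rate of 0.5. None of these numbers come from the data.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000
diversity = rng.uniform(0.0, 1.0, n)
true_absence = rng.normal(size=n)   # TA*, unrelated to diversity by construction
u = rng.normal(size=n)              # classical noise
v = 0.5 * diversity                 # hypothetical reporting bias tied to diversity


def ols_slope(x, y):
    """Slope from a bivariate least squares regression of y on x."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))


# With classical noise only, the slope stays close to the true value of zero.
slope_classical = ols_slope(diversity, true_absence + u)
# Adding v shifts the slope by the strength of the bias, despite no true effect.
slope_biased = ols_slope(diversity, true_absence + u + v)
```

Classical noise leaves the estimated slope centred on zero, whereas the correlated component moves it by the full size of the bias, which is the pattern the falsification tests are designed to rule out.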
He explains the findings by suggesting that feedback mechanisms over time have caused people
in diverse areas to be wary of corruption in public projects, which in turn increases their scrutiny
of public funds. However, the means through which scrutiny led to lower corruption in ethnically
diverse villages for that particular project is linked to the disproportionately high rate of attendance
at ‘monitoring meetings’ that were provided by the central government. In the absence of this
exogenous facilitation of scrutiny, it is unclear whether or not diverse communities would cooperate
better than homogeneous villages to minimize corruption. In any case, I include a wide set of
falsification tests below that directly address this concern, and find it to be an unlikely driver of
results.
Ugandan school visits
The measure of teacher absence in the Ugandan dataset is more straightforward. Enumerators
recorded a teacher as absent if she was not present to teach a class that she was scheduled to
teach. Over two visits, they collected this information for up to 20 randomly selected teachers from the
school’s roster. Where a school had fewer than 20 names on its roster, the enumerators
collected information on all teachers.
In Figure 6, I plot the district mean values of each absenteeism variable. On the y-axis is the
mean value for the four-point Afrobarometer scale; on the x-axis is the district mean for a teacher-
level dummy variable indicating that a teacher was absent during a random school visit. I also
include a linear fit, which confirms that the Afrobarometer measure contains significant information
on actual teacher absenteeism. This provides some evidence for the validity of the multi-country
analysis, which itself allows for a general interpretation of the results across 16 sub-Saharan African
countries.
4 Estimation: Afrobarometer Multi-Country Sample
I begin the estimation section with a focus on the multi-country Afrobarometer survey data. As
I note above, the two main challenges in this section relate to the potential endogeneity of salient
ethnic divisions to actual teacher absence rates due to omitted variable bias, and also the possibility
of a correlation between ethnic divisions and the error component vi of the subjectively measured
dependent variable. The basic equation I estimate takes the following form:
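$$TA_{idr} = a + \Psi \frac{\sum_{i=1}^{n} ES_{id}}{n_d} + \lambda ELF_d + \beta\left(\frac{\sum_{i=1}^{n} ES_{id}}{n_d} \ast ELF_d\right) + \gamma X_{id} + \delta V_{id} + \eta R_r + e_{id}, \qquad (1)$$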
where TAidr is perceived teacher absenteeism reported on a four-point scale by individual i in
district d and region r; ES is ethnic salience; X is a vector of individual controls; V is a vector of
village-level controls; and R represents regional fixed effects for 184 regions in 16 countries. The
individual controls include measures for age, age squared, level of education, gender, employment
status, physical health, mental health, religion, individual ethnic salience,20 access to food, water,
medicine, fuel and income, as well as indicators for ownership of three assets: radio, television and
a vehicle. The village level vector controls for whether or not a village—which is a sub-district level
unit with a modal sample size of 8 respondents—contains each of the following services or facilities:
a school, piped water, a sewage system, a health clinic, electricity, a police station, a post office,
recreational facilities, a place of worship, community buildings and a tarred or paved road.
Given the incidental parameters problem, the equation is estimated initially using least squares
with standard errors adjusted for two-way clustering within ethnicity groups and within districts
(Cameron et al., 2011).21 This method produces standard errors that are higher than those pro-
duced by either ethnicity- or district-level clustering alone. In addition, I run the equivalent ordered
probit model, and report the relevant marginal effects in the robustness section below.
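The two-way clustering adjustment can be illustrated with a short sketch. It assumes the standard combination of one-way estimators, in which the covariance matrices for each clustering dimension are added and the matrix for their intersection is subtracted. It omits any finite-sample corrections that a packaged routine may apply, and the simulated data and variable names are hypothetical.

```python
import numpy as np


def cluster_meat(X, resid, groups):
    """Sum over clusters of (X_g' u_g)(X_g' u_g)'."""
    groups = np.asarray(groups)
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        mask = groups == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    return meat


def twoway_cluster_vcov(X, y, g1, g2):
    """OLS coefficients with a two-way cluster-robust covariance matrix.

    Combines the one-way estimators as V_g1 + V_g2 - V_(g1 and g2).
    No finite-sample corrections are applied in this sketch.
    """
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    # Label each observation by its (g1, g2) pair for the intersection term.
    pairs = np.array([f"{a}|{b}" for a, b in zip(g1, g2)])
    meat = (cluster_meat(X, resid, g1)
            + cluster_meat(X, resid, g2)
            - cluster_meat(X, resid, pairs))
    return beta, bread @ meat @ bread


# Hypothetical data: 200 respondents, 10 ethnicity groups, 8 districts.
rng = np.random.default_rng(0)
n = 200
ethnicity = rng.integers(0, 10, n)
district = rng.integers(0, 8, n)
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 0.5 * x + rng.normal(size=n)
beta, vcov = twoway_cluster_vcov(X, y, ethnicity, district)
```

When both clustering variables coincide, the three terms collapse to the ordinary one-way cluster-robust matrix, which is a convenient sanity check on the construction.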
There is good reason to believe that this set of covariates controls for the potential sources of
endogeneity that I explore in Section 2. In particular, the control for regional-level fixed effects
at a stroke controls for country-level colonial and pre-colonial historical factors (Glennerster et al.,
2012), post-colonial national policies (Miguel, 2004) and current day political competition (Eifert
et al., 2010, Posner, 2004a), as well as the country-wide effects of macroeconomic policies that
are associated with ethnic diversity (Easterly and Levine, 1997; Alesina et al., 2003). Moreover,
the rich set of individual- and village-level controls accounts for variation in individual and local
wealth, health and economic factors that could otherwise plausibly affect our interpretation of the
coefficient of interest, β.
20This can be interpreted as controlling for the effects of an individual’s deviation from the district mean value of ethnic salience. Omitting it does not have a significant effect on the results.
21The Stata command for this estimator is ‘cgmreg’.
4.1 Results
In Figure 7, I present an illustration of the output from the most comprehensive specification
estimated in this section.22 I use a re-centering method to hold ethnic salience constant at a low
level, which I define as the mean value minus one standard deviation (or 0.004), and at a high level,
defined as the mean plus one standard deviation (or 0.33). The slope of ethnic diversity (ELFd) at
the low level (red) is -0.14, and is statistically indistinguishable from zero; whereas the slope at the
high level (blue) is 0.29, or 0.26 standard deviations of perceived teacher absenteeism. The p-value
for the interaction effect, which describes the statistical significance of the difference between the
two slopes, is 0.005.
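The two slopes follow directly from the coefficients of the function plotted in Figure 7, which the accompanying footnote reports as -0.151 on ELFd and 1.33 on the interaction. A minimal sketch of the arithmetic, with hypothetical function and constant names:

```python
# Coefficients reported for the function plotted in Figure 7.
LAMBDA_ELF = -0.151      # coefficient on ELF_d
BETA_INTERACTION = 1.33  # coefficient on (mean salience x ELF_d)


def elf_slope(mean_salience):
    """Slope of perceived absenteeism with respect to ELF_d at a given
    district-level mean ethnic salience."""
    return LAMBDA_ELF + BETA_INTERACTION * mean_salience


low = elf_slope(0.004)   # mean minus one standard deviation
high = elf_slope(0.33)   # mean plus one standard deviation
```

These evaluate to roughly -0.146 and 0.288, in line with the slopes of -0.14 and 0.29 quoted above up to rounding.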
The results show clearly the importance of conditioning on salience when trying to ascertain
the effects of ethnic diversity at the local level. In the absence of salience, ethnic diversity has no
significant effect on teacher absenteeism.
In Table 3, I present the regression output for the most basic specification. In column (1), I
omit the village-level controls, which could be affected themselves by ethnic divisions. While the
independent effects of ethnic diversity (ELFd) and ethnic salience do not affect perceived teacher
absenteeism at traditional levels of statistical significance, their interaction has an impact that is
both economically and statistically significant.
In column (2), I add the village-level controls described above. Although this inclusion increases
the precision of the main estimate, while simultaneously reducing bias by accounting for some
potentially endogenous factors, it also lowers the sample size from 13,468 to 12,240. In column
(3), I highlight the importance of including the interaction term in the model. Its omission would
lead us to erroneously conclude that it is only ethnic salience that reduces teacher absenteeism
within a district, rather than a combination of salience and ethnic diversity. In column (4), I
22Specifically, I use the output from a specification based on Table 5, column (6), but without the pre-colonial fixed effects, which I describe in the next sub-section. The effect of diversity for any level of salience can be found simply by plugging in the desired level of salience to the relevant regression output function. Figure 7 is a graphical representation of the function $TA_{idr} = 1.71 - 0.151(ELF_d) + 1.33\left(\frac{\sum_{i=1}^{n} ES_{id}}{n_d} \ast ELF_d\right) + \Psi \frac{\sum_{i=1}^{n} ES_{id}}{n_d} + \gamma X_{id} + \delta V_{id} + \eta R_r + e_{id}$, where $\frac{\sum_{i=1}^{n} ES_{id}}{n_d} = 0.33, 0.004$. The black dots represent observations plotted by ELFd on the x-axis and TAidr on the y-axis.
present the most naïve representation of ethnic divisions, that is, ethnic diversity with no account
for salience whatsoever. Again, though common in the literature, this measure fails to capture at
all the significant effect of ethnic divisions on the outcome variable.
For the remainder of this section, I test the robustness of the association by (i) controlling
for an additional range of factors that are plausibly correlated with ethnic divisions and teacher
absenteeism; (ii) conducting a set of falsification tests to ensure that the results are not being
driven by an error component in the measurement of the dependent variable; and (iii) presenting
alternative specifications to ensure that the results are not driven by particularities in the survey
sample.
4.1.1 Controlling for Observables
Cultural and institutional persistence Nunn and Wantchekon (2012) use Afrobarometer data
to show how historical events can affect current behavior through culture, or, more specifically, the
intergenerational transmission of norms within ethnic groups. They show that members of ethnic
groups that were historically targeted by the slave trades have lower levels of trust in institutions
and other people today. This is due to the nature of the slave trade, which often rewarded trickery
and dishonesty by sparing from slavery those who provided other people for export. This led to a
profusion of distrust amongst the kin of those who were sold for export, which in turn developed
into a cost-saving heuristic that evidently survived over time. The implication for this analysis
is simple: certain ethnic groups may display common traits that could affect the salience of their
ethnicity and the development of local institutions.
In Table 4, I show how the omission of ethnicity-level covariates can lead to a biased estimate of
β. The inclusion of three such variables in column (1) changes the size and statistical significance
of the coefficient. The first variable is the natural log of the number of slave exports taken from the
respondent’s ethnicity group divided by the size of the area which it historically inhabited. It is taken
directly from Nunn and Wantchekon (2012), who use Murdock (1959) to link current day ethnic
group names to their pre-colonial ancestral groups. Historical slavery exports may affect current
day social capital, which could be manifested in more salient sub-national identities and lower levels
of local cooperation. The second variable is taken from the Ethnographic Atlas (Murdock 1967) and
coded by Nunn and Wantchekon (2012). It captures the political sophistication of each ethnicity’s
corresponding pre-colonial ancestral group by measuring the number of hierarchical layers in its
power structure. This organization of power could itself persist over time within ethnic groups to
reflect better local coordination today. The third variable is a proxy for each ethnicity’s historical
wealth by indicating whether or not there was a city within its pre-colonial boundaries in 1400. It
is taken from Chandler (1987) and again coded by Nunn and Wantchekon (2012), and is intended
to reflect the probability of colonial plunder, which may negatively affect the persistent quality of
institutions within groups over time.23
Slavery enters the model with no statistical significance, while the sophistication of pre-colonial
institutions and the indicator for pre-colonial wealth enter significantly with the expected signs.
Taken together, these results suggest that controlling for ethnicity-level variation in the response is
a necessary step in establishing the robustness of the main finding. Accordingly, I include ethnicity
fixed effects in column (2) and for the remainder of the analysis. I code ethnicity by country, in
order to account for the variation in ethnic salience within ethnic groups that spill across country
borders, as in Posner (2004a).
Although clearly a necessary inclusion, ethnicity fixed effects are not a sufficient means of control-
ling for variation in unobserved historical factors. Michalopoulos and Papaioannou (forthcoming)
show that pre-colonial factors—in this case the jurisdictional hierarchy measure from column (1)—
can also have persistent effects through local institutions. Indeed Nunn and Wantchekon (2012)
present evidence of this channel that is independent of the cultural persistence discussed above.
Both studies use information on the pre-colonial boundaries of ethnic groups from Murdock (1959)
to link historical factors to current outcomes. In the case of Nunn and Wantchekon (2012), this
is facilitated by recording the geographic coordinates of each Afrobarometer district in order to
determine the ethnic group that inhabited the corresponding area in pre-colonial era. I use this
information to include a spatial vector in column (3) that controls for historical fixed effects that
23Acemoglu et al (2001) provide some evidence on the colonial origins of institutional quality.
vary at this level of pre-colonial settlement.24
Conflict, sorting and settlement duration I now consider three more factors that could
plausibly affect the main result. First, I address the possibility that armed conflict is associated
with salient ethnic divisions and with perceived teacher absence. Montalvo and Reynal-Querol
(2005) provide empirical evidence for the link between ethnic polarization and conflict, building
on a seminal contribution from Horowitz (1985). It is certainly not unreasonable to consider that
ethnic conflict and ethnic salience may be related; nor is it implausible to suggest that teacher
attendance—or indeed perceptions of teacher attendance—could be affected by local conflict.
To account for this, I turn to the Armed Conflict Location and Event Dataset (ACLED), which
contains data on over 60,000 fatal and non-fatal incidents of conflict throughout Africa, parts of
Asia, and Haiti from 1997 to 2012.25 The dataset also includes the geographic coordinates of each
incident, which I use to measure its geodesic distance from the centroid of each Afrobarometer
district.26 The location of every recorded event is presented on a map in Figure 8. I combine
various levels of spatial and temporal proximity to the number of armed conflict events and to the
number of fatalities associated with each event. Specifically, I include all possible combinations
of four radii (20km, 10km, 5km and 1km) and five time windows (10 years, 5 years, 2 years, 1 year and six months). I find
that only the number of conflict fatalities within 1 year and 1 kilometer from the centroid of each
district has a significant impact on perceived teacher absenteeism. In columns (1) and (2) of Table
5, I show that the inclusion of both this measure and the corresponding measure of conflict events
(fatal and non-fatal) has no qualitative effect on the main result.
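The proximity measures can be sketched as a count of events and fatalities inside a given radius and time window. The sketch below makes several assumptions that the text does not confirm: it approximates geodesic distance with the haversine great-circle formula on a spherical Earth, it measures the time window backwards from a survey date, and all coordinates, dates and function names are hypothetical.

```python
import math
from datetime import date

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def conflict_exposure(centroid, survey_date, events, radius_km, window_days):
    """Count events and fatalities within radius_km of the centroid and
    within window_days before the survey date."""
    n_events = n_fatalities = 0
    for lat, lon, event_date, fatalities in events:
        age = (survey_date - event_date).days
        if 0 <= age <= window_days and haversine_km(*centroid, lat, lon) <= radius_km:
            n_events += 1
            n_fatalities += fatalities
    return n_events, n_fatalities


# Hypothetical district centroid and events: (lat, lon, date, fatalities).
centroid = (0.35, 32.58)
events = [
    (0.351, 32.581, date(2008, 3, 1), 2),   # close and recent
    (0.50, 32.58, date(2008, 5, 1), 5),     # recent but roughly 17 km away
    (0.35, 32.58, date(2005, 1, 1), 1),     # close but outside a one-year window
]
near = conflict_exposure(centroid, date(2008, 10, 1), events, radius_km=1, window_days=365)
wide = conflict_exposure(centroid, date(2008, 10, 1), events, radius_km=20, window_days=365)
```

Looping the same function over each radius and window pair would produce the full grid of proximity measures.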
Although potentially captured by the covariates already presented in the model, I include a
measure of historical sorting in columns (3) and (4). Residential sorting amongst ethnic groups,
as I discuss above, may reflect traits that could impact collective action outcomes. For example,
24Not every area could be linked to a precolonial group. I thus present results with and without these fixed effects for the remainder of these estimations.
25More information on the ACLED dataset is available online at www.acleddata.com
26Sangnier and Zylberberg (2012) also combine Afrobarometer and ACLED data using location coordinates.
it is possible that individuals from different ethnic groups who have no animosity toward outsiders
may coexist in districts, which itself may perpetuate long-term cooperation and better local insti-
tutions. To measure sorting, I go back to the information on pre-colonial boundaries in Murdock
(1959). I define historical assimilation as the percentage of current district residents whose ethnic
ancestors were based elsewhere during the pre-colonial era. Adding this variable to the model has
no discernible effect on the main results.
Like sorting, it is likely that correlates of the ‘evolutionary’ sources of ethnic diversity are already
controlled for by the regional and pre-colonial fixed effects in the model. To see if this is the case,
I include three proxies for human settlement that are discussed by Ahlerup and Olsson (2012) and
Michalopoulos (2012) in columns (5) and (6). I first use the geodesic distance from each district
to Addis Ababa, which is also shown by Ashraf and Galor (2013), amongst others, to correlate
highly with the duration of human settlement and, in turn, genetic diversity. In addition, I include
distance to the equator (measured in degrees of latitude) and distance to the sea, which reflect two
theories of early human migration patterns from East Africa between 150,000 and 200,000 years ago. As I
mention above, I also include a binary variable for whether or not each village contains a tarred or
paved road, which could be interpreted as a proxy for the ruggedness of land. None of the variables
have a significant impact on the model.
4.1.2 Falsification Tests
In this section, I address the potential effects of non-classical measurement error in the dependent
variable. To recount, Olken (2009) shows that survey respondents in heterogeneous Indonesian
villages overestimated corruption in a road project more than those in homogeneous areas. The
author suggests that this was caused by a higher level of skepticism in diverse villages that may have
been triggered through a feedback mechanism from corruption in previous projects. As a result,
community meetings designed to facilitate local monitoring and oversight of the road project were
22% more highly attended than meetings in homogeneous areas, which in turn led to a lower level
of actual corruption. The implication for this study is that respondents from ethnically divided
areas may simply write off the quality of all public services and collective action outcomes without
regard to the true measure. This could lead to an upwardly biased β coefficient.
For the first set of falsification tests, I examine the impact of ethnic divisions on responses
to alternative survey questions. In addition to absent teachers, respondents are also asked in the
Afrobarometer survey whether they have encountered six other problems related to their local
public schools in the previous 12 months. They are: “services are too expensive,” “poor conditions
of facilities,” “overcrowded classrooms,” “poor teaching,” “lack of textbooks and other supplies,” and
“demands for illegal payments.” The questions are framed in exactly the same manner as the way
in which the dependent variable is framed, and all are sequenced together. If it is the case that
residents of ethnically divided areas have a common proclivity to overstate problems with public
goods provision, and if that is driving the main result of this section, then we should observe a
similar effect of the interaction term on all—or at least some—of the other six variables.
In Table 6, I present the effects of ethnic divisions on all seven variables, each normalized to have
a mean value of 0 and a standard deviation of 1 for comparison.27 Using the most comprehensive
specification presented thus far (from Table 5, column (6)), I show that district-level ethnic divisions
only have a significantly positive effect on teacher absenteeism. This comprehensively rules out the
possibility that a common error component of all seven measures is significantly correlated to ethnic
divisions. The tests also suggest that the channel through which ethnic divisions ultimately affect
teacher absenteeism does not apply to the other outcomes (a proposition corroborated in Section 6).
A possible explanation for this may be reflected in the final row of the table, which shows that five
of the six additional outcomes have higher within-country correlations than teacher absenteeism,
albeit by small margins. This indicates that the organisation of these outcomes may take place at
a more centralized level than teacher attendance.28 As I discuss in the introduction, absenteeism
is rarely sanctioned by official sources, and is thus likely to be affected by more local factors.
The next set of falsification tests follows a similar line of reasoning to a set presented by Olken
(2009). It is based on a simple premise: if people in ethnically divided areas systematically overstate
27This does not affect the interpretation of the test results.
28The only outcome with a lower intra-country correlation, the demand for illegal payments, is positively affected by ethnic divisions in a specification with no pre-colonial fixed effects (not reported).
corruption, or any other metric representing the quality of public goods or collective action, then
perceptions of common, national-level indicators should significantly differ between divided and
undivided areas. If they do not overstate these measures, their responses should not significantly
differ from those in undivided areas, as both groups are assessing the same phenomenon.
I run this test using six variables that measure responses to questions about national-level
governance. The results are presented in Table 7. In columns (1) and (2), the dependent variables
measure perceived corruption in the offices of government and the president respectively; in columns
(3) and (4) the dependent variables measure the trust held by respondents in the ruling party and
in the main opposition party respectively; and in columns (5) and (6) the dependent variables
measure respondents’ assessment of the manner in which the government is handling corruption
and education respectively. All six dependent variables are measured on four point scales.
In every case, the responses of individuals in ethnically divided (or indeed ethnically diverse or
ethnically salient) districts do not differ from the responses of those in undivided areas. This provides
further evidence that the effect of ethnic divisions on perceived teacher absenteeism cannot be
explained by this source of measurement error.29
In Table 8, I present the final set of tests for the robustness of the results to non-classical
measurement error in the dependent variable. I first test the hypothesis that only minority group
members in each district have higher perceptions of teacher absenteeism. I do this by including
a binary variable that indicates whether or not an individual is a member of a non-modal group
within their district. The result, shown in column (1), indicates that minority group members are
not driving the main findings.
In columns (2) and (3) I present the results of a test which attempts to hold the true level
29Readers familiar with the Afrobarometer surveys will be aware of several questions that probe respondents for their opinion on a multitude of political and social issues. I chose questions that were most likely to elicit views on strictly national-level characteristics that plausibly affect individuals in diverse and homogeneous areas equally. Nevertheless, I could have presented alternative variables in each of the three categories shown in Table 7 that arguably fit the criteria. Under corruption, respondents were also asked about judges and magistrates; under trust, respondents were also asked about parliament, the president, the national elections commission and courts of law; and under government performance, respondents were also asked about crime, health, water and food. I rerun the test with all of these alternatives, and find that the only question for which respondents in divided areas answered differently to others was on the government’s handling of crime, which itself is likely to be influenced at least partially by local factors.
of teacher absenteeism constant, thereby isolating the variation in the individual error
component. Recall that TAid = TA∗sd + ui. If I hold TA∗sd constant by controlling for school fixed
effects, I can then observe the relationship between ethnic salience and the error component only.
As there are no school-level fixed effects in the dataset, I instead use village-level fixed effects.
The village (or “primary sample unit”) is the most granular level above the individual in the Afro-
barometer. It has a modal sample size of 8 respondents. To the extent that all 8 refer to the same
school in these surveys, controlling for village fixed effects will allow me to uncover the statistical
relationship between the error component of the dependent variable and an assortment of individual
characteristics.
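The logic of this test can be sketched with simulated data. The sketch assumes that all respondents in a village report on the same school, so that true absenteeism is common within a village, and it generates an error component that is unrelated to individual salience by construction. The sample sizes, variable names and distributions are hypothetical.

```python
import numpy as np


def within_transform(values, groups):
    """Subtract each group's mean, which absorbs group fixed effects."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = values.copy()
    for g in np.unique(groups):
        mask = groups == g
        out[mask] -= values[mask].mean()
    return out


rng = np.random.default_rng(1)
n_villages, per_village = 300, 8   # 8 mirrors the modal village sample size
village = np.repeat(np.arange(n_villages), per_village)
true_absence = rng.normal(size=n_villages)[village]   # TA*, shared within a village
salience = rng.integers(0, 2, size=village.size).astype(float)
noise = rng.normal(size=village.size)                 # u_i, unrelated to salience
reported = true_absence + noise

# Demeaning by village removes TA* and leaves only the individual error.
y_w = within_transform(reported, village)
x_w = within_transform(salience, village)
slope = float((x_w @ y_w) / (x_w @ x_w))   # fixed-effects slope on salience
```

Because demeaning wipes out everything shared within a village, any remaining association between salience and reported absenteeism must operate through the individual error component.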
In column (2) I provide strong evidence that individual ethnic salience is not related to the
individual error component of the dependent variable. In column (3) I include an interaction term
between individual salience and district-level ELF. Although the inclusion of village fixed effects
implies that district-level ethnic diversity is held constant, this is still a strictly better test than
the one presented in column (2), as the interaction term gives a closer approximation to the true
district-level interaction effect in a test without fixed effects. Again, the results suggest that there
is no significant relationship between ethnic divisions and the error component of the dependent
variable.
Finally, I show that the explanation put forward by Olken (2009) for the link between ethnic
diversity and the overestimation of corruption is not applicable in this context. Recall that, in
Indonesia, attendance rates at community oversight meetings were significantly higher in hetero-
geneous communities, which led to lower levels of actual corruption. A key part of that story lies
in the fact that these meetings were facilitated as part of the nationwide road building project. It
is unclear that heterogeneous communities would have monitored the project as effectively in the
absence of this exogenous provision of community fora. I show in column (4) that, in this sample,
members of ethnically divided communities do not attend community meetings more frequently
than those in relatively undivided communities. As in Table 6, this test has implications for our
understanding of the channel or channels that explain the reduced-form relationship.30
30Respondents are asked if they attend community meetings. The dependent variable ranges on a four point scale
4.1.3 Other Robustness Checks: Sample Issues and Functional Form
In this sub-section, I present a final set of robustness tests for the multi-country analysis. In column
(1) of Table 9, I use district-level averages for all variables to show that missing values for some
covariates are not affecting the estimation of β.
In column (2), I run the analysis on a sub-sample of respondents who might be expected to
have a better grasp of true teacher absenteeism TA∗sd: mothers. Case and Ardington (2006) show
using panel data that maternal deaths had a substantially more negative effect on a wide range of
schooling outcomes than paternal deaths amongst a sample of Zulu children in South Africa, while
paternal deaths had a larger impact on other socio-economic factors. This could be reflective of
a higher maternal involvement in children’s schooling. In such a case, the absence of a significant
interaction effect in a sub-sample of women between the ages of 25 and 50 would cast some doubt
on the validity of the results, as one would expect mothers to have a more accurate response to the
question on perceived teacher absenteeism. The results show that the interaction effect is larger
and is estimated with more precision.
In columns (3) and (4), I provide evidence that the results are not driven by ELF measures that
are calculated from small sample sizes, and thus have little statistical power. Indeed, as the table
shows, the results are more robust for districts that have above-median sample sizes than districts
with below-median sample sizes.
In column (5), I present the results of an ordered probit model, given that (i) the steps between
each response category in the dependent variable may not be constant; and (ii) the variable has
a limited range. In order to facilitate the interpretation of the interaction term, I show in Table
10 the marginal effects of ethnic diversity on the probability of each response at a high level of
ethnic salience (again, mean plus one standard deviation) and at a low level of ethnic salience
(mean minus one standard deviation). The results show clearly that, at high ethnic salience, ethnic
diversity significantly decreases the probability of a “Never” response, and significantly increases
the probability of individuals reporting “Once or twice”, “A few times” and “Often”. At low ethnic
from “no, would never do this” to “yes, often”
salience, ethnic diversity has no impact on the probability of any response. The results are consistent
with the linear results presented throughout the section.
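To see how a change in diversity shifts the probability of each response in an ordered probit, consider the following sketch. The cutpoints and coefficients are hypothetical values chosen for illustration and are not the estimates behind Table 10. The effect is computed as a discrete change in ELF rather than an analytical marginal effect.

```python
import math

CATEGORIES = ["Never", "Once or twice", "A few times", "Often"]
CUTPOINTS = [0.0, 0.6, 1.2]   # hypothetical thresholds on the latent scale


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(index, cutpoints):
    """Probability of each ordered category given a linear index x'b and
    increasing cutpoints c_1 < ... < c_{K-1}."""
    cdf = [0.0] + [norm_cdf(c - index) for c in cutpoints] + [1.0]
    return [cdf[k + 1] - cdf[k] for k in range(len(cdf) - 1)]


def index_at(elf, salience, lam=-0.15, beta=1.3):
    """Hypothetical latent index: lam * ELF + beta * (salience * ELF)."""
    return lam * elf + beta * salience * elf


def elf_effect(salience, elf_lo=0.2, elf_hi=0.8):
    """Change in each response probability when ELF rises from elf_lo to
    elf_hi, holding ethnic salience fixed."""
    p_lo = ordered_probit_probs(index_at(elf_lo, salience), CUTPOINTS)
    p_hi = ordered_probit_probs(index_at(elf_hi, salience), CUTPOINTS)
    return [hi - lo for hi, lo in zip(p_hi, p_lo)]


high = elf_effect(0.33)    # mean plus one standard deviation
low = elf_effect(0.004)    # mean minus one standard deviation
```

The changes across the four categories always sum to zero, so a lower probability of "Never" must be offset by higher probabilities elsewhere on the scale.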
In summary, this section presents robust evidence that the statistical association between ethnic
divisions and perceived teacher absenteeism in the Afrobarometer multi-country dataset is neither
caused by omitted variables, an error component in the dependent variable nor the imposition of
a linear functional form on the relationship. In the next section, I extend the results to a sample
using objective data on recorded absenteeism in Uganda.
5 Estimation: Ugandan Dataset
Kremer et al. (2005) and Habyarimana (2010) analyse the determinants of teacher absenteeism
using sub-samples of the Chaudhury et al. (2006) data collected during random, unannounced
school visits in 2002 and 2003. Here, I use an almost identical empirical specification in order to
establish the relationship between the probability of a teacher’s absence and ethnic divisions in
Uganda. The basic estimation equations take the following form:
Pr(TAjsd = 1) = a+ΨESd + λELF + β(ESd ∗ ELF ) + γXjsd + δSsd + ηTtdm + ejsd (2)
for the linear probability estimation, and:
Pr(TAjsd = 1) = Φ[a+ΨESd + λELF + β(ESd ∗ ELF ) + γXjsd + δSsd + ηTtdm + ejsd] (3)
for the probit estimation, where ESd is the district-level mean ethnic salience, $\frac{1}{n_d}\sum_{i=1}^{n} ES_{id}$, from the Afrobarometer sample, and ELF is
either district-level ethnic diversity ELFd or school-level ethnic diversity ELFs, as described in
Section 3.1. Teacher-level characteristics, represented by Xjsd, include gender, age, marital status,
education, place of birth, employment rank, experience, contract status, union status, and career
training; Ssd is a set of school level characteristics, comprised of controls for institutional features,
such as the existence of parent teacher associations (PTAs), access and the quality of its facilities;
Ttdm is three sets of fixed effects for the time, day and month of the visit. All covariates are
described in more detail in the notes beneath Table 11.
5.1 Results
I present in Table 11 the estimation results for equations (2) and (3). In columns (1) and (2), I
omit school-level characteristics, which are added in columns (3) and (4). All four specifications
are estimated with a linear probability estimator. In each model, the interaction effect is large
and significant. To illustrate, a one standard deviation increase in district-level ethnic diversity at
a high level of ethnic salience (again defined as the mean plus one standard deviation) raises the
probability of a teacher not turning up to a scheduled class by 9.3 percentage points. At a low
level, the same increase of diversity actually decreases the probability of absence by 6.7 percentage
points. An even larger interaction effect is present in the school-level diversity data: a one standard
deviation increase in ELFs at a high level of ethnic salience increases the probability of absence by
3.8 percentage points; while at a low level of salience the same change in diversity decreases the
probability of absence by almost 17.7 percentage points.
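The arithmetic behind these comparisons follows from the linear specification: the derivative of the absence probability with respect to ELF is λ + β·ESd, so the effect of a given change in diversity depends on the level of salience at which it is evaluated. The sketch below uses hypothetical coefficients and moments, chosen only to reproduce the sign flip; they are not the estimates in Table 11.

```python
def effect_of_diversity(lam, beta, salience, elf_change):
    """Change in Pr(absent) from raising ELF by elf_change at a given ES_d.

    In the linear model the derivative with respect to ELF is lam + beta * ES_d.
    """
    return (lam + beta * salience) * elf_change


# Hypothetical values for illustration only.
lam, beta = -0.3, 2.0          # coefficients on ELF and on ES_d * ELF
es_mean, es_sd = 0.2, 0.1      # mean and standard deviation of ES_d
elf_sd = 0.25                  # one standard deviation of ELF

high = effect_of_diversity(lam, beta, es_mean + es_sd, elf_sd)
low = effect_of_diversity(lam, beta, es_mean - es_sd, elf_sd)
print(f"high salience: {high:+.3f}, low salience: {low:+.3f}")
```

With these made-up numbers the effect changes sign at ESd = -λ/β = 0.15, which is why diversity can raise absence where salience is high and lower it where salience is low.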
In columns (4) and (5), I present results from the probit model. Ai and Norton (2003) highlight
the dangers of misinterpreting the effects of an interaction term in non-linear models. They show
how the marginal effect of an interaction term can have a different magnitude, sign and level of
statistical significance than the true cross-partial derivative.31 The results of their corrected method,
interpreted as the average interaction effect across all observations, support the linear results.32
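To make the Ai and Norton (2003) point concrete, the sketch below compares the naive quantity β12·φ(u) with the cross-partial derivative of Φ(u) for a probit index u = b0 + b1·x1 + b2·x2 + b12·x1·x2 with two continuous regressors. The coefficients and data points are hypothetical, other covariates are omitted, and the code is an illustration of the formula rather than the routine used for Table 11.

```python
import math


def norm_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def norm_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def index(x1, x2, b0, b1, b2, b12):
    return b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2


def probit_prob(x1, x2, b0, b1, b2, b12):
    return norm_cdf(index(x1, x2, b0, b1, b2, b12))


def naive_interaction(x1, x2, b0, b1, b2, b12):
    """Marginal effect of the interaction term alone: b12 * phi(u)."""
    return b12 * norm_pdf(index(x1, x2, b0, b1, b2, b12))


def cross_partial(x1, x2, b0, b1, b2, b12):
    """d^2 Phi(u) / dx1 dx2, using phi'(u) = -u * phi(u)."""
    u = index(x1, x2, b0, b1, b2, b12)
    return norm_pdf(u) * (b12 - u * (b1 + b12 * x2) * (b2 + b12 * x1))


def average_interaction_effect(rows, coefs):
    """Mean of the cross-partial over all observations."""
    return sum(cross_partial(x1, x2, *coefs) for x1, x2 in rows) / len(rows)


coefs = (-0.5, 1.0, 1.0, 0.5)                 # hypothetical b0, b1, b2, b12
rows = [(1.0, 1.0), (0.2, 0.4), (0.0, 0.0)]   # hypothetical (x1, x2) pairs

naive = naive_interaction(1.0, 1.0, *coefs)
correct = cross_partial(1.0, 1.0, *coefs)
average = average_interaction_effect(rows, coefs)
```

At the point (1, 1) the naive quantity is positive while the cross-partial is negative, which is the kind of sign reversal the corrected method guards against.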
It is possible that these symmetrical effects, i.e., the negative effects of district- and school-level
diversity on absenteeism where salience is low, reflect two forms of sorting mechanisms that are
described respectively by Glennerster et al. (2012) and Miguel and Gugerty (2005). The first case
31 I explain this issue in more detail in Appendix B.
32 The Ai and Norton (2003) method requires that I drop the fixed effects for the time of day from the model. An equivalent linear model produces quantitatively similar results, supporting the interpretation presented above. The remaining marginal effects are taken from the full model.
concerns residential sorting, whereby certain coethnics choose to live in clusters. By implication,
the remaining individuals—those who do not cluster by ethnicity—form more diverse communities
that are likely to have low levels of ethnic salience. Glennerster et al. (2012) find that ‘non-sorters’
in diverse areas of Sierra Leone tend to have higher levels of educational attainment, which I find
in the analysis to be strongly linked with lower absenteeism. Moreover, we can see from Figures 4
and A1 that Uganda has one of the highest rates of sorting amongst the entire 16-country sample:
the mean level of ethnic diversity by district is less than half the mean level of ethnic diversity for
the country as a whole.
The symmetrical effect at the school-level may be further compounded by a more straightforward
sorting mechanism described in Miguel and Gugerty (2005), whereby good schools (which are likely
to have teachers who attend class) attract pupils from a wider radius, leading to a positive link
between diversity and, in this case, teacher attendance. It is interesting to note also that teacher
absenteeism is lower in homogeneous areas where people express a high sense of common ethnic
identity, as implied by the significantly negative independent effect of ethnic salience. Again,
sorting may explain the finding, as homogeneous communities formed from deliberate sorting may
be better equipped for the type of collective action needed to provide local public goods than other
homogeneous communities.
In any case, the impact of ethnic diversity on teacher absenteeism at a high level of ethnic
salience is positive and significant in all specifications. Whether measured at the district level or
at the school level, the effect of ethnic diversity on the probability of a teacher being absent from
class is significantly higher in districts where individuals are more likely to identify themselves along
ethnic lines.
6 Testing for Channels of Causality
In this section, I attempt to identify the channel or channels through which ethnic divisions affect
teacher absenteeism using the Ugandan teacher-level data. As I discuss in the introduction, ethnic
diversity can affect local cooperation and teacher behavior in different ways. Below, I consider the
three prominent theories given in the literature to explain why ethnic diversity undermines the
provision of public goods in this context.
First, coethnics may have more concern for the utility of each other than for the utility of non-
coethnics (Becker 1957, 1974; Hjort 2012). This could result in native teachers having a higher
preference for increasing the wellbeing of pupils and parents than ‘outsider’ teachers, which may
explain higher absenteeism rates in ethnically divided schools or districts.
A second, more broadly-supported theory concerns the credibility of the threat of social sanctions
in ethnically diverse communities (Miguel and Gugerty, 2005; Habyarimana et al., 2007, 2009). In
the absence of well functioning formal institutions, coethnics often rely on each other for public
goods, credit or other services. Because of this, the costs of uncooperative behavior towards a
coethnic are likely to be higher than the costs of uncooperative behavior towards a non-coethnic.
In the case of teacher absenteeism, this could have two consequences: (i) diverse communities may
lack the capacity for coordination needed to develop institutions of oversight for all teachers, such as
effective parent teacher associations or other means of sanctioning; and (ii) ‘outsider’ (or non-native)
teachers may not view the threat of sanctions from native (i.e. non-coethnic) parents, communities
or head teachers as credible. Although any of these mechanisms could potentially explain the
association between ethnic divisions and teacher absenteeism, recent evidence on teacher incentives
in developing countries suggests strongly that official sanctions pose little threat to teachers, and,
moreover, parents are unlikely to exert pressure or enforce attendance.
A third mechanism follows from Alesina and La Ferrara (2000), who find that participation rates
in various social, professional and religious organizations in the USA can be explained partially by
ethnic heterogeneity. They show that, conditional on individuals exhibiting a strictly positive
preference toward socializing with coethnics, an increase in the heterogeneity of a population will
decrease the formation of and participation in social groups. The implication in this case is that
teachers in divided areas may be less inclined to socialize together. This could affect attendance
decisions in at least three ways. First, teachers who socialize together may develop an in-group
altruism that directly increases the utility of attending school together. Second, the altruism may
manifest itself in a concern for colleagues’ potential obligations to mind unsupervised children in the
case of a teacher’s absence. Third, group members are likely to pose a credible threat of sanctioning
in the form of social ostracism that is not available to non-group members.
To investigate these possibilities, I use data on sanctioning institutions and the origins and social
activities of teachers to test the following three channels:
• Channel 1 Teacher coethnicity : teacher absenteeism is less probable if the teacher is ‘native’.
If this is rejected, i.e., if φ is not negative, we can say with some confidence that the ‘taste’
mechanism is unlikely to be driving the main result.
$$TA_{jsd} = a + \varphi \mathit{Native}_{jsd} + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd} \qquad (4)$$
• Channel 2a Sanctioning (homogeneous effect): teacher absenteeism is negatively determined
by effective sanctioning institutions, i.e., ϕ is negative.
$$TA_{jsd} = a + \phi \mathit{Sanction}_{sd} + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd} \qquad (5)$$
• Channel 2b Sanctioning (heterogeneous effect): teacher absenteeism is negatively determined
by effective sanctioning institutions, conditional on the teacher being ‘native’, i.e., θ is
negative.
$$TA_{jsd} = a + \varphi \mathit{Native}_{jsd} + \phi \mathit{Sanction}_{sd} + \theta (\mathit{Native}_{jsd} \ast \mathit{Sanction}_{sd}) + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd} \qquad (6)$$
• Channel 3 Social networks between teachers: teacher absenteeism is negatively determined
by the extent to which teachers socialize together, i.e., ς is negative.
$$TA_{jsd} = a + \varsigma \mathit{Social} + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd} \qquad (7)$$
In each case, the extent to which the channel under investigation explains the main result depends
on β, the interaction effect that describes the relationship between ethnic divisions and teacher
absenteeism. If this interaction effect loses all statistical significance, it is likely that the relevant
channel explains all of the relationship.
In all tests, I present linear probability models. Marginal effects derived from probit models
produce qualitatively similar results.
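The logic of these channel tests can be sketched with a small least-squares example: estimate the coefficient on the ethnic divisions term with and without the candidate channel variable and compare. The data below are fabricated so that the channel (here a 0/1 socializing indicator) fully accounts for absence, the solver is a bare-bones normal-equations routine written for the example, and standard errors are omitted; the sketch therefore compares point estimates only, whereas the tests in this section turn on statistical significance.

```python
def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k + 1):
                A[r][c] -= f * A[col][c]
    # Back substitution.
    b = [0.0] * k
    for i in reversed(range(k)):
        b[i] = (A[i][k] - sum(A[i][j] * b[j] for j in range(i + 1, k))) / A[i][i]
    return b


# Hypothetical school-level data.
divisions = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]   # stand-in for ES_d * ELF
answers = ["last week", "last month", "last year", "never",
           "last month", "never", "never", "never"]
social = [0 if a == "never" else 1 for a in answers]    # "never" = 0, otherwise 1
absence = [0.6 - 0.3 * s for s in social]               # absence driven by the channel only

without_channel = ols([[1.0, d] for d in divisions], absence)
with_channel = ols([[1.0, d, s] for d, s in zip(divisions, social)], absence)
print("beta without channel:", round(without_channel[1], 3))
print("beta with channel:", round(with_channel[1], 3))
```

In this contrived case the coefficient on divisions is positive when the channel variable is left out and falls to zero once it is included, which is the pattern that would indicate the channel explains the whole relationship.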
6.1 Teacher Coethnicity
In this sub-section, I test the hypothesis described in Channel 1: that native teachers are less
likely than ‘outsider’ teachers (or non-coethnics) to be absent from a class that they are scheduled
to teach. If this is the case, it may be reflective of either of the first two theories: natives may have a
higher degree of altruism (or ‘other-regarding preferences’) toward children and parents; or natives
may view the threat of sanctioning more credibly. If I find no evidence of a statistically significant
relationship, it implies that the ‘taste’ mechanism can be rejected as a unique explanation for the
link between ethnic divisions and teacher absenteeism.
I use three variables to measure whether or not a teacher is native. The first is a dummy variable
indicating whether or not a teacher speaks the local language natively. The second is a dummy
variable which indicates that the teacher’s ancestral home is in the same parish/city as the school.
The third is a dummy variable which indicates that the teacher was born in the local county. In
all specifications, I include as a control a categorical variable for where the teacher currently lives:
the included categories are the local county and the local district; the omitted category is the local
village.
In Table 12, I present the results using the district level and school level measures of ethnic
diversity respectively. Firstly, it is important to note that the interaction effect representing ethnic
divisions is large and statistically significant in all specifications, ruling out the possibility that the
full effect is explained by teacher-level ethnicity. In all specifications, native teachers are no more
likely to attend school than non-natives. This indicates that teachers do not make decisions about
attendance based on discriminatory altruism toward coethnics. Teachers who are natively fluent
in the local language are significantly more likely to be absent from school, whereas those who
are native by birth or by ancestry do not exhibit behavior that is statistically different from the
rest of the sample. In an auxiliary specification (unreported), I test for the heterogeneous effects
of coethnicity between ethnically divided and undivided districts/schools by including a triple
interaction term. This is to ensure that the average effects presented in Table 12 are not masking
significantly contrasting effects across districts or schools. The intuition is that non-native teachers
in otherwise homogeneous areas may have selected into the district or school because they are not
discriminatory. I also include triple interactions between the teacher’s current home and the ethnic
division components as controls. I find that ‘outsider’ teachers are no more likely to be absent than
local teachers either in divided or undivided districts and schools. This strongly suggests that the
‘taste’ explanation does not account for the main result of the paper.
6.2 Sanctioning Institutions
The possibility remains that ethnic diversity may affect teacher absenteeism through its impact
on either the effectiveness of sanctioning institutions (Channel 2a) or on the credibility of those
institutions’ threats in the eyes of ‘outsiders’ (Channel 2b). In Table 13, I show the effects of
parent teacher associations (PTAs), school inspections and the previous sanctioning behavior of
head teachers on teacher absenteeism. In columns (1) and (3), I include a dummy variable for the
existence of a parent teacher association, a categorical variable to indicate its last meeting (omitted
variable is “this month”), and a dummy variable to indicate that an inspection by the education
ministry had occurred within the previous six months. In columns (2) and (4), I include a dummy
variable to indicate that the head teacher has previously sanctioned a teacher for absence by either
dismissing, suspending or transferring her. I present this in a separate set of specifications due to
the obvious potential for reverse causality, as head teachers in schools with no absenteeism are not
required to sanction teachers.
The first point to note from the table is that the average effects of these institutions do not
represent likely candidates for the channel of causation from ethnic divisions to teacher absenteeism.
Across all four columns, the main interaction effect is positive, significant and stable, providing
indirect evidence that the sanctioning variables are not associated with ethnic divisions.
The other point to note is that teachers in schools which have had very recent PTA meetings (the
omitted group) are more likely to be absent than those whose annual meeting is perhaps approaching
soon. There is evidence also that the sanctioning history of head teachers is significantly endogenous.
In Table 14, I present the effects of these sanctioning institutions conditional on whether or not
a teacher is native by language (columns (1) and (4)), by ancestry (columns (2) and (5)) or by
birth (columns (3) and (6)). To recount: the threat of sanctions by PTAs, local ministry officials
or head teachers may be credible only to coethnic teachers, as non-coethnic teachers may be reliant
on other groups for public goods. If this is the case, the lack of significance in the average effects
presented in Table 13 may be masking significant heterogeneous effects.
Again, in all six specifications, the effect of ethnic divisions is positive, significant and stable,
while no new patterns emerge. Taken together, these results indicate that ethnic divisions do not
affect teacher absenteeism through the credibility of social sanctions.
6.3 Social Networks Between Teachers
To explore the possibility that social group participation may help to explain the main relationship
in the analysis, I use a measure of social capital amongst teachers in each school that is based on
answers given by the head teacher to the following survey question:
When was the last time that the teachers in your school socialized with each other
outside of school hours, gathering for a meal or party for example?
I let “never” equal 0, and all other responses equal 1. The responses almost split the sample evenly:
54% of head teachers report that teachers in their school never socialize together, while
46% report that they do. I present simple OLS covariates in columns (1) and (2) of Table A1 in
Appendix A, which reveal that only ethnic divisions exhibit a statistically significant relationship
with the measure across both specifications.33 This can be interpreted as a necessary first-stage
condition for the validity of this explanation.
In Table 15, I present the results for specification (7). Moving from a school where teachers
never socialize to one in which they have socialized at least once in the memory of the head teacher
decreases the probability of a teacher’s absence by between 11.2 and 12.5 percentage points (or 19.2
and 21.7 percentage points based on marginal effects from a probit estimation). The inclusion of this
variable in the model eliminates entirely the significance of ethnic divisions using the district-level
measure, while also reducing substantially the magnitude and the significance of the effect of ethnic
divisions as measured at the school level. This provides support for the explanation that ethnic
divisions increase teacher absenteeism through the strength of social networks between teachers
within schools.
In Table 16, I examine whether or not these effects depend on the ethnicity of the teacher. The
average effect presented in Table 15 may be masking significant differences in the effects of social
networks on attendance between native and non-native teachers. I find no significant heterogeneous
effect when I proxy for coethnicity using variables based on birth or language, although when I use
the ancestral definition I find a positive effect. In all cases, the main results are similar to those
presented in Table 15.
Our ability to interpret these findings with certainty is limited by the data. Without knowing
the extent to which social capital is endogenous in the model,34 we cannot say with full confidence
33 For this, I run a school-level regression of Social on the right-hand side variables included in empirical
specification (2). Mean values are taken where necessary.
34 For example, it could be that high absenteeism creates animosity amongst colleagues, which may in turn reduce
the likelihood of social participation.
how exactly these social networks are related to absenteeism. However, it is possible to conduct
instructive falsifiability tests. If the causal channel is as mentioned, then we must expect to see a
significant role for ethnic divisions between teachers within schools. Although the main analysis was
conducted using measures based on the diversity of survey respondents within districts and of pupils
within schools, there remain a number of useful proxies for the diversity of teachers within schools.
If ethnic divisions amongst teachers are not significantly associated with higher absenteeism, we
can immediately reject our interpretation of this channel.
In Table 17, I present the effects of five measures of teacher divisions. In column (1), I use a
Herfindahl concentration index based on the regional origins of each respondent;35 in column (2),
I use a Herfindahl concentration index based on the fluency of respondents in the local language36;
in column (3) I use the share of non natively-fluent teachers within the school; in column (4) I
use the share of non-native teachers by ancestry; and in column (5) I use the share of non-native
teachers by birth. In each case, we see a significant and substantially positive effect of school-level
divisions amongst teachers on absenteeism. In Table 18, I repeat the exercise with the inclusion of
the measure for social networks, and find results analogous to those in Table 15. In further support
of this interpretation, I run five additional school-level regressions of Social on each of the measures
for teacher divisions in columns (3) to (7) of Table A1, and find that the effect of ethnic divisions is
large, negative and, in all but two cases, statistically significant.37 Across a range of comprehensive
specifications, higher ethnic divisions between teachers are associated with fewer social interactions;
and fewer social interactions are associated with a higher probability of absenteeism.
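For readers unfamiliar with the measures used in Table 17, the sketch below computes a Herfindahl concentration index and a share of non-native teachers for a single school. The teacher records are invented, and whether the concentration index or its complement (one minus the index) enters the regressions is not stated here, so both are shown as an assumption-laden illustration.

```python
from collections import Counter


def herfindahl(categories):
    """Sum of squared group shares; equals 1.0 when all teachers share one group."""
    n = len(categories)
    return sum((count / n) ** 2 for count in Counter(categories).values())


def fractionalization(categories):
    """Complement of the Herfindahl index; higher values mean more diversity."""
    return 1.0 - herfindahl(categories)


def share_non_native(native_flags):
    """Fraction of teachers flagged as non-native (by language, ancestry or birth)."""
    return sum(1 for is_native in native_flags if not is_native) / len(native_flags)


# Hypothetical school with four teachers.
regions = ["East", "East", "West", "North"]
native_by_birth = [True, True, False, False]

hhi = herfindahl(regions)
diversity = fractionalization(regions)
non_native = share_non_native(native_by_birth)
```

Because these are school-level quantities, each teacher in a school would carry the same value of the measure into the teacher-level regressions.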
It is worth recalling that this interpretation of the results is also supported by the multi-country
analysis. In Table 6, we saw that no school characteristic other than teacher absenteeism was
35 The categories are North (19.86%), South (0.2%), West (19.75%), East (39.71%), Central (17.7%) and Sudan (2.77%). The share of respondents teaching in their native region is 67.16%. This is the preferred measure, as many of Uganda’s strongest social cleavages are regional. The measures that follow do not necessarily distinguish between groups.
36 Native fluency (65.93%), Fluent (14.05%), Very Good (5.97%), Good (5.92%), Functional (2.05%), Minimal (3.65%), Not able to speak well (2.43%).
37 The two specifications in which the effect is not significant are based on arguably the two least precise measures of teacher diversity: share of non-native teachers by ancestry and birth.
associated with ethnic divisions, including the condition of facilities and the supply of materials.
In Table 8, we saw that ethnic divisions have no statistical association with respondents’ likelihood
of attending community meetings. Both of these findings are consistent with the assertion that
ethnicity does not affect absenteeism through a more general effect on community-level collective
action.
Taken together, the evidence suggests that the source of variation in teacher absenteeism is not
only ethnic divisions per se; rather, it is the effect of ethnic divisions on the formation of social
groups amongst teachers within schools. It is possible that the density of these social networks
may in turn increase the social cost of absenteeism; or perhaps they may foster altruism between
colleagues. We can say with confidence that the role of ethnic divisions in either the formation
or the effectiveness of community monitoring institutions, such as parent teacher associations, or
direct monitoring institutions, such as ministry inspections or the sanctioning behavior of head
teachers, is inconsequential for teacher absenteeism; as is the direct effect of a teacher’s ethnicity.
These findings serve as a clear invitation for experimental research in the field, where randomly
generated variation in social activities for teachers (or in the composition of teachers within schools)
could allow us to delve further into the relationship between ethnic divisions, social capital and
absenteeism amongst teachers.
7 Conclusion
In this paper I present robust evidence of a link between ethnic divisions and teacher absenteeism
using sub-national data from two sources: a nationally representative series of random, unannounced
school visits in Uganda; and a large opinion survey of citizens in 16 sub-Saharan African countries.
The results are robust to a comprehensive set of individual, geographic, institutional and historical
controls. I introduce a new measure of ethnic divisions which captures both ethnic diversity and the
salience of ethnic identification. Using ethnic diversity alone, a practice common in the literature,
would lead to a significant underestimation of the true effect. Across a range of specifications, and
using various combinations of data, I find that ethnic diversity increases teacher absenteeism at
high levels of ethnic salience; while, at low levels of salience, it either decreases teacher absenteeism
or has no effect at all. I present evidence that the effect is unrelated to community monitoring
institutions such as parent teacher associations; it is better explained by the effect of ethnic
divisions on within-school teacher networks. The analysis provides a partial explanation for the
apparent existence of a large, non-pecuniary cost of teacher absence.
The results invite further experimental research in the field that could determine how social
capital amongst teachers ultimately affects attendance decisions. Moreover, the demonstrable
malleability of ethnic salience leaves room for direct policy responses to collective action failures in
ethnically divided areas, which could involve either fostering a common national identity or
suppressing the attraction of ethnic electioneering. In all, this paper increases our understanding of
high teacher absenteeism in poor, ethnically divided areas. In so doing, it points to new suggestions
which could strengthen the link between educational investment and educational attainment.
References
Acemoglu, D., S. Johnson, and J. A. Robinson, 2001. The Colonial Origins of Comparative Development: An Empirical Investigation. American Economic Review, 91(5), pp. 1369-401.
Ahlerup, P., and O. Olsson, 2012. The Roots of Ethnic Diversity. Journal of Economic Growth, 17(2), pp. 71-102.
Ai, C. R. and E. C. Norton, 2003. Interaction Terms in Logit and Probit Models. Economics Letters 80(1), pp. 123-129.
Alesina, A., R. Baqir and W. Easterly, 1999. Public Goods and Ethnic Divisions. Quarterly Journal of Economics, 114(4), pp. 1243-1284.
Alesina, A., A. Devleeschauwer, W. Easterly, S. Kurlat and R. Wacziarg, 2003. Fractionalization. Journal of Economic Growth, 8(2), pp. 155-194.
Alesina, A., and E. La Ferrara, 2000. Participation in Heterogeneous Communities. Quarterly Journal of Economics, 115(3), pp. 847-904.
Alesina, A., and E. La Ferrara, 2005. Ethnic Diversity and Economic Performance. Journal of Economic Literature, 63, pp. 762-800.
Ashraf, Q., and O. Galor, 2013. The “Out of Africa” Hypothesis, Human Genetic Diversity, and Comparative Economic Development. American Economic Review, 103(1), pp. 1-46.
Baldwin, K., and J. D. Huber, 2010. Economic Versus Cultural Differences: Forms of Ethnic Diversity and Public Goods Provision. American Political Science Review, 104(4), pp. 644-662.
Banerjee, A., R. Banerji, E. Duflo, R. Glennerster and S. Khemani, 2010. Pitfalls of Participatory Programs: Evidence from a Randomized Evaluation in Education in India. American Economic Journal: Economic Policy, 2(1), pp. 1-30.
Banerjee, A., and E. Duflo, 2006. Addressing Absence. Journal of Economic Perspectives, 20(1), pp. 117-132.
Becker, G., 1957. The Economics of Discrimination. Chicago: University of Chicago Press.
Becker, G., 1974. A Theory of Social Interactions. Journal of Political Economy, 82(6), pp. 1063-1093.
Cameron, A. C., J. Gelbach and D. Miller, 2011. Robust Inference with Multi-way Clustering. Journal of Business and Economic Statistics, 29(2), pp. 238-249.
Case, A., and C. Ardington, 2006. The Impact of Parental Death on School Outcomes: Longitudinal Evidence from South Africa. Demography, 43(3), August 2006, pp. 401-420.
Chandler, T., 1987. Four Thousand Years of Urban Growth: An Historical Census. Lewiston, NY: The Edwin Mellen Press.
Chandra, K., 2012. Constructivist Theories of Ethnic Politics. Oxford University Press: Oxford.
Chaudhury, N., J. Hammer, M. Kremer, K. Muralidharan and F. H. Rogers, 2006. Missing in Action: Teacher and Health Worker Absence in Developing Countries. Journal of Economic Perspectives, 20(1), pp. 91-116.
de Laat, J., M. Kremer and C. Vermeersch, 2008. Teacher Incentives and Local Participation. Draft.
Duflo, E., P. Dupas and M. Kremer, 2012. School Governance, Teacher Incentives and Pupil-Teacher Ratios: Experimental Evidence from Kenyan Primary Schools. NBER Working Paper No. 17939.
Duflo, E., R. Hanna and S. P. Ryan, 2012. Incentives Work: Getting Teachers to Come to School. American Economic Review, 102(4), pp. 1241-1278.
Dunning, T. and L. Harrison, 2010. Cross-cutting Cleavages and Ethnic Voting: An Experimental Study of Cousinage in Mali. American Political Science Review, 104(1), pp. 21-39.
Easterly, W. and R. Levine, 1997. Africa’s Growth Tragedy: Policies and Ethnic Divisions. Quarterly Journal of Economics, 111(4), pp. 1203-1250.
Eifert, B., E. Miguel and D. N. Posner, 2010. Political Competition and Ethnic Identification in Africa. American Journal of Political Science, 54(2), pp. 494-510.
Fearon, J. D., 2003. Ethnic and Cultural Diversity by Country. Journal of Economic Growth, 8(2), pp. 195-222.
Glennerster, R., E. Miguel and A. Rothenberg, forthcoming. Collective Action in Diverse Sierra Leone Communities. Economic Journal.
Habyarimana, J., 2010. The Determinants of Teacher Absenteeism: Evidence from Panel Data from Uganda. Draft, Georgetown University.
Habyarimana, J., M. Humphreys, D. N. Posner and J. M. Weinstein, 2007. Why Does Ethnic Diversity Undermine Public Goods Provision? American Political Science Review, 101(4), pp. 709-725.
Habyarimana, J., M. Humphreys, D. N. Posner, and J. M. Weinstein, 2009. Coethnicity: Diversity and the Dilemmas of Collective Action. Russell Sage: New York.
Hjort, J., 2011. Ethnic Divisions and Production in Firms. Draft, University of California, Berkeley.
Horowitz, D. L., 1985. Ethnic Groups in Conflict. Berkeley, CA: University of California Press.
Kremer, M. and D. Chen, 2001. Interim Report on a Teacher Incentive Program in Kenya. Draft.
Kremer, M. and A. Holla, 2009. Improving Education in the Developing World: What Have We Learned from Randomized Evaluations? Annual Review of Economics, 1, pp. 513-542.
Kremer, M., K. Muralidharan, N. Chaudhury, J. Hammer, F. H. Rogers, 2005. Teacher Absence In India: A Snapshot. Journal of the European Economic Association, 3(2-3), pp. 658-667.
La Ferrara, E., 2003. Kin Groups and Reciprocity: A Model of Credit Transactions in Ghana. American Economic Review, 93(5), pp. 1730-1751.
Laitin, D., and D. N. Posner, 2001. The Implications of Constructivism for Constructing Ethnic Fractionalization Indices. APSA-CP: The Comparative Politics Newsletter, pp. 13-17.
Michalopoulos, S., 2012. The Origins of Ethnolinguistic Diversity. American Economic Review, 102(4), pp. 1508-1539.
Michalopoulos, S., and E. Papaioannou, forthcoming. Pre-colonial Institutions and Contemporary African Development. Econometrica.
Miguel, E., 2004. Tribe or Nation? Nation-Building and Public Goods in Kenya versus Tanzania. World Politics 56, pp. 327-62.
Miguel, E., and M. K. Gugerty, 2005. Ethnic Diversity, Social Sanctions, and Public Goods in Kenya. Journal of Public Economics, 89, pp. 2325-68.
Montalvo, J. G., and M. Reynal-Querol, 2005. Ethnic Polarization, Potential Conflict, and Civil Wars. American Economic Review, 95, pp. 796-816.
Murdock, G. P., 1959. Africa: Its Peoples and Their Culture History. New York: McGraw-Hill.
Murdock, G. P., 1967. Ethnographic Atlas. Pittsburgh: University of Pittsburgh Press.
Nunn, N., and L. Wantchekon, 2012. The Slave Trade and The Origins of Mistrust in Africa. American Economic Review, 101(7), pp. 3221-52.
Olken, B., 2009. Corruption Perceptions vs. Corruption Reality. Journal of Public Economics, 93(7-8), pp. 950-964.
Posner, D. N., 2004a. The Political Salience of Cultural Difference: Why Chewas and Tumbukas Are Allies in Zambia and Adversaries in Malawi. American Political Science Review, 98(4), pp. 529-45.
Posner, D. N., 2004b. Measuring Ethnic Fractionalization in Africa. American Journal of Political Science, 48(4), pp. 849-863.
Pratham, 2006. Annual Status of Education Report.
Sangnier, M. and Y. Zylberberg, 2012. Protests and Beliefs in Social Coordination in Africa. Draft, Paris School of Economics, Sciences Po and Universitat Pompeu Fabra.
Uwezo, 2011. Uwezo East Africa Report.
Uwezo, 2012. Uwezo East Africa Report.
Vigdor, J. L., 2004. Community Composition and Collective Action: Analyzing Initial mail responses to 2000 Census. Review of Economics and Statistics, 86(1), pp. 303-312.
Figures
Figure 1: Location of Afrobarometer Districts
Figure 2: Location of Ugandan School Visits
[Plot: kernel density curves for Ethnic Salience and ELF; vertical axis: Density (0 to 4); horizontal axis: 0 to 1; kernel = epanechnikov, bandwidth = 0.0157]
Figure 3: Kernel Density Functions: Ethnic Salience and ELF, Multi-Country
[Plot: scatter of country means labelled BEN, BWA, GHA, KEN, LSO, MDG, MLI, MOZ, MWI, NAM, NGA, SEN, TZA, UGA, ZAF, ZMB; vertical axis: Ethnic Salience by district (0 to .4); horizontal axis: ELF by district (.2 to 1)]
Figure 4: ELF and Ethnic Salience by District, Country Means
[Plot: kernel density curve; vertical axis: Density (0 to 3); horizontal axis: ELF (School Visits), 0 to .8; kernel = epanechnikov, bandwidth = 0.0459]
Figure 5: Kernel Density Function: ELF in Uganda (School Visits)
[Plot: scatter of districts labelled Arua, Bugiri, Bushenyi, Jinja, Kamuli, Kisoro, Luwero, Mpigi, Tororo, Yumbe; vertical axis: Teacher Absence (Afrobarometer), 0 to 2; horizontal axis: Teacher Absence (school visits), .1 to .5]
Figure 6: Absenteeism in Uganda: Afrobarometer vs. School Visits
[Plot: vertical axis: Teacher absence (0 to 3); horizontal axis: Ethnic diversity (0 to 1); series: Ethnic salience at m+1sd, Ethnic salience at m-1sd]
Figure 7: Teacher Absence and Ethnic Divisions
Figure 8: Location of Armed Conflict Events (Source: Acled)
Tables
Table 1: Ethnic Salience in the Literature
Mean Ethnic Salience
Posner (2004) Test
            Malawi      Zambia      Difference
Chewa       0.31        0.05        0.26***
Tumbuka     0.16        0.07        0.09*
Miguel (2004) Test
            Tanzania    Kenya       Difference
            0.06        0.16        0.10***
Table 2: Descriptive Statistics
Teacher absence:                 Never     Once/twice  A few times  Often     Full Sample
ELFd                             0.48      0.44        0.43         0.44      0.46
                                 (0.30)    (0.29)      (0.27)       (0.26)    (0.29)
Ethnic Salience                  0.15      0.17        0.18         0.18      0.17
                                 (0.15)    (0.17)      (0.17)       (0.16)    (0.16)
Urban                            0.35      0.39        0.38         0.32      0.38
                                 (0.48)    (0.49)      (0.49)       (0.47)    (0.47)
Village facilities: school       0.78      0.80        0.81         0.82      0.80
                                 (0.41)    (0.40)      (0.39)       (0.39)    (0.40)
Village facilities: water        0.53      0.52        0.47         0.43      0.51
                                 (0.50)    (0.50)      (0.50)       (0.49)    (0.50)
Village facilities: electricity  0.52      0.58        0.52         0.47      0.54
                                 (0.50)    (0.49)      (0.50)       (0.50)    (0.50)
Village facilities: health       0.46      0.50        0.52         0.47      0.49
                                 (0.50)    (0.50)      (0.50)       (0.50)    (0.50)
Village facilities: sewage       0.23      0.28        0.24         0.20      0.24
                                 (0.42)    (0.45)      (0.42)       (0.40)    (0.43)
Respondent characteristics
Hardship                         7.51      8.27        9.20         10.45     8.69
                                 (5.66)    (5.41)      (5.52)       (5.79)    (5.90)
Age                              38.25     36.53       35.18        35.98     36.53
                                 (14.72)   (14.27)     (13.14)      (13.58)   (14.76)
Male                             0.48      0.51        0.52         0.54      0.50
                                 (0.50)    (0.50)      (0.50)       (0.50)    (0.50)
Post-primary education           0.49      0.50        0.48         0.44      0.63
                                 (0.49)    (0.50)      (0.50)       (0.50)    (0.48)
Observations                     6,755     2,521       2,791        2,033     21,598
Urban indicates the percentage of respondents surveyed in urban areas; Village facilities indicates the percentage of respondents surveyed in villages containing a school, piped water, electricity, a health clinic and a sewage system, respectively; Hardship is a composite variable ranging from 0-30, where 0 indicates that respondents never go without food, water, medical care, cooking fuel, a cash income, and school supplies like fees, uniforms or books, and 30 indicates that they always do. Post-primary education is the average of a dummy variable indicating that respondents have received any form of post-primary education.
Table 3: Teacher Absenteeism and Ethnic Divisions - Afrobarometer
(1) (2) (3) (4)
Teacher absence: Afrobarometer
ELF * Ethnic salience 1.008** 1.193***
    (0.403) (0.425)
ELF -0.002 -0.066 0.140 0.125
    (0.107) (0.115) (0.085) (0.084)
Ethnic salience -0.082 -0.073 0.319*
    (0.166) (0.178) (0.187)
Village controls No Yes Yes Yes
Individual controls Yes Yes Yes Yes
Region FE Yes Yes Yes Yes
Observations 13,468 12,240 12,240 12,330
Number of clusters 318/1234 318/1234 318/1234 318/1234
R-squared 0.177 0.176 0.175 0.174
Standard errors are adjusted for two-way clustering within ethnic groups and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level. Regression equation:
$TA_{idr} = a + \Psi \frac{\sum_{i=1}^{n} ES_{id}}{n_d} + \lambda ELF_d + \beta \left( \frac{\sum_{i=1}^{n} ES_{id}}{n_d} \ast ELF_d \right) + \gamma X_{id} + \delta V_{id} + \eta R_r + e_{id}$
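The district-level regressors in the equation above can be illustrated with a short sketch. It assumes that ELF is the standard Herfindahl-based fractionalization index (one minus the sum of squared group shares) and that individual ethnic salience is coded as a 0/1 indicator; neither detail is stated in this part of the paper. The function names and the input layout are hypothetical.

```python
from collections import Counter, defaultdict


def elf(groups):
    """Fractionalization index: 1 minus the sum of squared group shares.

    Assumption: this is the standard Herfindahl-based ELF definition.
    """
    n = len(groups)
    counts = Counter(groups)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def district_regressors(respondents):
    """Build district-level ES mean, ELF and their interaction.

    respondents: iterable of (district, ethnic_group, ethnic_salience)
    tuples, where ethnic_salience is assumed to be a 0/1 indicator.
    """
    by_district = defaultdict(list)
    for district, group, salience in respondents:
        by_district[district].append((group, salience))

    out = {}
    for district, rows in by_district.items():
        # District mean of individual ethnic salience: sum(ES_id) / n_d
        mean_es = sum(s for _, s in rows) / len(rows)
        elf_d = elf([g for g, _ in rows])
        out[district] = {
            "ES_mean": mean_es,
            "ELF": elf_d,
            "ES_x_ELF": mean_es * elf_d,  # interaction term multiplied by beta
        }
    return out
```

For a hypothetical district with two respondents from different groups, one of whom reports ethnic salience, this gives a mean salience of 0.5, an ELF of 0.5 and an interaction value of 0.25.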
Table 4: Cultural and Institutional Persistence
(1) (2) (3)
Teacher absence: Afrobarometer
ELF * Ethnic salience 1.027** 1.306*** 1.193**
    (0.474) (0.465) (0.533)
ELF -0.018 -0.165 -0.153
    (0.129) (0.116) (0.134)
Ethnic salience -0.049 -0.078 -0.075
    (0.184) (0.202) (0.229)
Slave exports -0.025
    (0.033)
Pre-colonial juris. hierarchy -0.061**
    (0.026)
Existence of city in 1400 0.091*
    (0.047)
Pre-colonial FE No No Yes
Ethnicity FE No Yes Yes
Village controls Yes Yes Yes
Individual controls Yes Yes Yes
Region FE Yes Yes Yes
Observations 10,772 12,240 11,593
Number of clusters 318/1234 318/1234 318/1234
R-squared 0.185 0.204 0.228
Standard errors are adjusted for two-way clustering within ethnic groups and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 5: Conflict, Sorting and Settlement History
(1) (2) (3) (4) (5) (6)
Teacher absence: Afrobarometer
ELF * Ethnic salience 1.299*** 1.192** 1.307*** 1.184** 1.344*** 1.201**
    (0.462) (0.532) (0.463) (0.527) (0.481) (0.529)
ELF -0.157 -0.149 -0.154 -0.128 -0.176 -0.138
    (0.116) (0.134) (0.111) (0.131) (0.117) (0.134)
Ethnic salience -0.076 -0.077 -0.079 -0.077 -0.092 -0.067
    (0.201) (0.228) (0.200) (0.227) (0.205) (0.226)
Armed conflicts within 1km & 1 year -0.010 -0.008 -0.008 -0.008
    (0.010) (0.007) (0.007) (0.007)
Conflict fatalities within 1km & 1 year 0.005** 0.004*** 0.004*** 0.004***
    (0.002) (0.001) (0.001) (0.001)
Share of historical migrants 0.025 0.049 0.052
    (0.057) (0.066) (0.068)
Distance to Addis Ababa (km) -0.000 -0.000
    (0.000) (0.000)
Latitude 0.010 0.013
    (0.023) (0.026)
Distance to sea (km) -0.000 -0.000
    (0.000) (0.000)
Pre-colonial FE No Yes No Yes No Yes
Ethnicity FE Yes Yes Yes Yes Yes Yes
Village controls Yes Yes Yes Yes Yes Yes
Individual controls Yes Yes Yes Yes Yes Yes
Region FE Yes Yes Yes Yes Yes Yes
Observations 12,240 11,593 12,240 11,593 12,044 11,593
Number of clusters 318/1234 318/1234 318/1234 318/1234 318/1234 318/1234
R-squared 0.204 0.228 0.204 0.228 0.204 0.228
Standard errors are adjusted for two-way clustering within ethnic groups and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 6: Falsification Tests - Other School Variables
(1) (2) (3) (4) (5) (6) (7)
Perceived school problems (normalized)
Teacher abs. Expensive Facilities Crowding Teaching Materials Bribes
ELF * Ethnic salience 1.073** -0.105 -0.575 -0.131 0.127 -0.725* 0.383
    (0.473) (0.382) (0.374) (0.389) (0.375) (0.387) (0.395)
ELF -0.123 0.012 0.235** 0.064 0.070 0.098 -0.138
    (0.119) (0.098) (0.096) (0.101) (0.110) (0.101) (0.108)
Ethnic salience -0.060 0.206 0.223 0.051 0.044 0.390* 0.096
    (0.202) (0.168) (0.202) (0.207) (0.161) (0.211) (0.206)
Spatial controls Yes Yes Yes Yes Yes Yes Yes
Pre-colonial FE Yes Yes Yes Yes Yes Yes Yes
Ethnicity FE Yes Yes Yes Yes Yes Yes Yes
Village controls Yes Yes Yes Yes Yes Yes Yes
Individual controls Yes Yes Yes Yes Yes Yes Yes
Region FE Yes Yes Yes Yes Yes Yes Yes
Observations 11,593 12,071 11,589 11,634 11,479 11,754 11,568
Number of clusters 318/1234 318/1234 318/1234 318/1234 318/1234 318/1234 318/1234
R-squared 0.228 0.266 0.294 0.265 0.255 0.259 0.230
Intra-country correlation 0.077 0.092 0.122 0.118 0.103 0.088 0.074
Standard errors are adjusted for two-way clustering within ethnic groups and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 7: Falsification Tests - Governance Variables
(1) (2) (3) (4) (5) (6)
Perceptions of country-level institutions
Corruption Trust Gov. performance
Government President Ruling party Opposition Corruption Education
ELF * Ethnic salience -0.093 -0.452 0.511 0.137 -0.452 0.081
    (0.386) (0.343) (0.403) (0.358) (0.391) (0.273)
ELF 0.053 0.056 -0.102 0.023 0.141 -0.068
    (0.094) (0.084) (0.105) (0.083) (0.103) (0.087)
Ethnic salience -0.147 0.010 0.032 0.004 -0.247 -0.121
    (0.165) (0.130) (0.180) (0.186) (0.158) (0.159)
Spatial controls Yes Yes Yes Yes Yes Yes
Pre-colonial FE Yes Yes Yes Yes Yes Yes
Ethnicity FE Yes Yes Yes Yes Yes Yes
Village controls Yes Yes Yes Yes Yes Yes
Individual controls Yes Yes Yes Yes Yes Yes
Region FE Yes Yes Yes Yes Yes Yes
Observations 14,324 13,943 16,595 16,126 15,857 17,051
Number of clusters 318/1234 318/1234 318/1234 318/1234 318/1234 318/1234
R-squared 0.221 0.266 0.294 0.167 0.226 0.257
Standard errors are adjusted for two-way clustering within ethnic groups and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 8: Other Falsification Tests
(1) (2) (3) (4)
Teacher absence  Teacher absence  Teacher absence  Community meetings
ELF * Ethnic salience (district) 1.348** -0.102
    (0.549) (0.365)
ELF -0.175 0.075
    (0.139) (0.112)
Ethnic salience -0.110 0.139
    (0.636) (0.232)
Minority 0.023
    (0.597)
ELF * Ethnic salience (individual) 0.830
    (0.734)
Ethnic salience (individual) 0.045 0.064 0.728 -0.098**
    (0.053) (0.047) (0.719) (0.039)
Village FE No Yes Yes No
Spatial controls Yes N/a N/a Yes
Pre-colonial FE Yes N/a N/a Yes
Ethnicity FE Yes Yes Yes Yes
District controls Yes N/a N/a Yes
Individual controls Yes Yes Yes Yes
Region FE Yes N/a N/a Yes
Observations 11,000 12,974 12,974 17,359
Number of clusters 318/1234 289 289 318/1234
R-squared 0.225 0.396 0.396 0.255
Standard errors in column (1) and column (4) are adjusted for two-way clustering within ethnic groups and within districts. Standard errors in column (2) and column (3) are adjusted for clustering at the ethnicity level. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 9: Robustness - Data and Specification
(1) (2) (3) (4) (5)
Teacher absence: Afrobarometer
Column headings: (1) District average; (2) Women 25–50; (3) District sample size < Med.; (4) District sample size > Med.; (5) Ordered Probit
ELF * Ethnic salience 1.218** 2.549*** 0.657 3.668* 1.323**
    (0.564) (0.816) (0.610) (2.101) (0.429)
ELF -0.126 -0.263 -0.142 -0.121 -0.172
    (0.194) (0.220) (0.157) (0.472) (0.116)
Ethnic salience 0.089 -0.807** 0.025 -0.933 -0.096
    (0.219) (0.356) (0.220) (1.157) (0.178)
Spatial controls Yes Yes Yes Yes Yes
Pre-colonial FE Yes Yes Yes Yes Yes
Ethnicity FE Yes Yes Yes Yes Yes
Village controls Yes Yes Yes Yes Yes
Individual controls Yes Yes Yes Yes Yes
Region FE Yes Yes Yes Yes Yes
Observations 20,600 3,611 5,499 6,094 11,593
Number of clusters 318/1234 274/1149 254/1036 264/200 1055
R-squared 0.868 0.335 0.266 0.260
Pseudo R-squared 0.105
Pre-colonial fixed effects, ethnicity fixed effects, and individual controls are included as district-level means in column (1). Standard errors are adjusted for two-way clustering within ethnic groups and within districts in columns (1) to (4), and for clustering at the district level in column (5). ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 10: Ordered Probit Marginal Effects
                    (1)            (2)
                    Ethnic Salience
Teacher Absence     Mean + 1 SD    Mean - 1 SD
$\partial \Pr(\text{outcome}) / \partial ELF$
Never               -0.089**       0.056
                    (0.039)        (0.039)
Once or twice       0.003**        -0.005
                    (0.001)        (0.004)
A few times         0.03**         -0.020
                    (0.012)        (0.014)
Often               0.057**        -0.031
                    (0.026)        (0.021)
Marginal effects are calculated from the ordered probit regression presented in Table 9, column (5). ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 11: Teacher Absenteeism and Ethnic Divisions - Uganda School Visits
(1) (2) (3) (4) (5) (6)
Teacher absence: school visits
Probit
Ai & Norton (2003) Interaction effect
ELF (District) * Ethnic salience 4.410*** 4.115*** 2.36**
    (1.477) (1.429) (1.019)
ELF (Pupils) * Ethnic salience 4.901*** 5.253*** 2.43**
    (1.288) (1.343) (0.979)
Marginal effects
ELF (District) -0.818*** -0.766*** -0.744
    (0.272) (0.287) (0.463)
ELF (Pupils) -1.327*** -1.393*** -1.542**
    (0.338) (0.349) (0.722)
Ethnic salience -2.229*** -2.145*** -1.966*** -2.127*** -1.392** -1.644**
    (0.454) (0.446) (0.462) (0.529) (0.596) (0.701)
Teacher demographic controls Yes Yes Yes Yes Yes Yes
Teacher rank FE Yes Yes Yes Yes Yes Yes
Teacher employment characteristics Yes Yes Yes Yes Yes Yes
Institutional controls No No Yes Yes Yes Yes
School and location controls No No Yes Yes Yes Yes
Time of day FE Yes Yes Yes Yes Yes Yes
Day FE Yes Yes Yes Yes Yes Yes
Month FE Yes Yes Yes Yes Yes Yes
Observations 1,686 1,686 1,594 1,594 1,400 1,400
Number of clusters 94/10 94/10 94/10 94/10 83 83
R-squared 0.248 0.253 0.252 0.256 0.223 0.230
Teacher demographic controls include: gender, age, marital status, a dummy variable indicating completion of A-levels (high school final exams), and a dummy variable indicating place of birth (this district or another district). Teacher's rank is a categorical variable indicating the following ranks: deputy head, head of department, permanent teacher, private teacher, temporary teacher, volunteer teacher, and 'other'. The omitted category is head teacher. Teacher's employment characteristics include: duration of teaching career, duration of tenure at current school, and dummy variables indicating full-time status, membership of a union, and attendance of a teacher training program in the previous year. Institutional controls include dummy variables indicating the existence of a Parent Teacher Association (PTA), a categorical variable indicating the time elapsed since the last meeting, and dummy variables indicating an official inspection in the previous six months and the existence of a local means of recognition for good teachers. School and location controls include a set of dummy variables to indicate the existence of the following facilities: covered classrooms, non-dirt classroom floors, a toilet/latrine, drinking water and electricity; as well as the pupil-teacher ratio, average education levels of parents, and dummy variables indicating that the school is public, that it practices multi-grade teaching, whether it is within five kilometres of a paved road and whether it is in a rural location. Columns (5) and (6) present marginal effects from a Probit regression. Standard errors in columns (1) to (4) are adjusted for two-way clustering within schools and within districts. Standard errors in column (5) and column (6) are adjusted for clustering at the school level. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level. Regression equation:
$TA_{jsd} = a + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd}$
Table 12: Channel 1 - Teacher Coethnicity
(1) (2) (3) (4) (5) (6)
Teacher absence: school visits
ELF (District) * Ethnic salience 4.330*** 4.377*** 4.378***
    (1.460) (1.461) (1.460)
ELF (Pupils) * Ethnic salience 5.552*** 5.647*** 5.647***
    (1.333) (1.379) (1.356)
ELF (District) -0.798*** -0.814*** -0.815***
    (0.281) (0.293) (0.290)
ELF (Pupils) -1.436*** -1.478*** -1.478***
    (0.342) (0.351) (0.344)
Ethnic salience -1.974*** -1.995*** -1.995*** -2.143*** -2.177*** -2.177***
    (0.469) (0.479) (0.478) (0.521) (0.535) (0.528)
Native: language 0.051** 0.046**
    (0.021) (0.022)
Native: ancestry -0.005 -0.000
    (0.045) (0.049)
Native: birth 0.042 0.042
    (0.035) (0.037)
Teacher demographic controls Yes Yes Yes Yes Yes Yes
Teacher rank FE Yes Yes Yes Yes Yes Yes
Teacher employment characteristics Yes Yes Yes Yes Yes Yes
Institutional controls Yes Yes Yes Yes Yes Yes
School and location controls Yes Yes Yes Yes Yes Yes
Time of day FE Yes Yes Yes Yes Yes Yes
Day FE Yes Yes Yes Yes Yes Yes
Month FE Yes Yes Yes Yes Yes Yes
Observations 1,588 1,588 1,588 1,588 1,588 1,588
Number of clusters 94/10 94/10 94/10 94/10 94/10 94/10
R-squared 0.265 0.263 0.263 0.270 0.268 0.268
Standard errors are adjusted for two-way clustering within schools and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level. Regression equation:
$TA_{jsd} = a + \phi Native_{jsd} + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd}$
Table 13: Channel 2a - Sanctioning Institutions
(1) (2) (3) (4)
Teacher absence: school visits
ELF (District) * Ethnic salience 4.098*** 4.061***
    (1.430) (1.376)
ELF (Pupils) * Ethnic salience 5.285*** 5.417***
    (1.333) (1.151)
ELF (District) -0.766*** -0.751***
    (0.287) (0.274)
ELF (Pupils) -1.411*** -1.460***
    (0.342) (0.297)
Ethnic salience -1.966*** -1.956*** -2.141*** -2.163***
    (0.465) (0.434) (0.529) (0.458)
PTA 0.032 0.047 0.030 0.054
    (0.104) (0.101) (0.102) (0.096)
Last PTA meet: last month 0.022 0.018 -0.015 -0.024
    (0.115) (0.115) (0.102) (0.102)
Last PTA meet: < six months -0.118 -0.121 -0.139* -0.145*
    (0.085) (0.084) (0.078) (0.078)
Last PTA meet: < one year -0.164** -0.169** -0.165** -0.173**
    (0.065) (0.066) (0.068) (0.069)
Last PTA meet: > one year -0.053 -0.061 -0.046 -0.058
    (0.081) (0.084) (0.082) (0.083)
Recent inspection 0.029 0.031 0.053 0.059
    (0.048) (0.047) (0.045) (0.043)
Head teacher has sanctioned 0.078 0.121**
    (0.067) (0.053)
Teacher demographic controls Yes Yes Yes Yes
Teacher rank FE Yes Yes Yes Yes
Teacher employment characteristics Yes Yes Yes Yes
Institutional controls Yes Yes Yes Yes
School and location controls Yes Yes Yes Yes
Time of day FE Yes Yes Yes Yes
Day FE Yes Yes Yes Yes
Month FE Yes Yes Yes Yes
Observations 1,588 1,588 1,588 1,588
Number of clusters 94/10 94/10 94/10 94/10
R-squared 0.260 0.261 0.266 0.267
Standard errors are adjusted for two-way clustering within schools and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level. Regression equation:
$TA_{jsd} = a + \phi Sanction_{sd} + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd}$
Table 14: Channel 2b - Sanctioning Institutions by Coethnicity
(1) (2) (3) (4) (5) (6)
Teacher absence: school visits
ELF (District) * Ethnic salience 3.904*** 3.570*** 3.971***
    (1.268) (1.364) (1.431)
ELF (Pupils) * Ethnic salience 5.363*** 5.130*** 5.462***
    (0.949) (1.202) (1.345)
ELF (District) -0.735*** -0.650** -0.728**
    (0.230) (0.276) (0.290)
ELF (Pupils) -1.427*** -1.422*** -1.477***
    (0.263) (0.307) (0.344)
Ethnic salience -1.931*** -1.797*** -1.898*** -2.185*** -2.054*** -2.137***
    (0.386) (0.449) (0.464) (0.416) (0.471) (0.504)
PTA 0.144** 0.035 0.036 0.150*** 0.044 0.045
    (0.062) (0.081) (0.088) (0.058) (0.078) (0.083)
Last PTA meet: last month -0.054 -0.006 -0.020 -0.080 -0.048 -0.059
    (0.112) (0.112) (0.104) (0.104) (0.101) (0.090)
Last PTA meet: < six months -0.151 -0.130 -0.157* -0.173* -0.158* -0.182**
    (0.107) (0.091) (0.091) (0.090) (0.085) (0.082)
Last PTA meet: < one year -0.249*** -0.161*** -0.182*** -0.257*** -0.164** -0.186***
    (0.064) (0.060) (0.067) (0.064) (0.064) (0.068)
Last PTA meet: > one year -0.118 -0.040 -0.081 -0.116 -0.037 -0.077
    (0.093) (0.095) (0.086) (0.089) (0.094) (0.084)
Recent inspection 0.014 0.012 0.015 0.024 0.039 0.042
    (0.048) (0.053) (0.051) (0.043) (0.047) (0.046)
Head teacher has sanctioned 0.156** 0.039 0.056 0.199*** 0.083 0.101**
    (0.068) (0.065) (0.060) (0.055) (0.054) (0.047)
Native by: Language Ancestry Birth Language Ancestry Birth
Native 0.204* -0.245*** -0.154 0.176* -0.252*** -0.148
    (0.106) (0.091) (0.145) (0.103) (0.091) (0.142)
Native*PTA -0.171 0.078 0.131 -0.166* 0.074 0.110
    (0.109) (0.135) (0.149) (0.100) (0.136) (0.142)
Native*Recent inspection 0.040 0.094 0.096 0.064 0.094 0.116
    (0.037) (0.092) (0.108) (0.040) (0.095) (0.107)
Native*Head teacher has sanctioned -0.142* 0.209** 0.146 -0.142* 0.197** 0.146*
    (0.076) (0.101) (0.102) (0.077) (0.090) (0.086)
Native*Last PTA meet: last month -0.090* -0.039 -0.196 -0.086* -0.042 -0.186
    (0.048) (0.284) (0.178) (0.046) (0.289) (0.178)
Native*Last PTA meet: < six months 0.011 0.170 0.029 -0.004 0.150 0.021
    (0.077) (0.115) (0.157) (0.082) (0.116) (0.148)
Native*Last PTA meet: < one year -0.065 0.156* 0.071 -0.066 0.160* 0.084
    (0.056) (0.089) (0.085) (0.049) (0.086) (0.080)
Native*Last PTA meet: > one year 0.076 -0.030 -0.176 0.083 -0.045 -0.169
    (0.056) (0.136) (0.117) (0.055) (0.130) (0.108)
Teacher demographic controls Yes Yes Yes Yes Yes Yes
Teacher rank FE Yes Yes Yes Yes Yes Yes
Teacher employment characteristics Yes Yes Yes Yes Yes Yes
Institutional controls Yes Yes Yes Yes Yes Yes
School and location controls Yes Yes Yes Yes Yes Yes
Time of Day FE Yes Yes Yes Yes Yes Yes
Day FE Yes Yes Yes Yes Yes Yes
Month FE Yes Yes Yes Yes Yes Yes
Observations 1,588 1,588 1,588 1,588 1,588 1,588
Number of clusters 94/10 94/10 94/10 94/10 94/10 94/10
R-squared 0.267 0.270 0.267 0.273 0.277 0.274
Standard errors are adjusted for two-way clustering within schools and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level. Regression equation:
$TA_{jsd} = a + \phi Native_{jsd} + \varphi Sanction_{sd} + \theta (Native_{jsd} \ast Sanction_{sd}) + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd}$
Table 15: Channel 3 - Social Networks Between Teachers
(1) (2)
Teacher absence: school visits
ELF (District) * Ethnic salience 2.434
    (1.823)
ELF (Pupils) * Ethnic salience 3.420**
    (1.726)
ELF (District) -0.316
    (0.364)
ELF (Pupils) -0.917**
    (0.406)
Ethnic salience -1.427*** -1.585***
    (0.493) (0.498)
Social -0.125** -0.112**
    (0.050) (0.053)
Teacher demographic controls Yes Yes
Teacher rank FE Yes Yes
Teacher employment characteristics Yes Yes
Institutional controls Yes Yes
School and location controls Yes Yes
Time of day FE Yes Yes
Day FE Yes Yes
Month FE Yes Yes
Observations 1,476 1,476
Number of clusters 94/10 94/10
R-squared 0.264 0.263
Standard errors are adjusted for two-way clustering within schools and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level. Regression equation:
$TA_{jsd} = a + \phi Social_{sd} + \Psi ES_d + \lambda ELF + \beta (ES_d \ast ELF) + \gamma X_{jsd} + \delta S_{sd} + \eta T_{tdm} + e_{jsd}$
Table 16: Test for Heterogeneous Effects of Social Networks
(1) (2) (3) (4) (5) (6)
Teacher absence: school visits
ELF (District) * Ethnic salience 2.094 2.133 2.138
    (1.750) (1.772) (1.773)
ELF (Pupils) * Ethnic salience 3.230* 3.266* 3.390**
    (1.655) (1.718) (1.719)
ELF (District) -0.272 -0.258 -0.276
    (0.352) (0.361) (0.355)
ELF (Pupils) -0.860** -0.868** -0.909**
    (0.390) (0.402) (0.404)
Ethnic salience -1.410*** -1.399*** -1.403*** -1.564*** -1.549*** -1.582***
    (0.486) (0.488) (0.483) (0.491) (0.505) (0.494)
Social -0.156*** -0.139*** -0.128** -0.144*** -0.129** -0.118**
    (0.043) (0.050) (0.050) (0.050) (0.053) (0.052)
Social * Native: language 0.050 0.045
    (0.046) (0.044)
Social * Native: ancestry 0.123*** 0.110**
    (0.044) (0.045)
Social * Native: birth 0.044 0.040
    (0.054) (0.054)
Native: language 0.031 0.030
    (0.031) (0.032)
Native: ancestry -0.103** -0.087*
    (0.045) (0.049)
Native: birth -0.005 -0.003
    (0.037) (0.036)
Teacher demographic controls Yes Yes Yes Yes Yes Yes
Teacher rank FE Yes Yes Yes Yes Yes Yes
Teacher employment characteristics Yes Yes Yes Yes Yes Yes
Institutional controls Yes Yes Yes Yes Yes Yes
School and location controls Yes Yes Yes Yes Yes Yes
Time of Day FE Yes Yes Yes Yes Yes Yes
Day FE Yes Yes Yes Yes Yes Yes
Month FE Yes Yes Yes Yes Yes Yes
Observations 1,476 1,476 1,476 1,476 1,476 1,476
Number of clusters 94/10 94/10 94/10 94/10 94/10 94/10
R-squared 0.263 0.263 0.261 0.266 0.265 0.264
Standard errors are adjusted for two-way clustering within schools and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 17: Teacher Absenteeism and School-level Ethnic Divisions Among Teachers
(1) (2) (3) (4) (5)
Teacher absence: school visits
ELF (Teachers: regional origin) * Ethnic salience 5.043***
    (1.510)
ELF (Teachers: native fluency) * Ethnic salience 3.846***
    (0.905)
(Share of non-native teachers: language) * Ethnic Salience 2.768***
    (0.663)
(Share of non-native teachers: ancestry) * Ethnic Salience 8.386***
    (1.989)
(Share of non-native teachers: birth) * Ethnic Salience 9.336***
    (1.995)
ELF (Teachers: regional origin) -1.427***
    (0.338)
ELF (Teachers: native fluency) -1.218***
    (0.222)
Share of non-native teachers: language -0.749***
    (0.178)
Share of non-native teachers: ancestry -1.909***
    (0.489)
Share of non-native teachers: birth -2.331***
    (0.475)
Ethnic salience -2.226*** -2.184*** -1.989*** -8.127*** -8.836***
    (0.562) (0.402) (0.385) (1.794) (1.663)
Teacher demographic controls Yes Yes Yes Yes Yes
Teacher rank FE Yes Yes Yes Yes Yes
Teacher employment characteristics Yes Yes Yes Yes Yes
Institutional controls Yes Yes Yes Yes Yes
School and location controls Yes Yes Yes Yes Yes
Time of day FE Yes Yes Yes Yes Yes
Day FE Yes Yes Yes Yes Yes
Month FE Yes Yes Yes Yes Yes
Observations 1,588 1,588 1,588 1,588 1,588
Number of clusters 94/10 94/10 94/10 94/10 94/10
R-squared 0.267 0.271 0.266 0.261 0.267
Standard errors are adjusted for two-way clustering within schools and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Table 18: Teacher Absenteeism, School-level Ethnic Divisions and Social Networks Between Teachers
(1) (2) (3) (4) (5)
Teacher absence: school visits
ELF (Teachers: regional origin) * Ethnic salience 2.722*
    (1.511)
ELF (Teachers: native fluency) * Ethnic salience 2.320**
    (1.050)
(Share of non-native teachers: language) * Ethnic Salience 1.415
    (0.882)
(Share of non-native teachers: ancestry) * Ethnic Salience 6.105***
    (2.146)
(Share of non-native teachers: birth) * Ethnic Salience 6.519***
    (1.952)
Social -0.086* -0.089* -0.114** -0.096** -0.092*
    (0.046) (0.053) (0.053) (0.048) (0.047)
ELF (Teachers: regional origin) -0.912***
    (0.351)
ELF (Teachers: native fluency) -0.857***
    (0.257)
Share of non-native teachers: language -0.401*
    (0.215)
Share of non-native teachers: ancestry -1.083**
    (0.480)
Share of non-native teachers: birth -1.527***
    (0.427)
Ethnic salience -1.435*** -1.540*** -1.388*** -6.147*** -6.423***
    (0.530) (0.448) (0.416) (1.798) (1.658)
Teacher demographic controls Yes Yes Yes Yes Yes
Teacher rank FE Yes Yes Yes Yes Yes
Teacher employment characteristics Yes Yes Yes Yes Yes
Institutional controls Yes Yes Yes Yes Yes
School and location controls Yes Yes Yes Yes Yes
Time of day FE Yes Yes Yes Yes Yes
Day FE Yes Yes Yes Yes Yes
Month FE Yes Yes Yes Yes Yes
Observations 1,476 1,476 1,476 1,476 1,476
Number of clusters 94/10 94/10 94/10 94/10 94/10
R-squared 0.264 0.267 0.262 0.266 0.264
Standard errors are adjusted for two-way clustering within schools and within districts. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Appendix A
Figure A 1: ELF and Ethnic salience by country
[Scatter plot: Ethnic Salience by country (y-axis) against ELF by country (x-axis, roughly .7 to .95); points labelled BEN, BWA, GHA, KEN, LSO, MDG, MLI, MOZ, MWI, NAM, NGA, SEN, TZA, UGA, ZAF, ZMB]
Table A1: Social Networks Between Teachers - OLS Covariates
(1) (2) (3) (4) (5) (6) (7)
Dependent Variable: Social
ELF measured by: (1) District; (2) Pupils; (3) Teachers: Region; (4) Teachers: Language. Share of non-native teachers by: (5) Language; (6) Ancestry; (7) Birth
ELF * Ethnic salience -6.997* -5.527* -5.574* -6.692** -4.110** -2.752 -1.632
    (3.732) (2.833) (2.801) (2.682) (1.606) (3.400) (3.349)
ELF 1.160* 0.723 1.909** 1.755*** 0.604* 0.416 0.607
    (0.608) (0.683) (0.719) (0.610) (0.356) (0.691) (0.753)
Ethnic salience 1.058 0.730 0.379 1.590 1.252 1.634 0.513
    (1.256) (0.969) (1.027) (1.258) (0.942) (3.164) (3.097)
PTA -0.100 -0.127 -0.090 -0.040 -0.068 -0.118 -0.065
    (0.270) (0.274) (0.246) (0.241) (0.236) (0.273) (0.289)
Last PTA meet: last month 0.274 0.135 0.497* 0.379 0.165 0.194 0.299
    (0.280) (0.303) (0.274) (0.275) (0.273) (0.305) (0.314)
Last PTA meet: < six months -0.029 -0.154 0.034 -0.017 -0.205 -0.165 -0.116
    (0.267) (0.267) (0.268) (0.274) (0.290) (0.278) (0.279)
Last PTA meet: < one year 0.465 0.528* 0.509* 0.485 0.484* 0.404 0.418
    (0.297) (0.298) (0.293) (0.296) (0.285) (0.305) (0.313)
Last PTA meet: > one year 0.010 -0.135 0.019 -0.027 -0.069 -0.142 -0.079
    (0.224) (0.245) (0.218) (0.218) (0.228) (0.214) (0.212)
Recent inspection -0.112 -0.076 -0.142 -0.147 -0.146 -0.095 -0.125
    (0.178) (0.170) (0.178) (0.178) (0.184) (0.189) (0.182)
Female 0.433 0.316 0.596 0.425 0.542 0.618 0.531
    (0.464) (0.447) (0.455) (0.443) (0.441) (0.469) (0.480)
Age -0.015 -0.006 -0.020 -0.016 -0.005 -0.017 -0.011
    (0.026) (0.026) (0.025) (0.025) (0.026) (0.026) (0.027)
Education 0.201 0.053 0.249 0.351 0.133 0.107 0.091
    (0.343) (0.346) (0.344) (0.355) (0.317) (0.334) (0.343)
Teacher training -0.158 -0.086 -0.125 -0.156 -0.179 -0.126 -0.163
    (0.175) (0.161) (0.159) (0.158) (0.153) (0.176) (0.169)
Experience 0.002 0.003 0.000 0.002 0.002 -0.007 0.000
    (0.027) (0.027) (0.027) (0.026) (0.026) (0.029) (0.032)
Experience at this school -0.018 -0.025 -0.037 -0.029 -0.021 -0.036 -0.041
    (0.028) (0.025) (0.023) (0.026) (0.023) (0.025) (0.025)
Fulltime -0.881 -0.829 -0.735 -0.851 -0.616 -0.397 -0.346
    (1.153) (1.136) (1.111) (1.185) (1.084) (1.135) (1.114)
Union -0.171 -0.315 -0.161 -0.191 -0.281 -0.265 -0.249
    (0.228) (0.214) (0.223) (0.222) (0.211) (0.231) (0.238)
Married 0.404 0.453 0.265 0.338 0.447 0.385 0.405
    (0.294) (0.276) (0.290) (0.290) (0.281) (0.310) (0.307)
Recognition program 0.044 0.064 0.038 0.033 0.100 0.080 0.087
    (0.130) (0.121) (0.125) (0.123) (0.114) (0.128) (0.130)
Facilities: classroom -0.283 -0.392 -0.171 -0.117 -0.135 -0.155 -0.241
    (0.601) (0.653) (0.544) (0.544) (0.476) (0.586) (0.585)
Facilities: floor 0.153 0.126 0.111 0.151 0.108 0.118 0.145
    (0.188) (0.192) (0.177) (0.184) (0.152) (0.193) (0.200)
Facilities: toilet 0.276 -0.110 0.163 0.330 0.318 -0.137 -0.117
    (0.668) (0.699) (0.733) (0.678) (0.725) (0.681) (0.630)
Facilities: water -0.088 0.003 -0.182 -0.129 -0.082 -0.130 -0.167
    (0.139) (0.137) (0.128) (0.128) (0.126) (0.138) (0.142)
Facilities: electricity 0.110 0.093 0.073 0.108 0.166 0.126 0.093
    (0.200) (0.201) (0.187) (0.188) (0.183) (0.205) (0.205)
Rural -0.064 -0.022 -0.105 -0.090 -0.058 -0.061 -0.045
    (0.126) (0.133) (0.124) (0.125) (0.126) (0.130) (0.127)
Access 0.009 0.089 0.001 0.007 -0.003 0.035 0.058
    (0.118) (0.124) (0.120) (0.112) (0.115) (0.118) (0.123)
Multi-grade -0.002 0.048 -0.070 -0.049 0.252 -0.008 -0.013
    (0.234) (0.257) (0.217) (0.238) (0.249) (0.305) (0.320)
Pupil-teacher ratio 0.001 0.000 0.001 0.001 -0.001 0.002 0.002
    (0.004) (0.004) (0.004) (0.004) (0.004) (0.004) (0.004)
Parental education 0.018 0.024 0.015 0.015 0.009 0.019 0.019
    (0.018) (0.016) (0.018) (0.017) (0.016) (0.017) (0.018)
Public School 0.204 0.180 0.239 0.231 0.315 0.168 0.182
    (0.199) (0.206) (0.190) (0.196) (0.208) (0.185) (0.187)
Observations 86 86 86 86 86 86 86
R-squared 0.379 0.408 0.419 0.420 0.420 0.335 0.339
Robust standard errors in parentheses. ***Significant at the 1% level; **Significant at the 5% level; *Significant at the 10% level.
Appendix B
In order to estimate the interaction effects in the probit models presented in Table 11, I turn to
Ai and Norton's (2003) well-known method of calculating the cross-partial derivative. They show
that the marginal effect of an interaction term in non-linear models (i.e., $\partial \Phi(u) / \partial (x_1 x_2)$) can have a
different magnitude, sign and level of statistical significance than the true cross-partial derivative
(i.e., $\partial^2 \Phi(u) / \partial x_1 \partial x_2$).
Below, I show the interaction effects and z-statistics for the estimates presented in columns (5)
and (6) respectively. The corresponding marginal effects are also presented for comparison. The
plot of the z-statistics shows that every observation significantly different from zero is positive.
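The difference between the two quantities can be made concrete with a minimal sketch. It assumes a probit index with two continuous regressors, u = b0 + b1*x1 + b2*x2 + b12*x1*x2, so that the cross-partial derivative of Φ(u) is b12*φ(u) − (b1 + b12*x2)(b2 + b12*x1)*u*φ(u). The function names and any coefficient values passed in are hypothetical, not the estimates reported in Table 11.

```python
import math


def normal_pdf(u):
    """Standard normal density phi(u)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def probit_interaction_effects(b0, b1, b2, b12, x1, x2):
    """Return (naive, cross_partial) at the point (x1, x2).

    Model (an assumption for illustration):
        Pr(y = 1) = Phi(b0 + b1*x1 + b2*x2 + b12*x1*x2)
    naive         : marginal effect of the product term, b12 * phi(u)
    cross_partial : d^2 Phi(u) / (dx1 dx2), using phi'(u) = -u * phi(u)
    """
    u = b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2
    phi = normal_pdf(u)
    naive = b12 * phi
    cross_partial = b12 * phi - (b1 + b12 * x2) * (b2 + b12 * x1) * u * phi
    return naive, cross_partial
```

With hypothetical values b0 = 0, b1 = b2 = 1, b12 = 0.1 and x1 = x2 = 1, the naive effect is positive while the cross-partial derivative is negative, which is the kind of sign reversal the method is designed to detect.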
Figure A 2: Interaction effect with ELF at district level, probit
[Plot "Interaction Effects after Probit": Interaction Effect (percentage points) on the y-axis against Predicted Probability that y = 1 on the x-axis; series: Correct interaction effect, Incorrect marginal effect]
Figure A 3: z-statistics for interaction effect with ELF at district level, probit
[Plot "z-statistics of Interaction Effects after Probit": z-statistic on the y-axis against Predicted Probability that y = 1 on the x-axis]
Figure A 4: Interaction effect with ELF at school level, probit
[Plot "Interaction Effects after Probit": Interaction Effect (percentage points) on the y-axis against Predicted Probability that y = 1 on the x-axis; series: Correct interaction effect, Incorrect marginal effect]
Figure A 5: z-statistics for interaction effect with ELF at school level, probit
[Plot "z-statistics of Interaction Effects after Probit": z-statistic on the y-axis against Predicted Probability that y = 1 on the x-axis]
The Long-Term Consequences of Apartheid:
Neighborhoods and Inequality among Black South Africans
Jonathan Page∗
February 27, 2014
Abstract
In this study I compare overall neighborhood effects on income for black Africans
across two geographically interlaced but characteristically different regions in post-
apartheid South Africa. Specifically, I compare the proportion of total inequality
explained by neighborhood background in a former bantustan, KwaZulu, with that in
the ‘white’ South African province, Natal, that surrounded it. This paper is the first to
decompose post-apartheid inequality into (1) inequality within neighborhoods and (2)
inequality between neighborhoods. I use a panel household survey (the KwaZulu-Natal
Income Dynamics Study) and find this proportion is 48% in KwaZulu and 89% in Natal.
This suggests reducing inequality across communities (e.g., by reducing inequalities in
school quality, distance to medical service providers, etc.) will have a larger relative
impact on reducing overall income inequality for Natal (i.e., ‘white’ South Africa) than
for KwaZulu (i.e., the bantustan). Understanding this proportion can help determine
whether neighborhoods or households should be the target of inequality-reduction
interventions.
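The within/between split described in the abstract can be sketched as follows. This is only one common way to do it, assuming inequality is measured by the variance of log income and that the neighborhood share is the between-neighborhood variance divided by total variance; the paper's actual estimator is not shown in this excerpt and may differ. The function name and inputs are hypothetical.

```python
from collections import defaultdict


def between_within_shares(log_incomes, neighborhoods):
    """Split the variance of log income into between- and within-neighborhood shares.

    log_incomes   : list of individual log incomes
    neighborhoods : list of neighborhood labels, same length
    Returns (between_share, within_share); the two shares sum to 1.
    """
    n = len(log_incomes)
    grand_mean = sum(log_incomes) / n
    total_var = sum((y - grand_mean) ** 2 for y in log_incomes) / n

    groups = defaultdict(list)
    for y, g in zip(log_incomes, neighborhoods):
        groups[g].append(y)

    # Between component: size-weighted squared deviations of neighborhood means
    between_var = sum(
        len(ys) * ((sum(ys) / len(ys)) - grand_mean) ** 2 for ys in groups.values()
    ) / n
    within_var = total_var - between_var
    return between_var / total_var, within_var / total_var
```

For example, with log incomes [0, 2, 2, 4] and neighborhood labels ["a", "a", "b", "b"], half of the variance lies between neighborhoods and half within them.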
∗Department of Economics, University of Hawaii at Manoa, Honolulu, HI (e-mail: [email protected]). I thank Anna Lou Abatayo, Chasuta Anukoolthamchote, Timothy Halliday, Karl Jandoc, Chaning Jang, Sumner La Croix, Sang-Hyop Lee, Wayne Liou, Inessa Love, Stephanie Page, and Jeffrey Traczynski for comments.
1 Introduction
There is no place for [the Bantu] in the European community above the level of
certain forms of labour . . . What is the use of teaching the Bantu child mathematics
when it cannot use it in practice? That is quite absurd. Education must train
people in accordance with their opportunities in life, according to the sphere in
which they live.
Dr. Hendrik Verwoerd, Minister for Native Affairs, South Africa, 1953 (Lapping, 1986)
The above quote from Dr. Verwoerd, later the Prime Minister of South Africa from
1958-66, is representative of the efforts of this architect of apartheid to create separate
development paths for whites and the native speakers of Bantu languages1 (i.e. black South
Africans). Apartheid maintained an institutionalized system of segregation and discrimination
against non-whites in South Africa, particularly the black African majority. This included
creating nominally independent states within South Africa, called bantustans2, where black
South Africans were to be relocated. Blacks were made citizens of these bantustans according
to their tribal ancestry. This process allowed the dominant white minority to strip blacks of
their citizenship in ‘white’ South Africa. In the late 1980s the apartheid government began to
dismantle the many restrictions on blacks in response to mounting international pressure and
internal resistance. This dismantling took an important step in 1994 with the first general
election allowing blacks to vote. The election of Nelson Mandela, the black South African
resistance leader who had spent 27 years in prison, signaled to the world the end of apartheid.
This new birth of economic and political freedom held great promise in ending decades of
extreme poverty and stark inequalities. Unfortunately, since 1994 inequality and headcount
1Bantu refers both to a group of languages and to the black Africans who speak them.
2The apartheid government officially referred to these as homelands. I use the label bantustan in this
paper specifically to avoid the preferred term of the oppressive apartheid government.
measures of poverty increased (cf. Ozler, 2007). More households were below 200% of
the household subsistence line (HSL) in 1998 than in 1993 when comparing the income
distribution in KwaZulu-Natal across these two periods. A review of the joint distribution of
income in 1993 and 1998 shows the poor falling ever further behind (Carter and May, 2001).
Transition matrices, whether using endogenous income quintiles (Woolard and Klasen, 2005)
or exogenous income groups based on percent of HSL (Carter and May, 2001) reveal a society
with significant mobility, though much of that mobility entails poor households becoming
poorer.
Through 2007 the former bantustans were the most deprived in terms of income, employ-
ment, education, and living environment according to the South African Index of Multiple
Deprivation (Noble and Wright, 2012). This separate development can be seen in figure 1
which shows the lingering spatial effect of apartheid in the KwaZulu-Natal province as of
20013. The development literature offers initial household size, education, asset endowment,
employment access (Woolard and Klasen, 2005), and a highly segmented labor market (Ozler,
2007) as reasons for the increase in socioeconomic inequality. This study presents a first look
at the degree to which black African households are tied to their apartheid-neighborhood
background in post-apartheid South Africa.
In this study I compare overall community effects on income4 for black Africans across two
geographically interlaced but characteristically different regions in South Africa. Specifically,
I measure the community effect as the proportion of total inequality explained by
cross-community inequality, and compare this proportion in a former bantustan, KwaZulu, with that in the ‘white’ South African
province, Natal, that surrounded it. Measured in this way, community effects are those factors
of income common to households in a community such as role models, social connections,
exposure to violence, and discrimination. When cross-community inequality is low relative
3Noble et al. (2009) discuss the index and the data behind it.
4I use income and expenditure interchangeably here, following the custom in the development literature.
to overall inequality, reducing inequality across communities does little to address overall
inequality. For example, if the relevant proportion of variances, the intra-cluster correlation
coefficient (ICC), is 10%, completely eliminating inequality across communities would only
lower the overall inequality by 10%. I find ICC is relatively low for the bantustan and
relatively high for white South Africa. This suggests community-level interventions targeted
at reducing inequality will have a greater effect on lowering overall inequality for white South
Africa than for the bantustan. Likewise, inequality reduction policy in the former bantustan
should target household-level inequality within neighborhoods.
I closely follow estimation procedures used in the sibling correlation5 literature, where
ICC acts as an omnibus measure of intergenerational socioeconomic mobility. I use these
procedures to calculate ICC for a panel of black African households in the new KwaZulu-
Natal province6. In the sibling correlation literature, beginning with Solon et al. (1991), ICC
measures the importance of the factors common to all siblings within a family. In general,
the ICC measures the proportion of the total variance accounted for by the variance of a family-
or community-level effect7.
In the sibling context these common factors are defined by the outcomes of siblings’
parents and ICC leads naturally to a notion of intergenerational mobility8. This reasoning
does not apply directly to the household survey literature in developing countries when
the unit of analysis is a household and the grouping level is a community as opposed to
a family. However, the ICC provides insight into the sensitivity of target populations to
5Sibling correlations are equivalent to an ICC where clusters are families (i.e., groups of siblings). My clusters are survey clusters with population sizes ranging from 331 to 317,635. For ease of exposition I refer to these as communities. When I control for population sizes of the clusters the results are unchanged. The standard errors are marginally inflated, but this does not affect the significance of my results.
6The new province combined the old bantustan KwaZulu with the old ‘white’ South African province of Natal.
7Within the sibling correlation literature, a community effect is generally referred to as a neighborhood effect.
8For example, a sibling correlation in income of 0.4 implies that 40% of variation in incomes is due to common factors shared by siblings. These common factors include family and neighborhood background.
community-level interventions, as opposed to individual-level interventions.9
After calculating initial estimates of the ICC for KwaZulu and Natal, I calculate the
contribution of key factors to ICC10. The factors I test are education levels, urban status, and
investment in infrastructure. I proxy infrastructure investment with road quality and find it
explains much of the cross-community variation in Natal, but only a small portion of the cross-
community variation in KwaZulu. Road quality also explains more of the cross-community
variation than mean education levels or urban status for both provinces.
In my initial estimation of the ICC, I use restricted maximum likelihood (REML). Turning
again to the sibling correlation literature I show the robustness of my results to a recent
competing method (Bjorklund et al., 2009) and to various equivalence scales.
The ICC in 2004 is 0.23 in the bantustan and 0.69 in ‘white’ South Africa. This suggests
reducing inequality across communities (e.g., by reducing inequalities in school quality,
distance to medical service providers, etc.) will have a larger relative impact on reducing
overall income inequality for Natal (i.e., ‘white’ South Africa) than for KwaZulu (i.e., the
bantustan).
I describe my statistical model in the following section. In Section 3, I describe the
estimation procedure, REML, and how I attribute the between-neighborhood
component of inequality to contributing factors. Section 4 presents an overview of the KwaZulu-Natal
Income Dynamics Study (KIDS) data and summary statistics of adult equivalent expenditure.
Section 5 presents my results, Section 6 presents the analysis of factors contributing to the correlations,
and Section 7 presents robustness checks. Finally, I conclude.
9The majority of studies in the developing world that mention ICC use it to adjust their standard errors, consider it a nuisance parameter, or mention its value in passing (cf. Tarozzi and Deaton, 2009), while one paper employs ICC as a measure of spatial correlation for Burkina Faso (Grab and Grimm, 2009). Many other fields, including epidemiology (Roux et al., 2001) and demography (South et al., 2011), use ICC to study spatial correlation.
10Here I follow the procedure in Mazumder (2008).
Figure 1: KwaZulu-Natal Province: Index of deprivation map. Source: Author’s calculations using the dataset discussed in Noble et al. (2009). Datazones are small statistical areas each containing approximately the same number of individuals.
2 Statistical Model
The model of household income employed here has been alternately referred to as a nested-
error component model, a random effects model, a multilevel model, a mixed model, a
variance components model, or a hierarchical model (see Snedecor and Cochran, 1980; Deaton,
1997)11. The notation here mirrors that found in studies of sibling correlations (Solon et al.,
1991). The natural logarithm of adult-equivalent12 monthly expenditure, $y_{ch}$, for cluster $c$
and household $h$ is modeled as
$$y_{ch} = x'_{ch}\beta + \varepsilon_{ch}. \tag{1}$$
$x'_{ch}\beta$ includes an intercept, the number of children and the number of pensioners in order
to control for key household life-cycle effects13. The residual, $\varepsilon_{ch}$, represents the effects of
household-specific factors unrelated to neighborhood factors. I decompose $\varepsilon_{ch}$ as follows:
$$\varepsilon_{ch} = a_c + v_{ch}, \tag{2}$$
where $a_c$ is the component common to all households in community $c$, and $v_{ch}$ is the idiosyncratic
component for household $h$.
By construction, the variance, $\sigma^2_\varepsilon$, equals
$$\sigma^2_\varepsilon = \sigma^2_a + \sigma^2_v. \tag{3}$$
11Other examples include Montmarquette and Mahseredjian (1989), who use a two-way nested-error component model to study the impact of a student’s class and school on educational achievement. Antweiler (2001) provides a succinct history of nested error models and discusses an application estimating the variances with maximum likelihood (ML).
12Here I use the adult equivalent scale, φ, used by Carter and May (1999) and common throughout the literature on South African household income: $\phi = (A + 0.5K)^{0.9}$, where A is the number of adults and K is the number of children. This structure reflects children’s lower consumption relative to adults and assumes economies of scale.
13This follows the covariate setup in Solon et al. (1991) and Mazumder (2008) with relevant changes for this paper’s household setting.
Thus, the share of variance in income due to community background, and also the income
correlation of randomly drawn pairs of households in a given community, is
$$\rho = \frac{\sigma^2_a}{\sigma^2_a + \sigma^2_v}. \tag{4}$$
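The two readings of equation (4), a variance share and a within-community pair correlation, can be checked against each other by simulation. The sketch below is illustrative only: it assumes normally distributed components, and the variance values 0.3 and 0.7 as well as the function names are hypothetical choices, not quantities from the paper.

```python
import random


def icc(var_a, var_v):
    """Equation (4): share of residual variance due to the community component."""
    return var_a / (var_a + var_v)


def simulate_pair_correlation(var_a, var_v, n_communities=20000, seed=0):
    """Draw two households per community and return their Pearson correlation."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_communities):
        a_c = rng.gauss(0.0, var_a ** 0.5)          # shared community effect
        first.append(a_c + rng.gauss(0.0, var_v ** 0.5))
        second.append(a_c + rng.gauss(0.0, var_v ** 0.5))
    mean_1 = sum(first) / n_communities
    mean_2 = sum(second) / n_communities
    cov = sum((x - mean_1) * (y - mean_2) for x, y in zip(first, second))
    var_1 = sum((x - mean_1) ** 2 for x in first)
    var_2 = sum((y - mean_2) ** 2 for y in second)
    return cov / (var_1 * var_2) ** 0.5


rho = icc(0.3, 0.7)                                  # hypothetical variances
pair_corr = simulate_pair_correlation(0.3, 0.7)
print(f"variance share: {rho:.3f}, simulated pair correlation: {pair_corr:.3f}")
```

With many simulated communities the pair correlation should settle close to the variance share, which is the sense in which ρ is interpreted in both ways.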
3 Estimation Procedure
I follow Mazumder (2008) and estimate the variance components using restricted maximum
likelihood (REML)14. While the ANOVA approach15 to calculating ρ is straightforward and
provides a minimum-variance estimator for balanced clusters, the same is not true for unbalanced
clusters (Corbeil and Searle, 1976). Solon et al. (1991) introduce four weighting schemes to
test robustness of results to various corrections for this imbalance. REML has the advantage,
even in the unbalanced case, of consistency, asymptotic normality, and a known asymptotic
sampling dispersion matrix. Simulations by Browne and Draper (2006) indicate that bias is
likely to be low when using REML for the number of clusters and households used in this study
given the assumption that the log of adult equivalent expenditure is normally distributed.16
Since the household survey data is unbalanced (i.e., each cluster is not restricted to the same
number of households), I select REML as the preferred method in this case to estimate ρ
(Mazumder, 2008).
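For comparison with REML, the ANOVA approach mentioned above can be sketched in a few lines. This is not the estimator used in the paper (which relies on Stata's xtmixed); it is the textbook one-way ANOVA estimator for unbalanced clusters, and the function name and toy data are hypothetical.

```python
def anova_icc(clusters):
    """One-way ANOVA estimates (var_a, var_v, rho) for possibly unbalanced clusters.

    clusters: list of lists, one inner list of outcomes per community.
    """
    sizes = [len(c) for c in clusters]
    n_total = sum(sizes)
    n_clusters = len(clusters)
    grand_mean = sum(sum(c) for c in clusters) / n_total
    means = [sum(c) / len(c) for c in clusters]

    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((y - m) ** 2 for c, m in zip(clusters, means) for y in c)
    ms_between = ss_between / (n_clusters - 1)
    ms_within = ss_within / (n_total - n_clusters)

    # With unequal cluster sizes, n0 takes the place of the common cluster size.
    n0 = (n_total - sum(n * n for n in sizes) / n_total) / (n_clusters - 1)
    var_a = (ms_between - ms_within) / n0   # can be negative in small samples
    var_v = ms_within
    return var_a, var_v, var_a / (var_a + var_v)


# Toy log-expenditure data for three communities of unequal size.
toy = [[5.1, 4.9, 5.0], [6.2, 6.0], [4.0, 4.3, 4.1, 3.9]]
var_a_toy, var_v_toy, rho_toy = anova_icc(toy)
print(f"var_a = {var_a_toy:.3f}, var_v = {var_v_toy:.3f}, rho = {rho_toy:.3f}")
```

Unlike REML, this estimator can return a negative community variance when clusters are few or small, which is one practical reason to prefer a likelihood-based method for unbalanced survey data.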
14This method is alternately referred to as residual maximum likelihood. I estimate REML through the xtmixed command in Stata.
15In the analysis of variance (ANOVA) approach ICC is simply the ratio of the between subjects (here clusters) variance to the total variance.
16Following the relevant literature, I make this assumption. I produce quantile-quantile (Q-Q) plots for KwaZulu and Natal as a visual check of normality in figures 2a and 2b. Based on the Q-Q plots the distributions appear normal. The Shapiro-Wilk test for normality does not reject the null hypothesis that logged expenditure is normally distributed for the case of KwaZulu, though it does reject normality for Natal.
[Figure 2, panels (a) KwaZulu and (b) Natal: Q-Q plots; vertical axis: log expenditure; horizontal axis: inverse normal]
Figure 2: Q-Q plots of logged monthly expenditure: KwaZulu and Natal
3.1 Attributing ICC to observables
In order to explore possible components of the community factor, I employ a measure proposed
by Mazumder (2008) to calculate an upper-bound estimate for the contribution of various
observables to ρ. To do this, I re-estimate equation (1), adding the observed variable to $x_{ch}$.
Denote the community-level variance from this new estimation by $\sigma_a^{2*}$. I define this measure of
contribution, η, as
$$\eta = \frac{\sigma_a^2 - \sigma_a^{2*}}{\sigma_a^2 + \sigma_v^2}. \tag{5}$$
Mazumder (2008) refers to η as an upper-bound estimate of the contribution of the factor
of interest, because it includes omitted factors which are correlated with the newly added
covariates. While it would be convenient to measure $\sigma_a^{2*}$ using REML, Corbeil and
Searle (1976) and Robinson (1987) note that REML, in contrast to maximum likelihood
(ML), includes degrees of freedom in the estimation of the variance components17. As a
result, it is possible with REML to have $\sigma_a^{2*} > \sigma_a^2$ when adding additional fixed effects to
17The REML estimator is
$$\sigma^2 = y'T'(THT')^{-1}Ty/(N - k),$$
where k is the number of fixed effects in the model (Corbeil and Searle, 1976), while the ML estimator is
$$\sigma_i^2 = m_i^{-1}\left\{ d_i'd_i + \sigma^2 \operatorname{tr}\left(\Sigma_{22.1i}^{-1} + \gamma_i^{-1} I_{m_i}\right)^{-1} \right\} \quad (i = 1, 2, \ldots, c)$$
the model18. Because of this issue, I diverge from the procedure in Mazumder (2008) by
replacing the variances in equation 5 with their counterparts from the standard maximum
likelihood (ML) procedure. I will continue to use REML to calculate ρ; however, since the
variance components from REML are not easily compared across model specifications, I use
ML to calculate the contributions to ρ. That is, I estimate a measure of contribution, $\eta_{ML}$,
which I define as
$$\eta_{ML} = \frac{\sigma^2_{a,ML} - \sigma^{2*}_{a,ML}}{\sigma^2_{a,ML} + \sigma^2_{v,ML}}. \tag{6}$$
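As a small numerical illustration of equation (6), the sketch below computes the contribution measure from three variance components. All numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not estimates from the paper, and the function name is an assumption.

```python
def contribution(var_a, var_v, var_a_star):
    """Equation (6): fall in the community variance after adding a covariate,
    expressed as a share of the baseline total variance."""
    return (var_a - var_a_star) / (var_a + var_v)


# Hypothetical ML variance components, for illustration only.
var_a, var_v = 0.8, 0.2      # baseline model
var_a_star = 0.3             # community variance after adding a covariate

rho_ml = var_a / (var_a + var_v)
eta_ml = contribution(var_a, var_v, var_a_star)
residual_share = var_a_star / (var_a + var_v)

# By construction, rho splits into the contribution and what remains unexplained.
print(f"rho = {rho_ml:.2f}, eta_ML = {eta_ml:.2f}, remainder = {residual_share:.2f}")
```

Because the denominator is the baseline total variance, the contribution can never exceed ρ as long as the added covariate does not raise the estimated community variance.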
4 Data
The panel data are from the three-wave KwaZulu-Natal Income Dynamics Study (KIDS)19,
which was conducted in 1993, 1998, and 200420,21. All households in the survey are from the
KwaZulu-Natal province of South Africa. In 1996 it had the largest population of any South
African province with 8.4 million inhabitants, roughly 20.7% of the country’s population
(Statistics South Africa, 1996). It is approximately the size of Portugal, has two major ports
(Durban and Richards Bay) that account for the majority of the country’s cargo tonnage, and
has adequate soil and rainfall to support a wide variety of agricultural products (including
(Hartley and Rao, 1967). The incorporation of k allows $\sigma^2_{REML}$ to increase when adding a fixed effect to the model.
18In fact, this is the case when estimating the contributions presented later in this paper when using REML.
19This is the same data used by other studies of household income mobility covering this period in South Africa (e.g., Klasen, 2000; Leibbrandt and Woolard, 2001; Woolard and Klasen, 2005; May et al., 2007).
20The roughly 5 year gaps between observations satisfy the prescription from Naschold and Barrett (2011) that long periods of examination are needed to accurately measure structural mobility (as opposed to short-term fluctuations). This feature will be exploited in a robustness test of the REML procedure below.
21The KwaZulu-Natal Income Dynamics Study (KIDS) was a collaborative project between researchers at the University of KwaZulu-Natal, the University of Wisconsin, London School of Hygiene & Tropical Medicine, International Food Policy Research Institute (IFPRI), the Norwegian Institute of Urban and Regional Studies and the South African Department of Social Development. In addition to support from these institutions, the following organizations provided financial support: UK Department for International Development; the United States Agency for International Development (USAID); the Mellon Foundation; and National Research Foundation/Norwegian Research Council grant to the University of KwaZulu-Natal.
sugar cane, subtropical fruit, vegetables, dairy, and timber).
Though apartheid had been officially repealed in the early 1990s and the elections in
April 1994 brought peace to most of South Africa, the KwaZulu-Natal province continued to suffer a
monthly toll of 50-80 lives lost to political violence into 1995 and 1996. Due in part to the
continued violence, local elections were not held in this province until June 1996 (Johnston
and Johnson, 1997). By including in the analysis the first year of the survey (1993), I get
a sense of the household-level income mobility for a region just beginning to emerge from
the institutionalized discrimination which previously determined individual opportunity and
household-level outcomes.
I use total expenditure as constructed in the survey data as my measure of household
income. To analyze potential contributing factors to ρ, I calculate measures for community-
level education and community-level road quality. I use the mean of the years of education as
a community-level measure of education (see table 1). For 1993 and 1998 I have a measure of
the road quality for the community as a proxy for infrastructure (see tables 2 and 3). I use
dummy variables for each combination of road quality (dirt, both, or tarred) across the two time periods. That is, I have nine
indicator variables to fully represent the available data on road quality and investment in the
survey.
4.1 Descriptive Statistics
For each community, a summary table from the survey identifies its province, as of 1993. This
identifies whether a community is located in KwaZulu, the former bantustan, or in Natal,
the portion of the province in “white” South Africa. Table 4 lists the number of households
and communities for each year in the balanced panel as well as the mean and median monthly
adult-equivalent expenditure for each division.
Figures 3a and 3b show the kernel density plots for logged adult-equivalent expenditures
broken down first by whether the household lives in a bantustan (KwaZulu) or not
Table 1: Mean Years of Education Summary Statistics

              1993    1998    2004
KwaZulu
  Rural        3.4     4.1     4.6
  Urban        4.5     5.2     5.4
Natal
  Rural        1.7     2.2     2.6
  Urban        5.8     6.6     7.4

Values summarize the mean years of education for each cluster in the panel.
(Natal), then by year. I include rug plots22 to highlight the spread of the data. The dispersion
of incomes in KwaZulu increases over time and absolute poverty is increasing. For Natal there
is also a reduction in the concentration around the mean, but with a skewed distribution
highlighting a dispersion among the poor.
Figure 4 shows community-level monthly averages for adult-equivalent expenditures. Cross
sizes represent the community-level variance of income. The darkness of the crosses represents the
number of households observed in each community. Circles represent the averages over all
communities. The bantustan communities (KwaZulu) have higher variances and are more
tightly bundled than those in Natal. This suggests the community effect will be more pronounced for
Natal. In fact, the following analysis of ICC will confirm the general story presented in these
figures.
22Rug plots present vertical lines for each observation below the density plots.
Table 2: Road Quality Transition Matrices: KwaZulu

KwaZulu - Rural (N=117)
                          1998
1993          Dirt/Gravel   Both   Tarred   Total 1993
Dirt               0         .71     .29       .72
Both              .11        .11     .78       .23
Tarred            .50        .50      0        .05
Total 1998        .05        .56     .38

KwaZulu - Urban (N=33)
                          1998
1993          Dirt/Gravel   Both   Tarred   Total 1993
Dirt               0          0       1        .09
Both               0          0       1        .64
Tarred            .67         0      .33       .27
Total 1998        .18         0      .82

Values indicate the proportion of neighborhoods with the road quality indicated by the row in 1993 that have the road quality indicated by the respective column in 1998. Terminal columns and rows indicate the proportion of all neighborhoods with the indicated road quality in 1993 and 1998 respectively. The number of neighborhoods in the sample is indicated by N.
Table 3: Road Quality Transition Matrices: Natal

Natal - Rural (N=15)
                          1998
1993          Dirt/Gravel   Both   Tarred   Total 1993
Dirt               0         .25     .75       .80
Both               0          0       1        .20
Tarred             0          0       0         0
Total 1998         0         .20     .80

Natal - Urban (N=36)
                          1998
1993          Dirt/Gravel   Both   Tarred   Total 1993
Dirt               0          0       0         0
Both               0          0       1        .08
Tarred            .18         0      .82       .92
Total 1998        .17         0      .83

Values indicate the proportion of neighborhoods with the road quality indicated by the row in 1993 that have the road quality indicated by the respective column in 1998. Terminal columns and rows indicate the proportion of all neighborhoods with the indicated road quality in 1993 and 1998 respectively. The number of neighborhoods in the sample is indicated by N.
Table 4: Monthly Mean (Median) Expenditure (2008 USD)

                    KwaZulu                Natal
                Rural     Urban       Rural     Urban
1993            56.54     94.64       35.00    189.33
               (47.40)   (80.35)     (25.46)  (128.62)
1998            35.24     66.77       25.24    183.21
               (28.82)   (63.09)     (23.20)  (130.80)
2004            42.92     67.11       29.71    274.50
               (27.38)   (54.94)     (21.74)  (196.17)
Communities        39        11           5        12
Households        460       115          31       143

Expenditures are adult-equivalent expenditures calculated using the approach by Carter and May (1999). This table represents the cleaned, balanced panel, not the raw data.
[Figure 3, panels (a) KwaZulu and (b) Natal: kernel densities for 1993, 1998, and 2004; horizontal axis: 2008 USD per day per adult equivalent; vertical axis: density]
Figure 3: Kernel density plots of monthly expenditure: KwaZulu and Natal. Beneath each density, a rug plot indicates the frequency of the data. Within the rug plot, vertical lines mark out the location of each observation. This shows, for example, the data for Natal is more sparse than the data for KwaZulu.
[Figure 4, panels KwaZulu and Natal: community-level means by wave (1993, 1998, 2004); vertical axis: 2008 USD]
Figure 4: Community-level means of monthly expenditure. Cross size represents the variance of income within a given community. Darker crosses represent communities where more households are observed. Circles represent the means over all communities.
5 Main Results
To aid comparison of ρ across KwaZulu (i.e., the bantustan) and Natal, I present the variance
components of ρ. Table 5 shows the estimates of ρ stratifying the sample by location in either
KwaZulu or Natal. Overall, ρ is much higher in Natal than in KwaZulu. For example, in
2004 ρ is 23% in KwaZulu and 69% in Natal. The household components, $\sigma^2_v$, are similar in
both regions while ρ is consistently lower in KwaZulu. The source of this difference is the
community-level component, $\sigma^2_a$, as suggested by figure 4.
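The correlations reported in table 5 can be reproduced from the reported variance components using equation (4). The sketch below copies the point estimates from table 5; the dictionary layout and function name are illustrative assumptions, and small differences arise because the published components are rounded to three decimals.

```python
# (community component, household component, reported correlation) from table 5
table5 = {
    ("KwaZulu", 1993): (0.055, 0.197, 0.217),
    ("KwaZulu", 1998): (0.111, 0.316, 0.261),
    ("KwaZulu", 2004): (0.131, 0.440, 0.229),
    ("Natal", 1993): (0.767, 0.242, 0.760),
    ("Natal", 1998): (0.731, 0.324, 0.693),
    ("Natal", 2004): (0.794, 0.352, 0.693),
}


def icc(var_a, var_v):
    """Equation (4): community variance as a share of total residual variance."""
    return var_a / (var_a + var_v)


recomputed = {key: icc(a, v) for key, (a, v, _) in table5.items()}
max_gap = max(abs(recomputed[key] - reported)
              for key, (_, _, reported) in table5.items())

for (region, year), (a, v, reported) in table5.items():
    print(f"{region} {year}: recomputed {recomputed[(region, year)]:.3f}, "
          f"reported {reported:.3f}")
```

The comparison makes the source of the regional gap visible: the household components are of similar size in both regions, while the community component in Natal is several times larger than in KwaZulu.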
6 Contributing Factors
Looking now to contributing factors for ρ, table 6 presents the contribution estimates. In
KwaZulu, education and road quality are persistent factors while urban status is only a
dominant factor for 1993 and 1998. These values are lower than in Natal, suggesting even
less scope for community-level inequality-reduction policy for the former bantustan compared
to former ‘white’ South Africa.
Road quality consistently dominates the contributing factors for Natal. This suggests
infrastructure explains much of the differences across communities in Natal. Since much of the
inequality in Natal is across communities, community-level investment in infrastructure has
the potential to significantly reduce overall inequality for blacks in former ‘white’ South Africa.
As in Page and Solon (2003), urban status is an important contributor to the community
effect.
Table 5: Household correlations in adult equivalent expenditure: KwaZulu and Natal

                            1993       1998       2004
KwaZulu
  Correlation               0.217      0.261      0.229
                           (0.050)    (0.053)    (0.052)
  Community component       0.055      0.111      0.131
                           (0.015)    (0.029)    (0.037)
  Household component       0.197      0.316      0.440
                           (0.012)    (0.020)    (0.027)
  Households                 575        575        575
  Communities                 50         50         50
Natal
  Correlation               0.760      0.693      0.693
                           (0.073)    (0.088)    (0.087)
  Community component       0.767      0.731      0.794
                           (0.291)    (0.287)    (0.310)
  Household component       0.242      0.324      0.352
                           (0.028)    (0.037)    (0.040)
  Households                 174        174        174
  Communities                 17         17         17

Standard errors are in parentheses.
Table 6: Upper-bound estimates of the percent contribution to the correlation: KwaZulu and Natal

                          1993    1998    2004
KwaZulu
  Education                6.6     8.8     6.3
  Roads                    9.5     9.0     8.2
  Urban                   10.1    11.3     3.9
  Education and Urban     12.6    15.3     8.1
  Roads and Urban         13.6    13.8     8.7
  Roads and Education     11.8    14.6    11.2
  All factors             14.9    17.8    11.5
Natal
  Education               21.7    42.8    35.5
  Roads                   54.2    50.3    51.1
  Urban                   46.9    44.2    41.4
  Education and Urban     50.9    53.3    53.4
  Roads and Urban         55.1    50.8    51.1
  Roads and Education     57.3    56.6    57.1
  All factors             59.7    57.0    57.3

Values are percentage contribution to ρ.
7 Robustness Checks
7.1 Alternate Adult-Equivalence Scales
Not everyone in a household has the same needs and not all households are the same size.
The choice of method for calculating household-level expenditure may affect the validity of
cross-household comparisons (cf. Deaton, 1997). I compare my results across a variety of
equivalence scales23 common in studies of developing countries. Klasen (2000) and Woolard
and Leibbrandt (1999) provide an extensive search for meaningful equivalence scales for South
African households. Here I employ the following equivalence scales, φ,
$$\phi = (A + \alpha K)^{\theta}$$
with A adults and K children, where α is the fraction of an adult equivalent assigned to a child and θ
permits economies of scale. Each φ represents the number of adult equivalents in a given
household. I divide household expenditure by φ to calculate adult-equivalent expenditure. I
use the list of scales in table 7 and use two definitions for children (under 18 and under 16)
to test the robustness of the results presented earlier.
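A short sketch of how the parametric scale translates into adult-equivalent expenditure may help fix ideas. The household below (2 adults, 3 children, expenditure of 300) is hypothetical, and the function names are illustrative; the parameter values are the Carter and May (1999) and per capita entries from table 7.

```python
def adult_equivalents(adults, children, alpha, theta):
    """phi = (A + alpha * K) ** theta, the number of adult equivalents."""
    return (adults + alpha * children) ** theta


def adult_equivalent_expenditure(expenditure, adults, children,
                                 alpha=0.5, theta=0.9):
    """Household expenditure divided by phi; defaults follow Carter and May (1999)."""
    return expenditure / adult_equivalents(adults, children, alpha, theta)


# Hypothetical household: 2 adults, 3 children, monthly expenditure of 300.
cm = adult_equivalent_expenditure(300.0, 2, 3)
per_capita = adult_equivalent_expenditure(300.0, 2, 3, alpha=1.0, theta=1.0)
print(f"Carter-May: {cm:.2f}, per capita: {per_capita:.2f}")
```

Scales with α below one or θ below one assign fewer adult equivalents than household members, so they yield higher adult-equivalent expenditure than the per capita measure for any household with children or more than one member.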
Figures 5a and 5b plot ρ using each of these scales. The diamonds represent ρ using the Carter and May (1999) scale (CM)
with children defined as under 16 (i.e., the scale used throughout this study). The horizontal
bars indicate 95% confidence intervals calculated using the delta method via Stata’s nlcom
command. It is clear from these figures that the choice of φ has little impact on ρ.
23Adult equivalence scales adjust the number of “adults” in a household to account for the lower consumption of children and the existence of economies of scale. I recognize there are general issues with using adult equivalence scales in the measurement of welfare (cf. Gronau, 1988). Instead of addressing these issues, I use this section to demonstrate the invariance of ρ to alternate scales.
Table 7: Adult equivalence scales

Definition                              Source
1 + 0.7(A − 1) + 0.5K                   OECD (2013)
1 + 0.5(A − 1) + 0.3K                   OECD (2013)
α = 1 and θ = 0.25                      OECD (2013)
α = 1 and θ = 1 (i.e., per capita)      OECD (2013)
α = 0.997 and θ = 0.68                  Woolard and Leibbrandt (1999)
α = 0.68 and θ = 0.72                   Woolard and Leibbrandt (1999)
α = 0.5 and θ = 0.9                     Carter and May (1999)
1 for the entire household
[Figure 5, panels (a) KwaZulu and (b) Natal: ρ by wave (1993, 1998, 2004) for each equivalence scale; vertical axis: ρ]
Figure 5: Robustness test of adult-equivalence scale specification. Each line connects values for ρ for a given equivalence scale. The diamonds represent the values of ρ for the equivalence scale used in the analysis presented in this paper. The tick marks about the diamonds mark off two standard errors above and below the estimates for ρ.
7.2 The model with transitory shocks
Single-period expenditure may not be representative of expenditure over all waves. To test
the robustness of the results of the REML approach (Mazumder, 2008; Lindahl, 2011), I run
the same analysis using the algorithm in Bjorklund et al. (2009). As their method exploits
the use of multiple observations, I first specify a statistical model with transitory shocks. I
model the natural logarithm of per adult-equivalent monthly expenditure in wave $t$, $y_{cht}$, for
community $c$ and household $h$ as
$$y_{cht} = x'_{cht}\beta + \varepsilon_{cht}. \tag{7}$$
$x'_{cht}\beta$ includes an intercept, the number of children, the number of pensioners, a wave dummy,
and their interactions to control for key household life-cycle effects24. I decompose $\varepsilon_{cht}$ as
follows:
$$\varepsilon_{cht} = a_c + b_{ch} + v_{cht}, \tag{8}$$
where $a_c$ is a permanent component common to all households in community $c$, $b_{ch}$ is a
permanent component unique to household $h$, and $v_{cht}$ measures wave-specific deviations from
long-run income. I view $b_{ch}$ as the household’s demeaned position in the long-run income
distribution.
By construction,
$$\sigma^2_\varepsilon = \sigma^2_a + \sigma^2_b + \sigma^2_v. \tag{9}$$
Thus, the share of variance in income due to community background, and also the income
correlation of randomly drawn pairs of households in a given community (in the sibling
24This follows the covariate setup in Bjorklund et al. (2009) with relevant changes for the household setting.
correlation literature these would be pairs of brothers), this time considering only the
permanent components, is
$$\rho = \frac{\sigma^2_a}{\sigma^2_a + \sigma^2_b}. \tag{10}$$
Again, ρ measures the importance of community effects on the outcomes of households,
this time controlling for transitory shocks. Household-level factors, such as level of education,
access to land, and within-community marginalization, are captured by the household-level
component, bch.
Due to the approximately 5-year gap between waves, I assume persistence in transitory
shocks to be negligible25 and specifically that $v_{cht}$ is a random shock with mean equal to zero
and constant variance, $\sigma^2_v$.
7.2.1 Estimation Procedure
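I perform OLS to calculate $\varepsilon_{cht}$. The decomposition of the error term $\varepsilon_{cht}$ in equation 8
implies the following structural covariances:
$$E[\varepsilon_{ijt}\varepsilon_{kls}] =
\begin{cases}
\sigma^2_a + \sigma^2_b + \sigma^2_v, & i = k;\ j = l;\ t = s \\
\sigma^2_a + \sigma^2_b, & i = k;\ j = l;\ t \neq s \\
\sigma^2_a, & i = k;\ j \neq l \\
0, & i \neq k
\end{cases} \tag{11}$$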
As in Bjorklund et al. (2009), I use the four weighting options from Solon et al. (2000) to
control for the unbalanced nature of the pairs of households drawn from the survey clusters26.
25Bjorklund et al. (2009), as an example, model v as an AR(1) process to reflect the relative importance of persistence in their annual context. When I include an AR(1) specification, the parameter λ is not statistically significant and the general analysis remains unchanged.
26Elsewhere, survey clusters are referred to as communities.
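That is, I need to control for unbalanced clusters. These weighting options are:
$$w_{1c} = \left(\frac{n_c(n_c - 1)}{2}\right)^{-1}, \quad
w_{2c} = \left(\frac{n_c - 1}{2}\right)^{-1}, \quad
w_{3c} = \left(\sqrt{\frac{n_c(n_c - 1)}{2}}\right)^{-1}, \quad
w_{4c} = 1,$$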
where nc is the number of households in cluster c. Approach (1) weights each cluster equally
by weighting each household pair inversely to the total number of pairs contributed by its
cluster. Approach (4) weights each household pair equally, while approaches (2) and (3) are
somewhere between the extremes of (1) and (4).
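The four weighting options can be written down directly as functions of the cluster size. In the sketch below the third weight is read as the inverse square root of the number of household pairs; this reading is an assumption, and an alternative placement of the square root would change it only by a constant factor, which cancels in any weighted average. The function name is hypothetical.

```python
import math


def pair_weights(n_c):
    """Weight applied to each household pair in a cluster of n_c households."""
    pairs = n_c * (n_c - 1) / 2
    return {
        "w1": 1.0 / pairs,             # every cluster counts equally
        "w2": 2.0 / (n_c - 1),
        "w3": 1.0 / math.sqrt(pairs),  # assumed reading of the third weight
        "w4": 1.0,                     # every household pair counts equally
    }


# Total weight a cluster receives = pair weight times its number of pairs.
cluster_totals = {}
for n_c in (2, 5, 20):
    pairs = n_c * (n_c - 1) // 2
    cluster_totals[n_c] = {name: w * pairs
                           for name, w in pair_weights(n_c).items()}
    print(n_c, cluster_totals[n_c])
```

Under the first option the total weight of every cluster is one regardless of its size, while under the fourth it grows with the number of pairs; the two intermediate options grow with cluster size more slowly.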
Taking the chosen weights, I then compute the empirical household-pair autocovariance
matrix. Once complete, I apply GMM to the implied moment restrictions in order to estimate
σa, σb, and σv. I then construct ρ as defined above27.
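Because equation 11 supplies three non-zero moments for three unknown variances, the moment conditions can be solved directly. The sketch below is a simplified method-of-moments illustration, not the paper's implementation: it assumes the pair weights enter only the cross-household moment, it uses simulated errors in place of OLS residuals, and all function names and parameter values are hypothetical.

```python
import random


def moment_estimates(resid, weight=lambda n_c: 1.0):
    """resid[c][h][t]: residual for community c, household h, wave t.
    weight(n_c): weight applied to each household pair in a community of size n_c."""
    sq_sum = sq_n = 0.0        # same household, same wave
    lag_sum = lag_n = 0.0      # same household, different waves
    pair_sum = pair_n = 0.0    # different households, same community
    for households in resid:
        totals = []
        for waves in households:
            total = sum(waves)
            squares = sum(e * e for e in waves)
            sq_sum += squares
            sq_n += len(waves)
            lag_sum += total * total - squares      # sum over t != s of e_t * e_s
            lag_n += len(waves) * (len(waves) - 1)
            totals.append((total, len(waves)))
        n_c = len(households)
        if n_c < 2:
            continue
        w = weight(n_c)
        for i in range(n_c):
            for j in range(i + 1, n_c):
                pair_sum += w * totals[i][0] * totals[j][0]
                pair_n += w * totals[i][1] * totals[j][1]
    m_same, m_lag, m_pair = sq_sum / sq_n, lag_sum / lag_n, pair_sum / pair_n
    var_a = m_pair                 # third line of equation 11
    var_b = m_lag - m_pair         # second line minus third
    var_v = m_same - m_lag         # first line minus second
    return var_a, var_b, var_v, var_a / (var_a + var_b)


def simulate(n_communities, var_a, var_b, var_v, waves=3, seed=1):
    """Simulated errors following equation 8, with 2 to 6 households per community."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_communities):
        a_c = rng.gauss(0.0, var_a ** 0.5)
        households = []
        for _ in range(rng.randint(2, 6)):
            b_ch = rng.gauss(0.0, var_b ** 0.5)
            households.append([a_c + b_ch + rng.gauss(0.0, var_v ** 0.5)
                               for _ in range(waves)])
        data.append(households)
    return data


def equal_cluster_weight(n_c):
    """First weighting option: inverse of the number of pairs in the cluster."""
    return 2.0 / (n_c * (n_c - 1))


resid = simulate(2000, var_a=0.3, var_b=0.4, var_v=0.3)
est_a, est_b, est_v, est_rho = moment_estimates(resid, weight=equal_cluster_weight)
print(f"var_a={est_a:.3f} var_b={est_b:.3f} var_v={est_v:.3f} rho={est_rho:.3f}")
```

Passing a different `weight` function changes only how clusters of different sizes contribute to the cross-household moment, which is where unbalanced clusters matter.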
For the sake of simplicity, I bootstrap the standard errors using 50 replications. Checks
with various random seeds indicate no substantive changes in the standard error estimates,
implying that, for the current situation, 50 replications are sufficient.
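A bootstrap of this kind can be sketched as follows. The text does not state the resampling unit, so the sketch assumes whole communities are resampled with replacement, which is a common choice for clustered data; the function names, the toy data, and the example statistic are hypothetical.

```python
import random
import statistics


def cluster_bootstrap_se(clusters, statistic, replications=50, seed=0):
    """Standard error of statistic(clusters), resampling whole clusters."""
    rng = random.Random(seed)
    draws = []
    for _ in range(replications):
        resample = [rng.choice(clusters) for _ in clusters]
        draws.append(statistic(resample))
    return statistics.stdev(draws)


def mean_of_cluster_means(clusters):
    """Example statistic; any estimator taking a list of clusters could be used."""
    return statistics.fmean(statistics.fmean(c) for c in clusters)


# Toy log-expenditure data for four communities.
toy = [[5.1, 4.9, 5.0], [6.2, 6.0], [4.0, 4.3, 4.1, 3.9], [5.5, 5.7]]
se_seed0 = cluster_bootstrap_se(toy, mean_of_cluster_means, seed=0)
se_seed1 = cluster_bootstrap_se(toy, mean_of_cluster_means, seed=1)
print(f"bootstrap SE (seed 0): {se_seed0:.3f}, (seed 1): {se_seed1:.3f}")
```

Comparing the result across seeds, as in the last two calls, mirrors the check described in the text for judging whether the number of replications is adequate.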
Again, when estimating using the method in Mazumder (2008) I use the xtmixed command
in Stata, though with multiple time periods it is necessary to employ nlcom to calculate
the standard errors of the correlations using the delta method.
27I implement this procedure in Python and R.
7.2.2 Results
The results of this robustness test are presented in table 8. The two moderate weighting
schemes (w2c and w3c) produce results strikingly similar to those from REML. The sensitivity
of ρ to alternate weightings, under the OLS-GMM method, indicates this procedure may
be less suitable for situations with very unbalanced clusters when compared with REML.
The estimated correlations are larger than before, but this is due to removing the transitory
component from the denominator28.
8 Conclusion
In order to compare the community effects on incomes for black Africans in a bantustan
with those for black Africans in ‘white’ South Africa, I presented an in-depth analysis of the
intracluster correlation (ICC) coefficient. I used ICC to analyze household income variation
due to community-level factors. Many papers have controlled for ICC (Owens et al. 2003,
de Brauw and Harigaya 2007, and de Brauw and Hoddinott 2011), while others used it
to measure the spatial clustering of their outcome of interest (e.g., Morris 2001). I used
ICC to show the lingering effect of the bantustan system on the relationship of community
inequality to overall inequality. Understanding ICC can help determine whether communities
or households should be the target of inequality-reduction interventions. In particular, where
ICC is low, reducing the inequality across communities will do little to address overall
inequality.
I found ICC is relatively low for the bantustan and relatively high for ‘white’ South Africa.
That is, I observed that outcomes for households in KwaZulu, the bantustan, are explained
28I remove the transitory component for consistency with Mazumder (2008) and Bjorklund et al. (2009). When I compute the correlation including the transitory component in the denominator, my results are consistent with my estimates in table 5.
Table 8: Robustness comparison with Bjorklund et al. (2009): KwaZulu and Natal

                                        OLS-GMM
                       REML             w1c       w2c       w3c       w4c
KwaZulu
Correlation            0.384            0.397     0.405     0.406     0.420
                       [0.25, 0.53]     (0.065)   (0.055)   (0.057)   (0.067)
Community Component    0.054            0.071     0.073     0.073     0.075
                       [0.02, 0.08]     (0.015)   (0.014)   (0.015)   (0.018)
Household Component    0.087            0.108     0.107     0.107     0.104
                       [0.06, 0.11]     (0.017)   (0.017)   (0.017)   (0.016)
Transitory Component   0.256            0.284     0.284     0.284     0.284
                       [0.23, 0.28]     (0.020)   (0.018)   (0.015)   (0.017)
Households             578
Communities            50

Natal
Correlation            0.893            1         0.914     0.914     0.823
                       [0.84, 1]        (0.078)   (0.035)   (0.034)   (0.109)
Community Component    0.806            0.876     0.774     0.774     0.697
                       [0.11, 1.28]     (0.245)   (0.143)   (0.179)   (0.209)
Household Component    0.097            0         0.073     0.073     0.15
                       [0.06, 0.13]     (0.058)   (0.026)   (0.028)   (0.081)
Transitory Component   0.218            0.294     0.294     0.294     0.294
                       [0.23, 0.28]     (0.053)   (0.042)   (0.048)   (0.052)
Households             175
Communities            17

Standard errors are in parentheses. For REML, basic parametric confidence intervals are presented
based on 10,000 bootstrap samples. For the method discussed in Bjorklund et al. (2009), standard
errors are bootstrapped with 50 replications.
more by household-level effects than community-level effects, and that the opposite is true
for Natal, i.e., ‘white’ South Africa. This indicates community-level policy interventions will
be more effective in lowering overall inequality when applied to ‘white’ South Africa than in
the bantustan.
The community effect and the importance of various contributing factors to the community
effect differ across regions. I found that investments in infrastructure (proxied by road quality),
urban status and to a lesser extent education explain much of the cross-community inequality
in Natal (combined they explain 57% of the cross-community inequality). On the other hand,
these factors explain little (less than 18%) of the cross-community inequality in Kwazulu.
I have demonstrated that the measurement of ICC is insensitive to the choice of adult-
equivalence scale and that, for household surveys, REML outperforms the approach in
Bjorklund et al. (2009) as well as other ANOVA methods similarly dependent on appropriate
weighting schemes.
In this study of community effects on income, I have presented an in-depth analysis of
the intracluster correlation coefficient. This is, to my knowledge, a novel approach in the
development literature. In order to derive expectations on the impact of various public
projects intended to serve poorer communities29, it will be useful to understand how the
fates of households within a community are tied together. By taking innovations from
various literatures, such as those used in this study, we can take important steps towards
understanding these complex community bonds.
29For example, the Zibambele road maintenance program (McCord, 2004) and regional health programs (Coovadia et al., 2009).
References
Antweiler, W. (2001). Nested random effects estimation in unbalanced panel data. Journal
of Econometrics, 101(2):295–313.
Bjorklund, A., Jantti, M., and Lindquist, M. J. (2009). Family background and income
during the rise of the welfare state: Brother correlations in income for Swedish men born
1932–1968. Journal of Public Economics, 93(5–6):671–680.
Browne, W. J. and Draper, D. (2006). A comparison of Bayesian and likelihood-based
methods for fitting multilevel models. Bayesian Analysis, 1(3):473–514.
Carter, M. and May, J. (2001). One kind of freedom: Poverty dynamics in post-apartheid
South Africa. World Development, 29(12):1987–2006.
Carter, M. R. and May, J. (1999). Poverty, livelihood and class in rural South Africa. World
Development, 27(1):1–20.
Coovadia, H., Jewkes, R., Barron, P., Sanders, D., and McIntyre, D. (2009). The health and
health system of South Africa: historical roots of current public health challenges. The
Lancet, 374(9692):817–834.
Corbeil, R. R. and Searle, S. R. (1976). Restricted maximum likelihood (REML) estimation of
variance components in the mixed model. Technometrics, 18(1):31–38.
de Brauw, A. and Harigaya, T. (2007). Seasonal migration and improving living standards in
Vietnam. American Journal of Agricultural Economics, 89(2):430–447.
de Brauw, A. and Hoddinott, J. (2011). Must conditional cash transfer programs be
conditioned to be effective? The impact of conditioning transfers on school enrollment in
Mexico. Journal of Development Economics, 96(2):359–370.
Deaton, A. (1997). The analysis of household surveys: a microeconomic approach to
development policy. Johns Hopkins University Press.
Grab, J. and Grimm, M. (2009). Spatial inequalities explained: evidence from Burkina Faso.
ISS Working Papers - General Series 1765018725, International Institute of Social Studies
of Erasmus University (ISS), The Hague.
Gronau, R. (1988). Consumption technology and the intrafamily distribution of resources:
Adult equivalence scales reexamined. Journal of Political Economy, 96(6):1183–1205.
Hartley, H. O. and Rao, J. N. K. (1967). Maximum-likelihood estimation for the mixed
analysis of variance model. Biometrika, 54(1/2):93–108.
Johnston, A. M. and Johnson, R. W. (1997). The local elections in KwaZulu-Natal: 26 June
1996. African Affairs, 96(384):377–398.
Klasen, S. (2000). Measuring poverty and deprivation in South Africa. Review of Income and
Wealth, 46(1):33–58.
Lapping, B. (1986). Apartheid: A History. Grafton Books.
Leibbrandt, M. and Woolard, I. (2001). The labour market and household income inequality in
South Africa: existing evidence and new panel data. Journal of International Development,
13(6):671–689.
Lindahl, L. (2011). A comparison of family and neighborhood effects on grades, test scores,
educational attainment and income: evidence from Sweden. The Journal of Economic
Inequality, 9(2):207–226.
May, J. D., Agüero, J., Carter, M. R., and Timæus, I. M. (2007). The KwaZulu-Natal income
dynamics study (KIDS) third wave: methods, first findings and an agenda for future research.
Development Southern Africa, 24(5):629–648.
Mazumder, B. (2008). Sibling similarities and economic inequality in the US. Journal of
Population Economics, 21:685–701.
McCord, A. (2004). Policy expectations and programme reality: The poverty reduction and
labour market impact of two public works programmes in South Africa. Technical report,
Overseas Development Institute.
Montmarquette, C. and Mahseredjian, S. (1989). Does school matter for educational
achievement? A two-way nested-error components analysis. Journal of Applied Econometrics,
4(2):181–193.
Morris, S. S. (2001). Targeting urban malnutrition: a multi-city analysis of the spatial
distribution of childhood nutritional status. Food Policy, 26(1):49–64.
Naschold, F. and Barrett, C. B. (2011). Do short-term observed income changes overstate
structural economic mobility? Oxford Bulletin of Economics & Statistics, 73(5):705–717.
Noble, M., Barnes, H., Wright, G., McLennan, D., Avenell, D., Whitworth, A., and Roberts,
B. (2009). The South African index of multiple deprivation 2001 at datazone level. Technical
report, Pretoria: Department of Social Development.
Noble, M. and Wright, G. (2012). Using indicators of multiple deprivation to demonstrate
the spatial legacy of apartheid in South Africa. Social Indicators Research, ONLINE
FIRST:1–15.
OECD (2013). What are equivalence scales? ONLINE. Downloaded at
http://www.oecd.org/social.
Owens, T., Hoddinott, J., and Kinsey, B. (2003). Ex-ante actions and ex-post public
responses to drought shocks: Evidence and simulations from Zimbabwe. World Development,
31(7):1239–1255.
Ozler, B. (2007). Not separate, not equal: Poverty and inequality in postapartheid South
Africa. Economic Development and Cultural Change, 55(3):487–529.
Page, M. E. and Solon, G. (2003). Correlations between brothers and neighboring boys
in their adult earnings: The importance of being urban. Journal of Labor Economics,
21(4):831–855.
Robinson, D. L. (1987). Estimation and use of variance components. Journal of the Royal
Statistical Society. Series D (The Statistician), 36(1):3–14.
Roux, A., Kiefe, C. I., Jacobs Jr, D. R., Haan, M., Jackson, S. A., Nieto, F. J., Paton,
C. C., and Schulz, R. (2001). Area characteristics and individual-level socioeconomic
position indicators in three population-based epidemiologic studies. Annals of epidemiology,
11(6):395–405.
Snedecor, G. W. and Cochran, W. G. (1980). Statistical Methods. The Iowa State University
Press, 7th edition.
Solon, G., Corcoran, M., Gordon, R., and Laren, D. (1991). A longitudinal analysis of sibling
correlations in economic status. The Journal of Human Resources, 26(3):509–534.
Solon, G., Page, M. E., and Duncan, G. J. (2000). Correlations between neighboring children
in their subsequent educational attainment. The Review of Economics and Statistics,
82(3):383–392.
South, S. J., Crowder, K., and Pais, J. (2011). Metropolitan structure and neighborhood
attainment: Exploring intermetropolitan variation in racial residential segregation. Demog-
raphy, 48(4):1263–1292.
Statistics South Africa (1996). The people of South Africa, population census, 1996. ONLINE.
Accessed April 13, 2013.
Tarozzi, A. and Deaton, A. (2009). Using census and survey data to estimate poverty and
inequality for small areas. Review of Economics and Statistics, 91(4):773–792.
Woolard, I. and Klasen, S. (2005). Determinants of income mobility and household poverty
dynamics in South Africa. Journal of Development Studies, 41(5):865–897.
Woolard, I. and Leibbrandt, M. (1999). Measuring poverty in South Africa. Working Paper
99/33, University of Cape Town, Development Policy Research Unit.
Globalization and Wage Convergence: Mexico and the United States*
Davide Gandolfi
Macalester College
Timothy Halliday+
University of Hawaii at Manoa
Raymond Robertson
Macalester College
Version 32.0
March 8, 2014
JEL Codes: F15, F16, J31, F22
Keywords: Migration, Labor-market Integration, Factor Price Equalization
Abstract: Neoclassical trade theory suggests that factor price convergence should follow increased commercial integration. Rising commercial integration and foreign direct investment followed the 1994 North American Free Trade Agreement between the United States and Mexico. This paper evaluates the degree of wage convergence between Mexico and the United States between 1988 and 2011. We apply a synthetic panel approach to employment survey data and a more descriptive approach to Census data from Mexico and the US. First, we find no evidence of long-run wage convergence among cohorts characterized by low migration propensities, although this was, in part, due to large macroeconomic shocks. On the other hand, we do find some evidence of convergence for workers with high migration propensities. Finally, we find evidence of convergence in the border of Mexico vis-à-vis its interior in the 1990s, but this was reversed in the 2000s.
* We thank participants at the University of Hawaii Applied Micro Workshop for useful feedback.
+ Corresponding author. Address: 2424 Maile Way; 533 Saunders Hall; Honolulu, HI 96822. Phone: (808) 956-8615. E-mail: [email protected].
The North American Free Trade Agreement (NAFTA) significantly increased
commercial integration between the United States, Canada, and Mexico. Between 1994 and
2011, trade in goods between the United States and Mexico quadrupled in value, increasing from $108.39
billion to $461.24 billion (US Census Bureau). The value of US goods exported to Mexico
increased from $50.84 to $198.39 billion, while the value of Mexican goods exported to the
United States increased from $49.49 billion to $262.86 billion. In 2011, total exports to Mexico
accounted for 13.4 percent of overall US exports and total imports from Mexico accounted for
11.9 percent of overall US imports (Office of the United States Trade Representative). In 2012,
the total value of trade between Mexico and the US closely approached half a trillion dollars. By
2013, total trade between all three NAFTA countries reached 1 trillion dollars.
GDP per capita has also increased in both countries. In constant 2005 US dollars, US
GDP per capita increased from $32,015 to $43,063 between 1992 and 2012. While Mexico has
had some macroeconomic setbacks, such as the December 1994 peso crisis, recovery has
generally been rapid. In constant 2005 US dollars, Mexican GDP per capita increased from
$6,628 to $8,215 over the same time period.1
Rather than converge, however, Mexican GDP per capita and US GDP per capita grew
apart. Mexican GDP per capita fell from 20.7% of US GDP per capita in
1992 to 19.2% in 2011.
The persistent and seemingly growing gap between GDP per capita is at odds with
neoclassical trade theory, migration theory, and early applied general equilibrium predictions of
the effects of NAFTA. The neoclassical Heckscher-Ohlin-Samuelson (HOS) framework, one of
the canonical trade models, predicts that trade liberalization would lead to convergence in the
1 These data were taken from World Bank Development Indicators. See http://data.worldbank.org/data-catalog/world-development-indicators.
prices of traded goods, which in turn would induce factor price convergence. In addition to the
significant increase in trade noted above, Robertson, Kumar, and Dutkowsky (2009) find strong
support for convergence in goods-level prices between Mexico and the United States, making the
lack of convergence in income inconsistent with the prediction of trade models.2
The lack of convergence is also at odds with labor-based migration models. At the most
basic level, an increase in labor supply from migration should reduce wages if the aggregate
labor demand curve is downward sloping. Borjas (2003) provides empirical evidence for the
downward-sloping labor demand curve. Mishra (2007) provides evidence that Mexican
emigration bids up Mexican wages.3 Most migration models, therefore, predict wage
convergence. Because most Mexican migrants come from the middle to lower end of the age,
education, and wage distribution (Chiquiar and Hanson 2005), convergence should be the most
prominent for these demographic groups. Such movements would tend to raise Mexican wages
and depress US wages, thereby reinforcing the effects of free trade on wage convergence.
Early applied general equilibrium models generated predictions of NAFTA’s effects that
implied significant income convergence. Brown (1992) in particular surveys several of the pre-
NAFTA applied general equilibrium models and demonstrates that the models that included both
Mexican and US income gains all predicted that Mexican gains would be at least double (if not
an order of magnitude greater than) the US gains.
2 The lack of evidence of factor price equalization generally has prompted many to question the validity of neoclassical HOS-type models. Schott (2003) finds that we live in a “multi-cone” world that precludes factor price equalization. Davis and Mishra (2007) suggest that ignoring important variation between the mix of factors employed in the production of domestic and imported goods obfuscates the possible effect that free trade may depress the wages of workers in relatively labor-intensive domestic industries. Goldberg and Pavcnik (2007) discuss evidence of rising inequality in poorer countries in the wake of many trade liberalizations in the eighties and nineties, which is very much at odds with a standard HOS story of how globalization should unfold. The authors provide numerous reasons why the predictions of the standard HOS theory may not hold in the data, such as technology, the pattern of tariff reductions, and within-industry shifts.
3 For example, Card (1990, 2001) argues that the evidence for migration’s effect on wages is weak.
Although the above studies suggest that there should be some degree of wage
convergence between Mexico and the United States, there has yet to be a study that investigates
this directly. The closest papers to ours focus on within-country convergence or short-run
convergence. Within-country changes may help explain changes in international comparisons,
and early studies of the Mexican labor market did detect evidence of regional wage convergence
within countries (Hanson 1996, 1997 and Chiquiar 2001). Robertson (2000) finds a strong,
positive correlation between wage growth in the United States and wage growth for Mexican
workers who reside on the border with the United States. Hanson (2003) also finds a similar
result. Robertson (2005), however, finds no evidence that NAFTA increased the estimated
degree of labor market integration between the United States and Mexico.
In this paper, we measure long-run international convergence using two complementary
methodologies and four data sources. The first regression-based approach employs synthetic
cohorts and matches quarterly data from the Current Population Survey in the United States and
the Encuesta Nacional de Ocupacion y Empleo (ENOE) in Mexico. The second approach is
more descriptive and employs census data from Mexico and the United States.
Following Robertson (2000), Borjas (2003), and Mishra (2007), we first divide Mexican
and US working-age people into forty-five age-education cohorts. Comparing exclusively
Mexican and US workers in the same education-age cohort effectively controls for variation in
returns to skill and allows us to use high-frequency CPS quarterly data to identify time-series
patterns. The disadvantage is that it focuses only on workers residing in urban areas in Mexico.
The second approach overcomes this disadvantage by using data that include rural
workers, but it has the disadvantage that the data are observed only once every ten years.
However, these data have the added advantage that, in a given year, the sample sizes are larger
than the survey data which enables us to have a more detailed look at the data. First, we
compare mean wage differentials by education and age cohort and look at how these have
evolved over time. Next, we look deeper into the data and investigate how the relative wage
distributions have evolved over time by comparing changes in a given percentile for a given age
and education level. Finally, we conduct an exercise in which we treat the United States and
Mexico as one “integrated economy” and decompose wage inequality in this integrated economy
into between and within components and investigate how these have changed over time.
At first glance, the results demonstrate that there has been very little, if any, convergence
between US and Mexican wages over time for everyone but the least educated. While there is
evidence of some convergence in the high-migration cohorts (i.e. younger people with less than
twelve years of education), this seems to be primarily due to falling US wages at the bottom of
the US income distribution, as opposed to rising Mexican wages. However, the overall
divergence from 1990-2000 has much to do with the effect of the peso crisis of 1994. We do see
some convergence in the high frequency data post-1994 but this abates in 2001. A more detailed
look at the census data reveals that there was convergence in the border region of Mexico
relative to the interior in the 1990’s but subsequently, there was divergence in the 2000’s. Since
a lot of foreign direct investment in Mexico targets the border, this is suggestive evidence that
NAFTA may have indeed led to some wage convergence which was then reversed during the
2000’s.
Finally, we provide evidence of rising wage inequality in the United States and falling
inequality in Mexico and we show that this is driven by changes to the variation in wages within
educational/age cohorts not across them, which is not consistent with a standard HOS
explanation of how trade liberalization should impact inequality. Similarly, we also show that in
the US-Mexico integrated economy the variance of log wages has declined and that this is due to
reductions in variation in wages across education/age cohorts not within them which, once again,
is not consistent with a standard explanation of trade liberalization and inequality since it implies that
trade liberalization should reduce the demand for a given factor in one country and raise the
demand for the same factor in the other. While these results are not consistent with the HOS
model of trade with two countries, richer models may be able to account for what convergence
we do see.
We begin presenting these results with a simple theoretic model that motivates our focus
on the equilibrium wage differential between Mexico and the United States in Section I. After
describing the data in Section II, we present empirical results in Section III and IV. We then
evaluate mechanisms that may be behind these findings and offer conclusions in Section V.
I. Theoretical Foundation
Our empirical work focuses on the long-run wage differential between Mexico and the
United States. We posit that the differential is a function of labor-market integration following
Robertson (2000). Consider an economy composed of two regions (“Mexico” and “United
States”). We assume that Mexican and US workers are price substitutes, such that an increase in
the wages of American workers increases the demand for Mexican labor. We also assume that
capital flows between the two regions are not instantaneous, such that the lagged US wage
affects the demand for Mexican labor. A general form that captures the previous assumptions is:
\[
L^{D}_{jt} = \delta_{0j} + \delta_1 w^{us}_{jt} - \delta_2 w^{mx}_{jt} + \gamma w^{us}_{j,t-1} \tag{1}
\]
where $L^{D}_{jt}$ is labor demand, $w^{us}_{jt}$ is the natural log of the US wage, and $w^{mx}_{jt}$ is the natural log of
the Mexican wage. The subscript j represents an education-experience group and subscript t
represents the time period. The parameter γ captures the responsiveness of demand to lagged
wages, and $\delta_{0j}$ is a group-specific effect on labor demand.
If US wages rise, Mexican workers choose to emigrate to the United States. We assume
that workers may migrate instantaneously from one region to another, because labor is more
mobile than factors that shift demand, such as capital. Therefore, the supply of Mexican labor is
responsive to wage levels in both regions. A general form that captures these assumptions is:
\[
L^{S}_{jt} = \sigma_{0j} - \sigma_1 w^{us}_{jt} + \sigma_2 w^{mx}_{jt} - \phi w^{mx}_{j,t-1} \tag{2}
\]
The variable $L^{S}_{jt}$ represents labor supply. The subscript j represents an education-experience group
and subscript t represents the time period. The parameter φ captures the responsiveness of
supply to lagged wages, and $\sigma_{0j}$ is a group-specific effect on labor supply.
The coefficients $\delta_{0j}$ and $\sigma_{0j}$ represent the frictions in our model. The wage differential
will be increasing as these two parameters move away from each other. We will show that when
they are the same, there is no differential. One can interpret these as the cost of migration to
demanders and suppliers of labor, respectively.4
In the presence of exogenous costs, an equilibrium differential separates regional wages.
Wage shocks may temporarily move US or Mexican wages away from equilibrium, but they will
eventually return to it. We represent the equilibrium as:
4 As an example of these migration costs, Roberts et al. (2010) estimate smuggling costs.
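\[
\delta_{0j} + \delta_1 w^{us}_{jt} - \delta_2 w^{mx}_{jt} + \gamma w^{us}_{j,t-1}
= \sigma_{0j} - \sigma_1 w^{us}_{jt} + \sigma_2 w^{mx}_{jt} - \phi w^{mx}_{j,t-1} \tag{3}
\]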
By solving (3) for the current Mexican wage, we obtain an expression in terms of the lagged
Mexican wage, the current US wage and the lagged US wage:
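\[
w^{mx}_{jt} = \frac{\delta_{0j} - \sigma_{0j}}{\delta_2 + \sigma_2}
+ \frac{\phi}{\delta_2 + \sigma_2}\, w^{mx}_{j,t-1}
+ \frac{\delta_1 + \sigma_1}{\delta_2 + \sigma_2}\, w^{us}_{jt}
+ \frac{\gamma}{\delta_2 + \sigma_2}\, w^{us}_{j,t-1} \tag{4}
\]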
For the sake of simplicity, we may rewrite (4) as:
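\[
w^{mx}_{jt} = \alpha_{0j} + \alpha_1 w^{mx}_{j,t-1} + \alpha_2 w^{us}_{jt} + \alpha_3 w^{us}_{j,t-1} \tag{5}
\]
As specified in Robertson (2000), Hendry and Ericsson (1991) show that long-run homogeneity
between $w^{mx}$ and $w^{us}$ implies that the sum of $\alpha_1$, $\alpha_2$ and $\alpha_3$ equals 1. Thus, we may take a
differenced form of (5) to obtain:
\[
\Delta w^{mx}_{jt} = \alpha_{0j} + \alpha_2 \Delta w^{us}_{jt} + (1 - \alpha_1)\left(w^{us}_{j,t-1} - w^{mx}_{j,t-1}\right) \tag{6}
\]
Because $1 - \alpha_1$ is positive, increases in the US wage relative to the Mexican wage will result
in higher Mexican wages tomorrow.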
The long-run equilibrium implies that wages in both regions are such that labor markets
clear; as long as labor markets remain in equilibrium, wage levels do not change over time. As a
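result, $\Delta w^{mx}_{jt} = 0$, $\Delta w^{us}_{jt} = 0$ and $w^{us}_{jt} - w^{mx}_{jt} = w^{us}_{j,t-1} - w^{mx}_{j,t-1}$. We impose this
restriction and solve for $w^{us}_{j} - w^{mx}_{j}$:
\[
w^{us}_{j} - w^{mx}_{j} = \frac{\sigma_{0j} - \delta_{0j}}{\delta_2 + \sigma_2 - \phi} \tag{7}
\]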
This difference is analogous to the migration cost in most theoretic migration models. Although
ubiquitous, few papers analyze the long-run behavior of the equilibrium migration cost.
Deepening economic integration, changes in policy, and a host of other factors may affect the
long-run differential. For example, an increase in Mexican labor supply increases the wage gap,
while an increase in Mexican labor demand reduces the gap. Increased responsiveness to wages
(such as through a reduction in long-run migration costs that reduce the $\delta_2$ and $\sigma_2$ parameters in
the denominator) causes the gap to fall (as long as current wages are weighted more than past
wages). Finally, if $\sigma_{0j} - \delta_{0j}$ is zero, then US and Mexican wages are the same in equilibrium.
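The adjustment toward the long-run differential can be illustrated numerically. The sketch below iterates a levels equation of the form w_mx[t] = a0 + a1*w_mx[t-1] + a2*w_us[t] + a3*w_us[t-1] with a1 + a2 + a3 = 1, holding the US wage fixed. The constant a0 and all parameter values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
def simulate_mexican_wage(a0, a1, a2, a3, w_us, w_mx_start):
    """Iterate w_mx[t] = a0 + a1*w_mx[t-1] + a2*w_us[t] + a3*w_us[t-1]."""
    path = [w_mx_start]
    for t in range(1, len(w_us)):
        path.append(a0 + a1 * path[-1] + a2 * w_us[t] + a3 * w_us[t - 1])
    return path


def long_run_gap(a0, a1):
    """US minus Mexican log wage once both wages have stopped changing."""
    return -a0 / (1.0 - a1)


# Hypothetical parameters satisfying long-run homogeneity (a1 + a2 + a3 = 1).
a0, a1, a2, a3 = -0.3, 0.8, 0.15, 0.05
w_us = [3.0] * 200                       # constant US log wage
w_mx = simulate_mexican_wage(a0, a1, a2, a3, w_us, w_mx_start=1.0)
final_gap = w_us[-1] - w_mx[-1]          # settles at long_run_gap(a0, a1)
```

Because 1 - a1 is positive, the Mexican wage rises whenever the lagged gap exceeds its long-run value, so the simulated gap shrinks toward `long_run_gap(a0, a1)` from any starting point.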
II. Data
We use four datasets that represent two separate types of data. The first type is quarterly
household survey data in which urban residents have been consistently surveyed over the period
1988-2011. As a result, urban residents are typically over-represented in Mexican household
earning data. To avoid composition bias, we restrict our analysis to Mexican urban households.
US household survey data are a representative sample of both urban and rural US households.
Second, we use census data that have two advantages over the survey data. The first is that the
Mexican census data contain much more accurate information about rural households. The
second is that the sample sizes are much larger so we can obtain a more detailed understanding
of what is happening to the relative wage distributions. That said, they have the disadvantage of
only being available at ten-year intervals.
Household Survey Data
We extract all data on Mexican households from the Encuesta Nacional de Empleo
(ENE) over the period 1988-2004 and from the Encuesta Nacional de Ocupacion y Empleo
(ENOE) over the period 2005-2011. Data on US households are from the Merged Outgoing
Rotation Groups (MORG) data of the CPS over the entire period 1988-2011. We exclude from
the sample working-age adults who have zero or unreported earnings. The sample is further
restricted to adult males between 19 and 63 years of age. Focusing on male workers allows us to
ignore the issue of self-selection on the participation of women in the labor force, as well as the
effect of changes to self-selection patterns over time and between the United States and Mexico.
The Mexican data are reported as monthly earnings until 2005. The US data report
weekly earnings. Because hours worked may be poorly measured, we consider both monthly and hourly
earnings as a robustness check. We multiplied reported US weekly
wages by 4.33 to transform them into monthly wages. US hourly wages have been computed by
dividing weekly earnings by the number of hours usually worked each week. Mexican hourly
wages have been computed by dividing monthly earnings by the number of hours worked each
week times 4.33 until 2005, when the hourly wages of Mexican workers are directly available
from ENOE data.
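The unit conversions described above amount to the following arithmetic. This is a minimal sketch with hypothetical function names; returning None when hours are missing or zero is an assumption, since the treatment of such records is not described.

```python
WEEKS_PER_MONTH = 4.33  # conversion factor stated in the text


def us_monthly_from_weekly(weekly_earnings):
    """US monthly earnings from reported weekly earnings."""
    return weekly_earnings * WEEKS_PER_MONTH


def us_hourly(weekly_earnings, usual_weekly_hours):
    """US hourly wage: weekly earnings over usual weekly hours."""
    if not usual_weekly_hours:
        return None  # assumed handling of missing or zero hours
    return weekly_earnings / usual_weekly_hours


def mx_hourly(monthly_earnings, weekly_hours):
    """Mexican hourly wage before 2005: monthly earnings over monthly hours."""
    if not weekly_hours:
        return None  # assumed handling of missing or zero hours
    return monthly_earnings / (weekly_hours * WEEKS_PER_MONTH)
```

The same factor of 4.33 appears in both directions, so a worker with identical weekly pay and hours in the two surveys receives the same hourly wage under either formula.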
Following Chiquiar and Hanson (2005), all earnings measures are converted into 1990
US dollar units. Mexican earnings are converted into dollars by using simple quarterly averages
of the daily official exchange rates published by the Mexican Central Bank (Banco de Mexico
2013). We then deflated the wages to 1990 dollars using the quarterly average of the US
Consumer Price Index (CPI) (Bureau of Labor Statistics). Also as in Chiquiar and Hanson
(2005), we only use Mexican wages that are between $0.05 and $20.00 and US wages that are
between $1.00 and $100.00.
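The currency conversion, deflation and trimming steps can be sketched as below. The function names, the country codes and the CPI and exchange-rate inputs are hypothetical, and treating the wage bounds as inclusive is an assumption.

```python
# Wage bounds in 1990 US dollars, as stated in the text.
WAGE_BOUNDS = {"MX": (0.05, 20.00), "US": (1.00, 100.00)}


def to_1990_usd(amount, cpi_quarter, cpi_1990, pesos_per_usd=None):
    """Convert a nominal amount into 1990 US dollars.

    Peso amounts are first divided by the quarterly average exchange rate;
    the result is then deflated with the quarterly average US CPI.
    """
    usd = amount / pesos_per_usd if pesos_per_usd is not None else amount
    return usd * cpi_1990 / cpi_quarter


def within_bounds(real_wage, country):
    """True if the wage falls inside the bounds for 'MX' or 'US'."""
    low, high = WAGE_BOUNDS[country]
    return low <= real_wage <= high


# Hypothetical inputs: 30 pesos, 10 pesos per dollar, CPI of 200 against 100 in 1990.
real_wage = to_1990_usd(30.0, cpi_quarter=200.0, cpi_1990=100.0, pesos_per_usd=10.0)
keep = within_bounds(real_wage, "MX")
```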
ENE/ENOE surveys have been extended to significantly more rural areas over the last
two decades. In order to reduce the bias generated by greater participation of the rural Mexican
population, we restrict the sample to workers from major metropolitan areas and state capitals
that have consistently been part of the surveys. Such areas include Mexico City, the State of
Mexico, San Luis Potosí, Leon, Guadalajara, Chihuahua, Monterrey, Tampico, Torreon,
Durango, Puebla, Tlaxcala, Veracruz, Merida, Orizaba, Guanajuato, Tijuana, Ciudad Juarez,
Matamoros, and Nuevo Laredo. No geographical restrictions have been imposed on MORG data.
Descriptive statistics for the raw survey data are displayed in Table 1. Each column gives
an average of quarterly observations collected over a four- or five-year period. The average US
monthly wage ranges from $1466 to $1515, and it has remained roughly constant from 1988 to
2011. The average Mexican monthly wage ranges from $226 to $310. It has declined fairly
steadily over time. The average age of the US workforce has increased steadily between 1988
and 2011, from 37 to 40 years. The average age of the Mexican workforce has also risen steadily,
from 35 years in 1988-1994 to 37 in 2008-2011. The US workforce is significantly more
educated than the Mexican workforce, with about 90% of all workers in each time period having
at least completed high school education. By contrast, the number of Mexican workers who
completed high school education or attended college ranges from 30% in 1988-1994 to 32.3% in
2008-2011. Mexico has improved the education of its workforce. The steady rise in the number
of high school graduates and college attendees has been accompanied by a steady decline in the
number of workers with 0-5 years of education, which dropped from 18% in 1988-1994 to 12%
in 2008-2011. The largest gains emerge in the 9-11 category, when Mexico raised the
compulsory education requirement from 6 to 9 years in 1992.5
Ideally, survey data would collect information from surveyed individuals at regular
intervals, and neatly organize it as panel data. In the absence of such data, it is possible to use a
time series of cross-sectional surveys to create a version of synthetic panels (Deaton, 1985). In
our paper, we create 45 age-education cohorts when using the survey data. In the absence of
significant changes to the composition of the cohorts, the average behavior of each cohort over
time should approximate the estimates obtained from genuine panel data (Deaton, 1997). Since
our focus is not on wage growth of individuals over time, we do not “age” the cohort cells.
Working-age adults in each sample are subdivided into five education categories and nine
age categories. The first age group includes workers aged 19-23 years old; the second includes
workers aged 24-28, the third those aged 29-33, and so forth. The first education group includes
adults with 0-5 years of education; the second includes adults with 6-8 years of education; the
next comprise those with 9-11, 12-15 and finally 16 or more years of education. These categories
are roughly comparable to those employed by Robertson (2000), Borjas (2003) and Mishra
(2007). Unlike Borjas (2003), we are able to identify greater variation in the group of working
adults who have not completed high school. We are unable to distinguish between high school
graduates and workers with some college experience; we classify both groups as having 12-15
years of schooling. We exclude from the sample workers with zero or unreported amounts of
education. Once workers are assigned to the 45 categories, we take the average wage of each cell
with and without the sample (population) weights. Sample (population) weights are not
available for Mexican household surveys during the 1994-2003 period.
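As an illustration of the cell construction described above, the following minimal sketch assigns workers to the 45 age-education cells and averages wages within each cell, with and without weights. The record layout (keys `age`, `edu`, `wage`, `weight`) and the function names are hypothetical and are not taken from the surveys themselves; only the category boundaries come from the text.

```python
from collections import defaultdict

# Category boundaries as described in the text; names are illustrative.
AGE_BINS = [(19 + 5 * i, 23 + 5 * i) for i in range(9)]      # 19-23, ..., 59-63
EDU_BINS = [(0, 5), (6, 8), (9, 11), (12, 15), (16, None)]   # None = open-ended


def bin_index(value, bins):
    """Return the index of the bin containing value, or None if it fits no bin."""
    if value is None:
        return None
    for idx, (low, high) in enumerate(bins):
        if value >= low and (high is None or value <= high):
            return idx
    return None


def cohort_means(workers):
    """Unweighted and weighted mean wage for each (education bin, age bin) cell.

    workers: iterable of dicts with hypothetical keys 'age', 'edu', 'wage', 'weight'.
    Workers outside the age range or with unreported education are skipped.
    """
    totals = defaultdict(lambda: {"n": 0, "wage": 0.0, "w": 0.0, "w_wage": 0.0})
    for person in workers:
        edu = bin_index(person["edu"], EDU_BINS)
        age = bin_index(person["age"], AGE_BINS)
        if edu is None or age is None:
            continue
        cell = totals[(edu, age)]
        cell["n"] += 1
        cell["wage"] += person["wage"]
        cell["w"] += person["weight"]
        cell["w_wage"] += person["weight"] * person["wage"]
    return {
        key: {"mean": c["wage"] / c["n"], "weighted_mean": c["w_wage"] / c["w"]}
        for key, c in totals.items()
    }
```

Because the cells are not "aged" over time, the same bin boundaries would be applied to every survey wave.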
5 See http://wenr.wes.org/2013/05/wenr-may-2013-an-overview-of-education-in-mexico.
Different demographic groups have different propensities to migrate, and since migration
may drive equalization, Figure 1 shows the percentage of Mexican-born workers in the US by
age and education for each of the 45 cohorts. Most Mexican-born workers in the US are younger.
In addition, Mexican-born workers in the United States comprise a progressively declining share
of the workforce among older groups. We also see that the bulk of Mexicans residing in the
United States tend to be less educated.
Figure 2a plots the log of the real average monthly earnings of Mexican workers over
time by education-age cohorts6. Several significant macroeconomic events are immediately
apparent. The December 1994 peso crisis led to the rapid devaluation of the peso against the US
dollar, as nominal exchange rates doubled from 4 pesos/US dollar to 8 pesos/US dollar in the
space of a few months. The drastic change in exchange rates and the subsequent erosion of
purchasing power represented a significant shock to Mexican wages. The peso/US dollar
exchange rate has been floating ever since. At least some of the increase in Mexican real wages
between 1994 and 2001 may be attributed to a rebound in purchasing power experienced by
Mexican workers as the effects of the crisis waned over time. The increase in wages reverses
around 2001, which coincides with both the US recession (March 2001) and China entering the
WTO (December 11, 2001). Recovery resumes around 2005 and continues until the Financial
Crisis and Great Trade Collapse in October 2008.
Figure 2b plots the log of the real average monthly earnings of US workers over time by
age-education cohorts. Compared to Mexican wages, US wages are relatively stable. Real
wages have experienced no significant expansion or contraction over the sample period, but may
appear to decline slightly after 2001.
6 The wages of 59-63 year-old male workers with 12-15 years of education are not shown. Since this particular demographic cohort of Mexican workers is very small, it displays a wildly erratic wage pattern that obfuscates the general picture; therefore, we chose to omit it.
Figure 3 plots the difference between real US wages and real Mexican wages over time.
Once again, the differential experienced by workers aged 59-63 with 12-15 years of education
has been omitted for the sake of overall clarity. Figure 3 shows less dispersion across cohorts
than the individual country graphs. The differentials of different cohorts largely move together
and changes in the differential coincide with significant macroeconomic events. To see these
events more clearly, Figure 4a graphs the mean wage differential7 and identifies some of the
significant events affecting Mexico since NAFTA. The peso crisis is immediately apparent, as is
the relatively rapid recovery. The reduction in the differential accelerates until 2001, when
China enters the WTO. Dussel Peters and Gallagher (2013) argue that China had a significantly
negative influence on NAFTA trade. The differential grows until the middle of the 2000s and
then falls until the financial crisis.
To formally identify structural breaks in the average differential, we apply tests for
unknown breaks described by Vogelsang and Perron (1998). Figure 4a plots the relevant
additive outlier test statistic. Local extrema of the test statistic indicate a trend break. The
peso crisis is the most significant break, but a smaller local maximum appears around 2000.
Therefore, in the empirical work that follows, we include structural breaks in both 1994 and
2001.
Figure 4b graphs the standard deviation of the wage differentials across cohorts. The
standard deviation falls until approximately the time of
the break identified by the Vogelsang and Perron test statistic. It then rises
steadily until the end of the sample, again supporting the use of multiple structural breaks.
Figure 4b also motivates a more detailed look at changes in other measures of the wage
distribution, which we carry out using census data.
7 The mean is calculated taking the unweighted arithmetic average across cohorts.
While the differentials of different cohorts generally move together, there are some
differences across cohorts. Figures 5a, 5b, and 5c present the trends for three different cohorts.
Figure 5a shows that the differential for Cohort 4 (workers with 0-6 years of education and 34-38
years old) exhibits significant peso crisis effects. Around 2001, however, the recovery seems to
stop and the differential grows through the 2000s. The pattern for Cohort 38 (workers with 12-
16 years of education and 54-58 years old), shown in Figure 5b, reveals a smaller peso crisis
effect, but a rising wage gap during the 2000s. On the other hand, Figure 5c shows that the wage
gap for the “high migration” cohort (19 to 23-year-old workers with 6-9 years of education)
either remains flat or falls slightly throughout the 2000s. These differences across cohorts are
consistent with the idea that migration helps to integrate markets by closing the wage differential
across countries.
Census Data
We employ three years of census data from Mexico and the US: 1990, 2000 and 2010.
We use a 10 percent sample from the Mexican census. For the years 1990 and 2000, we use a 5
percent sample from the US census. For 2010, we employ the American Community Survey,
which is a 1 percent sample of the population.
The sample selection criteria that we use for the census data mimic those of the survey
data. Specifically, we include men between ages 19 and 63 who report positive income in the
previous year. In Mexico, hourly wages are constructed by taking monthly earnings and then
dividing by reported hours worked during a typical week times 4.33. In the United States, hourly
wages were computed by taking reported yearly earnings and then dividing by reported usual
hours worked per year.8 As with the survey data, all wages are in 1990 US dollars. Mexican
wages were, once again, converted to 1990 dollars by first converting wages in pesos to US
dollars using the exchange rate for that year and then deflating them to 1990 dollars using
the US CPI.9
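The wage construction just described can be sketched as follows. This is a minimal illustration, assuming hypothetical function and argument names; the 4.33 weeks-per-month factor and the order of operations (exchange-rate conversion first, then deflation by the US CPI) follow the text, while any numbers a reader plugs in are their own.

```python
WEEKS_PER_MONTH = 4.33  # factor stated in the text


def mexican_hourly_wage(monthly_earnings, weekly_hours):
    """Monthly earnings divided by (typical weekly hours x 4.33)."""
    return monthly_earnings / (weekly_hours * WEEKS_PER_MONTH)


def us_hourly_wage(yearly_earnings, weekly_hours, weeks_worked):
    """Yearly earnings divided by usual hours worked per year."""
    hours_per_year = weekly_hours * weeks_worked
    return yearly_earnings / hours_per_year


def pesos_to_1990_dollars(wage_pesos, pesos_per_dollar, us_cpi_year, us_cpi_1990):
    """Convert pesos to current US dollars, then deflate to 1990 dollars.

    pesos_per_dollar is the exchange rate for the survey year; the two CPI
    arguments are US CPI index values for the survey year and for 1990.
    """
    wage_current_usd = wage_pesos / pesos_per_dollar
    return wage_current_usd * us_cpi_1990 / us_cpi_year
```

The alternative mentioned in footnote 9 (deflating with the Mexican CPI first and converting at the 1990 exchange rate) would simply swap the order of the two steps and the price index used.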
We employ two samples from the Mexican census. The first is a sample of all workers
meeting the criteria defined above, which we call “Sample 1.” The second is a sample of
primarily urban dwellers that includes the metropolitan areas employed in the survey data. We
call this “Sample 2.”
Table 2 displays descriptive statistics from the census data. We see that the average US
wage was between $14.21 and $15.07 for the three census years. In Mexico for Sample 1,
average wages were between $1.43 and $1.59 and increased steadily over the 20-year period.
The mean wages were slightly higher in Sample 2, when we restricted the sample to urban dwellers. The
average age in the US sample ranged between 36.83 and 39.66 and increased over time. The
average age in Mexico also increased over the 20-year period, ranging from 34.79 to 37.10 in
Sample 1 and from 34.59 to 37.46 in Sample 2. Finally, as in the survey data, the statistics on years
of schooling in Mexico indicate massive gains in human capital over this period. In Sample 1,
the percentage of Mexicans with 0-4 years of schooling in 1990 was 29.56 percent but was only
11.89 percent in 2010. Similarly, the percentage of Mexicans with 9-12 years of schooling was
8 Hours worked per year were obtained by taking usual hours worked per week times the number of weeks that the respondent reported to have worked during the year.
9 We also converted Mexican wages to 1990 US dollars by first deflating the wages to 1990 pesos using the Mexican CPI and then converting them to US dollars using the 1990 exchange rate. Overall, this alternative method did not make too much of a difference.
27.41 percent in 1990 but was 45.53 percent in 2010.10 The numbers are similar in the other
sample.
Figure 6 shows the percentages of Mexicans residing in the United States by 45 age and
education categories. Note that for reasons discussed above the education groups in the Census
data differ slightly from the survey data. The patterns in this figure are broadly consistent with
Figure 1. One key difference, however, is that we see substantially more people in the second
education category that we label as “ed1.” The reason for this is that many Mexicans leave
school between grades 5 and 6. The category “ed1” includes grade 5 in Figure 6 but excludes it
in Figure 1.
III. Results: Household Survey Data
Our main variable of interest is the long-run US-Mexican wage differential as derived in
Section I across age-education cohorts. The trend in the long-run differentials may be affected
by exogenous shocks and differences in migration costs across cohorts. To describe the changes
in the long-run differential, we use a simple trend analysis that accounts for both the peso crisis
and the 2001 trend break. Since we expect changes in wage differentials to differ between the
migrant and non-migrant groups, we also include a dummy variable for the high migration
cohort (HMC). The following regression captures all these observations:
$$
\begin{aligned}
w_{jt} ={}& \beta_1\, time_t + \beta_2\, (HMC_j \ast time_t) + \beta_3\, D94_t \\
&+ \beta_4\, (time_t \ast D94_t) + \beta_5\, D01_t + \beta_6\, (time_t \ast D01_t) + \alpha_j + \varepsilon_{jt} \qquad (8)
\end{aligned}
$$
where $w_{jt}$ is equal to the difference between the natural log of the US wage and natural log of the
Mexican wage in education-age group j at time t. Negative values indicate wage convergence. The
variable $time_t$ is a time trend; $HMC_j$ is a dummy variable that indicates whether j is the high
migration cohort (workers of age 19-23 with 6 to 9 years of schooling); $D94_t$ is a dummy variable
indicating whether the year is 1994 or later; $D01_t$ is a dummy variable indicating whether the year
is 2001 or later; $\alpha_j$ are group-specific fixed effects for an education-age group j; and $\varepsilon_{jt}$ is an error term.
10 Note that the education categories in the census data are slightly different than what we use in the survey data due to the way that years of schooling were categorized in the US census years 1990 and 2000.
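To make the regressors of equation (8) concrete, the sketch below builds them for a single observation. It is only an illustration under stated assumptions: the time index is measured in years since 1988 (the survey data are quarterly, so the actual trend unit may differ), the interaction terms follow the description of Table 3, and the function and key names are hypothetical. Group fixed effects are omitted.

```python
def trend_break_regressors(year, is_hmc, base_year=1988):
    """Regressors for one cohort-period observation (fixed effects omitted).

    base_year is an assumed origin for the time trend.
    """
    time = year - base_year
    d94 = 1 if year >= 1994 else 0   # peso-crisis break: 1994 or later
    d01 = 1 if year >= 2001 else 0   # second break: 2001 or later
    hmc = 1 if is_hmc else 0         # high migration cohort dummy
    return {
        "time": time,
        "hmc_x_time": hmc * time,
        "d94": d94,
        "time_x_d94": time * d94,
        "d01": d01,
        "time_x_d01": time * d01,
    }
```

Stacking these rows across cohorts and periods, together with cohort dummies, would give a design matrix for the trend regressions.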
The results of the trend analysis based on equation (8) and its variations are reported in
Table 3. The following results do not use weights, but in separately available results, we find
that the same qualitative results emerge when we use US sample weights, Mexican sample
weights, US cell sizes, and Mexican cell sizes as weights. All equations include fixed cohort
effects and all estimated coefficients are statistically significant at the 1% level.
Table 3 displays four variations of equation (8). The first column just includes the time
trend. The positive sign indicates overall divergence, but the coefficient is quite small. Figure 3,
however, shows the importance of controlling for macroeconomic events. Column 2, therefore,
includes controls for the 1988-1994 and the 1994-2001 periods both in levels and interacted with
the time trend. The overall trend (which represents 2001-2011) more than triples, representing
overall divergence in wage differentials. Note that the controls for the two periods show the
response to shocks with high intercept terms and large and negative convergence estimates.
We are also interested in the possibility that the rates of convergence differ across cohort
characteristics. In particular, we are interested in whether or not the high-migration cohort
exhibits different trends than the rest of the sample. Columns (3) and (4) show that the high
migration cohort exhibits more convergence than the rest of the sample both with and without
controls for the different macroeconomic shocks. These results are therefore consistent
with the hypothesis that migration helps close the wage gap between the United States and
Mexico, although the gap as a whole has not been getting smaller.
IV. Results: Census Data
Mean Wage Differentials
We begin by plotting the mean wage differential for education cohort i and
age k at time t in Figure 7 to provide a visual understanding of the wage differentials in the
census data. We do so using both samples from the Mexican census described in Section II. We
see that for people with less education (i.e. 0 to 8 years of education) there was little change in
the differential between 1990 and 2000 but there was a substantial decline between 2000 and
2010. This is the case in both Mexican samples. Also noteworthy is that the mean differentials
are smaller when we use Sample 2, which is the more urban sample; this is a consequence of
urban areas being richer. Once we move on to people with slightly more years of schooling, we
see a more attenuated decline between 2000 and 2010 while there still is little difference between
1990 and 2000. Finally, for the most educated cohort (more than 16 years of schooling), there is
little difference from 1990 to 2010. Overall, this figure reflects the key finding from the survey
data which is that there is some evidence of wage convergence for less educated people, although
in the census, these results are concentrated during the 2000’s.
In an attempt to quantify some of the results in Figure 7, we estimate the following
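regression model:
$$
\bar{w}_{ikt} = \alpha + \sum_{i} \beta_i E_i + \sum_{t} \gamma_t T_t + \sum_{i} \sum_{t} \delta_{it}\, (E_i \times T_t) + \varepsilon_{ikt}
$$
in which we regress the wage differential $\bar{w}_{ikt}$ for each education/age cohort on a set of education dummies $E_i$
(indexed i) and time dummies $T_t$ together with their interactions. The results are reported in Table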
4.11 In the first two columns, we employ Sample 1 from the Mexican census and in the last two
columns, we employ Sample 2. In the first and third columns, we weight education/age/year
cells using weights from the US census and in the second and fourth columns, we use weights
from the Mexican census. These adjust each education/age/time cell for the share of the
population that they represent in either Mexico or the US for that year.12
The table essentially reinforces the results shown in Figure 7 but does provide some
additional quantitative content. First, the constants in each column range from 2.25 to 2.39,
suggesting that in 1990, people with zero to four years of schooling earned about ten times as
much in the US as in Mexico. This is broadly consistent with the average wage differentials
shown in Table 2 for the census data as well as with figures shown in Table 2 of Chiquiar and
Hanson (2005). Note that these differentials, which are on the order of about ten, are larger than the
differentials obtained from the survey data, which are on the order of five; this is not a
consequence of differences between the Mexican survey and census data but instead of differences in
the US data, since US wages in the CPS are lower than in the census.
Next, the first column suggests that there was a substantial widening of the wage
differential in 2000 but this is not borne out in the next three columns. Moreover, the last two
columns, in which we employ Sample 2 from the Mexican census, show a statistically significant
narrowing of the differential from 1990 to 2000. One reason for this discrepancy could be that
11 Note that we use people ages 19-63 for the first four education groups but only people ages 22-63 for the last education group which yields 222 groups per year.
12 Once again, bear in mind that we have two layers of weighting. In the first, we use the weights from the US and Mexican Censuses to construct averages for each age/education/time cell; these weights come from their respective Census. In the second, we weight each cell average with either the US or the Mexican weights for that cell.
weights based on the US census place more emphasis on better educated people for whom we
see substantial wage divergence in 2000 as shown in the fifth panel of Figure 8 in the first
column. However, it is not quite appropriate to attribute the negative estimates for the year 2000
dummy to a narrowing of the wage differential during the nineties. The reason for this is that
the interactions between the 2000 dummy and the education variables in columns three and four are
by and large positive and at least marginally significant for up to 12 years of schooling.
Moreover, they tend to be larger in magnitude than the 2000 dummy, which is indicative of a
widening of the US-Mexico wage gap during the nineties, consistent with the results
from the survey data.
Finally, looking at the interactions between years of schooling and the 2010 dummy, we
see evidence of convergence for less educated cohorts during the 2000’s. This is true regardless
of how we weight the regressions or what sample we use. In the first column, we see that the
interactions with 0-4 and 5-8 are -0.163 and -0.137 and in the second column, they are -0.162
and -0.096. This indicates that, for these less-educated cohorts, the wage differential in 2010
was between 85.0 percent and 90.9 percent of what it was in 2000. The corresponding
interactions are -0.110 and -0.139 in column three and -0.109 and -0.089 in column four.
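Because the dependent variable is a difference in log wages, a coefficient of c on the 2010 interaction implies that the wage ratio in 2010 is exp(c) times its 2000 value. The short sketch below applies this conversion to the column one and column two coefficients quoted above; the function name is illustrative, and the results should land close to the 85 to 91 percent range discussed in the text.

```python
import math


def ratio_from_log_change(coef):
    """Convert a change in the log wage differential into a ratio of wage ratios."""
    return math.exp(coef)


# Coefficients quoted in the text for columns one and two (0-4 and 5-8 years).
for coef in (-0.163, -0.137, -0.162, -0.096):
    share = 100 * ratio_from_log_change(coef)
    print(f"{coef:+.3f} log points -> {share:.1f}% of the 2000 differential")
```

The same conversion explains why constants of 2.25 to 2.39 correspond to US wages roughly ten times Mexican wages.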
Changes in the Relative Wage Distribution over Time
Next, we investigate how the US and Mexican wage distributions evolved from 1990 to
2010. To do this, we compute differences in percentiles of the US and Mexican wage
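distribution by education and year for 2000-1990 and 2010-2000. To fix ideas, we let
$q^{l}_{k,t}(\tau)$ denote the $\tau$th percentile for education cohort k at year t in country l. We then plot
$$
\left( q^{US}_{k,2010}(\tau) - q^{MX}_{k,2010}(\tau) \right) - \left( q^{US}_{k,2000}(\tau) - q^{MX}_{k,2000}(\tau) \right)
$$
and
$$
\left( q^{US}_{k,2000}(\tau) - q^{MX}_{k,2000}(\tau) \right) - \left( q^{US}_{k,1990}(\tau) - q^{MX}_{k,1990}(\tau) \right)
$$
as a function of $\tau$. The first term in parentheses in each of these expressions is the wage
differential at the $\tau$th percentile between the US and Mexico in either 2010 or 2000. The second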
term is the same quantity but from the previous census year. The difference in the two
expressions in parentheses is then the change in the cross-border differential at a particular
percentile over a ten year period. At this point, we only consider three educational cohorts since
computing percentiles is more demanding of the data than computing means; the three cohorts
that we consider are 0-11 (no high school), 12-15 (high school) and more than 15 years of
schooling (college).
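The percentile comparison described above can be sketched in a few lines. This is a simplified illustration: it takes plain lists of log wages and ignores the census sampling weights, the interpolation rule for percentiles is an assumption, and the function names are hypothetical.

```python
def percentile(values, tau):
    """tau-th percentile (0-100), interpolating linearly between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * tau / 100.0
    low = int(pos)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (pos - low)


def differential_change(us_new, mx_new, us_old, mx_old, tau):
    """Change in the US-Mexico gap at percentile tau between two census years.

    Each argument is a list of log wages for one education cohort,
    country and census year. Negative values indicate convergence.
    """
    gap_new = percentile(us_new, tau) - percentile(mx_new, tau)
    gap_old = percentile(us_old, tau) - percentile(mx_old, tau)
    return gap_new - gap_old
```

Evaluating `differential_change` over a grid of `tau` values for one education cohort would trace out one curve of the kind plotted against the percentile.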
In Figure 8, we plot the changes in the relative wage distributions for 2010-2000 and
2000-1990 using both samples from the Mexican census. The most striking results are in the first
row, which displays 2010-2000. First, we see that, at all points in the wage distribution, there
was a narrowing of the cross-border differential for people with less than twelve years of
schooling. The estimates indicate that the wage differential in 2010 was roughly 85 percent of
what it was in 2000 in Sample 1 and 80 percent of what it was in 2000 in Sample 2. For high school and
college graduates, we see convergence at the lower end of the distribution. The estimated change
in the differential is negative through the 20th percentile for the college-educated and the 40th
percentile for the high school-educated in Sample 1. In Sample 2, we do not see convergence for
college graduates, but we do see it up to the 40th percentile for high school graduates. This indicates
that the wages of US workers in the bottom half of the distribution became closer to those of their
counterparts across the border in the 2000s.
The bottom panel displays the difference from 1990 to 2000. In Sample 1, the figure
shows no stark patterns and, overall, is not indicative of any convergence in the two wage
distributions over this period. However, in Sample 2, we see some evidence of convergence
among the college-educated; in particular, their wages in Mexico in 2000 were roughly 85% of
what they were in 1990. However, the survey data results indicate that the peso crisis led to a
large divergence during the mid-90’s and that this may account for the lack of evidence of
convergence which we see in Figure 8 for the period 1990-2000.
An important question to ask at this point is whether these changes are driven by Mexico
catching up or the US falling behind. To answer it, we plot the change in the wage distributions in
the US and Mexico from 1990-2000 and 2000-2010. For each Mexican sample, we display these
four profiles in three graphs corresponding to the three educational cohorts. The panel for people
with less than twelve years of schooling indicates that a large part of the convergence that we see
for the less educated is a consequence of US workers falling behind. Indeed, real wages in the
US fell about 0.12 log points at all points in the distribution over this period. In contrast, there
were modest gains in Mexican wages over this period. Turning to high school graduates in the
middle panel, we see that from 2000-2010, US wages fell behind quite a bit, particularly, at the
bottom of the distribution. Mexican wages also declined over this period but, typically, by a
smaller magnitude.
However, there is one very important difference in the behavior of the wage structure of
high school graduates from 2000-2010 between the United States and Mexico. We see that the
plot for the United States is increasing and that the plot for Mexico is decreasing. What this
means is that the losses in the United States disproportionately hit the poor, whereas in Mexico,
they disproportionately hit people towards the top of the distribution. This suggests that
although mean wages of high school graduates may have fallen during the 2000’s in both
countries, inequality for this group declined in Mexico but increased in the US.
We now turn to the college-educated in the third row. In Sample 1, we do not see terribly
strong evidence of either Americans falling behind or Mexicans catching up during either the
1990’s or the 2000’s. However, the results are starker in Sample 2. The wages of the college-
educated in Mexico declined between 2000 and 2010 by roughly 10%. However, we also see
that between 1990 and 2000, Mexican wage growth was over 10% larger than in the US at most
points in the wage distribution. This suggests that the evidence for convergence that we saw in
Figure 8 for the college-educated between 1990 and 2000 was due to gains in Mexico.
Triple Diffs: Comparisons between the Border and the Interior
One way in which we can attempt to tease out the extent to which trade or migration is
responsible for the observed narrowing of the US-Mexico wage gap during the period 2000-2010
in the census data is to conduct a similar analysis as in the previous section but to compare these
changes between Mexico’s border and interior states. The rationale behind this exercise is that, as
pointed out by many including Robertson (2000), Mexico’s border is more tightly linked with the
United States than its interior. The two reasons for this are the presence of the maquiladora
industry which is concentrated primarily along the US-Mexico border and the fact that many
border cities are conduits for migrants, notably, Tijuana. In addition and perhaps more
important, Figure 3 showed that the peso crisis of 1994 most likely confounds our ability to
detect any convergence during the 1990’s that may have occurred due to trade or migration.
Because the crisis impacted the entirety of Mexico, this third difference mitigates the bias from
this confounding factor.
To investigate this, we consider a triple-difference version of the exercise from the
previous section. Specifically, we compute
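$$
\left[ \left( q^{US}_{k,2010}(\tau) - q^{B}_{k,2010}(\tau) \right) - \left( q^{US}_{k,2000}(\tau) - q^{B}_{k,2000}(\tau) \right) \right]
- \left[ \left( q^{US}_{k,2010}(\tau) - q^{I}_{k,2010}(\tau) \right) - \left( q^{US}_{k,2000}(\tau) - q^{I}_{k,2000}(\tau) \right) \right]
$$
where $q^{r}_{k,t}(\tau)$ is the $\tau$th percentile of the wage distribution for education cohort k in year t and region r, the superscript B denotes Mexico’s border region and I denotes Mexico’s interior.13 So,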
we look at how the change in the US-Mexico wage gap between 2010 and 2000 changes as we
move from Mexico’s border to its interior.
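The triple difference described above can be sketched as follows. This is an illustrative sketch only: the inputs are assumed to be dictionaries mapping a census year to a percentile of log wages for one education cohort, the sign convention (border minus interior) is an assumption, and the function names are hypothetical.

```python
def double_diff(us_new, mx_new, us_old, mx_old):
    """Change over time in the US-Mexico gap at a given percentile."""
    return (us_new - mx_new) - (us_old - mx_old)


def triple_diff(us, border, interior, new=2010, old=2000):
    """Border change in the gap minus interior change in the gap.

    us, border, interior: dicts mapping census year -> percentile of log wages.
    A positive value means the gap widened along the border relative to the interior.
    """
    dd_border = double_diff(us[new], border[new], us[old], border[old])
    dd_interior = double_diff(us[new], interior[new], us[old], interior[old])
    return dd_border - dd_interior
```

Note that the US terms cancel in the subtraction, so the statistic reduces to the interior wage change minus the border wage change. Anything that shifts wages equally in both Mexican regions also drops out, which is why the exercise mitigates the confounding effect of the peso crisis.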
We report the results in Figure 10. During the period 2000-2010, we do not see any
evidence that convergence was any faster along the border than in the interior. In fact, using
Sample 2 from the Mexican census, we actually see that, relative to the interior, the wage
differential along the border expanded from 2000 to 2010. What this may then indicate is that
during the period 2000-2010 light industries may have exited Mexico’s border region, thereby
reducing wages there vis-à-vis the interior. Next, we see that during the period 1990-2000
wages in Mexico’s border region increased at a more rapid rate than in the interior. This is
particularly the case in Sample 2.
It is important to emphasize that we see large movements in wage differentials in the
border area relative to the interior at least once we restrict the sample to more urban areas.
During the 1990’s, wages in these cities close to the border saw large gains relative to the rest of
Mexico and this was subsequently reversed in the 2000’s. This is suggestive that trade has the
potential to narrow US-Mexico wage differentials but, at the same time, it also suggests that US-
Mexico trade is not responsible for the convergence that we saw in the survey and the census
13 We define “border” to be all of Mexico’s states that border with the United States which includes Baja California, Sonora, Chihuahua, Tamaulipas and Coahuila. When we employ Sample 1, we use all wages from these states which include those from rural areas. When we employ Sample 2, we only use selected cities which include large border towns such as Tijuana and Juarez.
during the 2000’s since wages in Mexico’s maquiladora sector took a substantial hit during this
period. Rather, it may indicate that a third factor such as Chinese competition adversely
impacted both Mexican and US wages.
Variance Decompositions
We conclude the analysis of the census data with a variance decomposition exercise. It is
common in the inequality literature (e.g. Lemieux 2008) to decompose the variance of location l
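at time t into its “within” and “between” components as follows:
$$
V_{lt} = W_{lt} + B_{lt}
$$
where
$$
W_{lt} = \sum_{i,k} \theta_{iktl}\, \sigma^2_{iktl}
$$
and
$$
B_{lt} = \sum_{i,k} \theta_{iktl} \left( \bar{w}_{iktl} - \bar{w}_{lt} \right)^2 ,
$$
where $\theta_{iktl}$ is the population weight for cell i,k,t in country l, $\sigma^2_{iktl}$ is the variance in cell i,k,t in
country l, $\bar{w}_{iktl}$ is the average log wage in cell i,k,t in country l and $\bar{w}_{lt}$ is the average of the log
wage at time t in country l. The within component measures variation in wages within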
education/age cohorts, whereas the between component measures variation across education/age
cohorts. We conduct this wage decomposition for the US and Mexico. We also combine data
from the two countries and conduct the exercise for the integrated economy with appropriate
modifications to the weights for relative country sizes and using the grand mean of the wage in
the US and Mexico in the formula for the between component.
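A minimal sketch of the within/between decomposition for one country and year follows. It assumes each education/age cell is summarized by a population weight, a mean log wage and a within-cell variance; the dictionary keys and function name are hypothetical, and the weights are normalized to sum to one inside the function.

```python
def variance_decomposition(cells):
    """Split the variance of log wages into within- and between-cell components.

    cells: list of dicts with hypothetical keys
        'weight' (population weight), 'mean' (average log wage in the cell),
        'var' (variance of log wages within the cell).
    """
    total_weight = sum(c["weight"] for c in cells)
    shares = [c["weight"] / total_weight for c in cells]
    grand_mean = sum(s * c["mean"] for s, c in zip(shares, cells))
    within = sum(s * c["var"] for s, c in zip(shares, cells))
    between = sum(s * (c["mean"] - grand_mean) ** 2 for s, c in zip(shares, cells))
    return {"within": within, "between": between, "total": within + between}
```

For the integrated economy, the same function could be applied to the pooled list of US and Mexican cells, with weights rescaled for relative country sizes so that the grand mean is taken over both countries.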
Before we discuss our results, it is useful to consider how a simple HOS story with two
countries would play out. In the aftermath of trade liberalization, demand for low-skilled labor
in the United States should decline but increase in Mexico and, more generally, within a given
skill set, wages should converge. What this suggests then is that in the US-Mexico integrated
economy, the within group component of inequality should decline over time. Next, given the
conventional wisdom that trade should hurt lower skilled workers in the United States but help
them in Mexico, we should also expect to see that the between component of the variance should
increase in the United States but decrease in Mexico.
The results are reported in Table 5. First, the table indicates that the total variance of log
wages in the integrated economy has steadily declined since 1990 when we use all Mexicans
but not when we restrict the sample to urban Mexicans. We do see that the within component of
variance declined in the integrated economy between 1990 and 2000 but increased in 2010. Next, we see
that the variance of wages has declined steadily in Mexico since 1990, but this decline is due to
changes in the within component not the between component. Finally, inequality in the United
States has steadily increased from 1990-2010, but similar to Mexico, this increase is due to
increases in the within component of inequality. In summary, the data seem to suggest that
Mexican wage dispersion has decreased and that American inequality has done the opposite but
that this is not consistent with a textbook two-country HOS story.
V. Conclusion
In this paper, we presented descriptive evidence on the evolution of wage differentials
between the United States and Mexico over the period 1988-2011. On net, we showed that
wages between the two countries diverged over this period. However, this had much to do with
the peso crisis of 1994. Subsequently, there was a large convergence until 2001, the year in
which China entered the WTO, after which we saw steady divergence. These findings strongly
indicate that the divergence from 1988-2011 had much to do with large macroeconomic events
which may have counteracted the effects of US-Mexico trade and migration.
A more detailed look at our data reveals that trade and migration may indeed bring more
wage convergence, despite the overall divergence in the raw data. First, in the survey data, we
show that, the peso crisis notwithstanding, there is steady convergence for young people with
intermediate levels of schooling who are precisely the people who are most likely to emigrate
from Mexico. One important topic for future work is to investigate more rigorously the effects
of migration on US-Mexico long-run wage differentials. Second, in the census data, we show
that over the period 1990-2000 the border of Mexico caught up to the US relative to the
interior. This exercise has the added benefit that it greatly mitigates the confounding effects of
the peso crisis, which allows us to better see the effects of NAFTA, which should have been more
prevalent in the border. On the other hand, this same exercise reveals that during the period
2000-2010 there was divergence in the border relative to the interior. Given that we also
saw that low-skilled US wages declined by around 10% over this period, this suggests that a
third factor may have had adverse effects on the Mexico border and low-skilled US wages.
Autor, Dorn, and Hanson (2013) show that much of the latter can be attributed to Chinese trade.
Another important topic for future work is to conduct a similar analysis in Mexico.
References
Autor, David H., David Dorn, and Gordon H. Hanson. (2013). The China Syndrome: Local Labor Market Effects of Import Competition in the United States. American Economic Review 103(6): 2121-68.
Banco de Mexico. (2013). Exchange rate, Pesos per US dollars (Daily). Retrieved from http://www.banxico.org.mx/SieInternet/consultarDirectorioInternetAction.do?accion=consultarCuadro&idCuadro=CF102&sector=6&locale=en
Borjas, George J. (2003). The Labor Demand Curve Is Downward Sloping: Reexamining the Impact of Immigration on the Labor Market. The Quarterly Journal of Economics, 118(4): 1335-1374.
Brown, Drusilla (1992) “The Impact of a North American Free Trade Area: Applied General Equilibrium Trade Models” in Lustig, Nora, Barry P. Bosworth, Robert Z. Lawrence (eds.) North American Free Trade: Assessing the Impact The Brookings Institution, Washington D.C.
Card, David. (1990). The Impact of the Mariel Boatlift on the Miami Labor Market. Industrial and Labor Relations Review, 43(2): 245-257.
Card, David. (2001). Immigrant Inflows, Native Outflows and the Local Labor Market Impacts of Higher Immigration. Journal of Labor Economics, 19(1): 22-64.
Bureau of Labor Statistics. Consumer Price Index. Retrieved May 11 2013. http://www.bls.gov/cpi/
Chiquiar, Daniel. (2001). Regional Implications of Mexico’s Trade Liberalization. Mimeo UCSD.
Chiquiar, Daniel and Gordon H. Hanson. (2005). Internal Migration, Self-Selection and the Distribution of Wages: Evidence from Mexico and the United States. Journal of Political Economy 113(2): 239-281.
Davis, Donald and Prachi Mishra (2007). Stolper-Samuelson is Dead: And Other Crimes of Both Theory and Data, in Globalization and Poverty, ed. by A. Harrison. University of Chicago Press, Chicago, IL.
Deaton, Angus. (1985). Panel data from time series of cross sections. Journal of Econometrics 30(1): 109-126.
Deaton, Angus. (1997). The Analysis of Household Surveys: A Microeconomic Approach to Development Policy. Johns Hopkins University Press: Baltimore.
Dussel Peters, Enrique and Kevin P. Gallagher. (2013) “NAFTA’s Uninvited Guest: China and the Disintegration of North American Trade” CEPAL Review 110(August): 83-108.
Goldberg, P. and N. Pavcnik (2007). Distributional Effects of Globalization in Developing Countries. Journal of Economic Literature 45(1): 39-82.
Hanson, G. H. (1996). Localization Economies, Vertical Organization, and Trade. American Economic Review 86(5): 1266-1278.
Hanson, G. H. (1997). Increasing Returns, Trade, and the Regional Structure of Wages. Economic Journal 107(440): 113-133.
Hanson, G. H. (2003). What Has Happened to Wages in Mexico Since NAFTA? Implications for Hemispheric Free Trade. NBER Working Paper Series, 9563.
Hendry, D.F. and N.R. Ericsson (1991). Modeling the Demand for Narrow Money in the United Kingdom and the United States. European Economic Review 35(4): 833-886.
Lemieux, T. (2008). What Do We Really Know About Changes in Wage Inequality? Mimeo UBC.
Mishra, P. (2007). Emigration and wages in source countries: Evidence from Mexico. Journal of Development Economics 82(1): 180-199.
Office of the United States Trade Representative. Mexico. Retrieved April 8 2013. http://www.ustr.gov/countries-regions/americas/mexico
Roberts, Bryan, Gordon Hanson, Derekh Cornwell, and Scott Borger (2010) “An Analysis of Migrant Smuggling Costs along the Southwest Border” Department of Homeland Security Office of Immigration Statistics Working Paper, November. https://www.dhs.gov/xlibrary/assets/statistics/publications/ois-smuggling-wp.pdf.
Robertson, Raymond. (2000). Wage Shocks and North American Labor-Market Integration. American Economic Review, 90(4): 742-764.
Robertson, Raymond (2005) “Has NAFTA Increased Labor Market Integration between the United States and Mexico?” The World Bank Economic Review, 19: 425-448.
Robertson, Raymond; Kumar, Anil; Dutkowsky, Donald (2009) “Purchasing Power Parity and Aggregation Bias in a Developing Country: The Case of Mexico” Journal of Development Economics November, 90(2): 237-243.
Schott, Peter K. (2003). "One Size Fits All? Heckscher-Ohlin Specialization in Global Production," American Economic Review June 93(3): 686-708.
United States Census Bureau. Trade in Goods with Mexico. Retrieved April 8 2013. http://www.census.gov/foreign-trade/balance/c2010.html
Vogelsang, Timothy J. and Pierre Perron (1998) “Additional Tests for a Unit Root Allowing for a Break in the Trend Function at an Unknown Time” International Economic Review November 39(4): 1073-1100.
Table 1: Summary Statistics of Survey Data
United States
1988‐1994 1995‐2002 2003‐2007 2008‐2011
Monthly Wage $1,492.69 $1,504.65 $1,515.75 $1,466.30
(679.02) (703.75) (677.00) (681.38)
Hourly Wage $8.26 $8.27 $8.41 $8.28
(3.42) (3.52) (3.41) (3.45)
Age 37.45 38.74 39.85 40.54
(0.29) (0.45) (0.19) (0.18)
Education
0‐5 1.60% 2.30% 2.40% 2.10%
6‐8 2.70% 1.60% 1.40% 1.20%
9‐11 7.50% 7.80% 7.90% 6.50%
12‐15 61.50% 59.40% 57.00% 56.60%
>=16 26.70% 28.90% 31.30% 33.60%
Mean N per quarter 21,155.89 19,393.91 20,960.35 19,667.75
Mexico
1988‐1994 1995‐2002 2003‐2007 2008‐2011
Monthly Wage $310.57 $260.24 $272.11 $226.50
(175.59) (149.47) (135.21) (112.70)
Hourly Wage $2.09 $1.36 $1.41 $1.24
(1.33) (0.81) (0.74) (0.64)
Age 35.05 35.56 36.88 37.32
(0.11) (0.41) (0.35) (0.09)
Education
0‐5 18.40% 14.30% 12.90% 12.40%
6‐8 27.70% 26.80% 23.60% 22.10%
9‐11 24.10% 30.60% 31.60% 33.20%
12‐15 13.40% 13.10% 16.90% 18.90%
>=16 16.40% 15.20% 15.00% 13.40%
Mean N per quarter 33,445.89 42,934.50 31,427.05 27,756.00
Notes: All wages are in 1990 US dollars. In Mexico, the monthly wage was computed by converting wages to US dollars using the exchange rate for that year and then deflating the wages using the US CPI. Standard deviations are in parentheses. Mean N per quarter represents the average number of observed individuals per quarter per period (without population weight expansion).
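The conversion described in the notes can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure actually coded for the paper: the function name, argument names, and all input numbers are hypothetical, and the CPI is assumed to be rebased so that 1990 is the reference year.

```python
def real_usd_wage(peso_wage, pesos_per_usd, us_cpi, us_cpi_1990):
    """Convert a nominal peso wage to 1990 US dollars (hypothetical helper).

    Step 1: convert to nominal US dollars at that year's exchange rate.
    Step 2: deflate with the US CPI, rebased so that 1990 is the base year.
    """
    if pesos_per_usd <= 0 or us_cpi <= 0:
        raise ValueError("exchange rate and CPI must be positive")
    nominal_usd = peso_wage / pesos_per_usd
    return nominal_usd * (us_cpi_1990 / us_cpi)


# Made-up inputs for illustration only; these are not values from the paper.
wage = real_usd_wage(peso_wage=3000.0, pesos_per_usd=10.0,
                     us_cpi=150.0, us_cpi_1990=100.0)
print(round(wage, 2))  # 200.0
```

Because the deflator is the US CPI in both countries, differences between US and Mexican wages measured this way reflect exchange-rate movements as well as local wage changes.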
Table 2: Descriptive Statistics from Census Data
US
1990 2000 2010
Hourly Wage 14.21 15.07 14.98
(11.38) (12.49) (13.09)
Age 36.83 38.33 39.61
(11.59) (11.50) (12.27)
Education
0-4 1.56% 1.56% 1.50%
5-8 3.26% 3.20% 3.01%
9-12 37.72% 35.42% 32.36%
13-16 47.99% 49.66% 52.07%
>16 9.47% 10.15% 11.06%
N 1,982,151 2,361,079 496,042
MX – Sample 1
1990 2000 2010
Hourly Wage 1.43 1.55 1.59
(1.82) (1.92) (1.81)
Age 34.79 35.39 37.10
(11.20) (11.04) (11.38)
Education
0-4 29.56% 18.10% 11.89%
5-8 30.01% 26.49% 21.60%
9-12 27.41% 37.42% 45.53%
13-16 5.62% 9.54% 12.22%
>16 7.42% 8.45% 8.77%
N 1,264,613 1,597,037 1,754,953
MX – Sample 2
1990 2000 2010
Hourly Wage 1.61 1.77 1.74
(1.98) (2.15) (1.97)
Age 34.59 35.42 37.46
(10.97) (10.91) (11.35)
Education
0-4 18.38% 10.95% 7.30%
5-8 31.00% 24.65% 18.85%
9-12 33.04% 43.12% 49.24%
13-16 7.81% 11.80% 14.62%
>16 9.76% 9.47% 9.99%
N 507,068 538,663 360,515
Notes: All wages are in 1990 US dollars. In Mexico, the hourly wage was computed by converting wages to US dollars using the exchange rate for that year and then deflating the wages using the US CPI. US census data were 5% samples except for the American Community Survey sample in 2010 which was a 1% sample. The Mexican census was a 10% sample for all three years. MX – Sample 1 uses all people who meet the sample criteria described above. MX – Sample 2 uses these criteria and further restricts the sample to the metropolitan areas that are employed in the Mexican survey data.
Table 3: Trends in US-Mexico Wage Gap
                    (1)          (2)          (3)          (4)
VARIABLES           Trend        Breaks       Migrants     Migrants and Breaks
Time                0.002***     0.007***     0.002***     0.007***
                    (0.000)      (0.000)      (0.000)      (0.000)
Migrant_x_time                                -0.003***    -0.003***
                                              (0.001)      (0.001)
1988-1994                        4.139***                  4.139***
                                 (0.077)                   (0.076)
1994-2001                        3.187***                  3.187***
                                 (0.104)                   (0.104)
Trend in 88-94                   -0.031***                 -0.031***
                                 (0.001)                   (0.001)
Trend in 94-2001                 -0.019***                 -0.019***
                                 (0.001)                   (0.001)
Constant            1.448***     0.393***     1.437***     0.381***
                    (0.028)      (0.048)      (0.028)      (0.048)
Observations        4,320        4,320        4,320        4,320
Number of cohorts   45           45           45           45
Notes: Standard errors in parentheses. *** p<0.01.
Table 4: Mean Wage Difference Regressions, Census Data
                    (1)          (2)          (3)          (4)
Constant            2.393***     2.402***     2.238***     2.250***
                    (0.052)      (0.012)      (0.056)      (0.016)
Years of Education
0-4                 -            -            -            -
5-8                 -0.207***    -0.272***    -0.093       -0.154***
                    (0.064)      (0.017)      (0.069)      (0.020)
9-12                -0.314***    -0.340***    -0.170***    -0.196***
                    (0.054)      (0.017)      (0.058)      (0.019)
13-16               -0.569***    -0.586***    -0.464***    -0.480***
                    (0.053)      (0.030)      (0.057)      (0.028)
>16                 -0.332***    -0.358***    -0.249***    -0.270***
                    (0.057)      (0.027)      (0.061)      (0.027)
Year
2000                0.084**      0.033        -0.087**     -0.113***
                    (0.030)      (0.033)      (0.032)      (0.031)
2010                -0.009       -0.008       -0.003       -0.013
                    (0.029)      (0.032)      (0.031)      (0.030)
Education*Year
0-4*2000            -0.058       -0.005       0.129        0.157***
                    (0.080)      (0.038)      (0.085)      (0.040)
5-8*2000            -0.061       0.033        0.100        0.167***
                    (0.059)      (0.037)      (0.064)      (0.036)
9-12*2000           -0.033       0.020        0.120***     0.145***
                    (0.033)      (0.037)      (0.036)      (0.034)
13-16*2000          -0.204***    -0.136       -0.025       0.012
                    (0.032)      (0.048)      (0.035)      (0.043)
0-4*2010            -0.163**     -0.162***    -0.110       -0.104***
                    (0.080)      (0.040)      (0.087)      (0.042)
5-8*2010            -0.137***    -0.096***    -0.139**     -0.084***
                    (0.060)      (0.037)      (0.064)      (0.036)
9-12*2010           0.008        -0.007       -0.006       -0.010
                    (0.033)      (0.036)      (0.036)      (0.034)
13-16*2010          0.014        0.022        0.038        0.057
                    (0.032)      (0.047)      (0.034)      (0.042)
MX Sample           1            1            2            2
Weights             US           MX           US           MX
R2                  0.7548       0.7472       0.7213       0.6508
Number of Cohorts   666          666          666          666
Notes: Standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. In the first and third columns, we weight the regression using weights from the US census; in the second and fourth columns, we weight the regression using weights from the Mexican census.
Table 5: Variance Decompositions, Census Data
             MX and US    MX and US    MX       MX       US
MX Sample    1            2            1        2        -
1990
Within       0.420        0.404        0.649    0.588    0.345
Between      1.293        1.134        0.147    0.145    0.135
Total        1.713        1.538        0.796    0.733    0.480
2000
Within       0.405        0.400        0.519    0.482    0.366
Between      1.275        1.126        0.192    0.203    0.125
Total        1.680        1.526        0.711    0.685    0.491
2010
Within       0.426        0.427        0.461    0.462    0.414
Between      1.199        1.124        0.145    0.149    0.171
Total        1.625        1.551        0.606    0.611    0.585
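In every column of Table 5 the within and between components add up to the total (for example, 0.420 + 1.293 = 1.713 in 1990). The sketch below shows one standard way such a split can be computed. It is an illustration under assumptions rather than the computation used for the table: the grouping into cells, the function name, the toy data, and the absence of census weights are all hypothetical.

```python
def variance_decomposition(groups):
    """Split total variance into within-group and between-group parts.

    `groups` maps a group label (e.g. an education-age cell; an assumption
    here) to a list of observations such as log wages. Population
    (divide-by-N) variances are used so that within + between = total.
    """
    n_total = sum(len(values) for values in groups.values())
    grand_mean = sum(sum(values) for values in groups.values()) / n_total
    within = 0.0
    between = 0.0
    for values in groups.values():
        n = len(values)
        mean = sum(values) / n
        # Squared deviations around the group's own mean.
        within += sum((x - mean) ** 2 for x in values) / n_total
        # Squared deviation of the group mean around the grand mean.
        between += n * (mean - grand_mean) ** 2 / n_total
    return within, between, within + between


# Toy data, for illustration only.
toy = {"cell_a": [1.0, 3.0], "cell_b": [5.0, 7.0]}
within, between, total = variance_decomposition(toy)
print(within, between, total)  # 1.0 4.0 5.0
```

A weighted version would replace the observation counts with sums of sampling weights; that extension is omitted to keep the sketch short.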
Figure 1: Percentage of Mexican-born Workers in the US by Age and Education, Household Surveys
Notes: The first age group includes workers aged 19-23 years old; the second includes workers aged 24-28, the third those aged 29-33, and so forth. The first education group includes adults with 0-5 years of education; the second includes adults with 6-8 years of education; the next comprise those with 9-11, 12-15, and finally 16 or more years of education.
[Plot: vertical axis from 0 to 0.7; horizontal axis: Age Group (1 to 9); one series per education group, Ed0 to Ed4.]
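The grouping described in the notes to Figure 1 can be written as two small binning functions. This is a sketch under assumptions: the function and constant names are hypothetical, the numbering (age groups from 1, education groups from 0) simply mirrors the axis and legend labels, and no upper age limit is enforced because the notes do not state one.

```python
AGE_START = 19   # the first age group covers ages 19-23
AGE_WIDTH = 5    # each age group spans five years
# Inclusive upper bounds of the first four education groups
# (household survey definition: 0-5, 6-8, 9-11, 12-15, then 16 or more).
EDU_UPPER_BOUNDS = [5, 8, 11, 15]


def age_group(age):
    """Return the 1-based age group: 19-23 -> 1, 24-28 -> 2, 29-33 -> 3, ..."""
    if age < AGE_START:
        raise ValueError("age is below the youngest group")
    return (age - AGE_START) // AGE_WIDTH + 1


def education_group(years):
    """Return the 0-based education group (0 for 0-5 years, 4 for 16 or more)."""
    if years < 0:
        raise ValueError("years of education cannot be negative")
    for group, upper in enumerate(EDU_UPPER_BOUNDS):
        if years <= upper:
            return group
    return len(EDU_UPPER_BOUNDS)


print(age_group(26), education_group(10))  # 2 2
```

With nine age groups and five education groups this scheme yields 45 cells, which appears consistent with the 45 cohorts reported in Table 3.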
Figure 2a: Time Series Behavior of Mexican Monthly Wages
Notes: Cohort 39 is excluded.
Figure 2b: Time Series Behavior of US Monthly Wages
Figure 3: Time Series Behavior of Mean Differentials by Cohorts
[Plot: US-MX Difference in Monthly Earnings by Cohort. Vertical axis: log(wage), 0.5 to 2.5; horizontal axis: Time, 1988q1 to 2012q1.]
Figure 4a: Time Series Behavior of Mean Differentials across Cohorts
Notes: The trend break test statistic is test 2a from Vogelsang and Perron (1998), which is an additive outlier test for an unknown break. Note that peaks occur at the peso crisis (December 1994), the US recession that started in March 2001, and the Financial Crisis (October 2008).
[Plot: Mean Differential and Trend Break Test Statistic. Series: Mean Wage Differential (axis from 1.4 to 2.2) and Trend Break Test Stat (axis from 0 to 60); horizontal axis: Time, 1988 to 2012.]
Figure 4b: Time Series Behavior of Standard Deviation of Differentials across Cohorts
Notes: The peso crisis occurs in December 1994 and China enters the WTO on December 11, 2001.
[Plot: Standard Deviation Across Cohorts. Vertical axis: Std. Dev. Across Cohorts, 0.05 to 0.3; horizontal axis: Time, 1988 to 2012.]
Figure 5a: Wage Differentials, 0-6 Years of Education and 34-38 Years Old
Figure 5b: Wage Differentials, 12-16 Years of Education and Age 54-58
Figure 5c: Wage Differentials, 6-9 Years of Education and 19-23 Years Old
Figure 6: Percentage of Mexican-born Workers in the US by Age and Education, Census Data
Notes: The first age group includes workers aged 19-23 years old; the second includes workers aged 24-28, the third those aged 29-33, and so forth. The first education group includes adults with 0-4 years of education; the second includes adults with 5-8 years of education; the next comprise those with 9-12, 13-16, and finally 17 or more years of education.
[Plot: vertical axis from 0 to 0.7; one series per education group, Ed0 to Ed4.]
Figure 7: Mean Wage Differentials by Age, Census Data
[Plot: two columns of panels (MX – Sample 1 and MX – Sample 2), one row per education group: Educ < 5; Educ >= 5 and <= 8; Educ >= 9 and <= 12; Educ >= 13 and <= 16; Educ > 16. Vertical axis: US-MX Wage Differential; horizontal axis: Age, 20 to 60; series: 1990, 2000, 2010.]
Figure 8: Changes in Wage Percentiles by Education
[Plot: two columns of panels (MX – Sample 1 and MX – Sample 2). Top row: US-MX Wage Diff: 2010-2000; bottom row: US-MX Wage Diff: 2000-1990. Horizontal axis: Quantile, 0 to 1; series: No High School, High School, College.]
Figure 9: Decompositions of Wage Distribution Changes by Years
[Plot: two columns of panels (MX – Sample 1 and MX – Sample 2), one row each for No High School, High School, and College. Vertical axis: Wage Growth; horizontal axis: Quantile, 0 to 1; series: US 2000-1990, MX 2000-1990, US 2010-2000, MX 2010-2000.]
Figure 10: DDD Results – Differences in Changes in Wage Percentiles by Education across Mexico’s Border and Interior
[Plot: two columns of panels (MX – Sample 1 and MX – Sample 2). Top row: US-MX Wage Diff - Triple Dif: 2010-2000; bottom row: US-MX Wage Diff - Triple Dif: 2000-1990. Horizontal axis: Quantile, 0 to 1; series: No High School, High School, College.]