
DID RAILWAYS AFFECT LITERACY? EVIDENCE FROM INDIA

LATIKA CHAUDHARY* AND JAMES FENSKE†

Preliminary Draft Prepared for the Economic History Association 2019 Annual Meeting

Abstract. In this paper, we study the effect of railways on literacy in colonial India. Using

a new district-level dataset on both total and gender-specific literacy between 1881 and 1921,

we estimate the effect of railways using an instrumental variables strategy based on distances

from a minimum spanning tree that connects major cities that existed prior to the beginning

of railroad construction, and from a plan that predates construction of the railway. We also

estimate a difference-in-differences approach that compares the change in literacy before and

after the arrival of railways within the same district. These approaches address concerns

that districts that were coastal, more developed, or more urbanised were the first to be

connected. Our preliminary results suggest large and positive effects of railways on literacy

using instrumental variables, but not with the difference-in-differences set-up.

Keywords: Colonialism, Railways, Literacy.

JEL Codes: N75, N35, R40

1. Introduction

The arrival of railroads transformed the world in the 19th century. By reducing transportation

costs and connecting different markets, railways reduced price variation across

space and time, increased access to markets and contributed to higher incomes in many

economies. Many scholars have studied the effects of railways on price dispersion, market

access and income in several countries. Yet, railways transported both goods and people.

Hence, we would expect their arrival to also affect outcomes such as the health and schooling

of people. Indeed, a small number of studies that have investigated these channels find

negative effects of railways on mortality in Japan (Tang 2017), and positive effects on school

attendance in the United States (Atack, Margo and Perlman 2012).

India saw one of the largest expansions of railways in this period, increasing from 9,893

route miles in 1881 to 37,266 in 1921 (Bogart and Chaudhary 2016). Railways reduced

transport costs and price variation across districts, which in turn translated into higher

*Associate Professor, Graduate School of Business and Public Policy, Naval Postgraduate School

†Professor, Department of Economics, University of Warwick

E-mail addresses: [email protected], [email protected].

Date: August 9, 2019.


agricultural incomes (Hurd 1975, McAlpin 1975, Donaldson 2018). But, India did not experience

rapid urbanisation or industrialisation. The forward and backward linkages between

railways and industry were weak. And, we know nothing about their effects, if any, on social

outcomes. To quantify the total effects of railways, we thus need to understand if and how

railways affected human capital among other social outcomes. Our paper takes a first step

in this direction by studying the effect of railways on literacy in India between 1881 and

1921. We focus on literacy, a key measure of schooling, which is well-documented in colonial

records of India.

There are multiple theoretical links from railways to literacy. We know the arrival of railways

in India increased agricultural income, which would generate income and substitution

effects. Many rural families in India did not send their children to school because of the

opportunity cost of child labor. Children helped their parents in the field and with other

household tasks. If income effects of railways were large and helped relax such constraints,

we would expect more families to send children to school contributing to higher literacy. If,

however, higher agricultural incomes increased the returns to child labor, we would expect

a negative effect on subsequent literacy.

Similarly, additional taxes on land could have funded rural public schools. In theory, an

increase in agricultural incomes could thus increase the supply of rural primary schools. But

this was likely a small factor in practice. First, the land revenue was fixed in perpetuity in

provinces such as Bengal on account of the Permanent Settlement of 1793, decades before

the arrival of railways. Second, even in areas where it was updated, such revisions were

infrequent (every 30 years) and not directly tied to changes in agricultural productivity.

Land revenues declined in importance over the colonial era from over 10% of agricultural

income in 1860 to 3% in 1940 (Kumar 1983). So, if rising agricultural incomes due to railways

affected schooling, this is more likely to have been driven by demand-side than supply-side

channels. Indeed, fees and other private sources accounted for 53% of total spending on

education between 1881 and 1921.1

Apart from agriculture, railways likely changed the occupational structure of districts. If

such changes increased the return to skills, we would expect a positive effect of railways on

literacy. More people traveling across districts and more information flowing across space

would also contribute to higher literacy.

In order to test for a link between railways and human capital in colonial India, we use

new data on district-level literacy from 1881 to 1921 for all the districts of British India. Our

data includes total literacy and literacy disaggregated by gender. To these data we merge

the opening date of railways in each district. By 1881, 51% of British Indian districts were

connected to a railroad increasing to 96% by 1921. Unlike railroad growth, literacy only

1 This refers to spending on recognised schools that came under the supervision of public education departments. See Chaudhary (2016) for details.


increased from 3.16% in 1881 to 6% in 1921. Differences across districts in access to railways

and literacy were large. Literacy ranged from 2% to 32% as of 1921, while the number

of years a district had been connected to a railroad varied anywhere from 0 to 68 across

districts. Due to changes in the definition of literacy across census years, in our baseline

analysis, we use cross-sectional estimates computed separately for each census year. In the

analysis, we also control for geographic variables, namely latitude, longitude, altitude, area

of the district, ruggedness, slope, precipitation, indicator for a river, distance to the coast,

and suitability for specific crops such as cotton, dryland rice, wetland rice, and wheat. Such

a rich set of variables ensures we control for any geographical factors correlated with the

early arrival of railways and literacy.

This simple cross-sectional comparison of literacy in districts with and without railways

is likely to be misleading, however, because the presence of railways was not random. Major

coastal cities such as Calcutta and Bombay were among the first to be connected, as were

densely-populated centers along the Indo-Gangetic plain. Even in the absence of railways,

we may expect such places to have higher literacy because they were initially more urbanised

and developed.

To identify the effects of railways, we use three estimation strategies. First, we use an

instrumental variables strategy based on a spanning tree. Such approaches have been used in

the urban economics literature to assess the effects of Chinese roads (Faber, 2014), African

railroads (Jedwab and Moradi, 2016), and Brazilian highways (Morten and Oliveira, 2016).

In our context, we construct a minimum spanning tree that connects cities that existed

circa 1850 before the construction of railroads. We use distance from this spanning tree to

predict both the placement of the railroad in 1881 and its further expansion. We also include

key geographic variables in the first stage to account for any positive effects of geography

on routes and development that may also contribute to higher literacy. Our identifying

assumption is that conditional on the geographic controls, the distance from the spanning

tree is uncorrelated with factors that may affect literacy other than through railroads. Our

second strategy constructs an instrumental variable based on an early 1852 railway plan

proposed by Major Kennedy, consulting engineer to the Government of India (GOI). The

instrument is distance from the proposed plan. Third and finally, we use a differences-

in-differences strategy comparing changes in literacy within districts before and after they

received railways to districts that never received railways between 1881 and 1921, or were

connected throughout these decades.
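To make the spanning-tree instrument concrete, the following is a minimal sketch and not a description of our actual procedure. It assumes planar Euclidean distances, uses Prim's algorithm as one standard way to build a minimum spanning tree, and runs on made-up coordinates; the names `minimum_spanning_tree`, `distance_to_tree`, `cities` and `centroid` are hypothetical.

```python
import math


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete graph over `points`.

    Returns a list of (i, j) index pairs, one per tree edge.
    """
    in_tree = {0}
    edges = []
    while len(in_tree) < len(points):
        best = None  # (distance, node already in tree, node to add)
        for i in in_tree:
            for j in range(len(points)):
                if j in in_tree:
                    continue
                d = math.dist(points[i], points[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        edges.append((best[1], best[2]))
        in_tree.add(best[2])
    return edges


def distance_to_segment(p, a, b):
    """Shortest Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.dist(p, a)
    # Projection of p onto the line through a and b, clamped to the segment.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def distance_to_tree(p, points, edges):
    """Distance from p to the nearest edge of the spanning tree."""
    return min(distance_to_segment(p, points[i], points[j]) for i, j in edges)


# Hypothetical pre-railway cities and one hypothetical district centroid.
cities = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 5.0)]
tree = minimum_spanning_tree(cities)
centroid = (2.0, 1.0)
instrument = distance_to_tree(centroid, cities, tree)
print(tree, instrument)
```

In this toy layout the tree links the four cities in a chain, and the hypothetical centroid lies one unit from the nearest edge. With real data, geodesic distances or a suitable map projection would be more appropriate than the flat-plane distances assumed here.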

In repeated cross-sections from 1881 to 1921, we find positive and significant effects of

railroad years in both OLS and instrumental variables regressions. Standardised coefficients

suggest effect sizes ranging from 0.1 to 0.36 standard deviations depending on year and

estimation. We also find the effects are larger for female literacy compared to male literacy.


Unlike the cross-sectional results, we find small and insignificant effects of railroad years in

the differences-in-differences estimation. One potential reason could be measurement error

in literacy on account of changing definitions. In future work, we plan to add data on

occupational structure to understand the mechanism driving the positive results.
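For readers less familiar with standardised coefficients, the sketch below shows how a slope is rescaled into standard-deviation units. It is illustrative only: the data are invented, the regression is a bivariate OLS without controls, and the names `ols_slope` and `standardised` are hypothetical and do not correspond to our estimation code.

```python
import statistics


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def standardised(beta, x, y):
    """Effect on y, in standard deviations, of a one-s.d. increase in x."""
    return beta * statistics.stdev(x) / statistics.stdev(y)


# Invented values: years connected to a railroad and literacy (percent).
years_connected = [0, 5, 12, 20, 28, 41]
literacy_pct = [2.1, 3.0, 3.4, 5.2, 4.9, 7.5]

beta = ols_slope(years_connected, literacy_pct)
effect = standardised(beta, years_connected, literacy_pct)
print(round(beta, 3), round(effect, 3))
```

Because the standardised coefficient is unit-free, measuring railroad exposure in decades rather than years would change `beta` but leave `effect` unchanged.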

Our paper relates to three literatures. First, it relates to the large literature in economic

history on railways. Beginning with the seminal work of Fogel (1964), early work on railways

quantified the effect of railways on economic growth. Using the social savings approach, Fogel

(1964) argued that railways played a small role in the 19th century economic success of the

United States. Scholars then extended the methodology to other countries finding large

effects of railways in Latin America for example (Summerhill 2005, Herranz-Loncan 2014).

Recent contributions using better data and more sophisticated techniques rebut the findings

for the US (Donaldson and Hornbeck 2016), while confirming the positive effects of railroads

on integrating markets and increasing incomes in other parts of the world. Apart from the

United States, the literature has focused on trade and the different market integration effects

of railroads.2 Quantitative studies looking at other outcomes are few and far between.3 Our

contribution here is studying the effects on literacy, an important human capital outcome.

Similar to the United States, we also find positive effects of railroads on education in India.

Similar to global economic history, research on Indian railways has studied its effects on

transport costs, price dispersion, wages and agricultural incomes.4 Apart from Andrabi

and Kuehlwein (2010), these papers find large and positive effects of railways on market

integration. In particular, Donaldson’s (2018) innovative approach using new data and

modelling confirms the positive assessment of older studies. Yet, Indian railways have been

caught up in debates on colonisation. Indeed, nationalist critiques argue the financing of

Indian railways delivered excessive returns to British investors at the expense of Indian

taxpayers and hurt economic development (Dubey 1965, Satya 2008). Railways did not

industrialise India in this view because they were built to benefit the Empire. While there

is no doubt that British influence on Indian railways was large, the effects on income were

positive and returns to British investors were not excessive (Bogart and Chaudhary 2019).

Our study of literacy extends this conversation beyond trade thereby offering a more complete

picture of the interaction between railways and other parts of the colonial economy.

Second, our paper relates to a small but growing literature in Indian economic history using

new district or town-level data to study conflict, schooling, population growth, urbanisation

2 In the case of US railroads, scholars have looked at the effects on urbanisation, banking and schooling, among other outcomes. See Atack et al. (2010), Atack, Haines and Margo (2011), Atack et al. (2014) among others.

3 Notable exceptions are Tang (2017) looking at mortality in Japan and Atack, Margo and Perlman (2012) on school attendance in the United States.

4 For example, Hurd (1975), Mukherjee (1980), McAlpin (1975), Collins (1999), Andrabi and Kuehlwein (2010) and Donaldson (2018). See Bogart and Chaudhary (2016) for more details on this literature.


and migration, among other outcomes.5 Third and finally, our paper contributes to a development

literature estimating the effects of infrastructure shocks on schooling. Recent work,

much of it drawing on South Asia, exploits the expansion of road and highway networks in

developing countries (for example, Adukia et al. 2019, Aggarwal 2018 and Mukherjee 2012).

Adukia et al. (2019) find positive effects of roads on middle school enrolment

but null effects on primary school enrolment. Studying a different road construction

program in India, Aggarwal (2018) finds positive enrolment effects on children aged 5 to 14

and negative effects on older children aged 14-20. Roads, similar to railways, can increase

both the returns to schooling and the opportunity cost of attending school. Such mixed

results by age highlight the different mechanisms from infrastructure to schooling.

The rest of the paper is organised as follows: Section 2 provides historical background on

literacy and railways in colonial India. Section 3 describes our data. Estimation strategies

are explained in Section 4. We report the results in Section 5, with preliminary conclusions

in Section 6.

2. Historical Background

2.1. Literacy in colonial India. As British rule spread in India, former indigenous schools

were slowly replaced with a new system of public and private schools. Did this transition

increase literacy? We have no way of knowing for certain because there are few estimates of

literacy before the late 19th century. Some missionary accounts suggest village schools were

common in Bengal in the early 1800s, but offer few details. The British began collecting

data on education after the Crown took control in 1858. Indeed, literacy became a standard

question on the census beginning with the first census of 1872. Using official reports, Chaudhary

(2016) shows that literacy was low, but increasing from 1881 to 1941. Men were at

least six times more literate than women. Upper castes were significantly more literate than

lower castes. There were also big differences by religion, with Jains and Christians leading

the way. Moreover, spatial variation within India was large.

For individuals age 10 and over, male literacy in British India increased from 13% in 1901

to 18% in 1931.6 Female literacy increased from just under 1% to 3% in the same period.

Coastal provinces of Bengal, Bombay and Madras had higher literacy in each decade, with

male literacy averaging 20% in 1931 compared to 11% in the interior for Central Provinces

and United Provinces. Literacy among Brahmans, the priestly upper caste of Hindus, averaged

33% in 1931, similar to Jains (35%) and Christians (28%). In comparison, literacy

was low among marginalized groups, at 2% for the lower castes and just under 1% for

tribal groups, also known as Scheduled Tribes in modern India. Yet, regional differences

5 See Roy (2019) for a discussion on the shift from history to economics in Indian economic history and Chaudhary et al. (2016) for a sampling of recent quantitative work.

6 This discussion draws on Chaudhary (2016).


were remarkable with Brahman literacy at 54% in Madras compared to 16% in the United

Provinces.

What accounts for the disappointing performance? Colonial officials pointed to low demand

for education in a rural country like India, high illiteracy among parents, and cultural

taboos against Hindu and Muslim girls attending school. One could argue demand for basic

literacy may be lower in agricultural societies compared to industrial countries. But, it is

hard to argue that demand for basic literacy was lower in India than in countries at similar levels

of development. For example, Lindert (2004) and Chaudhary (2016) show that primary

school enrollment in colonial India between 1860 and 1920 was far below Brazil, Mexico,

Japan and Chile, a reasonable comparison set based on GDP per capita.

A more first-order explanation was the poor funding of public schooling. A large share of

public funding for rural primary schools came from surcharges on land revenues. But, the

surcharges were fixed and land revenues did not keep pace with the increase in agricultural

incomes. Moreover, education was a small share of the government budget increasing from

1.5% in 1881 to 7.7% in 1931. Indeed, the picture was worse in terms of national income

with public education spending accounting for less than 1% of national income as late as

1931. Public expenditures on education under the Raj were lower than in other developing

countries and British dependent colonies (Chaudhary 2009). As many scholars have argued,

public spending was and is critical to increasing global literacy (Lindert 2004). Against this

background it is no surprise that India made few strides in increasing mass literacy.

2.2. Spread of Railways in colonial India. Unlike schooling, the British were early

promoters of railways in India, building an extensive rail network.7 The first passenger

line opened in 1853, connecting Bombay to Thane, a distance of 20 miles. Prompted by

mercantile interests in Britain, the early lines connected the ports of Bombay, Calcutta

and Madras to the interior. Given the few good roads and navigable rivers, British firms

hoped railways would lower the costs of exporting raw cotton from India and of importing

British manufactured goods to new Indian markets (Thorner 1951, 1955). Indeed, the British

believed goods traffic would significantly exceed any passenger traffic. They proved to be

wrong, with passenger traffic accounting for 60% of revenues.

Indian railways were built by British firms with British financing, albeit subsidized by a

guaranteed dividend backed by the GOI. Such firms were the main players up to the 1870s,

when the GOI began to build lines. This was followed by mixed public-private partnerships

in the 1880s. These partnerships were the norm until the 1920s (Sanyal 1930). Route

mileage expanded quickly in the early decades, especially from 1881 to 1901. Total route

miles increased from 9,893 in 1881 to 17,283 in 1891, 25,365 in 1901, 32,839 in 1911 and

7 There is a large literature on Indian railways. Edited volumes by Kerr (2001, 2007) offer an excellent introduction to the main issues, while Sanyal (1930) offers a detailed history of railway development.


then 37,266 in 1921. Mileage in 1901 was more than 2.5 times that in 1881, while mileage in 1921 was roughly

1.5 times as high as in 1901 (Bogart and Chaudhary, 2016). By the early 1900s India had

the fourth largest network in the world, an impressive achievement for a developing country.

Her railway performance is in sharp contrast to education where India was far below most

of the world.
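As a quick check on these growth figures, the ratios can be recomputed from the route-mile totals quoted above. The snippet is a plain arithmetic sketch; the helper name `growth_ratio` is ours for illustration.

```python
# Route miles by census year, as quoted in the text (Bogart and Chaudhary 2016).
route_miles = {1881: 9893, 1891: 17283, 1901: 25365, 1911: 32839, 1921: 37266}


def growth_ratio(miles, start, end):
    """Route miles in `end` relative to route miles in `start`."""
    return miles[end] / miles[start]


ratio_1881_1901 = growth_ratio(route_miles, 1881, 1901)
ratio_1901_1921 = growth_ratio(route_miles, 1901, 1921)

# Miles added in the decade ending in each census year.
years = sorted(route_miles)
decadal_additions = {
    later: route_miles[later] - route_miles[earlier]
    for earlier, later in zip(years, years[1:])
}

print(round(ratio_1881_1901, 2), round(ratio_1901_1921, 2))
print(decadal_additions)
```

The decadal additions show that construction peaked in the decade to 1901 and slowed markedly in the decade to 1921.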

Figures 1-3 show the spread of the network from 1881 to 1921, our period of analysis.

The important ports were connected to the interior before 1881. Many lines criss-crossed

the densely populated Indo-Gangetic plain with fewer interior lines in the Deccan plateau.

Early plans such as the Kennedy plan in 1852 called for lines parallel to the coast in order

to economise on costs. Some were never built because subsequent officials opted for more

expensive but direct routes cutting through mountains (Davidson 1868). We return to the

Kennedy plan below as an instrument for route placement.

Although British firms built the railways, the GOI dictated route placement. What guided

the decisions? Military, commercial and famine concerns were cited as the main drivers in

official correspondence (Hurd 1983). Following the Sikh Wars in the 1840s and the Indian

Mutiny in 1857, the British were cognisant of the need to transport troops and supplies

across the country at low cost. Existing transport routes were of poor quality and slow,

which made it necessary to station troops at multiple locations in the event of an uprising

(Parliamentary Papers 1854). On the commercial side, British merchants lobbied for Indian

railways that would connect the ports to cotton-growing regions in the interior, and from

the eastern and western ports to Delhi in the north. Another consideration was famines.

Following devastating crop failures and famines in the 1870s, the GOI built “protective lines”

in famine-prone regions of the South. Finally, a few small lines were built connecting hill

stations where British officials liked to spend their summers. While not random, the network

of railroads across districts was not uniformly indicative of positive or negative selection.

Rather, a mix of factors affected route placement decisions. To address the endogeneity

of railroads, we use instrumental variables and differences-in-differences approaches that we

describe in Section 4.

3. Data

We construct a new district-level dataset for British India spanning 1881 to 1921. Our data

pulls information from three sources – the decennial census for literacy, the railway reports for

dates on the opening of railways in each district, and then Geographic Information Systems

(GIS) data on geographic controls. We begin with the 1881 census rather than the first

census of 1872 because of inconsistencies with the 1872 census data. We also restrict our

focus to British Indian districts because data on the Princely States are inaccurate in the

early censuses up to 1911. We describe each of our data sources next.


3.1. Measuring Literacy. While data on literacy was collected in each colonial census, its

definition and measurement changed over time. In the 1881 and 1891 censuses, individuals

were classified into literate, those that were learning, and illiterate. Yet, there was no guidance

on how to measure literacy or account for learners apart from using an age threshold.

Learners as a category was dropped in subsequent censuses. A definition for literacy, namely

“the ability to read and write”, was adopted in the 1901 census. But, census administrators

were given no official guidance on measuring literacy. This led to differences across provinces

where some used a more rigorous standard while others enumerated individuals as literate if

they could sign their names. Beginning in 1911, a consistent standard was adopted whereby

an individual was recorded as literate if he or she could read and write a letter. Because

these differences create problems when exploiting changes over time in literacy within districts,

our main results focus on cross-sections estimated separately in each census year.

We also present results using a difference-in-differences estimation that exploits within-district

variation in railroad years. We are more cautious in drawing conclusions based on the

difference-in-differences results.

Table 1 reports summary statistics on the total literacy rate, broken down by gender for

each year.8 We construct the literacy rate as total number of literates divided by total

population. Total literacy doubled from an average of 3.16% in 1881 to 6% in 1921. The low

average notwithstanding, differences by gender and region were large. In 1881, male literacy

averaged almost 6%, 22 times higher than female literacy at 0.27%. By 1921, the difference

had shrunk, with male literacy at 10.46%, only 8 times as high as female literacy at 1.32%.

Both the standard deviation and range highlight the differences across districts.
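The construction of the literacy rate and the gender ratios quoted above can be sketched as follows. The district counts are invented for illustration, the 1881 male rate is rounded to 6% because the text reports it as "almost 6%", and the function names are hypothetical.

```python
def literacy_rate(literates, population):
    """Literates as a percentage of total population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * literates / population


def gender_gap(male_rate, female_rate):
    """How many times higher male literacy is than female literacy."""
    return male_rate / female_rate


# Invented district counts chosen to reproduce the 1881 average of 3.16%.
hypothetical_district = {"literates": 31_600, "population": 1_000_000}
rate = literacy_rate(**hypothetical_district)

# Average rates by gender reported in the text (1881 male rate approximate).
gap_1881 = gender_gap(6.0, 0.27)
gap_1921 = gender_gap(10.46, 1.32)

print(rate, round(gap_1881), round(gap_1921, 1))
```

Note that the denominator is total population, not the population above a given age, so these rates are lower than the age-10-and-over figures cited in Section 2.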

Figures 4-6 show the distribution of total, male and female literacy. While the distribution

of total literacy was skewed in 1881, it became more dispersed by 1921. Less than 1 in 10

people could read and write in over 95% of districts in 1881. By 1921 more than 1 in 10

people were literate in roughly 10% of districts, with a maximum literacy of 32% in Madras

city. Male literacy was more dispersed in each year compared to total literacy (Figure 5).

Indeed, it also increased and became more dispersed from 1881 to 1921. Female literacy

increased from its low base in 1881, yet the distribution remained skewed as late as 1921

(Figure 6). We present maps of total literacy by quintile in Figures 7-9 for 1881, 1901 and

1921.

3.2. Measuring Access to Railways. To estimate the effect of railways on literacy, we

follow Fenske and Kala (2018) to code the opening dates of railway access for each district.

Fenske and Kala (2018), following a procedure similar to Donaldson (2018), construct a polyline

shapefile of the Indian railway network with an opening date for each segment. These

8 The number of districts changes across years because we are missing data on a few districts in some years.


dates are based on the History of Indian Railways Constructed and In Progress (1934 edition).

For each listed railway line, they recorded the opening dates along with the beginning

and end points of each line. We intersect these shape files of railway lines with their opening

dates with a map of modern tehsils (an administrative unit smaller than the district). Using

a GIS mapping of colonial districts to these modern tehsils, we compute the earliest year

each district connected to the railroad.
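The aggregation step can be sketched in a few lines once the spatial intersection has been done. The sketch assumes the GIS work has already produced, for each railway segment, its opening year and the set of tehsils it crosses; the segment list, tehsil codes, district labels and the function `first_connection_year` are all hypothetical.

```python
def first_connection_year(segments, tehsil_to_district):
    """Earliest opening year of any segment crossing each colonial district.

    `segments` is an iterable of (opening_year, tehsils_crossed) pairs and
    `tehsil_to_district` maps modern tehsils to colonial districts.
    Districts never crossed by a segment are absent from the result.
    """
    earliest = {}
    for opening_year, tehsils in segments:
        for tehsil in tehsils:
            district = tehsil_to_district.get(tehsil)
            if district is None:
                continue  # tehsil lies outside the sample of districts
            if district not in earliest or opening_year < earliest[district]:
                earliest[district] = opening_year
    return earliest


# Invented example: three segments, five tehsils, three districts.
segments = [
    (1853, {"T1", "T2"}),
    (1870, {"T2", "T3"}),
    (1895, {"T4"}),
]
tehsil_to_district = {"T1": "A", "T2": "A", "T3": "B", "T4": "B", "T5": "C"}

first_year = first_connection_year(segments, tehsil_to_district)
print(first_year)
```

District C has no entry because no segment crosses its only tehsil, which corresponds to a district that is never connected in the sample period.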

Figure 10 shows a map of the year a district was first connected to a railroad by quintiles.

As noted in Section 2, geography was an important factor in railroad placement. Coastal

districts with important ports were connected to a railway early on as were those in the

Ganga valley. Yet, a few cotton growing interior districts were connected before 1881 as

were districts closer to Afghanistan. Neither group would be considered positively selected

for rail access.

Two common ways to measure railroad access involve using (1) a simple indicator (0/1) for

being connected to a railway or not, or (2) the number of years a district has been connected

to a railroad in each census year. A simple indicator captures the extensive margin, but

misses the difference between a district connected for 2 years versus 20 years. Both would

be coded as a one. Hence, we use the more intensive measure of railway access in the main

results. We find similar results using the extensive 0/1 indicator.
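The two access measures can be written down directly from the year of first connection. This is a sketch under the assumption that a district connected in the census year itself counts as connected with zero railroad years; the function names are hypothetical.

```python
CENSUS_YEARS = (1881, 1891, 1901, 1911, 1921)


def connected(first_year, census_year):
    """Extensive margin: 1 if the district has a railway by the census year."""
    return int(first_year is not None and first_year <= census_year)


def railroad_years(first_year, census_year):
    """Intensive margin: years connected as of the census year (0 if none)."""
    if first_year is None or first_year > census_year:
        return 0
    return census_year - first_year


# Two hypothetical districts that the indicator cannot tell apart in 1881.
for first_year in (1879, 1861):
    print(first_year, connected(first_year, 1881), railroad_years(first_year, 1881))

# Exposure of a district on the first passenger line, by census year.
print([railroad_years(1853, year) for year in CENSUS_YEARS])
```

A district connected in 1853, the year the first passenger line opened, reaches 68 railroad years by 1921, which matches the upper end of the range reported in the introduction.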

Table 1 reports summary statistics on an indicator for railroad access and the number of

years a district was “treated” with railways in each census year. Fifty percent of districts in

British India were connected to a railway by 1881 increasing to 96% by 1921. Indeed, much

of the increase occurred before 1901 when 87% of districts were already connected. Yet, the

number of railroad years better illustrates variation across districts in exposure to railroads.

For example, the number of railroad years averaged 7.4 years in 1881 increasing to 22 years

in 1901 and then 41 years by 1921.

Comparing the railway maps to the literacy maps suggests a mild positive correlation

between railways and literacy. Figures 11 and 12 confirm the positive correlation for 1881

and 1921. Coastal districts were among the first to be connected and also had higher literacy.

But the relationship is not obvious everywhere. Many districts in the United Provinces were

connected early but had lower literacy. Both maps, however, reveal the significant spatial

variation in railway access and literacy, which dominates the temporal variation in the two

variables.

3.3. Geographic Controls. India has a wide range of terrain with the Himalayas in the

north, mountain ranges along the east and west coasts, the Thar desert in Rajasthan, alluvial

plains along the Indus and Ganga river valleys, and the Deccan plateau. Such differences

in topography affected the railroad network because of the difficulty in building railroads


crossing mountains and deserts. To this end, we construct a broad set of geographic variables

to control for positive selection of railroad exposure on account of favourable geography.

In particular, we collect data on the latitude and longitude coordinates of the centroid of

the district. We also control for altitude, ruggedness, slope, precipitation, an indicator for

rivers, distance to the coast, and suitability for specific crops such as cotton, dryland rice,

wetland rice, and wheat, averaged over raster cells within a district. The ruggedness data are

from Nunn and Puga (2012), while the data on precipitation and crop suitability are from

the FAO-GAEZ data portal. Although FAO data measure contemporary crop suitability,

they have been shown to be strongly predictive of historical agricultural patterns (Mayshar

et al. 2015; Michalopoulos et al. 2018). In Appendix Figures 1 to 5 we show maps of these

variables, which suggest they align well with what is already known in the context of colonial

India.

4. Estimation Strategy

We begin by estimating a simple cross-sectional OLS regression of the following form:

(1) $\ln(\text{LiteracyRate}_{dt}) = \beta\,\text{RailroadYears}_{dt} + \gamma' x_{dt} + \varepsilon_{dt}$

We estimate this regression separately for $t \in \{1881, 1891, 1901, 1911, 1921\}$. In this equation,

$\ln(\text{LiteracyRate}_{dt})$ is the log literacy rate in district $d$ in year $t$. We transform the

literacy rate into logs because it is highly skewed.9 $\text{RailroadYears}_{dt}$ is the number of years

district $d$ has had a railroad as of year $t$ (and 0 if unconnected). The vector $x_{dt}$ includes province

fixed effects and GIS controls (latitude, longitude, area, altitude, cotton suitability, dryland

rice suitability, wetland rice suitability, average precipitation, slope, tea suitability, temperature,

wheat suitability, ruggedness). Finally, $\varepsilon_{dt}$ is the error term. We estimate robust

standard errors in the regressions.

Such a regression, however, is unlikely to identify the effects of railroads on literacy. For

example, if more urbanised or developed districts were the first to receive railways, our

estimate of the coefficient on railroad years would be biased upwards because it would conflate

the effect of railways with urbanisation. On the other hand, if famine-prone areas received access early on,

then our estimates would be biased down. Indeed, military strategy also does not provide

clear guidance on the potential selection problem. Railroads from Calcutta to Delhi facili-

tated a quick movement of troops, but were of immense commercial value in transporting

goods. This would suggest positive selection. Yet, railways from Delhi heading northwest to

9 In 1881, we observe one district in the Assam hills with no individuals recorded as being literate. We drop this observation because we estimate log literacy. As a robustness check, we assigned a small positive literacy rate under 0.01% to this district. It did not change the results.


Afghanistan were of less commercial value and would suggest negative selection. To address

such endogeneity concerns, we turn next to three different identification strategies.

4.1. Instrumental Variables: Spanning Tree. A common instrumental variables ap-

proach in the transportation literature involves constructing a minimum spanning tree (Morten

and Oliveira 2016, Faber 2014, Jedwab and Moradi 2016, Fenske and Kala 2018). In our context,

we build a tree that spans the set of Indian cities with populations greater than 10,000

in 1850 using the set of cities recorded in Chandler and Fox (1974). We choose 1850 because

it precedes the construction of railways. Using Prim’s algorithm, we construct the shortest

tree that spans these 97 cities. Figure 13 shows a map of this tree superimposed on the 1881

railway network. After constructing the tree, we compute the distance of the closest part of

the district from the spanning tree. We then use the log of (one plus) distance to this tree

as an instrument for $\text{RailroadYears}_{dt}$.
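
A minimal sketch of Prim's algorithm is given below. It assumes planar Euclidean distances between hypothetical city coordinates, whereas the actual tree would use geographic distances between the 97 cities; the names `prim_mst` and `tree_length` and the coordinates are illustrative only.

```python
import heapq
import math


def prim_mst(points):
    """Return minimum spanning tree edges (i, j, length) via Prim's algorithm.

    `points` is a list of (x, y) tuples; distances are planar Euclidean,
    which is a simplification of true geographic distance.
    """
    n = len(points)
    if n < 2:
        return []
    visited = [False] * n
    visited[0] = True  # grow the tree from the first point
    heap = [(math.dist(points[0], points[j]), 0, j) for j in range(1, n)]
    heapq.heapify(heap)
    edges = []
    while heap and len(edges) < n - 1:
        d, i, j = heapq.heappop(heap)  # cheapest edge leaving the tree
        if visited[j]:
            continue
        visited[j] = True
        edges.append((i, j, d))
        for k in range(n):
            if not visited[k]:
                heapq.heappush(heap, (math.dist(points[j], points[k]), j, k))
    return edges


def tree_length(edges):
    """Total length of the tree."""
    return sum(length for _, _, length in edges)


# Hypothetical coordinates for four cities.
cities = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 5.0)]
tree = prim_mst(cities)
print(tree, tree_length(tree))
```

Because the tree depends only on the locations of cities that existed in 1850, it does not use any information about where railways were actually built.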

4.2. Instrumental Variables: 1852 Kennedy Plan. In our second approach, we con-

struct an instrument using Major J. P. Kennedy’s 1852 proposal for building railways in

India. Major Kennedy was the Consulting Engineer for the GOI from 1851 to 1852. Despite

his short tenure, his detailed proposal of 1852 influenced the development of India’s railway

network. He pushed for building low cost railways that in his view would confer innumerable

benefits. As stated in his words below:

It is not sufficient to be convinced as I am, that the establishment of Railways in

India is an essential preliminary to the attainment of the highest degree of efficiency

of which our military and civil administrations are capable; to the prevention of

local famine, and to the uniform dispersion of food; to any vigour and activity

in manufacture or commerce; to the increased consumption of English goods: to

the power of competing with America in furnishing to England raw cotton and

other important articles: in short, to the growth of everything connected with the

extension of British interests in India as well as with the industry, the wealth, and

the comfort of its vast population (Parliamentary Papers 1854, p.3).

Yet, Major Kennedy was aware of the costs of building railways. So he emphasised lower-

cost routes connecting the ports and interior cities. In particular, his plan called for a network

in “strict harmony with the natural advantages” of the country. Unlike routes that would

cut through the Eastern and Western coastal ranges of India, his plan called for routes that

favoured softer gradients, followed the coast and natural topography. For example, he was

critical of the direct route from Bombay (western port) to Allahabad (city on the Ganga in

the United Provinces) because it would cut through “four unnecessary and fierce ranges of

mountains”. Rather, he proposed an indirect route along the coast with a gentle gradient.

Donaldson (2018) has used portions of the Kennedy plan that were not implemented to

construct placebo lines. In many cases, however, Kennedy’s routes were adopted especially


from Calcutta to Delhi. In other cases, however, more expensive yet direct routes cutting

through the Western Ghats for example were selected (Davidson 1868). Comparing the

Kennedy plan in Figure 14 to the actual railway network highlights the overlap. Since

Kennedy was clear about using geography, in particular gentle gradients, in his proposed

network, we are assuming here that conditional on geographic controls, the 1852 Kennedy

plan is uncorrelated with factors that would affect literacy other than through access to

railways. To construct the instrument, we convert the map of Kennedy’s proposal into a

shapefile. We then calculate the shortest distance of each district from this route. We use

the log of (one plus) distance to the Kennedy plan as an instrument for $\text{RailroadYears}_{dt}$.
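
The distance calculation can be sketched as follows. This is a simplified illustration: it treats a district as a single point and the route as a planar polyline, whereas the instrument described above uses the shortest distance from the district itself to the digitised route. All function names and coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to the segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:  # degenerate segment: a and b coincide
        return math.hypot(px - ax, py - ay)
    # Project p onto the line through a and b, clamped to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def distance_to_route(p, route):
    """Shortest distance from p to a polyline given as a list of vertices."""
    return min(point_segment_distance(p, route[i], route[i + 1])
               for i in range(len(route) - 1))


def log_distance_instrument(p, route):
    """Log of (one plus) distance, so a point on the route maps to zero."""
    return math.log1p(distance_to_route(p, route))


# Hypothetical route and district location.
route = [(0.0, 0.0), (10.0, 0.0)]
print(log_distance_instrument((5.0, 3.0), route))
```

Adding one before taking logs keeps the instrument defined for districts that the proposed route passes through, where the distance is zero.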

4.3. Differences-in-Differences. Our third and final approach exploits the differences in

railway access over time within districts. For district $d$ in year $t \in \{1881, 1891, 1901, 1911, 1921\}$,

we estimate the following difference-in-differences regression:

(2) $\ln(\text{LiteracyRate}_{dt}) = \beta\,\text{RailroadYears}_{dt} + \delta_d + \zeta_t + \zeta_t \times \eta_p + \varepsilon_{dt}$

Here $\ln(\text{LiteracyRate}_{dt})$ and $\text{RailroadYears}_{dt}$ are defined as in equation (1). In addition, we

include district fixed effects captured by $\delta_d$ and year fixed effects captured by $\zeta_t$. District fixed

effects capture time-invariant differences across districts including geography for example.

And, year fixed effects control for temporal patterns that affect all districts in the same

manner, i.e. common national shocks. In addition, we control for year interacted with

province fixed effects. This is to control for changes in the measurement and enumeration

of literacy across census years. Official guidance on such issues was often set at the province

level. The hope is that such flexible controls address measurement error in literacy. We also

cluster the standard errors at the district level to account for serial correlation.
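
The logic of the within-district estimator can be sketched with a small example. The sketch below is a simplification of equation (2): it assumes a balanced panel, removes only district and year means, and omits the province-by-year interactions and the clustered standard errors. The function name `twfe_slope` and the toy panel are hypothetical.

```python
from collections import defaultdict


def twfe_slope(rows):
    """Slope of y on x after removing district and year means.

    `rows` is a list of (district, year, x, y) tuples forming a balanced
    panel. With a balanced panel, double demeaning reproduces the OLS
    coefficient from a regression with district and year fixed effects.
    """
    def group_means(key_index, value_index):
        sums, counts = defaultdict(float), defaultdict(int)
        for r in rows:
            sums[r[key_index]] += r[value_index]
            counts[r[key_index]] += 1
        return {k: sums[k] / counts[k] for k in sums}

    n = len(rows)
    x_by_d, y_by_d = group_means(0, 2), group_means(0, 3)
    x_by_t, y_by_t = group_means(1, 2), group_means(1, 3)
    x_bar = sum(r[2] for r in rows) / n
    y_bar = sum(r[3] for r in rows) / n

    sxy = sxx = 0.0
    for d, t, x, y in rows:
        x_res = x - x_by_d[d] - x_by_t[t] + x_bar
        y_res = y - y_by_d[d] - y_by_t[t] + y_bar
        sxy += x_res * y_res
        sxx += x_res * x_res
    return sxy / sxx


# Toy panel: log literacy = district effect + year effect + 0.02 * railroad years.
district_effect = {"A": 1.0, "B": 0.2}
year_effect = {1881: 0.0, 1891: 0.1, 1901: 0.3}
exposure = {("A", 1881): 10, ("A", 1891): 20, ("A", 1901): 30,
            ("B", 1881): 0, ("B", 1891): 0, ("B", 1901): 6}
panel = [(d, t, x, district_effect[d] + year_effect[t] + 0.02 * x)
         for (d, t), x in exposure.items()]
print(twfe_slope(panel))
```

In the toy panel the district and year effects are absorbed by the demeaning, so the estimator recovers the slope of 0.02 used to generate the data.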

Our key identifying assumption is that the change in railroad years within districts over time

is uncorrelated with the error term, $\varepsilon_{dt}$. Unfortunately, the changing definition and measurement

of literacy up to 1901 is likely to generate measurement error that may bias the

coefficient on railroad years in a non-random manner. We can exploit the variation between

1911 and 1921 when literacy is consistently measured, but then we are left with little variation

in railroad access because 94% of districts in British India are already connected by 1911.

Nonetheless, we present the differences-in-differences results but are cautious in drawing

conclusions.

5. Results

5.1. OLS Results. Table 2 reports OLS estimates of equation (1) as repeated cross-sections

for each census year. Columns (1) to (3) show results for log literacy with no controls in

(1), including province fixed effects but no geographic controls in (2), and including province


fixed effects with geographic controls in (3). In columns (4) and (5) we report the results

separately for male and female literacy. Four patterns stand out. First, the estimates are

positive and significant across specifications. Second, the estimates are larger in specifications

without controls compared to those with controls. This is unsurprising and highlights the

importance of including geographic controls that are likely correlated with railroad years

and literacy. Third, the effects of railroads are larger for female literacy compared to male

in every cross-section. Fourth, the effects of railroad years decrease in magnitude from 1881

to 1921.

Coefficients on railroad years are positive and significant at the 1% level in nearly every year and

specification, for total and gender-specific literacy alike; the one exception is the 1921 estimate

without controls in column (1), which is significant at the 5% level. In terms of magnitude, standardised

β coefficients (the β coefficient in Table 2 multiplied by the standard deviation of railroad

years and divided by the standard deviation of log literacy) range from 0.12 to 0.33.

For example, in column (3) for 1881, a one standard deviation

increase in railroad years translates into a 0.22 standard deviation increase in literacy. By

1921 the standardised magnitude decreases by almost 50% to 0.12. Looking across columns

(1) to (3) the standardised magnitudes decline as we would expect going from 0.38 in column

(1) to 0.22 in column (3) for 1881. We observe similar declines up to the 1900s.
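
The arithmetic behind these magnitudes can be sketched as follows. The coefficient (0.0142) and the standard deviation of railroad years in 1881 (9.05) are taken from Tables 2 and 1. The standard deviation of log literacy is not reported in the tables, so the value of 0.58 used below is a placeholder chosen only so that the result lands near the reported 0.22; it should not be read as a statistic from the data. The function names are hypothetical.

```python
import math


def standardised_beta(beta, sd_x, sd_y):
    """Effect of a one-SD change in x, expressed in SDs of y."""
    return beta * sd_x / sd_y


def percent_effect(beta, delta_x=1.0):
    """Percent change in the level of y implied by a log-linear coefficient."""
    return (math.exp(beta * delta_x) - 1.0) * 100.0


beta_1881 = 0.0142        # Table 2, column (3), 1881
sd_railroad_years = 9.05  # Table 1, Railroad Years 1881
sd_log_literacy = 0.58    # placeholder assumption, not reported in the tables

print(standardised_beta(beta_1881, sd_railroad_years, sd_log_literacy))
print(percent_effect(beta_1881))
```

Because the outcome is in logs, the raw coefficient also has a direct reading: 0.0142 implies that one additional year of railroad access is associated with a literacy rate that is about 1.4% higher in proportional terms.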

Railroad years have larger positive effects on female literacy compared to male literacy, a

pattern that is consistent over time. In 1881, the standardised coefficient on male literacy is

0.2 compared to 0.33 for female literacy. The effects of railroads are, thus, 1.7 times higher

for females compared to males. By 1921 the effects of railroads are 1.4 times higher for

females compared to males. Finally, the effect sizes across outcomes decline between 1881

and 1921. Since a majority of districts are connected to a railroad by 1921, additional years

of access generate a smaller effect on literacy as compared to going from no access to access.

5.2. Instrumental Variables Results. Tables 3 and 4 report second stage instrumental

variables results for the spanning tree and Kennedy plan instruments, respectively. Columns

(1)-(5) correspond to the same outcomes and controls as in Table 2. First, the Kleibergen-Paap

F-statistic (KPF) is large across all specifications in Tables 3 and 4. Appendix Table

1 shows the first stage results for each instrument and year. Both the spanning tree and

Kennedy plan, thus, do a good job of predicting railroad years in each cross-section. Similar

to Table 2, the coefficients on railroad years are positive and significant in all specifications.

Yet, the magnitudes of the effects and associated patterns are different from Table 2.

First, the IV estimates on railroads are larger for both total and gender-specific literacy.

In terms of standardised coefficients, the effects of railroad years in 1881 on total literacy

range from 0.35 for the spanning tree IV (Table 3) to 0.36 for the Kennedy plan IV (Table

4) compared to 0.22 in Table 2. One explanation could be measurement error in the treatment

variable, namely railroads in our context, that would bias the OLS estimate towards


zero. Although our measure of railroad years uses official reports and fairly accurate GIS

mapping techniques, measurement error could still be a concern since years of exposure is

a proxy for the degree to which railroads “treated” districts. Another possible explanation

is negative selection of districts that received railroads; if, for example, cotton-producing

or famine-prone districts had lower literacy rates. Further, the difference between the two

could be due to the OLS estimate corresponding to an average treatment effect of railroad

years. In contrast, the IV estimate corresponds to a local average treatment effect (LATE),

i.e. the effect of increasing railroad years for those districts that gained access to railroads

(treatment) because of their proximity to the spanning tree or 1852 Kennedy plan. For

example, more initially isolated districts only incidentally connected because they are on a

direct line between other major centres may have benefitted more.

Second, the IV estimates on female literacy are again higher in magnitude ranging from

0.54 in 1881 with the spanning tree IV to 0.43 with the Kennedy plan IV. In contrast,

the standardised coefficients for male literacy are similar in 1881 at 0.34 across both IVs,

although the Kennedy plan generates larger effect sizes of 0.39 for male literacy in 1921

compared to 0.24 with the spanning tree. Although the estimates on railroad years for male

and female literacy change across years, the female effect sizes are uniformly larger. Third,

we do not observe consistent declines in the magnitude of the effects over time. If anything,

the standardised coefficient on total literacy increases in the Kennedy plan IV from 0.36 in

1881 to 0.46 in 1921.

Are these effects big or small? To answer this question, we benchmark against two studies.

First, we compare our estimates to Atack, Margo and Perlman (2012), who estimate the

effect of railway access on individual school enrolment for the United States. Combining

individual data with a difference-in-differences set-up, their estimates suggest increasing rail

access across US counties in the 1850s predicts 56% of the increase in mean school enrolment

between 1850 and 1860 (p. 16). We find similar, albeit smaller, effects if we undertake their

exercise. If we use the coefficient on railroad years in Table 3, column (3) for the spanning

tree IV, then the change in railroad years between 1881 and 1891 of 6.12 years predicts

an increase of 0.29 log points. This accounts for 49% of the observed increase in literacy

between 1881 and 1891.

Second, we compare the effects of railroad years to public expenditures on rural primary

education in India. Chaudhary (2010) estimates the effects of 1911 rural primary school ex-

penditures on 1921 literacy using a cross-sectional IV approach. Her standardised coefficients

on total literacy at 0.16 standard deviations are smaller than the standardised coefficients

on 1921 literacy using the spanning tree IV (Table 3, column (3) for 1921) at 0.54. Unlike

railways, she finds insignificant effects of public spending on female literacy. This suggests


that supply side shocks were less effective at increasing literacy in colonial India compared

to infrastructure shocks.

5.3. Differences-in-differences Results. Table 5 reports the difference-in-differences re-

sults exploiting the change in railroad years within districts over time. Unlike the OLS and

IV estimates, we find no significant effects of railroad years on any literacy outcome. We split the

panel into pre-1901, namely 1881, 1891, 1901, and post-1901, namely 1911 and 1921. We

split the panel because the 1911-1921 census years include a consistent measure of literacy.

Similar to the complete panel, we find no significant effects of railroad years in these split

panels. It is difficult to interpret and reconcile these differences-in-differences results with

those in the earlier sections. In the differences-in-differences we are exploiting within-district

variation that is perhaps less endogenous if unobservable time-invariant characteristics of districts

drove route choice. Yet, there are changes in the definition of literacy that introduce measurement error

in the outcome over time. It is unclear if the null effects we observe in Table 5 are due to

changes in the measurement of literacy, or true null effects.

Another possible explanation here is that what really mattered was getting connected

early, i.e. before 1881. Once these districts were connected, later districts could no longer

benefit in the same manner. This could happen, for example, if parents sent their children to be educated in

schools in the early-connected districts, or if they migrated there. We plan to explore these

issues among others in more detail in later versions of this paper.

6. Conclusions

We study the effects of railways on literacy with district-level data on colonial India

for 1881 to 1921. Using cross-sectional variation in railway access, our preliminary results

suggest positive and significant effects of railroad exposure on total literacy. Our results are

robust to including a rich set of geographic variables and two instrumental variable strategies

to address the potential endogeneity of railroads. Although we find large and positive effects

of railways in repeated cross-sections from 1881 to 1921, we find small and insignificant

effects in a difference-in-differences estimation in which we exploit variation within districts

in railroad years controlling for district and year fixed effects. In later versions of this paper,

we hope to explore the mechanisms underlying these results.

References

(1) Adukia, Anjali, Sam Asher, Paul Novosad. (2019). “Educational Investment Response

to Economic Opportunity: Evidence from Indian Road Construction.” Working

paper: http://www.dartmouth.edu/~novosad/adukia-asher-novosad-dise-roads.pdf

(2) Andrabi, Tahir and Michael Kuehlwein. (2010). “Railways and Price Convergence

in British India.” Journal of Economic History 70(2): 351-377.


(3) Atack Jeremy, F. Bateman, M. R. Haines, and Robert A. Margo. (2010). “Did Rail-

roads Induce or Follow Economic Growth?: Urbanization and Population Growth in

the American Midwest, 1850-1860.” Social Science History 34(2):171-197.

(4) Atack Jeremy, M. Haines, and Robert A. Margo. (2011). “Railroads and the Rise of

the Factory: Evidence for the United States, 1850-1870.” In Rhode, P., Rosenbloom,

J., and Weiman, D., editors, Economic Evolution and Revolutions in Historical Time,

pp.162-179. Stanford University Press, Palo Alto, CA.

(5) Atack, Jeremy, Matt Jaremski and Peter Rousseau. (2014). “American Banking and

the Transportation Revolution Before the Civil War.” Journal of Economic History

74(4): 943-86.

(6) Atack Jeremy, Robert A. Margo and Elisabeth R. Perlman. (2012). “The Impact

of Railroads on School Enrollment in Nineteenth Century America.” Working paper:

http://elisabethperlman.net/papers/PerlmanMargoAtack SchoolRR Draft8.pdf

(7) Bogart, Dan and Latika Chaudhary. (2016). “Railways in Colonial India: An Eco-

nomic Achievement?” In A New Economic History of India, edited by Latika Chaud-

hary, Bishnupriya Gupta, Tirthankar Roy and Anand V. Swamy, 140-160. New York:

Routledge.

(8) Chaudhary, Latika. (2016). “Caste, Colonialism and Schooling: Education in British

India.” In A New Economic History of India, edited by Latika Chaudhary, Bish-

nupriya Gupta, Tirthankar Roy and Anand V. Swamy, 161-178. New York: Rout-

ledge.

(9) Davidson, Edward. (1868). The Railways of India: With an Account of Their Rise,

Progress and Construction. London: E. and F. N. Spon

(10) Donaldson, Dave and Richard Hornbeck. (2016). “Railroads and American Eco-

nomic Growth: A “Market Access” Approach.” The Quarterly Journal of Economics

131(2): 799-858.

(11) Donaldson, Dave. (2018). “Railroads of the Raj: Estimating the Impact of Trans-

portation Infrastructure.” American Economic Review 108 (4-5): 899-934.

(12) Faber, B. (2014). “Trade integration, market size, and industrialization: evidence

from China’s National Trunk Highway System.” Review of Economic Studies, 81(3):1046–

1070.

(13) Fogel, R. (1964). Railroads and American Economic Growth: Essays in Econometric

History. Baltimore: Johns Hopkins Press.

(14) Herranz-Loncan, A. (2014). “Transport Technology and Economic Expansion: the

Growth Contribution of Railways in Latin America before 1914.” Revista de Historia

Economica - Journal of Iberian and Latin American Economic History 32(1): 13-45.


(15) Hurd II, John. (1975). “Railways and the Expansion of Markets in India,” Explo-

rations in Economic History 12(3): 263-88.

(16) Jedwab, R. and Moradi, A. (2016). “The permanent effects of transportation revolu-

tions in poor countries: evidence from Africa.” Review of Economics and Statistics,

98(2):268–284.

(17) Kerr, Ian J. ed. (2001). Railways in Modern India. Oxford University Press, New

Delhi.

(18) Kerr, Ian J. ed. (2007). 27 Down: New Departures in Indian Railway Studies. Orient

Longman, Hyderabad.

(19) Kumar, Dharma. (1983). “The Fiscal System” In The Cambridge Economic History

of India, Volume 2: c. 1757-c. 1970, edited by Dharma Kumar, 905-944. Cambridge

University Press: Cambridge.

(20) Mayshar, J., Moav, O., Neeman, Z. and Pascali, L. (2015). “Cereals, Appropriability

and Hierarchy.” Barcelona Graduate School of Economics Working Paper 842.

(21) McAlpin, Michelle Burge. (1975). “The Effects of Markets on Rural Income Dis-

tribution in Nineteenth Century India.” Explorations in Economic History 12 (3):

289-302.

(22) Michalopoulos, S., Putterman, L., and Weil, D. N. (2018). The influence of ancestral

lifeways on individual economic outcomes in Sub-Saharan Africa. Forthcoming in the

Journal of the European Economic Association.

(23) Morten, M. and Oliveira, J. (2016). “The Effects of Roads on Trade and Migration:

Evidence from a Planned Capital City.” NBER Working Paper No. 22158.

(24) Parliamentary Papers. (1854) 131. Railways (India). Copy of memorandum by

Major Kennedy, on the question of a general system of railways for India, referred

to in the minute by the Governor-General in Council of 20 April 1853, relative to

railway undertakings in that country. Vol.XLVIII, 32 pp.

(25) Sanyal, Nalinaksha. (1930). Development of Indian Railways. University of Calcutta,

Calcutta.

(26) Summerhill, W. R. (2005). “Big Social Savings in a Small Laggard Economy: Railroad-

Led Growth in Brazil.” Journal of Economic History 65(1): 72-102.

(27) Tang, John. (2017). “The Engine and the Reaper: Industrialization and Mortality

in Late Nineteenth Century Japan.” Journal of Health Economics, 56 (December):

145-162.

(28) Thorner, Daniel. (1951). “Great Britain and the Development of India’s Railways.”

Journal of Economic History 11(4): 389-402.

(29) Thorner, Daniel. (1955). “The Pattern of Railway Development in India.” Far

Eastern Quarterly, XIV: 201-6.


Appendix A. Tables

Table 1. Summary Statistics

                           Mean      SD        Min      Max       N

Literacy 1881              3.16%     2.45%     0.00%    17.66%    195
Literacy 1891              4.20%     3.57%     0.60%    35.23%    196
Literacy 1901              4.82%     3.32%     0.85%    24.82%    201
Literacy 1911              5.19%     3.26%     0.86%    27.91%    193
Literacy 1921              6.03%     3.68%     1.59%    32.01%    194

Male Literacy 1881         5.84%     4.04%     0.00%    30.52%    195
Male Literacy 1891         7.58%     4.83%     1.10%    35.23%    195
Male Literacy 1901         8.76%     5.22%     1.46%    35.99%    201
Male Literacy 1911         9.30%     5.28%     1.65%    42.13%    193
Male Literacy 1921         10.46%    5.75%     2.70%    45.32%    194

Female Literacy 1881       0.27%     0.72%     0.00%    6.33%     195
Female Literacy 1891       0.40%     1.00%     0.04%    8.73%     194
Female Literacy 1901       0.66%     1.42%     0.02%    11.49%    201
Female Literacy 1911       0.86%     1.45%     0.07%    13.10%    193
Female Literacy 1921       1.32%     1.90%     0.17%    17.37%    194

Railroad Indicator 1881    51.24%    50.11%    0        1         201
Railroad Indicator 1891    72.28%    44.87%    0        1         202
Railroad Indicator 1901    87.06%    33.64%    0        1         201
Railroad Indicator 1911    94.30%    23.24%    0        1         193
Railroad Indicator 1921    95.88%    19.94%    0        1         194

Railroad Years 1881        7.38      9.05      0        28        201
Railroad Years 1891        13.57     12.54     0        38        202
Railroad Years 1901        21.77     15.17     0        48        201
Railroad Years 1911        30.99     16.41     0        58        193
Railroad Years 1921        40.45     17.37     0        68        194

Note: See text for details on sources.


Table 2. OLS, Repeated Cross Sections

                    (1)          (2)          (3)          (4)          (5)
                    Ln (Literacy Rate)                     Male         Female

Year 1881
Railroad Years      0.0247***    0.0160***    0.0142***    0.0131***    0.0392***
                    (0.00464)    (0.00361)    (0.00298)    (0.00278)    (0.00637)
Observations        194          194          193          193          192

Year 1891
Railroad Years      0.0121***    0.0094***    0.0102***    0.0089***    0.0236***
                    (0.0034)     (0.0027)     (0.0025)     (0.0023)     (0.0047)
Observations        196          196          195          194          193

Year 1901
Railroad Years      0.0093***    0.0083***    0.0088***    0.0075***    0.0231***
                    (0.0028)     (0.0024)     (0.0020)     (0.0019)     (0.0040)
Observations        201          201          201          201          201

Year 1911
Railroad Years      0.0071***    0.0068***    0.0079***    0.0069***    0.0169***
                    (0.0025)     (0.0020)     (0.0016)     (0.0015)     (0.0029)
Observations        193          193          193          193          193

Year 1921
Railroad Years      0.0053**     0.0058***    0.0068***    0.0058***    0.0136***
                    (0.0024)     (0.0021)     (0.0017)     (0.0017)     (0.0025)
Observations        194          194          194          194          194

Controls            No           No           GIS          GIS          GIS
FE                  No           Province     Province     Province     Province

Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. GIS controls include latitude, longitude, altitude, ruggedness, precipitation, distance to the coast, distance to a river and suitability for specific crops such as cotton, dryland rice, wetland rice, and wheat.


Table 3. Instrument: Spanning Tree, Repeated Cross Sections

                    (1)          (2)          (3)          (4)          (5)
                    Ln (Literacy Rate)                     Male         Female

Year 1881
Railroad Years      0.0412***    0.0233***    0.0229***    0.0217***    0.0651***
                    (0.0093)     (0.0074)     (0.0070)     (0.0068)     (0.0152)
Observations        193          193          193          193          192
KPF                 58.99        46.59        34.38        34.38        32.82

Year 1891
Railroad Years      0.0237***    0.0212***    0.0248***    0.0210***    0.0431***
                    (0.0066)     (0.0082)     (0.0084)     (0.0074)     (0.0111)
Observations        195          195          195          194          193
KPF                 56.08        41.74        32.41        32.58        31.44

Year 1901
Railroad Years      0.0163***    0.0109*      0.0132**     0.0112**     0.0336***
                    (0.0057)     (0.0057)     (0.0052)     (0.0051)     (0.0103)
Observations        201          201          201          201          201
KPF                 47.70        33.38        22.38        22.38        22.38

Year 1911
Railroad Years      0.0157***    0.0124**     0.0188***    0.0170***    0.0345***
                    (0.0056)     (0.0053)     (0.0053)     (0.0051)     (0.0090)
Observations        193          193          193          193          193
KPF                 39.75        27.38        17.08        17.08        17.08

Year 1921
Railroad Years      0.0122**     0.0108**     0.0157***    0.0131**     0.0343***
                    (0.0054)     (0.0053)     (0.0055)     (0.0053)     (0.0092)
Observations        194          194          194          194          194
KPF                 37.52        25.29        15.78        15.78        15.78

Controls            No           No           GIS          GIS          GIS
FE                  No           Province     Province     Province     Province

Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. GIS controls include latitude, longitude, altitude, ruggedness, precipitation, distance to the coast, distance to a river and suitability for specific crops such as cotton, dryland rice, wetland rice, and wheat.


Table 4. Instrument: Kennedy Plan, Repeated Cross Sections

                    (1)          (2)          (3)          (4)          (5)
                    Ln (Literacy Rate)                     Male         Female

Year 1881
Railroad Years      0.0454***    0.0212***    0.0239***    0.0221***    0.0514***
                    (0.0108)     (0.0074)     (0.0072)     (0.0070)     (0.0153)
Observations        193          193          193          193          192
KPF                 37.85        33.70        29.99        29.99        29.48

Year 1891
Railroad Years      0.0335***    0.0127*      0.0159**     0.0141**     0.0419***
                    (0.0090)     (0.0073)     (0.0068)     (0.0065)     (0.0134)
Observations        195          195          195          194          193
KPF                 26.74        25.78        22.18        22.29        22.61

Year 1901
Railroad Years      0.0296***    0.0152**     0.0235***    0.0208***    0.0472***
                    (0.0083)     (0.0065)     (0.0072)     (0.0068)     (0.0138)
Observations        201          201          201          201          201
KPF                 23.83        24.39        18.77        18.77        18.77

Year 1911
Railroad Years      0.0341***    0.0156**     0.0236***    0.0204***    0.0426***
                    (0.0097)     (0.0061)     (0.0071)     (0.0064)     (0.0129)
Observations        193          193          193          193          193
KPF                 17.79        21.01        15.71        15.71        15.71

Year 1921
Railroad Years      0.0345***    0.0144**     0.0256***    0.0215***    0.0492***
                    (0.0105)     (0.0061)     (0.0083)     (0.0075)     (0.0148)
Observations        194          194          194          194          194
KPF                 15.25        18.12        12.87        12.87        12.87

Controls            No           No           GIS          GIS          GIS
FE                  No           Province     Province     Province     Province

Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. GIS controls include latitude, longitude, altitude, ruggedness, precipitation, distance to the coast, distance to a river and suitability for specific crops such as cotton, dryland rice, wetland rice, and wheat.

Table 5. Difference-in-Differences

                    (1)               (2)               (3)
                    Literacy Rate     Male Literacy     Female Literacy

Railroad Years      -0.00079          -0.00162          -0.00629
                    (0.00255)         (0.00211)         (0.00400)
Observations        978               977               975

1881, 1891, 1901
Railroad Years      -0.00344          -0.00401          -0.00768
                    (0.00266)         (0.00250)         (0.00742)
Observations        591               590               588

1911, 1921
Railroad Years      0.00052           0.00091           -0.00669
                    (0.00829)         (0.00804)         (0.01789)
Observations        387               387               387

Note: Robust standard errors clustered at the district level in parentheses. *** p<0.01, ** p<0.05, * p<0.1. All the regressions include district FE, year FE, and year FE interacted with province FE.
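
The fixed-effects structure described in the note to Table 5 can be illustrated with a minimal within-estimator sketch. This is not the authors' code: the panel is synthetic and noise-free, the district names, connection years and effect sizes are made up, and `x` is only a stand-in for an exposure variable such as Railroad Years. The sketch assumes a balanced panel in which each district belongs to one province, and it computes a point estimate only, without the clustered standard errors.

```python
from collections import defaultdict

YEARS = [1881, 1891, 1901, 1911, 1921]

def group_means(rows, key, var):
    """Mean of `var` within each group defined by the function `key`."""
    total, count = defaultdict(float), defaultdict(int)
    for r in rows:
        total[key(r)] += r[var]
        count[key(r)] += 1
    return {k: total[k] / count[k] for k in total}

def sweep_out_fe(rows, var):
    """Remove district FE and province-by-year FE from `var`.

    Valid for a balanced panel in which each district stays in one province.
    Plain year FE are absorbed by the province-by-year effects.
    """
    d = group_means(rows, lambda r: r["district"], var)
    pt = group_means(rows, lambda r: (r["province"], r["year"]), var)
    p = group_means(rows, lambda r: r["province"], var)
    return [r[var] - d[r["district"]] - pt[(r["province"], r["year"])]
            + p[r["province"]] for r in rows]

def fe_slope(rows):
    """OLS slope of y on x after sweeping out the fixed effects."""
    y, x = sweep_out_fe(rows, "y"), sweep_out_fe(rows, "x")
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def make_panel(beta):
    """Synthetic, noise-free panel; every number here is hypothetical."""
    districts = {"d1": ("A", 1870), "d2": ("A", 1895),
                 "d3": ("B", 1860), "d4": ("B", 1905)}
    level = {"d1": 0.10, "d2": 0.04, "d3": 0.07, "d4": 0.02}
    rows = []
    for name, (prov, first_rail) in districts.items():
        for year in YEARS:
            x = max(0, year - first_rail)  # years since connection
            shock = (0.001 if prov == "A" else 0.002) * (year - 1881)
            rows.append({"district": name, "province": prov, "year": year,
                         "x": x, "y": level[name] + shock + beta * x})
    return rows

panel = make_panel(beta=0.003)
print(round(fe_slope(panel), 6))
```

Because the synthetic outcome contains only district effects, province-by-year shocks and the slope term, the within estimator recovers the chosen slope; identification comes solely from districts in the same province being connected at different dates.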

Appendix Table 1. First Stage Results, Railroad Years

                                 (1)          (2)          (3)          (4)          (5)
                                 1881         1891         1901         1911         1921

Spanning Tree IV
Ln (Distance to Spanning Tree)   -0.7199***   -0.9098***   -0.9072***   -0.8893***   -0.9144***
                                 (0.1228)     (0.1598)     (0.1918)     (0.2152)     (0.2302)
KPF                              34.38        32.41        22.38        17.08        15.78

1852 Kennedy Plan IV
Ln (Distance to Kennedy Plan)    -0.7209***   -0.8106***   -0.8394***   -0.8484***   -0.8196***
                                 (0.1316)     (0.1721)     (0.1938)     (0.2140)     (0.2284)
KPF                              29.99        22.18        18.77        15.71        12.87

Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1. The regressions control for province FE and GIS controls, namely latitude, longitude, altitude, ruggedness, precipitation, distance to the coast, distance to a river, and suitability for specific crops such as cotton, dry-land rice, wetland rice, and wheat.
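
With a single excluded instrument, a heteroskedasticity-robust first-stage F statistic coincides with the squared robust t-ratio on that instrument, up to small-sample adjustments. Assuming KPF denotes the Kleibergen-Paap F statistic, the sketch below checks that relationship against the numbers in Appendix Table 1. The helper names `squared_t` and `max_gap` are hypothetical, and small gaps are expected because the table reports rounded coefficients and standard errors.

```python
# (coefficient, robust SE, reported KPF), copied from Appendix Table 1.
FIRST_STAGE = {
    "Spanning Tree IV": [
        (-0.7199, 0.1228, 34.38), (-0.9098, 0.1598, 32.41),
        (-0.9072, 0.1918, 22.38), (-0.8893, 0.2152, 17.08),
        (-0.9144, 0.2302, 15.78),
    ],
    "1852 Kennedy Plan IV": [
        (-0.7209, 0.1316, 29.99), (-0.8106, 0.1721, 22.18),
        (-0.8394, 0.1938, 18.77), (-0.8484, 0.2140, 15.71),
        (-0.8196, 0.2284, 12.87),
    ],
}
YEARS = [1881, 1891, 1901, 1911, 1921]

def squared_t(coef, se):
    """Squared t-ratio of a single excluded instrument."""
    return (coef / se) ** 2

def max_gap(table):
    """Largest absolute gap between the squared t-ratio and the reported KPF."""
    return max(abs(squared_t(c, s) - f)
               for rows in table.values() for c, s, f in rows)

for name, rows in FIRST_STAGE.items():
    for year, (coef, se, kpf) in zip(YEARS, rows):
        print(f"{name} {year}: t^2 = {squared_t(coef, se):.2f}, reported KPF = {kpf:.2f}")

print(f"Largest gap: {max_gap(FIRST_STAGE):.3f}")
```

The Kennedy Plan KPF values also match those reported in column (3) of Table 4, which uses the same province FE and GIS controls.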

Appendix B. Figures

Figure 1. Rail Network as of 1881

Figure 2. Rail Network as of 1901

Figure 3. Rail Network as of 1921

Figure 4. Distribution of Total Literacy

Figure 5. Distribution of Male Literacy

Figure 6. Distribution of Female Literacy

Figure 7. Map of Total Literacy, 1881, Quintiles

Figure 8. Map of Total Literacy, 1901, Quintiles

Figure 9. Map of Total Literacy, 1921, Quintiles

Figure 10. Map of Year First Connected to Railroad

Figure 11. Two-way scatter of railroad years and literacy, 1881

Figure 12. Two-way scatter of railroad years and literacy, 1921

Figure 13. Map of Spanning Tree

Figure 14. Map of Kennedy Plan