Essays in Labour Economics by Jean-William P. Laliberté. A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy, Graduate Department of Economics, University of Toronto. © Copyright 2018 by Jean-William P. Laliberté


Essays in Labour Economics

by

Jean-William P. Laliberté

A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Economics

University of Toronto

© Copyright 2018 by Jean-William P. Laliberté


Abstract

Essays in Labour Economics

Jean-William P. Laliberté
Doctor of Philosophy

Graduate Department of Economics
University of Toronto

2018

This thesis uses novel data sets to examine the process of human capital accumulation at different points in the life cycle.

Chapter 1 decomposes total childhood exposure effects – the causal effect of growing up in a better area – into separate school and neighborhood components. To do so, it brings together two research designs. First, I implement a spatial regression-discontinuity design based on institutional rules that assign different default schools to students of different linguistic backgrounds to estimate school effects. Second, I study students who move across neighborhoods in Montreal during childhood to estimate total exposure effects by exploiting variation in the timing of moves. I focus on measures of long-term educational attainment outcomes such as university enrollment. I find that total exposure effects are large, and that between 50 and 70% of the long-term benefits of moving to a better area are actually due to access to better schools rather than to the neighborhoods themselves.

In chapter 2, joint with Graham Beattie and Philip Oreopoulos, we introduce a novel method for collecting a comprehensive set of non-academic characteristics to explore which measures best predict the wide variance in first-year college performance unaccounted for by past grades. Students whose first-year college average is far below expectations (divers) have a high propensity for procrastination and are considerably less conscientious than their peers. Divers are more likely to express superficial goals, hoping to ‘get rich’ quickly. In contrast, students who exceed expectations (thrivers) express more philanthropic goals, are purpose-driven, and are willing to study more hours per week to obtain the higher GPA they expect.

Chapter 3 estimates the effect of linguistic enclaves on language skills. Using rich longitudinal data, I find that enclave size significantly impedes language acquisition, although the effect is smaller than cross-sectional models suggest. An unusually rich set of variables is used to generate bounds on the effect of enclaves, and a complementary instrumental variable approach confirms the robustness of the results. Enclaves are unrelated to formal language course take-up rates, indicating that they affect language learning via social interactions among friends and colleagues rather than through formal education.


Acknowledgements

I am forever indebted to my advisors Philip Oreopoulos, Kory Kroft and Natalie Bau. Phil’s passion for investigating truly policy-relevant questions and his dedication to helping others are highly contagious. My own views on how to conduct rigorous and impactful research have been strongly influenced by the many discussions we’ve had in his office. Without a doubt, deciding to work with Phil was the single best decision of my doctoral studies. The quality of the work presented in this thesis would have been much, much weaker had it not been for Kory’s insistence on always pushing harder, on leaving no stone unturned. His unwavering support, from the very beginning of my first research project at the University of Toronto until the final moments of my life as a graduate student, is greatly appreciated. Natalie was the perfect addition to this dream team. She is certainly one of the smartest and most creative thinkers I have met, and my research has benefited immensely from her detailed comments and amazing ideas.

I would also like to thank all the other faculty members who have taken some time to read my drafts, to attend my seminar presentations, or to provide me with their feedback on my ideas, notably Michael Baker, Dwayne Benjamin, Gustavo Bobonis, Robert McMillan, Ismael Mourifié, Michel Serafinelli and Aloysius Siow. Also thanks to my co-authors at other institutions, in particular Matt Notowidigdo for his inspiring optimism, and to everyone at the University of Calgary who has made me feel at home right from the start.

Importantly, many thanks to the most amazing group of peers I could have ever hoped for, in particular Nicolas Gendron-Carrier, Juan Morales, Mathieu Marcoux, Scott Orr, Maripier Isabelle, Mike Gilraine, Kevin Devereux and Marc-Antoine Schmidt. The content of these pages is the product of many of our discussions. Your solidarity in the toughest times was priceless. One day, eventually, we will publish the Reagle Beagle Journal of Economics.

A huge thank you to my parents, Michel and Marie-Josée, for your love, but also for all the decisions and small kindnesses that, over time, have made me who I am today. Thank you to my siblings, Samuel, Frédéric and Sarah-Jeanne, for your support, which I have always felt despite the kilometers between us. And of course, thank you infinitely to my wife, Brandy. It goes without saying that this thesis would never have been completed without your unconditional support, your limitless patience with my incessant requests to read and reread my work, your sincere encouragement in moments of frustration, and all your smiles waiting for me at home every day, as much to celebrate the successes as to help me keep my head high. I love you binnie.


Contents

1 Long-term Contextual Effects in Education
  1.1 Introduction
  1.2 Data and Background
    1.2.1 Quebec’s Education System
    1.2.2 Data
    1.2.3 Descriptive and Summary Statistics
  1.3 Conceptual Framework
  1.4 Empirical Roadmap
    1.4.1 Schools and neighborhoods: Measurement
    1.4.2 Effect of attending better schools
    1.4.3 Total exposure effects
    1.4.4 Decomposing exposure effects
  1.5 Results
    1.5.1 School effects
    1.5.2 Total exposure effects
    1.5.3 Schools or neighborhoods?
  1.6 Robustness
    1.6.1 Robustness of Regression-discontinuity estimates
    1.6.2 Robustness of Movers estimates
      1.6.2.1 Robustness: Time-invariant confounds
      1.6.2.2 Robustness: Time-varying characteristics
      1.6.2.3 Heterogeneity
    1.6.3 Robustness of Decomposition
  1.7 Conclusion
  1.8 Tables and Figures
  1.9 Appendix Tables and Figures
  1.10 Data Appendix
  1.11 Mathematical Appendix

2 Using Non-Academic Measures to Predict College Success and Failure
  2.1 Introduction
  2.2 Background
  2.3 Data
  2.4 Methodology
    2.4.1 Defining outliers
    2.4.2 Differences in quantitative non-academic measures
    2.4.3 Text analysis
  2.5 Results
    2.5.1 Predicting college grades using past academic achievement
    2.5.2 Predicting Student Outliers with Non-Academic Outcomes
    2.5.3 Text Analysis of Student Outliers
  2.6 Summary Measures of Non-Academic Characteristics
  2.7 Conclusion
  2.8 Tables and Figures
  2.9 Appendix Tables and Figures

3 Language Skill Acquisition in Immigrant Social Networks
  3.1 Introduction
  3.2 Data
    3.2.1 The Longitudinal Survey of Immigrants to Australia (LSIA)
    3.2.2 Measurement: Linguistic concentration
    3.2.3 Descriptive Statistics
  3.3 Empirical Framework
  3.4 Results
    3.4.1 The Effect of Linguistic Concentration on Language Acquisition
    3.4.2 Sensitivity to controls
    3.4.3 Sponsored immigrants
  3.5 Heterogeneous effects and mechanisms
  3.6 Concluding Remarks
  3.7 Tables and Figures
  3.8 Appendix Tables and Figures
  3.9 Appendix: Level of aggregation
  3.10 Data appendix
  3.11 Appendix: Theoretical Model
    3.11.1 The learning process
    3.11.2 The location decision and selection bias

Bibliography


Chapter 1

Long-term Contextual Effects in Education: Schools and Neighborhoods

1.1 Introduction

Improving graduation rates and college attendance are high-priority objectives shared by community leaders, researchers and policymakers. Educational outcomes, however, vary greatly across regions, neighborhoods and schools. Given the sizable economic and nonpecuniary benefits to education, disparities in educational attainment can translate into persistent socio-economic inequality in adulthood.¹

Multiple policy interventions target neighborhoods directly or incentivize families to relocate to low-poverty areas, motivated by the belief that social context significantly influences students’ aspirations and learning. Schools are key institutions of local communities and thereby plausibly constitute a pivotal mechanism fueling spatial inequalities. Yet, empirical evidence on the relative importance of schools and of neighborhoods for educational attainment remains scarce, despite the important implications of such information for the allocation of public resources towards policies directed at either schools or neighborhoods. For instance, in many jurisdictions, school enrollment is strictly residence-based, which makes these two dimensions observationally indistinguishable. Disentangling neighborhood and school effects is further complicated by sorting of families; identifying separate causal effects for schools and neighborhoods requires two sources of random variation.

This paper estimates the separate effects of schools and neighborhoods on long-term educational outcomes. In particular, I evaluate the long-term impact of growing up in a better area and calculate the fraction of the benefits that are driven by school quality. To do so, I combine unique student-level administrative data with several key institutional features of Quebec’s education system to overcome the stringent data and institutional requirements that generally hinder analyses of the separate contribution of schools and neighborhoods.² The large longitudinal database used here follows students who grew up in the region of Montreal throughout their entire educational career and tracks them on a yearly basis as they switch schools, move across neighborhoods, and make higher education investments.

My empirical framework brings together two research designs to develop a new approach that allows me to decompose total exposure effects – the combined effect of an additional year of exposure to a given neighborhood and to its schools – into school and neighborhood components. The analyses combine variation from one instrument shifting school quality alone (holding neighborhood quality constant) with variation from another shifting both schools and neighborhoods simultaneously. First, institutional rules that assign different default schools to students based on their linguistic background are used to calculate the effect of attending a better school. Second, I adapt methods developed by Chetty and Hendren (2018a) to conduct a series of within-city across-neighborhood quasi-experiments that vary both school and neighborhood quality. More precisely, I compare students who made the same move – both from and to the same places – at different ages to pin down total exposure effects on a variety of measures of educational attainment, including university enrollment, graduating from secondary school on time, and number of years of education. Together, these research designs allow me to isolate the fraction of the benefits of moving to a better area that is due to access to schools of greater quality. The results indicate that these benefits are large and mostly driven by schools rather than by other neighborhood characteristics.

Quebec is particularly well suited for investigating the role of schools separately from neighborhoods. The province operates two parallel public school systems – one French and one English – thereby allocating neighbors to different default neighborhood schools on the basis of their mother tongue. Importantly, parents are allowed to opt out of these default options, breaking the deterministic link between schools and neighborhoods within language groups.³ I exploit these assignment rules along with hand-collected geocoded data of primary school catchment areas to instrument for school quality in a spatial regression-discontinuity framework (RD-IV), leveraging the fineness of the spatial information in the administrative files. Here, I document an important role for schools independently of neighborhoods: students growing up on the side of a French primary school boundary associated with a relatively higher-quality default option are 3 percentage points more likely to enroll in university than their immediate neighbors on the opposite side of the boundary. Crucially, the catchment areas of English and French default schools are not the same. This feature allows me to implement placebo tests confirming that the relevant boundaries do not coincide with discontinuous changes in non-school unobserved attributes.⁴ The educational outcomes of students attending English schools exhibit no discontinuity around French primary school boundaries.

Next, I estimate the magnitude of total causal exposure effects by focusing on movers. To address the endogeneity of location decisions, I exploit variation in the timing of moves across families and focus on within-city moves to examine the role of schools.⁵ Intuitively, if social context matters, the educational outcomes of movers should converge towards those of the permanent residents of their destination (children who always resided in the same area) with increasing time spent in that location. The reduced-form object of interest is a convergence rate. To ensure that the estimates are not confounded by sorting into different areas, the model relies on comparisons between children who started in the same neighborhood and moved to the same neighborhood.⁶ The identifying assumption, then, is that the degree of selection into locations does not vary systematically with children’s age at the time of the move. In support of this assumption, I show that my results hold up to a series of robustness checks, notably family fixed effects specifications and controlling for time-varying observables around the time of the move.

¹ On the economic returns to education see Oreopoulos and Petronijevic (2013), Card (2001), and Angrist and Krueger (1991), and, specifically for Canada, Boudarbat, Lemieux and Riddell (2010). On non-pecuniary benefits, see Oreopoulos and Salvanes (2011) and Heckman, Humphries and Veramendi (2017).

² Identification of exposure effects requires longitudinal data, and separating the two contexts necessitates that both residence and school attended be observed in the data, and that the two dimensions don’t perfectly overlap.

³ Private schools are widespread and relatively affordable in Montreal, generating more school and neighborhood independent variation, and further loosening the mechanical relationship between the two dimensions. Private school students are included in my database.

⁴ Boundaries that coincide with major geographical features such as highways or canals are excluded.

⁵ Aaronson (1998) and Weinhardt (2014) also use variation from movers to identify neighborhood effects. Chetty and Hendren (2018a,b) use rich tax data to track families with children who move across commuting zones and counties in the U.S. and estimate the causal effect of places on earnings. Similar identification strategies have also been used to analyze health care utilization (Finkelstein, Gentzkow and Williams, 2016), physician practice style (Molitor, 2018), the impact of EITC on labor supply (Chetty, Friedman and Saez, 2013), and brand preferences (Bronnenberg, Dubé and Gentzkow, 2012).

I find that movers’ outcomes improve linearly with each year spent in a better location during childhood, where neighborhood quality is measured using the mean long-term outcomes of permanent residents. My estimates suggest that movers’ educational attainment converges at an annual rate of about 4.5% towards the outcomes of the permanent residents of their destination neighborhood. Put differently, moving one year earlier to a place where permanent residents have 1 more year of education than those of one’s origin location increases one’s own educational attainment by 0.045 years. Extrapolating over 15 years of childhood, these effects account for about 2/3 of the differences in outcomes of permanent residents between origin and destination. The magnitude of these effects is remarkably similar to that reported in Chetty and Hendren (2018a) despite important differences between the two settings, notably my emphasis on smaller geographic units and within-city variation.
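The extrapolation in this paragraph is simple arithmetic and can be checked directly. The sketch below is illustrative only: the function name `cumulative_gain` and its parameters are hypothetical, and the only inputs taken from the text are the 4.5% annual rate and the 15-year horizon. It assumes the exposure effect is linear in years of exposure, as the estimates suggest.

```python
# Illustrative arithmetic only; the rate and horizon are the figures quoted in the text.
ANNUAL_CONVERGENCE_RATE = 0.045  # share of the origin-destination gap closed per year
YEARS_OF_CHILDHOOD = 15          # extrapolation horizon used in the text


def cumulative_gain(gap: float, years_exposed: float,
                    rate: float = ANNUAL_CONVERGENCE_RATE) -> float:
    """Predicted gain for a mover under a linear exposure effect.

    gap: difference in permanent residents' mean outcomes
         (destination minus origin), e.g. in years of education.
    years_exposed: number of childhood years spent in the destination.
    """
    return rate * years_exposed * gap


if __name__ == "__main__":
    one_year = cumulative_gain(gap=1.0, years_exposed=1)
    full_childhood = cumulative_gain(gap=1.0, years_exposed=YEARS_OF_CHILDHOOD)
    print(f"one extra year of exposure: {one_year:.3f} years of education")
    print(f"{YEARS_OF_CHILDHOOD} years of exposure: {full_childhood:.3f} of the gap")
```

With a one-year gap in permanent residents' education, fifteen years of exposure closes roughly two thirds of the gap, which matches the figure quoted above.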

Having established the presence of substantial contextual effects in Montreal, I then explore how much of the benefits of moving to better areas are driven by school quality. Key to this analysis is the fact that since school effects can be identified after conditioning on neighborhoods, the mean outcomes of permanent residents of a given neighborhood can be partitioned into a part reflecting the quality of local schools and a neighborhood “residual”. For example, two locations where permanent residents have the same mean outcomes need not have schools of the same quality if they differ in terms of other neighborhood characteristics.

Building on this insight, I isolate the fraction of total exposure effects that are driven by schools. More precisely, with forecast-unbiased measures of school quality in hand (from the RD estimates), I separately estimate the effect of moving to a place with schools of greater quality and the effect of moving to a place with better (non-school) neighborhood amenities. I show that the total convergence rate is determined by these two partial exposure effects along with the variances of the school and neighborhood components, and derive the mapping between the total rate and these other reduced-form parameters. Then, I calculate a restricted convergence rate for which the effect of moving to a place with schools of greater quality is set to zero. Comparing this restricted rate to the total rate that incorporates both school and neighborhood effects, I find that between 50% and 70% of the benefits of moving to a better area are due to access to better schools. Even in a context where students have the option to opt out of their local educational institutions, causal place effects are driven for the most part by schools rather than by neighborhoods themselves. Nonetheless, a small residual neighborhood exposure effect persists above and beyond the contribution of schools.
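To make the logic of this decomposition concrete, the sketch below works through a stylized version. It is not the derivation from the Mathematical Appendix. It assumes that differences in permanent-resident outcomes split additively into a school component and a neighborhood component, that each component has its own partial exposure effect, and (by default) that the two components are uncorrelated. All function names and numerical inputs are hypothetical and chosen only for illustration. Under those assumptions, the slope from regressing movers' gains on the total quality difference is a variance-weighted average of the two partial effects, and the restricted rate is the same expression with the school effect set to zero.

```python
def total_rate(beta_school: float, beta_nbhd: float,
               var_school: float, var_nbhd: float, cov: float = 0.0) -> float:
    """Slope from regressing (beta_school*s + beta_nbhd*n) on (s + n).

    s and n are the school and neighborhood components of the quality
    difference between destination and origin (hypothetical notation).
    """
    numerator = beta_school * (var_school + cov) + beta_nbhd * (var_nbhd + cov)
    denominator = var_school + var_nbhd + 2.0 * cov
    return numerator / denominator


def school_share(beta_school: float, beta_nbhd: float,
                 var_school: float, var_nbhd: float, cov: float = 0.0) -> float:
    """Fraction of the total rate that disappears when the school effect is zero."""
    full = total_rate(beta_school, beta_nbhd, var_school, var_nbhd, cov)
    restricted = total_rate(0.0, beta_nbhd, var_school, var_nbhd, cov)
    return 1.0 - restricted / full


if __name__ == "__main__":
    # Hypothetical inputs, not estimates from the thesis.
    share = school_share(beta_school=0.05, beta_nbhd=0.03,
                         var_school=1.0, var_nbhd=1.0)
    print(f"share of the total rate attributed to schools: {share:.3f}")
```

The design choice here is to express the school share as one minus the ratio of the restricted rate to the total rate, mirroring the comparison described in the paragraph; how the thesis handles correlation between the two components is not covered by this sketch.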

This paper brings together several literatures. First, it speaks directly to the classic question “Do neighborhoods matter?”. Correlational analyses generally find strong associations between neighborhood poverty and success at school (Sharkey and Faber, 2014; Burdick-Will et al., 2011). In contrast, most experimental and quasi-experimental studies that tackle the challenging task of isolating “place” effects from non-random sorting of families into neighborhoods have found limited evidence of static neighborhood effects on educational and economic outcomes (Ludwig et al., 2013; Kling, Liebman and Katz, 2007; Oreopoulos, 2008, 2003; Jacob, 2004).⁷ In a recent re-analysis of the Moving to Opportunity (MTO) experiment, Chetty, Hendren and Katz (2016) show that children do benefit from moving to better locations both in terms of earnings and college enrollment, but that these gains only materialize for youth who moved before the age of 13, consistent with cumulative exposure effects. Similarly, Chetty and Hendren (2018a,b) estimate large exposure effects for children moving across U.S. commuting zones. Given that school attendance is generally residence-based, these estimates of neighborhood exposure effects also reflect differences in local school quality (Altonji and Mansfield, 2014). My estimates of total exposure effects are consistent with this prior literature, but my main focus is on unpacking the role of schools as a mechanism. Accounting for the cumulative nature of long-term contextual effects, I demonstrate that neighborhood exposure effects operate mostly via schools rather than through neighborhoods themselves.

⁶ The main empirical specification includes both origin-by-destination and age-at-move fixed effects.

Second, my paper also relates to a parallel stream of research evaluating the causal impact of schools on educational and labor market outcomes. Large effects of attending a better school are found using quasi-experiments (Gould, Lavy and Paserman, 2004), lottery-based designs (Angrist et al., 2017; Deming et al., 2014; Dobbie and Fryer, 2015, 2011), and admission threshold rules (Pop-Eleches and Urquiola, 2013; Jackson, 2010). Using similar research designs, however, Abdulkadiroğlu, Angrist and Pathak (2014) and Cullen, Jacob and Levitt (2006) respectively find no positive effects of attending an elite school or of attending one’s preferred school in a school choice program. My paper takes a different approach and instead exploits spatial discontinuities in the spirit of Black (1999). I show that the early schooling environment has a long-term impact: residing on the better side of a French primary school boundary at age 6 affects educational outcomes measured more than 10 years later.

I also contribute to a growing body of research that contrasts the magnitude of school and neighborhood effects.⁸ Historically, researchers have generally focused on either schools or communities, but a few recent review papers have speculated on the relative effectiveness of school and neighborhood interventions by comparing separate studies.⁹ Fryer and Katz (2013) and Katz (2015) contrast results from the MTO experiment (Ludwig et al., 2013; Kling, Liebman and Katz, 2007), which induced low-income families to move to low-poverty neighborhoods, with the effects of the Harlem Children’s Zone experiment, which combines both school-level and community-level interventions (Dobbie and Fryer, 2015, 2011). They conclude that school interventions are likely more effective than community programs for educational outcomes, a conclusion also reached by Oreopoulos (2012).¹⁰ My paper adds to this evidence by directly separating school from neighborhood effects using two instruments simultaneously and shows that school quality goes a long way toward explaining why neighborhoods matter for educational attainment.

⁷ Notable exceptions include Goux and Maurin (2007) who find positive effects of neighborhood peers on the probability of repeating a grade using variation from public housing projects in France, and Gould, Lavy and Paserman (2011) who find significant effects of childhood conditions on adult outcomes in Israel.

⁸ In sociology, Carlson and Cowen (2015) examine short-run variation in test score gains across schools and neighborhoods in Milwaukee, and Wodtke and Parbst (2017) explore how school poverty mediates neighborhood effects on math and reading tests in the PSID. Sykes and Musterd (2011) study how school characteristics mediate the relationship between neighborhood characteristics and test scores in the Netherlands.

⁹ Papers that explicitly consider schools and neighborhoods as separate entities include Card and Rothstein (2007) on the effect of segregation on test scores and Billings, Deming and Ross (2016) on the formation of criminal networks.

¹⁰ Rothstein (2017) finds that differences in quality of K-12 education (measured by test scores) account for little of the between-city differences in intergenerational mobility, while Card, Domnisoru and Taylor (2018) find that state- and county-level school quality was a key factor driving regional differences in upward mobility in the early 20th century. In contrast to these studies, my paper studies school quality as a within-city mechanism.


The methods used in this paper have several empirical benefits. Movers likely constitute a more diverse cross-section of the population than samples of experimental studies that focus on very disadvantaged households (Oreopoulos, 2003) or negatively-selected populations of lottery applicants (Chyn, 2016), contributing to the external validity of the results. Also, by using outcome-based measures of neighborhood and school quality, I circumvent the issue of choosing which observable characteristics to use to proxy for quality. For instance, school input measures and teacher observable characteristics often fail to predict effectiveness, despite the evidence that both schools and teachers have large causal effects on student outcomes (Dobbie and Fryer, 2013; Chetty, Friedman and Rockoff, 2014b; Rivkin, Hanushek and Kain, 2005; Hanushek, 1986).

The rest of the paper proceeds as follows. First, I describe the institutional context and the data in Section 1.2. Then, to fix ideas and motivate the empirical analyses, I set up a conceptual framework in Section 1.3, and present the associated econometric models in Section 1.4. The results are reported in Section 1.5 and a host of robustness checks are conducted in Section 1.6. Section 1.7 concludes.

1.2 Data and Background

1.2.1 Quebec’s Education System

Levels of education. In Quebec, education is compulsory from age 6 to 16, and most children enroll in kindergarten at age 5. Children complete six years of primary education (grades 1-6), and then attend a secondary school for five more years (grades 7-11), until they obtain a secondary school diploma (diplôme d’études secondaires – DES), or equivalent qualifications. Grade repetition is common and over 20% of students drop out of secondary school before obtaining any degree.

The higher education system differs considerably from standard North American systems. In Quebec, there is a sharp hierarchical distinction between college and university, the former being a pre-requisite for the latter.11 After secondary school, most students enroll in college in either a pre-university (2 years) or a technical program (3 years).12 Pre-university college degrees are categorized in three broad fields – social sciences, natural sciences, and arts – which are chosen at the time of applying to college. The typical student who obtains a pre-university college degree then enrolls in a 3-year bachelor degree program in university. As in college, students apply and are admitted directly to a specific university program. A college degree is a necessary condition for university admission, with few exceptions.

Figure 1.1 shows the typical education course towards a bachelor degree in Quebec and in a standard North American system. No transition between levels of education in Quebec coincides with the age at which students transition in other educational systems. The number of years of education associated with a bachelor degree, however, remains the same.

In the empirical application, I measure neighborhood and school exposure up until the academic year a student is aged 15 on September 30, inclusive. All educational investments made after that point are considered outcomes.

School choice between sectors. Quebec’s education system possesses multiple elements of school choice that contribute to breaking the mechanical link between area of residence and school attendance.

11 Collegial institutions in general are informally known in Quebec as Cégeps, although only public colleges officially bear that name. Cégep is an acronym for Collège d’enseignement général et professionnel.

12 Completing a college technical program in Quebec is roughly equivalent to a 2-year college degree in the U.S.


At the primary and secondary levels, two public school systems operate in parallel – one French and one English. Public schools are governed by school boards, which are responsible for personnel, transportation, infrastructure, and the allocation of resources across schools. School boards are language-specific, and any place of residence falls within the territory of exactly one English and one French school board.13

Importantly, the attendance zones of English and French schools are not the same. Hence, two neighbors with different mother tongues who both attend their nearest language-specific public school likely have school peers who originate from different neighborhoods. Access to instruction in English is restricted to anglophones born in Canada. This rule is strictly enforced and parents must obtain a certificate of eligibility before enrolling their child in an English school.14 Note that language restrictions do not apply to post-secondary institutions.

In comparison with other Canadian provinces and the U.S., the private sector is widespread in Quebec, notably at the secondary school level. In Montreal, almost a third of all students attend a private secondary school. Private schools do not have attendance zones and are relatively accessible given that they are highly subsidized and that very few schools charge the maximum fee allowed by law (Lefebvre, Merrigan and Verstraete, 2011). Subsidized private schools are also subject to the language of instruction restriction.15

School choice within sectors. Quebec’s open enrollment policy stipulates that parents have the right to enroll their child in the school of their choice (libre choix), subject to capacity constraints and the language restrictions described above. In practice, school boards assign children to default neighborhood schools, and parents who desire to enroll their child in a public school other than the one they are assigned to must fill in the relevant paperwork at the neighborhood school. Default options may induce two sets of parents living in the same area to enroll their children in different schools since catchment area boundaries often cut through neighborhoods.16 These boundaries serve as the basis for a regression-discontinuity analysis described in section 1.4.2.

Importantly, over the time period considered in this study, there existed no public information about relative primary school quality and performance such as rankings on outcome-based measures.17 If enrollment exceeds capacity, priority is given to children residing in the school’s catchment area and to siblings of children attending the school, and students opting out of their assigned school are not eligible for school bus transportation. Other non-residence based admission criteria are used for elite magnet schools. The neighborhood school therefore acts as a default option, and catchment area boundaries as cost shifters. In my data, every neighborhood school enrolls at least some students residing outside its catchment area.18

13 Before 1998, school boards were religion-specific (Catholic or Protestant), but individual schools were still either French or English.

14 In the language of the law, anglophones are students whose mother or father attended an English primary or secondary school in Canada. Under this rule, almost all immigrants are de facto forbidden to attend school in English. Exceptions to the rule are rare.

15 Non-subsidized English schools are allowed to enroll non-English speaking students. However, these schools are uncommon and represent less than 1% of total enrollment (Duhaime-Ross, 2015). A minority of subsidized private schools have entrance exams, yet the vast majority of students taking such exams are admitted to their preferred school (Lapierre, Lefebvre and Merrigan, 2016).

16 For example, Appendix Figure A.1 shows that many census tracts overlap with more than one French primary school catchment area. In this paper, I focus exclusively on French primary school boundaries since English primary school boundaries are not as well defined. Some English schools offer different programs (e.g. English Core vs. Bilingual) and their catchment areas may vary by program. At the secondary school level, English public schools in Montreal do not have catchment areas, but French public schools do.

17 All Montreal school boards strongly oppose public disclosure of rankings or quality indicators. Parents who decide to opt out of their neighborhood primary school must acquire their information either from social networks or by visiting schools during open-house events. On the other hand, secondary school rankings are published yearly in mainstream media. In 2015, a well-known newspaper published partial rankings of Montreal public primary schools for the first time. The cohorts of students analyzed in this paper had left primary school many years before that event.

1.2.2 Data

The main source of data used in this paper consists of student-level administrative records provided by Quebec’s Ministry of Education that cover all levels of education from primary school to university. Separate files from four different branches of the Ministry were matched using unique student identifiers. For each year in primary and secondary education, school attended, grade level, and the six-digit postal code of residence are recorded. Postal codes are very small geographic areas, generally equivalent to a block-face or a unique apartment building. One’s postal code determines the default neighborhood schools (one English and one French). Catchment areas were manually geocoded on that basis.19

In addition to the assigned neighborhood schools, I also calculated for each postal code the distance to the nearest catchment area boundary, distance to the nearest public school, associated Census Tract and Forward Sortation Area (FSA). FSAs are postal-code-based neighborhoods and constitute the main geographic unit of analysis.20 All distances were calculated separately for the English and French public school systems. In Montreal, students reside in over 500 different Census Tracts and about 100 FSAs. For confidentiality reasons, school identifiers and six-digit postal codes are de-identified in the analytical dataset.
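As a purely illustrative sketch of how postal codes map to FSA neighborhoods: an FSA is the first three characters of a six-character Canadian postal code. The function names and the example postal codes below are hypothetical and are not taken from the thesis data, which are de-identified.

```python
def fsa_from_postal_code(postal_code: str) -> str:
    """Return the FSA, i.e. the first three characters of a postal code."""
    cleaned = postal_code.replace(" ", "").upper()
    if len(cleaned) != 6:
        raise ValueError(f"expected a six-character postal code, got {postal_code!r}")
    return cleaned[:3]


def group_by_fsa(postal_codes):
    """Map each FSA to the sorted list of postal codes that fall in it."""
    groups = {}
    for code in postal_codes:
        groups.setdefault(fsa_from_postal_code(code), []).append(code)
    return {fsa: sorted(codes) for fsa, codes in groups.items()}


# Illustrative strings only (assumption: not drawn from the administrative files).
codes = ["H2X 1Y4", "H2X 3P2", "H3A 0G4"]
neighborhoods = group_by_fsa(codes)
```

Grouping on the three-character prefix is what makes FSAs coarser than census tracts in this setting: many postal codes collapse into one neighborhood identifier.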

Student demographics – age on September 30, gender, mother tongue, country of origin, language spoken at home – are included, in addition to time-varying variables such as school day care use (primary school only) and an indicator of whether a student is currently considered to have learning difficulties (primary and secondary school).21 In addition, I append neighborhood characteristics from the 2001 Canadian Census using students’ census tract of residence.

In terms of long-term educational outcomes, the data include enrollment and graduation information in secondary school as well as for all vocational, college, and university programs. I use these to calculate – among other outcomes – university enrollment, timely secondary school graduation, and number of years of education.22 Post-secondary programs of study are also recorded. More detailed information regarding the construction of the outcome variables is provided in the Data Appendix.

18 One important reason why capacity constraints do not appear to be binding is that Quebec’s school system was experiencing a decline in school-age population over the time period covered here, which notably led to several public school closures in the early 2000s.

19 See Data Appendix for details on how catchment areas were re-constructed.

20 An FSA is defined by the first three characters of a postal code. FSA boundaries do not necessarily overlap with census tract boundaries. FSAs are regularly used to operationalize neighborhoods in Canadian research (Card, Dooley and Payne, 2010).

21 In any given year in primary and secondary school, students with social maladjustment or learning disabilities can be identified as being “in difficulty”. School boards receive extra funding to support these students, and many observers worry that schools may ‘over-diagnose’ students as a result. Yet, the predictive power of this variable with respect to educational attainment is stunning. The probability that one obtains a secondary school diploma on time (five years after starting secondary school) decreases monotonically with each year flagged in difficulty (Figure A.2). For the earliest cohorts, the probability of obtaining a bachelor degree is 36% for students never identified in difficulty, while it is only 5% among those who were flagged once or more.

22 Other measures of educational attainment I consider include college enrollment and degree completion, bachelor degree completion, and expected earnings on the basis of highest level of education.

The sample is focused on residents of the Island of Montreal, Quebec’s most populous region and main urban center. This territory fully includes the city of Montreal and a few smaller municipalities that are either located in the suburban westernmost part of the Island or enclaved within the city of Montreal.23 The Island of Montreal encompasses three francophone and two anglophone school boards.

Administrative records were obtained for five cohorts of students who started primary school between 1995 and 2001, following students until the 2014-2015 academic year.24 The sample consists of all students who resided on the Island of Montreal at the time of entering grade 1 (100,929 students). This selection rule conditions on a common starting point, and therefore excludes students who moved to Montreal after completing grade 1 elsewhere.25 The main sample (92,764 students) excludes all students who left Quebec’s education system before turning 16.26

1.2.3 Descriptive and Summary Statistics

Descriptive statistics for the main sample are shown separately by mobility status in Table 1. Permanent residents are defined as those who, by the age of 15, had always resided in the same FSA (44,912 students). I distinguish between movers who were still living in Montreal by age 15 (31,525 students), and those who had moved off the Island but remained in the province. Because of the within-city focus of this paper, students who left Montreal are excluded from the empirical analyses. Note that residential mobility is greater within than across cities – for example, while less than 50% of students in my sample qualify as permanent residents, over 80% of families in Chetty and Hendren (2018a) do.
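The three mobility groups can be made concrete with a short sketch. The input layout (a mapping from age to FSA of residence), the function name, and the example FSA codes are assumptions made for illustration; the thesis does not describe its implementation.

```python
def mobility_status(fsa_by_age, montreal_fsas):
    """Classify a student from her FSA of residence at each age up to 15.

    fsa_by_age: dict mapping age -> FSA code (hypothetical input layout).
    montreal_fsas: set of FSA codes located on the Island of Montreal.
    """
    ages = sorted(age for age in fsa_by_age if age <= 15)
    history = [fsa_by_age[age] for age in ages]
    if len(set(history)) == 1:
        return "permanent resident"        # always in the same FSA
    if history[-1] in montreal_fsas:
        return "mover within Montreal"     # moved, still on the Island at 15
    return "mover out of Montreal"         # moved off the Island


# Illustrative codes only: two FSAs treated as on-Island, one as off-Island.
montreal_fsas = {"H2X", "H3A"}
stayer = mobility_status({6: "H2X", 10: "H2X", 15: "H2X"}, montreal_fsas)
mover = mobility_status({6: "H2X", 10: "H3A", 15: "H3A"}, montreal_fsas)
leaver = mobility_status({6: "H2X", 10: "H2X", 15: "J4K"}, montreal_fsas)
```

Only the first two groups enter the empirical analyses described in the text; the third is excluded because of the within-city focus.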

In Montreal, students are on average 6 years old when they enter primary school. Only half the sample consists of native French speakers, but 75% of students attend school in French. Allophones – defined as individuals whose mother tongue is neither French nor English – make up almost a third of the sample. Nevertheless, the vast majority of students was born in Canada (90%). Anglophones are overrepresented among permanent residents, while francophones are disproportionately more likely to move outside of Montreal, and allophones to move within Montreal.

At baseline (in grade 1), 4% of students are considered to have learning difficulties (flagged “in difficulty”), and the fraction increases sharply over time. By the time they reach the age of 15, almost a third of the sample will have been flagged at least once. In general, movers appear to be negatively selected: In first grade, 3% of permanent residents are in difficulty, while 5% of movers are. At age 15, one permanent resident out of four has been flagged at least once, while more than a third of movers have.

The number of years for which I track students varies across cohorts, hence observed educational attainment will be higher for earlier cohorts, by construction. Appendix Table A.1 reports summary statistics for some educational outcomes separately by cohort.27 Roughly 76% of students obtain a secondary school diploma (DES), but only 61% do so on time (in five years), with little variation across cohorts. The college enrollment rate is consistent across cohorts, at 70%. In terms of university-level outcomes, as of 2015, 46% of students who started primary school in 1995 had enrolled in university and 28% had completed a bachelor degree. Virtually no student of the 2001 cohort has a bachelor degree yet, but 22% of them are enrolled in university. Every econometric model in this paper includes cohort fixed effects to account for these differences.

23 These administrative divisions are irrelevant for school resources administration purposes.

24 Data for the 1997 and 1999 cohorts are not yet available.

25 While many families with school-age children move from Montreal to suburban areas outside of the Island, fewer move in the opposite direction. Also, by focusing on students who started primary school in Quebec, most international immigrants who arrived at age 7 or older are excluded.

26 In primary and secondary school, attrition is generally due to students leaving the province. In exceptional cases, some students may disappear from administrative records if they attend illegal schools, or are home schooled. Students leaving the system are disproportionately non-French speakers and immigrants. See the Data Appendix for more details on attrition.

27 These statistics exclude almost 1,000 individuals who enroll in a Quebec post-secondary institution at some point, but who had left the primary and secondary school system before turning 16 and therefore are excluded from the main sample.

Educational attainment varies dramatically across neighborhoods of Montreal. Figure 1.2 maps differences in mean educational attainment of permanent residents across FSAs.28 The gap between neighborhoods with the best and worst outcomes is vast, with local fractions of students completing high school on time ranging from 32% to 92%. The gap grows even larger in terms of university enrollment, with a minimum rate of 15% and a maximum of 80%. Even starker disparities emerge across census tracts (Figure A.6).

School attendance. Given the variety of school choice options available in Montreal, students living in the same neighborhood need not attend the same school. For instance, at baseline, students living in the average FSA attend as many as 57 different primary schools.29 When entering grade 1, 63% of students in French schools attend their neighborhood school and 50% of students in English schools do so. In total, 41% of students opt out of their default option at baseline. By the end of secondary school, this proportion exceeds 70%.30 Opt-out rates vary between the primary and secondary school levels primarily because of differences in availability of private school options. Around 12% of Montreal students are in the private sector in primary school, and that proportion rises to almost 30% in secondary school (Figure A.5). Yet, geography remains an important factor for many parents when it comes to deciding which school their child will attend. For example, among students in French schools at baseline, 68% attend their default school if that school is the nearest French public school to their home, while only 50% do so if it is not.31

To examine whether default options affect enrollment, I randomly pick one neighborhood school for each boundary and plot the probability of attending that school as a function of distance to the nearest boundary. Figure A.3 shows the discontinuity in attendance for students enrolled in French primary schools at baseline.32 Students at positive distances are residing in the catchment area of the randomly chosen school. On the left side of the border (negative distances), 20% of students attend the school located on the other side, rather than their own default option or any other French school. Despite the open-enrollment policy, there is a large discontinuity in attendance rates at the border, suggesting that many parents passively select the default option. This is consistent with a body of evidence in behavioral economics and psychology on the importance of default options (Chetty, 2015; Lavecchia, Liu and Oreopoulos, 2014).
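A binned version of such an attendance plot can be sketched as follows. This is a minimal illustration under stated assumptions: the record layout, the function name, the bin width, and every number in the toy data are invented for the example and do not reproduce Figure A.3.

```python
from collections import defaultdict


def attendance_by_distance_bin(records, bin_width):
    """Share attending the chosen school within bins of signed distance.

    records: iterable of (signed_distance, attends_chosen_school) pairs, where
    positive distances lie inside the chosen school's catchment area.
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [attendees, students]
    for distance, attends in records:
        bin_start = (distance // bin_width) * bin_width
        counts[bin_start][0] += int(attends)
        counts[bin_start][1] += 1
    return {b: n_att / n for b, (n_att, n) in sorted(counts.items())}


# Toy data (hypothetical): distances in meters, True if the student attends.
records = [(-150, False), (-120, False), (-50, True), (-30, False),
           (-20, False), (-10, False), (-5, False), (20, True),
           (40, True), (60, False), (150, True), (170, True)]
rates = attendance_by_distance_bin(records, bin_width=100)
jump_at_boundary = rates[0] - rates[-100]  # raw gap between adjacent bins
```

The raw gap between the two bins adjacent to zero is only a crude summary; a regression-discontinuity estimate would instead fit the relationship on each side of the boundary.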

1.3 Conceptual Framework

This section lays out a model of human capital accumulation that incorporates both cumulative neighborhood and school effects, expanding the framework of Chetty and Hendren (2018b) to allow for multiple contextual inputs. It first describes the outcomes of permanent residents and movers parsimoniously in terms of the parameters of an education production function. The model is then used to clarify the interpretation of reduced-form estimates of exposure effects. I then discuss how a decomposition of total exposure effects can be achieved. The econometric specifications used to implement this decomposition are presented in Section 1.4.

28 The values of $y^{PR}_n$ are adjusted for cohort differences. More precisely, these mean outcomes are neighborhood fixed effects from a regression of outcome $y$ on neighborhood and cohort fixed effects. The fixed effects are then re-centered so that their average is equal to the unconditional mean of $y$.

29 See Appendix Figure A.4 for distribution of FSAs by number of schools.

30 By definition, students in English secondary schools all opt out since English public schools do not have attendance zones at the secondary level. At the age of 15, 58% of students in French schools attend a school other than their default option.

31 For 30% of students in French schools the default option is not the nearest French public primary school.

32 Since attendance zones for English schools are not as well defined as for French schools, I focus on French boundaries.

Education production function. Consider a general framework in which educational investment in children takes place over compulsory schooling years (up to year $A$) and a long-term outcome is realized and measured after the investment years. The education production function is cumulative and additively separable in family, school, and neighborhood (non-school) inputs:

$$y_i = \sum_{a=0}^{A} \left[ \lambda \mu_{n(i,a)} + \omega \psi_{s(i,a)} \right] + A\theta_i$$

where $y_i$ is a measure of educational attainment for student $i$, $n(i,a)$ denotes the neighborhood in which the student resided at age $a$, $s(i,a)$ the school she attended that year, and $\theta_i$ are annual average family inputs. Neighborhood and school quality are denoted by variables $\mu_{n(i,a)}$ and $\psi_{s(i,a)}$, and parameters $\lambda$ and $\omega$ respectively represent the causal effect of one year of exposure to better non-school neighborhood amenities and the causal effect of attending a better school for one year. Contextual effects constitute the sum of neighborhood and school effects over all investment years.33 For ease of exposition, I collapse the sum of school inputs into annual averages, with $\psi_{s(n(i))}$ denoting average school quality for years during which student $i$ resided in neighborhood $n$. The production function can then be written as the sum of inputs received while living in each location:

$$y_i = \sum_{n} a_{in} \left[ \lambda \mu_n + \omega \psi_{s(n(i))} \right] + A\theta_i$$

where $a_{in}$ is the number of years student $i$ resided in location $n$ and $\sum_n a_{in} = A$. Note that average school quality $\psi_{s(n(i))}$ remains indexed by $i$ because students living in the same area can attend different schools.
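The equivalence between the year-by-year form and the collapsed location form can be checked numerically. The sketch below assumes exactly $A$ investment years (so that the years spent in each location sum to $A$); the function names and all parameter values are hypothetical.

```python
def outcome_from_yearly_inputs(nbhd_by_year, psi_by_year, mu, lam, omega, theta, A):
    """Year-by-year form: sum of lam*mu + omega*psi over years, plus A*theta."""
    total = sum(lam * mu[n] + omega * psi
                for n, psi in zip(nbhd_by_year, psi_by_year))
    return total + A * theta


def outcome_from_location_totals(nbhd_by_year, psi_by_year, mu, lam, omega, theta, A):
    """Collapsed form: sum over locations of a_in * (lam*mu_n + omega*mean psi)."""
    psis_by_location = {}
    for n, psi in zip(nbhd_by_year, psi_by_year):
        psis_by_location.setdefault(n, []).append(psi)
    total = 0.0
    for n, psis in psis_by_location.items():
        a_in = len(psis)                # years lived in location n
        mean_psi = sum(psis) / a_in     # average school quality while in n
        total += a_in * (lam * mu[n] + omega * mean_psi)
    return total + A * theta


# Hypothetical student: two years in "o", then two years in "d" (A = 4).
args = (["o", "o", "d", "d"], [1.0, 2.0, 3.0, 3.0],
        {"o": 1.0, "d": 2.0}, 0.5, 0.25, 1.0, 4)
y_yearly = outcome_from_yearly_inputs(*args)
y_collapsed = outcome_from_location_totals(*args)
```

Both functions return the same value because averaging school quality within a location and multiplying by the years spent there recovers the within-location sum.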

33 To keep the model tractable, I do not explicitly include disruption costs associated with moving or switching school in the production function. The empirical model developed in Section 1.4 accounts for any age-variant disruption costs with the inclusion of age-at-move fixed effects.

School effects. To isolate school effects from any neighborhood-related variation, I focus on the subsample of permanent residents (PR) – children who always resided in the same place. For these students, $a_{in} = A$ for neighborhood $n$, and $a_{ik} = 0$ for all other locations $k \neq n$. Their educational outcomes simplify to $y^{PR}_{n(i)} = A\lambda\mu_n + A\omega\psi_{s(n(i))} + A\theta_i$, and neighborhood-level mean outcomes of PRs are

$$y^{PR}_n = A\left[\lambda\mu_n + \omega\psi^{PR}_n + \theta^{PR}_n\right] \tag{1.1}$$

where $\psi^{PR}_n = E\left[\psi_{s(n(i))} \mid n(i,a) = n \;\forall a\right]$ is the average annual school input of permanent residents of location $n$, and $\theta^{PR}_n = E\left[\theta_i \mid n(i,a) = n \;\forall a\right]$ their average family input.34 In practice, school quality $\psi_{s(n(i))}$ and neighborhood quality $\mu_n$ are unobserved. Hence, I partition educational attainment into measurable school-related and neighborhood-related terms, as well as an idiosyncratic residual $\nu_i$ that is unrelated to either schools or neighborhoods:

$$y^{PR}_{n(i)} = \Omega_{s(n(i))} + \Lambda_n + \nu_i$$

where $\Omega_{s(n(i))}$ reflects both cumulative causal school effects over student $i$’s childhood, $A\omega\psi_{s(n(i))}$, as well as average sorting into schools, and $\Lambda_n$ is defined accordingly for neighborhood non-school amenities. Put differently, $\Omega_{s(n(i))}$ is a biased measure of true school effects $A\omega\psi_{s(n(i))}$ because it incorporates the partial correlation between school quality and parental inputs.

Let $\pi$ denote the fraction of the effect of $\Omega_{s(n(i))}$ on $y^{PR}_{n(i)}$ reflecting causal variation. My first empirical objective is to obtain a consistent estimate of $\pi$ to fix measures of predicted gains $\Omega_{s(n(i))}$ by properly deflating them.35 This can be achieved by using a valid instrumental variable that exogenously shifts school quality $\psi_{s(n(i))}$ independently of neighborhood quality (“first-stage”) and that is uncorrelated with parental inputs (“exclusion restriction”). Note that an OLS regression of $y^{PR}_{n(i)}$ on $\Omega_{s(n(i))}$ and a set of neighborhood fixed effects yields a coefficient on $\Omega_{s(n(i))}$ of one, by construction. In contrast, when instrumenting for $\Omega_{s(n(i))}$, the regression coefficient obtained reflects only the proportion of the effect of a one-unit change in $\Omega_{s(n(i))}$ that is due to true school effects and thereby corresponds to $\pi$. Forecast-unbiased measures of predicted gains can then be recovered using $\pi\Omega_{s(n(i))}$ (Chetty, Friedman and Rockoff, 2014a). Similarly, $\pi\Omega^{PR}_n$ is a forecast-unbiased measure of the average cumulative causal school effects for PRs of neighborhood $n$, $A\omega\psi^{PR}_n$.
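The deflation logic can be illustrated with a deliberately stylized example. This is not the estimator used in the thesis, whose instrument comes from catchment area boundaries; the sketch swaps in a binary instrument with full compliance and a simple Wald ratio. The two-school setup, the variable names, and all numbers are assumptions chosen so that the outcome-based measure overstates the causal gain by 50%.

```python
def wald_estimate(z, x, y):
    """IV (Wald) estimate of the effect of x on y with a binary instrument z."""
    def mean(values):
        return sum(values) / len(values)
    y1 = mean([yi for zi, yi in zip(z, y) if zi == 1])
    y0 = mean([yi for zi, yi in zip(z, y) if zi == 0])
    x1 = mean([xi for zi, xi in zip(z, x) if zi == 1])
    x0 = mean([xi for zi, xi in zip(z, x) if zi == 0])
    return (y1 - y0) / (x1 - x0)


# Hypothetical world: causal gains of 0 and 1, while the outcome-based
# measure Omega also absorbs sorting worth 0 and 0.5.
causal = {"low": 0.0, "high": 1.0}
Omega = {"low": 0.0, "high": 1.5}

family_input = [0.2, -0.2, 0.4, -0.4]           # same mix on both sides
z = [0, 0, 0, 0, 1, 1, 1, 1]                    # instrument: assigned default school
school = ["high" if zi else "low" for zi in z]  # full compliance (assumption)
theta = family_input + family_input             # inputs unrelated to z
y = [causal[s] + t for s, t in zip(school, theta)]
x = [Omega[s] for s in school]

pi_hat = wald_estimate(z, x, y)                 # share of Omega that is causal
deflated = {s: pi_hat * value for s, value in Omega.items()}
```

Because the instrument shifts school quality without shifting family inputs, the ratio recovers the causal share (two thirds here), and multiplying the raw measure by it returns the causal gain of the better school.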

Identifying total exposure effects. The total effect of growing up in a given area incorporates both school and non-school neighborhood inputs. To obtain causal estimates of these total exposure effects, I rely on movers. For one-time movers, let $o(i)$ denote the origin neighborhood of mover $i$, $d(i)$ denote the destination, and $m_i$ the age at which student $i$ moved.36 Their educational attainment is given by

$$y_i = A\left[\lambda\mu_{d(i)} + \omega\psi_{s(d(i))} + \theta_i\right] - (m_i - 1)\underbrace{\left[\lambda\left(\mu_{d(i)} - \mu_{o(i)}\right) + \omega\left(\psi_{s(d(i))} - \psi_{s(o(i))}\right)\right]}_{\text{Total exposure effects } (e_i(o,d))}. \tag{1.2}$$

34 The variation in outcomes across neighborhoods for permanent residents is a combination of true contextual effects – school as well as non-school factors – and of sorting. Decomposing the variance of $y^{PR}_n$:

$$\mathrm{Var}(y^{PR}_n) = \mathrm{Var}(A\lambda\mu_n) + \mathrm{Var}(A\omega\psi^{PR}_n) + \mathrm{Var}(\theta^{PR}_n) + 2\left[\mathrm{Cov}\left(A\lambda\mu_n, A\omega\psi^{PR}_n\right) + \mathrm{Cov}\left(A\lambda\mu_n, \theta^{PR}_n\right) + \mathrm{Cov}\left(A\omega\psi^{PR}_n, \theta^{PR}_n\right)\right].$$

35 More details about the interpretation of $\pi$ are provided in the Mathematical Appendix.

36 For these students, $a_{io} = m_i - 1$, $a_{id} = A - (m_i - 1)$, and $a_{ik} = 0 \;\forall k \neq o, d$.

Equation (1.2) highlights that the long-term outcomes of movers depend on the quality of schools and neighborhoods in both places as well as on the length of exposure to each place, which varies with age-at-move. Total exposure effects $e_i(o,d)$ are the gains of living in and attending schools of area $d$ for one year relative to area $o$. Unfortunately, terms incorporated in $e_i(o,d)$ are unobservable. To take equation (1.2) to the data, it is useful to re-write it as a function of variables that can be readily measured. For instance, since permanent residents’ mean outcomes are a function of the same contextual inputs, the total effect of living one year in area $d$ relative to area $o$ is directly related to the outcomes of PRs in both locations. Substituting the difference in outcomes between permanent residents of neighborhoods $d$ and $o$, $\Delta y_{od} = y^{PR}_d - y^{PR}_o$, into equation (1.2):

$$e_i(o,d) = \frac{1}{A}\Delta y_{od} + \omega\left[\left(\psi_{s(d(i))} - \psi_{s(o(i))}\right) - \left(\psi^{PR}_d - \psi^{PR}_o\right)\right] - \left(\theta^{PR}_d - \theta^{PR}_o\right). \tag{1.3}$$
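Equations (1.2) and (1.3) can be verified numerically for a single one-time mover. In the sketch below, all parameter values and dictionary names are hypothetical; the only purpose is to confirm that the algebra holds term by term when the mover's schools differ from those of permanent residents.

```python
A, lam, omega = 10, 0.3, 0.5              # hypothetical parameter values
mu = {"o": 1.0, "d": 2.0}                 # neighborhood quality
psi_pr = {"o": 1.0, "d": 3.0}             # mean school input of PRs
theta_pr = {"o": 0.5, "d": 1.5}           # mean family input of PRs
psi_mover = {"o": 1.5, "d": 2.5}          # quality of schools the mover attends
theta_i, m_i = 1.0, 4                     # mover's family input and age at move

# Mean outcomes of permanent residents, as in equation (1.1).
y_pr = {n: A * (lam * mu[n] + omega * psi_pr[n] + theta_pr[n]) for n in mu}

# Mover's outcome from the location-sum form: m_i - 1 years in o, the rest in d.
years = {"o": m_i - 1, "d": A - (m_i - 1)}
y_sum = sum(years[n] * (lam * mu[n] + omega * psi_mover[n]) for n in mu) + A * theta_i

# Total exposure effect, first by definition and then via equation (1.3).
e_def = lam * (mu["d"] - mu["o"]) + omega * (psi_mover["d"] - psi_mover["o"])
delta_y = y_pr["d"] - y_pr["o"]
e_13 = (delta_y / A
        + omega * ((psi_mover["d"] - psi_mover["o"]) - (psi_pr["d"] - psi_pr["o"]))
        - (theta_pr["d"] - theta_pr["o"]))

# Mover's outcome via equation (1.2).
y_12 = A * (lam * mu["d"] + omega * psi_mover["d"] + theta_i) - (m_i - 1) * e_def
```

With these numbers the PR outcome gap overstates the annual exposure effect, because part of the gap reflects differences in family inputs and in school inputs that the mover does not fully experience.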

Positive exposure effects imply that the cumulative gains of moving to a $\Delta y_{od}$-unit better area should grow with the amount of time spent in the destination (and shrink with age-at-move). Empirically, the magnitude of annual exposure effects can be assessed by comparing students who moved from the same origin to the same destination at different ages. Intuitively, if $d$ is “better” than $o$, then a student who moved at age 9 is expected to have better outcomes than her peer who made the same move at age 12 since she will have been exposed to the better area for three more years.37

A reduced-form object of interest is the rate at which movers’ outcomes converge towards those of the permanent residents of their destination with the number of years of exposure to that location, which can be estimated by regressing movers’ outcomes $y_i$ on the interaction term $(m_i - 1) \times \Delta y_{od}$.38

Equation (1.3) indicates the magnitude of this convergence rate will depend on the degree of sorting of permanent residents (e.g. the extent to which variation in $\Delta y_{od}$ reflects differences in $\theta^{PR}_d - \theta^{PR}_o$). Greater sorting of permanent residents translates into smaller convergence rates.39 Also, the size of the estimates increases with the propensity of movers to attend schools of comparable quality to those attended by permanent residents in their origin and destination. In this sense, the estimated coefficients can be interpreted as intention-to-treat (ITT) effects, with $E\left[\frac{\psi_{s(d(i))} - \psi_{s(o(i))}}{\psi^{PR}_d - \psi^{PR}_o}\right]$ representing the relevant compliance rate. Under full compliance, i.e. $\left(\psi_{s(d(i))} - \psi_{s(o(i))}\right) = \left(\psi^{PR}_d - \psi^{PR}_o\right) \;\forall i$, and no sorting of permanent residents, the convergence rate is equal to $1/A$. A non-zero convergence rate indicates that there are benefits to moving to a better area, but does not necessarily imply that neighborhoods matter independently of schools. If neither schools nor neighborhoods matter (i.e. $\lambda = \omega = 0$), then the convergence rate is zero.
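The $1/A$ benchmark can be reproduced in a toy simulation. The sketch assumes full compliance, no sorting, and movers whose family inputs match those of permanent residents, so each year spent in an area contributes $1/A$ of that area's PR outcome. It regresses outcomes on years of exposure to the destination interacted with $\Delta y_{od}$, which mirrors the $(m_i - 1) \times \Delta y_{od}$ interaction with the opposite sign. All numbers and names are hypothetical, and the control variables of the actual specification are omitted.

```python
def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx


A = 10                                             # hypothetical investment years
pairs = [(10.0, 14.0), (12.0, 9.0), (8.0, 13.0)]   # invented (y_PR_o, y_PR_d) values
x, gain = [], []
for y_o, y_d in pairs:
    for m in range(2, A + 1):                      # age at move
        years_in_d = A - (m - 1)
        # Each year in an area contributes 1/A of that area's PR outcome.
        y_i = ((m - 1) * y_o + years_in_d * y_d) / A
        x.append(years_in_d * (y_d - y_o))         # exposure times predicted gain
        gain.append(y_i - y_o)                     # outcome relative to origin PRs

rate = ols_slope(x, gain)                          # estimated convergence rate
```

Introducing sorting or partial compliance into the simulated outcomes would pull the estimated slope below $1/A$, in line with the intention-to-treat interpretation given in the text.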

37 If individual inputs adjust in response to changes in other inputs, then the effect of moving a student across neighborhoods should be interpreted as a policy effect which encompasses parental responses (Todd and Wolpin, 2003). For instance, prior research suggests that parental effort and school quality are treated as substitutes (Pop-Eleches and Urquiola, 2013; Houtenville and Conway, 2008).

38 Identification issues associated with this approach are discussed in section 1.4.

39 This is under the assumption that families with high unobservable characteristics select into better schools and neighborhoods: $\mathrm{Cov}(\lambda\mu_n + \omega\psi^{PR}_n, \theta^{PR}_n) > 0$. If the parents of students with low family inputs are more likely to sort into better schools and neighborhoods, then the convergence rate increases with sorting of permanent residents.

Decomposing total exposure effects. Total exposure effects encompass both the changes in school and non-school neighborhood inputs experienced by movers. Isolating the part of $\Delta y_{od}$ that reflects causal school effects and rearranging equation (1.3) accordingly yields

$$e_i(o,d) = \frac{1}{A}\pi\Delta\Omega_{od} + \omega\left[\left(\psi_{s(d(i))} - \psi_{s(o(i))}\right) - \left(\psi^{PR}_d - \psi^{PR}_o\right)\right] + \frac{1}{A}\Delta y^{-s}_{od} - \left(\theta^{PR}_d - \theta^{PR}_o\right) \tag{1.4}$$

where $\pi\Delta\Omega_{od} = \pi\left(\Omega^{PR}_d - \Omega^{PR}_o\right)$ and $\Delta y^{-s}_{od} \equiv \Delta y_{od} - \pi\Delta\Omega_{od}$. With measures of $\Delta\Omega_{od}$ and $\pi$ in hand, one can estimate separate convergence rates for (a) moving to an area with $\pi\Delta\Omega_{od}$-unit better schools, and for (b) moving to an area with $\Delta y^{-s}_{od}$-unit better outcomes associated with non-school neighborhood amenities.

The extent to which schools are directly driving total exposure effects can be assessed by calculating a restricted convergence rate for which the school channel has been shut down. I achieve this by setting the school-specific convergence rate to zero and calculating the associated residual convergence rate using the appropriate correspondence. I can then examine the fraction of total exposure effects that operate through schools by comparing the resulting restricted convergence rate with the benchmark total rate that encompasses both school and neighborhood effects.

The education production function used in this paper has several restrictions. Firstly, neighborhood and school effects are both linear in years of exposure.40 This assumption appears to be supported by the data (see Section 1.5). The model also rules out complementarity between schools and neighborhoods. Suggestive evidence that there is no systematic interaction between school and neighborhood quality in my data is provided in the next section. Additive separability of schools and neighborhoods is relatively standard in the literature (Gibbons, Silva and Weinhardt, 2013; Card and Rothstein, 2007), and is consistent with results from the Harlem Children’s Zone (Fryer and Katz, 2013). Finally, school and neighborhood effects are assumed to be constant across students. This assumption is common to most work on school (Deming, 2014), teacher (Chetty, Friedman and Rockoff, 2014b) and college (Hoxby, 2015) value-added.

1.4 Empirical Roadmap

This section presents the econometric specifications used to obtain the empirical objects necessary to implement a decomposition of exposure effects.

1.4.1 Schools and neighborhoods: Measurement

To obtain measures of $\Omega_{s(n(i))}$ and $\Lambda_n$, I estimate a simple two-way fixed effects model on the subsample of permanent residents. The estimating equation is

$$y_{insc} = \delta_n + \delta_{s(i)} + \delta_c + \varepsilon_{insc} \tag{1.5}$$

where $y_{insc}$ is a long-term educational outcome for student $i$ from cohort $c$, living in neighborhood $n$ and attending the set of schools $s(i)$. The model includes cohort ($\delta_c$), FSA ($\delta_n$) and school ($\delta_{s(i)}$) fixed effects. Intuitively, this model is identified because students living in the same area attend a variety of different schools, and students in the same school reside in different neighborhoods.41 Since students generally attend two different schools during childhood (one primary and one secondary school), I parameterize the vector of school effects to include a fixed effect for primary school attended at baseline ($\delta^P_s$) and a fixed effect for secondary school attended at age 15 ($\delta^S_s$). I therefore obtain a proxy for school quality for each school in the data set. Note that these measures of school quality are net of neighborhood fixed effects and therefore reflect the contribution of schools (and sorting into schools) that cannot be accounted for by where schools gather their students from.42 These outcome-based measures of school quality can be interpreted as predicted gains and reflect any observed and unobserved differences in productive school inputs (e.g. teacher and principal quality). In contrast, traditional measures of school quality based on test scores may not fully capture other important dimensions of school effectiveness for long-term educational attainment, such as effects on non-cognitive skills (Jackson, 2016; Heckman, Stixrud and Urzua, 2006).

40 Angrist et al. (2017), Dobbie and Fryer (2013), Abdulkadiroğlu et al. (2011) and Autor et al. (2016) also assume that school effects are proportional to the number of years. Chetty and Hendren (2018a) make a similar assumption for place effects.

41 Just like models of worker and firm fixed effects are identified from switchers (Abowd, Kramarz and Margolis, 1999), this model requires that students of a given neighborhood be observed in multiple schools and that students from a given school be observed in multiple neighborhoods.
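The identification argument for the two-way fixed effects model can be illustrated with a small sketch. It is not the estimation code used in the thesis: it drops the cohort effects and the primary/secondary split, uses a simple backfitting loop (alternating group means) instead of a regression package, and runs on made-up noise-free data. The function name `two_way_fixed_effects` and the toy labels are assumptions introduced for illustration only.

```python
from collections import defaultdict


def two_way_fixed_effects(rows, n_iter=200):
    """Backfitting sketch for y = nbhd_fe[n] + school_fe[s] + error.

    rows: list of (neighborhood, school, outcome) tuples.
    Levels are only identified up to a normalization; differences within
    each dimension are identified when neighborhoods and schools are
    connected through students (the 'switchers' logic of footnote 41).
    """
    nbhd_fe = defaultdict(float)
    school_fe = defaultdict(float)
    for _ in range(n_iter):
        # Neighborhood effect: mean outcome net of current school effects.
        sums, counts = defaultdict(float), defaultdict(int)
        for n, s, y in rows:
            sums[n] += y - school_fe[s]
            counts[n] += 1
        for n in sums:
            nbhd_fe[n] = sums[n] / counts[n]
        # School effect: mean outcome net of current neighborhood effects.
        sums, counts = defaultdict(float), defaultdict(int)
        for n, s, y in rows:
            sums[s] += y - nbhd_fe[n]
            counts[s] += 1
        for s in sums:
            school_fe[s] = sums[s] / counts[s]
    return dict(nbhd_fe), dict(school_fe)


# Hypothetical data: neighborhood B adds 0.1 relative to A,
# school Y adds 0.2 relative to X, and there is no noise.
toy_rows = [
    ("A", "X", 0.30), ("A", "X", 0.30), ("A", "Y", 0.50),
    ("B", "X", 0.40), ("B", "Y", 0.60), ("B", "Y", 0.60), ("B", "Y", 0.60),
]
nbhd_fe, school_fe = two_way_fixed_effects(toy_rows)
school_gap = school_fe["Y"] - school_fe["X"]
nbhd_gap = nbhd_fe["B"] - nbhd_fe["A"]
```

Because each neighborhood sends students to both schools, the loop can separate the school gap from the neighborhood gap. If all students of A attended X and all students of B attended Y, the two gaps could not be told apart.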

To describe the amount of variation in the data, Table 1.2 reports the student-level standard deviation in school ($\delta_{s(i)}$) and FSA ($\delta_n$) fixed effects for the three main outcomes of interest: university enrollment, finishing secondary school on time (DES in 5 years), and years of education. As a benchmark, I first report in columns (1), (3) and (5) the raw variation across school and neighborhood fixed effects, not accounting for variation in the other dimension. These reflect the dispersion of neighborhood and school fixed effects estimated in separate regressions. For all three outcomes, the variance across schools is about twice as large as the variance between FSAs.43 In columns (2), (4) and (6), fixed effects for schools and neighborhoods are estimated simultaneously. While the magnitude of the variation across schools barely shrinks when FSA fixed effects are included, a large fraction (between 55 and 65 percent) of the raw student-level variation across FSAs is accounted for by school attendance. Nevertheless, this preliminary piece of descriptive evidence suggests that there is independent variation across both schools and neighborhoods that cannot be accounted for by the other dimension.44

I then use the fixed effect estimates reported in column (6) to document two additional stylized facts. Firstly, the student-level correlation between school and FSA fixed effects for years of education is small but positive (0.17), indicating that students residing in better neighborhoods attend better schools on average.45 Secondly, I follow the approach developed in Card, Heining and Kline (2013) to examine whether there are systematic interactions between the two contexts. Figure A.7 is constructed by slicing the distributions of school and FSA fixed effects into deciles, and then plotting the average residuals in each school-by-neighborhood decile cell. Most average residuals are smaller than 0.1 year of schooling, or less than 5% of a standard deviation in the sample of permanent residents. If there were positive interactions between school and neighborhood quality, one would expect abnormally large and positive mean residuals for cells corresponding to high or low deciles in both dimensions. The figure shows no such discernible pattern, which lends support to the additive separability assumption made in section 1.3.46

42 Primary school quality is net of the secondary schools its students will eventually attend, and secondary school quality is net of the primary schools it gathers its students from.

43 This result is not due to the fact that there are fewer FSAs than primary schools, as the patterns replicate at the census tract level (Table A.2). Also, these patterns closely reflect the conclusions of Carlson and Cowen (2015), who focus on the variance in test score growth across neighborhoods and schools in Milwaukee’s open enrollment system.

44 Appendix Table A.3 reports corresponding standard deviations for “shrunk” estimates obtained using empirical Bayes techniques (Kane and Staiger, 2008; Chandra et al., 2016; Best, Hjort and Szakonyi, 2017). Such an adjustment leaves unchanged the observation that there is more variance across schools than neighborhoods. If anything, the between-neighborhood variation is noisier. In section 1.6.3 I use these empirical Bayes estimates in the decomposition of total exposure effects and show that doing so yields a larger total convergence rate and reinforces the conclusion that schools account for most of these effects.

45 The correlations are 0.05 and 0.13 for graduating secondary school on time and university enrollment, respectively.

46 In addition, allowing for unrestricted match effects between schools and neighborhoods (i.e. a full set of indicator variables for each possible combination of neighborhood and primary/secondary school) only slightly improves the model’s fit. For years of education, the adjusted $R^2$ increases from 0.3710 to 0.3735.


Finally, I collapse the estimated fixed effects, $\delta_{s(i)}$ and $\delta_n$, at the FSA level to obtain measures of $\Omega^{PR}_n$ and $\Lambda^{PR}_n$. Appendix Figure A.8 shows the spatial variation in these measures. Importantly, the two maps exhibit little overlap. Places with low values of the school component $\Omega^{PR}_n$ do not necessarily also have a low neighborhood component $\Lambda^{PR}_n$. This non-collinearity between the two dimensions is critical to the feasibility of decomposing total exposure effects.

1.4.2 Effect of attending better schools

In this section, I present the RD-IV design used to estimate the effect of attending better schools on long-term educational outcomes. The approach is based on the fact that schools’ catchment areas cut through neighborhoods in such ways that students on opposite sides of a boundary reside in the same communities and enjoy the same neighborhood amenities (Black, 1999).47 Yet, these boundaries shift the quality of schools two neighbors may be exposed to by varying their default option. I focus on French primary school boundaries throughout.

For each boundary, I first identify which of the two default schools is of better quality, that is, which yields greater predicted gains (i.e. has a relatively higher fixed effect $\delta^P_s$). Note that because these fixed effects are net of FSA-level variation and secondary school attendance, the “better” school for a given boundary is not necessarily the one where students have the best outcomes in absolute terms. For each student, I then define an indicator variable $HighSide_{ib}$ for whether student $i$ resides on the better side of the nearest French primary school boundary $b$.48 These indicator variables are then used as instruments in the following two-stage regression-discontinuity framework:

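$$y^{PR}_{icnb} = \pi\,\Omega^{-i}_{s(n(i))} + f(distance_{ib}) + \gamma X_{icnb} + \alpha_b + \alpha_n + \alpha_c + \varepsilon_{icnb} \tag{1.6}$$

$$\Omega^{-i}_{s(n(i))} = \zeta\,HighSide_{ib} + f(distance_{ib}) + \gamma X_{icnb} + \alpha_b + \alpha_n + \alpha_c + \varepsilon_{icnb} \tag{1.7}$$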

where (1.7) and (1.6) are the first- and second-stage equations, respectively. The dependent variable $y^{PR}_{icnb}$ is an educational outcome for permanent resident $i$ of neighborhood $n$. Student-level individual characteristics $X_{icnb}$ are included to improve precision.49 Each student is matched to the boundary $b$ that is nearest to her home. The main regressor of interest, $\Omega^{-i}_{s(n(i))}$, is a leave-self-out measure of average school quality over $i$’s entire childhood.50 In both stages, a control function for distance to the nearest boundary $f(distance_{ib})$ is included, as well as FSA ($\alpha_n$), boundary ($\alpha_b$), and cohort ($\alpha_c$) fixed effects.51 Standard errors are clustered at the French primary school boundary level.
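The logic of the two stages can be sketched in a deliberately stripped-down form. With a single binary instrument and no controls, the IV estimate of $\pi$ reduces to the ratio of the reduced-form jump in outcomes to the first-stage jump in school quality. The sketch below assumes away the distance control function, the covariates and all fixed effects of equations (1.6) and (1.7), and its numbers are made up; `wald_iv_estimate` is a hypothetical helper, not a function from the thesis.

```python
def mean(values):
    return sum(values) / len(values)


def wald_iv_estimate(high_side, school_quality, outcome):
    """Return (first_stage, reduced_form, iv_estimate) for a binary instrument.

    first_stage  : jump in school quality across the boundary
    reduced_form : jump in the outcome across the boundary
    iv_estimate  : reduced_form / first_stage
    """
    better = [i for i, z in enumerate(high_side) if z == 1]
    worse = [i for i, z in enumerate(high_side) if z == 0]
    first_stage = (mean([school_quality[i] for i in better])
                   - mean([school_quality[i] for i in worse]))
    reduced_form = (mean([outcome[i] for i in better])
                    - mean([outcome[i] for i in worse]))
    return first_stage, reduced_form, reduced_form / first_stage


# Hypothetical students: outcome = 0.8 * school quality + an unobserved
# term that has the same mean on both sides of the boundary.
high_side = [0, 0, 1, 1]
school_quality = [0.10, 0.20, 0.15, 0.25]
outcome = [0.13, 0.11, 0.17, 0.15]

first_stage, reduced_form, pi_hat = wald_iv_estimate(
    high_side, school_quality, outcome
)
```

In this toy example the unobserved term is negatively related to school quality within each side, so a naive regression of outcomes on school quality would be biased, while the ratio recovers the coefficient of 0.8 used to build the data.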

The validity of the RD approach rests on the assumption that right around boundaries, the quality of default school options is as good as random. In education systems where school attendance is fully determined by residence, households may sort right around the boundaries, generating discontinuities in sociodemographic characteristics (Bayer, Ferreira and McMillan, 2007). However, in Montreal, any incentive to sort at the boundary is substantially weakened by opportunities to opt out of one’s default public school. Similarly, the large set of available private school options strongly reduces sorting incentives. For instance, Fack and Grenet (2010) find that the capitalization of school quality in house prices in Paris falls sharply with private school availability, and is effectively null in areas with many private schools. More importantly, given that rankings of Montreal primary schools are not publicly available, distinguishing good from bad nearby schools is difficult and parents may have little ability to sort at boundaries.52

47 Boundaries that coincide with natural divisions such as highways or canals are excluded.

48 The boundary-specific higher quality default school is not the one with relatively higher raw outcomes for over a quarter of all permanent residents. In other words, if I were to assign values of $HighSide_{ib}$ on the basis of raw outcomes rather than of adjusted school quality $\delta^P_s$, the values of the dummy would flip for a fourth of my sample. The fact that I detect no evidence of sorting at the boundaries is consistent with findings that parental preferences are unrelated to school effectiveness once peer quality is accounted for (Abdulkadiroğlu et al., 2017; Rothstein, 2006), and that school value-added is not capitalized in house prices (Imberman and Lovenheim, 2016; Kane, Riegg and Staiger, 2006).

49 The RD point estimates are virtually identical if baseline characteristics $X_{icnb}$ are omitted.

50 The childhood school quality measure $\Omega^{-i}_{s(n(i))}$ is obtained by taking the sum of leave-self-out transformations of the primary ($\delta^P_{s(i)}$) and secondary school ($\delta^S_{s(i)}$) fixed effects estimated in section 1.4.1. The exact procedure is described in the Data Appendix. Jackknife and split-sample approaches yield almost identical results.

51 In the main specification, I follow Lee and Lemieux (2010) and parameterize $f(distance_{ib})$ with a rectangular kernel. In section 1.6, I show that my results are robust to functional form assumptions and bandwidth restrictions.

To validate that any jump in school quality at boundaries does not reflect discrete changes in student characteristics, I verify that observable characteristics are balanced around these boundaries (Figures A.10, A.11 and A.12). The distribution of covariates does appear to be smooth at the threshold (panels (a) to (j)). Similarly, there is no selective attrition around boundaries (panels (k) and (l)).

I also indirectly test the identifying assumption in section 1.5 by conducting a placebo test. I demonstrate that there is no discontinuity in educational outcomes for students in English schools around French boundaries. This placebo test suggests the boundaries do not coincide with discontinuous changes in non-school local amenities that would equally benefit English and French students. Note that any sorting of families around boundaries on the basis of their willingness to pay for school quality (via house prices) should affect both French and English households, as they all participate in the same housing market. That the fraction of students enrolled in English schools exhibits no discontinuity at boundaries is consistent with the absence of such sorting around the threshold.53

1.4.3 Total exposure effects

The empirical approach used to estimate the combination of school and neighborhood effects (total exposure effects) relies on variation in the timing of moves. More specifically, I first investigate whether the outcomes of movers converge towards those of the permanent residents of the FSA to which they move in proportion to time spent in that destination neighborhood. As in equation (1.3), the econometric framework models movers’ outcomes as a function of the outcomes of permanent residents of the neighborhoods in which they have resided, weighted by time spent in these locations. The main estimating equation is

$$y_{icmod} = \beta\,(m_i \times \Delta y_{od}) + \gamma X_{icmod} + \alpha_{od} + \alpha_m + \alpha_c + \varepsilon_{icmod} \tag{1.8}$$

where $y_{icmod}$ is some educational outcome of student $i$, from cohort $c$, who resided in neighborhood $o$ (origin) at baseline, and moved to neighborhood $d$ (destination) at age $m_i$. The coefficient of interest is on the interaction between age-at-move $m_i$ and $\Delta y_{od}$, the difference between the mean outcomes of permanent residents of neighborhoods $d$ and $o$. If exposure matters, we would expect that $\beta < 0$: each additional year of age at the time of the move means one fewer year of exposure to the destination, so the outcomes of movers converge to those of the permanent residents in the destination neighborhood with the number of years they lived in that area.
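A minimal sketch can show why the coefficient on the interaction is negative and what the origin-by-destination fixed effects absorb. It assumes a simple data-generating process that is not taken from the thesis: each mover's outcome equals the origin mean plus a constant annual rate times years of exposure times $\Delta y_{od}$, with exposure ending at a hypothetical final age of 15. Covariates, age-at-move and cohort fixed effects are omitted, and all numbers are made up.

```python
from collections import defaultdict

FINAL_AGE = 15    # hypothetical age at which exposure stops accumulating
TRUE_RATE = 0.04  # hypothetical annual convergence rate

# (origin-destination pair, origin mean outcome, delta_y_od, age at move)
toy_movers = [
    ("o1-d1", 0.30, 0.10, 7), ("o1-d1", 0.30, 0.10, 10), ("o1-d1", 0.30, 0.10, 14),
    ("o2-d2", 0.50, -0.20, 8), ("o2-d2", 0.50, -0.20, 12), ("o2-d2", 0.50, -0.20, 15),
]


def simulate_outcome(origin_mean, delta_y, age_at_move):
    """Outcome under the assumed linear-exposure process (no noise)."""
    years_exposed = FINAL_AGE - age_at_move
    return origin_mean + TRUE_RATE * years_exposed * delta_y


def within_pair_slope(rows):
    """OLS slope of y on x after demeaning both within each pair.

    rows: list of (pair, x, y). Demeaning within pair is equivalent to
    including origin-by-destination fixed effects.
    """
    groups = defaultdict(list)
    for pair, x, y in rows:
        groups[pair].append((x, y))
    sxy = 0.0
    sxx = 0.0
    for obs in groups.values():
        mx = sum(x for x, _ in obs) / len(obs)
        my = sum(y for _, y in obs) / len(obs)
        sxy += sum((x - mx) * (y - my) for x, y in obs)
        sxx += sum((x - mx) ** 2 for x, _ in obs)
    return sxy / sxx


rows = [
    (pair, age * delta_y, simulate_outcome(origin_mean, delta_y, age))
    for pair, origin_mean, delta_y, age in toy_movers
]
beta_hat = within_pair_slope(rows)
```

Under these assumptions the pair fixed effect absorbs the origin mean and the part of the gain that does not depend on timing, so the slope on $m_i \times \Delta y_{od}$ equals minus the annual rate.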

52 Appendix Figure A.9 shows a density plot by distance to boundaries. No excess density is observed on the right side of the threshold (side with relatively better schools in terms of university enrollment). A formal McCrary (2008) test finds no statistically significant gap: the log difference in height is 0.006 with a standard error of 0.018.

53 See panel (f) of Figures A.10, A.11 and A.12.


The origin is the FSA in which students resided at baseline, while the destination is the one in which they lived during the academic year they were aged 15 on September 30. Sorting to better areas is accounted for by origin-by-destination fixed effects ($\alpha_{od}$) and unobserved differences between students who move at different ages, notably differential disruption costs, are absorbed by age-at-move fixed effects ($\alpha_m$). Cohort fixed effects ($\alpha_c$) are also included to account for the different number of years for which students are tracked in the data. Standard errors are clustered at the destination neighborhood level to allow for arbitrary correlation among families moving to the same place.

Benchmark results are reported in section 1.5, and a series of robustness checks, including family fixed-effects models and verification of balance on covariates, are conducted in section 1.6. Note that there is no systematic correlation between $m_i$ and $\Delta y_{od}$ in the data. Children who move at early ages are no more likely to move to better or worse areas (relative to their origin) than children who move at later ages (Figure A.13). Using a Kolmogorov-Smirnov test, I cannot reject the null of equality of distributions of $\Delta y_{od}$ between early (age 7-11) and late (age 12-15) movers (p-val=0.22).

To maximize power, in most specifications the sample includes all movers irrespective of the number of times they moved across FSAs, as long as both origin and destination are within Montreal and are not the same.54 For multiple-time movers, the average quality of the neighborhoods they were exposed to prior to moving to the destination is measured with error. The model is therefore also estimated on the subsample of one-time movers, in which case the econometric model maps directly onto the conceptual framework discussed in section 1.3. In all cases, the sample is restricted to movers whose origin and destination both have at least 10 permanent residents.

1.4.4 Decomposing exposure effects

As discussed in section 1.3, the total convergence rate $\beta$ reflects the combined effect of changes in both school and neighborhood (non-school) quality. To investigate the quantitative importance of schools as a driver of this total effect, I estimate a “horse-race”-type model that simultaneously includes changes in both components of permanent residents’ outcomes. The reduced-form counterpart to equation (1.4) is

$$y_{icmod} = \beta_s\,(m_i \times \pi\Delta\Omega_{od}) + \beta_n\,(m_i \times \Delta y^{-s}_{od}) + \gamma X_{icmod} + \alpha_{od} + \alpha_m + \alpha_c + \varepsilon_{icmod} \tag{1.9}$$

where $\Delta\Omega_{od}$ and $\Delta y^{-s}_{od}$ are measured using the fixed effects estimated in section 1.4.1.55 As for total exposure effects, the school- and neighborhood-specific convergence rates $\beta_s$ and $\beta_n$ are identified from variation in the timing of moves. These are partial regression coefficients that reveal the annual effect of a change in one contextual dimension, holding the other constant.

Given that $\Delta y_{od} = \pi\Delta\Omega_{od} + \Delta y^{-s}_{od}$, there exists a direct mapping between estimates of total exposure effects obtained from equation (1.8) and the coefficients of equation (1.9). In fact, one can recover the full convergence rate $\beta$ by using estimates of $\beta_s$ and $\beta_n$, as well as sample estimates of $Var(\pi\Delta\Omega_{od})$, $Var(\Delta y^{-s}_{od})$ and $Cov(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})$ in the following decomposition equation (see Mathematical Appendix for derivation):

$$\beta = \underbrace{\beta_s \left[\frac{Var(\pi\Delta\Omega_{od}) + Cov(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})}{Var(\Delta y_{od})}\right]}_{\text{Convergence due to school effects}} + \underbrace{\beta_n \left[\frac{Var(\Delta y^{-s}_{od}) + Cov(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})}{Var(\Delta y_{od})}\right]}_{\text{Residual convergence due to (non-school) neighborhood factors}}. \tag{1.10}$$

54 A few families move out of a given FSA and later move back to that same neighborhood, in which case $\Delta y_{od} = 0$.

55 As a special case, if $\pi = 1$, then $\Delta y^{-s}_{od} = \Delta\Lambda_{od}$.

The full convergence rate is the sum of school-specific and neighborhood-specific terms. Intuitively, the total effect of moving to a better area captures independent variation in school and neighborhood quality, as well as joint variation in these two dimensions. As equation (1.10) makes clear, because of possible differences in variances, equality of $\beta_s$ and $\beta_n$ does not imply that schools and neighborhoods matter equally. In other words, even if students benefit greatly from having access to better schools (i.e. if $\beta_s$ is large in magnitude), schools could nonetheless explain only a small share of the estimated gains of moving to a better neighborhood if there is little variation in school quality across FSAs in the data (i.e. if $Var(\pi\Delta\Omega_{od})$ is small).

To examine how schools account for the observed convergence of movers’ outcomes towards those of permanent residents, I calculate a restricted convergence rate $\beta^{-s}$ by shutting down the school channel, that is, by setting the effect of moving to a place with schools of greater quality $\beta_s$ equal to zero:

$$\beta^{-s} = \beta_n \left[\frac{Var(\Delta y^{-s}_{od}) + Cov(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})}{Var(\Delta y_{od})}\right].$$

This restricted rate corresponds to any residual exposure effects that are not driven by changes in school quality. I then use this restricted rate to calculate the school share of total exposure effects by taking the following ratio

$$\frac{\beta - \beta^{-s}}{\beta} = \frac{\beta_s \left(Var(\pi\Delta\Omega_{od}) + Cov(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})\right)}{\beta\, Var(\Delta y_{od})}.$$

Overall, the analysis of the extent to which exposure effects are driven by causal school effects relies on a system of estimating equations and a clear mapping between them. Equation (1.8) pins down the total impact of moving to a better area, while equations (1.6) and (1.7) identify the forecast-unbiased school effects that are fed into equations (1.9) and (1.10) to implement the decomposition of interest. The next section reports the results of these statistical analyses.
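The algebra behind the decomposition in equation (1.10), the restricted rate and the school share can be checked numerically in a simplified setting. The sketch below is an illustration under assumptions, not the thesis's procedure: it uses two plain regressors (a school component and a non-school component) with no fixed effects, covariates or age-at-move interaction, and the data are invented. The function `decompose_convergence` and its dictionary keys are hypothetical names.

```python
def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance; cov(a, a) is the variance."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def decompose_convergence(y, x_school, x_nbhd):
    """Split the slope of y on (x_school + x_nbhd) into two terms."""
    v_s, v_n = cov(x_school, x_school), cov(x_nbhd, x_nbhd)
    c_sn = cov(x_school, x_nbhd)
    c_ys, c_yn = cov(y, x_school), cov(y, x_nbhd)

    # Two-regressor OLS coefficients (closed form of the normal equations).
    det = v_s * v_n - c_sn ** 2
    beta_s = (c_ys * v_n - c_yn * c_sn) / det
    beta_n = (c_yn * v_s - c_ys * c_sn) / det

    # Variance of the total change, Var(x_school + x_nbhd).
    v_total = v_s + v_n + 2 * c_sn
    school_term = beta_s * (v_s + c_sn) / v_total
    neighborhood_term = beta_n * (v_n + c_sn) / v_total  # restricted rate

    # Slope from the single-regressor model on the total change.
    x_total = [a + b for a, b in zip(x_school, x_nbhd)]
    beta_total = cov(y, x_total) / v_total

    return {
        "beta_s": beta_s,
        "beta_n": beta_n,
        "beta_total": beta_total,
        "school_term": school_term,
        "neighborhood_term": neighborhood_term,
        "school_share": (beta_total - neighborhood_term) / beta_total,
    }


# Hypothetical data (arbitrary units).
x_school = [2.0, -1.0, 3.0, 0.0, -2.0, 1.0]
x_nbhd = [1.0, 2.0, -1.0, -2.0, 0.0, 1.0]
y = [3.0, 0.0, 2.0, -1.0, -2.0, 2.0]

result = decompose_convergence(y, x_school, x_nbhd)
```

In this simplified setting the identity holds exactly in sample, because OLS residuals from the two-regressor model are orthogonal to both components. The example also makes the point of the text concrete: the school share depends on the variance and covariance weights as well as on `beta_s`.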

1.5 Results

This section reports baseline results, presenting estimates of school effects and the analysis of total exposure effects that focuses on movers. It then combines these two sets of results to evaluate the fraction of the benefits of moving to a better area that is driven by schools.

1.5.1 School effects

Because students are allowed to opt out of their neighborhood schools, boundaries might not necessarily generate large breaks in the quality of schools actually attended by students. I therefore first examine whether default options produce such first-stage variation.

Figure 1.3 plots the student-level mean quality of primary schools $\delta^P_{s(i)}$ (here measured in terms of university enrollment) at baseline by distance to the nearest boundary, where students assigned to the better school of any boundary-specific pair are depicted on the right of the threshold (positive distances). For visual clarity, I restrict the sample to permanent residents living within 500 meters of their nearest boundary. Panel A shows the quality of default primary schools, whereas Panel B plots the quality of schools actually attended by students. In both cases, a large jump in quality is observed right at the boundary. In Panel A, a break occurs by construction. Yet, the jump might be very small in magnitude if differences in school quality between nearby schools were small. This is not the case. The RD estimate implies a 6.3 percentage point jump in school quality measured in terms of university enrollment rates. Panel B confirms that default options have a strong impact on the quality of schools parents send their children to (statistically significant RD estimate of 2.6 percentage points). Panels C and D are placebo tests which only include students in English schools. These students should not be directly affected by French school catchment areas, but do enjoy the same neighborhood amenities. The similarity between Panels A and C indicates that English-school students reside around French boundaries that are no different than the boundaries faced by the full sample. Nonetheless, at these boundaries, there is no jump in the quality of schools attended by English students (Panel D).

Next, corresponding graphs for the first-stage equation (1.7) as well as the reduced-form relationship between distance to boundaries and university enrollment are shown in Figure 1.4. Again, Panels A and B include all permanent residents, and Panels C and D are restricted to students in English schools. The first graph confirms that the instrument has a strong first stage. Being assigned a better school at baseline does significantly shift the average quality of schools a student attends during childhood ($\Omega^{-i}_{s(n(i))}$). Educational attainment also jumps right at the threshold: students on the better side of a boundary at age 6 are about 2 percentage points more likely to eventually enroll in university later in life. Importantly, there is no break in school quality or university enrollment for students in English schools. The sharp changes observed at the threshold for the full sample are therefore due to schools themselves rather than to some other productive neighborhood characteristic that varies discontinuously and coincides with these boundaries.

Regression results analogous to the above figures are presented in Table 1.3. The baseline specification includes control variables $X_{icnb}$ (gender, place of birth indicators, language at home indicators, use of day care, ’in difficulty’ status at baseline, handicapped status) as well as cohort, FSA, and nearest boundary fixed effects. To increase precision, the main sample imposes no bandwidth restriction and includes all permanent residents.56 Columns (1) through (4) are first-stage and reduced-form regressions and are estimated by OLS. Consistent with the visual evidence, the average quality gap between default schools on opposite sides of a shared boundary is 0.0631 (s.e. 0.003), that is, 6.3 percentage points, in terms of university enrollment (column (1)). For all three main outcomes, these differences in default options do translate into significant differences in the quality of schools attended at baseline (gap of 0.0245 (s.e. 0.0027) in column (2) for university enrollment). Importantly, this initial shift in school quality strongly affects average childhood school quality $\Omega^{-i}_{s(n(i))}$ (column (3)). The results in column (4) indicate statistically significant reduced-form relationships between each measure of educational attainment and the assignment variable. For example, students living on the better side of a boundary are 3.5 percentage points (s.e. 0.0084) more likely to obtain a secondary school diploma in five years than students on the opposite side. Crucially, for columns (2) through (4), all coefficients for placebo tests reported in the bottom panel are close to zero and statistically indistinguishable from zero.

The last column reports two-stage least squares estimates of cumulative school effects. Here, there is some variation across outcomes. The RD-IV estimate of $\pi$ is below one for university enrollment and years of study, which implies the presence of some degree of sorting into schools that is not accounted for by place of residence. For these two outcomes, one may therefore overstate the importance of schools if this bias is ignored. In contrast, the coefficient for finishing secondary school on time is very close to one. Speculatively, for a given degree of sorting, schools likely have a more direct influence on immediate outcomes such as graduating on time than on higher education investments made later in life. In terms of the conceptual model of section 1.3, the value of $\omega$ might be relatively higher for outcomes on which schools can act directly. Section 1.6 documents the robustness of these results to functional form assumptions and bandwidth restrictions.

56 I document the robustness of the results to the choice of bandwidth in section 1.6.

1.5.2 Total exposure effects

In this section, I first provide visual evidence of the convergence of movers’ outcomes towards those of the permanent residents of their destination by estimating a non-parametric version of equation (1.8). More specifically, I interact $\Delta y_{od}$ with a set of indicators for each possible value of age-at-move $m_i$ (age 7 to 15). Figure 1.5 plots the results. As expected, the coefficients on $\Delta y_{od}$ shrink with age-at-move, that is, they increase with time spent in the destination neighborhood. Importantly, they decrease approximately linearly with age-at-move, which supports the assumption that exposure effects are linear in age.

I then report baseline estimates of the convergence rate, which is the slope of the line that would best fit the points shown on Figure 1.5. Table 1.4 reports the results for the main outcomes considered: university enrollment, finishing secondary school on time, and years of education. In the first two columns, I include all movers regardless of the number of times they moved. For this sample, the quality of the neighborhoods movers were exposed to prior to the move is necessarily measured with some error. I condition on one-time movers in columns (3) and (4). In columns (2) and (4), I include a set of dummies for the number of times one has been flagged in difficulty prior to moving to control for pre-move schooling ability. All models are estimated by ordinary least squares and standard errors are clustered at the destination FSA level.

For the two binary outcomes, moving one year earlier to a neighborhood where permanent residents exhibit 10-percentage-point higher outcomes, relative to the origin, increases movers’ educational attainment by about 0.4 percentage points. Extrapolating over 15 years, the cumulative effect would therefore be 6 percentage points, or 60% of the difference between permanent residents of the destination and origin locations. These point estimates are all statistically significant at the 1% level. Slightly larger coefficients are obtained for years of education, implying a convergence rate of about 4.5%. Overall, the estimates are very stable across specifications and outcomes. Controlling for poor schooling outcomes prior to moving (number of times in difficulty) or restricting the sample to one-time movers barely affects the magnitudes of the coefficients.57
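The extrapolation in the paragraph above is simple arithmetic, sketched here for clarity. The inputs (a 4% annual rate, 15 years, a 10 percentage point gap) come from the text; the function name `cumulative_exposure_effect` is a hypothetical helper, and the calculation assumes, as the text does, that the annual effect accumulates linearly.

```python
def cumulative_exposure_effect(annual_rate, years, gap):
    """Cumulative gain, in the units of `gap`, after `years` of exposure,
    assuming a constant annual convergence rate."""
    return annual_rate * years * gap


annual_rate = 0.04  # about 0.4 percentage points per year for a 10-point gap
years = 15
gap_pp = 10.0       # destination minus origin, in percentage points

effect_pp = cumulative_exposure_effect(annual_rate, years, gap_pp)
share_of_gap = effect_pp / gap_pp  # fraction of the gap closed after 15 years
```

With these inputs the cumulative effect is about 6 percentage points, or roughly 60% of the gap, matching the figures quoted in the text.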

To put these estimates in perspective, consider the raw variation across FSAs documented in Table 1.2. The standard deviation of university enrollment rates across FSAs for permanent residents is 14 percentage points. Hence, accumulated over 15 years, the exposure effects estimated from movers can account for almost half of these spatial differences. Alternatively, the cumulative effect of a move to a place where university enrollment is 10 percentage points higher is about half the size of the unconditional gender gap in university enrollment (11 percentage points in favor of women).

My estimates of total exposure effects are also surprisingly close to those reported by Chetty and Hendren (2018a), who find a convergence rate of 4% in earnings for millions of movers across commuting zones in the US. While one may expect a larger influence of neighborhoods at finer levels of geography, my estimates are more likely to be attenuated due to sampling error in the calculation of the average outcomes of permanent residents. Nonetheless, it is remarkable that our findings so closely align given the differences in the locations we study as well as differences in the populations of interest. While movers across cities tend to have a slight income advantage relative to stayers, movers within cities appear to be poorer than permanent residents (at least in Montreal).58

57 Allowing permanent residents’ outcomes to be cohort-specific (i.e. $y^{PR}_{nc}$) increases sampling error and therefore yields convergence rates of slightly smaller magnitudes (3.4 to 4.3%). Similar patterns emerge if I use mutually exclusive cohorts to calculate $y^{PR}_n$ and to estimate total exposure effects. These results are available upon request.

Results for alternative measures of educational attainment are shown in Appendix Table A.5. For most of these, the observed patterns mirror the main results. The magnitude of the effect appears marginally smaller for bachelor degree completion, and slightly larger for dropping out of high school with no degree or qualification. In the last two rows, I compute measures of expected earnings on the basis of (a) the highest level of education alone and (b) the level of education combined with the field of study, using the Public Use Microdata File of the 2006 Canadian Census.59 Convergence rates for expected earnings on the basis of the level of education are about 4 to 4.5%, while those that also take into account fields of study are slightly smaller (3.3 to 4%).

The estimates of the total effect of one year of exposure to a one-unit “better” area are valid under the assumption that the degree of selection to better FSAs does not vary systematically with age. In section 1.6, a host of robustness checks are conducted to corroborate the validity of this assumption.

1.5.3 Schools or neighborhoods?

While the previous section showed that exposure to better locations does matter, it is unclear whether it is schools or neighborhoods themselves that drive these effects. For instance, the descriptive statistics presented in subsection 1.4.1 show that a large fraction of the between-FSA variance is accounted for by school attendance. Also, the fact that many parents appear to passively enroll their child in the default school suggests that patterns of school attendance are not completely unrelated to geography.

In this section, I use decomposition equation (1.10) to evaluate the extent to which exposure effects operate through schools. Again, I start by presenting visual evidence based on non-parametric estimates. Figure 1.6 reproduces in light grey the total exposure regression coefficients that were previously shown in Figure 1.5. In addition, it displays in red the corresponding restricted coefficients for which convergence on the school component of differences in permanent residents’ outcomes $\pi\Delta\Omega_{od}$ has been set to zero.60 For all three outcomes, the slope of the line that connects these points is considerably flatter, indicating a much lower rate of convergence once the school channel has been shut down. In contrast, a similar exercise that instead shuts down direct neighborhood effects yields restricted coefficients that barely deviate from the ones that depict full exposure effects (Appendix Figure A.14).

58 For completeness, I also estimate the main exposure model at the census tract level (Table A.4, Panel A). Because census tracts are much smaller than FSAs, including origin-by-destination fixed effects generates a large number of singletons. For that reason, I also consider a less restrictive model in Panel B, which includes origin and destination fixed effects separately. Overall, the estimated coefficients vary between 2 and 4%, with most being smaller than the associated coefficients at the FSA level. This is consistent with the idea that measurement error is plausibly more important at the census tract level. Firstly, permanent residents’ outcomes will be measured less accurately because of sampling error. Also, census tracts may less precisely capture all features of the community in which children live and socialize, which is arguably larger than a single census tract. The smaller convergence rates could also reflect greater sorting of permanent residents at this level of geography.

59 Details on the measurement of all outcome variables are provided in the Data Appendix.

60 To construct this figure, a non-parametric version of horse-race equation (1.9) is first estimated:

$$y_{icmod} = \sum_{m=7}^{15} \beta_{s,m} \left( \pi\Delta\Omega_{od} \times \mathbf{1}\{m_i = m\} \right) + \sum_{m=7}^{15} \beta_{n,m} \left( \Delta y^{-s}_{od} \times \mathbf{1}\{m_i = m\} \right) + \gamma X_{icmod} + \alpha_{od} + \alpha_m + \alpha_c + \varepsilon_{icmod}.$$

I conduct a thorough investigation of the role of schools in Table 1.5. In the first column, school quality is measured by the simple neighborhood-level average of the sum of primary and secondary school effects ($\Omega^{PR}_{n}$). Setting to zero the effect of schools ($\beta_s = 0$), I obtain restricted convergence rates $\beta^{-s}$ of 1% for university enrollment, 1.1% for finishing secondary school on time, and 1.5% for years of education. Comparing these restricted rates with the full convergence rate (one minus their ratio gives the school share), the results imply that schools are responsible for 75% of total exposure effects on university enrollment. For timely graduation from secondary school and years of education, the proportions are 74% and 70% respectively. Column (2) reports results of a similar decomposition in which school quality is measured by the FSA average of permanent residents’ leave-self-out childhood school quality $\Omega^{-i}_{s(n)} = E\left[\Omega^{-i}_{s(n(i))} \mid n(i) = n \;\forall a\right]$. This slight change in measurement has little effect on the results – the fraction of exposure effects explained by schools remains in the vicinity of 70–75%.
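The share calculation described above can be sketched in a few lines. This is a minimal illustration, assuming the school share is one minus the ratio of the restricted to the full convergence rate; the function name and the input values are hypothetical and are not taken from Table 1.5.

```python
def school_share(full_rate, restricted_rate):
    """Share of the total convergence rate attributed to schools.

    full_rate: convergence rate with all channels active.
    restricted_rate: convergence rate once the school channel is set to zero.
    Both names are illustrative, not the thesis's notation.
    """
    if full_rate == 0:
        raise ValueError("full_rate must be non-zero")
    return 1.0 - restricted_rate / full_rate


# Hypothetical inputs chosen for illustration only.
example_share = school_share(full_rate=0.04, restricted_rate=0.01)
print(f"school share: {example_share:.2f}")
```

With a restricted rate that is one quarter of the full rate, three quarters of the exposure effect is attributed to schools.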

Decompositions that do not take into account that biased measures of school quality $\Omega^{-i}_{s(n)}$ also partly reflect selection may overstate the importance of schools. In column (3), I therefore use the baseline RD-IV estimates to isolate causal school effects. The school-related share of total exposure effects drops to 65% for university enrollment, to 73% for finishing secondary school on time, and to 46% for years of education.61

Overall, this decomposition analysis indicates that schools matter more than neighborhoods for long-term educational attainment. Most of the long-term benefits of moving to a better area are driven by changes in school quality. Nonetheless, schools do not fully account for these total exposure effects – neighborhoods do have a small independent effect on human capital accumulation. I further assess the robustness of this conclusion in the next section, in which I notably validate that movers do experience a substantial change in school quality as a result of the move.

1.6 Robustness

1.6.1 Robustness of Regression-discontinuity estimates

The benchmark specification for estimating school effects imposes several restrictions. Firstly, it assumes that the relationship between distance to the boundary and student outcomes is linear. Appendix Table A.7 allows for a quadratic functional form. RD-IV estimates for finishing secondary school on time and years of education appear insensitive to this assumption. The estimate of $\pi$ with quadratic functions for university enrollment (0.71), however, is smaller than the baseline (0.85). Both the reduced-form and first-stage coefficients are moderately smaller than in Table 1.3. The reduction in the reduced-form point estimate slightly exceeds that of the first-stage coefficients, which leads to a somewhat smaller RD-IV estimate. Similarly, using a triangular kernel for the control function yields results almost identical to the baseline results (Appendix Table A.8).

60 (continued) Then, a restricted coefficient $\beta^{-s}_{m}$ is calculated for each possible value of age-at-move using the mapping given by equation (1.10). The standard errors around these coefficients are obtained by the delta method.

61 In Appendix Table A.6, I re-arrange equation (1.10) and consider an alternative decomposition in which school effects are given by

$$\frac{\beta_s Var(\pi\Delta\Omega_{od}) + \beta_n Cov(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})}{Var(\Delta y_{od})}$$

and neighborhood effects by

$$\frac{\beta_n Var(\Delta y^{-s}_{od}) + \beta_s Cov(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})}{Var(\Delta y_{od})}.$$

This method differs in how it weighs the covariance term. The interpretation is now cast in terms of direct and indirect effects. For example, an increase in $\pi\Delta\Omega_{od}$ has a direct effect $\beta_s$ on $y_i$, as well as an indirect effect $\beta_n$ via its correlation with $\Delta y^{-s}_{od}$. It turns out that the covariance term is very small in practice, hence making the results under this approach almost identical to the main results.

Appendix Figure A.15 examines the sensitivity of results to bandwidth restrictions. Moving along the horizontal axis, I gradually expand the sample by including students living farther away from boundaries. The point estimates do fluctuate across these sample restrictions, following no monotonic pattern. For instance, for university enrollment, keeping only students living within 750 meters of a boundary yields a considerably smaller RD-IV coefficient (0.62), while further restricting the bandwidth to 300 meters gives a coefficient very close to the baseline (0.83). These movements in point estimates are plausibly driven by differences across the set of schools and neighborhoods that are dropped when the bandwidth is changed. For instance, denser parts of Montreal are unaffected by these restrictions since all students living in these areas live very close to a boundary. Large distances from boundaries are only observed in the suburbs.62 Nevertheless, most estimates shown in Figure A.15 remain within short range of the baseline results. The conclusions of the decomposition exercise are therefore unaffected by the choice of bandwidth – the vast majority of estimates of the fraction of total exposure effects driven by schools fall between 50 and 70% (Appendix Figure A.16).

1.6.2 Robustness of Movers estimates

My estimates of total exposure effects may be biased if students with higher (lower) unobserved family inputs ($\theta_i$) who move to better (worse) areas tend to do so earlier. This section examines the validity of the identifying assumption by running a set of robustness checks that address issues of time-invariant and time-varying unobserved heterogeneity.

1.6.2.1 Robustness: Time-invariant confounds

Within-family exposure effects. The first test I run involves estimating the exposure model with household fixed effects to account for any time-invariant family unobserved heterogeneity. In this specification, identification relies on age differences between siblings. In this context, positive exposure effects would generate a relationship between the change in neighborhood and school quality, on one hand, and the difference in educational outcomes among siblings, on the other hand, that varies proportionally to the age difference between siblings.

Since siblings are not directly identified in the data, I match students using unique moves at a very fine level of geography. More precisely, I assume that two students who move from and to the exact same six-digit postal codes in the same year must belong to the same household.63 Many household units are not consistent over time given the prevalence of step- and blended-families. For instance, two students from different biological parents may have been living under the same roof only for a fraction of their lives. I therefore exclude household units for which the children have lived at a common postal code for less than 75% of the years for which I can observe them.
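The matching rule described above can be sketched as a simple grouping step. This is an illustrative sketch only: the record layout and key names (`student_id`, `origin_pc`, `dest_pc`, `year`) are assumptions, the postal codes are made up, and the 75% co-residence filter is deliberately left out.

```python
from collections import defaultdict


def match_households(moves):
    """Group students whose moves share origin, destination and year.

    moves: iterable of dicts with hypothetical keys
    'student_id', 'origin_pc', 'dest_pc', 'year'.
    Returns one sorted list of student ids per inferred household
    that contains at least two children.
    """
    groups = defaultdict(list)
    for move in moves:
        key = (move["origin_pc"], move["dest_pc"], move["year"])
        groups[key].append(move["student_id"])
    return [sorted(ids) for ids in groups.values() if len(ids) >= 2]


# Made-up records for illustration.
sample_moves = [
    {"student_id": 1, "origin_pc": "A1A1A1", "dest_pc": "B2B2B2", "year": 1998},
    {"student_id": 2, "origin_pc": "A1A1A1", "dest_pc": "B2B2B2", "year": 1998},
    {"student_id": 3, "origin_pc": "A1A1A1", "dest_pc": "B2B2B2", "year": 1999},
    {"student_id": 4, "origin_pc": "C3C3C3", "dest_pc": "B2B2B2", "year": 1998},
]
households = match_households(sample_moves)
print(households)
```

Students 1 and 2 are grouped because origin, destination and year all coincide; students 3 and 4 each differ on one of the three dimensions and remain unmatched.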

In columns (1) and (2) of Table 1.6, I estimate the main exposure model with origin-by-destination fixed effects on the subsample of siblings. Standard errors are considerably larger than in the main specification since the sample size is much smaller, but the point estimates are in line with the results based on all movers. In columns (3) and (4), I substitute family fixed effects for the origin-by-destination fixed effects to account for any time-invariant heterogeneity across families and still find convergence rates of about 4.5%. These results support the idea that the estimated exposure effects are not driven by differences in unobservable time-invariant family characteristics. However, the estimates reported in Table 1.6 might still be subject to bias if time-changing unobservables affect siblings differentially in proportion to their age gap. I further address robustness to time-varying unobservables in section 1.6.2.2.

62 In the sample with no bandwidth restriction, the median distance to the nearest boundary is 204 meters.

63 Out of the original 100,929 students, this method identifies about 13,000 siblings attached to roughly 6,000 different households.

Balance of Covariates. The second approach tests for balance of covariates to verify that variation in the interaction term is arguably random conditional on age-at-move and origin-by-destination fixed effects. In this section, I run a set of balancing checks by estimating the exposure model using individual characteristics as dependent variables. The balancing test equation takes the form

$$X_{icmod} = \beta_x (m_i \times \Delta y_{od}) + \alpha_{od} + \alpha_m + \alpha_c + \varepsilon_{icmod}.$$

Under the identifying assumption that the degree of sorting to better areas is independent of age-at-move, I should find coefficients of zero on the $m_i \times \Delta y_{od}$ interaction. Pei, Pischke and Schwandt (2017) show that putting the covariates on the left-hand side is a more powerful test than gradually adding or removing these variables from the right-hand side of the main estimating equation, particularly if the individual characteristics are poor measures of the underlying confounders they are meant to account for. For instance, being ‘in difficulty’ is certainly a noisy measure of academic abilities.
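A stripped-down version of this balancing check can be written as a within-cell regression slope. The sketch below is a simplification under stated assumptions: it absorbs only the origin-by-destination cell means and omits the age-at-move and cohort fixed effects of the full equation, and all function and variable names are hypothetical.

```python
from collections import defaultdict


def demean_by_group(values, groups):
    """Subtract each group's mean from its members."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def balance_coefficient(x, age_at_move, delta_y, od_cell):
    """Slope of covariate x on (age_at_move * delta_y) within od cells.

    Simplification: only od-cell fixed effects are absorbed here.
    """
    interaction = [m * d for m, d in zip(age_at_move, delta_y)]
    x_res = demean_by_group(x, od_cell)
    z_res = demean_by_group(interaction, od_cell)
    denom = sum(z * z for z in z_res)
    if denom == 0:
        raise ValueError("no within-cell variation in the interaction term")
    return sum(z * xv for z, xv in zip(z_res, x_res)) / denom


# Made-up data: two od cells with three movers each.
ages = [7, 9, 11, 8, 10, 12]
deltas = [0.1, 0.1, 0.1, -0.2, -0.2, -0.2]
cells = ["a", "a", "a", "b", "b", "b"]

# A covariate that only varies across cells is balanced by construction.
balanced = balance_coefficient([1, 1, 1, 0, 0, 0], ages, deltas, cells)

# A covariate built to load on the interaction fails the check.
loaded_x = [2 * m * d + (5 if c == "a" else -3)
            for m, d, c in zip(ages, deltas, cells)]
unbalanced = balance_coefficient(loaded_x, ages, deltas, cells)
```

A coefficient near zero is what the identifying assumption predicts; the second covariate is constructed so that the slope is clearly non-zero.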

Results of the test are shown in Table A.9. In columns (1) and (2), I use years of education of permanent residents to measure $\Delta y_{od}$. Finishing secondary school on time and university enrollment are used in columns (3) and (4) and columns (5) and (6), respectively. The coefficients on immigrant status are marginally significant at the 5% level for some, but not all, outcomes. In Montreal, immigrants do obtain more post-secondary education than domestic students. It might also be the case that they tend to move later if acquiring information about neighborhoods takes more time for this subgroup, given that they may have less prior information than native-born parents. The coefficients for learning difficulties at baseline reach statistical significance in some cases. For the complete history of learning difficulties prior to moving, all coefficients are effectively zero. Overall, most coefficients in the table are very small and statistically indistinguishable from zero.

1.6.2.2 Robustness: Time-varying characteristics

Another possible source of concern is that length of exposure to a one-unit better area mirrors exposure to different family circumstances. Put differently, one may be worried that if a move is triggered by a change in marital status or income, the age-specific unobserved parental inputs $\theta_{ia}$ may have also changed sharply and in proportion with $m_i$.

Selection on time-varying observables. Unfortunately, my data set includes very few time-varying individual characteristics. For instance, parental income and marital status of parents are not observed. To account for possible changes in family circumstances that coincide with a move, I instead control for differences in census tract characteristics between the origin and the destination, as well as the interaction of these differences with age-at-move:


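$$y_{icmod} = \beta (m_i \times \Delta y_{od}) + \eta_0 \Delta Z_{iod} + \eta_1 (m_i \times \Delta Z_{iod}) + \gamma X_{icmod} + \alpha_{od} + \alpha_m + \alpha_c + \varepsilon_{icmod}$$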

where $\Delta Z_{iod}$ is the difference in census tract characteristics between the areas in which student $i$ resides after and before the move. Because census tracts are considerably smaller than FSAs, these controls vary within origin-by-destination cells and so the main effects $\eta_0$ are identified. The characteristics I consider are median household income, average dwelling value, percentage low income, percentage of adults with some college education, and fraction of lone parent families. Those are all obtained from the 2001 Canadian Census. The inclusion of these variables accounts for changes in family circumstances that are correlated with changes in neighborhood attributes, as well as any sorting on the basis of these observable neighborhood characteristics.64 For example, a positive income shock may be associated with both a move to an area where property value is higher than in the origin and an increase in parental inputs. For any unobserved variable to generate bias in the exposure estimates under this specification, the confounding variable would have to generate variation orthogonal to changes in these neighborhood attributes. The inclusion of these neighborhood attributes likely absorbs part of the causal exposure effect of interest, and therefore over-adjusts for changes in family circumstances.

Results are quite robust to the inclusion of these controls (Table A.10). In columns (1) through (5), I control for changes in one time-varying characteristic at a time. In all of these cases, the exposure effects remain stable around 4 to 4.5%. Among all considered variables, the local fraction of lone-parent families is the one that most affects the main exposure effects. Yet, even in this case, the exposure effects remain large (≈4%). In column (6), I include all controls simultaneously. For each outcome, the exposure effect falls just under 4% and remains strongly statistically significant.

Event-study. Next, I investigate whether students who move to better areas exhibit different trends in learning difficulties prior to moving. The idea is that family circumstances plausibly directly affect the likelihood that a student struggles in school, hence changes in unobserved family inputs should be reflected in the probability of being identified ‘in difficulty’. I leverage year-to-year variation in $Diff_{iod,t}$, the indicator of whether student $i$ was in difficulty in year $t$, to create an index of relative learning difficulties that summarizes the way movers compare to permanent residents in their origin and destination (Finkelstein, Gentzkow and Williams, 2016; Bronnenberg, Dubé and Gentzkow, 2012):

$$\sigma_{od(i,t)} = \frac{Diff_{iod,t} - Diff_{o,t}}{Diff_{d,t} - Diff_{o,t}}$$

where $Diff_{n,t}$ is the fraction of permanent residents of FSA $n$ that were in difficulty at time $t$. Note that this index takes a value of zero if mover $i$’s difficulty status is the same as the average in her origin and a value of one if it is equal to the average in the destination. An increase in $\sigma_{od(i,t)}$ over time indicates that student $i$’s success in school (or lack thereof) converges towards that of permanent residents in the destination relative to the origin.
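The index defined above is a simple rescaling, which a short sketch makes concrete. The function and argument names are hypothetical; returning `None` when origin and destination averages coincide is a choice made here for illustration, since the index is undefined in that case.

```python
def relative_index(mover_value, origin_mean, dest_mean):
    """Position of a mover between the origin (0) and destination (1) averages.

    Returns None when the two averages coincide, since the index
    is undefined in that case (an illustrative convention).
    """
    gap = dest_mean - origin_mean
    if gap == 0:
        return None
    return (mover_value - origin_mean) / gap


# Made-up averages: 10% of permanent residents in difficulty at the
# origin, 30% at the destination.
at_origin = relative_index(0.10, 0.10, 0.30)
at_destination = relative_index(0.30, 0.10, 0.30)
undefined = relative_index(1.0, 0.20, 0.20)
```

Because an individual student's status is a 0/1 indicator, student-level values of the index typically fall outside the unit interval; it is the average over movers that is informative about convergence.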

I investigate patterns in $\sigma_{od(i,t)}$ around the time of moves. For instance, a positive pre-trend would indicate that movers started converging towards permanent residents of the destination before they even moved. Such a pattern could arise if moves to certain areas occur as a result of gradual changes in family circumstances. For example, if divorces are preceded by an erosion of the quality of the parents’ relationship and are disproportionately followed by moves to worse places, my estimates of total exposure effects could be biased.

64 Similarly, Altonji and Mansfield (2014) control for group-level average individual characteristics – arguably the basis on which households sort into neighborhoods – to account for unobserved individual heterogeneity, and thereby obtain a lower bound on contextual effects.

Figure A.17 shows the results of the following event-study analysis

$$\sigma_{od(i,t)} = \sum_{k=-4}^{4} \alpha_k \mathbf{1}\{t = m_i + k\} + \delta_i + \varepsilon_{imodt}$$

where observations are weighted by $(Diff_{dt} - Diff_{ot})^2$ as in Bronnenberg, Dubé and Gentzkow (2012) and $\delta_i$ are student fixed effects.65 For descriptive purposes, I first show in Panel A estimates of raw trends with no student fixed effects. A jump in $\sigma_{od(i,t)}$ occurs on impact, and students’ schooling difficulties then converge gradually towards the destination’s average. Importantly, there is no discernible pre-trend – coefficients are stable prior to moving. These results are consistent with Aaronson (1998), who finds no systematic pattern between pre-move changes in family circumstances and the quality of the destination neighborhood. For instance, moves preceded by a divorce are just as likely to lead to a better than to a worse destination.
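To illustrate the weighting scheme, the sketch below computes weighted averages of the index by year relative to the move. It is a simplified stand-in for the raw-trend estimates, under stated assumptions: it reports levels rather than coefficients relative to omitted years, includes no student fixed effects, and uses a hypothetical tuple layout for the records.

```python
from collections import defaultdict


def weighted_event_means(records, window=4):
    """Weighted mean of the index by event time k = year - move_year.

    records: iterable of hypothetical tuples
    (year, move_year, sigma, diff_dest, diff_origin).
    Each observation is weighted by (diff_dest - diff_origin) ** 2.
    """
    num = defaultdict(float)
    den = defaultdict(float)
    for year, move_year, sigma, diff_dest, diff_origin in records:
        k = year - move_year
        if abs(k) > window:
            continue  # outside the event window
        weight = (diff_dest - diff_origin) ** 2
        num[k] += weight * sigma
        den[k] += weight
    return {k: num[k] / den[k] for k in sorted(den) if den[k] > 0}


# Made-up observations for illustration.
sample = [
    (2000, 2001, 0.0, 0.3, 0.1),  # k = -1, small origin-destination gap
    (2000, 2001, 1.0, 0.5, 0.1),  # k = -1, larger gap, larger weight
    (2001, 2001, 1.0, 0.3, 0.1),  # k = 0
    (1990, 2001, 9.0, 0.3, 0.1),  # k = -11, dropped by the window
]
means = weighted_event_means(sample)
```

The squared-gap weights give more influence to moves between areas whose averages differ substantially, for which the index is less noisy.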

Because students move at different ages, the panel is not balanced. As a result, any pre- and post-move trend may be the result of changes in the sample’s composition. In Panel B, I follow Finkelstein, Gentzkow and Williams (2016) and include student fixed effects to address this issue. While the post-move trend now disappears, the jump at the time of the move remains significant. One concern is that this sudden jump is the product of a sharp and sudden change in family inputs. In Panels C and D, I therefore distinguish between students who did and did not switch schools the year they moved. The evidence suggests that the break is instead the result of a change in the schooling environment possibly driven by differences in schools’ propensity to flag marginal students. For students who did not change schools at the time of the move, there is no jump in $\sigma_{od(i,t)}$. Overall, the event-study plots highlight the absence of pre-trends in schooling difficulties. They also emphasize the potentially important role schools play in the decision to label a student as being in difficulty or not.

1.6.2.3 Heterogeneity

In Table A.11, I explore whether exposure effects vary in magnitude by gender, language of instruction, or whether students are moving to a better or worse place. Columns (1) and (2) estimate the model in equation (1.8) separately for boys and girls, respectively. While the coefficients are almost identical across genders for secondary school completion, girls appear to benefit slightly more from exposure to better areas than boys do in terms of university enrollment and years of education. This result is at odds with many studies that find that boys are more sensitive to their childhood environment (Autor et al., 2016; Chetty et al., 2016), but agrees with the findings in Deming et al. (2014) that girls who win a school choice lottery experience increases in college enrollment but that boys do not. In columns (3) and (4), students in English and French schools appear to benefit equally from moves to better areas, despite the fact that anglophones start from a much higher baseline – in the full sample, students in English schools are 10 percentage points more likely to enroll in university than students in French schools.

65 Observations outside this window remain in the analytic dataset, hence all coefficients $\alpha_k$ are relative to omitted years. Standard errors are clustered at the student-level.

The main specification not only assumes that exposure effects are linear with age-at-move, but also that they are linear and symmetric in $\Delta y_{od}$. I explore the validity of this assumption in columns (5) and (6). Significant exposure effects are found both for students moving to a better area and for those moving to a worse place, but the patterns here are not consistent across outcomes. Moving to a better FSA affects the probability of graduating from secondary school on time, but the associated convergence rate for moves to worse FSAs is not statistically different from zero. In contrast, negative moves appear to influence the propensity to enroll in university more strongly than do positive moves. I cannot, however, reject that the two coefficients are statistically equal.

1.6.3 Robustness of Decomposition

School-switching compliance rate. The model in section 1.3 shows that the magnitude of the full convergence rate depends on movers’ propensity to attend schools similar in quality to those attended by permanent residents. For schools to possibly drive the benefits of moving to a better area, it must be that the compliance rate $E\left[\frac{\psi_{s(d(i))} - \psi_{s(o(i))}}{\psi^{PR}_{d} - \psi^{PR}_{o}}\right]$ is not zero. As a sanity check, I conduct an event-study similar to the one described in subsection 1.6.2.2 to validate that the quality of schools attended by movers shifts towards that of the permanent residents of the destination neighborhood when they move. The index of relative school quality is given by

$$\sigma^{\psi}_{od(i,t)} = \frac{\delta_{s(i,t)} - \delta_{s(o,t)}}{\delta_{s(d,t)} - \delta_{s(o,t)}}$$

where $\delta_{s(i,t)}$ is the quality of the school attended by student $i$ at time $t$ (measured by the fixed effects estimates obtained in subsection 1.4.1), and $\delta_{s(n,t)}$ is the average quality of schools attended by permanent residents of FSA $n$ at time $t$. The corresponding event-study results are shown in Figure A.18. The index increases sharply in value right at the time of the move. While there seems to be a modest spike in the year preceding the move, this bump is very small compared to the break that occurs on impact. The magnitude of the jump implies a compliance rate close to 60%. Given the many school choice options in Montreal, it is not surprising that this rate is not 100%. Yet, this exercise demonstrates that movers do experience a substantial change in school quality as a result of a move.

Accounting for movers’ school attendance. In this section, instead of using measures of school quality from permanent residents, I directly account for movers’ school attendance in the baseline total exposure effect model. I then examine the behavior of the estimated convergence rate as I include fixed effects for schools attended by movers themselves. The estimating equation becomes

$$y_{icmod} = \beta (m_i \times \Delta y_{od}) + \alpha_{s(0)} + \alpha_{s(A)} + \gamma X_{icod} + \alpha_{od} + \alpha_m + \alpha_c + \varepsilon_{icmod}$$

where $\alpha_{s(0)}$ and $\alpha_{s(A)}$ are sets of fixed effects for schools attended at baseline and at age 15, respectively. To account for variation in length of exposure to better schools that may be correlated with neighborhood exposure, the school fixed effects are allowed to vary linearly with age-at-move, i.e. $\alpha_{s(a)} = \alpha^{0}_{s(a)} + \alpha^{1}_{s(a)} \times m_i$, which is equivalent to allowing age-at-move effects to have a different slope in each school. Note that these fixed effects account for differences in school quality as well as any sorting into schools. The estimates therefore put an upper bound on the role of schools.

The results are presented in Table 1.7. Benchmark estimates of total exposure effects are reproduced in column (1). In column (2), fixed effects for schools attended at the beginning (the “origin” school) and at the end (the “destination” school) of the exposure period, as well as interactions with age-at-move, are added. In this case, the annual exposure effects shrink substantially to 1.1% for university enrollment, 1.2% for completing secondary school on time, and 1.8% for years of education. These point estimates are strikingly close to the restricted convergence rates reported in column (2) of Table 1.5 (1.1%, 1.3% and 1.6% respectively), validating that differences in permanent residents’ school inputs ($\Delta\Omega_{od}$) accurately capture the change in school quality experienced by movers when they move across neighborhoods. As a further robustness check, I also interact the school fixed effects with the actual number of years spent in the associated schools instead of age-at-move (column (3)). The results are in line with the main conclusion that schools account for a large fraction of exposure effects, but some residual neighborhood exposure effect still persists above and beyond the contribution of schools.

Measurement error. One possibility is that sampling error affects the estimation of school and neighborhood fixed effects differentially. For instance, it might be that estimation error accounts for a larger fraction of the variance in $\Delta\Omega_{od}$ than of the variance in $\Delta\Lambda_{od}$. To verify that this is not the case, I re-estimate equation (1.9) using estimates of the fixed effects $\Omega_{s(n(i))}$ and $\Lambda_n$ that are shrunk towards zero using empirical Bayes techniques (Chandra et al., 2016; Best, Hjort and Szakonyi, 2017; Kane and Staiger, 2008). The associated decomposition results are shown in Table A.12.
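As a rough illustration of shrinkage towards zero, the sketch below applies the textbook normal-normal empirical Bayes formula, which scales a noisy estimate by its reliability ratio. This is an assumption about the general technique, not a description of the exact procedure used for Table A.12; the function name and inputs are hypothetical.

```python
def shrink_toward_zero(estimate, std_error, signal_variance):
    """Scale a noisy fixed-effect estimate by its reliability.

    reliability = signal variance / (signal variance + sampling variance).
    Noisier estimates (larger std_error) are pulled closer to zero.
    """
    if signal_variance < 0 or std_error < 0:
        raise ValueError("variances and standard errors must be non-negative")
    noise_variance = std_error ** 2
    total = signal_variance + noise_variance
    if total == 0:
        return 0.0
    return estimate * signal_variance / total


# Made-up inputs: the same raw estimate with different precision.
precise = shrink_toward_zero(1.0, std_error=0.0, signal_variance=1.0)
noisy = shrink_toward_zero(1.0, std_error=1.0, signal_variance=1.0)
```

An estimate with no sampling error is left unchanged, while one whose sampling variance equals the signal variance is halved.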

The first row shows the associated total convergence rates, which are calculated using equation (1.10). These rates are slightly larger than the ones presented in Table 1.4, which implies that my main estimates suffer from a small attenuation bias.66 The shares of total exposure effects due to schools reported in this table do not account for the endogeneity of school attendance and should therefore be compared to the corresponding results presented in column (1) of Table 1.5. For all three outcomes, adjusting for measurement error only reinforces the conclusion that school effects account for most of the benefits of moving to a better area, with school shares exceeding 80% for university enrollment and finishing secondary school on time.

1.7 Conclusion

Establishing whether schools drive neighborhood exposure effects is crucial on a policy level to inform the development of community-wide versus in-school intervention programs. Yet, isolating the effects of neighborhoods from those of schools is a difficult task since in most places students are allocated to schools on the basis of residence. This paper overcomes these difficulties by bringing together two research designs in order to isolate the fraction of total long-term exposure effects that is driven by school effects.

66 Note that when using shrunk fixed effects, the total convergence rate estimated directly with equation (1.8) is not exactly equal to that obtained by plugging the coefficients of equation (1.9) into the decomposition equation. This is because contrary to unadjusted estimates, shrunk estimates of $y^{PR}_{n}$ are not equal to the sum of shrunk estimates of $\Omega^{PR}_{n}$ and $\Lambda^{PR}_{n}$. Yet, with empirical Bayes fixed effects, both ways of calculating the total convergence rate produce rates larger in magnitude than those obtained with unadjusted fixed effects.


The first contribution of this paper is to break the mechanical link between the two dimensions by exploiting institutional features of Quebec’s education system. In Montreal, default options influence parents’ decision over which schools their child will attend. Building upon this observation, I find that the quality of the primary and secondary schools children attend has large effects on their educational attainment. More precisely, immediate neighbors living on opposite sides of a French primary school boundary at age 6 exhibit significantly different propensities to enroll in university more than 10 years later.

My second set of results demonstrates that children who move to a better neighborhood at a young age benefit substantially from this change. In particular, I successfully replicate the findings of linear exposure effects of Chetty and Hendren (2018a) using within-city variation and implementing their methods in a different setting, looking at a much smaller scale of geography and examining different outcomes. My estimates suggest that movers’ educational attainment improves linearly with each year spent in a better location at an annual rate of approximately 4.5%.

The main result of the paper is that schools are the main driver of total childhood exposure effects. Decompositions that take into account the endogeneity of school quality indicate that between 50 and 70% of the educational benefits of moving to a better location are due to schools rather than neighborhoods themselves. These findings strongly corroborate earlier conclusions made on the relative importance of schools and neighborhoods (Dobbie and Fryer, 2015; Fryer and Katz, 2013; Oreopoulos, 2012; Gould, Lavy and Paserman, 2004). By showing that spatial inequalities in long-term educational attainment are partly rooted in the quality of schools children attend, the results bear important policy implications. They notably suggest that school reforms or interventions might be more effective than community programs or relocation policies in raising educational attainment.

The decomposition approach developed in this paper also opens up new possibilities for examining mechanisms in other settings. For instance, the idea of partitioning the outcomes of non-movers and using variation from movers to pin down and decompose place effects could be valuable in investigations of the quantitative importance of physicians in driving hospital effects, or of teachers for school effects.

While the magnitude of the estimated exposure effects, and the main conclusion of this paper more broadly, may reflect a social reality unique to Montreal, I believe the results are very informative for other contexts as well. For instance, because of Quebec’s open enrollment policies and the unusual availability of private school options in Montreal, the link between school attendance and residence is relatively loose. Hence, in jurisdictions where schools and neighborhoods are tightly linked, one may expect schools to contribute even more to spatial inequalities in educational attainment than the results in this paper suggest. I leave for future research the question of whether this conclusion extends to other socio-economic outcomes such as earnings and criminal behavior.


1.8 Tables and Figures

Figure 1.1: Quebec’s Education System

| Age | Québec | Rest of Canada |
|---|---|---|
| 5 | Kindergarten | Kindergarten |
| 6 | Grade 1 | Grade 1 |
| 7 | Grade 2 | Grade 2 |
| 8 | Grade 3 | Grade 3 |
| 9 | Grade 4 | Grade 4 |
| 10 | Grade 5 | Grade 5 |
| 11 | Grade 6 | Grade 6 |
| 12 | Grade 7 / Secondaire 1 | Grade 7 |
| 13 | Grade 8 / Secondaire 2 | Grade 8 |
| 14 | Grade 9 / Secondaire 3 | Grade 9 |
| 15 | Grade 10 / Secondaire 4 | Grade 10 |
| 16 | Grade 11 / Secondaire 5 | Grade 11 |
| 17 | Pre-university 1 | Grade 12 |
| 18 | Pre-university 2 | College 1 |
| 19 | Year 1 | College 2 |
| 20 | Year 2 | ... |
| 21 | Year 3 | ... |

[Diagram labels. Québec: primary school, secondary school, college (pre-university, or technical studies 1 to 3), university (Years 1 to 3). Rest of Canada: elementary school, high school, college, university (Years 1 to 4).]


Figure 1.2: Spatial variation in educational outcomes

Panel A: Fraction ever enrolled in university
[Map legend: (0.635,0.803], (0.598,0.635], (0.526,0.598], (0.463,0.526], (0.426,0.463], (0.391,0.426], (0.323,0.391], (0.268,0.323], [0.153,0.268], No data]

Panel B: Fraction graduating secondary school on time
[Map legend: (0.872,0.920], (0.833,0.872], (0.764,0.833], (0.721,0.764], (0.697,0.721], (0.660,0.697], (0.587,0.660], (0.499,0.587], [0.316,0.499], No data]

Notes: Statistics based on permanent residents. Outcomes are adjusted for cohort effects. Data for FSAs with fewer than 10 permanent residents are not shown (no data).


Figure 1.3: Discontinuities in school quality at French primary school boundaries

All permanent residents
Panel A: Quality $\delta^{P}_{s(i)}$ of assigned French school
Panel B: Quality $\delta^{P}_{s(i)}$ of school attended

Students in English schools only (Placebo)
Panel C: Quality $\delta^{P}_{s(i)}$ of assigned French school
Panel D: Quality $\delta^{P}_{s(i)}$ of school attended

[Plots. Vertical axes: quality of assigned school (Panels A and C) or quality of school attended (Panels B and D), measured as the fraction enrolled in university. Horizontal axes: relative distance to boundary (meters), from -500 to 500.]

Notes: For each French primary school boundary, the neighborhood school with greater school quality (in terms of university enrollment) is assigned to the right. The figure shows the average school quality of schools attended by students at baseline, by distance to the boundary. Attendance recorded at baseline (grade 1). In Panels A and B, the sample includes all permanent residents, and in Panels C and D it consists of students enrolled in English schools only. For visual clarity, students living further than 500 meters away from their nearest boundary are excluded.


Figure 1.4: Regression-discontinuity: First-stage and Reduced-form relationships

All permanent residents
Panel A: Total childhood school quality $\Omega^{-i}_{s(n(i))}$ (First-stage)
Panel B: University enrollment (Reduced-form)

Students in English schools only (Placebo)
Panel C: Total childhood school quality $\Omega^{-i}_{s(n(i))}$ (First-stage)
Panel D: University enrollment (Reduced-form)

[Plots. Vertical axes: childhood school quality, measured as the fraction enrolled in university (Panels A and C), or fraction enrolled in university (Panels B and D). Horizontal axes: relative distance to boundary (meters), from -500 to 500.]

Notes: For each French primary school boundary, the neighborhood school with greater school quality (in terms of university enrollment) is assigned to the right. The figure shows the average school quality of schools attended by students at baseline, by distance to the boundary. Attendance recorded at baseline (grade 1). In Panels A and B, the sample includes all permanent residents, and in Panels C and D it consists of students enrolled in English schools only. For visual clarity, students living further than 500 meters away from their nearest boundary are excluded.


Figure 1.5: Non-parametric total exposure effects

Panel A: University enrollment
Panel B: DES in 5 Years
Panel C: Years of education

[Plots. Vertical axes: coefficient on difference in predicted outcomes. Horizontal axes: age at move, from 7 to 15.]

Notes: Sample includes all movers who remained within Montreal. Observations in FSAs with fewer than 10 permanent residents are omitted. Coefficients shown are obtained by regressing

$$y_{icmod} = \sum_{m=7}^{15} \beta_m \left( \Delta y_{od} \times \mathbb{1}\{m_i = m\} \right) + \gamma X_{icmod} + \alpha_{od} + \alpha_m + \alpha_c + \varepsilon_{icmod}$$

Standard errors are clustered at the destination level.


Figure 1.6: Non-parametric restricted exposure effects – No school effect

Panel A: University enrollment
Panel B: DES in 5 Years
Panel C: Years of education

[Plots. Vertical axes: coefficient on difference in predicted outcomes. Horizontal axes: age at move, from 7 to 15. Series in each panel: Total Exposure Effects; No school effect.]

Notes: Sample includes all movers who remained within Montreal. Observations in FSAs with fewer than 10 permanent residents are omitted. Coefficients in red correspond to age-specific restricted coefficients for which the school channel is shut down ($\beta^{-s}_{m}$). Standard errors are clustered at the destination level and calculated by the delta method.


Table 1.1: Descriptive Statistics

| Variables | (1) All students, mean [s.d.] | (2) Permanent residents, mean [s.d.] | (3) Movers within Montreal, mean [s.d.] | (4) Movers who left Montreal, mean [s.d.] | (5) Difference between (2) and (3), coef. [s.e.] |
|---|---|---|---|---|---|
| Female | 0.49 [0.500] | 0.49 [0.500] | 0.49 [0.500] | 0.49 [0.500] | 0.001 [0.004] |
| Age on September 30 | 6.02 [0.376] | 6.01 [0.329] | 6.04 [0.455] | 6.02 [0.330] | -0.033*** [0.003] |
| Mother tongue: French | 0.49 [0.500] | 0.47 [0.499] | 0.45 [0.497] | 0.65 [0.478] | 0.026*** [0.004] |
| Mother tongue: English | 0.21 [0.407] | 0.26 [0.439] | 0.20 [0.397] | 0.09 [0.291] | 0.063*** [0.003] |
| Mother tongue: Other | 0.30 [0.457] | 0.27 [0.444] | 0.36 [0.479] | 0.26 [0.439] | -0.090*** [0.003] |
| Language at home: French | 0.54 [0.499] | 0.50 [0.500] | 0.50 [0.500] | 0.69 [0.460] | 0.006* [0.004] |
| Language at home: English | 0.26 [0.437] | 0.32 [0.466] | 0.24 [0.427] | 0.12 [0.326] | 0.077*** [0.003] |
| Language at home: Other | 0.21 [0.405] | 0.18 [0.384] | 0.26 [0.438] | 0.18 [0.387] | -0.083*** [0.003] |
| Immigrant | 0.10 [0.296] | 0.07 [0.247] | 0.14 [0.347] | 0.11 [0.308] | -0.080*** [0.002] |
| Language at school: French | 0.75 [0.433] | 0.69 [0.461] | 0.77 [0.423] | 0.88 [0.324] | -0.073*** [0.003] |
| Uses School Day Care (baseline) | 0.25 [0.432] | 0.24 [0.428] | 0.24 [0.426] | 0.29 [0.453] | 0.005 [0.003] |
| In difficulty (baseline) | 0.04 [0.193] | 0.03 [0.170] | 0.05 [0.209] | 0.05 [0.219] | -0.016*** [0.001] |
| Handicapped (baseline) | 0.01 [0.118] | 0.01 [0.116] | 0.02 [0.126] | 0.01 [0.111] | -0.003*** [0.001] |
| Ever in difficulty by age 15 | 0.31 [0.462] | 0.25 [0.431] | 0.37 [0.482] | 0.37 [0.483] | -0.116*** [0.003] |
| Students | 92,764 | 44,912 | 31,526 | 16,326 | 76,438 |

Notes: The main sample excludes students who left Quebec’s system before turning 16. Permanent residents are defined as students who always resided in the same FSA until the age of 15. Movers within Montreal are those who moved across FSAs at least once and were still living on the Island of Montreal at age 15. Movers who left Montreal were residing in the province of Quebec but outside the Island of Montreal at age 15.


Table 1.2: Variation Across Neighborhoods and Schools

| Outcome | University enrollment (1) | University enrollment (2) | DES in 5 years (3) | DES in 5 years (4) | Years of education (5) | Years of education (6) |
|---|---|---|---|---|---|---|
| Student-level standard deviation of fixed effects: | | | | | | |
| Schools | 0.270 | 0.264 | 0.249 | 0.235 | 1.207 | 1.141 |
| Neighborhoods (FSAs) | 0.138 | 0.046 | 0.139 | 0.062 | 0.680 | 0.258 |
| Fixed effects estimated: | | | | | | |
| Separately | x | | x | | x | |
| Simultaneously | | x | | x | | x |

Dependent variable summary statistics, mean [standard deviation]: University enrollment 0.443 [0.497]; DES in 5 years 0.706 [0.456]; Years of education 13.228 [2.113].

Number of students: 44,912. Number of primary schools: 440. Number of secondary schools: 95. Number of neighborhoods: 218.

Notes: Sample restricted to permanent residents. School fixed effects are measured by the sum of a primary and a secondary school fixed effect. In columns (1), (3) and (5), school and neighborhood effects are respectively estimated in separate regressions. In columns (2), (4) and (6), all fixed effects are estimated simultaneously from equation (1.5).


Table 1.3: School effects: Regression-discontinuity estimates

| Measure of educational attainment | (1) First-stage: quality of assigned school at baseline ($\delta^{P}_{s(i)}$) | (2) First-stage: quality of school attended at baseline ($\delta^{P}_{s(i)}$) | (3) First-stage: childhood average school quality ($\Omega^{-i}_{s(n(i))}$) | (4) Reduced-form: outcome | (5) RD-IV: outcome |
|---|---|---|---|---|---|
| All permanent residents | | | | | |
| University enrollment | 0.0631*** (0.0032) | 0.0245*** (0.0027) | 0.0328*** (0.0065) | 0.0279*** (0.0087) | 0.8542*** (0.1645) |
| Secondary school diploma in 5 years | 0.0715*** (0.0037) | 0.0297*** (0.0025) | 0.0337*** (0.0061) | 0.0347*** (0.0084) | 1.0340*** (0.1618) |
| Years of schooling | 0.2933*** (0.0148) | 0.1157*** (0.0120) | 0.1511*** (0.0298) | 0.1165*** (0.0390) | 0.7739*** (0.1575) |
| N | 43296 | 43279 | 43291 | 43296 | 43291 |
| Placebo: Students in English schools | | | | | |
| University enrollment | 0.0632*** (0.0044) | -0.0017 (0.0041) | -0.0012 (0.0099) | -0.0098 (0.0157) | 7.9509 (58.0234) |
| Secondary school diploma in 5 years | 0.0722*** (0.0059) | 0.0031 (0.0026) | 0.0008 (0.0071) | -0.0081 (0.0116) | -9.3699 (88.2925) |
| Years of schooling | 0.2836*** (0.0204) | 0.0052 (0.0156) | 0.0104 (0.0433) | -0.0448 (0.0655) | -4.2566 (22.8624) |
| N | 13446 | 13444 | 13444 | 13446 | 13444 |
| Cohort fixed effects | x | x | x | x | x |
| Individual characteristics | x | x | x | x | x |
| Neighborhood (FSA) fixed effects | x | x | x | x | x |
| Boundary fixed effects | x | x | x | x | x |

Notes: This table reports RD estimates. In columns (1) and (2), primary school quality is measured using the fixed effects, $\delta^{P}_{s(i)}$, estimated in section 1.4.1. In column (3), the dependent variable is childhood average school quality $\Omega^{-i}_{s(n(i))}$. Column (5) reports 2SLS estimates of equations (1.6) and (1.7). In all specifications, the control function for distance to boundary is linear and allows for different slopes on either side of the threshold. In the first three rows, the sample includes all permanent residents. In the last three rows, only permanent residents enrolled in English schools are included. All standard errors are clustered at the French primary school boundary level.
*** p<0.01, ** p<0.05, * p<0.1
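
As an illustrative check on how the columns of Table 1.3 relate to one another: with a single instrument, an IV estimate equals the ratio of the reduced-form coefficient to the first-stage coefficient. The short Python sketch below applies that ratio to the reported coefficients for all permanent residents, taking column (3) as the first stage and column (4) as the reduced form. The function and variable names are hypothetical, and the ratios should only approximate column (5), since the reported coefficients are rounded and the estimation samples differ slightly across columns.

```python
# Illustrative sketch (not the author's code): ratio of reduced-form to
# first-stage coefficients, using values reported in Table 1.3
# (all permanent residents). All names here are hypothetical.

def wald_ratio(reduced_form: float, first_stage: float) -> float:
    """Return the reduced-form coefficient divided by the first-stage one."""
    return reduced_form / first_stage

# outcome: (first stage, col. 3; reduced form, col. 4; reported RD-IV, col. 5)
table_1_3 = {
    "University enrollment": (0.0328, 0.0279, 0.8542),
    "Secondary school diploma in 5 years": (0.0337, 0.0347, 1.0340),
    "Years of schooling": (0.1511, 0.1165, 0.7739),
}

ratios = {
    outcome: wald_ratio(rf, fs)
    for outcome, (fs, rf, _reported) in table_1_3.items()
}

for outcome, (_fs, _rf, reported) in table_1_3.items():
    print(f"{outcome}: ratio = {ratios[outcome]:.3f}, reported = {reported:.3f}")
```

The computed ratios land close to, but not exactly on, the reported RD-IV estimates, which is what one would expect from rounded inputs.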


Table 1.4: Main Results: Exposure Effects

| Measure of educational attainment | All movers (1) | All movers (2) | One-time movers (3) | One-time movers (4) |
|---|---|---|---|---|
| University enrollment | -0.0424*** (0.0090) | -0.0412*** (0.0092) | -0.0416*** (0.0116) | -0.0408*** (0.0115) |
| Secondary school diploma in 5 years | -0.0421*** (0.0088) | -0.0402*** (0.0088) | -0.0506*** (0.0117) | -0.0502*** (0.0117) |
| Years of schooling | -0.0488*** (0.0088) | -0.0471*** (0.0094) | -0.0444*** (0.0103) | -0.0435*** (0.0102) |
| Cohort fixed effects | x | x | x | x |
| Individual characteristics | x | x | x | x |
| Age at move fixed effects | x | x | x | x |
| Origin-by-destination fixed effects | x | x | x | x |
| Only moved once | | | x | x |
| Times in difficulty before moving | | x | | x |
| N | 25993 | 25993 | 16949 | 16949 |

Notes: Coefficients shown in the table are convergence rates β. Individual characteristics include gender, immigrant status, allophone status, born in Canada but outside Quebec, English spoken at home, day care use at baseline, ’in difficulty’ status at baseline, handicapped status. In columns (2) and (4), the model includes a set of dummies for each possible value of number of times in difficulty prior to moving. Standard errors are clustered at the destination neighborhood level.
*** p<0.01, ** p<0.05, * p<0.1


Table 1.5: Decomposing Exposure Effects

| | (1) | (2) | (3) |
|---|---|---|---|
| University enrollment | | | |
| Total exposure effects: $\beta$ | -0.0424*** (0.0090) | -0.0424*** (0.0090) | -0.0424*** (0.0090) |
| Restricted convergence rate: $\beta^{-s}$ (No school effects) | -0.0105** (0.0042) | -0.0106** (0.0043) | -0.0148** (0.0059) |
| Restricted convergence rate: $\beta^{-n}$ (No neighborhood effects) | -0.0318*** (0.0072) | -0.0318*** (0.0071) | -0.0276*** (0.0069) |
| Share school effects | 75% (0.0789) | 75% (0.0785) | 65% (0.1093) |
| Secondary school diploma in 5 years | | | |
| Total exposure effects: $\beta$ | -0.0421*** (0.0088) | -0.0421*** (0.0088) | -0.0421*** (0.0088) |
| Restricted convergence rate: $\beta^{-s}$ (No school effects) | -0.0109*** (0.0037) | -0.0129*** (0.0042) | -0.0111*** (0.0036) |
| Restricted convergence rate: $\beta^{-n}$ (No neighborhood effects) | -0.0309*** (0.0085) | -0.0289*** (0.0082) | -0.0307*** (0.0082) |
| Share school effects | 74% (0.0879) | 69% (0.0965) | 73% (0.0831) |
| Years of education | | | |
| Total exposure effects: $\beta$ | -0.0488*** (0.0088) | -0.0488*** (0.0088) | -0.0488*** (0.0088) |
| Restricted convergence rate: $\beta^{-s}$ (No school effects) | -0.0147*** (0.0043) | -0.0160*** (0.0045) | -0.0265*** (0.0075) |
| Restricted convergence rate: $\beta^{-n}$ (No neighborhood effects) | -0.0340*** (0.0081) | -0.0328*** (0.0078) | -0.0223*** (0.0085) |
| Share school effects | 70% (0.0839) | 67% (0.0834) | 46% (0.1386) |
| Measure of school quality | $\pi\Omega_{s(n)}$ | $\pi\Omega^{-i}_{s(n)}$ | $\pi\Omega^{-i}_{s(n)}$ |
| $\pi$ | 1 | 1 | RD estimate |

Notes: Sample restricted to movers within Montreal. Standard errors are clustered at the destination FSA level, and obtained by the delta method for restricted convergence rates. Restricted convergence rates are calculated using equation (1.10). $\beta^{-s}$ is a restricted rate for which $\beta_s = 0$, and $\beta^{-n}$ is a restricted rate for which $\beta_n = 0$. Share school effects is given by the ratio $\frac{\beta - \beta^{-s}}{\beta}$.
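
To make the share calculation concrete, the following Python sketch recomputes the share of school effects, defined as the total convergence rate minus the restricted rate with no school effects, divided by the total convergence rate, from the university enrollment coefficients reported in Table 1.5. It is a minimal arithmetic illustration with hypothetical function and variable names, not the author's code, and it works from rounded coefficients.

```python
# Illustrative sketch (not the author's code): share of total exposure
# effects attributed to schools, (beta - beta_no_school) / beta, using the
# university enrollment rows of Table 1.5. All names here are hypothetical.

def share_school_effects(beta: float, beta_no_school: float) -> float:
    """Fraction of the total convergence rate that disappears when the
    school channel is shut down."""
    return (beta - beta_no_school) / beta

beta_total = -0.0424  # total exposure effect, identical across columns

# Restricted convergence rates with no school effects, by table column
beta_no_school_by_column = {1: -0.0105, 2: -0.0106, 3: -0.0148}

shares = {
    column: share_school_effects(beta_total, beta_no_school)
    for column, beta_no_school in beta_no_school_by_column.items()
}

for column, share in shares.items():
    print(f"Column ({column}): share of school effects = {share:.1%}")
```

Rounded to the nearest percentage point, these shares match the 75%, 75% and 65% reported for university enrollment.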


Table 1.6: Exposure Effects: Siblings Subsample

Sample: Siblings only

| Measure of educational attainment | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| University enrollment | -0.0453 (0.0365) | -0.0571 (0.0359) | -0.0478* (0.0275) | -0.0504* (0.0274) |
| Secondary school diploma in 5 years | -0.0242 (0.0378) | -0.0434 (0.0365) | -0.0392 (0.0301) | -0.0500* (0.0300) |
| Years of schooling | -0.0453 (0.0311) | -0.0629** (0.0299) | -0.0444* (0.0236) | -0.0486** (0.0233) |
| Cohort fixed effects | x | x | x | x |
| Individual characteristics | x | x | x | x |
| Age at move fixed effects | x | x | x | x |
| Times in difficulty before moving | | x | | x |
| N | 3674 | 3674 | 3674 | 3674 |

Origin-by-destination fixed effects x x

Household fixed effects x x

Notes: Restricted to households in which siblings lived at the same address for at least 75% of the observed years. In columns (2) and (4), the model includes a set of dummies for each possible value of number of times in difficulty prior to moving. Standard errors are clustered at the household level.
*** p<0.01, ** p<0.05, * p<0.1


Table 1.7: Total Exposure Effects Net of Movers’ School Attendance

| | (1) | (2) | (3) |
|---|---|---|---|
| University enrollment | -0.0424*** (0.0090) | -0.0111 (0.0102) | -0.0121 (0.0089) |
| Share school effects | | 74% | 71% |
| Secondary school diploma in 5 years | -0.0421*** (0.0088) | -0.0117 (0.0093) | -0.0071 (0.0081) |
| Share school effects | | 72% | 83% |
| Years of education | -0.0488*** (0.0088) | -0.0178** (0.0079) | -0.0180** (0.0075) |
| Share school effects | | 64% | 63% |
| Cohort fixed effects | x | x | x |
| Individual characteristics | x | x | x |
| Age at move fixed effects | x | x | x |
| Origin-by-destination fixed effects | x | x | x |
| School fixed effects: | | | |
| (o) School at baseline | | x | x |
| (o) School at baseline * age-at-move (linear) | | x | |
| (o) School at baseline * years-exposure | | | x |
| (d) School at age 15 | | x | x |
| (d) School at age 15 * age-at-move (linear) | | x | |
| (d) School at age 15 * years-exposure | | | x |
| N | 25993 | 25984 | 25984 |

Notes: Primary school fixed effects are based on school attendance at baseline. Secondary school fixed effects are based on school attendance at age 15. In columns (2) and (3), school fixed effects are linearly interacted with age-at-move and years of exposure, respectively. Standard errors are clustered at the destination neighborhood level.
*** p<0.01, ** p<0.05, * p<0.1


1.9 Appendix Tables and Figures

Figure A.1: Catchment Areas and Census Tracts in Eastern Montreal (2001)

Sources: Esri, HERE, DeLorme, USGS, Intermap, increment P Corp., NRCAN, Esri Japan, METI, Esri China (Hong Kong), Esri (Thailand), MapmyIndia,© OpenStreetMap contributors, and the GIS User Community

Notes: Colored areas indicate French primary school catchment areas as of 2001. Red lines denote census tracts.

Figure A.2: Educational attainment, by number of times in difficulty

Left panel: Fraction finished secondary school in 5 years, by number of times in difficulty. Average across all students ever in difficulty is .199. Cohorts 1995 and 1996 only.

Right panel: Fraction with bachelor degree or more, by number of times in difficulty. Average across all students ever in difficulty is .048. Cohorts 1995 and 1996 only.

[Plots. Vertical axes: fraction finished secondary school in 5 years; fraction with bachelor degree or more. Horizontal axes: number of times in difficulty, from 0 to 11.]

Notes: Sample restricted to students from the 1995 and 1996 cohorts. Too few students of the later cohorts have completed a bachelor degree by 2014-2015 to analyze this outcome for these students.


Figure A.3: Discontinuity in School Attendance At Boundaries

[Plot: Fraction attending primary school to the right of a boundary. Vertical axis: fraction. Horizontal axis: relative distance to boundary (meters), from -500 to 500.]

Notes: For each French primary school boundary, one neighborhood school is randomly assigned to the right. The figure shows the fraction of students enrolled in that school, by distance to the boundary. Students at positive distances are assigned the randomly chosen default school. Students at negative distances are assigned to a school other than the one to the right. Attendance recorded at baseline (grade 1). Sample is restricted to students in French schools.


Figure A.4: Distribution of school choice across FSAs

[Histogram. Vertical axis: frequency. Horizontal axis: number of schools, from 0 to 150.]

Notes: The histogram shows the distribution of FSAs by number of different primary schools attended by their residents. School attendance measured at baseline (i.e. first enrollment in grade 1).


Figure A.5: Fraction in private schools, by grade and language of instruction

[Plot. Vertical axis: fraction in private school. Horizontal axis: grades 1 to 11, shown separately for French and English schools. For each grade, only the first time observed in that grade is counted.]

Notes: Statistics calculated over main analytical sample of 92,764 students. Data shown separately for students in French and English schools.


Figure A.6: Spatial variation in educational outcomes - Census tract level

Panel A: Fraction ever enrolled in university
[Map legend: (0.635,0.865], (0.558,0.635], (0.484,0.558], (0.434,0.484], (0.393,0.434], (0.341,0.393], (0.293,0.341], (0.214,0.293], [0.004,0.214], No data]

Panel B: Fraction graduating secondary school on time
[Map legend: (0.879,0.954], (0.811,0.879], (0.768,0.811], (0.728,0.768], (0.684,0.728], (0.625,0.684], (0.559,0.625], (0.455,0.559], [0.187,0.455], No data]

Notes: Statistics based on permanent residents (students who always resided in the same census tract). Outcomes are adjusted for cohort effects. Data for census tracts with fewer than 10 permanent residents are not shown (no data).


Figure A.7: Mean Years of Education (Residuals), by School and FSA Deciles

[Plot. Axes: school fixed effect decile (1 to 10), FSA fixed effect decile (1 to 10), and average residuals (years of education), from -0.4 to 0.4.]

Notes: Residuals extracted from the estimation of a two-way fixed effect model, and correspond to the estimates reported in column (6) of Table 1.2. The figure is constructed by slicing the distributions of school and FSA fixed effects into deciles, and then calculating the average residuals in each school-by-neighborhood decile cell.


Figure A.8: Spatial variation in $\Omega^{PR}_{n}$ and $\Lambda^{PR}_{n}$, for University Enrollment

Panel A: School variation ($\Omega^{PR}_{n}$)
[Map legend: (0.585,0.723], (0.544,0.585], (0.499,0.544], (0.475,0.499], (0.438,0.475], (0.408,0.438], (0.380,0.408], (0.309,0.380], [0.168,0.309], No data]

Panel B: Neighborhood variation ($\Lambda^{PR}_{n}$)
[Map legend: (0.523,0.655], (0.492,0.523], (0.473,0.492], (0.455,0.473], (0.433,0.455], (0.421,0.433], (0.399,0.421], (0.383,0.399], [0.216,0.383], No data]

Notes: Statistics based on permanent residents. Outcomes are adjusted for cohort effects. To ease the interpretation, the student-level fixed effects used to compute $\Omega^{PR}_{n}$ and $\Lambda^{PR}_{n}$ were first re-centered around the unconditional university enrollment rate for the full sample. Data for FSAs with fewer than 10 permanent residents are not shown.


Figure A.9: Density plot around French primary school boundaries

[Density plot. Horizontal axis: distance relative to the nearest boundary (meters), from -5000 to 5000.]

Notes: Figure produced with the Stata package DCdensity.ado, which implements the test derived in McCrary (2008). The x-axis shows distance relative to the nearest boundary, in meters.


Figure A.10: Balance of Covariates at Boundaries - School quality in terms of university enrollment

Panels:
(a) Age
(b) Gender
(c) Speaks English at home
(d) Speaks neither English nor French at home
(e) Immigrant
(f) Attend school in English
(g) Learning difficulties at baseline
(h) Handicapped at baseline
(i) Day Care Use at baseline
(j) Attend default school at baseline
(k) Attrition: Left Montreal, Stayed in Quebec
(l) Attrition: Left the province

[Plots. Vertical axes: mean or fraction of the covariate named in each panel. Horizontal axes: relative distance to boundary (meters), from -500 to 500.]

Notes: In panels (a) to (j), the sample is restricted to permanent residents. In panels (k) and (l), there is no sample restriction, hence all students in the database are included. For each boundary, students assigned the default school with the highest fixed effect $\delta^{P}_{s(i)}$ (measured in units of university enrollment) are at positive distances. Variables are first residualized on boundary and FSA fixed effects. Standard errors are clustered at the boundary level.


Figure A.11: Balance of Covariates at Boundaries - School quality in terms of DES in 5 years

[Figure: twelve panels, each plotted against relative distance to boundary (meters, from -500 to 500). Panels: (a) Age (mean age on September 30); (b) Gender (fraction female); (c) Speaks English at home; (d) Speaks neither English nor French at home; (e) Immigrant; (f) Attend school in English; (g) Learning difficulties at baseline; (h) Handicapped at baseline; (i) Day Care Use at baseline; (j) Attend default school at baseline; (k) Attrition: Left Montreal, stayed in Quebec; (l) Attrition: Left the province.]

Notes: In panels (a) to (j), the sample is restricted to permanent residents. In panels (k) and (l), there is no sample restriction, hence all students in the database are included. For each boundary, students assigned the default school with the highest fixed effect $\delta^P_{s(i)}$ (measured in units of timely secondary school graduation) are at positive distances. Variables are first residualized on boundary and FSA fixed effects. Standard errors are clustered at the boundary level.


Figure A.12: Balance of Covariates at Boundaries - School quality in terms of years of education

[Figure: twelve panels, each plotted against relative distance to boundary (meters, from -500 to 500). Panels: (a) Age (mean age on September 30); (b) Gender (fraction female); (c) Speaks English at home; (d) Speaks neither English nor French at home; (e) Immigrant; (f) Attend school in English; (g) Learning difficulties at baseline; (h) Handicapped at baseline; (i) Day Care Use at baseline; (j) Attend default school at baseline; (k) Attrition: Left Montreal, stayed in Quebec; (l) Attrition: Left the province.]

Notes: In panels (a) to (j), the sample is restricted to permanent residents. In panels (k) and (l), there is no sample restriction, hence all students in the database are included. For each boundary, students assigned the default school with the highest fixed effect $\delta^P_{s(i)}$ (measured in units of university enrollment) are at positive distances. Variables are first residualized on boundary and FSA fixed effects. Standard errors are clustered at the boundary level.


Figure A.13: Distribution of ∆yod by age-at-move

[Figure: kernel density curves, one for each age at move from 7 to 15. x-axis: origin-destination difference in predicted years of education (from -4 to 4); y-axis: density. Kolmogorov-Smirnov test of equal distributions (ages 7-11 vs. 12-15): p-value = .225.]

Notes: The kernel density of the distribution of $\Delta y_{od}$ (in years of education) is plotted separately for each possible value of age-at-move.
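The two-sample Kolmogorov-Smirnov comparison reported in the figure can be illustrated with a minimal sketch. The function name, the sample values and the use of the asymptotic p-value are assumptions made for illustration; the thesis does not state which implementation produced the reported p-value.

```python
import math
from bisect import bisect_right


def ks_two_sample(x, y):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov test.

    D is the largest gap between the two empirical CDFs. The p-value uses
    the asymptotic Kolmogorov distribution, so it is approximate in small
    samples.
    """
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    # The gap between the empirical CDFs can only change at observed values.
    d = max(abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
            for v in xs + ys)
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        # The survival function is numerically 1 in this range.
        return d, 1.0
    # Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) * exp(-2 k^2 lam^2)
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(max(p, 0.0), 1.0)


if __name__ == "__main__":
    # Hypothetical values of the origin-destination difference for two
    # groups of movers (early vs. late age at move); not the thesis data.
    early_movers = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]
    late_movers = [-1.0, -0.5, 0.1, 0.2, 1.1, 1.4]
    print(ks_two_sample(early_movers, late_movers))
```

A large p-value, as in the figure, means the test does not reject that the two groups of movers face the same distribution of origin-destination differences.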


Figure A.14: Non-parametric restricted exposure effects – No neighborhood effect

[Figure: three panels, each plotting the coefficient on the difference in predicted outcomes (y-axis) against age at move, from 7 to 15 (x-axis), for two series: total exposure effects and no neighborhood effect. Panel A: University enrollment. Panel B: DES in 5 Years. Panel C: Years of education.]

Notes: Sample includes all movers who remained within Montreal. Observations in FSAs with fewer than 10 permanent residents are omitted. Coefficients in red correspond to age-specific restricted coefficients for which the neighborhood channel is shut down ($\beta^{-n}_m$). Standard errors are clustered at the destination level and calculated by the delta method.


Figure A.15: Bandwidth sensitivity of RD estimates

[Figure: six panels in three columns (University enrollment, DES in 5 Years, Years of education), each plotted against bandwidth values from 250 to 1000. Panels (a) to (c): First-stage and Reduced-form (y-axis: estimated coefficients). Panels (d) to (f): RD-IV (π) (y-axis: estimated RD-IV coefficient), with baseline estimates of .854, 1.034 and .774, respectively.]

Notes: Panels (a) to (c) show first-stage and reduced-form RD coefficients for different bandwidth values. The dashed lines represent 95% confidence intervals with standard errors clustered at the boundary level. Panels (d) to (f) show the associated RD-IV coefficients. The horizontal line shows the baseline RD-IV estimate under no bandwidth restriction.


Figure A.16: Bandwidth sensitivity of share school effects (decomposition)

[Figure: three panels, each plotting the share of school effects (y-axis) against bandwidth in meters, from 250 to 1000 (x-axis). Panel A: University enrollment (benchmark with no restriction: .65). Panel B: DES in 5 Years (benchmark with no restriction: .73). Panel C: Years of education (benchmark with no restriction: .46).]

Notes: For each panel, the red line shows the share of total exposure effects due to schools for different RD bandwidth values. The dashed lines represent 95% confidence intervals with standard errors calculated by the delta method. The horizontal line shows the baseline estimate of the school share under no bandwidth restriction.


Figure A.17: Index of relative learning difficulties, by years relative to move

[Figure: four panels, each plotting the index of relative learning difficulties (y-axis) against year relative to move, from -4 to 4 (x-axis). Panel A: No student fixed effects. Panel B: With student fixed effects. Panel C: With student fixed effects, school switchers. Panel D: With student fixed effects, non-switchers.]

Notes: Standard errors are clustered at the individual level. The y-axis shows regression coefficients on $\sigma_{od(i,t)}$. Observations outside the event window are included in the regression, so all coefficients are relative to omitted relative-time periods. Panel C includes only students who switched school the year they moved. Panel D includes movers who did not switch school the year they moved.


Figure A.18: Index of relative school quality, by years relative to move

[Figure: six panels, each plotting the index of relative school quality (y-axis) against year relative to move, from -4 to 4 (x-axis). University enrollment: Panel A (no student fixed effects) and Panel B (with student fixed effects). DES in 5 Years: Panel C (no student fixed effects) and Panel D (with student fixed effects). Years of education: Panel E (no student fixed effects) and Panel F (with student fixed effects).]

Notes: Standard errors are clustered at the individual level. The y-axis shows regression coefficients on $\sigma^{\psi}_{od(i,t)} = \frac{\delta_{s(i,t)} - \delta_{s(o,t)}}{\delta_{s(d,t)} - \delta_{s(o,t)}}$. For each period $t$, $\delta_{s(n,t)}$ is measured by the relevant average primary school fixed effects if student $i$ was in primary school in that year. Secondary school fixed effects are used for remaining years. Observations outside the event window are included in the regression, so all coefficients are relative to omitted relative-time periods.
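The index in the notes above can be computed as in the following minimal sketch. The function and argument names are hypothetical, and returning `None` when origin and destination quality coincide is an illustrative choice, not something the thesis specifies.

```python
def relative_quality_index(delta_attended, delta_origin, delta_destination):
    """Index of relative school quality for one student-year.

    Computes (delta_s(i,t) - delta_s(o,t)) / (delta_s(d,t) - delta_s(o,t)):
    the quality of the school attended, expressed relative to the gap between
    average school quality at destination and at origin.
    """
    gap = delta_destination - delta_origin
    if gap == 0:
        # The index is undefined when origin and destination quality coincide.
        return None
    return (delta_attended - delta_origin) / gap


if __name__ == "__main__":
    # Hypothetical fixed effects: the school attended lies halfway between
    # the origin and destination averages.
    print(relative_quality_index(0.5, 0.25, 0.75))
```

By construction, the index equals 0 when the school attended has the same quality as schools at origin and 1 when it matches schools at destination.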


Table A.1: Summary statistics: Educational outcomes across cohorts

                                                                  Cohort
                                              All      1995      1996      1998      2000      2001

Primary and secondary school outcomes
Did not start secondary school on time      0.113     0.156     0.153     0.124     0.073     0.068
Secondary school diploma                    0.760     0.755     0.752     0.759     0.767     0.765
Secondary school diploma in 5 years         0.610     0.600     0.587     0.609     0.630     0.621
No secondary school qualification           0.200     0.208     0.209     0.195     0.189     0.198

Post-secondary outcomes
Ever enrolled in college                    0.695     0.678     0.682     0.699     0.710     0.705
Enrolled in college by age 17               0.530     0.497     0.503     0.532     0.560     0.555
Ever enrolled in university                 0.373     0.460     0.451     0.424     0.332     0.220
Enrolled in university by age 19            0.170     0.166     0.166     0.169     0.175     0.175
Bachelor degree or more                     0.128     0.275     0.249     0.140     0.003     0.004

Educational attainment
Number of years of education               12.810    13.247    13.200    13.066    12.517    12.119

Observations                               92,764    16,969    18,067    18,777    19,125    19,826

Notes: The table shows cohort-specific average outcomes.

Table A.2: Variation Across Census Tracts and Schools

Outcome:                                            DES in 5 years     University enrollment   Years of education
                                                    (1)        (2)        (3)        (4)        (5)        (6)

Student-level standard deviation of fixed effects:
Schools                                           0.261      0.255      0.248      0.235      1.172      1.123
Neighborhoods (Census Tracts)                     0.152      0.068      0.159      0.081      0.734      0.328

Dependent variable summary statistics:
Mean                                                   0.729                 0.460                 13.323
Standard deviation                                    [0.444]               [0.498]               [2.083]

Fixed effects estimated
Separately                                            x                     x                     x
Simultaneously                                                   x                     x                     x

Number of students                                37,491
Number of primary schools                         435
Number of secondary schools                       211
Number of neighborhoods                           502

Notes: Sample restricted to students who always resided in the same census tract. School fixed effects are the sum of a primary and a secondary school fixed effect. In columns (1), (3) and (5), school and neighborhood effects are estimated in separate regressions. In columns (2), (4) and (6), all fixed effects are estimated simultaneously from equation (1.5).


Table A.3: Variation Across FSAs and Schools - Empirical Bayes Estimates

Outcome:                                            DES in 5 years     University enrollment   Years of education
                                                    (1)        (2)        (3)        (4)        (5)        (6)

Student-level standard deviation of shrunk fixed effects:
Schools                                           0.263      0.251      0.243      0.218      1.180      1.073
Neighborhoods (FSAs)                              0.127      0.016      0.129      0.035      0.636      0.148

Dependent variable summary statistics:
Mean                                                   0.706                 0.443                 13.228
Standard deviation                                    [0.456]               [0.497]               [2.113]

Fixed effects estimated
Separately                                            x                     x                     x
Simultaneously                                                   x                     x                     x

Number of students                                44,912
Number of primary schools                         440
Number of secondary schools                       218
Number of neighborhoods                           95

Notes: Sample restricted to students who always resided in the same FSA. School fixed effects are the sum of a primary and a secondary school fixed effect. To shrink estimates, I first calculate standard errors for each school and neighborhood fixed effect using bootstrap resampling (100 samples with replacement, clustering within primary school-secondary school-FSA cells). I then shrink estimates toward their means using the empirical Bayes procedure described in Chandra et al. (2016).
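The idea of shrinking noisy fixed effects toward their mean can be sketched as follows. This is a generic textbook shrinkage rule written under stated assumptions (a common signal variance estimated as total variance minus average sampling variance); it is not claimed to reproduce the exact procedure of Chandra et al. (2016), and all names and numbers are hypothetical.

```python
from statistics import fmean


def shrink_toward_mean(estimates, std_errors):
    """Shrink each estimate toward the mean of all estimates.

    The weight on an estimate is signal_var / (signal_var + se^2), so
    imprecisely estimated effects are pulled more strongly toward the mean.
    """
    mu = fmean(estimates)
    total_var = fmean([(e - mu) ** 2 for e in estimates])
    noise_var = fmean([se ** 2 for se in std_errors])
    # Assumed estimator of the variance of true effects, floored at zero.
    signal_var = max(total_var - noise_var, 0.0)
    shrunk = []
    for e, se in zip(estimates, std_errors):
        denom = signal_var + se ** 2
        weight = signal_var / denom if denom > 0 else 1.0
        shrunk.append(mu + weight * (e - mu))
    return shrunk


if __name__ == "__main__":
    # Hypothetical school fixed effects and bootstrap standard errors.
    effects = [-0.30, -0.05, 0.10, 0.45]
    bootstrap_se = [0.05, 0.20, 0.10, 0.30]
    print(shrink_toward_mean(effects, bootstrap_se))
```

Under this rule, the standard deviation of shrunk effects is never larger than that of the raw estimates, because every weight lies between 0 and 1.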


Table A.4: Exposure Effects: Moves across Census Tracts

Sample:                                            All movers                One-time movers
                                                  (1)          (2)          (3)          (4)

Measure of educational attainment

Origin-by-destination fixed effects
Secondary school diploma in 5 years           -0.0175     -0.0200*      -0.0300     -0.0355*
                                              (0.0123)    (0.0115)     (0.0202)    (0.0196)
University enrollment                         -0.0182     -0.0218*      -0.0153     -0.0206
                                              (0.0116)    (0.0114)     (0.0190)    (0.0186)
Years of schooling                            -0.0173     -0.0219*      -0.0194     -0.0268
                                              (0.0115)    (0.0112)     (0.0186)    (0.0180)
N                                               18981       18981         7460        7460

Origin + destination fixed effects
Secondary school diploma in 5 years           -0.0222***  -0.0224***   -0.0366***  -0.0388***
                                              (0.00659)   (0.00608)    (0.00842)   (0.00798)
University enrollment                         -0.0225***  -0.0236***   -0.0281***  -0.0313***
                                              (0.00643)   (0.00628)    (0.00820)   (0.00799)
Years of schooling                            -0.0272***  -0.0277***   -0.0306***  -0.0341***
                                              (0.00615)   (0.00587)    (0.00794)   (0.00762)
N                                               31333       31333        15469       15469

Cohort fixed effects                              x           x            x           x
Individual characteristics                        x           x            x           x
Age at move fixed effects                         x           x            x           x
Only moved once                                                            x           x
Times in difficulty before moving                             x                        x

Notes: Coefficients shown in the table are convergence rates β. In columns (2) and (4), the model includes a set of dummies for each possible value of number of times in difficulty prior to moving. Standard errors are clustered at the destination neighborhood level.
*** p<0.01, ** p<0.05, * p<0.1


Table A.5: Exposure Effects: Alternative Outcomes

Sample:                                            All movers                One-time movers
                                                  (1)          (2)          (3)          (4)

Measure of educational attainment
No secondary school qualification             -0.0676***  -0.0648***   -0.0496***  -0.0496***
                                              (0.0137)    (0.0143)     (0.0159)    (0.0165)
College enrollment (ever)                     -0.0373***  -0.0356***   -0.0267**   -0.0258*
                                              (0.0109)    (0.0118)     (0.0134)    (0.0138)
College enrollment by 17                      -0.0412***  -0.0382***   -0.0408***  -0.0389***
                                              (0.00814)   (0.00806)    (0.0112)    (0.0114)
College degree                                -0.0407***  -0.0389***   -0.0336***  -0.0321***
                                              (0.00873)   (0.00895)    (0.0110)    (0.0111)
University enrollment by 19                   -0.0395***  -0.0381***   -0.0454***  -0.0442***
                                              (0.0110)    (0.0112)     (0.0160)    (0.0163)
Bachelor degree or more                       -0.0374***  -0.0363***   -0.0261     -0.0258
                                              (0.0129)    (0.0130)     (0.0181)    (0.0182)
Expected earnings on basis of                 -0.0454***  -0.0433***   -0.0411***  -0.0397***
  level of education                          (0.00869)   (0.00930)    (0.00972)   (0.00970)
Expected earnings on basis of                 -0.0406***  -0.0391***   -0.0334***  -0.0324***
  level and field of education                (0.00847)   (0.00897)    (0.0106)    (0.0105)

Cohort fixed effects                              x           x            x           x
Individual characteristics                        x           x            x           x
Age at move fixed effects                         x           x            x           x
Origin-by-destination fixed effects               x           x            x           x
Only moved once                                                            x           x
Times in difficulty before moving                             x                        x
N                                               25993       25993        16949       16949

Notes: Coefficients shown in the table are convergence rates β. In columns (2) and (4), the model includes a set of dummies for each possible value of number of times in difficulty prior to moving. Standard errors are clustered at the destination neighborhood level. Details on the measurement of outcomes are provided in the Data Appendix.
*** p<0.01, ** p<0.05, * p<0.1


Table A.6: Alternative Decomposition of Exposure Effects

                                                  (1)          (2)          (3)

University enrollment
Total exposure effects
β                                             -0.0424***  -0.0424***   -0.0424***
                                              (0.0090)    (0.0090)     (0.0090)
Restricted convergence rates
β^{-s} (No school effects)                    -0.0107***  -0.0108***   -0.0154***
                                              (0.0036)    (0.0037)     (0.0042)
β^{-n} (No neighborhood effects)              -0.0317***  -0.0316***   -0.0270***
                                              (0.0069)    (0.0068)     (0.0058)
Share school effects                             75%         75%          64%
                                              (0.0612)    (0.0615)     (0.0526)

Secondary school diploma in 5 years
Total exposure effects
β                                             -0.0421***  -0.0421***   -0.0421***
                                              (0.0088)    (0.0088)     (0.0088)
Restricted convergence rates
β^{-s} (No school effects)                    -0.0109***  -0.0124***   -0.0114***
                                              (0.0036)    (0.0038)     (0.0038)
β^{-n} (No neighborhood effects)              -0.0309***  -0.0294***   -0.0304***
                                              (0.0084)    (0.0080)     (0.0083)
Share school effects                             74%         70%          73%
                                              (0.0876)    (0.0866)     (0.0896)

Years of education
Total exposure effects
β                                             -0.0488***  -0.0488***   -0.0488***
                                              (0.0088)    (0.0088)     (0.0088)
Restricted convergence rates
β^{-s} (No school effects)                    -0.0136***  -0.0146***   -0.0224***
                                              (0.0032)    (0.0033)     (0.0041)
β^{-n} (No neighborhood effects)              -0.0351***  -0.0342***   -0.0264***
                                              (0.0074)    (0.0071)     (0.0055)
Share school effects                             72%         70%          54%
                                              (0.0542)    (0.0518)     (0.0401)

Measure of school quality                     πΩ_{s(n)}   πΩ^{-i}_{s(n)}   πΩ^{-i}_{s(n)}
π                                                 1           1        RD estimate

Notes: Sample restricted to movers within Montreal. Standard errors are clustered at the destination FSA level, and obtained by the delta method for restricted convergence rates. $\beta^{-s}$ is a restricted rate for which $\frac{\beta_s \mathrm{Var}(\pi\Delta\Omega_{od}) + \beta_n \mathrm{Cov}(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})}{\mathrm{Var}(\Delta y_{od})} = 0$, and $\beta^{-n}$ is a restricted rate for which $\frac{\beta_n \mathrm{Var}(\Delta y^{-s}_{od}) + \beta_s \mathrm{Cov}(\pi\Delta\Omega_{od}, \Delta y^{-s}_{od})}{\mathrm{Var}(\Delta y_{od})} = 0$. Share school effects is given by the ratio $\frac{\beta - \beta^{-s}}{\beta}$.


Table A.7: School effects: Quadratic control function

Dependent variable, by column:
(1) First stage: quality of assigned school at baseline ($\delta^P_{s(i)}$)
(2) First stage: quality of school attended at baseline ($\delta^P_{s(i)}$)
(3) First stage: childhood average school quality ($\Omega^{-i}_{s(n(i))}$)
(4) Reduced-form RD: outcome
(5) RD-IV: outcome

                                                  (1)          (2)          (3)          (4)          (5)

Measure of educational attainment

All permanent residents
University enrollment                          0.0634***   0.0248***    0.0293***    0.0206**     0.7086***
                                              (0.0032)    (0.0028)     (0.0065)     (0.0093)     (0.2235)
Secondary school diploma in 5 years            0.0714***   0.0304***    0.0325***    0.0351***    1.0812***
                                              (0.0036)    (0.0026)     (0.0063)     (0.0091)     (0.1891)
Years of schooling                             0.2961***   0.1192***    0.1372***    0.1098**     0.8023***
                                              (0.0146)    (0.0126)     (0.0309)     (0.0428)     (0.1914)
N                                               43296       43279        43291        43296        43291

Placebo: Students in English schools
University enrollment                          0.0624***  -0.0051      -0.0089      -0.0169       1.8710
                                              (0.0043)    (0.0041)     (0.0106)     (0.0178)     (1.8939)
Secondary school diploma in 5 years            0.0712***   0.0056**     0.0012      -0.0052      -4.1034
                                              (0.0054)    (0.0028)     (0.0077)     (0.0135)    (33.3491)
Years of schooling                             0.2810***   0.0042      -0.0075      -0.0438       5.7248
                                              (0.0189)    (0.0167)     (0.0478)     (0.0770)    (29.8754)
N                                               13446       13444        13444        13446        13444

Cohort fixed effects                              x           x            x            x            x
Individual characteristics                        x           x            x            x            x
Neighborhood (FSA) fixed effects                  x           x            x            x            x
Boundary fixed effects                            x           x            x            x            x

Notes: This table reports RD estimates. In columns (1) and (2), primary school quality is measured using the fixed effects, $\delta^P_{s(i)}$, estimated in section (1.4.1). In column (3), the dependent variable is childhood average school quality $\Omega^{-i}_{s(n(i))}$. Column (5) reports 2SLS estimates of equations (1.6) and (1.7). In all specifications, the control function for distance to boundary is quadratic and allows for different functions on either side of the threshold. In the first three rows, the sample includes all permanent residents. In the last three rows, only permanent residents enrolled in English schools are included. All standard errors are clustered at the French primary school boundary level.
*** p<0.01, ** p<0.05, * p<0.1
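As a rough consistency check on the table above, a just-identified IV estimate equals the reduced-form coefficient divided by the first-stage coefficient. The sketch below applies that ratio to the reported coefficients for all permanent residents, assuming that column (3) is the relevant first stage for column (5); this pairing is an inference from the table layout, and the ratios match only approximately because the estimation samples differ slightly across columns. The function and variable names are hypothetical.

```python
def wald_ratio(reduced_form: float, first_stage: float) -> float:
    """Ratio of the reduced-form jump to the first-stage jump at the boundary."""
    if first_stage == 0:
        raise ZeroDivisionError("first stage is zero; the ratio is undefined")
    return reduced_form / first_stage


# (reduced form, first stage, reported RD-IV), taken from columns (4), (3)
# and (5) of Table A.7, rows for all permanent residents.
TABLE_A7 = {
    "University enrollment": (0.0206, 0.0293, 0.7086),
    "Secondary school diploma in 5 years": (0.0351, 0.0325, 1.0812),
    "Years of schooling": (0.1098, 0.1372, 0.8023),
}

if __name__ == "__main__":
    for outcome, (rf, fs, reported) in TABLE_A7.items():
        ratio = wald_ratio(rf, fs)
        print(f"{outcome}: ratio = {ratio:.3f}, reported RD-IV = {reported:.4f}")
```

The three ratios are about 0.70, 1.08 and 0.80, close to the reported RD-IV estimates. The same logic explains the large and imprecise placebo estimates: with a first stage near zero, the ratio is unstable.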


Table A.8: School effects: Triangular kernel control function

Dependent variable, by column:
(1) First stage: quality of assigned school at baseline ($\delta^P_{s(i)}$)
(2) First stage: quality of school attended at baseline ($\delta^P_{s(i)}$)
(3) First stage: childhood average school quality ($\Omega^{-i}_{s(n(i))}$)
(4) Reduced-form RD: outcome
(5) RD-IV: outcome

                                                  (1)          (2)          (3)          (4)          (5)

Measure of educational attainment

All permanent residents
University enrollment                          0.0632***   0.0245***    0.0315***    0.0253***    0.8081***
                                              (0.0032)    (0.0027)     (0.0064)     (0.0086)     (0.1774)
Secondary school diploma in 5 years            0.0715***   0.0298***    0.0330***    0.0344***    1.0459***
                                              (0.0036)    (0.0025)     (0.0061)     (0.0086)     (0.1700)
Years of schooling                             0.2946***   0.1162***    0.1456***    0.1129***    0.7779***
                                              (0.0146)    (0.0121)     (0.0297)     (0.0398)     (0.1661)
N                                               43296       43279        43291        43296        43291

Placebo: Students in English schools
University enrollment                          0.0630***  -0.0025      -0.0036      -0.0110       2.9933
                                              (0.0043)    (0.0041)     (0.0098)     (0.0160)     (6.3146)
Secondary school diploma in 5 years            0.0721***   0.0037       0.0005      -0.0084     -15.8055
                                              (0.0057)    (0.0026)     (0.0071)     (0.0118)   (231.0920)
Years of schooling                             0.2836***   0.0053       0.0044      -0.0471     -10.4999
                                              (0.0198)    (0.0156)     (0.0433)     (0.0658)   (114.4287)
N                                               13446       13444        13444        13446        13444

Cohort fixed effects                              x           x            x            x            x
Individual characteristics                        x           x            x            x            x
Neighborhood (FSA) fixed effects                  x           x            x            x            x
Boundary fixed effects                            x           x            x            x            x

Notes: This table reports RD estimates. In columns (1) and (2), primary school quality is measured using the fixed effects, $\delta^P_{s(i)}$, estimated in section (1.4.1). In column (3), the dependent variable is childhood average school quality $\Omega^{-i}_{s(n(i))}$. Column (5) reports 2SLS estimates of equations (1.6) and (1.7). In all specifications, the control function for distance to boundary is a triangular kernel and allows for different functions on either side of the threshold. In the first three rows, the sample includes all permanent residents. In the last three rows, only permanent residents enrolled in English schools are included. All standard errors are clustered at the French primary school boundary level.
*** p<0.01, ** p<0.05, * p<0.1


Table A.9: Balancing check for movers

Outcome of permanent residents:                  Years of Education        DES in 5 years        University Enrollment
                                                  (1)        (2)          (3)        (4)          (5)         (6)

Covariates
Gender                                          0.0038*    0.0034       0.0175*    0.0174       0.0187*     0.0162
                                               (0.0019)   (0.0029)     (0.0093)   (0.0136)     (0.0095)    (0.0138)
Speaks English at Home                         -0.0024    -0.0001      -0.0149*   -0.0035      -0.0121     -0.0034
                                               (0.0016)   (0.0021)     (0.0076)   (0.0107)     (0.0084)    (0.0113)
Speaks neither French nor English at Home      -0.0006    -0.0010       0.0033     0.0036      -0.0030     -0.0044
                                               (0.0015)   (0.0020)     (0.0077)   (0.0093)     (0.0081)    (0.0107)
Immigrant                                      -0.0032**  -0.0045**    -0.0054    -0.0104      -0.0175***  -0.0246**
                                               (0.0013)   (0.0019)     (0.0069)   (0.0095)     (0.0065)    (0.0098)
Handicapped                                    -0.0006    -0.0008      -0.0039    -0.0037      -0.0036     -0.0049
                                               (0.0006)   (0.0008)     (0.0027)   (0.0039)     (0.0027)    (0.0036)
Use Day Care at baseline                        0.0003    -0.0005      -0.0004    -0.0021       0.0026     -0.0016
                                               (0.0014)   (0.0016)     (0.0069)   (0.0081)     (0.0066)    (0.0078)
In difficulty at baseline                       0.0017**   0.0016*      0.0066*    0.0052       0.0078**    0.0085**
                                               (0.0007)   (0.0009)     (0.0036)   (0.0045)     (0.0035)    (0.0043)
Times in difficulty pre-move                    0.0098     0.0078       0.0313     0.0125       0.0365      0.0391
                                               (0.0081)   (0.0083)     (0.0394)   (0.0399)     (0.0394)    (0.0407)

Cohort fixed effects                               x          x            x          x            x           x
Age at move fixed effects                          x          x            x          x            x           x
Origin-by-Destination fixed effects                x          x            x          x            x           x
One-time movers only                                          x                       x                        x
N                                                25993      16949        25993      16949        25993       16949

Notes: In columns (1) and (2), $\Delta y_{od}$ is measured using years of education. In columns (3) and (4), fractions of students finishing secondary school in 5 years are used, and in columns (5) and (6), university enrollment rates are used. Standard errors are clustered at the destination neighborhood level.
*** p<0.01, ** p<0.05, * p<0.1


Table A.10: Robustness to time-varying observables

(1) (2) (3) (4) (5) (6)

Measure of educational attainment

Secondary school diploma in 5 years -0.0392*** -0.0437*** -0.0423*** -0.0370*** -0.0380*** -0.0318***

(0.0101) (0.00872) (0.00933) (0.00996) (0.0105) (0.0111)

University enrollment -0.0373*** -0.0436*** -0.0392*** -0.0350*** -0.0399*** -0.0332***

(0.0107) (0.00948) (0.00969) (0.0110) (0.0107) (0.0117)

Years of schooling -0.0435*** -0.0484*** -0.0437*** -0.0422*** -0.0457*** -0.0376***

(0.00965) (0.00903) (0.00902) (0.0103) (0.00946) (0.0105)

Time-varying controls

Income x x

Percent low-income x x

Dwelling value x x

Percent lone family x x

Percent with college x x

Cohort fixed effects x x x x x x

Individual characteristics x x x x x x

Age at move fixed effects x x x x x x

Origin-by-destination fixed effects x x x x x x

N 24357 24357 24357 24357 24357 24357

Notes: Time-varying controls are differences in census tract characteristics around the time of the move. The model includes both the main effect of these controls as well as their interaction with age-at-move. Each column includes a different set of observable time-varying variables. Standard errors are clustered at the destination neighborhood level.
*** p<0.01, ** p<0.05, * p<0.1

Table A.11: Heterogeneous Exposure Effects

Heterogeneity by:                                      Gender              Language at school            Moves to
                                                  Boys       Girls        French      English     Better FSA   Worse FSA
                                                  (1)         (2)          (3)          (4)          (5)          (6)

Measure of educational attainment
Secondary school diploma in 5 years           -0.0440***  -0.0476***   -0.0473***   -0.0449**    -0.0385***   -0.0115
                                              (0.0110)    (0.0140)     (0.0114)     (0.0205)     (0.0137)     (0.0235)
University enrollment                         -0.0321**   -0.0571***   -0.0385***   -0.0390*     -0.0398**    -0.0536**
                                              (0.0123)    (0.0128)     (0.0107)     (0.0207)     (0.0192)     (0.0207)
Years of schooling                            -0.0425***  -0.0587***   -0.0485***   -0.0525***   -0.0257*     -0.0520***
                                              (0.0119)    (0.0127)     (0.0124)     (0.0173)     (0.0151)     (0.0192)

Cohort fixed effects                              x           x            x            x            x            x
Individual characteristics                        x           x            x            x            x            x
Age at move fixed effects                         x           x            x            x            x            x
Origin-by-destination fixed effects               x           x            x            x            x            x
N                                               13200       12793        19102        6891         14189        11804

Notes: Column (1) includes only boys and column (2) restricts the sample to girls. In columns (3) and (4), regressions are run separately by language of instruction at age 15. In column (5), the sample is restricted to movers for which $\Delta y_{od} > 0$ and column (6) is restricted to cases where $\Delta y_{od} < 0$. Standard errors are clustered at the destination neighborhood level.
*** p<0.01, ** p<0.05, * p<0.1


Table A.12: Decomposition of Exposure Effects - Empirical Bayes Estimates

Outcome:                          University enrollment  DES in 5 years  Years of education
                                  (1)                    (2)             (3)

Total exposure effects
β                                 -0.0494***             -0.0434***      -0.0548***
                                  (0.0105)               (0.0102)        (0.0102)

Restricted convergence rates
β_{-s} (No school effects)        -0.0093**              -0.0054**       -0.0137***
                                  (0.0042)               (0.0021)        (0.0044)
β_{-n} (No neighborhood effects)  -0.0401***             -0.0381***      -0.0411***
                                  (0.0091)               (0.0103)        (0.0100)

Share school effects              81%                    88%             75%
                                  (0.0725)               (0.0543)        (0.0809)

Notes: Sample restricted to movers within Montreal. To shrink estimates of $\Omega_n$ and $\Lambda_n$, I first calculate standard errors for each school and neighborhood fixed effect using bootstrap resampling (100 samples with replacement, clustering within primary school-secondary school-FSA cells). I then shrink estimates toward their means using the empirical Bayes procedure described in Chandra et al. (2016). Standard errors on the convergence rates are clustered at the destination FSA level, and obtained by the delta method for restricted convergence rates. $\beta_{-s}$ is a restricted rate for which $\beta_s = 0$, and $\beta_{-n}$ is a restricted rate for which $\beta_n = 0$. The total rate is constructed using equation (1.10) (i.e. $\beta = \beta_{-s} + \beta_{-n}$). Share school effects is given by the ratio $(\beta - \beta_{-s})/\beta$.
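As a quick arithmetic check on the last row of Table A.12, the share of school effects can be recomputed from the reported point estimates. The sketch below is only illustrative: the function name `school_share` is hypothetical, and the inputs are the rounded coefficients printed in the table, so the result matches the reported shares only up to rounding.

```python
def school_share(beta_total, beta_no_school):
    """Share of the total exposure effect attributed to schools.

    Implements the ratio (beta - beta_{-s}) / beta from the table notes.
    """
    return (beta_total - beta_no_school) / beta_total


# (beta, beta_{-s}) point estimates as printed in Table A.12 (rounded values).
estimates = {
    "University enrollment": (-0.0494, -0.0093),
    "DES in 5 years": (-0.0434, -0.0054),
    "Years of education": (-0.0548, -0.0137),
}

# Express each share as a whole-number percentage.
shares = {
    outcome: round(100 * school_share(beta, beta_no_school))
    for outcome, (beta, beta_no_school) in estimates.items()
}
print(shares)
```

The rounded shares agree with the 81%, 88% and 75% reported in the table.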


1.10 Data Appendix

Measurement of outcomes. Different levels of education are governed by different departments of the Ministry of Education. Each department keeps separate student records in different formats, but these files can be matched using unique student IDs. Researchers interested in using these data must first submit a research protocol to the Ministry and file a data access request through the Commission d’accès à l’information.

Primary and secondary school levels, as well as vocational studies, are governed by the same department. These records notably include any secondary school degree or qualification received, vocational degrees awarded, and the year these degrees were earned. For vocational degrees, the subject is also recorded. From these files, I create an indicator variable for obtaining a secondary school diploma (DES) within 5 years of starting secondary school (i.e. the year a student is first observed in grade 7). Note that a student may have been held back in primary school and still obtain a secondary school diploma on time.
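The on-time diploma indicator described above can be sketched as follows. This is a minimal illustration rather than the code used for the thesis: the function name and arguments are hypothetical, and it assumes that grade 7 entry is coded by the calendar year in which the school year starts and that the diploma is coded by the calendar year in which it is awarded, so that a normal five-year path gives a difference of exactly 5.

```python
from typing import Optional


def des_in_5_years(first_grade7_year: int, des_year: Optional[int]) -> int:
    """Return 1 if a DES is obtained within 5 years of first being observed in grade 7.

    Assumptions (not from the source): `first_grade7_year` is the calendar year in
    which the grade 7 school year starts, and `des_year` is the calendar year in
    which the diploma is awarded (None if no DES is ever observed).
    """
    if des_year is None:
        return 0
    return int(des_year - first_grade7_year <= 5)


# Entered grade 7 in fall 2002 and graduated in spring 2007: on time.
print(des_in_5_years(2002, 2007))
# Same entry year, diploma one year late.
print(des_in_5_years(2002, 2008))
```

Because the clock starts in grade 7, a student held back in primary school is still counted as graduating on time, as noted in the text.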

The College department records the year a student was first enrolled in any collegial program in Quebec, as well as the program and the institution of that first registration. If a college degree is awarded, the program in which the degree was awarded is recorded (e.g. pre-university degree in Natural Sciences). The exact date the degree was earned is not recorded, however. The files instead indicate whether the degree was completed either (a) on time, (b) less than 2 years after the expected duration, or (c) more than 2 years after the expected duration. There is a further caveat: degree completion is only recorded for students who first enrolled in a “normal” college program (DEC). For example, degree completion is not recorded for students who first enrolled in a transition program. I use these files to create indicators of college enrollment and college completion. I also approximate the year of completion using the coarse information on time to completion.

The University department records enrollment separately by semester (Fall, Winter and Summer). For each semester, if a student is enrolled in a Quebec university, the number of credits taken, the institution and the field of study are recorded. A separate file is kept for degrees awarded. This file includes the year a degree is awarded, the granting institution, and the type of degree (bachelor’s, master’s, doctoral, 1-year diploma, etc.). With these files, I notably create an indicator of university enrollment and one for bachelor’s degree completion.

Combining information from all three departments, I then calculate each student’s highest level of education. The categories I consider are:

• No secondary school diploma or qualification

• Secondary school diploma (DES)

• Secondary school qualifications

• Vocational degree (DEP)

• Some collegial, started in “normal” program, no degree yet

• Some collegial, did not start in “normal” program

• Pre-university college degree

• Technical college degree


• Other college degree (includes 1-year degrees)

• Some university, no degree yet

• 1-year university diploma

• Bachelor degree or higher

I also calculate each student’s number of years of education. Note that this variable might vary within the categories listed above. For instance, someone who dropped out in grade 9 has 9 years of education, while someone who dropped out in grade 10 has 10. Someone who took 13 years of primary/secondary schooling to obtain a DES and has no further schooling is coded as having 11 years of education (i.e. the normal time it takes to get a DES). Students who were in university for one year and then dropped out have 14 years of education (11 for primary+secondary school, 2 for college, and 1 in university), while those who stayed in university for two years before dropping out have 15 years of education. I top code the number of years of education at 16 (the time it takes to obtain a bachelor’s degree), however, to avoid my results being driven by outliers. For instance, I do observe a few hundred students with 19 years of education or more (i.e. people from earlier cohorts in master’s and PhD programs). The number of years of education therefore incorporates information on multiple margins, e.g. retention in university, college enrollment, vocational studies after secondary school, dropout behavior, etc.
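The coding rules in this paragraph can be illustrated with a short sketch. The function and argument names are hypothetical, and the treatment of cases the text does not spell out (for instance how partial college or vocational years are credited) is left to the caller through the `college_years` and `university_years` arguments; only the examples given above are taken from the source.

```python
TOP_CODE = 16    # time it normally takes to obtain a bachelor's degree
DES_YEARS = 11   # normal time to obtain a secondary school diploma (DES)


def years_of_education(highest_grade, has_des, college_years=0, university_years=0):
    """Years of education under the coding rules described in the text.

    - Dropouts without a DES are credited with the last grade reached.
    - DES holders are credited with 11 years for primary + secondary school,
      however long it actually took them.
    - The total is top coded at 16.
    The crediting of college and university years is an assumption of this sketch.
    """
    if not has_des:
        return min(highest_grade, TOP_CODE)
    return min(DES_YEARS + college_years + university_years, TOP_CODE)


print(years_of_education(9, False))          # dropped out in grade 9
print(years_of_education(11, True))          # DES only, even if it took 13 years
print(years_of_education(11, True, 2, 1))    # one year of university, then dropped out
print(years_of_education(11, True, 2, 8))    # long graduate studies, top coded
```

The four calls reproduce the examples in the text: 9, 11, 14 and the top code of 16.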

Finally, I create measures of expected earnings. To do so, I calculate earnings percentile ranks (in the national earnings distribution) for all workers aged 30-44 in the Public Use Microdata File of the 2006 Canadian Census, separately by age-group. I then calculate the mean earnings rank for each category of highest level of education, as well as for all possible combinations of level-of-education and field-of-study. Each student in my data is then assigned the mean earnings rank associated with her level of education in the 2006 Canadian Census (or combination of highest level of education and field of study). Note that students in the 1995 cohort normally finished secondary school in 2005-2006, meaning that 2006 is the year they were making their decision to pursue a post-secondary education.
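A minimal sketch of this two-step construction is given below, using toy records rather than Census data. The record layout, the function names, and the use of mid-ranks to handle ties are all assumptions of the sketch; the text only states that ranks are computed separately by age group and then averaged by education category.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict


def percentile_ranks(values):
    """Percentile rank (0-100) of each value within `values`, using mid-ranks for ties."""
    ordered = sorted(values)
    n = len(values)
    ranks = []
    for v in values:
        below = bisect_left(ordered, v)
        equal = bisect_right(ordered, v) - below
        ranks.append(100 * (below + 0.5 * equal) / n)
    return ranks


def mean_rank_by_education(workers):
    """Rank earnings within age group, then average the ranks by education category."""
    by_age = defaultdict(list)
    for w in workers:
        by_age[w["age_group"]].append(w)

    totals = defaultdict(lambda: [0.0, 0])  # education -> [sum of ranks, count]
    for group in by_age.values():
        ranks = percentile_ranks([w["earnings"] for w in group])
        for w, r in zip(group, ranks):
            totals[w["education"]][0] += r
            totals[w["education"]][1] += 1
    return {edu: s / c for edu, (s, c) in totals.items()}


# Toy data (hypothetical): two age groups, two education categories.
workers = [
    {"age_group": "30-34", "earnings": 10, "education": "DES"},
    {"age_group": "30-34", "earnings": 20, "education": "DES"},
    {"age_group": "30-34", "earnings": 30, "education": "Bachelor"},
    {"age_group": "30-34", "earnings": 40, "education": "Bachelor"},
    {"age_group": "35-39", "earnings": 100, "education": "DES"},
    {"age_group": "35-39", "earnings": 200, "education": "Bachelor"},
]
expected_rank = mean_rank_by_education(workers)

# Each student is then assigned the mean rank of her education category.
student_education = {"student_1": "DES", "student_2": "Bachelor"}
student_expected_rank = {s: expected_rank[e] for s, e in student_education.items()}
print(student_expected_rank)
```

Ranking within age group first means that older workers' higher nominal earnings do not mechanically raise the rank attached to education categories that are more common among them.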

Measurement of $\Omega^{-i}_{s(n(i))}$. Equation (1.5) simultaneously includes primary and secondary school fixed effects. This yields one fixed effect for each school in Montreal. Note that students attending a given secondary school need not have attended the same primary school – secondary schools do not nest primary schools.67

For each student, I then create a leave-self-out measure for both primary and secondary schools. For instance, for student $i$ and primary school $s$ (which student $i$ attended), I calculate

$$
\delta^{-i,P}_{s} = \frac{\delta^{P}_{s(i)} N_s - \tilde{y}_i}{N_s - 1},
$$

where $N_s$ is the number of permanent residents who attend school $s$ and $\tilde{y}_i = y_i - \bar{y}$ is the deviation of student $i$’s outcome from the sample mean. Student $i$’s outcome must be first re-centered around the sample mean because fixed effects are normalized to have a mean of zero.68
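The leave-self-out adjustment can be illustrated numerically. The sketch below is a simplified illustration, not the estimation code: it assumes a toy case without covariates, in which a school's fixed effect is just the school mean of the re-centered outcome, so that the adjusted measure should equal the mean re-centered outcome of the student's schoolmates. The function name and the toy numbers are hypothetical.

```python
def leave_self_out(delta_s, n_s, y_i, y_bar):
    """Leave-self-out school effect: (delta_s * N_s - (y_i - y_bar)) / (N_s - 1)."""
    y_tilde = y_i - y_bar  # re-center the student's own outcome around the sample mean
    return (delta_s * n_s - y_tilde) / (n_s - 1)


# Toy school with four permanent residents and a binary outcome (hypothetical values).
outcomes = [1, 0, 1, 1]
y_bar = 0.5  # sample mean across all schools, taken as given here

# Without covariates, the normalized fixed effect is the school mean minus the sample mean.
delta_s = sum(outcomes) / len(outcomes) - y_bar

# Leave-self-out measure for the first student.
adjusted = leave_self_out(delta_s, len(outcomes), outcomes[0], y_bar)

# Direct computation: mean re-centered outcome of the other three students.
others = [y - y_bar for y in outcomes[1:]]
direct = sum(others) / len(others)
print(adjusted, direct)
```

In this covariate-free case the two numbers coincide, which is the sense in which the formula removes the student's own contribution from the school effect.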

Then, I assign the relevant leave-self-out measure to each student-year observation. For years in which a student is in a primary school other than the one he was attending at baseline, no leave-self-out adjustment is necessary since that student was not in that school during the year on which the fixed effect estimation is based. I then take the student-level average of $\delta^{-i,P}_{s}$ over all primary school years, and similarly calculate a student-level average of $\delta^{-i,S}_{s}$ for secondary school years. The childhood school quality measure $\Omega^{-i}_{s(n(i))}$ is then the simple sum of these two averages. Note this averaging over primary/secondary school years only matters for permanent residents who have switched school at some point. For the majority of students who only attended one primary and one secondary school, the averaging is redundant, and it is simply the case that $\Omega^{-i}_{s(n(i))} = \delta^{-i,P}_{s} + \delta^{-i,S}_{s}$.

67 Default French primary schools do feed into default secondary schools. But with open enrollment, and the large number of private secondary schools, the connection between local primary and secondary schools is weak.

68 Jackknife estimates of school fixed effects $\delta^{-i}_{s}$, in which one regression is run for each observation, are almost perfectly correlated (0.99) with my hand-calculated leave-self-out measures.

In unreported analyses, I use a split-sample approach in which a random half of the sample of permanent residents is used to measure school and neighborhood quality and the other half is used to estimate the regression-discontinuity design. Split-sample and leave-self-out measures of school quality are highly correlated (0.98), hence the results presented in this paper are very similar under the split-sample approach.

Catchment Areas. To my knowledge, no electronic, geocoded version of the catchment areas that prevailed in the years 1995-2001 exists. I therefore re-constructed such maps using the following procedure.

To first generate a benchmark, the default school associated with each six-digit postal code of the Island of Montreal as of 2015 was recorded by “feeding” each of these ≈45,000 postal codes into the search engines of the websites of the three francophone school boards. Using shapefiles for Canadian postal codes, I then created a map of all 2015 French catchment areas on the Island of Montreal, down to the six-digit postal code level.

To infer what the boundaries were in the years the cohorts of students I track started grade school, I used two additional sources of information. First, the Ministry of Education provided me temporarily with baseline enrollment data for all 100,929 students in my data set along with their six-digit postal codes (in the analytical data set, six-digit postal codes are de-identified).69 I then mapped actual attendance patterns and compared them with the 2015 boundaries. Second, I used the Internet Archive’s Wayback Machine (https://archive.org/web/) to document each school opening/closure that happened since 1995, and extracted old maps of catchment areas from archived versions of the school boards’ websites (when available). Combining all these sources of information, I deduced where the boundaries must have been drawn, and assigned the appropriate default schools to each postal code by hand. It must be noted that for many schools, the boundaries have not changed since 1995, hence no manual re-coding was necessary. Using ArcGIS, I also calculated, for each postal code, the distance to the nearest boundary and the unique ID of that boundary. Only boundaries that do not coincide with natural divisions such as highways and canals were considered. Using these same sources of information, I also inputted catchment areas for English public schools. As explained in the text, however, these boundaries are not well-defined and therefore not used in the analyses.

Attrition. About 8% of the total number of students who started grade 1 in Montreal had vanished from primary/secondary school educational records before turning 16. These students are excluded from the main sample used in this paper. Interestingly, about 1,000 of these students did enroll in a Quebec university at some point, even though they did not graduate from secondary school in the province. Students who had left Montreal (but remained in Quebec) by the time they turned 15 are also excluded from all analyses.

69 This first data delivery contained only two variables: school attended (name and code) and postal code of residence. For confidentiality reasons, this file had to be destroyed before the analytical files could be transferred to me.


For higher education, enrollment in colleges and universities outside the province is not included in my dataset. As a result, I may wrongly infer that some students in my main sample never attended college, when in fact they did so out-of-province. However, this phenomenon likely only affects a very small proportion of my sample. A few factors provide strong incentives for college and university students to remain in the province, at least for their undergraduate studies. Firstly, tuition fees in Quebec are the lowest in Canada. Secondly, the discrepancies between Quebec’s and other North American educational systems generate important timing issues in meeting college requirements. For instance, at the end of secondary school, students in Quebec only have 11 years of school, rather than 12. Finally, there is a language barrier for the large majority of students who went to primary and secondary school in French.

To assess the possible magnitude of this measurement issue, I use data from the loans and bursaries records of the Ministry of Education. For each year between 1995-1996 and 2014-2015, I was given a series of indicator variables that flag whether student i in my sample was receiving loans or bursaries in year t. Students who resided in Quebec in childhood but go abroad for college are still eligible for loans and bursaries from the Quebec government. Since at the time of enrolling in a foreign college the student’s permanent address is often still a Quebec one, it is easier for them to take up loans from Quebec than from another province. I can therefore check the proportion of students who take up student loans while not being enrolled in any postsecondary institution in Quebec to assess the size of the phenomenon. Under this method, I find that about 1% of my sample attended a higher education institution outside the province at some point (many of whom also attended a college or a university in Quebec before doing so out-of-province). Finally, it is worth noting that any mis-measurement of educational attainment due to students leaving the province would plausibly lead me to underestimate differences across schools and neighborhoods. Students studying abroad, where tuition is much more expensive, are arguably from higher-SES backgrounds, leading me to underestimate educational attainment in places where it is the highest.
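The check described above amounts to flagging students who receive aid in a year with no matching Quebec postsecondary enrollment. The sketch below illustrates the logic on toy records; the data layout, the function name and the coding of academic years as integers are all hypothetical.

```python
def studied_outside_quebec(aid_years, quebec_enrollment_years):
    """True if loans or bursaries were received in at least one year
    with no recorded postsecondary enrollment in Quebec."""
    return bool(set(aid_years) - set(quebec_enrollment_years))


# Toy records (hypothetical): academic years coded by their starting year.
students = {
    "a": {"aid": [2008, 2009], "enrolled": [2008, 2009]},  # aid always matched by enrollment
    "b": {"aid": [2008, 2009], "enrolled": [2008]},        # aid in 2009 without Quebec enrollment
    "c": {"aid": [], "enrolled": [2007]},                  # never received aid
    "d": {"aid": [2010], "enrolled": []},                  # aid, never enrolled in Quebec
}

flags = {
    sid: studied_outside_quebec(rec["aid"], rec["enrolled"])
    for sid, rec in students.items()
}
share_flagged = sum(flags.values()) / len(flags)
print(flags, share_flagged)
```

Student "b" illustrates the case mentioned in the text of someone who first attended a Quebec institution and later studied out-of-province.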

1.11 Mathematical Appendix

Interpretation of $\pi$: Example. Suppose we did observe $\mu_n$ and $\psi_{s(n(i))}$, and ran a simple cross-sectional regression of $y^{PR}_{n(i)}$ on both these variables for the subsample of permanent residents. The regression equation would take the following form:

$$
y^{PR}_{n(i)} = \alpha_n \mu_n + \alpha_s \psi_{s(n(i))} + \varepsilon_i. \tag{1.11}
$$

The OLS estimate of $\alpha_s$ is equal to $A\omega + A\rho_s$, where $\rho_s$ corresponds to the omitted variable bias. Alternatively, $\rho_s$ is a partial regression coefficient in a linear projection of family inputs $\theta_i$ onto $\mu_n$ and $\psi_{s(n(i))}$. Then, $\Omega_{s(n(i))} = \alpha_s \psi_{s(n(i))}$, $\pi = \frac{\omega}{\omega + \rho_s}$ and $\pi\,\Omega_{s(n(i))} = A\omega\,\psi_{s(n(i))}$.

Now, consider the feasible regression of $y^{PR}_{n(i)}$ on $\Omega_{s(n(i))}$ as well as on a set of neighborhood fixed effects, and let $\tilde{\Omega}_{s(n(i))}$ denote the residuals from a regression of $\Omega_{s(n(i))}$ on neighborhood fixed effects. The OLS estimate of the coefficient on $\Omega_{s(n(i))}$ is

$$
\begin{aligned}
\frac{\operatorname{Cov}\big(y^{PR}_{n(i)}, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)}
&= \underbrace{\frac{\operatorname{Cov}\big(A\lambda\mu_n, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)}}_{=0 \text{ (fixed effects)}}
+ \frac{\operatorname{Cov}\big(A\omega\psi_{s(n(i))}, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)}
+ \frac{\operatorname{Cov}\big(A\theta_i, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)} \\
&= \frac{A\omega}{\alpha_s} + \frac{A\rho_s}{\alpha_s} = 1.
\end{aligned}
$$

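Note that $\Omega_{s(n(i))}$ is a measure of predicted gains (estimated school effects), while $A\omega\psi_{s(n(i))}$ corresponds to true gains (true school effects). Now consider an experimental sample of permanent residents that is randomly assigned to schools and neighborhoods. Their outcomes are given by $y^{E}_{i} = A\big[\lambda\mu_{n(i)} + \omega\psi_{s(n(i))} + \nu_i\big]$, where $\nu_i$ is uncorrelated with $\mu_{n(i)}$ and $\psi_{s(n(i))}$ by virtue of random assignment. Consider a regression of $y^{E}_{i}$ on a measure of $\Omega_{s(n(i))}$ constructed using an external, non-experimental sample. The OLS coefficient obtained by regressing $y^{E}_{i}$ on $\Omega_{s(n(i))}$ and a set of neighborhood fixed effects is

$$
\begin{aligned}
\frac{\operatorname{Cov}\big(y^{E}_{i}, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)}
&= \frac{\operatorname{Cov}\big(A\big[\lambda\mu_{n(i)} + \omega\psi_{s(n(i))} + \nu_i\big], \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)} \\
&= \underbrace{\frac{\operatorname{Cov}\big(A\lambda\mu_{n(i)}, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)}}_{=0 \text{ (fixed effects)}}
+ \frac{\operatorname{Cov}\big(A\omega\psi_{s(n(i))}, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)}
+ \underbrace{\frac{\operatorname{Cov}\big(\nu_i, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)}}_{=0 \text{ (randomization)}} \\
&= \frac{\operatorname{Cov}\big(A\omega\psi_{s(n(i))}, \tilde{\Omega}_{s(n(i))}\big)}{\operatorname{Var}\big(\tilde{\Omega}_{s(n(i))}\big)}
= \frac{A\omega}{\alpha_s} = \frac{\omega}{\omega + \rho_s} = \pi.
\end{aligned}
$$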

The coefficient $\pi$ is therefore the ratio of the causal effect of attending a better school over total school variation (the causal effect plus the sorting component). In the language of Chetty, Friedman and Rockoff (2014a), $\pi$ is the relationship between true school effects and estimated school effects. It follows that $1 - \pi = \frac{\rho_s}{\omega + \rho_s}$ is the amount of forecast bias in $\Omega_{s(n(i))}$.
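To fix ideas, consider a purely illustrative calculation. The values of $\omega$ and $\rho_s$ below are hypothetical and are not estimates from this chapter:

$$
\omega = 0.6,\quad \rho_s = 0.2
\;\;\Rightarrow\;\;
\pi = \frac{0.6}{0.6 + 0.2} = 0.75,
\qquad
1 - \pi = \frac{0.2}{0.6 + 0.2} = 0.25.
$$

In this example, a one-unit difference in estimated school effects $\Omega_{s(n(i))}$ corresponds to a 0.75-unit difference in true school effects, and the remaining quarter of the estimated difference reflects sorting on family inputs.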

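Without such an experimental sample, one can still estimate the amount of forecast bias using a valid instrumental variable $Z_i$ that shifts $\psi_{s(n(i))}$ but is otherwise orthogonal to parental inputs $\theta_i$. The IV estimate of the coefficient on $\Omega_{s(n(i))}$ in a regression of $y^{PR}_{n(i)}$ on $\Omega_{s(n(i))}$ as well as on a set of neighborhood fixed effects is

$$
\begin{aligned}
\frac{\operatorname{Cov}\big(y^{PR}_{n(i)}, \tilde{Z}_i\big)}{\operatorname{Cov}\big(\Omega_{s(n(i))}, \tilde{Z}_i\big)}
&= \underbrace{\frac{\operatorname{Cov}\big(A\lambda\mu_n, \tilde{Z}_i\big)}{\operatorname{Cov}\big(\Omega_{s(n(i))}, \tilde{Z}_i\big)}}_{=0 \text{ (fixed effects)}}
+ \frac{\operatorname{Cov}\big(A\omega\psi_{s(n(i))}, \tilde{Z}_i\big)}{\operatorname{Cov}\big(\Omega_{s(n(i))}, \tilde{Z}_i\big)}
+ \underbrace{\frac{\operatorname{Cov}\big(A\theta_i, \tilde{Z}_i\big)}{\operatorname{Cov}\big(\Omega_{s(n(i))}, \tilde{Z}_i\big)}}_{=0 \text{ (exclusion restriction)}} \\
&= A\omega\,\frac{\operatorname{Cov}\big(\psi_{s(n(i))}, \tilde{Z}_i\big)}{\operatorname{Cov}\big(\Omega_{s(n(i))}, \tilde{Z}_i\big)}
= \frac{A\omega}{\alpha_s}\,\frac{\operatorname{Cov}\big(\Omega_{s(n(i))}, \tilde{Z}_i\big)}{\operatorname{Cov}\big(\Omega_{s(n(i))}, \tilde{Z}_i\big)}
= \frac{\omega}{\omega + \rho_s} = \pi,
\end{aligned}
$$

where $\tilde{Z}_i$ denotes the residuals from a regression of $Z_i$ on the neighborhood fixed effects.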

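Decomposition equation. For ease of exposition, ignore the conditioning variables and fixed effects in equations (1.8) and (1.9), and focus on the associated regression coefficients $\beta$, $\beta_s$ and $\beta_n$. Also, for simplicity, set $\pi = 1$ so that $(\Delta y_{od} - \pi\Delta\Omega_{od}) = \Delta\Lambda_{od}$, and let $\Delta\tilde{\Omega}_{od}$ denote the residuals of a regression of $\Delta\Omega_{od}$ on $\Delta\Lambda_{od}$:

$$
\Delta\tilde{\Omega}_{od} = \Delta\Omega_{od} - \frac{\operatorname{Cov}(\Delta\Omega_{od}, \Delta\Lambda_{od})}{\operatorname{Var}(\Delta\Lambda_{od})}\,\Delta\Lambda_{od}.
$$

Define $\Delta\tilde{\Lambda}_{od}$ accordingly. Then, the coefficients of the simplified horse-race regression are

$$
\beta_s = \frac{\operatorname{Cov}\big(m\Delta\tilde{\Omega}_{od}, y_i\big)}{\operatorname{Var}\big(m\Delta\tilde{\Omega}_{od}\big)};
\qquad
\beta_n = \frac{\operatorname{Cov}\big(m\Delta\tilde{\Lambda}_{od}, y_i\big)}{\operatorname{Var}\big(m\Delta\tilde{\Lambda}_{od}\big)}.
$$

The full convergence rate is $\beta = \frac{\operatorname{Cov}(m\Delta y_{od}, y_i)}{\operatorname{Var}(m\Delta y_{od})}$. Re-organizing,

$$
\begin{aligned}
\operatorname{Var}(m\Delta y_{od}) \times \beta
&= \operatorname{Cov}(m\Delta y_{od}, y_i) = \operatorname{Cov}(m\Delta\Omega_{od}, y_i) + \operatorname{Cov}(m\Delta\Lambda_{od}, y_i) \\
&= \operatorname{Cov}\big(m\Delta\tilde{\Omega}_{od}, y_i\big) + \frac{\operatorname{Cov}(m\Delta\Omega_{od}, m\Delta\Lambda_{od})}{\operatorname{Var}(m\Delta\Lambda_{od})}\operatorname{Cov}(m\Delta\Lambda_{od}, y_i) \\
&\quad + \operatorname{Cov}\big(m\Delta\tilde{\Lambda}_{od}, y_i\big) + \frac{\operatorname{Cov}(m\Delta\Omega_{od}, m\Delta\Lambda_{od})}{\operatorname{Var}(m\Delta\Omega_{od})}\operatorname{Cov}(m\Delta\Omega_{od}, y_i) \\
&= \beta_s \operatorname{Var}\big(m\Delta\tilde{\Omega}_{od}\big) + \beta_n \operatorname{Var}\big(m\Delta\tilde{\Lambda}_{od}\big) \\
&\quad + \operatorname{Cov}(m\Delta\Omega_{od}, m\Delta\Lambda_{od})\left[\frac{\operatorname{Cov}(m\Delta\Lambda_{od}, y_i)}{\operatorname{Var}(m\Delta\Lambda_{od})} + \frac{\operatorname{Cov}(m\Delta\Omega_{od}, y_i)}{\operatorname{Var}(m\Delta\Omega_{od})}\right] \\
&= \beta_s\left[\operatorname{Var}(m\Delta\Omega_{od}) - \frac{\operatorname{Cov}(m\Delta\Omega_{od}, m\Delta\Lambda_{od})^2}{\operatorname{Var}(m\Delta\Lambda_{od})}\right]
 + \beta_n\left[\operatorname{Var}(m\Delta\Lambda_{od}) - \frac{\operatorname{Cov}(m\Delta\Omega_{od}, m\Delta\Lambda_{od})^2}{\operatorname{Var}(m\Delta\Omega_{od})}\right] \\
&\quad + \operatorname{Cov}(m\Delta\Omega_{od}, m\Delta\Lambda_{od})\left[\frac{\operatorname{Cov}(m\Delta\Lambda_{od}, y_i)}{\operatorname{Var}(m\Delta\Lambda_{od})} + \frac{\operatorname{Cov}(m\Delta\Omega_{od}, y_i)}{\operatorname{Var}(m\Delta\Omega_{od})}\right] \\
&= \beta_s \operatorname{Var}(m\Delta\Omega_{od}) + \beta_n \operatorname{Var}(m\Delta\Lambda_{od}) + (\beta_s + \beta_n)\operatorname{Cov}(m\Delta\Omega_{od}, m\Delta\Lambda_{od})
\end{aligned}
$$

$$
\Rightarrow\quad \beta = \frac{1}{\operatorname{Var}(\Delta y_{od})}\Big[\beta_s \operatorname{Var}(\Delta\Omega_{od}) + \beta_n \operatorname{Var}(\Delta\Lambda_{od}) + (\beta_s + \beta_n)\operatorname{Cov}(\Delta\Omega_{od}, \Delta\Lambda_{od})\Big].
$$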
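The final identity can be checked numerically on simulated data. The sketch below is an illustration under simplifying assumptions, not a replication: it drops the age-at-move term $m$, sets $\pi = 1$ so that $\Delta y_{od} = \Delta\Omega_{od} + \Delta\Lambda_{od}$, and uses arbitrary simulated values. All variable names are hypothetical.

```python
import random


def cov(x, y):
    """Sample covariance (dividing by n, which is immaterial for the identity)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def var(x):
    return cov(x, x)


random.seed(0)
n = 2000
d_lambda = [random.gauss(0, 1) for _ in range(n)]             # neighborhood component
d_omega = [0.5 * l + random.gauss(0, 1) for l in d_lambda]    # school component, correlated
d_y = [o + l for o, l in zip(d_omega, d_lambda)]              # pi = 1: total difference
y = [-0.04 * o - 0.01 * l + random.gauss(0, 1) for o, l in zip(d_omega, d_lambda)]

# Horse-race OLS coefficients from the two-regressor normal equations.
s_oo, s_ll, s_ol = var(d_omega), var(d_lambda), cov(d_omega, d_lambda)
s_oy, s_ly = cov(d_omega, y), cov(d_lambda, y)
det = s_oo * s_ll - s_ol ** 2
beta_s = (s_oy * s_ll - s_ly * s_ol) / det
beta_n = (s_ly * s_oo - s_oy * s_ol) / det

# Full convergence rate, computed directly and from the decomposition.
beta = cov(d_y, y) / var(d_y)
beta_from_parts = (beta_s * s_oo + beta_n * s_ll + (beta_s + beta_n) * s_ol) / var(d_y)
print(beta, beta_from_parts)
```

The two quantities agree up to floating-point error because the decomposition is an algebraic identity in sample moments, not an approximation.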


Chapter 2

Thrivers and Divers: Using Non-Academic Measures to Predict College Success and Failure

2.1 Introduction

In recent decades, college enrollment has increased and both policy makers and parents have continued to emphasize the importance of postsecondary education as a worthy investment. In parallel, more attention is now directed towards helping entrants actually complete their degrees and exit with valuable experience and skills. But despite efforts to increase college support – additional tutoring, counseling, stress management workshops, time management assistance, and other resources – the fraction of students completing a degree remains alarmingly low. Only about half of students who begin a bachelor’s degree in the United States complete it within six years (Symonds, Schwartz and Ferguson, 2011). In Canada, three-quarters complete but many do so with minimum requirements and questionable skill improvement (Arum and Roksa, 2011).1

Understanding what factors can improve college performance predictions would allow administrators to better target students at risk of struggling and identify incoming skills particularly helpful for academic success. Previous research shows that past performance strongly predicts college achievement, which explains why institutions rely on past grades or standardized tests for admission.2 But even for students with similar past grades, a high variance exists in subsequent performance. Similarly, there is considerable variance in high school grades among freshmen at the bottom of the college grade distribution, those most at risk of failing to graduate. Of the students who perform well enough in high school to make it to selective postsecondary institutions, a substantial fraction end up struggling and eventually drop out. Transitioning from high school to college can be challenging and success in one level of education does not guarantee success in another.3 Navigating this new environment with ease may require more than strong academic capabilities – students may have to “become new kinds of learners” (Farrington et al., 2012). Hence, when it comes to predicting who among admitted students with similar grades will eventually ‘thrive’ and who will ‘dive’, we need to look beyond their high school academic performance. College students arrive from an increasing variety of backgrounds with different initial abilities, hopes, goals, and expectations, all of which may influence the degree of ease with which they transition from high school to college. For example, Bound, Lovenheim and Turner (2010) show that a third of the decline in completion rates in recent decades can be explained by a surge in the fraction of students with weaker preparation.

1 Bound and Turner (2011) and Bound, Lovenheim and Turner (2010) discuss recent trends in the US, and Childs, Finnie and Martinello (2016) provide an analysis of Canadian trends.

2 Bettinger, Evans and Pope (2013); Dooley, Payne and Robb (2012); Cyrenne and Chan (2012); Rothstein (2004).

3 For example, Scott-Clayton, Crosta and Belfield (2014) discuss how the difficulties associated with identifying at-risk students generate substantial mis-assignment of students to remedial classes.

Recent research on non-academic factors suggests that variables aside from past grades may help identify students who are at risk of floundering in college and those who are likely to succeed. There is ample evidence that these skills, particularly personality traits and social background, exhibit substantial predictive power for a variety of life outcomes such as educational attainment, earnings, and health.4 Conscientiousness – a personality trait associated with staying organized, working hard, and persistence – is positively associated with educational achievement independent of intelligence.5 Gritty students, who persevere towards achieving particular goals, tend to have higher college GPAs than their peers even after conditioning on SAT scores (Duckworth et al., 2007). Work by Mischel, Shoda and Rodriguez (1989) and Kirby, Winston and Santiesteban (2005) suggests that the ability to delay gratification also predicts future achievement.

In this paper, we collect a comprehensive set of non-academic characteristics for a large sample of incoming college freshmen from various backgrounds to explore which measures best predict success and failure in first year that could not have been foreseen on the basis of past grades.6 More specifically, we focus on the incremental predictive power of these measures by first absorbing the variation in college grades explained by high school grades, thereby accounting for any correlation between non-academic variables and past grades. We depart from the previous literature in focusing our attention on the highly informative, yet understudied groups of student outliers – those who end up in the bottom and top deciles in our sample, in terms of the difference between actual performance and predicted performance based on high school grades. We call students in the top decile thrivers, and those in the bottom, divers. Thrivers and divers are opposite extremes, making it easier to examine key differences in initial characteristics relative to the rest of the student population. Examining them in isolation helps avoid measuring small linear relationships from the majority of students ‘in the middle’ and allows for asymmetries between outliers. That a typical B student obtains a GPA of B+, or that a former high school valedictorian receives a GPA of A-, is not out of the ordinary. Isolating these outlier groups, which are of particular interest in their own right, sharpens the search for non-academic predictors of a successful or failed transition to college. In particular, our data indicate that divers are four times more likely to drop out after a year than the average, and therefore fall in the category of students often implicitly targeted by support interventions. Focusing on them may help administrators better understand how to avoid pitfalls and promote environments for helping incoming college students.
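The classification into thrivers and divers can be sketched in a few lines. This is a simplified illustration and not the paper's specification: it assumes that predicted performance comes from a simple linear regression of college grades on a single high school grade, and the function names and toy grades are hypothetical.

```python
def ols_residuals(x, y):
    """Residuals from a simple linear regression of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - slope * mx
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def classify(hs_grades, college_grades, share=0.10):
    """Label the top `share` of residuals 'thriver' and the bottom `share` 'diver'."""
    resid = ols_residuals(hs_grades, college_grades)
    order = sorted(range(len(resid)), key=resid.__getitem__)  # lowest residual first
    k = int(len(resid) * share)
    labels = ["middle"] * len(resid)
    for i in order[:k]:
        labels[i] = "diver"
    for i in order[len(order) - k:]:
        labels[i] = "thriver"
    return labels


# Toy data: college grades track high school grades, except for two outliers.
hs = list(range(70, 90))
college = [g - 10 for g in hs]
college[3] += 15    # far exceeds the prediction
college[12] -= 15   # falls far short of the prediction

labels = classify(hs, college)
print(labels[3], labels[12], labels.count("thriver"), labels.count("diver"))
```

Ranking on residuals rather than raw college grades is what separates a thriver from a student who simply arrived with high grades.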

Our approach offers a new, innovative and low-cost way of collecting both quantitative and qualitative data from large samples of students with near-perfect consent rates. Our data come from partnering with all first-year economics instructors at the three campuses of the University of Toronto and asking students to complete an online ‘warm-up exercise’ for 2 percent of their final grade, as part of a broader research program (Oreopoulos and Petronijevic, 2016). Over 45 to 90 minutes during the first weeks of school, the participating students completed survey questions about procrastination, study habits, social identity, and academic expectations, and agreed to link their responses to the university’s administrative database of background characteristics and future academic performance.7 A subset of our data allows us to explore a wide variety of non-academic characteristics, including grit, risk aversion, time preferences, locus of control, as well as the Big Five personality traits: agreeableness, conscientiousness, extraversion, openness to experience, and emotional stability. Our sample is also very large, allowing sufficient statistical power to detect even small differences between performance groups. We explore what variables best predict first-year performance, both unconditionally as well as when conditioning on all other predictors.8 The exercise does not attempt to uncover causal estimates, but rather to document the independent and incremental predictive properties of a large number of characteristics, above and beyond what could be expected on the basis of high school grades. All relationships between these non-academic variables and college grades are estimated on the same sample, ensuring that the set of controls is consistent and that coefficients’ magnitudes are directly comparable.

4 Kautz et al. (2014); Almlund et al. (2011); Borghans et al. (2008); Roberts et al. (2007); Heckman, Stixrud and Urzua (2006).

5 Burks et al. (2015); Almlund et al. (2011); Komarraju, Karau and Schmeck (2009); Poropat (2009); O’Connor and Paunonen (2007); De Fruyt and Mervielde (1996).

6 We use the expression ‘non-academic’ to refer to any variable that is not an explicit measure of academic performance, such as grades or test scores. This broad category therefore includes measures often labelled as non-cognitive or soft skills, but also demographic information.

We find that objective and subjective measures of procrastination and impatience are the best predictors of failing to keep up with grade expectations. Whether or not we condition on other traits or characteristics, students who report that they tend to cram for exams, wait until the last minute to meet deadlines in general, or even waited until the last minute to complete the survey from which we collected our data are much more likely to end up in the lowest or second lowest grade decile relative to expectations. Poor performers also tend to work many more hours for pay than their peers and are less conscientious on average. These patterns are not the same for thrivers. The best predictors of far exceeding grade expectations are self-reported intended hours of study and expected grades. Students who expect higher grades tend to get them, and thrivers plan to study over three hours a week more than divers do, on average.

Another subset of students was asked to write freely about their future goals, anticipated setbacks, and mindset. Examining their answers offers the opportunity to vastly expand the set of potential predictors beyond those explicitly measured by questionnaires reflecting researchers’ priors. We find that thrivers and divers answered these open-ended written questions differently. Thrivers write longer answers and use better spelling than divers, and are also more likely to identify self-discipline as a trait they admire in themselves. In addition, when asked to identify future goals, thrivers are more likely to discuss the impact they want to make on society, while divers are more likely to emphasize wanting to ‘get rich’.

These findings have both theoretical and practical implications. A better understanding of thecharacteristics of student outliers informs us about the shape of the college education production function.Even among those who were admitted to the University of Toronto, several noncognitive skills sharplydistinguish divers from other students. Accounting for the skills we measure increases the explanatorypower in predicting performance over the full distribution compared to using only past performance alone,but high school grades remain the single best predictor of college grades. Also, skills that characterizestudents who are the most successful in their transition to college are not necessarily the ones that

7 The consent rate was 97 percent.

8 Access to administrative files will allow us to consider other outcomes such as persistence and academic performance in future work.


divers lack. On practical grounds, this paper highlights some specific skills that educational policies might target. The abilities to persist, to self-regulate and to set high expectations for oneself all contribute to reducing the risk of struggling in higher education. Our findings also motivate further research on policies that could limit the negative effects of behaviors shared by most divers, such as increasing the frequency of deadlines to mitigate procrastination. By helping characterize the profile of students who are exceptionally poor or exceptionally good at transitioning to college, this research may also prove useful for catching students before they run into difficulty, and for advising students about how to excel in school.

The paper proceeds as follows. First, we briefly review the existing literature on predictors of college success in Section 2.2. Section 2.3 explains the data collection process and the institutional environment and provides an overview of our estimation samples. The methodology is presented in Section 2.4 and results are displayed in Section 2.5. In Section 2.6, we combine the best predictors into a uni-dimensional “at-risk” factor, and document its predictive power over the full distribution of grades, as well as for more policy-relevant extreme negative outcomes. We then benchmark the predictive properties of our simple summary measure against machine learning results that let a computer algorithm choose the best predictors and find the weights on them that maximize predictive power. Section 2.7 concludes with a discussion of the policy implications of this research.

2.2 Background

Social scientists increasingly stress the importance of noncognitive abilities for a host of socioeconomic outcomes. Both in the labor market and in school, the explanatory power of personality traits and personal preferences is comparable to or greater than that of cognitive abilities (Almlund et al., 2011). In a similar vein, successful childhood interventions that have long-term impacts on adult outcomes often show no persistent effect on cognitive skills while significantly improving children’s non-academic skills (Kautz et al., 2014; Chetty et al., 2011). Grades in high school as well as in college partly reflect both the cognitive and noncognitive abilities of students.

The emphasis on personality traits and other non-academic measures as determinants of educational success has a long tradition in the fields of education and psychology.9 In recent decades, the emergence of the Big Five dimensions of personality as a broadly accepted general taxonomy (John, Naumann and Soto, 2008), along with an increasing interest in motivational theories (Robbins et al., 2004), generated a substantial amount of research on the incremental effects of personality and individual goals on college success over that of standard predictors such as standardized tests (Conard, 2006). The number of noncognitive measures that have been found to correlate significantly with college GPA is large. Yet, it remains unclear which of them, or which set of them, constitutes the best predictors of success in college, since few studies consider a broad selection of predictors simultaneously and many distinct measures overlap considerably, both conceptually and empirically. The lack of a thorough evaluation of how different measures used in separate literatures are related has made it difficult to integrate independent findings.

For example, conscientiousness10 and grit11, which have been the focus of most personality research, are both strong predictors of postsecondary education performance, but recent evidence suggests that the latter might be a facet of the former (Credé, Tynan and Harms, 2016; Dumfart and Neubauer,

9 Willingham (1985) provides an excellent overview of the early work on the topic.

10 Burks et al. (2015); Komarraju, Karau and Schmeck (2009); Poropat (2009).

11 Duckworth et al. (2007).


2016). In parallel, the literature on motivational theories has emphasized the importance of goals and beliefs about performance. The most comprehensive meta-analytic reviews in psychology and education research generally find that academic self-efficacy (the belief in one’s capability to succeed academically) and grade goals exhibit the strongest correlations with college GPA (Richardson, Abraham and Bond, 2012; Robbins et al., 2004).12 More recently, researchers in economics of education have emphasized the role of time preferences as important inputs in schooling decisions and in the educational production function.13

These separate branches of research in education have yet to integrate findings from one another. Our paper casts a wider net by considering multiple predictors from all three fields simultaneously, notably including standard personality constructs, measures of motivational factors previously found to be good predictors of college GPA such as locus of control and grade expectations, as well as economic preference parameters. We further broaden the set of predictors by moving beyond traditional questionnaire-based measures through text analysis, and complement our examination with machine learning techniques.

2.3 Data

Our data come from an online exercise completed by first-year economics students at all three campuses of the University of Toronto. While more than half of the university’s student population attend the main campus, over 25,000 students are registered at two smaller satellite campuses to the west and east of downtown, both about 20 miles away. These campuses receive more commuter students than the main campus and have different admission requirements. The downtown campus is perceived as more elite, whereas the satellite campuses resemble other smaller institutions across Ontario. As a result, the university’s student population comes from a very diverse set of academic backgrounds.

Early in the 2015 Fall semester, all undergraduate students enrolled in an introduction to economics course (approx. 6,000) across all three campuses were asked to participate in an online “warm-up” exercise. The nature of the exercise varied randomly across students: some were asked to complete a comprehensive personality test while others were assigned a goal-setting program which asked them to write freely about their future goals. Each group was shown a short video created to introduce the purpose of the program and key take-away points. Beforehand, students were required to fill in a brief survey and were asked for consent to work with their administrative data (97 percent agreed). Completion of this one- to two-hour exercise counted for 2 percentage points of their overall grade in the course.14

The group of students who took part in the program represents about a third of all first-year students enrolled at this university, and almost 10% of the entire undergraduate student population.15 Linked administrative variables include gender, citizenship, registration status, GPA, all courses taken and grades received at this postsecondary institution and, for the majority of students, the high school performance measure used for admission to Canadian universities (the admission grade).16 In the analyses below,

12 While these meta-analyses consider many characteristics as predictors, the underlying studies rarely do, plausibly introducing bias. Our setup overcomes this methodological drawback.

13 Lavecchia, Liu and Oreopoulos (2016); Cadena and Keys (2015); Burks et al. (2015).

14 The warm-up exercise was set up, in part, to test the effectiveness of new online and text-based approaches for providing student support. For more information about the experimental design, we refer readers to Oreopoulos and Petronijevic (2016).

15 Introduction to Economics is an extremely popular course. Many students in fields other than business or economics take this course as an elective. The sample also includes students who enrolled but dropped the course later in the semester.

16 This corresponds to the student’s average of her best six grades for a standardized set of high school courses taken


we restrict our estimation sample to full-time students for whom we have this measure of high school achievement (77 percent of the sample).

The set of variables that was collected as part of the survey from all students contains detailed background characteristics such as international student status and parental education, as well as a large set of other measures of noncognitive skills, in particular reports of study habits and subjective expectations. Survey questions are presented in the Online Appendix.

For a 30 percent random subsample of students (henceforth the personality sample), we collected additional data on a large array of traditional personality traits and economic preference measures as part of the online exercise. These include self-assessed propensity to procrastinate and summary measures of perseverance of effort and consistency of interest, two latent factors loading onto the construct of grit (Duckworth and Quinn, 2009). Two complementary measures of each Big Five trait were also constructed: an absolute measure obtained by implementing the Likert-scale Mini-IPIP questionnaire (Donnellan et al., 2006), and a relative-scored ipsative measure. The ipsative measure indicates the extent to which a given trait is dominant in one’s personality profile relative to other traits. This relative-scored method is known to be more resistant to biased responding (Hirsh and Peterson, 2008).17

We also assess students’ level of tolerance for risk using both a simple survey question and a series of hypothetical choices between a lottery and a certain amount of money (Dohmen et al., 2011, 2010). Finally, we elicit time preferences using lists of hypothetical choices between an amount of money paid at some early point in time and a larger amount received later (Dohmen et al., 2010; Andersen et al., 2008).18

The first column of Table 2.1 shows descriptive statistics for all students included in the personality sample for whom the admission grade is non-missing.19 The average admission grade is 87 percent, with the majority of students scoring above 80.20 The summary statistics for demographic variables underline the sample’s diversity. Roughly half the students have a mother tongue other than English and a citizenship other than Canadian, and a third self-report as international students.21 Approximately 53 percent are women, and 81 percent started their first year of university in the Fall of 2015. More than 40 percent of our sample intends to major in a field other than economics or business (the two programs for which the introduction to economics course is required). Only 25 percent are first-generation college students (i.e. neither of their parents is college-educated).

There is substantial variation in average first-year college grades. The mean is 66 percent with a standard deviation of 13 percentage points, almost three times larger than the standard deviation of admission grades.22 Of particular interest is the fact that students with the lowest college grades are not

by all students in the province of Ontario. Admission to postsecondary education in Ontario is based solely on academic performance. There is no admission criterion, implicit or explicit, based on personal characteristics such as race, ancestry, ethnic origin, sex or age.

17 The relative-scored measure combines rank-order and forced-choice approaches. The main drawback to this approach is that relative-scored traits are negatively correlated with each other by construction.

18 It must be noted that skipping questions was not permitted. Interested readers will find the personality test questions in the Online Appendix.

19 Admission grades are more likely missing for transfer, non-traditional and international students.

20 In terms of high school performance, our sample is reasonably close to the provincial average for those enrolling in university. The most recent application data from the Council of Ontario Universities (2014) indicate that the secondary school average of Full-Time, First Year students at the University of Toronto is 85.9%. The average across Ontario universities is 83.4%, with some institutions reporting entering average grades above 86%.

21 In practice, domestic students are those with either a Canadian citizenship or a Permanent Resident status.

22 By construction, the distribution of admission grades we observe is truncated at the bottom. It does not reflect the full distribution of potential applicants as it only includes enrollees. This restriction of range raises methodological issues if one tries to extrapolate the relationship between past grades and college grades to non-enrolled students (Rothstein, 2004). Our objective in this paper is not to inform admission policy and the interpretation of our results is independent


systematically the ones with the lowest high school grades. The large variance of college grades around high school grades is shown in Figure B.1 in the Online Appendix.

In terms of study habits, students expect to study for approximately 18 hours per week on average and work at a paid job for less than 8 hours per week. Students come in with high expectations: approximately 63 percent intend to eventually pursue graduate studies,23 and the average expected GPA is 3.6, more than one grade point above the actual first-year mean GPA (2.3), a difference greater than a full standard deviation. In addition to these subjective expectations, we also consider an objective measure of procrastination, which is the number of days between the first day of class and the time a student started the online survey for this study. Students were encouraged to complete the task early, before being burdened with other homework, and were given a two-week deadline. On average, four days passed between the beginning of classes and the moment students started the survey, with about half the sample registering within 2 days, but a fifth of students waiting more than a week.

In complementary analyses, we focus on a separate 50% random subsample of students (henceforth the text sample) who were asked to answer open-ended questions such as “describe what kind of person you want to become later on in life”.24 The qualitative answers to each of these questions provide sufficient information to analyze whether outliers tend to discuss different topics than other students when they are allowed to choose what to write about. Students were prompted to take their time and take the exercise seriously because it was intended for their benefit. Some questions had word-count and time constraints, and students who tried to complete a question before these constraints were lifted were shown a friendly message of encouragement. The large majority of students wrote in detail, with emotion, clarity and personal insight.

2.4 Methodology

2.4.1 Defining outliers

Admission to college generally relies on standardized tests or high school grades. Yet, substantial variation in freshman performance around past grades remains. High school GPA alone is not sufficient to predict which students are the most likely to struggle and eventually drop out of college. The methodology developed below aims at exploring whether adding more variables is useful for improving predictions of these extreme outcomes. To emphasize the incremental predictive power of non-academic characteristics, we focus on the part of college grades that cannot be expected on the basis of past grades. It must be noted that high school grades partly reflect both cognitive and noncognitive skills. As a result, controlling for past academic performance absorbs part of the total contribution of our non-academic measures in explaining the raw variation in college grades. This approach aims at improving our predictions of successful and failed transitions, and helps us understand what makes divers and thrivers different from other students.25

To identify students who perform unusually above or below expectations, we first residualize college grades on past performance. More specifically, we extract the portion of college grades that is linearly

of restriction of range issues.

23 In comparison, only 20% of the university’s student population is enrolled in a graduate program.

24 50% of first-year students and 70% of upper-year students were randomly assigned to a goal-setting exercise. The proportion of first-year/upper-year students was unknown prior to assignment. Overall, about 53% of students who took part in the warm-up exercise were assigned to the goal-setting exercise. By construction, the personality sample and the text sample are mutually exclusive.

25 See Farrington et al. (2012) for an extensive discussion of transition points in education.


predicted by past grades and a set of background characteristics ($\kappa_{ics}$) by estimating the following equation:

$$\text{CollegeGrade}_{ics} = \alpha_0 + \alpha_1 \text{HSGRADE}_{ics} + \alpha_2 \kappa_{ics} + \delta_c + \delta_s + \varepsilon_{ics} \tag{2.1}$$

where $\text{CollegeGrade}_{ics}$ is the credit-weighted first-year average college grade of student $i$ who started college in semester $s$ and at campus $c$, and $\text{HSGRADE}_{ics}$ is her high school average used for admission. Campus fixed effects ($\delta_c$) are included to take into account differences in admission criteria across campuses, as well as any discrepancy in grading practices. Upper-year students included in our sample are more likely to be enrolled in STEM programs and to take introduction to economics as an elective than are first-year students. Therefore, cohort fixed effects ($\delta_s$) are added to the model. We estimate the model separately for the personality sample and the text sample.

Figure 2.1 plots residualized college grades against admission grades for the personality sample. In both dimensions, we highlight students who belong to either the top or the bottom decile of the distribution. The vast majority of students who perform significantly above expectations (groups 1, 4 and 7 on the figure) or below expectations (groups 3, 6 and 9) come from the middle of the admission grade distribution. Put differently, students who thrive are not simply students who were already expected to do well and did even better, and students who dive are not merely students who were expected to have relatively low grades and did even worse, nor students expected to do exceptionally well but who instead regressed towards the mean. In fact, the performance gap between the two outlier groups is colossal: divers’ average first-year college grade is 40, and thrivers’ is 81. In our main specifications we define the two groups of students who rank in the top and bottom deciles of the distribution of $\varepsilon_{ics}$ as thrivers and divers, respectively. We explore the robustness of our results with respect to the definition of divers and thrivers in Section 2.5.2.
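The residualize-then-rank procedure can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it regresses college grades on the admission grade alone, omitting the background characteristics and the campus and cohort fixed effects of equation (2.1), and the function names (`ols_residuals`, `label_outliers`) and the toy data are hypothetical rather than taken from the paper.

```python
# Illustrative sketch only: toy data and all names below are hypothetical.

def ols_residuals(y, x):
    """Return residuals from an OLS regression of y on x with an intercept."""
    n = len(y)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


def label_outliers(residuals, share=0.10):
    """Label the bottom `share` of residuals 'diver' and the top `share` 'thriver'."""
    n = len(residuals)
    k = int(n * share)  # number of students in each tail
    order = sorted(range(n), key=lambda i: residuals[i])  # lowest residual first
    labels = ["other"] * n
    for i in order[:k]:
        labels[i] = "diver"
    for i in order[n - k:]:
        labels[i] = "thriver"
    return labels


# Toy data: 20 students with admission grades 80..99 (invented for illustration).
hs_grade = [80 + i for i in range(20)]
shock = [0.0] * 20
shock[3], shock[7] = -25.0, -15.0   # two students far below expectations
shock[12], shock[16] = 12.0, 18.0   # two students far above expectations
college_grade = [20 + 0.5 * g + s for g, s in zip(hs_grade, shock)]

residuals = ols_residuals(college_grade, hs_grade)
labels = label_outliers(residuals)
print([i for i, lab in enumerate(labels) if lab == "diver"])
print([i for i, lab in enumerate(labels) if lab == "thriver"])
```

Ranking on the residual rather than on the raw college grade is what allows a student from the middle of the admission grade distribution to be classified as an outlier.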

2.4.2 Differences in quantitative non-academic measures

The main exercise we undertake compares the distributions of a large set of non-academic measures for the two outlier groups relative to the full sample. Unconditional mean differences for each characteristic $x \in X$ are obtained from the following regression:

$$x_i = \gamma_1 D_i + \gamma_2 T_i + u_i \tag{2.2}$$

where $x_i$ is a non-academic measure, $D_i$ is a dummy for diver status and $T_i$ is a dummy for thriver status. To ease the interpretation of the results, each non-binary individual characteristic of interest is standardized with mean zero and unit variance. For continuous predictors, the coefficients of interest, $\gamma_1$ and $\gamma_2$, indicate the difference in mean for each outlier group relative to the main distribution, in standard deviation units.26 Correspondingly, binary measures are centered such that their mean is zero and the estimated coefficients reflect the percentage point difference in the fraction of thrivers or divers who exhibit the characteristic of interest relative to the main sample.
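Because the diver and thriver dummies are mutually exclusive and equation (2.2) has no constant, the OLS coefficients reduce to the within-group means of the standardized characteristic. A minimal sketch of that calculation follows. The function names and the toy data are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which variant is used.

```python
from statistics import mean, pstdev

# Illustrative sketch only: toy data and all names below are hypothetical.


def standardize(values):
    """Rescale to mean zero and unit variance (population s.d., an assumption)."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def outlier_mean_differences(x, labels):
    """Return (gamma_1, gamma_2) for x_i = gamma_1*D_i + gamma_2*T_i + u_i.

    With no constant and mutually exclusive dummies, each coefficient is the
    mean of the standardized characteristic within the corresponding group.
    """
    z = standardize(x)
    gamma_1 = mean([zi for zi, lab in zip(z, labels) if lab == "diver"])
    gamma_2 = mean([zi for zi, lab in zip(z, labels) if lab == "thriver"])
    return gamma_1, gamma_2


# Toy self-reported cramming scores for ten students.
cram_score = [5, 4, 1, 2, 3, 3, 2, 4, 3, 3]
labels = ["diver", "diver", "thriver", "thriver"] + ["other"] * 6

gamma_1, gamma_2 = outlier_mean_differences(cram_score, labels)
print(round(gamma_1, 3), round(gamma_2, 3))
```

In this toy example divers sit above the sample mean and thrivers below it, so the two coefficients have opposite signs and are read directly in standard deviation units.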

As discussed in Section 2.2, there is substantial conceptual overlap between different non-academic

26 The coefficients are relative to the full distribution since the model does not include a constant.


constructs. To find which of these measures are the best predictors of success and failure in transitioning to college, we assess whether the mean differences remain significant when using only variation in the distribution of a given characteristic that is unexplained by other predictors. These conditional differences are calculated in two steps. First, we residualize each characteristic $x$:

$$x_i = a + b X_{-x,i} + v_{x,i} \tag{2.3}$$

where $X_{-x}$ is the subset of $X$ that excludes characteristic $x$. Then, differences in means of residualized characteristics are obtained by substituting $v_{x,i}$ for $x_i$ in equation (2.2). This strategy amounts to comparing the outlier distributions with the main distribution using only the fraction of the variation in a given construct that is orthogonal to other non-academic measures.27
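The two-step logic can be illustrated with a deliberately reduced example in which a characteristic is residualized on a single other predictor, whereas the procedure in the text conditions on all other predictors at once. The variable names and the toy data are hypothetical.

```python
# Illustrative sketch only: one conditioning variable instead of the full set,
# with invented data and hypothetical names.


def residualize(x, w):
    """Return residuals v from the regression x_i = a + b*w_i + v_i."""
    n = len(x)
    mean_x, mean_w = sum(x) / n, sum(w) / n
    sww = sum((wi - mean_w) ** 2 for wi in w)
    swx = sum((wi - mean_w) * (xi - mean_x) for wi, xi in zip(w, x))
    b = swx / sww
    a = mean_x - b * mean_w
    return [xi - (a + b * wi) for xi, wi in zip(x, w)]


def group_mean(values, labels, group):
    """Mean of `values` among observations whose label equals `group`."""
    selected = [v for v, lab in zip(values, labels) if lab == group]
    return sum(selected) / len(selected)


# Toy data: standardized conscientiousness and a cramming score for 8 students.
conscientiousness = [1.0, 0.5, 0.8, -0.2, 0.1, -0.9, -0.4, -0.9]
cramming = [1, 2, 3, 4, 5, 6, 7, 8]
labels = ["thriver", "other", "other", "other", "other", "diver", "other", "diver"]

# Step 1: keep only the variation in conscientiousness not explained by cramming.
v = residualize(conscientiousness, cramming)

# Step 2: compare divers with the full sample, before and after conditioning.
unconditional = group_mean(conscientiousness, labels, "diver")
conditional = group_mean(v, labels, "diver")
print(round(unconditional, 3), round(conditional, 3))
```

In this toy data the diver gap shrinks once cramming is partialled out, which is the kind of attenuation the conditional columns are designed to reveal.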

Figure 2.2 illustrates the nature of the comparison exercise. In panel A we show the unconditional distributions of (relative-scored) conscientiousness for thrivers, divers and the full personality sample. Divers are considerably less conscientious than average (0.26 standard deviations below the sample mean). This pattern is not symmetric: on average, thrivers are just as conscientious as others. Conditional differences are presented in panel B, where each density plot shows the distribution of residual conscientiousness that is unaccounted for by variation in other non-academic measures. The mass of divers with very low conscientiousness (around -2 s.d.) observed in the unconditional distribution is explained by other predictors, but the conditional mean difference between divers and the full sample remains substantial. Figures B.2 and B.3 show similar density plots for other non-academic measures.

2.4.3 Text analysis

Students in the text sample were asked a series of open-ended questions, such as “Name at least one thing that you admire about yourself”. This type of question allows students much more freedom to answer, so individual answers are often very informative. For instance, answers are not restricted to a set of goals pre-selected by the researcher, but rather include any goals students may have. However, aggregating the results over all students in a meaningful way is a challenge. We use two techniques to quantify the writing, one evaluating effort and writing quality, the other analyzing which topics students choose to write about.

There are three measures of effort and writing quality. Firstly, the programming of the survey website allows us to measure how many seconds each student takes to answer a given question. Secondly, we count the number of words each student uses for each answer, where a word is defined as one or more characters separated by one or more spaces. Finally, we run each of these words through the Microsoft Word Canadian English spellchecker, and calculate the proportion of words which are spelled correctly.28

These variables are taken as measures of conscientiousness and language ability and analyzed using the method described in Section 2.4.2.
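The word count and the spelling measure can be sketched as follows. The small `DICTIONARY` set is a hypothetical stand-in for the Microsoft Word spellchecker, which is not reproduced here, and stripping punctuation and lowercasing before the lookup is an assumption of this sketch.

```python
# Illustrative sketch only: DICTIONARY is a toy stand-in for a real spellchecker.
DICTIONARY = {"i", "admire", "my", "discipline", "and", "course", "load"}
PUNCTUATION = ".,;:!?\"'()"


def split_words(answer):
    """Split on runs of spaces: a word is one or more characters between spaces."""
    return [w for w in answer.split(" ") if w]


def share_spelled_correctly(answer, dictionary):
    """Proportion of words found in `dictionary` (0.0 for an empty answer)."""
    words = split_words(answer)
    if not words:
        return 0.0
    cleaned = [w.strip(PUNCTUATION).lower() for w in words]
    correct = sum(1 for w in cleaned if w in dictionary)
    return correct / len(words)


answer = "I admire my  disipline and my course load."
word_count = len(split_words(answer))
spelling_share = share_spelled_correctly(answer, DICTIONARY)
print(word_count, spelling_share)
```

The double space in the toy answer does not create an empty word, and the misspelled “disipline” is the only token counted as incorrect.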

We also compare the topics that divers and thrivers discuss in their answers using a simplified topic modelling text analysis approach. In topic modelling, it is assumed that an author makes a series of

27 This is similar but not numerically equivalent to including all other characteristics as controls in equation (2.2). Results are not sensitive to this choice of methodology. See Online Appendix Table B.5.

28 Note that this is a noisy measure of spelling quality. If a student’s misspelling of a word is a correct spelling of another word (for example, “coarse” for “course”), it will count as a correct spelling. On the other hand, some widely acceptable abbreviations, such as GPA, are not recognized by the spellchecker and are counted as incorrect spellings.


decisions about which topics to discuss. Each topic then maps to a series of words (Blei, Ng and Jordan, 2003; Hofmann, 2000). For example, discussion of procrastination might use words like “procrastinate”, “cram”, or “all-nighter”. The researcher measures the amount of space devoted to a topic by comparing the frequency of words across documents. If one author, or group of authors, uses a particular set of words more often, it is assumed that they devote a higher proportion of their documents to topics related to those words.

Given the sample size and the fact that students often give brief responses, we adopt a very simple method to apply this approach. Firstly, we clean the students’ answers to generate more meaningful results with the following rules. If a word was spelled incorrectly according to the Word spellchecker, we replace it with Word’s top suggestion for a replacement. These words are then stemmed, to remove grammatical constructions such as pluralisation and verb tenses. This ensures that words such as “class” and “classes” are treated as identical. Finally, we remove stopwords, which are short, common words such as “and” or “the”.
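The cleaning rules above can be sketched as a small pipeline. Everything in it is a toy stand-in: `CORRECTIONS` plays the role of the spellchecker’s top suggestion, `stem` handles plural endings only (a full stemmer would also handle verb tenses), and `STOPWORDS` is a short hypothetical list.

```python
# Illustrative sketch only: the word lists and the stemmer are toy stand-ins.
STOPWORDS = {"i", "my", "and", "the", "a", "to", "of", "in"}
CORRECTIONS = {"clases": "classes"}  # hypothetical top spellchecker suggestions
PUNCTUATION = ".,;:!?\"'()"


def stem(word):
    """Strip plural endings only (a small subset of what a full stemmer does)."""
    if word.endswith("sses"):
        return word[:-2]      # classes -> class
    if word.endswith("ss"):
        return word           # class -> class
    if word.endswith("s"):
        return word[:-1]      # goals -> goal
    return word


# Stopwords are stemmed too, so they can be matched after stemming.
STOP_STEMS = {stem(w) for w in STOPWORDS}


def clean_answer(answer):
    """Correct spelling, stem, then drop stopwords, in that order."""
    tokens = [t.strip(PUNCTUATION) for t in answer.lower().split()]
    tokens = [CORRECTIONS.get(t, t) for t in tokens if t]
    stems = [stem(t) for t in tokens]
    return [s for s in stems if s not in STOP_STEMS]


cleaned = clean_answer("I enjoy my clases and the class discussions.")
print(cleaned)
```

After cleaning, the misspelled “clases” and the word “class” map to the same token, which is the property the stemming step is meant to guarantee.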

For each word in the cleaned text, we calculate the proportion of students who use the word to answer a given question among divers, thrivers, and in the entire sample. A chi-squared test comparing the share of divers who use a word with the share of the entire sample shows if low performing students are more likely than others to use a given word. If many of the words used more often by divers are related to a given topic, the intuition of the topic modelling approach suggests that divers are more likely to spend more space discussing that topic.
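One way to carry out such a word-level comparison is a Pearson chi-squared test on a 2×2 table of group by word use. The sketch below compares divers with all other students, which is an assumption made so that the two groups are disjoint; the text does not specify the exact form of the test, and the counts are invented.

```python
from math import erfc, sqrt

# Illustrative sketch only: the counts are invented, and comparing divers with
# all non-divers (rather than with the full sample) is an assumption.


def chi2_word_test(divers_using, n_divers, others_using, n_others):
    """Pearson chi-squared test (1 d.f., no continuity correction) on a 2x2 table.

    Returns the test statistic and its p-value.
    """
    a, b = divers_using, n_divers - divers_using   # divers: use / do not use
    c, d = others_using, n_others - others_using   # others: use / do not use
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, 1.0  # degenerate table: no evidence of a difference
    chi2 = n * (a * d - b * c) ** 2 / denom
    p_value = erfc(sqrt(chi2 / 2))  # upper tail of chi-squared with 1 d.f.
    return chi2, p_value


# Toy counts: 40 of 100 divers use a word, against 180 of 900 other students.
chi2, p_value = chi2_word_test(40, 100, 180, 900)
print(round(chi2, 2), p_value < 0.001)
```

Because one such test is run for every word in the vocabulary, some small p-values are expected by chance, which is a reason to look for clusters of related words rather than isolated ones.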

2.5 Results

2.5.1 Predicting college grades using past academic achievement

Estimates of the relationship between past academic performance and college grades (equation (2.1)) are shown in Table B.1 of the Online Appendix. A one standard deviation higher admission grade is associated with a 0.41 to 0.43 standard deviation higher first-year average college grade. Older students and non-domestic students receive lower grades in college than do younger and domestic students with equivalent admission grades. While past measures of academic performance do predict success in college, the explanatory power of this model is modest. When no demographics are included, less than 20% of the observed variation in college grades is explained by admission grades, in line with previous findings (Stephan et al., 2015; Bettinger, Evans and Pope, 2013; Richardson, Abraham and Bond, 2012). The inclusion of age at entry and non-domestic student status adds some explanatory power, but more than three quarters of the variation in college grades remains unexplained.29 We next explore which non-academic characteristics best characterize outliers relative to the main distribution.

2.5.2 Predicting Student Outliers with Non-Academic Outcomes

Columns (1) and (3) of Table 2.2 report average unconditional deviations from the sample mean for divers and thrivers, respectively. For each possible predictor, columns (2) and (4) report deviations from the

29 Adding polynomials of admission grades does not affect subsequent results. Similarly, including high school fixed effects (unreported) does not qualitatively affect most of our conclusions, but comes at a high cost in precision since high school identifiers are only observed for domestic students, and the sample contains multiple high schools from which only one student is observed. Also, our results are robust to further including high school grades in mathematics and English as additional covariates.


mean conditional on all other predictors listed in the table. In the last two columns, we test whether the difference between the top and bottom outliers for each non-academic measure is significantly different from zero.

Relative to the full distribution, students who perform far below expectations are much more likely to self-report that they cram for exams (0.30 s.d. above the mean), much more likely to start the online survey later (0.29 s.d. above the mean) and tend to work many more hours at paid jobs (0.22 s.d. above the mean). They are also significantly less conscientious (0.26 s.d. below the mean) and more impatient than their peers (0.2 s.d. above the mean), consistent with prior evidence (Burks et al., 2015). Even conditional on other predictors, most of these patterns remain strong and statistically significant. Being sure about one’s major and intending to pursue graduate studies have little explanatory power, and, if anything, divers are more likely to say they often think about the future. We interpret these results as evidence that students who perform significantly below expectations are neither lacking ambition nor vision, but tend to put themselves in situations that hinder their academic success.

Thrivers are not the mirror image of divers; they are no less likely to cram for exams or to work many hours for pay than the average student. However, they tend to study for relatively more hours (0.22 s.d. above the mean), and expect a higher GPA than divers (difference of 0.23 s.d.). We find that thrivers are more introverted than divers (unconditional difference of -0.27 s.d.), but that the conditional difference is not statistically significant.30 Relative to the full distribution, thrivers are more risk averse, but this difference is mostly accounted for by variation in other characteristics. Students who excel above expectations do not report finding the transition to university any less challenging than the average student does, and intend to pursue graduate studies in the same proportions as average students and divers do.

We find no statistically significant differences between outliers in terms of agreeableness, openness to experience or emotional stability. Similarly, grit (perseverance of effort and consistency of interest) and locus of control do not help predict extreme outcomes. The point estimates for our subjective measure of procrastination indicate that thrivers are less likely to procrastinate than divers, but we cannot reject the null hypothesis of no difference.

Men are overrepresented in both tails of the distribution of college grade residuals: the proportion of women is approximately 10 percentage points lower among divers and thrivers than in the full sample. Previous research has also found that boys exhibit higher variance in test scores than girls (Machin and Pekkarinen, 2008; Hedges and Nowell, 1995). Other demographic characteristics have little or no predictive power.

To evaluate the robustness of these findings, we consider two alternative definitions of outliers in the Online Appendix. In table B.2, divers (thrivers) are defined as students who fall in the bottom (top) 20% of the distribution of college grade residuals. Broadening the groups’ composition improves precision, but may dilute results by including students with less extreme outcomes in the outlier groups. We continue to find that divers are less conscientious and more impatient, more likely to cram for exams and to start the exercise later, and that thrivers study more hours on average. Under this specification, the difference between divers and the full sample in terms of procrastination does reach statistical significance at conventional levels, but differences in gender composition do not. In table B.3, we verify that our results are not driven by students who came in with extraordinarily high or low admission grades by

30 While less common in the literature, this result is not entirely new (O’Connor and Paunonen, 2007; Noftle and Robins, 2007). Chamorro-Premuzic and Furnham (2005) discuss how introverts may have a greater ability to consolidate learning and have better study habits (e.g. spend more time studying than socializing).


restricting the sample to those with past grades in the middle 80% of the distribution.31

Overall, students who perform markedly lower than expected given their past academic achievement are more prone to procrastination, more impatient and less conscientious than the average. In contrast, students who perform significantly better than expected exhibit very few differences with the full sample, with the exception of the number of hours spent studying. This suggests that most measured characteristics conducive to success in college are already reflected through high school grades, but that non-academic measures do help predict negative outcomes that were unexpected on the basis of past performance.

We note that it is unlikely that these results reflect purely transitory phenomena. For instance, while personality traits are not fixed, they are particularly stable over time (Cobb-Clark and Schurer, 2012; Almlund et al., 2011; John, Naumann and Soto, 2008). Also, the students we identified as divers in first year do not appear to catch up with other students in later years: their second-year grades are still more than a full standard deviation below the mean (Figure B.4). In addition, divers were more than four times more likely to have dropped out after the first year than the average student.32

For completeness, table B.4 shows differences between students in the top and bottom of the distribution of college grades that are not adjusted for high school grades, so that students in the right tail also include students who do very well and were expected to do so.33 While results for students in the bottom 10% are essentially the same as for divers, the picture for top students is noticeably different than for thrivers: relative to the main distribution, students with the best college grades are more conscientious and less extroverted, less likely to cram for exams, expect higher GPAs and are significantly less tolerant of risk. Our interpretation is that these characteristics contribute to success both in college and in high school, but cannot explain why some students thrive beyond expectations. However, students who obtain low college grades unconditionally or relative to expectations share the same harmful traits of impatience and lack of conscientiousness.

2.5.3 Text Analysis of Student Outliers

The analysis of open-ended questions yields results that are consistent with the main results. Table 2.3 replicates the methodology in Table 2.2, and shows that divers use fewer words when answering questions and thrivers use more, which suggests that thrivers are providing more detailed and careful answers. Thrivers spend more time answering these questions, although this difference is not significant. Finally, in the unconditional comparisons, thrivers have stronger spelling than the average and divers have weaker spelling. The significance of this result dissipates in the conditional comparisons, which suggests that most of this difference is a function of other non-academic variables. Overall, thrivers appear to put in more effort when answering these open-ended questions, consistent with the finding that they expect to study for more hours than others.

Topic analysis results are shown in Table 2.4. For a selected set of questions, the table shows a word if the difference between the share of thrivers (or divers) who use it and the share of the whole sample is significant at 5%, and if at least 5 thrivers (or divers) use it. These results reinforce the point that conscientiousness is a crucial trait. When asked to identify traits they admired about themselves, thrivers were more likely to use words such as “discipline”, “practice”, or “responsibility”, which are

31 We restrict the sample to students in groups 4, 5 and 6 on Figure 2.1.

32 The dropout rate for the full personality sample is 8%. For divers it is 36%.

33 We here use the distribution of grades adjusted only for cohort and campus fixed effects, as well as age at entry and non-domestic status.


indicative of conscientiousness. Sample phrases using these words include “I admire the fact that I have discipline”, and “One of the qualities I admire most about myself is responsibility”.

Thrivers and divers can also be differentiated when they are asked to list their goals or hopes for the future. Divers are significantly more likely to use words which highlight wealth. Examples include “rich”, in contexts such as “be a rich man” and “business” in contexts such as “being successful, having so many successful businesses.” Thrivers, on the other hand, are more likely to highlight how they plan to contribute to society, using words such as “human” and “people”. Previous work has emphasized the importance for educational success of pursuing long-term goals. Our text analysis stresses the importance of the nature and content of these goals.
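The word-selection rule used in the topic analysis (a significant frequency difference at the 5% level under a Pearson chi-square test, and at least 5 users in the outlier group) can be sketched in Python as follows. The sketch assumes a 2x2 table that compares the outlier group with the rest of the sample; whether the original comparison is against the rest of the sample or the whole sample is not stated, so this layout, the function names and the counts in the example are assumptions.

```python
CRITICAL_5PCT_DF1 = 3.841  # 5% critical value of a chi-square with 1 degree of freedom


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        return 0.0  # degenerate table: no variation to test
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


def word_is_flagged(users_in_group, group_size, users_in_rest, rest_size, min_users=5):
    """True if the word is used significantly more often in the outlier group."""
    if users_in_group < min_users:
        return False
    stat = pearson_chi2_2x2(
        users_in_group, group_size - users_in_group,
        users_in_rest, rest_size - users_in_rest,
    )
    more_frequent = users_in_group / group_size > users_in_rest / rest_size
    return more_frequent and stat > CRITICAL_5PCT_DF1


# Hypothetical counts: 12 of 100 outliers use a word, against 30 of 900 other students.
print(word_is_flagged(12, 100, 30, 900))
```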

2.6 Summary Measures of Non-Academic Characteristics

We combine our key non-academic predictors of college success and failure into an overall predictor, to examine how well it performs in forecasting outcomes over the full distribution as well as in the tails. Our most robust predictors of outliers are: propensity to cram for exams, number of hours studying, number of hours of paid work, expected GPA, time started the exercise, conscientiousness and impatience. We remain agnostic about the exact relative importance of each of these seven constructs and take the unweighted average of these standardized variables for each student. We later explore whether we can improve our predictions by using weights obtained from machine learning techniques.
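A minimal Python sketch of such an unweighted summary measure follows. It standardizes each of the seven variables and averages them for each student. The text does not state how the variables are oriented before averaging; the sketch assumes that each one is signed so that higher values are favourable, which is consistent with the most at-risk students being those in the bottom decile of the measure (Figure 2.3). The variable names and the sign choices are hypothetical.

```python
from statistics import mean, pstdev

# Assumed orientation: +1 if higher values are taken to be favourable, -1 otherwise.
ORIENTATION = {
    "study_hours": +1,
    "expected_gpa": +1,
    "conscientiousness": +1,
    "cram_for_exams": -1,
    "work_hours": -1,
    "day_started": -1,
    "impatience": -1,
}


def standardize(values):
    """Z-score a list of values using the population standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def summary_measure(data):
    """data maps each variable name to a list with one value per student."""
    oriented = [
        [ORIENTATION[name] * z for z in standardize(column)]
        for name, column in data.items()
    ]
    # Equal weights: average the oriented z-scores across variables, student by student.
    return [mean(student) for student in zip(*oriented)]
```

Students can then be ranked on the returned scores, with the lowest decile treated as most at-risk under the stated orientation.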

Figure 2.3 shows the distributions of unadjusted college grades for students in the top and bottom 10% of the distribution of our relatively simple summary measure of non-academic characteristics. Students deemed the most at-risk under this metric have first-year grades on average more than a full standard deviation below students considered least at risk of struggling during the transition to college.

The one-dimensional measure performs relatively well in terms of predicting freshman performance, but its incremental explanatory power after accounting for past performance is modest, as shown in Table 2.5. The table displays estimates of a modified version of equation (1) in which one-dimensional at-risk measures are substituted for admission grades in panel A, and added as regressors in panel B. Our preferred metric, the simple unweighted average of 7 large predictors of extreme unexpected performance, correlates strongly with college grades (column (3)). In terms of adjusted R-squared, the explanatory power of this measure alone (0.163) is not as high as that of high school grades (0.218), but adding the at-risk factor to past grades increases the model’s fit by almost 4 percentage points.34
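For reference, the adjusted R-squared used in these comparisons penalizes the raw R-squared for the number of regressors. The standard formula can be written as a small Python helper; the function name and the example values are illustrative and do not reproduce any estimate in Table 2.5.

```python
def adjusted_r2(r2, n_obs, n_regressors):
    """Adjusted R-squared for a model with an intercept and n_regressors slope terms."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_regressors - 1)


# Illustrative values only: a raw R-squared of 0.5 with 101 observations and 4 regressors.
print(round(adjusted_r2(0.5, 101, 4), 4))
```

Because the penalty grows with the number of regressors, a one-dimensional summary measure can be compared fairly with specifications that add many separate predictors.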

We then benchmark the predictive abilities of our summary at-risk factor against measures computed with more sophisticated but less transparent approaches to constructing indices. In column (4), the seven best predictors are summarized by their principal component. The model’s fit is actually lower with this method than under our preferred approach, suggesting that only using the variance common to all 7 variables is too restrictive.
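To make the principal-component alternative concrete, the following Python sketch extracts a first principal component from standardized variables by power iteration on their correlation matrix. It assumes that the first component is the one meant in column (4); the function names are hypothetical, and power iteration is used here only to keep the example dependency-free, not because it is the method used for the estimates.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score a list of values using the population standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def first_principal_component(columns, iterations=200):
    """columns: one list per variable, all of equal length. Returns (scores, weights)."""
    z = [standardize(col) for col in columns]
    n, k = len(z[0]), len(z)
    # Correlation matrix of the standardized variables.
    corr = [[sum(a * b for a, b in zip(z[i], z[j])) / n for j in range(k)] for i in range(k)]
    # Power iteration from a vector of ones; this start fails only if the leading
    # eigenvector happens to be orthogonal to it.
    w = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * w[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        w = [x / norm for x in w]
    scores = [sum(w[j] * z[j][i] for j in range(k)) for i in range(n)]
    return scores, w
```

Unlike the unweighted average, the weights here are driven entirely by the variance shared among the variables, which is the restriction discussed above.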

In columns (5) to (9) we use least angle regressions (LARS) and let the algorithm pick the best predictors and put optimal weights on these (Efron et al., 2004). A summary measure of characteristics can then be defined as the fitted values associated with the LARS estimates. The dependent variable used in the process is average college grade adjusted for our conditioning variables, but not for high school grades. Since we chose our 7 best predictors by examining outliers, we first run the LARS algorithm

34 The adjusted R-squared is 0.096 when only conditioning variables (cohort and campus fixed effects, age, non-domestic status) are included.


on the subsample of divers and thrivers, as they are defined in section 2.4.1. Comparing the adjusted R-squared in columns (5), (6) and (7) with column (3) indicates that the weights put on the seven selected predictors that maximize the share of the variance in grades explained among outliers do not necessarily generalize to the full distribution, since the unweighted average has better predictive power over the full personality sample. This observation underscores the importance of non-linearities in the education production function. Column (8) demonstrates that the fit of the model is minimally improved by letting the algorithm pick more predictors than the ones we selected.35 The summary measure used in column (9) is obtained using LARS on the full distribution of students in the personality sample and therefore puts an upper bound on the joint predictive power of all the non-academic characteristics over the full distribution. We find that using information from the full sample (column (9)) rather than from outliers only (column (8)) to calibrate the weights increases the R-squared by 0.015 (from 0.169 to 0.184) if admission grades are omitted, but only by 0.002 if past grades are accounted for. Our simple summary measure raises the adjusted R-squared almost as high as these upper-bound measures do, but without compromising on transparency.

Results on outliers highlight important asymmetries in the distribution of non-academic characteristics across the grade distribution, suggesting that the at-risk factor may have more predictive power for extreme outcomes than over the entire distribution of grades. Table 2.6 shows the proportion of students considered most or least at-risk under different criteria who fall in the bottom of the distribution of raw first-year college grades. About a quarter of all students below the 10th percentile of admission grades end up below the 10th percentile of college grades (row (b)). The proportion of students deemed ‘at-risk’ by our simple measure (row (a)) that ends up with such dramatic outcomes is very similar (23%).36 Yet, there is little overlap in the tails of the distributions of admission grades and of our at-risk measure: only 2.2% of students in our sample fall below the 10th percentile in both distributions (row (c)). Among students in this situation, 38% will fall in the bottom decile of college grades, and all of them will end up below the median. Importantly, falling below the 10th percentile of college grades may have serious consequences: in our sample, no student in the bottom decile of college grades has an average grade above 50 and all are therefore put on probation at the end of their first year in college, which substantially reduces the probability of graduating (Lindo, Sanders and Oreopoulos, 2010). When used jointly with high school grades, the at-risk factor can substantially improve the prediction of extreme outcomes, with potentially important benefits for school administrators and students alike.
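The proportions reported in Table 2.6 are conditional shares, which the following Python sketch illustrates: among students flagged as at-risk under some criterion, it computes the fraction whose college grades fall in the bottom given percent of the distribution. The function names, the nearest-rank cutoff rule and the toy data are assumptions made for illustration.

```python
def bottom_flags(values, pct):
    """True for students in the lowest pct percent of values (nearest rank, ties included)."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct * n / 100
    threshold = ordered[rank - 1]
    return [v <= threshold for v in values]


def share_in_bottom(at_risk, college_grades, pct):
    """Among flagged students, the fraction in the bottom pct percent of college grades."""
    low = bottom_flags(college_grades, pct)
    flagged = [is_low for is_low, risky in zip(low, at_risk) if risky]
    return sum(flagged) / len(flagged) if flagged else float("nan")


# Toy data (invented): 20 students with distinct values on each dimension.
grades = [float(i) for i in range(1, 21)]
measure = [float(i) for i in range(1, 21)]
admission = [float(21 - i) for i in range(1, 21)]
risk_measure = bottom_flags(measure, 10)
risk_admission = bottom_flags(admission, 10)
risk_both = [a and b for a, b in zip(risk_measure, risk_admission)]
print(share_in_bottom(risk_measure, grades, 10), sum(risk_both))
```

Combining two flags with a logical "and", as in `risk_both`, corresponds to the joint criteria in rows (c) and (f).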

2.7 Conclusion

A vast array of personality traits and other noncognitive constructs are used in education research in order to predict performance in college, with substantial overlap across distinct measures. Also, samples are often based only on a select group of volunteers. In this paper, we were able to gather a more comprehensive set of non-academic measures for virtually all students taking a large first-year college course by assigning a small grade requirement to the survey. We investigated which variables, unconditional and conditional on other predictors, best explain the variation in college grades that could not have been expected on the basis of variables known upon admission, notably past academic

35 Reassuringly, the algorithm tends to select the same predictors we picked.

36 This result suggests that this measure can be a useful substitute for admission grades when these are missing. For instance, the relationship between the at-risk measure and grades observed in the personality sample also holds in the smaller sample of students for which non-academic measures are observed but past grades are missing (not reported).


performance. Our results suggest that a few non-academic measures have reasonable predictive power and that linear assumptions often implicit in prior research mask interesting asymmetries.

Students whose first-year college average is far below expectations (divers) have a high propensity for procrastination: they self-report cramming for exams and wait longer before starting a short exercise worth 2 percent of their overall grade in a first-year economics course. They are also considerably less conscientious than their peers. Divers are generally more impatient for positive experiences. For instance, qualitative analyses of short texts written by students suggest divers are more likely to express superficial goals, hoping to ‘get rich’ quickly. In contrast, students who exceed expectations (thrivers) express more philanthropic goals, are purpose-driven, and are willing to study more hours per week to obtain the higher GPA they expect. The only background characteristic that helps predict outlier status is gender, with men being more likely to both thrive and dive.

Divers are considerably more likely to drop out after first year than other students, and even those who remain in school continue lagging behind. Our results, which indicate that divers are more impatient and tend to wait until the last minute before getting to work, suggest some possible ways to help early on. For example, interventions emphasizing efforts around staying organized and structured to avoid wasting time may be fruitful. In a follow-up paper (Beattie et al., 2017), we notably find that students at the bottom of the grade distribution severely lack time management skills, are aware of these issues, but are lost when it comes to finding solutions. Proactive guidance on how to improve these skills, for instance how to design a study schedule and how to respect it, is a promising avenue.

Consistent with the extensive literature on the correlates of college GPA, we found that high school grades remain the best predictor of college grades in general. However, non-academic constructs are especially useful for predicting extreme outcomes that cannot be explained by prior educational achievement. Importantly, the characteristics that best predict successful transitions to college are not necessarily the ones that struggling students lack. Our results, descriptive in nature, warrant further research on the importance of non-linearities, notably at the bottom of grade distributions, for the design and targeting of successful interventions in higher education.


2.8 Tables and Figures

Figure 2.1: Distribution of grade residuals

[Figure omitted. Horizontal axis: Standardized admission grade. Vertical axis: Residual (standardized) college grade. Plot regions are labeled 1 to 9.]

Colors indicate whether students are in the top or bottom decile of the distribution on each dimension. College grade residuals are obtained from specification (2), table B.1. The sample is restricted to the personality sample.


Figure 2.2: Differences in distributions of conscientiousness

Panel A: Unconditional distributions

[Figure omitted. Horizontal axis: Conscientiousness (standardized). Vertical axis: Density. Series: Divers, Thrivers, All students.]

Means: Divers = -.263 ; Thrivers = .028 ; All students = 0

Conscientiousness is relative-scored and unadjusted. Divers are defined as students with residual college grade below the 10th percentile. Thrivers have residual college grades above the 90th percentile. The full distribution corresponds to the personality sample.

Panel B: Conditional distributions

[Figure omitted. Horizontal axis: Conscientiousness (standardized) - Residualized. Vertical axis: Density. Series: Divers, Thrivers, All students.]

Means: Divers = -.191 ; Thrivers = -.048 ; All students = 0

Conscientiousness is relative-scored and residualized from a regression on all other non-academic characteristics. Divers are defined as students with residual college grade below the 10th percentile. Thrivers have residual college grades above the 90th percentile. The full distribution corresponds to the personality sample.


Figure 2.3: College grades by at-risk status

[Figure omitted. Horizontal axis: Standardized average college grade. Vertical axis: Density. Series: Most at risk, Least at risk, All students.]

Means: Most at risk = -.517 ; Least at risk = .577 ; All students = 0

College grades are unadjusted. Most at-risk students are defined as students below the 10th percentile in the distribution of the seven-variable unweighted average of key characteristics. Least at-risk students rank above the 90th percentile. The full distribution corresponds to the personality sample.


Table 2.1: Summary statistics

Variables Mean Standard deviation

Age at entry 18.07 [0.959]

Mother tongue: English 0.48 [0.500]

Citizenship: Canadian 0.52 [0.500]

Women 0.52 [0.500]

First-year student in 2015 0.82 [0.388]

International student 0.34 [0.473]

Economics is a required course 0.59 [0.491]

Living in Residence 0.30 [0.459]

Mother has BA or more 0.50 [0.500]

Father has BA or more 0.59 [0.491]

First-generation student 0.25 [0.430]

Hours expected to study 18.18 [10.816]

Hours expected to work for pay 7.46 [9.807]

Expects to get more than undergraduate degree 0.63 [0.482]

Expected college GPA 3.61 [0.434]

Day started the survey (relative to first day of class) 3.84 [5.257]

Admission grade 87.38 [5.121]

Average college grade 66.33 [13.467]

Observations 1,317

Notes: Sample is restricted to students in the personality sample whose admission grade is not missing, and who finished at least one university course in their first year. First-year and international student status, gender, parental education, study habits and expectations are self-reported. We infer that economics is a required course if a student intends to major in either Economics or Business. Age, mother tongue, citizenship and grades are from administrative records. The average college grade is calculated over all courses for which a valid grade is reported in the administrative file and weighted by number of credits.


Table 2.2: Differences between outliers and full distribution - Personality sample

Unconditional Conditional Unconditional Conditional Unconditional Conditional

Mean diff. Mean diff. Mean diff. Mean diff. (3) - (1) (4) - (2)

[s.e.] [s.e.] [s.e.] [s.e.] [p-val test (3)=(1)] [p-val test (4)=(2)]

(1) (2) (3) (4) (5) (6)

Study hours per week (z-score) -0.079 -0.019 0.224** 0.226*** 0.303** 0.245**

[0.087] [0.083] [0.087] [0.083] [0.014] [0.037]

Sure about program of study (z-score) -0.074 -0.085 -0.067 -0.08 0.007 0.005

[0.087] [0.082] [0.087] [0.082] [0.954] [0.966]

Think about future goals (z-score) 0.134 0.150** -0.119 -0.095 -0.253** -0.244**

[0.087] [0.075] [0.087] [0.075] [0.040] [0.022]

Identify with university (z-score) 0.067 0.04 -0.011 0.04 -0.078 0.001

[0.087] [0.080] [0.087] [0.080] [0.529] [0.995]

Transition has been challenging (z-score) 0.061 -0.024 -0.059 -0.091 -0.12 -0.067

[0.087] [0.078] [0.087] [0.079] [0.330] [0.548]

Cram for exams (z-score) 0.297*** 0.209*** -0.043 -0.058 -0.34*** -0.268**

[0.087] [0.076] [0.087] [0.076] [0.006] [0.013]

Work hours per week (z-score) 0.216** 0.140* 0.049 0.076 -0.168 -0.064

[0.087] [0.083] [0.087] [0.084] [0.173] [0.588]

Expected GPA (z-score) -0.097 -0.106 0.137 0.126 0.233* 0.232**

[0.087] [0.080] [0.087] [0.080] [0.058] [0.040]

Day started exercise (z-score) 0.288*** 0.199** -0.041 -0.03 -0.329*** -0.229*

[0.087] [0.083] [0.087] [0.083] [0.007] [0.051]

Expects more than undergraduate 0.003 -0.009 0.016 0.033 0.012 0.043

[0.042] [0.040] [0.042] [0.040] [0.834] [0.448]

Agreeableness (z-score) -0.023 0.045 -0.03 -0.056 -0.007 -0.101

[0.087] [0.079] [0.087] [0.080] [0.956] [0.368]

Conscientiousness (z-score) -0.263*** -0.191*** 0.028 -0.048 0.292** 0.143

[0.087] [0.066] [0.087] [0.067] [0.018] [0.128]

Extraversion (z-score) 0.170* 0.1 -0.102 -0.038 -0.272** -0.138

[0.087] [0.078] [0.087] [0.078] [0.027] [0.212]

Openness (z-score) 0.087 0.035 0.094 0.011 0.007 -0.024

[0.087] [0.077] [0.087] [0.077] [0.953] [0.829]

Emotional stability (z-score) 0.05 -0.014 0.024 -0.102 -0.026 -0.088

[0.087] [0.076] [0.087] [0.077] [0.834] [0.414]

Risk tolerance (z-score) 0.098 0.042 -0.231*** -0.109 -0.328*** -0.151

[0.087] [0.078] [0.087] [0.078] [0.008] [0.171]

Impatience (z-score) 0.199** 0.180** -0.134 -0.112 -0.333*** -0.292**

[0.087] [0.085] [0.087] [0.085] [0.007] [0.016]

Procrastination (z-score) 0.102 0.067 0.043 -0.039 -0.059 -0.106

[0.087] [0.078] [0.087] [0.078] [0.633] [0.335]

Locus of Control (z-score) 0.112 0.094 -0.034 0.02 -0.146 -0.074

[0.087] [0.081] [0.087] [0.081] [0.238] [0.519]

Perseverance of effort (z-score) -0.135 -0.089 -0.04 0.038 0.095 0.127

[0.087] [0.081] [0.087] [0.081] [0.439] [0.265]

Consistency of interest (z-score) 0.053 0.051 -0.132 -0.086 -0.185 -0.137

[0.087] [0.078] [0.087] [0.078] [0.133] [0.214]

Women -0.105** -0.082** -0.094** -0.090** 0.011 -0.008

[0.043] [0.041] [0.044] [0.041] [0.860] [0.888]

English mother tongue 0.013 0.017 0.009 0.041 -0.004 0.024

[0.044] [0.031] [0.044] [0.031] [0.950] [0.594]

Canadian citizenship -0.017 -0.029 -0.036 -0.034 -0.019 -0.005

[0.044] [0.026] [0.044] [0.026] [0.757] [0.899]

International student -0.005 -0.014 -0.01 -0.015 -0.005 -0.001

[0.041] [0.028] [0.041] [0.028] [0.931] [0.976]

Economics is required 0.021 0.011 -0.059 -0.022 -0.079 -0.033

[0.043] [0.041] [0.043] [0.041] [0.191] [0.570]

Mother has at least bachelor degree -0.022 -0.049 -0.003 0.004 0.019 0.053

[0.044] [0.034] [0.044] [0.034] [0.759] [0.270]

Father has at least bachelor degree 0.043 0.033 0.018 0.034 -0.026 0.001

[0.043] [0.029] [0.043] [0.029] [0.672] [0.983]

First-generation student -0.026 -0.011 0.03 0.041 0.055 0.052

[0.037] [0.026] [0.038] [0.026] [0.299] [0.151]

Column groups: (1)-(2) Bottom decile; (3)-(4) Top decile; (5)-(6) Difference between outliers.

Notes: Diver and thriver status is defined using residuals from the specification reported in column (2) of Table A1. All non-z-score predictors are binary. In columns (1) through (4), coefficients represent the difference in means between outlier groups and the full sample. For conditional differences (columns (2) and (4)), each characteristic is first regressed on the set of other characteristics reported in this table. Big Five traits are relative-scored. Likert-scale Big Five traits are used as controls in lieu of relative-scored traits in the residualization process for conditional differences. *** p<0.01, ** p<0.05, * p<0.1


Table 2.3: Differences between outliers and full distribution - Text sample

Unconditional Conditional Unconditional Conditional Unconditional Conditional

Mean diff. Mean diff. Mean diff. Mean diff. (3) - (1) (4) - (2)

[s.e.] [s.e.] [s.e.] [s.e.] [p-val test (3)=(1)] [p-val test (4)=(2)]

(1) (2) (3) (4) (5) (6)

Total number of words used (z-score) -0.263*** -0.134** 0.046 0.130** 0.309*** 0.264***

[0.065] [0.061] [0.065] [0.061] [0.001] [0.002]

Proportion spelled correctly (z-score) -0.135** -0.058 0.046 0.077 0.170* 0.135

[0.066] [0.064] [0.065] [0.064] [0.066] [0.136]

Time taken on written questions (z-score) -0.043 -0.066 0.098 0.066 0.141 0.132

[0.065] [0.062] [0.065] [0.062] [0.126] [0.135]

Column groups: (1)-(2) Bottom decile; (3)-(4) Top decile; (5)-(6) Difference between outliers.

Notes: In columns (1) through (4), coefficients represent the difference in means between outlier groups and the full sample. For conditional differences (columns (2) and (4)), each characteristic is first regressed on the set of controls (variables from survey and administrative data). *** p<0.01, ** p<0.05, * p<0.1

Table 2.4: Words used more frequently by outliers

Question Top decile words Bottom decile words

Name two goals build rich business own actuary

Qualities admire in self discipline specific word practice

responsibility smart confident game

cause communicate receive friendly trust

Your future self human meet people deal god trustworthy

computer whole provide famous love

mature determine helpful wise

tough man book father moment rich

Qualities admire in others weakness avoid challenge overcome read

mistake creativity word people Steve

area general initiative understand

power concentration waste

Notes: The words listed are used with higher frequency by thrivers (or divers) in response to the questions listed in the first column. Words are included if the frequencies are different at the 5% significance level, according to a Pearson’s Chi-square test.


Table 2.5: Predictive properties of summary measures of non-academic characteristics

Dependent variable: Standardized first-year college grades

Column  Summary measure      Predictors included
(1)     -                    -
(2)     -                    -
(3)     Unweighted average   7 best
(4)     Principal component  7 best
(5)     LARS on outliers     7 best
(6)     LARS on outliers     7 best + polynomials
(7)     LARS on outliers     7 best + polynomials + interactions
(8)     LARS on outliers     All
(9)     LARS on full sample  All

                    (1)       (2)       (3)       (4)       (5)       (6)       (7)       (8)       (9)

Panel A: Separate predictive power

Admission grade     0.433***  0.446***
                    [0.030]   [0.030]
Admission grade²              0.046**
                              [0.019]
Summary measure                         0.653***  0.249***  0.560***  0.410***  0.701***  0.499***  1.355***
                                        [0.061]   [0.027]   [0.054]   [0.045]   [0.076]   [0.045]   [0.110]
Observations        1,317     1,317     1,317     1,317     1,317     1,317     1,317     1,317     1,317
Adjusted R²         0.218     0.221     0.163     0.145     0.160     0.145     0.146     0.169     0.184

Panel B: Incremental predictive power

Admission grade     0.433***  0.446***  0.387***  0.401***  0.392***  0.408***  0.405***  0.390***  0.370***
                    [0.030]   [0.030]   [0.030]   [0.030]   [0.030]   [0.030]   [0.030]   [0.030]   [0.030]
Admission grade²              0.046**   0.037**   0.038**   0.038**   0.042**   0.041**   0.037**   0.035*
                              [0.019]   [0.018]   [0.019]   [0.018]   [0.019]   [0.019]   [0.018]   [0.018]
Summary measure                         0.466***  0.171***  0.406***  0.308***  0.515***  0.381***  0.992***
                                        [0.059]   [0.026]   [0.052]   [0.043]   [0.072]   [0.043]   [0.109]
Observations        1,317     1,317     1,317     1,317     1,317     1,317     1,317     1,317     1,317
Adjusted R²         0.218     0.221     0.255     0.245     0.255     0.250     0.249     0.265     0.267

Notes: All regressions include campus and cohort fixed effects, as well as non-domestic status and age at entry. Standard errors are in brackets. In column (6), quadratic and cubic terms for each of the 7 best predictors are used in the LARS algorithm. In column (7), all pairwise interactions between the 7 best predictors are further added. In columns (8) and (9), the set of potential predictors used in the algorithm encompasses all variables listed in table 2. *** p<0.01, ** p<0.05, * p<0.1

Table 2.6: Proportion of ’at-risk’ students in bottom tail of distribution of college grades

Bottom 10% Bottom 20% Bottom 30% Bottom 40% Bottom 50%

Most at-risk:

(a) Bottom decile of summary measure 23% 40% 52% 61% 71%

(b) Bottom decile of admission grades 25% 43% 60% 76% 87%

(c) Bottom decile in both metrics 38% 55% 76% 86% 100%

Least at-risk:

(d) Top decile of summary measure 3% 6% 8% 14% 24%

(e) Top decile of admission grades 2% 2% 5% 9% 15%

(f) Top decile in both metrics 0% 0% 0% 0% 7%

Columns refer to the distribution of unadjusted first-year college grades.

Notes: Each cell indicates the fraction of our personality sample who fall into the categories in rows and columns. By construction 10% of our sample is defined as most/least at-risk in rows (a), (b), (d) and (e). Only 2.2% of students in our sample satisfy the ‘at-risk’ criterion in row (c), and 2.3% satisfy the criterion in (f).


2.9 Appendix Tables and Figures

Figure B.1: Unconditional grade variation

[Plot omitted. Vertical axis: Standardized Average College Grade; horizontal axis: Standardized admission grade.]

The sample is restricted to the personality sample.


Figure B.2: Differences in distributions of study hours

Panel A: Unconditional distributions

[Density plot omitted. Vertical axis: Density; horizontal axis: Hours studying weekly (standardized). Series: Divers, Thrivers, All students.]

Means: Divers = -.079 ; Thrivers = .224 ; All students = 0

The number of hours studying weekly is standardized to have mean zero and unit variance. Divers are defined as students with residual college grade below the 10th percentile. Thrivers have residual college grades above the 90th percentile. The full distribution corresponds to the personality sample.

Panel B: Conditional distributions

[Density plot omitted. Vertical axis: Density; horizontal axis: Hours studying weekly (standardized) - Residualized. Series: Divers, Thrivers, All students.]

Means: Divers = -.019 ; Thrivers = .226 ; All students = 0

The number of hours studying weekly is standardized to have mean zero and unit variance and residualized from a regression on all other non-academic characteristics. Divers are defined as students with residual college grade below the 10th percentile. Thrivers have residual college grades above the 90th percentile. The full distribution corresponds to the personality sample.


Figure B.3: Differences in distributions of time started exercise

Panel A: Unconditional distributions

[Density plot omitted. Vertical axis: Density; horizontal axis: Day started exercise (standardized). Series: Divers, Thrivers, All students.]

Means: Divers = .288 ; Thrivers = -.041 ; All students = 0

The number of days relative to first day of class is standardized to have mean zero and unit variance. Divers are defined as students with residual college grade below the 10th percentile. Thrivers have residual college grades above the 90th percentile. The full distribution corresponds to the personality sample.

Panel B: Conditional distributions

[Density plot omitted. Vertical axis: Density; horizontal axis: Day started exercise (standardized) - Residualized. Series: Divers, Thrivers, All students.]

Means: Divers = .199 ; Thrivers = -.03 ; All students = 0

The number of days relative to first day of class is standardized to have mean zero and unit variance and residualized from a regression on all other non-academic characteristics. Divers are defined as students with residual college grade below the 10th percentile. Thrivers have residual college grades above the 90th percentile. The full distribution corresponds to the personality sample.


Figure B.4: First and second year performance across deciles of the distribution of first-year grade residuals

[Plot omitted. Vertical axis: Standardized Average College Grade; horizontal axis: Divers, 2, 3, 4, 5, 6, 7, 8, 9, Thrivers. Series: Standardized Average Grade in First year; Standardized Average Grade in Second year.]

Notes: Categories on the horizontal axis are deciles of the distribution of first-year grade residuals.


Table B.1: First-stage - Predicting performance using admission grades

Variables (1) (2)

Admission Grade 0.409*** 0.433***

[0.030] [0.030]

Non-Domestic status -0.273***

[0.051]

Age at entry -0.049*

[0.027]

Observations 1,317 1,317

Adjusted R² 0.195 0.218

Dependent variable: Average first year college grade

Notes: Both average college and admission grades are standardized to have mean zero and a standard deviation of one. All regressions include admission unit and cohort fixed-effects. Non-domestic status is one if a student either self-declared as international or has a citizenship other than Canadian. Standard errors are in brackets. The sample is restricted to the personality sample, but since group assignment is random, results are similar for the text sample. *** p<0.01, ** p<0.05, * p<0.1
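The first-stage adjustment in Table B.1 and the diver and thriver definitions used in Figures B.2 and B.3 can be illustrated with a minimal sketch. It runs on simulated data and regresses standardized college grades on standardized admission grades only; the fixed effects and controls listed in the notes are omitted for brevity. The function names, the simulated slope of 0.4, and the random seed are illustrative assumptions and do not come from the thesis.

```python
import numpy as np

def first_stage_residuals(college, admission):
    """Residuals from an OLS regression of college grades on admission grades.

    Simplified: an intercept and a single regressor, without the fixed
    effects and controls used in the thesis regressions.
    """
    X = np.column_stack([np.ones_like(admission), admission])
    beta, *_ = np.linalg.lstsq(X, college, rcond=None)
    return college - X @ beta

def flag_outliers(residuals, lower=10, upper=90):
    """Flag divers (below the lower percentile) and thrivers (above the upper one)."""
    lo, hi = np.percentile(residuals, [lower, upper])
    return residuals < lo, residuals > hi

# Simulated data; the sample size mirrors the 1,317 observations in Table B.1.
rng = np.random.default_rng(0)
n = 1317
admission = rng.standard_normal(n)
college = 0.4 * admission + rng.standard_normal(n)  # assumed slope, for illustration
college = (college - college.mean()) / college.std()

resid = first_stage_residuals(college, admission)
divers, thrivers = flag_outliers(resid)
```

By construction the residuals are uncorrelated with admission grades, so divers and thrivers in this sketch are students who do worse or better than their admission grade alone would predict.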


Table B.2: Differences between outliers and full distribution - Quintiles

Unconditional Conditional Unconditional Conditional Unconditional Conditional

Mean diff. Mean diff. Mean diff. Mean diff. (3) - (1) (4) - (2)

[s.e.] [s.e.] [s.e.] [s.e.] [p-val test (3)=(1)] [p-val test (4)=(2)]

(1) (2) (3) (4) (5) (6)

Study hours per week (z-score) -0.003 0.043 0.172*** 0.144** 0.175** 0.101

[0.061] [0.059] [0.062] [0.059] [0.044] [0.225]

Sure about program of study (z-score) -0.059 -0.033 -0.002 -0.021 0.057 0.012

[0.062] [0.058] [0.062] [0.058] [0.512] [0.888]

Think about future goals (z-score) 0.000 0.036 -0.092 -0.081 -0.092 -0.117

[0.062] [0.053] [0.062] [0.053] [0.290] [0.120]

Identify with university (z-score) -0.001 -0.013 -0.016 0.002 -0.014 0.015

[0.062] [0.056] [0.062] [0.057] [0.868] [0.854]

Transition has been challenging (z-score) 0.129** 0.070 -0.067 -0.084 -0.197** -0.154**

[0.061] [0.055] [0.062] [0.055] [0.024] [0.049]

Cram for exams (z-score) 0.211*** 0.119** -0.150** -0.122** -0.361*** -0.242***

[0.061] [0.054] [0.061] [0.054] [0.000] [0.002]

Work hours per week (z-score) 0.139** 0.084 0.005 0.024 -0.134 -0.06

[0.061] [0.059] [0.062] [0.059] [0.123] [0.475]

Expected GPA (z-score) -0.124** -0.115** 0.152** 0.115** 0.276*** 0.23***

[0.061] [0.056] [0.061] [0.056] [0.002] [0.004]

Day started exercise (z-score) 0.230*** 0.165*** -0.084 -0.032 -0.314*** -0.198**

[0.061] [0.058] [0.061] [0.059] [0.000] [0.017]

Expects more than undergraduate (binary) -0.001 -0.012 0.002 0.021 0.002 0.033

[0.030] [0.028] [0.030] [0.028] [0.954] [0.403]

Agreeableness (z-score) 0.042 0.062 -0.000 -0.048 -0.042 -0.11

[0.062] [0.056] [0.062] [0.056] [0.626] [0.166]

Conscientiousness (z-score) -0.206*** -0.128*** 0.081 -0.015 0.287*** 0.114*

[0.061] [0.047] [0.061] [0.047] [0.001] [0.087]

Extraversion (z-score) 0.127** 0.101* -0.115* -0.071 -0.241*** -0.172**

[0.061] [0.055] [0.062] [0.055] [0.006] [0.028]

Openness (z-score) -0.009 -0.014 -0.007 -0.031 0.002 -0.017

[0.062] [0.055] [0.062] [0.055] [0.977] [0.822]

Emotional stability (z-score) 0.059 0.046 0.043 -0.034 -0.016 -0.08

[0.062] [0.054] [0.062] [0.054] [0.852] [0.294]

Risk tolerance (z-score) -0.001 -0.046 -0.133** -0.020 -0.131 0.026

[0.061] [0.055] [0.062] [0.055] [0.131] [0.736]

Impatience (z-score) 0.155** 0.157*** -0.124** -0.110* -0.279*** -0.267***

[0.061] [0.060] [0.061] [0.060] [0.001] [0.002]

Procrastination (z-score) 0.165*** 0.158*** -0.034 -0.069 -0.199** -0.227***

[0.061] [0.055] [0.062] [0.055] [0.022] [0.003]

Locus of Control (z-score) 0.010 0.011 -0.057 -0.001 -0.067 -0.012

[0.062] [0.057] [0.062] [0.057] [0.443] [0.879]

Perseverance of effort (z-score) -0.082 -0.038 0.026 0.075 0.108 0.113

[0.062] [0.057] [0.062] [0.057] [0.215] [0.163]

Consistency of interest (z-score) 0.069 0.109** -0.091 -0.087 -0.16* -0.197**

[0.061] [0.055] [0.062] [0.055] [0.066] [0.011]

Women -0.037 -0.021 -0.012 -0.024 0.025 -0.003

[0.031] [0.029] [0.031] [0.029] [0.571] [0.942]

English mother tongue -0.009 0.002 -0.034 0.018 -0.025 0.016

[0.031] [0.022] [0.031] [0.022] [0.569] [0.616]

Canadian citizenship -0.025 -0.016 -0.068** -0.030 -0.044 -0.015

[0.031] [0.018] [0.031] [0.018] [0.314] [0.575]

International student (survey) 0.014 -0.003 0.031 -0.006 0.017 -0.003

[0.029] [0.020] [0.029] [0.020] [0.688] [0.917]

Economics is required 0.009 0.010 -0.023 -0.003 -0.032 -0.013

[0.030] [0.029] [0.030] [0.029] [0.456] [0.748]

Mother has at least bachelor degree 0.008 -0.027 0.006 0.009 -0.002 0.037

[0.031] [0.024] [0.031] [0.024] [0.966] [0.283]

Father has at least bachelor degree 0.051* 0.013 0.015 0.022 -0.036 0.009

[0.030] [0.021] [0.030] [0.021] [0.406] [0.762]

First-generation student -0.052** -0.029 0.021 0.028 0.073* 0.057**

[0.026] [0.018] [0.027] [0.018] [0.052] [0.026]

Column groups: columns (1)-(2) Bottom quintile; columns (3)-(4) Top quintile; columns (5)-(6) Difference between outliers.

Notes: All non-z-score predictors are binary. In columns (1) through (4), coefficients represent the difference in means between outlier groups and the full sample. For conditional differences (columns (2) and (4)), each characteristic is first regressed on the set of other characteristics reported in this table. Big Five traits are relative-scored. Likert-scale Big Five traits are used as controls in lieu of relative-scored traits in the residualization process for conditional differences. *** p<0.01, ** p<0.05, * p<0.1
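The two-step procedure behind the conditional differences can be sketched as follows: a characteristic is first regressed on the other characteristics, and the mean of its residuals in the outlier group is then compared with the mean in the full sample. The sketch uses simulated data; the function name and the data-generating process are hypothetical, and the standard errors reported in the table are not computed here.

```python
import numpy as np

def conditional_mean_diff(target, others, in_group):
    """Two-step conditional difference in means.

    Step 1: residualize `target` on `others` (with an intercept).
    Step 2: mean residual in the group minus mean residual in the full sample.
    """
    X = np.column_stack([np.ones(len(target)), others])
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    resid = target - X @ beta
    return resid[in_group].mean() - resid.mean()

# Simulated example: the target is fully explained by the other characteristics,
# and the group is selected on one of them.
rng = np.random.default_rng(1)
others = rng.standard_normal((500, 3))
target = others[:, 0] + 0.5 * others[:, 1]
in_group = others[:, 0] > 1.0

unconditional = target[in_group].mean() - target.mean()
conditional = conditional_mean_diff(target, others, in_group)
```

In this constructed case the unconditional gap is large while the conditional gap is numerically zero, which is the pattern seen when an unconditional difference only reflects correlation with other characteristics.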


Table B.3: Differences between outliers and full distribution - Groups 4,5,6 only

Unconditional Conditional Unconditional Conditional Unconditional Conditional

Mean diff. Mean diff. Mean diff. Mean diff. (3) - (1) (4) - (2)

[s.e.] [s.e.] [s.e.] [s.e.] [p-val test (3)=(1)] [p-val test (4)=(2)]

(1) (2) (3) (4) (5) (6)

Study hours per week (z-score) -0.069 -0.034 0.265** 0.241** 0.334** 0.275**

[0.096] [0.092] [0.103] [0.099] [0.018] [0.041]

Sure about program of study (z-score) -0.06 -0.045 -0.074 -0.105 -0.014 -0.06

[0.095] [0.088] [0.102] [0.095] [0.919] [0.640]

Think about future goals (z-score) 0.098 0.152* -0.15 -0.13 -0.248* -0.282**

[0.095] [0.081] [0.102] [0.088] [0.075] [0.018]

Identify with university (z-score) 0.025 -0.007 -0.061 -0.012 -0.086 -0.005

[0.095] [0.088] [0.103] [0.094] [0.537] [0.970]

Transition has been challenging (z-score) 0.102 -0.002 -0.057 -0.093 -0.159 -0.091

[0.095] [0.086] [0.102] [0.092] [0.257] [0.470]

Cram for exams (z-score) 0.302*** 0.195** -0.083 -0.094 -0.385*** -0.289***

[0.094] [0.083] [0.101] [0.089] [0.005] [0.018]

Work hours per week (z-score) 0.249*** 0.169* 0.009 0.063 -0.24* -0.106

[0.096] [0.091] [0.103] [0.098] [0.088] [0.426]

Expected GPA (z-score) -0.099 -0.106 0.179* 0.193** 0.278** 0.299**

[0.096] [0.089] [0.103] [0.096] [0.050] [0.022]

Day started exercise (z-score) 0.307*** 0.196** -0.084 -0.091 -0.391*** -0.288**

[0.093] [0.089] [0.100] [0.096] [0.004] [0.028]

Expects more than undergraduate (binary) -0.024 -0.018 0.019 0.051 0.044 0.069

[0.046] [0.044] [0.050] [0.047] [0.521] [0.279]

Agreeableness (z-score) -0.05 0.015 -0.021 -0.037 0.029 -0.052

[0.097] [0.088] [0.104] [0.095] [0.836] [0.688]

Conscientiousness (z-score) -0.254*** -0.153** -0.004 -0.116 0.251* 0.037

[0.096] [0.073] [0.103] [0.079] [0.076] [0.731]

Extraversion (z-score) 0.185* 0.119 -0.111 -0.051 -0.297** -0.17

[0.096] [0.086] [0.103] [0.092] [0.035] [0.175]

Openness (z-score) 0.031 -0.032 0.112 0.021 0.081 0.052

[0.096] [0.085] [0.103] [0.092] [0.567] [0.676]

Emotional stability (z-score) 0.099 0.01 0.046 -0.118 -0.053 -0.127

[0.097] [0.085] [0.105] [0.091] [0.712] [0.308]

Risk tolerance (z-score) 0.051 0.002 -0.251** -0.088 -0.303** -0.09

[0.094] [0.084] [0.101] [0.090] [0.029] [0.465]

Impatience (z-score) 0.168* 0.168* -0.200** -0.161 -0.368*** -0.329**

[0.094] [0.092] [0.101] [0.099] [0.008] [0.015]

Procrastination (z-score) 0.081 0.042 0.025 -0.041 -0.056 -0.083

[0.095] [0.085] [0.102] [0.091] [0.688] [0.504]

Locus of Control (z-score) 0.107 0.086 0.015 0.063 -0.092 -0.023

[0.096] [0.089] [0.103] [0.095] [0.513] [0.858]

Perseverance of effort (z-score) -0.210** -0.168* 0.016 0.056 0.226* 0.224*

[0.093] [0.086] [0.101] [0.093] [0.100] [0.077]

Consistency of interest (z-score) 0.053 0.072 -0.125 -0.061 -0.178 -0.132

[0.094] [0.083] [0.101] [0.089] [0.198] [0.277]

Women -0.113** -0.086* -0.111** -0.110** 0.001 -0.024

[0.047] [0.045] [0.051] [0.049] [0.984] [0.713]

English mother tongue 0.003 0.008 0.026 0.036 0.023 0.028

[0.048] [0.034] [0.051] [0.037] [0.738] [0.579]

Canadian citizenship -0.026 -0.021 -0.033 -0.036 -0.007 -0.015

[0.048] [0.029] [0.051] [0.031] [0.924] [0.717]

International student (survey) 0.008 -0.007 -0.033 -0.022 -0.04 -0.015

[0.045] [0.030] [0.048] [0.033] [0.541] [0.741]

Economics is required 0.007 -0.01 -0.067 -0.036 -0.074 -0.027

[0.047] [0.045] [0.050] [0.048] [0.282] [0.687]

Mother has at least bachelor degree -0.036 -0.080** -0.015 0.009 0.021 0.089

[0.048] [0.037] [0.051] [0.040] [0.769] [0.105]

Father has at least bachelor degree 0.080* 0.045 -0.004 0.029 -0.083 -0.016

[0.047] [0.032] [0.050] [0.034] [0.224] [0.728]

First-generation student -0.054 -0.024 0.06 0.057* 0.114* 0.081*

[0.040] [0.028] [0.044] [0.030] [0.055] [0.045]

Column groups: columns (1)-(2) Bottom decile; columns (3)-(4) Top decile; columns (5)-(6) Difference between outliers.

Notes: Sample restricted to students with admission grades between the 10th and 90th percentile. All non-z-score predictors are binary. In columns (1) through (4), coefficients represent the difference in means between outlier groups and the full sample. For conditional differences (columns (2) and (4)), each characteristic is first regressed on the set of other characteristics reported in this table. Big Five traits are relative-scored. Likert-scale Big Five traits are used as controls in lieu of relative-scored traits in the residualization process for conditional differences. *** p<0.01, ** p<0.05, * p<0.1

Table B.4: Differences between outliers and full distribution - Grades not adjusted for high school performance

Unconditional Conditional Unconditional Conditional Unconditional Conditional

Mean diff. Mean diff. Mean diff. Mean diff. (3) - (1) (4) - (2)

[s.e.] [s.e.] [s.e.] [s.e.] [p-val test (3)=(1)] [p-val test (4)=(2)]

(1) (2) (3) (4) (5) (6)

Study hours per week (z-score) -0.123 -0.035 0.169* 0.124 0.291** 0.159

[0.087] [0.083] [0.087] [0.083] [0.018] [0.176]

Sure about program of study (z-score) -0.145* -0.125 0.01 0.019 0.155 0.145

[0.087] [0.082] [0.087] [0.082] [0.209] [0.211]

Think about future goals (z-score) 0.091 0.124* -0.175** -0.158** -0.266** -0.282***

[0.087] [0.075] [0.087] [0.075] [0.031] [0.008]

Identify with university (z-score) -0.009 -0.022 -0.169* -0.109 -0.16 -0.088

[0.087] [0.080] [0.087] [0.080] [0.194] [0.437]

Transition has been challenging (z-score) 0.097 0.032 -0.142 -0.133* -0.239* -0.165

[0.087] [0.078] [0.087] [0.079] [0.052] [0.137]

Cram for exams (z-score) 0.248*** 0.127* -0.227*** -0.188** -0.475*** -0.315***

[0.087] [0.076] [0.087] [0.076] [0.000] [0.004]

Work hours per week (z-score) 0.172** 0.091 -0.011 0.043 -0.183 -0.048

[0.087] [0.083] [0.087] [0.084] [0.138] [0.685]

Expected GPA (z-score) -0.160* -0.129 0.281*** 0.266*** 0.441*** 0.394***

[0.087] [0.079] [0.087] [0.080] [0.000] [0.000]

Day started exercise (z-score) 0.412*** 0.298*** -0.118 -0.08 -0.53*** -0.378***

[0.086] [0.082] [0.087] [0.083] [0.000] [0.001]

Expects more than undergraduate (binary) -0.012 -0.026 0.031 0.054 0.043 0.08

[0.042] [0.040] [0.042] [0.040] [0.471] [0.154]

Agreeableness (z-score) -0.07 0.02 -0.089 -0.155* -0.019 -0.175

[0.087] [0.079] [0.087] [0.079] [0.879] [0.118]

Conscientiousness (z-score) -0.275*** -0.132** 0.327*** 0.147** 0.602*** 0.279***

[0.086] [0.066] [0.087] [0.067] [0.000] [0.003]

Extraversion (z-score) 0.222** 0.134* -0.322*** -0.226*** -0.544*** -0.36***

[0.086] [0.078] [0.087] [0.078] [0.000] [0.001]

Openness (z-score) 0.116 0.077 0.192** 0.140* 0.076 0.064

[0.087] [0.077] [0.087] [0.077] [0.536] [0.560]

Emotional stability (z-score) 0.023 -0.001 -0.1 -0.189** -0.123 -0.188*

[0.087] [0.076] [0.087] [0.076] [0.318] [0.082]

Risk tolerance (z-score) 0.145* 0.034 -0.375*** -0.230*** -0.52*** -0.264**

[0.086] [0.077] [0.087] [0.078] [0.000] [0.016]

Impatience (z-score) 0.251*** 0.223*** -0.150* -0.118 -0.401*** -0.342***

[0.087] [0.085] [0.087] [0.085] [0.001] [0.005]

Procrastination (z-score) 0.163* 0.142* -0.018 -0.045 -0.181 -0.186*

[0.087] [0.078] [0.087] [0.078] [0.143] [0.091]

Locus of Control (z-score) 0.07 0.036 0.023 0.094 -0.047 0.057

[0.087] [0.081] [0.087] [0.081] [0.705] [0.619]

Perseverance of effort (z-score) -0.093 -0.056 0.01 0.071 0.103 0.127

[0.087] [0.081] [0.087] [0.081] [0.404] [0.268]

Consistency of interest (z-score) 0.068 0.081 -0.037 -0.012 -0.105 -0.093

[0.087] [0.078] [0.087] [0.078] [0.394] [0.397]

Women -0.128*** -0.104** -0.025 -0.045 0.102* 0.059

[0.043] [0.041] [0.044] [0.041] [0.096] [0.310]

English mother tongue 0.044 0.022 -0.044 -0.019 -0.088 -0.041

[0.043] [0.031] [0.044] [0.031] [0.155] [0.359]

Canadian citizenship 0.028 0 -0.021 0.029 -0.049 0.029

[0.044] [0.026] [0.044] [0.026] [0.424] [0.433]

International student (survey) -0.005 0.018 0.036 0.025 0.041 0.007

[0.041] [0.028] [0.041] [0.028] [0.486] [0.868]

Economics is required 0.013 0.001 -0.051 -0.022 -0.064 -0.024

[0.043] [0.041] [0.043] [0.041] [0.291] [0.684]

Mother has at least bachelor degree -0.037 -0.043 -0.026 0 0.011 0.043

[0.044] [0.034] [0.044] [0.034] [0.856] [0.366]

Father has at least bachelor degree 0.021 0.032 -0.013 0.019 -0.033 -0.014

[0.043] [0.029] [0.043] [0.029] [0.581] [0.744]

First-generation student -0.003 0 0.045 0.036 0.048 0.036

[0.037] [0.026] [0.038] [0.026] [0.369] [0.313]

Column groups: columns (1)-(2) Bottom decile; columns (3)-(4) Top decile; columns (5)-(6) Difference between outliers.

Notes: All non-z-score predictors are binary. Outliers are top and bottom deciles of the adjusted college grades distribution. College grades are adjusted for cohort and campus fixed effects, as well as age and non-domestic status. In columns (1) through (4), coefficients represent the difference in means between outlier groups and the full sample. For conditional differences (columns (2) and (4)), each characteristic is first regressed on the set of other characteristics reported in this table. Likert-scale Big Five traits are used as controls in lieu of relative-scored traits in the residualization process. *** p<0.01, ** p<0.05, * p<0.1

Table B.5: Differences between outliers and full distribution - Conditional differences estimated in one step

Unconditional Conditional Unconditional Conditional Unconditional Conditional

Mean diff. Mean diff. Mean diff. Mean diff. (3) - (1) (4) - (2)

[s.e.] [s.e.] [s.e.] [s.e.] [p-val test (3)=(1)] [p-val test (4)=(2)]

(1) (2) (3) (4) (5) (6)

Study hours per week (z-score) -0.079 -0.001 0.224** 0.241*** 0.303** 0.242**

[0.087] [0.091] [0.087] [0.090] [0.014] [0.046]

Sure about program of study (z-score) -0.074 -0.120 -0.067 -0.133 0.007 -0.013

[0.087] [0.090] [0.087] [0.090] [0.954] [0.917]

Think about future goals (z-score) 0.134 0.136 -0.119 -0.156* -0.253** -0.292**

[0.087] [0.086] [0.087] [0.086] [0.040] [0.011]

Identify with university (z-score) 0.067 0.036 -0.011 0.013 -0.078 -0.023

[0.087] [0.089] [0.087] [0.089] [0.529] [0.846]

Transition has been challenging (z-score) 0.061 -0.033 -0.059 -0.091 -0.12 -0.058

[0.087] [0.086] [0.087] [0.086] [0.330] [0.615]

Cram for exams (z-score) 0.297*** 0.241*** -0.043 -0.035 -0.34*** -0.276**

[0.087] [0.083] [0.087] [0.083] [0.006] [0.013]

Work hours per week (z-score) 0.216** 0.167* 0.049 0.085 -0.168 -0.081

[0.087] [0.092] [0.087] [0.092] [0.173] [0.507]

Expected GPA (z-score) -0.097 -0.112 0.137 0.106 0.233* 0.218*

[0.087] [0.088] [0.087] [0.087] [0.058] [0.061]

Day started exercise (z-score) 0.288*** 0.231** -0.041 -0.006 -0.329*** -0.237**

[0.087] [0.090] [0.087] [0.091] [0.007] [0.050]

Expects more than undergraduate 0.003 -0.015 0.016 0.016 0.012 0.031

[0.042] [0.043] [0.042] [0.043] [0.834] [0.586]

Agreeableness (z-score) -0.023 0.045 -0.030 -0.058 -0.007 -0.103

[0.087] [0.086] [0.087] [0.086] [0.956] [0.367]

Conscientiousness (z-score) -0.263*** -0.227*** 0.028 -0.081 0.292** 0.146

[0.087] [0.072] [0.087] [0.072] [0.018] [0.128]

Extraversion (z-score) 0.170* 0.110 -0.102 -0.031 -0.272** -0.14

[0.087] [0.084] [0.087] [0.085] [0.027] [0.212]

Openness (z-score) 0.087 0.041 0.094 0.017 0.007 -0.024

[0.087] [0.084] [0.087] [0.084] [0.953] [0.829]

Emotional stability (z-score) 0.050 -0.030 0.024 -0.120 -0.026 -0.09

[0.087] [0.083] [0.087] [0.083] [0.834] [0.413]

Risk tolerance (z-score) 0.098 0.033 -0.231*** -0.120 -0.328*** -0.154

[0.087] [0.084] [0.087] [0.084] [0.008] [0.170]

Impatience (z-score) 0.199** 0.192** -0.134 -0.105 -0.333*** -0.297**

[0.087] [0.092] [0.087] [0.092] [0.007] [0.016]

Procrastination (z-score) 0.102 0.072 0.043 -0.037 -0.059 -0.109

[0.087] [0.084] [0.087] [0.085] [0.633] [0.334]

Locus of Control (z-score) 0.112 0.112 -0.034 0.036 -0.146 -0.075

[0.087] [0.088] [0.087] [0.088] [0.238] [0.519]

Perseverance of effort (z-score) -0.135 -0.097 -0.040 0.033 0.095 0.13

[0.087] [0.087] [0.087] [0.088] [0.439] [0.264]

Consistency of interest (z-score) 0.053 0.047 -0.132 -0.093 -0.185 -0.14

[0.087] [0.084] [0.087] [0.084] [0.133] [0.213]

Women -0.105** -0.103** -0.094** -0.108** 0.011 -0.005

[0.043] [0.044] [0.044] [0.044] [0.860] [0.932]

English mother tongue 0.013 0.017 0.009 0.030 -0.004 0.013

[0.044] [0.034] [0.044] [0.034] [0.950] [0.777]

Canadian citizenship -0.017 -0.042 -0.036 -0.052* -0.019 -0.01

[0.044] [0.028] [0.044] [0.028] [0.757] [0.784]

International student -0.005 -0.014 -0.010 -0.008 -0.005 0.006

[0.041] [0.030] [0.041] [0.030] [0.931] [0.889]

Economics is required 0.021 0.002 -0.059 -0.042 -0.079 -0.045

[0.043] [0.044] [0.043] [0.044] [0.191] [0.452]

Mother has at least bachelor degree -0.022 -0.057 -0.003 -0.004 0.019 0.053

[0.044] [0.037] [0.044] [0.037] [0.759] [0.284]

Father has at least bachelor degree 0.043 0.041 0.018 0.040 -0.026 -0.001

[0.043] [0.032] [0.043] [0.032] [0.672] [0.975]

First-generation student -0.026 -0.002 0.030 0.060** 0.055 0.062*

[0.037] [0.028] [0.038] [0.028] [0.299] [0.094]

Column groups: columns (1)-(2) Bottom decile; columns (3)-(4) Top decile; columns (5)-(6) Difference between outliers.

Notes: All non-z-score predictors are binary. In columns (1) through (4), coefficients represent the difference in means between outlier groups and the full sample. For conditional differences (columns (2) and (4)), each characteristic is regressed on diver and thriver status dummies, as well as the set of other characteristics reported in this table. Big Five traits are relative-scored. Likert-scale Big Five traits are used as controls in lieu of relative-scored traits for conditional differences. *** p<0.01, ** p<0.05, * p<0.1


Chapter 3

Language Skill Acquisition in Immigrant Social Networks: Evidence from Australia

3.1 Introduction

In countries with large foreign-born populations, immigrants tend to be spatially concentrated in a fewmetropolitan areas. Policy-makers and researchers posit that residential segregation may produce exter-nalities affecting the economic and social outcomes of the cities in which ethnic enclaves are formed. Theimplementation of refugee placement and desegregation policies in many immigrant-receiving countriesreflects these concerns. Enclaves might also affect immigrants’ long-term economic assimilation: (Borjas,2015) demonstrates that about 20% of the decline in earnings convergence experienced by most recentcohorts of immigrants to the U.S. can be attributed to changes in the size of ethnic groups. These trendsnotably correlate strongly with slower rates of language acquisition, suggesting an important role forEnglish proficiency as a mechanism.

Living in a linguistic enclave may diminish the incentives to invest in the acquisition of host-countrylanguage skills (Lazear, 1999). Similarly, highly segregated areas may provide fewer interactions withnatives, and therefore make learning the dominant language more difficult. Any resulting lower lan-guage proficiency can hinder the economic assimilation of migrants, given that proficiency in the host-country language has a sizable economic return estimated to approximately 15% higher earnings in mostEnglish-speaking countries.1 Moreover, qualitative evidence suggests that employers are strongly con-cerned about applicants’ language skills when reviewing resumes (Oreopoulos, 2011). The consequencesof lower proficiency rates may not be exclusively borne by the individuals who lack these language skills.Low fluency rates may have external effects on economic and social outcomes by giving rise to commu-nication barriers between groups.2 Linguistic isolation can also generate intergenerational externalities

1Chiswick and Miller (2014) provide an excellent review of existing empirical estimates of the effect of dominant languageproficiency on earnings. Also see Lewis (2013); Bleakley and Chin (2004); Berman, Lang and Siniver (2003); Dustmannand van Soest (2002, 2001).

2English proficiency may generate positive human capital externalities as in Borjas (1995) and Moretti (2004). Also,the equilibrium fluency rate may not be socially optimal, in that it might produce too little productive social interactions(Konya, 2007; Lazear, 1999; Church and King, 1993). Other possible externalities relate to the costs associated with the use

107

Page 113: by Jean-William P. Laliberté...Education: Schoolsand Neighborhoods 1.1 Introduction Improving graduation rates and college attendance are high-priority objectives shared by community

Chapter 3. Language Skill Acquisition in Immigrant Social Networks 108

by reducing the incentives to acquire language skills for future cohorts of immigrants.This paper estimates the effect of linguistic enclaves on one of the most important determinants of

both the economic and social integration of immigrants: language skills. I focus on Australia, a nationwith one of the highest shares of foreign-born population among developed countries, and with substantialsegregation of immigrants in urban areas.3 Using the Longitudinal Survey of Immigrants to Australia(LSIA), I directly track changes in language skills over time to explicitly measure language acquisition.Existing estimates rely on cross-sectional variation in immigrants’ language skills, making it difficult toseparate out learning from sorting: if immigrants make their location decisions on the basis of preexistinglanguage skills, a strong correlation between linguistic concentration and proficiency in the host-countrylanguage may be found among newly arrived immigrants, that is even before any learning has possiblyoccurred in the host country.4 This is particularly important in English-speaking countries, given thatmost immigrants possess English language skills to varying degrees before they migrate. To assess therobustness of my estimates, I use two complementary empirical strategies. The first one leverages anunusually rich set of observable characteristics associated with immigrants’ abilities and their locationdecisions to generate bounds on enclave effects. The second approach focuses on sponsored immigrants,for whom information on their sponsor’s location can be used as the basis for a instrumental variable.

I further contribute to the literature on enclave effects by examining different channels through whichspatial concentration of language groups can affect the acquisition of language skills. The level of ge-ography at which segregation matters the most for language skills is notably treated as an empiricalquestion. Also, generally unavailable information on English language course take-up is used to investi-gate whether enclaves slow down language acquisition by reducing investments in language skills throughformal education. Additional survey questions related to language used at work and job search strategiesare used to investigate the link between enclaves, language acquisition, and economic incentives.

As a starting point, the paper first replicates the negative association between linguistic concentrationand language skills documented in earlier studies. I then show that about a third of standard cross-sectional estimates actually reflects differences in pre-immigration proficiency rather than an effect onlearning. My estimates suggest that a (within language-group) standard deviation increase in city-widelinguistic concentration reduces the probability of becoming fluent in English by 2.6 percentage points,a magnitude equivalent to roughly 15% of the total increase in proficiency observed over the survey’sduration.5 This result is very robust across specifications and empirical strategies. I also find thatconditional on city-level enclave size, there is no incremental negative effect of residing in a relativelyhigh-concentration neighborhood. Finally, I show that enclave size is unrelated to English course take-up, suggesting that social interactions outside the classroom – at work, in particular – are important in

of interpreters, which are not necessarily internalized by non-fluent individuals. More importantly, when professional inter-pretation services are not available, relying on ad hoc interpreters (e.g. family members) can have disastrous consequences,notably on the quality of health care services provided (Flores, 2006).

3In 2011, more than one out of every four Australians was foreign-born, the third highest ratio among OECD countries(OECD, 2013). That same year, two thirds of individuals speaking a language other than English at home resided in eitherGreater Sydney or Greater Melbourne, while less than 40% of the total Australian population lived in these two cities.

4In the United States, Lazear (1999) and Chiswick and Miller (2005) find a negative association between the probabilitythat an immigrant speaks English well and the size of his linguistic group in his area. A negative relationship betweenethnic/linguistic concentration and English language proficiency is also found in Australia (Chiswick and Miller, 1996), inCanada (Warman, 2007), and in the United Kingdom (Dustmann and Fabbri, 2003). Danzer and Yaman (2016) exploitsWest Germany’s Guest-Worker Programme as an exogenous source of spatial variation, which allows them to isolate thecausal effect of enclaves on learning. The immigrant population they study is substantially different than the one consideredhere, however. Danzer and Yaman (2016) focus on low-skilled immigrants who all had no prior knowledge of German.

⁵ Note that low proficiency is not merely a transitory phenomenon: in 2011, among the 1.3 million immigrants who had been living in Australia for at least 15 years and whose best spoken language is not English, one in five still spoke English “not well” or “not at all”.


the learning process.

The remainder of the paper is organized as follows. Section 3.2 describes the survey data, and the empirical strategy is outlined in Section 3.3. The results are then presented in Section 3.4. Possible mechanisms are discussed in Section 3.5, and Section 3.6 concludes.

3.2 Data

3.2.1 The Longitudinal Survey of Immigrants to Australia (LSIA)

The primary dataset used in this paper is the Longitudinal Survey of Immigrants to Australia (LSIA1), a survey of a representative sample of recent immigrants undertaken by the Department of Immigration and Multicultural and Indigenous Affairs of Australia.⁶ The LSIA1 sample consists of 5,192 Primary Applicants (PAs), which represents approximately 7% of all PAs aged 15 or above who arrived in Australia between September 1993 and August 1995 and were offshore visaed.⁷ In-depth personal interviews were conducted approximately five or six months (wave 1), 18 months (wave 2), and 42 months (wave 3) after the date of arrival in Australia. The questionnaire covers a comprehensive set of topics, notably including information about the reasons for immigrating to Australia, work status and educational attainment prior to immigration, place of residence in Australia, enrollment in English language courses in Australia, language best spoken, and English language skills.⁸

The longitudinal structure of the survey allows me to examine changes in language skills that occur with time spent in Australia. In each wave, respondents whose best spoken language is not English were asked to self-assess their ability to speak English on a four-point scale ranging from “Not at all” to “Very well”. In line with previous studies (Danzer and Yaman, 2016; Dustmann and Fabbri, 2003; Lazear, 1999), I collapse self-assessed language skills into a binary variable in most analyses. In particular, an individual is considered fluent in English if she reports speaking it “Well” or “Very well”. I complement this information with an arguably more objective measure of language skills – an indicator of whether the interview was conducted in English or not.⁹
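The coding rule described above can be sketched in a few lines. This is only an illustration: the label of the second scale point ("Not well") and the function name `is_fluent` are assumptions, since the text names only the endpoints of the scale and the two categories counted as fluent.

```python
# Four-point self-assessment scale. "Not well" is an assumed label for the
# second category; the text only names "Not at all", "Well" and "Very well".
SCALE = ["Not at all", "Not well", "Well", "Very well"]


def is_fluent(response: str) -> int:
    """Return 1 if the respondent reports speaking English "Well" or "Very well"."""
    if response not in SCALE:
        raise ValueError(f"unknown response: {response!r}")
    return int(response in ("Well", "Very well"))


# Hypothetical responses, for illustration only.
responses = ["Not at all", "Well", "Very well", "Not well"]
fluent = [is_fluent(r) for r in responses]
print(fluent)
```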

This paper’s main analyses are based on a balanced panel of PAs who were interviewed in all three waves, whose best spoken language is not English, and who were of working age (18-64 years old) at the time of the first interview.¹⁰ The analytical sample is further restricted to individuals for whom I am able to produce measures of enclave size, which excludes smaller language groups. The final sample consists of 2,053 Primary Applicants.¹¹

⁶ Other studies that use the LSIA to examine immigrants’ English proficiency include Chiswick, Lee and Miller (2006) and Chiswick, Lee and Miller (2004).

⁷ Immigrants with special eligibility visas, whose country of birth could not be identified, or who did not settle in a major urban area, as well as New Zealand citizens, were excluded from the survey. These subgroups only make up about four or five percent of all otherwise eligible PAs.

⁸ Two other rounds of the LSIA were conducted on more recent cohorts of immigrants. The LSIA1 cohort arrived in Australia prior to the implementation, in 1999, of a major reform of the points test system which strengthened the English language ability requirements for two of the five main visa categories. English proficiency upon arrival was relatively weaker for LSIA1 participants than for LSIA2 and LSIA3 cohorts, leaving more room for language skill acquisition in the host-country (Chiswick, Lee and Miller, 2004; Cobb-Clark, 2004). LSIA1 also has a longer follow-up period than the later rounds.

⁹ Interviews could be conducted with the assistance of friends or members of the respondent’s family, or with the help of an accredited interpreter or through a bilingual interviewer.

¹⁰ The attrition rate between waves 1 and 3 is 30%. Sampling weights are adjusted accordingly.

¹¹ I ignore possible general equilibrium effects associated with the location decisions of the LSIA1 cohort, which only represents 2% of Australia’s total non-English speaking population in 1996. In the language of Angrist (2014), there is a separation between the subjects of peer effects (the LSIA1 cohort) and the peers who provide a mechanism for causal


3.2.2 Measurement: Linguistic concentration

The key independent variable of interest is linguistic concentration, which I measure using Bertrand, Luttmer and Mullainathan’s (2000) contact availability ratio:

$$CA_{jk} = \frac{\left(\dfrac{\text{Number of people of language-group } k \text{ in area } j}{\text{Number of people in area } j}\right)}{\left(\dfrac{\text{Total number of people of language-group } k \text{ in the country}}{\text{Total population of the country}}\right)}.$$

Deflating the proportion of individuals belonging to language-group k in area j by the share of that language-group in the entire country has the advantage of avoiding underweighting smaller language groups. It also has a clear interpretation as a segregation index: if language-group k makes up the same proportion of the population in area j as it does at the country-level, then the index will be equal to one. Consequently, a contact availability measure above one implies that the language-group of interest is over-represented in the area, whereas an index below one means that the language-group is under-represented.¹²
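As a concrete illustration of the ratio, the sketch below computes the contact availability measure from four population counts. The function name `contact_availability` and the counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not census figures.

```python
import math


def contact_availability(group_in_area, pop_area, group_in_country, pop_country):
    """Local share of a language group divided by its national share."""
    if min(pop_area, group_in_country, pop_country) <= 0:
        raise ValueError("population counts must be positive")
    local_share = group_in_area / pop_area
    national_share = group_in_country / pop_country
    return local_share / national_share


# Made-up counts: the group is 3% of the area but 1% of the country,
# so it is over-represented locally and the index is roughly 3.
ca = contact_availability(group_in_area=3_000, pop_area=100_000,
                          group_in_country=20_000, pop_country=2_000_000)
print(ca)
print(math.log(ca))  # the log of the index is what enters the regressions

# A group whose local share equals its national share has an index of one.
ca_equal = contact_availability(1_000, 100_000, 20_000, 2_000_000)
print(ca_equal)
```

An index above one flags over-representation and an index below one flags under-representation, as in the interpretation given above.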

The population counts of the Australian Census Community Profiles are used to compute measures of $CA_{jk}$ for 31 non-English linguistic communities defined by language spoken at home in 1996.¹³ The smallest geographic unit for which this is feasible is a Statistical Subdivision (SSD), a “socially and economically homogeneous region characterized by identifiable links between inhabitants” (Australian Bureau of Statistics, 1996). Linguistic concentration is also calculated for Statistical Divisions (SD), which are made up of one or more SSDs, and more or less correspond to MSAs in the United States and CMAs in Canada. For example, the SD of Sydney (population of 3,741,290 in 1996) consists of 14 SSDs (with an average population of 267,235). There is substantial variation in the size of SDs, with Sydney being the most populous, and some having a population as small as 44,798 (Pilbara, Western Australia). The individuals in my sample live in 16 Statistical Divisions (64 SSDs) in wave 3, and form 223 unique language-SD cells (671 language-SSD cells).

The choice of level of aggregation is not insignificant. One may argue that the social interactions that matter the most in the present context mainly take place at the neighborhood level. If this is the case, then linguistic concentration measured at the SSD level should be a more accurate measure of immigrant social networks than concentration measured at the broader SD level. However, there are a few shortcomings to this approach that can be addressed by using a more aggregate measure. Firstly, measures of concentration for linguistic groups who make up a very small proportion of the population in certain areas may be subject to measurement error; using larger geographic units should attenuate this problem (Danzer and Yaman, 2013). Secondly, smaller geographic units do not allow for cross-neighborhood interactions (Warman, 2007). If labour markets stretch beyond the borders of SSDs, then concentration measured at the SD level may capture the economic benefits of English proficiency more accurately. Finally, because it is arguably easier to move within cities than across metropolitan areas, using a larger aggregation level may reduce the selection bias.

A priori, it is unclear what level of aggregation better captures the reach of social interactions likely

effects on those subjects (neighbors, notably individuals from previous waves of immigration).

¹² As is customary in the literature on ethnic concentration, I use $\ln CA_{jk}$ in most econometric specifications. Because language-group fixed effects are included in the model, results are numerically equivalent to using the log of the exposure index:

$$EI_{jk} = 100 \times \left(\frac{\text{Number of people of language-group } k \text{ in area } j}{\text{Number of people in area } j}\right).$$

¹³ Results are qualitatively unchanged if country of birth is used instead of language spoken at home (available upon request).


to affect language acquisition. I treat this issue as an empirical question that I investigate by contrasting results at the SD and SSD levels. Bertrand, Luttmer and Mullainathan (2000) and Cutler, Glaeser and Vigdor (2008) also simultaneously consider two levels of geography. Other studies of enclave effects cover a broad range of geographical units: wards in the UK (Dustmann and Fabbri, 2003), municipalities in Sweden (Edin, Fredriksson and Åslund, 2003) and in Denmark (Damm, 2009), CMAs in Canada (Warman, 2007), and regions (Anpassungsschichten) in West Germany (Danzer and Yaman, 2016).

3.2.3 Descriptive Statistics

To illustrate how people who settle in enclaves might differ from those who do not, I split the sample between immigrants above (enclave) and below (non-enclave) their language-specific median linguistic concentration measured at the SD level. Descriptive statistics for observable individual characteristics are reported in Table 3.1. The enclave status indicators are based on the respondents’ location at the time of the third interview.¹⁴
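The sample split can be illustrated with a short sketch. The records below are invented, the group labels are placeholders, and treating observations exactly at the median as non-enclave is an assumption, since the text does not say how ties are handled.

```python
from collections import defaultdict
from statistics import median

# Hypothetical records: (language group, SD-level log concentration in wave 3).
records = [("A", 0.2), ("A", 0.9), ("A", 1.4),
           ("B", -0.5), ("B", 0.1), ("B", 0.3), ("B", 0.8)]

# Compute the median concentration separately for each language group.
by_group = defaultdict(list)
for group, ln_ca in records:
    by_group[group].append(ln_ca)
medians = {group: median(values) for group, values in by_group.items()}

# Enclave indicator: strictly above the language-specific median
# (the treatment of ties is an assumption).
enclave = [int(ln_ca > medians[group]) for group, ln_ca in records]
print(medians)
print(enclave)
```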

In terms of demographics (panel A), immigrants living in high-concentration cities (column (2)) are significantly older, less educated, live in larger households but are less likely to be married, and are disproportionately female. In panel B, I present differences on a set of variables that are plausibly associated with one’s ability and motivation for learning English. The patterns are suggestive of negative selection into enclaves. For instance, immigrants living in enclaves were less likely to be working prior to immigrating to Australia and to have chosen to move to Australia primarily for reasons related to employment opportunities. The fraction of immigrants expecting to receive some help finding work is higher among non-enclave residents. There is, however, no difference between immigrants living in high- and low-concentration areas in terms of whether they expect to receive help learning English. There is some suggestive evidence that enclave residents were more likely to immigrate for family reunion reasons, given their higher propensity to have visited Australia before and to have chosen their State of residence because they have more family there.¹⁵ All these wave one variables are retrospective and are considered as pre-determined characteristics in the empirical analyses.

Figure 3.1 shows the average English fluency rates in each of the three waves for enclave and non-enclave immigrants separately. There is clear evidence of sorting on the basis of pre-existing language skills: upon arrival in Australia (wave one), immigrants located in enclaves are 6 percentage points less likely to be fluent in English, and 7 percentage points less likely to have done the interview in English. The longitudinal nature of the survey makes it possible to directly measure the amount of learning while living in Australia. Approximately 4 years after immigration, the initial gaps in language skills widen to 11 (15) percentage points for English fluency (interview in English), suggesting that enclaves do impede language acquisition. Yet, Figure 3.1 demonstrates that cross-sectional differences in language skills between immigrants living in high- and low-concentration areas do not provide a causal estimate of enclave effects. About half of the unadjusted gap observed in wave three already existed upon arrival.

¹⁴ Sorting is most likely a slow process. Firstly, upon arrival, immigrants may not possess sufficient information on the entire set of possible locations, and so their initial location may be transitory. Secondly, if there are frictions in the housing market, it may take time for someone to find an affordable unit in their preferred area. Wave 3 locations are therefore used in order to fully uncover sorting patterns.

¹⁵ For brevity, Table 3.1 only shows the most common reasons for moving to Australia and for choosing the State of residence. In regression analyses, I use sets of dummies for all possible values as control variables.


3.3 Empirical Framework

As a starting point, consider the cross-sectional econometric specification used in most previous studies (Danzer and Yaman, 2016; Warman, 2007; Chiswick and Miller, 2005):

$$L_{ijk} = \beta \left(\ln CA_{jk}\right) + \gamma' X_{ijk} + \delta_j + \delta_k + \varepsilon_{ijk} \tag{3.1}$$

where the dependent variable $L_{ijk}$ is a measure of language skills and i indexes individuals, j indexes areas, k indexes language-groups. The coefficient of interest, $\beta$, represents the relationship between linguistic concentration and English language skills. The model includes pre-determined individual characteristics $X_{ijk}$ as well as language-group ($\delta_k$) and location fixed effects ($\delta_j$).

The effect of enclaves is identified by comparing immigrants who belong to the same language-group but live in different cities, and by comparing immigrants who live in the same location but belong to different language-groups. Any differences across locations in terms of local labor market conditions and opportunities for learning English (e.g. differences in supply of English courses) that do not vary across language-groups are absorbed by the area fixed effects. Similarly, estimates of enclave effects are unaffected by unobserved heterogeneity between language-groups that is constant across locations. For example, economic incentives to learn English might be weaker for language-groups who are subject to more discrimination in the labour market, and groups may differ in terms of social norms regarding social integration, as well as in terms of values and attitudes towards labour market participation (Bertrand, Luttmer and Mullainathan, 2000). Language-group fixed effects also account for the relationship between cost of language skill acquisition (degree of difficulty) and linguistic distance from English (Isphording and Otten, 2014).

One way to interpret the coefficients in equation (3.1) is through the lens of a language production function. The workhorse model of host-country language fluency among immigrants conceptualizes the human capital accumulation process as a function of three broad classes of factors: economic incentives, degree of exposure to the language, and efficiency in language acquisition (Chiswick and Miller, 1995, 1996). This approach formalizes the idea that language acquisition reflects a response to the economic benefits to fluency (incentives) as well as to the costs, which depend on one’s ability to learn new languages and the number of opportunities for using the language (Van Tubergen and Kalmijn, 2009). In this sense, enclaves may affect language acquisition either by reducing the incentives to invest in language skills, or by making learning more difficult.¹⁶

In cross-sectional analyses, one only observes immigrants’ degree of proficiency in the host-country language – the stock of English language skills – at one point in time, making it impossible to distinguish preexisting language skills at the time of migration from those learned subsequently. Unless immigrants are randomly allocated to locations, estimates from equation (3.1) do not identify the effect of enclaves on learning, a flow measure.

Language skills constitute a form of human capital which can be accumulated. It is therefore useful to decompose the stock of $L_{ijkt}$ at time $t$ into two components: the stock at time $t-1$ and the amount of learning experienced between the two periods ($\mathit{Learn}_{ijkt}$):

$$L_{ijkt} = (1-\zeta) L_{ijk,t-1} + \mathit{Learn}_{ijkt}$$

¹⁶ The Online Appendix lays out a simple conceptual model that clarifies these distinct interpretations.


where $\zeta \in (0, 1)$ is a depreciation parameter, allowing for one’s proficiency in English to decrease if the skill is never or rarely used. Provided that $L_{ijk,t-1}$ is observed, an estimate of the relationship between enclave size and the acquisition of new language skills can be obtained by estimating a version of the cross-sectional model in which past skills $L_{ijk,t-1}$ are included as a covariate. More precisely, with $t = 0$ corresponding to the first wave of LSIA, and $t = 1$ to the third wave:

$$\mathit{Learn}_{ijk1} = \beta \left(\ln CA_{jk}\right) + \gamma' X_{ijk} + \delta_j + \delta_k + \varepsilon_{ijk}$$

$$L_{ijk1} = \beta \left(\ln CA_{jk}\right) + (1-\zeta) L_{ijk0} + \gamma' X_{ijk} + \delta_j + \delta_k + \varepsilon_{ijk}. \tag{3.2}$$

Equation (3.2) is effectively a flow specification, and the coefficient $\beta$ now only reflects the effect of enclaves on learning.¹⁷ By including lagged proficiency as a covariate, this empirical strategy shares many similarities with the value-added literature in education, in which the endogeneity of some educational inputs (e.g. assignment to a better teacher) is accounted for by controlling for lagged values of the outcome of interest, generally test scores (Chetty, Friedman and Rockoff, 2014a).
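The logic of the flow specification can be illustrated with a small simulation. The sketch below is a stylized example and not the estimation code behind the tables: it uses a continuous skill index rather than a binary outcome, omits covariates and fixed effects, and all parameter values (`beta`, `zeta`, the strength of sorting) are assumed for illustration. The helper `ols` is a plain normal-equations solver written for this example.

```python
import random


def ols(X, y):
    """Least-squares coefficients via the normal equations (Gauss-Jordan)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda row: abs(A[row][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * z for x, z in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


random.seed(0)
beta, zeta = -0.06, 0.1  # assumed "true" values for the simulation
rows = []
for _ in range(5000):
    ln_ca = random.gauss(0, 1)
    # Sorting: residents of larger enclaves arrive with lower proficiency.
    l0 = 0.4 - 0.05 * ln_ca + random.gauss(0, 0.3)
    l1 = beta * ln_ca + (1 - zeta) * l0 + random.gauss(0, 0.3)
    rows.append((ln_ca, l0, l1))

y1 = [l1 for _, _, l1 in rows]
gain = [l1 - l0 for _, l0, l1 in rows]
cross = ols([[1.0, c] for c, _, _ in rows], y1)            # stock model, no lag
flow = ols([[1.0, c, l0] for c, l0, _ in rows], y1)        # lagged skills added
gain_fit = ols([[1.0, c, l0] for c, l0, _ in rows], gain)  # gain on the left-hand side

print("cross-sectional beta:", cross[1])
print("flow beta:           ", flow[1])
print("gain beta, lag coef.:", gain_fit[1], gain_fit[2])
```

In this simulated setting the cross-sectional coefficient is more negative than the flow coefficient because it also picks up sorting on initial proficiency, while the gain regression returns the same coefficient on concentration as the flow regression, with the coefficient on lagged skills shifted down by exactly one.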

Including past stock of language skills as a control variable accounts for selection into enclaves on the basis of pre-immigration proficiency and unobserved characteristics correlated with it, notably prior exposure to English. Likewise, within language-group differences in ambition and ability to learn new languages are likely to be echoed by past language skills. For instance, for most migrants, the decision to move to Australia was made at least some time before they actually migrated.¹⁸ In this context, the most ambitious individuals plausibly invested relatively more resources into developing their ability to speak English before migrating, anticipating this skill would be valuable in the host-country.

Yet, immigrants with the same linguistic background and the same degree of proficiency upon arrival may still differ in unobserved ways associated with both the propensity to locate in enclaves and the ability to learn English. I use two complementary strategies to address this concern. First, I exploit the scope of the LSIA1 questionnaire to examine how the estimated coefficient varies with the inclusion of a rich set of individual characteristics after conditioning on past proficiency. I then construct bounds on the causal effect of enclaves in section 3.4.2. Second, in section 3.4.3, I use the fact that immigrants sponsored by a family member (e.g. a spouse, a child, a close relative) tend to move close to their sponsor as the basis for an instrumental variable derived from the sponsor’s location.

3.4 Results

3.4.1 The Effect of Linguistic Concentration on Language Acquisition

Baseline estimates of the effect of linguistic concentration on language skills are shown in Table 3.2 for two outcome measures: self-reported binary fluency and an indicator of whether the interview was conducted in English or not.¹⁹ I use linear probability models, in conformity with previous work (Danzer

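¹⁷ This approach is equivalent to using a “gain” measure ($\Delta L_{ijk1} = L_{ijk1} - L_{ijk0}$) on the left-hand side. Subtracting $L_{ijk0}$ from both sides of equation (3.2), $\Delta L_{ijk1} = \beta \left(\ln CA_{jk}\right) - \zeta L_{ijk0} + \gamma' X_{ijk} + \delta_j + \delta_k + \varepsilon_{ijk}$.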

¹⁸ The LSIA sample includes only offshore visaed immigrants. Moreover, under the “cap and queue” system, immigrants applying for the Independent or Skilled-Australian Sponsored visa categories might have been put on a waiting list if the yearly limit of immigrants allowed under these visa categories had been reached at the time of their application (Chiswick and Miller, 2004).

¹⁹ Appendix Table C.1 presents results for three alternative measures of spoken English: an indicator for both conducting the interview in English and self-reporting being fluent, the principal component of these two variables, and an indicator


and Yaman, 2016; Dustmann and Fabbri, 2003; Chiswick and Miller, 1995). In all cases, the standard errors are clustered within language-group by area of residence cells, that is at the level of aggregation at which $CA_{jk}$ varies.²⁰ Panel A reports results for concentration measured at the broader SD level, and panel B presents the corresponding estimates for enclave size measured at the SSD level.

In columns (1) and (4), the set of individual characteristics is restricted to standard demographic variables (age, gender, marital status, household size, presence of children in household) and lagged proficiency is omitted. In line with previous findings, the relationship between linguistic concentration and language skills is negative and significant at the city-level (Statistical Division, panel A). A large set of observable characteristics plausibly related to ability and motivation to learn English (listed in Table 3.1) is further added in columns (2) and (5). The change in $R^2$ confirms that these variables have sizable predictive power for language skills.²¹ The coefficient on enclave size decreases slightly in magnitude, from -0.096 to -0.084 for self-reported speaking ability, and from -0.102 to -0.092 for language of interview. For comparison, Chiswick and Miller (1996) find that a one percentage point increase in own language-group’s share of the population at the metro level is associated with a 5 percentage point reduction in English proficiency in Australia. The point estimate reported in column (2) implies that a similar increase of one percentage point in concentration translates into a 5.6 percentage point decline in fluency.²²

Lagged language skills are added as a regressor in columns (3) and (6), and the corresponding coefficient on enclave size represents the effect on learning. The coefficient of interest is strongly statistically significant, but shrinks by a third relative to models that do not control for pre-immigration proficiency. In other words, linguistic concentration does appear to have a negative impact on language skill acquisition, but a considerable fraction of the cross-sectional estimate is attributable to sorting on the basis of pre-immigration proficiency. To better grasp the economic significance of these results, I first calculate the within-language group standard deviation in concentration relative to the sample mean, which is 45% of the mean at the SD-level. The impact of a one standard deviation increase in enclave size is a (0.058 × 0.45 =) 2.6 percentage point reduction in the probability of being fluent in English. To put this magnitude in perspective, consider the total change in proficiency between the first and third waves – a 17.6 percentage point (43%) increase, from 41% to 59%. The effect of a (within-group) standard deviation increase in linguistic concentration on language acquisition is equivalent to about (2.6/17.6 =) 15% of the sample’s total improvement between the two waves.²³
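The back-of-the-envelope calculation in the preceding paragraph can be reproduced as follows. The three inputs are the figures quoted in the text; the variable names are illustrative.

```python
beta_flow = 0.058  # magnitude of the coefficient on log concentration (SD level)
sd_rel = 0.45      # within-group standard deviation, as a share of the mean
gain_pp = 17.6     # increase in fluency between waves 1 and 3, percentage points

# Effect of a one (within-group) standard deviation increase, in percentage points.
effect_pp = beta_flow * sd_rel * 100
# Same effect expressed as a share of the total improvement in fluency.
share_of_gain = effect_pp / gain_pp

print(round(effect_pp, 1), round(share_of_gain, 2))  # prints 2.6 0.15
```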

In Panel B, the effect of SSD-level concentration ($CA^{SSD}_{jk}$) is small and generally statistically insignificant. This might be surprising given that these estimates rely on variation across SSDs both within and between SDs.²⁴ In Online Appendix 3.9, I show that SSD-level estimates can be interpreted as a weighted average of (a) the partial effect of SD-level concentration ($CA^{SD}_{jk}$), and (b) the partial effect

for whether the respondent indicated (in wave three) that their ability to speak English had improved at least moderately since the last interview.

²⁰ Two-way clustering at the area and language group levels produces almost identical results (unreported).

²¹ The adjusted $R^2$ increases by 38%, from 0.33 to 0.455, for self-reported proficiency, for example.

²² The average SD-level language-group share in my sample is 0.015. A one percentage point increase therefore corresponds to a 66% increase from the mean, and a (0.0844 × 0.66 =) 5.6 percentage point decrease in fluency.

²³ My estimates are strikingly close to those obtained by Danzer and Yaman (2016), who find that a one standard deviation increase in ethnic concentration at the regional level reduces the probability of being fluent in German by 3.8 percentage points. A full one standard deviation increase in concentration in my sample decreases proficiency by (0.058 × 0.66 =) 3.8 percentage points. My results are, however, markedly lower than most cross-sectional estimates. For example, Lazear (1999) finds that a one standard deviation increase in ethnic concentration lowers English proficiency by about 23 percentage points, that is by about a third of the overall fluency rate prevailing in the U.S. in 1990.

²⁴ The magnitudes of the coefficients at the SD and SSD level are not directly comparable because people do not locate randomly within cities. For instance, a 1% increase in linguistic concentration at the SSD level will conceivably translate into a less-than-1% increase in SD-level concentration.


of within-SD relative SSD-level concentration. Perhaps surprisingly, I find no evidence of an impact of $CA^{SSD}_{jk}$ on language acquisition working through the second channel. A decomposition of the estimates reported in panel B between these two channels suggests that any effect of $CA^{SSD}_{jk}$ works entirely through its relationship with SD-level concentration. The resulting coefficients are particularly small because considerably more weight is put on the (null) effect of relative SSD-level concentration conditional on SD-level enclave size. In other words, conditional on residing in a city where many people share one’s mother tongue, there is no incremental negative effect of residing in a relatively high-concentration neighborhood. Similarly, Cutler, Glaeser and Vigdor (2008) find that MSA-level group share is consistently associated with lower English ability in the U.S., but that within-MSA segregation, if anything, is positively associated with English ability conditional on MSA-level group share. In light of these results, the remainder of the paper will mainly focus on SD-level enclave size.

To validate that the effect of enclaves on learning is not driven by groups of outliers or by functional form assumptions, I perform a number of robustness and specification checks. First, I verify that the negative effect of linguistic concentration does not hinge on age restrictions (Table C.2, rows (b)-(d)) and confirm that the effect is not driven by a few large language groups (rows (e)-(h)). Excluding Chinese (21% of the sample) and Arabic (10%) speakers notably leads to coefficients larger in magnitude.²⁵ Excluding the two metropolitan areas where the vast majority of immigrants settle also leaves the main conclusion unchanged (rows (i) and (j)). Second, I take into account that individuals might have moved non-randomly across cities between the two interviews. I would over-estimate the negative effect of enclave size on language skills if, for example, individuals who were able to improve their ability to speak English markedly while living in an enclave then moved to a low-concentration area just before the third interview. Yet, this form of sorting is not cause for concern in the case of analyses conducted at the SD level, since less than 5% of respondents in my sample moved across SDs (37 percent moved between SSDs). The last four rows of Table C.2 display the estimated effect of linguistic concentration on language acquisition for different subgroups of “stayers”. Finally, Tables C.3 and C.4 respectively show that the results remain qualitatively unchanged if one uses probit models or measures concentration in levels rather than in logs.

3.4.2 Sensitivity to controls

In this section, I examine the robustness of the estimated enclave effects to the inclusion of a comprehensive set of individual characteristics. Intuitively, if controlling for lagged language skills accounts for most of the relevant unobserved heterogeneity, then the coefficient $\beta$ should be stable across specifications that do and do not account for observable characteristics once we have conditioned on past proficiency.

In Table 3.3, I re-estimate flow equation (3.2) at the SD-level and gradually add more covariates. In columns (1) and (4), the set of individual characteristics is empty, while columns (2) and (5) include baseline demographics, and numerous proxies for ability and motivation to learn English are further added in columns (3) and (6). These variables notably include a full set of education dummies, indicators for the main reason for choosing the State of residence, visa category dummies, the number of languages spoken well, and whether the PA expected to receive help learning English on arrival. For both self-

²⁵ Note that it may be the case that linguistic concentration is an inaccurate measure of social networks for these broad groups. The definition of Chinese language used in the Australian census encompasses more than six dialects (Cantonese, Hokkien, Kan/Hakka, Mandarin, Min, Teo-Chiew and Wu). The same is true of Arabic, which is a catchall category for all varieties of Arabic languages.


reported proficiency and language of interview, the coefficients are very similar across columns and differences are not statistically significant.

To provide a more complete picture, I then implement a test of how strong selection on unobservables would have to be, relative to selection on observables, to explain away the entire estimated effect. Building upon the work of Altonji, Elder and Taber (2005), Oster (2017) shows that under proportional selection on observables and unobservables, one can calculate bounds on the coefficient of interest using the magnitude of the movements of the coefficient and of the $R^2$ as controls are added. Bias-adjusted effects are approximately given by

$$\beta^{*} \approx \beta_L - \delta \left(\beta_S - \beta_L\right) \left[\frac{R_{max} - R_L}{R_L - R_S}\right]$$

where $\beta_L$ and $\beta_S$ are the estimated effects in the long (with covariates) and short (no covariate) models, respectively. $R_L$ and $R_S$ are the associated $R^2$s. The parameter $R_{max}$ is the $R^2$ we would observe if all relevant unobservables were also included in the full regression, and $\delta$ is the ratio of selection on unobservables over selection on observables. A fair approximation of $R_{max}$ can be obtained by running a regression that includes fixed effects for all possible combinations of SSDs and language-groups. Doing so drives up the $R^2$ to 0.70 for self-assessed English language skills. For completeness, I also compute more conservative bounds using $R_{max} = 1$.

Identified sets [βL, β∗] for δ = 1 are shown in Table 3.3.26 In all cases, the bounds are relatively tight, and the identified sets never include zero. In addition, I report values of δ that would be necessary to bring β down to zero. The results indicate that if the variation in language skills could be fully explained by observables and unobservables (Rmax = 1), selection on unobservables would have to be between 2 and 5 times larger than selection on observables for the estimated effect to vanish completely (columns (3) and (6)). If the relevant unobservables were only to raise the R2 to 0.7, then selection on unobservables would have to be 8 to 19 times larger than selection on observables to completely explain away enclave effects.
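As a rough check on the arithmetic, the approximation for the bias-adjusted effect can be applied to the Spoken English estimates in columns (1) and (2) of Table 3.3. The sketch below is illustrative only: the function names are hypothetical, and it implements the simple approximation rather than Oster's exact estimator, so small rounding differences are possible.

```python
def bias_adjusted_beta(beta_short, beta_long, r2_short, r2_long, r2_max, delta=1.0):
    """Approximation: beta_L - delta * (beta_S - beta_L) * (Rmax - R_L) / (R_L - R_S)."""
    scale = (r2_max - r2_long) / (r2_long - r2_short)
    return beta_long - delta * (beta_short - beta_long) * scale


def delta_for_zero_effect(beta_short, beta_long, r2_short, r2_long, r2_max):
    """Value of delta at which the approximate bias-adjusted effect equals zero."""
    scale = (r2_max - r2_long) / (r2_long - r2_short)
    return beta_long / ((beta_short - beta_long) * scale)


# Spoken English: short model (column 1) and Long 1 model (column 2) of Table 3.3.
beta_s, beta_l = -0.0610, -0.0569
r2_s, r2_l = 0.457, 0.555

for r2_max in (0.7, 1.0):
    beta_star = bias_adjusted_beta(beta_s, beta_l, r2_s, r2_l, r2_max)
    delta_zero = delta_for_zero_effect(beta_s, beta_l, r2_s, r2_l, r2_max)
    print(f"Rmax={r2_max}: identified set [{beta_l:.4f}, {beta_star:.4f}], "
          f"delta for beta=0: {delta_zero:.1f}")
```

With these inputs the approximation gives bias-adjusted effects of about -0.0508 (Rmax = 0.7) and -0.0383 (Rmax = 1), which agrees with the identified sets reported for column (2) of Table 3.3.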

3.4.3 Sponsored immigrants

The second approach I use to verify that my estimates of enclave effects are not driven by sorting is to focus on a subgroup of immigrants for whom the location decision is transparent: PAs sponsored by a relative. For these individuals, where they locate depends largely on where their sponsor happens to reside. For instance, when asked whether their sponsor’s location had an important influence on where they chose to live, 94% of sponsored PAs said that it did. By the time of the third interview, the vast majority (90%) of sponsored PAs indicated still living in the same city or town as their sponsor.

Table 3.4 reports estimates of enclave effects on language skills for sponsored PAs.27 Column (1) reproduces the baseline specification for the relevant subsample, and I gradually incorporate sponsor-related variables in columns (2) through (5). First, I control for a few sponsor observable characteristics (column (2)), which does not materially affect the magnitude of the coefficient of interest.28 This result

26Oster (2017) suggests that equal selection (δ = 1) is an appropriate upper bound.

27For completeness, Table C.5 presents results for the alternative measures of language skills also used in Table C.1.

28The characteristics are the number of years lived in Australia, whether the sponsor is an Australian citizen, and whether English is the language spoken at home. A few immigrants who were sponsored by their fiancé/spouse declared that English was spoken at home at the time of the first interview, suggesting that their spouse does not share their mother tongue.


is robust to adding fixed effects for the sponsor’s State of residence (column (3)), restricting the sample to PAs whose sponsor has been living in Australia for at least 5 years (column (4)), or adding a set of dummies for the sponsor’s relationship with the PA (column (5)) – e.g. spouse, uncle, daughter.

In columns (6) through (10), enclave size is instrumented with linguistic concentration in the sponsor’s State of residence.29 The exclusion restriction under this approach postulates that unobserved sponsor characteristics that directly impede or facilitate the immigrant’s acquisition of language skills are not systematically related to the sponsor’s propensity to locate in enclaves. One concern, for instance, is that sponsors located in enclaves are systematically less likely to directly help PAs become fluent, in which case any relationship between enclave size and learning might operate via the sponsor’s behavior rather than the enclave environment itself. Note that in theory, however, the sponsor’s own English language skills could have either a positive or a negative effect on the PAs’ learning, since proficient sponsors could act as either teachers or interpreters.30 Examining the relationship between linguistic concentration in the sponsor’s location and the likelihood that the PA received help from their relatives learning English, I find no evidence that either a positive or negative link exists.31

In most specifications, the 2SLS estimates are moderately smaller than the corresponding OLS estimates, and are considerably less precise. Given that I can only measure enclave size in the sponsor’s location at a very high level of aggregation, these results must be interpreted with caution. Yet, the magnitude of the point estimates tends to be in line with my baseline estimates of enclave effects, supporting the hypothesis that enclaves causally slow down language acquisition.
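To illustrate the logic of comparing OLS with instrumental-variable estimates, the sketch below simulates a setting in which an unobserved trait both raises enclave size and lowers learning, so that OLS overstates the negative effect, while an instrument unrelated to that trait recovers it. Everything here is hypothetical: the data are simulated, the parameter values are arbitrary, and the single-instrument ratio estimator stands in for the full 2SLS specification with controls and fixed effects reported in Table 3.4.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def simulate(n=20000, true_effect=-0.06, seed=0):
    """Simulate (instrument, enclave size, outcome) with an unobserved confounder."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        instrument = rng.gauss(0, 1)   # stands in for concentration in the sponsor's State
        confounder = rng.gauss(0, 1)   # unobserved trait that drives sorting
        enclave = 0.8 * instrument + confounder + rng.gauss(0, 0.5)
        outcome = true_effect * enclave - 0.05 * confounder + rng.gauss(0, 0.1)
        z.append(instrument)
        x.append(enclave)
        y.append(outcome)
    return z, x, y


z, x, y = simulate()
ols = covariance(x, y) / covariance(x, x)   # contaminated by the confounder
iv = covariance(z, y) / covariance(z, x)    # just-identified IV (ratio) estimator
print(f"OLS: {ols:.3f}  IV: {iv:.3f}  true effect: -0.060")
```

In this simulated example the OLS coefficient is more negative than the true effect, whereas the IV estimate is close to it. The example says nothing about the direction or size of any bias in the actual data.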

3.5 Heterogeneous effects and mechanisms

To shed light on the possible mechanisms through which linguistic concentration impedes language acquisition, I first examine which groups of immigrants are most affected by enclaves. Table 3.5 displays the results of an analysis of the heterogeneity of effects. Linguistic concentration appears to strongly influence the language acquisition of women, but not of men.32 If women’s attachment to the labor market is weaker, then their labour market participation, and therefore indirectly their decision to invest in host-country human capital, are plausibly more responsive to environmental circumstances. Secondly, English language acquisition seems unrelated to linguistic concentration for immigrants older than 35 at the time of immigration. Presumably, the lifetime benefits of learning English are lower for older individuals who have fewer years left on the job market. These two pieces of evidence suggest that language acquisition indeed responds to economic incentives. For the most and least educated immigrants, learning appears relatively insensitive to the size of their linguistic group. Arguably, it might be the case that high (low) ability individuals find it particularly easy (difficult) to learn a new

29Sponsor location is only recorded at a fairly aggregate level that cannot be accurately matched with Statistical Divisions. Yet, the first-stage remains strong (F-stat > 75). This instrument shares some similarities with the approach taken by Edin, Fredriksson and Åslund (2003), who instrument ethnic concentration in the current area of residence with enclave size in the municipality to which refugees were assigned at the time of migration under a refugee placement program in Sweden. It also incorporates insight from Bertrand, Luttmer and Mullainathan (2000), who instrument PUMA-level concentration with concentration at a more aggregate level of geography (MSA).

30Chiswick and Miller (1995) make a similar point regarding the influence of children born in the host-country on their immigrant parents’ proficiency.

31In the first interview, LSIA respondents were asked if their relatives helped them learn English, and if they helped them get a job. The F-stats from univariate regressions of binary variables indicating whether help was received on linguistic concentration in the sponsor’s location are 0.89 and 0.5, for learning English and finding work respectively.

32Danzer and Yaman (2016) find that enclaves affect men and women equally in Germany, whereas Warman (2007) reports larger effects of enclaves on language skills for women than men in Canada.


language independently of the social context.

Table 3.6 further unpacks the possible ways in which enclaves might affect language acquisition via economic incentives. My objective, here, is to uncover some general patterns rather than isolate causal effects of enclaves on intermediate outcomes. The results are therefore suggestive at best.

In column (1), the relationship between enclave size and the propensity to take up English courses is reported.33 The small and statistically insignificant coefficient implies that enclave residents do not have a lower propensity to enroll in formal language courses than immigrants living outside of enclaves. The effect of enclaves on proficiency therefore probably works through social interactions, not differences in formal training. Not only does English proficiency provide strong economic value by increasing the number of people with whom one can interact, but learning itself depends on the number of opportunities to practice speaking the language. The absence of a correlation between enclave size and English course take-up is consistent with the fact that linguistic concentration has little or no effect on reading and writing skills, the learning of which plausibly depends much less on social interactions outside the classroom than speaking abilities (Table C.6). From a policy perspective, these patterns suggest that additional public subsidies towards formal English courses would not induce enclave residents to invest more in language skills, considering that such courses were already provided for free in Australia.

In columns (2) through (7) of Table 3.6, I examine indirect pieces of evidence regarding the role of social interactions. In column (2), the dependent variable is the amount of contact between people from different countries and cultures, as perceived by the PA. Concentration is associated with a decrease in the amount of such contacts, but the relationship is not statistically significant. Columns (3) and (4) indicate that enclave size is not associated with lower income or lower employment rates, in line with Danzer and Yaman (2016), Damm (2009), and Edin, Fredriksson and Åslund (2003).34 Columns (5) and (6) summarize the type of jobs occupied by workers and their job search strategies, conditional on being employed at the time of the third wave. Again, the relationships are not statistically significant, but the point estimates suggest that more enclave residents hold jobs at which they speak languages other than English and that they found their jobs through friends and family.35 Finally, in column (7), I directly test the hypothesis that the return to proficiency is lower in enclaves. To do so, for each SD by language-group cell, I calculate the difference in employment rates between fluent and non-fluent individuals. I then regress this measure of the return to proficiency on enclave size, and find that they are indeed negatively related.
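The construction of the cell-level outcome used in column (7) can be sketched as follows. The records and cell labels are made up for illustration and the helper name is hypothetical; the sketch ignores survey weights and only shows how the difference in employment rates between fluent and non-fluent individuals is formed within each SD by language-group cell, with the measure left undefined for cells in which everyone or no one is fluent.

```python
from collections import defaultdict


def return_to_proficiency(records):
    """Employment-rate gap (fluent minus non-fluent) for each cell.

    records: iterable of (cell, fluent, employed) tuples with boolean flags.
    Cells where everyone or no one is fluent are omitted (gap undefined).
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for cell, fluent, employed in records:
        groups[cell][fluent].append(1 if employed else 0)

    gaps = {}
    for cell, by_fluency in groups.items():
        fluent, non_fluent = by_fluency[True], by_fluency[False]
        if fluent and non_fluent:
            gaps[cell] = sum(fluent) / len(fluent) - sum(non_fluent) / len(non_fluent)
    return gaps


# Hypothetical, unweighted records: (SD by language-group cell, fluent, employed).
records = [
    (("Sydney", "Arabic"), True, True),
    (("Sydney", "Arabic"), True, True),
    (("Sydney", "Arabic"), False, True),
    (("Sydney", "Arabic"), False, False),
    (("Perth", "Arabic"), True, True),
    (("Perth", "Arabic"), False, False),
    (("Perth", "Spanish"), True, True),   # no non-fluent member: cell is dropped
]

gaps = return_to_proficiency(records)
print(gaps)
```

The resulting cell-level gaps would then serve as the dependent variable in a regression on enclave size; that regression step is not shown here.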

While the estimates presented in Table 3.6 are all very imprecise, a general picture emerges. The penalty for reduced English proficiency, in terms of probability of finding work, is smaller in enclaves, plausibly because jobs with weaker language requirements are more available. As a result, enclave residents are just as likely to find work, despite having lower language skills. One may then speculate that the work environment itself provides fewer opportunities to practice English in enclaves, thereby reducing language skill acquisition.

33In Australia, immigrants are provided with up to 510 hours of free English language tuition through the Adult Migrant English Program (AMEP). Notably, the program also offers correspondence or distance learning courses and Home Tutor services, in addition to formal classroom tuition (Martin, 1998).

34Any reduction in income due to weaker language skills in enclaves relative to non-enclaves is possibly counterbalanced by the benefits of language group-based job information networks.

35The results are fairly similar at the SSD level, but the negative coefficients for “contact between cultures” and “English only at work” are statistically significant at that level of aggregation.


3.6 Concluding Remarks

This paper estimates the relationship between linguistic concentration and language skill acquisition in an English-speaking country. While a negative effect of enclaves on learning is confirmed, I demonstrate that non-random sorting of immigrants on the basis of pre-immigration proficiency is prevalent, and considerably inflates cross-sectional estimates. The stability of the estimated relationship across several empirical strategies attests to the robustness of this result. Overall, I find that a (within language-group) standard deviation increase in concentration translates into a 2.6 percentage point reduction in learning rates. Considering the large economic returns to proficiency, this negative effect of enclaves possibly contributes to slowing down the economic assimilation of immigrants (Borjas, 2015).

The evidence presented in this paper suggests that policies aimed at reducing the cost of formal English courses are unlikely to attenuate the negative effect of linguistic concentration.36 The results are perhaps more consistent with social interactions being the main mechanism through which enclaves affect the acquisition of host country-specific human capital. If English skills are practiced and improved at work, then some immigrants might be caught in a circular relationship between lack of English job opportunities and low English proficiency. Speculatively, poor initial language skills might prevent individuals from finding certain types of jobs, and the resulting scarcity of interactions in English with colleagues further hinders language skill acquisition. The question of which type of intervention will be more successful at reducing the effects of enclaves on language skill acquisition – e.g. anti-discrimination and labor market integration policies, or social integration activities – warrants further research.

36Such policies may very well help increase proficiency rates in general, however. The analyses presented here only indicate that the effect of enclaves on proficiency is not mediated by English course take-up.


3.7 Tables and Figures

Figure 3.1: Language Acquisition

[Figure 3.1 comprises two panels. Each plots, for Waves 1 to 3, the Non-enclave and Enclave series together with their Difference (Non-enclave - Enclave). First panel: fraction speaking English “Well” or “Very well”. Second panel: fraction interviewed in English.]

Notes: Data from the Longitudinal Survey of Immigrants to Australia. Enclaves defined as areas with linguistic concentration above the language-specific median, at the SD level. Statistics are based on 2,053 observations and are calculated using the provided survey weights.


Table 3.1: Descriptive statistics

Individual characteristics                  Full sample  Enclave      Non-enclave  Difference   Difference
                                                                                   (2) - (3)    (s.d.)
                                            (1)          (2)          (3)          (4)          (5)
A. Demographics
Women                                       0.508        0.541        0.493        0.0475**     (0.0242)
Age                                         32.56        33.963       31.97        1.993***     (0.487)
Married                                     0.770        0.725        0.789        -0.0641***   (0.0203)
Children present                            0.345        0.385        0.329        0.0563**     (0.0230)
Household size                              4.012        4.192        3.937        0.255**      (0.0998)
B. Ability and motivation proxies
Education:
  Bachelor/Professional degree or more      0.273        0.231        0.290        -0.0590***   (0.0215)
  Trade/Highschool                          0.454        0.417        0.470        -0.0531**    (0.0241)
  Less than High school                     0.273        0.352        0.240        0.112***     (0.0214)
Visa:
  Preferential Family                       0.586        0.553        0.601        -0.0480**    (0.0238)
  Concessional Family                       0.079        0.088        0.076        0.0120       (0.0131)
  Business Skills/ENS                       0.025        0.031        0.023        0.0085       (0.0076)
  Independent                               0.122        0.127        0.121        0.0058       (0.0159)
  Humanitarian                              0.187        0.202        0.180        0.0218       (0.0189)
Reason for choosing State living in:
  Spouse/partner lived here                 0.466        0.408        0.490        -0.0825***   (0.0241)
  Employer is located here                  0.018        0.019        0.017        0.0014       (0.0064)
  Job opportunities                         0.069        0.067        0.070        -0.0026      (0.0122)
  More family in this state                 0.293        0.365        0.264        0.101***     (0.0219)
  Friends living here                       0.073        0.060        0.078        -0.0188      (0.0126)
Reason moved to Australia:
  Better employment opportunities           0.198        0.159        0.215        -0.0563***   (0.0193)
  To join family/relatives                  0.381        0.407        0.371        0.0362       (0.0235)
  To get married                            0.167        0.156        0.171        -0.0147      (0.0180)
  Better future for family                  0.147        0.173        0.137        0.0357**     (0.0171)
Was working in Prior Country                0.697        0.661        0.713        -0.0521**    (0.0222)
Visited Australia before                    0.273        0.306        0.259        0.0470**     (0.0215)
Requested job-related information           0.461        0.448        0.466        -0.0176      (0.0241)
Expected to receive help finding work       0.515        0.463        0.536        -0.0730***   (0.0241)
Expected to receive help learning English   0.525        0.534        0.521        0.0129       (0.0242)
Number of languages spoken well             1.473        1.507        1.459        0.0481       (0.0356)

Notes: Enclaves defined as areas with linguistic concentration above the language-specific median, at the SD level. Statistics are based on 2,053 observations and are calculated using the provided survey weights. All individual characteristics are derived from wave one survey questions. For variables pertaining to the main reason for choosing to move to Australia/for choosing State living in, only the most popular answers are shown. “Requested job-related information” and the “Expected to receive help...” variables are retrospective questions, referring to the period just before moving to Australia.


Table 3.2: Main results

Dependent variable:               Spoken English                      Interview in English
                                  (1)         (2)         (3)         (4)         (5)         (6)
A. Statistical Division level (SD)
ln(Enclave size)                  -0.0958***  -0.0844***  -0.0584***  -0.102***   -0.0918***  -0.0613***
                                  (0.0237)    (0.0226)    (0.0206)    (0.0212)    (0.0213)    (0.0205)
R2                                0.363       0.491       0.568       0.386       0.492       0.553
B. Statistical Subdivision level (SSD)
ln(Enclave size)                  -0.0234     -0.0345**   -0.0212     -0.0236     -0.0326**   -0.0218
                                  (0.0162)    (0.0147)    (0.0131)    (0.0153)    (0.0149)    (0.0137)
R2                                0.359       0.490       0.567       0.382       0.489       0.552
N                                 2053        2053        2053        2053        2053        2053
Language group fixed effects      X           X           X           X           X           X
SSD fixed effects                 X           X           X           X           X           X
Demographic characteristics       X           X           X           X           X           X
Motivation/Ability proxies                    X           X                       X           X
Wave 1 dependent variable                                 X                                   X

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level in panel A, and the SSD by language-group level in panel B, are reported in parentheses. Data are weighted using the provided sample weights. All specifications are linear probability models. Demographic characteristics are age, gender, household size, marital status, and presence of children. Motivation/ability proxies are education dummies, main reason for moving to Australia, main reason for choosing State of residence, number of languages spoken well, having visited Australia prior to immigration, visa category, work status in former country, whether the PA requested job-related information before moving, whether the PA expected to receive help finding work, and whether the PA expected to receive help learning English. In columns (1) and (2), the dependent variable is equal to one if English is either spoken “Well” or “Very well”.
***, **, * indicate significance at the 1%, 5%, and 10% levels.

Table 3.3: Main results - Sensitivity to observable characteristics

Dependent variable:               Spoken English                                              Interview in English
Specification:                    Short               Long 1              Long 2              Short               Long 1              Long 2
                                  (1)                 (2)                 (3)                 (4)                 (5)                 (6)
ln(Enclave size)                  -0.0610***          -0.0569***          -0.0584***          -0.0658***          -0.0597***          -0.0613***
                                  (0.0228)            (0.0196)            (0.0206)            (0.0222)            (0.0202)            (0.0205)
R2                                0.457               0.555               0.568               0.461               0.539               0.553
Identified set if
  Rmax = 0.7                                          [-0.0569,-0.0508]   [-0.0584,-0.0553]                       [-0.0597,-0.0471]   [-0.0613,-0.0541]
  Rmax = 1                                            [-0.0569,-0.0383]   [-0.0584,-0.0483]                       [-0.0597,-0.0236]   [-0.0613,-0.0394]
Selection necessary for β=0 if
  Rmax = 0.7                                          9.4                 18.9                                    4.7                 8.5
  Rmax = 1                                            3.1                 5.8                                     1.7                 2.8
Language group fixed effects      X                   X                   X                   X                   X                   X
SSD fixed effects                 X                   X                   X                   X                   X                   X
Demographic characteristics                           X                   X                                       X                   X
Motivation/Ability proxies                                                X                                                           X
Wave 1 dependent variable         X                   X                   X                   X                   X                   X

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. The method used to calculate the identified set as well as the amount of selection on unobservables necessary for β to shrink to zero is from Oster (2017) and is discussed in the main text.
***, **, * indicate significance at the 1%, 5%, and 10% levels.


Table 3.4: Sponsored Immigrants

Estimation:                       OLS                                                         2SLS
                                  (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)         (9)         (10)
A. Dependent variable: Spoken English
ln(Enclave size)                  -0.0774***  -0.0705***  -0.0709***  -0.0721**   -0.0721***  -0.0498     -0.0457     -0.0559*    -0.0598     -0.0546
                                  (0.0240)    (0.0234)    (0.0235)    (0.0281)    (0.0253)    (0.0395)    (0.0383)    (0.0313)    (0.0389)    (0.0335)
B. Dependent variable: Interview in English
ln(Enclave size)                  -0.0555**   -0.0478**   -0.0471**   -0.0413*    -0.0581**   -0.0480     -0.0456     -0.0238     -0.0310     -0.0359
                                  (0.0240)    (0.0228)    (0.0233)    (0.0245)    (0.0241)    (0.0595)    (0.0579)    (0.0291)    (0.0311)    (0.0327)
N                                 1422        1422        1422        1127        1231        1422        1422        1422        1127        1231
Language group fixed effects      X           X           X           X           X           X           X           X           X           X
SSD fixed effects                 X           X           X           X           X           X           X           X           X           X
Demographic characteristics       X           X           X           X           X           X           X           X           X           X
Motivation/Ability proxies        X           X           X           X           X           X           X           X           X           X
Wave 1 dependent variable         X           X           X           X           X           X           X           X           X           X
Sponsor's characteristics                     X           X           X           X                       X           X           X           X
Sponsor's location fixed effects                          X           X           X                                   X           X           X
Sample restriction: Sponsor in
  Australia for ≥ 5 years                                             X                                                           X
Sponsor's relationship with PA                                                    X                                                           X

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. The sample is restricted to PAs sponsored by a relative. The sponsor’s location is at the State level. Sponsor characteristics are number of years lived in Australia, Australian citizen status, and whether English is the language spoken at home. In columns (5) and (10), a full set of dummies for categories of the sponsor’s relationship with the PA is included. In columns (6) through (10), the instrument is ln(Enclave size) in the sponsor’s State of residence.
***, **, * indicate significance at the 1%, 5%, and 10% levels.


Table 3.5: Heterogeneous effects

                      Dependent variable
Subsample             Spoken English  Interview in English  Sample size
                      (1)             (2)                   (3)
Baseline              -0.0584***      -0.0613***            2053
                      (0.0206)        (0.0205)
Men                   -0.0367         -0.0205               1137
                      (0.0323)        (0.0312)
Women                 -0.0901***      -0.0945***            916
                      (0.0300)        (0.0324)
Education: Low        -0.0333         -0.0478*              729
                      (0.0299)        (0.0278)
Education: Middle     -0.0782***      -0.103***             917
                      (0.0229)        (0.0251)
Education: High       -0.0745         0.0240                407
                      (0.0830)        (0.0560)
Age >= 35             -0.0122         -0.0060               769
                      (0.0398)        (0.0377)
Age < 35              -0.0697***      -0.0744***            1284
                      (0.0230)        (0.0236)

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. High education denotes a bachelor degree or more, medium education is a professional or trade degree, or 12 years of schooling or more, and low education is less than 12 years of schooling.
***, **, * indicate significance at the 1%, 5%, and 10% levels.


Table 3.6: Possible Mechanisms

Dependent variable: (1) English course take-up; (2) Contact between cultures; (3) ln(Income); (4) Working; (5) English only language at work; (6) Found job through friends or family; (7) Return to proficiency

                                  (1)         (2)         (3)         (4)         (5)         (6)         (7)
ln(Enclave size)                  0.0187      -0.0267     0.0260      -0.0118     -0.0355     0.0508      -0.149*
                                  (0.0278)    (0.0242)    (0.0854)    (0.0329)    (0.0383)    (0.0340)    (0.0847)
N                                 2053        2053        2031        2053        1226        1226        1782
Language group fixed effects      X           X           X           X           X           X           X
SSD fixed effects                 X           X           X           X           X           X           X
Demographic characteristics       X           X           X           X           X           X           X
Motivation/Ability proxies        X           X           X           X           X           X           X

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. English course take-up takes value one if the PA has ever been enrolled in a course after immigration by the time of the third interview. Contact between cultures takes value one if the PA believes there is a lot of or some contact between cultures in Australia. Income is measured by the mid-range value of 13 possible weekly income brackets. In column (4), the dependent variable indicates whether the PA is working in wave three. For working individuals, columns (5) and (6) use indicators for only speaking English at work, and having found the job through friends or family. The dependent variable in column (7) is a SD by language-group specific measure of the return to proficiency, in terms of employment probabilities. This outcome is missing for cells in which either everyone is fluent or no one is fluent.
***, **, * indicate significance at the 1%, 5%, and 10% levels.


3.8 Appendix Tables and Figures

Table C.1: Baseline results - Alternative measures of spoken English

Dependent variable: columns (1)-(3): Both self-declares speaking English and interviews in English; columns (4)-(6): Composite of self-declared spoken English and interview in English; columns (7)-(9): How English has improved since last interview

                                  (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)         (9)
A. Statistical Division level (SD)
ln(Enclave size)                  -0.121***   -0.111***   -0.0746***  -0.220***   -0.196***   -0.122***   -0.0549*    -0.0478*    -0.0455
                                  (0.0245)    (0.0233)    (0.0227)    (0.0452)    (0.0429)    (0.0395)    (0.0286)    (0.0288)    (0.0283)
R2                                0.374       0.499       0.579       0.424       0.557       0.648       0.123       0.162       0.163
B. Statistical Subdivision level (SSD)
ln(Enclave size)                  -0.0357**   -0.0467***  -0.0350***  -0.0521*    -0.0745**   -0.0432*    -0.0402**   -0.0388**   -0.0377**
                                  (0.0162)    (0.0148)    (0.0132)    (0.0313)    (0.0289)    (0.0253)    (0.0174)    (0.0177)    (0.0177)
R2                                0.369       0.497       0.578       0.420       0.555       0.648       0.124       0.164       0.164
Language group fixed effects      X           X           X           X           X           X           X           X           X
SSD fixed effects                 X           X           X           X           X           X           X           X           X
Demographic characteristics       X           X           X           X           X           X           X           X           X
Motivation/Ability proxies                    X           X                       X           X                       X           X
Wave 1 dependent variable                                 X                                   X                                   Xa

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level in panel A, and the SSD by language-group level in panel B, are reported in parentheses. Data are weighted using the provided sample weights. Columns (1)-(3): the outcome is an indicator for both doing the interview in English and speaking English “Well” or “Very well”. Columns (4)-(6): the outcome is the principal component of doing the interview in English and speaking English “Well” or “Very well”. Columns (7)-(9): the outcome is a binary variable that takes value one if the PA declares that her English has improved at least moderately since the last interview (the question is asked in wave three). In column (9), the lagged dependent variable is speaking English “Well” or “Very well” in wave one. Note that the outcome in columns (7)-(9) is a flow measure.
***, **, * indicate significance at the 1%, 5%, and 10% levels.


Table C.2: Baseline results - Robustness to sample restrictions

                                            Dependent variable
Subsample                                   Spoken English  Interview in English  Sample size
                                            (1)             (2)                   (3)
(a) Baseline                                -0.0584***      -0.0613***            2053
                                            (0.0206)        (0.0205)
(b) Age: 25-64                              -0.0408*        -0.0464**             1741
                                            (0.0226)        (0.0201)
(c) Age: 18-54                              -0.0604***      -0.0787***            1952
                                            (0.0210)        (0.0196)
(d) Age: 25-54                              -0.0418*        -0.0690***            1640
                                            (0.0228)        (0.0193)
(e) Chinese speakers excluded               -0.0663***      -0.0634***            1753
                                            (0.0207)        (0.0206)
(f) Arabic speakers excluded                -0.0797***      -0.0781***            1806
                                            (0.0204)        (0.0202)
(g) Spanish speakers excluded               -0.0561***      -0.0639***            1877
                                            (0.0211)        (0.0206)
(h) Vietnamese speakers excluded            -0.0545**       -0.0548***            1934
                                            (0.0241)        (0.0207)
(i) Sydney excluded                         -0.0991***      -0.0833***            1149
                                            (0.0260)        (0.0247)
(j) Melbourne excluded                      -0.0728***      -0.0792***            1461
                                            (0.0235)        (0.0209)
(k) Stayers: Always same SD                 -0.0502**       -0.0585***            1933
                                            (0.0226)        (0.0221)
(l) Stayers: Always same SSD                -0.0671**       -0.0982***            1119
                                            (0.0288)        (0.0257)
(m) Same dwelling for at least 24 months    -0.0297         -0.0257               1092
                                            (0.0363)        (0.0309)
(n) Same dwelling for at least 18 months    -0.0588*        -0.0381               1274
                                            (0.0312)        (0.0307)

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. In rows (m)-(n), the subsample of stayers is based on a question asked in wave three.
***, **, * indicate significance at the 1%, 5%, and 10% levels.


Table C.3: Baseline results - Probit specification

Dependent variable: columns (1)-(3): Spoken English (1-4 scale); columns (4)-(6): Spoken English (binary); columns (7)-(9): Interview in English

                                  (1)         (2)         (3)         (4)         (5)         (6)         (7)         (8)         (9)
A. Statistical Division level (SD)
ln(Enclave size)                  -0.213***   -0.229***   -0.177**    -0.325***   -0.320***   -0.240**    -0.300***   -0.326***   -0.238*
                                  (0.0750)    (0.0714)    (0.0714)    (0.0996)    (0.102)     (0.119)     (0.114)     (0.119)     (0.142)
B. Statistical Subdivision level (SSD)
ln(Enclave size)                  -0.0598     -0.0722     -0.00684    -0.0842     -0.0962     -0.0469     -0.0704     -0.0768     -0.0627
                                  (0.0466)    (0.0471)    (0.0469)    (0.0609)    (0.0631)    (0.0660)    (0.0632)    (0.0667)    (0.0738)
N                                 2053        2053        2053        2053        2053        2053        2053        2053        2053
Language group fixed effects      X           X           X           X           X           X           X           X           X
SSD fixed effects                 X           X           X           X           X           X           X           X           X
Demographic characteristics       X           X           X           X           X           X           X           X           X
Motivation/Ability proxies                    X           X                       X           X                       X           X
Wave 1 dependent variable                                 X                                   X                                   X

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD-level, are reported in parentheses. Data are weighted using the provided sample weights. Columns (1)-(3) show estimates of an ordered probit model. Columns (4) to (9) are binary probit models.
***, **, * indicate significance at the 1%, 5%, and 10% levels.

Table C.4: Baseline results - Alternative functional forms

Dependent variable:                 Spoken English                      Interview in English
                                    (1)         (2)         (3)         (4)         (5)         (6)
Statistical Division level (SD)
Contact availability                -0.0245     -0.253***               -0.0249     -0.219***
                                    (0.0154)    (0.0605)                (0.0159)    (0.0589)
Contact availability squared                    0.0575***                           0.0489***
                                                (0.0144)                            (0.0143)
Contact availability: 2nd quintile                          -0.115***                           -0.0672**
                                                            (0.0355)                            (0.0277)
Contact availability: 3rd quintile                          -0.0452                             -0.0322
                                                            (0.0411)                            (0.0408)
Contact availability: 4th quintile                          -0.114***                           -0.157***
                                                            (0.0369)                            (0.0366)
Contact availability: 5th quintile                          -0.0940**                           -0.0320
                                                            (0.0365)                            (0.0330)
N                                   2053        2053        2053        2053        2053        2053
Language group fixed effects        X           X           X           X           X           X
SSD fixed effects                   X           X           X           X           X           X
Demographic characteristics         X           X           X           X           X           X
Motivation/Ability proxies          X           X           X           X           X           X
Wave 1 dependent variable           X           X           X           X           X           X

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. In columns (3) and (6), the effect of enclave size is estimated using dummies for each quintile of the distribution of $CA^{SD}_{jk}$. The omitted category is the first quintile. ***, **, * indicate significance at the 1%, 5%, and 10% levels.


Table C.5: Sponsored Immigrants - Alternative measures of spoken English

Estimation: OLS in columns (1)-(5); 2SLS in columns (6)-(10).

| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) |
|---|---|---|---|---|---|---|---|---|---|---|
| A. Dependent variable: Both self-declares speaking English and interviews in English | | | | | | | | | | |
| ln(Enclave size) | -0.0882*** | -0.0788*** | -0.0789*** | -0.0787*** | -0.0864*** | -0.0583 | -0.0510 | -0.0590* | -0.0656* | -0.0646* |
| | (0.0253) | (0.0247) | (0.0249) | (0.0255) | (0.0265) | (0.0388) | (0.0376) | (0.0336) | (0.0352) | (0.0363) |
| B. Dependent variable: Composite of self-declared spoken English and interview in English | | | | | | | | | | |
| ln(Enclave size) | -0.136*** | -0.125*** | -0.125*** | -0.123** | -0.137*** | -0.0993 | -0.0953 | -0.0842 | -0.0994 | -0.0943 |
| | (0.0452) | (0.0442) | (0.0447) | (0.0477) | (0.0478) | (0.0887) | (0.0859) | (0.0578) | (0.0654) | (0.0648) |
| C. How English has improved since last interview | | | | | | | | | | |
| ln(Enclave size) | -0.0647** | -0.0601** | -0.0596** | -0.0131 | -0.0934*** | -0.0915* | -0.1000* | -0.0971** | -0.0644 | -0.122*** |
| | (0.0307) | (0.0299) | (0.0301) | (0.0351) | (0.0318) | (0.0516) | (0.0521) | (0.0399) | (0.0512) | (0.0422) |
| N | 1422 | 1422 | 1422 | 1127 | 1231 | 1422 | 1422 | 1422 | 1127 | 1231 |
| Language group fixed effects | X | X | X | X | X | X | X | X | X | X |
| SSD fixed effects | X | X | X | X | X | X | X | X | X | X |
| Demographic characteristics | X | X | X | X | X | X | X | X | X | X |
| Motivation/Ability proxies | X | X | X | X | X | X | X | X | X | X |
| Wave 1 dependent variable | X | X | X | X | X | X | X | X | X | X |
| Sponsor's characteristics | | X | X | X | X | | X | X | X | X |
| Sponsor's location fixed effects | | | X | X | X | | | X | X | X |
| Sample restriction: Sponsor in Australia for ≥ 5 years | | | | X | | | | | X | |
| Sponsor's relationship with PA | | | | | X | | | | | X |

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. The sample is restricted to PAs sponsored by a relative. The sponsor’s location is at the State level. Sponsor characteristics are number of years lived in Australia, Australian citizen status, and whether English is the language spoken at home. In columns (5) and (10), a full set of dummies for categories of the sponsor’s relationship with the PA is included. In columns (6) through (10), the instrument is ln(Enclave size) in the sponsor’s State of residence. In panel C, the lagged dependent variable is speaking English “Well” or “Very well” in wave one. ***, **, * indicate significance at the 1%, 5%, and 10% levels.

Table C.6: Reading and Writing skills

Dependent variable: Reading skills in columns (1)-(3); Writing skills in columns (4)-(6).

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| A. Statistical Division level (SD) | | | | | | |
| ln(Enclave size) | -0.0283 | -0.0397** | -0.0290* | -0.0089 | -0.0204 | -0.0012 |
| | (0.0204) | (0.0197) | (0.0174) | (0.0236) | (0.0238) | (0.0214) |
| R² | 0.422 | 0.453 | 0.508 | 0.404 | 0.442 | 0.511 |
| B. Statistical Subdivision level (SSD) | | | | | | |
| ln(Enclave size) | -0.0156 | -0.0210 | -0.0091 | -0.0138 | -0.0187 | -0.0075 |
| | (0.0147) | (0.0149) | (0.0134) | (0.0142) | (0.0132) | (0.0120) |
| R² | 0.422 | 0.453 | 0.508 | 0.428 | 0.443 | 0.511 |
| N | 2053 | 2053 | 2053 | 2053 | 2053 | 2053 |
| Language group fixed effects | X | X | X | X | X | X |
| SSD fixed effects | X | X | X | X | X | X |
| Demographic characteristics | X | X | X | X | X | X |
| Motivation/Ability proxies | | X | X | | X | X |
| Wave 1 dependent variable | | | X | | | X |

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. Both measures of language skills are binary, taking value one if the PA reads/writes English “Well” or “Very well”. ***, **, * indicate significance at the 1%, 5%, and 10% levels.


Table C.7: Decomposition of SSD-level estimates

Measure of language skills: Spoken English in columns (1)-(3); Interview in English in columns (4)-(6).

| | (1) | (2) | (3) | (4) | (5) | (6) |
|---|---|---|---|---|---|---|
| β1 | 0.0037 | -0.0160 | -0.0075 | 0.0059 | -0.0106 | -0.0073 |
| | (0.0209) | (0.0189) | (0.0171) | (0.0185) | (0.0179) | (0.0164) |
| β2 | -0.0959*** | -0.0838*** | -0.0581*** | -0.103*** | -0.0913*** | -0.0610*** |
| | (0.0244) | (0.0226) | (0.0204) | (0.0217) | (0.0225) | (0.0211) |
| ρSD,SSD | 0.272*** | 0.273*** | 0.272*** | 0.272*** | 0.273*** | 0.271*** |
| | (0.0292) | (0.0282) | (0.0280) | (0.0292) | (0.0282) | (0.0279) |
| Language group fixed effects | X | X | X | X | X | X |
| SSD fixed effects | X | X | X | X | X | X |
| Demographic characteristics | X | X | X | X | X | X |
| Motivation/Ability proxies | | X | X | | X | X |
| Wave 1 dependent variable | | | X | | | X |

Notes: Heteroskedasticity-consistent standard errors, clustered at the SD by language-group level, are reported in parentheses. Data are weighted using the provided sample weights. The coefficients β1 and β2 are obtained by estimating equation (3.3). The coefficient ρSD,SSD is from a regression of $\ln CA^{SD}_{jk}$ on $\ln CA^{SSD}_{jk}$ (and all other conditioning variables). ***, **, * indicate significance at the 1%, 5%, and 10% levels.


3.9 Appendix: Level of aggregation

To reconcile the relatively small effect of SSD-level linguistic concentration with the large effect of SD-level linguistic concentration, consider an econometric model in which contact availability has two separate effects on language acquisition: (a) an effect of SD-level contact availability that is common to all immigrants of a particular language group in a given city, irrespective of which neighborhood they live in, and (b) an additional effect of SSD-level contact availability given the SD-level concentration. In particular, let $CA^{within}_{jk} = CA^{SSD}_{jk} / CA^{SD}_{jk}$ denote SSD-level concentration relative to the SD-level contact availability and write

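$$
\begin{aligned}
L_{ijk} &= \beta_1 \ln CA^{within}_{jk} + \beta_2 \ln CA^{SD}_{jk} + \gamma' X_{ijk} + \delta_j + \delta_k + \varepsilon_{ijk} \\
&= \beta_1 \ln CA^{SSD}_{jk} + (\beta_2 - \beta_1) \ln CA^{SD}_{jk} + \gamma' X_{ijk} + \delta_j + \delta_k + \varepsilon_{ijk}.
\end{aligned}
\tag{3.3}
$$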

Intuitively, conditional on living in a city where same-language contacts are plentiful (i.e. SD-level concentration is high), there might be little additional effect of living in a neighborhood with relatively more same-language contacts than the city average, in which case β1 would be small. Similarly, there might be little impact of living in a high-concentration SSD (relative to the city average) if there are very few same-language contacts in the city to begin with (i.e. very low SD-level concentration).

Note that the baseline estimates of equation (3.2) when concentration is measured at the SSD-level ($\beta_{SSD}$) are a weighted average of $\beta_1$ and $\beta_2$:

$$
\beta_{SSD} = \beta_1 (1 - \rho_{SSD}) + \beta_2 \rho_{SSD}
$$

where $\rho_{SSD}$ is the coefficient from a regression of $\ln CA^{SD}_{jk}$ on $\ln CA^{SSD}_{jk}$ (and all other conditioning variables).37 Table C.7 reports estimates of $\beta_1$, $\beta_2$, and $\rho_{SSD}$ (labelled ρSD,SSD in the table).
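The weighted-average relationship is the usual omitted-variable algebra, and it can be checked numerically. The sketch below is only an illustration on simulated data: the data-generating values, the variable names, and the helper `ols` are assumptions made for this example, and the controls and fixed effects of the actual specification are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical log contact availability at the SD and SSD levels.
ln_ca_sd = rng.normal(size=n)
ln_ca_ssd = 0.3 * ln_ca_sd + rng.normal(size=n)
ln_ca_within = ln_ca_ssd - ln_ca_sd

# Illustrative coefficients, not estimates from the thesis.
beta1, beta2 = 0.0, -0.1
y = beta1 * ln_ca_within + beta2 * ln_ca_sd + rng.normal(scale=0.5, size=n)


def ols(outcome, *regressors):
    """Return OLS coefficients, intercept first."""
    design = np.column_stack([np.ones(len(outcome)), *regressors])
    return np.linalg.lstsq(design, outcome, rcond=None)[0]


# Long regression in the form of equation (3.3).
_, b1_hat, b2_hat = ols(y, ln_ca_within, ln_ca_sd)
# Auxiliary regression of ln CA^SD on ln CA^SSD gives rho_SSD.
_, rho_hat = ols(ln_ca_sd, ln_ca_ssd)
# Short regression on SSD-level concentration alone gives beta_SSD.
_, beta_ssd_hat = ols(y, ln_ca_ssd)

weighted_average = b1_hat * (1 - rho_hat) + b2_hat * rho_hat
print(beta_ssd_hat, weighted_average)
```

In sample, the two printed numbers coincide up to floating-point error, because the short-regression coefficient equals the long-regression coefficient plus the omitted coefficient times the auxiliary slope.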

An empirical strategy often employed in studies of contextual effects is to instrument the independent variable of interest with a corresponding variable measured at a higher level of aggregation (Evans, Oates and Schwab, 1992; Bertrand, Luttmer and Mullainathan, 2000; Cutler, Glaeser and Vigdor, 2008). For example, if we instrument $\ln CA^{SSD}_{jk}$ with $\ln CA^{SD}_{jk}$, the IV estimates are in fact composite coefficients that relate directly to $\beta_1$, $\beta_2$, and $\pi_{SD}$, the coefficient from a regression of $\ln CA^{SSD}_{jk}$ on $\ln CA^{SD}_{jk}$ (i.e. the first-stage):

$$
\beta^{IV}_{SSD} = \frac{\beta_{SD}}{\pi_{SD}} = \frac{\beta_2 - \beta_1 (1 - \pi_{SD})}{\pi_{SD}} = \beta_1 + \frac{\beta_2 - \beta_1}{\pi_{SD}}. \tag{3.4}
$$

The 2SLS estimates reported in section 3.4.3 are a special case of equation (3.4).
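As a numerical check of equation (3.4), the sketch below computes the just-identified IV estimate as the ratio of the reduced form to the first stage on simulated data. All numbers, variable names, and the helper `slope` are assumptions for illustration; the controls and fixed effects of the actual specification are omitted.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

ln_ca_sd = rng.normal(size=n)                    # instrument (SD level)
ln_ca_ssd = 0.3 * ln_ca_sd + rng.normal(size=n)  # instrumented regressor (SSD level)
ln_ca_within = ln_ca_ssd - ln_ca_sd

beta1, beta2 = -0.02, -0.1                       # illustrative values only
y = beta1 * ln_ca_within + beta2 * ln_ca_sd + rng.normal(scale=0.5, size=n)


def slope(outcome, regressor):
    """Slope from a bivariate OLS regression with an intercept."""
    return np.cov(regressor, outcome)[0, 1] / np.var(regressor, ddof=1)


# Long regression in the form of equation (3.3), used to recover b1 and b2.
design = np.column_stack([np.ones(n), ln_ca_within, ln_ca_sd])
_, b1_hat, b2_hat = np.linalg.lstsq(design, y, rcond=None)[0]

pi_hat = slope(ln_ca_ssd, ln_ca_sd)   # first stage
beta_sd_hat = slope(y, ln_ca_sd)      # reduced form
beta_iv_hat = beta_sd_hat / pi_hat    # just-identified IV estimate

composite = b1_hat + (b2_hat - b1_hat) / pi_hat
print(beta_iv_hat, composite)
```

The two printed values agree up to rounding, which is the sample analogue of the last equality in (3.4).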

3.10 Data appendix

The 1996 Community Profiles report population counts for each SSD/SD by language spoken at home for 22 languages: English, Arabic (including Lebanese), Australian Indigenous Languages, Chinese languages, Croatian, French, German, Greek, Hungarian, Indonesian, Italian, Macedonian, Malay, Maltese, Netherlandic, Polish, Portuguese, Russian, Serbian, Spanish, Tagalog (Filipino), Turkish, and Vietnamese. To maximize the number of language groups for which I can compute measures of concentration, I instead use the 2001 Time Series Community Profiles, in which 1996 population counts are reported for all languages listed above, as well as for Hindi, Japanese, Khmer, Korean, Persian, Samoan, Sinhalese, Tamil, and Ukrainian.38 Similarly, 1996 population counts for Thai and Assyrian speakers are collected from the 2006 Time Series Community Profiles. I drop Malay, Maltese and Samoan speakers because there are too few observations for these language groups in the LSIA sample. In the calculations, overseas visitors and people whose language spoken at home is not stated are excluded from the total population of each area. This may generate some measurement error, leading to an attenuation bias in the empirical analysis.

37 i.e. $\ln CA^{SD}_{jk} = \rho_{SSD} \ln CA^{SSD}_{jk} + \gamma' X_{ijk} + \delta_j + \delta_k + \varepsilon_{ijk}$.

Note that between each Census, the boundaries of a few SSDs were modified. Because the location reported in the LSIA corresponds to the 1996 definitions, I must adjust the population counts from the 2001 and 2006 Time Series Community Profiles so that they match the 1996 boundaries. For example, the SSD of Blacktown-Baulkham Hills was renamed Blacktown in 2001, and its size was reduced by the transfer of the Statistical Local Area of Baulkham Hills (A) to the SSD of Central Northern Sydney. Hence, to account for these changes, the population counts of Baulkham Hills (A) for the year 1996 are added to those of Blacktown (and subtracted from those of Central Northern Sydney) in the 2001 Time Series Community Profiles to reflect the 1996 boundaries. In most cases, the mapping from the 2001 to the 1996 boundaries is very straightforward, and so the 1996 population counts are calculated without error. In a few cases, only part of a given geographic unit was transferred from one SSD to another, generating minor measurement error.
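The boundary adjustment amounts to moving one area's count from the SSD that gained it back to the SSD that lost it. A minimal sketch follows; the population counts and the function name `reassign_area` are hypothetical and serve only to illustrate the Blacktown example.

```python
def reassign_area(ssd_counts, area_count, from_ssd, to_ssd):
    """Return a copy of ssd_counts with one area's population moved between SSDs."""
    adjusted = dict(ssd_counts)
    adjusted[from_ssd] -= area_count
    adjusted[to_ssd] += area_count
    return adjusted


# Hypothetical 1996 counts for one language group, reported on 2001 boundaries.
counts_2001_boundaries = {"Blacktown": 1_200, "Central Northern Sydney": 3_500}
baulkham_hills_count = 400  # hypothetical 1996 count for Baulkham Hills (A)

# Restore the 1996 boundaries: Baulkham Hills (A) goes back to Blacktown.
counts_1996_boundaries = reassign_area(
    counts_2001_boundaries,
    baulkham_hills_count,
    from_ssd="Central Northern Sydney",
    to_ssd="Blacktown",
)
print(counts_1996_boundaries)
```

The total across the two SSDs is unchanged by construction, so the adjustment reallocates population without creating or dropping anyone.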

Finally, the Census Community Profiles also report population counts by country of birth for a subset of countries. If groups are assigned on that basis rather than by best language spoken, the sample size is substantially reduced (e.g. many Arabic and Spanish speakers in the LSIA1 data set were born in a country for which the population counts are not reported in the Community Profiles). The main results reported in this paper are qualitatively unchanged if country of birth is used to measure concentration or if population counts for year 2001 instead of 1996 are used (available upon request).

3.11 Appendix: Theoretical Model

To understand the mechanisms through which linguistic concentration may affect language skill acquisition, and how non-random location patterns may introduce bias in empirical estimates, it is convenient to consider the language acquisition process and the location decision in two steps. For simplicity, suppose there are only two languages in the host-country: a majority language (English), and a minority language. The proportion of minority speakers $N_j$ varies across J cities. Consider a cohort of minority language-speaking immigrants who arrive in Australia and must decide in which city to reside and whether to learn English or not.39 The entering cohort is sufficiently small so that its size has virtually no impact on the distribution of minority and majority language speakers in each location (i.e. values of $N_j$ are exogenous to the model).

38 The category “South slavic, not further defined” can’t be matched to one specific language in the LSIA data, hence this group is dropped. Australian Indigenous Languages are also ignored, as they are not spoken by immigrant populations.

39 Immigrants already fluent in English only need to decide where to locate.

3.11.1 The learning process

This subsection focuses exclusively on immigrants who are not fluent upon arrival. Let $L_{ij} \in \{0, 1\}$ indicate the English proficiency status of individual i in area j. Immigrants who learn English incur a cost $c_{ij} = c(X_i, v_i, N_j, G_j)$ where $X_i$ is a vector of observable individual characteristics (e.g. education, age) that affect the ease with which the individual is able to learn new languages, $v_i$ is an unobserved idiosyncratic component (e.g. ability), and $G_j$ is a vector of city characteristics such as availability of English courses. Note that the cost is allowed to depend directly on the concentration of minority speakers. The literature suggests that $c_{ij}$ is an increasing function of $N_j$.40

Because individuals tend to interact disproportionately with people who speak their mother tongue, residential concentration may also reflect an underlying social network in which valuable information is disseminated. These networks can serve the function of a system of job referrals, potentially improving the labor market outcomes of its members.41 For instance, it has been shown that once selection is accounted for, the generally uncovered negative relationship between linguistic concentration and earnings disappears, or even turns positive.42 For the most part, studies tackling issues of selection into enclaves do not explore the different mechanisms through which concentration affects employment and earnings. In this paper, I shed light on the strength of the language acquisition channel, treating language skills as the main outcome of interest.

Individuals do not value fluency directly, but their utility is a function of earnings, which in turn depend on language skills. There are two types of jobs: those that require English, paying a wage $w^H_j$, and those that require limited language abilities and pay a wage $w^L_j < w^H_j$.43 A worker i receives a type $q \in \{H, L\}$ job offer with probability $P^q_{ij} = P^q(L_{ij}, X_i, v_i, N_j, G_j)$. The job finding rates are functions of $X_i$ and $v_i$, implicitly allowing for ability in language acquisition to be correlated with labour market abilities. The probability of finding type-q work is allowed to vary with linguistic concentration to reflect substantial evidence that informal job search and job referrals from family and friends are widespread (Ioannides and Loury, 2004), notably in immigrant social networks (Beaman, 2012). Under basic assumptions, some network models predict that the individual probability of finding work is increasing in the number of contacts (ties) (Patacchini and Zenou, 2012).44

40 In Lazear’s (1999) random-encounter model, as $N_j$ grows, the opportunity cost of learning English (i.e. the probability of “trade” occurring for a non-fluent immigrant) increases. Alternatively, a larger $N_j$ implies that minority language-speakers rarely meet native speakers, and so opportunities to practice English are less common (Stevens, 1992). For instance, Vervoort, Dagevos and Flap (2012) find that ethnic concentration is negatively (positively) associated with the number of contacts with natives (co-ethnics) in the Netherlands, which in turn is related to greater (lower) proficiency in Dutch. Danzer and Yaman (2013) document a similar pattern in Germany.

41 Goel and Lang (2009) and Patacchini and Zenou (2012) validate the use of ethnic concentration as a measure of network size by showing that it is related to the probability of finding a job through social networks.

42 Munshi (2003) studies the employment probability of Mexican migrants in the United States and finds that Mexican workers are more likely to be employed, and to have a nonagricultural job (conditional on employment) if their social network is exogenously large. These results are in line with Cutler, Glaeser and Vigdor (2008), who instrument for segregation using variation in the distributions of occupations across areas and groups of immigrants. Exploiting the institutional features of refugee placement policies, Edin, Fredriksson and Åslund (2003) and Damm (2009) find strong evidence of low-ability refugees sorting into enclaves in Sweden and Denmark, respectively. Once the selection bias is removed, they find that the impact of living in an enclave on earnings is positive.

43 Note that because the entering cohort is small, the learning decisions made by this group of immigrants have no general equilibrium effect on the return to proficiency, hence on market wages.

44 This is not necessarily true in the short-run, because of within network competition (Calvó-Armengol and Jackson, 2004). Also, beyond a critical value of network size, the number of job matches can actually decrease because of congestion (Calvó-Armengol and Zenou, 2005; Wahba and Zenou, 2005). Also, if concentration affects labour demand (through job creation, for example), then the job finding rates capture both the dissemination of information channel and the effect on labour demand in the two types of jobs.

Markets are segmented in the sense that workers fluent in English only look for type-H jobs and non-fluent workers are only employed in type-L jobs.45 Immigrant i’s expected utility in area j is separable in income, the cost of learning English, and idiosyncratic preferences for location j ($\varepsilon_{ij}$):

$$
U_{ij}(L_{ij}) = (P^H_{ij} w^H_j) L_{ij} + (P^L_{ij} w^L_j)(1 - L_{ij}) - L_{ij} c_{ij} + \varepsilon_{ij}. \tag{3.5}
$$

The optimal choice of language skill acquisition in location j is given by:

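$$
L^*_{ij}(X_i, v_i, N_j, G_j, w^H_j, w^L_j) =
\begin{cases}
1 & \text{if } \underbrace{P^H(1, X_i, v_i, N_j, G_j)\, w^H_j - P^L(0, X_i, v_i, N_j, G_j)\, w^L_j}_{\text{Benefits } (B_{ij})} \geq \underbrace{c(X_i, v_i, N_j, G_j)}_{\text{Costs } (c_{ij})} \\
0 & \text{otherwise.}
\end{cases}
\tag{3.6}
$$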

Equation (3.6) can be seen as a formalization of Chiswick and Miller’s (1995; 1996) language model, which defines immigrants’ propensity to acquire host-country language skills as a function of three broad categories of factors: economic incentives, exposure, and efficiency. In my model, economic incentives are accounted for by the labour market benefits of English proficiency ($B_{ij}$). Individual characteristics $X_i$ and $v_i$ are efficiency variables, and $N_j$ is an exposure factor.

A relatively weak sufficient condition for concentration to have an unambiguously negative effect on language acquisition is $\partial B_{ij} / \partial N_j < 0$. For instance, if language-based social networks only provide job referrals for jobs with weak English requirements (i.e. $\partial P^L_{ij} / \partial N_j \geq 0$ and $\partial P^H_{ij} / \partial N_j = 0$), the result follows. The condition is also satisfied if there are fewer type-H job opportunities in enclaves ($P^H_{ij}$ decreases with $N_j$). If individuals were randomly assigned to different locations, regressing $L^*_{ij}$ on $N_j$ (as well as on all the other variables included in equation (3.6)) would provide an unbiased estimate of the average net effect of enclave size. Yet, location is not random, hence this uncomplicated empirical approach likely yields biased estimates.
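The decision rule in equation (3.6) can be made concrete with a toy parameterization. In the sketch below, the linear functional forms, the wage and probability values, and the function name `learns_english` are all assumptions chosen for illustration; individual and city characteristics are held fixed, the type-L job finding rate rises with the minority share, the type-H rate is constant, and the learning cost rises with the minority share.

```python
def learns_english(n_j, w_high=2.0, w_low=1.0, p_high=0.6):
    """Toy version of the rule in (3.6): learn iff benefits >= costs.

    n_j is the minority-language share in city j (between 0 and 1).
    """
    p_low = 0.3 + 0.5 * n_j   # assumed: dP^L/dN >= 0
    cost = 0.2 + 0.4 * n_j    # assumed: learning cost increasing in N
    benefit = p_high * w_high - p_low * w_low
    return int(benefit >= cost)


shares = [k / 10 for k in range(11)]
decisions = [learns_english(n_j) for n_j in shares]
print(list(zip(shares, decisions)))
```

Under these assumed forms the benefit falls and the cost rises with the minority share, so the learning indicator switches from one to zero once the share is high enough and never switches back.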

3.11.2 The location decision and selection bias

The indirect utility of living in location j for individual i is given by $U^*_{ij} = U_{ij}(L^*_{ij})$. Individual i will therefore choose to reside in location s if and only if

$$
U^*_{is} > U^*_{ij} \quad \forall j \neq s. \tag{3.7}
$$

In this model, there is no uncertainty; immigrants know whether they will learn English or not when deciding where to live.46 Bias in OLS specifications may arise for at least three reasons.

45 That is, $P^H(0, X_i, v_i, N_j, G_j) = 0$ and $P^L(1, X_i, v_i, N_j, G_j) = 0$. Suppose workers are endowed with one indivisible unit of time which can either be allocated to searching for English or non-English jobs. Then, as long as expected earnings of fluent workers are higher in type-H jobs ($P^H_j(1, X_i, v_i, N_j) w^H_j \geq P^L_j(1, X_i, v_i, N_j) w^L_j$), it is always optimal for them to search for English rather than non-English jobs.

46 I assume that individuals have perfect information to characterize optimal sorting patterns. The selection bias would likely be less severe if people did not locate optimally. For example, one may expect most sponsored immigrants to take their initial location as given (i.e. they move wherever their sponsor lives). Then, if moving costs are substantial, an immigrant may stay in the city where his sponsor is located even if he could achieve a higher utility living somewhere else. Similarly, uncertainty or imperfect information with respect to city characteristics may lead to sub-optimal location decisions, hence to a smaller selection bias.

Bias 1: The classic ability bias. To illustrate this case, consider immigrants whose learning decision is perfectly inelastic with respect to $N_j$, that is, high-ability individuals who find it optimal to learn English in all J locations (always-learners, with $v_i$ exceeding some arbitrary $v$), and low-ability individuals who never “pick up” English in any location (never-learners). The former’s indirect utility is given by $U^*_{ij} = P^H_{ij} w^H_j - c_{ij} + \varepsilon_{ij}$, and the latter’s by $U^*_{ij} = P^L_{ij} w^L_j + \varepsilon_{ij}$. Always-learners will tend to locate out of enclaves disproportionately if (a) choosing a location with a relatively high $N_j$ increases the cost of language acquisition more than it increases the probability of finding a type-H job, and (b) they dislike living in enclaves (e.g. $Cov(\varepsilon_{ij}, N_j \mid v_i \geq v) < 0$). Similarly, never-learners may select into high-concentration areas if the probability of finding type-L work is greater in enclaves, or if they have a preference for enclaves. In the extreme case of a sample made up exclusively of these two types of migrants, a negative correlation between concentration and proficiency might be found even in the absence of any effect of $N_j$ on learning.
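The extreme case described under Bias 1 can be reproduced in a small simulation. The sketch below is illustrative only: the two concentration levels, the sorting probabilities, and all variable names are assumptions. Proficiency is fixed by type and does not respond to concentration at all, yet sorting alone produces a negative slope.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Half the sample are always-learners, half never-learners (assumed split).
always_learner = rng.random(n) < 0.5

# Assumed sorting: always-learners choose the low-concentration city with
# probability 0.7, never-learners with probability 0.3.
prob_low = np.where(always_learner, 0.7, 0.3)
in_low_city = rng.random(n) < prob_low
concentration = np.where(in_low_city, 0.1, 0.5)  # hypothetical N_j values

# Proficiency depends on type only: no causal effect of concentration.
proficiency = always_learner.astype(float)

ols_slope = np.cov(concentration, proficiency)[0, 1] / np.var(concentration, ddof=1)
print(ols_slope)
```

With these assumed probabilities the population slope is (0.3 - 0.7) / (0.5 - 0.1) = -1, so the regression would suggest a large negative effect of concentration that is entirely due to selection.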

Bias 2: The comparative advantage bias. The effect of living in low-concentration areas on language skill acquisition may vary across individuals, and immigrants may sort on that basis. Because migrants do not value proficiency in itself, we cannot unambiguously predict whether immigrants whose learning decision is the most sensitive to linguistic concentration are systematically more or less likely to reside out of enclaves than individuals who are unaffected by concentration.

Bias 3: Selection on pre-immigration proficiency. Immigrants already fluent in English upon arrival do not have to make a learning decision. In every location, their indirect utility is given by $U^*_{ij} = P^H_{ij} w^H_j + \varepsilon_{ij}$ and any sorting of these individuals is independent of learning costs. If English-job opportunities are scarcer in enclaves, or if this group of migrants simply prefers to live in low-concentration areas, then any regression of the “stock” of language skills at one point in time on linguistic concentration might be plagued by a problem of reverse causality: we may conclude that some immigrants became proficient because they were living in low-concentration areas, whereas in reality they chose to live in these locations because they were already fluent upon arrival.47

47 There is considerable empirical evidence of this type of behavior (Bauer, Epstein and Gang, 2005; Bayer, McMillan and Rueben, 2004).

Bibliography

Aaronson, Daniel. 1998. “Using sibling data to estimate the impact of neighborhoods on children’s educational outcomes.” Journal of Human Resources, 915–946.

Abdulkadiroğlu, Atila, Joshua Angrist, and Parag Pathak. 2014. “The elite illusion: Achievement effects at Boston and New York exam schools.” Econometrica, 82(1): 137–196.

Abdulkadiroğlu, Atila, Joshua D. Angrist, Susan M. Dynarski, Thomas J. Kane, and Parag A. Pathak. 2011. “Accountability and flexibility in public schools: Evidence from Boston’s charters and pilots.” The Quarterly Journal of Economics, 126(2): 699–748.

Abdulkadiroğlu, Atila, Parag A. Pathak, Jonathan Schellenberg, and Christopher R. Walters. 2017. “Do Parents Value School Effectiveness?” National Bureau of Economic Research, Inc NBER Working Papers 23912.

Abowd, John M, Francis Kramarz, and David N Margolis. 1999. “High wage workers and high wage firms.” Econometrica, 67(2): 251–333.

Almlund, Mathilde, Angela Lee Duckworth, James Heckman, and Tim Kautz. 2011. “Chapter 1 - Personality Psychology and Economics.” In Vol. 4 of Handbook of the Economics of Education, ed. Eric A. Hanushek, Stephen Machin and Ludger Woessmann, 1–181. Elsevier.

Altonji, Joseph G., and Richard K. Mansfield. 2014. “Group-Average Observables as Controls for Sorting on Unobservables When Estimating Group Treatment Effects: the Case of School and Neighborhood Effects.” National Bureau of Economic Research, Inc NBER Working Papers 20781.

Altonji, Joseph G., Todd E. Elder, and Christopher R. Taber. 2005. “Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools.” Journal of Political Economy, 113(1): 151–184.

Andersen, Steffen, Glenn W Harrison, Morten I Lau, and E Elisabet Rutström. 2008. “Eliciting risk and time preferences.” Econometrica, 76(3): 583–618.

Angrist, Joshua D. 2014. “The perils of peer effects.” Labour Economics.

Angrist, Joshua D., and Alan B. Krueger. 1991. “Does compulsory school attendance affect schooling and earnings?” The Quarterly Journal of Economics, 106(4): 979–1014.

Angrist, Joshua D., Peter D. Hull, Parag A. Pathak, and Christopher R. Walters. 2017. “Leveraging lotteries for school value-added: Testing and estimation.” The Quarterly Journal of Economics, 132(2): 871–919.


Arum, Richard, and Josipa Roksa. 2011. Academically adrift: Limited learning on college campuses. University of Chicago Press.

Australian Bureau of Statistics. 1996. Statistical Geography: Volume 1. Australian Standard Geographical Classification (ASGC), 1996 Edition. Catalogue no. 1216.0.

Autor, David, David Figlio, Krzysztof Karbownik, Jeffrey Roth, and Melanie Wasserman. 2016. “School Quality and the Gender Gap in Educational Achievement.” American Economic Review, 106(5): 289–295.

Bauer, Thomas, Gil Epstein, and Ira Gang. 2005. “Enclaves, language, and the location choice of migrants.” Journal of Population Economics, 18(4): 649–662.

Bayer, Patrick, Fernando Ferreira, and Robert McMillan. 2007. “A unified framework for measuring preferences for schools and neighborhoods.” Journal of Political Economy, 115(4): 588–638.

Bayer, Patrick, Robert McMillan, and Kim S. Rueben. 2004. “What drives racial segregation? New evidence using Census microdata.” Journal of Urban Economics, 56(3): 514–535.

Beaman, Lori A. 2012. “Social Networks and the Dynamics of Labour Market Outcomes: Evidence from Refugees Resettled in the U.S.” Review of Economic Studies, 79(1): 128–161.

Beattie, Graham, Jean-William P Laliberté, Catherine Michaud-Leclerc, and Philip Oreopoulos. 2017. “What Sets College Thrivers and Divers Apart? A Contrast in Study Habits, Attitudes, and Mental Health.” National Bureau of Economic Research Working Paper 23588.

Berman, Eli, Kevin Lang, and Erez Siniver. 2003. “Language-skill complementarity: returns to immigrant language acquisition.” Labour Economics, 10(3): 265–290.

Bertrand, Marianne, Erzo F. P. Luttmer, and Sendhil Mullainathan. 2000. “Network Effects And Welfare Cultures.” The Quarterly Journal of Economics, 115(3): 1019–1055.

Best, Michael Carlos, Jonas Hjort, and David Szakonyi. 2017. “Individuals and Organizations as Sources of State Effectiveness, and Consequences for Policy.” National Bureau of Economic Research 23350.

Bettinger, Eric P, Brent J Evans, and Devin G Pope. 2013. “Improving college performance and retention the easy way: Unpacking the ACT exam.” American Economic Journal: Economic Policy, 5(2): 26–52.

Billings, Stephen B., David J. Deming, and Stephen L. Ross. 2016. “Partners in Crime: Schools, Neighborhoods and the Formation of Criminal Networks.” National Bureau of Economic Research, Inc NBER Working Papers 21962.

Black, Sandra E. 1999. “Do better schools matter? Parental valuation of elementary education.” Quarterly Journal of Economics, 577–599.

Bleakley, Hoyt, and Aimee Chin. 2004. “Language Skills and Earnings: Evidence from Childhood Immigrants.” The Review of Economics and Statistics, 86(2): 481–496.


Blei, David M, Andrew Y Ng, and Michael I Jordan. 2003. “Latent dirichlet allocation.” Journalof Machine Learning Research, 3(Jan): 993–1022.

Borghans, Lex, Angela Lee Duckworth, James J Heckman, and Bas Ter Weel. 2008. “Theeconomics and psychology of personality traits.” Journal of human Resources, 43(4): 972–1059.

Borjas, George J. 1995. “Ethnicity, Neighborhoods, and Human-Capital Externalities.” AmericanEconomic Review, 85(3): 365–90.

Borjas, George J. 2015. “The slowdown in the economic assimilation of immigrants: Aging and cohorteffects revisited again.” Journal of Human Capital, 9(4): 483–517.

Boudarbat, Brahim, Thomas Lemieux, and W. Craig Riddell. 2010. “The Evolution of theReturns to Human Capital in Canada, 1980-2005.” Canadian Public Policy, 36(1): 63–89.

Bound, John, and Sarah Turner. 2011. “Chapter 8 - Dropouts and Diplomas: The Divergencein Collegiate Outcomes.” In Handbook of The Economics of Education. Vol. 4 of Handbook of theEconomics of Education, , ed. Stephen Machin Eric A. Hanushek and Ludger Woessmann, 573 – 613.Elsevier.

Bound, John, Michael F Lovenheim, and Sarah Turner. 2010. “Why have college completion ratesdeclined? An analysis of changing student preparation and collegiate resources.” American EconomicJournal: Applied Economics, 2(3): 129–157.

Bronnenberg, Bart J., Jean-Pierre H. Dubé, and Matthew Gentzkow. 2012. “The evolu-tion of brand preferences: Evidence from consumer migration.” The American Economic Review,102(6): 2472–2508.

Burdick-Will, Julia, Jens Ludwig, Stephen W. Raudenbush, Robert J. Sampson, Lisa San-bonmatsu, and Patrick Sharkey. 2011. “Converging evidence for neighborhood effects on children’stest scores: An experimental, quasi-experimental, and observational comparison.” In Whither oppor-tunity?: Rising inequality, schools, and children’s life chances. , ed. Greg J. Duncan and Richard J.Murnane, Chapter 12. Russell Sage Foundation.

Burks, Stephen V, Connor Lewis, Paul A Kivi, Amanda Wiener, Jon E Anderson, LorenzGötte, Colin G DeYoung, and Aldo Rustichini. 2015. “Cognitive skills, personality, and eco-nomic preferences in collegiate success.” Journal of Economic Behavior & Organization, 115: 30–44.

Cadena, Brian C, and Benjamin J Keys. 2015. “Human capital and the lifetime costs of impatience.” American Economic Journal: Economic Policy, 7(3): 126–153.

Calvó-Armengol, Antoni, and Matthew O. Jackson. 2004. “The Effects of Social Networks on Employment and Inequality.” American Economic Review, 94(3): 426–454.

Calvó-Armengol, Antoni, and Yves Zenou. 2005. “Job matching, social network and word-of-mouth communication.” Journal of Urban Economics, 57(3): 500–522.

Card, David. 2001. “Estimating the return to schooling: Progress on some persistent econometric problems.” Econometrica, 69(5): 1127–1160.

Card, David, and Jesse Rothstein. 2007. “Racial segregation and the black-white test score gap.” Journal of Public Economics, 91(11-12): 2158–2184.

Card, David, Ciprian Domnisoru, and Lowell Taylor. 2018. “The Intergenerational Transmission of Human Capital: Evidence from the Golden Era of Upward Mobility.”

Card, David, Jorg Heining, and Patrick Kline. 2013. “Workplace Heterogeneity and the Rise of West German Wage Inequality.” The Quarterly Journal of Economics, 128(3): 967–1015.

Card, David, Martin D. Dooley, and A. Abigail Payne. 2010. “School competition and efficiency with publicly funded Catholic schools.” American Economic Journal: Applied Economics, 2(4): 150–176.

Carlson, Deven, and Joshua M. Cowen. 2015. “Student neighborhoods, schools, and test score growth: Evidence from Milwaukee, Wisconsin.” Sociology of Education, 88(1): 38–55.

Chamorro-Premuzic, Tomas, and Adrian Furnham. 2005. “Personality and intellectual competence.”

Chandra, Amitabh, Amy Finkelstein, Adam Sacarny, and Chad Syverson. 2016. “Health Care Exceptionalism? Performance and Allocation in the US Health Care Sector.” The American Economic Review, 106(8): 2110–2144.

Chetty, Raj. 2015. “Behavioral economics and public policy: A pragmatic perspective.” The American Economic Review, 105(5): 1–33.

Chetty, Raj, and Nathaniel Hendren. 2018a. “The Impacts of Neighborhoods on Intergenerational Mobility I: Childhood Exposure Effects.” The Quarterly Journal of Economics, forthcoming.

Chetty, Raj, and Nathaniel Hendren. 2018b. “The Impacts of Neighborhoods on Intergenerational Mobility II: County-Level Estimates.” The Quarterly Journal of Economics, forthcoming.

Chetty, Raj, John N. Friedman, and Emmanuel Saez. 2013. “Using Differences in Knowledge across Neighborhoods to Uncover the Impacts of the EITC on Earnings.” The American Economic Review, 103(7): 2683–2721.

Chetty, Raj, John N. Friedman, and Jonah E. Rockoff. 2014a. “Measuring the impacts of teachers I: Evaluating bias in teacher value-added estimates.” American Economic Review, 104(9): 2593–2632.

Chetty, Raj, John N. Friedman, and Jonah E. Rockoff. 2014b. “Measuring the impacts of teachers II: Teacher value-added and student outcomes in adulthood.” The American Economic Review, 104(9): 2633–2679.

Chetty, Raj, John N Friedman, Nathaniel Hilger, Emmanuel Saez, Diane Whitmore Schanzenbach, and Danny Yagan. 2011. “How Does Your Kindergarten Classroom Affect Your Earnings? Evidence from Project Star.” The Quarterly Journal of Economics, 126(4): 1593–1660.

Chetty, Raj, Nathaniel Hendren, and Lawrence F. Katz. 2016. “The Effects of Exposure to Better Neighborhoods on Children: New Evidence from the Moving to Opportunity Experiment.” American Economic Review, 106(4): 855–902.

Chetty, Raj, Nathaniel Hendren, Frina Lin, Jeremy Majerovitz, and Benjamin Scuderi. 2016. “Gender gaps in childhood: Skills, behavior, and labor market preparedness childhood environment and gender gaps in adulthood.” The American Economic Review, 106(5): 282–288.

Childs, Stephen E., Ross Finnie, and Felice Martinello. 2016. “Postsecondary Student Persistence and Pathways: Evidence From the YITS-A in Canada.” Research in Higher Education, 1–25.

Chiswick, Barry R., and Paul W. Miller. 1995. “The Endogeneity between Language and Earnings: International Analyses.” Journal of Labor Economics, 13(2): 246–88.

Chiswick, Barry R., and Paul W. Miller. 1996. “Ethnic Networks and Language Proficiency among Immigrants.” Journal of Population Economics, 9(1): 19–35.

Chiswick, Barry R., and Paul W. Miller. 2004. “Language Skills and Immigrant Adjustment: What Immigration Policy Can Do!” Institute for the Study of Labor (IZA) IZA Discussion Papers 1419.

Chiswick, Barry R., and Paul W. Miller. 2005. “Do enclaves matter in immigrant adjustment?” City & Community, 4(1): 5–35.

Chiswick, Barry R., and Paul W. Miller. 2014. “International Migration and the Economics of Language.” Institute for the Study of Labor (IZA) IZA Discussion Papers 7880.

Chiswick, Barry R., Yew Liang Lee, and Paul W. Miller. 2004. “Immigrants’ language skills: The Australian experience in a longitudinal survey.” International Migration Review, 38(2): 611–654.

Chiswick, Barry R., Yew Liang Lee, and Paul W. Miller. 2006. “Immigrants’ Language Skills and Visa Category.” International Migration Review, 419–450.

Church, Jeffrey, and Ian King. 1993. “Bilingualism and Network Externalities.” Canadian Journal of Economics, 26(2): 337–45.

Chyn, Eric. 2016. “Moved to Opportunity: The Long-Run Effect of Public Housing Demolition on Labor Market Outcomes of Children.”

Cobb-Clark, Deborah A. 2004. “Selection Policy and the Labour Market Outcomes of New Immigrants.” Institute for the Study of Labor (IZA) IZA Discussion Papers 1380.

Cobb-Clark, Deborah A, and Stefanie Schurer. 2012. “The stability of big-five personality traits.” Economics Letters, 115(1): 11–15.

Conard, Maureen A. 2006. “Aptitude is not enough: How personality and behavior predict academic performance.” Journal of Research in Personality, 40(3): 339–346.

Credé, Marcus, Michael C Tynan, and Peter D Harms. 2016. “Much Ado About Grit: A Meta-Analytic Synthesis of the Grit Literature.” Journal of Personality and Social Psychology.

Cullen, Julie Berry, Brian A. Jacob, and Steven Levitt. 2006. “The effect of school choice on participants: Evidence from randomized lotteries.” Econometrica, 74(5): 1191–1230.

Cutler, David M., Edward L. Glaeser, and Jacob L. Vigdor. 2008. “When are ghettos bad? Lessons from immigrant segregation in the United States.” Journal of Urban Economics, 63(3): 759–774.

Cyrenne, Philippe, and Alan Chan. 2012. “High school grades and university performance: A case study.” Economics of Education Review, 31(5): 524–542.

Damm, Anna Piil. 2009. “Ethnic Enclaves and Immigrant Labor Market Outcomes: Quasi-Experimental Evidence.” Journal of Labor Economics, 27(2): 281–314.

Danzer, Alexander M., and Firat Yaman. 2013. “Do Ethnic Enclaves Impede Immigrants’ Integration? Evidence from a Quasi-experimental Social-interaction Approach.” Review of International Economics, 21(2): 311–325.

Danzer, Alexander M, and Firat Yaman. 2016. “Ethnic concentration and language fluency of immigrants: Evidence from the guest-worker placement in Germany.” Journal of Economic Behavior & Organization, 131: 151–165.

De Fruyt, Filip, and Ivan Mervielde. 1996. “Personality and interests as predictors of educational streaming and achievement.” European Journal of Personality, 10(5): 405–425.

Deming, David J. 2014. “Using school choice lotteries to test measures of school effectiveness.” The American Economic Review, 104(5): 406–411.

Deming, David J., Justine S. Hastings, Thomas J. Kane, and Douglas O. Staiger. 2014. “School choice, school quality, and postsecondary attainment.” The American Economic Review, 104(3): 991–1013.

Dobbie, Will, and Roland G. Fryer. 2011. “Are high-quality schools enough to increase achievement among the poor? Evidence from the Harlem Children’s Zone.” American Economic Journal: Applied Economics, 3(3): 158–187.

Dobbie, Will, and Roland G. Fryer. 2013. “Getting beneath the veil of effective schools: Evidence from New York City.” American Economic Journal: Applied Economics, 5(4): 28–60.

Dobbie, Will, and Roland G. Fryer. 2015. “The medium-term impacts of high-achieving charter schools.” Journal of Political Economy, 123(5): 985–1037.

Dohmen, Thomas, Armin Falk, David Huffman, and Uwe Sunde. 2010. “Are risk aversion and impatience related to cognitive ability?” The American Economic Review, 100(3): 1238–1260.

Dohmen, Thomas, Armin Falk, David Huffman, Uwe Sunde, Jürgen Schupp, and Gert G Wagner. 2011. “Individual risk attitudes: Measurement, determinants, and behavioral consequences.” Journal of the European Economic Association, 9(3): 522–550.

Donnellan, M Brent, Frederick L Oswald, Brendan M Baird, and Richard E Lucas. 2006. “The mini-IPIP scales: tiny-yet-effective measures of the Big Five factors of personality.” Psychological Assessment, 18(2): 192.

Dooley, Martin D, A Abigail Payne, and A Leslie Robb. 2012. “Persistence and academic success in university.” Canadian Public Policy, 38(3): 315–339.

Duckworth, Angela L, Christopher Peterson, Michael D Matthews, and Dennis R Kelly. 2007. “Grit: perseverance and passion for long-term goals.” Journal of Personality and Social Psychology, 92(6): 1087.

Duckworth, Angela Lee, and Patrick D Quinn. 2009. “Development and validation of the Short Grit Scale (GRIT–S).” Journal of Personality Assessment, 91(2): 166–174.

Duhaime-Ross, Alix. 2015. “Three essays in the economics of education: evidence from Canadian policies.” PhD diss. University of British Columbia.

Dumfart, Barbara, and Aljoscha C Neubauer. 2016. “Conscientiousness Is the Most Powerful Noncognitive Predictor of School Achievement in Adolescents.” Journal of Individual Differences.

Dustmann, Christian, and Arthur van Soest. 2001. “Language Fluency And Earnings: Estimation With Misclassified Language Indicators.” The Review of Economics and Statistics, 83(4): 663–674.

Dustmann, Christian, and Arthur van Soest. 2002. “Language and the earnings of immigrants.” Industrial and Labor Relations Review, 55(3): 473–492.

Dustmann, Christian, and Francesca Fabbri. 2003. “Language proficiency and labour market performance of immigrants in the UK.” Economic Journal, 113(489): 695–717.

Edin, Per-Anders, Peter Fredriksson, and Olof Åslund. 2003. “Ethnic Enclaves And The Economic Success Of Immigrants - Evidence From A Natural Experiment.” The Quarterly Journal of Economics, 118(1): 329–357.

Efron, Bradley, Trevor Hastie, Iain Johnstone, Robert Tibshirani, et al. 2004. “Least angle regression.” The Annals of Statistics, 32(2): 407–499.

Evans, William N, Wallace E Oates, and Robert M Schwab. 1992. “Measuring peer group effects: A study of teenage behavior.” Journal of Political Economy, 100(5): 966–991.

Fack, Gabrielle, and Julien Grenet. 2010. “When do better schools raise housing prices? Evidence from Paris public and private schools.” Journal of Public Economics, 94(1): 59–77.

Farrington, Camille A, Melissa Roderick, Elaine Allensworth, Jenny Nagaoka, Tasha Seneca Keyes, David W Johnson, and Nicole O Beechum. 2012. Teaching Adolescents to Become Learners: The Role of Noncognitive Factors in Shaping School Performance–A Critical Literature Review. University of Chicago Consortium on Chicago School Research.

Finkelstein, Amy, Matthew Gentzkow, and Heidi Williams. 2016. “Sources of Geographic Variation in Health Care: Evidence From Patient Migration.” The Quarterly Journal of Economics, 131(4): 1681–1726.

Flores, Glenn. 2006. “Language barriers to health care in the United States.” New England Journal of Medicine, 355(3): 229–231.

Fryer, Roland G., and Lawrence F. Katz. 2013. “Achieving Escape Velocity: Neighborhood and School Interventions to Reduce Persistent Inequality.” American Economic Review, 103(3): 232–37.

Gibbons, Stephen, Olmo Silva, and Felix Weinhardt. 2013. “Everybody Needs Good Neighbours? Evidence from Students’ Outcomes in England.” Economic Journal, 123: 831–874.

Goel, Deepti, and Kevin Lang. 2009. “Social Ties and the Job Search of Recent Immigrants.” National Bureau of Economic Research, Inc NBER Working Papers 15186.

Gould, Eric D., Victor Lavy, and M Daniele Paserman. 2004. “Immigrating to opportunity: Estimating the effect of school quality using a natural experiment on Ethiopians in Israel.” The Quarterly Journal of Economics, 119(2): 489–526.

Gould, Eric D., Victor Lavy, and M Daniele Paserman. 2011. “Sixty years after the magic carpet ride: The long-run effect of the early childhood environment on social and economic outcomes.” The Review of Economic Studies, 78(3): 938–973.

Goux, Dominique, and Eric Maurin. 2007. “Close neighbours matter: Neighbourhood effects on early performance at school.” The Economic Journal, 117(523): 1193–1215.

Hanushek, Eric A. 1986. “The economics of schooling: Production and efficiency in public schools.” Journal of Economic Literature, 24(3): 1141–1177.

Heckman, James J., John Eric Humphries, and Gregory Veramendi. 2017. “The Non-Market Benefits of Education and Ability.” National Bureau of Economic Research, Inc NBER Working Papers 23896.

Heckman, James J, Jora Stixrud, and Sergio Urzua. 2006. “The Effects of Cognitive and Noncognitive Abilities on Labor Market Outcomes and Social Behavior.” Journal of Labor Economics, 24(3): 411–482.

Hedges, Larry V, and Amy Nowell. 1995. “Sex differences in mental test scores, variability, and numbers of high-scoring individuals.” Science, 269(5220): 41.

Hirsh, Jacob B, and Jordan B Peterson. 2008. “Predicting creativity and academic success with a "fake-proof" measure of the Big Five.” Journal of Research in Personality, 42(5): 1323–1333.

Hofmann, Thomas. 2000. “Learning the Similarity of Documents: An Information-Geometric Approach to Document Retrieval and Categorization.” 914–920.

Houtenville, Andrew J., and Karen Smith Conway. 2008. “Parental Effort, School Resources, and Student Achievement.” Journal of Human Resources, 43(2): 437–453.

Hoxby, Caroline. 2015. “Computing the value-added of american postsecondary institutions.” Internal Revenue Service, US Department of the Treasury, Washington, DC.

Imberman, Scott A., and Michael F. Lovenheim. 2016. “Does the market value value-added? Evidence from housing prices after a public release of school and teacher value-added.” Journal of Urban Economics, 91: 104–121.

Ioannides, Yannis M., and Linda Datcher Loury. 2004. “Job Information Networks, Neighborhood Effects, and Inequality.” Journal of Economic Literature, 42(4): 1056–1093.

Isphording, Ingo E, and Sebastian Otten. 2014. “Linguistic barriers in the destination language acquisition of immigrants.” Journal of Economic Behavior & Organization, 105: 30–50.

Jackson, C. Kirabo. 2010. “Do Students Benefit From Attending Better Schools?: Evidence From Rule-based Student Assignments in Trinidad and Tobago.” The Economic Journal, 120(549).

Jackson, C. Kirabo. 2016. “What Do Test Scores Miss? The Importance Of Teacher Effects On Non-Test Score Outcomes.” National Bureau of Economic Research, Inc NBER Working Papers 22226.

Jacob, Brian A. 2004. “Public housing, housing vouchers, and student achievement: Evidence from public housing demolitions in Chicago.” The American Economic Review, 94(1): 233–258.

John, Oliver P, Laura P Naumann, and Christopher J Soto. 2008. “Paradigm shift to the integrative big five trait taxonomy.” Handbook of Personality: Theory and Research, 3: 114–158.

Kane, Thomas J., and Douglas O. Staiger. 2008. “Estimating Teacher Impacts on Student Achievement: An Experimental Evaluation.” National Bureau of Economic Research 14607.

Kane, Thomas J., Stephanie K. Riegg, and Douglas O. Staiger. 2006. “School quality, neighborhoods, and housing prices.” American Law and Economics Review, 8(2): 183–212.

Katz, Lawrence F. 2015. “Reducing Inequality: Neighborhood and School Interventions.” Focus, 31: 12–17.

Kautz, Tim, James J. Heckman, Ron Diris, Bas ter Weel, and Lex Borghans. 2014. “Fostering and Measuring Skills: Improving Cognitive and Non-cognitive Skills to Promote Lifetime Success.” OECD Publishing OECD Education Working Papers 110.

Kirby, Kris N, Gordon C Winston, and Mariana Santiesteban. 2005. “Impatience and grades: Delay-discount rates correlate negatively with college GPA.” Learning and Individual Differences, 15(3): 213–222.

Kling, Jeffrey R., Jeffrey B. Liebman, and Lawrence F. Katz. 2007. “Experimental analysis of neighborhood effects.” Econometrica, 75(1): 83–119.

Komarraju, Meera, Steven J Karau, and Ronald R Schmeck. 2009. “Role of the Big Five personality traits in predicting college students’ academic motivation and achievement.” Learning and Individual Differences, 19(1): 47–52.

Konya, Istvan. 2007. “Optimal Immigration and Cultural Assimilation.” Journal of Labor Economics, 25: 367–391.

Lapierre, David, Pierre Lefebvre, and Philip Merrigan. 2016. “Long term educational attainment of private high school students in Québec: Estimates of treatment effects from longitudinal data.” Research Group on Human Capital Working Papers Series 16-02.

Lavecchia, Adam M., Heidi Liu, and Philip Oreopoulos. 2014. “Behavioral economics of education: Progress and possibilities.” National Bureau of Economic Research.

Lavecchia, Adam M, Heidi Liu, and Philip Oreopoulos. 2016. “Chapter 1 - Behavioral Economics of Education: Progress and Possibilities.” In Handbook of the Economics of Education. Vol. 5, ed. Eric A. Hanushek, Stephen Machin, and Ludger Woessmann, 1–74. Elsevier.

Lazear, Edward P. 1999. “Culture and Language.” Journal of Political Economy, 107(S6): S95–S126.

Lee, David S., and Thomas Lemieux. 2010. “Regression discontinuity designs in economics.” Journal of Economic Literature, 48(2): 281–355.

Lefebvre, Pierre, Philip Merrigan, and Matthieu Verstraete. 2011. “Public subsidies to private schools do make a difference for achievement in mathematics: Longitudinal evidence from Canada.” Economics of Education Review, 30(1): 79–98.

Lewis, Ethan. 2013. “Immigrant-Native Substitutability and the Role of Language.” In Immigration, Poverty, and Socioeconomic Inequality, ed. David Card and Steven Raphael. Russell Sage Foundation.

Lindo, Jason M, Nicholas J Sanders, and Philip Oreopoulos. 2010. “Ability, gender, and performance standards: Evidence from academic probation.” American Economic Journal: Applied Economics, 2(2): 95–117.

Ludwig, Jens, Greg J. Duncan, Lisa A. Gennetian, Lawrence F. Katz, Ronald C. Kessler, Jeffrey R. Kling, and Lisa Sanbonmatsu. 2013. “Long-Term Neighborhood Effects on Low-Income Families: Evidence from Moving to Opportunity.” The American Economic Review, 103(3): 226–231.

Machin, Stephen, and Tuomas Pekkarinen. 2008. “Global sex differences in test score variability.” Science.

Martin, Shirley. 1998. “New life, new language: The history of the Adult Migrant English Program.”

McCrary, Justin. 2008. “Manipulation of the running variable in the regression discontinuity design: A density test.” Journal of Econometrics, 142(2): 698–714.

Mischel, Walter, Yuichi Shoda, and Monica L Rodriguez. 1989. “Delay of gratification in children.” Science, 244(4907): 933–938.

Molitor, David. 2018. “The Evolution of Physician Practice Styles: Evidence from Cardiologist Migration.” American Economic Journal: Economic Policy, 10(1): 326–356.

Moretti, Enrico. 2004. “Estimating the social return to higher education: evidence from longitudinal and repeated cross-sectional data.” Journal of Econometrics, 121(1): 175–212.

Munshi, Kaivan. 2003. “Networks In The Modern Economy: Mexican Migrants In The U.S. Labor Market.” The Quarterly Journal of Economics, 118(2): 549–599.

Noftle, Erik E, and Richard W Robins. 2007. “Personality predictors of academic outcomes: big five correlates of GPA and SAT scores.” Journal of Personality and Social Psychology, 93(1): 116.

O’Connor, Melissa C, and Sampo V Paunonen. 2007. “Big Five personality predictors of post-secondary academic performance.” Personality and Individual Differences, 43(5): 971–990.

OECD. 2013. International Migration Outlook 2013. OECD Publishing.

Oreopoulos, Philip. 2003. “The Long-Run Consequences Of Living In A Poor Neighborhood.” The Quarterly Journal of Economics, 118(4): 1533–1575.

Oreopoulos, Philip. 2008. “Neighborhood Effects in Canada: A Critique.” Canadian Public Policy-Analyse de politiques, 34(2).

Oreopoulos, Philip. 2011. “Why Do Skilled Immigrants Struggle in the Labor Market? A Field Experiment with Thirteen Thousand Resumes.” American Economic Journal: Economic Policy, 3(4): 148–71.

Oreopoulos, Philip. 2012. “Moving Neighborhoods Versus Reforming Schools: A Canadian’s Perspective.” Cityscape, 207–212.

Oreopoulos, Philip, and Kjell G. Salvanes. 2011. “Priceless: The Nonpecuniary Benefits of Schooling.” Journal of Economic Perspectives, 25(1): 159–184.

Oreopoulos, Philip, and Uros Petronijevic. 2013. “Making College Worth It: A Review of Research on the Returns to Higher Education.” National Bureau of Economic Research, Inc NBER Working Papers 19053.

Oreopoulos, Philip, and Uros Petronijevic. 2016. “Student Coaching: How Far Can Technology Go?” National Bureau of Economic Research, Inc NBER Working Papers 22630.

Oster, Emily. 2017. “Unobservable Selection and Coefficient Stability: Theory and Evidence.” Journal of Business & Economic Statistics, 0(0): 1–18.

Patacchini, Eleonora, and Yves Zenou. 2012. “Ethnic networks and employment outcomes.” Regional Science and Urban Economics, 42(6): 938–949.

Pei, Zhuan, Jorn-Steffen Pischke, and Hannes Schwandt. 2017. “Poorly Measured Confounders Are More Useful On The Left Than On The Right.” National Bureau of Economic Research, Inc NBER Working Papers 23232.

Pop-Eleches, Cristian, and Miguel Urquiola. 2013. “Going to a Better School: Effects and Behavioral Responses.” American Economic Review, 103(4): 1289–1324.

Poropat, Arthur E. 2009. “A meta-analysis of the five-factor model of personality and academic performance.” Psychological Bulletin, 135(2): 322.

Richardson, Michelle, Charles Abraham, and Rod Bond. 2012. “Psychological correlates of university students’ academic performance: a systematic review and meta-analysis.” Psychological Bulletin, 138(2): 353.

Rivkin, Steven G., Eric A. Hanushek, and John F. Kain. 2005. “Teachers, schools, and academic achievement.” Econometrica, 73(2): 417–458.

Robbins, Steven B, Kristy Lauver, Huy Le, Daniel Davis, Ronelle Langley, and Aaron Carlstrom. 2004. “Do psychosocial and study skill factors predict college outcomes? A meta-analysis.” Psychological Bulletin, 130(2): 261.

Roberts, Brent W, Nathan R Kuncel, Rebecca Shiner, Avshalom Caspi, and Lewis R Goldberg. 2007. “The power of personality: The comparative validity of personality traits, socioeconomic status, and cognitive ability for predicting important life outcomes.” Perspectives on Psychological Science, 2(4): 313–345.

Rothstein, Jesse. 2006. “Good principals or good peers? Parental valuation of school characteristics, Tiebout equilibrium, and the incentive effects of competition among jurisdictions.” The American Economic Review, 96(4): 1333–1350.

Rothstein, Jesse. 2017. “Inequality of Educational Opportunity? Schools as Mediators of the Intergenerational Transmission of Income.”

Rothstein, Jesse M. 2004. “College performance predictions and the SAT.” Journal of Econometrics, 121(1): 297–317.

Scott-Clayton, Judith, Peter M Crosta, and Clive R Belfield. 2014. “Improving the Targeting of Treatment Evidence From College Remediation.” Educational Evaluation and Policy Analysis, 36(3): 371–393.

Sharkey, Patrick, and Jacob W. Faber. 2014. “Where, when, why, and for whom do residential contexts matter? Moving away from the dichotomous understanding of neighborhood effects.” Annual Review of Sociology, 40: 559–579.

Stephan, Jennifer L, Elisabeth Davis, Jim Lindsay, and Shazia Miller. 2015. “Who will succeed and who will struggle? Predicting early college success with Indiana’s Student Information System.” U.S. Department of Education, Institute of Education Sciences, National Center for Education Evaluation and Regional Assistance, Regional Educational Laboratory Midwest REL2015-078.

Stevens, Gillian. 1992. “The social and demographic context of language use in the United States.” American Sociological Review, 171–185.

Sykes, Brooke, and Sako Musterd. 2011. “Examining neighbourhood and school effects simultaneously: what does the Dutch evidence show?” Urban Studies, 48(7): 1307–1331.

Symonds, William C, Robert Schwartz, and Ronald F Ferguson. 2011. “Pathways to prosperity: Meeting the challenge of preparing young Americans.” Cambridge, MA: Pathways to Prosperity Project at Harvard Graduate School of Education.

Todd, Petra E., and Kenneth I. Wolpin. 2003. “On The Specification and Estimation of The Production Function for Cognitive Achievement.” Economic Journal, 113(485): 3–33.

Van Tubergen, Frank, and Matthijs Kalmijn. 2009. “Language proficiency and usage among immigrants in the Netherlands: Incentives or opportunities?” European Sociological Review, 25(2): 169–182.

Vervoort, Miranda, Jaco Dagevos, and Henk Flap. 2012. “Ethnic concentration in the neighbourhood and majority and minority language: A study of first and second-generation immigrants.” Social Science Research, 41(3): 555–569.

Wahba, Jackline, and Yves Zenou. 2005. “Density, social networks and job search methods: Theory and application to Egypt.” Journal of Development Economics, 78(2): 443–473.

Warman, Casey. 2007. “Ethnic enclaves and immigrant earnings growth.” Canadian Journal of Economics, 40(2): 401–422.

Weinhardt, Felix. 2014. “Social housing, neighborhood quality and student performance.” Journal of Urban Economics, 82: 12–31.

Willingham, Warren W. 1985. “Success in college.” New York: College Entrance Examination Board.

Wodtke, Geoffrey T., and Matthew Parbst. 2017. “Neighborhoods, Schools, and Academic Achievement: A Formal Mediation Analysis of Contextual Effects on Reading and Mathematics Abilities.” Demography.