Douglas N. Harris, Tulane University
Matthew F. Larsen, Lafayette College
July 15, 2018
Education Research Alliance NOLA.org
Technical Report

The Effects of the New Orleans Post-Katrina Market-Based School Reforms on Student Achievement, High School Graduation, and College Outcomes
Abstract: The school reforms put in place in New Orleans after Hurricane Katrina represent the most intensive market-based school accountability system ever created in the United States. Almost all public schools were taken over by the state, which turned over management to autonomous non-profit charter management organizations working under performance contracts; collective bargaining and tenure were ended, yielding flexible human capital management; and traditional school attendance zones were eliminated, expanding choice for families. More than a decade later, this study provides the first examination of the effects of this package of reforms on students’ short-term and long-term outcomes, using matched difference-in-differences. We find that the package of reforms improved the quantity, quality, and equity of schooling in the city on almost every available measure, increasing average test scores by 0.28-0.40 standard deviations, high school graduation by 3-9 percentage points, college attendance by 8-15 percentage points, college persistence by 4-7 percentage points, and college graduation by 3-5 percentage points. These effects translate to 10-67 percent increases over baseline levels. The reforms also apparently reduced educational inequality by race and income on most measures. Our estimation procedures address potential threats to identification, including, for example, changes in the population. The reforms highlight the potential of market-based school reforms, though we also identify reasons why effects of this large size and range may not be expected in other locations and circumstances.

Acknowledgements: This study was conducted at the Education Research Alliance for New Orleans at Tulane University. The authors wish to thank the organization’s funders: the John and Laura Arnold Foundation, William T. Grant Foundation, the Spencer Foundation and, at Tulane, the Department of Economics, Murphy Institute and School of Liberal Arts.
We also thank Joshua Angrist, Robert Bifulco, Deven Carlson, Joshua Cowen, John Easton, David Figlio, Adam Gamoran, Dan Goldhaber, Helen Ladd, Jane Arnold Lincove, Susanna Loeb, Andrew McEachin, Parag Pathak, Ross Rubenstein, Robert Santillano, Amy Schwartz, John Yinger, Lindsay Bell Weixler, and seminar participants at the National Bureau of Economic Research, Syracuse University, Texas A&M University, Tulane University, the University of Arkansas, and the University of Virginia. For other important contributions, including data collection and analysis, we thank Alica Gerry and Nathan Barrett.

Author Information: Douglas N. Harris (corresponding author) is Professor of Economics, the Schleider Foundation Chair in Public Education, Director of the Education Research Alliance for New Orleans, and Director of the National Center for Research on Education Access and Choice (REACH) at Tulane University ([email protected]). Matthew F. Larsen is an Assistant Professor of Economics at Lafayette College and a Non-Resident Research Fellow at ERA-New Orleans ([email protected]).
Introduction

For the past century, America’s publicly funded schools have been almost
universally operated by local government agencies that assign students to schools based
on their neighborhoods. This could be efficient with school districts competing against
one another (Tiebout, 1956; Hoxby, 2000), though political forces (Kollman, Miller, &
Page, 1997), labor unions (Hoxby, 1996; Strunk & Grissom, 2010; Lovenheim & Willen,
2016), and other factors may make public sector education production inefficient (Chubb
& Moe, 1990; Hill, Pierce, & Guthrie, 1997; Hanushek, 2002). To address these
problems, Friedman (1962) argued for reform of the American schooling system so that
families are “free to choose” where their children attend school, government subsidies
follow the student to induce more direct competition among schools, and non-
governmental organizations supply schooling subject to minimal regulation.
The school reforms put in place in New Orleans after the tragedy of Hurricane
Katrina offer arguably the first direct test of the effects of these two alternative models.
Prior to Katrina, the New Orleans school system closely resembled that of almost every
other city in the United States. In addition to neighborhood-based assignment of students
to schools, the vast majority of schools were governed and operated by the Orleans Parish
School Board (OPSB). The city’s teachers worked under a union contract that established
a single salary schedule and extensive work rules. The union contract, along with state
tenure protections, also provided teachers with considerable job security.
After Hurricane Katrina struck New Orleans on August 29, 2005, all the
hallmarks of the traditional school district were eliminated (Harris, 2015).1 The state
government took over the school system, moving oversight of almost all the city’s public
schools from the local OPSB to the statewide Louisiana Recovery School District (RSD).
Many OPSB schools were quickly turned into charter schools and, over time, so too were
all RSD schools. Attendance zones were essentially eliminated, creating open school
choice for families. All educators were fired. The OPSB allowed its teacher union
contract to expire and never replaced it. By virtue of becoming charter schools, tenure
protections were also eliminated. Local and state agencies still had a role, especially in
funding schools, but control was mostly limited to passing funds on to schools (on a per-
pupil basis) and negotiating and enforcing contracts with their charter schools. They no
longer made decisions about school personnel, compensation, curriculum, instruction, or
most other aspects of school management. Over just a few years, the government
role was dramatically altered, from operating schools to overseeing them. Consistent with
Friedman’s (1962) call, the district-based “one best system” of U.S. public education
(Tyack, 1974) was eliminated for the first time since it took shape a century earlier.
As sudden as these changes were in New Orleans, the new policies themselves
reflected a two-decade shift toward test-based and market-based accountability
throughout the United States. Induced by evidence of possibly inefficient resource use
(Hanushek, 1996), poor showings on international assessments (National Commission on
Excellence in Education, 1983; Goldin & Katz, 2008), and flat test score trends
(Hanushek & Woessmann, 2010), the 1994 re-authorization of the federal Elementary and
1 Hurricane Rita struck just one month later on September 24, 2005. For simplicity, however, we simply refer to “the hurricane” going forward.
Secondary Education Act (ESEA) began requiring standardized testing and school report
cards. The frequency of testing and stakes attached to them increased under No Child Left
Behind (NCLB) in 2001 (Dee & Jacob, 2011). These new incentives shifted the
traditional public school system in the U.S., albeit only slightly, toward performance-
based contracting that is common in other parts of the public sector.
The New Orleans reforms also followed the longer national trend toward market
accountability through parental school choice and opening up the supply side through
non-governmental organizations, such as charter schools (e.g., Angrist et al., 2010;
Abdulkadiroğlu et al., 2011; CREDO, 2013a), private school vouchers (Rouse, 1998;
Krueger & Zhu, 2004; Abdulkadiroğlu et al., 2015; Mills & Wolf, 2017), and intra- and
inter-district choice among traditional public schools (Berry, Jacob, & Levitt, 2005;
Harris, Witte, & Valant, 2017). The theory is that, with both performance-based
contracts and market pressure from families, as well as autonomy from district and union
rules, school leaders would face stronger incentives for efficient operation and would
have the autonomy to respond to those incentives by creating better schools.
Though the word accountability has been commonly used in the national debate,
the actual incentives facing schools, and the degree of autonomy they have to respond to
them, have changed relatively little. Before Hurricane Katrina, some districts around the
country had experimented with school-level autonomy (Ravitch, 2000) and mayoral and
state takeovers of schools and districts (Wong & Shen, 2006; Gill et al., 2006); but most
of these efforts were short-lived, and local school board politics, unions, and school
attendance zones remained as core components of school operations (Ravitch, 2000).
NCLB increased the volume of testing, but only a small fraction of the schools slated for
corrective action under NCLB experienced significant intervention, such as closure or
takeover (GAO, 2007).2 This may be why researchers have found the NCLB effects to be
small at best (Dee & Jacob, 2011).
Market accountability, too, seems to have had a limited influence. At about the
time Katrina made landfall, only two percent of U.S. students attended charter schools,
and 13 percent of U.S. students attended a non-assigned publicly funded school (Harris,
Witte, & Valant, 2017).3 Only seven districts had more than 20 percent of their students
in charter schools, and none had more than 50 percent (National Alliance for Public
Charter Schools, 2013). New Orleans was typical in having only four charter schools (out
of 127 schools in total). While research increasingly shows positive effects of charter
schools (e.g., Angrist et al., 2010, 2016; Abdulkadiroğlu et al., 2011, Booker et al., 2011;
CREDO, 2013a) and they seem to create competition that improves outcomes in nearby
traditional public schools (Gill & Booker, 2008; Epple, Romano & Zimmer, 2015), their
market share has been too small to have much effect on outcomes across entire cities or
regions.4 For these reasons, advocates for accountability and school autonomy have
argued that policymakers have not gone far enough with these reforms (Hill & Lake,
2004, Evers, 2014; Peterson, 2014; Walberg, 2014). President Trump and his Secretary of
Education, Betsy DeVos, seem to agree as they have advocated for large state and
national school choice programs.
2 A synthesis of evidence from studies of pre-NCLB accountability found more positive cumulative effects on test scores averaging 0.08 standard deviations (Lee, 2008). Also, see Carnoy & Loeb (2003).
3 We include in “non-assigned publicly funded schools” students who attend schools labeled charter, magnet, and intra- and inter-district schools of choice (Harris, Witte, & Valant, 2017). The percentage of students in charter schools has since increased to almost five percent (NCES, 2015).
4 The majority of studies on competitive effects from choice suggest positive effects, and almost none show negative effects (many are null) (Gill & Booker, 2008). Other studies have examined the effects of competition within the traditional public schooling market (e.g., Hoxby, 2000; Belfield & Levin, 2003; Rothstein, 2007).
In New Orleans, policymakers went much further with school accountability and
autonomy than perhaps any district or state ever had. There have also been some signs of
success. New Orleans’ statewide ranking on the percentage of students who are proficient
has moved from 67th to 39th (of 68 districts) since the
reforms (Louisiana Department of Education, 2015).5 Averaging across all subjects and
grades, the test score gap between New Orleans and the rest of the state decreased by
0.37-0.44 student-level standard deviations (s.d.) from 2005 to 2014 (see Table 1). New
Orleans’ high school graduation rates, any college attendance, and college graduation
increased by 17-20 percentage points, 10 percentage points, and 2.1 percentage points,
respectively. These positive trends, combined with other evidence (CREDO, 2013b;
Abdulkadiroğlu et al., 2016), suggest the New Orleans reform effects may have been
positive. As a result, the New Orleans system has been widely hailed among reform
advocates and national political leaders with otherwise divergent views, ranging from
Democratic President Obama (2010) and his Secretary of Education Arne Duncan
(Dreilinger, 2014) to President George W. Bush’s Director of the Institute for Education
Sciences (Whitehurst, 2012) and Republican Louisiana Governor Bobby Jindal (America
Next, 2015). Reflecting this broad political support, at least 27 districts nationally are
following New Orleans’ lead with similar, albeit somewhat less aggressive, reforms (Hill
& Campbell, 2011).
There have been reasons to be skeptical, however. Prior studies have focused
narrowly on high-stakes test scores, which are prone to cheating and strategic behavior
(Jacob, 2005). More generally, prior studies have not been designed to estimate causal
5 For comparability, the post-Katrina New Orleans “district” ranking is based on a weighted average of the New Orleans RSD and OPSB schools.
effects of the city’s reforms.6 Here, we provide some of the first evidence about the
effects of the New Orleans reform package on students’ long-term academic outcomes
using several difference-in-differences strategies that compare the pre- and post-reform
periods in New Orleans to matched comparison groups. We find that the effects on
achievement and college graduation closely mirror the strong positive direction of the
descriptive trends. The reforms increased student achievement by 0.28-0.40 s.d., high
school graduation by 3-9 percentage points (10-14 percent from baseline levels), college
attendance by 8-15 percentage points (15-67 percent), college persistence by 4-7
percentage points (25-28 percent), and college graduation by 3-5 percentage points (30-
45 percent).
With additional analysis, we are able to largely rule out several threats to
identification, including population change, strategic behavior from performance-based
accountability, trauma and disruption from the hurricane, and the effectiveness of the
interim schools that evacuated students temporarily attended. The treatment effects
appear to be an order of magnitude larger than the potential biases, and some biases
partially cancel out. The main potential alternative explanation for the effects, beyond the
reforms themselves, is that the charter-based system gained a considerable financial
advantage over the comparison groups (Buerger & Harris, 2016), though these alone
cannot explain the overall effects.
6 Abdulkadiroğlu et al. (2016) studied the effectiveness of charter schools as compared with the schools run directly by the OPSB after the reforms started. The Center for Research on Education Outcomes (CREDO, 2015) compared annual student growth in post-Katrina New Orleans’ charter schools with growth of similar students (“virtual twins”) in traditional public schools mostly in other districts. These first two important studies focused only on the post-Katrina period and did not consider school quality prior to the reforms, and therefore were not intended or designed to study the effects of the reforms. Sacerdote (2012) found that New Orleans evacuees experienced larger increases in school quality than evacuees from other Louisiana parish/districts, which confirms the low performance of pre-treatment New Orleans schools but, again, does not address their post-Katrina improvement within New Orleans.
There are many ways to reform schools and a variety of roles for markets and
governments. New Orleans is an important case because it shares Friedman and other
market-focused school reformers’ focus on school choice for students and autonomy for
educators, but the city diverges by maintaining an active role for government in
regulation and performance-based contracting. The next section describes our methods
and how they address threats to identification. This is followed by discussion of data,
results, and conclusions.
Model and Identification
Estimation Strategy
We attempt to identify causal effects using a combination of matching and
difference-in-differences (DD) analysis. Specifically, we estimate the effects of the New
Orleans school reform package starting with standard two-period DD estimation (Angrist
& Pischke, 2009):
$$A_{ijt} = \gamma_j + X_{ijt}\beta + \lambda d_t + \delta\left(NOLA_j \cdot d_t\right) + \varepsilon_{ijt} \qquad (1)$$

where $A_{ijt}$ is the achievement of student $i$ in school district $j$ at time $t$, $\gamma_j$ is a vector of
school district fixed effects, $X_{ijt}$ is a vector of student covariates,7 $d_t$ indicates whether
the outcomes pertain to a single pre-treatment period or a single post-treatment period,
and $NOLA_j$ is an indicator set to unity for New Orleans and zero for students in the
matched comparison group districts. No other district in Louisiana experienced the
reforms, and these therefore serve as a useful comparison group. Under certain
assumptions discussed below, especially that student outcomes would have moved in

7 These include race, free/reduced-price lunch status, special education status, limited English proficiency, and grade repetition. In addition, we include bin indicators for each stratum in the matching process.
parallel absent the treatment, ordinary least squares estimation of $\delta$ provides an unbiased
estimate of the average treatment effect (ATE).8
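To make the mechanics of (1) concrete, the following is a minimal simulation sketch in Python; it is not the paper’s data or code. It generates a synthetic two-group, two-period sample with a known treatment effect (an arbitrary 0.35 s.d.) and recovers it as the coefficient on the group-by-period interaction.

```python
import numpy as np

# Minimal two-period difference-in-differences sketch (illustrative only):
# students split across a "NOLA" group and a comparison group, observed
# pre- and post-treatment, with a known treatment effect delta.
rng = np.random.default_rng(0)
n = 4000
nola = rng.integers(0, 2, n)        # NOLA_j: 1 = treated group
post = rng.integers(0, 2, n)        # d_t: 1 = post-treatment period
delta_true = 0.35                   # hypothetical effect in s.d. units

# Outcome = group effect + time effect + delta * interaction + noise
y = 0.10 * nola + 0.20 * post + delta_true * nola * post \
    + rng.normal(0, 0.3, n)

# OLS with an intercept, group and period dummies, and their interaction;
# the interaction coefficient is the DD estimate of delta.
X = np.column_stack([np.ones(n), nola, post, nola * post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
delta_hat = beta[3]
```

With a sample of this size, `delta_hat` lands close to the true 0.35; the paper’s actual specification additionally includes district fixed effects and student covariates.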
Our first estimates are based on equation (1) using only the most recent pre-
reform year possible, and the most recent post-reform period available in the data, which
varies by method (see below).9 However, the effects of the reforms may have emerged
gradually over time. In creating an entirely new system of schooling, New Orleans
leaders not only had to create new schools, but an entirely new governance structure and
new institutions to develop charter school operators (e.g., New Schools for New Orleans),
new avenues for recruiting teachers to the city (e.g., Teach for America, which had some
presence in the city before the reforms), and new ways to provide information to parents
to help them choose schools (e.g., New Orleans Parents Guide). The state RSD existed
prior to Katrina but had just a handful of staff and had not been designed to carry out its
new responsibilities.
To estimate these dynamic effects and avoid imposing restrictive assumptions of
two-period DD and related types of models,10 we instead rely mostly on Granger/event
study estimates (Granger, 1969; Autor, 2003; Angrist & Pischke, 2009) as follows:
$$A_{ijt} = \gamma_j + \lambda_t + X_{ijt}\beta + \sum_{\tau=1}^{m}\delta_{-\tau}\left(NOLA_j \cdot d_{t,-\tau}\right) + \sum_{\tau=0}^{q}\delta_{+\tau}\left(NOLA_j \cdot d_{t,+\tau}\right) + \varepsilon_{ijt} \qquad (2)$$

where $\lambda_t$ is a vector of year indicators, $m$ is the number of years in the data prior to
treatment, and $q$ is the number of years after treatment. This implies that $\delta_{-\tau}$ is the
(adjusted) difference in outcomes of the control and treatment groups $\tau$ periods before

8 Athey and Imbens (2002) discuss additional linearity assumptions used in DD estimation.
9 We refer to the spring of the school year throughout the remainder of the study, since this is when students take tests. So, 2005 means the 2004-05 school year and so on.
10 When there are more than two periods of data, it is sometimes recommended to add group-specific time trends as follows: $A_{ijt} = \gamma_{0j} + \gamma_{1j}t + \lambda_t + X_{ijt}\beta + \delta\left(NOLA_j \cdot d_t\right) + \varepsilon_{ijt}$, where $t$ is a continuous time-period variable and $\gamma_{1j}$ is the slope (Angrist & Pischke, 2009). This specification yields biased estimates, however, when there are dynamic effects (Pischke, 2005). Equation (2) avoids this problem.
treatment. This provides a test of parallel trends, and if the assumption holds, then it is
reasonable to interpret the post-treatment $\delta_{+\tau}$ as causal effects of the reforms.11 The estimation of (2) also
shows how the effects increase (or decrease) toward the longer-term effects from the
estimation of (1).
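The construction of the event-study interactions in (2) can be sketched as follows. The years, effect path, and sample sizes here are invented for illustration; the model is a saturated OLS with the final pre-treatment year (2005 in this sketch) as the omitted reference period, so pre-treatment interaction coefficients should sit near zero while post-treatment coefficients trace the dynamic effect.

```python
import numpy as np

# Event-study (Granger) sketch: interact the treated-group indicator with
# each year dummy, omitting the final pre-treatment year as the reference.
rng = np.random.default_rng(1)
years = np.arange(2002, 2011)       # treatment begins in 2006 (hypothetical)
n_per = 500
rows = []
for yr in years:
    for nola in (0, 1):
        # dynamic effect: zero pre-treatment, grows 0.1/year after 2005
        effect = 0.1 * max(0, yr - 2005) * nola
        y = 0.2 * nola + 0.05 * (yr - 2002) + effect \
            + rng.normal(0, 0.3, n_per)
        for yi in y:
            rows.append((yr, nola, yi))

data = np.array(rows)
yr_col, nola_col, y_col = data[:, 0], data[:, 1], data[:, 2]

# Design: intercept, group dummy, year dummies (2002 omitted),
# and NOLA-by-year interactions (reference year 2005 omitted).
cols = [np.ones(len(y_col)), nola_col]
for yr in years[1:]:
    cols.append((yr_col == yr).astype(float))
event_years = [yr for yr in years if yr != 2005]
for yr in event_years:
    cols.append(((yr_col == yr) & (nola_col == 1)).astype(float))
X = np.column_stack(cols)
beta, *_ = np.linalg.lstsq(X, y_col, rcond=None)
event_coefs = dict(zip(event_years, beta[-len(event_years):]))
```

Plotting `event_coefs` against year gives the usual event-study figure: flat near zero before 2006 (the parallel trends check) and rising afterward.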
We use two general strategies to estimate equations (1) and (2): (a) panel analysis
using only that portion of the pre-hurricane student population that returned to their pre-
hurricane district for at least one post-treatment year; and (b) pooled cross-sections of
student cohorts who were in the same grades pre- and post-treatment (e.g., comparing
achievement for the 2005 cohort of 4th graders with the 2014 cohort of 4th graders). With
the panel approach, we are able to study a fixed group of individuals and thereby account
for unobserved differences directly; however, the returning group is a small, non-random
subsample of the original population, which limits statistical power and generalizability.
Also, eventually, the outcomes of the pre-treatment students cannot be measured because
they move on to non-tested grades or graduate, making it impossible to study the longer-
term reform effects that are of primary interest. With pooled cross sections, the sample is
much larger as almost all students who were in New Orleans schools pre- or post-Katrina
contribute to the estimation, but we have to rely on observable demographic information
to account for population change.
Our methods vary somewhat depending on the dependent variable. With test
scores, we can use both panel and pooled analyses, at least for certain years. However,
with one-time events like high school graduation and college attendance, we have to rely
mainly on the pooled approach because, by their nature, these long-term outcomes cannot 11 While these pre-treatment 𝛿!! terms can be interpreted as parallel trends tests in the event studies, we report a separate parallel trends test that assumes simple linear pre- and post-trends for control and treatment.
11
be measured for individuals annually across time and therefore cannot be used to track
individuals pre- and post-treatment to account for unobservable differences. This limits
the usefulness of the panel approach with one-time events.
Where possible, we designed the analyses to have enough pre-treatment data to
allow parallel trends tests. We also account for potential endogeneity using a variety of
the methods discussed by Bertrand, Duflo, and Mullainathan (2004): graphing the
dynamics of the effects (see model (2)), adding treatment-specific time trends that vary
pre- and post-reform, and looking for an effect prior to intervention (placebo tests). The
use of both panel and pooled analysis, one additional identification strategy, and various
direct tests for specific forms of bias all reinforce our results.
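A placebo test of the kind just described can be sketched as follows: re-estimate the DD model on pre-treatment data only, with an artificial treatment date partway through the pre-period, and check that the "effect" is near zero. All variable names and magnitudes here are illustrative, not taken from the paper.

```python
import numpy as np

# Placebo-test sketch: two-period DD run entirely within the pre-treatment
# window, with a fake "post" indicator for its later half. Because no true
# treatment occurred, the interaction coefficient should be close to zero;
# a large estimate would signal diverging pre-trends.
rng = np.random.default_rng(3)
n = 4000
nola = rng.integers(0, 2, n)
fake_post = rng.integers(0, 2, n)   # later half of the PRE-treatment window
# Common time trend and a stable group gap, but no treatment effect
y = 0.15 * nola + 0.05 * fake_post + rng.normal(0, 0.3, n)
X = np.column_stack([np.ones(n), nola, fake_post, nola * fake_post])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
placebo_delta = beta[3]             # should be statistically near zero
```

If `placebo_delta` were comparable in size to the main estimates, the parallel-trends assumption underlying the DD design would be suspect.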
We report results estimated at the student-level with Generalized Estimating
Equation (GEE) clustering at the district level (Liang and Zeger, 1986). However, the
GEE approach rests on asymptotic assumptions about the number of clusters. Inference is
generally only valid with at least 30-50 clusters (Kezdi, 2004; Cameron, Gelbach, and
Miller, 2008; Angrist & Pischke, 2009), and our preferred estimates include only 6-8
districts. To address this, we also report estimates using almost all of the more than 60
districts in the state. The standard errors are smaller in that case, but not enough to
change the conclusions. We also carried out some estimation with data aggregated to the
district-by-year level, which generally yields conservative standard errors (Bertrand et al.,
2004; Angrist & Pischke, 2009).12 The results are generally robust to these and many
other variations in methods.
12 A third common alternative, the wild bootstrap (Cameron, Gelbach, & Miller, 2008) is infeasible in this case because there is only one treatment cluster.
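As an illustration of district-level clustering, this sketch computes Liang-Zeger cluster-robust (sandwich) standard errors by hand for a simulated DD regression. The number of districts, the share treated, and the error structure are hypothetical, and the GEE estimation used in the paper differs in details; the sketch only shows how the clustered variance is assembled.

```python
import numpy as np

# Cluster-robust (Liang-Zeger) standard errors by hand for an OLS DD
# regression, clustering at the district level. Simulated, illustrative data:
# 40 districts, 8 of them "treated" (echoing the paper's 6-8 districts).
rng = np.random.default_rng(2)
n_districts, n_per = 40, 100
rows = []
for d in range(n_districts):
    district_shock = rng.normal(0, 0.2)      # error component shared within district
    treated = 1.0 if d < 8 else 0.0
    for post in (0.0, 1.0):
        y = 0.3 * treated * post + district_shock \
            + rng.normal(0, 0.5, n_per)
        for yi in y:
            rows.append((d, treated, post, yi))
data = np.array(rows)
d_id, treated, post, y = data.T

X = np.column_stack([np.ones(len(y)), treated, post, treated * post])
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Sandwich "meat": sum over clusters of (X_g' e_g)(X_g' e_g)'
meat = np.zeros((4, 4))
for d in range(n_districts):
    mask = d_id == d
    s = X[mask].T @ resid[mask]
    meat += np.outer(s, s)
V = XtX_inv @ meat @ XtX_inv
se_dd = np.sqrt(V[3, 3])                     # clustered SE of the DD term
```

Because the within-district error is shared across observations, this clustered SE is typically larger than the naive OLS SE, which is the motivation for clustering (and, with few clusters, for the aggregated district-by-year robustness checks).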
Threats to Identification
There are many general threats to identification with natural experiments,
especially endogenous policy adoption and the simultaneous (unobserved) policy changes
that also affect the outcome of interest. In the case of the New Orleans school reforms,
there are five specific threats that serve as alternative potential causes of the city’s
positive test score trends.
First, the population of the city changed (The Data Center, 2014; Vigdor, 2008).
In the process of rebuilding, city leaders decided to shut down and eventually replace
most of the major public housing projects. For this and other reasons, low-income
residents may not have returned, and more socio-economically advantaged families may
have replaced them, increasing outcomes for reasons unrelated to the reforms.
Second, when Louisiana families evacuated, they generally placed their children
in schools near their new and temporary residences. New Orleans evacuees experienced
larger gains in school quality in these “interim schools” relative to non-New Orleans
evacuees (Sacerdote, 2012), and if these gains did not fade out, then some of the later
increases in achievement might reflect the performance of these interim schools rather
than the New Orleans reforms.13
13 A related possibility is that the students who evacuated benefited from being in better housing and/or neighborhoods. It is possible that, as in some past disasters, New Orleans’ housing quickly surpassed the quality that had preceded it and that this drove positive educational effects for those who did return (Hornbeck & Keniston, 2017). This seems unlikely, however, as the quality of the New Orleans’ housing stock was unchanged for non-flooded homes for sale and remained considerably worse for flooded homes one full year after Katrina (McKenzie & Levendis, 2010). Most homeowners lacked flood insurance, and state and federal programs averaged about $40,000 per home, or about 17 percent of their prior values. This likely understates the percentage of the housing value covered for low-income homeowners, but not enough to generate an overall improvement in the housing stock for low-income families. It is also worth considering whether there was a general improvement in neighborhood quality after the storms, in which case the effects might be interpreted more in line with the well known Moving to Opportunity (MTO) program (Ludwig et al., 2013). However, this, and other evidence presented later about poverty, suggests
Third, prior research has shown that accountability induces some schools to
manipulate high-stakes measures and/or reallocate resources in ways that reduce
unobserved outcomes that are lower-stakes (Jacob, 2005; Figlio, 2006; Koretz, 2009).
Such strategic behavior may be especially important in New Orleans where schools are
closed or taken over based substantially on the test scores and graduation rates that
represent some of our main dependent variables (Bross, Harris, & Liu, 2016) and
therefore where the stakes tied to test scores and graduation rates are very high.
Fourth, NCLB had been adopted a few years prior to Katrina, and the law’s key
provisions were just being implemented when the storm and reforms hit. Since low-
performing schools are the focus of NCLB sanctions, and New Orleans had a
disproportionate share of such schools, the post-Katrina improvements in New Orleans’
outcomes might have occurred anyway. While the effects of NCLB have apparently been
modest on a national level (Dee & Jacob, 2011), the effects were larger in some places
than others, and larger effects are especially likely in cities like New Orleans where a
large share of schools would have been affected.
Fifth, our difference-in-differences analysis suggests that average per-pupil
spending increased by roughly $1,358 (13 percent). A growing body of research
suggests that school spending increases student outcomes (e.g., Jackson, Johnson, &
Persico, 2016; Lafortune, Rothstein, & Schanzenbach, 2016). These additional resources
may have increased outcomes even if the market-based reforms had not been adopted.
13 (cont.) that the improvements in student outcomes we observe cannot plausibly be explained by neighborhood effects either. We thank David Figlio for mentioning this potential role for housing.

While these first five threats to identification suggest the trends would tend to bias
estimated effects upwards, the direction of a sixth threat could have the opposite
influence: Hurricane Katrina was one of the worst disasters in American history14 and
created persistent trauma and anxiety for residents (DeSalvo et al., 2007; Elliott & Pais,
2006; Weems et al., 2010). Some of these psychological effects were driven by poor
post-storm labor market outcomes among those who had lived in the most heavily
flooded areas (Groen & Polivka, 2008). Those with worse post-hurricane housing and
labor market outcomes also experienced worse Post-Traumatic Stress Disorder (PTSD)
(Elliott & Pais, 2006). While most of the psychological evidence pertains to adults, there
is also evidence of trauma and disruption among children more than two years after the
hurricane (Brown et al., 2011)15, and this apparently reduced academic learning at least in
the short term (Pane et al., 2006, 2008; Sacerdote, 2012).
Data and Matching
Dependent Variables
The Louisiana Department of Education (LDOE) provided student-level
longitudinally linked data for essentially all public school students in the state for each
year 2001-2014. Key variables include student test scores, exit codes (for high school
graduation), college outcomes, student demographics, grade levels, and schools attended.
Pre- and post-Katrina, students took state standardized tests in grades 3-8.16
14 As many as 1,900 people died as a result of the storm, and the city experienced at least $80 billion in damage to physical infrastructure (Pane et al., 2008).
15 One sample of students reported thoughts of the following common disaster-related events 30 months after the hurricanes: “having thoughts someone might die (79%), having clothes or toys ruined (78%), having their home badly damaged or destroyed (65%), witnessing others hurt during the storm (45%), having a pet hurt or die (41%), thinking they might die during the storm (38%), having trouble getting food and water (20%)” (Brown et al., 2011, p. 576).
16 The high school testing data are omitted because, like many states, Louisiana switched to End-of-Course (EOC) exams in high school after Katrina, which created issues of comparability. Also, students can take the high school tests in different grades, depending on when courses are available, creating an additional source of endogeneity. For these reasons, we focus on scores in grades 3-8, where the testing regime is more stable, along with high school graduation and college outcomes.
High school graduation is a stronger predictor of long-term life outcomes than
student test scores and serves essentially as a prerequisite to college (Levin et al., 2006;
Heckman, Lochner, & Todd, 2008). The design of state data systems, however,
complicates the measurement of high school graduation because exit codes are self-
reported by schools. This means, first, that some high school credentials will be higher
quality than others. Also, since Louisiana high schools are held accountable for
graduation, they may misreport the data in ways that inflate their graduation rates, similar
to cheating on high-stakes standardized tests (e.g., Jacob, 2005). Manual audits by the
state using samples of the hardest-to-verify codes (transfer to private schools, out-of-state
public schools, and home schools) have been unable to confirm the school data
(Louisiana Department of Education, n.d.).
To address these problems, we use three different graduation rates. Variable
Grad1 counts only regular diplomas as graduates and defines the denominator (zeros and
missing exits) in ways that approximate the state-defined graduation rate (hard-to-verify
exit codes are coded as missing); Grad2 is the same but counts hard-to-verify exit codes
as zeros (non-graduation); and Grad3 uses the Grad1 definition of the denominator but
broadens the definition of the numerator to include alternative completion such as
GEDs.17 Unless otherwise specified, we include on-time and delayed high school graduation together, though restricting to on-time graduation has a minimal effect.18
17 Since the analysis is at the district level, all three high school graduation rate definitions ignore transfers between schools within districts; transfers between districts are counted as missing in all three definitions because they are neither positive nor negative outcomes, and, depending on the grade, these transfers are removed from the cohort of the sending school for graduation purposes.

18 An additional potential problem with the high school data is that there was no formal validation process prior to 2007 (when graduation was added into school performance measures for accountability purposes). Therefore, we are also assuming that any trend in the measurement error due to this lack of validation is orthogonal to treatment.
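The three graduation-rate definitions can be summarized in a short sketch. The exit codes and the five-student cohort below are hypothetical, not the authors' actual coding:

```python
# Illustrative sketch (not the authors' code) of the three graduation-rate
# definitions. Exit codes and the student cohort are hypothetical.
REGULAR = "regular_diploma"
ALT = "ged_or_alt_completion"
HARD_TO_VERIFY = {"private_transfer", "out_of_state_transfer", "home_school"}

def grad_rates(exits):
    """exits: list of exit-code strings, one per student in the cohort."""
    def rate(numerator_codes, count_htv_as_zero):
        num = den = 0
        for code in exits:
            if code in HARD_TO_VERIFY and not count_htv_as_zero:
                continue  # treated as missing: dropped from the denominator
            den += 1
            if code in numerator_codes:
                num += 1
        return num / den
    return {
        "Grad1": rate({REGULAR}, count_htv_as_zero=False),  # hard-to-verify codes missing
        "Grad2": rate({REGULAR}, count_htv_as_zero=True),   # hard-to-verify codes = zeros
        "Grad3": rate({REGULAR, ALT}, count_htv_as_zero=False),  # numerator adds GEDs etc.
    }

cohort = [REGULAR, REGULAR, ALT, "dropout", "private_transfer"]
print(grad_rates(cohort))
# Grad1 = 2/4 = 0.5; Grad2 = 2/5 = 0.4; Grad3 = 3/4 = 0.75
```

Note how a single hard-to-verify exit code moves the same cohort's rate from 0.5 (Grad1) to 0.4 (Grad2), which is why all three definitions are reported.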
An additional methodological challenge emerges because state and federal
definitions of graduation rates are based on the share of first-time 9th graders who
graduate on time with regular diplomas.19 This means we need five years of pre-treatment
data to calculate a single pre-treatment on-time graduation rate (one year for identifying
students who are first-time 9th graders plus four more years of high school). In order to
carry out the parallel trends test we also need two pre-treatment cohorts and therefore six
years of pre-treatment data. Given that we only have five years of pre-treatment data, we
report pooled results for first-time 9th graders without a parallel trends test and for first-time 10th graders with parallel trends tests.
We also study college attendance (on-time and any enrollment), persistence, and
graduation. For these outcomes, we restrict the sample to first-time 12th graders (rather than high school graduates) because of concerns that high schools might not let some students
graduate if they are performing poorly or not planning to attend college. We also restrict
persistence and graduation to that which occurs within 5 years of the end of students’ first
year in 12th grade, so that the time period is the same before and after the reforms.
The underlying sources of the college data provided by LDOE are the Louisiana
Board of Regents (BOR; 2001-2011) and the National Student Clearinghouse (NSC;
2005-2016), both of which cover two-year and four-year colleges. The BOR includes
only in-state public colleges and universities (two-year and four-year) and some private
colleges, but only includes information about on-time college enrollment (the year
immediately after high school graduation), omitting persistence and graduation. The NSC
19 In both cases, we use students who are in the given grade for the first time. It is common for students who are doing poorly academically to be held back and remain in the same grade for multiple years. Some of these first-time 9th and 10th graders may not be promoted to higher grades, and this shows up in our results through either dropout or delayed graduation.
data, in contrast, cover above 90 percent of all colleges, public and private across the
nation (Dynarski, Hemelt, & Hyman, 2013). In Louisiana, 93 percent of higher education
institutions are covered in the NSC; this figure understates enrollment coverage because the excluded colleges are very small. The NSC data also include on-time college enrollment but
go further to incorporate any enrollment, persistence, and graduation. This has
implications for the matching process with college outcomes, as discussed below.
An implicit assumption throughout the analysis is that measurement error is
orthogonal to treatment. This could be violated if there were either pre-to-post treatment
changes in college missing data rates or changes in the composition of colleges attended
by students from treatment and comparison districts. The former could occur, for
example, because of the shift from BOR to NSC, each of which could have different
types of measurement error; and the latter could arise if certain colleges were affected by
the hurricanes, or if there were treatment effects on the types of colleges students
attended that were correlated with the missing data rate.20 Our additional analysis largely
rules this out and seems consistent with orthogonal measurement error.21
20 We cannot distinguish missing observations from non-enrollment in our data in the case of high school graduation and college outcomes. As is standard in this form of analysis, when a student is “missing,” we assume the student is not enrolled.

21 To test whether measurement error is orthogonal, we assumed that the BOR data were complete for those institutions included in it, and then compared the BOR with the NSC to obtain the NSC missing data rate for every college for the years the two data sources overlap (2005-2011). (As noted above, only a handful of Louisiana colleges did not participate in the NSC, and these receive a missing data rate of 100 percent.) Next, we identified the percentage of students in each district who attended each college to calculate a college missing data rate by district. We found no effect on the missing data rate, suggesting that the data satisfy the above measurement error assumption. (Since we are assuming the BOR data are accurate, all districts had a 0 percent missing data rate in the pre-treatment period. For this reason, we also did not estimate the effect via DD analysis. Instead, we only calculated the first differences in the post-treatment period.) Since the NSC also covers out-of-state colleges, we also considered whether this might introduce bias; however, 95 percent of Louisiana college-goers attend in-state, so this has a minimal influence.
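The district-level missing-data check in footnote 21 amounts to an enrollment-weighted average of college-level missing rates. A minimal sketch, with hypothetical enrollment counts:

```python
# Hedged sketch of the district missing-data-rate calculation described in
# footnote 21. All enrollment counts below are hypothetical.
def college_missing_rate(bor_count, nsc_count):
    """Share of BOR-recorded enrollees not found in the NSC (BOR assumed complete)."""
    return max(0.0, (bor_count - nsc_count) / bor_count)

def district_missing_rate(attendance, missing_rate):
    """Enrollment-weighted average of college missing rates for one district.
    attendance: {college: number of district students attending that college}
    missing_rate: {college: NSC missing-data rate for that college}"""
    total = sum(attendance.values())
    return sum(n * missing_rate[c] for c, n in attendance.items()) / total

rates = {"A": college_missing_rate(100, 95),  # 5 percent of BOR records unmatched
         "B": college_missing_rate(50, 50),   # fully covered by the NSC
         "C": 1.0}                            # non-participating college
print(round(district_missing_rate({"A": 60, "B": 30, "C": 10}, rates), 2))  # 0.13
```

A district sending many students to non-participating colleges would show a high rate; the text reports no treatment effect on this rate, supporting the orthogonality assumption.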
Matching
Having a within-state comparison group allows us to account for the differences
in the test scales and state data collection methods across grades and years, as well as
changes in state policy that are unrelated to the New Orleans school reforms. Our
preferred specifications also narrow the comparison group further to just hurricane-
affected districts. If the trauma/disruption and interim school effects were the same in
New Orleans and other hurricane-affected districts, then this sample restriction would
eliminate them as sources of bias.22 As we will show, the estimates are robust to statewide
versus hurricane-affected matched samples with test scores and college entry. With high
school graduation, the effects are also consistently positive and significant across
samples, but differ in magnitude.
Panel Matching. In the panel analysis with test scores, our first preferred
matching method involves the following steps: (a) restrict to hurricane-affected school
districts; (b) from those affected districts, drop students who never returned to their pre-
treatment district; and (c) among the returning students, use Mahalanobis matching to
identify comparison students with similar composite test score levels in both of the two
most recent pre-treatment years (2004 and 2005), stratifying by year of return. To
account for grade repetition, step (c) is further stratified so that students who ever-
repeated (never-repeated) a grade pre-treatment are exact-matched to other students in
22 The hurricanes apparently affected New Orleans more than all but perhaps two districts. According to Pane et al. (2006), 81 percent of the displaced students came from Orleans, Jefferson, and Calcasieu Parishes. Five additional parishes account for nearly all of the remaining displaced students: St. Tammany, St. Bernard, Plaquemines, Vermilion, and Cameron. Pane et al. (2006) define “displaced” as any student who exited the school system because of the hurricane, as determined by the state government and parishes. We consider all eight parishes to be hurricane-affected in what follows. Pane et al. (2008) show that New Orleans accounted for more than half the students in the entire state who left their home districts for a long enough period that they enrolled in another Louisiana district or left the state and did not return.
each respective category pre-treatment.23 Step (b) helps ensure that the comparison group
is similar to New Orleans schools24 in both the observed and unobserved factors
associated with return to the original district.25 In other estimates, we added
demographics to the list of matching variables, along with pre-treatment dependent
variables; but the results were robust, and we therefore report only the outcome-matched
results. Panel matching requires implausible assumptions in the case of high school
graduation and college outcomes because we cannot match, at the student level, on
dependent variables that are one-time events. Therefore, we employ the panel method
only with test scores.
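Step (c) of the panel matching can be sketched as nearest-neighbor Mahalanobis matching on the two pre-treatment score years. The sketch below uses simulated data and is not the authors' implementation; in the actual analysis the matching is further stratified exactly by year of return and pre-treatment grade repetition:

```python
# Illustrative sketch (not the authors' implementation) of step (c):
# nearest-neighbor Mahalanobis matching on the 2004 and 2005 composite scores.
import numpy as np

def mahalanobis_match(treated, comparison):
    """For each treated row (2004 score, 2005 score), return the index of
    the nearest comparison row by Mahalanobis distance."""
    pooled = np.vstack([treated, comparison])
    vi = np.linalg.inv(np.cov(pooled, rowvar=False))  # inverse covariance matrix
    matches = []
    for x in treated:
        d = comparison - x                         # differences to each candidate
        dist = np.einsum("ij,jk,ik->i", d, vi, d)  # squared Mahalanobis distances
        matches.append(int(np.argmin(dist)))
    return matches

rng = np.random.default_rng(0)
treated = rng.normal(-0.5, 1.0, size=(5, 2))     # hypothetical New Orleans returnees
comparison = rng.normal(0.0, 1.0, size=(50, 2))  # other hurricane-affected returnees
print(mahalanobis_match(treated, comparison))
```

Scaling by the inverse covariance matrix is what distinguishes Mahalanobis from Euclidean matching: it downweights directions in which the two score years co-vary strongly.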
Pooled Matching. Since it is not possible to match on pre-treatment dependent
variables for individual students in the pooled method, our preferred strategy is to match
whole schools using their pre-treatment school-level dependent variables and then
assume that changes in the unobserved factors affecting student outcomes are
(conditionally) independent of treatment. We show below that this assumption is
plausible, suggesting that the pooled estimates have limited bias.
In our preferred pooled analysis, we match the post-reform cohorts on pre-reform measures as follows: (a) restrict to hurricane-affected districts; (b) identify potential match schools, i.e., those that exist in 2002-2005 and in all available pre- and post-treatment years and have at least 10 students in each tested subject and grade;26 (c) drop districts that have fewer than 3-4 potential school matches (depending on the outcome measure); and (d) among the remaining schools, Mahalanobis match treatment schools to the comparison group based on the specific pre-treatment outcome levels. The entire matching process is carried out separately by district, so that each district is matched to the distribution of New Orleans.27 Steps (a)-(d) are carried out separately for each student outcome measure, so the matched comparison schools differ somewhat across outcomes.

23 In Louisiana, students were retained in grades 4 and 8 if they did not reach the Basic level on one or more tests. The number of tests for which Basic is required has changed over time, as has enforcement of the policy, and it is common for schools to seek waivers to allow students to be promoted regardless of their scores.
With college entry, the matching process is essentially the same as high school
graduation, matching on the dependent variable. The situation is more complex, however,
with longer-term college outcomes. We only have the NSC data for a single pre-reform
year (2005), which precludes the usual parallel trends test when studying college
persistence and graduation. Instead, we take the school weights from the matching
process that uses the 2005 values for the lagged dependent variable and apply these
weights to a test that uses multiple years of college attendance, which we can measure in
multiple pre-treatment years. This process still allows us both to match on the specific
dependent variable of interest (in levels) and to test for parallel trends at the same time,
even if the data do not allow the usual test with parallel trends on the dependent variable itself (e.g., when college graduation is the dependent variable, the parallel trends test is based on college attendance).

26 We matched on a different number of pre-reform years in the panel and pooled analyses because different methods yielded parallel trends (though the post-trends look very similar regardless of matching).

27 In the panel method, we match at the student level so that the weights create comparison districts of the same size as New Orleans. In the pooled method, the matching is at the school level, which means that each comparison district has the same number of schools as New Orleans. This, combined with the fact that we are estimating at the student level, means that the weighted number of students in the pooled sample depends on the size of the matched schools. Later, we refer to robustness checks in which we aggregated the data up to the school and then district level so that each district has the same weight regardless of size or number of students.
This pooled approach is essentially the only feasible method with high school graduation and college outcomes. It also has the advantage of allowing us to study
effects over a much longer period of time, in contrast to the panel method. A final
advantage of the pooled approach relates to the implementation of NCLB, which might
have increased scores in New Orleans even in the absence of the city’s larger reform
effort and probably done so more than in other districts because of the city’s
disproportionate share of low-performing schools. Since NCLB places pressure on whole
schools, matching at the school level, as we do in the pooled method, may better address
observed and unobserved differences in how NCLB affected student outcomes in the
comparison and treatment groups. In any event, we expect such bias to be relatively small
given prior research on NCLB (Dee & Jacob, 2011).
Descriptive Statistics
Table 1 Panel A provides descriptive statistics for New Orleans’ student
demographics and outcomes pre-treatment (2005). In addition to reinforcing the
improvement over time in New Orleans’ student outcomes, the data show that the New
Orleans public school student population was extremely socio-economically
disadvantaged in the pre-reform period with 83 percent eligible for free and reduced price
lunch (FRPL); almost all the students are racial/ethnic minorities, and 94 percent are
black (figures vary slightly by year). The differences between 2005 and 2014 also provide a
first indication that the demographics of the New Orleans public school population did
not change significantly or in a clear direction after the hurricane. The percentage of
students eligible for FRPL increased after the reforms from 83 to 88 percent, and the
percent of black students dropped from 94 to 88 percent. We provide additional analysis
of population change using Census data later.
Panel B of Table 1 compares the characteristics of New Orleans and the panel
matched comparison group in the pre-reform period only. The matching succeeded in
improving the baseline match between New Orleans and the comparison group for the
panel analysis. Without matching, as we have seen, New Orleans was 0.55 s.d. below the
rest of the state (averaging across subjects). With matching, this is reduced. For example,
the table shows that the panel comparison group is only 0.04-0.10 s.d. higher than New
Orleans in pre-reform test levels (averaging across grades). The fact that we can match
only at the school level in the pooled analysis clearly makes the match less successful,
however. The pooled matching method yields a difference between New Orleans and the
comparison group of 0.35 s.d. Table 1 also shows similar gaps in baseline outcome levels
for high school graduation and college outcomes. While the focus here is on outcome
levels, we are most concerned with the parallel trends tests shown later.
Recent research suggests that the DD method, especially with matching on lagged
dependent variables, tends to replicate the results from randomized control trials (St.
Clair, Hallberg, & Cook, 2016). Also, the specific methods proposed here at least partly
address all of the main threats to validity. The panel DD avoids the issue of population
change. The restriction to hurricane-affected districts addresses interim school effects and
trauma/disruption. Matching on test scores helps address the threat posed by NCLB.
Below, we discuss additional methods for addressing population change in the pooled
analysis as well as strategic behavior from test-based accountability.
Population Change
One of the main threats to identification in the pooled analysis is that the
unobserved characteristics of the population may have changed disproportionately in
New Orleans relative to the comparison group. As noted earlier, the New Orleans
population had even higher rates of FRPL eligibility after the reforms than before (Table
1). However, FRPL is problematic because it cannot capture the difference between
students just below the poverty line and those in extreme poverty, and because the FRPL
reporting rates depend on how schools administer the FRPL program.28 We therefore
provide additional evidence on this issue.
Table 2 Panel A provides data on pre-treatment 3rd graders, including all pre-
treatment students and only those who returned (returnees), for New Orleans and other
hurricane-affected districts. By 2010, New Orleans returnees had somewhat lower pre-
treatment scores than the overall pre-treatment New Orleans population, while, in the
other districts, the returnee scores were higher than the pre-treatment population. The DD
therefore favors the comparison districts by 0.043 s.d. In other words, the change in the
population actually reduced post-Katrina New Orleans scores by a small amount, leading
to a possible downward bias in our pooled effect estimates.
Since the analysis of the panel sample is somewhat limited (e.g., it includes only
returnees, and the pooled analysis includes all post-Katrina students), we commissioned
the U.S. Census Bureau to provide detailed demographics for households with students in
28 More recently, the federal government has also instituted Community Eligibility, which reduces validity of school-level FRPL percentages as indicators of poverty. However, this change occurred after the years in the present analysis.
public schools for each district in the state.29 Table 2 Panel B provides difference-in-
differences analysis of these Census data, showing that some socio-economic changes
favor New Orleans and others favor the hurricane-affected districts. For example, median
household income dropped by $736 in New Orleans, but increased in the comparison
districts by $1,750, for a simple DD of -$2,486 (2012 dollars).30 The percentage of the
population with a BA or higher, however, increased by five percentage points in New
Orleans but by only three percentage points in the comparison group, for a DD of two
percentage points favoring New Orleans.
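These simple DDs are just double differences of the reported changes; for concreteness:

```python
# Difference-in-differences arithmetic for the Census comparisons in the text
# (changes taken from Table 2 Panel B; income in 2012 dollars).
def dd(treat_change, comparison_change):
    """Change in New Orleans minus change in the comparison districts."""
    return treat_change - comparison_change

# Median household income: -$736 in New Orleans vs. +$1,750 in comparison districts.
print(dd(-736, 1750))  # -2486
# Share with a BA or higher: +5 percentage points vs. +3 percentage points.
print(dd(5, 3))        # 2
```

The two DDs point in opposite directions, which is why the text turns to a prediction exercise to gauge their net achievement implications.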
To identify the potential influence of these Census-based demographic shifts on
student learning, we used data from the USDOE’s Early Childhood Longitudinal Study
(ECLS) to estimate the partial correlation between achievement levels and each of the
demographic measures.31 With the resulting regression coefficients (shown in Table 2
Panel C), we then carried out an out-of-sample prediction of the achievement
levels/growth change expected from the changes in Census demographic measures.32 The
results are shown in Table 2 Panel D. The simulated cumulative effect across five years
in the reformed school system (our estimate of the “dosage”33), averaged across the
demographic measures, is 0.012 s.d. with a range of -0.012 (favoring the comparison
29 The Census could only provide these data for the three parishes/districts with more than 100,000 residents (Calcasieu, Jefferson, and St. Tammany).

30 The absolute decline in socio-economic characteristics in New Orleans is corroborated by Vigdor (2008).

31 In each regression, the ECLS test score (in levels and growth, respectively) is regressed on one demographic measure and a vector of school fixed effects.

32 We estimate the models separately for achievement levels and achievement growth so that the cumulative predicted effect reflects both. See table notes for details on the different cumulative measures.

33 For students who were enrolled in 2006, we found an average of 5.5 years, but this is an over-estimate because some students would have (re-)entered after 2006, and these students would have lower dosages. Given that these data include 2006-2014 (eight years), we might have expected a higher number, but note that dosages are truncated for students who were very young or near the end of their high school careers in 2006. Also, some students switch between the public and private schools and/or between districts.
districts) to 0.044 s.d. (favoring New Orleans).34 The latter number suggests a possible
upward bias in the treatment effects estimates in the pooled analysis, but all of the
estimates are small in magnitude relative to the average treatment effects we show later.
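The prediction step multiplies each demographic shift by its estimated achievement coefficient and averages the implied effects. A minimal sketch, where the coefficients and shifts are hypothetical stand-ins for the actual values in Table 2 Panels B and C:

```python
# Hedged sketch of the out-of-sample prediction: multiply each Census
# demographic shift (a DD, in the measure's own units) by its ECLS partial
# correlation with achievement, then average the implied effects across
# measures. All numbers below are hypothetical.
def predicted_effect(coefs, shifts):
    """coefs: {measure: s.d. of achievement per unit}; shifts: {measure: DD change}."""
    effects = [coefs[m] * shifts[m] for m in coefs]
    return sum(effects) / len(effects)

coefs = {"median_income_10k": 0.05, "pct_ba_plus": 0.004}    # hypothetical ECLS coefficients
shifts = {"median_income_10k": -0.2486, "pct_ba_plus": 2.0}  # -$2,486 and +2 points
print(predicted_effect(coefs, shifts))
```

In this toy case the income decline and the rise in BA attainment largely offset, echoing the paper's finding that the net predicted demographic effect is small relative to the treatment effects.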
Overall, it appears that the elimination of public housing and the disproportionate
impact of flooding on low-income neighborhoods had a minimal effect on the relative
demographics or test scores of the public school population in the years after the
hurricanes. This is partly because the hurricane affected 80 percent of the city, so that all
demographic groups were affected. For example, the black middle class, whose children
also attended public schools in large numbers, saw a large population drop (Plyer,
Shrinath, & Mack, 2015). In addition, the increase in the number of federal Section 8
public housing vouchers was much larger than the drop in public housing units, so more
low-income families, and their children, were apparently able to return to the city than
appears at first glance.35 This evidence suggests that population change is not a major
threat to identification in the pooled analysis, especially after controlling for measurable
demographic changes.
34 The results in Table 3 are based on reading test scores only and for the entire population. We therefore also re-estimated the Panel C models for low-income ECLS students, which increases the predicted achievement effects, and re-estimated for ECLS math, which reduces the effects; thus the reported effects on reading for the whole population represent a middle ground. We thank Jane Lincove for suggesting these checks.

35 According to Seicshnaydre and Albright (2015), the number of housing vouchers used changed from 4,763 in 2000, to 8,400 in 2005 (which includes some post-Katrina months), to 17,437 in 2010, for an increase of at least 10,000 units. In contrast, public housing units dropped by about 5,000 units.
Results
Average Treatment Effects on Achievement
Panel Estimates for Achievement
Table 3 reports results from the panel analysis estimation of average treatment
effects (ATEs) based on equation (1) for 4th and 5th graders (separately) for 2006
returnees. Again, these are from models with matching on the lagged dependent variable,
controls for student demographics and grade repetition, and bin indicators.36 The column
(1) sample includes almost all Louisiana students who have data pre- and post-hurricane
(without matching)37; column (2) includes the entire state matched on test score levels.
We follow the same pattern in columns (3) and (4), showing unmatched and matched
samples with the hurricane-affected districts, the latter being our preferred specification.
Since our test scores end in grade 8, we can follow pre-treatment 4th (5th) graders only
through 2009 (2008).
A bit more than half of the 32 estimates are positive and significant, and only
three have negative signs (all are insignificant). The estimates are similar between the
state and hurricane-affected districts, though the unmatched estimates often fail the
parallel trends tests (see the coefficient and significance level under the standard error for
each estimate in the table). Matching largely solves the parallel trends problem, and the
point estimates remain mostly positive.
36 The year of return is based on the year the tests were taken, so a 2007 returnee likely returned in the fall of 2006. However, 2006 returnees almost all returned in spring of 2006 because all the schools were closed through fall of 2005. The vast majority of students who returned and who have post-Katrina data in grades 3-8 had returned by 2007. Also, there are very few returnees in other hurricane-affected districts to match with after 2007.

37 We excluded only those students who did not return to their 2005 district for at least one year and students who took alternative assessments. These same exclusions apply to both New Orleans and the comparison districts.
Focusing just on our preferred specification where students are matched to those
in other hurricane-affected districts, the point estimates average about 0.10 s.d. through
2009 for pre-treatment 4th and 5th graders (Table 3 Panel A). Since the matching is based
on test score levels, the sensitivity to matching may mean that NCLB, other statewide
policies or a change in the test scale had particular influence on low-performing students
and schools in the matched comparison group.
To leverage the entire panel, and not just two time points, we also estimate
equation (2) (i.e., Granger/event studies). Especially in math and ELA, the effects in later
years emerged from a combination of an initial dip in scores in the first year of return
followed by a positive upward trajectory. The negative effects in the first year of return
could reflect either low-performance of schools in the early years (followed by
improvement) or the trauma of returnees in New Orleans the first few years after the
storm that faded out. It is difficult to empirically distinguish between these alternative
theories, though the pooled results suggest that schools steadily improved after 2009 (see
later results), which suggests that trauma/disruption partly explains the initial dip.
Other robustness checks reinforce the above results. In addition to 2006 returnees,
we considered 2007 returnees. The 2007 returnees have both a smaller dosage and greater
potential for conflating the effects of the reforms with interim school quality; therefore, it
is not surprising that the 2007 returnees display less pronounced initial dips in scores and
that the estimates for the last available year are more positive than for the 2006 returnees
(results available upon request). We also re-estimated using Mahalanobis matching on
both test scores and year of return (instead of exact matching on year of return). The
results were quite similar (available upon request).38
While these estimates suggest positive effects of the New Orleans school reforms,
a key disadvantage of the panel analysis is that it stops in 2009 and prevents us from
testing whether the upward trajectory continued. Three years (2006-2009) might be
considered a short span of time to implement an entirely new type of schooling system
and recruit and select new schools and educators. In 2009, most schools were still being
operated directly by the RSD, and the majority of teachers were still those from the pre-
treatment period (Barrett & Harris, 2015). Only three schools had been closed or turned
over to other operators in this time frame, compared with 45 schools between 2008 and
2015. Also, even if the system had reached equilibrium, the dosage was limited to a
maximum of 3.5 grades for spring 2006 returnees and less for later returnees. Finally, the
initial dip in scores upon return suggests that trauma/disruption effects may have been
larger for New Orleans students and pulled down the measured effects in the short term.
If the objective is to estimate the long-term cumulative effects of the program, then these
panel analysis limitations imply that the panel estimates in Figure 3 and Table 3 are
biased downwards. The analysis that follows avoids these limitations.
Pooled Estimates for Achievement
The pooled results, shown in Table 4, are positive for every specification and
statistically significant in each of the 16 cases. Focusing just on the hurricane-affected
matched sample, the estimates are all positive and statistically significant, in the range of
0.35-0.43 s.d. across subjects. Figure 3 also suggests that the positive effects are the
38 Specifically, we re-estimated using Mahalanobis matching on test scores and year of return (which reduces extremely poor matches on test levels while sacrificing similarity on year of return).
result of steady improvement leading up to 2013, with an apparent plateau in 2014. The
unmatched estimates still often fail the parallel trends tests, but this problem mostly goes
away with matching. (The event study results in Figure 3 suggest even larger point
estimates than those in Table 4 for 2014, apparently due to the covariance between the
student demographics and the year indicators. However, we focus on the more
conservative estimates in Table 4.)
One of the main potential threats to identification in the pooled analysis is the
change in population. However, recall that our various estimates in Table 2 suggest very
small population changes. Also, the trends in achievement effects in Figure 3 are
inconsistent with those of population change. We find an initial positive spike in
socioeconomic status in New Orleans right after the hurricane, which dissipated in the
ensuing few years. If population change was the driving force behind the estimated
effects, then we would have expected a large initial achievement effect followed by a flat
or declining effect trend. This is almost the opposite of the actual trend in Figure 4, which
displays negative initial spikes and positive trajectories, reinforcing the idea that
population change does not bias the pooled estimates.
Switchers, IV, and Other Identification Strategies
A third identification strategy, aside from the panel and pooled approaches above,
involves only students who switch into or out of New Orleans (“in-switchers” and “out-
switchers,” respectively) and who remain in their new districts for (at least) one academic
year within either a pre-reform or a post-reform period. The basic logic is that these
switches should affect student outcomes in proportion to the change in school quality.
Therefore, if New Orleans school quality improved, then those who switched into New
Orleans before Katrina, what we call pre-Katrina in-switchers, should see less outcome
improvement than post-Katrina in-switchers (and the opposite for out-switchers). In the simplest
Switcher Method 1 (M1), we take the one-year difference in achievement for individual
students just before and after the switch and compare this growth before and after the
reforms (with regression adjustments). We also estimate a Switcher Method 2 (M2) that
accounts for changes in statewide trends in cross-district switching (i.e., district
mobility).39
One advantage of the switcher method, like the pooled analysis, is that it allows
us to test for effects many years later. We specifically use data from 2001-2005 and
2009-2013. The identifying assumption of the Switcher-M1 approach is that the
unobserved factors affecting both students’ district of enrollment and achievement are
time-constant (i.e., the types of students moving into New Orleans did not change over
time). With Switcher-M2, the assumption is weaker: only that the unobserved factors
associated with cross-district mobility follow the same trend in New Orleans and the
rest of the state.
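As a concrete illustration, the Switcher-M1 and Switcher-M2 regressions described above can be sketched as follows. This is a minimal sketch on simulated data: all variable names, sample sizes, and coefficient magnitudes are hypothetical and do not come from the paper's data.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated switcher sample: one row per student who changed districts.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "score": rng.normal(0, 1, n),        # achievement in the receiving district
    "lag_score": rng.normal(0, 1, n),    # achievement in the sending district (t-1)
    "grade": rng.integers(4, 9, n),      # grade fixed effects
    "post": rng.integers(0, 2, n),       # 1 = post-Katrina switch
    "in_switch": rng.integers(0, 2, n),  # 1 = switch into New Orleans
})

# Switcher-M1: among in-switchers only, compare achievement growth
# (conditional on lagged achievement) before vs. after the reforms.
m1 = smf.ols("score ~ lag_score + C(grade) + post",
             data=df[df.in_switch == 1]).fit(cov_type="HC1")

# Switcher-M2: difference-in-differences using all cross-district switches
# statewide as the comparison; the interaction term plays the role of beta_3.
m2 = smf.ols("score ~ lag_score + C(grade) + post + in_switch + in_switch:post",
             data=df).fit(cov_type="HC1")
print(m2.params["in_switch:post"])
```

The out-switcher versions would simply replace the `in_switch` indicator with an analogous `out_switch` indicator.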
If New Orleans schools improved because of the reforms, then we would expect,
at a minimum, that the in-switcher estimates are more positive than the out-switcher
39 The general model for the switcher strategy is: $A_{ikt} = \lambda A_{ij,t-1} + \theta_g + \beta_1 d_t + \beta_2 InSwitch_{it} + \beta_3 (InSwitch_{it} \times d_t) + \varepsilon_{it}$, where the dependent variable $A_{ikt}$ is achievement in the receiving school district k. The Switcher-M1 model includes only lagged achievement of student i in time t in sending district j ($A_{ij,t-1}$), a vector of grade fixed effects ($\theta_g$), and an indicator for the post-Katrina period ($d_t$), where the analysis is limited to students who switch districts. In this model, we are interested in $\beta_1$, which simply compares achievement growth from switches that occur before and after the reforms. Switcher-M2 is essentially a difference-in-differences analysis and accounts for the possibility that the types of students who switch districts changed over time by using switches throughout the state as a comparison group. This involves adding $InSwitch_{it}$ as an indicator for whether the switch was specifically into New Orleans ($InSwitch_{it} = 0$ for cross-district switches where New Orleans is neither the sender nor the receiver). Under Switcher-M2, we are primarily interested in $\beta_3$. We then carry out the same estimation replacing $InSwitch_{it}$ with $OutSwitch_{it}$. Unlike the pooled and panel strategies, there is no matching involved. We thank Andrew McEachin for suggesting this general approach.
estimates; that is, switching into New Orleans should have become relatively beneficial
post-reform and switching out should come at a relative achievement cost. This is largely
what we find. The in-switcher estimates are 0.10 and 0.07 s.d. larger than the out-
switcher estimates, for the M1 and M2 methods, respectively. These estimates are not
directly comparable to the earlier pooled estimates because the switcher estimates are, by
their nature, annualized effects, while the main pooled estimates are cumulative across
years. To compare them, we re-estimated the models from Table 3 with annual
achievement gains as the dependent variable, instead of achievement levels.40 These
estimates are similar on average to the switcher analyses in Table 4. For this reason, and
because the switcher analysis involves so few students,41 we rely mainly on the pooled
version of equation (1) (e.g., Table 3) throughout the rest of the analysis.
We also considered carrying out an instrumental variables (IV) strategy akin to
Abdulkadiroğlu et al. (2014) who estimate the effects of charter schools by leveraging
plausibly exogenous takeovers of public schools by charter schools. Specifically, they
estimate the effect of attending a charter school using the pre-takeover attendance in the
takeover public school as an instrument for subsequent charter attendance. Since students
were guaranteed slots in the new charter schools (and tend to stay in the same school
buildings in any event), they create a “grandfathering” instrument. This instrument has a
strong first stage and plausibly satisfies the exclusion restriction because students could
40 Specifically, we estimated the first difference as 3rd-to-4th grade growth for the 2010-11 cohort of 3rd graders minus 3rd-to-4th grade growth for the 2003-04 cohort of 3rd graders. Thus, there are two dimensions of changes over time in this case: within student over time and across cohorts over time. This can provide additional protection against violations of the parallel trends assumption, as in a typical triple difference (DDD) model, although our preferred DD method described in the main text seems to satisfy the parallel trends assumption. Moreover, although the DDD increases measurement error in the dependent variable, two of the four DDD estimates are statistically significant (science and social studies), and the average point estimate is 0.07 standard deviations in annual growth, slightly below the in-switcher results. 41 The switcher strategy requires restricting the sample to just 10 percent of New Orleans students, a small and possibly unusual sample.
not have easily known the public schools would be turned into charter schools. The main
assumption is that attendance in pre-Katrina schools did not (conditionally) affect the
achievement trajectory after the schools became charters.
The analogous approach here would be to use attendance in pre-reform New
Orleans schools as the grandfathering instrument for attending post-reform New Orleans
schools. However, the exclusion restriction is problematic in the present setting. First,
attendance in New Orleans public schools pre-treatment could influence post-Katrina
outcomes (even after conditioning on covariates) because of trauma/disruption, the same
problem we have in the panel analysis. Second, as Abdulkadiroğlu et al. (2014) indicate,
the assumption is much more plausible with outcome gains, which is infeasible for the
key one-time events that are of interest here (high school graduation and college entry).
These problems in the present context do not arise in the setting considered by
Abdulkadiroğlu et al. (2014).42 Additionally, note that the earlier switcher analyses utilize
a similar type of exogenous variation (switching districts), but without the problematic
exclusion restriction.
Our objective in this section has been to estimate the ATEs of the New Orleans
school reforms through 2014. The results are consistently positive and arguably large in
magnitude across all of the following: panel versus pooled, state versus hurricane
districts, DD versus switcher, achievement levels versus gains specifications, and across
42 We also considered using synthetic cohort analysis, though this approach does not have good statistical properties in this situation. Synthetic cohort analysis is designed for situations where there is a single treatment unit (e.g., school district) and there are multiple candidate comparison groups, some of which are more similar to the treatment group at baseline. In this case, we do have a single treatment unit (New Orleans), but almost all the variance is between schools within school districts. More generally, synthetic cohort analysis is not as useful when: (a) there is a common support problem at the level of the treatment unit (i.e., New Orleans has lower pre-treatment scores than all the districts); and (b) there are smaller units (schools) nested within the treatment unit and most of the variance in outcomes is between these smaller units, rather than the more aggregated units (i.e., districts). Under these conditions, Mahalanobis matching at the lower-level unit of aggregation appears more effective in identifying a reasonable comparison group.
all four academic subjects. In additional analyses (available upon request), we also show
that the results are robust across student-level versus district-level data aggregation.
Moreover, as noted above, there is no evidence of bias from missing data or attrition.
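The school-level Mahalanobis matching referenced in footnote 42 can be sketched as follows. The covariates and sample sizes are simulated stand-ins for pre-treatment school characteristics, not the paper's actual matching variables.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(3)
# Pre-treatment school characteristics (e.g., mean score, percent FRPL).
treated = rng.normal(0.0, 1.0, size=(30, 2))   # New Orleans schools
pool = rng.normal(0.5, 1.0, size=(300, 2))     # candidate comparison schools

# Mahalanobis distance uses the inverse covariance of the pooled covariates,
# so that matching is scale-invariant and accounts for covariate correlation.
VI = np.linalg.inv(np.cov(np.vstack([treated, pool]).T))
d = cdist(treated, pool, metric="mahalanobis", VI=VI)

# Nearest-neighbor match: best comparison school for each New Orleans school.
matches = d.argmin(axis=1)
```

The matched comparison schools would then supply the counterfactual trend in the difference-in-differences estimation.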
In summary, our preferred pooled estimates, with hurricane-affected districts and
matching, suggest that the reforms increased student achievement by 0.40 s.d. (range
across subjects: 0.35-0.43 s.d.). A linear extrapolation of the panel estimates yields a very
similar value of 0.39 s.d. Alternatively, if we assume the difference between the panel
and pooled estimates in 2009 persists into the future, this yields a projected effect of 0.28
s.d. The fact that the panel estimates could reflect upward bias in the pooled analysis
(e.g., due to population change) or downward bias (e.g., due to the more severe trauma
and disruption effects for New Orleans students) implies a range of 0.28-0.40 s.d.
Average Treatment Effects for High School Graduation
We focus on pooled analysis for high school graduation, with school-level
matching based on pre-treatment graduation rates and estimation of equation (2) for
cohorts of first-time 9th and 10th graders. The tables and figures report the years the
students started these earlier grades (the cohort years), e.g., for 10th graders, the last
cohort where we can identify an effect is the one that was in that grade in 2011-12, for
which the on-time graduation year would be 2014. We focus on any graduation but
discuss on-time graduation as well.
The pooled results, shown in Table 6, are uniformly positive and usually
statistically significant, in the range of 3-13 percentage points, across 9th and 10th graders.
As in the analysis of test scores, the estimates are least positive and most likely to be
insignificant in the matched analyses. In this case, the estimates are also smaller in the
hurricane-affected districts, regardless of matching. Therefore, our preferred range is 3-9
percentage points. From Table 1, the baseline graduation rate was 50-60 percent, so this
effect translates into a 10-14 percent improvement in the high school graduation rate.
Our preferred estimates are positive but insignificant when we use cohorts of 10th
graders. This could be related to the fact that a large share of high school dropout occurs
between 9th and 10th grade. However, given that we cannot test for parallel trends with 9th
graders, we cannot rule out that the results are driven by non-parallel trends.
The time trend seems to suggest a more immediate effect than the gradual
improvement we saw with test scores (Figure 4). This may be because the first effect on
graduation is for the 2008 cohort of 10th graders who experienced the reforms into 2010.
That is, compared with the test score analyses, the first graduation effect reflects both a
larger dosage in total, and a larger share of dosage that occurred after the reforms had a
chance to affect schools.
We find no evidence that the high school graduation results are driven by
accountability-based strategic behavior, i.e., schools taking advantage of the fact that
some exit codes are difficult for the state to verify. The estimates are nearly identical
across the graduation rates definitions (this can be seen by looking down each column of
Table 6 within the 9th and 10th grade groups). That is, even if we treat the less reliable exit
codes as dropouts, the reform effects on graduation are very similar.
The results are consistently more positive with delayed compared to on-time
completion (available upon request). This may reflect a combination of increased
academic expectations and increased effort by educators to keep students in school, so
that students persist even after they have fallen a grade level behind. The reforms also
may have led New Orleans schools to increase use of “credit recovery” programs that
allow students to gain academic credit through possibly less rigorous and less time-
consuming online programs (Harris, Liu, & Barrett, 2017). While this is difficult to test,
conversations with educators in multiple districts indicate that credit recovery programs
are being used in most districts and are not limited to New Orleans.
Our preferred ATE estimate of 3-9 percentage point effect is much smaller than
the descriptive improvement in New Orleans of 17-20 percentage points (Perry, Harris, &
Buerger, 2015). This is mostly because high school graduation rates were increasing
statewide, due in part to the addition of federally mandated high-stakes accountability for
graduation rates that started around 2007 (Harris, Liu, & Barrett, 2017). Still, a 3-9
percentage point effect is economically large, as the later cost-benefit analysis shows
more clearly.
Average Treatment Effects on College Attendance and Graduation
College attendance and college graduation are especially important outcomes,
given: (a) how strongly they predict long-term life outcomes (Kane & Rouse, 1995;
Goldin & Katz, 2008; Heckman, Humphries, & Veramendi, 2016); and (b) that they are
less prone to strategic behavior than high school measures because they are collected
completely outside of schools and without school accountability.
The ATEs for on-time college attendance and any attendance are 15 and 8
percentage points, respectively (Table 7).43 The effects are uniformly positive and
statistically significant across samples and matching. The reported estimates all pass the
43 On-time means that students attended college immediately after graduating high school. One reason for using this approach is that this is how college entry is defined in the BOR data that we used in the matching process. The college persistence measure discussed below does not make this restriction. The generally positive estimates across these various definitions suggest that our results are robust to how we define college outcomes.
parallel trends test for our preferred specifications with matching.44 Given the baseline
college attendance rate for 12th graders of 53.4 percent (any attendance) and 22.5 percent
(on-time; see Table 1), these represent a 15-67 percent improvement over the baseline.
The only estimates with negative signs are those for attendance in two-year colleges.
Given the overall positive effect on attendance, it therefore appears that some students
likely shifted from two- to four-year colleges, and others, who would not have attended
any college, attended a two-year college. This shift to four-year college is indicative of a
larger economic return (e.g., Kane & Rouse, 1995). As an additional check, we re-
estimated the effects on college attendance using only the BOR data, and the results are
qualitatively similar.
We measure persistence by comparing the percentage attending college with the
percentage attending college for two or four years in total. (Note that this refers to the
total number of years in any college and does not distinguish two-year colleges from
four-year colleges.) We see positive effects of 4-7 percentage points for the two
persistence measures (25-28 percent increases from baseline).
We also estimated effects on college completion within five years of expected
high school graduation, among students who reached 12th grade. The effect magnitudes
are positive though smaller than the others, at 3-5 percentage points (30-45 percent
effect). Figure 5A shows the event study results for college attendance and graduation,
allowing us to compare the effects for specific cohorts. This reinforces that those students
who do attend college as a result of the reforms are usually benefiting beyond just the
first semester or year.
44 Recall that the parallel trends tests use only on-time college attendance, given our inability to track persistence and graduation before 2005.
The smaller absolute effects on college graduation apparently reflect three factors.
First, the college graduation rate is 10 percent at baseline (among 12th graders) as shown
in Table 1, so the 3-5 percentage point effect is larger, in percentage terms.45 Second,
college graduation is under-counted in the NSC. If the under-counting is orthogonal to
treatment, or higher in the treatment group, then this biases the effect estimates
downward. Third, we can only estimate graduation effects for the first three cohorts of
12th graders (through 2009) who had experienced more disruption and lower dosages in
New Orleans schools during the early stages of the reform’s development.46 As with the
parallel trends tests shown for college attendance in Table 7, we cannot reject the null of
parallel trends for any of the college persistence and graduation outcomes.47
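The parallel trends tests invoked throughout this section amount to a joint test that pre-reform treatment-by-year interactions are zero, which can be sketched as follows. The data here are simulated and the variable names are hypothetical; by construction there is a level difference but no differential pre-trend.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 4000
df = pd.DataFrame({
    "treated": rng.integers(0, 2, n),                 # 1 = New Orleans school
    "year": rng.choice([2002, 2003, 2004, 2005], n),  # pre-reform years only
})
# Simulated outcome: a level difference but no differential pre-trend.
df["outcome"] = 0.1 * df["treated"] + rng.normal(0, 1, n)

# Event-study regression in the pre-period (first year omitted as baseline).
m = smf.ols("outcome ~ treated * C(year)", data=df).fit()
pre_terms = [p for p in m.params.index if p.startswith("treated:")]

# Joint F-test that all pre-period interactions are zero (parallel trends).
R = np.zeros((len(pre_terms), len(m.params)))
for i, term in enumerate(pre_terms):
    R[i, list(m.params.index).index(term)] = 1.0
f_res = m.f_test(R)
print(float(f_res.pvalue))
```

Failing to reject this joint null is the sense in which the matched specifications "pass" the parallel trends test.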
Like the high school graduation results, the event studies suggest an immediate
effect on college outcomes. One possible explanation is that the effect was driven by an
increase in educators’ expectations for students to attend college, which could occur
quickly without significant academic improvement.48 The average treatment effects for
all student outcomes, as well as the magnitudes of potential biases, are summarized in
Table 8.
45 According to Bailey and Dynarski, the national college graduation rate is 25 percent (conditional on college entry) among low-income students. This cannot be directly compared with the graduation rate for 12th graders. 46 Research on school closure, for example, suggests that high school students have more difficulty bouncing back from school disruptions (Larsen, 2017). 47 As discussed earlier, we cannot measure college persistence and graduation in the pre-reform years, therefore we modified the parallel trends test, using the weights from matching on a single lagged value of the dependent variable, but using only college attendance for the parallel trends test. The range of these modified parallel trends tests is -0.001 to +0.019, where positive point estimates could suggest upward bias, though the standard errors are not close to reaching statistical significance. These are not shown in Table 7 to avoid confusion between the modified and unmodified versions of the parallel trends tests. 48 Also, note that, compared with elementary schools, a large share of high schools (and high school seats) remained under the school district control after Katrina. This is because the only schools not taken over by the state were those that were high performing. Some pre-Katrina high schools were high performing because they were selective admissions. However, most OPSB schools were also turned into charter schools, and all were affected by the move to school choice and the elimination of the union contract.
Effect Heterogeneity
One of the most common critiques of the New Orleans school reforms is that they
have been inequitable and even harmful to disadvantaged students. Numerous media
reports and lawsuits have alleged denied admission, disproportionate numbers of
suspensions and expulsions, and insufficient services among certain disadvantaged
students under the city’s reforms (P.B. v. Pastorek, 2010). Charter school leaders in New
Orleans have reported advising challenging students to move to other schools (Jabbar,
2015), and there are other signs that students have done so (Ferguson, 2017). This
problem was especially pronounced with school discipline, albeit only in the early years
of the reform (Hernandez, forthcoming). While these practices cannot plausibly bias our
ATE estimates,49 they may affect the distribution of benefits across groups.
Given that the vast majority of New Orleans students are black and/or low-
income, the ATEs reported earlier clearly suggest that these disadvantaged groups
benefited with higher outcomes. However, it could still be the case that the reforms
exacerbated education gaps across groups within the district. To test this, we carried out
the same estimation methods as above, but separately by FRPL and race/ethnicity.50 The
49 If low-performing students are pushed out of post-reform schools, they almost all end up in some other publicly funded school in the city and therefore still show up in our analyses regardless. 50 Identification of effects for English Language Learners (ELLs) and special education students is left for future research due to several additional methodological issues. The ELL population in New Orleans was small before the storm and grew considerably afterwards. Also, there are extremely few ELL students in the comparison districts with which to match. The empirical challenges with special education are a bit different. After the storm, many special education students began taking new types of alternative assessments. There is no crosswalk between these and the regular state tests, and the percentage of students taking the alternative assessments changed widely over time. Moreover, there are good reasons to believe that selection into special education worked differently before and after the reforms, which limits us to panel analysis over just the first few years. For these reasons, we leave the analysis of this important topic to a separate study.
matching process is similar, except for the additional stratification by subgroup.51 The
Granger/event study results by race and FRPL and for each of the main outcomes can be
found in Figures 6 and 7.
The results are much more positive for disadvantaged students, compared with
other students, with regard to high school graduation and college-going. This is not true
with test scores, however. In none of the models or years did black or FRPL students see
larger effects than their white or non-FRPL counterparts, in terms of achievement. In the
later years (pooled analysis), the effects for black and FRPL students converge to those of
their counterparts, suggesting that all the various groups benefited in similar ways in the
long run. With high school graduation, in contrast, black and low-income students seem
to have benefited slightly more, and from the very beginning of the reforms.
With college attendance, essentially the entire citywide effect of the reforms
seems to have come from improvement among black and FRPL students. The results for
black students violate parallel trends, though in a way that suggests our results may
understate the disproportionate effects for black students. In additional analysis, we re-
ran the results unmatched and see similar results in the post-reform period (but still with
non-parallel trends in the pre-reform period).
With FRPL, we also see more positive effects for lower-income students initially,
followed by convergence. These unstable trends with regard to FRPL, especially in the
test score and college attendance results, likely reflect that FRPL is not an especially
51 In the pooled subgroup matching, we also restricted the comparison group to schools that had at least 10 students in the given subgroup (e.g., 10 in FRPL and 10 non-FRPL), so that the estimates for each pair of subgroups reflect the same comparison schools. Also, we matched on the test scores of each pair of subgroups simultaneously; for example, for each New Orleans school, we looked for a comparison school where FRPL students had similar test scores to the FRPL students in the New Orleans school and where the non-FRPL students in the potential comparison also had scores similar to the non-FRPL students in the New Orleans school.
reliable indicator of poverty, especially in this setting.52 Given that the race indicators do
not suffer from the same flaws and that there is a strong positive correlation between
income and race, the effect heterogeneity by race is probably more valid.
In both the panel and pooled cases, we also carried out many of the same
robustness and bias checks for each subgroup. In general, the sub-group analyses pass
these tests and are robust with alternative specifications. The fact that the results for black
students mirror those of the ATEs also reinforces the validity of the latter, showing that
the results are the same even with a different method of matching.53
Spending and Costs
To interpret these positive effects on achievement, high school graduation, and
college outcomes, we also have to consider the increase in school spending that
accompanied the market-based reforms (Buerger & Harris, 2016). First, the spending
increase can be viewed as an alternative to the market-based reforms as an explanation
for the above effects on student outcomes. Second, it can be viewed as an investment in
the reforms that calls for a cost-benefit analysis. We consider the cost side from both of
these perspectives.
52 There are two issues with FRPL: the administration of the program generally and the rules that apply under natural disasters. To the latter point, after Katrina, almost all New Orleans’ public school students could be considered “homeless” when they first returned, and this automatically made them eligible for FRPL. This is because, under FRPL rules, a student is considered homeless if “s/he is identified as lacking a fixed, regular and adequate nighttime residence by the LEA homeless liaison, or by the director of a homeless shelter” (USDA, 2014). Many students were living with relatives or in homes that were still heavily damaged. Thus, even some students who are otherwise socio-economically advantaged could be considered homeless and eligible for FRPL. Since FRPL students are only compared with other FRPL students, this likely led to what appear to be large achievement effects at first and then smaller effects. Further, this pattern would not appear in the panel analysis because FRPL eligibility in that case is based entirely on pre-treatment FRPL eligibility. We thank Lindsay Bell Weixler for pointing out this issue with the FRPL homeless designation. 53 The high school graduation effects, however, look noticeably larger for blacks than in the ATEs in Figure 5. The reason this is possible is that we always weighted the districts to match the size of New Orleans, and these other districts often had very small black populations. Therefore, the comparison districts used in estimating the ATEs represent different demographic populations than those in the effect heterogeneity analyses.
The counterfactual in this analysis is what would have happened to student
outcomes if New Orleans schools had returned to business as usual, only with more
funding. Jackson, Johnson, & Persico (2016) found that a $1,000 increase in school
spending, caused by state school funding lawsuits, increased high school graduation rates
by roughly 10 percentage points. Also, Lafortune, Rothstein, and Schanzenbach (2016)
found that state funding adequacy lawsuits increased relative spending in low-income
districts by about $700 per pupil and reduced the NAEP achievement gap with high-
income districts by about 0.1 s.d. Taken at face value, these high-end effect estimates
suggest that the increased spending could explain a substantial share of our estimated
effects.
It is questionable, however, whether the results from these studies provide a valid
indication of the counterfactual in this case. First, the corruption and dysfunction in the
Orleans Parish School Board prior to the storm implies that the additional resources
would not have been used to generate better outcomes to the extent that the average
district did in the above school funding studies. Second, the city’s spending increase,
which came mainly from local funding and philanthropists, may have been partly caused
by the reforms. The same inefficiencies that led to public disenchantment with the local
OPSB pre-Katrina led to a widespread perception in the city that the reforms improved
schools (Cowen Institute, 2016). This increased public support likely contributed to
political support for local property tax levies and the backing of philanthropists that
produced the spending increase. Any effect of spending on student outcomes, in this
sense, may not be just an alternative explanation, but rather an indirect effect of the
reforms. Therefore, while spending almost certainly contributed to the overall effect, it is
unclear whether it was a substantial cause.
We also provide a cost-benefit analysis in Table 8 based on the prior frameworks
of Krueger & Whitmore (2001), using the above $1,358 per student estimate of the
reform costs,54 combined with evidence from other studies on the labor market returns to
cognitive skill and years of education (Murnane, Willett, & Levy, 1995; Krueger &
Whitmore, 2001). As shown at the bottom of Table 8, the New Orleans reforms easily
pass a simple cost-benefit test. More importantly, the benefit-cost ratios (and internal
rates of return) are in the same range as those of the Perry Preschool experiment (Heckman et al.,
2010) and are larger than those of the Tennessee STAR experiment and almost all the other
rigorously studied programs.55 The notes to Table 8 provide additional details about the
assumptions of this analysis.
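The basic structure of such a cost-benefit calculation can be sketched as follows. Every number below (the earnings return to a standard deviation of achievement, the base salary, career length, and discount rate) is an illustrative assumption for exposition, not a value taken from Table 8 or from the cited studies.

```python
# Hedged sketch of a benefit-cost ratio from discounted lifetime earnings
# gains. All parameter values are illustrative assumptions, not the paper's.
def present_value(annual_amount, years, discount_rate, start_year=0):
    """Discounted sum of a constant annual flow beginning in start_year."""
    return sum(annual_amount / (1 + discount_rate) ** t
               for t in range(start_year, start_year + years))

# Assumed: a 0.3 s.d. achievement gain raises earnings 8 percent per year
# on a $35,000 base salary, over a 40-year career starting 10 years out.
benefit = present_value(0.3 * 0.08 * 35_000, years=40,
                        discount_rate=0.03, start_year=10)

# Assumed cost: $1,358 per student per year over 7 years of schooling.
cost = present_value(1_358, years=7, discount_rate=0.03)

ratio = benefit / cost
print(round(ratio, 1))
```

Under these placeholder assumptions the program passes a simple cost-benefit test (ratio above one); the paper's actual calculation additionally incorporates the returns to years of education from the cited literature.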
Conclusion
Critics of American schooling have long advocated for a radically different
system than the government-driven school district model that still predominates
throughout the country. New Orleans is the first U.S. school system to overturn that
system and replace it with a market-driven one. We find that the reform package put
in place after Hurricane Katrina had large positive effects on both the quality and quantity
of education New Orleans students received. The reforms increased student achievement
by 0.28-0.40 s.d., high school graduation by 3-9 percentage points (10-14 percent),
college attendance by 8-15 percentage points (15-67 percent), college persistence by 4-7
54 The $1.8 billion investment in buildings was slow to yield actual improvements in buildings and could not have had a significant influence on these results. 55 See Harris (2009) for comparative cost-effectiveness of these and other programs.
percentage points (25-28 percent), and college graduation by 3-5 percentage points (30-
45 percent). Moreover, the reforms reduced most education gaps between this
socioeconomically challenged district and the rest of the state, and between advantaged
and disadvantaged groups within the district.
The results are robust across multiple identification strategies and dozens of
robustness checks, as well as various additional analyses we conducted to probe threats to
identification. None of our three types of analysis suggests that population change could
explain more than 10 percent of our preferred estimates. The net effects of interim
schools and trauma/disruption also seem small and temporary, and these tend to work in
the opposite direction of the other biases, thus partly cancelling them out. The fact that we
see effects across all academic subjects and across all outcomes, regardless of the stakes
attached to those measures, also reinforces that the improvement was not the result of
strategic behavior.
Our results are also consistent with specific types of operational changes that
occurred in the city as a result of the reforms: (a) with attendance zones almost entirely
eliminated, families became more active choosers and travelled much further to get to
higher quality schools (Harris & Larsen, 2015); (b) schools are differentiated in the types
of programs they provide, making good matches with family preferences more likely
(Arce-Trigatti, Lincove, Harris & Jabbar 2015); (c) the state RSD ended the contracts of
schools that were low-performing (Bross, Harris, & Liu 2016); and (d) the teacher
workforce changed significantly and in ways plausibly consistent with achievement
growth (Barnett & Harris, 2015). Moreover, some of the potential unintended
consequences did not emerge. Additional DD analyses suggest, for example, that
voluntary student mobility (Maroulis, Santillano, Jabbar, & Harris, 2015) and segregation
(Barrett, Weixler & Harris, 2015) have been largely unaffected by the reforms.
Nevertheless, the fact that the New Orleans reforms seem to have been beneficial
on average, and for key subgroups, does not mean these benefits would extend to other
cities. In general, external validity considerations rest on the types of people served, the
intensity and quality of policy implementation, and the basis of comparison
(counterfactual). Market-based school reforms seem consistently effective with black and
low-income students in urban areas, perhaps because these students have low baseline test scores and
greater geographic density in which competition can develop (Harris, 2017).
The New Orleans reforms were also implemented with an unusual, and perhaps
unusually large and high-quality, supply of educators. An outpouring of support
from across the nation occurred in response to Hurricane Katrina. The city also became
an epicenter for school reform and a magnet for ambitious, talented, young educators
from around the country (Harris, 2015). These distinctive features in New Orleans were
almost certainly important given the importance of teachers in generating student
outcomes. More broadly, the results had everything going for them in this case.
While the reforms were implemented in an entire school district, taking the policy
to a larger scale could prove more challenging. Teacher quality again comes into play
because the supply of educators from Teach for America and other alternative preparation
programs is limited. New Orleans is also a relatively small urban district and requires
relatively few teachers. Taking New Orleans-style reforms to larger districts, or simply
more districts, would require larger shifts in teacher supply.
Finally, the counterfactual in this difference-in-differences analysis is a pre-
treatment school system that, by just about any measure, was failing badly. Corruption,
mismanagement, and rapid turnover of superintendents resulted in extremely poor student
outcomes (Council of Great City Schools, 2001; Buerger & Harris, 2015; Cowen
Institute, 2015; Perry, Harris, Buerger, & Mack, 2015). Even some of the strongest critics of the
reforms agree that major changes were in order (Ferguson, 2017). New Orleans had
nowhere to go but up.
While the generalizability of the findings is, as always, somewhat unclear, there is
much to be learned here. More than a decade ago, Hoxby (2000) speculated about how
difficult it would be to ever observe the effects of a massive reform in a U.S. school
system, suggesting that it could take a decade for a radical departure from the traditional
school district to reach equilibrium.56 The conditions she described are quite similar to what
we see in New Orleans. The successes documented here force educators and
policymakers to question assumptions about how an education system can and should be
designed and operated. They show that, at least under certain circumstances, intensive
system-wide school reform, based on principles of accountability and school autonomy,
has the potential to produce large effects on student learning. The question now is
whether such large gains can be achieved at scale in other cities, through these or other
means, without a tragedy like Hurricane Katrina.
56 Hoxby (2000, p.1209) writes that the “Tiebout process . . . is still the most powerful force in American schooling. It will be years before any reform could have the pervasive effects that Tiebout choice has had on American schools. Moreover, the short-term effects of reforms [would be] misleading because … the supply response to a reform--the entry or expansion of successful schools and the shrinking or exit of unsuccessful schools--may take a decade or more to fully evince itself.”
References

Abdulkadiroğlu, A., Angrist, J.D., Dynarski, S., Kane, T.J., & Pathak, P. (2011)
Accountability and Flexibility in Public Schools: Evidence from Boston’s Charters and Pilots. The Quarterly Journal of Economics 126: 699–748.
Abdulkadiroğlu, A., Angrist, J.D., Hull, P.D., & Pathak, P.A. (2015). School vouchers
and student achievement: First-year evidence from the Louisiana Scholarship Program. Working Paper 2015.06. School Effectiveness and Inequality Initiative. Cambridge, MA: Massachusetts Institute of Technology.
Abdulkadiroğlu, A., Angrist, J.D., Hull, P.D., & Pathak, P.A. (2016). Charters without
lotteries: Testing takeovers in New Orleans and Boston. American Economic Review 106(7): 1878-1920.
America Next (2015). K-12 Education Reform: A Roadmap. Downloaded April 27, 2015
from: http://americanext.org/wp-content/uploads/2015/02/America-Next-K-12-Education-Reform.pdf.
Angrist, J.D., Cohodes, S.R., Dynarski, S.M., Pathak, P.A., & Walters, C.R. (2016).
Stand and Deliver: Effects of Boston’s Charter High Schools on College Preparation, Entry, and Choice. Journal of Labor Economics 34(2): 275-318.
Angrist, J.D., Dynarski, S.M., Kane, T.J., Pathak, P.A., & Walters, C.R. (2010). Inputs
and Impacts in Charter Schools: KIPP Lynn. American Economic Review 100(2): 239-43.
Angrist, J.D., Pathak, P., & Walters, C.R. (2013). Explaining charter school
effectiveness. American Economic Journal: Applied 5(4): 1-27. Angrist, J. & Pischke J-S. (2009). Mostly Harmless Econometrics. Princeton, NJ:
Princeton University Press. Arce-Trigatti, P., Harris, D., Lincove, J., & Jabbar, H. (2015). Many options in New
Orleans public schools. Education Next 15(4): 25-33. Athey, S. & Imbens, G. (2003). Identification and inference in nonlinear
difference-in-differences models. Econometrica 74(2): 431-497. Barrett, N. & Harris, D. (2015). Significant Changes in the New Orleans Teacher
Workforce. New Orleans, LA: Tulane University, Education Research Alliance for New Orleans.
Cullen, J.B., Jacob, B.A., & Levitt, S.D. (2005). The impact of school choice on student
outcomes: an analysis of the Chicago Public Schools. Journal of Public Economics 89(5-6): 729-760.
Belfield, C.R. & Levin, H.M. (2003). The effects of competition on educational
outcomes: a review of US evidence. Review of Educational Research 72(2): 279-341.
Bertrand, M., Duflo, E., & Mullainathan, S. (2004). How much should we trust differences-in-differences estimates? Quarterly Journal of Economics 119(1): 249-275.
Booker, K., Sass, T., Gill, B., & Zimmer, R. (2011). The Effects of Charter High Schools
on Educational Attainment. Journal of Labor Economics, 29(2), 377-415. Brown, T.H., Mellman, T.A., Alfano, C.A., & Weems, D.F. (2011). Sleep fears, sleep
disturbance, and PTSD symptoms in minority youth exposed to Hurricane Katrina. Journal of Traumatic Stress 24(5): 575–580.
Buerger, C., & Harris, D., (2015). How can decentralized systems solve system-level
problems? An analysis of market-driven New Orleans school reforms. American Behavioral Scientist 59(10): 1246–1262.
Carnoy, M. & Loeb, S. (2003). Does external accountability affect student outcomes? A
cross-state analysis. Educational Evaluation and Policy Analysis 24(4): 305-331. Center for Research on Education Outcomes (2013a). National Charter School Study.
Palo Alto, CA: Stanford University. Center for Research on Education Outcomes (2013b). Charter School Performance in
Louisiana. Palo Alto, CA: Stanford University. Chubb, J.E., & Moe, T.M. (1990). Politics, Markets, and Schools. Washington, DC:
Brookings Institution. Council of Great City Schools (2001). Rebuilding Human Resources in New Orleans
Public Schools. Washington, DC. Cowen Institute for Public Education Initiatives (2015). State of Public Education in New
Orleans. New Orleans, LA: Tulane University. Cowen Institute for Public Education Initiatives (2016). What Happens Next? Voters’
Perceptions of K-12 Public Education in New Orleans. New Orleans, LA: Tulane University.
The Data Center (2014). Who Lives in New Orleans and Metro Parishes Now? New
Orleans, LA. Dee, T. & Jacob, B. (2011). The impact of No Child Left Behind on student achievement,
Journal of Policy Analysis and Management 30(3): 418-446. DeSalvo, K. B., Hyre, A.D., Ompad, D.C., Menke, A., Tynes, L.L., & Muntner, P.
(2007). Symptoms of posttraumatic stress disorder in a New Orleans workforce following Hurricane Katrina. Journal of Urban Health 84(2): 142-152.
Dreilinger, D. (2014). Arne Duncan: New Orleans education community is 'leading the
nation where we have to go'. Times-Picayune. Downloaded January 1, 2016 from: http://www.nola.com/education/index.ssf/2014/12/arne_duncan_new_orleans_educat.html.
Dynarski, S.M., Hemelt, S.W. & Hyman, J.M. (2013). The Missing Manual: Using National Student Clearinghouse Data to Track Postsecondary Outcomes. NBER Working Paper No. 19552. Cambridge, MA: National Bureau of Economic Research.
Elliott, J.R. & Pais, J. (2006), Race, class, and Hurricane Katrina: Social differences in
human responses to disaster. Social Science Research 35: 295–321. Epple, D., Romano, R., & Zimmer, R. (2015). Charter schools: A survey of research on
their characteristics and effectiveness. NBER Working Paper 21256. Cambridge, MA: National Bureau of Economic Research.
Evers, W. (2014) Implementing standards and testing. In Chester E. Finn and Richard
Sousa, What Lies Ahead for America’s Children and Their Schools. Stanford, CA: Hoover Institution Press.
Ferguson, B. (2017). Outcomes of the State Takeover of New Orleans Schools. Dorrance
Publishing Company. Figlio, D. (2006). Testing, crime and punishment. Journal of Public Economics 90(4):
837-851. Friedman, M. (1962). Capitalism and Freedom. Chicago: University of Chicago Press. Fryer, R.G. (2014). Injecting charter school best practices into traditional public schools:
Evidence from field experiments. Quarterly Journal of Economics 129(3):1355-1407.
Gill, B. & Booker, K. (2008). School competition and student outcomes. In Helen F.
Ladd and Edward B. Fiske (Eds) Handbook of Research in Education Finance and Policy (pp.183-202). New York: Routledge.
Goldin, C. & Katz, L. (2008). The Race Between Education and Technology. Cambridge,
MA: Harvard University Press. Groen, J. & Polivka, A. (2008). The effect of Hurricane Katrina on the labor market
outcomes of evacuees. American Economic Review 98(2): 43–48. Hanushek, E.A. (1996). A more complete picture of school resource policies.
Review of Educational Research 66(3): 397-409. Hanushek, E.A. (2002). Publicly Provided Education. NBER Working Paper 8799. Cambridge, MA: National Bureau of Economic Research. Hanushek, E.A. & Woessmann, L. (2010). The High Cost of Low Educational
Performance: The Long-Run Impact of Improving PISA Outcomes. Paris: Organisation for Economic Co-operation and Development.
Harris, D.N. (2009). Toward policy-relevant benchmarks for interpreting effect sizes:
Combining effects with costs. Educational Evaluation and Policy Analysis 31(1): 3-29.
Harris, D.N. (2017). Why Managed Competition is Better than a Free Market for Schooling. Washington, DC: Brookings Institution.
Harris, D.N., Liu, L., & Barrett, N. (2017). Is the Rise in High School Graduation Rates
Real? High-Stakes School Accountability and Strategic Behavior in Louisiana. Paper presented at the annual meeting of the Association for Public Policy and Management (APPAM). Washington, DC.
Harris, D.N. & Larsen, M. (2015). What Schools Do Families Want (and Why)?
Academic Quality, Extracurricular Activities, and Indirect Costs in New Orleans Post-Katrina School Reforms. New Orleans, LA: Education Research Alliance for New Orleans, Tulane University.
Harris, D.N., Witte, J. F., & Valant, J. (2017). The market for schooling. Shaping
Education Policy: Power and Process (2nd edition). New York, NY: Routledge. Heckman, J.J., Humphries, J.E., & Veramendi, G. (2016). Returns to education: The
causal effects of education on earnings, health, and smoking. NBER Working Paper 22291. Cambridge, MA: National Bureau of Economic Research.
Heckman, J.J., Lochner, L.J., & Todd, P.E. (2008). Earnings Functions and Rates of
Return, Journal of Human Capital 2(1): 1–31. Heckman, J.J., Moon, S.H., Pinto, R., Savelyev, P. & Yavitz, A. (2010). A New cost-
benefit and rate of return analysis for the Perry Preschool program: A summary. NBER Working Paper 16180. Cambridge, MA: National Bureau of Economic Research.
Hill, P. & Campbell, C. (2011). Growing Number of Districts Seek Bold Change
With Portfolio Strategy. University of Washington: Center for Reinventing Public Education.
Hill, P. & Lake, R. (2004). Charter Schools and Accountability in Public Education.
Washington, DC: Brookings Institution Press. Hill, P., Pierce, L., & Guthrie, J. (1997). Reinventing Public Education: How
Contracting Can Transform America’s Schools. Chicago: University of Chicago Press.
Hornbeck, R. & Keniston, D. (2017). Creative Destruction: Barriers to Urban Growth and
the Great Boston Fire of 1872. American Economic Review, 107 (6): 1365-98. Hoxby, C.M. (1996). How teachers' unions affect education production. Quarterly
Journal of Economics 111(3), 671-718. Hoxby, C. (2000). Does Competition among Public Schools Benefit Students and Taxpayers?
American Economic Review 90(5): 1209-1238. Hoxby, C. (2002). School choice and school productivity (or could school choice be a
tide that lifts all boats). NBER Working Paper 8873. Cambridge, MA: National Bureau of Economic Research.
Jackson, K.C, Johnson, RC., & Persico, C. (2016). The Effects of School Spending on
Educational and Economic Outcomes: Evidence from School Finance Reforms.
The Quarterly Journal of Economics 131(1): 157–218. Jacob, B.A. (2005), Accountability, incentives and behavior: The impact of high-stakes
testing in the Chicago Public Schools. Journal of Public Economics 89: 761-796. Kane, T., & Rouse, C. (1995). Labor-Market Returns to Two- and Four-Year College.
American Economic Review 85(3): 600-614. Kollman, K., Miller, J.H., & Page, S.E. (1997). Political institutions and sorting in a
Tiebout model. American Economic Review 87: 977-92. Koretz, D. (2009). Measuring Up: What Educational Testing Really Tells Us. Cambridge,
MA: Harvard University Press. Krueger, A.B., & Whitmore, D.M. (2001). The effect of attending a small class in the
early grades on college-test taking and middle school test results: Evidence from Project STAR. Economic Journal 111, 1–28.
Krueger, A.B. & Zhu, P. (2004). Another look at the New York City voucher experiment.
American Behavioral Scientist 47(5): 658-98 Lafortune, J., Rothstein, J., & Schanzenbach, D.W. (2016). School Finance Reform and
the Distribution of Student Achievement. NBER Working Paper No. 22011. Cambridge, MA: National Bureau of Economic Research.
Lee, J. (2008). Is test-driven external accountability effective? Synthesizing the evidence
from cross-state causal-comparative and correlational studies. Review of Educational Research 78(3): 608-644.
Levin, H., Belfield, C., Muennig, P., & Rouse, C. (2006). The Costs and Benefits of an
Excellent Education for America's Children. Liang, K-Y. & Zeger, S.L. (1986). Longitudinal data analysis using Generalized Linear
Models. Biometrika 73(1): 13-22. Louisiana Department of Education (n.d.). Graduation Exit Code Pre-Reviews.
Downloaded May 13, 2018 from: https://www.louisianabelieves.com/docs/default-source/data-management/final-exit-code-pre-reviews.pdf?sfvrsn=2.
Louisiana Department of Education. (2015). High School Performance. Retrieved from
http://www.louisianabelieves.com/docs/default-source/katrina/final-louisana-believes-v5-high-school-performance.pdf?sfvrsn=2.
Lovenheim, M. & Willen, A. (2016). The Long-Run Effects of Teacher Collective
Bargaining. CESifo Working Paper Series No. 5977. Available at SSRN: https://ssrn.com/abstract=2815374.
Ludwig, J., Duncan, G.J., Gennetian, L.A., Katz, L.F., Kessler, R.C., Kling, J.R.,
Sanbonmatsu, L. (2013). Long-Term Neighborhood Effects on Low-Income Families: Evidence from Moving to Opportunity. American Economic Review 103(3): 226-31.
Maroulis, S., Santillano, R., Jabbar, H., & Harris, D. (2015). The Push and Pull of School Performance: Evidence from Student Mobility in New Orleans. Unpublished manuscript.
McKenzie, R. & Levendis, J. (2010). Flood Hazards and Urban Housing Markets: The
Effects of Katrina on New Orleans. The Journal of Real Estate Finance and Economics 40(1): 62–76.
McLaughlin, K.A., Berglund, P., Gruber, M.J., Kessler, R.C., Sampson, N.A., &
Zaslavsky, A.M. (2011). Recovery from PTSD after Hurricane Katrina. Depression and Anxiety 28: 439–446.
Mills, J. & Wolf, P. (2017). Vouchers in the Bayou: The Effects of the Louisiana
Scholarship Program on Student Achievement After 2 Years. Educational Evaluation and Policy Analysis 39(3): 464-484.
National Alliance for Public Charter Schools (2013). A Growing Movement:
America’s Largest Charter School Communities. Downloaded July 3, 2015 from: http://www.publiccharters.org/press/students-32-school-districts-attend-public-charter-schools-market-share-report/.
National Commission on Excellence in Education (1983). A Nation at Risk. Washington,
DC: U.S. Government Printing Office. Obama, B. (2010). Remarks by the President on the Fifth Anniversary of Hurricane
Katrina in New Orleans, Louisiana. August 29, 2010. Retrieved April 6, 2013 from www.whitehouse.gov.
Pane, J.F., McCaffrey, D.F., Tharp-Taylor, S., & Asmus, G.J., Stokes, B.R. (2006).
Student Displacement in Louisiana After the Hurricanes of 2005 Experiences of Public Schools and Their Students. Santa Monica, CA: Rand Corporation.
Pane, J.F., McCaffrey, D.F., Kalra, N. & Zhou, A.J. (2008) Effects of student
displacement in Louisiana during the first academic year after the hurricanes of 2005. Journal of Education for Students Placed at Risk 13(2-3): 168-211.
Paxson, C. & Rouse, C.R. (2008). Returning to New Orleans after Hurricane Katrina.
American Economic Review 98(2): 38-42. P.B. v. Pastorek (2010). No. 2:10-cv-04049. The U.S. District Court of the Eastern
District of Louisiana. October 25, 2010. Perry, A., Harris, D., Buerger, C., & Mack, V. (2015). The Transformation of New
Orleans Public Schools: Addressing System-Level Problems Without a System. New Orleans, LA: The Data Center.
Peterson, P. (2014). Holding students to account. In What Lies Ahead for America’s
Children and Their Schools, Chester E. Finn and Richard Sousa (Eds). Stanford, CA: Hoover Institution Press.
Pischke, J-S. (2005). Empirical Methods in Applied Economics: Lecture Notes. Downloaded July 24, 2015 from: http://econ.lse.ac.uk/staff/spischke/ec524/evaluation3.pdf.
Plyer, A., Shrinath, N., & Mack, V. (2015). The New Orleans Index at Ten. New Orleans
LA: The Data Center. Ravitch, D. (2000). The Great School Wars: A History of the New York City Public
Schools. Baltimore, MD: Johns Hopkins University Press. Rothstein, J. (2007). Does competition among public schools benefit students and
taxpayers? A comment on Hoxby (2000). American Economic Review 97(5): 2026-2037.
Rouse, C. (1998). Private school vouchers and student achievement: An evaluation of the
Milwaukee Parental Choice Program. Quarterly Journal of Economics 113(2): 553-602.
Rouse, C. E., Hannaway, J., Goldhaber, D., & Figlio, D. (2013). Feeling the Florida heat?
How low-performing schools respond to voucher and accountability pressure. American Economic Journal: Economic Policy 5(2): 251-81.
Sacerdote, B. (2012). When the saints come marching in: Effects of Katrina evacuees on
schools, student performance and crime. American Economic Journal: Applied 4(1): 109-135.
Sastry, N. & Gregory, J. (2013). The effect of Hurricane Katrina on the prevalence of
health impairments and disability among adults in New Orleans: Differences by age, race, and sex. Social Science & Medicine 80: 212-129.
Seicshnaydre, S. & Albright, R.C. (2015). Expanding Choice and Opportunity in the
Housing Choice Voucher Program. New Orleans: The Data Center. St. Clair, T., Hallberg, K., & Cook, T.D. (2016). The validity and precision of the
comparative interrupted time-series design: Three within-study comparisons. Journal of Educational and Behavioral Statistics 41(3): 269-299.
Strunk, K.O. & Grissom, J.A. (2010). Do strong unions shape district policies? Collective
bargaining, teacher contract restrictiveness, and the political power of teachers’ unions. Educational Evaluation and Policy Analysis 32(3): 389-406.
Tiebout, C. (1956). A pure theory of local expenditures. Journal of Political Economy
64(5): 416-424. Tyack, D. (1974). The One Best System: A History of American Urban Education.
Cambridge, MA: Harvard University Press. United States Department of Agriculture (2014). Eligibility Manual for School Meals.
Washington, DC. Vigdor, J. (2008). The Economic Aftermath of Hurricane Katrina. Journal of Economic
Perspectives 22(4), 135–154.
Walberg, H. (2014). Expanding the options. In What Lies Ahead for America’s Children and Their Schools, Chester E. Finn and Richard Sousa (Eds). Stanford, CA: Hoover Institution Press.
Weems, C. F., Taylor, L. K., Cannon, M. F.,Marino, R. C., Romano, D.M., Scott,
B. G., & Triplett, V. (2010). Post traumatic stress, context, and the lingering effects of the Hurricane Katrina disaster among ethnic minority youth. Journal of Abnormal Child Psychology 38: 49–56.
Whitehurst, R. (2012). The Education Choice and Competition Index: Background and
Results 2012. Washington, DC: Brookings Institution. Wong, K. & Shen, F. (2006). Mayors Improving Student Achievement: Evidence from a
National Achievement and Governance Database. Paper prepared for the 2006 annual meeting of the Midwestern Political Science Association, April 20-23, 2006.
Figure 1A: Trends in New Orleans Student Achievement Levels
Figure 1B: Trends in New Orleans High School Graduation Rates
Figure 1C: Trends in New Orleans College Entry
[Figures omitted from this transcript. Figure 1A plots test scores (in statewide standard deviation units) by year, 2001–2014, for Math, ELA, Science, and Social Studies. Figure 1B plots graduation rates (Grad1, Grad2, Grad3) by cohort year, 2002–2012. Figure 1C plots college entry rates (Any College, 2-Year, 4-Year) by cohort year, 2002–2014.]
Notes: The break in the middle of the figures reflects the landfall of Hurricane Katrina and the beginning of the school reforms. The break is longer with high school graduation and college outcomes because more years of data are required to calculate a single rate in these cases. The years in the x-axis, for high school graduation and college attendance, reflect the cohort year (when students were on-time 10th and 12th graders, respectively). More details are in the main text.
Figure 2: Test Score Average Treatment Effects from Panel Estimation
Panel A: 2005 4th Graders Who Returned in 2006
Panel B: 2005 4th Graders Who Returned in 2007
Notes: Results are based on panel estimation of equation (2) with the matched hurricane-affected comparison group of districts (without covariate adjustment in the effect estimation). See additional detail in Table 3. Dashed grey lines indicate 95% confidence intervals.
[Charts omitted. Each panel plots effects in test-score standard deviations by year (2004–2009) for Math, ELA, Science, and Social Studies.]
Figure 3: Test Score Average Treatment Effects from Pooled Estimation
Notes: Estimates are based on equation (2) with the matched hurricane sample, averaged across grade levels. Table 4 provides the equivalent estimates based on equation (1). Dashed grey lines indicate 95% confidence intervals.
[Charts omitted. Each panel plots effects in test-score standard deviations by year (2001–2014) for Math, ELA, Science, and Social Studies scores.]
Figure 4: High School Graduation Average Treatment Effects from Pooled Estimation
Panel A: First-time 10th graders
Panel B: First-time 9th graders
Notes: Graduation is defined here in a way that most closely approximates the typical state-defined measure (Grad1). Estimates are based on equation (2) for the matched hurricane sample. The omitted reference year is 2003 for 10th graders. The dot to the left of Panel B shows that 2002 is the reference point for 9th graders and we cannot test parallel trends in that case due to data limitations. Dashed grey lines indicate 95% confidence intervals.
[Charts omitted. Both panels plot graduation-rate effects by year, 2002–2012.]
Figure 5: College Outcome Average Treatment Effects from Pooled Estimation
Panel A: College Attendance
Panel B: College Graduation
Notes: Estimates are based on equation (2) for the matched hurricane sample. Years on the x-axis indicate the year that students were 12th graders, and we use a five-year college graduation window looking forward from that year. College entry is based on "on-time" college entry the fall after a student's 12th grade year. The last cohort for which this calculation is feasible is therefore 2009 (soon after the reforms began). Dashed grey lines indicate 95% confidence intervals.
[Charts omitted. Panel A plots college entry-rate effects by year (2002–2014); Panel B plots college graduation-rate effects by year (2004–2009).]
Figure 6A: Effect Heterogeneity – Math Scores (by Race)
Notes: The panel results are for 2005 4th graders who returned by 2006. (We omit results for 2006 scores for this group because of problems with the test administration that year, in the wake of hurricanes.) Dashed grey lines indicate 95% confidence intervals.
[Charts omitted. Left: panel estimation, effects in test-score standard deviations by year (2004–2009); right: pooled estimation (2001–2014). Series: Black, White.]
Figure 6B: Effect Heterogeneity – Math Scores (by FRPL)
Notes: See notes on panel and pooled analyses in Figure 6A and Tables 3 and 4.
[Charts omitted. Left: panel estimation (2004–2009); right: pooled estimation (2001–2014), effects in test-score standard deviations. Series: FRPL, Non-FRPL.]
Figure 6C: Effect Heterogeneity – High School Graduation (by Race and FRPL; Pooled Estimation)
Notes: As with the average treatment effects for high school graduation in Table 6, these figures show results for pooled analysis of first-time 10th graders who had returned to New Orleans by 2007 (and who therefore could be identified as first-time 10th graders in 2008). The results with 9th grade cohorts are similar (available upon request). Dashed grey lines indicate 95% confidence intervals. See additional notes in Figure 6A.
[Charts omitted. Both panels plot graduation-rate effects by year (2002–2012); left by race (Black, White), right by FRPL status (FRPL, Non-FRPL).]
Figure 6D: Effect Heterogeneity – College Entry (by Race and FRPL; Pooled Estimation)
Notes: As with the average treatment effects for college outcomes, these effect heterogeneity estimates are based on pooled estimation with cohorts of 12th graders. Dashed grey lines indicate 95% confidence intervals.
[Charts omitted. Both panels plot college entry-rate effects by year (2002–2014); left by race (Black, White), right by FRPL status (FRPL, Non-FRPL).]
Table 1A: Descriptive Statistics for New Orleans Before and After Katrina
Notes: Table 1A includes New Orleans students in the spring testing file for the given year, excluding students who took alternative assessments. Individual student scores are normalized to 𝜇 = 0 and 𝜎 = 1 for the statewide population within year, grade, and subject. The mean differences in the far right-hand column indicate changes before and after the reforms in the New Orleans sample.
Panel A: Demographics (New Orleans)

                               2004-05                                  2013-14                     Mean
                    N       Mean    s.d.   Min      Max      N       Mean    s.d.   Min      Max    Diff.
Demographics
African-American    30,251  0.935   0.247  0        1        18,417  0.877   0.328  0        1      -0.057
Hispanic            30,251  0.012   0.109  0        1        18,417  0.038   0.191  0        1       0.026
Other               30,251  0.020   0.140  0        1        18,417  0.026   0.158  0        1       0.006
White               30,251  0.033   0.179  0        1        18,417  0.059   0.236  0        1       0.026
FRL                 30,240  0.832   0.374  0        1        18,416  0.875   0.331  0        1       0.043
Special Education   30,252  0.113   0.317  0        1        18,417  0.070   0.255  0        1      -0.044
ELL                 30,252  0.018   0.133  0        1        18,417  0.026   0.158  0        1       0.008

Test Scores
Math                30,068  -0.505  1.032  -4.5121  3.249    18,329  -0.093  1.032  -4.4327  3.173   0.413
ELA                 29,767  -0.539  1.011  -4.4943  3.313    18,309  -0.136  1.056  -4.9827  3.806   0.402
Science             29,478  -0.624  0.931  -4.2155  3.682    18,342  -0.207  1.025  -4.9275  4.162   0.417
Social Studies      29,449  -0.539  1.027  -4.2205  3.802    18,321  -0.097  1.050  -5.2573  4.406   0.443

Graduation, 9th Grade (2002 vs. 2011)
Grad 1              4,287   0.524   0.499  0        1        2,899   0.726   0.446  0        1       0.202
Grad 2              4,486   0.501   0.500  0        1        3,166   0.665   0.472  0        1       0.164
Grad 3              4,293   0.610   0.488  0        1        2,902   0.785   0.411  0        1       0.175

College Attendance (on-time), 12th Grade (2004 vs. 2012)
Any Attendance      3,878   0.225   0.418  0        1        2,426   0.328   0.469  0        1       0.103
2-Year Attendance   3,878   0.067   0.250  0        1        2,426   0.070   0.255  0        1       0.003
4-Year Attendance   3,878   0.158   0.365  0        1        2,426   0.258   0.438  0        1       0.100

College Attendance (any), 12th Grade (2004 vs. 2009)
Any Attendance      3,878   0.534   0.499  0        1        2,306   0.655   0.475  0        1       0.121
2-Year Attendance   3,878   0.287   0.452  0        1        2,306   0.411   0.492  0        1       0.124
4-Year Attendance   3,878   0.372   0.483  0        1        2,306   0.393   0.489  0        1       0.021

College Persistence, 12th Grade (2004 vs. 2009)
2 Full Years        3,878   0.278   0.476  0        1        2,306   0.374   0.484  0        1       0.096
4 Full Years        3,878   0.155   0.418  0        1        2,306   0.214   0.410  0        1       0.059
Years of College    3,878   1.099   2.172  0        5        2,306   1.394   1.394  0        5       0.295
Grad-Rate           3,878   0.100   0.300  0        1        2,306   0.121   0.121  0        1       0.021
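The table notes describe normalizing scores to 𝜇 = 0 and 𝜎 = 1 for the statewide population within each year, grade, and subject. A minimal pandas sketch of that standardization, with hypothetical column names and scores (not the LDOE data):

```python
import pandas as pd

# Hypothetical score records (column names and values are assumptions,
# not the LDOE file schema).
df = pd.DataFrame({
    "year":    [2005] * 4,
    "grade":   [4] * 4,
    "subject": ["math"] * 4,
    "score":   [210.0, 230.0, 250.0, 270.0],
})

# Standardize each raw score against the mean and standard deviation of its
# year-grade-subject cell, so each cell has mean 0 and s.d. 1.
grp = df.groupby(["year", "grade", "subject"])["score"]
df["z"] = (df["score"] - grp.transform("mean")) / grp.transform("std")
```

Note that pandas' `std` uses the sample (ddof=1) estimator; with statewide populations of tens of thousands of students, the distinction from the population estimator is negligible.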
Table 1B: Descriptive Statistics for New Orleans Relative to Comparison Groups (Pre-Katrina)
Notes: The pooled results use all grades while the panel results use only 4th graders who returned to their original district in 2006. High school graduation rates are based on cohorts of 9th graders. College outcomes are for cohorts of 12th graders. College entry reflects immediate enrollment after high school graduation. Most of the differences in the right-hand column are statistically significant at p<0.05, and almost all are significant at p<0.10.
Panel B: Pre-Reform Mean Differences (2004-05)

                               New Orleans       Other Hurricane        New Orleans Minus
                                                 Districts (Matched)    Comparison
                               Panel    Pool     Panel    Pool          Panel    Pool
Demographics
African-American               0.920    0.935    0.419    0.730         0.500    0.204
Hispanic                       0.011    0.012    0.020    0.021        -0.009   -0.009
Other                          0.030    0.020    0.041    0.026        -0.011   -0.006
White                          0.040    0.033    0.520    0.222        -0.480   -0.189
FRL                            0.859    0.832    0.730    0.802         0.129    0.030
Special Education              0.109    0.113    0.264    0.158        -0.155   -0.044
ELL                            0.028    0.018    0.008    0.013         0.019    0.005

Test Scores
Math                          -0.287   -0.505   -0.208   -0.205        -0.078   -0.300
ELA                           -0.293   -0.539   -0.253   -0.250        -0.040   -0.289
Science                       -0.517   -0.624   -0.423   -0.202        -0.093   -0.422
Social Studies                -0.467   -0.539   -0.359   -0.160        -0.108   -0.380

Graduation, 9th Grade (2002; pooled estimation only)
Grad 1                                  0.524             0.565                 -0.041
Grad 2                                  0.501             0.514                 -0.013
Grad 3                                  0.610             0.765                 -0.155

College, 12th Grade (2004; pooled estimation only)
Attendance (on-time)                    0.225             0.352                 -0.127
2-Year Attendance (on-time)             0.067             0.040                  0.027
4-Year Attendance (on-time)             0.158             0.298                 -0.140
Attendance (any)                        0.534             0.571                 -0.037
2-Year Attendance (any)                 0.287             0.217                  0.070
4-Year Attendance (any)                 0.372             0.459                 -0.087
2 Full Years                            0.278             0.365                 -0.088
4 Full Years                            0.155             0.226                 -0.071
Years of College                        1.099             1.959                 -0.861
Grad-Rate                               0.100             0.137                 -0.037
4-Year Grad Rate                        0.091             0.125                 -0.034
Table 2: Effects of Population Change
Notes: Panel A shows difference-in-differences (DD) of demographics and test scores (from LDOE administrative data) between all public school students in 2005 in the respective districts and the returnees in those same districts. Panel B shows DD in district-wide demographics based on Census data (public school students only). Panel C reports regression coefficients based on the ECLS, using the same demographics as in the Census; we regressed reading score levels (and gains, separately) on the variable in the left column plus a vector of school fixed effects. Each reported coefficient is from a separate regression; standard errors are in parentheses. Panel D provides simulated effects of demographic change; specifically, we carried out an out-of-sample prediction, inserting the Census-based DD changes from Panel B into the regression model in Panel C. The "Cumulative" effects come from adding the effect on 3rd grade test levels to the 5th grade gains multiplied by the dosage through 2012, yielding the total predicted effect of demographic change on student test scores. Standard errors of prediction are available upon request.
Panel A: Population Change (Average Pre-Katrina Characteristics of 3rd Graders)
All Pre-Katrina Students Returnees Diff
All Pre-Katrina Students Returnees Diff Diff-in-Diff
FRL 0.866 0.874 0.008 0.610 0.606 -0.004 0.012Special Ed 0.101 0.103 0.002 0.164 0.171 0.007 -0.005ELL 0.017 0.016 0.000 0.034 0.032 -0.001 0.001Reading Scores -0.665 -0.683 -0.018 0.118 0.143 0.025 -0.043
Panel B. Census Demographic Changes (Public School Students Only)
1999 2013 Change 1999 2013 Change Diff-in-DiffIncome (2013 $) $43,189 $42,453 -$736 $69,659 $71,408 $1,749 -$2,485Prop. BA+ 0.10 0.15 0.05 0.16 0.19 0.03 0.02Prop. Child Poverty 0.57 0.58 0.01 0.30 0.32 0.02 -0.01Prop. < H.S. 0.33 0.20 -0.13 0.23 0.16 -0.07 -0.06
Panel C. Partial Correlations Between Demographics and Test Scores (from ECLS)
Grade 3 Grade 5 Grade 8 Grade 5 Grade 8Income (thous., 2013 $) 0.003 0.003 0.003 0.0004 0.0009
(0.0002) (0.0002) (0.0003) (0.0001) (0.0002)BA+ 0.139 0.253 0.229 0.046 0.092
(0.021) (0.023) (0.03) (0.013) (0.022)Child Poverty -0.437 -0.423 -0.402 -0.082 -0.101
(0.028) (0.035) (0.051) (0.022) (0.038)<H.S. -0.369 -0.366 -0.405 -0.08 -0.076
(0.044) (0.048) (0.065) (0.029) (0.054)
Panel D. Predicted Effects of Census Demographic Change on Student Test Scores (Using Panels B and C)
Grade 3 Grade 5 Grade 8 Grade 5 Grade 8 CumulativeIncome (thous., 2013 $) -0.007 -0.007 -0.007 -0.001 -0.002 -0.012BA+ 0.003 0.005 0.005 0.001 0.002 0.007Child Poverty 0.004 0.004 0.004 0.001 0.001 0.008<H.S. 0.022 0.022 0.024 0.005 0.005 0.044
Average 0.005 0.006 0.006 0.001 0.001 0.012
Dep Var: Test Levels Dep Var: Test Gains
Test Levels Test Gains
New Orleans Hurricane-Affected Districts
New Orleans Hurricane-Affected Districts
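The Panel D simulation can be sketched in a few lines: each predicted effect is the Panel C coefficient times the Panel B diff-in-diff change, and the "Cumulative" column adds the grade-3 level effect to the grade-5 gain effect scaled by the dosage (4.5 post-reform years, from Table 8). The dictionary layout below is ours; the numbers are taken directly from Panels B and C.

```python
# Sketch: reproduce Table 2, Panel D from Panels B and C.
# Predicted effect = ECLS coefficient x Census diff-in-diff change;
# "Cumulative" = grade-3 level effect + grade-5 gain effect x dosage.

did_change = {        # Panel B diff-in-diff (income in thousands of 2013 $)
    "Income": -2.485, "BA+": 0.02, "Child Poverty": -0.01, "< H.S.": -0.06,
}
level_coef_g3 = {     # Panel C, Grade 3 test-level coefficients
    "Income": 0.003, "BA+": 0.139, "Child Poverty": -0.437, "< H.S.": -0.369,
}
gain_coef_g5 = {      # Panel C, Grade 5 test-gain coefficients
    "Income": 0.0004, "BA+": 0.046, "Child Poverty": -0.082, "< H.S.": -0.08,
}
DOSAGE = 4.5          # post-reform years of exposure (Table 8)

for k in did_change:
    level = level_coef_g3[k] * did_change[k]
    cumulative = level + gain_coef_g5[k] * did_change[k] * DOSAGE
    print(f"{k}: grade-3 level {level:+.3f}, cumulative {cumulative:+.3f}")
```

Rounded to three decimals, the loop reproduces the Grade 3 and Cumulative columns of Panel D (e.g., Income: -0.007 and -0.012).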
Table 3: Average Treatment Effects on Test Scores based on Panel Estimation (2006 Returnees)

Panel A: 2005 4th Grade Cohort, 2005 vs. 2009 Diff-in-Diff

                          Entire State   Entire State w/    Hurricane        Hurricane Districts
                                         Student Matching   Districts Only   w/ Student Matching
Math
  Post x NOLA             0.222***       0.190***           0.181***         0.173**
  s.e.                    (0.055)        (0.058)            (0.057)          (0.071)
  Parallel Trends Test    [0.102**]      [-0.002]           [0.181***]       [-0.011]
ELA
  Post x NOLA             0.123**        0.121**            0.135**          0.084
                          (0.057)        (0.060)            (0.058)          (0.073)
                          [0.239***]     [0.009]            [0.206***]       [0.013]
Science
  Post x NOLA             0.223***       0.102*             0.204***         0.057
                          (0.056)        (0.059)            (0.057)          (0.077)
                          [-0.008]       [-0.020]           [-0.008]         [-0.032]
Social Studies
  Post x NOLA             0.249***       0.093              0.259***         0.094
                          (0.060)        (0.063)            (0.061)          (0.080)
                          [-0.022]       [-0.025]           [-0.041]         [-0.057]
Number of Districts       68             68                 8                8

Panel B: 2005 5th Grade Cohort, 2005 vs. 2008 Diff-in-Diff

Math
  Post x NOLA             0.160***       0.061              0.162***         0.060
                          (0.058)        (0.061)            (0.060)          (0.074)
                          [-0.069]       [0.001]            [-0.103**]       [0.005]
ELA
  Post x NOLA             0.220***       0.036              0.179***         -0.005
                          (0.058)        (0.061)            (0.059)          (0.075)
                          [-0.249***]    [-0.009]           [-0.214***]      [-0.001]
Science
  Post x NOLA             0.082          -0.023             0.082            -0.097
                          (0.055)        (0.059)            (0.057)          (0.072)
                          [-0.048]       [0.025]            [-0.087*]        [0.030]
Social Studies
  Post x NOLA             0.225***       0.083              0.213***         0.087
                          (0.055)        (0.058)            (0.057)          (0.073)
                          [-0.066]       [0.009]            [-0.055]         [0.016]
Number of Districts       68             68                 8                8

Notes: Each cell represents a separate regression estimated at the student level, with controls for race, free/reduced-price lunch, special education status, and English proficiency in 2005. Columns 2 and 4 are weighted by the number of times a student is matched using a Mahalanobis matching process on 2004 and 2005 test score levels. The first number in each cell is the point estimate for δ in equation (1). The second number, in parentheses, is the GEE cluster standard error. The third number, in brackets, is the coefficient on a parallel trend taken from a separate comparative interrupted time series estimate in which the pre-2006 linear trend is interacted with a New Orleans indicator variable. See text for discussion of the matching process. Significance levels: *** p<.01, ** p<.05, * p<.10.
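The Post x NOLA estimand δ in equation (1) is, at its core, a comparison of changes: the New Orleans pre-to-post change minus the comparison districts' pre-to-post change. A minimal sketch of that logic (the four cell means below are hypothetical placeholders, not values from this report):

```python
# Minimal diff-in-diff sketch for the Post x NOLA estimand in equation (1).
# Cell means are hypothetical placeholders, not values from these tables.
means = {
    ("nola", "pre"): -0.60, ("nola", "post"): -0.25,
    ("comp", "pre"): -0.10, ("comp", "post"): -0.05,
}
nola_change = means[("nola", "post")] - means[("nola", "pre")]   # 0.35
comp_change = means[("comp", "post")] - means[("comp", "pre")]   # 0.05
delta = nola_change - comp_change                                # diff-in-diff
print(f"diff-in-diff estimate: {delta:.2f}")                     # 0.30
```

The regression version in the tables adds student-level controls, matching weights, and clustered standard errors, but the identifying comparison is the same double difference.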
Table 4: Test Score Average Treatment Effects from Pooled Estimation (2005 to 2014)

                          Entire State   Entire State w/    Hurricane    Hurricane Districts
                                         School Matching    Districts    w/ School Matching
Math                      0.402***       0.387***           0.362***     0.426**
  s.e.                    (0.019)        (0.023)            (0.052)      (0.119)
  Parallel Trends Test    [0.037***]     [0.007]            [0.050***]   [0.031**]
ELA                       0.363***       0.360***           0.322***     0.345***
                          (0.015)        (0.023)            (0.025)      (0.070)
                          [0.014***]     [-0.008]           [0.023***]   [0.029*]
Science                   0.350***       0.331***           0.318***     0.398***
                          (0.015)        (0.023)            (0.040)      (0.062)
                          [0.005**]      [-0.013***]        [0.012**]    [-0.013]
Social Studies            0.381***       0.361***           0.347***     0.425***
                          (0.016)        (0.023)            (0.027)      (0.062)
                          [0.018***]     [-0.006]           [0.024***]   [-0.012]
Number of Districts       68             53                 8            6

Notes: Each cell represents a separate regression estimated at the student level, with controls for race, free/reduced-price lunch, special education status, and English proficiency in 2005, using data from 2005 and 2014 only. Columns 2 and 4 are weighted by the number of times a student's school is matched using a Mahalanobis matching process on 2002 test score levels. The first number in each cell is the point estimate for δ in equation (1). The second number, in parentheses, is the GEE cluster standard error. The third number, in brackets, is the coefficient on a parallel trend taken from a separate comparative interrupted time series estimate in which the pre-2006 linear trend is interacted with a New Orleans indicator variable. See text for discussion of the matching process. Significance levels: *** p<.01, ** p<.05, * p<.10.
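The Mahalanobis matching referenced in the notes selects, for each treated school, the comparison school closest in pre-treatment test scores under an inverse-covariance-weighted distance. A minimal two-dimensional sketch, in which the score vectors and the covariance matrix are hypothetical illustrations rather than values from this study:

```python
# Nearest-neighbor selection under a 2-D Mahalanobis metric.
# All score vectors and the covariance matrix are hypothetical,
# not values from this report.

def mahalanobis_2d(u, v, cov_inv):
    """Distance between 2-vectors u, v; cov_inv = (a, b, c) for [[a, b], [b, c]]."""
    a, b, c = cov_inv
    dx, dy = u[0] - v[0], u[1] - v[1]
    return (a * dx * dx + 2 * b * dx * dy + c * dy * dy) ** 0.5

# Inverse of an assumed covariance [[0.20, 0.05], [0.05, 0.10]]:
# det = 0.20 * 0.10 - 0.05**2 = 0.0175
det = 0.0175
cov_inv = (0.10 / det, -0.05 / det, 0.20 / det)

treated = (0.10, -0.20)   # pre-treatment score vector of a treated school
candidates = [(0.40, 0.10), (0.12, -0.15), (-0.50, -0.60)]
best = min(candidates, key=lambda c: mahalanobis_2d(treated, c, cov_inv))
print(best)  # -> (0.12, -0.15), the closest comparison school
```

The match weights in columns 2 and 4 then count how many times each comparison school is selected as such a nearest neighbor.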
Table 5: Test Score Average Treatment Effects from Students Switching Districts (Annualized Effects)

                               Switch into New Orleans     Switch out of New Orleans
                               M1           M2             M1           M2
Math
  Post-Katrina                 0.105***     -0.072***      0.029        -0.074***
                               (0.031)      (0.015)        (0.047)      (0.016)
  Switch Type                               -0.089***                   -0.136***
                                            (0.023)                     (0.017)
  Switch Type x Post-Katrina                0.175***                    0.089**
                                            (0.035)                     (0.044)
ELA
  Post-Katrina                 0.104***     -0.055***      0.006        -0.056***
                               (0.016)      (0.012)        (0.022)      (0.014)
  Switch Type                               -0.120***                   -0.114***
                                            (0.024)                     (0.023)
  Switch Type x Post-Katrina                0.156***                    0.058**
                                            (0.020)                     (0.024)
Science
  Post-Katrina                 0.095**      -0.053***      0.033        -0.060***
                               (0.043)      (0.018)        (0.037)      (0.015)
  Switch Type                               -0.189***                   -0.187***
                                            (0.024)                     (0.024)
  Switch Type x Post-Katrina                0.145***                    0.082**
                                            (0.046)                     (0.039)
Social Studies
  Post-Katrina                 0.115***     -0.041*        0.077        -0.051***
                               (0.028)      (0.021)        (0.048)      (0.016)
  Switch Type                               -0.151***                   -0.218***
                                            (0.035)                     (0.023)
  Switch Type x Post-Katrina                0.151***                    0.119***
                                            (0.034)                     (0.044)

Notes: Coefficients are based on student-level regressions of achievement on lagged achievement, grade fixed effects, and an indicator for whether the switch occurred before or after Katrina (Post-Katrina). Method 1 (M1) focuses only on switchers either in or out of New Orleans. Method 2 (M2) uses all possible switchers in the state and interacts the post-Katrina variables with the type of switch being made. M1 has a range of 5,066-6,761 observations, while M2 has a range of 81,290-82,364. Pre-Katrina district switches are included for 2002-2005, and the post-Katrina years are 2009-2013. Standard errors are GEE clustered at the sending district level for in-switchers and the receiving district level for out-switchers. See text and earlier footnotes for more details on the model. By design of the identification strategy, the coefficients reflect annual changes in achievement, and therefore the magnitudes cannot be directly compared with the panel and pooled estimates reported elsewhere. Significance levels: *** p<.01, ** p<.05, * p<.10.
Table 6: High School Graduation Average Treatment Effects from Pooled Estimation

                          Entire State   Entire State w/    Hurricane     Hurricane Districts
                                         School Matching    Districts     w/ School Matching
2012 10th Grade Cohort
  Grad 1                  0.120***       0.096***           0.069*        0.031
    s.e.                  (0.013)        (0.018)            (0.030)       (0.060)
    Parallel Trends Test  [-0.027***]    [-0.016***]        [-0.022*]     [-0.015]
  Grad 2                  0.102***       0.088***           0.064*        0.055
                          (0.011)        (0.015)            (0.033)       (0.051)
                          [-0.025***]    [-0.012**]         [-0.021]      [-0.008]
  Grad 3                  0.126***       0.097***           0.079***      0.039
                          (0.011)        (0.016)            (0.021)       (0.038)
                          [-0.060***]    [-0.044***]        [-0.057***]   [-0.046***]
2011 9th Grade Cohort
  Grad 1                  0.119***       0.126***           0.079*        0.075*
    s.e.                  (0.012)        (0.011)            (0.034)       (0.032)
  Grad 2                  0.100***       0.110***           0.064         0.091*
                          (0.012)        (0.009)            (0.041)       (0.042)
  Grad 3                  0.126***       0.123***           0.090***      0.081**
                          (0.009)        (0.011)            (0.018)       (0.023)
Number of Districts       68             42                 8             5

Notes: Each cell is from a separate pooled regression estimation of equation (1). Graduation measure 1 (Grad 1) only counts graduates who receive a regular diploma from their school and includes students who move out of the public school system in the denominator (see text for more detail). The second measure (Grad 2) uses the same definition of graduation as Grad 1 but excludes students who left the public school system from the calculation, while Grad 3 uses the same total pool of students as Grad 1 but allows for alternative degrees. Columns 2 and 4 use school-level match weights from Mahalanobis matching of graduation rates in 2002 for the 9th grade cohorts and both 2002 and 2003 for the 10th grade cohorts. The first number in each cell is the point estimate for δ in equation (1). The second number, in parentheses, is the GEE cluster standard error. The third number, in brackets, is the parallel trends test. See text for discussion of the matching process. Significance levels: *** p<.01, ** p<.05, * p<.10.
Table 7: College Outcome Average Treatment Effects from Pooled Estimation

                            Entire State   Entire State w/    Hurricane    Hurricane Districts
                                           School Matching    Districts    w/ School Matching
Attendance (on-time)
  Any College Attendance    0.103***       0.114***           0.095***     0.150***
    s.e.                    (0.010)        (0.014)            (0.019)      (0.026)
    Parallel Trends Test    [0.016***]     [0.003]            [0.011]      [-0.001]
  2-Year Attendance         -0.019***      -0.029***          -0.010       -0.020*
                            (0.005)        (0.009)            (0.007)      (0.008)
                            [0.002]        [0.003]            [0.002]      [0.002]
  4-Year Attendance         0.122***       0.138***           0.105***     0.161***
                            (0.010)        (0.012)            (0.016)      (0.025)
                            [0.013***]     [-0.006]           [0.010]      [-0.006]
Attendance (any)
  Any College Attendance    0.067***       0.066***           0.079**      0.078**
                            (0.010)        (0.012)            (0.025)      (0.025)
  2-Year Attendance         -0.020         -0.029             0.003        -0.008
                            (0.014)        (0.014)            (0.012)      (0.022)
  4-Year Attendance         0.059***       0.068***           0.064***     0.090***
                            (0.007)        (0.008)            (0.012)      (0.014)
Persistence
  2 Full Years in College   0.068***       0.060***           0.071**      0.070**
                            (0.008)        (0.007)            (0.025)      (0.025)
  4 Full Years in College   0.042***       0.034***           0.042*       0.044
                            (0.007)        (0.007)            (0.022)      (0.024)
  Years of College          0.243***       0.198***           0.239**      0.205*
                            (0.027)        (0.028)            (0.083)      (0.096)
Graduation
  Any Graduation            0.036***       0.021***           0.035*       0.032***
                            (0.005)        (0.005)            (0.016)      (0.006)
  4-Year Graduation         0.047***       0.033***           0.048**      0.045**
                            (0.004)        (0.004)            (0.016)      (0.013)
Number of Districts         68             44                 8            6

Notes: Each cell is from a separate pooled regression estimation of equation (1). Estimation is restricted to first-time 12th graders. Columns 2 and 4 use school-level match weights from Mahalanobis matching on pre-treatment values of the dependent variables. On-time attendance measures compare 2005 rates to 2012; all other outcomes compare 2005 to 2009 rates. The first number in each cell is the point estimate for δ in equation (1). The second number, in parentheses, is the GEE cluster standard error. The third number, in brackets where shown, is the parallel trends test based on on-time enrollment from the BOR data. See text and footnotes for details of the modified parallel trends tests for college persistence and graduation. Significance levels: *** p<.01, ** p<.05, * p<.10.
Table 8: Effect Summary and Cost-Benefit Analysis

Effect Category                                          2007     2008     2009     2011/12   2014
Panel A: Test Score Summary
  Total NOLA improvement (rel. to 2005) (Table 1)        0.10     0.15     0.25     0.35      0.42
  Threats to Identification
    Population Change
      Pre-Katrina Scores of Returnees                    0.10     0.06     0.04     -0.06
      Census/USDOE Simulation                                                                 0.01
    Interim Schools/Trauma (Pane et al. 2008)            -0.06
  Effects from DD
    Panel DD - Low (Figure 3)                            -0.03    0.05     0.11     [0.28]    [0.28]
    Panel DD - High (Figure 3)                                                                [0.39]
    Pooled DD (Table 4)                                  0.00     0.11     0.23     0.40      0.40
Panel B: Years of Education Summary
  Total NOLA improvement (rel. to 2005)
    HS graduation (Table 1; NOLA only)                                                        0.18
    College attend. (Table 1; NOLA only)                                                      0.10
  Effects from DD
    HS graduation (Figure 5A)                                              0.07     0.11      0.06
    College attend. (Figure 5B)                          0.10     0.10     0.08     0.15      0.13
  Effects on years of education                                                               0.42
Panel C: Benefit-Cost Analysis
  Key Parameters
    Dosage (post-reform years in NOLA)                              4.50
    Annual cost per pupil (from DD)                                 $1,300
    Discount rate                                                   0.035
    Productivity growth rate                                        0.010
    Return to a year of education (from prior research)             0.04-0.08
    Return to a 1 s.d. increase in test scores (from prior research) 0.05-0.08
  NOLA Adj. Benefit-Cost Ratio (from Panels A and B)                5.66-9.96
  BCR: Perry Preschool                                              7.1-12.2
  BCR: Class Size (STAR)                                            2.83

Notes: Panels A and B summarize results reported elsewhere (see column 1 descriptions). The two Panel DD estimates for 2014 are shown in brackets because they are extrapolations of the panel effect (see text discussion). The estimate for "Years of Education" combines the effects on high school graduation and college. Panel C provides the cost-benefit analysis. The present discounted value of costs comes from multiplying the dosage by the annual additional costs and applying the discount rate (δ = 0.035). The benefits come from adding the returns to quality (test scores from Panel A) and the returns to quantity (years of education from Panel B), adjusting for productivity growth and discounting. The ratio of these easily exceeds unity, indicating that the reforms pass a cost-benefit test.
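The discounting step on the cost side of Panel C can be illustrated with a short sketch using the stated parameters ($1,300 per pupil per year, 4.5 dosage years, δ = 0.035). The timing convention below, in which costs accrue at the start of each year and the final half-year is pro-rated, is our assumption for illustration, not necessarily the authors' exact calculation:

```python
# Present discounted value of an annual per-pupil cost stream (sketch).
# Timing assumption (ours): costs accrue at the start of each year,
# with the final partial year pro-rated.

def pdv(annual_cost, years, rate):
    """Discounted sum of annual_cost over `years` (fractional years pro-rated)."""
    total, t = 0.0, 0
    while t < years:
        fraction = min(1.0, years - t)            # pro-rate a partial final year
        total += fraction * annual_cost / (1 + rate) ** t
        t += 1
    return total

# Table 8 parameters: $1,300/yr, 4.5 dosage years, discount rate 0.035
cost = pdv(1300, 4.5, 0.035)
print(round(cost, 2))
```

The benefit side applies the same discounting to lifetime earnings gains implied by the test score and years-of-education effects, so the resulting ratio depends on the assumed returns (0.04-0.08 per year of education, 0.05-0.08 per standard deviation of test scores).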