the long‐run effect of childhood poverty and the mediating role of...

© 2018 Royal Statistical Society 0964–1998/19/182037

J. R. Statist. Soc. A (2019)182, Part 1, pp. 37–68

The long-run effect of childhood poverty and themediating role of education

Luna Bellani

University of Konstanz and IZA Institute of Labor Economics, Bonn, Germany,and Carlo Dondena Centre for Research on Social Dynamics, Milan, Italy

and Michela Bia

Luxembourg Institute of Socio-Economic Research, Esch-sur-Alzette,Luxembourg

[Received April 2017. Final revision May 2018]

Summary. The paper examines the role of education as a causal channel through which grow-ing up poor affects the economic outcomes in adulthood in the European Union. We apply apotential outcomes approach to quantify those effects and we provide a sensitivity analysis onpossible unobserved confounders, such as child ability. Our estimates indicate that being poorin childhood significantly decreases the level of income in adulthood and increases the averageprobability of being poor. Moreover, our results reveal a significant role of education in this inter-generational transmission.These results are particularly relevant for Mediterranean and centraland eastern European countries.

Keywords: Causal mediation analysis; Education; Intergenerational transmission; Potentialoutcome; Poverty

1. Introduction

The effect of poverty during childhood on individuals’ economic outcomes later in life is a topicof active research and a major policy concern in many developed as well as developing countries.In the USA, children as a group are disproportionately represented among the poor: roughly onein five live in poverty compared with one in eight adults (US Census Bureau, 2014). Moreover,according to data from the Urban Institute about two out of every five children spent at least1 year in poverty before turning 18 years of age. Persistently poor children are then 43% lesslikely to finish college than are poor children, and 13% less likely to finish high school (Ratcliffe,2015). In Europe the picture is similar. From the latest data provided by Eurostat, in 2014 inthe 28 European Union (EU) countries children were the population age group with the highestrisk of poverty or social exclusion. The share of children living in a household at risk of povertyor social exclusion ranged from 16–18% in the Nordic countries, Slovenia and the Netherlandsto 40–52% in Hungary, Romania and Bulgaria. Moreover, 50.5% of children whose parents’highest level of education was low were at risk of poverty compared with 8.0% of children whoseparents’ highest level of education was high (Eurostat, 2016). In the EU the educational levelof current adults is related to the level of education of their parents in all the Member States,

Address for correspondence: Luna Bellani, Fachbereich Wirtschaftswissenschaften, Universitat Konstanz,Universitatsstraße 10, Box 132, Konstanz, Germany.E-mail: [email protected]

38 L. Bellani and M. Bia

with an average association index for being low educated of 14.7 for adults having low educatedparents. In 2011, in Bulgaria and Croatia, this association index was more than 40, whereas, inNorway, Estonia, Denmark and Finland, it was less than 5. This association index is calculatedas an odds ratio and measures how strongly the low level of education of adults is related tothe low level of education of parents compared with the high level of education of parents(Grundiza and Lopez Vilaplana, 2013).

The economics literature on intergenerational transmission focuses typically on estimates ofintergenerational elasticity in income or earnings of parents and their offspring. Fewer studiesfocus on persistence of poverty across generations (see among others Mayer (1997), Shea (2000),Acemoglu and Pischke (2001) and Ermisch et al. (2004)). These papers find significant effects ofparental income or parental financial difficulties on children human capital accumulation andlater labour market outcomes, in the range of a 5% decrease in education given parental jobless-ness for Britain, and a 1.4-percentage-point increase in the probability of attending college foran increase in income of 10% for the USA. Blanden et al. (2007) analysed in detail the associa-tion between childhood family income and later adult earnings among sons, exploring the roleof education, ability, non-cognitive skills and labour market experience in generating intergen-erational persistence in the UK. They did so by decomposing the estimated mobility coefficientconditionally on those mediating variables. They showed that inequalities in achievements atage 16 years and in post-compulsory education by family background are extremely importantin determining the level of intergenerational mobility. In particular they found a dominant roleof education in generating persistence. Cognitive and non-cognitive skills both work indirectlythrough influencing the level of education that is obtained, with the cognitive variables account-ing for 20% of intergenerational persistence and non-cognitive variables accounting for 10%.

Although many contributions agree that growing up in a poor family is associated with theprobability of falling below the poverty threshold in adulthood, the key contentious questionfor policy is whether this association is truly causal in the sense that poverty in childhood per seinfluences later outcomes or whether it is driven by other factors that are correlated with bothchildhood poverty and later outcomes, such as family structure, neighbourhood influences andgenetic transmissions. Moreover, it is relevant for policy to examine plausible causal channelsthrough which being born poor affects the individual’s economic and social status as an adult.

Recent studies have shown that educational differences tend to persist across generations,and differences in such persistence explain a large share of the cross-country variation of in-tergenerational wage correlations (e.g. Solon (2004)). An important part of poverty persistenceis therefore likely to be driven by the effect of parental background on cognitive skills that areacquired by children in formal (and informal) education. Hence, a better understanding of theinteraction between cognitive skills development and the reduction of poverty will help to designmore effective policy interventions.

Experiencing financial difficulties while growing up is not the only determinant of outcomeslater in life. Because of the complexity of the process, different statistical techniques have beenused, each of which relies on a different set of assumptions. In particular, siblings differencemodels and instrumental variables approaches have been applied to similar questions. However,the first method does not guarantee that estimates are unbiased, since some child-specific factorsmay remain contributing to potential bias and the estimates are based on a selected type of familythat could be different with respect to other factors affecting the outcome of the child, as well astheir poverty status. The alternative method suffers from the difficulty of finding an additionalvariable which determines childhood poverty status and which at the same time has no directinfluence on the outcome variable, i.e. a good instrument, resulting in possible weak instrumentbias and anyway providing only a local average effect of childhood poverty.

Childhood Poverty and Mediating Role of Education 39

This paper addresses the following questions. Does living in poverty as a child cause povertyin adulthood? If so, is education driving this causal effect? In particular, we focus our analysison the effect of poverty during childhood for those experiencing it. We believe that this is therelevant measure for policy makers, this population being the natural target of possible publicinterventions, such as antipoverty, social assistance and social protection programmes. More-over, in case those effects were heterogeneous, i.e. they differ for the type of children who did anddid not experience poverty, reporting the average effect on the whole population could in factbe misleading as it would include effects on children who might never be experiencing povertyand for whom the effect could potentially be very different.

We contribute to the above-mentioned literature in three ways: we apply a potential outcomesframework to quantify the effect of experiencing financial difficulties while growing up, we anal-yse the channel of this poverty transmission, introducing individual human capital accumula-tion as an intermediate variable and we provide an extensive sensitivity analysis on unobservedconfounders (e.g. parental ability) for both the direct and the indirect effect.

Our analysis is based on the module on intergenerational transmission of 2011 of the EU Sur-vey of Income and Living Conditions (SILC) data (http://ec.europa.eu/eurostat/web/income-and-living-conditions/overview), where retrospective questions aboutparental characteristics (such as education, age and occupation) were asked. We find that, evenconsidering possible unobserved confounders, such as child ability, being poor in childhoodsignificantly decreases the level of income in adulthood and increases the average probabilityof being poor. Moreover, our results reveal a significant role of human capital accumulation inthis intergenerational transmission.

The remainder of the paper is organized as follows. Section 2 introduces the estimation strategyand in Section 3 the data that are used through the whole paper are described. In Sections 4 and5 we analyse the average and the distributional effect of growing up poor respectively, whereasin Section 6 we focus on the mechanism behind this effect, analysing the role of education.Section 7 provides some heterogeneous effects by welfare state regime and Section 8 presentssensitivity analysis and robustness checks of our results. Finally, Section 9 concludes.

The programs that were used to analyse the data can be obtained from

http://wileyonlinelibrary.com/journal/rss-datasets

2. Estimation strategy

2.1. Average treatment effect on the treatedAs briefly reviewed in Section 1, in this paper we follow the framework of potential outcomesapproach for causal inference (Rubin, 1974, 1978).

Consider a set of N individuals, and denote each of them by subscript i: i=1, : : : , N. Let Ti

indicate whether a child was growing up in a poor household, Ti = 1 (treated), or not, Ti = 0(control). For each individual, we observe a vector of pretreatment variables Xi and the valueof the outcome variable that is associated with the treatment, Yi.1/ for being a poor child andYi.0/ for not being a poor child.

As mentioned in Section 1 in our analysis we are interested in the ‘average treatment on thetreated’ (ATT), which is defined as follows:

τ .X/=E[Yi.1/−Yi.0/|Ti =1]=EXi|Ti=1[E[Yi.1/−Yi.0/|Ti =1, Xi]]

=EXi|Ti=1[E[Yi.1/|Ti =1, Xi]]−EXi|Ti=1[E[Yi.0/|Ti =1, Xi]]:

In our context it is not possible for obvious ethical reasons to design a randomized experiment


on our population of interest (poor children) but, even if we could randomly give unconditionalcash transfer to poor parents today, we would not be able to analyse the long-term effect of iton the children before another 15–20 years. (See for example Haushofer and Shapiro (2016) forthe short-term effect of such transfers or Aizer et al. (2016) for an estimate of the long-run effectof cash transfers which compare accepted and rejected applicants to the ‘Mothers’ pension’programme in the USA (1911–1935).) Therefore, in our study, the critical problem of non-random treatment assignment implies that additional assumptions must be made to estimatethe causal effects of the treatment. An important identifying assumption is the selection onobservables (unconfoundedness) (Rosenbaum and Rubin, 1983). (For a review of the statisticaland econometric work focusing on estimating average treatment effects under this assumption,see Imbens (2004).) The central assumption of our approach is that the ‘assignment to treatment’is unconfounded given the set of observable pretreatment characteristics: Yi.0/⊥Ti|Xi, where,within each cell defined by X, treatment assignment is random and the outcome of controlsis used to estimate the counterfactual outcome of treated in case of no treatment. (A focuson the average effect on the whole population would require a more stringent assumption foridentification: Yi.0/, Yi.1/ ⊥ Ti|Xi. The estimation of the ATT allows instead the individual-specific effect Yi.1/Yi.0/ to depend on treatment status Ti in an arbitrary way.)

Let p.X/ be the probability of growing up in a poor household given the set of covariatesX: p.X/ = Pr.T = 1|X = x/ = E[T |X = x]. Rosenbaum and Rubin (1983) showed that, if thepotential outcome Yi.0/ is independent of treatment assignment conditional on X , it is alsoindependent conditional on p.X/: Yi.0/ ⊥ Ti|p.Xi/. Thus, for a given propensity score value,exposure to treatment can be considered as random and thus poor and non-poor children shouldbe on average observationally equivalent. Formally, given the population of units i, if we knowthe propensity score p.Xi/, then the average effect of being poor on those exposed to poverty(the ATT) can be rewritten as

τ =E[Yi.1/−Yi.0/|Ti =1]=Ep.Xi/|Ti=1[E[Yi.1/−Yi.0/|Ti =1, p.Xi/]]

=Ep.Xi/|Ti=1[E[Yi.1/|Ti =1, p.Xi/]]−Ep.Xi/|Ti=1[E[Yi.0/|Ti =1, p.Xi/]]:

When Ti = 1, then Yi.1/=Yi, whereas Yi.0/ is never observed. To estimate the ATT, we firstestimate for each individual the propensity score by means of a probit model, p.Xi/; then weestimate the unobserved potential outcomes based on the propensity score (PS) to reduce themulti-dimensionality problem to a single scalar dimension, applying the following nearest neigh-bour matching technique (the use of the adjective potential is motivated by the impossibilityof observing, with one observation per unit, the outcome both with and without treatment,Yi,observed = TiYi.1/ + .1 − Ti/Yi.0/, Yi,missing = .1 − Ti/Yi.1/ + TiYi.0/). In particular we imple-ment matching with replacement, which, as shown by Abadie and Imbens (2002, 2011), leadsto a higher variance, but produces higher quality in terms of matched units and hence a lowerbias. Let l.i/ denote the index l of the unit in the opposite treatment group that is the closest tounit i:

|p.Xl/− p.Xi/|� |p.Xj/− p.Xi/| ∀j, Tj �=Ti: .1/

This procedure enables us to select a control group of non-treated individuals (in this case non-poor as a child) who are very similar to treated individuals in terms of their probability of beingpoor as a child. We then estimate τ by

τ = 1n

n∑

i=1{Y i.1/− Y i.0/}

with Y i.0/=Yl1.i/, Y i.1/=Yi and n the number of poor children.


2.2. Average causal mediation effect on the treatedAs previously introduced, we also analyse a mechanism behind this average effect: humancapital accumulation. In our context, the intermediate (education status) and the outcome(income or poverty status) are not concurrently achieved by the individuals, and we could hy-pothetically conceive an intervention on the education status. Such an intervention could forexample consist in school-specific remedial education programmes, which could potentially im-prove poor marks in core courses and eventually decrease the likelihood of having to repeata grade. Another example for an intervention is more flexible pathways for graduation suchas evening courses, which could improve attendance and reduce dropout. As a consequence, apoor child who is now a high school dropout could have instead managed to finish high school.

We focus our analysis on the natural direct and indirect effect on the treated. More specifically,we want to decompose the causal effect of having experienced financial difficulties as a childon the income level and the poverty risk later in life into an indirect effect, transmitted throughcognitive abilities developed over time and measured by educational level (mediator variable),and a direct effect (which includes all other potential mechanisms). In this paper we follow inparticular the approach that was described in Imai et al. (2010a,b), which not only enables usto decompose the causal mechanism behind the effect of childhood poverty but also enables usto conduct sensitivity analysis with respect to key identification assumptions.

To define indirect effects within the potential outcome framework, let Mi.t/ denote the poten-tial value of the mediating variable for unit i with the treatment T = t, and let Yi.t, m/ denote thepotential outcome if T = t and M =m. Under the framework of potential outcomes, the causaleffect is the result of a comparison between potential results.

This is considered as the basic problem of causal inference, and it is true also in mediationanalyses, where the observed outcome Yi{Ti, Mi.Ti/} depends on both the treatment status andthe value of the mediator under the observed treatment level, assuming that a manipulation ofthe mediating variable does not influence the outcome of interest through any channel otherthan the mediator (the consistency assumption).

Our estimand of interest is the change in the outcome (children’s outcomes later in life)corresponding to a change in the mediating variable from the level that would be observedunder the control status Mi.0/ (higher education when not growing up in poverty), to thelevel that would be observed under the treatment status Mi.1/ (higher education when growingup in poverty), among those growing up in poverty. In particular, we are interested in theaverage causal mediation effect among the treated (ACMET) defined as τt = E[Yi{t, Mi.1/}−Yi{t, Mi.0/}|Ti =1]. Similarly, we can define the average direct effect among the treated (ADET)as γt =E[Yi{1, Mi.t/}−Yi{0, Mi.t/}|Ti =1].

Noteworthy, unlike the identification of the average treatment effect, identifying direct andindirect effects requires more stringent assumptions than random treatment assignment. In thissetting, an additional assumption is required, the so-called sequential ignorability (SI) amongtreated individuals:

Yi.t′, m/, Mi.t/⊥Ti|Xi =x, .2/

Yi.t′, m/⊥Mi.t/|Ti =1, Xi =x .3/

where 0 < Pr.Mi =m|Ti =1, Xi =x/ and 0 < Pr.T =1|X=x/, for t =0 and t =1, is in the com-mon support of Xi and Mi respectively. Assumption (2) is the standard unconfoundednessassumption, where treatment assignment is assumed to be independent of potential outcomesand potential mediating variables, conditional on pretreatment characteristics. Assumption (3)states that the mediator variable is now ignorable, given the observed treatment level and the


observed pretreatment characteristics, i.e., among those individuals with the same poverty statusand the same pretreatment observable characteristics, the level of education can be consideredas if it were randomized. It is important to acknowledge that this additional assumption is quitestrong because it excludes not only unobserved pretreatment confounders but also the exis-tence of (measured or unmeasured) post-treatment confounders. (Refer to Imai and Yamamoto(2013) for a method for dealing with multiple mediators based on the same framework whenthe multiple mediators are causally related.) Under the SI assumption, the distribution of anycounterfactual outcome is identified without being based on any specific model. (Refer to thetheorem on non-parametric identification in Imai et al. (2010a).)

To estimate the ACMET and ADET we need first to estimate the potential outcomes andmediators.

We use the following probit model for the mediating variable that is binary (achieving at leastsecondary education):

Mi =1{MÅi > 0} .4/

where

MÅi =α2 +β2Ti + ξ′Xi + εi2:

When the outcome variable is continuous (equivalized disposable income) a linear regressionmodel is implemented:

Yi =α3 +β3Ti +κTiMi +γMi + ξ′Xi + εi3, .5/

whereas when our outcome is binary (adult poverty) a probit model is implemented. (Yi =1{YÅ

i > 0} where YÅi = α3 + β3Ti + κTiMi + γMi + ξ′Xi + εi3.) Moreover, since assuming no

interaction between the treatment and the mediator variable is often unrealistic, we also includethe interaction term in equation (5). Such an interaction might arise if, for example, the effect ofthe educational level on income depends on whether the individual grew up in poverty or not.The coefficients in these models are estimated by weighted probit and least squares regressionmethods, where the weights are equal to the frequency of the control unit that is used in thenearest neighbour propensity score matching procedure, as in equation (1).

Under the SI assumption, we can write each potential outcome as (refer also toVanderWeele and Vansteelandt (2009))

(a) E[Yi{0, Mi.1/}]=EXi|Ti=1[Σ1m=0E[Yi|Mi =m, Ti =0, Xi]Pr.Mi =m|Ti =1, Xi/],

(b) E[Yi{0, Mi.0/}]=EXi|Ti=1[Σ1m=0E[Yi|Mi =m, Ti =0, Xi]Pr.Mi =m|Ti =0, Xi/],

(c) E[Yi{1, Mi.1/}]=EXi|Ti=1[Σ1m=0E[Yi|Mi =m, Ti =1, Xi]Pr.Mi =m|Ti =1, Xi/],

(d) E[Yi{1, Mi.0/}]=EXi|Ti=1[Σ1m=0E[Yi|Mi =m, Ti =1, Xi]Pr.Mi =m|Ti =0, Xi/].

More specifically, the estimates of the potential outcomes are defined as follows:

Yi{0, Mi.t/}= α3 + γMi.t/+ ξ′Xi, .6/

Yi{1, Mi.t/}= α3 + β3 + κMi.t/+ γMi.t/+ ξ′Xi, .7/

where

Mi.t/=1{MÅi .t/> 0} .8/

with

MÅi .t/= α2 + β2t + ξ

′Xi,


for t = 0, 1: The ACMET is then estimated by the average difference between poor children inpredicted outcome across the levels of high school graduation, for the case of having and nothaving experienced poverty (Hicks and Tingley, 2011), i.e.

ˆτ t = 1n

n∑

i=1[Y i{t, Mi.1/}− Y i{t, Mi.0/}],

and the ADET by the average difference between poor children in predicted outcome across thelevels of treatment, for level of high school graduation, i.e.

ˆγt =1n

n∑

i=1[Y i{1, Mi.t/}− Y i{0, Mi.t/}]

for t =0, 1:

To provide confidence intervals around our point estimates, we follow Imai et al. (2010a), whosuggested an algorithm to estimate the average causal mediation effects where the uncertaintyestimates are calculated by means of the quasi-Bayesian Monte Carlo simulation method thatwas proposed by King et al. (2000), in which the posterior distribution of the quantities ofinterest is approximated by their sampling distribution. (As a check of robustness we havealso used the non-parametric bootstrap rather than quasi-Bayesian Monte Carlo simulation tocompute confidence intervals and our results are not affected. The results are available from theauthors on request.)

This procedure can be used for any parametric specification and, as explained in Imai et al.(2010a), it is based on the following steps.

(a) Fit the models for the observed outcome and mediator variables.(b) Simulate the model parameters from their sampling distribution.(c) Repeat the following three steps:

(i) simulate the potential values of the mediator,(ii) simulate the potential outcomes given the simulated values of the mediator and(iii) compute the causal mediation effects.

(d) Compute summary statistics such as point estimates and confidence intervals.

It is important to acknowledge here that in our specific context there might be potential un-observed variables that confound the outcome and mediator relationship even after controllingfor a rich set of information, such as that included in our EU SILC data. In this case, the SIassumption will no longer be satisfied, and the ACMET and ADET will not be identified. Toquantify the extent to which our empirical findings are robust to the existence of unobservedconfounders we provide in Section 8.2 a sensitivity analysis. Moreover, because of the long timeperiod between exposure to poverty, the educational attainment and adult outcomes we cannotcompletely disregard the possibility of potential intermediate confounders that influence boththe level of education and the adult outcomes and are themselves influenced by the exposure topoverty in childhood. The presence of such unobservable confounders would bias our estimatesof the indirect effect. In Section 8.2 we discuss in more detail the implication of it for our results.

To conclude this part, it is important to mention that several other ways to conceptualizethe mediatory role of an intermediate variable in the treatment–outcome relationship have beenproposed in the causal inference literature, and the recent increase in the number of methodsdeveloped is particularly noticeable. (See Huber et al. (2016) for an analysis of the finite sam-ple properties of different classes of parametric and semiparametric estimators under sequen-tial conditional independence assumptions and Linden and Karlson (2013) for a comparison


of the methods applied to the JOBS II data set.) These methods cover semiparametric and non-parametric estimation procedures (Imai et al., 2010b; Pearl, 2001a, b; Hafeman and Schwartz,2009), matching based on the propensity score (Hill et al., 2003), weighting procedures (Petersonet al., 2006; VanderWeele, 2009; Hong, 2010), the principal stratification approach (Frangakisand Rubin, 2002; Jo, 2008; Jo et al., 2011) and the g-computation-based algorithm (Robinsand Greenland, 1992). As emphasized by the recent work of Linden and Karlson (2013), manyof these methodologies are conceptually interconnected, or serve as a basis for the extensionof other techniques. For example, the PS (Rosenbaum and Rubin, 1983) performs as a way ofcoping with the SI assumption (refer to equations (2) and (3) for a formal definition of thisassumption), in either a separate procedure (Hill et al., 2003) or as a basis for weighting andprincipal stratification (Peterson et al., 2006; VanderWeele, 2009; Hong, 2010; Jo et al., 2011).In particular, principal stratification defines causal effects by comparing individuals with thesame potential values of the post-treatment variable under each of the treatment statuses. Theuse of principal stratification enables the introduction of direct versus indirect effects, analysingthe notion of causality when controlling for post-treatment variables (Mealli and Rubin, 2003;Rubin, 2004). Flores and Flores-Lagunes (2009) studied more in detail the relationship of theconcept of direct versus indirect effects with respect to the total average treatment effect, andthey formally discussed the identification and estimation of causal mechanisms and net effectsunder various assumptions. More recently, Huber (2014) showed the identification of causalmechanisms of a binary treatment variable under the unconfoundedness assumption, basinghis estimation strategy on inverse weighting.

The advantage of the approach that we chose is that it allows researchers to uncover andquantify the causal pathway through which the treatment affects the outcome. Conversely, thisis of course not possible when considering only controlled direct effects, which would referto the effect of growing up in poverty keeping the level of education fixed for all individuals.Similarly, principal-stratification-based methods do not enable us to decompose the total effectinto direct and indirect components and thus to estimate the relative contribution of a particularpathway. They focus on local causal effects for specific subpopulations (principal strata) andprovide either direct causal effects of the treatment on the primary outcome of interest in thoseprincipal strata where the intermediate variable is unaffected by the treatment (dissociativeeffects) or a combination of direct and indirect effects in principal strata where the mediator isaffected by the treatment (associative effects). (Refer to VanderWeele (2008) for more details onthe relationships between principal stratification and direct and indirect effects.)

3. Data

The analysis is based on data from the EU SILC, which provides comparable, cross-sectionaldata on income, poverty, social exclusion and living conditions in the EU. (Refer to chapter 2in Atkinson et al. (2017) for a detailed description of this database.) For the specific purposeof this paper we use the module on intergenerational transmission of 2011, where retrospectivequestions about parental characteristics (such as education, age and occupation) referring tothe period in which the interviewee was a young teenager (between the age of 14 and 16 years)were asked to each household member aged over 24 and less than 66 years. This module was alsoasked in 2005 but, given that the questions related to our main treatment are not comparableand the 2011 module provides more background variables on the parents, we decided here tofocus on the last; for a preliminary analysis of the 2005 module refer to Bellani and Bia (2017).We define an individual as experiencing poverty while growing up if she or he reported thefinancial situation of the household as very bad or bad in her or his early adolescence. (As a


check of robustness we use a different definition of childhood poverty that is based on the abilityof making ends meet. The results are presented in Section 8.)

We are aware that individuals may suffer from retrospective recollection bias. Recall bias issaid to occur when accuracy of recall is different by outcome. A recall bias would be presentif there are selective preconceptions between groups of individuals (rich or poor, or more orless educated) about the association between having experienced financial problems duringchildhood and the outcome of interest (income and education). In our context we argue that thetype of questions that are asked are less affected by this problem compared, for example, witha direct question on the level of income in the household during the same period. In supportof our argument on the validity of this measure, previous studies in the fields of medicine andpsychology have found that retrospective recall in adulthood of serious negative experiences inchildhood is sufficiently valid, and it is more likely to have significant under-reporting than ‘falsepositive’ reporting (see Hardt and Rutter (2004) for a review). In particular, closer to our setting,results from a test for concordance between siblings that was performed by Robins et al. (1985)showed that young children are likely to have a general concept of whether the family was rich orpoor, and, more importantly for the validity of treatment–control group analysis, they found nodifference in the sibling agreements between groups of patients with alcoholism or depression,and a control group free of psychiatric disorder, suggesting that interviews requiring recall ofchildhood environment may be reasonably valid. (Refer also to Akerlof and Yellen (1985) andJurges (2007) for examples of the use of retrospective questions in the context of occupationalmobility and their accuracy.)

We use as predictors of the probability of experiencing poverty in growing up the individuals’country of residence and of birth, gender, year and quarter of birth, single parenthood, year andcountry of birth, highest level of education, main occupation of the father and of the mother,tenancy status and number of adults, children and people in the labour force in the household.

The outcomes in adulthood that we are interested in are the logarithm of the equivalizedincome, and the risk of poverty (defined as having an equivalized income that is lower than 60%of the median in his or her country in that year). We use equivalized disposable income that is thetotal income of a household, after tax and other deductions, i.e. available for spending or saving,divided by the number of household members converted into equalized adults; household mem-bers are equalized by using the so-called modified Organisation for Economic Co-operation andDevelopment equivalence scale. As already mentioned at the end of Section 2, as intermediateoutcome we analyse the probability of having completed at least secondary education. We fur-ther restrict our sample to the individuals of working age between 35 and 55 years to maintaina degree of homogeneity in the period of the life cycle in which the outcomes of interest aremeasured. (See for example Chetty et al. (2014) who showed that life cycle bias in measure ofintergenerational mobility is negligible when the child income is measured after age 32 years.)

As presented in Tables 1, 2, 3 and 4, we are left with a sample of more than 100000 individuals,belonging to the EU’s 28 countries (refer to Table A2.1 in the on-line appendix for a list of thecountries included in our study), of which 10% reported having experienced poverty growingup. Among those we have on average more first-generation migrants (11% against 8%), morechildren of single parents (8% versus 3%) and more children with mother out of the labourforce (more than 50% against 40%). The most sizable significant difference between these twogroups of children can be found with respect to parental education, the poor being around 23percentage points more likely to have at maximum a primary education degree. This difference iseven more pronounced once we look at individuals who have less than secondary education, ofwhom almost 95% and 90% have respectively a mother and father with only a primary educationdegree, and 30 and 35 percentage points respectively more than in the more educated group.


Table 1. Descriptive statistics–continuous variables

Characteristic Mean Standard Minimum Maximumdeviation

Child characteristicsYear of birth 1965.8 5.97 1956 1976

Parents’ characteristicsYear of birth of father 1935.9 8.57 1907 1957Year of birth of mother 1938.9 8.26 1913 1959

Household characteristicsNumber of adults in household 2.51 0.97 1 7Number of children in household 2.31 1.22 0 7Number of people in work 1.85 0.88 0 6

OutcomeIncome 4.02 0.41 0.40 5.03

Observations 108747

Also remarkable is the higher difference in maternal employment; the individuals with less thansecondary education are around 24 percentage points more likely to have a mother out of thelabour force (almost 60%). Interestingly there is not a significant difference in parental migrationstatus. In Table A2.2 in the on-line appendix section A.2 we provide the results of a test on theaverage differences in the relevant outputs between our sample and the sample of individualsin the same range of age with missing values of our treatment variable. We can note that thereis no difference in their risk of poverty, but the individuals in our sample are on average richerand slightly less educated.

To give a better sense of how representative our sample is with respect to the total population,we can look at the average risk of poverty in 2011 in the 27 EU countries, as reported by Eurostat.This was 24.2% of the total population. If we then focus on people between 25 and 54 years old,closer to our sample of interest, this percentage is 22.9% of the total population. Knowing thataround 42% of the 27 EU countries population is in this range of age, whereas 29% is in therange that we are interested in, we can easily see that, if the risk of poverty was uniform in thisage group, around 15% of our adult population should be poor to be in line with the officialdata. Our sample shows a slightly lower risk of poverty, 14%, and it could be credibly arguedthat this is due to the higher risk of poverty in the age group between 25 and 29 years (25.5% ofthe overall population), which is excluded by construction from our sample.

4. Effect of experiencing poverty on adult outcomes

As a first step in the analysis we estimate by means of a probit model each individual’s PS, i.e. hisor her probability of being poor in childhood given the observed pretreatment characteristicsthat were introduced in the previous section. In our unmatched sample the PS has a mean valueof 0.11, a median of 0.08 and a standard deviation of 0.1. (For the results of this first step referto column (1) of Table A2.4 in the on-line appendix A.2. As a check of robustness we havebeen estimating this probability also by using interaction terms; the results are presented inSection 8.) As already explained in Section 2.1, the PS is a balancing score (Rosenbaum and


Table 2. Descriptive statistics—dummies or categorical variables

Characteristic Frequency % Cumulative %

Child characteristicsQuarter of birth

1 23265 25.74 25.742 23495 25.99 51.743 22847 25.28 77.014 20776 22.99 100.00

Male0 57666 53.03 53.031 51081 46.97 100.00

Country of birth†EU 3729 3.43 3.43LOC 99603 91.62 95.05OTH 5378 4.95 100.00

Parents’ characteristicsFather not born in country of residence

0 97128 89.32 89.321 11619 10.68 100.00

Mother not born in country of residence0 97295 89.47 89.471 11452 10.53 100.00

Father primary education0 40795 37.51 37.511 67952 62.49 100.00

Mother primary education0 33739 31.03 31.031 75008 68.97 100.00

Father self-employed0 89053 81.89 81.891 19694 18.11 100.00

Father employee0 21706 19.96 19.961 87041 80.04 100.00

Mother self-employed0 97425 89.59 89.591 11322 10.41 100.00

Mother employee0 55843 51.35 51.351 52904 48.65 100.00

Household characteristicsSingle parent

0 104851 96.42 96.421 3896 3.58 100.00

Tenancy status: owner0 29714 27.38 27.381 78804 72.62 100.00

TreatmentNon-poor as a child 97515 89.81 89.81Poor as a child 11063 10.19 100.00

OutcomeAt least secondary education

0 22302 20.66 20.661 85651 79.34 100.00

†LOC, country of the survey; OTH, non-EU country.


Table 3. Test on the control variables means by treatment and mediator status†

Characteristic Results for poor Results for secondary education

No Yes Difference No Yes Difference

Year of birth 1965.92 1964.48 1.44‡ 1964.95 1965.98 −1:04‡Number of adults in household 2.48 2.70 −0:21‡ 2.74 2.44 0.29‡Number of children in household 2.25 2.86 −0:60‡ 2.65 2.23 0.42‡Number of people in work 1.87 1.77 0.10‡ 1.87 1.85 0.02‡Year of birth of father 1936.08 1934.31 1.77‡ 1934.47 1936.27 −1:81‡Year of birth of mother 1939.14 1937.24 1.90‡ 1937.48 1939.33 −1:85‡Male 0.47 0.48 −0:01 0.47 0.47 0.00Single parent 0.03 0.08 −0:05‡ 0.03 0.04 −0:01‡Father not born in country of residence 0.10 0.13 −0:03‡ 0.11 0.11 0.00Mother not born in country of residence 0.10 0.13 −0:03‡ 0.11 0.10 0.00Father primary education 0.60 0.83 −0:23‡ 0.90 0.55 0.35‡Mother primary education 0.67 0.88 −0:22‡ 0.94 0.62 0.31‡Tenancy status=: =owner 0.74 0.61 0.12‡ 0.69 0.74 −0:05‡Father employee 0.81 0.73 0.08‡ 0.73 0.82 −0:09‡Father self-employed 0.18 0.22 −0:04‡ 0.24 0.16 0.08‡Mother employee 0.50 0.33 0.17‡ 0.28 0.54 −0:27‡Mother self-employed 0.10 0.14 −0:04‡ 0.13 0.10 0.03‡

†The table reports the results of a T -test for continuous variables and a proportion test for the binary variables.‡p< 0:001.

Table 4. Test on the control variables heterogeneity by treatment and mediator status

Results for poor Results for secondary education

No Yes No Yes

Quarter of birth1 25.64 26.58 26.72 25.452 26.09 25.25 25.33 26.213 25.31 24.92 24.55 25.484 22.96 23.25 23.39 22.85

Pearson χ2 6.051 21.16p-value 0.109 0.0000974

Country of birth†EU 3.37 3.96 4.19 3.18LOC 91.91 88.97 90.02 92.15OTH 4.71 7.07 5.79 4.66

Pearson χ2 130.7 107.8p-value 4:09×10−29 3:94×10−24

†LOC, country of the survey; OTH, non-EU country.

Rubin, 1983), i.e. within strata with the same value of p.X/ the probability that T ={0, 1} (beingpoor or not) does not depend on the value of X. This balancing property, combined with theunconfoundedness assumption, implies that, for a given PS, exposure to a treatment status israndom and therefore treated and control units should be on average ‘similar’ conditionally onobservable pretreatment characteristics. As a result, to be effective, PS-based methods should


−40 −20 0 20 40 60Standardized % bias across covariates

0 .2 .4 .6 .8 1Propensity Score

(a)

(b)

Fig. 1. (a) Standardized bias ( , unmatched; , matched) and (b) common support ( , untreated; ,treated)

balance characteristics across treatment groups. The extent to which this has been achieved canbe explored by comparing balance in the covariates before and after adjusting for the estimatedPS. Fig. 1(a) provides the standardized bias (as a percentage) for unmatched and matched units,showing a huge improvement in the balancing property when adjusting for the PS, with a biasalways around 0. The reduction of bias due to matching is computed as BR=100.1−BM=B0/


where BM is the standardized bias after matching,

BM = 100.xMC − xMT/√{.S2MC +S2

MT/=2} ,

and B0 is the standardized bias before matching,

B0 = 100.x0C − x0T/√{.S20C +S2

0T/=2} ,

where subscript M denotes after matching and 0 denotes before matching. Another importantrequirement for identification is given by the common support, which ensures that for eachtreated unit there are control units with the same observables. In the matched sample, thecomparison of baseline covariates may be complemented by comparing the distribution of theestimated PS between treated and controls, as shown in Fig. 1(b).

As a second step, we apply single-nearest-neighbour matching to remove bias that is associatedwith differences in covariates (to note that not only the mean value of the difference betweenthe PS of the treated individual and the PS of his or her nearest neighbour is 0 but also it is 0 for99% of our data, reaching a maximum value at 0.03) and estimate the effects of being poor onadulthood outcome, primarily equivalent income and being at risk of poverty. (In our analysiswe use the psmatch2 Stata package (Leuven and Sianesi, 2003).)

To check the robustness of our results, we perform also a doubly robust estimation ofour treatment effect, which combines the entropy balancing method that was proposed byHainmueller (2012) with a least squares (or probit) regression of the outcome on the treatmentvariable. This balancing method constructs a weight for each control observation such thatthe sample moments of observed covariates are identical between the treatment and weightedcontrol groups. Here we impose balance on the second moments of the covariate distributions.(See Table A2.5 in the on-line appendix A.2 for the descriptive statistics of the sample pre- andpost-reweighting.) The marginal effect of the treatment is our doubly robust estimate of theaverage treatment effect. The results presented are the average marginal effect of the treatmentvariable, given by the ordinary least squares coefficient estimation for the continuous variablelog-income, and the average of the marginal effects calculated from the probit estimates for theprobability of being poor in adulthood and to attain secondary education.

Results are presented in Table 5. The results on income show a substantial decrease inequivalized income in adulthood due to exposure to poverty in childhood, of around 5%,

Table 5. ATT†

Main outcomes Intermediateoutcome, education

Income (%) Poverty (percentage points)(percentage points)

PS matching −5:7 5.7 −11:9[−7:1,−4:3] [4.3, 7.1] [−13:4,−10:4]

Doubly robust −4:9 5.2 −11:3estimation [−5:4,−4:4] [4.4, 6.0] [−12:2,−10:4]

Number 108355 108355 107566

†95% confidence intervals are in square brackets.


equal to an average loss of around C764, which is half of the average gross monthly earn-ings of people with less than a secondary degree in 2010, in the 28 EU countries (source:http://ec.europa.eu/eurostat/web/labor-market/earnings/database). Wealso find a significant increase in the probability of falling into poverty of 5 percentage points,which means that on average poor children are 1.5 times more at risk of poverty in adulthoodthan are the non-poor.

Regarding our intermediate outcome of interest, our results show an average decrease of 11percentage points in the probability of completing at least secondary education, which meansthat on average the children who did not experience poverty are 1.8 times more likely to haveat least a secondary education degree. The estimates are overall very close between the twomethods, both in magnitude and in significance.

5. Distributional effect of childhood poverty and education

After having shown that childhood poverty has indeed a relevant detrimental effect on eco-nomic outcomes in adulthood, a fundamental question left to answer to be able to make moreeffective policy recommendations regards the channels through which being raised in a poorfamily affects an individual’s economic and social status as an adult. To do so, as a first step inthis direction, we begin by exploring this effect more in detail, presenting and comparing thedistributions of income in adulthood of the children belonging to different groups. First, weconsider the whole sample and compare the distribution of individuals’ income in both the poorand the non-poor group of children without controlling for their probability of experiencingpoverty in childhood. Looking at Fig. 2(a) we can note that not surprisingly the distributionof incomes of the non-poor children first-order stochastic dominates the other, thus imply-ing higher social welfare in a hypothetical society in which no-one experiences poverty as achild than in a society in which childhood poverty is common. (Stochastic dominance is a par-tial ordering between random variables. An income distribution F is said to have first-orderstochastic dominance over G if, for any income x, F gives at least as high a probability ofreceiving at least x as does G, and, for some x, F gives a higher probability of receiving atleast x. Refer to Atkinson (1970) for a formal analysis of the link between stochastic domi-nance and welfare.) More interestingly, when we look at the matched sample, i.e. the samplewhere the characteristics of the children are matched such as not to differ significantly be-tween the poor and non-poor, this result still holds, suggesting that, even when we controlfor the observable characteristics which are associated with experiencing poverty in the firstplace, the effect of parental financial difficulties does predict lower welfare achievement in thenext generation (see Fig. 2(b)). (In Fig. A2.1 in the on-line appendix A.2 the bottom and thetop of the distribution are plotted for both cases to show that the cumulative densities do notcross.)

The Kolmogorov–Smirnov test shows in fact that there are significant differences in the incomedistribution for these two groups. The largest difference between the distribution functions isalmost 10% in the unmatched sample which reduces to 6% in the matched sample, alwaysremaining highly significant. Focusing on the risk of poverty, we can see that, not only theincidence of poverty is higher, around 21% against 13% (15% in the matched sample) for thenon-poor children, but also its intensity (see Figs 2(c) and 2(d)). (Refer to Jenkins and Lambert(1997) for more details on the ‘three Is of poverty’ (TIP) curves.) Finally, it is worth notingthat the effect of childhood poverty seems to decrease the income that is achievable by the twogroups of children but not to impact how this is distributed within each group (see Figs 2(e) and2(f)).


0.0

2.0

4.0

6C

umul

ativ

e no

rmal

ized

pov

erty

gap

0 .2 .4 .6 .8 1Cumulative population share, p

(a) (b)

(c) (d)

(e) (f)

Fig. 2. Distributional graphs ( , non-poor; , poor; , equality) (the ‘three Is of poverty’curves plot the cumulative normalized poverty gaps per capita of the population, from the biggest to thesmallest, and provide graphical summaries of the incidence (length), intensity (height) and inequality (cur-vature) dimensions of poverty): (a) cumulative distribution function (full sample); (b) cumulative distributionfunction (matched sample); (c) three Is of poverty curves (full sample); (d) three Is of poverty curves (matchedsample); (e) Lorenz curves (full sample); (f) Lorenz curves (matched sample)

Performing the same type of exercise, but focusing on the effect of secondary education,we find that the individuals having only less than secondary education have consistently lowerincome at any percentile of the distribution reaching a maximum of 15% difference with respectto the more educated individuals, a difference which is also in this case highly significant, as wecan see from Figs 3(a) and 3(b). When we look at the risk of poverty by educational level thesedifferences are more pronounced, in fact not only is the incidence of poverty higher, around 27%with respect to 13%, but the difference in its intensity is of particular significance as the averageincome of the poor drops at only 54% of the median income. Inequality, not so surprisingly, is


slightly higher for the more educated group (see Figs 3(c) and 3(d), for poverty and inequalityrespectively). To conclude this part, before moving to the mediation analysis on the role ofeducation, we show in Fig. 4 the distribution of incomes of our sample divided into four groups,characterized not only by growing up poor, but also by their own subsequent educational choice.As we were expecting, the income distribution of the poor children without a secondary degreeis dominated by all the others but, what is less expected, is that education does not seem tobe able to offset the effect of having experienced poverty in childhood for the bottom 40% ofthe income distribution, whereas it does so, although never completely, along the rest of thedistribution. If we then look at the risk of poverty, we can note that acquiring a secondary degreereduces the incidence and intensity of poverty quite drastically for the poor children, of whomonly 13% are poor, compared with 11% of the children who did not experience poverty, whereasthey are consistently less likely to fall into poverty than the children who did not experiencepoverty, but who did not obtain higher education, for whom the incidence of poverty is instead23%.

6. The mediating role of education

Our analysis has shown so far that the individuals who experience poverty growing up aresignificantly less likely to achieve secondary education and the individuals who have less thana secondary degree have consistently lower income. Therefore, it seems worth focusing at firston the mediating role of human capital accumulation. To do so, as introduced in Section 2.2,in this last part of our analysis we implement a causal mediation analysis to uncover the role ofeducation in the intergenerational poverty transmission. In particular, we study whether beingpoor as a child led to substantially lower levels of income later in life by decreasing the likelihoodof at least graduating from high school. Our mediating variable is therefore secondary education,whereas the outcomes of interest are the level of income and the risk of poverty in adulthood.First, on the basis of the fitted mediator model equation (4), we simulate secondary educationattainment levels for the children who experienced and who did not experience poverty. Next,we use the outcome model equation (5) to simulate potential outcomes. (For the binary outcomerisk of poverty we use a probit model.) The ACMET is computed as the average difference indisposable income among those who experienced poverty in childhood across the levels of highschool graduation for the case of having and not having experienced poverty. Finally, we repeatthe two simulation steps 1000 times to estimate the standard errors.

Table 6 shows the estimated average causal mediation effect, average direct effect and averagetotal effect on the treated. The total effect is equal to −5% of equivalized income and 5.6percentage points risk of being poor. The indirect effect, conveyed through the educationallevel, is estimated to be around −1:6% and 1:8 percentage points, suggesting that a significantportion of the average total effect is attributable to a decrease in the probability of graduatingfrom high school. Hence, growing up poor as a child induces a lower level of education whichaccounts for more than 30% of the total effect on adult income and on the risk of poverty. Theeffect of education does not show any significant heterogeneous effect by poverty status on bothoutcomes.

Although education is a major contributor, the part of the transmission mechanisms thatremains unexplained is substantial (around 3:5% of income and 3:8 percentage points in therisk of poverty). As a first possible explanation for it, we should remind the reader that, becauseof data availability, our measure of education gives us information only on the degree completedbut does not allow us to distinguish between different quality of those degrees. Moreover, theextent to which poverty status is transmitted from parents to their children also depends on


(a)

(b)

(c)

(d)

Fig

.3.

Dis

trib

utio

nal

grap

hs—

educ

atio

n(

,no

n-se

cond

ary

educ

atio

n;,

seco

ndar

yed

ucat

ion;

,eq

ualit

y):

(a)

cum

ulat

ive

dist

ribut

ion

func

tion

(mat

ched

sam

ple)

;(b)

gene

raliz

edLo

renz

curv

es(m

atch

edsa

mpl

e);(

c)‘th

ree

Isof

pove

rty’

curv

es(m

atch

edsa

mpl

e)(t

hecu

rves

plot

the

cum

ulat

ive

norm

aliz

edpo

vert

yga

pspe

rca

pita

ofth

epo

pula

tion,

from

the

bigg

estt

oth

esm

alle

st,a

ndpr

ovid

egr

aphi

cals

umm

arie

sof

the

inci

denc

e(le

ngth

),in

tens

ity(h

eigh

t)an

din

equa

lity

(cur

vatu

re)

dim

ensi

ons

ofpo

vert

y);(

d)Lo

renz

curv

es(m

atch

edsa

mpl

e)


(a)

(b)

(c)

(d)

Fig

.4.

Dis

trib

utio

nalg

raph

s—po

vert

yan

ded

ucat

ion

(,n

on-s

econ

dary

educ

atio

n(p

oor)

;,s

econ

dary

educ

atio

n(p

oor)

;,n

on-s

econ

dary

educ

atio

n(n

on-p

oor)

;,

seco

ndar

yed

ucat

ion

(non

-poo

r)):

(a)

cum

ulat

ive

dist

ribut

ion

func

tion

(mat

ched

sam

ple)

;(b

)ge

nera

lized

Lore

nzcu

rves

(mat

ched

sam

ple)

;(c

)‘th

ree

Isof

pove

rty’

curv

es(m

atch

edsa

mpl

e)(t

hecu

rves

plot

the

cum

ulat

ive

norm

aliz

edpo

vert

yga

pspe

rca

pita

ofth

epo

pula

-tio

n,fr

omth

ebi

gges

tto

the

smal

lest

,and

prov

ide

grap

hica

lsum

mar

ies

ofth

ein

cide

nce

(leng

th),

inte

nsity

(hei

ght)

and

ineq

ualit

y(c

urva

ture

)di

men

sion

sof

pove

rty)

;(d)

Lore

nzcu

rves

(mat

ched

sam

ple)


Table 6. Mediation causal analysis on the treated†

Income (%) Poverty(percentage points)

Total effect −5:1 5.6[−5:8, −4:5] [4.6, 6.6]

AverageCausal mediation effect −1:6 1.8

[−1:8,−1:5] [1.6, 2.1]Direct effect −3:5 3.8

[−4:1,−2:9] [2.7, 4.7]% of total effect mediated 32 33

[29, 37] [28, 40]

PoorCausal mediation effect −1:7 2.0

[−2:0,−1:5] [1.7, 2.2]Direct effect −3:6 3.9


[30, 39] [30, 43]

Non-poorCausal mediation effect −1:5 1.7

[−1:7,−1:4] [1.4, 2.0]Direct effect −3:4 3.6


[27, 34] [26, 37]


the combined effect of the investment in education and the rate of return on these investments.The extent to which education is publicly financed and rewarded in the labour market alsomatters and it is in turn affected by the way that both the society and the market operate in theenvironment where the children are raised.

7. Heterogeneous effect by welfare regime

To analyse possible heterogeneous effects of education due to different welfare regimes, in thissection we provide the results for both the direct and the indirect effect of poverty by subsampleof countries, defined following the well-known and established classification of Esping-Andersen(1990) with the addition of the central and eastern European countries (Hemerijck, 2012). (SeeTable A2.6 in the on-line appendix A.2 for the categorization of the countries.)

As presented in Table 7, Scandinavian and Anglo-Saxon countries do not show significantcausal effects of childhood poverty on later life outcomes, whereas these effects are most pro-nounced for Mediterranean countries, which show a higher loss of income (6.5% versus 2.5%and 2.9%) and lower probability of attaining at least secondary education (almost 16 percentagepoints versus 7 percentage points and 10 percentage points respectively) than both continentaland central and eastern European countries. It is quite well known that Scandinavian countrieshave a robust dual-earner model with universal income security and a strong incentive for highemployment participation of females. Therefore, the results for this group are not so surprising,


Table 7. Average treatment effect on the treated by welfare state regime†



Continental −2:5 1.1 −6:8[−4:2,−0:8] [−1:2, 3.4] [−9:8,−3:8]

Number 24733 24733 24648Mediterranean −6:5 6.6 −15:8

[−8:2,−4:8] [4.1, 9.1] [−19:0,−12:5]Number 43618 43618 43343Social Democratic −3:9 3.3 −3:4

[−8:8, 0.9] [−2:9, 9.6] [−10:4, 3.6]Number 4282 4282 4252Central and eastern Europe −2:9 5.1 −9:8

[−4:7,−1:1] [3.2, 7.1] [−11:8,−7:7]Number 43618 43618 43343Liberal −0:9 0.3 −5:4

[−5:4, 3.6] [−7:0, 7.6] [−12:1, 1.3]Number of observations 4623 4623 4509


suggesting that the average difference in income, poverty risk and level of education existingbetween poor and non-poor children are substantially driven by the factors that are responsiblefor childhood poverty in the first place. (See Table A2.7 in the on-line appendix for these averagedifferences by welfare state regime.) The results for the liberal countries, the UK and Ireland,seem at first glance at odds with previous results in the literature (in particular regarding the UK)but we ought to remember that previous contributions looked either at general male earningsmobility (Blanden et al., 2007) or at a particular childhood circumstance, as single motherhoodor joblessness (Ermisch et al., 2004), both of which in this contribution are controlled for. Blan-den and Gibbons (2006) looked explicitly at transmission of poverty but only at associations;they found that 19% of men who experienced poverty as teenagers in the 1970s were in povertyin their 30s, compared with only 10% of those who were not poor in their teens, with the sizeof the transmission similar at ages 33 and 42 years. However, when they compared people whohad a similar teenage family background in other ways apart from family poverty, they foundthat the direct effect of teenage poverty becomes small and statistically weak, showing that thepersistence of poverty from teenage through to middle age can be fully explained by other as-pects of teenage disadvantage, in particular whether one had low educated or non-employedparents.

When we analyse more in detail the mechanisms behind the significant effects in the con-tinental, Mediterranean and the central and eastern European countries, we can note that,in continental Europe, the direct effect is never statistically significant. Moreover, althoughMediterranean countries exhibit on average a higher effect of childhood poverty, the roleof secondary education in this process does not seem to differ between those and the cen-tral and eastern European countries, which in fact display almost the same total average ef-fect and the same composition between direct and indirect effects. Results are presented inTables 8 and 9.


Table 8. Mediation causal analysis on the treated for income by welfare state regime†

Results for Results for Results forcontinental (%) Mediterranean (%) central and

eastern Europe (%)

Total effect −1:8 −5:6 −4:8[−3:0,−0:3] [−6:9,−4:2] [−5:9,−3:8]

AverageCausal mediation effect −0:7 −2:4 −1:8

[−0:9,−0:4] [−2:8,−2:0] [−2:2,−1:5]Direct effect −1:1 −3:2 −3:0

[−2:3, 0.4] [−4:5, −1:8] [−4:0, −2:0]% of total effect mediated 37 43 38

[21, 115] [21, 56] [31, 48]

PoorCausal mediation effect −0:6 −2:4 −1:9

[−0:8,−0:3] [−2:8,−1:9] [−2:2,−1:5]Direct effect −1:0 −3:2 −3:0

[−2:3, 0.4] [−4:5,−1:9] [−4:1,−1:9]% of total effect mediated 32 42 39

[18, 97] [34, 56] [32, 49]

Non-poorCausal mediation effect −0:8 −2:4 −1:8

[−1:0,−0:5] [−2:8,−2:0] [−2:1,−1:4]Direct effect −1:2 −3:2 −2:9

[−2:4, 0.3] [−4:6,−1:7] [−3:9,−2:0]% of total effect mediated 43 44 38

[24, 132] [35, 57] [31, 47]


8. Robustness and sensitivity analysis

8.1. Sensitivity analysis on the average treatment on the treatedOne of the central assumptions of our analysis is that being poor in childhood can be consideredas good as random, conditionally on the set of covariates X. This implies that the outcome ofnon-poor children can be used to estimate the counterfactual outcome of poor children if theywere not experiencing poverty in childhood. The plausibility of this assumption heavily relies onthe quality and amount of information contained in X. (Refer to Section 2.1 for a more detaileddescription of the assumptions that are made.) The validity of this assumption is not directlytestable, since the data are completely uninformative about the distribution of the potentialoutcomes, but its credibility can be supported or rejected by additional sensitivity analysis.

Our analysis would be biased if we were to believe that, even conditionally on all the covariateswe can observe (parental education and occupation, child own age, sex, year and country ofbirth and number of siblings, etc.), being poor in childhood would be linked to some unobservedparental genetic ability which would not only influence the probability that the parents fallinto poverty (being treated) but also the child’s potential outcome as a result of the genetictransmission of ability.

As a first test of the robustness of our results we follow the recent contribution by Oster(2017) and we evaluate the robustness to an omitted variable bias under the assumption that


Table 9. Mediation causal analysis on the treated for poverty by welfare state regime†

Results Results Resultsfor continental for Mediterranean for central and

(percentage points) (percentage points) eastern Europe(percentage points)

Total effect 2.1 5.3 6.5[−0:1, 4.4] [3.6, 7.0] [4.8, 8.2]

AverageCausal mediation effect 0.9 2.1 2.4

[0.5, 1.3] [1.6, 2.6] [1.9, 3.0]Direct effect 1.2 3.3 4.1

[−1:1, 3.7] [1.4, 5.1] [2.3, 5.7]% of total effect mediated 40 39 37

[−105, 231] [29, 57] [29, 51]

PoorCausal mediation effect 0.7 2.3 2.5

[0.4, 1.0] [1.8, 2.8] [2.0, 3.1]Direct effect 1.1 3.5 4.2


[−87, 192] [32, 62] [30,53]

Non-poorCausal mediation effect 1.0 1.9 2.3

[0.6, 1.6] [1.4, 2.4] [1.7, 3.0]Direct effect 1.4 3.1 4.0


[−123, 270] [26, 51] [28, 49]


the relationship between treatment and unobservables can be recovered from the relationshipbetween treatment and observables. Oster (2017) built on the work done by Altonji et al. (2005)and extended it by explicitly connecting the bias to coefficient stability, showing that it is nec-essary to take into account both coefficients and R2-movements in evaluating robustness toomitted variable bias. She developed a consistent, closed form estimator for omitted variablebias which enables calculation of a consistent estimate of the bias-adjusted treatment effectsunder the assumptions that, as in Altonji et al. (2005), the R2 from a hypothetical regressionof the outcome on treatment and both observed and unobserved controls is equal to 1 andthat there is proportional selection on observed and unobserved variables. Moreover, with thismethod, it is possible to calculate the degree of selection on unobservables relative to observ-ables which would be necessary to drive the effect to 0. (Refer to on-line appendix A.1 for aformal description of all the sensitivity procedures.) Under the above-mentioned assumptions,we estimate the coefficient of proportionality which moves the effect of being poor as a child onadult income towards 0 and find that the unobservables would need to be 20 times as importantas the observables to produce a null treatment effect. Moreover, we calculate the bias-adjustedeffect of childhood poverty on adult income under the assumption of equal selection and doubleselection on unobservables relative to observed controls and we find an effect of −0:050 in thefirst case and −0:048 in the second, confronted with an unadjusted effect of −0:052.


Secondly, we implement the sensitivity analysis that was developed by Rosenbaum (2002),which relies on a sensitivity parameter—Γ—that measures the degree of departures from theassumption of random assignment of the treatment. Γ is a multiplicative parameter whichrepresents how much two individuals, with the same pretreatment characteristics, may differin likelihood of receiving the treatment. Results for Rosenbaum’s bounds for the p-values fromWilcoxon’s signed rank test show that, for an increase between 1:1 and 1:2 in Γ, the lower boundof the p-value increases to a level above the usual 0:05 threshold. (In this part of our analysiswe use the program rbounds provided by Gangl (2004), which enables us to run sensitivityanalyses for continuous outcomes, whereas to deal with binary outcomes we use the modulethat was built by Becker and Caliendo (2007), mhbounds. The complete results can be found inthe on-line appendix A.2 in Tables A2.9, A2.10 and A2.11 for income, poverty and secondaryeducation respectively.) This means that if we were to take two children with the same observedcharacteristics, but one with parents with high ability and one with parents with low ability, thechild whose parents are of low ability type would have to have 1:2 times the risk of experiencingpoverty for our effect on adult income to become insignificant. As another way of looking atit, the effect of a dichotomous parental ability variable should be larger than the effect of anobservable variable like maternal employment and as large as the effect of parental migrationbackground. The results of the Mantel–Haenszel test for the binary outcome risk of poverty,under the assumption that we have overestimated the treatment effect, reveal that the confidenceinterval for the effect would include 0 if an unobserved variable, e.g. parents’ ability, caused theodds of experiencing poverty in childhood to be 1:4 times higher for the children of low abilityparents, conditional on all the observable pretreatment characteristics. The effect of secondaryeducation is the most robust as it would remain significant even if the unobserved variable wouldcause the odds of experiencing poverty in childhood to be 1:7 times higher. This effect shouldthus be more than the effect of migration status of the father and the presence of an extra childin the household and almost as much as maternal education.

Finally, to analyse the extent of the possible overestimation, we also follow the approach thatwas suggested by Ichino et al. (2008) and assume that the unobserved ability variable A canbe expressed as a binary variable taking value H , high, or L, low. We perform two simulationexercises. (For this sensitivity analysis we use the sensatt program that was developed byNannicini (2007).) In the first, the probability that A = 1 (high) in each of the four groupsdefined by the treatment status (poor as a child) and the outcome value (poverty in adulthood)(pij) are set so as to let our simulated parental ability A mimic the behaviour of parentaleducation variables, as their strong, although not perfect, positive correlation is well known inthe literature. (See among others Black et al. (2009), Anger and Heineck (2010) and Bjorklundet al. (2010).) In the second, a set of different pij is built to capture the characteristics of thispotential confounder that would drive the ATT estimates to 0 (killer confounder).

In Tables 10, 11 and 12 the results of these sensitivity checks are presented, for povertyand education. These results show that our findings are robust to the introduction of both‘calibrated’ and ‘killer’ confounders. To make our results invalid, the odds of experiencingpoverty in childhood need to be five or 4:2 times higher for high ability than for low abilityparents, and the odds of being at risk of poverty in adulthood almost 4:8 times higher and theodds of having less than secondary education almost 12:7 times higher for non-poor childrenof high ability parents than the odds of low ability parents. Moreover we can see that here thenegative effect of our calibrated parental ability on the odds of being poor as adults of non-poorchildren is lower than the negative effect of parental ability on the probability of experiencingpoverty in childhood, whereas this selection effect is lower than the effect on the odds of havingless than secondary education among non-poor children. This could be partly explained in light


Table 10. Sensitivity analysis: pij -values and odds ratio—poverty

Results for father’s Results for mother’s Results foreducation—calibrated education—calibrated killer

p11 0.11 0.07 0.36p10 0.19 0.13 0.26p01 0.27 0.22 0.20p00 0.42 0.35 0.05Outcome effect 0.52 0.53 4.76Selection effect 0.32 0.26 5.03

Table 11. Sensitivity analysis: pij -values and odds ratio—education

Results for father’s Results for mother’s Results foreducation—calibrated education—calibrated killer

p11 0.05 0.03 0.55p10 0.26 0.18 0.30p01 0.11 0.07 0.40p00 0.46 0.39 0.05Outcome effect 0.14 0.12 12.69Selection effect 0.36 0.30 4.22

Table 12. ATT†

Results Results Results Resultsfor baseline for father’s for mother’s for killer

(percentage points) education—calibrated education—calibrated (percentage points)(percentage points) (percentage points)

Poverty 5.2 3.4 3.7 −0:4[4.4, 6.0] [2.0, 4.8] [2.2, 5.1] [−1:8, 1.1]

Education 11.3 7.2 7.3 0.1[10.4, 12.2] [5.6, 8.8] [5.6, 9.0] [−2:0, 2.2]


of the way that we calibrated our unobserved parental ability, based on their level of education,which is known to have a direct effect on the education of the children. (See among othersPiopiunik (2014) and the references therein.) Finally, although the significance of the estimatedeffects is robust to the introduction of the calibrated unobserved ability, the size of the effects isrevised downwards, suggesting an upward bias in the estimates. In other terms, the sensitivityanalysis is telling us that the existence of a confounder behaving like maternal education mightaccount for nearly 30% of the baseline estimates for the risk of poverty. If we were to considerthis as a good approximation of the bias also on the effect on equivalized income, we wouldthen still have a significant effect of circa 4%.

To conclude this part, we address a couple of concerns that might arise regarding our definitionof childhood poverty and the specification that is used to estimate the probability of being poor


Table 13. ATT—making ends meet†



PS matching −3:4 3.4 −8:5[−4:4, −2:5] [2.3, 4.5] [−9:8, −7:2]

Doubly robust estimation −3:7 3.5 −8:5[−4:1,−3:3] [2.8, 4.2] [−9:3,−7:8]

Number 106448 106448 105679


Table 14. ATT (PS with interactions)†



All −4:2 4.5 −10:3[−5:3, −3:0] [3.2, 5.9] [−11:9, −8:7]

Education and employment −4:0 4.7 −10:9[−5:4, −2:5] [3.5, 5.9] [−12:5, −9:3]

Number 108255 108255 107466


in childhood. In Table 13 we show the estimates for a different definition of childhood povertybased on the ability of making ends meet. The results are consistent in terms of significance andonly slightly smaller in terms of point estimates, as might have been expected given that thosetwo variables are highly correlated (ρ=0:73) and the percentage of treated with this definitionis higher (around 17%). In Table 14 we provide the results when we estimate the probabilityof being poor as a child by using interaction terms between all the variables and the countrydummies and between education and employment status of the parents. Our average treatmenteffects are not significantly affected by different specifications of the model.

8.2. Sensitivity analysis on the average casual mediation effect on the treatedAs previously mentioned in Section 2, both the standard unconfoundedness assumption (as-sumption (2)) and the SI assumption (assumption (3)) rely on the quality and richness of thedata.

Despite the rich set of pretreatment characteristics, if there were unobserved confounders thataffect both the educational level and the income, the SI assumption will no longer be satisfied,and the ACMET and ADET will not be identified. As an example, pre-existing cognitive ornon-cognitive problems might reduce the likelihood of graduating from secondary school, aswell as the likelihood of higher income levels later in life.

To deal with this hypothetical violation of the SI assumption, we assess the role of unobservedconfounders via a sensitivity analysis. In the sensitivity analysis we shall assume that the error


terms in equation (5) and equation (4) follow a standard normal distribution and a normaldistribution with var.εi3/=σ2

3 for εi2 and εi3 respectively, εi2 ∼N .0, 1/ and εi3 ∼N .0, σ23/, and

we assume a bivariate normal distribution of the error terms, with mean 0 and covariance ρσ23,

where ρ is the correlation between the two error terms. This will enable us to derive mediationeffects as a function of ρ, estimating the ACMET under a series of ρ-values different from 0.

We first apply the analysis based on the estimated ρ-parameter and report the indirect effectsas a function of ρ. When ρ is 0, assumption (3) is satisfied, i.e. there is no correlation betweenthe error terms of the mediator and outcome models. Conversely, values of ρ that are differentfrom 0 lead to violations of the SI (Keele et al., 2015).

In our study, the ACMET equals 0 for ρ= 0:3, which means that the true mediated effectcould be 0 if there were a modest violation of the SI.

As the sensitivity parameter itself is rather difficult to interpret directly, we show here analternative approach, expressing the degree of sensitivity as a function of R2, i.e. the usualcoefficient of determination. In the presence of an omitted confounder Ui, the error term willbe a function of Ui and will be equal to εij = λjUi + ε′

ij, with j = 2, 3 for the mediator andoutcome model respectively, and λj representing the unknown coefficient for each equation.The sensitivity analysis is based on the proportion of original variance that is explained by theomitted confounder in the mediator and outcome model, equal to

R2M ≡{var.εi2/−var.ε′

i2/}=var.Mi/

and

R2Y ≡{var.εi3/−var.ε′

i3/}=var.Yi/:

In this setting, ρ is a function of the unexplained variances, proportions in the mediatorand outcome models. (R2Å

M ≡ 1 − var.ε′i2/=var.εi2/; R2Å

Y ≡ 1 − var.ε′i3/=var.εi3/.) The relation-

ship between the ACMET and the R2 can be expressed as the product of the mediating andoutcome variables’ R2, with ρ = sgn.λ2λ3/RÅ

MRÅY for the unexplained variances, and ρ =

Fig. 5. Sensitivity analysis: , ACMET D 0; , ACMET D �0.003; , ACMET D �0.005;, ACMET D 0.001


Table 15. Sensitivity results for the ACMET

ρ at which ACMET = 0 0.3R2Å

M R2ÅY at which ACMET = 0 0.09

˜R2

MR2Y at which ACMET = 0 0.02

sgn.λ2λ3/RMRY =√{.1−R2

M/.1−R2Y /} for the original variances (Hicks and Tingley, 2011).

(When the mediating or outcome variable is binary, the pseudo-R2 that was developed byMcKelvey and Zavoina (1975) is implemented.) To represent the results of this sensitivity anal-ysis we show how much of the observed variations in the mediating (R

2M) and outcome (R

2Y )

variables are explained by a potential unobserved confounder. In Fig. 5 these proportions arereported on the horizontal and vertical axes respectively. The dark curve represents the combina-tion of explained variations for which the ACMET is equal to 0. In particular, the true ACMETwould change sign when the product of the proportions is greater than 0:02. For example, wemight think that pre-existing cognitive and non-cognitive problems will turn into a decreasein both graduation rates and income level in adulthood. In this case, the true ACMET wouldbe 0 if these problems explained around 14% of the variances for both of these variables. Athigher values of both R

2Y and R

2M, the estimated causal mediation effect would be positive. For

example, we can say that our results will be insignificant if the unobserved confounder Ui wasas important in explaining the probability of acquiring at least a secondary education degreeand the household income in adulthood as maternal education, which in both our outcomeand mediator estimation has one of the most important effects in terms of magnitude. (Startingfrom the estimation of the mediator and the outcome model, we calculated the R

2M and R

2Y for

maternal education, which are 0.56 and 0.04 respectively.) We think that the existence of such aconfounder is quite unlikely given the number and type of variables that we can control for inour analysis and given that none of those alone would explain that much of the variance.

To sum up, as reported in Table 15, both R2Y and R

2M must be substantially higher for the

original conclusion to be changed, showing that negative mediation effects for equivalized adultincome is quite robust to deviations from the standard SI assumption of no unobserved con-founding factors.

Nevertheless, this sensitivity analysis does not address the possibility, which was mentionedearlier in Section 2.2, of the presence of unobserved intermediate confounders. If for exampleexperiencing poverty would increase the probability of developing a health condition, whichwould then impact negatively both the probability of graduating and the probability of obtaininga better paid job, our estimates of the indirect effect would be biased.

9. Concluding remarks

This paper examines the causal channels through which growing up poor affects an individual’seconomic outcomes as an adult. We employ a PS matching method under the assumption that,conditionally on observable pretreatment characteristics, growing up in poverty is independentof income level and the probability of being poor later in life. We also perform a doubly robustestimation of our treatment effect, implementing the entropy balancing method with a leastsquares (or probit) regression of our outcomes of interest on experiencing financial problems inchildhood. The richness of our data and a series of thorough sensitivity analyses augment thecredibility of our identifying assumptions.


Our analysis is based on the 2011 module on intergenerational transmission of EU SILCdata. Our results show that, on average, over the 28 European countries that were considered,growing up in financial distress leads to a significant 1.4 times higher probability of being atrisk of poverty in adulthood, and to a significant decrease of 4% in the adult equivalent income(circa C610 on average), in our most conservative estimates. Our results are particularly relevantfor Mediterranean and central and eastern European countries.

Our estimates are robust to the presence of unobservable variables bias. We find that, to driveour effects to 0,

(a) the explanatory power of the unobservables would need to be 20 times as important asthe power of the observables, following the method that was proposed by Oster (2017),

(b) parental unobservable ability would have to increase the parents’ probability of experi-encing financial problems by 20% (in our most restrictive case) even conditionally on allthe other observable parental characteristics, following the method that was developedby Rosenbaum (2002), and

(c) the odds of experiencing poverty would need to be almost five times higher for highability parents than for low ability parents and the odds of being at risk of poverty inadulthood almost five times higher for non-poor children of high ability parents than forthose of low ability parents, following the approach that was suggested by Ichino et al.(2008).

We also investigate the average effect on income later in life in more detail, looking at thecumulative densities of the distributions of children’s income in adulthood belonging to differentgroups. We find that, even after controlling for the probability of growing up poor as a child,the distribution of the non-poor children first-order stochastic dominates the distribution of thepoor children, implying higher social welfare in a hypothetical society where no-one experiencespoverty as a child against a society where childhood poverty is common. Achieving (at least) asecondary level of education does not completely overcome the detrimental effect of childhoodpoverty.

Moreover, experiencing poverty during childhood is more likely to translate into exclusionfrom further education, as the probability of completing at least secondary education is 1.5times higher for children who were not affected by financial problems. We find that educationplays a substantial role as intermediate variable in the causal effect estimation of childhoodpoverty on income and risk of poverty in adulthood, accounting for more than 30% of thetotal effect. This result is quite robust to the presence of unobserved confounding factors, butwe cannot completely disregard the possibility of unobserved intermediate confounders, whichcould result in an overestimation of this indirect effect of education.

Some policy implications can be derived from our study. Our analysis reinforces the needfor policies that are devoted to eliminate the source at the basis of this increased risk, reducingchildhood poverty. Our results are also in line with some previous studies which suggest that pro-gressive government spending on education can increase intergenerational mobility, offsettingparental suboptimal investment in education (Solon, 2004; Mayer and Lopoo, 2008).

Our research also shows the need for further studies that are devoted to the analysis of theother factors driving the effect of childhood poverty on adult incomes. Among others, it is worthhighlighting that parental poverty is likely to be related to lower levels of good health, nutritionand housing, all of which affect child development and thus future incomes. Furthermore,the home and social environment are where beliefs, attitudes and values are shaped and theseare likely to have effects on children’s future attitudes to work, health and family formation(Heckman et al., 2006). Reducing the stress and anxiety of the parents, targeting intensive


health, nutrition and care support for particularly deprived households or areas might also behighly desirable.

Acknowledgements

The authors thank the Joint Editor, the Associate Editor and two referees, Alfonso Flores-Lagunes, Alessandra Mattei, Andreas Peichl, Guido Schwerdt, Philippe Van Kerm, MichaelVogt, Amelie Wuppermann and the participants at seminars at the Institute of Labor Economicsand at the Luxembourg Institute of Socio-Economic Research, to the 6th Society for the Studyof Economic Inequality conference, at the workshop on ‘Public economics and inequality’ inBerlin and the workshop on ‘Uncovering casual mechanisms’ in Munich for comments on thisor earlier versions of the paper. This work has been supported by the second network for theanalysis of the EU SILC, funded by Eurostat. The European Commission bears no responsibilityfor the analyses and conclusions, which are solely those of the authors. In addition, Bellaniacknowledges financial support from an Aides a la Formation-Recherche grant (PDR 2011-1)from the Luxembourg Fonds National de la Recherche cofunded under the Marie Curie Actionsof the European Commission. Previous versions of this paper have been circulated under the title‘Intergenerational poverty transmission in Europe: the role of education’. The usual disclaimersapply.

References

Abadie, A. and Imbens, G. W. (2002) Simple and bias-corrected matching estimators for average treatment effects.Working Paper 283. National Bureau of Economic Research, Cambridge.

Abadie, A. and Imbens, G. W. (2011) Bias-corrected matching estimators for average treatment effects. J. Bus.Econ. Statist., 29, 1–11.

Acemoglu, D. and Pischke, J.-S. (2001) Changes in the wage structure, family income and childrens education.Eur. Econ. Rev., 45, 890–904.

Aizer, A., Eli, S., Ferrie, J., and Lleras-Muney, A. (2016) The long-run impact of cash transfers to poor families.Am. Econ. Rev., 106, 935–971.

Akerlof, G. and Yellen, J. (1985) Unemployment through the filter of memory. Q. J. Econ., 100, 747–773.Altonji, J. G., Elder, T. E. and Taber, C. R. (2005) Selection on observed and unobserved variables: assessing the

effectiveness of Catholic schools. J. Polit. Econ., 113, 151–184.Anger, S. and Heineck, G. (2010) Do smart parents raise smart children?: The intergenerational transmission of

cognitive abilities. J. Popln Econ., 23, 1255–1282.Atkinson, A. (1970) On the measurement of inequality. J. Econ. Theory, 2, 244–263.Atkinson, T., Guio, A.-C. and Marlier, E. (eds) (2017) Monitoring Social Inclusion in Europe. Luxembourg:

Eurostat Statistical.Becker, S. O. and Caliendo, M. (2007) Sensitivity analysis for average treatment effects. Stata J., 7, 71–83.Bellani, L. and Bia, M. (2017) The impact of growing up poor in Europe. In Monitoring Social Inclusion in Europe

(eds T. Atkinson, A.-C. Guio and E. Marlier). Luxembourg: Eurostat Statistical.Bjorklund, A., Eriksson, K. H. and Jantti, M. (2010) IQ and family background: are associations strong or weak?

BE J. Econ. Anal. Poly, 10.Black, S. E., Devereux, P. J. and Salvances, K. G. (2009) Like father, like son?: A note on the intergenerational

transmission of IQ scores. Econ. Lett., 105, 138–140.Blanden, J. and Gibbons, S. (2006) The Persistence of Poverty across Generations: a View from Two British Cohorts.

Bristol: Policy Press.Blanden, J., Gregg, P. and Macmillan, L. (2007) Accounting for intergenerational income persistence: noncognitive

skills, ability and education. Econ. J., 117, C43–C60.Chetty, R., Hendren, N., Kline, P. and Saez, E. (2014) Where is the land of opportunity?: The geography of

intergenerational mobility in the United States. Q. J. Econ., 129, 1553–1623.Ermisch, J., Francesconi, M. and Pevalin, D. J. (2004) Parental partnership and joblessness in childhood and their

influence on young people’s outcomes. J. R. Statist. Soc. A, 167, 69–101.Esping-Andersen, G. (1990) The Three Worlds of Welfare Capitalism. Cambridge: Polity.Eurostat (2016) Children were the age group at the highest risk of poverty or social exclusion in 2014. Statistics

Explained 3/2016. Eurostat, Luxembourg.


Flores, C. and Flores-Lagunes, A. (2009) Identification and estimation of causal mechanisms and net effects of atreatment under unconfoundedness. Report DP 4237. Institute for the Study of Labor, Bonn.

Frangakis, C. E. and Rubin, D. B. (2002) Principal stratification in causal inference. Biometrics, 58, 21–29.Gangl, M. (2004) RBOUNDS: Stata module to perform Rosenbaum sensitivity analysis for average treatment

effects on the treated. Department of Economics, Boston College, Chestnut Hill.Grundiza, S. and Lopez Vilaplana, C. (2013) Is the likelihood of poverty inherited? Statistics in Focus 27/2013.

Eurostat, Luxembourg.Hafeman, D. and Schwartz, S. (2009) Opening the black box: a motivation for the assessment of mediation. Int.

J. Epidem., 38, 838–845.Hainmueller, J. (2012) Entropy balancing for causal effects: a multivariate reweighting method to produce balanced

samples in observational studies. Polit. Anal., 20, 25–46.Hardt, J. and Rutter, M. (2004) Validity of adult retrospective reports of adverse childhood experiences: review

of the evidence. J. Chld Psychol. Psychiatr., 45, 260–273.Haushofer, J. and Shapiro, J. (2016) The short-term impact of unconditional cash transfers to the poor: experi-

mental evidence from Kenya. Q. J. Econ., 131, 1973–2042.Heckman, J. J., Stixrud, J. and Urzua, S. (2006) The effects of cognitive and noncognitive abilities on labor market

outcomes and social behavior. J. Lab. Econ., 24, 411–482.Hemerijck, A. (2012) Changing Welfare States. Oxford: Oxford University Press.Hicks, R. and Tingley, D. (2011) Causal mediation analysis. Stata J., 11, 605–619.Hill, J., Waldfogel, J. and Brooks-Gunn, J. (2003) Sustained effects of high participation in an early intervention

for low-birth-weight premature infants. Devlpmntl Psychol., 38, 730–744.Hong, G. (2010) Ratio of mediator probability weighting for estimating natural direct and indirect effects. Proc.

Biometr. Sect. Am. Statist. Ass., 2401–2415.Huber, M. (2014) Identifying causal mechanisms (primarily) based on inverse probability weighting. J. Appl.

Econmetr., 29, 920–943.Huber, M., Lechner, M. and Mellace, M. (2016) The finite sample performance of estimators for mediation

analysis under sequential conditional independence. J. Bus. Econ. Statist., 34, 139–160.Ichino, A., Mealli, F. and Nannicini, T. (2008) From temporary help jobs to permanent employment: what can

we learn from matching estimators and their sensitivity? J. Appl. Econmetr., 23, 305–327.Imai, K. L., Keele, L. and Tingley, D. (2010a) A general approach to causal mediation analysis. Psychol. Meth.,

15, 309–334.Imai, K. L., Keele, L. and Yamamoto, T. (2010b) Identification, inference and, sensitivity analysis for causal

mediation effects. Statist. Sci., 25, 309–334.Imai, K. and Yamamoto, T. (2013) Identification and sensitivity analysis for multiple causal mechanisms: revisiting

evidence from framing experiments. Polit. Anal., 21, 141–171.Imbens, G. (2004) Non parametric estimation of average treatment effects under exogeneity: a review. Rev. Econ.

Statist., 86, 4–29.Jenkins, S. P. and Lambert, P. J. (1997) Three ‘I’s of poverty curves, with an analysis of UK poverty trends. Oxf.

Econ. Pap., 49, 317–327.Jo, B. (2008) Causal inference in randomized experiments with mediational processes. Psychol. Meth., 13,

314–336.Jo, B., Stuart, E., MacKinnon, D. and Vinokur, A. (2011) The use of propensity scores in mediation analysis.

Multiv. Behav. Res., 46, 425–452.Jurges, H. (2007) Unemployment, life satisfaction and retrospective error. J. R. Statist. Soc. A, 170, 43–61.Keele, L., Tingley, D. and Yamamoto, T. (2015) Identifying mechanism behind policy interventions via causal

mediation analysis. J. Poly Anal. Mangmnt, 34, 937–963.King, G., Tomz, M. and Wittenberg, J. (2000) Making the most of statistical analyses: improving interpretation

and presentation. Am. J. Polit. Sci., 44, 341–355.Leuven, E. and Sianesi, B. (2003) Psmatch2: Stata module to perform full Mahalanobis and propensity score

matching, common support graphing, and covariate imbalance testing. Department of Economics, BostonCollege, Chestnut Hill.

Linden, A. and Karlson, K. B. (2013) Using mediation analysis to identify causal mechanisms in disease man-agement interventions. Hlth Serv. Outcms Res. Methodol., 13, 86–108.

Mayer, S. E. (1997) What Money Can’t Buy: Family Income and Children’s Life Chances. Cambridge: HarvardUniversity Press.

Mayer, S. E. and Lopoo, L. M. (2008) Government spending and intergenerational mobility. J. Publ. Econ., 92,139–158.

McKelvey, R. and Zavoina, W. (1975) A statistical model for the analysis of ordinal level dependent variables.J. Math. Sociol., 4, 103–120.

Mealli, F. and Rubin, D. B. (2003) Assumptions allowing the estimation of direct causal effects. J. Econmetr., 112,79–87.

Nannicini, T. (2007) Simulation-based sensitivity analysis for matching estimators. Stata J., 7, 334–350.Oster, E. (2017) Unobservable selection and coefficient stability: theory and evidence. J. Bus. Econ. Statist., to be

published.


Pearl, J. (2001a) The causal mediation formula—a guide to the assessment of pathways and mechanisms. TechnicalReport R-379. Cognitive Systems Laboratory, University of California at Los Angeles, Los Angeles.

Pearl, J. (2001b) Direct indirect effects. In Proc. 17th Conf. Uncertainty in Artificial Intelligence. pp. 411–420. SanFrancisco: Morgan Kaufmann.

Peterson, M. L., Sinisi, S. E. and van der Laan, M. (2006) Estimation of direct causal effects. Epidemiology, 17,276–284.

Piopiunik, M. (2014) Intergenerational transmission of education and mediating channels: evidence from a com-pulsory schooling reform in Germany. Scand. J. Econ., 116, 878–907.

Ratcliffe, C. (2015) Child poverty and adult success. Brief. Urban Institute, Washington DC.Robins, J. M. and Greenland, S. (1992) Identifiability and exchangeability for direct and indirect effects. Epidemi-

ology, 3, 143–155.Robins, L. N., Schoenberg, S. P., Holmes, S. J., Ratcliff, K. S., Benham, A. and Works, J. (1985) Early home

environment and retrospective recall: a test for concordance between siblings with and without psychiatricdisorders. Am. J. Orthpsychiat., 55, 27–41.

Rosenbaum, P. R. (2002) Observational Studies. New York: Springer.Rosenbaum, P. R. and Rubin, D. B. (1983) The central role of the propensity score in observational studies for

causal effects. Biometrika, 70, 41–55.Rubin, D. (1974) Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ.

Psychol., 66, 688–701.Rubin, D. B. (1978) Bayesian inference for causal effects: the role of randomization. Ann. Statist., 6, 34–58.Rubin, D. B. (2004) Direct and indirect causal effects via potential outcomes. Scand. J. Statist., 31, 161–170.Shea, J. (2000) Does parents’ money matter? J. Publ. Econ., 77, 155–184.Solon, G. (2004) A model of intergenerational mobility variation over time and place. In Generational Income

Mobility in North America and Europe (ed. M. Corak), pp. 38–47. Cambridge: Cambridge University Press.US Census Bureau (2014) Poverty status of people, by age, race and Hispanic origin: 1959–2013. In Current

Population Survey, Annual Social and Economic Supplements. Washington DC: US Census Bureau.VanderWeele, T. (2009) Marginal structural models for the estimation of direct and indirect effects. Epidemiology,

20, 18–26.VanderWeele, T. J. (2008) Simple relations between principal stratification and direct and indirect effects. Statist.

Probab. Lett., 78, 2957–2962.VanderWeele, T. J. and Vansteelandt, S. (2009) Conceptual issues concerning mediation, interventions and com-

position. Statist. Interfc. 2, 457–468.

Supporting informationAdditional ‘supporting information’ may be found in the on-line version of this article:

‘Online appendix’

the long‐run effect of childhood poverty and the mediating role of...

Documents