NCRM Annual Meeting January 2009




People

Lorraine Dearden (Dir)
John ‘Mac’ McDonald
Sophia Rabe-Hesketh
Anna Vignoles
Kirstine Hansen
Nikos Tzavidis
James Brown
Marcello Sartarelli
Francesca Foliano
Alfonso Miranda
Sarah Patel [email protected]

Fellows

Flavio Cunha, Christian Dustmann, Stephen Machin, Barbara Sianesi and Anders Skrondal


ADMIN

• Remit of ADMIN is to develop and disseminate methodologies for making best use of administrative data by exploiting survey data (and vice versa)

• Training and capacity building (Mac McDonald)


ADMIN

• Strength of administrative data is that they have information on almost everyone.

• Weakness is that they are not rich in covariates
– NPD has detailed information on educational outcomes but no information on parental education.

• We can link richer survey data to admin data to enhance the admin data.
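The linkage idea can be sketched in a few lines of Python. This is only an illustration: the pupil identifiers, record layouts and variable names (`ks2_score`, `parental_education`) are hypothetical and are not the actual NPD or survey schemas.

```python
# Hypothetical admin records: near-complete coverage, few covariates.
admin = {
    "P001": {"ks2_score": 27.0},
    "P002": {"ks2_score": 31.5},
    "P003": {"ks2_score": 24.0},
}

# Hypothetical survey records: richer covariates, but only a sample of pupils.
survey = {
    "P001": {"parental_education": "degree"},
    "P003": {"parental_education": "none"},
}


def link(admin, survey):
    """Left-join survey covariates onto admin records by pupil id."""
    linked = {}
    for pupil_id, admin_row in admin.items():
        row = dict(admin_row)
        survey_row = survey.get(pupil_id)
        row["in_survey"] = survey_row is not None
        # Survey covariates stay missing (None) for pupils outside the sample.
        row["parental_education"] = (
            survey_row["parental_education"] if survey_row else None
        )
        linked[pupil_id] = row
    return linked


linked = link(admin, survey)
```

Every admin record is kept, and the `in_survey` flag marks the subset for which the richer covariates are available.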


ADMIN

• Weakness of survey data is non-response and attrition.

• Administrative data is virtually (but not fully) complete.

• Scope for using linked survey-administrative data to enhance survey data by telling us about those who are missing from survey data.
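One common way to use the linked file for this purpose is inverse response-rate weighting, sketched below in Python. This is an assumption about how such an adjustment might look, not a description of the ADMIN project's method; the records and the choice of FSM as the grouping variable are hypothetical.

```python
from collections import defaultdict

# Hypothetical linked file: (FSM flag from admin data, responded to survey?)
records = [
    (1, True), (1, False), (1, False), (1, True),
    (0, True), (0, True), (0, True), (0, False),
]


def response_weights(records):
    """Return inverse response-rate weights for each admin-data group."""
    counts = defaultdict(lambda: [0, 0])  # group -> [respondents, total]
    for group, responded in records:
        counts[group][1] += 1
        if responded:
            counts[group][0] += 1
    # Weight = 1 / response rate; groups with no respondents are skipped.
    return {g: total / resp for g, (resp, total) in counts.items() if resp > 0}


weights = response_weights(records)
```

Because the admin variable is observed for respondents and non-respondents alike, the response rate in each group can be computed directly, and respondents from under-represented groups are weighted up.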


Aim is to develop methods to...

• make inferences when covariates or responses are missing in administrative data.

• use administrative data to overcome measurement error in survey variables (e.g. recalled event histories) and vice versa (e.g. ethnicity)

• tackle bias due to attrition in longitudinal surveys.

• use administrative data to improve small-area estimates of the means and quantiles of survey variables.


Programme 1 (Vignoles)

• Using survey data to enhance methods for the analysis of administrative data

– Measuring the effect of family background and ethnicity on pupil attainment
– To what extent does school attendance reflect real school preferences?


Programme 2 (Brown)

• Using administrative data to enhance methods for the analysis of survey data
– Attrition, non-response and the determinants of school outcomes at 16
– Enhancing event history analysis of social surveys with administrative data


Examples of linked data

• Linked administrative data: schools data (NPD/PLASC), FE data (ILR) and higher education data (HESA)
– Complete administrative data on entire cohort all the way through the education system

• NPD/PLASC linked to survey data
– LSYPE
– MCS


Contribution to work on segregation

• Measuring segregation
– Socio-economic segregation
– Ethnic segregation

• Modelling causes of segregation
– Parental school choice

• Examples drawn from schools but could be applied more broadly


Measuring socio-economic segregation

• Currently measured by FSM binary status

• Problematic measure (Hobbs and Vignoles, 2008)
– Only picks up bottom 16% of distribution at best
– Measurement error in FSM status
  • E.g. children who do not eat at school not recorded as FSM
– Changing FSM status in recession
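The slides do not say which segregation index is used. As an illustration only, the sketch below computes the Duncan dissimilarity index, a common choice, from a binary FSM flag. The school counts are hypothetical.

```python
def dissimilarity(schools):
    """Duncan dissimilarity index from (fsm_count, non_fsm_count) per school.

    D = 0.5 * sum_i | fsm_i / FSM_total - non_i / NON_total |
    0 means every school mirrors the overall mix; 1 means complete separation.
    """
    fsm_total = sum(fsm for fsm, _ in schools)
    non_total = sum(non for _, non in schools)
    return 0.5 * sum(
        abs(fsm / fsm_total - non / non_total) for fsm, non in schools
    )


# Hypothetical counts of FSM and non-FSM pupils in three schools.
schools = [(30, 70), (10, 90), (5, 95)]
d = dissimilarity(schools)
```

Because the index depends only on the binary flag, any misclassification of FSM status or change in eligibility feeds straight into the measured level of segregation.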


Measuring socio-economic segregation

• Linked data has already provided an assessment of the extent to which FSM really proxies socio-economic disadvantage

• Can provide alternative measures of socio-economic background from surveys
– Parental income / high-low income
– Parental education / high-low education

• Can assess use of alternative proxies from administrative data e.g. geographic data


Measuring socio-economic segregation

• Linked data can test robustness of segregation work that uses FSM

• Need to be aware of issues raised by Becky Allen on using samples to measure segregation


Ethnic Minority Project

• Originally conceived of as a study of ethnic differences in outcomes
– i.e. focusing on missing covariates in model of ethnic achievement
– See work by Wilson et al., 2005

• Data - PLASC/NPD data linked to LSYPE (cohort born 1990/91)


Ethnic Minority Project

• Do we get estimates of ethnic differences in outcomes wrong if we just rely on administrative data?

• Do ethnic classifications capture what we are interested in? E.g. recent migrants versus long-standing populations

• What are differences by ethnicity once we take account of language (EAL)?


KS3 Results for Pakistani Males

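             | No Controls    | NPD Controls   | LSYPE Controls | NPD + LSYPE Controls
-------------+----------------+----------------+----------------+---------------------
NPD sample   | -0.314 (0.019) | -0.222 (0.018) |                |
LSYPE sample | -0.356 (0.068) | -0.265 (0.070) | -0.129 (0.068) | -0.111 (0.069)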

NB: Results show differences in standardized score outcomes

NPD controls include gender, ethnicity, age, EAL, SEN, FSM, KS2 score
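A rough way to read the LSYPE-sample figures is to ask how much of the raw gap disappears once controls are added. The Python sketch below does that arithmetic. It assumes the four LSYPE-sample estimates run from no controls to NPD + LSYPE controls, it ignores the bracketed figures, and it is not a formal decomposition.

```python
# LSYPE-sample estimates from the slide (differences in standardized scores).
raw_gap = -0.356          # no controls
npd_controls = -0.265     # NPD controls only
full_controls = -0.111    # NPD + LSYPE controls


def share_accounted_for(raw, adjusted):
    """Fraction of the raw gap that disappears after adding controls."""
    return 1 - adjusted / raw


share_npd = share_accounted_for(raw_gap, npd_controls)
share_full = share_accounted_for(raw_gap, full_controls)

print(f"NPD controls only:    {share_npd:.1%} of the raw gap")
print(f"NPD + LSYPE controls: {share_full:.1%} of the raw gap")
```

On these figures the administrative controls alone account for roughly a quarter of the raw gap, and adding the survey controls raises that to roughly two thirds.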


Measuring ethnicity and EAL

• Is there measurement error in the ethnicity or EAL variables in PLASC?

• If so, are there implications for measuring ethnic segregation and ethnic differences in outcomes?
– see Aspinall and Jacobson, 2007; Battistin and Sianesi, 2006


Measurement error in ethnicity and EAL

• Multiple measures from LSYPE
– Ethnicity and ethnic origin self report
– Ethnicity and ethnic origin parents
– Language spoken at home
– Frequency of English spoken at home

• Measures from PLASC
– Ethnicity
– EAL binary indicator


Measurement Error

• Misclassification of ethnicity not huge
– Sub sample for whom we have full data and who live with both natural parents: 7814
– 136 individuals recorded as white British in PLASC but are not according to LSYPE
– 57 individuals recorded white British in LSYPE but not in PLASC

• Evidence of misclassified EAL
– 7.2% young people labelled EAL in PLASC but appear not to be from LSYPE
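The ethnicity counts above can be turned into disagreement rates with simple arithmetic, sketched here in Python. Treating the two counts as non-overlapping disagreements on the white British flag is an assumption made for illustration.

```python
subsample = 7814          # full data, living with both natural parents
plasc_only = 136          # white British in PLASC, not in LSYPE
lsype_only = 57           # white British in LSYPE, not in PLASC

rate_plasc_only = plasc_only / subsample
rate_lsype_only = lsype_only / subsample
rate_any = (plasc_only + lsype_only) / subsample

print(f"PLASC-only white British: {rate_plasc_only:.2%}")
print(f"LSYPE-only white British: {rate_lsype_only:.2%}")
print(f"Any disagreement:         {rate_any:.2%}")
```

Under that assumption the two sources disagree on the white British flag for about 2.5% of the subsample, compared with the 7.2% figure quoted for EAL.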


Ethnicity of those “wrongly” coded EAL in PLASC

                       ethg_04 |      Freq.     Percent        Cum.
-------------------------------+-----------------------------------
                       African |        186       17.13       17.13
    Any Other Asian Background |         26        2.39       19.52
    Any Other Black Background |          2        0.18       19.71
        Any Other Ethnic Group |         11        1.01       20.72
    Any Other Mixed Background |         23        2.12       22.84
    Any Other White Background |         37        3.41       26.24
                   Bangladeshi |        106        9.76       36.00
                     Caribbean |         34        3.13       39.13
                       Chinese |          6        0.55       39.69
                        Indian |        320       29.47       69.15
      Information Not Obtained |         11        1.01       70.17
                     Pakistani |        256       23.57       93.74
                       Refused |          6        0.55       94.29
                 White British |         38        3.50       97.79
               White and Asian |         15        1.38       99.17
       White and Black African |          7        0.64       99.82
     White and Black Caribbean |          2        0.18      100.00
-------------------------------+-----------------------------------
                         Total |      1,086      100.00


Correlations with error EAL/non white sample

Probit regression                                 Number of obs   =       3041
                                                  LR chi2(4)      =     360.66
                                                  Prob > chi2     =     0.0000
Log likelihood = -1710.5072                       Pseudo R2       =     0.0954

------------------------------------------------------------------------------
   err1_1_04 |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    ks3_apss |   .0052081   .0077956     0.67   0.504    -.0100709    .0204871
k2_engtot~_d |  -.0014605   .0015275    -0.96   0.339    -.0044543    .0015334
  prop white |  -1.530476   .0870969   -17.57   0.000    -1.701183   -1.359769
      fsm_04 |   .0493806   .0596003     0.83   0.407    -.0674338     .166195
       _cons |   .2645265   .2453994     1.08   0.281    -.2164475    .7455005
------------------------------------------------------------------------------
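The summary columns of this output follow from the coefficients, standard errors and log likelihood. As a reading aid, the Python sketch below recomputes the z statistic and 95% interval for the `prop white` row, and the pseudo R2 from the reported log likelihood and LR chi2. It uses only figures printed in the output; the variable names are ours.

```python
from statistics import NormalDist

# "prop white" row of the probit output.
coef, se = -1.530476, 0.0870969

z = coef / se                               # Wald z statistic
crit = NormalDist().inv_cdf(0.975)          # two-sided 95% critical value
ci = (coef - crit * se, coef + crit * se)   # normal-approximation interval

# Header figures: LR chi2 = 2 * (ll_full - ll_null).
ll_full, lr_chi2 = -1710.5072, 360.66
ll_null = ll_full - lr_chi2 / 2
pseudo_r2 = 1 - ll_full / ll_null           # McFadden's pseudo R-squared

print(f"z = {z:.2f}, 95% CI = ({ci[0]:.6f}, {ci[1]:.6f})")
print(f"pseudo R2 = {pseudo_r2:.4f}")
```

The share of white pupils is the only regressor whose interval excludes zero. The other coefficients are small relative to their standard errors.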


Modelling causes of segregation

• Linked MCS data with NPD/PLASC

• Details of current school, ranked school choices, reasons for school choice

• Can provide missing covariates in models of causes of segregation e.g. attitudes to school choice

• Current project investigating school choice in MCS (Burgess, Greaves, Vignoles and Wilson)


Short Courses

• Introduction to Data Linkage:
• The Value of Data Linkage for Research
• Data Linkage – Methodological and Statistical Issues
• Enhancing Longitudinal Surveys by Linking to Administrative Data:
• Longitudinal Data Analysis
• Event History Analysis
• Using Longitudinal Data Linkage to Evaluate Area-Based Interventions
• Data Linkage with the NPD