First European Survey on Language Competences: Technical Report. Chapter 13: Data processing - Data sets

13 Data processing - Data sets

This chapter details the contents of the ESLC data sets.

The ESLC international data sets consist of seven data files: four student-level files, one teacher-level file and two school-level files.

13.1 The Student Questionnaire and performance data file

Filename: INT_stu.txt

For each student who participated in the assessment the following information is available:

x Identification variables for the educational system, school, target language and student

x The student responses on the questionnaire


x Plausible values for the students' performance scores in Listening, Reading and Writing (only for the two skills out of three for which each student was sampled)


13.2 Language assessment items data files

13.2.1 Scored responses

Filename: INT_cogn_sco.txt

For each student who participated in the cognitive assessment the following information is available:

x Identification variables for the educational system, school, target language, student and marker


x The students' marked responses for Writing items. If a student's Writing booklet was marked by a central marker, this file contains the marked responses from the central marker. If a student's booklet was marked by more than one marker, but not by a central marker, the file contains the marks of a randomly selected marker
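The selection rule for multiply marked Writing booklets can be sketched as follows. This is only an illustration of the rule as described above, not the procedure actually used to build the file: the record layout (`marker_id`, `is_central`, `marks`) is hypothetical, and taking the first central marker when several are present is an assumption.

```python
import random


def select_marking(markings, rng=None):
    """Pick the marking to keep for one Writing booklet.

    `markings` is a list of dicts, one per marker who marked the booklet.
    The keys 'marker_id', 'is_central' and 'marks' are hypothetical names.
    """
    if not markings:
        raise ValueError("booklet has no markings")
    rng = rng or random.Random()
    # Prefer the central marker when one marked the booklet.
    central = [m for m in markings if m["is_central"]]
    if central:
        return central[0]  # assumption: at most one central marker per booklet
    # Otherwise keep the marks of one randomly selected marker.
    return rng.choice(markings)
```

Passing a seeded `random.Random` instance makes the random selection reproducible.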

13.2.2 Raw responses

Filename: INT_cogn_raw.txt

For each student who participated in a Listening or Reading test, the following information is available:

x Identification variables for the educational system, school, target language and student

x The students' raw responses to Listening and Reading items

13.2.3 Multiple marking

Filename: INT_cogn_mm.txt

For each Writing booklet which was marked more than once the following information is available:

x Identification variables for the educational system, school, target language, student and marker

x Marked responses

13.3 Teacher Questionnaire data file

Filename: INT_tea.txt

For each teacher who filled out the questionnaire the following information is available:

x Identification variables for the educational system, school, target language and teacher


x The teachers' indices derived from the original responses in the questionnaire


13.4 School Questionnaire data files

File names: INT_sch_TL1.txt, INT_sch_TL2.txt

For each school that participated in the survey the following information is available:

x Identification variables for the educational system, implicit and explicit strata, school, target language and principal


x School plausible values for Listening, Reading and Writing and standard errors for the school plausible values

x School weights

The school dataset is divided into separate files for the first target language and the second target language. If a school participated for two target languages, the school is present in both files. Since only one principal responded per school, the principal responses and indices are replicated in both files as far as they are applicable to both target languages.

13.5 Records in the data sets

Student level

x All students who attended at least one questionnaire or test booklet session

Teacher level

x All teachers who responded to the questionnaire

School level

x All schools for which at least one student attended a questionnaire or test booklet session

13.6 Records excluded from the datasets

The following records are excluded from the datasets:

x Students that did not participate in any session, either because they were ineligible, excluded or absent

x Teachers that did not respond to the questionnaire

x Schools for which no students attended a questionnaire or test booklet session.

13.7 Weights in the datasets

All schools for which any student participated in the survey are in the datasets. However, only students and schools that meet the formal criteria for participation have a weight in the datasets.

A participating student is defined as one who has responded to the Student Questionnaire (required of all students), and has done at least one of the two cognitive tests assigned.

A participating school is defined as a school where at least 25% of the sampled students have completed the questionnaire and at least one test booklet. Based on this criterion, four schools (two in the first target language sample and two in the second target language sample) did not receive a weight because all questionnaires for these schools were lost.

In Spain and the Flemish Community of Belgium, a number of schools took part that were not part of the sample. These schools can be identified through the code 'EXTRA' in the variable 'main_study_sample'. These schools and student respondents from these schools do not have weights.
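The participation criteria in this section can be expressed as a pair of predicates. The sketch below is illustrative only: the function names and the per-student tuple layout are hypothetical, and it assumes that the 25% threshold is taken over all sampled students of the school. The only value of 'main_study_sample' taken from the text is 'EXTRA'.

```python
def student_participates(answered_questionnaire, tests_completed):
    """Responded to the Student Questionnaire and did at least one assigned test."""
    return bool(answered_questionnaire) and tests_completed >= 1


def school_participates(sampled_students, main_study_sample=None):
    """Apply the 25% rule to one school.

    `sampled_students` is a list of (answered_questionnaire, tests_completed)
    tuples, one per sampled student (hypothetical layout).
    """
    # Schools outside the sample are coded 'EXTRA' and never receive a weight.
    if main_study_sample == "EXTRA":
        return False
    if not sampled_students:
        return False
    participating = sum(
        1 for answered, tests in sampled_students
        if student_participates(answered, tests)
    )
    # At least 25% of sampled students; integer arithmetic avoids rounding issues.
    return participating * 4 >= len(sampled_students)
```

Under this reading, a school whose questionnaires were all lost has no participating students and therefore fails the criterion, which matches the four unweighted schools mentioned above.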

13.8 Representing missing data

Missing responses were coded to distinguish between four types of missing data [36]:

x Not applicable: 77 for closed questions and 7777 for open questions. This code is used for items or options in the questionnaires that were not administered to respondents, mainly due to the localisation (see Chapter 3).

x Not applicable: 78 for closed questions and 7778 for open questions. This code is used for items or options in the Principal Questionnaire that were not applicable for the target language because the principal responded to the other target language version of the questionnaire.

x Invalid: 88 for closed questions and 8888 for open questions. This code is used when a respondent gave an invalid answer, for example by selecting several answers when only one was expected.

x Missing: 99 for closed questions and 9999 for open questions. This code is used when the respondent did not provide an answer to the question.
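When analysing the questionnaire data, these codes usually need to be recoded before computing statistics. The following is a minimal sketch of such a recoding; the label strings and function names are hypothetical, and whether a given variable is a closed or an open question is assumed to be known from the codebook.

```python
# Codes for closed questions (two digits) and open questions (four digits).
# The label strings are illustrative, not taken from the codebook.
CLOSED_CODES = {
    77: "not_administered",
    78: "other_target_language_version",
    88: "invalid",
    99: "missing",
}
OPEN_CODES = {
    7777: "not_administered",
    7778: "other_target_language_version",
    8888: "invalid",
    9999: "missing",
}


def missing_type(value, open_question):
    """Return the missing-data label for `value`, or None for a valid response."""
    codes = OPEN_CODES if open_question else CLOSED_CODES
    return codes.get(value)


def recode(value, open_question):
    """Replace any missing-data code by None and keep valid responses unchanged."""
    return None if missing_type(value, open_question) is not None else value
```

Keeping the two code tables separate matters because a value such as 77 can be a legitimate answer to an open question, where only the four-digit codes denote missing data.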

13.9 Identification of respondents, schools and markers

The following identifiers were used:

x Educational system identification variable named educational system_id. The educational system codes used in ESLC are the educational system codes of the European Commission

x The school identification variable named school_id. This consists of the letters 'SC' followed by a randomly assigned …-digit code

x The respondent identification variable named respondent_id. This is a unique, randomly assigned number identifying students, teachers and principals

x The marker identification variable called marker id. This is a string consisting of a three-letter educational system identification variable (ISO 3166, with BGE, BFL, BFR for the German, Flemish and French Communities of Belgium)

[36] Note that, as far as the indices are concerned, each missing value is a true missing value.


x Full details of all identifiers and codes used can be found in the codebook made available with the data sets.

Note: since some schools participated for two target languages, merging the student files with the teacher or school files, through what is known as an 'inner join', should always be done on two variables: school_id and targetLanguage_id.
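The two-variable join described in the note can be sketched in plain Python as follows. This is an illustration under stated assumptions, not a prescribed procedure: the records are assumed to have already been read into dictionaries, the right-hand file is assumed to hold at most one row per (school_id, targetLanguage_id) pair, as the school files do, and all values and the `tl_label` column in the example are made up.

```python
JOIN_KEYS = ("school_id", "targetLanguage_id")


def inner_join(students, schools, keys=JOIN_KEYS):
    """Inner join two lists of dict records on the given key variables."""
    # Index the school records by the composite key.
    index = {tuple(row[k] for k in keys): row for row in schools}
    merged = []
    for student in students:
        match = index.get(tuple(student[k] for k in keys))
        if match is not None:  # inner join: drop students without a school row
            merged.append({**match, **student})
    return merged


# Made-up example: school SC001 took part for both target languages.
students = [
    {"respondent_id": 1, "school_id": "SC001", "targetLanguage_id": 1},
    {"respondent_id": 2, "school_id": "SC001", "targetLanguage_id": 2},
    {"respondent_id": 3, "school_id": "SC002", "targetLanguage_id": 1},
]
schools = [
    {"school_id": "SC001", "targetLanguage_id": 1, "tl_label": "first"},
    {"school_id": "SC001", "targetLanguage_id": 2, "tl_label": "second"},
]
merged = inner_join(students, schools)
```

Joining on school_id alone would attach the same school row to students of both target languages, which is why both variables are needed. For the teacher file, where several teachers share a school and target language, the index would have to hold a list of rows per key rather than a single row.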