8/6/2019 Test Scores and Teacher Selection
1/32
TEACHERS COLLEGE, COLUMBIA UNIVERSITY
TEST SCORES AND TEACHER SELECTION
AN EMPIRICAL ANALYSIS FOR TURKEY
M. ALPER DINCER
4/26/2011
[In 2002 in Turkey, a decentralized model of teacher hiring was replaced with a teacher selection model which operates through centralized testing. This study evaluates the impact of this new teacher selection policy on mathematics and science test scores of 8th graders. The findings show that a 0.17 standard deviation increase in test scores can be attributed to the new teacher selection policy and the estimated impact is much higher for below median achievers and students with female teachers. The findings also provide evidence that the new teacher selection policy assigns more teachers to relatively poor schools and classrooms.]
1. Introduction: test scores
The primary and secondary education systems in Turkey have been undergoing a
restructuring since the late 1990s in response to swift developments in the formation of its
economy and the demographics of its young population. One of the main goals of this
restructuring is to increase the quality of learning outcomes in Turkey (Aksit, 2007).
Thus it is important to investigate empirically whether these reform efforts achieve the
intended outcomes or not.
The Trends in International Mathematics and Science Study (TIMSS) and Program for
International Student Assessment (PISA) periodically measure the student achievement
on an international scale and assemble information about students, their families and
schools. Thus with the help of these projects it is possible to track student achievement in
participating countries and make cross-country comparisons. Therefore these projects
provide the necessary data in order to analyze the trend of learning outcomes in Turkey.
A representative set of the student body in 8th grade, which is the final grade of mandatory
schooling in Turkey, participated in TIMSS 1999 and 2007. The average mathematics and
science scores of students in Turkey in 1999 were 429 and 433 whereas the international
average scores were 487 and 488, respectively. Similarly the average mathematics and
science scores of students in Turkey in 2007 were 433 and 454 whereas the international
average scores were 488 and 500, respectively. Thus the students in Turkey performed
below the international average in both years. The following table gives the
percentages of students reaching the TIMSS international benchmarks:
Table 1: The percentages of students reaching the TIMSS international benchmarks

                     Advanced   High   Intermediate   Low
Mathematics   1999       1        6         20         38
              2007       5       10         18         26
Science       1999       1        5         19         37
              2007       3       13         24         31

Source: (Martin et al., 2001a), (Martin et al., 2001b), (Martin, Mullis, Foy, & Olson, 2008a), (Martin, Mullis, Foy, & Olson, 2008b)
As a cautionary note, it should be stated that these percentages are not directly
comparable between 1999 and 2007 for Turkey (Martin, et al., 2008a, 2008b). However
these figures present the same pattern in mathematics and science for the students in
Turkey. There are more students at the advanced and high international benchmark levels
and there are fewer students at the low international benchmark level.¹
PISA offers more definitive information about the trend of learning outcomes of students
in Turkey. Similar to TIMSS, PISA measures the reading, mathematics and science test
scores of a student body which is representative of the 15-year-old student population in
each participating country. Turkey participated in PISA in 2003, 2006 and 2009; the
trend in mathematics scores is comparable between 2003 and 2009 and the trend in
science test scores is comparable between 2006 and 2009 (OECD, 2010).
¹ For the description of these benchmark proficiency levels please see (Martin, et al., 2008b) and
(Martin, et al., 2008a).
According to PISA results, the average mathematics score of 15-year-old students in Turkey
increased by 22 points (more than 0.2 standard deviation) and the average science score of
15-year-old students in Turkey increased by 30 points (approximately 0.3 standard
deviation) (OECD, 2010).
PISA data also show in which segment of the student body these improvements
occurred. The percentage of students who fall below proficiency level 2 decreased
from 52 to 42 in mathematics, and in science the same percentage dropped from 47 to 30.
On the other hand the percentages of top performers did not show any increase or
decrease between the respective periods (Figure 1).
These figures on the trend of the average student achievement in mathematics and
science in Turkey highlight at least three important facts. First, for a period which follows
1999, average student achievement in mathematics and science is increasing for the
student population which is either in grade 8 or 15 years old. Second, this increase in
average student achievement is not homogeneous; it is much more pronounced at the
lower end of the student achievement distribution in these subjects. Third, these
improvements in average student achievement in Turkey are not due to inflation in test
score scales; the average performance of students in Turkey is converging to international
benchmarks as defined by either TIMSS or PISA. This convergence is fairly quick, at
least according to the measure PISA provides.
These facts immediately raise several questions: Are these changes in student
achievement related to the restructuring of the education system in Turkey? If yes, which
aspects of the reform initiative in Turkey led to higher learning outcomes in
mathematics and science? Is it possible to identify the channels through which the policy
intervention leads to increases in student achievement? This study attempts to offer some
candidate answers to these questions.
2. Possible explanations
OECD (2010) stresses the role of the Basic Education Programme (BEP) in increasing
learning outcomes in Turkey. This World Bank supported programme defined the
framework for the education reform initiative in Turkey according to Law No. 4306.²
With this legislation the Ministry of National Education (MONE) aimed to expand
primary school education, improve the quality of education and overall
student outcomes, close the performance gap between boys and girls, provide equal
opportunities, match the performance indicators of the European Union, develop
school libraries, increase the efficiency of the education system, ensure that qualified
personnel were employed, integrate information and communication technologies into
the education system and create local learning centers, based in schools, that are open
to everyone.³
In response to these efforts the attendance rate in the eight-year primary education system
soared from 85 to 100 percent. Similarly, the attendance rate in the pre-primary education
system increased from 10 to 25 percent. These increases led to an expansion of the
education system by 3.5 million pupils. These quantitative expansions of the education
system were accompanied by qualitative improvements: During the same period average
class size was reduced from approximately 40 to 30, conditions were improved in all
rural schools and computer laboratories were established in every primary school.
Lastly, the cost of the BEP exceeded the equivalent of USD 11 billion (OECD, 2010).
² http://mevzuat.meb.gov.tr/html/24.html
³ http://www.meb.gov.tr/Stats/Apk2002/502.htm
OECD (2010) as well as MONE also highlight the importance of the recent curriculum
change in mathematics and science (TTKB, 2008): New curricula were launched in the
2006-2007 school year, starting from the 6th grade. Similarly, mathematics and language
curricula were updated, and starting from the 9th grade in the 2008-2009 school year a
new science curriculum was in force. According to the Board of Education (TTKB)
the aim of this change was to update the content of school education as well as to change
the teaching philosophy and culture within schools.
Although the new curricula are the preferred explanation of MONE and some other
research institutions in Turkey⁴ for the increased learning outcomes, the connection is not
clear and there are several problems with this specific explanation. First, given that TIMSS
covers the period between 1999 and 2007, the new curricula cannot explain
the improvement in learning outcomes which is evident in the TIMSS data. Second, average
achievement in mathematics in PISA is not comparable between 2006 and 2009.
Therefore the timing of the inception of the new curricula and the increase in average
mathematics achievement in Turkey do not overlap. Third, the students who were subject
to the curriculum change in science are 9th graders, who constitute only a portion of the
PISA 2009 sample in Turkey; moreover they experienced the new curriculum for only two
semesters. It is not clear whether these students could drive a 0.3 standard deviation
increase in average student achievement in science between 2006 and 2009.
As mentioned earlier, one of the targets of the BEP was to ensure that qualified personnel
were employed. In line with this goal teacher selection policy was changed in 2002 in
⁴ http://www.setav.org/public/HaberDetay.aspx?Dil=tr&hid=57559&q=pisa-yi-dogru-okumak ,
http://www.tepav.org.tr/upload/files/1292255907-8.PISA_2009_Sonuclarina_Iliskin_Bir_Degerlendirme.pdf
The difference in teacher quality may lead to a substantial difference in student
achievement. In order to understand the relative significance of teacher quality, Rivkin et
al. (2005) analyze a unique matched panel data set from the UTD Texas Schools Project
which allows them to identify teacher quality based on student performance. They
conclude that the contribution of a ten-student reduction in class size to learning is less
than that of a one standard deviation increase in teacher quality.
In another study, Rockoff (2004) analyzes a 10-year panel of test scores and teacher
assignments to understand how much teachers affect learning. The panel structure allows
him to focus on differences in the performance of the same student with different teachers
and to separate the variation in teacher quality from variation in student
characteristics. His analysis shows that variation in teacher quality explains 23 percent of
the variation in the test scores which is potentially open to policy influence.
Third, teacher characteristics such as qualifications, teaching experience and teacher
education do not exhibit consistently clear and strong effects on student achievement:
Hanushek (2002, 2003) reviews the studies focusing on the United States and concludes that
overall there are no systematic effects of characteristics such as teacher education or
teacher experience. Thus it is challenging to identify the components which
characterize the quality of teachers.
In the same reviews Hanushek (2002, 2003) also highlights that there is convincingly
strong support for the effects of teachers' academic ability as measured by teacher test
scores. In line with Hanushek's inference, the National Council on Teacher Quality (NCTQ)
(2004) reports that teachers' academic aptitude has a clear, measurable effect on learning
and that this finding is robust and consistent. The same report emphasizes that a teacher's
literacy ability as measured by standardized tests has an impact on learning greater than
that of any other measurable teacher characteristic. Thus a broad conclusion emerges from
research connecting teacher quality to teachers' test scores: Teachers' test scores may be
a good measure of teacher quality if these tests are measuring academic aptitude.
Interestingly, there are some studies from Turkey which are in line with these findings.
Several studies which analyze PISA 2006 data for Turkey show that being
taught by teachers who passed rigorous testing procedures is associated with higher student test
scores (Alacaci & Erbas, 2010; Dincer & Uysal, 2010).
The literature leads to two main conclusions in these aspects: First, teacher quality is an
essential ingredient of education production and it is open to policy influence. Second,
screening teachers with testing which measures academic ability may lead to an increase
in the teacher quality.
4. Basic characteristics of teacher labor market in Turkey
The main characteristic of the teacher labor market in Turkey is the excess supply of
teachers. As of 2010, approximately 327 thousand teachers were waiting to be employed by the
public sector and the number of applicants is three to four times higher than the number
of open teaching positions (Figure 2). This army of inactive teachers represents a
significant population given that the number of employed teachers in the public sector is
680 thousand. MONE also estimates that the optimal number of employed teachers in the
public education system is 717 thousand.⁵ Under these circumstances the gap between the
supply of and the demand for teachers widens cumulatively.
⁵ http://icden.meb.gov.tr/digeryaziler/MEB_ic_denetim_faaliyet_raporu_2009.pdf
As of 2010, MONE demanded 782 mathematics teachers and it received 2798
applications. For science these figures are 861 and 3546,⁶ respectively, and the gap is
evident in more or less every subject; thus excess supply is not specific to particular
subjects.
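The claim that applicants outnumber open positions by a factor of three to four can be checked directly from the two pairs of figures quoted above. The short sketch below does only that arithmetic; the dictionary and function names are illustrative and not part of the original analysis.

```python
# Figures quoted in the text for 2010: (open positions, applications received).
openings_and_applicants = {
    "mathematics": (782, 2798),
    "science": (861, 3546),
}


def applicants_per_opening(openings: int, applicants: int) -> float:
    """Return the number of applications received per open position."""
    return applicants / openings


ratios = {
    subject: round(applicants_per_opening(openings, applicants), 2)
    for subject, (openings, applicants) in openings_and_applicants.items()
}
print(ratios)  # {'mathematics': 3.58, 'science': 4.12}
```

Both ratios fall in or just above the "three to four times" range stated in the text.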
Figure 2: The number of open positions and applicants by subject
[Bar chart comparing # open positions and # applicants for Math, Science and Tech, Physics, Biology and Chemistry; vertical axis from 0 to 4000]
Source: Author's own calculations from http://personel.meb.gov.tr/ana_sayfa.asp
A candidate explanation of this excess supply might be the presence of very attractive
teacher salaries in Turkey. However, teacher salaries are not attractive at all in Turkey.
In the public sector the starting salary of a teacher is around USD 14,000 and it does not
improve much with experience (Figure 3). The salary of a teacher with 15 years of
experience is around USD 16,000 (OECD, 2009).
⁶ http://personel.meb.gov.tr/ana_sayfa.asp
Dolton and Marcenaro Gutierrez (2011) present a cross-country analysis of teacher pay and
performance by taking the relative earnings distribution in each country into account.
Their analysis confirms that teacher salaries are not especially attractive in Turkey
and that the salary-experience profile is flat (Figure 4).
Figure 3: Ratio of salary after 15 years of experience to GDP per capita
[Bar chart by country; vertical axis from 0 to 2.5. Countries shown: Korea, Germany, Portugal, Japan, Scotland, New Zealand, Switzerland, Mexico, Spain, England, Czech Republic, Turkey, Slovenia, Ireland, Belgium (Fl.), Australia, OECD average, Greece, Netherlands, Belgium (Fr.), Denmark, Chile, Finland, Austria, Italy, France, United States, Sweden, Luxembourg, Hungary, Iceland, Norway, Israel, Estonia]
Source: (OECD, 2009)
Figure 4: Average teacher wage-experience profile in Turkey
Source: (Dolton & Marcenaro Gutierrez, 2011)
Therefore neither the starting salaries nor the expectation of relatively higher salaries in the
teaching profession can explain the excess supply in the teacher labor market in Turkey.
Another important feature of the teacher labor market is that all public servants in
Turkey are protected by law and unions, and job separation is a very unlikely event.
As a result the teaching profession offers substantial job security, and given the presence of
very high chronic unemployment rates individuals value job security heavily. One study
(Caner & Okten, 2010) analyzes the college major choice decision in a risk and return
framework using university entrance exam data from Turkey and shows that individuals
are very sensitive to risk during career choice.
It should also be noted that total enrollment in education faculties in Turkey increased
steadily over time: The total enrollment increased from 33 thousand in 2007 to 45 thousand
in 2008 and 54 thousand in 2009, while MONE expands the teaching force by
approximately 40 thousand each year.⁷
Thus a combination of an intense demand for job security and increased quotas of
education faculties may provide a more sensible explanation for the excess supply in
teacher labor market in Turkey.
⁷ http://www.ogretmenportali.net/HaberGoster/228716e4-64bf-4b55-bb17-fc0ee89baf38/atanmayan-ogretmen-ordusu-buyuyor.aspx
5. Legal framework of teacher selection in Turkey
There are three main legal sources which regulate the hiring of teachers in Turkey. First,
teachers working in the public sector are subject to Law No. 657. This law has defined the
rights as well as legal obligations of public servants since 1965. Second, the regulation of
the tests concerning the assignments of public servant candidates has described the testing
procedure for public servant posts since 2002. Third, MONE's regulation of teacher
assignment and replacement explains how the testing procedure and test results apply to
the teacher selection process. The current version of this regulation was legislated in 2010 and
it has changed many times in the past according to the needs of MONE.
The regulation of the tests concerning the assignments of public servant candidates
forms a turning point in teacher selection because it caused a radical change in
teacher selection policy in Turkey.
In the teacher selection system before the legislation of this regulation, i.e. prior to 2002, any
eligible teacher candidate was able to apply to any available position announced by
MONE. The applications were processed in the provincial offices of MONE and then the
final decision was made by the headquarters of MONE in the capital, Ankara (Figure 5).
Figure 5: A presentation of teacher selection system before 2002
This system was a cause of concern in MONE as well as in the State Planning Organization
(SPO) (SPO, 1989). One of the main issues of the pre-2002 system was highlighted by
MONE as a constant imbalance of the teacher population across regions. According to the
Research and Development department of MONE, one preliminary report of the 1993
National Education Assembly stressed that more than 10 percent of teachers employed by
MONE in urban areas did not teach a single class. Another issue documented in MONE's
records and associated with the pre-2002 system was that political pressures and interventions
damaged the fairness and equality principles in teacher employment and caused unrest
among teachers (EARGED, 1995). Indeed it was publicly well known that having
connections in provincial offices as well as in the capital was essential to get hired. Thus
nepotism was a general worry about this selection process.
Following the legislation of the above-mentioned testing regulation, the Center of Student
Selection and Placement (OSYM) launched a central examination process which is
known as the Public Servant Selection Examination (KPSS). This exam has two sessions.
In the first session the teacher candidates have to answer 120 multiple choice questions
about Turkish, Mathematics, History, Citizenship, General Culture and Geography in 180
minutes. In the second session the teacher candidates have to answer 120 multiple choice
questions about educational psychology, educational programs and teaching, and
educational guidance in 180 minutes. Then applicants are assigned to teaching positions
centrally by MONE according to their test scores in the central examination and their
ranked list of preferred teaching positions (Figure 6). OSYM conducts the exam annually
and if a teacher candidate fails to be placed in a teaching position then s/he has to take the
exam again in the following year.
Figure 6: A hypothetical presentation of teacher selection after 2002
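One simple way to picture an allocation driven only by test scores and ranked preference lists is a score-ordered assignment, in which candidates are processed from the highest score downwards and each receives the first position on their list that is still vacant. The sketch below is an assumption about how such a rule could work, not a description of MONE's actual procedure; the function name, the tie-breaking rule, and all candidates, scores and locations are hypothetical.

```python
from typing import Dict, List, Optional


def assign_positions(
    scores: Dict[str, float],
    preferences: Dict[str, List[str]],
    capacities: Dict[str, int],
) -> Dict[str, Optional[str]]:
    """Assign candidates to positions in descending order of test score.

    Each candidate receives the highest-ranked position on their own list
    that still has a vacancy; candidates whose listed positions are all
    filled remain unplaced (None).
    """
    remaining = dict(capacities)  # copy so the caller's dict is not mutated
    placement: Dict[str, Optional[str]] = {}
    # Ties are broken by candidate name here, purely for determinism.
    for candidate in sorted(scores, key=lambda c: (-scores[c], c)):
        placement[candidate] = None
        for position in preferences.get(candidate, []):
            if remaining.get(position, 0) > 0:
                remaining[position] -= 1
                placement[candidate] = position
                break
    return placement


# Hypothetical candidates, scores and locations, for illustration only.
example = assign_positions(
    scores={"A": 85.0, "B": 92.5, "C": 78.0},
    preferences={"A": ["Ankara", "Izmir"], "B": ["Ankara"], "C": ["Ankara", "Izmir"]},
    capacities={"Ankara": 1, "Izmir": 1},
)
print(example)  # {'B': 'Ankara', 'A': 'Izmir', 'C': None}
```

In this toy run the unplaced candidate C corresponds to a candidate who would have to retake the exam the following year.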
In this teacher selection system it is not possible to game the hiring process, nor is it
possible to leverage nepotism in order to get a teaching position. Thus it is reasonable
to claim that the central examination and the allocation of teaching positions based on test
scores address the problem of lack of fairness. However two questions remain to be
answered: Does the new system ensure that qualified teachers are employed? Does
this system have an impact on the regional imbalance of the teacher population? The first
question is critical because it was one of the main goals of the BEP. The second question is
critical because it was a chronic problem of the education system (EARGED, 1995; SPO,
1989).
6. Data
In order to answer these research questions I employed the TIMSS 1999⁸ and TIMSS 2007⁹
data sets for Turkey. These data sets have some very important qualities which render
them very suitable for analyzing the questions of interest.
First, as mentioned earlier, these projects assess a representative set of 8th graders in the
participating countries. 8th grade is the final grade of primary education in Turkey and
thus students in the sample should have spent at least a couple of years in their current
institutions.
Second, it is possible to link teachers to students in the same classroom which makes
these data sets especially attractive for this analysis.
⁸ http://timss.bc.edu/timss1999.html
⁹ http://timss.bc.edu/timss2007/index.html
Third, the TIMSS project conducts four questionnaires, i.e. student, school, mathematics
teacher and science teacher questionnaires. The student and teacher questionnaires
contain extensive information about demographic and socioeconomic characteristics of
students and teachers. In addition, the school questionnaire contains information on
school location, resources and governance.
Fourth, the information collected in 1999 and 2007 is comparable to a certain extent. The
questionnaires in 1999 and 2007 do not overlap extensively; however most of the
essential information is available in both data sets.
Fifth and most importantly, the policy change which is subject to evaluation in this
study falls between 1999 and 2007, the years in which Turkey participated in TIMSS.
This allows me to have a reasonable number of observations that are subject to the policy
change which was launched in 2002.
Lastly, teacher experience is reported in single years such as 1, 2, 3 etc. rather than in
categories such as 0-4, 5-8 etc. This distinction is crucial for this analysis because the
data on teacher experience in TIMSS allow me to define the treatment and control groups
with respect to the inception date of the policy change.
7. Methodology and empirical analysis
For the empirical analysis, first, I merged the student, school and teacher data sets for
1999 and 2007 and combined the 1999 and 2007 TIMSS data sets. Then I defined the
treatment group as the students whose teachers have four or fewer years of experience. This
assumption is necessary because I do not observe whether the teachers were selected via
central examination or not. Thus I claim that this definition of treatment group
approximates the ideal case.
The justification of this assumption is based on the timing of the TIMSS application and
the central examination. The first central examination in Turkey was conducted in July
2002; OSYM announced the test scores in August 2002¹⁰ and MONE distributed the
teaching posts based on the announced test scores in September, October and November
2002.¹¹ The TIMSS 2007 application in Turkey, on the other hand, was conducted in
April, May and June 2007 (Olson, Martin, Mullis, & Arora, 2008). Thus a teacher who
was selected with the first central examination would have been assigned to the post as early
as September 2002 and the same teacher would have answered the TIMSS teacher
questionnaire as late as June 2007. According to this hypothetical example this teacher
would not have five years of experience at the time of the TIMSS application. Therefore the
treatment group is assumed to be as defined above.
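The timing argument can be verified with a few lines of date arithmetic. The sketch below counts completed years between the earliest possible assignment and the latest possible questionnaire date and applies the four-year cutoff. The exact days of the month are assumptions made only to construct dates, and the function name is illustrative; how teachers actually round their reported experience in the questionnaire is not addressed here.

```python
from datetime import date


def completed_years(start: date, end: date) -> int:
    """Number of full years elapsed between two dates."""
    years = end.year - start.year
    # Subtract one if the anniversary has not yet been reached in the end year.
    if (end.month, end.day) < (start.month, start.day):
        years -= 1
    return years


# Earliest assignment and latest questionnaire months are taken from the text;
# the day of the month is an assumption made only to build a date.
earliest_assignment = date(2002, 9, 1)
latest_questionnaire = date(2007, 6, 30)

experience = completed_years(earliest_assignment, latest_questionnaire)
is_treated = experience <= 4
print(experience, is_treated)  # 4 True
```

Under these assumptions even the longest-serving teacher hired through the first central examination has four completed years of experience, which is consistent with the cutoff used to define the treatment group.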
However this is an imperfect measure of selection via central examination. First, teacher
turnover leads to measurement error, because it is possible to quit and later return to teaching;
this may be an issue especially for female teachers who may substitute child raising for
teaching for a couple of years. Second, OSYM conducted another central
examination which is known as the Central Elimination Examination for Institutions (KMS)
in 2001.¹² KMS was different from KPSS and it is not clear how many teaching posts
were distributed based on KMS scores or whether KMS scores were the sole
determinant of the teacher assignments. This issue may also lead to measurement error.
¹⁰ http://www.osym.gov.tr/belge/1-6128/2002-sinavlari.html
¹¹ http://personel.meb.gov.tr/sayfa_goster.asp?ID=207
¹² http://www.osym.gov.tr/belge/1-12485/2001-sinavlari.html
Keeping these shortcomings in mind, I compared the difference in average
student achievement between treatment and control groups in 1999 and 2007 with a basic
differences-in-differences approach. The main assumption of this approach is that the
change in mean test scores that the control group experiences over time reflects the same
change that the treatment group would have experienced had they not been exposed to the
treatment. Another important assumption of the differences-in-differences approach is that
unobserved characteristics have the same distribution across time points and across
treatment groups. I will discuss the validity of these assumptions in the subsequent
sections.
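In its simplest form, without control variables, the differences-in-differences estimate is the change over time in the treatment group minus the change over time in the control group. The sketch below illustrates this with four group means. All numbers are hypothetical and invented for illustration; they are not results from the TIMSS data, and the function name is not from the original analysis.

```python
def did_estimate(
    treated_pre: float,
    treated_post: float,
    control_pre: float,
    control_post: float,
) -> float:
    """Two-by-two difference-in-differences estimate from four group means."""
    change_treated = treated_post - treated_pre
    change_control = control_post - control_pre
    return change_treated - change_control


# Hypothetical mean test scores, invented for illustration only.
# "treated" = students whose teacher has four or fewer years of experience;
# "pre" = TIMSS 1999, "post" = TIMSS 2007.
effect = did_estimate(
    treated_pre=410.0,
    treated_post=440.0,
    control_pre=430.0,
    control_post=445.0,
)
print(effect)  # 15.0
```

The control group's change (15 points here) stands in for what would have happened to the treatment group without the policy, which is exactly the first assumption stated above.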
For the differences-in-differences analysis I have estimated the following regression
models:
Table 2: Difference-in-Differences estimations
The dependent variable in these regression models is either the
mathematics or the science test score. However it should be mentioned that TIMSS does not
provide point estimates of mathematics and science test scores; instead TIMSS gives five
plausible values of mathematics and science ability. For the sake of simplicity I averaged
the five plausible values for each subject and then used the averaged plausible values as
the measure of the subject test score. The TIMSS 2007 Technical Report highlights that
taking the average of the plausible values will not yield suitable estimates of individual
student scores (Olson, et al., 2008). In this analysis I repeated some of the estimations
with the plausible values and then compared the point estimates and the standard errors of the
population parameter of interest. In all cases the point estimates were very close to
each other and the standard errors were slightly larger, which did not affect the statistical
significance levels.
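A common way to work with plausible values directly is to run the same estimation once per plausible value and then pool the results with the standard multiple-imputation combining rules (often called Rubin's rules). The text does not state which formula was used for the robustness check, so the sketch below is an assumption about one reasonable procedure; the function name and all numbers are hypothetical.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def combine_plausible_values(
    estimates: Sequence[float], sampling_variances: Sequence[float]
) -> Tuple[float, float]:
    """Pool one estimate per plausible value using Rubin's combining rules.

    Returns the pooled point estimate and its total variance, which adds the
    between-plausible-value variance to the average sampling variance.
    """
    m = len(estimates)
    pooled = mean(estimates)
    within = mean(sampling_variances)
    between = variance(estimates)  # sample variance across the m estimates
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical coefficients and squared standard errors from five
# regressions, one per plausible value; invented for illustration only.
pooled, total_variance = combine_plausible_values(
    estimates=[10.0, 12.0, 11.0, 9.0, 13.0],
    sampling_variances=[4.0, 4.0, 4.0, 4.0, 4.0],
)
print(pooled, total_variance)
```

Because the between-plausible-value term is added to the sampling variance, the pooled standard error is at least as large as the average one from a single regression, which matches the pattern of slightly larger standard errors reported above.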
These regression models include an indicator for the TIMSS cycle (1999 or 2007) and a
treatment variable which equals 1 if the subject teacher has four
or fewer years of experience. Observed information regarding teachers, students, classes
and schools enters the regression models as control variables (Table 3).
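A generic difference-in-differences regression consistent with this description can be written as follows. All symbols are assumptions chosen for illustration and need not match the notation or the exact set of models estimated in Table 2.

```latex
\begin{equation}
y_{ict} = \beta_0 + \beta_1\, T_t + \beta_2\, D_{ict}
        + \delta\, (T_t \times D_{ict}) + X_{ict}'\gamma + \varepsilon_{ict}
\end{equation}
```

Here $y_{ict}$ is the test score of student $i$ in class $c$ in TIMSS cycle $t$, $T_t$ equals 1 for the 2007 cycle and 0 for 1999, $D_{ict}$ equals 1 if the subject teacher has four or fewer years of experience, and $X_{ict}$ collects the control variables. Under the assumptions discussed above, $\delta$ is the difference-in-differences parameter.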
The list of control variables was constructed within the data limitations. The
variables available in the TIMSS 1999 and 2007 data sets do not overlap to a significant
degree, and in some cases, although the necessary variables are available in both data sets,
the scales of measurement are different. For example this was a serious issue for the
school location variable. All in all I experimented with every variable which is available
in both data sets. The number of missing observations also had a partial impact on the list
of control variables.
Table 3: List of Control Variables

Teacher characteristics   Class characteristics                    Student characteristics    School resources
Sex                       Diversity in academic ability            Sex                        An indicator for school resources
Age                       Diversity in socioeconomic background    Age                        Location
Subject degree            Presence of disruptive students          Parental education
Experience                Class size                               # books at home
                          Instructional time                       Computer at home
                                                                   Language spoken at home
Following the difference-in-differences analysis with mathematics and science
achievement I utilized another aspect of the data structure: the treatment variable varies
by subject. This means that the same student may have a mathematics teacher who has four
or fewer years of experience whereas his science teacher may have more than four years of
experience (or vice versa). Given that both the mathematics and science test scores are
observed for each student, this structure allows me to employ individual fixed effects.
For that purpose I pooled the mathematics and science data sets and incorporated student
fixed effects into the regression models defined in Table 2. This approach allowed me to
relax one of the assumptions associated with the difference-in-differences approach. After
adding student fixed effects to the model I do not have to assume that unobserved student
and school characteristics have the same distribution across time points and across
treatment groups. However I still have to assume that unobserved class characteristics
have the same distribution across time points and across treatment groups (Table 4).
Lastly it should also be mentioned that there are
other examples which employ very similar identification strategies, such as the study of
Lavy (2010). In that study the researcher establishes a causal link between instructional
time and student achievement by making use of the within-individual variation in the test
scores and within-subject variation in the instructional time. In its essence the
identification strategy I am employing is identical to the approach Lavy (2010) used, with
one exception: I embedded it in a difference-in-differences framework (Table 4).
Table 4: Fixed effects and difference-in-differences estimations
Although this identification strategy allows me to relax some of the assumptions of the
difference-in-differences approach, it also has its own shortcomings. First, it
automatically leads to a reduction in the sample size, and this problem becomes more
pronounced in the sub-group analysis. Second, it is not possible to decompose the effect
into two parts, namely learning gains in mathematics and learning gains in science.
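The following sketch illustrates the idea behind combining student fixed effects with
difference-in-differences. It is a simplification made here, not the estimation code of
this study: it omits all control variables, and it allows the average gap between the two
subjects to differ by cycle. With exactly two observations per student, including a student
fixed effect is equivalent to regressing the within-student score difference (mathematics
minus science) on the within-student difference in the treatment indicator. The treatment
effect is then the change in that slope between 1999 and 2007. All names and data are
hypothetical.

```python
from statistics import mean


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    mean_x, mean_y = mean(x), mean(y)
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def fe_did(students):
    """Change between cycles in the within-student slope on the treatment gap."""
    slopes = {}
    for cycle in (1999, 2007):
        cohort = [s for s in students if s["cycle"] == cycle]
        # Differencing across subjects removes anything fixed within a student.
        d_treat = [s["novice_math"] - s["novice_science"] for s in cohort]
        d_score = [s["math"] - s["science"] for s in cohort]
        slopes[cycle] = ols_slope(d_treat, d_score)
    return slopes[2007] - slopes[1999]


def make_students(cycle, novice_gap):
    """Noise-free hypothetical students with very different ability levels."""
    combos = [(1, 0), (0, 1), (0, 0), (1, 1)]
    abilities = [400.0, 450.0, 500.0, 550.0]
    return [
        {
            "cycle": cycle,
            "novice_math": nm,
            "novice_science": ns,
            "math": ability + 10.0 + novice_gap * nm,
            "science": ability + novice_gap * ns,
        }
        for ability, (nm, ns) in zip(abilities, combos)
    ]


students = make_students(1999, -20.0) + make_students(2007, -5.0)
estimate = fe_did(students)
```

In this made-up example the estimate is 15 points even though student ability differs
widely, because ability cancels out in the within-student difference. Students whose two
teachers fall in the same experience group have a treatment difference of zero and do not
help to identify the slope, and the differenced outcome mixes the two subjects, which
mirrors the two shortcomings noted above.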
8. Findings
The following table gives the estimated values of the coefficient of interest under the
different specifications described in Table 2 and also presents sub-group estimates of
this coefficient. The analysis was conducted separately for mathematics and science test
scores (Table 5).
Table 5: Estimation results of difference-in-differences
          Whole sample                   Female teacher sample          Male teacher sample            Below median achievers sample  Above median achievers sample
          Coef       Std Err   Adj R²    Coef       Std Err   Adj R²    Coef       Std Err   Adj R²    Coef       Std Err   Adj R²    Coef       Std Err
Mathematics
Model 1   -17.86     [13.94]   0.04      -23.53     [22.02]   0.07      -15.50     [17.44]   0.03      -2.38      [6.32]    0.03      -11.90     [8.50]
Model 2   -28.15*    [16.46]   0.08      -25.76     [20.89]   0.19      -20.10     [21.03]   0.06      -4.34      [7.47]    0.04      -18.43*    [10.30]
Model 3   -27.12     [17.26]   0.10      -10.76     [21.23]   0.26      -28.35     [24.38]   0.09      -2.80      [7.12]    0.05      -16.59     [11.18]
Model 4   -14.19     [14.02]   0.27      -7.01      [18.09]   0.38      -10.20     [20.20]   0.24      -2.40      [6.82]    0.10      -8.38      [9.29]
Model 5   -0.61      [13.56]   0.30      6.45       [17.84]   0.40      0.55       [19.41]   0.26      0.22       [6.66]    0.11      -0.26      [9.14]
Obs.      6,750                          2,757                          3,993                          3,354                          3,396
Science
Model 1   -15.49     [12.30]   0.07      -42.53**   [16.51]   0.10      4.36       [17.24]   0.05      -3.41      [6.31]    0.01      -8.46      [6.25]
Model 2   -17.42     [12.34]   0.09      -28.14     [17.63]   0.12      -2.40      [17.70]   0.08      -5.99      [5.76]    0.03      -5.87      [6.94]
Model 3   -17.85     [13.71]   0.14      -15.22     [21.73]   0.18      -12.49     [17.03]   0.17      -4.83      [6.05]    0.04      -4.79      [7.31]
Model 4   -6.89      [10.63]   0.31      -4.98      [17.31]   0.35      -5.00      [13.05]   0.31      -2.70      [5.23]    0.10      -2.21      [6.52]
Model 5   -6.29      [10.56]   0.31      -11.90     [18.50]   0.36      -2.73      [12.71]   0.31      -4.15      [5.32]    0.11      -1.38      [6.53]
Obs.      7,085                          3,131                          3,954                          3,536                          3,549
Robust standard errors in brackets clustered at the class level, *** p
The results in Table 5 draw attention to several important issues. First, the standard
errors are very large: among the 50 point estimates of the treatment effect only three are
statistically different from zero at the 10 percent significance level or better.
Second, almost all of the point estimates have a negative sign. Third, the point estimates
are not stable. In Model 1, without any control variables, the point estimates are
negative and large; however the addition of teacher, class, student and school
characteristics to the regression model pulls this negative treatment effect towards zero.
In some sub-groups the addition of these control variables also leads to sign changes. A
closer look at the female teacher and male teacher sub-groups shows that this problem is
much more severe in the female teacher sub-group. All in all, the difference-in-differences
analysis does not provide any information about the possible impact of the treatment on
student learning. Because of the very large standard errors the treatment effect may be
negative, zero or positive. However it also shows that observed class, student and school
characteristics do not have the same distribution across time points and across treatment
groups, given that the point estimates are unstable and change signs. Therefore it is also
very likely that unobserved class, student and school characteristics do not have the same
distribution across time points and across treatment groups, which is a violation of the
assumptions underlying the difference-in-differences approach. This may also be a sign of
differential assignment of teachers with four or fewer years of experience to classrooms
between 1999 and 2007. In the following I incorporate student fixed effects into the
regression models in order to take into account the factors at the student and school
levels (Table 6). However teacher and class characteristics vary between the subjects; thus
the regressions contain controls for observed teacher and class characteristics.
Table 6: Estimation results of student fixed effects and difference-in-differences
          Whole sample                   Female teacher sample          Male teacher sample            Below median achievers sample  Above median achievers sample
          Coef       Std Err   Adj R²    Coef       Std Err   Adj R²    Coef       Std Err   Adj R²    Coef       Std Err   Adj R²    Coef       Std Err
Mathematics & science scores combined
Model 1   3.68       [10.62]   0.01      15.10      [12.92]   0.04      7.11       [10.85]   0.03      3.94       [12.59]   0.02      2.94       [8.44]
Model 2   4.28       [9.36]    0.09      16.07      [12.61]   0.10      8.59       [13.17]   0.14      -0.60      [12.49]   0.20      2.42       [8.13]
Model 3   14.77**    [6.89]    0.22      41.56**    [18.32]   0.30      17.63      [14.13]   0.23      20.67***   [5.32]    0.52      6.17       [9.91]
Obs.      4,619                          612                            1,166                          2,959                          1,675
Robust standard errors in brackets clustered at the class level, *** p
The results in Table 6 are in contrast with the results in Table 5. Generally the standard
errors are smaller; more interestingly, with one exception all of the point estimates of
the treatment effect are positive. The point estimates are not sensitive to the addition of
teacher characteristics to the regression; however they are very sensitive to the addition
of class characteristics. According to Model 3, i.e. after controlling for teacher and
class characteristics, the impact of the treatment is estimated precisely for the whole,
female teacher and below median achievers samples.
The standard deviation of the dependent variable in the whole sample is 89. Thus the
impact of the policy change in 2002 on student achievement is around 0.17 standard
deviations. However the sub-group analysis shows that this impact is channeled mostly
through female teachers. The estimated treatment effect in the female teacher sample is
2.8 times as large as in the whole sample, whereas in the male teacher sample the impact is
not precisely estimated. Another important inference which can be drawn from Table 6 is
that below median achievers benefit more from the new teacher selection policy than above
median achievers. Thus the treatment effect is concentrated on below median achievers.
Lastly, the sensitivity of the point estimates to the addition of class characteristics is
in line with the findings in Table 5. This may be due to the within-school (between
classroom) differential assignment of teachers with four or fewer years of experience to
classrooms between 1999 and 2007.
The findings in Table 6 provide evidence in favor of a positive and moderately large
treatment effect. Thus it may be claimed that, within the contextual framework of Turkey,
teacher selection with centralized testing may lead to higher learning outcomes compared
to a decentralized recruitment system. However, there may be other underlying reasons
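The effect sizes quoted above follow directly from the Model 3 estimates in Table 6, as the
short calculation below shows. The approximate 95 percent interval is an addition made here
for illustration; it assumes a normal approximation and is not reported in the study.

```python
# Model 3 estimates from Table 6 and the standard deviation quoted in the text.
coef_whole = 14.77    # whole sample coefficient
se_whole = 6.89       # its clustered standard error
coef_female = 41.56   # female teacher sample coefficient
sd_score = 89.0       # standard deviation of the dependent variable

effect_in_sd = coef_whole / sd_score          # impact in standard deviation units
female_to_whole = coef_female / coef_whole    # relative size of the female estimate

# Rough 95 percent interval in standard deviation units (normal approximation,
# an assumption made here for illustration only).
low = (coef_whole - 1.96 * se_whole) / sd_score
high = (coef_whole + 1.96 * se_whole) / sd_score

print(round(effect_in_sd, 2), round(female_to_whole, 1))  # prints: 0.17 2.8
```

Under this rough approximation the interval runs from about 0.01 to about 0.32 standard
deviations, so the magnitude of the whole-sample effect is estimated with considerable
uncertainty even though it is statistically different from zero.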
which can potentially explain the findings in Table 6. For example, there may be a
secular increase in the quality of education faculties in Turkey. If this is the case, the
estimated impact may be due to the quality increase in education faculties instead of the
new teacher selection policy. Along the same line of thought, more and more high school
students with higher ability may opt for education faculties; thus the ability distribution
of the pool of teacher candidates may shift over time. However, if these arguments are
true, I should expect to detect positive estimates of the treatment effect for different
segments of teachers. In order to test these arguments I divided the sample of teachers who
have more than four years of experience into three parts such that the sizes of the
subsamples are equal. These segments are 5-8, 9-20 and 20+ years of experience. These
categories defined the alternative treatment variables for each case, and I repeated the
individual fixed effects exercise with the full model, which includes teacher and class
characteristics as controls. In Table 7 none of the point estimates is both positive and
statistically significant; additionally the statistically insignificant point estimates
are small when compared with the positive point estimates in Table 6. Thus I failed to
detect any positive treatment effect with the alternative treatment definitions.
Therefore it is more likely that the estimated impact is due to the new selection policy
rather than a secular increase in the quality of education faculties or the student body.
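The alternative treatment definitions can be sketched as below. This is an illustration
under stated assumptions, not the procedure used in this study: the function names are
hypothetical, a teacher with exactly 20 years of experience is placed in the 9-20 band, and
teachers with four or fewer years of experience are simply coded as zero on every
alternative indicator, since their handling in these regressions is not described above.

```python
def experience_band(years):
    """Map years of teaching experience to the bands named in the text."""
    if years <= 4:
        return "0-4"
    if years <= 8:
        return "5-8"
    if years <= 20:  # assumption: exactly 20 years belongs to the 9-20 band
        return "9-20"
    return "20+"


def alternative_treatments(years):
    """One 0/1 indicator per alternative treatment definition."""
    band = experience_band(years)
    return {name: int(band == name) for name in ("5-8", "9-20", "20+")}


# Hypothetical teachers with 3, 7, 15 and 25 years of experience.
examples = {years: alternative_treatments(years) for years in (3, 7, 15, 25)}
```

Each indicator would then replace the original treatment variable in turn, and the
individual fixed effects regression would be repeated once per indicator.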
9. Conclusion
These findings are suggestive in nature and are not suitable for making causal
inferences. Combining individual fixed effects with difference-in-differences allows for a
relatively precise estimate of the treatment effect. The remaining problem with this
approach is the lack of a complete set of classroom characteristics. The point estimates
are sensitive to the classroom characteristics, and unobserved classroom characteristics
may bias the estimate. However, the analysis as a whole suggests that the likely direction
of this bias is downward.
The findings also provide a reasonable explanation for the trend in TIMSS and PISA
results. First, since the analyzed period precedes the curriculum reform in Turkey, the
findings cannot be attributed to the curriculum reform. Second, the findings show a
concentrated impact on below median achievers but no impact on above median achievers.
This is in line with what we observe across PISA cycles for students in Turkey.
The findings are also in accordance with the literature on teacher quality. As mentioned
earlier, teachers' academic ability is one of the most robust indicators of teachers'
effectiveness (Hanushek, 2002, 2003; NCTQ, 2004). Basturk (2008) shows that test
scores in the college entrance exam are highly predictive of the KPSS test score. Therefore
it is reasonable to interpret success in KPSS as an indication of higher academic
ability.
Lastly, the following table depicts the degree of differential assignment of teachers to
schools and classrooms. It can be interpreted as evidence of MONE's attempts to ensure a
more balanced distribution of teachers across resource-rich and resource-poor regions.
As mentioned earlier, MONE as well as SPO were concerned about the imbalance of the
teaching force across regions (Table 8).
After the introduction of the central examination the teaching force became much more
female, and the new teachers were assigned to classrooms which were much more diverse in
terms of socioeconomic background and had fewer resources for instruction. The
Lavy, V. (2010). Do Differences in Schools' Instruction Time Explain International
Achievement Gaps in Math, Science, and Reading? Evidence from Developed
and Developing Countries: National Bureau of Economic Research.
Martin, M.O., Mullis, I.V.S., Foy, P., & Olson, J.F. (2008a). TIMSS 2007: International
Mathematics Report: Findings from IEA's Trends in International Mathematics
and Science Study at the Fourth and Eighth Grades: IEA TIMSS & PIRLS
International Study Center, Lynch School of Education, Boston College.
Martin, M.O., Mullis, I.V.S., Foy, P., & Olson, J.F. (2008b). TIMSS 2007: International
Science Report: Findings from IEA's Trends in International Mathematics and
Science Study at the Fourth and Eighth Grades: IEA TIMSS & PIRLS
International Study Center, Lynch School of Education, Boston College.
Martin, M.O., Mullis, I.V.S., O'Connor, K.M., Chrostowski, S.J., Gregory, K.D., Smith,
T.A., & Garden, R.A. (2001a). Mathematics benchmarking report: TIMSS 1999
Eighth grade. Chestnut Hill, MA: International Study Center.
Martin, M.O., Mullis, I.V.S., O'Connor, K.M., Chrostowski, S.J., Gregory, K.D., Smith,
T.A., & Garden, R.A. (2001b). Science benchmarking report: TIMSS 1999
Eighth grade. Chestnut Hill, MA: International Study Center, Lynch School of
Education, Boston College.
NCTQ. (2004). Increasing the Odds: How Good Policies Can Yield Better Teachers:
NCTQ.
OECD. (2009). Education at a Glance 2009: OECD Indicators: Organization for
Economic Cooperation and Development.
OECD. (2010). PISA 2009 Results: Learning Trends: OECD.
Olson, J.F., Martin, M.O., Mullis, I.V.S., & Arora, A. (2008). TIMSS 2007: Technical
Report: International Association for the Evaluation of Educational Achievement.
Rivkin, S.G., Hanushek, E.A., & Kain, J.F. (2005). Teachers, schools, and academic
achievement. Econometrica, 73(2), 417-458.
Rockoff, J.E. (2004). The impact of individual teachers on student achievement:
Evidence from panel data. The American Economic Review, 94(2), 247-252.
Santiago, P. (2002). Teacher demand and supply: Improving teaching quality and
addressing teacher shortages. OECD Education Working Papers.
Schacter, J., & Thum, Y.M. (2004). Paying for high- and low-quality teaching. Economics
of Education Review, 23(4), 411-430.
SPO. (1989). Altinci bes yillik kalkinma plani 1990-1994. Ankara: SPO.
TTKB. (2008). İlköğretim Matematik Dersi 6-8. Sınıflar Öğretim Programı ve Kılavuzu
(Teaching Syllabus and Curriculum Guidebook for Elementary school mathematics
course: Grades 6 to 8). Ankara: Ministry of National Education (MONE).