AN EXAMINATION OF THE ACHIEVEMENT GAP AND SCHOOL LABELS IN A
SOUTHWEST SUBURBAN DISTRICT IN THE UNITED STATES
By Matthew D. Strom
A Dissertation
Submitted in Partial Fulfillment
of the Requirements for the Degree of
Doctor of Education
in Educational Leadership
Northern Arizona University
December 2011
Approved:
Richard L. Wiggall, Ed.D., Chair
Walter J. Delecki, Ph.D.
Gary Emanuel, Doctor of Arts
George Montopoli, Ph.D.
ABSTRACT
AN EXAMINATION OF THE ACHIEVEMENT GAP AND SCHOOL LABELS IN A
SOUTHWEST SUBURBAN DISTRICT IN THE UNITED STATES
MATTHEW D. STROM
School labeling, or ranking, has become commonplace in the NCLB era of
school accountability. Most states have implemented a system that enables the public to
compare school to school and district to district. Labeling systems were intended by
NCLB to measure the effectiveness of a school and its ability to ensure an equal
education for subgroups throughout its population. NCLB was a “call to arms” to
address the epidemic of lagging student achievement in minority subgroup populations
throughout the United States. Schools that did not leave any children behind were
intended to be recognized as superior to the rest. Ten years later, research is muddled on
the effects of NCLB with respect to the very achievement gap it sought to address.
School ranking systems throughout the United States are being examined on how well
they identify schools that have met the requirements of NCLB. The primary requirement
of NCLB is for a school to close the achievement gap. This study examines the
achievement gap in a suburban school district within the state of Arizona and,
consequently, the labels attached to this district’s schools by the Arizona Department of
Education (ADE). The findings of the research can be summarized under two major
themes. The first theme was that a large majority of schools in this district, whether
labeled exceptional or non-exceptional by ADE, still had significant work to do in
closing the academic achievement gap between ethnicities. The second theme was that a
school label within this district is highly associated with certain demographic variables.
Combining these themes results in a better understanding of the relationship between
ADE-issued school labels and the ability of a school to accomplish the mandate set forth
in NCLB of closing the achievement gap between ethnic subgroups. Administrators,
teachers and parents throughout the suburban
school district need to be aware of the relationships studied in this research. School-level
and district-level administrators throughout the district must understand that while several
schools in the district, and the district itself, are viewed favorably throughout the state of
Arizona, there is still much to accomplish with respect to closing the achievement gap.
Teachers throughout the district must not rest on the accomplishment of their school
being labeled highly by ADE. Teaching, educating and mentoring students of different
ethnicities is not best measured in a school label. Minority parents throughout the
district, as a result of this research, need to continue to become educated about what a
school is doing to best serve the needs of their individual child. Minority parents must
understand that this is the case whether their child is attending a school with an excelling
label or an underperforming label. All stakeholders within this district must be cautious
in consuming the ADE-issued school labels. Specifically, the stakeholders must be
careful in interpreting what a school label means for an individual child and in particular
an ethnically diverse individual child.
ACKNOWLEDGEMENTS
My grandfather, Herman Strom, fought in World War II to earn the GI Bill so he
could get his college degree. I would like to express my gratitude to the family
members who have sacrificed in their lives so that I would be in a position to accomplish
what I have done in mine. These family members include Herman Strom, Madeline Strom,
Harold Davis, Letha Davis, Andy Strom, Megan Strom and Betsy Jenkins. Most
importantly, I would like to express my thanks to my parents, Larry and Kathy Strom.
I am very grateful to have had a strong committee for this dissertation. Dr. Ric
Wiggall, my chair, served as a critical voice that remained positive throughout the
constant revision process. Dr. George Montopoli was a source of knowledge in the statistical analysis
relevant to my ideas. Dr. Walter Delecki and Dr. Gary Emanuel provided feedback and
suggestions to ensure the quality of my dissertation. I also wish to express my gratitude
to Dr. Edie Hartin whose expertise in writing ensured a smooth dissertation process.
I also had a group of colleagues who helped me cope with the daily reality of
being employed full time while enrolled in a doctoral program. Whether it was through a
round of golf, a game of cards or a lunchtime venting session, the colleagues
and friends to whom I owe thanks for preserving my personal sanity include Darin
Lawton, Matthew Barber and Sean Casey. Many thanks to all of the aforementioned
people, as this dissertation would not have been started or finished without you.
This is truly a shared accomplishment.
TABLE OF CONTENTS
CHAPTER
1 Overview
Introduction
Berliner, School Accountability and the Achievement Gap
The Contraposition of Berliner
Statement of the Problem
Purpose of Study
Research Questions
Significance of the Study
Delimitations
Limitations
Definition of Terms
Organization of the Study
Summary
2 Review of the Literature
Introduction
History of Assessment
History of Achievement Gap
Historical Background of Equity in Education
No Child Left Behind
Studies and Reports Regarding the Trends in the Achievement Gap since NCLB
Summary
3 Methodology
Introduction
Restatement of the Problem
Restatement of Research Questions
Research Design
Target Population
Sample
Sampling Procedures
Data Collection Procedures
Data Analysis
Validity
External Validity
Internal Validity
4 Findings and Results
Introduction
Analysis of the Achievement Gap Using AIMS Proficiency Percentage
2010 AIMS Summary – Overall
2010 AIMS Summary – By Subject
2011 AIMS Summary – Overall
2011 AIMS Summary – By Subject
Summary of AIMS Proficiency Data from Spring 2010 and Spring 2011
Analysis of the Achievement Gap Using AIMS Scale Score
2010 and 2011 Achievement Gap Analyzed through Average Scale Score
Summary of ANOVAs and Average Scale Score
Ethnicity Proportion and Z-Score
Correlation and Linear Regression between Ethnicity Proportions and Z-Score
2010 Data
2011 Data
Summary of Linear Regression Analysis
A-F Letter Grade and Demographic Data
Regressing A-F Letter Grade Value onto Ethnicity Proportions, Free and Reduced Lunch Rate and ELL Proportions Using Multiple Linear Regression
Summary of Relationship between School Level Variables and School Letter Grades
Summary of Chapter 4
5 Conclusions, Summary, Implications, and Recommendations
Summary of the Study
Overview of the Problem
Purpose Statement
Research Methodology
Major Findings Summary
Research Question 1
Research Question 2
Research Question 3
Research Question 4
Major Findings Discussion
Findings Related to the Literature
Divergent Findings
Conclusions
Implications for Action
Recommendations for Further Research
Concluding Remarks
REFERENCES
APPENDICES
A CUSD IRB Approval
BIOGRAPHICAL INFORMATION
LIST OF TABLES
TABLE
1 ELL Percentages and Free and Reduced Lunch Percentages Comparison
2 Ethnic Comparison between Suburban School District and State of Arizona
3 Spring AIMS 2010 Subgroup Performance by Ethnicity and Elementary School
4 Spring AIMS 2010 Subgroup Performance by Ethnicity and Junior High School
5 Spring AIMS 2010 Subgroup Performance by Ethnicity and High School
6 Spring AIMS 2010 Subgroup Proficiency Gaps Summary of All Schools
7 Spring AIMS 2010 Subgroup Proficiency Gaps Summary of Excelling Schools
8 Spring AIMS 2010 Elementary School #1 Performance by Subject and Ethnicity
9 Spring AIMS 2010 Junior High #4 Performance by Subject and Ethnicity
10 Spring AIMS 2010 High School #1 Performance by Subject and Ethnicity
11 Percent of All District Schools with Observed Gap in Mathematics, Reading and Writing by Ethnicity for 2010 Spring AIMS Administration
12 Number of All District Excelling Schools with Observed Gap in Mathematics, Reading and Writing by Ethnicity for 2010 Spring AIMS Administration
13 Spring AIMS 2011 Subgroup Performance by Ethnicity and Elementary School
14 Spring AIMS 2011 Subgroup Performance by Ethnicity and Junior High School
15 Spring AIMS 2011 Subgroup Performance by Ethnicity and High School
16 Spring AIMS 2011 Subgroup Proficiency Gaps Summary of All Schools
17 Spring AIMS 2011 Subgroup Proficiency Gaps Summary of Excelling Schools
18 Spring AIMS 2011 Elementary School #1 Performance by Subject and Ethnicity
19 Spring AIMS 2011 Junior High #4 Performance by Subject and Ethnicity
20 Spring AIMS 2011 High School #1 Performance by Subject and Ethnicity
21 Percent of All District Schools with Observed Gap in Mathematics, Reading and Writing by Ethnicity for 2011 Spring AIMS Administration
22 Number of All District Excelling Schools with Observed Gap in Mathematics, Reading and Writing by Ethnicity for 2011 Spring AIMS Administration
23 Average Scale Score throughout the District on 2010 AIMS Mathematics Administration
24 Levene’s Test for Homogeneity of Variance P-Values for Each School across Ethnicities with Respect to Average Scale Score
25 Kolmogorov-Smirnov (KS) P-Values for Normality for 2010 and 2011 AIMS Distributions by Ethnicity
26 Results from the 2010 ANOVA for 29 District-Wide Schools that did not violate the Assumptions of the ANOVA
27 Results from the 2011 ANOVA for 34 District-Wide Schools that did not violate the Assumptions of the ANOVA
28 2010 Results for Tukey HSD – Post Hoc Tests
29 2011 Results for Tukey HSD – Post Hoc Tests
30 Linear Correlation Summary for 2010 Z-Score regressed on Percentage of Asian/Caucasian (ACP) Students at a School
31 Coefficients and Standard Error of Coefficients for 2010 Z-Score Regressed onto Percentage of Asian/Caucasian Students at a School
32 Linear Correlation Summary for 2011 Z-Score regressed on Percentage of Asian/Caucasian (ACP) Students at a School
33 Coefficients and Standard Error of Coefficients for 2011 Z-Score Regressed onto Percentage of Asian/Caucasian Students at a School
34 Inter-correlation Matrix for Three Variables being examined in 2011 Multiple Linear Regression
35 Collinearity Statistics for Three Variables used in 2011 Multiple Linear Regression
36 2011 Multiple Linear Regression Model for Letter Grade regressed onto the variables of SUM3 and Percentage of Asian Students at a School
37 2011 Single Variable Linear Regression Model for Letter Grade regressed onto the variable of Free and Reduced Lunch Percentage
38 2011 Coefficient of Determination for Letter Grade regressed onto the variable of Free and Reduced Lunch Percentage
LIST OF FIGURES
FIGURE
1 NCLB Student Achievement Expectations for English for All Subgroups
2 NCLB Student Achievement Expectations for Mathematics for All Subgroups
3 Scatterplot of 2010 Z-Score versus Proportion of Asian and Caucasian Students at a School
4 Residual Plot for Regression Model regressing 2010 Z-Score onto Percentage of Asian/Caucasian Students at a School
5 Scatterplot of 2011 Z-Score versus Proportion of Asian and Caucasian Students at a School
6 Residual Plot for Regression Model regressing 2011 Z-Score onto Percentage of Asian/Caucasian Students at a School
7 Residual Plot for 2011 Regression that Regresses School Letter Grade onto Four Independent Variables
8 Scatterplot Matrix for All Variables in 2011 Multiple Regression
9 Residual Plot for 2011 Regression that Regresses School Letter Grade onto Three Independent Variables
10 Scatterplot Matrix for Three Variables in 2011 Multiple Regression
11 Residual Plot for 2011 Regression that Regresses School Letter Grade onto SUM3 and Percent of Students at a School that are Asian
DEDICATION
This dissertation is dedicated to the four people who have shared in the sacrifice,
time commitment, highs and lows throughout the process. I would like to dedicate this
dissertation to my wife, Marcia, and three sons, Zavian, Quentin and Elijah. My passion
for educational equity has been maximized by your involvement in my life. My belief in
my own abilities has been secured through your constant support. And my purpose in life
is solidified in your existence.
“Do we truly will to see each and every child in this nation develop to the peak of his or
her capacities?”
Asa Hilliard, 1991
CHAPTER 1
Overview
Introduction
The average person makes many decisions on a daily basis, both serious and
mundane. In making these decisions, they have to account for various
competing needs that must be prioritized. So, more often than not, the average person
finds themselves looking for a label of sorts to help them make a decision that is both
informed and efficient. For example, one might read food labels to sort out the poor from
the good quality products; or society may judge politicians by the label of their political
party. Specifically, this study focuses on the average person who uses labels to identify
which school may provide the best quality education for their child while understanding
the requirement of No Child Left Behind (NCLB) to close the achievement gap. Prior to
NCLB, a parent or guardian used sources such as word-of-mouth information from
community members to gather more information about a school. Now parents and
guardians alike enjoy the convenience of judging a school based on a label garnered from
student achievement data.
In January 2002 President George W. Bush signed into law the No Child Left Behind Act of 2001
(NCLB). A reauthorization of the 1965 Elementary and Secondary Education Act,
NCLB was implemented with bipartisan support throughout the legislative branch (Hess
& Petrilli, p. 18). In fact, spearheading the implementation of NCLB and school
accountability were Democrats Senator Edward Kennedy and Representative George
Miller, and Republicans Senator Judd Gregg and Representative John Boehner (Hess &
Petrilli, p. 19). These four members of the United States Congress served as critical
leaders in molding the principles embedded in NCLB.
One of the main reasons that Democrats and Republicans favored NCLB was
its sweeping reform with respect to the achievement gap (Hess & Petrilli, p. 21). In
essence “the law is premised on the notion that local education politics are fundamentally
broken, and that only strong, external pressure on school systems, focused on student
achievement, will produce a political dynamic that leads to school improvement” (Hess
& Petrilli, p. 23). NCLB required states to set up standards and measure whether students
performed to those standards broken down by subgroup. Consequently, student
achievement broken down by subgroup could address the overriding concern of the
achievement gap. The goal became to close the achievement gap by the school year
ending in the spring of 2014.
In the era of NCLB, accountability is a mainstay for students, schools, districts,
and states. NCLB has caused the education system to emphasize a new culture of
accountability. It requires the closing of the achievement gap by 2014, and schools are
currently ranked, or labeled, based on their ability to make adequate yearly progress
(AYP) toward that goal. As a measurement, a school ranking should possess the quality
of correctly identifying those schools that are performing the best across all subgroups
and making progress toward closing the achievement gap by 2014. One might assume,
based on NCLB requirements, that the schools ranked highest would be those showing gains
toward closing the achievement gap or those that have already closed
the gap. Branding a school with the highest label even though it shows no progress in
diminishing the achievement gap may draw criticism about the validity of the
measurement system used with respect to the goals of NCLB.
Upon implementing policies to satisfy NCLB, the state of Arizona
determined that, in order to be identified as an excelling school, a school must be at least
one standard deviation above the average school in the percentage of students who exceed
on the Arizona Instrument to Measure Standards (AIMS) test (ADE, 2008, p. 21).
Using this method of measurement to distinguish an excelling school from a highly
performing school raises the question of whether the achievement gap is being closed at
schools in the state of Arizona. If a school needs only to score one standard deviation
above average in the percentage of students who exceed, then are schools with a high
proportion of White and Asian students at an advantage? And if this is the case, then does
the label attached to schools in
Arizona have meaning beyond identifying the demographics and socioeconomic status of
a school? Essentially, are schools in the state of Arizona labeled as such because they
continue to attack the educational epidemic of low student achievement within ethnic
minority subgroups?
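The one-standard-deviation criterion cited from ADE (2008) can be illustrated with a minimal sketch. The school names and percentages are hypothetical, the function name is introduced only for illustration, and the use of the population standard deviation is an assumption. The sketch covers only the single criterion described here and should not be read as the full ADE labeling formula.

```python
from statistics import mean, pstdev


def excelling_candidates(pct_exceeding):
    """Return the schools whose percentage of students exceeding the
    standard is at least one standard deviation above the mean school.

    pct_exceeding maps a school name to its percent of students who exceed.
    Assumption: the population standard deviation is used; the source does
    not say which form ADE applied.
    """
    values = list(pct_exceeding.values())
    cutoff = mean(values) + pstdev(values)
    return sorted(name for name, pct in pct_exceeding.items() if pct >= cutoff)


# Hypothetical data, not taken from the study.
schools = {
    "School A": 10.0,
    "School B": 20.0,
    "School C": 30.0,
    "School D": 40.0,
    "School E": 50.0,
}

print(excelling_candidates(schools))  # → ['School E']
```

Because the cutoff is defined relative to the other schools rather than against a fixed standard, a school can meet it without any information about its subgroup gaps entering the calculation.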
The achievement gap between Hispanic and Black high school students and
their White and Asian peers appears to present an unsolved challenge
within the American and Arizona educational systems. These gaps have existed at the
national, state, district and school levels for decades. Furthermore, research suggests the
gaps are persisting within the twenty-first century educational climate. A
Center on Education Policy release in October of 2009 stated:
Across subgroups and states, there was more progress in closing gaps at the
elementary and middle school levels than at the high school level. Even with this
progress, however, the gaps between subgroups often remained large – upwards
of 20 percentage points in many cases (p. 2).
The annually recurring achievement gaps at schools throughout the nation are
alarming. In the national era of school, district and state accountability it has been
deemed mandatory that educators take corrective measures to address this continuing
trend (NAEP, 2009, p. 4).
The educational goal of closing the achievement gap is a necessity in ensuring the
civil rights of children in America. According to NCLB, in order for schools and districts
to receive their Title I funds each state shall establish annual measurable objectives for
subgroups within districts and schools. Schools and districts that fail to make adequate
yearly progress (AYP) toward those objectives for each subgroup will be subject to
corrective actions as determined by the state. NCLB mandated, to the applause of
politicians on both sides of the aisle, that the achievement gap be addressed within every
school, and district, nationwide.
Berliner, School Accountability and the Achievement Gap
David C. Berliner of Arizona State University is possibly one of the United
States’ foremost critics of NCLB and the remnants of school accountability. Berliner
views the accountability sought by NCLB as placing the blame for low achievement among
certain minority subgroups on teachers and administrators (Berliner, 2009). Berliner
argues that other factors, primarily out-of-school factors (OSFs), are more to blame for
the achievement gap in certain subgroups than the school, the teachers, or the
administrators.
One can ascertain that poverty and socio-economic status (SES) are central
problems for certain ethnic groups within America. Poverty exists at a higher rate in
America among both Hispanic and Black populations, 25.3% and 25.8% respectively,
than it does among White and Asian populations, 12.3% and 12.5% respectively
(DeNavas-Walt, Proctor, & Smith, 2010). With this in mind, Berliner argues that
children who live in poverty face several educational consequences that result in a
persistent achievement gap (Berliner, 2009). One out-of-school factor that Berliner
brings to light in his research is Low Birth Weight (LBW) and Very Low Birth Weight
(VLBW). “African Americans, for example, are almost twice as likely as European
Americans to have a LBW child and almost three times as likely to have a VLBW child”
(Berliner, 2009). He goes on to mention that birth weight and IQ are correlated at
approximately 0.70 and that LBW children grow up to have IQs that are on average 11
points lower than those born at or above normal birth weight. Berliner’s argument is
simply that the effects of poverty explain the achievement gap more readily than does a
failing educational system.
Throughout his research Berliner suggests several other OSFs that could be just as
influential on the achievement gap as teacher pedagogy. Berliner argues, with
accompanying statistics, that OSFs like food insecurity, pollution, family violence and
neighborhood communities can all have a significant impact on the achievement gap.
The negative aspects of all of these OSFs occur more frequently in lower socio-economic
status (SES) and high poverty areas. All of these OSFs present one more hurdle for a
student in the educational process. The majority of these OSFs occur at a higher rate
among Hispanic and Black students because they, in higher proportions, live in poverty.
Berliner is not the only researcher who believes that the impact of poverty on
education might be the biggest factor in the relentless achievement gap. Achievement
gaps among subgroups within a population do not just occur in the United States.
Birenbaum and Nasser (2006) and Zuzovsky (2008) concluded that there is an
achievement gap in Israel between children who speak Hebrew and those who speak
Arabic. The Arab population in Israel is typically from families that have parents with
less education, lower income levels and a higher percentage of families that live below
the poverty line. These studies found that Jewish children, those who speak Hebrew,
perform better at mathematics than the poorer Arabic-speaking children. In fact, Birenbaum and
Nasser (2006) found the coefficient of determination to be around 0.6. Thus, about 60
percent of the variation in mathematics achievement between Jewish and Arab children can be
explained by the variation in their socioeconomic status and educational
resources.
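The reading of a coefficient of determination as a share of explained variation can be illustrated with a short sketch. The data below are hypothetical and are not drawn from Birenbaum and Nasser (2006); the function name is introduced only for illustration, and a single-predictor least-squares line is assumed for simplicity.

```python
def r_squared(x, y):
    """Coefficient of determination for a one-predictor least-squares line.

    Computed as 1 - SS_residual / SS_total, i.e. the share of the variation
    in y that the fitted line on x accounts for.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Least-squares slope and intercept.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the line, and total variation in y.
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


# Hypothetical values: x is a socioeconomic index, y is a mathematics score.
ses_index = [1, 2, 3, 4, 5]
math_score = [2, 1, 4, 3, 5]

print(round(r_squared(ses_index, math_score), 2))  # → 0.64
```

In this made-up example, a value of 0.64 would be read as roughly 64 percent of the variation in the scores being accounted for by the socioeconomic index, which is the same reading applied to the value of about 0.6 reported above.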
The link between poverty, ethnic background and student achievement did not
begin with Berliner. The concern over these factors and equality of education started to
become a central focus when Dr. James S. Coleman of Johns Hopkins University published
Equality of Educational Opportunity in 1966. The study, known better as the Coleman
Report, concluded that, “black children started out school trailing behind their white
counterparts and essentially never caught up” (Viadero, p. 1). The study found that the
leading factor contributing to this perpetual achievement gap in students’ academic
performance was their family background (Viadero, p. 1). Borman and Dowling (2010)
summarized, in the introduction to their research, that Coleman’s finding still holds a lot
of educational clout. Family background was a variable that inevitably included the
socioeconomic status of the family and could be classified within Berliner’s idea of out-
of-school factors.
Although poverty is quickly dismissed by many politicians as an excuse to not
produce a better educational system, Berliner’s idea that failing schools and the
achievement gap may be more the result of poverty should not be ignored. Berliner
simply believes that, “the problems of achievement among America’s poor are much
more likely to be located outside the school than in it” (Berliner, 2009, pg. 4). Poverty
can create a multitude of side effects including poor health, lack of food, minimal
prenatal care and consequently children who, on average, underperform in academics.
The Contraposition of Berliner
Some rectangles are not squares. In Euclidean geometry this statement is true.
Logically, if the propositional statement is valid then the contraposition of that statement
must also be valid. “Some a are not b,” naturally implies that “Some not b are not a”
(Tidman & Kahane, 2003, p. 319). In this geometry case, the contraposition is some non-
squares are not rectangles and it must be valid in Euclidean geometry. This final
statement must also be true because the argument is valid and the premise in Euclidean
geometry is true (Tidman & Kahane, 2003, p. 8). Berliner argues that some variables
associated with poverty result in poor student achievement. Furthermore, he provides
statistical evidence to suggest that the statement is valid (Berliner, 2009). Therefore, the
contraposition of his argument must be both valid and true. The contraposition is that
some high (non-low) student achievement is the result of variables associated with wealth
(non-poverty).
In fact, the contraposition of Berliner’s proposition is something that American
educators and educational leaders continue to ignore. In light of the conflicting evidence
with respect to the achievement gap, one could be hard pressed to argue with Berliner’s
viewpoint that the achievement gap has been the result of something much broader than
the educational system. The failures of schools with respect to student achievement
might be caused by more than poor teachers and poor administrators. The failure of
schools might have more to do with our inability as a country to fix our inept social
policies for those in poverty than fixing our educational system (Berliner, 2009). But, if
we are to conclude this we must not continue to ignore the contraposition. Schools that
we deem to be good or excellent throughout our states and our nation might be this, not
because of their best practices in the classroom and in administration, but due to their
limited exposure to the ill effects of poverty.
Educators and educational leaders have long thrived on the single school in a
district where all children are exceeding the standards. In the state of Arizona, the
excelling schools are those written about in the papers and those recognized by the
public. It must be the curriculum at those schools; it must be the teachers at those
schools; it must be the administrative leadership at those schools that cause them to be
excelling schools. The envy of all other schools in the state of Arizona, excelling schools
are viewed as the places where things are done right; best practices are implemented and
leadership has a vision. Might it be that out-of-school factors are just as much to blame
for the excelling status of a school as OSFs are to blame for the failing status of another?
A staunch supporter of public education, Berliner attempts to protect the poor side
of education while glossing over the implications of his argument to the wealthy side.
The following research seeks to provide a foundation for framing school labels in the
state of Arizona: Berliner’s home state. Could it be that, even though a school is granted
a dignified label, little progress has been made at that school with respect to the
achievement gap? Might it be that these schools receive accolades merely because of
their demographics?
Statement of Problem
The purpose of this study was to examine the achievement gap in mathematics
and reading at all non-alternative schools within a suburban school district in the state of
Arizona for the 2009-2010 and 2010-2011 school years. The study was specifically
interested in student achievement, as measured by scale score, across ethnic subgroups
with respect to the state standardized AIMS examination. Furthermore, the study sought
to examine demographic reasons why schools in this district obtained a certain school
label. Other interests of the study included the descriptive analysis of cross-sectional data
in reading and mathematics at these schools from 2009-2010 and 2010-2011 and the
predictive abilities of the percentage of non-Black/Hispanic students with respect to the
percentage of students that exceeded on the AIMS examination. Using four main
research questions as a guide, data from two prior years was analyzed at schools
throughout the suburban school district.
Purpose of the Study
The purpose was to examine the achievement gap in mathematics and reading at
all non-alternative schools within a suburban school district within the state of Arizona
for the 2009-2010 and 2010-2011 school years. Furthermore, the study sought to
examine demographic reasons why the schools within the suburban school district
obtained high and low school labels. The study was specifically interested in student
achievement across ethnic subgroups with respect to the state standardized AIMS
examination. Another interest of the study included the descriptive analysis of cross-
sectional data in reading and mathematics at these schools from 2009-2010 and 2010-
2011. Furthermore, the study sought to define the predictive abilities of the percentage of
non-Black/Hispanic students with respect to the percentage of students that exceeded on
the AIMS examination. Using four main research questions as a guide, data from two
prior years was analyzed at schools throughout the suburban school district.
Research Questions
This dissertation was guided by the following questions:
1. What are the two-year cross-sectional data trends for the achievement gap among
White, Asian, Hispanic and Black students on the 2009-2010 and 2010-2011
AIMS mathematics, reading and writing sections at all schools in the suburban
school district?
2. Is the average student achievement, as measured by average scale score, in ethnic
subgroups different for the 2009-2010 and 2010-2011 AIMS examinations at each
non-alternative school throughout the suburban school district?
3. Is the percentage of Asian and White students correlated with the state-issued z-
score, a standardized score for the percent of students that exceed on the AIMS
examination at a school in a given year, which helps determine school labels
within the state of Arizona?
4. Are free and reduced lunch rates, English Language Learner rates, percentage of
Asian students and percentage of White students correlated with the AZ LEARNS
A-F letter grades published by the state of Arizona for schools within the
suburban district?
These questions examine the achievement gap in the suburban district in order to
establish a baseline for the current validity of the school labeling system.
Significance of the Study
Despite the efforts of NCLB nine years ago, the eradication of the achievement
gap continues to elude our nation, our states and our districts (CEP, 2009). In an attempt
to determine why the achievement gap still persists, one must identify a group of root causes.
Furthermore, the group of factors must be separated into what is controllable versus
uncontrollable by the education community. Of the factors that contribute to the
persistence of the achievement gap, many researchers believe that variance among
subgroup achievement can be most readily explained by out-of-school factors (OSFs).
OSFs can provide a multitude of reasons for the lingering gap (Berliner, 2010). Berliner
argues that schools are “not in the position to eliminate the achievement gap” because the
gap is the result of variables outside the schools’ control. Other researchers believe that
the unrelenting achievement gap is more related to teacher-level factors (Levine &
Marcus, 2007; Beecher & Sweeney, 2008; Harlan, 2009; Liew, Chen, & Hughes, 2010;
McKown & Weinstein, 2008), school-level factors (Burch, Theoharis, & Rauscher, 2010;
Marshall, 2009) or district-level factors (Diamond, 2006; Leithwood, 2010; Loesch,
2010). Hierarchical Linear Models (HLMs) have helped researchers examine the factors
at each of these levels in determining their effects on the achievement gap (Wei, 2008;
Zhang & Zhang, 2002).
HLMs have helped determine that a multitude of factors nested within many
different levels of the educational system contribute to the achievement gap. As Wei
(2008) noted, “school accountability systems should be designed so that classroom level
variation can be taken into consideration when quantifying the precision of school
rankings” (pg. 3). While OSFs, school-level factors and district-level factors are thought
to play a role in the achievement gap, each individual school still carries
the responsibility to close its own achievement gap. After all, OSFs are, by
definition, out of the locus of control of a school. Therefore, schools must remain
steadfast in their commitment to focus on those factors which they control and address
them so that all of their subgroups can perform academically.
Many schools throughout the state of Arizona continue to receive the
distinguished label of excelling by the Arizona Department of Education. In the state of
Arizona, school labels are divided into six categories:
1. Excelling
2. Highly Performing
3. Performing Plus
4. Performing
5. Underperforming, and
6. Failing
Is it possible that excelling schools still perpetuate an achievement gap despite being
labeled excelling? The main goal of NCLB was closing the achievement gap and
ensuring all students a basic level of education. Therefore, schools that achieve the
highest label in the state of Arizona should show significant strides in accomplishing this
goal. In an effort to provide analysis of the achievement gap at schools in the suburban
school district, this study performed a descriptive analysis of achievement gap data for
2009-2010 and 2010-2011. It also examined the statistical significance of AIMS achievement
across ethnic subgroups by using an ANOVA. The study looked into the correlation
between the percentages of non-Hispanic/Black students and the z-score issued by the
Arizona Department of Education for that school. Finally, the study examined the
correlation between new letter grades issued by the Arizona Department of Education and
free and reduced lunch rates, English Language Learner rates, percentage of Asian
students, and percentage of White students. The answers to the research questions
enable administrators, teachers, parents and other stakeholders to better understand what
it means to receive a certain label by the state of Arizona. For instance, a minority parent
will be able to establish what a label in this suburban school district means for his or her
child. Additionally, a principal will understand whether a label correctly identifies the
ability of their school to service the needs of minority students and close the achievement
gaps. The superintendent can improve his/her ability to recognize why a school in this
district achieves its label. Finally, this study gives a baseline for understanding whether
NCLB has had a significant impact on closing the achievement gap within this suburban
school district in the state of Arizona.
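The ANOVA step named above can be sketched with a short computation. The following is a minimal illustration in Python under stated assumptions, not the procedure or the data used in this study: the function name one_way_anova_f is hypothetical, and the scale scores are invented for four subgroups at a single imaginary school. The sketch computes only the F statistic and its degrees of freedom; the associated p-value would come from an F distribution table or a statistics package.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical AIMS scale scores for one school, by ethnic subgroup.
scores_by_group = {
    "White": [520, 540, 535, 550, 525],
    "Asian": [545, 560, 538, 552, 570],
    "Hispanic": [490, 505, 515, 498, 510],
    "Black": [485, 500, 512, 495, 508],
}

f_stat, df_b, df_w = one_way_anova_f(list(scores_by_group.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F value relative to the critical value for the given degrees of freedom would indicate that at least one subgroup mean differs from the others.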
Delimitations
1. The study was conducted on data from the Arizona Instrument to Measure
Standards (AIMS) administered in the spring of 2010 and 2011.
2. The study included schools from one suburban school district in the state of
Arizona.
3. The study does not include data from charter schools or other districts within
the state of Arizona. Consequently, the ability to generalize beyond the scope
of this study is minimized.
4. The study is being conducted within a school district in which the doctoral
candidate is employed.
Limitations
1. The study takes a limited look, through the fourth research question, into
changes in the labeling system within the state of Arizona (Kiley, 2010). The
AZ LEARNS letter grades of A through F are in their first year of
implementation. As a result, this is believed to be the first study examining
the letter grades and their relationship to other variables.
2. The schools in this study are from the Phoenix metropolitan area. The school
district analyzed was an ideal school district in SES, ELL population
and student demographics to start examining the ADE labels.
3. The difference in socioeconomic status (SES) for each student within the
examined school is a confounding variable that is not included in the scope of
this study. While the SES of the entire school is examined in the fourth
research question by examining free and reduced lunch, the SES of individual
students is not taken into account.
4. Yearly changes to the AIMS examinations did not occur during the 2010 and
2011 spring administration of the mathematics and reading examinations.
However, questions within the exam do vary from year to year. Changes in
questions on well-constructed standardized tests that are vertically scaled should
have minimal effect on subgroup population data.
Definition of Terms
Achievement Gap: Notion that minority students, specifically Blacks and
Hispanics, tend to lag behind their White/Asian counterparts in student achievement on
standardized assessments (Orlich, 2004).
Adequate Yearly Progress (AYP): Annual status check of identified data elements
to determine whether schools and school districts are meeting state progress goals (Smith,
2005).
Arizona Instrument to Measure Standards (AIMS): The test required by the state
of Arizona that measures student achievement on reading, writing and mathematics based
on Arizona state standards.
Asian: A student having origins in any of the original peoples of the Far East,
Southeast Asia, the Indian subcontinent or the Pacific Islands. This category excludes
students of Hispanic origin.
Black: A student having origins in any of the black racial groups in Africa. This
category excludes students of Hispanic origin.
Excelling School: A school in the state of Arizona labeled excelling by the
Arizona Department of Education, meaning it is more than one standard deviation above the
mean with regard to the proportion of students that exceed on the AIMS test, in conjunction
with meeting the requirement for Status and MAP points.
Hispanic: A student of Mexican, Puerto Rican, Cuban, Central or South
American, or other Spanish culture or origin, regardless of race.
No Child Left Behind Act (NCLB): President George W. Bush’s education reform
bill enacted in January 2002 which holds that all states across the U.S. will reach
universal proficiency in reading and mathematics by the end of the 2013-2014 school
year.
Out-of-School Factors (OSFs): Factors that are more frequently found in low
socioeconomic neighborhoods and as a result have an educational effect on students that
live in poverty. Included in these factors are such things as low birth weight, inadequate
medical and vision care and food insecurity (Berliner, 2009).
School Labeling Systems: An accountability system for schools, required by
NCLB, that ranks schools based on student academic performance on standardized state
assessments that measure standards.
White: A student having origins in any of the original peoples of Europe, North
Africa or the Middle East. This category excludes students of Hispanic origin.
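The "more than one standard deviation above the mean" criterion in the Excelling School definition can be sketched as a z-score calculation. This is a minimal illustration in Python under stated assumptions and not the Arizona Department of Education's actual formula: the function name z_scores, the school names and the percentages are hypothetical, the population standard deviation is an assumed choice, and the Status and MAP point requirements are not modeled.

```python
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values: distance from the mean in standard deviation units."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    return [(v - mu) / sigma for v in values]


# Hypothetical percentage of students exceeding on AIMS at five schools.
percent_exceeding = {
    "School A": 10.0,
    "School B": 15.0,
    "School C": 20.0,
    "School D": 25.0,
    "School E": 55.0,
}

scores = z_scores(list(percent_exceeding.values()))
# Schools more than one standard deviation above the mean.
flagged = [name for name, z in zip(percent_exceeding, scores) if z > 1]
print(flagged)
```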
Organization of the Study
The remainder of the study is organized into four chapters, a list of references,
and appendices. Chapter Two consists of a literature review that examines the current
research dealing with the achievement gap and subgroup achievement. Chapter Three
delineates the sampling techniques, methodology, and design of the study. The statistical
analysis of the data collected and a descriptive summary of the implications from the data
analysis are contained in Chapter Four. The summary of the findings from the research
along with implications and recommendations for future research are found in Chapter
Five. Immediately following Chapter Five is a list of references and appendices.
Summary
NCLB was implemented in 2001 in an effort to address the gap in achievement
between certain subgroups. The law strives to ensure that by 2014 all subgroups are
performing at a minimal level as measured by standardized achievement tests developed
from state standards. Schools are to be held accountable for improving subgroup
achievement through school labeling systems. School labeling systems are measuring
devices that provide comparative information for the public to judge schools’ ability to
drive student achievement and close the subgroup achievement gap. This study attempts
to analyze the achievement gaps at schools in a suburban Arizona school district and the
state of Arizona’s labeling system for the district schools.
CHAPTER 2
Review of the Literature
Introduction
The following review of literature is intended to provide a background of the
achievement gap in the United States. In particular, the review will focus on the history
of educational assessment, history of educational equality and impact of the 2001 NCLB
Act with respect to the achievement gap. Since the enactment of NCLB, schools
throughout the nation have been mandated with the task of closing the performance gap
between several subgroups. Numerous research studies have ensued focusing on
everything from the stringency of school accountability systems (Wei, 2008) to
out-of-school factors that perpetuate the achievement gap (Berliner, 2010). The following
literature review aims to capture this research so as to frame main factors concerning the
achievement gap, its measurement, and school labeling.
History of Assessment
In 1845 Horace Mann and his educational ally Samuel Howe asked the Boston
School Committee to administer a written examination to school children instead of the
traditional oral examination (Rothman, 1995, p. 33). Oral examination had, for centuries,
dominated methods of evaluating students and measuring their learning outcomes. In
1219 AD, the University of Bologna started giving oral examinations in law and in 1636
Oxford started holding oral exams as a requirement for a degree (Limprianou &
Athanasou, 2009, p. 6). Mann and Howe reasoned that these new written examinations
could provide objective information about student learning and quality of teaching
(Rothman, 1995, p. 33). Upon receiving the results from the initial testing Mann became
more confident of the power of the new testing methods. He began to “advocate for the
regular use of written tests to monitor the quality of instruction and permit comparisons
among teachers and schools” (Rothman, 1995, p. 34).
After initial implementation, assessment within primary and secondary education
grew. Resnick and Resnick (1985) noted that tests in a variety of school subjects were
implemented during the last two decades of the 19th century and the first two decades of
the 20th century. As testing grew, cost efficiency started to become of critical importance
to those paying the bill on testing students—taxpayers. Resnick and Resnick (1985)
reasoned that this cost-efficiency drove the development of short-answer and multiple-
choice tests which were objective and cost-efficient simultaneously.
During the first part of the 20th century educational psychologist Edward
Thorndike helped push assessment further by helping the Army develop the Alpha and
Beta tests. The Army, during World War I, employed the knowledge of
psychometricians led by Thorndike and Robert M. Yerkes to develop mental and
cognitive testing (Peterson & West, 2003, p. 2). Now known as the ASVAB test, these
tests were intended to help the Army identify the intelligence of soldiers (Rothman,
1995, p. 37). It is doubtful that these tests had an impact on the outcome of the war.
However, the process of implementing a test to measure intelligence became more
acceptable as a result (Zimmerman & Schunk, 2003).
The biggest impact of the Alpha and Beta tests was aiding in the social acceptance
of testing as a means of determining those suited for more intelligent ventures (Peterson
& West, 2003, p. 3). The most common test still seen as a result of this development is
the Scholastic Aptitude Test. Stephen Gould stated that the Army had developed a test to
measure all pupils. As a result, “Tests could now rank and stream everybody; the era of
mass testing had begun,” (Gould, 1981, p. 195) and the SAT led the way. In 1929, the
University of Iowa developed the Iowa Test of Basic Skills and the Iowa Test for
Educational Development. These tests were intended to help schools gather
student achievement data. But their bigger impact may have been in developing
“large-scale” testing equipment and methodologies that were cost-efficient (Rothman,
1995, p. 38).
Until the 1960s the educational system in America was thought to “solve
problems associated with civil rights, hunger, malnutrition, immigration, crime, teenage
drug use and economic inequality” (Peterson & West, 2003, p. 4). However, in the
middle of the 1960s and into the 1970s a concern had arisen with declining SAT scores.
From 1963 to 1977 average SAT scores had dropped from 478 points to 429 points on the
verbal section and 502 points to 470 points on the mathematics section (Rothman, 1995,
p. 40). Along with the decline in SAT scores, educational surveys during the decade
suggested that United States children were amongst the lowest in academic achievement
when compared to their international peers (Nichols & Berliner, 2007, p. 4). The panic
of Americans alarmed by the Russian launching of Sputnik in 1957 (USDoE, 2009, p. 8)
in conjunction with the data suggesting a failure in the educational system was the first
alarm for an educational crisis. These concerns led many states to adopt
minimum-competency testing (MCT). From 1973 to 1983, “the number of states with some form of MCT
requirement went from 2 to 34” (Linn & Miller, 2005, p. 4). The view of public
education solving problems shifted to a view that public education perpetuated problems
during the 1960s and 1970s. As a result, accountability through testing increased.
The educational crisis culminated more than two decades after Sputnik in a report
commissioned by President Ronald Reagan’s education secretary Terrel H. Bell. A
Nation at Risk: The Imperative for Educational Reform (1983) was a vigorous
attack on American education and the inability of the educational system to rise out of
mediocrity. The report states, “the educational foundations of our society are presently
being eroded by a rising tide of mediocrity that threatens our very future as a Nation and
a people” (Bell, 1983, p. 1). All of the 50 states put into place some type of reform after
Bell released the report and at the center of most of these reforms was a test for
accountability purposes (Linn & Miller, 2005, p. 5). A Nation at Risk served as a
catapult in advancing standardized assessment in public education.
A new face of accountability followed A Nation at Risk as Bell released a “wall
chart” that attempted to show how the fifty states measured in performance (Rothman,
1995, p. 44). As the 1980s passed, standardized tests started to dominate accountability
systems. The majority of states in this decade reported that the majority of their
students were above the national average, and John Cannell labeled this phenomenon the
“Lake Wobegon effect” (Linn & Miller, 2005, p. 6). The Lake Wobegon effect occurs,
essentially, when every member of a comparison group that accounts for the entire
population reports being above average. Whether states were misrepresenting
data or old norms were being used to score recent tests, the above-average results for
every state, the “Lake Wobegon effect,” were the result of pressure on states to show
significant gains in their educational system (Linn & Miller, 2005, p. 6). The overall
emphasis on student achievement through high-stakes testing provided political pressure
on states to show educational gains.
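The arithmetic behind this effect can be illustrated with a brief sketch. The following Python example is a hypothetical illustration only: the state names, the scores and the outdated norm are invented, and the sketch models only the "old norms" explanation mentioned above. It shows that every state can sit above an outdated norm, while not every state can sit above the current average of the same group.

```python
from statistics import mean

# Hypothetical current average scores for four states.
state_scores = {"State A": 52, "State B": 55, "State C": 58, "State D": 61}

# Hypothetical national norm set years earlier, when scores were lower.
outdated_norm = 50

# Compared with the outdated norm, every state appears above average.
above_outdated_norm = [s for s, v in state_scores.items() if v > outdated_norm]

# Compared with the current mean of the same group, only some states can be.
current_mean = mean(list(state_scores.values()))
above_current_mean = [s for s, v in state_scores.items() if v > current_mean]

print(len(above_outdated_norm), "of", len(state_scores), "above the old norm")
print(len(above_current_mean), "of", len(state_scores), "above the current mean")
```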
The standards-based reform of the last two decades has brought the importance of
standardized assessments to an all-time high. Criterion-referenced tests, intended to show
a minimal level of understanding of state-mandated standards, have become the preferred
method of high-stakes testing. NCLB has reinforced the use of these types of tests to
primarily “determine rewards and sanctions for schools” (Linn & Miller, 2005, p. 8).
Much like Terrel Bell did when forming a “wall chart” for state educational
performances, NCLB has provided ranking systems based on assessments for districts
and schools within each state.
Mann and Howe originally asked the Boston School Committee to implement
written assessments for students in order to more objectively evaluate student learning
and quality of instruction (Rothman, 1995). As time has passed, assessment has remained at
the forefront of educational reform into the 21st century. Showing significant assessment
gains, which resulted in the “Lake Wobegon effect” in the 1980s, continues to be a
central focus for parents, students, teachers, schools, districts and politicians.
NCLB-mandated school labeling systems, whether reliable or not, provide a landscape in which
pressure on administrators and teachers is ever increasing. Schools strive to be labeled as
excelling. States strive to have excelling schools. The “Lake Wobegon effect” of the
1980s leaves one to wonder whether the label for a school is statistically correct.
History of Achievement Gap
The differences in student achievement between subgroups in the American
education system have long been debated. During the 1950s and 1960s, the inequities
in opportunity and achievement of the education system were brought into the forefront
by Brown v. Board of Education (1954), the Elementary and Secondary Education Act
(1965) and the Civil Rights Act (1964). In 1963 an article on desegregation in
Englewood, New Jersey documented the achievement gap between Black students and
White students in elementary schools throughout the local school districts (Walker, 1963,
p. 8). The term “achievement gap” surfaced one year later in the Hauser Report on
Chicago public schools when the authors stated, “intensified educational opportunities for
Negro boys and girls would result in a major closing of the achievement gap between
group performances of Negro students and other groups of students” (Hauser, McMurrin,
Nabrit, Nelson & Odell, 1964). The above sources exemplify that the achievement gap in
its infancy focused on Black and White students.
The National Assessment of Educational Progress (NAEP) showed that the
educational system made significant gains in closing the black-white achievement gap in
the two decades following the civil rights movement (Barton & Cooley, p. 3). The NAEP
was founded in 1964, from a grant by the Carnegie Corporation, with the intent of putting
a metric lens on achievement (NAEP, 2009). The first assessment was implemented in
the 1969-1970 school year and it obtained a baseline measure of student achievement
(NAEP, 2009). The NAEP was used in an effort to monitor national progress in
education specifically with an interest in equity. During the 1970s and 1980s the test
showed progress in the educational system’s ability to close the black-white achievement
gap. “In reading, for example, a 39-point gap for 13-year-olds was reduced to an
18-point gap in 1988. For 17-year-olds, the gap declined from 53 points to 20 points”
(Barton & Cooley, p. 6). The progress of the 1970s and 1980s was met with the
optimistic viewpoint that education systems could progress towards eliminating the
black-white achievement gap.
NAEP data from the 1990s presented a much different viewpoint of the ability of
the educational system to close the black-white achievement gap. The gap, which
generally became narrower in the 1970s and 1980s, actually began to show some
increases in certain age groups amongst the subjects of reading and mathematics (Barton
& Cooley, p. 7). However, in the period from 1999 to 2004 that gap slowly narrowed
again (Barton & Cooley, p. 15). In 2005, Secretary of Education Margaret Spellings
stated that the 2004 NAEP data showed, “proof that No Child Left Behind is working—it is
helping to raise the achievement of young students of every race and from every type of
family background” (USDoE, 2005). In contrast, Marshall Smith (2007) suggested that
the progress seen in 2004 NAEP data was not far enough removed from the
implementation of NCLB in order to credit the legislation for the decreasing achievement
gap trends.
In the years after the implementation of NCLB, evidence has found that few gains
have been made in closing the black-white achievement gap. From 2004 to 2008 the
NAEP discovered “no statistically significant differences” in the changes of the black-
white achievement gap (Barton & Cooley, p. 15). In fact, despite the slight decrease in
the black-white achievement gap between 1999 and 2004 the lack of progress in closing
the gap since the 1980s casts a shadow of doubt on recent education reform. The strict
standards-based reform effort that has swept the country has shown little benefit in
closing the achievement gap (Orfield, 2006).
Although the term achievement gap was originally developed to describe the
achievement disparities between Black and White students the term has evolved into a
broader meaning. The term has grown to encompass the disparity in achievement
performance between any two groups of students (Education Week, 2004). Specifically,
since the passing of NCLB the gaps of importance include minority-majority ethnicity
achievement gaps, the gender gaps, and socioeconomic gaps.
The gaps in achievement between many of the subgroups highlighted by NCLB
continue to remain significant. A 2010 report sponsored by the Center on Education
Policy stated that, “States continue to confront large Hispanic-White gaps in achievement
on state reading and mathematics tests” (Kober, Chudowsky, & Chudowsky, 2010, p.
23). The report also concluded that, “less progress has been made in narrowing
achievement gaps on state tests for Native American than other racial/ethnic groups”
(Kober et al., 2010, p. 28). One of the few gains found by the CEP report was a narrowing
of the achievement gap between low-income students and high-income students (Kober et al., 2010,
p. 31). Ultimately, the report concluded that evidence from state tests and the NAEP was
inconclusive with respect to narrowing achievement gaps.
The term “achievement gap” first surfaced in the civil rights era. First used to
describe the disparity in achievement between Black and White students the term has
since been used to describe other subgroup gaps. NAEP, a nationwide assessment, has
provided data since 1970 that has suggested that gaps narrowed in the 1970s and 1980s
and widened in the 1990s before becoming stagnant in the first decade of the 21st century.
The gap in academic performance between subgroups continues to be a critical issue
almost 50 years beyond the civil rights era.
Historical Background of Equity in Education
The NCLB Act must be viewed in the proper historical context. The American
education system has long pondered “How do we ensure equal education?” Plessy v.
Ferguson, 163 U.S. 537, was a landmark case in 1896 that supported separate education
for racial subgroups. The case determined that separate education for children of
different races was permissible as long as equal educational resources were supplied (Imber & Van
Geel, 2004, p. 208). The response to this policy from W.E.B. Du Bois in the Common
School and the Negro American in 1911 was, “The alarming neglect of and
discrimination against the Negro schools are plainly evident to anyone who reads the
reports of educational officers in the southern states” (Reese, 2005, p. 210). Du Bois, like
many who followed, believed that separate education was inherently not equal. For the
last one hundred fourteen years, educators, legislators and the judiciary have not been
able to concretely define what ensures equal education for different subgroups.
Equal education came to light again in 1954 in Brown v. Board of Education of
Topeka, 347 U.S. 483. The idea from Plessy v. Ferguson that separate educations were
equal was challenged in the case, which sought to bring an end to segregated
education (Reese, 2005, p. 226). Thirteen parents on behalf of their twenty children filed
suit demanding that the Board reverse its racial segregation policies in its schools. The
decision of the court reversed the Plessy decision of 1896. The court determined that
separate educations were inherently unequal. Furthermore, districts were now required to
desegregate their schools in an attempt to ensure that all children were provided equal
education (Imber & Van Geel, 2004, p. 213).
The Elementary and Secondary Education Act was passed eleven years after
Brown v. Board of Education in 1965. ESEA was a bold legislative act that provided
federal funding to help ensure equal education in high-poverty areas (Rury, 2002, p. 191).
Schools with a high enough proportion of students on the Free and Reduced Lunch
program would qualify for the federal funds. In an American era that valued the civil
rights of its citizens, ESEA was an attempt to ensure the right of education for
economically disadvantaged students (Webb, 2006, p. 288).
During the same era of the passing of ESEA women saw a major amendment
within educational rights, Title IX. In 1972 the implementation of Title IX had the
primary purpose of prohibiting gender inequity in education (Webb, 2006, p. 297). Title
IX is most noted for its significant impact in the participation of women in interscholastic
and intercollegiate athletics. Title IX also sought to provide women with equal
educational access (Rury, 2002, p. 196). Following Title IX, the Women’s Educational
Equity Act of 1974, which sought to encourage women into math and science, was
passed (Webb, 2006, p. 299). As a result, in the decades since its enactment, gender equity
in academics has become a central focus of many secondary and post-secondary
institutions.
As the nation began to demand equity in education for low SES students and
females, several other groups began to question how equitable education was for their
children. In 1975, P.L. 94-142, which is more commonly known as the Individuals with
Disabilities Education Act (IDEA), was created in an effort to ensure equality in
education for handicapped children (Webb, 2006, p. 300). The law essentially required
that schools provide children with special needs a free and appropriate education within
the least restrictive environment (Imber & Van Geel, 2004, p. 262). The law has been
amended several times in its 35-year history. But in its barest form, the overriding
principles of IDEA sought to protect education for handicapped children, and that goal
has not been altered. The federal government has deemed that there must be equality in
education for children with disabilities.
Most recently, in 2001 President George W. Bush signed the NCLB Act (NCLB)
in order to provide equal education for all subgroup populations throughout the United
States. The Senate vote for NCLB recorded eighty-seven YEAs and ten NAYs (U.S.
Senate Roll Call, 2001) while the House of Representatives vote recorded three-hundred
eighty-four YEAs and forty-five NAYs (Clerk of the House, 2001). Republicans and
Democrats alike voted for NCLB in hopes of providing a vision for twenty-first century
education in America. Unprecedented bipartisan support for NCLB suggested that
Americans widely viewed elementary and secondary education as a civil right of
children. NCLB set forth a baseline of accountability systems that provided a framework
to guide school systems to ensuring equitable education for all subgroups throughout the
United States.
Equal opportunity in education has been historically questioned by the legislative
and judicial branches of the United States government. From Plessy v. Ferguson to
NCLB, how to ensure education gets dispersed uniformly and equitably to all subgroups
within the population has been a debated topic. Providing high quality education to all
American children has been the central goal of educational reform for over the last
century (Kopan & Walberg, 1974, p. 1). Republicans, Democrats, and Independents all
understand that, as a civil right, education must be dispersed to the masses in an
evenhanded fashion.
No Child Left Behind
The NCLB Act of 2001 included several key components to strive for equity in
education. The most impactful of these components was probably the creation of
performance-based accountability systems through student testing (Popham, 2004, p. 14).
Under NCLB all fifty states were mandated to create accountability systems for students,
schools and districts. The goal of these systems was to ensure that all students in all
subgroups nationwide reached reading and mathematics proficiency by 2014 (Popham,
2004, p. 23; Webb, 2006, p. 360). The following figures (Steel, 2009, p. 14) demonstrate
what the expected progress for all subgroups in English (Figure 1) and mathematics
(Figure 2) may have been:
Figure 1. NCLB Student Achievement Expectations for English for All Subgroups.
Figure 2. NCLB Student Achievement Expectations for Mathematics for All Subgroups.
The charts show that by 2014 every student, in all subgroups throughout the entire United
States, was expected to meet the performance standards set forth by the state. A recent
development at the White House, however, is enabling states to opt out of the 2014 deadline for
100% proficiency in reading and mathematics in exchange for adopting President
Obama’s new education agenda (Bruce, p. 1). Under the original requirement of NCLB,
if all subgroups were meeting standards, the achievement gap would have been nullified.
Students would be tested and assessed on their knowledge of state-developed
standards. Students would be held accountable through the withholding of diplomas for poor
performance or the prohibition of grade advancement as a result of low achievement scores.
Schools and districts would be held accountable by reaching target goals for student
achievement in reading and mathematics.
School and district target goals are referred to in NCLB vernacular as annual
measurable objectives (AMO). AMO are set by states in reading and mathematics.
Schools as well as their districts must meet their AMO in order to achieve adequate
yearly progress (AYP). A school that fails to make AYP might face sanctions from the
state. Furthermore, schools that fail to meet their AMO and fall short of AYP for several
years can face further sanctions that include state takeover or dissolution of the school
(Webb, 2006, p. 365). Other consequences of schools perpetually falling short of AYP
may include providing transportation for students to attend schools that are making AYP.
AMO are not just set for the whole school or district. The school and the district
are expected to meet AMO for all subgroups (Popham, 2004, p. 24). Consequently, a
school meeting their AMO and achieving AYP has been successful at meeting goals for
all of their different subgroups including but not limited to special education students,
racial and ethnic students, and educationally disadvantaged students (Webb, 2006, p.
365). Recognition that AMO are satisfied for all of these subgroups does not, however,
suggest that the achievement gap has been closed at the schools meeting AYP.
Schools meeting AYP and improving subgroup scores on state-mandated
assessments can, mathematically, see an increase in performance gaps between
subgroups. Assume that in the 2007-2008 school year, the fictional Equality High School
shows that seventy-eight percent of White students have met the state-wide performance
standard in mathematics and fifty-four percent of Black students have done the same. In
the following year, Equality High School sees that eighty-five percent of White students
met the standard, which is a seven percentage point increase. In turn, sixty percent of the Black
students met the standard, which is a six percentage point increase. Both increases qualify the
school to make its AMO for each subgroup and consequently AYP is achieved. But
the performance gap between White students and Black students at Equality High School
has increased from twenty-four percentage points to twenty-five percentage points. As a result, the question
of whether the achievement gap is being addressed at schools remains.
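The arithmetic of the Equality High School example can be checked with a short sketch. The percentages come from the example itself; the year label "2008-2009" for "the following year" and all variable names are illustrative assumptions.

```python
# Percent of students meeting the mathematics standard at the fictional
# Equality High School, as given in the example above.
white = {"2007-2008": 78, "2008-2009": 85}
black = {"2007-2008": 54, "2008-2009": 60}


def gap(year):
    """Performance gap, in percentage points, for one school year."""
    return white[year] - black[year]


gap_before = gap("2007-2008")
gap_after = gap("2008-2009")

# Each subgroup's gain in percentage points.
gain_white = white["2008-2009"] - white["2007-2008"]
gain_black = black["2008-2009"] - black["2007-2008"]

# The gap changes by the difference between the two gains, so it widens
# whenever the higher-scoring subgroup gains more than the lower-scoring one.
gap_change = gap_after - gap_before

print(gap_before, gap_after, gap_change)
```

Both subgroups improve, yet the gap grows by the difference between the two gains, which is why meeting AMO for every subgroup does not by itself imply a closing gap.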
NCLB was implemented for a variety of reasons. The most prevalent of these
reasons was ensuring the civil right of all of America’s children to be exposed to equal
educational opportunities. John Dewey emphasized the belief in the early twentieth
century that, “education is the means by which Americans try to improve individuals and
society” (Reese, 2005, p. 322) and within American society NCLB was an attempt to
ensure equal educational opportunity to all students. As a result, mandates in NCLB
focused on making schools and districts accountable for equal educational opportunity
through closing the achievement gap. By making schools and districts accountable for
making AMO for all subgroups politicians in the federal government believed that the
achievement gap would inherently be closed. Unfortunately, the impact of NCLB with
respect to the achievement gap is uncertain.
Schools are held accountable for all subgroups to perform academically because
NCLB requires that states set up a school labeling or ranking system. There is variation
among school rankings for each state (Linn, 2006). Some states use growth models to
rank schools while other states use performance targets. For example, “Kentucky’s
accountability system and California’s Academic Performance Index (API) are examples
of the successive cohorts approach (growth) to measuring improvement in student
achievement” (Linn, 2006, p. 12). NCLB also allows states to determine their minimum
subgroup size as long as the minimum size is greater than five and less than one-hundred
(Wei, 2008). In the state of Arizona the minimum subgroup size is currently thirty.
States were also given the ability to, “select the interval for their intermediate
achievement goals (for each subgroup) after they set the starting point” (Wei, 2008).
Finally, each individual state has an individual labeling system. In the state of Arizona
school labels are dispersed in six categories: failing, underperforming, performing,
performing plus, highly performing and excelling. As of the 2011-2012 school year, the
Arizona State Legislature has determined that these labels need to be accompanied by an
additional label due to their ambiguous nature (Kiley, 2010). The new Arizona labeling
system will accompany the current labeling system so that a parent can better understand
that a performing plus school is average. The new state of Arizona labeling system
includes the labels of A, B, C, D and F. As of 2011-2012 a performing plus school will
most likely also receive a label as a ‘C’ school. School labels give the general public a
perception about the performance of the school in relationship to other schools. The
school labels also attempt to help community members understand the ability of a school
to satisfy the requirements of NCLB.
A better understanding of school performance, including progress in closing the
achievement gap, based on a label was what NCLB legislation hoped to provide.
Unfortunately, even with the change recently required by the Arizona State Legislature,
one is left to wonder whether school rankings correctly measure a school’s progress and
performance in many areas. The Arizona school label of excelling is obtained by a
school that meets a certain status level of achievement and accumulates a z-score of at
least one in the exceed category of student performance (ADE, 2010). A z-score of at
least one implies that a school is at least one standard deviation above the average
percentage of students that exceed on the AIMS test in the state. With this as the
measurement that determines being excelling, schools with high proportions of
educationally advantaged students might have distinct assistance in achieving an
excelling label. A school, through no fault of its own, may be labeled excelling because
of the same OSFs that Berliner argues perpetuate the achievement gap.
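A minimal sketch of the z-score idea described above follows. It standardizes one school's percentage of students who exceed against a set of statewide percentages. The data are hypothetical, the function name is invented for illustration, and the use of the population standard deviation is an assumption; the exact ADE computation may differ.

```python
from statistics import mean, pstdev


def exceeds_z_score(school_pct, state_pcts):
    """Standardize a school's percent exceeding against statewide values.

    Assumption: the population standard deviation is used. ADE's actual
    formula is not reproduced here.
    """
    mu = mean(state_pcts)
    sigma = pstdev(state_pcts)
    return (school_pct - mu) / sigma


# Hypothetical percentages of students exceeding at five schools in a state.
state_pcts = [10.0, 20.0, 30.0, 40.0, 50.0]

z = exceeds_z_score(50.0, state_pcts)

# A z-score of at least one places the school at least one standard
# deviation above the statewide average percentage exceeding.
meets_z_threshold = z >= 1
print(round(z, 3), meets_z_threshold)
```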
After understanding the issues behind school rankings, the measurement of
school-wide performance becomes the central focus. How does one measure whether a
school is performing in accordance with NCLB? What is measured in order to ensure a
school label correctly portrays its ability to educate students from all of its subgroups?
Furthermore, once a measurement device is in place what checking mechanisms does the
state have for ensuring its validity?
Studies and Reports Regarding the Trends in the Achievement Gap since NCLB
Since the enactment of NCLB in 2001 the central question posed by numerous
researchers has been, “What are the trends in the achievement gap?” The conclusion
with respect to the impact of NCLB on the achievement gap is far from settled. Another
decade might pass before educational researchers ever understand the depth and breadth
of NCLB on the achievement gap. As a result, current research on the role NCLB has
played in closing the gap is muddled and far from conclusive.
A 2005 technical report from the Northwest Evaluation Association (NWEA) was
released and outlined the immediate impact of NCLB on student achievement and
growth. With NCLB having been implemented for just three years, the report does
mention its limitations on predicting the long term effects of the legislation. Examining
the mean scores of students on state tests the report concludes that “from the Fall 2001 to
Fall 2003 improvement was greater for all ethnic groups among students enrolled in
grades that administered their respective tests” (Cronin, Kingsbury, McCall & Bowe,
2005, pg. 41). Furthermore, the report suggested that “on the whole, evidence indicated
that small but substantive gains in achievement were made by Blacks, Hispanics, and
Native-Americans that would serve to reduce achievement gaps between these groups
and European-American and Asian students” (Cronin, Kingsbury, McCall & Bowe, 2005,
pg. 42). The report essentially concluded that the initial findings of the impact of NCLB
were positive when examining the achievement gap.
The NWEA (2005) report does caution about the importance of the starting
position when measuring the achievement gap. It is noted in the report that there is
substantially more room for Black and Hispanic students to grow in comparison to their
Asian and White counterparts. On every standardized test there is a ceiling effect. The
ceiling that exists is the maximum score that is achievable on that test. Subgroups that
perform better when baseline data is acquired are closer to the ceiling and their room for
growth is less. Therefore, it is very likely that in the initial years after implementation of
NCLB one might see a closing of the achievement gap due to the ceiling effect. Minority
subgroups have more room for growth and therefore the gap inherently closes. This idea
can also explain, “how an achievement gap might be reduced in an environment in which
minority students grow less” (Cronin, Kingsbury, McCall & Bowe, 2005, pg. 45).
Researchers have taken note of the ceiling effect and as a result a focus on growth
rates among subgroups has become equally as important as the gap itself. Essentially, to
close the achievement gap it is necessary to have the growth rates of lower performing
subgroups to be higher than the growth rates of the highest performing subgroups. The
key to NCLB is that this must be achieved while the performance of all subgroups
increases. Unfortunately, research suggests that growth rates remain approximately
constant across ethnic groups when initial performance position is taken into account
(Goldschmidt, 2004).
The 2005 NWEA report initially concludes that there is statistically significant
evidence to suggest that the achievement gap since the implementation of NCLB has
been reduced. However, upon further analysis the report recognized that when other
measurements are taken into account less dramatic conclusions can be made about the
achievement gap since the implementation of NCLB. At one point in the report it is
stated, “In sixth grade mathematics, the actual achievement gap between European-
American and Black students in our sample increased from a gap of 7 points to 10 points
between the fall of 2003 and the spring of 2004” (NWEA, 2005, pg. 46). The devices with
which we choose to measure the achievement gap are of critical importance. A complete
picture of the gap will not be captured simply by the difference between subgroups in the
percentages of students that meet proficiency.
In October 2009, the Center on Education Policy (CEP) produced a report on
2007-2008 state test score trends and their implications for the achievement gap.
The report included five main findings in state testing data:
1. All subgroups showed more gains than declines in grade 4 at all three
achievement levels.
2. As measured by percentages of students scoring proficient, gaps between
subgroups have narrowed in most states at the elementary, middle and high
school levels, although in a notable minority of cases gaps have widened.
3. Most often gaps narrowed because the achievement of lower-performing
subgroups went up rather than because the achievement of higher-performing
subgroups went down.
4. Gaps in percentages proficient narrowed more often for the Hispanic and
Black subgroups than for other subgroups.
5. Although mean scores indicate that gaps have narrowed more often than they
have widened, mean scores give a less rosy picture of progress in closing
achievement gaps than percentages proficient.
The overriding conclusion to these findings might be that state test scores suggest that the
achievement gap is being addressed. However, the fifth finding of this report poses
questions that are a necessity when measuring anything. The question that must be asked
when measuring is, “What is the measurement device and how is the measurement
constructed?” In this case the report, itself, addressed the issue.
When examining the achievement gap with the percentage of students that met the
proficient status on the state mandated standards assessment, the Center on Education
Policy found that seventy-one percent of the time the achievement gap had been
narrowed (CEP, 2009). Alternatively, when examining the same exact data with a mean
score as the measurement device the achievement gap was only narrowed fifty-nine
percent of the time. The report provided similar data with respect to areas where the
achievement gap actually widened. “Mean gaps also widened more than percentage
proficient gaps—37% of the time for mean scores versus 24% of the time for percentages
proficient” (CEP, 2009). The Center on Education Policy report does suggest that
overall the achievement gap is being narrowed. But the important underlying issue
presented in the report, “how do we measure the achievement gap,” must be noted.
Once again, variance in measurement techniques can create very different understandings
of the impact of NCLB.
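The following sketch illustrates how the two measurement devices can disagree. All scores and the cut score are hypothetical and were chosen only to show the effect; they are not drawn from the CEP report or any state test.

```python
from statistics import mean

CUT_SCORE = 500  # hypothetical proficiency cut score


def pct_proficient(scores):
    """Percentage of students scoring at or above the cut score."""
    return 100 * sum(s >= CUT_SCORE for s in scores) / len(scores)


def gaps(higher_group, lower_group):
    """Return (percentage-proficient gap, mean-score gap) between two groups."""
    return (
        pct_proficient(higher_group) - pct_proficient(lower_group),
        mean(higher_group) - mean(lower_group),
    )


# Hypothetical scale scores for two subgroups in two consecutive years.
year1 = gaps([480, 520, 560, 600], [440, 470, 490, 520])
year2 = gaps([480, 540, 590, 630], [430, 460, 500, 530])

print("Year 1 (pct gap, mean gap):", year1)
print("Year 2 (pct gap, mean gap):", year2)
```

In this constructed case the percentage-proficient gap narrows from year one to year two while the mean-score gap widens, because students near the cut score crossed it while the higher-performing group's scores rose further above it.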
The enactment of NCLB certainly moved the Elementary and Secondary
Education Act into the age of accountability. In every state the state-wide standards
based testing varies but the main objective remains to hold schools accountable for
student achievement in every subgroup. The argument can be made whether these
accountability systems play a role in increasing student achievement. Costrell (1997) and
Bishop (1997), in separate studies, concluded that exit examinations created awareness
and improvement of student achievement among leadership, faculty, and students. But,
much like differing opinions on the impact of NCLB on the achievement gap, other
researchers have concluded just the opposite. Studies have concluded that high-stakes
exit examinations have little to no effect on student achievement and can actually cause
an increase in dropout rates among low-achieving students (Jacob, 2001). Perhaps one
of the most prominent critics of high-stakes testing is David Berliner. Amrein and
Berliner (2002) ran a group of studies on the eve of NCLB that concluded that little
change was found when school based accountability was in place. Other researchers
have concluded that there is improvement in student achievement with school based
accountability systems in place (Carnoy & Loeb, 2002; Hanushek & Raymond, 2004).
Ultimately, as researchers remain divided over the impact of state accountability
systems for schools, they also continue to debate the impact of these systems on the
achievement gap. Hanushek and Raymond (2003a, 2003b, 2004) concluded that while
accountability increased student achievement for the White, Black and Hispanic
populations there were varying results with respect to the achievement gap. They found
that the White and Black achievement gap grew while the White and Hispanic
achievement gap got smaller. The era of school accountability created by NCLB thus
produced differing effects for two different achievement gaps.
Summary
Accountability has long been an issue within education. Oral examinations
served as an accountability technique for students during the thirteenth century. In the
nineteenth century Mann and Howe implemented written examinations in order to hold
students accountable for knowledge. Over the course of the last century accountability
has expanded to now include teacher, school, district and state accountability.
School, district and state accountability over the last five decades have had a
primary focus on equity in education. Brown v. Board of Education, Title IX, IDEA, and
NCLB all have primary purposes that can be generalized under the common theme of
equity in education. Holding schools, districts and states accountable for equal
educational access and equal educational outcomes is the goal of these judicial and
legislative developments.
NCLB was thought to be the first major Civil Rights Act of the twenty-first
century. Closing the achievement gap amongst various subgroup populations in
American education was the central focus in NCLB. The achievement gap had been
documented during the previous three decades and NCLB set out to make education
equitable for all race, gender, and SES subgroups.
Research on the impact of NCLB with respect to the achievement gap is
convoluted. Findings from different researchers often paint different pictures as to whether
NCLB has had a significant impact on closing the achievement gap across the United
States. Some research suggests that the achievement gap has been decreased in years
following the implementation of NCLB. Other research suggests that while the gap may
have been reduced the reduction is minimal at best. Finally, some research implies that
the metric used in measuring the achievement gap can play a significant role in one’s
conclusion on the impact of NCLB on the achievement gap.
Mann and Howe turned the keys to ignition over fifteen decades ago when they
began administering written examinations as a means for measuring student progress.
Measuring has now become of instrumental importance to American educators. The
achievement gap has been of particular interest to measure as evidenced by the
implementation of NCLB. The impact of NCLB and its school accountability systems on
the achievement gap may take another fifteen decades to unravel.
Chapter 3
Methodology
Introduction
A description of the research design, the research questions that guided the study,
the target population, sample procedures, the sample, research instrumentation, data
collection procedures, and data analysis plans is presented. Before providing the
methodology for the research, a brief review of the purpose of the study and the research
questions is provided.
Restatement of the Problem
The purpose of this study was to examine the achievement gap in mathematics
and reading at all non-alternative schools within a suburban school district in the state of
Arizona for the 2009-2010 and 2010-2011 school years. The study was specifically
interested in student achievement, as measured by scale score, across ethnic subgroups
with respect to the state standardized AIMS examination. Furthermore, the study sought
to examine demographic reasons why schools in this district obtained a certain school
label. Other interests of the study included the descriptive analysis of cross-sectional data
in reading and mathematics at these schools from 2009-2010 and 2010-2011 and the
predictive abilities of the percentage of non-African-American/Hispanic students with
respect to the percentage of students that exceeded on the AIMS examination. Using four
main research questions as a guide, data from two prior years was analyzed at schools
throughout the suburban school district.
Restatement of Research Questions
This dissertation was guided by the following questions:
1. What are the two-year cross-sectional data trends for the achievement gap among
White, Asian, Hispanic and Black students on the 2009-2010 and 2010-2011
AIMS mathematics, reading and writing sections at all schools in the suburban
school district?
2. Is the average student achievement, as measured by average scale score, in ethnic
subgroups different for the 2009-2010 and 2010-2011 AIMS examinations at each
non-alternative school throughout the suburban school district?
3. Is the percentage of Asian and White students correlated with the state-issued z-
score, a standardized score for the percent of students that exceed on the AIMS
examination at a school in a given year, which helps determine school labels
within the state of Arizona?
4. Are free and reduced lunch rates, English Language Learner rates, percentage of
Asian students and percentage of White students correlated with the AZ LEARNS
A-F letter grades published by the state of Arizona for schools within the
suburban district?
These questions examine the achievement gap in the suburban district in order to
establish a baseline for the current validity of the school labeling system.
Research Design
Ex-post facto data was analyzed with quantitative methods for this research. The
use of quantitative research methods allowed evaluation of Arizona’s school labeling
system with respect to the suburban school district according to two years of cross-
sectional data.
Target Population
The population of interest includes all schools within the suburban school district
in the state of Arizona. Specifically, because a convenience sample was used, the actual population
is limited to the suburban school district being analyzed.
Sample
From the 2009-2010 and 2010-2011 school years the suburban school district had
41 schools. Within those two years the district schools received diverse labels from
the Arizona Department of Education. Specifically, the district had schools ranging from
underperforming to excelling. Furthermore, since the district is a unified district these
labels were distributed across the elementary, junior high and high school levels. As a
result, a convenience sample for this study consists of state testing data, student
demographic data and school labels from the 2009-2010 and 2010-2011 school years for
the 41 schools in the suburban school district that existed for both of the testing years.
Sampling Procedures
The suburban school district was selected as a convenience sample of schools in the
state of Arizona. The district has a diverse population of students across ethnic groups,
SES status and English-language learners. As reflected in Table 1, the population of
students in the district was 4.77% ELL in comparison to a statewide ELL rate of 6.7%
(ADE, 2011). The district also had 28.47% of its students receiving Free and Reduced
Lunch Program benefits in comparison to the statewide rate of 41.5% (ADE, 2011).
Table 1
ELL Percentages and Free and Reduced Lunch Percentages Comparison
             ELL       Free and Reduced Lunch
District     4.77%     28.47%
Statewide    6.7%      41.5%
According to ethnic demographics (see Table 2) the suburban district is 6.7%
Black, 25.8% Hispanic, 57.1% White and 8.5% Asian while the state is 5.5%, 42.2%,
42.9% and 2.8% in those respective categories (ADE, 2011).
Table 2
Ethnic Comparison between Suburban School District and State of Arizona
             District    State of Arizona
Black        6.77%       5.5%
Hispanic     25.8%       42.2%
White        57.1%       42.9%
Asian        8.5%        2.8%
The diversity of this district in comparison to the state made it an ideal convenience
sample for the purposes of the study.
Data Collection Procedures
Data were gathered from the results of the statewide AIMS test which is given on
a yearly basis to third through eighth grade and tenth grade students. The data were
obtained from the Arizona Department of Education (ADE) website for each of the two
academic school years and three subject areas to be analyzed. The data, as distributed by
ADE, is provided to the general public in a Microsoft Excel spreadsheet. This
spreadsheet was utilized to obtain student achievement data.
Data Analysis
The quantitative analysis examined whether a school label is linked to the closing
of the achievement gap at that school within the suburban school district. Cross sectional
data trends were examined by creating bar graphs showing the difference in average scale
score between subgroups over the spring 2010 and 2011 exam administrations.
Furthermore, cross sectional data were analyzed by constructing stacked bar graphs based
on student achievement labels of exceeding, meets, approaching, and fall far below to
examine the first research question.
In order to examine the second research question, a priori ANOVA tests, with
family-wise α ≈ 0.06 due to six post-hoc pair-wise comparisons, were performed using
average scale score and standard deviation of scale score for each subgroup in each
testing year at every district school. Post-hoc tests using Tukey-Kramer method were run
to analyze pair-wise comparisons. The Tukey-Kramer post-hoc analysis was conducted
at α=0.01. The post-hoc alpha level accounts for the Bonferroni correction and limits the
overall probability of committing a Type I error to 0.06 because, at most, six pair-wise
comparisons were made.
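The error-rate reasoning and the use of summary statistics described above can be sketched as follows. The subgroup names come from the research questions; the means, standard deviations, and group sizes are hypothetical, and the function name is invented for illustration. The sketch computes only the F statistic; a p-value would additionally require the F distribution, which is not implemented here.

```python
from math import comb

# Four ethnic subgroups named in the research questions.
subgroups = ["White", "Asian", "Hispanic", "Black"]
n_pairs = comb(len(subgroups), 2)  # number of pair-wise comparisons

# Bonferroni reasoning: the family-wise Type I error rate is at most the
# number of comparisons times the per-comparison alpha.
per_comparison_alpha = 0.01
family_wise_bound = n_pairs * per_comparison_alpha


def anova_f(means, sds, ns):
    """One-way ANOVA F statistic from group means, SDs, and sizes."""
    k = len(means)
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (total_n - k)
    return ms_between / ms_within


# Hypothetical summary statistics, not taken from the district data.
f_stat = anova_f(
    means=[500, 510, 490, 480],
    sds=[10, 10, 10, 10],
    ns=[10, 10, 10, 10],
)
print(n_pairs, round(family_wise_bound, 2), round(f_stat, 2))
```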
In order to analyze the third research question, Pearson’s correlation coefficient
was computed and, subsequently, a linear regression was performed on the two bivariate scatterplots
produced using district data from each of the AIMS administrations in 2010 and 2011.
The model uses a percentage of Asian and White students as the independent variable and
the z-score produced by the state as the dependent variable. After performing the
analysis, the coefficient of determination, the slope of the regression model, the standard
error of the slope and the p-value for the slope were recorded in order to show if there is
47
statistical significance in the model. A coefficient of determination greater than 0.30 will
suggest that the percentage of Asian and White students has predictive capability for the
z-score that ADE uses to label and excelling school because the effect size, as measured
by r, will be medium (Cohen, 1992, pg. 157).
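As a minimal sketch of the quantities recorded for this regression, the Python function below computes the slope, intercept, Pearson's r, the coefficient of determination, the standard error of the slope and the slope's t statistic from paired data. The function name and the sample values are hypothetical and are not district data. The p-value is not computed here because it requires the t distribution with n - 2 degrees of freedom, which a statistics package would normally supply.

```python
from math import sqrt


def simple_regression(x, y):
    """Least-squares fit of y on x, returning the statistics named in the text."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    # Residual sum of squares and the standard error of the slope.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = sqrt(sse / (n - 2) / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "r_squared": r * r,
        "se_slope": se_slope,
        "t": slope / se_slope,  # compared to a t distribution with n - 2 df
    }


# Hypothetical school-level values; NOT district data.
pct_asian_white = [35.0, 48.0, 55.0, 62.0, 70.0, 81.0]
z_scores = [-0.9, -0.2, 0.1, 0.3, 0.8, 1.2]
fit = simple_regression(pct_asian_white, z_scores)
```

Returning both r and the coefficient of determination lets either value be compared against whichever benchmark is being applied.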
A multiple linear regression was performed on 2011 letter grade data to
examine the fourth research question. The model used free and reduced lunch rate,
English Language Learner rate, percentage of Asian students and percentage of White
students as the independent variables and the quantitative A-F letter grade as the
dependent variable. The coefficient, standard error of the coefficient, t-score and
p-value for each independent variable were noted along with the coefficient of determination
for the entire model. A p-value less than α = 0.05 for a coefficient will suggest
that the variable significantly contributes to the model. A coefficient of determination
greater than 0.13 will suggest that the multiple regression model has predictive
capability for the ADE A through F letter grade because the effect size, as measured by
f², will be medium (Cohen, 1992). Corresponding residual plots for the regression models
will also be analyzed to ensure that the errors in the regression models constructed are
random.
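The link between the 0.13 threshold and a medium effect size follows from the standard conversion f² = R² / (1 - R²), under which Cohen's medium benchmark of f² = 0.15 corresponds to R² of about 0.13. A short Python sketch of the conversion, with illustrative function names:

```python
def cohens_f2(r_squared):
    """Cohen's f-squared effect size for a regression model with the given R-squared."""
    return r_squared / (1.0 - r_squared)


def r_squared_from_f2(f2):
    """Inverse conversion: the R-squared implied by a given f-squared."""
    return f2 / (1.0 + f2)


f2_at_threshold = cohens_f2(0.13)               # roughly 0.149
r2_at_medium_effect = r_squared_from_f2(0.15)   # roughly 0.130
```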
Validity
Validity is a term within research that has a wide variety of definitions. Kerlinger
(1964) simply asked, “Are we measuring what we think we are?” Black and Champion
(1976) offered the following definition for validity: “the measure that an instrument
measures what it is supposed to.” Hammersley (1987) stated that “an account is valid
if it represents accurately those features of the phenomena, that it is intended to describe,
explain or theorise.” The purpose of the study was to examine the achievement gap at
schools in the suburban school district and the labels placed on these schools by the Arizona
Department of Education system. Validity therefore can be seen in the ability of the
study to accurately reflect the achievement gap at these schools and consequently provide
some explanation for each school’s corresponding label. The theory of the study
is that labels attached to district schools provide little evidence, through descriptive
and inferential statistics, for the closing of the achievement gap.
External Validity
External validity addresses the ability to generalize a study to other populations.
The results from this research study cannot be generalized to other districts in the state of
Arizona. Because the sample is a convenience sample of one suburban school
district, future research in other districts will need to be completed before any results can
be generalized throughout the state of Arizona. Furthermore, the extent to which the
results can be generalized to other states with different school ranking, or labeling,
systems is limited as well.
Internal Validity
Internal validity deals with the truth about inferences made with respect to a
cause-effect relationship (Trochim, 2006). Curren and Werth (2004, p. 220) state that
internal validity is the “assertion that an observed relation between two variables reflects
a causal process or that the lack of an observed relation reflects the lack of a causal
process.” The research was performed on ex post facto data from an observational study,
namely the AIMS test. As a result, no inferences about cause and effect relationships
will be made in this study. Therefore, since there will be no inference about a causal
relationship between variables, internal validity is not a concern within this study.
Chapter 4
Findings and Results
Introduction
The study reported in this chapter examined the achievement gap at schools in a
suburban school district in the southwest United States. Furthermore, this study sought to
examine the relationship between variables such as percent of students at a school that
were of Asian and White ethnicities and numeric variables such as z-score that help ADE
in determining school labels. The chapter is organized in terms of the four specific
research questions that were posed in Chapter 1 and restated in Chapter 3.
First, a report of the analysis is provided on the achievement gap at the 40 schools
throughout the district by examining proficiency percentages during the 2010 and 2011
AIMS examination administrations. Second, the chapter moves to the analysis of the
achievement gap at the 40 schools throughout the district by examining average scale
score and running ANOVAs on 2010 and 2011 AIMS data. Third, the chapter examines the
relationship between z-score, a variable instrumental to determining AZ Learn Legacy
school labels, and the percent of Asian and White students at a school by using correlation
and regression. Finally, in continuing with the relationship between variables, the chapter
examines the relationship between four different school-level variables
and the school letter grade assigned by ADE for the 2011 school year. A final summary
of all the information found throughout the chapter concludes the chapter.
Analysis of the Achievement Gap Using AIMS Proficiency Percentage
The analysis of the achievement gap using the proportion of students proficient on
the AIMS examination during the 2010 and 2011 AIMS test administration follows. This
analysis addresses the first research question of:
1. What are the two-year cross-sectional data trends for the achievement gap
among White, Asian, Hispanic and Black students on the 2009-2010 and
2010-2011 AIMS mathematics, reading and writing sections at all schools in
the suburban school district?
The analysis of this research question is divided into two sections. One of these sections
addresses the results of the 2010 AIMS administration and the other addresses the results
of the 2011 AIMS administration.
2010 AIMS Summary - Overall
An achievement gap during the 2010 Spring AIMS administration was noted at
the majority of the schools in the suburban school district with respect to proficiency.
Proficiency is determined by the proportion of students that meet or exceed the standards
on the AIMS examination. As revealed in Table 3, at Elementary #1, 98.33% of the
Asian students were proficient on the AIMS examination while only 67.15% and 77.53%
of the Black and Hispanic students were proficient, respectively. The gap in proficiency
between Asian and Black students was 31.18% and the proficiency gap between Asian
and Hispanic students was 20.8%. Similar performance gaps at Elementary School #1
exist between White and Black students and between White and Hispanic students. These
gaps were 21.21% and 10.83%, respectively.
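The White-Black and White-Hispanic gaps quoted above can be reproduced from the % Meets and % Exceeds columns of Table 3, since proficiency is the sum of those two categories. The Python sketch below uses the Elementary School #1 values for Black, Hispanic and White students; the variable and function names are illustrative only.

```python
# (% Meets, % Exceeds) for Elementary School #1, spring 2010, copied from Table 3.
meets_exceeds = {
    "Black": (52.86, 14.29),
    "Hispanic": (58.26, 19.27),
    "White": (51.23, 37.13),
}

# Proficiency is the share of students who meet or exceed the standard.
proficiency = {
    group: round(meets + exceeds, 2)
    for group, (meets, exceeds) in meets_exceeds.items()
}


def gap(reference, comparison):
    """Difference in proficiency, in percentage points, between two subgroups."""
    return round(proficiency[reference] - proficiency[comparison], 2)


white_black_gap = gap("White", "Black")
white_hispanic_gap = gap("White", "Hispanic")
```

Here `white_black_gap` and `white_hispanic_gap` evaluate to 21.21 and 10.83, matching the figures in the text.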
Further examination of the other two elementary schools in Table 3 shows similar
results with respect to the achievement gap amongst students of different ethnicities
during the 2010 Spring AIMS administration. At Elementary #2 Asian and White
students were proficient on the exam at 92.23% and 92.3%, respectively. In contrast,
Black and Hispanic students were proficient on the exam at a rate of 76.67% and 81.3%,
respectively. At Elementary #3 Asian and White students were proficient on the exam at
95.52% and 89.71%, respectively. This compares to Black and Hispanic students being
proficient on the exam at 81.6% and 80.75%, respectively. Once again the gaps in
performance amongst different ethnic subgroups are observed.
Table 3
Spring AIMS 2010 Subgroup Performance by Ethnicity and Elementary School
School                 Ethnicity   % FFB   % Approach   % Meets   % Exceeds
Elementary School #1   Asian           0         1.67     38.89       59.44
                       Black        5.71        27.14     52.86       14.29
                       Hispanic     7.34        15.14     58.26       19.27
                       White        2.08         9.56     51.23       37.13
Elementary School #2   Asian        0.49         7.28     44.66       47.57
                       Black        4.44        18.89     58.89       17.78
                       Hispanic     4.88        13.82     67.48       13.82
                       White        1.37         6.33     59.81       32.49
Elementary School #3   Asian        1.63         2.86     47.76       47.76
                       Black           0        18.39     70.11       11.49
                       Hispanic     7.49        11.46     60.96       19.97
                       White        3.60         6.69     56.75       32.96
Achievement gap trends are also seen at the junior high level during the 2010
Spring AIMS administration. As reported in Table 4, at junior high #4 89.43% and
87.59% of Asian and White students, respectively, were proficient on examinations taken.
Noticeably below this performance, at junior high #4 Black and Hispanic students were
70.79% and 73.77% proficient, respectively. The smallest gap in proficiency existed between
White and Hispanic students at junior high #4, where White students were 13.82% more
likely to be proficient than Hispanic students.
Table 4
Spring AIMS 2010 Subgroup Performance by Ethnicity and Junior High School
School                 Ethnicity   % FFB   % Approach   % Meets   % Exceeds
Junior High #4         Asian        3.40         7.17     55.85       33.58
                       Black       10.11        19.10     56.18       14.61
                       Hispanic     8.83        17.40     60.52       13.25
                       White        4.14         8.27     62.77       24.82
Junior High #5         Asian        3.59         5.99     53.89       36.53
                       Black        6.17        14.81     66.67       12.35
                       Hispanic     9.47        11.89     58.15       20.48
                       White        2.87         6.65     61.21       29.27
Junior High #6         Asian        7.32        18.29     50.00       24.39
                       Black       18.86        21.14     48.57       11.43
                       Hispanic    20.41        24.55     47.75        7.28
                       White        7.53        11.88     59.53       21.06
At the high school level a similar trend is noticed for the Spring 2010 AIMS
administration. As stated in Table 5, all four high schools in this suburban school district
displayed more than a 10% difference in proficiency for every comparison of
White/Asian and Black/Hispanic performance. The smallest achievement gap between
these subgroups existed at High School #4 where Asian students were only 11.95% more
likely to be proficient than Hispanic students. The largest achievement gap between
these subgroups existed at High School #2 where the White-Hispanic gap was at 30.62%
with respect to proficiency.
Table 5
Spring AIMS 2010 Subgroup Performance by Ethnicity and High Schools
School                 Ethnicity   % FFB   % Approach   % Meets   % Exceeds
High School #1         Asian        1.08         4.30     58.06       36.56
                       Black        7.81        20.31     57.03       14.84
                       Hispanic    12.84        13.76     55.96       17.43
                       White        4.34         9.37     59.20       27.09
High School #2         Asian        4.17         9.52     41.07       45.24
                       Black       11.36        17.73     55.91       15.00
                       Hispanic    14.81        27.53     48.17        9.49
                       White        4.39         7.32     57.91       30.37
High School #3         Asian        1.59         5.82     40.74       51.85
                       Black        9.74        13.96     55.19       21.10
                       Hispanic     9.93        16.06     53.79       20.22
                       White        3.56         5.84     50.36       40.24
High School #4         Asian        3.03         9.09     55.30       32.58
                       Black       16.33        22.45     52.04        9.18
                       Hispanic     8.15        15.93     57.04       18.89
                       White        2.52         7.65     62.74       27.09
A summary table of all achievement gaps as measured by proficiency percentage
on the 2010 Spring AIMS examination can be found in Table 6. The data shows that of
the 40 schools within the suburban school district the majority of them still exhibit large
differences in proficiency between Black/Hispanic students and Asian/White students.
For example, 29 out of the 40 schools, or 72.5%, exhibited a gap in proficiency that was
at least 10% lower for Black students as compared to White students. Similarly, 62.5%
of the schools within the district showed at least a 10% proficiency gap between Hispanic
and White students.
As found in the Spring 2010 AIMS administration, schools within this district
continued to struggle with closing the achievement gap with respect to proficiency
percentage. Also detailed in Table 6, only 3 out of 40 schools within the district showed
a successful closing of the achievement gap between Black and Asian students. Of
further note is that two of these three schools had special circumstances in closing this
gap. One of the schools was a preparatory school set up by the district in order to capture
high achieving students and maximize their academic achievement. Another of these
schools had a significantly small portion of Asian students and the closing of the gap was
most likely the result of increased variance amongst such a small Asian student sample.
The data in Table 6 shows that achievement within the district still continues to be
different between ethnic subgroups.
Table 6
Spring AIMS 2010 Subgroup Proficiency Gaps Summary of All Schools
Number of Schools with Observed Gap
Ethnicity         X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian              32               5              2         1
Hispanic-Asian           27              11              1         1
Black-White              29               9              2         0
Hispanic-White           25              14              1         0
Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.
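The tally in a summary table of this kind can be sketched as a simple binning step. The Python example below classifies Black-White gaps, derived here as (% Meets + % Exceeds) for Black students minus the same sum for White students at the ten schools listed in Tables 3 through 5, into the four gap columns. The function and variable names are illustrative, and the treatment of values falling exactly on a boundary (-10, 0 or 10) is an assumption, since the source does not specify it.

```python
from collections import Counter

BINS = ["X < -10%", "-10% < X < 0%", "0% < X < 10%", "X > 10%"]


def gap_bin(x):
    """Assign a gap X (percentage points) to one of the four summary columns.

    Boundary handling is an assumption; the source does not say where a gap
    of exactly -10, 0 or 10 would be counted.
    """
    if x < -10:
        return BINS[0]
    if x < 0:
        return BINS[1]
    if x < 10:
        return BINS[2]
    return BINS[3]


# Black minus White proficiency, spring 2010, derived from Tables 3 through 5.
# This is a ten-school subset, not the full set of 40 district schools.
black_white_gaps = {
    "Elementary School #1": -21.21,
    "Elementary School #2": -15.63,
    "Elementary School #3": -8.11,
    "Junior High #4": -16.80,
    "Junior High #5": -11.46,
    "Junior High #6": -20.59,
    "High School #1": -14.42,
    "High School #2": -17.37,
    "High School #3": -14.31,
    "High School #4": -28.61,
}

tally = Counter(gap_bin(x) for x in black_white_gaps.values())
```

For this subset, nine schools fall in the first column and one in the second; applying the same step to all 40 schools would yield one row of the summary table.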
The data from the 2010 Spring AIMS administration, as compared to school labels,
are more revealing when the achievement gaps at each school are examined by label. Table 7
summarizes the achievement gaps observed for all excelling schools in the district. The
data in Table 7 shows that of the 22 excelling schools during the 2010 school year
anywhere from 63.6% to 72.7% of them showed a sizeable achievement gap between
Hispanic/Black students and White/Asian students. Once again, one of the only schools
to have realized a closing of the achievement gap between these subgroups was a
specialty school designed for high academic achieving students. This school can be
observed in the gap column represented by 0% < X < 10%. Besides this school only two
other schools that were labeled excelling demonstrated a closing of the achievement gap.
One of these schools exhibited this closing between Black and Asian students and the
other between Hispanic and Asian students. Overall the majority of excelling schools
demonstrated persistent achievement gaps during the 2010 AIMS administration.
Table 7
Spring AIMS 2010 Subgroup Proficiency Gaps Summary of Excelling Schools
Number of Excelling Schools with Observed Gap
Ethnicity         X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian              15               5              1         1
Black-White              16               5              1         0
Hispanic-Asian           14               6              1         1
Hispanic-White           14               7              1         0
Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.
2010 AIMS Summary – By Subject
Examining the results of the 2010 Spring AIMS administration on the school level
by subject matter gives a similar picture as the overall summary presented above. At the
vast majority of schools large achievement gaps are observed between White/Asian and
Hispanic/Black subgroups in each of the three subjects: mathematics, reading and
writing. These gaps exist across all levels of schooling within the district from the
elementary level to the high school level.
Elementary #1 exhibits disparate performance amongst ethnic subgroups when
broken down by subject. As demonstrated in Table 8, the largest achievement gap exists
between the Asian and Black students at Elementary #1 where it was 36.47% more likely
for an Asian student to show proficiency on the mathematics examination than a Black
student. One should also observe that the smallest achievement gap is observed between
White and Hispanic students in writing. A White student during the 2010 Spring AIMS
administration was only 4.34% more likely to show proficiency in writing in comparison
to a Hispanic student. Overall, Table 8 reveals that Elementary School #1 still exhibits
large performance gaps between ethnic subgroups at this school when disaggregated by
subject.
Table 8
Spring AIMS 2010 Elementary School #1 Performance by Subject and Ethnicity
Subject                Ethnicity   % FFB   % Approach   % Meets   % Exceeds
Mathematics            Asian           0         2.82     19.72       77.46
                       Black       10.71        28.57     35.71       25.00
                       Hispanic    11.63        13.95     50.00       24.42
                       White        3.98         9.48     36.39       50.15
Reading                Asian           0         1.41     53.52       45.07
                       Black        3.57        28.57     64.29        3.57
                       Hispanic     4.65        17.44     63.95       13.95
                       White        0.91         8.23     60.37       30.49
Writing                Asian           0            0     47.37       52.63
                       Black           0        21.43     64.29       14.29
                       Hispanic     4.35        13.04     63.04       19.57
                       White        0.62        12.42     62.73       24.22
Information provided in Table 9 shows that similar patterns exist at the junior high
level with respect to the achievement gap across subgroups. Junior high #4 displayed
issues that were seen across most of the junior highs within the district. As one can
observe in Table 9, Black/Hispanic proficiency rates were lower than White/Asian
proficiency rates across all three subjects. The most significant proficiency gap is the
Asian-Black gap in mathematics, where Asian students were 34.98% more likely to be
proficient at mathematics than their Black peers. The smallest
achievement gap was in writing, where Black students lagged behind their Asian peers by
5.38% in proficiency. Table 9 clearly demonstrates the existing gaps in proficiency
between ethnicities at junior high #4.
Table 9
Spring AIMS 2010 Junior High #4 Performance by Subject and Ethnicity
Subject                Ethnicity   % FFB   % Approach   % Meets   % Exceeds
Mathematics            Asian        6.42         3.67     32.11       57.80
                       Black       19.72        25.35     32.39       22.54
                       Hispanic    17.11        23.03     38.82       21.05
                       White        9.11        10.01     41.08       39.79
Reading                Asian        0.92        11.01     68.81       19.27
                       Black        4.23        16.90     66.20       12.68
                       Hispanic     3.95        14.47     73.68        7.89
                       White        1.02         8.19     75.03       15.75
Writing                Asian        2.13         6.38     80.85       10.64
                       Black        2.78        11.11     83.33        2.78
                       Hispanic     2.47        12.35     76.54        8.64
                       White        0.50         5.03     81.16       13.32
As shown in Table 10, High School #1 demonstrates that the observed pattern of
significant achievement gaps in subject areas continues at the high school
level. In mathematics the smallest achievement gap was between White and Hispanic
students, where White students were 19.17% more likely to achieve proficiency. In
reading, the gap between White and Black students was the smallest, with White students
demonstrating a rate of proficiency that was 7.67% greater. Writing exhibited the
smallest achievement gaps. The smallest of the gaps in writing existed between Hispanic
and White students, with Hispanic students being 3% less likely to pass the AIMS writing
examination.
Table 10
Spring AIMS 2010 High School #1 Performance by Subject and Ethnicity
Subject                Ethnicity   % FFB   % Approach   % Meets   % Exceeds
Mathematics            Asian        3.33         3.33     36.67       56.67
                       Black       22.73        25.00     38.64       13.64
                       Hispanic    30.77        10.26     39.74       19.23
                       White       12.06         9.80     45.98       32.16
Reading                Asian           0         3.33     76.67       20.00
                       Black           0        16.67     73.81        9.52
                       Hispanic     2.82        21.13     66.20        9.86
                       White        0.53         8.47     73.54       17.46
Writing                Asian           0         6.06     60.61       33.33
                       Black           0        19.05     59.52       21.43
                       Hispanic     2.90        10.14     63.77       23.19
                       White        0.25         9.80     58.79       31.16
The patterns illustrated in the examination of the three schools in Table 10, Table
9 and Table 8 across the subjects of reading, writing and mathematics were consistently
observed throughout the schools in this suburban district. The summary of subject level
achievement gaps for all analyzed schools is provided in Table 11. In examining Table
11, it should be noted that at least 87.5% of all the schools within the district exhibited
lower proficiency performance amongst Hispanic/Black students than Asian/White
students in mathematics and reading. The subject where the achievement gap appears to
elude educators the most is mathematics where at least 92.5% of the district schools
showed an achievement gap amongst the ethnicities examined. Furthermore, at least
77.5% of the schools had mathematics achievement gaps that were at least a 10-
percentage point difference during the 2010 school year.
Table 11
Percent of All District Schools with Observed Gap in Mathematics, Reading and Writing
by Ethnicity for 2010 Spring AIMS Administration
Subject       Ethnicity         X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Mathematics   Black-Asian           90.00            2.50           2.50      5.00
              Hispanic-Asian        87.50            5.00           2.50      5.00
              Black-White           90.00            7.50           2.50      0.00
              Hispanic-White        77.50           20.00           2.50      0.00
Reading       Black-Asian           57.50           32.50           7.50      2.50
              Hispanic-Asian        60.00           27.50          10.00      2.50
              Black-White           62.50           25.00          10.00      2.50
              Hispanic-White        62.50           30.00           7.50      0.00
Writing       Black-Asian           52.63           26.32          21.05      0.00
              Hispanic-Asian        47.37           26.32          21.05      5.26
              Black-White           33.33           28.21          30.77      7.69
              Hispanic-White        35.00           35.00          22.50      7.50
Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.
When analyzing achievement gaps by subject for excelling schools (see Table 12)
the picture during the 2009-2010 school year is not much different. The vast majority of
excelling schools within the school district showed significant achievement gaps in
mathematics, reading and writing. As shown in Table 12, in mathematics at least 20 of
the 22 excelling schools, or 90.9%, showed an achievement gap with respect to
proficiency. Reading and writing did not fare much better, with at least 81.8% and
59.1%, respectively, of the excelling schools showing a gap. Overall, the tables show
that excelling schools within this suburban school district exhibited a distinct
achievement gap during the 2009-2010 school year.
Table 12
Number of All District Excelling Schools With Observed Gap in Mathematics, Reading
and Writing by Ethnicity for 2010 Spring AIMS Administration
Subject       Ethnicity         X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Mathematics   Black-Asian              19               1              1         1
              Black-White              19               2              1         0
              Hispanic-Asian           18               2              1         1
              Hispanic-White           17               4              1         0
Reading       Black-Asian              10               9              2         1
              Black-White              12               7              3         0
              Hispanic-Asian           11               7              3         1
              Hispanic-White           12               8              2         0
Writing       Black-Asian               8               7              6         1
              Black-White               6               7              8         1
              Hispanic-Asian           11               6              4         1
              Hispanic-White            8               8              6         0
Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.
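The "at least" figures quoted for Table 12 can be recomputed by adding the two negative-gap columns for each comparison and taking the smallest total within each subject. The Python sketch below does this with the counts copied from Table 12; the names used are illustrative only.

```python
TOTAL_EXCELLING = 22

# (count with X < -10%, count with -10% < X < 0%) for each comparison in Table 12,
# in the order Black-Asian, Black-White, Hispanic-Asian, Hispanic-White.
negative_gap_counts = {
    "Mathematics": [(19, 1), (19, 2), (18, 2), (17, 4)],
    "Reading": [(10, 9), (12, 7), (11, 7), (12, 8)],
    "Writing": [(8, 7), (6, 7), (11, 6), (8, 8)],
}


def minimum_with_gap(rows):
    """Smallest number (and percent) of excelling schools showing a negative gap."""
    fewest = min(large + small for large, small in rows)
    return fewest, round(100 * fewest / TOTAL_EXCELLING, 1)


summary = {
    subject: minimum_with_gap(rows)
    for subject, rows in negative_gap_counts.items()
}
```

The resulting `summary` gives 20 schools (90.9%) for mathematics, 18 (81.8%) for reading and 13 (59.1%) for writing.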
2011 AIMS Summary - Overall
As with the 2009-2010 AIMS data, an achievement gap during the 2011 Spring
AIMS administration was noticed at the majority of the schools in the suburban school
district with respect to proficiency. Table 13, as compared to Table 3 in the 2010 AIMS
summary, demonstrates the same pattern in achievement gaps during the 2010-2011
school year. At Elementary #1, 96.06% of the Asian students were proficient on the
AIMS examination while only 82.76% and 80.20% of the Black and Hispanic students
were proficient, respectively. Similar performance gaps can be seen at Elementary
School #1 between Black and Hispanic students in comparison to White students,
88.70% of whom were proficient. Further examination of the other two
elementary schools in Table 13 shows similar results with respect to the achievement gap
amongst students of different ethnicities during the 2011 Spring AIMS administration.
At Elementary #2 Asian and White students were proficient on the exam at 95.05% and
91.60%, respectively. This compares to Black and Hispanic students being proficient on
the exam at 85.71% and 78.22%, respectively. At Elementary #3 Asian and White
students were proficient on the exam at 94.36% and 84.97%, respectively. This
compares to Black and Hispanic students being proficient on the exam at 81.96% and
74.86%, respectively. Once again, gaps in performance amongst different ethnic
subgroups are observed.
Table 13
Spring AIMS 2011 Subgroup Performance by Ethnicity and Elementary School
School                 Ethnicity   % FFB   % Approach   % Meets   % Exceeds
Elementary School #1   Asian        1.97         1.97     34.48       61.58
                       Black        3.45        13.79     70.69       12.07
                       Hispanic     2.54        17.26     58.88       21.32
                       White        2.50         8.80     49.82       38.88
Elementary School #2   Asian        0.50         4.46     55.45       39.60
                       Black        3.57        10.71     61.90       23.81
                       Hispanic     4.03        17.74     60.48       17.74
                       White        1.40         7.00     57.80       33.80
Elementary School #3   Asian        0.47         5.16     44.13       50.23
                       Black        1.64        16.39     67.21       14.75
                       Hispanic     4.57        20.57     62.29       12.57
                       White        3.95        11.07     55.25       29.72
Achievement gap trends are also seen at the junior high level during the 2011
Spring AIMS administration (see Table 14). At junior high #4 85.20% and 84.08% of
Asian and White students, respectively, were proficient on examinations taken. Noticeably
below this performance, at junior high #4 Black and Hispanic students were 68.57% and
70.72% proficient, respectively. The smallest gap in proficiency existed between White and
Hispanic students at junior high #4, where White students were 13.36% more likely to be
proficient than Hispanic students.
Table 14
Spring AIMS 2011 Subgroup Performance by Ethnicity and Junior High School
School                 Ethnicity   % FFB   % Approach   % Meets   % Exceeds
Junior High #4         Asian        5.10         9.69     50.51       34.69
                       Black        8.57        22.86     57.14       11.43
                       Hispanic    10.72        18.55     55.94       14.78
                       White        5.52        10.39     60.84       23.24
Junior High #5         Asian        2.74         6.16     52.74       38.36
                       Black        9.35        17.27     62.59       10.79
                       Hispanic     6.60        16.75     57.11       19.54
                       White        4.37         8.83     58.05       28.75
Junior High #6         Asian        3.85         6.41     64.10       25.64
                       Black       27.87        25.68     36.61        9.84
                       Hispanic    23.11        26.32     45.49        5.08
                       White       10.53        15.30     54.83       19.34
At the high school level a similar trend is noticed for the Spring 2011 AIMS
administration. As shown in Table 15, all four high schools in this suburban school
district displayed more than a 10% difference in proficiency for nearly every comparison
of White/Asian and Black/Hispanic performance. The only achievement gap that did not
show a 10% difference was at high school #1 where the gap in proficiency between
White and Hispanic students was 9.17%. The largest achievement gap between these
subgroups once again existed at High School #2 where the White-Hispanic gap was at
25.72% with respect to proficiency. All four of the high schools within this suburban
school district were labeled as excelling schools during the 2010-2011 school year.
Table 15
Spring AIMS 2011 Subgroup Performance by Ethnicity and High School
School                 Ethnicity   % FFB   % Approach   % Meets   % Exceeds
High School #1         Asian        1.00         3.00     56.00       40.00
                       Black       13.29        12.03     67.09        7.59
                       Hispanic     9.12        14.04     63.16       13.68
                       White        5.01         8.98     65.82       20.19
High School #2         Asian        2.16         5.95     49.19       42.70
                       Black       12.15        16.02     64.64        7.18
                       Hispanic    12.05        24.32     55.87        7.76
                       White        4.05         6.60     61.73       27.62
High School #3         Asian        1.31         2.09     49.35       47.26
                       Black        8.54        14.95     61.92       14.59
                       Hispanic     9.78        17.53     61.25       11.44
                       White        3.48         4.84     57.37       34.32
High School #4         Asian        1.72         5.17     63.79       29.31
                       Black       12.90        18.55     55.65       12.90
                       Hispanic     9.41        15.68     63.07       11.85
                       White        4.85         6.21     64.11       24.83
A summary of all achievement gaps as measured by proficiency percentage on the
2011 Spring AIMS examination can be found in Table 16. The table shows that of the 40
schools examined within the suburban school district the majority of them still exhibit
large differences in proficiency between Black/Hispanic students and Asian/White
students. For example, 25 out of the 40 schools, or 62.5%, exhibited a gap in proficiency
that was at least 10% lower for Black students as compared to White students. Similarly,
60.0% of the schools within the district showed at least a 10% proficiency gap between
Hispanic and White students.
Table 16
Spring AIMS 2011 Subgroup Proficiency Gaps Summary of All Schools
Number of Schools with Observed Gap
Ethnicity         X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian              30               8              2         0
Hispanic-Asian           31               6              3         0
Black-White              25              11              2         2
Hispanic-White           24              12              3         1
Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.
During the spring 2011 AIMS administration, schools within the suburban district
continued to struggle with closing the achievement gap with respect to proficiency
percentage. Results from analyzing descriptive statistics suggest that 2011 achievement
gap data remained very similar to 2010 achievement gap data. In the 2009-2010 school
year 3 out of 40 schools within the district showed a successful closing of the achievement
gap between Black and Asian students. During the 2010-2011 school year this number
reduced to 2 out of 40. The 2009-2010 school year showed one school had closed the
Hispanic-White achievement gap but during the 2010-2011 school year this number
increased to 4. Ultimately, it must be noted that for the vast majority of schools within
this school district large achievement gaps remained during the 2010-2011 school year.
The data from the 2011 Spring AIMS administration are more revealing when
the achievement gaps at each school are examined by school label. Table 17 shows a
summary of the achievement gaps observed for all excelling schools. Of the 22 excelling
schools during the 2011 school year, anywhere from 54.5% to 68.2% of them showed a
sizeable achievement gap between Hispanic/Black students and White/Asian students
(see Table 17). Once again, as detailed in Table 17, one of the only schools to have
realized a closing of the achievement gap between these subgroups was a specialty school
designed for high academic achieving students. This school can be observed in the gap
column represented by 0% < X < 10%. At a minimum, 20 of the 22 excelling schools in
the district showed an achievement gap between the ethnic groups being examined.
Table 17
Spring AIMS 2011 Subgroup Proficiency Gaps Summary of Excelling Schools
Number of Excelling Schools with Observed Gap
Ethnicity         X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian              15               6              1         0
Black-White              13               8              1         0
Hispanic-Asian           14               6              2         0
Hispanic-White           12               8              2         0
Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.
2011 AIMS Summary – By Subject
Examining the results of the 2011 Spring AIMS administration on the school level
by subject matter gives a similar picture as the overall summary not disaggregated by
subject. At the vast majority of schools large achievement gaps are observed between
White/Asian and Hispanic/Black subgroups in each of the three subjects: mathematics,
reading and writing. These gaps exist across all levels of schooling within the district
from the elementary level to the high school level.
Elementary #1 exhibits disparate performance amongst ethnic subgroups when
broken down by subject. As conveyed in Table 18, the Asian-Black gap in mathematics,
which was the largest gap at Elementary #1 during the 2009-2010 school year, persisted:
it was 15.49% more likely for an Asian student to show proficiency on the mathematics
examination than a Black student. While this is a drastic improvement over the 2009-2010
gap, it is still far from having closed the achievement gap between these two ethnicities.
Table 18 shows this proficiency disparity and several others. One should also observe that
the White and Hispanic achievement gap in writing, which was the smallest achievement gap
during the 2009-2010 school year, increased from 4.34% to 12.49%. A White student during
the 2011 Spring AIMS administration was 12.49% more likely to show proficiency in writing
in comparison to a Hispanic student. Overall, Table 18 reflects that the trend of large
performance gaps between ethnic subgroups at this school continued during the 2010-2011
school year.
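The year-over-year comparison for Elementary School #1 can be sketched by computing each gap from the % Meets and % Exceeds columns of Table 8 (2010) and Table 18 (2011) and then taking the difference. The Python example below covers the two comparisons discussed in the text; the data structure and function names are illustrative only.

```python
# (% Meets, % Exceeds) for Elementary School #1, copied from Table 8 (2010)
# and Table 18 (2011).
scores = {
    ("Mathematics", "Asian"): {2010: (19.72, 77.46), 2011: (17.33, 77.33)},
    ("Mathematics", "Black"): {2010: (35.71, 25.00), 2011: (62.50, 16.67)},
    ("Writing", "White"): {2010: (62.73, 24.22), 2011: (61.71, 18.86)},
    ("Writing", "Hispanic"): {2010: (63.04, 19.57), 2011: (59.57, 8.51)},
}


def proficiency(subject, group, year):
    """Percent proficient: students who meet or exceed the standard."""
    meets, exceeds = scores[(subject, group)][year]
    return meets + exceeds


def gap(subject, higher, lower, year):
    """Proficiency gap in percentage points for one subject and year."""
    return round(
        proficiency(subject, higher, year) - proficiency(subject, lower, year), 2
    )


def gap_change(subject, higher, lower):
    """Change in the gap from 2010 to 2011 (negative means the gap narrowed)."""
    return round(
        gap(subject, higher, lower, 2011) - gap(subject, higher, lower, 2010), 2
    )


math_change = gap_change("Mathematics", "Asian", "Black")
writing_change = gap_change("Writing", "White", "Hispanic")
```

With these inputs the Asian-Black mathematics gap moves from 36.47 to 15.49 points, a change of -20.98, while the White-Hispanic writing gap moves from 4.34 to 12.49 points, a change of 8.15.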
Table 18
Spring AIMS 2011 Elementary School #1 Performance by Subject and Ethnicity
Subject                Ethnicity   % FFB   % Approach   % Meets   % Exceeds
Mathematics            Asian        2.67         2.67     17.33       77.33
                       Black        8.33        12.50     62.50       16.67
                       Hispanic     5.33        16.00     44.00       34.67
                       White        4.50         8.41     32.13       54.95
Reading                Asian        1.33            0     48.00       50.67
                       Black           0         8.33     79.17       12.50
                       Hispanic     1.33         9.33     73.33       16.00
                       White        1.20         4.20     61.26       33.33
Writing                Asian        1.89         3.77     39.62       54.72
                       Black           0        30.00     70.00           0
                       Hispanic        0        31.91     59.57        8.51
                       White        1.14        18.29     61.71       18.86
As shown in Table 19, at the junior high level similar patterns exist with respect
to the achievement gap across subgroups. Junior high #4 displayed issues that were seen
across most of the junior highs within the district. As Table 19 reveals,
Black/Hispanic proficiency rates were lower than White/Asian proficiency rates
across all three subjects at junior high #4. The most significant proficiency gap is the
Asian-Black gap in mathematics, where Asian students were 18.80% more likely to be
proficient at mathematics than their Black peers. Once again, while this
gap between Asian and Black students is significantly lower than in the 2009-2010 school
year, it is still far from demonstrating the requirement in NCLB of closing the achievement gap.
Table 19
Spring AIMS 2011 Junior High #4 Performance by Subject and Ethnicity
Subject      Ethnicity  % FFB  % Approach  % Meets  % Exceeds
Mathematics  Asian      11.11        7.41    30.86      50.62
             Black      17.91       19.40    43.28      19.40
             Hispanic   24.09       12.41    40.15      23.36
             White      10.61        9.60    43.69      36.11
Reading      Asian       1.23        7.41    61.73      29.63
             Black       0.00       20.90    71.64       7.46
             Hispanic    2.19       18.25    67.15      12.41
             White       1.89        7.83    73.23      17.05
Writing      Asian       0.00       20.59    70.59       8.82
             Black       7.32       31.71    56.10       4.88
             Hispanic    1.41       30.99    64.79       2.82
             White       2.70       16.91    70.10      10.29
As shown in Table 20, High School #1 shows that the observed pattern of
achievement gaps at the subject level continues in the district for high schools. In
mathematics the smallest achievement gap was between White and Hispanic students,
where the proficiency rate of White students was 12.44 percentage points higher. As
detailed in Table 20, the gap between White and Black students in reading was the
smallest, with White students demonstrating a rate of proficiency that was 6.15
percentage points greater. Writing exhibited the smallest achievement gaps. The smallest
of the gaps in writing existed between Hispanic and White students, with the Hispanic
proficiency rate on the AIMS writing examination being 7.52 percentage points lower.
However, this gap in writing proficiency between Hispanic and White students was more
than double the 2009-2010 gap. The achievement gap rates demonstrated by High
School #1 help confirm the trends observed in the subject level analysis of school data.
Table 20
Spring AIMS 2011 High School #1 Performance by Subject and Ethnicity
Subject      Ethnicity  % FFB  % Approach  % Meets  % Exceeds
Mathematics  Asian       0.00        0.00    43.75      56.25
             Black      28.57        7.14    51.79      12.50
             Hispanic   23.23       12.12    46.46      18.18
             White      11.35       11.57    47.60      29.48
Reading      Asian       0.00        0.00    54.55      45.45
             Black       3.92        9.80    80.39       5.88
             Hispanic    2.11       12.63    70.53      14.74
             White       1.78        5.79    72.38      20.04
Writing      Asian       2.86        8.57    68.57      20.00
             Black       5.88       19.61    70.59       3.92
             Hispanic    1.10       17.58    73.63       7.69
             White       1.63        9.53    78.37      10.47
The patterns illustrated in the examination of the three schools across the subjects
of reading, writing and mathematics were consistently observed throughout the schools in
this suburban district. A summary of all analyzed schools for the 2011 AIMS
administration is provided in Table 21. In examining Table 21, it should be noted that
at least 67.5% of all the schools within the district exhibited lower proficiency
performance amongst Hispanic/Black students than Asian/White students in
mathematics, reading and writing. The subject where the achievement gap appears to
elude educators the most is mathematics where at least 90.0% of the district schools
showed an achievement gap. Furthermore, at least 67.5% of the schools had mathematics
achievement gaps that were at least a 10-percentage point difference during the 2011
school year.
Table 21
Percent of All District Schools with Observed Gap in Mathematics, Reading and Writing
by Ethnicity for 2011 Spring AIMS Administration.
Mathematics
Gap              X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      87.50%      10.00%          2.50%          0.00%
Hispanic-Asian   80.00%      12.50%          7.50%          0.00%
Black-White      70.00%      22.50%          5.00%          2.50%
Hispanic-White   67.50%      22.50%          7.50%          2.50%
Reading
Gap              X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      60.00%      30.00%          7.50%          2.50%
Hispanic-Asian   57.50%      32.50%          7.50%          2.50%
Black-White      50.00%      30.00%          15.00%         5.00%
Hispanic-White   40.00%      47.50%          10.00%         2.50%
Writing
Gap              X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      71.79%      10.26%          7.69%          10.26%
Hispanic-Asian   71.79%      17.95%          7.69%          2.56%
Black-White      45.00%      22.50%          22.50%         10.00%
Hispanic-White   57.50%      27.50%          12.50%         2.50%
Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.
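The four columns of Table 21 amount to a binning of the gap X for each school and subgroup pair. A minimal sketch of such a binning follows, assuming X is the first-named subgroup's proficiency rate minus the second's (so "Black-Asian" is Black minus Asian) and assuming a value that falls exactly on a boundary goes to the higher bin; neither convention is stated in the table, and the function name `gap_bin` is hypothetical.

```python
def gap_bin(x):
    """Place a gap X (in percentage points) into one of the Table 21 columns.

    Boundary handling is an assumption: -10, 0 and 10 fall into the higher bin.
    """
    if x < -10:
        return "X < -10%"
    if x < 0:
        return "-10% < X < 0%"
    if x < 10:
        return "0% < X < 10%"
    return "X > 10%"


# Black minus Asian mathematics proficiency at Elementary #1 (from Table 18).
elementary_1_math = gap_bin(79.17 - 94.66)
```

Tallying the bin labels over all schools and dividing by the number of schools would give percentages of the kind reported in each row.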
An analysis of achievement gaps by subject for excelling schools shows that the picture
during the 2010-2011 school year is not much different from the prior year (see Table 22).
The vast majority of excelling schools within the school district showed achievement
gaps in mathematics, reading and writing. Table 22 also demonstrates that in
mathematics at least 19 of the 22 excelling schools, or 86.4%, showed an achievement
gap with respect to proficiency. Reading and writing had comparable results with at least
86.4% and 72.7%, respectively, of the excelling schools showing a gap.
Table 22
Number of All District Excelling Schools With Observed Gap in Mathematics, Reading
and Writing by Ethnicity for 2011 Spring AIMS Administration.
Mathematics
Gap              X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      17          4               1              0
Black-White      15          6               1              0
Hispanic-Asian   14          5               3              0
Hispanic-White   12          8               2              0
Reading
Gap              X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      11          9               2              0
Black-White      9           10              3              0
Hispanic-Asian   9           11              2              0
Hispanic-White   6           14              2              0
Writing
Gap              X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      16          3               3              0
Black-White      9           7               5              1
Hispanic-Asian   14          5               3              0
Hispanic-White   12          6               4              0
Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column
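The "at least" figures quoted for excelling schools can be read off Table 22 by adding the two negative-gap columns for each subgroup pair and taking the smallest total within a subject. The sketch below illustrates that reading under the assumption that a negative X means the first-named subgroup scored lower; the names `TABLE_22`, `min_schools_with_gap`, and `share_with_gap` are illustrative only.

```python
EXCELLING_SCHOOLS = 22

# Counts from Table 22: (X < -10%, -10% < X < 0%) for each subgroup pair.
TABLE_22 = {
    "Mathematics": {
        "Black-Asian": (17, 4), "Black-White": (15, 6),
        "Hispanic-Asian": (14, 5), "Hispanic-White": (12, 8),
    },
    "Reading": {
        "Black-Asian": (11, 9), "Black-White": (9, 10),
        "Hispanic-Asian": (9, 11), "Hispanic-White": (6, 14),
    },
    "Writing": {
        "Black-Asian": (16, 3), "Black-White": (9, 7),
        "Hispanic-Asian": (14, 5), "Hispanic-White": (12, 6),
    },
}


def min_schools_with_gap(subject):
    """Smallest number of schools with a negative gap across the four pairs."""
    return min(large + small for large, small in TABLE_22[subject].values())


def share_with_gap(subject):
    """The same minimum expressed as a percentage of the 22 excelling schools."""
    return 100 * min_schools_with_gap(subject) / EXCELLING_SCHOOLS
```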
Summary of AIMS Proficiency Data from Spring 2010 and Spring 2011
After examining proficiency percentages amongst the four ethnic subgroups for
the 40 schools within the suburban school district, it is evident that, despite school
labels, the closing of the achievement gap continues to elude a wide majority of the
schools. Each school exhibited minor differences in the achievement gap, but upon
compilation the data suggest that whether a school is excelling or highly performing,
there appear to be distinct proficiency performance differences between ethnic
subgroups at these schools. In fact, at a minimum 59.1% of the excelling schools during
each of the 2010 and 2011 school years had lower proficiency performance amongst
Black or Hispanic students in comparison to White or Asian students in mathematics,
reading or writing. Furthermore, in this suburban school district, where approximately
75% of the schools receive one of the two highest labels from ADE, at least 60% of the
schools still exhibit achievement gaps of at least 10 percentage points amongst
academically low-performing ethnic subgroups.
Analysis of the Achievement Gap Using AIMS Scale Score
The analysis of the achievement gap using the average scale score on the AIMS
examination during the 2010 and 2011 AIMS test administration follows. This analysis
addresses the second research question of:
2. Is the average student achievement, as measured by average scale score, in
ethnic subgroups different for the 2009-2010 and 2010-2011 AIMS
examinations at each non-alternative school throughout the suburban school
district?
The analysis of this research question is divided into two sections. The first section
provides an analysis of ANOVA results for the 2010 and 2011 school years. The second
section provides a summary of this analysis.
2010 and 2011 Achievement Gap Analyzed through Average Scale Score
After examining the achievement gap relative to the percentage of students that
met proficiency at a given school the next question in the study was to examine the
achievement gap with respect to the average scale score at a given school. The AIMS
examination results in students receiving two types of scores: a raw score and a scale
score. A raw score is simply the number of questions the student answered correctly on
the exam. The scale score is a transformation of the raw score such that comparisons across
different versions of the test can be made. Essentially, by horizontally scaling an
examination a 6th grade student that took version H of the mathematics exam can be
compared to a 6th grade student that took version G of the mathematics exam. Using the
scaled scores for different ethnic subgroups within each school to analyze the
achievement gap examines the validity of the results drawn in the first part of Chapter 4.
Originally it was believed that average scale score for students of one ethnic
subgroup at a school could be compared to the average scale score of all other ethnic
subgroups at a school using an ANOVA. Unfortunately, upon initially examining
average scale scores it was evident that the process would not be this simple. As noted in
Table 23, average scale scores in mathematics appeared to increase with each grade level
throughout the district. The table shows that across the 7 grade levels in which the AIMS
mathematics examination is administered the average scale score increases from 384.37
to 518.30.
Table 23
Average Scale Score throughout the District on 2010 AIMS Mathematics Administration
Grade Average
3rd 384.37
4th 401.23
5th 410.85
6th 428.72
7th 442.92
8th 453.59
10th 518.30
This suggested that while the AIMS examination was horizontally scaled the
scaling across grade levels was not equivalent. As a result, if a school had a
disproportionate population that included a higher percentage of sixth grade Black
students than third grade Black students the average scale score for Black students would
be skewed by the lack of vertical scaling. Consequently, results of the ANOVA could
possibly reflect the grade-level distribution of students in an ethnic subgroup rather than
their performance.
One way to account for the lack of vertical scaling across grade levels was to
transform the data so that every student’s score is on the same scale. Using a simple
linear transformation, z = (xi − µ)/σ (equivalently, xi = zσ + µ), each student’s scale
score xi was transformed into a z-score based on the average scale score µ and the
standard deviation of scale scores σ for that student’s grade level. The advantage of
using this linear transformation is that the shape of the original distribution is preserved
and the distribution of scores amongst ethnic subgroups is also preserved. After
performing the linear transformation on both the 2010 and 2011 AIMS mathematics and
reading scores, the ANOVAs were performed on the resulting z-scores.
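The grade-level standardization described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the procedure actually run for the study: the function name `standardize_by_grade` and the sample records are hypothetical, and the population standard deviation (`pstdev`) is an assumption because the text does not say whether the population or sample form was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_grade(records):
    """Convert (grade, scale_score) pairs into z-scores within each grade.

    Returns the z-scores in the same order as the input records.
    Assumes every grade has at least two distinct scores, so that the
    standard deviation is non-zero.
    """
    scores_by_grade = defaultdict(list)
    for grade, score in records:
        scores_by_grade[grade].append(score)
    # Mean and standard deviation of the scale score for each grade level.
    stats = {g: (mean(s), pstdev(s)) for g, s in scores_by_grade.items()}
    return [(score - stats[grade][0]) / stats[grade][1] for grade, score in records]


# Hypothetical scale scores for three 3rd graders and three 6th graders.
sample = [(3, 370), (3, 390), (3, 410), (6, 410), (6, 430), (6, 450)]
z_scores = standardize_by_grade(sample)
```

In this made-up sample the 6th grade scores are 40 points higher across the board, yet each student receives the same z-score as the corresponding 3rd grader, which is the property that removes the grade-level effect from a school's subgroup averages.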
In total there were eighty ANOVAs performed on the AIMS data for the 2010 and
2011 school years. Each of the 40 schools had one ANOVA for the 2010 school year and
one ANOVA for the 2011 school year. Transforming mathematics and reading scores
into a z-score allowed for the comparison of mathematics and reading z-scores in
conjunction with each other. A z-score of 1 in mathematics represents the same thing as
a z-score of 1 in reading, which is a student scoring one standard deviation above the
mean on each respective subject test. As a result the ANOVA was conducted using the
mathematics and reading z-scores as the dependent variable and ethnicity (Black,
Hispanic, White, Asian) as the factor.
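For readers less familiar with the test, the F statistic of a one-way ANOVA with ethnicity as the single factor is the ratio of the between-group mean square to the within-group mean square. The sketch below computes it from first principles; the function name `one_way_anova_f` and the z-scores are hypothetical, and the study's own statistical software is not named in this section.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA over a list of groups of observations."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical z-scores for one school, grouped by ethnicity.
z_by_ethnicity = {
    "Asian": [0.9, 1.1, 0.4],
    "Black": [-0.6, -0.2, 0.1],
    "Hispanic": [-0.5, 0.0, -0.1],
    "White": [0.3, 0.6, 0.2],
}
f_statistic = one_way_anova_f(list(z_by_ethnicity.values()))
```

A large F indicates that the subgroup means differ by more than the spread within subgroups would suggest; the corresponding p-value comes from the F distribution with the two degrees of freedom computed in the function.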
When performing an ANOVA it is important to analyze the data to see if any of
the conditions have been violated. The three main conditions for ANOVAs are
normality of observations within each group, homogeneity of variances across groups,
and independence of observations. The most important of the
assumptions when running an ANOVA is independence. Stevens (2007, p. 59) states that
independence is “by far the most important assumption, for even a small violation of it
produces a substantial effect on both the level of significance and the power of the F
statistic.” Fortunately, in the case of this research independence has not been violated.
In evaluating independence, Glass and Hopkins (1984, p. 353) state
that “whenever the treatment is individually administered, observations are independent.
But where treatments involve interaction among persons…the observations may
influence each other.” In the case of the AIMS examination the treatment, or test, is
individually administered. Consequently, according to Glass and Hopkins, the
observations on the AIMS examination must be independent of one another for each
individual student.
The second most important of the assumptions in an ANOVA is the assumption
of homogeneity of the population variance across ethnicities. Levene’s test of
homogeneity of variance provided analysis with respect to this assumption. During the
2010 school year 11 of the 40 schools exhibited heterogeneity of variances according to
Levene’s test (α = 0.05). During the 2011 school year 6 of the 40 schools exhibited
heterogeneity of variances according to Levene’s test (α = 0.05). Although Levene’s test
for these schools showed cause for concern with heterogeneity, a violation of this
assumption is not problematic unless group sizes are sharply unequal (Stevens, 2007, p. 58).
Unfortunately, as listed in Table 24, for the 11 schools from 2010 and the 6
schools from 2011 the ratios of the largest group size to the smallest group size were all
in excess of 1.5. As a result, the corresponding ANOVAs for these schools were not
analyzed for the purposes of this research.
Table 24
Levene’s Test for Homogeneity of Variance P-Values for Each School across Ethnicities
with Respect to Average Scale Score
School Number     2010 Levene’s   2010 Largest/Smallest   2011 Levene’s   2011 Largest/Smallest
                  P-Value         Group Ratio             P-Value         Group Ratio
Elementary #1     0.428           -                       0.496           -
Elementary #2     0.070           -                       0.771           -
Elementary #3     0.235           -                       0.790           -
High School #1    0.359           -                       0.710           -
Junior High #1    0.205           -                       0.129           -
High School #2    0.037           6.147                   0.001           5.565
Elementary #4     0.767           -                       0.086           -
Elementary #5     0.020           17.632                  0.680           -
Elementary #6     0.538           -                       0.619           -
Elementary #7     0.178           -                       0.025           8.222
Junior High #2    0.136           -                       0.691           -
Elementary #8     0.067           -                       0.760           -
Elementary #9     0.025           4.940                   0.029           4.149
Elementary #10    0.120           -                       0.228           -
Elementary #11    0.472           -                       0.784           -
Elementary #12    0.055           -                       0.332           -
Elementary #13    0.333           -                       0.100           -
High School #3    0.951           -                       0.001           4.423
Special #1        0.072           -                       0.055           -
Elementary #14    0.896           -                       0.693           -
Elementary #15    0.205           -                       0.346           -
Elementary #16    0.270           -                       0.124           -
Elementary #17    0.308           -                       0.888           -
Junior High #3    0.677           -                       0.462           -
Elementary #18    0.964           -                       0.435           -
Elementary #19    0.004           11.154                  0.549           -
Elementary #20    0.923           -                       0.171           -
High School #4    0.443           -                       0.130           -
Elementary #21    0.134           -                       0.409           -
Elementary #22    0.239           -                       0.587           -
Elementary #23    0.213           -                       0.182           -
Elementary #24    0.113           -                       0.961           -
Elementary #25    0.589           -                       0.417           -
Elementary #26    0.042           13.726                  0.192           -
Junior High #4    0.005           10.986                  0.023           10.934
Elementary #27    0.046           10.294                  0.066           -
Elementary #28    0.016           8.881                   0.250           -
Elementary #29    0.033           11.520                  0.361           -
Junior High #5    0.000           11.293                  0.287           -
Junior High #6    0.008           13.875                  0.001           12.424
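The screening rule applied to Table 24 can be stated compactly: a school's ANOVA was set aside when Levene's test rejected equal variances at α = 0.05 and the largest subgroup was more than 1.5 times the size of the smallest. A minimal sketch of that rule follows; the function names and the group sizes passed to `size_ratio` are hypothetical, while the three example rows are taken from the 2010 columns of Table 24.

```python
ALPHA = 0.05           # significance level used for Levene's test in the text
MAX_SIZE_RATIO = 1.5   # threshold on largest/smallest group size from the text


def size_ratio(group_sizes):
    """Ratio of the largest subgroup size to the smallest."""
    return max(group_sizes) / min(group_sizes)


def exclude_from_anova(levene_p, ratio):
    """True when variances are heterogeneous and group sizes sharply unequal."""
    return levene_p < ALPHA and ratio > MAX_SIZE_RATIO


# (2010 Levene's p-value, 2010 largest/smallest group ratio) from Table 24.
rows_2010 = {
    "High School #2": (0.037, 6.147),
    "Elementary #5": (0.020, 17.632),
    "Junior High #6": (0.008, 13.875),
}
excluded = [school for school, (p, r) in rows_2010.items() if exclude_from_anova(p, r)]
```

A school whose Levene p-value is at or above 0.05 is retained regardless of its group sizes, which is why Table 24 reports a ratio only for the schools that failed the test.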
The final assumption for an ANOVA is the assumption of normality within each
group. Kolmogorov-Smirnov tests (α = 0.05) were run on each subgroup within each
school for the 2010 and 2011 school years to determine whether the distribution of scale
scores was approximately normal. As shown in Table 25, during the 2010 AIMS
administration, 106 of the 160 tests failed to reject normality as a reasonable assumption
for each group’s distribution. In 2011, 105 of the 160 tests failed to reject normality as a
reasonable assumption for each group’s distribution. One might become concerned about
the 54 groups in 2010 and the 55 groups in 2011 that violated the assumption of
normality. However, a summary by Glass, Peckham and Sanders (1972) of research
on the effects of non-normality on an ANOVA shows that non-normality
has only a slight effect on Type I errors. Stevens (2007, p. 57) states, “the F
statistic is robust with respect to the normality assumption.” As a result, even though the
normality assumption was violated in approximately 34% of the group distributions (see
Table 25), it should have a negligible effect on the Type I error rate and power of
the ANOVAs due to the robust nature of the F statistic when encountering non-normality.
Table 25
Kolmogorov-Smirnov (KS) P-Values for Normality for 2010 and 2011 AIMS
Distributions by Ethnicity
School Number  2010 KS White  2010 KS Hispanic  2010 KS Black  2010 KS Asian  2011 KS White  2011 KS Hispanic  2011 KS Black  2011 KS Asian
E #1 0.000 0.200 0.200 0.200 0.038 0.200 0.000 0.085
E #2 0.002 0.031 0.200 0.200 0.000 0.004 0.073 0.088
E #3 0.000 0.001 0.200 0.057 0.001 0.200 0.200 0.039
HS #1 0.001 0.187 0.060 0.200 0.000 0.063 0.200 0.200
JH #1 0.000 0.014 0.200 0.022 0.000 0.200 0.080 0.078
HS #2 0.000 0.010 0.200 0.200 0.000 0.200 0.024 0.200
E #4 0.011 0.200 0.200 0.200 0.034 0.031 0.200 0.200
E #5 0.200 0.200 0.200 0.200 0.000 0.022 0.200 0.001
E #6 0.000 0.200 0.048 0.019 0.000 0.013 0.170 0.029
E #7 0.000 0.200 0.200 0.200 0.000 0.096 0.200 0.200
JH #2 0.018 0.200 0.200 0.083 0.001 0.039 0.200 0.010
E #8 0.002 0.200 0.200 0.200 0.030 0.018 0.200 0.200
E #9 0.200 0.007 0.200 0.041 0.054 0.017 0.200 0.002
E #10 0.200 0.030 0.200 0.112 0.197 0.200 0.033 0.200
E #11 0.200 0.200 0.200 0.200 0.087 0.060 0.200 0.200
E #12 0.200 0.041 0.200 0.200 0.200 0.157 0.200 0.200
E #13 0.049 0.034 0.200 0.200 0.200 0.039 0.200 0.200
HS #3 0.000 0.200 0.200 0.050 0.000 0.161 0.200 0.004
S #1 0.065 0.000 0.200 0.001 0.001 0.200 0.200 0.200
E #14 0.200 0.200 0.200 0.200 0.024 0.156 0.200 0.200
E #15 0.038 0.087 0.017 0.200 0.001 0.026 0.200 0.061
E #16 0.007 0.007 0.200 0.200 0.003 0.018 0.200 0.091
E #17 0.200 0.049 0.200 0.200 0.013 0.200 0.011 0.078
JH #3 0.057 0.008 0.200 0.080 0.002 0.151 0.036 0.075
E #18 0.001 0.200 0.156 0.200 0.069 0.200 0.200 0.200
E #19 0.200 0.200 0.178 0.162 0.200 0.022 0.098 0.200
E #20 0.009 0.200 0.200 0.200 0.200 0.200 0.087 0.200
HS #4 0.000 0.000 0.200 0.200 0.000 0.200 0.200 0.200
E #21 0.000 0.025 0.200 0.200 0.000 0.190 0.038 0.200
E #22 0.000 0.031 0.200 0.200 0.000 0.177 0.200 0.200
E #23 0.200 0.200 0.200 0.200 0.200 0.027 0.200 0.200
E #24 0.200 0.200 0.200 0.200 0.200 0.002 0.181 0.200
E #25 0.063 0.200 0.200 0.200 0.020 0.200 0.200 0.200
E #26 0.001 0.200 0.200 0.200 0.000 0.200 0.200 0.084
JH #4 0.001 0.200 0.200 0.045 0.000 0.200 0.200 0.200
E #27 0.001 0.200 0.192 0.049 0.200 0.200 0.200 0.200
E #28 0.090 0.200 0.200 0.200 0.002 0.200 0.157 0.200
E #29 0.000 0.200 0.200 0.013 0.016 0.200 0.200 0.099
JH #5 0.000 0.004 0.200 0.024 0.000 0.200 0.200 0.200
JH #6 0.000 0.000 0.200 0.200 0.199 0.010 0.002 0.200
After taking into account all of the required assumptions further analysis of the
ANOVA was performed on the remaining schools that did not violate the critical
assumptions. This included 29 schools from 2010 and 34 schools from 2011. The results
from the 2010 and 2011 ANOVA for these schools are shared in Table 26.
As reported in Table 26, of the 29 schools that did not violate the assumptions of
the ANOVA 27 of these schools had statistically significant ANOVA results (α = 0.06).
These results suggest that between the ethnicities of Asian, Black, Hispanic and White 27
of the schools exhibited a difference between average scale score in at least two of the
ethnic subgroups. Of the 27 schools in 2010 that exhibited a statistically significant
difference in average scale score between at least two ethnic subgroups 16 were
excelling, 6 were highly performing, 1 was performing plus and 4 were performing.
During the 2010 school year 30 schools in the district received the two highest school
labels issued by ADE with 22 being excelling and 8 being highly performing. 73.3% of
these 30 schools still exhibited an achievement gap, as measured by average scale score,
between at least two of the four ethnic subgroups studied. Furthermore, 72.7% of the 22
schools in the district that were labeled excelling still exhibited an achievement gap. The
examination of the achievement gap with respect to average scale score in 2010 returned
results similar to those seen in the comparison of proficiency percentages in the prior
analysis within Chapter 4 for the first research question.
Table 26
Results from the 2010 ANOVA for 29 District-Wide Schools that did not violate the
Assumptions of the ANOVA
School F-Statistic P-Value
Elementary #1 48.254 0.000
Elementary #2 20.291 0.000
Elementary #3 25.073 0.000
High School #1 9.459 0.000
Junior High #1 82.894 0.000
Elementary #4 7.858 0.000
Elementary #6 5.167 0.002
Elementary #7 10.561 0.000
Junior High #2 4.041 0.008
Elementary #8 15.168 0.000
Elementary #10 6.675 0.000
Elementary #11 10.484 0.000
Elementary #12 1.668 0.172
Elementary #13 7.969 0.000
High School #3 59.319 0.000
Special #1 4.923 0.002
Elementary #14 2.748 0.042
Elementary #15 7.453 0.000
Elementary #16 19.781 0.000
Elementary #17 10.932 0.000
Junior High #3 88.497 0.000
Elementary #18 11.188 0.000
Elementary #20 9.207 0.000
High School #4 13.877 0.000
Elementary #21 3.760 0.011
Elementary #22 20.083 0.000
Elementary #23 14.863 0.000
Elementary #24 1.746 0.157
Elementary #25 50.360 0.000
The results from the 2011 analysis of variance remained consistent with the
results from 2010. After taking into account the assumptions
of the ANOVA, 34 schools were left to analyze. The results from the 2011 ANOVA are
reported in Table 27.
As relayed in Table 27, of the 34 schools that did not violate the assumptions of
the ANOVA 32 of these schools had statistically significant ANOVA results (α = 0.06).
These results suggest that between the ethnicities of Asian, Black, Hispanic and White 32
of the schools exhibited a difference between average scale score in at least two of the
ethnic subgroups. Of the 32 schools in 2011 that exhibited a statistically significant
difference in average scale score between at least two ethnic subgroups 19 were
excelling, 4 were highly performing, 5 were performing plus and 4 were performing.
During the 2011 school year 29 schools in the district received the two highest school
labels issued by ADE with 22 being excelling and 7 being highly performing. 79.3% of
these 29 schools still exhibited an achievement gap, as measured by average scale score,
between at least two of the four ethnic subgroups studied. Furthermore, 86.4% of the 22
schools in the district that were labeled excelling still exhibited an achievement gap. In
conjunction with the AZ Learns Legacy labels, the ADE also issued during the 2011
school year a new label called AZ Learns A-F letter grades. Throughout the district 28
schools received the top letter grades of A or B with 18 schools receiving the letter grade
of A. 78.6% of these 28 schools still exhibited an achievement gap between at least two
of the four ethnic subgroups as measured by average scale score. Similarly, 83.3% of the
18 schools that received an A still exhibited an achievement gap.
Table 27
Results from the 2011 ANOVA for 34 District-Wide Schools that did not violate the
Assumptions of the ANOVA
School F-Statistic P-Value
Elementary #1 29.157 0.000
Elementary #2 11.499 0.000
Elementary #3 19.417 0.000
High School #1 26.203 0.000
Junior High #1 124.766 0.000
Elementary #4 7.989 0.000
Elementary #5 9.537 0.000
Elementary #6 12.773 0.000
Junior High #2 9.985 0.000
Elementary #8 16.801 0.000
Elementary #10 8.295 0.000
Elementary #11 6.573 0.000
Elementary #12 10.708 0.000
Elementary #13 2.112 0.098
Special #1 13.898 0.000
Elementary #14 7.790 0.000
Elementary #15 7.531 0.000
Elementary #16 20.204 0.000
Elementary #17 4.585 0.003
Junior High #3 68.232 0.000
Elementary #18 16.650 0.000
Elementary #19 2.700 0.045
Elementary #20 6.522 0.000
High School #4 20.652 0.000
Elementary #21 3.918 0.009
Elementary #22 8.342 0.000
Elementary #23 26.462 0.000
Elementary #24 1.417 0.237
Elementary #25 35.985 0.000
Elementary #26 26.754 0.000
Elementary #27 28.745 0.000
Elementary #28 66.337 0.000
Elementary #29 15.898 0.000
Junior High #5 26.891 0.000
For the 27 and 32 schools during the 2010 and 2011 school years, respectively,
for which the ANOVA tests showed an achievement gap, further analysis was performed
using Tukey HSD to discover which specific ethnicities demonstrated a significant
achievement gap. The 2010 results of Tukey HSD (α = 0.01, accounting for the
Bonferroni effect) for the 27 schools that had a significant F-statistic
from the ANOVA are displayed in Table 28.
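With four ethnic subgroups there are six pairwise comparisons per school, which is why a stricter significance level is used for the post hoc tests. The sketch below enumerates the pairs in the column order of the post hoc tables and flags the significant ones for a single school. It rests on two assumptions that the text does not spell out: that the reported α = 0.01 approximates a familywise 0.05 divided across the six comparisons, and that a p-value must be strictly below 0.01 to count as significant. The names used are illustrative.

```python
from itertools import combinations

GROUPS = ["W", "H", "B", "A"]  # White, Hispanic, Black, Asian
# Six pairs, in the same order as the post hoc table columns:
# W-H, W-B, W-A, H-B, H-A, B-A
PAIRS = ["-".join(pair) for pair in combinations(GROUPS, 2)]

FAMILYWISE_ALPHA = 0.05  # assumed starting level before correction
bonferroni_alpha = FAMILYWISE_ALPHA / len(PAIRS)  # about 0.0083
REPORTED_ALPHA = 0.01    # level stated in the text


def significant_pairs(p_values, alpha=REPORTED_ALPHA):
    """Return the subgroup pairs whose Tukey HSD p-value falls below alpha."""
    return [pair for pair, p in zip(PAIRS, p_values) if p < alpha]


# High School #1, 2010 post hoc p-values as reported in Table 28.
high_school_1 = significant_pairs([0.004, 0.003, 0.166, 0.905, 0.001, 0.000])
```

Counting, for each pair, the schools in which that pair is flagged yields school totals of the kind discussed in the paragraphs that follow.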
Of the 27 schools where the ANOVA showed a significant achievement gap, 17
exhibited an achievement gap as measured by average scale score between
White and Hispanic students (see Table 28). Furthermore, Tukey HSD showed that 18 of
these schools had a significant gap in performance between White and Black
students. Gaps relative to Asian students were less frequent, but they did exist: a
significant gap in comparison to Asian students existed for Hispanic students at 14 of
these schools and for Black students at 16. Of the
30 schools that received one of the two highest labels from ADE, almost half, 14, still
showed a significant gap between White and Hispanic students. More
than half of the 30 schools, 16, showed a significant performance gap with respect to
average scale score between White and Black students. The Hispanic and
Black achievement gaps on AIMS relative to Asian performance fared
similarly amongst these 30 schools with a distinguished label from ADE. The number of
distinguishably labeled schools with Hispanic and Black achievement gaps in
comparison to Asian students was 12 and 13, respectively. Ultimately, somewhere
between 40% and 53.3% of the schools labeled distinguished by ADE still exhibited an
achievement gap between Hispanic or Black students and White or Asian
students as examined by Tukey HSD.
Table 28
2010 Results for Tukey HSD – Post Hoc Tests
School W-H W-B W-A H-B H-A B-A
Elementary #1 0.000 0.000 0.000 0.213 0.000 0.000
Elementary #2 0.000 0.000 0.003 0.994 0.000 0.000
Elementary #3 0.000 0.000 0.000 0.986 0.000 0.000
High School #1 0.004 0.003 0.166 0.905 0.001 0.000
Junior High #1 0.000 0.000 0.000 0.302 0.000 0.000
Elementary #4 0.052 0.001 0.444 0.327 0.011 0.000
Elementary #6 0.342 0.001 0.961 0.282 0.836 0.058
Elementary #7 0.168 0.000 0.627 0.033 0.075 0.000
Junior High #2 0.497 0.769 0.026 0.294 0.008 0.692
Elementary #8 0.000 0.872 0.620 0.763 0.996 0.968
Elementary #10 0.001 0.892 0.794 0.457 0.032 0.579
Elementary #11 0.024 0.000 0.279 0.061 0.001 0.000
Elementary #13 0.000 0.008 0.995 0.895 0.023 0.119
High School #3 0.000 0.000 0.003 0.973 0.000 0.000
Special #1 0.871 0.335 0.013 0.671 0.035 0.023
Elementary #14 0.627 0.120 0.618 0.268 0.263 0.052
Elementary #15 0.042 0.000 0.963 0.302 0.124 0.003
Elementary #16 0.179 0.000 0.000 0.003 0.000 0.000
Elementary #17 0.000 0.014 0.438 0.904 0.000 0.003
Junior High #3 0.000 0.000 0.103 0.256 0.000 0.000
Elementary #18 0.000 0.028 0.070 0.860 0.001 0.001
Elementary #20 0.000 0.002 0.963 0.871 0.012 0.010
High School #4 0.000 0.000 0.965 0.026 0.137 0.000
Elementary #21 0.129 0.034 0.999 0.798 0.648 0.276
Elementary #22 0.000 0.589 0.744 0.002 0.000 0.321
Elementary #23 0.000 0.000 0.992 0.998 0.020 0.075
Elementary #25 0.000 0.000 0.000 0.660 0.000 0.000
Note. (α = 0.01)
The 2011 results of Tukey HSD (α = 0.01, accounting for
the Bonferroni effect) for the 32 schools that had a significant F-statistic from the
ANOVA are displayed in Table 29. Of the 32 schools where the ANOVA showed a
significant achievement gap, 25 exhibited an achievement gap as
measured by average scale score between White and Hispanic students (see Table 29).
Furthermore, Tukey HSD showed that 18 of these schools had a significant gap in
performance between White and Black students. Gaps relative to Asian students were
less frequent, but they did exist: a significant gap in comparison to Asian students
existed for Hispanic students at 22 of these schools and for Black students at 18. Of
the 28 schools that received one of the two highest grades
from ADE, more than half, 17, still showed a significant gap between White and Hispanic
students. Exactly half of these 28 schools, 14, showed a significant
performance gap with respect to average scale score between White and Black students.
The Hispanic and Black achievement gaps on AIMS relative to
Asian performance fared similarly amongst these 28 schools with
distinguished grades from ADE. The number of schools with distinguished letter grades
that had Hispanic and Black achievement gaps in comparison to Asian students was 17
and 15, respectively. Ultimately, somewhere between 50% and 60.7% of
the schools graded distinguished by ADE still exhibited an achievement gap between
Hispanic or Black students and White or Asian students as examined by
Tukey HSD (see Table 29).
Table 29
2011 Results for Tukey HSD – Post Hoc Tests
School W-H W-B W-A H-B H-A B-A
Elementary #1 0.000 0.002 0.000 0.945 0.000 0.000
Elementary #2 0.000 0.327 0.265 0.167 0.000 0.050
Elementary #3 0.001 0.238 0.000 0.975 0.000 0.000
High School #1 0.000 0.000 0.000 0.384 0.000 0.000
Junior High #1 0.000 0.000 0.000 0.135 0.000 0.000
Elementary #4 0.010 0.071 0.122 0.954 0.000 0.003
Elementary #5 0.004 0.004 0.293 0.983 0.000 0.000
Elementary #6 0.789 0.000 0.504 0.000 0.987 0.001
Junior High #2 0.072 0.135 0.001 0.998 0.000 0.000
Elementary #8 0.000 0.006 1.000 0.999 0.150 0.285
Elementary #10 0.000 0.072 0.947 0.997 0.297 0.581
Elementary #11 0.033 0.043 0.273 0.957 0.003 0.003
Elementary #12 0.000 0.000 0.000 0.889 0.961 0.999
Special #1 1.000 0.219 0.000 0.336 0.002 0.000
Elementary #14 0.001 0.966 0.002 0.123 0.115 0.011
Elementary #15 0.005 0.003 0.964 0.777 0.016 0.005
Elementary #16 0.000 0.000 0.885 0.208 0.002 0.000
Elementary #17 0.004 0.157 0.955 0.998 0.320 0.505
Junior High #3 0.000 0.000 0.015 0.974 0.000 0.000
Elementary #18 0.000 0.116 0.058 0.877 0.000 0.003
Elementary #19 0.805 0.994 0.063 0.824 0.025 0.163
Elementary #20 0.000 0.915 0.727 0.225 0.003 0.606
High School #4 0.000 0.000 0.629 1.000 0.000 0.000
Elementary #21 0.025 0.191 0.722 0.998 0.951 0.935
Elementary #22 0.001 0.003 0.946 0.957 0.012 0.011
Elementary #23 0.000 0.000 0.999 0.639 0.000 0.013
Elementary #25 0.000 0.000 0.023 0.851 0.000 0.000
Elementary #26 0.000 0.000 0.024 0.080 0.000 0.000
Elementary #27 0.000 0.000 0.835 0.991 0.000 0.000
Elementary #28 0.000 0.000 0.011 0.606 0.000 0.100
Elementary #29 0.000 0.649 0.432 0.071 0.000 0.215
Junior High #5 0.000 0.000 0.010 0.342 0.000 0.000
Note. (α = 0.01)
Summary of ANOVAs and Average Scale Score
Similar to results observed in the examination of proficiency percentages,
excelling schools and highly performing schools examined through ANOVAs with
respect to average scale score continued to show significant achievement gaps. In the
2010 school year approximately 72% (see Table 11) of the schools that were
distinguishably labeled by ADE still exhibited at least one achievement gap between the
four ethnicities, and in 2011 approximately 75% (see Table 21) of the highly graded
schools had similar results. Upon further analysis of post-hoc tests, between 40% and
53.3% (see Table 28) of the 30 highly labeled schools during the 2010 school year
exhibited an average scale score gap between either Hispanic or Black students and
either White or Asian students. Furthermore, during the 2011 school year results were
congruent, with post-hoc ANOVA tests showing that between 50% and 60.7% (see Table 29)
of the 28 highly graded schools had persistent gaps similar to those observed in 2010.
Ethnicity Proportion and Z-Score
The analysis of the proportions of Asian/White students in comparison to a
school’s z-score follows. This analysis addresses the third research question of:
3. Is the percentage of Asian and White students correlated with the state-issued
z-score, a standardized score for the percent of students that exceed on the
AIMS examination at a school in a given year, which helps determine school
labels within the state of Arizona?
The analysis of this research question is divided into three sections. The first section
addresses the 2010 school year and the second addresses the 2011 school year. A summary of
findings follows these two sections.
Correlation and Linear Regression between Ethnicity Proportions and Z-Score
A relationship exists between the percentage of Asian/White students at a school
and the corresponding z-score that the school receives in the AZ LEARNS Legacy school
labeling model. The strength of the relationship has varied slightly over the 2009-2010
and 2010-2011 school years. Overall, the relationship between these two variables is
strong, suggesting that a z-score may be directly linked to school demographics.
One of the ways that the ADE establishes the difference between a highly
performing school and an excelling school is through standardizing the percentage of
students at a school that exceed on the AIMS examination. The standardized value
is statistically known as a z-score. The z-score is calculated by taking the proportion of
students at a given school that exceed on the AIMS exam, subtracting the average
proportion of students that exceed on the AIMS exam for each school state-wide, and then
dividing by the standard deviation of the proportions of students that exceed on the AIMS
exam for each school state-wide. Schools that have met the requirements to be highly
performing or excelling are further separated by examining the z-score.
A z-score greater than or equal to one establishes a school as excelling as opposed
to highly performing. Z-scores greater than or equal to one suggest that a school has had
enough students exceed AIMS exams so that they are at least one standard deviation
above the average proportion of exceeders for schools throughout the state. Each school
within the suburban school district being analyzed received a z-score for the 2010 and 2011
school years. As established previously in Chapter 4, 22 of the schools throughout the
district received the excelling label, which means that 22 of the schools had z-scores
greater than or equal to one.
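The standardization and the cut-off of one described above can be illustrated with a minimal sketch. The state-wide proportions below are hypothetical, the function names are illustrative, and the use of the population standard deviation is an assumption, since the exact ADE computation is not reproduced here.

```python
from statistics import mean, pstdev


def z_score(school_prop, state_props):
    """Standardize one school's proportion of students who exceed.

    Assumption: the population standard deviation of the state-wide
    school proportions is used as the divisor.
    """
    mu = mean(state_props)
    sigma = pstdev(state_props)
    return (school_prop - mu) / sigma


def legacy_label(z):
    """Separate excelling from highly performing schools.

    Only meaningful for schools that already met the other requirements.
    """
    return "excelling" if z >= 1.0 else "highly performing"


# Hypothetical state-wide proportions of exceeding students (not ADE data).
state_props = [0.10, 0.20, 0.30, 0.40, 0.50]

z_high = z_score(0.50, state_props)
z_mid = z_score(0.30, state_props)
print(round(z_high, 3), legacy_label(z_high))
print(round(z_mid, 3), legacy_label(z_mid))
```

A school sitting exactly at the state-wide average receives a z-score of zero and, under this rule, would remain highly performing.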
2010 Data. A correlation and regression analysis of 2010 data established that
during this school year there was a strong relationship between the proportion of
Asian/White students at a school and the z-score that the school received. The coefficient
of determination for the regression analysis was 0.702 (see Table 30) which suggests that
70.2% of the variability in z-score can be accounted for by the variability in the
proportion of Asian/White students at the school. A relationship as strong as this
suggests that the number of exceeders at a school is strongly related to the demographic
make-up of the school.
Table 30
Linear Correlation Summary for 2010 Z-Score regressed on Percentage of
Asian/Caucasian (ACP) Students at a School
R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
.838(a)   .702       .694                .5714580                     1.663
Note. a Predictors: (Constant), ACP. b Dependent Variable: z-score
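The adjusted R Square reported in the summary table can be reproduced from R Square with the standard adjustment formula. The sketch below assumes n = 40 schools (the number of schools sampled in the study) and a single predictor; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n, p):
    """Adjusted R^2 for a model with p predictors fitted to n cases."""
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - p - 1)


# Assumption: 40 schools, one predictor (ACP).
adj_2010 = adjusted_r_squared(0.702, n=40, p=1)
print(round(adj_2010, 3))  # 0.694
```

With only one predictor and 40 cases the adjustment is small, which is why the adjusted value stays close to the unadjusted coefficient of determination.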
Further analysis of the 2010 scatterplot and regression shows increased support
for the strong relationship between z-score and the proportion of Asian/White students at
a school (see Figure 3). The model of Y = B0 + B1X resulted in Y = -1.309 + 3.600X
(see Table 31), where Y is the predicted z-score based on the line of conditional means
and X is the proportion of Asian/White students at a school. The slope of this model is
3.600 and the 95% confidence interval for the slope is (2.52, 4.06). At α = 0.05 the
confidence interval suggests that a slope of 0, or no relationship between these two
variables, can be statistically rejected. Consequently, when accounting for error in the
model, it appears as if the strong relationship can still be validated.
Figure 3. Scatterplot of 2010 Z-Score versus Proportion of Asian and Caucasian
Students at a School.
Table 31
Coefficients and Standard Error of Coefficients for 2010 Z-Score Regressed onto
Percentage of Asian/Caucasian Students at a School
             Unstandardized Coefficients
             B        Std. Error   t        Sig
(Constant)   -1.309   .259         -5.058   .000
ACP          3.600    .381         9.454    .000
Note. a Dependent Variable: Zscore
The biggest caution must be given when examining the residual plot for the 2010
data. At first glance of the bivariate scatterplot the model does seem to be somewhat
heteroskedastic. The residual plot for the model is found in Figure 4. Limited data points
in the suburban district being examined created a situation where few z-scores were
available for schools with small proportions of Asian/White students. As a result it does
appear as if the model exhibits increased variance as the proportion of Asian/White
students increases. However, it has been observed that moderate violations of
homoscedasticity may make a model weaker but do not necessarily
make it invalid (Tabachnick & Fidell, 1996). Although this limits the use of the
model as a predictor of z-score, it does not inhibit the interpretation that there is a strong
relationship between these two variables.
Figure 4. Residual Plot for Regression Model regressing 2010 Z-Score onto Percentage
of Asian/Caucasian Students at a School.
2011 Data. The examination of 2011 data on z-scores and percent of
White/Asian students at a school produced results similar to the 2010 data.
Once again, during this school year a strong relationship between the two variables is
exhibited. Pearson’s product-moment correlation coefficient was 0.823, which resulted in
a coefficient of determination of 0.678 (see Table 32). Essentially, 67.8% of the
variability in z-score can be accounted for by the variability in the proportion of
Asian/White students at a given school within the district. Once again, the high R2 value
suggests that the relationship between the two variables being examined is significant.
Table 32
Linear Correlation Summary for 2011 Z-Score regressed on Percentage of
Asian/Caucasian (ACP) Students at a School
R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
.823(a)   .678       .670                .5611580                     1.700
Note. a Predictors: (Constant), ACP. b Dependent Variable: Zscore
Further analysis of the 2011 scatterplot and regression shows increased support
for the strong relationship between z-score and the proportion of Asian/White students at
a school (see Figure 5). The model of Y = B0 + B1X resulted in Y = -1.131 + 3.288X
(see Table 33), where Y is the predicted z-score based on the line of conditional means
and X is the proportion of Asian/White students at a school. The slope of this model is
3.2875 and the 95% confidence interval for the slope is (2.54, 4.03). At α = 0.05 the
confidence interval suggests that a slope of 0, or no relationship between these two
variables, can be statistically rejected. Consequently, when accounting for error in the
model, it appears as if the relationship can still be validated.
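The 2011 confidence interval for the slope can be approximated from the reported coefficient and its standard error. This is a sketch under stated assumptions: 40 schools and one predictor give 38 degrees of freedom, and 2.024 is the tabled two-sided 95% critical t value for that many degrees of freedom. The function name is illustrative.

```python
def slope_confidence_interval(slope, std_error, t_critical):
    """Return (lower, upper) bounds: slope +/- t_critical * std_error."""
    margin = t_critical * std_error
    return slope - margin, slope + margin


# Assumption: df = 40 - 2 = 38, so the tabled critical value is about 2.024.
T_CRIT_DF38 = 2.024

# Reported 2011 slope and standard error for ACP.
low, high = slope_confidence_interval(3.288, 0.367, T_CRIT_DF38)
print(round(low, 2), round(high, 2))

# The interval excludes 0, so a slope of 0 is rejected at alpha = 0.05.
print(low > 0)
```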
Figure 5. Scatterplot of 2011 Z-Score versus Proportion of Asian and Caucasian
Students at a School.
Table 33
Coefficients and Standard Error of Coefficients for 2011 Z-Score Regressed onto
Percentage of Asian/Caucasian Students at a School
             Unstandardized Coefficients
             B        Std. Error   t        Sig
(Constant)   -1.131   .249         -4.540   .000
ACP          3.288    .367         8.948    .000
Note. a Dependent Variable: Zscore
Once again the biggest caution for the model would be actually using the model to
predict the z-score for schools that have a higher proportion of Asian/White students.
While the model does establish a strong relationship between the two variables it does
exhibit similar signs of heteroskedasticity when examining the Figure 6 residual plot.
Figure 6 shows that as the proportion of Asian/White students increases the variance in the
prediction of the model also increases. This is of concern if a district administrator used
the model in an attempt to predict the z-score for any given school. However, it is of less
concern when simply using the model to establish that a strong relationship between the
two variables exists. In fact, upon further examination of the 2011 residual plot one
might notice a point of importance that would actually play a role in reducing the R2
value for the regression model. The point, although not influential as measured by
Cook’s distance (Di = 0.2798), has a residual of 1.78 and can be readily explained by the
school being a specialty school within the district that particularly targets the highest
performing academic students. Essentially, it is a school that pulls the very best students
from other district schools in order to give them a maximized learning experience.
Without this point the value of the coefficient of determination would certainly increase,
resulting in further evidence for the strength of the relationship between z-score and
proportions of Asian/White students at a school.
Figure 6. Residual Plot for Regression Model regressing 2011 Z-Score onto Percentage
of Asian/Caucasian Students at a School.
Summary of Linear Regression Analysis
Data from the 2010 and 2011 AIMS examinations and consequently the 2010 and
2011 AZ Learns Legacy school labeling systems suggest that there is a strong
relationship between the variables being examined. The strength of the relationship is
cemented by coefficients of determination of 0.7017 and 0.6781 (see Table 30 and Table
32) which would be considered large effect sizes within the social and behavioral
sciences. This relationship is of particular importance as z-score is used to help
determine whether a school is labeled as excelling.
A-F Letter Grade and Demographic Data
The analysis of the four demographic data points and the state issued school letter
grade at a school follows. This analysis addresses the fourth research question of:
4. Are free and reduced lunch rates, English Language Learner rates, percentage
of Asian students and percentage of White students correlated with the AZ
LEARNS A-F letter grades published by the state of Arizona for schools
within the suburban district?
The analysis of this research question is divided into two sections. The first section
examines the multiple linear regression for the 2011 school year. The second
section provides a summary of these findings.
Regressing A-F Letter Grade Value onto Ethnicity Proportions, Free and Reduced Lunch
Rate and ELL Proportions Using Multiple Linear Regression
Standard multiple regression was performed to establish the ability of the
proportion of White (W), proportion of Asian (A), proportion of ELL (ELL) and
proportion of free and reduced lunch rate (FR) students to predict the AZ Learns A-F
letter grade (Grade) assigned by ADE. It was determined that there were not any outliers
by using Mahalanobis distance to evaluate the data for multivariate outliers. The critical
χ2 value with df=4 at p<0.001, which is the generally accepted p-value for Mahalanobis
distance (Mertler & Vannatta, 2005, p. 53), is χ2=18.467. The largest χ2 value observed
in the data set was 12.995 for elementary school #12. A residual scatterplot, displayed in
Figure 7, was used to examine the condition of multivariate homogeneity of variance-
covariance (Mertler & Vannatta, p. 173). The examination of the standardized residual in
comparison to the standardized predicted value yielded a residual plot that did not
demonstrate clustering. As a result the condition of multivariate homogeneity of
variance-covariance was assumed. Finally, a scatterplot matrix (see Figure 8) examining
the relationship between all variables intended to be used in the standard multiple
regression was analyzed to examine multivariate normality and linearity. Unfortunately,
some of the variable scatterplots did exhibit non-elliptical patterns. In particular, the
variable of ELL proportions consistently showed an L-Shape pattern when examined
with other variables. Consequently, it was determined to eliminate the variable of ELL
proportions from the regression analysis.
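The critical chi-square value used to screen Mahalanobis distances can be checked without statistical tables. The sketch below is one possible way to do so and is not the procedure used in the study: it relies on the closed-form upper-tail probability of the chi-square distribution, which holds only for even degrees of freedom, and finds the critical value by bisection. The function names are illustrative.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-square with even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for even, positive df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def chi2_critical_even_df(alpha, df, lo=0.0, hi=200.0):
    """Bisection search for the value whose upper-tail probability is alpha."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf_even_df(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


critical_df4 = chi2_critical_even_df(0.001, 4)
largest_observed = 12.995  # reported for elementary school #12
print(round(critical_df4, 3), largest_observed < critical_df4)
```

Because the largest observed distance falls below the critical value, no case would be flagged as a multivariate outlier under this criterion.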
Figure 7. Residual Plot for 2011 Regression that Regresses School Letter Grade onto
Four Independent Variables.
[Scatterplot matrix panels for the variables Grade, W, A, ELL and FR]
Figure 8. Scatterplot Matrix for All Variables in 2011 Multiple Regression.
After examining the data and eliminating ELL proportions it was then necessary
to go back and re-examine the standardized residual plot that helped establish
multivariate homogeneity. Figure 9 displays the residual plot for a multiple regression
with the variable of ELL proportions removed from the model. Once again, as revealed
in Figure 9, the standardized residual plot showed no signs of clustering and,
consequently, the process moved on to analyzing the information from the regression
analysis.
Figure 9. Residual Plot for 2011 Regression that Regresses School Letter Grade onto
Three Independent Variables.
One of the first issues that a researcher must check after performing a multiple
regression is multicollinearity. Multicollinearity can limit the size of R2 by having
multiple variables account for the same variability and it can make it difficult to evaluate
which predictors are the most important (Stevens, 2007, p. 234). There are several ways
to check for multicollinearity including analyzing correlation matrices, tolerance values
and variance inflation factors (VIF). High correlations amongst independent variables,
tolerance values less than 0.1 (Mertler & Vannatta, p. 169), or VIF values that exceed 10
(Myers, 1990, p. 369) can all indicate that variables exhibit
multicollinearity. In the case of the multiple regression analyzed in this research it was
quickly observed using all three of these methods (see Table 34 and Table 35) that two of the
three variables showed strong tendencies towards multicollinearity.
As presented in Table 34, free and reduced lunch proportions and White student
proportions were correlated with r = -0.958. Furthermore, Table 35 details that each of
these variables had low tolerance values, 0.028 and 0.039 respectively, and high VIF
values of 35.851 and 25.483.
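Tolerance and VIF are reciprocals of one another, so the two cut-offs cited above flag the same predictors. The following sketch applies both rules to the reported tolerance values; the function names are illustrative, and small differences from the reported VIF values arise because the tolerances are rounded to three decimals.

```python
def vif_from_tolerance(tolerance):
    """VIF is the reciprocal of tolerance (tolerance = 1 - R^2 of a
    predictor regressed on the remaining predictors)."""
    return 1.0 / tolerance


def flags_multicollinearity(tolerance, tol_cutoff=0.1, vif_cutoff=10.0):
    """Apply the tolerance < 0.1 and VIF > 10 rules of thumb."""
    return tolerance < tol_cutoff or vif_from_tolerance(tolerance) > vif_cutoff


# Tolerance values as reported for the three-predictor model.
reported_tolerance = {"FR": 0.028, "A": 0.252, "W": 0.039}

for name, tol in reported_tolerance.items():
    print(name, round(vif_from_tolerance(tol), 2), flags_multicollinearity(tol))
```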
Stevens suggests that there are three ways to deal with multicollinearity which
include:
1. If there are three measures relating to a single construct which have
intercorrelations of about 0.8 or larger then add them to form a single
predictor.
2. Use factorial analysis methodology.
3. Use ridge regression (Stevens, p. 235).
Using the first of Stevens’ suggestions is quite possibly the most viable for the multiple
regression being analyzed in this situation. However, in order to use Stevens’ suggestion
we need three measures relating to a single construct. The three measures that have high
intercorrelations (see Table 34), at least 0.8, in this study include Free and Reduced
proportions, ELL proportions and White proportions. Thus, we will introduce the ELL
proportion back into the multiple regression by summing it with Free and Reduced
proportions and White proportions.
Table 34
Inter-correlation Matrix for Three Variables being examined in 2011 Multiple Linear
Regression
        Grade    FR       A        W
Grade   1.000    -.823    .670     .712
FR      -.823    1.000    -.688    -.958
A       .670     -.688    1.000    .509
W       .712     -.958    .509     1.000
Table 35
Collinearity Statistics for Three Variables used in 2011 Multiple Linear Regression
             Collinearity Statistics
             Tolerance   VIF
(Constant)
FR           .028        35.851
A            .252        3.972
W            .039        25.483
Note. a Dependent Variable: Grade
After combining the three variables (SUM3) the conditions for multiple
regression were retested. Figure 10 shows that the scatterplot matrix exhibited no major
violations of linearity and normality, with the majority of the six scatterplots exhibiting
moderately elliptical patterns.
[Scatterplot matrix panels for the variables Grade, SUM3 and A]
Figure 10. Scatterplot Matrix for Three Variables in 2011 Multiple Regression
A reexamination of Mahalanobis distance once again showed no outliers in the
data set with p = 0.001 and χ2=13.82. The standardized residual plot exhibited no
clustering and demonstrated a random pattern of errors (see Figure 11).
Figure 11. Residual Plot for 2011 Regression that Regresses School Letter Grade onto
SUM3 and Percent of Students at a School that are Asian.
Finally, multicollinearity was reduced in the preliminary regression analysis. As
shown in Table 36, tolerance for both variables was acceptable at 0.361 and, due to its
inverse relationship with tolerance, VIF was also acceptable at 2.767. Unfortunately, after
examining all of the conditions for standard multiple regression, further analysis of the
regression data (see Table 36) found that the variable of proportion of Asian students was not
statistically significant for the model (t = 0.223, p = 0.825).
Table 36
2011 Multiple Linear Regression Model for Letter Grade regressed onto the variables of
SUM3 and Percentage of Asian Students at a School
             Unstandardized Coefficients                    Collinearity Statistics
             B          Std. Error   t        Sig    Tolerance   VIF
(Constant)   237.229    25.291       9.380    .000
A            14.971     67.059       .223     .825   .361        2.767
Sum          -113.048   22.108       -5.113   .000   .361        2.767
Note. a Dependent Variable: Grade
Due to the intercorrelation of these variables, and the inability to separate the
variance accounted for by each individual variable, it was determined that the research
would proceed with constructing a single variable model based on the highest correlated
variable. The variable that exhibited the highest correlation with the ADE assigned letter
grade was free and reduced lunch proportion. The model constructed was a simple
linear regression of the form Y = B0 + B1X. After running the single regression the
resulting linear equation displayed in Table 37 was Y = 153.821 - 64.289X where Y was
the predicted grade issued by ADE and X was the proportion of free and reduced lunch
students at a school in the district. As shown in Table 38 the model constructed had
r = -0.823 and R2 = 0.677. Overall this suggests that 67.7% of the variability in school grades
assigned by ADE within this district was accounted for by the variability in the free and
reduced lunch percentage at the school.
Table 37
2011 Single Variable Linear Regression Model for Letter Grade regressed onto the
variable of Free and Reduced Lunch Percentage
             Unstandardized Coefficients
             B         Std. Error   t        Sig
(Constant)   153.821   2.924        52.614   .000
FR           -64.289   7.210        -8.916   .000
Note. a Dependent Variable: Grade
Table 38
2011 Coefficient of Determination for Letter Grade regressed onto the variable of Free
and Reduced Lunch Percentage
R         R Square   Adjusted R Square   Std. Error of the Estimate   Durbin-Watson
.823(a)   .677       .668                11.52450                     1.681
Note. a Predictors: (Constant), FR. b Dependent Variable: Grade
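The single variable model can be expressed as a short sketch that evaluates the fitted line and recomputes the t statistic from the reported coefficient and standard error. The function names and the example proportion of 0.5 are illustrative only; the output is a point on the fitted line, not a forecast for any particular school.

```python
def predicted_grade(fr_proportion, intercept=153.821, slope=-64.289):
    """Evaluate the reported line Y = 153.821 - 64.289X, where X is the
    proportion (0 to 1) of free and reduced lunch students."""
    return intercept + slope * fr_proportion


def t_statistic(coefficient, std_error):
    """t statistic for a regression coefficient: estimate / standard error."""
    return coefficient / std_error


# Hypothetical school with half of its students on free and reduced lunch.
example = predicted_grade(0.5)
t_fr = t_statistic(-64.289, 7.210)
print(round(example, 2), round(t_fr, 2))
```

The negative slope means that each increase of 0.10 in the free and reduced lunch proportion corresponds to a decrease of roughly 6.4 in the predicted grade value.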
Summary of Relationship between School-Level Variables and School Letter Grades
The relationship between free and reduced lunch percentages and AZ Learns A-F
letter grades within this suburban district is demonstrated to be significant. High
intercorrelation (see Table 34) between variables being examined made it difficult for the
research to employ the multiple linear regression techniques desired. The intercorrelation
is most likely attributed to the fact that ethnicity and poverty continue to be interwoven
within the social class structure of the United States. Schools with low poverty (low free and
reduced lunch proportions) almost always show higher proportions of White and Asian
students because families of these ethnicities are less likely to live in poverty.
Ultimately, the high intercorrelation of the predetermined variables prohibited the use of
multiple linear regression.
Of the four variables examined free and reduced lunch proportions of each school
showed the highest correlation (see Table 34) with the dependent variable of school letter
grades. The relationship between these two variables was strong. In comparison to the
regression analysis relating z-score to proportion of Asian/White students at a school this
model also showed comparable strength. Although the intended examination of four
variables in relationship to the variable of school letter grades failed, the finding that
one of the best variables we have in education to measure poverty, free and reduced lunch
percentage, provides the best predictor of school letter grade is very useful.
Summary of Chapter 4
Throughout the analysis provided in Chapter 4 two themes appeared to become
prevalent. First, the majority of schools throughout the district still exhibited an
achievement gap between ethnic groups represented at their school. Moreover, the
achievement gap found in both proficiency percentage and average scale score was
shown to exist even if a school was labeled with a distinguished school label by ADE.
Secondly, it was found that there existed a relationship between certain demographic data
and numerical variables that aid in determining school labels. A strong relationship
between the percent of White and Asian students at a school and z-score was found.
Also, a strong relationship between free and reduced lunch rates at a school and the
school letter grade assigned by ADE existed. These two themes were the most prevalent
of all findings as their implications for schools labeled with distinguished labels by the
ADE are profound.
CHAPTER 5
Conclusions, Summary, Implications, and Recommendations
Introduction
This chapter provides a summary of the study and important conclusions drawn
from the analysis provided in Chapter 4. The chapter presents a discussion of the major
implications for action that can be drawn from the data presented throughout the research.
It then makes recommendations for further research that can be conducted at the school,
district and state level. Also included in the chapter are a review of the methodology,
findings as related to current literature and concluding remarks. The chapter serves as a
summary to readers in an effort to focus on the critical conclusions from the provided
research.
Summary of the Study
The study sought to better understand the achievement gap at schools in a
suburban district in the southwest United States. A descriptive analysis of the
achievement gap as measured by proficiency percentage on the AIMS examination
coupled with an inferential analysis of average scale score on the AIMS examination for
the 2009-2010 and 2010-2011 school years provided a comprehensive picture of the gap
at these schools. Using figures and tables, accompanied with the ANOVA results from
the inferential analysis, the research sought to better understand whether schools were
making progress in closing the achievement gap addressed in NCLB. Throughout the
analysis a conditional relationship with school labels was examined by referencing
conditional probabilities based on results from the descriptive analysis and the inferential
analysis.
The secondary part of the study attempted to examine the relationships between
demographic variables and school labels which could possibly mask whether a school
had or had not closed the achievement gap. A linear correlation and regression was
analyzed in an attempt to describe the relationship between the percentage of White and
Asian students at a school and z-score (an ADE issued standardized score that is used in
determining AZ Learns Legacy school labels). A multiple linear regression was
analyzed in an attempt to describe the relationship between four different demographic
variables and the school letter grade which is an ADE school label issued starting in the
2011 school year.
Overview of the Problem
The inception of NCLB has mandated that a ranking or labeling system for
schools and districts be established and sustained for accountability. However, there is
variation within and between each state’s ranking systems. Specifically, in Arizona’s
ranking system the labels may not statistically identify if the achievement gap has been
closed. As a result, school labels may be more readily linked with the demographics of
the school than the best practices within the school.
Purpose Statement
The purpose was to examine the achievement gap, particularly in mathematics
and reading, at all non-alternative schools within a suburban school district within the
state of Arizona for the 2009-2010 and 2010-2011 school years. Furthermore, the study
sought to examine demographic reasons why the schools within the suburban school
district obtained high and low school labels. The study was specifically interested in
student achievement across ethnic subgroups with respect to the state standardized AIMS
examination. Another interest of the study included the descriptive analysis of cross-
sectional data in reading and mathematics at these schools from 2009-2010 and 2010-
2011. Furthermore, the study sought to define the predictive abilities of the percentage of
non-Black/Hispanic students with respect to the percentage of students that exceeded on
the AIMS examination. Using four main research questions as a guide, data from two
prior years was analyzed at schools throughout the suburban school district.
Research Methodology
Ex-post facto data was analyzed with quantitative methods for this research. The
data allowed for the examination of the achievement gap at a school district in the
southwest United States. Furthermore, the data allowed for an evaluation of Arizona’s
school labeling system with respect to the suburban school district. The data was
conveniently sampled from 40 different schools throughout the district and included
student AIMS scores, ADE issued z-scores, ADE issued school letter grades and school
demographic data. The quantitative data was then analyzed descriptively and
inferentially by using an ANOVA. Finally, bivariate and multivariate relationships were
examined between demographic data and ADE issued z-score and school letter grade.
Major Findings Summary
The findings from this research study are given in two parts in this section. First,
a summary of findings for each research question is briefly provided. Following that
summary an overall thematic summary is provided to synthesize the information to a
broader level. The summary of findings for each research question will state the research
question followed by a brief summary of the overall findings regarding said question.
Research Question 1
What are the two-year cross-sectional data trends for the achievement gap among
White, Asian, Hispanic and Black students on the 2009-2010 and 2010-2011 AIMS
mathematics, reading and writing sections at all schools in the suburban school district?
Finding. The two-year cross-sectional data trends for the achievement gap within
this suburban school district suggest that the majority of schools within the district still
struggle with closing the elusive gap. Moreover, schools with distinguished labels from
the ADE have been shown to have similar problems with closing the achievement gap in
this district.
Research Question 2
Is the average student achievement, as measured by average scale score, in ethnic
subgroups different for the 2009-2010 and 2010-2011 AIMS examinations at each non-
alternative school throughout the suburban district?
Finding. The achievement gap, when examined by average scale score, existed at
the majority of schools throughout the district. The findings from this research question
were consistent with the findings from the first research question. These findings
included district schools continuing to struggle with closing the achievement gap and
distinguishably labeled schools similarly struggling. Analyzing the achievement gap
amongst the schools in this suburban district using two different metrics provided
significant evidence to support a lingering achievement gap.
Research Question 3
Is the percentage of Asian and White students correlated with the state-issued z-
score, a standardized score for the percent of students that exceed on the AIMS
examination at a school in a given year, which helps determine school label within the
state of Arizona?
Finding. For both the 2010 and 2011 school years the percentage of Asian and
White students at a school was highly correlated with the state-issued z-score. A strong
relationship between these two variables suggests that a school’s ability to be labeled as
excelling is related to whether it has a high proportion of Asian and White students.
Research Question 4
Are free and reduced lunch rates, English Language Learner rates, percentage of
Asian students and percentage of White students correlated with AZ LEARNS A-F letter
grades published by the state of Arizona for schools within the suburban district?
Finding. Inter-correlation between the variables examined provided a significant
complication when analyzing this research question. After using a couple of different
statistical solutions a single variable linear regression showed that free and reduced lunch
rates were highly correlated with A-F letter grades for schools within this district.
Major Findings Discussion
The findings from each research question presented in Chapter 4 can be
summarized under two major themes. First, throughout the suburban school district that
was examined schools continue to struggle with closing the persistent achievement gap
between the four ethnicities studied. Second, school labels and grades are strongly
correlated with demographic measures prevalent in the school. Accounting for both of
these themes, the following discussion provides clarification with respect to the underlying
data providing evidence for these themes.
Schools within the suburban school district examined continued to show a
persistent achievement gap. This gap was found in schools across the labeling spectrum.
In 2009-2010, the majority of schools throughout the district demonstrated an
achievement gap on AIMS. When examining the conditional distribution of the
achievement gap across school labels it was also shown that the persistent achievement
gap was prevalent amongst schools in the district with the highest of school labels. In
2011, the trend in achievement gap on the AIMS examination continued with the
majority of schools in the district showing distinct gaps in performance amongst different
ethnicities across all three subjects. Schools with high letter grades, A or B, in 2010-
2011 also continued to show significant achievement gaps. In concert, the cross-sectional
examination of 2009-2010 and 2010-2011 AIMS results showed the majority of schools
throughout the district continued to show persistent achievement gaps and a large
proportion of distinguishably labeled, and graded, schools showed similar achievement
gaps.
The achievement gap endured when examined from a
different perspective. Instead of using proficiency percentage as a measurement of
achievement gap, average scale score was inferentially examined to study the
achievement gap during the same two years. Once again, the majority of the schools
throughout the district demonstrated a persistent achievement gap. Further analysis of
conditional probabilities showed that during 2009-2010 a high proportion of the schools
labeled highly performing demonstrated an achievement gap between either Hispanic or
Black students and either Asian or White students. The 2010-2011 data continued to
show achievement gaps in average scale score amongst schools in this district labeled
distinctly by ADE. An analysis of A or B schools in 2010-2011 showed that district
schools that received these high grades had similar ethnic achievement gaps. To avoid
drawing false conclusions from proficiency percentages alone, the analysis of average
scale score provided corroborating evidence. The majority of schools
throughout the district showed a significant achievement gap and a large proportion of
schools with elevated labels, and grades, showed similar achievement gaps.
The second theme resulted from examining the relationship between variables that
help determine school labels and demographic data. The relationship between the
percentage of White and Asian students at a school and the school-level z-score was
shown to be strong. During the 2009-2010 school year, a significant amount of the
variability in z-score was accounted for by the variability in the percentage of White and
Asian students at the schools. During the 2010-2011 school year the percent of
variability was nearly identical. The relationship between this demographic variable and
z-score, a variable that helps the ADE determine school labels, suggests that the
percentage of White and Asian students at a given school may well be a critical factor
in determining school labels.
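The phrase "variability accounted for" refers to the coefficient of determination of a
single-predictor linear regression. As a minimal sketch of how that quantity is computed,
the Python below uses invented values (not data from this district) and the hypothetical
names `pct_white_asian` and `school_z_score`. It illustrates the statistic only and is not
the procedure or software used in this study.

```python
from statistics import mean


def r_squared(x, y):
    """Proportion of variance in y accounted for by a least-squares line on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    # For one predictor, R^2 equals the squared Pearson correlation.
    return (sxy ** 2) / (sxx * syy)


# Hypothetical, made-up values for illustration only (one entry per school).
pct_white_asian = [35.0, 48.0, 52.0, 61.0, 70.0, 78.0, 85.0]
school_z_score = [-1.1, -0.6, -0.2, 0.1, 0.4, 0.9, 1.3]

r2 = r_squared(pct_white_asian, school_z_score)
print(f"Share of z-score variability accounted for: {r2:.3f}")
```

A value near 1 would mean the demographic variable alone tracks the z-score closely; a
value near 0 would mean it carries little information about the z-score.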
Further analysis of the new school label, AZ Learns A-F letter grades, issued by
the ADE during the 2010-2011 school year helped to solidify the second theme. A
multiple linear regression of school letter grade on four demographic variables was
attempted, and it was found that a single variable accounted for the majority of the
variability in letter grade. Due to multicollinearity amongst the four variables examined
and the lack of statistical significance in some of the variables, free and reduced lunch
percentage was left to serve as the sole predictor of school letter grade throughout the
district. Once again, a large percentage of the variability in school letter grade was
accounted for by the variability in free and reduced lunch percentage. This finding
reinforces the theme that a school label is strongly associated with certain demographic
variables present at a school.
The comprehensive findings from the study of this suburban school district
resulted in two major themes. One theme was that schools in this district, with
exceptional and non-exceptional labels issued by ADE, still had significant work to be
done in closing the academic achievement gap between ethnicities. The second theme
was that a school label within this district is highly associated with certain demographic
variables. Combining both of these themes results in a better understanding of the
relationship between ADE issued school labels and the ability of a school to accomplish
the mandate set forth in NCLB of closing the achievement gap between ethnic subgroups.
Findings Related to the Literature
The findings of this research align well with previous research in the field of
education. First, the persistent achievement gap found at schools throughout the district
echoes the achievement gap seen throughout the nation on the NAEP exam. Second, the
strong relationship between school labels and the demographic variables analyzed in the
regression analyses is consistent with Berliner’s analysis of out-of-school factors. The
two main themes prevalent in the research findings in this dissertation readily link to
other research in the educational field.
Closing the persistent achievement gap eludes the nation as a whole, and it continues
to elude the majority of the schools in the examined district. The Center on Education
Policy (2009) showed through NAEP results that annually recurring achievement
gaps continue to confront our nation’s educators. The research performed on this district
in the southwest United States found a persistent achievement gap, consistent with the
2009 Center on Education Policy release. Furthermore,
Kober, Chudowsky & Chudowsky (2010) concluded that Hispanic-White gaps in
achievement in reading and mathematics continue to plague educators. This finding held
true for the school district studied. The majority of schools
continued to show significant gaps in state performance testing between Hispanic and
White students in reading and mathematics. The findings with respect to the achievement
gap in this suburban school district are consistent with previous research.
Out-of-school factors (OSFs), studied by Berliner (2009), that tend to appear in
higher-poverty areas impact educational attainment. The results from this study indicate
that holistic school-wide demographic indicators that would link to OSFs are highly
correlated with measurements of school achievement. Birenbaum and Nasser (2006),
Zuzovsky (2008) and Berliner (2009) all conclude that achievement gaps among
subgroups within a population can be linked to the impact of poverty. In this study, two
findings show similar results.
First, the correlation between the state issued z-score and the Asian/White
demographic makeup of a school reiterates these previous research findings. The variable
of percentage of Asian/White students at each school within the district is a holistic
summary variable that is linked to poverty. DeNavas-Walt, Proctor and Smith (2010)
established that Asian and White people in the United States are approximately half as
likely to live in poverty as Hispanic and Black people. Consequently, the variable of
percentage of Asian/White students at a school in effect summarizes poverty and
out-of-school factors. Therefore, the research finding that the percentage of Asian/White
students at a school is highly correlated with the state issued z-score, a standardized
measurement of the proportion of students that exceed the standards on the AIMS test,
is directly in line with the findings of poverty’s relationship with educational
outcomes.
Second, the correlation between free and reduced lunch rate percentages at a
school and the ADE issued numerical value for school letter grade is directly in line
with Berliner (2009) and other researchers. The findings in this school district suggest
that schools in higher poverty areas receive lower ADE letter grades. Similarly, schools
in wealthier areas receive higher ADE letter grades. The relationship between the chosen
measurement of poverty, free and reduced lunch rates, and ADE letter grades was shown
to be strong. This finding is directly in line with Berliner’s implication that out-of-school
factors caused by the impacts of poverty can have school-level effects on educational
achievement.
The findings of previous research in the educational field and the findings in this
study link together well. Berliner’s (2009) research on the ill-effects of poverty in
conjunction with the Center on Education Policy (2009) research on NAEP results and
the persistent achievement gap show that the findings in this district are not outside
expectation. The relationship between current research and the research in this district in
the southwest United States aids in strengthening the findings submitted in this
dissertation.
Divergent Findings
The study had some findings that were not intended to be examined but became
apparent when examining the data. The most surprising result came when performing
the multiple linear regression for the fourth research question. Multicollinearity
between the four variables, selected prior to the study to account for school letter
grade, prohibited the use of multiple linear regression. High inter-correlations
between the variables required the removal of some of the variables. Furthermore, a
statistically nonsignificant variable caused the multiple linear regression to be revised to a
single-variable linear regression. The sole variable that remained accounted for a large
proportion of the variance in school letter grade throughout the district. Interestingly, the
variable that remained was free and reduced lunch percentage. This variable is simply a
measurement of poverty at a school. Consequently, the relationship between school letter
grade and free and reduced lunch percentage throughout the school district was found to
be of critical importance when interpreting school-level letter grades.
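One common way to screen for the kind of inter-correlation described above is to compute
the pairwise Pearson correlation between candidate predictors and flag any pair whose
correlation is large in magnitude. The sketch below assumes a cutoff of 0.8, invented
values, and hypothetical variable names (only free and reduced lunch percentage and
percentage of White/Asian students are named in this chapter; the third name is a
placeholder). It is an illustration of the idea, not the diagnostic used in this study.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.8):
    """Return (name_a, name_b, r) for predictor pairs with |r| >= threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 3)))
    return flagged


# Hypothetical, made-up school-level values for illustration only.
predictors = {
    "pct_free_reduced_lunch": [62.0, 55.0, 41.0, 33.0, 20.0, 12.0],
    "pct_white_asian": [38.0, 45.0, 57.0, 66.0, 79.0, 88.0],
    "pct_placeholder_variable": [21.0, 15.0, 14.0, 8.0, 5.0, 3.0],
}

flagged = collinear_pairs(predictors)
for name_a, name_b, r in flagged:
    print(f"{name_a} vs {name_b}: r = {r}")
```

When every pair is flagged, as in this invented data, the predictors carry largely the
same information, and retaining a single predictor is one defensible response.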
Heterogeneity of variances across ethnic subgroups in a small proportion of
schools was another unexpected divergent finding. When using inferential
statistics to examine average scale score across the four ethnic subgroups, 11 schools
during the 2009-2010 school year and 6 schools during the 2010-2011 school year
exhibited heterogeneity of variances. Most typically, the heterogeneity of variances
could be attributed to the large ratio observed between the largest sample size and
smallest sample size of the four ethnic subgroups. More concisely stated, the
heterogeneity of variances was primarily seen in schools that were more ethnically
homogeneous. This violation of an ANOVA condition brought to the forefront the
idea that the achievement gap becomes more difficult to analyze as subgroup sample
sizes get smaller. For this precise reason, many states have implemented in the NCLB
adequate yearly progress measurement a minimum sample size condition that allows
schools with small samples in certain subgroup categories to not be measured for
accountability.
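The two quantities discussed above, the ratio of largest to smallest subgroup size and
the spread of subgroup variances, can be summarized descriptively before any formal test
is run. The sketch below is a simple screening aid under stated assumptions: the scale
scores are invented, the subgroup names are placeholders, and the ratio of largest to
smallest sample variance is used only as a rough indicator. It is not the test applied in
this study.

```python
from statistics import variance


def spread_summary(groups):
    """Summarize imbalance in subgroup sizes and sample variances.

    groups maps a subgroup name to a list of scale scores (at least two each).
    """
    sizes = {name: len(scores) for name, scores in groups.items()}
    variances = {name: variance(scores) for name, scores in groups.items()}
    return {
        "size_ratio": max(sizes.values()) / min(sizes.values()),
        "variance_ratio": max(variances.values()) / min(variances.values()),
    }


# Hypothetical, made-up scale scores for one school (illustration only).
groups = {
    "subgroup_large": [480, 495, 502, 510, 515, 521, 528, 534, 540, 551, 560, 575],
    "subgroup_small": [430, 505, 590],
}

summary = spread_summary(groups)
print(f"largest/smallest subgroup size: {summary['size_ratio']:.1f}")
print(f"largest/smallest sample variance: {summary['variance_ratio']:.1f}")
```

A large size ratio combined with a large variance ratio is the situation in which the
equal-variance condition of ANOVA deserves the closest scrutiny.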
In analyzing the descriptive statistics for the proficiency percentage between
ethnic subgroups, more elementary schools than secondary schools showed signs of
closing the achievement gap. This finding has several possible explanations.
It could be that as students progress from elementary to junior high to
high school the subgroup sample size increases, reducing variability, and thus achievement
gaps become more readily noticeable. Another explanation could be that students of
different ethnic backgrounds within this district diverge in academic
performance as they get older. The academic growth rates of different ethnic subgroups
might be unequal, which would then result in smaller achievement gaps in elementary schools
and larger achievement gaps in high schools. Finally, it could be that high schools
encompass larger boundaries and are more likely to encounter diverse socioeconomic
statuses among their students. Elementary schools’ socioeconomic status could thus be
more homogeneous, resulting in smaller achievement gaps in comparison to high schools.
The research provided in this study does not attempt to test any of these explanations.
However, it was observed that all of the high schools exhibited large achievement gaps between
ethnicities whereas not all elementary schools did the same.
Conclusions
The effect of poverty on the educational achievement of students is recognized
throughout educational research. The Coleman Report first examined the relationship
between family background and the perpetual achievement gap (Viadero,
2006, p. 1). Berliner (2009), Birenbaum and Nasser (2006) and Zuzovsky (2008) have all
provided more recent evidence of poverty’s interaction with educational achievement.
While the scope of this research was limited to a single school district in the southwest
United States, it should not be surprising that even in this district the impact of poverty
can be seen.
Schools in this district were shown to have higher school labels, issued by the
ADE, when they had lower free and reduced lunch rates and a higher proportion of
Asian/White students. Essentially, schools labeled effective by the ADE within this
district benefit from not being exposed to the very ill-effects that Berliner
suggests arise out of poverty. A school in this district cannot infer from its exemplary
status with ADE that it is supplying any better education to students than
its sister schools. Similarly, a school in this district that has a label, or grade, with
ADE that is below exemplary should not panic with the idea that it is doing a poorer
job of educating its students. ADE school grades and labels for schools in this district,
currently, should be viewed in the proper context of representing socioeconomic status
and not the quality of education in that school.
The quality of education of a school within this district should not come into
question as a result of school grade, and it must be clear that every school within the
district must continue to emphasize finding ways to close the achievement gap. Much
like the nation, schools throughout the district continue to struggle with closing the
persistent achievement gap for ethnic minorities. A school label, or grade, cannot be a
resting point where educators believe their work with respect to the achievement gap is
finished. Rather, as this research has shown, regardless of a school’s label it is extremely
likely that a school within this district continues to have Hispanic and Black
students who underperform their academic peers throughout their school. NCLB sought
accountability so that schools, districts and states could no longer ignore the silent
minorities in the educational process. Schools in this district must recognize the results
of this research, hear the cry of their minority students and implement research-based
programs for improving achievement amongst these groups at every school, including
those with a distinguished label from the ADE.
Implications for Action
Administrators, teachers and parents throughout the suburban school district need
to be aware of the relationships studied in this research. The research has implications
for every one of these groups throughout the district. The implications are both corrective
and cautionary for each of these stakeholders throughout the district.
School-level and district-level administrators must understand that while several
district schools, and the district itself, are viewed favorably throughout the state of
Arizona, there is still much to accomplish with respect to closing the achievement gap.
Furthermore, since receiving a distinguished label from the state of Arizona in this
district is highly related to some demographic variables, administrators must realize that a
high label does not necessarily suggest best practices are in place for students of all
ethnicities. The highly performing label might merely suggest that a school is in a
non-poverty area with a high proportion of White and Asian students. While the school may
have research-based best practices implemented, an ADE issued school label is not the
“effect-size” that should be used to determine success of school programs. All school
administrators throughout the district should be encouraged to look beyond the label to
measure, analyze and implement school programs that impact students of different
ethnicities in a multitude of ways.
Teachers throughout the district must not rest on the accomplishment of their
school being labeled highly by ADE. Teaching, educating and mentoring students of
different ethnicities is not best measured in a school label. As evidenced by this research,
teachers must understand that just because a school is an “A” or “excelling” it does not
necessarily mean that it provides an “A” education for Hispanic or Black students.
Teachers must continue to explore programs that can have a lasting effect on the
diverse populations that each school throughout the district serves.
Parents throughout the district, as a result of this research, need to continue to
become educated about what a school is doing to best serve the needs of their
individual child. School-level labels might provide necessary summary level data that is
easy to report in newspapers. However, the level of analysis that these labels provide to a
Black or Hispanic parent in an upper-class neighborhood in this district is extremely limited.
Perpetual achievement gaps in schools with high labels suggest that minority parents in
the district need to continue to pressure schools to serve the needs of their children and
promote the idea of education amongst all subgroups. The achievement gaps existent in
the majority of schools throughout the district amongst ethnic subgroups call for parents
of minority children to become educated about school labels, involved in the educational
system, and active in ensuring their students are exposed to the level of education
demanded in NCLB.
All stakeholders must be cautious in consuming the ADE issued school labels.
Specifically, stakeholders must be careful in interpreting what a school label means
for an individual child and particularly a child from an ethnic minority subgroup. Much like
a business can hide discrepancies in salary between males and females by reporting an
overall average salary, a school-wide label can disguise discrepancies in educational
performance between ethnicities by reporting summary level achievement data.
Stakeholders throughout the school district must be aware that school labels summarize
school performance but they do not analyze school performance.
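The masking effect of a summary figure can be shown with simple arithmetic. The sketch
below uses invented counts and placeholder subgroup names; it is a hypothetical
illustration of how a school-wide proficiency rate can look strong while a smaller
subgroup lags well behind, not a calculation drawn from this district's data.

```python
def proficiency_summary(counts):
    """Return (overall rate, rate by subgroup, largest gap between subgroups).

    counts maps a subgroup name to a tuple (number proficient, number tested).
    """
    total_proficient = sum(p for p, _ in counts.values())
    total_tested = sum(n for _, n in counts.values())
    by_group = {name: p / n for name, (p, n) in counts.items()}
    gap = max(by_group.values()) - min(by_group.values())
    return total_proficient / total_tested, by_group, gap


# Hypothetical, made-up counts for a single school (illustration only).
counts = {
    "larger_subgroup": (270, 300),   # 90% proficient
    "smaller_subgroup": (45, 100),   # 45% proficient
}

overall, by_group, gap = proficiency_summary(counts)
print(f"school-wide proficiency: {overall:.1%}")
for name, rate in by_group.items():
    print(f"  {name}: {rate:.1%}")
print(f"largest subgroup gap: {gap:.1%}")
```

Because the larger subgroup dominates the total, the school-wide rate sits close to that
subgroup's rate, and only the disaggregated rates reveal the gap.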
Finally, at the national level politicians and other stakeholders must realize that
rhetoric without action is little more than a social pacifier. The research findings for the
district analyzed in this dissertation should bear witness to the idea that closing the
achievement gap is going to take a much more concentrated effort than NCLB. In 2001
politicians spoke of NCLB as the Civil Rights Act of the 21st century. NCLB was an
act that was going to ensure the civil liberty of equal education to all subgroups.
Certainly, the state-mandated testing and nationally-mandated accountability systems
called for in NCLB, with the purpose of eradicating the achievement gap, have had very
little effect on the elusive achievement gap in the district analyzed in this research.
Furthermore, national-level research would suggest similar findings. Hopefully this
research helps in establishing the need to reexamine testing systems, accountability
systems and the very metrics used in ranking schools throughout this nation.
Recommendations for Further Research
The research performed in this study was limited in its ability to generalize to
other districts, states and the nation. Further research based on this study should, first
and foremost, be extended to other districts. Examining the implications of school
labels in relationship to the academic achievement gaps amongst ethnicities yields important
and valuable information that no district should ignore. NCLB mandated that districts
throughout the nation continue to seek ways to close the disparate achievement gap
between ethnic subgroups. A school or district that continues to ignore issues with an
achievement gap because the state has deemed it exemplary is like a business that
shows a profit but could have profited more by analyzing its costs.
Within the given district in which this data was analyzed it is recommended that
analysis of programs intended to close the achievement gap amongst ethnicities be
conducted. With the understanding that the achievement gap in this district remains
persistent, specialized programs that were designed to address the underperforming
ethnic subgroups must be analyzed to evaluate their impact on student achievement.
Furthermore, a comprehensive study analyzing the effect size of research-based programs
throughout the United States that impact achievement gaps at the school level
should be conducted.
The district analyzed would also be served by further analyzing the achievement
gap by ethnicity for all students that live in poverty. For example, a future research
project should include analyzing the performance at a given school for Black, Asian, White
and Hispanic students who are all currently on free and reduced lunch. Examining the
achievement gap in this manner could provide further evidence of whether there is truly an
ethnic achievement gap at these schools even when socioeconomic status is held
constant. An academic performance study across ethnicities for all students qualifying for
free and reduced lunch would further the research performed in this dissertation.
Further research is also recommended at the state level. As the state continues to
modify how it ranks and labels schools, it should continue to
analyze exactly what its labeling system measures. John Tukey, one of the foremost
statisticians of the 20th century, has been quoted as saying, “It is better to have an
approximate answer to the right question than a precise answer to the wrong question”
(Brainy Quote, 2011). The state of Arizona must constantly analyze its school labeling
system with Tukey’s insight at the forefront. Producing school labels that
are precise in terms of statistical measurement but enable schools to be labeled solely on
their demographic representation is misleading. Analyzing on a yearly basis whether school
demographics are directly related to school labels helps ensure that the state
is constantly updating its labeling system.
Finally, as the state of Arizona expands its abilities to measure different variables
prevalent in the school system a review of the multiple regression for school level grades
should be encouraged. The variables currently available through ADE, as shown in this
research, tend to be inter-correlated. As ADE expands its abilities to collect useful
information on students, schools and districts a review of the multiple linear regression in
relationship to school grade would be warranted.
Concluding Remarks
NCLB intended to implement accountability systems for educational
institutions throughout the United States in an effort to reform education. In particular,
the biggest reform sought was the closing of the achievement gap between ethnic
subgroups. A variety of accountability systems have since been implemented across the
United States. Unfortunately, too many states use their accountability measurements as a
final judgment for the academic quality of a school. Linn (2008) stated that:
Accountability system results can have value without making causal inferences
about school quality, solely from the results of student achievement measures and
demographic characteristics. Treating the results as descriptive information and
for identification of schools that require more intensive investigation of
organizational and instructional process characteristics are potentially of
considerable value. Rather than using the results of the accountability system as
the sole determiner of sanctions for schools, they could be used to flag school that
need more intensive investigation to reach sound conclusions about needed
improvement or judgments about quality. (p. 21)
The insights from Linn’s research on accountability systems are valuable. Unfortunately,
with respect to the single school district studied in this research, the accountability and
school labeling system used by ADE has been found to warrant “flags” even for schools that
receive the best of labels. Recurring achievement gaps at excelling and highly
performing schools in the suburban school district studied suggest that schools with the
highest of labels might need to be flagged as well, not because these schools are
underperforming with respect to all student achievement but rather because they have
certain ethnic subgroups that are underperforming in comparison with their peer groups.
The ability to measure the impact of school-level and teacher-level factors
continues to be an inexact science. The multivariate environment that children are
exposed to on a daily basis throughout the year prohibits accountability models from being
perfect. George Box (1979), a 20th century statistician, reminds educators that, “All
models are wrong but some are useful.” Certainly, every accountability model in
education thus far has been wrong and educators should continue to establish what to
measure and how to measure it. As the models become more useful one must continue to
understand that most accountability models provide school-wide and district-wide
summary data. As this research has shown, for the suburban district analyzed, the very
summary level data used to hold schools and districts accountable can be misleading
when not analyzed in a disaggregated fashion. The inexact science of accountability
measurement, while demanded by a variety of stakeholders, needs to be cautiously
viewed because, as Box suggests, every model is inherently flawed but each model has
useful information when digested properly.
REFERENCES
Amrein, A. L., & Berliner, D. C. (2002). High-stakes testing, uncertainty, and student
learning. Education Policy Analysis Archives, 10(18).
Arizona Department of Education (ADE) (2008). Zscore. Retrieved from
https://www.azed.gov/azlearns/AZLEARNSTechnicalManual2008.pdf (pg 21).
Arizona Department of Education (ADE) (2010). Arizona’s school accountability system.
Retrieved from http://www.azed.gov/research-
evaluation/files/2011/09/2010azlearns-technical-manual.pdf (pg 23).
Arizona Department of Education (ADE) (2011). Arizona October 1st Enrollment
Figures. Retrieved from http://www.ade.az.gov/researchpolicy/AZEnroll
Barton, P. E., & Coley, R. J. (2010, July). The black-white achievement gap. When
progress stopped. Princeton, NJ: Policy Evaluation and Research Center.
Beecher, M., & Sweeny, S. M. (2008). Closing the achievement gap with curriculum
enrichment and differentiation: One school's story. Journal of Advanced
Academics, 19(3), 520-530.
Bell, T. C. (1983). A nation at risk. Retrieved from
http://www2.ed.gov/pubs/NatAtRisk/risk.html
Berliner, D. C. (2009). Are teachers responsible for low achievement by poor students?
Kappa Delta Pi Record, (46), 18-21. Retrieved from
http://greatlakescenter.org/docs/Policy_Briefs/Berliner_NonSchool.pdf
Birenbaum, M., & Nasser, F. (2006). Ethnic and gender differences in mathematics
achievement and in dispositions towards the study of mathematics. Learning and
Instruction, 16, 26-40.
Bishop, J. H. (1997). What should high-school graduates know in economics? The effect
of national standards and curriculum-based exams on achievement. The American
Economic Review, 87(2), 260-264.
Black, J. A. & Champion, D. J. (1976). Methods and issues in social research. New
York, NY: Holt.
Borman, G., & Dowling, M. (2010). Teachers College Record, 112(5), 1201-1246.
Box, G. (1979, May). Robustness in the strategy of scientific model building. In R. L.
Launer and G. N. Wilkinson (Eds.) Robustness in statistics: Proceedings of a
workshop. Salt Lake City, UT: Academic Press.
Brainy Quote. (2011). Retrieved from
http://www.brainyquote.com/words/ap/approximate131762.html
Bruce, M. (2011, September 23). Obama: ‘No child left behind’ changes will allow states
to meet higher standards. Retrieved from,
http://abcnews.go.com/blogs/politics/2011/09/obama-no-child-left-behind-
changes-will-allow-states-to-meet-higher-standards/
Burch, P., Theoharis, G., & Rauscher, E. (2010). Class size reduction in practice
investigating the influence of the elementary school principal. Educational Policy,
24(2), 330-358.
Carnoy, M., & Loeb, S. (2002). Does external accountability affect student outcomes? A
cross-state analysis. Educational Evaluation and Policy Analysis, 24(4), 305-331.
Center on Education Policy (CEP) (2009, October). Are achievement gaps closing and is
achievement rising for all? State test score trends through 2007-08, part 3.
Clerk of the House of Representatives. (2001). Final votes for roll call 145.
Coleman, J. S. (1966). Equality of educational opportunity. Washington, DC: U.S.
Department of Health, Education, and Welfare.
Costrell, R. M. (1997). Can centralized educational standards raise welfare? Journal of
Public Economics, 65, 271-293.
Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.
Cronin, J., Kingsbury, G. C., McCall, M. S., & Bowe, B. (2005). The impact of the no
child left behind act on student achievement and growth: 2005 edition (Technical
Report). Lake Oswego, OR: Northwest Evaluation Association.
Curren, P. J. & Werth, R. J. (2004). Interindividual differences in intraindividual
variation: Balancing internal and external validity. Measurement, 2(4), 219-247.
DeNavas-Walt, C., Proctor, B. D., & Smith, J. C. (2010, September). Income, poverty,
and health insurance coverage in the United States: 2009. US Department of
Commerce. Retrieved from, http://www.census.gov/prod/2010pubs/p60-238.pdf
Diamond, J. B. (2006). Still separate and unequal: Examining race, opportunity and
school achievement in "integrated" suburbs. Journal of Negro Education, 75(3),
495-505.
Education Week. September 10, 2004. Achievement Gap. Retrieved from
http://www.edweek.org/ew/issues/achievement-gap/
Glass, G. & Hopkins, K. (1984). Statistical methods in education and psychology.
Englewood Cliffs, NJ: Prentice Hall.
Glass, G., Peckham, P., & Sander, J. (1972). Consequences of failure to meet
assumptions underlying the fixed effects analyses of variance and covariance.
Review of Educational Research, 42, 237-288.
Goldschmidt, P. (2004, October). Models for school accountability and program
evaluation. Presentation at the Reidy Interactive Lecture Series: Incorporating
measures of student growth into state accountability systems, Nashua, NH.
Retrieved from http://www.nciea.org/publications/RILS_PG04.pdf
Gould, S. J. (1981). The mismeasure of man: The definitive refutation to the argument of
the bell curve. New York: Norton.
Hammersley, M. (1987). Some notes on the terms ‘validity’ and ‘reliability.’ British
Educational Research Journal, 13(1), 73-81.
Hanushek, E. A., & Raymond, M. E. (2003a). In Paul E. Peterson and Martin R. West
(Eds.), Lessons about the design of state accountability systems. Brookings.
Hanushek, E. A., & Raymond, M. E. (2003b). Improving educational quality: How best
to evaluate our schools? Education in the 21st century: Meeting the challenges of
a changing world.
Hanushek, E. A., & Raymond, M. E. (2004). The effect of school accountability systems
on the level and distribution of student achievement. Journal of the European
Economic Association, 2(2), 406-415.
Harlin, R. (2009). The impact of teachers' expectations on diverse learners' academic
outcomes. Childhood Education, 85(4), 253-256.
Hauser, P. M., McMurrin, S. M., Nabrit, J. M., Nelson, L. W., & Odell, W. R. (1964).
Integration of the public schools–Chicago, p.20-21. Chicago, IL: Board of
Education, Chicago Public Schools.
Hess, F. M., & Petrilli, M. J. (2006). No child left behind. New York, NY: Peter Lang
Publishing.
Imber, M., & Van Geel, T. (2004). Educational law (3rd ed.). Mahwah, NJ: Lawrence
Erlbaum Associates, Publishers.
Jacob, B. A. (2001). Getting tough. The impact of high school graduation exams.
Educational Evaluation and Policy Analysis, 23(2), 99-121.
Kerlinger, F. N. (1964). Foundations of behavioral research. New York, NY: Holt.
Kiley, K. (2010, October 10). Arizona to change how it evaluates schools. Arizona
Republic.
Kober, N., Chudowsky, N., & Chudowsky, V. (2010, December). Slow and uneven progress
in narrowing gaps. State test score trends through 2008-09, part 2. Center on
Education Policy.
Kopan, A., & Walberg, H. (1974). Rethinking educational equality. Berkeley, CA:
McCutchan Publishing Corporation.
Leithwood, K. (2010). Characteristics of school districts that are exceptionally effective
in closing the achievement gap. Leadership and Policy in Schools, 9(3), 245-291.
Levine, T. H., & Marcus, A. S. (2007). Closing the achievement gap through teacher
collaboration: Facilitating multiple trajectories of teacher learning. Journal of
Advanced Academics, 19(1), 116-138.
Liew, J., Chen, Q., & Hughes, J. N. (2010). Childhood effortful control, teacher-student
relationships, and achievement in academically at-risk children: Additive and
interactive effects. Early Childhood Research Quarterly, 25(1), 51-64.
Limprianou, J., & Athanasou, J. A. (2009). A teacher’s guide to educational
assessment (Rev. ed.). Rotterdam/Boston/Taipei: SensePublishers.
Retrieved from https://www.sensepublishers.com/files/9789087909147PR.pdf
Linn, R. L. (2006). Educational accountability systems. CSE technical report 687.
Technical Report No. ED492875. National Center for Research on Evaluation,
Standards and Student Testing.
Linn, R.L. (2008). Educational accountability systems. In K. E. Ryan and L. A. Shepard,
(Eds.), The Future of Test-Based Accountability, pp. 3-24. New York: Routledge.
Linn, R. L., & Miller, D. M. (2005). Measurement and assessment in teaching. Upper
Saddle River, NJ: Pearson Education, Inc.
Loesch, P. C. (2010). 4 core strategies for implementing change. Leadership, 39(5), 28-
31.
Marshall, K. (2009). A how-to plan for widening the gap. Phi Delta Kappan, 90(9), 650-
655.
McKown, C., & Weinstein, R. S. (2008). Teacher expectations, classroom context, and
the achievement gap. Journal of School Psychology, 46(3), 235-261.
Mertler, C. & Vannatta, R. (2005). Advanced and multivariate statistical methods (3rd
ed.). Glendale, CA: Pyrczak Publishing.
Myers, R. (1990). Classical and modern regression with applications (2nd ed.). Boston,
MA: Duxbury Press.
National Assessment of Educational Progress (NAEP) (2009). The Nation’s report card.
Grade 12 reading and mathematics National and pilot state results. US
Department of Education. Retrieved from
http://nces.ed.gov/nationsreportcard/about/naephistory.asp
Nichols, S. L., & Berliner, D. C. (2007). Collateral damage: How high-stakes testing
corrupts America’s schools. Cambridge, MA: Harvard Education Press.
Orfield, G. (2006). Foreword to J. Lee, Tracking achievement gaps and assessing the
impact of NCLB on the gaps: An in-depth look into national and state reading and
math outcome trends. Cambridge, MA: The Civil Rights Project at Harvard
University. Retrieved from www.agi.harvard.edu/Search/download.php?id=84
Orlich, D. C. (2004). No child left behind: An illogical accountability model. The
Clearing House, 78(1), 6-11.
Peterson, P. E., & West, M. R. (Eds.). (2003). No child left behind? The politics and
practice of accountability. Washington, DC: Brookings Institution Press.
Popham, J. W. (2004). America's "failing" schools: How parents and teachers can cope
with no child left behind. New York, NY: Routledge Falmer.
Reese, W. J. (2005). America's public schools: From the common school to "no child left
behind." Baltimore, MD: The Johns Hopkins University Press.
Resnick, L. B. & Resnick, D. P. (1985). Standards, curriculum, and performance: A
historical and comparative perspective. Educational Researcher, 16(9), 13-20.
Rothman, R. (1995). Measuring up: Standards, assessment and school reform. San
Francisco, CA: Jossey-Bass Publishers.
Rury, J. L. (2002). Educational and social change: Themes in the history of American
schooling. Mahwah, NJ: Lawrence Erlbaum Associates.
Smith, E. (2005). Raising standards in American schools: The case of no child left
behind. Journal of Education Policy, 20(4), 507-524.
Smith, M. (2007). Leaving NCLB renewal behind. Retrieved from
http://www.educationevolving.org/pdf/mikesmithoped.pdf
Steel, T. D. (2009). Closing the achievement gap: What can be done? Unpublished
Doctoral Dissertation (3355438), University of Southern California, Ann Arbor,
MI.
Stevens, J. (2007). Intermediate statistics (3rd ed.). New York, NY: Lawrence Erlbaum
Associates.
Tabachnick, B.G. & Fidell, L.S. (1996). Using multivariate statistics (3rd ed.). New
York: Harper Collins.
Tidman, P., & Kahane, H. (2003). Logic and philosophy: A modern introduction (9th
ed.). Belmont, CA: Wadsworth/Thomson Learning.
Trochim, W. M. (2006). The research methods knowledge base (2nd ed.). Retrieved
from: http://www.socialresearchmethods.net/kb/intval.php
U.S. Department of Education. (2005, July 14). Spellings hails new national report card
results: Today's news “proof that No Child Left Behind is working.” Press
release. Retrieved from
http://www.ed.gov/news/pressreleases/2005/07/07142005.html
U.S. Department of Education. (2009, January). Great expectations. Holding ourselves
and our schools accountable for results. Office of the Secretary.
U.S. Senate roll call votes 107th congress - 1st session (2001).
Viadero, D. (2006, June 21). Race report’s influence felt 40 years later. Legacy of
Coleman study was new view of equity. Education Week, 25(41).
Walker, G. (1963, July 6). Englewood and the northern dilemma. The Nation, 197, 7-10.
Webb, L. D. (2006). The history of American education: A great American experiment.
Upper Saddle River, NJ: Pearson Education, Inc.
Wei, X. (2008). Accountability stringency, incentives and student performance. Doctoral
Dissertation, Stanford University.
Zhang, Y., & Zhang, L. (2002, April 1-5). The applicability of selected regression and
hierarchical linear models to the estimation of school and teacher effects, 1-19.
New Orleans, LA.
Zimmerman, B. J., & Schunk, D. H. (2003). Educational psychology: A century of
contributions. Lawrence Erlbaum Associates. ISBN 0805836829. Retrieved from
http://books.google.com/books?id=bqo5A2nBwHYC&pg=PA37
Zuzovsky, R. (2008). Closing achievement gaps between Hebrew-speaking and
Arabic-speaking students in Israel: Findings from TIMSS-2003. Studies in
Educational Evaluation, 34, 105-117.
APPENDIX A
CUSD IRB APPROVAL
BIOGRAPHICAL INFORMATION
Matt Strom was born on May 9, 1977, in Boone, Iowa. Raised in a family that always stressed education, Matt graduated from Mountain Pointe High School in 1995 and from Arizona State University in 1998, and received his graduate degree from Northern Arizona University in 2002. While earning his graduate degree he married Marcia Jones. After Matt married Marcia, the good Lord blessed the Strom family with three children: Zavian, Quentin and Elijah. Matt has worked in a variety of roles during his 14-year educational career. As a 21-year-old teacher, Matt began teaching mathematics at Mesquite High School. As he was about to turn 25, Matt was hired as the varsity boys' basketball coach at a 5A school, making him the youngest active large-school varsity boys' coach in the state of Arizona. Matt has served in several other roles throughout his educational career, including mathematics department chair, AVID teacher and head varsity golf coach. Currently, Matt serves as the research analyst to the superintendent of his district. Matt's constant desire to learn has led him to take part in many learning experiences since completing his graduate degree. He was a PLC leader for the Project Pathways STEM project out of ASU. He has attended numerous educational workshops, including AVID training and NCTM conferences. Furthermore, in an effort to reconnect with the mathematics classes from his undergraduate degree, Matt studied for and passed Exam P, Probability for Risk Management, the first exam in the Society of Actuaries exam process. Matt, like the majority of his fellow educators, has a thirst for knowledge, and as a result he started the doctoral process in the summer of 2008 through Northern Arizona University. Upon completion of his degree Matt hopes to continue to grow in the educational field.
He strongly desires to earn his superintendent’s certificate in order to gain employment in district-level educational administration at the K-12 level. In keeping with the themes of this dissertation, Matt hopes to bear witness to the day when ethnicity is not a determining factor in the quality of education that a child receives.