
AN EXAMINATION OF THE ACHIEVEMENT GAP AND SCHOOL LABELS IN A

SOUTHWEST SUBURBAN DISTRICT IN THE UNITED STATES

By Matthew D. Strom

A Dissertation

Submitted in Partial Fulfillment

of the Requirements for the Degree of

Doctor of Education

in Educational Leadership

Northern Arizona University

December 2011

Approved:

Richard L. Wiggall, Ed.D., Chair

Walter J. Delecki, Ph.D.

Gary Emanuel, Doctor of Arts

George Montopoli, Ph.D.


ABSTRACT

AN EXAMINATION OF THE ACHIEVEMENT GAP AND SCHOOL LABELS IN A

SOUTHWEST SUBURBAN DISTRICT IN THE UNITED STATES

MATTHEW D. STROM

School labeling, or ranking, has become commonplace in the NCLB era of

school accountability. Most states have implemented a system that enables the public to

compare school to school and district to district. Labeling systems were intended by

NCLB to measure the effectiveness of a school and the ability of a school to ensure an equal

education for subgroups throughout its population. NCLB was a “call to arms” to

address the epidemic of lagging student achievement in minority subgroup populations

throughout the United States. Schools that did not leave any children behind were

intended to be recognized as superior to the rest. Ten years later, research is muddled on

the effects of NCLB with respect to the very achievement gap it sought to address.

School ranking systems throughout the United States are being examined on how well

they identify schools that have met the requirements of NCLB. The primary requirement

of NCLB is for a school to close the achievement gap. Within this study you will find an

examination of the achievement gap in a suburban school district within the state of

Arizona and consequently an examination of the labels attached to this district’s schools

by the Arizona Department of Education. The findings for the research can be

summarized under two major themes. One theme was that a wide majority of schools in

this district, with exceptional and non-exceptional labels issued by ADE, still had

significant work to be done in closing the academic achievement gap between ethnicities.


The second theme was that a school label within this district is highly associated with

certain demographic variables. Combining both of these themes results in a better

understanding of the relationship between ADE-issued school labels and the ability of a

school to accomplish the mandate set forth in NCLB of closing the achievement gap

between ethnic subgroups. Administrators, teachers and parents throughout the suburban

school district need to be aware of the relationships studied in this research. School-level

and district-level administrators throughout the district must understand that while several

schools in the district, and the district itself, are viewed favorably throughout the state of

Arizona, there is still much to accomplish with respect to closing the achievement gap.

Teachers throughout the district must not rest on the accomplishment of their school

being labeled highly by ADE. Teaching, educating and mentoring students of different

ethnicities is not best measured in a school label. Minority parents throughout the

district, as a result of this research, need to continue to become educated about what a

school is doing to best serve the needs of their individual child. Minority parents must

understand that this is the case whether their child is attending a school with an excelling

label or an underperforming label. All stakeholders within this district must be cautious

in consuming the ADE-issued school labels. Specifically, the stakeholders must be

careful in interpreting what a school label means for an individual child and in particular

an ethnically diverse individual child.


ACKNOWLEDGEMENTS

My grandfather, Herman Strom, fought in World War II to earn the GI Bill so he

could get his college degree. I would like to express my gratitude to my family

members who have sacrificed in their lives so that I would be in a position to accomplish what I

have done in mine. These family members include: Herman Strom, Madeline Strom,

Harold Davis, Letha Davis, Andy Strom, Megan Strom and Betsy Jenkins. Most

importantly, I would like to express my thanks to my parents, Larry and Kathy Strom.

I am very grateful to have had a strong committee for this dissertation. Dr. Ric

Wiggall, my chair, served as a critical voice who stayed positive throughout the constant revision

process. Dr. George Montopoli was a source of knowledge in the statistical analysis

relevant to my ideas. Dr. Walter Delecki and Dr. Gary Emanuel provided feedback and

suggestions to ensure the quality of my dissertation. I also wish to express my gratitude

to Dr. Edie Hartin whose expertise in writing ensured a smooth dissertation process.

I also had a group of colleagues who helped me cope with the daily reality of

being employed full time while enrolled in a doctoral program. Whether it was a

round of golf, a game of cards or a lunchtime venting session, the colleagues

and friends to whom I owe thanks for preserving my personal sanity include Darin

Lawton, Matthew Barber and Sean Casey. Many thanks to all of the people

mentioned above, as this dissertation would not have been started or finished without you.

This is truly a shared accomplishment.


TABLE OF CONTENTS

CHAPTER PAGE

1 Overview................................................................................................................ 1

Introduction...................................................................................................... 1

Berliner, School Accountability and the Achievement Gap............................ 4

The Contraposition of Berliner ........................................................................ 7

Statement of the Problem................................................................................. 9

Purpose of Study .............................................................................................. 9

Research Questions........................................................................................ 10

Significance of the Study ............................................................................... 11

Delimitations.................................................................................................. 13

Limitations ..................................................................................................... 14

Definition of Terms........................................................................................ 15

Organization of the Study .............................................................................. 16

Summary ........................................................................................................ 17

2 Review of the Literature ...................................................................................... 18

Introduction.................................................................................................... 18

History of Assessment ................................................................................... 18

History of Achievement Gap ......................................................................... 22

Historical Background of Equity in Education.............................................. 25

No Child Left Behind..................................................................................... 29


Studies and Reports Regarding the Trends in the Achievement Gap since NCLB............................................................................................ 34

Summary ........................................................................................................ 39

3 Methodology........................................................................................................ 41

Introduction.................................................................................................... 41

Restatement of the Problem........................................................................... 41

Restatement of Research Questions............................................................... 42

Research Design............................................................................................. 43

Target Population........................................................................................... 43

Sample............................................................................................................ 43

Sampling Procedures ..................................................................................... 44

Data Collection Procedures............................................................................ 45

Data Analysis ................................................................................................. 46

Validity .................................................. 47

External Validity .................................................. 48

Internal Validity .................................................. 48

4 Findings and Results .................................................. 50

Introduction .................................................. 50

Analysis of the Achievement Gap Using AIMS Proficiency Percentage .................................................. 51

2010 AIMS Summary – Overall .................................................. 51

2010 AIMS Summary – By Subject .................................................. 58

2011 AIMS Summary – Overall .................................................. 66

2011 AIMS Summary – By Subject .................................................. 73

Summary of AIMS Proficiency Data from Spring 2010 and Spring 2011.............................................................................................. 80


Analysis of the Achievement Gap Using AIMS Scale Score .................. 80

2010 and 2011 Achievement Gap Analyzed through Average Scale Score............................................................................................... 81

Summary of ANOVAs and Average Scale Score .................................................. 101

Ethnicity Proportion and Z-Score .................................................. 101

Correlation and Linear Regression between Ethnicity Proportions and Z-Score .................................................. 102

2010 Data .................................................. 103

2011 Data .................................................. 106

Summary of Linear Regression Analysis .................................................. 110

A-F Letter Grade and Demographic Data .................................................. 111

Regressing A-F Letter Grade Value onto Ethnicity Proportions, Free and Reduced Lunch Rate and ELL Proportions Using Multiple Linear Regression .................................................. 111

Summary of Relationship between School Level Variables and School Letter Grades .................................................. 120

Summary of Chapter 4 .................................................. 121

5 Conclusions, Summary, Implications, and Recommendations .................................................. 123

Summary of the Study .................................................. 123

Overview of the Problem .................................................. 124

Purpose Statement .................................................. 124

Research Methodology .................................................. 125

Major Findings Summary .................................................. 125

Research Question 1 .................................................. 126

Research Question 2 .................................................. 126

Research Question 3 .................................................. 127

Research Question 4 .................................................. 127

Major Findings Discussion .................................................. 127


Findings Related to the Literature .................................................. 130

Divergent Findings .................................................. 133

Conclusions .................................................. 135

Implications for Action .................................................. 136

Recommendations for Further Research .................................................. 139

Concluding Remarks .................................................. 140

REFERENCES .................................................. 143

APPENDICES

A CUSD IRB Approval ......................................................................................... 154

BIOGRAPHICAL INFORMATION............................................................................. 156


LIST OF TABLES

TABLE PAGE

1 ELL Percentages and Free and Reduced Lunch Percentages Comparison............44

2 Ethnic Comparison between Suburban School District and State of Arizona.......45

3 Spring AIMS 2010 Subgroup Performance by Ethnicity and Elementary School .................................................................................................52

4 Spring AIMS 2010 Subgroup Performance by Ethnicity and Junior High School ................................................................................................53

5 Spring AIMS 2010 Subgroup Performance by Ethnicity and High School ...........................................................................................................55

6 Spring AIMS 2010 Subgroup Proficiency Gaps Summary of All Schools ...........57

7 Spring AIMS 2010 Subgroup Proficiency Gaps Summary of Excelling Schools...................................................................................................58

8 Spring AIMS 2010 Elementary School #1 Performance by Subject and Ethnicity .............................................................................................59

9 Spring AIMS 2010 Junior High #4 Performance by Subject and Ethnicity .............................................................................................60

10 Spring AIMS 2010 High School #1 Performance by Subject and Ethnicity .............................................................................................62

11 Percent of All District Schools with Observed Gap in Mathematics, Reading and Writing by Ethnicity for 2010 Spring AIMS Administration...........63

12 Number of All District Excelling Schools with Observed Gap in Mathematics, Reading and Writing by Ethnicity for 2010 Spring AIMS Administration...........65

13 Spring AIMS 2011 Subgroup Performance by Ethnicity and Elementary School .................................................................................................67

14 Spring AIMS 2011 Subgroup Performance by Ethnicity and Junior High School ................................................................................................68


15 Spring AIMS 2011 Subgroup Performance by Ethnicity and High School ...........................................................................................................69

16 Spring AIMS 2011 Subgroup Proficiency Gaps Summary of All Schools ...........71

17 Spring AIMS 2011 Subgroup Proficiency Gaps Summary of Excelling Schools...................................................................................................72

18 Spring AIMS 2011 Elementary School #1 Performance by Subject and Ethnicity .............................................................................................74

19 Spring AIMS 2011 Junior High #4 Performance by Subject and Ethnicity .............................................................................................75

20 Spring AIMS 2011 High School #1 Performance by Subject and Ethnicity .............................................................................................76

21 Percent of All District Schools with Observed Gap in Mathematics, Reading and Writing by Ethnicity for 2011 Spring AIMS Administration...........77

22 Number of All District Excelling Schools with Observed Gap in Mathematics, Reading and Writing by Ethnicity for 2011 Spring AIMS Administration...........79

23 Average Scale Score throughout the District on 2010 AIMS Mathematics Administration .......................................................................................................82

24 Levene’s Test for Homogeneity of Variance P-Values for Each School across Ethnicities with Respect to Average Scale Score .......................................85

25 Kolmogorov-Smirnov (KS) P-Values for Normality for 2010 and 2011 AIMS Distributions by Ethnicity ..................................................................88

26 Results from the 2010 ANOVA for 29 District-Wide Schools that did not violate the Assumptions of the ANOVA ...............................................................91

27 Results from the 2011 ANOVA for 34 District-Wide Schools that did not violate the Assumptions of the ANOVA ...............................................................94

28 2010 Results for Tukey HSD – Post Hoc Tests.....................................................97

29 2011 Results for Tukey HSD – Post Hoc Tests .....................................................99


30 Linear Correlation Summary for 2010 Z-Score regressed on Percentage of Asian/Caucasian (ACP) Students at a School......................................................104

31 Coefficients and Standard Error of Coefficients for 2010 Z-Score Regressed onto Percentage of Asian/Caucasian Students at a School .................105

32 Linear Correlation Summary for 2011 Z-Score regressed on Percentage of Asian/Caucasian (ACP) Students at a School .................107

33 Coefficients and Standard Error of Coefficients for 2011 Z-Score Regressed onto Percentage of Asian/Caucasian Students at a School .................108

34 Inter-correlation Matrix for Three Variables being examined in 2011 Multiple Linear Regression .................116

35 Collinearity Statistics for Three Variables used in 2011 Multiple Linear Regression .................116

36 2011 Multiple Linear Regression Model for Letter Grade regressed onto the variables of SUM3 and Percentage of Asian Students at a School .................119

37 2011 Single Variable Linear Regression Model for Letter Grade regressed onto the variable of Free and Reduced Lunch Percentage .................120

38 2011 Coefficient of Determination for Letter Grade regressed onto the variable of Free and Reduced Lunch Percentage .................120


LIST OF FIGURES

FIGURE PAGE

1 NCLB Student Achievement Expectations for English for All Subgroups...........29

2 NCLB Student Achievement Expectations for Mathematics for All Subgroups ........................................................................................................30

3 Scatterplot of 2010 Z-Score versus Proportion of Asian and Caucasian Students at a School .............................................................................................105

4 Residual Plot for Regression Model regressing 2010 Z-Score onto Percentage of Asian/Caucasian Students at a School ..........................................106

5 Scatterplot of 2011 Z-Score versus Proportion of Asian and Caucasian Students at a School .............................................................................................108

6 Residual Plot for Regression Model regressing 2011 Z-Score onto Percentage of Asian/Caucasian Students at a School ..........................................110

7 Residual Plot for 2011 Regression that Regresses School Letter Grade onto Four Independent Variables.........................................................................112

8 Scatterplot Matrix for All Variables in 2011 Multiple Regression......................113

9 Residual Plot for 2011 Regression that Regresses School Letter Grade onto Three Independent Variables .......................................................................114

10 Scatterplot Matrix for Three Variables in 2011 Multiple Regression .................117

11 Residual Plot for 2011 Regression that Regresses School Letter Grade onto SUM3 and Percent of Students at a School that are Asian........................ 118


DEDICATION

This dissertation is dedicated to the four people who have shared in the sacrifice,

time commitment, highs and lows throughout the process. I would like to dedicate this

dissertation to my wife, Marcia, and three sons, Zavian, Quentin and Elijah. My passion

for educational equity has been maximized by your involvement in my life. My belief in

my own abilities has been secured through your constant support. And my purpose in life

is solidified in your existence.


“Do we truly will to see each and every child in this nation develop to the peak of his or

her capacities?”

Asa Hilliard, 1991

CHAPTER 1

Overview

Introduction

The average person makes many decisions on a daily basis, both serious and

mundane. In making these decisions, they have to account for various

competing needs that must be prioritized. So, more often than not, the average person

finds themselves looking for a label of sorts to help them make a decision that is both

informed and efficient. For example, one might read food labels to sort out the poor from

the good quality products; or society may judge politicians by the label of their political

party. Specifically, this study focuses on the average person who uses labels to identify

which school may provide the best quality education for their child while understanding

the requirement of No Child Left Behind (NCLB) to close the achievement gap. Prior to

NCLB, a parent or guardian used sources such as word-of-mouth information from

community members to gather more information about a school. Now parents and

guardians alike enjoy the convenience of judging a school based on a label garnered from

student achievement data.

In January 2002 President George W. Bush signed into law the No Child Left Behind Act

(NCLB). A reauthorization of the 1965 Elementary and Secondary Education Act,

NCLB was implemented with bipartisan support throughout the legislative branch (Hess

& Petrilli, p. 18). In fact, spearheading the implementation of NCLB and school

accountability were Democrats Senator Edward Kennedy and Representative George

Miller, and Republicans Senator Judd Gregg and Representative John Boehner (Hess &


Petrilli, p. 19). These four members of the United States Congress served as critical

leaders in molding the principles embedded in NCLB.

One of the main reasons that Democrats and Republicans favored NCLB was

its sweeping reform with respect to the achievement gap (Hess & Petrilli, p. 21). In

essence “the law is premised on the notion that local education politics are fundamentally

broken, and that only strong, external pressure on school systems, focused on student

achievement, will produce a political dynamic that leads to school improvement” (Hess

& Petrilli, p. 23). NCLB required states to set up standards and measure whether students

performed to those standards broken down by subgroup. Consequently, student

achievement broken down by subgroup could address the overriding concern of the

achievement gap. The goal became to close the achievement gap by the school year

ending in the spring of 2014.

In the era of NCLB, accountability is a mainstay for students, schools, districts,

and states. NCLB has caused the education system to emphasize a new culture of

accountability. It requires the closing of the achievement gap by 2014 and schools are

currently ranked, or labeled, based on their ability to make adequate yearly progress

(AYP) toward that goal. As a measurement, a school ranking should possess the quality

of correctly identifying those schools that are performing the best across all subgroups

and making progress toward closing the achievement gap by 2014. One might assume,

based on NCLB requirements, the schools ranked highest would be those showing gains

toward closing the achievement gap or schools that have already accomplished closing

the gap. Branding a school with the highest label, although it shows no progress in


diminishing the achievement gap, may garner criticism about the validity of the

measurement system used with respect to the goals of NCLB.

Upon the implementation of policies to satisfy NCLB, the state of Arizona

determined that, in order to be identified as an excelling school, a school must be at least

one standard deviation above the average school in the percentage of students who exceed

on the Arizona Instrument to Measure Standards (AIMS) test (ADE, 2008, p. 21).

Using this method of measurement to distinguish an excelling school from a highly

performing school raises the question of whether the achievement gap is being closed at

schools in the state of Arizona. If a school needs only to score one standard deviation

above average in the percentage of students who exceed, then are schools with a high proportion of White and Asian

students at an advantage? And if this is the case, then does the label attached to schools in

Arizona have meaning beyond identifying the demographics and socioeconomic status of

a school? Essentially, are schools in the state of Arizona labeled as such because they

continue to attack the educational epidemic of low student-achievement within ethnic

minority subgroups?
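
The one-standard-deviation rule described above can be illustrated with a minimal sketch. The school names and percentages below are hypothetical and are not drawn from ADE data, and the use of the population standard deviation is an assumption, since the exact ADE formula is not reproduced here.

```python
from statistics import mean, pstdev

def schools_above_cutoff(pct_exceeding):
    """Return the schools whose percentage of students exceeding is at
    least one standard deviation above the mean across all schools.

    pct_exceeding maps a school name to its percent of students who exceed.
    Assumption: the population standard deviation is used for the cutoff.
    """
    values = list(pct_exceeding.values())
    cutoff = mean(values) + pstdev(values)
    return sorted(name for name, pct in pct_exceeding.items() if pct >= cutoff)

# Hypothetical percentages of students who exceed, by school.
hypothetical_schools = {
    "School A": 10.0,
    "School B": 12.0,
    "School C": 14.0,
    "School D": 16.0,
    "School E": 38.0,
}

flagged = schools_above_cutoff(hypothetical_schools)
print(flagged)
```

In this invented example only School E clears the cutoff. The rule looks only at the overall percentage of students who exceed, so nothing in the calculation refers to how any individual subgroup performed.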

The achievement gap between Hispanic and Black high school students in

comparison to their White and Asian peers appears to present an unsolved challenge

within the American and Arizona educational system. These gaps have existed at the

national, state, district and school level for decades. Furthermore, research suggests the

gaps are persisting within the twenty-first century educational climate. An October 2009

release from the Center on Education Policy stated:

Across subgroups and states, there was more progress in closing gaps at the

elementary and middle school levels than at the high school level. Even with this


progress, however, the gaps between subgroups often remained large – upwards

of 20 percentage points in many cases (p. 2).

The annually recurring achievement gaps at schools throughout the nation are

alarming. In the national era of school, district and state accountability it has been

deemed mandatory that educators take corrective measures to address this continuing

trend (NAEP, 2009, p. 4).

The educational goal of closing the achievement gap is a necessity in ensuring the

civil rights of children in America. According to NCLB, in order for schools and districts

to receive their Title I funds each state shall establish annual measurable objectives for

subgroups within districts and schools. Schools and districts that fail to make adequate

yearly progress (AYP) toward those objectives for each subgroup will be subject to

corrective actions as determined by the state. NCLB mandated, to the applause of

politicians on both sides of the aisle, that the achievement gap be addressed within every

school, and district, nationwide.

Berliner, School Accountability and the Achievement Gap

David C. Berliner of Arizona State University is possibly one of the United

States’ foremost critics of NCLB and the remnants of school accountability. Berliner

views accountability sought by NCLB as placing the blame for low-achievement among

certain minority subgroups on teachers and administrators (Berliner, 2009). Berliner

argues that other factors, primarily out-of-school factors (OSFs), are more to blame for

the achievement gap in certain subgroups than the school, the teachers, or the

administrators.


One can ascertain that poverty and socio-economic status (SES) are central

problems for certain ethnic groups within America. Poverty exists at a higher rate in

America among both Hispanic and Black populations, 25.3% and 25.8% respectively,

than it does among White and Asian populations, 12.3% and 12.5% respectively

(DeNavas-Walt, Proctor, & Smith, 2010). Understanding this, Berliner says there are

several educational consequences for children who live in poverty that result in a

persistent achievement gap (Berliner, 2009). One out-of-school factor that Berliner

brings to light in his research is Low Birth Weight (LBW) and Very Low Birth Weight

(VLBW). “African Americans, for example, are almost twice as likely as European

Americans to have a LBW child and almost three times as likely to have a VLBW child”

(Berliner, 2009). He goes on to mention that birth weight and IQ are correlated at

approximately 0.70 and that LBW children grow up to have IQs that are on average 11

points lower than those born at or above normal birth weight. Berliner’s argument is

simply that the effects of poverty more readily explain the achievement gap rather than a

failing educational system.

Throughout his research Berliner suggests several other OSFs that could be just as

prevalent in the achievement gap as teacher pedagogy. Berliner argues, with

accompanying statistics, that OSFs like food insecurity, pollution, family violence and

neighborhood communities can all have a significant impact on the achievement gap.

The negative aspects of all of these OSFs occur more frequently in lower socio-economic

status (SES) and high poverty areas. All of these OSFs present one more hurdle for a

student in the educational process. The majority of these OSFs occur at a higher rate

among Hispanic and Black students because they, in higher proportions, live in poverty.


Berliner is not the only researcher who believes that the impact of poverty on

education might be the biggest factor in the relentless achievement gap. Achievement

gaps among subgroups within a population do not just occur in the United States.

Birenbaum and Nasser (2006) and Zuzovsky (2008) concluded that there is an

achievement gap in Israel between children who speak Hebrew and those who speak

Arabic. The Arab population in Israel is typically from families that have parents with

less education, lower income levels and a higher percentage of families that live below

the poverty line. These studies found that Jewish children, those who speak Hebrew,

perform better than the poorer Arabic children at mathematics. In fact, Birenbaum and

Nasser (2006) found the coefficient of determination to be around 0.6. Thus, about 60

percent of the variation in mathematics achievement between Jewish and Arabic children can be

explained by the variation in their socioeconomic status and in their educational

resources.
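
To make the interpretation of a coefficient of determination concrete, the following sketch fits a least-squares line to a small set of invented numbers and computes the share of variation in the outcome that the line explains. The data are hypothetical and are not taken from Birenbaum and Nasser (2006); the single-predictor model is likewise an assumption made only for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for a simple least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Least-squares slope and intercept.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the line versus total variation in y.
    ss_residual = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_residual / ss_total

# Hypothetical socioeconomic index (x) and mathematics score (y) for five students.
ses_index = [1, 2, 3, 4, 5]
math_score = [2, 4, 5, 4, 5]

print(round(r_squared(ses_index, math_score), 2))
```

For these invented values the coefficient of determination is 0.6, meaning the fitted line accounts for 60 percent of the variation in the outcome and leaves the remaining 40 percent unexplained.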

The link between poverty, ethnic background and student achievement did not

begin with Berliner. The concern over these factors and equality of education started to

become a central focus when Dr. James S. Coleman of Johns Hopkins University published

Equality of Educational Opportunity in 1966. The study, known better as the Coleman

Report, concluded that “black children started out school trailing behind their white

counterparts and essentially never caught up” (Viadero, p. 1). The study found that the

leading factor contributing to this perpetual achievement gap in students’ academic

performance was their family background (Viadero, p. 1). Borman and Dowling (2010)

summarized, in the introduction to their research, that Coleman’s finding still holds a lot

of educational clout. Family background was a variable that inevitably included the

socioeconomic status of the family and could be classified within Berliner’s idea of out-

of-school factors.

Although poverty is quickly dismissed by many politicians as an excuse to not

produce a better educational system, Berliner’s idea that failing schools and the

achievement gap may be more the result of poverty should not be ignored. Berliner

simply believes that, “the problems of achievement among America’s poor are much

more likely to be located outside the school than in it” (Berliner, 2009, p. 4). Poverty

can create a multitude of side effects including poor health, lack of food, minimal

prenatal care and consequently children who, on average, underperform in academics.

The Contraposition of Berliner

Some rectangles are not squares. In Euclidean geometry this statement is true.

Logically, if the propositional statement is valid then the contraposition of that statement

must also be valid. “Some a are not b,” naturally implies that “Some not b are not a”

(Tidman & Kahane, 2003, p. 319). In this geometry case, the contraposition is some non-

squares are not rectangles and it must be valid in Euclidean geometry. This final

statement must also be true because the argument is valid and the premise in Euclidean

geometry is true (Tidman & Kahane, 2003, p. 8). Berliner argues that some variables

associated with poverty result in poor student achievement. Furthermore, he provides

statistical evidence to suggest that the statement is valid (Berliner, 2009). Therefore, the

contraposition of his argument must be both valid and true. The contraposition is that

some high (non-low) student achievement is the result of variables associated with wealth

(non-poverty).

In fact, the contraposition of Berliner’s proposition is something that American

educators and educational leaders continue to ignore. In light of the conflicting evidence

with respect to the achievement gap, one could be hard pressed to argue with Berliner’s

viewpoint that the achievement gap has been the result of something much broader than

the educational system. The failures of schools with respect to student achievement

might be caused by more than poor teachers and poor administrators. The failure of

schools might have more to do with our inability as a country to fix our inept social

policies for those in poverty than fixing our educational system (Berliner, 2009). But if

we are to conclude this, we must not continue to ignore the contraposition. Schools that

we deem to be good or excellent throughout our states and our nation might be so not

because of their best practices in the classroom and in administration, but because of their

limited exposure to the ill effects of poverty.

Educators and educational leaders have long thrived on the single school in a

district where all children are exceeding the standards. In the state of Arizona, the

excelling schools are those written about in the papers and those recognized by the

public. It must be the curriculum at those schools; it must be the teachers at those

schools; it must be the administrative leadership at those schools that cause them to be

excelling schools. The envy of all other schools in the state of Arizona, excelling schools

are viewed as the places where things are done right, best practices are implemented and

leadership has a vision. Might it be that out-of-school factors are just as much to blame

for the excelling status of a school as OSFs are to blame for the failing status of another?

A staunch supporter of public education, Berliner attempts to protect the poor side

of education while glossing over the implications of his argument to the wealthy side.

The following research seeks to provide a foundation for framing school labels in the

state of Arizona, Berliner’s home state. Could it be that, even though a school is granted

a dignified label, little progress has been made at that school with respect to the

achievement gap? Might it be that these schools receive accolades merely because of

their demographics?

Statement of Problem

The purpose of this study was to examine the achievement gap in mathematics

and reading at all non-alternative schools within a suburban school district in the state of

Arizona for the 2009-2010 and 2010-2011 school years. The study was specifically

interested in student achievement, as measured by scale score, across ethnic subgroups

with respect to the state standardized AIMS examination. Furthermore, the study sought

to examine demographic reasons why schools in this district obtained a certain school

label. Other interests of the study included the descriptive analysis of cross-sectional data

in reading and mathematics at these schools from 2009-2010 and 2010-2011 and the

predictive abilities of the percentage of non-Black/Hispanic students with respect to the

percentage of students that exceeded on the AIMS examination. Using four main

research questions as a guide, data from two prior years was analyzed at schools

throughout the suburban school district.

Purpose of the Study

The purpose was to examine the achievement gap in mathematics and reading at

all non-alternative schools within a suburban school district within the state of Arizona

for the 2009-2010 and 2010-2011 school years. Furthermore, the study sought to

examine demographic reasons why the schools within the suburban school district

obtained high and low school labels. The study was specifically interested in student

achievement across ethnic subgroups with respect to the state standardized AIMS

examination. Another interest of the study included the descriptive analysis of cross-

sectional data in reading and mathematics at these schools from 2009-2010 and 2010-

2011. Furthermore, the study sought to define the predictive abilities of the percentage of

non-Black/Hispanic students with respect to the percentage of students that exceeded on

the AIMS examination. Using four main research questions as a guide, data from two

prior years was analyzed at schools throughout the suburban school district.

Research Questions

This dissertation was guided by the following questions:

1. What are the two-year cross-sectional data trends for the achievement gap among

White, Asian, Hispanic and Black students on the 2009-2010 and 2010-2011

AIMS mathematics, reading and writing sections at all schools in the suburban

school district?

2. Is the average student achievement, as measured by average scale score, in ethnic

subgroups different for the 2009-2010 and 2010-2011 AIMS examinations at each

non-alternative school throughout the suburban school district?

3. Is the percentage of Asian and White students correlated with the state-issued z-

score, a standardized score for the percent of students that exceed on the AIMS

examination at a school in a given year, which helps determine school labels

within the state of Arizona?

4. Are free and reduced lunch rates, English Language Learner rates, percentage of

Asian students and percentage of White students correlated with the AZ LEARNS

A-F letter grades published by the state of Arizona for schools within the

suburban district?

These questions examine the achievement gap in the suburban district in order to

establish a baseline for the current validity of the school labeling system.
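The third and fourth research questions rest on two basic statistics: a z-score, which standardizes a value against a group mean and standard deviation, and a correlation coefficient. The Python sketch below shows one way these could be computed. It is a hedged illustration under stated assumptions: the school percentages are invented, the helper names are hypothetical, and the z-scores are standardized within this small made-up sample, whereas the state-issued z-score is calculated by the Arizona Department of Education against its own reference group.

```python
# Illustrative sketch only: the six "schools" and their percentages are invented.
from statistics import mean, pstdev

def z_scores(values):
    """Standardize values against their own mean and population standard deviation."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

def pearson_r(x, y):
    """Return the Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical school-level data
pct_asian_white = [35.0, 48.0, 52.0, 61.0, 70.0, 83.0]  # percent of enrollment
pct_exceeding = [8.0, 12.0, 15.0, 14.0, 22.0, 30.0]     # percent exceeding on the test

z = z_scores(pct_exceeding)
r = pearson_r(pct_asian_white, z)

for pct, score in zip(pct_asian_white, z):
    print(f"{pct:5.1f}% Asian/White -> z = {score:+.2f}")
print(f"Pearson r = {r:.2f}")
```

Because standardizing is a linear rescaling, the correlation with the z-score is the same as the correlation with the raw percentage exceeding; the z-score changes the scale of the measure, not its relationship to school demographics.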

Significance of the Study

Despite the efforts of NCLB nine years ago, the eradication of the achievement

gap continues to elude our nation, our states and our districts (CEP, 2009). In an attempt

to determine why the achievement gap still persists, one must identify a group of root causes.

Furthermore, the group of factors must be separated into what is controllable versus

uncontrollable by the education community. Thus, of the factors that contribute to the

persistence of the achievement gap, many researchers believe that variance among

subgroup achievement can be most readily explained by out-of-school factors (OSFs).

OSFs can provide a multitude of reasons for the lingering gap (Berliner, 2010). Berliner

argues that schools are “not in the position to eliminate the achievement gap” because the

gap is the result of variables outside the schools’ control. Other researchers believe that

the unrelenting achievement gap is more related to teacher-level factors (Levine &

Marcus, 2007; Beecher & Sweeney, 2008; Harlan, 2009; Liew, Chen, & Hughes, 2010;

McKown & Weinstein, 2008), school-level factors (Burch, Theoharis, & Rauscher, 2010;

Marshall, 2009) or district-level factors (Diamond, 2006; Leithwood, 2010; Loesch,

2010). Hierarchical Linear Models (HLMs) have helped researchers examine the factors

at each of these levels in determining their effects on the achievement gap (Wei, 2008;

Zhang & Zhang, 2002).

HLMs have helped determine that a multitude of factors nested within many

different levels of the educational system contribute to the achievement gap. As Wei

(2008) noted, “school accountability systems should be designed so that classroom level

variation can be taken into consideration when quantifying the precision of school

rankings” (p. 3). While OSFs, school-level factors and district-level factors are thought

to play a role in the achievement gap, each individual school still carries

the responsibility to close its own achievement gap. After all, OSFs are, by

definition, out of the locus of control of a school. Therefore, schools must remain

steadfast in their commitment to focus on those factors which they control and address

them so that all of their subgroups can perform academically.

Many schools throughout the state of Arizona continue to receive the

distinguished label of excelling from the Arizona Department of Education. In the state of

Arizona school labels are divided into six categories:

1. Excelling

2. Highly Performing

3. Performing Plus

4. Performing

5. Underperforming, and

6. Failing

Is it possible that excelling schools still perpetuate an achievement gap despite being

labeled excelling? The main goal of NCLB was closing the achievement gap and

ensuring all students a basic level of education. Therefore, schools that achieve the

highest label in the state of Arizona should show significant strides in accomplishing this

goal. In an effort to provide analysis of the achievement gap at schools in the suburban

school district, this study performed a descriptive analysis of achievement gap data from

2009-2010 and 2010-2011. It also examined the statistical significance of differences in AIMS achievement

across ethnic subgroups by using an ANOVA. The study looked into the correlation

between the percentages of non-Hispanic/Black students and the z-score issued by the

Arizona Department of Education for that school. Finally, the study examined the

correlation between new letter grades issued by the Arizona Department of Education and

free and reduced lunch rates, English Language Learner rates, percentage of Asian

students, and percentage of White students. The answers to the research questions

enable administrators, teachers, parents and other stakeholders to better understand what

it means to receive a certain label by the state of Arizona. For instance, a minority parent

will be able to establish what a label in this suburban school district means for his or her

child. Additionally, a principal will understand whether a label correctly identifies the

ability of their school to service the needs of minority students and close the achievement

gaps. The superintendent can improve his/her ability to recognize why a school in this

district achieves its label. Finally, this study gives a baseline for understanding whether

NCLB has had a significant impact on closing the achievement gap within this suburban

school district in the state of Arizona.
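The ANOVA mentioned above tests whether mean scale scores differ across subgroups by comparing the variation between group means with the variation within groups. The Python sketch below computes the one-way ANOVA F statistic by hand as an illustration. It is not the analysis performed in this study: the scale scores are invented, the function name is hypothetical, and the sketch stops at the F statistic rather than reporting a p-value.

```python
# Illustrative sketch only: all scale scores below are invented, not AIMS data.

def one_way_anova_f(groups):
    """Return (F statistic, df between, df within) for a one-way ANOVA."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Variation of the group means around the grand mean, weighted by group size
    ss_between = sum(
        len(group) * (g_mean - grand_mean) ** 2
        for group, g_mean in zip(groups, group_means)
    )
    # Variation of individual scores around their own group mean
    ss_within = sum(
        (score - g_mean) ** 2
        for group, g_mean in zip(groups, group_means)
        for score in group
    )
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scale scores for four subgroups at a single school
scale_scores = {
    "White": [512, 498, 530, 521, 505],
    "Asian": [525, 540, 518, 533, 529],
    "Hispanic": [478, 490, 465, 483, 472],
    "Black": [470, 485, 462, 476, 481],
}

f_stat, df_b, df_w = one_way_anova_f(list(scale_scores.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F statistic indicates that the subgroup means differ by more than the within-group spread would suggest. In practice a statistics package, for example the `f_oneway` function in SciPy's `scipy.stats` module, would also supply the p-value needed to judge statistical significance.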

Delimitations

1. The study was conducted on data from the Arizona Instrument to Measure

Standards (AIMS) administered in the spring of 2010 and 2011.

2. The study included schools from one suburban school district in the state of

Arizona.

3. The study does not include data from charter schools or other districts within

the state of Arizona. Consequently, the ability to generalize beyond the scope

of this study is minimized.

4. The study is being conducted within a school district by which the doctoral

candidate is employed.

Limitations

1. The study takes a limited look, through the fourth research question, into

changes in the labeling system within the state of Arizona (Kiley, 2010). The

AZ LEARNS letter grades of A through F are in their first year of

implementation. As a result, this is believed to be the first study examining

the letter grades and their relationship to other variables.

2. The schools in this study are from the Phoenix metropolitan area. The school

district analyzed was an ideal school district in SES status, ELL population

and student demographics to start examining the ADE labels.

3. The difference in socioeconomic status (SES) for each student within the

examined school is a confounding variable that is not included in the scope of

this study. While the SES of the entire school is examined in the fourth

research question by examining free and reduced lunch, the SES of individual

students is not taken into account.

4. Yearly changes to the AIMS examinations did not occur during the 2010 and

2011 spring administration of the mathematics and reading examinations.

However, questions within the exam do vary from year to year. Changes in

questions on well-constructed standardized tests that are vertically scaled should

have minimal effect on subgroup population data.

Definition of Terms

Achievement Gap: Notion that minority students, specifically Blacks and

Hispanics, tend to lag behind their White/Asian counterparts in student achievement on

standardized assessments (Orlich, 2004).

Adequate Yearly Progress (AYP): Annual status check of identified data elements

to determine whether schools and school districts are meeting state progress goals (Smith,

2005).

Arizona Instrument to Measure Standards (AIMS): The test required by the state

of Arizona that measures student achievement on reading, writing and mathematics based

on Arizona state standards.

Asian: A student having origins in any of the original peoples of the Far East,

Southeast Asia, the Indian subcontinent or the Pacific Islands. This category excludes

students of Hispanic origin.

Black: A student having origins in any of the black racial groups in Africa. This

category excludes students of Hispanic origin.

Excelling School: A school in the state of Arizona that is labeled excelling by the

Arizona Department of Education is more than one standard deviation above the mean with

regard to the proportion of students who exceed on the AIMS test, in conjunction with

meeting the requirement for Status and MAP points.

Hispanic: A student of Mexican, Puerto Rican, Cuban, Central or South

American, or other Spanish culture or origin, regardless of race.

No Child Left Behind Act (NCLB): President George W. Bush’s education reform

bill enacted in January 2002 which holds that all states across the U.S. will reach

universal proficiency in reading and mathematics by the end of the 2013-2014 school

year.

Out-of-School Factors (OSFs): Factors that are more frequently found in low

socioeconomic neighborhoods and as a result have an educational effect on students that

live in poverty. Included in these factors are such things as low birth weight, inadequate

medical and vision care and food insecurity (Berliner, 2009).

School Labeling Systems: An accountability system for schools, required by

NCLB, that ranks schools based on student academic performance on standardized state

assessments that measure standards.

White: A student having origins in any of the original peoples of Europe, North

Africa or the Middle East. This category excludes students of Hispanic origin.

Organization of the Study

The remainder of the study is organized into four chapters, a list of references,

and appendices. Chapter Two consists of a literature review that examines the current

research dealing with the achievement gap and subgroup achievement. Chapter Three

delineates the sampling techniques, methodology, and design of the study. The statistical

analysis of the data collected and a descriptive summary of the implications from the data

analysis are contained in Chapter Four. The summary of the findings from the research

along with implications and recommendations for future research are found in Chapter

Five. Immediately following Chapter Five is a list of references and appendices.

Summary

NCLB was implemented in 2001 in an effort to address the gap in achievement

between certain subgroups. The law strives to ensure that by 2014 all subgroups are

performing at a minimal level as measured by standardized achievement tests developed

from state standards. Schools are to be held accountable for improving subgroup

achievement through school labeling systems. School labeling systems are measuring

devices that provide comparative information for the public to judge schools’ ability to

drive student achievement and close the subgroup achievement gap. This study attempts

to analyze the achievement gaps at schools in a suburban Arizona school district and the

state of Arizona’s labeling system for the district schools.

CHAPTER 2

Review of the Literature

Introduction

The following review of literature is intended to provide a background of the

achievement gap in the United States. In particular, the review will focus on the history

of educational assessment, history of educational equality and impact of the 2001 NCLB

Act with respect to the achievement gap. Since the enactment of NCLB, schools

throughout the nation have been mandated with the task of closing the performance gap

between several subgroups. Numerous research studies have ensued focusing on

everything from the stringency of school accountability systems (Wei, 2008) to out-of-

school factors that perpetuate the achievement gap (Berliner, 2010). The following

literature review aims to capture this research so as to frame main factors concerning the

achievement gap, its measurement, and school labeling.

History of Assessment

In 1845 Horace Mann and his educational ally Samuel Howe asked the Boston

School Committee to administer a written examination to school children instead of the

traditional oral examination (Rothman, 1995, p. 33). Oral examination had, for centuries,

dominated methods for evaluating students and measuring their learning outcomes. In

1219 AD, the University of Bologna started giving oral examinations in law, and in 1636

Oxford started requiring oral exams to earn a degree (Lamprianou &

Athanasou, 2009, p. 6). Mann and Howe reasoned that these new written examinations

could provide objective information about student learning and quality of teaching

(Rothman, 1995, p. 33). Upon receiving the results from the initial testing Mann became

more confident of the power of the new testing methods. He began to “advocate for the

regular use of written tests to monitor the quality of instruction and permit comparisons

among teachers and schools” (Rothman, 1995, p. 34).

After initial implementation, assessment within primary and secondary education

grew. Resnick and Resnick (1985) noted that tests in a variety of school subjects were

implemented during the last two decades of the 19th century and the first two decades of

the 20th century. As testing grew, cost efficiency became critically important

to those paying the bill for testing students: taxpayers. Resnick and Resnick (1985)

reasoned that this cost-efficiency drove the development of short-answer and multiple-

choice tests which were objective and cost-efficient simultaneously.

During the first part of the 20th century educational psychologist Edward

Thorndike helped push assessment further by helping the Army develop the Alpha and

Beta tests. The Army, during World War I, employed the knowledge of

psychometricians led by Thorndike and Robert M. Yerkes to develop mental and

cognitive testing (Peterson & West, 2003, p. 2). Now known as the ASVAB test, these

tests were intended to help the Army identify the intelligence of soldiers (Rothman,

1995, p. 37). It is doubtful that these tests had an impact on the outcome of the war.

However, the process of implementing a test to measure intelligence became more

acceptable as a result (Zimmerman & Schunk, 2003).

The biggest impact of the Alpha and Beta tests was aiding in the social acceptance

of testing as a means of determining those suited for more intelligent ventures (Peterson

& West, 2003, p. 3). The most common test still seen as a result of this development is

the Scholastic Aptitude Test. Stephen Gould stated that the Army had developed a test to

measure all pupils. As a result, “Tests could now rank and stream everybody; the era of

mass testing had begun,” (Gould, 1981, p. 195) and the SAT led the way. In 1929, the

University of Iowa developed the Iowa Test of Basic Skills and the Iowa Test for

Educational Development. These tests were intended to help schools gather information

about student achievement data. But, their bigger impact may have been in developing

“large-scale” testing equipment and methodologies that were cost-efficient (Rothman,

1995, p. 38).

Until the 1960s the educational system in America was thought to “solve

problems associated with civil rights, hunger, malnutrition, immigration, crime, teenage

drug use and economic inequality” (Peterson & West, 2003, p. 4). However, in the

middle of the 1960s and into the 1970s a concern had arisen with declining SAT scores.

From 1963 to 1977 average SAT scores had dropped from 478 points to 429 points on the

verbal section and 502 points to 470 points on the mathematics section (Rothman, 1995,

p. 40). Along with the decline in SAT scores, educational surveys during the decade

suggested that United States children were amongst the lowest in academic achievement

when compared to their international peers (Nichols & Berliner, 2007, p. 4). The panic

of Americans alarmed by the Russian launching of Sputnik in 1957 (USDoE, 2009, p. 8)

in conjunction with the data suggesting a failure in the educational system was the first

alarm for an educational crisis. These concerns led to many states adopting minimum-

competency testing. From 1973 to 1983, “the number of states with some form of MCT

requirement went from 2 to 34” (Linn & Miller, 2005, p. 4). The view of public

education solving problems shifted to a view that public education perpetuated problems

during the 1960s and 1970s. As a result, accountability through testing increased.

The educational crisis culminated more than two decades after Sputnik in a report

commissioned by President Ronald Reagan’s education secretary Terrel H. Bell. A

Nation at Risk: The Imperative for Educational Reform (1983) was a vigorous

attack on American education and the inability of the educational system to rise out of

mediocrity. The report states, “the educational foundations of our society are presently

being eroded by a rising tide of mediocrity that threatens our very future as a Nation and

a people” (Bell, 1983, p. 1). All of the 50 states put into place some type of reform after

Bell released the report and at the center of most of these reforms was a test for

accountability purposes (Linn & Miller, 2005, p. 5). A Nation at Risk served as a

catapult in advancing standardized assessment in public education.

A new face of accountability followed A Nation at Risk as Bell released a “wall

chart” that attempted to show how the fifty states measured in performance (Rothman,

1995, p. 44). As the 1980s passed, standardized tests started to dominate accountability

systems. The majority of states in this decade reported that the majority of their

students were above the national average, and John Cannell labeled this phenomenon the “Lake

Wobegon effect” (Linn & Miller, 2005, p. 6). The Lake Wobegon effect occurs, essentially,

when every member of a comparison group that accounts for the entire

population reports being above average. Whether states were misrepresenting

data or old norms were being used to score recent tests, the above-average results for

every state, the “Lake Wobegon effect,” were the result of pressure on states to show

significant gains in their educational system (Linn & Miller, 2005, p. 6). The overall

emphasis on student achievement through high-stakes testing provided political pressure

on states to show educational gains.

The standards-based reform of the last two decades has brought the importance of

standardized assessments to an all-time high. Criterion-referenced tests, intended to show

a minimal level of understanding of state-mandated standards, have become the preferred

method of high-stakes testing. NCLB has reinforced the use of these types of tests to

primarily “determine rewards and sanctions for schools” (Linn & Miller, 2005, p. 8).

Much like Terrel Bell did when forming a “wall chart” for state educational

performances, NCLB has provided ranking systems based on assessments for districts

and schools within each state.

Mann and Howe originally asked the Boston School Committee to implement

written assessments for students in order to more objectively evaluate student learning

and quality of instruction (Rothman, 1995). As time has passed, assessment has remained at the

forefront of educational reform in the 21st century. Showing significant assessment

gains, which resulted in the “Lake Wobegon effect” in the 1980s, continues to be a

central focus for parents, students, teachers, schools, districts and politicians.

NCLB-mandated school labeling systems, whether reliable or not, provide a landscape in which

pressure on administrators and teachers is ever increasing. Schools strive to be labeled as

excelling. States strive to have excelling schools. The “Lake Wobegon effect” of the

1980s leaves one to wonder whether the label for a school is statistically correct.

History of Achievement Gap

The differences in student achievement between subgroups in the American

education system have long been debated. During the 1950s and 1960s, the inequities

in opportunity and achievement of the education system were brought to the forefront

by Brown v. Board of Education (1954), the Elementary and Secondary Education Act

(1965) and the Civil Rights Act (1964). In 1963 an article on desegregation in

Englewood, New Jersey documented the achievement gap between Black students and

White students in elementary schools throughout the local school districts (Walker, 1963,

p. 8). The term “achievement gap” surfaced one year later in the Hauser Report on

Chicago public schools when the authors stated, “intensified educational opportunities for

Negro boys and girls would result in a major closing of the achievement gap between

group performances of Negro students and other groups of students” (Hauser, McMurrin,

Nabrit, Nelson & Odell, 1964). The above sources exemplify that the achievement gap in

its infancy focused on Black and White students.

The National Assessment of Educational Progress (NAEP) showed that the

educational system made significant gains in closing the black-white achievement gap in

the two decades following the civil rights movement (Barton & Cooley, p. 3). The NAEP

was founded in 1964, with a grant from the Carnegie Corporation, with the intent of putting

a metric lens on achievement (NAEP, 2009). The first assessment was implemented in

the 1969-1970 school year and it obtained a baseline measure of student achievement

(NAEP, 2009). The NAEP was used in an effort to monitor national progress in

education specifically with an interest in equity. During the 1970s and 1980s the test

showed progress in the educational system’s ability to close the black-white achievement

gap. “In reading, for example, a 39-point gap for 13-years olds was reduced to an 18-

point gap in 1988. For 17-year-olds, the gap declined from 53 points to 20 points”

(Barton & Cooley, p. 6). The progress of the 1970s and 1980s was met with the

optimistic viewpoint that education systems could progress towards eliminating the

black-white achievement gap.

http://nces.ed.gov/nationsreportcard/about/naephistory.asp

NAEP data from the 1990s presented a much different viewpoint of the ability of

the educational system to close the black-white achievement gap. The gap, which

generally became narrower in the 1970s and 1980s, actually began to show some

increases in certain age groups amongst the subjects of reading and mathematics (Barton

& Cooley, p. 7). However, in the period from 1999 to 2004 that gap slowly narrowed

again (Barton & Cooley, p. 15). In 2005, Secretary of Education Margaret Spellings

stated that NAEP data showed, “proof that No Child Left Behind is working—it is

helping to raise the achievement of young students of every race and from every type of

family background” (USDoE, 2005). In contrast, Marshall Smith (2007) suggested that

the progress seen in 2004 NAEP data was not far enough removed from the

implementation of NCLB in order to credit the legislation for the decreasing achievement

gap trends.

In the years after the implementation of NCLB, evidence indicates that few gains

have been made in closing the black-white achievement gap. From 2004 to 2008 the

NAEP discovered “no statistically significant differences” in the changes of the black-

white achievement gap (Barton & Cooley, p. 15). In fact, despite the slight decrease in

the black-white achievement gap between 1999 and 2004 the lack of progress in closing

the gap since the 1980s casts a shadow of doubt on recent education reform. The strict

standards-based reform effort that has swept the country has shown little benefit in

closing the achievement gap (Orfield, 2006).

Although the term achievement gap was originally developed to describe the

achievement disparities between Black and White students the term has evolved into a

broader meaning. The term has grown to encompass the disparity in achievement

performance between any two groups of students (Education Week, 2004). Specifically,

since the passage of NCLB, the gaps of importance include minority-majority ethnicity

achievement gaps, gender gaps, and socioeconomic gaps.

The gaps in achievement between many of the subgroups highlighted by NCLB

continue to remain significant. A 2010 report sponsored by the Center on Education

Policy stated that, “States continue to confront large Hispanic-White gaps in achievement

on state reading and mathematics tests” (Kober, Chudowsky, & Chudowsky, 2010, p.

23). The report also concluded that, “less progress has been made in narrowing

achievement gaps on state tests for Native American than other racial/ethnic groups”

(Kober et al., 2010, p. 28). One of the few gains found by the CEP report was a narrowing of the gap in

achievement between low-income students and high-income students (Kober et al., 2010,

p. 31). Ultimately, the report concluded that evidence from state tests and the NAEP was

inconclusive with respect to narrowing achievement gaps.

The term “achievement gap” first surfaced in the civil rights era. First used to

describe the disparity in achievement between Black and White students, the term has

since been used to describe other subgroup gaps. NAEP, a nationwide assessment, has

provided data since 1970 suggesting that gaps narrowed in the 1970s and 1980s

and widened in the 1990s before becoming stagnant in the first decade of the 21st century.

The gap in academic performance between subgroups continues to be a critical issue

almost 50 years beyond the civil rights era.

Historical Background of Equity in Education

The NCLB Act must be viewed in the proper historical context. The American

education system has long pondered “How do we ensure equal education?” Plessy v.

Ferguson, 163 U.S. 537, was a landmark case in 1896 that supported separate education

for racial subgroups. The case determined that separate education for children of

different races was permissible as long as equal educational resources were supplied (Imber & Van

Geel, 2004, p. 208). The response to this policy from W.E.B. Du Bois in the Common

School and the Negro American in 1911 was, “The alarming neglect of and

discrimination against the Negro schools are plainly evident to anyone who reads the

reports of educational officers in the southern states” (Reese, 2005, p. 210). Du Bois, like

many who followed, believed that separate education was inherently not equal. For the

last one-hundred fourteen years educators, legislators and the judiciary have not been

able to concretely define what ensures equal education for different subgroups.

Equal education came to light again in 1954 in Brown v. Board of Education of

Topeka, 347 U.S. 483. The idea from Plessy v. Ferguson that separate educations could

be equal was challenged in the case, which sought to bring an end to segregated

education (Reese, 2005, p. 226). Thirteen parents on behalf of their twenty children filed

suit demanding that the Board reverse its racial segregation policies in its schools. The

decision of the court reversed the Plessy decision of 1896. The court determined that

separate educations were inherently unequal. Furthermore, districts were now required to

desegregate their schools in an attempt to ensure that all children were provided equal

education (Imber & Van Geel, 2004, p. 213).

The Elementary and Secondary Education Act (ESEA) was passed in 1965, eleven years

after Brown v. Board of Education. ESEA was a bold legislative act that provided

federal funding to help ensure equal education in low-income areas (Rury, 2002, p. 191).

Schools with a high enough proportion of students on the Free and Reduced Lunch

program would qualify for the federal funds. In an American era that valued the civil

rights of its citizens, ESEA was an attempt to ensure the right of education for

economically disadvantaged students (Webb, 2006, p. 288).

In the same era as the passage of ESEA, women saw a major amendment

within educational rights, Title IX. In 1972 the implementation of Title IX had the

primary purpose of prohibiting gender inequity in education (Webb, 2006, p. 297). Title

IX is most noted for its significant impact on the participation of women in interscholastic

and intercollegiate athletics. Title IX also sought to provide women with equal

educational access (Rury, 2002, p. 196). Following Title IX, the Women’s Educational

Equity Act of 1974, which sought to encourage women into math and science, was

passed (Webb, 2006, p. 299). As a result, in the decades since its enactment, gender equity

in academics has become a central focus of many secondary and post-secondary

institutions.

As the nation began to demand equity in education for low SES students and

females, several other groups began to question how equitable education was for their

children. In 1975, P.L. 94-142, which is more commonly known as the Individuals with

Disabilities Education Act (IDEA), was created in an effort to ensure equality in

education for handicapped children (Webb, 2006, p. 300). The law essentially required

that schools provide children with special needs a free and appropriate education within

the least restrictive environment (Imber & Van Geel, 2004, p. 262). The law has been

amended several times in its 35-year history. But in its barest form the overriding

principles of IDEA sought to protect education for handicapped children, and that goal

has not been altered. The federal government has deemed that there must be equality in

education for children with disabilities.

Most recently, in 2001 President George W. Bush signed the NCLB Act (NCLB)

in order to provide equal education for all subgroup populations throughout the United

States. The Senate vote for NCLB recorded eighty-seven YEAs and ten NAYs (U.S.

Senate Roll Call, 2001) while the House of Representatives vote recorded three-hundred

eighty-four YEAs and forty-five NAYs (Clerk of the House, 2001). Republicans and

Democrats alike voted for NCLB in hopes of providing a vision for twenty-first century

education in America. Unprecedented bipartisan support for NCLB suggested that

Americans widely viewed elementary and secondary education as a civil right of

children. NCLB set forth a baseline of accountability systems that provided a framework

to guide school systems toward ensuring equitable education for all subgroups throughout the

United States.

Equal opportunity in education has been historically questioned by the legislative

and judicial branches of the United States government. From Plessy v. Ferguson to

NCLB, how to ensure education gets dispersed uniformly and equitably to all subgroups

within the population has been a debated topic. Providing high quality education to all

American children has been the central goal of educational reform for over a

century (Kopan & Walberg, 1974, p. 1). Republicans, Democrats, and Independents all

understand that, as a civil right, education must be dispersed to the masses in an

evenhanded fashion.

No Child Left Behind

The NCLB Act of 2001 included several key components to strive for equity in

education. The most impactful of these components was probably the creation of

performance-based accountability systems through student testing (Popham, 2004, p. 14).

Under NCLB all fifty states were mandated to create accountability systems for students,

schools and districts. The goal of these systems was to ensure that all students in all

subgroups nationwide reached reading and mathematics proficiency by 2014 (Popham,

2004, p. 23; Webb, 2006, p. 360). The following figures (Steel, 2009, p. 14) demonstrate

what the expected progress for all subgroups in English (Figure 1) and mathematics

(Figure 2) may have been:

Figure 1. NCLB Student Achievement Expectations for English for All Subgroups.

Figure 2. NCLB Student Achievement Expectations for Mathematics for All Subgroups.

The charts show that by 2014 every student, in all subgroups throughout the entire United

States, was expected to meet performance standards set forth by the state. A recent

development at the White House, however, is enabling states to opt out of the 2014 deadline for

100% proficiency in reading and mathematics in exchange for adopting President

Obama’s new education agenda (Bruce, p. 1). Under the original requirement of NCLB,

if all subgroups were meeting standards, the achievement gap would have been nullified.

Students would be tested and assessed on their knowledge of the developed state

standards. Students would be held accountable by withholding diplomas for poor

performance or prohibiting grade advancement as a result of low achievement scores.

Schools and districts would be held accountable by reaching target goals for student

achievement in reading and mathematics.

School and district target goals are referred to in NCLB vernacular as annual

measurable objectives (AMO). AMO are set by states in reading and mathematics.

Schools as well as their districts must meet their AMO in order to achieve adequate

yearly progress (AYP). A school that fails to make AYP might face sanctions from the

state. Furthermore, schools that fail to meet their AMO and fall short of AYP for several

years can face further sanctions that include state takeover or dissolution of the school

(Webb, 2006, p. 365). Other consequences of schools perpetually falling short of AYP

may include providing transportation for students to attend schools that are making AYP.

AMO are not just set for the whole school or district. The school and the district

are expected to meet AMO for all subgroups (Popham, 2004, p. 24). Consequently, a

school meeting its AMO and achieving AYP has been successful at meeting goals for

all of its different subgroups including but not limited to special education students,

racial and ethnic students, and educationally disadvantaged students (Webb, 2006, p.

365). Recognition that AMO are satisfied for all of these subgroups does not, however,

suggest that the achievement gap has been closed at the schools meeting AYP.

Schools meeting AYP and improving subgroup scores on state-mandated

assessments can, mathematically, see an increase in performance gaps between

subgroups. Assume that in the 2007-2008 school year, the fictional Equality High School

shows that seventy-eight percent of White students have met the state-wide performance

standard in mathematics and fifty-four percent of Black students have done the same. In

the following year, Equality High School sees that eighty-five percent of White students

met the standard, which is a seven percentage point increase. In turn, sixty percent of the Black

students met the standard, which is a six percentage point increase. Both increases qualify the

school to make its AMO for each subgroup and consequently AYP is achieved. But

the performance gap between White students and Black students at Equality High School

has increased from twenty-four percentage points to twenty-five percentage points. As a result, the question

of whether the achievement gap is being addressed at schools remains.
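
The arithmetic of the Equality High School example can be checked with a short sketch. The percentages are the fictional figures given above; the function and variable names are hypothetical and serve only as illustration.

```python
# Illustrative sketch only: the figures are the fictional Equality High
# School percentages from the text; all names here are hypothetical.

def proficiency_gap(pct_group_a, pct_group_b):
    """Gap, in percentage points, between two subgroups' proficiency rates."""
    return pct_group_a - pct_group_b

# Percent of students meeting the mathematics standard in each year.
year1 = {"White": 78, "Black": 54}
year2 = {"White": 85, "Black": 60}

# Year-over-year gain for each subgroup, in percentage points.
gain = {group: year2[group] - year1[group] for group in year1}

gap_year1 = proficiency_gap(year1["White"], year1["Black"])
gap_year2 = proficiency_gap(year2["White"], year2["Black"])

print(gain)                  # {'White': 7, 'Black': 6}
print(gap_year1, gap_year2)  # 24 25
```

Both subgroups improve, yet the gap grows by one percentage point because the higher-performing subgroup gained more.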

NCLB was implemented for a variety of reasons. The most prevalent of these

reasons was ensuring the civil right of all of America’s children to be exposed to equal

educational opportunities. John Dewey emphasized the belief in the early twentieth

century that, “education is the means by which Americans try to improve individuals and

society” (Reese, 2005, p. 322) and within American society NCLB was an attempt to

ensure equal educational opportunity to all students. As a result, mandates in NCLB

focused on making schools and districts accountable for equal educational opportunity

through closing the achievement gap. By making schools and districts accountable for

making AMO for all subgroups politicians in the federal government believed that the

achievement gap would inherently be closed. Unfortunately, the impact of NCLB with

respect to the achievement gap is uncertain.

Schools are held accountable for the academic performance of all subgroups because

NCLB requires that states set up a school labeling or ranking system. There is variation

among school rankings for each state (Linn, 2006). Some states use growth models to

rank schools while other states use performance targets. For example, “Kentucky’s

accountability system and California’s Academic Performance Index (API) are examples

of the successive cohorts approach (growth) to measuring improvement in student

achievement” (Linn, 2006, p. 12). NCLB also allows states to determine their minimum

subgroup size as long as the minimum size is greater than five and less than one-hundred

(Wei, 2008). In the state of Arizona the minimum subgroup size is currently thirty.

States were also given the ability to “select the interval for their intermediate

achievement goals (for each subgroup) after they set the starting point” (Wei, 2008).

Finally, each individual state has an individual labeling system. In the state of Arizona

school labels fall into six categories: failing, underperforming, performing,

performing plus, highly performing and excelling. As of the 2011-2012 school year, the

Arizona State Legislature has determined that these labels need to be accompanied by an

additional label due to their ambiguous nature (Kiley, 2010). The new Arizona labeling

system will accompany the current labeling system so that a parent can better understand

that a performing plus school is average. The new state of Arizona labeling system

includes the labels of A, B, C, D and F. As of 2011-2012 a performing plus school will

most likely also receive a label as a ‘C’ school. School labels give the general public a

perception about the performance of the school in relationship to other schools. The

school labels also attempt to help community members understand the ability of a school

to satisfy the requirements of NCLB.

A better understanding of school performance, including progress in closing the

achievement gap, based on a label was what NCLB legislation hoped to provide.

Unfortunately, even with the change recently required by the Arizona State Legislature,

one is left to wonder whether school rankings correctly measure a school’s progress and

performance in many areas. The Arizona school label of excelling is obtained by a

school that meets a certain status level of achievement and accumulates a z-score of at

least one in the exceed category of student performance (ADE, 2010). A z-score of at

least one implies that a school is at least one standard deviation above the average

percentage of students that exceed on the AIMS test in the state. With this as the

measurement that determines being excelling, schools with high proportions of

educationally advantaged students might have distinct assistance in achieving an

excelling label. A school, through no fault of its own, may be labeled excelling because

of the same OSFs that Berliner argues perpetuate the achievement gap.
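
A minimal sketch of the z-score idea described above follows. The percentages are invented for illustration and are not ADE data, and the use of the population standard deviation is an assumption, since the exact ADE computation is not reproduced here.

```python
from statistics import mean, pstdev

def z_score(value, population):
    """Number of standard deviations that value lies above the population mean.

    Assumption: the population standard deviation is used; the actual ADE
    procedure may differ.
    """
    return (value - mean(population)) / pstdev(population)

# Hypothetical percentages of students exceeding at five schools.
pct_exceeding = [10.0, 20.0, 30.0, 40.0, 50.0]

z = z_score(50.0, pct_exceeding)
meets_z_threshold = z >= 1  # the "z-score of at least one" condition

print(round(z, 3), meets_z_threshold)
```

In this invented example the school with 50% of students exceeding sits more than one standard deviation above the mean of the five schools.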

After understanding the issues behind school rankings, the measurement of

school-wide performance becomes the central focus. How does one measure whether a

school is performing in accordance with NCLB? What is measured in order to ensure a

school label correctly portrays its ability to educate students from all of its subgroups?

Furthermore, once a measurement device is in place what checking mechanisms does the

state have for ensuring its validity?

Studies and Reports Regarding the Trends in the Achievement Gap since NCLB

Since the enactment of NCLB in 2001 the central question posed by numerous

researchers has been, “What are the trends in the achievement gap?” The conclusion

with respect to the impact of NCLB on the achievement gap is far from settled. Another

decade might pass before educational researchers fully understand the depth and breadth

of NCLB’s effect on the achievement gap. As a result, current research on the role NCLB has

played in closing the gap is muddled and far from conclusive.

A 2005 technical report from the Northwest Evaluation Association (NWEA)

outlined the immediate impact of NCLB on student achievement and

growth. With NCLB having been implemented for just three years, the report does

mention its limitations in predicting the long term effects of the legislation. Examining

the mean scores of students on state tests the report concludes that “from the Fall 2001 to

Fall 2003 improvement was greater for all ethnic groups among students enrolled in

grades that administered their respective tests” (Cronin, Kingsbury, McCall & Bowe,

2005, pg. 41). Furthermore, the report suggested that “on the whole, evidence indicated

that small but substantive gains in achievement were made by Blacks, Hispanics, and

Native-Americans that would serve to reduce achievement gaps between these groups

and European-American and Asian students” (Cronin, Kingsbury, McCall & Bowe, 2005,

pg. 42). The report essentially concluded that the initial findings of the impact of NCLB

were positive when examining the achievement gap.

The NWEA (2005) report does caution about the importance of the starting

position when measuring the achievement gap. It is noted in the report that there is

substantially more room for Black and Hispanic students to grow in comparison to their

Asian and White counterparts. On every standardized test there is a ceiling effect. The

ceiling that exists is the maximum score that is achievable on that test. Subgroups that

perform better when baseline data is acquired are closer to the ceiling and their room for

growth is less. Therefore, it is very likely that in the initial years after implementation of

NCLB one might see a closing of the achievement gap due to the ceiling effect. Minority

subgroups have more room for growth and therefore the gap inherently closes. This idea

can also explain, “how an achievement gap might be reduced in an environment in which

minority students grow less” (Cronin, Kingsbury, McCall & Bowe, 2005, pg. 45).

Researchers have taken note of the ceiling effect and as a result a focus on growth

rates among subgroups has become as important as the gap itself. Essentially, to

close the achievement gap it is necessary for the growth rates of lower performing

subgroups to be higher than the growth rates of the highest performing subgroups. The

key to NCLB is that this must be achieved while the performance of all subgroups

increases. Unfortunately, research suggests that growth rates remain approximately

constant across ethnic groups when initial performance position is taken into account

(Goldschmidt, 2004).

The 2005 NWEA report initially concludes that there is statistically significant

evidence to suggest that the achievement gap since the implementation of NCLB has

been reduced. However, upon further analysis the report recognized that when other

measurements are taken into account less dramatic conclusions can be made about the

achievement gap since the implementation of NCLB. At one point in the report it is

stated, “In sixth grade mathematics, the actual achievement gap between European-

American and Black students in our sample increased from a gap of 7 points to 10 points

between the fall of 2003 and the spring of 2004” (NWEA, 2005, pg. 46). The devices with

which we choose to measure the achievement gap are of critical importance. A complete

picture of the gap will not be captured by simply the difference between the percentages

of students that meet proficiency between subgroups.

In October 2009, the Center on Education Policy (CEP) produced a report on

2007-2008 state test score trends and their implications for the achievement gap.

The report included five main findings in state testing data:

1. All subgroups showed more gains than declines in grade 4 at all three

achievement levels.

2. As measured by percentages of students scoring proficient, gaps between

subgroups have narrowed in most states at the elementary, middle and high

school levels, although in a notable minority of cases gaps have widened.

3. Most often gaps narrowed because the achievement of lower-performing

subgroups went up rather than because the achievement of higher-performing

subgroups went down.

4. Gaps in percentages proficient narrowed more often for the Hispanic and

Black subgroups than for other subgroups.

5. Although mean scores indicate that gaps have narrowed more often than they

have widened, mean scores give a less rosy picture of progress in closing

achievement gaps than percentages proficient.

The overriding conclusion to these findings might be that state test scores suggest that the

achievement gap is being addressed. However, the fifth finding of this report poses

questions that are a necessity when measuring anything. The question that must be asked

when measuring is, “What is the measurement device and how is the measurement

constructed?” In this case the report, itself, addressed the issue.

When examining the achievement gap with the percentage of students that met the

proficient status on the state mandated standards assessment, the Center on Education

Policy found that seventy-one percent of the time the achievement gap had been

narrowed (CEP, 2009). Alternatively, when examining the same exact data with a mean

score as the measurement device the achievement gap was only narrowed fifty-nine

percent of the time. The report provided similar data with respect to areas where the

achievement gap actually widened. “Mean gaps also widened more than percentage

proficient gaps—37% of the time for mean scores versus 24% of the time for percentages

proficient” (CEP, 2009). The Center on Education Policy report does suggest that

overall the achievement gap is being narrowed. But, the important underlying issue that

is presented in the report of “how do we measure the achievement gap” must be noted.

Once again, variance in measurement techniques can create very different understandings

of the impact of NCLB.
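
The following sketch shows how the two measurement devices can disagree on the same scores. The scale scores and the cut score of 500 are invented for illustration and are not drawn from the CEP report or from any state test.

```python
from statistics import mean

CUT_SCORE = 500  # hypothetical proficiency cut score

def pct_proficient(scores, cut=CUT_SCORE):
    """Percentage of scores at or above the cut score."""
    return 100 * sum(score >= cut for score in scores) / len(scores)

def gaps(group_a, group_b):
    """Return (percent-proficient gap, mean-score gap) between two groups."""
    return (pct_proficient(group_a) - pct_proficient(group_b),
            mean(group_a) - mean(group_b))

# Invented scale scores for two subgroups in two consecutive years.
year1 = gaps([490, 510, 530, 550], [450, 470, 490, 510])
year2 = gaps([495, 540, 570, 600], [450, 470, 500, 505])

print(year1)  # percent-proficient gap 50 points, mean gap 40 points
print(year2)  # percent-proficient gap 25 points, mean gap 70 points
```

Measured by percentage proficient, the gap in this invented data narrows from 50 to 25 points. Measured by mean score, it widens from 40 to 70 points, because students in the second group moved just over the cut score while the first group's scores rose well above it.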

The enactment of NCLB certainly moved the Elementary and Secondary

Education Act into the age of accountability. In every state the state-wide standards

based testing varies but the main objective remains to hold schools accountable for

student achievement in every subgroup. It can be debated whether these

accountability systems play a role in increasing student achievement. Costrell (1997) and

Bishop (1997), in separate studies, concluded that exit examinations created awareness

and improvement of student achievement among leadership, faculty, and students. But,

much like differing opinions on the impact of NCLB on the achievement gap, other

researchers have concluded just the opposite. Studies have concluded that high-stakes

exit examinations have little to no effect on student achievement and can actually cause

an increase in dropout rates among low-achieving students (Jacob, 2001). Perhaps one

of the most prominent critics of high-stakes testing is David Berliner. Amrein and

Berliner (2002) ran a group of studies on the eve of NCLB that concluded that little

change was found when school based accountability was in place. Other researchers

have concluded that there is improvement in student achievement with school based

accountability systems in place (Carnoy & Loeb, 2002; Hanushek & Raymond, 2004).

Ultimately, as researchers continue to waffle over the impact of state accountability

systems for schools, they also continue to debate the impact of these systems on the

achievement gap. Hanushek and Raymond (2003a, 2003b, 2004) concluded that while

accountability increased student achievement for the White, Black and Hispanic

populations there were varying results with respect to the achievement gap. They found

that the White and Black achievement gap grew while the White and Hispanic

achievement gap got smaller. Two different achievement gaps thus showed differing effects during

the era of school accountability created by NCLB.

Summary

Accountability has long been an issue within education. Oral examinations

served as an accountability technique for students during the thirteenth century. In the

nineteenth century Mann and Howe implemented written examinations in order to hold

students accountable for knowledge. Over the course of the last century accountability

has expanded to now include teacher, school, district and state accountability.

School, district and state accountability over the last five decades have had a

primary focus on equity in education. Brown v. Board of Education, Title IX, IDEA, and

NCLB all have primary purposes that can be generalized under the common theme of

equity in education. Holding schools, districts and states accountable for equal

educational access and equal educational outcomes is the goal of these judicial and

legislative developments.

NCLB was thought to be the first major Civil Rights Act of the twenty-first

century. Closing the achievement gap amongst various subgroup populations in

American education was the central focus in NCLB. The achievement gap had been

documented during the previous three decades and NCLB set out to make education

equitable for all race, gender, and SES subgroups.

Research on the impact of NCLB with respect to the achievement gap is

convoluted. Findings from different researchers often paint different pictures of whether

NCLB has had a significant impact on closing the achievement gap across the United

States. Some research suggests that the achievement gap has been decreased in years

following the implementation of NCLB. Other research suggests that while the gap may

have been reduced the reduction is minimal at best. Finally, some research implies that

the metric used to measure the achievement gap can play a significant role in one’s

conclusion on the impact of NCLB on the achievement gap.

Mann and Howe turned the key in the ignition over fifteen decades ago when they

began administering written examinations as a means for measuring student progress.

Measuring has now become of instrumental importance to American educators. The

achievement gap has been of particular interest to measure as evidenced by the

implementation of NCLB. The impact of NCLB and its school accountability systems on

the achievement gap may take another fifteen decades to unravel.

Chapter 3

Methodology

Introduction

This chapter presents a description of the research design, the research questions that guided the study,

the target population, sample procedures, the sample, research instrumentation, data

collection procedures, and data analysis plans. Before providing the

methodology for the research, a brief review of the purpose of the study and the research

questions is provided.

Restatement of the Problem

The purpose of this study was to examine the achievement gap in mathematics

and reading at all non-alternative schools within a suburban school district in the state of

Arizona for the 2009-2010 and 2010-2011 school years. The study was specifically

interested in student achievement, as measured by scale score, across ethnic subgroups

with respect to the state standardized AIMS examination. Furthermore, the study sought

to examine demographic reasons why schools in this district obtained a certain school

label. Other interests of the study included the descriptive analysis of cross-sectional data

in reading and mathematics at these schools from 2009-2010 and 2010-2011 and the

predictive abilities of the percentage of non-African-American/Hispanic students with

respect to the percentage of students that exceeded on the AIMS examination. Using four

main research questions as a guide, data from two prior years were analyzed at schools

throughout the suburban school district.

Restatement of Research Questions

This dissertation was guided by the following questions:

1. What are the two-year cross-sectional data trends for the achievement gap among

White, Asian, Hispanic and Black students on the 2009-2010 and 2010-2011

AIMS mathematics, reading and writing sections at all schools in the suburban

school district?

2. Is the average student achievement, as measured by average scale score, in ethnic

subgroups different for the 2009-2010 and 2010-2011 AIMS examinations at each

non-alternative school throughout the suburban school district?

3. Is the percentage of Asian and White students correlated with the state-issued z-

score for the percentage of students exceeding on the AIMS

examination at a school in a given year, which helps determine school labels

within the state of Arizona?

4. Are free and reduced lunch rates, English Language Learner rates, percentage of

Asian students and percentage of White students correlated with the AZ LEARNS

A-F letter grades published by the state of Arizona for schools within the

suburban district?

These questions examine the achievement gap in the suburban district in order to

establish a baseline for the current validity of the school labeling system.

Research Design

Ex post facto data were analyzed with quantitative methods for this research. The

use of quantitative research methods allowed evaluation of Arizona’s school labeling

system with respect to the suburban school district according to two years of cross-

sectional data.

Target Population

The population of interest includes all schools within the suburban school district

in the state of Arizona. Because a convenience sample was used, the actual population

is limited to the suburban school district being analyzed.

Sample

In the 2009-2010 and 2010-2011 school years the suburban school district had

41 schools. Within those two years the district schools received diverse labels from

the Arizona Department of Education. Specifically, the district had schools ranging from

underperforming to excelling. Furthermore, since the district is a unified district these

labels were distributed across the elementary, junior high and high school levels. As a

result, a convenience sample for this study consists of state testing data, student

demographic data and school labels from the 2009-2010 and 2010-2011 school years for

the 41 schools in the suburban school district that existed for both of the testing years.

Sampling Procedures

The suburban school district was selected as a convenience sample of schools in the

state of Arizona. The district has a diverse population of students across ethnic groups,

SES status and English-language learners. As reflected in Table 1, the population of

students in the district was 4.77% ELL in comparison to a statewide ELL rate of 6.7%

(ADE, 2011). The district also had 28.47% of its students receiving Free and Reduced

Lunch Program benefits in comparison to the statewide rate of 41.5% (ADE, 2011).

Table 1

ELL Percentages and Free and Reduced Lunch Percentages Comparison

             ELL       Free and Reduced Lunch

District     4.77%     28.47%

Statewide    6.7%      41.5%

According to ethnic demographics (see Table 2) the suburban district is 6.7%

Black, 25.8% Hispanic, 57.1% White and 8.5% Asian while the state is 5.5%, 42.2%,

42.9% and 2.8% in those respective categories (ADE, 2011).

Table 2

Ethnic Comparison between Suburban School District and State of Arizona

             District    State of Arizona

Black        6.77%       5.5%

Hispanic     25.8%       42.2%

White        57.1%       42.9%

Asian        8.5%        2.8%

The diversity of this district in comparison to the state made it an ideal convenience

sample for the purposes of the study.

Data Collection Procedures

Data were gathered from the results of the statewide AIMS test which is given on

a yearly basis to third through eighth grade and tenth grade students. The data were

obtained from the Arizona Department of Education (ADE) website for each of the two

academic school years and three subject areas to be analyzed. The data, as distributed by

ADE, are provided to the general public in a Microsoft Excel spreadsheet. This

spreadsheet was utilized to obtain student achievement data.

Data Analysis

The quantitative analysis examined whether a school label is linked to the closing

of the achievement gap at that school within the suburban school district. Cross sectional

data trends were examined by creating bar graphs showing the difference in average scale

score between subgroups over the spring 2010 and 2011 exam administrations.

Furthermore, cross sectional data were analyzed by constructing stacked bar graphs based

on student achievement labels of exceeding, meets, approaching, and fall far below to

examine the first research question.

In order to examine the second research question, a priori ANOVA tests, with

family-wise α ≈ 0.06 due to six post-hoc pair-wise comparisons, were performed using

average scale score and standard deviation of scale score for each subgroup in each

testing year at every district school. Post-hoc tests using the Tukey-Kramer method were run

to analyze pair-wise comparisons. The Tukey-Kramer post-hoc analysis was conducted

at α=0.01. The post-hoc alpha level accounts for the Bonferroni correction and limits the

overall probability of committing a Type I error to 0.06 because, at most, six pair-wise

comparisons were made.
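
The family-wise figure quoted above follows from simple arithmetic, and a one-way ANOVA can be computed directly from the kind of summary statistics described (subgroup mean, standard deviation and count). The Python sketch below is illustrative only: the function names and the example numbers are hypothetical, and it is not the software used in the study.

```python
from math import comb


def familywise_bounds(alpha_per_test, n_groups):
    """Return (number of pairwise comparisons, Bonferroni bound, error rate if tests were independent)."""
    m = comb(n_groups, 2)                        # 4 subgroups -> 6 pairwise comparisons
    bonferroni = m * alpha_per_test              # upper bound on the family-wise Type I error rate
    independent = 1 - (1 - alpha_per_test) ** m  # exact rate only under the assumption of independence
    return m, bonferroni, independent


def anova_from_summary(means, sds, ns):
    """One-way ANOVA F statistic computed from group means, standard deviations and sizes."""
    k = len(means)
    total_n = sum(ns)
    grand_mean = sum(n * m for n, m in zip(ns, means)) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    ss_within = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    df_between = k - 1
    df_within = total_n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    m, bound, indep = familywise_bounds(0.01, 4)
    print(f"{m} comparisons, Bonferroni bound {bound:.2f}, independent-test rate {indep:.4f}")

    # Hypothetical scale-score summaries for four subgroups at one school.
    f_stat, df_b, df_w = anova_from_summary(
        means=[520.0, 470.0, 480.0, 505.0],
        sds=[45.0, 50.0, 48.0, 47.0],
        ns=[60, 40, 120, 300],
    )
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With four subgroups there are six possible pair-wise comparisons, so testing each at 0.01 caps the family-wise error rate at 6 × 0.01 = 0.06.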

In order to analyze the third research question, Pearson’s correlation coefficient
was computed and, subsequently, a linear regression was performed on the two bivariate scatterplots
produced using district data from each of the AIMS administrations in 2010 and 2011.
The model uses the percentage of Asian and White students as the independent variable and
the z-score produced by the state as the dependent variable. After performing the
analysis, the coefficient of determination, the slope of the regression model, the standard
error of the slope and the p-value for the slope were recorded in order to show if there is
statistical significance in the model. A coefficient of determination greater than 0.30 will
suggest that the percentage of Asian and White students has predictive capability for the
z-score that ADE uses to label an excelling school because the effect size, as measured
by r, will be medium (Cohen, 1992, p. 157).
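
The quantities recorded for this analysis can be illustrated with a short least-squares sketch in Python. It is a minimal example under stated assumptions: the function name and the school-level values are hypothetical, not the district's data. The p-value for the slope is not computed; it would come from a t distribution with n − 2 degrees of freedom applied to the t statistic returned here.

```python
from math import sqrt


def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x, with summary statistics."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)              # Pearson's correlation coefficient

    # Residual sum of squares and the standard error of the slope.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se_slope = sqrt(sse / (n - 2)) / sqrt(sxx)

    return {
        "slope": slope,
        "intercept": intercept,
        "r": r,
        "r_squared": r * r,                # coefficient of determination
        "se_slope": se_slope,
        "t_slope": slope / se_slope,       # compare against t with n - 2 degrees of freedom
    }


if __name__ == "__main__":
    # Hypothetical schools: percent Asian and White students vs. state-produced z-score.
    pct_asian_white = [45.0, 55.0, 62.0, 70.0, 78.0, 85.0]
    z_score = [-0.8, -0.2, 0.1, 0.4, 0.9, 1.3]
    for name, value in simple_linear_regression(pct_asian_white, z_score).items():
        print(f"{name}: {value:.4f}")
```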

A multiple linear regression was performed on 2011 letter grade data to
examine the fourth research question. The model used free and reduced lunch rate,
English Language Learner rate, percentage of Asian students and percentage of White
students as the independent variables and the quantitative A-F letter grade as the
dependent variable. The coefficient, standard error of the coefficient, t-score and
p-value for each independent variable were noted along with the coefficient of determination
for the entire model. A p-value less than α = 0.05 for a coefficient will suggest
that the variable significantly contributes to the model. A coefficient of determination
greater than 0.13 will suggest that the multiple regression model has predictive
capabilities for the ADE A through F letter grade because the effect size, as measured by
f², will be medium (Cohen, 1992). Corresponding residual plots for regression models
will also be analyzed to ensure the errors in the regression models constructed are
random.
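
The 0.13 threshold can be related to Cohen's f² through the standard identity f² = R² / (1 − R²), with f² = 0.15 conventionally read as a medium effect. The short Python sketch below shows the conversion in both directions; the function names are illustrative and not part of the study.

```python
def cohen_f2(r_squared):
    """Cohen's f-squared effect size for a regression model with the given R-squared."""
    return r_squared / (1.0 - r_squared)


def r_squared_from_f2(f2):
    """Inverse conversion: the R-squared that corresponds to a given f-squared."""
    return f2 / (1.0 + f2)


if __name__ == "__main__":
    print(f"R^2 = 0.13 gives f^2 = {cohen_f2(0.13):.3f}")
    print(f"f^2 = 0.15 gives R^2 = {r_squared_from_f2(0.15):.3f}")
```

An R² of 0.13 corresponds to an f² of roughly 0.15, which is why 0.13 serves as the medium-effect cutoff for the multiple regression model.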

Validity

Validity is a term within research that has a wide variety of definitions. Kerlinger

(1964) simply stated, “Are we measuring what we think we are?” Black and Champion

(1976) offered the following definition for validity, “the measure that an instrument

measures what it is supposed to.” Hammersley (1987) stated that “an account is valid

if it represents accurately those features of the phenomena, that it is intended to describe,

explain or theorise.” The purpose of the study was to examine the achievement gap at

schools in the suburban school district and the labels placed on these schools by Arizona

Department of Education system. Validity therefore can be seen in the ability of the

study to accurately reflect the achievement gap at these schools and consequently provide

some explanation for that school’s corresponding label. The theory of the study

is that labels attached to district schools provide little evidence, through descriptive

and inferential statistics, for the closing of the achievement gap.

External Validity

External validity addresses the ability to generalize a study to other populations.

The results from this research study cannot be generalized to other districts in the state of

Arizona. Because the sample consists of a convenience sample of one suburban school

district, future research in other districts will need to be completed before any results can

be generalized throughout the state of Arizona. Furthermore, the extent to which the

results can be generalized to other states with different school ranking, or labeling,

systems is limited as well.

Internal Validity

Internal validity deals with the truth about inferences made with respect to a
cause-effect relationship (Trochim, 2006). Curren and Werth (2004, p. 220) state that
internal validity is the “assertion that an observed relation between two variables reflects
a causal process or that the lack of an observed relation reflects the lack of a causal
process.” The research was performed on ex post facto data from an observational study,
the AIMS test. As a result, no inferences about cause and effect relationships
will be made in this study. Therefore, since there will be no inference about a causal
relationship between variables, internal validity is not a concern within this study.

Chapter 4

Findings and Results

Introduction

The study reported in this chapter examined the achievement gap at schools in a

suburban school district in the southwest United States. Furthermore, this study sought to

examine the relationship between variables such as percent of students at a school that

were of Asian and White ethnicities and numeric variables such as z-score that help ADE

in determining school labels. The chapter is organized in terms of the four specific

research questions that were posed in Chapter 1 and restated in Chapter 3.

First, a report of the analysis is provided on the achievement gap at the 40 schools
throughout the district by examining proficiency percentages during the 2010 and 2011
AIMS examination administrations. Second, the chapter moves to the analysis of the
achievement gap at the 40 schools throughout the district by examining average scale
score through ANOVAs run on 2010 and 2011 AIMS data. Third, the chapter examines the
relationship between z-score, a variable instrumental to determining AZ Learn Legacy
school labels, and percent of Asian and White students at a school by using correlation
and regression. Finally, continuing with the relationship between variables, the chapter
closes by examining the relationship between four different school-level variables
and the school letter grade assigned by ADE for the 2011 school year. A final summary
of the findings presented throughout the chapter appears at the end.


Analysis of the Achievement Gap Using AIMS Proficiency Percentage

The analysis of the achievement gap using the proportion of students proficient on

the AIMS examination during the 2010 and 2011 AIMS test administration follows. This

analysis addresses the first research question of:

1. What are the two-year cross-sectional data trends for the achievement gap

among White, Asian, Hispanic and Black students on the 2009-2010 and

2010-2011 AIMS mathematics, reading and writing sections at all schools in

the suburban school district?

The analysis of this research question is divided into two sections. One of these sections

addresses the results of the 2010 AIMS administration and the other addresses the results

of the 2011 AIMS administration.

2010 AIMS Summary - Overall

An achievement gap during the 2010 Spring AIMS administration was noted at

the majority of the schools in the suburban school district with respect to proficiency.

Proficiency is determined by the proportion of students that meet or exceed the standards

on the AIMS examination. As revealed in Table 3, at Elementary #1, 98.33% of the

Asian students were proficient on the AIMS examination while only 67.15% and 77.53%

of the Black and Hispanic students were proficient, respectively. The gap in proficiency

between Asian and Black students was 31.18% and the proficiency gap between Asian

and Hispanic students was 20.8%. Similar performance gaps at Elementary School #1

exist between White and Black students and White and Hispanic students. The gaps

for these two comparisons were 21.21% and 10.83%, respectively.
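
The gap arithmetic in this paragraph can be reproduced from the performance-band percentages. The Python sketch below is illustrative only: the function names are hypothetical, the Meets and Exceeds values are those reported for Elementary School #1 in Table 3, and the Asian Exceeds value is assumed to be 59.44 so that the row agrees with the 98.33% proficiency rate stated in the text.

```python
# Proficiency is the share of students who meet or exceed the standard.
# Each entry is (% Meets, % Exceeds) for Elementary School #1, Spring 2010.
BANDS = {
    "Asian": (38.89, 59.44),     # Exceeds assumed so that the total matches the stated 98.33%
    "Black": (52.86, 14.29),
    "Hispanic": (58.26, 19.27),
    "White": (51.23, 37.13),
}


def proficiency(meets, exceeds):
    """Percent proficient: Meets plus Exceeds, rounded to two decimals."""
    return round(meets + exceeds, 2)


def gap(reference, comparison, bands=BANDS):
    """Proficiency of the reference group minus the comparison group, in percentage points."""
    return round(proficiency(*bands[reference]) - proficiency(*bands[comparison]), 2)


if __name__ == "__main__":
    for ref, comp in [("Asian", "Black"), ("Asian", "Hispanic"),
                      ("White", "Black"), ("White", "Hispanic")]:
        print(f"{ref}-{comp} gap: {gap(ref, comp):.2f} percentage points")
```

The differences are expressed in percentage points, which is the sense in which the gaps are reported throughout this chapter.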


Further examination of the other two elementary schools in Table 3 shows similar

results with respect to the achievement gap amongst students of different ethnicities

during the 2010 Spring AIMS administration. At Elementary #2 Asian and White

students were proficient on the exam at 92.23% and 92.3%, respectively. In contrast,

Black and Hispanic students were proficient on the exam at a rate of 76.67% and 81.3%,

respectively. At Elementary #3 Asian and White students were proficient on the exam at

95.52% and 89.71%, respectively. This compares to Black and Hispanic students being

proficient on the exam at 81.6% and 80.75%, respectively. Once again the gaps in

performance amongst different ethnic subgroups are observed.

Table 3

Spring AIMS 2010 Subgroup Performance by Ethnicity and Elementary School

School                Ethnicity  % FFB   % Approach  % Meets  % Exceeds
Elementary School #1  Asian      0       1.67        38.89    59.44
                      Black      5.71    27.14       52.86    14.29
                      Hispanic   7.34    15.14       58.26    19.27
                      White      2.08    9.56        51.23    37.13
Elementary School #2  Asian      0.49    7.28        44.66    47.57
                      Black      4.44    18.89       58.89    17.78
                      Hispanic   4.88    13.82       67.48    13.82
                      White      1.37    6.33        59.81    32.49
Elementary School #3  Asian      1.63    2.86        47.76    47.76
                      Black      0       18.39       70.11    11.49
                      Hispanic   7.49    11.46       60.96    19.97
                      White      3.60    6.69        56.75    32.96

Achievement gap trends are also seen at the junior high level during the 2010

Spring AIMS administration. As reported in Table 4, at junior high #4 89.43% and

87.59% of Asian and White students were proficient on examinations taken. Noticeably

below this performance, at junior high #4 Black and Hispanic students were 70.79% and

73.77% proficient, respectively. The smallest gap in proficiency existed between White

and Hispanic students at junior high #4 where White students were 13.82% more likely to

be proficient than Hispanic students.

Table 4

Spring AIMS 2010 Subgroup Performance by Ethnicity and Junior High School

School          Ethnicity  % FFB   % Approach  % Meets  % Exceeds
Junior High #4  Asian      3.40    7.17        55.85    33.58
                Black      10.11   19.10       56.18    14.61
                Hispanic   8.83    17.40       60.52    13.25
                White      4.14    8.27        62.77    24.82
Junior High #5  Asian      3.59    5.99        53.89    36.53
                Black      6.17    14.81       66.67    12.35
                Hispanic   9.47    11.89       58.15    20.48
                White      2.87    6.65        61.21    29.27
Junior High #6  Asian      7.32    18.29       50.00    24.39
                Black      18.86   21.14       48.57    11.43
                Hispanic   20.41   24.55       47.75    7.28
                White      7.53    11.88       59.53    21.06

At the high school level a similar trend is noticed for the Spring 2010 AIMS

administration. As stated in Table 5, all four high schools in this suburban school district

displayed more than a 10% difference in proficiency for every comparison of

White/Asian and Black/Hispanic performance. The smallest achievement gap between

these subgroups existed at High School #4 where Asian students were only 11.95% more

likely to be proficient than Hispanic students. The largest achievement gap between

these subgroups existed at High School #2 where the White-Hispanic gap was at 30.62%

with respect to proficiency.


Table 5

Spring AIMS 2010 Subgroup Performance by Ethnicity and High Schools

School          Ethnicity  % FFB   % Approach  % Meets  % Exceeds
High School #1  Asian      1.08    4.30        58.06    36.56
                Black      7.81    20.31       57.03    14.84
                Hispanic   12.84   13.76       55.96    17.43
                White      4.34    9.37        59.20    27.09
High School #2  Asian      4.17    9.52        41.07    45.24
                Black      11.36   17.73       55.91    15.00
                Hispanic   14.81   27.53       48.17    9.49
                White      4.39    7.32        57.91    30.37
High School #3  Black      9.74    13.96       55.19    21.10
                Hispanic   9.93    16.06       53.79    20.22
                White      3.56    5.84        50.36    40.24
High School #4  Black      16.33   22.45       52.04    9.18
                Hispanic   8.15    15.93       57.04    18.89
                White      2.52    7.65        62.74    27.09

A summary table of all achievement gaps as measured by proficiency percentage
on the 2010 Spring AIMS examination can be found in Table 6. The data show that of
the 40 schools within the suburban school district the majority still exhibit large
differences in proficiency between Black/Hispanic students and Asian/White students.
For example, 29 out of the 40 schools, or 72.5%, exhibited a proficiency rate that was
at least 10% lower for Black students as compared to White students. Similarly, 62.5%
of the schools within the district showed at least a 10% proficiency gap between Hispanic
and White students.

As found in the Spring 2010 AIMS administration, schools within this district

continued to struggle with closing the achievement gap with respect to proficiency

percentage. Also detailed in Table 6, only 3 out of 40 schools within the district showed

a successful closing of the achievement gap between Black and Asian students. Of

further note is that two of these three schools had special circumstances in closing this

gap. One of the schools was a preparatory school set up by the district in order to capture

high achieving students and maximize their academic achievement. Another of these

schools had a very small proportion of Asian students and the closing of the gap was

most likely the result of increased variance amongst such a small Asian student sample.

The data in Table 6 show that achievement within the district continues to

differ between ethnic subgroups.


Table 6

Spring AIMS 2010 Subgroup Proficiency Gaps Summary of All Schools

Number of Schools with Observed Gap

Ethnicity        X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      32          5               2              1
Hispanic-Asian   27          11              1              1
Black-White      29          9               2              0
Hispanic-White   25          14              1              0

Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.
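
The binning used in Table 6 can be sketched as a small classification routine. The Python example below is illustrative and not the procedure used in the study: the function names are hypothetical, the sample gaps are the Black minus White proficiency differences reported above for Elementary Schools #1 through #3 and junior high #4, and the handling of values that fall exactly on a boundary (assigned to the higher bin) is an assumption, since the table does not specify it.

```python
from collections import Counter

BIN_LABELS = ("X < -10%", "-10% < X < 0%", "0% < X < 10%", "X > 10%")


def gap_bin(x):
    """Classify a gap X (comparison group minus reference group, in percentage points)."""
    if x < -10:
        return BIN_LABELS[0]
    if x < 0:
        return BIN_LABELS[1]
    if x < 10:
        return BIN_LABELS[2]
    return BIN_LABELS[3]      # boundary values go to the higher bin (assumption)


def summarize(gaps):
    """Count schools per bin and the percent of schools with a gap beyond -10 points."""
    counts = Counter(gap_bin(x) for x in gaps)
    share_large_gap = 100.0 * counts[BIN_LABELS[0]] / len(gaps)
    return counts, share_large_gap


if __name__ == "__main__":
    # Black minus White proficiency, Spring 2010, for four of the schools discussed above.
    black_white_gaps = [-21.21, -15.63, -8.11, -16.80]
    counts, share = summarize(black_white_gaps)
    for label in BIN_LABELS:
        print(f"{label:>15}: {counts[label]}")
    print(f"Schools with X < -10%: {share:.1f}%")
```

Applied to all 40 schools, the same count yields the 29 schools (29 / 40 = 72.5%) in the Black-White row of Table 6.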

The comparison of 2010 Spring AIMS data with school labels
is more revealing when the achievement gaps at each school are examined by label. Table 7
summarizes the achievement gaps observed for all excelling schools in the district. The
data in Table 7 show that of the 22 excelling schools during the 2010 school year
anywhere from 63.6% to 72.7% showed a sizeable achievement gap between
Hispanic/Black students and White/Asian students. Once again, one of the only schools
to have realized a closing of the achievement gap between these subgroups was a
specialty school designed for high academic achieving students. This school can be
observed in the gap column represented by 0% < X < 10%. Besides this school, only two
other schools that were labeled excelling demonstrated a closing of the achievement gap.
One of these schools exhibited this closing between Black and Asian students and the
other between Hispanic and Asian students. Overall, the majority of excelling schools
demonstrated persistent achievement gaps during the 2010 AIMS administration.

Table 7

Spring AIMS 2010 Subgroup Proficiency Gaps Summary of Excelling Schools

Number of Excelling Schools with Observed Gap

Ethnicity        X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%

Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.

2010 AIMS Summary – By Subject

Examining the results of the 2010 Spring AIMS administration on the school level

by subject matter gives a similar picture as the overall summary presented above. At the

vast majority of schools large achievement gaps are observed between White/Asian and

Hispanic/Black subgroups in each of the three subjects: mathematics, reading and

writing. These gaps exist across all levels of schooling within the district from the

elementary level to the high school level.

Elementary #1 exhibits disparate performance amongst ethnic subgroups when

broken down by subject. As demonstrated in Table 8, the largest achievement gap exists

between the Asian and Black students at Elementary #1 where it was 36.47% more likely


for an Asian student to show proficiency on the mathematics examination than a Black

student. One should also observe that the smallest achievement gap is observed between

White and Hispanic students in writing. A White student during the 2010 Spring AIMS

administration was only 4.34% more likely to show proficiency in writing in comparison

to a Hispanic student. Overall, Table 8 reveals that Elementary School #1 still exhibits

large performance gaps between ethnic subgroups at this school when disaggregated by

subject.

Table 8

Spring AIMS 2010 Elementary School #1 Performance by Subject and Ethnicity

Subject      Ethnicity  % FFB   % Approach  % Meets  % Exceeds
Mathematics  Asian      0       2.82        19.72    77.46
             Black      10.71   28.57       35.71    25.00
             Hispanic   11.63   13.95       50.00    24.42
             White      3.98    9.48        36.39    50.15
Reading      Asian      0       1.41        53.52    45.07
             Black      3.57    28.57       64.29    3.57
             Hispanic   4.65    17.44       63.95    13.95
             White      0.91    8.23        60.37    30.49
Writing      Asian      0       0           47.37    52.63
             Black      0       21.43       64.29    14.29
             Hispanic   4.35    13.04       63.04    19.57
             White      0.62    12.42       62.73    24.22


Information provided in Table 9 shows that similar patterns exist at the junior high
level with respect to the achievement gap across subgroups. Junior high #4 displayed
issues that were seen across most of the junior highs within the district. As one can
observe in Table 9, Black/Hispanic proficiency rates were lower than White/Asian
proficiency rates across all three subjects. The largest proficiency gap is the
Asian-Black gap in mathematics where Asian students were 34.98% more likely to be
proficient at mathematics than their Black peers. The smallest
achievement gap was in writing where Black students lagged behind their Asian peers by
5.38% in proficiency. Table 9 clearly demonstrates the existing gaps in proficiency
between ethnicities at junior high #4.

Table 9

Spring AIMS 2010 Junior High #4 Performance by Subject and Ethnicity

Subject      Ethnicity  % FFB   % Approach  % Meets  % Exceeds
Mathematics  Asian      6.42    3.67        32.11    57.80
             Black      19.72   25.35       32.39    22.54
             Hispanic   17.11   23.03       38.82    21.05
             White      9.11    10.01       41.08    39.79
Reading      Asian      0.92    11.01       68.81    19.27
             Black      4.23    16.90       66.20    12.68
             Hispanic   3.95    14.47       73.68    7.89
             White      1.02    8.19        75.03    15.75
Writing      Asian      2.13    6.38        80.85    10.64
             Black      2.78    11.11       83.33    2.78
             Hispanic   2.47    12.35       76.54    8.64
             White      0.50    5.03        81.16    13.32

As shown in Table 10, High School #1 illustrates that the observed pattern of schools

demonstrating significant achievement gaps in subject areas continues at the high school

level. In mathematics the smallest achievement gap was between White and Hispanic

students where White students were 19.17% more likely to achieve proficiency. In

reading, the gap between White and Black students was the smallest with White students

demonstrating a rate of proficiency that was 7.67% greater. Writing exhibited the

smallest achievement gaps. The smallest of the gaps in writing existed between Hispanic

and White students with Hispanic students being 3% less likely to pass the AIMS writing

examination.


Table 10

Spring AIMS 2010 High School #1 Performance by Subject and Ethnicity

Subject      Ethnicity  % FFB   % Approach  % Meets  % Exceeds
Mathematics  Black      22.73   25.00       38.64    13.64
             Hispanic   30.77   10.26       39.74    19.23
             White      12.06   9.80        45.98    32.16
Reading      Asian      0       3.33        76.67    20.00
             Black      0       16.67       73.81    9.52
             Hispanic   2.82    21.13       66.20    9.86
             White      0.53    8.47        73.54    17.46
Writing      Asian      0       6.06        60.61    33.33
             Black      0       19.05       59.52    21.43
             Hispanic   2.90    10.14       63.77    23.19
             White      0.25    9.80        58.79    31.16

The patterns illustrated in the examination of the three schools in Table 10, Table

9 and Table 8 across the subjects of reading, writing and mathematics were consistently

observed throughout the schools in this suburban district. The summary of subject level

achievement gaps for all analyzed schools is provided in Table 11. In examining Table

11, it should be noted that at least 87.5% of all the schools within the district exhibited

lower proficiency performance amongst Hispanic/Black students than Asian/White

students in mathematics and reading. The subject where the achievement gap appears to


elude educators the most is mathematics where at least 92.5% of the district schools

showed an achievement gap amongst the ethnicities examined. Furthermore, at least

77.5% of the schools had mathematics achievement gaps that were at least a 10-

percentage point difference during the 2010 school year.

Table 11

Percent of All District Schools with Observed Gap in Mathematics, Reading and Writing

by Ethnicity for 2010 Spring AIMS Administration

Mathematics      X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      90.00%      2.50%           2.50%          5.00%
Hispanic-Asian   87.50%      5.00%           2.50%          5.00%
Black-White      90.00%      7.50%           2.50%          0.00%
Hispanic-White   77.50%      20.00%          2.50%          0.00%

Reading          X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      57.50%      32.50%          7.50%          2.50%
Hispanic-Asian   60.00%      27.50%          10.00%         2.50%
Black-White      62.50%      25.00%          10.00%         2.50%
Hispanic-White   62.50%      30.00%          7.50%          0.00%

Writing          X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      52.63%      26.32%          21.05%         0.00%
Hispanic-Asian   47.37%      26.32%          21.05%         5.26%
Black-White      33.33%      28.21%          30.77%         7.69%
Hispanic-White   35.00%      35.00%          22.50%         7.50%


When analyzing achievement gaps by subject for excelling schools (see Table 12)

the picture during the 2009-2010 school year is not much different. The vast majority of

excelling schools within the school district showed significant achievement gaps in

mathematics, reading and writing. As shown in Table 12, in mathematics at least 20 of

the 22 excelling schools, or 90.9%, showed an achievement gap with respect to

proficiency. Reading and writing did not fare much better with at least 81.8% and

59.1%, respectively, of the excelling schools showing a gap. Overall, the tables show

that excelling schools within this suburban school district exhibited a distinct

achievement gap during the 2009-2010 school year.


Table 12

Number of All District Excelling Schools With Observed Gap in Mathematics, Reading

and Writing by Ethnicity for 2010 Spring AIMS Administration

Mathematics      X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%

Reading          X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%

Writing          X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%
Black-Asian      8           7               6              1
Black-White      6           7               8              1

Note. *X represents the difference between percentages of students proficient in each ethnic subgroup given in the gap column.


2011 AIMS Summary - Overall

As with the 2009-2010 AIMS data, an achievement gap during the 2011 Spring
AIMS administration was noticed at the majority of the schools in the suburban school
district with respect to proficiency. Table 13, as compared to Table 3 in the 2010 AIMS
summary, demonstrates the same pattern in achievement gaps during the 2010-2011
school year. At Elementary #1 96.06% of the Asian students were proficient on the
AIMS examination while only 82.76% and 80.20% of the Black and Hispanic students
were proficient, respectively. Similar performance gaps can be seen at Elementary
School #1 between Black and Hispanic students in comparison to White students, of whom
88.70% were proficient. Further examination of the other two
elementary schools in Table 13 shows similar results with respect to the achievement gap
amongst students of different ethnicities during the 2011 Spring AIMS administration.
At Elementary #2 Asian and White students were proficient on the exam at 95.05% and
91.60%, respectively. This compares to Black and Hispanic students being proficient on
the exam at 85.71% and 78.22%, respectively. At Elementary #3 Asian and White
students were proficient on the exam at 94.36% and 84.97%, respectively. This
compares to Black and Hispanic students being proficient on the exam at 81.96% and
74.86%, respectively. Once again, gaps in performance amongst different ethnic
subgroups are observed.


Table 13

Spring AIMS 2011 Subgroup Performance by Ethnicity and Elementary School

School                Ethnicity  % FFB   % Approach  % Meets  % Exceeds
Elementary School #1  Asian      1.97    1.97        34.48    61.58
                      Black      3.45    13.79       70.69    12.07
                      Hispanic   2.54    17.26       58.88    21.32
                      White      2.50    8.80        49.82    38.88
Elementary School #2  Asian      0.50    4.46        55.45    39.60
                      Black      3.57    10.71       61.90    23.81
                      Hispanic   4.03    17.74       60.48    17.74
                      White      1.40    7.00        57.80    33.80
Elementary School #3  Asian      0.47    5.16        44.13    50.23
                      Black      1.64    16.39       67.21    14.75
                      Hispanic   4.57    20.57       62.29    12.57
                      White      3.95    11.07       55.25    29.72

Achievement gap trends are also seen at the junior high level during the 2011

Spring AIMS administration (see Table 14). At junior high #4 85.20% and 84.08% of

Asian and White students were proficient on examinations taken. Noticeably below this

performance at junior high #4 Black and Hispanic students were 68.57% and 70.72%

proficient, respectively. The smallest gap in proficiency existed between White and

Hispanic students at junior high #4 where White students were 13.36% more likely to be

proficient than Hispanic students.


Table 14

Spring AIMS 2011 Subgroup Performance by Ethnicity and Junior High School

School          Ethnicity  % FFB   % Approach  % Meets  % Exceeds
Junior High #4  Asian      5.10    9.69        50.51    34.69
                Black      8.57    22.86       57.14    11.43
                Hispanic   10.72   18.55       55.94    14.78
                White      5.52    10.39       60.84    23.24
Junior High #5  Asian      2.74    6.16        52.74    38.36
                Black      9.35    17.27       62.59    10.79
                Hispanic   6.60    16.75       57.11    19.54
                White      4.37    8.83        58.05    28.75
Junior High #6  Asian      3.85    6.41        64.10    25.64
                Black      27.87   25.68       36.61    9.84
                Hispanic   23.11   26.32       45.49    5.08
                White      10.53   15.30       54.83    19.34

At the high school level a similar trend is noticed for the Spring 2011 AIMS

administration. As shown in Table 15, all four high schools in this suburban school

district displayed more than a 10% difference in proficiency for nearly every comparison

of White/Asian and Black/Hispanic performance. The only achievement gap that did not

show a 10% difference was at high school #1 where the gap in proficiency between

White and Hispanic students was 9.17%. The largest achievement gap between these

subgroups once again existed at High School #2 where the White-Hispanic gap was at


25.72% with respect to proficiency. All four of the high schools within this suburban

school district were labeled as excelling schools during the 2010-2011 school year.

Table 15

Spring AIMS 2011 Subgroup Performance by Ethnicity and High School

School          Ethnicity  % FFB   % Approach  % Meets  % Exceeds
High School #1  Asian      1.00    3.00        56.00    40.00
                Black      13.29   12.03       67.09    7.59
                Hispanic   9.12    14.04       63.16    13.68
                White      5.01    8.98        65.82    20.19
High School #2  Asian      2.16    5.95        49.19    42.70
                Black      12.15   16.02       64.64    7.18
                Hispanic   12.05   24.32       55.87    7.76
                White      4.05    6.60        61.73    27.62
High School #3  Asian      1.31    2.09        49.35    47.26
                Black      8.54    14.95       61.92    14.59
                Hispanic   9.78    17.53       61.25    11.44
                White      3.48    4.84        57.37    34.32
High School #4  Asian      1.72    5.17        63.79    29.31
                Black      12.90   18.55       55.65    12.90
                Hispanic   9.41    15.68       63.07    11.85
                White      4.85    6.21        64.11    24.83

A summary of all achievement gaps as measured by proficiency percentage on the

2011 Spring AIMS examination can be found in Table 16. The table shows that of the 40

schools examined within the suburban school district the majority of them still exhibit

large differences in proficiency between Black/Hispanic students and Asian/White

students. For example, 25 out of the 40 schools, or 62.5%, exhibited a gap in proficiency

that was at least 10% lower for Black students as compared to White students. Similarly,

60.0% of the schools within the district showed at least a 10% proficiency gap between

Hispanic and White students.


Table 16

Spring AIMS 2011 Subgroup Proficiency Gaps Summary of All Schools

Number of Schools with Observed Gap

Ethnicity        X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%






During the spring 2011 AIMS administration, schools within the suburban district

continued to struggle with closing the achievement gap with respect to proficiency

percentage. Results from analyzing descriptive statistics suggest that 2011 achievement

gap data remained very similar to 2010 achievement gap data. In the 2009-2010 school

year 3 out of 40 schools within the district showed a successful closing of the achievement

gap between Black and Asian students. During the 2010-2011 school year this number

reduced to 2 out of 40. The 2009-2010 school year showed two schools had closed the

Hispanic-White achievement gap but during the 2010-2011 school year this number

increased to 4. Ultimately, it must be noted that for the vast majority of schools within

this school district large achievement gaps remained during the 2010-2011 school year.

The data from the 2011 Spring AIMS administration are more revealing when

the achievement gaps at each school are examined by label. Table 17 shows a

summary of the achievement gaps observed for all excelling schools. Of the 22 excelling

schools during the 2011 school year anywhere from 54.5% to 68.1% of them showed a

sizeable achievement gap between Hispanic/Black students and White/Asian students

(see Table 17). Once again, as detailed in Table 17, one of the only schools to have

realized a closing of the achievement gap between these subgroups was a specialty school

designed for high academic achieving students. This school can be observed in the gap

column represented by 0% < X < 10%. At a minimum 20 of the 22 excelling schools in

the district showed an achievement gap between the ethnic groups being examined.

Table 17

Spring AIMS 2011 Subgroup Proficiency Gaps Summary of Excelling Schools

Number of Excelling Schools with Observed Gap

Ethnicity        X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%







2011 AIMS Summary – By Subject

Examining the results of the 2011 Spring AIMS administration on the school level

by subject matter gives a similar picture as the overall summary not disaggregated by

subject. At the vast majority of schools large achievement gaps are observed between

White/Asian and Hispanic/Black subgroups in each of the three subjects: mathematics,

reading and writing. These gaps exist across all levels of schooling within the district

from the elementary level to the high school level.

Elementary #1 exhibits disparate performance amongst ethnic subgroups when

broken down by subject. As conveyed in Table 18, the largest achievement gap exists

between the Asian and Black students at Elementary #1 where it was 15.49% more likely

for an Asian student to show proficiency on the mathematics examination than a Black

student. While this is a drastic improvement over the 2009-2010 gap it is still far from

having closed the achievement gap between these two ethnicities. Table 18 shows this

proficiency disparity and several others. One should also observe that the White and

Hispanic achievement gap in writing, which was the smallest achievement gap during the 2009-2010

school year, increased from 4.34% to 12.49%. A White student during the 2011 Spring

AIMS administration was 12.49% more likely to show proficiency in writing in

comparison to a Hispanic student. Overall, Table 18 reflects that the trend of large

performance gaps between ethnic subgroups at this school during the 2010-2011 school

year continued.


Table 18

Spring AIMS 2011 Elementary School #1 Performance by Subject and Ethnicity

Subject      Ethnicity  % FFB   % Approach  % Meets  % Exceeds
Mathematics  Black      8.33    12.50       62.50    16.67
             Hispanic   5.33    16.00       44.00    34.67
             White      4.50    8.41        32.13    54.95
Reading      Asian      1.33    0           48.00    50.67
             Black      0       8.33        79.17    12.50
             Hispanic   1.33    9.33        73.33    16.00
             White      1.20    4.20        61.26    33.33
Writing      Asian      1.89    3.77        39.62    54.72
             Black      0       30.00       70.00    0
             Hispanic   0       31.91       59.57    8.51
             White      1.14    18.29       61.71    18.86

As shown in Table 19, similar patterns exist at the junior high level with respect
to the achievement gap across subgroups. Junior high #4 displayed issues that were seen
across most of the junior highs within the district. Black/Hispanic proficiency rates
were lower than White/Asian proficiency rates
across all three subjects at junior high #4. The most significant proficiency gap is the
Asian-Black gap in mathematics where Asian students were 18.80% more likely to be
proficient at mathematics than their Black peers. Once again, while this
gap between Asian and Black students is significantly lower than in the 2009-2010 school year, it is
still far from demonstrating the requirement in NCLB of closing the achievement gap.

Table 19

Spring AIMS 2011 Junior High #4 Performance by Subject and Ethnicity

Subject       Ethnicity   Falls Far Below   Approaches   Meets    Exceeds

Mathematics   Black       17.91%            19.40        43.28    19.40
              Hispanic    24.09%            12.41        40.15    23.36
              White       10.61%            9.60         43.69    36.11

Reading       Asian       1.23%             7.41         61.73    29.63
              Black       0%                20.90        71.64    7.46
              Hispanic    2.19%             18.25        67.15    12.41
              White       1.89%             7.83         73.23    17.05

Writing       Asian       0%                20.59        70.59    8.82
              Black       7.32%             31.71        56.10    4.88
              Hispanic    1.41%             30.99        64.79    2.82
              White       2.70%             16.91        70.10    10.29

As shown in Table 20, High School #1 shows that the observed pattern of

achievement gaps at the subject level continues in the district for high schools. In

mathematics the smallest achievement gap was between White and Hispanic students

where White students were 12.44% more likely to achieve proficiency. As detailed in

Table 20, the gap between White and Black students in reading was the smallest with


White students demonstrating a rate of proficiency that was 6.15% greater. Writing

exhibited the smallest achievement gaps. The smallest of the gaps in writing existed

between Hispanic and White students with Hispanic students being 7.52% less likely to

pass the AIMS writing examination. However, this gap in writing proficiency between

Hispanic and White students was more than double the 2009-2010 gap. The achievement

gap rates demonstrated by high school #1 help confirm the trends observed in the subject

level analysis of school data.

Table 20

Spring AIMS 2011 High School #1 Performance by Subject and Ethnicity

Subject       Ethnicity   Falls Far Below   Approaches   Meets    Exceeds

Mathematics   Asian       0%                0%           43.75%   56.25%
              Black       28.57%            7.14         51.79    12.50
              Hispanic    23.23%            12.12        46.46    18.18
              White       11.35%            11.57        47.60    29.48

Reading       Asian       0%                0            54.55    45.45
              Black       3.92%             9.80         80.39    5.88
              Hispanic    2.11%             12.63        70.53    14.74
              White       1.78%             5.79         72.38    20.04

Writing       Asian       2.86%             8.57         68.57    20.00
              Black       5.88%             19.61        70.59    3.92
              Hispanic    1.10%             17.58        73.63    7.69
              White       1.63%             9.53         78.37    10.47


The patterns illustrated in the examination of the three schools across the subjects

of reading, writing and mathematics were consistently observed throughout the schools in

this suburban district. The summary of all analyzed schools for the 2011 AIMS

administration is provided in Table 21. In examining Table 21, it should be noted that

at least 67.5% of all the schools within the district exhibited lower proficiency

performance amongst Hispanic/Black students than Asian/White students in

mathematics, reading and writing. The subject where the achievement gap appears to

elude educators the most is mathematics where at least 90.0% of the district schools

showed an achievement gap. Furthermore, at least 67.5% of the schools had mathematics

achievement gaps that were at least a 10-percentage point difference during the 2011

school year.

Table 21

Percent of All District Schools with Observed Gap in Mathematics, Reading and Writing

by Ethnicity for 2011 Spring AIMS Administration.

                  X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%

Mathematics
Black-Asian       87.50%      10.00           2.50           0.00
Hispanic-Asian    80.00%      12.50           7.50           0.00
Black-White       70.00%      22.50           5.00           2.50
Hispanic-White    67.50%      22.50           7.50           2.50

Reading
Black-Asian       60.00%      30.00           7.50           2.50
Hispanic-Asian    57.50%      32.50           7.50           2.50
Black-White       50.00%      30.00           15.00          5.00
Hispanic-White    40.00%      47.50           10.00          2.50

Writing
Black-Asian       71.79%      10.26           7.69           10.26
Hispanic-Asian    71.79%      17.95           7.69           2.56
Black-White       45.00%      22.50           22.50          10.00
Hispanic-White    57.50%      27.50           12.50          2.50


When analyzing achievement gaps by subject for excelling schools the picture

during the 2010-2011 school year is not much different than the prior year (see Table 22).

The vast majority of excelling schools within the school district showed achievement

gaps in mathematics, reading and writing. Table 22 also demonstrates that in

mathematics at least 19 of the 22 excelling schools, or 86.4%, showed an achievement

gap with respect to proficiency. Reading and writing had comparable results with at least

86.4% and 72.7%, respectively, of the excelling schools showing a gap.


Table 22

Number of All District Excelling Schools With Observed Gap in Mathematics, Reading

and Writing by Ethnicity for 2011 Spring AIMS Administration.

                  X* < -10%   -10% < X < 0%   0% < X < 10%   X > 10%

Mathematics

Reading

Writing
Black-White       9           7               5              1





Summary of AIMS Proficiency Data from Spring 2010 and Spring 2011

After examining proficiency percentages amongst the four ethnic subgroups for

the 40 schools within the suburban school district it is evident that despite school labels

the closing of the achievement gap continues to elude a wide majority of the schools.

Each school exhibited minor differences in the achievement gap but upon compilation the

data suggest that whether a school is excelling, or highly performing, there appear to be

distinct proficiency performance differences between ethnic subgroups at these schools.

In fact, at a minimum 59.1% of the excelling schools during each of the 2010 and 2011

school years had lower proficiency performance amongst Black or Hispanic students in

comparison to White or Asian students in mathematics, reading or writing. Furthermore,

in this suburban school district, where approximately 75% of the schools receive one of

the two highest labels from ADE, at least 60% of the schools still exhibit achievement

gaps of at least 10 percentage points amongst academically low-performing ethnic subgroups.

Analysis of the Achievement Gap Using AIMS Scale Score

The analysis of the achievement gap using the average scale score on the AIMS

examination during the 2010 and 2011 AIMS test administration follows. This analysis

addresses the second research question of:

2. Is the average student achievement, as measured by average scale score, in

ethnic subgroups different for the 2009-2010 and 2010-2011 AIMS

examinations at each non-alternative school throughout the suburban school

district?


The analysis of this research question is divided into two sections. The first section

provides an analysis of ANOVA results for the 2010 and 2011 school years. The second

section provides a summary of this analysis.

2010 and 2011 Achievement Gap Analyzed through Average Scale Score

After examining the achievement gap relative to the percentage of students that

met proficiency at a given school the next question in the study was to examine the

achievement gap with respect to the average scale score at a given school. The AIMS

examination results in students receiving two types of scores: a raw score and a scale

score. A raw score is simply the number of questions the student answered correctly on the

exam. The scale score is a transformation of the raw score such that comparisons across

different versions of the test can be made. Essentially, by horizontally scaling an

examination a 6th grade student that took version H of the mathematics exam can be

compared to a 6th grade student that took version G of the mathematics exam. Using the

scaled scores for different ethnic subgroups within each school to analyze the

achievement gap examines the validity of the results drawn in the first part of Chapter 4.

Originally it was believed that average scale score for students of one ethnic

subgroup at a school could be compared to the average scale score of all other ethnic

subgroups at a school using an ANOVA. Unfortunately, upon initially examining

average scale scores it was evident that the process would not be this simple. As noted in

Table 23, average scale scores in mathematics appeared to increase with each grade level

throughout the district. The table shows that across the 7 years in which the AIMS

mathematics examination is administered the average scale score increases from 384.37

to 518.30.


Table 23

Average Scale Score throughout the District on 2010 AIMS Mathematics Administration

Grade    Average

3rd 384.37

4th 401.23

5th 410.85

6th 428.72

7th 442.92

8th 453.59

10th 518.30

This suggested that while the AIMS examination was horizontally scaled the

scaling across grade levels was not equivalent. As a result, if a school had a

disproportionate population that included a higher percentage of sixth grade Black

students than third grade Black students the average scale score for Black students would

be skewed by the lack of vertical scaling. Consequently, results of the ANOVA could

possibly reflect the grade location of ethnic students rather than their performance.
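The concern can be illustrated with a small sketch. It uses the third and sixth grade district averages from Table 23 and two hypothetical subgroups whose students all score exactly at their grade's district average, so that the subgroups perform identically within each grade. The enrollment counts are invented for illustration only.

```python
# District-wide average mathematics scale scores taken from Table 23.
district_avg = {"3rd": 384.37, "6th": 428.72}


def subgroup_mean(counts: dict) -> float:
    """Mean scale score when every student scores exactly the grade average."""
    total = sum(counts.values())
    return sum(district_avg[grade] * n for grade, n in counts.items()) / total


# Hypothetical enrollments: identical performance, different grade mix.
group_a = {"3rd": 20, "6th": 80}
group_b = {"3rd": 80, "6th": 20}

# Any difference here comes only from where the students sit by grade.
spurious_gap = subgroup_mean(group_a) - subgroup_mean(group_b)
print(round(spurious_gap, 2))
```

In this sketch the two subgroups differ by more than 26 scale score points even though neither outperforms the other in any grade, which is the distortion the raw averages could introduce into an ANOVA.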

One way to account for the lack of vertical scaling across grade-levels was to

transform the data so that every student’s score is on the same scale. Using a simple

linear transformation, x_i = zσ + µ (equivalently, z = (x_i − µ)/σ), all of the scale scores for each student in the district

were transformed into a z-score based on the average scale score and the standard

deviation of scale score for every grade level. The advantage of using this linear

transformation process is that the original distribution is preserved and the distribution of


scores amongst ethnic subgroups is also preserved. After performing the linear

transformation on both the 2010 and 2011 AIMS mathematics and reading scores the

ANOVAs were performed on the z-scores transformed from the average scale score.
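A minimal sketch of the within-grade standardization described above follows. The records are hypothetical, and the use of the population standard deviation (`pstdev`) is an assumption, since the text does not state whether a population or sample standard deviation was used.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_grade(records):
    """Convert (grade, scale_score) pairs into (grade, z_score) pairs.

    Each score is standardized against the mean and standard deviation
    of its own grade level, so z = (x - mu_grade) / sigma_grade.
    """
    by_grade = defaultdict(list)
    for grade, score in records:
        by_grade[grade].append(score)
    # Assumption: population standard deviation within each grade.
    stats = {g: (mean(s), pstdev(s)) for g, s in by_grade.items()}
    return [(g, (x - stats[g][0]) / stats[g][1]) for g, x in records]


# Hypothetical scale scores for two grade levels.
records = [
    ("3rd", 360), ("3rd", 384), ("3rd", 408),
    ("6th", 400), ("6th", 430), ("6th", 460),
]
z_scores = standardize_by_grade(records)
for grade, z in z_scores:
    print(grade, round(z, 3))
```

Students at the same relative position within their grade receive the same z-score, regardless of how the scale scores differ from grade to grade.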

In total there were eighty ANOVAs performed on the AIMS data for the 2010 and

2011 school years. Each of the 40 schools had one ANOVA for the 2010 school year and

one ANOVA for the 2011 school year. Transforming mathematics and reading scores

into a z-score allowed for the comparison of mathematics and reading z-scores in

conjunction with each other. A z-score of 1 in mathematics represents the same thing as

a z-score of 1 in reading, which is a student scoring one standard deviation above the

mean on each respective subject test. As a result the ANOVA was analyzed using the

mathematics and reading z-scores as the dependent variable and ethnicity (Black,

Hispanic, White, Asian) as the factor.
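The F statistic behind each of these one-way ANOVAs can be sketched as follows. The z-scores are hypothetical and the subgroup labels are placeholders; the study's own computations were presumably run in a statistics package, which the text does not name here.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical z-scores for four subgroups at one school.
school = {
    "subgroup_1": [0.5, 1.0, 1.5],
    "subgroup_2": [-0.5, 0.0, 0.5],
    "subgroup_3": [-0.5, 0.0, 0.5],
    "subgroup_4": [-1.5, -1.0, -0.5],
}
f_statistic = one_way_anova_f(list(school.values()))
print(f_statistic)
```

With four groups and twelve observations the statistic is referred to an F distribution with 3 and 8 degrees of freedom; the p-value step is left out of this sketch.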

When performing an ANOVA it is important to analyze the data to see if any of

the conditions have been violated. The three main conditions for ANOVAs include:

normality of observations within each factor, homogeneity of variances across each factor

and independence of observations across each factor. The most important of the

assumptions when running an ANOVA is independence. Stevens (2007, p. 59) states that

independence is “by far the most important assumption, for even a small violation of it

produces a substantial effect on both the level of significance and the power of the F

statistic.” Fortunately, in the case of this research independence has not been violated.

When evaluating independence, Glass and Hopkins (1984, p. 353) issued the statement

that, “whenever the treatment is individually administered, observations are independent.

But where treatments involve interaction among persons…the observations may


influence each other.” In the case of the AIMS examination the treatment, or test, is

individually administered. Consequently, according to Glass and Hopkins the

observations on the AIMS examination must be independent of one another for each

individual student.

The second most important of the assumptions in an ANOVA is the assumption

of homogeneity of the population variance across ethnicities. Levene’s test on

homogeneity of variance provided analysis with respect to this assumption. During the

2010 school year 11 of the 40 schools exhibited heterogeneity of variances according to

Levene’s test (α = 0.05). During the 2011 school year 6 of the 40 schools exhibited

heterogeneity of variances according to Levene’s test (α = 0.05). Although Levene’s test for

these schools showed cause for concern with heterogeneity, the violation of the

assumption is not as problematic unless group sizes are sharply unequal (Stevens, 2007, p. 58).

Unfortunately, as listed in Table 24, in the case of the 11 schools from 2010 and the 6

schools from 2011 the ratios of the largest group size to the smallest group size all were

in excess of 1.5. As a result, the corresponding ANOVAs for these schools were not

analyzed for the purposes of this research.
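The screening rule described above combines two checks: Levene's test for unequal variances and the ratio of the largest to the smallest group size. The sketch below computes the Levene statistic in its mean-centered form, which is an assumption because the text does not say which variant was used, and applies the 1.5 ratio cutoff named in the text. The data are hypothetical and the p-value step is omitted.

```python
from statistics import mean


def one_way_f(groups):
    """One-way ANOVA F statistic for a list of samples."""
    values = [x for g in groups for x in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def levene_w(groups):
    """Levene statistic: an ANOVA on absolute deviations from each group mean."""
    deviations = [[abs(x - mean(g)) for x in g] for g in groups]
    return one_way_f(deviations)


def size_ratio(groups):
    """Ratio of the largest group size to the smallest group size."""
    sizes = [len(g) for g in groups]
    return max(sizes) / min(sizes)


# Hypothetical z-scores for two subgroups of unequal size.
sample_groups = [[-0.4, 0.1, 0.6], [-1.2, -0.3, 0.2, 0.9, 1.4, -0.6]]
print(round(levene_w(sample_groups), 3))
# The text treats heterogeneity as a problem when this ratio exceeds 1.5.
print(size_ratio(sample_groups) > 1.5)
```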


Table 24

Levene’s Test for Homogeneity of Variance P-Values for Each School across Ethnicities

with Respect to Average Scale Score

School Number     2010 Levene’s   2010 Largest/Smallest   2011 Levene’s   2011 Largest/Smallest
                  P-Value         Group Ratio             P-Value         Group Ratio

Elementary #1     0.428                                   0.496
High School #1    0.359                                   0.710
Junior High #1    0.205                                   0.129
High School #2    0.037           6.147                   0.001           5.565
Elementary #5     0.020           17.632                  0.680
Elementary #7     0.178                                   0.025           8.222
Junior High #2    0.136                                   0.691
Elementary #9     0.025           4.940                   0.029           4.149
Elementary #10    0.120                                   0.228
Elementary #11    0.472                                   0.784
Elementary #12    0.055                                   0.332
Elementary #13    0.333                                   0.100
High School #3    0.951                                   0.001           4.423
Special #1        0.072                                   0.055
Elementary #14    0.896                                   0.693
Elementary #15    0.205                                   0.346
Elementary #16    0.270                                   0.124
Elementary #17    0.308                                   0.888
Junior High #3    0.677                                   0.462
Elementary #18    0.964                                   0.435
Elementary #19    0.004           11.154                  0.549
Elementary #20    0.923                                   0.171
High School #4    0.443                                   0.130
Elementary #21    0.134                                   0.409
Elementary #22    0.239                                   0.587
Elementary #23    0.213                                   0.182
Elementary #24    0.113                                   0.961
Elementary #25    0.589                                   0.417
Elementary #26    0.042           13.726                  0.192
Junior High #4    0.005           10.986                  0.023           10.934
Elementary #27    0.046           10.294                  0.066
Elementary #28    0.016           8.881                   0.250
Elementary #29    0.033           11.520                  0.361
Junior High #5    0.000           11.293                  0.287
Junior High #6    0.008           13.875                  0.001           12.424

The final assumption for an ANOVA is the assumption of normality within each

group. Kolmogorov-Smirnov tests (α = 0.05) were run on each subgroup within each

school for the 2010 and 2011 school years to determine whether the distribution of scale

scores was approximately normal. As shown in Table 25, during the 2010 AIMS

administration, 106 of the 160 tests failed to reject normality as a reasonable assumption

for each group distribution. In 2011, 105 of the 160 tests failed to reject normality as a

reasonable assumption for each group’s distribution. One might become concerned about

the 54 groups in 2010 and the 55 groups in 2011 which violated the assumption of

normality. However, a summary by Glass, Peckham and Sanders (1972) on research

conducted studying the effects of non-normality on an ANOVA shows that non-

normality only has a slight effect on Type I errors. Stevens (2007, p. 57) states, “the F

statistic is robust with respect to the normality assumption.” As a result, even though the

normality assumption was violated in approximately 34% of the group distributions, as

shown in Table 25, it should have a negligible effect on the Type I error rate and power of

the ANOVAs due to the robust nature of the F statistic when encountering non-normality.
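The normality check can be sketched as the Kolmogorov-Smirnov distance between a subgroup's empirical distribution and a normal curve fitted to the same data. The sample below is hypothetical, and the conversion of the distance to a p-value is omitted because it depends on the correction applied by the software, which the text does not specify. The last lines reproduce the approximate 34% figure from the counts given in the text.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic_normal(sample):
    """Largest gap between the empirical CDF and a fitted normal CDF."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    d = 0.0
    for i, x in enumerate(xs):
        cdf = fitted.cdf(x)
        # Compare against the empirical CDF just after and just before x.
        d = max(d, (i + 1) / n - cdf, cdf - i / n)
    return d


# Hypothetical z-scores for one subgroup at one school.
sample = [-1.3, -0.7, -0.2, 0.0, 0.1, 0.4, 0.8, 1.5]
print(round(ks_statistic_normal(sample), 3))

# Counts from the text: 54 of 160 groups in 2010 and 55 of 160 in 2011.
share_violating = (54 + 55) / (160 + 160)
print(round(100 * share_violating, 1))
```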

Table 25

Kolmogorov-Smirnov (KS) P-Values for Normality for 2010 and 2011 AIMS

Distributions by Ethnicity

School Number   2010 KS White   2010 KS Hispanic   2010 KS Black   2010 KS Asian   2011 KS White   2011 KS Hispanic   2011 KS Black   2011 KS Asian

E #1 0.000 0.200 0.200 0.200 0.038 0.200 0.000 0.085

E #2 0.002 0.031 0.200 0.200 0.000 0.004 0.073 0.088

E #3 0.000 0.001 0.200 0.057 0.001 0.200 0.200 0.039

HS #1 0.001 0.187 0.060 0.200 0.000 0.063 0.200 0.200

JH #1 0.000 0.014 0.200 0.022 0.000 0.200 0.080 0.078

HS #2 0.000 0.010 0.200 0.200 0.000 0.200 0.024 0.200

E #4 0.011 0.200 0.200 0.200 0.034 0.031 0.200 0.200

E #5 0.200 0.200 0.200 0.200 0.000 0.022 0.200 0.001

E #6 0.000 0.200 0.048 0.019 0.000 0.013 0.170 0.029

E #7 0.000 0.200 0.200 0.200 0.000 0.096 0.200 0.200

JH #2 0.018 0.200 0.200 0.083 0.001 0.039 0.200 0.010

E #8 0.002 0.200 0.200 0.200 0.030 0.018 0.200 0.200

E #9 0.200 0.007 0.200 0.041 0.054 0.017 0.200 0.002

E #10 0.200 0.030 0.200 0.112 0.197 0.200 0.033 0.200


E #11 0.200 0.200 0.200 0.200 0.087 0.060 0.200 0.200

E #12 0.200 0.041 0.200 0.200 0.200 0.157 0.200 0.200

E #13 0.049 0.034 0.200 0.200 0.200 0.039 0.200 0.200

HS #3 0.000 0.200 0.200 0.050 0.000 0.161 0.200 0.004

S #1 0.065 0.000 0.200 0.001 0.001 0.200 0.200 0.200

E #14 0.200 0.200 0.200 0.200 0.024 0.156 0.200 0.200

E #15 0.038 0.087 0.017 0.200 0.001 0.026 0.200 0.061

E #16 0.007 0.007 0.200 0.200 0.003 0.018 0.200 0.091

E #17 0.200 0.049 0.200 0.200 0.013 0.200 0.011 0.078

JH #3 0.057 0.008 0.200 0.080 0.002 0.151 0.036 0.075

E #18 0.001 0.200 0.156 0.200 0.069 0.200 0.200 0.200

E #19 0.200 0.200 0.178 0.162 0.200 0.022 0.098 0.200

E #20 0.009 0.200 0.200 0.200 0.200 0.200 0.087 0.200

HS #4 0.000 0.000 0.200 0.200 0.000 0.200 0.200 0.200

E #21 0.000 0.025 0.200 0.200 0.000 0.190 0.038 0.200

E #22 0.000 0.031 0.200 0.200 0.000 0.177 0.200 0.200

E #23 0.200 0.200 0.200 0.200 0.200 0.027 0.200 0.200

E #24 0.200 0.200 0.200 0.200 0.200 0.002 0.181 0.200


E #25 0.063 0.200 0.200 0.200 0.020 0.200 0.200 0.200

E #26 0.001 0.200 0.200 0.200 0.000 0.200 0.200 0.084

JH #4 0.001 0.200 0.200 0.045 0.000 0.200 0.200 0.200

E #27 0.001 0.200 0.192 0.049 0.200 0.200 0.200 0.200

E #28 0.090 0.200 0.200 0.200 0.002 0.200 0.157 0.200

E #29 0.000 0.200 0.200 0.013 0.016 0.200 0.200 0.099

JH #5 0.000 0.004 0.200 0.024 0.000 0.200 0.200 0.200

JH #6 0.000 0.000 0.200 0.200 0.199 0.010 0.002 0.200

After taking into account all of the required assumptions further analysis of the

ANOVA was performed on the remaining schools that did not violate the critical

assumptions. This included 29 schools from 2010 and 34 schools from 2011. The results

from the 2010 and 2011 ANOVA for these schools are shared in Table 26.

As reported in Table 26, of the 29 schools that did not violate the assumptions of

the ANOVA 27 of these schools had statistically significant ANOVA results (α = 0.06).

These results suggest that between the ethnicities of Asian, Black, Hispanic and White 27

of the schools exhibited a difference between average scale score in at least two of the

ethnic subgroups. Of the 27 schools in 2010 that exhibited a statistically significant

difference in average scale score between at least two ethnic subgroups 16 were

excelling, 6 were highly performing, 1 was performing plus and 4 were performing.


During the 2010 school year 30 schools in the district received the two highest school

labels issued by ADE with 22 being excelling and 8 being highly performing. 73.3% of

these 30 schools still exhibited an achievement gap, as measured by average scale score,

between at least two of the four ethnic subgroups studied. Furthermore, 72.7% of the 22

schools in the district that were labeled excelling still exhibited an achievement gap. The

examination of achievement gap with respect to average scale score in 2010 returned

results similar to those seen in the comparison of proficiency percentage in the prior analysis

within Chapter 4 for the first research question.
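The two percentages in this paragraph follow directly from the counts reported above. A short check, using only figures stated in the text (the helper name is illustrative):

```python
def percent(part: int, whole: int) -> float:
    """Share of `whole` represented by `part`, as a percentage to one decimal."""
    return round(100 * part / whole, 1)


# Schools with a significant 2010 ANOVA, by label, as reported in the text.
excelling_with_gap = 16
highly_performing_with_gap = 6

# Schools receiving each of the two highest labels in 2010.
excelling_total = 22
highly_performing_total = 8

top_label_share = percent(
    excelling_with_gap + highly_performing_with_gap,
    excelling_total + highly_performing_total,
)
excelling_share = percent(excelling_with_gap, excelling_total)
print(top_label_share, excelling_share)
```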

Table 26

Results from the 2010 ANOVA for 29 District-Wide Schools that did not violate the

Assumptions of the ANOVA

School    F-Statistic    P-Value

Elementary #1 48.254 0.000

Elementary #2 20.291 0.000

Elementary #3 25.073 0.000

High School #1 9.459 0.000

Junior High #1 82.894 0.000



Elementary #7 10.561 0.000

Junior High #2 4.041 0.008


Elementary #8 15.168 0.000

Elementary #10 6.675 0.000

Elementary #11 10.484 0.000

Elementary #12 1.668 0.172

Elementary #13 7.969 0.000

High School #3 59.319 0.000

Special #1 4.923 0.002

Elementary #14 2.748 0.042

Elementary #15 7.453 0.000

Elementary #16 19.781 0.000

Elementary #17 10.932 0.000

Junior High #3 88.497 0.000

Elementary #18 11.188 0.000

Elementary #20 9.207 0.000

High School #4 13.877 0.000

Elementary #21 3.760 0.011

Elementary #22 20.083 0.000

Elementary #23 14.863 0.000

Elementary #24 1.746 0.157

Elementary #25 50.360 0.000


The results from the 2011 analysis of variance remained consistent with the

results from 2010. After taking into account the assumptions

of the ANOVA, 34 schools were left to analyze. The results from the 2011 ANOVA are

presented in Table 27.

As reported in Table 27, of the 34 schools that did not violate the assumptions of

the ANOVA 32 of these schools had statistically significant ANOVA results (α = 0.06).

These results suggest that between the ethnicities of Asian, Black, Hispanic and White 32

of the schools exhibited a difference between average scale score in at least two of the

ethnic subgroups. Of the 32 schools in 2011 that exhibited a statistically significant

difference in average scale score between at least two ethnic subgroups 19 were

excelling, 4 were highly performing, 5 were performing plus and 4 were performing.

During the 2011 school year 29 schools in the district received the two highest school

labels issued by ADE with 22 being excelling and 7 being highly performing. 79.3% of

these 29 schools still exhibited an achievement gap, as measured by average scale score,

between at least two of the four ethnic subgroups studied. Furthermore, 86.4% of the 22

schools in the district that were labeled excelling still exhibited an achievement gap. In

conjunction with the AZ Learns Legacy labels the ADE also issued during the 2011

school year a new label called AZ Learns A-F letter grades. Throughout the district 28

schools received the top letter grades of A or B with 18 schools receiving the letter grade

of A. 78.6% of these 28 schools still exhibited an achievement gap between at least two

of the four ethnic subgroups as measured by average scale score. Similarly, 83.3% of the

18 schools that received an A still exhibited an achievement gap.

Table 27


Results from the 2011 ANOVA for 34 District-Wide Schools that did not violate the

Assumptions of the ANOVA

School    F-Statistic    P-Value

Elementary #1 29.157 0.000

Elementary #2 11.499 0.000

Elementary #3 19.417 0.000

High School #1 26.203 0.000

Junior High #1 124.766 0.000



Elementary #6 12.773 0.000

Junior High #2 9.985 0.000

Elementary #8 16.801 0.000

Elementary #10 8.295 0.000

Elementary #11 6.573 0.000

Elementary #12 10.708 0.000

Elementary #13 2.112 0.098

Special #1 13.898 0.000

Elementary #14 7.790 0.000

Elementary #15 7.531 0.000


Elementary #16 20.204 0.000

Elementary #17 4.585 0.003

Junior High #3 68.232 0.000

Elementary #18 16.650 0.000

Elementary #19 2.700 0.045

Elementary #20 6.522 0.000

High School #4 20.652 0.000

Elementary #21 3.918 0.009

Elementary #22 8.342 0.000

Elementary #23 26.462 0.000

Elementary #24 1.417 0.237

Elementary #25 35.985 0.000

Elementary #26 26.754 0.000

Elementary #27 28.745 0.000

Elementary #28 66.337 0.000

Elementary #29 15.898 0.000

Junior High #5 26.891 0.000

For the 27 and 32 schools during the 2010 and 2011 school years, respectively,

for which the ANOVA tests showed an achievement gap further analysis was performed

using Tukey HSD to discover which specific ethnicities demonstrated a significant


achievement gap. During the 2010 school year the results of Tukey HSD, α = 0.01

accounting for the Bonferroni effect, for the 27 schools which had a significant F-statistic

from the ANOVA are displayed in Table 28.

Of the 27 schools where the ANOVA showed a significant achievement gap 17 of

these schools exhibited an achievement gap as measured by average scale score between

White and Hispanic students (see Table 28). Furthermore, Tukey HSD showed that 18 of

these schools showed a significant gap in performance between White and Black

students. Although the gaps were not as frequent at schools in comparison to Asian

students they did exist. A significant gap between Hispanic and Black students in

comparison to Asian students existed at 14 and 16 of these schools, respectively. Of the

30 schools that received one of the two highest labels by ADE almost half, 14, still

showed a significant gap between White and Hispanic students at their school. More

than half of the 30 schools, 16, show a significant performance gap with respect to

average scale score between White and Black students at their school. The Hispanic and

Black performance achievement gaps on AIMS with respect to Asian performance fared

similar amongst these 30 schools with a distinguished label from ADE. The number of

the distinguishably labeled schools with Hispanic and Black achievement gaps in

comparison to Asian students was 12 and 13 respectively. Ultimately, somewhere

between 40% and 53.3% of the schools labeled distinguished by ADE still exhibited an

achievement gap between Hispanic and Black students in comparison to White and Asian

students as examined by Tukey HSD.
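The school counts in this paragraph can be recovered by comparing each pairwise Tukey HSD p-value in Table 28 against α = 0.01. The sketch below does so for the White-Hispanic and White-Black columns, with the values listed in the table's row order. One possible reading of the α values, offered only as an assumption, is that 0.01 results from dividing the 0.06 used for the ANOVAs across the six pairwise comparisons.

```python
ALPHA = 0.01  # per-comparison significance level stated in the text

# White-Hispanic (W-H) p-values from Table 28, in row order.
white_hispanic = [
    0.000, 0.000, 0.000, 0.004, 0.000, 0.052, 0.342, 0.168, 0.497,
    0.000, 0.001, 0.024, 0.000, 0.000, 0.871, 0.627, 0.042, 0.179,
    0.000, 0.000, 0.000, 0.000, 0.000, 0.129, 0.000, 0.000, 0.000,
]

# White-Black (W-B) p-values from Table 28, in row order.
white_black = [
    0.000, 0.000, 0.000, 0.003, 0.000, 0.001, 0.001, 0.000, 0.769,
    0.872, 0.892, 0.000, 0.008, 0.000, 0.335, 0.120, 0.000, 0.000,
    0.014, 0.000, 0.028, 0.002, 0.000, 0.034, 0.589, 0.000, 0.000,
]


def count_significant(p_values, alpha=ALPHA):
    """Number of schools whose pairwise p-value falls below alpha."""
    return sum(1 for p in p_values if p < alpha)


wh_count = count_significant(white_hispanic)
wb_count = count_significant(white_black)
print(wh_count, wb_count)  # 17 and 18, as reported in the text
```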


Table 28

2010 Results for Tukey HSD – Post Hoc Tests

School    W-H    W-B    W-A    H-B    H-A    B-A

Elementary #1 0.000 0.000 0.000 0.213 0.000 0.000

Elementary #2 0.000 0.000 0.003 0.994 0.000 0.000

Elementary #3 0.000 0.000 0.000 0.986 0.000 0.000

High School #1 0.004 0.003 0.166 0.905 0.001 0.000

Junior High #1 0.000 0.000 0.000 0.302 0.000 0.000

Elementary #4 0.052 0.001 0.444 0.327 0.011 0.000

Elementary #6 0.342 0.001 0.961 0.282 0.836 0.058

Elementary #7 0.168 0.000 0.627 0.033 0.075 0.000

Junior High #2 0.497 0.769 0.026 0.294 0.008 0.692

Elementary #8 0.000 0.872 0.620 0.763 0.996 0.968

Elementary #10 0.001 0.892 0.794 0.457 0.032 0.579

Elementary #11 0.024 0.000 0.279 0.061 0.001 0.000

Elementary #13 0.000 0.008 0.995 0.895 0.023 0.119

High School #3 0.000 0.000 0.003 0.973 0.000 0.000

Special #1 0.871 0.335 0.013 0.671 0.035 0.023

Elementary #14 0.627 0.120 0.618 0.268 0.263 0.052

Elementary #15 0.042 0.000 0.963 0.302 0.124 0.003



Elementary #16 0.179 0.000 0.000 0.003 0.000 0.000

Elementary #17 0.000 0.014 0.438 0.904 0.000 0.003

Junior High #3 0.000 0.000 0.103 0.256 0.000 0.000

Elementary #18 0.000 0.028 0.070 0.860 0.001 0.001

Elementary #20 0.000 0.002 0.963 0.871 0.012 0.010

High School #4 0.000 0.000 0.965 0.026 0.137 0.000

Elementary #21 0.129 0.034 0.999 0.798 0.648 0.276

Elementary #22 0.000 0.589 0.744 0.002 0.000 0.321

Elementary #23 0.000 0.000 0.992 0.998 0.020 0.075

Elementary #25 0.000 0.000 0.000 0.660 0.000 0.000

Note. (α = 0.01)

During the 2011 school year the results of Tukey HSD, α = 0.01 accounting for

the Bonferroni effect, for the 32 schools which had a significant F-statistic from the

ANOVA are displayed in Table 29. Of the 32 schools where the ANOVA showed a

significant achievement gap 25 of these schools exhibited an achievement gap as

measured by average scale score between White and Hispanic students (see Table 29).

Furthermore, Tukey HSD showed that 18 of these schools showed a significant gap in

performance between White and Black students. Although the gaps were not as frequent

at schools in comparison to Asian students they did exist. A significant gap between

Hispanic and Black students in comparison to Asian students existed at 22 and 18 of


these schools, respectively. Of the 28 schools that received one of the two highest grades

by ADE more than half, 17, still showed a significant gap between White and Hispanic

students at their school. Exactly half of these 28 schools, 14, show a significant

performance gap with respect to average scale score between White and Black students at

their school. The Hispanic and Black performance achievement gaps on AIMS with

respect to Asian performance fared similarly amongst these 28 schools with

distinguished grades from ADE. The number of schools with distinguished letter grades and

Hispanic and Black achievement gaps in comparison to Asian students was 17 and 15,

respectively. Ultimately, as shown in Table 29, somewhere between 50% and 60.7% of

the schools graded distinguished by ADE still exhibited an achievement gap between

Hispanic and Black students in comparison to White and Asian students as examined by

Tukey HSD.

Table 29

2011 Results for Tukey HSD – Post Hoc Tests

School    W-H    W-B    W-A    H-B    H-A    B-A


Elementary #1 0.000 0.002 0.000 0.945 0.000 0.000

Elementary #2 0.000 0.327 0.265 0.167 0.000 0.050

Elementary #3 0.001 0.238 0.000 0.975 0.000 0.000

High School #1 0.000 0.000 0.000 0.384 0.000 0.000

Junior High #1 0.000 0.000 0.000 0.135 0.000 0.000

Elementary #4 0.010 0.071 0.122 0.954 0.000 0.003



Elementary #5 0.004 0.004 0.293 0.983 0.000 0.000

Elementary #6 0.789 0.000 0.504 0.000 0.987 0.001

Junior High #2 0.072 0.135 0.001 0.998 0.000 0.000

Elementary #8 0.000 0.006 1.000 0.999 0.150 0.285

Elementary #10 0.000 0.072 0.947 0.997 0.297 0.581

Elementary #11 0.033 0.043 0.273 0.957 0.003 0.003

Elementary #12 0.000 0.000 0.000 0.889 0.961 0.999

Special #1 1.000 0.219 0.000 0.336 0.002 0.000

Elementary #14 0.001 0.966 0.002 0.123 0.115 0.011

Elementary #15 0.005 0.003 0.964 0.777 0.016 0.005

Elementary #16 0.000 0.000 0.885 0.208 0.002 0.000

Elementary #17 0.004 0.157 0.955 0.998 0.320 0.505

Junior High #3 0.000 0.000 0.015 0.974 0.000 0.000

Elementary #18 0.000 0.116 0.058 0.877 0.000 0.003

Elementary #19 0.805 0.994 0.063 0.824 0.025 0.163

Elementary #20 0.000 0.915 0.727 0.225 0.003 0.606

High School #4 0.000 0.000 0.629 1.000 0.000 0.000

Elementary #21 0.025 0.191 0.722 0.998 0.951 0.935

Elementary #22 0.001 0.003 0.946 0.957 0.012 0.011



Elementary #23 0.000 0.000 0.999 0.639 0.000 0.013

Elementary #25 0.000 0.000 0.023 0.851 0.000 0.000

Elementary #26 0.000 0.000 0.024 0.080 0.000 0.000

Elementary #27 0.000 0.000 0.835 0.991 0.000 0.000

Elementary #28 0.000 0.000 0.011 0.606 0.000 0.100

Elementary #29 0.000 0.649 0.432 0.071 0.000 0.215

Junior High #5 0.000 0.000 0.010 0.342 0.000 0.000

Note. (α = 0.01)

Summary of ANOVAs and Average Scale Score

Similar to results observed in the examination of proficiency percentages,

excelling schools and highly performing schools examined through ANOVAs with

respect to average scale score continued to show significant achievement gaps. In the

2010 school year approximately 72% (see Table 11) of the schools that were

distinguishably labeled by ADE still exhibited at least one achievement gap between the

four ethnicities and in 2011 approximately 75% (see Table 21) of the highly graded

schools had similar results. Upon further analysis of post-hoc tests between 40% and

53.3% (see Table 28) of the 30 highly labeled schools during the 2010 school year

exhibited an average scale score gap between either Hispanic or Black students and

either White or Asian students. Furthermore, during the 2011 school year results were


congruent with post-hoc ANOVA tests showing between 50% and 60.7% (see Table 29)

of the 28 highly graded schools had persistent gaps similar to those observed in 2010.

Ethnicity Proportion and Z-Score

The analysis of the proportions of Asian/White students in comparison to a

school’s z-score follows. This analysis addresses the third research question of:

3. Is the percentage of Asian and White students correlated with the state-issued

z-score, a standardized score for the percent of students that exceed on the

AIMS examination at a school in a given year, which helps determine school

labels within the state of Arizona?

The analysis of this research question is divided into three sections. The first section

addresses the 2010 school year and the second addresses the 2011 school year. A summary of

findings follows these two sections.

Correlation and Linear Regression between Ethnicity Proportions and Z-Score

A relationship exists between the percentage of Asian/White students at a school

and the corresponding z-score that the school receives in the AZ LEARNS Legacy school

labeling model. The strength of the relationship has varied slightly over the 2009-2010

and 2010-2011 school years. Overall the relationship between these two variables is

strong suggesting that a z-score may be directly linked to school demographics.

One of the ways that the ADE establishes the difference between a highly

performing school and an excelling school is through standardizing the percentage of

students at a school that exceed on the AIMS examination. The standardization process

is statistically known as a z-score. The z-score is calculated by taking the proportion of

students at a given school that exceed on the AIMS exam, subtracting the average

proportion of students that exceed on the AIMS exam for each school state-wide, and then

dividing by the standard deviation of the proportions of students that exceed on the AIMS

exam for each school state-wide. Schools that have met the requirements to be highly

performing or excelling are further separated by examining the z-score.

A z-score greater than or equal to one establishes a school as excelling as opposed

to highly performing. Z-scores greater than or equal to one suggest that a school has had

enough students exceed AIMS exams so that they are at least one standard deviation

above the average proportion of exceeders for schools throughout the state. Each school

within the suburban school district being analyzed received a z-score for the 2010 and 2011

school years. As established previously in Chapter 4, 22 of the schools throughout the

district received the excelling label which means that 22 of the schools had z-scores

greater than or equal to one.
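
The standardization and the cut-off described above can be sketched in a few lines of Python. This is only an illustration: the function name, the five state-wide proportions, and the choice of the population standard deviation are assumptions made for the example, not ADE's published procedure or data.

```python
from statistics import mean, pstdev


def exceed_z_score(school_rate, statewide_rates):
    """Standardize one school's proportion of exceeders against all schools."""
    mu = mean(statewide_rates)
    # Assumption: population standard deviation; the text does not say
    # whether ADE uses the population or the sample form.
    sigma = pstdev(statewide_rates)
    return (school_rate - mu) / sigma


# Hypothetical proportions of students exceeding on AIMS (not real data).
statewide = [0.10, 0.20, 0.30, 0.40, 0.50]

z = exceed_z_score(0.50, statewide)

# A z-score of at least one separates excelling from highly performing,
# for schools that already meet the other requirements.
label = "excelling" if z >= 1 else "highly performing"
```

In this made-up example the school sits roughly 1.41 standard deviations above the state-wide mean, so it would fall on the excelling side of the cut-off.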

2010 Data. A correlation and regression analysis of 2010 data established that

during this school year there was a strong relationship between the proportion of

Asian/White students at a school and the z-score that the school received. The coefficient

of determination for the regression analysis was 0.702 (see Table 30) which suggests that

70.2% of the variability in z-score can be accounted for by the variability in the

proportion of Asian/White students at the school. A relationship as strong as this

suggests that the number of exceeders at a school is strongly related to this demographic

make-up of a school.

Table 30

Linear Correlation Summary for 2010 Z-Score regressed on Percentage of

Asian/Caucasian (ACP) Students at a School

R          R Square    Adjusted R Square    Std. Error of the Estimate    Durbin-Watson
.838(a)    .702        .694                 .5714580                      1.663

Note. a Predictors: (Constant), ACP. b Dependent Variable: z-score

Further analysis of the 2010 scatterplot and regression shows increased support

for the strong relationship between z-score and the proportion of Asian/White students at

a school (see Figure 3). The model of Y = B0 + B1X resulted in Y = -1.309 + 3.600X

(see Table 31) where Y is the predicted z-score based off of the line of conditional means

and X is the proportion of Asian/White students at a school. The slope of this model is

3.600 and the 95% confidence interval for the slope is (2.52, 4.06). At α = 0.05 the

confidence interval suggests that a slope of 0, or no relationship between these two

variables, can be statistically rejected. Consequently, when accounting for error in the

model, it appears as if the strong relationship can still be validated.
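
As a rough illustration of how the fitted 2010 line can be read, the sketch below plugs the reported coefficients into Y = B0 + B1X and squares the reported R. The function name and the example proportion of 0.50 are assumptions for illustration. The computed proportion at which the predicted z-score reaches one is simple arithmetic on the fitted line, not a rule used by ADE.

```python
# Reported 2010 values (Table 30 and Table 31).
R_2010 = 0.838
B0_2010 = -1.309   # intercept
B1_2010 = 3.600    # slope on the proportion of Asian/White students


def predicted_z(acp):
    """Predicted z-score for a school with Asian/White proportion `acp`."""
    return B0_2010 + B1_2010 * acp


# The coefficient of determination is the square of R in simple regression.
r_squared_2010 = R_2010 ** 2

# Hypothetical school where half of the students are Asian or White.
z_at_half = predicted_z(0.50)

# Proportion at which the fitted line crosses z = 1 (illustrative only).
acp_at_z_one = (1.0 - B0_2010) / B1_2010
```

Under these reported coefficients the fitted line reaches a z-score of one at a proportion of roughly 0.64, which gives a sense of scale for the slope of 3.600.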

Figure 3. Scatterplot of 2010 Z-Score versus Proportion of Asian and Caucasian

Students at a School.

Table 31

Coefficients and Standard Error of Coefficients for 2010 Z-Score Regressed onto

Percentage of Asian/Caucasian Students at a School

              Unstandardized Coefficients
              B         Std. Error    t         Sig.
(Constant)    -1.309    .259          -5.058    .000
ACP           3.600     .381          9.454     .000

Note. a Dependent Variable: Zscore

The biggest caution must be given when examining the residual plot for the 2010

data. At first glance of the bivariate scatterplot the model does seem to be somewhat

heteroskedastic. The residual plot for the model is found in Figure 4. Limited data points

in the suburban district being examined created a situation where few z-scores were

available for schools with small proportions of Asian/White students. As a result it does

appear as if the model exhibits increased variance as the proportion of Asian/White

students increases. However, it has been observed that in cases that involve moderate violation of

homoscedasticity the violation may weaken the model but does not necessarily

make the model invalid (Tabachnick & Fidell, 1996). Although this limits the use of the

model as a predictor of z-score it does not inhibit the interpretation that there is a strong

relationship between these two variables.
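
The study judged heteroskedasticity visually from the residual plot. A crude numerical companion to that visual check, offered here only as a sketch and not as the method used in this research, is to fit the line and compare the variance of the residuals in the lower and upper halves of the predictor. The helper names and the data are hypothetical.

```python
from statistics import mean, pvariance


def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def residual_variance_by_half(xs, ys):
    """Variance of residuals for the lower and upper halves of x."""
    b0, b1 = fit_line(xs, ys)
    pairs = sorted((x, y - (b0 + b1 * x)) for x, y in zip(xs, ys))
    half = len(pairs) // 2
    low = [e for _, e in pairs[:half]]
    high = [e for _, e in pairs[half:]]
    return pvariance(low), pvariance(high)


# Hypothetical data whose scatter widens as x grows (not the district's data).
xs = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
noise = [0.05, -0.05, 0.05, -0.05, 0.5, -0.5, 0.5, -0.5]
ys = [-1.3 + 3.6 * x + e for x, e in zip(xs, noise)]

low_var, high_var = residual_variance_by_half(xs, ys)
```

A much larger variance in the upper half than in the lower half would be consistent with the fanning pattern described for Figure 4. With so few points such a comparison is descriptive only.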

Figure 4. Residual Plot for Regression Model regressing 2010 Z-Score onto Percentage

of Asian/Caucasian Students at a School.

2011 Data. The examination of 2011 data on z-scores and percent of

White/Asian students at a school produced similar results in comparison to 2010 data.

Once again, during this school year a strong relationship between the two variables is

exhibited. Pearson’s product-moment correlation coefficient was 0.823 which resulted in

a coefficient of determination of 0.678 (see Table 32). Essentially, 67.8% of the

variability in z-score can be accounted for by the variability in the proportion of

Asian/White students at a given school within the district. Once again, the high R² value

suggests that the relationship between the two variables being examined is significant.

Table 32

Linear Correlation Summary for 2011 Z-Score regressed on Percentage of

Asian/Caucasian (ACP) Students at a School

R          R Square    Adjusted R Square    Std. Error of the Estimate    Durbin-Watson
.823(a)    .678        .670                 .5611580                      1.700

Note. a Predictors: (Constant), ACP, b Dependent Variable: Zscore

Further analysis of the 2011 scatterplot and regression shows increased support

for the strong relationship between z-score and the proportion of Asian/White students at

a school (see Figure 5). The model of Y = B0 + B1X resulted in Y = -1.131 + 3.288X

(see Table 33) where Y is the predicted z-score based off of the line of conditional means

and X is the proportion of Asian/White students at a school. The slope of this model is

3.2875 and the 95% confidence interval for the slope is (2.54, 4.03). At α = 0.05 the

confidence interval suggests that a slope of 0, or no relationship between these two

variables, can be statistically rejected. Consequently, when accounting for error in the

model, it appears as if the relationship can still be validated.
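
The reported 2011 interval can be approximately reproduced from the slope and its standard error. The sketch below assumes 38 degrees of freedom (40 schools less two estimated coefficients) and hard-codes the corresponding two-sided 95% critical t value of about 2.024. Both the degrees of freedom and the constant are assumptions of this example and are not values stated in the analysis.

```python
# Assumption: two-sided 95% critical t value for 38 degrees of freedom.
T_CRIT_DF38 = 2.024


def slope_confidence_interval(slope, std_error, t_crit):
    """Return (lower, upper) bounds of slope +/- t_crit * std_error."""
    margin = t_crit * std_error
    return slope - margin, slope + margin


# Slope as quoted in the text and standard error from Table 33.
lower_2011, upper_2011 = slope_confidence_interval(3.2875, 0.367, T_CRIT_DF38)

# The interval excludes zero, so a slope of 0 is rejected at alpha = 0.05.
excludes_zero = lower_2011 > 0
```

Rounded to two decimals, this gives 2.54 and 4.03, matching the interval reported for 2011.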

Figure 5. Scatterplot of 2011 Z-Score versus Proportion of Asian and Caucasian

Students at a School.

Table 33

Coefficients and Standard Error of Coefficients for 2011 Z-Score Regressed onto

Percentage of Asian/Caucasian Students at a School


              Unstandardized Coefficients
              B         Std. Error    t         Sig.
(Constant)    -1.131    .249          -4.540    .000
ACP           3.288     .367          8.948     .000

Note. a Dependent Variable: Zscore

Once again the biggest caution for the model would be actually using the model to

predict the z-score for schools that have a higher proportion of Asian/White students.

While the model does establish a strong relationship between the two variables it does

exhibit similar signs of heteroskedasticity when examining the Figure 6 residual plot.

Figure 6 shows that as the proportion of Asian/White students increases the variance in the

prediction of the model also increases. This is of concern if a district administrator used

the model in an attempt to predict the z-score for any given school. However, it is of less

concern when simply using the model to establish that a strong relationship between the

two variables exists. In fact, upon further examination of the 2011 residual plot one

might notice a point of importance that would actually play a role in reducing the R²

value for the regression model. The point, although not influential as measured by

Cook’s distance (Di = 0.2798), has a residual of 1.78 and can be readily explained by the

school being a specialty school within the district that particularly targets the highest

performing academic students. Essentially, it is a school that pulls the very best students

from other district schools in order to give them a maximized learning experience.

Without this point the value of the coefficient of determination would certainly increase

resulting in further evidence for the strength of the relationship between z-score and

proportions of Asian/White students at a school.
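
For readers unfamiliar with Cook's distance, the following sketch computes it for a simple linear regression from residuals and leverages. The function name and the five data points are hypothetical. The values reported in this study came from the statistical package, not from this code.

```python
from statistics import mean


def cooks_distances(xs, ys):
    """Cook's distance for each point in a one-predictor least-squares fit."""
    n = len(xs)
    p = 2  # estimated coefficients: intercept and slope
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    mse = sum(e * e for e in residuals) / (n - p)
    # Leverage of each point in simple linear regression.
    leverages = [1 / n + (x - x_bar) ** 2 / sxx for x in xs]
    return [
        (e * e / (p * mse)) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverages)
    ]


# Hypothetical data in which the last point sits far above the trend.
xs = [1, 2, 3, 4, 5]
ys = [1, 2, 3, 4, 10]
distances = cooks_distances(xs, ys)
most_influential = distances.index(max(distances))
```

A commonly cited rule of thumb treats distances above 1 as influential. By that rule the reported value of 0.2798 would not flag the specialty school, which agrees with the interpretation given above.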

Figure 6. Residual Plot for Regression Model regressing 2011 Z-Score onto Percentage

of Asian/Caucasian Students at a School.

Summary of Linear Regression Analysis

Data from the 2010 and 2011 AIMS examinations and consequently the 2010 and

2011 AZ LEARNS Legacy school labeling systems suggest that there is a strong

relationship between the variables being examined. The strength of the relationship is

cemented by coefficients of determination of 0.7017 and 0.6781 (see Table 30 and Table

32) which would be considered large effect sizes within the social and behavioral

sciences. This relationship is of particular importance as z-score is used to help

determine whether a school is labeled as excelling.

A-F Letter Grade and Demographic Data

The analysis of the four demographic data points and the state issued school letter

grade at a school follows. This analysis addresses the fourth research question of:

4. Are free and reduced lunch rates, English Language Learner rates, percentage

of Asian students and percentage of White students correlated with the AZ

LEARNS A-F letter grades published by the state of Arizona for schools

within the suburban district?

The analysis of this research question is divided into two sections. The first section

examines the standard multiple linear regression for the 2011 school year. The second

section provides a summary of these findings.

Regressing A-F Letter Grade Value onto Ethnicity Proportions, Free and Reduced Lunch

Rate and ELL Proportions Using Multiple Linear Regression

Standard multiple regression was performed to establish the ability of the

proportion of White (W), proportion of Asian (A), proportion of ELL (ELL) and

proportion of free and reduced lunch rate (FR) students to predict the AZ Learns A-F

letter grade (Grade) assigned by ADE. It was determined that there were not any outliers

by using Mahalanobis distance to evaluate the data for multivariate outliers. The critical

χ² value with df = 4 at p < 0.001, which is the generally accepted p-value for Mahalanobis

distance (Mertler & Vannatta, 2005, p. 53), is χ² = 18.467. The largest χ² value observed

in the data set was 12.995 for elementary school #12. A residual scatterplot, displayed in

Figure 7, was used to examine the condition of multivariate homogeneity of variance-

covariance (Mertler & Vannatta, p. 173). The examination of the standardized residual in

comparison to the standardized predicted value yielded a residual plot that did not

demonstrate clustering. As a result the condition of multivariate homogeneity of

variance-covariance was assumed. Finally, a scatterplot matrix (see Figure 8) examining

the relationship between all variables intended to be used in the standard multiple

regression was analyzed to examine multivariate normality and linearity. Unfortunately,

some of the variable scatterplots did exhibit non-elliptical patterns. In particular, the

variable of ELL proportions consistently showed an L-Shape pattern when examined

with other variables. Consequently, it was determined to eliminate the variable of ELL

proportions from the regression analysis.

Figure 7. Residual Plot for 2011 Regression that Regresses School Letter Grade onto

Four Independent Variables.

[Scatterplot matrix; panel variables: Grade, W, A, ELL, FR]

Figure 8. Scatterplot Matrix for All Variables in 2011 Multiple Regression.

After examining the data and eliminating ELL proportions it was then necessary

to go back and re-examine the standardized residual plot that helped establish

multivariate homogeneity. Figure 9 displays the residual plot for a multiple regression

with the variable of ELL proportions removed from the model. Once again, as revealed

in Figure 9, the standardized residual plot showed no signs of clustering and,

consequently, the process moved onto analyzing the information from the regression

analysis.


Figure 9. Residual Plot for 2011 Regression that Regresses School Letter Grade onto

Three Independent Variables.

One of the first issues that a researcher must check after performing a multiple

regression is multicollinearity. Multicollinearity can limit the size of R² by having

multiple variables account for the same variability and it can make it difficult to evaluate

which predictors are the most important (Stevens, 2007, p.234). There are several ways

to account for multicollinearity including analyzing correlation matrices, tolerance values

and variance inflation factors (VIF). High correlations amongst independent variables,

tolerance values less than 0.1 (Mertler & Vannatta, p. 169), or VIF values that exceed 10

(Myers, 1990, p. 369) can all indicate that variables exhibit

multicollinearity. In the case of the multiple regression analyzed in this research it was

quickly observed using all three of these methods (see Table 34 and Table 35) that two of the

three variables showed strong tendencies towards multicollinearity.

As presented in Table 34, free and reduced lunch proportions and White student

proportions were correlated with r = -0.958. Furthermore, Table 35 details that each of

these variables had low tolerance values, 0.028 and 0.039 respectively, and high VIF

factors at 35.851 and 25.483.
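
Tolerance and VIF are reciprocals of one another, so either can be recovered from the other. The short sketch below applies the cut-offs cited above to the tolerances reported in Table 35. The function and variable names are illustrative assumptions.

```python
def vif_from_tolerance(tolerance):
    """Variance inflation factor is the reciprocal of tolerance."""
    return 1.0 / tolerance


# Tolerances as reported in Table 35 (rounded to three decimals there).
reported_tolerance = {"FR": 0.028, "A": 0.252, "W": 0.039}

vif = {name: vif_from_tolerance(t) for name, t in reported_tolerance.items()}

# Flag predictors using the cut-offs cited in the text:
# tolerance below 0.1 or VIF above 10.
flagged = sorted(
    name
    for name, t in reported_tolerance.items()
    if t < 0.1 or vif[name] > 10
)
```

The reciprocals come out near 35.7, 3.97 and 25.6. The small differences from the VIF values in Table 35 presumably reflect the rounding of the tolerances to three decimals. Only FR and W are flagged.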

Stevens suggests that there are three ways to deal with multicollinearity which

include:

1. If there are three measures relating to a single construct which have

intercorrelations of about 0.8 or larger then add them to form a single

predictor.

2. Use factorial analysis methodology.

3. Use ridge regression (Stevens, p. 235).

Using the first of Stevens’ suggestions is quite possibly the most viable for the multiple

regression being analyzed in this situation. However, in order to use Stevens’ suggestion

we need three measures relating to a single construct. The three measures that have high

intercorrelations (see Table 34), at least 0.8, in this study include free and reduced lunch

proportions, ELL proportions and White proportions. Thus, we will introduce the ELL

proportion back into the multiple regression by combining it with free and reduced lunch

proportions and White proportions.

Table 34

Inter-correlation Matrix for Three Variables being examined in 2011 Multiple Linear

Regression

         Grade    FR       A        W
Grade    1.000    -.823    .670     .712
FR       -.823    1.000    -.688    -.958
A        .670     -.688    1.000    .509
W        .712     -.958    .509     1.000

Table 35

Collinearity Statistics for Three Variables used in 2011 Multiple Linear Regression

              Collinearity Statistics
              Tolerance    VIF
(Constant)
FR            .028         35.851
A             .252         3.972
W             .039         25.483

Note. a Dependent Variable: Grade

After combining the three variables (SUM3) the conditions for multiple

regression were retested. Figure 10 shows that the scatterplot matrix exhibited no major

violations of linearity and normality with the majority of the six scatterplots exhibiting

moderately elliptical patterns.

[Scatterplot matrix; panel variables: Grade, SUM3, A]

Figure 10. Scatterplot Matrix for Three Variables in 2011 Multiple Regression

A reexamination of Mahalanobis distance once again showed no outliers in the

data set at the critical value of χ² = 13.82 (p = 0.001). The standardized residual plot exhibited no

clustering and demonstrated a random pattern of errors (see Figure 11).


Figure 11. Residual Plot for 2011 Regression that Regresses School Letter Grade onto

SUM3 and Percent of Students at a School that are Asian.

Finally, multicollinearity was reduced in the preliminary regression analysis. As

shown in Table 36, tolerance for both variables was acceptable at 0.361 and, due to its

inverse relationship with tolerance, VIF was also acceptable at 2.767. Unfortunately, after examining all

of the conditions for standard multiple regression, further analysis of the regression data

(see Table 36) found that the variable of proportion of Asian students was not

statistically significant for the model (t = 0.223, p = 0.825).

Table 36

2011 Multiple Linear Regression Model for Letter Grade regressed onto the variables of

SUM3 and Percentage of Asian Students at a School

              Unstandardized Coefficients                       Collinearity Statistics
              B           Std. Error    t         Sig.    Tolerance    VIF
(Constant)    237.229     25.291        9.380     .000
A             14.971      67.059        .223      .825    .361         2.767
Sum           -113.048    22.108        -5.113    .000    .361         2.767

Note. a Dependent Variable: Grade

Due to the intercorrelation of these variables, and the inability to filter the

variance accounted for by each individual variable, it was determined that the research

would proceed with constructing a single variable model based on the highest correlated

variable. The variable that exhibited the highest correlation with the ADE assigned letter

grade was free and reduced lunch proportion. The model constructed was a simple

linear regression of the form Y = B0 + B1X. After running the single regression the

resulting linear equation displayed in Table 37 was Y = 153.821 – 64.289X where Y was

the predicted grade issued by ADE and X was the proportion of free and reduced lunch

students at a school in the district. As shown in Table 38 the model constructed had

r = -0.823 and R² = 0.677. Overall this suggests that 67.7% of the variability in school grades

assigned by ADE within this district was accounted for by the variability in the free and

reduced lunch percentage at the school.
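
To give a sense of scale for the single-variable model, the sketch below evaluates the reported equation at two hypothetical free and reduced lunch proportions and squares the reported correlation. The function name and the two example proportions are assumptions for illustration only.

```python
# Reported coefficients for the single-variable model (Table 37).
B0_GRADE = 153.821   # intercept
B1_GRADE = -64.289   # slope on free and reduced lunch proportion
R_GRADE = -0.823     # correlation reported in Table 38


def predicted_grade_value(fr_proportion):
    """Predicted letter-grade value for a given free/reduced lunch proportion."""
    return B0_GRADE + B1_GRADE * fr_proportion


# Hypothetical schools with 10% and 50% free and reduced lunch students.
low_poverty = predicted_grade_value(0.10)
high_poverty = predicted_grade_value(0.50)

# Squaring r gives the share of variability accounted for by the model.
r_squared_grade = R_GRADE ** 2
```

Under the reported equation, a difference of 40 percentage points in the free and reduced lunch proportion corresponds to a difference of roughly 25.7 points in the predicted grade value.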

Table 37

2011 Single Variable Linear Regression Model for Letter Grade regressed onto the

variable of Free and Reduced Lunch Percentage


              Unstandardized Coefficients
              B          Std. Error    t         Sig.
(Constant)    153.821    2.924         52.614    .000
FR            -64.289    7.210         -8.916    .000

Note. a Dependent Variable: Grade

Table 38

2011 Coefficient of Determination for Letter Grade regressed onto the variable of Free

and Reduced Lunch Percentage

R          R Square    Adjusted R Square    Std. Error of the Estimate    Durbin-Watson
.823(a)    .677        .668                 11.52450                      1.681

Note. a Predictors: (Constant), FR, b Dependent Variable: Grade

Summary of Relationship between School-Level Variables and School Letter Grades

The relationship between free and reduced lunch percentages and AZ Learns A-F

letter grades within this suburban district is demonstrated to be significant. High

intercorrelation (see Table 34) between variables being examined made it difficult for the

research to employ the multiple linear regression techniques desired. The intercorrelation

is most likely attributed to the fact that ethnicity and poverty continue to be interwoven

within the social class structure of the United States. Schools with low poverty (low free and

reduced lunch proportions) almost always show higher proportions of White and Asian

students because families of these ethnicities are less likely to live in poverty.

Ultimately, the high intercorrelation of the predetermined variables prohibited the use of

multiple linear regression.

Of the four variables examined free and reduced lunch proportions of each school

showed the highest correlation (see Table 34) with the dependent variable of school letter

grades. The relationship between these two variables was strong. In comparison to the

regression analysis relating z-score to proportion of Asian/White students at a school this

model also showed comparable strength. Although the intended examination of four

variables in relationship to the variable of school letter grades failed, it is very useful to know that

free and reduced lunch percentage, one of the best variables we have in education to measure poverty,

provides the best predictor of school letter grade.

Summary of Chapter 4

Throughout the analysis provided in Chapter 4 two themes appeared to become

prevalent. First, the majority of schools throughout the district still exhibited an

achievement gap between ethnic groups represented at their school. Moreover, the

achievement gap found in both proficiency percentage and average scale score was

shown to exist even if a school was labeled with a distinguished school label by ADE.

Secondly, it was found that there existed a relationship between certain demographic data

and numerical variables that aid in determining school labels. A strong relationship

between the percent of White and Asian students at a school and z-score was found.

Also, a strong relationship between free and reduced lunch rates at a school and the

school letter grade assigned by ADE existed. These two themes were the most prevalent

of all findings as their implications for schools labeled with distinguished labels by the

ADE are profound.

CHAPTER 5

Conclusions, Summary, Implications, and Recommendations

Introduction

This chapter provides a summary of the study and important conclusions drawn

from the analysis provided in Chapter 4. The chapter presents a discussion of the major

implications for action that can be drawn from the data presented throughout the research.

It then makes recommendations for further research that can be conducted at the school,

district and state level. Also included in the chapter are a review of the methodology,

findings as related to current literature and concluding remarks. The chapter serves as a

summary to readers in an effort to focus on the critical conclusions from the provided

research.

Summary of the Study

The study sought to better understand the achievement gap at schools in a

suburban district in the southwest United States. A descriptive analysis of the

achievement gap as measured by proficiency percentage on the AIMS examination

coupled with an inferential analysis of average scale score on the AIMS examination for

the 2009-2010 and 2010-2011 school years provided a comprehensive picture of the gap

at these schools. Using figures and tables, accompanied with the ANOVA results from

the inferential analysis, the research sought to better understand whether schools were

making progress in closing the achievement gap addressed in NCLB. Throughout the

analysis a conditional relationship with school labels was examined by referencing

conditional probabilities based on results from the descriptive analysis and the inferential

analysis.

The secondary part of the study attempted to examine the relationships between

demographic variables and school labels which could possibly mask whether a school

had or had not closed the achievement gap. A linear correlation and regression was

analyzed in an attempt to describe the relationship between the percentage of White and

Asian students at a school and z-score (an ADE issued standardized score that is used in

determining AZ Learns Legacy school labels). A standard multiple linear regression was

analyzed in an attempt to describe the relationship between four different demographic

variables and the school letter grade which is an ADE school label issued starting in the

2011 school year.

Overview of the Problem

The inception of NCLB has mandated that a ranking or labeling system for

schools and districts be established and sustained for accountability. However, there is

variation within and between each state’s ranking systems. Specifically, in Arizona’s

ranking system the labels may not statistically identify if the achievement gap has been

closed. As a result, school labels may be more readily linked with the demographics of

the school than the best practices within the school.

Purpose Statement

The purpose was to examine the achievement gap, particularly in mathematics

and reading, at all non-alternative schools within a suburban school district within the

state of Arizona for the 2009-2010 and 2010-2011 school years. Furthermore, the study

sought to examine demographic reasons why the schools within the suburban school

district obtained high and low school labels. The study was specifically interested in

student achievement across ethnic subgroups with respect to the state standardized AIMS

examination. Another interest of the study included the descriptive analysis of cross-

sectional data in reading and mathematics at these schools from 2009-2010 and 2010-

2011. Furthermore, the study sought to define the predictive abilities of the percentage of

non Black/Hispanic students with respect to the percentage of students that exceeded on

the AIMS examination. Using four main research questions as a guide, data from two

prior years was analyzed at schools throughout the suburban school district.

Research Methodology

Ex-post facto data was analyzed with quantitative methods for this research. The

data allowed for the examination of the achievement gap at a school district in the

southwest United States. Furthermore, the data allowed for an evaluation of Arizona’s

school labeling system with respect to the suburban school district. The data was

conveniently sampled from 40 different schools throughout the district and included

student AIMS scores, ADE issued z-scores, ADE issued school letter grades and school

demographic data. The quantitative data was then analyzed descriptively and

inferentially by using an ANOVA. Finally, bivariate and multivariate relationships were

examined between demographic data and ADE issued z-score and school letter grade.

Major Findings Summary

The findings from this research study are given in two parts in this section. First,

a summary of findings for each research question is briefly provided. Following that

summary an overall thematic summary is provided to synthesize the information to a

broader level. The summary of findings for each research question will state the research

question followed by a brief summary of the overall findings regarding said question.

Research Question 1

What are the two-year cross-sectional data trends for the achievement gap among

White, Asian, Hispanic and Black students on the 2009-2010 and 2010-2011 AIMS

mathematics, reading and writing sections at all schools in the suburban school district?

Finding. The two year cross-sectional data trends for the achievement gap within

this suburban school district suggest that the majority of schools within the district still

struggle with closing the elusive gap. Moreover, schools with distinguished labels from

the ADE have been shown to have similar problems with closing the achievement gap in

this district.

Research Question 2

Is the average student achievement, as measured by average scale score, in ethnic

subgroups different for the 2009-2010 and 2010-2011 AIMS examinations at each non-

alternative school throughout the suburban district?

Finding. The achievement gap, when examined by average scale score, existed at

the majority of schools throughout the district. The findings from this research question

were consistent with the findings from the first research question. These findings

included district schools continuing to struggle with closing the achievement gap and

distinguishably labeled schools similarly struggling. Analyzing the achievement gap,

amongst the schools in this suburban district, using two different metrics provided

significant evidence to support a lingering achievement gap.

Research Question 3

Is the percentage of Asian and White students correlated with the state-issued

z-score, a standardized score for the percent of students that exceed on the AIMS

examination at a school in a given year, which helps determine school labels within the

state of Arizona?

Finding. For both the 2010 and 2011 school years the percentage of Asian and

White students at a school was highly correlated with the state-issued z-score. A strong

relationship between these two variables suggests that a school’s ability to be labeled as

excelling is related to whether it has a high proportion of Asian and White students.

Research Question 4

Are free and reduced lunch rates, English Language Learner rates, percentage of

Asian students and percentage of White students correlated with AZ LEARNS A-F letter

grades published by the state of Arizona for schools within the suburban district?

Finding. Inter-correlation between the variables examined provided a significant

complication when analyzing this research question. After using a couple of different

statistical solutions a single variable linear regression showed that free and reduced lunch

rates were highly correlated with A-F letter grades for schools within this district.

Major Findings Discussion

The findings from each research question presented in Chapter 4 can be

summarized under two major themes. First, throughout the suburban school district that

was examined schools continue to struggle with closing the persistent achievement gap

between the four ethnicities studied. Second, school labels and grades are strongly

correlated with demographic measures prevalent in the school. Accounting for both of

these themes the following discussion provides clarification with respect to underlying

data providing evidence for these themes.

Schools within the suburban school district examined continued to show a

persistent achievement gap. This gap was found in schools across the labeling spectrum.

In 2009-2010, the majority of schools throughout the district demonstrated an

achievement gap on AIMS. When examining the conditional distribution of the

achievement gap across school labels it was also shown that the persistent achievement

gap was prevalent amongst schools in the district with the highest of school labels. In

2011, the trend in achievement gap on the AIMS examination continued with the

majority of schools in the district showing distinct gaps in performance amongst different

ethnicities across all three subjects. Schools with high letter grades, A or B, in 2010-

2011 also continued to show significant achievement gaps. In concert, the cross-sectional

examination of 2009-2010 and 2010-2011 AIMS results showed the majority of schools

throughout the district continued to show persistent achievement gaps and a large

proportion of distinguishably labeled, and graded, schools showed similar achievement

gaps.

The achievement gap endured when examining the achievement gap from a

different perspective. Instead of using proficiency percentage as a measurement of

achievement gap, average scale score was inferentially examined to study the

achievement gap during the same two years. Once again, the majority of the schools

throughout the district demonstrated a persistent achievement gap. Further analysis of

conditional probabilities showed that during 2009-2010 a high proportion of the schools

labeled highly performing demonstrated an achievement gap between either Hispanic or

Black students and either Asian or White students. The 2010-2011 data continued to

show achievement gaps in average scale score amongst schools in this district labeled

distinctly by ADE. An analysis of A or B schools in 2010-2011 showed that district

schools that received these high grades had similar ethnic achievement gaps. In an

attempt to avoid drawing false conclusions based on proficiency percentages the analysis

of average scale score provided a solidified conclusion. The majority of schools

throughout the district showed a significant achievement gap and a large proportion of

schools with elevated labels, and grades, showed similar achievement gaps.
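
The conditional proportions described above can be illustrated with a minimal sketch. The records below are hypothetical and are not the district's data; the label names and the function name gap_rate_by_label are assumptions made only for illustration. The sketch computes, for each label, the proportion of schools showing an achievement gap.

```python
from collections import defaultdict

# Hypothetical (label, shows_gap) records; NOT the district's actual data.
schools = [
    ("Excelling", True),
    ("Excelling", True),
    ("Excelling", False),
    ("Highly Performing", True),
    ("Highly Performing", True),
    ("Performing Plus", True),
    ("Performing Plus", False),
    ("Performing Plus", True),
]


def gap_rate_by_label(records):
    """Return the proportion of schools with a gap, conditional on label."""
    totals = defaultdict(int)
    with_gap = defaultdict(int)
    for label, shows_gap in records:
        totals[label] += 1
        if shows_gap:
            with_gap[label] += 1
    return {label: with_gap[label] / totals[label] for label in totals}


rates = gap_rate_by_label(schools)
for label, rate in rates.items():
    print(f"{label}: {rate:.2f}")
```

Reading the output by label, rather than for the district as a whole, is what allows a statement such as "a large proportion of highly labeled schools showed a gap."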

The second theme resulted from examining the relationship between variables that

help determine school labels and demographic data. The relationship between the

percentage of White and Asian students at a school and the school-level z-score was

shown to be strong. During the 2009-2010 school year, a significant amount of the

variability in z-score was accounted for by the variability in the percentage of White and

Asian students at the schools. During the 2010-2011 school year the percent of

variability was nearly identical. The relationship between this demographic variable and

z-score, a variable that helps the ADE determine school labels, suggests that it is strongly

possible that a critical factor in determining school labels is the percentage of White and

Asian students at a given school.
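
The phrase "variability accounted for" refers to the coefficient of determination of a one-predictor linear regression. A minimal sketch follows, assuming hypothetical school-level values; the numbers, the variable names and the function name r_squared are illustrative only and do not reproduce the study's data or results.

```python
def r_squared(x, y):
    """Coefficient of determination for a one-predictor least-squares fit."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # With a single predictor, r-squared equals the squared Pearson correlation.
    return (sxy ** 2) / (sxx * syy)


# Hypothetical values: percent White/Asian enrollment and school-level z-score.
pct_white_asian = [35.0, 48.0, 55.0, 62.0, 70.0, 81.0]
z_score = [-1.1, -0.4, 0.0, 0.3, 0.7, 1.2]

r2 = r_squared(pct_white_asian, z_score)
print(f"Share of z-score variability accounted for: {r2:.2f}")
```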

Further analysis of the new school label, AZ Learns A-F letter grades, issued by

the ADE during the 2010-2011 school year helped to reinforce the second theme. After

attempting to run a multiple linear regression on four different demographic

variables present in any given school in relationship to school letter grade it was found

that one single variable accounted for the majority of the variability in letter grade. Due

to multicollinearity amongst the four variables examined and the lack of statistical

significance in some of the variables, free and reduced lunch percentage was left to serve

as the sole predictor of school letter grade throughout the district. Once again, a large

percentage of the variability in school letter grade was accounted for by the variability in

free and reduced lunch percentage. This information helps reinforce the

theme that a school label is strongly associated with certain demographic

variables present at a school.

The comprehensive findings from the study of this suburban school district

resulted in two major themes. One theme was that schools in this district, with

exceptional and non-exceptional labels issued by ADE, still had significant work to be

done in closing the academic achievement gap between ethnicities. The second theme

was that a school label within this district is highly associated with certain demographic

variables. Combining both of these themes results in a better understanding of the

relationship between ADE issued school labels and the ability of a school to accomplish

the mandate set forth in NCLB of closing the achievement gap between ethnic subgroups.

Findings Related to the Literature

The findings of this research align well with previous research in the field of

education. First, the persistent achievement gap found at schools throughout the district

echoes the achievement gap seen throughout the nation on the NAEP exam. Second, the

strong relationship between the two demographic variables analyzed in each regression

analysis follows from Berliner’s analysis of out-of-school factors. The two main themes

prevalent in the research findings in this dissertation readily link to other research in the

educational field.

Closing the persistent achievement gap eludes the nation as a whole, and it continues to

elude the majority of the schools in the examined district. The Center on Education

Policy (2009) showed through NAEP results that the annually recurring achievement

gaps continue to challenge our nation’s educators. The research performed on this district in

the southwest United States found results with respect to a persistent achievement gap

that were consistent with the 2009 Center on Education Policy release. Furthermore,

Kober, Chudowsky & Chudowsky (2010) concluded that Hispanic-White gaps in

achievement in reading and mathematics continue to plague educators. For the school

district studied this research finding certainly held true. The majority of schools

continued to show significant gaps in state performance testing between Hispanic and

White students in reading and mathematics. The findings with respect to the achievement

gap in this suburban school district certainly follow from previous research.

Out-of-school factors, studied by Berliner (2009), that tend to appear in higher-

poverty areas impact educational attainment. The results from this study indicate that

holistic school-wide demographic indicators that would link to OSFs are highly

correlated to measurements of school achievement. Birenbaum and Nasser (2006),

Zuzovsky (2008) and Berliner (2009) all conclude that achievement gaps among

subgroups within a population can be linked to the impact of poverty. In this study, two

findings show similar results.

First, the correlation between the state-issued z-score and the Asian/White

demographic makeup of a school reiterates these previous research findings. The variable

of percentage of Asian/White students at each school within the district is a holistic

summary variable that is linked to poverty. DeNavas-Walt, Proctor and Smith (2010)

established that Asian and White people in the United States are approximately half as

likely to live in poverty as Hispanic and Black people. Consequently, the variable of

percentage of Asian/White students at a school just summarizes poverty and out-of-

school factors. Therefore, the research findings that the percentage of Asian/White

students at school is highly correlated with the state issued z-score, a standardized

measurement of the proportion of students that exceed the standards on the AIMS test,

would be directly in line with the findings of poverty’s relationship with educational

outcomes.

Second, the correlation between free and reduced lunch rate percentages at a

school and the ADE issued numerical value for school letter grade is directly in line

with Berliner (2009) and other researchers. The findings in this school district suggest

that schools in higher poverty areas receive lower ADE letter grades. Similarly, schools

in wealthier areas receive higher ADE letter grades. The relationship between the chosen

measurement of poverty, free and reduced lunch rates, and ADE letter grades was shown

to be strong. This finding is directly in line with Berliner’s implication that out-of-school

factors caused by the impacts of poverty can have school-level effects on educational

achievement.

The findings of previous research in the educational field and the findings in this

study link together well. Berliner’s (2009) research on the ill-effects of poverty in

conjunction with the Center on Education Policy (2009) research on NAEP results and

the persistent achievement gap show that the findings in this district are not outside

expectation. The relationship between current research and the research in this district in

the southwest United States aids in strengthening the findings submitted in this

dissertation.

Divergent Findings

The study had some findings that were not intended to be examined but became

apparent when examining the data. The most surprising result came when performing

the multiple linear regression for the fourth research question. Multicollinearity

among the four variables, selected prior to the study to account for school letter

grade, prohibited the use of multiple linear regression. High inter-correlations

among the variables ruled out the use of some of the variables. Furthermore, a

statistically insignificant variable caused the multiple linear regression to be revised to a

single-variable linear regression. The sole variable that remained accounted for a large

proportion of the variance in school letter grade throughout the district. Interestingly the

variable that remained was free and reduced lunch percentage. This variable is simply a

measurement of poverty at a school. Consequently, the relationship between school letter

grade and free and reduced lunch percentage throughout the school district was found to

be of critical importance when interpreting school-level letter grades.
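
One common way to screen for the multicollinearity described above is to inspect pairwise correlations among the candidate predictors before fitting a multiple regression. The sketch below is an assumption about how such a screen might look, not the procedure used in this study. The predictor names other than free and reduced lunch percentage, the data values and the 0.8 threshold are all hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def collinear_pairs(predictors, threshold=0.8):
    """List predictor pairs whose absolute correlation meets the threshold."""
    flagged = []
    for (name_a, a), (name_b, b) in combinations(predictors.items(), 2):
        r = pearson_r(a, b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Hypothetical school-level predictors; NOT the district's data.
predictors = {
    "free_reduced_lunch_pct": [12.0, 20.0, 33.0, 41.0, 58.0, 66.0],
    "white_asian_pct": [82.0, 75.0, 64.0, 55.0, 40.0, 36.0],
    "mobility_pct": [9.0, 4.0, 11.0, 6.0, 10.0, 5.0],
}

pairs = collinear_pairs(predictors)
print(pairs)
```

When two predictors are flagged in this way, retaining only one of them, as was done here with free and reduced lunch percentage, is a typical remedy.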

Heterogeneity of variances across ethnic subgroups in a small proportion of

schools was another divergent finding that was unexpected. When using inferential

statistics to examine average scale score across the four ethnic subgroups, 11 schools

during the 2009-2010 school year and 6 schools during the 2010-2011 school year

exhibited heterogeneity of variances. Most typically, the heterogeneity of variances

could be attributed to the large ratio observed between the largest sample size and

smallest sample size of the four ethnic subgroups. More concisely stated, the

heterogeneity of variances was primarily seen in schools that were more ethnically

homogeneous. This violation of a condition of the ANOVA brought to the forefront the

idea that the achievement gap becomes more difficult to analyze as subgroup sample

sizes get smaller. For this precise reason, many states have implemented in the NCLB

adequate yearly progress measurement a minimum sample size condition that allows

schools with small samples in certain subgroup categories to not be measured for

accountability.
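
A simple descriptive screen for the condition discussed above compares the largest and smallest subgroup variances alongside the largest and smallest subgroup sample sizes. The sketch below is a rule-of-thumb illustration only; it is not a formal test of homogeneity of variances and is not presented as the test used in this study. The scale scores and the function name homogeneity_summary are hypothetical.

```python
from statistics import variance


def homogeneity_summary(groups):
    """Return (largest/smallest sample variance, largest/smallest sample size)."""
    variances = [variance(scores) for scores in groups.values()]
    sizes = [len(scores) for scores in groups.values()]
    return max(variances) / min(variances), max(sizes) / min(sizes)


# Hypothetical scale scores for one school; NOT the district's data.
scale_scores = {
    "White": [480, 500, 510, 495, 505, 490, 515, 485],
    "Hispanic": [470, 490, 460, 500],
    "Asian": [520, 500, 540],
    "Black": [430, 510, 470],
}

variance_ratio, size_ratio = homogeneity_summary(scale_scores)
print(f"variance ratio: {variance_ratio:.2f}, sample size ratio: {size_ratio:.2f}")
```

A large variance ratio paired with a large sample size ratio, as in this hypothetical school, signals that ANOVA results for that school should be interpreted with caution.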

In analyzing the descriptive statistics for the proficiency percentage between

ethnic subgroups there were more elementary schools that showed signs of closing the

achievement gap than secondary schools. The implications of this finding could be quite

varied. It could be that as students matriculate from elementary to junior high to

high school the subgroup sample size increases, reducing variability, and thus achievement

gaps become more readily noticeable. Another explanation could be that students of

different ethnic backgrounds within this district may possibly diverge in academic

performance as they get older. The academic growth rates of different ethnic subgroups

might be unequal, which would then result in smaller achievement gaps in elementary schools

and larger achievement gaps in high schools. Finally, it could be that high schools

encompass larger boundaries and are more likely to encounter diverse socioeconomic

statuses among their students. Socioeconomic status within elementary schools could thus be

more homogeneous resulting in smaller achievement gaps in comparison to high schools.

The research provided in this study does not attempt to test any of these theories. But,

it was observed that all of the high schools exhibited large achievement gaps between

ethnicities whereas not all elementary schools did the same.

Conclusions

The effect of poverty on the educational achievement of students is recognized

throughout educational research. The Coleman Report first examined the relationship

between family background and the perpetual achievement gap (Viadero,

2006, p. 1). Berliner (2009), Birenbaum and Nasser (2006) and Zuzovsky (2008) all have

provided more recent evidence of poverty’s interaction with educational achievement.

While the scope of this research was limited to a single school district in the southwest

United States, it should not be alarming that even in this district the impact of poverty can

be seen.

Schools in this district were shown to have higher school labels, issued by the

ADE, when they had lower free and reduced lunch rates and a higher proportion of

Asian/White students. Essentially, schools labeled effective by the ADE within this

district certainly benefit from not being exposed to the very ill-effects that Berliner

suggests arise out of poverty. A school in this district cannot infer from its exemplary

status with ADE that it is supplying any better education to students than

its sister schools. Similarly, a school in this district that has a label, or grade, with

ADE that is below exemplary should not panic with the idea that it is doing a poorer

job of educating its students. ADE school grades and labels for schools in this district,

currently, should be viewed in the proper context of representing socioeconomic status

and not the quality of education in that school.

The quality of education of a school within this district should not come into

question as a result of school grade and it must be clear that every school within the

district must continue to emphasize finding ways to close the achievement gap. Much

like the nation, schools throughout the district continue to struggle with closing the

persistent achievement gap for ethnic minorities. A school label, or grade, cannot be a

resting point where educators believe their work with respect to the achievement gap is

finished. Rather, as this research has shown, despite a school’s label it is extremely likely

that a school within this district continues to exhibit trends of Hispanic and Black

students who underperform their academic peers throughout their school. NCLB sought

accountability so that schools, districts and states could no longer ignore the silent

minorities in the educational process. Schools in this district must recognize the results

of this research, hear the cry of their minority students and implement research-based

programs for improving achievement amongst these groups at every school including

those with a distinguished label from the ADE.

Implications for Action

Administrators, teachers and parents throughout the suburban school district need

to be aware of the relationships studied in this research. The research has implications

for every one of these groups throughout the district. The implications are both corrective

and cautionary for each of these stakeholders throughout the district.

School-level and district-level administrators must understand that while several

district schools, and the district itself, are viewed favorably throughout the state of

Arizona there is still much to accomplish with respect to closing the achievement gap.

Furthermore, since receiving a distinguished label from the state of Arizona in this

district is highly related to some demographic variables, administrators must realize that a

high label does not necessarily suggest best practices are in place for students of all

ethnicities. The highly performing label might merely suggest that a school is in a

non-poverty area with a high proportion of White and Asian students. While the school may

have research-based best-practices implemented, an ADE issued school label is not the

“effect-size” that should be used to determine success of school programs. All school

administrators throughout the district should be encouraged to look beyond the label to

measure, analyze and implement school programs that impact students of different

ethnicities in a multitude of ways.

Teachers throughout the district must not rest on the accomplishment of their

school being labeled highly by ADE. Teaching, educating and mentoring students of

different ethnicities is not best measured in a school label. As evidenced by this research,

teachers must understand that just because a school is an “A” or “excelling” it does not

necessarily mean that it provides an “A” education for Hispanic or Black students.

Teachers must continue to explore programs that can have a lasting effect on the

diverse populations that each school throughout the district serves.

Parents throughout the district, as a result of this research, need to continue to

become educated about what a school is doing to best serve the needs of their

individual child. School-level labels might provide necessary summary level data easy to

report in newspapers. But, the level of analysis that these labels provide to a Black or

Hispanic parent in an upper-class neighborhood in this district is extremely limited.

Perpetual achievement gaps in schools with high labels suggest that minority parents in

the district need to continue to pressure schools to serve the needs of their children and

promote the idea of education amongst all subgroups. The achievement gaps existent in

the majority of schools throughout the district amongst ethnic subgroups call for parents

of minority children to become educated about school labels, involved in the educational

system, and active in ensuring their students are exposed to the level of education

demanded in NCLB.

All stakeholders must be cautious in consuming the ADE issued school labels.

Specifically, stakeholders must be careful in interpreting what a school label means

for an individual child, and particularly for a child from an ethnic minority. Much like

a business can hide discrepancies in salary between males and females by reporting an

overall average salary, a school-wide label can disguise discrepancies in educational

performance between ethnicities by reporting summary level achievement data.

Stakeholders throughout the school district must be aware that school labels summarize

school performance but they do not analyze school performance.
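
The way a school-wide summary can mask subgroup differences can be shown with a small worked example. The counts below are hypothetical and chosen only for illustration; the function name percent_proficient is likewise an assumption. The sketch contrasts a school's overall proficiency percentage with the same figure broken out by ethnic subgroup.

```python
def percent_proficient(results):
    """Overall percent proficient from {group: (proficient, tested)} counts."""
    proficient = sum(p for p, _ in results.values())
    tested = sum(t for _, t in results.values())
    return 100 * proficient / tested


# Hypothetical (proficient, tested) counts for one school; NOT real data.
school = {
    "White": (180, 200),
    "Asian": (46, 50),
    "Hispanic": (24, 40),
    "Black": (5, 10),
}

overall = percent_proficient(school)
by_group = {group: 100 * p / t for group, (p, t) in school.items()}

print(f"School-wide: {overall:.1f}% proficient")
for group, pct in by_group.items():
    print(f"  {group}: {pct:.1f}%")
```

In this hypothetical school the summary figure is dominated by the largest subgroups, so a favorable school-wide percentage coexists with a wide gap between the highest and lowest performing subgroups.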

Finally, at the national level politicians and other stakeholders must realize that

rhetoric without action is little more than a social pacifier. The research findings for the

district analyzed in this dissertation should bear witness to the idea that closing the

achievement gap is going to take a much more concentrated effort than NCLB. In 2001

politicians spoke of NCLB as the Civil Rights Act of the 21st century. NCLB was an

act that was going to ensure the civil liberty of equal education to all subgroups.

Certainly, the state-mandated testing and nationally-mandated accountability systems

called for in NCLB, with the purpose of eradicating the achievement gap, have had very

little effect on the elusive achievement gap in the district analyzed in this research.

Furthermore, national-level research would suggest similar findings. Hopefully this

research helps in establishing the need to reexamine testing systems, accountability

systems and the very metrics used in ranking schools throughout this nation.

Recommendations for Further Research

The research performed in this study was limited in its ability to generalize to

other districts, states and the nation. Further research based on this study should be

extended, first and foremost, to other districts. Examining the implications of school

labels in relationship to the academic achievement gaps amongst ethnicities is important

and valuable information that no district should ignore. NCLB mandated that districts

throughout the nation continue to seek ways to close the disparate achievement gap

between ethnic subgroups. A school or district that continues to ignore issues with an

achievement gap because the state has deemed it exemplary is like a business that

shows a profit but could have profited more by analyzing its costs.

Within the given district in which this data was analyzed it is recommended that

analysis of programs intended to close the achievement gap amongst ethnicities be

conducted. With the understanding that the achievement gap in this district remains

persistent, specialized programs that were designed to address the underperforming

ethnic subgroups must be analyzed to evaluate their impact on student achievement.

Furthermore, a comprehensive study analyzing the effect size of research-based programs

throughout the United States should be conducted with respect to programs that impact

achievement gaps at the school level.

The district analyzed would also be served by further analyzing the achievement

gap by ethnicity for all students that live in poverty. For example, a future research

project should include analyzing the performance at a given school for Black, Asian, White

and Hispanic students who are all currently on free and reduced lunch. Examining the

achievement gap in this manner could provide further evidence of whether there is truly an

ethnic achievement gap at these schools even when socioeconomic status is held

constant. An academic performance study across ethnicities for all students qualifying for

free and reduced lunch would further the research performed in this dissertation.
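
The recommended analysis amounts to comparing ethnic subgroups only among students who qualify for free and reduced lunch. A minimal sketch is given below under stated assumptions: the student records, the scale scores and the function name mean_score_by_group are hypothetical, and a full study would follow the descriptive comparison with an inferential test.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (ethnicity, on_free_reduced_lunch, scale_score) records.
students = [
    ("White", True, 510), ("White", True, 490), ("White", False, 540),
    ("Asian", True, 520), ("Asian", False, 560),
    ("Hispanic", True, 480), ("Hispanic", True, 470), ("Hispanic", False, 515),
    ("Black", True, 475), ("Black", False, 505),
]


def mean_score_by_group(records, frl_only=True):
    """Mean scale score per ethnicity, optionally restricted to FRL students."""
    scores = defaultdict(list)
    for ethnicity, on_frl, score in records:
        if on_frl or not frl_only:
            scores[ethnicity].append(score)
    return {group: mean(values) for group, values in scores.items()}


frl_means = mean_score_by_group(students)
all_means = mean_score_by_group(students, frl_only=False)
print("FRL students only:", frl_means)
print("All students:", all_means)
```

Comparing the two sets of means shows whether a gap between ethnic subgroups remains once the comparison is limited to students of similar socioeconomic status.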

Further research is also recommended at the state level. As the state continues to

modify how to rank and label schools it is recommended that it continue to

analyze exactly what its labeling system measures. John Tukey, one of the foremost

statisticians during the 20th century, has been quoted as saying, “It is better to have an

approximate answer to the right question than a precise answer to the wrong question”

(Brainy Quote, 2011). The state of Arizona must constantly analyze its school labeling

system with Tukey’s insight at the forefront. Producing school labels that

are precise in terms of statistical measurement but enable schools to be labeled solely on

their demographic representation is misleading. Analyzing whether school

demographics are directly related to school labels on a yearly basis helps ensure that the state

is constantly updating its labeling system.

Finally, as the state of Arizona expands its abilities to measure different variables

prevalent in the school system a review of the multiple regression for school level grades

should be encouraged. The variables currently available through ADE, as shown in this

research, tend to be inter-correlated. As ADE expands its abilities to collect useful

information on students, schools and districts a review of the multiple linear regression in

relationship to school grade would be warranted.

Concluding Remarks

NCLB had intended to implement accountability systems for educational

institutions throughout the United States in an effort to reform education. In particular,

the biggest reform sought was the closing of the achievement gap between ethnic

subgroups. A variety of accountability systems have since been implemented across the

United States. Unfortunately, too many states use their accountability measurements as a

final judgment for the academic quality of a school. Linn (2008) stated that:

Accountability system results can have value without making causal inferences

about school quality, solely from the results of student achievement measures and

demographic characteristics. Treating the results as descriptive information and

for identification of schools that require more intensive investigation of

organizational and instructional process characteristics are potentially of

considerable value. Rather than using the results of the accountability system as

the sole determiner of sanctions for schools, they could be used to flag school that

need more intensive investigation to reach sound conclusions about needed

improvement or judgments about quality. (p. 21)

The insights from Linn’s research on accountability systems are valuable. Unfortunately,

with respect to the single school district studied in this research the accountability and

school labeling system used by ADE has been found to have “flags” for schools that

receive the best of labels. Recurring achievement gaps at excelling and highly

performing schools in the suburban school district studied suggest that schools with the

highest of labels might need to be flagged as well. Not because these schools are

underperforming with respect to all student achievement but rather because they have

certain ethnic subgroups that are underperforming in comparison with their peer groups.

The ability to measure the impact of school-level and teacher-level factors

continues to be an inexact science. The multivariate environment that children are

exposed to on a daily basis throughout the year prohibits accountability models from being

perfect. George Box (1979), a 20th century statistician, reminds educators that, “All

models are wrong but some are useful.” Certainly, every accountability model in

education thus far has been wrong and educators should continue to establish what to

measure and how to measure it. As the models become more useful one must continue to

understand that most accountability models provide school-wide and district-wide

summary data. As this research has shown, for the suburban district analyzed, the very

summary level data used to hold schools and districts accountable can be misleading

when not analyzed in a disaggregated fashion. The inexact science of accountability

measurement, while demanded by a variety of stakeholders, needs to be cautiously

viewed because, as Box suggests, every model is inherently flawed but each model has

useful information when digested properly.

REFERENCES

Amrein, A. L., & Berliner, D. C. (2002). High-stakes testing, uncertainty, and student

learning. Education Policy Analysis Archives, 10(18).

Arizona Department of Education (ADE) (2008). Zscore. Retrieved from

https://www.azed.gov/azlearns/AZLEARNSTechnicalManual2008.pdf (pg 21).

Arizona Department of Education (ADE) (2010). Arizona’s school accountability system.

Retrieved from http://www.azed.gov/research-evaluation/files/2011/09/2010azlearns-technical-manual.pdf (pg 23).

Arizona Department of Education (ADE) (2011). Arizona October 1st Enrollment

Figures. Retrieved from http://www.ade.az.gov/researchpolicy/AZEnroll

Barton, P. E., & Coley, R. J. (2010, July). The black-white achievement gap. When

progress stopped. Princeton, NJ: Policy Evaluation and Research Center.

Beecher, M., & Sweeny, S. M. (2008). Closing the achievement gap with curriculum

enrichment and differentiation: One school's story. Journal of Advanced

Academics, 19(3), 520-530.

Bell, T. C. (1983). A nation at risk. Retrieved from

http://www2.ed.gov/pubs/NatAtRisk/risk.html

Berliner, D. C. (2009). Are teachers responsible for low achievement by poor students?

Kappa Delta Pi Record, (46), 18-21. Retrieved from

http://greatlakescenter.org/docs/Policy_Briefs/Berliner_NonSchool.pdf

Birenbaum, M., & Nasser, F. (2006). Ethnic and gender differences in mathematics

achievement and in dispositions towards the study of mathematics. Learning and

Instruction, 16, 26-40.

Bishop, J. H. (1997). What should high-school graduates know in economics? The effect

of national standards and curriculum-based exams on achievement. The American

Economic Review, 87(2), 260-264.

Black, J. A. & Champion, D. J. (1976). Methods and issues in social research. New

York, NY: Holt.

Borman, G., & Dowling, M. (2010). Teachers College Record, 112(5), 1201-1246.

Box, G. (1979, May). Robustness in the strategy of scientific model building. In R. L.

Launer and G. N. Wilkinson (Eds.) Robustness in statistics: Proceedings of a

workshop. Salt Lake City, UT: Academic Press.

Brainy Quote. (2011). Retrieved from

http://www.brainyquote.com/words/ap/approximate131762.html

Bruce, M. (2011, September 23). Obama: ‘No child left behind’ changes will allow states

to meet higher standards. Retrieved from

http://abcnews.go.com/blogs/politics/2011/09/obama-no-child-left-behind-changes-will-allow-states-to-meet-higher-standards/

Burch, P., Theoharis, G., & Rauscher, E. (2010). Class size reduction in practice

investigating the influence of the elementary school principal. Educational Policy,

24(2), 330-358.

Carnoy, M., & Loeb, S. (2002). Does external accountability affect student outcomes? A

cross-state analysis. Educational Evaluation and Policy Analysis, 24(4), 305-331.

Center on Education Policy (CEP) (2009, October). Are achievement gaps closing and is

achievement rising for all? State test score trends through 2007-08, part 3.

Clerk of the House of Representatives. (2001). Final votes for roll call 145.

Coleman, J. S. (1966). Equality of educational opportunity. Washington, DC: U.S.

Department of Health, Education, and Welfare.

Costrell, R. M. (1997). Can centralized educational standards raise welfare? Journal of

Public Economics, 65, 271-293.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.

Cronin, J., Kingsbury, G. C., McCall, M. S., & Bowe, B. (2005). The impact of the no

child left behind act on student achievement and growth: 2005 edition (Technical

Report). Lake Oswego, OR: Northwest Evaluation Association.

Curren, P. J. & Werth, R. J. (2004). Interindividual differences in intraindividual

variation: Balancing internal and external validity. Measurement, 2(4), 219-247.

DeNavas-Walt, C., Proctor, B. D., & Smith, J. C. (2010, September). Income, poverty,

and health insurance coverage in the United States: 2009. US Department of

Commerce. Retrieved from http://www.census.gov/prod/2010pubs/p60-238.pdf

Diamond, J. B. (2006). Still separate and unequal: Examining race, opportunity and

school achievement in "integrated" suburbs. Journal of Negro Education, 75(3),

495-505.

Education Week. (2004, September 10). Achievement gap. Retrieved from

http://www.edweek.org/ew/issues/achievement-gap/

Glass, G. & Hopkins, K. (1984). Statistical methods in education and psychology.

Englewood Cliffs, NJ: Prentice Hall.

Glass, G., Peckham, P., & Sander, J. (1972). Consequences of failure to meet

assumptions underlying the fixed effects analyses of variance and covariance.

Review of Educational Research, 42, 237-288.

Goldschmidt, P. (2004, October). Models for school accountability and program

evaluation. Presentation at the Reidy Interactive Lecture Series: Incorporating

measures of student growth into state accountability systems, Nashua, NH.

Retrieved from http://www.nciea.org/publications/RILS_PG04.pdf

Gould, S. J. (1981). The mismeasure of man: The definitive refutation to the argument of

the bell curve. New York: Norton.

Hammersley, M. (1987). Some notes on the terms ‘validity’ and ‘reliability.’ British

Educational Research Journal, 13(1), 73-81.

Hanushek, E. A., & Raymond, M. E. (2003a). In Paul E. Peterson and Martin R. West

(Eds.), Lessons about the design of state accountability systems. Brookings.

Hanushek, E. A., & Raymond, M. E. (2003b). Improving educational quality: How best

to evaluate our schools? Education in the 21st century: Meeting the challenges of

a changing world.

Hanushek, E. A., & Raymond, M. E. (2004). The effect of school accountability systems

on the level and distribution of student achievement. Journal of the European

Economic Association, 2(2), 406-415.

Harlin, R. (2009). The impact of teachers' expectations on diverse learners' academic

outcomes. Childhood Education, 85(4), 253-256.

Hauser, P. M., McMurrin, S. M., Nabrit, J. M., Nelson, L. W., & Odell, W. R. (1964).

Integration of the public schools–Chicago, p.20-21. Chicago, IL: Board of

Education, Chicago Public Schools.

Hess, F. M., & Petrilli, M. J. (2006). No child left behind. New York, NY: Peter Lang

Publishing.

Imber, M., & Van Geel, T. (2004). Educational law (3rd ed.). Mahwah, NJ: Lawrence

Erlbaum Associates, Publishers.

Jacob, B. A. (2001). Getting tough. The impact of high school graduation exams.

Educational Evaluation and Policy Analysis, 23(2), 99-121.

Kerlinger, F. N. (1964). Foundations of behavioral research. New York, NY: Holt.

Kiley, K. (2010, October 10). Arizona to change how it evaluates schools. Arizona

Republic.

Kober, N., Chudowsky, N., & Chudowsky, V. (2010, December). Slow and uneven progress

in narrowing gaps. State test score trends through 2008-09, part 2. Center on

Education Policy.

Kopan, A., & Walberg, H. (1974). Rethinking educational equality. Berkeley, CA:

McCutchan Publishing Corporation.

Leithwood, K. (2010). Characteristics of school districts that are exceptionally effective

in closing the achievement gap. Leadership and Policy in Schools, 9(3), 245-291.

Levine, T. H., & Marcus, A. S. (2007). Closing the achievement gap through teacher

collaboration: Facilitating multiple trajectories of teacher learning. Journal of

Advanced Academics, 19(1), 116-138.

Liew, J., Chen, Q., & Hughes, J. N. (2010). Childhood effortful control, teacher-student

relationships, and achievement in academically at-risk children: Additive and

interactive effects. Early Childhood Research Quarterly, 25(1), 51-64.

Limprianou, J. & Athanasou, J. A. (2009). A teacher’s guide to educational

assessment (Rev. ed.). Rotterdam/Boston/Taipei: SensePublishers.

Retrieved from https://www.sensepublishers.com/files/9789087909147PR.pdf

Linn, R. L. (2006). Educational accountability systems. CSE technical report 687.

Technical Report No. ED492875. National Center for Research on Evaluation,

Standards and Student Testing.

Linn, R. L. (2008). Educational accountability systems. In K. E. Ryan & L. A. Shepard

(Eds.), The future of test-based accountability (pp. 3-24). New York: Routledge.

Linn, R. L., & Miller, D. M. (2005). Measurement and assessment in teaching. Upper

Saddle River, NJ: Pearson Education, Inc.

Loesch, P. C. (2010). 4 core strategies for implementing change. Leadership, 39(5), 28-

31.

Marshall, K. (2009). A how-to plan for widening the gap. Phi Delta Kappan, 90(9), 650-

655.

McKown, C., & Weinstein, R. S. (2008). Teacher expectations, classroom context, and

the achievement gap. Journal of School Psychology, 46(3), 235-261.

Mertler, C. & Vannatta, R. (2005). Advanced and multivariate statistical methods (3rd

ed.). Glendale, CA: Pyrczak Publishing.

Myers, R. (1990). Classical and modern regression with applications (2nd ed.). Boston,

MA: Duxbury Press.

National Assessment of Educational Progress (NAEP) (2009). The Nation’s report card.

Grade 12 reading and mathematics National and pilot state results. US

Department of Education. Retrieved from


Nichols, S. L., & Berliner, D. C. (2007). Collateral damage: How high-stakes testing

corrupts America’s schools. Cambridge, MA: Harvard Education Press.

Orfield, G. (2006). Foreword to J. Lee, Tracking achievement gaps and assessing the

impact of NCLB on the gaps: An in-depth look into national and state reading and

math outcome trends. Cambridge, MA: The Civil Rights Project at Harvard

University. Retrieved from www.agi.harvard.edu/Search/download.php?id=84

Orlich, D. C. (2004). No child left behind: An illogical accountability model. The

Clearing House, 78(1), 6-11.

Peterson, P. E., & West, M. R. (Eds.). (2003). No child left behind? The politics and

practice of accountability. Washington, DC: Brookings Institution Press.

Popham, J. W. (2004). America's "failing" schools: How parents and teachers can cope

with no child left behind. New York, NY: Routledge Falmer.

Reese, W. J. (2005). America's public schools: From the common school to "no child left

behind." Baltimore, MD: The Johns Hopkins University Press.

Resnick, L. B. & Resnick, D. P. (1985). Standards, curriculum, and performance: A

historical and comparative perspective. Educational Researcher, 16(9), 13-20.


Rothman, R. (1995). Measuring up: Standards, assessment and school reform. San

Francisco, CA: Jossey-Bass Publishers.

Rury, J. L. (2002). Educational and social change: Themes in the history of American

schooling. Mahwah, NJ: Lawrence Erlbaum Associates.

Smith, E. (2005). Raising standards in American schools: The case of no child left

behind. Journal of Education Policy, 20(4), 507-524.

Smith, M. (2007). Leaving NCLB renewal behind. Retrieved from

http://www.educationevolving.org/pdf/mikesmithoped.pdf

Steel, T. D. (2009). Closing the achievement gap: What can be done? Unpublished

Doctoral Dissertation (3355438), University of Southern California, Ann Arbor,

MI.

Stevens, J. (2007). Intermediate statistics (3rd ed.). New York, NY: Lawrence Erlbaum

Associates.

Tabachnick, B.G. & Fidell, L.S. (1996). Using multivariate statistics (3rd ed.). New

York: Harper Collins.

Tidman, P., & Kahane, H. (2003). Logic and philosophy: A modern introduction, (9th

ed.). Belmont, CA: Wadsworth/Thomson Learning.

Trochim, W. M. (2006). The Research Methods Knowledge Base, (2nd ed.). Retrieved

from: http://www.socialresearchmethods.net/kb/intval.php

U.S. Department of Education. (2005, July 14). Spellings hails new national report card

results: Today's news “proof that No Child Left Behind is working.” Press

release. Retrieved from

http://www.ed.gov/news/pressreleases/2005/07/07142005.html

U.S. Department of Education. (2009, January). Great expectations. Holding ourselves

and our schools accountable for results. Office of the Secretary.

U.S. Senate roll call votes 107th congress - 1st session (2001).

Viadero, D. (2006, June 21). Race report’s influence felt 40 years later. Legacy of

Coleman study was new view of equity. Education Week, 25(41).

Walker, G. (1963, July 6). Englewood and the northern dilemma. The Nation, 197, 7-10.

Webb, L. D. (2006). The history of American education: A great American experiment.

Upper Saddle River, NJ: Pearson Education, Inc.

Wei, X. (2008). Accountability stringency, incentives and student performance. Doctoral

Dissertation, Stanford University.

Zhang, Y., & Zhang, L. (2002, April 1-5). The applicability of selected regression and

hierarchical linear models to the estimation of school and teacher effects, 1-19.

New Orleans, LA.

Zimmerman, B. J. & Schunk, D. H. (2003). Educational psychology: A century of

contributions. Lawrence Erlbaum Associates, ISBN 0805836829 retrieved from

http://books.google.com/books?id=bqo5A2nBwHYC&pg=PA37&lpg=PA37&dq

=Zimmerman,+Barry+J.;+Schunk,+Dale+H.+(2003),+Educational+Psychology:+

A+Century+of+Contributions,+Lawrence+Erlbaum+Associates,+ISBN+0805836

829&source=bl&ots=KeS4NKCSJu&sig=LqV9OnONnHFDiS1cCyPqHtM9Zus

&hl=en&ei=af2YTf3QOJTWiALi74CdCQ&sa=X&oi=book_result&ct=result&re

snum=1&ved=0CBQQ6AEwAA#v=onepage&q&f=false

Zuzovsky, R. (2008). Closing achievement gaps between Hebrew-speaking and Arabic-

speaking students in Israel: Findings from TIMSS-2003. Studies in Educational

Evaluation, 24, 105-117.







APPENDIX A

CUSD IRB APPROVAL

BIOGRAPHICAL INFORMATION

Matt Strom was born on May 9, 1977, in Boone, Iowa. Raised in a family that always stressed education, Matt graduated from Mountain Pointe High School in 1995 and Arizona State University in 1998, and received his graduate degree from Northern Arizona University in 2002. While completing his graduate degree he married Marcia Jones. After the marriage, the good Lord blessed the Strom family with three children: Zavian, Quentin, and Elijah. Matt has worked in a variety of roles during his 14-year educational career. At age 21, he began teaching mathematics at Mesquite High School. As he was about to turn 25, he was hired as the varsity boys basketball coach at a 5A school, making him the youngest active large-school varsity boys coach in the state of Arizona. He has also served as mathematics department chair, AVID teacher, and head varsity golf coach. Currently, Matt serves as research analyst to the superintendent of his district. His constant desire to learn has led him to participate in many learning experiences since his graduate degree. He was a PLC leader for the Project Pathways STEM project out of ASU. He has attended numerous educational workshops, including AVID training and NCTM conferences. Furthermore, in an effort to reconnect with the mathematics classes from his undergraduate degree, Matt studied for and passed Exam P, Probability for Risk Management, the first exam in the Society of Actuaries exam process. Matt, like the majority of his fellow educators, has a thirst for knowledge, and as a result he started the doctoral process in the summer of 2008 through Northern Arizona University. Upon completion of his degree, Matt hopes to continue to grow in the educational field.
He strongly desires to gain his superintendent’s certificate in order to pursue employment in district-level educational administration at the K-12 level. In accord with the themes of this dissertation, Matt hopes to bear witness to the day when ethnicity is not a determining factor in the quality of education that a child receives.
