Can Large-Scale Tests be Fair to All Students? Bias Issues Related to WASL Catherine S. Taylor University of Washington/OSPI Yoonsun Lee OSPI Johnnie McKinley University of Washington March 29, 2007


Can Large-Scale Tests be Fair to All Students?

Bias Issues Related to WASL

Catherine S. Taylor, University of Washington/OSPI

Yoonsun Lee, OSPI

Johnnie McKinley, University of Washington

March 29, 2007


Focus of this presentation is on three studies:

• Study 1: What can we learn from Bias and Sensitivity Review procedures used for WASL? (2004)

• Study 2: Report of input from two Public Forums on Bias and Sensitivity (2004)
– Yakima
– Seattle

• Study 3: Investigation of ‘Differential Item Functioning’ (AKA statistical bias) in WASL test items (1997-2001)


WASL test items are developed using state-of-the-art procedures:

• Test Specifications: define how many and what types of items will be on a test

• Item Specifications: define exactly what kinds of items will assess each Grade Level Expectation (GLE)

• Item writing: overseen by skilled test developers

• Item reviews: check for match to GLEs by teachers

• Bias and sensitivity reviews: by individuals who represent the diversity of WA State students


WASL test items are developed using state-of-the-art procedures:

• Item pilots: items are randomly assigned to students throughout WA State

• Item data reviews: based on students’ performances
– Statistical difficulty: Is the item easy or difficult because of the content tested, NOT some flaw in the item?
– Statistical validity: Do high-performing students do better on the item than low-performing students?
– Statistical bias: Is item performance related to level of knowledge and skill, NOT group membership?
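The first two item-data checks can be sketched with classical item statistics. This is a minimal illustration, not the WASL procedure: it assumes dichotomous (0/1) item scores, uses proportion correct as the difficulty index and the point-biserial correlation with total score as the validity (discrimination) index, and all data and function names are made up for the example.

```python
from statistics import mean, pstdev

def item_difficulty(item_scores):
    """Proportion of students who answered the item correctly (0/1 scores)."""
    return mean(item_scores)

def item_discrimination(item_scores, total_scores):
    """Point-biserial correlation between item score and total test score."""
    mi, mt = mean(item_scores), mean(total_scores)
    si, st = pstdev(item_scores), pstdev(total_scores)
    if si == 0 or st == 0:
        return 0.0  # no variation, so the correlation is undefined
    cov = mean((i - mi) * (t - mt) for i, t in zip(item_scores, total_scores))
    return cov / (si * st)

# Hypothetical data: one item and the total scores of eight students.
item = [0, 0, 1, 0, 1, 1, 1, 1]
total = [10, 12, 15, 18, 22, 25, 28, 30]

difficulty = item_difficulty(item)
discrimination = item_discrimination(item, total)
print(f"difficulty={difficulty:.3f} discrimination={discrimination:.3f}")
```

A positive discrimination value means higher-scoring students tended to do better on the item, which is what the statistical validity question asks.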


Study 1: Bias & Sensitivity Reviews

• Committee members represent diversity in the student population (regions, ethnicity, gender, socio-economic status, religion, special population issues)

• Members review reading passages and items for:
– Implied or overt stereotyping or negative representations of any group
– Too much or too little representation of any group
– Terms that may be confusing to students based on language, region, culture, socio-economic status, etc.
– Controversial issues and topics that may affect some groups more than others


Procedures Used to Observe Bias & Sensitivity Reviews:

• Participant-observer

• Recorded panelists’ comments during review process

• Cross-checked records with facilitator notes

• Looked for patterns in notes/records in relation to reading passages and items


Results of Bias and Sensitivity Review Observations:

• Few passages or test items are identified as problematic

• Reading passages present the greatest potential for bias

• Sources of bias in reading passages are subtle


Reading passages present the greatest potential for bias:

• WASL reading passages include:
– narrative and informative passages
– passages with social studies, science, and literary content

• WASL reading passages are from published sources

• Authors resist changes to their published writing (even when changes lessen bias/stereotyping)


Sources of bias in reading passages are subtle:

• Alterations of original narratives:
– Legends and folk tales may be altered to fit Western notions of literature
– Language changes can change meaning (first feast vs. barbeque)

• “Othering”:
– Biographies may focus on how individuals overcame or coped with their minority status (Jackie Robinson; Helen Keller)
– Informational passages about cultural groups may have a patronizing tone (i.e., aren’t “their” ways cute)

• Interpretations: Items may focus on interpretations that are unique to middle class values rather than values of the culture of origin


Questions?


Study 2: Bias & Sensitivity Forums

• Two community forums (Yakima and Seattle)

• Community members came together to discuss concerns about WASL

• Participants included:

– Teachers and school administrators

– Tribal elders

– Latino community leaders

– Parents and community members


Procedures used to Gather Data during Bias & Sensitivity Forums

• Conducted a mock bias & sensitivity review

• Presented methods used for statistical “bias” analysis (also called differential item functioning (DIF))

• Showed items flagged for DIF and asked for likely causes

• Small group discussion with reports to larger group

• Recorded participant ideas about bias issues in WASL

• Examined written notes and chart paper for themes


Themes in Participant Comments

• Need for involvement of minority teachers in all stages of WASL development work – this may require community involvement

• Need for sensitivity to cultural values in selection of reading passages, item content, and the types of questions (particularly in reading)

• Need for inclusion of tribal elders in selection of text and contexts for WASL items

• Need for inclusion of individuals with cultural expertise in bias/sensitivity review panels


Study 3: Differential Item Functioning (DIF) Analyses

What is Differential Item Functioning (DIF or Item Bias)?

• DIF occurs when examinees from different groups who have the same level of ability have a different chance of answering an item correctly (Dorans & Holland, 1993)

• Most bias analyses focus on cultural differences, with students grouped according to some inherent demographic attribute (Scheuneman & Gerritz, 1990; Schmitt & Dorans, 1990; Wang & Lane, 1996).


Usual Focus of DIF/Bias Studies

• Two comparable groups of examinees
– Reference group: larger or more dominant group
– Focal group: smaller or less dominant group

• Common demographic dimensions:
– Males compared with females
– Students speaking English as first language compared with students speaking English as second language
– European American students compared with students from other American races/cultures


Multidimensionality as an Explanation for DIF/Bias

Multidimensionality occurs when:

1. An item requires use of two or more abilities (e.g., reading and mathematics) to respond correctly

2. DIF/Bias for multi-dimensional items occurs when individuals from different groups have:
– identical ability on the primary dimension
– unequal ability on the secondary dimension
– as a result, a different likelihood of answering the item correctly
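A small numeric sketch may make the multidimensionality argument concrete. It assumes a compensatory two-dimensional logistic item model, which the slides do not specify; the discrimination values `a1`, `a2`, the difficulty `d`, and the ability values are hypothetical and chosen only for illustration.

```python
import math

def p_correct(theta1, theta2, a1=1.0, a2=0.8, d=0.0):
    """Probability of a correct answer under a compensatory 2-D logistic model.

    theta1: ability on the primary dimension (e.g., mathematics)
    theta2: ability on the secondary dimension (e.g., reading)
    a1, a2, d: hypothetical item parameters (assumptions for this sketch)
    """
    return 1.0 / (1.0 + math.exp(-(a1 * theta1 + a2 * theta2 - d)))

# Same primary ability for both examinees, different secondary ability.
theta1 = 0.5
p_reference = p_correct(theta1, theta2=0.5)
p_focal = p_correct(theta1, theta2=-0.5)
dif_gap = p_reference - p_focal

print(f"reference={p_reference:.3f} focal={p_focal:.3f} gap={dif_gap:.3f}")
```

When `a2` is set to 0 the item no longer draws on the secondary ability and the gap disappears, which matches the claim that DIF here comes from the secondary dimension rather than the one the item is meant to measure.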


Typical Steps in a DIF Analysis:

• Identify two groups to be compared

• Compute item performance for students in each group at each total test score

• Summarize the differences in performance across all test scores
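The three steps above can be sketched as follows. This is an illustration only: it summarizes the differences with a focal-group-weighted average of (focal minus reference) proportion correct at each total score, similar in spirit to a standardization index, which is an assumption and not necessarily the summary used in the WASL studies. The record layout, group labels, and data are hypothetical.

```python
from collections import defaultdict

def conditional_p(records, group):
    """Map total score -> (proportion correct on the item, n) for one group."""
    counts = defaultdict(lambda: [0, 0])  # score -> [number correct, number of students]
    for g, total, correct in records:
        if g == group:
            counts[total][0] += correct
            counts[total][1] += 1
    return {k: (c / n, n) for k, (c, n) in counts.items()}

def weighted_difference(records, reference="R", focal="F"):
    """Focal-weighted mean of (focal - reference) proportion correct across scores."""
    ref = conditional_p(records, reference)
    foc = conditional_p(records, focal)
    shared = [k for k in foc if k in ref]  # score levels observed in both groups
    n_focal = sum(foc[k][1] for k in shared)
    if n_focal == 0:
        raise ValueError("no total-score level is shared by both groups")
    return sum(foc[k][1] * (foc[k][0] - ref[k][0]) for k in shared) / n_focal

def make(group, score, n_correct, n):
    """Build hypothetical (group, total score, item correct) records."""
    return [(group, score, 1)] * n_correct + [(group, score, 0)] * (n - n_correct)

records = (make("R", 10, 2, 4) + make("F", 10, 1, 4)
           + make("R", 20, 3, 4) + make("F", 20, 2, 4))

summary = weighted_difference(records)
print(summary)  # negative values mean the item runs against the focal group
```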


DIF/Bias Statistical Procedures

• Mantel-Haenszel (Holland & Thayer, 1988)

• Logistic regression (Swaminathan & Rogers, 1990)

• SIBTEST, the simultaneous item bias test (Shealy & Stout, 1993)
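As an illustration of the first procedure, the Mantel-Haenszel common odds ratio can be computed from one 2x2 table (group by correct/incorrect) per total-score level. The sketch below is a minimal version under stated assumptions: it omits the chi-square significance test, the tuple layout and data are hypothetical, and the conversion to the delta scale uses the conventional factor of -2.35.

```python
import math

def mantel_haenszel(tables):
    """Return (common odds ratio, delta-scale DIF) from stratified 2x2 tables.

    tables: iterable of (ref_correct, ref_wrong, focal_correct, focal_wrong),
            one tuple per total-score level (hypothetical layout).
    """
    num = den = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        if n == 0:
            continue  # skip empty score levels
        num += a * d / n
        den += b * c / n
    if num == 0 or den == 0:
        raise ValueError("odds ratio is undefined for these tables")
    alpha = num / den
    return alpha, -2.35 * math.log(alpha)

# Hypothetical counts at two total-score levels.
tables = [(2, 2, 1, 3), (3, 1, 2, 2)]
alpha, delta = mantel_haenszel(tables)
print(f"alpha={alpha:.2f} delta={delta:.2f}")
```

An odds ratio above 1 (a negative delta) indicates that, at matched total scores, the reference group had higher odds of answering correctly than the focal group.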


SIBTEST

• A nonparametric statistical test to sequentially detect DIF/Bias present in one or more items of a test

• An outgrowth of the multidimensional IRT modeling of DIF/Bias (Nandakumar & Stout, 1993; Ackerman, 1994; Roussos & Stout, 1996)
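The core of the SIBTEST index can be sketched in a few lines. This simplified version is an assumption-laden illustration: it leaves out the regression correction described in the appendix, weights each matching-score level by the focal-group proportion, and uses made-up data and names. Positive values indicate that the item favors the reference group.

```python
import math
from statistics import mean, variance

def sibtest_uncorrected(ref, foc):
    """Return (beta, B) without the regression correction.

    ref, foc: dicts mapping matching-subtest score k -> list of scores on the
              studied item for the reference and focal group (hypothetical layout).
    """
    # Keep levels with at least two examinees per group so variances are defined.
    usable = [k for k in foc if k in ref and len(ref[k]) > 1 and len(foc[k]) > 1]
    n_focal = sum(len(foc[k]) for k in usable)
    if n_focal == 0:
        raise ValueError("no usable score levels")
    beta = var = 0.0
    for k in usable:
        p_k = len(foc[k]) / n_focal  # focal-group proportion at score k
        beta += p_k * (mean(ref[k]) - mean(foc[k]))
        var += p_k ** 2 * (variance(ref[k]) / len(ref[k])
                           + variance(foc[k]) / len(foc[k]))
    b_stat = beta / math.sqrt(var) if var > 0 else float("nan")
    return beta, b_stat

# Hypothetical item scores at two matching-score levels.
ref = {10: [1, 1, 0, 0], 20: [1, 1, 1, 0]}
foc = {10: [1, 0, 0, 0], 20: [1, 1, 0, 0]}
beta, b_stat = sibtest_uncorrected(ref, foc)
print(f"beta={beta:.3f} B={b_stat:.3f}")
```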


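Example of Equal Abilities Distribution on the Primary Dimension; Different Abilities on the Secondary Dimension

[Figure: ability distributions for the reference (R) and focal (F) groups on two axes, θ1 and θ2, labeled Mathematics and Reading and running from -3 to 3; the groups coincide on θ1 (R = F) and differ on θ2.]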


Comparison of White Students' and Black Students' Performance on a Hypothetical Item

[Figure: percent of students with correct answer (0% to 90%) plotted against scale score on the test (270 to 315), with separate lines for White students and Black students.]


DIF Can Go Both Ways:

• When individual students get their total scores from different items – that’s normal

• When there is a pattern in how groups of students get their total scores - that’s DIF

• When students in a group do better than expected on an item based on their total test score, DIF is in favor of the group.

• When students in a group do more poorly than expected on an item based on their total test score, DIF is against the group.


Typical Causes of DIF:

• Impact: Students from different groups receive different educational experiences such that item performance differences reflect true differences in knowledge/skills.

• Culture/Background: Students from different backgrounds bring unique perspectives to bear on test items.

• Language: Language used in items is differentially familiar to students.

• Effort: Examinees from different groups may attempt different items based on perceived likelihood of success.


Research on DIF for WASL Test Items:

• Studies were conducted after items had been:
– reviewed by bias & sensitivity committee
– examined for statistical bias
– used in an operational test

• Compared performance of:

– Males and Females

– White students and Black/African American students

– White students and Latino/Hispanic students

– White students and Native American students

– White students and Asian/Pacific Islander students


Research on DIF for WASL Test Items:

• Examined test items from:

1997, 1998, 1999, 2000, 2001 Grade 4 Reading and Mathematics

1998, 1999, 2000, 2001 Grade 7 Reading and Mathematics

1999, 2000, 2001 Grade 10 Reading and Mathematics


DIF Results for Reading:

• Most reading items showed no statistical bias

• Reading items flagged for Gender DIF:
– Multiple-choice items tend to favor boys
– Performance items tend to favor girls
– Items favoring boys tend to be related to informational passages

• Reading items flagged for Ethnic DIF:
– Multiple-choice items asking for text interpretation tend to favor white students
– Performance items asking for text interpretation tend to favor minority students
– Patterns became more extreme across grade levels


Mean Number of Reading Items Flagged for DIF (Males & Females)

(MC = multiple-choice items; P = performance items)

Grade  Item Type  Favor Males  Favor Females
4      MC         1.20         0.00
4      P          0.80         3.00
7      MC         4.50         0.50
7      P          0.50         5.00
10     MC         5.33         0.33
10     P          2.33         6.00


Mean Number of Reading Items Flagged for DIF (Asian/Pacific Islander & White)

Grade  Item Type  Favor Asians/Pacific Islanders  Favor Whites
4      MC         0.20                            1.40
4      P          2.20                            0.60
7      MC         0.00                            4.25
7      P          5.50                            0.00
10     MC         0.00                            4.00
10     P          6.67                            1.67


Mean Number of Reading Items Flagged for DIF (Black/African & White)

Grade  Item Type  Favor Blacks/Africans  Favor Whites
4      MC         0.20                   0.60
4      P          2.00                   0.40
7      MC         0.00                   2.25
7      P          3.25                   0.25
10     MC         0.67                   2.33
10     P          5.33                   1.33


Mean Number of Reading Items Flagged for DIF (Native American & White)

Grade  Item Type  Favor Native Americans  Favor Whites
4      MC         0.00                    0.00
4      P          1.00                    0.20
7      MC         0.00                    0.25
7      P          1.00                    0.25
10     MC         0.00                    1.00
10     P          1.67                    0.67


Mean Number of Reading Items Flagged for DIF (Latino/Hispanic & White)

Grade  Item Type  Favor Latinos/Hispanics  Favor Whites
4      MC         0.40                     1.20
4      P          2.20                     0.20
7      MC         0.00                     3.25
7      P          5.50                     0.00
10     MC         0.00                     3.00
10     P          6.00                     1.67


Excerpt from a reading passage:

The best looking fences are often the simplest. A simple fence around a beautiful home can be like a frame around a picture. The house isn’t hidden; its beauty is enhanced by the frame. But a fence can be a massive, ugly thing, too, made of bricks and mortar. Sometimes the insignificant little fences do their job just as well as the ten-foot walls. Maybe it’s only a string stretched between here and there in a field. The message is clear; don’t cross here.

Every fence has its own personality and some don’t have much. There are friendly fences. A friendly fence takes kindly to being leaned on. There are friendly fences around some playgrounds. And some playgrounds fences are more fun to play on than anything they surround. There are more mean fences than friendly fences overall, though. Some have their own built-in invitation not to be sat upon. Unfriendly fences get it right back sometimes. You seldom see one that hasn’t been hit, bashed, or bumped or in some way broken or knocked down.


Example of a Reading Item that Shows Statistical Bias in Favor of Focal Groups:

In the sixth paragraph, the author talks about friendly and unfriendly fences. How can you tell them apart?

________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

* Favors Latinos, Blacks/African Americans, and Asian/Pacific Islanders


Example of a Reading Item that Shows Statistical Bias in Favor of Focal Groups:

What is the author’s attitude toward fences? Give three pieces of evidence from the essay to support your point.

________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________________

* Favors Females, Asian/Pacific Islanders, and Latinos


Example of a Reading Item that Shows Statistical Bias in Favor of Males and Whites


DIF Results for Mathematics:

• Most mathematics items showed no statistical DIF

• Mathematics items flagged for Gender DIF:
– Multiple-choice items tend to favor boys
– Performance items tend to favor girls
– DIF items favoring boys tend to require simple applications of mathematical procedures in number, algebra, geometry, and statistics
– DIF items favoring girls tend to assess data analysis, measurement, complex applications, reasoning, and problem-solving
– Number of items flagged for DIF increased across grade levels


DIF Results for Mathematics:

• Ethnic DIF statistical patterns:

– Performance items were flagged for DIF more often than multiple-choice items

– Slightly more of the flagged performance items favored minority students, although differences were small


Mean Number of Mathematics Items Flagged for DIF (Males & Females)

Grade  Item Type  Favor Males  Favor Females
4      MC         2.20         0.00
4      P          2.00         5.20
7      MC         3.50         0.50
7      P          1.75         5.25
10     MC         5.67         0.00
10     P          3.67         7.33


Mean Number of Mathematics Items Flagged for DIF (Asian/Pacific Islander & White)

Grade  Item Type  Favor Asians/Pacific Islanders  Favor Whites
4      MC         1.00                            2.00
4      P          1.80                            1.60
7      MC         1.50                            2.50
7      P          5.75                            3.00
10     MC         3.00                            1.33
10     P          3.67                            4.67


Mean Number of Mathematics Items Flagged for DIF (Black/African & White)

Grade  Item Type  Favor Blacks/Africans  Favor Whites
4      MC         1.00                   0.80
4      P          2.00                   1.40
7      MC         0.25                   0.75
7      P          3.25                   1.50
10     MC         2.33                   1.33
10     P          3.00                   3.00


Mean Number of Mathematics Items Flagged for DIF (Native American & White)

Grade  Item Type  Favor Native Americans  Favor Whites
4      MC         0.00                    0.00
4      P          1.80                    1.00
7      MC         0.00                    0.50
7      P          1.75                    1.25
10     MC         0.00                    0.67
10     P          3.00                    2.00


Mean Number of Mathematics Items Flagged for DIF (Latino/Hispanic & White)

Grade  Item Type  Favor Latinos/Hispanics  Favor Whites
4      MC         0.80                     1.00
4      P          2.60                     0.80
7      MC         0.25                     1.25
7      P          3.50                     1.75
10     MC         0.33                     0.67
10     P          3.00                     2.00


DIF Results for Mathematics:

• Content analysis of Mathematics items flagged for Ethnic DIF:
– Flagged items favoring Asian/Pacific Islander students generally assessed number concepts, computation, geometric procedures, algebraic procedures, and simple statistics
– Flagged items favoring Black/African, Native American, and Latino/Hispanic students generally assessed number, number patterns, computation, and logical reasoning
– Flagged items favoring White students generally assessed data analysis, data representation, measurement, reasoning, and problem-solving


Example of a Mathematics Item that Shows Statistical Bias in Favor of Focal Groups:

* Favors Latinos, Native Americans, Asian/Pacific Islanders, Black/African Americans, and Females


Example of a Mathematics Item that Shows Statistical Bias in Favor of Focal Groups:

* Favors Asian/Pacific Islanders


Conclusions from DIF Studies:

• Results suggest:
– Exclusive reliance on multiple-choice items for reading tests may result in bias against girls and minority students – particularly when items assess interpretation of text
– Exclusive reliance on multiple-choice items for mathematics tests may result in bias against girls
– Ethnic DIF results in mathematics suggest that content of instruction differs for students in different groups


Additional Points

• Similar results have been found in studies of other tests

• However, these results can only be generalized when:
– Items are written in the same way as WASL items (structured, not too open-ended)
– Diverse, appropriate interpretations and problem solutions are selected for use to train scorers


Can Standardized Tests be Fair to All Students?

Yes, under some conditions:
– Use of reading passages that maintain cultural characteristics
– Well-developed performance items that present clear directions to students
– Use of item writers from diverse backgrounds
– Selection of anchor papers and training papers that represent diverse, valid responses
– Cultural experts in bias & sensitivity reviews


Appendix

The following pages give the mathematical model for SIBTEST


Estimated True Score (Regression Correction)

where,


Bias index

(After regression correction applied)


Bias statistic

where


Bias index

where

Gfk: proportion of people in focal group getting score of k
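The formulas on these appendix slides are not present in this transcript. For orientation only, the SIBTEST quantities named in the slide titles are commonly written as below, following Shealy & Stout (1993). Apart from G_Fk, the notation is an assumption rather than the slides' own: g indexes the group (R = reference, F = focal), k is the matching-subtest score, \bar{Y}_{gk} is the mean studied-item score in group g at score k, J_{gk} is the number of examinees in that cell, \bar{V}_g is the group's mean matching-subtest score, and \hat{\rho}_g is an estimate of the matching subtest's reliability in group g.

```latex
% Estimated true score (regression correction)
\hat{V}_g(k) = \bar{V}_g + \hat{\rho}_g \,\bigl(k - \bar{V}_g\bigr),
\qquad
\hat{V}(k) = \tfrac{1}{2}\bigl[\hat{V}_R(k) + \hat{V}_F(k)\bigr]

% Adjusted item means after the regression correction
\bar{Y}^{*}_{gk} = \bar{Y}_{gk} + \hat{M}_g(k)\,\bigl[\hat{V}(k) - \hat{V}_g(k)\bigr],
\qquad
\hat{M}_g(k) = \frac{\bar{Y}_{g,k+1} - \bar{Y}_{g,k-1}}{\hat{V}_g(k+1) - \hat{V}_g(k-1)}

% Bias index (after regression correction applied)
\hat{\beta} = \sum_{k} G_{Fk}\,\bigl(\bar{Y}^{*}_{Rk} - \bar{Y}^{*}_{Fk}\bigr)

% Bias statistic
B = \frac{\hat{\beta}}{\hat{\sigma}(\hat{\beta})},
\qquad
\hat{\sigma}(\hat{\beta}) =
\left[\sum_{k} G_{Fk}^{2}
\left(\frac{\hat{\sigma}^{2}(Y \mid k, R)}{J_{Rk}}
    + \frac{\hat{\sigma}^{2}(Y \mid k, F)}{J_{Fk}}\right)\right]^{1/2}
```

Under the hypothesis of no DIF, B is treated as approximately standard normal, so large positive values point to DIF favoring the reference group and large negative values to DIF favoring the focal group.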