Validation Studies & Cut Scores

January 21, 2007

EPSB Work Session

Purposes of Testing/Cut Scores

To meet statutory requirements and to select individuals who have a minimum level of academic proficiency & content knowledge to be presumed capable of delivering education to children in the public schools.

Testing partners with –

Admission requirements into teacher preparation programs, including Praxis I tests, college admission exams, grade point averages, and other academic proficiency assessments

Kentucky Teacher Internship Program (KTIP)

What decision are we making?

What purpose are we serving?

What’s the best way to make a decision?

Inferences

Two Types of Inferences

1. Select those with the highest levels of qualification

2. Select those who have minimum qualifications

Minimum = low

Inferences

Negative Consequences of increasing cut scores:

1. Reduce the number of qualified applicants for certification

2. Increase the number of emergency and conditional certificates

3. Create teacher shortages

4. Reduce institutional QPI scores

5. Disparate impact

In any case, the same people will be teaching.

Negative Consequences

Positive consequences of increasing cut scores:

1. Decrease the number of false positives

2. Perhaps marginally increase the quality of teaching

Positive Consequences

Research

When teacher basic skills test scores have been used as predictors of teacher performance, few studies have shown any strong relationship, and some studies have shown no relationship at all.

Studies that have attempted to relate content knowledge to teacher performance (most of them confined to mathematics and science) have shown modest results at best.

Standard Error of Measurement (SEM)

Methodologies for establishing cut scores:

– Angoff
– Contrasting Groups
– Bookmark
– Yeager-Mills (Body of Work)
– Other

Technical Considerations
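The slide names the Standard Error of Measurement without defining it. Under classical test theory, SEM is usually computed as the score standard deviation times the square root of one minus the reliability coefficient. The sketch below assumes that definition; the function names and the numeric values are hypothetical and do not come from any Praxis test.

```python
import math

def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return score_sd * math.sqrt(1.0 - reliability)

def score_band(observed, sem, n_sem=1.0):
    """Band of +/- n_sem SEMs around an observed score."""
    return (observed - n_sem * sem, observed + n_sem * sem)

# Hypothetical values: SD of 10 scale points, reliability of 0.91.
sem = standard_error_of_measurement(10.0, 0.91)
low, high = score_band(160.0, sem)
```

A band like this is one way to see how much an observed score near the cut score could move on retesting.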

• Results in determination of whether a test is valid for use in KY & provides a recommended cut score

• Most widely used

• Requires teacher judgments

• Held up in most research studies as the method that produces the most stable results

• Generally accepted by the courts & professionals in the field

Angoff Method

Validation Process/Cut Score Recommendation

• A panel of teachers, representative of the state and of each grade level and content area, is selected to review the items on each test.

• Teachers are asked to estimate the proportion of persons with minimally acceptable skills in the content area who would be expected to get each item right. (At least 70% of the items must be judged job relevant in order for the test to be deemed a valid measure of performance.)

• After all items have been rated, the judgments of the teachers are combined to recommend a cut score for the whole test.

Angoff Method Process
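The process described on this slide can be sketched in a few lines. The sketch assumes that panelists' judgments are combined by simple averaging (the slide says only that they are "combined") and that the result is a raw score that has not yet been converted to the reporting scale. The function names, data layout, and ratings are hypothetical.

```python
from statistics import mean

def angoff_raw_cut(ratings):
    """ratings[j][i] is panelist j's estimate of the proportion of
    minimally qualified candidates who would answer item i correctly.
    Returns the expected raw score of a minimally qualified candidate."""
    n_items = len(ratings[0])
    if any(len(r) != n_items for r in ratings):
        raise ValueError("every panelist must rate every item")
    # Average across panelists for each item, then sum across items.
    item_means = [mean(r[i] for r in ratings) for i in range(n_items)]
    return sum(item_means)

def meets_relevance_threshold(relevant_flags, threshold=0.70):
    """True when at least `threshold` of the items were judged job relevant."""
    return sum(relevant_flags) / len(relevant_flags) >= threshold

# Hypothetical ratings: three panelists, four items.
ratings = [
    [0.50, 0.75, 0.25, 1.00],
    [0.50, 0.50, 0.25, 0.75],
    [0.50, 0.25, 0.25, 0.50],
]
raw_cut = angoff_raw_cut(ratings)
test_is_valid = meets_relevance_threshold([True, True, True, False])
```

With these made-up ratings, the item averages sum to a raw cut of two items correct out of four.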

Based on Decision Rules applied since May 1999

Accept the recommendation of the validation panel unless:

a. The recommendation fell below the current passing score, or

b. The recommendation fell below the Southern Regional Education Board (SREB) average, or

c. The recommendation and SREB score fell below the 15th national percentile, or

d. The recommendation exceeded the 25th national percentile

Cut Score Recommendations
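The decision rules can be read as a set of checks on a panel recommendation. The sketch below assumes the 15th and 25th national percentiles are supplied as scale scores, and it only reports which exceptions apply, because the slide does not say which score is adopted when an exception is triggered. The function name, parameter names, and example values are hypothetical.

```python
def decision_rule_exceptions(recommended, current_cut, sreb_average,
                             score_at_15th_pct, score_at_25th_pct):
    """Return the letters (a-d) of the exceptions triggered by a panel
    recommendation. An empty list means the rules as stated would
    accept the recommendation."""
    triggered = []
    if recommended < current_cut:
        triggered.append("a")
    if recommended < sreb_average:
        triggered.append("b")
    if recommended < score_at_15th_pct and sreb_average < score_at_15th_pct:
        triggered.append("c")
    if recommended > score_at_25th_pct:
        triggered.append("d")
    return triggered

# Hypothetical values, not taken from any test in this presentation.
exceptions = decision_rule_exceptions(
    recommended=150, current_cut=152, sreb_average=151,
    score_at_15th_pct=145, score_at_25th_pct=160,
)
```

In this made-up case the recommendation falls below both the current passing score and the SREB average, so exceptions a and b apply.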

| Test | Test Number | Validation Panel Recommended Cut Score | Regulatory Cut Score |
|---|---|---|---|
| Earth Science: Content Knowledge | 0571 | 145 | 145 |
| Principles of Learning & Teaching: Grades K-6 | 0522 | 164 | 161 |
| Principles of Learning & Teaching: Grades 5-9 | 0523 | 166 | 161 |
| Principles of Learning & Teaching: Grades 7-12 | 0524 | 158 | 161 |
| Speech Communication | 0220 | 570 | 580 |
| Education of Exceptional Students: Core Content Knowledge | 0353 | 157 | 157 |
| Elementary Education: Content Knowledge | 0014 | 148 | 148 |
| Education of Exceptional Students: Mild to Moderate Disabilities | 0542 | 165 | 172 |
| Education of Exceptional Students: Core Content Knowledge | 0353 | 157 | 157 |
| Biology: Content Knowledge | 0235 | 154 | 146 |
| Chemistry: Content Knowledge | 0245 | 161 | 147 |
| Physics: Content Knowledge | 0265 | 145 | 133 |

Tests Validated Since May 1999 & Corresponding Cut Scores

Challenges

Evidence must support that each test chosen is:

#1: valid for the purpose for which it is used

#2: anchored in reasonable expectations of job performance

#3: a reliable measure

#4: fair, in that it does not unfairly disadvantage members of demographic groups

ETS Assurance

The Educational Testing Service (ETS) employs many psychometricians and conducts many test studies that influence test development, maintenance, and revision, including bias reviews, DIF analysis, reliability coefficients, and calculation of p-values.

Recommended Framework

• Cut scores between the 15th and 25th percentiles, inclusive

• Greater than or equal to current cut score

• Comparable to SREB average cut score

• Use disparate impact estimates as indicators of possible program performance reviews, combined with other information
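One way to produce a disparate impact estimate at a candidate cut score is to compare pass rates across groups. The sketch below computes the ratio of a group's pass rate to a reference group's pass rate. The 0.80 threshold is an assumption borrowed from the four-fifths rule of thumb in the EEOC Uniform Guidelines; it is not named in this presentation. The function names and score lists are hypothetical.

```python
def pass_rate(scores, cut_score):
    """Share of candidates scoring at or above the cut score."""
    return sum(s >= cut_score for s in scores) / len(scores)

def impact_ratio(group_scores, reference_scores, cut_score):
    """Pass rate of a group divided by the reference group's pass rate."""
    ref = pass_rate(reference_scores, cut_score)
    if ref == 0:
        raise ValueError("reference group has no passing scores")
    return pass_rate(group_scores, cut_score) / ref

def flags_possible_impact(ratio, threshold=0.80):
    """True when the ratio falls below the assumed screening threshold."""
    return ratio < threshold

# Hypothetical score lists for illustration only.
reference = [150, 155, 160, 165, 170]
group = [140, 150, 155, 160, 165]
ratio_at_155 = impact_ratio(group, reference, cut_score=155)
ratio_at_150 = impact_ratio(group, reference, cut_score=150)
```

Running the same comparison at several candidate cut scores shows how the estimate changes as the cut score rises, which is the kind of information the framework suggests combining with other evidence.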

Legal Considerations

14th Amendment: Due Process and Equal Protection Clauses

Section 1: “. . . nor shall any State deprive any person of life, liberty, or property, without due process of law; nor deny to any person within its jurisdiction the equal protection of the laws.”

Section 5: The Congress shall have power to enforce, by appropriate legislation, the provisions of this article.

Legal Considerations

Civil Rights Act of 1964

Title VII prohibits discrimination in employment on the basis of race, color, religion, national origin, or sex.

Title VI prohibits discrimination in federally funded programs or activities on the basis of race, color, or national origin.

Legal Considerations

Testing, in and of itself, is usually only determined to be discriminatory if it has a disparate impact on a protected class.

Prima Facie

Disparate Impact case “. . . established when: (1) plaintiff identifies a specific employment practice to be challenged; and (2) through relevant statistical analysis proves that the challenged practice has an adverse impact on a protected group.” Isabel v. City of Memphis, 404 F.3d 404, 411 (6th Cir. 2005).

http://web2.westlaw.com/find/default.wl?tf=-1&rs=WLW6.11&referencepositiontype=S&serialnum=2006438611&fn=_top&sv=Split&tc=-1&findtype=Y&referenceposition=411&db=506&vr=2.0&rp=%2Ffind%2Fdefault.wl&mt=Kentucky


Prima Facie

If the plaintiff meets this burden, the employer must show that the protocol in question has “a manifest relationship to the employment,” the so-called “business justification.” Griggs, 401 U.S. at 432, 91 S.Ct. 849.

http://web2.westlaw.com/find/default.wl?rs=WLW6.11&serialnum=1971127025&fn=_top&sv=Split&tc=-1&findtype=Y&tf=-1&db=708&vr=2.0&rp=%2Ffind%2Fdefault.wl&mt=Kentucky




Prima Facie

If the employer succeeds, the plaintiff must then show that other tests or selection protocols would serve the employer's interest without creating the undesirable discriminatory effect.

“An employer cannot be held liable for disparate impact if a legitimate business policy results in workforce disparities.” Bacon v. Honda of America Mfg., Inc., 370 F.3d 565, 579 (6th Cir. 2004).



Good Example of a Bad Example of “Business Justification”:

So we went in that little room there, and we looked at one another, and we knew we were playing with fire. We had all these pressures.... You knew you were putting both feet, both hands, in the middle of a philosophic war, a media war, a racial war ... Finally somebody said, well, what can we take to the people? At that point we forgot the university. We forgot everybody.... What kind of argument we can make that the people gon buy? And some soul in there said, well could we make the argument that the teachers ought to be smarter than half the students. And we looked around. We said, them old boys down there in Letohatchee will buy that. Everybody will buy it. We were all Alabamians. We all good old boys.

We said, we can sell that. Folks in Lowndes County will buy it. Folks up in Wilburn will buy it. Even sophisticates up there in them Birmingham Newspapers, that'll make sense that the teacher ought to be as smart as at least half the students she's teaching.

So [one of the steering committee members] was commissioned to go to his office and find out what the average ACT was for graduates, came back and said, I believe it's 16.4. So our big decision was whether to go to 17 or 16. And the only argument I think I recall them arguing for 16. Then we could go back out and say, looka here. Of course, this is also a fallacious argument because the student-the teacher never is as smart as half the students.... [But] that was the scientific basis of it gentlemen and lady. It was just that scientific.

Groves v. Alabama State Board of Education, 776 F.Supp. 1518, 1530 (M.D. Ala. 1991).

Prior Litigations

Sharif by Salahuddin v. New York State Education Department, 709 F.Supp. 345 (S.D.N.Y. 1989)

Fields v. Hallsville Independent School District, 906 F.2d 1017 (5th Cir. 1990)

Groves v. Alabama State Board of Education, 776 F.Supp. 1518, 1530 (M.D. Ala. 1991)

Association of Mexican-American Educators v. State of California, 231 F.3d 572 (9th Cir. 2000)

White v. Engler, 188 F.Supp.2d 730 (E.D. Mich. 2001)

Summary

Teacher testing is used to make inferences about future teacher performance

Inferences make sense only in the context of a decision

The decision of interest with teacher tests is whether an individual has a minimum level of academic proficiency and content knowledge

Summary

Cut scores are based on the relevance of test items to performance as a teacher in the appropriate content area, using a modified Angoff procedure

Cut scores are recommended through the application of an agreed-upon process but are approved by the EPSB Board

Summary

A set of Decision Rules has been applied since May 1999.

A recommended framework suggests that:

Cut scores be:

– between the 15th & 25th percentiles
– greater than or equal to current cut scores
– comparable to SREB average cut scores

Disparate impact be used as a possible indicator of program concerns

Summary

Tests that have been scientifically tailored and vetted to measure a legitimate skill set related to the certificate holder's duties will withstand judicial scrutiny.

Questions?
