Michigan Assessment Consortium Common Assessment Development Series
Module 16 – Validity

Page 1:

Michigan Assessment Consortium

Common Assessment Development Series

Module 16 –Validity

Page 2:

Developed by

Bruce R. Fay, PhD Wayne RESA

James Gullen, PhD Oakland Schools

Page 3:

Support

The Michigan Assessment Consortium professional development series in common assessment development is funded in part by the Michigan Association of Intermediate School Administrators in cooperation with …

Page 4:

In Module 16 you will learn about

Validity: what it is, what it isn’t, why it’s important

Types/Sources of Evidence for Validity

Page 5:

Validity & Achievement Testing – The Old(er) View

Validity is the degree to which a test measures what it is intended to measure.

This view suggests that validity is a property of a test.

Page 6:

Validity and Achievement Testing – The New(er) View

Validity relates to the meaningful use of results

Validity is not a property of a test

Key question: Is it appropriate to use the results of this test to make the decision(s) we are trying to make?

Page 7:

Validity & Proposed Use

“Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of the tests.”

(AERA, APA, & NCME, 1999, p. 9)

Page 8:

Validity as Evaluation

“Validity is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment.”

(Messick, 1989, p. 13)

Page 9:

Meaning in Context

Validity is contextual – it does not exist in a vacuum.

Validity has to do with the degree to which test results can be meaningfully interpreted and correctly used with respect to a question to be answered or a decision to be made – it is not an all-or-nothing thing.

Page 10:

Prerequisites to Validity

Certain things have to be in place before validity can be addressed.

Page 11:

Reliability

A property of the test
Statistical in nature
“Consistency” or repeatability
The test actually measures something
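As a concrete illustration of reliability's statistical nature, here is a minimal sketch (in Python, with hypothetical data) of Cronbach's alpha, one widely used internal-consistency estimate. It assumes item-level scores arranged as one row per student and one column per item.

```python
# A minimal sketch of one common reliability statistic, Cronbach's alpha.
# Assumes `scores` is a 2-D array: one row per student, one column per item.
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Estimate internal-consistency reliability from item-level scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (n_items / (n_items - 1)) * (1 - item_variances.sum() / total_variance)

# Example: 5 students x 4 items, scored 0/1 (hypothetical data)
scores = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(f"Cronbach's alpha = {cronbach_alpha(scores):.2f}")
```

Values closer to 1 indicate more consistent measurement; a common rule of thumb treats roughly 0.7 and above as acceptable for many classroom uses.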

Page 12:

Fairness

Freedom from bias with respect to:
Content
Item construction
Test administration (testing environment)
Anything else that would cause differential performance based on factors other than the students' knowledge/ability with respect to the subject matter
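As one rough, illustrative screen for such differential performance (not a full differential item functioning analysis, which would control for ability), the sketch below compares each item's proportion correct across two hypothetical groups and flags items whose gap departs sharply from the overall gap. The data, the 0.2 threshold, and the group labels are all assumptions for illustration.

```python
# A rough screen for differential performance: compare each item's
# proportion correct across two groups. A gap on a single item that is
# out of line with the overall gap flags the item for content review.
# (Hypothetical data; a full DIF analysis would control for ability.)
import numpy as np

group_a = np.array([[1, 1, 0, 1], [1, 0, 1, 1], [0, 1, 1, 1]])  # rows = students
group_b = np.array([[1, 0, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1]])

p_a = group_a.mean(axis=0)   # proportion correct per item, group A
p_b = group_b.mean(axis=0)
overall_gap = group_a.mean() - group_b.mean()

for item, (pa, pb) in enumerate(zip(p_a, p_b), start=1):
    flag = " <- review" if abs((pa - pb) - overall_gap) > 0.2 else ""
    print(f"Item {item}: A={pa:.2f}  B={pb:.2f}  gap={pa - pb:+.2f}{flag}")
```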

Page 13:

The Natural Order of Things

Reliability precedes Fairness precedes Validity.

Only if a test is reliable can you determine whether it is fair, and only if it is fair can you make any defensible use of the results.

However, having a reliable, fair test does not guarantee valid use.

Page 14:

Validity Recap

Not a property of the test
Not essentially statistical
Interpretation of results
Meaning in context
Requires judgment

Page 15:

Types/Sources of Validity

Internal Validity:
Face
Content
Response
Criterion (internal)
Construct

External Validity:
Criterion (external): concurrent, predictive
Consequential

Page 16:

Internal Validity

Practical:
Content
Response
Criterion (internal)

Not so much:
Face
Construct

Page 17:

External Validity

Criterion (external):
Usually statistical (measures of association or correlation; a minimal example is sketched below)
Requires the existence of other tests or points of quantitative comparison
May require a “known good” assumption

Consequential:
Relates directly to the “correctness” of decisions based on results
Usually established over multiple cases and time
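The correlation example mentioned above might look like the following minimal sketch, assuming you have local test scores and an external “known good” measure for the same students. The scores here are hypothetical placeholders.

```python
# A minimal sketch of criterion-related evidence: correlate local test
# scores with an external "known good" measure for the same students.
import numpy as np

local_scores = np.array([72, 85, 90, 65, 78, 88, 70, 95])
state_scores = np.array([70, 82, 91, 60, 75, 85, 68, 96])  # external criterion

r = np.corrcoef(local_scores, state_scores)[0, 1]  # Pearson correlation
print(f"Correlation with external criterion: r = {r:.2f}")
```

A strong positive correlation supports concurrent criterion evidence; it does not by itself establish valid use.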

Page 18:

To Validate or Not to Validate…

…is that the question?

Page 19:

Decision-making without data…

is just guessing.

Page 20:

But use of improperly validated data…

…leads to confidently arriving at potentially false conclusions.

Page 21:

Practical Realities

Although validity is not a statistical property of a test, both quantitative and qualitative methods are used to establish evidence of validity for any particular use.

Many of these methods are beyond the scope of what most schools/districts can do for themselves…but there are things you can do.

Page 22:

Clear Purpose

Be clear and explicit about the intended purpose for which a test is developed and how the results are to be used

Page 23:

Documented Process

Implementing the process outlined in this training, with fidelity, will provide a big step in this direction, especially if you document what you are doing.

Page 24:

Internal First, then External

Focus first on Internal Validity:
Content
Response
Criterion

Focus next on External Validity:
Concurrent
Predictive
Consequential

Page 25:

Content & Criterion Evidence

Create the foundation for these by:
Using test blueprints to design and explicitly document the relationship (alignment and coverage) of the items on a test to content standards (a coverage check is sketched after this list)
Specifying appropriate numbers, types, and levels of items for the content to be assessed
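The coverage check mentioned above might be documented as simply as the following sketch, which maps each item to a content standard and compares the actual counts against the blueprint. The standard codes, planned counts, and item alignments are all hypothetical.

```python
# A minimal sketch of documenting blueprint coverage: map each item to a
# content standard, then compare actual counts against the blueprint.
from collections import Counter

blueprint = {"NS.1": 4, "NS.2": 3, "ALG.1": 5}        # planned items per standard
item_alignment = {                                     # item number -> standard
    1: "NS.1", 2: "NS.1", 3: "NS.2", 4: "ALG.1", 5: "ALG.1",
    6: "NS.1", 7: "ALG.1", 8: "NS.2", 9: "ALG.1", 10: "NS.1",
}

actual = Counter(item_alignment.values())
for standard, planned in blueprint.items():
    status = "OK" if actual[standard] == planned else "MISMATCH"
    print(f"{standard}: planned {planned}, actual {actual[standard]}  [{status}]")
```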

Page 26:

More on Content & Criterion

Have test items written and reviewed by people with content/assessment expertise using a defined process such as the one described in this series. Be sure to review for bias and other criteria.

Create rubrics, scoring guides, or answer keys as needed, and check them for accuracy

Page 27:

It’s Not Just the Items…

Establish/document administration procedures

Determine how the results will be reported and to whom. Develop draft reporting formats.

Page 28:

Field Testing and Scoring

Field test your assessment
Evaluate the test administration
For open-ended items, train scorers and check that scoring is consistent (establish inter-rater reliability; a quick check is sketched after this list)
Create annotated scoring guides using actual (anonymous) student papers as exemplars
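The inter-rater reliability check referenced above could start with something as simple as this sketch, which computes raw agreement and Cohen's kappa (agreement corrected for chance) for two raters scoring the same papers. The rubric levels here are hypothetical.

```python
# A minimal sketch of checking scoring consistency between two raters
# using Cohen's kappa. Scores are hypothetical rubric levels (0-3).
import numpy as np

rater1 = np.array([2, 3, 1, 0, 2, 3, 1, 2, 0, 3])
rater2 = np.array([2, 3, 1, 1, 2, 2, 1, 2, 0, 3])

observed = np.mean(rater1 == rater2)  # raw percent agreement
categories = np.union1d(rater1, rater2)
expected = sum(np.mean(rater1 == c) * np.mean(rater2 == c) for c in categories)
kappa = (observed - expected) / (1 - expected)
print(f"Agreement = {observed:.2f}, Cohen's kappa = {kappa:.2f}")
```

Kappa well below raw agreement signals that much of the agreement may be due to chance, suggesting more scorer training or a clearer scoring guide.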

Page 29:

Field Test Results Analysis

Analyze the field test results for reliability, bias, and response patterns (a basic item analysis is sketched below)
Make adjustments based on this analysis
Report results to field testers and evaluate their ability to interpret the data and make correct inferences/decisions
Repeat the field testing if needed
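The item analysis sketched below illustrates two basic response-pattern statistics: item difficulty (proportion correct) and discrimination (the correlation between an item and the total of the remaining items). The response data is hypothetical.

```python
# A minimal sketch of a basic field-test item analysis: per-item
# difficulty (proportion correct) and discrimination (correlation
# between each item and the total of the other items).
import numpy as np

responses = np.array([  # rows = students, columns = items, 1 = correct
    [1, 1, 1, 0, 1],
    [1, 0, 1, 1, 1],
    [0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 0, 1, 0, 0],
])
totals = responses.sum(axis=1)

for i in range(responses.shape[1]):
    difficulty = responses[:, i].mean()
    rest = totals - responses[:, i]  # total score excluding this item
    discrimination = np.corrcoef(responses[:, i], rest)[0, 1]
    print(f"Item {i + 1}: difficulty={difficulty:.2f}, discrimination={discrimination:.2f}")
```

Items that nearly everyone gets right or wrong, or that correlate weakly (or negatively) with the rest of the test, are candidates for revision.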

Page 30:

How Good is Good Enough?

Establish your initial performance standards in light of your field test data, and adjust if needed (a quick look at the score distribution is sketched below)

Consider external validity by “comparing” pilot results to results from other “known good” tests or data points
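As one illustration of using field-test data to inform an initial performance standard, the sketch below inspects the score distribution and projects the pass rate at a candidate cut score. This is a quick data check, not a formal standard-setting method; the scores and the cut are hypothetical.

```python
# A minimal sketch of looking at a field-test score distribution to
# inform (not replace) standard setting. Percentile choices are illustrative.
import numpy as np

field_scores = np.array([45, 52, 58, 60, 63, 67, 70, 74, 78, 81, 85, 90])
for pct in (25, 50, 75):
    print(f"{pct}th percentile: {np.percentile(field_scores, pct):.0f}")

proposed_cut = 65  # hypothetical proficiency cut, set by the review panel
pass_rate = np.mean(field_scores >= proposed_cut)
print(f"Projected pass rate at cut {proposed_cut}: {pass_rate:.0%}")
```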

Page 31:

Ready, Set, Go! (?)

When the test “goes live,” take steps to ensure that it is administered properly; monitor and document this, noting any anomalies.

Page 32:

Behind the Scenes

Ensure that tests are scored accurately. Pay particular attention to the scoring of open-ended items. Use a process that allows you to check on inter-rater reliability, at least on a sample basis.

Page 33:

Making Meaning

Ensure that test results are reported:
Using previously developed formats
To the correct users
In a timely fashion

Follow up on whether the users can/do make meaningful use of the results

Page 34:

Conclusion