Statistical detection of test fraud (data forensics) - where do I start?


Upload: nathan-thompson

Post on 08-Jan-2017

TRANSCRIPT

Page 1: Statistical detection of test fraud (data forensics) - where do I start?

Nathan Thompson, PhD; Terry Ausman

Statistical Detection: Where do I start?

Page 2

Welcome!

These are some of the lessons I have learned while diving into the field.
• Overview of the topic
• Discuss resources
• Save time and effort for anyone starting out

The purpose is NOT to be a full workshop on data forensics.

Page 3

Outline
• History
• Where do I start learning?
  – Resources
• What are threats to test security?
• How do I start deterring?
  – Deterrent solutions like weblock and remote proctoring
• How do I start detecting?
  – Intro to data forensics
  – Software for detection

Page 4

History
• Literature dates back to before 1950
• Many collusion indices
• Most were descriptive or completely ad hoc
• Notable exception: Frary, Tideman, and Watts (1977) – G2
• Modern era started when Wollack adapted G2 to IRT
• Other analyses: not as much literature

Page 5

How do I start learning?

Page 6

Resources

In the past, if you wanted to learn:
1. Read all the original articles
2. Read reviews
• Bliss (2012) Covington Award – 25 indices
• Khalid, Mehmood, & Rehman (2011) – 20 indices
• Cizek 1997 book: good but little attention to forensics
• You still need all the originals.

UNTIL…

Page 7

Resources
• Wollack & Maynes (2013)
• Kingston & Clark (2014)

You can now start here!

Page 8

Overview of Security Threats

Major sources of issues
• Brain dump makers (harvesting)
• Brain dump takers (preknowledge)
• Specific location problems
• Examinee collusion
• Receiving help (teacher, proctor, outside)
• Proxy testing

What is your list?

Page 9

Harvesting

What: Steal your content and make it public

Why: Often (but not always) to make money

How: Memorization or images; brain dump sites

Deter: CAT/LOFT

Detect: Unusual responses & latencies; brain dump comparisons; Trojan Horses

Minimize: Frequent republishing

Page 10

Preknowledge

What: Knowing the questions and answers

Why: Easy pass

How: Brain dump sites (used to be word of mouth)

Deter: CAT/LOFT

Detect: High score, low time; brain dump comparisons; Trojan Horses

Minimize: Frequent republishing

Page 11

Examinee Collusion

What: Copying

Why: More items correct

How: Individual or group effort

Deter: CAT/LOFT, multiple forms, proctors

Detect: Collusion indices, group rollups

Minimize: CAT/LOFT, multiple forms

Page 12

Receiving help

What: Teacher, proctor, or outside aid

Why: More items correct; often benefits the aider

How: Individual or group effort

Deter: CAT/LOFT, multiple forms, proctors

Detect: Collusion indices, group rollups, erasure

Minimize: CAT/LOFT, multiple forms, TEIs, performance tests

Page 13

How do I start deterring?

Page 14

Many options
• User roles in test development
• Limit access to test content during delivery
• Verify identity of examinee
• Test window date/time
• Test location (IP addresses)
• Lockdown browser
• Proctor/Examinee authentication
• Biometrics for ID
• Proctor training

Page 15

Many providers

Page 16

How do I start detecting?

Page 17

It’s a Hypothesis Test!

First step: Identify the threats you are worried about and how you think each would present itself in the data

Page 18

It’s a Hypothesis Test!

Independent variables
• Test centers/locations
• Countries
• Training programs
• Test forms
• Individuals

Page 19

It’s a Hypothesis Test!

Dependent variables
• Item response or test time
• Item statistics
• Test statistics (mean/SD, pass rate)
• Person statistics (intra-individual)
• Collusion indices
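To make the independent/dependent variable framing concrete, here is a minimal sketch that treats one test center as the independent variable and pass rate as the dependent variable. It uses a standard two-proportion z statistic; the function name and all counts are hypothetical, and a real analysis would also need to account for multiple comparisons across many centers.

```python
import math

def pass_rate_z(pass_a, n_a, pass_b, n_b):
    """Two-proportion z statistic: group A's pass rate versus group B's."""
    p_a = pass_a / n_a
    p_b = pass_b / n_b
    # Pooled pass rate under the null hypothesis of no difference.
    pooled = (pass_a + pass_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Hypothetical counts: one test center versus all other centers combined.
z = pass_rate_z(pass_a=48, n_a=50, pass_b=700, n_b=1000)
print(f"z = {z:.2f}")
```

A large positive z only says the center's pass rate is unusually high; it is a reason to investigate, not proof of fraud.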

Page 20

It’s a Hypothesis Test!

12/2/2014

If you aim at nothing, that’s exactly what you’ll hit.

Page 21

It’s a Hypothesis Test!

Example: Teachers helping kids
• Item statistics different from other teachers
• Collusion indices
• Relatively high scores with relatively short time – bivariate plot?
• Item latencies different from other teachers

Page 22

It’s a Hypothesis Test!

Example: Brain dump users
• Collusion indices
• Responses on Trojan Horses
• Relatively high scores with relatively short time
• Item latencies
• Group level not likely (could be at any test center)

Page 23

Step 2: Determine your analysis
• Time
  – High score, low time: Preknowledge or aid
  – Low score, high time: Harvester
• Response patterns
• Person fit
• Score gains
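The "high score, low time" pattern can be screened with a simple bivariate rule. The sketch below standardizes score and total time and flags examinees who are extreme on both; the record layout, the field names, and the cutoff of 2.0 are assumptions for illustration, not a recommended standard.

```python
from statistics import mean, pstdev

def flag_fast_high_scorers(records, z_cut=2.0):
    """Return IDs of examinees with unusually high scores AND unusually short times."""
    scores = [r["score"] for r in records]
    times = [r["minutes"] for r in records]
    m_s, sd_s = mean(scores), pstdev(scores)
    m_t, sd_t = mean(times), pstdev(times)
    flagged = []
    for r in records:
        z_score = (r["score"] - m_s) / sd_s
        z_time = (r["minutes"] - m_t) / sd_t
        if z_score >= z_cut and z_time <= -z_cut:
            flagged.append(r["id"])
    return flagged

# Hypothetical data: nine typical examinees and one fast, high-scoring examinee.
scores = [60, 62, 65, 58, 61, 63, 59, 64, 60, 95]
minutes = [50, 52, 48, 55, 51, 49, 53, 50, 52, 15]
records = [
    {"id": f"E{i + 1}", "score": s, "minutes": t}
    for i, (s, t) in enumerate(zip(scores, minutes))
]
flagged = flag_fast_high_scorers(records)
print(flagged)
```

Swapping the two inequalities would screen for the "low score, high time" pattern associated with harvesting.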

Page 24

Options for Detection

Intra-Individual
• Time/RTE (CBT only)
• Response patterns
• Score gains
• Person fit

Inter-Individual
• Collusion Indices
• Erasure (paper only, also Group level)

Group
• Roll-up of intra and inter
• Descriptive Statistics
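A group-level roll-up can be as simple as counting how many individually flagged examinees fall in each group. The sketch below assumes each examinee already carries a True/False flag from some intra- or inter-individual analysis; the function name and the data are hypothetical.

```python
from collections import defaultdict

def roll_up_flags(examinees):
    """Summarize individual-level flags by group (for example, test center)."""
    totals = defaultdict(lambda: [0, 0])  # group -> [n_flagged, n_total]
    for group, is_flagged in examinees:
        totals[group][0] += int(is_flagged)
        totals[group][1] += 1
    return {
        group: {"flagged": f, "total": n, "rate": f / n}
        for group, (f, n) in totals.items()
    }

# Hypothetical (group, flagged?) pairs.
examinees = [
    ("Center A", True), ("Center A", True), ("Center A", False),
    ("Center B", False), ("Center B", False), ("Center B", True), ("Center B", False),
]
summary = roll_up_flags(examinees)
for group, stats in sorted(summary.items()):
    print(group, stats)
```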

Page 25

More on Collusion Indices

How is collusion quantified? 100 item test…
• Error similarity – we both had 10 errors:
  – Same items?
  – Same responses on those items?
• Response similarity
  – We gave the same response on 50 items? 90?
• Some indices are standardized/probabilistic (good)
• Some are descriptive or non-probabilistic (bad)
• Can vary in direction (one/two)

Page 26

More on Collusion Indices

There are issues to consider when comparing:
• ESA only looks at errors, ignores rest of data
• Major confound with ability
  – Two examinees with 99/100 will get flagged as collusion!
  – Therefore important to condition on this
• Some indices have no theoretical basis whatsoever

Page 27

More about collusion

Table columns: Probabilistic | Descriptive | Ad hoc

Error Similarity: B&B, EIC, EEIC, HHHHJ

Response Similarity: Wollack’s Omega, Wesolowsky Zjk, Frary et al. G2, RIC

Page 28

More resources

ITC Guidelines on the Security of Tests, Examinations, and Other Assessments

TILSA Test Security Guidebook

Conference presentations/workshops (harder to find)

Page 29

Software

Next step: Find software that meets your needs
• Scrutiny!
• S-check
• R packages (CopyDetect)
• SIFT
• Integrity
• Caveon
• IRT software like IRTPRO or Xcalibre

Page 30

Epilogue: Then what?

Define a pathway for investigation and actions

Joy Matthews-Lopez and Paul Jones

Page 31

Examples (if time)
• 500 certification candidates
• Grade 4 Math (locations)
  – Check on teachers and schools; there is incentive to help students

Page 32

Summary – Q&A