Assessment in the Workplace – UCL
TRANSCRIPT
Assessment in the Workplace
Dr Gavin Johnson
Consultant Gastroenterologist UCLH
Senior Lecturer in Medical Education UCL
1
Objectives
1) Discuss the evolution of workplace-based assessment
2) Argue the pros and cons of WPBA
3) Improve the utility of WPBA
4) Evaluate the utility of WPBA
5) Appreciate the changing role of WPBA in 2012
2
Why assess doctors?
• Public confidence
– Scepticism of the profession's ability to self-regulate (Smith 1998)
– Better measures of quality of practice (Scally 1998)
• Evidence of competence/inform progression
– Tomorrow’s doctors (GMC 1998, 2003)
• To drive learning (Van der Vleuten 2000)
• To improve trainee confidence
• To rebut legal challenges (Tweed and Miola 2001)
3
The Metro Front Page 2011
4
Assessment – Miller 1990
DOES
SHOWS HOW
KNOWS HOW
KNOWS
6
DOES
SHOWS HOW
KNOWS HOW
KNOWS
KNOWLEDGE
COMPETENCE
PERFORMANCE
Assessment – Miller 1990
7
Assessment – Miller 1990
DOES – MSF, ACAT
SHOWS HOW – OSCE, miniCEX
KNOWS HOW – Best answer MCQ
KNOWS – T/F MCQ
8
WPBA – the origins
• Chart stimulated recall
– ABEM >1983
• Multisource feedback
– Business and industry >1993
• miniCEX
– Norcini >1995
9
10
11
MSF
DOPS
CbD
miniCEX
12
Curriculum
Communication skills
Procedural Skills
Team Work
Clinical judgement & decision-making
Clinical skills
Knowledge
Teaching Skills
Audit
Interpretative Skills
13
Curriculum
Communication skills – mini-CEX
Procedural Skills – DOPS
Team Work – MSF
Clinical judgement & decision-making – CbD, ACAT
Clinical skills – mini-CEX, MSF
Knowledge – KBA
Teaching Skills – TO
Audit – AA
Interpretative Skills – CbD
14
✓
✗
WPBA
15
• In vivo
– higher up Miller’s pyramid
• Educational Impact (facilitate feedback)
• Drive learning
• Gather evidence:
– inform decision making
– Re-sample borderline trainees
• Practical/Cheap
✓
16
Educational Impact – CbD Comments
• “Very helpful to receive constructive feedback on outpatient
encounters + letter written to GP.”
• “Helpful to receive structured feedback on consultation in
outpatient clinic”
• “Valuable exercise covering ground not previously covered
in other assessments.”
• “Useful assessment. A useful way to document
conversations and assessments taking place on a daily
basis.”
✓
17
Feedback
Kolb 1984
✓
19
• Time – trainee, assessor
• Space – appropriate areas for discussion
• Conflict - Turning supervisors into assessors
• Reliability – calibrating assessors, faculty
development
• Validity – being used incorrectly/en masse
• Formative assessments being used to make
summative decisions
✗
20
✓
✗
WPBA
21
22
“The profession is rightly suspicious of the use of reductive
‘tick-box’ approaches to assess the complexities of
professional behaviour, and widespread confusion exists
regarding the standards, methods and goals of individual
assessment methods…This has resulted in widespread
cynicism about WBA within the profession, which is now
increasing”
23
✓
✗
WPBA
24
Improving and Evaluating WPBA
25
Van der Vleuten (1996)
• Reliability – are the scores reproducible?
• Validity – does it measure the knowledge, skills and
attitudes it was designed to cover?
• Educational Impact – does assessment drive
learning?
• Practicality/Cost – is assessment feasible and
acceptable?
w = ‘weighting’ depending on context
Utility = Rw × Vw × Iw × Pw
26
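Van der Vleuten's multiplicative model can be sketched in code. A minimal illustration, treating each weight as an exponent (one common reading of "weighting depending on context"; the function and the numbers are hypothetical, not from the lecture): a weight of zero makes that component drop out, which fits the later point that reliability may be irrelevant for purely formative use.

```python
def utility(r, v, i, p, w_r=1.0, w_v=1.0, w_i=1.0, w_p=1.0):
    """Van der Vleuten-style utility: product of weighted components.

    r, v, i, p -- reliability, validity, educational impact,
                  practicality, each scored in (0, 1].
    w_*        -- context-dependent weights, modelled here as
                  exponents (an assumption, not from the lecture).
    """
    return (r ** w_r) * (v ** w_v) * (i ** w_i) * (p ** w_p)

# Purely formative context: weight reliability at zero.
formative = utility(r=0.4, v=0.8, i=0.9, p=0.7, w_r=0.0)
# High-stakes summative context: weight reliability heavily.
summative = utility(r=0.4, v=0.8, i=0.9, p=0.7, w_r=2.0)
```

Because the model is multiplicative, any component scored at zero (with a non-zero weight) drives the whole utility to zero: a perfectly reliable assessment with no validity has no utility.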
Validity
•To improve:
– Match objectives to
assessments
– Pilot
– Collaborate in the
development of the
assessment
•To measure:
– Question trainees and
assessors
– Correlation between
similar performance traits
within assessment
– Correlation between
different assessments
measuring similar traits
e.g. CbD and ACAT
– Do scores improve over time?
27
Reliability
•To improve:
– Train assessors
– Use grounded descriptions of
performance
– Increase the number of
assessments
– Increase the number of
assessors
•To measure:
– Gather a large number of
assessments
– Generalisability theory
28
Ask a stupid question, you’ll get a stupid answer: Construct alignment improves the performance of WPBA
J Crossley, GJ Johnson, JR Booth, WB Wade
Medical Education 2011
29
ACAT ratings CMT 2008-9 - Overall Clinical Judgement
Well below expectations 0.0%
Below expectations 0.0%
Borderline 0.1%
Meets Expectations 18.8%
Above expectations 55.0%
Well above expectations 26.0%
n=13,977
30
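The ceiling effect in these ratings is easy to quantify; a quick sketch (the dictionary simply transcribes the slide's percentages):

```python
# ACAT 'Overall Clinical Judgement' ratings, CMT 2008-9 (n = 13,977)
ratings = {
    "Well below expectations": 0.0,
    "Below expectations": 0.0,
    "Borderline": 0.1,
    "Meets expectations": 18.8,
    "Above expectations": 55.0,
    "Well above expectations": 26.0,
}

# Almost every trainee is rated above the midpoint of the scale,
# leaving little variance for the assessment to discriminate with.
above = ratings["Above expectations"] + ratings["Well above expectations"]
print(f"Rated above expectations: {above:.1f}%")  # 81.0%
```

With 81% of trainees rated above expectations and essentially none below, a scale anchored to "what was expected" carries almost no information about where a trainee actually sits, which is the problem the anchor-statement study set out to fix.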
Hypothesis
‘WPBA reliability improves when the
assessor’s rating uses anchor statements
based on clear descriptors of
performance, rather than on a scale based
on what was expected by the assessor’
31
Methods (1)
• RCP Nationwide electronic portfolio
• WPBA form had both old and new scales
• Data extracted and anonymised
• ‘Real world assessments’
• All years of higher speciality training
• Generalisability theory used to calculate reliability for both old and new rating scales
32
Methods (2) : ACAT Anchor Statements
Below level expected during
Foundation Programme
Trainee required frequent supervision to
assist in almost all clinical management
plans and/or time management
Performed at the level expected
at completion of Foundation
Programme / early Core Training
Trainee required supervision to assist in
some clinical management plans and/or time
management
Performed at the level expected
on completion of Core Training /
early Higher Training
Supervision and assistance needed for
complex cases, competent to run the acute
care period with senior support
Performed at the level expected
during Higher Training
Very little supervising consultant input
needed, competent to run the acute care
period with occasional senior support
Performed at the level expected
for completion of Higher Training
Able to practise independently and provide
senior supervision for the acute care period
Results (1)
• mini-CEX, n = 3185
• CbD, n = 4513
• ACAT, n = 3235
34
Results (2) : mini-CEX
Number of mini-CEXs 3 6 9 12
R coefficient – old rating 0.55 0.71 0.78 0.83
R coefficient – new rating 0.77 0.87 0.91 0.93
Results (3) : CbD
Number of CbDs 3 6 9 12
R coefficient – old rating 0.48 0.65 0.73 0.78
R coefficient – new rating 0.73 0.84 0.89 0.92
Results (4) : ACAT
Number of ACATs 3 6 9 12
R coefficient – old rating 0.21 0.35 0.44 0.52
R coefficient – new rating 0.36 0.53 0.63 0.70
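In a simple one-facet design, the generalisability-theory projection of reliability to n assessments reduces to the Spearman-Brown prophecy formula. As a rough cross-check (an assumption about the design, not the authors' actual G-study), the old-rating mini-CEX row can be approximately regenerated from its 3-assessment value alone:

```python
def spearman_brown(r1, n):
    """Reliability of the mean of n assessments, given the
    reliability r1 of a single assessment (Spearman-Brown)."""
    return n * r1 / (1 + (n - 1) * r1)

def single_assessment_reliability(rn, n):
    """Invert Spearman-Brown to recover r1 from an n-assessment R."""
    return rn / (n - (n - 1) * rn)

# Old-rating mini-CEX: R = 0.55 at 3 assessments
r1 = single_assessment_reliability(0.55, 3)
projected = {n: round(spearman_brown(r1, n), 2) for n in (3, 6, 9, 12)}
# Close to the published row 0.55 / 0.71 / 0.78 / 0.83
print(projected)
```

The same formula shows why the new rating scale matters so much in practice: a higher single-assessment reliability reaches any given reliability threshold with far fewer assessments, directly reducing the feasibility burden discussed later.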
Conclusion from Study
• The reliability of WPBA improves
significantly when the rating for Overall
Performance is based on the stage of
training (with descriptive anchor
statements) rather than on a scale based on
‘what was expected’ by the assessor
38
Feasibility
•To improve:
– Reduce the length of
assessments
– Reduce the number of
required assessments
– Embed in working day
– Facilitate process
• handheld
•To measure:
– Question trainees and
assessors
• Questionnaires
• Focus groups
• Interviews
– Assessment form data
• Duration
• Satisfaction
39
Educational Impact
•To improve:
– Faculty development
– Find time!
– Encourage free text
boxes to be completed
(reflective practice)
– Discuss at appraisal
•To measure:
– Question trainees and
assessors
– Evaluate quality of free
text entries
40
Challenges 2012
• Too many WPBA
• Ratings removed
• Only ‘anchor statements’
• Difficult to use to inform progression
• Legal challenges
41
Where do we go?
• Clarity – purpose and benefits
• Train the assessors
• Use formatively only – is reliability then irrelevant?
• Educational Supervision
• Progression needs to be the opinion of an
‘expert’ and evidence based
• ARCP decisions need to stand up to legal
scrutiny
42
Conclusions
• Boom…to bust?
• There are established benefits
– Educational Impact
• Consensus needed on how summative
decisions are reached
– But this must be evidence based
43