Assessment in the Workplace – UCL
TRANSCRIPT
Assessment in the Workplace
Dr Gavin Johnson
Consultant Gastroenterologist UCLH
Senior Lecturer in Medical Education UCL
1
Objectives
1) Discuss the evolution of workplace-based assessment
2) Argue the pros and cons of WPBA
3) Improve the utility of WPBA
4) Evaluate the utility of WPBA
5) Appreciate the changing role of WPBA in 2012
2
Why assess doctors?
• Public confidence
– Scepticism of the profession's ability to self-regulate (Smith 1998)
– Better measures of quality of practice (Scally 1998)
• Evidence of competence/inform progression
– Tomorrow’s doctors (GMC 1998, 2003)
• To drive learning (Van der Vleuten 2000)
• To improve trainee confidence
• To rebut legal challenges (Tweed and Miola 2001)
3
The Metro Front Page 2011
4
Assessment – Miller 1990
DOES
SHOWS HOW
KNOWS HOW
KNOWS
6
DOES
SHOWS HOW
KNOWS HOW
KNOWS
KNOWLEDGE
COMPETENCE
PERFORMANCE
Assessment – Miller 1990
7
Assessment – Miller 1990
DOES – MSF, ACAT
SHOWS HOW – OSCE, miniCEX
KNOWS HOW – Best answer MCQ
KNOWS – T/F MCQ
8
WPBA – the origins
• Chart stimulated recall
– ABEM >1983
• Multisource feedback
– Business and industry >1993
• miniCEX
– Norcini >1995
9
10
11
MSF
DOPS
CbD
miniCEX
12
Curriculum
Communication skills
Procedural Skills
Team Work
Clinical judgement & decision-making
Clinical skills
Knowledge
Teaching Skills
Audit
Interpretative Skills
13
Curriculum
Communication skills – mini-CEX
Procedural Skills – DOPS
Team Work – MSF
Clinical judgement & decision-making – CbD, ACAT
Clinical skills – mini-CEX, MSF
Knowledge – KBA
Teaching Skills – TO
Audit – AA
Interpretative Skills – CbD
14
✓
✗
WPBA
15
• In vivo
– higher up Miller’s pyramid
• Educational Impact (facilitate feedback)
• Drive learning
• Gather evidence:
– inform decision making
– Re-sample borderline trainees
• Practical/Cheap
✓
16
Educational Impact – CbD Comments
• “Very helpful to receive constructive feedback on outpatient
encounters + letter written to GP.”
• “Helpful to receive structured feedback on consultation in
outpatient clinic”
• “Valuable exercise covering ground not previously covered
in other assessments.”
• “Useful assessment. A useful way to document
conversations and assessments taking place on a daily
basis.”
✓
17
Feedback
Kolb 1984
✓
19
• Time – trainee, assessor
• Space – appropriate areas for discussion
• Conflict - Turning supervisors into assessors
• Reliability – calibrating assessors, faculty
development
• Validity – being used incorrectly/en masse
• Formative assessments being used to make
summative decisions
✗
20
✓
✗
WPBA
21
22
“The profession is rightly suspicious of the use of reductive
‘tick-box’ approaches to assess the complexities of
professional behaviour, and widespread confusion exists
regarding the standards, methods and goals of individual
assessment methods…This has resulted in widespread
cynicism about WBA within the profession, which is now
increasing”
23
✓
✗
WPBA
24
Improving and Evaluating WPBA
25
Van der Vleuten (1996)
• Reliability – are the scores reproducible?
• Validity – does it measure the knowledge, skills and
attitudes it was designed to cover?
• Educational Impact – does assessment drive
learning?
• Practicality/Cost – is assessment feasible and
acceptable?
w = ‘weighting’ depending on context
Utility = Rw × Vw × Iw × Pw
26
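Van der Vleuten's multiplicative model can be sketched in code. A minimal illustration, treating each weight as an exponent (one common reading of "weighting depending on context"; the function and the numbers are hypothetical, not from the lecture): a weight of zero makes that component drop out, which fits the later point that reliability may be irrelevant for purely formative use.

```python
def utility(r, v, i, p, w_r=1.0, w_v=1.0, w_i=1.0, w_p=1.0):
    """Van der Vleuten-style utility: product of weighted components.

    r, v, i, p -- reliability, validity, educational impact,
                  practicality, each scored in (0, 1].
    w_*        -- context-dependent weights, modelled here as
                  exponents (an assumption, not from the lecture).
    """
    return (r ** w_r) * (v ** w_v) * (i ** w_i) * (p ** w_p)

# Purely formative context: weight reliability at zero.
formative = utility(r=0.4, v=0.8, i=0.9, p=0.7, w_r=0.0)
# High-stakes summative context: weight reliability heavily.
summative = utility(r=0.4, v=0.8, i=0.9, p=0.7, w_r=2.0)
```

Because the model is multiplicative, any component scored at zero (with a non-zero weight) drives the whole utility to zero: a perfectly reliable assessment with no validity has no utility.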
Validity
•To improve:
– Match objectives to
assessments
– Pilot
– Collaborate in the
development of the
assessment
•To measure:
– Question trainees and
assessors
– Correlation between
similar performance traits
within assessment
– Correlation between
different assessments
measuring similar traits
e.g. CbD and ACAT
– Do scores improve over time?
27
Reliability
•To improve:
– Train assessors
– Use grounded descriptions of
performance
– Increase the number of
assessments
– Increase the number of
assessors
•To measure:
– Gather a large number of
assessments
– Generalisability theory
28
Ask a stupid question, you’ll get a stupid answer: Construct alignment improves the performance of WPBA
J Crossley, GJ Johnson, JR Booth, WB Wade
Medical Education 2011
29
ACAT ratings CMT 2008-9 - Overall Clinical Judgement
Well below expectations 0.0%
Below expectations 0.0%
Borderline 0.1%
Meets Expectations 18.8%
Above expectations 55.0%
Well above expectations 26.0%
n=13,977
30
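The ceiling effect in these ratings is easy to quantify; a quick sketch (the dictionary simply transcribes the slide's percentages):

```python
# ACAT 'Overall Clinical Judgement' ratings, CMT 2008-9 (n = 13,977)
ratings = {
    "Well below expectations": 0.0,
    "Below expectations": 0.0,
    "Borderline": 0.1,
    "Meets expectations": 18.8,
    "Above expectations": 55.0,
    "Well above expectations": 26.0,
}

# Almost every trainee is rated above the midpoint of the scale,
# leaving little variance for the assessment to discriminate with.
above = ratings["Above expectations"] + ratings["Well above expectations"]
print(f"Rated above expectations: {above:.1f}%")  # 81.0%
```

With 81% of trainees rated above expectations and essentially none below, a scale anchored to "what was expected" carries almost no information about where a trainee actually sits, which is the problem the anchor-statement study set out to fix.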
Hypothesis
‘WPBA reliability improves when the
assessor’s rating uses anchor statements
based on clear descriptors of
performance, rather than on a scale based
on what was expected by the assessor’
31
Methods (1)
• RCP Nationwide electronic portfolio
• WPBA form had both old and new scales
• Data extracted and anonymised
• ‘Real world assessments’
• All years of higher speciality training
• Generalisability theory used to calculate reliability for both old and new rating scales
32
Methods (2) : ACAT Anchor Statements
Below level expected during
Foundation Programme
Trainee required frequent supervision to
assist in almost all clinical management
plans and/or time management
Performed at the level expected
at completion of Foundation
Programme / early Core Training
Trainee required supervision to assist in
some clinical management plans and/or time
management
Performed at the level expected
on completion of Core Training /
early Higher Training
Supervision and assistance needed for
complex cases, competent to run the acute
care period with senior support
Performed at the level expected
during Higher Training
Very little supervising consultant input
needed, competent to run the acute care
period with occasional senior support
Performed at the level expected
for completion of Higher Training
Able to practise independently and provide
senior supervision for the acute care period
Results (1)
• mini-CEX, n = 3185
• CbD, n = 4513
• ACAT, n = 3235
34
Results (2) : mini-CEX
Number of mini-CEXs 3 6 9 12
R coefficient – old rating 0.55 0.71 0.78 0.83
R coefficient – new rating 0.77 0.87 0.91 0.93
Results (3) : CbD
Number of CbDs 3 6 9 12
R coefficient – old rating 0.48 0.65 0.73 0.78
R coefficient – new rating 0.73 0.84 0.89 0.92
Results (4) : ACAT
Number of ACATs 3 6 9 12
R coefficient – old rating 0.21 0.35 0.44 0.52
R coefficient – new rating 0.36 0.53 0.63 0.70
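In a simple one-facet design, the generalisability-theory projection of reliability to n assessments reduces to the Spearman-Brown prophecy formula. As a rough cross-check (an assumption about the design, not the authors' actual G-study), the old-rating mini-CEX row can be approximately regenerated from its 3-assessment value alone:

```python
def spearman_brown(r1, n):
    """Reliability of the mean of n assessments, given the
    reliability r1 of a single assessment (Spearman-Brown)."""
    return n * r1 / (1 + (n - 1) * r1)

def single_assessment_reliability(rn, n):
    """Invert Spearman-Brown to recover r1 from an n-assessment R."""
    return rn / (n - (n - 1) * rn)

# Old-rating mini-CEX: R = 0.55 at 3 assessments
r1 = single_assessment_reliability(0.55, 3)
projected = {n: round(spearman_brown(r1, n), 2) for n in (3, 6, 9, 12)}
# Close to the published row 0.55 / 0.71 / 0.78 / 0.83
print(projected)
```

The same formula shows why the new rating scale matters so much in practice: a higher single-assessment reliability reaches any given reliability threshold with far fewer assessments, directly reducing the feasibility burden discussed later.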
Conclusion from Study
• The reliability of WPBA improves
significantly when the rating for Overall
Performance is based on the stage of
training (with descriptive anchor
statements) rather than on a scale based on
‘what was expected’ by the assessor
38
Feasibility
•To improve:
– Reduce the length of
assessments
– Reduce the number of
required assessments
– Embed in working day
– Facilitate process
• handheld
•To measure:
– Question trainees and
assessors
• Questionnaires
• Focus groups
• Interviews
– Assessment form data
• Duration
• Satisfaction
39
Educational Impact
•To improve:
– Faculty development
– Find time!
– Encourage free text
boxes to be completed
(reflective practice)
– Discuss at appraisal
•To measure:
– Question trainees and
assessors
– Evaluate quality of free
text entries
40
Challenges 2012
• Too many WPBA
• Ratings removed
• Only ‘anchor statements’
• Difficult to use to inform progression
• Legal challenges
41
Where do we go?
• Clarity – purpose and benefits
• Train the assessors
• Use formatively only – is reliability then irrelevant?
• Educational Supervision
• Progression needs to be the opinion of an
‘expert’ and evidence based
• ARCP decisions need to stand up to legal
scrutiny
42
Conclusions
• Boom…to bust?
• There are established benefits
– Educational Impact
• Consensus needed on how summative
decisions are reached
– But this must be evidence based
43