Scientific Basis for Assessment of Surgeons Using Simulation
James R Korndorffer Jr MD MHPE FACS
Nothing to disclose
Goals and Objectives
Identify why we use simulation for summative assessment
Become familiar with one conceptual framework for simulation assessment development
Understand current concepts of evaluating assessment validity
Recognize the current status of utilization of a scientific basis for assessment using simulation
What is Simulation
“In broad, simple terms a simulation is a person, device, or set of conditions which attempts to present [education and] evaluation problems authentically. The student or trainee is required to respond to the problems as he or she would under natural circumstances. Frequently the trainee receives performance feedback as if he or she were in the real situation.” (McGaghie, 1999)
Summative Assessment Definition
Any systematic method of obtaining information from tests and other sources, used to draw inferences about characteristics of people, objects or programs.
It can’t be that hard
“Evaluation (assessment) is probably the most logical field in the world and if you use a little bit of logic, it just fits together and jumps at you... It’s very common sense.” McGuire
“Common sense is very rare” Voltaire
Why Use Simulation for Assessment
Traditional methods emphasize faculty/mentor impression
recall bias, central tendency, halo effect
poor correlation with actual performance
difficult to standardize and reproduce
Anecdotal
“how we always do it”
Members of healthcare team are trained and evaluated in isolation
Why Use Simulation for Assessment
Standardized methods
“check list” of objectives and skills
Emphasizes independent observer/instructor
Reproducible
Healthcare team members test together in more realistic setting
Summative Assessment
Miller’s Pyramid
Stakeholder Buy-in
Evidence Centered Design for Simulation Based Assessment
RJ Mislevy, CRESST Report 800, July 2011
Validity of Simulation Assessment
Validation/Validity concepts have evolved
Early 20th century
Evaluate ability of a test to predict performance
Mid 20th century
Evaluate educational tests: sample representing the entire domain
Evaluate psychological testing: representative of the characteristic of interest
Validity of Simulation Assessment
Joint Committee on Standards for Educational and Psychological Testing
American Educational Research Association
American Psychological Association
National Council on Measurement in Education
Validity of Simulation Assessment
1974 Standards
Types of validity (construct, content, criterion)
“face validity” discounted for over 60 years
Valid instruments/tests
1985/1999/2014 Standards
Unitary concept of validity
Valid use and interpretation of scores
Hypothesis driven accumulation of validity evidence
test content, response process, internal structure, relationships to other variables, consequences of testing
Use of 1974 Standards
[Bar chart, axis 0 to 100. Categories: "types" of validity; valid simulator]
Korndorffer et al. Am J Surg. 2010 Jan;199(1):99-104.
Use of current standards
[Bar chart, axis 0 to 100. Categories: any current concept; evidence for validity; validity of results; use of results]
Evidence of Relationship to other variables
[Bar chart, axis 0 to 100. Categories: correlation; >3 groups; 3 groups; 2 groups]
Evidence of Test Content
[Bar chart, axis 0 to 100. Categories: general survey; expert survey; author opinion]
Validity Evidence Based on
[Bar chart, axis 0 to 100. Categories: Content; Internal Structure; Relation to other variables; Response Process; Consequences]
Cook et al. Academic Medicine, Vol. 88, No. 6, June 2013
Validity Evidence for FLS
Based on test content
Does test represent the domain - yes
Written test blueprint
Instruments are used in MIS
Based on response process
Does test cause taker to exhibit trait of interest - yes
Tasks are same as those used in MIS
Two handed movement, cutting, suturing
Based on internal structure
Is test structured so results are reproducible - yes
Interrater reliability 0.998 (skills)
Test retest reliability 0.89 (skills)
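The reliability figures above are coefficients computed from paired scores. The slides do not say which statistic was used, so the following is only a minimal sketch that assumes a Pearson correlation between two attempts by the same trainees. The function name `pearson_r` and every score in it are hypothetical illustration values, not FLS data.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical skills scores for six trainees tested on two occasions
first_attempt = [310, 355, 402, 288, 450, 397]
second_attempt = [322, 349, 410, 301, 441, 405]

r = pearson_r(first_attempt, second_attempt)
print(f"test-retest r = {r:.2f}")
```

The same function could be applied to two raters scoring the same performances, although interrater agreement is often reported with other statistics such as an intraclass correlation.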
Validity Evidence for FLS
Based on relationships to other variables
Does the test relate to assessment of interest - yes
Results correlate with in-training technical skills assessment and performance in an animal lab
Results correlated with GOALS scores: 0.81
In a multivariate analysis, results were an independent predictor of operative performance evaluated by GOALS
Based on consequences of testing
Does the test have the granularity to distinguish between the groups of interest - yes
Over 90% of the chief residents pass
Current standards
Validity - appropriateness, meaningfulness, and usefulness of the specific inferences made from scores
Validation - the hypothesis driven process of accumulating evidence to support such inferences
Stronger validation evidence
Expert opinion
Novice vs. expert
Conclusion
As simulation assessment development and validation efforts continue to expand in the medical education arena, the medical community must remain current and use a contemporary framework of development and validity, both to adhere to an authentic scientific method and to avoid improper judgments of performance.