Page 1: Developing a model for investigating the impact of assessment

Developing a model for investigating the impact of assessment within educational contexts by a public examination provider

Dr Nick Saville, Research and Validation Group, Cambridge ESOL

Page 2: Developing a model for investigating the impact of assessment

Developing a model for investigating the impact of assessment within educational contexts by a public examination provider

"Impact by Design"

Page 3: Developing a model for investigating the impact of assessment

Nick Saville

Athens, June 2006

A model for investigating the impact of (language) assessment within educational contexts

Teaching Testing Learning

A Perspective from Cambridge ESOL

Page 4: Developing a model for investigating the impact of assessment

Nick Saville

Athens, June 2006

A model for investigating the impact of assessment within educational contexts

Teaching Testing Learning

Implications for Cambridge Assessment?

Page 5: Developing a model for investigating the impact of assessment

153 years of history ……..

In tune with the spirit of the Victorian age

14th December 1858

370 students in seven different local contexts took an examination paper set by UCLES for the first time

Page 6: Developing a model for investigating the impact of assessment

153 years of history ……..

"This year, we find that students have acquired a great deal of skill but that they seem to have acquired it for examination purposes"

Art examiner writing in The TES, 1915

Michael Shaw - Remembrance of things passed, Cover Story - Magazine, TES (10 December 2010)

Page 7: Developing a model for investigating the impact of assessment

153 years of history …….. plus ça change

Page 8: Developing a model for investigating the impact of assessment

Outline for today's talk

• Background to ESOL's approach
  • 1980s
  • Messick, Bachman – early 1990s
• The literature on washback/impact
  • early work and recent progress
  • gaps? where next?
• Analysis of three case studies
  • what can be learnt?
• Towards a Comprehensive Model of Impact
  • applicable to other educational contexts?

Page 9: Developing a model for investigating the impact of assessment

ESOL background – 1987-1990: Japan

Considerations in developing fair tests

The art of the possible

[Diagram: Test at the centre, surrounded by V (validity), R (reliability) and "Practicality?"]

Page 10: Developing a model for investigating the impact of assessment

[Diagram: Test at the centre, surrounded by V (validity), R (reliability) and P (practicality)]

“Practicality in Language Testing: an educational management model”

Main argument: test development is a form of educational innovation - and needs to be managed as such

“... achieving a balance between the purpose of the test, its validity for the purpose, the required reliability for the purpose and the constraints imposed by the context is essentially the task facing the test designer ….”

Saville (1990), University of Reading.

A Cambridge test development project: Japan, 1987 to 1989

Page 11: Developing a model for investigating the impact of assessment

Putting the test into context

[Diagram: Test at the centre, surrounded by V, R and P]

"The aim … is not only to encourage good testing practice, but to prevent bad tests being produced .... a bad test is not only one with low reliability and dubious validity but also one which has a damaging washback on the curriculum."

Saville 1990

… any test which is produced should be appropriate to the educational context in which it is to be used, and the effect on learners and institutions will be a major consideration.

Page 12: Developing a model for investigating the impact of assessment

Putting the test into context

[Diagram: Test at the centre, surrounded by V, R and P]

Page 13: Developing a model for investigating the impact of assessment

Impact Ripples

[Diagram: Test at the centre, surrounded by V, R and P]

Page 14: Developing a model for investigating the impact of assessment

Impact Ripples

[Diagram: Test at the centre, surrounded by V, R and P; ripple I: Local Impact, “micro” level]

Page 15: Developing a model for investigating the impact of assessment

Impact Ripples

[Diagram: Test at the centre, surrounded by V, R and P; ripple II: Wider Impact, “macro” level]

Page 16: Developing a model for investigating the impact of assessment

U = V + R + I + P

Prof L Bachman (UCLA) - Cambridge Seminars 1990/91

The unitary concept of Usefulness

Overall Validity

Page 17: Developing a model for investigating the impact of assessment

U = V + R + I + P

Bachman and Palmer, 1996 : U = Cv + A + I + R + I + P

Developing “useful tests”, fit for purpose

Balancing the test qualities

Usefulness as “overall Validity”
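
The two formulas on this slide can be written out in full as a reading aid. The expansion of V, R, I and P follows the qualities named elsewhere in this talk (validity, reliability, impact, practicality). The expansion of Bachman and Palmer's six qualities, including which "I" stands for interactiveness and which for impact, is an assumption based on their 1996 book; the subscripts are added here only to tell the two apart. The plus signs indicate qualities to be balanced, not an arithmetic sum.

```latex
% Requires amsmath. Subscripts "int" and "imp" are illustrative labels, not the authors' notation.
\begin{align*}
U &= V + R + I + P
  && \text{(validity, reliability, impact, practicality)} \\
U &= C_v + A + I_{\text{int}} + R + I_{\text{imp}} + P
  && \text{(construct validity, authenticity, interactiveness,} \\
  &&& \text{\ reliability, impact, practicality)}
\end{align*}
```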

Page 19: Developing a model for investigating the impact of assessment

Current ESOL Practice

Principles of Good Practice - 2011

Quality Management and validation in language assessment

VRIP

Page 21: Developing a model for investigating the impact of assessment

Current ESOL Practice

Principles of Good Practice: Quality Management and validation in language assessment

www.cambridgeesol.org/about/standards/pogp.html

VRIP

See also brochure - Making an Impact

Page 22: Developing a model for investigating the impact of assessment

Starting to develop a model of impact

• 1993 – 1995
• Using VRIP to develop and revise exams, e.g. the revision of IELTS 1995
  • The IELTS impact project
• An expanded view of impact - from the test developer’s perspective
  • Working for positive impact
  • Limiting negative consequences

Page 23: Developing a model for investigating the impact of assessment

Maxims for achieving/monitoring impact

Maxim 1 PLAN: Use a rational and explicit approach to test development

Maxim 2 SUPPORT: Support stakeholders in the testing process

Maxim 3 COMMUNICATE: Provide comprehensive, useful and transparent information

Maxim 4 MONITOR and EVALUATE: Collect all relevant data and analyse as required.

Milanovic and Saville, 1995 Considering the impact of the Cambridge EFL examinations

Page 24: Developing a model for investigating the impact of assessment

The literature on washback/impact

• Readings in the language testing literature:
  • Hamp-Lyons (1989)
  • Alderson and Wall (1993) Does washback exist? etc.
  • Language Testing (1996: 13, 3) Messick, Bailey, etc.
  • Hamp-Lyons (1997)
  • Watanabe (1997)
  • Cheng and Watanabe (eds) (2004)
• Recent PhD studies and subsequent books in SILT series based on research conducted in the 1990s:
  • Cheng (SILT 21 - 2005)
  • Wall (SILT 22 - 2005)
  • Hawkey (SILT 24 - 2006)
  • Green (SILT 25 - 2007) - “washback in context”

Page 25: Developing a model for investigating the impact of assessment

Washback/impact

• Washback (or backwash) has been broadly defined in the assessment literature as the effect of testing on teaching and learning
• One aspect of the broader phenomenon known as impact

Page 26: Developing a model for investigating the impact of assessment

15 washback hypotheses

• Based on who or what might be affected:
  • Teaching
  • Learning
  • Content
  • Rate of learning
  • Sequence of teaching/learning
  • Degree/depth of curriculum coverage
  • Attitudes of teachers/learners
  • Etc.

Alderson and Wall, 1993

Page 27: Developing a model for investigating the impact of assessment

Washback

• A continuum - stretching from harmful at one end, through neutral, to beneficial at the other end

Negative (-)    Neutral    Positive (+)

Page 28: Developing a model for investigating the impact of assessment

Washback

• Negative?
  • Restriction of content – narrowing of curriculum
  • Too much time practising for the test
• Positive?
  • Transparent objectives and outcomes
  • Increased motivation of learners
  • Increased accountability of teachers (?)

Page 29: Developing a model for investigating the impact of assessment

The “law” of unintended consequences

• “Any purposeful action will produce some unintended consequences” or side-effects
• “Goodhart’s Law” (or “Campbell’s Law” in the USA)
  • a variant of the “law” of unintended consequences

Page 30: Developing a model for investigating the impact of assessment

“Goodhart’s Law”

• “All performance indicators lose their meaning when adopted as policy targets”
• Examples:
  • England - school achievement targets - school league tables
  • USA – No Child Left Behind (NCLB)
• The clearer you are about what you want, the more likely you are to get it – but the less likely it is to mean what you wanted it to!

(Dylan Wiliam, Cambridge 2008)

Page 31: Developing a model for investigating the impact of assessment

Perverse incentives?

• Assessment policy can create a tension between
  • educational objectives at the micro level (teaching and learning in schools) and
  • a requirement for accountability at the macro level

Page 32: Developing a model for investigating the impact of assessment

Washback

• Negative?
  • Restriction of content – narrowing of curriculum
  • Too much time practising for the test
• Positive?
  • Transparent objectives and outcomes
  • Increased motivation of learners
  • Increased accountability of teachers (?)
• BUT – cause and effect explanations are rarely adequate …..

Page 35: Developing a model for investigating the impact of assessment

Washback Models

In the language testing literature:

• Hughes (1993)

• Bailey (1996)

• Watanabe (2004)

• Cheng (2004, 2005)

• Green (2007)

Page 36: Developing a model for investigating the impact of assessment

Bailey’s 1996 Model (based on Hughes 1993)

3 Ps:

• Participants
  • students
  • teachers
• Processes
• Products
  • learning
  • teaching
  • materials
  • curricula

Page 39: Developing a model for investigating the impact of assessment

Liying Cheng Dianne Wall Roger Hawkey

Studies in Language Testing series

Page 41: Developing a model for investigating the impact of assessment

Green, IELTS Washback in context: Preparation for academic writing in higher education (SILT 25, 2007)

[Diagram labels]
• Washback direction: focal construct; test design characteristics (item format, content, complexity, etc.); overlap; potential for negative backwash; potential for positive backwash
• Washback variability: participant characteristics and values (knowledge/understanding of test demands; resources to meet test demands; acceptance of test demands)
• Washback intensity: perception of test importance (important / unimportant); perception of test difficulty (easy / challenging / unachievable); no backwash / intense backwash
• Backwash to participant: other stakeholders, course providers, materials writers, publishers, teachers, learners

The model starts from test design characteristics and related validity issues of construct representation identified with washback by Messick (1996).

Washback will be most intense – have the most powerful effects on teaching and learning behaviours – where participants see the test as challenging and the results as important (see blue arrow).
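
The intensity part of the model described above can be sketched as a small decision rule. This is only an illustrative reading of the slide, not Green's own formulation: the function name, the `Difficulty` enum and the three output labels are hypothetical, and the "weak" outcome for easy or unachievable tests is an assumption, since the slide states only the "intense" case and the "no backwash" case.

```python
from enum import Enum


class Difficulty(Enum):
    """How participants perceive the difficulty of the test (labels from the slide)."""
    EASY = "easy"
    CHALLENGING = "challenging"
    UNACHIEVABLE = "unachievable"


def washback_intensity(test_is_important: bool, difficulty: Difficulty) -> str:
    """Return a coarse, illustrative washback-intensity label.

    Hypothetical helper: only the "none" and "intense" outcomes are taken
    from the slide; "weak" is an assumed label for the remaining cases.
    """
    if not test_is_important:
        # Results seen as unimportant: no backwash is expected.
        return "none"
    if difficulty is Difficulty.CHALLENGING:
        # Important results and a challenging test: the most intense washback.
        return "intense"
    # Important results, but the test is seen as easy or unachievable (assumption).
    return "weak"


if __name__ == "__main__":
    for important in (True, False):
        for level in Difficulty:
            print(important, level.value, washback_intensity(important, level))
```

Keeping importance and difficulty as separate inputs mirrors the slide's two perception boxes; a fuller treatment would also need the participant characteristics listed under washback variability.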

Page 42: Developing a model for investigating the impact of assessment

Studies in Language Testing, 25

IELTS - Washback in context

Studies in Language Testing series

Page 43: Developing a model for investigating the impact of assessment

The literature on washback/impact

So
• Impact is relatively new in the field of language assessment - an extension of the notion of washback and related to ethicality
• It is now considered to be of growing importance
• It is part of a validity argument and evidence needs to be provided

Broadly speaking there is consensus
• washback is an aspect of impact related to the “micro contexts” of the classroom and the school (teaching and learning)
• impact deals with wider influences and includes the “macro contexts” - tests and examinations in society

BUT
• The dynamics between the micro and macro contexts mean that this is a complex rather than a simple or linear relationship - a “complex dynamic system”

Page 44: Developing a model for investigating the impact of assessment

The literature on washback/impact

And currently:

• there has not been a comprehensive model of test or examination impact within educational contexts

• impact has not yet been fully integrated into an approach to test development and validation in a systematic way

Page 45: Developing a model for investigating the impact of assessment

Three case studies – 1995 to 2004

• Case 1 - the world-wide survey of the impact of IELTS
  • a starting point for the work and the original model for what has followed
  • a conceptualisation of impact and design/validation of suitable instruments to investigate it
• Case 2 - the Italian PL2000 project
  • an application of the model within a macro educational context
  • an initial attempt at applying the approach on a limited basis within a state educational context
  • Hawkey – SILT 24 (2006)
• Case 3 - the Florence Language Learning Gains Project
  • an extension and re-application of the model within a single school context
  • at the micro level focusing on individual stakeholders within a single language teaching institution

Page 46: Developing a model for investigating the impact of assessment

Case 1 - the IELTS Impact studies

The project had the following aim within the IELTS revision project (1993-5):

….. to investigate the impact of the test on candidates and on other test users, as part of the continuous process of ensuring that IELTS is as valid, effective and ethical as possible

IELTS 1995 Revision Project

Page 47: Developing a model for investigating the impact of assessment

Phases of the IELTS Impact Study

Phase One: Prof. C.Alderson (Lancaster University) was commissioned to develop first draft of data collection instruments (1995)

Phase Two: trialling, revision, rationalisation of instruments

Phase Three (2001-2004): pre-survey, main data collection, analyses, report

See:
• Research Notes (2, 2000; 6, 2001; 15, 2004)
• Alderson and Banerjee (SILT 11, 2001)
• Saville and Hawkey (2004 - in Cheng and Watanabe)
• Hawkey (SILT 24, 2006)

Page 48: Developing a model for investigating the impact of assessment

Stakeholder participation in Phase 3

• Responses received from:
  • 572 pre- and post-IELTS candidates
  • 83 teachers completing the teacher questionnaire
  • 43 teachers completing the instrument for the analysis of textbook materials
• Stakeholder interviews and focus groups at selected case study centres, involving:
  • 120 students
  • 21 teachers
  • 15 receiving institution administrators
• 12 “live” IELTS-preparation classes have been video-recorded and analysed.

Page 49: Developing a model for investigating the impact of assessment

Some key points and lessons learnt

• Setting objectives, design and research questions
  • The instruments – development and validation
  • The data – (strategies for collection, storage, retrieval)
  • The analysis and interpretation of multiple sources of data (quantitative and qualitative)
• Managing impact studies
  • practical, legal, ethical issues
  • project management and action planning
• But the IELTS international dimension introduces multiple contexts – many more case studies required in specific contexts

Page 50: Developing a model for investigating the impact of assessment

Using international certification in Italian state-sector education

Case 2 - the Italian PL2000 project

Page 51: Developing a model for investigating the impact of assessment

• an application of the approach within a single macro educational context

• an initial attempt at applying the approach on a limited basis within a state educational context

Case 2 - the Italian PL2000 project

Page 52: Developing a model for investigating the impact of assessment

Case 2 - the Italian PL2000 project

• The Progetto Lingue 2000 within the state school system of Italy
• As the name suggests - came into practice in the academic year 1999 to 2000

Page 53: Developing a model for investigating the impact of assessment

Progetto Lingue 2000

• The intention of the progetto was:

“.... to introduce innovation into the teaching and learning of other languages by putting greater emphasis on the development of communicative competence in all grades of the school system”

Italian Ministry document

Page 54: Developing a model for investigating the impact of assessment

Progetto Lingue 2000

• Emphasis on
  • the use of new technology in pedagogic contexts
  • self-study and the individualisation of the learning experience
• The adoption of a level system based on the Council of Europe’s Common European Framework of Reference (CEFR) as learning objectives and standards
• The option of getting a certificate of proficiency to certify the level reached
  • the certificate should be aligned to the CEFR scale and issued by a certificating body which is recognised internationally

Page 55: Developing a model for investigating the impact of assessment

Educational goals

Italy’s national learning goals integrated with pan-European - Council of Europe - goals

An educational innovation project

Progetto Lingue 2000

Page 56: Developing a model for investigating the impact of assessment

Progetto Lingue 2000

[Diagram: Educational goals; Resources; Teacher Development & support; Assessment and Certification; Curriculum design]

Page 57: Developing a model for investigating the impact of assessment

Progetto Lingue 2000

[Diagram: Educational goals; Assessment, Certification, including optional external certification]

Page 58: Developing a model for investigating the impact of assessment

PL2000 Impact Project 2001-2

Main interdependent language programme stakeholders and dimensions

[Diagram]
• Stakeholders: Students, Parents, Teachers, Teacher-trainers, Curriculum developers, Testers, Publishers, Receiving institutions, Employers
• Dimensions: Learning goals, curriculum, syllabus; Materials; Teacher Support; Testing; Methodology

Page 59: Developing a model for investigating the impact of assessment

Some key points and lessons learnt

• Applied lessons learnt in the IELTS studies
• Adapted the instruments and data collection techniques
• Introduced new features of data collection
  • Seven case study schools with school visits and interviews
• Proved the successful application of the approach within a national context
• Showed the possibility of matching learning objectives and tests via a “neutral” framework of reference – CEFR
• But – only limited data
• Test provider was an “outsider”

Page 60: Developing a model for investigating the impact of assessment

Studies in Language Testing, 24

Impact Theory and Practice

Studies in Language Testing series

Page 61: Developing a model for investigating the impact of assessment

Case 3 – Florence project

• the Florence Language Learning Gains Project
  • an extension and re-application of the model within a single school context
  • at the micro level focusing on individual stakeholders within a single language teaching institution (British Institute of Florence)

Page 62: Developing a model for investigating the impact of assessment

Key points and lessons learnt:

• Focus on washback on language performance and learning growth
  • Can the influence of the test be separated from the other variables?
• Longitudinal study over one academic year (2002-3)
• Participant learners were compared in terms of:
  • Competence level
  • Age
  • Stage
  • Motivation
  • External (high stakes) or internal final exam
  • Learning gain
• Provided multiple sources of very rich data
• But - difficult and costly to do
• Requires active participation of many stakeholder groups and individuals

Page 63: Developing a model for investigating the impact of assessment

Learning from the 3 impact case studies

• What can be learned using these specific impact projects as meta-data?

Page 64: Developing a model for investigating the impact of assessment

Learning from the 3 impact case studies

• Three key factors of contemporary educational systems need to be accounted for:

1. the nature of complex dynamic systems (see, for example, D. Larsen-Freeman 1997)

2. the roles that stakeholders play within such systems

3. the need to see assessment projects as educational innovations within the systems and to manage change effectively – need a theory of action

Page 65: Developing a model for investigating the impact of assessment

1. The nature of complex dynamic systems

Page 66: Developing a model for investigating the impact of assessment

2. The roles that stakeholders play

[Diagram: stakeholders linked to the testing system through “Inputs to test design” and “Contexts of test use - consequences”]
• Stakeholders (first group): Learners, Teachers, Test writers/examiners, Receiving institutions, School owners, Future employers, Government agencies, Professional bodies, Test centre administrators, Materials writers, Publishers, etc.
• Testing System: Test constructs, Test format, Test conditions, Test assessment criteria, Test scores
• Stakeholders (second group): Learners, Parents/carers, Teachers, Receiving institutions, Employers, School owners, Examiners, Government agencies, Professional bodies, Academic researchers, Test writers/Examiners, etc.

Page 70: Developing a model for investigating the impact of assessment

3. The need to see assessment projects as educational innovations and to manage change effectively

See Wall (SILT 22, 2005) … a case study using insights from testing and innovation theory, e.g. Henrichsen (1989)

Hybrid Model of the Diffusion / Implementation Process

[Diagram: Timeline running through Antecedents, Process, Consequences]

Page 71: Developing a model for investigating the impact of assessment

Learning from the case studies

• When applied to (language) assessment, two key factors also need to be accounted for:

a) the nature of the construct: language itself as a socio-cognitive phenomenon - the latest views on validity

b) the nature of the test development and validation process
  • from conception to routine data collection and analysis

• Impact research, therefore, is another kind of validation activity ........

Page 72: Developing a model for investigating the impact of assessment

a) A socio-cognitive framework

[Diagram: Theory; Test Taking (TT) Context: TLU, learning context, context of score use; consequential aspects of validity]

Messick, Bachman, Kane, Mislevy, Weir, etc.

Page 73: Developing a model for investigating the impact of assessment

A socio-cognitive framework

[Diagram: The testing system; Core Construct; Theory; Test Taking (TT) Context: TLU, learning context, context of score use; consequential aspects of validity]

see also Pellegrino

Page 74: Developing a model for investigating the impact of assessment

The contexts

[Diagram: Learning contexts; Testing contexts; Use of results contexts; Theory; Test Taking (TT) Context: TLU, learning context, context of score use; consequential aspects of validity]

Page 75: Developing a model for investigating the impact of assessment

The contexts

[Diagram: Theory; Test Taking (TT) Context: TLU, learning context, context of score use; Impact: consequential aspects of validity]

Page 76: Developing a model for investigating the impact of assessment

Inference to a framework - from Dr Neil Jones, based on Kane, Mislevy etc.

[Diagram: chain of inferences linking test performance, test score, true score, the “real world” (target situation of use) and CEFR levels]

• Evaluation (scoring model): How can we score what we observe? Relates to marking, rating criteria.
• Generalization (measurement model): Does the test measure consistently? Relates to test reliability, rater training, scale construction and version equating using IRT, etc.
• Extrapolation: Does the test score reflect the candidate’s actual ability? Relates to validity, e.g. a socio-cognitive model linking features of the learners, the test content and the skills to be measured.
• Idealization (link from the specific testing context to a context-neutral framework): How does the specific learning/testing context relate to a more general proficiency framework? Depends on identifying the salient features of the levels and the specific learner group – not all salient features may be relevant to all groups. Quantitative and qualitative evidence may be provided.

Page 77: Developing a model for investigating the impact of assessment

b) Model of the Test Development Process

“ … seek validity by design as a likely basis for washback”

Messick, 1996: 252

Seek "impact by design"

i.e. a theory of action

Saville, 2009

Page 78: Developing a model for investigating the impact of assessment

Identifying stakeholders and their needs

Linking these needs to the requirements of test usefulness - including predicted impact

- theoretical

- practical

Long-term, iterative processes - a key feature of validation

Model of the Test Development Process

Page 79: Developing a model for investigating the impact of assessment

Involvement of the stakeholder constituency

E.g. during test design and development

• presentation and consultation to do with specifications and detailed syllabus designs

• professional support programmes for institutions and individual teachers/students etc. who plan to use the examinations

• training and employment of suitable personnel within the field to work on all aspects of the examination cycle – to be question/item writers, to act as examiners, etc.

Cf. the Maxims referred to above

Page 80: Developing a model for investigating the impact of assessment

After an examination becomes operational

• Procedures need to be in place to collect data routinely which allows impact to be estimated:
  • who is taking the examination (i.e. a profile of the candidates)
  • who is using the examination results and for what purpose
  • who is teaching towards the examination and under what circumstances
  • what kinds of courses and materials are being designed and used to prepare candidates
  • what effect the examination has on public perceptions generally (e.g. regarding educational standards)
  • how the examination is viewed by those directly involved in educational processes (e.g. by students, examination takers, teachers, parents, etc.)
  • how the examination is viewed by members of society outside education (e.g. by politicians, business people, etc.)

Page 81: Developing a model for investigating the impact of assessment

Towards a comprehensive model

• How can these considerations be combined to produce a comprehensive, integrated model?

• to guide language testers in ways to build impact into test development and validation systems

• to promote research into impact by a wide range of stakeholders

Page 82: Developing a model for investigating the impact of assessment

A meta-framework building on Milanovic & Saville’s maxims (1996)

Four inter-related dimensions:

1. re-conceptualise the role of impact study within the assessment enterprise, vis-à-vis societal systems generally and language education specifically

2. introduce the concept of “impact by design” into the planning and operationalisation of language assessments by examination providers

3. re-organise validation procedures to incorporate impact research into operational activities to provide the basis for knowing about and understanding how well an assessment system works in practice with regard to its impact (as defined in point 1 above)

4. develop an appropriate theory of action which enables examination providers to work with stakeholders to achieve the intended objectives, to avoid negative consequences and to take remedial action when necessary.

Page 83: Developing a model for investigating the impact of assessment

“Impact by design”

• Integral part of a framework for developing and validating examination systems

• A concept akin to social impact assessment (SIA)

• Focus on what matters – e.g. successful learning

Page 84: Developing a model for investigating the impact of assessment

Impacts (positive and negative) anticipated in design phase

Impact research methodology used to find out what happens

Remedial action taken when needed on the basis of impact evidence

Key considerations

Centrality of language construct, theories of language learning
- a socio-cognitive model
- learning understood as change
- effective communication

Impact research incorporated into routine validation processes

Mixed method designs used with impact “toolkit” to collect quantitative and qualitative data

Importance of the timeline with iterative cycles of review and revisions implemented over time

Emergent aspects of validity

Improved understanding of the meaning of language assessment in context and of the effects and consequences on systems and people

Stance
- Perspective of UK examinations board
- Influenced by critical realism, contemporary pragmatism

Reconceptualising impact taking account of:
- theories of knowledge
- socio-cognitive theory
- constructivism
- theories of change

Impact by design

Procedural basis for knowing about effects and consequences

Theory of Action

A revised model (2009)

Page 85: Developing a model for investigating the impact of assessment

Applications beyond ESOL?

• Applying the model within the UK educational context:

• The Asset Languages Project (2003 onwards)

Page 86: Developing a model for investigating the impact of assessment

Conclusion

Investigating impact as validation

• The investigation of impact is not a discrete or one-off activity

• It is an essential component in establishing the overall validity (usefulness) of an assessment system in terms of its fitness for specific purposes and contexts of use

• The proposed model locates the study of test impact as one of a set of research and development tools within an iterative approach to on-going test validation

• It is consistent with Messick, 1996:

“In essence ..... test validation is empirical evaluation of meaning and consequences of measurement, taking into account extraneous factors in the applied setting that might erode or promote validity of local score interpretation and use.”

Page 87: Developing a model for investigating the impact of assessment

Thank You!

[email protected]