
___________________________________________________________________
CRILE Working Papers No. 58 (2004)
___________________________________________________________________

Is there life beyond language testing?
An introduction to alternative language assessment

Dina Tsagari

Abstract

This paper aims to be an introduction to the so-called ‘movement of alternative assessment’ (Alderson and Banerjee, 2001) that has recently made its appearance within the field of language testing and assessment. The paper attempts to familiarise readers interested in the area with the fundamental principles, much of the associated terminology and methods. It also raises a number of issues in the hope that they will serve as a springboard for further discussion, research and experimentation in the field.


1. Introduction

Language testing, generally associated with formal assessment procedures such as tests and examinations carried out at specified times and serving a variety of purposes (e.g. diagnostic, achievement, progress), is a vital component of instructional language programmes throughout the world.

While this type of assessment is a mainstay of educational programmes (Butterfield et al., 1999), educators and critics from various backgrounds have raised a number of concerns about its usefulness as the primary measure of student achievement.

Before attempting to discuss ‘alternative assessment’ at any length, it is useful first to look at some of the issues that have contributed to the need for assessment reform.

2. Concerns about language testing

2.1 Dissatisfaction with types of information gathered

Proponents of process-oriented curricula and instruction argue that traditional testing techniques, e.g. multiple-choice, gap-filling, matching, etc., are often incongruent with current second/foreign language classroom practices. In particular, they argue that the rich, descriptive information about the products and, more importantly, about the process of learning, together with the ongoing measurement of student growth needed for formative evaluation and for planning instructional strategies, cannot be gathered by conventional testing methods (Barootchi & Keshavarz, 2002). As Genesee and Hamayan (1994: 229) stress, “... tests can be useful for collecting information about student achievement under certain restricted conditions, but they are not particularly useful for collecting information about students' attitudes, motivation, interests, and learning strategies” (for similar discussions see also Archbald, 1991; Herman and Winters, 1994; Madaus, 1988; Resnick and Resnick, 1992; Wiggins, 1989a, 1989b, 1994; Wolf et al., 1991).

2.2 Dissatisfaction with high-stakes/standardised tests

The literature also presents an array of criticism with regard to the ‘washback effects’, or consequences, of high-stakes standardised tests and exams, experienced on a number of levels:


i) Curricular level

Critics of high-stakes tests attest that these are responsible for narrowing the school curriculum by directing teachers to focus only on those subjects and skills that are included in the examinations. As a consequence, such tests are said to “dominate and distort the whole curriculum” (Vernon, 1956: 166; see also Kirkland, 1971; Shepard, 1991; inter alia).

ii) Educational level

Critics also point out that high-stakes examinations affect:

a. the methodology teachers use in the classroom, i.e. teachers restrict the methods they use and employ various exam preparation practices (also known as “coaching” or “cramming”) at the expense of other learning activities which do not always contribute directly to passing the exam (Alderson and Wall, 1993; Haladyna et al., 1991; Shepard, 1990; Wall, 1996),

b. the range, scope and types of instructional materials teachers use, i.e. high-stakes exams gradually turn instructional materials into replicas of the actual examination papers (Bailey, 1999; Cheng, 1997; Gipps, 1994; Hamp-Lyons, 1998; Hilke and Wadden, 1997; Lam, 1993; Mehrens & Kaminsky, 1989; Paris et al., 1991),

c. students’ learning and studying practices, i.e. in high-stakes examination contexts students tend to adopt ‘surface’ approaches to learning as opposed to ‘deep’ approaches (Crooks, 1988; Entwistle and Entwistle, 1991; Newstead and Findlay, 1997). As a result, students’ ‘reasoning power’ is impeded, rote-memorisation is encouraged by a concentration on the recall of isolated details, and students resist attempts to engage in risky cognitive activities which can prove both effective and potentially beneficial for their future improvement (Black and Wiliam, 1998; Dietel, Herman and Knuth, 1991).

iii) Psychological level

Furthermore, high-stakes standardised tests are also said to have undesirable effects on:

a. students’ psychology, i.e. it is believed that the role of students in contexts where high-stakes tests are introduced is that of passive recipients of knowledge, and their needs and intentions are generally ignored. High-stakes tests are also said to have detrimental consequences for students’ intrinsic motivation, self-confidence, effort, interest and involvement in the language learning experience, and to induce negative feelings in students such as anxiety, boredom, worry and fear, which, according to the literature, are not conducive to learning (Broadfoot, 2003; Gipps, 1994; Madaus, 1988; Paris et al., 1991; Spielberger, 1972; Zeidner, 1996, 1998),

b. teachers’ psychology, i.e. it is argued that the dictates of high-stakes tests reduce the professional knowledge and status of teachers and exert a great deal of pressure on them to improve test scores, which eventually makes teachers experience negative feelings of shame, embarrassment, guilt, anxiety and anger (Gipps, 1994; Herman and Golan, 1993; Johnstone et al., 1995; Madaus, 1988; Shepard, 1991; Smith, 1991).

2.3 Dissatisfaction with teacher-made tests

In addition to the above, it is also argued that teacher-made tests, if used as the sole indicators of ability and/or growth of students in the classroom, may generate faulty results which cannot monitor student progress in the school curriculum (Barootchi & Keshavarz, 2002; O'Malley & Valdez Pierce, 1992).

It is also believed that the use of tests in classroom settings tends to over-emphasise the grading function at the expense of the learning function of the language learning process. As Black and Wiliam (1998) point out, in such contexts there is a tendency to use a normative rather than a criterion-referenced approach to assessment, which is likely to encourage competition between pupils rather than personal improvement, leading to de-motivation and making students lose confidence in their own capacity to learn (see also Black, 1993 and Crooks, 1988). In addition, it is also said that teachers do not generally review the assessment questions or tasks they use in their classroom tests and do not discuss them critically with peers. As a consequence, there is little reflection on what is being assessed (Black and Wiliam, 1998). Teachers, according to Black and Wiliam, also do not trust or use their test results, as these do not tell them what they need to know about their students’ learning, and they appear to be unaware of the assessment work of their colleagues, too (see also Harlen & Deakin-Crick, 2002, 2003).


2.4 Equity in education

Finally, interest groups representing both linguistically and culturally diverse students and students with special education needs have called for a change towards approaches to assessment that are more multiculturally sensitive and free of the normative, linguistic, and cultural biases found in traditional testing, in order to ensure equity in educational opportunities and achieve educational excellence for all students (Hamayan, 1995; Huerta-Macias, 1995; Martin-Kniep, 2000; Soodak, 2000; inter alia).

As a consequence of all the above criticisms, a shift in practice from psychometrics to educational assessment made its appearance. This new tendency in assessment has come to be known as the ‘alternative assessment movement’ in recent state-of-the-art articles (Alderson and Banerjee, 2001; Bachman, 2000; Worthen, 1993).

3. What is alternative assessment?

3.1 Definitions

There is no single definition of ‘alternative assessment’ in the relevant literature. For some educators, alternative assessment is a term adopted to contrast with standardised assessment, e.g. professionally-prepared objective tests consisting mostly of multiple-choice items, especially in the US tradition (Huerta-Macias, 1995). Others look at alternative assessment in more general terms. For instance, Hamayan (1995) states that alternative assessment “refers to procedures and techniques which can be used within the context of instruction and can be easily incorporated into the daily activities of the school or classroom” (ibid: 213). To this Smith (1999) adds that “[a]lternative assessment might take place outside the classroom or even the institution at various points in time, and the subjects being tested may be asked to present their knowledge in various ways” (ibid: 703).

Kohonen (1997) makes the point that alternative assessment (the author uses the term ‘authentic assessment’)

... emphasises the communicative meaningfulness of evaluation and the commitment to measure that which we value in education. It uses such forms of assessment that reflect student learning, achievement, motivation and attitudes on instructionally-relevant classroom activities ... Its results can be used to improve instruction, based on the knowledge of learner progress (ibid: 13).

In a more recent publication, Alderson and Banerjee (2001) provide the following definition:

‘Alternative assessment’ is usually taken to mean assessment procedures which are less formal than traditional testing, which are gathered over a period of time rather than being taken at one point in time, which are usually formative rather than summative in function, are often low-stakes in terms of consequences, and are claimed to have beneficial washback effects (ibid: 228).

3.2 Some further terminology

Besides the diversity of definitions of alternative assessment, there is also a plethora of terms used to refer to ways of assessing students’ language products and processes without the use of tests. Other than the term ‘alternative’ assessment itself (see Alderson and Banerjee, 2001; Balliro, 1993; Brown and Hudson, 1998a, 1998b; Brown, 1998; Clapham, 2000; Genesee & Upshur, 1996; Gipps and Stobbart, 2003; Hamayan, 1995; Hancock, 1994; Herman et al., 1992; Huerta-Macias, 1995; Shohamy, 1998; Smith, 1999; inter alia), the most frequent labels are:

‘authentic’ assessment (Cumming and Maxwell, 1999; Darling-Hammond, 1994; Elliott, 1991; Fradd et al., 1994; Hart, 1994; Kohonen, 1997, 1999, 2000; Newman et al., 1998; O’Malley and Valdez Pierce, 1996; Terwilliger, 1997, 1998; Wiggins, 1989a, 1989b, 1993; Wolf et al., 1991; inter alia),

‘performance’ assessment (Aschbacher, 1991; Shavelson et al., 1992; Soodak, 2000; inter alia),

‘continuous assessment’ (Bruton, 1991; Glover & Thomas, 1999; Puhl, 1997; inter alia),

‘on-going assessment’ (Carbery, 1999; Croker, 1999; inter alia),

as well as ‘informal assessment’, ‘descriptive assessment’, ‘direct assessment’, ‘dynamic assessment’, ‘instructional assessment’, ‘responsive evaluation’, ‘complementary assessment’, ‘formative assessment’, ‘portfolio assessment’, ‘situated/contextualised assessment’ and ‘assessment by exhibition’.

Due to lack of space, the differences in meaning and use of these terms cannot be discussed here; the interested reader can explore them through the references mentioned. The term ‘alternative assessment’ will be used in this paper since it is more generic than the other terms and incorporates characteristics of the other commonly-used labels.

4. Benefits of alternative assessment

Researchers and practitioners in the field believe that alternative assessment can:

a. Evaluate the process and product of learning as well as other important learning behaviours

It is stressed that because most alternative assessment is ongoing in nature, the picture that emerges about the learner and his or her language proficiency also reflects the developmental processes that take place in language learning over time. Thus, through alternative assessment, it is possible to focus on both the process and the product of language learning (Belanoff & Dickson, 1991; Genesee & Hamayan, 1994; Hamayan, 1995; Wiggins, 1989a, 1989b).

Beyond this, educationists also claim that through alternative assessment it is possible to collect information about some of the factors influencing achievement that are found in students’ linguistic, cultural, familial or educational backgrounds, e.g. their prior educational experiences, their family education, etc., which can be especially important when planning and evaluating the effectiveness of instruction (Genesee & Hamayan, 1994; Kohonen, 1997; O’Malley and Valdez Pierce, 1996).

Furthermore, Genesee and Upshur (1996) stress that alternative assessment methods can also gather information about those factors that affect student achievement which, according to the authors, should be seen as an integral part of students’ assessment, e.g.


• learning strategies (e.g. whether the student takes risks, improvises, focuses on meaning/form, self-corrects, uses first language strategies)

• affective and personality styles (e.g. whether the student is enthusiastic, self-reliant, resourceful, passive)

• students’ work habits (e.g. whether the student is punctual, follows instructions well, meets goals, prepares for class homework, seeks assistance when needed)

• students’ social behaviour (e.g. whether the student works cooperatively, socialises with peers, participates in class discussion)

• reactions to the course (e.g. whether the student participates actively in class activities, requires extra guidance, shows initiative)

b. Evaluate and monitor instruction

Alternative assessment is also believed to provide a strong link between instruction and assessment by forming part of a feedback loop that allows classroom teachers to monitor and modify instruction continually in response to the results of student assessment. This process is illustrated in Figure 1 (adapted from Genesee and Hamayan, 1994: 215).

Figure 1. Classroom-based assessment: instructional plans lead to instruction, which is followed by assessment; if the objective is achieved, instruction proceeds, and if it is not, the instructional plans are revised.


c. Produce meaningful results for a variety of stakeholders

It is also believed that the information obtained from alternative methods of assessment can be much more useful and informative than test scores, as well as easier to interpret and understand (Alderson and Banerjee, 2001; Clapham, 2000).

Hamayan (1995) makes the point that this represents a tremendous benefit not only for teachers but for other ‘clients’ of assessment, e.g. students, parents and administrators. In particular, she argues that alternative assessment methods allow students to “see their own accomplishments in terms that they can understand and, consequently, it allows them to assume responsibility for their learning” (ibid: 215), while parents are offered a clear insight into what their children are doing in school. Teachers are also provided with “data on their students and their classroom for educational decision-making....” (ibid: 215). Alternative assessment also gives them the opportunity to chronicle the success of the curriculum and can present them with a framework for organising students’ work. Even administrators can benefit: according to Hamayan, “administrators, who are typically least convinced of the advantages of alternative assessment, can benefit from the clear information about student and teacher attainment over time” (1995: 215).

d. Relate to cognitive psychology and related fields

Furthermore, alternative assessment is also said to be in line with views expressed in cognitive psychology, which suggest that learning is not linear, but proceeds in many directions at once and at an uneven pace. Under this perspective, as Dietel et al. (1991: 4) argue, students should be given the opportunity to use the strategies they have acquired at the right time and in the right way, so as to apply them to the realisation of particular tasks. They also stress that alternative assessment techniques allow learners plenty of time to ‘generate’ rather than ‘choose’ a response: once recently-acquired knowledge is brought to the forefront of their minds, alternative assessment activities call on the higher-order thinking skills of synthesis and analysis, which learners can later reconsider by working critically with the teacher or other learners in sharing perceptions.

e. Represent a collaborative approach to assessment

Alternative assessment also represents a collaborative approach to assessment that enables teachers and students to interact in the teaching/learning process (Barootchi & Keshavarz, 2002). Thus, in the context of alternative assessment, collaborative work is reinforced among students and/or between students and teachers within a relaxed classroom atmosphere.

f. Support students psychologically

In addition to the above, alternative assessment is said to enhance learners’ self-esteem and their feelings of efficacy as growing persons. Furthermore, it is believed that alternative assessment can foster intrinsic learning motivation and learner involvement (Broadfoot, 1986, 2003; Gardner, 1993; Gottlieb, 1995; Kohonen, 1997; Leach et al., 1998; Mortimer, 1998; Wiggins, 1993; Wolf et al., 1991; inter alia).

g. Promote autonomous and self-directed learning

It has also been argued that participating in alternative assessment can assist learners in becoming skilled judges of their own strengths and weaknesses and in setting realistic goals for themselves. This can develop their capacity to become self-directed and autonomous learners (by acquiring the necessary metacognitive knowledge and strategies, language learning strategies and cognitive styles) and thus to develop lifelong learning skills (Brindley, 2001; Council of Europe, 2001; Kohonen, 1999, 2000; Leites & Butureira, 2000; Lemos, 1999; Luoma and Tarnanen, 2003; inter alia).

h. Provide new roles for teachers

With regard to the role of teachers within the alternative assessment paradigm, Genesee (2001) points out that “[t]hese new evaluation approaches recognise classroom teachers as reflective, self-motivated professionals” (ibid: 150), while Kohonen (1997) notes that alternative assessment allows teachers more space for developing criteria (ibid: 14) and strengthens “the importance of the teacher’s professional judgement and commitment to enhancing student learning” (ibid: 13).

4.1 Alternative methods of assessment

The following list summarises some of the most commonly used types or methods of alternative assessment (based on Brown, 1998; Cohen, 1994; Genesee & Hamayan, 1994; Genesee and Upshur, 1996; Hamayan, 1995; Ioannou-Georgiou and Pavlou, 2003; Newman and Smolen, 1993; O’Malley and Valdez Pierce, 1996; Short, 1993):

• Conferences

• Debates

• Demonstrations

• Diaries/Journals

• Dramatizations

• Exhibitions

• Games

• Observations

• Peer-assessment

• Portfolios

• Projects

• Self-assessment

• Story retelling

• Think-alouds

It is important to note here, following Hamayan’s suggestion (1995: 218), that the above methods of assessment need to be distinguished from the tools or ways which educators can use to record alternative assessment information. The author cites the following as the most frequent ways of recording alternative assessment:

• Anecdotal records

• Checklists

• Learner profiles

• Progress cards

• Questionnaires

• Rating Scales

(for a different classification of methods of alternative assessment, see also Herman et al., 1992; Navarrete et al., 1990 and Short, 1993).

5. Concerns raised about certain qualities of alternative assessment

Although alternative assessment provides new possibilities for language evaluation, concerns about how certain of its qualities (i.e. conceptual, technical, practical, etc.) may be realised and/or appropriately investigated have been voiced by educational measurement and language testing specialists. For instance, it is argued that although alternative assessment documentation provides rich data about learning, it is much more costly and time-consuming for the teacher to administer and analyse thoughtfully in order to give accurate feedback to the learner, especially in classes with large numbers of learners (Alderson and Banerjee, 2001; Brindley, 2001; Clapham, 2000; Kohonen, 1997).

Another concern relates to the special skills teachers need in order to implement alternative methods of assessment successfully (Breen et al., 1997; Clark and Gipps, 2000). As Cizek (2000: 2) comments in the context of general education in the USA: “Perhaps the peskiest pocket of resistance in the assessment revolution is the inadequate preparation of teachers and administrators in the fundamentals of educational assessment”. To this Kohonen (1997) adds that learners also need a great deal of personal supervision and clear guidelines, as it is quite likely that certain learners may resist the new practices, being accustomed to more traditional language assessment practices.

Brown and Hudson (1998a, 1998b) also point out that alternative assessments need to satisfy the same standards or psychometric qualities as conventional tests, that is, validity, reliability and practicality, and should be critically evaluated for their ‘fitness for purpose’ (what Bachman and Palmer (1996) called ‘usefulness’). Brown and Hudson also emphasise that decisions on the use of any alternative assessment procedure should be informed by considerations of consequences (washback) and of the significance, need for, and value of feedback based on the assessment results (see also Alderson and Banerjee, 2001; Clapham, 2000; Gipps and Stobbart, 2003; Worthen, 1993).

Hamp-Lyons (1996) and Hamp-Lyons and Condon (2000), on the basis of their studies of portfolio assessment conducted mainly in the US, also argue the case for the adoption of a number of practices to ensure an ethical basis for the evaluation of alternative assessments, focusing their discussion on the following criteria:

1. transfer and generalizability
2. cognitive complexity
3. content quality
4. content coverage
5. meaningfulness
6. cost and effect

The question of whether alternative assessment can be used for large-scale evaluation is also discussed in the literature devoted to alternative assessment. Worthen (1993: 447-453) proposes that alternative assessment can reach its full potential in education for large-scale assessment applications if:

1. conceptual clarity is achieved to ensure consistency in the applications of alternative assessment.
2. a mechanism for evaluation and self-criticism of alternative assessment practices is established.
3. the users of alternative assessment, whether they are teachers or administrators, become well versed in issues of assessment and measurement.
4. standardisation of assessment judgements is introduced.


5. the ability to assess complex thinking skills can be established.
6. education’s key stakeholders (e.g. legislators, school boards, teachers, students, associations of professional educators, etc.) are persuaded of its importance and usefulness.
7. the fiscal and logistic feasibility of alternative assessment for large-scale assessment is shown.

(see also Brindley, 1998; Gipps and Stobbart, 2003; Stansfield, 1994).

Worthen also suggests that ‘in the interim, it would seem prudent to develop and test alternative assessment approaches in low-stakes settings where they can serve needs for better classroom assessment’ (ibid: 451).

Van Daalen (1999: 21) concurs that there is a need for on-going research on the psychometric features of alternative assessment as part of the development of alternative assessment procedures. Hamp-Lyons (1997) also sees the need for further studies: “We must conduct studies of the impact of alternative assessment, on the same basis that we apply to traditional forms of assessment. We cannot assume that because alternative assessments start from humanistic concerns they produce outcomes that do only good and no harm...” (ibid: 300).

6. Responses to concerns raised

Advocates of alternative assessment object to the above views on philosophical grounds. For instance, Huerta-Macias (1995) argues that alternative assessment is valid and reliable by virtue of its close integration with learning and teaching:

trustworthiness of a measure consists of its credibility and auditability. Alternative assessments are in and of themselves valid, due to the direct nature of the assessment. Consistency is ensured by the auditability of the procedure (leaving evidence of decision making processes), by using multiple tasks, by training judges to use clear criteria, and by triangulating any decision making process with varied sources of data (for example, students, families and teachers). Alternative assessment consists of valid and reliable procedures that avoid many of the problems inherent in traditional testing including norming, linguistic, and cultural biases (ibid: 10).


Hamayan (1995), a strong supporter of alternative assessment, also argues that alternative assessment approaches provide a wealth of information which could serve as a context for more valid interpretations of standardised test results. She also stresses that information from alternative assessment procedures can constitute the sole basis for educational and instructional decision-making.

Lynch (2001) further argues that alternative assessment represents a different paradigm (an ‘assessment culture’) and therefore cannot be evaluated from within the traditional positivist framework of educational measurement (a ‘testing culture’). Other researchers have also suggested that the application of psychometric criteria for technical adequacy may result in comparisons that reflect unfairly on alternative methods of assessment (Gipps, 1994; Linn et al., 1991; Moss, 1992). In this regard, it has been argued that new rules of evidence are needed for alternative assessment. In an attempt to address this issue, Linn et al. (1991) have proposed that a different set of validation criteria needs to be applied to alternative assessment, for instance:

• the extent of transfer and generalizability of the assessment tasks beyond the assessment situation

• the cognitive complexity of students’ responses to the assessment tasks

• the content quality of the tasks

• the adequacy of sampling

• the meaningfulness of the assessment to students, and

• the cost efficiency of the assessment system

(see also Garcia & Pearson, 1991; Gipps, 1994; Van Daalen, 1999).

7. Conclusion The alternative assessment paradigm, as discussed in the present paper, is seen to

embody a different concept of assessment, i.e. assessment as an essential part of the

learning process. However, further theoretical and empirical work needs to be done to

examine alternative assessment practices in depth. For example, we need to

reconceptualise alternative assessment and its relationship to standardised testing, to

understand how the aspects of alternative assessment are actually accomplished in

classroom interaction and to develop appropriate theory and research methods in the

study of this highly complex and dynamic teaching-learning-assessing interface

before any definite conclusions about its positive effects on teaching and learning are


drawn. Therefore, the present paper makes an urgent appeal to future researchers with

an interest in the area to conduct empirical research in this exciting field within

foreign/second language settings.


BIBLIOGRAPHY

Alderson, J. C. and Banerjee, J. 2001. Language testing and assessment (Part 1). Language Teaching 34, 4: 213-236.
Alderson, J. C. and Wall, D. 1993. Does washback exist? Applied Linguistics 14, 2: 115-129.
Archbald, D. A. 1991. Authentic assessment: an introduction to a neo-behavioural approach to classroom assessment. School Psychology Quarterly 6, 4: 273-278.
Aschbacher, P. R. 1991. Performance assessment: state activity, interest and concerns. Applied Measurement in Education 4, 4: 275-288.
Bachman, L. F. 2000. Modern language testing at the turn of the century: assuring that what we count counts. Language Testing 17, 1: 1-42.
Bachman, L. F. and Palmer, A. S. 1996. Language Testing in Practice. Oxford: Oxford University Press.
Bailey, K. M. 1999. Washback in Language Testing. TOEFL Monograph Series MS-15. Princeton, NJ: Educational Testing Service.
Balliro, L. 1993. What kind of alternative? Examining alternative assessment. TESOL Quarterly 27, 3: 558-561.
Barootchi, N. and Keshavarz, M. H. 2002. Assessment of achievement through portfolio and teacher-made tests. Educational Research 44, 3: 279-288.
Belanoff, P. and Dickson, M. eds. 1991. Portfolios: Process and Product. Portsmouth, NH: Boynton/Cook.
Black, P. and Wiliam, D. 1998. Assessment and classroom learning. Assessment in Education: Principles, Policy and Practice 5, 1: 7-74.
Black, P. J. 1993. Formative and summative assessment by teachers. Studies in Science Education 21: 49-97.
Breen, M., Barratt-Pugh, C., Derewianka, B., House, H., Hudson, C., Lumley, T. and Roth, M. 1997. Profiling ESL Children. Volume 1: Key Issues and Findings. Canberra: Department of Employment, Education, Training and Youth Affairs.
Brindley, G. 1998. Outcomes-based assessment and reporting in language learning programmes: a review of the issues. Language Testing 15, 1: 45-85.
Brindley, G. 2001. Outcomes-based assessment in practice: some examples and emerging insights. Language Testing 18, 4: 393-407.
Broadfoot, P. M. ed. 1986. Profiles and Records of Achievement. London: Holt, Rinehart and Winston.
Broadfoot, P. M. 2003. Dark alleys and blind bends: testing the language of learning. Paper presented at the 25th Language Testing Research Colloquium, 22-25 July, University of Reading.
Brown, J. D. ed. 1998. New Ways of Classroom Assessment. USA: TESOL.

Brown, J. D. and Hudson, T. 1998a. The alternatives in language assessment. TESOL Quarterly 32, 4: 653-675.
Brown, J. D. and Hudson, T. 1998b. The alternatives in language assessment: advantages and disadvantages. University of Hawai’i Working Papers in ESL 16, 2: 79-103.
Bruton, A. 1991. Testing round the world: continuous assessment in Spanish state schools. Language Testing Update 10: 14-20.
Butterfield, S., Williams, A. and Marr, A. 1999. Talking about assessment: mentor-student dialogues about pupil assessment in initial teacher training. Assessment in Education 6, 2: 225-246.
Carbery, S. 1999. Fundamentals of on-going assessment. Shiken: JALT Testing & Evaluation SIG Newsletter 3, 1: 3-7.
Cheng, L. 1997. The Washback Effect of Public Examination Change on Classroom Teaching. Department of Linguistics, University of Hong Kong: PhD thesis.
Cizek, G. 2000. Pockets of resistance in the assessment revolution. Educational Measurement: Issues and Practice 19, 3: 19-23.
Clapham, C. 2000. Assessment and testing. Annual Review of Applied Linguistics 20: 147-161.
Clark, S. and Gipps, C. 2000. The role of teachers in teacher assessment in England 1996-1998. Evaluation and Research in Education 14: 38-52.
Cohen, A. D. 1994. Assessing Language Ability in the Classroom. 2nd edition. Boston, MA: Heinle and Heinle.
Council of Europe. 2001. Common European Framework of Reference for Languages: Learning, Teaching, Assessment. Cambridge: Cambridge University Press.
Crooks, T. J. 1988. The impact of classroom evaluation practices on students. Review of Educational Research 58: 438-481.
Croker, R. 1999. Fundamentals of on-going assessment. Shiken: JALT Testing & Evaluation SIG Newsletter 3, 1: 8-12.
Cumming, J. J. and Maxwell, G. S. 1999. Contextualising authentic assessment. Assessment in Education 6, 2: 177-194.
Darling-Hammond, L. 1994. Performance-based assessment and educational equity. Harvard Educational Review 64, 1: 5-30.
Dietel, R. J., Herman, J. L. and Knuth, R. A. 1991. What does research say about assessment? NCREL, 1-17. Available online at http://www.ncrel.org/sdrs/areas/stw_esys/4assess.htm
Elliott, S. 1991. Authentic assessment: an introduction to a neobehavioral approach to classroom assessment. School Psychology Quarterly 6, 4: 273-278.
Entwistle, N. J. and Entwistle, N. 1991. Contrasting forms of understanding for degree examinations: the student experience and its implications. Higher Education 22: 205-227.
Fradd, S. H., Larrinaga McGee, P. and Wilen, D. K. 1994. Instructional Assessment. Reading, MA: Addison-Wesley.
Garcia, G. E. and Pearson, P. D. 1991. The role of assessment in a diverse society. In Literacy for a Diverse Society, ed. E. F. Hiebert. New York: Teachers’ College Press.
Gardner, H. 1993. Assessment in context: the alternative to standardized testing. In Changing Assessments: Alternative Views of Aptitude, Achievement and Instruction, eds. B. R. Gifford and M. C. O’Connor. Boston, MA: Kluwer Academic Press.
Genesee, F. 2001. Evaluation. In The Cambridge Guide to Teaching English to Speakers of Other Languages, eds. R. Carter and D. Nunan. Cambridge: Cambridge University Press. Pp. 144-150.
Genesee, F. and Hamayan, E. 1994. Classroom-based assessment. In Educating Second Language Children, ed. F. Genesee. Cambridge: Cambridge University Press. Pp. 212-239.
Genesee, F. and Upshur, J. 1996. Classroom-based Evaluation in Second Language Education. Cambridge: Cambridge University Press.
Gipps, C. and Stobart, G. 2003. Alternative assessment. In International Handbook of Educational Evaluation, eds. T. Kellaghan, D. L. Stufflebeam and L. A. Wingate. Dordrecht: Kluwer Academic Publishers. Pp. 549-576.
Gipps, C. V. 1994. Beyond Testing: Towards a Theory of Educational Assessment. London: Falmer Press.
Glover, P. and Thomas, R. 1999. Coming to grips with continuous assessment. Assessment in Education 6, 1: 117-127.


Gottlieb, M. 1995. Nurturing student learning through portfolios. TESOL Journal 5, 1: 12-14.
Haladyna, T. M., Nolen, S. B. and Haas, N. S. 1991. Raising standardized achievement test scores and the origins of test score pollution. Educational Researcher 20, 5: 2-7.
Hamayan, E. V. 1995. Approaches to alternative assessment. Annual Review of Applied Linguistics 15: 212-226.
Hamp-Lyons, L. 1996. Applying ethical standards to portfolio assessments in writing in English as a second language. In Performance Testing, Cognition and Assessment, eds. M. Milanovic and N. Saville. Cambridge: Cambridge University Press. Pp. 151-164.
Hamp-Lyons, L. 1997. Washback, impact and validity: ethical concerns. Language Testing 14, 3: 295-303.
Hamp-Lyons, L. 1998. Ethical test preparation practice: the case of the TOEFL. TESOL Quarterly (Forum Commentary) 32, 2: 329-337.
Hamp-Lyons, L. and Condon, W. 2000. Assessing the Portfolio. Cresskill, NJ: Hampton Press.
Hancock, C. R. 1994. Glossary of selected terms. In Teaching, Testing and Assessment: Making the Connection, ed. C. R. Hancock. Lincolnwood, IL: National Textbook Company. Pp. 235-240.
Harlen, W. and Deakin-Crick, R. 2002. A systematic review of the impact of summative assessment and tests on students’ motivation for learning (EPPI-Centre Review, version 1.1). Research Evidence in Education Library. London: EPPI-Centre, Social Science Research Unit, Institute of Education. Available online at http://eppi.ioe.ac.uk/
Harlen, W. and Deakin-Crick, R. 2003. Testing and motivation for learning. Assessment in Education: Principles, Policy & Practice 10, 2: 169-208.
Hart, D. 1994. Authentic Assessment: A Handbook for Educators. New York: Addison-Wesley.
Herman, J. L., Aschbacher, P. R. and Winters, L. 1992. A Practical Guide to Alternative Assessment. Alexandria, VA: Association for Supervision and Curriculum Development.
Herman, J. L. and Golan, S. 1993. The effects of standardized testing on teaching and schools. Educational Measurement: Issues and Practice 12, 4: 20-25.
Herman, J. and Winters, L. 1994. Portfolio research: a slim collection. Educational Leadership 52, 2: 48-55.
Hilke, R. and Wadden, P. 1997. The TOEFL and its imitators: analyzing the TOEFL and evaluating TOEFL-prep texts. RELC Journal 28, 1: 29-53.
Huerta-Macias, A. 1995. Alternative assessment: responses to commonly asked questions. TESOL Journal 5, 1: 8-11.
Ioannou-Georgiou, S. and Pavlou, P. 2003. Assessing Young Learners. Oxford: Oxford University Press.
Johnstone, P., Guice, S., Baker, K., Malone, J. and Michelson, N. 1995. Assessment of teaching and learning in literature-based classrooms. Teaching and Teacher Education 11: 359-371.
Kirkland, M. C. 1971. The effects of tests on students and schools. Review of Educational Research 41: 303-350.
Kohonen, V. 1997. Authentic assessment as an integration of language learning, teaching, evaluation and the teacher’s professional growth. In Current Developments and Alternatives in Language Assessment: Proceedings of LTRC 1996, eds. A. Huhta, V. Kohonen, L. Kurki-Suonio and S. Luoma. Jyvaskyla: University of Jyvaskyla. Pp. 7-22.
Kohonen, V. 1999. Authentic assessment in affective foreign language education. In Affect in Language Learning, ed. J. Arnold. Cambridge: Cambridge University Press. Pp. 279-294.
Kohonen, V. 2000. Student reflection in portfolio assessment: making language learning more visible. Babylonia 1: 15-18.
Lam, H. P. 1993. Washback: can it be quantified? A study on the impact of English examinations in Hong Kong. Department of Linguistics, University of Leeds, UK: MA dissertation.
Leach, L., Neutze, G. and Zepke, N. 1998. Motivation in assessment. In Motivating Students, eds. S. Brown, S. Armstrong and G. Thompson. England: Kogan Page. Pp. 201-209.
Leites, M. and Butureira, A. 2000. Self-directed learning as a growing trend in in-company EFL. Humanizing Language Teaching 2, 6: 1-7. Available online at http://www.hltmag.co.uk/NOV00/sart2.htm
Lemos, M. S. 1999. Students’ goals and self-regulation in the classroom. International Journal of Educational Research 31, 6: 471-485.
Linn, R., Baker, E. and Dunbar, S. 1991. Complex, performance-based assessment: expectations and validation criteria. Educational Researcher 20: 15-21.
Luoma, S. and Tarnanen, M. 2003. Creating a self-rating instrument for L2 writing: from idea to implementation. Language Testing 20, 4: 440-465.


Lynch, B. 2001. Rethinking assessment from a critical perspective. Language Testing 18, 4: 351-372.
Madaus, G. F. 1988. The influence of testing on the curriculum. In Critical Issues in Curriculum: 87th Yearbook of the National Society for the Study of Education, ed. L. N. Tanner. Chicago: University of Chicago Press. Pp. 83-121.
Martin-Kniep, G. O. 2000. Standards, feedback, and diversified assessment: addressing equity issues at the classroom level. Reading & Writing Quarterly 16: 239-256.
Mehrens, W. A. and Kaminski, J. 1989. Methods for improving standardized test scores: fruitful, fruitless or fraudulent? Educational Measurement: Issues and Practice 8, 1: 14-22.
Mortimer, J. 1998. Motivating student learning through facilitating independence: self and peer assessment of reflective practice, an action research project. In Motivating Students, eds. S. Brown, S. Armstrong and G. Thompson. England: Kogan Page. Pp. 173-187.
Moss, P. A. 1992. Shifting conceptions of validity in educational measurement: implications for performance assessment. Review of Educational Research 62, 3: 229-258.
Navarrete, C., Wilde, J., Nelson, C., Martinez, R. and Hargett, G. 1990. Informal Assessment in Educational Evaluation: Implications for Bilingual Education Programs. Washington, DC: National Clearinghouse for Bilingual Education.
Newman, C. and Smolen, L. 1993. Portfolio assessment in our schools: implementation, advantages and concerns. Mid-Western Educational Researcher 6: 28-32.
Newman, F., Brandt, R. and Wiggins, G. 1998. An exchange of views on ‘semantics, psychometrics, and assessment reform: a close look at “authentic” assessments’. Educational Researcher 27, 6: 19-22.
Newstead, S. E. and Findlay, K. 1997. Some problems with using examination performance as a measure of teaching ability. Psychology Teaching Review 6, 1: 23-30.
O’Malley, M. and Valdez Pierce, L. 1996. Authentic Assessment for English Language Learners. New York: Addison-Wesley.
Paris, S. G., Lawton, T. A., Turner, J. C. and Roth, J. L. 1991. A developmental perspective on standardized achievement testing. Educational Researcher 20, 5: 12-20.


Puhl, C. A. 1997. Develop, not judge: continuous assessment in the ESL classroom. English Teaching Forum 35, 2: 2-9.
Resnick, L. and Resnick, D. 1992. Assessing the thinking curriculum: new tools for educational reform. In Changing Assessments: Alternative Views of Aptitude, Achievement and Instruction, eds. B. R. Gifford and M. C. O’Connor. Boston, MA: Kluwer. Pp. 37-75.
Shavelson, R. J., Baxter, G. P. and Pine, J. 1992. Performance assessments: political rhetoric and measurement reality. Educational Researcher 21, 4: 22-27.
Shepard, L. A. 1990. Inflated test score gains: is the problem old norms or teaching the test? Educational Measurement: Issues and Practice 9: 15-22.
Shepard, L. A. 1991. Will national tests improve student learning? CSE Technical Report 342. CRESST, University of Colorado, Boulder. Available online at http://www.gseis.ucla.edu/
Shohamy, E. 1998. Alternative assessment in language testing: applying a multiplism approach. In Testing and Evaluation in Second Language Education, eds. E. Li and G. James. Hong Kong: Language Centre, Hong Kong University of Science and Technology. Pp. 99-114.
Short, D. J. 1993. Assessing integrated language and content instruction. TESOL Quarterly 27, 4: 627-656.
Smith, M. L. 1991. Put to the test: the effects of external testing on teachers. Educational Researcher 20, 5: 8-11.
Smith, K. 1999. Language testing: alternative methods. In Concise Encyclopedia of Educational Linguistics, ed. B. Spolsky. Amsterdam: Elsevier. Pp. 703-706.
Soodak, L. C. 2000. Performance assessment: exploring issues of equity and fairness. Reading and Writing Quarterly 16: 175-178.
Spielberger, C. D. 1972. Anxiety: Current Trends in Theory and Research. New York: Academic Press.
Stansfield, C. 1994. Developments in foreign language testing and instruction: a national perspective. In Teaching, Testing and Assessment: Making the Connection, ed. C. R. Hancock. Lincolnwood, IL: National Textbook Company. Pp. 43-67.
Terwilliger, J. 1997. Semantics, psychometrics, and assessment reform: a close look at “authentic” assessments. Educational Researcher 26, 8: 24-27.
Terwilliger, J. 1998. Rejoinder: response to Wiggins and Newman. Educational Researcher 26, 6: 22-23.
Van Daalen, M. 1999. Test usefulness in alternative assessment. Dialog on Language Instruction 13, 1&2: 1-26.
Vernon, P. E. 1956. The Measurement of Abilities. 2nd edition. London: University of London Press.
Wall, D. 1996. Introducing new tests into traditional systems: insights from general education and from innovation theory. Language Testing 13, 3: 334-354.
Wiggins, G. 1989a. A true test: toward more authentic and equitable assessment. Phi Delta Kappan 70: 703-713.
Wiggins, G. 1989b. Teaching to the (authentic) test. Educational Leadership 46, 7: 41-47.
Wiggins, G. 1993. Assessing Student Performance. San Francisco: Jossey-Bass.
Wiggins, G. 1994. Toward more authentic assessment of language performances. In Teaching, Testing and Assessment: Making the Connection, ed. C. R. Hancock. Lincolnwood, IL: National Textbook Company. Pp. 69-85.
Wolf, D., Bixby, J., Glenn, J. and Gardner, H. 1991. To use their minds well: investigating new forms of student assessment. Review of Research in Education 17: 31-74.
Worthen, B. R. 1993. Critical issues that will determine the future of alternative assessment. Phi Delta Kappan 74, 6: 444-456.
Zeidner, M. 1996. How do high school and college students cope with test situations? British Journal of Educational Psychology 66: 115-128.
Zeidner, M. 1998. Test Anxiety: The State of the Art. New York: Plenum Press.