testing evaluation pp t
TRANSCRIPT
-
7/29/2019 Testing Evaluation Pp t
1/51
THE PRACTICE OF ENGLISH LANGUAGE TEACHING
TESTING & EVALUATION
-
TESTING AND ASSESSMENT
Measurement and evaluation have been with us for a long time.
Since the effect of testing on teaching and learning is unavoidable, testing is an important part of every language teaching and language learning experience.
-
WHY TEST?
Has the instruction been successful?
Were the materials for instruction at the right level?
Have all language skills been emphasized equally?
What points need reviewing?
Should the same materials be used next year, or do they need some modifications?
-
TEST, EVALUATION, MEASUREMENT
-
TEST
Test is the narrowest of the three terms.
It often connotes the presentation of a set of questions to be answered.
It is one type of measurement.
In general, what distinguishes a test from other types of measurement is that it is designed to obtain a specific sample of behavior.
It may be used for pedagogic or descriptive purposes.
-
MEASUREMENT
Measurement is a broader term than test.
We can measure characteristics by means other than giving tests, e.g. using observation, rating scales, or other devices that allow us to obtain information in a quantitative form.
-
EVALUATION
It has been defined in a variety of ways.
In general, it refers to the systematic gathering of information for purposes of decision making.
In other words, evaluation is a professional judgment or a process that allows one to make a judgment about the desirability or value of a measure.
-
So, testing is not the only way in which information about people's language ability can be gathered.
It is just one form of assessment.
-
ASSESSMENT
SUMMATIVE
FORMATIVE
-
FORMATIVE ASSESSMENT
Teachers use it to check on the progress of their students,
to see how far they have mastered what they should have learned,
and then to use this information to modify their future teaching plans.
It can also be the basis for feedback to the students.
e.g. informal tests or quizzes
-
SUMMATIVE ASSESSMENT
It is used at the end of the term, semester, or year in order to measure what has been achieved both by groups and by individuals.
e.g. final examination
In most cases grades are assigned on the basis of performance on tests in addition to classroom performance.
-
Different reasons for testing learners, different kinds of tests:
Proficiency test
Diagnostic test
Placement test
Progress (achievement) test
Portfolio assessment
-
PLACEMENT TEST
It is used to sort new students into relatively homogeneous language ability groupings so that they can start a course at approximately the same level as the other students in the class.
It is one of the most frequently used tests at different levels of language instruction.
-
DIAGNOSTIC TEST
It is designed to show what skills or knowledge a learner has or lacks. In other words, it is used to identify students' strengths and weaknesses.
It is the reverse of an achievement test in the sense that, while the interest of the achievement test is in success, the interest of the diagnostic test is in failure: what has gone wrong, in order to develop remedies.
-
ACHIEVEMENT TEST
It is designed to measure how much students have learned from a particular set of instructional materials.
It is directly related to language courses: such tests normally come after a program of instruction, or their items are drawn directly from the content of instruction.
e.g. final, midterm, and class examinations
-
PROFICIENCY TEST
It is used to measure the overall language ability of the learners regardless
of any training they may have had in that language.
It seeks to answer the question: having learned this much, what can the student do with it?
e.g. Test of English as a Foreign Language (TOEFL)
-
PORTFOLIO ASSESSMENT
Many educational institutions allow students to assemble a portfolio of their work over a period of time (a term or semester), so that the students can be assessed by looking at three or four of the best pieces of work over this period.
-
ADVANTAGES
Provides evidence of students' effort.
Helps students to become more autonomous.
Helps them to self-monitor their own learning.
DISADVANTAGES
It is time-consuming.
Teachers will need clear
training in how to select
items from the portfolio
and how to give them
grades.
In preparing their portfolios,
students may have been
helped by others.
-
Characteristics of a good test
RELIABILITY
VALIDITY
-
VALIDITY
A test is valid if it measures what it is supposed to measure (construct validity) or can be used for the purposes for which it is intended.
A test is always "valid for" something: any given test may be valid for some purposes, but not for others.
Validity tells us what can be inferred from test scores.
-
Different kinds of validity
face validity: a test should look, on the face of it, as if it is valid. A test which consisted of only three multiple-choice items would not convince students of its face validity.
criterion-related validity: it is based on the extent to which performance on a newly-developed test is related to some other criterion measure which is an indicator of the ability tested.
content validity: a test has content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.
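One common way to put a number on criterion-related validity is to correlate scores on the new test with scores on the criterion measure. The sketch below is only an illustration: the function name `pearson_r` and the two score lists are hypothetical, and using the Pearson correlation is an assumption, not something these slides prescribe.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical data: the same five students on the new test and on
# an established criterion measure.
new_test = [52, 61, 70, 78, 90]
criterion = [50, 65, 68, 80, 88]

validity_coefficient = pearson_r(new_test, criterion)
print(round(validity_coefficient, 2))
```

A coefficient close to 1 would suggest that the new test ranks students in much the same way as the criterion; how high is "high enough" depends on the purpose of the test.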
-
First, write explicit specifications for the test and make
sure that you include a representative sample of the
content of these in the test.
Second, whenever feasible, use direct testing.
Third, make sure that the scoring of responses relates
directly to what is being tested.
Finally, do everything possible to make the test
reliable. If a test is not reliable, it cannot be valid.
-
What is reliability?
-
RELIABILITY
Reliability is a quality of test scores.
It refers to the consistency of measures across different times, test forms, raters, and other characteristics of the measurement context.
Synonyms for reliability are:
Dependability, stability, consistency, predictability, accuracy
-
How to make tests more reliable?
Take enough samples of behavior.
Exclude items which do not discriminate well between weaker and
stronger students.
Do not allow candidates too much freedom.
Provide clear and explicit instructions.
Write unambiguous items.
Provide uniform and non-distracting conditions of administration.
..
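The advice to exclude items that do not discriminate well can be made concrete with a simple discrimination index. The sketch below is one possible illustration under stated assumptions: the function name, the item names, and the response data are hypothetical, and splitting candidates into upper and lower halves by total score is an assumption (other splits are also used in practice).

```python
def discrimination_index(responses, item):
    """Share of the upper group answering `item` correctly minus the
    share of the lower group doing so.

    responses: list of dicts mapping item name -> 1 (correct) or 0 (wrong).
    Candidates are ranked by total score and split into two halves
    (the middle candidate is left out when the count is odd).
    """
    ranked = sorted(responses, key=lambda r: sum(r.values()), reverse=True)
    half = len(ranked) // 2
    if half == 0:
        raise ValueError("need at least two candidates")
    upper = ranked[:half]
    lower = ranked[-half:]
    correct_upper = sum(r[item] for r in upper)
    correct_lower = sum(r[item] for r in lower)
    return (correct_upper - correct_lower) / half


# Hypothetical responses of six candidates to three items.
responses = [
    {"q1": 1, "q2": 1, "q3": 1},
    {"q1": 1, "q2": 1, "q3": 1},
    {"q1": 1, "q2": 0, "q3": 1},
    {"q1": 0, "q2": 1, "q3": 1},
    {"q1": 0, "q2": 0, "q3": 1},
    {"q1": 0, "q2": 0, "q3": 1},
]

for item in ("q1", "q2", "q3"):
    print(item, round(discrimination_index(responses, item), 2))
```

In this made-up data, an item that everyone answers correctly gets an index of 0: it tells us nothing about who is stronger or weaker, so it is a candidate for exclusion.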
-
Two kinds of testing
Discrete-point testing:
tests only one thing at a time, and the answer is either correct or incorrect.
e.g. asking students to choose the correct form of a tense, or multiple-choice tests
Integrative testing:
expects students to use a variety of language at any given time.
e.g. writing a composition, taking a conversational oral test
-
Types of test items
Direct test item
It requires the candidate to perform precisely the skill we wish to measure.
It tries to be as much like real-life language use as possible.
e.g. writing samples, oral interview
Indirect test item
It tries to measure the abilities which underlie the skills in which we are interested.
e.g. multiple-choice questions, cloze procedures, sentence reordering
-
Time Line
Transformation and paraphrase
Sentence reordering
Multiple-choice items
Cloze procedures
-
MULTIPLE-CHOICE ITEMS
ADVANTAGES
Scoring is reliable, easy, and economical.
It is possible to include more items in a given period of time.
DISADVANTAGES
It is very difficult to write successful items.
It restricts what can be tested.
It tests only recognition knowledge.
Cheating may be facilitated.
-
CLOZE PROCEDURES
ADVANTAGES
It offers the ideal indirect, but integrative, testing items.
Cloze tests can be prepared quickly, and are an extremely cost-effective way of finding out about a testee's overall knowledge.
Because the deletion of words is random, it avoids test designers' failings.
Because of the randomness of the deleted words, anything may be tested (grammar, collocations, fixed phrases, ...).
Supplying the correct word for the blank implies an understanding of context and a knowledge of that word and how it operates.
DISADVANTAGES
In some cases, there are several possible answers.
The actual score a student gets depends on the particular words that are deleted, rather than on any general English knowledge (a problem of reliability).
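To show how quickly a cloze item can be produced, and why the score can depend on which words happen to be deleted, here is a minimal sketch of random word deletion. The function name `make_cloze`, its parameters, and the sample passage are all hypothetical; the slides only say that deletion is random, so details such as protecting the first few words are assumptions.

```python
import random


def make_cloze(text, n_gaps, seed=None, keep_first=0):
    """Replace `n_gaps` randomly chosen words with numbered blanks.

    keep_first: number of words at the start left intact for context.
    Returns the gapped text and the list of deleted words (answer key).
    Punctuation attached to a word is deleted with it in this sketch.
    """
    words = text.split()
    candidates = range(keep_first, len(words))
    if n_gaps > len(candidates):
        raise ValueError("not enough words to delete")
    rng = random.Random(seed)  # a fixed seed makes the test reproducible
    gap_positions = sorted(rng.sample(candidates, n_gaps))
    answers = [words[i] for i in gap_positions]
    for number, i in enumerate(gap_positions, start=1):
        words[i] = f"({number}) ____"
    return " ".join(words), answers


# Hypothetical passage.
passage = (
    "Testing is an important part of every language teaching "
    "and language learning experience"
)

cloze_text, answer_key = make_cloze(passage, n_gaps=3, seed=1, keep_first=3)
print(cloze_text)
print(answer_key)
```

Running the function with a different `seed` deletes different words from the same passage, which illustrates the reliability problem noted above: two versions of the "same" cloze test may not be equally difficult.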
-
TRANSFORMATION AND PARAPHRASE
They tell us something about the candidates' knowledge of the language system.
-
SENTENCE REORDERING
It tells us quite a lot about students' underlying knowledge of syntax and lexico-grammatical elements.
Although such items are easy to write, it is not always possible to ensure only one correct order.
-
DIRECT TEST ITEMS
To have valid and reliable direct test items, test designers need to
do the following:
-
Create a level playing field: in a written test, or when testing receptive skills, we need to avoid making excessive demands on the students' general or specialist knowledge. That is, the topics should not be too general or too specific.
Replicate real-life interaction: tests of listening or speaking should reflect real life, i.e. the texts should be as realistic as possible.
-
WRITING AND MARKING TESTS
-
WRITING TESTS
1- assess the test situation
2- decide what to test
3- balance the elements
4- weight the scores
5- make the test work
-
OBJECTIVE SCORING: a method of scoring in which the scores are given according to some predetermined criteria. In this method, each correct answer is usually counted as one point.
SUBJECTIVE SCORING: a method of scoring in which the scoring procedures do not follow any objective criteria. So, the fluctuation of scores from one scorer to another creates a serious problem. To compensate for the inadequacies of subjective scoring, the following solutions are recommended:
-
1- Training
It means that scorers should not come to the task fresh. They should first see some scripts at different levels.
e.g. they may be allowed to watch videoed oral tests in order to be trained to rate samples of spoken language accurately and consistently in terms of predetermined descriptions of performance.
-
2- More than one scorer
More scorers, more reliability.
The more people who look at a script, the greater the chance that its true worth will be located somewhere between the various scores it is given.
Sometimes we can use a moderator whose job is to check samples of scorers' work to see that it conforms with the general standards laid down for the exam.
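The idea of using more than one scorer can be sketched as follows. This is an illustration under assumptions, not a prescribed procedure: averaging the scores, the `tolerance` threshold for sending a script to a moderator, and all names and data are hypothetical.

```python
from statistics import mean


def combine_scores(scores_by_script, tolerance=2):
    """Combine several scorers' marks for each script.

    scores_by_script: dict mapping script id -> list of scores,
    one per scorer. The final mark is the average; a script is
    flagged for moderation when the scorers disagree by more than
    `tolerance` points (an assumed rule, chosen for illustration).
    """
    results = {}
    for script_id, scores in scores_by_script.items():
        spread = max(scores) - min(scores)
        results[script_id] = {
            "final": mean(scores),
            "needs_moderation": spread > tolerance,
        }
    return results


# Hypothetical marks out of 20 from three scorers.
marks = {
    "script_01": [14, 15, 13],
    "script_02": [10, 16, 12],
}

results = combine_scores(marks)
for script_id, outcome in results.items():
    print(script_id, outcome)
```

Here the second script would be passed to a moderator because its scores range widely, while the first is accepted with the average of the three marks.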
-
3- Global assessment scale
One way of specifying scores is to use pre-defined descriptions of performance. Such descriptions say what the students need to be capable of in order to gain the required marks.
However, they are not without problems:
The descriptions may not exactly match the student who is speaking.
Different teachers may not agree on the meaning of the scale descriptors.
-
4- Analytic profiles
We can mark tests for different elements instead of giving one general assessment.
Criteria, each given its own score:
Fluency
Use of vocabulary
Use of grammar
Pronunciation
Repair skills
Task completion
Intelligibility
A combination of global and analytic scoring gives us the best chance of reliable marking.
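An analytic profile can be turned into a single mark by scoring each criterion separately and then combining the scores, optionally with weights. The sketch below assumes a 0 to 5 band for every criterion and a percentage result; the band range, the weights, the function name, and the sample profile are all hypothetical choices made for illustration.

```python
CRITERIA = [
    "fluency",
    "vocabulary",
    "grammar",
    "pronunciation",
    "repair_skills",
    "task_completion",
    "intelligibility",
]


def analytic_score(profile, weights=None, max_band=5):
    """Return a percentage from per-criterion band scores.

    profile: dict mapping each criterion to a band from 0 to max_band.
    weights: optional dict of multipliers; equal weights by default.
    """
    if weights is None:
        weights = {criterion: 1 for criterion in CRITERIA}
    for criterion in CRITERIA:
        if criterion not in profile:
            raise ValueError(f"missing score for {criterion}")
        if not 0 <= profile[criterion] <= max_band:
            raise ValueError(f"band out of range for {criterion}")
    total = sum(profile[c] * weights[c] for c in CRITERIA)
    maximum = sum(max_band * weights[c] for c in CRITERIA)
    return 100 * total / maximum


# Hypothetical profile for one candidate in an oral test.
profile = {
    "fluency": 4,
    "vocabulary": 3,
    "grammar": 3,
    "pronunciation": 4,
    "repair_skills": 2,
    "task_completion": 5,
    "intelligibility": 4,
}

# Assumed weighting: task completion counts double.
weights = {criterion: 1 for criterion in CRITERIA}
weights["task_completion"] = 2

print(round(analytic_score(profile), 1))
print(analytic_score(profile, weights))
```

Keeping the per-criterion bands alongside the combined mark preserves the profile itself, which is what tells a teacher or student where the strengths and weaknesses lie.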
-
5- Scoring and interacting during oral tests
If we separate the role of scorer (or examiner) from the role of interlocutor (the examiner who guides and provokes conversation), it will allow the scorer to observe and assess, free from the responsibility of keeping the interaction with the candidate or candidates going.
e.g. In a test of speaking, we can put students in pairs or groups for certain tasks. It will help to relax students in a way that interlocutor-candidate interaction might not.
-
TEACHING FOR TESTS
Backwash (washback) effect: the effect of the nature of a test on teaching and learning. In other words, it is the potential impact of tests on test takers and their characteristics, on teaching and learning activities, and on the educational system and society.
If a test is regarded as important and the stakes are high, preparation for it can come to dominate all teaching and learning activities.
Two kinds:
Harmful (negative) backwash: when the test and testing techniques are at variance with the objectives of the course.
Beneficial (positive) backwash: when the test and testing techniques support the objectives of the course.
-
What does teaching for tests mean?
Exam teachers
They quite reasonably want their students to pass the tests and exams they are going to take, so their teaching becomes dominated by the test. Suffering from the backwash effect, they might stick rigidly to exam-format activities in class.
In such a situation, the format of the test is determining the format of the class.
Non-exam teachers
They might use a range of
different activities.
-
Many teachers believe that teaching exam classes is extremely satisfying because:
Since students perceive a clear sense of purpose and are highly motivated to do as well as possible, they are in some sense easier to teach than students whose focus is less clear.
Also, in training students to develop good exam skills (e.g. working on their own, reviewing what they have done, learning to use reference tools, keeping an independent learning record, etc.), we push them towards autonomous learning.
-
However, to be a good exam-preparation teacher is not easy.
Such teachers need to be familiar with the test their students are taking, to be able to answer their students' concerns and worries, and to walk a fine line between good exam preparation and the washback effect. So there are a number of things they can do in an exam class:
Train for test types
Discuss general exam skills
Do practice tests
Have fun
Ignore the test
-
1- Train for test types
Generally, we can make students familiar with the test items they will have to face so that they can give their best and the test discovers their level of English.
e.g. we can show them the various types of tests.
Help them to understand what the test designer is aiming for.
Help them to focus on what they are being asked to do and why.
and so on
2- Discuss general exam skills
We can remind students about general test and exam skills and teach them how to organize their work so that they can revise effectively.
e.g. help them to pace themselves so that they do not spend a disproportionate amount of time on only one part of the exam.
Remind them how easily they can find the answer by reading the question carefully, and
-
3- Do practice tests
It means giving students the chance to practice taking the test so that they get a feel for the experience.
During a course, students can sit practice papers or whole practice tests.
4- Have fun
Although students need to practice certain test types, this does not have to be done in a boring or tense way. Teachers can use a number of ways of having fun with tests and exams.
e.g. teachers can ask students to write their own test items, based on language they have been working on and the examples they have seen so far.
-
5- Ignore the test
Warning
Exam teachers should be aware that only discussing exam techniques and taking practice tests in class may make lessons monotonous. In other words, in such classes there is a possibility that general English improvement will be sacrificed to exam preparation.
To avoid this problem, we need to ignore the exam from time to time so that we have opportunities to work on general language issues and to encourage students to take part in the kind of motivating activities that are appropriate for all English lessons.
-
THANK YOU