testing evaluation pp t

Upload: kelemen-maria-magdolna

Post on 14-Apr-2018


TRANSCRIPT

  • 7/29/2019 Testing Evaluation Pp t

    1/51

    THE PRACTICE OF ENGLISH LANGUAGE TEACHING

    TESTING & EVALUATION

    2/51

    TESTING AND ASSESSMENT

    Measurement and evaluation have been with us for a long time.

    Since the effect of testing on teaching and learning is unavoidable, testing is an important part of every language teaching and language learning experience.

    3/51

    WHY TEST?

    Has the instruction been successful?

    Were the materials for instruction at the right level?

    Have all language skills been emphasized equally?

    What points need reviewing?

    Should the same materials be used next year, or do they need some modifications?

    4/51

    TEST, EVALUATION, MEASUREMENT

    5/51

    TEST

    Test is the narrowest of the three terms.

    It often connotes the presentation of a set of questions to be answered.

    In general, what distinguishes a test from other types of measurement is that it is designed to obtain a specific sample of behavior.

    It is one type of measurement.

    It may be used for pedagogic or descriptive purposes.

    6/51

    MEASUREMENT

    It is a broader term than test.

    We can measure characteristics by means other than giving tests, e.g. by using observation, rating scales, or other devices that allow us to obtain information in a quantitative form.

    7/51

    EVALUATION

    It has been defined in a variety of ways.

    In general, it refers to the systematic gathering of information for purposes of decision making.

    In other words, evaluation is a professional judgment, or a process that allows one to make a judgment about the desirability or value of a measure.

    8/51

    So, testing is not the only way in which information about people's language ability can be gathered.

    It is just one form of assessment.

    9/51

    ASSESSMENT

    SUMMATIVE

    FORMATIVE

    10/51

    FORMATIVE ASSESSMENT

    Teachers use it to check on the progress of their students and to see how far they have mastered what they should have learned.

    They then use this information to modify their future teaching plans.

    It can also be the basis for feedback to the students.

    e.g. informal tests or quizzes

    11/51

    SUMMATIVE ASSESSMENT

    It is used at the end of the term, semester, or year in order to measure what has been achieved both by groups and by individuals.

    e.g. final examination

    In most cases, grades are assigned on the basis of performance on tests in addition to classroom performance.

    12/51

    Different reasons for testing learners

    Different kinds of tests

    Proficiency test

    Diagnostic test

    Placement test

    Progress (Achievement) test

    Portfolio assessment

    13/51

    PLACEMENT TEST

    It is used to sort new students into relatively homogeneous language ability groupings so that they can start a course at approximately the same level as the other students in the class.

    It is one of the most frequently used tests at different levels of language instruction.
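
    The sorting step described above can be sketched in a few lines of Python. This is only an illustration: the cut scores (30 and 60), the level names, and the student scores are hypothetical and do not come from the slides.

```python
from bisect import bisect_right

def place(score, cut_scores=(30, 60),
          levels=("beginner", "intermediate", "advanced")):
    """Return the level whose score band contains `score`.

    cut_scores and levels are hypothetical; a score equal to a cut
    score is placed in the higher band.
    """
    return levels[bisect_right(cut_scores, score)]

# Hypothetical placement-test results for five new students.
results = {"Ana": 22, "Ben": 47, "Chen": 60, "Dara": 85, "Eli": 29}

groups = {}
for name, score in results.items():
    groups.setdefault(place(score), []).append(name)

for level, names in groups.items():
    print(level, names)
```

    Each resulting group contains students with broadly similar scores, which is the "relatively homogeneous grouping" the slide refers to.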

    14/51

    DIAGNOSTIC TEST

    It is designed to show what skills or knowledge a learner has or does not have. In other words, it is used to identify students' strengths and weaknesses.

    It is the reverse of an achievement test in the sense that, while the interest of the achievement test is in success, the interest of the diagnostic test is in failure, in what has gone wrong, in order to develop remedies.

    15/51

    ACHIEVEMENT TEST

    It is designed to measure the degree of students' learning from a particular set of instructional materials.

    It is directly related to language courses. This means that such tests normally come after a program of instruction, or that their items are drawn directly from the content of instruction.

    e.g. final, midterm, and class examinations

    16/51

    PROFICIENCY TEST

    It is used to measure the overall language ability of the learners regardless of any training they may have had in that language.

    It seeks to answer the question: having learned this much, what can the student do with it?

    e.g. Test of English as a Foreign Language (TOEFL)

    17/51

    PORTFOLIO ASSESSMENT

    Many educational institutions allow students to assemble a portfolio of their work over a period of time (a term or semester), so that the students can be assessed by looking at three or four of the best pieces of work from this period.

    18/51

    ADVANTAGES

    It provides evidence of students' effort.

    It helps students to become more autonomous.

    It helps them to self-monitor their own learning.

    DISADVANTAGES

    It is time-consuming.

    Teachers will need clear training in how to select items from the portfolio and how to grade them.

    In preparing their portfolios, students may have been helped by others.

    19/51

    Characteristics of a good test

    RELIABILITY

    VALIDITY

    20/51

    VALIDITY

    A test is valid if it measures what it is supposed to measure (construct validity), or if it can be used for the purposes for which it is intended.

    Valid + for

    This means that any given test may be valid for some purposes, but not for others.

    Validity tells us what can be inferred from test scores.

    21/51

    Different kinds of validity

    Face validity: a test should look, on the face of it, as if it is valid. A test which consisted of only three multiple-choice items would not convince students of its face validity.

    Criterion-related validity: this is based on the extent to which performance on a newly developed test is related to some other criterion measure which is an indicator of the ability tested.

    Content validity: a test has content validity if its content constitutes a representative sample of the language skills, structures, etc. with which it is meant to be concerned.

    22/51

    23/51

    First, write explicit specifications for the test and make

    sure that you include a representative sample of the

    content of these in the test.

    Second, whenever feasible, use direct testing.

    Third, make sure that the scoring of responses relates

    directly to what is being tested.

    Finally, do everything possible to make the test

    reliable. If a test is not reliable, it cannot be valid.

    24/51

    What is reliability?

    25/51

    RELIABILITY

    Reliability is a quality of test scores.

    It refers to the consistency of measures across different times, test forms, raters, and other characteristics of the measurement context.

    Synonyms for reliability are:

    Dependability, stability, consistency, predictability, accuracy
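
    One common way to put a number on consistency across different times is to correlate the scores from two sittings of the same test. The sketch below uses the Pearson correlation coefficient for this; the slides do not prescribe a particular statistic, and the scores are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of the same length (>= 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary within each list")
    return cov / sqrt(var_x * var_y)

# Hypothetical scores of the same five students on two sittings.
first_sitting = [12, 15, 9, 18, 14]
second_sitting = [13, 14, 10, 19, 13]

reliability = pearson_r(first_sitting, second_sitting)
print(round(reliability, 2))
```

    A value close to 1 suggests that students are ranked in much the same way on both occasions. The same function could be applied to the marks given by two raters to the same scripts.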

    26/51

    How to make tests more reliable?

    Take enough samples of behavior.

    Exclude items which do not discriminate well between weaker and stronger students.

    Do not allow candidates too much freedom.

    Provide clear and explicit instructions.

    Write unambiguous items.

    Provide uniform and non-distracting conditions of administration.

    ..
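
    To decide which items "do not discriminate well", test writers often compare how the strongest and the weakest students performed on each item. The sketch below computes a simple upper-lower discrimination index under stated assumptions: the group size, the item names, and the response data are all hypothetical.

```python
def discrimination_index(responses, item, group_size):
    """Upper-lower discrimination index for one item.

    responses: list of dicts mapping item name -> 1 (correct) or 0.
    group_size: number of students in the top and in the bottom group
    (an assumed parameter; the slides do not fix a group size).
    """
    if group_size < 1 or 2 * group_size > len(responses):
        raise ValueError("group_size must fit twice into the class")
    # Rank students by total score, strongest first.
    ranked = sorted(responses, key=lambda r: sum(r.values()), reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    upper_correct = sum(r[item] for r in upper)
    lower_correct = sum(r[item] for r in lower)
    return (upper_correct - lower_correct) / group_size

# Hypothetical results of five students on a five-item test.
students = [
    {"q1": 1, "q2": 1, "q3": 1, "q4": 1, "q5": 1},
    {"q1": 1, "q2": 1, "q3": 1, "q4": 0, "q5": 1},
    {"q1": 1, "q2": 1, "q3": 0, "q4": 0, "q5": 1},
    {"q1": 1, "q2": 0, "q3": 0, "q4": 0, "q5": 1},
    {"q1": 0, "q2": 0, "q3": 0, "q4": 0, "q5": 1},
]

for item in ("q2", "q5"):
    print(item, discrimination_index(students, item, group_size=2))
```

    In this made-up data every student answers q5 correctly, so its index is 0 and it tells us nothing about who is stronger or weaker. Such an item is a candidate for exclusion.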

    27/51

    Two kinds of testing

    Discrete-point testing:

    tests only one thing at a time, and the answer is either correct or incorrect.

    e.g. asking students to choose the correct form of a tense, or multiple-choice tests

    Integrative testing:

    expects students to use a variety of language at any given time.

    e.g. writing a composition, taking a conversational oral test

    28/51

    Types of test items

    Direct test item

    It requires the candidate to perform precisely the skill we wish to measure.

    It tries to be as much like real-life language use as possible.

    e.g. writing samples, oral interview

    Indirect test item

    It tries to measure the abilities which underlie the skills in which we are interested.

    e.g. multiple-choice questions, cloze procedures, sentence reordering

    29/51

    Transformation and paraphrase

    Sentence reordering

    Multiple-choice items

    Cloze procedures

    30/51

    MULTIPLE-CHOICE ITEMS

    ADVANTAGES

    Scoring is reliable, easy, and economical.

    It is possible to include more items in a given period of time.

    DISADVANTAGES

    It is very difficult to write successful items.

    It restricts what can be tested.

    It tests only recognition knowledge.

    Cheating may be facilitated.

    31/51

    CLOZE PROCEDURES

    ADVANTAGES

    It offers the ideal indirect, but integrative, testing item.

    Cloze tests can be prepared quickly, and are an extremely cost-effective way of finding out about a testee's overall knowledge.

    Because the deletion of words is random, it avoids test designers' failings.

    Because of the randomness of the deleted words, anything may be tested (grammar, collocations, fixed phrases, ...).

    Supplying the correct word for the blank implies an understanding of context and a knowledge of that word and how it operates.

    DISADVANTAGES

    In some cases, there are several possible answers.

    The actual score a student gets depends on the particular words that are deleted, rather than on any general English knowledge (a problem of reliability).

    32/51

    TRANSFORMATION AND PARAPHRASES

    They tell us something about the candidates' knowledge of the language system.

    33/51

    SENTENCE REORDERING

    It tells us quite a lot about students' underlying knowledge of syntax and lexico-grammatical elements.

    Although such items are easy to write, it is not always possible to ensure that there is only one correct order.

    34/51

    DIRECT TEST ITEMS

    To have valid and reliable direct test items, test designers need to

    do the following:

    35/51

    Create a level playing field: in the case of a written test, or in testing receptive skills, we need to avoid making excessive demands on the students' general or specialist knowledge. That is, the topics should not be too general or too specific.

    Replicate real-life interaction: tests of listening or speaking should reflect real life, i.e. the text should be as realistic as possible.

    36/51

    WRITING AND MARKING TESTS

    37/51

    WRITING TESTS

    1- assess the test situation

    2- decide what to test

    3- balance the elements

    4- weight the scores

    5- make the test work

    38/51

    OBJECTIVE SCORING: a method of scoring in which the scores are given according to some predetermined criteria. In this method, each correct answer is usually counted as one point.
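
    Because objective scoring follows a predetermined key, it can be carried out mechanically. The following minimal sketch awards one point per matching answer; the answer key and the student's responses are hypothetical.

```python
def objective_score(answer_key, responses):
    """Award one point for each response that matches the key.

    Unanswered items are simply missing from `responses` and earn
    no point.
    """
    return sum(
        1
        for item, correct in answer_key.items()
        if responses.get(item) == correct
    )

# Hypothetical key for a four-item multiple-choice test.
answer_key = {"1": "b", "2": "d", "3": "a", "4": "c"}

# Hypothetical student responses; item 4 was left blank.
student = {"1": "b", "2": "a", "3": "a"}

score = objective_score(answer_key, student)
print(score, "out of", len(answer_key))
```

    Any two scorers applying the same key arrive at the same mark, which is why this kind of scoring is considered reliable.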

    SUBJECTIVE SCORING: a method of scoring in which the scoring procedures do not follow any objective criteria. So, the fluctuation of scores from one scorer to another creates a serious problem. To compensate for the inadequacies of subjective scoring, the following solutions are recommended:

    39/51

    1- Training

    Scorers should not come to the task fresh. They should see some scripts at different levels.

    e.g. they may be allowed to watch videoed oral tests in order to be trained to rate samples of spoken language accurately and consistently in terms of predetermined descriptions of performance.

    40/51

    2- More than one scorer

    More scorers, more reliability.

    The more people who look at a script, the greater the chance that its true worth will be located somewhere between the various scores it is given.

    Sometimes we can use a moderator, whose job is to check samples of scorers' work to see that it conforms with the general standards laid down for the exam.
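
    One simple way to combine the marks of several scorers is to take their mean and to pass scripts on which the scorers disagree widely to the moderator. The sketch below illustrates that idea only; the 2-point threshold and the marks are hypothetical and are not taken from the slides.

```python
from statistics import mean

def combine_scores(scores, max_spread=2):
    """Combine several scorers' marks for one script.

    max_spread is a hypothetical threshold: if the highest and lowest
    marks differ by more than this, the script is flagged.
    """
    if not scores:
        raise ValueError("at least one score is required")
    spread = max(scores) - min(scores)
    return {
        "mark": mean(scores),
        "spread": spread,
        "needs_moderation": spread > max_spread,
    }

# Hypothetical marks out of 10 given by three scorers to two scripts.
print(combine_scores([6, 7, 8]))
print(combine_scores([4, 8, 9]))
```

    The first script gets a combined mark of 7 with little disagreement. The second is flagged, because the scorers' marks are five points apart.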

    41/51

    3- Global assessment scale

    One way of specifying scores is to use pre-defined descriptions of performance. Such descriptions say what the students need to be capable of in order to gain the required marks.

    However, they are not without problems:

    The descriptions may not exactly match the student who is speaking.

    Different teachers may not agree on the meaning of the scale descriptors.

    42/51

    4- Analytic profiles

    We can mark tests for different elements, instead of giving one general assessment.

    Score criteria:

    Fluency

    Use of vocabulary

    Use of grammar

    Pronunciation

    Repair skills

    Task completion

    Intelligibility

    A combination of global and analytic scoring gives us the best chance of reliable marking.
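
    An analytic profile can be turned into a single mark by adding up the band awarded for each criterion, optionally with weights. The sketch below uses the criteria listed on this slide; the 0-5 band scale, the weights, and the sample profile are hypothetical.

```python
CRITERIA = [
    "fluency",
    "use of vocabulary",
    "use of grammar",
    "pronunciation",
    "repair skills",
    "task completion",
    "intelligibility",
]

def analytic_total(profile, weights=None, max_band=5):
    """Return a percentage mark from per-criterion bands.

    profile: dict mapping each criterion to a band from 0 to max_band.
    weights: optional dict of relative weights (default: all equal).
    The band scale and the weights are assumptions for illustration.
    """
    if weights is None:
        weights = {criterion: 1 for criterion in CRITERIA}
    missing = [c for c in CRITERIA if c not in profile]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    earned = sum(profile[c] * weights[c] for c in CRITERIA)
    possible = sum(max_band * weights[c] for c in CRITERIA)
    return 100 * earned / possible

# Hypothetical profile for one candidate in an oral test.
profile = {
    "fluency": 4,
    "use of vocabulary": 3,
    "use of grammar": 3,
    "pronunciation": 4,
    "repair skills": 2,
    "task completion": 5,
    "intelligibility": 4,
}

print(round(analytic_total(profile), 1))
```

    Keeping the separate bands as well as the total lets the teacher report the mark and still show the student where the strengths and weaknesses lie.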

    43/51

    5- Scoring and interacting during oral tests

    If we separate the role of scorer (or examiner) from the role of interlocutor (the examiner who guides and provokes conversation), the scorer can observe and assess, free from the responsibility of keeping the interaction with the candidate or candidates going.

    e.g. in tests of speaking, we can put students in pairs or groups for certain tasks. This helps to relax students in a way that interlocutor-candidate interaction might not.

    44/51

    TEACHING FOR TESTS

    Backwash (washback) effect: the effect of the nature of a test on teaching and learning. In other words, it is the potential impact of a test on test takers and their characteristics, on teaching and learning activities, and on the educational system and society.

    Two kinds:

    Harmful (negative) backwash: when the test and testing techniques are at variance with the objectives of the course.

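    Beneficial (positive) backwash: if a test is regarded as important and the stakes are high, preparation for it can come to dominate all teaching and learning activities. The backwash is beneficial when the test and testing techniques are in line with the objectives of the course.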

    45/51

    What does teaching for tests mean?

    Exam teachers

    They quite reasonably want their students to pass the tests and exams they are going to take, so their teaching becomes dominated by the test. Suffering from the backwash effect, they might stick rigidly to exam-format activities in class.

    In such a situation, the format of the test determines the format of the class.

    Non-exam teachers

    They might use a range of

    different activities.

    46/51

    Many teachers believe that teaching exam classes is extremely satisfying because:

    Since students perceive a clear sense of purpose and are highly motivated to do as well as possible, they are in some sense easier to teach than students whose focus is less clear.

    Also, in training students to develop good exam skills (e.g. working on their own, reviewing what they have done, learning to use reference tools, keeping an independent learning record, etc.), we push them towards autonomous learning.

    47/51

    However, being a good exam-preparation teacher is not easy.

    Such teachers need to be familiar with the test their students are taking, to be able to answer their students' concerns and worries, and to walk a fine line between good exam preparation and the washback effect. So there are a number of things they can do in an exam class:

    Train for test types

    Discuss general exam skills

    Do practice tests

    Have fun

    Ignore the test

    48/51

    1- Train for test types

    Generally, we can make students familiar with the test items they will have to face so that they can give their best, and so that the test discovers their level of English.

    e.g. we can show them the various types of tests.

    Help them to understand what the test designer is aiming for.

    Help them to focus on what they are being asked to do and why.

    and so on

    2- Discuss general exam skills

    We can remind students about general test and exam skills and teach them how to organize their work so that they can revise effectively.

    e.g. help them to pace themselves so that they do not spend a disproportionate amount of time on only one part of the exam.

    Remind them how easily they can find the answer by reading the question carefully, and

    49/51

    3- Do practice tests

    This means giving students the chance to practice taking the test so that they get a feel for the experience.

    During a course, students can sit practice papers or whole practice tests.

    4- Have fun

    Although students need to practice certain test types, this does not have to be done in a boring or tense way. Teachers can use a number of ways of having fun with tests and exams.

    e.g. teachers can ask students to write their own test items, based on language they have been working on and the examples they have seen so far.

    50/51

    5- Ignore the test

    Warning

    Exam teachers should be aware that only discussing exam techniques and taking practice tests in class may make lessons monotonous. In other words, in such classes there is a possibility that general English improvement will be sacrificed to exam preparation.

    To avoid this problem, we need to ignore the exam from time to time so that we have opportunities to work on general language issues and to encourage students to take part in the kind of motivating activities that are appropriate for all English lessons.

    51/51

    THANK YOU