A Training Course on Test Design for ELT Teachers


A Training Course on Test Design for ELT Teachers

2015 / 2016

State of Kuwait
Ministry of Education
ELT General Supervision

Course Schedule and Content (three hours for each topic)

1. Assessment & Evaluation; Competences / Standards-Based Assessment
2. Designing Tests Using High Levels of Thinking
3. Cornerstones of Test Design
4. Interpreting Test Scores & Benefiting from the Results
5. Constructing a Good Test According to the Question Types Designed by the ELT General Supervision

ELT General Supervision

Topic One

Assessment & Evaluation

Competences / Standards-Based Assessment

Assessment & Evaluation

Introduction:

The overall goal of assessment is to improve student learning. Assessment provides

students, parents/guardians, and teachers with valid information concerning student

progress and their attainment of the expected curriculum. Assessment and evaluation

measure whether or not learning and/or learning objectives are being met. This requires the

gathering of evidence of student performance over a period of time to measure learning

and understanding. Evidence of learning could take the form of dialogue, journals, written

work, portfolios, and tests along with many other learning tasks. Evaluation, on the other
hand, occurs when a mark is assigned after the completion of a task, test, quiz, lesson or

learning activity.

The difference between assessment and evaluation:

Assessment: an ongoing process aimed at improving student learning, programs, and
services. It involves:

• publicly sharing expectations
• defining criteria and standards for quality
• gathering, analyzing, and interpreting evidence about how well performance matches the criteria
• using the results to document, explain, and improve performance

Evaluation: focuses on grades. Evaluation appraises the strengths and weaknesses of

programmes, policies, personnel, products, and organizations to improve their

effectiveness.

Key differences between assessment and evaluation:

• Content (timing, primary purpose): Assessment is formative: ongoing, to improve learning. Evaluation is summative: final, to gauge quality.
• Orientation (focus of measurement): Assessment is process-oriented: how learning is going. Evaluation is product-oriented: what has been learned.
• Findings (uses thereof): Assessment is diagnostic: it identifies areas for improvement. Evaluation is judgmental: it arrives at an overall grade/score.

How to Assess Students’ Learning and Performance


Learning takes place in students’ heads where it is invisible to others. This means that

learning must be assessed through performance: what students can do with their learning.

Assessing students' performance can involve different types of assessments.

Types of assessment in Competency Based Education

Summative assessment (Assessment of Learning)

• assists teachers in making judgments about student achievement at certain relevant points in the learning process or unit of study (e.g. end of course, project, semester)
• can be used formally to measure the level of achievement of learning outcomes (e.g. tests, labs, assignments, projects, presentations, etc.)
• can also be used to judge programme, teaching and/or unit of study effectiveness (that is, as a form of evaluation).

Formative assessment (Assessment for Learning)

• is the practice of building a cumulative record of student achievement
• usually takes place during day-to-day learning experiences and involves ongoing, informal observations throughout the term, course, semester or unit of study
• is used to monitor students' ongoing progress and to provide immediate and meaningful feedback
• assists teachers in modifying or extending their programmes or adapting their learning and teaching methods
• is very applicable and helpful during early group work processes.

Assessment as Learning

Assessment as learning develops and supports students' metacognitive skills. This form of

assessment is crucial in helping students become lifelong learners. As students engage in

peer and self-assessment, they learn to make sense of information, relate it to prior

knowledge and use it for new learning. Students develop a sense of ownership and

efficacy when they use teacher, peer and self-assessment feedback to make adjustments,

improvements and changes to what they understand.

Comparing Assessment for Learning and Assessment of Learning

• Assessment for Learning (Formative Assessment): checks learning to determine what to do next and then provides suggestions of what to do; teaching and learning are indistinguishable from assessment.
  Assessment of Learning (Summative Assessment): checks what has been learned to date.

• Assessment for Learning: is designed to assist educators and students in improving learning.
  Assessment of Learning: is designed for the information of those not directly involved in daily learning and teaching (school administration, parents, school board, post-secondary institutions) in addition to educators and students.

• Assessment for Learning: is used continually by providing descriptive feedback.
  Assessment of Learning: is presented in a periodic report.

• Assessment for Learning: usually uses detailed, specific and descriptive feedback, in a formal or informal report.
  Assessment of Learning: usually compiles data into a single number, score or mark as part of a formal report.

• Assessment for Learning: is not reported as part of an achievement grade.
  Assessment of Learning: is reported as part of an achievement grade.

• Assessment for Learning: usually focuses on improvement, compared with the student's "previous best" (self-referenced, making learning more personal).
  Assessment of Learning: usually compares the student's learning either with other students' learning (norm-referenced, making learning highly competitive) or the standard for a grade level (criterion-referenced, making learning more collaborative and individually focused).

• Assessment for Learning: involves the student.
  Assessment of Learning: does not always involve the student.

Purposes of assessment and evaluation

Teaching and learning

The primary purpose of assessment is to improve students’ learning and teachers’ teaching

as both respond to the information it provides. Assessment for learning is an ongoing

process that arises out of the interaction between teaching and learning. What makes

assessment for learning effective is how well the information is used.

System Improvement


Assessment can do more than simply diagnose and identify students’ learning needs; it can

be used to assist improvements across the education system in a cycle of continuous

improvement:

• Students and teachers can use the information gained from assessment to determine their next teaching and learning steps.
• Parents and families can be kept informed of the next plans for teaching and learning and the progress being made, so they can play an active role in their children's learning.
• School administrations can use the information for school-wide planning, to support their teachers and determine professional development needs.
• The ELT board can use assessment information to assist its supervisory role and its decisions about staffing and resourcing.
• The Education Area can use assessment information to inform their advice for school improvement.
• The Ministry of Education can use assessment information to undertake policy review and development at a national level, so that government funding and policy intervention are targeted appropriately to support improved student outcomes.

The Role of Testing in Evaluation

Testing is closely tied to evaluation. Tests of some sort play a role in virtually all

educational programme evaluations; indeed, too often an "evaluation" is no more than a

hasty analysis of whether test scores rose.

What is a Test?

A test is defined as a systematic procedure for measuring a sample of behaviour. The

phrase "systematic procedure" indicates that a test is constructed, administered and scored

according to predetermined rules. It also indicates that test items are chosen to fit the test

specification, and the same items are administered to all persons under the same time
limits.

Types of Tests

Tests may be divided into many types:

A) In terms of technique:


1-Subjective Tests:

These tests take the form of writing sentences, paragraphs or essays. In subjective tests, it
usually happens that different markers award different scores to the same question.

2-Objective Tests:

The grading of these tests is independent of the person marking the tests because these

tests have definite answers, which have no room for subjectivity in grading.

Types of objective Tests:

1. Multiple choice tests.

2. True or False Tests.

3. Matching Tests.

B) In terms of what they are intended to measure:

1-Achievement Test:

This is designed to measure students' mastery of what should have been taught.

2-Diagnostic Test :

This test is designed to diagnose the problems or weaknesses our students may

encounter.

3-Proficiency Test: This test can be used to measure how suitable a candidate will be for performing a

certain task or following a specific course.

4-Aptitude Test:

This test predicts probable success or failure in certain areas of language study.

C) In terms of function:

1-Norm-Referenced Tests:

Such tests place students in rank order, i.e. they tell the examiner how a student has
performed compared with his classmates.

2-Criterion-Referenced Tests:

These tests tell the examiner whether the student has achieved the desired objectives or
not, regardless of other students' standards. Such tests can be used during the school year.

Conclusion

Assessment is integral to the teaching–learning process, facilitating student learning and

improving instruction, and can take a variety of forms. Assessment can enhance teaching

and student learning. All of the above purposes are important; if you use assessment

procedures appropriately, you will help all students learn well.

Competences / Standards Based Assessment Introduction:

Competency-Based Language Teaching (CBLT) focuses on what "learners are expected to
do with the language" (Richards & Rodgers, 2001, p.141). This approach emerged in the
United States in the 1970s and can be described as "defining educational goals in terms of
precise measurable descriptions of the knowledge, skills, and behaviours students should
possess at the end of a course of study" (Richards & Rodgers, 2001, p.141).

Instruction and assessment within CBE are specific to the development and evaluation of

the integration of knowledge, skills, and attitudes (competence). Teaching for competence

requires specific teaching methods that use teaching techniques which foster the


integration of knowledge, skills, and attitudes. Likewise, assessment of competence

requires assessment methods that allow the learner to integrate the knowledge and skills in

a meaningful way.

What are Competences?

Competences are integrated systems of knowledge, skills, attitudes, values and beliefs

developed through formal (and non-formal) education that allow individuals to become

responsible and autonomous persons, able to solve a diversity of problems and perform in

everyday life-settings at the quality level expressed by the standards.

The Kuwait National Curriculum operates with three types of competences:

Key Competences

General Competences

Specific Competences

Key Competences: represent a transferable, multifunctional package (system) of knowledge, skills,

values, attitudes, beliefs, and personal/social attributes that all individuals need to

acquire for their personal development, inclusion and employment

(i.e. for being successful in their personal and social life, as well as in their

professional career).

Key competences are supposed to be achieved by the end of the Secondary stage.

They are cross-curricular (i.e. non-subject specific), transferable and

multifunctional competences – so that, in principle, all subjects should contribute to

their development.

General Competences:

General competences are subject-specific.

They define the most general subject-based knowledge, skills and attitudes/values

embedded/integrated in students’ expected outcomes by the end of Grade 12.

Specific Competences:

are structured and developed in students during a school year.

As compared to the General Competences:

Specific Competences define more specific systems of integrated knowledge, skills

and attitudes/values.

11

They can even cover specialized, topic-based competences students are supposed to

display by the end of each grade. The specific competences are clustered in the

following four dimensions:

– A range of realities specific to the subject (knowledge);

– A range of operations (skills and strategies)

– A range of personal and social responses (attitudes, values)

– A range of connections with other subjects and domains.

Discipline-Based vs Competency-Based Curriculum Design

Competence-Based                              Discipline-Based
Outcomes                                      Content
Competences                                   Objectives
Criterion-referenced grade                    Norm-referenced grade
Objective assessment                          Subjective assessment
Learner centered                              Teacher centered
Integrated learning                           Passive learning
Andragogy                                     Pedagogy
Formative evaluation                          Summative evaluation
Learner performance                           Instructional delivery
Skills/performance focus                      Knowledge/theory focus
Outcomes focus                                Structural/process focus
Assessed by performance                       Assessed by counting
Time and sequence derived by assessment       Exposed to specific content for a pre-assigned time

Standards:

All competence-based curricula – including the new Kuwait National Curriculum –

introduce and largely use the concept of standards.

Standards describe expectations and are used to judge the level of performance in a

field or domain.

Teachers and Standards:

Before teachers begin to teach, they must answer the following questions:

• What standard(s) are being addressed?

• What type or format of assessment will be used? Will students have to write a report?

Answer multiple-choice questions? Do a project? Respond to a short scenario? Give an

oral report?


• What are the specific skills/knowledge students will have to include in their response to

demonstrate that they have learned the standard?

Standards-based education does not mean that teachers must abandon current classroom

projects and lessons. Much of the good presently done in classrooms is indeed appropriate

and should be retained and enhanced. However, what is done must be aligned to specific

standards.

Types of standards in KNC:

Performance standards

They refer to the quality level to be achieved by students in performing their general

competences by the end of each of the school stages – i.e. Primary, Intermediate,

and Secondary. The measurement of the performance standards is a matter of

different forms of national summative assessments or examinations.

Curriculum standards (Content Standards)

They refer to the quality level to be achieved by students in attaining the specific

competences. Curriculum standards describe to what extent the specific competences

should be achieved by the end of each grade. In the Kuwait Curriculum, curriculum

standards are related to specific competences defined in the subject curriculum.

Curriculum standards are a matter of school- and class-based formative and

summative assessment.

Why do we need standards?

• Clarify expectations for the expected results and performance.
• Help educators plan and carry out instruction and assessment.
• Provide a clear focus.
• Serve as a tool for accountability and quality assurance.
• Promote equity: every student can achieve high results.
• Ensure quality instruction through professional development.
• Provide guidance for teachers.
• Provide common criteria to assess learning and professional performance.
• Help establish criteria for selection, for certification and for recognition of high levels of performance.

In constructing any test, teachers may consult the list of competences and sub-competences
and relate them to the book content of the target grade.

Sample Competences (Grade 1) Primary Stage

Competence: Learners identify upper and lower cases.
Example (in tests): Match upper and lower cases (Reading).

Competence: Learners identify words beginning with sounds.
Example (in tests): Match pictures to word initials (Reading).

Competence: Learners trace and copy numbers.
Example (in tests): Copy numbers (Writing).

Competence: Learners recognize simple sentences while listening.
Example (in tests): Listen and tick the correct pictures; listen and number the correct pictures (Listening).

Sample Competences / Secondary Stage

Competence: Identify main ideas, topic sentences, and supporting details.
Example (in tests): Reading Comprehension (MCQs).

Competence: Read texts and draw inferences.
Example (in tests): Reading Comprehension (MCQs or Productive Questions).

Competence: Use vocabulary strategies to discern the meaning of words, for example roots, affixes, word classification, etc.
Example (in tests): Reading Comprehension (MCQs).

Competence: Read for specific information.
Example (in tests): Reading Comprehension (MCQs & Productive Questions).

Competence: Write a summary, a diary, a book review.
Example (in tests): Reading Comprehension (Summary-Making).

Competence: Write a report based on a discussion.
Example (in tests): Writing.

Competence: Write paragraphs on familiar topics and on previously learned academic content using the elements of a paragraph.
Example (in tests): Writing.

Competence: Compose multi-paragraph essays using the writing process with guidance.
Example (in tests): Writing.

Competence: Write in a variety of forms, i.e. narratives, content area reports, letters, autobiography.
Example (in tests): Writing.

Conclusion

To any globally minded teacher, it quickly becomes obvious that traditional assessment

practices—both classroom-based and large-scale measures—are inadequate to support the

complex mix of knowledge, skills, and dispositions that comprise global competence.

Schools that are committed to it as a goal for all students quickly realize that they must

leverage a variety of learning experiences, in and out of school, to ensure that students are

ready for the world.

References:

www.cmu.edu
cte.cornell.edu/teaching-ideas
www.columbia.edu
ep.uoregon.edu
https://teachingcommons.stanford.edu
www.edutopia.org/comprehensive-
The International Board of Standards for Training, Performance and Instruction
Linda Darling-Hammond, Charles E. Ducommun Professor of Education, Stanford University School of Education
The ELT National Curriculum Statement

Standards for the English Language Arts. National Council of Teachers of English, 1111 W. Kenyon Road, Urbana, Illinois 61801-1096
http://www.answers.com/topic/cut-off-score
http://asiasociety.org/education/partnership-global-learning/making-case/why-we-need-competency-based-education

Compiled by
Wafaa Al-Jraiwi, Private Sector
Huda Hussein Al-Ahmadi, Educational Area

Topic Two

Designing Tests Using High Levels of Thinking

Introduction

Critical thinking is the ability to apply reasoning and logic to new or unfamiliar ideas,

opinions, and situations. Thinking critically involves seeing things in an open-minded way

and examining an idea or concept from as many angles as possible. This important skill

allows people to look past their own views of the world and to better understand the

opinions of others.

Another aspect of critical thinking is the ability to approach a problem or situation

rationally. Rationality requires analyzing all known information, and making judgments or

analyses based on fact or evidence, rather than opinion or emotion.

Critical thinking is a very important aspect in every individual. Questioning techniques

were designed to develop our cognitive skill. Questions are as good as the responses

provided so it is better to sharpen our questioning skills. Our level of thinking is not so

much sharpened when the questions asked are more on factual recall or stock knowledge.

We get to work on our cognitive aspect through higher level questions that require more

than basic knowledge-level responses.

Questions must be constructed in a way that triggers evaluation, analysis, and synthesis of

facts and information. Higher level cognitive questions usually start with words such as
"Explain," "Describe," "Compare," "Why," and "How."

Bloom's Taxonomy

Bloom's Taxonomy provides an important framework for teachers to use to focus on

higher order thinking. It can assist teachers in designing performance tasks, and providing

feedback on student work.

Questions for Critical Thinking can be used in the classroom to develop all levels of

thinking within the cognitive domain. The results will improve attention to detail, increase

comprehension and expand problem solving skills.

The stages of thinking

For each level of the taxonomy, the representative verbs and question cues are listed below.

Remembering: Can the student recall or remember the information?
Representative verbs: define, duplicate, list, memorize, recall, repeat, reproduce, state.
Question cues:
• What is …?
• How is …?
• Where is …?
• When did _______ happen?
• How did ______ happen?
• How would you explain …?
• How would you describe …?
• What do you recall …?
• How would you show …?
• Who (what) were the main …?
• What are three …?
• What is the definition of …?

Understanding: Can the student explain ideas or concepts?
Representative verbs: classify, describe, discuss, explain, identify, locate, recognize, report, select, translate, paraphrase.
Question cues:
• How would you classify the type of …?
• How would you compare …? contrast …?
• How would you rephrase the meaning …?
• What facts or ideas show …?
• What is the main idea of …?
• Which statements support …?
• How can you explain what is meant …?
• What can you say about …?
• Which is the best answer …?
• How would you summarize …?

Applying: Can the student use the information in a new way?
Representative verbs: choose, demonstrate, dramatize, employ, illustrate, interpret, operate, schedule, sketch, solve, use, write.
Question cues:
• How would you use …?
• What examples can you find to …?
• How would you solve _______ using what you have learned …?
• How would you organize _______ to show …?
• How would you show your understanding of …?
• What approach would you use to …?
• How would you apply what you learned to develop …?
• What other way would you plan to …?
• What would result if …?
• How can you make use of the facts to …?
• What elements would you choose to change …?
• What facts would you select to show …?
• What questions would you ask in an interview with …?

Analyzing: Can the student distinguish between the different parts?
Representative verbs: appraise, compare, contrast, criticize, differentiate, discriminate, distinguish, examine, experiment, question, test.
Question cues:
• What are the parts or features of …?
• How is _______ related to …?
• Why do you think …?
• What is the theme …?
• What motive is there …?
• What conclusions can you draw …?
• How would you classify …?
• How can you identify the different parts …?
• What evidence can you find …?
• What is the relationship between …?
• How can you make a distinction between …?
• What is the function of …?

Evaluating: Can the student justify a stand or decision?
Representative verbs: appraise, argue, defend, judge, select, support, value, evaluate.
Question cues:
• Why do you agree with the actions? The outcomes?
• What is your opinion of …?
• How would you prove …? disprove …?
• How can you assess the value or importance of …?
• What would you recommend …?
• How would you rate or evaluate the …?
• What choice would you have made …?
• How would you prioritize …?
• What details would you use to support the view …?
• Why was it better than …?

Creating: Can the student create a new product or point of view?
Representative verbs: assemble, construct, create, design, develop, formulate, write.
Question cues:
• What changes would you make to solve …?
• How would you improve …?
• What would happen if …?
• How can you elaborate on the reason …?
• What alternative can you propose …?
• How can you invent …?
• How would you adapt _____ to create a different …?
• How could you change (modify) the plot (plan) …?
• What could be done to minimize (maximize) …?
• What way would you design …?
• What could be combined to improve (change) …?
• How would you test or formulate a theory for …?
• What would you predict as the outcome of …?
• How can a model be constructed that …?
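Because these verb and cue lists are reused every time items are drafted, it can help to keep them as plain data. The short sketch below is only an illustration (plain Python, with a few cues copied from the lists above); it is not an official tool of the course, and the function name is made up for the example.

# Illustrative sketch: a few of the Bloom's taxonomy question cues kept as data,
# so a question writer can look up stems for a chosen level.
BLOOM_CUES = {
    "Remembering":   ["What is ...?", "When did ... happen?", "What is the definition of ...?"],
    "Understanding": ["How would you summarize ...?", "What is the main idea of ...?"],
    "Applying":      ["How would you use ...?", "What would result if ...?"],
    "Analyzing":     ["What conclusions can you draw ...?", "How is ... related to ...?"],
    "Evaluating":    ["What is your opinion of ...?", "How would you prioritize ...?"],
    "Creating":      ["What would happen if ...?", "How would you improve ...?"],
}

def cues_for(level):
    """Return the stored question cues for a taxonomy level (case-insensitive)."""
    return BLOOM_CUES.get(level.strip().capitalize(), [])

for stem in cues_for("evaluating"):
    print(stem)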

Application to the story (Great Expectations)

Knowledge ( remembering )

What was the name of the convict Pip met in the graveyard?

Who was the real benefactor of Pip?

Comprehension ( understanding)

How would you summarize the characteristics of a real gentleman?

Application( applying)

What would the result be if Pip didn’t help Magwitch in the graveyard?

Analysis (analyzing)

Why do you think Pip wanted to be a gentleman?

Evaluation( evaluating)

What is your opinion of Pip’s treatment towards Joe, after becoming a gentleman?

Synthesis (creating)

What would happen if Pip didn’t meet Magwitch at all?

The Importance of Asking Higher Level Questions

Why are we encouraged to ask higher level thinking questions? For one, these types of

questions belong to the evaluative level which enhances critical thinking apart from literal

thinking. Also, a habitual orientation to questions at higher levels helps develop an

individual’s positive self-concept. It makes a person realize that their ideas are important

and can contribute.

A deeper level of comprehension is enhanced since individuals are able to relate to

personal background experiences of a situation. Repeating factual information is good, but

being asked to comprehend and infer from those facts will make us understand the essence

of an issue or a situation.

It is also important to note that for all questions, not all answers can be found on the page

or based on factual recall. Creativity in thinking is needed to promote new ideas and

concepts. A person becomes better prepared to face challenging situations at a more

mature level of thinking with the exposure to inferential and applied level of questions.


Levels of Questions

A very good practice for students in developing their thinking skills and increasing

comprehension is to ask effective questions. Instead of asking them lower-level questions

most of the time, it is important to balance literal or knowledge-based questions with

higher level questions so the students will learn to think at a higher level of thinking as

well.

What Should be the Main Objectives to Assess Higher Order Thinking?

1. It should assess students’ skills and abilities in analyzing, synthesizing, applying, and

evaluating information.

2. It should concentrate on thinking skills that can be employed with maximum flexibility,

in a wide variety of subjects, situations, contexts, and educational levels.

3. It should account for both the important differences among subjects and the skills and

processes common to all subjects.

4. It should focus on all forms of intellectual ability.

5. It should lead to the improvement of instruction.

6. It should make clear the inter-connectedness of our knowledge and abilities.

7. It should assess fundamental skills that are essential to being a responsible, decision-

making member of the work-place.

8. It should be based on clear concepts and have well-thought-out, rationally articulated

goals, criteria, and standards.

9. It should account for the integration of communication skills, problem-solving, and

critical thinking, and it should assess all of them without compromising essential

features of any of them.


10. It should respect cultural diversity by focusing on the common-core skills, abilities, and

traits useful in all cultures.

11. It should test thinking that promotes the active engagement of students in constructing

their own knowledge and understanding.

12. It should concentrate on assessing the fundamental cognitive structures of

communication, for example:

A. with reading and listening, the ability to

create an accurate interpretation,

assess the author’s or speaker’s purpose,

accurately identify the question-at-issue or problem being discussed,

accurately identify basic concepts of what is said or written,

see significant implications

identify, understand, and evaluate assumptions.

recognize evidence, argument, inference in oral and written presentations,

reasonably assess the credibility of an author or speaker,

accurately grasp the point of view of the author or speaker,

empathetically reason within the point of view of the author or speaker.

B. with writing and speaking, the ability to:

identify and explicate one’s own point of view and its implications,

be clear about and communicate clearly, in either spoken or written form, the

problem one is addressing,

be clear about what one is assuming, presupposing, or taking for granted,

present one’s position precisely, accurately, completely, and give relevant, logical,

and fair arguments for it,

cite relevant evidence and experiences to support one’s position,

see, formulate, and take account of alternative positions and opposing points of view,

recognizing and evaluating evidence and key assumptions on both sides,

illustrate one’s central concepts with significant examples and show how they apply

in real situations,

empathetically entertain strong objections from points of view other than one’s own.


13. It should assess the skills, abilities, and attitudes that are central to making sound

decisions and acting on them in the context of learning to understand our rights and

responsibilities as citizens, as well-informed and thinking consumers, and as

participants in a symbiotic world economy.

14. It should enable educators to see what kinds of skills are basic for the future.

15. It should be of a kind that will assess valuable skills applied to genuine problems as

seen by a large body of the public, both inside and outside of the educational

community.

16. It should include items that assess both the skills of thoughtfully choosing the most

reasonable answer to a problem from among a pre-selected set and the skills of

formulating the problem itself and of making the initial selection of relevant

alternatives.

17. It should contain items that, as much as possible, are examples of the real-life problems

and issues that people will have to think out and act upon.

18. It should enable educators to assess the gains they are making in teaching higher order

thinking.

References

Aiken, Lewis R., (1982). Writing multiple-choice items to measure higher-order

educational objectives. Educational and Psychological Measurement, 1982, Vol. 42,

pp. 803-806.

Bloom, B. S., Englehart, M. D., Furst, E. J., Hill, W. H., & Krathwohl, D. R. (1956).

Taxonomy of educational objectives. The classification of educational goals:

Handbook I. Cognitive domain. New York: David McKay.


Bloom’s Taxonomy, downloaded from Wikipedia 11/8/2011,

http://en.wikipedia.org/wiki/Bloom's_Taxonomy

Guilford, J.P., (1967). The nature of human intelligence, New York, McGraw-Hill

Hoepfl, Marie C. (1994) Developing and evaluating multiple-choice tests. The

Technology Teacher, April 1994, pp. 25-26.

Morrison, Susan, and Kathleen Walsh Free, (2001) Writing multiple-choice test

items that promote and measure critical thinking. Journal of Nursing Education,

January 2001, Vol. 40, No. 1, pp. 17-24.

Schreyer Institute for Teaching Excellence at Penn State, Writing multiple-choice

items to assess higher order thinking. Downloaded Nov. 1, 2011.

Compiled by Khawla AL-Refaae

Al-Aassema Educational Area

Topic Three

Cornerstones of Test Design

Introduction

Language testing at any level is a highly complex undertaking that must be based on

theory as well as practice. The guiding principles that govern good test design,

development and analysis include:

validity, reliability, administration (practicality), washback, impact, authenticity, and

transparency.

1. Validity

Refers to the accuracy of an assessment, to whether or not it measures what it is

designed to measure. When closely examined, however, the concept of validity reveals a

number of aspects, each of which deserves attention.

1.1 Content validity


A test is said to have content validity if its content constitutes a representative sample

of the language skills, structures, etc. with which it is meant to be concerned. The greater a

test’s content validity, the more likely it is to be an accurate measure of what it is

supposed to measure. In order to judge whether or not a test has content validity, we need

a specification of the skills or structures, etc. that it is meant to cover. Such a specification

should be made at a very early stage in test construction.

1.2 Criterion-related validity/ Empirical validity

There are essentially two kinds of criterion-related validity: concurrent validity and

predictive validity. Concurrent validity is established when the test and the criterion are

administered at about the same time. The second kind of criterion-related validity is

predictive validity. This concerns the degree to which a test can predict candidates’ future

performance.

1.3 Construct validity

Construct validity is the most important form of validity because it asks the

fundamental validity question: What does this test really measure?

1.4 Face validity

A test is said to have face validity if it looks as if it measures what it is supposed to

measure, for example, a test which pretended to measure pronunciation ability but which

did not require the candidate to speak might be thought to lack face validity. A test which

does not have face validity may not be accepted by candidates, teachers, education

authorities or employers.

2. Reliability

Reliability is a necessary characteristic of any good test. If the test is administered to

the same candidates on different occasions with no language practice work taking place

between these occasions, then, to the extent that it produces differing results, it is not

reliable. In short, in order to be reliable, a test must be consistent in its measurements.

How to make tests more reliable

• Write unambiguous items. It is essential that candidates should not be presented with items whose meaning is not clear or to which there is an acceptable answer which the test writer has not anticipated.
• Provide clear and explicit instructions. This applies both to written and oral instructions. If it is possible for candidates to misinterpret what they are asked to do, then on some occasions some of them certainly will. Test writers should not rely on the students' powers of telepathy to elicit the desired behaviour.
• Ensure that tests are well laid out and perfectly legible. Otherwise students are faced with additional tasks which are not the ones meant to measure their language ability. Their variable performance on the unwanted tasks will lower the reliability of the test.
• Candidates should be familiar with the format and testing techniques. If any aspect of a test is unfamiliar to candidates, they are likely to perform less well than they would do otherwise. For this reason, every effort must be made to ensure that all candidates have the opportunity to learn just what will be required of them.
• Use items that permit scoring which is as objective as possible. This may appear to be a recommendation to use multiple-choice items, which permit completely objective scoring. An alternative is an item which has a unique, possibly one-word, correct response which the candidates produce themselves.
• Provide a detailed scoring key. This should specify acceptable answers and assign points for partially correct responses. For high scorer reliability the key should be as detailed as possible in its assignment of points (rubrics).

3. Administration (Practicality)

A test must be practicable; in other words, it must be fairly straightforward to

administer. Classroom teachers are well familiar with practical issues, but they need to

think of how practical matters relate to testing. A good classroom test should be "teacher-

friendly". A teacher should be able to develop, administer and mark it within the available

time and with available resources.

4. Washback

Washback refers to the effect of testing on teaching and learning. Unfortunately, students

and teachers tend to think of the negative effects of testing such as "test-driven" curricula

and only studying and learning "what they need to know for the test". Positive washback,

or what we prefer to call "guided washback" can benefit teachers, students and

administrators. Positive washback assumes that testing and curriculum design are both

based on clear course outcomes which are known to both students and teachers/testers.

5. Impact

Closely related to washback, the term impact "refers to any of the effects that a test may
have on individuals, policies or practices, within the classroom, the school, the educational
system or society as a whole".

6. Authenticity

Language learners are motivated to perform when they are faced with tasks that reflect real
world situations and contexts. Good testing or assessment strives to use formats and tasks
that mirror the types of situations in which students would authentically use the target
language. (What would you say in the following situations?)

7. Transparency

Transparency refers to the availability of clear, accurate information to students about

testing. Such information should include outcomes to be evaluated, formats used,

weighting of items and sections, time allowed to complete the test, and grading criteria.

Tips for Writing Good Test Items

1. Vocabulary:

a. Multiple Choice and Gap filling:

The stem (statement) should not be a definition of the word.

The stem should be meaningful in itself.

Avoid lengthy stems and very short ones as well.

The stem should not be ambiguous; it should not have difficult words that may

distract the students.

The stem (and the alternatives) should be stated clearly and concisely.

Alternatives should be homogeneous in content (part of speech, length, formatting,
and language choice).

The stem should not hold negative ideas or negative connotations/denotations.

*We won't go to that restaurant again. The food and

the service were .................

a- invisible b- dreadful c- unreliable d- charitable

Avoid complex multiple-choice items.

In the model answer, specify which unit/lesson the vocabulary items are taken from.

2. Grammar:

Make sure all alternatives are grammatically parallel.

There is a difference between recognition knowledge of a word and recall

knowledge of a word. Recognition knowledge means you understand the word

when you hear it or read it. Recall knowledge means you can and will use the

word in your own speech or writing.


All alternatives should be grammatically correct. Any incorrect alternative should

not be included in the test.

Avoid items with more than one possible answer.

In the model answer, specify which unit/lesson the grammatical items are taken

from.

3. Language Functions:

The situations should be reasonable and non-threatening.

Reinforce the words they have learned during the course as much as possible by

using the words in meaningful situations.

The situation should not be misleading or confusing.

Avoid political, religious and media-related topics.

In the model answer, specify which unit/lesson the language function is taken from.

4. Set-Book Questions:

It should be indicated in the head of the question that students are to provide full

answers.

All questions should be of general nature and related to the theme of the unit.

Questions should tackle higher-level thinking skills.

Literature time questions should deal with morals and be educational in nature.

5. Writing:

The topics should be semi-related to the textbook material.

The number of words (lines) should be indicated clearly.

The helping ideas provided should be "helpful", not confusing.

6. Reading Comprehension:

The reading passage should have a relatively acceptable level of difficulty.

The passage should not have complicated or ambiguous sentence formation.

The questions should include literal comprehension questions. Literal comprehension

refers to an understanding of the straightforward meaning of the text, such as

facts, vocabulary, dates, times, and locations. Questions of literal comprehension

can be answered directly and explicitly from the text.


There should be reorganization questions. Reorganization is based on a literal

understanding of the text; students must use information from various parts of the

text and combine them for additional understanding.

There should be inference questions included. Students may initially have a difficult

time answering inference questions because the answers are based on material that

is in the text but not explicitly stated. An inference involves students combining

their literal understanding of the text with their own knowledge and intuitions.

The summary question should be clearly worded.

7. Translation:

Make sure the English paragraph is taken from the reading passage with no difficult

words or complex sentences.

The Arabic translation should be taken from the textbook. Specify the page number

and the lesson.

The Table of Specifications

The table of specifications (the test blueprint or test specifications) identifies the objectives
and skills which are to be tested and the relative weight on the test given to each. This
statement necessarily precedes any development of the test. These specifications provide a
"blueprint" for test construction. In the absence of such a blueprint, test development can

potentially proceed with little clear direction. The development of such a set of

specifications is the crucial first step in the test development process. The table of

specifications can help the teacher write a test that has content validity. Teachers are

required to refer to the table of specifications designed for all grades by the General

Supervision.

It ensures there is a match between what was taught and what is tested. The table further helps
ensure that the teacher:

* emphasizes the same content emphasized in day-to-day instruction.

* aligns test items with learning objectives, e.g., important topics might include items that
test interpretation, application, and prediction, while less important topics might be tested only
with simpler recognition items

* does not overlook or underemphasize an area of content

As a teacher, you can create a table of specifications as you teach a class by inserting the

topics/concepts covered each day, and to the extent possible writing 1-2 test items while
the class period is still fresh in your mind. If you can create the plan as you go, then it is
going to be very useful as a guide when you sit down to write the test.
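As a simple illustration of how such a blueprint can be kept and checked, here is a minimal sketch in Python. The objectives, thinking levels, item counts and marks below are invented for the example; in practice teachers should use the table of specifications issued by the ELT General Supervision.

# Minimal sketch of a table of specifications kept as plain data.
# Every entry and number below is illustrative only.
blueprint = [
    # (objective / skill, thinking level targeted, number of items, marks)
    ("Vocabulary: choose the correct word",   "Remembering",   4, 4),
    ("Grammar: past tense forms",             "Applying",      4, 4),
    ("Reading: literal comprehension",        "Understanding", 3, 3),
    ("Reading: inference",                    "Analyzing",     2, 4),
    ("Writing: guided paragraph",             "Creating",      1, 10),
]

total_items = sum(items for _, _, items, _ in blueprint)
total_marks = sum(marks for _, _, _, marks in blueprint)
print(f"Total items: {total_items}, total marks: {total_marks}")

for skill, level, items, marks in blueprint:
    share = 100 * marks / total_marks
    print(f"{skill:<40} {level:<13} {items} item(s), {marks} mark(s), {share:.0f}% of the test")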

General guidelines for item writing for all test types

• Select the type of test item that measures the intended learning outcome.
• Write the test item so that the task is clear and definite.
• Write the test item so that the difficulty level matches the intent of the learning outcome, the age group to be tested, and the use to be made of the results.
• Write the test item so that there is no disagreement concerning the answer.
• Write the test items far enough in advance that they can later be reviewed and modified as needed.
• Write more test items than called for by the test plan.

Because the Multiple Choice Test is the most common and highly regarded of all the test

types, the various rules on how the items are to be written are listed below.

1.1 Terminology Regarding Multiple-Choice Test Questions

Multiple-Choice Item: This is the most common objective-type item. The multiple-choice item is a test question

which has a number of alternative choices from which the examinee is to select the correct

answer. It is generally recommended that one use 4 or 5 choices per question, whenever

possible. Using fewer alternatives often results in items with inferior characteristics. The

item choices are typically identified on the test copy by the letters a through d.

Stem: This is the part of the item in which the problem is stated for the examinee. It can

be a question, a set of directions or a statement with an embedded blank.

Options/Alternatives: These are the choices given for the item.

Key: This is the correct choice for the item.

Distractors: These are the incorrect choices for the item.

2.1 Writing Multiple-Choice Test Items

The general rules used for writing multiple-choice items are described below.

• The stem should contain the problem and any qualifications. The entire stem must always precede the alternatives.
• Each item should be as short and verbally uncomplicated as possible. Give as much context as is necessary to answer the question, but do not include superfluous information. Be careful not to make understanding the purpose of the item a test of reading ability.
• If one or more alternatives are partially correct, ask for the "best" answer.
• Try to test a different point in each question. If creating item clones (i.e., items designed to measure the exact same aspect of the objective), be certain to sufficiently change the context, vocabulary, and order of alternatives, so that students cannot recognize the two items as clones.
• If an omission occurs in the stem, it should appear near the end of the stem and not at the beginning.

  *Caring for poor people is a basic aspect of ……………………….
  a. boom    b. adoption    c. compassion    d. litigation

• Use a logical sequence for alternatives (e.g., temporal sequence, length of the choice). If two alternatives are very similar (cognitively or visually), they should be placed next to one another to allow students to compare them more easily.
• Make all incorrect alternatives (i.e., distractors) plausible and attractive. It is often useful to use popular misconceptions and frequent mistakes as distractors. In the foreign languages, item distractors should include only correct forms and vocabulary that actually exists in the language.
• All alternatives should be homogeneous in content, form and grammatical structure (part of speech).
• Use only correct grammar in the stem and alternatives.
• Make all alternatives grammatically consistent with the stem.
• The length, explicitness and technical information in each alternative should be parallel so as not to give away the correct answer.
• Use 4 alternatives in each item (a, b, c, d).
• Avoid repeating words between the stem and key. It can be done, however, to make distractors more attractive.
• Avoid wording directly from a reading passage or use of stereotyped phrasing in the key.
• Alternatives should not overlap in meaning or be synonymous with one another.
• Avoid terms such as "always" or "never," as they generally signal incorrect choices.
• To test understanding of a term or concept, present the term in the stem followed by definitions or descriptions in the alternatives.
• Avoid items based on personal opinions unless the opinion is qualified by evidence or a reference to the source of the opinion (e.g., According to the author of this passage, . . .).
• Do not use "none of the above" as a last option when the correct answer is simply the best answer among the choices offered.
• Try to avoid "all of the above" as a last option. If an examinee can eliminate any of the other choices, this choice can be automatically eliminated as well.

Guidelines for Reviewing Test Items

The following guidelines are recommended for reviewing individual test items. When you

review an item, write your comments on a copy of the item indicating your suggested

changes. If you believe an item is not worth retaining, suggest it be deleted.

1. Consider the item as a whole and whether


a. it measures knowledge or a skill component which is worthwhile and appropriate for the

examinees who will be tested;

b. there is a markedly better way to test what this item tests;

c. it is of the appropriate level of difficulty for the examinees who will be tested.

2. Consider the stem and whether it

a. presents a clearly defined problem or task to the examinee;

b. contains unnecessary information;

c. could be worded more simply, clearly or concisely.

3. Consider the alternatives and whether

a. they are parallel in structure;

b. they fit logically and grammatically with the stem;

c. they could be worded more simply, clearly or concisely;

d. any are so inclusive that they logically eliminate another more restricted option from

being a possible answer.

4. Consider the key and whether it

a. is the best answer among the set of options for the item;

b. actually answers the question posed in the stem;

c. is too obvious relative to the other alternatives (i.e., should be shortened, lengthened,

given greater numbers of details, made less concrete).

5. Consider the distractors and whether

a. there is any way you could justify one or more as an acceptable correct answer;

b. they are plausible enough to be attractive to examinees who are misinformed or ill-

prepared;

c. any one calls attention to the key (e.g., no distractor should merely state the reverse of

the key or resemble the key very closely unless another pair of choices is similarly parallel

or involves opposites)

Checklist for Evaluating the Test Plan

• Is the purpose of the test clear?
• Have the intended learning outcomes been identified and defined?
• Are the intended learning outcomes stated in performance measurable terms?
• Have test specifications been prepared that indicate the nature and distribution of items to be included in the test?
• Are the types of items appropriate for the learning outcomes to be measured?
• Is the difficulty of the items appropriate for the students to be tested?
• Is the number of items appropriate for the students to be tested, the time available for testing, and the interpretations to be made?
• Does the test plan include built-in features that contribute to valid, reliable scores?
• Have plans been made for arranging the items in the test, writing directions, scoring, and using the results?

Conclusion

Any language test which has claims to being professionally constructed will be able to

provide details of exactly how these principles of practicality, reliability, validity and

backwash are met by their test. Furthermore, a test should be constructed with the goal of

having students learn from their weaknesses. It will locate the exact areas of difficulties

experienced by the class or the individual student so that assistance in the form of

additional practice and corrective exercises can be given. Besides, teachers can evaluate

the effectiveness of the syllabus as well as the methods and materials they are using.

References:

Popham, W. James (2002). Assessment: What Teachers Need to Know.
Gronlund, N. (2003). Assessment of Student Achievement, 7th ed. Pearson Education, Inc., Boston.
Dewidar (2003). Table of specifications.
Sebastian Kluitmann, April 2008.
CATL Workshop on Writing Better Objective Tests, Cerbin, 2009.
Posted 2011 by Bejo Sutrisno.
http://fcit.usf.edu/assessment/basic/basicc.html
Cohen, Allan S. and James A. Wollack. Handbook on Test Development: Helpful Tips for Creating Reliable and Valid Classroom Tests. Testing & Evaluation Services, University of Wisconsin-Madison.

Compiled by El Habib REZZOUK
MUBARAK ALAKABEER Educational Area

Topic Four

Interpreting Test Scores & Benefiting from Results

Interpreting Test Scores & Benefiting from Results

Why test analysis / Benefits of evaluation:
• To make an educational decision
• To measure students' language proficiency
• To determine students' strengths & weaknesses
• To monitor and follow the progress of students or groups
• To see the effectiveness of the teaching methods used

Improving teaching using test analysis:

• Conduct analysis to identify problem areas.
• Review classroom assessment and test results with teachers regularly. The sooner problems are identified, the sooner remedial action can be taken.
• Record improvement and success to enhance morale and motivation for further improvement.
• See how well each student did on each standard to identify points of weakness and strength, and hence develop plans to address them (training teachers to use different strategies, adding material resources/remedial exercises, etc.).
• See how well each class did to decide if change is needed (smaller groups or different teaching methods/plans).
• See how well each grade did, so the head of department can decide on the strategies needed (modification in curriculum, revised schedules, increased supervision & specific staff development).

Before we proceed to analyze test scores, know these terms:

• Central Tendency: the way in which quantitative data tend to cluster around some central value, usually the Mean.
• Mean: the average score.
• Standard Deviation: the range above and below the average score.
• Dispersion: the amount of spread among the scores; it varies from narrow to large dispersion.
• Difficulty Coefficient: the overall percentage of the students in the analysis groups (low & high) who answered correctly.
• Discrimination Coefficient: how well a test item discriminates between the top students and the bottom students in the analysis groups.

To examine raw data, two ways are to be considered:


1- Measurements of the Central Tendency:

• The central tendency refers to the "central value", and is measured using the Mean,
Median, or Mode.

2- Measurements of Spread of Data (variability):

• It is also called Measurement of Dispersion. It is the distribution and the spread of

the data around this central tendency (i.e., the Mean), and it is measured using the

Range and the Standard Deviation.

1- Measurements of the Central Tendency:

Arithmetic Mean:

• The mean is the most efficient measure of central tendency. When we talk about an

"average", we usually are referring to the mean.

• The mean is simply the sum of the separate scores divided by the total number of

testees.

Example: Find the mean for the following set of scores:

13, 18, 13, 14, 13, 17, 14, 20, 13

(13 + 18 + 13 + 14 + 13 + 17 + 14 + 20 + 13) ÷ 9 = 135 ÷ 9 = 15

The Mean by itself enables us to describe an individual student’s score by comparing it

with the average set of scores obtained by a group. It tells us nothing about the spread of

marks.

A student (X) got 17 out of 20 and the Mean is 15. What does that mean to you?

Find the mean for the following sets of scores:

13, 18, 13, 14, 13, 17                    M = ……. (Answer: 88 ÷ 6 ≈ 14.7)

13, 18, 13, 14, 13, 17, 14, 20, 13        M = ……. (Answer: 135 ÷ 9 = 15)
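A quick way to check such calculations is a short script. The sketch below (plain Python, nothing course-specific) computes the mean of the two score sets in the exercise above.

# Minimal sketch: the mean is the sum of the separate scores divided by the number of testees.
def mean(scores):
    return sum(scores) / len(scores)

set_a = [13, 18, 13, 14, 13, 17]
set_b = [13, 18, 13, 14, 13, 17, 14, 20, 13]

print(round(mean(set_a), 2))  # 14.67
print(round(mean(set_b), 2))  # 15.0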

2- Measurements of Dispersion: (the spread of marks)

Standard Deviation:

It is the commonest measure of the variability, or dispersion, of a distribution of scores,

that is, of the degree to which scores vary or deviate from the Mean; in other words, it

shows how ALL scores are spread out and thus gives a fuller description of test scores.


Formula: s.d. = √(∑d² / N)

where ∑ means "the sum of", N is the number of scores, d is the deviation of each score
from the Mean, and d² is the square of each deviation.

Method of Calculating Standard Deviation (s.d.):

1- Calculate the Mean first.
2- Find out the amount by which each score deviates from the Mean (d).
3- Square each result (d²).
4- Total the squared results (∑d²).
5- Divide the total by the number of testees (∑d²/N).
6- Find the square root of the result in step 5 (√(∑d²/N)).

Example:

- Calculate the Mean and the Standard Deviation for the following scores: 2, 4, 8, 6, 7, 9

Answer:

No. of candidates = 6
Mean = (2 + 4 + 8 + 6 + 7 + 9) ÷ 6 = 36 ÷ 6 = 6

Score (X)    Deviation d = X – 6 (Mean)    Squared (d²)
2            -4                            16
4            -2                            4
8            2                             4
6            0                             0
7            1                             1
9            3                             9
Total: 36                                  ∑d² = 34

s.d. = √(34 ÷ 6) = √5.67 ≈ 2.38
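As an illustration only, the six-step procedure above can be sketched in Python as follows; this assumes the population formula used in the worked example (dividing by N rather than N - 1):

```python
from math import sqrt

def standard_deviation(scores):
    # s.d. = square root of (sum of squared deviations from the Mean / number of testees)
    n = len(scores)
    mean = sum(scores) / n                    # step 1: the Mean
    deviations = [x - mean for x in scores]   # step 2: deviation d of each score
    squared = [d ** 2 for d in deviations]    # step 3: square each deviation (d squared)
    return sqrt(sum(squared) / n)             # steps 4-6: total, divide by N, square root

print(round(standard_deviation([2, 4, 8, 6, 7, 9]), 2))  # 2.38, as in the worked example
```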

Benefits of score analysis using standard deviation:

• The smaller the SD, the closer the scores cluster around the Mean score (homogeneous scores).
• The greater the SD, the greater the differences between the scores and the Mean.
• A large spread indicates that there are probably large differences between individual scores.
• A standard deviation of 4.08, for example, shows a smaller spread of scores than, say, a standard deviation of 8.96.
• Standard deviation is also useful for providing information about the characteristics of different groups. If, for example, the standard deviation on a certain test is 4.08 for one class but 8.96 on the same test for another class, it can be inferred that the latter class (8.96) is more heterogeneous than the former (4.08).

ITEM ANALYSIS

- When analyzing the test items, we ask several questions about the performance of each item. These include:

Are the items congruent with the test objectives?

Are the items valid? Do they measure what they're supposed to measure?

Are the items reliable? Do they measure consistently?

How long does it take an examinee to complete each item?

What items are most difficult to answer correctly?

What items are easy?

Are there any poorly performing items that need to be discarded?

Types of Item Analysis:

There are two major ways to analyze items:

1. Assess the difficulty of the items (Difficulty Coefficient).
2. Assess how well an item differentiates between high and low performers (Discrimination Coefficient).

1- The Difficulty Coefficient / Facility Value:

- What overall percentage of the students in the analysis groups answered correctly?

- It varies from 0.0 to +1.0 with numbers approaching +1.0 indicating more students

answering correctly.

Formula: FV = R ÷ N

R = the number of correct answers
N = the number of students (Ss) taking the test

Equivalently, p = number of students correctly answering the item ÷ number taking the test.

P is the percentage of test takers who respond correctly, usually expressed as a proportion (p), e.g., 68% is .68.

- An item with a p value of .0 or 1.0 does not contribute to measuring individual differences and is therefore almost certain to be useless. The closer p is to zero, the more difficult the question (the more wrong answers there are); the closer it is to 1.0, the easier the question.

- When comparing test scores, we are interested in who had the higher score or the

differences in scores.

- A p value of .5 yields the most variation, so seek items in this range.

- General Rules of Item Difficulty:

p low (<= .30)                 Too difficult item
p moderate (> .30 and < .80)   Moderately difficult item
p high (>= .80)                Easy item

The accepted (GOOD) difficulty coefficient varies according to the question type:

Question Type    Accepted FV
True/False       0.75
MCQs             0.63
Productive       0.50

What is the best p-value?

- The optimal p-value is .50.
- It allows maximum discrimination between good and poor students.

Item Difficulty Level Example (No. of Ss = 50):

Item No.   No. of Correct Answers   Correct %   Proportion   Difficulty Level
1          15                       30          .30          High
2          25                       50          .50          Medium
3          35                       70          .70          Medium
4          45                       90          .90          Low
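For illustration, a minimal Python sketch of the Facility Value calculation and the difficulty bands above, applied to the four items in the example table; the function names are illustrative, not part of the course material:

```python
def facility_value(correct, n_students):
    # FV (p) = number of students answering the item correctly / number taking the test
    return correct / n_students

def difficulty_level(p):
    # Bands taken from the "General Rules of Item Difficulty" above.
    if p <= 0.30:
        return "High (too difficult)"
    if p >= 0.80:
        return "Low (easy)"
    return "Medium (moderately difficult)"

# The four items from the example table; 50 students took the test.
for item, correct in enumerate([15, 25, 35, 45], start=1):
    p = facility_value(correct, 50)
    print(item, p, difficulty_level(p))
```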

2- The Discrimination Coefficient / Index:


- The Discrimination Index for each item distinguishes between the performance of

students who did well on the exam and students who did poorly.

- It varies from -1.0 (Negative Discrimination Index) to +1.0 (High Discrimination Index).

- The higher the value of D, the more adequately the item discriminates between those

who know the material and those who don't. (The highest value is 1.0)

- When the value is positive, the item is answered correctly by the good students and incorrectly by the poor students; when it is negative, the reverse is true.

The formula: D.I. = (H – L) ÷ N

H = the number of students in the top group who answered correctly
L = the number of students in the bottom group who answered correctly
N = the number of students in either the top or the bottom group (the two groups should be equal in size)

OR: D = (Correct U – Correct L) ÷ N

The Discrimination Index Example (No. of Ss per group = 20):

D = (Correct U – Correct L) ÷ N

• Upper group: 10 correct; Lower group: 2 correct
  D.I. = (10 – 2) ÷ 20 = 8 ÷ 20 = 0.4

• Upper group: 2 correct; Lower group: 8 correct
  D.I. = (2 – 8) ÷ 20 = –6 ÷ 20 = –0.3

- Negative numbers (lower scorers on test more likely to get item correct) and low positive

numbers (about the same proportion of low and high scorers get the item correct) don’t

discriminate well and should be discarded. So, seek items with high positive numbers

(those who do well on the test tend to get the item correct).

Procedure to calculate the discrimination index:

• Rank the students according to their total scores (in descending order).
• Divide them into three equal groups: high, middle and low (thirds).
• Count how many students answered the item correctly in the high group and in the low group (leave the middle group aside).
• Find the difference between the numbers of correct answers in the two groups (high minus low).
• Divide the result by the total number of students in EITHER group (high or low), as in the sketch below.
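A minimal Python sketch of this procedure, assuming each student record is a (total score, answered-the-item-correctly) pair; the data below are invented purely for illustration:

```python
def discrimination_index(students):
    # students: list of (total_score, answered_item_correctly) tuples.
    ranked = sorted(students, key=lambda s: s[0], reverse=True)  # rank by total score, descending
    group_size = len(ranked) // 3                                # three equal groups
    high, low = ranked[:group_size], ranked[-group_size:]        # middle third is left aside
    h = sum(1 for _, correct in high if correct)                 # correct answers in the high group
    l = sum(1 for _, correct in low if correct)                  # correct answers in the low group
    return (h - l) / group_size                                  # D.I. = (H - L) / N

# Nine hypothetical students, so each group holds three.
students = [(19, True), (18, True), (16, True),
            (15, True), (14, False), (12, False),
            (10, False), (9, True), (7, False)]
print(round(discrimination_index(students), 2))  # 0.67
```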

Item Discrimination Example (No. of Ss per group = 100):

Item No.   Correct in Upper 1/3   Correct in Lower 1/3   Item Discrimination Index
1          90                     20                     0.7
2          80                     70                     0.1
3          100                    0                      1.0
4          100                    100                    0.0
5          50                     50                     0.0
6          20                     60                     -0.4

Ebel's (1972) classification of items according to their D.I. value:

- All questions with a negative discrimination value should be discarded.
- D.I. < 0.20: a weak item that should be discarded.
- D.I. from 0.20 to 0.39: a poor item that should be improved or amended.
- D.I. > 0.39: a good discriminating question.

Use the following table as a guideline to determine whether an item should be considered for revision:

                          Item Difficulty (FV)
Item Discrimination (D)   High      Medium    Low
D <= 0%                   review    review    review
0% < D < 30%              ok        review    ok
D >= 30%                  ok        ok        ok
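As a final illustration, a short Python sketch that applies the Ebel-style cut-offs listed above to a set of discrimination indices; the wording of the returned labels is an assumption made for readability:

```python
def classify_item(d):
    # Cut-offs taken from the Ebel (1972) classification above.
    if d < 0:
        return "discard (negative discrimination)"
    if d < 0.20:
        return "weak item: discard"
    if d <= 0.39:
        return "poor item: improve or amend"
    return "good discriminating item"

# D.I. values from the item discrimination example above.
for d in (0.7, 0.1, 1.0, 0.0, 0.0, -0.4):
    print(d, classify_item(d))
```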


Benefiting from Results

Traditionally, testing in schools is usually thought to serve only the purpose of evaluating

students and assigning them grades. Yet tests can serve other purposes in educational

settings that greatly improve performance. Other benefits of testing are:

• Benefit 1: Retrieval practice occurring during tests can greatly aid later retention.

• Benefit 2: Testing identifies gaps in knowledge.

• Benefit 3: Testing causes students to learn more from the next learning episode.

• Benefit 4: Testing produces better organization of knowledge.

• Benefit 5: Testing improves transfer of knowledge to new contexts.

• Benefit 6: Testing can facilitate retrieval of information that was not tested.

• Benefit 7: Testing improves metacognitive monitoring.

• Benefit 8: Testing prevents interference from prior material when learning new

material.

• Benefit 9: Testing provides feedback to instructors.

• Benefit 10: Frequent testing encourages students to study.

How to uplift low achievers' level:

• Divide the students into level groups.

• Give different groups different tasks and different homework exercises.

• Provide written instructions & proceed step by step.

• Give new information in small portions.

• Get feedback in the classroom & let everybody hear it.

• Reward & raise self-esteem. Motivation is a very important element.

• Call parents to tell them about positive developments.

• Practise various instruction techniques.

• Consider proper selection and gradation of material.

• Encourage participation in group interaction.

• Teach students how to study & how to evaluate themselves.

Purpose of the remediation plan:

• To help teachers think about and organize remediation plans.

• To allow teachers to try to identify specific problem areas and link them to steps

that can produce attainable results.

• To provide teachers with a template for easily recording remediation plans and using them to communicate with students and/or parents, the HOD, or the school principal.

Writing a remediation plan:


• The Remedial Plan should be specific and concrete, not abstract, and based on what low achievers can actually do, not on what we think they can do. It should be graded, systematically applied, and aimed at final results.

• You can use the remediation plan template* (see figure 1) to lay out a plan for

students who are in need of intervention/remediation.

Suggested Uses:

Involve students in their remediation plans

• Hold a teacher-student conference to go over the details of the remediation plan, and make certain the students understand what they are to do.

Involve parents as much as possible

• You may also involve parents in the remediation plan, if the situation is appropriate. As with your students, make sure the parents understand the steps their children should take to improve their performance in your class.

Identify common steps and resources that can be used for different levels of remedial

study.

• Try to identify several sets of steps and resources for at least two different levels of student need. For example, you might identify a course of action for students who need a small amount of extra work, and another for those who need a great deal of extra study in the identified academic area.

• Then, as you identify students in need of intervention, you can choose their level

and the appropriate remediation plan. While you will probably want to customize

the plan per student, you will at least have a defined set of steps with which to

begin. After the semester ends, you can then evaluate each plan's success rate and

determine what can be revised to improve each set of actions or resources.

Figure 1: The Remediation Plan Template

Student Remediation Plan

Student ___________________________________Teacher_______________________________

Course ___________________________________ Date__________________________________

From_____________________________________ To____________________________________

Problem Areas Solutions/Steps To Be Taken Resources Needed


Supervised by: Aysha Alawadi

[Form (Ministry of Education, ELT General Supervision): a statement of the numbers of students in each score band of the English language mark and the pass percentage for each intermediate class, for a given school and school year. The form lists score bands for the period coursework (out of 10), the period test (out of 30) and the period total (out of 40), with columns for the number of candidates, the number of passers and the pass percentage; it is signed by the head of department and approved by the school principal.]


Step 2: Making comparisons between the different periods to identify the progress that students have made.

Step 3: Making comparisons between the test results of the current scholastic year and those of the previous two years to identify the progress that students have made.

Step 4: Identifying the gap between the ongoing assessment marks, the test marks and the final marks.

[Forms: a table comparing the first, second and third periods for each class with an overall total; a table comparing the school years 2011-2012, 2012-2013 and 2013-2014; and a table recording, for each class, the daily work, the test and the final grade.]

Step 5: Using the following forms to analyze the result of each item.

Step 6: Identifying the points of weakness and designing the remedial plans accordingly, using the following form.

Scholastic period: .......................      School year: .......................
Class: ....................................      School: ............................

Basic/Sub-skills    Points of Weakness    Causes of Weakness    Suggested Remedial Procedures


References:

- Alderson, J.C., Clapham, C. and Wall, D., 1995. Language Test Construction and Evaluation. Cambridge: Cambridge University Press.
- Heaton, J.B., 1990. Writing English Language Tests. New York: Longman.
- Roediger, H.L. et al., 2011. Psychology of Learning and Motivation, Volume 55.
- http://www.glencoe.com/sec/teachingtoday/downloads/pdf/remediation_plan.pdf

Compiled by:

ELT Supervisor Redha Sheeha, Hawalli Educational Area
ELT Supervisor Ashraf Khalid, Al-Jahra Educational Area
ELT Supervisor Raafat Ismail, Al-Farwaniya Educational Area


Topic Five

Constructing a good test according to question types designed by the ELT General Supervision
