Comprehensive Material for Measurement and Evaluation (Transcript)
7/30/2019
TCP.TIP_Rungduin
Technological Institute of the Philippines
College of Education
Center for Teaching Excellence
Teaching Certificate Program
Introduction to Measurement and Evaluation
Discussion Point 1:
Introduction to Measurement and Evaluation
The Necessity of Evaluation in Teaching
To teach without evaluation is a contradiction in terms.
By its very nature, teaching requires innumerable judgments to be made by the
teacher, the school administrators, parents and the pupils themselves.
Teachers are obligated to assemble, analyze, and utilize whatever evidence can be
brought forward to make the most effective decisions (evaluations) for the benefit of
the students in their classes. Among these decisions are the following:
1. The nature of the subject matter that should be taught at each grade level;
2. Which aspects of the curriculum need to be eliminated, modified or included
as a function of the current level of student knowledge and attitudes;
3. How instruction can be improved to ensure that students learn;
4. How pupils should be organized within the classroom to maximize learning;
5. How teachers can tell if students are able to retain knowledge;
6. Which students are in need of remedial or advanced work;
7. Which students will benefit from placement in special programs for the
mentally retarded, emotionally disturbed, or physically handicapped;
8. Which children should be referred to the school counselor, psychologist,
speech therapist, nurse or social worker; and
9. How each pupil's progress can be explained most clearly and effectively.
The Relationship between Teaching and Evaluation
The purpose of teaching is to improve the knowledge, behaviors, and attitudes of
students.
Teachers want students to increase the amount of knowledge they possess and to
decrease the amount of forgetting.
Teaching consists of at least four interrelated elements (Glaser and DeCecco,
1968):
1. Developing instructional objectives
Teachers need to know what they are attempting to accomplish and
cannot leave such matters to chance. Students improve when they make
progress toward clearly defined objectives.
Clearly defined instructional objectives serve at least two roles:
a. They help the teacher recognize student improvement by clarifying
what it is the teacher wants to accomplish; and
b. They imply the way in which the goals will be evaluated.
2. Evaluating the student's entering behavior
Individual differences (academic achievement, sexual preference, social
class, notes from previous teachers, former school or former location,
physical characteristics, knowledge of an older brother or sister, and family
background)
Teaching methods are effective only if they are considered in relationship
to the background of the student.
3. Selecting an instructional strategy
If student background is important in selecting an instructional strategy,
the teacher will have to become familiar with the procedures used to
measure and evaluate those backgrounds.
4. Providing for an evaluation of the student's performance
Performance assessment may suggest that a program is ineffective
because the objectives are unrealistic or because the entering behavior
was not considered adequately.
Evaluation can determine whether instructional objectives have been
met; it provides evidence that students have the necessary entering
behavior, and it helps to evaluate the adequacy of an instructional
strategy.
Test: an instrument or systematic procedure for measuring a sample of behavior.
(Answers the question "How well does the individual perform, either in comparison with
others or in comparison with a domain of performance tasks?")
Measurement: the process of obtaining a numerical description of the degree to which an
individual possesses a particular characteristic. (Answers the question "How much?")
Evaluation: the systematic process of collecting, analyzing and interpreting information to
determine the extent to which pupils are achieving instructional objectives. (Answers the
question "How good?")
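The three definitions above can be kept apart with a small numerical illustration. The sketch below is not from the source; the five-item answer key and the 80% mastery cutoff are invented for the example. The test is the instrument, measurement is the scoring, and evaluation is the value judgment applied to the score:

```python
# Illustrative sketch (hypothetical answer key and cutoff, not from the source).
answer_key = ["B", "D", "A", "C", "B"]   # a 5-item test: the instrument

def measure(responses, key):
    """Measurement: obtain a numerical description (the raw score)."""
    return sum(r == k for r, k in zip(responses, key))

def evaluate(score, total, mastery_cutoff=0.8):
    """Evaluation: a value judgment ('How good?') against a criterion."""
    return "mastery" if score / total >= mastery_cutoff else "non-mastery"

responses = ["B", "D", "A", "C", "A"]    # one pupil's answers
score = measure(responses, answer_key)
print(score, evaluate(score, len(answer_key)))  # prints: 4 mastery
```

The same score could instead be judged against the class distribution, which is the norm-referenced interpretation discussed later in this material.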
Measurement
Measurement involves the assigning of numbers to attributes or characteristics of
persons, objects, or events according to explicit formulations or rules.
Educational measurement requires the quantification of attributes according to
specified rules.
Characteristics of Scales of Measurement (ordered from least to most complex)

Nominal
  Definition: scale involving the classification of objects, persons, or events
    into discrete categories
  Uses and examples: plate numbers, Social Security numbers, names of people,
    places, objects, numbers to identify athletes
  Limitations: cannot specify quantitative differences among categories

Ordinal
  Definition: scale involving ranking of objects, persons, traits or abilities
    without regard to equality of differences
  Uses and examples: letter grades (ratings from excellent to failing), military
    ranks, order of finishing a test
  Limitations: restricted to specifying relative differences without regard to
    absolute amount of difference

Interval
  Definition: scale having equal differences between successive categories
  Uses and examples: temperature, grades, scores
  Limitations: ratios are meaningless; the zero point is arbitrarily defined

Ratio
  Definition: scale having an absolute zero and equal intervals
  Uses and examples: distance, weight, time required to learn a skill or subject
  Limitations: none, except that few educational variables have ratio
    characteristics
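The limitation noted for interval scales, that ratios are meaningless when the zero point is arbitrary, can be checked with a little arithmetic. The sketch below is illustrative only; the temperatures and weights are invented for the example:

```python
# Why ratios fail on an interval scale: a "twice as much" claim does not
# survive a change of units when the zero point is arbitrary.

def c_to_f(c):
    """Celsius to Fahrenheit: an interval-scale unit change (shifts the zero)."""
    return c * 9 / 5 + 32

# 20 degrees C looks like "twice" 10 degrees C, but the same temperatures in
# Fahrenheit are 68 and 50, which do not stand in a 2:1 ratio.
print(c_to_f(20) / c_to_f(10))      # prints: 1.36

def kg_to_lb(kg):
    """Kilograms to pounds: a ratio-scale unit change (zero stays zero)."""
    return kg * 2.20462

# On a ratio scale, the ratio is preserved under any change of units.
print(kg_to_lb(20) / kg_to_lb(10))  # prints: 2.0
```

This is why statements like "20 kg is twice 10 kg" are meaningful while "20 degrees is twice as hot as 10 degrees" is not.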
Testing
A test may be defined as a task or series of tasks used to obtain systematic
observations presumed to be representative of educational or psychological traits
and/or attributes.
Typically, tests require examinees to respond to items or tasks from which the
examiner infers something about the attribute being measured.
Tests and other measurement instruments serve a variety of purposes:
1. Selection. To determine which persons will be admitted to or denied
admittance to an institution or organization.
2. Placement. To help individuals determine which of several programs they will
pursue.
3. Diagnosis and remediation. To help discover the nature of the specific
problems individuals may have.
4. Feedback
5. Motivation and guidance of learning
6. Program and curriculum improvement
7. Theory development
Tests may be classified by how they are administered (individually or in groups), how
they are scored (objectively or subjectively), what sort of response they emphasize
(power or speed), what type of response subjects make (performance or pencil-and-
paper), what they attempt to measure (sample or sign), and the nature of the groups
being compared (teacher-made or standardized).
1. Individual and group tests
Some tests are administered on a one-to-one basis during careful oral
questioning (e.g., individual intelligence tests), whereas others can be
administered to a group of individuals.
2. Objective and subjective tests
An objective test is one on which equally competent scorers will obtain
the same scores (e.g., multiple-choice tests), whereas a subjective test is
one on which the scores are influenced by the opinion or judgment of the
person doing the scoring (e.g., essay tests).
3. Power and speed tests
A speed test measures the number of items that an individual can
complete in a given time, whereas a power test measures the level of
performance under ample time conditions. Power test items usually are
arranged in order of increasing difficulty.
Relationship between power and speed tests
            Power            Speed
Time        Generous         Limited
Difficulty  Relatively hard  Relatively easy
(Tests falling between the two extremes are partially speeded.)
4. Performance and paper-and-pencil tests
Performance tests require examinees to perform a task rather than
answer questions. They are usually administered individually so that the
examiner can count the number of errors committed by the student and
can measure how long each task takes.
Pencil-and-paper tests are almost always given in a group situation in
which students are asked to write their answers on paper.
5. Sample and sign tests
Sample tests measure a sample of a student's total behavior.
Sign tests are administered to distinguish one group of individuals from
another.
6. Teacher-made and standardized tests
Teacher-made tests are constructed by teachers for use within their own
classrooms. Their effectiveness depends on the skill of the teacher and
his or her knowledge of test construction.
Standardized tests are constructed by test specialists working with
curriculum experts and teachers. They are standardized in that they have
been administered and scored under standard and uniform conditions so
that results from different classes and different schools may be compared.
7. Mastery and survey tests
Some achievement tests measure the degree of mastery of a limited set
of specific learning outcomes, whereas others measure pupils' general
level of achievement over a broad range of outcomes.
8. Supply and selection tests
Some tests require examinees to supply the answer (e.g., essay tests),
whereas others require them to select the correct response from the set
of alternatives (e.g., multiple-choice tests).
Evaluation
Evaluation is a process through which a value judgment or decision is made from a
variety of observations and from the background and training of the evaluator.
General Principles of Evaluation
1. Determining and clarifying what is to be evaluated always has priority in the
evaluation process.
2. Evaluation techniques should be selected according to the purpose to be
served.
3. Comprehensive evaluation requires a variety of evaluation techniques.
4. Proper use of evaluation techniques requires an awareness of both their
limitations and their strengths.
5. Evaluation is a means to an end, not an end in itself.

Reasons for Using Tests and Other Measurements
Basis for classification: Nature of Measurement

  Maximum performance
    Function: determines what individuals can do when performing at their best
    Illustrative instruments: aptitude tests, achievement tests

  Typical performance
    Function: determines what individuals will do under natural conditions
    Illustrative instruments: attitude, interest and personality inventories;
    observational techniques; peer appraisal

Basis for classification: Use in Classroom Instruction

  Placement
    Function: determines prerequisite skills, degree of mastery of course
    objectives, and/or best mode of learning
    Illustrative instruments: readiness tests, aptitude tests, pretests on
    course objectives, self-report inventories, observational techniques

  Formative
    Function: determines learning progress, provides feedback to reinforce
    learning, and corrects learning errors
    Illustrative instruments: teacher-made mastery tests, custom-made tests
    from test publishers, observational techniques

  Diagnostic
    Function: determines causes (intellectual, physical, emotional,
    environmental) of persistent learning difficulties
    Illustrative instruments: published diagnostic tests, teacher-made
    diagnostic tests, observational techniques

  Summative
    Function: determines end-of-course achievement for assigning grades or
    certifying mastery of objectives
    Illustrative instruments: teacher-made survey tests, performance rating
    scales, product scales

Basis for classification: Method of Interpreting Results

  Criterion referenced
    Function: describes pupil performance according to a specified domain of
    clearly defined learning tasks (e.g., adds single-digit whole numbers)
    Illustrative instruments: teacher-made mastery tests, custom-made tests
    from test publishers, observational techniques

  Norm referenced
    Function: describes pupil performance according to relative position in
    some known group (e.g., ranks tenth in a classroom group of 30)
    Illustrative instruments: standardized aptitude and achievement tests,
    teacher-made survey tests, interest inventories, adjustment inventories
Motivation and Guidance of Learning
Tests can be used to motivate and guide students to learn, and because pupils study
for the type of examination they expect to take, it is the teacher's responsibility to
construct examinations that measure important course objectives.
Program and Curriculum Improvement: Formative and Summative Evaluations
1. Formative Evaluation. Formative evaluation is used to monitor learning
progress during instruction and provide continuous feedback to both pupil
and teacher concerning learning successes and failures.
2. Summative Evaluation. Summative evaluation typically comes at the end of a
course (or unit) of instruction. It is designed to determine the extent to which
the instructional objectives have been achieved and is used primarily for
assigning course grades or certifying pupil mastery of the intended learning
outcomes.
Norm-Referenced and Criterion-Referenced Measurement
Evaluation procedures can also be classified according to how the results are
interpreted. There are two basic ways of interpreting pupil performance on tests and other
evaluation instruments. One is to describe the performance in terms of the relative position
held in some known group (e.g., typed better than 90 percent of the class members). The
other is to directly describe the specific performance that was demonstrated (e.g., typed 40
words per minute without error). The first type of interpretation is called norm referenced;
the second criterion referenced. Both types of interpretation are useful.
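The two interpretations of the typing example can be sketched numerically. In this illustrative Python snippet (not from the source), the class scores, the 40-item test length, and the pupil's score are all invented for the example:

```python
# One raw score, two interpretations (hypothetical class data).
scores = [22, 25, 28, 30, 31, 33, 35, 36, 38, 40]  # class scores on a 40-item test
pupil_score = 36

# Norm-referenced: describe relative position in a known group.
below = sum(s < pupil_score for s in scores)
percentile = 100 * below / len(scores)
print(f"scored higher than {percentile:.0f}% of the class")

# Criterion-referenced: describe performance against the defined task domain.
percent_correct = 100 * pupil_score / 40
print(f"answered {percent_correct:.0f}% of the domain items correctly")
```

Note that the norm-referenced statement would change if the group changed, while the criterion-referenced statement would not; this is why the former requires a clearly defined group and the latter a clearly defined achievement domain.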
Some Basic Terminologies
1. Norm-referenced test: a test designed to provide a measure of performance that
is interpretable in terms of an individual's relative standing in some known group.
2. Criterion-referenced test: a test designed to provide a measure of performance
that is interpretable in terms of a clearly defined and delimited domain of learning
tasks.
3. Objective-referenced test: a test designed to provide a measure of performance
that is interpretable in terms of a specific instructional objective. (Many objective-
referenced tests are called criterion-referenced tests by their developers.)
Other terms that are less often used have meanings similar to criterion referenced:
content referenced, domain referenced, and universe referenced.
Comparison of Norm-Referenced Tests (NRTs) and Criterion-Referenced Tests (CRTs)
Common Characteristics of NRTs and CRTs
1. Both require specification of the achievement domain to be measured.
2. Both require a relevant and representative sample of test items.
3. Both use the same types of test items.
4. Both use the same rules for item writing (except for item difficulty).
5. Both are judged by the same qualities of goodness (validity and reliability).
6. Both are useful in educational measurement.
Differences Between NRTs and CRTs (but the difference is only a matter of emphasis)
1. NRT typically covers a large domain of learning tasks, with just a few items
measuring each specific task.
CRT typically focuses on a delimited domain of learning tasks, with a relatively large
number of items measuring each specific task.
2. NRT emphasizes discrimination among individuals in terms of relative level of
learning.
CRT emphasizes description of what learning tasks individuals can and cannot
perform.
3. NRT favors items of average difficulty and typically omits easy items.
CRT matches item difficulty to learning tasks, without altering item difficulty or
omitting easy items.
4. NRT is used primarily (but not exclusively) for survey testing.
CRT is used primarily (but not exclusively) for mastery testing.
5. NRT interpretation requires a clearly defined group.
CRT interpretation requires a clearly defined and delimited achievement domain.
Strictly speaking, norm referenced and criterion referenced refer only to the method of
interpreting the results. These types of interpretation are likely to be most meaningful and
useful, however, when tests (and other evaluation instruments) are specifically designed for
the type of interpretation to be made. Thus, we can use the terms criterion referenced and
norm referenced as broad categories for classifying tests and other evaluation techniques.
Tests that are specifically built to maximize each type of interpretation have much in
common, and it is impossible to determine the type of test from examining the
test itself. Rather, it is in the construction and use of the tests that the differences can be
noted. A key feature in constructing norm-referenced tests is the selection of items of
average difficulty and the elimination of items that all pupils are likely to answer correctly.
This procedure provides a wide spread of scores so that discrimination among pupils at
various levels of achievement can be more reliably made. This is useful for decisions based
on relative achievement, such as selection, grouping and relative grading. In contrast, a key
feature in constructing criterion-referenced tests is the selection of items that are directly
relevant to the learning outcomes to be measured, without regard to the items' ability to
discriminate among pupils. If the learning tasks are easy, the test items will be easy, and if
the learning tasks are difficult, the test items will be difficult. Here the main purpose is to
describe the specific knowledge and skills that each pupil can demonstrate, which is useful
for planning both group and individual instruction.
Norm-Referenced Test           Combined Type           Criterion-Referenced Test
(discrimination among pupils)  (dual interpretation)   (description of performance)
Other Descriptive Terms
Some of the other terms that are frequently used in describing tests are presented
here as contrasting test types, but some are simply the ends of a continuum (e.g., speed
versus power tests).
1. Informal Versus Standardized Tests. Informal tests are those constructed by
classroom teachers, whereas those designed by test specialists and administered,
scored, and interpreted under standard conditions are called standardized tests.
2. Individual Versus Group Tests. Some tests are administered on a one-to-one basis
during careful oral questioning (e.g., individual intelligence tests), whereas others can
be administered to a group of individuals.
3. Mastery Versus Survey Tests. Some achievement tests measure the degree of
mastery of a limited set of specific learning outcomes, whereas others measure
pupils' general level of achievement over a broad range of outcomes. Mastery tests are
typically criterion referenced, and survey tests tend to be norm referenced, but some
criterion-referenced interpretations are also possible with carefully prepared survey
tests.
4. Supply Versus Selection Tests. Some tests require examinees to supply the answer
(e.g., essay tests), whereas others require them to select the correct response from
the set of alternatives (e.g., multiple-choice tests).
5. Speed Versus Power Tests. A speed test measures the number of items that an
individual can complete in a given time, whereas a power test measures the level of
performance under ample time conditions. Power test items usually are arranged in
order of increasing difficulty.
6. Objective Versus Subjective Tests. An objective test is one on which equally
competent scorers will obtain the same scores (e.g., multiple-choice tests), whereas
a subjective test is one on which the scores are influenced by the opinion or judgment
of the person doing the scoring (e.g., essay tests).
Discussion Point 2:
Preparing Instructional Objectives
Instructional objectives play a key role in the instructional process. When properly
stated, they serve as guides for both teaching and evaluation. A clear description of
the intended outcomes of instruction aids in selecting relevant materials and
methods of instruction, in monitoring pupil learning progress, in selecting or
constructing appropriate evaluation procedures, and in conveying instructional intent
to others.
In preparing instructional objectives, it is possible to focus on different aspects of
instruction.
Educational goal: a general aim or purpose of education that is stated as a broad,
long-range outcome to work toward. Goals are used primarily in policy making and
general program planning (e.g., "Develop proficiency in the basic skills of reading,
writing and arithmetic.")
General instructional objective: an intended outcome of instruction that has been
stated in general enough terms to encompass a set of specific learning outcomes
(e.g., "Comprehends the literal meaning of written material").
Specific learning outcome: an intended outcome of instruction that has been
stated in terms of specific and observable pupil performance (e.g., "Identifies details
that are explicitly stated in a passage."). A set of specific learning outcomes describes
a sample of the types of performance that learners will be able to exhibit when they
have achieved a general instructional objective (also called specific objectives,
performance objectives, behavioral objectives, and measurable objectives).
Pupil performance: any measurable or observable pupil response in the cognitive,
affective, or psychomotor area that is a result of learning.
Dimensions of Instructional Objectives
1. Mastery vs. Developmental Outcomes
Mastery objectives are typically concerned with relatively simple
knowledge and skill outcomes (e.g., adds two single-digit numbers with
sums of ten or less).
Developmental outcomes are concerned with objectives that can never be
fully achieved; they describe varying degrees of pupil progress along a
continuum of development.
2. Ultimate vs. Immediate Objectives
Ultimate objectives are those concerned with the typical performance of
individuals in the actual situations they will face in the future. For
example, good citizenship is reflected in adult life through voting
behavior, interest in community affairs, and the like; safety consciousness
shows up in safe driving and safe work habits and in obeying safety rules
in daily activities.
Immediate objectives should be closely related to the ultimate situation.
For example, can pupils apply basic skills to practical situations? Such
objectives, calling for the application of knowledge and skills, aid in the
transfer of skills to ultimate situations and should be on any list of
objectives.
3. Single-course vs. Multiple-course Objectives
Areas Containing Multiple-Course Objectives
Whether these areas are the shared responsibility of several teachers depends
on the grade level and the school's goals (e.g., in some schools, every teacher
is considered a teacher of basic skills).
Reading    Computer skills    Creativity
Writing    Study skills       Citizenship
Speaking   Library skills     Health
Selection of Instructional Objectives
1. Types of learning outcomes to consider
2. Taxonomy of educational objectives
3. Use of published lists of objectives
4. Review of your own teaching materials and methods
Begin with a Simple Framework: Knowledge, Understanding, Application
Reading
K = Knows vocabulary
U = Reads with comprehension
A = Reads a wide variety of printed materials
Writing
K = Knows the mechanics of writing
U = Understands grammatical principles in writing
A = Writes complete sentences (paragraph, theme)
Math
K = Knows the number system and basic operations.
U = Understands math concepts and processes.
A = Solves math problems accurately and efficiently
Criteria for Selecting Appropriate Objectives
1. Do the objectives include all important outcomes?
2. Are the objectives in harmony with the general goals of the school?
3. Are the objectives in harmony with sound principles of learning?
4. Are the objectives realistic in terms of the pupils' abilities and the time and
facilities available?
Stating the Specific Learning Outcomes
1. Focus on action verbs
Examples:
a. Understands the meaning of terms
1. Defines the terms in own words
2. Identifies the meaning of a term in context
3. Differentiates between proper and improper usage of a term
4. Distinguishes between two similar terms on the basis of meaning
5. Writes an original sentence using the term
b. Demonstrates skill in critical thinking
1. Distinguishes between fact and opinion
2. Distinguishes between relevant and irrelevant information
3. Identifies fallacious reasoning in written material
4. Identifies the limitations of given data
5. Identifies the assumptions underlying conclusions
2. Keep the outcomes free of specific content so that they can be used in
various units of study
Poor: Identifies the last ten presidents of the Psychological Association
of the Philippines.
Better: Identifies important historical figures.
Poor: Identifies the parts of the brain.
Better: Identifies the parts of a given structure.
Discussion Point 3:
Achieving Different Types of Learning Outcomes
Achieving Cognitive Learning
Teaching Fact, Factual Information, and Knowledge
Basic Concepts
1. Fact: something that has happened, an event or an actual state of affairs
2. Factual information: information discriminated by many individuals who share
the same cultural background and accepted as correct or appropriate
3. Information: anything that is discriminated by an individual
4. Knowledge: factual information that is learned initially and then remembered

Types of Knowledge
1. General: knowledge that applies to many different situations
2. Domain-specific: knowledge that pertains to a particular task or subject
3. Declarative: knowledge of verbal information: facts, beliefs, opinions
4. Procedural: knowledge of how a task is performed
5. Conditional: knowing when and why to use declarative and procedural
knowledge

Three Categories of Knowledge
1. Knowledge of specifics: isolated facts remembered separately
2. Knowledge of ways and means: conventions, trends and sequences,
classifications and categories, criteria and methodology
3. Knowledge of abstractions: laws, theories and principles
Application of Principles in Teaching and Learning Factual Information
Principle 1: Organizing learning material through meaningful association
facilitates acquisition of information.
  Application: group items
    o according to common attributes
    o by relationships

Principle 2: Transition from old to new materials facilitates acquisition of
information.
  Applications:
    o organize the material to a higher level of generality
    o use an advance organizer that is more general, abstract, inclusive and
      comparative
    o utilize related prior knowledge of students as advance organizers

Principle 3: Proper sequencing of materials facilitates acquisition of
information.
  Application: order subject matter
    o according to regularity of the structure
    o according to the responses available to the learner
    o according to similarity of different stimuli

Principle 4: Appropriate practice facilitates acquisition of information.
  Applications:
    o provide adequate practice through use of knowledge in situation,
      relationship, distribution of sessions, and review of a small amount of
      material, then at increasingly larger intervals
    o reinforce practice through confirmation of correct responses

Principle 5: Independent evaluation facilitates acquisition of information.
  Application: provide a mechanism for learners to evaluate their own responses.
Teaching Concepts and Principles
Basic Concepts of Concepts and Principles
1. Concept: essentially an idea or an understanding of what something is; a
category used to group similar events, ideas, objects or people; organized
information about the properties of one or more things
2. Principle: a relationship among two or more concepts

Classification of Principles
1. Cause and effect: if-then relationship
2. Probability: prediction on actual sense
3. Correlation: prediction based on a wide range of phenomena
4. Axioms: rules
Instances of Concepts
1. Positive: a specific example of a concept
2. Negative: a non-example of a concept

Attributes of Concepts
1. Learnability: some concepts are more readily learned than others
2. Usability: some concepts can be used more than others
3. Validity: the extent to which experts agree on its attributes
4. Generality: the higher the concept, the more general it is
5. Power: the extent to which a concept facilitates learning of other concepts
6. Structure: internally consistent organization
7. Instance perceptibility: the extent to which instances of the concept can be
sensed
8. Instance numerousness: the number of instances, ranging from one to an
infinite number

Four Levels of Concept Attainment
1. Concrete
2. Identity
3. Classificatory
4. Formal

Four Components in Any Concept Development
1. Name of concept
2. Definition
3. Relevant and irrelevant attributes
4. Examples and non-examples

Simple Procedures in Concept Analysis
1. State attributes and non-attributes
2. Give examples and non-examples
3. Indicate relationships of a concept to other concepts
4. Identify the principles in which the concept is used
5. Use the concept in solving problems
Application of Principles in Teaching and Learning of Concepts and Principles
Principle 1: Awareness of attributes facilitates concept learning.
  Manage instruction by:
    o guiding learners to identify the critical attributes
    o using examples and non-examples from which the attributes of the
      concept can be identified
    o utilizing activities where instances of the concept can be directly
      observed
    o providing for overgeneralizing and undergeneralizing to establish the
      limits of the concept
    o providing the right amount of variation and repetition
    o varying irrelevant dimensions so that the relevant dimensions may be
      identified easily

Principle 2: Correct language for concepts facilitates concept learning.
  Teach relevant names and labels associated with the concept's attributes.

Principle 3: Proper sequencing of instances facilitates concept learning.
  Concept development should proceed:
    o from simple to complex examples
    o from concrete to abstract
    o from parts to whole
    o from whole to parts
  Present:
    o a larger number of instances of one concept
    o instances of high dominance rather than low dominance
    o positive and negative instances of the concept rather than all positive
      or all negative instances
  Present instances of the concept simultaneously rather than successively.

Principle 4: Guided student discovery facilitates concept learning.
  Guide students' discovery of concepts through:
    o encounters with real and meaningful problems
    o gathering accurate information
    o a responsive environment
    o prompt and accurate feedback

Principle 5: Concept application facilitates concept learning.
  Conduct meaningful applications of concepts by:
    o drawing on the learners' experiences
    o observing related situations
    o encountering life-like situations

Principle 6: Independent evaluation facilitates concept learning.
  Arrange for independent evaluation by:
    o creating an attitude of seeking and searching
    o arranging for self-evaluation of the adequacy of one's concepts
    o assisting learners to evaluate their concepts and their methods of
      evaluating them
Developing Problem Solving Abilities
Basic Concepts
1. Problem - a felt difficulty or a question for which a solution may be found only by a process of thinking
2. Thinking - the recall and reorganization of facts and theories that occur when the individual is faced with obstacles and problems
3. Reasoning - productive thinking in which previous experiences are organized or combined in new ways to solve problems
4. Problem solving - creating new solutions to remove a felt difficulty

Steps in Problem Solving

1. Felt need
2. Recognizing a problem situation
3. Gathering data
4. Evaluating possible solutions
5. Testing and verification
6. Making a generalization or conclusion
Principles for Developing Problem Solving Abilities and their Applications in Classroom Situations

1. Recognizing difficulties in a situation facilitates problem solving.
Assist students to:
- Identify solvable, significant problems
- State problems themselves

2. Delimiting the problem facilitates problem solving.
Guide students in:
- Analyzing the situation related to the problem
- Determining problems of immediate concern
- Delimiting the problem
- Stating problems in a way that offers an opportunity for securing progress toward a solution
- Deciding the form in which the solutions might appear
- Using the information-processing skill of selective attention

3. Using new methods for arriving at a conclusion facilitates problem solving.
Help students in:
- Locating needed information
- Acquiring the necessary background information, concepts, and principles for dealing with the problem
- Developing their minimum reference list
- Identifying various sources of information
- Drawing information from their own experiences
- Deciding on a uniform system for writing a bibliography

4. Generating possible solutions through applying knowledge and methods to the problem situation facilitates problem solving.
Lead students to generate solutions through:
- Brainstorming sessions
- Processing information
- Analyzing information in terms of the larger problems
- Incorporating diverse information
- Eliminating overlaps and discrepancies

5. Inferring and testing hypotheses facilitates problem solving.
Develop the students' skills in:
- Drawing hypotheses
- Stating hypotheses
- Testing hypotheses
Developing Creativity
Basic Concepts
1. Restructuring - conceiving of a problem in a new or different way
2. Incubation - unconscious work toward a solution while one is away from the problem
3. Divergent thinking - coming up with many possible solutions
4. Convergent thinking - narrowing the possibilities to a single answer
5. Creativity - the occurrence of uncommon or unusual but appropriate responses; imaginative, original thinking

Characteristics of a Creative Individual

1. Has a high degree of intellectual capacity
2. Genuinely values intellectual matters
3. Values own independence and autonomy
4. Is verbally fluent
5. Enjoys aesthetic impressions
6. Is productive
7. Is concerned with philosophical problems
8. Has a high aspiration level for self
9. Has a wide range of interests
10. Thinks in unusual ways
11. Is an interesting, arresting person
12. Appears straightforward and candid
13. Behaves in an ethically consistent manner
Principles for Developing Creativity and their Applications in Classroom Teaching
Principles and their applications in classroom teaching:

1. Production of novel forms or ideas through expressing oneself by figural, verbal, and physical means facilitates the development of creativity.
Model creative behaviors such as:
- Curiosity
- Inquiry
- Divergent production
Provide opportunities for:
- Expression in many media: language, rhythm, music, and art
- Divergent production through figural, verbal, and physical means
- Creative processes
- Valuing creative achievement
- Production of ideas that cannot be scored right or wrong
Develop a continuing program for developing creative abilities.

2. Associating success in creative efforts with a high level of creative experience facilitates the development of creativity.
Respect:
- Unusual questions
- Imaginative, creative ideas
Reward:
- Creative efforts
- Unique productions
Achieving Psychomotor Learning
Basic Concepts
1. Capacity - the individual's potential power to do a certain task
2. Ability - the actual power to perform an act, physically and mentally
3. Skill - the level of proficiency attained in carrying out sequences of action in a consistent way
Characteristics of Skilled Performances
1. Less attention to the specific movement (voluntary to involuntary)
2. Better differentiation of cues
3. More rapid feedback and correction of movements
4. Greater speed and coordination
5. Greater stability under a variety of environmental conditions
Phases of Motor Skills Learning
1. Cognitive phase - understanding the task
2. Organizing phase - associating responses with particular cues and integrating responses
3. Perfecting phase - executing the performance in automatic fashion
Application of Principles in Developing Psychomotor Skills in Classroom Teaching
Principles and their applications in classroom teaching:

1. Attending to the characteristics of the skill and assessing one's own related abilities facilitates motor skill learning.
Analyze the psychomotor skill in terms of the learners' abilities and developmental level:
- To determine the specific abilities necessary to perform it
- To arrange the component abilities in order
- To help students master them

2. Observing and imitating a model facilitates initial learning of skills and movements.
Demonstrate and describe:
- The entire procedure, as an advance organizer
- The correct components of the motor abilities
- The links of the motor chain in sequence
- The skill again, step by step

3. Guiding initial responses verbally and physically facilitates learning of motor skills.
Provide verbal guidance to:
- Give learners a feeling of security
- Direct attention to more adequate techniques
- Promote insight into the factors related to successful performance of the task
Provide physical guidance to:
- Facilitate making correct responses initially
- Correct wrong responses immediately

4. Practicing under desirable conditions facilitates the learning of skills by eliminating errors and strengthening and refining correct responses and form.
Conduct practice of skills:
- Close to the actual conditions where the skill will be used
- From whole-to-part arrangement
- Through repetitive drills on the same materials
- By distributed rather than massed practice
- With intervals of rest long enough to overcome fatigue but not so long that forgetting occurs

5. Knowledge of results facilitates skill learning.
Provide informational feedback on:
- Correct and incorrect responses
- Adequate and inadequate responses
- Correct or incorrect verbal remarks
Feedback may be secured from:
- Verbal analysis
- Chart analysis
- Taped performance

6. Evaluating one's own performance facilitates mastery of skills.
Arrange for self-evaluation of the learner's performance through:
- Discussion
- Analysis
- Assessment
Achieving Affective Learning
Developing Attitudes and Values
1. Affective - pertains to emotions or feelings rather than thought
2. Affective learning - consists of responses acquired as one evaluates the meaning of an idea, object, person, or event in terms of his view of the world

Main Elements

1. Taste - like or dislike of a particular animal, color, or flavor
2. Attitudes - learned, emotionally toned predispositions to react in a consistent way, favorably or unfavorably, toward a person, object, or idea
3. Values - inner core beliefs and internalized standards serving as norms of behavior

Defining Attributes of Attitudes

1. Learnability - all attitudes are learned
2. Stability - learned attitudes become stronger and enduring
3. Personal-societal significance - attitudes are of high importance to the individual and society
4. Affective-cognitive content - attitudes have both factual information and emotions associated with an object
Application of Principles in Developing Attitudes and Values in Classroom Teaching
Principles and their applications in classroom teaching:

1. Recognizing an attitude facilitates its initial learning.
Guide students in:
- Identifying the attitudes and values to be developed
- Defining the terminal behavior expected of them

2. Observing and imitating a model facilitates initial attitude learning.
The teacher provides:
- Different types of exemplary models
- Opportunities to examine instructional materials carefully in terms of the attitudes and values presented
The teacher sets a good example.

3. Positive attitudes toward a person, event, or object facilitate affective learning.
Provide pleasant and positive emotional experiences by:
- Showing warmth and enthusiasm toward students
- Keeping personal prejudices under control
- Allowing students to express their own value commitments
- Demonstrating interest in the subject matter
- Making it possible for each student to experience success

4. Getting information about a person, event, or object influences initial attitude learning and later commitment to group-held attitudes.
Guide learners to extend their informative experiences by:
- Undergoing direct experiences
- Listening to group lectures and discussions
- Engaging in extensive reading
- Participating in related activities

5. Interacting in primary groups influences initial attitude learning and later commitment to group-held attitudes.
Facilitate interaction in primary groups through:
- Group planning
- Group discussion
- Group decision making
- Role-playing

6. Practicing an attitude facilitates its stable organization.
The practice context should:
- Present the teacher as an exemplary model manifesting interest in the students
- Be characterized by a positive climate
- Confirm learner responses with positive remarks, an approving nod, and a smile

7. Purposeful learning facilitates effective attitude acquisition and modification.
Guide learners to engage in independent attitude cultivation by:
- Providing opportunities for them to think about their own attitudes
- Writing about open-ended themes
Discussion Point 4:
Test Construction, Reliability and Validity
STEP I: CONTENT VALIDATION
It is the degree to which the test represents the essence, the topics, and the areas that the test is designed to measure.

It is considered the most crucial procedure in the test construction process because content validity sets the pace for the succeeding validity and reliability measures.
1.1 Documentary analysis or pre-survey. At this stage, one must have familiarized oneself with the theoretical constructs directly related to the test one is planning.

1.2 Development of a Table of Specification. Determining the areas or concepts that will represent the nature of the variable being measured, and the relative emphasis of each area, is essentially judgmental.

A detailed Table of Specification (TS) includes the areas or concepts, the objectives, the number of items, and the percentage or proportion of items in each area.

It is advisable to make a 50 to 100 percent allowance in the construction of items.
Sample

Table of Specification (first draft) for an Introduction to Psychology Unit Exam
(Columns: AREAS, LEARNING OBJECTIVES, NUMBER OF ITEMS, PLACEMENT OF ITEMS under K, C, A, A, S, E, and PERCENTAGE)

I. History of Psychology - 15 items - placed at 1, 6, 13, 14, 22, 23, 24, 32, 33, 49, 50, 51, 59, 65, 65 - 21.43 %
II. Branches of Psychology - 15 items - placed at 7, 12, 15, 16, 20, 21, 31, 34, 47, 48, 52, 58, 59, 61, 70 - 21.43 %
III. Schools of Psychology - 20 items - placed at 3, 4, 5, 17, 19, 25, 26, 30, 35, 36, 39, 40, 44, 46, 53, 54, 60, 62, 67, 69 - 28.57 %
IV. Research Methods - 20 items - placed at 2, 8, 9, 10, 11, 18, 27, 28, 29, 37, 38, 41, 42, 43, 45, 55, 56, 63, 64, 68 - 28.57 %
Total - 70 items - 100 %
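The percentage column above is each area's item count taken as a share of the 70-item total. A short Python sketch of that arithmetic:

```python
# Percentage of items per area in the sample Table of Specification.
areas = {
    "History of Psychology": 15,
    "Branches of Psychology": 15,
    "Schools of Psychology": 20,
    "Research Methods": 20,
}

total = sum(areas.values())           # 70 items overall
for name, n_items in areas.items():
    share = 100 * n_items / total     # relative emphasis of the area
    print(f"{name}: {n_items} items, {share:.2f} %")
```

This reproduces the 21.43 % and 28.57 % figures shown in the table.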
Building a Table of Specifications:
1. Obtaining a list of instructional objectives
2. Outlining the course content
3. Preparing a two-way chart
Table of Specifications for a Summative Third-Grade Social Studies Test
1.3 Consultation with experts. At this point it is advisable to consult your thesis adviser or other authorities who have the expertise to make judgments about the representativeness or relevance of the entries in your TS.

1.4 Item writing. At this stage you should know what types of items you are supposed to construct: the type of instrument, format, scaling, and scoring techniques.
STEP II: FACE VALIDATION

Face validity, the crudest type of validity, pertains to whether the test looks valid; that is, whether, on the face of the instrument, it looks like it can measure what you intend to measure.

This type of validity cannot stand alone, especially for research at the graduate level.

2.1 Item inspection. Have the initial draft of the instrument inspected by a group of evaluators: the thesis adviser, test construction experts, and experts/professionals whose specializations are related to the subject matter at hand.
ITEM / ITEM NO.     SUITABLE     NOT SUITABLE     NEEDS REVISION
2.2 Inter-judge consistency. You may collate the data gathered from the evaluators for analysis. You have to look at the agreement or consistency of the judgments they made on each of the items.
STEP III: FIRST TRIAL RUN

At this stage you must already have a stencil of your first draft as a result of Steps I and II. Try out your test on a sample that is comparable to your target population or to your final sample. This tryout should be large enough to provide meaningful computations.
STEP IV: ITEM ANALYSIS

Both the reliability and the validity of any test depend largely on the characteristics of the items. High validity and reliability can be built into an instrument in advance through item analysis.

According to Likert, item analysis can be used as an objective check in determining whether the members of the group react differentially to the battery; that is, item analysis indicates whether those persons who fall toward one end of the attitude continuum on the battery do so on the particular statement, and vice versa.
THE U-L INDEX METHOD

Appropriate for tests whose criterion is measured along a continuous scale and whose individual items are scored right or wrong, or negative or positive.

Steps in Using the U-L Index Method
1. Score the tests and arrange them from lowest to highest based on the total scores.

Sample scores from a 10-item test (n = 30)

List of scores (as recorded):
2 9   3 8   5 9   9 5   5 2
4 3   2 4   6 4   8 6   8 2
8 2   9 1   3 8   6 4   3 5

Arranged from lowest to highest:
1 5   2 5   2 5   2 6   2 6
2 6   3 8   3 8   3 8   3 8
4 8   4 9   4 9   4 9   5 9
2. Separate the top 27 % and the bottom 27 % of the cases. 27 % of 30 is 8.1, or 8 (30 x .27 = 8.1).

From the arranged scores, the bottom 27 % consists of the 8 lowest scores (1, 2, 2, 2, 2, 2, 3, 3) and the top 27 % consists of the 8 highest scores (8, 8, 8, 8, 9, 9, 9, 9).
3. Prepare a tally sheet. Tally the number of cases from each group who got the item right, for each item, and then convert the tallies into frequencies.

ITEM NO.   UPPER 27 % tally   frequency   LOWER 27 % tally   frequency
1          IIIII III          8           II                 2
2          IIIII              5           II                 2
3          IIIII III          8           I                  1
4          IIIII I            6           I                  1
5          IIIII III          8           II                 2
6          IIIII II           7           III                3
7          IIIII I            6           I                  1
8          IIIII              5           I                  1
9          IIIII III          8           II                 2
10         IIIII II           7           II                 2
4. Compute the proportion for each group on each item:

U or L = f / n

where f is the number of cases in the group who got the item right and n is the number of cases.

ITEM NO.   UPPER 27 % f   p (Pu)   LOWER 27 % f   p (Pl)
1          8                       2
2          5                       2
3          8                       1
4          6                       1
5          8                       2
6          7                       3
7          6                       1
8          5                       1
9          8                       2
10         7                       2
5. Compute the discrimination index of each item.

Discrimination index refers to the degree to which an item differentiates correctly among test takers in the behavior that the test is designed to measure. Thus, a good test item separates the bright from the poor respondents.

Ds = Pu - Pl

Where: Ds is the discrimination index
Pu is the proportion of the upper 27 %
Pl is the proportion of the lower 27 %
ITEM NO.   UPPER 27 % f   p (Pu)   LOWER 27 % f   p (Pl)   Ds
1          8                       2
2          5                       2
3          8                       1
4          6                       1
5          8                       2
6          7                       3
7          6                       1
8          5                       1
9          8                       2
10         7                       2
6. Compute the difficulty index of each item.

Difficulty index is the percentage of the respondents who got the item right. It can also be interpreted as how easy or how difficult an item is.

Df = (Pu + Pl) / 2

Where: Df is the difficulty index
Pu is the proportion of the upper 27 %
Pl is the proportion of the lower 27 %
ITEM NO.   UPPER 27 % f   p (Pu)   LOWER 27 % f   p (Pl)   Ds   Df
1          8                       2
2          5                       2
3          8                       1
4          6                       1
5          8                       2
6          7                       3
7          6                       1
8          5                       1
9          8                       2
10         7                       2
7. Decide whether to retain an item based on two ranges.

ITEM NO.   UPPER 27 % f   p   LOWER 27 % f   p   Ds   Df   Decision
1          8                  2
2          5                  2
3          8                  1
4          6                  1
5          8                  2
6          7                  3
7          6                  1
8          5                  1
9          8                  2
10         7                  2
Items with difficulty indices within .20 to .80 and discrimination indices within .30 to .80 are retained.
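Steps 4 through 7 can be sketched in Python for the sample frequencies above. One assumption: the handout leaves the p columns blank, so this sketch takes each proportion as the item frequency divided by the size of its 27 % group (8 of the 30 cases).

```python
# U-L index method: proportions, discrimination (Ds), difficulty (Df),
# and the retention decision for the 10 sample items.
upper_f = [8, 5, 8, 6, 8, 7, 6, 5, 8, 7]  # correct answers, upper 27 %
lower_f = [2, 2, 1, 1, 2, 3, 1, 1, 2, 2]  # correct answers, lower 27 %
group_n = 8                               # 27 % of 30 cases, rounded

results = []
for fu, fl in zip(upper_f, lower_f):
    pu, pl = fu / group_n, fl / group_n   # step 4: proportions
    ds = pu - pl                          # step 5: discrimination index
    df = (pu + pl) / 2                    # step 6: difficulty index
    keep = 0.20 <= df <= 0.80 and 0.30 <= ds <= 0.80
    results.append((round(ds, 2), round(df, 2), "retain" if keep else "reject"))

for item_no, row in enumerate(results, start=1):
    print(item_no, row)
```

Under this reading, item 3 (Ds = .88) falls outside the .30 to .80 discrimination range and would be rejected.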
The Chung-Teh Fan item analysis table can be used to interpret the discrimination indices of the items:

.40 and above - very good item
.30 - .39     - reasonably good item, but possibly subject to improvement
.20 - .29     - marginal item, usually needing improvement
.19 and below - poor item, to be rejected, improved, or revised

Difficulty indices can be interpreted as follows:

.00 - .20   - very difficult
.21 - .80   - moderately difficult
.81 - 1.00  - very easy
CRITERION OF INTERNAL CONSISTENCY

This is somewhat similar to the U-L Index Method, in that two criterion groups, the high group and the low group, are employed to judge the discriminatory power of an item. In this method, however, Likert recommends using the high 10 percent and the low 10 percent groups.
Steps in the Criterion of Internal Consistency Method

1. List all the scores of the respondents and take the high 10 percent and the low 10 percent. Write their respective scores for each item.

Sample scores from a 10-item test using a 4-point scale (n = 50)

Respondents          Item: 1  2  3  4  5  6  7  8  9  10
High 10 %    A             4  4  4  4  4  3  4  2  4  3
             B             4  2  4  2  2  4  4  2  4  3
             C             4  4  4  4  4  4  4  1  3  3
             D             4  4  4  4  2  4  4  2  4  4
             E             3  4  4  3  4  2  2  2  4  4
Low 10 %     F             4  1  1  2  3  2  3  4  3  2
             G             4  1  2  2  4  2  2  3  3  2
             H             3  2  2  1  3  3  1  4  2  2
             I             3  2  2  1  4  2  3  4  2  2
             J             3  2  2  2  4  1  3  3  1  2
2. Get the sum of each item for the high group (rows A-E) and for the low group (rows F-J).
3. Get the difference between the two sums for each item (sum of the high group minus sum of the low group).
4. To compare the differences across items, ranking them is helpful.
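Steps 2 through 4 can be sketched in Python for the sample scores above:

```python
# Criterion of internal consistency: per-item sums for the high and low
# 10 % groups, their differences, and a simple ranking of the differences.
high = [  # respondents A-E, items 1-10
    [4, 4, 4, 4, 4, 3, 4, 2, 4, 3],
    [4, 2, 4, 2, 2, 4, 4, 2, 4, 3],
    [4, 4, 4, 4, 4, 4, 4, 1, 3, 3],
    [4, 4, 4, 4, 2, 4, 4, 2, 4, 4],
    [3, 4, 4, 3, 4, 2, 2, 2, 4, 4],
]
low = [  # respondents F-J, items 1-10
    [4, 1, 1, 2, 3, 2, 3, 4, 3, 2],
    [4, 1, 2, 2, 4, 2, 2, 3, 3, 2],
    [3, 2, 2, 1, 3, 3, 1, 4, 2, 2],
    [3, 2, 2, 1, 4, 2, 3, 4, 2, 2],
    [3, 2, 2, 2, 4, 1, 3, 3, 1, 2],
]

sum_high = [sum(col) for col in zip(*high)]        # step 2
sum_low = [sum(col) for col in zip(*low)]          # step 2
diff = [h - l for h, l in zip(sum_high, sum_low)]  # step 3

# Step 4: rank items from largest to smallest difference
# (rank 1 = most discriminating; ties broken by item order).
order = sorted(range(len(diff)), key=lambda i: diff[i], reverse=True)
rank = [0] * len(diff)
for r, i in enumerate(order, start=1):
    rank[i] = r

print("differences:", diff)
print("ranks:      ", rank)
```

Items 5 and 8 come out with negative differences: the low group actually outscored the high group on them, which marks them as poor discriminators.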
PEARSON PRODUCT-MOMENT CORRELATION METHOD

This item analysis technique is used for tests with continuous scaling of three (3) or more scale points. The total score serves as the X criterion, and the item score serves as the Y criterion. This is done for every item; therefore, if the draft consists of 60 items, there should be 60 correlation coefficients computed.
Steps in the Pearson Product-Moment Correlation Method

1. Find the X and Y scores, where X is the total score of each respondent and Y is the item score.

Sample scores for item no. 1 in a 75-item test (n = 10)

Respondents     X     Y
A               30    4
B               43    5
C               53    3
D               45    4
E               70    2
F               45    3
G               68    4
H               48    5
I               38    2
J               45    4
TOTAL           485   36
2. Square all the X and Y scores. For the sample data, the totals are ΣX² = 24905 and ΣY² = 140.
3. Multiply each X by its Y and total the products to obtain ΣXY.
4. Given the above data, compute the Pearson r:

rxy = [nΣxy - (Σx)(Σy)] / √([nΣx² - (Σx)²][nΣy² - (Σy)²])

where: rxy = correlation between X and Y
Σx = sum of the total scores
Σy = sum of the item scores
Σxy = sum of the products of X and Y
Σx² = sum of the squared total scores
Σy² = sum of the squared item scores

A significant coefficient reflects a good item, while an insignificant one reflects a poor item. Most researchers consider a coefficient of .30 and above as indicating a good item.
To interpret the correlation coefficient values (r) obtained, the following classification may be applied:

+.00 - +.20 = negligible correlation
+.21 - +.40 = low or slight correlation
+.41 - +.70 = marked or moderate correlation
+.71 - +.90 = high correlation
+.91 - +.99 = very high correlation
+1.00       = perfect correlation
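The handout leaves the computation to the reader; plugging the sample data into the formula gives the following sketch:

```python
# Pearson r between total scores (X) and item-1 scores (Y) for the
# sample data above; the intermediate sums match the handout's totals.
from math import sqrt

x = [30, 43, 53, 45, 70, 45, 68, 48, 38, 45]  # total scores, A-J
y = [4, 5, 3, 4, 2, 3, 4, 5, 2, 4]            # item no. 1 scores
n = len(x)

sx, sy = sum(x), sum(y)                       # 485 and 36
sxx = sum(v * v for v in x)                   # 24905
syy = sum(v * v for v in y)                   # 140
sxy = sum(a * b for a, b in zip(x, y))

r = (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))
print(round(r, 2))
```

For this sample, r comes out at about -.24, a negligible (and negative) correlation, so item no. 1 would be judged a poor item by the .30 criterion.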
POINT-BISERIAL CORRELATION METHOD

This is applied to tests with a dichotomous scoring system (yes/no, right/wrong, improved/not improved). Unlike the Pearson Product-Moment method, the Y criterion is scored either 1 or 0.
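The handout does not spell out the computation, so the following is a sketch using the standard point-biserial formula r_pb = (Mp - Mq) / s * sqrt(p * q), where Mp and Mq are the mean total scores of those scoring 1 and 0 on the item, s is the standard deviation of all total scores, and p and q are the proportions scoring 1 and 0. The sample data are made up for illustration.

```python
# Point-biserial correlation between a dichotomous item and total scores.
from math import sqrt

def point_biserial(totals, item):
    """item entries are 1 (right) or 0 (wrong)."""
    n = len(totals)
    mean = sum(totals) / n
    s = sqrt(sum((t - mean) ** 2 for t in totals) / n)  # population SD
    ones = [t for t, i in zip(totals, item) if i == 1]
    zeros = [t for t, i in zip(totals, item) if i == 0]
    p = len(ones) / n
    m_p = sum(ones) / len(ones)
    m_q = sum(zeros) / len(zeros)
    return (m_p - m_q) / s * sqrt(p * (1 - p))

# Hypothetical data: the high scorers got the item right and the low
# scorers did not, so the coefficient should be strongly positive.
print(point_biserial([10, 9, 8, 7, 3, 2], [1, 1, 1, 1, 0, 0]))
```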
USING TWO OR MORE TECHNIQUES

Basically, this is a combination of two or more item analysis techniques. Although item analysis is laborious, some researchers play safe by going through this process to ensure a more accurate quantitative judgment.
STEP V: SECOND TRIAL RUN OR FINAL TEST ADMINISTRATION

More often than not, the second trial run becomes the final run. This means that for the second trial run one may administer the draft resulting from the item analysis to one's final sample.

Necessary adjustments can still be made before the instrument is finally administered to the final sample.
STEP VI: EVALUATION OF THE TEST

After the final run, the test can be evaluated statistically for its final validity and reliability.

6.1 Evaluation of reliability

THE SPLIT-HALF RELIABILITY

The most common technique for evaluating split-half reliability is the odd-even technique. This is done by splitting the test into two halves: the odd-numbered items as one half, and the even-numbered items as the other.

Using the Pearson Product-Moment Correlation, the reliability of half of the instrument can be determined. The reliability coefficient of this type is often called a coefficient of internal consistency.

Using the Spearman-Brown Prophecy Formula, the reliability of the entire instrument can be obtained.
r11 = 2r / (1 + r)

where: r11 = reliability of the whole test
r = reliability of the half test

What would be the reliability of the whole test if the computed coefficient from the odd-even method is r = .63?
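The worked question above is a one-line computation; a short Python sketch:

```python
# Spearman-Brown Prophecy Formula: step up a half-test reliability
# to the reliability of the whole test.
def spearman_brown(r_half):
    return 2 * r_half / (1 + r_half)

# Odd-even coefficient from the question: r = .63
print(round(spearman_brown(0.63), 2))
```

So a half-test coefficient of .63 steps up to a whole-test reliability of about .77.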
The Kuder-Richardson Formula 20 can also be used to determine the reliability of the entire test, and at the same time it avoids the problem that may arise in using the Spearman-Brown Prophecy Formula.

Steps in Kuder-Richardson Formula 20
1. Check the test, giving 1 for every correct answer and 0 for every wrong answer, and get the frequency of correct answers for each item.

ITEMS   Respondents: A B C D E F G H I J K L M N    f
1                    1 1 1 1 1 1 1 1 1 1 1 1 0 0    12
2                    1 1 1 1 1 1 1 1 1 1 1 1 0 0    12
3                    1 1 1 1 1 1 1 1 1 1 1 0 0 0    11
4                    1 1 1 1 1 1 1 1 1 1 0 0 0 0    10
5                    1 1 1 1 1 1 1 1 1 1 0 0 0 0    10
6                    1 1 1 1 1 1 1 1 1 1 0 0 0 0    10
7                    1 1 1 1 1 1 1 1 1 0 0 0 0 0    9
8                    0 1 1 1 1 1 0 1 1 0 1 0 0 0    8
9                    0 1 1 1 1 1 1 1 0 0 0 0 0 0    8
10                   0 0 1 0 0 1 1 0 1 0 0 0 0 0    4
total                7 9 10 9 9 10 9 9 9 6 4 2 0 0
2. Find the proportion passing each item (pi) and then the proportion failing each item (qi). pi is computed by dividing the number of respondents who got the item correct by the total number of respondents, while qi is computed by subtracting the computed pi from 1.

pi = (no. of students with the correct answer) / (total number of respondents)
qi = 1 - pi

ITEMS    f     pi    qi
1        12
2        12
3        11
4        10
5        10
6        10
7        9
8        8
9        8
10       4
3. Multiply pi by qi for each item and total the products.

ITEMS    f     pi    qi    piqi
1        12
2        12
3        11
4        10
5        10
6        10
7        9
8        8
9        8
10       4
total                      1.9509
4. Compute the variance (s²) of the instrument:

x̄ = Σx / n

s² = Σ(x - x̄)² / (n - 1)

RESPONDENTS    x     (x - x̄)    (x - x̄)²
A              7
B              9
C              10
D              9
E              9
F              10
G              9
H              9
I              9
J              6
K              4
L              2
M              0
N              0
total

5. Compute the Kuder-Richardson Formula 20:

rtt = [k / (k - 1)] [1 - (Σpiqi / s²)]

where k is the number of items.
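Steps 1 through 5 can be carried out in Python for the 14-respondent sample above; the Σpiqi below matches the handout's 1.9509 up to rounding.

```python
# Kuder-Richardson Formula 20 for the sample 10-item, 14-respondent test.
item_f = [12, 12, 11, 10, 10, 10, 9, 8, 8, 4]           # f per item
totals = [7, 9, 10, 9, 9, 10, 9, 9, 9, 6, 4, 2, 0, 0]   # x per respondent

n = len(totals)  # 14 respondents
k = len(item_f)  # 10 items

# Steps 2-3: pi = f / n, qi = 1 - pi, summed products
pq_sum = sum((f / n) * (1 - f / n) for f in item_f)

# Step 4: variance of the total scores, with n - 1 in the denominator
mean = sum(totals) / n
s2 = sum((x - mean) ** 2 for x in totals) / (n - 1)

# Step 5: KR-20
r_tt = (k / (k - 1)) * (1 - pq_sum / s2)
print(round(pq_sum, 2), round(s2, 2), round(r_tt, 2))
```

The sample test comes out highly reliable, with a KR-20 coefficient of about .95.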
THE TEST-RETEST RELIABILITY

This is also called the coefficient of stability. To calculate the coefficient, the test is administered twice to the same sample with a given time interval. The Pearson r is then calculated to determine the reliability of the instrument.

The most critical problem in this technique is determining the correct time interval between the two testings. Generally, it is two weeks or so.

PARALLEL FORM OR ALTERNATE FORM RELIABILITY

The coefficient of equivalence is computed by administering two parallel or equivalent forms of the test to the same group of individuals. This technique is also referred to as the method of equivalent forms, and the coefficient obtained is also called the coefficient of equivalence.
6.2 Evaluation of validityEvaluation of validityEvaluation of validityEvaluation of validityCRITERIONCRITERIONCRITERIONCRITERION----RELATED VALIDITYRELATED VALIDITYRELATED VALIDITYRELATED VALIDITY
Criterion-related validity is a very common type of validity, and it is primarily statistical. It is a correlation between a set of scores, or some other predictor, and an external measure; this external measure is called the criterion. A correlation coefficient is then computed between the two sets of measurements. In actual practice, several predictors are used, and a multiple R is then computed between these predictors and the criterion. The difficulty usually met in this type of validity is in selecting or judging which criterion should be used to validate the measure at hand.
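The multiple-R idea mentioned above can be sketched as follows: regress the criterion on the predictors by least squares, then correlate predicted with observed criterion scores. All names and data here are illustrative, not from the material.

```python
# Hedged sketch: multiple R between several predictors and a criterion.
import numpy as np

# rows = examinees; columns = two predictor scores (e.g., two subtest scores)
predictors = np.array([[10, 8], [9, 9], [7, 6], [5, 7], [4, 3], [2, 4]], dtype=float)
criterion = np.array([85, 88, 70, 65, 55, 48], dtype=float)  # external measure

# Add an intercept column and fit criterion = X b by least squares.
X = np.column_stack([np.ones(len(criterion)), predictors])
b, *_ = np.linalg.lstsq(X, criterion, rcond=None)
predicted = X @ b

# Multiple R = Pearson correlation between predicted and observed criterion.
R = np.corrcoef(predicted, criterion)[0, 1]
print(round(R, 2))
```

The closer R is to 1.0, the better the predictor set anticipates the criterion, which is why this type is also called predictive validity.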
It is also called predictive validity.

CONSTRUCT VALIDITY
Construct validity is determined by investigating the psychological qualities, traits, or factors measured by a test. It is often called concept validity because it is concerned not so much with a high validity coefficient as with the theory and concept behind the test. Likewise, it involves discovering positive correlations between and among the variables/constructs that define the concept.
Discussion Point 5:
Constructing Objective Test Items: Multiple-Choice Form
Objective test items are not limited to the measurement of simple learning outcomes. The multiple-choice item can measure at both the knowledge and understanding levels and is also free of many of the limitations of other forms of objective items.
The multiple-choice item is generally recognized as the most widely applicable and useful type of objective test item. It more effectively measures many of the simple learning outcomes measured by the short-answer item, the true-false item, and the matching exercise. It also measures a variety of the more complex outcomes in the knowledge, understanding, and application areas. This flexibility, plus the higher-quality items usually found in the multiple-choice form, has led to its extensive use in achievement testing.
CHARACTERISTICS OF MULTIPLE-CHOICE ITEMS
A multiple-choice item consists of a problem and a list of suggested solutions. The problem may be stated as a direct question or an incomplete statement and is called the stem of the item. The list of suggested solutions may include words, numbers, symbols, or phrases, and these are called alternatives (also called choices or options). The pupil is typically requested to read the stem and the list of alternatives and to select the one correct, or best, alternative. The correct alternative in each item is called simply the answer, and the remaining alternatives are called distracters (also called decoys or foils). These incorrect alternatives receive their name from their intended function: to distract those pupils who are in doubt about the correct answer.
Whether to use a direct question or an incomplete statement in the stem depends on several factors. The direct-question form is easier to write, is more natural for younger pupils, and is more likely to present a clearly formulated problem. On the other hand, the incomplete statement is more concise, and if skillfully phrased, it too can present a well-defined problem. A common procedure is to start each stem as a direct question, shifting to the incomplete-statement form only when the clarity of the problem can be retained and greater conciseness achieved.
Examples: Direct-question form
Which one of the following cities is the capital of the Philippines?
a. Manila
b. Parañaque
c. Pasay
d. Taguig
Incomplete sentence form
The capital of the Philippines is in ______.
a. Manila
b. Parañaque
c. Pasay
d. Taguig
Examples: Best-answer type
Which one of the following factors contributed most to the selection of Manila as the capital of the Philippines?
a. Central location
b. Good climate
c. Good highways
d. Large population
The best-answer type of multiple-choice item tends to be more difficult than the correct-answer type. This is due partly to the finer discriminations called for and partly to the fact that such items are used to measure more complex learning outcomes. The best-answer type is especially useful for measuring learning outcomes that require the understanding, application, or interpretation of factual information.
USES OF MULTIPLE-CHOICE ITEMS
The multiple-choice item is the most versatile type of test item available. It can measure a variety of learning outcomes from simple to complex, and it is adaptable to most types of subject-matter content. The uses discussed here show only its function in measuring some of the more common learning outcomes in the knowledge, understanding, and application areas; more complex outcomes can be measured using modified forms of the multiple-choice item.
Measuring Knowledge Outcomes
Knowledge of Terminology. A simple but basic learning outcome measured by the multiple-choice item is knowledge of terminology. For this purpose, pupils can be requested to show their knowledge of a particular term by selecting a word that has the same meaning as the given term or by choosing a definition of the term. Special uses of a term can also be measured by having pupils identify the meaning of the term when used in context.
Knowledge of Specific Facts. Another learning outcome basic to all school subjects is knowledge of specific facts. It is important in its own right, and it provides a necessary basis for developing understanding, thinking skills, and other complex learning outcomes. Multiple-choice items designed to measure specific facts can take many different forms, but questions of the who, what, when, and where variety are most common.
Knowledge of Principles. Knowledge of principles is also an important learning outcome in most school subjects. Multiple-choice items can be constructed to measure knowledge of principles as easily as those designed to measure knowledge of specific facts. The items appear a bit more difficult, but this is because principles are more complex than isolated facts.
Knowledge of Methods and Procedures. Another common learning outcome readily adaptable to the multiple-choice form is knowledge of methods and procedures. In some cases we might want to measure knowledge of procedures before we permit pupils to practice in a particular area (e.g., laboratory procedures). In other cases, knowledge of methods and procedures may be an important learning outcome in its own right (e.g., knowledge of governmental procedures).
Measuring Outcomes at the Understanding and Application Levels
Many teachers limit the use of multiple-choice items to the knowledge area because they believe that all objective-type items are restricted to the measurement of relatively simple learning outcomes. Although this is true of most of the other types of objective items, the multiple-choice item is especially adaptable to the measurement of more complex learning outcomes.
In reviewing the following items, it is important to keep in mind that such items measure learning outcomes beyond factual knowledge only if the applications and interpretations are new to the pupils. Any specific applications or interpretations of knowledge can, of course, be taught directly to the pupils as any other fact is taught. When this is done, and the test item contains the same problem situations and solutions used in teaching, it is obvious that the pupils can be given credit for no more than mere retention of factual knowledge. To measure understanding and application, an element of novelty must be included in the test items.
Ability to Identify Applications of Facts and Principles. A common method of determining whether pupils' learning has gone beyond the mere memorization of a fact or principle is to ask them to identify its correct application in a situation that is new to the pupil.
Ability to Interpret Cause-and-Effect Relationships. Understanding can frequently be measured by asking pupils to interpret various relationships among facts. One of the most important relationships in this regard, and one common to most subject-matter areas, is the cause-and-effect relationship. Understanding of such relationships can be measured by presenting pupils with a specific cause-and-effect relationship and asking them to identify the reason that best accounts for it.
Ability to Justify Methods and Procedures. Another phase of understanding important in various subject-matter areas is concerned with methods and procedures. A pupil might know the correct method or sequence of steps in carrying out a procedure without being able to explain why it is the best method or sequence of steps. At the understanding level we are interested in the pupil's ability to justify the use of a particular method or procedure. This can be measured with multiple-choice items by asking the pupils to select the best of several possible explanations of a method or procedure.
Advantages and Limitations of Multiple-Choice Items
The multiple-choice item is one of the most widely applicable test items for measuring achievement. It can effectively measure various types of knowledge and complex learning outcomes. In addition to this flexibility, it is free from some of the shortcomings characteristic of the other item types. The ambiguity and vagueness that frequently are present in the short-answer item are avoided because the alternatives better structure the situation. The short-answer item can be answered in many different ways, but the multiple-choice item restricts the pupil's response to a specific area.
Poor: Jose Rizal was born in _______.
Better: Jose Rizal was born in
A. Cavite
B. Laguna
C. Manila
D. Quezon
One advantage of the multiple-choice item over the true-false item is that pupils
cannot receive credit for simply knowing that a statement is incorrect; they must also know
what is correct.
T F The degree to which a test measures what it purports to measure is reliability.
The degree to which a test measures what it purports to measure is
A. Objectivity
B. Reliability
C. Standardization
D. Validity
Another advantage of the multiple-choice item over the true-false item is the greater reliability per item. Because the number of alternatives is increased from two to four or five, the opportunity for guessing the correct answer is reduced, and reliability is correspondingly increased. The effect of increasing the number of alternatives for each item is similar to that of increasing the length of the test.
Using the best-answer type of multiple-choice item also circumvents a difficulty associated with the true-false item: obtaining statements that are true or false without qualification. This makes it possible to measure learning outcomes in the numerous subject-matter areas in which solutions to problems are not absolutely true or false but vary in degree of appropriateness (e.g., best method, best reason, best interpretation).
Another advantage of the multiple-choice item over the matching exercise is that the need for homogeneous material is avoided. The matching exercise, which is essentially a modified form of the multiple-choice item, requires a series of related ideas to form the list of premises and alternative responses. In many content areas it is difficult to obtain enough homogeneous material to prepare effective matching exercises.
Two other desirable characteristics of the multiple-choice item are worthy of mention. First, it is relatively free from response set; that is, pupils generally do not favor a particular alternative when they do not know the answer. Second, using a number of plausible alternatives makes the results amenable to diagnosis: the kinds of incorrect alternatives pupils select provide clues to factual errors and misunderstandings that need correction.
The wide applicability of the multiple-choice item, plus its advantages, makes it easier to construct high-quality test items in this form than in any of the other forms. This does not mean that good multiple-choice items can be constructed without effort. But for a given amount of effort, multiple-choice items will tend to be of higher quality than short-answer, true-false, or matching-type items in the same area.
Despite its superiority, the multiple-choice item does have limitations.
1. As with all other paper-and-pencil tests, it is limited to learning outcomes at the verbal level. The problems presented to pupils are verbal problems, free from the many irrelevant factors present in natural situations. Also, the applications pupils are asked to make are verbal applications, free from the personal commitment necessary for application in natural situations. In short, the multiple-choice item, like other paper-and-pencil tests, measures whether the pupil knows or understands what to do when confronted with a problem situation, but it cannot determine how