Assessment
The ongoing process of gathering, analysing and reflecting on evidence to make informed and consistent judgements with the goal of improving student outcomes.
Gathering, analysing and reflecting on evidence to make informed judgements
within a targeted outcome area
Find out what students know
Identify students’ learning needs
Plan teaching programs
Select candidates for programs, scholarships
Monitor effectiveness of interventions
Monitor impact of policy
Report to parents
Report to governments
Assessment must be technically adequate
Assessment must be targeted toward the right level of difficulty, so that all students have opportunity to demonstrate what they know, think, or can do.
Assessment should be based on a variety of different measures to cater for learner differences.
Assessment should be ongoing rather than episodic, and should provide a meaningful basis for feedback and reflection.
Summative assessment
Given periodically to determine, at a particular point in time, what students know and do not know.
◦ Occurs at the end of a unit of learning.
◦ Determines at a point in time what students know and can do.
◦ Used for reporting against standards.
◦ Used for entry (e.g. to university)
Selected Response
◦ Multiple Choice
◦ True/False
◦ Matching
◦ Fill-in
Extended Written Response
Performance Assessment
Assessment of practical or laboratory work.
Oral examinations
Short answer
Portfolio
Formative assessment
Is generally seen as process-oriented. Although the information gleaned from summative assessments is important, it can only help in evaluating certain aspects of the learning process.
Can provide the information needed to adjust teaching and learning while they are happening.
Often involves students in the formative assessment process, both as assessors of their own learning and as resources to other students.
Examples: questions, classroom discussions, learning activities, feedback, conferences, interviews, student self-assessment.
1. Identification by teachers & learners of learning goals, intentions or outcomes and criteria for achieving these.
2. Rich conversations between teachers & students that continually build and deepen.
3. Provision of effective, timely feedback to enable students to advance their learning.
4. Active involvement of students in their own learning.
5. Teachers responding to identified learning needs and strengths by modifying their teaching approach(es).
Black & Wiliam, 1998
http://www.youtube.com/watch?v=pJ7v8TtAx8o
BALANCED CLASSROOM ASSESSMENT SYSTEM
FORMATIVE ASSESSMENT: A process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning to help students improve their achievement of intended instructional outcomes.
SUMMATIVE ASSESSMENT: A tool used after instruction to measure student achievement which provides evidence of student competence or program effectiveness.
Diagnostic assessment is seen by some as a component of formative assessment, but in general it is seen as a distinct form.
In practice, the purpose of diagnostic assessment is to ascertain, prior to instruction, each student’s strengths, weaknesses, knowledge, and skills. Establishing these permits the instructor to remediate students and adjust the curriculum to meet each pupil’s unique needs.
Because the primary purpose of the diagnostic test is remediation, it is both un-graded and low-stakes.
Assessment FOR learning: used by teachers to inform their teaching (formative assessment)
Assessment AS learning: students monitor their progress to inform their learning goals
Assessment OF learning: teachers use evidence of student learning to make judgments on student achievement against goals and standards (summative assessment)
Norm-referenced assessment (contrasted with criterion-referenced assessment in the sketch following this list):
◦ Expectation of variability
◦ Marks allocated and a norm calculated
◦ Marks often statistically manipulated
◦ Basis for comparison between students
◦ Reports give marks and class position
Criterion-referenced assessment:
◦ Performance criteria established for each desired outcome
◦ Most students expected to achieve minimum criteria
◦ Can be associated with pass/fail judgments (rather than marks)
◦ No direct comparisons between students
◦ Reports indicate the number of outcomes achieved by students
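To make the contrast concrete, here is a minimal, hypothetical sketch; the names, marks and the pass mark of 60 are invented, not from the lecture. It reads the same set of marks two ways: against a fixed criterion (criterion-referenced) and against the group (norm-referenced, via rank and a z-score as a simple statistical manipulation of marks).

# Hypothetical sketch: the same raw marks interpreted two ways.
# Names, marks and the pass mark are illustrative only.
marks = {"Ava": 72, "Ben": 58, "Chloe": 91, "Dev": 65, "Ella": 49}

# Criterion-referenced: compare each mark against a fixed performance
# criterion (an assumed pass mark of 60); no comparison between students.
PASS_MARK = 60
criterion = {name: ("achieved" if m >= PASS_MARK else "not yet achieved")
             for name, m in marks.items()}

# Norm-referenced: position each student relative to the group,
# e.g. rank order and a z-score.
mean = sum(marks.values()) / len(marks)
sd = (sum((m - mean) ** 2 for m in marks.values()) / len(marks)) ** 0.5
ranked = sorted(marks, key=marks.get, reverse=True)
norm = {name: {"rank": ranked.index(name) + 1,
               "z": round((marks[name] - mean) / sd, 2)}
        for name in marks}

print(criterion)
print(norm)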
Score – Description
5 – Demonstrates excellent understanding of the problem. All requirements of the task are included in the response.
4 – Demonstrates very good understanding of the problem. All requirements of the task are included.
3 – Demonstrates adequate understanding of the problem. Most requirements of the task are included.
2 – Demonstrates poor understanding of the problem. Many requirements of the task are missing.
1 – Demonstrates no understanding of the problem.
0 – No response / task not attempted.
Sample scoring for the history question: What caused World War II?
Student #1: "WWII was caused by Hitler and Germany invading Poland."
◦ Criterion-referenced assessment: This answer is correct.
◦ Norm-referenced assessment: This answer is worse than Student #2's answer, but better than Student #3's answer.
Student #2: "WWII was caused by multiple factors, including the Great Depression and the general economic situation, the rise of nationalism, fascism, and imperialist expansionism, and unresolved resentments related to WWI. The war in Europe began with the German invasion of Poland."
◦ Criterion-referenced assessment: This answer is correct.
◦ Norm-referenced assessment: This answer is better than Student #1's and Student #3's answers.
Student #3: "WWII was caused by the assassination of Archduke Ferdinand."
◦ Criterion-referenced assessment: This answer is wrong.
◦ Norm-referenced assessment: This answer is worse than Student #1's and Student #2's answers.
Behaviourist approach:
◦ state the specific task
◦ teach the specific task
◦ test the specific task
Assumes learning is linear.
Suited to criterion-referenced assessment.
Cognitive approach:
Assumes active involvement of students in making meaning through thinking, reasoning and engaging (constructive).
◦ Deals with complex learning outcomes
◦ Assessments of these need an extended period of time
◦ Assessments require a meaningful context
Sometimes called ‘authentic’ assessment
‘Authentic’ Assessment
Presents students with 'real-world' challenges requiring them to apply relevant skills and knowledge.
◦ Elicits higher-order thinking in addition to basic skills
◦ Allows for the possibility of multiple human judgments
Which of the following indicate a behaviourist approach, and which a cognitive approach?
◦ Paper-and-pen tests
◦ Questionnaires, scales
◦ Portfolios
◦ Projects
◦ Performances
◦ Self- and peer-assessment
◦ Student Journals
Good assessment:
1. is integral to instructional design
2. is fair (free from biases)
3. is technically adequate
4. has clear purpose, goals, standards and criteria
5. attends to student outcomes and processes, recognizing how students think and learn
6. is well targeted to allow students to show what they know and can do
7. uses a range of measures to cater for learner differences
8. is ongoing rather than episodic
9. provides feedback to the learner
10. informs the teacher what to teach next
Level 1 Knowledge
◦ Recall of specifics and universals
◦ Recall of methods and processes
◦ Recall of pattern, structure, setting
Level 2 Comprehension
◦ Lowest level of understanding
◦ Knowing what is being communicated and using the material without relating this to other material or seeing its fuller implications
Level 3 Application
◦ The use of abstractions in particular and concrete situations
◦ Abstractions can be in the form of general ideas, rules of procedure, or generalized methods
◦ Abstractions can be technical principles, ideas, and theories which must be remembered and applied
Level 4 Analysis
◦ Breakdown of material into its elements to show the relative hierarchy of ideas and/or relations between ideas
◦ Such analyses clarify the material and indicate how it is organized
Level 5 Synthesis
◦ Putting together elements to form a whole, arranging and combining them to constitute a pattern or structure not clearly seen before
Level 6 Evaluation
◦ Judgements about the value of material for given purposes
◦ Quantitative and qualitative judgements about the extent to which material and methods satisfy criteria
Emphasis on higher-order thinking
Bloom’s Taxonomy of Cognitive Objectives (1950s) expresses qualitatively different kinds of thinking
One of the most well-used models for classroom assessment
Revised taxonomy: Lorin Anderson (1990s)
Names of the six major categories changed from nouns to verbs to reflect an emphasis on thinking as an active process.
Original terms (highest to lowest): Evaluation, Synthesis, Analysis, Application, Comprehension, Knowledge
New terms (highest to lowest): Creating, Evaluating, Analyzing, Applying, Understanding, Remembering
Reliability
Consistency or stability with which a test measures what it is intended to measure
A property of the test
All tests are imperfect at estimating the qualities or skills they are trying to measure.
◦ The score each student receives always includes some error.
◦ The more reliable a test, the less error in the score actually obtained.
Observed score = true score + random error
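As a rough illustration of this decomposition, the sketch below simulates observed scores as a true score plus random error; under classical test theory, reliability can then be read as the proportion of observed-score variance attributable to true scores. The simulation and its parameters are invented for illustration, not part of the slides.

import random

random.seed(1)

# Simulate observed = true score + random error.
# The number of students and the spreads of true scores and error are assumptions.
true_scores = [random.gauss(50, 10) for _ in range(1000)]
observed = [t + random.gauss(0, 5) for t in true_scores]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Reliability = true-score variance / observed-score variance.
# The smaller the error variance, the closer this ratio is to 1.
reliability = variance(true_scores) / variance(observed)
print(round(reliability, 2))  # roughly 10**2 / (10**2 + 5**2) = 0.8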
Common sources of measurement error:
◦ Inconsistencies across testing occasions
◦ Inconsistencies across forms of the test
◦ Inconsistencies between raters
◦ Inconsistencies in sampling of the content domain
Standardised tests take into consideration, and estimate, how much students' scores would probably vary if they were tested repeatedly.
◦ The standard deviation of the distribution of scores from hypothesised repeated testing
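This quantity is commonly reported as the standard error of measurement (SEM). A small worked example is sketched below, assuming the standard textbook formula SEM = SD × √(1 − reliability); the numbers are invented.

# Standard error of measurement (SEM): the spread of scores a student would be
# expected to obtain over hypothetical repeated testings.
# Formula assumed: SEM = SD * sqrt(1 - reliability). Numbers are illustrative.
sd_of_test = 12.0      # spread of scores on the test
reliability = 0.91     # e.g. from a reliability study

sem = sd_of_test * (1 - reliability) ** 0.5
print(round(sem, 1))   # about 3.6

# A rough one-SEM band around an observed score of 75:
observed = 75
print((round(observed - sem, 1), round(observed + sem, 1)))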
Validity
Whether the assessment measures what it is supposed to be measuring
Is a property of test scores, not the test itself – depends on the person and situation
◦ A test may be valid for one purpose but not for another
Is a matter of degree
Evidence for validity – content related, criterion related, construct related
Content-related validity: the extent to which the sample of items, tasks or questions on an assessment is representative of some defined domain of content
Approaches to establishing content-related validity:
◦ Domain sampling, relevance, clarity
◦ Logical analysis of test content – e.g. Bloom's Taxonomy
◦ Examining test content and format
Criterion-related validity: the extent to which scores are systematically related to one or more outcome criteria
Approaches to establishing criterion-related validity:
◦ Predictive validity – does the score correlate highly with performance later? (see the correlation sketch after this list)
◦ Concurrent validity – does it correlate with a test known to measure the assessment area?
◦ Face validity – does evidence show the test is assessing according to decision purposes?
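A minimal sketch of how predictive or concurrent validity evidence is often summarised, as a correlation between test scores and a criterion measure; the data and helper function are invented for illustration.

# Illustrative only: correlate assessment scores with a later criterion
# (predictive validity) or with an established test taken at about the same
# time (concurrent validity). Data are invented.
test_scores = [55, 62, 70, 48, 81, 66, 74, 59]
criterion   = [58, 65, 72, 50, 85, 60, 78, 61]   # e.g. later performance

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# A higher correlation is stronger criterion-related validity evidence.
print(round(pearson_r(test_scores, criterion), 2))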
Construct-related validity: the extent to which the assessment measures the identified underlying psychological characteristic of interest
Approaches to establishing construct-related validity:
◦ Explicating the construct's meaning
◦ Convergent evidence
◦ Divergent evidence
◦ Deriving and testing predictions about test performance from the underlying theory
A test must be reliable to be valid, BUT a reliable test is not always valid
Differences in the extent to which the assessee has had the opportunity to know and become familiar with the specific subject matter or specific processes required by the test item
Distorts the performance of a group – either for better or worse
Test content and characteristics
Test takers
Test environment
Test usage
Examining bias is a matter of examining the validity of an assessment across groups
Bias/Fairness
Distribution of difficulty within the assessment
◦ e.g. mapped against Bloom's Taxonomy
Sources of difficulty in assessment items (a simple item-analysis sketch follows the list below):
◦ Construct-relevant / construct-irrelevant difficulty
◦ Subject or concept difficulty
◦ Process difficulty
◦ Question or stimulus difficulty
◦ Negations
◦ Referential vocabulary
◦ Sentence and paragraph lengths
◦ Abstraction of text
◦ Location of relevant text
◦ Problem complexity
◦ Novelty
◦ Item placement in the test
◦ Closeness of the best distractors to the correct answer
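One way the distribution of difficulty across items is often examined (a common technique, not something the slides prescribe) is a simple item analysis: the proportion of students answering each item correctly, and how well each item separates higher- and lower-scoring students. A hedged sketch on invented response data:

# Illustrative item analysis on invented data: rows are students,
# columns are items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]

n_students = len(responses)
n_items = len(responses[0])

# Item difficulty: proportion of students answering the item correctly
# (a high value means an easy item, a low value a hard one).
difficulty = [sum(row[i] for row in responses) / n_students for i in range(n_items)]

# Crude discrimination: difference in proportion correct between the top half
# and bottom half of students ranked by total score.
ranked = sorted(responses, key=sum, reverse=True)
half = n_students // 2
top, bottom = ranked[:half], ranked[half:]
discrimination = [
    sum(r[i] for r in top) / len(top) - sum(r[i] for r in bottom) / len(bottom)
    for i in range(n_items)
]

print([round(d, 2) for d in difficulty])
print([round(d, 2) for d in discrimination])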
What piece of laboratory equipment is best-suited for accurately measuring the volume of a liquid?
a) graduated cylinder
b) beaker
c) Erlenmeyer flask
d) more than one of the above
Which piece of laboratory equipment can be used to store chemicals for long periods of time?
a) buret
b) evaporating dish
c) beaker
d) more than one of the above
Which of the following is the most appropriate unit for expressing the weight of a pencil?
a) pounds
b) ounces
c) quarts
d) pints
e) tons
Due to budget cutbacks, the university library now subscribes to fewer than _?_ periodicals.
a) 25,000
b) 20,000
c) 15,000
d) 10,000
Reliability
Does assessment accurately reflect student’s achievements?
Does moderation reveal consistency between markers in student grading?
Have criteria for assessment been applied the same way by different markers?
Validity
Does assessment measure what it was designed to measure?
Is assessment sufficiently challenging, engaging and relevant to students?
Does assessment provide sufficiently broad evidence –have different types of evidence been considered?
Fairness
Are students familiar with formats and expectations of assessment task?
Do assessment tasks favour one group over another?
Have learning activities prior to testing sufficiently prepared students for the assessment?
There are many factors that should be considered when designing assessments, including:
◦ the amount of assessment
◦ types of assessment
◦ how to assess
◦ how to ensure assessment is truly representative of ability
◦ analysis of data obtained from assessment
Woolfolk, A., & Margetts, K. (2013). Educational psychology (3rd ed.). Australia: Pearson Education.
Marsh, C. J. (2010). Becoming a teacher: Knowledge, skills and issues (5th ed.). Australia: Pearson Education.
Chapman, E. (2013). Technical adequacy. Approaches to student assessment. Presented at the University of Western Australia, Perth, Western Australia.
NC TEACH (2010). Assessment: Formative & summative practices for the classroom. Retrieved from uncw.edu/ed/ncteach/cohort3/documents/ASSESSMENT.ppt
Educational app (2011). Formative feedback. Retrieved from http://www.youtube.com/watch?v=pJ7v8TtAx8o
Mr Bean (2007). The exam. Retrieved from http://www.youtube.com/watch?v=Ocd1D8fwdjU