Assessment
The ongoing process of gathering, analysing and reflecting on evidence to make informed and consistent judgements with the goal of improving student outcomes.
Gathering, analysing and reflecting on evidence to make informed judgements
within a targeted outcome area
Find out what students know
Identify students’ learning needs
Plan teaching programs
Select candidates for programs, scholarships
Monitor effectiveness of interventions
Monitor impact of policy
Report to parents
Report to governments
Assessment must be technically adequate
Assessment must be targeted toward the right level of difficulty, so that all students have opportunity to demonstrate what they know, think, or can do.
Assessment should be based on a variety of different measures to cater for learner differences.
Assessment should be ongoing rather than episodic, and should provide a meaningful basis for feedback and reflection.
Summative assessment
Given periodically to determine, at a particular point in time, what students know and do not know.
◦ Occurs at the end of a unit of learning.
◦ Determines at a point in time what students know and can do.
◦ Used for reporting against standards.
◦ Used for entry (e.g. to university)
Selected Response
◦ Multiple Choice
◦ True/False
◦ Matching
◦ Fill-in
Extended Written Response
Performance Assessment
Assessment of practical or laboratory work.
Oral examinations
Short answer
Portfolio
Formative assessment
Is generally seen as process-oriented. Although the information gleaned from summative assessments is important, it can only help in evaluating certain aspects of the learning process.
Can provide the information needed to adjust teaching and learning while they are happening.
Often involves students in the formative assessment process, both as assessors of their own learning and as resources to other students.
Examples: questions, classroom discussions, learning activities, feedback, conferences, interviews, student self-assessment.
1. Identification by teachers & learners of learning goals, intentions or outcomes and criteria for achieving these.
2. Rich conversations between teachers & students that continually build and deepen.
3. Provision of effective, timely feedback to enable students to advance their learning.
4. Active involvement of students in their own learning.
5. Teachers responding to identified learning needs and strengths by modifying their teaching approach(es).
Black & Wiliam, 1998
http://www.youtube.com/watch?v=pJ7v8TtAx8o
BALANCED CLASSROOM ASSESSMENT SYSTEM
FORMATIVE ASSESSMENT: A process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning to help students improve their achievement of intended instructional outcomes.
SUMMATIVE ASSESSMENT: A tool used after instruction to measure student achievement which provides evidence of student competence or program effectiveness.
Diagnostic assessment is seen by some as a component of formative assessment, but in general it is seen as a distinct form.
In practice, the purpose of diagnostic assessment is to ascertain, prior to instruction, each student’s strengths, weaknesses, knowledge, and skills. Establishing these permits the instructor to remediate students and adjust the curriculum to meet each pupil’s unique needs.
Because the primary purpose of the diagnostic test is remediation, it is both un-graded and low-stakes.
Assessment FOR learning: used by teachers to inform their teaching (formative assessment)
Assessment AS learning: students monitor their progress to inform their learning goals
Assessment OF learning: teachers use evidence of student learning to make judgments on student achievement against goals and standards (summative assessment)
Norm-referenced assessment (contrasted with criterion-referenced assessment in the sketch following this list):
◦ Expectation of variability
◦ Marks allocated and a norm calculated
◦ Marks often statistically manipulated
◦ Basis for comparison between students
◦ Reports give marks and class position
Criterion-referenced assessment:
◦ Performance criteria established for each desired outcome
◦ Most students expected to achieve minimum criteria
◦ Can be associated with pass/fail judgments (rather than marks)
◦ No direct comparisons between students
◦ Reports indicate the number of outcomes achieved by students
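To make the contrast concrete, here is a minimal, hypothetical sketch; the names, marks and the pass mark of 60 are invented, not from the lecture. It reads the same set of marks two ways: against a fixed criterion (criterion-referenced) and against the group (norm-referenced, via rank and a z-score as a simple statistical manipulation of marks).

# Hypothetical sketch: the same raw marks interpreted two ways.
# Names, marks and the pass mark are illustrative only.
marks = {"Ava": 72, "Ben": 58, "Chloe": 91, "Dev": 65, "Ella": 49}

# Criterion-referenced: compare each mark against a fixed performance
# criterion (an assumed pass mark of 60); no comparison between students.
PASS_MARK = 60
criterion = {name: ("achieved" if m >= PASS_MARK else "not yet achieved")
             for name, m in marks.items()}

# Norm-referenced: position each student relative to the group,
# e.g. rank order and a z-score.
mean = sum(marks.values()) / len(marks)
sd = (sum((m - mean) ** 2 for m in marks.values()) / len(marks)) ** 0.5
ranked = sorted(marks, key=marks.get, reverse=True)
norm = {name: {"rank": ranked.index(name) + 1,
               "z": round((marks[name] - mean) / sd, 2)}
        for name in marks}

print(criterion)
print(norm)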
Score – Description
5 – Demonstrates excellent understanding of the problem. All requirements of the task are included in the response.
4 – Demonstrates very good understanding of the problem. All requirements of the task are included.
3 – Demonstrates adequate understanding of the problem. Most requirements of the task are included.
2 – Demonstrates poor understanding of the problem. Many requirements of the task are missing.
1 – Demonstrates no understanding of the problem.
0 – No response / task not attempted.
Sample scoring for the history question: What caused World War II?
Student #1: "WWII was caused by Hitler and Germany invading Poland."
◦ Criterion-referenced assessment: This answer is correct.
◦ Norm-referenced assessment: This answer is worse than Student #2's answer, but better than Student #3's answer.
Student #2: "WWII was caused by multiple factors, including the Great Depression and the general economic situation, the rise of nationalism, fascism, and imperialist expansionism, and unresolved resentments related to WWI. The war in Europe began with the German invasion of Poland."
◦ Criterion-referenced assessment: This answer is correct.
◦ Norm-referenced assessment: This answer is better than Student #1's and Student #3's answers.
Student #3: "WWII was caused by the assassination of Archduke Ferdinand."
◦ Criterion-referenced assessment: This answer is wrong.
◦ Norm-referenced assessment: This answer is worse than Student #1's and Student #2's answers.
Behaviourist approach:
◦ state the specific task
◦ teach the specific task
◦ test the specific task
Assumes learning is linear.
Suited to criterion-referenced assessment.
Cognitive approach:
Assumes active involvement of students in making meaning through thinking, reasoning and engaging (constructive).
◦ Deals with complex learning outcomes
◦ Assessments of these need an extended period of time
◦ Assessments require a meaningful context
Sometimes called ‘authentic’ assessment
‘Authentic’ Assessment
Presents students with 'real-world' challenges requiring them to apply relevant skills and knowledge.
◦ Elicits higher-order thinking in addition to basic skills
◦ Allows for the possibility of multiple human judgments
Which of the following indicate a behaviourist approach, and which a cognitive approach?
◦ Paper-and-pen tests
◦ Questionnaires, scales
◦ Portfolios
◦ Projects
◦ Performances
◦ Self- and peer-assessment
◦ Student Journals
Good assessment:
1. is integral to instructional design
2. is fair (free from biases)
3. is technically adequate
4. has clear purpose, goals, standards and criteria
5. attends to student outcomes and processes, recognizing how students think and learn
6. is well targeted to allow students to show what they know and can do
7. uses a range of measures to cater for learner differences
8. is ongoing rather than episodic
9. provides feedback to the learner
10. informs the teacher what to teach next
Level 1 Knowledge
◦ Recall of specifics and universals
◦ Recall of methods and processes
◦ Recall of pattern, structure, setting
Level 2 Comprehension
◦ Lowest level of understanding
◦ Knowing what is being communicated and using the material without relating this to other material or seeing its fuller implications
Level 3 Application
◦ The use of abstractions in particular and concrete situations
◦ Abstractions can be in the form of general ideas, rules of procedure, or generalized methods
◦ Abstractions can be technical principles, ideas, and theories which must be remembered and applied
Level 4 Analysis
◦ Breakdown of material into its elements to show the relative hierarchy of ideas and/or relations between ideas
◦ Such analyses clarify the material and indicate how it is organized
Level 5 Synthesis
◦ Putting together elements to form a whole, arranging and combining them to constitute a pattern or structure not clearly seen before
Level 6 Evaluation
◦ Judgements about the value of material for given purposes
◦ Quantitative and qualitative judgements about the extent to which material and methods satisfy criteria
Emphasis on higher-order thinking
Bloom’s Taxonomy of Cognitive Objectives (1950s) expresses qualitatively different kinds of thinking
One of the most well-used models for classroom assessment
Revised taxonomy: Lorin Anderson (1990s)
Names of the six major categories changed from nouns to verbs to reflect an emphasis on thinking as an active process.
Original terms (highest to lowest): Evaluation, Synthesis, Analysis, Application, Comprehension, Knowledge
New terms (highest to lowest): Creating, Evaluating, Analyzing, Applying, Understanding, Remembering
Reliability
Consistency or stability with which a test measures what it is intended to measure
A property of the test
All tests are imperfect at estimating the qualities or skills they are trying to measure.
◦ The score each student receives always includes some error.
◦ The more reliable a test, the less error in the score actually obtained.
Observed score = true score + random error
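As a rough illustration of this decomposition, the sketch below simulates observed scores as a true score plus random error; under classical test theory, reliability can then be read as the proportion of observed-score variance attributable to true scores. The simulation and its parameters are invented for illustration, not part of the slides.

import random

random.seed(1)

# Simulate observed = true score + random error.
# The number of students and the spreads of true scores and error are assumptions.
true_scores = [random.gauss(50, 10) for _ in range(1000)]
observed = [t + random.gauss(0, 5) for t in true_scores]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Reliability = true-score variance / observed-score variance.
# The smaller the error variance, the closer this ratio is to 1.
reliability = variance(true_scores) / variance(observed)
print(round(reliability, 2))  # roughly 10**2 / (10**2 + 5**2) = 0.8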
Common sources of measurement error:
◦ Inconsistencies across testing occasions
◦ Inconsistencies across forms of the test
◦ Inconsistencies between raters
◦ Inconsistencies in sampling of the content domain
Standardised tests take into consideration, and estimate, how much students' scores would probably vary if they were tested repeatedly.
◦ The standard deviation of the distribution of scores from hypothesised repeated testing
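This quantity is commonly reported as the standard error of measurement (SEM). A small worked example is sketched below, assuming the standard textbook formula SEM = SD × √(1 − reliability); the numbers are invented.

# Standard error of measurement (SEM): the spread of scores a student would be
# expected to obtain over hypothetical repeated testings.
# Formula assumed: SEM = SD * sqrt(1 - reliability). Numbers are illustrative.
sd_of_test = 12.0      # spread of scores on the test
reliability = 0.91     # e.g. from a reliability study

sem = sd_of_test * (1 - reliability) ** 0.5
print(round(sem, 1))   # about 3.6

# A rough one-SEM band around an observed score of 75:
observed = 75
print((round(observed - sem, 1), round(observed + sem, 1)))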
Validity
Whether the assessment measures what it is supposed to be measuring
Is a property of test scores, not the test itself – depends on the person and situation
◦ A test may be valid for one purpose but not for another
Is a matter of degree
Evidence for validity – content related, criterion related, construct related
Content-related validity: the extent to which the sample of items, tasks or questions on an assessment is representative of some defined domain of content
Approaches to establishing content-related validity:
◦ Domain sampling, relevance, clarity
◦ Logical analysis of test content – e.g. Bloom's Taxonomy
◦ Examining test content and format
Criterion-related validity: the extent to which scores are systematically related to one or more outcome criteria
Approaches to establishing criterion-related validity:
◦ Predictive validity – does the score correlate highly with performance later? (see the correlation sketch after this list)
◦ Concurrent validity – does it correlate with a test known to measure the assessment area?
◦ Face validity – does evidence show the test is assessing according to decision purposes?
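A minimal sketch of how predictive or concurrent validity evidence is often summarised, as a correlation between test scores and a criterion measure; the data and helper function are invented for illustration.

# Illustrative only: correlate assessment scores with a later criterion
# (predictive validity) or with an established test taken at about the same
# time (concurrent validity). Data are invented.
test_scores = [55, 62, 70, 48, 81, 66, 74, 59]
criterion   = [58, 65, 72, 50, 85, 60, 78, 61]   # e.g. later performance

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# A higher correlation is stronger criterion-related validity evidence.
print(round(pearson_r(test_scores, criterion), 2))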
Construct-related validity: the extent to which the assessment measures the identified underlying psychological characteristic of interest
Approaches to establishing construct-related validity:
◦ Explicating the construct's meaning
◦ Convergent evidence
◦ Divergent evidence
◦ Deriving and testing predictions about test performance from the underlying theory
A test must be reliable to be valid, BUT a reliable test is not always valid
Differences in the extent to which the assessee has had the opportunity to know and become familiar with the specific subject matter or specific processes required by the test item
Distorts the performance of a group – either for better or worse
Test content and characteristics
Test takers
Test environment
Test usage
Examining bias is a matter of examining the validity of an assessment across groups
Bias/Fairness
Distribution of difficulty within the assessment
◦ e.g. mapped against Bloom's Taxonomy
Sources of difficulty in assessment items (a simple item-analysis sketch follows the list below):
◦ Construct-relevant / construct-irrelevant difficulty
◦ Subject or concept difficulty
◦ Process difficulty
◦ Question or stimulus difficulty
◦ Negations
◦ Referential vocabulary
◦ Sentence and paragraph lengths
◦ Abstraction of text
◦ Location of relevant text
◦ Problem complexity
◦ Novelty
◦ Item placement in the test
◦ Closeness of the best distractors to the correct answer
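One way the distribution of difficulty across items is often examined (a common technique, not something the slides prescribe) is a simple item analysis: the proportion of students answering each item correctly, and how well each item separates higher- and lower-scoring students. A hedged sketch on invented response data:

# Illustrative item analysis on invented data: rows are students,
# columns are items, 1 = correct, 0 = incorrect.
responses = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]

n_students = len(responses)
n_items = len(responses[0])

# Item difficulty: proportion of students answering the item correctly
# (a high value means an easy item, a low value a hard one).
difficulty = [sum(row[i] for row in responses) / n_students for i in range(n_items)]

# Crude discrimination: difference in proportion correct between the top half
# and bottom half of students ranked by total score.
ranked = sorted(responses, key=sum, reverse=True)
half = n_students // 2
top, bottom = ranked[:half], ranked[half:]
discrimination = [
    sum(r[i] for r in top) / len(top) - sum(r[i] for r in bottom) / len(bottom)
    for i in range(n_items)
]

print([round(d, 2) for d in difficulty])
print([round(d, 2) for d in discrimination])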
What piece of laboratory equipment is best-suited for accurately measuring the volume of a liquid?
a) graduated cylinder
b) beaker
c) Erlenmeyer flask
d) more than one of the above
Which piece of laboratory equipment can be used to store chemicals for long periods of time?
a) buret
b) evaporating dish
c) beaker
d) more than one of the above
Which of the following is the most appropriate unit for expressing the weight of a pencil?
a) pounds
b) ounces
c) quarts
d) pints
e) tons
Due to budget cutbacks, the university library now subscribes to fewer than _?_ periodicals.
a) 25,000
b) 20,000
c) 15,000
d) 10,000
Reliability
Does assessment accurately reflect student’s achievements?
Does moderation reveal consistency between markers in student grading?
Have criteria for assessment been applied the same way by different markers?
Validity
Does assessment measure what it was designed to measure?
Is assessment sufficiently challenging, engaging and relevant to students?
Does assessment provide sufficiently broad evidence –have different types of evidence been considered?
Fairness
Are students familiar with formats and expectations of assessment task?
Do assessment tasks favour one group over another?
Have learning activities prior to testing sufficiently prepared students for the assessment?
There are many factors that should be considered when designing assessments, including:
◦ the amount of assessment
◦ types of assessment
◦ how to assess
◦ how to ensure assessment is truly representative of ability
◦ analysis of data obtained from assessment
Woolfolk, A., & Margetts, K. (2013). Educational psychology (3rd ed.). Australia: Pearson Education.
Marsh, C. J. (2010). Becoming a teacher: Knowledge, skills and issues (5th ed.). Australia: Pearson Education.
Chapman, E. (2013). Technical adequacy. Approaches to student assessment. Presented at the University of Western Australia, Perth, Western Australia.
NC TEACH (2010). Assessment: Formative & summative practices for the classroom. Retrieved from uncw.edu/ed/ncteach/cohort3/documents/ASSESSMENT.ppt
Educational app (2011). Formative feedback. Retrieved from http://www.youtube.com/watch?v=pJ7v8TtAx8o
Mr Bean (2007). The exam. Retrieved from http://www.youtube.com/watch?v=Ocd1D8fwdjU