TRANSCRIPT
DataDirector: Creating Exams
Winter/Spring 2011
Stan Masters
Lenawee ISD
POP
• Purpose
– Train DataDirector users to plan, create, and administer local standards-based assessment instruments
• Objectives
– I can develop an exam blueprint
– I can choose items to add to my exam
– I can create an online key for my students
– I can identify exam reports I will use with others
• Procedures
– PowerPoint presentation
– Use of LISD DataDirector site
– Use of LISD Data Warehouse webpage
School Improvement Process
• We must utilize an inquiry approach to data analysis.
• We must use multiple sources of data.
• We need a data warehouse for our 21st century schools.
• We must focus on data to increase student achievement.
Talking Points for the Purpose of Implementing a Data Warehouse in Lenawee Schools
Source: Presentation by Dr. Victoria Bernhardt, April 2007
FERPA/HIPAA Pre-Test
You are in charge of a staff meeting to study student achievement on school improvement goals. As part of your meeting, you are showing a report to the entire staff that shows student scores on a common local assessment. The report shows the student names. In addition, you have given them a paper copy of the report.
It is a violation of FERPA to display the results of the assessment to the entire staff.
The exception would be a group of teachers working on strategies for a specific student, as they are a specific population that then has a "legitimate educational interest" in the information.
Implementing Exams with DataDirector
Adapted from St. Clair RESA
1. Existing summative and formative classroom tests not aligned with expectations
2. Classroom summative and formative tests aligned to expectations
3. Common classroom summative and formative assessments aligned to expectations
4. Common formative and summative assessments aligned to expectations and delivered online through DataDirector
Purposes of Assessments
• Assessment for learning
– Formative (monitors student progress during instruction)
– Placement (given before instruction to gather information on where to start)
– Diagnostic (helps find the underlying causes of learning problems)
– Interim (monitors student proficiency on learning targets)
• Assessment of learning
– Summative (the final task at the end of a unit, a course, or a semester)
Sources: Stiggins, Richard J., Arter, Judith A., Chappuis, Jan, and Chappuis, Stephen. Classroom Assessment for Student Learning. Portland, OR: Assessment Training Institute, Inc., 2004. Bravmann, S. L. "P-I Focus: One test doesn't fit all." Seattle Post-Intelligencer, May 2, 2004. Marshall, K. "Interim Assessments: Keys to Successful Implementation." New York: New Leaders for New Schools, 2006.
What Do We Ask Ourselves Before Building an Exam Instrument?
• What do we want to measure?
• What information do we want to gather?
• How long do we want the assessment to take?
• How will we determine which standards we want to assess?
• How will we use the results?
Guiding Principles for Building Exams
• Include a minimum of three items for every standard you want to measure.
• Spread items measuring an individual standard across the assessment.
• Include a range of difficulty across the overall assessment and within each standard measured.
• Begin the assessment with easier items, build to more difficult items, and conclude with easier items.
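A minimal sketch, not from the presentation, of how a draft exam could be checked against these principles. The Item fields, the three-item threshold, and the easy-hard-easy heuristic are illustrative assumptions:

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class Item:
    standard: str      # e.g., a GLCE/HSCE code (hypothetical field)
    difficulty: float  # 0.0 (easy) to 1.0 (hard), estimated or from field data

def _avg(items: list[Item]) -> float:
    return sum(i.difficulty for i in items) / max(len(items), 1)

def check_exam(items: list[Item], min_per_standard: int = 3) -> list[str]:
    """Return warnings for guiding principles the draft exam violates."""
    warnings = []
    # Principle: at least three items for every measured standard.
    for standard, n in Counter(i.standard for i in items).items():
        if n < min_per_standard:
            warnings.append(f"{standard}: only {n} item(s), need {min_per_standard}")
    # Principle: easier items at the start and end, harder in the middle.
    third = max(len(items) // 3, 1)
    middle = _avg(items[third:-third])
    if _avg(items[:third]) >= middle or _avg(items[-third:]) >= middle:
        warnings.append("difficulty should peak in the middle of the exam")
    return warnings
```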
Cognitive Difficulty Levels
• Level 1 - Basic Skills: requires students to recall information such as facts, definitions, terms, or simple one-step procedures.
• Level 2 - Conceptual Understanding: requires students to make some decisions about how to approach the problem or activity and may imply more than a single step.
• Level 3 - Extended Reasoning: requires students to develop a strategy to connect and relate ideas in order to solve the problem, using multiple steps and drawing upon a variety of skills.
Level 1: Basic Skills
• Support ideas by reference to details in text
• Use a dictionary to find meaning
• Identify figurative language in a passage
• Solve a one-step word problem
• Perform a specified procedure
Level 2: Conceptual Understanding
• Predict a logical outcome
• Identify and summarize main points
• Represent a situation mathematically in more than one way
• Interpret a visual representation
Level 3: Extended Reasoning
• Determine the effect of the author's purpose on text elements
• Summarize information from multiple sources
• Provide a mathematical justification
• Describe, compare, and contrast solution methods
Psychometrician's Vocabulary
• Validity
– What kind of concrete evidence can we collect that a test measures what we say it measures?
– Do the results of using the test give the consequences we expect?
• Reliability
– What is the measure of the amount of error in a test?
– What is the percent of the differences (variance) in the scores that is attributable to the trait being assessed?
Source: William H. Trochim, Research Methods Knowledge Base, www.socialresearchmethods.net/kb/rel%26val.htm, accessed 3/23/06
Reliability
Source: Ernie Bauer and Ed Roeber, "Technical Standards for Locally-Developed Assessments", April 8, 2010
• Longer tests are more reliable.
• Tests with a narrow range of content are more reliable.
• Appropriately difficult tests (neither too hard nor too easy) are more reliable.
• Clearly worded items make more reliable tests.
• Reliability can be trumped by validity.
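The presentation does not name a specific reliability statistic; Cronbach's alpha is one standard estimate of the share of score variance attributable to the trait being assessed. A minimal sketch, assuming scored items arranged as a students-by-items array:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores)."""
    k = scores.shape[1]                          # number of items
    item_vars = scores.var(axis=0, ddof=1)       # variance of each item
    total_var = scores.sum(axis=1).var(ddof=1)   # variance of students' total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Example: 4 students x 3 items, scored 0/1 (made-up data)
scores = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [0, 0, 1],
                   [0, 0, 0]])
print(round(cronbach_alpha(scores), 2))  # 0.75
```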
Validity
Source: Ernie Bauer and Ed Roeber, "Technical Standards for Locally-Developed Assessments", April 8, 2010
• The evidence we collect depends on the inference we wish to make:
– Content validity: is there evidence to show the test measures what it says it measures?
– Predictive validity: is there evidence that shows the test predicts what we want it to predict?
– Concurrent validity: does this test measure the same thing as another measure, only easier to use?
– Construct validity: does this test measure the psychological construct we are trying to measure?
Use a Blueprint to Build Validity
Source: Ernie Bauer and Ed Roeber, "Technical Standards for Locally-Developed Assessments", April 8, 2010
• A grid is used to summarize the content and format of the test.
– The rows are the learning objectives.
– The columns are the levels of cognitive complexity.
• The cells list the types and numbers of items.
• A margin row and column can be used to sum the total points.
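A hypothetical sketch of such a grid; the objective codes and item counts are made up, but the rows, columns, cells, and margin totals follow the description above:

```python
# Rows = learning objectives, columns = cognitive levels, cells = item counts.
levels = ["Level 1", "Level 2", "Level 3"]
blueprint = {                       # objective codes here are invented examples
    "R.NT.04.03": {"Level 1": 2, "Level 2": 1, "Level 3": 0},
    "R.CM.04.01": {"Level 1": 1, "Level 2": 2, "Level 3": 1},
}

print("  ".join(f"{h:>10}" for h in ["Objective", *levels, "Total"]))
for objective, cells in blueprint.items():
    row = [cells[lvl] for lvl in levels]
    print("  ".join(f"{v:>10}" for v in [objective, *row, sum(row)]))  # row margin
col_totals = [sum(blueprint[o][lvl] for o in blueprint) for lvl in levels]
print("  ".join(f"{v:>10}" for v in ["Total", *col_totals, sum(col_totals)]))  # column margin
```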
Reverse Design
Source: Ernie Bauer and Jim Gullen, "Locally Developed Assessment: What Do the Results Mean?", April 2010
• Start with your items.
• Group the items by content (GLCE/HSCE).
– This will give you the rows for the blueprint.
• Split the items in the content piles by DOK.
– This will give you the columns for the blueprint.
• Create a blueprint document to check whether this provides acceptable evidence for you.
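A minimal sketch of these reverse-design steps; the item IDs and expectation codes are invented:

```python
from collections import defaultdict

# (item_id, content expectation, DOK level) -- sample values are made up
items = [
    ("Q1", "USHG.6.1.1", 1), ("Q2", "USHG.6.1.1", 2),
    ("Q3", "USHG.6.2.3", 1), ("Q4", "USHG.6.2.3", 3),
]

# Group by expectation (blueprint rows), then split each pile by DOK (columns).
piles = defaultdict(lambda: defaultdict(list))
for item_id, expectation, dok in items:
    piles[expectation][dok].append(item_id)

for expectation, by_dok in sorted(piles.items()):
    for dok, ids in sorted(by_dok.items()):
        print(f"{expectation}  DOK {dok}: {len(ids)} item(s) {ids}")
```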
Naming Conventions
• Assessments and Exams
– School Year (e.g., 2008-2009)
– Name of Course/Grade (e.g., US History and Geography or 4th Grade)
– Name of Assessment or Exam (e.g., World War II, EXPLORE, or Dolch Words)
• You may also identify the timing of the assessment (e.g., Beginning/Middle/End, Fall/Spring, or Pre/Post)
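A small illustrative helper that assembles names following this convention; the separator and the optional timing handling are assumptions, not a DataDirector requirement:

```python
def exam_name(school_year: str, course: str, exam: str, timing: str = "") -> str:
    """Compose '<year> - <course/grade> - <exam>[ - <timing>]'."""
    parts = [school_year, course, exam]
    if timing:  # e.g., "Fall", "Spring", "Pre", "Post"
        parts.append(timing)
    return " - ".join(parts)

print(exam_name("2008-2009", "US History and Geography", "World War II"))
# 2008-2009 - US History and Geography - World War II
print(exam_name("2008-2009", "4th Grade", "Dolch Words", timing="Fall"))
# 2008-2009 - 4th Grade - Dolch Words - Fall
```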
Questions?
Stan Masters
Coordinator of Instructional Data Services
Lenawee Intermediate School District
2946 Sutton Road
Adrian, Michigan 49221
517-265-1606 (phone)
517-263-7079 (fax)
[email protected]
Data Warehouse webpage: https://webapps.lisd.us/sites/Curriculum/Pages/DataWarehousing.aspx