
1

Children First Intensive

Mathematics 2009: Getting Sophisticated about Testing

January 29, 2009

ESO Networks 14 (Bob Cohen) and 19 (Vera Barone)

Facilitators: Deena Abu-Lughod, SAF; Freddie Capshaw, LIM

Contact us at [email protected] or [email protected]

Electronic copies available at http://dabulughod.wikispaces.com

2

School-Based Round Table Facilitators

PS 71 – Ana Martinez, LeeAnne Russian-Wells

MS 211 – Tracy Fray Oliver

PS/MS 4 – Tanica Brathwaite

PS 24 – Mary Schenk and Loretta Carbone

PS 86 – Rosanna Monaco, Ursula Smalls

PS/MS 189 – Archangelo Joseph

MS/HS 368 – Mary Anne Sheppard

3

Agenda

8:30 Rationale, Learning Intentions, Activities and Opening Inquiry

9:00 Sponsor Presentation: Neufeld Learning

9:40 Data Drill Down: From Progress and Performance, to Current Needs, to Performance Indicators (Deena Abu-Lughod)

10:30 Break

10:45 Deconstructing an indicator and constructing a learning progression (Freddie Capshaw)

11:15 Distinguishing Questions: Group Work

12:00 Give One, Get One

12:15 Lunch

1:00 Distinguishing Questions Report Out

1:20 Round Tables

2:00 Strategies, Resource Packet

2:30 Questions, Debrief, Evaluation

4

Rationale

The NYC Progress Report expects all students to make at least one year of progress after a year of instruction, and it measures that progress using the New York State tests.

To meet this challenge, we need to learn how to analyze the tests and related resources, and to share best practices across schools so we can work more strategically to improve student performance.

5

Learning Intentions

What are we going to know at the end of the day?

1) How to use State tests and related resources (Item maps, Item analyses, Trends maps, State Benchmarks, and Item Response files) to identify “power standards” and the questions that distinguish students at different performance levels, so we can plan strategically for the next 16-20 instructional days and beyond.

2) How to use performance indicators to construct a learning progression that identifies the precursor skills that allow students to answer a question.

3) A greater variety of test sophistication techniques and instructional strategies that have produced high student gains.

6

What will we do?

•Examine comparative data related to 2008 performance and progress and to the distribution of students who are below grade level. Participants will note their observations of the data and reflect upon the implications.

•Familiarize ourselves with State item maps, item analyses, trends maps and benchmarks so we can see what standards and performance indicators matter the most, and for whom (“distinguishing questions”).

•Work as a group to deconstruct a 6th grade item.

•Work in small groups to construct a “learning progression” (sequence of precursor skills) to support success on one distinguishing question from Grade 4, 6 or 8.

•Share test sophistication strategies.

•Participate in a round table on a selected topic: Math Vocabulary, 4 Square Method for Analyzing Test Questions, Enrichment and Remediation, Differentiating Test Prep, Making Progress at All Levels, Algebra for All, and Implications of the Scoring Policies.

7

Norms for Collaborative Learning

We are all responsible for the success of the meeting.

Active listening.

Keep cell phones on vibrate.

Honor the time allocations and each other.

Use Post Its to jot down questions and ideas as they occur to you. We will either take questions at intervals or move them to the Parking Lot to respond to at the end of the day.

Seek innovation.

8

Opening Inquiry (assigned by table)

How do you know, before the test, if your Level 3+4 students are on track to make 1 year of progress? What data could reliably predict that progress?

Report Out: Acuity tests, classroom tests, Performance Series, the previous grade’s end-of-year assessment repeated in the fall, item analysis based on State standards, and a frequency chart of how many times PIs appear (trends map) so we can focus on the most frequently tested indicators. Students who read at grade level tend to do math at grade level: ELA predicts Math. Use end-of-unit EDM assessments, self-evaluation, RSA.

What skills or behaviors distinguish a Level 4 student from a Level 3 student? What does a teacher need to do to keep a 3 a 3, or move a 3 to 4?

Report Out: They complete homework, are willing to ask questions, express their answers thoroughly, understand the problem they are asked to solve, use multiple strategies to solve it, apply concepts naturally, and are motivated; differentiate instruction to challenge them. To move students, give critical-thinking assignments, track progress on practice tests, and assign daily work on a problem of the day.

What skills or behaviors distinguish a Level 3 student from a Level 2 student? What does a teacher need to do to move a 2 to a 3?

Report Out: Conceptualizing “numberness”: an internalized number sense. The ability to comprehend procedures. Able to problem-solve across the content strands, especially number sense. Must understand mathematical language as it appears on the test itself. Students are aware of their own mathematical data; goal setting requires that students own their data.

9

Welcome!

Special Thanks to:

Rudy Neufeld

CEO/Senior Author

Neufeld Learning Systems, Inc.

[email protected]

Understanding Math

10

Mathematics and the Progress Report

The Progress Report emphasizes the importance of students showing PROGRESS in mathematics, from one year to the next.

To measure progress, it converts scale scores to “proficiency rates” and checks whether each student attained a proficiency rate at least as high as the year before (1 yr progress). It also measures separately the average change of Level 1+2 students and the average change of Level 3+4 students.

It then compares those metrics to schools in your “peer group” (schools like yours in terms of demographics for elementary and K-8 schools, or in terms of the 4th grade scores of incoming students for middle schools). Those comparisons are conveyed in the “horizon scores” – where your school falls in the range of scores. The charts in the Mathematics Data Packet convey only the “raw” data, not the horizon scores. (Pages 1-A and 1-B)
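For anyone who wants to check these two measures against their own spreadsheet export, here is a minimal Python sketch. The student records and proficiency rates are invented stand-ins; the DOE’s actual scale-score-to-proficiency-rate conversion tables are not reproduced here.

```python
# Sketch of the two Progress Report measures described above.
# The proficiency rates below are invented; the DOE publishes the
# actual scale-score-to-proficiency-rate conversions.

students = [
    # (student, level_last_year, rate_last_year, rate_this_year)
    ("A", 1, 1.60, 2.10),
    ("B", 2, 2.45, 2.80),
    ("C", 3, 3.10, 3.05),
    ("D", 4, 4.20, 4.20),
]

def made_one_year_progress(prev, curr):
    # 1 year of progress = a proficiency rate at least as high as last year's
    return curr >= prev

for label, levels in (("Level 1+2", {1, 2}), ("Level 3+4", {3, 4})):
    group = [s for s in students if s[1] in levels]
    avg_change = sum(curr - prev for _, _, prev, curr in group) / len(group)
    pct = sum(made_one_year_progress(p, c) for _, _, p, c in group) / len(group)
    print(f"{label}: avg change {avg_change:+.2f}, {pct:.0%} made 1 yr progress")
```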

11

Interpreting Performance and Progress

Examine the selected Performance and Progress metrics for schools that have your same grade configuration. (1-A)

Performance: Percent of students in Levels 3+4

Progress: Percent of students making 1 yr progress (attaining a proficiency rate at least as high as the previous year)

You will see schools that have:
> High performance and high progress
> High performance and low progress
> Low performance and high progress
> Low performance and low progress

Note on your Data Recording sheet what you see about your school in this comparative context (Here’s what), possible implications (So what), and one thing you will do (Now what).

12-14

[Charts: comparative Performance and Progress metrics (1-A)]

15

Avg. Change for Lvl 1+2 and Lvl 3+4 Students

The Progress Report also measures the Average Change in Proficiency Rate for Level 1 and 2 students and for Level 3 and 4 students.

This is because it is, in fact, harder for high performing students to show progress.

Examine the Average Change charts (1-B) for your configuration.

In comparison with other schools, how did your school do in moving Level 1+2 students and moving Level 3+4 students? What would you like to know? What analysis should you conduct next June?

Note on your Data Recording sheet what you see about your school in this comparative context (Here’s what), possible implications (So what), and one thing you will do (Now what).

16-18

[Charts: Average Change in Proficiency Rate (1-B)]

19

Distribution of Current Students

Examine the charts showing the Distribution of Current Students by 2008 Math performance level by network, and the tables that show the distribution by grade. (Pages labeled “2”)

What percent of your students are below grade level? Does it vary from grade to grade? Did you do something differently this year that would allow you to predict that the percent of students below grade level will decline this year? What are the implications for your work over the next 3 weeks?

Note on your Data Recording sheet what you see about your school in this comparative context (Here’s what), possible implications (So what), and one thing you will do (Now what).

20-21

[Charts and tables: Distribution of Current Students by 2008 Math performance level (pages labeled “2”)]

22

Sophisticated test scrutiny

NYS issues documents that help us get more sophisticated in analyzing the effectiveness of our implemented curriculum and in using the Item Response files that we received.

These documents are accessible through links at this webpage:

http://www.emsc.nysed.gov/irts/nystart/what-is-nySTART.shtml

Documents include:

1) Scoring Keys and Item Maps, Item Analyses and “Benchmarks”, and retired State Tests.

2) Trends maps are available from Tony Tripolone [email protected]

23

2008 Item Maps

The State issues Item Maps (Standard and performance indicator with answer key). These tell you what indicator was tested with each question and the correct answer. It is helpful to mark up a copy of the test with the Indicator for easy reference in the future.

24

Trends Maps

The Trends Maps come in 2 forms, and help you identify the “Power Standards”: those performance indicators that are most frequently tested on the State exams.

The Trends Map by grade shows which items tested which performance indicator over the last few years.

An additional Trend Map shows which items were tested in each grade in a particular year. (Not included)

Let’s look at the Trend Map for Grade 4 together.

Be aware, however, that even indicators that are not tested may involve necessary precursor skills.

25

[Trends Map for Grade 4]

26

Priorities: NYS Math 2008 Distribution by Strand

Strand          Gr 3   Gr 4   Gr 5   Gr 6   Gr 7   Gr 8
Number Sense     52%    52%    44%    40%    32%    13%
Algebra          16%    15%    15%    20%    16%    47%
Geometry         13%    13%    26%    17%    20%    40%
Measurement      16%    19%    18%    14%    17%    14%
Statistics       10%    10%    12%    23%    32%     0%

27

Your turn

Examine the Trends Map for the grade assigned to your table.

What indicators appear to be tested most frequently over time? Refer back to the Item Map to see what the performance indicator is.

What are the implications of this information for your planning?

No Hands Up report out.

28

Power Standards: Priority Perf. Indicators

Grade 3: N6, N10, N18, A1, A2, G1, M2, M7, S5, S7

Grade 4: N6, N14, N15, N18, N20, N21, N22, A1, A4, G2, G3, M3, M8, S3, S6

Grade 5: N8, N11, N16, A2, A7, G1, G4, M3, M8, S2, S3, S4

Grade 6:

Grade 7:

Grade 8:

29

Item Analysis

From the “What is nySTART?” webpage, you can access both the Item Analysis and the Benchmarks for the 2006, 2007 and 2008 Mathematics exams.

The Item Analysis will tell you what percent of NYS students selected each answer choice. The p-value (the percent of all students who answered a question correctly) is a gauge of the question’s difficulty.

The information also tells you what percent of students select each of the wrong answer choices. This helps you with the distracter analysis. Distracters are a lens into misconceptions.
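A minimal sketch of this calculation, assuming you have one column of student answer choices for an item; the 20 responses and the answer key below are invented.

```python
from collections import Counter

# One column of an Item Response file for a single multiple-choice item.
# These 20 answer choices and the answer key are invented.
responses = list("AACBA" "CCAAC" "CACCA" "CCCBA")
key = "A"

counts = Counter(responses)
n = len(responses)

# p-value: percent of all students answering correctly (difficulty gauge)
print(f"p-value: {counts[key] / n:.2f}")

# Distracter analysis: one dominant wrong answer suggests a shared
# misconception; an even spread suggests general confusion.
for choice in "ABCD":
    if choice != key:
        print(f"distracter {choice}: {counts[choice] / n:.0%}")
```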

30

Item Analysis Examples from Gr 8 2008

[Example 1] Note that most students who answered incorrectly went for C (3). That suggests a common misconception.

[Example 2] In this case, the wrong answers were distributed fairly evenly. This suggests general confusion.

31

Limitations of Item Analysis

Most schools engage in Item Analysis. Typically, after an assessment, we sort the indicators into categories, indicating which ones are secure (where 75%+ of students answered correctly), which are developing (50-75%), and which ones students truly struggled with (where <50% answered correctly). Then we work on the weak indicators.
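The sorting itself is easy to mechanize. A minimal sketch using the cutoffs above; the percent-correct values are invented, and the indicator labels are borrowed from the Grade 4 power standards list purely for illustration.

```python
# Sort indicators into the three bands named above. The percents are
# invented; the indicator labels are illustrative Grade 4 power standards.

pct_correct = {"4N6": 0.82, "4A1": 0.68, "4M3": 0.41, "4G2": 0.77, "4S3": 0.55}

def band(p):
    if p >= 0.75:
        return "secure"
    if p >= 0.50:
        return "developing"
    return "struggling"

for indicator, p in sorted(pct_correct.items(), key=lambda kv: kv[1]):
    print(f"{indicator}: {p:.0%} correct -> {band(p)}")
```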

Categorizing indicators this way is an excellent practice, but it has some limitations:

1) The weakest indicator may not be the most important one.

2) It is hard to tell whether the question was hard just for your students or for all students because you have no basis for comparison.

How can we resolve this dilemma?

32

Comparing ourselves with Benchmarks

How do you know which questions are strategic (matter the most)?

How do you know which strands and indicators represent a strength or a weakness for your school?

How do you know whether a question was difficult just for your own students or for all students?

Comparing our own scores with the State Benchmarks helps us answer these questions.

Then we must ask: To work strategically, what questions should we focus on? The ones that were difficult for all students or just for some? How should we deal with infrequently tested indicators?

33

What are the State Benchmarks?

The Benchmarks will tell you what percent of Level 2, Level 3 and Level 4 students answered each item correctly. This is a more useful indicator than the p-value provided in the Item Analysis.

The benchmarks help you identify which questions distinguish the Level 4 students from the Level 3 students and the Level 3 students from the Level 2 students.

Understanding these differences may help you select what to work on if you want to push 2s to 3s, and keep 3s and 4s from slipping.
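Here is a minimal sketch of how one might flag distinguishing questions from the benchmark columns. The benchmark percents below are invented, and the 30-point gap threshold is an assumption for illustration, not the State’s definition.

```python
# Flag "distinguishing questions" from benchmark columns: the percent of
# Level 2, 3 and 4 students statewide answering each item correctly.
# All values are invented; the 30-point gap cutoff is an assumption.

benchmarks = {
    # item: (pct_L2, pct_L3, pct_L4)
    1:  (0.62, 0.88, 0.97),
    8:  (0.31, 0.52, 0.90),
    24: (0.22, 0.71, 0.85),
}

GAP = 0.30  # assumed threshold for an "unusually large" difference

for item, (l2, l3, l4) in benchmarks.items():
    tags = []
    if l4 - l3 >= GAP:
        tags.append("distinguishes 4s from 3s")
    if l3 - l2 >= GAP:
        tags.append("distinguishes 3s from 2s")
    print(f"item {item}: L4-L3 {l4 - l3:+.0%}, L3-L2 {l3 - l2:+.0%}"
          + (f"  <- {', '.join(tags)}" if tags else ""))
```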

34

Understanding the State Benchmarks (List of 2008 Math Indicators)

[Callouts on the benchmark table:]
> One column shows the percent of Level 2 students in NYS who answered correctly.
> Another column shows the percent of Level 4 students in NYS who answered correctly.
> Two further columns show the difference in the % of 4s and 3s answering correctly, and in the % of 3s and 2s answering correctly, in NYS.

35

[Chart] This chart compares the average percent of our network’s students who answered each item correctly to the percent of Level 4 students in NYS who answered each item correctly.

> Notice how the gaps increase late in the test. Stamina distinguishes the State’s 4s from our Network’s students.

> On some items, there is little difference between how our students answered and how Level 4 students in NYS answered.

36

School Team Analysis

With your team, compare how you did on each item relative to the State benchmark. For all grades EXCEPT Grade 4, you are comparing yourself with Level 3 students statewide. For Grade 4, you are comparing yourself with Level 4 students statewide.

On your data collection sheet, indicate which items (and performance indicators) represented the greatest challenges for your students. Consider what the data are telling you, the possible implications and what you can/will do.

If your school exceeded the Level 3 benchmark on all items, refer to the Level 4 Benchmark table (List of 2008 Math Indicators) instead of the bar charts (pages labeled “3”, “4” and “5”).

Your school-specific pages show your school number in the PAGE HEADER, even though the table legend and chart title still refer to Network data.

37

Multiple Choice vs Constructed Response

What is the importance of the multiple choice questions relative to the constructed response questions?

Approximately 75% of the questions in Grades 3, 5, 6 and 7 are multiple choice; 62% of the questions in Grades 4 and 8 are multiple choice.

Both sections are important. In Grade 4, Level 2 students answered between 16% and 76% of the MC items correctly.

A student who cannot do the CR is unlikely to make it to Level 3. Among the Level 4 students, some had as few as 72% correct on the MC. So a Level 2 student and a Level 4 student with the same percent correct on the MC can end up at different levels because of the constructed response.

In Grade 8, the percent correct on the MC for Level 2 students ranged from 12% to 84% correct, and the range for Level 4 students was from 84% to 100%. Again, doing well on the multiple choice is no guarantee of doing well overall on the test.

38

Using the Item Response files

The student-specific Item Response data show:
> How each student answered each item
> The percent of multiple choice items each student answered correctly
> The percent of constructed response items each student answered correctly

Note: These data were sent to principals by the DOE in June 2008. In September, we created a template into which the data could be inserted so that we could conduct item analyses and compare multiple choice and constructed response scores, and we provided 2 training sessions. If you would like your school’s data inserted into the template, please email Deena Abu-Lughod at [email protected]
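As a sketch of what you can do with those two percentages, the following flags students whose MC and CR results diverge. All values are invented, and the 20-point divergence cutoff is an assumption for illustration.

```python
# Compare each student's multiple-choice and constructed-response percents
# from the Item Response data. Values and the 20-point cutoff are invented.

profiles = [
    # (student, pct_mc_correct, pct_cr_correct)
    ("Student 1", 0.80, 0.35),
    ("Student 2", 0.48, 0.85),
    ("Student 3", 0.70, 0.65),
]

for name, mc, cr in profiles:
    if mc - cr >= 0.20:
        note = "strong MC, weak CR: work on showing and explaining solutions"
    elif cr - mc >= 0.20:
        note = "strong CR, weak MC: work on test-taking and distracters"
    else:
        note = "balanced profile"
    print(f"{name}: MC {mc:.0%}, CR {cr:.0%} -> {note}")
```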

39

Item Response Example: What would you do?

These two students are both high Level 2s (scale scores of 644), but they performed very differently on last year’s test. One did better on the Multiple Choice; the other did better on the Constructed Response.

Student 1:

Student 2:

40

Take a Short Break!

When you return from the break, please split up from your school team to maximize learning across grades, and sit at a grade-appropriate table:

Grade 4

Grade 6

Grade 8

41

Strategic Question: Grade 6 Math 2008 Item # 1

If 8n = 96, what value of n makes the equation true?

A 12

B 88

C 104

D 768

42

With your table partners, list at least 5 words or phrases that describe what 8n=96 is about.

1.

2.

3.

4.

5.

43

What is 8n=96 about? How does knowing mathematics vocabulary help us solve math problems?

Vocabulary Share Out

Algebra, Division, Multiplication

Product, Sum, Factors

Equation, Variable, Value

Multiples (of 8)

Whole numbers

Term (a number, variable or product in an expression, such as 8n)

Unknown quantity

Relationship

44

Table Talk: Strands and Indicators

Which Strand is being tested? What’s the big math idea?

Algebra

Which Band (within the strand)? What are students being asked to do?

Algebra, 5A4 (tested every year, including on the 2005 sample test; see the Math 6 Strand trends document).

Use the Algebra 3-8 Pre/Post March Indicator document (in the accordion folder) to determine which Performance Indicator is being tested.

Is it a Pre-March Grade 6 or Post-March Grade 5 indicator?

Where do skills break down in 1- and 2-step equations? Usually with the operations, which become more complex. How does this help us deconstruct the question? Would this be difficult for 3s and 4s? No. But Level 2s have difficulty with math facts, so they will waste time.

Note on the Pre/Post document that 5A4 is in the “equations and inequalities” band for Gr. 5, followed by 6A4 for Gr. 6, but in Grade 4, 4A4 falls in the “Patterns, Relations and Functions” band.

45

Our Challenge: Progress

Here’s What: The data tell us that

1) The percent of students in Levels 3+4 declines as the grades go up.

2) It is harder for Level 3+4 students to make 1 year progress than it is for Level 1+2 students.

So What: Why is this important?

We need to know how to support students to maintain or exceed the level they achieved the year before. So, what is it that distinguishes Level 4 students from Level 3 students? What is it that distinguishes Level 3 students from Level 2 students?

Now What: What will I do?

Hypothesis: By looking at those questions that best separate the 4s from the 3s, and the 3s from the 2s, we can tease out the skills that make the difference.

46

Distinguishing Questions

Examine the packet of “Distinguishing Questions” for your assigned grade.

These questions were selected because there was an unusually large difference in the percent of Level 4 students answering correctly and the percent of Level 3 students answering correctly, or the percent of Level 3 students answering correctly and the percent of Level 2 students answering correctly.

Is there a certain strand that is highly represented in those questions? If so, what does that imply for working strategically in the next 20 days and planning future actions?

47

Chart a Learning Progression

At your table, select 1 “distinguishing question” and answer the following:

> Why did this question distinguish the 4s from the 3s or the 3s from the 2s?

> What foundational or precursor skills were necessary to answer this item correctly? Construct a “learning progression” that could scaffold instruction on this kind of problem or indicator over the next 15 days.

> Would the linguistic demand of the question require unusual mathematical or academic vocabulary support?

> What are the implications for working with your current Level 2 or Level 3 students?

48

Gr 4 Item 8 – distinguishes 3 and 4

There is a cultural disconnect (burning wood in a wood stove). Key words are misleading. Multiplication fluency is lacking. The 3rd grade indicator may have been ignored in grade 4. Lack of problem-solving strategies. Computational issue. Worded oddly. Also based on 2N20. Similar to a question from the sample test.

The concept of key words in math is different from that in ELA.

49

Gr 4 Item 24 Elapsed Time 4M9

Students need to know days of the week and months by Gr 1; by Gr 2 they must know hours; by Gr 3, minutes; by Gr 4, half hours; by Gr 5, elapsed time past the midnight hour.

Gr 4 is the first time they look at elapsed time.

Vocabulary: arrived and left. Would that be confusing?

Use the Sample Math Tasks (from NYSED, in the accordion folder) to prepare your students. Have AIS teachers use them when they work with your students. You can test the kids up and down the indicators with them. To access these tasks, go to the following link:

http://www.emsc.nysed.gov/3-8/sampletasks.htm

50

Gr 6 Item 8: Item distinguished 3s from 2s

Post-March Gr 5 indicator – 5S7

Multistep problem; prior knowledge of simple probability.

Precursor skills: 3S2, collecting and recording data from surveys; 4S5, predicting based on data; 6S9, listing possible outcomes from compound events and interpreting data as a basis for predictions.

Linguistic demand is high: “or” makes it difficult, and there is a lot of vocabulary.

Support: visual aids, concrete objects, model it. On day of test, what could you tell them? Write in the book!

51

Gr 8, Item 9: Complementary Angles

Students confuse complementary and supplementary. The topic begins to appear in Gr 8; in Gr 6, they have right angles. Ask students what angles they know. I don’t know if they know the notation for congruent angles. Kids may be confused by the “not drawn to scale” note. The instructional recommendations talk about this.

Kids can’t remember the difference. Good mnemonic: C before S, 90 degrees before 180 degrees. That would push a kid over to the 4. Nearly always tested in Gr 8.

Use the pacing guides, but REFINE them. Where a guide places an indicator may not be the first time the indicator should be introduced. This is work that needs to happen at grade-level meetings.

52

Give One, Get One

1. Record 3 test prep techniques or resources (one of your own and two from others at your table) on your chart.

2. Get up, move around, and connect with someone from another school.

3. Introduce yourself and GIVE ONE of your strategies/resources to that person. GET ONE strategy/resource for your list from that person.

4. Move to a new partner and repeat the process until your chart is full.

5. If your list and your partner’s list are identical, brainstorm together for a strategy/resource that can be added to both of your lists.

NOTE: Exchange no more than one strategy/resource with any given partner.

53

Round Tables

1. PS 71: Concept Cards – Building Math Vocabulary: kids take ownership; put in portfolio; keep it with them every day. Ongoing process.

2. MS 211: Four Square Method for Analyzing Test Questions: can be used in math and ELA. What is the question? What info do I know? What strategy will I use? Solution and check. Those last two questions are what help raise the scores.

3. PS 24: Staying the Course with the DOE Math Initiative and adding an enrichment & remediation test prep model. Saturdays for 6 weeks for the 1s, 2s and low 3s, plus 3 days during the February break, using Kaplan Advantage (above grade level) and Keys (below grade level). One third of the school’s students participate.

4. PS/MS 4: Differentiating Mathematics Test Prep Using the ITAs and the Predictive. Start with the data and identify needs by class; the coaches put together materials for each class. They prioritize the strands based on the results.

5. PS 86: Making Progress with all Performance Levels. Grassroots curriculum. Looked at indicators and developed their own curriculum around those ideas. Not using EDM or Impact. Work with Progress in Math (basic skills), and enrichment Fridays. A strategy per week. Create their own leveled booklets. Includes PD; to get consistency, you must have PD.

6. M/HS 368: Algebra for All, article discussion with 4 C’s Protocol. Teaching strategies (focus on big concepts; friendly numbers; work scaffolding into lessons all year). How do we get kids to see the importance of algebra? Should we begin the year with remediation? (That doesn’t work.)

7. PS/MS 189: PEMDAS Misconceptions: Address a school-wide misconception and create a plan to address it. There is confusion about how teachers are teaching it or how students are understanding it. Look at these immediately and create plans to address them. Mathematical runs? People are reluctant to open up to new approaches.

54

Work Strategically by Assessing Wisely

Working strategically in the final push before the test requires selecting those strands, indicators, skills and practices that matter the most.

You may have identified weaknesses in your school from the Acuity Predictive. That is useful feedback. However, to tell whether a question was just hard for you or hard for all students, you must compare your school with other schools. The packet of Acuity Network Level item analyses will help you do that.

Consider focusing on those power standards (the frequently tested indicators) where your school fell well below the State benchmark or well below the Network average.

55

Compare this year’s cohort to last year’s

Identify the 3-5 indicators that represented the greatest challenge for your students last year in each grade.

Assign students from your current grade a “distinguishing item” from last year’s test. Try no more than 3 questions with a strict time limit. Compare the percent of current students who answered correctly with last year’s percent. Was it higher or lower?

Construct a learning progression to scaffold their understanding of the foundational skills, re-administer the item or a similar item and observe whether there has been improvement.

Examine the items that represented difficulties for individual students. Make sure they have mastered those skills so that they don’t carry their weaknesses forward from year to year. Use the Item Response files, the RSA, or the Acuity Predictives to identify those difficulties.

56

Construct reliable assessments

Create short “parallel” baseline and endline assessments to measure growth for target population students (or all students), using the old State Tests, Trends Maps, and the Benchmark information.

Focus on one strand and extract questions from past tests for particular performance indicators, especially the frequently tested ones (the power standards).

Use the benchmark information to identify “distinguishing” questions to make sure pre- and post-tests have similar distributions of “difficult” and “easy” questions.

Open ended questions will give you more insight into student misconceptions, so consider leaving off the multiple choice answers so you can see the student work.

Administer the pre-test, work on the revealed misconceptions, and administer the post-test.
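One simple way to balance the pre- and post-test is to sort the candidate items by p-value and alternate them between the two forms. A minimal sketch, with an invented item pool:

```python
# Split a pool of past-test items on one strand into two "parallel" forms.
# Sorting by p-value and alternating gives each form a similar mix of
# easy and hard questions. The item pool below is invented.

pool = [
    ("2006 #5", 0.52), ("2006 #12", 0.30), ("2007 #3", 0.79),
    ("2007 #19", 0.33), ("2008 #7", 0.81), ("2008 #22", 0.49),
]

pool.sort(key=lambda q: q[1])        # order from hardest to easiest
pre_test, post_test = pool[0::2], pool[1::2]

for label, form in (("Pre", pre_test), ("Post", post_test)):
    avg_p = sum(p for _, p in form) / len(form)
    print(f"{label}-test: {[q for q, _ in form]} (avg p-value {avg_p:.2f})")
```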

57

Plan your Acuity Customized Assessment

Many of you create Acuity’s customized assessments. When constructing them, consider the following:

Create short tests in pairs for a pre- and post- assessment on each learning target.

Limit them to one strand, focus on the power standards, and select items from different grades that follow a logical sequence.

Write down what you are doing so you can reconstruct your thinking and test your hypotheses. For example, if you think a certain 3rd grade indicator is a necessary “grain” or precursor skill for a certain 4th grade indicator, which in turn is a necessary skill for a 5th grade indicator, then test all three and see if, and where, understanding breaks down.
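A minimal sketch of reading off where a hypothesized precursor chain breaks down for each student. The chained indicators and the correct/incorrect data below are illustrative only; the indicator labels are borrowed from the power standards lists above.

```python
# For a hypothesized chain of precursor indicators, report where each
# student's understanding breaks down. The chain and results are invented.

chain = ["3N6 (precursor)", "4N6 (target)", "5N8 (extension)"]

results = {
    # student: correct/incorrect on the three chained items, in order
    "A": [True, True, True],
    "B": [True, False, False],
    "C": [False, False, False],
}

for student, answers in results.items():
    if False in answers:
        print(f"{student}: breakdown at {chain[answers.index(False)]}")
    else:
        print(f"{student}: secure across the chain")
```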

58

Learning Intentions: How did we do?

Do we now better understand how to use State tests and related resources to identify “power standards” and questions that distinguish students at different performance levels so we can plan strategically for the next 16-20 instructional days, and in the future?

Do we know how to use performance indicators to construct a learning progression that identifies the precursor skills that underlie “distinguishing questions”?

Do we have more knowledge of test sophistication techniques and instructional strategies that have produced high student gains?

59

Feedback and debrief; Evaluation

Please complete the Feedback Form now.

Materials from this session, including the work you produced, will be posted at http://dabulughod.wikispaces.com and/or as resources on ARIS.