



The First Two Cycles of New Zealand's National Education Monitoring Project:

1995-1998 and 1999-2002

Lester Flockton
Educational Assessment Research Unit

University of Otago
Box 56, Dunedin

New Zealand

[email protected]

The National Education Monitoring Project has completed its first two cycles of assessment of student achievement across fifteen subject areas. This report outlines key features of the design and operation of the project during the first two cycles, and examines performance differences among subgroups of the student samples. Aggregated scores from sets of assessment tasks in each of fifteen subject areas have been used to compare student performance within each of seven demographic variables. The analyses show that overall differences relating to gender, school size, school type, community size and geographic region are generally small, whereas differences according to student ethnicity (Maori/non-Maori) and socio-economic index range from small to large. Patterns of performance have remained largely constant from the first to the second cycle of assessments.

The National Education Monitoring Project is conducted by the Educational Assessment Research Unit, University of Otago, under contract to the New Zealand Ministry of Education.


NEW ZEALAND’S NATIONAL EDUCATION MONITORING PROJECT: THE FIRST TWO CYCLES, 1995 TO 1998 AND 1999 TO 2002

PART 1 BACKGROUND, PURPOSE AND ORGANISATION

BACKGROUND

New Zealand’s national monitoring of students’ educational achievements started with the National Education Monitoring Project (NEMP) in 1995. This marked the realisation of a succession of governmental enquiries and reports spanning at least thirty years prior to 1995. The reports repeatedly highlighted the need for regular and dependable information about the educational achievements of New Zealand students. Prior to 1995 New Zealand had no national programme for systematically monitoring student learning. While useful information has been available through participation in international surveys, these cover only some areas of our mandatory curriculum and include only a modest coverage of learning outcomes. Moreover, they are restricted mainly to timed paper and pencil tests, which are not typically part of the curriculum and assessment culture of New Zealand’s schools. National monitoring includes all curriculum areas and uses a range of assessment approaches designed to give a rich and detailed picture of student achievement.

PURPOSE

The National Education Monitoring Project is a national assessment programme with the purpose of obtaining a dependable national picture of what New Zealand students know, think and can do. The programme monitors achievement trends over four-yearly intervals and provides information relevant to the work of both policy makers and practitioners. Because national monitoring also serves a public accountability and information function, its descriptive reports on students’ achievements and attitudes are widely distributed. An important function of the project is to help identify what students are doing well, areas of concern, and priorities for future improvement in student learning.

ORGANISATION AND ADMINISTRATION OF NEMP

Random Samples of Schools and Students in the Main Sample

Each year, nationally representative random samples of 2,880 students were selected from approximately 250 schools chosen randomly from national lists of state, integrated and private schools: half at year 4 (ages 8-9) and half at year 8 (ages 12-13). A school’s probability of selection was proportional to the number of students enrolled at the relevant year level. From each selected school, or pair of neighbouring small schools, twelve students were randomly chosen to take part. In turn, the twelve students were randomly assigned to three groups of four students. Each group worked on a different set of tasks across all of the curriculum areas being assessed in the particular year. Over a period of a week, each student typically took part in about four hours of assessment activities. Schools and parents/students were individually notified of their selection to take part in national monitoring, and given the opportunity to withdraw.
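To make this two-stage design concrete, here is a minimal Python sketch of the selection procedure as described above: schools drawn with probability proportional to year-level enrolment, then twelve students per school randomly split into three task groups of four. All names, fields and roll numbers are hypothetical, the weighted draw is simplified to sampling with replacement, and this is not NEMP's actual sampling software.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

def select_schools(schools, n_schools=250, year=4):
    """Draw schools with probability proportional to year-level enrolment.

    Simplified to sampling with replacement; a production design would use
    a without-replacement probability-proportional-to-size scheme.
    """
    weights = [school["roll_by_year"][year] for school in schools]
    return rng.choices(schools, weights=weights, k=n_schools)

def assign_task_groups(student_ids):
    """Randomly choose 12 students and split them into three groups of four."""
    chosen = rng.sample(student_ids, 12)
    return [chosen[i:i + 4] for i in range(0, 12, 4)]

# Example with fabricated data: 1,000 schools with random year 4 rolls.
schools = [{"name": f"school_{i}", "roll_by_year": {4: rng.randint(10, 120)}}
           for i in range(1000)]
sample = select_schools(schools)
groups = assign_task_groups(list(range(500)))  # ids of students at one school
print(sample[0]["name"], groups)
```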

Additional Samples of Schools and Students – Second Cycle

From the beginning of the second cycle of national monitoring, additional samples were included to allow the performance of special categories of students to be reported. Pacific Island Students: To allow results for Pacific Island students to be compared with those of Maori students and other students, ten additional schools were selected at year 4 level and at year 8 level. These were selected from schools that had not been chosen in the main sample, had at least fifteen percent Pacific Island students, and had at least twelve students at the relevant year level.



Students in Maori Medium Education: To allow results for Maori students learning in Maori medium to be compared with results for Maori children learning in English, ten additional schools were selected at year 8 level only. They were selected from Maori medium schools (such as Kura Kaupapa Maori) that had at least four year 8 students, and from other schools that had at least four year 8 students in classes classified as Level 1 immersion (80 to 100 percent of instruction taking place in Maori).

In each four-year cycle of national monitoring*:

• approximately 11,500 students in cycle 1 and 12,500 in cycle 2, from over 1,000 schools, took part (cycle 2 included a special additional sample from schools with at least 15 percent Pacific Island students);

• the project’s directors made over 1,000 personal contacts with school principals, along with numerous discussions with individual parents;

• in cycle 1, 10 (0.9%) of 1,030 schools declined participation and were replaced; in cycle 2, 21 (1.8%) of 1,160 schools declined participation and were replaced;

• the number of students replaced for reasons other than change of school or absence averaged approximately 2 percent of the original sample.

*excluding the Maori medium sample

Learning Areas Assessed

Within repeating four year cycles, NEMP assessed and reported on all major curriculum areas. This recognises the considerable value and importance New Zealand schools and their communities attach to a broad-based curriculum which relates to the world around the school, and to preparation for learning for life beyond the school.

In the first and second cycles of national monitoring, assessments were reported in 15 areas of the curriculum: science, art, information skills (1995 and 1999); reading and speaking, aspects of technology, music (1996 and 2000); mathematics, social studies, information skills (library and research) (1997 and 2001); writing, listening and viewing, health and physical education (1998 and 2002).

Curriculum Advisory Panels

Identifying the key learning outcomes to be assessed (knowledge, skills, understandings and attitudes) and deciding on suitable assessment tasks involved curriculum advisory panels made up of curriculum specialists, classroom practitioners and Maori educators. These panels assisted with drawing up and subsequent reviews of NEMP assessment frameworks used to guide the development, review and selection of tasks. NEMP curriculum panels also played an important part in generating task ideas and guiding the final selection of tasks used in the assessment programme.

National monitoring drew upon the experience and insights of about 75 nationally acknowledged curriculum specialists, classroom practitioners and Maori educators who were members of the project’s nine curriculum panels, its Maori Reference Group (Te Pitau Whakarei), and the Maori Immersion Education Advisory Committee.



Task Administration and Marking

Each year more than 100 national monitoring tasks and survey questionnaires were administered by trained teachers seconded from their own schools for periods of six weeks. Each teacher attended a one-week national training workshop, then spent the subsequent five weeks working with a paired colleague in a group of selected schools. Each week, each pair of teachers visited one school where twelve randomly selected students were assessed, or two small neighbouring schools which together provided the required twelve students. At the conclusion of the assessment programme in schools, tasks were marked by senior tertiary education students and teachers. Tertiary students generally marked tasks which required reasonably clear cut answers; teachers marked tasks requiring higher degrees of experienced professional judgement. The professional development gained through these involvements is an established benefit of the project.

In the first two cycles of national monitoring, about 780 teachers had opportunities to work as task administrators. About 275 senior tertiary students and over 1000 experienced teachers were involved in marking. Formal feedback each year from all groups indicated high levels of professional satisfaction and extended understandings of assessment principles and practices.

Approaches to Assessment

Assessing a full cross-section of students across a broad range of curriculum outcomes in authentic, contextualised ways required varied approaches suited to the different processes and outcomes being investigated. National monitoring used five main approaches for presenting assessment tasks, each one allowing students easy access to the support of a trained teacher assessor. Most sessions took approximately one hour.

• One-to-one interview: each student worked individually with a teacher, with the whole session recorded on videotape.

• Stations: four students worked independently, moving around a series of task activity stations.

• Team: four students worked collaboratively, with the session usually recorded on videotape.

• Independent: four students worked individually on paper and pencil tasks or art making tasks.

• Open Space (physical education): four students, supervised by two teachers, attempted a series of physical skill tasks, with performances recorded on videotape.

Within each four year cycle of national monitoring:

• approximately 15,000 hours of video recorded performances and 240,000 pages of paper responses (including art works) were gathered for marking, from a total of about 500 tasks;

• total student assessment time amounted to approximately 45,000 hours;

• approximately four tonnes of supplies and equipment were in use around New Zealand during each year’s assessment programme;

• about one million pieces of information were produced from the marking of individual tasks each year;

• the highest proportion of tasks used performance assessment methods, and very few involved paper and pencil multiple choice methods.



Low Stakes, High Impact Assessment

The low stakes nature of national monitoring (no school, teacher or student was identified at any point) created a considerable design imperative: tasks and their presentation had to be designed so that students would feel strongly inclined to produce and sustain their best efforts, regardless of individual differences in ability, background or experience. At the same time, the project recognised that any form of externally administered assessment inevitably constrains the sorts of things that students can be asked to do. Despite this, there has been a strong commitment to developing and administering tasks that reflect the best of day to day learning experiences and the world in which we live.

Every student who took part in national monitoring assessment was asked to rate their impressions of the tasks they attempted as “really enjoyed” or “not enjoyed”. Feedback on this variable can be useful for identifying task attributes that might have a negative impact on student performance. Only 4 of the 499 tasks administered in the first cycle, and 2 of the 553 tasks administered in the second cycle, had more students disliking them than liking them.

Reporting NEMP Results

The project attaches great importance to wide distribution of, and easy access to, its reports on student achievement. Annual reports provide task-by-task descriptions of assessment results. Approximately one third of the tasks used in the first cycle were used again as “link tasks” in the second cycle, to allow analyses and comparisons of performance over time; these link tasks are not fully described until they have been repeated. The relative performance of subgroups was also reported using seven demographic variables. Prior to public release, all reports were examined by national reporting forums made up of curriculum and assessment specialists, teachers and Maori educators. Each year the forum produced a summary statement of key findings and highlighted implications for policy and practice.

In the first two cycles of national monitoring approximately 300,000 copies of reports were distributed to major educational institutions and agencies, and to every New Zealand school and its Board of Trustees (i.e. 14,000 copies of each report). Approximately 290,000 copies of the Forum Comment were distributed (i.e. 72,000 of each edition).



PART 2 RESULTS ACROSS THE FIRST TWO CYCLES OF NATIONAL MONITORING: THE PERFORMANCE OF SUBGROUPS USING SEVEN DEMOGRAPHIC VARIABLES

SEVEN DEMOGRAPHIC VARIABLES

National Monitoring results are analysed and reported task by task. Although the emphasis is on the overall national picture of student learning, performance patterns of different demographic groups and categories of school were also analysed. Seven demographic variables were used for creating subgroups, with students divided into two or three subgroups on each variable.

TABLE 1: Seven Demographic Variables and Subgroups*

Gender: girl; boy.

Ethnicity: Maori; non-Maori.

Socio-economic index for the school community: bottom three deciles (1-3); middle four deciles (4-7); highest three deciles (8-10).

Size of school:
• Year 4 schools: fewer than 20 year 4 students; 20-35 year 4 students; more than 35 year 4 students.
• Year 8 schools: fewer than 35 year 8 students; 35-150 year 8 students; more than 150 year 8 students.

Type of school: full primary school; intermediate school. (Some students were in other types of schools, but too few to allow separate analyses.)

Geographic region: greater Auckland; other North Island; South Island.

Size of community: urban area over 100,000; community of 10,000 to 100,000; rural area or town of less than 10,000.

*Note that from 1995 through to 1998 NEMP also analysed and reported achievement according to the proportions of Maori and Pacific Island students in schools. These analyses were discontinued from 1999. From 1999 through to 2002, special additional samples of schools and students were included so that the achievement of year 4 and year 8 Pacific Island students could be assessed and reported.

Each of the categories listed above, except the small year 4 schools in one year, included at least 16 percent of the student sample. Categories containing fewer students, such as Asian students or female Maori students, were not used because the resulting statistics would be based on the performance of fewer than 75 students, and thus not sufficiently reliable.

The analyses of the relative performance of subgroups used an overall score for each task, created by adding scores for the most important components of the task. Where only two subgroups were compared, differences in task performance between the two subgroups were checked for statistical significance using t-tests. Where three subgroups were compared, one way analysis of variance was used to check for statistically significant differences among the three subgroups.

The number of students included in each analysis was quite large (approximately 450), so statistical tests were sensitive to small differences. To reduce the likelihood of attention being drawn to unimportant differences, the critical level for statistical significance was set at p = .01, so that differences this large or larger among the subgroups would not be expected by chance in more than one percent of cases. For team tasks, the critical level was set at p = .05, because of the smaller sample size (120 teams rather than about 450 students).
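As an illustration of this testing procedure, the sketch below applies SciPy's independent-samples t-test to two subgroups and one-way ANOVA to three, using the report's p = .01 criterion (p = .05 for team tasks). The scores are simulated and the function name is ours; this is a hedged reconstruction of the analysis described above, not the project's actual code.

```python
import numpy as np
from scipy import stats

def compare_subgroups(score_groups, team_task=False):
    """Test subgroup differences on one task's overall scores.

    Two subgroups -> independent-samples t-test; three -> one-way ANOVA,
    mirroring the procedure described in the report.
    """
    alpha = 0.05 if team_task else 0.01  # report's significance criteria
    if len(score_groups) == 2:
        statistic, p_value = stats.ttest_ind(*score_groups)
    elif len(score_groups) == 3:
        statistic, p_value = stats.f_oneway(*score_groups)
    else:
        raise ValueError("expected two or three subgroups")
    return statistic, p_value, p_value < alpha

# Simulated overall task scores (sums of component scores) for illustration.
rng = np.random.default_rng(0)
girls = rng.normal(10.5, 3.0, size=225)
boys = rng.normal(10.0, 3.0, size=225)
print(compare_subgroups([girls, boys]))      # gender: two subgroups, t-test
low = rng.normal(9.0, 3.0, size=150)
mid = rng.normal(10.0, 3.0, size=150)
high = rng.normal(11.0, 3.0, size=150)
print(compare_subgroups([low, mid, high]))   # SES: three subgroups, ANOVA
```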



NUMBERS OF ASSESSMENT TASKS

Each year a variety of tasks were undertaken by three randomly assigned groups of students in each school or pair of small schools. The number of components in individual tasks varied considerably from one to several markable items. A large proportion of tasks were identical for year 4 and year 8, some had small adjustments to take account of age appropriateness, and a few were entirely different for reasons of curriculum appropriateness.

TABLE 2: Number of Assessment Tasks Included in Analyses, Cycle 1 (1995-1998) and Cycle 2 (1999-2002)
(C1 = cycle 1, C2 = cycle 2)

                        Year 4      Year 8     Total Tasks
Learning Area          C1    C2    C1    C2     C1    C2
Science                37    56    39    54     54    70
Art                    11    13    11    13     16    13
Graphs, Tables, Maps   29    33    31    38     45    51
Reading                17    17    17    17     25    19
Speaking               13    15    13    16     18    18
Technology             15    22    16    25     22    30
Music                  22    28    21    28     31    29
Mathematics            51    78    46    94     82   101
Social Studies         19    36    26    41     35    49
Information Skills     21    21    27    28     37    35
Writing                24    29    29    35     34    36
Listening               8    14     9    17     12    18
Viewing                11    16    14    18     19    19
Health                 31    31    32    39     39    43
Physical Education     25    22    25    24     30    24
TOTALS                334   431   356   489    499   555

PERFORMANCE OF SUBGROUPS ACROSS THE FIRST TWO CYCLES

The summary tables that follow show the relative performance of subgroups within each of the seven demographic variables. For each of the 15 learning areas assessed, the data show the percentages of tasks on which there were, or were not, statistically significant differences in performance between subgroups. Notable differences occurred for gender, ethnicity and socio-economic status, whereas differences relating to school size, school type and location were few or non-existent.
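The summary percentages in these tables can be thought of as a simple aggregation over per-task test outcomes. The sketch below shows one plausible way to compute them; the record format and function name are invented here for illustration and are not drawn from the project's software.

```python
from collections import Counter

def summarise_by_area(task_results):
    """Aggregate per-task outcomes into percentages per learning area.

    task_results: iterable of (learning_area, outcome) pairs, where outcome
    is 'first_higher', 'no_difference' or 'second_higher' (e.g. girls vs boys).
    """
    by_area = {}
    for area, outcome in task_results:
        by_area.setdefault(area, Counter())[outcome] += 1
    summary = {}
    for area, counts in by_area.items():
        total = sum(counts.values())
        summary[area] = {outcome: round(100 * n / total)
                         for outcome, n in counts.items()}
    return summary

# Hypothetical outcomes for four tasks:
results = [("Reading", "first_higher"), ("Reading", "no_difference"),
           ("Science", "no_difference"), ("Science", "second_higher")]
print(summarise_by_area(results))
# {'Reading': {'first_higher': 50, 'no_difference': 50},
#  'Science': {'no_difference': 50, 'second_higher': 50}}
```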

The data are arranged by subjects assessed in each year of the four year cycles:

Year 1: Science; Visual Arts; Graphs, Tables, Maps
Year 2: Music; Aspects of Technology; Reading and Speaking
Year 3: Information Skills; Social Studies; Mathematics
Year 4: Listening; Viewing; Writing; Health; Physical Education



Gender Differences

Results achieved by girls and boys were compared. Year 4 samples across the first two cycles averaged 50 percent girls and 50 percent boys. Year 8 samples averaged 48 percent girls and 52 percent boys in the first cycle, and 50 percent girls and 50 percent boys in the second cycle.

Results of the statistical significance tests for the total sample of year 4 and year 8 students are shown in adjoining tables. The first column in each table shows learning subjects assessed in each year of the four year cycles. The second column shows the percentage of tasks on which girls performed statistically significantly higher than boys. The third column shows the percentage of tasks on which girls and boys were not statistically significantly different. The final column shows the percentage of tasks on which boys performed statistically significantly higher than girls.

Girl/Boy Differences: Year 4 Students
(percentages of tasks; G> girls significantly higher, = no significant difference, B> boys significantly higher; C1 = cycle 1, C2 = cycle 2)

                     G>          =           B>
Subject            C1   C2    C1   C2     C1   C2
Science             0    0    90   72     10   28
Visual Arts        18   15    73   85      9    0
Graphs/Tables      11    0    89   94      0    6
Music              15   17    85   83      0    0
Technology          8    0    75   89     17   11
Reading            50   53    50   47      0    0
Speaking           36   54    64   46      0    0
Info Skills        30    0    70  100      0    0
Social Studies      0    7    86   76     14   17
Mathematics         4    0    94   88      2   12
Listening          13   14    87   86      0    0
Viewing            22    6    78   94      0    0
Writing            79   39    21   61      0    0
Health              4   11    96   89      0    0
Phys. Educ.        23   23    29   27     48   50
Average            21   14    72   76      7    8

Girl/Boy Differences: Year 8 Students
(percentages of tasks; G> girls significantly higher, = no significant difference, B> boys significantly higher; C1 = cycle 1, C2 = cycle 2)

                     G>          =           B>
Subject            C1   C2    C1   C2     C1   C2
Science             0    0    70   73     30   27
Visual Arts         9   23    91   77      0    0
Graphs/Tables       3   16    97   84      0    0
Music               0    7   100   93      0    0
Technology         17   13    75   70      8   17
Reading            64   11    36   89      0    0
Speaking            9   14    91   86      0    0
Info Skills        27   28    73   72      0    0
Social Studies     16    9    53   85     31    6
Mathematics        13    4    85   93      2    3
Listening          22   29    78   71      0    0
Viewing            29   11    71   89      0    0
Writing            86   88    14   12      0    0
Health             23   46    77   54      0    0
Phys. Educ.        33   26    19   31     48   43
Average            23   22    69   72      8    6

Comment

With the exception of boys’ performance in physical education (years 4 and 8) and in science and social studies (year 8, cycle 1), girls outperformed boys on a substantial percentage of tasks. The most striking gender differences occurred in the area of literacy generally, and most particularly in reading and writing. In reading, the gap between year 8 girls and boys reduced considerably from the first to the second cycle. While the gap between year 8 girls and boys was large in both cycles in writing, it decreased substantially for year 4 students. Girls performed better than boys on 12 of the 26 health tasks in the second cycle, compared to just 6 of 26 tasks in the first cycle. Both year 4 and year 8 boys performed better on about half of the physical education tasks in both cycles; most of those tasks involved ball handling skills. Year 8 boys performed better than girls on ten of the 33 science tasks in cycle 1, and 13 of the 48 tasks in cycle 2; these tasks were spread across all strands of the science curriculum. In the second cycle, year 4 boys performed better than girls on 14 of 50 science tasks, compared to just 3 of 37 tasks in the first cycle.


Maori/Non-Maori Differences

Results achieved by Maori and non-Maori students were compared. Year 4 samples averaged 79 percent non-Maori students and 21 percent Maori students across the first four years, and 80 percent non-Maori students and 20 percent Maori students in the second four years. Year 8 samples averaged 80 percent non-Maori students and 20 percent Maori students in the first four years, and 81 percent non-Maori students and 19 percent Maori students in the second four years.

The first column in each table shows learning subjects assessed in each year of the four year cycles. The second column shows the percentage of tasks on which Maori students performed statistically significantly lower than non-Maori students. The third column shows the percentage of tasks on which Maori and non-Maori students were not statistically significantly different. The final column shows the percentage of tasks on which Maori students performed statistically significantly higher than non-Maori students.

Maori/Non-Maori Differences: Year 4 Students
(percentages of tasks; M< Maori significantly lower, = no significant difference, M> Maori significantly higher; C1 = cycle 1, C2 = cycle 2)

                     M<          =           M>
Subject            C1   C2    C1   C2     C1   C2
Science            61   12    39   88      0    0
Visual Arts        27    0    73  100      0    0
Graphs/Tables      81   33    19   67      0    0
Music              10   38    85   57      5    5
Technology         16   53    84   47      0    0
Reading           100   88     0    6      0    6
Speaking           25   67    75   33      0    0
Info Skills        55   32    45   68      0    0
Social Studies     36   33    64   60      0    7
Mathematics        80   75    20   25      0    0
Listening          50   36    50   64      0    0
Viewing            67   38    33   62      0    0
Writing            46   40    54   60      0    0
Health             26   18    74   82      0    0
Phys. Educ.         0    0    81   91     19    9
Average            45   38    53   60      2    2

Maori/Non-Maori Differences: Year 8 Students
(percentages of tasks; M< Maori significantly lower, = no significant difference, M> Maori significantly higher; C1 = cycle 1, C2 = cycle 2)

                     M<          =           M>
Subject            C1   C2    C1   C2     C1   C2
Science            58   44    42   54      0    2
Visual Arts         0   23   100   77      0    0
Graphs/Tables      33   42    67   58      0    0
Music              15   24    85   76      0    0
Technology         25   65    75   35      0    0
Reading            50   47    50   42      0   11
Speaking           50   43    50   57      0    0
Info Skills        62   56    38   44      0    0
Social Studies     68   51    27   43      5    6
Mathematics        77   66    23   34      0    0
Listening          33   36    67   64      0    0
Viewing            57   33    43   67      0    0
Writing            39   38    61   62      0    0
Health             27   18    73   82      0    0
Phys. Educ.         5    0    66   91     29    9
Average            40   38    58   60      2    2

Comment

The overall picture shows a pattern of Maori students performing less well than their non-Maori counterparts in most curriculum areas across both cycles. However, there is considerable variation in comparative performance across the 15 curriculum areas. The gap between year 4 Maori and non-Maori students reduced from cycle 1 to cycle 2 in science, visual arts, graphs and tables, listening, viewing and health, but increased somewhat in music, technology and speaking. The gap between year 8 Maori and non-Maori students reduced from cycle 1 to cycle 2 in science, social studies, viewing and health, but increased in visual arts and technology. Averaged across all subjects, the gaps between Maori and non-Maori, as shown by these analyses, have remained relatively constant at both year 4 and year 8.


Socio-Economic Differences

Schools are categorised by the Ministry of Education based on census data for the census mesh blocks where children attending the schools live. The SES index takes into account household income levels, categories of employment, and the ethnic mix in the census mesh blocks. The index uses ten subdivisions, each containing ten percent of schools (deciles 1 to 10). For these analyses, the bottom three deciles (1 to 3) formed the low SES group, the middle four deciles (4 to 7) formed the medium SES group, and the top three deciles (8 to 10) formed the high SES group.
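The decile-to-group mapping is simple enough to state directly in code; this tiny function (our own naming, for illustration only) mirrors the grouping just described.

```python
def ses_group(decile: int) -> str:
    """Map a school's SES decile (1-10) to the report's three SES groups."""
    if not 1 <= decile <= 10:
        raise ValueError("decile must be between 1 and 10")
    if decile <= 3:
        return "low"     # bottom three deciles (1-3)
    if decile <= 7:
        return "medium"  # middle four deciles (4-7)
    return "high"        # highest three deciles (8-10)

assert [ses_group(d) for d in (1, 4, 8)] == ["low", "medium", "high"]
```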

The first column in each table shows learning subjects assessed in each year of the four year cycles. The first data column in each table shows the percentage of tasks in each learning area for which there were statistically significant differences between the three groups, with students in the low decile group performing worst. The second data column shows the percentage of tasks for which there were no statistically significant differences. The final data column shows the percentage of tasks for which there were statistically significant differences with students in the low decile group performing best.

Socio-Economic Differences: Year 4 Students
(percentages of tasks; L< low decile group significantly lower, = no significant difference, L> low decile group significantly higher; C1 = cycle 1, C2 = cycle 2)

                     L<          =           L>
Subject            C1   C2    C1   C2     C1   C2
Science            54   54    46   46      0    0
Visual Arts         8   31    92   69      0    0
Graphs/Tables      67   52    33   48      0    0
Music              35   57    65   43      0    0
Technology         33   86    67   14      0    0
Reading            71   88    29   12      0    0
Speaking           75   87    25   13      0    0
Info Skills        81   43    19   57      0    0
Social Studies     53   64    47   33      0    3
Mathematics        85   87    15   13      0    0
Listening          88   71    12   29      0    0
Viewing           100   50     0   50      0    0
Writing            83   72    17   28      0    0
Health             44   32    56   68      0    0
Phys. Educ.         0    5    83   95     17    0
Average            58   59    41   41      1    0

Socio-Economic Differences: Year 8 Students
(percentages of tasks; L< low decile group significantly lower, = no significant difference, L> low decile group significantly higher; C1 = cycle 1, C2 = cycle 2)

                     L<          =           L>
Subject            C1   C2    C1   C2     C1   C2
Science            56   63    44   37      0    0
Visual Arts        17   62    83   38      0    0
Graphs/Tables      60   84    40   16      0    0
Music              45   27    55   73      0    0
Technology         41   48    59   52      0    0
Reading            93   47     7   42      0   11
Speaking           67   56    33   44      0    0
Info Skills        56   71    44   29      0    0
Social Studies     73   76    27   19      0    5
Mathematics        77   76    23   24      0    0
Listening          78   59    22   41      0    0
Viewing            86   61    14   39      0    0
Writing            72   83    28   17      0    0
Health             38   44    62   56      0    0
Phys. Educ.        13    8    83   92      4    0
Average            58   58    42   41      0    1

Comment

With exceptions in physical education and the visual arts, the two most popular school subjects, students in schools in low decile areas generally performed less well on a large percentage of tasks in most learning areas than students in mid to high decile schools. However, there have been some interesting trends across the two cycles. The gap between year 4 students in low decile schools and those in higher decile schools decreased in graphs and tables, information skills, listening, viewing and health, but increased in visual arts, music, technology, reading, speaking and social studies. The gap between year 8 students in low decile schools and those in higher decile schools decreased in music, reading, listening and viewing, but increased in the visual arts, graphs and tables, information skills and writing. Averaged across all subjects, the gaps between students in low and higher decile schools, as shown by these analyses, have remained substantially constant at both year 4 and year 8.


School Size Differences

Results were compared from students in larger, medium sized and small schools (see Table 1 above).

The first column in each table shows learning subjects assessed in each year of the four year cycles. The first data column shows the percentage of tasks on which there were no statistically significant differences in student performance according to size of school. The second data column shows the percentage of tasks on which statistically significant differences occurred in relation to school size.

School Size Differences: Year 4 Students
(percentages of tasks; = no significant difference, Diff. significant difference related to school size; C1 = cycle 1, C2 = cycle 2)

                      =          Diff.
Subject            C1   C2     C1   C2
Science           100   95      0    5
Visual Arts       100  100      0    0
Graphs/Tables     100   97      0    3
Music             100   97      0    3
Technology        100  100      0    0
Reading           100   94      0    6
Speaking          100  100      0    0
Info Skills        90  100     10    0
Social Studies    100   94      0    6
Mathematics        94   97      6    3
Listening         100  100      0    0
Viewing           100  100      0    0
Writing           100   90      0   10
Health             96  100      4    0
Phys. Educ.        91  100      9    0
Average            98   98      2    2

School Size Differences: Year 8 Students
(percentages of tasks; = no significant difference, Diff. significant difference related to school size; C1 = cycle 1, C2 = cycle 2)

                      =          Diff.
Subject            C1   C2     C1   C2
Science            97   95      3    5
Visual Arts       100  100      0    0
Graphs/Tables     100   97      0    3
Music              85   97     15    3
Technology         94  100      6    0
Reading           100   94      0    6
Speaking          100  100      0    0
Info Skills        96  100      4    0
Social Studies     96   94      4    6
Mathematics       100   97      0    3
Listening         100  100      0    0
Viewing            93  100      7    0
Writing            97   90      3   10
Health            100  100      0    0
Phys. Educ.       100  100      0    0
Average            97   98      3    2

Comment

These analyses, which give stable results over two cycles of national monitoring, show that school size within the roll range encountered in the national monitoring sample is not in itself a key determinant or good predictor of student achievement. These results contribute an important educational perspective for discussion and debate on the relative merits and effects of school size for student learning outcomes.


School Type Differences

Results were compared for year 8 students attending full primary and intermediate schools. Across the first four years, about 50 percent of students in the sample were in intermediate schools, 40 percent in full primary schools, and the remaining 10 percent in other school types (e.g. composite and form 1 to 7 schools).

The first column in each table shows learning subjects assessed in each year of the four year cycles. The first data column shows the percentage of tasks in each learning area for which there were statistically significant differences between intermediate and full primary schools, with students in intermediate schools performing best. The second data column shows the percentage of tasks for which there were no statistically significant differences. The third data column shows the percentage of tasks for which there were statistically significant differences with students in full primary schools performing best.

Intermediate/Full Primary School Differences: Year 8 Students
(percentages of tasks; Int> intermediate students significantly higher, = no significant difference, FP> full primary students significantly higher; C1 = cycle 1, C2 = cycle 2)

                    Int>          =          FP>
Subject            C1   C2    C1   C2     C1   C2
Science             0    4   100   96      0    0
Visual Arts         8    8    84   92      8    0
Graphs/Tables       0    0   100   97      0    3
Music               0    0    95  100      5    0
Technology          0    0    94  100      6    0
Reading             0    0   100   95      0    5
Speaking            0    0   100   94      0    6
Info Skills         0    4    96   96      4    0
Social Studies      0    0    96   93      4    7
Mathematics         0    0    95   98      5    2
Listening           0    0    89   94     11    6
Viewing             0    0   100  100      0    0
Writing             0    0   100  100      0    0
Health              0    0   100   92      0    8
Phys. Educ.         0    0   100  100      0    0
Average             0    1    97   97      3    2

Comment

These analyses, which give stable results over two cycles of national monitoring, show very few differences in student achievement in relation to the type of school attended at the year 8 level, where there are two major alternative types of school: intermediate and full primary.


Regional Differences

Results achieved by students from Auckland, the rest of the North Island, and the South Island were compared. Both the year 4 and 8 samples across the first four years averaged 28 percent of students in Greater Auckland, 49 percent in the rest of the North Island, and 23 percent in the South Island.

The first column in each table shows learning subjects assessed in each year of the four year cycles. The first data column shows the percentage of tasks on which there were no statistically significant differences in student performance according to geographic zone. The second data column shows the percentage of tasks on which statistically significant differences occurred in relation to zone.

Regional Differences: Year 4 Students
(percentages of tasks; = no significant difference, Diff. significant difference related to region; C1 = cycle 1, C2 = cycle 2)

                      =          Diff.
Subject            C1   C2     C1   C2
Science            92  100      8    0
Visual Arts        92   92      8    8
Graphs/Tables      93   91      7    9
Music              80   97     20    3
Technology         93   91      7    9
Reading            93   98      7    2
Speaking          100   80      0   20
Info Skills        95   95      5    5
Social Studies     95   81      5   19
Mathematics        91   85      9   15
Listening          88   64     12   36
Viewing            78   56     22   44
Writing            88   86     12   14
Health             84   84     16   16
Phys. Educ.        88   86     12   14
Average            90   86     10   14

Regional Differences: Year 8 Students
(percentages of tasks; = no significant difference, Diff. significant difference related to region; C1 = cycle 1, C2 = cycle 2)

                      =          Diff.
Subject            C1   C2     C1   C2
Science            95   85      5   15
Visual Arts        84   85     16   15
Graphs/Tables      97   84      3   16
Music              95   93      5    7
Technology        100   92      0    8
Reading           100   74      0   26
Speaking           58   81     42   19
Info Skills        70   86     30   14
Social Studies     77   78     23   22
Mathematics        80   98     20    2
Listening         100   94      0    6
Viewing           100   94      0    6
Writing            90   87      9   13
Health             93   95      7    5
Phys. Educ.        91   87      9   13
Average            89   87     11   13

Comment

The somewhat irregular regional differences might arouse curiosity, yet elude ready explanation. For example, year 4 differences in cycle 1 viewing appeared in 3 of 17 tasks, with students from the South Island scoring highest on all 3 tasks and Auckland students scoring lowest on those same 3 tasks. The year 4 differences in cycle 2 viewing appeared in 7 of 16 tasks, with students from the South Island scoring highest on all 7 tasks, and students from Auckland lowest on 5 of them. By contrast, the differences that appeared in other subject areas were mixed across the three regions, and typically involved small numbers of tasks. Overall, there is no consistently clear pattern of one region performing better or worse than another.


Community Size Differences

Results were compared for students living in communities containing over 100,000 people (main centres), communities containing 10,000 to 100,000 people (provincial cities), and communities containing less than 10,000 people (rural areas). Both the year 4 and 8 samples across the first four years averaged 56 percent of students in main centres, 26 percent in provincial cities, and 18 percent in the rural areas.

The first column in each table shows learning subjects assessed in each year of the four year cycles. The first data column shows the percentage of tasks on which there were no statistically significant differences in student performance according to community size. The second data column shows the percentage of tasks on which statistically significant differences occurred in relation to community size.

Community Size Differences: Year 4 Students
(percentages of tasks; = no significant difference, Diff. significant difference related to community size; C1 = cycle 1, C2 = cycle 2)

                      =          Diff.
Subject            C1   C2     C1   C2
Science            97   96      3    4
Visual Arts       100   92      0    8
Graphs/Tables      96   97      4    3
Music              90   93     10    7
Technology        100  100      0    0
Reading           100   94      0    6
Speaking           92   87      8   12
Info Skills       100  100      0    0
Social Studies    100   97      0    3
Mathematics        98   99      2    1
Listening          88   93     12    7
Viewing           100  100      0    0
Writing           100   93      0    7
Health             92   94      8    6
Phys. Educ.        96   91      4    9
Average            97   95      3    5

Community Size Differences: Year 8 Students
(percentages of tasks; = no significant difference, Diff. significant difference related to community size; C1 = cycle 1, C2 = cycle 2)

                      =          Diff.
Subject            C1   C2     C1   C2
Science            97   83      3   17
Visual Arts       100   85      0   15
Graphs/Tables      96   95      4    5
Music              90   93     10    7
Technology        100  100      0    0
Reading           100   95      0    5
Speaking           92   94      8    6
Info Skills       100   93      0    7
Social Studies    100   93      0    7
Mathematics        98   99      2    1
Listening         100   88      0   12
Viewing            93  100      7    0
Writing            93   94      7    6
Health            100   95      0    5
Phys. Educ.        91  100      9    0
Average            97   94      3    6

Comment

Overall, these analyses of task performance relative to community size show that differences are quite small across all curriculum subjects. This suggests that size of population in the areas where schools are located has little bearing on student learning outcomes.


The true meaning and potential richness of assessment lie in the wholesome description and analysis of the individual student’s performance. The numeric data we construct to represent those performances are always proxies for reality, and thus we need to constantly exercise the greatest of modesty and care in our claims based on those data.