EXPLORING THE IMPACT OF A LARGE-SCALE DIAGNOSTIC SCIENCE TEST AND FORMATIVE PRACTICES. A mixed-methods study.

James Scott MEd

Doctor of Philosophy C02041

University of Technology Sydney

Faculty of Arts and Social Sciences

© James Scott 2018


Certificate of original authorship

I, James Scott, declare that this thesis is submitted in fulfilment of the requirements for the award of Doctor of Philosophy by Thesis in the Faculty of Arts and Social Sciences at the University of Technology Sydney.

This thesis is wholly my own work unless otherwise referenced or acknowledged. In addition, I certify that all information sources and literature used are indicated in the thesis.

This document has not been submitted for qualifications at any other academic institution.

29 August 2018

Production Note:

Signature removed prior to publication.


Acknowledgments

This thesis would not have happened without insights, support and encouragement from a number of people.

My thanks to Dagmar McCloughan, ESSA Team Leader, for the opportunity to be involved with the ESSA program in the early years of its development and implementation. Also, my thanks to Professor John Pegg (University of New England) and Associate Professor Debra Panizzon (then from the University of New England) for the opportunity to be a part of the research team investigating the potential of SOLO as a tool for improving assessment for learning. After that initial involvement, both provided me with their advice, encouragement and support which I sought at different times whilst I worked on this thesis.

I would like to acknowledge Doctor Geoff Barnes who encouraged me to run with the idea that the residual from a regression procedure was a measure of a real effect of teaching. My thanks also to the NSW Department of Education for providing me with access to ESSA and NAPLAN data in a form that I could use for the purposes of this thesis. Particular thanks in this regard are due to Doctor Nadine Smith and former colleague and dear friend Gerry McCloughan.

Associate Professors Nick Hopwood and Tapan Rai from the University of Technology Sydney have my gratitude for the time and advice they provided as I developed the approach I wanted to take with the research model and analysis of data.

I am extremely grateful to the science teachers who responded to the survey about their practice and particularly so to the teachers who made themselves available to participate in the case studies. I have undertaken to provide them with the results of my work in a form that I hope will be useful to them.

I had the support and advice of two excellent supervisors, both at the University of Technology Sydney, for this thesis. Professor Peter Aubusson who encouraged me to undertake this project in the first instance and Associate Professor Matthew Kearney who took over in the later stages to assist me bring it to a conclusion. My gratitude and thanks to both for their patience, advice and support.

This thesis also had the benefit of the considerable editing skills of Doctor Terry Fitzgerald who is also at the University of Technology Sydney.

Finally, I want to acknowledge the forbearance of Daune my wife who, in the end, waited patiently and supportively for me to complete this project so that we could resume our lives together.

Thesis format

This is a conventional thesis comprised of title, front matter, glossaries (acronyms and terms used), table of contents, list of figures, list of tables, abstract, six chapters, appendices and references consulted in the preparation of this thesis.


List of Acronyms

AAS Australian Academy of Science
ABS Australian Bureau of Statistics
ACARA Australian Curriculum Assessment and Reporting Authority
ACCI Australian Chamber of Commerce and Industry
ACER Australian Council for Educational Research
AE At Expectation (see also WAE and WBE)
ANOVA Analysis of Variance
AQF Australian Qualifications Framework
ARG Assessment Reform Group
BCA Business Council of Australia
BOS Board of Studies
BOSTES Board of Studies, Teaching and Educational Standards
CC Curriculum Corporation
CCII Centre for Continuous Instructional Improvement
DEC NSW Department of Education and Communities
DET NSW Department of Education and Training
DofE Department of Education
ESA Education Services Australia
ESSA Essential Secondary Science Assessment
EV Acronym for the acronyms ESSA and VALID.
F The Foundation or entry level for schooling (see K).
HSC Higher School Certificate
ICSEA Index of Community Socio-Educational Advantage
K Kindergarten or entry level for schooling (see F).
NAP-SL National Assessment Plan - Scientific Literacy
NAPLAN National Assessment Plan Literacy And Numeracy
NESA New South Wales Education Standards Authority
NGSS Next Generation Science Standards (US)


NSES National Science Education Standards (US)
OECD Organisation for Economic Co-operation and Development
PCK Pedagogical Content Knowledge
PIRLS Progress in International Reading Literacy Study
PISA Programme for International Student Assessment
SEA Socio-Educational Advantage
SEAR Science Education Assessment Resource
SET Science, Engineering and Technology
SLPM Scientific Literacy Progress Map
SMART Schools Measurement Assessment and Reporting Toolkit
SME Science, Mathematics and Engineering
SOLO Structure of the Observed Learning Outcome
SPSS Statistical Package for the Social Sciences
STEM Science, Technology, Engineering and Mathematics
TIMSS Trends In Mathematics and Science Study
US United States of America
VALID Validation of Assessment for Learning and Individual Development
VET Vocational Education and Training
WAE Well Above Expectation (see also AE and WBE)
WBE Well Below Expectation (see also AE and WAE)


Glossary of terms as used in this thesis

artifact
Something made by human effort, in this context related to educational assessment.

assessment as learning
Assessment as learning occurs when students are their own assessors. Students monitor their own learning, ask questions and use a range of strategies to decide what they know and can do, and how to use assessment for new learning. (NESA, 2018)

assessment for learning
Assessment for learning involves teachers using evidence about students' knowledge, understanding and skills to inform their teaching. Sometimes referred to as 'formative assessment', it usually occurs throughout the teaching and learning process to clarify student learning and understanding. (NESA, 2018)

assessment of learning
The use of evidence of learning to make a summative judgment of achievement against outcomes and standards. Sometimes referred to as 'summative assessment'. It usually occurs after a period of instruction. The judgment is often expressed as a mark, percentage or grade. The usefulness of the grade or mark depends on validity and reliability of the processes used to gather and assign value to the evidence gathered. (NESA, 2018)

assessment-related work
Is the purposeful collecting of evidence of learning, creating the means by which that evidence was obtained (if not by direct observation of behaviour), the assumptions used to interpret that evidence, the choice of text forms used to represent and communicate results of assessment, and subsequent uses for those results.


capabilities
A measure of the ability, capacity, power or potential to do something. The Australian Curriculum, Science includes seven general capabilities all students are expected to acquire as they progress through schooling.

Curriculum Corporation
A national educational support entity created by the Federal, state and territory governments in Australia to produce educational resources for Australian Schools. It was replaced by Education Services Australia (ESA) from 2010.

competencies
See capabilities.

curriculum
The documents teachers use to inform the learning activities they plan and deliver to students.

diagnostic assessment
Gathering evidence of learning to identify gaps, strengths and weaknesses in student learning.

education jurisdiction
States and territories in Australia manage the delivery of educational services to students in Australia. They provide for registration and regulation of public and private schools in their geographic areas of jurisdiction.

educational standards
Are the learning goals students are expected to achieve, usually after set periods of instruction typically associated with Year or Grade levels.

feedback
Information provided by an agent regarding aspects of one's performance or understanding.

formative assessment
See assessment for learning.

formative practices
Instruction informed by formative feedback.

high stakes assessment
Any assessment where the results have consequences for the recipient of those results.

key competencies
A set of competencies related to equipping students for work.


low stakes assessment
The use of evidence of learning in ways that reduce to a minimum unintended, usually negative, consequences for the learner.

outcomes
Measurable or observable behaviours intended as a result of instruction.

Primary Connections
A set of curriculum materials produced by the Australian Academy of Science designed to assist K-6 teachers to teach science.

proficiency areas
Areas of skill or expertise.

proficiency levels
Descriptions of response features that differentiate between levels of skill or expertise.

regression
Regression is a statistical process for estimating the relationships between variables.

Science by Doing
A curriculum support resource produced for secondary science teachers by the Australian Academy of Science.

scientific literacy
Scientific literacy is the ability to engage with science-related issues, and with the ideas of science, as a reflective citizen (OECD). It is also the specialized literacies that distinguish science literacy from general literacy and numeracy.

SEA quarters
Socio-Educational Advantage (SEA) proportions, relative to Australia, in school populations. (ACARA My School website)

SEA score
Socio-Educational Advantage (SEA) score is a composite measure of socio-educational advantage generated for the purposes of this project.

selective entry schools
A category of school in NSW, entry to which is determined by student results in tests of reading, mathematics, general ability and writing.


self-regulated learners
Students who can plan their own learning, monitor their performance and then reflect on the outcome of that learning.

Skills, cognitive
Include remembering, thinking logically and reasoning, explaining and describing.

Skills, employability
Skills related to communicating, working in teams, problem solving, initiative and enterprise, planning and organising and self-management.

Skills, generic
Groups of skills variously described as basic/fundamental, people-related, conceptual/thinking, personal skills and attributes, skills related to the business world and skills related to the community.

SOLO model
Structure of the Observed Learning Outcome (SOLO) theory that involves two learning cycles within a mode of thinking.

SOLO taxonomy
Structure of the Observed Learning Outcome (SOLO) theory that describes a single learning cycle within a mode of thinking.

standards framework
Descriptions of levels of performance in a number of categories relating to curriculum, teaching or other profession.

statistically significant
Is the probability of finding a given deviation from a null hypothesis, or a more extreme one, in a sample. (SPSS definition)

STEM system
Science, Technology, Engineering and Mathematics institutions in a country or larger group that prepares people for work in, and including, the institutions that produce STEM outputs in society and related economies.

summative assessment
See assessment of learning.


syllabus
A detailed curriculum that in NSW may be used to define the scope of an external test.

The Board
A generic term for the statutory authority in NSW with responsibility for determining the curriculum and related assessment requirements schools need to comply with so that students satisfy requirements for receipt of credentials. In the course of this project that authority began as the NSW Board of Studies (BOS), became the NSW Board of Studies Teaching and Educational Standards (BOSTES) before becoming the NSW Education Standards Authority (NESA) in 2017.

The Department
A generic term covering the NSW government authority responsible for delivering public education services to students in NSW. It went from being at the beginning of this project (2012) the NSW Department of Education and Training (DET) to the Department of Education and Communities (DEC) to the NSW Department of Education (DofE).

Year 8
The year of schooling in Australia (Grade in other places); in this case the ninth year of schooling.


Table of contents

List of Acronyms
Glossary of terms as used in this thesis
Table of contents
List of Figures
List of Tables
Abstract

CHAPTER ONE: OUTLINE OF MY PROJECT
1.1 Introduction
1.2 The two initiatives
1.3 Research questions and methodology
1.4 Overview of findings
1.5 Importance of the research
1.6 The researcher
1.7 Structure of this thesis

CHAPTER 2: LITERATURE REVIEW
2.1 Introduction
2.2 A curriculum, teaching and assessment for the twenty-first century
2.3 Assessment and assessment systems


2.4 The purposes for assessment
2.4.1 Three purposes for assessment?
2.4.2 Theories of learning, cognition and assessment
2.4.3 Criteria for evaluating the credibility of assessments
2.5 Measurement and summative and evaluative assessment
2.6 Formative assessment and formative practices
2.6.1 Support for formative assessment
2.6.2 Teachers make the difference
2.6.3 Weight of evidence supporting formative practices
2.6.4 Formative Practice
2.6.5 Formative practice and self-regulated learning
2.7 SOLO and the ESSA-VALID (EV) program in NSW
2.7.1 The SOLO Taxonomy
2.7.2 The SOLO model
2.7.3 The ESSA-VALID (EV) assessment framework
2.7.4 The EV test: “fit for purpose”?
2.7.5 SOLO and assessment in Australasia
2.8 Themes from the literature review and their relevance to this thesis

CHAPTER THREE: RESEARCH DESIGN, METHODOLOGY, METHODS
3.1 Introduction


3.2 Mixed method research, case studies and research design
3.3 Phase One: selecting the sample of schools to work with
3.3.1 Selecting the sample of schools to work with
3.3.2 Regression residual as both measure of collective scientific literacy and ‘effect size’ of science teaching
3.4 Phase two: online survey for science teachers
3.4.1 Survey design
3.4.2 Analysis of survey responses
3.5 Phase three: case studies and science department assessment related narratives
3.5.1 Audio-recorded semi-structured interviews: purpose and development
3.5.2 Artifacts of assessment practice: purpose
3.5.3 Case study school data: purpose
3.5.4 Defining later achievement in science
3.5.5 Defining engagement with science
3.6 Comparable schools and three predictions
3.7 Limitations
3.7.1 Trustworthiness of qualitative research
3.7.2 Validity and reliability of quantitative data
3.7.3 Summary of limitations affecting this study’s findings


3.8 Research approvals

CHAPTER FOUR: FINDINGS FROM PHASE TWO
4.1 Introduction
4.2 Findings from analysis of the science teacher survey returns
4.2.1 Set one results: Teacher engagement with EV resources (survey questions 1 to 5)
4.2.2 Set two results: SOLO and extent of teacher engagement with it (survey questions 6 to 8)
4.2.3 Set three results: Formative practices (Questions 9 to 15)
4.2.4 Set four results: Respondent Data
4.3 Other findings
4.3.1 Teacher experience and student achievement
4.3.2 Teacher use of EV student survey feedback
4.4 Key findings from the survey analysis
4.5 Summary of findings in relation to science teacher use of formative practices

CHAPTER FIVE: PHASE THREE - COMPARING CASE STUDY SCHOOLS
5.1 The case study schools
5.2 Three predictions and the case study schools
5.2.1 Prediction one: Year 8 achievement and engagement
5.2.2 Prediction two: Year 10 achievement


5.2.3 Prediction three: Year 12 engagement
5.3 Compared case study schools
5.3.1 Pair ONE: PCWAE1 and MCWAE1
5.3.2 Pair TWO: MCAE2 and MCWBE3
5.3.3 Pair THREE: PCWAE2 and MCWBE5
5.3.4 Pair FOUR: MGFSAE2 and MGFSWBE1
5.3.5 Pair FIVE: PCWAE2 and PCWAE3
5.4 Correlation and strength of associations between school variables
5.4.1 Correlations: fully selective entry case study schools (n = 3)
5.5.2 Correlations: non-selective entry case study schools (n = 11)
5.5.3 Correlations: provincial case study schools (n = 3)
5.5 Summary

CHAPTER 6: DISCUSSION AND FUTURE DIRECTIONS
6.1 Introduction
6.2 Discussion of findings addressing research question one
6.2.1 Teachers and the EV program
6.2.2 Teachers and SOLO
6.3 Discussion of findings addressing research question two
6.3.1 Science department assessment practices
6.3.2 Formative classroom practices


6.4 Discussion of findings addressing research question three
6.5 Suggestions for further research
6.6 Recommendations
6.7 Conclusion

APPENDICES
Appendix A: Competencies, Basic Skills, Generic Skills and Key Competencies
Appendix B: Goals for Schooling (1989–2008)
Appendix C: A teaching sequence exemplifying different views of learning
Appendix D: Five examples involving aspects of the SOLO model
Appendix E: Proforma for case study schools to complete
Appendix F: Science teacher survey questions
Appendix G: Interview questions for case study school participants (final)
Appendix H: Assessment related narratives for case study schools used to make pairwise comparisons
Appendix I: Data tables for paired school comparisons
Appendix J: Survey descriptive statistics

REFERENCES


List of Figures

2.1 Components of an evaluation and assessment framework
2.2 Selected school data for a government, metropolitan, Years 7-12 school
2.3 Effect-sizes of differences between Expert and Experienced Teachers
2.4 The three interacting domains of pedagogy (instruction)
2.5 Representation of the Biggs & Collis (1991) SOLO Taxonomy
2.6 Representation of the two cycles within a mode SOLO model
3.1 Regression of 2014 EV results over a NAPLAN-based predictor
4.1 Means plots for Q1 & Q2 combined
4.2 EV category means shown to be statistically significantly different
4.3 Teacher self-rating for their understanding of the EV program (n = 85)
4.4 Means plots for Q6
4.5 Frequency V level of engagement (zero to ten)
4.6 Means plots for Q7 self-reported understanding of SOLO
4.7 S7 Frequency (n = 85) versus level of understanding (one to five)
4.8 NKUA graphical representation of means
4.9 Means plots for Q9–Q15
4.10 Survey questions sorted to show teacher or student as the lead actor
4.11 Formative practice means for all items, teacher items and student items (n = 84)
4.12 LISC means plots
4.13 CDEL combined, TCDEL, SCDEL means plots
4.14 FTAL combined, TFTAL, SFTAL means plots
4.15 ASIR means plots
4.16 ASTL combined, TASTL & SASTL means
4.17 Frequency versus item sets for student survey (none to three)
5.1 Graphical representation of descriptive statistics for identified (ID) and case study (CS) schools combined and case study (CS) schools separately


List of Tables

2.1 Summary of needed changes to teaching and assessment
2.2 Issues to resolve when planning and constructing assessments and how to use them
2.3 Messick’s aspects of construct validity
2.4 Influences on learning and effect sizes
2.5 The concept of evaporation through modes and levels (SOLO Taxonomy)
2.6 Selected outcomes and related SOLO levels in the 2011 EV assessment framework
3.1 Structure of online survey for science teachers
4.1 Defining populations from which to invite research participants
4.2 Descriptive statistics for Q1 & 2 (n = 85)
4.3 Results of parametric ANOVA for eight EV categories
4.4 Descriptive statistics for Q3 (n = 85)
4.5 Summary of EV purposes
4.6 Descriptive statistics for Q6 (n = 85)
4.7 Parametric ANOVA (n = 85) for SOLO questions (Q6 & 7)
4.8 Q6 SOLO category counts (n = 85)
4.9 Descriptive statistics for Q7 (n = 84)
4.10 Q8 summary of sources for learning about SOLO
4.11 Means for NKUA option (n = 85)
4.12 Descriptive stats for Q9-15 (n = 84)
4.13 Test for normality and homogeneity of variance for all items Qs 9-15 (n = 84)
4.14 Sample items from the online survey with a teacher or student focus
4.15 Descriptive statistics for TAFL & SFAL (n = 84)
4.16 Tests for normality and homogeneity of variance on assessment for learning (AFL) responses data sets (n = 84)
4.17 Nonparametric ANOVA on AFL ALL, AFL teachers and AFL student means (n = 84)
4.18 Welch statistics for robust equality of means
4.19 LISC combined means (n = 84)
4.20 CDEL combined, TCDEL & SCDEL means
4.21 Tests for normality and homogeneity of variance on CDEL responses data sets (n = 84)


4.22 Parametric ANOVA: ALL CDEL, CDEL teacher and CDEL student means (n = 84)
4.23 TCDEL & SCDEL Games-Howell multiple comparisons tests
4.24 FTAL combined, TFTAL & SFTAL means
4.25 Tests for normality and homogeneity of variance FTAL responses data sets (n = 84)
4.26 Nonparametric ANOVA on FTAL ALL, FTAL teacher and FTAL student means (n = 84)
4.27 TFTAL & SFTAL Games-Howell multiple comparisons test (n = 84)
4.28 ASIR combined, TASIR & SASIR means
4.29 ASTL combined, TASTL & SASTL means
4.30 Tests for normality and homogeneity of variance on ASTL data sets (n = 83)
4.31 Nonparametric ANOVA on ASTL ALL, ASTL teacher and ASTL student means (n = 84)
4.32 TASTL & SASTL Games-Howell multiple comparisons tests (n = 83)
4.33 Data about respondents and their schools
4.34 YES counts for student survey items
5.1 Schools that identified themselves including case study schools (shaded)
5.2 Mean standardised residuals and SEA scores
5.3 Pair ONE selected statistics
5.4 Year 12 science course completions (2013-2015 averages)
5.5 Pair TWO selected statistics
5.6 Year 12 science course completions (2013-2015 averages)
5.7 Pair THREE selected statistics
5.8 Year 12 science course completions (2013-2015 averages)
5.9 Pair FOUR selected statistics
5.10 Year 12 science course completions (2013-2015 averages)
5.11 Pair FIVE selected statistics
5.12 PCWAE2 and PCWAE3 Year 8 EV results
5.13 Case study school ranks based on student scores for the six items from the student survey
5.14 Year 12 science course completions (2013-2015 averages)


Abstract

Researchers working with schools in the UK and elsewhere are finding that explicitly teaching students the “five strategies of formative assessment” (Black and Wiliam, 2009, p. 8) is helping to re-engage students with science. This thesis presents findings about the impact of two major interventions on the assessment-related work of junior secondary science teachers in the New South Wales government school system (the largest in Australia) and on student science results. The first intervention took the form of advice to teachers about formative assessment in the official science curriculum (introduced in 2003), where it is called assessment for learning. The second took the form of a mandatory low-stakes, large-scale, test-based diagnostic assessment program involving Year 8 students. This program was fully implemented across NSW from 2007. The assessment framework used to inform the development of test items and tasks and that informs the comprehensive feedback provided to students, parents and teachers is underpinned by Structure of the Observed Learning Outcome (SOLO) theory. Three research questions guided data collection. The research design employed mixed methods, including both quantitative and qualitative methods as well as case studies involving sixteen purposively chosen school sites. Descriptive and inferential statistics were applied to the analysis of both state-wide and school-specific, teacher-provided survey data about their practices and school-level test results. An interpretive approach was used to generate assessment-related work narratives from audio-recorded interviews and artefacts of assessment practice provided to the researcher by volunteering science teachers in the case study schools. The findings show that teacher use of three of five dimensions of formative practice and an explicit focus on teaching students the skills of writing to learn science produced science test results that were above expectation. Less certain was the hoped-for finding that students were also acquiring the skills of learning how to learn. An unexpected finding was that students in regional schools where science results were well above expectation were less positive about their school science experience than their metropolitan counterparts.


CHAPTER ONE: OUTLINE OF MY PROJECT

1.1 Introduction

This thesis reports in six chapters how I used a mixed methods research design to explore the impact of two assessment initiatives on teachers’ assessment-related work and student results in the largest government-run school system in Australia. The findings are then used to argue in the final chapter for the retention of both initiatives and to support recommendations to enhance their future effectiveness.

Education in Australia is the responsibility of the eight states and territories that make up the Commonwealth of Australia (Commonwealth of Australia Constitution Act, 1901). The state and territory governments in those jurisdictions have established government (or public) school systems which are managed by education departments responsible to those governments. Education departments allocate and manage the human and physical resources provided by governments to deliver educational services to students in the government school system. Students enrolled in the government school system are entitled to free education from age 5 to 17 years.

Private interests have also established schools in each of the state and territory jurisdictions. The majority of those schools are affiliated with organized religions. The Catholic Church supports the largest number of schools affiliated to a religious organization. Private schools with a common philosophy or religious affiliation have formed themselves into systems for the purposes of efficient and effective use of resources. Parents pay school fees directly to private schools to send their children there. However, all school systems in Australia receive money collected by government tax systems.

The governments of the eight states and territories have established autonomous authorities to manage the registration and accreditation of schools established by both government and private interests. Registration assures the community that their children are educated in appropriate physical surroundings and provided with adequate human and other resources to support their learning. Accreditation ensures that students have access to educational programs based on a high-quality curriculum and related assessment and credentialing processes. Registration and accreditation processes are determined and managed independently of direct government influence. In recent times state and territory governments have added registration of teachers and accreditation of tertiary education courses preparing people for teaching to the responsibilities of those autonomous education authorities.

Over the past four decades, the eight state and territory governments, with support from the national government, have been working toward a shared national policy agenda for education in Australia. In 2008, by cooperative agreement of the national and all state and territory governments, the Australian Curriculum, Assessment and Reporting Authority (ACARA) was established to perform the following functions: “development of national curriculum, administration of national assessments and associated reporting on schooling in Australia” (ACARA, 2016a). ACARA is responsible to the Council of Australian Governments (COAG) Education Council.

New South Wales (NSW) is the most populous state in Australia and around 20% of all secondary school students in Australia attend a government school in NSW (ABS, 2018). Its Department of Education (“the Department” in this thesis) manages the largest government school system of all eight states and territories. The autonomous education authority in NSW is, at the time of writing, the NSW Education Standards Authority (NESA) and is referred to as “the Board” in this thesis. It was previously the NSW Board of Studies (BOS) and then the NSW Board of Studies, Teaching and Educational Standards (BOSTES) before becoming NESA on January 1, 2017. Data used in this research was provided to me by the Department and by science teachers working in government secondary schools across NSW. It was supplemented by school data available on the national My School website managed by ACARA.


The following section, Section 1.2, will outline the two assessment initiatives that are the focus of interest for this thesis. Section 1.3 will outline the research questions and methodology. Section 1.4 will provide an overview of the findings. Section 1.5 will explain the importance of the research. Section 1.6 will explain my interest in the two initiatives and Section 1.7, the final section in this chapter, will outline the structure of my thesis.

1.2 The two initiatives

The phrase ‘formative practices’ in the title of this thesis is taken from a paper by two researchers, Black and Wiliam (2009), titled Developing the theory of formative assessment. They used the phrase to cover theorizing about instruction informed by feedback from assessment. The paper had its origins in work the pair had been commissioned to do some thirteen years earlier by the UK-based Assessment Reform Group (ARG) with funding from the Nuffield Foundation. Black and Wiliam were commissioned to review the literature on the use of assessment to support learning, also known as formative assessment. The results of that review were published in a booklet for teachers called Inside the Black Box (Black & Wiliam, 1998b).

The ARG had used the phrase “assessment for learning” (ARG, 2002a, p. 3) to differentiate it from “assessment of learning” (ARG, 2002a, p. 3). A full explanation of the distinctions between the two will be provided in Chapter Two, the literature review for this thesis. This thesis will explore the assessment-related work of teachers in the early years of secondary schooling to find out the extent to which that work could be described as “formative” in Black and Wiliam’s (2009, p. 8) theory of formative assessment. In other words, it will explore the extent to which instruction or teaching is explicitly informed by the results of the assessment-related work of teachers.

Assessment-related work of science teachers is defined here as the purposeful collecting of evidence of learning, creating the means by which that evidence was obtained (if not by direct observation of behaviour), the assumptions used to interpret that evidence, the choice of text forms used to represent and communicate results of assessment, and subsequent uses for those results.

‘Student results’ as used in the title refers to the representation of the judgment made by teachers about the value of the teacher-collected evidence of student learning. It is typically represented by a grade, a mark (sometimes expressed as a percentage) or a level (in this project, six levels were common). This form of result is what is meant by assessment of learning. It becomes assessment for learning when it is used to inform the next step in teaching or instruction (feedback) while it is happening.

The first of the initiatives used in this study was assessment advice for science teachers titled “Assessment for Learning?” (BOS, 2003, p. 70). It was embedded in the 2003 release of the official science curriculum documents that secondary science teachers are expected to use when preparing teaching and learning programs for their students. The initiative took the form of advice to teachers about how to gather and use evidence of learning to inform the next steps in instruction as it was happening. In other places ‘assessment for learning’ is referred to as “classroom assessment” by Shepard (2001, p. 2) or “formative practices” by Black & Wiliam (2009, p. 6) in their paper on the theory of formative assessment. The curriculum document (also referred to as a syllabus in NSW) summarises the scope of assessment for learning for science teachers in these terms. It:

1. is an essential and integrated part of teaching and learning

2. reflects a belief that all students can improve

3. involves setting learning goals with students

4. helps students know and recognise the standards they are aiming for

5. involves students in self-assessment and peer-assessment

6. provides feedback that helps students understand the next steps in learning and plan how to achieve them

7. involves teachers, students and parents in reflecting on assessment data.

(BOS, 2003, p. 70)


The focus on assessment for learning in official curriculum documents was a strong signal to teachers about the need to shift the emphasis from using evidence of learning for reporting achievement after instruction to improving instruction itself. Other implications are that curriculum intentions, instruction and assessment should be aligned and that students and the wider school community need to be more involved. The current curriculum documents (BOSTES, 2012) continue with that emphasis and have extended it to include advice on “assessment as learning” as well as “assessment of and for learning” (NESA, 2018). All three will be discussed further in the literature review (Chapter Two). The current (2018) curriculum for science in NSW replaced the 2003 curriculum beginning with Years 7 and 9 in 2014.

The second initiative was a test-based intervention called, at the time of its introduction, the Essential Secondary Science Assessment (ESSA) program. The test was delivered to students at the midpoint of a mandatory four-year science course commencing in their first year of secondary schooling (Years 7 to 10 in Australia). After piloting (2005) and trialing (2006), the first test for the full cohort of Year 8 students was in 2007. In its initial form, it was a pen-and-paper test with the same ‘look and feel’ as other pen-and-paper tests students were used to doing. It was subsequently delivered online from 2010 and continues to be delivered this way. It was the first cohort test to be delivered online by an education jurisdiction in Australia.

The test was designed to do much more than provide a report to parents on student achievement at the midpoint of a four-year science course. It was designed as “a diagnostic tool to identify what students know and can do and where teaching needs to be directed to enhance scientific understanding” (Panizzon, Arthur, & Pegg, 2006, p. 1). To better support that goal, the assessment framework was informed by the Structure of the Observed Learning Outcome (SOLO) model. SOLO is a “cognitive structural model” (Panizzon, 2003, p. 1428) developed from empirical studies of the structure and sophistication of the language used by students in their responses to test items and tasks. The SOLO model used in NSW is based on the SOLO taxonomy originally published by Biggs and Collis (1982, 1991).

The assessment framework developed for the ESSA program enabled a map to be created that puts syllabus expectations along one axis and levels of understanding about those expectations along a second axis. How this works will be explained further in Chapter Two. The test was also accompanied by a survey designed to find out what students thought about science, their school science experience and the test itself. The results of the survey analysis were provided to science teachers along with detailed feedback about student responses to every item and task in the test.

The ESSA program was compulsory for all Year 8 students in the government school system and for Year 8 students in non-government schools that had opted into the program. The program was expanded in 2015 to include a test for Year 6 and Year 10 students and renamed Validation of Assessment for Learning and Individual Development (VALID). The addition of two extra tests provided schools with a way of mapping the progression of student learning in science from Year 6 to Year 8 and then Year 10.

VALID 8 remained compulsory for all government schools, but the new VALID 6 and VALID 10 tests were (and still are) optional for both government and non-government schools wanting to participate. As the program name change took place before data collection began in this project (second half of 2016) and schools were already calling it the VALID program, I chose to use the acronym EV in this thesis to reflect both the original (ESSA) and new (VALID) acronyms. I will refer to the EV program or EV test from this point onwards (unless it is more appropriate to refer to either ESSA or VALID).

The period of interest for this project is from 2011 to 2014 inclusive, which were the last four years of data linked to teachers’ work using the 2003 curriculum. The EV program is appropriately described as an external, large-scale, low-stakes, diagnostic intervention. ‘External’ refers to the source of the test, which is external to the school. ‘Large-scale’ refers to the size of the program, which includes all NSW government schools with Year 8 students (465 schools at the time of this research). The student cohort size in Year 8 from 2010 to 2015 numbered around 47,000 students. The statistics quoted for the size of the government school system and the size of the EV program were sourced from the NSW Department of Education.

‘Low-stakes’ is a relative descriptor for the impact of the EV program on students, their parents and their teachers as explained further in Chapter Two.

Diagnostic assessment refers to the intended use of assessment results to identify strengths and weaknesses in student learning (Goodrum, Rennie, & Hackling, 2001; Hackling, 2004; Masters, 2013; Millar & Hames, 2003; Treagust, 2006).

The wider context for the two initiatives described in this section will be described in the first section of the literature review.

1.3 Research questions and methodology

This section outlines the specific research questions, the research design and related methodologies used to guide this research project. A full account of the methodology will be provided in Chapter Three.

The objective of this study is to answer the motivating question: what impact are the two initiatives, formative assessment and the diagnostic EV test, having on the assessment-related work of science teachers in NSW government schools, and why does it matter?

Three research questions provide the focus for this research:

1. What use are science teachers making of the EV program including SOLO and why is it used or not used?

2. What formative practices are evident in the work of science teachers and why are they used or not used?

3. Is the use of formative practices by teachers linked to improvement in students’ EV results and later achievement in and engagement with science?

The first question is about identifying the extent to which EV tests, EV results or related resources (including SOLO theory, student survey results and professional learning opportunities) have been accessed and used by science teachers to inform assessment-related work at their schools.

The second question is about the extent to which formative practices are evident in teachers’ assessment-related work. Chapter Two will elaborate the theoretical framework of five dimensions of formative practice used in that exploration of teachers’ work. “Formative practices” is a phrase used by Black and Wiliam (2009, p. 8) in their discussion of the theory of formative assessment. In that discussion Black and Wiliam explore the links between what they call the five strategies of formative assessment and their relationships to pedagogy or instruction. I decided to use Black and Wiliam’s phrase and invented “five dimensions” of formative practice as the basis for characterising teachers’ responses to items in an online survey about their work. The five dimensions were based on the five strategies of formative assessment as articulated by Black and Wiliam (2009) in their paper.

The third question is about investigating the association between formative practices and achievement (as measured by EV results and other assessments in science) and later take-up of science courses in the senior years of secondary schooling (a measure of ongoing or later engagement).

Also explored in relation to the third question was the extent to which the formative practices observed in the assessment-related work of science teachers may have assisted learners’ acquisition of self-regulation (Boekaerts & Corno, 2005). Self-regulation describes students who are good managers of their learning, like learning and continue their involvement in learning. The expectation that some students had developed those attributes as a result of exposure to formative practices used by science teachers was based on the work being done in the UK by Black, McCormick, James and Pedder (2006), and James et al. (2007).


Three predictions were made to test the assumption of acquisition of self-regulation by some students. Confirmation of the three predictions would be taken as evidence that the assumption of self-regulation for some of the students was reasonable. The three predictions and the thinking behind them are discussed in Chapter Three. Analysis of data provided by case study schools is reported in Chapter Five. Sixteen schools identified themselves as willing to be involved in a case study as outlined below and fully in Chapter Three.

The capacity to manage one’s learning is an essential skill in the context of the knowledge society and related economy where the capacity to learn new skills and adapt to change is increasingly important for maintaining a job and wider life satisfaction (UNESCO, 2005, p. 27). Chapter Two describes some of the work being done to teach students the strategies of formative assessment as one means for producing student self-regulation. It is for this reason that helping teachers to adopt formative practice as their default pedagogy “matters” (see the motivating question for this research project stated at the beginning of this section).

The research design involved mixed methods executed in three phases. An outline of the phases follows. Full details will be provided in Chapter Three and subsequent chapters.

The first phase employed a quantitative inferential statistics procedure where EV results were regressed on an EV result predictor and the residuals from that regression were used to identify three groups of schools. One group had schools with large positive residuals, a second group had schools with zero or close to zero residuals and a third group had schools with large negative residuals. As will be explained in Chapter Three, schools in these three groups are associated with EV results that were well above, at or well below expectation respectively.

Expectation was relative to the EV result predictor. The EV result predictor was developed from a combination of reading and numeracy results obtained by students in national testing in Year 7 and again in Year 9. The reasoning for using such a predictor is explained in Chapter Three.
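The residual calculation described for the first phase can be illustrated with a minimal Python sketch. It is not the Department's procedure: the use of a single combined predictor score per student, the sample values and the function names `fit_line` and `residuals` are all assumptions made only for illustration. The sketch fits a straight line by ordinary least squares and returns each observed result minus its predicted value.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b


def residuals(x, y):
    """Observed value minus the value predicted by the fitted line."""
    a, b = fit_line(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical values: one predictor score and one EV result per student.
predictor = [480.0, 510.0, 530.0, 560.0, 600.0]
ev_result = [70.0, 78.0, 77.0, 88.0, 92.0]
res = residuals(predictor, ev_result)
```

A positive value in `res` indicates a result above the fitted line and a negative value a result below it. Because the fitted line includes an intercept, the residuals sum to zero.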


The Department accessed its records of test results for schools with 10 or more students in Year 8 who had sat the EV test in four successive years from 2011 to 2014 inclusive. It also matched those students with their Year 7 and Year 9 reading and numeracy results from national testing and retained those results for students who had sat the tests at the same school in successive years. Reading and numeracy results were used to generate four predictors of EV results for the 10 or more students in each school. In the end the Department provided me with four sets of regression residuals from 394 schools (out of a potential 465).

Using one of the four sets of residuals, I identified three groups of between 80 and 90 schools using the size and polarity of their residuals as the basis for allocation to one of the three groups. Science teachers at the selected schools were invited to complete an anonymous online survey about their teaching and assessment practices. Responses were collated according to the school group the science teachers had been assigned to.
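One way to picture allocation by "size and polarity" is the short Python sketch below. The selection rule shown (largest positive residuals, largest negative residuals, and residuals closest to zero), the school labels, the residual values and the function name `allocate_groups` are hypothetical; the actual allocation rule is described in Chapter Three and may differ.

```python
def allocate_groups(school_residuals, group_size):
    """Split schools into three groups by the sign and size of their residual.

    school_residuals maps a school label to its (standardised) residual.
    """
    # Schools with positive residuals, largest first.
    positive = sorted(
        (s for s, r in school_residuals.items() if r > 0),
        key=school_residuals.get,
        reverse=True,
    )
    # Schools with negative residuals, most negative first.
    negative = sorted(
        (s for s, r in school_residuals.items() if r < 0),
        key=school_residuals.get,
    )
    well_above = positive[:group_size]
    well_below = negative[:group_size]
    chosen = set(well_above) | set(well_below)
    # Of the remaining schools, take those with residuals closest to zero.
    rest = [s for s in school_residuals if s not in chosen]
    at_expectation = sorted(rest, key=lambda s: abs(school_residuals[s]))[:group_size]
    return {"well_above": well_above, "at": at_expectation, "well_below": well_below}


# Hypothetical residuals for eight schools.
example = {"A": 2.1, "B": 1.4, "C": 0.1, "D": -0.05,
           "E": -1.2, "F": -2.3, "G": 0.6, "H": -0.7}
groups = allocate_groups(example, group_size=2)
```

In the study each group held between 80 and 90 schools; the group size of two here only keeps the example readable.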

The second phase employed a quantitative method to analyse teacher responses in each of the three groups and then to compare the results from each group for statistically significant differences. The procedure used was Analysis of Variance (ANOVA). Its purpose was to find out whether there were statistically significant differences in assessment-related practices of teachers in each of the three groups. Analysis and findings from the first and second phases of the research are reported in Chapter Four.
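The idea behind a one-way ANOVA across three groups can be shown with a small Python sketch. It computes only the classical F ratio (variation between group means relative to variation within groups). It does not reproduce the nonparametric variant or the Games-Howell comparisons named in the List of Tables, and it computes no p-value. The survey scores and the function name `one_way_anova_f` are invented for illustration.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Classical one-way ANOVA F ratio for two or more groups of scores."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical mean survey scores for teachers in the three school groups.
well_above = [3.0, 4.0, 5.0]
at_expectation = [2.0, 3.0, 4.0]
well_below = [1.0, 2.0, 3.0]
f_ratio = one_way_anova_f(well_above, at_expectation, well_below)
```

A larger F ratio means the group means differ more than would be expected from the spread of scores inside each group.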

The default position for responses to the online survey was respondent anonymity. However, respondents who wished to be considered for involvement in a case study (the third phase of the research design) were invited to identify themselves and their school. Teachers at 36 schools spread across the three groups identified themselves. Between four and six of the identified schools from each of the three groups were invited and subsequently participated in case studies.

Teachers at case study schools were invited to provide school level EV and Year 10 results, numbers of students completing Year 12 science courses and artifacts of teacher-produced assessment-related work considered by them to be exemplary practice (including test or assignment items and tasks, related marking rubrics, sample school reports, assessment plans or science department programs where assessment was explicitly described). Teachers were asked to bring the artifacts to a semi-structured interview at the school which was planned to take up an hour of their time. The interviews with teachers were audio recorded. Access to students was not part of the research design.

Case study schools provided Year 8 and Year 10 results and Year 12 completion data. I sourced and collected case study schools’ socio-educational advantage profile data from the ACARA-managed My School website. That data and the residual (from phase one) were collated and analysed using inferential statistics to establish the strength of correlations to confirm (or disconfirm) three predictions relevant to answering the third research question. Interviews and artifacts were qualitatively analysed and assessment-related work narratives were developed from that analysis for each of the case study schools as well.

The proposition that instruction consistent with formative practices may have supported students’ self-regulated practices was also tested in the context of answering research question three.

Findings from quantitative analyses performed in the third (case study) phase of the research, along with supporting evidence and examples from the assessment-related work narratives for those schools, are reported in Chapter Five.

Anonymity for participating teachers and their schools was guaranteed for this research. The steps taken to protect the identities of participating schools, data and teachers are described in Chapter Three.

1.4 Overview of findings

In terms of the methodology, the reading-numeracy predictor chosen accounted for 89.2% of the explained variation averaged over the four years (2011-2014) of results. This is a very strong relationship given the figures reported for predictors in other large-scale testing programs involving regression analysis: ACARA’s Index of Community Socio-Educational Advantage (ICSEA) accounted for 81% of explained variation in 2013 NAPLAN results (ACARA, 2014b) and 80% of the 2014 results (ACARA, 2015). When Rowe (2006) analysed the 2003 PISA results for the Australian sample of 15-year-old students, he found that the boys’ (n = 6335) reading results accounted for 77.4% of the explained variation in their science results; the comparable figure for girls (n = 6216) was 75.3% (Rowe, 2006, p. 8). The same students sat both the reading and the science tests.
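For readers who want to relate a percentage of variation to a correlation, the following Python sketch may help. It assumes a simple linear regression with a single predictor, in which case the proportion of variation accounted for equals the square of the Pearson correlation. Whether the figures quoted above were computed in exactly this way is an assumption, and the function names and sample values are illustrative only.

```python
from math import sqrt
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    syy = sum((b - y_bar) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def proportion_explained(x, y):
    """Proportion of variation in y accounted for by a straight-line fit on x."""
    return pearson_r(x, y) ** 2


# Under the single-predictor assumption, 89.2% corresponds to this correlation.
implied_r = sqrt(0.892)

# A perfectly linear (hypothetical) relationship accounts for all the variation.
perfect = proportion_explained([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

On this reading, 89.2% corresponds to a correlation of about 0.94, compared with about 0.9 for a predictor accounting for 81%.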

When the residuals for all schools and different school categories were analysed, it was found that EV results “were better than expected” (i.e. the residual was positive) in:

• 53% of the 394 schools in the study;

• 67% of the provincial schools (an estimated 90 schools);

• 68% of the fully selective entry schools (n = 19); and

• 23% of the partially selective entry schools (n = 24).

In relation to the first research question about teacher use of EV resources and SOLO, some findings were that:

• 67% of science teachers made use of EV resources to support their assessment programs and in-class work;

• 25% of teachers rated their understanding of SOLO as good or very good; and

• 18% of teachers said they used SOLO as a basis for feedback to students on their learning.

When it came to student survey results (the survey accompanied the EV test and was a new feature of external testing in NSW):

• 67% of science teachers said they had looked at the results;

• 49% had discussed the results with their colleagues; and

• 18% of teachers said they had discussed the results with their students.


In relation to the second question, there were statistically significant differences in the use by teachers of three of the five dimensions of formative practice. The teachers at schools where results had been identified as being “well above expectation” (or WBE schools), compared to their colleagues in the other two groups of schools, were more frequent users of activities involving:

• discourse that elicits evidence of learning;

• the provision of feedback known to progress learning; and

• the use and modeling of “good learning behaviours” (Boyle, Fahey, Loughran & Mitchell, 2001, p. 200).

For the third research question, the answer to the first part of the question (Is the use of formative practices by teachers linked to improvement in students’ EV results …) was a strong yes. When it came to extrapolating that result beyond Year 8 to Year 10 achievement (… later achievement), uncertainty about the comparability of Year 10 data across schools was too great to have reasonable confidence in between-school comparisons. The within-school correlations between Year 8 results and Year 12 science course completions, and between Year 10 achievement and Year 12 science course completions, were highly positive and statistically significant.

The assumption that schools where results were ‘well above expectation’ would have more self-regulated students than other schools was the basis for three predictions about later achievement and later engagement. The terms achievement and engagement as used in this project are defined in Chapter Three. The predictions related to comparable schools (schools with the same socio-educational advantage). None of the predictions could be confirmed beyond reasonable doubt which in turn rendered the underlying assumption of self-regulation doubtful as well.

Contributing to the uncertainty about self-regulation was the finding that students at the three provincial case study schools that had ‘well above expected’ EV results were less positive about their school science experience than students in the metropolitan case study schools, most of whom were in schools where EV scores were ‘at’ or ‘well below expectation’.


1.5 Importance of the research

Two claims about the importance of this thesis are made. The first claim is that this project was the first large-scale study in Australia using the results from an external science test to provide confirmation that formative assessment and related instruction (formative practices) are associated with better learning outcomes in science.

Here “better” means that the school’s overall science results had a higher mean than the science results of the school it was being compared to. In this context ‘comparable’ means a school or schools with the same socio-educational advantage score (a measure of the collective learning potential of students at a school; its derivation is explained in Chapter Three).

The wording of the claim and the notion of comparable schools relate to the methodology involved in producing the evidence for the claimed association between teacher use of formative practices and student learning.

As a result, this study adds to the growing body of evidence from around the world about the effectiveness of formative practices. A synthesis of key literature linking formative practices to better learning outcomes is presented in Chapter Two, the literature review.

Specifically, my research showed that students attained better results in those schools where teachers provided students more frequently with ‘science-rich’ activities involving three of the five dimensions of formative practice (Black & Wiliam, 2009). The dimensions were: classroom discourse eliciting evidence of learning; teacher feedback known to progress learning of that content; and teacher use and modeling of “good learning behaviours”.

The second claim for importance relates to the study’s methodology. The methodology involves taking a student’s results from national literacy and numeracy testing to generate a predictor for their result in a science test. As discussed in Section 1.3, the regression of science test results on the same students’ set of science predictor scores produced a school set of individual residuals. The claim here is that the residual is a measure of the real and direct contribution of science teaching to the science learning of students at that school. A positive residual means that a student has learned more science than expected; a negative residual means that a student has learned less than expected. When the individual student residuals are summed and averaged, they produce a school score.

When the residuals from all schools where this process has been applied are standardized, they can be compared. Schools with larger positive residuals have done more for student scientific literacy than those where the residuals are large and negative. The process from residual to comparing actual school EV results commences in Chapter Three and the findings are reported in Chapter Four.
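The step from individual residuals to comparable school scores can be sketched in Python as follows. The school labels and residual values are hypothetical, and the choice to standardise with the population standard deviation of the school means is an assumption made for this illustration; the procedure actually used is set out in Chapter Three.

```python
from statistics import mean, pstdev


def school_scores(student_residuals):
    """Average the individual student residuals for each school."""
    return {school: mean(values) for school, values in student_residuals.items()}


def standardise(scores):
    """Convert school scores to z-scores so that schools can be compared."""
    values = list(scores.values())
    mu = mean(values)
    sd = pstdev(values)  # population standard deviation (an assumption)
    return {school: (v - mu) / sd for school, v in scores.items()}


# Hypothetical student residuals for three schools.
data = {"X": [1.0, 3.0], "Y": [-1.0, -3.0], "Z": [0.0, 0.0]}
means = school_scores(data)
z = standardise(means)
```

In this example school X has the largest positive standardised residual, school Y the largest negative one, and school Z sits at expectation.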

An unanticipated finding was that science teaching in provincial schools had produced better than expected results but that (for high performing case study provincial schools at least) students were not enjoying their school science experiences. This last finding was an important consideration in concluding that an assumption of self-regulation as a contributor to later achievement and later engagement was not warranted.

1.6 The researcher

I began my career in science education as a secondary school science teacher (1967 to 1979) before taking on the role of head teacher, science in the NSW government school system (1980 to 1993). I accepted the role of senior science manager in the then newly created curriculum support directorate of the NSW Department of Education (1994 to 2005). In that role, I provided curriculum support to science teachers in government schools across NSW, managed the development of a number of science curriculum support resources, provided policy advice on science education to senior management in the Department and led professional development for a statewide network of science consultants.


I represented the Department at the national level as a member of steering committees and as a contributor to national science teaching, curriculum, assessment, professional standards and curriculum support reviews and projects. I also participated regularly in the annual conferences of the Australian Science Teachers Association and Australasian Science Education Research Association.

Commencing in the mid-1970s, I had a number of roles with the NSW curriculum and assessment authority, including science curriculum writer and curriculum policy officer on secondment from school (1987 to 1990). I was a member, then chair, of the authority’s science curriculum committee, a HSC examination marker, chair of a HSC examination committee and supervisor of marking for a HSC science course. Later I had a role with ACARA, first as a curriculum writer for the F-10 Australian science curriculum and subsequently as an officer assisting with development of the senior Chemistry and Physics curriculum documents.

I joined the Science Teachers Association of NSW in 1967 and was elected Vice-President on two separate occasions. I was also a convenor of their professional development committee and annual conferences, contributor to those conferences and the senior judge and marking trainer for their Young Scientist Award. I also represented STANSW as a member of the team engaged by ASTA to write their professional standards document for Highly Accomplished Teachers (of science) which became a model for later professional standards documents. I was awarded an honorary life membership of STANSW in 1997.

I became a casual lecturer and then coordinator for the Bachelors and Masters pre-service science teacher courses at the University of Technology Sydney (2004 to 2015). I was also a member of teams that researched, developed, piloted, trialed and marked the first EV tests. During that time (2005 to 2008) I led the training for markers of the extended response tasks as well as being the key liaison person between the Department and the agency contracted to manage the online marking of the extended response tasks.

This thesis is the culmination for me of five decades of work in science education, starting with part-time degrees at Macquarie University (BA majoring in Curriculum and Geophysics, completed in 1974) and The University of Sydney (MEd majoring in Curriculum, awarded in 1991).

It is my intention to use the results from this study to satisfy criteria for the award of a PhD and for future advocacy work. The latter will be achieved when I provide feedback from this study to participant schools and policy advice to the Department of Education, NSW. To the extent that my advocacy produces support from the Department for teacher professional development leading to more confident and accomplished uptake of formative practices, then the transformative intent of this study will be realised. In addition, I will be offering my support to the schools that participated in this study, should teachers there wish to implement advice provided in my feedback to them.

From the above resumé it is appropriate to say that I bring both an insider and outsider perspective to this doctoral study (Fensham, 2013). I was an insider in the following ways:

• as a research participant in the initial evaluation of the suitability of the SOLO model in informing the development, implementation and marking of the EV extended response tasks in the first four years of its life;

• as a member of reference groups for the review into the status and quality of school science in Australia (Goodrum et al., 2001), for the review of options for a national test for primary science (Ball, Rae, & Tognolini, 2000) and for the Science Education Assessment Resources (SEAR) project (ACER, 2004a); and

• as a writer of both state and national science curriculum documents (BOS, 2003; ACARA, 2014c).

My outsider perspective is “like that of other interested science educators [who access] projects’ reports of their findings and to their aftermath influence (insofar as findings are published) on the policy and practice of science education” (Fensham, 2013, p. 13).


I have titled this first chapter Outline of My Project and written it in the first person to ensure that readers recognise what I bring to this study. Subsequent chapters are written in the passive voice. This underpins my wish to be seen as an independent researcher who has taken appropriate steps (see section 3.7 in Chapter Three) to conduct the research in the full knowledge of issues related to participant researchers/observers that arise in the context of qualitative research in education and psychology (see Denzin & Lincoln, 2011 and Hammersley, 2008).

My last involvement with teachers in the context of supporting their implementation of the syllabus (BOS, 2003) was in 2004, and my last involvement with the EV program was in 2008. Data collection in case study schools for this project took place in 2016. I had previously worked with one of the case study teachers some 12 years prior to that, when he was a participant at a one-day workshop I was running. His school was invited to participate as a case study school in 2016 because it met the criteria for inclusion as an outcome of the phase one quantitative methodology.

1.7 Structure of this thesis

Chapter Two explores the research and other literature consulted for this thesis. It provides the theory for conceptualising five dimensions of formative practice that comprise the framework for investigating the impact of assessment for learning advice and the EV program on the assessment-related work of science teachers.

Chapter Three explains the three-phase, mixed methods design used to investigate the impact of the two initiatives (the EV test and expectations for greater use of assessment for learning) on the assessment-related work of science teachers in the NSW government school system. The five dimensions of formative practice, which form the framework against which impact will be investigated, are described there.

Chapter Four reports the findings from the first and second phases of the study. The first phase used an EV result predictor to identify schools where EV results were well above, at and well below expectation (relative to the predictor). Teachers in those schools were invited to complete an online survey about their work. The second phase involved the analysis of survey responses to find out whether better than expected EV results were associated with formative practices.
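The idea of sorting schools relative to a predictor can be illustrated with a minimal sketch. It is not the method used in this study: the function name, the school names, the scores and the five-point margin are all hypothetical, and the sketch simply assumes that each school has an actual and a predicted mean result on the same scale.

```python
# Illustrative sketch only: the margin, school names and scores are hypothetical,
# and the study's actual EV result predictor is not reproduced here.

def classify_against_expectation(actual, predicted, margin=5.0):
    """Label each school by the gap between its actual and predicted mean result.

    actual, predicted: dicts mapping school name -> mean score.
    margin: gap (in score points) beyond which a result counts as
            "well" above or below expectation (an assumed cut-off).
    """
    labels = {}
    for school, score in actual.items():
        gap = score - predicted[school]  # positive gap = better than predicted
        if gap > margin:
            labels[school] = "well above expectation"
        elif gap < -margin:
            labels[school] = "well below expectation"
        else:
            labels[school] = "at expectation"
    return labels


actual = {"School A": 85.0, "School B": 80.0, "School C": 72.0}
predicted = {"School A": 78.0, "School B": 79.5, "School C": 80.0}
labels = classify_against_expectation(actual, predicted)
```

A fixed margin keeps the sketch transparent; a real analysis would more plausibly derive the cut-offs from the spread of the differences between actual and predicted results.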

Chapter Five reports findings from the third phase of the project. The third phase involved testing (both quantitatively and qualitatively) the propositions that students at comparable schools (schools with the same socioeducational advantage scores) who had more frequent exposure to formative practices, compared to students at schools not so exposed, would have

• better Year 8 EV results

• better Year 10 results

• more students (as a proportion of the Year 12 cohort) complete senior science courses.

Chapter Six summarises the study’s findings and provides qualified confirmation for the claims made about the importance of the research. It also provides some suggestions, supported by findings in this project, for future research and recommendations to relevant education authorities about changes to enhance the ongoing effectiveness of the two interventions.


CHAPTER 2: LITERATURE REVIEW

2.1 Introduction

Australia is one of the most advantaged and advanced countries in the world (OECD, 2018; UNDP, 2018). Reviews commissioned by successive Australian and other governments around the world, and by research and related agencies (see section 2.2), have argued that the best way to retain this advantage is to develop the creativity and cognitive skills of its people, with a particular emphasis on Science, Technology, Engineering and Mathematics, also known as STEM (JFF, 2007; DES, 2003; OCS, 2014), and to aim for world’s best practice in doing so.

That Australia’s aspirations are global is evidenced by its membership of and active participation in OECD projects related to assessment, for example,

• ongoing participation in its Programme for International Student Assessment (PISA) since its inception in 2000 (OECD, 2014);

• the inclusion of case studies of classroom practice in Queensland schools in its What Works series of publications, for example, in a study on Formative Assessment (CERI, 2005, pp. 191-204); and

• participation in the OECD Reviews of Evaluation and Assessment in Education series (OECD, 2011).

Section 2.3 reviews the research literature on assessment and discusses the idea that schools are enmeshed in a complex web which is appropriately called an assessment system.

Section 2.4 discusses the purposes of assessment and how theories of learning and cognition impact what and how we assess.

Section 2.5 examines the concept of assessment as measurement and explores that idea in relation to summative and evaluative purposes for assessment.

Section 2.6 looks at the new emphasis being given to formative assessment and its contextualisation in teaching, known as formative practice, and why this may be the key to helping students become life-long learners.

Section 2.7 describes the evolution of the SOLO model and positions it as a generic, developmental learning progression that enhances the feedback potential of summative tests such as the EV test.

Section 2.8 reviews the main ideas discussed above that have informed this study.

2.2 A curriculum, teaching and assessment for the twenty-first century

In April 2005, Carmel Tebbutt, the Minister for Education in New South Wales (NSW), Australia, announced to the NSW Parliament:

There is no doubt that science and technology are integral to our modern society [and] we must do all we can to encourage students to take up science and to continue its study in years 11 and 12. The Government is introducing for year 8 an essential secondary science assessment to help improve learning outcomes and generate student interest in studying science (Tebbutt, 2005, p. 14956).

The first sentence from this quote is a strong statement of the need, at least in the eyes of the then NSW government, to ensure that more students engage with science until the end of their senior secondary schooling. The basis for this claim will be outlined later in this section. The second sentence is a reference to the EV program described in Chapter One. As will be explained later in this chapter, imposing a test is a tool used by governments to signal to the community the importance placed by government on aspects of the curriculum, in this case science (along with literacy and numeracy as will also be explained below).

In her speech announcing the introduction of the EV program, Minister Carmel Tebbutt explicitly referred to a report from a review into innovation, science, technology and mathematics teaching and teacher education in Australia (CRTTE, 2003), also known as the Dow Report. That review was a contribution to the then national government’s broader agenda to

promote research, development and innovation [because the Australian economy was transitioning from one based on] land, labour and capital to one based on human and intellectual capacity. (Australia, 2001, p. 4).

This emerging new economy was referred to as the knowledge economy in many of the reports prepared for governments in Australia, such as the Dow report referred to above (CRTTE, 2003), and around the developed world at that time (OECD, 1996). These governments were anxious to ensure that their countries continued to prosper into the future.

In one such report, the then chief scientist for Australia, Robin Batterham, wrote: “Science, engineering and technology underpins our future as a thriving, cultured and responsible community” (Batterham, 2000, p. 9). His report identified that more investment must be made in people and culture, ideas and commercialisation if Australia was to keep up with the rest of the developed world. His recommendations for doing so were based on his analysis of “initiatives and consequential structural changes underway … in OECD and Asian countries” (p. 41), including the United States, the United Kingdom, Canada, Japan, Finland, Ireland, Singapore and The People’s Republic of China.

Batterham’s proposed strategies and recommendations for keeping up with the changes going on in the world economy were aimed at ensuring that a growing number of students were prepared for science, engineering and technology (SET) related work. Among the strategies he identified were: making lifelong learning a key strategy for education providers and employees, inspiring students to study SET-based subjects, rewarding excellent SET teachers, providing specialist intensive training for teachers, and providing opportunities for SET graduates already in the workforce to enter the teaching system. The need for more students in Australia to engage with STEM in the later years of school and beyond was affirmed in the Dow report (CRTTE, 2003) referred to above and, in fact, most of the strategies and related recommendations from Batterham’s report were repeated and endorsed in the Dow report (CRTTE, 2003).

The national Australian Education Council was pursuing an agenda to broaden the school curriculum to better equip the growing number of students completing six years of secondary schooling with skills that prepare them for work as well as for success in tertiary studies. The Matters and Curtis (2008) report to the Australian Government Department of Education, Employment and Workplace Relations (DEEWR) described how five competencies first proposed by the Karmel review (QERC, 1985) ended up as “Key Competencies” (AECRC, 1992) which were then handed over to state and territory education systems. A summary of this Key Competency work is included in Appendix A.

The key competencies, defined as “the integrated application of knowledge, skills and understandings” (Ryan, 1997, p. 5), were trialled in NSW schools, TAFE institutes and workplaces. The trialling in schools was found to be “broadly supported by practitioners involved in the field testing [… but there was] little support for a separate additional layer of assessment and reporting that focuses on key competencies” (Ryan, 1997, p. 7). As will be apparent from a reading of the fourth section of the table in Appendix A, the key competencies were later written into the NSW science syllabus (BOS, 2003), which contained the curriculum of interest for this project. Thereafter, the extent of Key Competency acquisition was assessed by teachers in the context of content and skills related to the separate learning area syllabuses, including science.

Of note too was the syllabus expectation that after four years of science teaching in NSW, students would emerge as independent learners who were “creative, responsible, scientifically literate, confident, [and] ready to take their place as a member of society” (BOS, 2003, p. 10). This aspiration was mentioned in the Adelaide Declaration (see Appendix B) as well as in Batterham’s (2000) report.

The push from employers and government to broaden the curriculum’s purpose from preparation for tertiary study (Connell, 1985) to preparation for life in the twenty-first century was expressed in three agreements between the national, state and territory education ministers about national goals for education which were subsequently endorsed by governments. The first of these was the Hobart Declaration (MCEETYA, 1998) with ten Common and Agreed National Goals for Schooling released in 1989. The goals were subsequently revised and endorsed in the Adelaide Declaration (MCEETYA, 1998) which was released in 1998. Following a review some ten years later a further iteration was published in the Melbourne Declaration (MCEETYA, 2008). Each Declaration was accompanied by an action plan. ACARA was created as a consequence of government commitments to the action plan attached to the Melbourne Declaration. The three sets of goals are included as Appendix B.

The above outlines the influences being brought to bear on the curriculum for schooling, including the science curriculum. The Dow report (CRTTE, 2003) also included reference to a recently completed comprehensive review into science teaching in Australian schools titled The Status and Quality of Teaching and Learning of Science in Australian Schools (Goodrum et al., 2001).

Goodrum et al. (2001) included a table adapted from the USA National Science Education Standards (NRC, 1996). The table contrasted traditional science teaching practices found around the world and in Australia (left-hand column) with practices supported by the research literature as being more effective (right-hand column). The table from the review is published here as Table 2.1. Three of the last four points in the right-hand column are italicised and bolded by the thesis writer to highlight specific references to assessment and how it needs to change when compared to modal practices (see corresponding points in the left-hand column) at that time.


Table 2.1 Summary of needed changes to teaching and assessment

Teaching for scientific literacy requires:

Less emphasis on | More emphasis on
memorising the name and definitions of scientific terms | learning broader concepts that can be applied in new situations
covering many science topics | studying a few fundamental concepts
theoretical, abstract topics | content that is meaningful to the student’s experience and interest
presenting science by talk, text and demonstration | guiding students in active and extended student inquiry
asking for recitation of acquired knowledge | providing opportunities for scientific discussion among students
individuals completing routine assignments | groups working cooperatively to investigate problems or issues
activities that demonstrate and verify science content | open-ended activities that investigate relevant science questions
providing answers to teacher’s questions about content | communicating the findings of student investigations
science being interesting for only some students | science being interesting for all students
assessing what is easily measured | assessing learning outcomes that are most valued
assessing recall of scientific terms and facts | assessing understanding and its application to new situations, and skills of investigation, data analysis and communication
end-of-topic multiple choice tests for grading and reporting | ongoing assessment of work and the provision of feedback that assists learning
learning science mainly from textbooks provided to students | learning science actively by seeking understanding from multiple sources of information, including books, Internet, media reports, discussion, and hands-on investigations

Source: Figure 7.1 in Goodrum et al., 2001, p. 168.


In the Australian context, Goodrum et al. (2001) had identified that

most secondary science teachers are concerned about the final assessments for students which determine access to tertiary education and they regard covering the content likely to be assessed as of paramount importance, the repercussions of which echo right down to the early years of high school (p. 145).

The reviewers were concerned about that focus on “final assessments” and their recommendations for change identified assessment as an area for reform. Three of the last four points in the right-hand column of Table 2.1 are about assessment. The third one received special mention in their recommendations.

Recommendation 7: It is recommended that the Commonwealth assist educational jurisdictions to reform assessment practice so that assessment more effectively serves the purpose of improving learning. Assessment must focus on the learning outcomes associated with scientific literacy. (Goodrum et al., 2001, p. xiii)

Subsequently, two of the review report authors were commissioned to prepare a five-year action plan (2008 to 2012) to manage the continuing implementation of recommendations from that initial report (Goodrum & Rennie, 2007). Assessment was one of eight areas for action. The overriding objective of assessment reform, they wrote, was to “improve the quality of student assessment by ensuring that it was aligned with intended learning outcomes” (p. 15).

Two priority actions to achieve this objective were described in their report. The first was for “effective [use of] diagnostic, formative and summative assessment approaches” (p. 16) to be embedded in curriculum resources developed to support science teaching. The second was to “monitor performance in science at the national level” (p. 16). Goodrum et al. (2001) recommended that the latter be done by national sample testing of students.


In response to the first proposed action, two major Australian curriculum support initiatives subsequently modelled the use of assessment for diagnostic, formative and summative purposes as recommended. These were Primary Connections (AAS, 2016), which provides comprehensive support materials for science teaching in the K-6 years, and Science by Doing (AAS, 2017), which provides similar support for junior secondary science teaching.

The proposal for national monitoring of science performance was, in effect, an endorsement of three existing sample testing programs, one an Australian initiative and the other two international in origin. These programs test samples of NSW students in Years 4, 6, 8, 9 and 10 (fifteen-year-olds) and will be described further in this chapter.

Broadfoot (2009) observes that around the world externally set, test-based assessments are being used

ubiquitously to provide for selection, for certification, for accountability and for international comparisons of educational standards. The advent of the 21st century also heralded the early stages of a movement to promote the use of assessment as a tool to support learning itself. (p. vii)

The NSW syllabus being used at the time of this project, with its emphasis on assessment for learning, was an example of the latter, as was the introduction of the EV program and the Quality Teaching in NSW public schools (QT) initiative. The QT initiative was a professional learning initiative of the Department to support and improve teaching and assessment in government schools, and it was widely supported in NSW schools in the first decade after 2000. The syllabus message about assessment for learning was reinforced in the Department’s QT initiative.

Assessment is the process of identifying, gathering and interpreting information about students’ learning. The central purpose of assessment is to provide information on student achievement and progress and to set the direction for ongoing teaching and learning. (DET, 2006, p. 5)


In addition to NSW resources supporting assessment for learning, three national projects had additional resources online for science teachers by 2005, including:

• material about assessment for learning (CC, n.d.);

• a range of diagnostic assessment items and tasks for science (ACER, 2004a); and

• online learning objects specifically targeting science learning that could be used by teachers and students for diagnostic purposes as well as to support science learning more generally.

Elements of these three programs are evident in teaching and learning support resources currently available to schools on Education Services Australia managed websites (ESA, n.d.), including Improve and Scootle (ESA, 2012).

By 2010, science teachers in NSW should have been very aware of expectations for their use of assessment for learning strategies and resources, including the use of assessment data from the EV program. As explained in Chapter One, NSW has chosen to retain and expand its EV test from Year 8 to include both Year 6 and Year 10, though for now the latter two tests are not mandatory (DET, 2015). While there is considerable evidence that summative tests contribute to disengagement with learning (Darling-Hammond, 2003; Harlen & Deakin-Crick, 2002; Osborne & Dillon, 2008; Stiggins, 2007; Tytler, 2007), this thesis uses the context of the EV program to provide important insights into how large-scale, summative, externally designed tests are being used to improve both achievement in and engagement with learning.

2.3 Assessment and assessment systems

Some definitions of assessment and assessment systems are provided to introduce this section. These will be followed by a discussion of the literature relating to three common purposes for school assessment. The impact of current understandings about learning and cognition on assessment and the need to ensure that what is done in the name of assessment is fit for purpose complete the section.


The following five definitions of assessment are found in the literature.

The first is:

The terms educational measurement, assessment, and testing are used almost interchangeably in the research literature to refer to a process by which educators use students’ responses to specially created or naturally occurring stimuli to draw inferences about the students’ knowledge and skills. (Popham cited in NRC, 2001, p. 20, italics in original)

The second is:

[Assessment is] the process of gathering and interpreting information about the progress of students’ learning. (Hackling, 2004, p. 127)

The third is:

Assessment is a term that covers any activity in which evidence of learning is collected in a planned and systematic way and is used to make a judgment about learning. (Harlen & Deakin-Crick, 2002, p. 1)

The fourth is science specific:

[Assessment is] the collection and interpretation of information about learners’ knowledge, understandings, skills and attitudes relating to the science outcomes. (Goodrum et al., 2001, p. 20)

The fifth has alternative names for assessment, depending on what is being assessed:

[Assessments are] judgements on individual progress and achievement of learning goals [from] classroom-based assessments, as well as large scale, external assessments and examinations … appraisal refers to judgements on the performance of school-level professionals, e.g. teachers, school leaders … evaluation refers to judgements on the effectiveness of schools, school systems, policies and programmes. (Nusche, Radinger, Santiago, & Shewbridge, 2013, p. 59)

The last definition relates to the system of assessments that schools are expected to participate in. The assessments involve collecting evidence of learning and evidence of performance that goes well beyond writing responses to pen and paper test items.

Participants in any discussion about assessment need to understand more than the literal interpretations of the words “evidence of learning”. Two examples illustrate this. The first is the NSW Department of Education and Training’s Quality Teaching (QT) initiative (DET, 2003) mentioned above. It suggests four questions.

1. What do you want the students to learn?

2. Why does that learning matter?

3. What are you going to get the students to do (or to produce)?

4. How well do you expect them to do it? (DET, 2006, p. 10)

The second example, a more sophisticated version of the context for assessment, is provided in a National Research Council (NRC, 2001) report. The NRC manages seven programs for the US Academies of Science and Engineering, including their Behavioural and Social Sciences and Education programs. It draws on expertise from within and outside the academies as needed. For the NRC

Assessment is always a process of reasoning from evidence … [and] is imprecise to some degree [and assessments] are only estimates of what a student knows and can do. (p. 2)

Every assessment involves three foundational elements (which the writers call the vertices of an assessment triangle):

a model of how students represent knowledge and develop competence in the subject domain [cognition]; tasks or situations that allow one to observe student’s performance [observation] and an interpretation method for drawing inferences from the performance evidence thus obtained [interpretation] … These three elements – cognition, observation, and interpretation – must be explicitly connected and designed as a coordinated whole. (p. 2, italics in the original)

A fundamental premise of the NRC (2001) report is:

Most widely used assessments of academic achievement are based on highly restrictive beliefs about learning and competence not fully in keeping with current knowledge about human cognition and learning. Likewise, the observation and interpretation elements underlying most current assessments were created to fit prior conceptions of learning and need enhancement to support the kinds of inferences people now want to draw about student achievement. (pp. 2-3)

The NRC (2001) report makes this observation about assessment too.

Much greater value and credibility [is attributed] to external assessments of individuals and programs than to classroom assessment designed to assist learning … More of the research, development, and training investment must be shifted toward the classroom, where teaching and learning occur. (p. 9, italics in the original).

This last sentiment was echoed in the Goodrum et al. (2001) review and recommendations mentioned in the previous section.

Since the beginning of the 1990s, students in Australia have been asked to sit tests imposed by education authorities outside the immediate school before their final year of schooling. In Years 7 and 9 all students sit literacy and numeracy tests that were once set by state and territory education authorities. ACARA, in the context of its National Assessment Program – Literacy and Numeracy (NAPLAN), has managed the tests since 2008. Year 8 students, at least in NSW government schools, sit EV tests for science. In many schools, science departments buy tests developed by private testing companies, such as the ICAS science tests produced by Education Assessment Australia (EAA, 2018). These ICAS tests provide independent feedback on the level of science process skills students possess at the time they sit the test. The Australian Council for Educational Research (ACER) also provides comparable tests for science that schools can purchase to support their teaching and learning programs (Masters, 2009).

It is also possible, but less likely, that students could be asked to sit tests produced by two international agencies in reading literacy, numeracy and scientific literacy. The first organisation to bring these tests to Australia (in 1995) was the International Association for the Evaluation of Educational Achievement (IEA) (IEA, 2013). The IEA provides testing and reporting in reading literacy (PIRLS) over a five-year cycle and in mathematics and science (TIMSS) over a four-year cycle. The TIMSS tests are currently sat by Year 4 and Year 8 students; only Year 4 students sit PIRLS tests.

The second program is the OECD’s Programme for International Student Assessment (PISA), which provides tests in literacy, numeracy and scientific literacy over a triennial cycle for 15-year-old students (OECD, 2014). Australia has participated in PISA since it began in 2000. The ACER manages the test processes in Australia for the IEA and OECD and writes the reports for Australia from its analysis of the results and related surveys (Thomson, De Bortoli, & Underwood, 2017; Thomson, Wernert, O’Grady, & Rodrigues, 2017).

Figure 2.1 is a representation that Nusche et al. (2013) used to examine and report against in their exploration of the assessment systems of participating OECD members, including Australia. The figure shows the complexity of the assessment system schools are now enmeshed in.

[Figure 2.1: representation of the assessment system from Nusche et al. (2013); figure content not recoverable from the source]

In relation to that system, the NRC (2001) report says:

Aspects of learning that are assessed and emphasized in the classroom should ideally be consistent with (though not necessarily the same as) the aspects of learning targeted by large-scale assessments. (NRC, 2001, p. 3).

This is a call for vertical alignment of assessment intent. The claim here is that classroom assessments and externally imposed tests should all be defensible in terms of the national or state or territory goals the tests are supposed to be providing evidence of learning about.

The NRC (2001) report also asserts: “Educational assessment does not exist in isolation but must be aligned with curriculum and instruction if it is to support learning” (p. 3, italics in the original). This is a call for the horizontal alignment of assessment practices, learning expectations (as described in the curriculum) and instruction. Others expressing a similar view include Biggs (1999), Mansell, James & the ARG (2009) and Masters (2013). Alignment means that what is intended to be learnt (curriculum), how it is acquired (instruction) and how it is demonstrated as being acquired (assessment) are connected by a coherent and consistent view of learning and cognition. The Trends in International Mathematics and Science Study (TIMSS) assessment model collects data based on the premise of horizontal alignment (Mullis, Martin, Ruddock, O’Sullivan, & Preuschoff, 2009).

It is these alignments that the NRC (2001) says are often missing in the real world of practice. The report by the Council of the Great City Schools in the US provides examples of the consequences when those alignments are weak or missing (CGCS, 2015). The CGCS (2015) report findings are listed under six headings: assessments required of all students in a given grade; sample and optional assessments; assessments for special populations; looking at testing in a district context; costs of testing in a sample district; and parents. A summary of 23 separate points includes the following: mandated, external testing of Grade 8 students took up at least 2.34% of the school year; there was no correlation between the amount of mandated testing time and the reading and math scores in grades four and eight on the National Assessment of Educational Progress (NAEP) test program; some tests are not well aligned to each other or to college- or career-ready standards and often do not assess student mastery of any specific content; and parents support replacing current tests with “better” tests. Despite these issues, 82% of the school parents polled expressed support or strong support for “[having] an accurate measure of what my child knows” (pp. 9-11).

Broadfoot (2009) describes a four-dimensional characterisation of assessment systems for analysing the links between the assessment system and the social context in which it is embedded. The components are purposes; mode (means used to gather evidence of learning); content (what is being assessed); and organisation (how assessments are conducted). She argues that the prevailing social context in Western societies at the end of the 20th century was dominated by enlightenment and modernist sentiments to do with “individual rights and responsibility, rationality and scientific progress” (p. vi). There was also enormous investment made during the 20th century into “the pursuit of maximum accuracy in educational measurement” (p. vii).

Broadfoot sees measurement as the purpose of assessment in this social context. The higher the score the more social merit was bestowed on that person, who then, presumably, could go on to be anything they wanted to be in life (scientist, doctor, lawyer, pilot and any other high social status job they desired). The content to be assessed was the curriculum content that could be measured. The dominant mode of assessment was ‘paper and pencil’ testing. The evidence of learning it delivered ranged from a letter representing the best response (from several options provided), to a few words or the result of a calculation, to an extended response involving (one, some, or all of) calculations, annotated graphic representations (flow charts, diagrams, tables and graphs) and text types characterising description, explanation, justification or a creative synthesis. TIMSS, PISA and NAPLAN tests use a mix of short response items and extended response tasks, with the balance in favour of short response items (typically between 60 and 80%). Correct responses were counted and summed. In this context the bigger the number the better the result.


The typical organisation for external, standardised tests assumes responses will be from individuals and provided within a strictly imposed time limit, and that the test and answer booklets would be produced, printed, delivered, collected, collated and coded in processes managed by the agency responsible for the test (or their delegate). Large-scale test scores (raw scores) once obtained would often be standardised in a variety of ways using statistical procedures to ensure a fair basis for comparability.
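One common way of standardising raw scores can be sketched as follows. This is a minimal illustration only, assuming a simple z-score transformation followed by a linear rescaling; the thesis does not state which statistical procedures any particular testing agency uses, and the function names, the sample scores and the reporting scale (mean 500, standard deviation 100) are hypothetical.

```python
from statistics import mean, pstdev


def standardise(raw_scores):
    """Convert raw scores to z-scores (cohort mean 0, standard deviation 1)."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population standard deviation of the cohort
    if sigma == 0:
        # Every candidate has the same score, so there is no spread to scale by.
        return [0.0 for _ in raw_scores]
    return [(score - mu) / sigma for score in raw_scores]


def rescale(z_scores, target_mean=500.0, target_sd=100.0):
    """Map z-scores onto a reporting scale (the target values are illustrative)."""
    return [target_mean + target_sd * z for z in z_scores]


# Hypothetical raw scores out of 40 for five students.
raw = [12, 20, 28, 36, 24]
z = standardise(raw)
scaled = rescale(z)
```

Because each score is expressed relative to the cohort mean and spread, results from tests with different raw-score ranges can be placed on a common scale, which is the sense of "a fair basis for comparability" used above.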

Many teachers and others in the community beyond schools believe that this mode of assessment provides an objective, unbiased and thus fair assessment of individual performance at the time the test is taken. Support for this generalisation has been expressed in international and local (Australian and NSW) reports and research papers reviewing large-scale assessment programs, such as those mentioned above and the recently abandoned Year 10 tests in NSW. Examples include Cooney (2006), Smith (2005) and Wasson (2009) in respect of the NSW literacy and numeracy tests; BOS (2011) for the now abandoned Year 10 tests in NSW; and Thomson, Wernert, et al. (2017) and Thomson, De Bortoli, et al. (2017) for the latest TIMSS and PISA reports respectively. In the US, the NRC (2001) supports the NAEP (2011) test model.

Broadfoot (2009) goes on to identify a change developing in how the education community views assessment that she associated with post-modernism.

[This] movement sees assessment as a tool to support learning … involvement of human beings in every aspect of its design, execution and use makes [testing] irrevocably a social project and thus subject to all the vagaries that any kind of human activity implies … assessment in the 21st century shows signs of a growing preoccupation with ‘fitness for purpose’ and impact on learning. (p. vii)

This emerging view supports the move away from seeing assessment as a summative program to a formative one (as evidenced in the NSW Science syllabus of interest here). The EV program is an attempted shift in that direction. It uses a summative test to provide feedback on learning with the expectation that test results be used formatively by teachers to improve science learning and engagement. How this has worked out in practice in NSW is reported on in the concluding chapter of this thesis.

The OECD’s (2011) report and recommendations on the Australian evaluation and assessment system mentioned at the beginning of this chapter were based on Australia’s submission to the OECD review process and the observations of an independent OECD panel that visited Australia in June 2010. The panel concluded:

The overall evaluation and assessment framework [in Australia] appears as highly sophisticated and well conceptualised, especially at its top level (national and systemic levels). However, there is a less clear articulation of ways for the national agenda to generate improvements in classroom practice through the assessment and evaluation procedures which are closer to the place of learning. (OECD, 2011, p. 9)

Of interest, though, is the inclusion of two Australian case studies of formative assessment in one of the OECD’s What Works publications on formative assessment (CERI, 2005). These local examples of good classroom assessment and school support for assessment are models that could be applied more widely in Australia to address the panel’s conclusions.

2.4 The purposes for assessment

The locus of interest for this study is teachers’ assessment-related work. The following discussion about purposes for assessment will focus on classroom and school assessment-related work.

The NRC (2001) report posits three purposes for assessment:

1. to assist learning
2. to measure individual achievement
3. to evaluate programs. (p. 3)


The NRC (2001) says that these three purposes hold for classroom and large-scale tests as well.

In the UK, the Economic and Social Research Council’s (ESRC) report, Assessment in Schools. Fit for purpose? A Commentary by the Teaching and Learning Research Programme (Mansell et al., 2009), identified the same three uses (or purposes) for assessment:

1. to help build pupils’ understanding, within day-to-day lessons
2. to provide information on pupils’ achievements to those on the outside of the pupil-teacher relationship: to parents (on the basis of in-class judgments by teachers, and test and examination results) and to further and higher education institutions and employers (through test and examination results)
3. to hold individuals and institutions to account, including through the publication of results which encourage outsiders to make a judgment on the quality of those being held to account. (p. 8)

The first purpose in both reports is also referred to in the literature as:

• classroom assessment (Black & Wiliam, 1998b; Brookhart, 2003; Cowie, 2005, 2013; Marzano, 2000; Ruiz-Primo & Li, 2012; Shepard, 2001; Stiggins, 2004)
• formative assessment (Bell & Cowie, 2002; Black & Wiliam, 2009; Heritage, 2010; Panizzon, Callingham, Wright, & Pegg, 2007; Sadler, 1998; Stiggins & DuFour, 2009)
• assessment for learning (ARG, 2002a; Biggs & Collis, 1982; Hargreaves, 2005; Stiggins, 2002; Wiliam, 2011b)
• embedded assessment (mainly in the US) (Wiliam, 2011a; Wilson & Sloane, 2000).

The second purpose in both reports is often referred to as summative assessment (Biggs, 1998; Harlen & Deakin-Crick, 2002; Harlen, 2005) or assessment of learning (ARG, 2006; Hackling, 2004). Historically, summative assessment attracts considerable individual and/or public attention when results that have real consequences for those receiving them are publicised, delivered, used or recorded for later use. For that reason, summative assessment is also called high-stakes assessment (Au, 2007; Broadfoot & Black, 2004; Dulfer, Polesel, & Rice, 2012; Gipps, 1999; Klenowski & Wyatt-Smith, 2012; Lim, Tan Eng Thye, & Kang Lu-Ming, 2009). The last citation relates to Singapore’s education system requirements.

The third purpose relates to accountability. Often the results of summative assessments are the basis for monitoring the performance of a school or school system. The issues related to high-stakes assessment discussed by the researchers listed above for summative assessment also apply in this context.

2.4.1 Three purposes for assessment?

In an editorial reviewing the first 10 years of the UK journal Assessment in Education: Principles, Policy & Practice, Broadfoot and Black (2004) asserted:

Educational assessment must be understood as a social practice, an art as much as a science, a humanistic project with all the challenges this implies and with all the potential scope for both good and ill in the business of education. (p. 8)

The editors go on to identify from the papers published in those years a “subset of subtle purposes, which serve to underline the pervasive [social] power of assessment to define and shape every aspect of educational life” (pp. 11-12), including:

• as a mechanism for controlling class behaviour and attention (the threat of poor results!)
• to describe achievement standards in terms of qualitative changes in the response capabilities of students over time. This was a reference to work done in Australia in the first half of the 1990s to develop subject ‘Profiles’ for a national curriculum (Rowe & Hill, 1996)
• the use of assessment to encourage ‘deep’ rather than ‘surface’ learning
• encouraging ownership (by both teachers and students) of assessment as an influence on their capacity and motivation to learn
• the growing use by policy makers of the social power of assessment in attempts to raise achievement levels, change the focus of curriculum priorities, in performance management for teachers, institutional quality assurance and control and, defining ‘standards’ through the publication of league tables.

Matters and Curtis (2008) refer to attempts by policy makers to change the focus of curriculum priorities as “signalling” (p. 17). The writers use the term in the context of government efforts to have key competencies embedded in school level curriculum documents assessed by teachers. The message from government was that this content was of equal value to the other content in, say, the NSW science syllabus. In imposing the EV program on schools, the NSW government was signalling to students, teachers and the wider community its view of the relative importance of science in the curriculum (see the quotation opening Section 2.2). The same could be said of the decision to introduce sample testing of Year 6 students in science literacy every three years (Ball, Rae, & Tognolini, 2000) and the decision to participate in international testing of science.

The second, third and fourth purposes are linked to formative assessment and will be explored in Section 2.6. The fifth cluster of purposes identified here relates easily to the third purpose of assessment identified by both the NRC (2001) report and the Mansell et al. (2009) commentary.

It is evident that discussions about assessment and meanings of related terms can be a source of confusion. Newton (2007) is a UK-based expert and researcher with wide experience in assessment. Based on his experience of discourse about assessment purposes, he reports that the phrase ‘assessment purposes’ may be interpreted in at least three ways. The first is a reference to the technical aim of the assessment, which is to make a “judgment” (p. 150) that is typically referred to as the result (this he calls the first or judgment level). “Judgment” and the NRC’s (2001) “interpretation” in the context of the “assessment triangle” (NRC, 2001, pp. 2-3) are equivalent, but it is worth observing that the word “judgment” has moral overtones. “Interpretation” is a neutral, objective, technical word. The word judgment is perhaps an intended, if implicit, reminder of the social power vested in assessment (Broadfoot & Black, 2004).

Newton (2007) analysed historical publications about assessment to explain how first level judgments might be better expressed to clarify the various forms assessment might take. To do this he resorted to technical descriptions of the various judgments a professional working in the assessment area might use. He distinguished between quantitative, summative judgments involving appraisal, and qualitative, descriptive judgments involving analysis at the two ends of a judgment dipole. The former might be either self-referenced or norm-referenced judgments. The latter may be either concept-referenced or performance-referenced judgments.

The second way is about the use to which the assessment result is put (the decision level). Newton (2010) produced a list of 22 “categories of uses for assessments”, including social evaluation, formative assessment, diagnosis, screening, segregating, guidance, program evaluation, and institutional monitoring. The reference in the Mansell et al. (2009) commentary to uses for assessment, rather than purposes, acknowledged Newton’s work in this area.

The same set of test results is sometimes used for multiple purposes, often inappropriately (James, 2009; Newton, 2007). The NRC (2001) report makes the same observation. Compare this with James’s (2009) observation that “twenty years ago … test and examination results were predominantly meant to serve as indicators of what a pupil knew and understood about a subject” (p. 8). Multiple uses for the same set of results were acknowledged in evidence to a UK House of Commons Select Committee (SCCS&F, 2008) along with an acknowledgment that the same test was not always the most appropriate for all purposes.

Newton’s (2007) third way of interpreting ‘assessment purposes’ relates to the intended impact of testing, which is to signal the importance of the learning (so important that it will be tested!). Newton (2007) also recognises unintended, negative impacts for both second- and third-level uses and intentions. The notion of impact is an explicit recognition of the ‘principle’ that assessment is a social act because assessment results both convey information and influence what people do (Mansell et al., 2009).

Put another way, Fensham and Rennie (2013), Jones and Buntting (2013), and Millar (2013) all agree that what is assessed has a powerful influence on what is taught (or not taught). An example is school reporting of achievement in science at the end of Year 10 to the NSW Board of Studies. Schools are advised that the results should not include any consideration of achievement of syllabus outcomes related to values and attitudes. On the other hand, the advice relating to investigation skills is explicit about what is to be included (BOS, n.d.).

Some writers have sought to frame assessment in terms of functions rather than purposes. For Hattie (2003a), assessment is not about the test itself; it is the function that matters. Test results, he asserts, function as feedback to

… teachers and/or students … which they need to interpret when answering the three feedback questions: Where am I going?, How am I going? and, Where to next? Specifically, feedback is actions or information provided by an agent (e.g. teacher, peer, written report, book, parent, experience) that provides information regarding aspects of one’s performance or understanding. (p. 2)

Hattie argues that these three questions work for all levels of the assessment system, and that feedback combines judgment and action (either proposed or actual).

Masters (2013, p. 2) proposes that the overriding function of assessment is to provide understanding, not judgment. He uses the analogy of a doctor-patient consultation where the doctor is trying to elicit the symptoms from a patient in order to diagnose the illness and then propose actions to cure the patient. Extending this analogy, he says, “The fundamental purpose of assessment is to establish where learners are in their learning at the time of the assessment” (Masters, 2013, pp. 5-6, italics in the original).


In this scenario Masters wants to remove the pejorative judgment (of pass or fail) and replace it with understanding as the basis for further action. Both Hattie and Masters share a view of learning as a continuous process that can be assisted by a timely diagnosis and appropriate intervention. Both researchers see the primary role for assessment as improving student learning.

2.4.2 Theories of learning, cognition and assessment

At the beginning of this section, an overview of what a reader needed to bring to a productive discussion about assessment was outlined. What a stakeholder in education understands about learning and cognition informs what they believe is important to learn and how they explain why it matters. It also informs the construction or choice of tasks to provoke responses from students, the interpretation of those responses (in terms of the assessor’s understanding of curriculum intentions), and the representation and explanation of the judgment (the result) about learning inferred from the responses to assessment tasks.

Two examples of where teachers found theories of learning and cognition helpful follow. In the first, Black, Harrison, Lee, Marshall, and Wiliam (2004) found that the UK secondary science, mathematics and English teachers with whom they were working to improve formative assessment practices wanted to know more about “the psychology of learning” (p. 16). Teachers wanted a model of how students learn that would be useful for providing feedback to students. In the second example, Panizzon et al. (2007) found that when participating teachers were given the SOLO theory of cognition, they found it useful for planning assessment tasks and restructuring science learning programmes to reflect the developmental changes anticipated by the SOLO model.

A discussion of learning theories and their relationships with assessment follows. According to the NRC (2001) report,

Most current tests, and indeed many aspects of the science of educational measurement, have theoretical roots in the differential and behaviorist traditions. The more recent perspectives – the cognitive and the situative – are not well reflected in traditional assessments. (p. 60)

Biggs (1995) wrote:

Two basic conceptions of the nature of learning exist in our educational thinking, quantitative and qualitative … the quantitative tradition has the longest history [and stems from] the positivist tradition in the social sciences … The qualitative tradition has its roots in nineteenth century phenomenology [and] Gestalt psychology. [Both of which later contributed to a family of learning theories underpinned by] constructivism. (pp. 2-5, italics in the original)

The quantitative assessment tradition is associated with behaviourist theories of psychologists such as Edward L. Thorndike and B. F. Skinner who conceive learning as acquiring

discrete quanta of declarative or procedural knowledge; as far as assessment was concerned, any one quantum is treated as functionally independent of any other. The curriculum becomes in effect a list of discrete units: facts, skills, competencies, behavioural objectives, performance indicators, and the like and assessment a matter of how many. (Biggs, 1995, p. 2)

From this perspective, teaching or instruction is

conceived as transmitting knowledge from teacher to learner … the teacher’s task is to know the subject and expound it clearly, the learner’s to receive it accurately [and] assessment [involves the] correct units being summed to give an accurate score that yields an index of competence in what is learned. (Biggs, 1995, p. 2)

The quantitative assessment instrument of choice was the multiple-choice test. If essays were used, the marking rubric identified units that would be considered correct or acceptable and ‘full marks’ would be awarded when enough correct units were evidenced. A good test would have a range of units at varying levels of cognitive difficulty but the units would all be treated as having “mutual equivalence, independence, and additivity” (Biggs, 1995, p. 3).
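The idea that units are equivalent, independent and additive can be made concrete with a short sketch of how a multiple-choice paper is scored. This is an illustration only: the answer key, the student responses and the function name are hypothetical, and no particular test is being reproduced.

```python
def raw_score(responses, answer_key):
    """Count correct responses; every item is worth exactly one mark."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer key must have the same length")
    # Each item contributes 0 or 1 regardless of its difficulty or of how the
    # student answered any other item (equivalence and independence), and the
    # item marks are simply summed (additivity).
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


# Hypothetical five-item test.
answer_key = ["B", "D", "A", "C", "A"]
student = ["B", "D", "C", "C", "A"]
score = raw_score(student, answer_key)
```

Under this scoring rule two students with the same total are treated as equally competent, even if they answered quite different items correctly, which is the limitation the qualitative tradition draws attention to.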

The behaviourist perspective emerged in the 1930s “about the same time that theories of individual differences in intellectual abilities were maturing” (NRC, 2001, p. 61). According to behaviourists such as Thorndike (cited in NRC, 2001),

People learn by acquiring simple components of a skill, then acquiring more complicated units that combine or differentiate the simpler ones. Stimulus-response associations can be strengthened by reinforcement or weakened by inattention. When people are motivated by rewards, punishments, or other (mainly extrinsic) factors, they attend to relevant aspects of a situation, and this favors the formation of new associations and skills. (p. 61)

By contrast, the qualitative, constructivist or cognitive perspective comprises

a family of theories rather than any one, according to which students are assumed to learn cumulatively, actively interpreting and incorporating new material with what they already know. Different theories variously emphasize the individual, social, cognitive, saccadic, contextual or emergent natures of learning, but all agree on an active learner seeking meaning by constructing knowledge rather than by receiving and storing knowledge. (Biggs, 1995, pp. 3-4)

In this perspective, the teacher’s role is to help students “construct understandings that are progressively more mature and congruent with accepted thinking” (p. 4). The teacher should also recognise that students’ everyday experiences and prior learning will inevitably lead to naïve or alternative conceptions (Driver & Easley, 1978) of how the world works, and these need to be challenged and reoriented to better reflect the scientific viewpoint. A constructivist model of teaching and learning is the 5Es approach, as advocated in the Science by Doing curriculum support materials produced by the Australian Academy of Science (AAS, 2017).


From the quantitative perspective, assessment

implies aggregating units of learning taken cross-sectionally with respect to time, that from the qualitative tradition implies charting longitudinal growth over time, from relative ignorance to relative competence … If that growth in competence can be described in recognizable stages then so much the better, because these stages can then become assessment targets. (Biggs, 1995, p. 4)

Biggs (1995) then describes two kinds of assessment that have emerged from constructivist thinking. One he describes as ecological, which appears to equate with what others have called performance or authentic assessment (Frey & Schmitt, 2007); the other he describes as developmental assessment. It is the latter that he goes on to elaborate as “a generalized model of qualitative assessment” (p. 6) and associate with the SOLO Taxonomy (Biggs & Collis, 1982). The SOLO Taxonomy and SOLO model will be described later in this chapter. Whilst Biggs (1995) positions the SOLO taxonomy as a qualitative developmental model, the later SOLO model has been validated both empirically and in measurement model terms as well (Panizzon & Bond, 2007).

The situative view of learning provides support for those arguing that assessment should be authentic, such as Darling-Hammond (2003); Fensham and Rennie (2013); Hackling (2004); Tytler (2007); and Wiggins (1998). The NRC (2001) writers say of this perspective:

Much knowledge is embedded within systems of representation, discourse, and physical activity. Moreover, communities of practices are sites for developing identity – one is what one practices, to some extent. (p. 89)

In addition, standard assessment models take a view of knowledge as “disembodied and incorporeal [and it] captures only a small portion of the skills actually used in many learning communities” (NRC, 2001, p. 89).


The situative view of learning supports recent efforts to provide contexts for both the learning of science in the syllabus of interest for this project (BOS, 2003) and the framing of science as a human endeavour, and to engage students with science and encourage them to see themselves doing STEM work, post-school.

Vygotsky’s (1978) concept of the zone of proximal development has been influential in the situative or socio-cultural view of learning. Shayer (2003) provides a commentary on both Piaget’s and Vygotsky’s views of cognitive development in children to support his particular intervention aimed at accelerating cognitive development.

Another contribution to the discussion about learning and related conceptions of assessment is that by Sfard (1998). Her contribution bridges behaviourist and cognitive (constructivist) and socio-cultural views of learning. She suggests that two metaphors are useful for understanding learning: the learning as acquisition metaphor (AM) – we acquire concepts or knowledge; and the learning as participation metaphor (PM). In the context of AM, assessment is about the quantity of what has been acquired. In PM, assessment is about a process of knowing, with the permanence of having giving way to the constant flux of doing. This metaphor implies that learning a subject is about “becoming a member of a certain community” (p. 6). AM is about the individual; PM is about the social.

Millar (2013) strongly advocates that both curriculum intention (what has to be done) and the assessment task (the conditions under which it is to be done as a demonstration of the acquired learning) should be provided in curriculum documents. “The assessment instrument becomes an operational definition of the [science learning] objective” (p. 56). Also, doing that would require teachers to accept (if the task involved performance) a view of learning that acknowledges both AM and PM (Sfard, 1998).

An example of a teaching sequence that demonstrates a view of learning where both AM and PM are acknowledged is provided as Appendix C.


The cognitivist perspective and related developmental approaches to assessment have informed work being done to elucidate learning progressions, both those that span the years of schooling and those that span a topic of work lasting from five to ten weeks. The NRC (2001) report has a comprehensive and detailed discussion about developmental assessment and related terms, including progress maps, progress variables, developmental continua, progressions of developing competence, and profile strands (p. 137).

Of progress maps in general, the NRC (2001) report says:

The Developmental Assessment approach represents a notable attempt to measure growth in competence and to convey the nature of student achievement in ways that can benefit teaching and learning. (p. 190)

Rowe and Hill (1996) draw on both behaviourist and constructivist views of learning to provide an insight into the development of the Australian subject curriculum profiles (CURASS, 1994) and outline their strengths and weaknesses from a developmental perspective.

Tom Corcoran’s team at the Centre on Continuous Instructional Improvement (CCII) (Corcoran, Mosher, & Rogat, 2009) works on science learning progressions in the US. The team refers to the definition in the NRC-funded school science textbook Taking Science to School edited by Duschl, Schweingruber, & Shouse (2007), which is widely used in the US:

Learning progressions are descriptions of the successively more sophisticated ways of thinking about a topic that can follow one another as children learn about and investigate a topic over a broad span of time (e.g. 6 to 8 years). They are crucially dependent on instructional practices if they are to occur (p. 214).

Corcoran et al. (2009) use the term “adaptive instruction [to capture the sense of] formative assessment in action” (p. 8). This appears to be synonymous with the phrase assessment for learning that appears in the syllabus relevant to this study (BOS, 2003) and with what Black and Wiliam (2009) call “formative practice” (p. 8).

2.4.3 Criteria for evaluating the credibility of assessments

In the context of explaining how to ensure the quality and credibility of assessments, researchers referred to a number of criteria that need to be addressed. Four examples of lists of criteria are provided in Table 2.2. The criteria apply from the level of classroom assessment to large-scale external assessment.

Table 2.2 Issues to resolve when planning, constructing and using assessments

NRC (2001): Identification of the targets for assessment; Item and test design; Validation; Reporting; Fairness

Harlen (2005): Validity; Reliability; Dependability

Matters and Curtis (2008): Validity and related constructs; Reliability and related constructs; Objectivity; Feasibility; Usability; Credibility

Ruiz-Primo (2009): Choose an approach to science instruction (eg inquiry…; Identify the critical skills; Define assessment purposes; Define an appropriate approach for: Validity, Reliability, Fairness, Issues of practicality

The change of state example (described in Appendix C) considers the “constructs” (NRC, 2001, p. 112) of physical and chemical change. An assessment task related to that example might involve providing students with access to a series of short video clips showing natural and ‘made’ changes. The task is to identify in each clip a process where either a physical or chemical change is occurring and to justify the choice.

The first consideration is validity (see Table 2.2 for the list of criteria). Do the video clips contain examples of the two types of changes? Do the images show aspects (construct dimensions) of the phenomena that are actual pointers or indicators of the changes to be recognised and associated with either a physical or chemical change and not something else? Is there evidence of other important learning that could be the subject of assessment (such as the practical value of the knowledge for safe use of materials and chemicals that could be inferred from the contexts on display in the video footage)?

Mansell, James & the ARG (2009) summarise the issue of validity in the context of teachers’ summative assessments as “about whether the assessment measures all that it might be felt important to measure” (p. 12). In the above example, choices have to be made about whether the focus is on the processes of chemical change in isolation or whether students should be prompted to say something about its usefulness as well.

Messick’s (1995) views on validity are widely cited in the research literature (e.g. Broadfoot and Black (2004); Hattie, Jaeger, & Bond (1999); Masters (2013); NRC (2001); Shepard (1993)). Messick (1995) defines validity in the context of psychological and educational assessment as

nothing less than an evaluative summary of both the evidence for and the actual - as well as potential - consequences of score interpretation and use (i.e., construct validity conceived comprehensively). This comprehensive view of validity integrates considerations of content, criteria, and consequences into a construct framework for empirically testing rational hypotheses about score meaning and utility. (p. 742)

He contrasts this more comprehensive approach to score interpretation with the historical

primary emphasis in construct validation … on internal and external test structures – that is, on the appraisal of theoretically expected patterns of relationships among item scores or between test scores and other measures. (p. 743)


In essence, Messick (1995) is saying that the original construct for validity was located in the measurement paradigm for assessment (Biggs, 1995; Broadfoot, 2009) and he broadened it to encompass the concepts (constructs) that classroom teachers engage with every day and are looking to assess in the context of performances. Messick says this broader view of construct validity (see Table 2.3) depends on an appraisal of six aspects he identifies as “content, substantive, structural, generalizability, external, and consequential” (pp. 744-745).

Table 2.3 Messick’s aspects of construct validity

content: includes evidence of content relevance, representativeness, and technical quality

substantive: refers to theoretical rationales for the observed consistencies in test responses, including process models of task performance, along with empirical evidence that the theoretical processes are actually engaged by respondents in the assessment tasks

structural: appraises the fidelity of the scoring structure to the structure of the construct domain at issue

generalizability: examines the extent to which score properties and interpretations generalize to and across population groups, settings, and tasks including validity generalization of test criterion relationships

external: includes convergent and discriminant evidence from multitrait-multimethod comparisons as well as evidence of criterion relevance and applied utility

consequential: appraises the value implications of score interpretation as a basis for action as well as the actual and potential consequences of test use, especially in regard to sources of invalidity related to issues of bias, fairness, and distributive justice

Source: Messick, 1995, pp. 744-5

Mislevy (2008) draws attention to research work that attempts to reconcile current psychometric models of assessment and recent views of cognition that include both cognitivist and sociocultural or situative perspectives.

Cognition, in this view, is not just something that happens inside individuals’ heads, but a coordinated interplay of actions within and among people in a socially-structured space. (p. 6)


Mislevy (2008) explores the impact of sociocultural views of learning on the traditional measurement models based on cognitivist views of learning and concludes that (latent) trait or item response theory “still holds under a sociocognitive metaphor, but with an interpretation quite different than that of the strict measurement metaphor” (p. 13). Latent trait theory ascribes a range of consistent behavioural responses to underlying, invisible but stable mental constructs such as ability, aptitude, expertise and intelligence. He also reports that another line of inquiry is finding that

Models adapting features of generalizability theory, cognitive diagnosis, and standard measurement models would seem to be a suitable starting point for a psychometrics to support assessment under the sociocognitive metaphor. (p. 13)
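To show what a latent trait model looks like in practice, the sketch below uses the one-parameter logistic (Rasch) model, one of the simplest item response models. This choice is an assumption made for illustration; the text above does not say which model Mislevy (2008) has in mind, and the ability and difficulty values and function names are hypothetical.

```python
import math


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the one-parameter (Rasch) model.

    Both arguments are on the same logit scale: the probability depends only
    on the gap between the person's latent ability and the item's difficulty.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


def expected_raw_score(ability, item_difficulties):
    """Expected number of correct responses on a test for a given ability."""
    return sum(rasch_probability(ability, b) for b in item_difficulties)


# Hypothetical item difficulties for a three-item test, easiest first.
difficulties = [-1.0, 0.0, 1.0]
p_matched = rasch_probability(0.0, 0.0)      # ability equals difficulty
expected = expected_raw_score(0.0, difficulties)
```

The unobserved ability parameter plays the role of the "underlying, invisible but stable mental construct" described above: it is never measured directly, only inferred from the pattern of observed responses.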

The second criterion in Table 2.2 is reliability. Would different assessors score student responses the same way? Would the same assessor score a comparable response the same way? Would a student answering a comparable question on a different day answer the same way? And what does comparable mean in any case? Well-constructed marking criteria and rubrics help to ensure consistency of marking (an aspect of reliability), as would some prior practice using them before marking actually commenced. As well, check-marking by another assessor of a random sample of already marked scripts is another way of ensuring inter-marker reliability.
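The check-marking procedure described above can be quantified. The sketch below computes simple percent agreement and Cohen's kappa, a widely used statistic that corrects agreement for chance. Treating kappa as the measure of inter-marker reliability is an assumption made for illustration, not something the sources cited here prescribe, and the marks and function names are hypothetical.

```python
from collections import Counter


def percent_agreement(marks_a, marks_b):
    """Proportion of scripts on which the two markers awarded the same mark."""
    if len(marks_a) != len(marks_b) or not marks_a:
        raise ValueError("both markers must mark the same non-empty set of scripts")
    matches = sum(a == b for a, b in zip(marks_a, marks_b))
    return matches / len(marks_a)


def cohens_kappa(marks_a, marks_b):
    """Agreement between two markers, corrected for agreement expected by chance."""
    n = len(marks_a)
    observed = percent_agreement(marks_a, marks_b)
    counts_a = Counter(marks_a)
    counts_b = Counter(marks_b)
    # Chance agreement: probability both markers award the same mark if each
    # assigned marks at random according to their own mark distribution.
    expected = sum(counts_a[mark] * counts_b[mark] for mark in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both markers gave every script the same single mark
    return (observed - expected) / (1.0 - expected)


# Hypothetical marks (0-2) awarded to a random sample of eight scripts.
first_marker = [2, 1, 0, 2, 1, 2, 0, 1]
check_marker = [2, 1, 0, 1, 1, 2, 0, 2]
agreement = percent_agreement(first_marker, check_marker)
kappa = cohens_kappa(first_marker, check_marker)
```

Kappa is lower than raw agreement because some matching marks would occur even if the markers were guessing; a value near 1 indicates consistent application of the rubric, while a value near 0 indicates agreement no better than chance.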

Fairness (see Table 2.2) is ensuring that students have had opportunities before the test to learn about physical and chemical changes and the differences between them. At one level, this can be an issue in Grade/Year cohort testing in schools where more than one class of students sit a common test. It can also be an issue with external testing when the curriculum used to prepare for the test is different across the various sites taking the test. In the UK and Australia, external testing is based on national curriculums that describe standards and related content that students taking the test have (or should have) been “taught”. In the US, curriculum choice rests with individual school district boards. Large-scale external testing there has to be more about general capabilities linked to assumed, common domain-specific knowledge that may or may not have been “taught” (Ruiz-Primo, Shavelson, Hamilton, & Klein, 2002).

Another consideration is the choice of assessment task and the opportunities it provides for different students (say from a background where English is not spoken at home) to respond. Returning to our chemical change/physical change example, would a deaf or blind student be able to score the same as a student with normal hearing and vision, given that the testees are not able to observe all possible evidence (e.g. a deaf student would not hear heated corn popping, and a blind student would not see it)? Some students may not have had any experience of pop-corn at all. Do testees need to write a response or simply tell the assessor what is going on? How many correct responses are required to demonstrate proficiency? The fairness and equity of assessment issues raised here are all related to assessment validity (Messick, 1995).

Dependability (see Table 2.2) involves making a defensible trade-off between validity and reliability. In the context of teacher summative assessment, Harlen (2005) says:

Dependability is a combination of the two, defined in this instance as the extent to which reliability is optimized while ensuring validity. This definition prioritizes validity, since a main reason for using teachers’ assessment rather than depending entirely on tests for external summative assessment is to increase the construct validity of the assessment. (p. 213)

In assessing student responses to the above task (distinguishing between physical and chemical changes), short response items (or even multiple-choice options) may increase reliability, but options for extended responses could include applications and reasons for choosing either chemical or physical change. The latter options improve validity but are harder to score reliably.

Objectivity is mentioned by Matters and Curtis (2008) as it is often raised as the bulwark for fairness. However, if the assessment is complex, such as marking an essay, it might be worth attending to the “objectivity and fairness of those who assess student work” (p. 15). The concern here is marker bias and how to ensure it does not affect or distort the application of the assessment criteria (see discussion about reliability and dependability above).

Feasibility/practicality (Table 2.2) also needs to be taken into account. According to Matters and Curtis (2008), “Feasibility means capable of being done, with the connotation of convenience and practicability in the doing. While many things are doable, fewer are feasible” (p. 15). In the context of a national program, cost- and time-effectiveness are important considerations, and in a school context, resources and time are the relevant factors.

In the example provided above, should the teacher provide the video clips and questions on a USB drive or allow students to access them via the school’s intranet? The higher the cost in terms of time and resources, the more important it is to explain the benefits of what is being done. In large-scale testing of science in the Australian context, performance tasks involving an investigation were included in the national sample tests for Year 6 science (the National Assessment Program - Scientific Literacy (NAP-SL) tests; ACARA, 2014a) but were replaced by an online simulation for the 2015 test (ACARA, 2017). The PISA, TIMSS and EV tests include no performance tasks (the EV test has a simulated investigation as one of the extended response tasks). Some (e.g. Fensham & Rennie, 2013) would argue that this reduces the validity of test scores relating to science.

Usability (in Table 2.2) is another issue Matters and Curtis (2008) raise:

The usability of assessment and reporting methods involves the capacity of the assessment and reporting system to be informative to stakeholders in meeting their diverse needs … An approach will be regarded as practicable if it works and imposes a justifiable yet limited load upon participants and yields valuable information to stakeholders. (p. 16)

The researchers discuss the notion that good assessment provides both summative and formative feedback and is credible. Credibility (see Table 2.2) inheres in the soundness of the assessment regime and the reputation of the issuing authority, which (for the purposes of employability skill credentials) may be the schools or the established state and territory curriculum and assessment bodies.

Education authorities publicise results from international tests and school-level aggregations of national test results. In Australia, results from international tests (TIMSS and PISA), NAP assessments and NSW Year 12 external school exit examinations are published for all to see. Some media outlets use the results to publish ordered lists of schools using whatever criteria the reporters believe support the point they want to make about assessment results. Private coaching colleges also use the results in their advertisements to attract clients.

Poor results can encourage teachers to teach to the test (Au, 2007) as a mistaken response to social pressure for good results. Consistently poor assessment results can discourage participation and engagement in learning among students who already have poor learning histories (ARG, 2002b). Testing or assessment that is consequential for stakeholders has been labelled ‘high stakes’ in the literature (e.g. Gipps, 1999; Harlen & Deakin-Crick, 2002; NRC, 2001; Polesel, Dulfer, & Turnbull, 2012).

Messick (1995) insists that the impact of assessment results on individuals must be taken into account when interpreting assessment scores. On that basis, it is entirely appropriate for ACARA to explain the limitations of the information it provides about schools on the MySchool website, which it knows people access to compare schools. TIMSS and PISA testing involves the collecting of contextual information to assist with interpreting the test scores PISA officials publish for each country (Thomson, Wernert, et al., 2017; Thomson, De Bortoli, et al., 2017).

The NRC (2001) report identified four sets of concerns about the adequacy of assessments that were evident at that time:

1. the validity of evidence used to produce results
2. the reliability of inferences about the level of competence and overall proficiency demonstrated
3. the publishers’ silence about interventions likely to improve achievement or performance
4. issues with equity and fairness. (see pp. 26–29)

Growing recognition of these concerns has prompted education authorities to better align large-scale assessment with curriculum standards and to develop assessments for knowledge and skills not well addressed by existing test items and tasks that make up most standardised tests currently in widespread use (such as aptitude tests used to moderate individual school test results in Australia and in the US). Performance assessment has been another response. Students are presented with “open-ended tasks that call upon [them] to apply their knowledge and skills to create a product or solve a problem” (NRC, 2001, p. 30).

Harlen (2005), in work for the UK-based Assessment Reform Group (ARG), explored the issues teachers have to reconcile when attempting to use classroom assessments and results from tests, including large-scale external tests, for both formative and summative purposes. Broadly speaking, the trade-off that has to be made is between validity and reliability, which was discussed above in the context of Dependability. The discussion in her paper is applicable to the NSW context where teachers are being asked to use assessments for both summative and formative purposes (the EV program).

A major report on high stakes testing in Australia was published in two parts by the Whitlam Institute in 2012. The literature review part (Polesel et al., 2012) considered “whether the tests themselves are reliable, valid and desirable on their own terms as a means of assessment” (p. 8) and cited research challenging the tests as a basis for educational decision making under the headings of reliability, student health and well-being, learning, teaching and curriculum.

The report itself (Dulfer et al., 2012) drew on the literature review and responses (N = 8353) to a very large online national survey to provide an “educators perspective” (in the report title) and concluded:

NAPLAN is viewed by the teaching profession as ‘high stakes testing’; findings … suggest that NAPLAN may be having a detrimental effect in areas such as curriculum breadth, pedagogy, staff morale, schools’ capacity to attract and retain students and student well-being; and, concerns expressed … suggest that further research is required to examine carefully the uses, effects and impacts of NAPLAN (p. 9)

Whilst the tests reviewed in the Whitlam Institute-sponsored research assess literacy and numeracy, the technical issues related to validity, reliability, desirability and fairness apply to any one-off summative test, such as the national Year 6 science test, the EV tests, the now abandoned Year 10 tests in NSW and current Year 12 school exit tests in Australia and other parts of the world, as well as tests devised by teachers for students at their schools.

2.5 Measurement and summative and evaluative assessment

This section will describe summative assessment models – “the generation of summative data” (Broadfoot, 2009, p. x) – that epitomise the rigorous approach to measurement that underpins the TIMSS, PISA, NAP-SL and NAPLAN tests. It will include examples from the MySchool website.

Discussion of the above tests is relevant to this thesis for three reasons. The first is to get a sense of what is being measured. The second is to obtain a sense of whether test results can be used for formative purposes and, if so, at what level/s (individual, class, school, school system (government or private), state or territory, national or international) the information they provide might be useful. The third reason is to understand what information about schools is available on the MySchool website and to explain how it was used in this research project.

In the context of a school, teachers’ summative assessment (of individuals’ achievements) usually happens at the end of an episode of teaching. The phrase summative assessment refers to

[the] process by which teachers gather evidence in a planned and systematic way in order to draw inferences about their students’ learning, based on their professional judgement, and to report at a particular time on their students’ achievements. (Harlen, 2005, p. 247)

At this point, it is perhaps worth recalling how evidence is gathered and used to inform reports to parents.

Humans can only provide evidence in the form of what they write, make, do and say and it is from these four observable actions that all learning is inferred. This is the basic and fundamental role of assessment – to help interpret observations and infer learning. The more skills are observed, the more accurately generalised learning can be inferred. Hence, there is a need to document the discrete observable skills and find a way to blend them into cohesive evidence sets (Griffin, 2009, p. 195)

Griffin (2009) could equally have completed the above quote with the following addendum: “and interpret them in a conventional way to report progress in learning”. In NSW, reporting conventions for students from Years K to 10 are described on the Board’s website (NESA, 2017, Awarding grades). Information about assessment and related reporting procedures in the senior years for NSW and other Australian states and territories is available on the Australasian Curriculum, Assessment and Certification Authorities website (ACACA, 2018).

The assignment of a grade for reporting purposes is based on a teacher’s judgment of the accumulated evidence of learning gathered since the last report. NSW government schools are required to formally report to parents twice a year. What is to be learned and assessed are syllabus outcomes and the related content that defines the minimum expectations for achieving the outcome/s. This learning matters because it has been deemed appropriate by those empowered to create the syllabus for students at the age and stage of learning for which the proposed summative assessment is to be done. The expectation is that teachers will have provided students with access to the content that will be the target of the assessment.

As will be shown in Chapter Five, evidence of learning (in science) in the 16 case study schools was typically collected using pen-and-paper tests, responses to practical activities and research projects for which written reports or answers to specific questions and/or oral presentations were required. Typically, tasks were assessed in the course of the school year and a mark awarded to each based on criteria derived from the outcome/s targeted and its/their related content. The marks were recorded and then used as the basis for making an ‘on-balance’, holistic judgment that was then represented as a grade from A to E. It is the grade that is reported to students, parents and interested others at predetermined times in the year over the successive years of schooling.

In making this judgment, teachers are assisted by the Department (official policy and support material on the Department’s intranet for government school teachers) and the NSW Board of Studies public website dedicated to assessment support (BOS, 2013). The Board’s website includes the Common Grade Scale (CGS) and related advice about how to make a grade judgment. For a particular stage (e.g. Stage 4 covering school Years 7 and 8) an A grade would be awarded to a student who

has an extensive knowledge and understanding of the content and can readily apply this knowledge. In addition, the student has achieved a very high level of competence in the processes and skills and can apply these skills to new situations. (BOS, 2013, The Common Grade Scale)

By comparison, an E grade would be awarded for work judged to demonstrate

an elementary knowledge and understanding in few areas of the content and has achieved very limited competence in some of the processes and skills. (BOS, 2013, The Common Grade Scale)

The scope for judgment about the appropriate grade to apply is constrained to student demonstrations of knowledge and understanding; ability to apply that knowledge and understanding in new situations; and the level of skills and processes related to science. Depth is a relative term ranging, in the case of skills and processes, from very high to very limited. The capacity to apply those skills in new situations goes from an implied “almost all” for an A grade to “most” for a B grade, and is not mentioned for the lower grades. Thus, if there is no evidence of transfer, the best a student can achieve is a C grade. Judgments about syllabus-described Values and Attitudes (BOS, 2003, p. 11) were not to be included in these assessments.

The assignment of the grade is based on a holistic on-balance judgment applying to all of the outcomes assessed or to different “areas” of grouped outcomes. For Stage 5 science, the Board recommends reporting achievement for six areas: Knowing and understanding; Questioning and predicting; Planning and conducting investigations; Processing and analysing data and information; Problem-solving; and Communicating (BOS, n.d.).

Given the methodology for collecting and scoring evidence of learning relative to syllabus standards, reporting in grades appears to be an appropriate trade-off prioritising construct validity over reliability. Awarding a grade involves differentiating between only five levels, whereas reporting results as percentiles would imply a much finer differentiation of dubious reliability.

The next example relates to the ways results from large-scale national testing in literacy and numeracy of every eligible student in Years 3, 5, 7 and 9 in Australian schools are reported. The reference to literacy and numeracy testing and the MySchool website is included in this section of the thesis because both NAPLAN data and other publicly available data on the MySchool website pertaining to the case study schools were accessed to address the research questions at the heart of this thesis. Those uses will be explained in subsequent chapters.

Standardised NAPLAN results for individuals are collated into school sets and used to generate reports for parents and schools. Aggregated school-level data is published on the MySchool website in the form of a level related to a scale that has been established to cover the range of expected performances for the great majority of students sitting the tests up to Year 9. The scale includes 10 performance Bands. Year 3 students’ performance is reported against Bands 1 to 6; Year 9 students’ results are reported against Bands 5 to 10 (ACARA, 2013c, Results and reports). The school websites include a range of other information that is updated annually for the 9450 schools (ABS, 2018) across Australia. Information about each school is published on the school website (ACARA, 2016a, About).

An extract of the school data for a government, metropolitan, comprehensive school with some unclassified students (educationally disadvantaged) and a range of students from Years 7 to 12 is shown in Figure 2.2.

Figure 2.2 Selected school data for a government, metropolitan, Years 7-12 school. Source: MySchool website (ACARA, 2017)

In addition to using NAPLAN data, each school’s socio-educational advantage (SEA) profile (which is referenced to the national quartile profile) was used to find comparable schools, as will be explained in Chapters Three and Five.

ACARA produces an annual report titled the National Report on Schooling in Australia, which is available from the ACARA website (ACARA, 2016c, Reporting). The Report’s main purpose is to report progress toward achieving the two “Educational Goals for Young Australians” (see Appendix B). ACARA does this by “collecting, managing, analysing, evaluating and reporting statistical and related information about educational outcomes from domains of learning deemed important by the national Education Council” (ACARA, 2016d, National data collection and reporting). The scope of this work currently includes literacy, numeracy, science literacy, ICT, and civics and citizenship. Science literacy of Year 6 students, for example, has been monitored triennially since 2003; the latest test cycle was completed in 2015.

The third example of summative testing to be discussed involves international comparative testing and the ways results from those tests are used, by whom and for what purposes. The two tests of interest here are the TIMSS and PISA tests described above. The tests provide summative assessments of performances by student cohorts in schools chosen by a stratified, random sampling methodology to deliver a sample of students for testing. The sample has to be representative of all targeted state and territory student populations in Australia as well as their school sectors and important demographic groups related to assessing equity and excellence (national Goal 1).
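The idea of stratified random sampling can be illustrated with a minimal sketch. It assumes a simplified design in which the same fraction of schools is drawn at random from every state-by-sector stratum; the operational TIMSS and PISA sampling designs are more elaborate than this, and the function name, field names and sampling frame below are hypothetical.

```python
import random
from collections import defaultdict

def stratified_school_sample(schools, fraction, seed=2018):
    """Draw the same fraction of schools at random from every (state, sector) stratum."""
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    strata = defaultdict(list)
    for school in schools:
        strata[(school["state"], school["sector"])].append(school)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # Take at least one school so that no stratum is left unrepresented.
        n_draw = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, n_draw))
    return sample

# Hypothetical sampling frame: 10 schools in each state-by-sector stratum.
frame = [
    {"id": f"{state}-{sector}-{i}", "state": state, "sector": sector}
    for state in ("NSW", "VIC")
    for sector in ("government", "Catholic", "independent")
    for i in range(10)
]
chosen = stratified_school_sample(frame, fraction=0.2)
```

Because every stratum contributes schools in proportion to its size, no state or sector can be missed by chance, which is the property the representativeness requirement above depends on.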

The tests are of literacy, numeracy and scientific literacy, but what is assessed within the domain constructs has to be accessible and comparable in cognitive demand for all participants across the jurisdictions taking part in the testing. Evidence of learning is collected by pen-and-paper tests and other, related, contextual information from surveys completed by students, teachers, principals and education authority officers.

Detailed reports on Australian students’ performance in the international tests and on their considerable influence are available on the ACER website. The intended audiences for the results from these international tests are high-level education policy officers, education advisers to government and the media, and education researchers. Data sets from the tests are available for download and independent analysis.

ACER has published a book for teachers about PISA (Thomson, Hillman, & De Bortoli, 2013) that explains the test and its purposes as well as providing some examples of assessment tasks. Both Fensham (2013) and Millar (2013) argue that this is a very worthwhile initiative because it provides good models for assessment items that teachers should use and replicate in the context of their own school-based assessment.

Scientific literacy was the domain of major focus for assessment in 2015 (as it was in the 2006 round of testing). In 2006 the constructs interest in science and support for science were included for assessment in the test. A third construct, responsibility towards resources and environments, was included in the student questionnaire. At that time, the inclusion of attitudes toward science in this sort of test was groundbreaking. These attitudes were retained in 2015 but were addressed in the student questionnaire.

Of particular interest to this thesis is the PISA 2015 assessment framework, which included the new feature of cognitive demand “within the assessment of scientific literacy and across all three competencies of the framework” (OECD, 2017, p. 40). The test developers distinguish cognitive difficulty from empirical item difficulty. The latter is “estimated from the proportion of test-takers who solve the item correctly” (p. 40).
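The quoted definition of empirical item difficulty can be made concrete with a small sketch that computes, for each item, the proportion of test-takers who answered it correctly. This is only the raw proportion described in the quotation; operational testing programs generally rely on more elaborate scaling models. The function name and the scores are hypothetical.

```python
def empirical_item_difficulty(scored_responses):
    """Return, for each item, the proportion of test-takers who answered it correctly.

    scored_responses holds one list per test-taker, with 1 (correct) or
    0 (incorrect) recorded for each item in the same order.
    """
    if not scored_responses:
        raise ValueError("At least one test-taker is required")
    n_takers = len(scored_responses)
    n_items = len(scored_responses[0])
    if any(len(row) != n_items for row in scored_responses):
        raise ValueError("Every test-taker must have a score for every item")
    return [sum(row[i] for row in scored_responses) / n_takers for i in range(n_items)]

# Hypothetical scores for four test-takers on three items.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
proportions = empirical_item_difficulty(scores)
```

A lower proportion indicates an empirically harder item. Nothing in this calculation reflects the kind of thinking an item demands, which is why cognitive difficulty has to be judged separately.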

Cognitive difficulty relates to the type and level of mental processes demonstrated in responses to a question. Of relevance to this thesis is the international acknowledgment that the level of thinking demonstrated by a student is an important aspect of the competencies that define scientific literacy. From its inception in 2005, the EV program has included the measurement of cognitive difficulty (levels of thinking). The inclusion of cognitive difficulty in the PISA 2015 assessment framework is belated vindication of its incorporation into the EV program.

Before Webb’s (1997) Depth of Knowledge (DOK) approach was chosen as the best way to measure cognitive difficulty for the purposes of PISA 2015, a number of other theoretical frameworks were considered, including the SOLO Taxonomy (Biggs & Collis, 1982). In the view of PISA developers, DOK “is a simpler but more operational version of the SOLO Taxonomy” (OECD, 2017, p. 40). The EV program in NSW uses the SOLO model (Panizzon et al., 2006) as the basis for measuring levels of thinking. The reasons for using SOLO for this will be discussed in Section 2.6.

Given the high stakes involved here for the countries participating in these international tests, the assessment frameworks are subjected to scrutiny and need to be defensible. State-of-the-art psychometrics are utilised to ensure dependability of scores (the appropriate trade-off between construct validity and reliability). Given the diversity of curricula across the countries involved and the absence of any international agreement about what to test, “reliability [is] the dominant statistic in these international tests” (Fensham, 2013, p. 14).

Fensham’s (2013) main criticism of TIMSS and PISA relates to the fact that pen-and-paper testing cannot assess the increasingly important expectations for science and technology learning, such as

practical performance in science … decision making about socio-scientific issues, context-based science and science project work in and outside school … Neither TIMSS nor PISA acknowledges the absence of any testing of the science learnings associated with these newer goals. Such high-status silence can easily be interpreted as suggesting they are not of worth. (p. 18)

Despite these acknowledged validity issues, over time TIMSS and PISA tests have provided important reliable (in the statistical sense) feedback to education authorities around the world on issues to do with gender equity in terms of science achievement, the impact of socio-economic background on achievement, and whether the gap between top and bottom performers is getting wider or narrower. In Australia, the sample size is deliberately large enough to provide reliable data on achievement of students in the different states and territories and school sectors (government school, Catholic school and other independent school) as well.

The Australian data shows girls and boys do equally well in science; the socio-economic status of parents is positively linked to achievement; and students of Indigenous background and students who are educated in geographically isolated places do much worse in science than their metropolitan counterparts. In short, these tests provide a picture over time of student progress in relation to the “Educational Goals for Young Australians.”

2.6 Formative assessment and formative practices

The NSW science syllabus of relevance to this project (BOS, 2003) refers to assessment of and for learning (pp. 70-75). The current syllabus (BOSTES, 2012) refers to assessment as learning, as well as assessment of and for learning (p. 171). In the literature, assessment as learning (Dann, 2002; Earl & Giles, 2011; Hickey, Taasoobshirazi, & Cross, 2012) is linked to “assessment for learning” (Black & Wiliam, 2009, p. 8) when the researchers talk about activating students as the owners of their own learning. Advocates of assessment as learning accept that students should be valued participants in their own learning, anticipate receiving and utilising constructive feedback and feed-forward, and be able to identify their own learning gaps and solve their learning needs, with teacher assistance. Through this practice students can develop skills for life-long learning and be self-motivated by learning self and peer assessment strategies (Earl & Giles, 2011, p. 13).

Some researchers have warned that a simplistic view of assessment as learning could be misconstrued as endorsing teaching to the test, coaching to improve test answering skills, and the notion that testing counts as learning (e.g. Sadler, 2007; Torrance, 2007). Assessment as learning is also linked to self-regulated learning (Boekaerts & Corno, 2005; Clark, 2012; Nicol & Macfarlane-Dick, 2006; Schraw, Crippen, & Hartley, 2006).

This thesis will attend primarily to formative assessment, self-regulated learning, learning how to learn, and learning independence or autonomy because the research literature for these is more extensive. In Chapter Three this literature has been used to develop the dimensions of formative practice, which constitute the theoretical framework used for exploring the impact of assessment for learning and the EV program on assessment-related work of science teachers. This will be explained in subsequent sections and in Chapter Three.

According to Black, McCormick, James, and Pedder (2006), self-regulated learning is the key to “learning how to learn”, which these researchers distinguish from learning to learn. Self-regulated learning underpins the capacity for “life-long learning” (Black et al., 2006, pp. 120-121). Acquiring the capacity for independent, life-long learning has been identified as an important goal in preparing students for life in the knowledge society and its related knowledge economy. It was the over-riding goal for science education in NSW in the period of interest for this project (BOS, 2003).

Assessment for learning or formative assessment is attracting a lot of attention because it is perceived to be perhaps the single most important key to improving engagement with learning and related achievement in science. If it is properly implemented, students should be graduating from school well on their way to being self-managing learners. There are three reasons for making these strong claims.

The first reason is the wave of support for formative assessment and its pedagogical offspring, formative practices, sparked by two publications by Black and Wiliam in 1998: Assessment and Classroom Learning (Black & Wiliam, 1998a) and Inside the Black Box: Raising Standards Through Classroom Assessment (Black & Wiliam, 1998b). The latter was written with science and other teachers in mind.

The second reason is the strong and growing confirmation that the teacher is “… the greatest source of variance that can make the difference [in achievement]” (Hattie, 2003b, p. 3). In his calculations of effect sizes for a large array of classroom interventions, Hattie identifies 14 influences of teachers, all but three of which are linked to what the teacher does in the classroom with students.

The third reason is that the things teachers do in the classroom that make the most difference to achievement are all linked to “formative practices” (so called by Black & Wiliam, 2009, p. 8). Each of these three reasons will be dealt with in separate subsections.

2.6.1 Support for formative assessment

The ARG, with sponsorship from the Nuffield Foundation, commissioned Black and Wiliam in 1995 to review the literature on formative assessment. Their report was published in 1998 (Black & Wiliam, 1998a). Subsequently, the ARG published a brochure describing ten principles of Assessment for Learning and gave strong endorsement for two publications about assessment for learning arising from that review and later work (ARG, 2002a). The ARG defined assessment for learning as

The process of seeking and interpreting evidence for use by learners and their teachers to decide where learners are in their learning, where they need to go and how best to get there. (ARG, 2002a, p. 2)

In a second publication, Working Inside the Black Box, Black et al. (2004) reprised the three questions the Black and Wiliam (1998b) review had set out to answer:

1. Is there evidence that improving formative assessment raises standards?
2. Is there evidence that there is room for improvement [in formative assessment practices]?
3. Is there evidence about how to improve formative assessment?

The research reported in 1998 had said yes to the first two questions. Black et al. (2004) provided an answer in the affirmative for question three. They reported the results of a two-year, school-based project involving the researchers working with science, mathematics and, later, English teachers to improve formative assessment practices and to develop new ones.

Assessment for learning was acknowledged in the US publication (NRC, 2001) as “assessment to assist learning, or formative assessment” (p. 38, italics in the original). The NRC (2001) report referenced the 1998 Black & Wiliam paper:

[Black & Wiliam] also report … that the characteristics of high-quality formative assessment are not well understood by teachers and that formative assessment is weak in practice. (p. 227)

The NRC (2001) report appears to acknowledge this was an issue in the US as well because its Recommendation 11 said:

The balance of mandates and resources should be shifted from an emphasis on external forms of assessment to an increased emphasis on classroom formative assessment designed to assist learning. (p. 14)

The OECD publication on formative assessment titled Formative Assessment: Improving learning in secondary classrooms (OECD, 2005) cited a journal version of the Working inside the Black Box narrative (Black & Wiliam, 2005). The OECD used the journal version as the main referent from the English-speaking world and linked it to eight case studies of formative assessment in practice from around the world, including Queensland, as indicated earlier in this chapter. Of formative assessment, the OECD report says:

Studies show that formative assessment is one of the most effective strategies for promoting high student performance. It is also important for improving the equity of student outcomes and developing students’ “learning to learn” skills. (CERI, 2005, p. 13)

In Australia, the writers of The Status and Quality of Teaching and Learning of Science (Goodrum et al., 2001) endorsed Black & Wiliam’s (1998a) support for the provision of meaningful feedback to achieve improvements in learning outcomes.

Assessment for learning, as distinct from assessment of learning, implies an important shift in the ownership of assessment. The overwhelming message about assessment of learning is that it is done to someone (students?) by someone else (a teacher?) and the person ‘done to’ wears the judgment label assigned them (Newton, 2007). The language I am using is deliberately pejorative to signal that a proper understanding of assessment involves recognising it as a social act (Gipps, 1999; Broadfoot & Black, 2004).

In science learning, the teacher’s role is to help students identify and own a progression in science learning appropriate to their needs as students in a science course. In the process of doing that, the teacher should provide students with the cognitive tools to construct their own learning maps, which they can use to navigate through life as a science student at school and in life generally. The goal for teachers is to make themselves redundant (Sadler, 1998).

Bell and Cowie (1997) wrote a report for the Learning in Science Project (Assessment), which ran in 1995 and 1996. Republished in 2002 (Bell & Cowie, 2002), the report suggested:

1. Pen-and-paper tests cannot provide data for many of the valued outcomes in science (such as inquiry tasks or working in teams).
2. There are many purposes for assessment (cf. Newton, 2007).
3. If learning is owned by the student, the teacher needs to be able to monitor student conceptual development and support the process by having a theory of learning that can be used to support that progress.
4. Formative assessment can provide evidence of learning for the gaps in assessment coverage (thus improving the dependability of the assessment) for a diversity of purposes and uses, and better quality feedback to support the progression of conceptual development from naïve to sophisticated understandings of science.

Cowie and Bell (1999) defined formative assessment as “the process used by teachers and students to recognise and respond to student learning in order to enhance that learning, during the learning” (p. 101). Wiliam (2011b) acknowledged the Cowie and Bell (1999) qualification “during the learning”.

Wiliam (2011b) credits Stiggins with popularising the use of the phrase ‘assessment for learning’. He also attributes to Stiggins the identification of four conditions that have to be satisfied for formative intent to be realised and for students to remain engaged with the learning process even when the assessment result is not what they would want to receive (Stiggins & Chappuis, 2005). Wiliam (2011a) also elaborates the principles of formative assessment, going well beyond what was provided in the paper by Black & Wiliam (2009).

Wilson and Sloane (2000) described a system of embedded assessments, the so-called BEAR (Berkeley Evaluation and Assessment Research) Assessment System, or BAS. The BAS “is a comprehensive, integrated system for assessing, interpreting, and monitoring student performance” (p. 182). Its tools enable teachers to:

• assess student performance on central concepts and skills in the curriculum
• set standards of student performance
• track student progress over the year on the central concepts
• provide feedback (to themselves, students, administrators, parents, or other audiences) on student progress and on the effectiveness of the instructional materials and the classroom instruction. (p. 182)

The principles behind the design of this classroom-based assessment system are:

1. It should be based on a developmental perspective of student learning (i.e. that there is a definable pathway a student follows as they work through the topic […] a learning progression that describes intended learnings in a curriculum-defined learning area, such as science).
2. There must be a match between what is taught and what is assessed (which means that other methods for assessing performance apart from responses to pen-and-paper tests must be used).
3. Teachers must be the managers of the system (if they are to use the results as effective feedback).
4. To be acceptable beyond the school, assessments have to be seen as fair, valid and reliable measures of the expected learning (evidence of high quality).

2.6.2 Teachers make the difference

Hattie (2003b) has summarised the findings from many studies using Hierarchical Linear Modelling (HLM) (p. 1), which attributes the variation in student achievement at school, as measured by the results from large-scale external testing, to six main influences. The last four are grouped as “combined effects”, sometimes referred to as school environment factors. HLM also assigns the relative weight each has on achievement. The resulting three contributors to variation are:

1. what students bring to school in the form of ability and social capital (50%)
2. the expertise of the teacher (30%)
3. the combined effects of school, principal, home and peer effects (20%).

Hattie (2003b) argues that supporting teachers would be the most productive way to improve achievement. He made that point by comparing the effect size differences of 16 influences on achievement (assessed using the SOLO Taxonomy) attributed to expert as opposed to experienced teachers (terms he defines in his paper) (see Figure 2.3).

Given that effect sizes above 0.40 (vertical axis of the graph in Figure 2.3) add value above the average, teacher expertise is a very useful contributor to learning.

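Figure 2.3 Effect-sizes of differences between Expert and Experienced Teachers.

Source: Hattie, 2003b, p. 14.

[Bar chart. Vertical axis: effect-size, 0 to 1.2. Horizontal axis: 16 dimensions of expertise (Deep Representations; Problem Solving; Anticipate and Plan; Better Decision Makers; Classroom Climate; Multidimensional Perspectives; Sensitivity to Context; Feedback & Monitoring Learning; Test Hypothesis; Automaticity; Respect for Students; Passion; Engage in Learning; Set Challenging Tasks; Positive Influence on Achievement; Enhance Surface and Deep Learning), grouped as Essential Representations, Guiding Learning, Monitoring and Feedback, Affective Attributes, and Influencing Student Outcomes.]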

When you drill down into the dimensions of expertise that provide the greatest effect size, the ability to use aspects of formative assessment features highly. Examples given include the use of feedback, the capacity to manage classroom discussions productively, and working with students in ways that enhance their capacity for self-regulation.

2.6.3 Weight of evidence supporting formative practices

The sheer weight of evidence that emerged from meta-analyses of the huge body of research findings about interventions and strategies used by teachers to improve achievement is perhaps the most compelling reason for supporting formative practices. Meta-analysis is a statistical process that provides a comparable measure of effect size for interventions tried and tested in research projects where before and after studies produced a result. John Hattie’s (2009) Visible Learning, Tomorrow’s Schools, The Mindsets that make the difference in Education is extraordinary for two reasons.

The first reason is the huge number of research papers he analysed to produce the effect sizes for different interventions. The snapshot of the research projects included for publication numbered more than 800 meta-analyses of some 50,000 studies involving more than 200 million students.

The second is its revelation of consistently high effect sizes attributable to interventions associated with formative practices (this will be explained later in this section). The average effect size (ES) of all interventions Hattie reviewed was 0.40. Based on evidence from large-scale testing in the US, the UK, New Zealand and Australia, Hattie (2012) says this is the average ES on achievement of one year’s teaching. To have an impact on achievement above that, teaching needs to involve practices with an ES > 0.40.
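As a hedged illustration of how an effect size of this kind can be computed, the sketch below uses the standardised mean difference (difference in means divided by the pooled standard deviation), which is the definition Hattie (2003b, p. 14) gives. The function names, the sample scores and the way the 0.40 average is used as a comparison point are assumptions made for illustration only; they are not Hattie's data or his meta-analytic procedure.

```python
from math import sqrt
from statistics import mean, variance

# Average effect size reported by Hattie across all interventions reviewed.
AVERAGE_ES = 0.40


def cohens_d(group_a, group_b):
    """Standardised mean difference: (mean_a - mean_b) / pooled SD.

    Both arguments are sequences of scores with at least two values each.
    """
    n_a, n_b = len(group_a), len(group_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def exceeds_hinge(effect_size, hinge=AVERAGE_ES):
    """True when an effect size is above the chosen comparison point."""
    return effect_size > hinge


if __name__ == "__main__":
    # Hypothetical post-test scores, invented for this example.
    with_practice = [2, 4, 6]
    without_practice = [1, 3, 5]
    d = cohens_d(with_practice, without_practice)
    print(f"ES = {d:.2f}, above average: {exceeds_hinge(d)}")
```

A single study rarely yields a stable estimate; meta-analysis combines many such values, which this sketch does not attempt.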

Each of the influences is discussed and explained in the body of the text referenced in Table 2.4, which lists the 21 most powerful influences on student achievement as of 2012.

Table 2.4 Influences on learning and effect sizes

Influence (effect type) | ES
Self-reported grades / Student expectations (STE) | 1.44
Piagetian programs (STE) | 1.28
Response to intervention (STE) | 1.07
Teacher credibility (in the eyes of the student) (TRE) | 0.90
Providing formative evaluation (TGE) | 0.90
Micro-teaching (TRE) | 0.88
Classroom discussion (TGE) | 0.82
Comprehensive interventions for learning disabled students (TGE) | 0.77
Teacher clarity (TRE) | 0.75
Feedback (TRE) | 0.75
Reciprocal teaching (TGE) | 0.74
Teacher-student relationships (TRE) | 0.72
Spaced vs mass practice (TGE) | 0.71
Metacognitive strategies (TGE) | 0.69
Acceleration (SLE) | 0.68
Classroom behavioural (SLE) | 0.68
Vocabulary programs (CME) | 0.67
Repeated reading programs (CME) | 0.67
Creativity programs on achievement (TGE) | 0.65
Prior achievement (STE) | 0.65
Self-verbalization and self-questioning (STE) | 0.64

Source: Hattie, 2012, p. 266. ES = effect size; STE = student effect; TRE = teacher effect; TGE = teaching effect; SLE = school effect; CME = curriculum effect.

Eleven of the influences in Table 2.4 with the highest ES are linked to teacher use of formative practices (e.g. providing formative evaluation, classroom discussion, feedback, reciprocal teaching, and metacognitive strategies). Local variations of the curriculum effect (CME) influences were observed in the programs in some of the case study schools visited for this project.

2.6.4 Formative Practice

A consistent theme in Black and Wiliam’s work is their interest in establishing a theory of formative assessment “to provide a unifying basis for the diverse practices that are said to be formative” (Black and Wiliam, 2009, p. 7). Their first proposition is that both the teacher and student are responsible for the outcomes from three key processes in teaching and learning:

• establishing where the learners are in their learning
• establishing where they are going
• establishing what needs to be done to get them there.

Black and Wiliam bring the three processes and the roles of the agents (teachers, peers and the students themselves) together on a grid to generate five key strategies for conceptualising formative assessment:

• clarifying and sharing learning intentions and criteria for success
• engineering effective classroom discussions and other learning tasks that elicit evidence of student understanding
• providing feedback that moves learners forward
• activating students as instructional resources for one another and their teacher
• activating students as the owners of their own learning. (Black & Wiliam, 2009, p. 8)

The researchers also provide an updated definition of formative assessment that conflates it with instruction:

Practice in a classroom is formative to the extent that evidence about student achievement is elicited, interpreted, and used by teachers, learners, or their peers, to make decisions about the next steps in instruction that are likely to be better, or better founded, than the decisions they would have taken in the absence of the evidence that was elicited. (Black & Wiliam, 2009, p. 9)

The researchers explain that instruction means teaching and learning activities and, because the effect of “decisions about the next steps in instruction” is not certain, “likely to be better or better founded” is an appropriate qualification for those decisions and related actions to improve learning.

For the purposes of this study, I have linked together in the following equation each of Black and Wiliam’s (2009) five strategies of formative assessment and science teaching and call the combination dimensions of formative practice.

Five Dimensions of Formative practice = Formative assessment activity + Instruction in science

The first dimension of formative practice involves activities that focus on clarifying and sharing learning intentions and success criteria related to learning science (LISC).

The second dimension involves classroom discourse in science contexts that elicits evidence of learning (CDEL).

The third dimension focuses on feedback used (by either or both the teacher and student) to progress the learning of science (FTAL).

The fourth dimension is about activating students as instructional resources for each other and the teacher in support of science learning, including peer assessment (ASIR).

The fifth dimension involves activating students (and teachers) as owners of their own learning in science, including self-assessment (ASTL).

In the methodology section of this thesis (Chapter Three), the activities related to each dimension are further differentiated into teacher or student focus/emphasis/agency. The reason for this differentiation is to provide an operational definition for self-regulated learning constructed in terms of the extent of student agency with the five dimensions of formative practice.
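One way to picture such an operational definition is as a tally of observed activities by dimension and by agent. The sketch below is purely illustrative: the five codes (LISC, CDEL, FTAL, ASIR, ASTL) come from the text above, but the function name, the two-value "teacher"/"student" coding and the use of a simple proportion are assumptions, not the coding scheme set out in Chapter Three.

```python
from collections import Counter

# The five dimensions of formative practice, keyed by the codes used above.
DIMENSIONS = {
    "LISC": "learning intentions and success criteria",
    "CDEL": "classroom discourse that elicits evidence of learning",
    "FTAL": "feedback to progress learning",
    "ASIR": "activating students as instructional resources",
    "ASTL": "activating students as owners of their own learning",
}

AGENTS = ("teacher", "student")  # hypothetical two-way agency coding


def student_agency_share(observations):
    """Return, per dimension, the share of observed activities led by students.

    `observations` is an iterable of (dimension_code, agent) pairs.
    Dimensions with no observations are left out of the result.
    """
    totals = Counter()
    student_led = Counter()
    for code, agent in observations:
        if code not in DIMENSIONS:
            raise ValueError(f"unknown dimension code: {code}")
        if agent not in AGENTS:
            raise ValueError(f"unknown agent: {agent}")
        totals[code] += 1
        if agent == "student":
            student_led[code] += 1
    return {code: student_led[code] / totals[code] for code in totals}


if __name__ == "__main__":
    # Invented observations from a single hypothetical lesson.
    lesson = [
        ("LISC", "teacher"),
        ("LISC", "student"),
        ("FTAL", "student"),
    ]
    print(student_agency_share(lesson))
```

A higher share for a dimension would indicate greater student agency in that dimension under these assumptions.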

2.6.5 Formative practice and self-regulated learning

Black & Wiliam (2009) assign agency for assessment to teachers, peers, and individual learners, saying, “Formative assessment is concerned with the creation of, and capitalization upon, ‘moments of contingency’ in instruction for the purpose of regulation of learning processes” (p. 10). A narrow focus distinguishes the formative assessment component in instruction from acts that follow, acts drawing from the teachers’ knowledge of “instructional design, curriculum, pedagogy and epistemology” (p. 10). These ‘moments of contingency’ may be synchronous (immediately acted upon) or asynchronous (delayed action). To be effective, formative interactions have to result in learning. Black and Wiliam (2009) cite Sadler’s definition of learning as “the activity of closing the gap between a learner’s present state of mind and the state implied by the learning aim” (p. 12).

The feedback by a teacher may not do its intended job (change cognition) unless the teacher has some insight or understanding of how “students approach problem solving, and how they argue, evaluate, create, analyse and synthesise” (Sadler, 1998, p. 81). This refers to what teachers understand about the processes of metacognition that they and the student bring to the process of mediation occurring in the middle section of Figure 2.4.

Figure 2.4 The three interacting domains of pedagogy (or instruction). Source: Black & Wiliam, 2009, p. 11

[Diagram: the teacher (controller or conductor) on one side and the learners (passive or involved) on the other, linked through a shaded central area that stands for the classroom where many learners are involved.]

To explain their theoretical models of mediation, Black and Wiliam (2009) begin by providing the definition of self-regulated learning used by Boekaerts, Maes, and Karoly (2005), who had completed a general review of this field:

Self-regulation can be defined as a multi-component, multi-level, iterative self-steering process that targets one’s own cognitions, affects and action, as well as features of the environment for modulation in the service of one’s goals. (p. 150)

Boekaerts and Corno (2005) describe a

dual processing self-regulation model where learning goals interact with well-being goals […] when students have access to well-refined volitional strategies manifested as good work habits, they are more likely to invest effort in learning and get off the well-being track when a stressor blocks learning. (p. 1)

Two possibilities operate in this context. One is described as a top-down SR or growth option pathway which

has a focus on learning […] the student pursues the purpose of achieving learning goals that increase resources, i.e. knowledge and both cognitive and social skills. The process is motivated and steered by personal interest, values and expected satisfaction and rewards. (p. 14)

The other pathway is described as the well-being option, which may manifest itself as the learner choosing

competitive performance goals or prioritis[ing] friendship with peers, which a focus on learning goals may put at risk. [It] may be triggered by […] some types of classroom feedback and reward, or merely by boredom. When cues from the environment have this effect, this second option is adopted – that of giving priority to well-being. (p. 14)

In the course of a learning episode, students may seek one or other of these options and choose at times to switch from one to the other. The choice of option is also influenced by the students’ “awareness of and access to volitional strategies (metacognitive knowledge to interpret strategy failure and knowledge of how to buckle down to work) helplessness, and failure of emotional control” (Vermeer, Boekaerts, & Seegers, cited in Black & Wiliam, 2009, p. 14).

2.6.6 Learning how to learn, self-regulated learning and life-long learning

In Learning How to Learn and Assessment for Learning: a theoretical inquiry, Black et al. (2006) write:

The overall conclusion is that emphasis should be placed on practices that have potential to promote autonomy in learning, a common theme in the literature at all levels, and one reflected in our empirical work on teachers' attitudes and practices. (p. 119)

It is important to understand that the notion of learning how to learn is also consistent with the education agenda related to employers wanting employees with skills for work (and life more broadly) in the knowledge society (OECD, 2003; CERI, 2008).

Black et al.’s (2006) paper is an attempt to build a bridge between what we know about teaching and learning that might put students in charge of their own learning. See also Deakin-Crick, Broadfoot, and Claxton (2004); James (2006); James et al. (2007); Mansell et al. (2009) and Pellegrino (2009).

In the context of science education in Australia, The Project for Enhancing Effective Learning (PEEL), founded in 1985, anticipated the work reported on above. In the Australian context

PEEL is about making significant changes in how students learn – generating learning that is more informed, purposeful, independent, interactive, and metacognitive. (Mitchell, Mitchell, & Lumb, 2009, p. 1)

The PEEL (2009) publication Principles of Teaching for Quality Learning describes 12 principles that teachers use to instil good learning behaviours. Good learning behaviours are those that operationalise metacognition and self-assessment, which are powerful contributors to learning how to learn. The list of activities and the ideas developed by science teachers and published in PEEL SEEDS (PEEL, 2009) are similar in type to the list of activities in the King’s-Medway-Oxfordshire Formative Assessment Project (KMOFAP) (Black & Wiliam, 2005). The list of procedures (using a PEEL term for learning activities) includes:

• sharing success criteria with learners
• classroom questioning
• comment-only marking
• peer- and self-assessment
• formative use of summative tests.

The point of drawing attention to PEEL is that it is an existing network that holds a considerable body of work that teachers can access themselves as they try to improve student learning behaviour and autonomy.

2.7 SOLO and the ESSA-VALID (EV) program in NSW

This section will discuss SOLO as a learning progression. It will explain the thinking involved in its creation, its use in science assessment and its contribution to the EV program in NSW.

2.7.1 The SOLO Taxonomy

The SOLO Taxonomy (Biggs & Collis, 1982) and its successor SOLO model (Panizzon, Arthur, & Pegg, 2006) are examples of developmental learning progressions in the cognitive tradition (NRC, 2001). The original SOLO Taxonomy was published by Biggs and Collis (1982). It was developed to assist teachers to differentiate between quantity and quality in student responses to closed, classroom test questions. In their original construct for the Taxonomy, learning progresses through five levels, each one representing a higher level of learning as explained below.

Biggs & Collis (1982) were concerned that students could score highly by simply writing down a number of relevant responses (quantity) without any weighting being given to whether the thinking on display was of a higher order (which Biggs and Collis described as quality) than simple recall of related bits of information. Biggs and Collis examined Bloom’s original taxonomy (Bloom, Engelhart, Furst, Hill, & Krathwohl, 1956), Piaget’s hypothecated cognitive structures (Ginsburg & Opper, 1979) and other post-Piagetian models (such as those put forward by Marton & Säljö (1976); Schroder, Driver & Streufert (1967) and Shayer (1976)) as the basis for describing quality.

In the end Biggs and Collis (1982) proposed and developed an empirical model that classified answers in levels according to the increasing structural complexity evident in the answers. After working with thousands of student written responses to test items, they defined complexity as including

progression from concrete to abstract; using an increasing number of organizing dimensions; increasing consistency within the response; the use of organizing or relating principles with hypothetical or self-generated principles being used at the most complex end. (p. 14)

The first version of SOLO was developed and published without reference to a traditional science subject such as biology or chemistry (the closest subject was geography). In 1991 Biggs and Collis published an updated version of the Taxonomy that took into account work done on academic and everyday intelligence, and on ideas related to multiple intelligences, novice-expert research and forms of knowledge (Biggs & Collis, 1991).

Biggs and Collis then showed how this version of SOLO could be applied to categorising representations provided by students attempting to “explain phenomena with as yet inadequately developed [scientific] constructs by using alternative frameworks to those used by scientists” (p. 71). They used Beveridge’s (1985) work on evaporation, supplemented with older students’ responses, to identify the structural elements in answers that exemplified the different levels in the SOLO Taxonomy. Figure 2.5 represents this updated SOLO Taxonomy.

Figure 2.5 Representation of the Biggs & Collis (1991) SOLO Taxonomy (Source: Pegg, J., slide for a presentation at the ACER research conference, August 2010, in Melbourne)

Features of the SOLO Taxonomy include modes of representation/thinking (vertical axis) that are associated with age-related changes (horizontal axis) in student cognitive functioning. These changes enable students to construct different levels of response (denoted by the letters U-M-R within a mode) to questions using the knowledge forms associated with that mode of thinking (right-hand side labels within the modes).

Table 2.5 includes the examples used by Biggs and Collis (1991) to illustrate how the features of student responses change with age (column 2) and the five levels of learning descriptors student responses were mapped to.

[Figure 2.5 diagram, titled “Modes, Learning Cycles and Forms of Knowledge”. Horizontal axis: age in years, not to scale (0, 1½, 6, 16, 21). Vertical axis: mode (sensori-motor, ikonic, concrete symbolic, formal, post formal), running from lower order to higher order learning, with U-M-R learning cycles marked within the modes. Forms of knowledge, in the same order as the modes: tacit, intuitive, declarative, theoretical, theoretical.]

Table 2.5 The concept of evaporation through modes of thinking and levels of thinking (SOLO Taxonomy)

Columns: Mode | Concept of evaporation | Level of learning

Postformal
Concept of evaporation:
U (EA): Developing and testing a new theory.

Formal
Concept of evaporation:
R: Working understanding of the discipline of physics
M: Other physical concepts involving principles of energy, matter
U (EA): The heat energy supplied speeds particles so that water changes state into steam. The latent heat is the amount of energy supplied
Level of learning:
5. Extended abstract (EA): generalizes the structure to take in new and more abstract features representing a new and higher mode of operation

Concrete symbolic
Concept of evaporation:
R: The heat turns the water into steam and it evaporates off, remaining invisible in the atmosphere (15 yrs)
M: The flame makes the steam come and the water goes (9 yrs)
U (EA): It soaks into the pan (7 yrs)
Level of learning:
4. Relational (R): integrates the parts with each other so the whole has a coherent structure and meaning
3. Multistructural (M): picks up more and more relevant or correct features but does not integrate them
2. Unistructural (U): the learner focuses on the relevant domain and picks up one aspect to work with

Ikonic mode
Concept of evaporation:
R: The steam causes the water to disappear (7 yrs). This does not happen at our house. There’s still water in the pan because my mum makes the tea with it (8 yrs)
M: You put the pan on top of the flame and the water goes
U: The flame does it (5 yrs)
Level of learning:
1. Prestructural (P): the task is engaged, but the learner is distracted or misled by an irrelevant aspect belonging to a previous stage or mode

The sensori-motor (mode) is not included here as it is related to motor-skills and, in this context, not knowledge of them.
Source: Adapted from Biggs & Collis, 1991, p. 65 / 66 (Their tables 5.1 and 5.2)

The five steps cover one learning cycle centred on the concrete symbolic mode (the ‘target mode’), which is the mode most relevant to the years of schooling. Note that the U level of one mode is the EA level for the mode below it.
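The five level descriptors in Table 2.5 can be read as a short decision sequence. The sketch below is a simplified illustration of that reading only: the function name and its three inputs (a count of relevant aspects, whether they are integrated, and whether the response generalises beyond the target mode) are assumptions introduced here, and actual SOLO coding relies on trained judgement about a whole response rather than on three flags.

```python
def solo_level(relevant_aspects, integrated=False, generalised=False):
    """Map a simplified description of a response to a SOLO level name.

    relevant_aspects: number of relevant features the response picks up
    integrated:       True if those features are related into a coherent whole
    generalised:      True if the structure is extended to new, more abstract
                      features (a higher mode of operation)
    """
    if relevant_aspects < 0:
        raise ValueError("relevant_aspects must be zero or more")
    if relevant_aspects == 0:
        # Task engaged, but only irrelevant aspects are used.
        return "prestructural"
    if relevant_aspects == 1:
        # One relevant aspect; nothing to integrate it with.
        return "unistructural"
    if not integrated:
        # Several relevant aspects, listed but not related to each other.
        return "multistructural"
    if not generalised:
        # Parts integrated into a coherent structure.
        return "relational"
    return "extended abstract"


if __name__ == "__main__":
    # "The flame does it": a single aspect, treated here as one relevant feature.
    print(solo_level(1))
    # Heat, steam and the atmosphere linked into one account.
    print(solo_level(3, integrated=True))
```

The sketch covers a single learning cycle in one target mode; it does not model the way the unistructural level of one mode doubles as the extended abstract level of the mode below.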

The modes of thinking most relevant to schooling include the sensori-motor, ikonic, concrete symbolic and formal modes. As children age, they are able to access modes of thinking or representations that are progressively more complex and abstract. The modes in the SOLO Taxonomy do not progressively replace each other (as Piaget theorised) but are cumulative as explained below.

When attempting to learn a new skill set, such as Tai Chi, an already accomplished basketball player must begin at the sensori-motor level by learning (imitating and practising) the basic foot moves or hand and arm moves of Tai Chi. Demonstrating either one is a unistructural response, demonstrating both separately is a multistructural response, and putting both feet and hand movements together with the correct breathing for one Tai Chi “move” is a relational demonstration. When the accomplished basketball-playing and now Tai Chi student takes their first driving lesson some ten or more years after starting school, he or she begins again at the sensori-motor mode to learn the actions involved in driving to the point of fluent relational execution needed to coordinate the many different component skills needed to pass, say, the safe overtaking part of the actual driving test.

From around age 18 months, children are able to link actions with imagined representations that they express in words an adult would interpret as “stereotypical characters and obvious plots” (Biggs & Collis, 1991, p. 63). In Tai Chi, while demonstrating a series of moves, a late primary-age student may use phrases like “horse-riding stance”, “stroke the peacock’s tail” and “repulse the monkey” as a way of representing the actions to themselves and others (these are actual examples from a Tai Chi support card used with all ages). This mode, called the ikonic mode,

is evident in the intuitive knowledge displayed in … scientists [for example]. Kekule’s realization of the structure of the organic ring compound was preceded by a hypnogogic dream of six snakes chasing each others' tails, and only later was his "truth" established to the satisfaction of the scientific community by evidence and argument. The ikonic mode is thus not merely a presymbolic mode of information processing restricted to early childhood. It continues to grow in power and complexity well beyond childhood. (Biggs & Collis, 1991, p. 63)

Most students begin their schooling with this (ikonic) mode of learning well developed at the multistructural and/or relational levels within the mode. Oral expression is dominant, but ikonic drawings and physical models representing people and things familiar to the student may also be produced.

From around age six years, students begin to show concrete-symbolic mode thinking. The knowledge, associated thinking and its representation within this mode is classified as declarative, which

involves a significant shift in abstraction, from direct symbolization of the world through oral language, to written, second order, symbol systems that apply to the experienced world. There is logic and order between the symbols themselves, and between the symbol system and the world. The symbol systems of written language and signs give us one of the most powerful tools for acting on the environment, and they include writing itself, mathematical systems, maps, musical notation, and other symbolic devices. Mastery of these systems, and their applications to real world problems, is the major task in primary and secondary schooling according to any curriculum theory. Learning in the concrete-symbolic mode leads to declarative knowledge, demonstrated by symbolic descriptions of the experienced world. (Biggs & Collis, 1991, p. 63)

In their progress through this mode from incompetence to expertise, learners display a consistent sequence, or learning cycle, that is generalizable to a large variety of tasks and particularly school-based tasks. (Biggs & Collis, 1991, p. 64)

Responses observed may range from prestructural (not operating in the target mode, which is concrete symbolic in this situation) to unistructural (U) to multistructural (M) to relational (R) within the target mode or above the target mode (formal mode), where responses are classified as extended abstract (see column 3, Table 2.5). Biggs and Collis (1991) summarise observations and explanations provided by students related to the concept of evaporation to exemplify the five levels of thinking in the SOLO Taxonomy (see column 2, Table 2.5).

By the time students reach age 16 years (in Years 10 or 11), some are able to access the representational tools for formal thinking. The difference between a student operating at the concrete symbolic mode and formal mode as it was conceived then is illustrated in Table 2.5 (column 2) using the example of evaporation. In the concrete symbolic mode, explanations are tied to concrete situations and operational definitions (flames or sunlight for energy) for effects or changes observed.

Students operating in the formal mode as it was conceived then are able to move away from particular concrete referents (ice, water or ‘steam’ and flames, sunlight, electricity, coal, gas) to discuss evaporation and boiling in terms of a “moving particle” model where energy is added to or taken away from a situation causing a change of state (from solid to liquid to gas and back again). “Thinking in the formal mode thus both incorporates and transcends particular circumstances” (Biggs & Collis, 1991, p. 63).

By Year 12 a number of students will be operating at the formal mode. Many may not enter the formal mode of thinking by the time they leave school at 17 or 18 years of age.

2.7.2 The SOLO model

The assessment framework developed for the EV program in NSW used an enhanced version of the 1991 version of the SOLO Taxonomy. Called the SOLO model to distinguish it from the original SOLO Taxonomy, it includes a second learning cycle within the concrete-symbolic and formal modes of thinking. Like the Taxonomy before it, it has its roots in empirical evidence from thousands of written responses to test questions (Panizzon & Bond, 2007).

Figure 2.6 represents the two-cycle concrete symbolic mode of the SOLO model. The concrete-symbolic mode of thinking is the dominant mode of thinking throughout the years of schooling.

Figure 2.6. Representation of the “two cycles within a mode” SOLO model. Increasing age along the horizontal axis (L to R). Source: Pegg, J., slide from a presentation at the ACER research conference, August 2010, Melbourne.

The horizontal axis represents age in years. Most students enter school operating in the ikonic mode. Students’ capacity to use, and mastery of, the cognitive tools associated with the concrete symbolic mode of thinking develop over the years of schooling through two learning cycles (the second cycle is at a higher level than the preceding cycle). In the junior secondary years students acquire the language of science concepts, which they are expected to use in explanations and justifications for the conclusions they come to.

At age 16 some students begin to think using abstract concepts not linked to particular situations (such as potential energy, properties of fields, latent heat, electro-magnetic radiation, mass, inertia and momentum).

The need to modify the single learning cycle approach emerged from a number of research studies where it was becoming increasingly obvious

that a single unistructural-multistructural-relational cycle within a mode did not accommodate adequately the range of responses offered by students. In particular, it was difficult to interpret responses from many primary students, low-achieving secondary students, or adults new to a particular area of study within the single cycle model. (Pegg, Panizzon, Arthur, Scott, & Aylmer, 2011, p. 24)

It was found in their responses that

an earlier cycle of levels (i.e., a new unistructural-multistructural-relational cycle) was discerned. Interestingly, the responses coded at these levels still shared characteristics of the same mode [... and] were particularly relevant to primary and secondary education in the concrete symbolic and formal modes (p. 24)

The two-cycle model was subsequently validated by psychometric modelling involving Rasch analyses, and the results of three studies to that end were reported in a paper by Panizzon and Bond (2007). The theory underpinning the SOLO model was originally shaped by Piaget’s thinking about developmental stages. However, Panizzon and Bond (2006) refer to Vygotsky’s (1978) socio-cognitive theories, and they suggest that a teacher can work with students in ways that set up the social conditions that support the emergence of a new, higher mode of thinking in students.
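For readers unfamiliar with Rasch analysis, the sketch below shows the core of the simplest (dichotomous) Rasch model: the probability of a correct response depends only on the difference between a person's ability and an item's difficulty, both expressed in logits. This is offered as general background only. The function name and example values are assumptions, and the text above does not say which Rasch variant the validation studies used, so this may differ from the model actually fitted.

```python
from math import exp


def rasch_probability(ability, difficulty):
    """Probability of a correct response under the dichotomous Rasch model.

    ability and difficulty are on the same logit scale; only their
    difference matters.
    """
    return 1.0 / (1.0 + exp(-(ability - difficulty)))


if __name__ == "__main__":
    # Hypothetical item difficulties ordered as a learning progression might be,
    # from an easier "identify" item to a harder "explain" item.
    items = {"identify": -1.0, "describe": 0.0, "explain": 1.5}
    student_ability = 0.0
    for name, difficulty in items.items():
        p = rasch_probability(student_ability, difficulty)
        print(f"{name}: {p:.2f}")
```

If items written for successive levels of a progression turn out to have increasing estimated difficulties, that ordering is one kind of evidence that the levels behave as intended.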

2.7.3 The ESSA-VALID (EV) assessment framework

While the test was delivered as a pen and paper exercise (from 2005 to 2010), the assessment framework discussed in this subsection was being developed and validated.

Table 2.6 shows an extract of the framework. It shows how the syllabus outcomes (written and published for the 2003 science syllabus) were subsequently related to the six levels of the concrete symbolic mode of thinking in the SOLO model.

Two examples of tasks are provided in Appendix D. The first is a task related to heating ice from the 2005 EV pilot test; the second is a task about magnets from the 2008 test. The tasks are mapped to the shaded outcomes and related SOLO levels as shown in Table 2.6.

Table 2.6 and related explanatory material detailing the links made between syllabus and SOLO are reproduced as example three in Appendix D.

Table 2.6 Selected outcomes and related SOLO levels in the 2011 EV assessment framework

LEVEL 1 | LEVEL 2 | LEVEL 3 | LEVEL 4 | LEVEL 5 | LEVEL 6

Outcomes 4.1 to 4.5 (2 of 7 rows)

Identify a scientific discovery

Compare scientific discovery to other types of discovery

Link a scientific discovery to its effect on humans

Describe a development in science that has led to new developments in technology

Compare the methods of the scientist to the design model of the engineer and architect

Explain the role of scientific thinking on society

Identify a possible career path in science

Identify a science context in a career

Link a career in science to knowledge and skills required

Identify science as a human activity

Discuss why society should support scientific research

Outcomes 4.6 to 4.9 (3 of 16 rows)

Identify materials attracted by a magnet (example two)

Compare the observable effects when magnets are placed end to end

Link the observable effects when two magnets are placed end to end with their position

Describe a magnetic field as producing a force that attracts particular metals

Describe the poles of a magnet as the area/ends where the magnet’s field is most intense

Explain the behaviour of magnetic poles using the term field

Identify that objects / substances take up space and/or have mass/weight

Explain that materials are held together differently in solids, liquids and gases

Explain density in terms of a simple particle model

Identify an observable feature in melting, freezing, condensation, evaporation or boiling (example one)

Describe observable features in melting, freezing, condensation, evaporation and boiling

Explain that, when substances melt, freeze, condense, evaporate and boil, they are still made of the same stuff

Identify that particles are continuously moving and interacting

Compare movement and interaction of particles in different states

Explain change of state in terms of rearrangements of particles

Identify that as particles are heated they gain energy

Identify that as particles are heated they gain energy and move further apart

Relate changes of state to the motion of particles as energy is removed or added

No content for Outcomes 10 - 12 is included

Outcomes 4.13 to 4.15* (1 of 8 rows)

Make a simple observation

Compare observations made by different people

Explain strategies to increase accuracy of observation

Correctly sequence steps in a scientific procedure

Accurately and systematically record observations and data

Discuss the relationship between accuracy and reliability

Outcomes 4.16, 4.17 a-d & 4.18** (1 of 8 rows)

Use a simple key or symbol to represent a concrete object or representation

Distinguish between different symbols

Complete diagrams and symbolic representations

Correctly sequence steps in a process described in a text

Distinguish between two related sets of data / information

Represent relationships using keys, symbols and flow chart

Outcomes 4.17e-g, 4.19-4.21*** (1 of 7 rows)

Identify a common unit of measurement (example one)

Identify the ratio of one unit to another

Complete a correct conversion of one unit to another

Create a simple scale

Compare the scale on two axes

Create an appropriate scale

Source: NSW Department of Education and Training (DET), 2011. Shaded rows are referenced in the body text. * Planning and Conducting Investigations area; ** Communication area; *** Critical Thinking area

2.7.4 The EV test: “fit for purpose”?

The literature reviewed in this chapter describes three broad purposes for assessment: to improve learning (formative assessment); to assess progress in learning (summative assessment); and to monitor aspects of, and/or the overall effectiveness and efficiency of, the education system. The EV program provides feedback on all three purposes. Feedback (in the form of results from a one-off external test) is provided to students; their parents and carers; their teachers; the schools they attend; the education system authorities; and governments.

Fensham (2013) described the EV test development processes as comparable to the PISA processes, which he said were exemplary. The international tests, he said, prioritised reliability, which in this case was about ensuring that the scores included the measure of statistical certainty related to the mean scores. The discussion in the later part of this subsection will describe how the EV test development processes strive for both validity and reliability in an effort to be as fit for purpose as possible.

The results from the EV test are organised into a summative report of achievement at the end of Year 8. The report for students, parents and teachers provides the results for five areas or categories of outcomes. Examples and related discussion of this aspect of the EV program are provided as example four in Appendix D. The scores from items in the EV framework mapped to the Critical Thinking area (see Table 2.6) are distributed to the Working Scientifically and Communicating Scientifically categories, depending on whether the items had an investigating or communicating context. The student report provides individual feedback on every task and item in the test.

The formative intent of the EV program is signalled in the report to parents and students:

Students, parents and teachers can use the [EV] levels to plan learning programs and activities so that students keep moving forward in their science knowledge and skills. (NSW DET, 2007, p. 3)

The levels referred to are the six levels linked to the SOLO model discussed above. Progress (“moving forward” in the EV report) in science learning is defined by the language used in each of the level descriptions for a particular reporting category.

Of interest in the early days of testing was the overall concern expressed by teachers that the test was too much about reading, which in their view was getting in the way of ‘seeing the science’ questions. The results from the student EV survey showed that students actually enjoyed the test and stimulus material and they did not think it distracted them (see questions in the last section of the survey). Articles that teachers saw as being ‘too difficult’, most students enjoyed doing.

One of the intentions of assessing this way was to put a high value on getting students to read science-rich texts and to identify the science content. Students strongly agreed that “literacy is important in learning science” (third question in the survey). Detailed feedback from selected case study schools on some of the survey items is provided in Chapter 6.

According to Messick (1995), “Construct validity [in principle and practice] is based on an integration of any evidence that bears on the interpretation or meaning of the test scores” (p. 742). The processes used to develop items and tasks for the EV test provide a representative coverage of syllabus intentions (mapped to the EV assessment framework), and the responses items elicit from students are evaluated by experienced teachers for alignment with intended learning as described in the syllabus.

Current psychometric methods are used to monitor the consistency with which marking rubrics are applied during the actual marking process and in reviewing the results of pilot marking. The analysis of scoring of the extended response items “utilises the Rasch Unidimensional Measurement Model (RUMM) … and the Interactive Test Analysis System (QUEST)” (Pegg et al., 2011, p. 36). Items and tasks that do not meet the criteria for inclusion in the test are discarded or modified for piloting the following year.

Teachers who have had experience teaching Year 8 students (but are not currently doing so) were invited each year to express an interest in developing items and tasks for the tests. A group comprising teachers with prior experience and some who are new is selected, and after attending a one-day training workshop they are asked to write items and tasks, for which they are paid by the Department.

The workshop takes writers through the criteria for selecting appropriate stimulus material and writing related items that address syllabus expectations (outcomes and related essential content) for Stage 4 students. Writers are also taken through the SOLO model and shown examples of items and tasks related to the two cycles within the concrete symbolic mode that are exemplars of items and tasks used in previous tests.

The items and tasks produced are collected, assessed and either discarded or edited by officers in the EV test development unit of the Department. Surviving stimulus materials and related sets of items are edited, mapped against the EV assessment framework and collated until more than enough for one test are available. These items are then reviewed by an expert panel of teachers drawn from a range of specialist areas within the Department including Assessment, Equity, Key Competencies, Aboriginal and Torres Strait Islanders, Language Backgrounds Other than English, and Literacy and Numeracy. Examples of test items, related stimulus materials and the student survey are included at example five in Appendix D.

Several different tests using a mix of items and tasks are compiled and sent off for piloting. In the early stages, piloting was done with students in their second year of secondary schooling in the various states of Australia. Now it is done early in term one of the new school year with students who did the test in the previous year. Piloting ensures that the items and tasks with poor test characteristics (discrimination, difficulty, ambiguity, construct validity) are identified and discarded from further consideration. Marking rubrics for the three extended response questions, developed by experienced science teachers with SOLO expertise, are refined during the pilot marking process.
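Two of the test characteristics named above, difficulty and discrimination, can be illustrated with classical item statistics. The following is a minimal sketch only: it assumes items scored 0 or 1, uses invented pilot data, and substitutes simple classical statistics for the Rasch-based tools (RUMM, QUEST) that the EV program is reported to use. The function names are hypothetical.

```python
from statistics import mean, pstdev

def item_difficulty(item_scores):
    """Proportion of students who answered the item correctly."""
    return mean(item_scores)

def item_discrimination(item_scores, total_scores):
    """Correlation between an item and the score on the rest of the test."""
    # Remove the item from each total so it is not correlated with itself.
    rest = [t - i for i, t in zip(item_scores, total_scores)]
    sd_item, sd_rest = pstdev(item_scores), pstdev(rest)
    if sd_item == 0 or sd_rest == 0:
        return 0.0  # no variation, so discrimination cannot be estimated
    m_item, m_rest = mean(item_scores), mean(rest)
    cov = mean((i - m_item) * (r - m_rest) for i, r in zip(item_scores, rest))
    return cov / (sd_item * sd_rest)

# Invented pilot data: rows are students, columns are items (1 = correct).
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
]
totals = [sum(row) for row in responses]
for j in range(len(responses[0])):
    item = [row[j] for row in responses]
    print(j, round(item_difficulty(item), 2),
          round(item_discrimination(item, totals), 2))
```

In this illustration an item with a negative discrimination value (students who do well on the rest of the test tend to get it wrong) would be a candidate for discarding or rewriting; the actual criteria applied by the EV test developers are not specified here.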

Experienced science teachers are contracted to score online the three extended response questions. They are provided with up to four hours of training in the SOLO model and the consistent application of the marking rubrics before actual marking commences. The marking process is continuously monitored online to ensure consistency of rubric application. Every hour, all markers of a particular question are presented with the same student response and their scores are checked to ensure consistency. The check marking is done using student responses that highlight particular scoring issues that emerged during pilot marking.
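The hourly check described above can be pictured as a comparison of each marker's score for the common response against an agreed reference score. The sketch below is illustrative only: the existence of a single reference score, the tolerance parameter, the marker names and the function names are all assumptions, since the rule actually used to judge consistency is not stated.

```python
def flag_inconsistent_markers(scores, reference, tolerance=0):
    """Return the markers whose score differs from the reference by more than tolerance."""
    return sorted(m for m, s in scores.items() if abs(s - reference) > tolerance)

def agreement_rate(scores, reference, tolerance=0):
    """Share of markers whose score is within tolerance of the reference."""
    flagged = flag_inconsistent_markers(scores, reference, tolerance)
    return 1 - len(flagged) / len(scores)

# Invented scores awarded by four markers to the same check response.
hourly_check = {"marker_a": 4, "marker_b": 4, "marker_c": 2, "marker_d": 5}

print(flag_inconsistent_markers(hourly_check, reference=4))  # ['marker_c', 'marker_d']
print(agreement_rate(hourly_check, reference=4))             # 0.5
```

Setting the hypothetical tolerance to 1 would treat adjacent SOLO levels as agreement, which is one way a marking supervisor might distinguish minor from major discrepancies.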

The test includes multiple items targeting the same construct. This is to improve reliability of inferences about that construct. In the end, the interpretation of how many items are needed to achieve a reliable inference is a judgment call. In addition, items from previous years’ tests are included to enable equating of test results across the years of testing. The equating process uses samples of items distributed across the test taken by the whole cohort so that a school is very unlikely to see items it has used before in its own testing.

As PISA, TIMSS and NAP-SL tests are considered high stakes testing, in the interest of fairness to all, equating items are not released. Test papers are retained at the end of the test sessions and sent back to the managing agency after the tests are completed. Online delivery makes security around items easier to ensure (as for NAP-SL testing in 2015). Examples of test items not retained for equating purposes were published in the reports some one to two years after the testing was completed. Fensham (2013) has expressed a view that more of the TIMSS and PISA items should be released to provide good assessment models for schools to use.

The next subsection examines how SOLO has been used in Australia and New Zealand.

2.7.5 SOLO and assessment in Australasia

SOLO theory has been used in the design of assessment frameworks for large-scale testing in Australasia since the early 2000s. It has been used in science in Australia and for reading, writing and maths in New Zealand.

The 1991 version of the SOLO Taxonomy was used by the ACER to develop the Scientific Literacy Progress Map (SLPM) (ACER, 2004b). The SLPM was initially developed as a tool for categorising assessment items written for the Science Education Assessment Resource (SEAR) (ACER, 2004a). Items and tasks from this project are available online to science teachers (ESA, n.d./Improve).

The SOLO Taxonomy was subsequently used to develop one of the strands in the assessment framework for the national Year 6 science test (ACARA, 2014a). It provided the language for the scale used to describe the change in quality found in students’ answers to the items and tasks in the Year 6 test.

The SOLO Taxonomy is utilised in the assessment and reporting framework for the New Zealand-based e-asTTle project that provides assessment items for reading, writing and maths. Items are classified against the New Zealand national curriculum and the five levels in the SOLO Taxonomy (Hattie & Brown, 2004).

SOLO was considered for inclusion in the PISA 2015 assessment framework as discussed above. As far as I am aware, SOLO theory is not used anywhere else in the context of large-scale testing of science in Canada, New Zealand, the UK or the USA.

2.8 Themes from the literature review and their relevance to this thesis

A need to lift and broaden the level of skills students acquire in the first phase of their education.

The demands of the knowledge society and the related knowledge-based economy require a workforce able to adapt to changing opportunities. To do this, people need to keep learning as circumstances change over their lifetime. This realisation has led to the understanding that leaving school is the end of the first phase of preparation for a life that will require further episodes of formal learning or training at least to ensure ongoing access to employment.

Employers are telling governments that they need graduates from this first phase of school, training and university who have a broader range of skills (both cognitive and social) and higher levels of skill than before. Expertise is not just about knowing; it is about being able to use that knowing in the workplace and beyond to solve problems and explain those solutions to others, and to both quantify and qualify the risks involved in implementing different options. These issues have been dealt with in a number of OECD reports (CERI, 2008; OECD, 1997).

Education agencies charged by governments to produce the curriculum for schools in Australia have retained a core curriculum for all students up to the end of Year 10 broadly defined in eight learning areas, including science. The science curriculum at the time of interest for this project (up to the end of 2014) consisted of knowledge and understandings drawn from the models, theories and laws, structures, systems and interactions underpinning traditional disciplines of science and the skills of “working scientifically” (BOS, 2003, p. 21) in about equal measure (10 of the 22 outcomes are skills).

In recognition of falling student engagement and interest in science starting at school, but particularly so in the early years of secondary schooling, changes were made to the curriculum. The 2003 curriculum in NSW required science teachers to provide contexts for learning about science and in which to do science. The prescribed contexts in NSW were to do with the history of science, the nature and practice of science, the applications of science and implications of doing so including current examples and work involving science.

Teachers were also required by that curriculum to use science resources to provide students with the opportunity to acquire the Key Competencies, develop skills in the use of ICT, work alone and in teams safely and inclusively (considering gender and cultural differences), acquire some understanding and appreciation of Aboriginal and Torres Strait Islander Peoples’ world views, acquire some understanding and appreciation of how science impacts our civic life and the environment, and improve their general literacy and numeracy skills.

To the extent possible, given the breadth of expectations, the scope and depth of what was to be learned was described in bundles of learning framed as outcomes. Outcomes were defined by a minimum number of actions and contexts for their acceptable performance. The outcomes described a hierarchy of learning (in a set of standards for two stages in the junior secondary curriculum) that students were expected to engage with and acquire. Years 7 and 8 comprised one stage and Years 9 and 10 the second stage. Twenty-two outcomes provided the scope and depth of expected learnings in science at the end of Year 8 and again at the end of Year 10.

Teachers are expected to assess student achievement of these outcomes and report to parents on progress in their learning twice a year.

Assessment as an answer to higher expectations.

NSW had two external pen and paper tests as the primary means for satisfying stakeholders of the extent to which students had acquired the expected learnings, one at the end of Year 10 and the other at the end of Year 12. None of the other states and territories had a Year 10 science test. When NSW introduced the Year 8 science test from 2007 it was the only state to do so. Queensland introduced a science assessment program in 2009 for Years 4, 6 and 9, but abandoned it at the end of 2012 (QSA, 2012). Western Australia introduced a science test for its students in Years 5, 7 and 9 from 2010 (SCSA, 2010) and abandoned it after 2013.

Assessment in the junior years of secondary school in NSW was, and still remains, the responsibility of science teachers. They were supported in that task as discussed in earlier sections of this chapter.

Goodrum et al. (2001) reported in their review that in secondary schools across Australia

Traditional assessment practices remain as one of the most significant barriers to educational reform in secondary schools where teachers are required to cover too much content to prepare students for “the test”. Teachers indicate that tests are the most common form of assessment and, on average, represent 55% of the weighting of assessment […] Assessment is […] typically, summative, norm-referenced and focused on content. Students [report] that quizzes are frequently used to provide feedback to [them], however one-third of students indicate that their teacher never spoke to them about how they were going in science. (p. 155)

It is fair to say that NSW science teachers had a stronger tradition of external, summative testing embedded in their culture than other states and territories as is elaborated below.

Given the continuation with external testing for all students in NSW in Years 8 and 10 (up until the end of 2011) and continuing to this day at the end of Year 12, it is likely that the findings in 2000 might still apply in many secondary schools in NSW today. As the evidence from case study schools in this project shows, tests are a dominant form of assessment in science in Years 7 and 8 to this day. However, that assessment is now much more focused on the full range of outcomes and the shift toward the three bolded indicators of better assessment practice listed in Table 2.1 is well under way.

A role for SOLO to inform feedback about progress in learning.

As the discussion about the EV test indicated, the use of SOLO to provide an additional component of feedback about the level of understanding demonstrated by students in their answers was vindicated by PISA 2015 testing that had items in it designed to provide feedback on the level of scientific literacy demonstrated by students. However, work in the US and elsewhere has yet to demonstrate how to assess the full range of higher levels of cognitive functioning expected of students, as Ruiz-Primo reveals in her 2009 report to the US National Research Council.

Ruiz-Primo (2009) was asked by the US National Research Council to provide a paper that reconciled twenty-first century generic employment-related skills (NRC, 2008) and competencies at the core of science education (Duschl, Schweingruber, & Shouse, 2007). Her first comment was that expertise is located in a knowledge domain (science and technology in this case). She then goes on to elaborate that suitable science contexts need to be described to assess the extent to which students have acquired the following types of knowledge:

1. Declarative knowledge – knowing that

2. Procedural knowledge – knowing how

3. Schematic knowledge – knowing why

4. Strategic knowledge – knowing when, where, why and how to apply knowledge

5. Metacognitive knowledge – knowing about one’s cognition and how to regulate one’s cognition (with metacognitive strategies). (pp. 24-25)

Having reviewed the assessment frameworks for TIMSS, PISA and the US Collegiate Learning Assessment (CLA) and National Assessment of Educational Progress (NAEP) science tests, she said that none of the current tests provide evidence for judging the degree of proficiency with all these forms of knowledge. However, she expresses the belief that access to appropriate computer-based technology (simulations) should enable tests that access all forms in the future. In the broad scheme of things, the inclusion of SOLO in the EV tests for NSW students (and Webb’s DOK levels in the PISA 2015 test) is a modest beginning to helping teachers support student acquisition of the highest levels of at least one of Ruiz-Primo’s (2009) five types of knowledge, declarative knowledge.

The five types of knowledge described by Ruiz-Primo (2009) range well beyond cognitive functioning to include purposeful activity with other people and application of expertise to doing. Assessing performance in authentic settings is the preferred option here (Matters & Curtis, 2008). Choosing correct options from a battery of multiple choice items is not going to be seen as an authentic, valid or reliable demonstration of expertise needed in the 21st century by members of the ARG or researchers who hold situative or sociocultural perspectives on learning (Billett, 1996; Cowie, 2013; Gipps, 1999; Lemke, 2001; Tobin, 2012), or by the wider community (Hattie, 2005). Nor is it a valid demonstration of the use of expressive language to construct a scientific report, explanation, or procedure, or for the justification of a course of action. Actual use of expressive language to represent knowledge and understanding in different learning domains has led to a view of science as a multi-literacy (Hackling, Peers, & Prain, 2007; Hand, Yore, Jagger, & Prain, 2010; Tytler & Prain, 2010; Waldrip, Prain, & Carolan, 2010). This view of science is discussed in the next chapter and used to justify the use of NAPLAN results as a valid predictor of scientific literacy as measured in the EV test.

The need to teach students how to learn so they can become independent learners

The research literature discussed in this chapter has identified that what teachers do with students in the name of science education accounts for 30% of the variability in achievement (Hattie, 2003b). What students bring to the classroom by way of natural ability, prior school experiences and family backgrounds accounts for 50% of the variability. The remaining 20% is attributable to how well the school environment (leadership) is managed to enhance the positive influences and minimise the negative influences on the overall learning of science in the school setting.

It follows that supporting teachers to do the best job they can is likely to have the most effect on student learning and engagement with science. Hattie (2005) has shown that teacher use of formative practices is one of the most effective ways to improve student achievement (as measured by large-scale test results). Other work looking at how to teach students to “learn how to learn” (LHTL) concludes that “emphasis should be placed on practices that have potential to promote autonomy in learning” (Black et al., 2006). One approach to doing this is to teach students how to learn by progressively giving them control and ownership of the strategies of formative assessment. Knowing how to learn and being motivated to do so (self-regulation) is probably the most important outcome for schooling.

The assumption that this capacity for self-regulation would show up in subsequent achievement in and engagement with science beyond Year 8 underpinned the third subsidiary research question in this research project. That question was: Is the use of formative practices by teachers linked to improvement in students’ EV results and later achievement in and engagement with science?

Summary comments

This project is about the assessment-related work of science teachers in the early years of secondary education in a large government school system in one of the most advantaged and developed countries in the world (OECD, 2018; UNDP, 2018). It explores the impact of two assessment initiatives on teachers’ assessment-related work almost a decade after they were put in place. The constructs for five “dimensions of formative practice” are the windows through which that work can be examined.

Broadfoot (2009) implies that we are at a tipping point in our collective understanding and application of assessment:

The purpose of assessment during the 20th century has been overwhelmingly the generation of summative data. The content addressed has concerned primarily cognitive tasks. The mode has been the largely traditional vehicle of paper-and-pencil tests and their organisation through large testing and assessment providers … Could it be, finally, that the grand narratives of intelligence and ability, which were regarded as the key to the determination of life chances, are beginning to yield to a more practical discourse of multiple experiences, skills, knowledge and dispositions? (pp. x-xi)

CHAPTER THREE: RESEARCH DESIGN, METHODOLOGY, METHODS

3.1 Introduction

Chapter Two provided an overview of the literature relating to assessment generally and formative practices in particular, along with the concept of a learning progression and SOLO theory. It also reviewed work being done with formative practices as a way of improving student capacity for self-regulated autonomous learning to equip them for lifelong learning, the latter being a highly sought-after outcome for education in the 21st century. The development of SOLO theory from Taxonomy to model and its use in the EV program in NSW schools and beyond was outlined as well.

This chapter describes and explains the research design and the methods used to collect data and information to answer the three research questions posed in Chapter One. The questions were:

1. What use are science teachers making of the EV program including SOLO and why is it used or not used?

2. What formative practices are evident in the work of science teachers and why are they used or not used?

3. Is the use of formative practices by teachers linked to improvement in students’ EV results and later achievement in and engagement with science?

Section 3.2 provides a rationale for employing a mixed methods research design involving three phases. Phases one and two involved quantitative methods. The third phase employed quantitative and qualitative methods in the context of case studies.

Section 3.3 describes the first phase in which a quantitative method was used to deliver a sample of schools to work with. The quantitative method was a regression analysis using data provided by the Department for 394 government secondary schools in NSW. As will be explained in that section, the residuals from that regression analysis were used as a measure of the scientific literacy component of EV test results and as a measure of the effect size of science teaching. On the basis of the school residual, EV results for students at a school were designated as well above expectation (WAE), at expectation (AE) or well below expectation (WBE).

Section 3.4 describes the second phase that also involved using a quantitative methodology. Whilst the residuals were indicators of EV results above, at or below expectation and of the relative success of science teaching, the residuals say nothing about the teacher practices associated with those results. ANOVA was used to test for statistically significant relationships between aspects of the assessment-related practices used by teachers and EV results categorized as WAE, AE and WBE.
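The logic of a one-way ANOVA across the three school groups can be sketched as follows. This is an illustration under stated assumptions rather than the analysis reported in the thesis: it assumes each teacher's survey responses have been reduced to a single practice score, the scores shown are invented, and the function name is hypothetical. Only the F statistic and its degrees of freedom are computed; obtaining a p-value would additionally require the F distribution.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Invented practice scores for teachers in WBE, AE and WAE schools.
wbe, ae, wae = [1, 2, 3], [2, 3, 4], [5, 6, 7]
f_stat, df_b, df_w = one_way_anova_f([wbe, ae, wae])
print(round(f_stat, 2), df_b, df_w)
```

A large F value indicates that the group means differ by more than the spread of scores within groups would suggest, which is the sense in which a practice would be said to be related to the WAE, AE and WBE categories.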

Section 3.5 explains the third phase involving case studies of assessment-related work practices in self-nominated government school science departments and of teachers working there. Quantitative data about student results and numbers of students completing Year 12 science courses were obtained from teachers at the case study schools, the state curriculum authority’s website (the Board) and the My School website respectively. Qualitative data were also collected from teachers in the form of audio-recorded interviews and artifacts of assessment-related practice. Narratives describing the assessment-related work done by science teachers at each of the case study schools were constructed using interpretive methodology.

It was proposed at the end of Chapter Two that data collected to answer research question three could be used to test the proposition that students exposed to formative practices might be better self-regulated learners than those not so exposed. Section 3.6 discusses how data from the My School website was accessed and used to construct a basis for comparing schools in order to test three predictions designed to provide findings relevant to answering the second part of research question three. Statistical correlations were done to assess the strength of association between achievement and engagement.
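A correlation of the kind mentioned above can be illustrated with Pearson's coefficient. The choice of Pearson's r is an assumption made for this sketch, as the coefficient used is not named at this point in the chapter, and the school-level figures are invented.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Invented school-level data: an achievement measure and an engagement
# measure (for example, the share of a cohort completing a science course).
achievement = [62.0, 70.0, 75.0, 81.0, 88.0]
engagement = [0.31, 0.35, 0.33, 0.42, 0.47]
print(round(pearson_r(achievement, engagement), 2))
```

Values near 1 or -1 indicate a strong linear association and values near 0 a weak one; a correlation on its own does not show that achievement causes engagement or the reverse.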

Section 3.7 discusses the limitations arising from the research design and methods used in this project.

3.2 Mixed method research, case studies and research design

Johnson, Onwuegbuzie, & Turner (2007) define mixed methods research as:

the type of research in which a researcher or team of researchers combines elements of qualitative and quantitative research approaches (e.g., use of qualitative and quantitative viewpoints, data collection, analysis, inference techniques) for the broad purposes of breadth and depth of understanding and corroboration. (p. 123)

Following a review of many published studies, Creswell and Plano Clark (2011) proposed six mixed-methods research designs:

1. convergent parallel design

2. explanatory sequential design

3. exploratory sequential design

4. embedded design

5. transformative design

6. multiphase design.

Creswell (2012) characterises the first four of these designs as “basic” and the last two as “complex designs that are becoming increasingly popular” (p. 540). The explanatory sequential design method collects quantitative data first and then draws on qualitative data “to help explain or elaborate on the quantitative data” (p. 542). Creswell argues that explanatory sequential design (number 2 in the list above) can become a transformative design (number 5). The design becomes transformative when the explanatory sequential design is embedded within an overarching framework that

informs the overall purpose of the study, the research questions, the data collection, and the outcome of the study. The intent of the framework is to address a social issue for a marginalized or underrepresented population and engage in research that brings about change. (p. 546)

This researcher intends at the conclusion of this study to provide feedback to all participant schools. The social purpose here is to assist schools, particularly regional schools where test results in science do not appear to be as strong overall as test results are in metropolitan schools. According to Flyvbjerg (2011), case studies provide the “concrete, context-dependent knowledge … necessary to allow people to develop from rule-based beginners to virtuoso experts” (p. 302). A case study conducted on a number of physically separated sites has been alternatively defined by other researchers as a multiple (Stake, 2005) or collective (Yin, 2003) case study. Given that one of the purposes for doing this study is to provide schools with feedback about practices, case studies provide a potentially powerful vehicle for doing so.

For Flyvbjerg (2011),

(t)he decisive factor in defining a study as a case study is the choice of the individual unit of study and the setting of its boundaries … not so much making a methodological choice but a choice of what is to be studied. (p. 301)

The unit of study here is the set of assessment-related practices used by science teachers with their junior secondary science students in government schools and evidence of the impact of these practices on science achievement and engagement, both of which will be defined in the next section. The boundaries of the study were delineated by five constraints:

1. manageability of sample size for the researcher

2. purposive selection requirements

3. availability of volunteer participants

4. availability of relevant content

5. manageability for school participants.

First, the research was constrained by the number of schools able to be engaged with by a solo researcher. While 18 schools were identified and considered manageable, in the end only 16 were visited due to time and other constraints.

Second, the schools visited were purposively selected on the basis of their residual ranking. The aim was to work with schools as close to the top, middle and bottom of the three school groups as could be attained given the next constraint. Residuals, residual ranking and purposive selection will be explained in the next section (Section 3.3).

Third, each participating science teacher had to be a volunteer and have the support of their department head and school principal. Research findings, in the event of publication, had the potential to be confronting so consent to collect information was asked for on the condition of anonymity for schools, teachers and students.

Fourth, school datasets, audio-recorded interviews and teacher-provided assessment artifacts all had to provide content relevant to or produced in the period of interest (2011-2014) as explained in Chapter One.

Fifth, the data-gathering exercises had to be manageable for school-based participants and seen as worthwhile from their perspective. This entailed the researcher being flexible in relation to his expectations of participants.

To summarise, the three phases of the research design and methods delivered:

1. three groups of schools differentiated from each other by an unconventional measure of scientific literacy attainment (a quantitative Phase One)

2. findings about science teacher engagement with the EV program (including SOLO) and formative practices based on the analysis of their responses to a common online survey, initially sorted according to the group of schools the responses came from (a quantitative Phase Two)

3. data and information about student achievement in and engagement with science up to the end of Year 12 plus information about assessment-related practices in the science departments of the 16 case study schools purposively chosen from each of the three groups of schools (a quantitative and qualitative Phase Three).

3.3 Phase One: selecting the sample of schools to work with

Bryman (2012) identifies nine approaches to purposive sampling, one of which, maximum variation sampling, he describes as “sampling to ensure as wide a variation as possible in terms of the dimension of interest” (p. 419). Flyvbjerg (2011) argues that by choosing “maximum variation cases” (p. 306) a researcher has the best chance of identifying findings that are either consistent or inconsistent with prediction or theory. The dimension of interest in this phase of the study is the scientific literacy attainment of students at a school. The goal was to select a sample of schools comprised of three groups whose statistical means for the measure of scientific literacy attainment were as different as possible.

As will be explained later in this section, a student’s EV test results are a function of their general literacy and numeracy skills and their disposition to apply them to learning science. While a student acquires scientific literacy from many sources, the EV test targets the scientific literacy expectations described in the science syllabus that science teachers are expected to teach students.

The quantitative method used in this phase of the study separates the contribution of science teaching to student attainment of scientific literacy from other contributions. As a crude generalisation:

$$\text{Scientific literacy attainment} = \text{EV test results} - \text{general literacy and numeracy skills contribution}$$

The methodology used to achieve that separation and the thinking behind it follow.


3.3.1 Selecting the sample of schools to work with

As explained in Chapter One, this researcher first approached the NSW Department of Education in 2012 to discuss their possible interest in a proposal to research the impact of the EV program on science teaching in NSW. The Department agreed to assist.

The first step involved the Department psychometricians checking the integrity of data sets held for students who had sat EV tests in the four years 2011 to 2014. This check established that at least 465 schools had Year 8 students who sat EV tests in this period. To be eligible for this study, a school had to have a minimum of 10 Year 8 students who had sat the EV test in 2011. Department psychometricians also checked whether those same students had sat the Year 7 NAPLAN tests in 2010 and Year 9 NAPLAN tests in 2012 at that school. Comparable data sets for the next three years (2012, 2013 and 2014) had then to be confirmed. When this was done, the number of schools with sufficient students to meet the eligibility requirements was 394.

The next step was to use NAPLAN results to generate a science predictor that could be tested in a regression with actual EV results for the same students. The aim was to find a predictor that produced the best “fit” between a graph of the predictor (as the independent variable) and actual EV result (dependent variable) for pairs of students. The measure of “fit” is called the “coefficient of determination” (Laerd Statistics, 2013, p. 1) and has the symbol R². The value as a percentage (in this context) is a measure of how well the predictor accounts for the variability in the EV score. The closer to 100%, the more the predictor is said to account for the score in the EV test. A ‘line of best fit’ going through the graph of paired student results at a school can be drawn.

Plotted results are scattered above and below that line as shown in Figure 3.1.
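The idea of a line of best fit, its coefficient of determination and the resulting residuals can be sketched in a few lines of Python. This is an illustration only, not the Department's procedure: it fits a straight line by ordinary least squares, whereas the regressions reported in this chapter also used a third order term, and the predictor and EV values below are invented for the example.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Coefficient of determination: share of variation in y accounted for by the line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical paired values (NAPLAN-based predictor, actual EV result).
predictor = [480.0, 510.0, 530.0, 555.0, 590.0, 620.0]
ev_result = [71.0, 76.0, 77.0, 83.0, 86.0, 92.0]

intercept, slope = fit_line(predictor, ev_result)
r2 = r_squared(predictor, ev_result, intercept, slope)

# Residual = actual result minus the value predicted by the line of best fit.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(predictor, ev_result)]

print(f"R squared: {r2:.3f}")
print("residuals:", [round(r, 2) for r in residuals])
```

A positive residual marks a result above the line of best fit and a negative residual a result below it.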


Figure 3.1 Regression of 2014 EV results over a NAPLAN-based predictor. Source: Department of Education, 2016

The two lines shown in Figure 3.1 separate the top and bottom twenty percent of paired results. The statistical distance between the line of best fit and the plotted result is termed the residual. The residual size includes both measurement error and real differences between predictor and actual EV result. If the plotted result is above the line of best fit, then the residual is positive and the EV result is deemed, for the purposes of this study, “better than expected”; if below, the residual is negative and the result is deemed “below expectation”.

Four predictors of EV results were agreed to in discussions between researcher and the Department for testing. The predictors were: Years 7 and 9 literacy and numeracy results (combined and averaged); Years 7 and 9 literacy results only (combined and averaged); Year 7 literacy and numeracy results (combined and averaged); and Year 7 literacy results only. The Department performed separate regressions of EV results over the four different predictors, and sets of residuals for the 394 schools for each of the years from 2011 to 2014 were calculated. A representation of the regression using 394 pairs of school results for the 2014 school year is provided in Figure 3.1. The blue diamonds are the paired school EV results (vertical axis) and predictor values (horizontal axis).


The slight curvature in the two lines delineating the 80th (top line) and 20th percentiles as drawn on the graph (see Figure 3.1) is the result of using first and third order factors (derived from the predictor) to provide ‘lines of best fit’. Equivalent plots for years 2011, 2012 and 2013 were also produced.

The model of best fit turned out to be the predictor based on the average of Years 7 and 9 literacy and numeracy results combined. The coefficient of determination (R²) for that predictor, averaged over the four years of interest, was R² = .892. The four-year averages for R² for the other three predictors in the order listed above were .889, .887 and .870, respectively. The combined Year 7 and Year 9 literacy and numeracy predictor accounted for 89% of the explained variation in EV results across the state.

Residuals from the regression providing the line of best fit were used to generate three lists of schools from across NSW identified as having scientific literacy achievement well above expectation (WAE), as expected (AE), and well below expectation (WBE). The groups corresponded approximately with the top 20%, the middle 20% (spread evenly above and below the line of best fit) and lowest 20% of residuals respectively.
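The grouping step can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function name and school identifiers are hypothetical, and the middle group is approximated here by rank (the 40th to 60th percentile of residuals) rather than by distance either side of the line of best fit.

```python
def classify_by_residual(residuals):
    """Label schools WBE, AE or WAE by the rank of their residual.

    `residuals` maps a school identifier to its average residual.
    Schools outside the three bands are labelled None.
    """
    ranked = sorted(residuals, key=residuals.get)  # lowest residual first
    n = len(ranked)
    labels = {}
    for rank, school in enumerate(ranked):
        # Integer comparisons avoid floating-point edge cases at band boundaries.
        if 5 * rank < n:                      # lowest 20% of residuals
            labels[school] = "WBE"
        elif 2 * n <= 5 * rank < 3 * n:       # middle 20% of residuals
            labels[school] = "AE"
        elif 5 * rank >= 4 * n:               # top 20% of residuals
            labels[school] = "WAE"
        else:
            labels[school] = None
    return labels


# Ten hypothetical schools with evenly spaced residuals (marks per student).
example = {f"S{i}": i - 4.5 for i in range(10)}
groups = classify_by_residual(example)
for school in sorted(groups):
    print(school, example[school], groups[school])
```

Schools falling between the bands are left unlabelled, which keeps the three groups as far apart as possible on the dimension of interest.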

Science teachers from the three groups of schools with results identified as WAE, AE and WBE were invited to complete the same online survey (to be explained in the next section, section 3.4). The invitations identified a website for survey returns which was different for each of the three school groups. Chapter Four includes a statistical description of the sample and its constituent groups and analysis of those returns.

3.3.2 Regression residual as both measure of collective scientific literacy and ‘effect size’ of science teaching.

Six propositions provide the basis for using a regression residual as both a measure of scientific literacy and effect size of science teaching. The first proposition is that student responses to items and tasks in well-constructed pen-and-paper tests (or online equivalents) provide valid evidence for making judgments about the level of achievement of many aspects of scientific literacy. This proposition attracts support from Fensham (2013), who commends the EV test as an example of “a good model” (p. 18) in international comparisons. Rowe (2006), in preliminary commentary about relationships between PISA 2003 reading, numeracy and scientific literacy results, says “Reading Literacy competence constitutes the foundational skill that underlies effective engagement with the school curriculum.” (p. 9, italics in the original).

The second proposition is that school science is a multiliteracy. Hackling, Peers, and Prain (2007) describe it this way.

Science-specific as well as everyday literacies are required by students to effectively engage with science, construct science understandings and develop science processes, and to represent and communicate ideas and information about science. (p. 14)

While students acquire “science-specific” literacy from a number of sources, the EV test targets the “science-specific” literacy described in official curriculum documents. Science teachers are expected to teach that content and related vocabulary to students. As well, it is important to recognise that science teaching is expected to develop other science-related capabilities that are not directly assessable using pen-and-paper testing (such as those needed for managing practical investigations).

The third proposition is that, according to the consensus of research reported by Hattie (2003b), only 30% of the accounted-for variation in achievement measured by tests is attributable to the experiences students have in the classroom; 50% is attributable to student factors such as ability and sociocultural background; and home, peer and school environment (physical, social and cultural) influences account for the remaining 20%.


The fourth proposition is that an empirically tested NAPLAN-based predictor of an individual’s EV result provides the best independent measure of what is beyond the capacity of science teachers working in their science classrooms to influence. In other words, it is a measure of the factors Hattie (2003b) refers to in the previous proposition as responsible for 70% of the explained variation in achievement.

Of the four predictors tested for this project, the one based on an aggregation of Years 7 and 9 reading and numeracy scores, equally weighted, provided for 89% of the explained variability (average R² = 0.892) in the EV result over the four years of interest. The remaining 11% of explained variability is most likely attributable to the impact of science teaching. This is small in the overall scheme of things because, according to Hattie (2003b), the teachers’ contribution to achievement (in science in this case) is 30% overall.

This contribution of science teaching to science achievement (as measured in tests like the EV test) is so small that “maximum variation cases” (Flyvbjerg, 2011, p. 306) are sought to ensure the best chance of finding corroborating evidence that the residual is a measure of the effect of science teaching (such as better than expected scientific literacy achievement).

The fifth proposition is that the residual from a regression of actual student EV results over their predicted results is a valid measure of the impact of science teaching at the school level. It is a real effect (over and above the measurement error component) that contributes per student marks to both EV results and (to a lesser extent) to NAPLAN results. The residual is what you get when all the other contributions to EV results apart from science teaching are accounted for. This effect was designated for the purpose of this thesis as the collective scientific literacy score for the school.

The sixth proposition is that when standardised appropriately, the scientific literacy score is a valid measure for comparing schools. The standardised four-year average residual (actual marks) ranged from 2.68 marks per student per school above the state regression “line of best fit” to 2.50 marks per student per school below it. As a generalisation, as long as the same or equivalent sets of test results are used to generate the residuals, standardised scientific literacy scores (as represented by the residuals) provide a valid basis for comparing the effect of science teaching on individuals in a class; groups within a class; classes at a school; and schools in a district, state, nation or group of nations.

Given the above reasoning and conditions, NAPLAN-based predictors could also be used to assess the impact of teaching on achievement in other learning domains apart from science.

3.4 Phase two: online survey for science teachers

The main purpose of the online survey was to collect data from science teachers about their assessment-related work. Its other purposes were to collect direct evidence of teacher use of EV program resources and related understanding of the SOLO model embedded in the EV program. Both the EV program and embedded SOLO are specific NSW initiatives designed to support teacher adoption of formative practices. Findings from the analysis of survey return data were the primary source of evidence for answering research questions one and two.

3.4.1 Survey design

An initial set of items for the survey was constructed using ideas from a range of inputs that included the published work of researchers with a special interest in assessment, for example, Black, Harrison, Lee, Marshall, & Wiliam (2004), Hattie (2012), and Shute (2007). Another source of ideas for items was the NSW Board of Studies’ syllabus (BOS, 2003) and its sections on assessment for learning and the use of terms such as “practices” and “strategies” (pp. 71-75).

Other influences that impacted the content of survey items and the overall design of the survey were this researcher’s previous experiences in the context of ‘insider’ work described in Chapter One. This work variously included critiquing, constructing and implementing surveys, collating and analysing the results, and providing feedback on proposed surveys. Two surveys that had a direct influence on the content and form of the final survey produced for this current research were:

• the telephone survey for secondary science teachers used to collect data for the Status and Quality of Science review (Goodrum, Rennie, & Hackling, 2001)
• a national survey on NAPLAN testing (Dulfer, Polesel, & Rice, 2012).

The former helped with the scope of the questions, the latter with the format of the questions and the decision to ask for personal information last of all.

To ensure face validity, items for the survey were refined in an iterative process involving several meetings with different groups of science teachers and one with education officers in the Department who had experience of survey design and expertise in assessment and SOLO. A draft version of the final survey was trialed online by five science teachers who volunteered to do so at the last meeting with science teachers. None of the trialing teachers were from schools subsequently invited to participate in the research.

It was this trialing that confirmed the 25-minute time allowance suggested for completing the survey online. Notwithstanding, the online version allowed teachers to stop, save and resume at will, and they were encouraged to keep a copy of their responses to check against the state results to be forwarded at a later date.

Another purpose for an online trial was to ensure that the online platform holding the survey was working as anticipated. Following the trial, and after evaluating teacher feedback from meetings and one-on-one conversations with science teachers, it was decided that providing feedback to participants was an appropriate incentive.

The decision to use UTS’s Survey Manager as the platform of choice for the online survey was based on:

• feedback from science teachers (convenience of online surveys and anonymity, if wanted)
• ease of distribution and management of returns
• support from experienced staff associated with the survey platform
• capacity for analysis using descriptive statistics of collected responses
• capacity to download to Excel and SPSS, if required, for more sophisticated analysis
• separate return of individual completed surveys with a date and time stamp to check the request for independent individual returns made in the survey itself.

The survey questions and related items are organised in four sections as shown in Table 3.1.

Table 3.1 Structure of online survey for science teachers

Section ONE: About ESSA/VALID
• Q1 a-i & Q2 a-m were about the EV program itself and included statements requiring yes/no responses about teacher engagement with test feedback data and components of EV program resources
• Q3 asked about their understanding of the purpose of the EV program (write a response)
• Q4 asked how well teachers understood the EV program (five point scale: very poor to very good)
• Q5 asked about intention to take up optional VALID 10 test (Yes / No / Unsure)

Section TWO: About SOLO
• Q6 a-j items here sought to discover the extent of teacher engagement with aspects of SOLO through a series of yes/no responses
• Q7 rate my understanding of SOLO (five point scale: very poor to very good)
• Q8 I learnt most about SOLO… (write a response)

Section THREE: About “Assessment for Learning”
• Q9 to Q15 were about formative practices. Questions and related items in this section were organised using the five dimensions of formative practices*
• Teachers were asked to choose between (not known-unsure about / never / seldom / sometimes / often) when responding to each of the survey items

Section FOUR: About your teaching experience / context
• Q16 to Q26 invited respondents to provide information about themselves, their experience and training and about their current school.

The last two questions in the fourth section
• Q27 and Q28 asked teachers to participate in a follow-up case study and to identify themselves, their school and provide contact details to facilitate that if interested.

* Responses to survey items provided the opportunity to create individual teacher profiles in terms of the five dimensions of formative practices differentiated from each other by the relative strength of each dimension.


The complete set of survey questions is provided as Appendix F.

3.4.2 Analysis of survey responses

Cresswell (2012) describes a five-step process for the conduct of hypothesis testing in the fourth edition of his handbook titled Planning, Conducting, and Evaluating Quantitative and Qualitative Research. The steps are:

1. Identify your null and alternative hypothesis
2. Set the level of significance, or alpha (α) level, for rejecting the null hypothesis
3. Collect data
4. Compute the sample statistic
5. Make a decision about rejecting or failing to reject the null hypothesis. (pp. 188-195, italics in the original)

This procedure was generally followed in the conduct of analysis of quantitative data collected for this project and reported in terms consistent with current American Psychological Association (APA) protocols.

The design intention here was to characterise the assessment-related work of science teachers in terms related to their use of EV resources, SOLO and the five dimensions of formative practice. Further, the sampling methodology delivered the responses in three sets corresponding to the groups of schools with results labelled as WBE, AE and WAE. The three groups were in effect three separate populations. Survey returns constitute the samples representative of those populations. The separated survey returns presented the opportunity for testing the hypothesis that there were no differences in teachers’ assessment-related work (the null hypothesis) despite the groups having EV results classified as WAE, AE and WBE.

The tools used to both manage and analyse the data collected from teacher responses to the survey were Microsoft’s spreadsheet software, Excel, and IBM’s Statistical Package for the Social Sciences (SPSS), which was renamed IBM SPSS Statistics in 2014. SPSS software includes a range of statistical tools that can be applied to provide descriptive statistics and a range of inferential statistical analyses. Inferential statistics provide a method for “generalizing from a sample to a population.” (Lane, n.d.).

It was decided to use one-way, between-subjects ANOVA to test the null hypothesis (that teacher assessment-related work was the same across the population of schools in each of the three groups of schools). If the analysis produced statistically significant differences in aspects of teacher practice between the populations, then it was reasonable to reject the null hypothesis and consider an alternative hypothesis. The alternative hypothesis was that aspects of assessment-related work and student levels of scientific literacy are positively associated.

1 The default assumption in SPSS for ANOVA calculations is the null hypothesis.
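The F statistic at the centre of a one-way, between-subjects ANOVA can be sketched in Python as follows. This is an illustration only and not the SPSS procedure used for the thesis: the function name and the survey scores are hypothetical, and the sketch stops at the F statistic and its degrees of freedom. The p value would then be read from the F distribution, for example with a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of independent samples."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical scores on one survey item for teachers in the three school groups.
wbe = [2.0, 3.0, 2.0, 3.0, 2.0]
ae = [3.0, 3.0, 2.0, 4.0, 3.0]
wae = [4.0, 3.0, 4.0, 3.0, 4.0]

f_stat, df_b, df_w = one_way_anova([wbe, ae, wae])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value for the chosen alpha level would be grounds for rejecting the null hypothesis of equal group means.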

Two errors are discussed in the statistics literature related to rejecting a true null hypothesis or failing to reject a false null hypothesis. Sample testing may return mean differences that at first glance suggest population differences in the variable of interest when in reality the differences do not exist (a false positive result). As a consequence, rejecting the null hypothesis would be an error. This error is identified as a “type I error” (Lane, n.d., p. 377). SPSS software provides a printout of the target statistic and the level of statistical significance (designated by the letter p) related to that statistic. By convention, in social research, a p value below 0.05 (or .01 in some situations) is considered a reasonable basis for rejecting the null hypothesis (Bryman, 2012; Cresswell, 2012).

In the event that there are actual differences between population means but the sample testing was not sensitive enough to reveal the differences (a false negative result), it is possible to decide that the null hypothesis should be retained rather than rejected. This error (failing to reject a false null hypothesis) is called a “type II error” (Lane, n.d., p. 378). The probability of making that mistake can be reduced by good experimental design and appropriate choice of statistical tools. The concept of statistical power is used in this context; it is a measure of “the probability of rejecting a false null hypothesis.” (Lane, n.d.). The greater the power the better.


The limitations related to using and interpreting the results of inferential statistics in this project will be provided in sections 3.7.2 and 3.7.3.

3.5 Phase three: case studies and science department assessment-related narratives

Both quantitative and qualitative data were collected for case studies of science departments in 16 schools. The methods used were audio-recorded semi-structured interviews, teacher-selected artifacts of assessment-related practices, and a proforma completed by teachers and populated with official school data about achievement in and engagement with science.

3.5.1 Audio-recorded semi-structured interviews: purpose and development

The purpose of the interviews was to collect qualitative data that could be interpreted to provide contextual information about the school and its science department’s culture and practices, and from this to construct school-specific narratives about assessment-related work in the science department. Substantive content would be used to inform answers to the research questions.

The interview was semi-structured (after Bryman, 2012) using a set of key and follow-up questions (to test silences in relation to options possibly forgotten). Given the demands being made of case study participants, a one-hour interview was considered sufficient for these purposes, and this proved to be the case.

Because the interview was a one-off event, the questions sought responses to relatively specific aspects of assessment and related practices in the context of science teaching, many of which had been first raised in section three of the online survey that teachers had completed some months earlier.

A final set of questions was trialed at a school not involved in the research. The purposes for the trial were to assess the best place at the school to conduct the interview so that participants felt at ease; to test the language related to the conduct of the interviews; to check on the wording of questions; and to determine how best to describe the artifacts of interest for collection. The goal was to ensure that the interviewees felt as comfortable as possible as quickly as possible. A precis of the questions and purposes for asking them follows.

Questions 1-3 asked what prompted participants to join the case study. This was followed by two questions about their use of the EV test and related resources. The hoped-for responses were insights into what impact (if any) the EV program had on assessment-related work of teachers (section one of the teacher questionnaire).

Questions 4-8 asked about the collection and use of evidence of learning, and more specifically about peer and self-assessment opportunities given to students (again, seeking insights as to the extent to which these two key aspects of formative practice were a priority in teacher thinking at this school).

Questions 9-10 asked about school and science department priorities in an attempt to gain some insight into their alignment. Based on this researcher’s experience, there was a likelihood of school priorities being formative assessment and/or the development of student literacy and numeracy skills, the latter being an attempt to understand whether there is an emphasis on ‘writing to learn’ and, if so, to what extent it has been taken up by teachers in the science department.

Questions 11-12 were about resources used to teach science and about how one knows whether what one is doing works (as a test of their commitment to assessment). This was also related to surfacing understandings about using the same resource for both teaching and assessment.

An opportunity was provided in relation to the online survey teachers had completed some months earlier for interviewees to explain how they decided what were the appropriate response options from among the choices: not known/unsure about; never; seldom; sometimes; and often. The purpose here was to check that the basis for choosing was similar for all respondents.

A question was asked about what interviewees understood progression in learning science means (given that SOLO provides one and the syllabus outcomes in a standards framework another). The concept of a progression in learning is a strong theme in the research on formative assessment (see Chapter Two).

A question was asked about the regularity of science department meetings, as was one about the nature and extent of discussions about assessment at those meetings. It was hoped that discussion here might provide insights into practices around the setting and assessing of student tasks; how issues about reliability and validity are dealt with; and whether the meetings provided opportunities for teachers to display good learning behaviours with each other.

An opportunity was provided for interviewees to discuss what, if anything, had surprised them about aspects of their school EV results or student survey feedback, Year 10 or Year 12 data put into the proforma. This question was exploratory, and hoped-for responses included references to how the science department was responding to student perceptions of their science experience or the extent to which this exercise in result analysis was more or less than what is currently the norm.

Interpretive analysis by the researcher of teacher responses to the interview questions was an iterative process. The process involved the production of comprehensive, holistic, qualitative descriptions (Sandelowski, 2000) of practice framed by the interview questions. The purpose of the analysis was to generate narratives including examples or contexts to support and illuminate answers to the research questions.

All 16 recorded interviews were listened to at least three times. No more than four interviews were listened to and analysed in any one day. The elapsed recording times to uniquely descriptive instances of practice in the context of that school were noted (to enable efficient return to them at a later time for additional replaying). Notes were created during the first replay to summarise responses. Replay was stopped and rewound over some sections to check that the record was a clear and accurate summary of what had been said.


While the second replay was in progress, the first set of notes was checked to ensure key activities, strategies, examples or insights related to formative practices already noted were consistent with what was being said. At the third listening, prior notes were compared with what was being heard to ensure all key insights and examples were appropriately referenced, and further additions/corrections were made when considered appropriate.

The ten interviews with the case study schools reported on in Chapter Five were then listened to again before writing the assessment narratives using the following scaffold. The components of the scaffold were derived from the teacher interview questions (A and B) and the teacher survey questions, including the five dimensions of formative practice (C to G). The last component (H) was an opportunity to provide summative comments identifying unique practices or commonalities with other schools.

A. Engagement with EV feedback, resources and SOLO

Any references to the EV program, how it was valued compared to NAPLAN, issues with doing the tests (students, staff supervision or access to computers), feedback used (or ignored), and impact on science assessment generally were reported here. Any references to SOLO or its uses were also reported here.

B. Grouping for instruction

The sources of assessment data used to establish Year 7 classes, who did it and how it was used to allocate students to groups for instruction are reported here. Classes so formed were variously labeled as mixed ability, graded, streamed, or parallel. The timing and basis for changing student allocations to classes as they progressed from Year 7 to Year 9 were also reported.

C. Use of learning intentions and success criteria

In this section, school and science department teaching and learning priorities and their sources were recorded. The form of teaching and learning program components that communicated learning intentions to teachers were noted. Also recorded were details of assessment tasks, priorities as revealed in the related rubrics, and alignment with syllabus intentions. The links between success criteria, mark allocation and subsequent conversion to grades for the purpose of reporting to parents were also examined. The researcher also listened for evidence of student involvement in devising or choosing either learning intentions or success criteria.

D. Classroom discourse and evidence of learning

Teaching science involves engaging students in a range of activities, including using equipment to measure and record observations; accessing second-hand sources of data and information; and designing and carrying out investigations to solve problems and answer questions. It involves working alone and with others and it may take place in a regular classroom, a dedicated space with special fittings (such as school laboratories) and access to a range of specialist equipment (including ICT based tools), or it may take place beyond the school walls. Of interest here was the extent to which teachers made use of the diversity of options in these settings to observe evidence of learning and how they managed the discourse so that evidence of learning was made explicit.

E. Feedback

This section records who did what with the evidence of learning produced from teaching and learning activities (such as those described in the previous section). In particular, it was useful to record whether the feedback provided sought to progress learning for both the student/s and their teachers, and whether it was about what form the completed task would take, the skills to be improved, metacognition, or praise for the learner (such as a tick or comment). Of interest too were the referents for criteria used in feedback. Referents of interest here were syllabus intentions (scope of responses and/or depth), misconceptions, SOLO levels of thinking, or some other referent such as the Board’s Common Grade Scale. How accumulated marks are converted to grades for reporting purposes was also of interest here.

F. Activating students as instructional resources for others

Here the emphasis was on recording the opportunities students were given to provide peer feedback and the guidance to ensure that it was a productive process for both the provider and recipient. Examples might include structured group work where students are assigned roles or given opportunities to demonstrate to or instruct others; teacher use of strategies such as predict-observe-explain (POE); think-pair-share-report; jig-saw; or joint construction of student responses to phases in an investigation.

G. Activating students (and teachers) as learners

In this section the focus was on reporting examples of good learning behaviours modeled by either or both students and teachers. To be worth noting, the opportunities had to be explicitly provided (such as keeping reflective journals, choosing items for a portfolio, defending choices, or making links to previous learning in science and/or other subjects). For teachers, opportunities may include working collaboratively with each other to mark assessment tasks; annotating work samples to use when converting marks to grades; identifying mark cut-offs for converting to grades; developing further understanding about what a progression of learning in science looks like; developing a “scope and sequence” for a unit of work; or developing an assessment rubric that includes criteria for rewarding different levels of student response to an item or task.

H. Comparative summative comments

Summative statements relating comparative achievement and engagement to aspects of formative practice revealed in interviews and artifacts, along with commentary about the extent of confirmation for the predictions (or otherwise), completed the reports.

3.5.2 Artifacts of assessment practice: purpose

Schools identified for participation in the case studies were advised in an email to collect any documentation, models (or images of same) used to inform or support assessment-related work in science at the time of the researcher’s visit. Artifacts sought were examples of things teachers considered to be ‘best practice’. The purpose was to use the artifacts to confirm interview and survey responses and to provide examples to illustrate assessment-related narratives developed for specific case study schools. The artifacts asked for included:

• teacher-devised assessment policies to guide assessment-related work of science teachers
• formal reports of achievement or progress by students (the ones sent home)
• examples of assessment tasks
• learning programs where specific references to assessment were made
• lesson plans or student ‘worksheets’ where assessment-related activities were the main focus
• annotated exemplars of quality work at different levels produced by students
• rubrics used to assess activities and to provide feedback to students.

Analysis of the collected artifacts of assessment practice was performed after listening to and summarising the interviews. The focus was to look for confirmatory, contradictory or additional information to illustrate the narratives for each school.

3.5.3 Case study school data: purpose

Participating teachers at case study schools were asked in advance of a school visit to provide school-level data about EV achievement, Year 10 results and Year 12 science course completion data relevant to the years of interest (2010 to 2015). Participants were sent a proforma (in both hard copy and as an Excel spreadsheet) to assist them to prepare for a planned visit. The school-specific information was sought to provide data about later achievement and engagement (explained in subsection 3.5.5), both of which were relevant to answering research question three and for assessing predictions related to self-regulated learning. The proforma sent to schools is attached as Appendix E.

The EV data requested of case study schools was for the years from 2007 to 2015. It transpired that in most cases respondents were only able to access data in SMART for the years 2011 to 2015. SMART is the acronym for School Measurement Assessment and Reporting Toolkit. It is a sophisticated software tool provided online to schools by the Department, and it can be used to perform limited forms of analysis on test results from external testing.

Data for the years before that were apparently unavailable to the respondents, except for three schools where the data had been retained in science department records. Other data relating to Year 10 results and numbers for Year 12 completions of senior secondary science courses were available to schools in the Board-provided Results Analysis Package (RAP). Most schools did not retain the Year 10 data as part of their science department records. Year 12 results were generally retained at the science department level and were provided to the researcher in all cases. Most schools had to ask the Head Teacher English for their numbers in order to calculate the proportions (as a percentage of the English candidature) of students doing the various science subjects.

It is for the principal to decide who at a school has access to SMART and RAP. The purpose for asking schools about their results was to collect information during interviews about how that information was used to inform assessment-related work in the school and in its science department. Only three schools brought completed proformas to interviews. The remainder provided them after the interviews. In a few case study schools, this information was not immediately accessible to science teachers other than the HT.

The researcher had Departmental approval to access and use aggregated school-level results. However, access to the pattern of school results was at the discretion of school principals. Access to the results was provided by the principal in all but two schools, where the Year 10 data requested were withheld.

Feedback to schools, students and parents from the Department about EV test results is provided in SMART. The proforma provided to schools included tabulated spaces for school-level data for four of the five reporting categories relating to EV test results. Student achievement data for the school and state are both reported in SMART against three achievement bands.

School-provided and other data from case study schools were recorded in an Excel spreadsheet, which was later transferred to SPSS in order to perform statistical processes with the data. As will be discussed in more detail in Chapter Five, six items of the 21 in the survey will be reported on in this thesis. Analysis of student survey responses was designed to provide patterns of difference in strength of agreement/disagreement on each of the items within and between the paired schools. This analysis was straightforward. The mean scores were printed out as tables and different coloured highlighters were used to identify each school’s difference from the state population rating (above, below and the same each had a different colour).

How later achievement and engagement in science were assessed for the purposes of this research is explained next.

3.5.4 Defining later achievement in science

The measure of students’ later achievement in science was the pattern of grades awarded to students at the end of Year 10 (two years after the EV test) based on school processes and endorsed by the Board. An option would have been to include end-of-Year 12 results in science as well. This was not done for two reasons: first, because the data collected about assessment practices was specifically focused on the first three years of secondary schooling; and second, to reduce the amount of time required of participating teachers.

The issue of assessing improvement in achievement over the years within a school is not straightforward because the basis for both assessing and reporting achievement is different at each of the two chosen points of interest. The key differences in the reporting of achievement are outlined next.

Results for the EV test are a one-off summative assessment reported in levels from 1 to 6 referenced to a scale based on SOLO levels of learning. In SMART, a second way of providing feedback on results is to report them as the proportion of students in achievement bands (three bands are used: band 1, band 2 and band 3, the latter being the proportion of students at the school attaining levels 5 and 6, the highest two levels).

Student achievement at the end of Year 10 is reported as grades (A to E, A is highest). The grades are referenced to the Board’s Common Grade Scale (BOS, 2013). The Scale describes five standards of achievement. In all of the 16 schools interviewed the grades awarded by teachers are based on their judgment of the standard implied by particular mark ranges within the range of aggregated marks for all tasks completed in that year. For example, marks ranging from 60 to 70 (out of, say, 100) might be indicative of work consistent with that described for a B grade on the Board’s Scale. The Board’s Common Grade Scale is not related to SOLO levels.

Case study schools provided the proportions of students obtaining the top, middle and bottom achievement bands in Year 8 for their school, and comparable data for the state in the years of interest. Schools also provided the proportions attaining grades A to E in Year 10 for their students. The relative proportions of students obtaining A to E in science in the state in the years of interest were obtained from the NSW curriculum and assessment authority’s website (NESA, 2017).

Thus, changes in intra-school proportions relative to the state at both Year 8 and Year 10 provide a good basis for monitoring later achievement (Year 10 compared to Year 8).

3.5.5 Defining engagement with science

In the context of this project, a simple operational view of engagement was chosen to assess the extent of later engagement with science (see research question three). It was chosen for pragmatic reasons relating to data availability and the sense in which education minister Tebbutt used it when announcing the EV program in 2005 (see section 2.2). As students are free to choose whether or not they take up science courses after Year 10, comparing the proportions of students completing science courses at the end of Year 12 was chosen as the measure of later engagement (see research question three).

Because science is a compulsory course for the first four years of secondary schooling, a different way of assessing engagement was needed. Student responses to items from the student survey accompanying the EV test provide an alternate way of measuring engagement at the end of Year 8. Items in the EV survey asked students to rate on a four-point scale their agreement (or disagreement) with a series of statements related to science and their school experience of it. Selected survey items (six of 21) were chosen as the basis for measuring engagement.

The items chosen covered interest in science, enjoyment of science in primary and secondary school, perceived difficulty of science relative to other subjects, perceived success in learning it and whether it was one of their favourite subjects. These (and other) aspects of affect appear in research papers attempting to define engagement with science (including its retention when free to drop it). See, for example, the UK’s National Foundation for Educational Research (2011) report titled Exploring young people’s views on science education, where some of these aspects are discussed. At this time, there does not appear to be an evidence-based consensus about how best to define engagement. Student feedback on aspects of affect addressed by the items provides data for evaluating their usefulness as markers for student self-regulation, which is explored in Chapter Five.

Later engagement

The data provided by schools was used to generate an operational definition of later engagement. English is a mandatory course for all students wanting the Higher School Certificate (HSC). The HSC is the school exit credential provided to students from NSW schools who want it as support for entry to post-school options including work and higher or further education. Science teachers were asked to convert their science course completion numbers to a percentage relative to English numbers at the school.

Numbers of students completing English and senior science courses each year across the state for the purposes of the HSC were obtained by the researcher from the NSW Education Standards Authority website (NESA, 2017). Statewide proportions relative to English were then calculated. These two sets of numbers, school and state proportions, provide an objective basis for making inter-school comparisons related to engagement as defined above. Students make a choice to continue with or drop science after Year 10. Thus, the measure of engagement based on proportions completing science courses (relative to English) at the end of Year 12 would also appear to be a strong measure of the collective valuing of science by students at a school half-way through Year 10, when they make their choices for subjects to study in the senior years of schooling.
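The conversion of completion numbers to a percentage of the English candidature can be sketched in a few lines of Python. This is a minimal illustration only: the function name and all completion counts below are hypothetical and are not drawn from the data collected for this thesis.

```python
def percent_of_english(science_completions, english_completions):
    """Express a science course's completions as a percentage of English completions."""
    if english_completions <= 0:
        raise ValueError("English completions must be positive")
    return 100.0 * science_completions / english_completions


# Hypothetical counts for one school and for the state (illustrative values only).
school = {"English": 120, "Biology": 42, "Chemistry": 30, "Physics": 24}
state = {"English": 70000, "Biology": 17500, "Chemistry": 10500, "Physics": 9100}

for course in ("Biology", "Chemistry", "Physics"):
    school_pct = percent_of_english(school[course], school["English"])
    state_pct = percent_of_english(state[course], state["English"])
    # A positive difference means the school's proportion exceeds the state's.
    print(f"{course}: school {school_pct:.1f}%, state {state_pct:.1f}%, "
          f"difference {school_pct - state_pct:+.1f}")
```

Because English is mandatory for the HSC, its completion count serves as a stand-in for the size of the Year 12 cohort, which is what makes school and state percentages comparable.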

The student survey component of the Year 8 EV test provides a way of measuring students’ level of satisfaction with their experience of science in the first two years of secondary school. When aggregated and averaged over the years of interest and compared to statewide data, the survey provides an objective measure for engagement that can be compared over time both within the school and between schools (when referenced to state proportions).

The data used to produce a Year 8 measure of engagement were collated from school records by teachers in case study schools. The data asked for was a subset of the EV feedback provided in SMART. Given that one third of the teachers had said in survey responses that they had not looked at survey data (see section 5.6.2), responses to only six identified items (of 22) were asked for. SMART includes the state proportions of students at each achievement band level for each item.

While it is true to say that cross-school comparisons for both achievement and engagement can be made using objective measures, these would almost certainly not be valid unless other factors contributing to the scores are made explicit. Those doing the comparison are then able to make an informed judgment about differences after considering the likely impact of these factors. This issue is dealt with in the next section.

3.6 Comparable schools and three predictions

The point of making intra-school comparisons for achievement in and engagement with science is to assess whether, over time, successive cohorts of students are doing better at key points in the journey through secondary schooling, such as at the end of Years 8, 10 and 12. In other words, are there refinements being made at the school level to teaching and learning programs in the light of feedback resulting in better overall achievement for successive cohorts passing through those points? Changes to achievement and engagement patterns that reveal growing proportions of students at higher levels or grades, or growing numbers taking senior science courses, would no doubt be welcomed as evidence of improvement and be entirely consistent with the use of formative practices by science teachers.

In the context of this research, inter-school comparisons provide a means for independently testing the validity of a claim made in Section 3.3. The claim there was that the size and sign of the regression residual is a direct measure of the scientific literacy component of EV results: the more positive the residual, the bigger was the contribution to the EV result overall (making it better than expected). Also, it was claimed there that the scientific literacy effect is directly related to the impact of science teaching. If that is a valid claim, then for a pair of comparable schools (one in the WAE group of schools and one in the WBE group, say) the actual EV results in the WAE school should be better than the results in the WBE school. The meaning of ‘comparable’ is explained below.

Comparable schools were defined by the researcher as schools having the same or very similar SEA scores. SEA is the acronym for “socio-educational advantage” (ACARA, 2014b, p. 3), which is a measure published for schools on the My School website. It is an independent measure of the capacity for learning each student brings to school. This measure of student educational disadvantage/advantage is determined from parents’ levels of education, occupation and post-school qualifications. A fourth category of current employment status of parents was added to SEA determinations from 2013 onwards (ACARA, 2014b) because it was found helpful in improving the correlation between the SEA score, ICSEA and subsequent NAPLAN results.

The School Profile page for each school on the My School website provides the SEA data as a quartile profile showing the proportions of students at that school in the four quarters from the most educationally disadvantaged to the most educationally advantaged. In order to protect the identity of the school, the SEA profile data for each school was used to produce what the researcher called the SEA score. The profile quartiles were converted to a single score on a scale of 0–10 using a simple linear transformation. The lower the number, the larger the proportion of educationally disadvantaged students at the school; the higher the number, the larger the proportion of educationally advantaged students at the school. The SEA score for each school is the four-year average of the SEA scores for the Year 7 intakes in 2010 to 2013, inclusive.
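The exact transformation is not spelled out here, so the following Python sketch shows only one plausible linear mapping, offered as an assumption rather than as the method actually used. The quarter weights (0, 1/3, 2/3, 1), the function names and the profile values are all hypothetical; they were chosen so that a school with every student in the most disadvantaged quarter scores 0 and a school with every student in the most advantaged quarter scores 10.

```python
from statistics import mean

# Assumed weights: 0 for the most disadvantaged quarter up to 1 for the most advantaged.
QUARTER_WEIGHTS = (0.0, 1 / 3, 2 / 3, 1.0)


def sea_score(quarter_percentages):
    """Map a four-quarter SEA profile (percentages, bottom quarter first) onto 0-10."""
    if len(quarter_percentages) != 4:
        raise ValueError("expected four quarter percentages")
    total = sum(quarter_percentages)
    if total <= 0:
        raise ValueError("percentages must sum to a positive number")
    weighted = sum(w * p for w, p in zip(QUARTER_WEIGHTS, quarter_percentages))
    return 10.0 * weighted / total


def four_year_sea_score(yearly_profiles):
    """Average the yearly scores, for example the 2010 to 2013 Year 7 intakes."""
    return mean(sea_score(profile) for profile in yearly_profiles)


# Hypothetical profiles for one school (percent of students in each quarter).
profiles = [(40, 30, 20, 10), (38, 32, 19, 11), (42, 28, 21, 9), (39, 31, 20, 10)]
print(round(four_year_sea_score(profiles), 2))
```

Under these assumed weights a school with students spread evenly across the four quarters scores 5, the midpoint of the scale.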

The reasoning behind the decision to use the SEA as the control follows. In Hattie’s (2003b) terms ACARA’s SEA is equivalent to the student factors that he says provide 50% of the accounted-for variability in test results. The measures of regional remoteness and percentage of Indigenous student enrolment, which ACARA refers to as school factors, are equivalent to the factors Hattie (2003b) says contribute up to 20% of the accounted-for variability in test results. What the teacher does in the classroom contributes the rest, he says.

The ANOVA performed on teacher responses to the survey questions about formative practices provided a profile of science assessment-related work for the sample of teachers from each of the school groups. If the sample means related to the dimensions of formative practice in, say, the WBE and WAE samples were shown to be significantly different, the difference in practice associated with that mean was then generalized to apply to all the schools in that group. The group profile is described in terms of the five dimensions of formative practice. If the EV results for comparable schools (that is, schools having the same SEA scores) are statistically significantly different in the way predicted by the residual, then it would be reasonable to attribute that difference to the formative practice profile of science teachers in that group of schools. This is because the residual assigning the school to a particular group is also an imputed measure of the ‘effect size’ of science teaching.

The strength of the relationships between school group, EV results, and formative practice profiles can be tested using correlation statistics, which SPSS has the capacity to perform. As well, according to the research evidence discussed in Chapter Two, if formative practices are more frequent in WAE schools then we could reasonably expect that students in the WAE school are, collectively, more skilled at learning and more motivated and engaged than students in the WBE school. If that is the situation at the end of Year 8, it could reasonably be expected that students in the WAE school would apply those skills, motivation and engagement going forward, with the same relative effects on achievement at, say, the end of Year 10.

With the above in mind, three predictions were made:

1. Overall EV results for students in comparable schools will be better in WAE schools than AE schools, and AE school results will be better than WBE schools.
2. Overall Year 10 science result patterns for students in comparable schools will be better in WAE schools than AE schools, and AE school result patterns will be better than WBE school patterns.
3. The proportion (relative to English) of students completing Year 12 science courses in comparable schools will be highest in WAE schools, and AE schools will have a higher proportion of completions than WBE schools.

Verification of the predictions and related discussion drawing on the assessment-related narratives particular to the case study schools will be provided in Chapter Five.

Findings (Chapter Four) and assessment-related narratives (Chapter Five) provided the data and information used to inform discussion reported in Chapter Six about the impact of formative practices on student learning of science in the early years of secondary education in NSW government schools.

3.7 Limitations

Specific factors that impact the trustworthiness and the validity of findings in both qualitative and quantitative research generally and in this research specifically follow.

3.7.1 Trustworthiness of qualitative research

To ensure the persuasiveness of the answers relating to the “why”, “how” and “impact” components of the research questions, this researcher took steps to ensure that the evidence used to construct answers satisfies the four criteria for a “trustworthy study” (Shenton, 2004, p. 64): credibility, transferability, dependability, and confirmability. Potential concerns that the researcher in this project should have been positioned as a participant researcher/observer (Denzin & Lincoln, 2011; Hammersley, 2008) are addressed.

Shenton has used these four criteria, originally proposed by Guba (1981), in his own work, claiming that the criteria have been “accepted by many” (Shenton, 2004, p. 64). Shenton (2004) argues that these criteria are analogous to four criteria used by positivists to defend their work. Credibility is the qualitative research analog for internal validity; transferability is the analog for external validity/generalisability; dependability replaces reliability, and confirmability replaces objectivity.

Credibility

Credibility is about congruence of findings with reality (Merriam, 1998). Transferability is about providing enough contextual detail for a person to make a judgment that “findings can justifiably be applied to [a different] setting” (Shenton, 2004, p. 63). Dependability is difficult to achieve in a qualitative study, but a goal should be to have sufficient detail to enable “a future investigator to repeat the study” (Shenton, 2004, p. 63). Confirmability is about “researchers [taking steps] to demonstrate that findings emerge from the data and not their own predispositions” (Shenton, 2004, p. 63).

Shenton (2004) advocates 14 “strategies” (p. 64) that may be used to achieve credibility. These include using well-established methods in qualitative research “in general and in [education] in particular” (p. 64). Interpretive analysis of interviews and artifacts of practice within the constraints of a case study is a well-accepted methodology in qualitative research. In this project, interpretive analysis of semi-structured interviews and artifacts of assessment practice that were selected by teachers as representative of their ‘best practices’ produced data and findings about context relevant to understanding results obtained quantitatively.

Another strategy for ensuring credibility is researcher “familiarity with the culture of participating organisations” (Shenton, 2004, p. 65). This researcher’s direct and continuous involvement with science education since the late 1960s was an important factor in his decisions about what to ask of participants in the case study components and, as mentioned earlier in Section 3.4, his choosing and devising items for the online survey. Other strategies mentioned in relation to credibility include tactics to ensure respondent honesty, including iterative questioning. These were explicit considerations at various points in the research reported here.

Using multiple sources and multiple data collection strategies is another way to promote credibility in research. In this research project, some of the interview questions sought to corroborate the extent of shared understanding between interviewee and interviewer (this researcher) when it came to items listed in the online survey. Examples from the online survey include item 10e about the use of the think-pair-share-report strategy; item 11c about the use of grades as a form of feedback; item 15e about how staff develop a shared understanding of what progression in learning science looks like; and a direct question asking teachers how they decided between often, sometimes and seldom when considering how frequently they employed the activities/strategies described in the online survey items.

Transferability

Transferability is the second criterion used to establish trustworthiness. In relation to qualitative research, this is contentious because of the limitations imposed by the boundaries of case study work. Shenton (2004) says:

Ultimately, the results of a qualitative study must be understood within the context of the particular characteristics of the organisation or organisations and, perhaps, geographical area in which the fieldwork was carried out. (p. 70)

Shenton (2004) cautions that when inconsistencies are found, this may not reflect on the trustworthiness of the research but may be an indicator of multiple social realities. In this research, every attempt was made to provide sufficient contextual information for people to make a judgment about the contention that formative practices have a demonstrable impact on science learning and related attitudes to science.

Dependability

Dependability is the third criterion. The detail provided about the conduct of the research reported here should enable a person to repeat the process at another place or in a future time period. Their intention might be to confirm findings, but equally, it might be about whether a different reality is a better fit for the findings.

Confirmability

The fourth criterion of trustworthiness (confirmability) can be provided by triangulation to check investigator bias (or to assess participant researcher/observer bias); making explicit the researcher’s beliefs and assumptions; drawing attention to limitations of the methods used and their potential impact on findings; and describing explicitly and in detail the methods that enable scrutiny of results. The details provided in this thesis relating to the methods and sample sizes should enable a reader to verify for themselves the findings, inferences and conclusions.

In this project, interpretive analysis of semi-structured interviews and artifacts of assessment practice that were selected by teachers as representative of their ‘best practices’ produced data and findings about context relevant to understanding results obtained quantitatively.

3.7.2 Validity and reliability of quantitative data

Quantitative research criteria relating to validity, reliability and objectivity have long been touchstones for assessing the worth of research findings (Bryman, 2012). In the application of statistical methods to provide an objective basis for reporting findings, a distinction is made between descriptive statistics and inferential statistics. Descriptive statistics include concepts such as sum, average, mean, measures of frequency and measures of distribution. Inferential statistics involve the use of concepts such as correlation, probability, statistical significance, power and confidence levels in discussing test results.

SPSS software provides tools to analyse quantitative data and produce a range of descriptive statistics characterizing the data. Features of the data can then be evaluated for impact on the inferential statistic of interest. Data may be judged as being either parametric or non-parametric and the appropriate tool can be chosen for the proposed test, such as ANOVA. The accuracy of the calculated ANOVA statistic may be compromised (and in extreme situations, invalidated) by using data that does not fully comply with all the data assumptions for parametric analysis which, according to a Laerd Statistics (2018) tutorial and Lane (n.d.), are:

1. The dependent variable should be measured at the interval or ratio level
2. The independent variable should consist of two or more categorical, independent groups
3. Independent observations (no relationships between the groups; no subject in more than one group)
4. No significant outlier data values
5. Dependent variable data should be approximately normally distributed for each category of the independent variable
6. Data displays homogeneity of variance
7. Sample numbers in the different groups are approximately equal.

According to Lane (n.d.) and Rennie (1998) the power of the statistic being calculated using samples is enhanced (reducing the chance of failing to reject a false null hypothesis) when the:

1. sample size is large
2. standard deviation is small
3. difference between the hypothesized and actual means being compared is large
4. significance level is less stringent
5. test is one tailed (and the hypothesized direction is correctly specified).

In the event that the parametric statistic and related statistical significance figure based on an assumption of parametric data are inconclusive, post hoc tests based on the assumption that the data were, in effect, nonparametric may be more powerful or robust and provide a reasonable basis for rejecting (or retaining) the null hypothesis.

Teacher survey responses and the school-level datasets for EV test results, Year 10 assessments and Year 12 science course completion numbers were processed using both descriptive and inferential statistics. Findings from the applications of statistical processes will be provided in Chapters Four and Five.

3.7.3 Summary of limitations affecting this study’s findings

Qualitative data

Two limitations in relation to the artifacts collected for this project are worth mentioning. The first was that, for the most part, artifacts reflected current practice and with few exceptions had been produced in the two years preceding this research in response to the introduction of a new syllabus that was being implemented from 2014. The years of interest for this project predated 2015. The second was the extent to which the artifacts were representative of the diversity of teacher practice.

In the end, the samples provided were assessed for alignment between aspects of the provided assessment rubric and syllabus intentions (as expressed by outcomes, access to related content prescribed by the syllabus, and the context in which the activity was embedded). The syllabus then, as now, intended teachers to provide contextualised activities to engage student interest.

When considering the characterisations of formative practices produced from teacher survey responses, it was important to remember that the profiles drawn were only in relation to practices in Years 7, 8 and 9. This is relevant to any discussion about the extrapolation of findings in relation to the three predictions described in Section 3.6.

Great care had to be taken when interpreting teacher responses to interview questions about assessment-related practices, for two reasons. The first was that this researcher (who conducted the interviews) knew only one of the participating case study school science teachers before the interviews. That teacher had attended a two-day workshop presented by this researcher more than ten years earlier. Initial natural reserve when it came to disclosure of practices was evident in most cases.

However, an hour is a generous time for a one-on-one discussion and most participants seemed to appreciate the opportunity to discuss their practice with an interviewer who understood their situation and to whom they could make frank disclosures about their work. No interview was terminated before the assigned time; most went longer.

The second reason was that the interviews were being conducted in 2016 about assessment practices related to a syllabus that schools were no longer working with (it was replaced after 2014). The new syllabus was sent to schools in 2012 and science teachers were encouraged then to begin planning for its implementation into Years 7 and 9 from 2014 and Years 8 and 10 from 2015. The new syllabus became the basis for EV testing in Years 8 and 10 from 2015, the year after the period of interest for this thesis.

This meant that HTs in case study schools were managing syllabus implementation processes that had been in progress for at least two years after the period of interest relating to achievement. These processes included reviewing and adjusting the set of summative assessment tasks to reflect new syllabus learning intentions. In practice this meant very little change in the subject matter, and the weighting between knowledge and understanding, and skills was the same (50:50).

A number of the case study schools had changed the assessment modes used to collect assessment data. Some replaced formal pen-and-paper tests with research projects, practical tasks and oral presentations. The issue was to work out whether what was being provided in the discussion and artifacts were recent innovations (i.e. had been introduced after 2014) or were in place before that.

Artifacts of assessment-related practice provided by teachers needed to be considered in the light of recency as well. The main issue was to work out which part of the school narratives about assessment for learning applied before or after 2014. Questions from the interviewer were used to assist with that where necessary.

Quantitative data

The criterion of data independence was provided by an experimental design that asked for and delivered responses from either WAE or AE or WBE designated schools to three different websites. The instructions with the online survey were explicit in asking for individual responses. A check on the timing of survey returns supported the assessment that returns were from individuals even when multiple returns from (teacher) identified schools were received. One school that identified itself said it had provided a consensus return from the five teachers comprising the science department. It was treated as an individual return for the purposes of this exercise.

The data normality requirement was tested using the Shapiro-Wilk test in SPSS. The SPSS tutorial advice was that the Shapiro-Wilk test is more appropriate for sample sizes less than 50 (Laerd Statistics, 2017).

The requirement for homogeneity of variance was examined using both the Levene test and the Welch test (which SPSS reports under ‘Robust Tests of Equality of Means’ and which does not assume equal variances), and the most appropriate test result was reported. Both of these tests are readily available in SPSS.

If the ANOVA statistic for the between-group means was statistically significant, the Games-Howell Multiple Comparisons Test was used to identify the groups with statistically significantly different means. The Games-Howell test is recommended where the group sizes are relatively small and unequal in number and, as in some cases here, datasets are borderline in terms of homogeneity of variance and normal distribution (Laerd Statistics, 2017). The Tukey HSD test assumes equal variances and was not an appropriate test in most cases. These two tests (and more) were readily accessible within the SPSS software used.

Because the survey was voluntary and anonymous, it was not possible to predetermine the total number of responses or how the individual response numbers would be distributed across the three populations. As a consequence, the group sizes were unequal and the number of subjects relatively small. While there were 101 respondents in total, only complete or almost complete datasets for sections being analysed were used. The numbers of datasets remaining in each group were: nWBE = 32, nAE = 28, and nWAE = 25, meaning that data from 16 (about 16%) of the respondents was not used. The impact of missing data within the datasets used was managed by the SPSS tools used to report the statistical significance of the statistic produced.

The nonparametric Kruskal-Wallis ANOVA was generally used where tests for homogeneity of variance and normality were not completely satisfied.
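For readers unfamiliar with the Kruskal-Wallis test, the calculation can be sketched in plain Python. This is an illustrative sketch only and not the SPSS procedure used in this research: the function name and the ratings are hypothetical, and the p-value relies on the usual chi-square approximation, computed here only for the three-group case (two degrees of freedom), where the chi-square survival function reduces to exp(-H/2).

```python
import math
from itertools import chain


def kruskal_wallis(groups):
    """Return (H, p) for the Kruskal-Wallis test on a list of samples.

    H is corrected for ties. p uses the chi-square approximation and is only
    computed for exactly three groups (2 degrees of freedom); otherwise None.
    """
    pooled = sorted(chain.from_iterable(groups))
    n_total = len(pooled)

    # Give each distinct value the average of the ranks it occupies.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                             # number of tied observations
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += t ** 3 - t
        i = j

    rank_sum_term = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)
    h /= 1 - tie_term / (n_total ** 3 - n_total)  # tie correction

    p = math.exp(-h / 2) if len(groups) == 3 else None
    return h, p


# Hypothetical frequency ratings (1 = seldom ... 4 = often) from three groups.
wbe = [1, 2, 2, 3, 2, 1, 2]
ae = [2, 3, 2, 3, 3, 2]
wae = [3, 4, 3, 4, 4, 3, 4]
h_stat, p_value = kruskal_wallis([wbe, ae, wae])
print(f"H = {h_stat:.2f}, p = {p_value:.4f}")
```

Because the test works on ranks rather than raw values, it does not depend on the normality or equal-variance assumptions listed for parametric ANOVA, which is why it suits small, unequal groups of ordinal survey ratings.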

Year-on-year variability and school misfortunes can impact results in a one-off test. Examples might be the death of a teacher or a student, as well as individual student circumstances. The relative impact of individual or group misfortune on aggregated results is inversely proportional to the Year 8 population. For example, one student dropping an achievement level in a school’s Year 8 population of 30 produces a variation of about 3% in the proportion of students at that grade level; in a Year 8 with 100 or more students, the impact is of the order of a 1% variation or less. Averaging results over four years reduces the impact of year-on-year variations, particularly for small schools. This was a factor taken into account when determining the tolerances for deciding differences in results or engagement patterns (see Chapter Five).
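The arithmetic behind this example can be checked with a short Python sketch. The function name is hypothetical, and the four-year case rests on a simplifying assumption made here for illustration: cohort sizes are equal each year and the change affects one student in one year only.

```python
def one_student_shift(cohort_size, years_averaged=1):
    """Percentage-point change in a proportion when one student changes level.

    With years_averaged > 1, the change is assumed to occur in a single year
    and to be diluted by averaging equal-sized yearly proportions.
    """
    if cohort_size <= 0 or years_averaged <= 0:
        raise ValueError("cohort_size and years_averaged must be positive")
    return 100.0 / (cohort_size * years_averaged)


for size in (30, 100, 200):
    single = one_student_shift(size)
    averaged = one_student_shift(size, years_averaged=4)
    print(f"cohort {size}: {single:.1f} points in one year, "
          f"{averaged:.2f} points in a four-year average")
```

For a cohort of 30 the single-year shift is about 3.3 percentage points, falling to under 1 point once four years are averaged, which is why averaging matters most for small schools.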

3.8 Research approvals

As a PhD candidate, this researcher sought and was granted UTS ethics approval (UTS HREC REF NO. 2015000453) in September 2015 to undertake the research described in this thesis.

An application to the NSW Department of Education to access its state-wide EV and NAPLAN results and to approach schools to participate in research was granted in November 2015 (SERAP 2015373).

CHAPTER FOUR: FINDINGS FROM PHASE TWO

This chapter reports findings from phase two of the research design, the analysis of survey returns from science teachers. The findings provide partial answers to the first two research questions:

1. What use are science teachers making of the EV program including SOLO, and why is it used or not used?

2. What formative practices are evident in the work of science teachers, and why are they used or not used?

Findings in relation to the why or why not components of the questions are provided in Chapter Five.

Section 4.1 reports the size of the groups comprising the sample of schools invited to participate in the research (from phase one of the research design). Also discussed here is the impact of using the regression residual (which is an imputed measure of the scientific literacy attained relative to a predictor) to rank schools instead of EV results. It is relevant to the transformative intent of doing this research as will be discussed further in Chapters Five and Six.

Section 4.2 provides the results and findings from analysis of the survey returns. They are reported in four sets relating to the sections in the survey.

Section 4.3 reports some additional findings that will be referred to in subsequent chapters.

Section 4.4 provides a summary of key findings grouped under the two research questions they provide answers to.

Section 4.5 provides a summary of findings in relation to the second research question.


4.1 Introduction

Phase one in the research design delivered the sample of schools to work with. The regression analysis of EV results over the chosen predictor produced residuals for 394 schools. The schools were then ordered according to their residuals (biggest positive residual at the top). The size of the residual was deemed for the purposes of this thesis (see subsection 3.3.2) to be a measure of the scientific literacy component of EV test results and a measure of the science teaching associated with it.

As shown in Table 4.1, the approximately 20% of schools with the biggest positive residuals were labelled as schools having EV results that were well above expectation (WAE); approximately 20% of schools with the largest negative residuals were labelled as having EV results well below expectation (WBE). A middle group of schools (approximately 20%) straddling the line of best fit (zero residual) were labelled as having results at expectation (AE). The remaining schools were labelled as ‘not defined’. Expectation was defined in terms of the difference between the actual EV result and the NAPLAN-based predictor.

Table 4.1 Defining populations from which to invite research participants

Standardised residuals | Residual rank | Quintile group | Group label | Number of schools
2.68 to 0.56 | 1 to 85 | TOP | Well above expectation (WAE) | 85
0.55 to 0.16 | 86 to 166 | - | Not defined | 81
0.15 to -0.20 | 167 to 254 | MIDDLE | As expected (AE) | 88
-0.21 to -0.56 | 255 to 309 | - | Not defined | 55
-0.57 to -2.50 | 310 to 394 | BOTTOM | Well below expectation (WBE) | 85

Note. A positive residual means that EV results were above expectation. A negative residual means that EV results were below expectation. Expectation is defined as relative to the “line of best fit” for the result pairs used in the regression model.
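The grouping rule in Table 4.1 can be expressed as a short sketch. The rank bands are copied from the table; the function names and the dictionary input format are assumptions made for illustration, and the thesis itself carried out this step in SPSS rather than in code like this.

```python
# Rank bands copied from Table 4.1 (rank 1 = largest positive residual).
BANDS = [
    (1, 85, "WAE"),
    (86, 166, "Not defined"),
    (167, 254, "AE"),
    (255, 309, "Not defined"),
    (310, 394, "WBE"),
]


def label_for_rank(rank):
    """Return the Table 4.1 group label for a residual rank."""
    for low, high, label in BANDS:
        if low <= rank <= high:
            return label
    raise ValueError("rank must be between 1 and 394")


def label_schools(residuals):
    """Rank schools by standardised residual (largest first) and label them.

    `residuals` is a hypothetical mapping of school identifier to residual.
    Ties keep their input order, which is an arbitrary choice here.
    """
    ordered = sorted(residuals, key=residuals.get, reverse=True)
    return {school: label_for_rank(rank) for rank, school in enumerate(ordered, start=1)}
```

Because the labels depend only on rank, the sketch reproduces the group sizes in the table (85, 88 and 85 schools) for any set of 394 residuals.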


As will be demonstrated in Chapter Five (see Table 5.1), the three groups of schools are in effect three separate populations defined by the size of their group mean residuals and the fact that, when measurement errors are taken into account, there is negligible overlap between the distributions of results associated with the WAE and AE groups and with the AE and WBE groups. There is no overlap between the WAE and WBE distributions. This last difference is important because it means that, in terms of statistical convention, findings of statistical significance between the sample means in each group can be generalised to the group population from which that sample was taken.

Also, the WAE and WBE group mean residuals are as far apart as could be managed within the constraints of the methodology used. The intention was to achieve Flyvbjerg’s (2011) pre-condition of maximum difference between the groups on the key variable of interest (scientific literacy).

In the NSW government education system, schools are classified in a number of ways, including by proximity to major population centres (metropolitan, provincial, rural and remote), by gender (coeducational, boys or girls schools), and by student entry criteria (comprehensive, partially selective entry or fully selective entry). When schools are ranked using conventional measures of achievement, such as EV test results, the fully selective entry schools occupy the top 19 positions and provincial schools perform poorly relative to metropolitan schools. Only 9% of provincial schools were in the top 20% of schools based on EV results.

The use of the residual to rank schools (Table 4.1) produced the following findings. Scientific literacy scores, for the years from 2011 to 2014, were better than expected (a residual above zero) in 53% of the 394 schools meeting criteria for inclusion in the study. Whilst it is arguable that the difference is not statistically significant, the consistency of the slight positive bias over four years is interesting, if not real. When this result is looked at by government school category, 67% of all provincial schools, 68% of fully selective entry schools, and 23% of partially selective entry schools achieved better than expected EV results (the residual was the four-year average of school residuals).

According to Thomson et al. (2017), approximately 25% of schools in Australia are classified as provincial (the next category after metropolitan, based on their size and distance from major population centres). Assuming this figure is relevant to NSW, around 115 schools would be in that category of school. When we count up the number of provincial schools in the top 20% of schools ranked according to their residual, 56% of the schools there are provincial schools. Also, 25% of the schools in the bottom 20% of schools were provincial schools.

Thus, on the basis of residual rankings, provincial schools had more than double their expected presence in the top 20% group and were represented as expected in the bottom 20% group. It was argued in Chapter Three that the residual is a direct measure of the effect of science teaching. The justification for looking at EV results above, at and below expectation and their attribution to school type is provided in the next paragraph.

In Section 3.2 reference was made to the transformative intent of the mixed methods design employed in this research project. The researcher will provide the findings to the schools that participated and to the NSW Department of Education that supported it. If the unconventional measure of teaching success (residual value and polarity) is validated, then the schools really needing help to improve student achievement in and engagement with science can be specifically identified and targeted for support.

Leaving the category of school out of consideration in the first phase of the research, principals of schools with WAE, AE and WBE EV results were invited to support their science teachers’ participation in the research. Of the 394 eligible schools, 258 principals received invitations (66% of eligible schools and 55% of all 465 government secondary schools in NSW with Year 8 student enrolments).

Of the 101 surveys returned by science teachers, 35 were from WBE schools and there were 33 each from AE schools and WAE schools. It is not possible to determine the response rate because the number of teachers who received notification about the survey is unknown. In their responses to the online survey, 42 respondents identified themselves and the 36 schools in which they taught. Not all the survey returns were complete and this shows up in the numbers counted for the purpose of statistical analysis.

The survey questions are available as Appendix F and a printout of descriptive statistics of teacher responses is provided as Appendix J.

4.2 Findings from analysis of the science teacher survey returns.

The residual used to create school groups from which to sample contains no information about the characteristics of the teaching experienced by students in the schools that provided responses. Phase two of the research sought to establish the relationship, if any, between the three school groups and the extent to which teachers use EV resources, including SOLO and formative practices, in their work.

The survey undertaken by all responding teachers was identical. However, their returns were collated according to the group their school had been assigned to. A series of ANOVAs were performed to establish the strength (in the statistical sense) of any associations between the school group and aspects of assessment-related work done by teachers in those groups. The survey had four parts and analysis of the set of results from each part is reported separately in subsections 4.2.1 to 4.2.4.

Subsection 4.2.1 describes the extent of teacher engagement with and use of EV resources, their understanding of the EV program, and their involvement with it at and beyond school.

Subsection 4.2.2 describes the extent of science teacher engagement with and understanding of SOLO. These two sets of results and related findings detail the extent and depth of the impact of the EV program, including SOLO, on the assessment-related work of the sampled junior secondary science teachers from 2011 to 2014. The findings in these two subsections are the main inputs for addressing research question one.

Analysis of teacher responses and items in the third section of the survey provided data relevant to characterising teachers’ assessment-related work in terms of the five dimensions of formative practice. The analysis was also aimed at establishing the generality of the finding from the sample to the group population. The findings from that analysis are reported in Subsection 4.2.3 and were used to inform answers to the second and third research questions.

International research discussed in Chapter Two shows that better learning outcomes are strongly associated with teacher use of formative practices (see, for example, CERI, 2005). As explained in Chapter One, it was for this reason that syllabus advice supporting the use of assessment for learning (underpinning formative practices) was included in official syllabus documents in NSW. The findings reported in Subsection 4.2.3 are also the basis for discussion in Chapter Six on the extent to which the findings here make a contribution (through replication) to the growing body of international research on the power of formative practices and on learning how to learn.

The fourth set of findings, reported in Subsection 4.2.4, are about the participating teachers and their schools. Findings in this section provide background information used to inform assessment narratives and conclusions reported in Chapters Five and Six respectively.

Subsection 4.2.5 reports other findings from the survey analysis used to contextualise discussion in Chapters Five and Six.

4.2.1 Set one results: Teacher engagement with EV resources (survey questions 1 to 5)

Question one (Q1) items in the teacher survey addressed the scope of recent (past 12 months) teacher engagement with EV results. Teachers responded yes or no to a total of nine items. Items were grouped into the following categories of actions:

• accessing results (items 1a to 1d)

• discussing results with colleagues (items 1e, 1g & 1h)

• discussing results with students (items 1f & 1i).

Question two (Q2) items sought to find out the extent of teacher engagement with and use of EV-related activities and resources over the past two years. Teachers responded yes or no to a total of 13 items. Categories of actions were:

• accessing EV resources and materials (items 2a, 2b & 2d)

• using EV resources in the classroom (items 2c & 2g)

• using EV questions and other resources in or as models for school assessments (items 2e, 2f & 2h)

• changing faculty programs (item 2i)

• engaging beyond school in EV-related activities (items 2j to 2m).

Analysis of data from Questions one and two

The hypothesis was that teachers in schools where EV results were deemed to be WAE would make greater use of EV resources than their colleagues in schools where results were deemed as WBE. The decision was made to include AE schools in the testing to assess the consistency with which the measures of teacher activity associated with AE schools were lower than in WAE schools, but higher than in WBE schools. On balance, EV results in AE schools should be below those of WAE schools and above those of WBE schools. If this pattern is found, it adds weight to the credibility of the residual as a measure of science teaching effectiveness.

ANOVA proceeds on the assumption of the null hypothesis (that there are no statistically significant differences in the level of EV resource use by teachers in the three groups). Subsection 3.7.2 discussed general considerations relating to the features of data sets and the appropriate choice of tool from the suite of tools available in SPSS. Subsection 3.7.3 particularised that discussion to this project. Consequently, data sets were analysed for normality and homogeneity of variance and appropriate statistical tools chosen to perform ANOVA and related significance testing. Indicative findings from ANOVA were assessed against a significance level (p) of .05. The decision to accept or reject the null hypothesis was made by reference to the conventional standard.
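The decision logic described above can be sketched in a few lines. This is a simplified reading, not the author's SPSS procedure: it assumes the nonparametric Kruskal-Wallis ANOVA is chosen whenever either assumption check fails, and it does not model other robust alternatives (the chapter also reports a Welch test in one case). The function names and the p-values in the usage lines are hypothetical.

```python
ALPHA = 0.05  # significance level stated in the text


def assumptions_met(shapiro_p_values, levene_p, alpha=ALPHA):
    """True when every group passes the normality check and the
    groups pass the homogeneity of variance check (p > alpha)."""
    return all(p > alpha for p in shapiro_p_values) and levene_p > alpha


def choose_test(shapiro_p_values, levene_p, alpha=ALPHA):
    """Pick a test family from the assumption checks (simplified rule)."""
    if assumptions_met(shapiro_p_values, levene_p, alpha):
        return "parametric ANOVA"
    return "Kruskal-Wallis ANOVA"


def reject_null(p_value, alpha=ALPHA):
    """Conventional decision: reject the null hypothesis when p < alpha."""
    return p_value < alpha


# Illustrative p-values only, not taken from the survey data.
print(choose_test([0.30, 0.60, 0.45], levene_p=0.40))
print(choose_test([0.01, 0.60, 0.45], levene_p=0.40))
print(reject_null(0.03))
```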

The descriptive statistics for Q1 & Q2 (combined) are presented in Table 4.2 and the related means plots in Figure 4.1.

Table 4.2 Descriptive statistics for Q1 & Q2 (n = 85)

Result group | x̅ | s | σx̅ | n
Q1 & Q2 (out of 22)
WBE | 7.63 | 4.85 | .86 | 32
AE | 11.82 | 3.98 | .75 | 28
WAE | 11.48 | 4.55 | .91 | 25
Total | 10.14 | 4.86 | .53 | 85

Figure 4.1 Means plots for Q1 & Q2 combined

The Q1 & Q2 (combined) data sets (n = 85) passed both the normality and homogeneity of variance tests (p > .05). The Shapiro-Wilk statistic (W) for the WBE group was W = .965, p = .38; for the AE group W = .982, p = .90; and for the WAE group W = .964, p = .49. The Levene variance statistic was F(2, 82) = .821, p = .44.

The parametric ANOVA statistic for Q1 & Q2 combined (F(2, 82) = 8.093, p = .001) supported the rejection of the null hypothesis (p < .05). This means that there was a statistically significant difference between at least two of the group means.
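As a rough cross-check, a one-way ANOVA F statistic can be approximated from the summary statistics in Table 4.2 alone. The sketch below is not the SPSS computation used in the thesis; it works from the rounded means and standard deviations, so its result (close to 8.1) only approximates the reported value of 8.093. The function names are hypothetical.

```python
from math import sqrt

# (mean, standard deviation, n) per group, as reported in Table 4.2.
GROUPS = {
    "WBE": (7.63, 4.85, 32),
    "AE": (11.82, 3.98, 28),
    "WAE": (11.48, 4.55, 25),
}


def standard_error(sd, n):
    """Standard error of the mean: s divided by the square root of n."""
    return sd / sqrt(n)


def anova_from_summary(groups):
    """One-way ANOVA F statistic from group means, SDs and sizes."""
    total_n = sum(n for _, _, n in groups.values())
    grand_mean = sum(mean * n for mean, _, n in groups.values()) / total_n
    # Between-groups sum of squares: weighted spread of the group means.
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups.values())
    # Within-groups sum of squares: pooled from the group variances.
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups.values())
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


f_stat, df1, df2 = anova_from_summary(GROUPS)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

The same helper also reproduces the standard errors in the σx̅ column of the table (for example, 4.85 divided by the square root of 32).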

The Games-Howell multiple comparisons analysis indicated that the x̅WAE – x̅WBE (difference = 3.86, p = .009) and x̅AE – x̅WBE (difference = 4.20, p = .001) differences were statistically significant, but that the x̅AE – x̅WAE difference (difference = .34, p = .994) was not.

Based on the data analysis for Q1 & Q2 combined, it can be reasonably concluded that, as a group, teachers at schools where results were deemed to be WBE make less use overall of EV results and resources to support their assessment-related work than do their colleagues at schools where results are deemed to be AE or WAE.

A supplementary analysis was then performed on the combined data but this time disaggregated against the eight categories identified above to differentiate particular similarities and differences between group practices.

All eight category-separated data sets failed the Shapiro-Wilk test for normality (p < .05) and all but one (category F) failed the Levene test as well. In the light of that failure, the nonparametric Kruskal-Wallis ANOVA was applied. It demonstrated statistically significant differences between four of the eight category means, as shown in Table 4.3.


Table 4.3 Results of nonparametric ANOVA for eight EV categories

Figure 4.2 provides a visual representation of the four category means that were statistically significantly different. Read vertical bars L to R (matched with EVA to EVG down the RHS labels).


Figure 4.2 EV category means shown to be statistically significantly different

Taking into account the 95% confidence intervals for the means, visual inspection shows that WBE means spreads for categories EVA and EVD (first and second plots from the left) appear to be below the WAE means spreads for the same categories. WBE means spreads for categories EVF and EVG (third and fourth plots) appear to be lower than the AE means spreads for those categories. The AE and WAE means spreads for all four categories appear to overlap each other.

The post hoc Games-Howell multiple comparisons tests confirmed that the statistically significant differences in means (p < .05) were, as observed, between the WBE and WAE means for categories EVA and EVD (difference = 1.1, p = .024 and difference = .73, p = .006 respectively) and between the WBE and AE means for categories EVF and EVG (difference = .76, p = .039 and difference = .53, p = .000 respectively). The Games-Howell means comparison process showed that for the eight data sets, the AE and WAE means were not statistically significantly different.

Based on the above, it would be reasonable to conclude the following about teachers’ use of resources in the light of the eight categories.

The second (EVB), third (EVC), fifth (EVE) and eighth (EVH) category responses were not statistically significantly different between the three groups. Thus figures discussed for these four categories of actions are based on the combined total of teachers responding from each of the three groups (n = 85).

In relation to EVB, which was about discussing results with colleagues, 66% had discussed the test item and task analysis, 49% had discussed the results of the student survey, and 33% had discussed the student profile information.

EVC was about discussion with students: 22% had discussed the item or task analysis with students and 18% had discussed the results of the student survey.

EVE was about using EV resources in the classroom: 45% had used the teaching strategies provided in the SMART package and 68% had used items and tasks from EV tests in their school assessments.

EVH was about engagement beyond school. Two teachers from the AE group had written items for the EV test; two teachers each from the AE and WAE groups had evaluated items for the test; 39% had marked extended response tasks; and 30% had attended workshops about the EV program (different to training for marking).

The following findings can reasonably be made for the four categories where statistically significant differences between teacher use of EV resources were demonstrated.

The first category (EVA) asked teachers to say whether they had, in the previous twelve months, looked at EV results for the student survey (for their class), the analysis of answers to the extended response tasks, and individual student profile results. Teachers in WBE schools had not accessed (viewed) this information as much as their colleagues in WAE schools.

The fourth category (EVD) asked teachers whether they had in the previous two years accessed EV-related materials in TaLE (the Department’s internal teacher support website), the feedback on EV results provided through SMART together with its advice about teaching strategies to address science misconceptions, and the separately produced marking manuals for extended response tasks. Again, teachers in WBE schools had not accessed these resources as much as their colleagues in WAE schools.

The sixth category (EVF) asked whether teachers in the previous two years had used EV test items and tasks in their own tests or as models to work with. Teachers in WBE schools had done so less than their colleagues in AE schools.

The seventh category (EVG) asked whether schools had used EV results to inform changes to faculty (teaching and learning) programs in the previous two years. Teachers in WBE schools made less use of EV results in that process than had teachers in AE schools.

Survey question three (Q3 or EV3) asked teachers to self-report their level of understanding of the EV program.

The descriptive statistics for the combined data and related plots are shown in Table 4.4 and Figure 4.3.

Table 4.4 Descriptive statistics for Q3 (n = 85)

Result group | x̅ | s | σx̅ | n
Q3 (out of 5)
WBE | 2.97 | 1.15 | .20 | 32
AE | 4.04 | .79 | .15 | 28
WAE | 3.84 | .90 | .18 | 25
Total | 3.58 | 1.07 | .12 | 85


Figure 4.3 Teacher self-rating for their understanding of the EV program (n = 85)

Responses to Q3 were analysed to discover whether teacher-rated understanding of the EV program was different between the three groups of schools.

The three data sets for Q3 failed the normality tests (p < .05) but did pass the homogeneity of variance tests (p > .05).

Given the failure on the normality test, it was decided to apply the Welch robust test of equality of means (Welch’s F(2, 53.19) = 9.162, p = .000). As the p value was < .05, the results were taken as showing a real difference between at least two of the group means.

The Games-Howell multiple comparisons analysis attributed the differences to x̅WAE – x̅WBE (difference = .871, p = .006) and x̅AE – x̅WBE (difference = 1.067, p = .000), which were statistically significant (p < .05). The x̅AE – x̅WAE difference (difference = .196, p = .682) was not statistically significant.


Based on the data analysis for Q3, it can be reasonably concluded that teachers in schools with results deemed to be WBE had a lower self-rated understanding of the EV program than their colleagues in schools where results were deemed to be AE and WAE.

Q4 asked teachers to write what they thought was the most important purpose for the EV test. Table 4.5 shows their collated and categorised responses.

Table 4.5 Summary of EV purposes

Response numbers per group Category of response

WBE n = 32

AE n = 28

WAE n = 25

For students Opportunity to demonstrate their learning Provide students with feedback to improve their learning Opportunity to improve test taking skills Provide challenge for higher achievers For teachers Opportunity for professional learning about assessment Provide feedback on student performance / achievement relative to others Provide feedback on student performance / achievement relative to standards Provide feedback on student learning Provide feedback on learning progress Provide feedback on teaching Provide feedback on teaching programs Other responses No idea of EV purpose An unwelcome imposition Jobs for head office workers No response or left blank

3 0 0 0 0 10 1 12 2 3 5 2 1 2 6

0 3 1 0 1 7 1 1 3 7 7 5 0 0 0 3

0 0 2 1 0 6 2 4 1 8 3 0 0 0 4

Note. Some respondents mentioned more than one purpose thus the group sample numbers (n) do not match the comment totals.

Examples of typical responses include:

Understand how well our students perform relative to the rest of the state. (WBE teacher)

The tracking of students as they progress through high school. (WBE teacher)

Understand your students and amend teaching and learning strategies for students. (WBE teacher)

Provide feedback to teachers on the effectiveness of their teaching the stage 4 Science syllabus. (AE teacher)

To get a snapshot of how Stage 4 students have progressed specifically in Science since primary school. The extended responses are particularly useful in identifying the students’ ability or lack of ability in communicating and/or understanding scientific concepts in different scenarios. It is also very useful to identify misconceptions – so influences our teaching approaches. (AE teacher)

Provide feedback to students on their knowledge and understanding of scientific concepts and their scientific literacy. Provide information to teachers on areas that need improvement. (AE teacher)

Record of student growth, strengths and weaknesses of programs / areas of teaching. (WAE teacher)

To assess students’ scientific literacy comparative to their peers in the state. (WAE teacher)

Identify areas where we need to improve our teaching of particular concepts or skills. (WAE teacher).

To summarise, all three groups of teachers most frequently identified the purpose of the EV program as being to provide:

• feedback to teachers about student learning / learning progress

• comparative information about achievement / performance relative to other schools

• feedback about teaching

• feedback on teaching and learning programs.


Survey question five (Q5) asked teachers whether their school was taking up the invitation to participate in VALID 10, which is an acronym for Validation of Assessment for Learning and Individual Development. VALID had been introduced on a voluntary basis for Year 10 students for the first time in 2015. It is a new test designed to provide data about achievement in science at the end of Year 10. It is the Year 10 equivalent of the Year 8 EV test.

Intended participation in VALID 10 in 2016 was lower for WBE schools (n = 3) than either AE (n = 6) or WAE (n = 6) schools. The numbers are based on a count from identified schools in each group to avoid double-counting the same school.

4.2.2 Set two results: SOLO and extent of teacher engagement with it (survey questions 6 to 8)

SOLO is the theoretical model that informs feedback to schools about the level of science thinking exhibited by students as revealed in their selected responses to items and written responses to the extended response tasks (see Chapter Two for a full explanation).

Survey questions six (Q6a-j) and seven (Q7) were about teacher engagement with and use of SOLO at school and their understanding of SOLO respectively. Q6a-j asked teachers to respond yes or no to 10 items describing actions taken over the previous two years. The Q6a-j means for teachers at the schools in each school group at the time of interest are provided in Table 4.6 and Figure 4.4.

Table 4.6 Descriptive statistics for Q6 (n = 85)

Result group | x̅ | s | σx̅ | n
Q6 (out of 10)
WBE | 2.00 | 1.87 | .330 | 32
AE | 2.21 | 2.41 | .455 | 28
WAE | 2.96 | 3.18 | .636 | 25
Total | 2.35 | 2.49 | .270 | 85


Figure 4.4 Means plots for Q6

Looking at the Q6a-j means (n = 85), the first observation is that all three group means are low. The second is that when confidence intervals are taken into account, the plotted means all overlap and do not appear to be statistically significantly different from each other.

To test that observation formally, the data sets were first checked for normality (Shapiro-Wilk test) and homogeneity of variance (Levene test). Neither test was satisfied (p < .05 for both, and thus below the accepted p value of .05), so the nonparametric Kruskal-Wallis ANOVA was used.

The nonparametric ANOVA (Table 4.7) includes results for both Q6 and Q7. Row 1 in that table supports the above finding that the means of the three groups are not statistically significantly different for Q6.


Table 4.7 Nonparametric ANOVA (n = 85) for SOLO questions (Q6 & 7)

Given that the means differences between the samples from the three school groups were not statistically significantly different, total sample responses are provided for Q6 (Table 4.8 and Figure 4.5). The frequencies recorded are for the totals of YES responses to the items in Q6. No teacher scored 9 or 10 out of 10.

Table 4.8 Q6 SOLO category counts (n = 85)

Total | Frequency | Percent | Cumulative Percent
0 | 31 | 36.5 | 36.5
1 | 7 | 8.2 | 44.7
2 | 13 | 15.3 | 60.0
3 | 10 | 11.8 | 71.8
4 | 8 | 9.4 | 81.2
5 | 5 | 5.9 | 87.1
6 | 3 | 3.5 | 90.6
7 | 2 | 2.4 | 92.9
8 | 6 | 7.1 | 100.0
All | 85 | 100.0 |
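The percent and cumulative percent columns of Table 4.8 follow directly from the frequency counts, as the sketch below shows. The frequencies are copied from the table; the function names are hypothetical, and the thesis produced these figures in SPSS.

```python
# Number of teachers giving each total of YES responses (0 to 8), from Table 4.8.
FREQUENCIES = {0: 31, 1: 7, 2: 13, 3: 10, 4: 8, 5: 5, 6: 3, 7: 2, 8: 6}


def frequency_table(freqs):
    """Return rows of (score, count, percent, cumulative percent)."""
    total = sum(freqs.values())
    rows = []
    running = 0
    for score in sorted(freqs):
        count = freqs[score]
        running += count
        rows.append((
            score,
            count,
            round(100 * count / total, 1),
            round(100 * running / total, 1),
        ))
    return rows


def mean_score(freqs):
    """Mean number of YES responses per teacher."""
    total = sum(freqs.values())
    return sum(score * count for score, count in freqs.items()) / total


for row in frequency_table(FREQUENCIES):
    print(row)
print(round(mean_score(FREQUENCIES), 2))
```

The mean recovered from the counts agrees with the total-sample mean for Q6 reported in Table 4.6.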


Figure 4.5 Frequency V level of engagement (zero to ten)

In light of the above analysis, with more than a third of the sample saying no to all items and half of the sample responding yes to between one and five of the ten questions, it is reasonable to conclude that in all probability, most teachers across the state have not engaged with SOLO to an extent where it greatly informs their assessment-related work.

Q7 asked for a self-rating by teachers of their understanding of SOLO (1 = very poor to 5 = very good). The means for all three groups are shown in Table 4.9. The level of self-reported understanding ranged from above poor to below acceptable and the means for all three groups responding to Q7 are not statistically significantly different to each other as shown in Figure 4.6 and confirmed above (see row 2, Table 4.7).


Table 4.9 Descriptive statistics for Q7 (n = 84)

Result group | x̅ | s | σx̅ | n
Q7 (out of 5)
WBE | 2.47 | 1.11 | .196 | 32
AE | 2.61 | 1.10 | .208 | 28
WAE | 2.58 | 1.47 | .300 | 24
Total | 2.55 | 1.21 | .132 | 84

Figure 4.6 Means plots for Q7 self-reported understanding of SOLO

According to Figure 4.7, 45% of the teachers responding to the survey rated their understanding of SOLO as poor or very poor.


Figure 4.7 S7 Frequency (n = 85) versus level of understanding

The results from the analysis of Q6 and Q7 support the following findings that apply to science teachers in the three school groups sampled (n = 85):

• around 40% of respondents had “accessed material about SOLO” (survey wording)

• fewer than 30% had explained SOLO to anyone or used SOLO in the classroom

• 46% of teachers said they had a very poor or poor understanding of SOLO

• fewer than 10% reported that their school had used SOLO concepts or the SOLO model to inform faculty assessment policies or to provide feedback on student achievement to parents

• the overall level of self-reported understanding of SOLO ranged from poor to acceptable at best.

Survey question eight (Q8) asked respondents where they learnt most about SOLO. Table 4.10 summarises collated responses from the three samples (WAE, AE and WBE).


Table 4.10 Q8 summary of sources for learning about SOLO

Category of response | WBE n = 32 | AE n = 28 | WAE n = 25
No response / left blank | 9 | 11 | 8
Training for ESSA / VALID marking | 2 | 3 | 4
Actually marking ESSA / VALID | 10 | 4 | 7
Applying it to school assessment | 2 | - | 2
ESSA / VALID workshop | 2 | 3 | 3
Using it in class | 2 | 1 | -
Researched it | - | 7 | 1
Explaining it to others | 1 | 2 | -
Talking to colleagues | - | 4 | -
Nothing helped | 2 | - | -
Never heard of SOLO / what is it? | 2 | - | 4

Note. Total responses do not match total sample (n = 85) because some mentioned more than one source. Highlighted responses indicate the sources most commonly identified.

A range of responses from the three groups included:

Marking extended response questions for VALID 10 this year. (WBE teacher)

WTF is SOLO? I've never heard of this. I don't think I spend a huge amount of time under a rock, with my fingers in my ears, crouched in the foetal position whilst humming nursery rhymes, but I have not heard of this term. (WBE teacher)

I attended a workshop run by the ESSA people. (WBE teacher)

I read about it online to determine what it was. I don't remember it specifically from any training. (AE teacher)

Explaining it to other staff. (AE teacher)

Participated in a marking course for what ESSA is and how SOLO marking schemes work. (AE teacher)

Marking ESSA. (WAE teacher)

I attended the recent Meet the Markers seminar on SOLO and VALID. Our faculty then decided to implement specific SOLO based questions and marking schemes in our half yearly examination for all junior years. This all gave a clear perspective and good practice in the use of SOLO. The outcomes and marking schemes from these examinations have not yet been communicated to students or parents. (WAE teacher)

Attended STANSW MTM (Meet the Markers) on ESSA (some years ago and regularly every few years since) and more recently by investigating the work of Pam Hook and others. (WAE teacher)

Six respondents reported that they had never heard of SOLO or they wanted to know what SOLO was. Marking or training for marking and workshops were the most frequently mentioned sources for learning about SOLO.

By way of explanation, training for the Year 8 test was provided in workshops by a skilled trainer with understanding of SOLO; training for marking the Year 10 tests at the school level involved accessing online materials and may or may not have been done collaboratively with colleagues.

4.2.3 Set three results: Formative practices (Questions 9 to 15)

Questions nine to 15 (Q9 to Q15) sought to capture the extent of use by teachers of assessment for learning strategies / formative practices beyond those associated with the EV program. The EV program is about using assessment data for diagnostic purposes as discussed in earlier chapters. All items in the assessment for learning (AFL) / formative practices section of the survey are available in the survey itself, which is provided as Appendix F.


Q9 to Q15 included 47 separate items. Each item invited one of five responses from teachers: Not known or Unsure about (NKUA) / Never / Seldom / Sometimes / Often.

The analysis for the NKUA option across Q9 to Q15 for the three groups shown is presented in Table 4.11 (descriptive statistics) and their graphical representation in Figure 4.8.

Table 4.11 Means for NKUA option (n = 85)

School group | n | x̅ | s | σx̅
WBE | 32 | .94 | 1.46 | .26
AE | 28 | .79 | 2.06 | .39
WAE | 25 | .72 | 1.02 | .20
Total | 85 | .82 | 1.57 | .17

Figure 4.8 NKUA graphical representation of means

The means from each group overlapped when the confidence intervals were taken into account and were thus not statistically significantly different. On that basis the null hypothesis (comparable understanding of the items by teachers in the three groups) was retained.

The next set of data, presented in Table 4.12 and Figure 4.9, summarises the data from all respondents (n = 84) for all items in Q9 to Q15. Calculations were based on assigning values to teacher decisions on the following basis: NKUA = 1; never = 2; seldom = 3; sometimes = 4 and often = 5.
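The coding scheme above can be illustrated with a minimal sketch. The value assignments are those stated in the text; the function name and the sample responses are hypothetical, and the thesis performed this scoring in SPSS.

```python
# Values assigned to each response option, as stated in the text.
SCORES = {"NKUA": 1, "Never": 2, "Seldom": 3, "Sometimes": 4, "Often": 5}


def mean_item_score(responses):
    """Mean coded score (out of 5) for one teacher's item responses."""
    if not responses:
        raise ValueError("at least one response is required")
    return sum(SCORES[response] for response in responses) / len(responses)


# Hypothetical responses from one teacher to four items.
example = ["Often", "Sometimes", "Sometimes", "Seldom"]
print(mean_item_score(example))
```

One consequence of this coding is that an NKUA response lowers a teacher's mean score in the same way as a low-frequency response does.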

Table 4.12 Descriptive statistics for Q9 to Q15 (n = 84)

Result group | x̅ | s | σx̅ | n
Q9 to Q15 (out of 5)
WBE | 3.86 | .32 | .06 | 32
AE | 4.07 | .41 | .08 | 28
WAE | 4.10 | .37 | .08 | 24
Total | 4.00 | .38 | .04 | 84

Figure 4.9 Means plots for Q9 – Q15


Visual inspection, taking into account the confidence interval for each group mean, strongly suggests that the AE and WAE means were not statistically significantly different. Also, the confidence interval for the WBE group mean overlaps somewhat with the AE and WAE intervals.

To test whether some or all of the means were statistically significantly different (or not), the following tests were conducted on the all items data (Qs 9-15). Tests for data normality (Shapiro-Wilk) and homogeneity of variance (Levene) are provided in Table 4.13.

Table 4.13 Tests for normality and homogeneity of variance for all items Qs 9-15 (n = 84)

Shapiro-Wilk test
  WBE    W(32) = .956, p = .217
  AE     W(28) = .978, p = .793
  WAE    W(24) = .943, p = .191
Levene test
         F(2, 81) = .356, p = .702

The results satisfied the thresholds for data normality and homogeneity of variance (p > .05) in all three groups.

Despite there being unequal numbers in the three samples, the parametric ANOVA statistic (F(2, 81) = 3.849, p = .025) and the non-parametric Kruskal-Wallis ANOVA statistic (χ²(2) = 6.695, p = .035) both returned a significance figure < .05 (for the Kruskal-Wallis result, see row one in Table 4.17).
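As a rough plausibility check, a one-way ANOVA F statistic can be recomputed from the rounded group statistics in Table 4.12. This is a sketch under the assumption that the standard textbook formula applies; the function name is hypothetical, and because the inputs are rounded to two decimals the result is only expected to land near the reported value, not to reproduce it.

```python
def anova_from_summary(groups):
    """One-way ANOVA from a list of (n, mean, sd) tuples.

    Returns (F, df_between, df_within).
    """
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    # Between-groups sum of squares: group sizes times squared deviations.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-groups sum of squares recovered from each group's variance.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n, mean, sd) for WBE, AE and WAE, copied from Table 4.12.
table_4_12 = [(32, 3.86, 0.32), (28, 4.07, 0.41), (24, 4.10, 0.37)]
f_stat, df1, df2 = anova_from_summary(table_4_12)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

The degrees of freedom match the reported F(2, 81), and the recomputed statistic falls close to the reported 3.849; the small gap is consistent with rounding in the tabled means and standard deviations.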

The Games-Howell multiple comparisons analysis indicated that the x̅WAE – x̅WBE difference (difference = .24, p = .033) was statistically significant (p < .05), but the x̅AE – x̅WBE difference (difference = .21, p = .081) and x̅AE – x̅WAE difference (difference = -.03, p = .950) were not (p > .05).

An on-balance decision was made to reject the null hypothesis based on the results of the above three tests.

A reasonable conclusion was that, in all probability, teachers in schools where results were deemed WBE were less frequent users of formative practices than were their colleagues at schools where results were deemed to be WAE.

As explained in Chapter Two, formative practices were categorized into five dimensions. Survey item returns were subsequently grouped to provide data relating to each of the five dimensions and then disaggregated to identify whether the activity was teacher focused or student focused.

Figure 4.10 represents that organization. It provides a summary descriptor for each of the five dimensions and a unique acronym in parentheses after it; teacher focused and student focused items related to each dimension are identified and grouped below the descriptor.

1. Clarifying and sharing learning intentions and success criteria (LISC):
   Teacher focus: 9a, 9c & 9e
   Student focus: 9b, 9d & 9f

2. Engineering effective classroom discourse and using learning tasks that elicit evidence of student learning (CDEL):
   Teacher focus: 10a, 10b, 10c & 10f, 10g & 10h
   Student focus: 10d

3. Providing feedback that moves learners forward (FTAL):
   Teacher focus: 9h, 11a – e, 12a – g, 14b & 14e
   Student focus: 14a

4. Activating students as instructional resources for one another (and the teacher) including peer assessment (ASIR):
   Teacher focus: 15a, 15b & 15c
   Student focus: 9g, 10e, 13a, 13b & 13c

5. Activating students (and teachers) as the owners of their own learning including self-assessment (ASTL):
   Teacher focus: 14c, 14d, 14f, 14g, 14h & 15d, 15e
   Student focus: 13d, 13e & 13f

Figure 4.10 Survey questions sorted to show teacher or student as the lead actor


Examples from the survey to illustrate the distinction between teacher and student focus are provided in Table 4.14.

Note that strategies further down the list are about helping students to exercise greater control over their learning (Mitchell et al., 2009). This is relevant to the discussion in Chapter Five about the degree to which self-regulation was evident.

Table 4.14 Sample items from the online survey with a teacher or student focus

Teacher focus
  Q9c explain to students the indicators or success criteria I will be looking for in their work
  Q10h I explain my responses / thinking
  Q10f I use test or assignment items and tasks as stimulus for discussion (in class)
  Q11e (provide feedback) advice about how to improve
  Q12c (feedback) refers to misconceptions
  Q14c I evaluate lessons and record ideas for change next time
  Q14f, g & h access and use information in class…about assessment for learning
  Q15a collaborate with my science teacher colleagues to develop a shared understanding of what progression in science learning looks like

Student focus
  Q9d allow students some input in deciding what success criteria are to be applied
  Q9f ask students why they think they are being asked to do the proposed activities
  Q9g encourage peer feedback based on success criteria
  Q10d ask students to explain their thinking
  Q10e use the “think-pair-share-report” strategy
  Q13d (students) self-assess by redoing work to a higher standard
  Q13e (student self-) selection of items for a portfolio
  Q13f self-assess by getting students to keep a journal of their reflections in their own words (on what they have learned in science lessons)
  Q14a students give feedback on my teaching


The descriptive statistics for the separate teacher focused and student focused subsets of items for Qs 9-15 are provided in Table 4.15. The graphical representations of the teacher focused and student focused means are provided in Figure 4.11 (second and third vertical bars in each group).

Table 4.15 Descriptive statistics TAFL and SFAL (n = 84)

School group    n     x̅       s      σx̅

AFL for teachers
WBE             32    3.84    .35    .06
AE              28    4.04    .33    .06
WAE             24    4.05    .33    .07
Total           84    3.97    .35    .04

AFL for students
WBE             32    3.40    .43    .08
AE              28    3.56    .64    .12
WAE             24    3.59    .54    .11
Total           84    3.51    .54    .06

Figure 4.11 Formative practice means for all items, teacher items and student items (n = 84)


Based on the above table and means plots, it would appear that the sample mean for teacher focused items in schools designated as WBE was lower than the sample means for their colleagues in both AE and WAE schools, but that the differences were only borderline statistically significant. The means for student focused items did not appear to be statistically significantly different when the confidence level spreads were taken into account.

Normality and homogeneity of variance tests were performed on the data subsets related to teacher focused and student focused items within Qs 9-15. Table 4.16 presents the results of that analysis.

Table 4.16 Tests for normality and homogeneity of variance on assessment for learning (AFL) responses data sets (n = 84)

Shapiro-Wilk tests
  AFL for teachers    WBE    W(32) = .987, p = .963
                      AE     W(28) = .914, p = .025*
                      WAE    W(24) = .961, p = .453
  AFL for students    WBE    W(32) = .917, p = .017*
                      AE     W(28) = .960, p = .353
                      WAE    W(24) = .958, p = .399
Levene test
  AFL (teachers)      F(2, 81) = .421, p = .658
  AFL (students)      F(2, 81) = 1.796, p = .173

* sample failed the Shapiro-Wilk normality test (p < .05)

The test results (Table 4.16) did not support the use of parametric tests for comparing means: sample numbers were small and unequal in all three groups and, in both the teacher focused and student focused data sets, one group failed the test for normality.


Based on that assessment, the nonparametric Kruskal-Wallis ANOVA test was applied to the data sets. The results are provided in Table 4.17.

Table 4.17 Nonparametric ANOVA on AFL ALL, AFL teacher and AFL student means (n = 84)

Statistically significant group means differences (p < .05) were found for the all items data (χ²(2) = 6.695, p = .035) and the means for the teacher focused items (χ²(2) = 6.704, p = .035). There were no statistically significant differences (p > .05) between the means for student focused items (χ²(2) = 2.529, p = .282).

Welch robust tests of means equality produced statistically significant results (p < .05) for the all item (Qs 9-15) data (Welch F(2, 50.737) = 4.236, p = .020) and the teacher focused data (Welch F(2, 52.620) = 3.365, p = .042) but not for the student focus data (Welch F(2, 49.209) = 1.283, p = .286), where p > .05.
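The Welch statistic for the all items data can be approximated from the rounded summary statistics in Table 4.12. The sketch below assumes the standard Welch (1951) formula for a robust test of equal means; the function name is hypothetical, and the rounded inputs mean the output is expected to be near, not identical to, the reported values.

```python
def welch_anova_from_summary(groups):
    """Welch's robust test of equal means from (n, mean, sd) tuples.

    Returns (F, df1, df2), where df2 is generally not an integer.
    """
    k = len(groups)
    # Each group is weighted by n / s^2, the inverse variance of its mean.
    weights = [n / sd ** 2 for n, _, sd in groups]
    w_total = sum(weights)
    weighted_mean = sum(w * m for w, (_, m, _) in zip(weights, groups)) / w_total
    numerator = sum(
        w * (m - weighted_mean) ** 2 for w, (_, m, _) in zip(weights, groups)
    ) / (k - 1)
    lam = sum(
        (1 - w / w_total) ** 2 / (n - 1) for w, (n, _, _) in zip(weights, groups)
    )
    denominator = 1 + 2 * (k - 2) / (k ** 2 - 1) * lam
    df2 = (k ** 2 - 1) / (3 * lam)
    return numerator / denominator, k - 1, df2


# (n, mean, sd) for WBE, AE and WAE, copied from Table 4.12.
table_4_12 = [(32, 3.86, 0.32), (28, 4.07, 0.41), (24, 4.10, 0.37)]
f_stat, df1, df2 = welch_anova_from_summary(table_4_12)
print(f"Welch F({df1}, {df2:.1f}) = {f_stat:.2f}")
```

With these inputs the denominator degrees of freedom come out close to the reported 50.737 and the statistic close to the reported 4.236, which suggests the tabled descriptives and the Welch result are mutually consistent.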

The Games-Howell multiple comparisons analysis returned a statistically significant difference between the all items (Q9 to Q15) mean for the WBE group of schools (x̅WBE = 3.86) and the all items mean for the WAE group of schools (x̅WAE = 4.10). The means difference was 0.24, p = .033, which is less than the .05 threshold for statistical significance. No statistically significant differences were shown for the teacher focused items or student focused item means.


On the basis of the parametric and nonparametric ANOVA on the all AFL item data set and subsequent post hoc analysis (Welch test and Games-Howell multiple comparisons tests), it is reasonable to reject the null hypothesis and to conclude that teachers in schools where results are deemed WBE make less frequent use of teacher focused formative practices than their colleagues in schools where results were deemed WAE, but that no distinction between the groups could be made on the basis of differences in student focus.

Given that there was a statistically significantly more frequent use of teacher focused formative practices by WAE teachers than their WBE colleagues, the next step was to test for statistically significant differences between the means for items related to each of the five dimensions of formative practice in each school group.

In order to determine which of the dimensions might present group means that were statistically significantly different, Welch tests for means equality were performed on the five subsets of sample data for each of the dimensions. The results of those tests are presented in Table 4.18.

Table 4.18 Welch statistics for robust equality of means

Dimension    Welch F (df1, df2)    Statistic    Significance
LISC All     F(2, 51.823)          .460         .634
CDEL All     F(2, 52.205)          3.684        .032*
FTAL All     F(2, 50.494)          4.522        .016*
ASIR All     F(2, 51.2.6)          1.714        .190
ASTL All     F(2, 50.650)          3.475        .039*

* Shading in the original (marked here with an asterisk) indicates dimensions where statistically significant means differences were found

The results of the tests on the first and fourth dimensions of formative practice revealed no statistically significant differences between the group means. Statistically significant differences were found between group means for the second, third and fifth dimensions. For those dimensions the null hypothesis was rejected and the attribution of those differences is reported below.


4.2.3.1 LEARNING INTENTIONS AND SUCCESS CRITERIA (LISC)

This dimension of formative practice is about learning intentions and success criteria being made explicit by teachers for (or by) students. The items were about who determined what was to be taught and learned and why and how it would be assessed. Means data and plots are provided in Table 4.19 and Figure 4.12 respectively.

Given that there were no statistically significant differences between the group means, only the descriptive statistics for this dimension will be provided here.

Table 4.19 LISC combined means

School group    n     x̅       s      σx̅

Mean for LISC
WBE             32    4.10    .46    .08
AE              28    4.20    .42    .08
WAE             24    4.12    .48    .10
Total           84    4.14    .45    .05

Figure 4.12 LISC means plots


The means spreads for the three group samples show that the means for teacher led activity and for student opportunities to set learning intentions and choose (or formulate) success criteria do not overlap and are thus statistically significantly different.

Findings from the LISC subsection

From the above it is reasonable to conclude that teachers in all three school groups more often take the lead when it comes to establishing learning intentions and success criteria. They do this at self-reported frequencies between sometimes and often. Teachers report that they involve students between seldom and sometimes in negotiating learning intentions or success criteria.

4.2.3.2 CLASSROOM DISCOURSE THAT PRODUCES EVIDENCE OF LEARNING (CDEL)

This dimension of formative practice is about classroom discourse eliciting evidence of learning for both the teacher and students. The items associated with this dimension were about questioning and discussion in class and the use of assignments and assessment items as the stimulus for that discussion.

The Welch statistic reported above in Table 4.18 for this second dimension shows there were statistically significant differences between one or more pairs of sample means. The means and mean plots for the teacher and student focused combined and separated data for items related to CDEL are shown in Table 4.20 and Figure 4.13 respectively. These data were examined to see whether the teacher focused (TCDEL) or student focused (SCDEL) data means or both were statistically significantly different.

An examination of the means spreads in Figure 4.13 suggests that the means for both teacher focused and student focused data relating to CDEL in at least the WBE and WAE schools may be statistically significantly different. Subsequent testing for normality and homogeneity of variance in the data is reported in Table 4.21 (note that the student focused data is based on only one item, 10d; that item was about the frequency of opportunity given to students to explain their thinking).


Table 4.20 CDEL combined, TCDEL & SCDEL means

School group    n     x̅       s      σx̅

Mean for CDEL combined
WBE             31    4.00    .41    .07
AE              28    4.16    .32    .06
WAE             24    4.28    .35    .07
Total           83    4.14    .38    .04

Mean for TCDEL
WBE             31    3.91    .40    .07
AE              28    4.09    .31    .06
WAE             24    4.20    .37    .08
Total           83    4.06    .38    .04

Mean for SCDEL
WBE             31    4.52    .68    .12
AE              28    4.57    .63    .12
WAE             24    4.75    .44    .09
Total           83    4.60    .60    .07

Figure 4.13 CDEL combined, TCDEL, SCDEL means


Table 4.21 Tests for normality and homogeneity of variance on CDEL data sets (n = 84)

Shapiro-Wilk tests
  CDEL combined    WBE    W = .928, p = .038*
                   AE     W = .949, p = .185
                   WAE    W = .931, p = .104
  CDEL teachers    WBE    W = .936, p = .062
                   AE     W = .902, p = .013*
                   WAE    W = .944, p = .204
  CDEL students    WBE    W = .656, p = .000*
                   AE     W = .675, p = .000*
                   WAE    W = .542, p = .000*
Levene tests
  CDEL (ALL)         F(2, 81) = .011, p = .989
  CDEL (teachers)    F(2, 81) = .123, p = .884
  CDEL (students)    F(2, 81) = .128, p = .053

* sample failed the Shapiro-Wilk normality test (p < .05)

The nonparametric ANOVA (Table 4.22) showed statistically significant differences between at least one pair of means (p < .05) and that the difference was related to the teacher focused data (TCDEL).

Table 4.22 Nonparametric ANOVA: ALLCDEL, CDEL teacher and CDEL student means (n = 84)


The Games-Howell multiple comparisons test results (Table 4.23) follow.

Table 4.23 TCDEL & SCDEL Games-Howell multiple comparisons test

Dependent    (I) School     (J) School     Mean Diff                       95% CI         95% CI
Variable     group by ES    group by ES    (I-J)        SE        Sig.    Lower Bound    Upper Bound

TCDEL        WBE            AE             -.17769      .09295    .145    -.4015         .0462
                            WAE            -.28741*     .10480    .022    -.5403         -.0345
             AE             WBE            .17769       .09295    .145    -.0462         .4015
                            WAE            -.10972      .09553    .490    -.3413         .1218
             WAE            WBE            .28741*      .10480    .022    .0345          .5403
                            AE             .10972       .09553    .490    -.1218         .3413

SCDEL        WBE            AE             -.05530      .17070    .944    -.4661         .3555
                            WAE            -.23387      .15142    .279    -.5993         .1315
             AE             WBE            .05530       .17070    .944    -.3555         .4661
                            WAE            -.17857      .15004    .465    -.5414         .1843
             WAE            WBE            .23387       .15142    .279    -.1315         .5993
                            AE             .17857       .15004    .465    -.1843         .5414

* Grey shading indicates significantly different means

The Games-Howell analysis for the teacher focused (TCDEL) data revealed that the x̅WAE – x̅WBE pair difference (difference = .29, p = .022) was statistically significant but the x̅AE – x̅WBE pair difference (difference = .18, p = .145) and the x̅WAE – x̅AE pair difference (difference = .11, p = .490) were not. For the student focused (SCDEL) means the test showed no statistically significant difference between the group means.
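The standard error reported for a Games-Howell comparison can be approximated from the group descriptives. The sketch below assumes the unpooled form √(s₁²/n₁ + s₂²/n₂) with Welch-Satterthwaite degrees of freedom, applied to the rounded TCDEL values for WAE and WBE in Table 4.20. The function name is hypothetical, and the p value itself is not computed because it requires the studentized range distribution.

```python
import math


def games_howell_se_df(n1, sd1, n2, sd2):
    """Unpooled standard error and Welch-Satterthwaite df for one pair."""
    v1 = sd1 ** 2 / n1  # variance of the first group mean
    v2 = sd2 ** 2 / n2  # variance of the second group mean
    se = math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, df


# TCDEL descriptives from Table 4.20: WAE (n = 24, s = .37), WBE (n = 31, s = .40).
se, df = games_howell_se_df(24, 0.37, 31, 0.40)
mean_difference = 4.20 - 3.91  # x̅WAE minus x̅WBE, from rounded means
print(f"difference = {mean_difference:.2f}, SE = {se:.4f}, df = {df:.1f}")
```

Under these assumptions the standard error lands close to the .10480 shown for the WAE and WBE pair in Table 4.23, with the small gap attributable to rounding in the tabled standard deviations.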

Findings from the CDEL data analysis

From the above analysis it was reasonable to conclude that teachers in schools where results were deemed to be WBE, compared to their colleagues in schools where results were deemed to be WAE, were more likely to ask closed questions, less likely to use open-ended questions or allow wait-time before answers, or use assignments and assessment tasks as stimulus for discussion. Teachers in the sample of WBE schools were less likely (39%) to ask their students to explain their thinking than their colleagues in WAE schools (75%).


4.2.3.3 FEEDBACK THAT ADVANCES LEARNING (FTAL)

This dimension of formative practice is about feedback that takes learning forward.

The Welch statistic for robust equality of means reported in Table 4.18 for this dimension (F(2, 50.494) = 4.522, p = .016) indicated that there are statistically significant differences between one or more of the group means. The following analysis will show which of those means pairs are statistically significantly different.

The means and means plots are shown in Table 4.24 and Figure 4.14 respectively. The data for SFTAL is based on one item (14a) which asks how often students are given the opportunity to provide feedback on the teaching they receive.

Table 4.24 FTAL combined, TFTAL & SFTAL means

School group    n     x̅       s      σx̅

Mean for FTAL combined
WBE             32    3.38    .29    .05
AE              28    3.59    .36    .07
WAE             24    3.60    .35    .07
Total           84    3.51    .34    .04

Mean for TFTAL
WBE             32    3.42    .31    .05
AE              28    3.64    .39    .07
WAE             24    3.66    .36    .07
Total           84    3.56    .37    .04

Mean for SFTAL
WBE             30    3.30    .75    .14
AE              28    3.82    .82    .16
WAE             23    4.00    .85    .18
Total           81    3.68    .85    .09


Figure 4.14 FTAL combined, TFTAL, SFTAL means

From observation of the means and related confidence interval spreads in Figure 4.14, it would appear that statistically significant means differences might be found in both the teacher focus and student focus data.

The FTAL teacher data set satisfied the normality tests (p > .05) but the three student data sets all failed (p < .05); all three data sets passed the Levene homogeneity of variance tests (see Table 4.25).

Table 4.25 Tests for normality and homogeneity of variance FTAL responses data sets (n = 84)

Shapiro-Wilk tests
  FTAL for teachers    WBE    W = .968, p = .439
                       AE     W = .948, p = .174
                       WAE    W = .953, p = .310
  FTAL for students    WBE    W = .830, p = .000*
                       AE     W = .848, p = .001*
                       WAE    W = .856, p = .003*
Levene tests
  FTAL (ALL)         F(2, 81) = .560, p = .574
  FTAL (teachers)    F(2, 81) = .411, p = .664
  FTAL (students)    F(2, 81) = .004, p = .996

* sample failed the Shapiro-Wilk normality test (p < .05)


The nonparametric ANOVA (Table 4.26) indicated that at least one of the pairs of means in each of the teacher and student data sets was statistically significantly different (χ²(2) = 8.713, p = .013 for TFTAL and χ²(2) = 11.100, p = .004 for SFTAL).

Table 4.26 Nonparametric ANOVA on FTAL ALL, FTAL teacher and FTAL student means (n = 84)

The Games-Howell multiple comparisons results are included in Table 4.27.

Table 4.27 TFTAL & SFTAL Games-Howell multiple comparisons (n = 84)

Dependent    (I) School     (J) School     Mean Diff                       95% CI       95% CI
Variable     group by ES    group by ES    (I-J)        SE        Sig.    Lwr Bound    Upr Bound

TFTAL        WBE            AE             -.22228*     .09138    .048    -.4428       -.0017
                            WAE            -.23736*     .09200    .035    -.4604       -.0144
             AE             WBE            .22228*      .09138    .048    .0017        .4428
                            WAE            -.01508      .10432    .989    -.2671       .2370
             WAE            WBE            .23736*      .09200    .035    .0144        .4604
                            AE             .01508       .10432    .989    -.2370       .2671

SFTAL        WBE            AE             -.52143*     .20661    .038    -1.0192      -.0237
                            WAE            -.70000*     .22440    .009    -1.2443      -.1557
             AE             WBE            .52143*      .20661    .038    .0237        1.0192
                            WAE            -.17857      .23574    .731    -.7494       .3922
             WAE            WBE            .70000*      .22440    .009    .1557        1.2443
                            AE             .17857       .23574    .731    -.3922       .7494

* The grey shading indicates a statistically significant difference


For the TFTAL data, the x̅WAE – x̅WBE pair difference (difference = .24, p = .035) and the x̅AE – x̅WBE pair difference (difference = .22, p = .048) were statistically significant but the x̅WAE – x̅AE pair difference (difference = .02, p = .989) was not.

For the SFTAL data, the x̅WAE – x̅WBE pair difference (difference = .70, p = .009) and the x̅AE – x̅WBE pair difference (difference = .52, p = .038) were statistically significant but the x̅WAE – x̅AE pair difference (difference = .18, p = .731) was not.

Thus, statistically significant means differences were both identified and confirmed.

Findings from the FTAL subsection

From the above analysis, teachers at schools where EV results were WBE (compared to their colleagues at WAE and AE schools) were more limited in the range of options used to provide feedback to their students and did so less frequently. WBE teachers were less likely to seek student feedback on their teaching, less responsive to student feedback on their teaching, and less inclined to change the next step in a lesson in response to feedback from students.

It was also appropriate to conclude from this analysis that on the one SFTAL item asking about the opportunity for students to provide feedback about the teaching they experience, teachers in WBE schools were less likely to invite it (closer to seldom than sometimes) compared with their colleagues at WAE schools who said they invited it sometimes.

On one item (Q9h), which asked about the use of digital technology to provide feedback during a lesson, teachers in the three group samples had a similarly low response rate, with most (53%) saying they didn’t know about it, were unsure about it, or never used it for feedback.

4.2.3.4 ACTIVATING STUDENTS AS INSTRUCTIONAL RESOURCES (ASIR)

This dimension explores the opportunities that might be provided for students to work collaboratively with peers as a teacher would work with their colleagues.


The Welch statistic for robust equality of means reported in Table 4.18 for this dimension (F(2, 51.2.6) = 1.714, p = .190) indicated that there are no statistically significant (p > .05) differences between any of the group sample means. Only the means and means plots will be provided, as shown in Table 4.28 and Figure 4.15 respectively.

Table 4.28 ASIR combined, TASIR & SASIR means

School group    n     x̅       s      σx̅

Mean for ASIR combined
WBE             32    3.88    .40    .07
AE              28    4.08    .47    .09
WAE             24    4.02    .45    .09
Total           84    3.99    .44    .05

Mean for TASIR
WBE             31    4.63    .50    .09
AE              28    4.74    .49    .09
WAE             24    4.60    .49    .10
Total           83    4.66    .49    .05

Mean for SASIR
WBE             32    3.43    .52    .09
AE              28    3.68    .69    .13
WAE             24    3.65    .57    .12
Total           84    3.58    .60    .07

Figure 4.15 ASIR combined, TASIR, SASIR means


Observation of the relative difference between the TASIR and SASIR sample means shows that the two means in each group pair were statistically significantly different.

Findings from the ASIR subsection

The main finding here is that across the three groups of schools combined, teachers in each sample said they work collaboratively more often than sometimes with colleagues on assessment related tasks. However, they only provide their students with opportunities to work collaboratively or provide feedback to each other seldom to sometimes in about equal measure.

4.2.3.5 ACTIVATING STUDENTS (AND TEACHERS) AS OWNERS OF THEIR LEARNING (ASTL)

Items relating to this dimension of formative practices canvass a range of activities for teachers and students designed to promote self-assessment leading to meaningful learning (a fact or concept and its connection/s to other aspects of a particular context that is understood by the learner at the very least).

Table 4.29 and Figure 4.16 provide the descriptive statistics and graphs of the means for this dimension.


Table 4.29 ASTL combined, TASTL & SASTL means

School group    n     x̅       s      σx̅

Mean for ASTL combined
WBE             31    3.49    .47    .08
AE              28    3.74    .62    .12
WAE             24    3.84    .52    .11
Total           83    3.68    .55    .06

Mean for TASTL
WBE             31    3.66    .52    .09
AE              28    4.00    .61    .12
WAE             24    4.07    .60    .12
Total           83    3.89    .60    .07

Mean for SASTL
WBE             31    3.11    .57    .10
AE              28    3.16    .91    .17
WAE             23    3.26    .69    .14
Total           82    3.17    .73    .08

Figure 4.16 ASTL combined, TASTL, SASTL means


Across the three groups collectively, the means for the teacher and student data are statistically significantly different. That said, the data sets were analysed to locate which of the means pairs were statistically significantly different. All data sets passed normality and homogeneity of variance tests (p > .05), as shown in Table 4.30.

Table 4.30 Tests for normality and homogeneity of variance on ASTL data sets (n = 83)

Shapiro-Wilk tests
  ASTL                 WBE    W = .951, p = .169
                       AE     W = .954, p = .252
                       WAE    W = .948, p = .248
  ASTL for teachers    WBE    W = .961, p = .317
                       AE     W = .948, p = .172
                       WAE    W = .929, p = .093
  ASTL for students    WBE    W = .933, p = .055
                       AE     W = .951, p = .211
                       WAE    W = .961, p = .476
Levene tests
  ASTL (ALL)         F(2, 80) = 1.451, p = .240
  ASTL (teachers)    F(2, 80) = .372, p = .690
  ASTL (students)    F(2, 79) = 2.984, p = .056

The nonparametric ANOVA indicates that there were statistically significant differences between at least one pair of the means for the combined scores and that the difference is located in the teacher component (TASTL), as shown in Table 4.31.


Table 4.31 Nonparametric ANOVA on ASTL ALL, ASTL teacher and ASTL student means (n = 83)

The Games-Howell multiple comparisons process (Table 4.32) confirmed that the mean difference between the WAE and WBE groups for TASTL was statistically significant (difference = .41, p = .030) because p < .05.

Table 4.32 TASTL & SASTL Games-Howell multiple comparisons

Dependent    (I) School     (J) School     Mean Diff                       95% CI       95% CI
Variable     group by ES    group by ES    (I-J)        SE        Sig.    Lwr Bound    Upr Bound

TASTL        WBE            AE             -.34101      .14862    .065    -.6993       .0173
                            WAE            -.40649*     .15409    .030    -.7797       -.0333
             AE             WBE            .34101       .14862    .065    -.0173       .6993
                            WAE            -.06548      .16798    .920    -.4715       .3405
             WAE            WBE            .40649*      .15409    .030    .0333        .7797
                            AE             .06548       .16798    .920    -.3405       .4715

SASTL        WBE            AE             -.04724      .19976    .970    -.5316       .4371
                            WAE            -.15334      .17604    .661    -.5811       .2744
             AE             WBE            .04724       .19976    .970    -.4371       .5316
                            WAE            -.10611      .22396    .884    -.6475       .4353
             WAE            WBE            .15334       .17604    .661    -.2744       .5811
                            AE             .10611       .22396    .884    -.4353       .6475

* Means are statistically significantly different (p < .05)


Findings from the ASTL dimension

The relevant findings were that teachers in WAE schools, compared to their colleagues in WBE schools, more frequently self-monitor their teaching, use a greater variety of resources to inform their assessment-related work and engage more in professional discussions about syllabus intentions and what is meant by progression in science learning.

All three group samples of teachers indicated they seldom provide students with opportunities to acquire learning-how-to-learn skills such as redoing work to a higher standard, self-selecting items for a portfolio (or explaining their choices for inclusion) or keeping a reflective journal.

4.2.4 Set four results: Respondent Data

Data and information about teachers and their schools were sought in the final section of the online survey. Table 4.33 presents the aggregated data provided by teachers from all three groups of schools.


Table 4.33 Data about respondents and their schools

Question and Response/s

16. Gender:
    nF = 54 (63%)*    nM = 24
17. Years teaching:
    0-5 yrs: n = 10    6-10 yrs: n = 12    11-15 yrs: n = 12    15+ yrs: n = 44 (56%)
18. Science teacher by training / qualifications:
    Yes: n = 76 (95%)    No: n = 4
    Other qualifications: 4 listed, only one not obviously science related
19. Head teacher:
    Yes: n = 39 (48%)    No: n = 42 (52%)
20. Highest science teaching qualification (n = ):
    BA + Dip Ed 55 (70%)    BTeach (4 yrs) 12    MTeach (5 yrs) 7    Doctorate or PhD 4    Other 3
21. Year training completed:
    earliest: 1973    latest: 2015
22. Where trained (n = ):
    completely overseas: 8    overseas and in Australia: 5    completely in Australia: 65 (76%)
23. I teach / have taught Y7-9 classes (n = ):
    this year 69 (87%)    last year 2    the year before last 3    more than three years ago 5

Note. Numbers in bold show the mode. Because most respondents did not identify themselves or their school, it is not possible to provide a meaningful summary of the figures for Q 24-27 inclusive. Q 24 asked for the number of Y8 classes at your school; Q 25 asked for the number of full time teachers at your school; Q 26 asked for the number of part-time teachers at your school and Q 27 asked about part-time science teachers; it seems that almost all schools had part-time science teachers (from 1-3) in 2016.
* DE employment figures for 2015 show that 61.7% of permanent secondary teachers are male.
(nF = number of females; nM = number of males)

A higher proportion of female science teachers responded than males (two to one) even though the proportion of science teachers in Department schools is three males to two females. More than half the respondents were in the most experienced category. Around 1 in 20 science teachers in the sample here have more than the basic qualification to teach science. All but one had a qualification that was mostly science based. Half the respondents were head teachers.


4.3 Other findings

Attention is drawn here to findings that will be referred to in the discussion of answers to the research questions (Chapter Six).

The first is a breakdown of respondents to the survey in terms of teaching experience in each of the three school groups (WAE, AE or WBE).

4.3.1 Teacher experience and student achievement

The proportion of teachers with 15 or more years teaching experience in each of the three groups was: 44% (WBE); 57% (AE) and 56% (WAE). However, an ANOVA to compare the between group means for teaching experience (F(2, 80) = 2.567, p = .083) showed that there were no statistically significant differences (p > .05) when it came to comparing respondent experience.

4.3.2 Teacher use of EV student survey feedback

The survey was designed to provide feedback to teachers about their students’ experiences of science at school, including what students thought of the test itself, about science lessons, about science, intentions to study science later in school, which school subjects they liked most (three to choose from of fifteen provided), and which subject they thought they learnt most in (three to choose from of fifteen provided).

Three items in the online survey asked teachers whether in the previous 12 months they had:

• looked at the results from the student survey in the last year (Q1a)

• discussed the results with colleagues (Q1g)

• discussed those results with students (Q1i).

The relevant between groups ANOVA statistic (F(2, 82) = 2.563, p = .083) for the cluster of three items revealed that the between group sample means were not statistically significantly different (p > .05). Thus, descriptive statistics for all survey respondents (n = 85) are presented below in Table 4.34 and Figure 4.17.

Table 4.34 YES counts for student survey items

Total    Frequency    Percent    Cumulative percent
0        26           30.6       30.6
1        19           22.4       52.9
2        26           30.6       83.5
3        14           16.5       100.0
Total    85           100.0

Figure 4.17 Frequency versus item sets for student survey (none to three yes responses)

Just over 30% of teachers had not engaged with the student feedback at all. Fewer than one in five (16%) teachers had looked at and discussed the results with colleagues and students.
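The percentages quoted above follow directly from the frequency counts in Table 4.34. A minimal sketch of that arithmetic is given below; the function name is hypothetical and only the four counts are taken from the table.

```python
# Number of teachers giving 0, 1, 2 or 3 YES responses, from Table 4.34.
yes_counts = {0: 26, 1: 19, 2: 26, 3: 14}


def frequency_table(counts):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    total = sum(counts.values())
    rows = []
    running = 0
    for value in sorted(counts):
        running += counts[value]
        rows.append(
            (value, counts[value], 100 * counts[value] / total, 100 * running / total)
        )
    return rows


for value, freq, pct, cum in frequency_table(yes_counts):
    print(f"{value}  {freq:>3}  {pct:5.1f}  {cum:5.1f}")
```

The first row corresponds to the teachers who had not engaged with the student feedback at all, and the last row to those who had looked at the results and discussed them with both colleagues and students.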


4.4 Key findings from the survey analysis

This section summarises the survey findings as they relate to the first two research questions. The survey did not address the issue of why (or why not) teachers made use of the EV program resources. Data and information to answer that part of the two questions are provided in Chapter Five.

Where findings were described as statistically significant, the sample findings generalise to the relevant population from which the samples were taken. The expression WAE teachers is shorthand for saying teachers at schools where EV results were WAE (well above expectation). AE or WBE teachers have comparable meanings except that the reference is to the relevant expectation.

Research question one: What use are science teachers making of the EV program including SOLO and why is it used or not used?

1. Just over 70% of survey respondents had looked at the feedback from the student survey.

2. Teachers at schools where results were deemed to be WBE make less use overall of EV results and resources to support their assessment-related work than do their colleagues at schools where results are deemed to be AE or WAE.

3. There were no statistically significant differences between AE and WAE teachers’ engagement with the EV program.

4. In relation to EVB, which was about discussing results with colleagues, 66% of the total teacher sample had discussed the test item and task analysis, 49% had discussed the results of the student survey, and 33% had discussed the student profile information.

5. EVC was about discussion with students. 22% of the total sample had discussed the item or task analysis with students and 18% had discussed the results of the survey with students.

6. EVE was about using EV resources in the classroom. 45% of the total sample had used the teaching strategies provided in the SMART package and 68% had used items and tasks from EV tests in their school assessments.


7. EVF was about engagement beyond school. Two teachers from the AE sample had written items for the EV test; two teachers each from the AE and WAE sample had evaluated items for the test; 39% of the total sample had marked extended response tasks; and 30% of the total sample had attended workshops about the EV program (different to training for marking).

8. The first category (EVA) asked teachers to say whether they had, in the previous twelve months, looked at EV results for the student survey (for their class), the analysis of answers to the extended response tasks, and individual student profile results. Teachers in WBE schools had not accessed (viewed) this information as much as their colleagues in WAE schools.

9. The fourth category (EVD) asked teachers whether they had in the previous two years accessed EV related materials in TaLE (the Department’s internal teacher support website), SMART (which provided feedback on EV results as well as advice about teaching strategies to address science misconceptions) and the separately produced marking manuals for extended response tasks. Again, teachers in WBE schools had not accessed these resources as much as their colleagues in WAE schools.

10. The sixth category (EVF) asked whether teachers in the previous two years had used EV test items and tasks in their own tests or as models to work with. Teachers in WBE schools had done so less than their colleagues in AE schools.

11. The seventh category (EVG) asked whether schools had used EV results to inform changes to faculty (teaching and learning) programs in the previous two years. Teachers in WBE schools made less use of EV results in that process than had teachers in AE schools.

12. All three groups of teachers rated their understanding of the EV program as acceptable or higher. Teacher self-ratings of EV program understanding in the AE and WAE groups were higher than in the WBE group (good compared to acceptable).

13. Most teachers in the three groups identified that the purpose for the EV program was to provide feedback to teachers about teaching, progress in learning and/or their teaching and learning programs.

14. Fewer WBE schools (three schools) indicated that they would take up the VALID10 test compared to AE or WAE schools (six schools each).

15. Fewer than 20% of respondents had ‘accessed’ SOLO; fewer than 10% reported using it to inform faculty policy or as a basis for reporting to parents.

16. The most commonly mentioned source of learning about SOLO was reported by respondents as either EV marking or workshop attendance, and these made up around one third of all responses to the question about where they had learnt most about SOLO.

17. Seven percent of respondents said they had not heard of SOLO.

18. The overall level of self-reported understanding of SOLO by respondents ranged from poor to acceptable.

Research question two: What formative practices are evident in the work of science teachers and why are they used or not used?

19. In relation to the use of formative practices overall, there were statistically significant differences between WBE and WAE teachers. Teachers in WBE schools used formative practices less frequently in their teaching than did their colleagues in WAE schools. Teachers in all three groups more often decided the formative practices to be used rather than share decision making with students on what tasks were to be done and why and how tasks were to be done and assessed.

20. Overall, AE teachers had more in common with their WAE colleagues than WBE colleagues when it came to frequency of use of formative practices.

21. When it came to learning intentions and success criteria (LISC), which was the first dimension of formative practice, teachers in all three samples provided students with the learning intentions and success criteria (between sometimes and often) more than students were asked to identify or choose them (between seldom and sometimes).

22. The second dimension involving classroom discourse eliciting evidence of learning (CDEL) revealed that WBE teachers were more likely to use closed questions; less likely to use open-ended questions; less likely to allow wait-time before answering and less likely to use assignments and assessment tasks as stimuli for discussion than were their WAE colleagues. Teachers in the WAE sample of schools were most likely to ask students to explain their thinking (more often than sometimes) when compared to either their colleagues in the WBE or AE samples.

23. In relation to feedback (the third dimension of formative practice), WAE teachers compared to WBE teachers were more likely to: use grades linked to syllabus expectations, provide feedback to students addressing misconceptions, refer to success criteria or syllabus intentions and were more responsive to student feedback on their teaching. WAE teachers were more inclined to change the next step in a lesson in response to feedback from students and were the most likely to ask students to provide them with feedback on their teaching.

24. The most frequent response from all three samples of teachers to the item asking about the use of digital technology to monitor learning progress during a lesson was never.

25. In terms of working collaboratively with peers (dimension four) there were no statistically significant differences between practices across the samples of respondents. Teachers collectively said they work collaboratively more often than sometimes with colleagues on assessment related tasks. However, they only provide their students with opportunities to work collaboratively seldom to sometimes in about equal measure.

26. The fifth dimension of formative practice is about taking responsibility for their own learning. WAE teachers model learning-how-to-learn strategies with students and colleagues more frequently (sometimes) than their WBE colleagues (seldom-sometimes equally).

27. Overall, teachers in the three samples indicated they seldom provide students with opportunities to acquire the skills needed to take control of their own learning.

4.5 Summary of findings in relation to science teacher use of formative practices

The analysis of the responses by science teachers to the online survey (phase two) produced statistically significant findings about the use of formative practices and EV results. In schools where EV results were well above expectation (WAE), compared to schools where EV results were well below expectation (WBE), science teachers were more frequent users of activities associated with the following three (of five) dimensions of formative practice:

• discourse eliciting evidence of learning (second dimension)

• the provision of feedback known to progress learning (third dimension)

• the use of and modelling (to peers and students alike) of good learning behaviours, including self-assessment (fifth dimension).

There were no statistically significant differences in the frequency of teacher practices related to the first and fourth dimensions of formative practice for sampled teachers in each of the three school groups. As well, the frequency with which teachers engage students in collaborative work with each other and provide opportunities for peer assessment is comparable across all three samples, and lower than the frequency with which teachers work collaboratively with colleagues.

The next chapter provides additional context for these findings in specific assessment narratives generated for case study schools. It also explores the extent to which case study school data confirm or refute the three predictions made in Section 3.6. The predictions are designed to test two claims that are at the core of this research. The first is that the dual measure of scientific literacy and effect size of teaching vested in the regression residual is valid; and the second is that formative practices are associated strongly with higher achievement and engagement with science later in secondary school years. The confirmation (or otherwise) of the second prediction is an important contribution to answering research question three.

CHAPTER FIVE: PHASE THREE - COMPARING CASE STUDY SCHOOLS

This chapter reports findings with which to answer research question three. This question is

Does the use of (and if so, how do) formative practices by teachers improve students’ EV results and later achievement in and engagement with science?

Achievement data and evidence of engagement with science from five pairs of case study schools provided the basis for findings related to improvement (or otherwise) in Year 8 science results and again at the end of Year 10 (later achievement). Schools were paired on the basis of having comparable SEA scores and statistically significantly different residuals. Comparable SEA scores are scores that are not significantly different in the statistical sense.

Residuals are imputed to be a measure of the impact of science teaching on EV results; the bigger the positive residual, the greater the contribution of science teaching to that EV result in terms of the mark gain (the measure of improvement) above a predicted mark based on NAPLAN results, as explained in Chapter Three. The bigger the residual difference the better because it improves the chance of identifying what might be causing the differences in those results. These differences, if they exist, are likely to be found in the case study school narratives of assessment-related work provided in Appendix H.

Evidence of engagement was provided in the form of:

• measures of student responses to six items in the EV student survey

• proportions of students completing senior science courses (relative to the state)

• information in case study school narratives about assessment-related work.

The findings related to three predictions linking residual differences to achievement and engagement provide the basis for answering the question. The predictions were:

1. At the end of Year 8 comparable schools with the biggest residuals will have better EV results and engagement figures than schools with smaller or negative residuals.

2. At the end of Year 10 comparable schools with the biggest residuals at the end of Year 8 will have better results than schools with smaller or negative residuals.

3. At the end of Year 12 comparable schools with the biggest residuals at the end of Year 8 will have a higher proportion of their students (relative to English) complete senior science courses than schools with smaller or negative residuals.

Findings related to the first prediction demonstrate the relationship between teacher use of formative practices (indicated by the size and polarity of the residual) and the size of EV result for a school. A highly positive correlation between the residual and EV result for comparable schools would be a strong indication that the use of formative practices was somehow involved.

Findings related to the second prediction may show an ongoing positive correlation between a high positive residual for a school at the end of Year 8 and continuing high achievement in science two years later. This researcher was speculating that later high achievement at this school would be associated with either continuing use by teachers of formative practices or a lasting effect on students from that use.

Findings related to the third prediction may show that later high engagement (Y12 science completions) is positively correlated with either high achievement at the end of Year 8 or high engagement (as measured by scores on the six items from the student survey completed with the EV test) or both. This researcher was speculating that high engagement would be associated with continuing use by teachers of formative practice or be a lasting effect on students from that use.

The lasting effect referred to in the context of predictions two and three is the acquisition of self-regulation and related learning skills by students as a result of their exposure to formative practices. This researcher’s assumption in framing the predictions was that more students at WAE schools would become self-regulated, autonomous and skilled learners (of science) as a result of their relatively high exposure to those practices than at AE and WBE schools.

The credibility of this assumption is supported by the research into learning how to learn reported in Chapter Two. Purposely teaching students the five strategies of formative assessment has been demonstrated to produce students who use “good learning behaviours” (Boyle et al., 2001, p. 200). Evidence of the extent to which teachers had directed their efforts to helping students acquire these five skill sets was provided in the results of the teacher survey reported in Chapter Four.

Self-regulated students are also motivated to keep learning. The extent of student liking for their science experience at school is a possible indicator of the extent to which students had acquired the disposition for continued learning in science implied by self-regulation. A measure of student liking for science was available in the scores students returned on the six items of the student survey reported by case study teachers. Anecdotal evidence of student attitudes to science was also provided in the case study school narratives of assessment-related work practices.

The justification for some of the content in this chapter, particularly the identification of specific examples of assessment-related practices associated with successful case study schools, arises from the intention to report the findings to participating teachers and to the Department. A further intention is that the findings be used to support professional learning that leads to greater use of formative practices in science classrooms. This is consistent with the transformative intent of the research as outlined in Chapter Three.

Table 4.1 in Chapter Four showed the 394 participating schools sorted from 1 to 394 on the basis of their residual ranking and subsequent division into five groups. Schools in the top, middle and bottom groups were invited to participate in the research. As was reported there, of the 101 survey returns from teachers, 42 teachers identified themselves and the 36 schools they were working at.

5.1 The case study schools

Table 5.1 reports selected quantitative data for all 36 self-identified schools. Those data were sourced from the Department and the My School website (the SEA score). School identities were protected by replacing the school name with an identifier code. The 16 case study schools that were engaged with are highlighted in the table.

Table 5.1 Schools that identified themselves including case study schools (shaded)

SCHOOL CODE    n   SEAS     PEV  RPEV     AEV  RAEV     SR  RSR
PCWAE1        24    2.7   85.40   127   89.95    46   2.68    1
MCWAE1        19    2.8   78.89   374   82.14   286   1.85    3
PCWAE2        44    1.8   81.90   306   84.79   165   1.69    5
PCWAE7        30    2.3   83.19   237   85.81   129   1.59    8
MCWAE2        54    6.9   87.96    68   90.65    41   1.57   10
PCWAE3        55    2.0   81.26   325   83.64   221   1.43   12
MCFSWAE1     106    8.6   99.90    11  101.97     3   1.19   23
MCWAE3       150    6.2   87.45    77   89.47    54   1.17   24
PCWAE4       161    5.5   89.09    56   91.05    37   1.12   29
PCWAE5        49    2.3   82.57   273   84.44   175   1.08   36
MCWAE6       136    6.0   89.26    51   90.50    43   0.73   58
PCWAE6        28    0.9   82.31   289   83.34   235   0.60   78
n = 12
----------------------------------------------------------------
MGFSAE1      113    9.1  100.76     7  100.23     7   0.12  174
MGAE1        108    3.0   81.65   316   81.86   298   0.12  176
PCAE1        108    3.7   85.40   129   85.55   136   0.08  186
MCAE8         70    2.6   78.62   377   78.82   373   0.06  192
MCAE2         88    3.9   84.94   147   84.85   161   0.03  201
MCAE3        204    3.8   84.30   179   84.28   185   0.01  207
MCAE4         93    2.2   82.19   292   82.16   285  -0.01  213
MCAE5        146    4.1   85.39   128   85.38   141  -0.02  214
MGFSAE2      141    8.3  101.32     5  101.00     6  -0.09  232
MCAE6         89    1.5   79.19   368   79.01   370  -0.01  235
MCAE7        141    2.4   83.42   227   81.91   284  -0.16  244
n = 11
----------------------------------------------------------------
MBFSWBE2     133    8.2   98.99    14   97.99    17  -0.58  313
MGWBE1       142    7.1   89.60    48   88.34    67  -0.75  330
MCWBE7       153    8.2   91.70    31   90.47    44  -0.76  331
PCWBE2        68    2.1   83.01   248   81.42   316  -0.81  335
MCPSWBE3     123    6.9   92.33    26   90.59    42  -1.03  360
PCWBE6        97    2.9   84.16   184   82.14   287  -1.20  368
MGFSWBE1     135    8.9  101.69     3   99.28    14  -1.42  376
PCWBE1        51    1.7   82.97   253   80.61   340  -1.44  377
MCWBE5        79    2.1   85.09   140   82.54   275  -1.48  378
MCWBE4        47    0.7   76.30   392   73.63   394  -1.58  382
MCWBE3       148    4.0   85.70   118   82.85   256  -1.69  383
MCPSWBE2     144    5.4   90.92    37   87.61    78  -1.91  388
MCPSWBE1      34    6.3   92.93    23   89.63    51  -1.93  389
n = 13

Note. School code: First letter is (P)rovincial or (M)etropolitan (ACARA defined). Second letter is (C)oeducational, (G)irls or (B)oys. FS = fully selective entry / PS = partially selective entry. Residual group (WAE, AE or WBE), then a final number to differentiate schools. Columns: n = number of students whose results were used to perform the regression / SEAS = socio-educational advantage score / PEV = predicted EV result / RPEV = rank out of 394 based on predicted EV result / AEV = actual EV result / RAEV = actual EV rank out of 394 / SR = standardised residual used to designate schools as WAE, AE or WBE / RSR = school rank order based on residual (N = 394).

From these codes one can identify the category of school (described in Chapter One). At least one fully selective entry school (FS) was found in each group of schools (WAE, AE & WBE). One of the FS schools was coeducational (C) and the other two were girls (G) schools. Provincial (P) schools were represented in all three groups and there were three in the WAE group. Provincial schools were all coeducational (C) schools. The WBE group included two partially selective entry (PS) coeducational (C) schools as well as one fully selective girls (G) school. There were metropolitan (M) schools in all three groups.

The schools in Table 5.1 are ranked according to standardised residuals (RSR) shown in the far right-hand column. Lines separate WAE from AE and AE from WBE schools. Note that, generally speaking, actual EV results (AEV) higher than predicted EV results (PEV) are associated with positive school residuals (SR, second column from the right); AEV results lower than PEV results are associated with negative residuals.
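The "generally speaking" qualifier can be checked directly against rows of Table 5.1. The sketch below is illustrative only: it compares the sign of the raw gap (AEV minus PEV) with the sign of the reported standardised residual for six rows copied from the table. It does not reproduce the standardisation itself, which is described in Chapter Three, and the function names are hypothetical.

```python
# Rows copied from Table 5.1: (school code, PEV, AEV, SR).
ROWS = [
    ("PCWAE1", 85.40, 89.95, 2.68),
    ("MCWAE1", 78.89, 82.14, 1.85),
    ("MCAE3", 84.30, 84.28, 0.01),
    ("MGFSAE1", 100.76, 100.23, 0.12),
    ("MCWBE4", 76.30, 73.63, -1.58),
    ("MCPSWBE1", 92.93, 89.63, -1.93),
]


def raw_gap(pev, aev):
    """Actual minus predicted EV result, rounded to the table's precision."""
    return round(aev - pev, 2)


def sign_mismatches(rows):
    """Return codes where the raw gap and the reported residual differ in sign."""
    mismatched = []
    for code, pev, aev, sr in rows:
        if (raw_gap(pev, aev) > 0) != (sr > 0):
            mismatched.append(code)
    return mismatched


if __name__ == "__main__":
    for code, pev, aev, sr in ROWS:
        print(code, raw_gap(pev, aev), sr)
    print("sign mismatches:", sign_mismatches(ROWS))
```

In this subset the only mismatches are AE schools whose residuals sit very close to zero, which is consistent with the association holding generally rather than without exception.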

The School Profile page for each school on the My School website shows the proportions of students at that school in four quarters from the most educationally disadvantaged to the most educationally advantaged (L to R). School profile data for Year 7 student entry from 2010 to 2013 were averaged over the four years. As explained in Section 3.6, the profile quarters were converted to a single SEA score (SEAS) using a linear transformation as a further measure to protect the school’s identity. The SEA score is an independent measure of the collective learning potential of students at a school. The column headed SEAS shows the four quarters of the socioeducational profile for students at that school as a single score.

The aim was to have among the case studies the six highest-ranked schools from the WAE category, the six schools closest to a zero residual (AE category) and the six lowest-ranked schools (in the WBE category). To this end, teacher-identified schools in each residual category were invited to participate in order of their residual size.

Table 5.2 provides descriptive statistics for the three groups of schools chosen on the basis of their residuals.

Table 5.2 Mean standardised residuals and SEA scores

        Residual means     Residual means     Mean SEA scores    Residual means     Mean SEA scores
        for the three      for self-          for self-          for case study     for case study
        populations        identified         identified         schools            schools
                           schools            schools

WAE     μ = 1.02           x̅ = 1.42           x̅ = 3.86           x̅ = 1.8            x̅ = 3.85
        σ = 0.39           s = 0.55           s = 2.24           s = 0.45           s = 2.4
                           σx̅ = 0.16          σx̅ = 0.65          σx̅ = 0.19          σx̅ = 0.94
        N = 85             n = 12             n = 12             n = 6              n = 6

AE      μ = -0.01          x̅ = 0.01           x̅ = 4.06           x̅ = -0.02          x̅ = 4.4
        σ = 0.10           s = 0.09           s = 2.44           s = 0.05           s = 2.84
                           σx̅ = 0.03          σx̅ = 0.74          σx̅ = 0.03          σx̅ = 1.42
        N = 88             n = 11             n = 11             n = 4              n = 4

WBE     μ = -1.08          x̅ = -1.28          x̅ = 4.96           x̅ = -1.67          x̅ = 4.6
        σ = 0.44           s = 0.46           s = 2.85           s = 0.22           s = 3.0
                           σx̅ = 0.13          σx̅ = 0.79          σx̅ = 0.09          σx̅ = 1.21
        N = 85             n = 13             n = 13             n = 6              n = 6

Note. μ = population mean / x̅ = sample mean / σ = standard deviation (population) / s = standard deviation (sample) / σx̅ = standard error (sample) / N = population number / n = sample number

The second column in Table 5.2 (reading left to right) shows the residual means for all the schools in each of the three groups invited to participate. The third column has the residual means for self-identified schools including the case study schools. The fourth column is the SEA score data for the self-identified schools. The fifth column shows the residual means for the case study schools, and the sixth column shows the mean SEA scores for the case study schools.
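The standard errors in Table 5.2 appear to follow the usual relationship between the sample standard deviation and the sample size, σx̅ = s / √n. The sketch below is a check under that assumption, using the self-identified school entries copied from the table; the function name is hypothetical and the thesis itself reports that the statistics were produced in SPSS.

```python
import math


def standard_error(s, n):
    """Standard error of the mean from a sample SD and sample size."""
    return s / math.sqrt(n)


# (label, s, n, standard error reported in Table 5.2), self-identified schools.
CELLS = [
    ("WAE residuals", 0.55, 12, 0.16),
    ("WAE SEA scores", 2.24, 12, 0.65),
    ("AE SEA scores", 2.44, 11, 0.74),
    ("WBE residuals", 0.46, 13, 0.13),
    ("WBE SEA scores", 2.85, 13, 0.79),
]

if __name__ == "__main__":
    for label, s, n, reported in CELLS:
        print(label, round(standard_error(s, n), 2), "reported:", reported)
```

Because the tabled standard deviations are themselves rounded, some other cells (for example, those with s given to one decimal place) may differ from this calculation in the second decimal place.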

The residual means for the three school groups (second column of Table 5.2) are separated by almost three standard deviations, which effectively provides three different populations from a statistical perspective (the overlap at the extremes of the residual distributions is approaching one percent or less). This distinction is important, as was shown in Chapter Four when ANOVA findings based on sample data could be applied to all the schools in that population.

Figure 5.1 represents visually the means and related error bars (at the 95% confidence interval) for the data in columns two, four, three and five (reading L to R) in Table 5.2.

Figure 5.1 Graphical representation of descriptive statistics for identified (ID) and case study (CS) schools combined and case study (CS) schools separately

The data sets for all identified schools (n = 36) from Table 5.1 were tested using SPSS for normality (Shapiro-Wilk test) and homogeneity of variance (Levene tests). Three of the nine data sets (AE schools SEA scores and EV results and WAE schools EV results) failed the normality test (p < .05). All three data sets of residuals failed the homogeneity of variance test (p < .05), which was not unexpected given the non-random way the schools associated with each group were selected. Correlation results (n = 36) are reported in terms of the nonparametric Spearman coefficient (r), degrees of freedom (df) and a two-tailed significance of either .01 or .05 (as shown with the reported correlation coefficient).

The correlation between the residuals and actual EV results (r = .18, df = 34, p = .283) was slightly positive but not statistically significant (p > .05).

The correlation between the SEA scores and actual EV results (r = .84, df = 34, p < .001) was very highly positive and highly statistically significant. The SEA score and residual correlation (r = -.08, df = 34, p = .627) was slightly negative and not statistically significant. These two findings were hoped for given that the residual was supposed to show an effect of teaching once student background and school factors had been taken out of the EV result.
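For readers unfamiliar with the Spearman coefficient, the following is a minimal pure-Python sketch of how it is obtained: both variables are converted to ranks (ties share the mean rank) and a Pearson correlation is computed on the ranks, with df = n - 2. It is not the SPSS procedure used in the thesis, the function names are hypothetical, and the six rows used (the first six WAE schools in Table 5.1) are only an illustrative subset, so the result is not expected to match the coefficients reported for all 36 schools.

```python
import math


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Return (Spearman coefficient, degrees of freedom)."""
    return pearson(average_ranks(x), average_ranks(y)), len(x) - 2


# SEAS and AEV for the first six WAE rows of Table 5.1 (illustrative subset).
SEAS = [2.7, 2.8, 1.8, 2.3, 6.9, 2.0]
AEV = [89.95, 82.14, 84.79, 85.81, 90.65, 83.64]

if __name__ == "__main__":
    r, df = spearman(SEAS, AEV)
    print(f"r = {r:.2f}, df = {df}")
```

Working on ranks rather than raw scores is what makes the coefficient suitable for data sets that fail a normality test, as several of these did.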

An ANOVA performed on the SEA scores and residuals related to each of the three school groups further supported the correlation results. Welch test statistics for the three data sets (W_SEAS(2, 12.525) = .281, p = .759) indicate that the mean SEA scores for each of the three groups were not statistically significantly different. However, the means for the EV results for the three groups (W_EV(2, 13.133) = 4.98, p = .025) did show at least one statistically significant difference between a pair of the three group means. The Tukey multiple comparisons test (EV results passed the homogeneity of variance test) shows a statistically significant mean difference between the EV results for the WAE and WBE school groups (x̅wae - x̅wbe = 4.94, p = .03). The finding from that testing was that the EV results of WAE schools had a statistically significantly higher mean than the EV results of WBE schools. This was confirmation of a statistically significant association between high EV results and high positive residuals and lower EV results and low negative residuals.

As explained in Section 3.2, the intention was to have the residual means for the WBE and WAE schools as widely separated as possible. This was to provide the best possible chance of finding differences in the teaching associated with the residuals given that the impact of classroom teaching on learning is a relatively small contribution to the accounted for variability in achievement (around 30% according to Hattie (2003b)). The extent to which the residuals represent “maximum variability” (Flyvbjerg, 2011, p. 306) can be seen in the means plots (the first and second plots on the left in Figure 5.1) and the ANOVA results.

The teacher survey analysis (Chapter Four) attributed the residual differences to the frequency with which teachers in each of the three school groups used activities associated with the five dimensions of formative practice. EV results that were well above expectation (WAE) were statistically significantly associated with more frequent use by teachers in WAE schools of activities associated with the second, third and fifth dimensions of formative practice as summarised in Section 4.5.

5.2 Three predictions and the case study schools

This section explains the data about achievement and engagement relevant to the three predictions. Research question three asks how formative practices help improve students’ results and achievement. The hypothesis was that exposure to formative practices produces self-regulated autonomous learners. As outlined in the opening section of this chapter, the intention was to provide credible evidence that self-regulated autonomous learners are the engineers of their improved achievement and engagement in science.

5.2.1 Prediction one: Year 8 achievement and engagement.

Participating teachers at the case study schools were asked to transcribe results from the Schools Measurement, Assessment and Reporting Toolkit (SMART) into a proforma sent well before the school visit. Teachers were asked to bring the completed proforma to the interview, when it would be discussed. The proforma is provided as Appendix E. Results are reported in SMART against six SOLO-related levels. Schools were asked to aggregate the results into three achievement bands. Levels 5 and 6 were labelled as top-band results; levels 3 and 4 were middle-band results and levels 1 and 2 were bottom-band results.
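The aggregation of the six SMART levels into three achievement bands can be sketched as follows. The level-to-band mapping is taken from the text; the function names and the student counts are hypothetical, included only to show the arithmetic.

```python
from collections import Counter


def band_for_level(level):
    """Map a SMART level (1-6) to the achievement band used on the proforma."""
    if level in (5, 6):
        return "top"
    if level in (3, 4):
        return "middle"
    if level in (1, 2):
        return "bottom"
    raise ValueError(f"unknown level: {level}")


def band_percentages(level_counts):
    """Aggregate per-level student counts into band percentages."""
    totals = Counter()
    for level, count in level_counts.items():
        totals[band_for_level(level)] += count
    n = sum(totals.values())
    return {band: 100 * totals[band] / n for band in ("top", "middle", "bottom")}


if __name__ == "__main__":
    # Hypothetical counts for one school: level -> number of students.
    example = {1: 2, 2: 3, 3: 5, 4: 6, 5: 3, 6: 1}
    print(band_percentages(example))
```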

Achievement data are reported in three achievement bands for five result categories: an overall EV result; a knowledge and understanding result; an extended response task result; a working scientifically result; and a communicating scientifically result. For the purpose of this exercise, results from four of the five achievement categories were asked for (the knowledge and understanding category was not provided for on the proforma).

Engagement data were also reported in SMART to teachers against achievement levels. Teachers were asked to record the engagement scores against the three achievement bands in the proforma. The survey had 21 items in it. Only six were chosen for reporting on in the proforma. The items were labelled A to F for the purpose of this analysis.

Students responded to Items A to D by choosing from a four-point scale: strongly disagree, disagree, agree, strongly agree. Individual responses to the survey items were aggregated by school, groups of schools, and the state and reported back to schools as graphs where the scale ranged from -2 to +2. The results for Items A to D are reported on a different scale in this thesis. The effect is to shift the scale so that the lowest possible score is zero. The closer the score is to zero, the stronger the disagreement with the item statement. A score close to four means strong collective student agreement at the school with the statements. An even mix of agreement and disagreement in the school population would produce a score close to 2.
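The rescaling described above can be illustrated with a short sketch. It assumes the shift from the SMART graph scale (-2 to +2) to the thesis scale (0 to 4) is a simple offset of +2, and that scores above the midpoint of 2 are read as agreement and scores below it as disagreement; the function names are hypothetical.

```python
SHIFT = 2.0  # assumption: the -2..+2 scale is moved to 0..4 by adding 2


def to_thesis_scale(smart_score):
    """Shift a SMART graph score (-2..+2) so the lowest possible score is 0."""
    if not -2.0 <= smart_score <= 2.0:
        raise ValueError("SMART scores are expected to lie between -2 and +2")
    return smart_score + SHIFT


def reading(thesis_score):
    """Interpret a shifted score relative to the midpoint of 2."""
    if thesis_score > 2.0:
        return "agree"
    if thesis_score < 2.0:
        return "disagree"
    return "evenly split"


if __name__ == "__main__":
    # Shifted scores of the kind reported in this section for statewide results.
    for score in (2.78, 1.76, 1.37):
        print(score, reading(score))
```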

The statewide responses for Items A to D follow. Table K.2 in Appendix I has the full data set for the case study schools for the six Items.

In relation to Item A, which said: I want to study a science subject in Years 11 and 12, top band students agreed (2.78 out of 4.00), middle and bottom band students disagreed (1.76 and 1.37 respectively).

Item B said: Science is the hardest subject I learn. Top band students disagreed (1.56 out of 4.00), middle band students disagreed also (1.69) but bottom band students only just agreed that it was the hardest (2.03). Disagreement in response to this item was taken as a positive result.

Item C said: In primary school, I enjoyed lessons that were about science. Top band students agreed (2.76 out of 4.00), middle and lower band students also agreed (2.35 and 2.01 respectively).

Item D said: In secondary school, I enjoy science lessons. Top band students agreed (2.83 out of 4.00), middle band students also agreed (2.23) but bottom band students disagreed (1.91).

Item E asked students to nominate their three favourite subjects (15 were listed including science). Of the top band students, 13.5% nominated science in that group, as did 6.65% of middle band students and 4.58% of bottom band students.

Item F asked students to nominate the three subjects they thought they learned most in. Again, 15 options, including science, were provided. Of the top band students, 25.13% (one in four) nominated science in that group, as did 16.5% of middle band students (just under one in six) and 9.71% of bottom band students (about one in ten).

The following generalisations can be made about student responses to the items across the state. The higher the students’ achievement band:

• the greater was their agreement with the propositions in Items A, C & D

• the greater was their disagreement with the proposition in Item B that science was the hardest subject they studied

• the greater was the proportion of students nominating science as one of their three options for Items E & F.

5.2.2 Prediction two: Year 10 achievement

The 2011 Year 8 cohort of students provided the 2013 Year 10 results, the 2012 Year 8 cohort provided the 2014 Year 10 results, and the 2013 Year 8 cohort the 2015 Year 10 results. Schools transcribed onto the proforma the grade patterns endorsed by the Board for each year from 2009. Data from Year 10 results were used in conjunction with Year 8 EV results to provide findings in relation to prediction two. Data from 2012 to 2015 were aggregated here for the purpose of interschool comparison.
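The cohort tracking used for predictions two and three is a fixed offset from the year in which a cohort sat the Year 8 EV test: two years to Year 10 results and four years to Year 12 completions. A minimal sketch follows; the function name is hypothetical.

```python
def later_years(year8_test_year):
    """Calendar years in which a Year 8 cohort reaches Year 10 and Year 12."""
    return {
        "year8": year8_test_year,
        "year10": year8_test_year + 2,
        "year12": year8_test_year + 4,
    }


if __name__ == "__main__":
    for year in (2011, 2012, 2013):
        print(later_years(year))
```

On this mapping, the 2011 Year 8 cohort is the only one whose Year 12 completions (2015) fall within the period for which schools supplied data.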

This researcher’s assumption was that the impact of syllabus changes (introduced in 2003) and the introduction of the EV test in 2007 on formative practices would have been institutionalised into school practices by 2011 and continued up until 2014, after which a new syllabus became the basis for EV testing. Correlation statistics reported in Section 5.4 were applied on that assumption.

5.2.3 Prediction three: Year 12 engagement

Prediction three involves the proportions of students at a school completing the Year 11 and 12 (senior) science courses offered at the school. A student could take from one to three of the following four courses, depending on the size of the school and resources available to it: Biology, Chemistry, Earth and Environmental Science, and Physics. Many students traditionally took one or two of these courses. It was very rare for a school to provide students with three science courses in Year 12. A fifth course, Senior Science, was an option for students not wanting to undertake further study in science after school. All five courses were developed by the Board. Some schools offered in the senior years courses in science they had developed and had endorsed by the Board, but none of the case study schools reported offering additional Board-endorsed courses in the years of interest for this project.

Year 12 completions from 2011 to 2015 were provided by schools on the proforma. Only Year 12 completions for 2015 were directly comparable with the Year 8 cohort that sat the EV test in 2011 (Year 8 results were only available in SMART from 2011 to 2014). Nevertheless, data for Year 12 completions for the years 2012 to 2015, inclusive, are provided in the tables and were considered in assessing the degree of support for prediction three. Students choose their subjects for study in Years 11 and 12 half-way through Year 10. In the experience of this researcher, the great majority of students complete the subjects they choose then. As well, given that the EV test had been in place since 2007 for all Year 8 students, any impact of the EV program and teacher use (or not) of formative practices on later engagement would, arguably, have occurred before that. As for Year 10 results, Year 12 completions from 2012 to 2015 were used in the correlation analyses reported in Section 5.4.

This section identified and described achievement and engagement data provided by teachers in case study schools. The next section will endeavour to show that the schools in which students had the greatest exposure to formative practices (WAE schools) were able to sustain better than expected achievement and higher levels of engagement with science beyond Year 8. The quantitative data were supported where appropriate with qualitative evidence from the assessment-related work narratives to support the credible attribution of self-regulation and autonomy to learners in those schools. The data provided by schools and narrative evidence will be discussed in the context of paired school comparisons reported in the next section, Section 5.3.

5.3 Compared case study schools

Paired schools with the same (or closely matched) SEA scores are argued to have students with equivalent collective learning potentials by virtue of the sociocultural capital they bring to school. The residual is taken to be the measure of the extent to which exposure to formative practices has enhanced students’ scientific literacy and produced an EV result that is above (or below) expectation.

The survey results provide a measure of student satisfaction with their school science experience. In Section 5.2.1, the connection between higher achievement and level of satisfaction with their school science experience was established for the case study schools. This satisfaction is attributed to interest in and motivation to continue with learning science and is the benchmark for engagement (as defined for the purposes of this thesis in Section 3.5.5) attained at the end of Year 8.

The assumption of self-regulation and learning autonomy is based on differential evidence of later achievement and engagement in comparable schools as explained earlier.

Tables with data about achievement and engagement for each of the schools in the paired comparisons below are sourced from data tables presented in Appendix I. The numbers in those five tables were either transcribed directly from teacher-completed proformas or derived from them as explained in the keys associated with each table in Appendix I.

5.3.1 Pair ONE: PCWAE1 and MCWAE1

PCWAE1 is a relatively small provincial school in the west of the state. Two small Year 7 parallel, ungraded classes are formed each year (around 15 students in each class). Students remain in those classes for the first two years of secondary school. The school had the largest positive residual of all schools in the state. Three science teachers, including the relieving deputy principal and relieving head teacher science, attended the interview, which went for over an hour. The completed proforma was brought to the interview. A selection of assessment artifacts was provided both during and after the interview. The school had engaged with VALID10 and planned to continue doing so.

MCWAE1 is a metropolitan high school to the west of the Sydney CBD. Three Year 7 graded classes are formed each year using feeder school data. Feeder school data are mostly literacy and numeracy based. Students remain in those classes with few changes until the end of Year 8 at least. Thirty percent of its student intake are from refugee families and some of them have had no formal schooling. Many students attend an Intensive Language Centre before entering secondary schooling at the school. Four teachers, including the head teacher, attended the interview. The teachers had accessed SMART data before the interview and the proforma was completed. Assessment artifacts were provided during the interview and some were forwarded later as well. The school had engaged with VALID10 and planned to continue doing so.

Table 5.3 contains information about achievement and engagement at these two schools. It was compiled from data provided in Appendix I, Tables K.1 & K.3.


Table 5.3 Pair ONE selected statistics

School      Y8 ACH                  Y8 ENG                 Y10 ACH
            Band  SCH (%)  STA      ALL / 12  TOP / 16     Grades  SCH (%)  STA

PCWAE1 (EV = 89.95 ± 0.79; SEAS = 2.7 ± 0.22; RES = 2.68 ± 0.38)
            T     29       156      10        13           A-B     33       87
            B     2        15                              D-E     23       88

MCWAE1 (EV = 82.14 ± 1.91; SEAS = 2.8 ± 0.46; RES = 1.85 ± 0.48)
            T     7        38       1         1            A-B     10       27
            B     27       20                              D-E     69       265

Y8 ACH = the proportion of Year 8 students in the top (T) and bottom (B) achievement bands. SCH (%) = school proportions represented as a percentage. STA = the proportion of students at the school expressed as a ratio (school proportion as a % over the state proportion as a %) relative to the state designated as 100. Y8 ENG = the rank order of schools based on engagement scores. ALL = all three achievement bands / 12 = the rank out of 12 non-selective schools based on the total survey scores for students at a school (the state figure is counted as a school) / TOP = top achievement band students / 16 = school rank for top band students in the 16 case study schools for which data had been provided (the state figure is counted as a school). Y10 ACH = the proportion of Year 10 students attaining grades A and B and D and E. SCH (%) = the proportion of students at a school with grades A&B and D&E represented as a percentage. STA = the proportion of students at the school as a ratio (school proportion as a % over the state proportion as a %) relative to the state designated as 100.

YEAR 8 ACHIEVEMENT AND ENGAGEMENT

From Table 5.3 it is clear that at the end of Year 8, compared to the state, the proportion of PCWAE1 students attaining top band EV results was very much higher. MCWAE1, on the other hand, had a lower proportion of its students in the top band compared to both PCWAE1 and the state, which is consistent with prediction one in terms of achievement.

When we look at engagement, as measured by responses to the six items in the student survey (Appendix I, Tables K.5A–K.5C), at the end of Year 8, when compared to MCWAE1 students, PCWAE1 students in the context of the 11 non-selective entry case study schools:

• were less enthusiastic about wanting to study science courses in the senior years (ranked 5th; MCWAE1 students ranked 1st);
• found science easier (3rd compared to 11th);
• enjoyed their primary school science less (9th compared to highest);
• enjoyed their secondary science classes less (9th compared to highest);
• fewer had nominated science in their three favourite subjects (10th compared to highest); and,
• fewer had nominated science as one of the three subjects they learnt most in (9th compared to highest).

PCWAE1’s highest ranking on any of the items was 3rd for wanting to study science in the senior years (Item A). However, the score on which that ranking was based was below the state score, as were the other five scores. This was an unexpected result given that a positive school experience in science up to the end of Year 8 was associated with high achievement (reported in Section 5.2.1). This anomaly will be discussed in the summative comments part of this section.

YEAR 10 ACHIEVEMENT

By the end of Year 10 the distribution of results at both PCWAE1 and MCWAE1 had changed when compared to the state. PCWAE1 top band numbers decreased from three for every two across the state to nine for every ten across the state. In its highest band results, MCWAE1 went from having two students for every five in the state to one for every four. The reduction in the proportion of students in the top band was far greater for PCWAE1 than for MCWAE1.

In the bottom band, PCWAE1 numbers, compared to the state, increased from one for every seven to nine for every ten in the state. MCWAE1 numbers also increased, from one for every five to more than five for every two in the state. This result still had PCWAE1 with better overall results in science than MCWAE1 and confirmed prediction two.


YEAR 12 ENGAGEMENT

Table 5.4 shows the proportions of Year 12 science course completions at both schools. A higher proportion, relative to the state, of PCWAE1 students complete science courses by the end of Year 12 than do students at MCWAE1. These figures confirm prediction three. Students are asked in the middle of Year 10 to nominate courses for the final two years of schooling. Given the low rating by PCWAE1 students of their school science experience at the end of Year 8, the expectation would be that very few students would nominate to do science courses in the last two years of schooling. The apparent contradiction will be discussed in the summative comments part of this section.

Table 5.4 Year 12 science course completions (2013-2015 averages)

Subject (state proportion %)             PCWAE1             MCWAE1
                                         School  State      School  State
Biology (28.5)                           40      140        32      112
Chemistry (18)                           22      122        12      67
Earth and Environmental Science (2.4)    N/A     N/A        N/A     N/A
Physics (16)                             22      138        14      88
Senior Science (10.4)                    50      481        21      202

School = proportion of students relative to English at the school (relative to 100). State = proportions of students at the school (relative to the state set at 100) completing Year 12 courses.
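The State column in Table 5.4 can be reproduced from the School column and the state proportion shown in parentheses after each subject. The sketch below is illustrative only: the function name `relative_index` is hypothetical, and the half-up rounding rule is an assumption inferred from the published figures (for example, 22 / 16 gives 137.5, which the table reports as 138), not a procedure stated in the thesis.

```python
from decimal import Decimal, ROUND_HALF_UP


def relative_index(school_pct, state_pct):
    """Express a school percentage as an index with the state set at 100.

    Assumption: results are rounded half up to the nearest whole number.
    """
    ratio = Decimal(str(school_pct)) / Decimal(str(state_pct)) * 100
    return int(ratio.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


# PCWAE1 rows of Table 5.4: subject -> (school %, state %)
pcwae1 = {
    "Biology": (40, 28.5),
    "Chemistry": (22, 18),
    "Physics": (22, 16),
    "Senior Science": (50, 10.4),
}

pcwae1_index = {subject: relative_index(*pair) for subject, pair in pcwae1.items()}
print(pcwae1_index)
# {'Biology': 140, 'Chemistry': 122, 'Physics': 138, 'Senior Science': 481}
```

An index above 100 means the school's completion rate for that course is higher than the state's; 481 for Senior Science means PCWAE1's rate was almost five times the state rate.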

COMPARATIVE SUMMATIVE COMMENTS FOR PAIR ONE (PCWAE1 AND MCWAE1)

The following discussion of findings in relation to the predictions for the schools compared here and their contribution to answering research question three draws on the school data provided above and refers to the assessment-related work narratives for the schools in Appendix H.

The expectation from the findings in Section 5.2.1 was that PCWAE1’s relatively high results would have been accompanied by a positive view of their school science experience. The assessment-related work narratives (see Appendix H) for the two schools have much in common. Both reveal a group of teachers that give a high priority to helping students recognise the science in their everyday lives and the teachers go out of their way to provide a diversity of experiences for their students, both at school and beyond the school gates, including showing students places where science is the basis for the work being done there. This diversity of experiences is used as the basis for teaching activities that provide evidence of learning (in the form of written and oral reports) as well as the traditional pen-and-paper tests that teachers use to produce formal assessments for the purpose of reporting to parents.

Teachers at both schools are very aware of the limited literacy skills possessed by many of their students and they actively promote the use of appropriate scientific terms in student talk. Whole class discussion is an important strategy and students are encouraged to learn and use the vocabulary of science relevant to the topics being studied at school. Learning intentions and success criteria are prominent in the work they do with students and teachers make use of them to inform feedback to students. Group work is encouraged and supported. PCWAE1 appears to provide more opportunities for peer assessment (e.g. feedback to each other on a toy that students make and present to the class) than MCWAE1. Both schools make use of older students to mentor younger students. Teachers meet regularly to prepare teaching programs collaboratively, work through assessment issues and share marking.

Given the above, and the absence of school factors negatively impacting on classroom environments (absenteeism is low, student relationships are reported as being good), the difference in student rating of their school experience seems to be related to attitudes to science that students bring to school. Evidence to support this was provided by students in answer to Item C, which asked about their enjoyment of classes in primary school where science was the focus. PCWAE1 students ranked 9th out of 11 here, and MCWAE1 students ranked 1st. The comparable question (Item D) for secondary science classes produced a similar result (9th compared to first). In response to a question asking which three subjects students thought they learnt most in (of 15 provided, including science), PCWAE1 ranked science 9th and MCWAE1 listed it the most.

No other evidence about student or community attitudes to science was purposefully collected in this project. The anecdotal evidence from teachers at PCWAE1 was that students at that school thought the teachers were tough on students and followed up on work set. This response was provided when teachers who had read the survey results before the interview had then asked students about it.

It appears that PCWAE1’s relatively high aspiration to do senior science courses expressed at the end of Year 8 (3rd in the ranking, but still below the state’s score) did come about. A higher proportion of students at PCWAE1 completed Year 12 science courses than their counterparts at MCWAE1. It may be that the higher take-up of senior science courses at the provincial school was a pragmatic response to the perception of more job opportunities related to science than other subject choices. However, teachers at both schools had been providing that information to students through excursions to places where science was a required qualification for the work observed (medicine, agriculture and universities in the case of PCWAE1). Teachers at MCWAE1 mentioned high parental expectations and support for students to do well at school, including buying science textbooks to support independent work by students on science at home.

5.3.2 Pair TWO: MCAE2 and MCWBE3

MCAE2 is a metropolitan school between Hornsby and Newcastle city. It establishes three or four mixed ability classes for students using feeder primary school data. One selective entry class is established for high-achieving students with a particular interest in STEM. To gain entry to that stream students sit an entry test set by the school and/or are invited. Students remain in their classes until the end of Year 8; the special class ends at the end of Year 9 when those students have completed Stage 5. Only the head teacher was at the interview. Assessment artifacts and the proforma were provided later. The school had engaged with VALID10 and planned to continue doing so.

MCWBE3 is a metropolitan school to the south-west of the Sydney CBD. The school provides for six Year 7 classes using feeder school data. A top stream of two large graded classes is established and a second stream of four ungraded classes. The classes remain largely unchanged until the end of Year 8. Only the head teacher was present at the interview and assessment artifacts and results were sent later. The staff were not prepared to engage with VALID10 at the time of interview.

The two schools have comparable SEA scores but statistically significantly different residuals as shown by the data provided in Table 5.5.

Table 5.5 Pair TWO selected statistics

School      Y8 ACH                  Y8 ENG                 Y10 ACH
            Band  SCH (%)  STA      ALL / 12  TOP / 16     Grades  SCH (%)  STA

MCAE2 (EV = 85.45 ± 0.48; SEAS = 3.9 ± 0.30; RES = .03 ± 0.42)
            T     16       86       8         12           A-B     28       74
            B     7        52                              D-E     25       96

MCWBE3 (EV = 82.85 ± 0.29; SEAS = 4.0 ± 0.25; RES = -1.69 ± 0.13)
            T     12       65       12        14                   nil      nil
            B     12       89                                      nil      nil

Y8 ACH = the proportion of Year 8 students in the top (T) and bottom (B) achievement bands. SCH (%) = school proportions represented as a percentage. STA = the proportion of students at the school expressed as a ratio (school proportion as a % over the state proportion as a %) relative to the state designated as 100. Y8 ENG = the rank order of schools based on engagement scores. ALL = all three achievement bands / 12 = the rank out of 12 non-selective schools based on the total survey scores for students at a school (the state figure is counted as a school) / TOP = top achievement band students / 16 = school rank for top band students in the 16 case study schools for which data had been provided (the state figure is counted as a school). Y10 ACH = the proportion of Year 10 students attaining grades A and B and D and E. SCH (%) = the proportion of students at a school with grades A&B and D&E represented as a percentage. STA = the proportion of students at the school as a ratio (school proportion as a % over the state proportion as a %) relative to the state designated as 100.


YEAR 8 ACHIEVEMENT AND ENGAGEMENT

MCAE2 results are positively skewed with a higher proportion of students in the top band compared to the bottom band; MCWBE3 has a higher proportion of its students (relative to the state) in the bottom band. The comparison here confirms prediction one in relation to achievement.

In relation to engagement with science at the end of Year 8, compared to MCWBE3 students, MCAE2 students:

• were slightly less enthusiastic about taking science in the senior years (8th compared to 7th out of the 11 non-selective schools)
• were slightly more likely to disagree that science was the hardest subject they studied (5th compared to 6th)
• liked their primary science classes less (8th compared to 6th)
• liked their secondary science classes more (7th compared to 10th)
• were proportionately more likely to include science as one of their three favourite subjects (7th compared to 9th)
• were proportionately more likely to include science in the group of three subjects they thought they learnt most in (8th compared to 11th).

These figures show that MCAE2 students had a slightly more positive view of their school science experience than students at MCWBE3.

YEAR 10 ACHIEVEMENT

MCWBE3 did not provide any data for Year 10, so no comparison can be made here. As for the first pair of schools, results for MCAE2 changed from Year 8 to Year 10. The proportion (relative to the state) of top band students at MCAE2 went from 17 compared to 20 in the state down to three compared to four in the state. The proportion of students in its bottom band rose from one for every two in the state to almost the state figure (almost one for one).


YEAR 12 ENGAGEMENT

Table 5.6 shows the proportions of students completing Year 12 senior science courses at the two schools. MCAE2 has proportionately more of its students completing Biology, Chemistry and Physics courses compared to MCWBE3. These data confirm prediction three.

Table 5.6 Year 12 science course completions (2013-2015 averages)

Subject (state proportion %)             MCAE2              MCWBE3
                                         School  State      School  State
Biology (28.5)                           57      200        21      74
Chemistry (18)                           19      106        7       39
Earth and Environmental Science (2.4)    N/A     N/A        N/A     N/A
Physics (16)                             10      63         9       56
Senior Science (10.4)                    N/A     N/A        N/A     N/A

School = proportion of students relative to English at the school (relative to 100). State = proportions of students at the school (relative to the state set at 100) completing Year 12 courses.

COMPARATIVE SUMMATIVE COMMENTS FOR PAIR TWO (MCAE2 AND MCWBE3)

Of interest here was the fact that MCAE2 actively promoted itself as a STEM school with a particular interest in the Biosciences. In that respect it appears to be succeeding, as it performed better in science than the comparable school it was paired with. However, given the special status of science at the school, the differential on engagement with science by students at the two schools is not particularly marked. As well, students at the end of Year 8 at MCAE2 were only slightly less enthusiastic about taking senior science classes than were students at MCWBE3.

5.3.3 Pair THREE: PCWAE2 and MCWBE5

PCWAE2 is a relatively small coeducational regional school in the central-west of the state. The school establishes three Year 7 classes each year from students in their feeder primary schools. The classes are initially ungraded, but after six months students are graded using science assessment results. Classes are reviewed every six months and changes made depending on assessment results. This continues until halfway through Year 10. The science head teacher was the main contributor at the interview and had moved from a metropolitan coeducational school to take up that position before 2011. There are four full-time and two part-time science teachers at the school. A full-time laboratory assistant and a part-time agriculture assistant support the work of the science department. One of the science teachers was trained as an agriculture teacher. The head teacher said she had been involved over the years in junior and senior secondary science syllabus consultation processes as well as reviewing items for inclusion in EV tests. Another science teacher who had been at the school for several years joined the interview towards the end. Artifacts of Year 7 and Year 8 assessment-related work were provided and the proforma was completed and forwarded after the interview. The school did the first of the VALID10 tests, but at the time of the interview it was not planning to continue with it.

MCWBE5 is a medium-sized metropolitan coeducational high school. Over the years of interest, it provided from four to five Year 7 classes each year depending on the intake numbers from feeder primary schools. One class is a combined Year 7-8 class that has a gifted and talented student intake of around 15 students each year. Students wanting to enter this class sit an entrance test set by the high school. A second class of high achieving independent learners (identified by their feeder schools) was also established each year. Two or three smaller ungraded classes were then created from the remainder of the intake. These classes are retained mostly unchanged until the end of Year 8. The science head teacher had occupied the position throughout the period of interest and was the only science staff member met with and interviewed at this school. His previous school was a provincial high school in the west of the state. There were six full-time science teachers and one part-time science teacher at the school. Artifacts of Year 7 and Year 8 assessment-related work were provided and the data proforma was completed and forwarded after the interview. The science department had no plans at the time of interview to take up VALID10. Table 5.7 provides relevant data about achievement at the two schools.


Table 5.7 Pair THREE selected statistics

School      Y8 ACH                  Y8 ENG                 Y10 ACH
            Band  SCH (%)  STA      ALL / 12  TOP / 16     Grades  SCH (%)  STA

PCWAE2 (EV = 84.79 ± 0.31; SEAS = 1.8 ± 0.45; RES = 1.69 ± 0.21)
            T     12       65       7         11           A-B     17       45
            B     12       89                              D-E     37       142

MCWBE5 (EV = 82.54 ± 0.56; SEAS = 2.1 ± 0.11; RES = -1.48 ± 0.28)
            T     13       70       3         3            A-B     29       76
            B     18       133                             D-E     24       92

Y8 ACH = the proportion of Year 8 students in the top (T) and bottom (B) achievement bands. SCH (%) = school proportions represented as a percentage. STA = the proportion of students at the school expressed as a ratio (school proportion as a % over the state proportion as a %) relative to the state designated as 100. Y8 ENG = the rank order of schools based on engagement scores. ALL = all three achievement bands / 12 = the rank out of 12 non-selective schools based on the total survey scores for students at a school (the state figure is counted as a school) / TOP = top achievement band students / 16 = school rank for top band students in the 16 case study schools for which data had been provided (the state figure is counted as a school). Y10 ACH = the proportion of Year 10 students attaining grades A and B and D and E. SCH (%) = the proportion of students at a school with grades A&B and D&E represented as a percentage. STA = the proportion of students at the school as a ratio (school proportion as a % over the state proportion as a %) relative to the state designated as 100.

The SEA scores for the two schools were very low, indicating that there were many more socio-educationally disadvantaged students at the two schools than advantaged students. The SEA scores were comparable but their residuals were statistically significantly different.
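One rough way to read the ± values in Table 5.7 against this statement is to check whether the two schools' intervals overlap. The sketch below is a heuristic only, and it rests on assumptions that this section does not confirm: that each ± value is a margin of error around the estimate, and that non-overlapping intervals can stand in for a significant difference. The thesis may have used a different test, and the helper names `interval` and `overlaps` are hypothetical.

```python
def interval(estimate, margin):
    """Return the (low, high) bounds for estimate ± margin."""
    return (estimate - margin, estimate + margin)


def overlaps(a, b):
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Values transcribed from Table 5.7
seas = {"PCWAE2": interval(1.8, 0.45), "MCWBE5": interval(2.1, 0.11)}
res = {"PCWAE2": interval(1.69, 0.21), "MCWBE5": interval(-1.48, 0.28)}

# Assumption: overlapping SEA intervals are read as "comparable" intakes
seas_comparable = overlaps(seas["PCWAE2"], seas["MCWBE5"])

# Assumption: non-overlapping residual intervals are read as a real difference
residuals_differ = not overlaps(res["PCWAE2"], res["MCWBE5"])

print(seas_comparable, residuals_differ)  # True True
```

Under these assumptions the SEA intervals overlap while the residual intervals do not, which agrees with the description of the pair as comparable on SEA but different on residuals.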

YEAR 8 ACHIEVEMENT AND ENGAGEMENT

At the end of Year 8, it was clear that PCWAE2 was outperforming MCWBE5 when it came to EV results (see Table 5.7). While MCWBE5 had more students in the top band, it had a much greater proportion of its students in the bottom band than did PCWAE2.


In relation to engagement, of the 11 non-selective case study schools, at the end of Year 8, compared to MCWBE5 students, PCWAE2 students:

• were slightly less keen to study science in the senior years (ranked 3rd compared to 2nd)
• found science slightly harder (8th compared to 7th)
• liked science at primary school less (7th compared to 2nd)
• liked science classes less in secondary school (6th compared to 3rd)
• had the lowest proportion of students nominating science in their group of three favourite subjects (MCWBE5 ranked 3rd)
• had a lower proportion of their students nominating science as one of the three subjects they thought they learned most in (5th compared to 3rd).

These figures are inconsistent with prediction one and against the pattern discussed in Section 5.2.1 (students at the WAE school should be more positive about their school science experience).

YEAR 10 ACHIEVEMENT

In the two years from Year 8 to Year 10, compared to the state, PCWAE2’s proportion of students in the top band had declined and its proportion in the bottom band had increased. MCWBE5’s proportion of top band students had increased and its bottom band proportion had decreased. These results were inconsistent with prediction two. However, there is a question mark over the assumption of comparability of the Year 10 results because of the pattern change in grades from 2011 (state-wide exam related) to 2012 (when grades were school determined). The proportion of A+B+C grades in MCWBE5 went from 72% (up to 2011) to 82% (2012 to 2015). In that same time span, PCWAE2’s results were effectively unchanged (they went from 62% to 63%).

YEAR 12 ENGAGEMENT

Table 5.8 shows the proportions of students at the two schools who completed science courses at the end of Year 12.

The proportions of HSC science course completions over the three years compared to the state were the same in both schools for Biology. PCWAE2 had proportionately fewer completions in Physics and Chemistry than MCWBE5. PCWAE2 had more of its students complete Senior Science than MCWBE5. The finding in relation to prediction three for this pair of schools was inconclusive (see the summative comments below). However, given the low SEA scores for both schools, the proportions of students completing senior science courses were above state figures in Biology, well above for Senior Science, but close to state figures in Chemistry and Physics (proportionately more WBE students than WAE students completed Physics and Chemistry).

Table 5.8 Year 12 science course completions (2013-2015 averages)

Subject (state proportion)               PCWAE2             MCWBE5
                                         School  State      School  State
Biology (28.5)                           38      133        38      133
Chemistry (18)                           16      89         18      100
Earth and Environmental Science (2.4)    N/A     N/A        N/A     N/A
Physics (16)                             13      81         17      106
Senior Science (10.4)                    30      288        20      191

School = proportion of students relative to English at the school (relative to 100). State = proportions of students at the school (relative to the state set at 100) completing Year 12 courses.

COMPARATIVE SUMMATIVE COMMENTS FOR PAIR THREE (PCWAE2 AND MCWBE5)

The discussion of findings for the pair of schools compared here and their contribution to answering research question three draws on the school data provided above and refers to the assessment-related work narratives for the schools in Appendix H. The narratives for the two schools reflect their very different priorities for science learning. They also provide evidence of school-factor differences, particularly in relation to summative assessment.

In the WAE school, the focus was on preparing students to undertake senior science courses, and students’ class placement was reviewed each semester in the light of assessment performance. The science department’s staff were active participants in the school-wide literacy program and provided one period of science per timetabling cycle to it. Science teachers also provided students with specific science vocabulary homework linked to the topics being studied. Science teaching was highly differentiated and sensitive to student literacy needs. Talk comes first, then teacher-directed reading (by students to the class), followed by writing.

Considerable laboratory-based practical work is also undertaken by students in the name of learning the skills of working scientifically. What is talked about and written is highly managed by teachers. Whilst other tasks contribute to overall assessment results, there are two formal tests per year. Rubrics for scoring students’ work were prepared to reflect learning intentions and success criteria described in the Board syllabus. The rubrics were provided to students before, during and after assessment and feedback is provided on the extent to which intentions were met. Students do a major research project each year and practical tests, evidence from which contributes to students’ overall assessment in science. The research task was tightly constrained by teachers and a detailed scaffold for the final report was provided.

In the WBE school, the priority was for students to enjoy their school science experiences. The focus for teachers was on providing a diversity of rich science experiences, some arising spontaneously out of student interest, within and beyond the school boundary. At the time of interest for this project, there did not seem to be a strong emphasis on using literacy strategies in science. Assessment was likely to be negotiated with students, peer assessment was used to provide feedback on one of the tasks (a model-making exercise), and there was an opportunity for self-assessment at the end of each topic. Evidence of learning was collected from a variety of tasks and there was a good deal of individual teacher judgment involved when it came to preparing reports for parents (and students). Summative assessment was a low-key affair (deliberately) and students were not shifted around on the basis of results until the end of Year 8.


The overall negative impression recorded by PCWAE2 students is inconsistent with the engagement aspect of prediction one. The learning programs at both schools encourage the use of contexts to support teaching and learning. At PCWAE2, mention was made of agriculture and biotechnology as contexts mostly used. The WAE school has a high-stakes, summative assessment approach which has been shown in the research literature (Harlen & Deakin-Crick, 2002) to impact negatively on the motivation to learn of students with poor learning histories. As the WAE school here is a provincial school, the possibility of student socio-cultural factors (similar to those operating in PCWAE1 above) impacting the engagement scores should not be overlooked.

By Year 10, the achievement pattern relative to state figures at the WAE school is below that of the WBE school, and completions of Senior Science courses two years after that were not too dissimilar at the two schools. The achievement findings at the end of Year 10 are inconsistent with prediction two (issues with comparing Year 10 results notwithstanding). However, the findings in relation to prediction three are inconclusive. Overall, a higher proportion of PCWAE2 students complete science courses, but a smaller proportion complete the two most demanding courses, Chemistry and Physics.

Possible explanations for the unexpected findings in relation to the predictions will be discussed in the summary section of this chapter (Section 5.5).

5.3.4 Pair FOUR: MGFSAE2 and MGFSWBE1

Three fully selective schools were included for case study. They were MCFSWAE1, MGFSAE2 and MGFSWBE1. All three were metropolitan; the first was a coeducational school and the latter two were girls’ schools. The two girls’ schools were the focus for paired comparison in this section. However, commentary and comparisons were made involving all three schools where considered useful to understanding similarities and differences relevant to the predictions being tested.

The head teachers at the three schools of interest in this section were at those schools at the time of interest for this project (2011–2014). All three schools each year established from four to five Year 7 classes. The classes were established using selective-school test results and feeder school information about the students. From the point of view of science, the classes were effectively ungraded. Only the head teachers from the WAE and AE schools were interviewed. The head teacher and seven science staff members were involved in the interview at the WBE school. Both schools provided a range of assessment-related artifacts for Years 7 to 10. The HT science at the WBE school brought a partially completed results proforma to the interview. The HT at the AE school had completed the proforma for the interview. The WBE and AE schools had both engaged with VALID10, and the WAE school had no plans for doing so at the time of interview.

At least 94% of students in all three schools were in the top achievement band for EV results. None of the three schools had any students achieving lower than the middle achievement band. Students at fully selective entry schools are there because of their outstanding performance on pen-and-paper tests of general ability, literacy (including writing) and numeracy. The NAPLAN predictors for their EV results put them in the reverse order to that established by their residuals (see Table 5.9).

The SEA scores for the three schools were not comparable (see Table 5.9) and, all other factors being equal, should have delivered MGFSWBE1 the best EV result; it came 3rd. MCFSWAE1, which should have been 2nd, was 1st, ahead of MGFSAE2, which came 2nd.

The international TIMSS and PISA test results do not reveal any gender bias in achievement in the first few years of secondary schooling in NSW schools (Thomson, De Bortoli et al., 2017; Thomson, Wernert et al., 2017). However, there is international research evidence that adolescent girls in the most developed nations are less engaged with science than adolescent boys are (Bøe, Henriksen, Lyons, & Schreiner, 2013; Sjøberg & Schreiner, 2010). For this reason the comparisons made here will focus on the two girls’ schools.


YEAR 8 ACHIEVEMENT AND ENGAGEMENT

Table 5.9 provides some data relevant to making comparisons and findings relevant to the predictions. The EV data for the three schools was sourced from Table K.1 in Appendix I.

Table 5.9 Pair FOUR selected statistics

School      Y8 ACH                     Y8 ENG       Y10 ACH
            Measure  SCH (%)  STA      TOP / 16     Grade  SCH (%)  STA

MCFSWAE1 (EV = 101.97 ± 0.71; SEAS = 8.6 ± 0.16; RES = 1.19 ± 0.29)
            TEV      95       511      4            A      63       485
            TER      85       419
            TWS      80       412
            TCS      89       397

MGFSAE2* (EV = 101.00 ± 0.65; SEAS = 8.3 ± 0.16; RES = -0.09 ± 0.44)
            TEV      95       511      15           A      83       639
            TER      85       419
            TWS      76       392
            TCS      89       397

MGFSWBE1* (EV = 97.99 ± 0.54; SEAS = 8.9 ± 0.14; RES = -1.42 ± 0.02)
            TEV      94       505      8            A      85       654
            TER      70       345
            TWS      78       402
            TCS      93       415

Y8 ACH = the proportion of Year 8 students. SCH (%) = school result. TEV = proportion of overall EV result in the top band / TER = proportion of results in the top band extended response tasks. TWS = proportion in the top band for working scientifically. TCS = proportion of results in the top band for communicating scientifically. STA = ratio of top band school achievement relative to the state score at 100 (ratio obtained by dividing school % proportion by state % proportion). Y8 ENG = the rank order of schools based on engagement scores. TOP / 16 = the rank out of 16 (the state figure is counted as a school). Y10 ACH = proportions of A grades at the school and relative to the state. SCH % = the proportion of Year 10 students attaining A grades. STA = the ratio of A grades at the school relative to the state set at 100 (ratio produced by dividing the school % proportion by the state % proportion). * Girls schools.

In this comparison, the AE school with the lower SEA score (8.3 ± 0.16) had the better EV result (101.00 ± 0.65 compared to 97.99 ± 0.54) and a higher residual than the WBE school (-0.09 ± 0.44 compared to -1.42 ± 0.02). The lower SEA score for the AE school is strongly suggestive of greater value adding to its EV result than if it had a comparable SEA score. From this perspective, the achievement component of prediction one is satisfied.

The greatest achievement discrepancy between the AE and WBE schools is in the extended response report category, where the proportion of girls in the top band at the WBE school was 70% compared to 85% at the AE school. This will be discussed in the summative comments part for this section.

The sources of data on relative engagement were Tables K.5A, B & C in Appendix I. The comparisons below include the relative order of schools (in parentheses). The survey results were the measure of student engagement for science at the end of Year 8. Only top band students in each of the 15 case study schools will be compared here. At the end of Year 8, girls at the:

• WBE (4th) school were more positive about taking a senior science subject (Item A) than were their AE (9th) counterparts (and both school scores were below the state figure)
• AE (13th) and WBE (14th) schools thought science harder than their counterparts across the state (Item B) and both were above the state scores in their agreement
• WBE (9th) and AE (14th) schools enjoyed their primary school science experiences in the order listed; the AE school ranking was below the state (Item C)
• WBE (12th) and AE (15th) schools ranked in the order listed for enjoyment of their secondary school science experiences combined with the proportion including science in their three favourite subjects; both schools’ combined scores were below the state (Items D plus E score)
• WBE (6th) and AE (12th) schools listed science in the group of three subjects that students thought they learnt the most in, in the order listed; the AE school’s score was below the state (Item F).


It is clear that for MGFSAE2, high achievement in science is not associated with positive attitudes toward science. Possible reasons for this will be canvassed in the summative comments part for this section.

YEAR 10 ACHIEVEMENT

The school data provided by the two schools included the levels/grades awarded on the basis of the pattern of results from the external examination at Year 10. The last exam was in 2011. There was a discontinuity between the results before and after the Year 10 exam ended, thus this researcher was reluctant to draw any conclusions about achievement changes relative to Year 8 and remained silent about prediction two for these schools.

YEAR 12 ENGAGEMENT

Table 5.10 shows the proportions of students at the three schools completing Year 12 science courses. Overall the WBE school’s proportions of Year 12 completions were larger than the AE school’s completions. This was contrary to prediction three.

Table 5.10 Year 12 science course completions (2013-2015 averages)

                                 MCFSWAE1         MGFSAE2          MGFSWBE1
Subject (state proportion)       School  State    School  State    School  State
Biology (28.5)                   34      119      20      70       22      77
Chemistry (18)                   70      389      54      300      58      322
Earth and Environ. Sci. (2.4)    N/A     N/A      N/A     N/A      N/A     N/A
Physics (16)                     46.4    288      23      144      28      175
Senior Science (10.4)            5.8     87       N/A     N/A      N/A     N/A

School = proportion of students relative to English at the school (relative to 100).
State = proportions of students at the school (relative to the state set at 100) completing Year 12 courses.


COMPARATIVE SUMMATIVE COMMENTS FOR PAIR FOUR (MGFSAE2 AND MGFSWBE1)

The following discussion of findings in relation to the predictions for the two girls’ schools compared here and their contribution to answering research question three draws on the school data provided above and the assessment-related work narratives for the schools in Appendix H.

Findings in relation to the predictions are qualified because the SEA scores are not comparable. The fact that the AE school has a lower SEA score than the WBE school provides confidence that the residual difference supports the conclusion that the AE school’s EV results were better than expected due to its students’ more frequent exposure to formative practices. A review of the assessment-related work narratives for the three schools and results in other categories points to a difference in emphasis on what was valued as sources of evidence of learning, as explained in the next paragraph.

In the WBE school, students worked on cross-faculty projects and beyond the school gates. Evidence of learning was obtained from student-created models, written reports (using tightly constrained scaffolds) and group presentations supported by technology as well as traditional pen-and-paper tests. By contrast, the AE school had a strong emphasis on written evidence of learning drawn from traditional laboratory and text-based experiences mostly provided within the school boundaries. ‘Writing to learn’ was a higher priority for the AE school. Simply put, students at the WBE school did not have the same opportunities to write answers to open-ended extended response tasks as students in the AE school. This researcher suggests that differential opportunity is the main contributor to the better EV results at the AE school. Fuller accounts of the narratives for the two schools are provided in Appendix H.

Prediction three is about the proportions of students (relative to the state) completing senior science courses. The expectation from prediction three was that the AE school would have a higher proportion of its students completing Year 12 science courses than the WBE school, which was clearly not the situation here. It would appear that the AE girls’ strong dislike for science at the end of Year 8 continued and was a factor in their lower uptake of science courses in the senior years.

If we extrapolate students’ ratings of their school science experience at the AE and WBE schools from Years 8 to 10, when students make choices about whether and what science to do in the senior years, it appears here that Year 8 engagement and Year 12 engagement (senior course completions) correlate better than Year 8 achievement and later engagement. Prediction three includes the understanding that self-regulation prioritises learning over enjoyment. The links between Year 8 achievement and later engagement will be explored in the next section (Section 5.4), where the findings from statistical correlations will be reported.

That said, the narratives for the two schools suggest that the more positive attitude to science at the WBE school is related to qualitatively different learning programs. At the end of Year 8, the girls at the AE school had demonstrably better writing skills, but the girls were clearly not enjoying the science they wrote about.

5.3.5 Pair FIVE: PCWAE2 and PCWAE3

The rank ordering of schools in the state that is based on the relative size and polarity of the residual from a regression of EV results over a NAPLAN-based predictor produced an unexpected finding, which was that the proportion of provincial schools ranked in the top 20% of the state went from 9% to 56% (see Section 4.1). Seven of the 12 WAE schools that identified themselves were provincial schools (see Table 5.1), which was a coincidence but reflected that state-wide finding.

In principle, comparing a provincial and a metropolitan school, or even a full selective school, should not matter as long as the SEA scores are identical and school factors (such as attendance rates) are taken into account. This was done in the earlier analyses to compare PCWAE1 and MCWAE1, and PCWAE2 and MCWBE5. The premise is that the SEA score captures all that matters when it comes to students’ science learning potential.


This section compares three WAE provincial schools, two of which (PCWAE1 and PCWAE2) were looked at earlier in this chapter, but in the context of comparisons with other schools having the same SEA scores as the provincial schools. The third provincial school (PCWAE3) was selected for pairing with PCWAE2 because it had comparable SEA scores and comparable residuals (‘comparable’ meaning not significantly different in the statistical sense). The thumbnail sketches of PCWAE1 and PCWAE2 were provided above in the context of Pairs ONE and THREE respectively; the thumbnail sketch for PCWAE3 follows.

PCWAE3 is the largest of the three provincial schools. Each year, the school establishes four to five Year 7 classes using data from feeder schools. A single top stream class is established from the highest achievers and a bottom stream small class consists of students with the weakest literacy levels. The two or three classes in the middle have the remainder of the students allocated in no particular order. All classes are mixed ability from a science perspective. The science head teacher was the only teacher involved in the interview and had been at the school from before the period of interest. Artifacts of Year 7 and Year 8 assessment were provided at the interview and the proforma had been completed. The school had no plans to take up VALID 10 at the time of interview.

Table 5.11 provides selected data sourced from data tables in Appendix I to make comparisons relevant to addressing the three predictions.


Table 5.11 Pair FIVE selected statistics

              Y8 ACH                   Y8 ENG                 Y10 ACH
School        Band  SCH (%)  STA       ALL / 12  TOP / 16     Grades  SCH (%)  STA
PCWAE2*       T     12       65        7         11           A-B     17       45
              B     12       89                               D-E     37       142
  EV = 84.79 ± 0.31, SEAS = 1.8 ± 0.45, RES = 1.69 ± 0.21
PCWAE3*       T     12       65        11        16           A-B     24       71
              B     13       96                               D-E     7        127
  EV = 83.64 ± 0.79, SEAS = 2.0 ± 0.27, RES = 1.43 ± 0.25

Y8 ACH = the proportion of Year 8 students in the top (T) and bottom (B) achievement bands. SCH (%) = school proportions represented as a percentage. STA = the proportion of students at the school expressed as a ratio (school proportion as a % over the state proportion as a %) relative to the state designated as 100.
Y8 ENG = the rank order of schools based on engagement scores. ALL = all three achievement bands / 12 = the rank out of 12 non-selective schools based on the total survey scores for students at a school (the state figure is counted as a school) / TOP = top achievement band students / 16 = school rank for top band students in the 16 case study schools for which data had been provided (the state figure is counted as a school).
Y10 ACH = the proportion of Year 10 students attaining grades A and B and D and E. SCH (%) = the proportion of students at a school with grades A&B and D&E represented as a percentage. STA = the proportion of students at the school as a ratio (school proportion as a % over the state proportion as a %) relative to the state designated as 100.

YEAR 8 ACHIEVEMENT AND ENGAGEMENT

The data relevant to confirming the achievement component of prediction one (Table 5.11) have PCWAE2 with a statistically significantly higher EV result than PCWAE3, though only just. On the balance of probabilities (a notionally lower SEA score and higher EV result), this supports prediction one. Note that PCWAE1’s results are statistically significantly better than either of Pair FIVE’s results, but it has a statistically significantly higher SEA score than either of the two schools compared here.
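
The ‘only just’ qualification can be illustrated with a simple interval check. The sketch below is one plausible reading rather than the test actually used in this study: it assumes that each ‘±’ value in Table 5.11 is a margin of error around the estimate, and that two schools are ‘comparable’ when the resulting intervals overlap. The function name intervals_overlap is hypothetical.

```python
def intervals_overlap(est_a, margin_a, est_b, margin_b):
    """Return True if [est - margin, est + margin] for A and B overlap.

    Assumption: each '±' value is a symmetric margin of error; the thesis
    section does not state how the margins were derived.
    """
    low_a, high_a = est_a - margin_a, est_a + margin_a
    low_b, high_b = est_b - margin_b, est_b + margin_b
    return low_a <= high_b and low_b <= high_a


# Values for PCWAE2 (first) and PCWAE3 (second) as reported in Table 5.11.
ev_overlap = intervals_overlap(84.79, 0.31, 83.64, 0.79)   # EV results
seas_overlap = intervals_overlap(1.8, 0.45, 2.0, 0.27)     # SEA scores
res_overlap = intervals_overlap(1.69, 0.21, 1.43, 0.25)    # residuals

print(ev_overlap, seas_overlap, res_overlap)
```

Under this reading, the EV intervals miss each other by roughly 0.05 of a point, while the SEA score and residual intervals overlap, which is in keeping with the description of the pair as having comparable SEA scores and residuals.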

Table 5.12 records the proportions of students at each of the three achievement levels in three EV result reporting categories (ER, WS and CS are all identified in the legend). This level of comparison is warranted because the EV results from the two schools are very close (just as they were for MCFSAE2 and MCFSWBE1).

Table 5.12 PCWAE2 and PCWAE3 Year 8 EV results

                             EV %          ER %          WS %          CS %
School   SEAS  SRES  AB      sch   sta     sch   sta     sch   sta     sch   sta
PCWAE2   1.8   1.69  5-6     12    18.6    18    20.3    16    19.4    14    22.4
                     3-4     76    67.9    66    63.4    69    63.3    71    60.3
                     1-2     12    13.5    16    16.3    15    17.3    15    17.3
PCWAE3   2.0   1.43  5-6     12    18.6    15    20.3    15    19.4    14    22.4
                     3-4     75    67.9    66    63.4    68    63.3    66    60.3
                     1-2     13    13.5    19    16.3    17    17.3    20    17.3

Note. SEAS = socio-educational advantage score / SRES = school residual / AB = achievement band / EV % = proportions of students within each achievement band based on their total EV result (sch = school & sta = state) / ER % = proportions for extended response tasks / WS % = proportions for working scientifically / CS % = proportions for communicating scientifically

In three of the four reporting categories, there is a small but consistent positive skew in PCWAE2’s results. This observation was most pronounced for the extended response task category. The relatively low top band proportions of students at both schools are consistent with their relatively low SEA scores. However, the large proportions of students in the middle band and small proportions in the bottom band are testimony to effective teaching in the two schools. Possible reasons for this result will be explored in the discussion part of this section.

Table 5.13 enables comparisons of engagement for the three provincial schools. These three are included here because the achievement–engagement pattern described in Section 5.2.1 was a general one and not linked to pairs of schools with common SEA scores. The findings in that section showed that higher EV results and positive attitudes toward school science experience were associated. Engagement findings for PCWAE1 and PCWAE2 (reported on in comparisons above) were inconsistent with that general finding. In both, engagement measures were well below those of the metropolitan school each was being compared with, despite both provincial schools having better EV results than that metropolitan school.

Table 5.13 Case study school ranks based on student scores for the six items from the student survey

                PCWAE1                 PCWAE2                 PCWAE3
Item            ALL / 11  TOP / 15     ALL / 11  TOP / 15     ALL / 11  TOP / 15
A               5         11           3         8            6         10
B               3         9            8         8            4         4
C               9         11           7         7            11        15
D               9         13           6         10           8         15
E               10        8            11        13           8         14
F               9         14           5         8            10        13
AVERAGE RANK    7.5       11           6.7       9            7.8       11.8

ALL / 11 = all students at the 11 non-selective entry schools. TOP / 15 = top band achievers at the 15 case study schools. 3 & 4 are both better than the state figures
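
The AVERAGE RANK row in Table 5.13 appears to be the arithmetic mean of the six item ranks in each column. A minimal sketch of that calculation follows; the dictionary layout and names are illustrative only, and rounding to one decimal place is an assumption based on the values shown in the table.

```python
# Item ranks (Items A to F) copied from Table 5.13.
# Keys are (school, group); the layout is illustrative, not from the thesis.
ranks = {
    ("PCWAE1", "ALL"): [5, 3, 9, 9, 10, 9],
    ("PCWAE1", "TOP"): [11, 9, 11, 13, 8, 14],
    ("PCWAE2", "ALL"): [3, 8, 7, 6, 11, 5],
    ("PCWAE2", "TOP"): [8, 8, 7, 10, 13, 8],
    ("PCWAE3", "ALL"): [6, 4, 11, 8, 8, 10],
    ("PCWAE3", "TOP"): [10, 4, 15, 15, 14, 13],
}


def average_rank(values):
    """Mean of the item ranks, rounded to one decimal place (assumed)."""
    return round(sum(values) / len(values), 1)


averages = {key: average_rank(values) for key, values in ranks.items()}

for (school, group), value in averages.items():
    print(school, group, value)
```

A lower average rank indicates a more positive overall position, which is why PCWAE2 reads as the most positive of the three provincial schools on this measure.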

When considering student rankings on engagement for the non-selective schools, the three provincial schools were clearly in the lowest half of the state for all but the first two items related to engagement. Overall, PCWAE2 was more positive than either PCWAE1 or PCWAE3. This was also true for the top band achievers in all three schools.

When comparing only PCWAE2 and PCWAE3, it was found that, overall, of PCWAE2 students at the end of Year 8:

• more wanted to do a senior science course;
• fewer thought science their hardest subject;
• more enjoyed their primary science classes;
• more enjoyed their secondary science classes;
• a smaller proportion listed science in their list of three favourite subjects;
• a larger proportion listed science in the group of three subjects the students thought they learnt most in.

YEAR 10 ACHIEVEMENT

Turning now to later achievement, PCWAE3 did not provide Year 10 achievement data from 2012 to 2014 but did provide results for the three years up to and including 2011, the year of the last external science test. It is obvious that a direct comparison between PCWAE2’s and PCWAE3’s Year 10 results would not be a valid exercise. However, it is possible to compare the grade/level distributions compared to state figures for the appropriate years and then to infer, with appropriate caution, the extrapolation of that pattern to the years of interest (2011–2014).

Table 5.11 provides the data showing changes in the pattern of results from Year 8 to Year 10 relative to the state. Distributions of school results relative to the state, and the changed proportions, show that PCWAE2 students have not retained their achievement edge over PCWAE3. These data do not confirm prediction two. Reasons for the change in results patterns are discussed in the summative comments part of this subsection.

YEAR 12 ENGAGEMENT

Looking next at Year 12 completions for the two provincial schools (Table 5.14), the student proportions completing science courses at the end of Year 12 at PCWAE2 relative to the state and compared to PCWAE3 were: more in Biology, comparable in Chemistry, more in Physics, and more in Senior Science. PCWAE3 also offered Earth and Environmental Science, which PCWAE2 did not (10% of PCWAE3 students completed this course at the end of Year 12). Without knowing more details (such as whether Biology and Earth and Environmental Science were offered as an either/or option or both could be taken), it would appear that PCWAE2 had more of its students completing Year 12 courses than had PCWAE3, which was consistent with prediction three.


Table 5.14 Year 12 science course completions (2013-2015 averages)

                                              PCWAE2                  PCWAE3
Subject (state proportion)                    School  State (100)     School  State (100)
Biology (28.5)                                38      133             19      67
Chemistry (18)                                16      89              17      94
Earth and Environmental Science (2.4)         N/A     N/A             10      417
Physics (16)                                  13      81              11      69
Senior Science (10.4)                         30      288             22      212

School = proportion of students relative to English at the school (relative to 100).
State = proportions of students at the school (relative to the state set at 100) completing Year 12 courses.
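
The State columns in Table 5.14 can be reproduced from the School columns and the state proportions given in parentheses. The sketch below assumes the same ratio described in the legend to Table 5.11 (school proportion as a % over the state proportion as a %, with the state set at 100) and rounding to the nearest whole number; the function name state_index is hypothetical.

```python
# State proportions (%) for each subject, as given in Table 5.14.
state_pct = {
    "Biology": 28.5,
    "Chemistry": 18,
    "Earth and Environmental Science": 2.4,
    "Physics": 16,
    "Senior Science": 10.4,
}

# School proportions (%) from Table 5.14; None marks a course not offered.
school_pct = {
    "PCWAE2": {"Biology": 38, "Chemistry": 16,
               "Earth and Environmental Science": None,
               "Physics": 13, "Senior Science": 30},
    "PCWAE3": {"Biology": 19, "Chemistry": 17,
               "Earth and Environmental Science": 10,
               "Physics": 11, "Senior Science": 22},
}


def state_index(school_value, state_value):
    """School proportion relative to the state proportion (state = 100)."""
    if school_value is None:
        return None  # course not offered at the school (N/A in the table)
    return round(school_value / state_value * 100)


index = {
    school: {subject: state_index(value, state_pct[subject])
             for subject, value in subjects.items()}
    for school, subjects in school_pct.items()
}

print(index["PCWAE2"])
print(index["PCWAE3"])
```

A value above 100 means the school’s completion proportion for that course exceeds the state proportion; a value below 100 means it falls short.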

COMPARATIVE SUMMATIVE COMMENTS FOR PAIR FIVE (PCWAE2 AND PCWAE3)

The following discussion of findings in relation to the predictions for the schools compared here and their contribution to answering research question three draws on the school data mentioned above and the assessment-related work narratives for the schools in Appendix H.

For prediction one, PCWAE2 had the better achievement and engagement overall. The assessment narratives for the two schools discussed the priority given in both to improving students’ literacy skills. The assessment narratives for both schools provided convincing evidence of differential teaching that aimed to address the full range of literacy deficits that students bring to science classes.

Both schools are in the WAE group of schools. WAE schools were more frequent users of three dimensions of formative practice than WBE schools. It is reasonable to suggest that at the end of Year 8, PCWAE2 was more successful at lifting students’ results than PCWAE3 because PCWAE2 teachers were more effective at promoting discourse that elicits evidence of learning, providing feedback that advances learning, and modeling good learning behaviours to peers and students.


The evidence for this conclusion is threefold: the positive bias in the results for PCWAE2 students in the extended response component of the EV results; the detail of the assessment narrative for PCWAE2 compared to PCWAE3; and the fact that more students at PCWAE2 had put science in the list of three subjects they thought they had learned most in.

This being the case, how is it that by the end of Year 10, the overall results at PCWAE3 are better? The anomaly to be explained is the pattern of better results by PCWAE3 at the end of Year 10 compared to PCWAE2 (relative to the state), which is contrary to prediction two. One possibility is that PCWAE3’s SEA score advantage (2.0 compared to 1.8) is real. A second possibility is that student absenteeism was higher at PCWAE2 over the four years. A third possibility is the impact of a high-stakes summative assessment regime such as was revealed in the narrative for PCWAE2 compared to the low-key approach by PCWAE3 to summative assessment.

If student absenteeism is higher at PCWAE2, this might be a decisive factor in reducing its Year 10 results. Disruption to individual learning progress due to absence and disruption to group learning as a result of absenteeism were identified by the head teacher in the interview at PCWAE2. From the MySchool website, the proportion of indigenous students compared to non-indigenous students at PCWAE3 is higher than at PCWAE2 (1 in 4 compared to 1 in 5). This becomes relevant because data from the MySchool website for the two schools show that the attendance rates for PCWAE3 students are 15% lower for indigenous and 5% lower for non-indigenous students than at PCWAE2. Thus, on any one day the proportion of all students away at the two schools is likely to be greater at PCWAE3 than at PCWAE2. So, despite lower daily attendance rates at PCWAE3, its Year 10 results are better than PCWAE2’s results.

Given that absenteeism is more likely to have a greater negative effect on achievement at PCWAE3 than at PCWAE2, it is possible that there is another more potent factor at work here related to different approaches to summative assessment. Research discussed in Chapter Two identified the negative effect of high-stakes summative assessment on the motivation to learn of students with poor learning histories. Both schools have relatively high proportions of students with poor learning histories (reinforced by publicity around NAPLAN results, which are generally poor at both these schools).

PCWAE2 has strictly graded classes in science, the composition of which is changed after summative assessment every six months from half-way through Year 7 to half-way through Year 9. PCWAE3 takes a low-key approach to summative assessment and keeps the number of formal assessment tasks to a minimum. Once established, classes at PCWAE3 are retained relatively unchanged until the end of Year 8. It may be that over the four years (from Year 7 to 10) the negative impact on motivation to learn is greater at PCWAE2 than PCWAE3.

The approach to assessment at PCWAE3 is very similar to that at MCWBE5. (MCWBE5 was compared to PCWAE2 as Pair THREE above.) Absenteeism at MCWBE5 was the lowest of the three schools. Like PCWAE3, MCWBE5 established streamed classes at the beginning of Year 7 which they retained until the end of Year 8. Summative assessment was a low-key affair. All three schools had comparable SEA scores, thus making comparison fair on that basis. MCWBE5’s residual is well below those of both provincial WAE schools, and its EV test result was lower than that of either of the two provincial schools. Like PCWAE3, MCWBE5 performed better than PCWAE2 by the end of Year 10. This outcome is at least suggestive that summative assessment practices at PCWAE2 may have been a contributor to its lower achievement by the end of Year 10 than either PCWAE3 or MCWBE5.

As to attribution of summative assessment impact on differences in engagement at the end of Year 8 and Year 12 for the three schools, the evidence is less clear.

At the end of Year 8, PCWAE2 students were enjoying their secondary science classes more than PCWAE3 students, although top students at both schools did so less than their overall result indicates they should (see Section 5.2.1). MCWBE5 students were more positive (3rd on Item D) than those at both the provincial schools, and this was shared by their top students as well (3rd on Item D).

A smaller proportion of students at PCWAE2 included science in their list of three favourite subjects (Item E) than at PCWAE3. Top band students at both schools had even smaller proportions (out of 15 schools, PCWAE2 was 13th and PCWAE3 was 14th). MCWBE5 ranked 3rd in the state overall and its top band students were also 3rd.

When it came to the proportions identifying science in the group of subjects they thought they learnt most in, more PCWAE2 students than PCWAE3 students did so (5th overall and 10th, respectively). Top band students repeated that pattern, but were a smaller proportion again (8th compared to 13th out of 15 schools). MCWBE5 students had the 3rd largest proportion in the state and their top band was the 7th largest. Based on the above, it would be difficult to make a definitive claim about the negative impact of the assessment regime at PCWAE2 on either enjoyment (Items D & E) or sense of achievement (Item F).

An explanation for MCWBE5 students’ much higher satisfaction with their school science experience compared to either PCWAE2 or PCWAE3 would appear to be less related to summative assessment than to teacher use of formative practice. The assessment-related narrative for MCWBE5 points to teachers at that school giving students a greater say in what to do in the name of science education, as well as more opportunities for peer and self-assessment, which seem to increase with the number of years spent at secondary school. MCWBE5 had the highest proportions of students completing senior science courses of the three schools.

5.4 Correlation and strength of associations between school variables

Correlation provides a way of confirming (or disconfirming) the relative strengths of associations between variables. The strength of a correlation can provide more support for one or other inference when considering the qualitative evidence in the assessment narratives. In the situations being compared here we are looking at scores that are two (Year 8 to Year 10) and four years apart (Year 8 to Year 12). As mentioned earlier, students make their choices for science courses they wish to study in Years 11 and 12 (the last two years of secondary education) in the middle of Year 10.

This researcher’s experience suggests that once made, students tend to follow through with those choices. Thus, the decision to use Year 12 data for completions needs to recognise that the figures reflect decisions made more than two years earlier, less than two years after the EV test, and before Year 10 results were finalised. The EV test has been in place since 2007; the results being looked at here are for the four Year 8 cohorts from 2011 to 2014. Their scores are correlated with Year 10 students who did the EV test from 2009 to 2012 and Year 12 students who did the EV test from 2007 to 2010. The first EV test for students across the state in NSW was in 2007. The advice about assessment for learning was promulgated with the 2003 syllabus. The point being made here is that changes in response to both initiatives were as strongly embedded in practice as they were ever going to be by the end of 2014, when the new syllabus became the basis for ongoing EV testing. From this perspective the correlation between sets of results that are asynchronous was not considered a major issue when it came to assessing the limitations of correlation statistics as applied here.

Another assumption here is that science results are a function of all the science teachers’ efforts at a school and that staff changes or traumatic events at any one school during that time were relatively minor. Nevertheless, any statistically significant correlations need to consider specific school circumstances. School circumstances that were likely to impact results and engagement were disclosed to the researcher and were included in the assessment-related work narratives for the case study schools as appropriate.

As explained earlier, SPSS software was used by the researcher in this project to perform bivariate correlations using either parametric or nonparametric models as appropriate.


5.4.1 Correlations: fully selective entry case study schools (n = 3)

A two-tailed correlation analysis using SPSS was carried out to test the observation that engagement at the end of Year 8 is the better predictor of later engagement. Measures of the following variables for the three schools were used.

1. Year 8 results (Year 8 achievement)
2. Year 8 scores for Item A of the student survey (aspiring to do senior science courses)
3. Year 8 scores for Items D and E from the student survey (Year 8 engagement)
4. Year 10 proportions of A grades (later achievement)
5. Year 12 mean senior course completions in Biology, Chemistry and Physics only (later engagement).

The data sets satisfied the Shapiro-Wilk test for normality (p > .05). Results are reported in terms of the Pearson’s correlation coefficient r, degrees of freedom (1), and a two-tailed significance (p) value at either the p = .01 or p = .05 level (as shown by the value quoted with the reported correlation coefficient).
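
The reporting convention above can be illustrated outside SPSS. The following is a minimal sketch, not the SPSS procedure used in this study: it computes Pearson’s r, takes the degrees of freedom as n - 2 (hence 1 for three schools), and checks two-tailed significance at the .05 level by converting a critical t value into a critical r. The critical t values are taken from standard t tables; the function names are hypothetical.

```python
import math

# Two-tailed critical t values at alpha = .05 from standard t tables,
# for the degrees of freedom that arise with n = 3, 11 and 15 schools.
T_CRIT_05 = {1: 12.706, 9: 2.262, 13: 2.160}


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def critical_r(df):
    """Smallest |r| that is significant at .05 (two-tailed) for given df."""
    t = T_CRIT_05[df]
    return t / math.sqrt(t * t + df)


def is_significant(r, n):
    """True if r is significant at .05 (two-tailed) for n paired scores."""
    return abs(r) > critical_r(n - 2)


print(round(critical_r(1), 3))    # threshold for three schools
print(round(critical_r(9), 3))    # threshold for eleven schools
print(is_significant(0.96, 3))    # a high r can still miss significance
print(is_significant(0.65, 11))
```

With only three schools the threshold is very demanding, which is why even correlations above .9 in this subsection are reported as not statistically significant.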

For engagement at Year 8 (two items from the student survey) and achievement at Year 8, the correlations for the top band students (at least 94% of all students at the schools) on Item D (rD(1) = -.14, p > .05) and for Item E (rE(1) = .41, p > .05) were slightly negative and moderately positive, respectively, but neither was statistically significant. Thus, it would be difficult to defend any conclusion that liking science classes and doing well in the EV test were related at these three schools.

The correlations between Year 8 engagement (the same two items as before) and Year 10 achievement were rD(1) = -.88, p > .05 and rE(1) = -1.0, p < .05. The former was highly negative and not statistically significant, the latter was very highly negative and statistically significant. Students who put subjects other than science in their list of three favourite subjects at the end of Year 8 achieved very good results in science at the end of Year 10. In these three schools it seems that not liking science was no impediment to achieving well in it at the end of Year 10.

In relation to Year 8 engagement (two items as before) and Year 12 engagement (Biology, Chemistry and Physics completions), the correlations were highly positive but not statistically significant (rD(1) = .68, p > .05 and rE(1) = .96, p > .05). At the end of Year 8, aspiring to do science in the senior years (Item A in the survey) and actual engagement figures for those students who had chosen science at the end of Year 10 (two years after their EV test) and finished it at the end of Year 12 (four years after that EV test) were highly positively correlated but not statistically significantly so (rA = .92, p > .05). The correlation between Year 8 achievement and Year 12 completions (r = .63, p > .05) was also highly positive but not statistically significant.

Thus, it seems that for the three metropolitan fully selective schools, the combination of wanting to do senior science courses (Item A in the student survey) and liking science at the end of Year 8 (Items D and E) was likely to be a better predictor of Year 12 science course completions than Year 8 achievement.

5.4.2 Correlations: non-selective entry case study schools (n = 11)

The testing of correlations between variables was repeated for the non-selective entry case study schools (n = 11). Two more variables were added to the list for the purpose of this analysis. The variables tested were:

1. Year 8 results (an achievement measure)
2. Year 8 aspiring to do senior science courses (Item A on the student survey)
3. Year 8 student survey Items D plus E (a collective measure of Year 8 engagement)
4. Year 10 achievement (the cumulative proportion of As, Bs and Cs awarded to the cohort)
5. Year 12 engagement (the average of school proportions completing Biology, Chemistry and Physics courses at the end of Year 12)
6. Residuals (a measure of teaching effect / scientific literacy scores)
7. SEA scores (the measure of socio-educational advantage).

All seven data sets to be compared passed the Shapiro-Wilk test for normality (p > .05). On that basis it was decided to use the Pearson parametric correlation (r) two-tailed test in the SPSS software. The model provides for nine degrees of freedom (based on n = 11) and a significance (p) value at either the .01 or .05 level (as reported with the correlation coefficient produced by the SPSS model).

The first tests were to assess whether Year 8 engagement or Year 8 achievement was the better predictor of later achievement (Year 10 results) and engagement (Year 12 senior science course completions).

The correlation between Year 8 engagement and Year 10 results was strongly negative and statistically significant (r(9) = -.69, p < .05). This figure suggests that not liking science at the end of Year 8 and doing well in it later on (at the end of Year 10) was the norm for the provincial and non-selective entry metropolitan case study schools.

Between Year 8 engagement and Year 12 engagement, the correlation was moderately positive but not statistically significant (r(9) = .384, p > .05). This is an expected result but in no way predictive in this context. On the other hand, the correlation between Year 8 achievement and Year 10 achievement (r(9) = .70, p < .05) was highly positive and statistically significant. The correlation between Year 8 achievement and Year 12 engagement (r(9) = .65, p < .05) was also highly positive and statistically significant.

For the non-selective case study schools compared for this exercise, Year 8 achievement is a much better predictor of later achievement (as measured by Year 10 results) and engagement (Year 12 senior science course completions) than Year 8 engagement.

In the comparisons looking at measures of Year 8 engagement in provincial schools and metropolitan case study schools relative to the state, it appeared that provincial schools had a lower level of engagement with science relative to the state and relative to the metropolitan schools they were being compared with.

The three provincial schools all had low SEA scores. The non-selective metropolitan schools had slightly higher SEA scores overall. One possibility is that a low SEA score might be an indicator of low interest in science. The correlation between SEA scores and Year 8 engagement for the non-selective case study schools was shown to be moderately negative but not statistically significantly so (r(9) = -.42, p > .05). Thus, any suggestion that a low SEA score and low engagement with science at Year 8 are necessarily related would not be supported by this finding.

The correlation between the residuals (a measure of scientific literacy achievement) for the 11 non-selective entry schools and engagement (liking their school science experience) was moderately negative but not statistically significantly so (r(9) = -.30, p > .05). The tentative conclusion from this result is that, for the case study schools, good EV results combined with students not liking their science experience is the more likely combination.

5.4.3 Correlations: provincial case study schools (n = 3)

To assess the strength of the associations between variables, the following variables involved in the comparisons between the three provincial schools were tested for statistically significant correlations using SPSS:

1. EV results (Year 8 benchmark measure of achievement)
2. Student survey Items D + E combined levels score (Year 8 benchmark measure of engagement with science)
3. Year 10 sum of grades A + B + C (later achievement)
4. Year 12 completions of Biology, Chemistry and Physics (mean proportions compared to English at that school)
5. Residual
6. SEA score.


Most of the variable data sets passed the Shapiro-Wilk test for normality (p > .05). As a result, the SPSS procedure for a two-tailed, bivariate, parametric correlation of the variables was used. Results are reported in terms of Pearson’s Correlation Coefficient (r), degrees of freedom (1) and whether the correlation was statistically significant relative to the model-reported value at either the p < .01 or p < .05 level of significance.

The SEA score was included to test the possibility that engagement may be positively correlated with it. The correlation for the three provincial schools on the Year 8 engagement variable (Items D + E) and SEA score produced a moderately negative but not statistically significant correlation (r(1) = -.43, p > .05). This was consistent with the correlation for all 11 non-selective schools (r(9) = -.42, p > .05) and for the full complement of case study schools (r(13) = -.38, p > .05). The evidence here is that SEA score and engagement are, if anything, negatively correlated. The higher the students’ learning potential, the less they liked their school science experience.

Another check is to see if the residual and engagement (Items D + E) are positively correlated. The residual is a measure of the impact of science teaching on achievement, but it might, arguably, be an indicator of student attitudinal responses to that teaching. For the three provincial schools, the correlation was highly negative but not statistically significant (r(1) = -.76, p > .05). For the 11 non-selective entry schools the figure was moderately negative and also not statistically significant (r(9) = -.30, p > .05). For all case study schools r(13) = -.27, p > .05. Again, the analysis does not support any definitive conclusion but is suggestive that the more capable students across the state are not enjoying their science lessons.

When the correlation between student satisfaction with their Year 8 school science experience and being in either a provincial (1) or metropolitan school (2) was tested for the 11 non-selective entry schools, the result was moderately negative (r(9) = -.46, p > .05) but not statistically significant. As well, comparing the average levels of satisfaction (descriptive statistic) recorded for Items D and E for all the case study schools (n = 15) shows that x̅metro = 34.3 versus x̅prov = 22.1. Thus, it is not unreasonable to conclude from the above analyses that provincial students in this sample were less positive about their experience of school science than their metropolitan counterparts.
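
The significance statements attached to the correlations in this section can be checked with a short calculation. What follows is a minimal sketch only, assuming the reported values are Pearson coefficients tested two-sided with n - 2 degrees of freedom; the function names are illustrative and are not part of the analysis carried out for this study. Each reported r is converted to a t statistic and the t density is integrated numerically to obtain a p-value.

```python
import math


def t_density(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(r, df, steps=2000):
    """Two-sided p-value for a Pearson r with df = n - 2 (assumed test)."""
    t = abs(r) * math.sqrt(df / (1 - r * r))
    # Simpson's rule for the area under the density between 0 and |t|.
    h = t / steps
    area = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * t_density(i * h, df)
    area *= h / 3
    return 1 - 2 * area


# (r, degrees of freedom) pairs as reported in the text above.
reported = [(-0.43, 1), (-0.42, 9), (-0.38, 13),
            (-0.76, 1), (-0.30, 9), (-0.27, 13), (-0.46, 9)]

for r, df in reported:
    print(f"r({df}) = {r:+.2f}  p = {two_sided_p(r, df):.3f}")
```

With only three provincial schools (one degree of freedom), even a correlation as large as -.76 falls well short of the p < .05 threshold, which is consistent with the provincial results being described as not statistically significant.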

5.5 Summary

The compared pairs of schools were PCWAE1 and MCWAE1, MCAE2 and MCWBE3, PCWAE2 and MCWBE5, MGFSAE2 and MGFSWBE1, and PCWAE2 and PCWAE3. The first three pairs of schools had comparable SEA scores but statistically significantly different residuals. The fourth pair were fully selective entry girls’ schools. The girls’ schools were paired on the basis of being selective entry girls’ schools (but they did have statistically significantly different SEA scores and residuals). The fifth pair were coeducational provincial schools with comparable SEA scores and residuals. ‘Comparable’ means the scores were not statistically significantly different.

The first and fifth pair of schools were WAE schools because they had highly positive residuals, which meant that their EV results were well above expected. The residuals for the other three pairs were different enough to assign each school in the pair to a different school group based on their EV results being as expected (AE) or well below expectation (WBE). Expectation was relative to a NAPLAN-based predicted science score, as explained in Section 3.3.
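
The idea of a residual relative to a predicted score can be illustrated with a small sketch. It is not the procedure set out in Section 3.3: the straight-line predictor, the standardisation, the cutoff of one standard deviation and all of the numbers below are assumptions made purely for illustration, and the function names are hypothetical.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for y = a + b * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def standardised_residuals(xs, ys):
    """Observed minus predicted score, divided by the residual standard error."""
    intercept, slope = fit_line(xs, ys)
    raw = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    sd = math.sqrt(sum(e * e for e in raw) / (len(raw) - 2))
    return [e / sd for e in raw]


def school_group(z, cutoff=1.0):
    """Assign a group label; the cutoff is a hypothetical value."""
    if z >= cutoff:
        return "WAE"
    if z <= -cutoff:
        return "WBE"
    return "AE"


# Hypothetical predictor (literacy/numeracy based) and science test means.
predictor = [520, 540, 560, 580, 600, 620]
ev_scores = [78.0, 84.5, 82.0, 90.0, 85.5, 93.0]

for x, y, z in zip(predictor, ev_scores, standardised_residuals(predictor, ev_scores)):
    print(x, y, round(z, 2), school_group(z))
```

Under these assumptions a school is labelled by how far its actual result sits above or below the fitted line, not by its absolute score, which is the distinction the paired comparisons in this chapter rely on.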

The findings reported in Section 4.5 were that teachers in WAE schools were more frequent users of three of five dimensions of formative practice than were their colleagues in WBE schools. As well, overall, teachers in AE schools were more like their WAE counterparts in the frequency of their use of formative practices.

The research question to be answered in this chapter was:

Does the use of (and if so, how do) formative practices by teachers improve students’ EV results and later achievement in and engagement with science?

Section 3.6 explained that by identifying schools with matching SEA scores in a list of schools sorted from top to bottom according to the size of their residuals, the possibility arises of showing that better than expected EV results (in terms of a predictor) are higher actual EV results (in absolute terms). That said, it can be seen from the tables in Section 5.3 that in terms of EV results, PCWAE1’s EV result is higher than MCWAE1’s, MCAE2’s is higher than MCWBE3’s, PCWAE2’s is higher than MCWBE5’s, and PCWAE2’s is higher than PCWAE3’s. Thus, it is possible to show that for four pairs of the case study schools where SEA scores could be matched, the schools with the biggest residuals had the better EV results. This was the claim made in prediction one. The high residuals are associated with more frequent use by teachers of three dimensions of formative practice, the use of which is linked to higher than expected scientific literacy content, thus boosting EV results.

The second part of prediction one links residual size to engagement, as measured by student scores on the six items chosen for consideration here. The presumption in making the link between achievement and engagement is that student exposure to formative practices has produced students who are not only good at science but enjoy learning it. This presumption was supported by research findings discussed in Chapter Two that had linked exposure to formative practices with the acquisition of good learning behaviours and positive dispositions toward learning. This researcher chose to use student enjoyment of their school science experience as a measure of positive commitment to learning science. Additional support for the linking of achievement and enjoyment was provided by the finding reported in Section 5.2.1 that at the end of Year 8, across the state, higher achievement and enjoyment of their school science experience were positively associated.

At the end of Year 8, students in the higher-achieving school in four of the five pairs of case study schools scored a combined Item D+E below the score of students in the school it was paired with (see Table K.5A in Appendix I). Item D was a rating of enjoyment of their secondary science classes and Item E was the proportion of students who had included science in the group of their three favourite subjects. The exception was the second pair, MCAE2 and MCWBE3, where the higher achieving school, MCAE2, was slightly above MCWBE3 in both the overall and top band comparisons. The closeness of the paired results here was somewhat surprising, given that MCAE2 promoted itself as a STEM school and established each year a class of students who had sat a selective entry test for that class on the basis of their interest in doing STEM.

These results appear to contradict the general finding in Section 5.2.1 that across the state, higher achievement and enjoyment of school science were positively associated. It seems that at the end of Year 8, high achievement had been accomplished at the expense of student enjoyment of their school science experience. This finding is also supported by the correlations reported in Section 5.4.

Of interest also was the observation (for 10 of the case study schools) that provincial students were more negative about their school science experience than metropolitan students. It also seems that the highest achieving students in the 10 schools were the ones most negative about this experience. To the extent that enjoyment of science was an indicator of self-regulation at the end of Year 8, these findings are not supportive of that conclusion, nor are the findings promising as predictors of later engagement with science (Year 12 science course completions).

Prediction two was that the school (in the pairs of schools) with the bigger residual at the end of Year 8 would go on to have the better results at the end of Year 10. The prediction was confirmed for the first pair of schools (PCWAE1 and MCWAE1). It was not possible to make the comparison for the second pair (MCAE2 and MCWBE3) because MCWBE3 did not provide Year 10 results. It was not confirmed for the third (PCWAE2 and MCWBE5), fourth (MGFSAE2 and MGFSWBE1) and fifth (PCWAE2 and PCWAE3) pairs because of uncertainty about the comparability of the Year 10 results.

In an ideal world, results from a Year 10 EV test and related student survey would have been the best option for doing this comparison. Unfortunately, such a test and related survey did not become available until after 2014. It would therefore be unsafe to say that there were more self-regulated learners in WAE schools on the evidence from one pair of schools.

Findings related to prediction three were meant to demonstrate the persistence of positive attitudes to science provided by the presence of self-regulated learners in post Year 8 years of WAE schools. The independent evidence of the presence of self-regulated learners in greater numbers in WAE schools was supposed to be confirmed by higher proportions of students completing science courses at the end of Year 12. These were courses that students had initially chosen half-way through Year 10. Having questioned the validity and reliability of the data used to verify prediction two, we are left with data about Year 8 achievement, Year 8 engagement and Year 12 engagement.

Two ways of making that interschool comparison are provided. The first is the proportion of students at each school (relative to English which is a compulsory course for students wanting to receive the school exit credential at the end of Year 12) completing one or more of the five senior science courses that were available to students. All the schools researched here offered Biology, Chemistry and Physics in two or more of the three years of interest. Most also offered Senior Science and one offered Earth and Environmental Science (PCWAE3 in 2014) in the three years of interest (2013 to 2015).

The second is to compare this school proportion to the statewide proportions. The assumption behind both methods is that schools try to accommodate students’ preferences to the best of their ability, given the resources schools are able to allocate. As a starting assumption, it was accepted that the school proportions shown here accurately reflect student demand for science courses more than the constraints of available resources; this will be less true the smaller the school is.
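
The two comparisons described above amount to simple ratios, sketched below. The counts and the statewide figure are hypothetical placeholders rather than data from this study, and the function names are illustrative only.

```python
def science_share(students_completing_science, students_completing_english):
    """Students completing at least one senior science course, as a
    proportion of students completing the compulsory English course."""
    if students_completing_english <= 0:
        raise ValueError("English completions must be positive")
    return students_completing_science / students_completing_english


def relative_to_state(school_share, state_share):
    """Ratio above 1 means the school's share exceeds the statewide share."""
    return school_share / state_share


# Hypothetical school: 45 of 120 Year 12 students completed a science course.
school = science_share(45, 120)
# Hypothetical statewide share of 0.30.
print(round(school, 3), round(relative_to_state(school, 0.30), 2))
```

Using English completions as the denominator gives each school a common base for its Year 12 cohort, so that schools of different sizes can be compared on the same scale.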

For the pairs of case study schools matched by SEA scores and different residuals indicating their degree of exposure to formative practices, the finding is that the better EV results were, the higher the proportion of students taking up and subsequently completing science courses. It was observed that achievement at the end of Year 8 was a stronger correlate with completion than liking the subject at that time. On balance, the combination of high achievement in science and not liking the experience was the norm for the case study schools, which was contradicted by the finding reported in Section 5.2.1 from the larger sample of schools that identified themselves.

In conclusion, the evidence discussed here confirms the positive association between better EV results and the frequency of exposure to:

• discourse that elicits evidence of learning

• the provision of feedback known to progress learning

• the use and modeling (to peers and students alike) of good learning behaviours.

The attempt to demonstrate that more frequent exposure to these three dimensions of formative practice had produced more self-regulated students in WAE schools than in AE or WBE schools has not succeeded convincingly.

The assessment-related work narratives for the schools with better than expected EV results all had strong programs aimed at building student capacity to use the language of science to explain phenomena in the natural and made worlds they inhabit. It appears to this researcher that the literacy focus was in response to a wider school priority and/or in response to science teachers’ awareness of the importance of scientific literacy for success in school science and as preparation for life and work after school. In schools where results were well below expectation, the assessment narratives had little explicit evidence of a priority for building student capacity to use the language of science as a tool for managing their learning of science.

CHAPTER 6: DISCUSSION AND FUTURE DIRECTIONS

6.1 Introduction

In Chapter One it was said that the objective of this thesis is to answer the broad question: To what extent is the assessment-related work of science teachers in NSW government schools formative and why does it matter? Chapter Two gave two reasons for why this study matters. The first is that teacher use of formative practices (Black & Wiliam, 2009) is linked to high achievement (Hattie, 2012) as measured by traditional pen and paper summative tests. The second reason is that teaching students to use the strategies of formative assessment that underpin formative practices has shown considerable promise as a way of helping students to learn how to learn. According to the OECD, “laying the foundations for lifelong learning” (CERI, 2008, p. 1) should be a priority for the initial phase of schooling; knowing how to learn would be important preparation for that.

A 2018 updated list of effect sizes of particular teaching strategies on test results shows formative practices to be amongst the most effective (Hattie, 2018). Strategies such as classroom discussion (0.82), providing feedback (0.70), response to intervention (1.29), jigsaw method (1.20) and scaffolding (0.82) are amongst the most powerful ways for teachers to operate in the classroom. Two curriculum strategies known to have above average effect sizes are repeated reading programs (0.75) and core and specific vocabulary programs (0.62). Both of these were in evidence in WAE case study schools. The effect size of each strategy is provided in parentheses; higher than 0.42 is an above average effect.
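
The comparison against the 0.42 benchmark can be made explicit with a few lines of code. This is a sketch for illustration only: the effect sizes and the benchmark are those quoted in the paragraph above, while the variable names are arbitrary.

```python
# Effect sizes as quoted in the text (attributed there to Hattie, 2018).
effect_sizes = {
    "classroom discussion": 0.82,
    "providing feedback": 0.70,
    "response to intervention": 1.29,
    "jigsaw method": 1.20,
    "scaffolding": 0.82,
    "repeated reading programs": 0.75,
    "core and specific vocabulary programs": 0.62,
}

ABOVE_AVERAGE = 0.42  # benchmark stated in the text

# Keep strategies above the benchmark, largest effect first.
above = sorted(
    ((name, d) for name, d in effect_sizes.items() if d > ABOVE_AVERAGE),
    key=lambda pair: pair[1],
    reverse=True,
)

for name, d in above:
    print(f"{name}: {d:.2f} ({d - ABOVE_AVERAGE:+.2f} relative to benchmark)")
```

Every strategy listed clears the benchmark, which is the point the paragraph makes when it describes them as above average.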

Research shows that teaching students the strategies of formative assessment is associated with them acquiring the skills of learning how to learn (LHTL) and becoming autonomous learners (Black et al., 2006; James, 2006). Learning autonomy is highly valued in the context of preparing people for life in the knowledge society and related global economy as discussed in Chapter Two. Again, according to Hattie (2018), the effect size on achievement of students’ acquiring and using these strategies is very high. Examples include: transfer strategies (0.86), deliberate practice (0.79), strategies to integrate with prior knowledge (0.93) and summarization (0.79). Boyle et al. (2001) would refer to these strategies being used by students as “good learning behaviours” (p. 200).

As outlined in Chapter One, two initiatives introduced into NSW schools in 2003 and 2007 respectively were designed to shift teachers’ assessment focus from summative to formative. The need for that shift had been elaborated in the review of the status and quality of science education in Australia published earlier (Goodrum et al., 2001).

The initiatives took the form of strong advice to teachers in the official curriculum about bringing teaching and assessment together (assessment for learning as it was called there) and a compulsory summative test for all Year 8 students. The test also had a diagnostic purpose which was to provide a progress report on science achievement half-way through the four-year science course. The diagnostic purpose of the EV program was enhanced by using the SOLO model in the design of the assessment framework for the EV program. Test items and tasks were designed to challenge students across six levels of thinking described by the model.

At the time, both the NSW Department of Education (the Department), which was responsible for the test, and the curriculum authority that had produced the curriculum provided additional support to teachers to assist them to achieve the shift in emphasis. Examples of that support are outlined below and were described in earlier chapters.

The impact on the assessment-related work of science teachers was of both personal and professional interest to this researcher for reasons explained in Chapter One. To assess the impact of the two initiatives on assessment-related work as described in earlier chapters, three research questions were posed, a research design was developed and data gathered.

The first research question asked about teacher use of the resources related to the EV program. The program components included a test, a related student survey, provision of a report to parents and comprehensive results (to their teachers, school and school system), and teacher support in the form of marker training and online professional learning modules. Discussion related to reasons for their use (or not) is reported in Section 6.2.

The second research question sought to find out the extent to which science teachers are using formative practices. Factors supporting or impeding the use of formative practices will be discussed in Section 6.3.

The third research question asked whether teacher use of formative practices improved student EV results and whether that use was linked to later achievement in and engagement with science. The answers to that question involving Year 8 students at a school, their later achievement (Year 10) and later engagement (Year 12) in science at school are discussed in Section 6.4.

The research methodology used to provide the findings informing the answers to research question three is the basis for claims by this researcher of originality and contribution to the international body of work on formative assessment.

Section 6.5 provides suggestions for further work to confirm findings.

The final section (Section 6.6) of this chapter provides recommendations to relevant authorities arising from the findings reported in this thesis.

6.2 Discussion of findings addressing research question one.

The question was: What use are science teachers making of the EV program and why is it used or not used?

The assessment framework for the EV test discussed in Chapter Two provides a map of learning along two axes, one axis being what should be learned in the name of science in Years 7 and 8 in NSW schools. The other axis describes six levels of thinking about science that a student can demonstrate in their responses to test items and tasks. The SOLO model provides descriptions for the six levels against which responses are to be judged. The broader context, which includes the tools for collecting evidence of learning (items and tasks in the test), assigning value to that evidence (marking), reporting results and making use of results to improve learning, will be reported on here as well. Subsection 6.2.1 will focus on teacher use of the EV program resources more broadly; subsection 6.2.2 will explore the extent of teacher engagement with SOLO.

6.2.1 Teachers and the EV program

The following discussion relates to the collected responses from WAE, AE and WBE teachers (n = 85) to the online teacher survey and to evidence from the assessment narratives (Appendix H) as appropriate. The first five questions (Q1-Q5) in the teacher survey collected data about eight categories of actions describing the scope of teacher engagement with EV resources (Q1 and Q2), their level of understanding of the EV program (Q3), what the main purpose of the program was (Q4) and whether they would participate in the extension of the program to Year 10 (Q5).

Chapter Four provided the detailed analyses of their responses. In brief, the findings were:

• the overall level of WBE teacher engagement with EV resources was lower than that for AE and WAE teachers (see Figure 4.1);

• that teacher understanding of the EV program, on a five-point scale ranging from very poor to poor, acceptable and then good to very good, located WBE teachers at acceptable and AE and WAE teachers at good (see Figure 4.1B);

• most respondents wrote that the purpose of the EV program was to provide teachers with feedback on student learning (see Table 4.5); and

• that fewer WBE schools than AE or WAE schools would be taking up the VALID10 test opportunity.

Teachers from all three groups had discussed results with each other (66%) but less so with students (22%). A possible reason for not discussing results with students was provided in case study school narratives where several teachers had mentioned the large time gap between doing the test (November) and when the results were returned (March-April the following year).

None of the schools mentioned using items or tasks that students had done poorly in as the basis for reteaching. Working scientifically and communicating scientifically are processes that, where performance was poor, could be retaught in the context of any topic, including those being done in Year 9. Reteaching in response to feedback is a characteristic of formative practice. More broadly, the literature on feedback is consistently of the view that the shorter the time difference between action and feedback, the more likely it is to be acted upon by the learner (Black, 2007; Hattie & Timperley, 2007; Masters, 2013; Ruiz-Primo & Li, 2012; Shute, 2007).

Almost 40% of respondents had marked extended response tasks and almost 30% said they had attended workshops about the EV program (separate from training for marking extended response questions). Responding to the teacher survey was voluntary and anonymous. Teachers exposed to those two components of the EV program were possibly more inclined to respond to the survey than those not so aware. It may also be a factor in the high proportion of the same respondents who rated their understanding of the EV program (see Q3 reference above) as acceptable and higher (87%). That said, the EV program appears to be well understood by most of the respondents, including those in regional areas, a finding supported by answers to the next question in the survey, Q4.

The collation of teacher responses to the free response question (Q4) about the most important purpose for the EV program revealed multiple purposes from some respondents. Overall, the majority (70%) saw the purpose as being about providing feedback to teachers on learning and/or teaching, which was consistent with the Department’s rhetoric about its purpose (see Chapter One). A minority (21%) saw its purpose as providing feedback on comparative performance with other schools and the state. A small proportion (9%) wrote about its purpose in terms of direct student benefit, which suggested they saw its potential for student self-evaluation which is a characteristic of formative thinking (Black & Wiliam, 2009).

In relation to the EV program overall, five responses provide an insight into issues some teachers have with the program. The head teachers at MCWBE5 and MCFSWBE1 were not happy that science had been singled out for special treatment (in the form of an external test). Three other anonymous comments from respondents to the teacher survey included:

No idea. It's an imposition into an already crowded curriculum that requires an inordinate amount of time and resources for something that only appears to be there to justify a well-paid job or two elsewhere. (WBE teacher)

[The Board] ticks the box for more tests for school. Justifies funding based on a test that doesn't necessarily match to the curriculum that the students are doing at the time. (WAE teacher)

To keep people in Head Office in a job. (WBE teacher)

These were the only negative comments in a total of ninety-five different responses to the question about the main purpose of the EV program.

Q5 from the survey asked about intentions to take up the VALID10 test (introduced in 2015 on a voluntary basis; data collection for this project was in 2016). VALID10 is the Y10 equivalent of the Y8 test (as explained in Chapter One). It was impossible to be definitive about the intended take-up because this was an anonymous teacher response survey and there was no way of knowing which teachers were at what schools and whether there was more than one teacher from a school responding. Based on the raw data, 72% of teachers in WAE schools said they would be taking up the test that year, as did 52% of AE teachers and 47% of WBE teachers. The overall result for the sample (n = 84) was 56% which suggests that around half the state’s Year 10 classes were preparing to take up the test on a voluntary basis in 2016.

Four schools reported wanting to see evidence of change from Year 8 to Year 10 (MGFSAE2, MCWBE4, MCWAE2 and PCWAE1) as the reason they took up the offer of participating in VALID10. Reasons given for not taking it on included to reduce assessment pressure on students (PCWAE3 and MCWBE5); issues to do with computer access (PCWAE2 and MCWBE5); teachers were too busy at that time of the year (MCFSWAE1); not much point given that students all went on to Year 11 anyway (MCWBE5 and MCFSWAE1). An aside: in separate conversations with science teachers outside the context of interviews in case study schools, some had reported they did not want to engage with VALID10 because, unlike the Year 8 test, training and marking was on site (at school) and unpaid.

Asking case study teachers to complete a proforma with a sample of data for students at their school was meant to provide this researcher with an opportunity to find out the breadth and depth of analysis teachers do with both EV test and student survey results as well as their own teacher devised assessments. Only four of the case study schools had engaged with the proforma before the interview. Thus discussion at the interview of their practices in relation to data analysis was limited by the low overall response at that time. The low response was taken as an indicator that using data for formative purposes was not high on teachers’ assessment agenda. This impression was confirmed and recorded in assessment narratives where learning intentions and success criteria were primarily used by teachers as the basis for feedback on strengths and weaknesses in answers to summative assessment tasks. There was little evidence of students being given the opportunity to use learning intentions and success criteria to provide feedback to peers or in self-assessment activities (overall, only 16% of teachers said they often asked students to redo work to a higher standard). Both Hattie (2012) and Mitchell et al. (2009) describe research supporting the effectiveness of reflection as an aid to improving learning.

Most of the head teachers interviewed said that the level of results analysis asked for in the proforma was something they had not considered doing before. However, the three who did come to the interview with completed proformas said it was beneficial to look at the data over time and to identify trends. The head teacher at MCWAE2 suggested that providing a data downloading capability from SMART would encourage greater access and use by science teachers of the data, particularly the student survey data. She, along with the head teachers at MCWBE3 and MCPSWBE2, said they saw value in keeping a record over time of results from Year 8 to Year 10 science.

All head teachers interviewed said they kept faculty records of HSC results over time. No analysis was done by head teachers to find the proportions of students doing senior science courses each year in case study schools before their participation in this project. Most did not have faculty records over time of Y10 grades after external testing stopped. This was not a priority because almost all students went on to Year 11 and many took up senior science courses. There seemed to be little awareness by head teachers that these records provide a basis for monitoring engagement in science (Year 12 proportions relative to English) or of progress in learning (from Year 8 to Year 10).

In relation to monitoring progress in learning from Year 8 to Year 12, doing this was not helped by the fact that Year 8 results are reported against six levels, Year 10 results are reported against five grades and Year 12 results are reported against six levels (not commensurate with the Year 8 levels). The possibility for monitoring student achievement and engagement from Year 8 to Year 10 using VALID10 results is now available to those schools taking up the VALID10 test. K-6 schools taking up the VALID6 option can report their results to the secondary schools receiving their students.

It is also possible that some science head teachers and classroom teachers do not have the statistical skills and/or spreadsheet fluency and expertise to confidently manage the transfer and transformation of the EV data. This was found to be a barrier to meaningful engagement with NAPLAN results for some secondary teachers of English and Mathematics (Pierce & Chick, 2011).

Student EV results are distributed to parents after printing out by the school. Typically, results are sent home in the same way the bi-annual school reports on all courses are distributed. When asked what feedback, if any, was provided by parents to science teachers about the EV reports, none of those interviewed could recall any parent commenting on or asking for more information. This was also true for the two schools that said they handed the reports to parents at their regular parent–teacher night held early in Year 8. When asked why there was no apparent interest from parents in the results, several commented that the time interval between doing the test (November the previous year) and receipt of the report (March-April the next year) may have been a factor, though none suggested how that might have influenced the apparent lack of parent interest. The research literature on the reduced effect of feedback provided well after the assessment was mentioned earlier in this section.

The inclusion of a student survey with the EV test was a unique addition to large scale whole of cohort testing in NSW schools. Only students at schools chosen in national samples to participate in TIMSS testing (in Year 4 and/or Year 8) had completed surveys and tests before EV testing began. Teachers from all three school groups responding to the survey had individually looked at the student survey results (67%). Yet only 20% had discussed the results with colleagues or students. Case study schools said in the interviews that the main reason for not having those discussions was that teachers had not been given support or encouragement to do so. On the other hand, almost all the case study schools said they met regularly as a staff and that assessment was a frequent item on the agenda for those meetings. Had the student survey been of interest or seen as relevant, given the regular meetings, it could have been on the agenda. It would appear that science achievement was of more interest than student engagement with science.

The personal and professional discomfort of teachers with student dissatisfaction is understandable. In recognition of that, EV results are deliberately not publicized in the same way NAPLAN results are. All interviewed said there was no pressure from the school executive over EV results, one way or the other. Whilst this is in keeping with the low-stakes intentions of diagnostic assessment, the main reason the feedback is provided is to promote change leading to better overall levels of student achievement and engagement. There is a strong element of trust being placed in the professionalism of teachers to respond to the feedback. Based on the high level of intention (more than 50% saying they would take up the voluntary Year 10 test), the relative absence of negative feedback about the program (see individual teacher comments above) and the fact that 48% of teachers had used the results to inform changes to teaching and learning programs, that approach by the education system and school managers seems to be sound.

PCWAE1 teachers were aware students at their school did not like science or their experience of it but could not offer any reason apart from reporting a comment from students that teachers at their school were strict about students completing their work. As reported in Chapter Five, students at both PCWAE2 and PCWAE3 had low rankings of their school science experiences as well (see Table K.5A, B and C in Appendix I). That negativity was also reported for their primary school science experience and all three were below the state figures for the proportions including science in their list of three favourite subjects (PCWAE2 was the best of the three there). The apparent paradox of better than expected achievement and dislike for their school science experience will be discussed further in Section 6.6.

Of the three fully selective schools, students from MGFSAE2 recorded the least positive views of their school science experience. MGFSAE2’s ranking on Items D and E combined was 16th (out of 16). The three selective entry schools’ top achievement band students recorded the three highest levels of agreement with Item B which said that science was the hardest subject I learn (1st, 3rd and 2nd respectively for the WAE, AE and WBE schools in that order). A review of the assessment related artifacts provided for the three schools showed that the expectations for knowledge and understanding were well above syllabus expectations which may be a factor contributing to them not enjoying their school science experiences.

In contrast to the above, as shown in Tables K.5D and K.5C in Appendix I, MCWBE5, MCWAE1 and MCWBE4 were at the top of case study school rankings (and above the state) for student enjoyment of their secondary school science experience (student survey Items D and E). The three schools also had the largest proportions of students nominating science as the subject they learnt most in (Item F). EV results for all three of the schools were relatively low (82.54, 82.14 and 73.63 respectively). However, students at MCWBE5 thought science was not as difficult (5th out of 11 non-selective schools and counting the state as one school) as students at MCWAE1 (1st) or MCWBE4 (2nd) did. For students at these two schools, perceived difficulty did not seem to impact enjoyment of their school science experience. Enjoyment of science and/or engagement, as was pointed out in Section 5.4 for all students (three levels of achievement together) at the different schools, was not obviously related to either SEA scores or residual rankings.

Analysis of the assessment narratives for MCWBE5 and MCWAE1, and the schools they were compared with (PCWAE2 and PCWAE1 respectively), did not provide consistent, substantive evidence that students at any of the four schools at the end of Year 8 had acquired skills associated with self-regulation. MCWBE4 was not compared to any school and it had a SEA score of 0.7 and a residual of -1.58. Their assessment narrative was more focused on how science contexts were being used to improve students’ literacy and numeracy skills and identity formation. A hypothesized link between high achievement and engagement at the end of Year 8 and self-regulation could not be supported. In essence, whilst self-regulation (Boekaerts & Corno, 2005) and learning how to learn (James, 2006) are seen as important, the methods used in this project and related findings did not show the hoped-for consistent pattern that could reasonably be attributed to student self-regulation.

Overall, based on teacher comments in the interviews, students like doing the online EV test which teachers said students find inherently interesting. In only two schools was it suggested that (some) students did not take the test seriously (MGFSWBE1 and MCWBE4). None of the schools reported spending time preparing students for the test apart from the basic requirements to ensure login success and for students to familiarize themselves with how to respond to the items and tasks. The common message given to students was that the Year 8 EV results would not be used in school assessments, but that students should do their best because the test results would help teachers to improve their teaching.

None of the case study teachers interviewed mentioned they had used the teaching strategies advice provided in SMART to address misconceptions identified in feedback to the school. The overall survey response to that question was fewer than half saying ‘yes’ (one in three WBE and WAE teachers said ‘yes’; AE teacher response was two in three saying ‘yes’). The provision of this resource in the feedback package was overlooked by most teachers, it seems. This researcher’s explanation for that is the overall lack of incentive for teachers to engage with the mass of data available in the SMART package. That appears to be the case for WAE schools’ low response compared to AE schools where results were perhaps not as good or teachers in those schools were keen to do better for their students. WBE schools had a lower level of engagement for all aspects of the program.

The national tests in Australia for literacy and numeracy (NAPLAN tests) were used in this project to develop predictors of EV success. These tests are examples of summative tests also being used for diagnostic purposes (as well as other purposes discussed in Chapter Two). The anecdotal feedback from the science head teachers in case study schools was that NAPLAN feedback attracts more interest, attention and time from parents, students and their schools’ senior executives than does the feedback on EV results. The reasons most gave for the attention to NAPLAN were the publication of the school’s results on a well-publicized website for all the world to see (the My School website), media interest in comparing schools and the requirement to report NAPLAN results in annual school reports.

In summary, teachers are using or adapting EV test items and tasks from past tests to enhance their science department formal assessment programs (69%). Teachers in schools where results are well above (WAE) or at expectation (AE) are using the resources more and in a wider variety of ways than their colleagues in schools where results are well below expectation (WBE). However, overall, fewer than half (48%) of the teachers that responded to the survey said they were using the feedback from EV results to amend their teaching and learning programs. Teachers in schools where results were as expected (AE) reported the highest ‘yes’ response rate (75%) to the item about using the feedback from EV results to amend their programs.

6.2.2 Teachers and SOLO

Engagement with the SOLO model was addressed in the online survey in three questions, questions six to eight. SOLO was a key element in the assessment framework for the EV program because it provided the basis for feedback about the level of thinking evident in student responses to items and tasks in the test.

As reported in Chapter Four, the overall finding was that differences between the responses of WAE, AE and WBE teachers on any of the aspects of SOLO engagement investigated here were not statistically significant. Also the overall yes responses to items began at 54% and declined from there to a low of 5% on the second last item in Q6 which was about reporting to home using SOLO.

On Q7, which asked teachers to rate their understanding of SOLO, on a rating scale going from very poor to poor, then acceptable and good to very good, the modal response was “acceptable” (29% chose that option).

When teachers were asked in question eight (Q8) where they learnt most about SOLO, the most commonly mentioned situation was training for marking the EV test or marking EV tests (35%). The next was in EV workshops (9%) followed by never heard of SOLO (7%). It was not possible to distinguish whether the responses were about the Year 8 marking for extended response tasks which is done externally to the school by experienced, trained, science teachers or Year 10 marking. Training for the latter is done at school or home by working through online modules.

Of the sixteen schools visited, only two were actively using SOLO to inform their assessment feedback to students (MGFSAE2 and MCWAE2) at school. MCWBE4 indicated that the school was considering using SOLO as an enhancement to its assessment policies and practices. Neither school used it to report to parents or carers. MCWAE2 recognised its potential to provide feedback to help students with their expressive language skills in science and were using it to mark extended response questions teachers at the school had constructed or appropriated from other sources. The HT at MCWBE3 was actively working on building staff understanding about SOLO in order to use it as the basis for feedback to students in science.

A cogent reason for not using SOLO was given by the two teachers involved in the interview at PCWAE2. They said that students found it confusing to reconcile SOLO feedback with feedback provided by the NSW Board of Studies (the Board), which are reported in levels and grades respectively and based on different criteria, as explained in Chapter Two. Given that the school’s priority (see their school narrative in Appendix H) was to have students do senior secondary science courses, the teachers felt their efforts would be better spent having students understand the Board’s Common Grade Scale approach to assessment. Head teachers at MCWAE2 and MGFSAE2 who were actively using SOLO to improve student learning in science appeared to have a reasonable understanding of SOLO levels. The head teacher at MCWAE2 was working with the original SOLO taxonomy rather than the version being used in the EV program.

From the above, the SOLO component of the EV program was not very well understood by science teachers responding to the survey and was largely ignored as a basis for providing feedback to students about their level of thinking in science.

The EV program and the SOLO model are exclusively Department initiatives and the above feedback will be of interest to the Department. However, the use of a formal, externally (to the school) developed and imposed summative test to provide feedback to teachers on progress in learning is of general interest to all systems where such testing is done with diagnostic intent. As was discussed in Chapter Two, the last round of PISA testing (2015) in science included a cognitive demand dimension in its assessment framework. SOLO was considered for that role but the test developers chose an alternative, simpler model that recognised three levels of cognitive demand (OECD, 2017). Recognising cognitive demand in the assessment framework of an international test, such as PISA, represents a qualitative improvement in the sophistication of measurement-based, assessment models of which the EV and PISA tests have been described as exemplary (Fensham, 2013).

6.3 Discussion of findings addressing research question two

The question asks: what formative practices are evident in the assessment-related work of science teachers and why are they used or not used?

The focus here will be to look first at case study schools’ assessment-related work narratives (provided at Appendix H) for examples of science department practice that reflect formative intentions (6.3.1) before looking at evidence of formative practice in the classroom (6.3.2). In the section on classroom practices, discussion will be linked to the five dimensions of formative practice which comprised the theoretical framework for assessing the extent to which practices were formative.

6.3.1 Science department assessment practices

As was described in Chapter Five, student allocation to classes in the junior secondary years of high school for the purposes of instruction was done in case study schools almost always on the basis of achievement in literacy and numeracy as assessed by teachers at the end of Year 6. The Department’s staffing formula provides teachers on the basis that no junior secondary class in the core subjects (which includes science) “need exceed 30 students” (NSW DofE, 2017). In practice however, some classes in a given Year were allocated (with staff agreement) more than 30 students in order to create smaller classes for ‘lower ability’ students (generally meaning students with poor learning histories). So called ‘bottom’ classes were generally assigned close to 20 students or fewer if possible. From the perspective of science teachers, the classes assigned to them were “ungraded” in terms of prior science learning. The science head teachers involved in interviews said their expectation was that teachers would work from that assumption. The range of responses by teachers to the diversity of students in their classes is discussed in Section 6.3.2.


It is important to understand that teachers in NSW government schools were required by their employer (the Department) to make use of a specified curriculum and employer-provided policy documents (NSW DofE, 2013) to guide preparation of their teaching and related assessment work. This requirement applied well before the period of interest for this project and continues today. The response by science teachers in case study schools to the above was to use the syllabus and related implementation support and policy advice to guide their construction of a planned program of work for their students mapped to the forty weeks of the school year. Important structural features of the program of work were the curriculum standards described in terms of outcomes and related content to define the scope and level of expected learning related to each outcome. The curriculum expectation was that the learning would be spread equally between knowledge and understanding of science and related contexts and the acquisition of skills related to working and communicating scientifically (BOS, 2003).

In case study schools, the science department’s program described for teachers the science knowledge and understandings, skills and attitudes they were expected to “teach” to students in their classes in the four years from Year 7 up to the end of Year 10 and how it would be assessed along the way for the purposes of collecting evidence of learning to be used in preparing progress reports about student learning for parents. There is a requirement to report to parents at least twice a year. The curriculum (called a syllabus in NSW because of its specificity about what was expected to be taught) in place at the time of interest for this project included advice that teaching and learning need to be closely linked, an intention captured in the phrase “assessment for learning” (BOS, 2003, p. 70). To help teachers do that, the Department and Board provide a range of support materials and professional learning activities that teachers can access and work through to devise learning and assessment tasks that better reflect the full range of curriculum intentions and that are fair, valid and reliable reflections of those intentions.

In response to that support, the range of tasks and activities described in the narratives as being used by teachers to collect evidence of learning, apart from pen and paper tests, included student research projects (a mandated activity in the curriculum), fieldwork reports, excursion reports, written responses to laboratory tasks, internet and other text-based research tasks, oral presentations and creative activities such as model making and diary writing, to name some. These activities were either done entirely in class time or in both class and home time. Student responses to the activities provided evidence of learning that was used by teachers for both formative and summative assessment purposes. Whilst a wide range of tasks was being used, what students know and understand as reported to parents (typically expressed as a mark or grade) was dominated by the weight of evidence from tasks returning marks based on teachers’ judgments of the quality of expressive language used by students in the construction of responses to the requirements of those tasks.

Accompanying many activities were rubrics setting out the learning expectations and success criteria that would be looked for in assessing the worth of the evidence of learning demonstrated in student responses. The learning intentions, as written down, were typically derived from curriculum outcomes and related content that described the scope and form of expected responses (descriptions including comparison and contrasts, graphic representations with appropriate labels, explanations, justifications and aspects of performances to name some). The judgment to be made of the quality of the response was almost always referenced to the five grades in the Board’s Common Grade Scale advice (BOS, 2013). SOLO levels and related language only appeared in artifacts provided by two case study schools (MGFSAE2 and MCWAE2).

Of interest was the place AE teachers occupied in the analysis of the responses against the five dimensions of formative practice. The AE teachers are the group of teachers whose students’ EV results were as expected based on the predictor. AE teacher responses provided reference levels for this exercise. From the analysis reported in Chapter Five the frequency means for the AE group of teachers were always between the WAE and WBE teachers. However, for the third dimension (feedback that advances learning) both the WAE and AE means were statistically significantly different to (above) the WBE mean which meant that these two groups of teachers were more frequent users of a variety of feedback sources than were their WBE colleagues.

How teachers engage students with curriculum content and conduct assessments of the extent of learning is a decision for classroom teachers (BOS, 2003). The extent to which the forms of evidence from the surveys and assessment narratives can be said to be formative is discussed below in Section 6.3.2.

6.3.2 Formative classroom practices

Together, the questions and related activities described in the items from the science teacher survey in Questions 9 to 15 address what were called in earlier chapters five dimensions of formative practice. The dimensions, bringing together science instruction and assessment strategies, are:

1. Clarifying and sharing learning intentions and success criteria (LISC);

2. Engineering effective classroom discourse and using learning tasks that elicit evidence of learning (CDEL);

3. Providing feedback that moves learners forward (FTAL);

4. Activating students as instructional resources for one another (including peer assessment) and their teachers (ASIR);

5. Activating students (and teachers) as owners of their own learning (including self-assessment) (ASTL).

These five dimensions provide the framework for assessing the extent to which practices discussed here can be described as formative. As discussed in Chapter Four, the intention was to find out the extent to which teachers were themselves actively using, as well as promoting student agency with, formative practice dimensions. The examples and contexts discussed here relate to Years 7 and 8.

Teacher use of learning intentions and success criteria (LISC)

As mentioned in the findings from the teacher survey (Section 4.2.3.1), students had very little input into the choice of task, learning intentions or success criteria which for the most part appeared to be determined by the teacher. There were no statistically significant differences between the three school groups when it came to the frequencies with which learning intentions and success criteria were used. Also, opportunities for students to take ownership were less frequent than teacher-led situations.

In the context of teaching and learning tasks, the purpose of tasks was typically explained to students in terms of curriculum intentions. When helping students prepare for assessment, teachers used rubrics to describe features of answers that would attract ‘full marks’. Almost all schools provided written rubrics to students to help them understand the criteria that would be used to assign scores. Students typically attempted formal assessment tasks individually and without assistance from others. Their responses were typically scored by teachers either working alone (most often) or shared with other teachers. As will be clear from the discussion following relating to the next two dimensions, the use of LISC to focus discourse and feedback was more frequent in the sample of teachers in WAE and AE schools when compared to the sample of teachers in the WBE schools.

Classroom discourse eliciting evidence of learning (CDEL)

In this dimension of formative practice, statistically significant differences were found between the three school groups relating to teacher-directed classroom discussion. WAE teachers were more frequent users than WBE teachers of wait-time before responding, of discussion about items from tests and assignments and student responses to those items. WAE teachers more frequently asked students to explain their thinking as well as explaining their (teacher) thinking to students. Teachers in WAE schools, in particular, had a strong commitment to developing students’ literacy skills and helping students to acquire the scientific vocabulary needed to describe and explain the science in the world around them.

MCFSWAE1 provided a mostly school-based range of science activities focused on laboratory work linked to text-book practical activities and related skills development. The head teacher reported an emphasis on writing explanations as a focus for Year 7 and 8 science. MGFSWBE1, on the other hand, had an emphasis on science process skills but students also worked on projects (involving Visual Arts and Personal Development, Health and Physical Education). Students were also provided with experiences beyond the school gate (an excursion to the Zoo). The girls were given opportunities to discuss science in groups and in whole class discussion and to make models and deliver presentations about what they had learned. The girls at MGFSWBE1 were provided with a more diversified set of science-rich contexts than coed students at MCFSWAE1 and a wider range of experiences in which to explore the meanings of science.

Both PCWAE1 and its paired school, MCWAE1 (pair one), provided students with a range of science-rich contexts both in the school science laboratory and beyond the school gate. The teachers worked hard at both schools to fit syllabus intended science learning with contexts relevant to the experience of students. Both schools had the smallest Year 7 and 8 classes of all the case study schools. The experiences provided were well used by teachers to develop students’ oral and writing skills as well as helping them to acquire the vocabulary needed to describe and explain the science in the experiences provided.

MCAE2 and its paired school MCWBE3 (pair two) provided a range of school-based science laboratory and text-based activities for their students. The focus at both was on developing skills related to scientific investigations in those contexts. MCAE2 engaged its students in a wide range of science projects and it sends the best projects to the NSW Science Teachers Association Young Scientist Awards. Each year it establishes a class of Year 7 students with an interest in science and who have done well in a science-based test set by the school and completed in Year 6.

PCWAE2 and MCWBE5 (pair three) both provide a range of school-based science laboratory and text-based activities for their students. PCWAE2 makes use of a range of agricultural contexts outside the classroom and beyond the school gate to widen the opportunities for its students to engage with science. MCWAE2 provides a range of science-rich activities outside the classroom and beyond the school gates to its students as well. PCWAE2 places particular emphasis on developing the literacy skills of its students, especially through whole class discussion and reading aloud.


Providing feedback that advances learning (FTAL)

This was another dimension where there were statistically significant differences between the three groups of teachers. In this dimension both WAE and AE teachers were more frequent users than their WBE colleagues of a wide range of opportunities for and sources of feedback ranging from digital polling, to ticks, marks, grades and comments, both encouraging and diagnostic (including the provision of model answers, in terms of success criteria, misconceptions, SOLO levels, elements of the Quality Teaching model, syllabus expectations and Bloom categories); WAE and AE teachers were also more likely to ask their students for feedback on their teaching and to change direction in lessons in response to student feedback.

It was established in Chapter Four that EV results for students in WAE schools were better than comparable AE or WBE school results. The better results indicated that WAE students were more scientifically literate than students in comparable AE or WBE schools. Thus it was no surprise that the dominant theme to emerge from the WAE case study school narratives was the focus WAE teachers had on providing feedback with the explicit purpose of developing expressive literacy skills and student acquisition of science vocabulary related to the science topics being studied at that time.

Activities included requiring students to learn the vocabulary related to the concepts being taught in the current topic (all six WAE schools), by getting them to write extended answers on worksheets with scaffolds and space to write descriptions, comparisons, explanations and justifications (MCFSWAE1 and MGFSAE2). At the end of Year 8, MCFSWAE1 and MGFSAE2 had almost identical result profiles in the four result categories monitored for this project. MGFSWBE1 had a much-reduced top band performance in the extended response category of results than the other two schools (see Table 5.10). The difference there was attributed by this researcher to fewer opportunities being provided to the girls at the WBE school to perform in this way and consequent less feedback to support that way of representing what they knew. Their potential for performing strongly in this way was suggested by the fact that their performance in the communicating scientifically report category was stronger than the other two schools (see Table 5.10).

PCWAE1 students, compared to MCWAE1 students had the better EV result overall and result profile as well, but then almost all their students were from English as a first language background. That was not the case for MCWAE1 students (85% of their students came from language other than English backgrounds and around a third of them were recent refugees with little or no primary school education). That said, the use of feedback to improve learning outcomes at PCWAE1 was outstanding in that it produced a result profile for its students at the end of Year 8 that was better than many schools with higher SEA scores such as MCWAE2, MCWBE3 and MCWBE5 (see Table 5.1 and Table K.1 in Appendix I).

Whole class oral discussion of science contexts and related concepts was explicitly mentioned by all six WAE schools as well. PCWAE1 explicitly referred to pretesting when starting new topics. PCWAE2 provided the most evidence of a differentiated approach to dealing with the diversity of students’ literacy and numeracy levels at the time of interest for this project. The feedback provided by PCWAE2 teachers in the context of classroom work was very effective in supporting science learning, as demonstrated clearly in the better result profile for the extended response report category when compared to MCWBE5, which was its paired school (see Table K.1 in Appendix I). By comparison the narrative for MCWBE5 showed an emphasis on process over the acquisition of conceptual knowledge (and related vocabulary). The profiles for working scientifically were very similar. Overall, PCWAE2 had a positive skew in their result pattern; MCWBE5 had a negative skew.

Of interest, as discussed in Chapter Five, were the very different levels of student satisfaction with their school science experience. That was recorded by these two schools in their responses to the six items from the student survey. On the combined scores for items D and E (enjoyment of science lessons and science as one of their three favourite subjects), PCWAE2 ranked 14th (out of 16 schools); MCWBE5 ranked 4th. See Tables K.5A-D in Appendix I for their comparative scores on all six items. Not enjoying science did not deter students at PCWAE2 from achieving highly and nor did it appear to deter their take up of senior science courses, relative to English at their school, the state and MCWBE5 (see Table 5.9). This apparent paradox will be further discussed in the next section, Section 6.4.

A similar outcome for the two fully selective girls schools was in evidence as well. MGFSAE2 outperformed MGFSWBE1 despite the latter having a higher SEA score. On their combined scores for Item D and E, MGFSWBE1 ranked ahead of MGFSAE2 (13th compared to 16th out of 16). MGFSAE2’s program was strongly text-based and linked to conventional science laboratory-based skills. The girls at both schools had the 3rd and 2nd highest levels of agreement with the statement that science was the most difficult subject they learnt (MCFSWAE1 was 1st). Their artifacts, when compared to those of the other case study schools, showed knowledge and understanding demands way above the other schools (and for that matter syllabus expectations as well). In the end the take-up of science subjects overall by the WBE school was greater than in the AE school (relative to the state) by a wide margin (see Table 5.11).

Summative assessment at PCWAE2 was much more consequential for students than in other case study schools. Students were moved to a different class at six monthly intervals if performance and achievement was either very good or poor. The reason given for that was to better prepare students for success in senior science courses as a means to the end of obtaining good science-related jobs after school.

Activating students as instructional resources for one another (including peer assessment) and their teachers (ASIR)

There were no statistically significant differences between teachers in the three school groups when it came to activities linked to this dimension of formative practice (see Table 4.15). Teachers in the three school groups had comparable usage frequencies for activities such as collaboratively preparing assessment tasks, marking criteria or rubrics and shared marking (approximately 95% said sometimes or often). When it came to providing students in Years 7 and 8 with opportunities for peer assessment, they were limited and subjective (not well grounded in the language of learning intentions and success criteria). Examples mentioned in the artifacts were not limited to WAE schools (PCWAE1 and MCWBE5 included examples). In terms of frequencies (combining sometimes and often responses to items) for the provision of feedback to peers using success criteria, working in groups on think-pair-share-report activities, writing learning intentions and success criteria, constructing assessment items and tasks, the proportions ranged from 86% being provided with opportunities to use success criteria or assessment rubrics and guidelines to 24% being given the chance to construct assessment items and tasks. A number of schools mentioned that they gave more opportunities for students to provide feedback to each other in Years 9 and 10 (MGFSAE2, MGFSWBE1, MCWBE5 and PCWAE2).
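The combined "sometimes or often" proportions quoted in this section can be illustrated with a small sketch. It assumes that each teacher's answer to a survey item is stored as one label on the four point scale (never, seldom, sometimes, often). The function name and the sample responses are hypothetical and are not taken from the survey data.

```python
from collections import Counter

# Four point frequency scale used in the teacher survey items.
SCALE = ("never", "seldom", "sometimes", "often")


def sometimes_or_often(responses):
    """Return the percentage of valid responses that are 'sometimes' or 'often'."""
    counts = Counter(responses)
    # Only count answers that sit on the scale; anything else is ignored.
    total = sum(counts[label] for label in SCALE)
    if total == 0:
        return 0.0
    return 100.0 * (counts["sometimes"] + counts["often"]) / total


# Hypothetical answers from ten teachers to a single item.
sample = ["often"] * 3 + ["sometimes"] * 4 + ["seldom"] * 2 + ["never"]
print(sometimes_or_often(sample))  # prints 70.0
```

Reporting one combined figure per item makes items easy to compare, but it hides how the answers split between "sometimes" and "often".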

Activating students (and teachers) as owners of their own learning (including self-assessment) (ASTL)

Analysis of teacher survey results for this dimension revealed statistically significant differences in the teacher-initiated aspects of this dimension of formative practice (see Table 4.35). WAE teachers, compared to their WBE colleagues were more frequent evaluators of lessons, keepers of notes on learning issues individual students have, accessors of information about assessment, more frequently engaged with colleagues in activities related to improving personal and shared knowledge about syllabus learning intentions and what progression in science learning ‘looks like’. The means between the three groups were not statistically significantly different in terms of the opportunities provided to students to redo work to a higher standard (71% of teachers said they did this sometimes or often), getting students to self-select items for portfolios (30% said sometimes or often) and keeping a journal of reflective writing on science (23% said sometimes or often).

As the above shows, opportunities for students to self-assess were not limited to WAE schools and those opportunities were infrequent. Two examples were recorded in the narratives for case study schools. PCWAE1 provided an opportunity in the context of a toy project and MCWBE5 gave students the opportunity to self-assess against five criteria on a number of tasks.

The anecdotal evidence from interview and artifacts was that where peer and self-assessment were discussed during the interviews, they were opportunities given more to students in Years 9 and 10 than Years 7 and 8 at the time of interest for this project. The same was true for extended group work and use of strategies such as think-pair-share-report or jigsaw methods.

The finding that WAE teachers, compared to their WBE colleagues, were more frequent users of a wide range of activities involving the use and modeling (to peers and students alike) of good learning behaviours was indicative of them ‘practicing what they were teaching’. This was most mentioned when it came to staff meetings where assessment-related work was being discussed, when assessment items and tasks were being collaboratively developed or selected, when marking rubrics were being developed and collectively used with each other and students to assess student responses to tasks (see section 4.3.2.5).

Overview of and reasons for using or not using formative assessment

Science teachers in NSW government schools, after a decade of externally provided Year 8 science testing and related feedback on achievement informed by the SOLO model, have not taken up SOLO in a substantial way. The most probable reason for it not being more widely adopted is the requirement to report achievement in terms of grades linked to syllabus standards that are not, themselves, defined with any reference to the SOLO model. This was explicitly mentioned by PCWAE2 as one reason for not continuing with the VALID 10 test after the year of its introduction in 2015.

EV science test results were best in schools where science teachers were more frequent users than their colleagues in other schools of activities related to three dimensions of formative practice. The dimensions were:

• discourse eliciting evidence of learning (second dimension);

• the provision of feedback known to progress learning (third dimension); and

• the use and modeling (to peers and students alike) of good learning behaviours (fifth dimension).

The first dimension about learning intentions and success criteria was being well used by teachers in all schools to guide both instruction and assessment. The language of intentions and criteria were almost invariably derived from the language of outcomes and related content that defined curriculum standards in the official curriculum for NSW schools. Teacher use dominated this dimension. The overall result, relative to the four point scale of never, seldom, sometimes and often, was between sometimes and often as shown in Figure 4.12. Opportunities for students to develop skills in their use was rated between seldom and sometimes, but closer to sometimes.

The fourth dimension of formative practice relates to activating students as instructional resources for each other and their teachers. Student performances provide teachers with feedback they can use to adjust and improve instruction. Providing opportunities for peer assessment is another way of doing that. Teachers overall were evenly distributed in their responses to items related to this dimension by answering from seldom to sometimes (Figure 4.15). There was anecdotal evidence of more frequent opportunities for students to engage in both formal and informal (structured group work) peer assessment in Years 9 and 10. However, teachers working together to develop assessment programs, items and shared marking was rated between sometimes and often, but closer to often.

No explicit reasons emerged from the interviews as to why students weren’t being given more opportunities to develop the skills of formative assessment in science for themselves. However, a possible explanation may be found in the official science curriculum where the language used to describe skill outcomes for the first two years of secondary science (outcomes 13 to 22 of 22 outcomes) is explicit about the need for teacher guidance. The ‘guidance’ provided by teachers of students in those years took the form of worksheets that effectively led students from beginning to end of a task (as evidenced in the artifacts supplied). The student research task was in almost all case study schools a heavily scaffolded project telling students what they could and could not research, how to do it and what needed to be included in a written report at the end.

6.4 Discussion of findings addressing research question three

Research question three asks: Does the use of (and if so, how do) formative practices improve students’ EV results and later achievement in and engagement with science?

The short answer for schools that self-identified is ‘yes’ when it comes to the use of EV results. Table 5.1 lists all the schools that identified themselves. Schools are listed in order of the size and polarity of the residual (second last column from the right) from regressing their EV results over an EV result predictor derived from NAPLAN scores as explained in Chapter Three.

The residual is the measure of an effect size of teaching on the EV result as was also explained in Chapter Three. The schools were grouped according to the size and polarity of the residual. WAE schools had residuals that placed them in the top 20% of schools and WBE schools had residuals that placed them in the bottom 20% of schools. In Chapter Four the findings from a survey of teacher assessment related practices were that teachers at WAE schools compared to WBE schools were more frequent users of activities associated with three of the five dimensions of formative practice. The EV results of schools associated with large positive residuals were also schools where science teachers were more frequent users of activities associated with three of the five dimensions of formative practice than their colleagues at other schools with smaller residuals.
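The grouping of schools by residual can be sketched in code. The actual construction of the predictor is set out in Chapter Three; the sketch below only assumes a single NAPLAN-derived predictor value per school and an ordinary least squares line, and it labels every school outside the top and bottom 20% as AE purely for illustration. All function names and data are hypothetical.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def classify_schools(predictor, ev_result, tail=0.20):
    """Compute each school's residual and label it WAE, AE or WBE.

    The residual is the observed EV result minus the result predicted
    from the (assumed) NAPLAN-derived predictor. Schools whose residuals
    fall in the top `tail` share are labelled WAE, the bottom `tail`
    share WBE, and the remainder AE (an assumption of this sketch).
    """
    a, b = fit_line(predictor, ev_result)
    residuals = [y - (a + b * x) for x, y in zip(predictor, ev_result)]
    # School indices ordered from most negative to most positive residual.
    order = sorted(range(len(residuals)), key=lambda i: residuals[i])
    k = int(len(residuals) * tail)
    labels = ["AE"] * len(residuals)
    for i in order[:k]:
        labels[i] = "WBE"
    for i in order[len(order) - k:]:
        labels[i] = "WAE"
    return residuals, labels
```

With ten schools and the default 20% tails, the two schools with the largest positive residuals are labelled WAE and the two with the largest negative residuals WBE. A real analysis would also need to deal with tied residuals and with the uncertainty in the fitted line.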

Other research discussed in Chapter Three explained that three major factors contribute to the accounted for variability of test results, namely, student socio-cultural background and previous learning history (50%), the actions of their teachers (30%) and school environment factors (20%). As discussed in Chapter Three, the SEA score for a school is an independent measure of the learning potential students bring to school and this was the basis for creating “comparable pairs” of schools. Because it was impossible to account objectively for the 20% of school environment factors in two different schools, comparing different schools with the same SEA score but widely different residuals provided the best opportunity for confirming that differences in the use of formative practices provide the most likely reason for EV result differences.

On that basis, it was established in Chapter Four that for comparable school pairs, the school with better EV results was associated with more frequent use by teachers of activities associated with three of the five dimensions of formative practice. Those activities were:

• promoting classroom discourse that elicits evidence of learning;

• providing feedback known to progress learning; and

• the use and modeling (to peers and students alike) of good learning behaviours.

The assessment-related work narratives for WAE schools all included strong references to using science contexts for the specific purpose of helping students to acquire scientific vocabulary and the skill to use it appropriately and fluently (orally and in writing). MGFSAE2 also had a high priority for ‘writing’ science. The assessment narratives of the other case study schools gave more prominence to other priorities, such as investigation skills (MGFSWBE1, MCAE2, MCWBE3, MCWBE5) or identity building (MCWBE4). By putting together the analysis of the surveys and a priority for using the language of science, the following picture of formative practice in WAE schools emerged.

Teachers in WAE schools managed classroom discourse that produced evidence of learning (the second dimension of formative practice) that informed teacher feedback (the third dimension) on how well students were doing in using scientific language. Teachers spent a lot of their class-time modelling to students good learning behaviours (the fifth dimension) for acquiring the skills and text types related to scientific literacy, including using prescribed learning intentions and success criteria related to scientific literacy, to self-evaluate. The answer to how formative practices improve EV results rests on the credibility of the claim that the formative use of literacy strategies in science contexts is the most powerful influence on science learning operating in the case study schools. References made to Hattie’s (2018) work on effect sizes of different interventions on learning in the opening section of this chapter provide independent confirmation of the power of such approaches.

A hoped-for lasting effect of student exposure to formative practice was their acquisition of the skills and attributes of self-regulated, autonomous learners. This was an expectation based on work reported in the literature review (Chapter Two) linking explicit teaching of the skills of formative assessment to student self-regulation (Black et al., 2006; James et al., 2007). Ongoing exposure after Year 8 to higher frequency teacher use of formative practice and, perhaps, acquisition of student self-regulation, in that light, should continue to produce better results for those students at the WAE school (at both Year 10 and in senior science courses). Also, the expectation was that higher proportions of students would be completing senior science courses than in their paired school. That legacy may be the explanation for better later achievement and higher engagement.

However, in relation to the later achievement and engagement part of research question three, analysis reported in Chapter Five of data from assessment-related work narratives associated with the case study schools was not a sound basis for making any claims about an ongoing effect. The correlation between the measure of Year 8 engagement and Year 12 science course completions was inconclusive. However, the correlation between Year 8 achievement and Year 12 science course completions at a school was persuasive for the case study schools. Given the unreliability of comparing Year 8 and Year 10 results and the absence of a persuasive supporting correlation between Year 8 engagement and Year 12 engagement, the acquisition of self-regulation by more students in high residual schools compared to other, lower residual schools as proposed here could not be justified.

From the above discussion of evidence, it is reasonable to claim that teacher use of formative practices helped students to achieve better results in science at the end of Year 8. An additional conclusion, that students had acquired the skills of self-regulation as a consequence of exposure to those practices, could not be supported by the available evidence.

6.5 Suggestions for further research

Given the importance of producing students who are self-regulated, autonomous learners by the time they leave school, further studies using the research design at the core of this project are warranted by the findings reported in this thesis. The use of reliable, comparable data on achievement and engagement after Year 8 to investigate the worth of teaching formative assessment strategies to students may be worthwhile. Additional research to that end is discussed below.

Provincial students’ apparent low regard for science

The findings reported in Chapter Five add weight to concerns already expressed by other researchers who have reported similar findings from their research for provincial students in the early years of secondary schooling. Lyons and Quinn (2010, 2012, 2014) confirm that Australian provincial school students’ negative attitudes to science relative to their metropolitan counterparts persist up to Year 10. The researchers could only speculate as to the reasons for that negativity but did see it as a barrier to be overcome (curriculum mismatch with student experience, a shortage of specialist teachers and lack of perceived relevance were some of the possibilities they listed). Tytler and Symington (2015), writing in Teaching Science, list other researchers who reported similar findings.

As mentioned in the opening paragraph of this chapter, graduating students who know how to learn is important. That being so, it is also important to find out why provincial students don’t enjoy an experience that many are clearly doing well at (after taking into account their lower literacy and numeracy levels compared to their metropolitan counterparts). If provincial students both understand why they are doing better than expected (by acquiring fluency with and control over the language of science at the very least) and feel they are doing better than expected, that might provide the motivation for even more students to take up science in the senior years. Palmer (2015) identified student enjoyment of science as a reason for taking it up in the senior years of schooling and, hopefully, beyond into preparation for a STEM career.

A first suggestion for future research in this area may be to try to understand why provincial students have a less positive view of their school science experience than their metropolitan counterparts. An initial project might undertake a full analysis of the student surveys for the case study schools in this project. The Department has that data from 2005 up to the present time. At the very least, it may provide a more nuanced understanding of students’ views about their experience of science at school and additional clues as to why they don’t like science. School factors external to the science classroom may be a contributor, but the consistency of the low regard by students in provincial settings (all three of the provincial WAE schools in this project) may well have more to do with parent socio-cultural dispositions that accord a lower value to science in those communities than elsewhere. A related question to explore would be why provincial students take up senior science courses when they clearly do not like the subject.

Student backgrounds, according to Hattie (2003b), are responsible for up to half the accounted-for variation in test results. The suggestion that these values may be implicated comes from the finding that students at the three schools also recorded low enjoyment of their primary science class experiences (Item C in the student survey). Top band achievers at the three provincial schools ranked their primary school science experiences at 12th (PCWAE1), 7th (PCWAE2) and 16th (PCWAE3) out of the 16 schools compared here (the state result was counted as a school in Table K.5B in Appendix I).

The importance of self-regulation

Given the growing importance of producing self-regulated, autonomous learners as a valued outcome of schooling, it may be useful to confirm whether putting more effort into explicitly teaching students the strategies of formative assessment is the most effective way of doing that. It may be that other teaching approaches can do the job more effectively. Two approaches have already shown promise in that regard. The first is inquiry, the second is problem solving. Their importance as aspects of science education is reflected in the working scientifically and communicating scientifically EV reporting categories respectively.

The assumption is that pen and paper tests are able to provide sufficient valid, reliable and authentic evidence of the attributes of self-regulation and learning autonomy. The methodology described in Chapter Three could just as easily be applied to exploring whether, for example, inquiry or problem-solving approaches would be a more effective means to that end. Teacher surveys designed to characterise teaching that reflects best practice in teaching inquiry and problem solving may be substituted for the formative practices survey used in this project. An appropriate set of interview questions could be developed, artifacts collected and related narratives generated to examine for corroboration of findings.

Another approach to investigate is that of representational pedagogies, which were speculatively posited as a “signature pedagogy” for science by Tytler, Prain, Huber & Waldrip (2013). Research papers already published could provide the activity descriptors (such as the one on forces espoused by Huber, Tytler and Haslam (2010)) with which to generate survey items that could be piloted with schools known to be early adopters of these pedagogies.

Representational pedagogies are essentially formative because they shift the emphasis from what the teacher is doing to what the student is doing. That approach to teaching engages students in creating representations of what they are learning and challenges students to test the limits of their explanatory power. The representations produced may be in a variety of forms such as diagrams and 3-D models, written texts, presentations using ICT and including audio and video content or any combination that is deemed appropriate for purpose and audience. Curriculum intent, pedagogy and assessment are evaluated for alignment by all participants in the back-and-forth negotiation of meaning. Representational pedagogies may well be a more effective way to produce students who are self-regulated and autonomous learners than the current approach in the UK to explicitly teach students the strategies of formative assessment (James, 2006).

Confirmation of findings from the initial research project

As mentioned above, the introduction of the VALID 10 test provides standardised achievement results for students in all participating schools at the end of Year 10. The school sets of science results can be used to evaluate the prediction that in pairs of comparable schools, the school with the higher residual will continue to produce better results at the end of Year 10 and again in science subjects at the end of Year 12. Data sets for this project should be available from the Department for cohorts of students doing VALID 8 (beginning) in 2015 (based on the new national curriculum), VALID 10 in 2017 and Year 12 results from 2019.

Smaller studies may choose to test the validity and reliability of aspects of the methodology used in this project. Given the closeness of the coefficients of determination for the four predictors used, it might be simpler to use the Year 7 NAPLAN reading results on their own as the basis for the predictor used in the regression analysis without serious loss in the integrity of the findings.
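
A single-predictor version of such a regression can be sketched as follows. This is a minimal illustration only, not the procedure used in Chapter Three: it assumes school-level mean reading scores as the predictor and school-level mean science results as the outcome, and it standardizes each residual by dividing by the residual standard error. The function name and all data values are hypothetical.

```python
from math import sqrt
from statistics import mean


def standardized_residuals(x, y):
    """Fit y = a + b*x by ordinary least squares.

    Returns (a, b, z), where z holds each residual divided by the
    residual standard error (n - 2 degrees of freedom).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor values must not all be equal")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx            # slope
    a = my - b * mx          # intercept
    resid = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    se = sqrt(sum(r * r for r in resid) / (n - 2))
    if se == 0:
        raise ValueError("perfect fit: residuals cannot be standardized")
    z = [r / se for r in resid]
    return a, b, z


# Hypothetical school means (illustrative values, not data from this study)
reading = [520.0, 540.0, 560.0, 580.0, 600.0]
science = [78.0, 83.0, 84.0, 91.0, 94.0]
a, b, z = standardized_residuals(reading, science)
```

A school with a large positive entry in `z` is doing better in science than its reading results alone would predict, and a large negative entry indicates the reverse.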

The teacher survey instrument would benefit from including a wider array of strategies that teachers may be using to enhance student agency as autonomous learners. Hattie (2018) has a list of 33 strategies under the heading of Strategies emphasizing student meta-cognitive/self-regulated learning. Also, existing PEEL resources (Mitchell et al., 2009) could be accessed for appropriate “good learning behaviours” (p. 172). Procedures could be tested for recognition by teachers and selected for inclusion. A strategy not on Hattie’s list is the Predict-Observe-Explain sequence (White & Gunstone, 1992). The expanded item set so produced could be added to the teacher survey for a repeat of the original study along with the enhancements mentioned above.

An additional enhancement would be to include interviews with and artifact collection from Year 10 teachers responding to a science teacher survey. The survey should be the same for both Year 8 and Year 10 teachers.

This project used the average of four consecutive years of standardized residuals as the basis for choosing maximum variation cases (Flyvbjerg, 2011). In a future study, researchers could look for schools where the residuals were increasing or declining over the years and look for changes in the assessment-related work narratives for those schools in a before and after study.
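
The selection idea described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions rather than the analysis used in this project: the school names and residual values are hypothetical, the trend is taken as the least-squares slope of the residuals against year index, and the cut-off of 0.2 per year for calling a trend "increasing" or "declining" is an arbitrary choice made for the example.

```python
from statistics import mean


def trend_slope(values):
    """Least-squares slope of values against year index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two years of residuals")
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def summarise(residuals_by_school, threshold=0.2):
    """Return {school: (mean residual, slope, trend label)}.

    `threshold` is an assumed cut-off, not a value from the thesis.
    """
    summary = {}
    for school, zs in residuals_by_school.items():
        slope = trend_slope(zs)
        if slope > threshold:
            label = "increasing"
        elif slope < -threshold:
            label = "declining"
        else:
            label = "stable"
        summary[school] = (mean(zs), slope, label)
    return summary


# Hypothetical standardized residuals for four consecutive years
data = {
    "School A": [1.2, 1.4, 1.3, 1.5],
    "School B": [-0.9, -0.3, 0.4, 1.0],
    "School C": [0.8, 0.2, -0.5, -1.1],
}
summary = summarise(data)

# Maximum variation cases: highest and lowest four-year mean residual
highest = max(summary, key=lambda s: summary[s][0])
lowest = min(summary, key=lambda s: summary[s][0])
```

In this made-up data, a school such as "School B" has a four-year average close to zero but a clearly rising trend, which is the kind of case a before and after study would target and which averaging alone would hide.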

Evidence gathering, apart from the methods described above, could be expanded to include classroom observations (recorded by people, audio and/or video technology) of teacher enactments of target strategies and student responses to them. These observations could be used to corroborate teacher responses to surveys and to confirm the fidelity of strategy interpretation.

Given the above suggestion that community valuing of science may be a factor inhibiting student engagement with science, it may be useful to have samples of parents respond to appropriate items from the current student survey in the first instance. Their responses to the same items (or tested equivalent items) may provide insights into the source of student attitudes, particularly if students and their parents independently complete the survey and the parents’ responses are matched and compared with their child’s responses. In the event that this does not provide the needed insight, a wider range of questions about science may be helpful. To that end, Barry Fraser’s (Fraser, 1978) Test of Science Related Attitudes (TOSRA) survey might be a good starting point.

6.6 Recommendations

This section provides recommendations to the Department, the NSW Education Standards Authority, the Australian Curriculum, Assessment and Reporting Authority and a wider audience of educational researchers with an interest in the theory of formative assessment, its integration with instruction (formative practices) and its potential for guiding students to learn how to learn. The recommendations are supported by the findings reported in this thesis.

The interest of a wider audience of educational researchers is predicated on their prior interest in testing the power of formative assessment to improve student achievement in and engagement (especially beyond school) with science. In that light, it is hoped that some researchers might be prepared to undertake further work along the lines suggested in the previous section to add more weight to the body of research supporting the power of formative practice to improve achievement in and engagement with science.

One of the claims for the importance of this research is the methodology developed by this researcher to isolate the contribution of teaching from other contributions to a test result. Here it was used to separate the contribution of general literacy and numeracy skills from the scientific literacy component in a science test result. The scientific literacy component is what students have learned in the context of their science lessons. Other researchers might be interested to use it and confirm its utility in other learning areas apart from science, such as geography or history.

The latest round of PISA testing completed in 2015 (see Chapter Two) emphasized that providing feedback on the level of thinking demonstrated in student responses is useful because it differentiates between recall of an attribute of one science concept and being able to “relate and evaluate many items of knowledge” (OECD, 2017, p. 40). Demonstrating the latter in a science context is arguably a higher value response in the context of assessing competence in scientific literacy. To do so requires a student to use more cognitive resources than the recall of a single attribute. To be scientifically literate, as the PISA framework specifies, requires students to operate at a level where they can relate and evaluate more than one item of knowledge. Including cognitive demand as a dimension in the assessment framework enhances the validity of the test construct and results from it (Messick, 1995; Mislevy, 2008).

After considering a number of schemas to operationalize the construct of cognitive demand, including the Biggs and Collis (1982) SOLO taxonomy, the developers of the PISA test adapted Webb’s four-level Depth of Knowledge model (Webb, 1997) to that end.

The OECD-PISA decision to include cognitive demand as an aspect of competency in scientific literacy vindicated the decision by the Department to include a cognitive demand dimension from the outset (2005) in its assessment framework for the EV program. The Department used the SOLO model developed by Pegg, Panizzon and others at the University of New England (Panizzon, 2003). The SOLO model was an evolution of the SOLO Taxonomy first published by Biggs and Collis (1982; 1991) and was described in Chapter Two. Given international support for and acceptance of cognitive demand as an explicit enhancement to the PISA assessment frameworks, it would be a pity to discontinue the one large-scale assessment project in Australia where it is a feature.

At this time neither the NSW curriculum authority (NESA, 2017) nor the achievement standards published by ACARA for the current national Australian Curriculum: Science (ACARA, 2018) include either explicit or implicit recognition of cognitive demand as a dimension in their definitions of competency and related assessment support materials or advice. In the interests of improving the validity of assessment and ensuring ongoing alignment between curriculum intent, related instruction and assessment validity as discussed in the NRC (2001) report, ACARA might want to consider how future iterations of the science curriculum (at the very least) respond to PISA leadership and include references to different levels of complexity in thinking in its Science Sequence of Achievement descriptions (ACARA, 2018). Rather than drop SOLO for Webb’s model as used by PISA, externally designed, large-scale tests could look at using SOLO as their model because of its historical prior use in science education and elsewhere in Australasia.

The two reports from the last round of Year 6 NAP testing in Science Literacy have dropped the Appendices carried by successive reports up to and including 2012 where the connection between SOLO and that test was explained (ACARA, 2017). SOLO was used as a classifier for items produced by ACER to populate the database of science assessment items in the context of the national Science Education Assessment Resource project (ACER, 2004a). The current version of that resource is now being managed by Education Services Australia (ESA, n.d.). SOLO is also used to inform feedback to primary teachers in New Zealand who have used items from the e-asTTle database of items to assess student achievement and progress in reading, mathematics and writing (Hattie & Brown, 2004).

The presumption of the Department’s interest is based on the fact of their tangible support for this project by providing this researcher with access to EV test data and related statistical analysis. The recommendations (below) for change in practice are an expected outcome from using a transformative mixed methods research design in this project (Creswell & Plano Clark, 2011).

One part of the rationale for the Department continuing with the VALID program is that it includes the dimension of cognitive demand in its assessment framework and has done so from its inception in 2005. Including cognitive demand in the assessment framework of tests improves the validity of the test as discussed earlier. The inclusion of cognitive demand in the OECD-PISA test assessment framework is vindication of the Department’s earlier decision to use SOLO as the basis for measuring cognitive demand in its EV test. A second is the endorsement of the EV test provided by Fensham (2013) who says that the [EV] test development process is comparable to the PISA and TIMSS development processes. In the same book, a chapter by Miller (2013) argues that assessment models are an important complement to curriculum documents because they help teachers to operationalize curriculum standards and show how best to assess curriculum intentions.

Two components of the current EV test design are singled out for further comment. The first is the three extended response items included in each test. The extended response tasks model open-ended questions that enable students to respond, using written text, at the highest levels of thinking they are capable of. The capacity to write scientific explanations is a highly valued outcome of science education which some, at least, of the case study school participants explicitly acknowledged. Inclusion of the extended response items in the test signals to science teachers and students the importance of this skill. The evidence reported in this thesis showed that exposing students to explicit teaching of literacy strategies in formative ways leads to better than expected results.

The second is the use of science-rich stimulus material drawn from the wider reading and Internet experience of students as contexts for science questions. This is an important signal to students and teachers of the relevance of science for dealing rationally with the world. It also provides opportunities for item construction that provides higher levels of cognitive challenge to students in a form that is an authentic test of aspects of scientific literacy (see above in the discussion of the PISA rationale for including cognitive demand in its assessment framework). With some modification, items and tasks could easily be amended to provide for a wider range of responses than written texts alone. This will be important once representational pedagogies and other more progressive approaches become more widely used. The capacity to upload video and sound as well as photos and diagrams should be considered in addition to the construction of written texts now that the test is delivered online. In the context of a test, the set of items a testee is provided with can be changed to better meet their demonstrated ability (as assessed by the software managing the item set being delivered to the testee as they do the test).

The capacity to upload a wider range of responses to items and tasks would be made easier by transforming the once-a-year test to an online repository of items, related stimulus materials and extended response tasks from which teachers could choose. They could retain and store items online until they enabled access for their students as they work through the topic or at the end or both. The capacity for immediate feedback on their learning, this being one of the most powerful means for supporting learning, would then be provided. There are already a number of items (and related stimulus material) and extended (open-ended) response tasks going back to 2005 held by ESA (SEAR, 2004) that could be used to populate such a repository.

Online availability of assessment items and tasks has a number of potential advantages which include the capacity to:

• provide immediate feedback to teachers about student experience of science (using items from the current student survey);
• provide a brief description of item and task links to curriculum intentions;
• provide information about the level of cognitive demand of the item or task and possible real-world situations where engaging with the particular item and its stimulus material or task has benefits for the individual, society or the environment;
• provide explanations of alternative conceptions indicated by student selection of particular distractors (in multiple choice items) in feedback to students (and teachers);
• offer suggestions for activities to correct misconceptions (already provided in SMART for the current version of the EV tests);
• provide a range of answers that would be scored at different levels according to the SOLO model (for extended tasks only); and
• retain the history of item and task use and student answers online and make it accessible to both teachers and the education system for monitoring purposes.

The last point would enable stronger measures of item reliability and difficulty (psychometric data) to be confirmed over time as well as enabling monitoring of change in the quality of learning over time by both teachers and the system. Also, transparency about the uses of that data would need to be clearly provided and agreed to by all participants.

Student self-assessment is seen as an important skill for students to acquire in the context of becoming autonomous learners (Black et al., 2006). With that in mind, ways for direct access by students to a future repository of assessment items and tasks should be developed and trialed. In this scenario, students would be able to select and complete items and obtain immediate feedback on their responses. Student access could be managed in a way that protects the integrity of the items, related stimulus material and extended response tasks but enables the data on responses to be generated and retained. Student responses should also be retained for teacher access to enable them to evaluate how learning is progressing as it happens. This would provide the opportunity for teacher interventions based on what they see happening online as students engage with the material there.

At the present time, EV data is provided to NSW schools and not published in the same way as NAPLAN data (on a school-specific website for all the world to access). The findings reported in Chapter Five were that science teachers understood the purpose of the EV test, were willing to engage with it and feedback from it and appreciated the absence of pressures experienced by their colleagues more directly associated with the publication of NAPLAN results. It is strongly believed by this researcher that shifting the items and tasks into an online repository accessible as discussed above would increase usage because the feedback would be immediate and thus most useful to teachers and students (Black, 2007; Hattie & Timperley, 2007; Shute, 2007). Delay in receiving feedback was identified by teachers involved in this project as a disincentive to greater engagement with the EV program.

In the event that public accountability is seen as important, consideration could be given to sample testing along the lines of the current NAP program for Year 6 science or to simply continuing with the current program of TIMSS and PISA testing which Australian students have been doing for the past two decades already. Using only the international tests would avoid duplication and free up resources for other purposes.

6.7 Conclusion

Section 2.2 of this thesis outlined the gap between ideal and actual practice found by Goodrum et al. (2001) in their review of science teaching and learning in Australia at the end of the 20th century. The three researchers in their report drew attention then to the strong emphasis, particularly in secondary schools, on summative assessment and the negative impacts (not just in Australia) it was having on science teaching, on achievement and on engagement with school science (see Table 2.1). The writers recommended greater alignment between syllabus intentions (outcomes that focus on scientific literacy), instruction and assessment. Assessment, they said, should be used more to support instruction as it was happening (formative assessment).

This thesis reports on the impact of two initiatives designed to help teachers shift their assessment focus from summative to formative. The initiatives were in response to the 2001 report by Goodrum et al. Data for the thesis was collected in 2016 and covered school years 2010 to 2015. The first initiative was in the form of curriculum advice for teachers about assessment for learning (an alternative name for formative assessment). It was promulgated in the new science syllabus which was introduced from 2003 (BOS, 2003). The second initiative was a large-scale, diagnostic science test and student survey at the midpoint of a mandatory, four-year secondary science course. The test was piloted in 2005, trialed in 2006 and implemented across the state of NSW for all Year 8 students from 2007. The test gathered evidence of student learning relative to syllabus standards (described as outcomes). The survey gathered evidence of student understanding about science in the world and about their experience of science in the school setting.

Parents (and their children) received a progress report about the student’s learning in terms of both syllabus expectations and level of understanding demonstrated in relation to those expectations. The levels were referenced to the six levels of understanding described in the SOLO model. Teachers received a comprehensive analysis of individual performance on every task and item in the test as well as students’ collective views about science and their experience of it at school. Teachers were expected to use the results of the test and the survey to diagnose strengths, weaknesses and gaps in student learning (and level of engagement with learning science) and to respond accordingly.

The impact of both the curriculum advice on assessment for learning and the EV program on teachers’ assessment-related work was explored against the five dimensions of formative practice described in Chapter Two. The evidence from analysis of the teacher survey responses revealed that fifteen years after the Goodrum et al. (2001) report, instruction and assessment were more aligned to curriculum expectations (described in terms of outcomes) than was the situation in 2000 (first dimension of formative practice). This was a consistent feature of teaching across all three groups of schools, regardless of whether EV results were well above (WAE), at (AE) or well below expectation (WBE).

However, in schools where results were WAE and AE, the teachers were more frequent users of discourse eliciting evidence of learning (second dimension of formative practice) and providers of feedback that advanced learning (third dimension of formative practice) than were their colleagues in schools where EV results were WBE.

When it came to providing students in the first two years of secondary school with opportunities to take the lead as instructors for each other, none of the three school groups stood out for doing so (fourth dimension of formative practice).

In schools where results were WAE, teachers were more frequent demonstrators of good learning behaviours, both with students and with each other, than were their colleagues in either AE or WBE schools (fifth dimension of formative practice).

In WAE schools teachers also focused on developing students’ capacity to use the language of scientific literacy appropriately. This was most evident in the WAE/AE–WBE comparisons of schools with comparable socio-educational advantage (SEA) scores. In those comparisons, WAE/AE schools had larger proportions of their students in the top band for the extended response category of results (PCWAE2 and MCWBE5, MCAE2 and MCWBE3, and MGFSAE2 and MGFSWBE1).

Because it was not possible to ensure the comparability of Year 10 results across schools nor to be sure that the proportions of students doing senior science courses were a direct reflection of student demand (rather than school resource limitations), three predictions developed as indicators of student self-regulation could not be reliably verified. The only other independent measure of self-regulation available to this project (students reporting in the survey a positive school science experience at the end of Year 8) was not consistently found in high achieving case study schools. In fact, high achievement was consistently linked to low ratings by students of their school science experience. This was very evident in the three provincial case study schools.

The combination of teacher survey results and assessment narratives supports the conclusion that, in case study schools at least, teachers retain strong control over the activities associated with formative practices, at least up to the end of Year 8. Whilst this is associated with better than expected scientific literacy outcomes, students in provincial schools in particular do not appear to be enjoying the experience. This was in contrast to two coeducational metropolitan case study schools, also with relatively low SEA scores (MCWAE1 and MCWBE4) but with high ratings of their school science experiences.

At the end of Year 10 both MCWBE5 and PCWAE3 have better result profiles than PCWAE2. All three schools have comparable SEA scores. This is a reversal of the Year 8 position. Despite the uncertainty around the comparability of the actual results, the distribution of the results across the grades is telling. It seems that the rigorous application of a summative assessment policy at PCWAE2 may be a contributor to the decline in achievement from Year 8 to Year 10.

Taking all the above into account, it is the view of this researcher that progress is being made toward helping students acquire the tools needed to manage their own learning, as the focus on mastering the language of science has shown. However, the broadening of that to encompass the full meaning of being scientifically literate (OECD, 2017) will require that students are explicitly taught the skills of formative assessment and given opportunities to use them at school. This will only happen when the community accepts the validity and reliability of evidence of learning obtained by means other than pen and paper tests (or their online equivalents).

APPENDICES

Appendix A: Competencies, Basic Skills, Generic Skills and Key Competencies

Table 2.1 Competences, basic skills, generic skills and key competencies

SECTION ONE: Quality Education Review Committee (QERC, 1985) general competences and basic skills

1. Acquiring information;
2. Conveying information;
3. Applying logical processes;
4. Performing practical tasks as individuals;
5. Performing practical tasks as members of a group
(Recommendation 1, p. 201).

Basic skills in (the curriculum) including:
• communication skills;
• Mathematics;
• Science;
• Technology;
• the world of work; and
• Australian studies
(Recommendation 10, p. 203)

SECTION TWO: Australian Education Council Review Committee (AECRC, 1992) Finn review generic skills

1. Language and communication;
2. Mathematics;
3. Scientific and technological understanding;
4. Cultural understanding;
5. Problem solving; and
6. Personal and interpersonal understanding.

SECTION THREE: Australian Education Council Review Committee (AECRC, 1992) Mayer key competencies (NSW version)

1. Collecting, analysing and organising information;
2. Communicating ideas and information;
3. Planning and organising activities;
4. Working with others and in teams;
5. Using mathematical ideas and techniques;
6. Solving problems;
7. Using technology; and
8. Cultural understanding*

SECTION FOUR: Science 7-10 syllabus (BOS, 2003) Key Competencies are embedded within the objectives and content of the Skills. The content develops students’ ability to:

1. plan, organise and perform first-hand investigations to test a hypothesis or question that can be researched;

2. collect, analyse and organise information from first-hand investigations and secondary sources, organising data using a variety of methods including diagrams, tables and spreadsheets, and checking reliability of gathered data and information by making comparisons with observations or information from other sources;

3. communicate ideas and information using a range of text types including explanation, procedure and report formats to present data and information from first-hand investigations;

4. identify the nature of issues and problems, framing possible problem-solving strategies and developing creative solutions in a logical, coherent way;

5. use technology including CD-ROMs and the internet to access information;

6. work individually and in teams where appropriate, safely, responsibly and effectively with realistic timelines and goals; and

7. use appropriate mathematical processes including appropriate units, graphs, spreadsheets and mathematical procedures and relationships.

*This was a NSW addition to the list.

Source: report documents as listed in the Table (see reference list).


Appendix B: Goals for Schooling (1989–2008)

Evolution of Australia’s Common and Agreed National Goals for Schooling in the Twenty First Century.

Hobart Declaration on Schooling (1989). The ten goals…

1. To provide an excellent education for all young people, being one which develops their talents and capacities to full potential, and is relevant to the social, cultural and economic needs of the nation.

2. To enable all students to achieve high standards of learning and to develop self-confidence, optimism, high self-esteem, respect for others and achievement of personal excellence.

3. To promote equality of education opportunities, and to provide for groups with special learning requirements.

4. To respond to the current and emerging economic and social needs of the nation, and to provide those skills which will allow students maximum flexibility and adaptability in their future employment and other aspects of life.

5. To provide a foundation for further education and training, in terms of knowledge and skills, respect for learning and positive attitudes for life-long education.

6. To develop in students: a) the skills of English literacy, including skills in listening, speaking, reading and writing; b) skills of numeracy, and other mathematical skills; c) skills of analysis and problem solving; d) skills of information processing and computing; e) an understanding of the role of science and technology in society, together with scientific and technological skills; f) a knowledge and appreciation of Australia’s historical and geographic context; g) a knowledge of languages other than English; h) an appreciation and understanding of, and confidence to participate in, the creative arts; i) an understanding of, and concern for, balanced development and the global environment; and j) a capacity to exercise judgement in matters of morality, ethics and social justice.

7. To develop knowledge, skills, attitudes and values which will enable students to participate as active and informed citizens in our democratic Australian society within an international context.

8. To provide students with an understanding and respect for our cultural heritage including the particular cultural background of Aboriginal and ethnic groups.

9. To provide for the physical development and personal health and fitness of students, and for the creative use of leisure time.

10. To provide appropriate career education and knowledge of the world of work, including an understanding of the nature and place of work in our society.


Adelaide Declaration (released in 1998) The achievement of Australia’s common and agreed national goals for schooling establishes the pathway for lifelong learning, from the foundations established in the early years through to senior secondary education including vocational education and linking to employment and continuing education and training. Schooling should develop fully the talents and capacities of every student. In particular, when students leave school they should:

• have skills in analysis and problem solving and the ability to become confident and technologically competent members of 21st century society.

• have qualities of self-confidence, optimism, high self-esteem, and a commitment to personal excellence as a basis for their potential life roles as family, community and workforce members.

• be active and informed citizens with the ability to exercise judgement and responsibility in matters of morality, ethics and social justice; and the capacity to make sense of their world, to think about how things got to be the way they are, to make rational and informed decisions about their own lives and to collaborate with others.

• have a foundation for, and positive attitudes towards, vocational education and training, further education, employment and life-long learning.

In terms of curriculum, students should have:

• attained high standards of knowledge, skills and understanding through a comprehensive and balanced curriculum encompassing the agreed eight key learning areas: the arts; English; health and physical education; languages other than English; mathematics; science; studies of society and environment; technology and the interrelationships between them.

• attained the skills of numeracy and English literacy; in particular, every child leaving primary school should be numerate, able to read, write, spell and communicate at an appropriate level.

• been encouraged to be enterprising and to acquire those skills which will allow them maximum flexibility and adaptability in the future.

In addition, schooling should be socially just, and should ensure that:

• outcomes for educationally disadvantaged students improve and match more closely those of other students.

• Aboriginal and Torres Strait Islander students have equitable access, participation and outcomes.

• all students have understanding of and respect for Aboriginal cultures and Torres Strait Islander cultures to achieve reconciliation between indigenous and non-indigenous Australians.

• all students have the knowledge, cultural understandings and skills which respect individuals’ freedom to celebrate languages and cultures within a socially cohesive framework of shared values.

MCEETYA (2008) Melbourne Declaration (December 2008)

The Educational Goals for Young Australians

Goal 1: Australian schooling promotes equity and excellence

Goal 2: All young Australians become:

– Successful learners

– Confident and creative individuals

– Active and informed citizens

Source: Hobart and Adelaide Declarations – MCEETYA, 1998 / Melbourne Declaration – MCEETYA, 2008. See reference list for full citations.


Appendix C: A teaching sequence exemplifying different views of learning

A view of learning that clarifies curriculum intention, guides instruction and shapes assessment.

The following illustrates how a view of learning and cognition can clarify curriculum intention, guide instruction and shape assessment. The 2003 NSW Science Years 7-10 syllabus (BOS, 2003) requires students to use a particle model to explain change of state (outcome 4.7.1, 2 & 3: particle model, change of state on p. 32). A teacher who is familiar and comfortable with a cognitive, constructivist view of cognition and learning might take students through a teaching sequence that ends with an assessment task.

This sequence might involve students setting up situations involving water where they can observe evaporation, boiling, melting and freezing. In that scenario, the teacher moves around the room providing advice and support as students work through the activities. This is followed by a teacher-led explanation of the particle model and a discussion of how it can be used to explain each of the examples the students had worked with. Students are then given a pen-and-paper task that has short response items, an extended task of a simulated experiment involving ice melting, and a set of questions asking students to create labelled diagrams using particles to represent the change from ice to water to steam. When marked, the teacher would lead a discussion of the results with the class. This description is a truncated version of the 5Es approach (AAS, 2017), although the outline given here is not from the 5Es materials.

A teacher who is familiar and comfortable with a situative view of learning might see an

opportunity to address two other syllabus (BOS, 2003) requirements at the same time as

teaching the particle model. Syllabus outcomes related to the nature and practice of

science (outcome 4/5.2a to evaluate the role of creativity … in describing phenomena

on p. 28) and working in teams (outcome 4/5.22.2 to practice aspects of team work

described in content items a to h on p. 44) are outcomes that lend themselves to

groupwork. The teaching sequence might involve a cooperative learning strategy, such

as the jigsaw technique (Mitchell et al., 2009, pp 75-76) to engage with all three


outcomes. The assessment might involve a presentation by each member of the group

(individual or group role-playing water particles and moving in ways that simulate

evaporation, melting, boiling and freezing followed by a class Q & A led by

performers).

Team members could also be asked to complete a checklist identifying aspects of

teamwork for themselves (self-assessment) and other members of the team (peer-

assessment) in terms of their own contribution to the preparation and delivery of the

content in the presentation. The test could also be done by individuals. When

completed, the teacher would provide feedback to the group drawing on the evidence of

learning from her/his observations, a reading of the checklists, and the test results.

Assessment involves both pen-and-paper responses and observations of performance as

evidence of learning. Feedback can be given in terms of both the particle model of

matter and the processes involved in preparing and delivering the performance.


Appendix D: Five examples involving aspects of the SOLO model

Example one - heating ice

An example from the 2005 EV pilot test shows how the two-learning cycle SOLO model was applied to code an extended response task. The students were presented with a diagram of a beaker containing ice, a thermometer and a stirrer. The beaker and contents were sitting on a gauze mat and retort stand with a Bunsen burner under it running a low, two-zone flame (this equipment is ubiquitous still in NSW government school science labs). Students were presented with a table of results (Table 2.6) showing temperature change over time (from 0 to 9 minutes).

Table 2.6 Table of results from heating ice

Time (in minutes) 0 1 2 3 4 5 6 7 8 9

Temperature (in ℃) 1 1 1 4 15 29 45 61 75 89

Source: ESSA 2005 test booklet, NSW Department of Education and Training

Students were asked:

(a) Using the information from the result table (Table 2.6), describe what was happening in the first nine minutes of the experiment.

(b) Using your knowledge of the particle theory, explain why this happens.
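The trend that part (a) asks students to describe can be made explicit with a short calculation. The following is a minimal sketch in Python, not part of the test materials: the data are copied from Table 2.6, while the variable names and the `plateau_minutes` helper are hypothetical and introduced only for illustration.

```python
# Results copied from Table 2.6 (time in minutes, temperature in degrees Celsius).
times = list(range(10))
temps = [1, 1, 1, 4, 15, 29, 45, 61, 75, 89]

# Temperature change over each one-minute interval.
rises = [later - earlier for earlier, later in zip(temps, temps[1:])]


def plateau_minutes(temperatures):
    """Hypothetical helper: count the initial intervals with no temperature change."""
    count = 0
    for earlier, later in zip(temperatures, temperatures[1:]):
        if later != earlier:
            break
        count += 1
    return count


if __name__ == "__main__":
    for minute, rise in zip(times[1:], rises):
        print(f"minute {minute - 1} to {minute}: rise of {rise} degrees")
    print("minutes with no change at the start:", plateau_minutes(temps))
```

The flat start followed by a steady rise is the pattern that the first-cycle responses describe in everyday language and the second-cycle responses explain using the particle theory.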

For the first cycle any ONE of the following were accepted as a U1 response: change of state / the ice melts / describes a part or all of the trend changes in temperature over time with or without specific reference to time intervals. An M1 response involved TWO or more U1 responses being provided. R1 responses linked the trend change in temperature to an inferred melting of all the ice. Note that in cycle one, the responses made no reference to science concepts. Responses were in terms of everyday language related to the relevant observations.
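The first-cycle coding rule described above can be sketched as a small decision function. This is an illustrative reading of the rule only, not the markers' actual procedure: the function name, its two parameters, and the choice to return `None` when nothing is accepted are all assumptions made for the sketch.

```python
from typing import Optional


def code_first_cycle(accepted_elements: int, links_trend_to_melting: bool) -> Optional[str]:
    """Assign a first-cycle SOLO code to a response (illustrative sketch).

    accepted_elements: number of distinct accepted U1 elements in the response
        (assumed to have been counted by a human marker).
    links_trend_to_melting: True if the response links the temperature trend
        to an inferred melting of all the ice.
    """
    if links_trend_to_melting:
        return "R1"  # relational: the observations are linked together
    if accepted_elements >= 2:
        return "M1"  # multistructural: two or more U1 elements
    if accepted_elements == 1:
        return "U1"  # unistructural: a single accepted element
    return None  # assumption: no first-cycle code applies


if __name__ == "__main__":
    print(code_first_cycle(1, False))
    print(code_first_cycle(3, False))
    print(code_first_cycle(2, True))
```

Judging which elements of a response are acceptable remains a matter for trained markers; the sketch captures only the counting logic applied once that judgment has been made.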

In the second cycle any ONE of the following were accepted as U2 responses: heat increases the movement or vibration of particles / heat is absorbed by ice particles as the ice melts / heat energy breaks down forces of attraction between particles


Task 2 – Behaviour of magnets

Jack and Rana were investigating the behaviour of magnets. They stood two test tubes in a test tube rack. Then they put two bar magnets inside each test tube. The results are shown in the photograph.

(Photograph labels: A, B)

Describe what A and B show.

Use your knowledge of forces to explain the behaviour of the magnets in test tube B.

Produced by the New South Wales Department of Education and Training • Copyright © 2008 New South Wales Department of Education and Training


SOLO Cycle 2 describes the interaction of magnetic poles and explains the interaction of magnetic and gravitational forces that produces the phenomenon in test tube B in the stimulus photograph

Code Description

8 non-attempt; the page for responding to the task is left blank

0 a response was made but it does not meet any of the marking criteria

1 the response contains a single piece of commonsense information relevant to the major concept

2 the response contains two or more pieces of commonsense information relevant to the major concept

3 the response contains a commonsense explanation about the major concept that relates two or more pieces of commonsense information

4 the response contains a single piece of ‘scientific’ information relevant to the major concept that clearly reflects syllabus expectations or accepted science

5 the response contains two or more pieces of ‘scientific’ information relevant to the major concept that clearly reflect syllabus expectations or accepted science

6 the response contains a clearly stated ‘scientific’ explanation about the major concept that relates two or more pieces of information, which clearly reflect syllabus expectations or accepted science

Figure 2.9 Actual EV extended response task. Source: 2008 EV test Note: codes 1 to 6 correspond to U1-M1-R1 & U2-M2-R2 in Figure 2.6

Teachers who engaged with training for marking the EV extended response tasks and subsequent marking reported that the training and subsequent marking were the most important source of their learning about the SOLO model (see Chapter 4). The text under the headings Code and Description provides the code (a number) and the general criteria for assigning the related code consistent with the qualitative differences expected for cycle 1 and cycle 2 responses.

The expectation is that students will become scientifically literate as described in the official science curriculum (syllabus in NSW). Part of this includes being able to provide scientific explanations for a range of phenomena they observe and experience in the natural and made worlds by the end of Year 10. This expectation includes being able to identify and name the concepts that link sets of seemingly disparate arrays of phenomena and use those concepts to explain observations related to the phenomena.

Here the quality of thinking is captured by looking at SOLO levels within modes, below the mode that science demands as that which provides a satisfactory explanation. (Biggs & Collis, 1991, pp. 73-74)


Once assessed, suitable remedial action can be taken to help the student shift their level of understanding to one closer to the syllabus version of a scientific explanation. The evaluative purposes of the test, that is, providing feedback to the Department, are explained next.

Many students are good talkers about the world they inhabit when they start school at age 5 and the level of that ‘talk’ for some would be at the Ikonic, R2 level or lower (see Figure 2.6). By the end of Year 8 (age 13-14 yrs), the graphs for the period of interest (Figure 2.10) show that around 35% of the students are operating at the second cycle, unistructural level (U2) in science. The second cycle is where we would expect Years 7 and 8 secondary students to be working; overall, approximately 60% are. Less than 5% of students are operating at the top end of the second cycle (relational level R2). This is the level at which we would hope many, if not most, Year 12 students (16-17 years) would be operating. This low R2 result is in line with expectations given the age of Year 8 students. Nevertheless, it should be a goal for teachers to aim at, as suggested in the commentary under the graphs in Figure 2.10.

The 2011 to 2014 results represented in the four coloured graphs in Figure 2.10 are standards referenced to an assessment framework put in place in 2011. Given this, it would have been unsurprising to see the graphs showing a progressive skew to the right with each successive year. The skew would show that teachers were working successfully to improve student levels of thinking, as evidenced by successively more of them appearing at the multistructural (M2) level at the very least. There is some visual evidence of a shift to the right from 2011 to 2013, but it is not evident from the 2014 data. Without knowing the SE for the data points, it is impossible to comment on whether that is a real effect or not. My research findings (reported in Chapter 5) provide support for the view that the shift is not ‘real’.

The use of SOLO levels to inform the reporting of achievement at the end of Year 8 is the enhancement to EV feedback referred to in Chapter 1. Parents are provided with representations of thinking (criterion referenced assessment) related to each of the six levels in the SOLO model (see Figure 2.6).


Figure 2.10 shows the proportion of Year 8 students at each SOLO level for the four years of interest in this project. The last year for the ESSA test was 2014. It became the VALID test from 2015.

Comparison graphs The following graphs compare the performance of students in ESSAonline from 2011 to 2014 tests for Science (overall for the test). Comparison of percentage achievement in levels for Science (overall for the test)

Whilst the pattern in the trend lines is similar across the four years, the positive aspects of the data are that the majority of students are achieving levels 4 to 6 and the very low percentage of students achieving levels 1 and 2. Achievement in Level 4 indicates use of syllabus knowledge, understanding and skills in familiar and unfamiliar situations. These students should be encouraged to deepen and interrelate their learning, as Level 5 describes deep knowledge of concepts in Stage 4 whereas Level 6 describes students with a breadth and depth of integrated knowledge, understanding and skills that can be applied meaningfully to a wide variety of real-world problems. Achievement in Levels 1 to 3 suggests that many students are often not thinking beyond the commonsense or are not confident in applying scientific knowledge, understanding and skills in everyday and/or unfamiliar contexts. Students achieving in Level 3, who are able to logically explain ideas, need particular encouragement to apply science, rather than commonsense knowledge, understandings and skills, to describe and explain the world around them.

Figure 2.10 Statewide performance data and related commentary for the ESSA test 2011-2014 (Source: DEC, Essential Secondary Science Assessment 2014 state report.)

Example three - mapping syllabus outcomes to the SOLO model


Table 2.7 Selected outcomes and related SOLO levels in the 2011 EV assessment framework

LEVEL 1 LEVEL 2 LEVEL 3 LEVEL 4 LEVEL 5 LEVEL 6

Outcomes 4.1 to 4.5 (2 of 7 rows)

Identify a scientific discovery

Compare scientific discovery to other types of discovery

Link a scientific discovery to its effect on humans

Describe a development in science that has led to new developments in technology

Compare the methods of the scientist to the design model of the engineer and architect

Explain the role of scientific thinking on society

Identify a possible career path in science

Identify a science context in a career

Link a career in science to knowledge and skills required

Identify science as a human activity

Discuss why society should support scientific research

Outcomes 4.6 to 4.9 (3 of 16 rows)

Identify materials attracted by a magnet

Compare the observable effects when magnets are placed end to end

Link the observable effects when two magnets are placed end to end with their position

Describe a magnetic field as producing a force that attracts particular metals

Describe the poles of a magnet as the area/ends where the magnet’s field is most intense

Explain the behaviour of magnetic poles using the term field

Identify that objects / substances take up space and/or have mass/weight

Explain that materials are held together differently in solids, liquids and gases

Explain density in terms of a simple particle model

Identify an observable feature in melting, freezing, condensation, evaporation or boiling

Describe observable features in melting, freezing, condensation, evaporation and boiling

Explain that, when substances melt, freeze, condense, evaporate and boil, they are still made of the same stuff

Identify that particles are continuously moving and interacting

Compare movement and interaction of particles in different states

Explain change of state in terms of rearrangements of particles

Identify that as particles are heated they gain energy

Identify that as particles are heated they gain energy and move further apart

Relate changes of state to the motion of particles as energy is removed or added

No content for Outcomes 10 - 12 is included

Outcomes 4.13 to 4.15* (1 of 8 rows)

Make a simple observation

Compare observations made by different people

Explain strategies to increase accuracy of observation

Correctly sequence steps in a scientific procedure

Accurately and systematically record observations and data

Discuss the relationship between accuracy and reliability

Outcomes 4.16, 4.17 a-d & 4.18** (1 of 8 rows)

Use a simple key or symbol to represent a concrete object or representation

Distinguish between different symbols

Complete diagrams and symbolic representations

Correctly sequence steps in a process described in a text

Distinguish between two related sets of data / information

Represent relationships using keys, symbols and flow chart

Outcomes 4.17e-g, 4.19-4.21*** (1 of 7 rows)

Identify a common unit of measurement

Identify the ratio of one unit to another

Complete a correct conversion of one unit to another

Create a simple scale

Compare the scale on two axes

Create an appropriate scale

Source: NSW Department of Education and Training DET, 2011. Shaded rows are referenced in the body text. * Planning and Conducting Investigations area / ** Communication area / and *** Critical thinking area

Table 2.7 and the following text explain the map of syllabus outcomes and SOLO model levels 1 to 6.

While the test was delivered as a pen and paper exercise (from 2005 to 2010), the assessment framework discussed in this subsection was being developed and validated.

Table 2.7 shows an extract of the framework. It shows how the syllabus outcomes (written and published for the 2003 science syllabus) were subsequently related to the six levels of the concrete symbolic mode of thinking in the SOLO model.

The melting ice task above and a second task involving magnets described in subsection 2.6.4 were parts of tests done before 2010. Both these tasks subsequently mapped to the EV framework produced for the 2011–2014 tests (see the shaded sections across the extract from the EV framework in Table 2.7). A new framework was used to inform test development for the VALID tests that began from 2015. This was based on the new Australian Curriculum Science (NSW version). The quality control processes used to develop items and tasks for EV tests are discussed in subsection 2.6.4.

Since 2011, the EV tests have been delivered online to school computers linked to school networks and the internet. The affordances provided by online delivery will not be addressed in this thesis.

The EV framework for the 2011 to 2014 EV test was organised as a grid (Table 2.7). The columns identify the six performance levels (LEVELS 1 to 6) related to the “two learning cycles within a mode” SOLO model. The five rows accommodate 21 of the 22 syllabus outcomes defining Stage 4 (Years 7 & 8) (DET, 2011).

The performance levels (Table 2.7) correspond to the three levels of thinking in each of the two learning cycles in the concrete symbolic mode of thinking (see Figure 2.6). LEVEL 1 = first cycle U1, LEVEL 2 = first cycle M1, LEVEL 3 = first cycle R1, LEVEL 4 = second cycle U2, LEVEL 5 = second cycle M2, and LEVEL 6 = second cycle R2.
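The correspondence between performance levels and SOLO codes can be written as a simple lookup. The sketch below is illustrative only: the mapping itself is taken from the paragraph above, whereas the names `SOLO_BY_LEVEL`, `LEVEL_NAMES` and `describe_level` are hypothetical and not part of the EV framework.

```python
# Mapping taken from the text: LEVEL number -> (learning cycle, SOLO code).
SOLO_BY_LEVEL = {
    1: ("first", "U1"),
    2: ("first", "M1"),
    3: ("first", "R1"),
    4: ("second", "U2"),
    5: ("second", "M2"),
    6: ("second", "R2"),
}

# Expansions of the U / M / R abbreviations used in the thesis.
LEVEL_NAMES = {"U": "unistructural", "M": "multistructural", "R": "relational"}


def describe_level(level: int) -> str:
    """Return a readable description of a performance level (hypothetical helper)."""
    cycle, code = SOLO_BY_LEVEL[level]
    return f"LEVEL {level}: {cycle} cycle, {LEVEL_NAMES[code[0]]} ({code})"


if __name__ == "__main__":
    for level in sorted(SOLO_BY_LEVEL):
        print(describe_level(level))
```

The same lookup applies to codes 1 to 6 of the extended response marking scheme, which the note to Figure 2.9 relates to U1-M1-R1 and U2-M2-R2.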

The descriptors in each of the grid cells were identified as appropriate for Stage 4 learners by experienced science teachers and SOLO experts (as explained in subsection 2.6.4). The wording used was based on a combination of their professional judgment and the outputs from sophisticated psychometric analysis using results from trialling and piloting and the first few years of full cohort testing.

The outcomes are numbered in the left-hand column. The first digit refers to the Science Syllabus Stage, which in this case is Stage 4 (for students to achieve by the end of Year 8). The second number identifies the outcome (from 1 to 21). The letters correspond to content related actions (indicators of outcome attainment) linked to essential syllabus content (that students will learn about) that defines the scope of the outcome.

The 21 outcomes able to be assessed by this mode of testing (pen and paper) are grouped on the full EV assessment framework to reflect groups of outcomes defined here as syllabus areas:


• Prescribed focus area (Outcomes 4.1-4.5)

• Knowledge and understanding area (Outcomes 4.6-4.12)

• Planning and conducting investigations area (Outcomes 4.13, 4.14 & 4.15)

• Communication area (Outcomes 4.16, 4.17a-d and 4.18)

• Critical thinking area (Outcomes 4.17e-g, 4.19, 4.20 & 4.21) (DEC, 2015, p. 18).

The first two bullet-point areas are both knowledge and understanding outcomes; the remaining three are related to science skills and processes. Not every cell of every row has a content description because the syllabus is silent about relevant content at that SOLO level. This is to be expected because the syllabus was published in 2003; the six levels were identified within the existing syllabus content and ‘levelled’ using SOLO expectations in 2010.

About half of the outcomes are about knowledge and understanding (1 to 12); the other half are science process/skill outcomes (13 to 22). The intended message is that junior secondary science is as much about science knowledge and understanding as it is about ‘doing science’. Thus, the full grid provides an easy way to map the items for a particular test against syllabus expectations and all SOLO levels of thinking.

The Critical Thinking grouping of outcomes in the EV framework (see the bottom row of Table 2.7) is an attempt by the test developers to signal to teachers that having students engage critically with science and science-related issues is an important expectation. In the full version of the EV framework, there are seven rows of content descriptors for this area. The one chosen here is about measurement; others relate to a progression in thinking to do with mathematical relationships between variables, data analysis, evidence-based conclusions, critical analysis of scientific explanations, predictions and inferences (based on scientific evidence), and recognising aspects of a problem that may be resolved using science.

Measurement is an important component of the ‘epistemic’ basis of science. The meaning of ‘epistemic’ and examples of items and tasks related to it are explained in the PISA 2015 frameworks document (OECD, 2017, pp. 29-38). The six SOLO levels related to measurement in the Critical Thinking row of Table 2.7 begin with specific contexts for measurements and then move to relationships between aspects of the same and different measurements (SOLO cycle one). From LEVEL 4 the expectation progresses to developing a more generalised understanding of measurement scales (SOLO cycle two). The progression has the power to guide teaching and assessment aligned to syllabus intentions.

The cell descriptors along each row provide guidance to a teacher about what content is to be learned and the complexity of thinking students are expected to manage as they work through the two years of Stage 4 in science. An example describing a possible progression in learning about units of measurement (in a particular context) and its extension to measurement scales generally will now be described.

Outcome 4.7 about the particle model of matter (4.7.1) and melting ice as a particular example of change of state (4.7.3) are the contexts for this learning sequence. The end result of using different thermometers (glass-alcohol and digital) with different scales (Kelvin, Celsius and Fahrenheit) is to validate an operational definition for latent heat of melting for one substance, water. The observation that the temperature of an ice-water mixture stays the same until all the ice melts is explained by the idea that added heat is being used to ‘overcome’ whatever it is holding the water particles together to make ice rather than increasing the temperature of the ice-water mixture.

The initial emphasis here is to ensure accurate and reliable measuring of the temperature as ice melts. The class discussion before working in groups (each using a different thermometer with a different scale) would be to discuss the three temperature scales (Kelvin, Celsius and Fahrenheit). Why the three scales? Which one to use / or all of them? Why? What is the ratio of one scale division relative to the other across the three scales? How do we convert from one to the other? Why might this be a useful conversion to be able to do? A worksheet could be developed for use by members of the group working together to discuss and answer the questions posed there.
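The conversion questions posed above have standard answers that could underpin such a worksheet. The following Python sketch is offered as one possible illustration; it is not drawn from the EV materials, and the function and constant names are hypothetical. It relies only on the standard relationships between the scales: one Celsius degree is the same size as one kelvin and 1.8 times the size of one Fahrenheit degree, with 0 ℃ equal to 273.15 K and 32 ℉.

```python
# Size of one Celsius degree (or one kelvin) measured in Fahrenheit degrees.
FAHRENHEIT_DEGREES_PER_CELSIUS_DEGREE = 9 / 5

# Offsets between the zero points of the scales.
KELVIN_OFFSET = 273.15      # 0 degrees Celsius expressed in kelvin
FAHRENHEIT_OFFSET = 32.0    # 0 degrees Celsius expressed in degrees Fahrenheit


def celsius_to_kelvin(celsius: float) -> float:
    return celsius + KELVIN_OFFSET


def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * FAHRENHEIT_DEGREES_PER_CELSIUS_DEGREE + FAHRENHEIT_OFFSET


def fahrenheit_to_celsius(fahrenheit: float) -> float:
    return (fahrenheit - FAHRENHEIT_OFFSET) / FAHRENHEIT_DEGREES_PER_CELSIUS_DEGREE


if __name__ == "__main__":
    # Melting point and boiling point of water at standard pressure.
    for celsius in (0.0, 100.0):
        print(celsius, celsius_to_kelvin(celsius), celsius_to_fahrenheit(celsius))
```

Because each conversion is a straight-line relationship, a graph of one scale against another is linear, which is what allows students to interpolate within and extrapolate beyond their measured values.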

Once the task is completed, each group member takes the results and independently answers another set of questions (such as the ones in the 2005 test and others relating to scale conversions and drawing a graph of one scale versus another to interpolate within and extrapolate beyond) provided on another sheet. A final question to answer might be: How would you recalibrate a thermometer where the scale had been rubbed out?

The challenge to ‘calibrate a thermometer with no scale’ addresses the BOS Common Grade Descriptor criteria for an A: “Can apply these skills to new situations” (BOS, 2013). The tasks above also address three Mayer Key Competencies relating to solving a problem, using mathematical ideas and techniques as well as using technology (in this situation, analogue thermometers with three different scales and digital technology in the form of digital temperature probes linked to data loggers or computers). The Mayer Key Competencies were integrated into the syllabus at the time it was written (AECRC, 1992).

Example four - reporting achievement at the end of Year 8

The results from the EV test are organised into a summative report of achievement at the end of Year 8. The report for students, parents and teachers provides the results for five areas or categories of outcomes. The scores from items in the EV framework mapped to the Critical Thinking area are distributed to the working scientifically and communicating scientifically categories, depending on whether the items had an investigating or communicating context. The student report provides individual feedback on every task and item in the test.

Individual responses are also aggregated to provide a score and position on a scale from 1 to 6 related to the six SOLO levels as shown in Figure 2.7. Five scales are provided showing an overall score for science, a score for the three extended response tasks and three separate scales for knowledge and understanding, communicating scientifically and working scientifically. Providing feedback on five categories of science achievement is more useful and respectful of achievement by an individual than a single indicator of overall achievement, such as a grade or mark. It is also diagnostic in the sense that an assessment of strengths and weaknesses in particular areas of science can be easily seen.


Figure 2.7 A sample reporting scale (Source: DET, 2007)

The student’s score (represented as a thick line on the scale) and its placement on the scale are determined by a combination of factors going back to the development of items and tasks for inclusion in the test and the dependability (Harlen, 2004) of the processes used then and subsequently to produce the scores and print its representation on the proficiency scale. The quality control processes to do that will be discussed below.

Table 2.8 provides an extract from the EV student report for reference. TIMSS, PISA and NAP-SL also have comparable sets of descriptors for each of the proficiency levels related to their assessment frameworks.

Essential Secondary Science Assessment
Student report for parents 2006 Year 8
This report shows the results for: Natalia Allenby

Local High School

What is ESSA?
Essential Secondary Science Assessment (ESSA) is a statewide program that complements the school-based assessment and reporting programs of NSW schools. The ESSA test assesses what Year 8 students know and can do in science; then students, parents and teachers can use the ESSA levels (see the table inside this report) to plan learning programs and activities so that students keep moving forward in their science knowledge and skills. This report provides results from the pilot test that was held on Tuesday 28 November 2006 for approximately 58 000 students.

What was tested?
The test assessed a variety of Stage 4 outcomes from the Science Years 7-10 Syllabus.

Science:
Overall, a broad range of knowledge and skills in science were assessed using three extended response tasks and 75 short response and multiple choice tasks. Extended response tasks are writing tasks that provided opportunities for students to demonstrate their integrated understandings and skills from various areas of the syllabus. Short response and multiple choice tasks assessed syllabus outcomes that were organised into three interrelated strands:

Knowing and understanding:
Students responded to items that specifically assessed their knowledge and understanding of scientific concepts. Some items tested Prescribed Focus Areas, such as the nature and practice of science and the impact of science on society, technology and the environment.

Communicating scientifically:
Students analysed and responded to a variety of texts that are typical of those used in Year 7 and Year 8 science. Some items required critical thinking.

Working scientifically:
Students had opportunities to demonstrate skills in critical thinking, making evidence-based conclusions and in planning, conducting and analysing investigations.

How to read this report
Results are shown on five reporting scales. Each reporting scale has six achievement levels, from Level 1 to Level 6. These levels are based on the requirements of the NSW Science Years 7-10 Syllabus. They represent a standard of what students know and can do in science. The levels in each strand are described in the table inside this report.

Each level represents a standard of achievement in science.

An individual's result is shown by [thick line marker on the scale]

LEVEL 1 LEVEL 2 LEVEL 3 LEVEL 4 LEVEL 5 LEVEL 6

SAMPLE
lower scores -> higher scores

Table 2.8 Extract from student report showing selected levels for three reporting categories

⇑ | Knowledge & understanding | Communicating scientifically | Working scientifically

Level 6

• Explains physical phenomena using a model, theory or law
• Explains the interaction of complex systems (for example, relates the role of the circulatory system to the needs of cells)
• Explains the theme and function of a complex text
• Critically analyses the credibility of scientific information
• Relates the dependent and independent variables for a given problem
• Describes the wider significance of conclusions (for example, accounts for the differing amounts of water loss by plant cuttings by identifying plant processes)

Level 5

• Describes examples where scientific understanding has changed
• Describes interactions of systems or within systems
• Extracts related information from diagrams, tables, graphs or other texts
• Compares two sets of information (for example, compares a table and graph and inserts information into the graph)
• Identifies ways to improve the reliability and accuracy of controlled investigations
• Applies mathematical models to data (for example, interpolates information from a line graph)

Level 4

• Identifies scientific evidence (for example, identifies evidence that leads to change in a scientific theory)
• Describes a complex process of our world or space (for example, identifies requirements for photosynthesis)
• Identifies an interaction of systems or within a system (for example, identifies evidence that indicates that a chemical reaction has occurred)
• Identifies one piece of relevant scientific information
• Describes an effective solution to a problem with a science context
• Identifies a prediction, inference, conclusion, aim and hypothesis
• Selects one piece of appropriate scientific equipment for a task (for example, identifies a benefit of using a data logger to collect information in an investigation)
• Draws a conclusion based on scientific evidence

Level 3

• Explains a link between technology and science
• Relates simple processes of our world or space (for example, identifies insects as consumers)
• Relates a model to an aspect of our world or space (for example, identifies kinetic energy acting in an activity)
• References information within a diagram, table, graph or other text (for example, summarises ideas across a text)
• Uses cause and effect to explain an observation (for example, identifies the effect of a change during a process)
• Relates equipment and appropriate use for a simple task (for example, identifies the correct use of a thermometer)
• Draws a simple conclusion

⇑ Source: DET, 2007, p. 3. The lower arrow represents the transition from the R1 level of the first cycle in the concrete symbolic mode of thinking; the higher arrow represents the transition to the U1 level of the next (formal) mode of thinking.

The formative intent of the EV program is signalled in the report to parents and students:

Students, parents and teachers can use the [EV] levels [Table 2.8] to plan learning programs and activities so that students keep moving forward in their science knowledge and skills. (DET, 2007, p. 3)

The levels referred to are the six levels linked to the SOLO model discussed above. Progress (“moving forward” in the EV report) in science learning is defined by the language used in each of the level descriptions for a particular reporting category.

The feedback from the test and student survey is provided to schools and school systems participating in the test some six months after the tests are done, and well into a new school year when students have commenced the next stage of learning (syllabus stage 5 early in Year 9). Because the feedback is not immediate, the results are helpful to teachers when evaluating their programs and making changes for the next cohort of students as discussed in Chapter 5.

Because the primary purpose of the EV test is to provide feedback to students, parents, teachers, schools and school systems about progress in student learning, the aim is to have as many students finish the test as possible. The test administration process is managed at the school level by teachers, and schools have one week in which to complete the exercise. To ensure that students are able to complete the test, time allocations for the sections of the test are listed as approximate only. Eight minutes is advised for the preliminary, practice items; 20 minutes for the three extended response items; an hour for the short response item sets; and about five minutes for the student survey.
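As a quick check on what the advised allocations add up to, the arithmetic can be sketched as follows. The figures are the approximate times quoted above; the variable names are illustrative, and the total is a guide only because the allocations are not enforced.

```python
# Advised (approximate) times in minutes, as listed in the text.
ADVISED_MINUTES = {
    "preliminary practice items": 8,
    "extended response items": 20,
    "short response item sets": 60,
    "student survey": 5,
}

total_minutes = sum(ADVISED_MINUTES.values())
hours, minutes = divmod(total_minutes, 60)

print(f"Advised total: {total_minutes} minutes ({hours} h {minutes} min)")
```

On these figures a sitting takes a little over an hour and a half, which schools can schedule at any point in the one-week administration window.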

In keeping with the purpose of providing feedback to individuals, students do the test individually as they would any other test. There is no competitive advantage to be had by ‘cheating’ because the results provide individual feedback about their learning relative to the syllabus and SOLO levels, not how well they are doing relative to other students.

Example five - EV test items, stimulus material and student survey

Schools are encouraged to keep and reuse the test items and tasks in their own school-based assessments because they are exemplary assessment items. Teachers have access to all the tests from previous years and related stimulus material as well as the assessment rubrics used to mark the three extended response items in those years.

The three extended response items in the EV test were placed immediately after the preliminary practice items when the test was in print form. Experience with external tests at that time, where extended response questions were at the end of the test, showed that many students simply ignored those questions. Placement at the beginning of the test obtained an almost 100% response. Of the three extended response tasks, one involved an investigation scenario; the other two primarily addressed syllabus knowledge and understanding expectations.

Extended response tasks are open ended so that students can respond at the highest level of understanding they are capable of demonstrating (see examples one and two above). The relevant syllabus references related to these two tasks are highlighted in the section of the EV framework provided as Table 2.7.

Short response items are written to identify not only a student’s knowledge and understanding but also their ability to comprehend at or above the lowest targeted SOLO level of thinking (as reflected in the wording of the item). Items are linked to a piece of stimulus material rich in science content from the syllabus for that stage of learning. The text provided is chosen from the range of experiences an adolescent learner is likely to have had or to know about. It might be an extract from a newspaper or magazine or an advertisement or a recount of a TV news item, for example. From three to eight items targeting a range of SOLO levels might be related to any one piece of stimulus material.

Items and tasks ‘look and feel’ to students like items and tasks in other external tests they do each year for NAPLAN. A test would have around 75 to 85 short response items. Not all the knowledge and understandings needed to satisfy item demands are provided in the stimulus material. Students are expected to use knowledge and understanding from the syllabus to respond appropriately. Students are expected to respond by choosing one from three to five alternatives (to identify the best answer) or to write a few words or the result of a calculation on the answer sheet provided. Distractors are chosen, where possible, to identify misconceptions students may have (see Figure 2.8, Item 14).

ESSA 2014 TEST Stimulus material and related items 9 to 16

Figure 2.8 One of the stimulus-item sets from the 2014 EV test. Source: NSW Department of Education, ESSA 2014 Test item.

Read the following article then complete items 9 to 16.

Why use a pool cover?

A pool cover is a great investment. Over a whole year, a pool can lose up to 5 mm of water each day. By using a pool cover, the water loss is reduced by about 95%.

Pool covers also extend the swimming season by increasing the pool’s water temperature by up to 8ºC.

A well-fitted pool cover keeps dirt, leaves and insects out of the pool. This also helps the cleaning equipment to keep the water suitable for swimming.

ESSA 2014 © 2014 NSW Department of Education and Communities page 6

9 Choose yes or no for each reason to answer the following question:

According to the article, what are the reasons that a pool cover is a great investment?

Yes No

prevents water loss

saves energy

keeps the pool cleaner

extends use of the pool

10 Using a net to remove leaves and insects from a pool is an example of

chromatography

filtration

sedimentation

11 Swimming pools would lose most water during

cool and cloudy days

cool and windy days

warm and cloudy nights

warm and windy days

12 On a hot day, the water on the surface of a pool would most likely undergo

a physical change

a chemical change

no change

13 Gaseous water is less dense than liquid water because particles in gaseous water are

closer together

further apart

smaller in size

larger in size

14 On hot days, water particles in the pool collide into each other more often because the water particles

have more energy

have less energy

get larger as the pool warms up

are made as the pool warms up

15 What is one environmental impact of covering a pool?

Australia would have fewer droughts.

There would be more water in dams.

People could swim for more months in the year.

Swimming pools would stay clean and leaf-free.

16 Pure water has the chemical formula H2O.

What type of chemical substance is pure water?

compound

element

mixture

ESSA 2014 © 2014 NSW Department of Education and Communities page 7

Figure 2.9 (in example 2 above) provides a task from the 2008 EV test and the descriptors for applying a code to student responses in the online marking process. A feature of the coding process is that markers are asked to code for the highest level of response evidenced in an answer. The text in the section under the task in Figure 2.9 outlines expectations for responses. For cycle 1 the language used is sourced from the expected learning related to magnets and forces in the K-6 Science and Technology syllabus. Cycle 2 response language is sourced from the Science 7-10 syllabus in use by schools at the time (BOS, 2003).
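The rule of coding for the highest level evidenced can be expressed as a small sketch. It is not the marking software used in the program. The level labels (U1, M1, R1 for cycle 1 and U2, M2, R2 for cycle 2) are assumed here as a convenient ordering of six SOLO levels, and the function name is hypothetical.

```python
# ASSUMED ordering of six SOLO levels across two cycles, lowest to highest.
SOLO_ORDER = ["U1", "M1", "R1", "U2", "M2", "R2"]


def code_response(evidenced_levels):
    """Return the highest SOLO level evidenced in an answer.

    `evidenced_levels` holds every level a marker judged to be present.
    Returns None when no level is evidenced.
    """
    unknown = [level for level in evidenced_levels if level not in SOLO_ORDER]
    if unknown:
        raise ValueError(f"Unknown level label(s): {unknown}")
    if not evidenced_levels:
        return None
    # The code awarded is the highest level present, not an average
    # and not a count of how many levels appear.
    return max(evidenced_levels, key=SOLO_ORDER.index)


# Hypothetical answer showing cycle 1 language up to the relational level.
code_cycle_one = code_response(["U1", "R1", "M1"])
# Hypothetical answer that also reaches into cycle 2.
code_cycle_two = code_response(["R1", "U2"])
```

Taking the maximum means that an answer mixing simple and sophisticated statements is credited for its most sophisticated evidence, in keeping with open-ended tasks that let students respond at the highest level they can demonstrate.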

Some of the student survey questions specifically address issues to do with the test, as exemplified by the extract from the student survey (see Figure 2.11).

ESSA 2014 student survey

We would like to know what you think about this science test and about science. This survey is not a test and there are no right or wrong answers. Your responses will be kept confidential so please answer as honestly as you can.

Complete this survey about science

I am interested in science.

I know about many careers that are based on science.

I want to study a science subject in Years 11 and 12.

Our knowledge about science is constantly changing.

Science helps me to make decisions about things in my life.

Science impacts on many aspects of my everyday life.

Protecting the environment for the future is my responsibility.

Science provides information about today’s important issues.

Science helps me to understand the world around me.

Complete this survey about the test and science lessons

The test was about what I learn in science class.

The test was easier than I expected.

I enjoyed doing the test.

Literacy is important in learning science.

It is important that all students learn science in Years 7 to 10.

Science is the hardest subject that I learn.

In primary school, I enjoyed lessons that were about science.

In secondary school, I enjoy science lessons.

ESSA 2014 © 2014 NSW Department of Education and Communities page 30

Figure 2.11 Questions about the EV test. Source: NSW Department of Education and Communities, ESSA Test, 2014.

Student responses are used as feedback to refine and improve the test and the test experience for students going forward. Schools also receive the feedback from students at their school and their responses can be compared to the rest of the state.

Which part of the test did you like best? Choose one.

Dissolving tablets
Expanding joints
I think I can!
What does your heart do?
Nicolaus Copernicus
Have you had your milk today?
Burn for you
Kata Tjuta
Why use a pool cover?
Wind turbines produce water
Spray-on skin cells
Earth’s cosy blanket
Popcorn bounce!
Coal
Bungeeeeeeeee!

Why did you like this part? Choose one reason.

It was interesting.

It was easy to understand.

It was about a familiar topic.

The test items were easy.

I liked the pictures in this part.

I learnt something new.

Complete this survey about your school subjects

My three favourite school subjects are

Aboriginal Education History

Agriculture Language studies

Dance Mathematics

Design and technology subjects Music

Drama PDHPE

English Science

Geography Visual arts subjects

Any other subjects

The three school subjects I think I learn most in are

Aboriginal Education History

Agriculture Language studies

Dance Mathematics

Design and technology subjects Music

Drama PDHPE

English Science

Geography Visual arts subjects

Any other subjects

ESSA 2014 © 2014 NSW Department of Education and Communities page 31

Appendix E: Proforma for case study schools to complete

SCHOOL:
DATE:

YEAR 8 ESSA-VALID STUDENT SURVEY DATA…Please obtain this from SMART

Columns for each statement below: LEVEL, then School and State for each of 2011, 2012, 2013, 2014 and 2015. Rows for each statement: levels 5-6, 3-4 and 1-2.

A. I want to study a science subject in Years 11 & 12
5-6
3-4
1-2

B. Science is the hardest subject that I learn
5-6
3-4
1-2

C. In primary school, I enjoyed lessons that were about science
5-6
3-4
1-2

D. In secondary school, I enjoy science lessons
5-6
3-4
1-2

E. My three favourite school subjects are (record the % for science)
5-6
3-4
1-2

F. The three subjects I think I learn most in (record the % for science)
5-6
3-4
1-2

For this page, fill in the boxes by estimating the scale reading in SMART for each of the six components. If pressed for time only do the odd years coming back from 2015… 5/6, 3/4 & 1/2 refer to student achievement levels as represented in SMART for the survey.

In terms of your priorities for science in Years 7-9, write the letter representing the six statements (A to E) in order of importance (most important first):

SCHOOL:
DATE:

For this page, if pressed for time, only complete the data for odd years beginning with 2015 and working back.

Y8 ESSA-VALID OVERALL RESULTS

Y8 ESSA-VALID EXTENDED RESPONSE

Y8 ESSA-VALID Plan & conduct investigations / Working scientifically

Y8 ESSA-VALID Problem solving and communication / Communicating scientifically

YEAR 10 SCIENCE (SCHOOL CERTIFICATE, NOT VALID)

[Blank data-entry grids under the headings above: rows for achievement levels (6 to 1, or grouped as 5-6, 3-4 and 1-2) and, for Year 10, grades A to E; columns for years between 2007 and 2015, each headed N =; cells marked ** for percentages.]

N = the number of students who sat the test that year
N = number in the year
** Copy SMART percentages into the relevant cell. The data is available from the annual school reports in SMART headed Percentages in achievement level.

SCHOOL:
DATE:

For this page, if pressed for time, complete only the odd years working back from 2015. I realize that not all schools offer all HSC courses.

COURSE / YEAR: 2010 2011 2012 2013 2014 2015

Total number of HSC English students for your school --->

HSC Physics
Total no in school course --->
* * * * * *

HSC Senior Science
Total no in school course --->
* * * * * *

HSC Biology
Total no in school course --->
* * * * * *

HSC Chemistry
Total no in school course --->
* * * * * *

HSC Earth & Environmental Science
Total no in school course --->
* * * * * *

* Divide science HSC course numbers by total ENGLISH numbers for that year and convert to a %

Appendix F: Science teacher survey questions

INSTRUCTIONS AND CONSENT

The purpose of this survey is to find out about your use of the ESSA/VALID program in the context of all your assessment-related work in science.

There are 26 questions and the whole survey should take you about 25 minutes to complete. You can change your mind at any time and stop completing the survey without consequences. If you choose to identify yourself, I will keep any data you provide confidential.

##############################################################

I have read and understand the material about the research provided in ATTACHMENTS ONE AND TWO forwarded to me by my principal.

I wish to proceed with answering the questions

SECTION ONE: ABOUT ESSA/VALID

The ESSA/VALID test has been a part of the Year 8 science experience since 2007. Feedback from the ESSA/VALID test and the related student survey accompanying it is provided to schools in Term 1 of the year following the test. The following items ask about your use of the ESSA/VALID test and related feedback. [Radio buttons for YES/NO]

1. In relation to ESSA/VALID results for my school, I have in the previous twelve months:

1a. looked at the results of the student survey

b. looked at the item analysis for my class/school

c. looked at the analysis of answers to the three extended response tasks

d. looked at the student profile information

e. discussed the item or task analysis with colleagues

f. discussed the item or task analysis with students

g. discussed the results of the student survey with colleagues

h. discussed the student profile information with colleagues

i. discussed the results of the student survey with students

2. I have in the previous two years:

2a. accessed ESSA/VALID-related materials in TaLE

b. accessed ESSA/VALID-related materials in SMART

c. used in my classes teaching strategies that I found in the ESSA/VALID-related Curriculum Links materials

d. accessed the ESSA/VALID Marking Manual/s for the extended-response tasks

e. used ESSA/VALID short response items in topic tests

f. used ESSA/VALID extended-response tasks in topic tests

g. used ESSA/VALID items &/or extended-response tasks in my teaching

h. used ESSA/VALID items &/or extended response items as models for writing new items and tasks in topic tests

i. contributed amendments to faculty programs as a direct response to ESSA/VALID results

j. written items for the ESSA/VALID test

k. been on a panel to evaluate ESSA/VALID items

l. been a marker for the ESSA/VALID extended response tasks

m. attended workshops about ESSA/VALID (NOT including training for ESSA/VALID marking)

3. Overall, I would rate my understanding of the ESSA/VALID program as

very poor / poor / acceptable / good / very good

4. I think the most important purpose for the ESSA/VALID test is…

[a box with 100 words limit]

5. My school will this year participate in the VALID science test for Year 10 students

Yes / No / Unsure

SECTION TWO: ABOUT SOLO

This next set of items is about the Structure of the Observed Learning Outcome (SOLO) model. [Radio buttons for YES/NO]

6. In relation to SOLO, I have in the previous two years:

6a. accessed material about SOLO in the Marking Manual for the extended-response tasks in the ESSA/VALID test

b. accessed material about SOLO in places other than the Marking Manual for the extended-response tasks in the ESSA/VALID test

c. explained the SOLO model to another teacher

d. used the SOLO model to explain the ESSA/VALID student profile to a student

e. used SOLO to develop assessment criteria for my assignments/tests/tasks

f. used SOLO to provide feedback to students about their learning

g. led a discussion with science staff about ESSA/VALID results and what they mean in terms of the SOLO model

h. worked in a science faculty where using SOLO is an explicit part of science faculty policy &/or practice

i. used SOLO concepts &/or model in regular student reports of science achievement sent home to parents/carers

j. used the SOLO model to explain the ESSA student profile to a parent or carer

7. Overall, I would rate my understanding of SOLO as

very poor / poor / acceptable / good / very good

8. I consider that I learnt most about SOLO when…

[a box with 100 words limit]

SECTION THREE: ABOUT "ASSESSMENT FOR LEARNING"

The following questions are about "assessment for learning" and related practices.

I'm wanting to find out HOW OFTEN (in a relative sense) you do the things described below in your day to day teaching and related work in your Years 7, 8 and 9 classes.

You may not know about or be unsure about some things listed here in which case choose the Not known/Unsure about button. [Respondents had the options of marking: Not known/unsure about, Never, Seldom, Sometimes, Often]

9. When working in the classroom with students, I

9a. tell them what they should know, understand, be able to do by the end of the lesson

b. give students the opportunity to set their own learning intentions for an activity or series of activities

c. explain to students the indicators or success criteria I will be looking for in their work

d. allow students some input in deciding what success criteria are to be applied

e. make the significance of what they are to do explicit to students

f. ask students why they think they are being asked to do the proposed activities

g. encourage peer feedback based on success criteria

h. use results from instant digital polling technology to inform next steps in teaching that lesson

10. When managing classroom discussions, I

10a. ask closed questions

b. ask open questions

c. use wait-time before responding

d. ask students to explain their thinking

e. use the “think-pair-share-report” strategy

f. use test or assignment items and tasks as stimulus for discussion

g. use samples of student work or responses to assessment items as stimulus material for discussion

h. explain my responses/thinking

11. I provide feedback on student's written work in the form of

11a. ticks

b. marks

c. grades (such as A to E)

d. comments about what they have done well (eg good work, excellent, well done...)

e. advice about how to improve

IF FOR 11e YOU CHOSE NOT KNOWN/UNSURE ABOUT OR NEVER, SKIP THE NEXT QUESTION (Q.12) AND GO TO Q.13

12. As the BASIS for my advice about how to improve, I refer to

12a. exemplary or model answers

b. success criteria

c. misconceptions evident in answers

d. SOLO levels of thinking

e. Quality Teaching dimensions and related elements

f. Bloom’s taxonomy/hierarchy of thinking skills

g. syllabus standards (syllabus outcomes & related content)

13. I provide opportunities for students to self-assess

13a. by getting them to write success criteria for activities & investigations

b. by getting them to construct assessment items and tasks

c. using success criteria/assessment rubrics or guidelines

d. by redoing work to a higher standard

e. by selecting items for a portfolio of work they judge as being consistent with nominated success criteria

f. by getting them to keep a journal of their reflections in their own words on what they have learned in science lessons

14. In my day-to-day preparation for and work in class, I

14a. ask students to give me feedback on my teaching

b. respond to students’ feedback on my teaching (this may not always be an immediate response)

c. evaluate lessons and record ideas for change next time

d. keep notes on learning issues noticed for individual students

e. change the planned ‘next step’ in a lesson in response to student feedback at the time

f. access and use information about “assessment for learning” in TaLE

g. access and use information about “assessment for learning” in the BOSTES website

h. access and use information about “assessment for learning” from other places/sources (apart from TaLE & BOSTES)

15. I collaborate with my science teacher colleagues to

15a. write items and tasks for tests &/or assignments

b. produce marking criteria/assessment rubrics

c. assess assignments/tasks/tests from each other's classes

d. develop a shared understanding of learning intentions and success criteria implicit in syllabus outcomes for junior secondary science

e. develop a shared understanding of what progression in science learning looks like

SECTION FOUR: ABOUT YOU AND YOUR TEACHING EXPERIENCE/CONTEXT

16. I am a: female / male / other

17. I have been teaching: 0-5 yrs / 6-10 yrs / 11-15 yrs / 15+ yrs

18. I am a science teacher by training/qualification: Yes / No

If NO to Q.18 my qualifications are… ______________________________________________

19. I am a head teacher, science: Yes / No

20. My HIGHEST science teaching qualification is: Bachelor degree + Dip Ed (or equivalent Postgraduate qualification) / BTeach (4 yr degree) / MTeach (5 yr degree) / Doctorate or PhD / Other science teaching qualification ______________________________________________

21. I completed my highest qualification in (what year) __________

22. My training/qualifications to teach science were undertaken: completely overseas / partly in Australia and partly overseas / completely in Australia

23. I teach/have taught Years 7-9 classes: this year / last year / the year before last / more than three years ago

24. At my current school there are this many Year 8 science classes: one / two / three / four / five / six / seven / eight / eight+

25. At my current school there are this many full-time science teachers: one / two / three / four / five / six / seven / eight / eight+

26. At my current school there are this many part-time science teachers: one / two / three / four / five / six / seven / eight / eight+

If you are happy to be contacted about this survey &/OR are interested in contributing to a case-study about ESSA/VALID and assessment practices, please provide the following information and identify yourself as requested below.

27. I am interested in finding out more about this survey (note you will need to provide your name and preferred contact details below): Yes / No

28. I am interested in finding out more about the case study and what it would involve (note you will need to provide your name and preferred contact details below): Yes / No

My given name is:
My surname/family name is:
My preferred contact mode is (please provide details):
My current school is:
I was appointed to my current school in (year):
My previous school was:

Thank you for giving your time to complete this survey. Your input will help me to better understand ‘assessment for learning’ practices used by science teachers in NSW. You might want to keep a copy of your responses so that you can compare them with the collated responses from all teachers who participated in the survey which will be provided to you in due course.

Jim Scott
April 2016

***PLEASE TAKE A MOMENT TO GO BACK OVER THE SURVEY AND CHECK THAT YOU HAVE COMPLETED ALL QUESTIONS BEFORE FINISHING***

Save and continue later OR Finish >

Appendix G: Interview questions for case study school participants (final)

1. What prompted you to join the case study?

2. What contribution does the EV program make to the assessment-related work done by you or your science teachers?

3. How do you prepare your students for the EV test?

4. Consider a topic you have just finished teaching or are now well into teaching. By what means do you collect evidence of student learning as you work through the topic?

5. To what uses do you put evidence of student learning?

6. What sorts of things do you do in the name of student to student (peer) assessment?

7. What about student self-assessment?

8. What are the main sources of information you access to inform your assessment-related work?

9. What are the school/principal priorities and how do they impact your work as a science teacher in Years 7-9?

10. What are your/science faculty priorities for science teaching in Years 7-9?

11. Thinking back over the past five years, what are the main resources used regularly by you and your teachers to support science teaching and learning in Years 7-9?

12. Of all the things you are doing in the name of science teaching, which is having the most impact on student learning in science? / How do you know?

13. Thinking about the survey you completed (participants were handed a page of five selected questions from the survey to review), how did you decide what seldom, sometimes and often meant?

14. If asked by a parent or new science teacher what “progression in learning science” means, how would you answer?

15. What is the nature and extent of discussion about assessment at science faculty meetings?

If the HT had brought the completed school data proforma to the meeting, the following question was asked:

16. When filling out the form, what response/s from students surprised you the most? Why did it/they surprise you?

Once responses concluded, I indicated that the interview was coming to an end and asked:

17. Was there anything you want to revisit or add before the recorder is turned off?

The interview was concluded by me saying: “Thank you for your time and patience… I hope you found the experience friendly and useful… this concludes the interview and I’m turning off the recorders now”. Once the recorders were off, I explained that when the study was completed I would be providing feedback on the survey results (to all secondary schools invited to participate) and case study summaries to participants.

Appendix H. Assessment related narratives for case study schools used to make pairwise comparisons.

The criteria for comparison are sharing the same SEA score and having different residuals; the more widely different the residuals are, the easier it is to see differences in assessment-related work narratives.

Pair ONE: Assessment narratives compared for PCWAE1 and MCWAE1

A. Engagement with EV feedback, resources and SOLO

PCWAE1

The provincial teachers participated in the case study to find out why their EV results were better than their NAPLAN results (as they had seen by comparing the proportions of students in each of the EV and NAPLAN bands to state proportions). They were early adopters of the EV program, having piloted it in 2005 and trialed it in 2006 before it became mandatory across the state from 2007. They also engaged with VALID10 when it was first offered in 2015 and indicated they would continue with it into the future. There was evidence that they used items from the EV tests in their own assessment tasks, but syllabus criteria rather than SOLO were the basis for marking student responses in the assessment related artifacts they provided. They admitted that their knowledge and understanding of SOLO was “very low” but they said they looked at their EV results each year. The student survey results were not usually looked at and no evidence was provided that the Year 8 EV feedback was used diagnostically during the years they were reviewing their school program in preparation for the new syllabus being implemented from 2014 (in Y7 & Y9). No comments were made about their experience with SOLO when marking the VALID10 tests.

Whilst there was a longstanding and strong focus on scientific literacy and getting students to make appropriate use of scientific terms in reports and explanations, it was not apparently connected by them to the second cycle SOLO levels. Two of the three teachers identified the diagnostic purpose of the EV test in their responses to the teacher survey question asking about the most important purpose of the EV program.

MCWAE1

The school has a considerable refugee intake each year (around 30%). Many students have little, if any, formal education or English language skills before arriving in Australia. The first experience for these students is in an Intensive Language Centre before transitioning to secondary education when they reach an appropriate level of language proficiency. The HT and three of her staff attended the interview, which went over the hour. The school has embraced the EV program, including VALID10, from its inception and sees it as a useful resource among many for helping their students to learn science. The teachers look at the results when they come out and report they have used achievement feedback to make changes to their programs. Teachers do not have access to the student survey feedback. In relation to the EV test, they do spend some time helping students to prepare by giving them access to sample questions from past papers. Teachers report that students enjoy doing the test. They wanted to join the case study in order to receive feedback on their assessment practices.

B. Grouping for instruction

PCWAE1

Each year the provincial school established two relatively small (fewer than twenty students is not unusual) Y7 mixed ability classes based on student data provided by the feeder K-6 schools. The two classes go on to Y8 largely unchanged. The school chooses to establish two small classes in each of Years 7 & 8, but then forms two combined Y9–10 classes which are very large (in excess of thirty students in each).

MCWAE1

Each year three Year 7 classes (with fewer than 20 students) are established based on the level of literacy skills. Classes are ungraded from the perspective of prior science experience or learning. Assistance is provided to classes by learning support teachers to help the high proportion of students with little formal education and very limited English language proficiency. Classes are retained relatively unchanged until the end of Year 10.

C. Use of learning intentions and success criteria

PCWAE1

Analysis of interview responses and artifacts provided by the provincial school established that the teaching program was explicitly based on syllabus intentions as expressed through outcomes and related content.

In program outlines, under the heading Indicators of student achievement, a list of science vocabulary students were expected to acquire was provided, as was a list of what students were expected to know and understand, and a separate list of what students needed to be able to do (skills) by the end of the topic. Learning the spelling and meanings of words in the vocabulary list for each topic was the main source of formal homework. Indicators based on the contents of these lists were evident in the criteria included in rubrics/scaffolds for tasks related to the topics being taught.

Teaching activities and related assessment tasks described in the programs were aligned to syllabus intentions. In relation to the lived experiences of students, the teachers commented that about one third of students lived on rural properties but did not recognise the science in the day to day plant and animal husbandry work and the equipment used to do that work. Addressing this disconnection between science and the students’ life experience was a priority for the teachers.

Assessment tasks were assigned with rubrics that clearly described expectations based on syllabus outcomes. Teachers said they used the rubrics both to introduce tasks and to provide feedback to students once tasks were assessed.

There were two formal pen and paper tests for each of Y7 and Y8, based on syllabus working scientifically content. The tests included free response extension questions that asked students to explain using scientific models (particle model) and to identify and correct misconceptions in examples that were given.

The teachers said they prioritized practical activities and report writing. Students did three research projects across the four years seven to ten (the syllabus specifies a minimum of two). Examples of both laboratory and field work were provided in the artifacts. Expected learning from those experiences was typically scaffolded in a worksheet or modeled using a textbook example. Open-ended questions were evident in those worksheets. The scaffolds were informed by expectations described in the skills section of the syllabus.

Reporting to parents is in grades aligned to curriculum standards (A to E), which is national policy (the same applies to MCWAE1).

Teachers reported that they did not often use ICT in Y7/8 science classes.

MCWAE1

The science department program has four ten-week topics mapped to syllabus outcomes for the four content areas (in Year 7 the topics are Forces, Chemical World, Earth and Space and Living World). Syllabus expectations are also mapped to the eighteen elements of the Quality Teaching Framework and references to the cross curriculum aspects of learning are explicitly identified as well. Syllabus outcomes targeted include Values and Attitudes, Working Scientifically and Knowledge and Understanding. Learning activities are described in terms of lesson outcomes that appear to require from one to a number of lessons to achieve. A diversity of resources are identified to work with, including conventional text books (e.g. Core Science, Science Stage 4) and worksheets describing activities to be performed and writing to be done.

Assessment is by conventional topic tests and end of semester tests. Students are provided with a range of options for responding including multiple choice, short and longer response items involving students writing descriptions and/or explanations. Some items have interesting stimulus material associated with the item. There is a word puzzle at the end of each test for students to engage with if they finish early. An example of a practical test and a research project scaffold was provided for Years 7 and 8. No rubrics linked to syllabus outcomes or different levels of answers were provided as models for students to work with. Students did not appear to have much, if any, say in choosing or devising learning intentions or success criteria.

D. Classroom discourse and evidence of learning

PCWAE1

Teachers reported that group work is common and instruction is provided to students about how to work cooperatively in the classroom, laboratory and during field work (using the local river and a ‘wetlands’ area).

Teachers reported that they used a predict-observe-explain strategy to focus discussion of practical work and as a preliminary step to writing up a practical report. The teachers reported that school-based learning support staff were regularly invited visitors to their science classes to help students struggling with literacy skills.

Teachers described some of the early work done in topics as opportunities for verbal pre-testing and students were helped to construct mind maps as a way of summarizing their learning.

School policy placed great emphasis on literacy learning as a key to helping all students succeed. The science faculty supported this emphasis in its homework policy (acquisition of scientific vocabulary) and in classroom work where students were supported and encouraged to verbalise their experiences using the appropriate vocabulary early and often.

The teaching programs for Years 7 and 8 were organized into ten week topics (one per school term). The programs also listed resources such as relevant videos, text book sections and excursions, which were an annual event for students in Years 7-10. One excursion for Y8 students involved a visit to La Trobe University to raise student awareness of post school options and another was an extended five day trip to the NSW south coast. Students were required to write reports of these activities.

MCWAE1

Science learning activities provided to students at this school were diverse and included conventional classroom based activities using textbooks and worksheets, laboratory activities involving equipment and report writing, excursions beyond the school gates and visits to the school by people that work in STEM careers (such as CSIRO and Questacon). Some access to ICT is provided in the library for research purposes. Students do ICAS tests and participate in the Big Science competition. Many attend the after-school homework centre and do science homework there, including science vocabulary and spelling related work. There is a heavy emphasis in lessons on talk using scientific vocabulary (whole class discussion is common). There is explicit instruction relating to group work and roles to be performed. Students are not confident talkers, especially in Years 7 and 8. Students have a strong preference for rote learning (that seemed to teachers to be related to expectations based on experience brought from other cultures).

E. Feedback

PCWAE1

Feedback on assessment and other tasks often took the form of discussion with students about the rubric criteria and how they were used to allocate marks that mapped onto a five-point scale ranging from unsatisfactory to outstanding.

When asked about progression of learning (one of the questions in the survey) they did not readily relate syllabus outcomes and content with the idea of learning progression.

The school was making less use now of the Educational Assessment Australia (EAA) International Competitions and Assessments for Schools (ICAS) science tests (20 students sat them in 1999; last year only two did so). Science teachers did not use the results for diagnostic purposes. They did use items from them in their class and assessment tasks. Certificates about participation and achievement were handed out at a regular school assembly.

In the interview, the relieving DP drew attention to the good results in Year 10 (see the proportion of As for the school relative to the state in Table K.3 in Appendix J) and commented that results there did not translate all that well to the HSC, which he found puzzling and for which he could find no explanation.

He also commented on the students’ apparent low enjoyment of science lessons compared to the state as something he could not explain (see Table K.2 data). The other two teachers said they had asked students why they didn’t like science and were told that it was because “you” (science teachers) followed up to ensure work was completed. This explicit interest in asking students why they did not like science was a response to reading the survey feedback in preparation for the interview.

MCWAE1

Talk in the interview indicated a strong emphasis on oral feedback during lessons largely related to building language skills in the appropriate use of vocabulary related to the concepts and skills and processes of science being taught at the time. Pre-testing was not mentioned. Feedback on tests and work was provided by teachers in terms of marks and discussion of answers. SOLO was not mentioned nor were syllabus outcomes or expectations. Research project reports were heavily scaffolded and teachers reported to the researcher that time was given to explaining what the different components are. Class and home time was given to the projects.

F. Activating students as instructional resources for others

PCWAE1

Peer feedback was sought when oral presentations or models were produced. No other details about that feedback were provided.

MCWAE1

The only formal opportunity for peer feedback appeared to be during group work in the context of practical work in the laboratory. Teachers commented that the classroom setup did not support using a think-pair-share strategy.

G. Activating students (and teachers) as learners

PCWAE1

This mostly took the form of teacher-led discussion about student work in the light of teacher provided rubrics. The rubrics described a range of responses showing the features of responses that achieved high marks. The rubrics were inevitably related to syllabus expectations. Intensive literacy work with students in Y7 & Y8 was followed up in Years 9 & 10 with expectations that students would use those skills to work independently whilst their teachers were providing time to various groups in the class, given that students were not only mixed ability but across two grades. Self-assessment opportunities were provided as early as in Y7 and teachers provided feedback on it (see Figure 6.1).

Example 1: PCWAE1

Name: ________________________________________ Project: ________________________________________

Purpose of toy: _________________________________________________________________________

Are you happy with your final project? Why/Why not? _____________________________

What are some things you did really well? ___________________________________________

What are some things you could have done better? _________________________________

Thanks, Good job

Example 2: PCWAE1

Marks: Outstanding -- 6 // High -- 4-5 // Sound -- 2-3 // Basic/limited -- 1

Justification (opinion + reasons)

Outstanding: Clear record of changes made to toy and a reason for each change. Overall justification of final design and product (toy).

High: Some record of changes made and reasons for these given. Brief justification of final design and product (toy).

Sound: Minimal record of changes made with little or no reasons given. An attempt made to justify final design and/or product (toy).

Basic/limited: An attempt made to justify either their design or final product or any changes made.

Figure 6.1 Opportunities for self-assessment in Year 7 Making a Toy task
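
The mark bands listed in Example 2 amount to a simple lookup from a mark to a descriptor. As a minimal sketch only (the name `band_for_mark` is hypothetical and not part of the school's materials, and marks outside 1 to 6 are treated as unclassified because the rubric does not list them), the mapping could be expressed as:

```python
from typing import Optional

# Mark bands as listed in Example 2: (lowest mark, highest mark, band label).
BANDS = [
    (6, 6, "Outstanding"),
    (4, 5, "High"),
    (2, 3, "Sound"),
    (1, 1, "Basic/limited"),
]


def band_for_mark(mark: int) -> Optional[str]:
    """Return the rubric band for a mark, or None if the rubric does not cover it."""
    for low, high, label in BANDS:
        if low <= mark <= high:
            return label
    return None  # e.g. a mark of 0 is not described in the rubric


if __name__ == "__main__":
    for mark in range(0, 7):
        print(mark, band_for_mark(mark))
```

Keeping the bands in a single table mirrors the way the rubric itself is laid out, so a change to the mark ranges would only need to be made in one place.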

The teachers said they met regularly both informally and formally to work on aspects of science teaching and related assessment tasks, which were often collectively marked. They were clearly enthusiastic about their work. Faculty programs provided were comprehensive and whilst they did not have space for written evaluation, it was clear from the discussion that the new programs at the school for the new syllabus had been collaboratively developed.

MCWAE1

The teachers met weekly and assessment was often discussed, they said. The development of the teaching programs was a shared activity. Teachers participated in the interview and were supportive and respectful of each other in that discussion.

H. Comparative summative comments

When results were compared at the end of Years 8 and 10 the provincial school had the better results. Also they had a higher proportion of senior science course completions (as a proportion of the students at their school). The comparisons made here supported the three predictions, even though the schools were both WAE schools.

The assessment narratives from both schools, when compared, revealed that in the early years of secondary science teaching both schools made use of a variety of contexts for teaching science which in turn meant that students had opportunities to provide evidence of learning. Teachers at the provincial school made greater use of rubrics related to scaffolded tasks and they provided feedback during and after completion. The feedback was in terms of syllabus expectations and marks awarded as recommended in the Board’s Common Grade Descriptor outline. There were more opportunities at the provincial school for peer- and self-assessment. Summative assessment at the metropolitan school was more strongly linked to traditional testing than at the provincial school.

Whilst the three predictions lend weight to the conclusion that teaching at the provincial school was more closely aligned to the formative practices profile of WAE schools as identified in chapter five, the level of engagement with science when compared to both the metropolitan school and the state was not in line with expectations for self-regulated learners (the expected outcome from teaching characterized as formative as discussed in chapter two). Overall, students at the provincial school were less positive about their school science experience than their metropolitan counterparts.

It was impossible to identify from the assessment narratives why students at the provincial school had such poor perceptions of their school experience of science at the end of Y8. The teaching program at the metropolitan school, compared to the provincial school, was more like that described in the left hand column of Table 2.1, yet students at MCWAE1 were the most positive about their school science experience of all the case study schools (see Table K.5 in Appendix J). Teachers at the metropolitan school said that even though parents did not come to the school often, they were aware of strong support for teachers and learning by parents who often bought textbooks for students to keep and use at home.

Pair TWO: Assessment narratives compared for MCAE2 and MCWBE3

A. Engagement with EV feedback, resources and SOLO

MCAE2

The principal was keen for the school to be involved in the case study and expressed interest in any feedback to come out of the process. The head teacher science was also the relieving deputy principal (R/DP) at the time of the interview and the only person at the interview. He had been at the school in the head teacher position in the period of interest for the project. The science department had not engaged with SOLO but were focused on syllabus outcomes and the Board’s approach to grading. The R/DP reported that students took the EV test seriously and appeared to enjoy the experience. The school provided no special preparation for it. The school had engaged with VALID10 and were planning to continue with it. The school had not done the proforma or collected artifacts prior to the meeting.

MCWBE3

The metropolitan school did not take up VALID10 in 2015 and it had no plans to do so in 2016. The HT reported that when she had arrived at the school, the science staff had very limited understanding of assessment for learning and had not made use of EV feedback at all. The HT said that she and another new staff member who had arrived at the school in the same year were the only ones who knew anything about SOLO, which she characterized as “all about” recognizing “connections.”

The focus for now, she said, was on improving teaching and learning practices in junior secondary science and making use of data (from assessment generally and SOLO in particular) to target resources to that end.

There was strong evidence in the artifacts of a focus on scientific literacy and appropriate use of scientific terminology in reports. However, nothing was said by her to link this to second cycle SOLO responses. This emphasis appeared only to be recent (i.e., after the HT’s arrival at the school and after the period of interest).

B. Grouping for instruction

MCAE2

The school was promoting itself as a school with a special interest in STEM broadly and biosciences in particular. Each year the school provides a ‘selective entry’ test for local Y6 students that includes science questions as well as general ability and literacy and numeracy skills. The class selected through that test is provided with an accelerated program and completes the four year science course by the end of Year 9. The other four classes are unstreamed and students remain in their class until the end of Year 8. The R/DP indicated that this was consistent with a deliberate ‘middle school’ approach to the first few years of high school aimed at providing support and stability for students to assist them with transfer from primary to secondary education. Some twenty students each year are provided with additional learning support assistance.

MCWBE3

The metropolitan school established six or seven classes (depending on numbers to be enrolled) in Y7 each year. The established classes are the same for the four core subjects and remain relatively unchanged until the end of Y8. Students are allocated to classes based on student data from the feeder primary schools. Two parallel high achiever classes and four or five mixed ability classes are created by the Y7 adviser and other staff (not science) at the school. Changes, when they are made, are negotiated across the faculties using a diversity of criteria but typically they are unrelated to science assessment results.

C. Use of learning intentions and success criteria

MCAE2

The R/DP did show me some tasks students in Years 7 and 8 were given. Learning intentions and success criteria based on the syllabus were a major focus in the research and other tasks students engaged with and they informed the assessment rubrics used by teachers to mark them. Evidence of learning was primarily gathered from these tasks and used as the basis for reporting to parents twice yearly. The science teachers provide a 300 word report on science achievement twice a year to parents. The reports include specific references to teacher observations of students’ work to illustrate aspects of achievement relevant to the reporting categories addressed in the rubrics.

The school retained a Year 7 annual test, but most of the evidence of learning comes from 4-5 tasks students do each year, one of which is a practical task. The tasks put a strong focus on science processes and try to engage the students by making them relevant to student interests, including making models (e.g. parachute and egg drop activity) and explaining them to other members of the class. Teachers provide assessment feedback. Teacher judgment of learning is conveyed in marks which are then translated into grades for the purpose of reporting to parents.

No sample programs were provided, but the description provided was of four to five STEM/Bioscience topics in each of Years 7 and 8 that “are identifiers of the school.” These are cross curriculum courses including PDHPE and HSIE, and the R/DP wrote these programs. Each department provides two hours in a fortnightly cycle for this program. A need to strengthen understanding and awareness of the scientific method and the ability to investigate scientifically was recognised, and this has led to the shift to an inquiry/project based learning emphasis. Success is seen in terms of a growing number of students taking up senior science courses. Students are taking up school courses in Year 9 & 10 (courses in forensics and zoology) in good numbers.

MCWBE3

The HT described the faculty culture she had inherited as “traditional and resistant to change”. When she arrived at the school she observed a “wide spectrum of learners” at the school but few differentiation strategies in science programs for meeting those needs. As the interview progressed, her grasp of what those strategies could be was elaborated by reference to resources she had developed with staff at the school. Artifacts of this new work were clearly aligned with the syllabus.

At the time of her arrival staff were not keen to “do more than required” and none of them had marked either HSC or EV extended response tasks. She observed that when she first arrived staff worked individually to produce assessment tasks which were then individually marked. She also said that at the time of her arrival staff had a poor understanding of the BOS Common Grading Scale and their processes for translating marks into grades were unrelated to syllabus standards and thus inconsistently arrived at across the classes.

She said that it was her observation that students started science in Y7 looking forward to and liking science but were “disengaged” by the end of Y8.

Teaching programs in the period of interest were for topics lasting five weeks (now they are ten weeks). The HT was not happy with the school science programs she had inherited, which in her view were “all over the shop” and had been developed as a joint project with several other schools. Staff did not have enough understanding or willingness to do a scope and sequence for new syllabus topics. One of her first actions was to persuade staff to work on creating/collecting resources for new “scope and sequences” (teaching program outlines mapped to syllabus intentions) which she and the other new teacher had developed for years 7 and 9 soon after their arrival at the school.

There was very little use of ICT in Years 7 and 8 science classes. Prior to 2015, worksheets provided by teachers and textbooks were the main resources used to support teaching and learning, she said. Very little work was done outside the classroom then and there were no science specific excursions before she arrived.

The HT has prioritized getting more students to think (in science classes) and to take senior science courses and she is doing that by building the teaching and assessment skills of her staff. There was no mention of linking SOLO levels to the discussion about what the teaching of thinking might involve.

Literacy and writing in particular are school priorities which the HT says they are embracing now in science, making good use of EV extended response tasks to that end.

D. Classroom discourse and evidence of learning

MCAE2

Reportedly, learning tasks are assigned in class and worked on in students’ own time as well. Group work is encouraged and supported. Teacher observation of student teamwork skills as well as their individually written reports provide evidence of learning. Whole class discussion is strongly encouraged; think-pair-share-report like strategies appear to be used in some classes and some reflective writing by students is encouraged. Students were provided with a diversity of tasks, most drawn from a wide range of contexts and other learning areas (see above). The school engages with the Young Scientist competition and provides some students with the ICAS science tests as well, but little was done with the feedback apart from providing the certificates to students when they were returned to the school. The school engages with National Science Week and puts on activities for feeder primary schools.

MCWBE3

Information about classroom practice before 2014 was anecdotal but the HT referred to heavy use of textbooks, worksheets with limited opportunities for extended written responses and conventional laboratory practical work designed to confirm syllabus prioritized theories. Practical work was conducted in groups, but the HT reported that she had little evidence of purposeful use of group work for peer supported learning.

The school has a learning support unit and a number of students are receiving support from its teachers.

The timetabling software used delivers a number of split classes in Years 7 and 8 (typically one class shared between two teachers) and a number of classes were and still are taught in the junior secondary years by PDHPE teachers.

E. Feedback

MCAE2

The R/DP reported recent engagement with Hattie’s Visible Learning (2009). It was not clear how far back into the period of interest this influence extended. Marks were given related to rubrics based on syllabus outcomes, but teachers also gave written feedback explaining why the mark was given and suggestions for better answers were also provided. Feedback was provided against rubric criteria in the context of classroom work either one-to-one with the teacher or whole class managed by the teacher.

MCWBE3

The HT reported that when she had arrived two years ago, feedback to students from assessment and other tasks was basic and involved reporting back of marks with, as far as she could ascertain, little discussion or diagnosis. Assessment then was dominated by end of topic tests and marks were recorded and used for reporting summatively. Tasks completed in class were simply marked and handed back with rudimentary discussion (if at all).

EAA/ICAS tests were and still are offered to the top two classes but results are not used for diagnostic purposes.

F. Activating students as instructional resources for others

MCAE2

Opportunities to provide feedback to peers appeared to be limited to group work during class work. Some students were also involved in demonstrating to Year 6 students during Science Week activities.

MCWBE3

The HT was not aware of any peer assessment opportunities being provided in science classes prior to her time at the school.

G. Activating students (and teachers) as learners

MCAE2

Some opportunities were provided in some classes for self-reflective writing, but no information about follow-up was provided. The R/DP indicated that there were regular science department meetings (once a fortnight) and that assessment and programming were discussed. The science department had a STANSW membership and staff participated in marking of external exams (HSC) and participated in other professional learning activities related (most recently) to Hattie’s Visible Learning program. From that it could be inferred that staff modelled good learning behaviours with each other, but the extent of that modelling for students was unclear from the information provided in the interview. Also, there was awareness of the need for differentiated curriculum to meet the diverse needs of talented students and different approaches for the twenty (estimated) students in the junior secondary years who were on a modified program (which were not individual life-skills programs). Characteristics of that differentiation were not provided.

MCWBE3

Again, the HT reported that at the time of her arrival there was little evidence of any self-assessment activities or strategies in use. Staff tended to work independently and it has been a slow process upskilling them in assessment literacy since then. Teams have been established within the science faculty to facilitate cooperative development of programs and related resources. This collaboration, she reported, had been effective in raising staff awareness and understanding of assessment issues.

H. Comparative summative comments

MCAE2 was a new school that had, at the time of the interview, only had its full complement of students from Year 7 to 12 for a few years. In that time, it had deliberately sought to establish a STEM/bioscience identity for itself and provided students with a learning program that reflected that emphasis. MCWBE3, on the other hand, was a well-established school that provided its students with what was reported by the new head teacher as a “traditional” program.

For prediction one, when achievement at the end of Year 8 was compared, MCAE2’s results across all four result categories were positively biased toward the top band of achievement, more so than those at MCWBE3. Both schools had a top stream of students, which may account for the positive skew in both sets of results compared to the state. MCAE2 results at the end of Year 10 were slightly positively biased, but could not be compared with MCWBE3 because that school did not provide any Year 10 results to be compared.

In relation to engagement at the end of Year 8, given the priority given to STEM at MCAE2, the level of student engagement (as measured by the combined scores for Items D and E) was not that different from that at MCWBE3. Top band students at the AE school were only two places higher than those at the WBE school (9th and 11th respectively out of sixteen schools, with the state score counted as a school); both ranked below the state score (2nd out of 16). The rankings (out of 12) for the total school results were the same (9th and 11th respectively, compared to the state, which ranked 5th). Based on the assessment narratives derived from the interviews at both schools, this was an unexpectedly close result, particularly for the AE2 school, which should have returned a more positive result.

Engagement was also assessed by looking at Year 12 completions relative to the state. In this comparison (see Table K.4 in Appendix J), at the AE school, Biology completions were 200%, Chemistry just over 100% and Physics was 63% (neither school offered Senior Science). By comparison with the state, the WBE school Biology proportion was 74%, Chemistry was 39% and Physics was 56%.

The figures above support the first prediction; no conclusion could be drawn in relation to prediction two and prediction three was supported.

It was impossible to identify from the assessment narratives why students at both schools had such poor perceptions of their school experience of science at the end of Y8.


Pair THREE: Assessment narratives compared for PCWAE2 and MCWBE5

A. Engagement with EV feedback, resources and SOLO

PCWAE2

The provincial school was ambivalent about the EV program. On the one hand, the HT said it was useful for both diagnostic purposes and comparative purposes but did not elaborate on how. On the other, it transpired in the interview that the science staff had a negative view of its contribution to the assessment practices at the school (apart from its value in showing the comparative strength of their EV results compared to NAPLAN).

The HT was concerned about the validity of VALID10 because they had recognised “rehashed” Y8 ESSA questions in it. They thought that school marking reduced its value for comparative purposes. They would not be doing VALID10 this year (2016), with the HT saying that the school had computer access issues, that she would not be at the school in Term 4 and that the science teachers did not see the value in it.

The teacher who was to be relieving HT (for the next twelve months) had joined the interview toward the end. She reported that neither she nor the other staff could see the benefits of using SOLO as a basis for assessment because it conflicted with the Board’s grading system and students had found it confusing to deal with both systems.

EV results are handed to parents at the first parent-teacher night of the year. The HT reported that:

• parents don’t ask questions about the EV test;

• there is no special preparation for the test;

• students like the online science test and take it seriously, as they do NAPLAN;

• the school is focused on results and the principal is “happy” with science results generally and their EV results in particular.


The HT was aware from her own analysis that the school’s EV results were better than the school’s NAPLAN results but did not elaborate on how she had arrived at that conclusion.

MCWBE5

The HT said he agreed to participate in the case study to have a say about the EV program, which he saw as problematic. Reasons given included giving science a special status which he was personally uncomfortable with, issues with access to computers (since it went online), science staff not keen to supervise it and a school executive which he said was not interested in the results.

He acknowledged that the test provided good questions which he said were used in their own school tests. He said that the faculty was not given any time by the school to digest EV feedback (compared to NAPLAN results).

He expressed regret at the loss of the Y10 statewide science test (stopped after 2011) because it provided a target (grade pattern) to aim for at the end of Year 10 but also said that they would not be taking up VALID10.

B. Grouping for instruction

PCWAE2

The HT science at the provincial school is responsible for managing the composition of classes for Science, PDHPE and Social Sciences. Students are initially placed in three mixed ability classes using primary school literacy and numeracy data. After six months, students are reorganized into separately graded classes for English, Mathematics and Science based on summative assessment results in each of the subjects for semester one.

The top class in Science has close to thirty students in it; the bottom class has around twenty students in it and is provided with learning support. There is a six-monthly review of class placements in Science and students are moved (either up or down) if changes in their performance warrant it. This potential for changing classes continues up to and including Y10. The process is supported by the HT Science who describes herself as a traditional science teacher.

MCWBE5

From day one in Y7 the metropolitan school places its new students in classes according to four different sets of criteria: a top class of “gifted and talented” students, a second class of “independent learners”, two or three classes (depending on numbers) of mixed ability students and a bottom class of students with learning disabilities and otherwise poor learning histories. These classes are the same for English, Mathematics, Science and Social Sciences and students remain in those classes up until the end of Y8. At the end of Y8 all students are graded on the basis of their results from a common assessment task (typically a test) and put into a class based on their rank in the year. They typically stay in that class for Years 9 & 10.

Students are invited to join the “gifted and talented” class on the basis of their results in a test they applied to sit for in Y6. The test was set by the secondary school and did not include any items related to scientific literacy. Students are allocated to the “independent learners” class on the basis of advice from their Y6 teachers. An interesting feature of this school is that it only admits 12-15 students to the top and bottom classes each year. They stay in the class for two years and they have the same Science teacher for the two years.

No explanation or commentary about the merits or otherwise of setting up classes in this way was offered by the HT science. He did say that the bottom class was provided with additional support from time to time by learning support teachers to improve literacy and numeracy levels.

C. Use of learning intentions and success criteria

The science faculties from both schools have high profiles in their local communities and how this is achieved by each of them will be described below.

PCWAE2


The HT explained that from Y7 the policy is to expose students to high expectations in relation to using the language of science and there is close alignment between syllabus intentions, teaching and assessment in the work of the faculty. There are consequences for students who perform very well or not so well. They may be promoted or demoted a class at the middle or end of the year (for the new school year).

The provincial students perform consistently well in local, high profile community agricultural events such as region-based “Hoof and Hook” competitions which are well publicized in the local press. In any given year, teachers of Agriculture are very busy with activities such as the above that take them outside the school during the school day, after school and on weekends. The HT reported that the Science, Agriculture and PDHPE faculty was the “strongest” performing faculty group in the school. There is a strong emphasis in the faculty on competition as the way to get the best out of the students.

In recent times, with the exception of local agricultural events, the school has been withdrawing from general science and technology based excursions and activities beyond the school due to the costs (one example mentioned was the withdrawal from the University of Newcastle’s Engineering Challenge). Instead, local resources are increasingly being relied upon (such as having a local Aboriginal elder in to talk to students). According to the HT the school was not overtly responding to the recent STEM initiative by the Department as staff at the school have for some time been using agricultural contexts to create interest in science based careers (Artificial Insemination for cattle and Genetic Modification for Canola seed were given as examples).

The science faculty is heavily invested in the school’s literacy program and contributes a period a week (as do the other three core learning areas) to generic literacy activities provided by staff from all faculties in the school. The HT science said she is lobbying for more report writing to be included in the program.

The HT said her priority was to maximize participation in science in the senior school. To that end programming had been pared down to four topics a year with titles such as Biology 7, Chemistry 8 and Physics 9 so that students know what the content of the senior subjects is when they choose them in the second half of Y10.

The faculty has four assessment tasks per semester, two of which are formal tests (down from more than a dozen over the year when she had first arrived). The intention behind the reduction in assessment tasks was to provide more time for teaching and she reported that since doing that, results have improved.

Students do a research project each year which is allocated both school and home time to be worked on. This is more than the syllabus requires (it suggests at least two be done in the four-year program). None of the other case study schools ran a major research project each year. The student research projects are heavily scaffolded to ensure that a traditional report involving an aim, problem, variables, method, results of observations (tables and graphs), conclusion, discussion and bibliography is produced as the expected product.

Teaching programs are organized around syllabus knowledge and understanding outcomes and related content. Investigating and communication skills are addressed in the rubrics for the various tasks embedded in the program. Those tasks are both teaching and learning activities as well as assessment tasks. Among the artifacts provided was a Y7 task requiring students to produce a poster showing how to separate a mixture (one chosen from a number of actual examples within the experiences of students) and another task requiring students to produce a written report on a topic (e.g. heart transplants) relevant to the Y8 Biology and Society topic.

Teaching and learning programs also list the resources available to do the task, which include traditional textbooks and worksheets. A column is provided for teachers to add any adjustments they have made to the listed program. To assist with this task, teachers are provided with a one page summary of suggestions for adjusting teaching to ensure that students have access to syllabus outcomes (see Figure 5.2).


Assessment tasks are supported by rubrics that spell out expected learning and how responses will be marked. They are based on the Board’s Common Grade Scale and marks are awarded in line with rubric criteria and discussed with students. Collated marks are aggregated and recorded and teacher judgment is used to convert marks to grades for the purposes of reporting to parents. Staff are given time to work through the criteria to ensure some consistency of judgment and subsequent marking is shared to further support that.

Students use notebooks to keep a record of their learning activities. Worksheets are expected to be stuck into their notebooks, which are expected to be brought to every lesson. Monitoring of bookwork by teachers is not a high priority but they do encourage and assist students to peer assess each other’s bookwork (see below).

Figure 5.2 Advice on adjustments to teaching to accommodate student differences

EXAMPLES OF ADJUSTMENTS TO TEACHING AND LEARNING PROCESS IN SCIENCE:

AMOUNT TO BE COMPLETED
1. Reduce no of questions / amount to learn.
2. Reduce length of oral presentation.
3. Reduce length of written response / reading.
4. Reduce homework.

TEACHER INPUT
5. Use visual aids / pictorial directions.
6. Provide concrete examples / hands-on activities.
7. Plan for generalisations / links to real life learning.
8. Repeat / model / highlight language and important points.
9. Provide cues & prompts.
10. Simplify language.
11. Pre-teach vocabulary.
12. Specialist teacher input.
13. Provide training & assistance to help student use specialised equipment.
14. Explicit teaching of skills eg problem solving/social

LEARNING ENVIRONMENT
15. Sit student at front of class.
16. Provide separate space in classroom for individual tutorials.
17. Evaluate & plan for new environments e.g. camp.
18. Support understanding of appropriate when in non-class environments e.g. social stories
19. Adjust environment to support needs arising from disability e.g. access for wheelchair.

TIME
20. Individualise timeline to complete task.
21. Allow extra time to complete task / respond.
22. Allow extra time to use specific equipment.

STUDENT OUTPUT
23. Adapt how learner responds to instruction.
24. Instead of written response – allow verbal.
25. Use of communication device.
26. Focus on hands-on learning.
27. Note-taker / Scribe
28. Use of cloze, matching activities, short answer, multiple choice, portfolio, technology / computer supported response.
29. Student focuses on own goal within class activity e.g. communication, self-care, health issues, use of Braille.

MATERIALS / RESOURCES
30. Notes provided for student.
31. Use of computer, iPad, etc.
32. Use of disability-specific materials e.g. audio format, braille, larger font, coloured papers.
33. Talk to text, speech recognition software.
34. Hands-on materials, simplified timetables etc.
35. Vary arrangement on page, size of writing, visuals, and point form.
36. Captions/subtitles for visual sources.

LEVEL OF SUPPORT
40. Change the amount of personal assistance.
41. Assign peer buddies/tutors. Select role models.
42. Change groupings in class e.g. small / larger group activities, paired activities.

SKILL LEVEL
43. Allow use of calculator, number line etc.
44. Student responds using assistive technology / computer software.
45. Simplify task directions – use step by step guide.
46. Break down skill / task.
47. Use of visual glossaries.
48. Provide support staff / peer to help student cope with each step of skill.
49. Modify or individualise task to match skill level.
50. Assess different skill e.g. ignore spelling and focus on communication of ideas.

HEALTH / SAFETY / SELF-CARE
51. Monitor / assist with use of communication device, personal amplification device, specialised equipment, medication, menstruation etc.
52. Liaise with team stakeholders on regular basis to increase participation, check on health/safety.
53. Monitor lunch time activities to support interaction, safety and direct teaching of skills.
54. Programme specific instruction on anger / depression management. Seek counsellor referral

CURRICULUM
55. Students work on similar outcomes but simpler concepts.
56. Students work on individualised outcomes while in class e.g. student focuses on listening, social skills, literacy.
57. Teach individualised skills in unit of work e.g. social skills, symbol reading.
58. Plan activities to target student need e.g. group work for communication.
59. Relate outcomes to functional skills.
60. Adjust curriculum to cater for programming required outside of classroom e.g. community access, supported work experience.
61. Consistently monitor data to support programming feedback.
62. Implement additional support plan such Behaviour Analysis, Sensory Integration Plan to compliment programming and IEP.


MCWBE5

Science teachers at the metropolitan school promoted science at the school by running some of their assessment tasks as “shows” in the playground at lunchtime and in the lead up to National Science Week. The Science Faculty also put on displays using students as demonstrators at the school’s annual open night for parents of prospective students. High performing students are also involved in putting on science shows for students in the local feeder primary schools. The school has a strong reputation in the community for science according to the HT science, which he supported by reference to EV student survey feedback (see Table 5.11 and related analysis) and results from a Y11 student survey conducted by the principal.

The HT’s priority for science is that students enjoy the subject. The way he says this is achieved is by giving a priority to practical activities both inside and outside the classroom and reducing assessment pressure. The science program takes students into the playground and local bush from Y7 to Y10. Activities include observations using data loggers and sample collection for further examination and analysis back in the lab. Also, science takes students away for day-long excursions at the end of the year to Taronga Zoo (Y7), Physics is Fun at Luna Park (Y8) and the Aquarium at Darling Harbour (Y9). Learning/assessment tasks include model making and investigations as well as traditional practical tests, research tasks, problem solving and communication tasks.

The sample programs provided to me were written into the Board’s programming template and included science knowledge and understanding outcomes and related content but none of the syllabus skill outcomes were explicitly referenced in the programs. Assessment tasks were identified by a title and some additional information about content and skill expectations was provided in the second column under the heading Teaching, Learning and Assessment to assist with developing criteria for assessment.

The faculty programs were used by teachers to plan teaching programs for their classes. The sample programs provided were both for five week topics, suggesting that there were eight topics for the year. Progress is reported separately for the top and bottom classes. The independent learners and other classes are separately assessed. Progress for all groups is reported in terms of grades. However, the grade referencing is not done using Course Performance Descriptors. Instead they are referenced to different criteria for the top class, the independent learners and mixed ability groups and the bottom class.

This was the case up to the end of Y8, after which classes are created based on achievement assessed by a common test and task at the end of Y8 and progress thereafter is reported in terms of a grade and place in the year.

Artifacts provided included rubrics with criteria for awarding marks. The criteria included references to science syllabus knowledge, understanding, skills and scientific literacy expectations. There was no evidence that the Board’s Course Performance Descriptors or Common Grade Scale were used to assign grades and there was no mention of processes used to ensure consistency of teacher judgment in the awarding of grades (for the three or four classes where this was relevant).

There was no mention in the interview of SOLO being used for assessment purposes and it was not evident in any of the artifacts provided. SOLO was not mentioned in the context of the ongoing faculty program review that began several years ago with the introduction of the new syllabus.

In relation to class tasks and the research project, there was no scope for student choice in what they would do or how it would be presented. It was not clear to me whether these tasks were used by all classes or only the middle group (excluding the top and bottom classes).

D. Classroom discourse and evidence of learning

PCWAE2


Underpinning the teaching at the provincial school is a coherent approach to improving general literacy (a strong school priority) and the scientific literacy skills of students.

The bottom Year 7 class receives extra attention from learning support teachers. Oral discussion is a core activity and learning activities are structured to allow students to respond in different ways according to their level of skill. Marking rubrics are related to syllabus outcomes and related content indicators which are shared with students and used to inform oral and written feedback. In this school, streamed classes are used to differentiate teaching and to challenge students at all stages to do better.

According to the HT, there is a strong emphasis on group work and students are supported to do this in productive ways through role differentiation and rotation of roles in practical work. The HT gave extended examples of what that differentiation looked like across classes. Classroom activities are differentiated to provide students of all skills and capabilities with a chance to succeed. Worksheets provide scaffolding that ranges from cloze passages to open ended tasks where explanations are expected. Students respond as they can and are assessed by their teachers accordingly.

Oral discussion is the initial go-to activity, but it can also be used for pre-testing and to engage students who have difficulty accessing and constructing written texts. The HT uses oral reading as a strategy to get students to engage with written text. She encourages students to stop and ask when they don’t understand what they are reading and she constantly probes to ensure understanding. Pauses are opportunities for discussion and sharing, but there are strict protocols observed in the process to ensure no one is humiliated. She argues that having graded classes helps in this because students in the class have similar issues and it is easier to manage when the differences in ability are not so marked.

MCWBE5


The HT here is not so hands on with junior classes and spends most of his teaching time in senior physics classes. He strongly encourages practical activity in junior classes and commented that two recent staff changes have been helpful in having that further implemented. Staff are given freedom to teach their classes as they see fit.

He explained that assessment evidence was being taken from a greater diversity of tasks now than in the past, including practical exams (stations are set up and students move from one to the other and record in a worksheet what they observe and find), communication and problem solving tasks. Communication tasks involve engaging students with videos on the school Intranet and getting them to provide both oral and written reports. He was particularly proud of model making tasks (a plant cell for Y7 and a toy car for Y8 that goes fastest or farthest and plans for a bungy-jumping “barbie doll” for Y9) because of the opportunities they provide for student engagement in the assessment process (see later section). Model making and related activities have been a feature for many years in the science faculty.

The student research projects (one in Y8 and the other in Y10) are mostly done individually and at home and they are highly scaffolded with a rubric provided by teachers that emphasizes aspects of scientific reports and method. Little information was provided in the interview about the follow-up or support provided to students whilst they were expected to be working on these tasks.

The extent to which support teachers were used to assist learning in the lowest class was not explained in interview. Descriptions of activities used both in and out of the classroom were reported and evidenced in the artifacts provided.

E. Feedback

PCWAE2

In addition to the extensive use of oral feedback during classwork, feedback on written work is provided to students in the form of ticks and crosses to indicate aspects of tasks addressed well or inadequately (or incorrectly). Other feedback is in terms of the Board’s Common Grade Scale, the language of which students are introduced to in class task and assessment rubrics. It is used to provide feedback to students for both teaching and assessment purposes. The intention is that students are very familiar with it and can use it to self-assess by the time they get to the senior years.

MCWBE5

Teachers are encouraged to provide students with a diversity of activities to support enjoyment and spontaneity in science. A great deal of professional judgement is exercised in assigning grades for reporting in the first few years of science at the school. There are effectively three separate reporting streams based on class placements from primary school assessment of student ability (see above). Content coverage and misconceptions encountered seem to be the basis for feedback to students rather than strict adherence to syllabus outcomes. A creativity/originality mark is also available for models that are made in class.

Student research tasks are for the most part undertaken independently by students working at home. Scaffolds set out expectations in relation to doing the activities in the tasks which are strongly aligned to the syllabus working/communicating scientifically outcomes. How teachers support students as they work on these tasks was not explained.

The HT expressed a concern that senior students did not do very well in the HSC extended response questions because they could not “write a paragraph”. He used the term “backwards mapping” to explain that students needed to be taught to write early on in science. He went on to explain how he was actively working now with his teachers to do more about this in Y7 science. He referred to two literacy programs (TEEEC and the Super Six) that informed the science faculty work in this area. This focus on assessment for learning and literacy appeared to be recent and a response to new school priorities.

He and his staff, as far as I could ascertain, had not engaged with the extended response tasks in the EV program, but did freely use the short response items.


F. Activating students as instructional resources for others

PCWAE2

The researcher approached this by asking a direct question about opportunities being provided for peer assessment. The HT said that it was not a formal practice in the early years of secondary school due to students’ natural reticence and lack of confidence related to low literacy abilities. One activity that was used by the HT was to engage students in joint construction on the whiteboard of notes summarising science work. Year 7 and 8 students are invited to write, say, their conclusion on the whiteboard and the class engages in teacher managed discussion to reach a consensus view on what should be recorded.

Students were encouraged to work in groups on practical tasks and support was provided to assist in this process. There was some peer feedback encouraged on student record keeping in their notebooks too. Students provided each other with a ticked checklist based on their assessment of each other’s notebooks (criteria were categorized as positive, such as neatness and completeness, and negative, including graffiti, torn pages and uncorrected spelling errors). At this point the HT spoke about the high absentee rates of students and the fact that some of that was due to suspension from school for inappropriate behaviour. For some students continuity in their school record was an issue that she said impacted over time on achievement. Student involvement in assessing bookwork for each other was an attempt to underscore the importance of having a continuous record of work to study from.

MCWBE5

The HT explained that peer feedback on oral presentations related to 3D models produced by students was encouraged and supported.

Some guidance was given in relation to criteria that should be used (evidence of same in artifacts provided). The teacher retained control over the mark awarded, but there was some discussion with peers about what that should be. His comment was that kids were, on the whole, pretty good at it once they had the criteria provided and they were consistent as well as fair with each other.

G. Activating students (and teachers) as learners

PCWAE2

The science programs in the early years here were very teacher driven. Students were given few opportunities to choose what they studied. They could choose from a range of industrial processes when it came to researching separating mixtures (a Y7 task) but they had to produce a poster. A biology topic task provided a list of three procedures that could be researched, but it had to be presented in the form of a written report (Y8). Student research projects were tightly constrained both in topic (seed germination for Y7) and expectations for presentation (scaffold for the written report).

In terms of teachers being activated as learners, the HT was full of praise for her staff (four full time teachers and one casual who was not science trained). She said of them that they were the “most cohesive collaborative staff [she had] ever worked with.” They had engaged willingly with the tasks involved in redoing programs for the new syllabus, took on VALID10 but found it wanting, were fully committed to getting the best from their students and engaged frequently in professional dialogue on teaching, student and assessment issues. When asked about what they thought “progression in learning science” meant, both the HT and soon to be relieving HT were able to give a good account, each using a different example. The HT elaborated using investigation skills and described how that might look for different “ability” students. Both demonstrated a good understanding of differentiation in relation to syllabus outcomes.

MCWBE5

The HT reported that in recent years there has been more willingness by staff to meet to discuss professional issues such as assessment. There had been time spent collaborating on the development of new programs as well. Sample programs from 2013 and 2016 were provided showing changes but it was not clear to me whether these were written by staff other than the HT. I was not provided with specific outcomes from any of these reported recent meetings. Artifacts provided included the following scaffold for Y7 students to self-assess (Figure 5.3). This too appeared to be a recent initiative (post 2014).

Student Self Evaluation
Rate each statement out of 10

This is my best work         10 9 8 7 6 5 4 3 2 1 0
I understood this task       10 9 8 7 6 5 4 3 2 1 0
All criteria have been met   10 9 8 7 6 5 4 3 2 1 0
I am proud of my work        10 9 8 7 6 5 4 3 2 1 0

Figure 5.3 Self-assessment rating scale

H. Comparative summative comments

Two things stood out in this comparison. The first was the strong focus on instruction aimed at improving the literacy skills of the students at the provincial school, which the science department strongly supported in their science programming and lesson delivery. There was apparently no such emphasis at the metropolitan school. The focus there was on engaging students with a diversity of science activities designed to engage and interest students. The goal at the provincial school was to prepare students for senior science options.

The second was the high stakes assessment policy that graded provincial students in science from the end of semester one in Year 7 and moved students at the end of every semester thereafter either up or down a class if performance warranted it. Semester tests for all classes played a role in that. However, the attention to differentiated curriculum delivery was most thoroughly demonstrated by the provincial school here compared to all the other case study schools. The metropolitan school also established two high achieving classes on the basis of Year 6 information about achievement (one class) and demonstrated capacity for independent learning (a second class). Both classes once established remained largely unchanged until the end of Year 8. Summative assessment was low key and evidence of learning was collected from a wider range of activities.

In relation to prediction one, PCWAE2’s achievement profile was more positively skewed to the top band achievers than MCWBE5’s (Table K.1 in Appendix J). The bias was most obvious for the extended response component of the EV results. This was evidence of the effectiveness of the strong focus on improving students’ literacy skills in those early years of secondary schooling. There was no insight provided during the interview about how science teachers responded to the class of independent learners at the metropolitan school.

However, when looking at engagement (Table K.5D in Appendix J), students reported very different levels of support for their experience of science at the school. The lower achieving (overall) metropolitan school’s top band students rated their experience (Items D and E on the student survey) 4th out of 16 (the number of case study schools plus the state figure counted as one school) and above the state figure, compared to the provincial school’s 14th, which was below the state figure. Taking all three achievement bands into account, at the end of Year 8, MCWBE5 students ranked 3rd and PCWAE2 students ranked 12th, which was the lowest of all the case study schools.

Engagement with science as measured by the proportions of students completing Year 12 science courses was stronger at the metropolitan school for the more demanding Chemistry and Physics courses (see Table K.4 in Appendix J). Both PCWAE2 and MCWBE5 (compared to the state) had more students completing Biology (both had 133%); in Chemistry, both schools had about the same proportions completing as in the state, but MCWBE5 had slightly more than PCWAE2 (100% versus 89%); in Physics the proportions relative to the state were slightly better for MCWBE5 (106% versus 81%). In the Senior Science course, more students at PCWAE2 completed the course than at MCWBE5 (288% versus 192%). However, when one looks at the large number of Senior Science course completions at the provincial school compared to the metropolitan school, either students at the provincial school had become more positive about science in the two years after Year 8 or they had no (or less attractive) options to choose from in Years 11 and 12. Science is optional after Year 10.
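The percentages "relative to the state" quoted above can be read as a ratio of two participation rates. The short sketch below is illustrative only. It assumes that each figure is the school's share of its Year 12 cohort completing a course divided by the corresponding state share, expressed as a percentage. That definition, the function name and the numbers in the usage note are hypothetical and are not taken from Table K.4.

```python
def relative_completion(school_completions, school_cohort,
                        state_completions, state_cohort):
    """Return a school's course completion rate as a percentage of the state rate.

    Assumed definition (not confirmed by the thesis):
        100 * (school_completions / school_cohort)
            / (state_completions / state_cohort)
    """
    if school_cohort <= 0 or state_cohort <= 0 or state_completions <= 0:
        raise ValueError("cohort sizes and state completions must be positive")
    school_rate = school_completions / school_cohort  # share of school cohort
    state_rate = state_completions / state_cohort     # share of state cohort
    return 100.0 * school_rate / state_rate


if __name__ == "__main__":
    # Hypothetical numbers: 25 of 100 students at a school complete a course,
    # against 12,500 of 100,000 students across the state.
    print(relative_completion(25, 100, 12_500, 100_000))
```

Under this reading, a value above 100% means a larger share of the school's cohort completed the course than in the state as a whole, and a value below 100% means a smaller share.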

Pair FOUR: Assessment narratives compared for MGFSAE2 and MGFSWBE1

A. Engagement with EV feedback, resources and SOLO

MCFSWAE1

The HT from the coeducational WAE selective school (MCFSWAE1) participated in order to support research such as was represented by this project and to provide feedback about the EV program, which was said to be a “high quality” program because its tasks and items “set high expectations” and “provide a basis for discriminating between responses from high ability students”. Science teachers at this school use items and tasks from the EV tests in their own assessment programs but do not use SOLO-based rubrics to assess responses. The school is not planning to take up VALID10. Student survey results are not looked at nor discussed with staff or students. The HT thought that the test provided quality feedback to teachers and “liked” that it was mandatory.

TheHTsaidthatstudentsenjoyeddoingthetestonlineandtookitasseriouslyas

theydidNAPLAN.Someevenusedtheirowndevicestodothetest.Nospecial

preparationforthetestisundertakenapartfromregistrationandworking

throughthesampleitems.Therehasbeennofeedbackfromparentsaboutthetest

orresults(whengiventoparents)anditreceivesnoattentioninannualschool

reports.Theprincipaltakesaninterestintheresults.

MGFSAE2

The HT at the girls AE selective school (MGFSAE2) participated to find out more about SOLO. The HT reported that the school’s science assessment program involved a “SOLO based approach to assessment” by providing stimulus material with test items. SOLO-based rubrics are increasingly being used to mark responses to tasks and to inform feedback to students. It was reported that most of the staff at the school support SOLO as a basis for their own professional learning and for its usefulness in assessing students’ work.

EV feedback is discussed at the time it is provided to the school. The HT does an analysis of achievement to identify strengths and weaknesses overall and between classes and this analysis is discussed with staff. The HT reported that EV results inform teachers’ ongoing development of teaching programs and teacher assessment of student work.

The girls enjoy doing the test online and take it seriously. Some use their own devices to do the test. Staff will continue with VALID10 and are interested in the feedback on student growth from Y8 to Y10, particularly in relation to middle band students (only one or two students were assessed as low band).

The HT describes the SOLO rubric as being about rewarding student responses that show appropriate “connections” between science concepts. Differences between SOLO marking and Board marking were described to me but were not seen as problematic. The HT had completed the proforma and acknowledged that she found the responses to the student survey confronting but useful. The concern was that student attitude responses were below state figures, but no immediate thoughts about how to improve attitudes were offered. The HT nominated items F & E (from the student survey) as the most useful feedback from the perspective of science faculty priorities… that students learn their science and enjoy it.

MGFSWBE1

The head teacher from the girls WBE selective school (MGFSWBE1) participated to provide a professional learning session for science teachers about assessment. My project aligned with the focus at the school and in science on assessment for learning. An external consultant had been employed to improve their understanding of “differentiating assessment” and how to obtain and better use assessment data to improve teaching and learning. Science teachers use EV stimulus and related items and extended response tasks in their assessment program, but rubrics that reflect syllabus intentions rather than SOLO thinking levels are used to assess student achievement. The staff view of SOLO was that the test items and tasks bring context and skills together so that responses can be assessed to reveal different levels of thinking using science content knowledge. Staff were critical of their current tests that focused very much on the acquisition of knowledge and understanding and they acknowledged that they did not sufficiently discriminate between levels of achievement.

Some science teachers reported that students did not take the EV test seriously because results are not counted in assessment. By contrast NAPLAN is taken seriously. Teachers reported that students were “stressed” because they were not sure the school’s computers would work and that EV test questions were different to those in other science tests done at the school. The school plans to continue with the VALID10 program and sees value in continuing its learning about SOLO.

B. Grouping for instruction

In the WAE school, students with the weakest literacy results are allocated to one class. In the AE school, students are put into classes on the basis of their choice of foreign language to be studied and in the WBE school, they try to spread students from feeder OC schools across the five classes formed. This is done to provide all students with the opportunity to broaden their friendship base. Thus Y7 classes in all three schools are effectively mixed ability classes from the perspective of science.

Essentially, all three schools retain the same classes from Years 7 to 10. An exception to this general approach is found in the WAE school where students with exceptional results are invited to join a gifted and talented class which is established from Y8. Acceptance into the class is conditional on the students agreeing to do chemistry in the senior years. The class is accelerated but no details were provided as to what that meant.

C. Use of learning intentions and success criteria

MCFSWAE1

The HT at the coeducational WAE school reported that the school placed a high priority on literacy. Extra assistance is given to the one class where students with weaker literacy skills were placed. The HT has expertise in literacy and an emphasis on literacy skills is evidenced in the artifacts provided to me. The HT reported a high science faculty priority for teaching scientific literacy skills valued in the world beyond school, for teaching critical thinking rather than rote learning and for greater student engagement with science at school and beyond.

Two topics, one each from Y7 and Y8, from the school program provided to me demonstrated the priority for skill development. The program organized content into five columns: the first described content (science contexts and content to use and learn), the second skills (what students were to do with that content), the third contained references to pages in a science textbook, and a fourth included references to faculty and other resources relevant to the activities. A final column, listed in syllabus outcomes shorthand (e.g. SC4-CW-2e/WS6.3-6.4AB8), provides the link between school activities and syllabus intentions. The Y7 program topics in 2012 numbered 16. From 2015, the year after the period of interest, this was reduced to 13.

Learning/assessment tasks are accompanied by marking rubrics showing in great detail how marks are to be allocated. One Y7 literacy assessment provided to me targeted the writing of scientific explanations. The marking criteria for the five related tasks in that assignment allocate marks for completion of aspects of the task as well as for more sophisticated demonstrations of those aspects. Figure 5.4 shows part of the rubric for that assignment. The success criteria appear to be derived mostly from syllabus intentions but they also include literacy criteria.

Task 4: Re-writes two paragraphs in own words and uses the scaffold for structuring each explanation

Identifies the phenomenon being addressed in the first paragraph /1

Employs the explanation sequence, and, as appropriate: “action verbs, technical words, time connectives, cause-and-effect connectives” in order to explain the phenomenon /2

Identifies the phenomenon being addressed in the second paragraph /1

Employs the explanation sequence, and, as appropriate: “action verbs, technical words, time connectives, cause-and-effect connectives” in order to explain the phenomenon /2

Total /6

Task 5: Identifies the following language features of the text for one of the explanations and uses the correct symbol in so doing

action verbs /1

technical language or terms /1

time connectives (when) /1

cause-and-effect connectives (As a result) /1

Total /4

Figure 5.4 Sample marking criteria for scientific explanations (part only)
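The mark allocations in Figure 5.4 can be checked against the task totals with a short sketch. The criterion labels below are abbreviated paraphrases of the rubric wording, and the `score` helper is a hypothetical illustration rather than anything the school is reported to use.

```python
# Maximum marks per criterion, transcribed from Figure 5.4 (part of the rubric only).
# Criterion labels are abbreviated paraphrases, not the rubric's exact wording.
RUBRIC = {
    "Task 4": {
        "identifies phenomenon, first paragraph": 1,
        "explanation sequence and language, first paragraph": 2,
        "identifies phenomenon, second paragraph": 1,
        "explanation sequence and language, second paragraph": 2,
    },
    "Task 5": {
        "action verbs": 1,
        "technical language or terms": 1,
        "time connectives": 1,
        "cause-and-effect connectives": 1,
    },
}


def task_total(task):
    """Maximum marks available for a task."""
    return sum(RUBRIC[task].values())


def score(task, awarded):
    """Sum awarded marks; criteria not listed in `awarded` count as zero."""
    total = 0
    for criterion, maximum in RUBRIC[task].items():
        mark = awarded.get(criterion, 0)
        if not 0 <= mark <= maximum:
            raise ValueError(f"{criterion}: mark {mark} outside 0..{maximum}")
        total += mark
    return total


print(task_total("Task 4"), task_total("Task 5"))  # 6 4
```

The two-mark criteria in Task 4 are where the rubric distinguishes completion from a more sophisticated demonstration, which is why they carry more weight than the identification criteria.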

MGFSAE2

The HT at the AE girls selective school said that the faculty priorities for junior secondary science were to prepare girls for a career in science, to ensure they were scientifically literate, able to creatively problem solve and to enjoy planning and conducting scientific activities. Learning programs for science in Years 7-10 at this school were organized into four 10-week topics. Each topic comprised activities to be completed by students. The activities combined syllabus content and syllabus defined skills. The overall assessment plan showed that by the end of the year students’ overall grade would reflect the acquisition of both skills and knowledge and understandings.

The activities for a Y9 topic titled The Complex Human were organized into “booklets” and related scaffolds directed students to work in groups and to individually record specified outputs from those activities. This appeared to be a model for teaching and learning science that had been in place for some years. The scope of the activities I reviewed in the artifacts provided was consistent with syllabus expectations for knowledge and understanding and skills for Stage 5 students (topics in most non-selective schools visited targeted Y8 or Stage 4 content), but some of the activities went beyond that.

Outputs to be provided included the construction of tables, graphs, procedures, risk assessments, descriptions, explanations, generalisations, conclusions and justifications. Assessment rubrics to be used by teachers to score the tasks described the features of outputs to be rewarded with marks. The features described for reward were both indicators of breadth of coverage and depth of understanding/level of skill demonstrated.

Also a SOLO based scaffold was provided. The scaffold was being trialed with school intranet science quizzes. It was based on SOLO level descriptors for comparison with student outputs to selected activities themselves based on content in the school’s Y8 program.

The science faculty assessment policy document (provided with the artifacts) described the procedures to be followed when marks were transformed into grades for the purposes of reporting to parents. These appeared to be consistent with BOS Common Grade Scale requirements.

A sample Stage 4 activity titled Energy transformations included a marking rubric that collated marks for two components of the syllabus working scientifically strand (planning and conducting an investigation and processing and analyzing data and information).

The Y9 student research project booklet provided included the steps to be followed in the development of a proposal for research, including opportunities for feedback from teachers, Turnitin software and student peers (see Figure 6.3) and assessors from the scientific community at a school based event. Students were encouraged to submit their project to the STANSW Young Scientists Competition as well.

MGFSWBE1

The HT for the WBE selective girls school explained that one of the school priorities for the year was assessment for learning and that an external consultant had been employed to provide professional learning to teachers. Science teachers had attended workshops provided by the consultant.

The science faculty priorities included working on their assessment tasks to improve their quality, moving the girls from rote learning and memorizing to thinking, improving their scientific literacy, building their understanding of the role of science in society and encouraging greater levels of enthusiasm for science.

The science learning program provides for four topics per year. No sample programs were provided. The assessment schemes for each of Years 7-10 were provided. There are four formal assessment tasks per year. Each task provides for a final equally weighted assessment of knowledge and understanding and working scientifically outcomes expressed as a grade. The syllabus outcomes targeted by the task are provided in full as part of the task notification. The rubric for assigning marks was included with the tasks and the links between marks and grades were also provided (from Year 8 onwards). The BOS Common Grade Scale appeared to be the basis for the award of grades.

Y7 tasks included a formal test, a task involving developing a game (a cross curriculum project… see next paragraph), a multi-media presentation and a “VALID Style test”. In Y8 the tasks included a practical test, a mid-course test, a student research project and a “Yearly Exam VALID style”.

The Y7 game task provides opportunities for students to demonstrate outcomes from the Art, PDH&PE and Science syllabuses. The multimedia task involves students in peer assessment of group work (see later section on feedback).

A Y8 “VALID style test” was provided in the set of artifacts. The short items in the test were similar in format to those used by the BOS in its external tests (both past and current ones). EV tests typically provide a stimulus text and a related set of 3-5 items about that text (Appendix 1.X includes an EV test booklet). Extended response tasks from previous EV tests were appropriated into their tests also, but the response scaffolds were modified to conform with BOS test formats. There was no evidence provided that SOLO concepts were used to mark responses.

Students also sit the ICAS science tests. Results are not used by the school for diagnostic purposes. Certificates are presented to students as an affirmation of their high ranking in the state for achievement of the five sets of scientific skills assessed by the test. The HT had declined an offer from the EAA team to show science teachers at the school how to use the tests to track school and individual progress using the results.

D. Classroom discourse and evidence of learning

MCFSWAE1

The assessment narrative for the WAE selective school reveals that students in their first two years of science are provided with learning activities based heavily on textbook, classroom worksheets and conventional school laboratory activities. The programs provided describe activities that combine both scientific skills and content, including opportunities for students to plan and design the laboratory activities. There is some evidence in tasks and tests provided that science rich contexts in line with syllabus expectations are provided as a stimulus for teaching and assessment activities (consistent with the EV assessment model of providing stimulus material and a group of related items the responses to which are dependent on comprehension of the text in the stimulus material).

Excursions are rare and science visitors to the school are on an infrequent “ad hoc” basis. Science teachers do not appear to make much use of resources beyond the classroom to enrich their teaching (such as the school grounds or local creeks and reserves) or engage their students in science investigations sponsored by external agencies such as BHP, Rio Tinto or the Young Scientist Competition (run by the Science Teachers Association of NSW). ICT use by science teachers is not a strong component in the teaching of science at the school according to the HT. National Science Week is not exploited for its celebration of science. ICAS tests of science thinking processes are mandatory but no attempt is made to use the results other than to affirm the high competence of the students.

The HT said that group work is not actively taught in the early years of science education at this school. The first major research project, which the syllabus described as an opportunity for group work, is an individual project (using plants) for Y8 students at this school. No artifacts relating to the Student Research Project (SRP) were provided.

Literacy based tasks, formal tests, written assignments, research projects and practical tasks appear to be the most valued sources of evidence for science learning. Practical tests are introduced no earlier than Y9 as a result of students in Y7 & 8 being stressed by the novelty and complexity of these assessments when they were introduced there a number of years ago.

Students at this school in Years 7 and 8 are frequently asked and given support to write scientific explanations. Also, they are challenged to use those skills in tasks well beyond their everyday experience. An example provided is a Y9 task (first introduced in 2013?) where students are asked, as a scientist, to prepare resources including a 3D model that could be used in a three minute TED presentation to evaluate strategies being used to reduce ozone depletion. Actually using the resources in a presentation was not required.

MGFSAE2

The artifacts provided by the HT at the AE girls’ school reveal that teaching at the school provides many structured opportunities for the girls to work cooperatively on a wide variety of tasks. The girls are encouraged to discuss the results of these activities, according to the HT, with both their teacher and peers before recording what they have learned. The structure of the set of activities provides a pathway that the girls can follow at their own pace rather than one set by the teacher, who is freed up from presenting to the class to being able to work with small groups or one on one with students needing support. Thus evidence of learning is provided to teachers in the course of informal oral discussions and formally via the texts produced in response to prompts provided by the activity scaffold.

MGFSWBE1

The HT at the WBE girls school only provided examples of the assessment tasks used at the school. These artifacts combined with answers to questions provided by both the HT and teachers at the school revealed a willingness to work outside the science faculty with other faculties and to provide excursions to science rich environments including the zoo in Y7, a shoreline environment in Y9 and the Powerhouse Museum in Year 8.

There was some mention of a strong commitment to project based learning in previous years which is now confined to the SRP in Y8 and a cross curriculum project in Y7 (mentioned above) to produce a game that addresses outcomes from Art, PDH&PE as well as science.

Teacher willingness to work beyond the science classroom provides opportunities for devising authentic tasks through which to both teach science and to assess what was learned according to the HT science at the WBE girls school. Teacher talk at the interview about the tasks used and the various forms of evidence of that learning including student presentations, models (a game with science content) as well as more conventional tests, assignments and student research project reports describes the breadth of activities used to teach and assess evidence of learning at the school. In the junior secondary years, there does not appear to be a strong focus on improving student writing skills in the context of science beyond addressing the conventional sections of a traditional school scientific report which is the common assessment task for Term 3 in Y8. There appears to be little evidence of teaching to help students develop the expressive language skills using scientific vocabulary prior to that. The report was constructed by students working within a highly structured, teacher-provided scaffold.

Whilst the student research project involves group work, no persuasive evidence of formal teaching in the skills of group work was presented. Two rubrics for assessing the task and related report are provided: one to teachers and a second to students, which they use to self-assess. Both rubrics appear to be modeled on the BOS Common Grade Scale. Only three grades are possible (A, B or C), which appears to be based on the historic evidence from Year 10 external science testing that ended in 2011 and perhaps (but not stated anywhere) the pattern of levels awarded in the EV results package. It was not clear how or whether students were coached in the use of their self-assessment rubric.

E. Feedback

MCFSWAE1

The HT at the coed school stated that science teachers at the school provide considerable informal feedback to students in the normal course of day to day teaching. They also provide students with formal feedback on performance in tests and tasks on a “look and listen” basis (my characterisation). Students are provided with the test/task and their individually teacher marked feedback sheets. A whole class presentation is made by the teacher to the class about the overall strengths and weaknesses in responses. Students are then expected to reflect on their individual feedback in their own time.

Teachers record marks awarded for tests/tasks and they are converted to grades for the purpose of reporting to parents twice a year. No insights about how the conversion was done were provided and no evidence was provided that the Board’s Common Grade Scale (or SOLO levels for that matter) was the basis for that conversion either. No evidence was provided that students are given access to syllabus outcomes or the BOS Common Grade Scale in the early years of secondary school. Reporting up to the end of Year 9 was in terms of grades only. Place in the year is provided in Year 10 as well as grades.

MGFSAE2

The activity “booklets” used by the AE school provide more time for the teacher to work with students to provide feedback on individual issues as they arise. The growing use of SOLO provides another dimension to the type of feedback a teacher is able to provide as well.

MGFSWBE1

The HT science at the WBE girls school provided the rubrics used to convey feedback to students about their learning in the four formal assessment tasks. That feedback took the form of marks assigned according to what appeared to be syllabus based criteria. According to the HT science, consistency of marking was ensured by discussion of rubrics and sample responses at meetings of relevant teachers convened by the science coordinators for each Year group. Marks were subsequently converted to grades for the purpose of reporting using a version of the BOS Common Grade Scale model.

F. Activating students as instructional resources for others

MCFSWAE1

The HT science at the WAE school acknowledged that making the most of group work and the range of strategies associated with it (such as think-pair-share and report activities) was not a high priority amongst science teachers at the school. Nor was any evidence provided about opportunities students have to provide feedback to peers about their performance or achievement.

MGFSAE2

Discussion with peers in the context of group work is strongly supported by teachers at the AE school according to the HT science there. Informal discussion provides opportunities for joint construction with peers of responses, individually recorded, to the diversity of required outputs presented to students in the activity “booklets”. There is a formal opportunity at the AE school in Y9 for students to provide feedback to peers about the quality of their reports and related explanations in terms of specific success criteria (Figure 5.5).

Figure 5.5 Peer assessment scaffold for a Y9 task at the AE school

MGFSWBE1

At the WBE girls school, the one formal opportunity to provide feedback to peers using a skill based generic scaffold (Figure 5.6) was mediated by the class teacher who would only pass it on if he/she approved the contents (the person making the assessment was anonymous). It was not clear whether more than one teacher (who responded to that question very convincingly at the interview) used group work to provide opportunities for students to act as instructional resources for their peers.

The topic for the presentation was animal classification and an excursion to the zoo was involved. Information collected there was expected to be used back at school to prepare and deliver a multimedia report to the class.

The extent to which this was/is used, and for how many years it may have been used, was not clear.

The structure of the student presentation was organized using a teacher provided scaffold that doubled as a rubric (for the teacher to use) to assign marks for aspects of the group’s preparation and presentation.

Figure 5.6 Peer assessment rubric for Y7 multi-media presentation task

G. Activating students (and teachers) as learners

MCFSWAE1

Many opportunities are provided to students in the coed school in the junior secondary years to develop good learning behaviours in the context of laboratory based activity worksheets that support the development of skills in comprehension, analysis, evaluation and justification of choices (using expressive, oral and written language).

In relation to teacher modeling of good learning behaviours, the HT reported that the science faculty met every two weeks and that the agenda often involved shared professional work such as development of program resources, assessment tasks, marking rubrics and joint marking of student work. There was no mention in the interviews about how teachers worked with their classes to help students achieve control over their learning apart from their work with language skills (explanations).

MGFSAE2

The use of a peer assessment scaffold by students at the AE girls school as described in the previous section provides teachers with a means for promoting good learning behaviours in all students. The skill of assessing your own work against criteria is an important step to self-regulated learning or learning how to learn. If you can recognise gaps or weaknesses in your own work, then you can devise strategies to address them. The time taken to explain how to do that self-assessment is an example of teachers modeling good learning behaviours.

The embrace by science teachers at the AE girls school of SOLO and their preparedness to work with it to improve their own professional competence and the learning outcomes for girls at their school was also evident.

MGFSWBE1

One concrete example of support for self-assessment was provided by the HT at the WBE school. In the context of their student research project (Task 3, Term 3 of Year 8) students are provided with the rubric teachers would use to assess the plan for their investigation and encouraged to use it for themselves prior to submitting their proposal. It was not clear to me whether this had been used earlier than last year once the new programs had been put in place.

According to the HT, science teachers at the WBE school meet regularly. Some of the meetings are devoted to collaborative work on programming but more recently on developing better assessment tasks to improve the quality of information about student learning, including how to better discriminate between levels of achievement. No evidence of support for and use by teachers of the SOLO model was mentioned in the interview apart from the appropriation of extended response tasks for use in their own tests modeled on Board formats.

H. Comparative summative comments

Comparisons between these three schools are fraught because of their differences in SEA scores and by the fact that one is a coeducational school, the other two girls schools. On the assumption that gender differences are not statistically significant, at least in the first few years of secondary school (PISA and TIMSS results for Australia support that conclusion), it is very obvious that at the end of Year 8, the comprehensive school is doing better in terms of achievement and engagement than either of the two girls schools. The focus on writing at the WAE and AE schools shows up in the lower extended response score for the WBE school, despite them doing best in the communicating scientifically category of results (see Table K.1 in Appendix J).

In terms of engagement at the end of Year 8, the WAE fully selective entry school ranked 8th, above the other two fully selective entry schools, in terms of students’ enjoyment of their school science experience (see Table K.5D in Appendix J). Students at the AE school did not enjoy their science experience, coming in at the bottom of the rankings (16th) by top students on Items D and E combined. The WBE school did better at 13th.

Pair FIVE: Assessment narratives compared for PCWAE2 and PCWAE3

The narrative for PCWAE2 was presented above in the context of pair TWO. Thus only the information for PCWAE3 will be provided here.

A. Engagement with EV feedback, resources and SOLO

Only the head teacher attended the interview. Participation was on the basis of wanting to know how they were doing in terms of assessment practices, which he did not think were any different to other schools. The school had not done VALID10 in 2015 and had no plans to do so going forward. The reason given for that was consistent with a policy decision to keep formal assessments to a minimum and manage them in a low key way because of their perceived negative impact on students’ motivation to learn. There was a large Indigenous population at the school (the largest of the three WAE provincial schools compared here) and the SEA score was very low. There was no special preparation for the EV test, which he reported students enjoyed doing. No parent had asked about the report on results when it was sent home. The proforma had been completed for the interview and assessment-related artifacts were available as well.

B. Grouping for instruction

The school establishes 4-5 classes in Year 7 each year depending on numbers from feeder primary schools. Classes are streamed on the basis of feeder school achievement and other data. Whilst not graded from a science perspective, the top Year 7 class receives a more “challenging” program in science than is provided for the other classes. The bottom class receives additional support from learning support teachers who work with the science teachers in the class. These classes are largely retained going into Year 8 with some changes based on end of year test results and “behaviour” issues.

C. Use of learning intentions and success criteria

Learning programs are based on syllabus learning intentions (outcomes and related content) and traditional content organisers (Introduction to Laboratory / Forces / Solids, Liquids and Gases / Earth, Sun and Moon / Skills - Preparation for the SRP / Cells and Classification and Working with Nature are the topic headings for Year 7). The 2nd, 3rd, 5th and 6th topics each have 7 weeks allocated to them. The last topic includes a focus on “patterns in nature… respiration and photosynthesis… ecology… plant systems and structures and human (fire) and natural disasters… scientific and indigenous knowledge to extract resources from the environment.” Many of the activities associated with the topics are literacy focused (correct use of appropriate vocabulary… adaptation not adaption) and separate pages list spelling and other literacy resources for each topic. Each topic has specific assessment tasks and there is a common assessment task each term (four in a year). There don’t appear to be any formal exams or tests. The priority is for student engagement and enjoyment. Students are provided with a diversity of activities using a wide range of resources from within the school including Agriculture, which science manages. Students visit a local science fair each year. Relevance is important (e.g. diabetes in the context of work on disease). The students do a major research project each year which is done mostly in class time. Textbooks and worksheets are important components of classroom work. Students use school ICT, but it is not a large part of their work.

D. Classroom discourse and evidence of learning

Class discourse focuses explicitly on science language use including oral (first) and then written work. Research projects are scaffolded to help students learn the components of a scientific report; the scaffolding is progressively reduced from Year 7 to Year 10. Written responses to common assessment tasks are an important component of the assessment decisions and subsequent reporting to parents.

E. Feedback

This is largely provided by the teacher in the context of whole class discussion (oral) and to individuals and small groups during practical work in the lab. Students are provided with feedback sheets from common tasks and advice as to how they went in terms of grades based on the Board’s common grade descriptors.

F. Activating students as instructional resources for others

Peer assessment was not a priority for Years 7 and 8. There was some use of the think-pair-share-report strategy, but it was not widespread (according to the HT). Group work was encouraged, but no evidence of teaching students the skills of working in groups was provided.

G. Activating students (and teachers) as learners

The focus was on teacher managed learning; students were not given opportunities to generate learning expectations or success criteria, but in feedback on tests, some teachers explained how feedback could be used to improve learning. The teachers at the school modelled good learning behaviours in class and with each other in meetings to discuss and develop assessment criteria which were then used individually to assess their own students. In commentary on the proforma, the HT saw the connection between liking science and better results.

H. Comparative summative comments

PCWAE2 and PCWAE3 had much in common. They had relatively large numbers (compared to PCWAE1) of indigenous students. The two schools set up graded classes, but both produced evidence of differentiated teaching in response to student skills. A major difference between the two schools was the approach taken to summative assessment. Like MCWBE5 (the school compared to PCWAE2 above), PCWAE3 had a low key approach to summative assessment.

At the end of Year 8, prediction one was satisfied in terms of both achievement and engagement (the extended response differential indicated that PCWAE2 was the more successful in terms of teaching writing skills).

Prediction two was about the extrapolation of results from Year 8 to Year 10. The evidence of results was not directly comparable, but the indication here was that, despite PCWAE3 having a higher proportion of its students absent on any day from Years 7 to 10 (see analysis in Chapter 5) than was so at PCWAE2, their Y10 result pattern was biased slightly more to the higher grades than PCWAE3’s results were (see Table K.3 in Appendix J).

In terms of prediction three, PCWAE2 had proportionately more of its students completing science courses at the end of Year 12 when compared to PCWAE3 (relative to the state numbers). The most marked difference was in Biology (133% versus 67%). Again, this finding needs to be qualified by unknowns about school resources and student demand for senior science courses.

Appendix I: Data tables for paired school comparisons

Table K.1 Achievement results for comparable school pairs (Year 8 EV reporting categories)

School / AB, followed by EV % (sch sta), ERT % (sch sta), WSCI % (sch sta), CSCI % (sch sta)

MCWAE1 (x̅ = 1.85 ± 0.48; SEAS = 2.8 ± 0.46)
5-6 7 18.6 12 20.3 9 19.4 8 22.4
3-4 66 67.9 57 63.4 56 63.3 56 60.3
1-2 27 13.5 32 16.3 35 17.3 36 17.3

MCAE2 (x̅ = 0.03 ± 0.42; SEAS = 3.9 ± 0.30)
5-6 16 18.6 18 20.3 17 19.4 25 22.4
3-4 77 67.9 72 63.4 72 63.3 62 60.3
1-2 7 13.5 10 16.3 11 17.3 13 17.3

MCWBE3 (x̅ = -1.69 ± 0.13; SEAS = 4.0 ± 0.25)
5-6 12 18.6 17 20.3 13 19.4 21 22.4
3-4 76 67.9 68 63.4 70 63.3 61 60.3
1-2 12 13.5 15 16.3 17 17.3 18 17.3

PCWAE2 (x̅ = 1.69 ± 0.21; SEAS = 1.8 ± 0.45)
5-6 12 18.6 18 20.3 16 19.4 14 22.4
3-4 76 67.9 66 63.4 69 63.3 71 60.3
1-2 12 13.5 16 16.3 15 17.3 15 17.3

MCWBE5 (x̅ = -1.48 ± 0.28; SEAS = 2.1 ± 0.11)
5-6 13 18.6 12 20.3 17 19.4 16 22.4
3-4 69 67.9 66 63.4 61 63.3 66 60.3
1-2 18 13.5 22 16.3 22 17.3 19 17.3

MCFSWAE1 (x̅ = 1.19 ± 0.29; SEAS = 8.6 ± 0.16)
5-6 95 18.6 85 20.3 80 19.4 87 22.4
3-4 5 67.9 15 63.4 20 63.3 13 60.3

MGFSAE2 (x̅ = -0.09 ± 0.44; SEAS = 8.3 ± 0.16)
5-6 95 18.6 85 20.3 76 19.4 89 22.4
3-4 5 67.9 15 63.4 24 63.3 11 60.3

MGFSWBE1 (x̅ = -1.42 ± 0.02; SEAS = 8.9 ± 0.14)
5-6 94 18.6 70 20.3 78 19.4 93 22.4
3-4 6 67.9 30 63.4 22 63.3 7 60.3

PCWAE1 (x̅ = 2.68 ± 0.38; SEAS = 2.7 ± 0.22)
5-6 29 18.6 32 20.3 45 19.4 38 22.4
3-4 69 67.9 62 63.4 51 63.3 58 60.3
1-2 2 13.5 6 16.3 4 17.3 4 17.3

PCWAE2: see fourth data set above

PCWAE3 (x̅ = 1.43 ± 0.25; SEAS = 2.0 ± 0.27)
5-6 12 18.6 15 20.3 15 19.4 14 22.4
3-4 75 67.9 66 63.4 68 63.3 66 60.3
1-2 13 13.5 19 16.3 17 17.3 20 17.3

Note. SEAS = socio-educational advantage score / x̅ = mean school residual / AB = achievement band / EV % = proportions of students at each level of EV score (sch = school & sta = state) / ERT % = proportions for extended response tasks / WSCI = proportions for working scientifically / CSCI = proportions for communicating scientifically


Table K.2 Engagement measures at the end of Year 8

| School | AB | Item A/4 sch | Item A/4 sta | Item B/4 sch | Item B/4 sta | Item C/4 sch | Item C/4 sta | Item D/4 sch | Item D/4 sta | Item E/% sch | Item E/% sta | Item F/% sch | Item F/% sta |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MCWAE1 | 5-6 | 3.21 | 2.78 | 1.43 | 1.56 | 3.69 | 2.76 | 3.21 | 2.83 | 26.81 | 13.50 | 34.64 | 25.13 |
| | 3-4 | 2.49 | 1.76 | 2.33 | 1.69 | 2.81 | 2.35 | 2.63 | 2.23 | 12.53 | 6.65 | 22.88 | 16.50 |
| | 1-2 | 2.00 | 1.37 | 2.59 | 2.03 | 2.44 | 2.01 | 2.47 | 1.91 | 6.88 | 4.58 | 12.81 | 9.71 |
| MCAE2 | 5-6 | 2.33 | 2.78 | 1.33 | 1.56 | 2.75 | 2.76 | 3.00 | 2.83 | 12.07 | 13.50 | 22.14 | 25.13 |
| | 3-4 | 1.61 | 1.76 | 1.80 | 1.69 | 2.27 | 2.35 | 2.26 | 2.23 | 5.12 | 6.65 | 14.27 | 16.50 |
| | 1-2 | 1.10 | 1.37 | 2.12 | 2.03 | 1.39 | 2.01 | 1.22 | 1.91 | 0.75 | 4.58 | 7.10 | 9.71 |
| MCWBE3 | 5-6 | 2.87 | 2.78 | 1.53 | 1.56 | 2.82 | 2.76 | 2.57 | 2.83 | 11.99 | 13.50 | 19.57 | 25.13 |
| | 3-4 | 1.41 | 1.76 | 1.90 | 1.69 | 2.16 | 2.35 | 1.75 | 2.23 | 3.70 | 6.65 | 10.03 | 16.50 |
| | 1-2 | 1.09 | 1.37 | 1.90 | 2.03 | 1.87 | 2.01 | 1.30 | 1.91 | 0.63 | 4.58 | 4.47 | 9.71 |
| PCWAE2 | 5-6 | 2.75 | 2.78 | 1.48 | 1.56 | 2.85 | 2.76 | 2.74 | 2.83 | 9.26 | 13.50 | 24.90 | 25.13 |
| | 3-4 | 1.69 | 1.76 | 2.06 | 1.69 | 1.84 | 2.35 | 2.08 | 2.23 | 2.98 | 6.65 | 16.17 | 16.50 |
| | 1-2 | 1.44 | 1.37 | 1.86 | 2.03 | 2.04 | 2.01 | 1.93 | 1.91 | 2.92 | 4.58 | 10.02 | 9.71 |
| MCWBE5 | 5-6 | 2.90 | 2.78 | 1.58 | 1.56 | 3.12 | 2.76 | 3.00 | 2.83 | 19.60 | 13.50 | 25.35 | 25.13 |
| | 3-4 | 1.97 | 1.76 | 1.74 | 1.69 | 2.51 | 2.35 | 2.39 | 2.23 | 9.18 | 6.65 | 19.24 | 16.50 |
| | 1-2 | 1.32 | 1.37 | 2.02 | 2.03 | 2.26 | 2.01 | 2.03 | 1.91 | 6.06 | 4.58 | 10.17 | 9.71 |
| MCFSWAE1* | 5-6 | 3.22 | 2.78 | 1.95 | 1.56 | 2.92 | 2.76 | 2.81 | 2.83 | 13.71 | 13.50 | 28.51 | 25.13 |
| | 3-4 | 2.28 | 1.76 | 2.61 | 1.69 | 2.59 | 2.35 | 2.38 | 2.23 | 2.14 | 6.65 | 24.44 | 16.50 |
| MGFSAE2* | 5-6 | 2.73 | 2.78 | 1.65 | 1.56 | 2.53 | 2.76 | 2.32 | 2.83 | 6.58 | 13.50 | 20.95 | 25.13 |
| | 3-4 | ns | 1.76 | ns | 1.69 | ns | 2.35 | ns | 2.23 | ns | 6.65 | ns | 16.50 |
| MGFSWBE1* | 5-6 | 3.01 | 2.78 | 1.72 | 1.56 | 2.81 | 2.76 | 2.80 | 2.83 | 9.81 | 13.50 | 25.40 | 25.13 |
| | 3-4 | 2.66 | 1.76 | 2.02 | 1.69 | 2.65 | 2.35 | 2.68 | 2.23 | 6.44 | 6.65 | 18.94 | 16.50 |
| PCWAE1 | 5-6 | 2.55 | 2.78 | 1.50 | 1.56 | 2.65 | 2.76 | 2.46 | 2.83 | 12.44 | 13.50 | 19.89 | 25.13 |
| | 3-4 | 1.35 | 1.76 | 1.44 | 1.69 | 1.55 | 2.35 | 1.99 | 2.23 | 3.64 | 6.65 | 12.70 | 16.50 |
| | 1-2 | 1.75 | 1.37 | 2.00 | 2.03 | 1.75 | 2.01 | 1.25 | 1.91 | nil | 4.58 | 8.33 | 9.71 |
| PCWAE2 | SEE FOURTH DATA SET ABOVE | | | | | | | | | | | | |
| PCWAE3 | 5-6 | 2.60 | 2.78 | 1.35 | 1.56 | 1.25 | 2.76 | 2.19 | 2.83 | 8.61 | 13.50 | 20.19 | 25.13 |
| | 3-4 | 1.91 | 1.76 | 1.50 | 1.69 | 1.84 | 2.35 | 2.19 | 2.23 | 8.0 | 6.65 | 14.61 | 16.50 |
| | 1-2 | 1.09 | 1.37 | 2.10 | 2.03 | 1.4 | 2.01 | 1.67 | 1.91 | nil | 4.58 | 1.85 | 9.71 |

Note. Scores for Items A to D range from 0-4; Items E & F are the proportions (as a %) of students from that achievement band at that school (sch = school & sta = state). AB = achievement band / Item A = intend to study science in senior years / Item B = science is the hardest subject I learn / Item C = enjoyed primary school science / Item D = enjoy secondary science lessons / Item E = number choosing science (as one of three favourite subjects) / Item F = number choosing science (as one of the three subjects they learn most in)

* These three schools had no students in the bottom achievement band / ns = no results supplied


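Table K.3 Year 10 results

| Grade (%) | MCWAE1 (%) 2011 | MCWAE1 (%) 2014 | MCWAE1 (%) 2015 | MCWAE1 (%) MEAN | MCAE2¹ (%) 2012 | MCAE2¹ (%) 2013 | MCAE2¹ (%) 2014 | MCAE2¹ (%) 2015 | MCAE2¹ (%) MEAN |
|---|---|---|---|---|---|---|---|---|---|
| A (13) | 3 | 1 | 1 | 2 | 7 | 5 | 4 | 8 | 6 |
| B (25) | 10 | 6 | 9 | 8 | 15 | 22 | 18 | 31 | 22 |
| C (36) | 29 | 19 | 16 | 21 | 37 | 52 | 52 | 51 | 47 |
| D (19) | 32 | 38 | 22 | 31 | 37 | 18 | 26 | 7 | 22 |
| E (7) | 26 | 36 | 52 | 38 | 4 | 3 | nil | 3 | 3 |

¹ MCWBE3 did not provide any Year 10 results and MCAE2’s results were used here instead.

| Grade (%) | PCWAE2 (%) 2012 | PCWAE2 (%) 2013 | PCWAE2 (%) 2014 | PCWAE2 (%) 2015 | PCWAE2 (%) MEAN | MCWBE5 (%) 2012 | MCWBE5 (%) 2013 | MCWBE5 (%) 2014 | MCWBE5 (%) 2015 | MCWBE5 (%) MEAN |
|---|---|---|---|---|---|---|---|---|---|---|
| A (13) | 7 | 5 | 2 | 6 | 5 | 6 | 11 | 9 | 11 | 9 |
| B (25) | 19 | 11 | 9 | 10 | 12 | 17 | 21 | 19 | 23 | 20 |
| C (36) | 47 | 56 | 46 | 33 | 46 | 53 | 46 | 44 | 46 | 47 |
| D (19) | 15 | 25 | 29 | 44 | 28 | 21 | 11 | 21 | 18 | 18 |
| E (7) | 12 | 3 | 14 | 7 | 9 | 3 | 11 | 7 | 2 | 6 |

| School | Grade | 2012 | 2013 | 2014 | 2015 | MEAN |
|---|---|---|---|---|---|---|
| MCFSWAE1 | A (13) | 59 | 64 | 65 | 63 | 63 |
| | B (25) | 36 | 31 | 30 | 33 | 33 |
| | C (36) | 5 | 5 | 6 | 4 | 5 |
| | D (19) | nil | nil | nil | nil | 0 |
| | E (7) | nil | nil | nil | nil | 0 |
| MGFSAE2 | A (13) | 89 | 86 | 82 | 83 | 85 |
| | B (25) | 8 | 13 | 18 | 15 | 14 |
| | C (36) | 3 | 1 | nil | 2 | 2 |
| | D (19) | nil | nil | nil | nil | 0 |
| | E (7) | nil | nil | nil | nil | 0 |
| MGFSWBE1 | A (13) | 80 | 65 | 66 | 85 | 74 |
| | B (25) | 17 | 35 | 33 | 15 | 25 |
| | C (36) | 3 | nil | 1 | nil | 1 |
| | D (19) | nil | nil | nil | nil | 0 |
| | E (7) | nil | nil | nil | nil | 0 |
| PCWAE1 | A (13) | 13 | 15 | 18 | 11 | 14 |
| | B (25) | 19 | 21 | 5 | 29 | 19 |
| | C (36) | 45 | 39 | 46 | 46 | 44 |
| | D (19) | 23 | 24 | 27 | 14 | 22 |
| | E (7) | nil | nil | 4 | nil | 1 |
| PCWAE2 | A (13) | 7 | 5 | 2 | 6 | 5 |
| | B (25) | 19 | 11 | 9 | 10 | 12 |
| | C (36) | 47 | 56 | 46 | 33 | 46 |
| | D (19) | 15 | 25 | 29 | 44 | 28 |
| | E (7) | 12 | 3 | 14 | 7 | 9 |

PCWAE3

| LEVEL* \ YR** | 2009 | 2010 | 2011 | MEAN |
|---|---|---|---|---|
| 6 (9) | 4 | 3 | 5 | 4 |
| 5 (25) | 22 | 21 | 16 | 20 |
| 4 (35) | 38 | 39 | 40 | 39 |
| 3 (23) | 27 | 30 | 34 | 30 |
| 2 (5) | 9 | 7 | 5 | 7 |
| 1 (<1) | 0 | 0 | 0 | 0 |

GRADE (Four year average proportions of state population achieving grades A to E as a percentage)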


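Table K.4 Science course completions at the end of Year 12

| Subject (state % in 2015) | MCWAE1 % 2013 | MCWAE1 % 2014 | MCWAE1 % 2015 | MCWAE1 % MEAN | MCWBE3 % 2013 | MCWBE3 % 2014 | MCWBE3 % 2015 | MCWBE3 % MEAN |
|---|---|---|---|---|---|---|---|---|
| Biology (28.5) | 25 | 22 | 48 | 32 | 15 | 31 | 18 | 21 |
| Chemistry (18) | 17 | 6 | 14 | 12 | 3 | 8 | 9 | 7 |
| Earth & Env. Sc. (2.4) | n/a | n/a | n/a | N/A | n/a | n/a | n/a | N/A |
| Physics (16) | 15 | 12 | 14 | 14 | 12 | 6 | 8 | 9 |
| Senior Science (10.4) | 27 | 26 | 23 | 25 | n/a | n/a | n/a | N/A |

| Subject (state % in 2015) | PCWAE1 % 2013 | PCWAE1 % 2014 | PCWAE1 % 2015 | PCWAE1 % MEAN | MCAE2 % 2013 | MCAE2 % 2014 | MCAE2 % 2015 | MCAE2 % MEAN |
|---|---|---|---|---|---|---|---|---|
| Biology (28.5) | 41 | 50 | 30 | 40 | n/a | 55 | 58 | 57 |
| Chemistry (18) | 24 | n/a | 20 | 22 | n/a | 21 | 17 | 19 |
| Earth & Env. Sc. (2.4) | n/a | n/a | n/a | N/A | n/a | n/a | n/a | N/A |
| Physics (16) | n/a | 10 | 35 | 22 | n/a | 12 | 8 | 10 |
| Senior Science (10.4) | 59 | 40 | n/a | 50 | n/a | n/a | n/a | N/A |

| Subject (state % in 2015) | PCWAE2 % 2013 | PCWAE2 % 2014 | PCWAE2 % 2015 | PCWAE2 % MEAN | MCWBE5 % 2013 | MCWBE5 % 2014 | MCWBE5 % 2015 | MCWBE5 % MEAN |
|---|---|---|---|---|---|---|---|---|
| Biology (28.5) | 31 | 46 | 38 | 38 | 35 | 37 | 42 | 38 |
| Chemistry (18) | 15 | 14 | 18 | 16 | 13 | 21 | 20 | 18 |
| Earth & Env. Sc. (2.4) | n/a | n/a | n/a | N/A | n/a | n/a | n/a | N/A |
| Physics (16) | 8 | n/a | 18 | 13 | 26 | 12 | 14 | 17 |
| Senior Science (10.4) | 46 | n/a | 13 | 30 | 22 | 12 | 26 | 20 |

| Subject (state % in 2015) | MGFSAE2 % 2013 | MGFSAE2 % 2014 | MGFSAE2 % 2015 | MGFSAE2 % MEAN | MGFSWBE1 % 2013 | MGFSWBE1 % 2014 | MGFSWBE1 % 2015 | MGFSWBE1 % MEAN |
|---|---|---|---|---|---|---|---|---|
| Biology (28.5) | 21 | 22 | 17 | 20 | 18.6 | 23.6 | 24.8 | 22 |
| Chemistry (18) | 55 | 53 | 54 | 54 | 58.4 | 61.8 | 54.8 | 58 |
| Earth & Env. Sc. (2.4) | n/a | n/a | n/a | N/A | n/a | n/a | n/a | N/A |
| Physics (16) | 18 | 21 | 30 | 23 | 30 | 29.3 | 23.6 | 28 |
| Senior Science (10.4) | n/a | n/a | n/a | N/A | n/a | n/a | n/a | N/A |

| Subject (state % in 2015) | MCFSWAE1 % 2013 | MCFSWAE1 % 2014 | MCFSWAE1 % 2015 | MCFSWAE1 % MEAN | PCWAE3 % 2013 | PCWAE3 % 2014 | PCWAE3 % 2015 | PCWAE3 % MEAN |
|---|---|---|---|---|---|---|---|---|
| Biology (28.5) | 31.9 | 40 | 30.2 | 34 | 21 | 19 | 17 | 19 |
| Chemistry (18) | 65.9 | 74.3 | 71.2 | 70 | 17 | 15 | 19 | 17 |
| Earth & Env. Sc. (2.4) | n/a | n/a | n/a | N/A | n/a | 10 | n/a | 10 |
| Physics (16) | 46.4 | 45.7 | 46 | 46 | 21 | 3 | 9 | 11 |
| Senior Science (10.4) | 5.8 | 10.7 | 11.5 | 9 | 8 | 32 | 26 | 22 |

Note. The proportions (%) reported are relative to the total English candidature for the state in 2015 and at the school for each year. n/a = subject not offered that year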


Table K.5A Student survey item scores, ranks and relative to the state

| School* | A TOP | A TRNK | A TMB | A RTMB | A STA | B** TOP | B** TRNK | B** TMB | B** RTMB | B** STA |
|---|---|---|---|---|---|---|---|---|---|---|
| PCWAE1 | -0.23 | 12 | -0.26 | 7 | B | -0.06 | 8 | -0.34 | 10 | B |
| MCWAE1 | 0.43 | 2 | 1.79 | 1 | A | -0.13 | 12 | 1.07 | 1 | A |
| PCWAE2 | -0.03 | 9 | -0.03 | 4 | B | -0.08 | 9 | 0.12 | 6 | A |
| MCWAE2 | -0.01 | 8 | -1.49 | 13 | B | -0.27 | 15 | -2.25 | 13 | B |
| PCWAE3 | -0.18 | 11 | -0.31 | 8 | B | -0.21 | 13 | -0.33 | 11 | B |
| MCFSWAE1 | 0.44 | 1 | #NULL! | NA | NA | 0.39 | 1 | #NULL! | NA | NA |
| MCAE2 | -0.45 | 14 | -0.87 | 10 | B | -0.23 | 14 | -0.03 | 9 | B |
| MCAE3 | -0.33 | 13 | -0.24 | 6 | B | -0.12 | 10 | 0.31 | 4 | A |
| MGFSAE2 | -0.05 | 10 | #NULL! | NA | NA | 0.09 | 3 | #NULL! | NA | NA |
| MCAE6 | -0.49 | 15 | -0.89 | 11 | B | -0.71 | 16 | -0.77 | 12 | B |
| MGFSWBE1 | 0.23 | 4 | #NULL! | NA | NA | 0.16 | 2 | #NULL! | NA | NA |
| MCWBE5 | 0.12 | 5 | 0.28 | 2 | A | 0.02 | 4 | 0.06 | 5 | A |
| MCWBE4 | -1.45 | 16 | -1.15 | 12 | B | -0.12 | 11 | 0.37 | 2 | A |
| MCWBE3 | 0.09 | 6 | -0.54 | 9 | B | -0.03 | 7 | 0.05 | 7 | A |
| MCPSWBE2 | 0.28 | 3 | -0.20 | 5 | B | 0.00 | 5= | 0.20 | 3 | A |
| STATE | 2.78 | 7 | 5.91 | 3 | = | 1.56 | 5= | 5.28 | 8 | = |

Schools* in residual order / A & B = Item number of survey / TOP = top band scores relative to state (see bottom row) / TRNK = top band rank (n = 16) / TMB = sum of scores for all three achievement bands relative to the state (see bottom row) / STA = above (A) or below (B) the state score. Item A = I want to study a science subject in years 11 & 12. Item B** = Science is the hardest subject I learn (disagreement with that was taken as a good thing)
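The TOP and TMB columns can be traced back to the band scores in Table K.2. The sketch below assumes that TOP is the school's top-band (5-6) score minus the state's top-band score, and that TMB is the sum of the school's three band scores minus the sum of the state's three band scores; this reading reproduces the Item A values reported for MCWAE1 and PCWAE1. The function name is illustrative, not the author's.

```python
# Item A band scores from Table K.2, ordered top (5-6), middle (3-4), bottom (1-2).
state_item_a = [2.78, 1.76, 1.37]
school_item_a = {
    "MCWAE1": [3.21, 2.49, 2.00],
    "PCWAE1": [2.55, 1.35, 1.75],
}


def top_and_tmb(school_bands, state_bands):
    """Return (TOP, TMB) as differences from the state, rounded to 2 decimals."""
    top = round(school_bands[0] - state_bands[0], 2)
    tmb = round(sum(school_bands) - sum(state_bands), 2)
    return top, tmb


scores = {
    school: top_and_tmb(bands, state_item_a)
    for school, bands in school_item_a.items()
}
print(scores)  # {'MCWAE1': (0.43, 1.79), 'PCWAE1': (-0.23, -0.26)}
```

The state row of the table holds the raw state values (2.78 for the top band and 5.91 for the three-band sum) against which these differences are taken.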

Table K.5B Student survey item scores, ranks and relative to the state

| School* | C TOP | C TRNK | C TMB | C RTMB | C STA | D TOP | D TRNK | D TMB | D RTMB | D STA |
|---|---|---|---|---|---|---|---|---|---|---|
| PCWAE1 | -0.11 | 12 | -1.28 | 11 | B | -0.37 | 14 | -1.27 | 11 | B |
| MCWAE1 | 0.93 | 1 | 2.75 | 1 | A | 0.38 | 2 | 1.34 | 1 | A |
| PCWAE2 | 0.09 | 7 | -0.30 | 8 | B | -0.09 | 11 | -0.22 | 7 | B |
| MCWAE2 | -0.12 | 13 | -2.27 | 12 | B | 0.03 | 6 | -2.00 | 13 | B |
| PCWAE3 | -1.51 | 16 | -4.15 | 13 | B | -0.64 | 16 | -0.92 | 10 | B |
| MCFSWAE1 | 0.16 | 6 | #NULL! | NA | NA | -0.02 | 9 | #NULL! | NA | NA |
| MCAE2 | -0.01 | 11 | -0.72 | 10 | B | 0.17 | 4 | -0.49 | 9 | B |
| MCAE3 | -0.15 | 14 | -0.36 | 9 | B | -0.16 | 12 | 0.14 | 4 | A |
| MGFSAE2 | -0.23 | 15 | #NULL! | NA | NA | -0.51 | 15 | #NULL! | NA | NA |
| MCAE6 | 0.47 | 2 | 1.17 | 2 | A | 0.15 | 5 | 0.06 | 5 | A |
| MGFSWBE1 | 0.05 | 9 | #NULL! | NA | NA | -0.03 | 10 | #NULL! | NA | NA |
| MCWBE5 | 0.36 | 3 | 1.13 | 3 | A | 0.17 | 3 | 0.45 | 3 | A |
| MCWBE4 | 0.35 | 4 | 0.97 | 4 | A | 0.50 | 1 | 1.13 | 2 | A |
| MCWBE3 | 0.06 | 8 | -0.21 | 7 | B | -0.26 | 13 | -1.35 | 12 | B |
| MCPSWBE2 | 0.18 | 5 | -0.07 | 6 | B | 0.01 | 7 | -0.45 | 8 | B |
| STATE | 2.76 | 10 | 7.12 | 5 | = | 2.83 | 8 | 6.97 | 6 | = |

Schools* in residual order / C & D = Item number of survey / TOP = top band scores relative to state (see bottom row) / TRNK = top band rank (n = 16) / TMB = sum of scores for all three achievement bands relative to the state (see bottom row) / STA = above (A) or below (B) the state score. Item C = In primary school, I enjoyed lessons that were about science / Item D = In secondary school, I enjoy science lessons


Table K.5C Student survey item scores, ranks and relative to the state

| School* | E TOP | E TRNK | E TMB | E RTMB | E STA | F TOP | F TRNK | F TMB | F RTMB | F STA |
|---|---|---|---|---|---|---|---|---|---|---|
| PCWAE1 | -1.06 | 9 | -8.65 | 12 | B | -5.24 | 15 | -15.66 | 11 | B |
| MCWAE1 | 13.31 | 2 | 21.49 | 1 | A | 9.51 | 1 | 28.50 | 1 | A |
| PCWAE2 | -4.24 | 14 | -9.57 | 13 | B | -0.23 | 9 | -0.48 | 7 | B |
| MCWAE2 | 1.90 | 4 | -5.01 | 8 | B | 2.14 | 5 | -4.35 | 9 | B |
| PCWAE3 | -4.89 | 15 | -8.12 | 10 | B | -4.94 | 14 | -19.63 | 12 | B |
| MCFSWAE1 | 0.21 | 7 | #NULL! | NA | NA | 3.38 | 3 | #NULL! | NA | NA |
| MCAE2 | -1.43 | 10 | -6.79 | 9 | B | -2.99 | 11 | -10.82 | 10 | B |
| MCAE3 | 0.75 | 6 | 2.40 | 4 | A | -4.13 | 12 | -0.22 | 6 | B |
| MGFSAE2 | -6.92 | 16 | #NULL! | NA | NA | -4.18 | 13 | #NULL! | NA | NA |
| MCAE6 | -2.39 | 12 | -4.40 | 7 | B | -0.84 | 10 | -3.41 | 8 | B |
| MGFSWBE1 | -3.69 | 13 | #NULL! | NA | NA | 0.27 | 6 | #NULL! | NA | NA |
| MCWBE5 | 6.10 | 3 | 10.11 | 3 | A | 0.22 | 7 | 3.64 | 3 | A |
| MCWBE4 | 16.13 | 1 | 17.18 | 2 | A | 8.20 | 2 | 14.66 | 2 | A |
| MCWBE3 | -1.51 | 11 | -8.41 | 11 | B | -5.56 | 16 | -22.83 | 13 | B |
| MCPSWBE2 | 1.01 | 5 | -0.74 | 6 | B | 2.46 | 4 | 3.25 | 4 | A |
| STATE | 13.50 | 8 | 24.73 | 5 | = | 25.13 | 8 | 51.34 | 5 | = |

Schools* in residual order / E & F = Item number of survey / TOP = top band scores relative to state (see bottom row) / TRNK = top band rank (n = 16) / TMB = sum of scores for all three achievement bands relative to the state (see bottom row) / STA = above (A) or below (B) the state score. Item E = my three favourite subjects (15 to choose from) / Item F = the three subjects I thought I learned most in (15 to choose from)

Table K.5D Student survey items (D + E) scores, ranks and relative to the state

| School* | D+E TOP | D+E TRNK | D+E TMB | D+E RTMB | D+E STA | NSALL RANK /12 | TALL RANK /16 |
|---|---|---|---|---|---|---|---|
| PCWAE1 | -1.43 | 10 | -9.92 | 13 | B | 10 | 13 |
| MCWAE1 | 13.69 | 3 | 22.83 | 1 | A | 1 | 1 |
| PCWAE2 | -4.33 | 14 | -9.79 | 12 | B | 7 | 11 |
| MCWAE2 | 1.93 | 5 | -7.01 | 8 | B | 9 | 6 |
| PCWAE3 | -5.53 | 15 | -9.04 | 10 | B | 11 | 16 |
| MCFSWAE1 | 0.19 | 8 | #NULL! | NA | NA | N/A | 4 |
| MCAE2 | -1.26 | 9 | -7.28 | 9 | B | 8 | 12 |
| MCAE3 | 0.59 | 7 | 2.54 | 4 | A | 4 | 10 |
| MGFSAE2 | -7.43 | 16 | #NULL! | NA | NA | N/A | 15 |
| MCAE6 | -2.24 | 12 | -4.34 | 7 | B | 6 | 9 |
| MGFSWBE1 | -3.72 | 13 | #NULL! | NA | NA | N/A | 8 |
| MCWBE5 | 6.27 | 4 | 10.56 | 3 | A | 3 | 3 |
| MCWBE4 | 16.63 | 1 | 18.31 | 2 | A | 2 | 2 |
| MCWBE3 | -1.77 | 11 | -9.76 | 11 | B | 12 | 14 |
| MCPSWBE2 | 1.02 | 6 | -1.19 | 6 | B | N/A | 5 |
| STATE | 16.33 | 2 | 31.70 | 5 | = | 5 | 7 |

Schools* in residual order / D + E = Item numbers of survey / TOP = top band scores relative to state (see bottom row) / TRNK = top band rank (n = 16) / TMB = sum of scores for all three achievement bands relative to the state (see bottom row) / STA = above (A) or below (B) the state score. Items D + E = the sum of the scores for Items D and E from the student survey (see above for what they are). NSALL RANK = sum of all achievement band survey scores (rank out of 12) / TALL RANK = sum of top achievement band survey scores (rank out of 16).


Appendix J: Survey descriptive statistics

The residual quintile group refers to the three groups (WAE, AE and WBE) together with the groups separating WAE from AE and AE from WBE: five groups in all, differentiated by their residuals.

Case Processing Summary

| Cases | Valid N | Valid Percent | Missing N | Missing Percent | Total N | Total Percent |
|---|---|---|---|---|---|---|
| Residual quintile group * EV1A | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV1B | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV1C | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV1D | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV1E | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV1F | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV1G | 83 | 97.6% | 2 | 2.4% | 85 | 100.0% |
| Residual quintile group * EV1H | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV1I | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV2A | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV2B | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV2C | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV2D | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV2E | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV2F | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV2G | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV2H | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV2I | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV2J | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV2K | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV2L | 83 | 97.6% | 2 | 2.4% | 85 | 100.0% |
| Residual quintile group * EV2M | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * EV3 | 85 | 100.0% | 0 | 0.0% | 85 | 100.0% |
| Residual quintile group * EV5 | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * S6A | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * S6B | 83 | 97.6% | 2 | 2.4% | 85 | 100.0% |
| Residual quintile group * S6C | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * S6D | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * S6E | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * S6F | 83 | 97.6% | 2 | 2.4% | 85 | 100.0% |
| Residual quintile group * S6G | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * S6H | 83 | 97.6% | 2 | 2.4% | 85 | 100.0% |
| Residual quintile group * S6I | 83 | 97.6% | 2 | 2.4% | 85 | 100.0% |
| Residual quintile group * S6J | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |
| Residual quintile group * S7 | 84 | 98.8% | 1 | 1.2% | 85 | 100.0% |


CODE: 1.00 = YES and 2.00 = NO

Residual quintile group * EV1A Crosstabulation

| Residual quintile group | | EV1A 1.00 | EV1A 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 17 | 15 | 32 |
| | % within Residual quintile group | 53.1% | 46.9% | 100.0% |
| AE | Count | 20 | 8 | 28 |
| | % within Residual quintile group | 71.4% | 28.6% | 100.0% |
| WAE | Count | 20 | 5 | 25 |
| | % within Residual quintile group | 80.0% | 20.0% | 100.0% |
| Total | Count | 57 | 28 | 85 |
| | % within Residual quintile group | 67.1% | 32.9% | 100.0% |
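The "% within Residual quintile group" rows in these crosstabulations are row percentages: each count divided by its group total. A minimal sketch using the EV1A counts is given below; the names are illustrative, and the one-decimal rounding is an assumption chosen to match the printed output.

```python
# EV1A counts per residual group: (code 1.00 = YES, code 2.00 = NO).
ev1a_counts = {"WBE": (17, 15), "AE": (20, 8), "WAE": (20, 5)}


def row_percentages(yes, no):
    """Return the YES and NO counts as percentages of the row total."""
    total = yes + no
    return round(100 * yes / total, 1), round(100 * no / total, 1)


ev1a_percent = {
    group: row_percentages(yes, no) for group, (yes, no) in ev1a_counts.items()
}

# The Total row pools the counts across the three groups before dividing.
total_yes = sum(yes for yes, _ in ev1a_counts.values())
total_no = sum(no for _, no in ev1a_counts.values())
ev1a_percent["Total"] = row_percentages(total_yes, total_no)

for group, (pct_yes, pct_no) in ev1a_percent.items():
    print(f"{group}: {pct_yes}% YES, {pct_no}% NO")
```

For EV1A this gives 53.1% and 46.9% for WBE, 71.4% and 28.6% for AE, 80.0% and 20.0% for WAE, and 67.1% and 32.9% overall, matching the table.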

Residual quintile group * EV1B Crosstabulation

| Residual quintile group | | EV1B 1.00 | EV1B 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 17 | 15 | 32 |
| | % within Residual quintile group | 53.1% | 46.9% | 100.0% |
| AE | Count | 22 | 6 | 28 |
| | % within Residual quintile group | 78.6% | 21.4% | 100.0% |
| WAE | Count | 18 | 7 | 25 |
| | % within Residual quintile group | 72.0% | 28.0% | 100.0% |
| Total | Count | 57 | 28 | 85 |
| | % within Residual quintile group | 67.1% | 32.9% | 100.0% |

Residual quintile group * EV1C Crosstabulation

| Residual quintile group | | EV1C 1.00 | EV1C 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 13 | 18 | 31 |
| | % within Residual quintile group | 41.9% | 58.1% | 100.0% |
| AE | Count | 21 | 7 | 28 |
| | % within Residual quintile group | 75.0% | 25.0% | 100.0% |
| WAE | Count | 17 | 8 | 25 |
| | % within Residual quintile group | 68.0% | 32.0% | 100.0% |
| Total | Count | 51 | 33 | 84 |
| | % within Residual quintile group | 60.7% | 39.3% | 100.0% |


Residual quintile group * EV1D Crosstabulation

| Residual quintile group | | EV1D 1.00 | EV1D 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 10 | 22 | 32 |
| | % within Residual quintile group | 31.3% | 68.8% | 100.0% |
| AE | Count | 13 | 14 | 27 |
| | % within Residual quintile group | 48.1% | 51.9% | 100.0% |
| WAE | Count | 17 | 8 | 25 |
| | % within Residual quintile group | 68.0% | 32.0% | 100.0% |
| Total | Count | 40 | 44 | 84 |
| | % within Residual quintile group | 47.6% | 52.4% | 100.0% |

Residual quintile group * EV1E Crosstabulation

| Residual quintile group | | EV1E 1.00 | EV1E 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 17 | 15 | 32 |
| | % within Residual quintile group | 53.1% | 46.9% | 100.0% |
| AE | Count | 23 | 5 | 28 |
| | % within Residual quintile group | 82.1% | 17.9% | 100.0% |
| WAE | Count | 16 | 9 | 25 |
| | % within Residual quintile group | 64.0% | 36.0% | 100.0% |
| Total | Count | 56 | 29 | 85 |
| | % within Residual quintile group | 65.9% | 34.1% | 100.0% |

Residual quintile group * EV1F Crosstabulation

| Residual quintile group | | EV1F 1.00 | EV1F 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 5 | 27 | 32 |
| | % within Residual quintile group | 15.6% | 84.4% | 100.0% |
| AE | Count | 6 | 22 | 28 |
| | % within Residual quintile group | 21.4% | 78.6% | 100.0% |
| WAE | Count | 8 | 17 | 25 |
| | % within Residual quintile group | 32.0% | 68.0% | 100.0% |
| Total | Count | 19 | 66 | 85 |
| | % within Residual quintile group | 22.4% | 77.6% | 100.0% |


Residual quintile group * EV1G Crosstabulation

| Residual quintile group | | EV1G 1.00 | EV1G 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 12 | 20 | 32 |
| | % within Residual quintile group | 37.5% | 62.5% | 100.0% |
| AE | Count | 15 | 11 | 26 |
| | % within Residual quintile group | 57.7% | 42.3% | 100.0% |
| WAE | Count | 14 | 11 | 25 |
| | % within Residual quintile group | 56.0% | 44.0% | 100.0% |
| Total | Count | 41 | 42 | 83 |
| | % within Residual quintile group | 49.4% | 50.6% | 100.0% |

Residual quintile group * EV1H Crosstabulation

| Residual quintile group | | EV1H 1.00 | EV1H 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 8 | 24 | 32 |
| | % within Residual quintile group | 25.0% | 75.0% | 100.0% |
| AE | Count | 11 | 16 | 27 |
| | % within Residual quintile group | 40.7% | 59.3% | 100.0% |
| WAE | Count | 9 | 16 | 25 |
| | % within Residual quintile group | 36.0% | 64.0% | 100.0% |
| Total | Count | 28 | 56 | 84 |
| | % within Residual quintile group | 33.3% | 66.7% | 100.0% |

Residual quintile group * EV1I Crosstabulation

| Residual quintile group | | EV1I 1.00 | EV1I 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 3 | 28 | 31 |
| | % within Residual quintile group | 9.7% | 90.3% | 100.0% |
| AE | Count | 6 | 22 | 28 |
| | % within Residual quintile group | 21.4% | 78.6% | 100.0% |
| WAE | Count | 6 | 19 | 25 |
| | % within Residual quintile group | 24.0% | 76.0% | 100.0% |
| Total | Count | 15 | 69 | 84 |
| | % within Residual quintile group | 17.9% | 82.1% | 100.0% |


Residual quintile group * EV2A Crosstabulation

| Residual quintile group | | EV2A 1.00 | EV2A 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 7 | 25 | 32 |
| | % within Residual quintile group | 21.9% | 78.1% | 100.0% |
| AE | Count | 9 | 19 | 28 |
| | % within Residual quintile group | 32.1% | 67.9% | 100.0% |
| WAE | Count | 11 | 14 | 25 |
| | % within Residual quintile group | 44.0% | 56.0% | 100.0% |
| Total | Count | 27 | 58 | 85 |
| | % within Residual quintile group | 31.8% | 68.2% | 100.0% |

Residual quintile group * EV2B Crosstabulation

| Residual quintile group | | EV2B 1.00 | EV2B 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 15 | 17 | 32 |
| | % within Residual quintile group | 46.9% | 53.1% | 100.0% |
| AE | Count | 24 | 4 | 28 |
| | % within Residual quintile group | 85.7% | 14.3% | 100.0% |
| WAE | Count | 21 | 4 | 25 |
| | % within Residual quintile group | 84.0% | 16.0% | 100.0% |
| Total | Count | 60 | 25 | 85 |
| | % within Residual quintile group | 70.6% | 29.4% | 100.0% |

Residual quintile group * EV2C Crosstabulation

| Residual quintile group | | EV2C 1.00 | EV2C 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 11 | 21 | 32 |
| | % within Residual quintile group | 34.4% | 65.6% | 100.0% |
| AE | Count | 18 | 9 | 27 |
| | % within Residual quintile group | 66.7% | 33.3% | 100.0% |
| WAE | Count | 9 | 16 | 25 |
| | % within Residual quintile group | 36.0% | 64.0% | 100.0% |
| Total | Count | 38 | 46 | 84 |
| | % within Residual quintile group | 45.2% | 54.8% | 100.0% |


Residual quintile group * EV2D Crosstabulation

| Residual quintile group | | EV2D 1.00 | EV2D 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 16 | 16 | 32 |
| | % within Residual quintile group | 50.0% | 50.0% | 100.0% |
| AE | Count | 15 | 13 | 28 |
| | % within Residual quintile group | 53.6% | 46.4% | 100.0% |
| WAE | Count | 16 | 9 | 25 |
| | % within Residual quintile group | 64.0% | 36.0% | 100.0% |
| Total | Count | 47 | 38 | 85 |
| | % within Residual quintile group | 55.3% | 44.7% | 100.0% |

Residual quintile group * EV2E Crosstabulation

| Residual quintile group | | EV2E 1.00 | EV2E 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 20 | 12 | 32 |
| | % within Residual quintile group | 62.5% | 37.5% | 100.0% |
| AE | Count | 22 | 6 | 28 |
| | % within Residual quintile group | 78.6% | 21.4% | 100.0% |
| WAE | Count | 19 | 6 | 25 |
| | % within Residual quintile group | 76.0% | 24.0% | 100.0% |
| Total | Count | 61 | 24 | 85 |
| | % within Residual quintile group | 71.8% | 28.2% | 100.0% |

Residual quintile group * EV2F Crosstabulation

| Residual quintile group | | EV2F 1.00 | EV2F 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 16 | 15 | 31 |
| | % within Residual quintile group | 51.6% | 48.4% | 100.0% |
| AE | Count | 22 | 6 | 28 |
| | % within Residual quintile group | 78.6% | 21.4% | 100.0% |
| WAE | Count | 18 | 7 | 25 |
| | % within Residual quintile group | 72.0% | 28.0% | 100.0% |
| Total | Count | 56 | 28 | 84 |
| | % within Residual quintile group | 66.7% | 33.3% | 100.0% |


Residual quintile group * EV2G Crosstabulation

| Residual quintile group | | EV2G 1.00 | EV2G 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 19 | 13 | 32 |
| | % within Residual quintile group | 59.4% | 40.6% | 100.0% |
| AE | Count | 20 | 7 | 27 |
| | % within Residual quintile group | 74.1% | 25.9% | 100.0% |
| WAE | Count | 18 | 7 | 25 |
| | % within Residual quintile group | 72.0% | 28.0% | 100.0% |
| Total | Count | 57 | 27 | 84 |
| | % within Residual quintile group | 67.9% | 32.1% | 100.0% |

Residual quintile group * EV2H Crosstabulation

| Residual quintile group | | EV2H 1.00 | EV2H 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 14 | 18 | 32 |
| | % within Residual quintile group | 43.8% | 56.3% | 100.0% |
| AE | Count | 21 | 7 | 28 |
| | % within Residual quintile group | 75.0% | 25.0% | 100.0% |
| WAE | Count | 14 | 11 | 25 |
| | % within Residual quintile group | 56.0% | 44.0% | 100.0% |
| Total | Count | 49 | 36 | 85 |
| | % within Residual quintile group | 57.6% | 42.4% | 100.0% |

Residual quintile group * EV2I Crosstabulation

| Residual quintile group | | EV2I 1.00 | EV2I 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 7 | 24 | 31 |
| | % within Residual quintile group | 22.6% | 77.4% | 100.0% |
| AE | Count | 21 | 7 | 28 |
| | % within Residual quintile group | 75.0% | 25.0% | 100.0% |
| WAE | Count | 12 | 13 | 25 |
| | % within Residual quintile group | 48.0% | 52.0% | 100.0% |
| Total | Count | 40 | 44 | 84 |
| | % within Residual quintile group | 47.6% | 52.4% | 100.0% |


Residual quintile group * EV2J Crosstabulation

| Residual quintile group | | EV2J 1.00 | EV2J 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 0 | 32 | 32 |
| | % within Residual quintile group | 0.0% | 100.0% | 100.0% |
| AE | Count | 2 | 26 | 28 |
| | % within Residual quintile group | 7.1% | 92.9% | 100.0% |
| WAE | Count | 0 | 25 | 25 |
| | % within Residual quintile group | 0.0% | 100.0% | 100.0% |
| Total | Count | 2 | 83 | 85 |
| | % within Residual quintile group | 2.4% | 97.6% | 100.0% |

Residual quintile group * EV2K Crosstabulation

| Residual quintile group | | EV2K 1.00 | EV2K 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 0 | 32 | 32 |
| | % within Residual quintile group | 0.0% | 100.0% | 100.0% |
| AE | Count | 2 | 26 | 28 |
| | % within Residual quintile group | 7.1% | 92.9% | 100.0% |
| WAE | Count | 2 | 23 | 25 |
| | % within Residual quintile group | 8.0% | 92.0% | 100.0% |
| Total | Count | 4 | 81 | 85 |
| | % within Residual quintile group | 4.7% | 95.3% | 100.0% |

Residual quintile group * EV2L Crosstabulation

| Residual quintile group | | EV2L 1.00 | EV2L 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 10 | 22 | 32 |
| | % within Residual quintile group | 31.3% | 68.8% | 100.0% |
| AE | Count | 9 | 18 | 27 |
| | % within Residual quintile group | 33.3% | 66.7% | 100.0% |
| WAE | Count | 13 | 11 | 24 |
| | % within Residual quintile group | 54.2% | 45.8% | 100.0% |
| Total | Count | 32 | 51 | 83 |
| | % within Residual quintile group | 38.6% | 61.4% | 100.0% |


Residual quintile group * EV2M Crosstabulation

| Residual quintile group | | EV2M 1.00 | EV2M 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 7 | 25 | 32 |
| | % within Residual quintile group | 21.9% | 78.1% | 100.0% |
| AE | Count | 9 | 18 | 27 |
| | % within Residual quintile group | 33.3% | 66.7% | 100.0% |
| WAE | Count | 9 | 16 | 25 |
| | % within Residual quintile group | 36.0% | 64.0% | 100.0% |
| Total | Count | 25 | 59 | 84 |
| | % within Residual quintile group | 29.8% | 70.2% | 100.0% |

Residual quintile group * EV3 Crosstabulation (see questionnaire for the key)

| Residual quintile group | | EV3 1.00 | EV3 2.00 | EV3 3.00 | EV3 4.00 | EV3 5.00 | Total |
|---|---|---|---|---|---|---|---|
| WBE | Count | 5 | 4 | 12 | 9 | 2 | 32 |
| | % within Residual quintile group | 15.6% | 12.5% | 37.5% | 28.1% | 6.3% | 100.0% |
| AE | Count | 0 | 0 | 8 | 11 | 9 | 28 |
| | % within Residual quintile group | 0.0% | 0.0% | 28.6% | 39.3% | 32.1% | 100.0% |
| WAE | Count | 0 | 2 | 6 | 11 | 6 | 25 |
| | % within Residual quintile group | 0.0% | 8.0% | 24.0% | 44.0% | 24.0% | 100.0% |
| Total | Count | 5 | 6 | 26 | 31 | 17 | 85 |
| | % within Residual quintile group | 5.9% | 7.1% | 30.6% | 36.5% | 20.0% | 100.0% |

Residual quintile group * EV5 Crosstabulation (see questionnaire for the key)

| Residual quintile group | | EV5 1.00 | EV5 2.00 | EV5 3.00 | Total |
|---|---|---|---|---|---|
| WBE | Count | 15 | 6 | 11 | 32 |
| | % within Residual quintile group | 46.9% | 18.8% | 34.4% | 100.0% |
| AE | Count | 14 | 8 | 5 | 27 |
| | % within Residual quintile group | 51.9% | 29.6% | 18.5% | 100.0% |
| WAE | Count | 18 | 1 | 6 | 25 |
| | % within Residual quintile group | 72.0% | 4.0% | 24.0% | 100.0% |
| Total | Count | 47 | 15 | 22 | 84 |
| | % within Residual quintile group | 56.0% | 17.9% | 26.2% | 100.0% |


Residual quintile group * S6A Crosstabulation

| Residual quintile group | | S6A 1.00 | S6A 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 17 | 15 | 32 |
| | % within Residual quintile group | 53.1% | 46.9% | 100.0% |
| AE | Count | 15 | 13 | 28 |
| | % within Residual quintile group | 53.6% | 46.4% | 100.0% |
| WAE | Count | 13 | 11 | 24 |
| | % within Residual quintile group | 54.2% | 45.8% | 100.0% |
| Total | Count | 45 | 39 | 84 |
| | % within Residual quintile group | 53.6% | 46.4% | 100.0% |

Residual quintile group * S6B Crosstabulation

| Residual quintile group | | S6B 1.00 | S6B 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 6 | 26 | 32 |
| | % within Residual quintile group | 18.8% | 81.3% | 100.0% |
| AE | Count | 7 | 20 | 27 |
| | % within Residual quintile group | 25.9% | 74.1% | 100.0% |
| WAE | Count | 9 | 15 | 24 |
| | % within Residual quintile group | 37.5% | 62.5% | 100.0% |
| Total | Count | 22 | 61 | 83 |
| | % within Residual quintile group | 26.5% | 73.5% | 100.0% |

Residual quintile group * S6C Crosstabulation

| Residual quintile group | | S6C 1.00 | S6C 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 13 | 19 | 32 |
| | % within Residual quintile group | 40.6% | 59.4% | 100.0% |
| AE | Count | 10 | 18 | 28 |
| | % within Residual quintile group | 35.7% | 64.3% | 100.0% |
| WAE | Count | 12 | 12 | 24 |
| | % within Residual quintile group | 50.0% | 50.0% | 100.0% |
| Total | Count | 35 | 49 | 84 |
| | % within Residual quintile group | 41.7% | 58.3% | 100.0% |


Residual quintile group * S6D Crosstabulation

| Residual quintile group | | S6D 1.00 | S6D 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 6 | 26 | 32 |
| | % within Residual quintile group | 18.8% | 81.3% | 100.0% |
| AE | Count | 4 | 24 | 28 |
| | % within Residual quintile group | 14.3% | 85.7% | 100.0% |
| WAE | Count | 8 | 16 | 24 |
| | % within Residual quintile group | 33.3% | 66.7% | 100.0% |
| Total | Count | 18 | 66 | 84 |
| | % within Residual quintile group | 21.4% | 78.6% | 100.0% |

Residual quintile group * S6E Crosstabulation

| Residual quintile group | | S6E 1.00 | S6E 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 7 | 25 | 32 |
| | % within Residual quintile group | 21.9% | 78.1% | 100.0% |
| AE | Count | 8 | 20 | 28 |
| | % within Residual quintile group | 28.6% | 71.4% | 100.0% |
| WAE | Count | 9 | 15 | 24 |
| | % within Residual quintile group | 37.5% | 62.5% | 100.0% |
| Total | Count | 24 | 60 | 84 |
| | % within Residual quintile group | 28.6% | 71.4% | 100.0% |

Residual quintile group * S6F Crosstabulation

| Residual quintile group | | S6F 1.00 | S6F 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 4 | 27 | 31 |
| | % within Residual quintile group | 12.9% | 87.1% | 100.0% |
| AE | Count | 5 | 23 | 28 |
| | % within Residual quintile group | 17.9% | 82.1% | 100.0% |
| WAE | Count | 6 | 18 | 24 |
| | % within Residual quintile group | 25.0% | 75.0% | 100.0% |
| Total | Count | 15 | 68 | 83 |
| | % within Residual quintile group | 18.1% | 81.9% | 100.0% |


Residual quintile group * S6G Crosstabulation

| Residual quintile group | | S6G 1.00 | S6G 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 6 | 26 | 32 |
| | % within Residual quintile group | 18.8% | 81.3% | 100.0% |
| AE | Count | 7 | 21 | 28 |
| | % within Residual quintile group | 25.0% | 75.0% | 100.0% |
| WAE | Count | 8 | 16 | 24 |
| | % within Residual quintile group | 33.3% | 66.7% | 100.0% |
| Total | Count | 21 | 63 | 84 |
| | % within Residual quintile group | 25.0% | 75.0% | 100.0% |

Residual quintile group * S6H Crosstabulation

| Residual quintile group | | S6H 1.00 | S6H 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 3 | 28 | 31 |
| | % within Residual quintile group | 9.7% | 90.3% | 100.0% |
| AE | Count | 3 | 25 | 28 |
| | % within Residual quintile group | 10.7% | 89.3% | 100.0% |
| WAE | Count | 5 | 19 | 24 |
| | % within Residual quintile group | 20.8% | 79.2% | 100.0% |
| Total | Count | 11 | 72 | 83 |
| | % within Residual quintile group | 13.3% | 86.7% | 100.0% |

Residual quintile group * S6I Crosstabulation

| Residual quintile group | | S6I 1.00 | S6I 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 2 | 29 | 31 |
| | % within Residual quintile group | 6.5% | 93.5% | 100.0% |
| AE | Count | 1 | 27 | 28 |
| | % within Residual quintile group | 3.6% | 96.4% | 100.0% |
| WAE | Count | 1 | 23 | 24 |
| | % within Residual quintile group | 4.2% | 95.8% | 100.0% |
| Total | Count | 4 | 79 | 83 |
| | % within Residual quintile group | 4.8% | 95.2% | 100.0% |


Residual quintile group * S6J Crosstabulation

| Residual quintile group | | S6J 1.00 | S6J 2.00 | Total |
|---|---|---|---|---|
| WBE | Count | 0 | 32 | 32 |
| | % within Residual quintile group | 0.0% | 100.0% | 100.0% |
| AE | Count | 2 | 26 | 28 |
| | % within Residual quintile group | 7.1% | 92.9% | 100.0% |
| WAE | Count | 3 | 21 | 24 |
| | % within Residual quintile group | 12.5% | 87.5% | 100.0% |
| Total | Count | 5 | 79 | 84 |
| | % within Residual quintile group | 6.0% | 94.0% | 100.0% |

Residual quintile group * S7 Crosstabulation (see questionnaire for the key)

| Residual quintile group | | S7 1.00 | S7 2.00 | S7 3.00 | S7 4.00 | S7 5.00 | Total |
|---|---|---|---|---|---|---|---|
| WBE | Count | 9 | 5 | 12 | 6 | 0 | 32 |
| | % within Residual quintile group | 28.1% | 15.6% | 37.5% | 18.8% | 0.0% | 100.0% |
| AE | Count | 6 | 6 | 9 | 7 | 0 | 28 |
| | % within Residual quintile group | 21.4% | 21.4% | 32.1% | 25.0% | 0.0% | 100.0% |
| WAE | Count | 8 | 5 | 3 | 5 | 3 | 24 |
| | % within Residual quintile group | 33.3% | 20.8% | 12.5% | 20.8% | 12.5% | 100.0% |
| Total | Count | 23 | 16 | 24 | 18 | 3 | 84 |
| | % within Residual quintile group | 27.4% | 19.0% | 28.6% | 21.4% | 3.6% | 100.0% |


SECTION THREE: ASSESSMENT FOR LEARNING (AFL) DESCRIPTIVE STATISTICS

Case Processing Summary

                                                          Cases
                                                          Valid           Missing         Total
                                                          N     Percent   N     Percent   N     Percent
Residual quintile group * AFL9A                           83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL9B                           83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL9C                           83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL9D                           83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL9E                           82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL9F                           82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL9G                           81    95.3%     4     4.7%      85    100.0%
Residual quintile group * AFL9H                           83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL10A                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL10B                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL10C                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL10D                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL10E                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL10F                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL10G                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL10H                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL11A                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL11B                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL11C                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL11D                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL11E                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * Exemplary or model answers      85    100.0%    0     0.0%      85    100.0%
Residual quintile group * Success criteria                85    100.0%    0     0.0%      85    100.0%
Residual quintile group * Misconceptions                  85    100.0%    0     0.0%      85    100.0%
Residual quintile group * SOLO levels                     85    100.0%    0     0.0%      85    100.0%
Residual quintile group * QT model                        85    100.0%    0     0.0%      85    100.0%
Residual quintile group * Bloom categories                85    100.0%    0     0.0%      85    100.0%
Residual quintile group * Syllabus outcomes / standards   85    100.0%    0     0.0%      85    100.0%
Residual quintile group * AFL13A                          81    95.3%     4     4.7%      85    100.0%
Residual quintile group * AFL13B                          81    95.3%     4     4.7%      85    100.0%
Residual quintile group * AFL13C                          81    95.3%     4     4.7%      85    100.0%
Residual quintile group * AFL13D                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL13E                          80    94.1%     5     5.9%      85    100.0%
Residual quintile group * AFL13F                          81    95.3%     4     4.7%      85    100.0%
Residual quintile group * AFL14A                          81    95.3%     4     4.7%      85    100.0%
Residual quintile group * AFL14B                          81    95.3%     4     4.7%      85    100.0%
Residual quintile group * AFL14C                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL14D                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL14E                          81    95.3%     4     4.7%      85    100.0%
Residual quintile group * AFL14F                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL14G                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL14H                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL15A                          83    97.6%     2     2.4%      85    100.0%
Residual quintile group * AFL15B                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL15C                          82    96.5%     3     3.5%      85    100.0%
Residual quintile group * AFL15D                          79    92.9%     6     7.1%      85    100.0%
Residual quintile group * AFL15E                          83    97.6%     2     2.4%      85    100.0%

N = 47 items
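
The Valid and Missing percentages in the Case Processing Summary follow from each item's valid count and the 85 cases in the file. The sketch below reproduces three of the rows as a check; the function name `case_summary` is hypothetical, and rounding to one decimal place is an assumption made to match the displayed output.

```python
def case_summary(valid_n, total_n=85):
    """Return (valid %, missing N, missing %) for one summary row."""
    missing_n = total_n - valid_n
    valid_pct = round(100 * valid_n / total_n, 1)
    missing_pct = round(100 * missing_n / total_n, 1)
    return valid_pct, missing_n, missing_pct


# Valid counts copied from the summary rows for three items.
for item, valid_n in [("AFL9A", 83), ("AFL9G", 81), ("AFL15D", 79)]:
    print(item, case_summary(valid_n))
```

Items with 100.0% valid cases are the ones for which a response code of .00 appears in the corresponding crosstabulation.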

SEE QUESTIONNAIRE FOR KEY EXPLAINING RESPONSE AND RELATED NUMBER

Residual quintile group * AFL9A Crosstabulation

                                         AFL9A
Residual quintile group                  1.00    3.00    4.00    5.00    Total
WBE    Count                             2       1       8       20      31
       % within Residual quintile group  6.5%    3.2%    25.8%   64.5%   100.0%
AE     Count                             0       0       7       21      28
       % within Residual quintile group  0.0%    0.0%    25.0%   75.0%   100.0%
WAE    Count                             0       0       11      13      24
       % within Residual quintile group  0.0%    0.0%    45.8%   54.2%   100.0%
Total  Count                             2       1       26      54      83
       % within Residual quintile group  2.4%    1.2%    31.3%   65.1%   100.0%
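
The "% within Residual quintile group" figures are row percentages: each count divided by the total for its residual quintile group. The sketch below recomputes the AFL9A percentages from the counts reported above. The names `counts` and `row_percentages` are illustrative only, and one-decimal rounding is assumed.

```python
# Counts for AFL9A by residual quintile group (response codes 1, 3, 4, 5).
counts = {
    "WBE": [2, 1, 8, 20],
    "AE": [0, 0, 7, 21],
    "WAE": [0, 0, 11, 13],
}


def row_percentages(row):
    """Express each count as a percentage of its row total."""
    total = sum(row)
    return [round(100 * c / total, 1) for c in row]


# The Total row is the column-wise sum over the three groups.
total_row = [sum(col) for col in zip(*counts.values())]

for group, row in counts.items():
    print(group, row_percentages(row))
print("Total", total_row, row_percentages(total_row))
```

The same calculation applies to every crosstabulation in this section, because all percentages are reported within the residual quintile group rather than within the response category.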

Residual quintile group * AFL9B Crosstabulation

                                         AFL9B
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             2       4       8       17      0       31
       % within Residual quintile group  6.5%    12.9%   25.8%   54.8%   0.0%    100.0%
AE     Count                             1       2       8       15      2       28
       % within Residual quintile group  3.6%    7.1%    28.6%   53.6%   7.1%    100.0%
WAE    Count                             0       4       5       14      1       24
       % within Residual quintile group  0.0%    16.7%   20.8%   58.3%   4.2%    100.0%
Total  Count                             3       10      21      46      3       83
       % within Residual quintile group  3.6%    12.0%   25.3%   55.4%   3.6%    100.0%

Residual quintile group * AFL9C Crosstabulation

                                         AFL9C
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             1       1       3       10      16      31
       % within Residual quintile group  3.2%    3.2%    9.7%    32.3%   51.6%   100.0%
AE     Count                             0       0       3       4       21      28
       % within Residual quintile group  0.0%    0.0%    10.7%   14.3%   75.0%   100.0%
WAE    Count                             0       0       1       8       15      24
       % within Residual quintile group  0.0%    0.0%    4.2%    33.3%   62.5%   100.0%
Total  Count                             1       1       7       22      52      83
       % within Residual quintile group  1.2%    1.2%    8.4%    26.5%   62.7%   100.0%

Residual quintile group * AFL9D Crosstabulation

                                         AFL9D
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             1       5       14      9       2       31
       % within Residual quintile group  3.2%    16.1%   45.2%   29.0%   6.5%    100.0%
AE     Count                             1       1       10      15      1       28
       % within Residual quintile group  3.6%    3.6%    35.7%   53.6%   3.6%    100.0%
WAE    Count                             0       3       7       12      2       24
       % within Residual quintile group  0.0%    12.5%   29.2%   50.0%   8.3%    100.0%
Total  Count                             2       9       31      36      5       83
       % within Residual quintile group  2.4%    10.8%   37.3%   43.4%   6.0%    100.0%

Residual quintile group * AFL9E Crosstabulation

                                         AFL9E
Residual quintile group                  3.00    4.00    5.00    Total
WBE    Count                             1       11      19      31
       % within Residual quintile group  3.2%    35.5%   61.3%   100.0%
AE     Count                             0       6       21      27
       % within Residual quintile group  0.0%    22.2%   77.8%   100.0%
WAE    Count                             1       10      13      24
       % within Residual quintile group  4.2%    41.7%   54.2%   100.0%
Total  Count                             2       27      53      82
       % within Residual quintile group  2.4%    32.9%   64.6%   100.0%

Residual quintile group * AFL9F Crosstabulation

                                         AFL9F
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             1       4       19      6       30
       % within Residual quintile group  3.3%    13.3%   63.3%   20.0%   100.0%
AE     Count                             1       4       16      7       28
       % within Residual quintile group  3.6%    14.3%   57.1%   25.0%   100.0%
WAE    Count                             1       5       10      8       24
       % within Residual quintile group  4.2%    20.8%   41.7%   33.3%   100.0%
Total  Count                             3       13      45      21      82
       % within Residual quintile group  3.7%    15.9%   54.9%   25.6%   100.0%

Residual quintile group * AFL9G Crosstabulation

                                         AFL9G
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             1       1       5       18      5       30
       % within Residual quintile group  3.3%    3.3%    16.7%   60.0%   16.7%   100.0%
AE     Count                             0       1       4       18      4       27
       % within Residual quintile group  0.0%    3.7%    14.8%   66.7%   14.8%   100.0%
WAE    Count                             0       1       5       11      7       24
       % within Residual quintile group  0.0%    4.2%    20.8%   45.8%   29.2%   100.0%
Total  Count                             1       3       14      47      16      81
       % within Residual quintile group  1.2%    3.7%    17.3%   58.0%   19.8%   100.0%

Residual quintile group * AFL9H Crosstabulation

                                         AFL9H
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             4       12      10      5       0       31
       % within Residual quintile group  12.9%   38.7%   32.3%   16.1%   0.0%    100.0%
AE     Count                             2       12      5       8       1       28
       % within Residual quintile group  7.1%    42.9%   17.9%   28.6%   3.6%    100.0%
WAE    Count                             3       13      6       0       2       24
       % within Residual quintile group  12.5%   54.2%   25.0%   0.0%    8.3%    100.0%
Total  Count                             9       37      21      13      3       83
       % within Residual quintile group  10.8%   44.6%   25.3%   15.7%   3.6%    100.0%

Residual quintile group * AFL10A Crosstabulation

                                         AFL10A
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             0       5       18      8       31
       % within Residual quintile group  0.0%    16.1%   58.1%   25.8%   100.0%
AE     Count                             1       4       16      6       27
       % within Residual quintile group  3.7%    14.8%   59.3%   22.2%   100.0%
WAE    Count                             0       5       14      5       24
       % within Residual quintile group  0.0%    20.8%   58.3%   20.8%   100.0%
Total  Count                             1       14      48      19      82
       % within Residual quintile group  1.2%    17.1%   58.5%   23.2%   100.0%

Residual quintile group * AFL10B Crosstabulation

                                         AFL10B
Residual quintile group                  4.00    5.00    Total
WBE    Count                             13      18      31
       % within Residual quintile group  41.9%   58.1%   100.0%
AE     Count                             12      16      28
       % within Residual quintile group  42.9%   57.1%   100.0%
WAE    Count                             11      13      24
       % within Residual quintile group  45.8%   54.2%   100.0%
Total  Count                             36      47      83
       % within Residual quintile group  43.4%   56.6%   100.0%

Residual quintile group * AFL10C Crosstabulation

                                         AFL10C
Residual quintile group                  1.00    3.00    4.00    5.00    Total
WBE    Count                             1       1       14      15      31
       % within Residual quintile group  3.2%    3.2%    45.2%   48.4%   100.0%
AE     Count                             0       0       12      16      28
       % within Residual quintile group  0.0%    0.0%    42.9%   57.1%   100.0%
WAE    Count                             0       0       5       19      24
       % within Residual quintile group  0.0%    0.0%    20.8%   79.2%   100.0%
Total  Count                             1       1       31      50      83
       % within Residual quintile group  1.2%    1.2%    37.3%   60.2%   100.0%

Residual quintile group * AFL10D Crosstabulation

                                         AFL10D
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             1       2       11      17      31
       % within Residual quintile group  3.2%    6.5%    35.5%   54.8%   100.0%
AE     Count                             0       2       8       18      28
       % within Residual quintile group  0.0%    7.1%    28.6%   64.3%   100.0%
WAE    Count                             0       0       6       18      24
       % within Residual quintile group  0.0%    0.0%    25.0%   75.0%   100.0%
Total  Count                             1       4       25      53      83
       % within Residual quintile group  1.2%    4.8%    30.1%   63.9%   100.0%

Residual quintile group * AFL10E Crosstabulation

                                         AFL10E
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             2       2       5       19      3       31
       % within Residual quintile group  6.5%    6.5%    16.1%   61.3%   9.7%    100.0%
AE     Count                             1       2       3       17      5       28
       % within Residual quintile group  3.6%    7.1%    10.7%   60.7%   17.9%   100.0%
WAE    Count                             2       2       4       10      5       23
       % within Residual quintile group  8.7%    8.7%    17.4%   43.5%   21.7%   100.0%
Total  Count                             5       6       12      46      13      82
       % within Residual quintile group  6.1%    7.3%    14.6%   56.1%   15.9%   100.0%

Residual quintile group * AFL10F Crosstabulation

                                         AFL10F
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             0       8       19      4       31
       % within Residual quintile group  0.0%    25.8%   61.3%   12.9%   100.0%
AE     Count                             0       2       17      9       28
       % within Residual quintile group  0.0%    7.1%    60.7%   32.1%   100.0%
WAE    Count                             1       1       14      8       24
       % within Residual quintile group  4.2%    4.2%    58.3%   33.3%   100.0%
Total  Count                             1       11      50      21      83
       % within Residual quintile group  1.2%    13.3%   60.2%   25.3%   100.0%

Residual quintile group * AFL10G Crosstabulation

                                         AFL10G
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             2       13      12      3       30
       % within Residual quintile group  6.7%    43.3%   40.0%   10.0%   100.0%
AE     Count                             3       8       13      4       28
       % within Residual quintile group  10.7%   28.6%   46.4%   14.3%   100.0%
WAE    Count                             2       2       16      4       24
       % within Residual quintile group  8.3%    8.3%    66.7%   16.7%   100.0%
Total  Count                             7       23      41      11      82
       % within Residual quintile group  8.5%    28.0%   50.0%   13.4%   100.0%

Residual quintile group * AFL10H Crosstabulation

                                         AFL10H
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             1       1       1       16      12      31
       % within Residual quintile group  3.2%    3.2%    3.2%    51.6%   38.7%   100.0%
AE     Count                             0       0       0       14      13      27
       % within Residual quintile group  0.0%    0.0%    0.0%    51.9%   48.1%   100.0%
WAE    Count                             0       0       0       6       18      24
       % within Residual quintile group  0.0%    0.0%    0.0%    25.0%   75.0%   100.0%
Total  Count                             1       1       1       36      43      82
       % within Residual quintile group  1.2%    1.2%    1.2%    43.9%   52.4%   100.0%

Residual quintile group * AFL11A Crosstabulation

                                         AFL11A
Residual quintile group                  3.00    4.00    5.00    Total
WBE    Count                             4       8       19      31
       % within Residual quintile group  12.9%   25.8%   61.3%   100.0%
AE     Count                             1       7       20      28
       % within Residual quintile group  3.6%    25.0%   71.4%   100.0%
WAE    Count                             1       5       18      24
       % within Residual quintile group  4.2%    20.8%   75.0%   100.0%
Total  Count                             6       20      57      83
       % within Residual quintile group  7.2%    24.1%   68.7%   100.0%

Residual quintile group * AFL11B Crosstabulation

                                         AFL11B
Residual quintile group                  3.00    4.00    5.00    Total
WBE    Count                             3       7       21      31
       % within Residual quintile group  9.7%    22.6%   67.7%   100.0%
AE     Count                             1       11      16      28
       % within Residual quintile group  3.6%    39.3%   57.1%   100.0%
WAE    Count                             0       6       18      24
       % within Residual quintile group  0.0%    25.0%   75.0%   100.0%
Total  Count                             4       24      55      83
       % within Residual quintile group  4.8%    28.9%   66.3%   100.0%

Residual quintile group * AFL11C Crosstabulation

                                         AFL11C
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             4       8       13      6       31
       % within Residual quintile group  12.9%   25.8%   41.9%   19.4%   100.0%
AE     Count                             3       4       13      8       28
       % within Residual quintile group  10.7%   14.3%   46.4%   28.6%   100.0%
WAE    Count                             2       4       8       9       23
       % within Residual quintile group  8.7%    17.4%   34.8%   39.1%   100.0%
Total  Count                             9       16      34      23      82
       % within Residual quintile group  11.0%   19.5%   41.5%   28.0%   100.0%

Residual quintile group * AFL11D Crosstabulation

                                         AFL11D
Residual quintile group                  4.00    5.00    Total
WBE    Count                             10      20      30
       % within Residual quintile group  33.3%   66.7%   100.0%
AE     Count                             8       20      28
       % within Residual quintile group  28.6%   71.4%   100.0%
WAE    Count                             5       19      24
       % within Residual quintile group  20.8%   79.2%   100.0%
Total  Count                             23      59      82
       % within Residual quintile group  28.0%   72.0%   100.0%

Residual quintile group * AFL11E Crosstabulation

                                         AFL11E
Residual quintile group                  4.00    5.00    Total
WBE    Count                             12      19      31
       % within Residual quintile group  38.7%   61.3%   100.0%
AE     Count                             6       22      28
       % within Residual quintile group  21.4%   78.6%   100.0%
WAE    Count                             2       22      24
       % within Residual quintile group  8.3%    91.7%   100.0%
Total  Count                             20      63      83
       % within Residual quintile group  24.1%   75.9%   100.0%

Residual quintile group * Exemplary or model answers Crosstabulation

                                         Exemplary or model answers
Residual quintile group                  .00     2.00    3.00    4.00    5.00    Total
WBE    Count                             2       1       3       19      7       32
       % within Residual quintile group  6.3%    3.1%    9.4%    59.4%   21.9%   100.0%
AE     Count                             3       0       2       15      8       28
       % within Residual quintile group  10.7%   0.0%    7.1%    53.6%   28.6%   100.0%
WAE    Count                             1       0       1       15      8       25
       % within Residual quintile group  4.0%    0.0%    4.0%    60.0%   32.0%   100.0%
Total  Count                             6       1       6       49      23      85
       % within Residual quintile group  7.1%    1.2%    7.1%    57.6%   27.1%   100.0%

Residual quintile group * Success criteria Crosstabulation

                                         Success criteria
Residual quintile group                  .00     3.00    4.00    5.00    Total
WBE    Count                             2       5       14      11      32
       % within Residual quintile group  6.3%    15.6%   43.8%   34.4%   100.0%
AE     Count                             3       1       6       18      28
       % within Residual quintile group  10.7%   3.6%    21.4%   64.3%   100.0%
WAE    Count                             1       1       9       14      25
       % within Residual quintile group  4.0%    4.0%    36.0%   56.0%   100.0%
Total  Count                             6       7       29      43      85
       % within Residual quintile group  7.1%    8.2%    34.1%   50.6%   100.0%

Residual quintile group * Misconceptions Crosstabulation

                                         Misconceptions
Residual quintile group                  .00     3.00    4.00    5.00    Total
WBE    Count                             3       3       14      12      32
       % within Residual quintile group  9.4%    9.4%    43.8%   37.5%   100.0%
AE     Count                             3       2       10      13      28
       % within Residual quintile group  10.7%   7.1%    35.7%   46.4%   100.0%
WAE    Count                             1       3       8       13      25
       % within Residual quintile group  4.0%    12.0%   32.0%   52.0%   100.0%
Total  Count                             7       8       32      38      85
       % within Residual quintile group  8.2%    9.4%    37.6%   44.7%   100.0%

Residual quintile group * SOLO levels Crosstabulation

                                         SOLO levels
Residual quintile group                  .00     2.00    3.00    4.00    5.00    Total
WBE    Count                             9       10      7       4       2       32
       % within Residual quintile group  28.1%   31.3%   21.9%   12.5%   6.3%    100.0%
AE     Count                             9       4       8       7       0       28
       % within Residual quintile group  32.1%   14.3%   28.6%   25.0%   0.0%    100.0%
WAE    Count                             8       7       3       3       4       25
       % within Residual quintile group  32.0%   28.0%   12.0%   12.0%   16.0%   100.0%
Total  Count                             26      21      18      14      6       85
       % within Residual quintile group  30.6%   24.7%   21.2%   16.5%   7.1%    100.0%

Residual quintile group * QT model Crosstabulation

                                         QT model
Residual quintile group                  .00     2.00    3.00    4.00    5.00    Total
WBE    Count                             3       7       9       10      3       32
       % within Residual quintile group  9.4%    21.9%   28.1%   31.3%   9.4%    100.0%
AE     Count                             5       2       5       8       8       28
       % within Residual quintile group  17.9%   7.1%    17.9%   28.6%   28.6%   100.0%
WAE    Count                             3       3       5       7       7       25
       % within Residual quintile group  12.0%   12.0%   20.0%   28.0%   28.0%   100.0%
Total  Count                             11      12      19      25      18      85
       % within Residual quintile group  12.9%   14.1%   22.4%   29.4%   21.2%   100.0%

Residual quintile group * Bloom categories Crosstabulation

                                         Bloom categories
Residual quintile group                  .00     2.00    3.00    4.00    5.00    Total
WBE    Count                             3       3       12      10      4       32
       % within Residual quintile group  9.4%    9.4%    37.5%   31.3%   12.5%   100.0%
AE     Count                             3       1       3       13      8       28
       % within Residual quintile group  10.7%   3.6%    10.7%   46.4%   28.6%   100.0%
WAE    Count                             2       2       4       12      5       25
       % within Residual quintile group  8.0%    8.0%    16.0%   48.0%   20.0%   100.0%
Total  Count                             8       6       19      35      17      85
       % within Residual quintile group  9.4%    7.1%    22.4%   41.2%   20.0%   100.0%

Residual quintile group * Syllabus outcomes / standards Crosstabulation

                                         Syllabus outcomes / standards
Residual quintile group                  .00     2.00    3.00    4.00    5.00    Total
WBE    Count                             2       3       3       15      9       32
       % within Residual quintile group  6.3%    9.4%    9.4%    46.9%   28.1%   100.0%
AE     Count                             4       2       1       8       13      28
       % within Residual quintile group  14.3%   7.1%    3.6%    28.6%   46.4%   100.0%
WAE    Count                             1       1       1       7       15      25
       % within Residual quintile group  4.0%    4.0%    4.0%    28.0%   60.0%   100.0%
Total  Count                             7       6       5       30      37      85
       % within Residual quintile group  8.2%    7.1%    5.9%    35.3%   43.5%   100.0%

Residual quintile group * AFL13A Crosstabulation

                                         AFL13A
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             0       12      12      6       0       30
       % within Residual quintile group  0.0%    40.0%   40.0%   20.0%   0.0%    100.0%
AE     Count                             1       6       7       12      2       28
       % within Residual quintile group  3.6%    21.4%   25.0%   42.9%   7.1%    100.0%
WAE    Count                             0       5       9       7       2       23
       % within Residual quintile group  0.0%    21.7%   39.1%   30.4%   8.7%    100.0%
Total  Count                             1       23      28      25      4       81
       % within Residual quintile group  1.2%    28.4%   34.6%   30.9%   4.9%    100.0%

Residual quintile group * AFL13B Crosstabulation

                                         AFL13B
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             0       15      12      4       0       31
       % within Residual quintile group  0.0%    48.4%   38.7%   12.9%   0.0%    100.0%
AE     Count                             1       7       10      8       1       27
       % within Residual quintile group  3.7%    25.9%   37.0%   29.6%   3.7%    100.0%
WAE    Count                             1       6       10      5       1       23
       % within Residual quintile group  4.3%    26.1%   43.5%   21.7%   4.3%    100.0%
Total  Count                             2       28      32      17      2       81
       % within Residual quintile group  2.5%    34.6%   39.5%   21.0%   2.5%    100.0%

Residual quintile group * AFL13C Crosstabulation

                                         AFL13C
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             1       2       2       13      13      31
       % within Residual quintile group  3.2%    6.5%    6.5%    41.9%   41.9%   100.0%
AE     Count                             0       2       3       5       17      27
       % within Residual quintile group  0.0%    7.4%    11.1%   18.5%   63.0%   100.0%
WAE    Count                             0       1       0       8       14      23
       % within Residual quintile group  0.0%    4.3%    0.0%    34.8%   60.9%   100.0%
Total  Count                             1       5       5       26      44      81
       % within Residual quintile group  1.2%    6.2%    6.2%    32.1%   54.3%   100.0%

Residual quintile group * AFL13D Crosstabulation

                                         AFL13D
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             0       3       11      14      3       31
       % within Residual quintile group  0.0%    9.7%    35.5%   45.2%   9.7%    100.0%
AE     Count                             1       1       5       15      6       28
       % within Residual quintile group  3.6%    3.6%    17.9%   53.6%   21.4%   100.0%
WAE    Count                             0       1       2       16      4       23
       % within Residual quintile group  0.0%    4.3%    8.7%    69.6%   17.4%   100.0%
Total  Count                             1       5       18      45      13      82
       % within Residual quintile group  1.2%    6.1%    22.0%   54.9%   15.9%   100.0%

Residual quintile group * AFL13E Crosstabulation

                                         AFL13E
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             0       8       12      11      0       31
       % within Residual quintile group  0.0%    25.8%   38.7%   35.5%   0.0%    100.0%
AE     Count                             2       11      6       6       1       26
       % within Residual quintile group  7.7%    42.3%   23.1%   23.1%   3.8%    100.0%
WAE    Count                             2       8       7       5       1       23
       % within Residual quintile group  8.7%    34.8%   30.4%   21.7%   4.3%    100.0%
Total  Count                             4       27      25      22      2       80
       % within Residual quintile group  5.0%    33.8%   31.3%   27.5%   2.5%    100.0%

Residual quintile group * AFL13F Crosstabulation

                                         AFL13F
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             1       17      8       2       3       31
       % within Residual quintile group  3.2%    54.8%   25.8%   6.5%    9.7%    100.0%
AE     Count                             1       16      3       6       1       27
       % within Residual quintile group  3.7%    59.3%   11.1%   22.2%   3.7%    100.0%
WAE    Count                             0       9       7       5       2       23
       % within Residual quintile group  0.0%    39.1%   30.4%   21.7%   8.7%    100.0%
Total  Count                             2       42      18      13      6       81
       % within Residual quintile group  2.5%    51.9%   22.2%   16.0%   7.4%    100.0%

Residual quintile group * AFL14A Crosstabulation

                                         AFL14A
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             3       15      9       3       30
       % within Residual quintile group  10.0%   50.0%   30.0%   10.0%   100.0%
AE     Count                             2       6       15      5       28
       % within Residual quintile group  7.1%    21.4%   53.6%   17.9%   100.0%
WAE    Count                             1       5       10      7       23
       % within Residual quintile group  4.3%    21.7%   43.5%   30.4%   100.0%
Total  Count                             6       26      34      15      81
       % within Residual quintile group  7.4%    32.1%   42.0%   18.5%   100.0%

Residual quintile group * AFL14B Crosstabulation

                                         AFL14B
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             1       3       7       10      10      31
       % within Residual quintile group  3.2%    9.7%    22.6%   32.3%   32.3%   100.0%
AE     Count                             1       1       4       15      7       28
       % within Residual quintile group  3.6%    3.6%    14.3%   53.6%   25.0%   100.0%
WAE    Count                             0       0       3       8       11      22
       % within Residual quintile group  0.0%    0.0%    13.6%   36.4%   50.0%   100.0%
Total  Count                             2       4       14      33      28      81
       % within Residual quintile group  2.5%    4.9%    17.3%   40.7%   34.6%   100.0%

Residual quintile group * AFL14C Crosstabulation

                                         AFL14C
Residual quintile group                  3.00    4.00    5.00    Total
WBE    Count                             3       16      12      31
       % within Residual quintile group  9.7%    51.6%   38.7%   100.0%
AE     Count                             1       9       18      28
       % within Residual quintile group  3.6%    32.1%   64.3%   100.0%
WAE    Count                             1       8       15      24
       % within Residual quintile group  4.2%    33.3%   62.5%   100.0%
Total  Count                             5       33      45      83
       % within Residual quintile group  6.0%    39.8%   54.2%   100.0%

Residual quintile group * AFL14D Crosstabulation

                                         AFL14D
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             2       6       19      4       31
       % within Residual quintile group  6.5%    19.4%   61.3%   12.9%   100.0%
AE     Count                             1       3       10      14      28
       % within Residual quintile group  3.6%    10.7%   35.7%   50.0%   100.0%
WAE    Count                             0       5       7       12      24
       % within Residual quintile group  0.0%    20.8%   29.2%   50.0%   100.0%
Total  Count                             3       14      36      30      83
       % within Residual quintile group  3.6%    16.9%   43.4%   36.1%   100.0%

Residual quintile group * AFL14E Crosstabulation

                                         AFL14E
Residual quintile group                  3.00    4.00    5.00    Total
WBE    Count                             3       16      12      31
       % within Residual quintile group  9.7%    51.6%   38.7%   100.0%
AE     Count                             1       8       18      27
       % within Residual quintile group  3.7%    29.6%   66.7%   100.0%
WAE    Count                             0       9       14      23
       % within Residual quintile group  0.0%    39.1%   60.9%   100.0%
Total  Count                             4       33      44      81
       % within Residual quintile group  4.9%    40.7%   54.3%   100.0%

Residual quintile group * AFL14F Crosstabulation

                                         AFL14F
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             2       14      6       9       0       31
       % within Residual quintile group  6.5%    45.2%   19.4%   29.0%   0.0%    100.0%
AE     Count                             1       8       9       9       1       28
       % within Residual quintile group  3.6%    28.6%   32.1%   32.1%   3.6%    100.0%
WAE    Count                             1       5       8       9       1       24
       % within Residual quintile group  4.2%    20.8%   33.3%   37.5%   4.2%    100.0%
Total  Count                             4       27      23      27      2       83
       % within Residual quintile group  4.8%    32.5%   27.7%   32.5%   2.4%    100.0%

Residual quintile group * AFL14G Crosstabulation

                                         AFL14G
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             2       7       9       11      2       31
       % within Residual quintile group  6.5%    22.6%   29.0%   35.5%   6.5%    100.0%
AE     Count                             1       3       11      7       6       28
       % within Residual quintile group  3.6%    10.7%   39.3%   25.0%   21.4%   100.0%
WAE    Count                             0       4       6       9       5       24
       % within Residual quintile group  0.0%    16.7%   25.0%   37.5%   20.8%   100.0%
Total  Count                             3       14      26      27      13      83
       % within Residual quintile group  3.6%    16.9%   31.3%   32.5%   15.7%   100.0%

Residual quintile group * AFL14H Crosstabulation

                                         AFL14H
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             2       7       8       12      2       31
       % within Residual quintile group  6.5%    22.6%   25.8%   38.7%   6.5%    100.0%
AE     Count                             0       3       7       15      3       28
       % within Residual quintile group  0.0%    10.7%   25.0%   53.6%   10.7%   100.0%
WAE    Count                             0       2       4       12      6       24
       % within Residual quintile group  0.0%    8.3%    16.7%   50.0%   25.0%   100.0%
Total  Count                             2       12      19      39      11      83
       % within Residual quintile group  2.4%    14.5%   22.9%   47.0%   13.3%   100.0%

Residual quintile group * AFL15A Crosstabulation

                                         AFL15A
Residual quintile group                  2.00    4.00    5.00    Total
WBE    Count                             1       6       24      31
       % within Residual quintile group  3.2%    19.4%   77.4%   100.0%
AE     Count                             0       5       23      28
       % within Residual quintile group  0.0%    17.9%   82.1%   100.0%
WAE    Count                             0       8       16      24
       % within Residual quintile group  0.0%    33.3%   66.7%   100.0%
Total  Count                             1       19      63      83
       % within Residual quintile group  1.2%    22.9%   75.9%   100.0%

Residual quintile group * AFL15B Crosstabulation

                                         AFL15B
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             1       1       9       20      31
       % within Residual quintile group  3.2%    3.2%    29.0%   64.5%   100.0%
AE     Count                             0       1       5       22      28
       % within Residual quintile group  0.0%    3.6%    17.9%   78.6%   100.0%
WAE    Count                             0       1       7       15      23
       % within Residual quintile group  0.0%    4.3%    30.4%   65.2%   100.0%
Total  Count                             1       3       21      57      82
       % within Residual quintile group  1.2%    3.7%    25.6%   69.5%   100.0%

Residual quintile group * AFL15C Crosstabulation

                                         AFL15C
Residual quintile group                  2.00    3.00    4.00    5.00    Total
WBE    Count                             0       3       12      16      31
       % within Residual quintile group  0.0%    9.7%    38.7%   51.6%   100.0%
AE     Count                             1       0       7       20      28
       % within Residual quintile group  3.6%    0.0%    25.0%   71.4%   100.0%
WAE    Count                             0       1       9       13      23
       % within Residual quintile group  0.0%    4.3%    39.1%   56.5%   100.0%
Total  Count                             1       4       28      49      82
       % within Residual quintile group  1.2%    4.9%    34.1%   59.8%   100.0%

Residual quintile group * AFL15D Crosstabulation

                                         AFL15D
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             1       1       4       11      14      31
       % within Residual quintile group  3.2%    3.2%    12.9%   35.5%   45.2%   100.0%
AE     Count                             0       0       2       7       18      27
       % within Residual quintile group  0.0%    0.0%    7.4%    25.9%   66.7%   100.0%
WAE    Count                             0       0       0       8       13      21
       % within Residual quintile group  0.0%    0.0%    0.0%    38.1%   61.9%   100.0%
Total  Count                             1       1       6       26      45      79
       % within Residual quintile group  1.3%    1.3%    7.6%    32.9%   57.0%   100.0%

Residual quintile group * AFL15E Crosstabulation

                                         AFL15E
Residual quintile group                  1.00    2.00    3.00    4.00    5.00    Total
WBE    Count                             0       1       2       16      12      31
       % within Residual quintile group  0.0%    3.2%    6.5%    51.6%   38.7%   100.0%
AE     Count                             0       1       3       10      14      28
       % within Residual quintile group  0.0%    3.6%    10.7%   35.7%   50.0%   100.0%
WAE    Count                             1       1       1       7       14      24
       % within Residual quintile group  4.2%    4.2%    4.2%    29.2%   58.3%   100.0%
Total  Count                             1       3       6       33      40      83
       % within Residual quintile group  1.2%    3.6%    7.2%    39.8%   48.2%   100.0%

RESPONDENT DATA DISAGGREGATED INTO CASE STUDY (CS) SCHOOLS,

SCHOOLS THAT IDENTIFIED THEMSELVES AND REMAINDER (ANONYMOUS)

Case Processing Summary

                                                                                                Cases
                                                                                                Valid           Missing         Total
                                                                                                N     Percent   N     Percent   N     Percent
Residual quintile group * Gender * Status within residual quintile group                        78    91.8%     7     8.2%      85    100.0%
Residual quintile group * Teaching experience * Status within residual quintile group           78    91.8%     7     8.2%      85    100.0%
Residual quintile group * Science teacher by training * Status within residual quintile group   80    94.1%     5     5.9%      85    100.0%
Residual quintile group * Alternative quals * Status within residual quintile group             85    100.0%    0     0.0%      85    100.0%
Residual quintile group * Head teacher or not * Status within residual quintile group           81    95.3%     4     4.7%      85    100.0%
Residual quintile group * Highest qualification * Status within residual quintile group         79    92.9%     6     7.1%      85    100.0%
Residual quintile group * Year highest qual completed * Status within residual quintile group   85    100.0%    0     0.0%      85    100.0%
Residual quintile group * Where trained * Status within residual quintile group                 78    91.8%     7     8.2%      85    100.0%
Residual quintile group * Last taught Yr 7-9 classes * Status within residual quintile group    81    95.3%     4     4.7%      85    100.0%
Residual quintile group * Y8 classes at your school * Status within residual quintile group     79    92.9%     6     7.1%      85    100.0%
Residual quintile group * FT science teachers * Status within residual quintile group           78    91.8%     7     8.2%      85    100.0%
Residual quintile group * PT science teachers * Status within residual quintile group           64    75.3%     21    24.7%     85    100.0%

REFER TO THE QUESTIONNAIRE FOR THE KEY LINKING NUMBERS TO RESPONSE

Residual quintile group * Gender * Status within residual quintile group Crosstabulation

                                                    Gender
Status     Residual quintile group                  1       2       Total
UNKNOWN    WBE    Count                             13      6       19
                  % within Residual quintile group  68.4%   31.6%   100.0%
           AE     Count                             12      4       16
                  % within Residual quintile group  75.0%   25.0%   100.0%
           WAE    Count                             8       5       13
                  % within Residual quintile group  61.5%   38.5%   100.0%
           Total  Count                             33      15      48
                  % within Residual quintile group  68.8%   31.3%   100.0%
IDKNOWN    WBE    Count                             4       1       5
                  % within Residual quintile group  80.0%   20.0%   100.0%
           AE     Count                             3       3       6
                  % within Residual quintile group  50.0%   50.0%   100.0%
           WAE    Count                             3       1       4
                  % within Residual quintile group  75.0%   25.0%   100.0%
           Total  Count                             10      5       15
                  % within Residual quintile group  66.7%   33.3%   100.0%
CSSCHOOL   WBE    Count                             3       2       5
                  % within Residual quintile group  60.0%   40.0%   100.0%
           AE     Count                             3       1       4
                  % within Residual quintile group  75.0%   25.0%   100.0%
           WAE    Count                             5       1       6
                  % within Residual quintile group  83.3%   16.7%   100.0%
           Total  Count                             11      4       15
                  % within Residual quintile group  73.3%   26.7%   100.0%
Total      WBE    Count                             20      9       29
                  % within Residual quintile group  69.0%   31.0%   100.0%
           AE     Count                             18      8       26
                  % within Residual quintile group  69.2%   30.8%   100.0%
           WAE    Count                             16      7       23
                  % within Residual quintile group  69.6%   30.4%   100.0%
           Total  Count                             54      24      78
                  % within Residual quintile group  69.2%   30.8%   100.0%
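
In each layered crosstabulation the final Total block should equal the sum of the three status layers (UNKNOWN, IDKNOWN and CSSCHOOL). The sketch below checks this for the Gender counts reported above. The names `layers`, `reported_total` and `combine_layers` are illustrative only and are not part of the original analysis.

```python
# Gender counts (codes 1 and 2) by status layer and residual quintile group.
layers = {
    "UNKNOWN": {"WBE": (13, 6), "AE": (12, 4), "WAE": (8, 5)},
    "IDKNOWN": {"WBE": (4, 1), "AE": (3, 3), "WAE": (3, 1)},
    "CSSCHOOL": {"WBE": (3, 2), "AE": (3, 1), "WAE": (5, 1)},
}

# Counts shown in the Total block of the same table.
reported_total = {"WBE": (20, 9), "AE": (18, 8), "WAE": (16, 7)}


def combine_layers(layers):
    """Add the counts for each group across all status layers."""
    combined = {}
    for groups in layers.values():
        for group, row in groups.items():
            previous = combined.get(group, (0,) * len(row))
            combined[group] = tuple(a + b for a, b in zip(previous, row))
    return combined


print(combine_layers(layers) == reported_total)
```

The same check can be used to place counts in the correct column when a layer has an empty response category, since the three layers must add up to the Total block.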

Residual quintile group * Teaching experience * Status within residual quintile group Crosstabulation

                                                    Teaching experience
Status     Residual quintile group                  1       2       3       4       Total
UNKNOWN    WBE    Count                             8       2       1       8       19
                  % within Residual quintile group  42.1%   10.5%   5.3%    42.1%   100.0%
           AE     Count                             1       3       3       9       16
                  % within Residual quintile group  6.3%    18.8%   18.8%   56.3%   100.0%
           WAE    Count                             1       2       3       7       13
                  % within Residual quintile group  7.7%    15.4%   23.1%   53.8%   100.0%
           Total  Count                             10      7       7       24      48
                  % within Residual quintile group  20.8%   14.6%   14.6%   50.0%   100.0%
IDKNOWN    WBE    Count                                     1       0       4       5
                  % within Residual quintile group          20.0%   0.0%    80.0%   100.0%
           AE     Count                                     2       0       4       6
                  % within Residual quintile group          33.3%   0.0%    66.7%   100.0%
           WAE    Count                                     1       1       2       4
                  % within Residual quintile group          25.0%   25.0%   50.0%   100.0%
           Total  Count                                     4       1       10      15
                  % within Residual quintile group          26.7%   6.7%    66.7%   100.0%
CSSCHOOL   WBE    Count                                     1       2       2       5
                  % within Residual quintile group          20.0%   40.0%   40.0%   100.0%
           AE     Count                                     0       1       3       4
                  % within Residual quintile group          0.0%    25.0%   75.0%   100.0%
           WAE    Count                                     0       1       5       6
                  % within Residual quintile group          0.0%    16.7%   83.3%   100.0%
           Total  Count                                     1       4       10      15
                  % within Residual quintile group          6.7%    26.7%   66.7%   100.0%
Total      WBE    Count                             8       4       3       14      29
                  % within Residual quintile group  27.6%   13.8%   10.3%   48.3%   100.0%
           AE     Count                             1       5       4       16      26
                  % within Residual quintile group  3.8%    19.2%   15.4%   61.5%   100.0%
           WAE    Count                             1       3       5       14      23
                  % within Residual quintile group  4.3%    13.0%   21.7%   60.9%   100.0%
           Total  Count                             10      12      12      44      78
                  % within Residual quintile group  12.8%   15.4%   15.4%   56.4%   100.0%

Residual quintile group * Science teacher by training * Status within residual quintile group Crosstabulation

                                                    Science teacher by training
Status     Residual quintile group                  1       2       Total
UNKNOWN    WBE    Count                             16      3       19
                  % within Residual quintile group  84.2%   15.8%   100.0%
           AE     Count                             18      0       18
                  % within Residual quintile group  100.0%  0.0%    100.0%
           WAE    Count                             12      1       13
                  % within Residual quintile group  92.3%   7.7%    100.0%
           Total  Count                             46      4       50
                  % within Residual quintile group  92.0%   8.0%    100.0%
IDKNOWN    WBE    Count                             5               5
                  % within Residual quintile group  100.0%          100.0%
           AE     Count                             6               6
                  % within Residual quintile group  100.0%          100.0%
           WAE    Count                             4               4
                  % within Residual quintile group  100.0%          100.0%
           Total  Count                             15              15
                  % within Residual quintile group  100.0%          100.0%
CSSCHOOL   WBE    Count                             5               5
                  % within Residual quintile group  100.0%          100.0%
           AE     Count                             4               4
                  % within Residual quintile group  100.0%          100.0%
           WAE    Count                             6               6
                  % within Residual quintile group  100.0%          100.0%
           Total  Count                             15              15
                  % within Residual quintile group  100.0%          100.0%
Total      WBE    Count                             26      3       29
                  % within Residual quintile group  89.7%   10.3%   100.0%
           AE     Count                             28      0       28
                  % within Residual quintile group  100.0%  0.0%    100.0%
           WAE    Count                             22      1       23
                  % within Residual quintile group  95.7%   4.3%    100.0%
           Total  Count                             76      4       80
                  % within Residual quintile group  95.0%   5.0%    100.0%

Residual quintile group * Alternative quals * Status within residual quintile group Crosstabulation

Status within residual quintile group Alternative quals
Total .0

UNKNOWN Residual quintile group
WBE Count 18 4 22
% within Residual quintile group 81.8% 18.2% 100.0%
AE Count 17 1 18
% within Residual quintile group 94.4% 5.6% 100.0%
WAE Count 12 3 15
% within Residual quintile group 80.0% 20.0% 100.0%
Total Count 47 8 55
% within Residual quintile group 85.5% 14.5% 100.0%

IDKNOWN Residual quintile group
WBE Count 5 0 5
% within Residual quintile group 100.0% 0.0% 100.0%
AE Count 5 1 6
% within Residual quintile group 83.3% 16.7% 100.0%
WAE Count 4 0 4
% within Residual quintile group 100.0% 0.0% 100.0%
Total Count 14 1 15
% within Residual quintile group 93.3% 6.7% 100.0%

CSSCHOOL Residual quintile group
WBE Count 3 2 5
% within Residual quintile group 60.0% 40.0% 100.0%
AE Count 4 0 4
% within Residual quintile group 100.0% 0.0% 100.0%
WAE Count 5 1 6
% within Residual quintile group 83.3% 16.7% 100.0%
Total Count 12 3 15
% within Residual quintile group 80.0% 20.0% 100.0%

Total Residual quintile group
WBE Count 26 6 32
% within Residual quintile group 81.3% 18.8% 100.0%
AE Count 26 2 28
% within Residual quintile group 92.9% 7.1% 100.0%
WAE Count 21 4 25
% within Residual quintile group 84.0% 16.0% 100.0%
Total Count 73 12 85
% within Residual quintile group 85.9% 14.1% 100.0%


Residual quintile group * Head teacher or not * Status within residual quintile group Crosstabulation

Status within            Residual                                          Head teacher or not
residual quintile group  quintile group                                    1        2        Total
UNKNOWN                  WBE             Count                             4        15       19
                                         % within Residual quintile group  21.1%    78.9%    100.0%
                         AE              Count                             6        12       18
                                         % within Residual quintile group  33.3%    66.7%    100.0%
                         WAE             Count                             5        9        14
                                         % within Residual quintile group  35.7%    64.3%    100.0%
                         Total           Count                             15       36       51
                                         % within Residual quintile group  29.4%    70.6%    100.0%
IDKNOWN                  WBE             Count                             4        1        5
                                         % within Residual quintile group  80.0%    20.0%    100.0%
                         AE              Count                             4        2        6
                                         % within Residual quintile group  66.7%    33.3%    100.0%
                         WAE             Count                             4        0        4
                                         % within Residual quintile group  100.0%   0.0%     100.0%
                         Total           Count                             12       3        15
                                         % within Residual quintile group  80.0%    20.0%    100.0%
CSSCHOOL                 WBE             Count                             4        1        5
                                         % within Residual quintile group  80.0%    20.0%    100.0%
                         AE              Count                             4        0        4
                                         % within Residual quintile group  100.0%   0.0%     100.0%
                         WAE             Count                             4        2        6
                                         % within Residual quintile group  66.7%    33.3%    100.0%
                         Total           Count                             12       3        15
                                         % within Residual quintile group  80.0%    20.0%    100.0%
Total                    WBE             Count                             12       17       29
                                         % within Residual quintile group  41.4%    58.6%    100.0%
                         AE              Count                             14       14       28
                                         % within Residual quintile group  50.0%    50.0%    100.0%
                         WAE             Count                             13       11       24
                                         % within Residual quintile group  54.2%    45.8%    100.0%
                         Total           Count                             39       42       81
                                         % within Residual quintile group  48.1%    51.9%    100.0%
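
As a check on how the "% within Residual quintile group" rows relate to the counts, the following minimal Python sketch recomputes the row percentages for the Total layer of the Head teacher or not crosstabulation. The counts are copied from the table above; the function name row_percentages and the dictionary layout are illustrative assumptions and are not part of the original SPSS output.

```python
# Counts from the "Total" layer of the Head teacher or not crosstabulation.
# Each tuple holds the counts for categories 1 and 2 (labels as in the table).
counts = {
    "WBE": (12, 17),
    "AE": (14, 14),
    "WAE": (13, 11),
    "Total": (39, 42),
}


def row_percentages(row):
    """Return each cell as a percentage of its row total, to one decimal place."""
    total = sum(row)
    return [round(100 * n / total, 1) for n in row]


for group, row in counts.items():
    # Prints the group label, its row percentages and the row total.
    print(group, row_percentages(row), sum(row))
```

Each percentage is the cell count divided by the row total for that residual quintile group, so every row sums to 100% apart from rounding.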


Residual quintile group * Highest qualification * Status within residual quintile group Crosstabulation

Status within            Residual                                          Highest qualification
residual quintile group  quintile group                                    1        2        3        4        5        Total
UNKNOWN                  WBE             Count                             10       5        1        1        0        17
                                         % within Residual quintile group  58.8%    29.4%    5.9%     5.9%     0.0%     100.0%
                         AE              Count                             11       2        3        0        2        18
                                         % within Residual quintile group  61.1%    11.1%    16.7%    0.0%     11.1%    100.0%
                         WAE             Count                             9        1        2        0        2        14
                                         % within Residual quintile group  64.3%    7.1%     14.3%    0.0%     14.3%    100.0%
                         Total           Count                             30       8        6        1        4        49
                                         % within Residual quintile group  61.2%    16.3%    12.2%    2.0%     8.2%     100.0%
IDKNOWN                  WBE             Count                             5        0                                   5
                                         % within Residual quintile group  100.0%   0.0%                                100.0%
                         AE              Count                             5        1                                   6
                                         % within Residual quintile group  83.3%    16.7%                               100.0%
                         WAE             Count                             2        2                                   4
                                         % within Residual quintile group  50.0%    50.0%                               100.0%
                         Total           Count                             12       3                                   15
                                         % within Residual quintile group  80.0%    20.0%                               100.0%
CSSCHOOL                 WBE             Count                             4        1        0                          5
                                         % within Residual quintile group  80.0%    20.0%    0.0%                       100.0%
                         AE              Count                             4        0        0                          4
                                         % within Residual quintile group  100.0%   0.0%     0.0%                       100.0%
                         WAE             Count                             5        0        1                          6
                                         % within Residual quintile group  83.3%    0.0%     16.7%                      100.0%
                         Total           Count                             13       1        1                          15
                                         % within Residual quintile group  86.7%    6.7%     6.7%                       100.0%
Total                    WBE             Count                             19       6        1        1        0        27
                                         % within Residual quintile group  70.4%    22.2%    3.7%     3.7%     0.0%     100.0%
                         AE              Count                             20       3        3        0        2        28
                                         % within Residual quintile group  71.4%    10.7%    10.7%    0.0%     7.1%     100.0%
                         WAE             Count                             16       3        3        0        2        24
                                         % within Residual quintile group  66.7%    12.5%    12.5%    0.0%     8.3%     100.0%
                         Total           Count                             55       12       7        1        4        79
                                         % within Residual quintile group  69.6%    15.2%    8.9%     1.3%     5.1%     100.0%

Residual quintile group * Where trained * Status within residual quintile group Crosstabulation

Status within            Residual                                          Where trained
residual quintile group  quintile group                                    1        2        3        Total
UNKNOWN                  WBE             Count                             0        2        15       17
                                         % within Residual quintile group  0.0%     11.8%    88.2%    100.0%
                         AE              Count                             4        2        12       18
                                         % within Residual quintile group  22.2%    11.1%    66.7%    100.0%
                         WAE             Count                             2        1        11       14
                                         % within Residual quintile group  14.3%    7.1%     78.6%    100.0%
                         Total           Count                             6        5        38       49
                                         % within Residual quintile group  12.2%    10.2%    77.6%    100.0%
IDKNOWN                  WBE             Count                                               5        5
                                         % within Residual quintile group                    100.0%   100.0%
                         AE              Count                                               6        6
                                         % within Residual quintile group                    100.0%   100.0%
                         WAE             Count                                               4        4
                                         % within Residual quintile group                    100.0%   100.0%
                         Total           Count                                               15       15
                                         % within Residual quintile group                    100.0%   100.0%
CSSCHOOL                 WBE             Count                             0                 5        5
                                         % within Residual quintile group  0.0%              100.0%   100.0%
                         AE              Count                             1                 3        4
                                         % within Residual quintile group  25.0%             75.0%    100.0%
                         WAE             Count                             1                 4        5
                                         % within Residual quintile group  20.0%             80.0%    100.0%
                         Total           Count                             2                 12       14
                                         % within Residual quintile group  14.3%             85.7%    100.0%
Total                    WBE             Count                             0        2        25       27
                                         % within Residual quintile group  0.0%     7.4%     92.6%    100.0%
                         AE              Count                             5        2        21       28
                                         % within Residual quintile group  17.9%    7.1%     75.0%    100.0%
                         WAE             Count                             3        1        19       23
                                         % within Residual quintile group  13.0%    4.3%     82.6%    100.0%
                         Total           Count                             8        5        65       78
                                         % within Residual quintile group  10.3%    6.4%     83.3%    100.0%


REFERENCES

AAS,AustralianAcademyofSciences.(2016).PrimaryConnections:Linkingscience

withliteracy.Retrievedfromhttps://primaryconnections.org.au/about

AAS,AustralianAcademyofScience.(2017).SciencebyDoing--Home.Retrieved

fromhttps://www.sciencebydoing.edu.au/

ABS,AustralianBureaofStatistics.(2018).4221.0SchoolsAustralia,2017.

Retrievedfrom

http://www.abs.gov.au/ausstats/[email protected]/PrimaryMainFeatures/4221.0?

OpenDocument

ACACA,AustralianCurriculum,AssessmentandCertificationAuthorities.(2018).

Home.Retrievedfrom

http://www.acaca.edu.au/index.php/schooling/assessment-and-

reporting/

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2013a).

GuidetounderstandingICSEA(IndexofCommunitySocio-educational

Advantage)valuesfrom2013onwards.Retrievedfrom

https://acaraweb.blob.core.windows.net/resources/Guide_to_understandi

ng_ICSEA_values.pdf

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2013b).

NAPLAN.Retrievedfromhttp://www.nap.edu.au/naplan/naplan.html

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2014a).2012

NAPSLPublicReport.Retrievedfromhttp://www.nap.edu.au/results-

and-reports/national-reports.html

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2014b).

ICSEA2013:TechnicalReport.Retrievedfrom

https://www.myschool.edu.au/MoreInformation

Page 451: Exploring The Impact of a Largescale Diagnostic Science

430

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2014c).The

AustralianCurriculum:ScienceF-10.Retrievedfrom

http://www.australiancurriculum.edu.au/Download/F10

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2015).ICSEA

2014:TechnicalReport.Retrievedfrom

https://www.myschool.edu.au/media/1033/icsea_2014_technical_report.p

df

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2016a).

Aboutus.Retrievedfromhttps://www.acara.edu.au/about-us

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2016b).My

Schoolwebsite/About.Retrievedfrom

http://www.myschool.edu.au/about/

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2016c).

Reporting.Retrievedfromhttps://www.acara.edu.au/reporting

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2016d).NAP

website:Welcome.Retrievedfromhttp://www.nap.edu.au/about/why-

nap.html

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2017).NAP

SampleAssessments-ScienceLiteracy.Retrievedfrom

http://www.nap.edu.au/nap-sample-assessments/science-literacy

ACARA,AustralianCurriculumAssessmentandReportingAuthority.(2018).

Science:SequenceofAchievement:7-10.Retrievedfrom

http://docs.acara.edu.au/resources/Science_Sequence_of_achievement.pdf

ACER,AustralianCouncilforEducationalResearch.(2004a).ScienceEducation

AssessmentResource(SEAR):FinalReport.Retrievedfrom

http://cms.curriculum.edu.au/sear/newcms/view_page.asp?page_id=3526

Page 452: Exploring The Impact of a Largescale Diagnostic Science

431

ACER,AustralianCouncilforEducationalResearch.(2004b).ScientificLiteracy

ProgressMap(pp4).Melbourne,Victoria:AustralianCouncilfor

EducationalResearch.

AECRC,AustralianEducationCouncilReviewCommittee.(1992).Key

Competencies.ReportoftheCommitteetoadvisetheAustralianEducation

CouncilandMinistersofVocationalEducation,EmploymentandTrainingon

employment-relatedKeyCompetenciesforpostcompulsoryeducationand

training.Retrievedfromhttp://www.voced.edu.au/content/ngv%3A28045

ARG,AssessmentReformGroup.(2002a).AssessmentforLearning:10principles.

Retrievedfromhttp://www.nuffieldfoundation.org/assessment-reform-

group

ARG,AssessmentReformGroup.(2002b).Testing,MotivationandLearning.

Retrievedfromhttp://www.nuffieldfoundation.org/assessment-reform-

group

ARG,AssessmentReformGroup.(2006).Theroleofteachersintheassessmentof

learning.Retrievedfromhttp://www.nuffieldfoundation.org/assessment-

reform-group

Au,W.(2007).High-StakesTestingandCurricularControl:AQualitative

Metasynthesis.EducationalResearcher,36,11.

Australia,Commonwealthof.(2001).BackingAustralia'sAbility:Anactionplanfor

thefuture.Canberra:CommonwealthofAustralia.Retrievedfrom

https://trove.nla.gov.au/work/34335833.

Ball,S.,Rae,I.,&Tognolini,J.(2000).AreportfortheNationalEducation

PerformanceMonitoringTaskforce:optionsfortheassessmentand

reportingofprimarystudentsinthekeylearningareaofsciencetobeused

inthereportingofnationallycomparableoutcomesofschoolingwithinthe

contextoftheNationalGoalsforSchoolingintheTwenty-FirstCentury.

Retrievedfrom

Page 453: Exploring The Impact of a Largescale Diagnostic Science

432

http://educationcouncil.edu.au/site/DefaultSite/filesystem/documents/Re

ports%20and%20publications/Archive%20Publications/Measuring%20an

d%20Reporting%20Student%20Performance/Assessment_Primary_Stude

nts_Science-Context_National_Goals.pdf

Batterham,R.(2000).TheChancetoChange:FinalReport.Canberra:Departmentof

Science,IndustryandResources.

Bell,B.,&Cowie,B.(2002).FormativeAssessmentandScienceEducation(Vol.12).

Dordrecht:Kluwer.

Beveridge,M.(1985).Thedevelopmentofyoungchildrens'understandingofthe

processofevaporation.BritishJournalofEducationalpsychology,55,84-90.

Biggs,J.(1995).AssessingforLearning:Somedimensionsunderlyingnew

approachestoeducationalassessment.AlbertaJournalofEducational

Research,41(1),18.

Biggs,J.(1998).AssessmentandClassroomLearning:aroleforsummative

assessment?AssessmentinEducation:Principles,Policy&Practice,5(1),

103-110.doi:10.1080/0969595980050106

Biggs,J.(1999).WhattheStudentDoes:teachingforenhancedlearning.Higher

EducationResearch&Development,18(1),57-75.

Biggs,J.,&Collis,K.(1982).EvaluatingtheQualityofLearning:TheSOLO(Structure

oftheObservedLearningOutcome)Taxonomy.NewYork:AcademicPress.

Biggs,J.,&Collis,K.(1991).Multimodallearningandthequalityofintelligent

behaviour.InHelgaRowe(Ed.),Intelligence:Reconceptualizationand

measurement(pp.57-76).Melbourne,Victoria:ACER.

Billett,S.(1996).Situatedlearning:Bridgingsocioculturalandcognitivetheorising.

LearningandInstruction,6(3),263-280.

Page 454: Exploring The Impact of a Largescale Diagnostic Science

433

Black,P.(2007).Fullmarksforfeedback.MakingtheGrade(JournaloftheInstitute

ofEducationalAssessors),Spring2007,18-21.

Black,P.(2013).FormativeandSummativeAspectsofAssessment:Theoretical

andResearchFoundationsintheContextofPedagogy.SAGEHandbookof

ResearchonClassroomAssessment.SAGEPublications,Inc.ThousandOaks,

CA:SAGEPublications,Inc.

Black,P.,Harrison,C.,Lee,C.,Marshall,B.,&Wiliam,D.(2004).Workinginsidethe

blackbox:assessmentforlearningintheclassroom.London:NFER-Nelson.

Black,P.,McCormick,R.,James,M.,&Pedder,D.(2006).LearningHowtoLearn

andAssessmentforLearning:atheoreticalinquiry.ResearchPapersin

Education-SpecialIssue,21(2),119-132.

Black,P.,&Wiliam,D.(1998a).AssessmentandClassroomLearning.Assessmentin

Education:Principles,Policy&Practice,5(1),7-74.

doi:10.1080/0969595980050102

Black,P.,&Wiliam,D.(1998b).Insidetheblackbox:raisingstandardsthrough

classroomassessment.London:King'sCollegeLondon.Dept.ofEducation&

ProfessionalStudies.

Black,P.,&Wiliam,D.(2005).Changingteachingthroughformativeassessment:

Researchandpractice.CERI,2005,223-240.Retrievedfrom

http://www.oecd.org/education/ceri/35337920.pdf

Black,P.,&Wiliam,D.(2009).Developingthetheoryofformativeassessment.

EducationalAssessment,EvaluationandAccountability(formerly:Journalof

PersonnelEvaluationinEducation),21(1),5-31.doi:10.1007/s11092-008-

9068-5

Bloom,B.,Engelhart,M.,Furst,E.,Hill,W.,&Krathwohl,D.(Eds.).(1956).

Taxonomyofeducationalobjectives:Theclassificationofeducationalgoals.

Handbook1:Cognitivedomain.NewYork:DavidMcKay.

Page 455: Exploring The Impact of a Largescale Diagnostic Science

434

Bøe,M.V.,Henriksen,E.K.,Lyons,T.,&Schreiner,C.(2013).Participationin

scienceandtechnology:youngpeople'sachievement-relatedchoicesinlate-

modernsocieities.StudiesinScienceEducation,47(1),37-72.

doi:10.1080/03057267.2011.549621

Boekaerts,M.,&Corno,L.(2005).Self-RegulationintheClassroom:APerspective

onAssessmentandIntervention.AppliedPsychology:AnInternational

Review,54(2),199-231.

Boekaerts,M.,Maes,S.,&Karoly,P.(2005).Self-RegulationAcrossDomainsof

AppliedPsychology:IsthereanEmergingConsensus?AppliedPsychology:

AnInternationalReview,54(2),15.

BOS,BoardofStudiesNSW.(2003).ScienceYears7-10Syllabus(Vol.2013).

Sydney:BoardofStudiesNSW.

BOS,BoardofStudiesNSW.(2011).SchoolCertificateReview--DiscussionPaper.

Sydney:BoardofStudiesNSW.

BOS,BoardofStudiesNSW.(2013).AbouttheCommonGradeScale.Retrievedfrom

http://arc.nesa.nsw.edu.au/go/7-8/common-grade-scale/.

BOS,BoardofStudiesNSW.(n.d.).Stage5CoursePerformanceDescriptors--

Science.Retrievedfromhttp://arc.nesa.nsw.edu.au/go/9-10/stage-5-

grading/cpds/index/science

BOSTES,BoardofStudies,TeachingandEducationalStandardsNSW.(2012).NSW

SyllabusesfortheAustralianCurriculum:ScienceK-10.Retrievedfrom

http://syllabus.bostes.nsw.edu.au/science/

Boyle,S.,Fahey,E.,Loughran,J.,&Mitchell,I.(2001).Classroomresearchintogood

learningbehaviours.EducationalActionResearch,9(2),27.

doi:10.1080/09650790100200149

Page 456: Exploring The Impact of a Largescale Diagnostic Science

435

Broadfoot,P.(2009).Foreword.InJ.Cumming&C.Wyatt-Smith(Eds.),

Educationalassessmentinthe21stcentury:connectingtheoryandpractice

(pp.309).Dordrecht:Springer.Retrievedfrom

http://www.lib.uts.edu.au/sso/goto.php?url=http://dx.doi.org/10.1007/9

78-1-4020-9964-9.doi:10.1007/978-1-4020-9964-9_3

Broadfoot,P.,&Black,P.(2004).Redefiningassessment?Thefirsttenyearsof

AssessmentinEducation.AssessmentinEducation:Principles,Policy&

Practice,11(1),7-27.doi:10.1080/0969594042000208976

Brookhart,S.(2003).DevelopingMeasurementTheoryforClassroomAssessment

PurposesandUses.EducationalMeasurement:IssuesandPractice,22(4),5-

12.

Bryman,A.(2012).SocialScienceMethods.NewYork:OxfordUniversityPress.

CC,CurriculumCorporation.(n.d.).AssessmentforLearningwebsite.Retrieved

fromhttp://www.assessmentforlearning.edu.au/default.asp

CERI,OECD-CentreforEducationalResearchandInformation.(2005).Formative

Assessment:Improvinglearninginsecondaryclassrooms.Retrievedfrom

http://www.oecd.org/education/ceri/35661078.pdf

CERI,OECD-CentreforEducationalResearchandInformation.(2008).21st

CenturyLearning:Research,InnovationandPolicy.Directionsfromrecent

OECDanalysis(pp.13).Retrievedfrom

http://www.oecd.org/site/educeri21st/40554299.pdf

CGCS,CounciloftheGreatCitySchools.(2015).StudentTestinginAmerica'sGreat

CitySchools:AnInventoryandPreliminaryAnalysis.Retrievedfrom

http://www.cgcs.org/cms/lib/DC00001581/Centricity/Domain/87/Testin

gReport.pdf

Chubb,Ian.(2012).Mathematics,Engineering&ScienceintheNationalInterest.

RetrievedfromCanberra:

Page 457: Exploring The Impact of a Largescale Diagnostic Science

436

http://www.chiefscientist.gov.au/category/archives/mathematics-

engineering-and-science-report/

Clark,I.(2012).FormativeAssessment:AssessmentIsforSelf-regulatedLearning.

EducationalPsychologyReview,24,205-249.doi:10.1007/s10648-011-

9191-6

CommonwealthofAustraliaConstitutionAct(TheConstitution).Retrievedfrom

https://www.legislation.gov.au/Details/C2013Q00005

Connell,R.W.(1985).TheCompetitiveAcademicCurriculum.InD.Cohen&T.

Maxwell(Eds.),BlockedattheEntrance:Context,CasesandCommentaryon

CurriculumChange:EntrancePublications.

Cooney,G.(2006).ReviewofAssessmentsinthecontextofNational

Developments.Retrievedfrom

https://www.det.nsw.edu.au/media/downloads/dethome/yr2007/cooney

reviewfll.pdf

Corcoran,T.,Mosher,F.A.,&Rogat,A.(2009).LearningProgressionsinScience:An

evidence-basedapproachtoreform.Retrievedfromhttp://www.cpre.org/

andhttp://www.ccii-cpre.org/.

Corrigan,D.,Gunstone,R.,&Jones,A.(Eds.).(2013).ValuingAssessmentinScience

Education:Pedagogy,Curriculum,Policy:Dordrecht:Springer.

Cowie,B.(2005).Studentcommentaryonclassroomassessmentinscience:a

socioculturalinterpretation.InternationalJournalofScienceEducation,

27(2),199-214.doi:10.1080/0950069042000276721

Cowie,B.(2013).AssessmentintheScienceClassroom:Priorities,Practices,and

Prospects.SAGEHandbookofResearchonClassroomAssessment.SAGE

Publications,Inc.ThousandOaks,CA:SAGEPublications,Inc.

Page 458: Exploring The Impact of a Largescale Diagnostic Science

437

Cowie,B.,&Bell,B.(1999).AModelofFormativeAssessmentinScience.

AssessmentinEducation:Principles,Policy&Practice,6(1),101-116.

doi:10.1080/09695949993026

Creswell,J.W.(2012).EducationalResearch:Planning,Conducting,andEvaluating

QuantitativeandQualitativeResearch(4e).Boston,MA:Pearson.

Creswell,J.W.,&PlanoClark,V.L.(2011).DesigningandConductingMixed

MethodsResearch(2E).ThousandOaks,CA:SAGE.

CRTTE,CommitteefortheReviewofTeachingandTeacherEducation.(2003).

Australia'sTeachers:Australia'sFuture.AdvancingInnovation,Science,

TechnologyandMathematics.(DowReport).Canberra,ACT:Departmentof

Education,ScienceandTraining.

Cumming,Joy,&Wyatt-Smith,Claire.(2009).Educationalassessmentinthe21st

century:connectingtheoryandpractice(pp.xxv).Retrievedfrom

http://www.lib.uts.edu.au/sso/goto.php?url=http://dx.doi.org/10.1007/9

78-1-4020-9964-9doi:10.1007/978-1-4020-9964-9_3

CURASS,AustralianEducationCouncilCurriculumandAssessmentCommittee.

(1994).Science--acurriculumprofileforAustralianschools.Carlton,Victoria,

Australia:CurriculumCorporation.

Dann,R.(2002).Promotingassessmentaslearning:improvingthelearningprocess.

London:Routledge/Falmer.

Darling-Hammond,L.(2003).StandardsandAssessments:WhereWeAreand

WhatWeNeed.TeachersCollegeRecord.#11109(2/16/2003).

http://www.tcrecord.org.

Deakin-Crick,R.,Broadfoot,P.,&Claxton,G.(2004).DevelopinganEffective

LifelongLearningInventory:theELLIproject.AssessmentinEducation,

11(3),247-272.doi:10.1080/0969594042000304582

Page 459: Exploring The Impact of a Largescale Diagnostic Science

438

DEC,NSWDepartmentofEducationandCommunities.(2014).ESSAtestbooklet

(pp.31).Sydney:DepartmentofEducationandCommunities

DEC,NSWDepartmentofEducationandCommunities.(2015).EssentialSecondary

ScienceAssessment2014statereport.Sydney:NSWDepartmentof

EducationandCommunities.

Denzin,N.K.,&Lincoln,Y.S.(Eds.).(2011).TheSAGEHandbookofQualitative

Research4E.WashingtonDC:SAGE.

DES,DepartmentforEducationandSkills.(2003).21stCenturySkills.Realising

ourPotential.Individuals,Employers,Nation.Retrievedfrom

https://www.gov.uk/government/uploads/system/uploads/attachment_d

ata/file/336816/21st_Century_Skills_Realising_Our_Potential.pdf

DET,NSWDepartmentofEducationandTraining.(2003).QualityTeachinginNSW

PublicSchools:DiscussionPaper.Sydney,Australia:NSWDET,Professional

SupportandCurriculumDirectorate.

DET,NSWDepartmentofEducationandTraining.(2006).QualityteachinginNSW

publicschools:Anassessmentpracticeguide.Sydney:DET.

DET,NSWDepartmentofEducationandTraining.(2007).ESSAreportforparents

(2006example).Retrievedfrom

http://www.schools.nsw.edu.au/learning/7-

12assessments/essa/essasmart.php

DET,NSWDepartmentofEducationandTraining.(2008).PrinciplesofAssessment

andReportinginNSWPublicSchools.Sydney:NSWDepartmentof

EducationandTraining.Retrievedfrom

http://www.curriculumsupport.education.nsw.gov.au/timetoteach/assess/

princep_ass.htm.

DET,NSWDepartmentofEducationandTraining.(2011).EssentialSecondary

ScienceAssessment2011framework.Sydney:EMSAD.

Page 460: Exploring The Impact of a Largescale Diagnostic Science

439

DET,NSWDepartmentofEducationandTraining.(2015).DN/15/00033:Critical

changestoEssentialSecondaryScienceAssessment(ESSA)Sydney.Sydney:

DepartmentofEducationandTraining.

Driver,R.,&Easley,J.(1978).Pupilsandparadigms:Areviewofliteraturerelated

toconceptdevelopmentinadolescentsciencestudents.StudiesinScience

Education,5,61-84.

Dulfer,N.,Polesel,J.,&Rice,S.(2012).TheExperienceofEducation:Theimpactsof

highstakestestingonschoolstudentsandtheirfamilies.AnEducator’s

Perspective.Rydalmere:TheWhitlamInstitute.

Duschl,R.A.,Schweingruber,H.A.,&Shouse,A.W.(Eds.).(2007).TakingScienceto

School:LearningandTeachingScienceinGradesK-8.Washington,DC:The

NationalAcademiesPress.

EAA,EducationalAssessmentAustralia.(2018).ICASScience.Retrievedfrom

https://www.eaa.unsw.edu.au/icas/subjects/science

Earl,K.,&Giles,D.(2011).An-otherLookatAssessment:AssessmentinLearning.

NewZealandJournalofTeachers'Work,8(1),11-20.

ESA,EducationServicesAustralia.(n.d.).Home.Retrievedfrom

https://www.esa.edu.au/solutions/our-solutions

ESA,EducationServicesAustralia.(2012).NationalDigitalLearningResources

Network:ScienceRetrievedfrom

http://www.ndlrn.edu.au/using_digital_resources/australian_curriculum_r

esources/science.html

Fensham,P.(2013).InternationalAssessmentsofScienceLearning:TheirPositive

andNegativeContributionstoScienceEducation.InD.Corrigan,R.

Gunstone,&A.Jones(Eds.),ValuingAssessmentinScienceEducation:

Pedagogy,Curriculum,Policy(pp11-32).Dordrecht:Springer.

Page 461: Exploring The Impact of a Largescale Diagnostic Science

440

Fensham,P.,&Rennie,L.(2013).TowardsanAuthenticallyAssessedScience

Curriculum.InD.Corrigan,R.Gunstone,&A.Jones(Eds.),Valuing

AssessmentinScienceEducation:Pedagogy,Curriculum,Policy(pp69-100).

Dordrecht:Springer.

Flyvbjerg,B.(2011).CaseStudy.InN.K.Denzin&Y.S.Lincoln(Eds.),TheSAGE

HandbookofQualitativeResearch(4e)(pp.301-316).ThousandOaks,CA:

Sage.

Fraser,B.L.(1978).Developmentofatestofscience-relatedattitudes.Science

Education,62(509-515).

Frey,B.B.,&Schmitt,V.L.(2007).ComingtoTermsWithClassroomAssessment.

JournalofAdvancedAcademics,18(3),402-423.

Ginsberg,H.,&Opper,S.(1979).Piaget'sTheoryofIntellectualDevelopment(2nd

ed.).NewJersey:Prentice-HallInc.

Goodrum,D.,Rennie,L.J.,&Hackling,M.(2001).TheStatusandQualityofTeaching

andLearningofScienceinAustralianSchools.Canberra:Departmentof

Education,TrainingandYouthAffairs

Gipps,C.(1999).Chapter10:Socio-CulturalAspectsofAssessment.Reveiwof

ResearchinEducation.doi:10.3102/0091732X024001355

Goodrum,D.,&Rennie,L.J.(2007).AustralianSchoolScienceEducationNational

Plan2008-2012Volume1TheNationalActionPlan.Canberra,Department

ofEducation,ScienceandTraining.

Griffin,P.(2009).Teachers'UseofAssessmentData.InC.Wyatt-Smith&J.J.

Cumming(Eds.),EducationalAssessmentinthe21stCentury(pp.25).

Dordrecht:Springer.doi:10.1007/978-1-4020-9964-9

Guba,E.G.(1981).Criteriaforassessingthetrustworthinessofnaturalistic

inquiries.EducationalCommunitcationandTechnologyJournal,29(2).

Page 462: Exploring The Impact of a Largescale Diagnostic Science

441

Hackling,M.(2004).Chaptereight:Assessmentinscience.InG.J.Venville&V.M.

Dawson(Eds.),TheArtofTeachingScienceformiddleandsecondaryschool

(pp.126-144).CrowsNest,NSW:Allen&Unwin.

Hackling,M.,Peers,S.,&Prain,V.(2007).PrimaryConnections:Reformingscience

teachinginAustralianprimaryschools.TeachingScience,53(3),12-16.

Hammersley,M.(2008).QuestioningQualitativeInquiry:CriticalEssays.Los

Angeles,CA:SAGE.

Hand,B.,Yore,L.D.,Jagger,S.,&Prain,V.(2010).Connectingresearchinscience

literacyandclassroompractice:areviewofscienceteachingjournalsin

Australia,theUKandtheUnitedStates,1998-2008.StudiesinScience

Education,46(1),45-68.doi:10.1080/03057260903562342

Hargreaves,E.(2005).AssessmentforLearning?Thinkingoutsidethe(black)box.

CambridgeJournalofEducation,35(2),213-224.

Harlen,W.(2004).Asystematicreviewoftheevidenceofreliabilityandvalidityof

assessmentbyteachersusedforsummativepurposes.Retrievedfrom

http://eppi.ioe.ac.uk/cms/Default.aspx?tabid=116

Harlen,W.(2005).Trustingteachers’judgement:researchevidenceofthe

reliabilityandvalidityofteachers’assessmentusedforsummative

purposes.ResearchPapersinEducation,20(3),245-270.

doi:10.1080/02671520500193744

Harlen,W.,&Deakin-Crick,R.(2002).Asystematicreviewoftheimpactof

summativeassessmentandtestsonstudents'motivationforlearning.

Retrievedfromhttp://eppi.ioe.ac.uk/

Hattie,J.(2003a).FormativeandSummativeInterpretationsofAssessment

Information.Auckland,NZ:UniversityofAuckland.

Page 463: Exploring The Impact of a Largescale Diagnostic Science

442

Hattie,J.(2003b).TeachersMakeaDifference,Whatistheresearchevidence?ACER

researchconferencepaper.Melbourne.ACER.

https://research.acer.edu.au/research_conference_2003/4

Hattie,J.(2005).Whatisthenatureofevidencethatmakesadifferencetolearning?

PaperpresentedattheUsingDatatoSupportLearning,GrandHyattHotel,

Melbourne7-9August2005.

Hattie,J.(2009).VisibleLearning,Tomorrow’sSchools,TheMindsetsthatmakethe

differenceinEducation.PaperpresentedattheGuestLecturesbyVisiting

Academics,TheTreasury,Wellington,NZ.

Hattie,J.(2012).VisibleLearningforTeachers:Maximisingtheimpactonlearning.

London:Routlege.

Hattie,J.(2018).VisibleLearningplus252+InfluencesonStudentAchievement.

Retrievedfromhttps://visible-learning.org/wp-

content/uploads/2018/03/VLPLUS-252-Influences-Hattie-ranking-DEC-

2017.pdf

Hattie,J.,Jaeger,R.M.,&Bond,L.(1999).PersistentMethodologicalQuestionsin

EducationalTesting.InP.AsgharIran-Nejad&D.Pearson(Eds.),Reviewof

ResearchinEducation(Vol.24(1)).WashingtonDC:AERA.

Hattie,J.,&Brown,G.(2004,September).CognitiveProcessesinasTTle:TheSOLO

Taxonomy(asTTLeTechnicalReport#43).Auckland:Universityof

Auckland/NZMinistryofEducation

Hattie,J.,&Timperley,H.(2007).ThePowerofFeedback.ReviewofEducational

Research,77(1),81-112.doi:10.3102/003465430298487

Heritage,M.(2010).FormativeAssessmentandNext-GenerationAssessment

Systems:AreWeLosinganOpportunity?Retrievedfrom

http://www.ccsso.org/

Page 464: Exploring The Impact of a Largescale Diagnostic Science

443

Hickey,D.T.,Taasoobshirazi,G.,&Cross,D.(2012).AssessmentasLearning:

EnhancingDiscourse,Understanding,andAchievementinInnovative

ScienceCurricula.JournalofResearchinScienceEducation,49(10),1240-

1270.

Huber,P.,Tytler,R.,&Haslam,F.(2010).TeachingandLearningaboutForcewith

aRepresentationalFocus:PedagogyandTeacherChange.Researchin

ScienceEducatiion,40,5-28.doi:10.1007IsIII65-009-9154-9

IEA,InternationalAssociationfortheEvaluationofEducationalAchievement.

(2013).TIMSSandPIRLSHome.Retrievedfromhttp://timss.bc.edu/

James,M.(2006).Learninghowtolearn,inclassrooms,schoolsandnetworks.

ResearchPapersinEducation,21(2),101-234.

James,M.(2009).AssessmentinSchools:Fitforpurpose?Cambridge:Universityof

CambridgeFacultyofEducation.

James,M.,McCormick,R.,Black,P.,Drummond,M.-J.,Fox,A.,MacBeath,J.,...

Wiliam,D.(2007).ImprovingLearningHowtoLearn:Classrooms,schools

andnetworks.London:Routledge.

JFF,JobsfortheFuture.(2007).TheSTEMworkforcechallenge:theroleofthe

publicworkforcesysteminanationalsolutionforacompetitive,science,

technology,engineering,andmathematics(STEM)workforce.Retrieved

from

https://www.doleta.gov/youth_services/pdf/STEM_Report_4%2007.pdf

Johnson,R.B.,Onwuegbuzie,A.J.,&Turner,L.A.(2007).TowardsaDefinitionof

MixedMethodsResearch.JournalofMixedMethodsResearch,1(2),112-133.

doi:10.1177/1558689806298224

Jones,A.,&Buntting,C.(2013).International,NationalandClassroomAssessment:

PotentFactorsinShapingWhatCountsinSchoolScience.InD.Corrigan,R.

Page 465: Exploring The Impact of a Largescale Diagnostic Science

444

Gunstone,&A.Jones(Eds.),ValuingAssessmentinScienceEducation:

Pedagogy,Curriculum,Policy.(pp33-550).Dordrecht:Springer.

Klenowski,V.,&Wyatt-Smith,C.(2012).Theimpactofhighstakestesting:the

Australianstory.AssessmentinEducation:Principles,Policy&Practice,

19(1),65-79.doi:10.1080/0969594X.2011.592972

LaerdStatistics.(2013).MultipleRegressionAnalysisusingSPSSStatistics.

Retrievedfromhttps://statistics.laerd.com/spss-tutorials/multiple-

regression-using-spss-statistics.php

LaerdStatistics.(2017).TestingforNormalityusingSPSSStatistics.Retrievedfrom

https://statistics.laerd.com/spss-tutorials/testing-for-normality-using-

spss-statistics.php

LaerdStatistics.(2018).OnewayANOVAusingSPSSStatistics.Retrievedfrom

https://statistics.laerd.com/spss-tutorials/one-way-anova-using-spss-

statistics.php#procedure

Lane,D.M.(n.d.).ONlineStatisticsEducation:AMultimediaCourseofStudy.D.M.

Lane(Ed.)(pp.692).Retrievedfrom

http://onlinestatbook.com/2/index.html

Lemke,J.L.(2001).ArticulatingCommunities:SocioculturalPerspectiveson

ScienceEducation.JournalofResearchinScienceTeaching,38(3),296-316.

Lim,X.-S.,TanEngThye,J.,&KangLu-Ming,T.(2009).Avoidingthe“prolonged

agony”ofstudyingforstandardizednationalexams:Atwhatprice?Paper

presentedattheAARE2009conference,Canberra.

http://www.aare.edu.au/09pap/abs09.htm

Lyons,T.,&Quinn,F.(2010).ChoosingScience:Understandingthedeclinesin

seniorhighschoolscienceenrolments.Retrievedfrom

https://simerr.une.edu.au/pages/projects/131choosingscience.pdf

Page 466: Exploring The Impact of a Largescale Diagnostic Science

445

Lyons,T.,&Quinn,F.(2012).RuralHighSchoolStudents'AttitudesTowardSchool

Science.AustralianandInternationalJournalofRuralEducation,22(2),8.

Lyons,T.,&Quinn,F.(2014).HowRelevantAreAustralianScienceCurriculafor

RuralandRemoteStudents?AustralianandInternationalJournalofRural

Education,24(2),8.

Mansell,W.,James,M.,&TheAssessmentReformGroup.(2009).Assessmentin

schools.Fitforpurpose?ACommentarybytheTeachingandLearning

ResearchProgramme.London:EconomicandSocialResearchCouncil,

TeachingandLearningResearchProgamme.

Marzano,R.J.(2000).WhatareGradesFor?TransformingClassroomGrading(pp.

10).Denver,Colorado:Mid-continentResearchforEducationandLearning

(McREL).

Marton,K.,&Säljö,R.(1976).Onqualitativedifferencesinlearning:1—outcomes

andprocess.BritishJournalofEducationalpsychology,46,8.

Masters,G.N.(2009).Assessingsciencelearning(PATsciencetests).Melbourne,

VIC:ACER.

Masters,G.N.(2013).ReformingEducationalAssessment:Imperatives,principles

andchallenges.InS.Mellor(SeriesEd.)AustralianEducationReview.

Retrievedfromhttp://research.acer.edu.au/aer/12

Matters,G.,&Curtis,D.(2008).Astudyintotheassessmentandreportingof

employabilityskillsofseniorsecondarystudents.Retrievedfrom

https://research.acer.edu.au/cgi/viewcontent.cgi?article=1000&context=ar

_misc

MCEETYA,MinisterialCommitteeofEducation,Employment.TrainingandYouth

Affairs.(1998).AReviewofthe1989CommonandAgreedGoalsfor

SchoolinginAustralia(The‘HobartDeclaration’).CarltonSouth,Victoria:

MCEETYASecretariat.

Page 467: Exploring The Impact of a Largescale Diagnostic Science

446

MCEETYA, Ministerial Council of Education, Employment, Training and Youth Affairs. (2008). Melbourne Declaration on Educational Goals for Young Australians. Melbourne, Victoria: Curriculum Corporation.

Merriam, S. B. (1998). Qualitative research and case study applications in education. San Francisco: Jossey-Bass.

Messick, S. (1995). Validity of Psychological Assessment: Validation of Inferences From Persons' Responses and Performances as Scientific Inquiry Into Score Meaning. American Psychologist, 50(9), 741-749.

Millar, R., & Hames, V. (2003). Towards Evidence-based Practice in Science Education 1-4. In Teaching and Learning Research Programme TLRP (Ed.). York, UK: University of York.

Millar, R. (2013). Improving Science Education: Why Assessment Matters. In D. Corrigan, R. Gunstone, & A. Jones (Eds.), Valuing Assessment in Science Education: Pedagogy, Curriculum, Policy (pp. 55-68). Dordrecht: Springer.

Mislevy, R. J. (2008). How Cognitive Science Challenges the Educational Measurement Tradition, 18. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.471.4656&rep=rep1&type=pdf

Mitchell, I., Mitchell, J., & Lumb, D. (2009). Principles of Teaching for Effective Learning: The Voice of the Teacher. Clayton, VIC: PEEL Publishing.

Mullis, I. V. S., Martin, M. O., Ruddock, G. J., O'Sullivan, C. Y., & Preuschoff, C. (2009). TIMSS 2011 Assessment Frameworks. USA: TIMSS & PIRLS International Study Centre, Lynch School of Education, Boston College.

NAEP, National Assessment Governing Board. (2011). Science Framework for the 2011 National Assessment of Educational Progress. Washington, DC: US Government Printing Office.

NESA, NSW Education Standards Authority. (n.d.). Home. Retrieved from https://www.esa.edu.au/solutions/our-solutions

NESA, NSW Education Standards Authority. (2017). Statistics Archive - Stage 5 results (Archived Year 10 Grade reports). Retrieved from http://www.boardofstudies.nsw.edu.au/ebos/static/ebos_stats.html

NESA, NSW Education Standards Authority. (2018). Assessment for, as and of Learning. Retrieved from http://syllabus.nesa.nsw.edu.au/support-materials/assessment-for-as-and-of-learning/

Newton, P. (2007). Clarifying the purposes of educational assessment. Assessment in Education, 14(2), 149. doi:10.1080/09695940701478321

Newton, P. (2010). The Multiple Purposes of Assessment. In P. McGaw, E. Peterson, & B. Baker (Eds.), International Encyclopedia of Education (Third Edition) (pp. 392-396). Oxford: Elsevier.

Nicol, D. J., & Macfarlane-Dick, D. (2006). Formative assessment and self-regulated learning: a model and seven principles of good feedback practice. Studies in Higher Education, 31(2), 199-218. doi:10.1080/03075070600572090

NRC, National Research Council. (1996). National Science Education Standards (NSES) (pp. 273). Retrieved from https://www.csun.edu/science/ref/curriculum/reforms/nses/nses-complete.pdf

NRC, National Research Council. (2001). Knowing What Students Know: The Science and Design of Educational Assessment. Retrieved from http://www.nap.edu/catalog.php?record_id=10019

NRC, National Research Council. (2008). Research on Future Skill Demands: A Workshop Summary. Retrieved from https://www.nap.edu/catalog/12066/research-on-future-skill-demands-a-workshop-summary

NSW DofE, NSW Department of Education. (2013). Policies: Curriculum planning and programming, assessing and reporting to parents K-12. Retrieved from https://www.det.nsw.edu.au/policies/curriculum/schools/curric_plan/PD20050290.shtml?query=Curriculum+planning+and+programming%2c+assessing+and+reporting+to+parents+K-12

NSW DofE, NSW Department of Education. (2017). Class Size. Retrieved from https://education.nsw.gov.au/about-us/our-people-and-structure/history-of-government-schools/facts-and-figures/class-size

NSW DofE, NSW Department of Education. (2018). VALID program. Retrieved from https://education.nsw.gov.au/teaching-and-learning/student-assessment/assessment-and-reporting/assessment/valid

Nuffield Foundation. (2018). The Assessment Reform Group (ARG). Retrieved from http://www.nuffieldfoundation.org/assessment-reform-group

Nusche, D., Radinger, T., Santiago, P. (co-ordinator), & Shewbridge, C. (2013). Synergies for Better Learning: An International Perspective on Evaluation and Assessment. Paris: OECD. Retrieved from http://www.oecd.org/education/school/synergies-for-better-learning.htm

OCS, Office of the Chief Scientist. (2014). Science, Technology, Engineering and Mathematics: Australia's Future. Retrieved from http://www.chiefscientist.gov.au/2014/09/professor-chubb-releases-science-technology-engineering-and-mathematics-australias-future/

OCS, Office of the Chief Scientist. (2017). Science and maths in Australian secondary schools datasheet. Retrieved from http://www.chiefscientist.gov.au/2016/07/science-and-maths-in-australian-secondary-schools-datasheet/

OECD, Organisation for Economic Co-operation and Development. (1996). The Knowledge-based Economy (pp. 46). Paris: OECD. Retrieved from http://www.oecd.org/sti/sci-tech/1913021.pdf

OECD, Organisation for Economic Co-operation and Development. (1997). Thematic Review of the Transition from Initial Education to Working Life: Australia, Country Note. Paris: OECD. Retrieved from http://www.oecd.org/edu/innovation-education/1908315.pdf

OECD, Organisation for Economic Co-operation and Development. (2003). Key Competencies for a Successful Life and a Well-Functioning Society. Paris: OECD. Retrieved from http://deseco.ch/bfs/deseco/en/index/02.html

OECD, Organisation for Economic Co-operation and Development. (2011). OECD Reviews of Evaluation and Assessment in Education: Australia. Paris: OECD. Retrieved from https://www.oecd.org/australia/48519807.pdf

OECD, Organisation for Economic Co-operation and Development. (2014). About PISA. Retrieved from http://www.oecd.org/pisa/aboutpisa.htm

OECD, Organisation for Economic Co-operation and Development. (2017). PISA 2015: Assessment and Analytical Framework: Science, Reading, Mathematic, Financial Literacy and Collaborative Problem Solving, revised edition, PISA. Paris: OECD. Retrieved from https://www.mecd.gob.es/dctm/inee/internacional/pisa-2015-frameworks.pdf?documentId=0901e72b820fee48 doi:10.1787/9789264281820-en

OECD, Organisation for Economic Co-operation and Development. (2018). How's Life? Measuring Well-being. Retrieved from http://www.oecdbetterlifeindex.org/

Osborne, J., & Dillon, J. (2008). Critical Reflections: A Report to the Nuffield Foundation. Retrieved from http://efepereth.wdfiles.com/local--files/science-education/Sci_Ed_in_Europe_Report_Final.pdf

Palmer, T.-A. (2015). Fresh Minds for Science: Using marketing science to help school science. (Doctor of Philosophy PhD), University of Technology Sydney, Sydney. Retrieved from http://hdl.handle.net/10453/37019

Panizzon, D. (2003). Using a cognitive structural model to provide new insights into students' understandings of diffusion. International Journal of Science Education, 25(12), 1427-1450. doi:10.1080/0950069032000052108

Panizzon, D., Arthur, D., & Pegg, J. (2006). Essential Secondary Science Assessment: Development and scope of a test to explore scientific literacy and achievement in NSW. Teaching Science, 52(4), 6.

Panizzon, D., & Bond, T. (2006). Exploring conceptual understandings of diffusion and osmosis by senior high school and undergraduate university science students. In X. Liu & W. J. Boone (Eds.), Applications of Rasch measurement in science education (pp. 137-164). Maple Grove, MN: JAM Press.

Panizzon, D., & Bond, T. (2007). Measuring scientific understanding: A pedagogical problem and its potential solution? Paper presented at the AARE Conference, Fremantle, WA.

Panizzon, D., Callingham, R., Wright, T., & Pegg, J. (2007). Shifting Sands: Using SOLO to promote assessment for learning with secondary mathematics and science teachers. Paper presented at the AARE Conference, Fremantle, WA.

PEEL, Project for Enhancing Effective Learning. (2009). About PEEL. Retrieved from http://www.peelweb.org/index.cfm?resource=about

Pegg, J., Panizzon, D., Arthur, D., Scott, J., & Aylmer, W. (2011). Assessing student responses in secondary science: Examples from ESSA. Unpublished manuscript, SiMERR Centre, University of New England, Armidale, NSW.

Pellegrino, J. W. (2009). The Design of an Assessment System for the Race to the Top: A Learning Science Perspective on Issues of Growth and Measurement. Paper presented at the Exploratory Seminar: Measurement Challenges Within the Race to the Top Agenda. Retrieved from http://www.k12center.org/publications.html

Pierce, R., & Chick, H. (2011). Teachers' intentions to use national literacy and numeracy assessment data: a pilot study. Australian Educational Researcher, 38, 433-447.

Polesel, J., Dulfer, N., & Turnbull, M. (2012). The Experience of Education: The impacts of high stakes testing on school students and their families. Literature Review. Rydalmere: The Whitlam Institute.

QERC. (1985). Quality of Education in Australia: Report of the Review Committee. Canberra, ACT: Australian Government Publishing Service.

QSA, Queensland Studies Authority. (2012). Memo: Future of QCATS. Retrieved from https://www.qsa.qld.edu.au/qsa_secure/memos.act?year=2012&type=MEMO&docType=MEMO&orderBy=audience

Rowe, K. J., & Hill, P. W. (1996). Assessing, Recording and Reporting Students' Educational Progress: the case for 'subject profiles'. Assessment in Education: Principles, Policy & Practice, 3(3), 309-352. doi:10.1080/0969594960030304

Rowe, K. J. (2006). School Performance: Australian State/Territory Comparisons of Student Achievements in National and International Studies. Carlton, Victoria: ACER. Retrieved from http://research.acer.edu.au/learning_processes/5

Ruiz-Primo, M. A., Shavelson, R. J., Hamilton, L., & Klein, S. (2002). On the Evaluation of Systemic Science Education Reform: Searching for Instructional Sensitivity. Journal of Research in Science Teaching, 39(5), 369-393.

Ruiz-Primo, M. A. (2009). Towards a Framework for Assessing 21st Century Science Skills: Commissioned paper for the National Academies. Denver: University of Colorado Denver. Retrieved from https://sites.nationalacademies.org/cs/groups/dbassesite/documents/webpage/dbasse_072612.pdf

Ruiz-Primo, M. A., & Li, M. (2012). Examining Formative Feedback in the Classroom Context: New Research Perspectives. In J. H. McMillan (Ed.), SAGE Handbook of Research on Classroom Assessment. Thousand Oaks, CA: SAGE Publications, Inc.

Ryan, C. (1997). NSW Key Competencies Pilot Project Report. Sydney: NSW Department of Education Co-ordination.

Sadler, R. (2007). Perils in the meticulous specification of goals and assessment criteria. Assessment in Education, 14(3), 387-392.

Sadler, R. (1998). Formative Assessment: revisiting the territory. Assessment in Education: Principles, Policy & Practice, 5(1), 77-84. doi:10.1080/0969595980050104

Sandelowski, M. (2000). Whatever Happened to Qualitative Description? Research in Nursing & Health, 23, 334-340.

Schraw, G., Crippen, K. J., & Hartley, K. (2006). Promoting Self-Regulation in Science Education: Metacognition as Part of a Broader Perspective on Learning. Research in Science Education, 36, 111-139. doi:10.1007/s11165-005-3917-8

Schroder, H., Driver, M., & Streufert, S. (1967). Human information processing. New York: Holt, Rinehart, & Winston.

SCSA, WA Schools Curriculum and Standards Authority. (2010). Western Australian Monitoring Standards in Education (WAMSE): Science tests. Retrieved from http://www.scsa.wa.edu.au/internet/Years_K10/WAMSE/Tests/Science

Sfard, A. (1998). On Two Metaphors for Learning and the Dangers of Choosing Just One. Educational Researcher, 27(4), 10. doi:10.3102/0013189X027002004

Shayer, M. (1976). Development in thinking of middle school and early secondary school pupils. School Science Review, 57.

Shayer, M. (2003). Not just Piaget; not just Vygotsky, and certainly not Vygotsky as alternative to Piaget. Learning and Instruction, 13, 465-485. doi:10.1016/S0959-4752(03)00092-6

Shenton, A. K. (2004). Strategies for ensuring trustworthiness in qualitative research projects. Education for Information, 22, 63-75.

Shepard, L. A. (1993). Evaluating Test Validity. In L. Darling-Hammond (Ed.), Review of Research in Education (Vol. 19 (Issue 1), pp. 46). Washington DC: AERA.

Shepard, L. A. (2001). The Role of Classroom Assessment in Teaching and Learning. In V. Richardson (Ed.), Handbook of Research on Teaching (pp. 1278). Washington DC: American Educational Research Association.

Shute, V. J. (2007). Focus on Formative Feedback. Princeton, NJ: Educational Testing Service.

Sjøberg, S., & Schreiner, C. (2010). The ROSE project: An overview and key findings. Retrieved from http://roseproject.no./publications/english-pub.html

Smith, M. (2005). Data for schools in NSW: What is provided and can it help? Paper presented at the Using Data to Support Learning, Grand Hyatt Hotel, Melbourne, 7-9 August 2005.

SCCS&F, Standing Committee of Communities, Schools and Families. (2008). The Purposes of Testing and Fitness for Purpose (Third Report). Retrieved from www.educationengland.org.uk/documents/pdfs/2008-testing-and-assessment.pdf

Stake, R. E. (1995). The Art of Case Study Research. Thousand Oaks: Sage.

Stiggins, R. J. (2002). Assessment Crisis: the Absence of Assessment FOR Learning. Phi Delta Kappan, 83(10), 758-765.

Stiggins, R. J. (2004). Student-Centred Classroom Assessment. Retrieved from 169.204.228.86

Stiggins, R. J. (2007). Assessment Through the Student's Eyes. Educational Leadership, 64(8), 22-26.

Stiggins, R. J., & Chappuis, J. (2005). Using Student-Involved Classroom Assessment to Close Achievement Gaps. Theory into Practice, 44(1), 11-18.

Stiggins, R. J., & DuFour, R. (2009). Maximizing the Power of Formative Assessments. Phi Delta Kappan, 90(9), 640-644. doi:10.1177/003172170909000907

Tebbutt, Hon Carmel. (2005). Science Students Assessment (page 14956). Retrieved from http://www.parliament.nsw.gov.au/prod/PARLMENT/hansArt.nsf/0/31FF954D026D2F48CA256FE5007AE34C

Thomson, S., De Bortoli, L., & Underwood, C. (2017). PISA 2015: Reporting Australia's results. Retrieved from http://www.acer.org/ozpisa/reports/

Thomson, S., Hillman, K., & De Bortoli, L. (2013). A teacher's guide to PISA scientific literacy (pp. 58). Retrieved from http://www.acer.edu.au/ozpisa/reports/

Thomson, S., Wernert, N., O'Grady, E., & Rodrigues, S. (2017). TIMSS 2015: Reporting Australia's results. Retrieved from https://research.acer.edu.au/timss_2015/2/

Tobin, K. (2012). Sociocultural Perspectives on Science Education. In B. J. Fraser, K. Tobin, & C. J. McRobbie (Eds.), Second International Handbook of Science Education (pp. 3-17). Dordrecht: Springer.

Torrance, H. (2007). Assessment as learning? How the use of explicit learning objectives, assessment criteria and feedback in post-secondary education and training can come to dominate learning. Assessment in Education, 14(3), 281-294.

Treagust, D. F. (2006). Diagnostic assessment in science as a means of improving teaching, learning and retention. Paper presented at the Assessment in Science Teaching and Learning Symposium, The University of Sydney. Retrieved from http://science.uniserve.edu.au/pubs/procs/2006/index.html

Tytler, R. (2007). Re-imagining science education: engaging students in science for Australia's future. Camberwell, Vic.: ACER Press.

Tytler, R., & Hubber, P. (2010). A representation-intensive signature pedagogy for school science? Paper presented at the AARE Annual Conference, Melbourne.

Tytler, R., & Prain, V. (2010). A Framework for Re-thinking Learning in Science from Recent Cognitive Science Perspectives. International Journal of Science Education, 32(15), 2055-2078.

Tytler, R., Prain, V., Hubber, P., & Waldrip, B. (Eds.). (2013). Constructing Representations to Learn in Science. Sense Publishers.

Tytler, R., & Symington, D. (2015). Science Learning in Rural Australia. Not necessarily the poor cousin. Teaching Science, 61(3), 7.

UNDP, United Nations Development Program. (2018). Human Development Index. Retrieved from http://hdr.undp.org/en/content/human-development-index-hdi-table

UNESCO, United Nations Educational, Scientific and Cultural Organisation. (2005). UNESCO World Report: Towards Knowledge Societies. Paris: UNESCO Publishing.

Vygotsky, L. S. (1978). Mind in society. London: Harvard University Press.

Waldrip, B., Prain, V., & Carolan, J. (2010). Using Multi-Modal Representations to Improve Learning in Junior Secondary Science. Research in Science Education, 40, 60-80. doi:10.1007/s11165-009-9157-6

Wasson, D. (2009). Large cohort testing - How can we use assessment data to effect school and system improvement. Paper presented at the 2009 ACER research conference, Perth. Conference theme: Assessment and Student Learning: Collecting, Interpreting and Using Data to Inform Teaching.

Webb, N. L. (1997). Research Monograph No. 6: Criteria for Alignment of Expectations and Assessments in Mathematics and Science Education ED 414305. Retrieved from http://eric.ed.gov/?id=ED414305

White, R., & Gunstone, R. (1992). Probing Understanding. London: The Falmer Press.

Wiggins, G. P. (1998). Educative assessment: designing assessments to inform and improve student performance. Jossey-Bass.

Wiliam, D. (2011a). Embedded formative assessment. Bloomington, IN: Solution Tree Press.

Wiliam, D. (2011b). What is assessment for learning? Studies in Educational Evaluation, 37, 3-14.

Wilson, M., & Sloane, K. (2000). From Principles to Practice: An Embedded Assessment System. Applied Measurement in Education, 13(2), 181-208.

Yin, R. K. (2003). Case Study Research 3rd edition (Vol. 5). Thousand Oaks: SAGE.