i. revisiting a 30+ year old argument
DESCRIPTION
Systematic Naturalistic Inquiry: Toward a Science of Performance Improvement (aka improvement research) Anthony S. Bryk Carnegie Foundation for the Advancement of Teaching Society for Research on Educational Effectiveness, March 2010. I. Revisiting a 30+ year old argument. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/1.jpg)
Systematic Naturalistic Inquiry: Toward a Science of Performance
Improvement (aka improvement research)
Anthony S. BrykCarnegie Foundation for the Advancement of
Teaching
Society for Research on Educational Effectiveness, March 2010
![Page 2: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/2.jpg)
I. Revisiting a 30+ year old argument
• Is design really the answer?
• The randomized treatment control paradigm as the gold standard circa 1975
• Takes me back to the spring of 1978: “evaluating program impact: a time to cast away
stones, a time to gather stones together”
• And, is this really the right question?
![Page 3: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/3.jpg)
II. What Information Does an RCT Actually Provide?
– Two marginal distributions YT and YC: the distributions of outcomes under the treatment and control conditions.
– Provides answers to questions that can be addressed in term of observed differences in these two marginal distributions.
![Page 4: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/4.jpg)
Evidentiary Limits of the Treatment-Control Group Paradigm
• Suppose now that we define a treatment effect for individual i as αi.
• We can estimate the mean treatment effect, μα.
• But, interestingly we cannot estimate the median effect or any percentile points in the αi distribution.
![Page 5: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/5.jpg)
Evidentiary Limits (continued)
• Nor can we assess any linkages between αi and how these effects might be changing over time, or depend on individual and context characteristics.
– To accomplish the latter, we need to know about the treatment effect distribution conjoint with multivariate data on individual and program characteristics.
![Page 6: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/6.jpg)
Evidentiary Limits (continued)
• Of course we can add a limited number of factors into the design and estimate these interaction effects.
• So we can do something on a limited scale within the T/C paradigm– But we need to know the factors in advance– And they have to be small in number– Pushing the envelop here would be time-
consuming, expensive and cumbersome
![Page 7: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/7.jpg)
My conclusions back then
• We need a different methodology for learning about programs and the multiple factors that may affect their outcomes
– An accumulating evidence strategy (Light and Smith) from multiple efforts at systematic inquiry over time
– Needs to be dynamic in design—as we learn from practice we are changing it
– A system orientation— “elements standing in strong interaction.”
pause
![Page 8: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/8.jpg)
The Paradox of Anti-depressant-Induced Suicidiality
(H.I. Weisberg, V.C. Hayden, V.P. Pontes (2009) Clinical Trials. Vol 6.No. 2, 109-118. )
• Key conclusions:– When the causal effect of an intervention varies across
individuals the threat to validity can be serious.
– RCTs should not automatically be considered definitive, especially when the results conflict with those of observational studies.
– Not only the magnitude but even the direction of the population causal effect may be erroneous.
![Page 9: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/9.jpg)
III. So a New Directions 2010: Basic Principles
• Returning to this idea of a prospective accumulating evidence strategy
– Simplest version: the multi-site trial vs. cluster randomized trial.
– Extend this idea out to all three facets – contexts, teachers, and students.
![Page 10: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/10.jpg)
Basic Principles
• Anchored in a working theory about advancing improvements reliably at scale
- Assume a systems perspectives: interventions as operationally defined in strong interactions with the specific people who take it up and the contexts in which they work.
– Gathering and using empirical evidence about such phenomena should be the organizing goal.
![Page 11: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/11.jpg)
Basic Principles (continued)
• Accelerated longitudinal design + a value added analytic model. – Counterfactual comes from a baseline
comparison. – In principle we have some evidence about
variable effects attached to individuals, their teachers and their context.
– Any individual piece not very precise but if we have enough cases there is power to see many signals.
![Page 12: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/12.jpg)
Basic Principles (continued) • A key internal validity concern – a coterminous
intervention to worry about.• But we also now have an evidentiary resource
not typically found in RCT – a capacity to examine questions of replicability
over many different contexts of intervention. – This is the generalizability evidence that relates
directly to our reliability consideration.– Can we make this happen with any reliability over
many different situation?
![Page 13: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/13.jpg)
What makes it naturalistic?
• Easily engaged in practice. Could be routinely done.
• Could imagine gathering such data at large scale.
• Immediacy of evidence – possibility of learning as you go.
• And as it will turn out, actually moot (opportunistic) on the question of an appropriate design analysis paradigm
![Page 14: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/14.jpg)
A recently completed study of the efficacy ofLiteracy Collaborative Professional Development
Co-contributor: Gina BiancarosaUniversity of Oregon
Detailing the causal cascade from the intentional design of professional education through
changes in instructional practice and then on through to improvements in student learning
gains over time.
IV. Elaborate through an Example
![Page 15: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/15.jpg)
Setting the Context: Typical District Approach to a Coaching Initiative
School
UnionContracts
Hiring& Assignment
PoliciesFor Coaches
Salary PoliciesFor Coaches
District
Coaching
Credit to: A Framework for Effective Management of School System Performance. Lauren Resnick, Mary Besterfield-Sacre, Matthew Mehalik, Jennifer Zoltners Sherer and Erica Halverson.
![Page 16: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/16.jpg)
And then voila! (aka the zone of wishful thinking!)
![Page 17: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/17.jpg)
Peering inside the “Black Box”: the actual work of coaches
![Page 18: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/18.jpg)
Data for Performance Improvement
What do coaches need to know and be able to do and how do we know if they can?
Quality of the trust dependency/ relationship
How do coaches actually spend their time?
Quality of coach-teacher trustsocial resources for improvement
Who is being coached on What topics? What about the Individual teacher might affectThese social exchanges?
![Page 19: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/19.jpg)
Data for Performance Improvement
Teacher practice development
Evidence of teacher
learning ?
![Page 20: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/20.jpg)
Filling out the account: an information system to support instructional improvement
Coaching Logs
Coaching performanceassessments
Surveys of coachprincipal trust:respect, regard, competence and integrity
Surveys of teacher-coach trust and school-based professional community
Teacher practice development
Observational evidence of teacher
learning and practice
Coaching logs: the who and what of PD as delivered and what’s next?
![Page 21: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/21.jpg)
IndividualTeacher
ClassroomLiteracy Practice
School-wide support for teacher learning
FormalSchool Structure
InformalOrganization
* principal leadership * coach quality/role relationship * resource allocations (time) * school size
* Work relations among teachers * Influence of informal leaders *professional norms
Joined in a Working Theory of Practice Improvement
Background• Willingness to engage
innovation– Experiment with
new practices in the classroom
• Expertise – Prior experiences in
comprehensive literacyteaching (ZPD)
LC Intervention:amount, quality and content
Of PD Impact onStudentlearning
It is hard to improve what you do not really understand.
![Page 22: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/22.jpg)
Linked to evidence about variability in effects on student
learning associated with teachers and schools
• Assessing (even crudely) the value added to learning associated with individual classrooms and schools and investigating what might be driving observed variability in these effects.
![Page 23: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/23.jpg)
Accelerated Cohort design6 cohorts studied over 4 years
Year of Study
First Year Second Year Third Year Fourth Year
Fall Spring Fall Spring Fall Spring Fall Spring
K C C D D E E F F
1 B B C C D D E E
2 A A B B C C D DGra
de
Training yearYear 1 of
implementationYear 2 of
implementationYear 3 of
implementation
![Page 24: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/24.jpg)
Hierarchical Crossed Value-added Effects Model
€
Y = δ000 + b00 + d00 + e
+δ010(Location _ 2) + δ020(Location _ 3) + δ030(Location _ 4) + δ040(Location _ 5) + δ050(Location _ 6)
+δ060(Cohort _ 2) + δ070(Cohort _ 3) + δ080(Cohort _ 5) + δ090(Cohort _ 6) + δ 0100(Cohort _ 7)
+δ100(Developmental_Metric _ Adjustment _ 4)
+δ200(Developmental_Metric _ Adjustment _ 5)
+δ300(Developmental_Metric _ Adjustment _ 6)
+δ400(Base_ AcademicYr_ Effect) + b40(Base_ Acad.Yr _ Effect) + d40(Base_ AcadYr_ Effect)
+δ500(Base_K /1_ Summer _ Effect) + d50(Base_K /1_ Summer _ Effect)
+δ600(Base_1/2 _ Summer _ Effect) + d60(Base_1/2 _ Summer _ Effect)
+c70(Teacher _Base_ValueAdded )
+δ800(Year1_ValueAdded ) + c80(Year1_ValueAdded ) + d80(Year1_ValueAdded )
+δ900(Summer1_ValueAdded )
+δ1000(Year2 _ValueAdded ) + c100(Year2 _ValueAdded ) + d100(Year2 _ValueAdded )
+δ1100 (Summer2 _ValueAdded )
+δ1200(Year3_ValueAdded ) + c120(Year _ 3_ValueAdded ) + d120(Year3_ValueAdded )
Individual growth parameters
overall value-added effects
teacher-level school-level value-added effects
![Page 25: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/25.jpg)
Value-added effects by year
Year 1 Year 2 Year 3
Average value-added (overall)
.164 .280 .327
Performance improvement
16% 28% 32%
Effect size .22 .37 .43
Ave. student learning growth is 1.02 per academic year
![Page 26: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/26.jpg)
School 95% plausible value-added range
±.23
Variability in school value-added, year 1Average student gain per academic year
No effect
Year 1 mean effect
![Page 27: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/27.jpg)
School 95% plausible value-added range
±.23 ±.28
Variability in school value-added, year 2Average student gain per academic year
No effect
Year 1 mean effect
Year 2 mean effect
![Page 28: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/28.jpg)
School 95% plausible value-added range
±.23 ±.28 ±.37
Variability in school value-added, year 3Average student gain per academic year
No effect
Year 1 mean effect
Year 2 mean effectYear 3 mean effect
![Page 29: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/29.jpg)
Teacher 95% plausible value-added range
±.51
Variability in teacher value-added within schools, year 1
Average student gain per academic year
No effect
![Page 30: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/30.jpg)
Teacher 95% plausible value-added range
±.51 ±.71
Variability in teacher value-added within schools, yr 2
Average student gain per academic year
No effect
![Page 31: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/31.jpg)
Teacher 95% plausible value-added range
±.51 ±.71 ±.91
Variability in teacher value-added within schools, yr 3
Average student gain per academic year
No effect
![Page 32: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/32.jpg)
Exploring variation in trends
Which teachers and schools improved most?
Why? Under what conditions?
![Page 33: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/33.jpg)
V. To Sum Up The accelerated multi-cohort design is
relatively easy to implement in school settings (a naturalistic data design).
It affords treatment effect results not easily obtainable through the “gold standard”— A multivariate distribution of effects linked with
potential sources of their variation and dynamic over time
![Page 34: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/34.jpg)
To Sum Up More generally, an argument for an
evolutionary, exploratory approach to accumulating evidence Data designs are now practical and analytic
tools exist. Imagine if we had such information now on the
750+ schools that have been involved with LC over the past 15 years.
A stronger empirical base for a design-engineering-development orientation to the improvement of schooling.
![Page 35: I. Revisiting a 30+ year old argument](https://reader036.vdocuments.us/reader036/viewer/2022070413/56814ce2550346895db9e35e/html5/thumbnails/35.jpg)
To Sum up: Useable Knowledge for Improving Schooling
• Anchored in:– place problems of practice improvement at the
center– a working theory of practice and its improvement– Measure core work activities and outcomes– Aim for a science of performance improvement
• Variation is the natural state of affairs• Make it an object of study• Reliability is a key improvement concern in human-
social resource intensive enterprise