This article was downloaded by: [University of Winnipeg] on 11 September 2014, at 11:30. Publisher: Routledge. Informa Ltd, registered in England and Wales, Registered Number 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK.

Educational Assessment. Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/heda20
Assessing Students' Opportunity to Learn the Intended Curriculum Using an Online Teacher Log: Initial Validity Evidence. Alexander Kurz (Arizona State University), Stephen N. Elliott (Arizona State University), Ryan J. Kettler (Rutgers University) & Nedim Yel (Arizona State University). Published online: 14 Aug 2014.

To cite this article: Alexander Kurz, Stephen N. Elliott, Ryan J. Kettler & Nedim Yel (2014) Assessing Students' Opportunity to Learn the Intended Curriculum Using an Online Teacher Log: Initial Validity Evidence, Educational Assessment, 19:3, 159-184, DOI: 10.1080/10627197.2014.934606
To link to this article: http://dx.doi.org/10.1080/10627197.2014.934606
Educational Assessment, 19:159–184, 2014
Copyright © Taylor & Francis Group, LLC
ISSN: 1062-7197 print/1532-6977 online
DOI: 10.1080/10627197.2014.934606
Assessing Students’ Opportunity to Learn theIntended Curriculum Using an Online Teacher
Log: Initial Validity Evidence
Alexander Kurz and Stephen N. ElliottArizona State University
Ryan J. KettlerRutgers University
Nedim YelArizona State University
This study provides initial evidence supporting intended score interpretations for the purpose of
assessing opportunity to learn (OTL) via an online teacher log. MyiLOGS yields five scores related
to instructional time, content, and quality. Based on data from 46 middle school classes, the
evidence indicated that (a) MyiLOGS has high usability, (b) its quarterly summary scores are
relatively consistent over time, (c) summary scores based on 20 randomly sampled log days provide
reliable estimates of teachers’ respective yearly summary scores, and (d) most teachers report
positive consequences from using the instrument. Agreements between log data from teachers
and independent observers were comparable to agreements reported in similar studies. Moreover,
several OTL scores exhibited moderate correlations with achievement and virtually nonexistent
correlations with a curricular alignment index. Limitations and directions for future research to
strengthen and extend this initial evidence are discussed.
Current test-based accountability contingencies targeted at schools are intended to compel teachers and administrators to improve relevant instructional inputs and processes in ways that
can lead to student achievement of intended outcomes. Although annual state assessments are
designed to yield test scores that permit valid interpretations of what students know and are
able to do, the evidence is rarely sufficient to make valid test score inferences about teachers’
instructional provisions (Polikoff, 2010). However, if the psychometric property of instructional
Correspondence should be sent to Alexander Kurz, T. Denny Sanford School of Social and Family Dynamics,
Arizona State University, 951 S Cady Mall, Tempe, AZ 85287. E-mail: [email protected]
Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/heda.
sensitivity remains unknown, then test users cannot be confident that these assessments register
differences in instruction (D’Agostino, Welsh, & Corson, 2007). Teachers’ efforts to provide students with the opportunity to learn the intended curriculum thus are likely to remain
unmeasured and unaccounted for in most test-based accountability systems.
Researchers interested in the concept of opportunity to learn (OTL) have established a range
of OTL indices that can lead to more direct, and potentially more valid, score interpretations
about time, content, and quality differences in teachers’ enacted curricula than inferences based on test scores from summative, large-scale assessments alone (Kurz, 2011). Following a process
of validation outlined in the Standards for Educational and Psychological Testing (American
Educational Research Association, American Psychological Association, & National Council
on Measurement in Education, 1999), we present a summary of initial evidence supporting
intended score interpretations for the purpose of assessing OTL via an online teacher log
called the Instructional Learning Opportunities Guidance System (MyiLOGS; Kurz, Elliott, & Shrago, 2009). The summary includes multiple sources of evidence—usability, reliability, as
well as validity evidence based on content, response processes, internal structure, relations
to other variables, and consequences of using the measure—and a critical appraisal of this
evidence in light of the proposed score interpretations and uses. To establish context for the
study and evidence based on content, we begin by providing a brief summary of research related to OTL and the extent to which the instrument’s OTL indices represent the content
domains identified in the literature.
DEFINING OTL
For decades, researchers have examined instructional indicators of the enacted curriculum under
the larger concept of OTL (Rowan & Correnti, 2009). Kurz (2011) reviewed the respective
research literature and identified major lines of OTL research related to the time, content,
and quality of classroom instruction. His conceptual synthesis of OTL acknowledged the co-occurrence of all three enacted curriculum dimensions during instruction (see Figure 1). That
is, teachers allocate instructional time and content coverage to the standards that define the
intended curriculum using a variety of pedagogical approaches. The conceptual model depicts
OTL as a matter of degree along three orthogonal axes with distinct zero points.
OTL Indices
Carroll (1963) provided one of the first operational definitions of OTL according to the time
allocated to instruction in a school’s schedule (i.e., allocated time). Subsequently, researchers
developed more instructionally sensitive indices for descriptive purposes and to examine their
contributions to student achievement (see Borg, 1980). Such indices were based on the proportion of allocated time dedicated to instruction (i.e., instructional time), the proportion of
instructional time during which students were engaged (i.e., engaged time), or the proportion
of engaged time during which students experienced a high success rate (i.e., academic learning
time).
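The nesting of these time-based indices can be sketched in a few lines of code. This is an illustrative sketch only; the function and argument names are hypothetical and not from the article, but the proportions mirror the definitions above: each index is a share of the quantity one level up.

```python
# Sketch (not from the article) of the nested time-based OTL indices:
# allocated time -> instructional time -> engaged time -> academic learning time.

def time_indices(allocated_min, instructional_min, engaged_min, high_success_min):
    """Return the three proportion-based OTL time indices as fractions."""
    return {
        # share of scheduled (allocated) time actually used for instruction
        "instructional_time": instructional_min / allocated_min,
        # share of instructional time during which students were engaged
        "engaged_time": engaged_min / instructional_min,
        # share of engaged time spent at a high success rate
        "academic_learning_time": high_success_min / engaged_min,
    }

# e.g., 60 min allocated, 50 instructional, 40 engaged, 30 at high success:
idx = time_indices(60, 50, 40, 30)
```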
Researchers also defined OTL in relation to the content covered during instruction. The
main focus was the extent to which the content of instruction overlapped with the content of
FIGURE 1 Conceptual model of opportunity to learn. Source: Kurz, 2011. Copyright © Springer Science+Business Media, LLC 2011. Reprinted with kind permission from Springer Science and Business Media.
assessments (i.e., content overlap). The work of Husén (1967) for the International Association
for the Evaluation of Educational Achievement exemplifies this line of research, which typically
requires teachers to rate their coverage of the constructs assessed by test items. Following standards-based reform, policymakers shifted the normatively desirable target of instruction
from tested content to the broader intended curriculum, whose content is merely sampled
by large-scale achievement tests (Rowan, Camburn, & Correnti, 2004). Under the No Child
Left Behind Act (2001), states have been required to define their subject- and grade-specific
intended curricula through a set of rigorous academic standards. Subsequently, stakeholders
became more interested in taxonomies that allowed experts to judge the alignment between the content of various curricula such as a teacher’s enacted curriculum and a state’s intended
curriculum. Porter (2002), for example, developed the Surveys of Enacted Curriculum (SEC),
which have been used to quantify alignment between standards and assessments (as well as
other curricula) via structured ratings along a comprehensive list of content topics and cognitive
demands (see Roach, Niebling, & Kurz, 2008).

Researchers further considered aspects of instructional quality to operationalize OTL. Teachers’ uses of empirically supported instructional practices and instructional resources, for example, have become common considerations, especially following the findings from the process-
product literature (see Brophy & Good, 1986). More recently, meta-analytic findings have been
used by researchers and practitioners to identify specific instructional practices that contribute to student achievement (Slavin, 2002), including the achievement of specific subgroups such
as students with disabilities (e.g., Gersten et al., 2009). Examples include explicit instruction
(i.e., modeling and engaging students in a step-by-step approach to solving a problem), visual
representations, and guided feedback. Instructional grouping formats other than whole class
also have received support in meta-analytic reviews (see Elbaum, Vaughn, Hughes, Moody, &
Schumm, 2000).
TABLE 1
Enacted Curriculum Dimensions, Relevant OTL Indices, and Proposed Definitions

| Enacted Curriculum Dimension | OTL Index | Index Definition |
| --- | --- | --- |
| Time | Instructional time | Instructional time dedicated to teaching the general curriculum standards and, if applicable, any custom objectives. |
| Content | Content coverage | Content coverage of the general curriculum standards and, if applicable, any custom objectives. |
| Quality | Cognitive processes | Emphasis of cognitive process expectations along a range from lower order to higher order thinking skills. |
| | Instructional practices | Emphasis of instructional practices along a range from generic to empirically supported practices. |
| | Grouping formats | Emphasis of grouping formats along a range from individual to whole class instruction. |

Note. OTL = opportunity to learn.
OTL Frameworks
Stevens (1993) provided the first conceptual framework of OTL, bringing together four elements: content coverage, content exposure (i.e., time on task), content emphasis (i.e., emphasis of cognitive processes), and quality of instructional delivery (i.e., emphasis of instructional practices). Despite its lack of operationalization, her framework has guided numerous researchers interested in OTL (e.g., Abedi, Courtney, Leon, Kao, & Azzam, 2006; Herman & Abedi, 2004;
Wang, 1998). Most important, Stevens clarified OTL as a teacher effect related to the allocation
of adequate instructional time covering a core curriculum via different cognitive demands and
instructional practices that can produce student achievement.
According to the model by Kurz (2011), which was based on the aforementioned literature, OTL is a matter of degree related to the temporal, curricular, and qualitative aspects of a
teacher’s instruction. To provide OTL, a teacher must dedicate instructional time to covering
the content prescribed by the intended curriculum using pedagogical approaches that address a
range of cognitive processes, instructional practices, and grouping formats. Table 1 provides a
breakdown of the three enacted curriculum dimensions as well as the selection of OTL indices and respective definitions that informed the scores of MyiLOGS. The indices related to the
quality of instruction require further clarification. We understand that the mere implementation
of certain cognitive processes, instructional practices, and grouping formats is neither an
exhaustive description of instructional quality nor a straightforward guarantee that a specific
demand, practice, or format itself was implemented with high quality. Instead we adopt a more
general stance used in prior OTL research (e.g., Brophy & Good, 1986; Carroll, 1989; Stevens, 1993), which posits that certain instructional variables can impact the quality of instruction as
evidenced by their relation to student achievement.
OTL Measures
Many measures of classroom instruction assess aspects of a teacher’s enacted curriculum (e.g.,
Connor et al., 2009; Pianta & Hamre, 2009). Similarly, most OTL measures are designed to
assess instructional inputs and processes that contribute to student achievement of intended
outcomes (Herman, Klein, & Abedi, 2000; McDonnell, 1995). Given current accountability systems, these intended outcomes are clearly specified via a state’s academic standards—hence
Kurz’s particular definition of OTL. Very few measures address this content focus in ways that
provide actual time estimates (e.g., minutes) and information on instructional practices (see
Kurz, 2011). In addition, some measures provide scores for the entire class (e.g., Porter, 2002),
individual students (e.g., Rowan, Camburn, & Correnti, 2004), or both (e.g., Kurz, Talapatra, & Roach, 2012). Last, measurement methods can rely primarily on direct observation (e.g.,
Pianta & Hamre, 2009), teacher self-report (e.g., Porter, 2002), or a combination of both (e.g.,
Kurz et al., 2014).
Given the cost of frequent observations and related challenges of generalizing from a limited
sample of teaching observations to a universe of teaching events across the school year, Rowan
and colleagues have argued for teacher logs (e.g., Rowan et al., 2004; Rowan & Correnti, 2009). Historically, the complexity and variability of classroom instruction have led most
OTL researchers to adopt a teacher logging approach (Burstein & Winters, 1994). Teacher
logs, however, can be administered as end-of-year surveys, which require teachers to provide a
summative classwide account of their instruction across the entire school year, or intermittently
based on a set of discrete days for individual students. Empirical evidence does not support the use of summative teacher logs for high- and low-frequency teaching behaviors and more
complex accounts of instruction (see Rowan et al., 2004). In addition, some evidence suggests
that classwide OTL indices differ from student-specific indices for students with disabilities
(Kurz et al., 2014). These findings call into question the extent to which classwide OTL indices
can be generalized to individual students nested within the same class.
OTL Studies
Researchers and policymakers have used OTL studies in a number of contexts to (a) describe
the instructional opportunities offered to different groups of students, (b) monitor the effects of
school reform efforts, and (c) understand and improve students’ academic achievement (Herman et al., 2000; Porter, 1991; Roach et al., 2009). The development and initial use of MyiLOGS
occurred in the context of special education to assess OTL for students with disabilities nested
in either general or special education classes. The importance of OTL studies for students with
disabilities is grounded in a policy rationale, which requires compliance with federal legislation
such as the Individuals with Disabilities Education Act (1997) mandating students’ access to the general education curriculum including its academic standards (Karger, 2005). In addition,
the participation of students with disabilities in tests that assess grade-level standards further
necessitates their exposure to the content of these standards to ensure the validity of certain
test score inferences (Wang, 1998). Finally, recent findings have raised concerns about OTL for
students with disabilities: limited use of allocated time for instruction (Vannest & Hagan-Burke,2010), low exposure to standards-aligned content (Kurz, Elliott, Wehby, & Smithson, 2010), and
inconsistent use of evidenced-based practices (Burns & Ysseldyke, 2009), as well as other issues
related to instructional quality (Vaughn, Levy, Coleman, & Bos, 2002). Operationalizing and
measuring OTL thus can quantify students’ access to the general education curriculum, provide
evidence concerning valid test score inferences, and identify areas of classroom instruction in
need of intervention.
ASSESSING OTL WITH MyiLOGS
The MyiLOGS OTL measure was designed to assess students’ opportunity to learn the intended
curriculum based on five OTL indices along three enacted curriculum dimensions: time, content,
and quality. This online teacher log is completed concurrently with a teacher’s instructional
planning and implementation efforts. For every school day (so-called calendar days), teachers
are asked to report on their instructional time dedicated to the state-specific academic standards and any custom objectives such as other valued academic skills not included in the standards or
Individualized Education Program objectives for students with disabilities. Based on a random
sample of 2 weekdays (so-called detail days), teachers are further asked to report on additional
details related to instructional quality for their overall class and individual students. Specifically,
teachers report on the cognitive processes expected for each enacted content standard and their
various instructional practices implemented in a particular grouping format.
Scores
Based on five OTL indices, five major OTL scores are calculated (see Table 2). One score is related to time (i.e., Time on Standards), one to content (i.e., Content Coverage), and three to
quality (i.e., Cognitive Processes, Instructional Practices, Grouping Formats). The score for time
is a percentage based on a teacher’s allocated class time. The score for content is a percentage
TABLE 2
OTL Scores, Definitions, and Quarter Calculations

| OTL Score | Score Definition | Quarter Calculation |
| --- | --- | --- |
| Time on standards^a | Percentage of allocated class time used for instruction on the state-specific academic standards. | Based on 40 logged calendar days |
| Content coverage^a | Percentage of state-specific academic standards addressed for one minute or more. | Based on 40 logged calendar days |
| Cognitive processes^b | Sum of differentially weighted percentages of instructional time dedicated to each cognitive process expectation (Attend and Remember ×1; Understand/Apply, Analyze/Evaluate, and Create ×2). | Based on 8 logged detail days |
| Instructional practices^b | Sum of differentially weighted percentages of instructional time dedicated to each instructional practice (Used Independent Practice and Other Instructional Practices ×1; Provided Direct Instruction, Provided Visual Representation, Asked Question, Elicited Think Aloud, Provided Guided Feedback, and Assessed Student Knowledge ×2). | Based on 8 logged detail days |
| Grouping formats^b | Sum of differentially weighted percentages of instructional time dedicated to each grouping format (Whole Class ×1; Individual and Small Group ×2). | Based on 8 logged detail days |

Note. OTL = opportunity to learn. ^a Score can be calculated based on 1 or more calendar days. A typical week features 5 calendar days. ^b Score can be calculated based on 1 or more detail days. A typical week features 2 random detail days.
based on the total number of standards for a particular subject and grade. The three scores for quality are percentages based on the time dedicated to one of two categories. The weighting for each category is either 1.00 or 2.00. Scores of 1.00 indicate an exclusive focus on lower order thinking skills (i.e., attend, remember), or generic instructional practices (i.e., independent practice, other instructional practices), or whole-class instruction. Scores of 2.00 indicate an exclusive focus on higher order thinking skills (i.e., understand/apply, analyze/evaluate, create), or evidence-based instructional practices (i.e., direct instruction, visual representations, questions, think aloud, guided feedback, reinforcement, assessment), or individual/small-group instruction. Given an allocated class time of 60 min, for example, a teacher who spends 15 min asking students to recall definitions of triangles (0.25 × 1 = 0.25) and 45 min having students create examples of these triangle types (0.75 × 2 = 1.50) would receive a Cognitive Processes score of 1.75. As such, the score is a linear transformation of the percentage of time spent in any of the higher order categories.

The weighting for the two categories is partly intended to prevent potentially negative
user associations with a score of 0. More important, the weighting and use of two categories
for all three quality-related scores is grounded in two operating assumptions: (a) teachers
address a range of cognitive processes, instructional practices, and grouping formats during
the course of their instruction; and (b) teachers who emphasize higher order thinking skills, evidence-based instructional practices, and alternative grouping formats can improve the quality
of students’ opportunity to learn valued knowledge and skills. Although the empirical basis
for these assumptions is insufficient to single out specific processes, practices, or formats,
we decided on a dichotomous grouping for two reasons. First, teachers must move expected
cognitive processes beyond recall to promote a transfer of knowledge (Anderson et al., 2001; Mayer, 2008). As such, teachers emphasizing higher order thinking skills should receive scores
closer to 2.00. Second, given empirical support for evidence-based instructional practices and
grouping formats other than whole class, teachers emphasizing the latter should also receive
scores closer to 2.00.
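The weighted quality-score calculation described above can be sketched as follows. This is a minimal sketch, not the MyiLOGS implementation; the function and category names are hypothetical, but the arithmetic reproduces the worked triangle example from the text.

```python
# Sketch of the two-category weighting for the quality scores:
# lower-order/generic/whole-class categories weigh x1,
# higher-order/evidence-based/small-group categories weigh x2.

def quality_score(minutes_by_category, higher_order):
    """Weighted quality score in [1.00, 2.00].

    minutes_by_category: dict mapping category name -> instructional minutes
    higher_order: set of category names weighted x2; all others weighted x1
    """
    total = sum(minutes_by_category.values())
    if total == 0:
        raise ValueError("no instructional time logged")
    return sum(
        (minutes / total) * (2 if cat in higher_order else 1)
        for cat, minutes in minutes_by_category.items()
    )

# Worked example from the text: of 60 allocated minutes, 15 min of recall
# (lower order, x1) and 45 min of creating examples (higher order, x2):
score = quality_score(
    {"remember": 15, "create": 45},
    higher_order={"understand_apply", "analyze_evaluate", "create"},
)
# (15/60) * 1 + (45/60) * 2 = 0.25 + 1.50 = 1.75
```

Equivalently, because the weights are 1 and 2 and the shares sum to 1, the score is 1 plus the share of time spent in the ×2 categories, which is the linear transformation the text describes.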
Although it is possible to calculate all scores based on a single day of logging, all scores
are intended to be used and interpreted as quarterly and yearly summary scores. Two months of school should yield 40 or more calendar days and 8 or more detail days. The quarter score
calculations are thus based on sets of 40 consecutively logged calendar days for the Time on
Standards and Content Coverage scores and on sets of 8 consecutively logged detail days for the
Cognitive Processes, Instructional Practices, and Grouping Formats scores. With the exception
of the Content Coverage score, quarterly summary scores represent the average percentage across sets of consecutively logged days. The yearly summary score represents the average
percentage across all available log days. Given that the Content Coverage score is calculated
cumulatively, its four quarterly summary scores thus are based on Day 40, 80, 120, and 160,
respectively. For example, a Content Coverage score of 0.10 calculated on Day 40 represents
the first quarter summary score and indicates a teacher covered 10% of the academic standards (for at least one minute) during the first 40 days of logging.
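The cumulative Content Coverage calculation can be sketched as follows. The names are illustrative (not from the instrument); the logic follows the text: a standard counts as covered once it has received at least one minute of instruction, and quarterly summaries are read off at Days 40, 80, 120, and 160.

```python
# Sketch of the cumulative Content Coverage score: the fraction of all
# standards touched for at least one minute so far, sampled at the
# quarterly checkpoint days.

def content_coverage_by_quarter(daily_logs, total_standards,
                                checkpoints=(40, 80, 120, 160)):
    """daily_logs: one dict per calendar day mapping standard id -> minutes.

    Returns {checkpoint_day: coverage fraction} for each checkpoint reached.
    """
    covered = set()
    scores = {}
    for day, log in enumerate(daily_logs, start=1):
        # a standard counts once it has been addressed for >= 1 minute
        covered.update(s for s, minutes in log.items() if minutes >= 1)
        if day in checkpoints:
            scores[day] = len(covered) / total_standards
    return scores

# e.g., 40 logged days cycling over 4 of 40 standards yields the text's
# first-quarter example of 10% coverage at Day 40.
quarters = content_coverage_by_quarter(
    [{"S%d" % (i % 4): 10} for i in range(40)], total_standards=40
)
```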
Score Interpretations
For the purpose of assessing OTL—the extent to which a teacher dedicates instructional time to
cover the content prescribed by intended standards using a range of cognitive processes, instructional practices, and grouping formats—MyiLOGS scores are designed to allow interpretations
about a teacher’s (a) time spent on academic standards; (b) content coverage of academic standards; (c) emphases along a range of cognitive processes, especially lower order versus
higher order thinking skills; (d) emphases along a range of instructional practices, especially
generic versus evidence-based instructional practices; and (e) emphases along a range of
instructional grouping formats, especially whole-class versus more differentiated instructional
groupings based on small groups or individual students.
EVIDENCE BASED ON CONTENT
The five main OTL scores calculated via MyiLOGS were developed on the basis of theory and
the empirical OTL literature and can be used to address all three enacted curriculum dimensions
relevant to the concept of OTL. The initial measure was pilot tested for several months with
a small group of general and special education teachers across three states. Subsequently, we used feedback from teachers and a panel of experts to refine the selection of cognitive
processes and instructional practices. The panel included instructional leaders from three state
departments of education, as well as a team of university researchers and consultants with
expertise in curriculum, measurement, and special education. Given the limitations of end-of-
year summative surveys, the finalized measure allowed teachers to gather OTL data on a daily basis to maximize generalizability to a universe of teaching events across the school year. In
addition, teachers were able to provide OTL data for their overall class and individual students.
As such, the measure can be used to collect a large number of data points for both the overall
class and nested target student—thereby addressing issues of generalizability and instructional
differentiation.
Although the available evidence based on content suggests that MyiLOGS addresses all previously outlined content domains of OTL via the five OTL scores, the selection of scores
related to instructional quality underrepresents instructional resources (Herman et al., 2000), a
consideration in some OTL research. The MyiLOGS teacher profile gathers key information on
instructional resources (e.g., teacher preparation, teaching experience, participation in relevant
in-service education); however, these data do not influence score calculations for instructional quality. Moreover, no information is gathered on material resources (e.g., availability of instructional materials). Next, we describe the methods used to collect and analyze the data under the
remaining evidence categories.
METHOD
Participants
The teacher participant sample featured 38 general and special education teachers from seven
middle schools in Arizona (n = 15 teachers), five middle schools in Pennsylvania (n = 12
teachers), and five middle schools in South Carolina (n = 11 teachers). To be included in the
study, each general and special education teacher had to provide mathematics (MA) and/or
reading (RE) instruction to two eighth-grade students with disabilities. The subject-specific
samples across states were comprised as follows: (a) 19 teachers provided OTL data on 20 MA classes featuring 39 target students, and (b) 23 teachers provided OTL data on 26 RE classes featuring 50 target students. Several teachers logged multiple classrooms (e.g., two different MA classes), and some of the same target students were logged by multiple teachers (e.g., a MA teacher and a RE teacher). Out of the 46 classrooms logged by teachers, 29 were (full-inclusion) general education classes and 17 were (self-contained) special education classes.

The target student sample (N = 56) largely comprised boys and students with learning
disabilities. The Arizona subsample was predominately Hispanic, and the subsamples in Penn-
sylvania and South Carolina were predominately Caucasian and African American. The Arizona
subsample further featured a very large proportion of students on free/reduced lunch. To further
describe the target sample, teachers were asked to rate students’ performance levels in the areas
of MA, RE, motivation, and prosocial behavior via the Performance Screening Guide (Elliott & Gresham, 2008) and students’ academic skills and enablers via the Academic Competence
Evaluation Scales (DiPerna & Elliott, 2000).
The mean level ratings via the Performance Screening Guide across all three states indicated
that the target student sample generally performed at Level 2 (in need of intervention) in both
academic areas and at Level 3 (at risk for problems) in the Motivation to Learn and Prosocial Behavior areas. The mean total scores via the Academic Competence Evaluation Scales further
placed students’ academic skills across all three states in the Developing range (first decile
nationally) and students’ academic enabling behaviors in the Competent range (fourth decile
nationally). The teachers’ low academic ratings of the target student sample were consistent
with students’ below-proficient performance on the previous year’s state test. About 91% of all
participating students performed below proficiency in MA and RE.
Measures and Procedures
MyiLOGS
This online teacher log (www.myilogs.com) features the state-specific academic standards
for various subjects (including Common Core State Standards) and additional customizable
skills that allow teachers to add student-specific objectives (e.g., Individualized Education
Program objectives). The measure therefore allows teachers to document the extent to which
their classroom instruction covers individualized intended curricula. To this end, MyiLOGS
provides teachers with a monthly instructional calendar that includes an expandable sidebar,
which lists all intended objectives for a class. Teachers drag and drop planned standards
that are to be the focus of the lesson onto the respective calendar days and indicate the
approximate number of minutes dedicated to each standard. After the lesson, teachers are
required to confirm enacted standards, instructional time dedicated to each standard, and any
time not available for instruction (due to transitions, class announcements, etc.) at the class
level. In addition, two randomly selected days per week require further documentation. On
these detail days, teachers report on additional time emphases related to the standards listed on
the calendar including cognitive process expectations, instructional practices, grouping formats,
and time not available for instruction. This detailed reporting occurs for the overall class and
individual students along two 2-dimensional matrices. For the first matrix (see Figure 2),
FIGURE 2 Screenshot of the Objective × Cognitive Process matrix.
teachers report on the instructional minutes allocated per standard along five cognitive process
expectations for student learning adapted from the revised version of Bloom’s taxonomy (see
Anderson et al., 2001): Attend, Remember, Understand/Apply, Analyze/Evaluate, and Create.
MyiLOGS also includes an Attend category, which is not part of the revised Bloom’s taxonomy.
The cognitive expectation of Attend allows teachers to differentiate between the expectation
of students (passively) listening to instructional tasks and related instructions and (actively)
recalling information such as a fact, definition, term, or simple procedure. Similar categories
have been used in the context of special education, especially for students with significant
cognitive disabilities (e.g., Karvonen, Wakeman, Flower, & Browder, 2007).
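Conceptually, the detail-day record behind Figure 2 is a two-dimensional tally of minutes per standard and cognitive process. The following sketch is illustrative only; the standard labels and field layout are assumptions, not the actual MyiLOGS schema:

```python
# Minimal sketch of an Objective x Cognitive Process matrix for one
# detail day: minutes allocated per standard per cognitive expectation.
COGNITIVE_PROCESSES = ("Attend", "Remember", "Understand/Apply",
                       "Analyze/Evaluate", "Create")

def new_matrix(standards):
    """One row per standard, one zeroed minute-cell per cognitive process."""
    return {s: {cp: 0 for cp in COGNITIVE_PROCESSES} for s in standards}

matrix = new_matrix(["8.G.1", "8.G.2"])       # hypothetical standard codes
matrix["8.G.1"]["Understand/Apply"] += 25     # 25 min on 8.G.1
matrix["8.G.1"]["Attend"] += 10
matrix["8.G.2"]["Remember"] += 15

# Total instructional minutes logged for the day:
print(sum(m for row in matrix.values() for m in row.values()))  # 50
```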
For the second matrix (see Figure 3), teachers report on the instructional minutes allocated
per instructional practice along three grouping formats. In Table 3, seven instructional practices
are marked by a table note to indicate empirical support on the basis of research syntheses and
meta-analyses (e.g., Gersten et al., 2009; Marzano, 2000; Vaughn, Gersten, & Chard, 2000).
In addition, grouping formats other than whole class also have received empirical support
for improving learning outcomes (see Elbaum et al., 2000). “Other instructional practices”
FIGURE 3 Screenshot of the Instructional Practice × Grouping Format matrix.
TABLE 3
Instructional Practices and Definitions
Instructional Practice Definition
Provided direct instructiona Teacher presents issue, discusses or models a solution approach, and engages
students with approach in similar context.
Provided visual representationsa Teacher uses visual representations to organize information, communicate
attributes, and explain relationships.
Asked questionsa Teacher asks questions to engage students and focus attention on important
information.
Elicited think alouda Teacher prompts students to think aloud about their approach to solving a
problem.
Used independent practice Teacher allows students to work independently to develop and refine
knowledge and skills.
Provided guided feedbacka Teacher provides feedback to students on work quality, missing elements,
and observed strengths.
Provided reinforcementa Teacher provides reinforcement contingent on previously established
expectations for effort and/or work performance.
Assessed student knowledgea Teacher uses quizzes, tests, student products, or other forms of assessment to
determine student knowledge.
Other instructional practices Any other instructional practices not captured by the aforementioned key
instructional practices.
aThis instructional practice has received empirical support across multiple studies.
represents a generic category to allow teachers to report on their entire allocated class time
using the available selection of instructional practices and/or “time not available for instruction.”
Teachers use the latter category to indicate any noninstructional minutes (e.g., transitions,
announcements, fire drills), which together with instructional minutes must add up to the total
allocated class time. The grouping formats were defined as follows: (a) Individual: Instructional
action is focused on individuals working on different tasks; (b) Small Group: Instructional
action is focused on small groups working on different tasks; (c) Whole Class: Instructional
action is focused on the whole class working on the same task. If students are working on the
same 20 math problems on their own, then the grouping format remains Whole Class (i.e., no
task differentiation). If one group is working on defining isosceles triangles and another one
on defining equilateral triangles, then the grouping format is Small Group (i.e., the task was
different by group).
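The task-differentiation rule behind these definitions can be expressed as a small decision function. This is an illustrative reading of the definitions above, not code from MyiLOGS; the input format is assumed:

```python
def grouping_format(groups):
    """Classify a grouping format from task differentiation.

    groups: list of (unit_size, task) pairs, one per working unit.
    The format follows whether tasks differ across units, not seating.
    """
    tasks = {task for _, task in groups}
    if len(tasks) == 1:
        return "Whole Class"       # same task for everyone
    if all(size == 1 for size, _ in groups):
        return "Individual"        # individuals on different tasks
    return "Small Group"           # groups on different tasks

# Students solving the same 20 math problems on their own -> Whole Class
print(grouping_format([(1, "20 math problems")] * 25))
# One group on isosceles, another on equilateral triangles -> Small Group
print(grouping_format([(4, "isosceles"), (4, "equilateral")]))
```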
Training Surveys
At the conclusion of the 3-hr MyiLOGS training session, all participants completed a 9-item survey to provide information on their satisfaction with the training and the software. In
addition, eight months after completion of the data collection phase, all participants were asked
to complete a follow-up survey regarding the utility of MyiLOGS and the associated MyiLOGS
report that summarized their instructional provisions during the previous year. Participants also
were asked to complete a final instructional scenario comparable to the scenarios completed
during the performance assessment to determine maintenance of skills to use MyiLOGS.
SEC
This online survey (www.seconline.org), typically conducted at the end of the year, provides
information on the alignment between intended, enacted, and assessed curricula. The SEC
alignment method hereby relies on content translations by teachers (for purposes of the enacted
curriculum) and curriculum experts (for purposes of the intended and assessed curriculum) who
code a particular curriculum into a content framework that features a comprehensive K–12 list
of subject-specific topics. The SEC content frameworks in MA and RE include 183 and 163
topics, respectively. All content translations occur along a 2-dimensional matrix of topics (e.g.,
multiply fractions) and cognitive demands (e.g., memorize). Teachers report on their enacted
curriculum at the end of the school year by describing different instructional emphases for each
topic and any applicable cognitive expectations using a 4-point scale. As such, instructional
time is not directly assessed via the SEC. To calculate alignment between two content matrices,
the data in each matrix are reduced to cell-by-cell proportions with their sum across all rows
and columns equaling 1.00. Porter’s (2002) alignment index (AI) takes both dimensions (i.e.,
topics and cognitive demands) into consideration when calculating the content overlap between
two matrices according to this formula: AI = 1 − [(Σ|xi − yi|)/2], where xi indicates the cell
proportion in cell i for matrix x and yi indicates the cell proportion in cell i for matrix y. The
index thus ranges from 0 to 1, the latter indicating perfect alignment.
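Porter’s index is straightforward to compute once both matrices are expressed as proportions. A minimal sketch with hypothetical matrices, not SEC data:

```python
def alignment_index(x, y):
    """Porter's (2002) alignment index: AI = 1 - (sum of |xi - yi|) / 2,
    where x and y are content matrices (topic x cognitive demand) whose
    cell proportions each sum to 1.0."""
    diff = sum(abs(xc - yc)
               for xrow, yrow in zip(x, y)
               for xc, yc in zip(xrow, yrow))
    return 1.0 - diff / 2.0

# Identical emphasis profiles align perfectly (hypothetical 2x2 matrices):
same = [[0.25, 0.25], [0.25, 0.25]]
print(alignment_index(same, same))  # 1.0

# Completely disjoint emphases yield 0:
print(alignment_index([[1.0, 0.0], [0.0, 0.0]],
                      [[0.0, 0.0], [0.0, 1.0]]))  # 0.0
```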
Assuming accurate recall, the AI can provide information about the extent to which a
teacher’s enacted curriculum matches the content topics and cognitive expectations expressed
in the academic content standards of the general curriculum. However, the SEC employs several
levels of inference to determine this index. Unlike MyiLOGS, which allows teachers to directly
report on instructional time and content coverage allocated to state-specific standards, the SEC
relies on (a) expert judgment to translate the state-specific standards into a content matrix and
(b) teacher judgment to translate their enacted curricula into a second set of content matrices.
Only the subsequent comparison of both matrices ultimately determines the AI. An additional
limitation of the AI as an OTL proxy stems from the overlap calculation at the intersection
of topic and cognitive demand, which does not offer separate scores for content coverage and
cognitive process expectations. A teacher who emphasized the same topics indicated by the
standards thus can receive an AI of 0 if the topics were emphasized at a different category of
cognitive demand from what the standards prescribed. Given that MyiLOGS scores provide
separate information on key OTL indices that are otherwise combined in the AI, we expected
small correlations (r < .3) among scores. We note that the SEC can yield data that allow for
alignment calculations at the marginal for topic or cognitive demand only (Polikoff, Porter, &
Smithson, 2011). For purposes of this study, we relied on the commonly used AI, which was
calculated based on the aforementioned formula using enacted curriculum matrices (established
by participants) and the respective intended curriculum matrices for their state, subject, and
grade. The latter matrices were provided directly through the Measures of the Enacted
Curriculum Project at the Wisconsin Center for Education Research. As such, they were established
using multiple raters and standard SEC methods (Porter, Polikoff, Zeidner, & Smithson, 2008).
State Tests
In three states, paper-and-pencil assessments designed to measure student achievement of
state standards were used to provide summative data on the extent to which students have
achieved the academic standards of the general curriculum for 8th-grade MA and RE: the
Arizona Instrument to Measure Standards, the Pennsylvania System of School Assessment,
and South Carolina’s Palmetto Assessment of State Standards. Given previous associations
between the five OTL indices and student achievement, we expected medium correlations
(r > .3) between classwide OTL scores and class achievement.
Training Procedures
Each teacher received the standard professional development on the use of MyiLOGS,
which focused sequentially on four elements: worked example (15 min), guided practice (1
hr), performance assessment (45 min to 1 hr), and independent practice (1 hr). For purposes
of the performance assessment, teachers had to pass a sequence of tests. These tests featured
written instructional scenarios that summarized typical lessons. Teachers had to correctly log the
instructional scenario via MyiLOGS. Teachers had to pass two scenarios with 100% accuracy
to be able to continue in the study.
To ensure accurate use of the SEC, the lead author worked with the director of the Measures
of the Enacted Curriculum Project at the Wisconsin Center for Education Research to develop
a training video that reviewed the online completion procedures and logging conventions of the
SEC. The 30-min video also reviewed the similarities and differences of the cognitive process
expectations between the SEC and MyiLOGS. Prior to using the SEC, all participants had to
review the training video.
Study Procedures
Personnel in each state began the recruitment process at the beginning of the 2010–2011
school year. The trainings were implemented in Arizona during the months of September and
October followed by Pennsylvania and South Carolina. Of 41 recruited teachers, 38 could be
trained to criterion during the allotted training time. All participants were compensated for their
time spent on study-related tasks. Each teacher received a $150 honorarium for participation in
the MyiLOGS training and $100 per month for using MyiLOGS to report on daily classroom
instruction. The monthly compensation was contingent on timely completion of MyiLOGS,
which was monitored through biweekly procedural fidelity checks. The required logging period
for all participants was 4 full months after the teacher training, with the option to continue
through the month of April 2011. At the end of the school year, participants further completed the
SEC.
Observation Procedures
Each teacher participant was observed at least once during his or her logging period. An
additional 20% of teachers were randomly selected to receive a total of three observations,
resulting in 51 observations across all 38 participants. Trained observers used an observation
form that mirrored the two 2-dimensional matrices used in the MyiLOGS software to code the
dominant cognitive process per standard and the dominant instructional practice per grouping
format observed during a 1-min interval. For training purposes, observers had to obtain an
overall agreement percentage of 80% or higher on two consecutive 30-min sessions. A vibrating
timer on a fixed interval was used to indicate the 1-min recording mark. Interobserver agreement
was collected on about 30% of all observation sessions across states. All observation sessions
lasted for the entire class period.
For agreement purposes, cell-by-cell agreement was calculated for each matrix based on cell
estimates within a 3-min range or less. For each matrix, interrater agreement was calculated
as the total number of agreements divided by the sum of agreements and disagreements. In
addition, an overall interrater agreement percentage was calculated as the total number of
agreements across both matrices divided by the sum of agreements and disagreements across
both matrices. The latter index was used in establishing the training criterion (at or above 80%)
and retraining criterion (below 80%) for observers. Agreement percentages between observers
as well as teachers and observers are reported in the Results section.
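The agreement indices described above reduce to simple ratios. A minimal sketch, with hypothetical agreement/disagreement counts:

```python
def interrater_agreement(agreements, disagreements):
    """Percentage agreement for one matrix:
    agreements / (agreements + disagreements) * 100."""
    return 100.0 * agreements / (agreements + disagreements)

def overall_agreement(matrix_counts):
    """Pool agreements and disagreements across matrices before dividing,
    as described for the overall interrater index."""
    a = sum(c[0] for c in matrix_counts)
    d = sum(c[1] for c in matrix_counts)
    return 100.0 * a / (a + d)

# e.g., 45/5 agreements/disagreements in matrix 1 and 40/10 in matrix 2:
print(interrater_agreement(45, 5))             # 90.0
print(overall_agreement([(45, 5), (40, 10)]))  # 85.0
```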
Design and Data Analysis
Data for the usability and consequences related to MyiLOGS were collected using a teacher
survey and reported using descriptive statistics. The reliability of MyiLOGS scores was
estimated via Pearson correlations between summary scores based on randomly selected sets
of days and yearly summary scores as indicators of precision (i.e., how precisely can yearly
summary scores be estimated using smaller sets of randomly sampled days) and via Pearson
correlations between quarterly summary scores as indicators of stability (i.e., how stable are
quarterly summary scores from one set of consecutively logged days to the other). The validity
of inferences drawn from MyiLOGS scores was characterized using multiple forms of evidence,
as suggested by the Standards for Educational and Psychological Testing (AERA, APA, &
NCME, 1999). Evidence based on response processes was indicated using descriptive statistics
about the degree to which teachers appropriately logged information in the measure. Evidence
based on internal structure was indicated by a matrix of Pearson correlations among all five
major OTL scores. Evidence based on relations to other variables was indicated by correlations
between MyiLOGS indices and data from direct observations, the SEC, and class achievement
test scores.
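The precision analysis described here (see also the note to Table 6: yearly scores are computed without the sampled days, and correlations are averaged over 10 repeated random samples) can be sketched as follows. The function and data layout are assumptions, not the authors’ analysis code:

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient (plain-Python helper)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / (vx * vy) ** 0.5

def precision_correlation(daily_scores, n_days, n_resamples=10, seed=1):
    """Mean correlation between summary scores from n_days randomly
    sampled log days and yearly scores computed from the remaining
    days, averaged over repeated resamples."""
    rng = random.Random(seed)
    rs = []
    for _ in range(n_resamples):
        sampled_means, yearly_means = [], []
        for days in daily_scores:            # one score list per teacher
            picked = set(rng.sample(range(len(days)), n_days))
            sampled = [days[i] for i in picked]
            rest = [d for i, d in enumerate(days) if i not in picked]
            sampled_means.append(sum(sampled) / len(sampled))
            yearly_means.append(sum(rest) / len(rest))
        rs.append(pearson(sampled_means, yearly_means))
    return sum(rs) / len(rs)
```

With synthetic per-teacher daily scores whose between-teacher differences dominate day-to-day noise, small sampled sets already estimate the yearly score closely, mirroring the pattern reported for Table 6.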
RESULTS
Usability and Evidence Based on Response Processes
Nielsen (1994) defined usability along five quality components that were applied to the
MyiLOGS software: (a) learnability (i.e., ease of logging), (b) efficiency (i.e., logging time once
trained), (c) memorability (i.e., ease of reestablishing proficiency after a period of nonuse),
(d) errors (i.e., frequency and severity of errors), and (e) satisfaction. Evidence that supports
learnability and low error rates is based on the fact that 93% of users could be trained to an
error-free criterion in about 3 hr of training. The software’s learnability is further supported
by post-training and follow-up survey results (see Table 4) related to understanding the system
and being able to use it reliably (Posttraining Questions 3, 4, 7, and 9).
Evidence related to efficiency and response processes was based on website user statistics,
which indicated that participants completed their logs concurrently with their instructional
efforts as intended, two to three times per week, with a relatively small time investment.
TABLE 4
Posttraining and Follow-up Survey Results
Posttraininga
Follow-Up
(n = 26)
Question
No. Question Stem M SD M SD
1 Professional development related to the content standards
is important for promoting effective instruction.
5.8 0.4 5.6 0.6
2 Comprehensive, high-quality coverage of the content
standards is an important part of effective instruction.
5.8 0.4 5.6 0.6
3 The MyiLOGS training was helpful for understanding how
to use the system.
5.9 0.3 5.4 0.7
4 Based on the MyiLOGS training, I was prepared to use the
system reliably.
5.5 0.5 5.3 0.8
5 An online version of this training (e.g., webinar) could
have been equally effective.
3.2 1.5 3.9 1.4
6 I think MyiLOGS can support my comprehensive,
high-quality coverage of the content standards.
5.6 0.6 5.2 0.7
7c The MyiLOGS training scenarios were helpful for
understanding how to use the system.
5.9 0.4
8c Overall, I think the trainers were well prepared. 5.9 0.4
9c Overall, I think the training time was sufficient for
understanding how to use the system.
5.7 0.5
10d The charts and tables of the MyiLOGS Report provided
meaningful information about my instruction.
5.3 0.7
11d I would use the MyiLOGS Report feedback during the
school year to improve my instruction.
5.2 0.8
12d I think MyiLOGS Instructional Growth Plan could be
helpful as a professional development tool.
5.2 0.8
13d Using MyiLOGS substantially increased my self-reflection
and awareness of how and what I was teaching.
5.3 0.8
Note. 1 (strongly disagree), 2 (disagree), 3 (somewhat disagree), 4 (somewhat agree), 5 (agree), 6 (strongly agree).
a n = 41. b n = 26. c Posttraining-only question. d Follow-up-only question.
Specifically, the website tracked teachers’ average number of log-ins per week (excluding
holidays and other school breaks) as well as their active logging time per week. On average,
participants logged into MyiLOGS 2.4 times per week (SD = 0.6) and clocked about 5.9 min
per week (SD = 1.4) of active logging time. In addition, their log completion was monitored
on a biweekly basis. Completion checks were based on completed calendar days as well as
detail days for both the overall class and target students. A total of 15 checks were completed
during 30 weeks of instructional logging. On average, 92% of classrooms per check were
logged without any missing calendar or detail day information. Following e-mail prompts, all
teachers completed their missing data prior to the next check. The final instructional data set
was 100% complete.
Posttraining and follow-up survey results further supported user satisfaction (see Table 4
Questions 6, 10, 11, and 13) with the majority of participants being in agreement that the
software supported their instruction. Evidence for memorability was established via a retest
scenario completed by 26 participants 8 months after completion of the study (more than 14
months after the initial training). Across states, 100% of respondents completed the calendar
level correctly (i.e., entries for the classwide time and content scores), 92% completed the class
details correctly, and 91% and 82% completed Target Student 1 and 2 correctly, respectively. Based on the
initial performance assessment standard, 10 out of 26 respondents (38%) maintained criterion
level performance of 100% accuracy across all categories (i.e., entries for the classwide and
student-specific time, content, and quality scores).
MyiLOGS Scores and Estimates of Their Reliability
The mean scores and standard deviations for the quarterly and yearly summary scores are
documented in Table 5. Quarterly summary scores were calculated only if all log days per
quarter were completed (see Table 2). Yearly summary scores were calculated based on all
completed log days. The means for the three quality scores were very consistent across quarters.
Time on Standards decreased across quarters especially from the second to the third quarters.
Content Coverage increased most notably during the first two quarters with only minor increases
in subsequent quarters.
The precision and representativeness with which summary scores based on randomly
sampled sets of log days can estimate teachers’ yearly summary scores generally increased with
larger sets of randomly sampled days (see Table 6). All five summary scores based on 10
randomly sampled days correlated with their respective yearly summary scores above 0.80.
For Time on Standards and Content Coverage, summary scores based on 30 randomly sampled
log days yielded fairly precise estimates of teachers’ yearly summary scores with correlations
close to or above 0.90 and diminishing returns for larger samples of log days thereafter. For the
three quality scores, summary scores based on 10 randomly sampled days appeared to yield
reasonably precise estimates of teachers’ yearly summary scores and diminishing returns for
larger samples of log days thereafter.
The correlations among quarterly summary scores provide information about the stability
of these scores. As documented in Tables 7 and 8, the correlations among the same MyiLOGS
TABLE 5
Means and Standard Deviations for Quarterly and Yearly Summary Scores
Quarter 1 Quarter 2 Quarter 3 Quarter 4
Yearly
Summary
Score M SD M SD M SD M SD M SD
Time on standards 0.75 0.23 0.74 0.19 0.60a 0.21a 0.60b 0.19b 0.68 0.18
Content coverage 0.33 0.22 0.50 0.22 0.62a 0.21a 0.64b 0.14b 0.68 0.22
Cognitive processes 1.74 0.16 1.73 0.16 1.73a 0.17a 1.75c 0.19c 1.74 0.14
Instructional processes 1.66 0.18 1.60 0.21 1.61a 0.21a 1.63c 0.18c 1.62 0.19
Grouping formats 1.27 0.22 1.24 0.24 1.23a 0.21a 1.19c 0.22c 1.25 0.22
Note. N = 46 unless otherwise noted. a N = 44. b N = 12. c N = 38.
TABLE 6
Correlations Between Summary Scores Based on Sets of Randomly Sampled Days and
Yearly Summary Score
No. of Randomly Sampled Days
Score Ten Twenty Thirty Forty Fifty
Time on standards .83 .89 .92 .93 .94
Content coverage .83 .85 .87 .86 .87
Five Ten Fifteen Twentya Twenty-Fivea
Cognitive processes .73 .81 .85 .87 .84
Instructional practices .82 .88 .89 .81 .79
Grouping formats .89 .92 .93 .91 .89
Note. All correlations represent mean correlations based on 10 repeated random samples. All
yearly summary scores were calculated without including the respective sets of randomly selected
days. N = 46 unless otherwise noted. a N = 42. All correlations p < .05.
quarterly score over time were generally moderate to high for contiguous quarters and low
to moderate for noncontiguous quarters. Overall, these correlations decreased in magnitude
from the first to fourth quarter, suggesting a change in these instructional indicators as the
school year progresses. Specifically, the stability of quarterly summary scores for Time on
Standards and Content Coverage decreased from one quarter to the other. Correlations related
to the fourth quarter must be interpreted with caution, because only 12 teachers featured data
sets with 160 log days. For the three quality scores, the patterns between Quarter 1 and 2, 2
and 3, and 3 and 4 are fairly consistent. The first and second quarters are moderately stable,
whereas the correlations between the remaining quarters show greater stability with correlations
above 0.50.
TABLE 7
Correlations Among Quarterly Summary Scores for Time and Content
Quarter 1
(Day 1–40)
Quarter 2
(Day 41–80)
Quarter 3a
(Day 81–120)
Quarter 4b
(Day 121–160)
TS CC TS CC TS CC TS CC
Quarter 1 — —
Quarter 2 .88 .94 — —
Quarter 3a .43 .76 .66 .83 — —
Quarter 4b .41c .36 .69 .35 .89 .88 — —
Note. Quarters 1, 2, 3, and 4 summary scores are based on the first, second, third, and fourth set of 40 consecutively
logged calendar days, respectively. N = 46 unless otherwise noted. TS = Time on Standards; CC = Content Coverage.
a N = 44. b N = 12. All correlations except c are significant at p < .05.
TABLE 8
Correlations Among Quarterly Summary Scores for Quality
Quarter 1
(Day 1–8)
Quarter 2
(Day 9–16)
Quarter 3a
(Day 17–24)
Quarter 4b
(Day 25–32)
CP IP GF CP IP GF CP IP GF CP IP GF
Quarter 1 — — —
Quarter 2 .56 .65 .70 — — —
Quarter 3a .48 .71 .40 .73 .78 .78 — — —
Quarter 4b .45 .64 .25c .69 .51 .56 .83 .57 .68 — — —
Note. Quarters 1, 2, 3, and 4 summary scores are based on the first, second, third, and fourth set of 8 consecutively
logged detail days, respectively. N = 46 unless otherwise noted. CP = Cognitive Processes; IP = Instructional
Practices; GF = Grouping Formats. a N = 44. b N = 38. c All correlations except this one are significant at p < .05.
Internal Structure
Initial evidence for the internal structure of MyiLOGS was provided by the inter-correlations
among the five OTL scores. As indicated in Table 9, the correlations between 4 of 10 score
pairs were low, falling at or below 0.30. None of the correlations exceeded 0.43. Thus in all
cases, the shared variance between any pair of scores was less than 18%, suggesting that each
of the five scores provides relatively unique information regarding instruction.
Relations to Other Variables
To describe relations to other variables, we examined the extent to which the OTL scores were
related to the SEC AI. Given that the AI is based on a teacher’s report for their overall class, we
used the calendar-based MyiLOGS OTL scores for Time on Standards and Content Coverage,
which also refer to the overall class. The three quality scores were based on classwide detail
days. Second, we examined the relations between the scores and average class achievement
on the Arizona state test for the 15 participating teachers—the only state that provided class-specific achievement data for all students in participating classrooms. Last, we calculated the
TABLE 9
Correlations Among Yearly Summary Scores
Time on
Standards
Content
Coverage
Cognitive
Processes
Instructional
Practices
Grouping
Formats
Time on standards —
Content coverage .36 —
Cognitive processes −.16a .14a —
Instructional practices .41 .32 −.36 —
Grouping formats −.33 −.30 −.15a −.43 —
Note. N = 46. All correlations except those marked a are significant at p < .05.
extent to which teacher log data were in agreement with the log data of independent observers
recording the same lesson.
The correlational data did not support meaningful relations between the SEC AI and any
of the five OTL scores (see Table 10). Controlling for state (i.e., dummy codes for AZ, PA,
and SC) and subject (i.e., dummy codes for MA and RE), a regression model that included
all yearly OTL scores to predict AI resulted in partial correlations that failed to account for
more than 1% of shared variance with p exceeding .05 in all cases. The predictive analyses
indicated that one time-based and two quality-based MyiLOGS scores were related to average
class achievement. Specifically, the correlation between the yearly summary score for Time
on Standards and class achievement was r = .56, p < .05, accounting for about 31% of the
variance in average class achievement. The correlation between the yearly summary score for
Cognitive Processes and class achievement was r = .64, p < .05, accounting for about 41%
of the variance in average class achievement. Last, the correlation between the yearly summary
score for Grouping Formats and class achievement was r = −.71, p < .05, accounting for
about 50% of the variance in average class achievement.
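These variance-accounted-for figures are simply the squared correlations; a quick arithmetic check (illustrative only, not the authors’ analysis script):

```python
# Variance accounted for is r squared, matching the reported percentages.
for label, r in [("Time on Standards", 0.56),
                 ("Cognitive Processes", 0.64),
                 ("Grouping Formats", -0.71)]:
    print(f"{label}: r^2 = {r * r:.2f}")
# Time on Standards: r^2 = 0.31
# Cognitive Processes: r^2 = 0.41
# Grouping Formats: r^2 = 0.50
```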
To estimate the extent to which teachers’ log data represented a valid account of their
classroom instruction, we calculated agreement percentages between teachers and independent
observers on the basis of detail days at the class level related to five cognitive process
expectations per standard and nine instructional practices per grouping format. Across sessions,
agreement between teachers and observers for cognitive processes per standard ranged between
27% and 100% with an average of 63%. Across sessions, agreement for instructional practices
per grouping format ranged between 64% and 100% with an average of 82%. Overall agreement
between teachers and observers across sessions ranged between 55% and 100% with an average
of 77%. In the context of prior validity research using teacher logs, Camburn and Barnes (2004)
reported agreement percentages between teachers and observers that ranged between 37% and
75% with an average agreement of 52%. The current findings are consistent with prior research,
which also featured only one subject-specific observation per teacher.
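Percent agreement of this kind reduces to matches over comparisons. A minimal sketch with hypothetical codes (the actual MyiLOGS categories and agreement conventions are more detailed than this):

```python
# Percent agreement between two coders: matched codes / total comparisons.
def percent_agreement(teacher_codes, observer_codes):
    matches = sum(t == o for t, o in zip(teacher_codes, observer_codes))
    return matches / len(teacher_codes)

# Hypothetical codes for five cognitive-process expectations on one standard:
# 1 = emphasized during the session, 0 = not emphasized.
teacher = [1, 0, 1, 1, 0]
observer = [1, 0, 0, 1, 0]
print(percent_agreement(teacher, observer))  # 0.8
```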
Interobserver agreement was collected on more than 30% of all observation sessions between
two trained observers. Across sessions, agreement between two independent observers for cognitive processes per standard ranged between 67% and 100% with an average of 93%.
Across sessions, agreement for instructional practices per grouping format ranged between
TABLE 10
Partial Correlations Between Opportunity to Learn Scores and SEC Alignment Index

Score                       SEC Alignment Index
Time on standards                    .06
Content coverage                    −.07
Cognitive processes                 −.04
Instructional practices              .05
Grouping formats                    −.11

Note. N = 46. All correlations p > .05. SEC = Surveys of Enacted Curriculum.
Downloaded by [University of Winnipeg] at 11:31 11 September 2014
178 KURZ ET AL.
89% and 100% with an average of 98%. Overall agreement between two observers across
sessions ranged between 85% and 100% with an average of 97%.
Consequences of Using the Measure
Results from the posttraining (Question 6) and follow-up survey (Questions 10, 11, and 13)
indicated that the use of MyiLOGS was associated with several intended consequences (see
Table 4). On average, teachers agreed that MyiLOGS was useful for supporting their comprehensive, high-quality coverage of the content standards and increasing their self-reflection. Their agreement for Question 6 decreased only slightly upon completion of the study and a
substantial period of nonuse. At the end of the study, teachers were also allowed to review
graphical representations of their instructional data via the MyiLOGS Report, which features
more than a dozen charts and tables. The responses provided support that teachers found these
graphical reports meaningful for improving their instruction.
DISCUSSION
Empirical information about students’ opportunity to learn the intended curriculum is critical
to instructional equity, access to the general curriculum, testing fairness, and the validity of test score inferences about teacher instruction. Despite federal directives that mandate instruction
based on the content that students are expected to know, few, if any, OTL measurement
options have been available that can be deployed at scale daily or weekly. In addition, the
potential for programmatic research leading to interventions that target malleable factors of
instruction rests upon sound conceptualization, operationalization, and measurement of OTL.
This study summarized initial evidence supporting intended score interpretations for the purpose of assessing OTL via MyiLOGS, an online teacher log. As discussed next, this initial evidence
is limited but promising for the assessment of OTL at scale.
Major Findings
Educational technologies for use by teachers must be able to demonstrate usability in authentic
educational settings (i.e., classrooms) with actual users (i.e., teachers) before evidence of reliability and validity becomes relevant. That is, teacher self-report measures that yield reliable OTL
scores and permit valid inference about teacher instruction are of little practical value if teachers
are not able to meaningfully and efficiently integrate them into their daily instructional practices.
To this end, we established usability evidence with teachers via performance assessments, user
surveys, and an 8-month posttest. These data indicated that MyiLOGS users can be trained to criterion within a relatively short time and that their proficiency can be maintained across a
substantial period of nonuse. Users who responded to our follow-up survey further agreed that
MyiLOGS provided valuable personalized feedback that could be used to improve instruction.
Available evidence in support of usability, however, remains limited to survey and user integrity
data based on a volunteer sample compensated for participation. To strengthen evidence for
usability, we recommend additional survey questions that address the issue of usability more
directly (e.g., ease of use, burden of daily use, importance of collecting OTL data) using paid
as well as unpaid users.

The quarterly summary score means remained very consistent for the Cognitive Processes,
Instructional Practices, and Grouping Formats scores. The quarterly summary score means for Time on Standards were similar for the first and second quarters as well as for the
third and fourth quarters. Given that the Content Coverage score is cumulative, the increases
in means during the first three quarters were not surprising.

Initial evidence for the reliability of MyiLOGS scores indicated that summary scores based
on randomly sampled sets of 20 log days can function as reliable estimates of teachers’
respective yearly summary scores. This finding thus suggests that, depending on measurement
purpose (i.e., descriptive, formative), logging across the majority of the school year may not
be necessary. Estimates of score stability using quarterly summary scores for each MyiLOGS
scale ranged from high to moderate depending on the time span covered, with contiguous score quarters being more stable than noncontiguous score quarters.
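The 20-day sampling approach can be sketched as follows, assuming a hypothetical 180-day log of daily Time on Standards minutes (the distribution and day count are illustrative, not the study’s data):

```python
import random
import statistics

# Hypothetical 180-day log of daily Time on Standards minutes.
# Distribution parameters are illustrative only, not the study's data.
random.seed(7)
yearly_log = [max(0.0, random.gauss(35, 10)) for _ in range(180)]
yearly_score = statistics.mean(yearly_log)

# Summary score from a randomly sampled set of 20 log days, as in the
# reliability analysis described above: the sampled mean serves as an
# estimate of the yearly summary score.
sampled_score = statistics.mean(random.sample(yearly_log, 20))
print(yearly_score, sampled_score)
```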
Several sources of evidence were collected and used to test the validity of intended score
interpretations. First, we found that the five OTL scores along the three enacted curriculum
dimensions of time, content, and quality provided relatively independent information, accounting for little variance among the respective scores. This finding is consistent with the proposed theoretical model of OTL. Second, we found that the OTL scores were not related to the SEC’s
index of curricular alignment. Given that MyiLOGS scores provide separate information on key
OTL indices that are otherwise combined in the AI, the virtually nonexistent correlations were
expected. As predicted, the AI establishes content overlap between the intended curriculum
(i.e., standards) and the enacted curriculum (i.e., instruction) on the basis of emphasizing the same topics using the same categories of cognitive demand. In other words, the SEC
considers emphases of content coverage and cognitive processes conjunctively for purposes
of calculating alignment between what is taught and what is expected. MyiLOGS, on the
other hand, provides separate scores for Time on Standards, Content Coverage, and Cognitive
Processes. To establish (convergent) evidence related to other variables in future studies, we
suggest the use of measures that separately assess the constructs in question: instructional time, content coverage, and instructional quality. Vannest and Parker (2010), for example, have
collected data on teachers’ instructional time via the Teacher Time Use observation instrument
that should yield convergent evidence supporting interpretations based on the Time on Standards
score. In addition, SEC data based on the Survey of Instructional Practices and the Survey of
Instructional Content could be combined to calculate indices that match the five OTL scores of MyiLOGS, thus offering the possibility of collecting evidence of convergent validity.
From a predictive validity perspective, the moderate to large correlations that the yearly
summary scores for Time on Standards and Cognitive Processes shared with class achievement
were expected, given that students are more likely to perform well on standards that they are taught and that higher order cognitive processes can be expected more readily of students who are higher achieving. The very large negative correlation between Grouping Formats and class
achievement, however, was unexpected. One hypothesis is that small groups and individualized
instruction are often used in response to students struggling to learn academic content—
especially in the context of special education, which was a focus in this sample. Given the
small sample size and no control for students’ prior achievement, these initial correlations must
be considered preliminary and interpreted with caution.
Evidence based on classroom observations by independent observers is critical when examining the validity of score interpretations based on self-report. We found that teacher log data and observer log data provided similar accounts for the same lesson. The area of greatest
disagreement between teachers and observers involved the enacted cognitive processes per
standard. It should be noted that the observation protocol required observers to allocate instructional and noninstructional minutes for the entire allocated class time by recording cognitive processes per standard and instructional practices per grouping format. As such, teachers and observers were able to disagree on the basis of instructional time, enacted standards,
cognitive processes, instructional practices, and grouping formats. In addition, agreements and
disagreements were not categorical but rather based on time with a 3-min confidence band.
Compared to the agreement percentages reported by Camburn and Barnes (2004), the overall
agreement percentages reported in this study were higher. To strengthen evidence related to
log data from external observers, we recommend multiple observations per teacher, ideally in the context of a generalizability study that can account for multiple sources of variance in
observational scores (Hill, Charalambous, & Kraft, 2012).
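The 3-min confidence band amounts to a simple tolerance check on recorded minutes; a minimal sketch with hypothetical values:

```python
# Time-based agreement with a +/-3-minute tolerance band, as in the
# convention described above. Minute values are hypothetical.
def within_band(teacher_minutes, observer_minutes, band=3):
    return abs(teacher_minutes - observer_minutes) <= band

print(within_band(12, 14))  # True: a 2-minute difference falls inside the band
print(within_band(12, 17))  # False: a 5-minute difference falls outside
```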
Consequential validity evidence was based on survey data, which supported the notion that
the self-recording and self-monitoring required to use MyiLOGS, especially in conjunction with
the MyiLOGS Report, had some formative instructional benefits. That is, most teachers reported that using MyiLOGS increased self-reflection and awareness regarding their own instruction
and that the MyiLOGS Report provided meaningful instructional information. However, the
findings are limited by the fact that 32% of teachers did not respond to the follow-up survey.
It is unclear if the nonresponding teachers’ responses would have been less or more supportive
of intended benefits than those of the responding teachers. To strengthen evidence based on consequences, we recommend survey questions that directly assess the intended consequence
of MyiLOGS’s formative instructional benefits such as the extent to which teachers actually
changed their instruction following a review of their MyiLOGS data. Strong evidence would consist of efficacy data based on an experimental design with teachers randomly assigned to a treatment group using MyiLOGS or a well-documented control group.
Evidence for Intended Score Interpretations
Confirmation bias is a particular concern for developers of measurement tools (e.g., Haertel,
1999; Ryan, 2002). In the present study, we tried to address this concern by making our outcome
expectations clear prior to data collection, including teachers’ input and feedback, and using an additional measure (i.e., the SEC) that was expected to function differently from MyiLOGS.
That being said, the existing evidence is preliminary and limited in support of the intended
score interpretations related to a teacher’s time use, content coverage, and emphases along
a range of cognitive processes, instructional practices, and grouping formats; thus restricting
current uses of MyiLOGS to low-stakes purposes for teachers and researchers.

First, available evidence supports the contention that MyiLOGS scores provide separate
information on time, content, and aspects of instructional quality related to cognitive processes,
instructional practices, and grouping formats. This finding is consistent with the underlying
theoretical model of OTL, which MyiLOGS is designed to address. However, it should be
noted that the Content Coverage score is based on an adjustable time threshold. For this
study, the time threshold was set to 1 min. Increasing the threshold to 60 min is likely to
increase the relation between time and content. Second, the scores related to instructional
quality are intended to yield interpretations about the extent to which teachers emphasized a range of cognitive processes, instructional practices, and grouping formats. However, current
scoring conventions make dichotomous distinctions (e.g., lower order vs. higher order thinking
skills), which do not permit score interpretations about range. In fact, a Cognitive Processes
score of 2.00 can be the result of a teacher exclusively emphasizing Understand/Apply or a
teacher emphasizing all three higher order cognitive processes. This limitation, however, could be resolved by considering different methods for calculating the respective scores. Third, the
most important avenues for strengthening evidence to support intended score interpretations are
(a) providing strong relations with separate criterion measures of instructional time, content
coverage, and instructional quality, and (b) supplying improved interobserver data in terms of breadth (i.e., more observations per teacher), depth (i.e., a generalizability study), and precision (i.e., discrete agreement decisions for aspects of time, content, and quality).
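The range problem noted above can be illustrated under an assumed scoring convention (a time-weighted mean over ordered categories; the article does not specify the actual MyiLOGS formula, and the 1–4 coding below is hypothetical): two very different emphasis profiles produce the same score.

```python
# Hypothetical convention: cognitive-process categories ordered 1..4
# (an assumed coding, not MyiLOGS's actual scoring rules); the score is
# a time-weighted mean of the category levels.
def cp_score(minutes_by_level):
    total = sum(minutes_by_level.values())
    return sum(level * m for level, m in minutes_by_level.items()) / total

only_level_two = cp_score({2: 60})         # all 60 min at level 2
spread_profile = cp_score({1: 30, 3: 30})  # levels 1 and 3 equally
print(only_level_two, spread_profile)      # both 2.0: the mean masks range
```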
Limitations
The initial evidence for the use of MyiLOGS and the validity of its score inferences was based
on a relatively small sample of general and special education middle school teachers. These teachers were volunteers from three states where different content standards and summative
statewide achievement tests were in use. No attempt was made to secure a representative sample
of teachers to participate in the study. This sample issue is a clear limitation to generalization
and needs to be addressed in future studies.
Another limitation stems from two methodological challenges related to the observation system. Given the possibility that a teacher can address all cognitive processes and instructional
practices in one lesson, the observation protocol allowed any categories that were neither
reported by the teacher nor observed by the observer to be counted as an agreement. This
convention may have contributed to inflated agreement percentages in certain cases. A second
methodological challenge of the observation system was the varying cell sizes by which
agreement percentages were calculated. Depending on the number of standards/objectives per lesson, the possible number of agreements/disagreements varied from teacher to teacher.
This prevented the application of alternative agreement statistics, such as kappa, which could have accounted for chance agreement. Last, the majority of teachers were observed only once; coupled with possible reactivity to being observed, the observed session may not have been representative of a teacher’s typical logging practices. All observations were used to establish agreement percentages for the class only. The extent to which similar agreement percentages
can be established for student-level information remains unknown but needs to be investigated,
particularly if one’s purpose is to understand access to the general curriculum for all students.
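For comparison, Cohen’s kappa, the chance-corrected statistic the varying cell sizes precluded here, can be sketched for a fixed 2 × 2 case with hypothetical counts:

```python
# Cohen's kappa for a 2x2 coincidence table with hypothetical counts.
# a = both coders said yes, d = both said no, b and c = disagreements.
def cohens_kappa(a, b, c, d):
    n = a + b + c + d
    p_obs = (a + d) / n  # observed (percent) agreement
    # Chance agreement from the coders' marginal proportions.
    p_yes = ((a + b) / n) * ((a + c) / n)
    p_no = ((c + d) / n) * ((b + d) / n)
    p_chance = p_yes + p_no
    return (p_obs - p_chance) / (1 - p_chance)

print(round(cohens_kappa(20, 5, 10, 15), 2))  # 0.4: 70% raw agreement, 50% by chance
```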
Implications for Future Research and Practice
A continued program of research on the uses and psychometric properties of MyiLOGS in a
wide range of grades and types of classrooms would enhance the current findings. Particular areas
for future research on technical characteristics include interrater reliability, validity evidence
based on content, and validity evidence based on relations to other variables, in particular
alignment indices and interim measures of student achievement. Using samples of classrooms
that are co-taught, the interrater reliability of the measure could be calculated based on Pearson
correlations between scores generated by the two teachers. To evaluate content validity, a group of experts on OTL (including teachers, assessment leaders, and researchers, among
others) could independently evaluate MyiLOGS to determine which of its indices are critical
for measuring its constituent constructs. Also, direct observation is often considered the “gold
standard” against which all other measures of teaching are judged. Although the current study
includes relations between MyiLOGS and direct observation, future research could include more observations or video coding to increase the reliability and validity of inferences drawn
from the criterion measure. Observations may also be used in the context of a multitrait–
multimethod study, which allows for the examination of convergent and discriminant validity
coefficients. Last, another study of relations to other variables could examine whether students
who have experienced low OTL (across various indices and content areas) respond differently
to increased opportunities, through intervention or instructional programming changes, than do similarly performing students who have experienced high OTL. Evidence that supports
the claim that increases in OTL lead to improved performances among those who previously
experienced low OTL would strongly support MyiLOGS as a measure of the constructs.
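The co-teacher reliability check proposed above is an ordinary Pearson coefficient over paired scores; a minimal sketch with hypothetical values:

```python
import math

# Interrater reliability for a co-taught class: Pearson correlation between
# the two co-teachers' MyiLOGS scores for the same periods (values are
# hypothetical, e.g., weekly Time on Standards minutes).
def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

teacher_a = [34.0, 41.5, 28.0, 39.0, 45.5]
teacher_b = [36.0, 40.0, 30.5, 37.5, 44.0]
r = pearson_r(teacher_a, teacher_b)
print(round(r, 2))  # close to 1.0 when the two logs track each other
```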
Conclusion
The importance of OTL has been apparent to stakeholders in the policy and research realm
for decades (e.g., Anderson, 1986; McDonnell, 1995; O’Day, 2004) and led to the inclusion of
voluntary OTL standards in the Goals 2000: Educate America Act (PL 103-227) and subsequent federal policies such as the access to the general curriculum mandates under the Individuals with
Disabilities Education Act (1997). Difficulties defining the concept of OTL and operationalizing
its indicators, however, have contributed, at least in part, to the failure of OTL to gain a foothold in our current test-based accountability system. This study provided initial usability, reliability, and validity evidence for MyiLOGS, an online teacher log measure that holds
potential for large-scale assessment of OTL and formative feedback for targeted instructional changes. With effective measures of OTL, the opportunity lies ahead to advance a number of research agendas on effective teaching, instructional equity, and teacher professional development, as well as to help teachers learn more about their ongoing instructional provisions.
FUNDING
This research was supported by an Enhanced Assessment Grant from the U.S. Department of Education (S368A090006). The opinions expressed are those of the authors and do not represent
the views of the U.S. Department of Education or endorsement by the federal government.
REFERENCES
Abedi, J., Courtney, M., Leon, S., Kao, J., & Azzam, T. (2006). English language learners and math achievement:
A study of opportunity to learn and language accommodation (Tech. Rep. No. 702). Los Angeles: University of
California, National Center for Research on Evaluation, Standards, and Student Testing.
American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: Author.
Anderson, L. W. (1986). Opportunity to learn. In T. Husén & T. Postlethwaite (Eds.), International encyclopedia of
education: Research and studies (pp. 3682–3686). Oxford, UK: Pergamon.
Anderson, L. W., Krathwohl, D. R., Airasian, P. W., Cruikshank, K. A., Mayer, R. E., Pintrich, P. R., … Wittrock,
M. C. (2001). A taxonomy for learning, teaching, and assessing: A revision of Bloom’s taxonomy of educational
objectives. New York, NY: Longman.
Borg, W. R. (1980). Time and school learning. In C. Denham & A. Lieberman (Eds.), Time to learn (pp. 33–72).
Washington, DC: National Institute of Education.
Brophy, J., & Good, T. L. (1986). Teacher behavior and student achievement. In M. C. Wittrock (Ed.), Handbook of
research on teaching (3rd ed., pp. 328–375). New York, NY: Macmillan.
Burns, M. K., & Ysseldyke, J. E. (2009). Reported prevalence of evidence-based instructional practices in special
education. Journal of Special Education, 43, 3–11.
Burstein, L., & Winters, L. (1994, June). Models for collecting and using data on opportunity to learn at the state
level: OTL options for the CCSSO SCASS science assessment. Presented at the CCSSO National Conference on
Large-scale Assessment, Albuquerque, NM.
Camburn, E., & Barnes, C. A. (2004). Assessing the validity of a language arts instruction log through triangulation.
Elementary School Journal, 105, 49–73.
Carroll, J. B. (1963). A model of school learning. Teachers College Record, 64, 723–733.
Carroll, J. B. (1989). The Carroll model: A 25-year retrospective and prospective view. Educational Researcher, 18, 2–31.
Connor, C. M., Morrison, F. J., Fishman, B. J., Ponitz, C. C., Glasney, S., Underwood, P. S., … Schatschneider, C.
(2009). The ISI classroom observation system: Examining the literacy instruction provided to individual students.
Educational Researcher, 38, 85–99.
D’Agostino, J. V., Welsh, M. E., & Corson, N. M. (2007). Instructional sensitivity of a state’s standards-based
assessment. Educational Assessment, 12, 1–22.
DiPerna, J. C., & Elliott, S. N. (2000). Academic competence evaluation scales: Manual K–12. San Antonio, TX:
Psychological Corporation.
Elbaum, B., Vaughn, S., Hughes, M. T., Moody, S. W., & Schumm, J. S. (2000). How reading outcomes for students
with learning disabilities are related to instructional grouping formats: A meta-analytic review. In R. Gersten, E. P.
Schiller, & S. Vaughn (Eds.), Contemporary special education research: Syntheses of the knowledge base on critical
instructional issues (pp. 105–135). Mahwah, NJ: Erlbaum.
Elliott, S. N., & Gresham, F. M. (2008). Social skills improvement system. San Antonio, TX: Pearson.
Gersten, R., Chard, D. J., Jayanthi, M., Baker, S. K., Morphy, P., & Flojo, J. (2009). Mathematics instruction for
students with learning disabilities: A meta-analysis of instructional components. Review of Educational Research,
79, 1202–1242.
Haertel, E. H. (1999). Validity arguments for high-stakes testing: In search of evidence. Educational Measurement:
Issues and Practice, 18(4), 5–9.
Herman, J. L., & Abedi, J. (2004). Issues in assessing English language learners’ opportunity to learn mathematics
(CSE Report No. 633). Los Angeles, CA: University of California, Center for the Study of Evaluation.
Herman, J. L., Klein, D. C., & Abedi, J. (2000). Assessing students’ opportunity to learn: Teacher and student
perspectives. Educational Measurement: Issues and Practice, 19, 16–24.
Hill, H. C., Charalambous, C. Y., & Kraft, M. A. (2012). When rater reliability is not enough: Teacher observation
systems and a case for the generalizability study. Educational Researcher, 41, 56–64.
Husén, T. (1967). International study of achievement in mathematics: A comparison of twelve countries. New York,
NY: Wiley & Sons.
Individuals with Disabilities Education Act Amendments of 1997, 20 U.S.C. §§ 1400 et seq.
Karger, J. (2005). Access to the general education curriculum for students with disabilities: A discussion of the
interrelationship between IDEA and NCLB. Wakefield, MA: National Center on Accessing the General Curriculum.
Karvonen, M., Wakeman, S. Y., Flowers, C., & Browder, D. M. (2007). Measuring the enacted curriculum for students
with significant cognitive disabilities: A preliminary investigation. Assessment for Effective Intervention, 33, 29–38.
Kurz, A. (2011). Access to what should be taught and will be tested: Students’ opportunity to learn the intended
curriculum. In S. N. Elliott, R. J. Kettler, P. A. Beddow, & A. Kurz (Eds.), Handbook of accessible achievement
tests for all students: Bridging the gaps between research, practice, and policy (pp. 99–129). New York, NY: Springer.
Kurz, A., Elliott, S. N., Lemons, C. J., Zigmond, N., Kloo, A., & Kettler, R. J. (2014). Assessing opportunity-to-
learn for students with disabilities in general and special education classes. Assessment for Effective Intervention.
Advance online publication. doi:10.1177/1534508414522685
Kurz, A., Elliott, S. N., & Shrago, J. S. (2009). MyiLOGS: My instructional learning opportunities guidance system.
Nashville, TN: Vanderbilt University.
Kurz, A., Elliott, S. N., Wehby, J. H., & Smithson, J. L. (2010). Alignment of the intended, planned, and enacted
curriculum in general and special education and its relation to student achievement. Journal of Special Education,
44, 131–145.
Kurz, A., Talapatra, D., & Roach, A. T. (2012). Meeting the curricular challenges of inclusive assessment: The role
of alignment, opportunity to learn, and student engagement. International Journal of Disability, Development and
Education, 59, 37–52.
Marzano, R. J. (2000). A new era of school reform: Going where the research takes us (REL No. #RJ96006101).
Aurora, CO: Mid-continent Research for Education and Learning.
Mayer, R. E. (2008). Learning and instruction (2nd ed.). Upper Saddle River, NJ: Pearson.
McDonnell, L. M. (1995). Opportunity to learn as a research concept and a policy instrument. Educational Evaluation
and Policy Analysis, 17, 305–322.
Nielsen, J. (1994). Heuristic evaluation. In J. Nielsen & R. L. Mack (Eds.), Usability inspection methods (pp. 25–60).
New York, NY: Wiley & Sons.
No Child Left Behind Act of 2001, 20 U.S.C. §§ 6301 et seq.
O’Day, J. A. (2004). Complexity, accountability, and school improvement. In S. H. Fuhrman & R. F. Elmore (Eds.),
Redesigning accountability systems for education (pp. 15–43). New York, NY: Teachers College Press.
Pianta, R. C., & Hamre, B. K. (2009). Conceptualization, measurement, and improvement of classroom processes:
Standardized observation can leverage capacity. Educational Researcher, 38, 109–119.
Polikoff, M. S. (2010). Instructional sensitivity as a psychometric property of assessments. Educational Measurement:
Issues and Practice, 29, 3–14.
Polikoff, M. S., Porter, A. C., & Smithson, J. (2011). How well aligned are state assessments of student achievement
with state content standards? American Educational Research Journal, 48, 965–995.
Porter, A. C. (1991). Creating a system of school process indicators. Educational Evaluation and Policy Analysis, 13,
13–29.
Porter, A. C. (2002). Measuring the content of instruction: Uses in research and practice. Educational Researcher, 31,
3–14.
Porter, A. C., Polikoff, M. S., Zeidner, T., & Smithson, J. (2008). The quality of content analyses of state student
achievement tests and content standards. Educational Measurement: Issues and Practice, 27, 2–14.
Roach, A. T., Chilungu, E. N., LaSalle, T. P., Talapatra, D., Vignieri, M. J., & Kurz, A. (2009). Opportunities and
options for facilitating and evaluating access to the general curriculum for students with disabilities. Peabody Journal
of Education, 84, 511–528.
Roach, A. T., Niebling, B. C., & Kurz, A. (2008). Evaluating the alignment among curriculum, instruction, and
assessments: Implications and applications for research and practice. Psychology in the Schools, 45, 158–176.
Rowan, B., Camburn, E., & Correnti, R. (2004). Using teacher logs to measure the enacted curriculum: A study of
literacy teaching in third-grade classrooms. Elementary School Journal, 105, 75–101.
Rowan, B., & Correnti, R. (2009). Studying reading instruction with teacher logs: Lessons from the Study of
Instructional Improvement. Educational Researcher, 38, 120–131.
Ryan, K. (2002). Assessment validation in the context of high-stakes assessment. Educational Measurement: Issues
and Practice, 21, 7–15.
Slavin, R. E. (2002). Evidence-based education policies: Transforming educational practice and research. Educational
Researcher, 31, 15–21.
Stevens, F. I. (1993). Applying an opportunity-to-learn conceptual framework to the investigation of the effects of
teaching practices via secondary analyses of multiple-case-study summary data. Journal of Negro Education, 62,
232–248.
Vannest, K. J., & Hagan-Burke, S. (2010). Teacher time use in special education. Remedial and Special Education,
31, 126–142.
Vannest, K. J., & Parker, R. I. (2010). Measuring time: The stability of special education teacher time use. Journal of
Special Education, 44, 94–106.
Vaughn, S., Gersten, R., & Chard, D. J. (2000). The underlying message in LD intervention research: Findings from
research syntheses. Exceptional Children, 67, 99–114.
Vaughn, S., Levy, S., Coleman, M., & Bos, C. S. (2002). Reading instruction for students with LD and EBD: A
synthesis of observation studies. Journal of Special Education, 36, 2–13.
Wang, J. (1998). Opportunity to learn: The impacts and policy implications. Educational Evaluation and Policy
Analysis, 20, 137–156.