
The CBAL Writing Assessment Project

Paul Deane

Nora Odendahl

Thomas Quinlan

Mary Fowles

Doug Baldwin

ETS, Princeton, NJ

Paper presented at the annual meeting of the

American Educational Research Association (AERA) and the

National Council on Measurement in Education (NCME)

held March 23 to 28, 2008, in New York.

Unpublished Work Copyright © 2008 by Educational Testing Service. All Rights Reserved. These materials are an

unpublished, proprietary work of ETS. Any limited distribution shall not constitute publication. This work may not

be reproduced or distributed to third parties without ETS's prior written consent. Submit all requests through

www.ets.org/legal/index.html.

Educational Testing Service, ETS, the ETS logo, and Listening. Learning. Leading. are registered trademarks of

Educational Testing Service (ETS).


Abstract

The overarching goal of the CBAL Writing project is to design assessment that enhances

instruction, while being grounded in a cognitive theory of writing competency. There is a

rich cognitive literature on the nature of writing expertise and the difference between

novice and skilled writers. Bereiter and Scardamalia (1987) observed that novice writers

typically adopt a ‘knowledge-telling’ approach, while more skillful writers sometimes

deploy a ‘knowledge-transforming’ approach. This ability to adopt a knowledge-

transforming approach depends upon an array of writing subskills. We developed a

‘competency model’ to explicitly identify these subskills, which fall into three broad

categories: (a) basic language and literacy skills; (b) strategic management of writing processes; and (c) critical thinking for writing. In designing an assessment to measure

these areas of writing competence, we wanted to capture something of the complex

coordination of writing subskills, while also taking into account the role of other relevant

factors, such as background knowledge. To emphasize critical thinking, we opted for a

project-based approach, in which smaller writing tasks (e.g., notes, a summary) serve to

scaffold a larger task (e.g., a letter to the editor). Preliminary responses from teachers

have been quite favorable, and the preliminary analysis of data from our first pilot

suggests that this approach has promise.


Introduction

There is a longstanding tension between writing assessment and writing

instruction. Writing assessment is constrained by the need to establish score reliability

and predictive power (Elliot, 2005) within limits determined by time availability and cost of administration, and by concerns that writing tests should be fair and

not unduly influenced by outside knowledge. Writing instruction, on the other hand, is

motivated by a concern with the complexities of producing texts in a variety of contexts,

for a variety of purposes, and under conditions that require strategic management of a

variety of intellectual and linguistic skills (Bazerman, 2008; MacArthur, Graham &

Fitzgerald, 2006). The constraints of the one do not always match the concerns of the

other, which can have unfortunate consequences for both (Hillocks, 2002).

The goal of the work reported here is to approach the problem of writing assessment in a

manner that conforms to the recommendations of the NRC Committee on the

Foundations of Assessment (Pellegrino et al., 2001: 292-293), who argue that a sustained

effort must be made to coordinate instruction and assessment, and to ground both in a

cognitive theory of domain area expertise. This paper presents initial results from two

years of designing and piloting such an assessment as part of CBAL (‘Cognitively-Based

Assessments of, for and as Learning’), an internal ETS research initiative.

This initiative is designed to conduct research to support the creation of a future

system of assessment that

• Documents what students have achieved (“of learning”),

• Helps identify what instruction should occur next (“for learning”),


• Is considered by students and teachers to be a worthwhile educational

experience in and of itself (“as learning”).

It seeks to build a unified approach to accountability assessment, formative

assessment, and professional support, in line with the following principles:

• Accountability tests, formative assessments, and professional support should (i)

be based on a single integrated framework motivated by cognitive research, (ii)

be responsive to state instructional and curricular standards, and (iii) be

designed to support best practices in writing instruction while maintaining

rigorous psychometric standards.

• Assessments should consist largely of engaging, extended, constructed-response

tasks, delivered by computer and automatically scored where appropriate.

• Individual tasks should be viewed by teachers and students as worthwhile

learning experiences in their own right, resulting in positive washback, in which

test preparation becomes an appropriate learning experience.

• Accountability assessment should be distributed over multiple administrations

(i) to reduce the importance of any one assessment and testing occasion; (ii) to

provide time for complex integrated tasks that better assess the construct; and

(iii) to provide prompt interim information in support of instruction.

• Assessments should help students participate actively in their own learning.

This paper describes work done to date on the writing portion of the CBAL

initiative, focusing on the development of the conceptual design, with brief discussion of initial pilot results, and concentrating almost exclusively on summative assessment design. We will touch briefly on formative issues, but these will primarily be

addressed by another presentation in this symposium.


What the Cognitive Literature Teaches Us about Learning to Write

There is a rich cognitive literature on the nature of writing expertise and the

difference between novice and skilled writers. In particular, skilled writers spend more

time planning and revising their work than novice writers; they focus more of their effort

and attention on managing the development of content, and concern themselves less with

its formal, surface characteristics; and they employ a variety of self-regulatory strategies (Bereiter & Scardamalia, 1987; Galbraith, 1999; Graham, 1997; Graham & Harris, 2000; Kellogg, 1988; McCutchen, Francis, & Kerr, 1997; McCutchen, 2000). Moreover,

novice writers benefit from instruction on planning and revision strategies. They also

benefit when provided with instruction that enables them to think critically about topic-

relevant content (De La Paz, 2005; De La Paz & Graham, 1997a, 1997b, 2002; Graham

& Perin, 2007; Hillocks, 1987; Kellogg, 1988; Quinlan, 2004). Bereiter and Scardamalia

(1987) characterize the difference between novice and skilled authors as the difference

between a ‘knowledge-telling’ approach and a ‘knowledge-transforming’ approach to

writing. In a knowledge-telling approach, the focus of the writer’s effort is on the

process of putting words on the page. In a knowledge-transforming approach, writing is

a recursive process of knowledge-development and knowledge-expression.

Knowledge-transforming is by its nature a much more effortful and sophisticated

process than knowledge-telling, and develops only as writers gain significant expertise.

The literature suggests five major reasons why a student may fail to deploy a knowledge-

transforming approach to writing: (i) undeveloped or inefficient literacy skills; (ii) lack of

strategic writing skills; (iii) insufficient topic-specific knowledge; (iv) weak content

reasoning and research skills; and (v) unformed or rudimentary rhetorical goals.


Undeveloped or Inefficient Literacy Skills

The high-level, strategic skills required for a knowledge-transforming approach to

writing place heavy demands on memory and attention. Inefficient oral fluency,

transcription, and text decoding may render it impossible to free up the working memory

capacity needed for strategic thought (Bourdin & Fayol, 1994, 2000; Kellogg, 2001;

Olive & Kellogg, 2002; Perl, 1979; Piolat, Roussey, Olive, & Farioli, 1996; Shanahan,

2006; Torrance & Galbraith, 2005). Similarly, the ability to monitor and reflect upon

one’s own writing, which is critical to planning and revision, depends in large part upon

aspects of reading skill, both decoding and higher verbal comprehension, and thus

reading difficulties can obstruct revision and planning (Hayes, 1996, 2004; Kaufer,

Hayes, & Flower, 1986; McCutchen et al., 1997).

Lack of Strategic Writing Skills

Even skilled writers cannot handle all aspects of complex writing tasks

simultaneously; they succeed by applying strategies that break each task into

manageable pieces. A significant element in writing skill is thus the ability to

strategically and efficiently intersperse planning, text production, and evaluation,

sometimes switching back and forth rapidly among tasks, and other times, devoting

significant blocks of time to a single activity (Matsuhashi, 1981; Schilperoord, 2002).

Controlling writing processes so that the choice of activities is strategically appropriate

and maximally efficient is itself a skill to be learned (see Coirier, Andriessen, & Chanquoy, 1999, specifically with respect to persuasive writing expertise).


Insufficient Topic-Specific Knowledge

All models of writing expertise presuppose a critical role for long-term memory, from which the subject matter of writing must be retrieved, either top-down (Bereiter &

Scardamalia, 1987; Hayes & Flower, 1980) or bottom-up (Galbraith, 1999; Galbraith &

Torrance, 1999). Prior topical knowledge gives writers a major advantage, not only in

generating content, but in pursuing critical thinking tasks where topic knowledge is

required to support judgments of relevance and plausibility. Thus it is not surprising that

topic knowledge is a major predictor of writing quality (DeGroff, 1987; Langer, 1985;

McCutchen, 1986).

Weak Content Reasoning and Research Skills

Academic writing should be viewed not merely as an expressive act, but also as

intrinsically involving critical thought. While many of the problems the writer faces are

rhetorical, involving audience and purpose, these goals typically require the author to

develop ideas, to identify information needs, and to obtain that information, whether by

observation, inference, argument, or research (see the meta-analysis in Hillocks, 1987, which indicated the critical importance of inquiry strategies for improving student writing, and the

related arguments in Hillocks, 1995).

The reasoning required to complete a writing task successfully varies with

purpose, audience, and genre. Overall the literature suggests that one cannot assume that

novice writers, or even all adults, possess the full range of content reasoning and research

skills needed to support a knowledge-transforming approach to writing (Felton & Kuhn,

2001; Kuhn, 1991; Kuhn, Katz & Dean, 2004; Means & Voss, 1996; Perkins, 1985;

Perkins, Allen, & Hafner, 1983).


Unformed or Rudimentary Rhetorical Goals

A sophisticated writer is aware that all writing is communication within a social

context, in which the author must take the audience into account, collaborate with others,

and more generally, act within one or more communities of practice with well-defined

social expectations about writing. Students benefit from instructional activity that

clarifies the intended audience and the writer’s obligations to that audience (Cohen &

Riel, 1989; Daiute, 1986; Daiute & Dalton, 1993; Yarrow & Topping, 2001). In

particular, social, interactive activities such as peer review have a strong beneficial

impact (Graham & Perin, 2007), particularly when writing instruction is structured to

enable students to internalize social norms about academic writing (Beaufort, 2000;

Flower, 1989; Kent, 2003; Kostouli, 2005).

The Competency Model

We seek to implement the principles of Evidence-Centered Design (Mislevy et

al., 2003) in which assessment design is driven by the construction of explicit evidentiary

arguments. As part of the design process entailed by this approach, we developed a

‘competency model’ which explicitly identifies what skills a writing assessment should

measure. According to this model, there are three basic strands of writing competence:

Strand I: Language and literacy skills for writing

Strand II: Writing process management skills

Strand III: Critical thinking skills for writing

Strand I is concerned with being able to use Standard English, being able to use

basic literacy skills such as reading in support of writing, and most centrally, being able

to draft and edit text. Strand II is concerned with being able to manage writing processes strategically to produce an effective document, and thus covers document planning, selection and organization of materials, and text and content evaluation. Strand III is concerned with critical thinking for writing: reasoning critically about content, and reasoning about the social context (purpose and audience) of a piece of writing.

Each strand can be broken down further. The following diagram presents the

major skills we isolate as potentially relevant to writing. Note that though we present a

hierarchical diagram, with each node having only a single parent, the model does not

presuppose a strict hierarchy of componential skills; almost every component of writing interacts with the others. Our primary goal in laying out the hierarchy shown in Figure 1 is to present a reasonably complete outline of the kinds of skills required to succeed as a writer, and to set the measurement of these skills as the goal of assessment.

The nodes in Figure 1 are what Evidence-Centered Design refers to as “student

model variables” or “competency model variables.” Such variables have a dual nature by

design: on the one hand, they represent aspects of student writing competency; on the

other, they are intended to be operationalized in a measurement model, and must

therefore be explicitly connected to measurable features.

It is critical to interpret these three strands as jointly describing the skills that must be assessed if writing expertise is to be measured as encompassing knowledge-transforming, and not just knowledge-telling, strategies. The various competencies described

above form part of an interacting and interwoven complex of skills that cannot easily be

separated from one another or tested in isolation, and yet measuring writing skill is in

large part measuring progress in the development and integration of these intellectual and

social abilities.


Figure 1. The CBAL Writing Competency Model

[Figure 1 shows a hierarchical diagram. The top node, WRITE, branches into the three strands: Use Language and Literacy Skills; Use Strategies to Manage the Writing Process; and Use Critical Thinking Skills for Writing, the last comprising Reason Critically about Content and Reason about Social Context (Purpose, Audience). Subordinate nodes include Plan/Model, Control/Focus Document Structure, Evaluate/Reflect, Activate/Retrieve, Select/Organize, Assess/Critique, Edit/Revise, Collaborate/Review, Accommodate/Engage, Narrate/Describe, Explain/Hypothesize, Gather/Synthesize, Support/Refute, Draft, Proofread/Correct, Compose (Express/Clarify), Transpose (Spelling, Mechanics), Inscribe (Handwriting, ...), Use Written Vocabulary, Control Sentence Structure, Use Written Style, Read/Decode, and Speak/Understand.]


Strategies for Test Design

The literature review and competency model have led us to emphasize the

following key ideas:

• Writing proficiency is a complex array of skills.

• These skills notably involve critical thinking about content, audience, and

purpose.

• Background knowledge plays a central role.

Each of these considerations has had a major impact on CBAL test design:

Complexity. The test design follows the CBAL principle of giving periodic

assessments throughout the year, but each prototype test focuses on a different mix of

writing proficiencies. The richness and complexity of the writing construct mean that we can effectively measure writing skill only if we provide multiple occasions for

writing, using a wide range of writing tasks and situations.

Critical thinking. The prototype design addresses the importance and variety of

critical-thinking skills by devising smaller tasks to measure these skills separately. Many

critical-thinking tasks occur naturally as part of preparing for, writing, and revising a long piece of writing, so they can be presented

as subsidiary activities within a larger writing project.

Background knowledge. Rather than develop generic prompts that minimize the

role of background knowledge, we adopted an approach in which relevant resource

materials are provided, so that all students have available a rich array of useful

information to support effective writing. While such an approach is not unprecedented, we chose to make it the centerpiece of our approach because it enables the creation of tasks whose instructional relevance is immediate and obvious.

Scaffolding. Rather than relying on a single prompt, we seek to provide materials and activities that offer initial guidance to students and enable them to display their writing skills to best advantage. Structuring tests to provide such scaffolding naturally led to a project-based approach in which all tasks hang together conceptually as parts of a

larger writing project.

In the resulting test design,

• Each test presents a multi-part “project” and is structured around a

scenario or situation that provides a context and purpose for a series of

related tasks.

• Tasks focus explicitly on writing strategies and critical thinking and are scored for quality of thinking, not just for generic writing quality.

• Each test typically focuses on one genre or mode of discourse and the

critical thinking/writing strategies associated with that mode of discourse,

but not rigidly; any of the activities naturally associated with a coherent

project may appear together in a single assessment.

• Language and literacy skills (vocabulary, grammar, usage, and mechanics)

are not measured by separate items, but are scored ‘in the background’ as

a feature of each written response.

• Usually two or three short tasks precede a longer, more integrated writing

task; sometimes, a short follow-up task comes after the long task, with


shorter tasks designed to model appropriate steps in the research and

thinking processes for a larger writing project.

• Source materials, rubrics, guidelines, and other supporting documents are

systematically built into the test design, reflecting the reality that academic

writing does not take place in a vacuum. This approach also permits

assessment of research skills crucial to academic writing.

Essentially, these strategies create “thematic” assessments, in which all the tasks

in a single assessment draw from the same topic and take place in the same context, so

that they hang together as different (but separable and psychometrically distinct) tasks

within a single larger-scale writing project.

Initial Pilots

In order to examine how well these concepts worked out in practice, nearly a

dozen possible writing assessments were developed, ranging over a variety of genres and

modes of writing. Two of these tests were selected for administration in collaboration

with middle school teachers from an urban/suburban school district. A battery of

formative tasks was also developed in cooperation with teachers, who tried them out in

the classroom. These tasks mirrored the summative tasks and enabled students to gain

some familiarity with test formats beforehand; teacher reaction was strongly favorable,

supporting the conclusion that we had designed tasks with clear instructional value. We

are still analyzing the results of these pilots, and thus the information presented in this

paper should be considered tentative. We will be administering additional pilot tests in

an expanded testing program in 2008.


The pilot materials focused primarily on persuasive writing, though we intend to

cover a variety of genres as the work proceeds. Some tasks focused on evaluating

sources and choosing appropriate material to support or refute an argument; others, on

building and structuring an argument and embodying it in an essay; still others on

methods for critiquing other people’s arguments and revising one’s own. We expect to

pilot additional tests to create a sequence of tests that cover all the major modes of

writing and that provide a variety of project models requiring instructionally useful

combinations of specific writing competencies.

The tests were administered via computer, using custom-designed software, in a

format that supported integration of reading and research with project-focused writing

tasks. Although the design is still exploratory, results from the first pilot test are

encouraging. The pilot was administered to a group of 120 students, with individuals

randomly assigned to one of the two tests. The distribution of scores

across the two tests was roughly comparable, with high reliability across tasks.
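As a purely illustrative sketch (not a description of the actual pilot analysis), cross-task reliability of this kind is often summarized with Cronbach's alpha computed over a students-by-tasks matrix of scores. The function and data below are hypothetical.

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a students-by-tasks matrix of rubric scores."""
    scores = np.asarray(scores, dtype=float)
    n_tasks = scores.shape[1]
    task_variances = scores.var(axis=0, ddof=1)      # variance of each task's scores
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of students' total scores
    return (n_tasks / (n_tasks - 1)) * (1 - task_variances.sum() / total_variance)

# Hypothetical data: six students, four tasks, each scored on a 0-5 rubric.
pilot_scores = np.array([
    [3, 4, 3, 4],
    [2, 2, 3, 2],
    [5, 4, 5, 5],
    [1, 2, 1, 2],
    [4, 4, 3, 4],
    [3, 3, 4, 3],
])
print(f"Cronbach's alpha across tasks: {cronbach_alpha(pilot_scores):.2f}")
```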

For the pilot study briefly described above, we developed an analytic scoring

rubric focused on the three strands of writing proficiency proposed in the competency

model.

Strand I (use language and literacy skills). Instead of using multiple-choice items

to measure these skills, the current approach is to apply a generic Strand I rubric to all

written responses of sufficient length. This rubric focuses on sentence-level features of

the students’ writing, such as grammar, usage, mechanics, spelling, vocabulary, and

variety of sentence structures.


Strand II (use strategies to manage the writing process). Although some tasks may

be designed to assess this category of skills, the primary approach is to apply a generic

Strand II rubric to all written responses of sufficient length in order to measure

document-level skills, including organization, structure, focus, and development.

Strand III (use critical-thinking skills). In the test design, the prewriting and

inquiry tasks (or, occasionally, post-essay tasks) tend to focus primarily on critical-

thinking skills; the extended task, of course, draws on these skills as well. However, in

contrast to the generic rubrics for sentence-level and document-level skills, the rubrics for

critical-thinking skills are specific to each task. Evaluating the quality of ideas and

reasoning in a response requires a tailored rubric that reflects the content-related

requirements of the task.

In addition, lower-half scores were given diagnostic labels indicating how they

failed to meet the rubric, so that (for instance) an essay might be identified as having too

many grammatical errors on Strand I, being disorganized on Strand II, or failing to take a

clear position on Strand III. The most common error patterns in the pilot data involved

problems in sentence structure, failure to develop content adequately, and weak reasoning

(not giving strong supporting evidence).
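To make the scoring scheme concrete, the sketch below shows one way analytic strand scores and their diagnostic labels could be recorded for a single task response. The data structures and field names are hypothetical illustrations, not the project's actual scoring format.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class StrandScore:
    strand: str                      # "I", "II", or "III"
    score: int                       # rubric score, e.g., on a 1-5 scale (illustrative)
    diagnosis: Optional[str] = None  # diagnostic label attached to lower-half scores

@dataclass
class TaskResponse:
    task_id: str
    strand_scores: list[StrandScore] = field(default_factory=list)

# A hypothetical scored response: low Strand I and II scores carry diagnoses.
response = TaskResponse(
    task_id="letter_to_editor",
    strand_scores=[
        StrandScore(strand="I", score=2, diagnosis="too many grammatical errors"),
        StrandScore(strand="II", score=2, diagnosis="disorganized"),
        StrandScore(strand="III", score=4),
    ],
)
```

Keeping the diagnosis separate from the numeric score preserves a comparable rubric scale across tasks while retaining the information needed for diagnostic feedback.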

We are still experimenting with the scoring approach, and hope to develop novel

methods that will allow us to combine reliable measurement of Strands I and II (where automated scoring methods are most defensible) with a clear focus on content and rhetorical purpose.


Potential for Automated Scoring

One issue that should be mentioned here is the relevance of automated scoring

techniques. For many purposes – such as scoring the quality of critical thinking in an

essay – there is no substitute for human scoring. However, there are several strong

reasons to consider automated scoring as part of the CBAL test design: it can make more

rapid and detailed feedback feasible, thereby making tests more instructionally useful,

and can help to contain costs and compensate for the difficulty of obtaining sufficient

teacher time to score multiple assessments per year.

Most of the skills relevant to Strand I of the writing competency model (and

some of those relevant to Strand II) can be measured using the kinds of features needed in

any case for automated essay scoring. Recently several approaches to automated essay

scoring have come into use for scoring large-scale assessments, including the PEG

system, a descendant of Page’s original system (Page 1966a, b; 1994, 1995, 2003),

methods based on Latent Semantic Analysis, or LSA (Foltz, Laham & Landauer, 1999;

Landauer, Laham & Foltz, 2003), and the e-rater™ system (Burstein et al., 1998;

Burstein, 2003).

At ETS, the primary automated essay scoring system is e-rater, which has

changed significantly over the years. While the original e-rater system used more than 50

individual features and selected whichever features best predicted human scores for a

particular prompt, e-rater 2.0 uses a much smaller set of features specifically selected to

measure particular aspects of writing performance: e.g., content, organization,

development, vocabulary, grammar, usage, mechanics, and style (Attali & Burstein,

2005; Ben-Simon & Bennett, 2007; Burstein, 2003; Chodorow & Burstein, 2004). It is


important to note, however, that the definition of features in e-rater 2.0 and their

alignment with a writing model is not based upon an explicit model of writing

competency, nor is e-rater’s feature set based upon direct evidence

of how writers mature and develop in their writing skills. It represents, rather, a selection

of features whose relation to known writing constructs is clear and defensible and which

can be shown to provide good predictions of human essay grades. Given that e-rater was

originally built specifically for a particular type of essay (persuasive GMAT essays) and

that only a subset of the CBAL writing tasks is intended to be in essay format, we do not

intend to use e-rater unmodified for scoring, but it provides a crucial foundation upon

which automated scoring for CBAL writing can be built. Moreover, there is evidence

that e-rater scores can be used to provide a developmental scale covering student writers

from 4th to 12th grades (Attali & Powers, 2007).
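To illustrate the general shape of feature-based essay scoring, the sketch below extracts a few interpretable surface features, fits a regression model to human scores, and applies it to a new response. The features, training data, and model are hypothetical; this is not e-rater's actual feature set, training data, or scoring algorithm.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def extract_features(essay: str) -> list[float]:
    """A few toy, Strand I-style surface features (illustrative only)."""
    words = essay.split()
    sentences = [s for s in essay.replace("!", ".").replace("?", ".").split(".") if s.strip()]
    n_words = max(len(words), 1)
    return [
        float(np.log(n_words)),                        # (log) essay length
        len(set(w.lower() for w in words)) / n_words,  # type-token ratio (vocabulary variety)
        n_words / max(len(sentences), 1),              # mean sentence length
    ]

# Hypothetical training essays that already carry human rubric scores.
training_essays = [
    "Short essay. Few ideas.",
    "This essay develops an argument across several sentences. It offers reasons. It also gives evidence for each claim it makes.",
    "A moderately developed response with some support. The writer states a position and gives one reason.",
]
human_scores = [1, 5, 3]

X = np.array([extract_features(e) for e in training_essays])
model = LinearRegression().fit(X, human_scores)

new_essay = "The writer takes a clear position and supports it with two reasons and an example."
predicted = model.predict([extract_features(new_essay)])[0]
print(f"Predicted score: {predicted:.1f}")
```

In practice the feature set would be far richer and validated against the competency model, but the basic pattern of mapping construct-relevant features onto human scores is the same.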

We expect in future research to use many of the features used in automated essay

scoring systems to provide measurement, particularly of Strand I of the CBAL

competency model (language and literacy skills), and thus to provide automated scoring

for some (but not all) of the constructs relevant to writing. We do not, however, expect to

use automated scoring as the sole scoring technology, since one of the main purposes of

the CBAL Writing Assessment is to focus instruction and test preparation on the skills

needed to support effective, rhetorically appropriate engagement with content. However,

it is possible that automated scoring could play a confirmatory (check scoring) role even

for Strand III scores, since early pilot results indicate strong correlations across strand

scores.


Conclusions

The strategies and prototypes presented here are in an early stage. The efficacy of

the CBAL approach is unproven either as an assessment design or as a method of

aligning assessment with instruction, and much research remains to be done. Preliminary

response from teachers and preliminary analysis of data from our first pilot suggest,

however, that the approach has promise, and at the very least it represents a major

attempt to develop a new kind of writing assessment from first principles.


References

Attali, Y. & Burstein, J. (2005). Automated essay scoring with e-rater v. 2.0 (ETS

Research Report RR-04-45). Princeton, NJ: Educational Testing Service.

Attali, Y. & Powers, D. (2007). [A developmental writing scale]. Unpublished raw data.

Beaufort, A. (2000). Learning the trade: A social apprenticeship model for gaining

writing expertise. Written Communication: 17(2), 185-223.

Ben-Simon, A. and Bennett, R.E. (2007). Toward more substantively meaningful

automated essay scoring. The Journal of Technology, Learning and Assessment:

6(1).

Bereiter, C., & Scardamalia, M. (1987). The psychology of written composition.

Hillsdale, NJ: Lawrence Erlbaum Associates.

Bourdin, B., & Fayol, M. (1994). Is written language production more difficult than oral

language production? A working memory approach. International Journal of

Psychology: 29(5), 591-620.

Bourdin, B., & Fayol, M. (2000). Is graphic activity cognitively costly? A developmental

approach. Reading and Writing: 13(3-4), 183-196.

Burstein, J. (2003). The E-rater Scoring Engine: Automated Essay Scoring with Natural

Language Processing. In M.D. Shermis and J.C. Burstein (Eds.), Automated essay

scoring: A cross-disciplinary perspective (pp. 113-122). Mahwah, NJ: Lawrence

Erlbaum Associates.

Burstein, J., Kukich, K., Wolff, S., Lu, C., Chodorow, M., Braden-Harder, L. and Harris,

M.D. (1998). Automated scoring using a hybrid feature identification technique.


Proceedings of the Annual Meeting of the Association of Computational

Linguistics, Montreal, 1, 206-210.

Chodorow, M. & Burstein, J. (2004). Beyond essay length: Evaluating e-rater’s

performance on TOEFL essays (TOEFL Research Report 73). Princeton, NJ:

Educational Testing Service.

Cohen, M. and M. Riel. (1989). The effect of distant audiences on students' writing.

American Educational Research Journal, 26(2): 143-159.

Coirier, P., Andriessen, J. E. B., & Chanquoy, L. (1999). From planning to translating:

The specificity of argumentative writing. In P. Coirier & J. Andriessen (Eds.),

Foundations of argumentative text processing (pp. 1–28). Amsterdam:

Amsterdam University Press.

Daiute, C. (1986). Do 1 and 1 make 2? Patterns of influence by collaborative authors.

Written Communication, 3(3), 382-408.

Daiute, C., & Dalton, B. (1993). Collaboration between children learning to write: Can

novices be masters? Cognition and Instruction, 10(4), 281-333.

Deane, P., Quinlan, T., Odendahl, N., Welsh, C., & Bivens-Tatum, J. (forthcoming).

Cognitive models of writing: Writing proficiency as a complex integrated skill.

CBAL Literature Review—Writing. ETS Research Report, under review.

DeGroff, L. J. C. (1987). The influence of prior knowledge on writing, conferencing, and

revising. Elementary School Journal: 88(2), 105-118.

De La Paz, S. (2005). Effects of historical reasoning instruction and writing strategy

mastery in culturally and academically diverse middle school classrooms. Journal

of Educational Psychology: 97(2), 139-156.


De La Paz, S., & Graham, S. (1997a). Effects of dictation and advanced planning

instruction on the composing of students with writing and learning problems.

Journal of Educational Psychology: 89(2), 203-222.

De La Paz, S., & Graham, S. (1997b). Strategy instruction in planning: Effects on the

writing performance and behavior of students with learning difficulties.

Exceptional Children: 63(2), 167-181.

De La Paz, S. & Graham, S. (2002). Explicitly teaching strategies, skills and knowledge:

Writing instruction in middle school classrooms. Journal of Educational

Psychology: 94(4), 687-698.

Elliot, N. (2005). On a scale: A social history of writing assessment in America. New

York: Peter Lang.

Felton, M., & Kuhn, D. (2001). The development of argumentative discourse skill.

Discourse Processes: 32(2&3), 152-153.

Flower, L. (1989). Cognition, context, and theory building. College Composition and

Communication: 40(3) 282-311.

Foltz, P. W., Laham, D., & Landauer, T. K. (1999). The Intelligent Essay Assessor:

Applications to educational technology. Interactive Multimedia Electronic

Journal of Computer-Enhanced Learning: 1(2). Retrieved March 11, 2008, from

http://imej.wfu.edu/articles/1999/2/04/index.asp

Galbraith, D. (1999). Writing as a knowledge constituting process. In M. Torrance & D.

Galbraith (Eds.), Knowing what to write: Conceptual processes in text production

(pp. 139-160). Amsterdam: Amsterdam University Press.


Graham, S. (1997). Executive control in the revising of students with learning and writing

difficulties. Journal of Educational Psychology, 89(2), 223-234.

Graham, S. and Harris, K.R. (2000). The role of self-regulation and transcription skills in

writing development. Educational Psychologist: 35(1), 3-12.

Graham, S. and Perin, D. (2007). Writing next: Effective strategies to improve writing of

adolescents in middle and high schools – A report to Carnegie Corporation of

New York. Washington, DC: Alliance for Excellent Education.

Hayes, J. R. (1996). A new framework for understanding cognition and affect in writing.

In C. M. Levy & S. Ransdell (Eds.), The science of writing: Theories, methods,

individual differences, and applications (pp. 1-27). Mahwah, NJ: Lawrence

Erlbaum Associates.

Hayes, J.R. (2004). What triggers revision? In G. Rijlaarsdam (Series Ed.) & L. Allal, &

L. Chanquoy (Vol. Eds.), Studies in writing: Vol. 13. Revision: Cognitive and

instructional processes (pp. 189–207). Boston: Kluwer Academic Publishers.

Hayes, J. R., & Flower, L. S. (1980). Identifying the organization of writing processes. In

L. Gregg & E. R. Steinberg (Eds.), Cognitive processes in writing (pp. 3-30).

Hillsdale, NJ: Lawrence Erlbaum Associates.

Hillocks, G., Jr. (1987). Synthesis of research on teaching writing. Educational

Leadership: 44(8), 71.

Hillocks, G., Jr. (1995). Teaching writing as reflective practice. New York: Teachers

College Press.

Hirschman, J. (2007, November 10). Letters: A school is more than an A, B, or C. New

York Times. Retrieved March 11, 2008, from http://www.nytimes.com


Kaufer, D. S., Hayes, J. R., & Flower, L. S. (1986). Composing written sentences.

Research in the Teaching of English: 20(2), 121-140.

Hunt, K. W. (1970). Syntactic maturity in schoolchildren and adults. Monographs of the

Society for Research in Child Development, 35(1, Serial No. 134).

Kellogg, R. (1988). Attentional overload and writing performance: Effects of rough draft

and outline strategies. Journal of Experimental Psychology: Learning, Memory

and Cognition: 14(2), 355-365.

Kellogg, R.T. (2001). Competition for working memory among writing processes. The

American Journal of Psychology: 114(2), 175-191.

Kent, T. (2003). Post-process theory: beyond the writing-process paradigm. In L. Z.

Bloom, D. A. Daiker & E. M. White (Eds.), Composition studies in the new

millennium: Rereading the past, rewriting the future. Carbondale, IL: Southern

Illinois University Press.

Kostouli, T. (2005). Writing in context(s): Textual practices and learning processes in

sociocultural settings. Heidelberg: Springer Verlag.

Kuhn, D. (1991). The skills of argument. Cambridge: Cambridge University Press.

Kuhn, D., Katz, J. B., & Dean, D., Jr. (2004). Developing reason. Thinking & Reasoning:

10(2), 197-219.

Kuhn, D., Shaw, V., & Felton, M. (1997). Effects of dyadic interaction on argumentative

reasoning. Cognition and Instruction: 15(3), 287-315.

Landauer, T.K., Laham, D. and Foltz, P.W. (2003). Automated scoring and annotation of

essays with the Intelligent Essay Assessor. In M.D. Shermis and J.C. Burstein


(Eds.), Automated essay scoring: A cross-disciplinary perspective (pp. 87-112).

Mahwah, NJ: Lawrence Erlbaum Associates.

Langer, J. A. (1985). Children's sense of genre: A study of performance on parallel

reading and writing. Written Communication: 2(2), 157-187.

Matsuhashi, A. (1981). Pausing and planning: The tempo of written discourse production.

Research in the Teaching of English: 15(2), 113-134.

McCutchen, D. (1986). Domain knowledge and linguistic knowledge in the development

of writing ability. Journal of Memory and Language: 25(4), 431-444.

McCutchen, D. (1996). A capacity theory of writing: Working memory in composition.

Educational Psychology Review: 8(3), 299-325.

McCutchen, D. (2000). Knowledge, processing, and working memory: Implications for a

theory of writing. Educational Psychologist: 35(1), 13-23.

McCutchen, D., Francis, M., & Kerr, S. (1997). Revising for meaning: Effects of

knowledge and strategy. Journal of Educational Psychology: 89(4), 667-676.

Means, M. L., & Voss, J. F. (1996). Who reasons well? Two studies of informal

reasoning among children of different grade, ability, and knowledge levels.

Cognition & Instruction: 14(2), 139.

Mislevy, R. J., Steinberg, L. S., & Almond, R. G. (2003). On the structure of educational

assessments. Measurement: Interdisciplinary Research and Perspectives: 1(1), 3-

62.

Olive, T. & Kellogg, R.T. (2002). Concurrent activation of high- and low-level processes

in written composition. Memory and Cognition: 30(4), 594-600.


Page, E.B. (1966a). The imminence of grading essays by computer. Phi Delta Kappan:

47, 238-243.

Page, E.B. (1966b). Grading essays by computer: Progress report. Proceedings of the

Invitational Conference on Testing Problems (pp. 86-100). Princeton, NJ:

Educational Testing Service.

Page, E.B. (1994). New computer grading of student prose, using modern concepts and

software. Journal of Experimental Education: 62(2), 127-142.

Page, E.B. (1995). Computer grading of essays: A different kind of testing? Address for

APA Annual Meeting, Sunday, Aug. 13, 1995. Session 3167, Sheraton N.Y.

Hotel, Princess Ballroom, 1:00-1:50 P.M. Invited address sponsored by Divs. 16,

5, 7, 15, Prof. Timothy Z. Keith, Chair.

Page, E.B. (2003). Project essay grade: PEG. In M.D. Shermis and J.C. Burstein (Eds.),

Automated essay scoring: A cross-disciplinary perspective (pp. 43-54). Mahwah, NJ: Lawrence Erlbaum Associates.

Perl, S. (1979). The composing processes of unskilled college writers. Research in the

Teaching of English, 13(4), 317-336.

Pellegrino, J.W., Chudowsky, N., and Glaser, R. (Eds.). (2001). Knowing what students

know: The science and design of educational assessment. Washington, D.C.:

National Academy Press.

Perkins, D. N. (1985). Postprimary education has little impact on informal reasoning.

Journal of Educational Psychology: 77(5), 562-571.


Perkins, D. N., Allen, R., & Hafner, J. (1983). Difficulties in everyday reasoning. In W.

Maxwell & J. Bruner (Eds.), Thinking: The expanding frontier (pp. 177-189).

Philadelphia, PA: The Franklin Institute Press.

Piolat, A., Roussey, J.-Y., Olive, T., & Farioli, F. (1996). Charge mentale et mobilisation

des processus rédactionnels : Examen de la procédure de Kellogg. Psychologie

Française: 41(4), 339-354.

Quinlan, T. (2004). Speech recognition technology and students with writing difficulties:

improving fluency. Journal of Educational Psychology: 96(2), 337-346.

Schilperoord, J. (2002). On the cognitive status of pauses in discourse production. In T.

Olive & C. M. Levy (Eds.), Contemporary tools and techniques for studying writing

(pp. 61-90). Dordrecht/Boston/London: Kluwer Academic Publishers.

Shanahan, T. (2006). Relations among oral language, reading and writing development.

In C.A. MacArthur, S. Graham, and J. Fitzgerald (Eds.), Handbook of writing

research (pp. 171-183). New York & London: Guilford Press.

Torrance, M., & Galbraith, D. (2005). The processing demands of writing. In C.

MacArthur, S. Graham & J. Fitzgerald (Eds.), Handbook of writing research.

New York: Guilford Publishers.

Yarrow, F., & Topping, K.J. (2001). Collaborative writing: The effects of metacognitive

prompting and structured peer interaction. British Journal of Educational

Psychology: 71(2), 261-282.