
Direct Written Corrective Feedback, Learner Differences, and the Acquisition of Second Language Article Use for Generic and Specific Plural Reference

CHARIS STEFANOU
Lancaster University
Linguistics and English Language
County South
Lancaster LA1 4YT
United Kingdom
Email: [email protected]

ANDREA RÉVÉSZ
University College London
UCL Institute of Education
20 Bedford Way
London WC1H 0AL
United Kingdom
Email: [email protected]

This article reports on a classroom-based study that investigated the effectiveness of direct written corrective feedback in relation to learner differences in grammatical sensitivity and knowledge of metalanguage. The study employed a pretest–posttest–delayed posttest design with two treatment sessions. Eighty-nine Greek English as a foreign language (EFL) learners were randomly assigned to 3 groups: direct feedback only, direct feedback plus metalinguistic comments, and comparison. The linguistic target was article use for specific and generic plural reference. A text summary and a truth value judgment test were employed to measure any development in learners' ability to use articles. The results revealed an advantage for receiving direct feedback over no feedback, but provided no clear evidence for the benefit of supplying metalinguistic information. Additionally, participants with greater grammatical sensitivity and knowledge of metalanguage proved more likely to achieve gains in the direct feedback only group.

Keywords: written corrective feedback; individual differences; article use

The Modern Language Journal, 99, 2, (2015)
DOI: 10.1111/modl.12212
0026-7902/15/263–282 $1.50/0
© 2015 The Modern Language Journal

The role of written corrective feedback (WCF) has received considerable interest among instructed second language acquisition (SLA) researchers for the past two decades. Interest in the topic has partly been driven by Truscott's (1996) controversial claim that WCF, a widely used pedagogical tool, is ineffective and potentially harmful to second language (L2) learners. Contrary to Truscott's supposition, recent years have seen an accumulation of empirical evidence attesting that WCF can be useful and effective in promoting L2 development (e.g., Bitchener, 2008; Bitchener & Knoch, 2008; Ellis et al., 2008; Ferris & Roberts, 2001; Sheen, 2007, 2010; Sheen, Wright, & Moldawa, 2009; Van Beuningen, De Jong, & Kuiken, 2012). As a result, researchers are now focusing on investigating the factors actually influencing the efficacy of WCF, among them its consistency in focusing on particular linguistic targets (e.g., Ferris & Roberts, 2001), the number of linguistic targets (e.g., Ellis et al., 2008; Farrokhi & Sattarpour, 2012), the amount of metalinguistic information accompanying the feedback (e.g., Bitchener, 2008; Bitchener & Knoch, 2008; Sheen, 2007, 2010), the availability of the correct construction (e.g., Storch & Wigglesworth, 2010; Suh, 2010), and learner differences in cognitive abilities, such as inductive language learning capacity (Sheen, 2007).


One aim of the current study is to extend existing research by further exploring the extent to which metalinguistic information may influence the effectiveness of WCF. The additional aims and novel aspects of our research include examining whether WCF can facilitate the acquisition of a new linguistic target (article use for generic and specific plural reference) and whether the link between WCF type and SLA may be moderated by grammatical sensitivity and knowledge of metalanguage.

    BACKGROUND

    Written Corrective Feedback

Written corrective feedback refers to the information provided to L2 learners about the ill-formedness of their written production (Loewen, 2012). The vast majority of SLA researchers (e.g., Ferris, 1999, 2002) agree that there are a number of potential benefits to supplying WCF in L2 classrooms. For example, in line with the widely accepted view that acquisition requires some focus on form, WCF can function as such a device and draw learners' attention to L2 constructions (e.g., Ellis, 2005), thereby helping them to notice gaps in their current L2 knowledge (Schmidt, 1990). WCF may additionally engage learners in guided learning and problem solving and, as a result, promote the type of reflection that is more likely to foster long-term acquisition (Bitchener & Knoch, 2008, p. 415).

Contrary to these arguments, a few researchers have raised concerns regarding the use of WCF on L2 writing, Truscott's (1996, 2007) counterclaims being the most influential among them. Truscott objected to the use of WCF on the ground that the way it is practiced in the majority of language classrooms disregards well-established understandings from SLA research, including that (a) L2 development is a gradual and intricate process, which entails more than just the sudden discovery of rules and simple knowledge transfer from teachers to students, (b) there is little probability that a single form of feedback will promote the acquisition of features from various linguistic domains such as lexis, morphology, and syntax, (c) WCF is likely to have no value for promoting implicit knowledge and only has the capacity to assist in developing a limited degree of explicit knowledge, which may be helpful for revision purposes but not for genuine L2 improvement, and (d) teachers are not equipped to provide feedback that is adjusted to the developmental needs of their learners given the absence of well-documented developmental patterns in SLA. Truscott did not only question the efficacy of feedback, but went as far as to suggest that the provision of corrective feedback may be counterproductive and constitute a source of anxiety and stress for students, who may ultimately avoid using complex features in fear of receiving feedback.

In response to calls from both proponents and opponents of WCF for improved research designs, by now a large number of tightly controlled studies have accrued, examining mainly two dimensions: direct/indirect and focused/unfocused feedback. The distinction between direct and indirect feedback concerns whether the correct construction is supplied or not. Direct feedback entails the correct construction, whereas indirect WCF only indicates the presence of an error, leaving the responsibility of correction to learners. Focused feedback consistently targets a single or a limited number of problematic linguistic features, while unfocused or comprehensive feedback addresses all errors in the learners' writing irrespective of their nature. Previous research comparing the effectiveness of feedback along these two dimensions has yielded mixed findings (see Van Beuningen et al., 2012, for a review). Nonetheless, focused direct feedback, the object of this investigation, has consistently been shown to enhance grammatical accuracy (e.g., Bitchener, 2008; Bitchener & Knoch, 2008, 2010; Farrokhi & Sattarpour, 2012; Sheen et al., 2009). Relatively little is known, however, about the effects of different kinds of focused direct feedback on SLA.

The handful of studies that have explored the learning potential of different types of focused direct feedback have primarily been concerned with determining the utility of enhancing direct feedback with metalinguistic information. Sheen (2007) looked into whether direct written corrections with or without metalinguistic comments have a greater capacity to promote the acquisition of article use for first mention and anaphoric reference. She found that the participants, adult intermediate ESL learners from various first language (L1) backgrounds, benefited to a greater extent from direct feedback with the addition of metalinguistic information. Using the same linguistic target, Bitchener and Knoch (2008, 2009; Bitchener, 2008) examined, in a series of studies, the extent to which learners' accuracy can be fostered by four types of WCF conditions: (a) direct written corrections with written and oral metalinguistic comments, (b) direct written corrections with written metalinguistic comments, (c) direct written corrections only, and (d) no feedback. All three studies converged on the same finding that the provision of direct feedback was more beneficial than receiving no feedback, regardless of whether the feedback occurred on its own or was complemented with metalinguistic comments. More recently, Shintani and Ellis (2013) also compared the potential benefits afforded by direct corrective feedback and metalinguistic comments. Unlike in previous research, however, metalinguistic comments did not complement corrections, but were provided on their own. The researchers found that, while metalinguistic comments facilitated the acquisition of explicit knowledge of the target feature (English indefinite article), direct feedback did not lead to gains. Drawing on the participants' stimulated recall comments, Shintani and Ellis concluded that metalinguistic information may have enabled learners to develop awareness of the target rule, which they were able to apply in the revision process. Clearly, the results to date are mixed regarding the impact of supplementing direct focused feedback with metalinguistic information. One goal of the present study is to help clarify the value of providing metalinguistic comments.

Additionally, we intend to begin investigating whether existing findings on direct focused feedback may be extended to other linguistic constructions. As discussed earlier, several studies have focused on the same linguistic rule: article use for first mention and anaphoric reference. Here, we examined the extent to which various types of direct WCF may facilitate the acquisition of another aspect of English article use: generic and specific plural reference. Additionally, a key contribution of our research lies in exploring the moderating effects of two learner factors on WCF, addressing recent calls to explore how learners with differential cognitive abilities may benefit from different kinds of WCF (Bitchener & Ferris, 2012; Kormos, 2012).

    Learner Differences

In the present study, learner differences or learner factors are used as cover terms to encompass differential capacities among learners in grammatical sensitivity and knowledge of metalanguage. Grammatical sensitivity is a cognitive ability that has traditionally been conceptualized as a component of language learning aptitude (e.g., Carroll, 1981; Skehan, 2002). It refers to the ability to recognize the different syntactic patterns and grammatical functions of words in a given sentence structure, irrespective of knowledge of grammatical terminology. The choice of grammatical sensitivity as a potential mediating variable was driven by the assumption that this capacity would be particularly relevant to learners' ability to benefit from WCF targeting a grammatical phenomenon.

Knowledge of metalanguage refers to the ability to use subject-specific terminology to articulate metalinguistic rules. Tests of metalanguage typically ask participants to identify examples of grammatical terms in L1 and/or L2 sentences (Alderson, Clapham, & Steel, 1997; Berry, 2009; Elder, 2009) and/or to give stand-alone examples of grammatical terms (Berry, 2009). Knowledge of metalanguage, which was operationalized here as learners' knowledge of the appropriate terminology to describe structures and forms in their L2, was presumed to be especially important to the capability to learn from metalinguistic comments. We expected that familiarity with metalanguage would assist learners in understanding and making use of the metalinguistic explanations offered, thus leading to larger instructional gains.

Although the importance of investigating aptitude–treatment interactions has increasingly been emphasized (e.g., DeKeyser, 2012; Robinson, 2002, 2005; Vatz et al., 2013) in the L2 literature, empirical research exploring potential links between learners' cognitive abilities and propensity to benefit from particular instructional treatments remains relatively limited (see Vatz et al., 2013, for a recent summary). To date, only the previously mentioned study by Sheen (2007) looked into the relationship between WCF and learner differences in cognitive abilities. In particular, she examined whether inductive language learning ability may moderate the efficacy of direct written feedback. The results revealed greater benefits for learners with high inductive language learning ability under both feedback conditions investigated, direct feedback only and direct feedback with metalinguistic comments. Sheen additionally found that high language analytic ability posed a greater advantage when the written feedback was accompanied by metalinguistic information.

To the best of our knowledge, grammatical sensitivity and knowledge of metalanguage have not yet been studied in the context of WCF research. Nevertheless, there is some evidence indicating that learners who differ along these factors may respond differently to particular instructional techniques. Among other individual difference factors, Erlam (2005) examined the moderating effects of grammatical sensitivity on the effectiveness of three types of instruction: inductive, structured input, and deductive. She found that, in L2 French classrooms, students with higher grammatical sensitivity benefited more from inductive and structured input training than deductive instruction. Similar findings emerged from a recent study of oral corrective feedback by Li (2013), who observed that grammatical sensitivity was positively linked to learning from implicit feedback (operationalized as recasts) but not to learning from explicit feedback. Trofimovich, Ammar, and Gatbonton (2007) also reported a positive correlation between learners' grammatical sensitivity and ability to benefit from recasts. Sachs (2010), however, in exploring the effects of computer-mediated feedback on L2 development, found that grammatical sensitivity predicted gains under the no feedback and more explicit feedback condition (right/wrong feedback plus metalinguistic information in the form of tree diagrams) in her research, whereas grammatical sensitivity was not significantly related to gains in the more implicit condition (right/wrong feedback). For the no feedback group, metalinguistic knowledge also emerged as a predictor of learning. Based on these studies, no clear patterns can be discerned regarding the moderating effects of grammatical sensitivity and metalinguistic knowledge on L2 instruction. More research is needed to clarify the role of these factors under various instructional conditions, and the present study takes up this research direction.

    METHOD

    Research Questions

In light of the research needs outlined earlier, the following research questions were formed:

RQ1. a. Does direct written corrective feedback help improve Greek EFL learners' use of articles for specific and generic plural reference?
     b. If yes, which type is more beneficial: direct written feedback only or direct written feedback with metalinguistic information?

RQ2. a. Are there any relationships between learning benefits from direct written corrective feedback and learner differences in grammatical sensitivity or knowledge of metalanguage?
     b. If yes, does the strength of the relationships differ by type of feedback, direct written feedback only or direct written feedback with metalinguistic information?

    Design

The study followed a pretest–posttest–delayed posttest design, with two treatment sessions between the pretest and posttest. The participants (N = 89) were randomly assigned to one of three groups: direct feedback only (n = 30; henceforth, direct feedback only), direct feedback plus metalinguistic comments (n = 30; henceforth, direct metalinguistic), or the comparison group (n = 29). The participants' ability to use articles for specific and generic plural reference was measured in two assessment tasks: a text summary and a truth value judgment (TVJ) test. During the two treatment sessions, the two direct feedback groups carried out additional versions of the text summary task and received feedback in response to article errors according to their group assignment. The comparison group completed the same tasks, but received feedback only on spelling errors. A words-in-sentences test and a test of metalanguage were also administered to all participants, in order to measure their grammatical sensitivity and knowledge of metalanguage, respectively.
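The random assignment described above can be illustrated with a minimal Python sketch. The article does not report how the randomization was carried out, so the shuffle-and-deal procedure, the function name `assign_groups`, the group labels, the participant IDs, and the fixed seed are all assumptions made for illustration.

```python
import random


def assign_groups(participant_ids, group_sizes, seed=0):
    """Shuffle participants and deal them into groups of the requested sizes.

    Hypothetical helper: the study's actual randomization procedure is not reported.
    """
    ids = list(participant_ids)
    if sum(group_sizes.values()) != len(ids):
        raise ValueError("group sizes must sum to the number of participants")
    rng = random.Random(seed)  # fixed seed only so that the sketch is reproducible
    rng.shuffle(ids)
    assignment, start = {}, 0
    for group, size in group_sizes.items():
        assignment[group] = ids[start:start + size]
        start += size
    return assignment


# 89 hypothetical participant IDs split 30 / 30 / 29, matching the reported group sizes.
groups = assign_groups(
    range(1, 90),
    {"direct_only": 30, "direct_metalinguistic": 30, "comparison": 29},
)
for name, members in groups.items():
    print(name, len(members))
```

Each participant appears in exactly one group, and the group sizes match the reported n values.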

    Participants

The 89 participants were EFL students who were in their first year in the same public high school in Cyprus. All of them were native speakers of Greek and had been studying English for 6 to 7 years. Fifty-one were female and 38 were male. They were all 16 years of age. Using the Oxford Placement Test (Dave, 2004), intermediate-level participants were selected for the study.¹ The rationale for including participants with intermediate proficiency was that they were likely to have at least some knowledge of the English article system, but were unlikely to have already mastered the rules associated with generic article use. Additionally, using participants with intermediate proficiency was thought to increase the comparability of our research to previous WCF studies, most of which also involved participants of intermediate level proficiency in English.

    Linguistic Target

In the present study, the written feedback targeted article use with specific and generic plural referents. Languages with articles vary as to whether they allow both definite and bare plural referents, and the two languages involved in this study, English and Greek, exemplify this distinction. English, the target language, permits both definite and bare plural referents and assigns a different meaning to each. As illustrated in the following examples, definite plural noun phrases (1a) carry a specific meaning like demonstrative plurals (1c); that is, they describe referents that are known to both the speaker and the hearer. By contrast, the only reading available for bare plurals is that of generic reference (1b).

(1) Plural Reference in English
    a. Definite plural
       The parrots are colorful. [specific reference: parrots are known to both interlocutors]
    b. Bare plural
       Parrots are colorful. [generic reference: all parrots in general]
    c. Demonstrative plural
       These parrots are colorful. [specific reference: parrots are known to both interlocutors]

In Greek, the participants' L1, plural referents can only be definite, and both generic and specific readings can be assigned to them. In other words, a definite plural noun phrase can describe either some specific referents or a species in general, as exemplified in (2):

(2) Definite Plural Reference in Greek
    Οι παπαγάλοι είναι πολύχρωμοι.
    the (pl.) parrots are colorful
    "(The) parrots are colorful." [specific reference: parrots known to both interlocutors OR generic reference: all parrots in general]

In sum, English employs articles to map two different meanings (generic and specific reference) onto two different forms (bare and definite plurals), whereas Greek marks both meanings onto a single form (definite plural). Thus, the interpretation of Greek definite plurals as either specific or generic is dependent on the context. This type of cross-linguistic difference has been demonstrated to pose a challenge for Spanish learners of English, whose L1, in line with most Romance languages, behaves like Greek with regard to plural referents (see Ionin & Montrul, 2010; Snape, García Mayo, & Gürel, 2009). Importantly, Greek EFL learners themselves have been found to show difficulty in using plural generics (Stefanou, 2010). Therefore, it appears worthwhile to investigate the extent to which written corrective feedback techniques may help learners master this construction.

    Treatment

The two treatment sessions took place during the participants' normally scheduled classes, in which one of the researchers acted as the teacher. First, the participants completed two versions of the text summary task. As part of the text summary task, they were required first to read a short text in Greek. Half of the text introduced an animal species using generic reference, and the other half described a specific pair of animals of the same species. Next, without consulting the text, the participants were asked to provide short descriptions of eight pictures in English, each of which corresponded to part of the information presented earlier in the text. Four of the pictures targeted generic reference (e.g., "Bears sleep in the winter"; henceforth, generic text summary), and the other four were designed to elicit specific referents (e.g., "The bears in my town came from Northern Europe"; henceforth, specific text summary). Thus, the two treatment sessions, overall, intended to elicit the use of 16 generic and 16 specific referents (both treatment sessions included two treatment tasks, with each treatment task designed to elicit four generic and four specific uses). Three different versions of the text summary task were also used as part of the assessment. The task had no time limit, and participants could seek assistance with unfamiliar vocabulary. Except for the L1 reading component, this task format was aligned well with the activities that the students would normally carry out during their English classes.

Once all students had finished the task, their task sheets were collected for marking. In the two experimental groups, all article errors with generic and specific plural referents were corrected by one of the researchers using direct WCF. Other error types, including different errors in article use, were ignored. For the direct feedback only group, the feedback took the form of insertions of the definite article when the context required specific instead of generic plural reference, or deletions of the definite article when the use of bare generics rather than definite specific plurals would have been appropriate. In the direct metalinguistic group, the direct corrections were complemented with relevant metalinguistic information, which was handwritten at the top of each task sheet in English. This information made reference to the actual content of the text summary task (lions in the example that follows), and read as:

Use the + plural noun (e.g., The lions . . .) to describe some particular animals.
≠
Use ∅ + plural noun (e.g., Lions . . .) to describe all animals in general.

In the comparison group, article errors with specific and generic referents were ignored and corrections of spelling errors were provided instead.

In the next session, the students received back their text summaries with the WCF and were given five minutes to look over their errors and the respective corrections, with the advice to attend to the feedback carefully because, as they were told, they would later have to complete a similar task. Piloting suggested that 5 minutes allowed sufficient time for learners to examine the feedback. No further comments were provided by the researcher, and the students were not asked to revise their writing, as in Bitchener and Knoch (2009), Ellis et al. (2008), and Sheen (2010). This methodological choice was in line with normal feedback practice in this context, since the students are usually not asked to revise their work based on teacher feedback. It was also deemed more appropriate given the availability of the correct construction in the feedback provided, which would have made revisions resemble passive copying on the part of the students. As Polio (2012) notes, it is obvious that a writer can look at direct corrections and copy them onto a new piece of writing (p. 377); what is key to the success of WCF is drawing learner attention to the target of the feedback provided. Referring to Sachs and Polio (2007), she suggests this might be achieved by asking students to take time to look over their corrections before revising. The revision component, however, does not seem necessary to trigger noticing, as evidenced in some existing studies of WCF (Ellis et al., 2008; Sheen, 2010) and the findings obtained here.

    Assessment Tasks

Two testing tasks were designed to assess article use with specific and generic plural referents: a text summary and a truth value judgment test. Both tasks entailed two types of items: items targeting article use for specific reference and items designed to elicit article use for generic reference. Our rationale for using two different types of assessment tasks was to assess any effects of the treatment on the participants' productive and receptive knowledge of the targeted constructions. Three parallel versions of the testing tasks were developed and counterbalanced across the three testing sessions in a split-block design. Version A of the assessment tasks can be found in the Appendix.
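One common way to counterbalance three parallel test versions across three sessions is a Latin-square rotation, sketched below in Python. The article does not specify how participants were divided into blocks or which order each block received, so the three-block rotation, the names `VERSIONS`, `SESSIONS`, and `version_order`, and the block numbering are assumptions for illustration only.

```python
VERSIONS = ["A", "B", "C"]
SESSIONS = ["pretest", "posttest", "delayed_posttest"]


def version_order(block_index):
    """Rotate the version list so that each block starts with a different version.

    Hypothetical scheme: a 3 x 3 Latin square, not necessarily the authors' design.
    """
    shift = block_index % len(VERSIONS)
    rotated = VERSIONS[shift:] + VERSIONS[:shift]
    return dict(zip(SESSIONS, rotated))


# One version order per hypothetical block of participants.
schedule = {block: version_order(block) for block in range(3)}
for block, order in schedule.items():
    print(block, order)
```

Under this rotation every block sees each version exactly once, and every version is used in each session by exactly one block, so version difficulty is not confounded with testing time.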

Text Summary Test. This test had exactly the same format as the treatment task; it aimed at assessing the participants' productive knowledge of generic and specific article use. In line with its specifications, it succeeded in eliciting the use of plural referents 88.3% of the time in both the generic and specific sections of the assessment. To assess the internal consistency reliability of the three versions of the test, Cronbach's alpha was calculated by aggregating the scores of the three versions of the text summary across the three testing sessions. This was found to be high for all three versions of the generic (α > .77) as well as the specific (α > .88) parts of the assessment. The total score was four points for both the generic and specific components of the test. Participants received one point for each correct response. We decided against utilising obligatory occasion analysis given that the use of the target construction was found to be essential in describing each picture prompt based on native speaker baseline data.
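As an illustration of the internal consistency statistic mentioned above, the following minimal Python sketch computes Cronbach's alpha from a participants-by-items matrix of dichotomous scores. The function name, the use of sample variances, and the six-participant data set are assumptions made for illustration; they are not the study's data, and the authors report using SPSS rather than this code.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-participant item-score lists.

    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_columns = list(zip(*scores))  # transpose: one tuple of scores per item
    item_variances = [variance(column) for column in item_columns]
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Made-up scores: six participants, four items, 1 = correct, 0 = incorrect.
example_scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]
alpha = cronbach_alpha(example_scores)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items rank participants in a similar way, which is the sense in which the reported coefficients are described as high.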

Truth Value Judgment Test. The aim of the truth value judgment test was to probe the students' receptive knowledge of plural generic and specific noun phrases; it was modeled on the instrument used in Ionin and Montrul's (2009) study investigating article use by Spanish learners of English. In keeping with Ionin and Montrul's task specifications, each item involved a short story of about 20–40 words, which juxtaposed a specific with a generic reading. Based on the story, participants were asked to judge the truth value of a subsequent statement. Each version of the test included 18 items: 12 target and 6 distractor items. The 12 target items were constructed using four stories, each repeated three times: once with a definite plural in the subsequent statement, once with a bare plural, and a third time with a demonstrative plural. The item with the demonstrative plural statement served as the control, since anaphoric reference is the only possible reading for both English and Greek demonstrative plurals. In two stories, the target statement was true with a definite plural, true with a bare plural, and false with a demonstrative plural. For the other two stories, the values were reversed. An example of a story and a corresponding statement with a definite plural (the libraries) is the following:

(3) Example of a Truth Value Judgment Item

    Most libraries are full of millions of printed books. But there are three strange libraries. They don't have printed books, they only have millions of electronic books.

    The libraries have electronic books.    TRUE    FALSE

The six distractor items also consisted of a 20–40 word story, followed by a true–false statement. The stories juxtaposed interpretations of passive and active voice. The six items altogether included three stories, each appearing twice: once with a true target statement and a second time with a false one.

As indicated by Web VocabProfiler v3 (Cobb, n.d.), the vast majority of the vocabulary items in the test were among the first thousand most frequent words in the English language, and a few were among the second thousand most frequently used words. In addition, students were allowed to ask for the meaning of any unknown words in order to ensure that lack of vocabulary knowledge does not interfere with test performance. The internal consistency reliability coefficients, computed separately for the items included in the different versions of the assessment, were at acceptable or good levels for both the generic and specific TVJ test versions (α > .70 and α > .78). For the generic as well as the specific items, the maximum total score was 4 points.

    Measures of Learner Differences

Participants were administered two tests, a words-in-sentences test and a test of metalanguage, in order to obtain information about their grammatical sensitivity and knowledge of metalanguage, respectively.

Words in Sentences Test. The words-in-sentences test was an adapted version of the corresponding component of the MLAT and aimed to assess the participants' grammatical sensitivity or ability to understand the functions of words in sentences. Fifteen Greek sentences were utilized as stimuli. Each key sentence included an underlined word, and was followed by a second sentence in which five words were underlined. The participants had to choose which of the five words in the second sentence filled the same grammatical role as the underlined word in the key sentence. The internal consistency reliability for the test was good (α = .774).

Test of Metalanguage. The test of metalanguage was an adaptation of an instrument originally developed for this purpose by Bloor (1986) and more recently used by Alderson et al. (1997). This test was deemed appropriate for use in the present study, since receptive knowledge of metalanguage was hypothesized to be relevant in instructional conditions that prompted learners to interpret metalinguistic feedback. The participants were presented with a Greek sentence and asked to identify words and phrases that corresponded to a list of 10 grammatical terms (e.g., adjective, preposition), some of which were potentially relevant to describing rules about the target construction (e.g., definite article, noun). The same sentence used by Bloor and Alderson et al., translated into Greek, was used as the source for the examples: "Materials are delivered to the factory by a supplier, who usually has no technical knowledge, but who happens to have the right contacts." The internal consistency reliability for the test was good (α = .845).

    Procedure

As Figure 1 illustrates, the learners were initially screened using the grammar part of the Oxford Placement Test (Dave, 2004). On the second day of the study, the pretest was administered, followed by the first treatment task. A few days later within the same week, the participants received WCF in response to errors they made on the first treatment task. After studying the feedback for 5 minutes, they completed the second treatment task. The second week of the study started with the participants looking over the WCF, which addressed errors in their performance on the second treatment task. They were again given 5 minutes to process the feedback, and the immediate posttest then followed, as in previous research by Bitchener and Knoch (2009). In the third week, the two measures of learner factors were administered. Finally, in the fourth week, the delayed posttest was completed, 2 weeks after the second treatment session. The pretest, posttest, and delayed posttest each lasted approximately 40 minutes, and the participants on average took about 20 minutes to carry out each treatment task. The words-in-sentences and metalanguage tests were completed within the time limit of a normally scheduled 45-minute class. During the period of the study, the teachers of the participating classes were asked not to provide any input on article use in their English lessons in an attempt to control for exposure to the target construction outside of the experiment.

    Data Analyses

    Scoring. Both the assessment tasks and the measures of learner differences were marked dichotomously. One point was awarded for each correct answer, and zero points were given for incorrect responses. For the two assessment tasks, separate scores were calculated for the generic and the specific reference items.
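The scoring scheme described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the answer key, and the responses are all hypothetical, and `"zero"` simply stands for an omitted article.

```python
def score_items(responses, key, item_types):
    """Dichotomous scoring with separate generic and specific subscores.

    responses, key: one answer per item; item_types: "generic" or "specific".
    """
    scores = {"generic": 0, "specific": 0}
    for answer, correct, kind in zip(responses, key, item_types):
        # One point for a correct answer, zero otherwise.
        scores[kind] += 1 if answer == correct else 0
    return scores


def gain_scores(earlier, later):
    """Gain per subscore, e.g. posttest minus pretest."""
    return {kind: later[kind] - earlier[kind] for kind in earlier}


# Hypothetical four-item key and one learner's answers at two test times.
key = ["the", "zero", "zero", "the"]
types = ["specific", "generic", "generic", "specific"]
pretest = score_items(["zero", "zero", "the", "zero"], key, types)
posttest = score_items(["the", "zero", "the", "the"], key, types)
print(pretest, posttest, gain_scores(pretest, posttest))
```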

  • 270 The Modern Language Journal 99 (2015)

    FIGURE 1. Study Design

    Statistical Analyses. As a first step, descriptive statistics were calculated for each group's performance on the two assessment tasks of the pretest, posttest, and delayed posttest and the two learner difference measures. In order to address the first research question and examine the effects of direct WCF on learner gains in article use, a series of ANOVAs was conducted. First, one-way ANOVAs were run on the pretest scores for each assessment task in order to detect any initial group differences. Next, the data were submitted to a series of mixed-model ANOVAs. Since the results of Mauchly's test for sphericity were statistically significant in the analyses, the Greenhouse–Geisser correction was applied to the degrees of freedom. Where appropriate, post hoc independent samples t-tests were performed. To measure effect sizes, we computed partial eta-squared (ηp²) for the ANOVAs and Cohen's d for the t-tests. Following Cohen (1988), ηp² values of .01, .06, and .14 and d values of .20, .50, and .80 were considered small, medium, and large, respectively.

    The second research question was addressed by computing Spearman two-tailed bivariate correlations. First, correlations were calculated between the participants' pretest scores and the learner difference factors in order to detect any relationships between the variables at the time of the pretest. Next, we correlated the learners' pretest–posttest and pretest–delayed posttest gain scores in the assessment tasks with their scores in the measures of learner differences. Following Cohen (1992), ρ² values of .01, .09, and .25 were interpreted as small, medium, and large effect sizes. For all analyses, the alpha level for significance was set at .01 to adjust for multiple comparisons. All statistical analyses were conducted using the IBM Statistical Package for Social Sciences (SPSS) 19.
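To make the correlational step concrete, the sketch below computes a Spearman rank correlation (Pearson's r on average ranks) and labels the squared coefficient with the .01, .09, and .25 benchmarks given above. The analyses in the study were run in SPSS; the function names and the example scores here are hypothetical, and the sketch does not compute p values.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def effect_label(rho):
    """Label rho squared using the .01 / .09 / .25 benchmarks."""
    r2 = rho ** 2
    if r2 >= 0.25:
        return "large"
    if r2 >= 0.09:
        return "medium"
    if r2 >= 0.01:
        return "small"
    return "negligible"


# Hypothetical aptitude scores and gain scores for five learners.
aptitude = [12, 15, 9, 20, 17]
gains = [1, 2, 0, 3, 2]
rho = spearman_rho(aptitude, gains)
print(round(rho, 2), effect_label(rho))
```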

    RESULTS

    RQ1: Effects of Direct Written Corrective Feedback on Article Use

    Table 1 presents the descriptive statistics for the learners' scores on the two assessment tasks of the pretest, posttest, and delayed posttest. As Table 1 and Figures 2–5 show, the comparison group improved slightly from the pretest to the posttest in generic article use on both tests but

  • Charis Stefanou and Andrea Révész 271

    TABLE 1
    Pretest, Posttest, and Delayed Posttest Scores on the Testing Tasks Across the Groups

                    Comparison (N = 29)                    Direct Only (N = 30)                   Direct Metalinguistic (N = 30)
                    Pre M (SD)   Post M (SD)  Del M (SD)   Pre M (SD)   Post M (SD)  Del M (SD)   Pre M (SD)   Post M (SD)  Del M (SD)
    Generic
      Text summary  2.69 (1.34)  3.38 (1.11)  3.14 (1.30)  3.40 (.89)   3.50 (1.00)  3.60 (.89)   2.90 (1.27)  3.67 (.92)   3.83 (.59)
      TVJ           2.93 (1.43)  3.38 (1.05)  2.72 (1.36)  3.17 (.98)   3.40 (1.10)  3.47 (.97)   3.00 (1.23)  3.03 (1.38)  3.57 (.94)
    Specific
      Text summary  1.72 (1.51)  1.17 (1.60)  1.48 (1.64)  .97 (1.07)   2.93 (1.41)  3.13 (1.43)  1.13 (1.55)  3.10 (1.47)  3.33 (1.18)
      TVJ           1.27 (1.36)  1.31 (1.51)  .76 (1.02)   .73 (.94)    1.67 (1.49)  1.87 (1.55)  1.13 (1.63)  2.30 (1.68)  2.90 (1.47)

    Note. The total scores were 4 points for each test.
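As a reading aid for Table 1, the sketch below computes the change in group means for the specific text summary items. The means are copied from Table 1; the dictionary layout and the function name are illustrative choices, and a difference in group means equals the mean individual gain only if the same learners contributed to both test times.

```python
# Group means copied from Table 1 (specific text summary, maximum score 4).
specific_text_summary = {
    "comparison": {"pre": 1.72, "post": 1.17, "delayed": 1.48},
    "direct_only": {"pre": 0.97, "post": 2.93, "delayed": 3.13},
    "direct_metalinguistic": {"pre": 1.13, "post": 3.10, "delayed": 3.33},
}


def mean_change(means, start, end):
    """Difference between two group means, rounded to two decimals."""
    return round(means[end] - means[start], 2)


for group, means in specific_text_summary.items():
    print(group,
          mean_change(means, "pre", "post"),
          mean_change(means, "pre", "delayed"))
```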

    declined in accurate article use with plural specific referents. The same trends were observed for the pretest–delayed posttest gain scores of the comparison group, except for a slight decrease in the generic TVJ scores. Compared with the comparison group, both feedback groups exhibited considerable pretest–posttest improvement in the generic as well as the specific sections of both assessment tasks, and maintained their improved scores on the delayed posttest. The only exception to this pattern was the direct metalinguistic group's posttest performance on the generic part of the TVJ test, where the participants retained, rather than increased, their pretest scores.

    Spearman correlational analyses, run separately for the participants' pretest, posttest, and delayed posttest scores, found four significant correlations between the participants' performance on the assessment tasks: between (a) the generic and specific parts of the text summary pretest, (b) the generic and specific components of the TVJ pretest, (c) the specific part of the text summary posttest and generic part of the TVJ posttest, and (d) the specific parts of the text summary and TVJ delayed posttest. The strength of these correlations was in the small to medium range (.09 < ρ² < .16, p < .01). The remaining 14 correlations did not yield significant links. Overall, these results suggest that the tests did not tap exactly the same constructs.

    Turning to the inferential statistics conducted to test group differences, one-way ANOVAs found no significant difference in the pretest scores of the three groups (comparison, direct only, direct metalinguistic) on any of the assessments: generic text summary, F(2, 86) = 2.83, p = .07, ηp² = .06; specific text summary, F(2, 86) = 2.41, p = .10, ηp² = .05; generic TVJ, F(2, 86) = .29, p = .75, ηp² < .01; specific TVJ, F(2, 86) = 1.30, p = .28, ηp² = .03. Having established that no significant differences existed between the groups at the pretest, an overall mixed-model ANOVA was run, with time as the within-subjects factor and group as the between-subjects variable, for each assessment task. Except for the generic part of the text summary test, F(3.43, 86) = 2.26, p = .08, ηp² = .05, the interaction between time and group emerged as statistically significant. The effect size was large for the specific text summary test, F(3.64, 86) = 12.13, p < .01, ηp² = .22, and the specific TVJ test, F(3.49, 86) = 6.30, p < .01, ηp² = .13, but medium for the generic TVJ test, F(3.39, 86) = 3.09, p = .02, ηp² = .07.

    Post hoc paired comparisons revealed significant differences, with medium to large effect sizes, between the comparison and direct feedback only group and between the comparison and metalinguistic group for the specific component of both the text summary test (direct only and comparison: F(1.59, 57) = 21.34, p < .01, ηp² = .27; metalinguistic and comparison: F(1.91, 57) = 15.54, p < .01, ηp² = .21) and the TVJ test (direct only and comparison: F(1.84, 57) = 8.66, p < .01, ηp² = .13; metalinguistic and comparison: F(1.74, 57) = 10.64, p < .01, ηp² = .16). However, no significant differences were detected in the performance of the two feedback groups, text summary test: F(1.78, 58) < .01, p = .99, ηp² < .01; TVJ test: F(1.67, 58) = 1.84, p = .43, ηp² = .01. As shown in Table 2, independent samples t-tests confirmed that the direct only and metalinguistic groups achieved significantly higher pretest–posttest gains than the comparison group on the specific part of the text summary test, and displayed greater pretest–delayed posttest gains on the specific part of both the text summary and TVJ tests. The effect sizes were in the large range. For the generic TVJ test, post hoc mixed-model ANOVAs found a significant difference between the performance of the comparison and metalinguistic group, F(1.71, 57) = 5.36, p < .01, ηp² = .09. However, an independent samples t-test revealed that this was due to a posttest–delayed posttest difference, t(57) = 4.20, p < .01, d = 1.07, rather than differences in pretest–posttest, t(57) = 1.04, p = .30, d = .27, or pretest–delayed posttest gains, t(57) = 1.90, p = .06, d = .49.

    In sum, the two direct feedback groups demonstrated significantly greater pretest–posttest and pretest–delayed posttest development than the comparison group in article use for specific plural reference on both the text summary and TVJ tests, but no significant differences were found between these groups on the generic component of the tests. Nor was any difference detected between the gains of the direct only and metalinguistic groups on any of the assessments.

    FIGURE 2. Generic Text Summary: Performance Across Groups

    FIGURE 3. Specific Text Summary: Performance Across Groups

    FIGURE 4. Generic TVJ: Performance Across Groups
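Partial eta-squared can be recovered from a reported F ratio and its degrees of freedom, which offers a quick check on the one-way pretest ANOVAs above. The sketch below uses the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error); the function names are illustrative. It is applied only to the pretest tests, where both degrees of freedom are reported without correction; the Greenhouse–Geisser corrected tests would need the corrected error degrees of freedom, so they are not checked here.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta-squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


def anova_effect_label(eta_sq):
    """Label using the .01 / .06 / .14 benchmarks cited from Cohen (1988)."""
    if eta_sq >= 0.14:
        return "large"
    if eta_sq >= 0.06:
        return "medium"
    if eta_sq >= 0.01:
        return "small"
    return "negligible"


# Pretest one-way ANOVAs as reported in the text, each with F(2, 86).
pretest_f = {
    "generic text summary": 2.83,
    "specific text summary": 2.41,
    "generic TVJ": 0.29,
    "specific TVJ": 1.30,
}
for task, f_value in pretest_f.items():
    print(task, round(partial_eta_squared(f_value, 2, 86), 2))
```

The rounded values agree with the effect sizes reported for the pretest comparisons (.06, .05, below .01, and .03).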

    RQ2: Moderating Effects of Learner Factors on the Effectiveness of Direct WCF

    Table 3 displays the descriptive statistics of the learners' performance on the words-in-sentences test and the test of metalanguage for the two

    FIGURE 5. Specific TVJ: Performance Across Groups


    TABLE 2
    Results of t-tests Comparing the Comparison Group With the Direct Only and Direct Metalinguistic Groups in Specific Reference Contexts

    Groups/Testing Task        Gain Score          t(57)   p   d
    Comparison–direct only
      Specific text summary    Pretest–posttest    5.44


    TABLE 4
    Correlations Between Measures of Learner Differences and Pretest Scores of the Direct Only and Direct Metalinguistic Groups

                   Metalanguage                  Grammatical Sensitivity
                   Direct Only   Direct Meta     Direct Only   Direct Meta
    Testing Task   ρ      p      ρ      p        ρ      p      ρ      p
    Generic
      Text sum.    .44    .02    .15    .44      .04    .83    .19    .31
      TVJ          .24    .20    .13    .49      .12    .52    .04    .83
    Specific
      Text sum.    .21    .26    .22    .24      .13    .48


    of introspective methods (e.g., stimulated recall protocols).

    A second issue that deserves attention regarding the results for the overall effectiveness of WCF is the fact that the differences in gains between the experimental and comparison groups were of a much larger effect size on the text summary test as compared to the TVJ test. The principle of Transfer Appropriate Processing (TAP) may offer an explanation for this finding. The fundamental tenet of TAP is that we can better transfer and remember what we have learned if the cognitive processes that are active during learning are similar to those that are active during retrieval (Lightbown, 2008, p. 27). One implication of TAP is that when there is a match between the learning and testing conditions in an effects-of-instruction experiment, participants will be better able to retrieve what they have learned during the instructional treatment and use it in the assessment. Applying this principle to the present study, the participants may have demonstrated higher gains on the text summary test because, unlike the TVJ test, the text summary required them to use articles under conditions that were very similar to those they had previously encountered during the treatment.

    Having established the positive effects of direct WCF on L2 article use, we examined the extent to which the learners' development differed depending on whether they received direct feedback only or direct feedback supplemented with metalinguistic information. The statistical analyses, conducted to compare the pretest–posttest and pretest–delayed posttest gains of the two experimental groups on the two assessment tasks, yielded no significant difference. Our results thus largely reflect those documented by Bitchener (2008) and Bitchener and Knoch (2008), who found no benefits for complementing direct feedback with metalinguistic comments, and run contrary to the findings of Sheen (2007), who detected superior gains in article use on all three of her assessment tasks when metalinguistic information was also available to learners.

    A possible explanation for the conflicting findings might lie in the nature of the treatment tasks employed. In Bitchener and Knoch's research, the participants completed picture description tasks similar to those in the present study, where participants were asked to provide short descriptions of pictures based on a descriptive text they had previously heard. The descriptive task in the current study elicited a list of sentences rather than a cohesive text, and it is likely that the written output produced by Bitchener and Knoch's participants was similar in nature. This relative simplicity of the learner output in terms of discourse features might have made the target of the direct WCF more salient and transparent in these studies; thus, participants might have experienced relatively little difficulty in deducing the relevant article rules on their own without the added help of metalinguistic information. Sheen, in contrast, requested her participants to reproduce narratives as part of the treatment. This task probably led to the creation of more cohesive and complex texts, which might have imposed more demands on the learners when processing the feedback, thereby making it more difficult for them to discern the target rule in the absence of metalinguistic comments.

    The second research question asked about the extent to which the effectiveness of direct WCF on article errors is moderated by learner differences in grammatical sensitivity and knowledge of metalanguage. An additional subquestion queried the extent to which the strength of any relationships differs depending on whether learners received direct feedback only or direct feedback plus metalinguistic comments. Spearman correlations, which were run between the learner difference measures and the combined gain scores of both experimental groups on the two assessment tasks, revealed three medium-sized links, all involving gain scores on the specific section of the text summary task. Grammatical sensitivity was found to correlate with the learners' pretest–posttest and pretest–delayed posttest gain scores, and knowledge of metalanguage with their pretest–posttest gains. In other words, the participants who demonstrated greater grammatical sensitivity and familiarity with metalinguistic terminology were found to benefit more from the feedback provided.

    The same type of correlational analyses, conducted for the two experimental groups separately, yielded the same correlations, but of large size, for the direct feedback only group. No significant correlations were detected for the direct metalinguistic group. Overall, these results suggest that, while the participants with greater grammatical sensitivity and knowledge of metalanguage were more likely to learn from the WCF in the direct feedback only group, the participants' performance was not affected by learner differences when metalinguistic information was also made available.

    An issue worthy of discussion is that the two learner factors in focus, grammatical sensitivity and knowledge of metalanguage, were linked only to the participants' gains in the direct feedback only group, but not to the extent of development exhibited by the direct metalinguistic group. A possible explanation for the lack of moderating effects observed for the participants in the metalinguistic group may be that the provision of additional metalinguistic information, in fact, neutralized any advantage that could potentially have been afforded by higher grammatical sensitivity or knowledge of metalanguage. In other words, the information supplied in the metalinguistic comments probably enabled the participants to compensate for their potentially weaker ability to recognize grammatical functions and to use grammatical terminology. A similar argument was put forward by Erlam (2005) in explaining why grammatical sensitivity was found to facilitate L2 learning under the more implicit, but not the more explicit, instructional conditions in her study of aptitude–treatment interactions.

    Finally, a question that arises is why all the significant relations between the learner factors and gain scores were detected on the task component that targeted article use with specific referents. Again, this finding may be accounted for by the high scores achieved by the participants on the generic items. As already mentioned, the participants produced relatively high scores on the pretest tasks assessing their ability to mark generic reference (as noted earlier, this might not have corresponded to accurate underlying knowledge). This initial strong performance left little room to demonstrate pretest–posttest development on the part of the learners, leading to a correspondingly low variance in the gain scores targeting generic items. This limited variance, in turn, is likely to have restricted the chances of detecting significant correlations between the measures of learner differences and gains in the generic sections of the assessment tasks. In contrast, the larger variance of the learners' gain scores in the specific items made it more probable that any moderating effects of the learner factors would surface.
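The ceiling argument above can be made concrete by computing how much room for improvement each group had at the pretest. The sketch below subtracts the Table 1 pretest means for the text summary task from the 4-point maximum; the names `MAX_SCORE` and `headroom` are illustrative, and the calculation only describes group means, not the variance of individual gain scores.

```python
MAX_SCORE = 4  # maximum score per section, as stated in the note to Table 1

# Pretest means for the text summary task, copied from Table 1.
pretest_means = {
    "generic": {"comparison": 2.69, "direct_only": 3.40, "direct_meta": 2.90},
    "specific": {"comparison": 1.72, "direct_only": 0.97, "direct_meta": 1.13},
}


def headroom(pretest_mean, max_score=MAX_SCORE):
    """Largest possible average gain given the pretest mean."""
    return round(max_score - pretest_mean, 2)


for section, groups in pretest_means.items():
    print(section, {group: headroom(m) for group, m in groups.items()})
```

On these figures the direct only group could gain at most about 0.6 points on average on the generic items, against roughly 3 points on the specific items.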

    CONCLUSION

    The present study set out to examine the extent to which direct WCF can help Greek EFL learners improve their article use for generic and specific plural reference. Two feedback types were compared: direct written feedback only and direct written feedback plus metalinguistic information. Two learner factors were also investigated in relation to the participants' ability to learn from WCF: grammatical sensitivity and knowledge of metalanguage. The novel aspects of the research included its new linguistic target and focus on learner difference factors that have not yet been investigated (grammatical sensitivity and familiarity with metalinguistic terms). Corroborating the results of previous empirical research and contrary to Truscott's (1996) claims, the provision of direct WCF on article use was found to be superior to the comparison condition. Interestingly, however, supplementing direct WCF with metalinguistic comments afforded little additional benefit to learners. Furthermore, participants with greater grammatical sensitivity and familiarity with metalinguistic terminology appeared to improve more only when direct feedback, in the absence of metalinguistic comments, was supplied in response to article errors. No significant links emerged between the measures of learner differences and the gains of the experimental group that received direct feedback and metalinguistic comments. This pattern of findings is in line with Erlam's (2005) proposal that conditions rich in input may have the capacity to neutralize learner differences in cognitive abilities.

    Finally, some limitations of the study need to be acknowledged and considered in future research. First, the text summary task constituted a focused pedagogic task, which was designed to elicit specific and generic article use. As a consequence, it lacked situational authenticity (Ellis, 2003), and involved article use in relatively controlled contexts only. To address these limitations, an important avenue for future research would involve extending the research questions posed here to tasks that more closely resemble real-life uses of language and elicit less controlled application of the target rule. Second, we focused on only one aspect of article use, so it is not certain that our results would transfer to other linguistic targets (Xu, 2009). This imposes limits on the generalizability of the findings and weakens the claims we made about feedback efficacy. Thus, a replication of this study with complex linguistic constructions would be especially desirable, given that little is known about the impact of WCF on complex L2 features. A third, related limitation concerns the fact that we only sought evidence about the participants' development in producing and comprehending the linguistic target. No attempt was made to examine the potential impact of feedback on the linguistic complexity of learners' production, as in Hartshorn et al. (2010) and Van Beuningen et al. (2012). Given that the text summary test elicited a list of simple sentences rather than a text, the learners' output was not amenable to analysis in terms of linguistic complexity measures. Using less controlled tasks in future research, as suggested before, would help resolve this issue. Fourth, our investigation included only two learner factors. Thus, as suggested by Kormos (2012), further studies exploring the moderating effects of other learner factors, such as working memory capacity and motivation, are also warranted. In future research, context-specific factors should also be taken into consideration, since issues such as learners' schooling environment together with their teachers' guidance and assessment practices may shape their capacity to benefit from pedagogical interventions. Finally, follow-up research would profit from utilizing introspective methods to uncover how learners engage with different types of WCF, and whether this might be influenced by learner differences in cognitive abilities.

    ACKNOWLEDGMENTS

    We would like to thank the editor and the three anonymous reviewers for their helpful suggestions on this article. Any errors, of course, are our own. This research was supported by the Language Learning Dissertation Grant awarded to Charis Stefanou.

    NOTE

    1 According to the manual of the Oxford Placement Test, scores ranging between 120 and 149 correspond to an intermediate level of proficiency, or B1 threshold and B2 vantage of the Common European Framework ranking. In keeping with these guidelines, learners were chosen for participation in the study if their test scores fell within the aforementioned range.

    REFERENCES

    Alderson, J. C., Clapham, C., & Steel, D. (1997). Metalinguistic knowledge, language aptitude, and language proficiency. Language Teaching Research, 1, 93–121.

    Berry, R. (2009). EFL majors' knowledge of metalinguistic terminology: A comparative study. Language Awareness, 18, 113–128.

    Bitchener, J. (2008). Evidence in support of written corrective feedback. Journal of Second Language Writing, 17, 102–118.

    Bitchener, J., & Ferris, D. (2012). Written corrective feedback in second language acquisition and writing. New York: Routledge.

    Bitchener, J., & Knoch, U. (2008). The value of written corrective feedback for migrant and international students. Language Teaching Research, 12, 409–431.

    Bitchener, J., & Knoch, U. (2009). The contribution of written corrective feedback to language development: A ten month investigation. Applied Linguistics, 31, 193–214.

    Bloor, T. (1986). What do language students know about grammar? British Journal of Language Teaching, 24, 157–162.

    Carroll, J. B. (1981). Twenty-five years of research in foreign language aptitude. In K. C. Diller (Ed.), Individual differences and universals in language learning aptitude (pp. 83–118). Rowley, MA: Newbury House.

    Cobb, T. (n.d.). Web Vocabprofile. Accessed 16 July 2014 at http://www.lextutor.ca/vp/eng/

    Cohen, J. (1988). Statistical power analysis for the behavioral sciences. Hillsdale, NJ: Lawrence Erlbaum.

    Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155–159.

    Dave, A. (2004). Oxford Placement Test 1. Oxford: Oxford University Press.

    DeKeyser, R. (2012). Interactions between individual differences, treatments, and structures in SLA. Language Learning, 62, 189–200.

    Elder, C. (2009). Validating a test of metalinguistic knowledge. In R. Ellis, S. Loewen, C. Elder, R. Erlam, J. Philp, & H. Reinders (Eds.), Implicit and explicit knowledge in second language learning, testing and teaching (pp. 113–138). Bristol, UK: Multilingual Matters.

    Ellis, R. (2003). Task-based language learning and teaching. Oxford: Oxford University Press.

    Ellis, R. (2005). Principles of instructed language learning. System, 33, 209–224.

    Ellis, R., Sheen, Y., Murakami, M., & Takashima, H. (2008). The effects of focused and unfocused written corrective feedback in an English as a foreign language context. System, 36, 353–371.

    Erlam, R. (2005). Language aptitude and its relationship to instructional effectiveness in second language acquisition. Language Teaching Research, 9, 147–171.

    Farrokhi, F., & Sattarpour, S. (2012). The effects of direct written corrective feedback on improvement of grammatical accuracy of high-proficiency L2 learners. World Journal of Education, 2, 49–57.

    Ferris, D. R. (1999). The case for grammar correction in L2 writing classes: A response to Truscott (1996). Journal of Second Language Writing, 8, 1–10.

    Ferris, D. R. (2002). Treatment of error in second language writing classes. Ann Arbor, MI: The University of Michigan Press.

    Ferris, D. R., & Roberts, B. (2001). Error feedback in L2 writing classes: How explicit does it need to be? Journal of Second Language Writing, 10, 161–184.

    García Mayo, M. (2008). The acquisition of four nongeneric uses of the article the by Spanish EFL learners. System, 36, 550–565.

    Hartshorn, K. J., Evans, N. W., Merrill, P. F., Sudweeks, R. R., Strong-Krause, D., & Anderson, N. J. (2010). Effects of dynamic corrective feedback on ESL writing accuracy. TESOL Quarterly, 44, 84–106.


    Ionin, T., & Montrul, S. (2009). Article use and generic reference: Parallels between L1- and L2-acquisition. In M. García Mayo & R. Hawkins (Eds.), Second language acquisition of articles: Empirical findings and theoretical implications (pp. 147–171). Philadelphia/Amsterdam: John Benjamins.

    Ionin, T., & Montrul, S. (2010). The role of L1 transfer in the interpretation of articles with definite plurals in L2 English. Language Learning, 60, 877–925.

    Kormos, J. (2012). The role of individual differences in L2 writing. Journal of Second Language Writing, 21, 390–403.

    Li, S. (2013). The interactions between the effects of implicit and explicit feedback and individual differences in language analytic ability and working memory. Modern Language Journal, 97, 634–654.

    Lightbown, P. M. (2008). Transfer appropriate processing as a model for classroom second language acquisition. In Z. H. Han (Ed.), Understanding second language process (pp. 27–44). Clevedon, UK: Multilingual Matters.

    Liu, D., & Gleason, J. L. (2002). Acquisition of the article the by nonnative speakers of English. Studies in Second Language Acquisition, 24, 1–26.

    Loewen, S. (2012). The role of feedback. In S. M. Gass & A. Mackey (Eds.), The Routledge handbook of second language acquisition (pp. 24–40). London: Routledge.

    Polio, C. (2012). The relevance of second language acquisition theory to the written error correction debate. Journal of Second Language Writing, 21, 375–389.

    Robinson, P. (2002). Learning conditions, aptitude complexes and SLA: A framework for research and pedagogy. In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 113–135). Philadelphia/Amsterdam: John Benjamins.

    Robinson, P. (2005). Aptitude and second language acquisition. Annual Review of Applied Linguistics, 25, 46–73.

    Sachs, R. (2010). Individual differences and the effectiveness of visual feedback on reflexive binding in L2 Japanese. (Unpublished doctoral dissertation). Georgetown University, Washington, DC.

    Sachs, R., & Polio, C. (2007). Learners' uses of two types of written feedback on a L2 writing revision task. Studies in Second Language Acquisition, 29, 67–100.

    Schmidt, R. W. (1990). The role of consciousness in second language learning. Applied Linguistics, 11, 129–158.

    Sheen, Y. (2007). The effect of focused written corrective feedback and language aptitude on ESL learners' acquisition of articles. TESOL Quarterly, 41, 255–281.

    Sheen, Y. (2010). Differential effects of oral and written corrective feedback in the ESL classroom. Studies in Second Language Acquisition, 32, 203–234.

    Sheen, Y., Wright, D., & Moldawa, A. (2009). Differential effects of focused and unfocused written correction on the accurate use of grammatical forms by ESL learners. System, 37, 556–569.

    Shintani, N., & Ellis, R. (2013). The comparative effect of direct written corrective feedback and metalinguistic explanation on learners' explicit and implicit knowledge of the English indefinite article. Journal of Second Language Writing, 22, 286–306.

    Skehan, P. (2002). Theorizing and updating aptitude. In P. Robinson (Ed.), Individual differences and instructed language learning (pp. 69–94). Philadelphia/Amsterdam: John Benjamins.

    Snape, N., García Mayo, M., & Gürel, A. (2009). Spanish, Turkish, Japanese and Chinese L2 learners' acquisition of generic reference. In M. Bowles et al. (Eds.), Proceedings of the 10th Generative Approaches to Second Language Acquisition Conference (pp. 1–8). Somerville, MA: Cascadilla Proceedings Project.

    Stefanou, C. (2010). The use of the English article system by Greek learners of English. (Unpublished master's thesis). Lancaster University, Lancaster, UK.

    Storch, N., & Wigglesworth, G. (2010). Learners' processing, uptake and retention of corrective feedback on writing. Studies in Second Language Acquisition, 32, 303–334.

    Suh, B. R. (2010). Written feedback in second language acquisition: Exploring the roles of type of feedback, linguistic targets, awareness and concurrent verbalization. (Unpublished doctoral dissertation). Georgetown University, Washington, DC.

    Trenkic, D. (2007). Variability in second language article production: Beyond the representational deficit vs. processing constraints debate. Second Language Research, 23, 289–327.

    Trofimovich, P., Ammar, A., & Gatbonton, E. (2007). How effective are recasts? The role of attention, memory, and analytical ability. In A. Mackey (Ed.), Conversational interaction in second language acquisition: A series of empirical studies (pp. 171–195). Oxford: Oxford University Press.

    Truscott, J. (1996). The case against grammar correction in L2 writing classes. Language Learning, 46, 327–369.

    Truscott, J. (2007). The effect of error correction on learners' ability to write accurately. Journal of Second Language Writing, 16, 255–272.

    Van Beuningen, C. G., de Jong, N. H., & Kuiken, F. (2012). Evidence on the effectiveness of comprehensive error correction in second language writing. Language Learning, 62, 1–41.

    Vatz, K., Tare, M., Jackson, S. R., & Doughty, C. J. (2013). Aptitude–treatment interactions in second language acquisition: Findings and methodology. In G. Granena & M. Long (Eds.), Sensitive periods, language aptitude, and ultimate L2 attainment (pp. 273–292). Philadelphia/Amsterdam: John Benjamins.

    Xu, C. (2009). Overgeneralization from a narrow focus: A response to Ellis et al. (2008) and Bitchener (2008). Journal of Second Language Writing, 18, 270–275.


    Appendix

    Version A of Assessment Tasks

    Task 1: Text Summary

    As part of a school visit to the zoo you have to write a short description about some animals. First you have 3 minutes to read the given text. Then the text will be replaced with some pictures. You have to write a summary of the text in English using the pictures to help you remember what it was about.

    Text 1: [Greek source text not preserved]


    (Translation) Lions are the king of the jungle. Lions usually sleep during the day and hunt during the night. Lions are usually used in circuses. The lions in our town zoo were brought from Africa. The lions play with a ball all day. Last week the lions had two babies. Next month the lions will be transferred to France.

    Task 2: Truth Value JudgmentRead the following stories and decide if the sentence given below each story is True or False by circlingthe corresponding choice. Your decision should be based on the story.

    Story 1:
    Most cinemas have several rows with seats. But there are three strange cinemas. They don't have seats, they only have several sofas.
    The cinemas have several sofas. TRUE FALSE

    Story 2:
    Most hotels are very noisy places because of the hundreds of people who live there. But two hotels are always very quiet. There is a rule that forbids guests to make loud noise.
    Hotels are very quiet. TRUE FALSE

    Story 3:
    Yesterday I heard a very funny story that happened in a school. A little boy was being chased by a little girl around the school yard because he had stolen her favorite doll.
    A little girl was chasing a little boy. TRUE FALSE

    Story 4:
    In our History class we were taught that most castles were made of big pieces of stone. But the teacher said that there were two castles that were different. They were made of wood instead of stones.
    These castles were made of stones. TRUE FALSE

    Story 5:
    Most hotels are very noisy places because of the hundreds of people who live there. But two hotels are always very quiet. There is a rule that forbids guests to make loud noise.
    The hotels are very noisy. TRUE FALSE

    Story 6:
    Yesterday I was at the park and I saw something unusual. A dog was being followed by two squirrels all around the park.
    A dog was following two squirrels. TRUE FALSE

    Story 7:
    Most cinemas have several rows with seats. But there are three strange cinemas. They don't have seats, they only have several sofas.
    Cinemas have several rows with seats. TRUE FALSE

    Story 8:
    In ancient Greece most temples didn't have guards to protect them. But two temples were very special. They were very rich and so they had guards to protect them.
    These temples had guards to protect them. TRUE FALSE


    Story 9:
    Last night I saw a film about strange animal stories. There was a case of a sheep which was being protected by a cow while it was injured in the farm.
    A cow was protecting a sheep. TRUE FALSE

    Story 10:
    In ancient Greece most temples didn't have guards to protect them. But two temples were very special. They were very rich and so they had guards to protect them.
    Temples had guards to protect them. TRUE FALSE

    Story 11:
    In our History class we were taught that most castles were made of big pieces of stone. But the teacher said that there were two castles that were different. They were made of wood instead of stones.
    The castles were made of wood. TRUE FALSE

    Story 12:
    Last night I saw a film about strange animal stories. There was a case of a sheep which was being protected by a cow while it was injured in the farm.
    A sheep was protecting a cow. TRUE FALSE

    Story 13:
    Most hotels are very noisy places because of the hundreds of people who live there. But two hotels are always very quiet. There is a rule that forbids guests to make loud noise.
    These hotels are very quiet. TRUE FALSE

    Story 14:
    In ancient Greece most temples didn't have guards to protect them. But two temples were very special. They were very rich and so they had guards to protect them.
    The temples didn't have guards to protect them. TRUE FALSE

    Story 15:
    Yesterday I heard a very funny story that happened in a school. A little boy was being chased by a little girl around the school yard because he had stolen her favorite doll.
    A little boy was chasing a little girl. TRUE FALSE

    Story 16:
    Most cinemas have several rows with seats. But there are three strange cinemas. They don't have seats, they only have several sofas.
    These cinemas have several rows with seats. TRUE FALSE

    Story 17:
    In our History class we were taught that most castles were made of big pieces of stone. But the teacher said that there were two castles that were different. They were made of wood instead of stones.
    Castles were made of stones. TRUE FALSE

    Story 18:
    Yesterday I was at the park and I saw something unusual. A dog was being followed by two squirrels all around the park.
    Two squirrels were following a dog. TRUE FALSE