

The Royal Commission at Yanbu
Sector of Colleges and Institutes
Yanbu University College - Male Campus
Applied Linguistics Department

Language Acquisition

Module 1: First language acquisition, second language acquisition, bilingualism: definitions, differences and similarities

“When we study human language, we are approaching what some might call the human essence, the distinctive qualities of mind that are, so far as we know, unique to [humans].”

(Chomsky, 1968, p. 100)

Language is a sophisticated matter that cannot be studied from one perspective only. To appreciate its complex nature, we need to discuss the contributions of psychology and psycholinguistics, sociology and sociolinguistics, neuroscience, discourse analysis, and education. But first, let us clarify and define our terminology as we dig deep into the complexity of language acquisition.

One of the first terms is ‘Native Language’ (NL). It refers to “the first language that a child learns. It is also known as the primary language, the mother tongue, or the L1 (first language)” (Gass & Selinker, 2008).

Target Language (TL): refers to the language being learned.

First Language Acquisition (Child Language Acquisition): The field that investigates cases of monolingual language acquisition is known by the generic name of child language acquisition or first language acquisition. A robust empirical research base tells us that, for children who grow up monolingually, the bulk of language is acquired between 18 months and three to four years of age. Child language acquisition happens in a predictable pattern, broadly speaking. First, between the womb and the first few months of life, infants attune themselves to the prosodic and phonological makeup of the language to which they are exposed, and they also learn the dynamics of turn taking. During their first year of life they learn to handle one-word utterances. During the second year, two-word utterances and exponential vocabulary growth occur. The third year of life is characterized by syntactic and morphological deployment. Some more pragmatically or syntactically subtle phenomena are learned by five or six years of age. After that point, many more aspects of mature language use are tackled when children are taught how to read and write in school. And as children grow older and their life circumstances diversify, different adolescents and adults will embark on very different kinds of literacy practice and use language for widely differing needs, to the point that neat landmarks of acquisition cannot be demarcated any more. Instead, variability and choice are the most interesting and challenging linguistic phenomena to be explained at those later ages. But the process of acquiring language is essentially completed by all healthy children by age four, in terms of most abstract syntax, and by age five or six for most other ‘basics’ of language.


Second language acquisition (SLA): refers to the process of learning another language after the native language has been learned. Sometimes the term refers to the learning of a third or fourth language. The important aspect is that SLA refers to the learning of a nonnative language after the learning of the native language. The second language is commonly referred to as the L2. As with the phrase “second language,” L2 can refer to any language learned after learning the L1, regardless of whether it is the second, third, fourth, or fifth language. By this term, we mean the acquisition of a second language both in a classroom situation and in more “natural” exposure situations.

Foreign Language Learning: Foreign language learning is generally differentiated from second language acquisition in that the former refers to the learning of a nonnative language in the environment of one’s native language (e.g., French speakers learning English in France or Spanish speakers learning French in Spain, Argentina, or Mexico). This is most commonly done within the context of the classroom.

Second language acquisition, on the other hand, generally refers to the learning of a nonnative language in the environment in which that language is spoken (e.g., German speakers learning Japanese in Japan or Punjabi speakers learning English in the United Kingdom). This may or may not take place in a classroom setting. The important point is that learning in a second language environment takes place with considerable access to speakers of the language being learned, whereas learning in a foreign language environment usually does not.

What about Bilingualism?!

Edwards (2006) starts off his article on the foundations of bilingualism by saying “Everyone is bilingual. That is, there is no one in the world (no adult, anyway) who does not know at least a few words in languages other than the maternal variety. If, as an English speaker, you can say c’est la vie or gracias or guten Tag or tovarisch—or even if you only understand them—you clearly have some command of a foreign tongue . . . The question, of course, is one of degree . . .” (p. 7). He goes on to say, “it is easy to find definitions of bilingualism that reflect widely divergent responses to the question of degree” (p. 8). Bhatia (2006) states this in an interesting way when he says “the process of second language acquisition—of becoming a bilingual” (p. 5). In other words, the end result of second language acquisition is a bilingual speaker. Given that bilingualism is seen as the end result and given that we know that native-like competence in a second language is rare, there is some difficulty in discussing bilingualism in this way. Thus, Bhatia and Edwards are referring to two different phenomena. Edwards is saying that one is bilingual at any point in the SL learning process, whereas Bhatia is referring only to the end point and does not deal with whether or not that end point has to be “native” or not.

Valdés (2001a) also discusses the issue of degree when she says “the term bilingual implies not only the ability to use two languages to some degree in everyday life, but also the skilled superior use of both languages at the level of the educated native speaker” (p. 40). She acknowledges that this is a narrow definition, for it considers the bilingual as someone who can “do everything perfectly in two languages and who can pass undetected among monolingual speakers of each of these two languages” (p. 40). This she refers to as the “mythical bilingual.” She argues that there are, in fact, different types of bilinguals and that it is, therefore, more appropriate to think of bilingualism as a continuum with different amounts of knowledge of the L1 and L2 being represented. In this view, the term bilingualism can refer to the process of learning as well as the end result, the product of learning.

Finally, Deuchar and Quay (2000) define bilingual acquisition as “the acquisition of two languages in childhood” (p. 1), although they point to the difficulties involved in this definition given the many situations that can be in place. They point to De Houwer (1995), who talks about bilingual first language acquisition, referring to situations when there is regular exposure to two languages within the first month of birth, and bilingual second language acquisition, referring to situations where exposure begins later than one month after birth but before age two. Wei (2000, pp. 6–7) presents a useful table of various definitions/types of bilinguals.

SLA vs. Bilingualism:

There are some key differences between the two fields. SLA often favours the study of late-starting acquirers, whereas bilingualism favours the study of people who had a very early start with their languages. Additionally, one can say that bilingualism researchers tend to focus on the products of bilingualism as deployed in already mature bilingual capabilities of children or adults, whereas SLA researchers tend to focus on the pathways towards becoming competent in more languages than one. This in turn means that in SLA the emphasis often is on the incipient stages rather than on ultimate, mature competence. A third difference is that bilingual research typically maintains a focus on all the languages of an individual, whereas SLA traditionally orients strongly towards the second language, to the point that the first language may be abstracted out of the research picture. In this sense, SLA may be construed as the pure opposite of monolingual (first) child language acquisition. Indeed, in both fields monolingual competence is often taken as the default benchmark of language development.

Some terms used to describe cases of bilingualism from the literature:
- balanced bilingual: someone whose mastery of two languages is roughly equivalent
- coordinate bilingual: someone whose two languages are learned in distinctively separate contexts
- dominant bilingual: someone who has greater proficiency in one of his or her languages and uses it significantly more than the other language(s)
- early bilingual: someone who has acquired two languages early in childhood
- late bilingual: someone who has become a bilingual later than childhood
- maximal bilingual: someone with near-native control of two or more languages
- minimal bilingual: someone with only a few words and phrases in a second language
- receptive bilingual: someone who understands a second language, in either its spoken or written form, or both, but does not necessarily speak or write it
- simultaneous bilingual: someone whose two languages are present from the onset of speech
- consecutive (successive) bilingual: someone whose second language is added at some stage after the first has begun to develop


Effects of Bilingualism on Cognitive Development: Bilingual and monolingual speakers may develop different patterns of cognitive skills due to the different language environments they experience.

- Control of attention: a large body of work shows that bilingual speakers activate both language systems even when they are only conversing in one language (Brysbaert, 1998; Francis, 1999; Gollan & Kroll, 2001; Smith, 1997). This requires a good deal of attentional control. During linguistic tasks, bilinguals must constantly inhibit one language and activate the other but must be capable of switching quickly from one language to another if required. These sorts of attention switching skills are considered to be under the control of the central executive mechanism in the brain, which is said to control and regulate a large number of cognitive processes such as planning, memory and attention. There is now robust evidence that bilinguals outperform monolinguals on these tasks, much of it provided by Bialystok and her colleagues. For example, Martin-Rhee and Bialystok (2008) tested bilingual and monolingual four- and five-year-old children on the Simon task (Simon, 1969). In this task, a red square and a blue square are presented on a computer screen and the participants have to press a red button in response to the red square and a blue button in response to the blue square. Sneakily, for half of the trials, the squares occur on the opposite side of the screen to the corresponding keys. So for example, the red square might appear on the left side of the screen, requiring the child to press the red button on the right side of the computer keyboard. People’s instinctive reaction in this task is to press the key on the same side as the stimulus, so inhibiting this reaction requires a good level of cognitive control and slows down their reaction times. Martin-Rhee and Bialystok found that bilinguals responded faster than monolinguals because they were quicker at resolving the conflict between the two possible responses. They concluded that bilinguals are better at selectively attending to conflicting cues because “they must constantly control attention between two active and competing language systems” (Martin-Rhee & Bialystok, 2008, p. 91).

- Metalinguistic Awareness: refers to the ability to reflect on and think about the nature of language and its functions. For example, work on phonological awareness (awareness of the sound system of a language) has produced contradictory results. Bruck and Genesee (1995) reported that bilingual children showed better performance on an onset-rime segmentation task at age five years (separating words into the onset and rime; swift into sw and ift) although the advantage disappeared a year later. However, monolingual five-year-olds were better than bilinguals on a phoneme-counting task (e.g. identifying that the word run has three phonemes). Similarly, Bialystok, Majumder & Martin (2003) found that Spanish–English, but not Chinese–English bilinguals, were better at a phoneme segmentation task (e.g. segmenting run into /r/, /ʌ/ and /n/). However, they also reported no difference between groups on a phoneme substitution task (e.g. substitute /s/ for /k/ to make sat out of cat). These mixed results suggest that the pattern of performance may not be straightforward. It is probably the case that other factors are equally influential in these tasks; factors such as the child’s age, her ability in her two languages, the task she is performing and even perhaps the nature of the two languages she is learning.

- Language Proficiency and Fluency: there seems to be evidence that bilinguals are disadvantaged compared to monolinguals in some tasks. In particular, studies have shown that bilinguals may have more difficulty accessing words from memory. For example, bilinguals tend to be slower at rapid picture naming tasks and tend to experience more ‘tip of the tongue’ phenomena, which happen when a speaker just cannot bring the right word to mind (Gollan & Kroll, 2001; Gollan & Silverberg, 2001).
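
The scoring logic of the Simon task described under control of attention can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the key layout (red button on the right, blue on the left), the function names `is_congruent` and `simon_effect`, and the reaction times are all invented for the example and are not taken from Martin-Rhee and Bialystok (2008). The sketch computes the Simon effect as the difference between mean reaction times on incongruent and congruent trials.

```python
from statistics import mean

# Assumed key layout (hypothetical): red button on the right, blue on the left.
KEY_SIDE = {"red": "right", "blue": "left"}


def is_congruent(color, stimulus_side):
    """A trial is congruent when the square appears on the same side as its key."""
    return KEY_SIDE[color] == stimulus_side


def simon_effect(trials):
    """Mean RT on incongruent trials minus mean RT on congruent trials (ms).

    Each trial is a (color, stimulus_side, reaction_time_ms) tuple.
    """
    congruent = [rt for color, side, rt in trials if is_congruent(color, side)]
    incongruent = [rt for color, side, rt in trials if not is_congruent(color, side)]
    return mean(incongruent) - mean(congruent)


# Invented reaction times, for illustration only (not data from any study).
example_trials = [
    ("red", "right", 500),   # congruent: square on the same side as its key
    ("red", "left", 560),    # incongruent: square on the opposite side
    ("blue", "left", 520),   # congruent
    ("blue", "right", 600),  # incongruent
]

print(simon_effect(example_trials))
```

On this measure, a smaller difference would correspond to quicker resolution of the conflict between the two possible responses.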

Bilingualism and code-switching: a common phenomenon:

A common phenomenon among bilingual speakers is code-switching, which essentially refers to the use of more than one language in the course of a conversation. Sometimes this might happen because of the lack of a concept in one language and its presence in the other; sometimes it might be for humor; and sometimes it might happen simply because of the social context. For example, Grosjean (2001, p. 3) presents the following diagram to illustrate the issue of language mode, which is “the state of activation of the bilingual’s languages and language processing mechanisms at a given point in time” (p. 2). The native language (here called the base language) is always totally activated; it is the language that controls linguistic activities. The guest language, on the other hand, can be in low to high activation depending on the context. Only in bilingual language mode (the right side of the diagram) is there almost equal activation, and it is in these contexts when code-switching occurs.

Bilingualism, or at least some form of knowledge of more than one language, is so common throughout the world that Cook has proposed that the “normal” propensity is for humans to know more than one language rather than taking monolingualism as the default position. He refers to this as multicompetence, which he defines as the “knowledge of two or more languages in one mind” (Cook, 2003, p. 2; cf. Cook, 1991, 1992). If multicompetence is the “norm,” then there needs to be a re-evaluation of what it means to be a native speaker of a language. Cook (2005) argued that there are effects of multilingualism on how individuals process their native language, even individuals with a minimal knowledge of a second language. Cook further argues that the monolingual orientation of second language acquisition belies the reality of the context of language learning in much of the world where knowledge of more than one language is the norm.

Module 2: First/Child Language Acquisition

A: Language Acquisition Research

i. L1 Research: Behaviorism vs. Language Acquisition Device (Nativism)

Until the 1960s, the research agenda of language acquisition studies, just like that of psychology and linguistics in general, was strongly determined by behaviorist learning theories. An explanation referring to mental capacities of the learner did not seem to make much sense in that context; it would, indeed, have been regarded as a non-scientific approach to the problem. Only after the constraints and restrictions of behaviorist psychology had been shaken off could the language sciences begin to understand language learning as a mental activity happening in the cognitive system of the individual. Chomsky’s (1959) famous and influential review of Skinner’s (1957) book Verbal Behavior is a milestone on this road to the cognitive turn. What this term is meant to convey is that the study of human cognition is now identified as the major task of linguistics, in close cooperation with other sciences, especially cognitive psychology and philosophy (see Chomsky 1968). With respect to the language faculty, the issues put on the research agenda by this change of perspective include, among other things, the problem of how to characterize the knowledge system represented in the mind of a person who speaks and understands a particular language, as well as to explain how this knowledge is used and, most importantly in the present context, how this linguistic knowledge and the ability to use it are acquired. The Language Acquisition Device, then, represents the initial state of the language faculty, that is, prior to any exposure to the language to be acquired (see Chomsky 1988). This new approach had an enormous impact on L1 research, and as early as the early 1960s the first of an ever-increasing number of publications applying these ideas to the study of first language acquisition appeared.

ii. L2 Research:

L2 research, on the other hand, took somewhat longer to liberate itself from the dominating influence of behaviorism. This is partly due, perhaps, to the fact that for a long time it had exclusively been occupied, and still continues to be primarily concerned, with foreign language learning in classroom settings, rather than with naturalistic L2 acquisition. The idea that learning crucially implies changing previously acquired behavior seems to have been deeply rooted in language teaching. It is therefore not surprising that interference from L1 was, and in part still is, regarded as the major factor determining the shape of L2 speech. The research paradigm which elaborated this idea in considerable detail is Contrastive Analysis (CA).

Contrastive Analysis continued a line of thought which had been expressed quite clearly as early as 1945 by Charles C. Fries in the following frequently quoted statement:


The most efficient materials are those that are based upon a scientific description of the language to be learned, carefully compared with a parallel description of the native language of the learner. (Fries 1945: 9)

The next step was taken by Robert Lado, a former student of Fries, in assuming that ‘individuals tend to transfer forms and meanings, and the distribution of forms and meanings of their native language and culture to the foreign language and culture’ (Lado 1957: 2). This assumption, which Lado as well as many others at the time regarded as an uncontroversial generalization based on empirical observation, was turned into a prediction, perhaps the major theoretical claim of CA, when Lado (1957) and Weinreich (1953) before him argued that ‘those elements that are similar to his [i.e. the learner’s, JMM] native language will be simple for him, and those elements that are different will be difficult’ (Lado 1957: 2).

We can sum up by saying that second language research suffered longer than first language research from its behaviorist heritage. By focusing on the comparison of linguistic structures justified exclusively in grammatical terms rather than with respect to their psycholinguistic plausibility, and, moreover, by defining learning primarily in terms of habit formation and changing of habits, questions relating to the possibility of a common underlying language making capacity for the various types of language acquisition could not even be formulated. As a result, the role of the native language in second language acquisition was seen exclusively as a possible source of transfer.

iii. L2 Research: A Cognitive Turn

An explicitly cognitive orientation of second language acquisition research was initiated in the late 1960s. The change is best illustrated by the seminal paper by Pit Corder (1967). He refers to the child’s ‘innate predisposition to acquire language’ and the ‘internal mechanism’ which makes the acquisition of grammar possible, and then raises the question of whether the child’s language making capacity remains available to second language learners. Although he is careful about the conclusions to be drawn from these assumptions, he leaves no doubt about the fact that he favors a positive answer, postulating ‘the same mechanism’ for both L1 and L2 acquisition, and proposes (p. 164)

as a working hypothesis that some at least of the strategies adopted by the learner of a second language are substantially the same as those by which a first language is acquired. Such a proposal does not imply that the course or sequence of learning is the same in both cases.

What exactly Corder means by ‘strategies’ is not entirely clear, nor does he elaborate on the last point, that is, what might cause the emergence of different learning sequences in spite of the claim that the underlying mechanisms are the same. He does, however, list what he sees as differences between the two acquisition types, namely that (1) children acquiring their L1, as opposed to L2 learners, are inevitably successful, (2) L1 development is part of the child’s maturational process, (3) at the onset of second language acquisition, another language is already present, and (4) the motivation for language acquisition is quite different in the two cases. Corder suspects that this last factor, motivation, is the principal one distinguishing first and second language acquisition.

In order to gain insights into the nature of the underlying mechanism and of the strategies used in second language acquisition, Corder suggests studying the errors found in L2 speech. He distinguishes between random mistakes and systematic errors. The latter, he claims (p. 166), ‘reveal his [the learner’s, JMM] underlying knowledge of the language to date, or, as we may call it his transitional competence’. If, for example, learners use the form thinked, this suggests that they have acquired knowledge about tense marking in English, even if this particular form is an error, deviating from the target norm. The truly stimulating ideas in Corder (1967), with respect to the present discussion, are that he explicitly suggested the same underlying mechanism for L1 and L2 acquisition, introduced the notion of ‘transitional competence’, and demanded that the focus of L2 research should be on the learner, rather than on learners’ productions. This can only be achieved if acquisition studies strive for psycholinguistically plausible grammatical analyses of learner utterances. In other words, L2 learners are assumed to acquire systematic knowledge about the L2; a ‘third system in addition to the NL [native language, JMM] of learners and the TL [target language, JMM] to be learned’ is introduced, to use Selinker’s (1992: 18) words. Note, however, that assuming a kind of transitional competence does not oblige us to subscribe to the idea of one and the same mechanism underlying L1 and L2 acquisition.

Suggestions similar to the ‘transitional competence’ were indeed made by a number of authors, proposing ‘approximative systems’ (Nemser 1971), ‘idiosyncratic dialects’ (Corder 1971) or ‘interlanguages’ (Selinker 1972). These terms are not synonymous, but they coincide in so far as they postulate a structured transitional knowledge base in the L2 learner. It contains elements of the target grammar, possibly also elements of the L1 grammar (‘interlingual errors’, Richards 1971), and, most importantly, elements different from both source and target systems, ‘developmental errors’ (‘intralingual errors’, Richards 1971) which prove that the learner is actively and creatively participating in the acquisitional process. The term most generally adopted is Selinker’s (1972) ‘interlanguage’ (IL).

B: First Language Acquisition:

The gift for language which manifests itself in the effortless acquisition of language by toddlers can safely be qualified as a species-specific endowment of humans. In fact, it enables children to develop a full grammatical competence of the languages they are exposed to, independently of individual properties like intelligence, personality, strength of memory and so on, or of particularities of the learning environment, for example social settings, whether the child is an only child or has siblings, birth order among siblings, whether the child has one or more primary caregivers, communicative styles of parents or caregivers, and so forth. The theoretical framework adopted here as the theory about the human language making capacity is that of Universal Grammar (UG), as it has been developed by Chomsky (1981a, 1986, 1995, 2000a, among others) and his colleagues. A central aspect of the theory of UG is that it views the human language faculty as comprising a priori knowledge about the structure of language. Importantly, knowledge of language is understood as being internal to the human mind/brain, and the object of linguistic theory is therefore the mental grammar or competence of the individual which Chomsky (1986) refers to as I-language, an internal entity of the individual, as opposed to E-language, ‘E’ suggesting ‘external’, that is, the overt products in language use.

The genetically transmitted or innate implicit linguistic knowledge which UG attempts to capture is formulated in this theory in terms of abstract principles determining the set of possible human languages. They are universal in the sense that it is predicted that no grammar of a natural language, that is, no I-language, will violate these principles. Note that this allows for the possibility that principles may simply not be manifest in a given language. If, for example, UG contains principles explaining properties of prepositions, these will not be exhibited in languages without prepositions. They are nevertheless universal in the intended meaning of the term, namely wherever prepositions occur, they are constrained by the relevant UG principles. To sum up this point, UG is designed to capture all and only those properties which human languages have in common and thus to explain the nature of this species-specific faculty, but not every property present in a grammar must conform to principles of UG. In fact, many grammatical phenomena are language-specific, representing particularities of individual languages, and UG has nothing to say about them. Take, for example, the fact that French interrogative words qui/que ‘who/what’ can both refer to objects, but only qui can question subjects. Although this is undoubtedly a property of French grammar which needs to be represented in the mental grammars of speakers of this language, it is rather unlikely that it follows from a constraint imposed by UG. In other words, the grammar of a particular language is characterized by universal as well as language-specific features. Both reflect properties of the language faculty, but the latter must be acquired and will be contingent on properties of the learner and the learning environment. It is thus conceivable that a learner might not cognize that *que vient? ‘what is coming?’ is ungrammatical.

- The role of UG in explaining the human language-making capacity:

The basic idea is that since UG is conceived of as representing the initial state of the language faculty, it can also be understood as a crucial component of the LAD, the Language Acquisition Device. The claim that UG indeed represents the initial state of the child’s linguistic development has, in fact, long been a fundamental assumption of generative theorizing and continues to be a defining property of UG in that Universal Grammar is understood as a theory about what the child brings to the task of language acquisition – or ‘growth’, as Chomsky prefers to say, comparing language development to the growth of organs – before any experience with the target language. To quote only one instance where he explains this idea, Chomsky (2000a: 4) suggests that we

think of the initial state as a ‘language acquisition device’ that takes experience as ‘input’ and gives the language as an ‘output’ – an ‘output’ that is internally represented in the mind/brain.

- Milestones of First Language Acquisition:

The First Sounds:

Speech perception – which includes, for instance, the ability to segment the speech stream into meaningful units, to recognize one’s own name in the speech stream, or to distinguish between similar sounding vowels (e.g. /ee/ and /oo/) – is a critical skill that infants develop early in life.

These early language skills also involve visual information; for instance, infants as young as two months have been shown to be able to match vowel sounds they hear with the appropriate lip, mouth, and face movements. These early speech perception skills related to the sound structure of language may help infants to bootstrap into more complex language competencies; bootstrapping refers to the possibility that skills in one area of language might help the child to develop competencies in other language areas. For instance, infants’ ability to recognize their own names in the speech stream (which appears around the fifth month) may provide them with a means to recognize novel, adjacent words.

Early research in speech perception demonstrated that during their first few months of life, infants are able to discriminate between similar sounds (for example, between /b/ and /p/) both in their native language(s) and in other languages. Over time, however, infants become more attuned to their native language(s) and less able to make sound distinctions in other languages.

Infants are born with the capacity to learn any language in the world, but the capacity to hear like a native fades very early on.

The first sound made by all infants is crying. All infants can do this immediately from birth; although crying may signal distress, discomfort, boredom, or other emotions in the first month of life, it is not an intentional attempt to communicate. From about the second to fifth month, infants engage in cooing. Coos are generally vowel-like sounds which are often interpreted as signs of pleasure and playfulness.

All infants begin to babble anywhere between four and six months and generally continue to do so until they reach around one year of age. Babbling is characterized by vowel or consonant–vowel sounds such as ouw-ouw or ma-ma. At this age, infants’ tongues tend to be relatively large compared to the size of their mouths, and as a result, these sounds will often be palatals, such as [y] or [ñ]. Labial sounds such as [b] and [m] are also common. Babbling begins to conform to the sound patterns of the adults’ language between six and ten months of age, with adult native speakers showing the ability to discriminate the babbles of Chinese, Arabic, English, or French infants.

Although there is no meaning (such as a demand for food) associated with this babbling for hearing or for deaf infants, it can be a source of interactive play. In some cultures, infants are encouraged to continue to babble by caregivers’ smiles or touches, or by their own babbling in return. Infants will often stop babbling in order to listen to their interlocutor (sometimes engaging in give-and-take exchanges known as proto-conversations), and around the fifth month, some infants are able to immediately imitate simple sound sequences presented to them.

The First Words:

Sometime around their first birthdays, children begin to assign specific meanings to the sounds they produce. These first words mark the beginning of what is known as the holophrastic stage. Holophrastic means ‘characterized by one-word sentences’; infants tend to use single words to communicate a variety of complex functions.

Children’s words at this stage tend to name concrete objects which are grounded in and central to everyday experiences and interactions (such as light, tree, water), rather than abstract concepts (peace, happiness). These first words tend to be content words (bear or bed) rather than function words (the, and, on). For children learning English, most first words are nouns. This seems to be related to the fact that sentences in English typically end with nouns, where they are salient, or more noticeable, to learners. This is not the case for all languages, however. For instance, Korean-learning infants’ first words are often verbs; in the Korean language, verbs are sentence-final and sentences may consist of only a verb.

While working to master the vocabulary around them, children often engage in both semantic overextension and underextension. For instance, a child may overextend the meaning of the word water to include not just drinking water, but also juice, milk, and soda. Underextension, which seems to be less common, refers to the reverse phenomenon: a child, for example, might use baby only to refer to an infant sibling and not to the other babies he/she encounters.

Around age two, children enter the two-word stage, characterized by use of phrases which are not more than two words. For English-learning infants, this typically means combining a subject and verb (e.g. baby cry, mama sleep) or a verb and modifier (e.g. eat now, go out). The ordering of these two-word phrases is not fixed, however, and there tends to be limited systematic use of grammatical morphology (for example, the possessive is formed as Miranda bed rather than Miranda’s bed).

As in many other stages of their linguistic development, children’s capacity for comprehending words outpaces their production ability. For instance, around the age of one, children can typically understand about seventy different words, but only productively use about six. There is about a four- to six-month delay between when children can comprehend a given number of words and when they can produce that many words themselves. Sometime around the end of the second year, children’s productive vocabulary begins to develop rapidly; this is sometimes known as the vocabulary spurt. During this period, children begin to add about two hundred words a month to their vocabularies!

At approximately two and a half years of age, children begin to produce phrases of three or more words, entering the multi-word stage (e.g. Graham go out, Daddy cook dinner, Baby food all gone). Children’s language at this stage has been described as telegraphic speech because, like the economical language used in telegrams, it is seemingly direct and makes only limited use of morphological and syntactic markers.

First Sentences: Morphological and Syntactic Development

Many diary, observational, and experimental studies have documented and explored how children become competent users of their language’s system of morphology and syntax. From this research, we know that for all languages, both signed and spoken, this process seems to involve the formation of internal “rules”; in other words, children’s increasingly regular use of grammatical forms (even non-adult-like or “incorrect” usages such as broked or foots) may reflect children’s developing grammatical rule systems.

We also know that children seem to begin to acquire this grammatical competence at a very young age and, as in vocabulary development, comprehension skills outpace production. For instance, children who are only seventeen months of age, and typically still producing only one- or two-word utterances, tend to look longer at video clips that correctly correspond to the grammar of the oral commentary. For example, children who hear “The bear sat on the bird” and are shown two pictures (one of a bear sitting on a bird and another of a bird sitting on a bear) will look longer at the picture where the bear is sitting on the bird. This research demonstrates that even at very young ages children are tuned into the semantic significance of their language’s grammatical structures.

Research has also demonstrated that morphological and syntactic development is predictable. In other words, all children follow similar patterns and pass through the same developmental sequences as their competence develops. Although there is some variation depending on the language being acquired, many patterns and processes are constant across different language and cultural groups. The development of inflectional and derivational morphology in children’s productive language becomes apparent once the child enters the multiple-word stage and continues through age five. The development of inflectional morphology was the focus of early and intensive investigation. Brown’s investigation of Adam, Sarah, and Eve made important advances in this area. (See Box 6.2.)

Through analysis of Adam, Sarah, and Eve’s spontaneous speech, Brown mapped out when different grammatical morphemes consistently appeared in their speech and how this corresponded to other aspects of their language, in particular to mean length of utterance (MLU). MLU is a widely used measurement of the complexity of children’s language and is calculated from the average number of morphemes (not words) per utterance. Brown illustrated that: (1) the order of acquisition was similar across the three unacquainted children (with present progressive, plural, and past irregular verb forms appearing first); (2) the age at which children acquired competence in using these forms varied widely (compare, for instance, Eve and Adam at age two years and three months in Box 6.2); and (3) the MLU stage served as a good index of the level of development for grammatical morphology (and indeed was much more predictive of grammatical development than age). More recent research has stressed the importance of vocabulary as a predictor of grammatical development.
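The MLU calculation described above can be sketched in a few lines of Python. This is only an illustration: the sample utterances are invented for the example, the morpheme segmentation is assumed to have been done by hand, and the function name `mean_length_of_utterance` is hypothetical. The detailed counting conventions used in actual studies are not modeled here.

```python
def mean_length_of_utterance(utterances):
    """Return the average number of morphemes per utterance.

    Each utterance is a list of words, and each word is a list of
    morphemes that has already been segmented by hand (an assumption
    of this sketch).
    """
    if not utterances:
        raise ValueError("need at least one utterance")
    total_morphemes = sum(len(word) for utt in utterances for word in utt)
    return total_morphemes / len(utterances)


# Invented sample: four utterances, segmented into morphemes.
sample = [
    [["daddy"], ["cook"], ["dinner"]],   # 3 words, 3 morphemes
    [["baby"], ["cry", "ing"]],          # 2 words, 3 morphemes
    [["two"], ["dog", "s"]],             # 2 words, 3 morphemes
    [["go"], ["out"]],                   # 2 words, 2 morphemes
]

mlu = mean_length_of_utterance(sample)
# For comparison: average number of words (not morphemes) per utterance.
mean_words = sum(len(utt) for utt in sample) / len(sample)

print(f"MLU in morphemes: {mlu}")
print(f"Mean words per utterance: {mean_words}")
```

Because crying and dogs each count as two morphemes, the morpheme-based measure comes out higher than a simple word count, which is why MLU is sensitive to the emergence of grammatical morphemes.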

Another early study which sheds light on when children acquire inflectional morphology was Jean Berko’s famous “wug” study. Rather than recording and analyzing children’s spontaneous speech as Brown did, Berko asked young children of different ages to form the plural of unknown, nonsense creatures, such as “wugs.” (See Figure 6.2.) The experimenter pointed to an item and said, “This is a wug.” She then showed a picture with two of the same animals and said, “Now here is another one. There are two of them. There are two _____?” Berko found that even preschool children were able to form the plural correctly, demonstrating that they had learned the rule for forming plurals and could apply this rule correctly in novel contexts, and were not just repeating forms which they had previously heard.
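The logic of the wug test can be illustrated by contrasting two hypothetical learners: one that can only repeat plurals it has already heard, and one that applies a productive rule. The sketch below is an assumption-laden simplification. It works on spelling rather than on sounds, and the function names and the list of endings are invented for the example.

```python
# Spelling-based stand-in for the sound-based English plural rule
# (an assumption of this sketch).
SIBILANT_ENDINGS = ("s", "x", "z", "ch", "sh")


def rote_plural(noun, heard_plurals):
    """A learner that can only repeat plural forms it has heard before."""
    return heard_plurals.get(noun)


def rule_plural(noun):
    """A learner that applies a productive plural rule to any noun."""
    if noun.endswith(SIBILANT_ENDINGS):
        return noun + "es"
    return noun + "s"


heard = {"dog": "dogs", "cat": "cats"}

print(rote_plural("wug", heard))  # the rote learner has no answer for a novel noun
print(rule_plural("wug"))         # the rule learner produces a plural anyway
```

Only the rule-based learner can answer for a word it has never encountered, which is the behavior Berko observed in preschool children.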

In developing these rules, children pass through predictable stages. For instance, children overgeneralize in the early phases of acquisition, meaning that they apply the regular rules of grammar to irregular nouns and verbs. Overgeneralization leads to forms which we sometimes hear in the speech of young children such as goed, eated, foots, and fishes. This process is often described as consisting of three phases:

• Phase 1: The child uses the correct past tense of go, for instance, but does not relate this past-tense went to present-tense go. Rather, went is treated as a separate lexical item.

• Phase 2: The child constructs a rule for forming the past tense and begins to overgeneralize this rule to irregular forms such as go (resulting in forms such as goed).

• Phase 3: The child learns that there are (many) exceptions to this rule and acquires the ability to apply this rule selectively.

Note that from the observer’s or parents’ perspectives, this development is “U-shaped” – that is, children can appear to be decreasing rather than increasing in their accuracy of past-tense use as they enter phase 2. However, this apparent “back-sliding” is an important sign of linguistic development.
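The three phases and the resulting U-shaped pattern can be made concrete with a toy model. This is a sketch under stated assumptions, not a model of any real child: the function names, the tiny list of irregular verbs, and the spelling-based -ed rule are all invented for illustration.

```python
# Toy list of irregular past-tense forms (illustrative only).
IRREGULAR_PAST = {"go": "went", "eat": "ate"}


def regular_past(verb):
    """Spelling-based regular rule: add -ed (or -d after a final e)."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def past_tense(verb, phase, memorized=None):
    """Return the past-tense form a child might produce in a given phase."""
    memorized = memorized or {}
    if phase == 1:
        # Phase 1: only rote-learned forms, stored as separate lexical items.
        return memorized.get(verb)
    if phase == 2:
        # Phase 2: the regular rule is applied to every verb.
        return regular_past(verb)
    if phase == 3:
        # Phase 3: exceptions are checked before the rule applies.
        return IRREGULAR_PAST.get(verb, regular_past(verb))
    raise ValueError("phase must be 1, 2, or 3")


forms_of_go = [past_tense("go", p, memorized={"go": "went"}) for p in (1, 2, 3)]
print(forms_of_go)  # correct, then overgeneralized, then correct again
```

Tracking accuracy on go across the three phases gives correct, incorrect, correct: the dip in the middle is the "U" that observers notice.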

We see similar patterns, known as “developmental sequences,” in other areas of grammar, such as the formation of English negatives and interrogatives. As outlined in Box 6.3, children move through identifiable stages, although these stages are more continuous and overlapping than discrete. Note that from the parents’ perspective, children’s development is also not always straightforward. For instance, a child will likely produce inverted yes/no questions (Did Karalyn eat cake?), while still using normal declarative word order for WH-questions (How Izzy go out?).

These data highlight the extent to which negation and interrogative development are interrelated and also dependent upon development of the necessary vocabulary. For instance, both questions and negatives are dependent upon acquisition of the verb auxiliary system (including, for instance, do, does, did, is, am, have, has) and modals (for example, can, could, may, might). These data also demonstrate the rule-governed nature of children’s language production: the “mistakes” in the examples here are generally not the result of children repeating the language that surrounds them; rather, they reflect the developing rule system of children’s language.

Specialization of the brain for language:

One of the first people to find evidence of such specialization was Paul Broca, a French surgeon. In 1861, he published a report on a patient who had had great difficulty producing speech. At autopsy, Broca found that he had damage in the lower edge of the left frontal lobe, in an area now called “Broca’s area.” Four years later, Broca published a further report where he showed that damage to areas in the left hemisphere produced aphasia, while damage to the corresponding areas in the right hemisphere did not (Broca 1861, 1865). In 1874, a German doctor, Carl Wernicke, followed up this research in a monograph where he described patients with deficits in language comprehension, deficits associated with lesions or injury in the left hemisphere but outside Broca’s area.

When people are listening to speech, the auditory signals received through the ears travel first to an area on each side of the brain called Heschl’s area, part of the auditory cortex. Information from the right ear goes directly to the left hemisphere of the brain, and information from the left ear goes to the right hemisphere. Then there appears to be a division of labor, with the words of a message going mainly to the left hemisphere, and properties such as intonation, rate of speech, pitch, rhythm, and stress going mainly to the right hemisphere. While these two kinds of information may be stored separately, they are connected by fibers that link the two hemispheres, the corpus callosum.

Evidence of specialization in the brain for language has traditionally been drawn from three main sources: hemispheric dominance, effects of trauma or injury, and the effects on language of hemispherectomy.

Hemispheric dominance. Almost everyone shows left hemisphere dominance. When people listen to words played simultaneously into each ear through headphones, they tend to hear words played to the right ear and not those played to the left ear. Words into the right ear go directly to the left or contralateral hemisphere.

Hemispheric damage. When someone has a stroke or receives a brain injury, only injuries to the left hemisphere affect language. In particular, only left hemisphere injuries in or near Wernicke’s and Broca’s areas are consistently associated with language loss or disturbance.

Hemispherectomies. When there has been extensive trauma to the brain, it is sometimes necessary to remove one or other hemisphere entirely. Whether the removal is partial or complete, the two hemispheres again show that they are not symmetrical. Removal of the left hemisphere typically results in loss of language, whereas removal of the right hemisphere leaves language unaffected (Dennis & Whitaker 1975).

Hearing vs. Deaf:

Researchers have used fMRI to examine sentence-processing carried out by native English speakers with normal hearing and by native signers of American Sign Language. The signers were either hearing (and hence bilingual in ASL and English) or congenitally deaf. The areas of the brain activated in the two populations were the classical language areas in the left hemisphere.

In another study of the deaf, researchers looked at the ERPs elicited by anomalies in either signed or spoken sentences. They compared four groups of adults: (a) deaf adults, born of deaf parents, who had learnt ASL at a young age; (b) hearing adults, also born of deaf parents, who had likewise learnt ASL young as a first language; (c) hearing adults, born of hearing parents, who learnt ASL after age seventeen; and, finally, (d) hearing adults with no experience of ASL at all. The left hemisphere showed extensive activity for all four groups. But the right hemisphere was also involved in language processing for the deaf adult children of deaf parents, and, to a lesser extent, the hearing adult children of deaf parents.

A Sensitive Period for Language Acquisition?

Sensitive periods in development have long intrigued biologists. In many species, there are periods when learning of certain kinds can take place more effectively than later in development. Sometimes the sensitive period is critical in that, once it is past, that learning can no longer occur, as in imprinting in chicks, or the acquisition of the pertinent songs in young songbirds. In 1967, Eric Lenneberg marshaled an impressive array of evidence for a critical period for the learning of language. Lenneberg argued that the reason language couldn’t be recovered after puberty was that lateralization (specialization of the left hemisphere for language) was by then complete. And any areas of the right hemisphere that could deal with language were also assigned. After lateralization of both hemispheres, then, it is no longer possible to allocate language functions to other areas of the brain; and if areas already assigned for language become damaged, there is now nowhere else to “put” language. These arguments for a critical period have been challenged by later work. Several studies have shown that lateralization for language is essentially complete earlier than Lenneberg proposed, around age five (e.g., Witelson & Pallie 1973). This in turn raises questions about whether the notion of a critical period for language is correlated with the completion of lateralization. If children found it easier to learn a language prior to lateralization, age five to six ought to mark a point at which all this changes. But children appear highly successful in learning additional languages up to age twelve and even later. Lateralization may therefore offer a less than compelling explanation of difficulties in later learning.

Children with early brain injury appear to attain normal (or near-normal) language despite damage to areas critical for language in adults. Their language capacities appear to be fairly resilient. But sometimes children may have to have an entire hemisphere removed – for instance, to arrest seizures associated with Sturge-Weber-Dimitri syndrome. Their subsequent language abilities depend on which hemisphere was removed. Language is more affected by loss of the left hemisphere, and visual processing more by loss of the right hemisphere.

Is there an innate language acquisition device?

Many researchers have proposed that people have an innate capacity for language acquisition, a capacity that distinguishes humans from other species. The question is, just what is innate? Are there special areas of the brain dedicated to language? Here, as we have seen, the answer appears to be yes, but the specialization in these areas develops with exposure to language. And lateralization for language, then, only begins during the second year of life. Could there be built-in categories, universal in human languages, such as “noun” and “verb”? Are there any built-in structures? How might built-in categories and structures affect the process of acquisition? Or, instead of this, could humans be endowed with learning mechanisms specialized just for language? These questions have elicited considerable debate. Here we look at some of the proposals that have been made.

In the 1960s, Chomsky proposed that the human capacity for language was innate. The assumptions here are that: (a) natural-language syntax is too complex for children to learn from what they hear around them, because (b) adults offer such a distorted and imperfect source of data; and (c) children learn their first language so fast that they must rely on some innate capacity, specifically for syntactic categories and syntactic structure. As Chomsky put it, “The grammar has to be discovered by the child on the basis of the data available to him, through the use of the innate capacities with which he is endowed” (1972:183). We will focus on the existence of innate categories and structures.

Current positions on the status of innate categories and structures tend to espouse one of two main views: the continuity view and the maturational view (see O’Grady 1997). Both focus on syntax, which is treated as autonomous, independent of phonology and the lexicon. Morphology is sometimes included with syntax, because it too can be viewed as rule-based and because it marks syntactic distinctions such as “subject of” and “predicate,” as well as parts of speech (by distinguishing nouns from verbs from adjectives, for instance). The continuity view assumes that children use the same notions and relations throughout development; they are present from birth. These researchers generally subscribe to an innate Universal Grammar (UG) common to speakers at every stage of development (e.g., Pinker 1984). The strong version of this approach assumes that children just beginning to speak have the same mental representations for linguistic constructions as adults (e.g., Lust 1994). A weaker version assumes that, although children come with all the categories and operations, they don’t make use of them all immediately. They first have to learn how to instantiate such elements as relativizers (that, who, which), complementizers (to, for, whether), or wh- forms for questions (what, where, why). One result is that the focus in most studies has been limited to the learning of grammatical elements.

But languages differ, and researchers have to take that into account. One version of UG that has been invoked contains parameters that allow for variation across languages. Some languages, for example, have complement constructions that are head-initial – the term introducing a complement comes first, followed by the complement, as in English (e.g., They said that he came in at five). Others are head-final and place the head after the complement, as in Japanese. In UG, this difference is captured by a word-order parameter with two values: head-initial and head-final. Upon exposure to a language, children “discover” the value of this parameter and set it accordingly. Languages also differ in whether finite verbs appear with an overt subject in the form of a nominal or pronoun, as in French (e.g., je veux partir ‘I want to-go’), or whether they can omit these, as in Italian (e.g., voglio partire ‘I-want to-go’). This variation is captured by the parameter called subject-drop, which is either permitted or not. Both these parameters are assumed to be present from the start in acquisition, but they can only be set after experience with a language.

Children also need to acquire relevant words with which to display their syntactic knowledge. Until they use nouns and verbs, there is no way to tell what underlying grammatical categories or structures they might know. Take two-year-olds who haven’t yet learnt I or me in self-reference and usually use a verb alone for their own actions (e.g., Throw ball), or three-year-olds who don’t yet use complementizers like whether or that (e.g., Rod said Nico coming). Have these children set the appropriate parameters yet? The values on some parameters are associated directly with specific lexical items, such as him versus himself, which determine the domain for the antecedent of the pronoun, as in Ken washed him (i.e., someone else) versus Ken washed himself. This parameter either sets the domain as the smallest clause containing the pronoun or as the sentence containing the pronoun. In English, both him and himself are associated with the first setting (the smallest clause), but in other languages, the pronoun him may be associated with the first setting and the reflexive himself with the second, as in Japanese.
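The idea that parameters are present from the start but set only after experience can be pictured with a small sketch. Everything here is a hypothetical illustration: the `Parameters` class, the observation keys, and the decision rule (one subjectless finite clause is enough to permit subject-drop) are assumptions made for the example, not claims about how UG or any particular theory implements parameter setting.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Parameters:
    # None means "present but not yet set", as before any experience.
    head_direction: Optional[str] = None   # "head-initial" or "head-final"
    subject_drop: Optional[bool] = None    # True if subjects may be omitted


def set_parameters(observations):
    """Set each parameter from the learner's observations.

    Each observation is a dict with hypothetical keys:
      "head_before_complement": bool
      "finite_verb_without_subject": bool
    """
    params = Parameters()
    for obs in observations:
        if "head_before_complement" in obs and params.head_direction is None:
            params.head_direction = (
                "head-initial" if obs["head_before_complement"] else "head-final"
            )
        if "finite_verb_without_subject" in obs:
            if obs["finite_verb_without_subject"]:
                # One clear subjectless finite clause permits subject-drop.
                params.subject_drop = True
            elif params.subject_drop is None:
                params.subject_drop = False
    return params


# Input like "They said that he came in at five": head first, overt subject.
english_like = set_parameters(
    [{"head_before_complement": True, "finite_verb_without_subject": False}]
)
# Input like "voglio partire": a finite verb with no overt subject.
italian_like = set_parameters([{"finite_verb_without_subject": True}])
# Input in which the head follows its complement.
head_final = set_parameters([{"head_before_complement": False}])

print(english_like)
print(italian_like)
print(head_final)
```

A parameter for which the learner has seen no relevant evidence stays unset (None), which mirrors the point that there is no way to tell what a child knows until the relevant words and constructions appear.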

The continuity view offers a potentially simple and elegant account of acquisition for syntax (Macnamara 1982; Pinker 1984). If UG in its entirety is present from the start, then children have only to set a certain number of parameters and learn any lexical items with syntactic consequences. There is no need to track changes in learning mechanisms since they remain unchanged, as do children’s representations of linguistic categories and structures.

The maturational view of what is innate differs from the continuity view on the role of experience. In the maturational account, children make progress in syntactic acquisition without much regard to experience. Development is driven instead by a biological timetable. As a result, groups with different kinds of experience adhere to the same timetable. For example, Gleitman (1981) argued that children with normal hearing, children who are blind, and children who are deaf (and not exposed to a sign language) follow much the same timing. All, for instance, produce their first one-word forms around age one; two- and three-word expressions by age two; and some simple grammatical sequences by age three. But notice that all three groups require normal intelligence and normal input to arrive at these milestones.

Proponents of UG often prefer maturation over continuity because biological maturation can be relatively independent of experience. As Felix (1988:371) put it, “The mechanism that ‘pushes’ the child through the sequence of developmental stages is therefore the maturational schedule.” What is innate, then, is UG combined with a biologically based schedule for the emergence of each parameter setting. To take one example, Borer and Wexler (1987) proposed that subject-drop as a parameter in UG is just not available until a particular point in acquisition (also Hyams 1986, but see Ingham 1992). Before this point, exposure to any relevant information in child-directed speech can have no effect. This allows researchers to ignore what happens in acquisition prior to the setting of each parameter. There is no need, in the maturational view, to account for early errors, since they are assumed to play no role in the later emergence of the target construction. Borer and Wexler also proposed that the passive construction (as in The cat was chased by the boy) “matures” only at around age five, thus accounting for its relatively late emergence in English. Here, however, the maturational account runs into distinct difficulty. First, Demuth (1989) showed that, for speakers of Sesotho (a Bantu language), the passive is well established in young children before age three, and, second, several researchers have shown that the passive in English is also acquired considerably before age five (e.g., Clark & Carpenter 1989a; Pinker et al. 1987). They have also shown that emergence of the passive depends heavily on the precise verbs used (e.g., Maratsos et al. 1985).

Another maturational account was put forward by Radford (1990). He proposed that young children go through three stages. The first is pre-grammatical in that any terms used have yet to be categorized as nouns or verbs, say. Next (at around 1;8) comes the lexical stage. This is marked by an increase in vocabulary size, especially for nouns, verbs, prepositions, and adjectives, and by the appearance of word combinations like X + Complement (e.g., open box, in bag), Modifier + X (e.g., nice book, back in, very good), and Possessor + X (e.g., baby cup, daddy gone, doggy down, hand cold). Absent from this lexical stage, according to Radford, are all “functional categories,” such as determiners (a, the, this, that, etc.) and complementizers (whether, that), as well as inflections, such as tense suffixes (wants, jumped), modal auxiliaries (can, must), and the infinitival marker to. These emerge only in the third, functional, stage. Functional categories emerge when they do, according to Radford (1990:274), because they “are genetically programmed to come into operation at different biologically determined stages of maturation.” But in other languages, a variety of functional categories emerge much earlier (e.g., determiners in Sesotho: Demuth 1992), and even in English, some members of a specific functional set emerge many months before others (Fletcher 1985; Ingham 1998). These data raise serious questions about the status of a functional stage per se.

An alternative to Radford’s maturational account is that functional categories come in later than some instances of lexical categories because they are semantically more complex and so require more structural knowledge. They are often pragmatically complex as well. This account is consistent with the data on Korean determiners, a lexical category that emerges relatively late, near age three (O’Grady 1993), and also with the rather long period of acquisition (three years or more) for functional categories in English (e.g., Brown 1973). A major determinant of acquisition there is relative semantic complexity, with less complex meanings mastered before more complex ones. If so, complexity of meaning may offer a more general explanation, crosslinguistically, than the stages proposed by Radford.

Then, the question of what is innate is rather a question of whether we make use of innate learning mechanisms unique to language, and if so, what form these might take. The real debate here should be over the specificity or generality of the learning mechanisms themselves, not the categories or structures to be learnt. As Lenneberg (1967:394) put it:

[N]o features that are characteristics of only certain natural languages, either particulars of syntax, or phonology, or semantics, are assumed…to be innate. However, there are many reasons to believe that the processes by which the realized, outer structure of a natural language comes about are deep-rooted, species-specific, innate properties of man’s biological nature.

What is innate in this view is the manner in which humans process information.

A Learning Mechanism, Just for Syntax?

Some researchers have argued that there is a module in the brain for the rule-governed portion of language. Humans are unique in having developed language; there is strong evidence for specialization of the left hemisphere for language, and part of this specialization consists in a module devoted to the syntactic (rule-based) component of language. As a result, they have argued, whatever mechanisms are used for language acquisition must be specialized and quite distinct from any used for other cognitive acquisitions. In short, language is distinct from cognition, and mechanisms for acquisition can be used only for language.

What is the evidence for this position? First, language is unique to people and is not found in other species. Second, some language disorders show selective impairment of the rule-governed aspects of language (specifically, syntactic rules and regular morphological paradigms). This has been viewed as evidence that language is encapsulated and hence distinct from other capacities (Curtiss 1988). For instance, studies of children with Williams Syndrome initially suggested that WS children’s language was normal but their nonverbal skills were not (e.g., Mervis 1999; Morris & Mervis 1999). This population therefore offered strong evidence for a dissociation between

linguistic skills and general cognitive skills. Children’s cognition could be seriously impaired, in short, but their language remained intact. Yet, as researchers looked more closely at the Williams Syndrome child, that picture became much cloudier.

The findings across tasks and languages show that WS individuals do not after all present evidence for an encapsulated language capacity. The initially rosy picture of their syntactic abilities was exaggerated. Children with Williams Syndrome do not follow the normal course of acquisition for syntax, morphology, or the lexicon. Their syntax and morphology do not, after all, represent a unified, neatly modular syntactic or computational skill, distinct from other linguistic and communicative abilities.

Module 3: The Role of the Native Language in Acquiring a Second Language.

Lado, in his early and influential book Linguistics Across Cultures (1957), stated this clearly:

individuals tend to transfer the forms and meanings, and the distribution of forms and meanings of their native language and culture to the foreign language and culture—both productively when attempting to speak the language and to act in the culture, and receptively when attempting to grasp and understand the language and the culture as practiced by natives.

(p. 2)

Lado’s work and much of the work of that time was based on the need to produce pedagogically relevant materials. To produce these native language-based materials, it was necessary to do a contrastive analysis of the native language and the target language. This entailed making detailed comparisons between the two languages in order to determine similarities and differences.

To understand why language transfer was accepted as the mainstream view of language learning, it is necessary to understand the psychological and linguistic thinking at the time Lado was writing.

It is important at this juncture to clarify one important aspect of our understanding of the term transfer. Although the original term used in the classical literature on transfer did not imply a separation into two processes, negative versus positive transfer, there has been some confusion in the use of the terms in the second language literature. Implicit in the use of these terms is that there are two different underlying learning processes, one of positive transfer and another of negative transfer. But the actual determination of whether or not a learner has positively or negatively transferred is based on the output, as analyzed by the researcher, teacher, native speaker/hearer, when compared and contrasted with target language output. In other words, these terms refer to the product, although their use implies a process. There is a process of transfer; there is not a process of negative or positive transfer. Thus, one must be careful when using terminology of this sort because the terminology suggests a confusion between product and process.

1. Behaviorism:

1.1. The Behaviorist account of L1 learning:

Bloomfield’s classic work, Language (1933), provides the most elaborate description of the behaviorist position with regard to language.

The typical behaviorist position is that language is speech rather than writing. Furthermore, speech is a precondition for writing. The justification for this position came from the facts that (a) children without cognitive impairment learn to speak before they learn to write and (b) many societies have no written language, although all societies have oral language; there are no societies with only written but no spoken language systems.

Within the behaviorist framework speaking consists of mimicking and analogizing. We say or hear something and analogize from it. Basic to this view is the concept of habits. We establish a set of habits as children and continue our linguistic growth by analogizing from what we already know or by mimicking the speech of others.

Bloomfield divides a speech situation, for example one in which a hungry speaker sees an apple and asks a companion to fetch it, into three parts:

1  Practical events before the act of speech (e.g., hungry feeling, sight of apple).

2  Speech event (making sound with larynx, tongue, and lips).

3  Hearer’s response (Ahmad’s leaping over the fence, fetching the apple, placing it in Sarah’s hand).

Thus, in this view, speech is the practical reaction (response) to some stimulus.

1.2. Psychological Background:

One of the key concepts in behaviorist theory was the notion of transfer. The main claim with regard to transfer is that the learning of task A will affect the subsequent learning of task B. What is of interest is how fast and how well you learn something after having learned something else.

In a transfer experiment related to verbal learning, consider a study by Sleight (1911) in which he was concerned with the ability to memorize prose more easily if one has had “prior experience” in memorizing poetry. He compared four groups of 12-year-old children on their ability to memorize prose. Three groups had prior training on the memorization of (a) poetry, (b) tables of measures, or (c) content of prose passages. A fourth group had no prior training in any type of memorization. Following training, the groups were given tests that tested their ability to memorize prose. The question was: “To what extent does poetry memorization, or more precisely, the skills used in poetry memorization, transfer to memorization of prose?” (The results were nonsignificant.)

Let’s consider an example from the area of language learning. According to the initial view of language transfer, if speakers of a particular language (in this case, Italian) form questions by saying:

(4-1) Mangia bene il bambino?
      eats well the baby
      “Does the baby eat well?”

then those same (Italian) speakers learning English would be expected to say

(4-2) Eats well the baby?

when asking a question in English. A behaviorist notion underlying this expectation is that of habits and cumulative learning. According to behaviorist learning theory:

Learning is a cumulative process. The more knowledge and skills an individual acquires, the more likely it becomes that his new learning will be shaped by his past experiences and activities. An adult rarely, if ever, learns

anything completely new; however unfamiliar the task that confronts him, the information and habits he has built up in the past will be his point of departure. Thus transfer of training from old to new situations is part and parcel of most, if not all, learning. In this sense the study of transfer is coextensive with the investigation of learning.

(Postman, 1971, p. 1019)

While this statement is not specifically intended as a description of language learning, we can see how the concepts were applied to second language learning. A distinction noted above that is commonly made in the literature is between positive transfer (also known as facilitation) and negative transfer (also known as interference). These terms refer respectively to whether transfer results in something correct or something incorrect, and, to repeat a point made earlier, do not imply two distinct cognitive processes. As an example with relation to second language learning, if a Spanish speaker is learning Italian, when asking a question that speaker might correctly produce

(4-3) Mangia bene il bambino?
      eats well the baby

because in Spanish one uses the same word order to form questions.

(4-4) ¿Come bien el niño?
      eats well the baby

This is known as positive transfer. But if that same speaker is learning English and produces

(4-5) Eats well the baby?

the incorrect utterance is known as negative transfer.
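The point that "positive" and "negative" describe the product rather than the process can be sketched in a few lines of Python. This is only an illustration: the word-order labels below are simplified assumptions made for the example, not a full description of Spanish, Italian, or English, and the function name classify_transfer is hypothetical.

```python
# Word-order patterns for yes/no questions, written as toy glosses.
# These labels are illustrative assumptions, not a full description of any language.
QUESTION_ORDER = {
    "Spanish": "verb adverb subject",
    "Italian": "verb adverb subject",
    "English": "auxiliary subject verb adverb",
}


def classify_transfer(native, target):
    """Label the product of carrying the native pattern into the target language."""
    transferred = QUESTION_ORDER[native]  # the single process: reuse the NL pattern
    return "positive" if transferred == QUESTION_ORDER[target] else "negative"


print(classify_transfer("Spanish", "Italian"))  # positive
print(classify_transfer("Spanish", "English"))  # negative
```

The same step (reusing the native pattern) runs in both calls; only the comparison with the target-language form decides which label the outcome receives.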

With regard to interference, there are two types noted in the literature: (a) retroactive inhibition—where learning acts back on previously learned material, causing someone to forget (language loss)—and (b) proactive inhibition—where a series of responses already learned tends to appear in situations where a new set is required. This is more akin to the phenomenon of second language learning because the first language in this framework influences/inhibits/modifies the learning of the L2.

We turn now to the work on second language learning that was based on these behaviorist positions. As noted earlier, Lado’s work made these theoretical underpinnings explicit. Recall also that the major impetus for this work was pedagogical. In his foreword to Lado’s book, Fries noted:

Before any of the questions of how to teach a foreign language must come the much more important preliminary work of finding the special problems arising out of any effort to develop a new set of language habits against a background of different native language habits . . .

Learning a second language, therefore, constitutes a very different task from learning the first language. The basic problems arise not out of any essential

difficulty in the features of the new language themselves but primarily out of the special “set” created by the first language habits. (Fries, 1957)

Thus, underlying much work in the 1950s and 1960s was the notion of language as habit. Second language learning was seen as the development of a new set of habits. The role of the native language, then, took on great significance, because, in this view of language learning, it was the major cause for lack of success in learning the L2. The habits established in childhood interfered with the establishment of a different set of habits.

From this framework emerged contrastive analysis, because if one is to talk about replacing a set of habits (let’s say, the habits of English) with another set of habits (let’s say, those of Italian), valid descriptions are needed comparing the “rules” of the two languages. It would be misleading, however, to consider contrastive analysis in a monolithic fashion. In fact, there are two distinct traditions of contrastive analysis that emerged. In the North American tradition, the emphasis was on language teaching and, by implication, language learning. Contrastive analyses were conducted with the ultimate goal of improving classroom materials.

As Fisiak (1991) noted, this is more appropriately considered “applied contrastive analysis.” In the European tradition, the goal of contrastive analysis was not pedagogical. Rather, the goal of language comparison was to gain a greater understanding of language.

A: Contrastive Analysis Hypothesis:

What are the tenets of contrastive analysis? Contrastive analysis is a way of comparing languages in order to determine potential errors for the ultimate purpose of isolating what needs to be learned and what does not need to be learned in a second-language-learning situation. As Lado detailed, one does a structure-by-structure comparison of the sound system, morphological system, syntactic system, and even the cultural system of two languages for the purpose of discovering similarities and differences. The ultimate goal is to predict areas that will be either easy or difficult for learners.

The pedagogical materials that resulted from contrastive analyses were based on a number of assumptions:

1. Contrastive analysis is based on a theory of language that claims that language is habit and that language learning involves the establishment of a new set of habits.

2. The major source of error in the production and/or reception of a second language is the native language.

3. One can account for errors by considering differences between the L1 and the L2.

4. A corollary to item 3 is that the greater the differences, the more errors will occur.

5. What one has to do in learning a second language is learn the differences. Similarities can be safely ignored as no new learning is involved. In other words, what is dissimilar between two languages is what must be learned.

6. Difficulty and ease in learning is determined respectively by differences and similarities between the two languages in contrast.
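The logic of these assumptions can be made concrete with a minimal Python sketch: compare two languages feature by feature and predict difficulty wherever they differ. The feature names and values are simplified assumptions chosen for illustration (they are not a real contrastive description), and predict_difficulty is a hypothetical helper, not a method taken from Lado.

```python
# Toy feature descriptions; the feature names and values are illustrative
# assumptions, not a real contrastive description of any language.
ITALIAN = {"yes_no_question_order": "verb-first", "articles": "yes"}
ENGLISH = {"yes_no_question_order": "do-support", "articles": "yes"}


def predict_difficulty(native, target):
    """A shared value predicts ease; a differing value predicts difficulty."""
    return {
        feature: "easy" if native.get(feature) == value else "difficult"
        for feature, value in target.items()
    }


predictions = predict_difficulty(ITALIAN, ENGLISH)
print(predictions)
# {'yes_no_question_order': 'difficult', 'articles': 'easy'}
```

The sketch only encodes assumptions 3 to 6; whether such predictions match what learners actually do is a separate, empirical question.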

There were two positions that developed with regard to the Contrastive Analysis Hypothesis (CAH) framework. These were variously known as the a priori versus the a posteriori view, the strong versus weak view and the predictive versus explanatory view. In the strong view, it was maintained that one could make predictions about learning and hence about the success of language-teaching materials based on a comparison between two languages. The weak version starts with an analysis of learners’ recurring errors. In other words, it begins with what learners do and then attempts to account for those errors on the basis of NL–TL differences. The weak version, which came to be part of error analysis, gained credence largely due to the failure of predictive contrastive analysis. The important contribution of the former approach to learner data (i.e., error analysis) was the emphasis it placed on learners themselves, the forms they produced, and the strategies they used to arrive at their IL forms. Those arguing against the strong version of contrastive analysis were quick to point out the many areas where the predictions made were not borne out in actual learner production.

But there were other criticisms as well. Perhaps the most serious difficulty and one that ultimately led to the demise of contrastive analysis, a hypothesis that assumed that the native language was the driving force of second language learning, was its theoretical underpinnings. In the 1960s, the behaviorist theory of language and language learning was challenged. Language came to be seen in terms of structured rules instead of habits. Learning was seen not as imitation but as active rule formation.

The recognition of the inadequacies of a behaviorist theory of language had important implications for second language acquisition, for if children were not imitators and were not influenced in a significant way by reinforcement as they learned language, then perhaps second language learners were not either. For example, it is not uncommon for beginning second language learners to produce an utterance such as 4-6:

(4-6) He comed yesterday.

in which the learner attempts to impose regularity on an irregular verb. There was no way to account for this fact within a theory based primarily on a learner transferring forms from the NL to the TL.
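A form like comed is what one would expect from a learner who applies a general rule rather than copying stored forms. The Python sketch below illustrates that idea under stated assumptions: the tiny irregular-verb table and the function names are hypothetical, and the rule covers only the simplest spelling cases.

```python
# A small illustrative sample of irregular past forms (not exhaustive).
IRREGULAR_PAST = {"come": "came", "go": "went", "eat": "ate"}


def regular_past(verb):
    """Apply the regular English past-tense rule (add -ed, or -d after a final e)."""
    return verb + "d" if verb.endswith("e") else verb + "ed"


def learner_past(verb, known_irregulars):
    """Use a stored irregular form if the learner has one; otherwise apply the rule."""
    return known_irregulars.get(verb, regular_past(verb))


print(learner_past("come", {}))              # comed
print(learner_past("come", IRREGULAR_PAST))  # came
print(learner_past("walk", {}))              # walked
```

Nothing in the sketch refers to the learner's native language, which is why a purely transfer-based theory has no place for this kind of error.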

Not only did errors occur that had not been predicted by the theory, but also there was evidence that predicted errors did not occur. That is, the theory did not accurately predict what was happening in nonnative speech.

Yet another criticism of the role of contrastive analysis had to do with the concept of difficulty. Recall that a fundamental tenet of the CAH was that differences signified difficulty and that similarity signified ease. Difficulty in this view was equated with errors. If a learner produced an

error, or errors, this was a signal that the learner was having difficulty with a particular structure or sound. But what actually constitutes a sign of difficulty?

Recognition of the complexity of comparing languages became apparent quite early, particularly in works such as that of Stockwell, Bowen, and Martin (1965a, 1965b), who, rather than dichotomize the results of language comparison into easy and difficult and therefore dichotomize the needs of learning into a yes/no position, established a hierarchy of difficulty and, by implication, a hierarchy of learning. Included in this hierarchy are different ways in which languages can differ.

B: Error Analysis:

Error analysis is a type of linguistic analysis that focuses on the errors learners make. Unlike contrastive analysis (in either its weak or strong form), it compares the errors a learner makes in producing the TL with the TL form itself. It is similar to the weak version of contrastive analysis in that both start from learner production data; however, in contrastive analysis the comparison is made with the native language, whereas in error analysis it is made with the TL.

Even though the main emphasis in second language studies during the 1950s and 1960s was on pedagogical issues, a shift in interests began to emerge. The conceptualization and significance of errors took on a different role with the publication of an article by Corder (1967) titled “The significance of learners’ errors.” Unlike the typical view held at the time by teachers, errors, in Corder’s view, are not just to be seen as something to be eradicated, but rather can be important in and of themselves.

Errors can be taken as red flags; they provide windows onto a system, that is, evidence of the state of a learner’s knowledge of the L2. They are not to be viewed solely as a product of imperfect learning; hence, they are not something for teachers to throw their hands up in the air about.

It has been found that second language errors are not a reflection of faulty imitation. Rather, they are to be viewed as indications of a learner’s attempt to figure out some system, that is, to impose regularity on the language the learner is exposed to. As such, they are evidence of an underlying rule-governed system. In some sense, the focus on errors is the beginning of the field of second

language acquisition, which at this point is beginning to emerge as a field of interest not only for the pedagogical implications that may result from knowing about second language learning, but also because of the theoretical implications for fields such as psychology (in particular learning theory) and linguistics.

In the same article, Corder was careful to distinguish between errors and mistakes. Mistakes are akin to slips of the tongue. That is, they are generally one-time-only events. The speaker who makes a mistake is able to recognize it as a mistake and correct it if necessary. An error, on the other hand, is systematic. That is, it is likely to occur repeatedly and is not recognized by the learner as an error.

Taken from the perspective of a learner who has created a grammatical system (an interlanguage), everything that forms part of that interlanguage system by definition belongs there. Hence, there can be no errors in that system. Errors are only errors with reference to some external norm (in this case the TL).

A great deal of the work on error analysis was carried out within the context of the classroom. The goal was clearly one of pedagogical remediation. There are a number of steps taken in conducting an error analysis.

1  Collect data. Although this is typically done with written data, oral data can also serve as a base.

2  Identify errors. What is the error (e.g., incorrect sequence of tenses, wrong verb form, singular verb form with plural subject)?

3  Classify errors. Is it an error of agreement? Is it an error in irregular verbs?

4  Quantify errors. How many errors of agreement occur? How many irregular verb form errors occur?

5  Analyze source.

6  Remediate. Based on the kind and frequency of an error type, pedagogical intervention is carried out.
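The classifying and quantifying steps above lend themselves to a simple tally. The following Python sketch assumes the errors have already been identified and labeled by hand; the sample sentences (apart from "He comed yesterday.") and the category labels are hypothetical, and the function name quantify is an assumption made for illustration.

```python
from collections import Counter

# Step 1: collected data. Each record is a learner sentence that has already
# been identified and classified by hand (steps 2 and 3). Apart from
# "He comed yesterday.", the sentences are hypothetical examples.
annotated_errors = [
    {"sentence": "He comed yesterday.", "category": "irregular verb"},
    {"sentence": "The students is late.", "category": "agreement"},
    {"sentence": "She goed home.", "category": "irregular verb"},
]


def quantify(errors):
    """Step 4: count how many errors fall into each category."""
    return Counter(record["category"] for record in errors)


counts = quantify(annotated_errors)
# Most frequent category first, as a rough guide for step 6 (remediation).
priorities = [category for category, _ in counts.most_common()]

print(counts["irregular verb"], counts["agreement"])  # 2 1
print(priorities)  # ['irregular verb', 'agreement']
```

Step 5 (analyzing the source of each error) is deliberately left out, since it depends on the analyst's judgment rather than on counting.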

Error analysis provides a broader range of possible explanations than contrastive analysis for researchers/teachers to use to account for errors, as the latter only attributed errors to the NL. In comparison, there are two main error types within an error analysis framework: interlingual and intralingual. Interlingual errors are those which can be attributed to the NL (i.e., they involve cross-linguistic comparisons). Intralingual errors are those that are due to the language being learned, independent of the NL.

Error analysis was not without its detractors. One of the major criticisms of error analysis was directed at its total reliance on errors to the exclusion of other information.

Perhaps the most serious attempt at showing the inadequacies of error analysis comes from a 1974 article by Schachter. She collected 50 compositions from each of four groups of learners of English: native speakers of Persian, Arabic, Chinese, and Japanese. Her research focused on the use of English restrictive relative clauses (RC) by each of these four groups. The findings in terms of errors are given in Table 4.3 (taken from Table 4.4).

Compare the construction of relative clauses in (Chinese and Japanese) and (Arabic and Persian).

Module 4: Generative Approaches to First Language Acquisition

A: First Language Acquisition: Basics

The study of language acquisition raises questions such as these: How do children break into language? How does knowledge of language emerge in early infancy, and how does it grow? What are the milestones of the language acquisition process? What kinds of linguistic knowledge do children display at given points of development?

The framework adopted here to answer these questions is the generative theory of Universal Grammar (Chomsky 1975, 1981, 1986). According to this theory, human beings are innately endowed with a system of richly structured linguistic knowledge, which guides infants in analyzing incoming linguistic stimuli. Such a theory makes possible clear and falsifiable predictions about children's linguistic competence and offers the tools needed to precisely characterize this competence at given points of development.

For children, acquiring a language is an effortless achievement that occurs:

- without explicit teaching,

- on the basis of positive evidence (i.e., what they hear),

- under varying circumstances, and in a limited amount of time,

- in identical ways across different languages.

1. Acquiring language without explicit teaching:

Unlike learning a second language in adulthood, acquiring a first or native language does not require systematic instruction. Parents usually do not teach children the rules of language or tell them what kinds of sentences they can and cannot say. Language develops spontaneously by exposure to linguistic input, that is, on the basis of what children hear. Children are rarely corrected, and even when they are, they resist the correction. McNeill (1966, 69) reports the following conversation between a child and his mother:

The child in this exchange uses double negation (nobody don't), an option that is not allowed in standard English. As the exchange shows, correction does not seem to have helped the child very

much: he eventually notices the use of likes (though he uses it incorrectly), but he fails to take advantage of the whole content of the correction.

2. Acquiring language on the basis of positive evidence:

Parents' corrections should inform children of what is not possible in the language they are exposed to; such information coming from correction is called negative evidence. As noted, however, corrections are rare, and they do not seem to improve children's linguistic behavior. Much research has been conducted to establish whether negative evidence is available to children in the form of parents' disapproval or failure to understand, parents' expansion of what children say, and frequency of parents' reactions to children's utterances (see Bohannon and Stanowicz 1988; Demetras, Post, and Snow 1986; Hirsh-Pasek, Treiman, and Schneiderman 1984). Although the question is still much debated, the general conclusion is that negative evidence is not provided to all children on all occasions, is generally noisy, and is not sufficient (see Brown and Hanlon 1970; Bowerman 1988; Morgan and Travis 1989; Marcus 1993). Thus, negative evidence is not a reliable source of information. Children have the best chance to succeed in acquiring language by relying on positive evidence, the utterances they hear around them, a resource that is abundantly available.

3. Acquiring Language under Varying Circumstances and in a Limited Amount of Time:

Children acquire language under different circumstances, and the linguistic input they are exposed to may vary greatly from child to child. Nevertheless, they all attain the same competence and do so in a limited amount of time. By about 5 years of age they have mastered most of the constructions of their language, although their vocabulary is still growing.

4. Acquiring Language in Identical Ways across Different Languages:

Children achieve linguistic milestones in parallel fashion, regardless of the specific language they are exposed to. For example, at about 6–8 months all children start to babble, that is, to produce repetitive syllables like bababa. At about 10–12 months they speak their first words, and between 20 and 24 months they begin to put words together. It has been shown that children between 2 and 3 years speaking a wide variety of languages use infinitive verbs in main clauses or omit sentential subjects, although the language they are exposed to may not have this option. Across languages young children also overregularize the past tense or other tenses of irregular verbs. Interestingly, similarities in language acquisition are observed not only across spoken languages, but also between spoken and signed languages. For example, at the age when hearing babies start to babble orally, deaf babies start to do the same manually (see Petitto 1995). It is striking that the timing and milestones of language acquisition are so similar and that the content of early languages is virtually identical, despite great variations in input and in conditions of acquisition.

B: The Logical Problem of Language Acquisition:

Looking at the facts described in the last section, researchers have characterized the problem of language acquisition as follows (see Baker and McCarthy 1981):

- Children come to have very rich linguistic knowledge that encompasses a potentially infinite number of sentences, although they hear a finite number of sentences.

- The data that children draw upon consist of positive evidence (sentences that are acceptable in the language they are exposed to).

- Children are not told which sentences are ill formed or which interpretations sentences cannot have in their language, but eventually they attain this knowledge; all mature speakers can judge whether a sentence is acceptable or not (under a given interpretation).

- Although children make “errors,” they do not make certain errors that would be expected if they generalized from the linguistic input. For example, although children hear sentences like Who do you wanna invite? and Who do you wanna see?, they do not generalize from these to impossible English sentences like *Who do you wanna come?; although this generalization would seem reasonable, children never say such sentences.

These points are part of an argument about the mechanisms underlying language acquisition, the so-called argument from the poverty of the stimulus. Essentially, this argument starts with the premises that all speakers of a language know a given fairly abstract property and that this property cannot be induced from the evidence available to children (positive evidence).

C: Language Acquisition from the perspective of Universal Grammar:

A grammar is a finite system since it is somehow represented in the mind/brain. As Chomsky showed in the 1950s, it is a mental generative procedure that uses finite means to generate an indefinite number of sentences. The term grammar, as used here, refers to a psychological entity, not to an inventory of sounds, morphemes, inflectional paradigms, and syntactic constructions (e.g., passives, relative clauses).
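The idea of finite means generating an indefinite number of sentences can be made concrete with a minimal sketch. The rewrite rules and vocabulary below are hypothetical and chosen only for illustration; they are not a claim about the actual rules of English or about how the mental grammar is implemented. The point is only that a noun phrase rule that can contain another noun phrase has no longest output.

```python
# Toy rules (an assumption for illustration only):
#   S  -> NP VP          VP -> V NP
#   NP -> Det N | Det N PP
#   PP -> P NP           (NP inside PP inside NP: the rules are recursive)

def noun_phrase(depth):
    """Build an object NP containing `depth` embedded prepositional phrases."""
    if depth == 0:
        return ["the", "bone"]                      # NP -> Det N
    # NP -> Det N PP, with PP -> P NP calling the same rule again
    return ["the", "dog", "near"] + noun_phrase(depth - 1)

def sentence(depth):
    """Build a full sentence whose object NP has `depth` embeddings."""
    return ["the", "child", "saw"] + noun_phrase(depth)

# The same handful of rules yields a new, longer sentence for every depth.
for d in range(3):
    print(" ".join(sentence(d)))
```

Each increase in `depth` reuses the same rules and adds three words, so the set of sentences is unbounded even though the rule set is fixed and small.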

Our linguistic knowledge allows us to produce and understand sentences we have never heard before. It also gives us the tools to establish whether a sentence is acceptable in our language or not. For example, although (2) is comprehensible, it is not an acceptable sentence in English. It does not comply with what we know to be licit in English.

(2) Dog a old a bone ate.

It is again our grammar that permits us to say that the sentence in (3) is perfectly sound, but only on the interpretation that Mary washed another female individual. It cannot mean that Mary washed herself.

(3) Mary washed her.

In other words, the pronoun her in (3) must refer to or pick out an individual distinct from the individual picked out by Mary. As (4) shows, however, pronouns need not always be interpreted in this way.

(4) Mary washes her socks.

The sentence in (4) is ambiguous: it can mean either that Mary washes some other female individual's socks or that Mary washes her own socks. Unlike the pronoun in (3), the pronoun here can be interpreted in two ways: either it refers to the same individual picked out by Mary or it refers to another salient individual in the extralinguistic context.

Linguistic ambiguity is pervasive. Sentence (5) is also ambiguous, having the two readings in (6a) and (6b) (example from Lightfoot 1982, 19).

(5)  John kept the car in the garage.

(6)  a. The car that John kept was the one in the garage. b. The garage was where John kept the car.

Human beings have the resources to cope with linguistic ambiguity. We know whether a sentence is ambiguous or not, whether we can interpret it in certain ways or not, because our grammar assigns sentences structural representations constrained in specific ways. The string in (5) can be associated with two structural representations, (7a,b), each corresponding to one of the two legitimate interpretations of this string, (6a,b).
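As a rough illustration of how one string of words can be paired with two structural representations, the sketch below encodes two bracketings of (5) as nested tuples and checks that both yield the same word sequence. The category labels and the exact bracketings are assumptions made for this sketch; they are not a reproduction of the representations in (7a,b).

```python
# Each tree is (label, child, child, ...); a plain string is a word.
# Reading (6a): "in the garage" is inside the noun phrase headed by "car".
reading_a = ("S", ("NP", "John"),
             ("VP", ("V", "kept"),
              ("NP", ("NP", "the", "car"),
               ("PP", "in", ("NP", "the", "garage")))))

# Reading (6b): "in the garage" attaches to the verb phrase instead.
reading_b = ("S", ("NP", "John"),
             ("VP", ("V", "kept"),
              ("NP", "the", "car"),
              ("PP", "in", ("NP", "the", "garage"))))

def leaves(tree):
    """Return the words of a tree from left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:        # tree[0] is the category label
        words.extend(leaves(child))
    return words

same_string = leaves(reading_a) == leaves(reading_b)
different_structure = reading_a != reading_b
print(" ".join(leaves(reading_a)), same_string, different_structure)
```

The two trees differ only in where the prepositional phrase attaches, which is the kind of structural difference that corresponds to the two interpretations in (6a,b).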

In summary, we can do certain things with language because we have a grammar, a psychological entity realized somehow in our mind/brain. This grammar assigns certain structural representations to sentences, and it sanctions certain interpretations while banning others. It does this by means of constraints that establish what is possible and what is not possible in language. In the next section we will look more closely at the notion of constraints.

D: Constraints:

Constraints are linguistic principles that prohibit certain arrangements of words, certain operations, and certain associations of sounds and meanings. They encode properties that hold universally (i.e., in language after language) and are all inviolable (i.e., no violation of any sort is tolerated).

Sentences must conform to linguistic constraints if they are to be considered well formed or acceptable. For example, the question in (8b), obtained from the declarative sentence in (8a), is judged ill formed by English speakers, as conventionally indicated by the “star” (*), because it violates a constraint of English grammar.

(8) a. John regrets that Paul behaved badly. b. *How does John regret that Paul behaved?

Constraints are of two kinds: constraints on form and constraints on meaning. Constraints on form encode the linguistic information that certain sentences are ill formed. An example of a constraint on form is the one operative in (8b). Notice that a minimal variant of (8b), namely (9b), obtained from (9a), is well formed.

(9) a. John thinks that Paul behaved badly. b. How does John think that Paul behaved?

The question of interest here is this: how does the child who has heard (8a) and (9a,b) refrain from abstracting a rule that would yield (8b)? In fact, English speakers all share the knowledge that questions like (8b) are ill formed. Linguists propose that a constraint on grammar is responsible for this knowledge. Moreover, speakers of all other languages investigated thus far also know that the counterpart of (8b) is ill formed in their languages, and that the counterpart of (9b) is well formed. Thus, the kind of knowledge that allows us to say that (8b) or its counterpart in another language is not acceptable cannot be language specific, but must be universal. Another constraint on form is that governing the optional contraction between want and to in English.

(10) a. Who do you want to invite? b. Who do you wanna invite?

(11) a. When do you want to go out? b. When do you wanna go out?

(12) a. Who do you want to come? b. *Who do you wanna come?

It is possible to contract want and to in (10a) and (11a). However, in (12a) the result of wanna-contraction is ill formed, something that speakers of English implicitly know, although they may not be able to formally express this prohibition. Essentially, wanna-contraction is not possible when the questioned element is the subject of the infinitival clause, for example, John in I want John to come. Although linguists speak of “a constraint governing wanna-contraction,” remember that terms naming specific syntactic constructions are used only for convenience. In its general formulation this is a universal constraint that blocks a certain process from occurring in certain structural configurations.

Beyond constraints on form, grammars include constraints on the meaning that speakers assign to acceptable sentences. Consider the sentences in (13). Both sentences are perfectly well formed, but while in (13a) the two italicized expressions he and John cannot pick out the same individual, in (13b) they can (here, “*” indicates that the sentence in (13a) is ruled out when the two italicized expressions pick out the same individual).

(13) a. *He danced, while John was singing. b. While he was singing, John was dancing.

In other words, (13b) is ambiguous: the pronoun he can refer either to the same individual that John refers to (anaphoric interpretation of the pronoun) or to another salient character in the extrasentential context (exophoric/deictic interpretation of the pronoun). By contrast, in (13a) the pronoun he can only be interpreted as referring to some individual other than John. Interestingly, in all languages investigated so far, the counterparts of (13a,b) work the same way; that is, the constraint governing the interpretation of (13) holds universally.

E: The Innateness Hypothesis:

Recall the premises of the argument from the poverty of the stimulus: that all speakers of a language know a fairly abstract property and that this property cannot be induced from the evidence available to children (positive evidence). The conclusion that these premises invited us to draw is the answer to the question we started with: where does linguistic knowledge come from? Imitation, reinforcement, and association having failed to answer this question, we must look further. In fact, the answer that Chomsky (1959) gave in arguing against behaviorist views, and that concludes the argument from the poverty of the stimulus, is that this knowledge is inborn.

There is a debate as to how rich the genetic makeup supporting human linguistic abilities is. Researchers in the Chomskyan tradition assume that inborn human knowledge is richly structured and must consist of the kinds of constraints (or of something equivalent in its effects) discussed above. It is very unlikely that these constraints are learned since they hold universally. It would be very curious that all languages conform to these constraints if this crosslinguistic similarity were not somehow dictated by our mind/brain: languages “are all basically set up in the way that human biology expects them to be” (Gleitman and Liberman 1995, xxi). Thus, children are born expecting that, whichever language they are going to hear, it will have the properties that their genetic equipment is prepared to cope with.

The hypothesis that the language capacity is innate and richly structured explains why language acquisition is possible, despite all limitations and variations in the learning conditions. It also explains the similarities in the time course and content of language acquisition. How could the process of language acquisition proceed in virtually the same ways across modalities and across languages, if it were not under the control of an innate capacity? Of course, not all linguistic knowledge is innate, for children reared in different linguistic environments learn different languages. That languages vary is obvious. For example, in Italian the sentential subject can be phonologically silent, while in English it cannot. However, this variation is not unlimited. Universal Grammar (UG) is the name given to the set of constraints with which all human beings are endowed at birth and that are responsible for the course of language acquisition. UG defines the range of possible variation, and in so doing it characterizes the notion of possible human language. A characterization of UG is a characterization of the initial linguistic state of human beings, the genetic equipment necessary for acquiring a language.

According to this nativist view, acquisition results from the interaction between inborn factors and the environment. Language is not learned, but, under normal conditions, it is deemed to emerge at the appropriate time, provided the child is exposed to spoken or signed language. Obviously, children have to learn the words of their language, its lexicon. They also have to figure out what the regularities of their language are, and how innate constructs are instantiated in their linguistic environment (Fodor 1966).

F: The Principles & Parameters Model:

Our genetic endowment makes it possible to learn any human language. Children raised in an English-speaking environment speak English, those raised in an Italian-speaking environment speak Italian, and those raised in a Tibetan-speaking environment speak Tibetan. Although all languages have the same basic underlying structure, there are variations. For example, in some languages (e.g., English and Italian) the verb comes before complements; in others (e.g., Turkish and Bengali) it comes after. So, while an English speaker would say John bought books, a Turkish speaker would say something equivalent to John books bought. Some languages (e.g., Italian and Spanish) allow the sentential subject to remain phonologically unexpressed; others (e.g., English) do not. So, while Bought a book is ungrammatical in English, its counterpart is acceptable in Italian or Spanish.

The model of language adopted here makes sense of these variations by holding that UG consists of two types of constraints: principles and parameters. Hence, it is called the principles-and-parameters model (Chomsky 1981). Principles encode the invariant properties of languages, that is, the universal properties that make languages similar. For example, the constraint discussed above governing the interpretation of pronouns is a principle; in any human language this principle regulates the interpretation of pronouns. Parameters encode the properties that vary from one language to another; they can be thought of as switches that must be turned on or off. An example is the pro-drop or null subject parameter governing the phonological expression of the sentential subject.

As a first approximation, we can formulate the pro-drop parameter as in (A).

(A) Can the sentential subject be phonologically null?

Depending on the particular language, the answer to the question in (A) will vary. If a child is exposed to Italian, the parameter in (A) will be set to the positive value; if the child is exposed to English, it will be set to the negative value.

Under the principles-and-parameters model, children are innately endowed with principles and parameters, because both are given by UG. The children's task is to set the parameters to the value expressed by the language of their environment. In this model, then, language acquisition consists (among other things) in selecting the appropriate values of the parameters specified by UG.
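The switch metaphor can be sketched in a few lines of code. Everything in this sketch is an assumption made for illustration: the `Utterance` record stands in for input that the child has somehow already analysed, the parameter names are invented, verb–complement order is treated as a second parameter, and the choice of a negative starting value for the null subject switch is a simplification, not a claim made in this module. The only thing the sketch is meant to show is selection from positive evidence among values that are given in advance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Utterance:
    """A hypothetical, pre-analysed sentence from the child's environment."""
    words: tuple
    overt_subject: bool            # is the sentential subject pronounced?
    verb_before_complement: bool   # does the verb precede its complement?

def set_parameters(utterances):
    """Select parameter values using only sentences that were actually heard."""
    # Assumed starting state: null subjects off, word order not yet fixed.
    settings = {"null_subject": False, "verb_before_complement": None}
    for u in utterances:
        if not u.overt_subject:
            # One subjectless sentence is positive evidence for the "yes" value.
            settings["null_subject"] = True
        if settings["verb_before_complement"] is None:
            settings["verb_before_complement"] = u.verb_before_complement
    return settings

english_like = [Utterance(("John", "bought", "books"), True, True)]
italian_like = [Utterance(("Ha", "comprato", "un", "libro"), False, True)]
verb_final = [Utterance(("John", "books", "bought"), True, False)]

for name, data in [("English-like", english_like),
                   ("Italian-like", italian_like),
                   ("verb-final", verb_final)]:
    print(name, set_parameters(data))
```

Note that the function never receives information about which sentences are impossible; the input only selects among options that the system already makes available.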

The theory of language acquisition endorsed here is a selective theory, rather than an instructive one. “Under an instructive theory, an outside signal imparts its character to the system that receives it, instructing what is essentially a plastic and modifiable nervous system; under a selective theory, a stimulus may change a system that is already structured by identifying and amplifying some component of already available circuitry” (Lightfoot 1991, 2). In other words, under an instructive theory genuine learning takes place; under a selective theory no learning takes place because the stimulus works on what is already inborn.

Selection, rather than instruction, operates in other biological systems besides language. Niels Kaj Jerne has defended a selective theory of antibody formation, whereby antigens select antibodies that already exist in an individual's immune system (for discussion of these issues, see Jerne 1967, 1985; Piattelli-Palmarini 1986). He has also conjectured that certain central nervous system processes might work selectively and has pointed out that in the history of biology selective theories have often replaced instructive theories.

In summary, UG is the human genetic endowment that is responsible for the course of language acquisition. It includes principles and parameters that encode the invariant and variant properties of languages, respectively. Parameters define the range of variation that is possible in language; and together, principles and parameters define the notion “possible human language.” Language acquisition is a selective process whereby the child sets the values of parameters on the basis of the linguistic environment.

Module 5: Generative Approaches to Second Language Acquisition

A defining property of generative approaches to non-native language (L2) acquisition is a focus on the question of whether (adult) L2 acquisition is guided and constrained by the same principles of Universal Grammar (UG) that are assumed by generative grammarians to guide and constrain native language (L1) acquisition. Within this broad research paradigm, some L2 researchers have claimed that UG becomes inactive or “inaccessible” at some point in the human life cycle and thus plays no role in non-native language acquisition after that (Clahsen and Muysken 1986; Meisel 1997).

Others have claimed that UG remains fully active or “accessible” in the human brain throughout life and (in principle) plays the same role in both native and (adult) non-native language acquisition (or at least, UG would play the same role, if it were not for the confounding factor of previously acquired grammars) (Dekydtspotter, Sprouse and Anderson 1997; Dekydtspotter, Sprouse and Swanson 2001; Herschensohn 2000; Schwartz 1987; Schwartz and Sprouse 1996; Slabakova 2008; Vainikka and Young-Scholten 1994; White 1989, 2003a; and many others).

A third group of L2 scholars has claimed that only those properties and/or categories of UG instantiated in the L1 grammar can be accessed in adult L2 acquisition (Bley-Vroman 1990; Hawkins and Chan 1997; Schachter 1989b; Tsimpli and Dimitrakopoulou 2007; Tsimpli and Roussou 1991), while yet a fourth position is that UG is selectively impaired or “partially accessible” in the adult L2 learner (Beck 1998a).

Despite competing claims about the epistemology of adult L2 acquisition, the commonality uniting all of the perspectives referenced above is the assumption that children are born with UG, that is, a body of domain-specific cognitive principles or mechanisms constraining the acquisition of language, and that UG plays a fundamental role in accounting for the observable course of L1 development in childhood. On the interpretation presented in this module, the basis for positing UG is the empirical fact that human children (exposed to contextualized linguistic input, i.e. primary linguistic data), systematically and without the need for specific instruction or for direct negative evidence (direct information about what is impossible, e.g. impossible strings annotated as such), acquire systems of subdoxastic linguistic knowledge, which cannot plausibly be inferred from the input on the basis of domain-general learning principles alone.

The enormous gap between the input available to the child (primary linguistic data) and the system of knowledge acquired, a system that includes what is possible but, crucially, excludes what is impossible, has come to be known as the poverty of the stimulus (POS). (See Thomas 2002 for a history of the development of this term and its rising importance over the course of the evolution of generative linguistic theory.) As Thomas (2002) astutely points out, the “stimulus” is in fact in no way “impoverished” from the perspective of the language-acquiring child. Quite the contrary, the stimulus (ambient linguistic input uttered in contexts of the world) is entirely sufficient to allow all children, barring pathology, to develop mental grammars that appear to match those of the speech community in which they live with respect to even extremely subtle and complex properties. The standard (perhaps, defining) explanation of generative grammar for this phenomenon is that the brain/mind of human children is endowed with UG, a network of domain-specific cognitive predispositions that filter the input and narrowly constrain the set of grammars that can be projected from the input. Thus, the stimulus is “impoverished” only from the perspective of the expectations of a purely inductive domain-general learning hypothesis. It would be counterintuitive, to say the least, to deny that there is a general POS associated with the acquisition of human languages, given that cognitively normal humans acquire the ambient language of their speech community, and no other species does so, nor do other species seem to have anything remotely like language in the sense of human language, with properties such as recursion and the generation of an infinite set of discrete sentences. There is something special about human brains that produces cognitive outcomes to the linguistic stimulus that are radically different from the cognitive outcomes produced by the brains of even our closest primate cousins exposed to the same stimulus.

However, the gap between what an “unbiased” analysis of the input would predict and the grammar actually triggered in the child’s brain suggests that much more is at stake than merely the superior domain-general reasoning/learning abilities of Homo sapiens sapiens. To the extent that the contribution of this system of innate knowledge to language acquisition has no application outside the realm of human language, one is left with the conclusion that the human brain is equipped with domain-specific cognitive structures and operations for language (henceforth language-specific knowledge).

A: General Nativism vs. Special Nativism:

The general nativist position maintains that there is no specific mechanism designed for language learning. Rather, “there are general principles of learning that are not particular to language learning but may be employed in other types of learning” (Eckman, 1996, p. 398).

Special nativism includes theories of language (learning) that posit special principles for language learning, principles that are unique to language (learning) and that are not used in other cognitive endeavors. Both the general nativist and special nativist positions agree that there is something innate involved in language learning; it is the nature of the innate system that is in question. Is it available only for the task of language learning or is it also available for more general learning tasks? This module treats only the special nativist approach, known as Universal Grammar (UG). Central to this approach is an understanding of language as a system with its own rules.

B: Universal Grammar and Second Language Acquisition:

The UG approach to second language acquisition begins from the perspective of learnability. The assumption of innate universal language properties is motivated by the need to explain the uniformly successful and speedy acquisition of language by children in spite of insufficient input.

In UG theory, universal principles form part of the mental representation of language, and it is this mental grammar that mediates between the sound and meaning of language. Properties of the human mind are what make language universals the way they are. As Chomsky (1995, p. 167) noted: “The theory of a particular language is its grammar. The theory of languages and the expressions they generate is Universal Grammar (UG); UG is a theory of the initial state S₀ of the relevant component of the language faculty.” The assumption that UG is the guiding force of child language acquisition has long been maintained by many, but only in the past two decades has it been applied to second language acquisition. After all, if properties of human language are part of the mental representation of language, it is assumed that they do not cease being properties in just those instances in which a nonnative language system is being employed.

The theory underlying UG assumes that language consists of a set of abstract principles that characterize core grammars of all natural languages. In addition to principles, which are invariable (i.e., all languages have them), there are parameters, which vary across languages.

In sum, Universal Grammar is “the system of principles, conditions, and rules that are elements or properties of all human languages” (Chomsky, 1975, p. 29). It “is taken to be a characterization of the child’s prelinguistic state” (Chomsky, 1981, p. 7). Thus, the necessity of positing an innate language faculty is due to the inadequate input, in terms of quantity and quality, to which a learner is exposed. Learning is mediated by UG and by the L1.

How does this relate to second language acquisition? The question is generally posed as an access-to-UG problem. Does the innate language faculty that children use in constructing their native language grammars remain operative in second language acquisition? More recently, this question is formulated as an issue of initial state. What do second language learners start with? What is the nature of the linguistic knowledge with which learners begin the second language acquisition process?

The two variables influencing this debate are transfer (i.e., the availability of the first language grammar) and access to UG (i.e., the extent to which UG is available).

Two broad views are discussed here: the Fundamental Difference Hypothesis (Bley-Vroman, 1989; Schachter, 1988), which argues that what happens in child language acquisition is not the same as what happens in adult second language acquisition, and the Access to UG Hypothesis, which argues that the innate language facility is alive and well in second language acquisition and constrains the grammars of second language learners as it does the grammars of child first language learners.

1. Fundamental Difference Hypothesis:

The Fundamental Difference Hypothesis starts from the belief that, with regard to language learning, children and adults are different in many important ways. For example, the ultimate attainment reached by children and adults differs. In normal situations, children always reach a state of “complete” knowledge of their native language. In second language acquisition (at least, adult second language acquisition), not only is “complete” knowledge not always attained, it is rarely, if ever, attained. Fossilization, representing a non-TL stage, is frequently observed (Han, 2004; Long, 2007).

Another difference concerns the nature of the knowledge that these two groups of learners have at the outset of language learning. Second language learners have at their command knowledge of a full linguistic system. They do not have to learn what language is all about at the same time that they are learning a specific language.

One final difference to mention is that of motivation and attitude toward the target language and target language community. Differential motivation does not appear to impact a child’s success or lack of success in learning language. All human beings without cognitive impairment learn a first language.

In sum, the basic claim of the Fundamental Difference Hypothesis is that adult second language learners do not have access to UG. Rather, what they know of language universals is constructed through their NL. In addition to the native language, which mediates access to UG, second language learners make use of their general problem-solving abilities.

This information (knowing about language) is gleaned by means of knowing that the NL is this way and by assuming that these facts are a part of the general character of language rather than a part of the specific nature of the native language. Thus, the learner constructs a pseudo-UG, based on what is known of the native language. It is in this sense that the NL mediates knowledge of UG for second language learners.

2. Access to UG hypothesis:

The opposing view to the Fundamental Difference Hypothesis is the Access to UG Hypothesis. The common perspective is that “UG is constant (that is, unchanged as a result of L1 acquisition); UG is distinct from the learner’s L1 grammar; UG constrains the L2 learner’s interlanguage grammars”.

White outlines five positions with regard to the initial state of second language learning; the first three take the first language as the basis of the initial state and the last two take UG as the initial state: (1) Full Transfer/Full Access, (2) Minimal Trees, (3) Valueless Features, (4) Initial Hypothesis of Syntax, and (5) Full Access (without transfer).

L1 as the basis:

i. Full Transfer / Full Access: This position assumes that the starting point is the L1 grammar, but that there is full access to UG during the process of acquisition. The learner is assumed to use the L1 grammar as a basis but to have full access to UG when the L1 is deemed insufficient for the learning task at hand. L1 and L2 learning differ, and there is no prediction that learners will eventually attain complete knowledge of the L2.

ii. Minimal Trees Hypothesis: The Minimal Trees Hypothesis also maintains that both L1 and UG are available concurrently (Vainikka and Young-Scholten, 1994, 1996a, 1996b). However, the L1 grammar that is available contains no functional categories, and these categories, initially, are not available from any source. The emergence of functional categories is not dependent on the L1 and hence there is no transfer; rather, they emerge in response to L2 input. The development of functional categories of learners from different languages will be the same. On this view, learners may or may not reach the final state of an L2 grammar, depending on what is available through the L1 and what is available through UG. They should be able to reach the final state of an L2 grammar with regard to functional categories.

iii. Valueless Features: The claim is that there is weak transfer (Eubank 1993, 1993/1994, 1996). The L1 is the primary starting point. Unlike the Minimal Trees Hypothesis, both functional and lexical categories are available from the L1, but the strength of these features is not available. There are consequences of feature strength in areas such as word order. Acquisition involves acquiring appropriate feature strength of the L2. Learners should be able to fully acquire the L2 grammar.

UG as the basis:

i. Initial Hypothesis of Syntax: This position maintains that, as in child language acquisition, the starting point for acquisition is UG.

ii. Full Access / No Transfer: This position maintains that, as in child language acquisition, the starting point for acquisition is UG (Epstein, Flynn, and Martohardjono, 1996, 1998; Flynn, 1996; Flynn and Martohardjono, 1994). There is a disconnection between the L1 and the developing L2 grammar. A prediction based on this position is that L1 and L2 acquisition will proceed in a similar fashion, will end up at the same point, and that all L2 acquisition (regardless of L1) would proceed along the same path. Learners should be able to reach the same level of competence as native speakers. If there are differences, they are performance-related rather than competence-related.

Module 6: Cognitive-Interactionist Perspective on L2 Learning

Cognitive-interactionism is associated with the work in developmental psychology by Jean Piaget (e.g. 1974) and refers to the position that multiple internal (cognitive) and external (environmental) factors reciprocally interact (hence the word ‘interactionist’) and together affect the observed processes and outcomes of a phenomenon – in this case, additional language learning. It is noteworthy that internal cognition is assumed to be the locus of learning (hence the word ‘cognitive’ in the term) and that a clear separation between cognitive-internal and social-external worlds is presupposed, since how the two interact is the object of inquiry.

Languages are almost always learned with and for others, and these others generate linguistic evidence, rich or poor, abundant or scarce, that surrounds learners. Knowing about the language benefits afforded by the environment is thus important for achieving a good understanding of how people learn additional languages. In this module, we will examine environmental influences on L2 learning. We open with the story of Wes (Schmidt, 1983), who is probably the most frequently cited, admired and puzzled-over exemplar in the long gallery of learners that SLA researchers have mounted to date through the methodology of case study (Duff, 2008). He stands for someone who became particularly adept at ‘initiating, maintaining, and regulating relationships and carrying on the business of living’ in his additional language (Schmidt, 1983, p. 168) but remained unable to master the L2 grammar despite what seemed to be sufficient time and ideal environmental conditions.

The Case of ‘WES’:

Wes was a young Japanese artist who learned English without instruction in Honolulu. Schmidt (1983) kept rich field notes over the duration of the study and collected 18 hours of English oral data, in the form of letters that Wes tape-recorded over three years during his visits to Tokyo, to update people back in Honolulu about personal and professional matters. Towards the final months of the study, Schmidt also recorded an additional three hours of casual conversations in Honolulu.

In his early thirties, Wes emigrated from Tokyo to Honolulu by choice, in a financially and socially comfortable position, in pursuit of expanded international recognition in his already well-established career. Perhaps two features can be singled out as most defining of Wes’s personality. One is his strong professional identity as an artist, captured in excerpt (1) from an oral letter recorded into the third year of the study (Schmidt, 1983, p. 158):

(1) “you know I’m so lucky / because ah my business is painting / also my hobby is painting / ... this is my life / cannot stop and paint / you know nobody push / but myself I’m always push”

The second defining feature is Wes’s predisposition towards communication. His was the kind of social personality that avidly seeks people and engages in skillfully designed reciprocal interaction. This is illustrated in excerpt (2), also recorded around the same time as the previous excerpt:

(2) well / I like talk to people you know / um / I’m always listen then start talk / then listen / always thinking my head / then talk / some people you know only just talk, talk, talk, talk /

(Ibid., p. 160)

Schmidt describes Wes as someone who was confident and felt comfortable in his own identity as Japanese and at the same time showed extremely positive attitudes towards Hawai‘i and the United States throughout the three years of study. Most of his acquaintances and friends, and most of his clients and art brokers, were L1 English speakers, and he had an L1 English-speaking roommate. In Schmidt’s estimation, by the third year of study, he was using English in his daily interactions between 75 and 90 per cent of the time.

Wes had arrived in the United States with ‘minimal’ communicative ability in English (p. 140), and within three years he was able to function in the L2 during ‘promotional tours, exhibitions of paintings [... and] appearances and demonstrations by the artist’ that demanded of him whole-day, around-the-clock interactions and a mixture of demonstration painting and informal lecturing, all in English (p. 144). This transformation of his second language capacities seems remarkable. A close analysis of his language production in the recorded letters and conversations, however, revealed a more ambivalent picture. Evidence of the greatest strength and improvement was found in the area of oral discourse competence. Wes quickly became skillful and expressive in his conversations and grew able to narrate, describe and joke in rather sophisticated ways. Most interlocutors considered him a charming conversationalist who never ran out of topics and often took charge of steering conversations. At the other extreme, in the area of Wes’s grammatical competence, Schmidt uncovered puzzling stagnation. For example, the findings for verbal tense morphology are telling. For all three years, Wes’s verbs were characterized by overuse of –ing attached consistently to certain verbs denoting activities (e.g. joking, planning, training, touching), the use of past in only high-frequency irregular forms that can be memorized as items (e.g. went, sent, told, saw, said, met, bought), a complete absence of –ed and an overwhelming preference to make interlocutors understand the intended tense and aspect of his messages via lexical means such as adverbs (e.g. all day, always, right now, yesterday, tomorrow). 
In other words, over three years of rich exposure to and meaningful use of English, Wes’s temporal L2 system remained rudimentary, stuck in the transition between the lexical marking stage and the next stage of development, where tense and aspect morphology begins to deploy (Bardovi-Harlig, 2000). Likewise, his articles and plurals improved minimally, from practically no occurrence in the beginning of the study to accuracy in up to a meagre third of the relevant cases, but even then in great part because of repeated occurrence of these forms in chunked phrases like n years old or n years ago (for plural –s), and a little (bit) X (for use of the indefinite article a).

Regarding progress in the two areas of sociolinguistic and strategic competence, Wes presented a mixed profile, less extreme than the positive picture of discourse competence or the negative picture of grammatical competence. On the one hand, he developed a certain sociolinguistic repertoire that enabled him to issue requests, hint and make suggestions, if often strongly couched in indirectness:

(3) maybe curtain [maybe you should open the curtain]

(4) this is all garbage [put it out]

(5) uh, you like this chair? [please move over]

Since Japanese is well known for its indirect politeness (Ide et al., 2005), this preference may have been transferred from his L1.

How can Wes’s mixed success story of language learning be explained? Schmidt proposed that ‘sensitivity to form’ or the drive to pay attention to the language code (p. 172) seems to be the single ingredient missing in Wes’s efforts to learn the L2. Despite optimal attitudes towards the L2 and its members and plentiful and meaningful participation in English interactions, Wes was driven as a learner by an overriding investment in ‘message content over message form’ (p. 169). As he himself puts it, ‘I know I’m speaking funny English / because I’m never learning / I’m only just listen / then talk’ (p. 168). Schmidt concluded that positive attitudes and an optimal environment will afford the linguistic data needed for learning, but that the learning will not happen unless the learner engages in active processing of those data. In other words, grammar acquisition cannot be successful without applying ‘interest’, ‘attention’ and ‘hard work’ (p. 173) to the task of cracking the language code.

With this conclusion, the kernel of the Noticing Hypothesis was born. Nevertheless, cognitive-interactionist SLA researchers interested in the environment spent most of the decade of the 1980s exploring the four ingredients of attitudes, input, interaction and output before the insights finally converged into an emerging consensus that attention was the needed fifth ingredient to consider. Let us examine each ingredient and its associated hypothesis.

1. Attitudes:

Obviously, the L2 environment engenders in learners certain attitudes that have affective and social–psychological bases and that must be considered if we want to understand L2 learning.

In the late 1970s, John Schumann at the University of California Los Angeles focused on attitudes and proposed the Pidginization Hypothesis, also known as the Acculturation Model (explained in Schumann, 1976). The proposal was inspired by his case study of Alberto, a 33-year-old immigrant worker from Costa Rica who appeared to be unable to move beyond basic pidginized English after almost a year and a half in Boston, and even after he was provided with some individualized instruction. Schumann predicted that great social distance between the L1 and L2 groups (as is the case of circumstantial immigrants, who speak a subordinate minority language and are surrounded by a powerful language of the majority), and an individual’s affective negative predispositions towards the target language and its members (e.g. culture shock, low motivation) may conspire to create what he characterized as a bad learning situation that causes learners to stagnate into a pidgin-like state in their grammar, without inflections or mature syntax. Conversely, he predicted that the more acculturated a learner can become (that is, the closer to the target society and its members, socially and psychologically), the more successful his or her eventual learning outcomes will be.

Schmidt’s (1983) study of Wes was originally designed as a test case of the Acculturation Model and, as we have seen, it provided strong evidence against attitudes being the only or most important explanatory mechanism for L2 learning success. In response to the new evidence, Schumann eventually modified his model in important ways (Schumann, 1990, 1997).

2. Input:

The environment affords learners input, or linguistic data produced by other competent users of the L2. Also in the late 1970s, Stephen Krashen at the University of Southern California formally proposed a central role in L2 learning for input in his Comprehensible Input Hypothesis (best formulated in Krashen, 1985). This proposal drew on his extensive educational work with English-language learners in California’s schools and communities.

According to Krashen, the single most important source of L2 learning is comprehensible input, or language which learners process for meaning and which contains something to be learned, that is, linguistic data slightly above their current level. This is what Krashen termed i+1. Learners obtain comprehensible input mostly through listening to oral messages that interlocutors direct to them and via reading written texts that surround them, such as street signs, personal letters, books and so on. When L2 learners process these messages for meaning (which they will most likely do if the content is personally relevant, and provided they can reasonably understand them), grammar learning will naturally occur. Krashen proposed this role for input on the assumption that the mechanisms of L2 learning are essentially similar to the mechanisms of L1 learning: in order to build an L1 grammar, children only need to be exposed to the language that parents or caretakers direct to them for the purpose of meaning making.

The strong claim that comprehensible input is both necessary and sufficient for L2 learning proved to be untenable in light of findings gleaned by Schmidt (1983) and by many others, who documented minimal grammatical development despite ample meaningful opportunities to use the language, even with young L2 learners – for example, children attending French immersion (Swain, 1985) and regular English-speaking schools (Sato, 1990). Input is undoubtedly necessary, but it cannot be sufficient.

3. Interaction:

Much in the linguistic environment, particularly in naturalistic settings, but also in today’s communicative classrooms, comes to learners in the midst of oral interaction with one or more interlocutors, rather than as exposure to monologic spoken or written discourse. In the early 1980s, Michael Long proposed the Interaction Hypothesis (best explained and updated in Long, 1996). The hypothesis grew out of work conducted for his dissertation at the University of California Los Angeles, in which college-level ESL learners were paired to interact with English native-speaking pre-service and in-service teachers of ESL. It extended Krashen’s proposal by connecting it in novel ways with studies in discourse analysis that had entered the field via the work of SLA founder Evelyn Hatch (1978) and work done on caretaker speech and foreigner talk in neighbouring disciplines. At the time, Long agreed with Krashen that learning happens through comprehension, and that the more one comprehends, the more one learns. However, he departed from the strong input orientation of the times by focusing on interaction and proposing that the best kind of comprehensible input learners can hope to obtain is input that has been interactionally modified, in other words, adjusted after receiving some signal that the interlocutor needs some help in order to fully understand the message.

Interactional modifications are initiated by moves undertaken by either interlocutor in reaction to (real or perceived) comprehension problems, as they strive to make meaning more comprehensible for each other, that is, to negotiate for meaning. Typically, negotiation episodes begin with clarification requests if non-understanding is serious (e.g. whaddya mean? uh? pardon me?), confirmation checks when the interlocutor is somewhat unsure she has understood the message correctly (e.g. you mean X? X and Y, right?) and comprehension checks if one interlocutor suspects the other speaker may not have understood what she said (e.g. you know what I mean? do you want me to repeat?). Following signals of a need to negotiate something, the other interlocutor may confirm understanding or admit non-understanding, seek help, repeat her words exactly or try to phrase the message differently. Often this two-way process makes both interlocutors modify their utterances in ways that not only increase the comprehensibility of the message but also augment the salience of certain L2 forms and make them available to the learner for learning (Pica, 1994). This is illustrated in (6):

When interlocutors like Jane and Hiroshi work through messages in these ways, engaging in as much (or as little) negotiation for meaning as needed, we might say that they are generating tailor-made comprehensible input, or learner-contingent i+1, at the right level the particular interlocutor needs to understand the message. It was for this reason that Long predicted interactionally modified input would be more beneficial than other kinds of input. For example, it may be better than unmodified or authentic input (as in the listening or reading of authentic texts) but also better than pre-modified input (as in graded readers), which often means simplifying the language, thus risking the elimination of the +1 in the i+1 equation. Interactional modifications have the potential to bring about comprehension in a more individualized or learner-contingent fashion, with repetitions and redundancies rather than simplification. Thus, an important general benefit of interactional modifications is their contingency, in that learners are potentially engaging in what educational researchers would call just-in-time learning, or learning at the right point of need.

4. Output:

Where there is interaction, learners engage by necessity not only in comprehending and negotiating messages but also in making meaning and producing messages, that is, in output. By the mid-1980s, it was becoming apparent to SLA researchers that positive attitudes and plentiful input and interaction, while important, were not sufficient to guarantee successful grammatical acquisition. It was at this juncture that Canadian researcher Merrill Swain (1985) at the University of Toronto formulated her Pushed Output Hypothesis (you will also see the terms Comprehensible Output Hypothesis and Output Hypothesis used interchangeably). She did so drawing on results of large-scale assessment of the linguistic outcomes of French immersion schools in Ontario, an English-speaking province of Canada.

Specifically, she compared the oral and written performances of children who had studied in immersion schools against the performances on the same tasks by same-age L1 French peers. She found patterns that remarkably resonate with Schmidt’s (1983) findings for Wes. School immersion from kindergarten to sixth grade afforded these children optimal development in discourse competence (as well as optimal comprehension abilities and school content learning), but not in grammatical competence or in sociolinguistic competence for aspects that demanded grammatical means (as opposed to formulaic means) for their realization, such as the French choice between vous/tu (formal/informal ‘you’) and the use of conditional as a politeness marker (Swain, 1985). She concluded that the missing element in this school immersion context was sufficient opportunities for the children to actually use the language in meaningful ways, through speaking and writing.

Comprehension does not usually demand the full processing of forms. During comprehension (e.g. when children read textbooks and listen to teacher explanations in school), it is possible to get the gist of messages by relying on key content words aided by knowledge of the world, contextual clues, and guessing. For example, in yesterday I walked three miles, we may hear ‘yesterday’ and not even need to hear the morpheme –ed in order to know our interlocutor is telling us about something that happened in the past. By the same token, reliance on this kind of lexical processing is less possible during production, because the psycholinguistic demands of composing messages force speakers to use syntactic processing to a much greater extent. Thus, Swain proposed that ‘producing the target language may be the trigger that forces the learner to pay attention to the means of expression needed in order to successfully convey his or her own intended meaning’ (p. 249). This is particularly true if interlocutors do not understand and push for a better formulation of the message, if learners push themselves to express their intended meaning more precisely or if the nature of what they are trying to do with words (i.e. the task) is demanding, cognitively and linguistically.

Swain’s pushed output hypothesis created a space to conceive of a competence-expanding role for production that had not been possible before the mid-1980s, when most researchers could envision a causal role in L2 learning only for comprehension. Even to this date, many scholars view production as merely useful in building up fluency (e.g. de Bot, 1996; VanPatten, 2004). Yet, a focus on pushed output allows for the possibility that production engages crucial acquisition-related processes (Izumi, 2003; see section 4.9). Optimal L2 learning must include opportunities for language use that is slightly beyond what the learner currently can handle in speaking or writing, and production which is meaningful and whose demands exceed the learner’s current abilities is the kind of language use most likely to destabilize internal interlanguage representations. By encouraging risk-full attempts by the learner to handle complex content beyond current competence, such conditions of language use may drive learning.

5. Attention:

Attention to formal detail in the input seemed to be missing and perhaps needed. The insights Schmidt gained from studying Wes and from a later case study of himself learning Portuguese during a five-month stay in Rio de Janeiro (Schmidt and Frota, 1986) led him to formally propose the Noticing Hypothesis in the early 1990s (best explained in Schmidt, 1995). He claimed that, in order to learn any aspect of the L2 (from sounds, to words, to grammar, to pragmatics), learners need to notice the relevant material in the linguistic data afforded by the environment. Noticing refers to the brain registering the new material, with some fleeting awareness at the point of encounter that there is something new, even if there is no understanding of how the new element works, and possibly even if there is no reportable memory of the encounter at a later time.

Since it is difficult to distinguish absence of noticing from inability to remember and report the experience of noticing at a later time, Schmidt (2001) concluded cautiously that the more L2 learners notice, the more they learn, and that learning without noticing (that is, subliminal learning), even if it exists in other domains of human learning, plays a minimal role in the challenging business of learning a new language.

The capacity to attend to the language code can be internally or externally fostered. Instances of noticing can be driven from within the learner, as when she struggles to put a sentence together and express her thoughts and in the process discovers something new. They can also be encouraged by external means, for example, through a lesson orchestrated by a teacher, a question or reaction from an interlocutor, and so on. Through such internal and external means, learners pay attention to the existence of new features of the L2 (Schmidt, 1995), become aware of locatable gaps between their utterances and those of interlocutors (Schmidt and Frota, 1986) and discover holes in what they are able to express with their given linguistic resources in the L2 (Swain and Lapkin, 1995). Thus, attention and noticing act as filters that moderate the contributions of the environment.

6. Negative Feedback:

A final benefit of the environment, not discussed so far, is that it may provide learners with information about the ungrammaticality of their utterances. When the interlocutor has the actual intention to provide such negative information, then we may want to speak of error correction. However, more often than not, it is impossible (for the researcher as much as for the parties involved in the interaction!) to decide whether the intention to correct was at work. Therefore, we will prefer the term negative feedback over error correction or the near-synonymous corrective feedback (both of which imply a clear pedagogical intention to correct) and also over negative evidence (which is used in formal linguistic discussions about what linguistic abstract information would be needed to reset certain values within the limits available in Universal Grammar; Beck et al., 1995). Negative feedback can be provided in interactive discourse orally, but it also occurs very often in writing (both in classrooms and in non-school contexts for professional, technical and creative writing) and in the context of technology-mediated communication and study.

From the perspective of cognitive-interactionist researchers, negative feedback may come about as part of negotiating meaning or form. For example, a clarification request (e.g. sorry?) is offered when intelligibility is low and meaning itself needs to be negotiated. Nevertheless, it may convey to the learner an indication, albeit a most implicit and indirect one, that some ungrammaticality is present:

At the other extreme, explicit corrections overtly focus on the form at fault and occur when a teacher clearly indicates to a student that some choice is non-target-like:

Somewhere in the middle are recasts and elicitations. Recasts occur when an interlocutor repeats the learner utterance, maintaining its meaning but offering a more conventional or mature rendition of the form. For example:

Elicitations include moves such as asking how do we say X? or directly asking the interlocutor to try again. When they occur in classrooms, the teacher may initiate an other-repetition and pause in the middle of the utterance at fault to let the student complete it correctly, as in (15):

Module 7: Cognition and SLA

Cognition refers to how information is processed and learned by the human mind (the term comes from the Latin verb cognoscere, ‘to get to know’). SLA researchers interested in cognition study what it takes to ‘get to know’ an additional language well enough to use it fluently in comprehension and production.

In this module you will learn about cognitive SLA theories and constructs that have been developed to explain the nature of second language as a form of cognition. The theories can be broadly classified into traditional information processing, which has dominated SLA theorizing and research since the mid-1980s, and emergentism, which is a development of the late 1990s that grew out of the former. A central preoccupation in SLA research on cognition is with memory and attention in L2 learning.

1. Information processing in Psychology and SLA:

Information processing emerged in the field of psychology in the 1970s, out of the so-called cognitive revolution of the late 1950s. In a nutshell, the human mind is viewed as a symbolic processor that constantly engages in mental processes. These mental processes operate on mental representations and intervene between input (whatever data get into the symbolic processor, the mind) and output (whatever the results of performance are). Performance, rather than behaviour, is a key word in information processing theories. This is because inferences about mental processes can only be made by inspecting what is observable during processing while performing tasks, rather than by inspecting external behaviour in response to stimuli, as behaviourists used to do.

Several key assumptions made by information processing psychologists have been embraced in current SLA research about cognition. First, the human cognitive architecture is made of representation and access. Second, mental processing is dual, comprised of two different kinds of computation: automatic or fluent (unconscious) and voluntary or controlled (conscious). Third, cognitive resources such as attention and memory are limited.

Information processing theories distinguish between representation (or knowledge) and access (or processing). Bialystok and Sharwood Smith (1985) used a library metaphor to explain this distinction to their SLA audience: ‘knowing what is in the library, plus how the contents are classified and related to one another, must be distinguished from retrieving desired information from the books at a given time’ (p. 105). Linguistic representation is comprised of three kinds of knowledge: grammatical, lexical and schematic or world-related. New L2 knowledge is stored in the mind and has to be accessed and retrieved every time it is needed for use in comprehension or production.

Access entails the activation or use of relevant knowledge via two different mechanisms known as automatic and controlled processing. Human cognition is supported by both automatic and controlled processing. Information processing psychologists believe that all human perception and action, as well as all thoughts and feelings, result from the interaction of these two kinds of processing.

Automatic processes require little effort and take up few cognitive resources, and therefore many automatic processing routines can run in parallel. During automatic processing, cognitive activation is triggered bottom-up by exogenous sources in the environment (something outside the processor, that is, some aspect of the data in the input or environment). By contrast, controlled processing is activated by top-down, endogenous sources (by something inside the processor, that is, by voluntary, goal-directed motivation in the individual’s mind), and it is handled by what we call the central executive.

Controlled processes therefore allow us self-regulation, but they require a lot more effort and cognitive resources than automatic processes, and thus cannot operate in parallel; they are serial. For this reason, controlled processing is subject to a bottleneck effect. When we voluntarily attend to one thing, we need to block out the rest. If several demands are competing for controlled processing, they will be prioritized and certain processes will wait in line, so to speak, while only one is being executed. This is what we call a limited capacity model of information processing. The model predicts that performance that draws on controlled processing is more variable and more vulnerable to stressors than performance that draws on automatic processing. Therefore, a widely employed method in the study of automaticity is the dual-task condition, where the researcher creates processing stress by asking participants to carry out two tasks simultaneously, a primary task and a distracting task. Under this dual-task pressure, because the distracting task draws attention away from the primary task, performance on the main task may become variable and vulnerable. If this happens, it is taken as evidence that the participant is relying on more controlled processing and therefore has not yet reached automatization on the performance called for by the primary task.

2. The Power of Practice: Proceduralization and Automaticity:

A particular kind of information processing theory, called skill acquisition theory, has been fruitful in guiding SLA efforts since the mid-1980s (e.g. Bialystok and Sharwood Smith, 1985; McLaughlin, 1987). The most influential version has been adopted from the early formulations of cognitive psychologist John Anderson’s Adaptive Control of Thought theory (Anderson, 1983), although his most recent version of the theory goes well beyond traditional information processing notions (Anderson, 2007).

Skill acquisition theory defines learning as the gradual transformation of performance from controlled to automatic. This transformation happens through relevant practice over many trials, which enables controlled processes gradually to be withdrawn during performance and automatic processes to take over the same performance. The process has been called proceduralization or automatization and entails the conversion of declarative or explicit knowledge (or ‘knowledge that’) into procedural or implicit knowledge (or ‘knowledge how’). It is important to realize that the learning of skills is assumed to start with the explicit provision of relevant declarative knowledge. Thus, L2 learners (particularly instructed learners) begin with explanations explicitly presented by their teachers or in textbooks and, through practice, this knowledge can hopefully convert into ability for use, or implicit-procedural knowledge made up of automatic routines.

How does practice work? It helps proceduralization of new knowledge by allowing the establishment and strengthening of corresponding links in long-term memory. The more this knowledge is accessed via practice, the easier it will become to access it without effort and without the involvement of the central executive at a future time. However, the power of practice is not constant over time. There is a well-known power law of learning, by which practice will at some point yield no large returns in terms of improvement, because optimal performance has been reached (Ellis and Schmidt, 1998). In addition, proceduralization is skill-specific. Therefore, practice that focuses on L2 production should help automatize production and practice that focuses on L2 comprehension should help automatize comprehension (DeKeyser, 1997). The final outcome of the gradual process of proceduralization or automatization is automaticity, which is defined as automatic performance that draws on implicit-procedural knowledge and is reflected in fluent comprehension and production and in lower neural activation patterns (Segalowitz, 2003).
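The diminishing returns described by the power law of learning can be illustrated with a small sketch. The power law of practice is commonly written as T = a · N^(−b), where T is the time needed on practice trial N. The function names and the values of a and b below are assumptions chosen for illustration only; they are not taken from Ellis and Schmidt (1998) or from any study cited in this module.

```python
def response_time(trial, a=2.0, b=0.5):
    """Predicted response time (in seconds) on a given practice trial.

    Power law of practice: T = a * trial ** (-b).
    a and b are hypothetical values, used only to show the shape of the curve.
    """
    return a * trial ** (-b)


def gain(first, last, a=2.0, b=0.5):
    """Seconds saved between two practice trials (positive = faster)."""
    return response_time(first, a, b) - response_time(last, a, b)


# Compare ten trials of practice early on with ten trials much later.
early = gain(1, 10)
late = gain(100, 110)

print(f"trials 1-10:    {early:.3f} s faster")
print(f"trials 100-110: {late:.3f} s faster")
```

Under these assumed values, the same amount of practice saves far more time at the beginning than it does later, which is the sense in which practice eventually yields no large returns.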

Two misinterpretations of skill acquisition tenets are common: (a) that automaticity is simply accelerated or speedy behaviour; and (b) that L2 learners simply accumulate rules that they practice until they can use them automatically.

2.1. An example study of the skill acquisition theory in SLA:

Please refer to the study I provided you with

3. Long Term Memory:

Long-term memory is about representation. It is virtually unlimited in its capacity and it is made of two kinds: explicit-declarative memory and implicit-procedural memory. Much of the knowledge encoded in long-term memory is explicit-declarative, that is, verbalizable and consciously recalled. Explicit-declarative memory supports recollection of facts or events, and it is served by the hippocampus in the human brain. As much knowledge, or probably more, is encoded in implicit-procedural memory. These are things that we know without knowing that we know them. Implicit-procedural memory supports skills and habit learning, and it is served by the neocortex in the human brain. Tulving proposed a further distinction: semantic and episodic memory (see Tulving, 2002). Semantic memory pertains to relatively decontextualized knowledge of facts that ‘everyone knows’. Episodic memory involves knowledge of the events in which people are personally involved or ‘the events we’ve lived through’. Episodic memory corresponds to a more recent type of memory in evolution, believed to have evolved from semantic memory. It ‘allows people to consciously re-experience past experiences’ and also to think of their future (Tulving, 2002, p. 6).

3.1. Long Term Memory and Vocabulary Knowledge:

What does it mean to remember a word? At a fundamental level, a word is established in long-term memory when the link between a form and its meaning is made. However, knowing a word means a lot more: it includes the strength, size and depth of the knowledge represented in memory.

Vocabulary knowledge strength concerns the relative ability to use a given known word productively or to recognize it passively. Thus, strength is a matter of degree of proceduralization in implicit memory.

It is typically found that learners know more words receptively than productively, particularly if they are infrequent or difficult words, and that this gap becomes smaller as proficiency develops. Within the purview of explicit-declarative memory, by contrast, is the size of the mental lexicon, which refers to the total number of words known and represented in long-term memory. Size is often related to the relative frequency with which words are encountered in the input that surrounds learners, since high-frequency words usually make it into long-term memory earlier in the learning process than low-frequency words.

Paul Nation in New Zealand has uncovered many interesting findings about vocabulary size in L2 learning (e.g. Nation and Waring, 1997; Nation, 2006). For example, in L1 a five-year-old child begins school with an established vocabulary of about 5,000 word families, and a typical 30-year-old college-educated adult ends up knowing about 20,000 word families (Nation and Waring, 1997). For L2 users, new vocabulary presents a formidable challenge. They need to learn about 3,000 new words in order to minimally follow conversations in the L2, and about 9,000 new word families if they want to be able to read novels or newspapers in the L2 (Nation, 2006).

Vocabulary depth resides in the realm of both explicit and implicit memory and refers to how well the known words are really known, that is, how elaborated, well specified and structured (or how analysed, in Bialystok’s 2001 sense) the lexical representations are.

As Meara (2007) has argued for many years now, the notion of depth of vocabulary knowledge assumes the existence in implicit long-term memory of networks of meaning-based and form-based associations across the entire mental lexicon.

4. Working Memory:

By contrast to long-term memory, which is about representation and is unlimited, working memory is about access and is limited. A simple but useful definition of working memory in SLA is offered via an example by Nick Ellis (2005): ‘If I ask you what 397 × 27 is, you do not look up the answer from long-term memory, you work it out’ (p. 338). Peter Robinson (1995) describes it as ‘the workspace where skill development begins ... and where knowledge is encoded into (and retrieved from) long-term memory’ (p. 304). In other words, we need working memory to hold information (a storage function) as well as to integrate new information with known information already encoded in long-term memory (a processing function). Working memory handles automatic and controlled processing. Importantly, thus, it is the site for the executive control, which supports controlled processing (Baddeley and Hitch, 1974), and also the site of consciousness (Baars and Franklin, 2003). As Nick Ellis (2005) explains, working memory ‘is the home of explicit induction, hypothesis formation, analogical reasoning, prioritization, control, and decision-making. It is where we develop, apply, and hone our metalinguistic insights into an L2. Working memory is the system that concentrates across time, controlling attention in the face of distraction’ (p. 337). Two characteristics help define working memory. First, unlike long-term memory, working memory is of limited capacity. A second characteristic is temporary activation. Activation is so central to working memory that Nelson Cowan, another main authority on L1 memory from the US, has obliterated the traditional distinction between long-term and working memory and suggests instead that working memory is just the part of memory that becomes activated during a processing event (Cowan, 2005).

Since memory is involved in information processing in pervasive ways, people who have better working memory capacities can learn an L2 more efficiently. Thus, working memory capacity is posited to help predict learning rate and ultimate levels of attainment in the L2.

Two observations have attracted some attention: First, it has been observed that working memory capacity is smaller in the L2, when compared to the L1. For example, in one of the first SLA studies of working memory, Harrington and Sawyer (1992) found that their 32 EFL participants’ memory performance was consistently lower in the L2 than in the L1 across a battery of memory tasks. More recently, Towell and Dewaele (2005) also found an L2–L1 lag, when the shadowing of a continuous oral passage was carried out in L2 and L1 by 12 participants, all students of French enrolled in second- or third-year levels at a university in the United Kingdom. Second, it has also been noted that, as L2 proficiency develops, this lag in working memory capacity between the L2 and L1 should become smaller. However, even less is known empirically about this widely held assumption. Without understanding better the first question of how L2 working memory functions and what constellation of forces initially makes it smaller in capacity than L1 working memory, it will be difficult to explore the second question of how L2 and L1 working memory capacity align together with increasing proficiency.

5. Techniques used to measure storage capacity:

6. Attention and L2 Learning:

Together with memory, attention is another essential component of cognition. Remember that under normal conditions simple activation of a stimulus in working memory will last for a few seconds and then fade away. Here is where attention comes in; it heightens the activation level of input in working memory, allowing it to remain there for longer through rehearsal and thus making it available for further processing and for entering long-term memory.

One main characteristic of attention is that its capacity is limited. Working memory is capacity limited, possibly because attention is (Cowan, 2001).

Because focal attention is limited, it is also thought to be selective. Only one attention-demanding processing task can be handled at a time.

A third definitional feature is that attention can be voluntary, in the sense that it can be subject to cognitive, top-down control that is driven by goals and intentions of the individual.

A fourth characteristic is that attention controls access to consciousness.

SLA research on attention has focused on the processes and outcomes of learning under three attentional conditions, which can be summarized as: incidental (i.e. learning without intention, while doing something else), implicit (i.e. learning with no intervention of controlled attention, usually without providing rules and without asking to search for rules) and explicit (i.e. learning with the intervention of controlled attention, usually summoned by the provision of rules or by the requirement to search for rules). In a nutshell, SLA researchers have asked themselves whether L2 learning is possible without intention, without attention, without awareness and without rules.

6.1. Learning without intention:

Is it possible to learn about the L2 incidentally, as a consequence of doing something else in the L2, or does all L2 learning have to be intentional? It is unanimously agreed in SLA that incidental L2 learning is indeed possible. The learning of L2 vocabulary during pleasure reading is an incidental type of learning that has been found to be possible in the L2 as well as in the L1. This is what Jan Hulstijn (2003) concludes in a seminal review. This is also what the research reviewed by Krashen (2004), as well as the more recent evidence gleaned by Marlise Horst (2005) and others (e.g. Pigada and Schmitt, 2006), shows. However, we cannot ignore the fact that intentions wax and wane and fluctuate during online processing and that, in the end, it is online attention that is at stake in a cognitive understanding of L2 learning. Furthermore, while learning without intention is possible, people learn faster, more and better when they deliberately apply themselves to learning. Thus, learning with intention remains of central importance in SLA because of its facilitative role, as Hulstijn and Laufer (2001) have argued with respect to vocabulary learning.

6.2. Learning without Attention:

Perhaps the most contested matter is whether new L2 material can be learned without attention. The debate has to do with Schmidt’s highly influential Noticing Hypothesis. The question asked is: Is detection alone sufficient for L2 learning, or is noticing necessary? Detection is defined as registration outside focal or selective attention (Tomlin and Villa, 1994), whereas noticing is defined as detection plus controlled activation into the focus of conscious attention (Schmidt, 1995).

Robinson (1995) and Doughty (2001) have argued that Nelson Cowan’s (1988, 2001, 2005) unified model of memory and attention offers a framework for envisioning this problem as one that depends on a continuum in the quality of attention (from low-level, automatic attention to high-level, controlled attention), rather than on an all-or-nothing dichotomy between unattended and attended processing. In Cowan’s model, detection that involves registration outside focal or selective attention is the kind of low-level, minimal attention typically assumed during automatic processing in information processing theories. For example, imagine we are taking a walk and our sensory storage catches a patch of green for several hundred milliseconds (Cowan, 1988). Immediate activation in long-term memory of an already existing representation that shares some fundamental feature will make us recognize it meaningfully, but pre-attentively, as a tree. This will occur without subjective experience, that is, without reaching consciousness. Any instance of language use mandatorily involves this kind of automatic, low-attentional processing (Ellis, 2002a). On the other hand, detection that goes on to involve focal or selective attention via controlled activation summons the kind of high-level, focal attention assumed during controlled processing in information processing theories. This quality of attention is thought to be accompanied by subjective experience or awareness at the time of processing. For example, we are taking a walk, our eyes catch a patch of green, we see a tree, but we also experience rapid feelings of vague pleasantness – maybe we even intuit that fall has begun. These are typical fleeting consciousness effects that range, according to US consciousness scholar Bernard Baars, from fringe conscious (the vague pleasantness, the tacit memory of signs of fall around us) to more substantial and qualitative (visual imagery or inner speech) (Baars and Franklin, 2003). But we may immediately move on to something else and forget about having seen any tree or having had this experience of fleeting consciousness. Much of language use can also involve this kind of conscious, subjective awareness, while automatic, low-attentional processing also goes on (N. Ellis, 2002a, 2005).

Which of the two extreme qualities of attention (low-level automatic detection or high-level, controlled activation) leads to learning? Or can both result in learning?

Herein lies the point of disagreement in SLA. Tomlin and Villa (1994) suggested, contra Schmidt, that detection at the periphery of focal attention is all that is necessary for L2 learning, whereas detection plus controlled activation into focal attention is facilitative of learning, but not necessary. Gass (1997) also agrees that noticing facilitates L2 learning but cannot be considered necessary. By contrast, Schmidt (1994, 2001) has maintained that detection involving peripheral attention is not enough for L2 learning, on the grounds that novel material that is attended peripherally could never be encoded in long-term memory. Instead, detection plus controlled activation into the focus of attention is needed for L2 learning: ‘what learners notice in input is what becomes intake for learning’ (1995, p. 20). Schmidt also proposes that nothing is free in L2 learning: ‘in order to acquire phonology, one must attend to phonology; in order to acquire pragmatics, one must attend to both linguistic forms and the relevant contextual features; and so forth’ (1995, p. 17).

Drawing on Cowan’s (1988) unified model of memory and attention, Robinson (1995) agreed with Schmidt that noticing is necessary for learning, but stipulated that noticing should be conceived as involving focal attention plus rehearsal, thus eschewing the vexing question of proving phenomenological awareness of the experience of noticing. Nick Ellis concedes that the Noticing Hypothesis may be right, but only if accompanied by an Implicit Tallying Hypothesis (2002a, p. 174), which imposes two provisos: (a) noticing is necessary only for new elements with certain properties that make low-attentional learning unlikely, but not for all aspects of language to be learned, and (b) noticing may be necessary only for the initial registration of such ‘difficult’ elements so as to make an initial representation in long-term memory possible, but not for subsequent encounters. This is because ‘once a stimulus representation is firmly in existence, that stimulus ... need never be noticed again; yet as long as it is attended to for use in the processing of future input for meaning, its strength will be incremented and its associations will be tallied and implicitly cataloged’ (Ellis, 2002a, p. 174). Acknowledging that it may be impossible to demonstrate zero noticing at the time of processing empirically (see discussion in section 5.12), Schmidt did shy away from his initial claim that noticing is the ‘necessary and sufficient condition for converting input to intake’ (1990, p. 129), and since then his position has been that ‘more noticing leads to more learning’ (1994, p. 18). That is, noticing is facilitative of L2 learning (see also Schmidt, 2001).

6.3. Learning without Awareness:

In its weaker form that states noticing is facilitative of L2 learning, the Noticing Hypothesis has attracted compelling support. Particularly the research programme led by Ron Leow at Georgetown University has offered ample evidence that noticing with awareness, and even more so with understanding, is facilitative of L2 learning. In these studies (e.g. Leow, 1997, 2001; Rosa and O’Neill, 1999; Rosa and Leow, 2004), think-aloud protocols are used to classify learners by their comments into: unaware, if no trace of noticing can be found in the introspective data; aware, if simple mention is made of the subjective experience of paying attention to the targets; or aware with understanding, if more abstract comments are made involving partial formulation of rules or generalizations. The results have consistently revealed better post-test scores for participants that produced verbal reports showing awareness and, even to a higher degree, understanding.

6.4. Learning without Rules:

At the heart of the SLA research programme on learning without rules is a focus on the products of implicit L2 learning: Can grammar generalizations result from experiencing L2 data without explicit knowledge being provided at the outset of the learning process? Or even without the learner actively and consciously searching to discover generalizations behind the language data she experiences?

United States psychologist Arthur Reber was the first to devote sustained effort to the study of implicit learning, which he defined as learning without rules. He pioneered an artificial grammar research paradigm in the 1960s that has been pursued by many others to this day (see Reber, 1996). In this type of experiment, participants in the implicit learning condition are asked to memorize strings of letters. This is an incidental-implicit learning condition in that: (a) participants think they are doing something (memorizing strings) that is different from what the researcher hopes they will do (extract formal regularities or rules), and (b) they are not given any explicit declarative knowledge (no rules for the artificial grammar) or any orienting towards the possibility of rules underlying the stimuli (no instructions to search for rules). When they are later asked to judge new strings as grammatical or not, they perform above chance level. This is interpreted as proof that they did indeed learn something about the artificial grammar. When requested to verbalize any rules at all, however, they are at a loss. This is interpreted as evidence that their learning has resulted in implicit (intuitive, non-verbalizable) knowledge of the artificial grammar.
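To make the artificial grammar paradigm more concrete, the sketch below shows how letter strings of this kind might be produced and checked. The grammar itself, its state names and its letters are all hypothetical and were invented for this illustration; they do not reproduce Reber's actual materials. The sketch only shows what it means for a string to be 'grammatical' with respect to a hidden set of rules that participants are never told about.

```python
import random

# Hypothetical finite-state grammar (NOT Reber's original materials).
# Each state lists the (letter, next_state) moves allowed from it.
# A string is grammatical if it leads from "S0" to "END".
GRAMMAR = {
    "S0": [("T", "S1"), ("P", "S2")],
    "S1": [("S", "S1"), ("X", "S3")],
    "S2": [("V", "S2"), ("X", "S3")],
    "S3": [("S", "END"), ("V", "END")],
}


def generate(rng):
    """Produce one grammatical string by walking the grammar at random."""
    state, letters = "S0", []
    while state != "END":
        letter, state = rng.choice(GRAMMAR[state])
        letters.append(letter)
    return "".join(letters)


def is_grammatical(string):
    """Check whether a string can be produced by the grammar."""
    state = "S0"
    for letter in string:
        if state == "END":           # extra letters after the end
            return False
        moves = dict(GRAMMAR[state])
        if letter not in moves:      # illegal letter at this point
            return False
        state = moves[letter]
    return state == "END"


rng = random.Random(0)
training_strings = [generate(rng) for _ in range(5)]  # strings to memorize
print(training_strings)
print(is_grammatical("TSSXV"))  # follows the hidden rules
print(is_grammatical("TVXS"))   # violates them
```

In an experiment of this type, participants would see only strings like those in `training_strings`; the judgment task then asks them to do intuitively what `is_grammatical` does explicitly.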

An important point of contention in interpreting these results, however, is whether learning without rules is about symbolic or associative learning. For those who, with Reber, believe we can learn without rules or awareness of rules, the proposal is that implicit (unconscious) processing leads to the abstraction of rules that are symbolically represented in the mind, only that they happen to be inaccessible to consciousness. That is, their theories of implicit learning are abstractionist and symbolic. However, increasingly more psychologists are willing to reinterpret the evidence from implicit learning studies as showing learning of underlying statistical structure, rather than learning of underlying rules (Shanks, 2005). This radical proposal has been made possible by the appearance and burgeoning of connectionist and associative theories in psychology since the 1980s.

In the end, then, is L2 learning possible without rules? Robinson concludes that, in the absence of rules, low-level associative learning that draws on data-driven processes supported by memory is certainly possible. Learning without rules leads to the formation of memories of instances that can be accessed more easily, allowing for faster performance, but without knowledge that can be generalized to new instances. That is, without the initial provision of rules (without an explicit learning condition), learning is bottom-up (i.e. data and memory driven), and it does not lead to knowledge of a systematic rule of some kind. With rules, learning proceeds by drawing on high-level attention and conceptually driven processes supported by conscious attention, resulting in generalization with awareness.

The reconceptualization of implicit learning as statistical learning is just one of many consequences of a wider trend in cognitive psychology to reconceptualize information processing as an associative, probabilistic, rational, usage-based, grounded, dynamic and, in sum, emergent adaptation of the agent to the environment.

7. An Emergentist account of SLA:

Emergentism refers to a contemporary family of theories in cognitive science that have coalesced out of increasingly critical examinations of the tenets of information processing theories.

In a manifesto of emergentism in SLA, Ellis and Larsen-Freeman summarized the position as follows:

“Emergentists believe that simple learning mechanisms, operating in and across the human systems for perception, motor-action and cognition as they are exposed to language data as part of a communicatively-rich human social environment by an organism eager to exploit the functionality of language, suffice to drive the emergence of complex language representations.” (2006, p. 577)

Three important tenets on which emergentist approaches build are associative learning, probabilistic learning and rational contingency (Ellis, 2006a). From these three principles derive the ‘simple learning mechanisms’ to which Ellis and Larsen-Freeman (2006) refer in the quote above. Associative learning, as we saw in section 5.15, means that learning happens as we form memories of instances or exemplars we experience in the input, in a process of automatic extraction of statistical information about the frequency and sequential properties of such instances. Ellis (2006a) explains that the human architecture of the brain is neurobiologically programmed to be sensitive to the statistical properties of the input and to learn from them.

When processing stimuli, the brain engages in a continuous and mandatory (as well as implicit, in the sense of automatic and certainly unconscious) tally of overall frequency of each form and likelihood of co-occurrence with other forms. This statistical tallying is supported by neural structures in the neocortex (Ellis, 2006b). Probabilistic learning posits that learning is not categorical but graded and stochastic, that is, it proceeds by (subconscious) guesswork and inferences in response to experience that always involves ambiguity and uncertainty (Chater and Manning, 2006). However, this kind of probabilistic calculation is not a slave of whatever is experienced by the human brain as a contiguous temporal or spatial surface pattern. Instead, the probability calculations of the human mind are guided by principles of rational contingency, or automatically computed expectations of outcomes on the basis of best possible evidence (Chater and Manning, 2006; Ellis, 2006b). Specifically, the processor makes best-evidence predictions about outcomes based upon (a) the overall statistics extracted by accumulated experience, (b) the most recent relevant evidence, (c) attention to cues detected to be present and (d) the clues provided by the context (Ellis, 2002a, 2006a, 2006b, 2007). Each time the outcome is confirmed or not in another relevant event, the processor adjusts to the new evidence and modifies its prediction so its predictive accuracy is better next time.
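The idea of a tally of the overall frequency of each form and its likelihood of co-occurrence with other forms can be illustrated with a very small sketch. The input utterances and the function names are hypothetical, and a simple count over adjacent words is only an assumption made for illustration; it is not a model proposed by Ellis or by Chater and Manning.

```python
from collections import Counter


def tally(utterances):
    """Count how often each word occurs and how often each adjacent pair occurs."""
    form_counts, pair_counts = Counter(), Counter()
    for utterance in utterances:
        words = utterance.split()
        form_counts.update(words)
        pair_counts.update(zip(words, words[1:]))
    return form_counts, pair_counts


def transition_probability(first, second, pair_counts):
    """Estimate how likely `second` is to follow `first`, given the tallies."""
    total = sum(n for (word, _), n in pair_counts.items() if word == first)
    if total == 0:
        return 0.0
    return pair_counts[(first, second)] / total


# Hypothetical input that a learner might be exposed to.
INPUT = ["the dog runs", "the dog sleeps", "the cat runs", "a dog barks"]

forms, pairs = tally(INPUT)
print(forms.most_common(3))
print(transition_probability("the", "dog", pairs))
print(transition_probability("the", "cat", pairs))
```

Each new utterance added to the input changes the counts, so the estimated probabilities are adjusted with experience, in the spirit of the best-evidence predictions described above.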

Additional important tenets in the emergentist family of theories are perhaps broader in scope. One is usage-based learning, or the position that language use and language knowledge are inseparable, because we come to know language from using it. Hence the specification in the earlier quote by Ellis and Larsen-Freeman (2006) that learning from exposure comes about ‘as part of a communicatively-rich human social environment’ and is experienced ‘by an organism eager to exploit the functionality of language’ (p. 577). Among others, US cognitive scientist Michael Tomasello, now at the Max Planck Institute in Germany, has been instrumental in advancing a view of language acquisition that is usage-based, where grammar concepts emerge out of communicative and social needs: ‘people construct relational and semantic categories in order to make sense of the world and in order to communicate with one another’ (Abbot-Smith and Tomasello, 2006, p. 282). Importantly, this commitment to usage-based learning means that two traditional distinctions in linguistics and information processing, respectively, are transcended: competence and performance, and representation and access. Furthermore, meaning (rather than rules) is held to be of primary importance in understanding the language faculty. For this reason, the linguistic schools that best suit the emergentist project are cognitive linguistics (Langacker, 2008) and corpus linguistics (Gries, 2008).

Another broad-scope tenet of emergentism is that cognition is grounded, and therefore language is too. By this, it is meant that our species’ experience in the world and the knowledge that we abstract from such experience is always structured by human bodies and neurological functions (Evans et al., 2007; Barsalou, 2008). This is why Ellis and Larsen-Freeman (2006) described learning mechanisms as ‘operating in and across the human systems for perception, motor-action and cognition’ (p. 577). Perception and action, and not only abstract or symbolic information, are believed to shape cognition (Wilson, 2002). Hence, perceptual and sensory-motor functions of the brain must also be implicated in language acquisition. They contribute to the emergence of language abstractions and they also constrain and guide many of the simple learning mechanisms of associative, probabilistic and rational contingent learning.

The final tenet that is worth highlighting in this synoptic examination of emergentism is that language acquisition, like the acquisition of other forms of cognition, is a self-organizing dynamical system. This entails viewing the phenomenon to be explained (e.g. language learning) as a system (or ecology) composed of many interconnected parts that self-organize on the basis of multiple influences outside the system; these influences provide constraints that afford self-organization, but no single cause has priority over others.
