Post on 27-Dec-2015

Page 1: ENG 626 CORPUS APPROACHES TO LANGUAGE STUDIES lexico-grammatical profiles Bambang Kaswanti Purwo bkaswanti@atmajaya.ac.id

ENG 626 CORPUS APPROACHES TO LANGUAGE STUDIES

lexico-grammatical profiles

Bambang Kaswanti Purwo bkaswanti@atmajaya.ac.id

[Hunston Ch. 1]

» corpus by itself
  ▪ can do nothing at all
  ▪ is just a store of used language

» corpus access software
  ▪ re-arranges the store
  ▪ enables observations of various kinds to be made
  ▪ processes data from a corpus in three ways, showing:
    ۰ frequency
    ۰ phraseology
    ۰ collocation

[O’Keeffe 1.5]

basic corpus linguistic techniques using standard software such as WordSmith Tools (Scott 1999) and MonoConc Pro (2000)

▪ concordancing
▪ word frequency counts
▪ key word analysis
▪ cluster analysis

» concordancing
  ▪ a core tool in CL
  ▪ finds every occurrence of a particular word or phrase (the search word or phrase is the “node”)
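A concordancer can be sketched in a few lines. The version below is only an illustration, not how WordSmith Tools or MonoConc Pro work internally: the function names `tokenize` and `concordance`, the `width` parameter, and the sample sentence are all assumptions made for this example, and it handles a single-word node only.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())

def concordance(tokens, node, width=4):
    """Return one (left context, node, right context) triple per occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

# Invented sample text, for illustration only
sample = "She is interested in music. It is interesting to see what he is interested in."

for left, node, right in concordance(tokenize(sample), "interested", width=3):
    # Right-align the left context so the node words line up in a column (KWIC layout)
    print(f"{left:>25}  [{node}]  {right}")
```

Aligning the node in a central column is what lets recurrent patterns to its left and right become visible when many lines are read together.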

» word frequency counts or wordlists
  ▪ rapid calculation of word frequency lists (wordlists) for any batch of texts
  ▪ with a rank ordering of all the words in order of frequency
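A ranked wordlist for a batch of texts might be computed roughly as follows. This is a minimal sketch: the function name `wordlist`, the simple regular-expression tokenizer, and the two sample texts are assumptions for illustration.

```python
import re
from collections import Counter

def wordlist(texts):
    """Count word frequencies across a batch of texts and return (rank, word, frequency) rows."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    # most_common() sorts by descending frequency; ranks start at 1
    return [(rank, word, freq)
            for rank, (word, freq) in enumerate(counts.most_common(), start=1)]

# Invented sample batch, for illustration only
batch = ["the cat sat on the mat", "the dog sat"]

for rank, word, freq in wordlist(batch)[:3]:
    print(rank, word, freq)
```

Words with equal frequency receive consecutive ranks here; a fuller tool might instead give tied words the same rank.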

» key word analysis
  ▪ identifies words whose frequency is unusually high in comparison with some norm
  ▪ key words are not usually the most frequent words in a text, but the more “unusually frequent” ones
  ▪ a useful way of characterizing a text or a genre
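"Unusually high in comparison with some norm" can be made concrete with a keyness statistic. The slides do not say which statistic is intended; the sketch below assumes log-likelihood, one commonly used choice, and compares a study text against a reference corpus. The function names, the toy token lists, and the requirement that both corpora are non-empty are assumptions for illustration.

```python
import math
from collections import Counter

def log_likelihood(a, b, c, d):
    """Keyness of a word occurring a times in a study corpus of c tokens
    and b times in a reference corpus of d tokens."""
    e1 = c * (a + b) / (c + d)  # expected frequency in the study corpus
    e2 = d * (a + b) / (c + d)  # expected frequency in the reference corpus
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / e1)
    if b > 0:
        ll += b * math.log(b / e2)
    return 2 * ll

def keywords(study_tokens, ref_tokens, top=5):
    """Rank words that are relatively more frequent in the study corpus than in the reference."""
    study, ref = Counter(study_tokens), Counter(ref_tokens)
    c, d = len(study_tokens), len(ref_tokens)
    scored = []
    for word, a in study.items():
        b = ref[word]
        if a / c > b / d:  # keep only positively key words
            scored.append((word, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top]

# Invented toy corpora, for illustration only
study = "the corpus shows the corpus evidence".split()
reference = "the cat and the dog and the bird saw the man".split()

for word, score in keywords(study, reference):
    print(f"{word:10} {score:.2f}")
```

In this toy data "the" is the most frequent word in the study text but is not key, because it is just as common in the reference; "corpus" is less frequent but scores highest.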

» cluster analysis
  ▪ shows how language systematically clusters into combinations of words or “chunks” (e.g. I mean, this that and the other, etc.)
  ▪ shows how this contributes to the description of the vocabulary of a language (to help learners acquire vocabulary and develop fluency)
  ▪ covers 2-, 3-, 4-, 5-, or 6-word combinations

[key word analysis]
▪ potential applications in the areas of
  ۰ forensic linguistics
  ۰ stylistics
  ۰ content analysis
  ۰ text retrieval
▪ [in ELT] to create word lists (LSP programs)
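Cluster analysis amounts to counting recurrent contiguous word sequences (n-grams). A minimal sketch follows, assuming a hypothetical `clusters` function, a `min_freq` cut-off of 2, and an invented token sequence.

```python
from collections import Counter

def clusters(tokens, n, min_freq=2):
    """Count contiguous n-word sequences and keep those that occur at least `min_freq` times."""
    grams = Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return [(" ".join(gram), freq) for gram, freq in grams.most_common() if freq >= min_freq]

# Invented sample, for illustration only
tokens = "i mean it was fine i mean you know it was fine".split()

print(clusters(tokens, 2))  # recurrent 2-word clusters
print(clusters(tokens, 3))  # recurrent 3-word clusters
```

Running the same function with n from 2 to 6 gives the range of cluster lengths mentioned above.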

lexico-grammatical profiles
[when looking at concordance lines] create a “lexico-grammatical profile” of a word and its contexts of use

1. collocates
  ▪ which word(s) occur most frequently, with statistical significance, in the word’s environment?

2. chunks/idioms
  ▪ does the word form part of any recurrent chunks?
  ▪ is the word idiom-prone?
  ▪ what types occur (e.g. binomials or trinomials)? (rough and ready; ready, willing and able)

3. syntactic restrictions
  ▪ are there syntactic patterns that restrict the word? (e.g. prepositions that go with the word)
  ▪ what is the typical clause position (initial/medial/final)?
  ▪ are there any tense/aspect restrictions?

4. semantic restrictions
  ▪ are there semantic restrictions? (e.g. applied to [+HUM] only, never with an intensifier)

5. (semantic) prosody
  ▪ words, as well as having typical collocates (e.g. blonde collocates with hair, not with car), tend to occur in particular environments: positive or negative
    ۰ 90% of the collocates of cause are negative (accident, cancer, commotion, crisis, delay)
    ۰ provide collocates with words of positive connotation (care, food, help, jobs, relief, support)
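The first question in the profile (which words occur most frequently, and more often than chance, near the node) can be approximated by counting co-occurrences in a window and scoring them. The sketch below is an assumption-laden illustration: the slides name no statistic, so it uses two association measures that are common in corpus work, MI and t-score, in one simple formulation; the function name, the window size, and the sample text are invented for this example.

```python
import math
from collections import Counter

def collocates(tokens, node, window=4, min_cooc=2):
    """Score words that co-occur with `node` within +/- `window` tokens.

    Returns (word, observed co-occurrences, MI, t-score) rows sorted by t-score.
    """
    n = len(tokens)
    freq = Counter(tokens)
    cooc = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            span = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            cooc.update(span)
    rows = []
    for word, observed in cooc.items():
        if observed < min_cooc or word == node:
            continue
        # Co-occurrences expected by chance, given both frequencies and the span size
        expected = freq[node] * freq[word] * 2 * window / n
        mi = math.log2(observed / expected)
        t_score = (observed - expected) / math.sqrt(observed)
        rows.append((word, observed, round(mi, 2), round(t_score, 2)))
    return sorted(rows, key=lambda row: row[3], reverse=True)

# Invented sample, for illustration only
tokens = "strong tea and strong tea with weak coffee and strong tea".split()

for row in collocates(tokens, "tea", window=1):
    print(row)
```

Once collocates are listed in this way, a prosody such as the negative one reported for cause can be assessed by inspecting what share of them carry a negative or positive connotation; that judgement is made by the analyst, not by the code.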

[O’Keeffe Ch. 3]

traditional view of vocabulary:
vocabulary = all the single words of a language

over the years, in the light of corpus analysis:
the criteria have been opened up to search for recurrences of more than one word (i.e. pairs and trios of words, and even larger groupings)

▪ “chunks” like a couple of, at the moment, all the time are as frequent as single words (possible, alone, fun, expensive)

▪ single words have been widely considered to be the basic unit
  units of more than one word (phrasal verbs, compounds, idioms) are associated with a higher level of proficiency
  exceptions:
  ۰ greetings and everyday expressions: how are things?, see you tomorrow, thanks very much

  ۰ specialized functional phrases: Happy New Year, good luck
  ۰ common prepositional phrases: at the weekend, on the first of May
  ۰ high-frequency compounds: bus stop, whiteboard

» collocation
  ۰ groupings of more than one word with unitary meaning and specialized functions
  ۰ the statistical tendency of words to co-occur (Hunston 2002:12)
  ۰ collocations are not absolute or deterministic, but are probabilistic events, resulting from repeated combinations used and encountered by the speakers of any language (e.g. strong tea, powerful cars)
  ۰ common verbs display distinct preferences for what they combine with:
    things turn or go grey, brown, white
    people go (*turn) mad, insane, bald, blind

» strings of words in corpora
  ▪ CL: it is lexis, rather than syntax, which accounts for the organization and patterning of language

  ▪ two fundamental principles at work in the creation of meaning:
    ۰ the “idiom principle”
    ۰ the “open choice principle”

  ▪ syntax, the slots where there are choices to be made (the open choice principle), is far from being primary; it is only brought into service occasionally, as a kind of “glue” to cement the lexical chunks together

  ▪ form and meaning work hand in hand
    Cambridge International Corpus [100 examples of be touched by]:
    ۰ 14% ‘experience physical contact’
    ۰ 86% nonphysical meaning, 80% of which ‘emotionally affected by’
    touch [+passive]: nonphysical senses

» phraseology and idiomaticity
  ▪ contributors to the understanding of multi-word vocabulary:
    ۰ corpus linguistics
    ۰ phraseology and the study of idiomaticity (for teachers and learners)

  ▪ different terminologies to describe the phenomena of multi-word vocabulary or chunks:
    ۰ lexical phrases
    ۰ prefabricated patterns
    ۰ routine formulae
    ۰ formulaic sequences
    ۰ lexicalized stems
    ۰ chunks
    ۰ (restricted) collocations
    ۰ fixed expressions
    ۰ multi-word units/expressions
    ۰ idioms
    ۰ etc.

multi-word phenomena: a fundamental feature of language use

concordance lines – many instances of use of a word or phrase
“latent patterning” – phraseology

[Hunston Ch. 1]

phraseology vs. how teachers explain “confusing adjectives” such as interested and interesting

▪ “the minimal pair”: the boy is interested and the boy is interesting

▪ concordance lines: frequent patterns of
  ۰ interested: “someone is interested in something”
  ۰ interesting: frequently followed by a noun (“an interesting thing”) or used in patterns such as “what is interesting is …”, “it is interesting to see …”

reference books have difficulty explaining between and through

a phraseological approach:
between: frequently found after nouns such as difference, distinction, gap, contrast, conflict, and quarrel; relationship, agreement, comparison, meeting, contact, correlation

through: frequently found after verbs such as go, pass, come, run, fall, and lead
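The observation that between tends to follow certain nouns and through certain verbs rests on counting what stands immediately to the left of each word. A rough sketch, assuming a hypothetical `left_neighbours` function and an invented sentence; it counts word forms, so grouping went and goes under go would need lemmatization, which is not shown.

```python
from collections import Counter

def left_neighbours(tokens, node, top=5):
    """Count the words that occur immediately before `node`."""
    previous = Counter(tokens[i - 1] for i in range(1, len(tokens)) if tokens[i] == node)
    return previous.most_common(top)

# Invented sample, for illustration only
tokens = ("the difference between them and the gap between us grew "
          "as we went through the gate and passed through the town").split()

print(left_neighbours(tokens, "between"))
print(left_neighbours(tokens, "through"))
```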

“semantic functions”
between has a “location” meaning
  the channel between Africa and Sicily
  earnings between £5 and £6 a week
through has an “instrumental” meaning

NSs often recognize if a phraseology is unusual; to explain why that is the case is not easy

“require to be done” seems wrong to Owen’s (1996) intuition

Bank of English:
“REQUIRE to be” is found fairly frequently
the past participle that follows is a specific verb [+SPEC], not a general verb such as do
  These roses require to be pruned each spring

“require to be done” is very rare (3 out of 302)
Owen’s intuitions are backed up by the evidence of the corpus (on phraseological, not grammatical, grounds)
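A count like "3 out of 302" comes from retrieving every instance of REQUIRE + to be and tallying the word that follows. The sketch below shows the idea with a regular expression over raw text. It is not a query against the Bank of English: the pattern, the function name, and the sample sentences (apart from the roses example quoted above) are invented for illustration.

```python
import re
from collections import Counter

# Any form of REQUIRE, then "to be", then the next word (captured)
PATTERN = re.compile(r"\brequir(?:e|es|ed|ing)\s+to\s+be\s+(\w+)", re.IGNORECASE)

def participles_after_require(text):
    """Count the words that directly follow REQUIRE + 'to be'."""
    return Counter(match.group(1).lower() for match in PATTERN.finditer(text))

# First sentence from the slides; the other two are invented
sample = ("These roses require to be pruned each spring. "
          "The forms required to be signed. "
          "It requires to be done.")

counts = participles_after_require(sample)
total = sum(counts.values())
print(counts)
print(f"'done' accounts for {counts['done']} of {total} instances")
```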

What’s the contribution of the NS’s intuition?
make generalizations from a mass of specific information in a corpus
e.g. Bank of English: CONTACT – verb + noun (Sripicharn 1998)

• typically used with “official” persons (office, newspaper, etc.): contact your travel agent
• also found when the person is a family member or a friend: she had no contact with her father

the difference between the two kinds of noun (travel agent and father) is important (Sripicharn 1998)

REFERENCES

Hunston, Susan. 2002. Corpora in Applied Linguistics. Cambridge UP.
O’Keeffe, Anne; Michael McCarthy; Ronald Carter. 2007. From Corpus to Classroom: Language Use and Language Teaching. Cambridge UP.