lexical frequency & esp(wk5)
7/25/2019 Lexical Frequency & ESP(Wk5)
http://slidepdf.com/reader/full/lexical-frequency-espwk5 1/22
Engineering English: A lexical frequency instructional model
Olga Mudraya
Department of Linguistics and Modern English Language, Lancaster University, Lancaster LA1 4YT, UK
Abstract
This paper argues for the integration of the lexical approach with a data-driven corpus-based
methodology in English teaching for technical students, particularly students of Engineering.
It presents the findings of the author's computer-aided research, which aimed to
establish a frequency-based corpus of student engineering lexis. The Student Engineering English
Corpus (SEEC), reported here, contains nearly 2,000,000 running words reduced to 1200
word families or 9000 word-types encountered in engineering textbooks that are compulsory
for all engineering students, regardless of their fields of specialization.
The most immediate implication arising from this research is that sub-technical vocabulary
as well as Academic English should be given more attention in the ESP classroom. The paper
illustrates some sample data-driven instructional activities consistent with the lexical
approach, in order to help students acquire the so-called language prefabs, or formulaic
multi-word units/collocations, for technical and non-technical uses. The integration of the lexical
approach with a corpus linguistic methodology can enrich the learners' language experience
and raise their language awareness, bringing out the researcher in them.
© 2005 The American University. Published by Elsevier Ltd. All rights reserved.
1. Introduction
In recent years, corpus linguistics has come together with language teaching by
recognizing the importance of language corpora as a basis for acquiring facts about
0889-4906/$30.00 2005 The American University. Published by Elsevier Ltd. All rights reserved.
doi:10.1016/j.esp.2005.05.002
E-mail address: [email protected].
www.elsevier.com/locate/esp
English for Specific Purposes 25 (2006) 235–256
the language to be learned and sharing a larger, ‘‘chunkier’’ view of language (Johns,
1991; McEnery & Wilson, 1997; Murison-Bowie, 1996). The availability of language
corpora to language learners and teachers offers promising opportunities in learning
a language, allowing learners to set up and carry out their own language analyses
with the help of computer concordancing programs that are aimed at identifying collocations,
or word partnerships, in which certain words co-occur in natural text with
greater than random frequency.
The lexical approach to language teaching and learning (Lewis, 1993; Nattinger &
DeCarrico, 1992; Willis, 1990; overview in Moudraia, 2001) is similarly directed at
teaching collocations. It makes a particular distinction between vocabulary, tradi-
tionally understood as a stock of individual words with fixed meanings, and lexis
which takes into account not only single words but also word combinations that
we store in our mental lexicons ready for use.
This paper aims to show how the integration of the lexical approach with a cor-
pus-based methodology in teaching English for Specific Purposes (ESP), especially
Engineering English, can improve the way ESP is taught. My particular point here
is to demonstrate how a technical student can benefit from the data-driven lexical
approach. The examples will be taken from my Student Engineering English Corpus
(SEEC) of nearly 2,000,000 running words (Moudraia, 2003, 2004), which was built
with the purpose of establishing a representative corpus of Student Engineering Eng-
lish that reflects the lexis encountered in compulsory textbooks for engineering stu-
dents, regardless of their fields of specialization.
2. Corpus linguistics and ESP language learning
The lexical approach argues that language consists of chunks which, when com-
bined, produce continuous coherent text, and that only a minority of spoken sentences
are entirely novel creations. The existence and importance of formulaic multi-word
units has been pointed out by many linguists. Bolinger (1976) called them ‘‘the prefabs
of language’’ before Sinclair (1987, 1991) put forward the notion of the idiom principle
as a clear methodological grounding for viewing collocation, arguing that words do not
occur at random in a text. On the contrary, ‘‘a language user has available to him or her
a large number of semi-preconstructed phrases that constitute single choices, even
though they might appear to be analysable into segments’’ (Sinclair, 1991, p. 110).
Corpus linguistics is a methodology which can be described as a study of natural
language on examples of real life language use via a corpus (McEnery & Wilson,
2001), defined as a body of text that is representative of a particular variety of lan-
guage and is stored on a computer. The availability of language corpora to language
learners and teachers adds a fresh dimension to the criteria for success in learning a
language. With data-driven learning (Johns, 1991), the data are primary, the teacher
has a new role as coordinator of research, and learners become research workers in
control of their learning process.
In particular, concordancing programs for computerized text analysis can be used
very productively in and outside the ESP classroom. A concordance is ‘‘a collection
236 O. Mudraya / English for Specific Purposes 25 (2006) 235–256
of the occurrences of a word-form, each in its textual environment’’ (Sinclair, 1991,
p. 32). Language teachers can use concordancers to produce vocabulary exercises to
help their students understand word partnerships. The concordance data can make
language facts more explicit by isolating common patterns in authentic language
samples, the point of a concordance being to present abundant examples of a word
in its usual contexts. By seeing the contexts and collocates, the learners can get a
much better idea of the use of the word than they would achieve by merely looking
it up in the dictionary. Furthermore, by drawing students' attention to collocates of
the keyword, concordance-based study has considerable potential for expanding
student vocabulary. Essentially, keywords are the words which are most unusually
(or outstandingly, in Scott's (1997) terms) frequent in a given body of text compared
with their frequency in a reference corpus.
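To make the notion of a concordance concrete, the following is a minimal keyword-in-context (KWIC) sketch in Python. It is not the concordancing software used in this study: the function name `kwic`, the crude tokenisation rule, the window size and the sample sentences are all illustrative assumptions.

```python
import re

def kwic(text, keyword, window=4):
    """Return keyword-in-context lines: `window` tokens either side of each hit."""
    # Crude tokenisation (an assumption): runs of letters, digits, apostrophes or hyphens.
    tokens = re.findall(r"[A-Za-z0-9'-]+", text)
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            # Right-align the left context so the keyword forms a column.
            lines.append(f"{left:>30} [{tok}] {right}")
    return lines

# Invented sample text, for illustration only.
sample = ("The problem may be solved graphically. An aqueous solution of the salt "
          "was prepared. The solution of the problem is straightforward.")
for line in kwic(sample, "solution"):
    print(line)
```

Aligning the keyword in a column is what lets a learner scan the left and right collocates at a glance.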
McEnery and Wilson (2001, p. 121) identify ESP as a particular domain-specific
area of language teaching and learning, where ‘‘corpora can be used to provide many
kinds of domain-specific material for language learning, including quantitative ac-
counts of vocabulary and usage which address the specific needs of students in a par-
ticular domain more directly than those taken from more general language corpora’’.
In professional domains, various corpora are being built. Most of them are of finite
size, with the exception of so-called monitor corpora – open-ended collections of
texts, to which new texts are being constantly added until the corpora ‘‘will get
too large for any practicable handling, and will be effectively discarded’’ (Sinclair,
1991, p. 25).
The largest professional corpus currently under development is the Corpus of Professional English
(http://www.perc21.org/cpe_project/index.html). It is being developed in collaboration
between the Professional English Research Consortium (PERC), Japan, and
Lancaster University, UK. When finished, it will consist of a 100-million-word data-
base of English used by professionals in science, engineering, technology and other
fields. Also, a monitor engineering corpus of several million words representing
the English used by engineers in over 355 professional engineering organizations
has been steadily growing at the University of Aizu in Japan (Orr & Takahashi,
2002).
However, for language learning and teaching, smaller corpora can be more useful 1
as they are designed to represent the specific part of the language under investigation
and are tailored to address the aspects of the language relevant to the needs of the
learner. Furthermore, smaller corpora are more manageable, allowing easier and faster
access to language data. Some examples of smaller technical corpora designed for
language learners are Peter Roe's Corpus of Scientific English comprising 280,000
running words, i.e., tokens (cited in Yang, 1986), the 400,000-token Guangzhou
Petroleum English Corpus or GPEC (Qi-bo, 1989), the Jiaotong Daxue English of
Science and Technology (JDEST) Corpus (Yang, 1985a, 1985b, 1986) and the Hong
Kong University of Science and Technology (HKUST) Computer Science Corpus
(James, Davidson, Heung-yeung, & Deerwester, 1994) of 1,000,000 tokens each, as
well as my Student Engineering English Corpus (SEEC) of nearly 2,000,000 tokens
(Moudraia, 1999, 2003, 2004).
1 See Ghadessy, Henry, and Roseberry (2001) for applications of smaller corpora to language teaching.
All these corpora are largely based on textbook selections although they are quite
different in design and have different objectives. For example, the JDEST was created
mainly to monitor language teaching materials in order to learn ‘‘how well
the materials which have been developed for the learners of English are representing
the authentic materials they are going to read in the future’’ and also possibly to pro-
vide some knowledge on the productivity of different multi-word term patterns
(Yang, 1986, p. 103). The authors also hoped that the JDEST might be used for syntactic
and discourse study of EST (Yang, 1985b, p. 95). Peter Roe's Corpus of Scientific
English was used for the automatic identification of scientific/technical terms
(Yang, 1986, p. 97).
The purpose of building the GPEC was threefold: firstly, to get to know more
about the features of Petroleum English; secondly, to provide teachers and learners
with a series of vocabulary lists; and finally, to gain some empirical knowledge in
developing a model for processing a medium-sized corpus on a microcomputer
(Qi-bo, 1989, p. 28). The HKUST had two principal objectives: (i) an empirical
determination of the nature of the comprehension problems of Chinese-speaking
undergraduate students in listening and reading in English for academic purposes;
and (ii) the development of materials to enhance listening and reading skills, in-
formed by the findings of empirical enquiries (James et al., 1994, p. 3). The SEEC
had three primary aims: (a) to establish a representative corpus of Student Engineering
lexis; (b) to provide teachers and learners with a word list that could serve as the
lexical syllabus foundation of English for Engineering; and (c) to explore the syntactical,
morphological, lexical, and discursive features of Engineering English (Moudraia,
2004, p. 142).
Despite their different purposes, all these corpora have led to the production of
vocabulary lists and lexical syllabuses for ESP/EST courses at tertiary level.
Exploitation of the findings using concordancing software was another outcome
of some of these projects. For example, an automatic monitoring and collecting
system of scientific/technical terms, in which new terms collected could be sup-
plied with concordances, was envisioned by the JDEST developers (Yang, 1986,
pp. 102–103).
I believe that concordancing is an indispensable tool in course design. In this paper,
I will be exploring the issue of technical/sub-technical/non-technical vocabulary
with examples from the SEEC. In the fifth section of this article, I have included
some data-driven instructional activities based on concordance samples from the
SEEC that are aimed at helping students acquire language prefabs for technical
and non-technical uses in the specialist context.
3. Technical/sub-technical/non-technical vocabulary
The division between technical and non-technical vocabulary is far from distinct.
Strictly technical words are characterized by the absence of exact synonyms,
resistance to semantic change, and a very narrow range; e.g., words such as urethane,
or vulcanise. Some researchers (Baker, 1988; Cowan, 1974; Flowerdew, 1993; Trimble,
1985), however, distinguish a third category – so-called sub-technical vocabulary,
a class of words that stand between technical and non-technical words. These are lexical
items with technical as well as non-technical senses, e.g. iron, force, stress, current,
tension, strength, etc., which have the same meaning in several technical
disciplines. As Baker (1988, p. 91) noted, the term sub-technical covers ‘‘a whole
range of items that are neither highly technical and specific to a certain field of
knowledge nor obviously general in the sense of being everyday words which are
not used in a distinctive way in specialised texts’’.
In addition, according to Yang (1986), sub-technical words are identified by their
frequency and distribution as well as their collocational behaviour. Yang's statistical
analysis has shown that sub-technical words have very high distribution across all
specialized fields; however, their frequency of occurrence is lower than that of func-
tion words. Both function words and sub-technical words are characterized by fairly
low peakratio (i.e., the maximum frequency of occurrence divided by the average fre-
quency) and rangeratio (i.e., the maximum frequency divided by the minimum fre-
quency). On the other hand, technical terms have very low distribution but very
high peakratio and rangeratio (Yang, 1986, p. 98). Even so, a sub-technical word
might also be a term in a specific field if it suddenly shows a peak frequency in that
field. In view of this, I will also be examining whether the most frequent words in the
SEEC are indeed technical or non-technical/sub-technical.
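The two ratios defined above can be written out directly. The following Python sketch assumes a word's frequency has already been counted separately in each specialized field; the function names and the per-field figures are hypothetical and are not taken from Yang (1986).

```python
def peak_ratio(freqs):
    """Maximum frequency divided by the average frequency across fields."""
    return max(freqs) / (sum(freqs) / len(freqs))

def range_ratio(freqs):
    """Maximum frequency divided by the minimum frequency across fields."""
    lowest = min(freqs)
    # A word absent from some field has no finite range ratio.
    return float("inf") if lowest == 0 else max(freqs) / lowest

# Hypothetical frequencies of two words in four fields.
evenly_spread = [12, 10, 11, 9]   # profile expected of a sub-technical word
concentrated = [40, 1, 1, 2]      # profile expected of a technical term

print(peak_ratio(evenly_spread), range_ratio(evenly_spread))
print(peak_ratio(concentrated), range_ratio(concentrated))
```

A word spread evenly across fields yields ratios close to 1, whereas a word concentrated in one field yields much larger values, which is the contrast described above.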
4. The Student Engineering English Corpus
4.1. Rationale
The goal of the project was to develop a reliable lexical syllabus for engineering
students in order to meet the objectives of English teaching for Engineering at Walailak
University in Thailand,2 where I had worked for nearly seven years. We were in
a situation quite common in Southeast Asia: lectures in most subjects were delivered
in a local language (Thai, in this case) whilst textbooks were in English. That is why,
in order to build a representative corpus of Student Engineering English, I selected
English-language textbooks, 13 in total, used in basic engineering disciplines
(BED). By BED, I mean those disciplines which are compulsory for all engineering
students regardless of their fields of specialization. At Walailak University, these
were Engineering Mechanics, Engineering Materials, Mechanics of Materials,
Mechanics of Fluids, Thermodynamics, Electrical Engineering, Engineering Draw-
ing, Manufacturing Process and Computer Programming. The main criterion for
selection was that the textbooks were recommended for engineering students, who
had to read them in English.
2 The project was supported by a small Grant # 970112 from the Walailak University Research Council.
4.2. Procedures
The main stages in the project included gathering a text corpus, putting it into machine-readable
form, conducting a computational analysis of the material, and building
a word list.3 Whole texts were used in the SEEC, as opposed to the text extracts
used in most other smaller technical corpora designed for language
learners (e.g., GPEC, JDEST and HKUST). In corpus construction, whole texts
are preferable to text extracts wherever possible, as this frees the researcher from
concerns about the validity of sampling techniques; moreover, a corpus made up
of whole documents is open to a wider range of linguistic studies than a collection
of short samples (Sinclair, 1991, p. 19). The SEEC is composed of thirteen text files,
details of which are presented in Table 1. The collected material formed a corpus of
about 2 million tokens and over 18,000 word-types, analysed with the help of the
WordSmith Tools software (Scott, 1996).
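The type/token ratios reported in Table 1 are consistent with the number of types expressed as a percentage of the number of tokens. The short Python check below recomputes three of them from the published counts; the function name is an assumption. The standardised ratio is not reproduced, since it depends on WordSmith Tools settings that are not stated here.

```python
def type_token_ratio(types, tokens):
    """Types as a percentage of tokens, rounded to two decimals."""
    return round(100 * types / tokens, 2)

# (types, tokens) copied from Table 1.
table1 = {
    "Overall": (18203, 1986595),
    "Manufact.txt": (10082, 290782),
    "Draw.txt": (3030, 64846),
}

for name, (types, tokens) in table1.items():
    print(name, type_token_ratio(types, tokens))
```

The overall ratio is far lower than that of any single file because, as a text grows, new tokens increasingly repeat types that have already been seen.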
4.3. Word list organization
The entries in the resulting word list were organized by word families. The lemmatisation
process reduced the number of entries to about 7700, which were treated
according to the cumulative frequency of occurrence of the members of the word
families, and the most frequent word families (with the sum total of 100 occurrences,
or 0.005%) were selected. As a result, the 1260 most frequent word families comprising
8850 words were included in the Student Engineering Word List which can serve
as the foundation for an Engineering English lexical syllabus.
3 This step required permission from the publishers for the electronic use of their texts. My
acknowledgements go to McGraw-Hill Australia (permission dated October 12, 1998), McGraw-Hill
Companies, Inc. (permission dated December 1, 1998), Brooks/Cole Publishing Company (Grant No. G-09857,
November 17, 1998) and Addison Wesley Longman Limited (ref. AP/2743, November 25, 1998) for
their permission to store their texts in an electronic format in order to create a word list.
Table 1
The structure of the SEEC
N        Text file          Bytes       Tokens     Types   Type/token ratio  Standardised type/token ratio
Overall                     11,694,812  1,986,595  18,203  0.92              9.85
1        Manufact.txt       1,764,178   290,782    10,082  3.47              13.72
2        Material.txt       1,444,793   232,743    7056    3.03              10.49
3        Fluidmech.txt      1,307,973   220,666    5333    2.42              9.15
4        Mechmat.txt        1,177,429   202,513    4125    2.04              7.79
5        Elec.txt           983,672     167,394    5626    3.36              10.27
6        Intofluidmech.txt  860,281     147,028    4666    3.17              9.54
7        Dynamics.txt       795,910     142,446    3205    2.25              7.07
8        Statics2.txt       710,854     127,623    4129    3.24              9.57
9        Statics1.txt       668,896     121,696    2919    2.40              6.94
10       Chemi.txt          653,622     110,812    4299    3.88              9.60
11       Graph.txt          486,152     80,804     5034    6.23              11.55
12       Pascal.txt         466,756     77,242     3124    4.04              8.54
13       Draw.txt           374,296     64,846     3030    4.67              9.05
The ‘‘word family’’ here is interpreted in the broadest sense, incorporating not
only derived and inflected forms but compound words as well, according to Level
7 of Bauer and Nation's (1993) scale. Table 2 gives an example of the word family
under the headword use which is the most frequent word family in the Student
Engineering Word List. Also, Appendix A presents the one hundred most fre-
quent entries listed by headwords (i.e., base word or the most frequent word in
the family).
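The cumulative frequency of a word family is simply the sum of its members' frequencies. The Python sketch below recomputes the figure for the family headed by use from the member counts listed in Table 2 and relates it to the overall token count in Table 1. Reading the selection cut-off as "100 occurrences or more" is an assumption, as are the names used.

```python
CORPUS_TOKENS = 1_986_595  # overall token count from Table 1
MIN_FAMILY_FREQ = 100      # cut-off reported in Section 4.3 (read here as "at least 100")

# Member frequencies copied from Table 2; "users" appears twice there and is kept as printed.
use_family = [
    ("use", 2784), ("uses", 262), ("using", 2100), ("used", 4538),
    ("useful", 341), ("usefully", 1), ("usefulness", 7), ("useless", 6),
    ("usable", 22), ("useable", 2), ("user", 149), ("users", 24), ("users", 2),
    ("usage", 39), ("reuse", 4), ("re-use", 3), ("reused", 5), ("reusable", 7),
    ("unused", 5), ("unusable", 5), ("misuse", 1), ("misusing", 1), ("misused", 1),
    ("abuse", 2), ("multiuse", 1), ("multi-user", 1),
]

def family_frequency(members):
    """Cumulative frequency of all members of a word family."""
    return sum(freq for _, freq in members)

def share_of_corpus(freq, corpus_tokens=CORPUS_TOKENS):
    """Frequency as a percentage of all tokens in the corpus."""
    return 100 * freq / corpus_tokens

total = family_frequency(use_family)
print(total, round(share_of_corpus(total), 2))      # family total and its share
print(total >= MIN_FAMILY_FREQ)                     # would the family be selected?
print(round(share_of_corpus(MIN_FAMILY_FREQ), 3))   # the cut-off as a percentage
```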
4.4. Word frequency analysis – findings
Word frequency analysis of the SEEC was carried out in comparison with the
COBUILD Bank of English Corpus and the British National Corpus (BNC). The
COBUILD Bank of English Corpus is the biggest monitor corpus of the English lan-
guage, steadily growing at Birmingham University, UK. Currently, it contains about
450,000,000 tokens; this analysis, however, is based on the 323,302,789 tokens that
COBUILD had in 2000. The BNC, developed at Lancaster University, UK in the
1990s, is the biggest finite corpus of the English language to date, containing around
100,000,000 tokens. For the analysis of the most frequent word forms, I used the
Written part of the BNC of 89,800,000 tokens.
The word frequency analysis was concerned with the most frequent word forms in
all three corpora, including the most frequent closed-class (grammatical) and open-class
(content) word forms. It has revealed, firstly, that the most frequent word forms
in all three corpora, being mainly function words, concur (Appendix B). The
correlation between the fifty most frequent closed-class word forms in the SEEC,
the COBUILD Bank of English and the BNC Written proved to be statistically significant
at the .01 level. Spearman's rank-order correlation between the fifty
most frequent closed-class word forms in the SEEC and the COBUILD Bank of
English is .778 while between the SEEC and the BNC Written it is .802.
Table 2
Use – the most frequent word family in the Student Engineering Word List
N (ABC order): 1186
N (frequency order): 1
Headword: use
Frequency: 10,313
%: 0.52
Words joined:
  use (2784: n 961, v 1823), uses (262: n 48, v 214), using (2100), used (4538);
  useful (341), usefully (1), usefulness (7);
  useless (6);
  usable (22), useable (2);
  user (149), users (24), users (2);
  usage (39);
  reuse (4: n 3, v 1), re-use (3: n 1, v 2), reused (5), reusable (7);
  unused (5 adj), unusable (5);
  misuse (1 n), misusing (1), misused (1);
  abuse (2: v 1, attrib 1);
  multiuse (1 attrib), multi-user (1 attrib)
Secondly, a comparison of the fifty most frequent open-class (content) word
forms has indicated that the content word forms in the SEEC are predominantly
from the scientific register, while the most frequent content word forms in
COBUILD and the BNC Written are of a general nature (Appendix C). Further-
more, the most frequent content word forms in the SEEC are rather infrequent in
COBUILD and the BNC Written (Appendix D). This finding supports Salager's
(1983, p. 54) observation about ‘‘those context-dependent words’’ which occur with
high frequency across different scientific disciplines but tend to be used infrequently
in general word-frequency counts.
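For readers unfamiliar with the rank-order correlation reported above for the closed-class word forms, the following Python sketch shows the textbook formula for Spearman's rho on two tie-free rankings. The ranks are invented, so the output does not reproduce the .778 or .802 values, and the handling of word forms that fall outside one corpus's top fifty is not addressed.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman's rho for two tie-free rankings of the same items."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical ranks of five function words in two corpora.
corpus_a = [1, 2, 3, 4, 5]
corpus_b = [1, 3, 2, 5, 4]

print(spearman_rho(corpus_a, corpus_b))
```

A value of 1 means the two corpora rank the words identically, and a value of -1 means the rankings are exactly reversed.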
Similarly, the most frequently encountered words in the SEEC appear to be sub-technical,
i.e., words with non-technical as well as technical senses, common in most
kinds of technical writing, which are identified by their frequency and distribution
as well as their collocational behaviour (Yang, 1986). The SEEC word frequency anal-
ysis has additionally revealed that the non-technical sense of a sub-technical lexical
item is used more frequently than its technical sense. For example, the word solution
is more commonly used in the SEEC in the non-technical sense than in the chemical
sense (Table 3), even in a Chemical Engineering Thermodynamics textbook4 (Table 4).
Finally, keyword analysis of the Student Engineering lexis, carried out with the
help of the WordSmith Tools software, has provided further support for my hypothesis
that the most frequent words in a specialist corpus are in fact sub-technical and
non-technical. Basically, keywords are the words which are most unusually frequent
in a given body of text against a reference corpus while the so-called key-keywords
(Scott, 1997) are the most frequent keywords over a number of files in the database
ensuring that these words are characteristic of the whole corpus.
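A rough sketch of both notions follows. It assumes that keyness is scored with the log-likelihood statistic, which is one of the measures WordSmith Tools offers; the setting actually used in this study is not stated, and the function names and the toy file-to-keyword mapping are hypothetical.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of a word in a study corpus against a reference corpus."""
    total = freq_study + freq_ref
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

def key_keywords(keywords_per_file, min_files):
    """Words that are key in at least `min_files` of the files."""
    counts = {}
    for keywords in keywords_per_file.values():
        for word in set(keywords):
            counts[word] = counts.get(word, 0) + 1
    return {word: n for word, n in counts.items() if n >= min_files}

# Hypothetical keyword lists for three files.
per_file = {"a.txt": ["solve", "use"], "b.txt": ["solve"], "c.txt": ["use", "solve"]}
print(key_keywords(per_file, min_files=3))
print(log_likelihood(100, 1000, 10, 1000) > log_likelihood(20, 1000, 10, 1000))
```

Requiring a word to be key in several files guards against items that are frequent in one textbook only.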
The key-keyword comparison of the SEEC against the BNC Written Sampler5
provides some interesting information about the key verbs in the SEEC – they appear
to be predominantly from the academic register. The key-key verbs in the SEEC
are: act, apply, assume, be, become, calculate, consider, correspond to, define, determine,
exert, give, illustrate, indicate, locate, obtain, occur, require, show, sketch, solve,
substitute, and use. These verbs are key in at least five (seven on average) text files out
of the thirteen that constitute the SEEC (Table 1). Importantly, ten of these key
verbs (assume, correspond, define, illustrate, indicate, locate, obtain, occur, require,
substitute) are included in Coxhead's (2000) New Academic Word List; ten (assume,
correspond, define, illustrate, indicate, locate, obtain, occur, require, sketch) are also in
Xue and Nation's (1984) University Word List; and the first nine in each appear in
both lists. Interestingly, Martin (1976) considered academic and sub-technical vocabulary
equivalent terms.
4 However, the word form solutions, although very infrequent, does occur more frequently in its
chemical sense in the Chemical Engineering Thermodynamics textbook.
5 The BNC Written Sampler is a one-million-word written subcorpus of the BNC containing a wide and
balanced sampling of texts from the BNC Written. It was used for the key-keyword comparison as the full
BNC was too large to be analysed by WordSmith Tools.
Thus, word frequency analysis of the SEEC in comparison with COBUILD and
the BNC suggests that the answer to the research question ‘‘Are the most frequent
words in a specialist corpus technical or non-technical/sub-technical?’’ is that they
are (a) sub-technical and (b) non-technical from the academic register. This finding
has important implications for teaching Engineering English, indicating that more
attention should be devoted to academic English and sub-technical vocabulary.
Table 3
Technical vs. non-technical senses
Rank (ABC order): 1032
Rank (frequency order): 29
Headword: solution (of a problem)
Frequency: 3776
%: 0.19
Words joined:
  solution (1899), solutions (185); solve (912), solves (8),
  solving (341), solved (249);
  solvable (8), solver (3), solvers (2); unsolved (1);
  resolve (29), resolving (41), resolved (96); unresolved (2)

Rank (ABC order): 1033
Rank (frequency order): 242
Headword: solution (liquid)
Frequency: 1025
%: 0.05
Words joined:
  solution (455), solutions (114), solutionizing (1), solutionized (1);
  solubility (111), soluble (32), insoluble (16);
  solvent (96), solvents (27), solute (58), solvus (8);
  dissolve (14), dissolves (11), dissolving (4), dissolved (44),
  dissolution (16), nondissolvable (1)
Table 4
Dispersion of the word solution in a Chemical Engineering Thermodynamics textbook
Word                  Frequency: total  Frequency: general sense  Frequency: chemical sense  Per 1000 words
solution              259               157                       102                        2.36
solutions             27                6                         21                         0.25
solution + solutions  286               163                       123                        2.60
5. Data-driven teaching and learning of Engineering English
Corpus linguistic techniques, such as the use of concordancers, play a major role
in data-driven learning and increasingly shape the development of teaching materials
(Cheng, Warren, & Xun-feng, 2003; Flowerdew, 1993; Hadley, 2002; Johns, 1991;
McKay, 1980; Mudraya, 2004; Murison-Bowie, 1996; Thurstun & Candlin, 1998).
Via corpus-based teaching and learning, learners become exposed to authentic
real-life language use and no longer rely solely on published instructional material,
much of which is inauthentic.
Within the lexical approach too, language activities are directed towards naturally
occurring language, and more time is devoted to collocations and idiomatic expres-
sions. Lewis (1993) claims that the basis of language is lexis, while grammar is the
search for powerful patterns. There is compelling evidence (Lewis, 1993; McKay,
1980) that the majority of errors made by foreign/second language learners are
semantic errors of inappropriate word choice caused by vocabulary deficiency
and, particularly, by lack of collocational power. In consequence, Nattinger (1980,
p. 341) has suggested that
Perhaps we should base our teaching on the assumption that, for a great deal
of the time anyway, language production consists of piecing together the ready-
made units appropriate for a particular situation and that comprehension
relies on knowing which of these patterns to predict in these situations. Our
teaching, therefore would center on these patterns and the ways they can be
pieced together, along with the ways they vary and the situations in which they
occur.
I argue for the integration of the lexical approach with data-driven corpus-based
methodology in English teaching, including ESP teaching, as I believe that the use of
language corpora in the classroom can improve students' knowledge of the language
and their ability to use it effectively. Clearly, the major strength of using a computer
corpus in language teaching is the insight it can provide into the unique collocational
patterns of a word. This is one of the many persuasive reasons for utilizing computer
corpora in the development of vocabulary materials. Although the exercises that
resemble those of standard vocabulary and grammar teaching practices (i.e.,
blank-filling, sentence completion, word matching, translation, etc.) can still be
put to use, their linguistic focus has fundamentally changed, with many of the activ-
ities being of the receptive, awareness-raising kind that can aid language acquisition
by providing learners with a tool which enables them to process input more
effectively.
I find concordancing a very valuable tool in course design. A case can be made,
though, for the use of the specialist corpus for teaching ESP students, since, even
where lexis is common to both general and the specialist corpus, the items in the spe-
cialist corpus, as Flowerdew (1993, p. 236) has noted, may have particular uses that
will be revealed in concordancing. Keeping this in mind, I have worked out some
data-driven exercises based on concordance data that are aimed at helping students
acquire language prefabs for technical and non-technical uses in the specialist context.
These concordance-based activities are designed not so much to help students
understand engineering textbooks but rather to aid productive use of the language
prefabs. Fig. 1 presents a concordance sample from the SEEC that includes carefully
selected examples of the word solution used both in the general sense (e.g., solution of
a problem) and in the technical (chemical) sense. Solution was chosen because it figures,
in its general sense, as a high-frequency word family and also occurs frequently
as a sub-technical item.
Fig. 1. Concordance sample of solution.
Activity 1. Study the concordance data and find the instances of the word solution
used (1) in the general sense (e.g., solution of a problem) and (2) in the technical
(chemical) sense.
Answer key
General: Lines 1–3, 5–10, 20, 22–23, 33–34, 38, 41, 43–44, 46–52, 54, 57, 59–61
Chemical: Lines 4, 11–19, 21, 24–32, 35–37, 39–40, 42, 45, 53, 55–56, 58
Activity 2a. From the concordance data, supply the adjectives that collocate with
the word solution used (1) in the general sense and (2) in the technical (chemical)
sense. Underline the adjectives that can be used with both senses of solution.
Answer key
General: adequate, analytical, (not) possible, particular, optimum, similar, following,
explicit, known, sensitive, insensitive, straightforward, ideal, alternative
Chemical: aqueous, acid, dilute, concentrated, strong, weak, ideal, saturated,
following, similar, liquid, particular, partially miscible
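The adjectives to be underlined in Activity 2a are those that occur in both columns of the answer key, which amounts to a set intersection. A minimal Python sketch, using the adjectives exactly as listed in the key (with ‘‘(not) possible’’ entered as possible); the variable names are assumptions:

```python
general = {
    "adequate", "analytical", "possible", "particular", "optimum", "similar",
    "following", "explicit", "known", "sensitive", "insensitive",
    "straightforward", "ideal", "alternative",
}
chemical = {
    "aqueous", "acid", "dilute", "concentrated", "strong", "weak", "ideal",
    "saturated", "following", "similar", "liquid", "particular",
    "partially miscible",
}

# Adjectives attested with both senses of "solution" in the answer key.
shared = sorted(general & chemical)
print(shared)  # ['following', 'ideal', 'particular', 'similar']
```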
Activity 2b. From the concordance data, supply the verbs that collocate with the
word solution used (1) in the general sense and (2) in the technical (chemical) sense.
Underline the verbs that can be used with both senses of solution.
Answer key
General: find, obtain, complicate, is/are, attempt, add, yield, exist, take, lead to,
give rise to, have, lengthen, print, contain, calculate, simplify
Chemical: add, be/is, has, pump, form, immerse in, enter, flow out of, absorb,
plate out of, take, exist, contain
Fig. 2 presents a concordance sample from the SEEC that includes carefully
selected examples of the verb solve used only in its general sense (e.g., solve a
problem). It was chosen because it features as a high-frequency word family and a
prominent key verb in the SEEC. Below are concordance-based activities designed
to provide some insight into the syntactic patterns in which the verb solve functions
in the SEEC.
Activity 3. Use the concordance data to exemplify the following syntactic patterns
with the verb solve + solution method.
Pattern 1: ‘solve/solves/solving/solved with’ as in ‘The following problems are
designed to be solved with a computer’.
Answer key
Solve with a vector approach.
the following problems are designed to be solved with a computer.
if this problem were solved with a numerical calculation
Solving this with the characteristic yields
air velocity expressions, each solved as appropriate with the equation
problems which we shall not attempt to solve with the means at our disposal
Many problems can be solved with more than one choice of system.
it solves rapidly with any root-finding program
Pattern 2: ‘solve/solves/solving/solved by’ as in ‘The problem may be solved
graphically by drawing’.
Answer key
Solve, first, by double integration
the problem may be solved graphically by drawing
transcendental equations which must be solved by successive approximations
Fig. 2. Concordance sample of solve.
will be an oblique triangle and should be solved by applying the law of sines
can be solved quite simply by the use of
When the problem is solved simply by moving the disk from
a wide class of problems which are solved by trial and error.
problems that cannot be solved by the Work-Energy Principle
problems in this chapter have been solved by using the Moody diagram.
Such problems are solved by considering a short length of
equations and as such may be solved by numerical techniques.
set of algebraic equations that can be solved by methods developed earlier
were solved by the application of second law.
Solve by trial the equation
would be relatively simple matter to solve them, say by matrix method.
Solving a problem by following five steps
solved by computer.
Pattern 3: ‘solve/solves/solving/solved using’ as in ‘Alternatively, we can solve such
problems using graphical solution’.
Answer key
deformable-body mechanics problems are solved using these work-energy principle
Alternatively, we can solve such problems using graphical solution
the following problems are intended to be solved using the program provided in
13.14 Solve Problem 12.18c, using the method
15.75 Using the method of 15.7, solve 15.49.
we demonstrate the FORTRAN program that solves these using the routine Original
Equations (13.52) can be solved using each of the two sets
we add this reaction to be solved using the “final” moles
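The three patterns in Activity 3 can also be tallied mechanically. The sketch below is one possible approach, not the procedure used in the study: it assumes the concordance lines are available as plain strings, and it treats the first with, by or using that follows a form of solve within three intervening words as the pattern marker. The regular expression, the three-word window and the names PATTERN, classify and SAMPLE are all assumptions made for illustration.

```python
import re
from collections import Counter

# A form of "solve", up to three intervening words, then with / by / using.
# The three-word window is an arbitrary choice for this sketch.
PATTERN = re.compile(
    r"\bsolv(?:e|es|ing|ed)\b(?:\W+\w+){0,3}?\W+(with|by|using)\b",
    re.IGNORECASE,
)


def classify(line):
    """Return 'with', 'by' or 'using' for a concordance line, else None."""
    match = PATTERN.search(line)
    return match.group(1).lower() if match else None


# Six lines copied from the Activity 3 answer keys.
SAMPLE = [
    "Solve with a vector approach.",
    "it solves rapidly with any root-finding program",
    "the problem may be solved graphically by drawing",
    "can be solved quite simply by the use of",
    "Alternatively, we can solve such problems using graphical solution",
    "Equations (13.52) can be solved using each of the two sets",
]

if __name__ == "__main__":
    tally = Counter(classify(line) for line in SAMPLE)
    for marker, count in tally.items():
        print(marker, count)
```

Lines in which the marker precedes the verb, such as "Using the method of 15.7, solve 15.49", are not caught by this expression and would have to be handled separately.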
Another useful activity would be to find out when one syntactic pattern (by, with,
or using after solve) is preferred over another. This would require examining all the
relevant concordance lines in the corpus, which limitations of space do not allow here.
By using corpora, students gain direct access to abundant authentic language
samples, resulting in a better understanding of the use and patterns of certain
linguistic features. Thus, corpus-based teaching can help train multi-skilled
autonomous learners who can take charge of their own learning processes.
6. Conclusion
In this paper, I have argued for the integration of the lexical approach with a
data-driven corpus-based methodology in ESP teaching, as I believe that the use
of language corpora in the classroom can improve students' knowledge of the
language and their ability to use it effectively. This leads me to the conclusion that
corpora can also improve the way ESP teaching is approached. Corpus use can inform
teaching and learning, producing students who know what it means to use a corpus, who
know how to extract material from it, and who, consequently, can learn a great deal
about language via a corpus. After all, as Dlaska (1999, p. 403) observed, ESP teaching
need not be “dire and difficult pedagogical ground”, forcing language teachers to
surrender their expertise in favour of teaching unfamiliar subjects, but on the
contrary, it needs to “address, and eventually bridge, the discrepancy between general
language ability and specialized language ability ... since the two areas are not in
opposition but complement each other”.
Appendix A. The one hundred most frequent word families in the Student Engineering
Word List
N Headword Frequency %
1 use 10,313 0.52
2 force 9247 0.46
3 form 7075 0.35
4 flow 7045 0.35
5 pressure 7016 0.35
6 show (v) 7002 0.35
7 determine 6896 0.34
8 figure/configure 6650 0.33
9 section 6404 0.32
10 line 5812 0.29
11 equation 5771 0.29
12 point 5236 0.26
13 angle 4923 0.25
14 act/react/interact/transact/counteract 4666 0.23
15 velocity 4614 0.23
16 system 4540 0.23
17 value 4484 0.23
18 apply 4327 0.22
19 problem 4278 0.21
20 work 4198 0.21
21 give 4103 0.21
22 axis 4053 0.20
23 stress 4033 0.20
24 material 4014 0.20
25 center 3992 0.20
26 length/long 3890 0.19
27 part 3867 0.19
28 surface 3821 0.19
29 solution (of a problem) 3776 0.19
30 type 3606 0.18
31 produce 3582 0.18
32 metal 3457 0.17
33 example 3447 0.17
34 load 3406 0.17
35 other/another 3371 0.16
36 time 3299 0.16
37 high 3252 0.16
38 energy 3245 0.16
39 vary 3232 0.16
40 number 3216 0.16
41 temperature 3119 0.16
42 body 3101 0.16
43 process 3048 0.15
44 chapter 3016 0.15
45 moment 2989 0.15
46 machine 2979 0.15
47 dimension 2938 0.15
48 put 2889 0.14
49 placement 2840 0.14
50 require 2828 0.14
51 area 2827 0.14
52 plane 2820 0.14
53 direction 2784 0.14
54 result 2763 0.14
55 move/remove 2751 0.14
56 all 2741 0.14
57 follow 2731 0.14
58 constant 2719 0.14
59 unit 2661 0.13
60 view 2647 0.13
61 fluid 2639 0.13
62 know 2609 0.13
63 draw 2603 0.13
64 operation 2601 0.13
65 component 2560 0.13
66 expression 2528 0.13
67 beam 2513 0.13
68 end 2484 0.12
69 pipe 2476 0.12
70 make 2467 0.12
71 steel 2429 0.12
72 assume 2424 0.12
73 shear 2409 0.12
74 case (=state) 2351 0.12
75 find 2343 0.12
76 diameter 2341 0.12
77 obtain 2341 0.12
78 mass 2337 0.12
79 air/aero- 2315 0.12
80 define 2276 0.11
81 also 2267 0.11
82 calculate 2266 0.11
83 water 2262 0.11
84 cut 2258 0.11
85 element 2254 0.11
86 rotate 2250 0.11
87 maximum 2246 0.11
88 different 2235 0.11
89 change 2205 0.11
90 equilibrium 2183 0.11
91 structure 2183 0.11
92 position 2177 0.11
93 base/basic 2172 0.11
94 write 2167 0.11
95 consider 2154 0.11
96 design 2125 0.11
97 free 2087 0.10
98 friction 2086 0.10
99 low 2083 0.10
100 method 2070 0.10
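The % column in Appendix A is the headword's frequency expressed as a share of the whole corpus. As a rough check, the sketch below recomputes it for three rows, assuming a corpus size of exactly 2,000,000 running words; the paper gives the size only as "nearly 2,000,000", so that figure and the names CORPUS_SIZE and percent_of_corpus are assumptions made for illustration.

```python
# Assumption: the SEEC is described only as "nearly 2,000,000" running words,
# so a round 2,000,000 is used here.
CORPUS_SIZE = 2_000_000


def percent_of_corpus(frequency, corpus_size=CORPUS_SIZE):
    """Return a frequency as a percentage of the corpus, to two decimals."""
    return round(100 * frequency / corpus_size, 2)


# (headword, frequency) pairs copied from Appendix A.
ROWS = [("use", 10313), ("force", 9247), ("method", 2070)]

if __name__ == "__main__":
    for headword, frequency in ROWS:
        print(headword, frequency, percent_of_corpus(frequency))
```

With the assumed corpus size, these three rows reproduce the percentages printed in the table (0.52, 0.46 and 0.10).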
Appendix B. The fifty most frequent word forms in the SEEC, the COBUILD Bank of
English Corpus and the BNC Written
SEEC (ca. 2 million words) COBUILD (ca. 323 million words) BNC (written) (ca. 90 million words)
N Word % N Word % N Word %
1 the 8.50 1 the 5.58 1 the 6.43
2 of 4.19 2 of 2.60 2 of 3.11
3 a 2.84 3 to 2.51 3 and 2.70
4 and 2.72 4 and 2.37 4 to 2.60
5 is 2.43 5 a 2.21 5 a 2.18
6 in 2.07 6 in 1.83 6 in 1.95
7 to 2.06 7 that 1.04 7 is 0.99
8 for 1.08 8 is 0.93 8 that 0.99
9 are 0.88 9 it 0.92 9 was 0.94
10 be 0.83 10 for 0.87 10 it 0.93
11 that 0.80 11 i 0.78 11 for 0.88
12 at 0.76 12 was 0.76 12 on 0.72
13 as 0.75 13 on 0.70 13 with 0.67
14 by 0.71 14 he 0.65 14 he 0.67
15 with 0.57 15 with 0.64 15 be 0.67
16 on 0.50 16 as 0.57 16 i 0.66
17 from 0.48 17 you 0.54 17 by 0.55
18 an 0.47 18 be 0.53 18 as 0.55
19 this 0.47 19 at 0.52 19 at 0.49
20 or 0.46 20 by 0.50 20 you 0.47
21 we 0.42 21 but 0.47 21 are 0.47
22 which 0.42 22 have 0.46 22 his 0.47
23 it 0.38 23 are 0.44 23 had 0.46
24 if 0.32 24 his 0.43 24 not 0.46
25 figure 0.31 25 from 0.43 25 this 0.45
26 flow 0.31 26 they 0.43 26 have 0.44
27 can 0.28 27 this 0.39 27 from 0.44
28 determine 0.27 28 not 0.38 28 but 0.43
29 force 0.27 29 had 0.35 29 which 0.39
30 two 0.26 30 has 0.34 30 she 0.38
31 shown 0.25 31 an 0.32 31 they 0.37
32 will 0.25 32 we 0.32 32 or 0.37
33 used 0.23 33 or 0.29 33 an 0.36
34 may 0.22 34 said 0.28 34 her 0.35
35 velocity 0.22 35 one 0.28 35 were 0.33
36 pressure 0.22 36 there 0.27 36 there 0.28
37 its 0.20 37 will 0.27 37 we 0.28
38 when 0.20 38 their 0.27 38 their 0.28
39 have 0.20 39 which 0.27 39 been 0.28
40 has 0.19 40 she 0.26 40 has 0.27
41 equation 0.19 41 were 0.26 41 will 0.26
42 not 0.19 42 all 0.25 42 one 0.26
43 one 0.18 43 been 0.25 43 all 0.25
44 each 0.18 44 who 0.25 44 would 0.25
45 point 0.18 45 her 0.24 45 can 0.22
46 where 0.18 46 would 0.23 46 if 0.21
47 system 0.17 47 up 0.22 47 who 0.21
48 forces 0.17 48 if 0.22 48 more 0.21
49 these 0.16 49 more 0.22 49 when 0.21
50 between 0.16 50 when 0.22 50 said 0.20
Appendix C. The fifty most frequent open-class word forms in the SEEC, the
COBUILD Bank of English Corpus and the BNC Written
SEEC (ca. 2 million words) COBUILD (ca. 323 million words) BNC Written (ca. 90 million words)
N Rank Word % N Rank Word % N Rank Word %
1. 5 is 2.43 1. 8 is 0.93 1. 7 is 0.99
2. 9 are 0.88 2. 12 was 0.76 2. 8 that 0.99
3. 10 be 0.83 3. 18 be 0.53 3. 9 was 0.94
4. 25 figure 0.31 4. 22 have 0.46 4. 15 be 0.67
5. 26 flow 0.31 5. 23 are 0.44 5. 21 are 0.47
6. 27 can 0.28 6. 29 had 0.35 6. 23 had 0.46
7. 28 determine 0.27 7. 30 has 0.34 7. 26 have 0.44
8. 29 force 0.27 8. 34 said 0.28 8. 35 were 0.33
9. 30 two 0.26 9. 35 one 0.28 9. 39 been 0.28
10. 31 shown 0.25 10. 37 will 0.27 10. 40 has 0.27
11. 32 will 0.25 11. 41 were 0.26 11. 41 will 0.26
12. 33 used 0.23 12. 43 been 0.25 12. 42 one 0.26
13. 34 may 0.22 13. 46 would 0.23 13. 44 would 0.25
14. 35 velocity 0.22 14. 55 can 0.20 14. 45 can 0.22
15. 36 pressure 0.22 15. 58 new 0.16 15. 50 said 0.20
16. 39 have 0.20 16. 59 do 0.16 16. 51 do 0.20
17. 40 has 0.19 17. 60 two 0.16 17. 61 could 0.16
18. 41 equation 0.19 18. 62 time 0.15 18. 64 time 0.15
19. 43 one 0.18 19. 63 people 0.15 19. 67 two 0.14
20. 45 point 0.18 20. 64 like 0.15 20. 70 may 0.14
21. 47 system 0.17 21. 68 now 0.15 21. 73 new 0.13
22. 48 forces 0.17 22. 71 year 0.14 22. 74 like 0.13
23. 51 surface 0.16 23. 75 first 0.13 23. 78 first 0.12
24. 52 energy 0.16 24. 76 could 0.13 24. 80 did 0.12
25. 53 stress 0.16 25. 81 last 0.12 25. 81 now 0.12
26. 54 section 0.15 26. 83 well 0.12 26. 83 people 0.11
27. 55 example 0.15 27. 85 years 0.11 27. 85 should 0.11
28. 57 line 0.14 28. 86 know 0.11 28. 86 very 0.11
29. 58 chapter 0.14 29. 89 very 0.10 29. 88 see 0.10
30. 60 use 0.14 30. 91 pound 0.10 30. 91 made 0.10
31. 63 temperature 0.13 31. 92 back 0.10 31. 93 back 0.10
32. 64 problem 0.13 32. 94 get 0.10 32. 94 way 0.09
33. 65 must 0.13 33. 95 may 0.10 33. 96 years 0.09
34. 66 given 0.13 34. 97 think 0.09 34. 97 being 0.09
35. 67 time 0.13 35. 98 even 0.09 35. 100 work 0.09
36. 68 body 0.12 36. 100 way 0.09 36. 107 make 0.08
37. 72 area 0.12 37. 101 right 0.09 37. 108 even 0.07
38. 73 constant 0.12 38. 102 three 0.09 38. 110 still 0.07
39. 75 value 0.12 39. 104 don't 0.09 39. 111 must 0.07
40. 77 number 0.12 40. 106 world 0.09 40. 112 own 0.07
41. 78 solution 0.12 41. 110 being 0.09 41. 113 know 0.07
42. 79 fluid 0.12 42. 111 says 0.09 42. 115 year 0.07
43. 80 shear 0.12 43. 112 government 0.09 43. 116 good 0.07
44. 81 length 0.12 44. 114 dollar 0.08 44. 119 last 0.07
45. 82 moment 0.11 45. 115 should 0.08 45. 120 get 0.07
46. 84 mass 0.11 46. 116 made 0.08 46. 121 three 0.07
47. 85 axis 0.11 47. 117 good 0.08 47. 122 well 0.07
48. 86 maximum 0.11 48. 119 see 0.08 48. 123 take 0.07
49. 88 work 0.11 49. 120 go 0.08 49. 125 go 0.07
50. 89 plane 0.11 50. 121 did 0.08 50. 126 government 0.07
Appendix D. The fifty most frequent content word forms in the SEEC compared
against the COBUILD Bank of English Corpus and the BNC Written
N  Rank in corpus (SEEC ca. 2 m / COBUILD ca. 323 m / BNC W ca. 90 m)  Word  % in corpus (SEEC / COBUILD / BNC W)
1. 25 940 546 figure 0.31 0.01 0.02
2. 26 2563 1934 flow 0.31 0.004 0.006
3. 28 3670 2521 determine 0.27 0.003 0.004
4. 29 487 605 force 0.27 0.02 0.02
5. 31 1267 630 shown 0.25 0.008 0.02
6. 33 185 134 used 0.23 0.05 0.06
7. 35 a 7891 velocity 0.22 a 0.001
8. 36 705 837 pressure 0.22 0.01 0.01
9. 41 a 3798 equation 0.19 a 0.003
10. 45 272 230 point 0.18 0.03 0.04
11. 47 298 173 system 0.17 0.03 0.05
12. 48 633 810 forces 0.17 0.02 0.01
13. 51 1752 1092 surface 0.16 0.006 0.01
14. 52 918 793 energy 0.16 0.01 0.01
15. 53 2066 2120 stress 0.16 0.005 0.005
16. 54 1437 497 section 0.15 0.007 0.02
17. 55 469 790 example 0.15 0.02 0.01
18. 57 352 415 line 0.14 0.03 0.02
19. 58 1836 631 chapter 0.14 0.006 0.02
20. 60 208 132 use 0.14 0.04 0.06
21. 63 3116 2318 temperature 0.13 0.003 0.005
22. 64 341 307 problem 0.13 0.03 0.03
23. 66 292 192 given 0.13 0.03 0.04
24. 67 64 66 time 0.13 0.15 0.15
25. 68 399 324 body 0.13 0.02 0.03
26. 72 390 249 area 0.12 0.02 0.04
27. 73 2808 1981 constant 0.12 0.004 0.005
28. 75 774 523 value 0.12 0.01 0.02
29. 77 247 165 number 0.12 0.04 0.05
30. 78 1928 1466 solution 0.12 0.005 0.007
31. 79 6344 5093 fluid 0.12 0.001 0.002
32. 80 a a shear 0.12 a a
33. 81 1997 1418 length 0.12 0.05 0.008
34. 82 528 453 moment 0.12 0.02 0.02
35. 84 1569 1362 mass 0.11 0.007 0.008
36. 85 a 7092 axis 0.11 a 0.001
37. 86 2839 2021 maximum 0.11 0.003 0.005
38. 87 1146 424 thus 0.11 0.009 0.02
39. 88 135 102 work 0.11 0.07 0.09
40. 89 2198 2884 plane 0.11 0.005 0.004
41. 94 1292 708 material 0.11 0.008 0.01
42. 95 8372 6198 diameter 0.11 0.0008 0.001
43. 96 1180 559 type 0.11 0.009 0.02
44. 97 305 242 water 0.11 0.03 0.04
45. 98 178 163 end 0.11 0.05 0.05
46. 99 2521 2242 metal 0.11 0.004 0.005
47. 100 181 157 part 0.11 0.05 0.05
48. 101 7404 6584 beam 0.11 0.001 0.001
49. 102 a 5085 equilibrium 0.11 a 0.002
50. 103 565 358 using 0.11 0.02 0.03
a Not among 10,000 most frequent word forms.
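One way to read Appendix D is to divide a word form's percentage in the SEEC by its percentage in a reference corpus. The sketch below does this for three rows against the BNC Written figures. It is a plain ratio of the rounded percentages printed in the table, offered only as an illustration; it is not the keyness statistic computed by a concordancer, and the names ROWS and frequency_ratio are hypothetical.

```python
# (word, % in SEEC, % in BNC Written), copied from Appendix D.
ROWS = [
    ("flow", 0.31, 0.006),
    ("velocity", 0.22, 0.001),
    ("time", 0.13, 0.15),
]


def frequency_ratio(seec_percent, reference_percent):
    """Return how many times more frequent a form is in the SEEC."""
    return seec_percent / reference_percent


if __name__ == "__main__":
    for word, seec_percent, bnc_percent in ROWS:
        ratio = frequency_ratio(seec_percent, bnc_percent)
        print(f"{word}: {ratio:.1f} times the BNC Written rate")
```

A ratio well above 1 marks a form as characteristic of the engineering textbooks, while a ratio near or below 1, as for time, marks a form that is no more frequent there than in general written English. Because the table's percentages are rounded, the ratios are approximate.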
References
Bauer, L., & Nation, P. (1993). Word families. International Journal of Lexicography, 6(3), 1–27.
Baker, M. (1988). Sub-technical vocabulary and the ESP teacher: an analysis of some rhetorical items in
medical journal articles. Reading in a Foreign Language, 4(2), 91–105.
Bolinger, D. (1976). Meaning and memory. Forum Linguisticum, 1, 1–14.
Cheng, W., Warren, M., & Xun-feng, X. (2003). The language learner as language researcher: putting
corpus linguistics on the timetable. System, 31(2), 173–186.
Cowan, J. R. (1974). Lexical and syntactic research for the design of EFL reading materials. TESOL
Quarterly, 8(4), 389–399.
Coxhead, A. (2000). A new Academic Word List. TESOL Quarterly, 34(2), 213–238.
Dlaska, A. (1999). Suggestions for a subject-specific approach in teaching foreign languages to engineering
and science students. System, 27(3), 401–417.
Flowerdew, J. (1993). Concordancing as a tool in course design. System, 21(2), 231–244.
Ghadessy, M., Henry, A., & Roseberry, R. L. (Eds.). (2001). Small corpus studies and ELT: theory and
practice. Amsterdam/Philadelphia: John Benjamins Publishing Co.
Hadley, G. (2002). An introduction to data-driven learning. RELC Journal, 33(2), 99–124.
James, G., Davidson, R., Heung-yeung, A. C., & Deerwester, S. (1994). English in Computer Science: a
corpus-based lexical analysis. The Hong Kong University of Science and Technology: Longman Asia
Ltd.
Johns, T. (1991). Should you be persuaded – two examples of data-driven learning materials. ELR Journal,
4, 1–16.
Lewis, M. (1993). The lexical approach: the state of ELT and the way forward. Hove, England: Language
Teaching Publications.
Martin, A. V. (1976). Teaching academic vocabulary to foreign graduate students. TESOL Quarterly,
10(1), 91–99.
McEnery, A., & Wilson, A. (1997). Teaching and language corpora (TALC). ReCALL, 9(1),
5–14.
McEnery, A., & Wilson, A. (2001). Corpus linguistics (2nd ed.). Edinburgh: Edinburgh University Press.
McKay, S. L. (1980). Developing vocabulary materials with a computer corpus. RELC Journal, 11(2),
77–87.
Moudraia, O. (1999). Lexical syllabus foundation for engineering. RELC Journal, 30(2), 140–141.
Moudraia, O. (2001). Lexical approach to second language teaching. Eric Digest EDO-FL-01-02.
Washington, DC: ERIC Clearinghouse on Languages and Linguistics. Available from
http://www.cal.org/ericcll/digest/0102lexical.html.
Moudraia, O. (2003). The student engineering corpus: analysing word frequency. In: D. Archer, P.
Rayson, A. Wilson, & T. McEnery (Eds.), Proceedings of the corpus linguistics 2003 conference
(pp. 552–561). UCREL technical paper number 16. UCREL, Lancaster University. ISBN
1862201315.
Moudraia, O. (2004). The student engineering English corpus. ICAME Journal, 28, 139–143.
Mudraya, O. V. (2004). Using a lexical approach for data-driven instruction of engineering English. IEEE
Transactions on Professional Communication, 47(1), 65–70.
Murison-Bowie, S. (1996). Linguistic corpora and language teaching. Annual Review of Applied
Linguistics, 16, 182–199.
Nattinger, J. (1980). A lexical phrase grammar for ESL. TESOL Quarterly, 14, 337–344.
Nattinger, J., & DeCarrico, J. (1992). Lexical phrases and language teaching. Oxford: Oxford University
Press.
Orr, T., & Takahashi, A. (2002). Constructing a corpus of fundamental engineering English for nonnative
speakers. In J. Williams (Ed.), Conference proceedings of the IEEE international professional
communication conference (pp. 403–409). USA: Oregon.
Qi-bo, Z. (1989). A quantitative look at the Guangzhou Petroleum English Corpus. ICAME Journal, 13,
28–38.
Salager, F. (1983). The lexis of fundamental medical English: classificatory framework and rhetorical
function (a statistical approach). Reading in a Foreign Language, 1, 54–64.
Scott, M. (1996). WordSmith tools. Oxford: Oxford University Press. Available from
http://www.lexically.net/wordsmith/.
Scott, M. (1997). PC analysis of key words – and key key words. System, 25(2), 233–245.
Sinclair, J. M. (Ed.). (1987). Looking up: an account of the COBUILD project in lexical computing. London:
Collins COBUILD.
Sinclair, J. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Thurstun, J., & Candlin, C. N. (1998). Concordancing and the teaching of the vocabulary of academic
English. English for Specific Purposes, 17(3), 267–280.
Trimble, L. (1985). English for science and technology: a discourse approach. Cambridge: Cambridge
University Press.
Willis, D. (1990). The lexical syllabus: a new approach to language teaching. London: Collins
COBUILD.
Xue, G., & Nation, I. S. P. (1984). A university word list. Language Learning and Communication, 3(2),
215–229.
Yang, H. (1985a). The JDEST Computer Corpus of texts in English for science and technology. ICAME
News, 9, 24–25.
Yang, H. (1985b). The use of computers in English teaching and research in China. In R. Quirk & H. G.
Widdowson (Eds.), English in the world: teaching and learning the language and literature (pp. 86–100).
Cambridge: Cambridge University Press.
Yang, H. (1986). A new technique for identifying scientific/technical terms and describing scientific texts.
Literary and Linguistic Computing, 1(2), 93–103.
Olga Mudraya (Ph.D./Comparative Linguistics; MA (Hon)/Teaching English and Literature) is currently
a Research Associate in the Department of Linguistics and Modern English Language at Lancaster
University, UK. Previously, she was Assistant Professor at Walailak University in Thailand. Her current
research interests include Corpus Linguistics and ESP.