
Using an enhanced MDA model in study of World Englishes

Richard Xiao

University of Central Lancashire

RXiao@uclan.ac.uk


Overview of the talk

• Biber’s (1988) MF/MD analytical framework

• The enhanced multidimensional analysis (MDA) model

• An MDA analysis of five varieties of English in the ICE


Factor analysis

• The key to the multidimensional analysis approach

• A common data reduction method available in many standard statistics packages such as SPSS

• Reducing a large number of variables to a manageable set of underlying factors or dimensions

• Extensively used in social sciences to identify clusters of variables
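As a rough illustration of the data-reduction idea, the sketch below finds the single strongest shared dimension in a toy feature table by power iteration on the correlation matrix. It extracts one unrotated dimension only and is a simplified stand-in for the factor extraction a statistics package would perform; the feature names and counts are invented for the example.

```python
import math

def correlation_matrix(rows):
    """Pearson correlations between the columns of a list of equal-length rows."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(k)]
    z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]
    return [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(k)]
            for a in range(k)]

def first_factor(corr, iterations=200):
    """Leading eigenvalue and loadings (eigenvector scaled by sqrt(eigenvalue))."""
    k = len(corr)
    v = [1.0] + [0.0] * (k - 1)  # start on the first feature's axis
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(k))
                     for i in range(k))
    return eigenvalue, [x * math.sqrt(eigenvalue) for x in v]

# Hypothetical per-1,000-word frequencies: three conversation-like texts,
# then three academic-like texts (invented, not from any corpus).
FEATURES = ["first_person_pronouns", "contractions", "nominalisations", "prepositions"]
rows = [
    [60, 40, 5, 80],
    [55, 35, 8, 85],
    [50, 38, 10, 90],
    [10, 2, 40, 130],
    [8, 1, 45, 140],
    [12, 3, 38, 125],
]

eigenvalue, loadings = first_factor(correlation_matrix(rows))
for name, loading in zip(FEATURES, loadings):
    print(f"{name:24s}{loading:+.2f}")
```

In this toy table, four variables collapse onto one dimension: the two "involved" features load with one sign and the two "informational" features with the other.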


Biber’s MF/MD approach

• Established in Biber (1988): Variation across Speech and Writing (CUP)
– Factor analysis of 67 functionally related linguistic features
– 481 text samples, amounting to 960,000 running words
   • LOB
   • London-Lund
   • Brown corpus
   • A collection of professional and personal letters


Biber’s MF/MD approach

• Biber’s seven factors / dimensions
– Informational vs. involved production
– Narrative vs. non-narrative concerns
– Explicit vs. situation-dependent reference
– Overt expression of persuasion
– Abstract vs. non-abstract information
– Online informational elaboration
– Academic hedging


Biber’s MF/MD approach

• Influential and widely used
– Synchronic analysis of specific registers / genres and author styles
– Diachronic studies describing the evolution of registers
– Register studies of non-Western languages and contrastive analyses
– Research of University English and materials development
– Move analysis and study of discourse structure

• …largely confined to grammatical categories


The enhanced MDA model

• Enhancing Biber’s MDA by incorporating semantic components with grammatical categories
– Wmatrix = CLAWS + USAS
– A total of 141 linguistic features investigated
   • 109 features retained in the final model
– Five million words in 2,500 text samples, with one million for each of the 5 varieties of English
   • ICE – GB, HK, India, Singapore, the Philippines
   • 300 spoken + 200 written samples
   • 12 registers ranging from private conversation to academic writing


ICE registers and proportions

S1A (20%) Spoken – Private

S1B (16%) Spoken – Public

S2A (14%) Spoken – Monologue – Unscripted

S2B (10%) Spoken – Monologue – Scripted

W1A (4%) Written – Non-printed – Non-professional writing

W1B (6%) Written – Non-printed – Correspondence

W2A (8%) Written – Printed – Academic writing

W2B (8%) Written – Printed – Non-academic writing

W2C (4%) Written – Printed – Reportage

W2D (4%) Written – Printed – Instructional writing

W2E (2%) Written – Printed – Persuasive writing

W2F (4%) Written – Printed – Creative writing
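The percentages above can be turned into sample counts as a quick consistency check. The sketch below assumes 500 samples per variety (2,500 samples across five varieties, i.e. the 300 spoken + 200 written split mentioned earlier); the function name is illustrative.

```python
# Register percentages as listed on the slide.
ICE_PROPORTIONS = {
    "S1A": 20, "S1B": 16, "S2A": 14, "S2B": 10,
    "W1A": 4, "W1B": 6, "W2A": 8, "W2B": 8,
    "W2C": 4, "W2D": 4, "W2E": 2, "W2F": 4,
}
SAMPLES_PER_VARIETY = 500  # assumption: 2,500 samples / 5 varieties

def samples_per_register(proportions, total):
    """Convert percentage shares into whole numbers of text samples."""
    return {code: total * pct // 100 for code, pct in proportions.items()}

counts = samples_per_register(ICE_PROPORTIONS, SAMPLES_PER_VARIETY)
spoken = sum(n for code, n in counts.items() if code.startswith("S"))
written = sum(n for code, n in counts.items() if code.startswith("W"))

print(counts)
print("spoken:", spoken, "written:", written)
```

The spoken registers add up to 60% and the written registers to 40%, which matches the 300 + 200 split.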


141 linguistic features covered

• A) Nouns: 21 categories, e.g.
– nominalisation, other nouns; 19 semantic classes of nouns (e.g. evaluations, speech acts)

• B) Verbs: 28 categories, e.g.
– Do as pro-verb, be as main verb, tense and aspect markers, modals, passives, 16 semantic categories of verbs

• C) Pronouns: 10 categories, e.g.
– Person, case, demonstrative

• D) Adjectives: 11 categories, e.g.
– Attributive vs. predicative use, 9 semantic categories


141 linguistic features covered

• E) Adverbs (7 categories)
• F) Prepositions (2 categories)
• G) Subordination (3 categories)
• H) Coordination (2 categories)
• I) WH-questions / clauses (2 categories)
• J) Nominal post-modifying clauses (5 categories)
• K) THAT-complement clauses (3 categories)
• L) Infinitive clauses (3 categories)
• M) Participle clauses (2 categories)
• N) Reduced forms and dispreferred structures (4 categories)
• O) Lexical and structural complexity (3 categories)


141 linguistic features covered

• P) Quantifiers (4 categories)
• Q) Time expressions (11 categories)
• R) Degree expressions (8 categories)
• S) Negation (2 categories)
• T) Power relationship (4 categories)
• U) Definiteness (2 categories)
• V) Helping/hindrance (2 categories)
• X) Linear order (1 category)
• Y) Seem / Appear (1 category)
• Z) Discourse bin (1 category)


Procedure of data analysis

• 1) Data clean-up
• 2) Grammatical and semantic tagging with Wmatrix
• 3) Extracting the frequencies of 141 linguistic features from 2,500 corpus files
• 4) Building a profile of normalised frequencies (per 1,000 words) for each linguistic feature
• 5) Factor analysis
– Factor extraction (Principal Factor Analysis)
– Factor rotation (Promax)
– Optimum structure: 9 factors
• 6) Interpreting extracted factors
• 7) Computing factor scores
• 8) Using the enhanced MDA model in exploration of variation across registers and language varieties
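Steps 4 and 7 can be sketched in a few lines. The slides do not say how factor scores were computed, so the sketch assumes the summing convention commonly associated with Biber's approach: standardise each feature across texts, then add the z-scores of features loading positively on a factor and subtract those loading negatively. All counts, text lengths and feature assignments below are hypothetical.

```python
import statistics

def per_thousand(count, text_length):
    """Normalised frequency: occurrences per 1,000 words."""
    return count * 1000 / text_length

def standardise(values):
    """z-scores across texts, so features with different base rates are comparable."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def factor_scores(z_by_feature, positive, negative):
    """Per text: sum of z-scores of positive features minus sum for negative ones."""
    n_texts = len(next(iter(z_by_feature.values())))
    return [
        sum(z_by_feature[f][i] for f in positive)
        - sum(z_by_feature[f][i] for f in negative)
        for i in range(n_texts)
    ]

# Hypothetical raw counts for three texts (not taken from the ICE data).
lengths = [2000, 2100, 1900]
raw = {
    "contractions": [80, 42, 2],
    "nominalisations": [10, 40, 95],
}

normalised = {
    feature: [per_thousand(c, n) for c, n in zip(counts, lengths)]
    for feature, counts in raw.items()
}
z = {feature: standardise(values) for feature, values in normalised.items()}
scores = factor_scores(z, positive=["contractions"], negative=["nominalisations"])
print([round(s, 2) for s in scores])
```

Normalising first matters because the texts differ in length; standardising second keeps a frequent feature from dominating the score simply because it is frequent.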


The enhanced MDA model

• Nine factors established in the new model
– 1) Interactive casual discourse vs. informative elaborate discourse
– 2) Elaborative online evaluation
– 3) Narrative concern
– 4) Human vs. object description
– 5) Future projection
– 6) Personal impression and judgement
– 7) Lack of temporal / locative focus
– 8) Concern with degree and quantity
– 9) Concern with reported speech

• Robustness of the model in register analysis


5 English varieties across 9 factors

[Figure: factor score (y-axis) on Factors 1 to 9 (x-axis) for GB, HK, IN, PH and SG]

• Both differences and similarities
• This general picture may blur many register-based subtleties
– Language can vary across registers even more substantially than across language varieties (cf. Biber 1995)


1) Interactive casual discourse vs. informative elaborate discourse

• Indian English displays the lowest score in nearly all registers - it is less interactive but more elaborate

– Sanyal (2007): “clumsy Victorian English [that] hangs like a dead Albatross around each educated Indian’s neck”

• Modern BrE appears to be most interactive and least elaborate (e.g. S1A, S1B, W2D)

• 3 varieties of English used in East and Southeast Asia are very similar

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=9.04, 4 d.f. p<0.001
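An F value with 4 degrees of freedom is what a one-way analysis of variance across five groups would report, so the statistics on this and the following slides are presumably of that kind; the slides do not state the test explicitly. A minimal sketch of the computation, using invented factor scores rather than the study's data:

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical factor scores for a handful of texts per variety.
toy_scores = {
    "GB": [4.1, 5.0, 3.8],
    "HK": [1.2, 0.9, 1.5],
    "IN": [-3.0, -2.4, -3.5],
    "PH": [0.8, 1.1, 0.5],
    "SG": [1.0, 1.4, 0.7],
}

f_value, df_between, df_within = one_way_anova_f(list(toy_scores.values()))
print(f"F={f_value:.2f}, {df_between} d.f. between groups, {df_within} within")
```

With five varieties the between-groups degrees of freedom are always 5 - 1 = 4, which matches the "4 d.f." reported for each factor.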


2) Elaborative online evaluation

• BrE generally shows a higher score than non-native varieties of English (e.g. W2A, W1B, S2B)

• Non-native English varieties tend to be very similar in most registers

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=14.13, 4 d.f. p<0.001


3) Narrative concern

• BrE demonstrates a greater propensity for narrative concern
– Most noticeably in news reportage (W2C) and instructional writing (W2D)

• Indian English is least concerned with narrative
– Esp. in registers like correspondence (W1B), instructional writing (W2D), and unscripted monologue (S2A)

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=7.97, 4 d.f. p<0.001


4) Human vs. object description

• Very close in a number of registers
• Indian English and BrE show similarity in a greater range of registers
• HK and Singapore Englishes display great similarity

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=5.92, 4 d.f. p<0.001


5) Future projection

• BrE has the highest score in all printed written registers (W2A–W2F)
• Indian English shows the lowest score in nearly all registers

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=47.63, 4 d.f. p<0.001


6) Personal impression / judgement

• Very similar in many registers…with most noticeable differences in non-printed written registers (W1A, W1B), non-academic writing (W2B), and news reportage (W2C)

• HK English displays a distribution pattern similar to Singapore English in spoken registers (S1A–S2B) and unpublished written registers (W1A, W1B), but it is very close to Philippine English in printed writing (W2A–W2F)

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=12.25, 4 d.f. p<0.001


7) Lack of temporal / locative focus

• Overall difference is not significant statistically
– …but there are noticeable differences in some registers (e.g. W1B, W2D)

• Indian English demonstrates a consistently higher score in spoken registers (S1A–S2B)
– …but a lower score in unpublished writing (e.g. W1B)

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=2.28, 4 d.f. p=0.058


8) Concern with degree / quantity

• BrE generally displays a higher score in nearly all registers
• HK English does not appear to be concerned with degree and quantity (e.g. W2D)
• Similarly Indian English also lacks a focus on degree and quantity (e.g. W1B)

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=24.32, 4 d.f. p<0.001


9) Concern with reported speech

• Overall difference is not significant
• Noticeable difference in news reportage (W2C)
– East and Southeast Asian English varieties show a greater propensity for concern with reported speech than BrE and Indian English

[Figure: factor score (y-axis) by register, S1A to W2F (x-axis), for GB, HK, IN, PH and SG]

F=1.51, 4 d.f. p=0.196


Summary and future research

• Summary
– Seeking to enhance Biber’s MDA model with semantic components
– Introducing the new model in research of World Englishes

• Directions for future research
– More native English varieties from the Inner Circle
– A wider and more balanced coverage of geographical regions
– Including socio-culturally relevant semantic categories
– Combining corpora and more traditional resources in socio-cultural studies and historical research

• …adequately descriptive + sufficiently explanatory…


Thank you!