Variation in English: perception and patterns in
the identification of Maltese English
Sarah Grech
A thesis submitted in fulfilment of the requirements for the
degree of
Doctor of Philosophy
The Institute of Linguistics
University of Malta
2015
University of Malta Library – Electronic Thesis & Dissertations (ETD) Repository
The copyright of this thesis/dissertation belongs to the author. The author’s rights in respect of this
work are as defined by the Copyright Act (Chapter 415) of the Laws of Malta or as modified by any
successive legislation.
Users may access this full-text thesis/dissertation and can make use of the information contained in
accordance with the Copyright Act provided that the author must be properly acknowledged.
Further distribution or reproduction in any format is prohibited without the prior permission of the
copyright holder.
Declaration
I declare that I am the legitimate author of this thesis and that it is my original work.
No portion of this work has been submitted in support of an application for another degree or
qualification of this or any other university or institution of learning. The work reported in
this thesis has been carried out by myself, except in the case of other works, sources and
research, which have been duly acknowledged.
Sarah Grech
Abstract
This dissertation investigates patterns of speech perception and speech production,
and some of the relationships between the two, in a newly emerging dialect of English,
Maltese English (MaltE).
The research taps into the familiar sensation felt by many native MaltE listeners, that
they would know if a person was Maltese even if they were speaking in English, and even if
the English concerned contained a range of patterns of variation across different Maltese
speakers. In considering both patterns of perception and of production of aspects of variation
in MaltE, I investigate whether 28 native MaltE listeners are sensitive to the perception of
what might be considered identifiably MaltE, even though there may also be variation in the
use of such identifiably MaltE characteristics. I then explore patterns of production in both
natural and more scripted speech in six MaltE speakers within the framework of a continuum
of variation, with reference to five particular characteristics widely noted to be associated with
MaltE speech patterns. Here I expect that degrees of variation in some of the five
characteristics may correspond quite closely with the listeners’ perceptions about what they
consider to be more or less identifiably MaltE.
Results indicate a promising degree of correspondence between perceptions, on the one
hand, and the frequency and/or form of variation present in the five characteristics studied, on
the other hand. The results also point towards a better understanding of the interplay between
perception and production in language variation. To this end the results of both the perception
and the production patterns are also used to build the beginnings of an index which may
represent different degrees of variation in each MaltE speaker, across a number of features
identified as salient for MaltE.
Acknowledgments
As I conclude the final sections of my doctoral dissertation, I am struck by the
humbling thought that I could not have reached this point without the support and generosity
of so many people.
To begin with, my two supervisors could not have made a better team: Professor Ray
Fabri, unflappable, completely reasonable and utterly pragmatic at all times, and Dr
Alexandra Vella, unfailingly inspiring, generous and insightful. I owe it to them that I ever
decided to start on this journey in the first place, and I will always appreciate those intense,
focused conversations (thankfully kept on track by good coffees) which helped to sharpen
analytical skills and get me through a tangle of theories, without losing my way completely.
As for people keeping me on the right path, I am also grateful to Dr Albert Gatt for
ensuring that I did not lose my way in trying to get through enough of the basics in statistics
to be able to make sense of some of my data. It is also thanks to Albert, together with the
keen students on the Institute of Linguistics Stats course, that I find, to my surprise, that I am
eager to continue learning more about how to use and refine statistical analyses in future
work.
Ray, Sandra and Albert are three members of the Institute of Linguistics, which has
turned out to be a great place to be a PhD candidate, so I'd also like to thank all the individual
members of the Institute, including its students, for collectively fostering an atmosphere of
focused scholarship and enthusiasm for what is, of course, the best field in the world to be
studying (and the fact that I can say that without a hint of sarcasm at the very end of this
thesis, says it all). A special mention goes to Sarah, for her talented artwork for my task
pictures, and for her valuable feedback during the early stages of task design. You might
consider a lucrative sideline in cartooning for linguistics, Sarah!
I'd also like to thank the University of Malta, for funding this PhD, and my two new
homes away from home, Universität Bamberg (Institut für Anglistik und Amerikanistik) and
Universität Cologne (Institut für Linguistik – Phonetik). What a privilege it has been to have
the opportunity to talk, discuss, present work and get such valuable feedback, thanks to
Manfred, Martine, Michaela, Sebastien, Anne, Doris, Stefan, Luke, and last but not least,
Simon, who patiently helped me get my head around some Praat scripts. Thanks too, go to
many more people at all three Universities for also patiently listening to early, semi-formed
ideas for this thesis while it was still in its infancy.
The infancy stages of a thesis must always be a difficult and unsure time, but here too,
I was blessed to work closely with an exceptionally generous group of participants who
readily agreed to take part in my two related studies, as native listeners and speakers of
MaltE. Thanks to their interest, selfless sharing of knowledge, and time, those early stages of
research were almost painless, and always fruitful. All the participants remain anonymous,
but they know who they are, and I remain deeply grateful to them for their insights,
generosity and encouragement.
My colleagues at the Centre for English Language Proficiency and the Department of
English have all played a significant part in ensuring that this thesis gets completed in good
time. Particular thanks go to Dr Odette Vassallo and Professor Ivan Callus for going to great
lengths to carve out pockets of time for me to concentrate on my thesis, particularly when so
much other work was pressing. Thanks too are due to Mario and Giuliana for so generously
taking time to encourage and humour me in many ways, during the last two years.
My students attached to the Centre and the Department also deserve a hearty thanks.
The balanced doses of good will, commiserations, and also occasionally (less helpful but no
less welcome) distractions, often boosted flagging spirits. In particular, I'd like to thank Aida,
whose sheer determination and energy at the data collection phase was probably the sole
reason I managed to get so many participants for one of my studies.
Dr Odette Vassallo also deserves a special mention as a friend, "Odette". Being a kind
and patient listener is a wonderful quality in a friend, so it was a bonus that as a friend and
fellow academic, Odette could stomach my ramblings about the more niggling details of this
thesis. Many friends have supported me and my family through this journey, and without
them cheering us on with food, sane company and much patience, writing up this PhD would
have been a much less enjoyable process. Thanks especially to Jo and Tony and hours of
child-friendly swimming time at their lovely home, and Grace (and that Port). Emma too
deserves special mention for having the guts to teach me how to sew a dolphin costume for
carnival in the middle of a PhD.
And finally, my family. Every single member of my family has offered support in
whichever way they knew best, including with the occasional "have you finished yet?",
which did keep me on my toes a little! Dad helped me get started, and aunts, uncles, cousins,
and in-laws kept egging me on, while Katie, my sister, is an absolute rock. There are not
many people who can switch from super-efficient auntie, to sympathetic sister (in-law, too),
and back again to last-minute child-minder at the blink of an eye, but Katie managed it every
time, so thanks, Katie, for being the best sister anyone could hope for.
It seems like a colossal understatement to say that this thesis would never have seen
the light of day if it hadn't been for Robert, but that's the truth of it. The fact that he quite
enjoys the cooking (and laundry, apparently, who knew?), and was happy to spend endless
hours chasing after our two at the swings, or keeping a watchful eye out at the seaside,
doesn't diminish the fact that he put up with all this, and much more, often solo, for the last
three years. He could do with a medal for sticking by me through the various highs and lows
that this PhD journey has often entailed, but I know he’s probably content just to see the end
of the journey with me. How did I get so lucky? As for Emma and Steven, well I like to hope
that it's been a good learning curve for them to see their mum 'studying' and working hard, oh
and juggling too many things a little frantically, but I suspect they are just quite relieved to
see the back of 'Mummy's big book'. At last, they're in luck!
This thesis is dedicated to Robert, my husband, to Emma and to Steven, my children,
and to my mother, Jeni, who decided quite early on that I would need to write something to
earn my living.
CONTENTS
Declaration .............................................................................................................................. i
Abstract .................................................................................................................................. ii
Acknowledgments................................................................................................................. iv
List of Figures ..................................................................................................................... xiv
List of Tables ...................................................................................................................... xvi
1 Introduction ........................................................................................................................ 1
1.1 An overview of the thesis ............................................................................................ 6
1.2 Chapter Overview ....................................................................................................... 8
2 Describing Variation ........................................................................................................ 12
2.1 Variation and The World Englishes Context ............................................................ 13
2.1.1 The Kachruvian approach .......................................................................... 14
2.1.2 A 'cline in bilingualism' and systemic approaches ...................................... 17
2.1.3 Implications for MaltE within Kachruvian approaches ................................... 19
2.1.4 Schneider's Dynamic Model ...................................................................... 20
2.1.5 The development of systematic patterns of variation in the Dynamic Model ........ 21
2.1.6 Describing MaltE within the Dynamic Model ............................................ 23
2.2 The "World Englishes" Framework for MaltE ......................................................... 25
2.2.1 Evidence for structured variation in MaltE and current research ................ 26
2.2.2 Some aspects of variation in MaltE ............................................................ 31
2.2.3 Attitudes towards Maltese English ............................................................. 35
2.4 Conclusions ............................................................................................................... 37
3 The notion of identifiability and its measurement ........................................................... 39
3.1 Defining "Identifiable" .............................................................................................. 39
3.1.1 A link between the identifiably salient characteristics of variation and
listener perception in language ............................................................................... 39
3.1.2 Narrowing down identifiable patterns of variation for MaltE ..................... 42
3.2 Measuring perception ................................................................................................ 45
3.3 Using magnitude estimation: closer considerations .................................................. 47
3.3.1 The rationale for magnitude estimation ...................................................... 48
4 Methods and Procedures ....................................................................................................... 57
4.1 Study 1: overview and rationale ................................................................................ 58
4.1.1 The speakers in the audio stimuli for Study 1 ............................................ 59
4.1.2 Preparing audio stimuli for ME .................................................................. 60
4.1.3 The expert listeners in Study 1 ................................................................... 62
4.1.4 Further implications resulting from Study 1 ............................................... 63
4.2 Study 2: overview and rationale ................................................................................ 64
4.2.1 Identifying five salient characteristics in MaltE ......................................... 65
4.2.2 The speakers and their data ........................................................................ 88
4.2.3 Recordings and data collection .................................................................. 91
4.2.4 Task Design ............................................................................................... 94
4.3 Preparing and collecting the data ............................................................................ 101
4.3.1 Speaker data: preparation and annotation ................................................. 101
4.3.2 Perception study: design and preparation ................................................. 103
4.4 Data annotation, markup and analysis: procedures ................................... 106
4.4.1 Annotation and mark-up procedures ......................................................... 107
4.4.2 Analysis of data: procedures .................................................................... 110
4.4.3 Phonetic analysis: procedures .................................................................. 111
5 Study 1: analysis and findings ....................................................................................... 118
5.1 Overview and comments on using magnitude estimation ........................................... 118
5.2 Study 1: Analysis of Study 1 Magnitude Estimation results ................................... 121
5.3 Study 1: results of feedback commentary .................................................................... 125
5.4 Study 1: conclusions................................................................................................ 130
6 Study 2: analysis and findings ...................................................................................... 132
6.1 Perception study (Study 2) ...................................................................................... 132
6.1.1 Perception study (Study 2): Analysis ....................................................... 135
6.1.2 Perception study (Study 2): discussion and conclusions ........................... 145
6.2 Patterns of variation in the production of speech in 6 MaltE speakers ................... 147
6.2.1 Patterns of variation at the segmental level: An overview .................................. 150
6.2.2 Two trends ............................................................................................... 157
6.3 Fine-grained analysis............................................................................................... 160
6.3.1 A Pairwise Variability Index (Vowel Durations) ...................................... 160
6.3.2 Preference for full vowels over schwa (schwaØ) ..................................... 163
6.3.3 Variant realisation of fricatives /θ/ and /ð/, labelled (th) .......................... 167
6.3.4 (a) ............................................................................................................ 174
6.3.5 (r) ............................................................................................................ 181
6.4 Individual speaker variation .................................................................................... 187
6.4.1 Variation and indexical information ......................................................... 188
6.4.2 Variation, speech style and register .......................................................... 197
7 An Index of MaltE Variation ......................................................................................... 201
7.1 Indexing variation in a series of features ..................................................................... 205
7.2 Indexing speakers on a continuum of variation ........................................................... 207
7.3 Discussion .................................................................................................................... 209
8 Conclusion .................................................................................................................... 213
8.1 Patterns of production .................................................................................................. 213
8.2 Patterns of perception .................................................................................................. 216
8.3 Future research ............................................................................................................. 218
References .............................................................................................................................. 220
Appendix A MaltE commentary and complaints.............................................................. 240
Appendix B – Expert Listener Feedback ............................................................................... 243
Appendix C Participant information pack ........................................................................ 245
Data Collection Information Pack and Tasks .................................................................... 245
Part 1 - Information for you, as a participant in this study: .............................................. 245
Appendix D Speech production task materials. ................................................................ 247
Appendix E Perception Study Powerpoint Slides ............................................................ 248
Appendix F Sample data file for Sp1 ............................................................................... 255
List of Figures
Figure 4. 1. Some of the vowels relevant to a discussion on MaltE ................................................................ 74
Figure 4. 2. Two pictures for 'Spot the difference' ........................................................................................ 100
Figure 4. 3 An analysis in Praat, Sp5 ............................................................................................................. 109
Figure 4. 4 overt realisation, post-vocalic (r), Sp2 ........................................................................................ 114
Figure 4. 5 null realisation, post-vocalic (r), Sp2 ........................................................................................... 114
Figure 4. 6 "authentic", Sp3, evidence of a burst in the realisation of the (th) variable ................................ 115
Figure 4. 7 "forty-three", Sp5, a different pattern of little, if any burst, followed by pronounced aspiration
........................................................................................................................................................... 115
Figure 5. 1 Expert MaltE listener judgments on 10 speakers ......................................................................... 121
Figure 6. 1 Modulus scores from 28 naive listeners expressed as % ............................................................. 136
Figure 6. 2 Total scores % at <1 (less identifiable) and >1 (more identifiable) for each of 10 clips by 28 naïve
listeners .............................................................................................................................................. 137
Figure 6. 3 % score <1 for speakers in 2 clips ................................................................................................ 138
Figure 6. 4 % score >1 for speakers in 2 clips ................................................................................................ 139
Figure 6. 5 Scores <1 for Clip 8 ..................................................................................................................... 141
Figure 6. 6 Mean correlation per listener ..................................................................................................... 142
Figure 6. 7 (i-v). Proportion of variation in five variables for six speakers .................................................. 151
Figure 6. 8 Varied realisation of /θ/ and /ð/ (variable (th)) distribution across all 6 speakers ..................... 153
Figure 6. 9 Distribution of variant realisation of /æ/ (variable (a)) in six speakers ....................................... 154
Figure 6. 10 Proportion of the realisation of schwa as full vowels in six speakers ........................................ 154
Figure 6. 11 Preference for post vocalic (r) across six speakers .................................................................... 155
Figure 6. 12 Comparing % listeners' identifiability ratings and measured (PVI V.Dur) .................................. 162
Figure 6. 13 a "the", Sp4 ............................................................................................................................. 164
Figure 6. 14 a. "motorbike", Sp3, with a full vowel instead of schwa in "tor" .............................................. 165
Figure 6. 15 "think", Sp1. (th) annotated as strongly aspirated [tʰ] .............................................................. 169
Figure 6. 16 "enthusiasts" in Sp1. (th) annotated as [t] ................................................................................ 169
Figure 6. 17 "thing" in Sp6. (th) annotated as [θ] ......................................................................................... 170
Figure 6. 18 "authentic" in Sp1, with evidence of frication for /θ/ ............................................................... 171
Figure 6. 19 "authentic" in Sp3, with evidence of a burst, followed by aspiration, annotated as [t] ............ 172
Figure 6. 20 "enthusiast" in Sp5 with evidence of no burst and aspiration .................................................. 172
Figure 6. 21 Three realisations of / θ / (variable (th)) across six speakers, by text type (raw figures) ........... 173
Figure 6. 22 Three realisations of (a) across text type (raw) ......................................................................... 176
Figure 6. 23 (a-f). Vowel space used by Sp1 (male), with particular reference to [æ], [e], [ʌ]....................... 177
Figure 6. 24 (a) Formant measurements for a range of vowels in Sp1 ......................................................... 179
Figure 6. 25 Three variants of (r) in six speakers .......................................................................................... 181
Figure 6. 26 Variant realisation of (r) by speaker and by speech style .......................................................... 182
Figure 6. 27 (r) realised as an alveolar approximant [ɹ], and as an alveolar approximant with frication [ɹ̝],
Sp1 ..................................................................................................................................................... 184
Figure 6. 28 (r) realised as null, and as an alveolar approximant with frication [ɹ̝], Sp1 ............................... 185
Figure 6. 29 Vowel durations when followed by r_Y (overt) and r_N (null) realisations of post-vocalic (r) ... 186
Figure 6. 30 Patterns of variation in the group of speakers rated highly identifiable ................................... 195
Figure 6. 31 Patterns of variation, across all six speakers ............................................................................. 196
Figure 6. 32 Identifiability ratings for Sp6 ................................................................................................... 199
Figure 7. 1 % of listeners rating each speaker (Sp) more identifiable than the modulus .............................. 201
Figure 7. 2 Identifiability ratings corresponding to PVI V.Dur and four segmental variables in six MaltE
speakers ............................................................................................................................................. 202
Figure 7. 3 Indices of Variation for all five variables per speaker ................................................................. 207
List of Tables
Table 4. 1 A summary of native MaltE expert listeners' observations ............................................................ 68
Table 4. 2 Five variables chosen for analysis ................................................................................................. 69
Table 4. 3 Vowel interval measures for 2 speakers of MaltE .......................................................................... 84
Table 4. 4 Task types for data collection ........................................................................................................ 98
Table 4. 5 Extract, Excel file extracted from Praat, for normalised PVI calculations ..................................... 116
Table 4. 6 PVI calculations for three vowel segments .................................................................................. 117
Table 5. 1 Modulus scores and scales for 9 expert MaltE listeners ............................................................... 119
Table 5. 2 li(stener) scores on 10 clips, with average and standard deviation, per clip. ................................ 122
Table 5. 3 Expert listener correlations .......................................................................................................... 124
Table 5. 4 Corresponding p-values for expert listener correlations .............................................................. 125
Table 5. 5 Examples of comments by expert listeners in the feedback section of the Pilot Study ................ 128
Table 6. 1 A rough categorisation of ME judgment of perception ratings for "identifiability" ...................... 140
Table 6. 2 Paired correlations for 28 MaltE listeners .................................................................................... 144
Table 6. 3 Summary of Labelling .................................................................................................................. 148
Table 6. 4 (schwaØ) in 'Spontaneous' and in scripted 'TextAloud' speech .................................................... 167
Table 6. 5 Preference for substitution of (th) in Spontaneous speech and scripted TextAloud ..................... 168
Table 6. 6 (th) realised as a strongly aspirated stop [tʰ] .............................................................................. 172
Table 6. 7 Variant realisation of (a) in the three highly identifiable MaltE speakers .................................... 174
Table 6. 8 Two marked realisations of (a) in TextAloud and Spontaneous text ............................................ 175
Table 6. 9 A summary of speaker background .............................................................................................. 189
Finding myself lost in Salisbury I stopped a random bus to ask if it by any chance
passes from my road. The driver stared for a second and said "jaqaw inti Maltija?" Then I got
a "itla' sabiħa nieħdok fejn trid!" … hehe gotta love the Maltese and the fact that we're
Everywhere!
Fran, lost in Salisbury, on Facebook.
1 Introduction
Variation in language has had a long and turbulent history of scholarship within the
broader study both of linguistics in general and languages and language varieties in
particular. Sometimes viewed askance as carrying evidence of some form of weakening or
deterioration with respect to the more standardised – and therefore more widely codified –
languages, variation in language has more recently enjoyed a surge of interest across a wide
variety of areas within linguistic analysis.
Although the English language can hardly be said to suffer from a dearth of research
and scholarship – far from it – the same cannot always be said for its lesser known varieties,
which generally must first weather a period of denigration and suppression as exponents of
'bad' English before possibly getting to a stage of finally being tolerated and then, perhaps,
researched (Milroy, 2001; Schneider, 2007). Maltese English, a variety of English spoken in
the Maltese islands, is no exception, and it has been argued by Thusat et al. (2009) that it can
be placed within that phase of development which Schneider (2007) has termed 'nativization',
a phase characterised also by "a clash of opinions" and "community-internal discussions of
the adequacy of linguistic usage" (Schneider, 2007:43). Many of the clashes and discussions
surrounding Maltese English are evident in national newspapers, blogs or YouTube clips, and
are sometimes also paralleled by discussions on falling standards of language use and
language education in the local context (see Appendix A).
This thesis sets out to explore aspects of variation within Maltese English, a variety
that is used extensively by many of the inhabitants of the Maltese archipelago situated in the
southern Mediterranean Sea. There may also be other communities scattered throughout the
world with Maltese English influences or undertones as the islands have a long history of
emigration to Australia, Canada, Egypt, North Africa and Britain, but this, to my knowledge,
has not yet been extensively studied and will not be considered further at this point. The
Maltese and Gozitan speech community and its subgroups across the Maltese islands is
characterised by a number of socio-linguistic features, among them, that it is a closely knit,
largely bilingual, community, which can boast the contributions of a rich array of different
languages throughout its long history as a midway point in the Mediterranean. With influence
at the very least from various forms of Arabic, Sicilian, Italian, French and English, and with
both Maltese and English formally designated as official languages in the constitution
(Article 5 Language), language variation in such a community is bound to be rich, highly
nuanced and subtle and, most importantly, intricately woven into the social fabric and way of
life of the islands (Brincat, 2011; Vella, 2012; Camilleri, 2013).
Both Brincat (2011) and Vella (2012) have highlighted the deeply rich linguistic
history enjoyed by the Maltese islands, developing in part as a means of survival through the
various comings and goings of a succession of foreign rulers keen to exploit the islands'
advantageous geographical situation in the Mediterranean region. However, attitudes towards
the inheritance of a linguistic melting pot are not always so generous or favourable. With
regard to the islands' use of English and its variation, public perception, at least, is more
likely to express concern at deteriorating standards rather than to view it with indulgence as
an enriched illustration of local society. In some senses, Maltese English has suffered the
same fate as many other burgeoning varieties of English in that the degree of variation within
the variety has almost served to have it written off simply as an example of bad English,
where many speakers have not learned a standardised model or norm well enough at school,
displaying, in consequence, little control over its finer details. Many of the features or
characteristics noted as typical of this variety of English are therefore in fact seen to belie
good education or minimally, a good command of English, although this is not necessarily
the case. While it is true that some aspects of variation in Maltese English may constitute
elements of fossilised interlanguage (Debrincat, 1999), it is becoming increasingly evident
that such fossilisation perhaps describes just one aspect of the varied patterns of speech that
are closely associated with Maltese English (see Chapter 2 for more on the patterns
themselves).
The overriding popular concern with deteriorating standards of English, or the
labelling of different levels of variation of the English used in Malta ranging from 'native' or
'good' to 'bad', sometimes termed with the pejorative 'Minglish', or 'Manglish', has in part, at
least, contributed to the fact that research on Maltese English (or MaltE, from now onwards)
has been limited to isolated, if principled, efforts to identify and describe the more salient
characteristics of this variety, often with reference to an established (external) norm, such as
Standard British English. In circular fashion, this lack of sustained and coherent scholarship
has resulted in a struggle to fully understand how MaltE operates in the local speech
community – for operate it does – and it has also resulted in something of a vacuum for
educators and other professionals using English, who are often unsure which set of models or
norms to relate to.
Maltese linguists (Borg, 1980, 1986; Camilleri, 1992; Vella, 1995) trying to situate
the use of English within the wider context of language usage in Malta have always been
quick to note that particularly in speech, the English used in Malta most likely does not
constitute a homogenous or static entity, but instead presents evidence of degrees of
variation. This understanding of what Borg (1980: 4) first termed "gradation" and later
expanded on as "a continuum of speech styles" (Borg, 1986:11) will be discussed more fully
in Chapter 2, but it is worth noting here that the issue of variation within Maltese English has
until recently, sometimes stymied any extended research. More recently, both
theoretical and descriptive frameworks for language variation, together with the development
of sociophonetic research, have encouraged networks more conducive to the descriptive and
objective study of such aspects of variation. The recognition of a variety of English, MaltE,
for example, within the International Corpus of English, and known now as ICE-MTA
(Hilbert & Krug, 2010) has undoubtedly helped to establish this variety more formally, as
have other formal descriptions of aspects of Maltese English since the 1990s (Mazzon,
1992/1993; Vella, 1995; Bonnici, 2010; among others). The shift towards a more inclusive
approach in the study of World Englishes is captured succinctly by Bhatt (2001: 534), who
explains that the primary motivation of many studies in this approach is "to describe the
structure of a 'nonnative' variety in its own terms, not as descriptions of aborted
'interlanguages'".
The current research takes the position that variation in MaltE, as it has been
obliquely alluded to, or sometimes parodied and caricatured, for decades, is evidence of a
dialect being shaped by the needs of the speech community that uses it and as such, merits
close attention in order to arrive at a better appreciation of what such variation may mean and
what effects it may have on those who use it. MaltE is much like any other emerging variety
of English in presenting a range of patterns at all levels of linguistic structure, many of which
are systematically produced and hence predictable enough to generate clear perceptions
among listeners that a speaker is, or is not, from Malta. As Schneider reminds us, "In most
instances, as soon as a person starts to speak, listeners will be able to roughly assess where
the speaker grew up, in which social circumstances, and how formal or casual is the speech
situation being framed…" (Schneider, 2007: 8).
MaltE is typical of any other language and its assorted variations, in that its speakers
are able to exploit and manipulate a wide range of features and characteristics which go
beyond the raw intended meaning of a message to convey any number of relevant nuances or
clues about themselves, in anything from the immediate conversation at hand, to the wider
speech community at large. At the same time, such nuances are not necessarily consciously
noticeable to the listener, even if they are subconsciously used to place a speaker within a
particular context.
For researchers in the fields of speech perception and sociophonetics, this dichotomy
is not a new one, and this is reflected in a drive to represent not only the contrastive and
categorical patterns of language acquisition and use, but also to account for what might once
have been simply termed "noise" in the speech signal. Research during the past two decades
has seen new efforts being made to acknowledge and account for the understanding that the
speech signal conveys much more than the intended message of the speaker, and usage-based
models of language description now emphasise the constant interplay between a listener's
language experience and history, and their judgment of another speaker which is so often
shaped by that same experience (Docherty & Foulkes, 2014). Thus, more recent theories in
usage-based models of speech perception and sociophonetic variability have begun to be able
to account for this apparent anomaly, namely that native listeners can broadly identify
another native speaker even if they cannot always agree on the finer details (Clopper &
Pisoni, 2005).
Until this point, the apparent lack of consensus on what, in the case of MaltE, would
be a prototypical Maltese speaker of English, might be taken to underscore for example, the
common (mis)perception that the English spoken by a non-native speaker of English was
simply "bad" English, lacking any form of systematic patterning or linguistic coherence, and
consequently, most definitely not worthy of further consideration. In the current climate,
however, it is possible to contemplate investigating MaltE patterns of production and
perception from a more objective perspective, thereby approaching a more coherent
understanding of how such patterns can in fact be meaningful within the speech communities
of the Maltese islands, and not at all random, or aberrant, as it has been much more frequently
suggested. MaltE may, in fact, be more accurately described as approaching a newly
emerging variety of English, and as such, is another example of a thriving variety of language
fully capable of drawing on a plethora of available features or characteristics at any level of
linguistic structure.
At this point, therefore, two related questions capture the motivation for this research,
namely:
1) Do MaltE listeners perceive a range (continuum) of variation as identifiable
for MaltE speakers?
2) Can any predetermined variables be identified as salient triggers for, or
indicators of, such perception patterns, if these are evident?
The investigation of these two questions captures the main direction of this
dissertation. On the strength of observations and research into recurring themes readily
associated with descriptions of MaltE both in previous literature, and also more anecdotally, I
expect that it may be possible to begin to identify patterns of co-occurrence, or
correspondence, between a given number of linguistic characteristics and the perception of
what can be considered readily identifiable as MaltE. The patterning of such identified
linguistic characteristics, as well as a closer understanding of what transpire to be salient
characteristics of MaltE will be discussed in the following chapters.
1.1 An overview of the thesis
The defining theme which has shaped my research questions (listed above) relates
closely to the indication first mooted by Borg (1980) and since acknowledged by other
linguists (in particular Camilleri, 1992; Vella, 1995, 2012; and Bonnici, 2010), namely, that
variation within MaltE in the Maltese speech community is in evidence, and seems,
unsurprisingly, to operate on a continuum. How such variation in MaltE operates and how it
can be categorised or understood has been studied very little to date, although interest in both
inter- and intra-speaker variation is increasing (for example, Debrincat, 1999; Caruana, 2006;
Bonnici, 2010). An issue which continues to beset studies in this field relates precisely to
how those patterns of variation, which may seem to be only superficially related, can be viewed in
such a way as to allow for the classification of a variety to begin in the first place. Bonnici,
Hilbert & Krug (2012: 10) acknowledge that "choosing the ‘variety type’ to which MaltE
belongs is problematic, as any choice would not represent the linguistic reality on the
islands". Yet the persistent, if still largely anecdotal, awareness that MaltE is instantly
recognisable by a native listener would suggest some degree of coherence across this
continuum. Some indication as to the resolution of such apparent disparity might perhaps be
forthcoming following a closer consideration of more recent trends in the ways that variation
in MaltE is being viewed, on its own terms, and not solely with reference to a more
established "standard" form of English, typically Standard British English or General
American English (Kachru, 1990; Schneider, 2007, for example).
The current research focuses on aspects of variation at the level of pronunciation
within MaltE, as it is evidenced in six case studies of MaltE speakers, and in two perception
studies with 9 expert listeners, and with 28 naïve listeners respectively. The decision to limit
the focus of the analysis carried out to the phonetic/phonological domain was taken for two
main reasons. Firstly, although a coherent picture of the description of MaltE across all levels
of linguistic analysis would be a substantial achievement, it would not be possible to do
justice to the full depth and range of MaltE variation patterns within the scope of this thesis.
Secondly, it became increasingly clear both in the study of previous literature (particularly
Vella, 1995; Bonnici, Hilbert & Krug, 2012), and early on in the first Perception study
carried out with expert listeners for this research (see section 1.2 below), that although
features at the syntactic or lexical levels would often present noticeable and substantial
evidence of variation within MaltE, variation at the phonetic/phonological levels was likely
to be much more broadly distributed across different speakers. In other words, Maltese
speakers of English, regardless of their linguistic background, or language dominance,
usually tend to exhibit some evidence of being MaltE speakers at the phonetic or
phonological levels, even in cases where evidence at other levels of structure, such as the
grammatical, are limited, or non-existent (Vella, 1995).
It is clear that MaltE will undoubtedly present evidence of variation at levels of
analysis other than the phonetic/phonological ones, and this has been indicated in the
available literature, but it is also noted in the same literature that variation at the
phonetic/phonological level is highly likely to be prevalent across the entire continuum of
usage, and may thus also be more usefully indicative of what is considered identifiably MaltE
at discrete points along the continuum of variation within MaltE (Vella, 1995; Bonnici, 2010;
Fabri, 2011). The core aim here then, is to understand whether certain characteristics and/or
features of MaltE can be considered salient at the phonetic level, and if so, to then analyse
how these are manifested in the case studies of six speakers. In studying more closely those
characteristics which might trigger the perception of MaltE, the research aims to move
towards the development of a profile of MaltE, which can be expanded to account for a more
comprehensive range of patterns at all levels of analysis, including the phonetic/phonological,
morpho-syntactic, and pragmatic levels.
1.2 Chapter Overview
Chapter 2 contextualises the study of MaltE within the wider field of World Englishes
and the developments therein of a more pluricentric view of new varieties of English. This
chapter largely revolves around sociolinguistic models and frameworks as the field of
sociolinguistics can be considered one of the first to make a case in favour of this more
pluricentric approach to the study of new varieties of English. If more traditional models for
the description of English take a conventionally accepted standard form as the sole model,
thereby propagating the sense of an essentially monocultural and monolingual perspective
(Cheshire, 1991), then the more recent growth of English on a global level has necessitated
the reappraisal of such models. The challenge to account more realistically for the
development of English – or Englishes – globally was first met by researchers determined to
understand what place English had in new societies where it had not previously been present
(Platt, Weber & Ho, 1984; Kachru, 1990; Schneider, 2003; 2007; Kortmann & Lunkenheimer,
2013, among others).
Chapters 3 and 4 consider the notion of perceptions of what is typically, or
identifiably associated with a given language, or language variety. Chapter 3 expands on the
notion of the identifiability of a language variety, by its native listeners, with reference to an
emerging dialect of English, and the current trends in the study of perception and production
of dialectal variation. Also under consideration in this chapter is the issue of how best to
capture patterns of speech perception particularly in a climate where strong attitudinal
positions towards MaltE may prevail. Here I discuss studies relating also to how perception
of language, and of variation in particular, can be measured, before concentrating on
Magnitude Estimation as the measuring tool used in the current research. Chapter 4 presents
the methods and procedures used to collect, annotate and analyse the data on which the two
studies constituting this research are built. Both studies concentrate on the combined research
of patterns of perception and patterns of production, in native MaltE listeners and speakers
respectively, and focus in particular on five salient characteristics, the rationale for the choice
of which is also explained in Chapter 4 (Section 4.2.1). The five characteristics relate to
aspects of vowel reduction, vowel duration, variant realisations of the /æ/ phoneme, rhoticity
and the substitution of the phonemes /θ/ and /ð/ with a phonetic variant more closely
identified as a [t] or a [d], respectively. All five characteristics present instances of varied
realisation in the speakers in the collected data. Some of these characteristics are also
variously reported in the literature, even though the range of variation for each of them is not
necessarily fully understood. They are sometimes also parodied widely in popular culture as
examples of what a MaltE speaker might sound like (see Appendix A). Many of the chosen
characteristics were also highlighted in the first study for this research where nine expert
native MaltE listeners were also encouraged to identify those features or patterns or
characteristics which they felt stood out in shaping their judgments about each speaker. Thus
Chapter 4 describes the two studies, henceforth referred to as Study 1 and Study 2,
considered both separately and in relation to each other, which constitute a series of speaking
and listening tasks designed to capture both patterns of production and perception
respectively, on which this research is based.
Chapters 2, 3 and 4 together serve to contextualise those aspects of MaltE which are
the main focus of this thesis. Although limited in breadth, the research carried out to date on
MaltE (notably, Delceppo, 1986; Calleja, 1987; Vella, 1995; Debrincat, 1999; Bonnici, 2010)
has provided clear indications of where to look for patterns of variation, and such
descriptions have formed the basis for the sketch of MaltE as we currently understand it (see
Section 2.2.2 below for a detailed discussion). A steady input and compilation of
undergraduate dissertations, articles and two doctoral dissertations, together with the
previously mentioned ICE-MTA, has resulted in a not unimpressive series of descriptions of
diverse aspects of MaltE. Such descriptions have made it much more feasible to propose a
select number of characteristics which would most likely be expected to trigger perceptions
of what might be identifiable as MaltE.
Chapter 5 presents the analysis and the results of Study 1, while Chapter 6 presents
the analysis and the results of Study 2. Chapter 6 also presents the findings of patterns of
production (Study 2) for the five characteristics referred to earlier in this section, in a case
study analysis of six of the MaltE speakers presented for the Perception study in Study 2
carried out with naïve native MaltE listeners. Study 1 and Study 2 were both designed to
capture the more introspective judgments made by both expert (Study 1) and naïve (Study 2)
native MaltE listeners concerning their perception of MaltE, while listening to a series of
recordings from different speakers of MaltE. The analyses of patterns of perception in Studies
1 and 2, and of production in Study 2, point towards a well-defined path in the approach to
understanding some of the salient features of MaltE. In Chapter 7, the identification of salient
aspects readily identified with MaltE is further explored and exploited, with the development
of a preliminary suggestion for an index of variation based on the analysis of the five
characteristics of phonetic variation combined with the results of the listener perceptions of
MaltE. Future studies may also be able to expand on this initial attempt in order to begin to
identify correlations between indexical information on the one hand and patterns in linguistic
structure on the other. Such an index, therefore, may eventually be honed to capture those
aspects of variation which have here been identified as salient, along with doubtless many
others at other levels of linguistic analysis, in order to establish how meaningful they may be
for the speech community that uses them.
Finally, Chapter 8 presents the conclusions of this research, together with a
consideration of the limitations necessarily imposed on the study of a relatively newly
described variety of English. This chapter also suggests further directions for future research
and illustrates the potential for exploiting an index of variation for MaltE. This in turn, could
form the basis of a better understanding not only of what variant features or characteristics
are typical of MaltE, but more importantly, how such aspects of variation may carry socio-
indexical meaning that can be studied in relation to the speech community of the Maltese
islands.
2 Describing Variation
The English language, in all its forms, varieties and applications is clearly widely
researched and widely studied. The particular domain of this research, broadly speaking, is
variation in English, and this too has proved of great interest to scholars for as long as the
language itself has been discussed (Aitchison, 2001). This chapter first provides a broad view
on how the study of variation has developed over the latter half of the twentieth century,
before discussing the focus on those aspects of the study of variation which can throw light
on the development of a newly emerging variety, such as MaltE might be considered to be.
Although MaltE as a coherent variety has resisted definition until relatively recently,
the growing refinement of the parameters within which World Englishes are described has
allowed scholars a new space in which to reconsider in a more positive light, aspects of
variation which had previously been considered problematic. One such aspect is undoubtedly
the issue that a particular variety of English did not simply constitute 'bad' vs 'good' English,
nor was it necessarily defined in equally polarised terms as 'native' or 'non-native' or 'learner'.
Instead, variation is now increasingly seen in its simplest terms to be reflecting the efforts of
a speech community to shape a language to its varying needs. Achebe's iconic remark
reproduced below encapsulates the struggle to come to terms with the notion that the English
language might sometimes be more than a 'foreign' or 'other' language for people outside the
strict confines of Britain, or America. For Achebe (1965) reminds us here that English is
inevitably more than a non-native language for many people, particularly – though not
exclusively – in the colonies or ex-colonies, where some have grown up hearing and using the
English language from a very young age, in lands far removed from the 'motherland' that
might have been Britain, or America. The result is a new need to use this same language,
which must begin to reshape itself to answer this new need, as Achebe (1965) explains:
I feel that the English language will be able to carry the weight of my
African experience. But it will have to be a new English, still in full
communion with its ancestral home but altered to suit its new African
surroundings.
2.1 Variation and The World Englishes Context
The approach to considering English and 'Englishes' from a pluricentric base is not
always readily welcomed, even in the very speech communities where such pluricentricity is
clearly a key feature. It is worth noting that the uncertainty, or even intransigence and
reluctance, encountered in the process of coming to terms with a local variety of English is by
no means unusual across many situations where English may have first developed as a
'non-native' language, such as, for example, the language of a colonising force. In her description
of Indian English for example, Pingali Sailaja (2009) echoes familiar sentiments in noting
that "in India, those who consider their English to be good are outraged at being told that
their English is Indian. Indians want to speak and use English like the British, or, more lately,
like the Americans" (Sailaja 2009: 14). Sailaja continues that the preferred term used to
describe the English used in India is "'Indians' English'", perhaps in the hope that this renders
the thing vague enough to avoid further serious study.
Nevertheless, as Cheshire (1991) also comments, it is simply not appropriate to
continue with the idea that any non-standard variation in English is simply a departure from a
norm. She explains that, "Current descriptions...are all too often given as lists of assorted
departures from Southern British Standard English or from American Standard English with
no attempt at determining the extent to which the local linguistic features function as part of
an autonomous system" (Cheshire 1991: 7). It was this "autonomous system" that Kachru
(1990, 1992) tapped into in his early discussions of the development of English in British
outposts and colonies throughout the latter 200 years or so of empire building.
The term ‘Postcolonial English’ captures a pivotal concept in the history of describing
those particular varieties developing in ex-colonies previously under British rule, because it
represents a departure from the traditional view of a centralised or ‘Standard’ view of English
against which all others are gauged. Bhatt (2001: 527) explains the shift in perspective which
began to develop in the 60s, and gained new momentum in the 80s and 90s:
This conceptual shift affords a "pluricentric" view of English, which
represents diverse sociolinguistic histories, multicultural identities, multiple
norms of use and acquisition, and distinct contexts of function.
In this chapter I present two pivotal approaches to the study of variation in English,
and I consider these with reference to the context of MaltE in Malta. Section 2.1 below
focuses on two approaches adopted by Braj B. Kachru and by Edgar Schneider in the 1980s
and the 2000s, respectively, and section 2.1.6 places MaltE within the context of these
approaches in particular, and within the context of the World Englishes framework in
general. Section 2.2 goes on to consider salient aspects of MaltE patterns of variation with
particular reference to variation in the phonetic/phonological domains and also considers
various attitudinal positions with respect to MaltE.
2.1.1 The Kachruvian approach
Linguists working in different areas of specialisation have, since the 1960s, been
exploring the implications and effects of British colonial rule on the linguistic makeup of
newly independent nations such as India and Pakistan, Malaysia, Kenya or Nigeria. However,
it is Braj B. Kachru who is often credited with shifting perspectives radically enough so as to
allow for the first descriptions of new Englishes on their own terms rather than in relation to
varying degrees of departure from a norm. Similarly, in the field of sociolinguistic analysis,
there has been a noticeably increased emphasis on the perceptions of English as equally – if
not more – represented in its varieties and variations than in its standards (Cheshire, 1991;
Kachru, 1996; Crystal, 2003, among others). This is marked, too, by the increased focus,
driven primarily by the study of sociolinguistic variation, on language variation and change.
Synchronic and diachronic change were increasingly seen to constitute a significant
concentration of linguistic patterns suggesting that previous views of these as random
behaviour by unschooled or otherwise fringe language users would have to be rethought. As
Labov (1978) famously illustrated by examining its production in a community of speakers in
New York, a post-vocalic 'r' can have a very meaningful pattern of realisation across different
social strata. Sociolinguists whose main area of study was English were to take up the study
of English language variation in the 1980s, and more wholeheartedly in the early 2000s, as an
important aspect of research in English, underscored by a new determination to understand
more about how any given speaker community might need to use this language.
While, following Saussure, diachronic and synchronic patterns of language change
and variation were once considered separate systems, later linguistic theory began to allow
for the understanding that diachronic and synchronic change could just as readily be
understood as two sides to the same coin (McMahon 1994; Lass, 1997). The sociolinguistic
study of postcolonial Englishes can be seen as one trend which has embraced this view of the
study of language variation arising as a function of diachronic change. This same trend has
allowed for a keener appreciation of how newly emerging dialects can be studied and viewed
as autonomous systems developing systematically across time and in a particular society.
If we take a chronological account of the gradual shift in the description of the
English language and its varieties, together with the emphasis on presenting evidence for
structured and systematic patterns of variation in newly emerging varieties of English, it
becomes immediately evident that the English language, in a rather short space of time,
moved beyond the confines of its original domain of ownership. The English language is seen
to belong to the world now, and not to a particular country, as Bhatt (2001:528) claims,
“English is regarded less as a European language and an exclusive exponent of Judeo-
Christian traditions”. As the comment made by Achebe earlier illustrates, only a "new
English" can hope to truly be identified with the many people across the globe who might use
it as something much more than a non-native language. Again, Kachru with his studies in
South Asian Englishes, is often credited with identifying a more realistic framework within
which "World Englishes", as they were to become known, could be considered coherent and
self-defining instances of variation in English. Earlier and later studies of Englishes spanning
the globe could begin to be accounted for within this new domain of World Englishes. Such
studies include, among others, Mesthrie's (1992) description of South African Indian English
or Bamgbose (1982) on Nigerian English.
Kachru’s (1990) now iconic three concentric circles model is taken to reflect the
expansion of the English language from its beginnings as the language of speakers in, for
example, England or America in the 'inner circle', through to the English spoken in West
Africa or South Asia, as a result of English colonial legacies, in the 'outer circle', until it
also reaches such areas as Scandinavia or China, where English is rigorously taught in the
context of its relevance as a global language, in the 'expanding circle'. While the model itself may
reflect the political and historical aspect of the spread of English, first through trade and
colonisation, then through education and a new form of globalised trade, it also acts as a clear
illustration of the observation that non-native English speakers outnumber native English
speakers by 4:1 (Bhatt 2001, Crystal 2003).
The direction of Kachruvian linguistics has been driven by "an underlying philosophy
that has argued for the importance of inclusivity and pluricentricity in approaches to the
linguistics of English worldwide" (Bolton 2006:240). Kachru began advocating the need for
new models and norms that would be able to describe newly emerging varieties of English
without reference to an external standard, or norm. This new perspective allowed linguists to
begin to identify systematic patterns of variation that may well have been missed in the
previous habits of describing a local or ‘non-native’ variety purely in a one-dimensional
relationship with an established ‘standard’ of English. For example, it became possible to see
a pattern of attitudes and perspectives towards these new Englishes as they developed – or
were not allowed to develop, in the case of some countries (Schneider 2007) – which
followed predictable stages of development, beginning with initial rejection of the local
variety and “clear indifference to it” (Kachru 1990: 90), but possibly progressing to a point at
which “English becomes part of the local literary and cultural traditions” (Kachru 1990:91).
2.1.2 A 'cline in bilingualism' and systemic approaches
Crucially, with particular reference to the current research, Kachru initiated a deeper
understanding of two important features that would result from this more decentralised
approach to the study of variation in English. Firstly, there is the recognition of a cline within
a variety of English, defined by Kachru as the 'cline in bilingualism' (Kachru 1990:89),
further amplified thus:
A speaker of a non-native variety may engage in a variety-shift, depending
on the participants in a situation...There are thus degrees of approximation
to a norm, depending on the context, participants, and the desired end result
of a speech act (ibid.)
Secondly, Kachru has sensitized subsequent research to the novel thought that these
new varieties of English may well present evidence of ‘systemicness’. Kachru asserts that it
is in fact a "fallacy" that "the diversity and variation in English, and innovation and creativity
in the Outer Circle, are indicators of the decay of English" (Kachru 1996:149).
Both of these observations have important implications for the description of MaltE.
In the first place, the 'cline in bilingualism' (Kachru 1990) reiterates a pattern that scholars
since Borg (1980) have noticed for language usage in Malta, but which is difficult to account
for if we operate solely within the paradigm of 'more or less British-English-sounding'. In
other words, variation within a variety is difficult to account for with reference to an external
norm such as Standard British English, which itself has its own local norms and dialects.
Instead, Kachru's notion of a cline is highlighted by Cheshire (1991), cited earlier, who
advocates challenging the monolingual frameworks that have been comfortable up to recent
years and suggests taking a fresh look at how "local linguistic features function as part of an
autonomous system" (Cheshire 1991:7). Kachru's perspective on how Indian English
variation operates within the social network particular to the Indian subcontinent has given
rise to a rich description of variation within this variety including 'Butler English', 'Babu
English', and 'Educated Indian English', among others (Sailaja 2009).
While a similar categorisation is a little premature for MaltE, one can easily see the
implications of being able to adapt the 'continuum' of variation noted by Borg (1986),
Camilleri (1992), Vella (1995), and others. Camilleri (1992), for example, defines four
family types based on the balance of language usage between Maltese and English and
including Borg's (1986) distinction between 'Mixed Maltese English' and Maltese English.
Camilleri Grima (2001:5) further notes "The Maltese bilingual context is a rather complex
one given that there is only one speech community that uses a number of dialects, a standard
variety of the national language and English in very similar and sometimes the same spheres
of activity". This could well be a case where Kachru's observation of a cline can be applied.
Examples of systematic variation of the kind that Kachru also notes as evidence of
healthy linguistic development, rather than decay, have also been noted in descriptions of
MaltE (Mazzon 1992, Vella 1995, among others) and are further described in section 2.1.6
below with reference to phonetic/phonological features (for a full compilation of described
features in MaltE, see Bonnici 2010). Bonnici (2010: 131ff) also notes that such variation is
actually noted by participants in her study, which in itself would suggest that such features
are readily associated with a localised variety of English and may in fact carry important
information or meaning in some way.
2.1.3 Implications for MaltE within Kachruvian approaches
It is possible to consider MaltE as another example of a newly emerging variety of
English arising from its 200 years of British colonisation, but here the Kachruvian model
does not offer a neat fit for this variety. Kachru's (1990) now well-established Three Circles model
is groundbreaking in its representation of successive waves of communities taking up
English, largely defined by political circumstances. These have been very much defined by
Kachru's own linguistic experiences in ex-colonies in Asia and Africa, where the socio-
political climate was often multi-ethnic and multi-cultural. In such contexts, English was seen
as a potentially unifying lingua franca ensuring that no single language could claim
supremacy over another if it was given official status. This position meant that English
swiftly became the language of such public and official spaces as law courts, government,
and the media. This is not necessarily a context that Malta might recognise because the
linguistic environment, as well as the historical context and heritage, are different. For
example, the Maltese language itself is already a unifying factor across the islands, thriving
for centuries alongside the currencies offered by other lingua francas throughout its history of
serving as an attractive colony. English, for many in Malta, is considered a necessity for
communication with the outside world - the latest in a long history of lingua francas - and it
is also valued for its extended use in domains of economic success and technology. Indeed, as
Schneider explains, "English, and in practice this means indigenous forms of it, is also
spreading rapidly among the less educated, often for specific purposes such as to achieve a
limited communicative ability in trade or tourism" (Schneider 2011:211).
While the early models proposed by Kachru may not be able to account more
specifically for the development of MaltE, the movement away from viewing English within
a monolingual culture, where other varieties could only be anomalies or aberrations, is now
well established and has paved the way for new perspectives on the development of new
varieties of English.
2.1.4 Schneider's Dynamic Model
Drawing both on linguistic analysis of the full range of English variation and on the
accompanying sociolinguistic discussion, Schneider’s development of a "Dynamic
Model" has crystallized the growing realisation that in the vast majority of ex-colonies,
English has "indigenized and grown local roots. It has begun to thrive and to produce
innovative, regionally distinctive forms and uses of its own, in contact with indigenous
languages and cultures" (Schneider, 2007:2).
Schneider's Dynamic Model (2003, 2007) goes straight to the point of systematic
variation. Drawing upon some three decades of sociolinguistic and exploratory studies of
individual varieties of English, Schneider establishes that if we begin to consider variation
without the traditional reference to a standard, we may begin to notice considerable degrees
of similarity and common patterns, at all levels of linguistic analysis. Schneider (2007)
presents an array of studies of a wide range of English varieties and offers evidence, from
these, of numerous examples of patterns of variation common to varieties as diverse as those
of the Pacific and Africa, or Asia and the Caribbean. Among them are many instances which
will also be familiar in a MaltE context, such as the progressive form with state verbs (I'm
liking/thinking etc.), no third-person agreement for present simple (She work/he sing), or in
phonology, main stress shift in complex words or the replacement of dental fricatives with [t]
or [d] (Schneider 2007: 76-85). These patterns, and others, are widely reported in descriptions
of World Englishes and are also reported for MaltE (Mazzon 1993, Vella 1995, Bonnici
2010, among others).
2.1.5 The development of systematic patterns of variation in the Dynamic Model
The evidence of systematic patterns of variation, both within and across varieties,
gives the lie to the still widely encountered view that any variation from a standard need not
be considered relevant to serious linguistic enquiry because it is simply ‘bad’
English. Indeed, some examples of linguistic innovation and predictable patterns of language
have now become widely used; as Kachru (1996: 139) explains, "it is the nonnative users
who are now responsible for its spread and teaching, and uses". These uses are not
necessarily confined to purely utilitarian needs relating to career advancement, or
understanding smart phone menus. Moreover, now it is also the case that "processes of
language change initiated by language contact are not restricted to grammar, lexis, style, and
discourse. They go beyond these levels and cross over to literatures across cultures" (Kachru
1996: 144).
In presenting wide ranging examples as evidence of systematic variation within and
across varieties of postcolonial English, Schneider has developed his Dynamic Model which
seeks to account for the full range of all those varieties of English arising from the British
postcolonial experience. The model itself describes a five-phase process charting the
development of a dialect from the moment the first ‘settlers’ arrive, the Foundation phase,
through to the point where a postcolonial, independent country may arrive at a new and
recognisably discrete variety of English which has developed from the unique combination of
various factors particular to that country, described in the Dynamic Model as the final phase of
Differentiation. The model is intended to capture the distinct realities presented by colonised
lands, and their possible effects on and contribution to the development of a new variety of
English, but Schneider (2007:57) also allows that "In every developmental process the
boundaries and succession of stages may be realized fuzzily". Some common features would
include, for example, the presence of the ‘settler’ or colonising community, the varying
degrees of rigour with which the colonising language may have been established, or even the
decisions taken by a newly independent nation regarding the use of English. Another
contributing factor to the development of a postcolonial variety of English may be the
interplay between the different speech communities. Schneider accounts for the indigenous
community (IDG strand), the settler community (STL), and another potential player in an
adstrate community (ADS), which provides "a linguistic input that enriches and expands an
existing contact scenario not from 'above' or 'below' but rather 'from the side'" (2007: 58).
The language brought into a community by the "settler" strand might be considered imposed
from "above", as it were, and may quickly become the "exonormative" (Kachru, 1990)
standard. Conversely, the native language of the colonised, indigenous community (IDG) is
more likely to constitute change from "below", in other words, from within the local
community, while the adstrate community may also contribute to the process of change,
though not necessarily in terms of imposition, or of upward social dynamics. These factors
and others may combine in a number of ways to yield a variety of English which is unique,
on the one hand, but on the other hand also similar to all those other varieties generated by
similar contexts.
In applying his model to different instances of the development of English in
postcolonial communities, Schneider illustrates that the model can account for a number of
possible scenarios and outcomes in the stages of development of a new variety of English.
For example, Hong Kong (Schneider, 2007: 133-139) displays evidence of being well
established in the third – nativization – phase of the model, and while Schneider
acknowledges that it is impossible to predict the future, it is likely that "a major turning point
suggests that the drive towards English seems to be stronger than might have been
anticipated". On the other hand, the Philippines presents a scenario where the first three
phases appear to have been followed, but "The Philippines appears to be an example of a
country where the in-built developmental trends of the Dynamic Model get overruled by
changing external conditions, thus coming to a halt" (Schneider, 2007: 143).
2.1.6 Describing MaltE within the Dynamic Model
The Dynamic Model appears to be dynamic enough to account for a healthy range of
varieties in its framework. In abstracting away from the particulars of each instance of
postcolonial English to take a broader view of the shared elements of postcolonial lands and
communities, it allows for "a sequence of characteristic stages of identity rewritings and
associated linguistic changes affecting the parties involved in a colonial-contact setting"
(Schneider 2007: 29).
One aspect of the Dynamic Model that has come under some criticism is the
observation that the model rests heavily on the common aspects of the history of postcolonial
varieties of English, but caters much less for global trends and present-day influences, such
as, for example, greater internet access to online streaming of both British and American TV,
or the relaxing of selected border controls such as in Europe, which might allow for greater
movement of different ethnic groups with different linguistic backgrounds. Bonnici (2010:32)
notes that "we cannot understand everything about the attitudes towards English in a given
locale simply by looking at its colonization history". To take the case of Malta, as one
example, colonial history undeniably plays a pivotal role in the speech community, its
language preferences, and also the attitudes related to those preferences, and we can probably
account quite easily for the development of MaltE within the Dynamic Model. However, it is
also true that, almost quicker than we can begin to complain about the decline in standards of
English (also identified by Schneider, 2007: 43 as a typical feature of Phase 3 "Nativization"
in the Dynamic Model), younger children in Malta are much more exposed to American
television and of course, the internet too, than their parents would have been, and there is
already enough anecdotal evidence to suggest that patterns of variation in the local English
will accommodate this reality over the next few years. Bonnici (2010:287) has in fact already
noted evidence that suggests a correlation between a shift "towards more r-ful speech in the
youngest speakers" and "greater access to alternative r-ful varieties, which do not carry the
weight which RP does, and greater access to American English, a less value-laden variety
locally".
A model of how language variation and new dialects are established does not have to
provide an exact fit, however, for it to be considered robust enough to account for different
contexts of variation in English. The variety of English used by the speech community on the
Maltese islands may be evolving beyond its origins as a lingua franca during a recent colonial
period, but the origins of the sociolinguistic context described by the Dynamic Model, as well
as the number of potential directions it could take in the future, are still well accounted for.
Thusat, Anderson et al. (2009) have accounted for MaltE's development within the Dynamic
Model with some success, noting that MaltE has gone through the first two phases of
Foundation and Exonormative Stabilisation, which includes such patterns of usage as
Schneider describes them: "code switching, code alternation, passive familiarity, second-language
acquisition strategies, 'negotiation'" (Schneider 2007: 40). Thusat, Anderson et al.
(2009: 30) pitch MaltE in the third phase of the model, noting the definite presence of a
defining feature of nativization as "contacts between the two languages occur on a regular
and daily basis, and there is widespread second-language use". However, the authors also
suggest that it may be possible to consider MaltE as being on the cusp of the fourth phase of the
model on the basis of its increasing stabilisation as a newly recognised variety, citing Mazzon
(1993) and the occasional encounter of the term 'Maltese English' in the relevant literature, as
reasons to support their position.
It is true that much of the later discussions on new varieties of English may have
emphasised colonial history too much at the expense of a closer consideration of other
contemporary influences shaping development. It is also true that the current opportunities in
the study of World Englishes owe much to the fundamental shift in approach which has
allowed for a more balanced and objective understanding of how new varieties of English
may have developed and may still continue to develop. Paving this new way of approaching
variation in English has also helped to create a context in which to study other linguistic
aspects of variation in a language, among them, most pertinently for this research, patterns of
speech perception and production.
2.2 The "World Englishes" Framework for MaltE
In spite of some limitations, then, the models and theories described above have begun
to account for variation in English from a more 'pluricentric' position, and more successfully than
any previous attempts. The insistence both that variation is worth further examination in
its own right and that 'varied' English is much more widespread than standard English
has meant that the main proponents of such models have also begun to identify common
themes across a wide array of varieties from across the globe. It gradually becomes possible,
now, to contemplate MaltE as an autonomous system displaying patterns echoed in other
similarly developing varieties of English which might not necessarily have anything in
common with Malta's geographical, social or historical features, but whose communities may
have been motivated in similar ways to continue using English. Understanding some of the
ways in which MaltE presents a coherent pattern of features which together can generate the
perception of a Maltese person speaking in English is what has informed the central theme of
this thesis. Such understanding should also serve to throw light on the development of this
variety both locally, and within the wider global context.
2.2.1 Evidence for structured variation in MaltE and current research
The notion that some form of variation of English operates in Malta is not a new one,
having been noted by Borg (1986:11) as part of "a continuum of speech styles". A cursory
look through Malta’s history of language usage will immediately illustrate that the
development of English on the island cannot easily be explained away as simply a non-native
variety or even a learner variety of English. Frendo (1975) and, more recently, Brincat (2011)
have both made it amply clear that language usage in Malta, besides, as always, reflecting and
defining perceptions of social class, and aspirations in work and life, was also a defining
feature of political and social positioning and expression, particularly in the 20th century,
during most of which Malta remained a British base, even, for a time, after attaining formal
independence from Britain in 1964. English, together with Maltese and Italian, played a
central role during this period when Malta moved towards political independence for the first
time in its history, and it is inevitable that individual language choices would involve, in
many situations, a conscious choice reflecting anything from personal to broader political
affiliations. It is understandable, in such a climate, that English would develop beyond the
confines of the English administrative classes, especially post-independence, and since the
language had also come to be associated with a political, or at least, a social position locally,
particularly among those Maltese aspiring to furthering their education, both locally and
abroad, it is also understandable that it began, imperceptibly at first, to develop distinct local
nuances and patterns. Such local patterns could also be expected to become more pronounced as
English was diffused more widely throughout a post-independent Malta as it came into
increasing contact with Maltese and Italian, now that it was no longer confined to the
administrative or political classes. Much of this extended use of English, and indeed its
diversification and adaptation to the speech communities using it in Malta, although generally
viewed quite negatively, was nevertheless occasionally recorded. Hull (1993: 365), for
example, notes that "even the shopkeeper who speaks Maltese to his customers will count out
the change in English". Yet, in spite of concerns that the English used in Malta was somehow
substandard, a look through any literature concerned with actual language usage in Malta
reveals a pattern of rich diversity and creativity well able to cope with whatever new situation
life threw at it (for example, Brincat, 2011).
The social importance of English in Malta was not necessarily reflected in a parallel
appreciation of changing language patterns, however, as the seeds for a cline in bilingualism
were sown when more strata of society, and people with different levels of education,
exploited the use of English for their own purposes. Until well into the 1990s, and arguably,
up until the time of writing this dissertation, local variation in English has not been
considered as anything other than something to be corrected, and the reluctance to consider it
is witnessed also in the nomenclature involved, as the term Maltese English (MaltE) – or
anything similarly defining – is still not widely accepted in describing the variety of English
used in Malta.
It is generally understood that in naming a thing we acknowledge its presence, and
this may well be the underlying cause for the palpable reluctance by all but a few determined
linguists to even acknowledge the possible existence of a variety of English which differs
from Standard British English, let alone to establish it with a label of its own. Certainly
anecdotal experience and private conversations in various contexts in Malta seem to suggest
that admitting to the existence of MaltE is tantamount to writing off all hope of the English
language surviving in Malta in any recognisable form. During a 2011 visit to Malta, David Crystal
delivered a series of public lectures in which he did, tentatively (if bravely), slip in the
term ‘Maltese English’; this set some eyes rolling in academic circles, and the entrenched feeling
that this simply would not do was very much in evidence. The similarities between the
sentiment expressed here and that presented by Sailaja above in describing "Indian English"
will be noted.
Of course such concerns about the state of a language, accompanied by laments at its
wholesale deterioration, are not confined to the English language alone, and have been noted
throughout the world and throughout history (Aitchison 2001), to the extent that Schneider
(2003) has included such concerns as a defining feature of the Nativization Phase in his
Dynamic Model (see also section 2.2.5 above). However, an idealised notion of language
usage denies the very organic nature of a language. It is difficult to preserve a language
without actually killing it off completely, and efforts to remain loyal to the prestigious
external ‘standards’ may themselves be misjudged. As Bonnici (2010:66) notes, "Looking at
the historical influences on English in Malta, British RP (received pronunciation) was not
clearly the variety imported to Malta initially. Some of the first English speakers in Malta
included Irish nuns (…) also it is likely that the varieties of English spoken by the British
navy and militia men (…) were multiple".
A consideration of undergraduate and graduate dissertations up to this point reveals
the effect of this stonewalling, as there is almost no literature in which Maltese English is
willingly considered as a distinct variety. Borg (1980: 5) neatly captures this early outlook:
"It is probably best not to regard this way of speech as a discrete system or dialect." This
outlook only began to shift slightly with a series of
efforts in the 1990s (notably in Camilleri 1992, or in Vella 1995, further discussed below).
While the ensuing few years only yield isolated instances of a range of dissertations
beginning to regularly identify evidence of systematic variation in MaltE, these studies
nevertheless reveal a new trend of interest in the variety, and in reaching the late 1990s or
2000s, we now begin to see a more concerted effort to establish MaltE both locally, and
internationally. More recently, the study of this variety has also begun to benefit from
forming part of more extensive research projects, with the inclusion of Maltese English in the
International Corpus of English (ICE-MTA, 2010). This in itself is sure to generate more
interest in and recognition of the concept of Maltese English as another variety of English.
This regeneration of interest, with the benefit also of new approaches and perspectives
nurtured within the World Englishes debate, as well as more broadly in the fields of language
variation and change, or sociophonetics, among others, may provide a useful platform for
addressing an issue which has been impossible to consider up until this point. This is the
issue of defining parameters for the description of MaltE. Working in a climate of generally
negative feeling towards MaltE, both within the academic community and in the wider
community of the Maltese Islands, much research on this variety has, of necessity, tentatively
concentrated only on specific linguistic features as they relate to a particular subgroup of the
speech community at large. Such studies conducted up until the 2000s have had to contend
with the possibility that MaltE would simply not be recognised as a discrete variety, and,
thus, was only worth limited research.
Unique for this period is Debrincat (1999), whose research undertook to describe the
use of features of MaltE as evidence of a series of steps, or gradients. According to
Debrincat, one extreme of this gradient might show evidence of fossilized patterns of
language, while the other extreme would equate more with the then most popular
exonormative model of Standard British English. Debrincat does not see an easy fit for MaltE
within any of the frameworks for World Englishes available at the time, particularly as these
(for example Kachru, 1990; Platt, Weber and Ho, 1984), were often very much defined in
terms of domains of use, or of particular sociolinguistic contexts more typical of some of the
larger ex-British colonies in Asia and Africa. Debrincat (1999) rightly acknowledges that
Malta's sociolinguistic context, and its needs and uses for English (variety unspecified), do
not present the same conditions, and accordingly, a neat fit within such frameworks devised
to describe them is unlikely (see also Section 2.1.3 above). In the absence of any relevant
framework, therefore, Debrincat felt obliged to resort to the familiar positioning of MaltE on
a scale in relation to an external norm, namely Standard British English, perhaps confirming
Cheshire (1991) in her concern that it was important to move away from a monolingual
approach to the study of variation in World Englishes. Her diagram is represented as a
'staircase', with the higher steps identifying more closely with a British English Standard, and
the lower steps representing feature clusters more commonly associated with MaltE.
Debrincat's dissertation was the first to attempt a more comprehensive description of MaltE.
Until now, there has not been a ready platform for the open discussion of the
possibility of identifying the parameters of a coherent variety for MaltE. At this point,
however, the broader international concern to develop frameworks which can adequately
account for the full global conditions in which English often operates has helped to foster a
climate more conducive to effective research, and existing literature on MaltE has begun to
present enough significant evidence of systematic variation to warrant the progression to a
new phase of study. In parallel, enquiries into the attitudes towards and perceptions of MaltE
also reveal a growing inclination to recognise that "MaltE stands apart from British English
and other English varieties" (Bonnici 2010: 131). It is thus now possible, based on the
principled efforts of earlier scholarship, to attempt to build a more coherent framework for
the description of systematic variation in MaltE.
2.2.2 Some aspects of variation in MaltE
In this section I will review briefly some of the more salient characteristics of MaltE
that have been acknowledged and studied so far. As this thesis is concerned with aspects of
variation within the domain of phonetics and phonology, I will limit my review here to
research focused on these areas. This is not to say that variation does not operate in other
linguistic structures too, as Hilbert & Krug (2010) and Bonnici, Hilbert & Krug (2013) have
also demonstrated. However, the phonetic/phonological domains have perhaps in any case
garnered most of the attention as it is widely understood that these are the domains where
variation in MaltE seems to be prevalent, or indeed, most immediately perceived, across all
social strata.
Mufwene (2001) describes his view of language development and change as an
evolution that draws on a "feature pool" where, Darwinian style, features combine with
environmental and social factors to compete for selection. In this thesis I will draw on this
notion of interacting, and possibly competing, linguistic features operating in combination,
rather than in isolation. Some studies, notably Vella (1995) and Bonnici (2010), have
provided an account of systematic variation with respect to specifically identified features or
domains. Other studies have provided more of an overview of the description of a range of
features, sometimes with reference to a widely recognised standard of English, such as
Standard British English, or citing the influence of Maltese as the major source of variation in
MaltE, as a result of transference (Delceppo, 1986; Calleja, 1987; Paavola, 1987; Mazzon,
1993, for some examples).
Research on the available descriptions for MaltE to date yields a respectable amount
of information on some of the more salient aspects of variation, many of which have been
recorded and noted in a number of dissertations (Delceppo, 1986; Calleja, 1987; Paavola,
1987; Mazzon, 1992; Vella, 1995; Debrincat, 1999; Galea Cavallazzi, 2004; Bonnici, 2010,
for some of the descriptions closely informing this thesis), and published research (Bonnici,
2010; Brincat, 2011; Bonnici, Hilbert & Krug, 2012; Vella, 2012). Much of the above
research uses both well-grounded intuition and empirical data in itemising many of the most
recognisable characteristics of MaltE. Of particular relevance to this research are descriptions
relating to the phonemic inventory of MaltE (for example, Vella, 1995; Debrincat, 1999;
Bonnici, 2010), stress and rhythm classification or patterns (for example Calleja 1987; Vella,
1995), and intonation patterns and prosodic features (Vella, 1995). Other work has also
concentrated on analysis of other linguistic structures at the level of morphology, syntax and
discourse features (Bonnici, 2010; Bonnici, Hilbert & Krug, 2013). In the research
concentrating on variation within the phonetic/phonological domains, vowel quality, shifts in
the use of the vowel space, vowel weakening in relation to stress patterns, final consonant
devoicing, as well as the substitution of English phonemes, notably /θ/ and /ð/, with other
phonemes including /d/ and /t/, are widely reported. Vella (1995) explains that there is no
equivalent in Maltese for the Standard British vowels /æ/ or /ə/, and, in the context of
discussing language transfer from Maltese to MaltE, also cites Delceppo (1986), who carried
out her study on the acquisition of English phonology by Maltese children with ten six-year-old
children. Delceppo suggests that in instances where no equivalent between Standard
British and Maltese vowels exists, the Maltese speaker's pronunciation patterns will
differ.
With regard to vowels, Vella (1995:74) concludes that "The ME (MaltE) vowels
differ from their RP (Received Pronunciation) equivalents in terms of their quality since they
tend to approximate to the quality of corresponding vowels in the Maltese system".
Azzopardi (1981) presents a comprehensive account of vowel structures in Maltese, and
among other conclusions, she notes patterns in vowel duration that may have a bearing on
similar patterns in MaltE. Although the issue of Maltese as L1 influence is not considered
further in this thesis, it is still worth bearing in mind Azzopardi's finding that in Maltese,
"Vowels in unstressed syllables are as long and sometimes longer than vowels in stressed
syllables" (Azzopardi 1981: 120). Vowel duration, as well as vowel quality, is likely to be a
central feature of phonetic, and possibly also phonological distinction for MaltE, particularly
as it may also be the focus for perceived differences in stress and rhythm patterns. In relation
to prosodic features, notably stress patterns, Calleja (1987:90) notes that her speakers "make
minimal use of vowel reduction and of weak forms." Particular attention is given to schwa,
both in its own right as a vowel not readily found in MaltE (see above), and also with regard
to its pivotal role in the rhythm patterns of Standard English. In comparison with the latter, it
is expected that MaltE, where "unstressed short vowels which are not reduced are abundant"
(Vella, 1995:74), will also differ. As Vella (1995: 75) notes, "The fact that /ə/ is rarely
realised in ME (MaltE) can therefore be hypothesized to be an important factor in the
different rhythmic quality of ME as compared to that of RP". Debrincat (1999:70) further
describes how 48.5% of her samples of MaltE speech did not contain evidence of schwa,
which she took as "a clear indication of the fact that /ə/ is probably a contributing factor to
the accent of ME speakers".
In many of the above discussions and elsewhere, there has also been the
acknowledgment that MaltE speakers are frequently well aware that their variety of English
often differs from that of a more broadly accepted standard, and further, that such differences
are often seen in a negative light. This strong attitudinal position can often result in more
careful, self-monitored speech, particularly in more formal situations. For example, as Calleja
(1987: 90) notes, the fact that her speakers were being recorded may have encouraged "the
speakers to use their 'best' English", with this sometimes resulting in hypercorrection of
perceived errors, including devoicing of /ð/ to /θ/, in an effort not to resort to the more usual
substitution with [d]. Both Vella (1995) and Bonnici (2010) also make reference to the
relevance of speech styles and register as an important factor in relation to variation,
particularly given a widespread attention to perceptions of standardised norms of English.
It should be noted at this stage (see also above, 2.1.1), that many of the features
described above have been identified in particular speech communities, but have not been
extended to a study of the wider population, which would be expected to encompass also a
broader span of variation. Bonnici's (2010) data is drawn from participants whose dominant
language is English, for example, while Mazzon (1992) adopted a snowballing 'friends of
friends' technique in the gathering of her data. In each case however, enough data was
gathered to suggest that the patterns identified are predictable and systematic and, in fact, a
short trawl through some of the raw data collected in the ICE-MTA corpus yields many instances
of similar examples, suggesting a plethora of aspects of variation to be considered.
Thus, against some considerable odds, and particularly in the face of widespread
reluctance to consider MaltE as worthy of careful study for the patterned variation of many
linguistic and sociolinguistic features, it has been possible to accumulate a sizeable list of
examples that offer solid evidence for the view that MaltE is less of a case of English in
decline, and more an instance of a healthy variety adapting to the needs of its speech
community.
2.2.3 Attitudes towards Maltese English
As can be expected, and as has been widely noted in many instances of World
Englishes, attitudes towards local or indigenized varieties of English have often been
negative and regularly stunted by the label 'bad English'. In Malta, a quick look through any
newspaper article highlighting English language issues and usage will often throw in the
usual acidic remarks and alarms at "the falling standards in the use of English by our younger
generations" (for just one example, www.timesofmalta.com accessed 12/01/2010). The same
point was also made some 20 years ago in Vella (1995), who also noted instances in
newspapers' Letters sections where sometimes harsh words are used to describe the
perception that English in Malta is often used very badly.
A certain amount of sociolinguistic research locally has successfully sidestepped the
issue of addressing the concept of a local variety of English, possibly as a nod to the litany of
accompanying furore that such a study might expect to invite, given the general environment
noted above. Most local research into language usage in Malta has, accordingly, focused on
the range of domains of use of the different languages (Sciriha and Vassallo, 2006), as well as
studies of attitudes towards a range of languages used in Malta, usually focusing on Maltese
and English (variety unspecified). Such focus only very rarely includes reference to a
specifically Maltese English variety, and more usually refers simply to Maltese versus
English, without discussing the issue of a particular variety of English. The resulting focus
therefore generally categorises such examples as English being useful in an international
context, while Maltese is considered the unifying, national language (Camilleri, 1995;
Micheli, 2001; Bagley, 2001; Sciriha & Vassallo, 2006).
More recently, Maltese English has been more consistently referred to as a distinct
variety (for example, Bonnici, 2009) and research has begun to enquire into attitudes towards
Maltese English specifically, often finding, perhaps unsurprisingly, that in Malta 'we should
learn to speak English in our own way' (Debrincat, 1999). Bonnici's research (2010) also
points to a trend in which consistent patterns of variation emerge in close response to social
sensitivities. Focusing on the speech community typical of Camilleri’s (1992) D type Family,
in which English is the dominant language, Bonnici concluded that the linguistic choices in
the patterns of r-ful- or r-lessness and the use of quotatives demonstrate sensitivity to social
norms, as well as linguistic creativity and innovation.
This demonstrates that MaltE is being adapted to local needs, which in itself
illustrates that the variety has a relevant context in which to develop. In the case of evidence
for more or less rhoticity, for example, Bonnici found a pattern of shift across her apparent-
time data from early (that is, in older speakers) r-lessness towards more r-ful speech in the
younger speakers and she accounts for this by noting that younger speakers may be sensitive
"to the association of r-lessness with an ideology of snobbishness" (Bonnici 2010: 287).
Debrincat (1999) also interviewed 50 university students to study what their views were on
variation in English in general, and in relation to the Maltese islands in particular. She also
asked them to identify some features about what was sometimes described as "our own way"
(Debrincat 1999: 22) of speaking (that is, MaltE), but noted that none of the respondents
could do this with much consistency: beyond occasionally noting the wider use of a
post-vocalic 'r', or the substitution of 'th', they could not say what was particular about this
local variety that made it stand out as different.
It is hardly surprising that the average layperson is not in a position to identify
phonemic or phonetic characteristics even in their own familiar languages or dialects. Indeed,
much research on the perception of variation in its broader context notes that, where listeners
do perceive variation when overtly questioned about it, it is still difficult to tease apart the
perception of variation and the attitudes to it. In explaining the use of the matched-guise
technique, for example, Clopper and Pisoni observe that "there is no way to separate the attitude
judgments made by the listeners from their ability to recognize the dialect of the speaker"
(Clopper & Pisoni, 2005: 317). Nevertheless, the fact that Debrincat, writing as early as 1999, was able to unearth a
rather more positive attitude towards MaltE would be taken by many in the field of World
Englishes research as a healthy sign of development, in contrast to stagnation, and its
potential linguistic counterpart of fossilization. Such attitudes suggest a slight shift away
from the more negative views reported above, and elsewhere, and add to the position taken
by Thusat et al. (2009) that MaltE may be on the brink of approaching some form of
stabilisation which, according to the Schneider model, may allow it to establish itself in a
more accepting framework of description.
2.4 Conclusions
In this chapter I have considered developments within the field of World Englishes
research, and I have identified studies concerned with describing aspects of variation of
MaltE particularly within the phonetic/phonological domains. The research to date might be
considered sporadic, or a little sparse, but in view of the generally negative context in which
it has often had to operate, its achievements have been considerable. Taking my cue from this
research, it is now possible to begin to identify a number of characteristics of MaltE which
might serve as indicators of different aspects of variation within MaltE.
This thesis will concentrate on a number of pronunciation characteristics of MaltE,
some of them itemised in this chapter in 2.2.2 above, that have, on the strength of previous
research, been noted as salient hallmarks of this variety. In some cases, some of these same
features have been identified popularly as identifying markers of MaltE and together,
research and popular commentary have helped shape different aspects of our understanding
of MaltE. Earlier efforts to categorise the use of language in Malta sociolinguistically have
been pointed out in Camilleri (1992) and noted in both Borg (1986) and Camilleri (1992) as
well as in later work (Vella, 1995; Bonnici, 2010). However, the attempt to associate this
gradience with internal linguistic patterns as opposed to external ones has proved tricky when
the underlying framework is taken to be some form of external norm or model, such as
Standard British English. The attempt made by Debrincat (1999) who introduced a schema of
'Gradations', in which speakers were grouped according to the number of features evidenced
in their utterances, is especially noteworthy here.
I have also in this chapter briefly alluded to the fact that the relationship between
attitudes and perceptions towards a variety on the one hand, and speech production on the
other, is sometimes not very clear. This will be discussed further in Chapters 3 and 4 below,
as I consider the notions of what is readily identifiable for a variety, together with possible
frameworks for studying that which might be considered identifiable for MaltE.
3 The notion of identifiability and its measurement
3.1 Defining "Identifiable"
In this dissertation, the notion of how a particular variety of English might be
considered "identifiable" is addressed in part by tapping into the native listener's introspective
judgments and perceptions when listening to different speakers coming from different social,
geographical and linguistic backgrounds. In this case, obtaining introspective perceptions
while simultaneously avoiding attitudinal positions is desirable, as the variety in question –
MaltE – is widely used and well established, but attitudes often present entrenched positions
and these are liable to be formed just as much by political history or allegiance, as by notions
of perceived grammaticality or otherwise in the variety (see also Section 2.2.3 above).
This chapter first considers the notion of identifiability in a language, dialect or
variety, and how this can be described with reference to patterns of speech perception and
production along a continuum of variation, such as that which has been suggested for MaltE.
Section 3.2 considers some of the ways in which perceptions of this sort can be measured,
while Section 3.3 homes in on one measure of perception in particular, that of Magnitude
Estimation, which was decided upon as the preferred measure for this research (see Section
3.3.1).
3.1.1 A link between the identifiably salient characteristics of variation and
listener perception in language
As Debrincat (1999) has suggested, it is likely that any native MaltE listener will
readily identify a MaltE speaker. More recent parodies widely available on YouTube or
commented upon in society (see Appendix A) further clearly identify specific features in their
attempts to provide a caricature of what a prototypical, or readily identifiable, Maltese person
speaking in English is likely to sound like. These same features have also been more
empirically studied and widely noted to be salient in the production of MaltE patterns of
variation (Delceppo, 1986; Calleja, 1987; Vella, 1995; Debrincat, 1999; see also Section 2.2.2
above). In many respects, however, a speaker might only be considered to identify as
belonging to a particular language or dialect, if she or he is a good match for the template or
language experience of any one listener. Functional, as opposed to generativist, approaches to
speech perception and production have highlighted ways in which we might store
"exemplars" or instances of actual language in a context, alongside the more abstract
categories including phonemic or lexical representation. These exemplars are then called
upon and matched up, or even added to, when a listener encounters further examples of
language. In this view, language items are closely bound with the context in which they are
most encountered. Docherty and Foulkes (2014:45) explain how "variability which a listener
has encountered in relation to a particular point of reference (e.g. the lexical item 'cat') is not
discarded, but built in to the representation in memory of that item as a property of the
clustering of exemplars". It is also worth quoting Johnson (1997:146) at length here, for the
succinct description of exemplar models accounting for language variation:
A perceptual category is defined as the sum of all experienced instances of
that category. That is, no abstract category prototypes are posited. The
process of categorization then involves comparing the to-be categorized
item with each of the remembered instances of each category…
Exemplar models of language representation are still not necessarily considered to
account in a wholly satisfactory manner for how language memories are built up or accessed,
but they do begin to address and account for how instances of language use are mentally
recorded and stored, alongside broader and more abstract categorisation.
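By way of illustration only, the categorisation process that Johnson describes can be sketched in a few lines of Python. The sketch below is not drawn from any of the studies cited here: it assumes that each remembered instance is reduced to a pair of formant values (F1, F2), that similarity decays exponentially with distance, and that a category's evidence is the summed similarity over its stored exemplars. The labels `vowel_A` and `vowel_B`, the formant values and the `sensitivity` parameter are all hypothetical.

```python
import math
from collections import defaultdict


def similarity(a, b, sensitivity=0.01):
    """Similarity between two feature vectors, decaying exponentially with distance."""
    return math.exp(-sensitivity * math.dist(a, b))


def categorize(item, exemplars, sensitivity=0.01):
    """Compare `item` with every remembered instance of every category.

    Returns the label with the greatest summed similarity, together with
    the evidence accumulated for each category. No prototype is computed.
    """
    evidence = defaultdict(float)
    for label, features in exemplars:
        evidence[label] += similarity(item, features, sensitivity)
    best = max(evidence, key=evidence.get)
    return best, dict(evidence)


# Hypothetical remembered instances: (category label, (F1, F2) in Hz).
exemplars = [
    ("vowel_A", (550.0, 1900.0)),
    ("vowel_A", (580.0, 1850.0)),
    ("vowel_B", (750.0, 1700.0)),
    ("vowel_B", (780.0, 1650.0)),
]

label, evidence = categorize((570.0, 1880.0), exemplars)
print(label, evidence)
```

Because every stored instance contributes to the outcome, adding a newly encountered token to `exemplars` shifts later categorisations, which is one way of picturing how repeated exposure to a pattern might reinforce it.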
In this sense, then, the notion of the 'identifiability' of a speaker as belonging to, for
example, the variety MaltE, can be seen to refer to those patterns which are more ubiquitous,
or more salient, more frequently in evidence and therefore more reinforced, in any one
speaker/hearer's experience. Clopper and Pisoni (2004), for example, tested how much living
in one place for one's entire life, or in different geographical locations, might affect two
groups of listeners' perception of dialect speakers. Two groups, divided into “army brats”
who had lived in at least three different locations, and “homebodies” who had lived in the
same place all their lives, were given a forced choice task to categorise different dialect
speakers, with the researchers predicting that the “army brats” group which had been exposed
to greater dialect variation in their lives would be able to categorise speaker dialects more
accurately than those – the “homebodies” - who only had exposure to one type of variation all
their lives. The results confirmed that listeners in the two groups did in fact have different
perceptions, with results from the “army brats” group suggesting that "early linguistic
experience based on residential history affects performance on a perceptual dialect
categorization task" (Clopper & Pisoni, 2004: 47). Expanding on this conclusion, the authors
emphasise that “personal experience with linguistic variation is an important contributing
factor that affects how well people can identify where talkers are from.” (Clopper & Pisoni,
2005:325).
The above study and others before and since (for example, Preston 1986, Campbell-
Kibler 2008, among others) highlight the close bond operating between perception and
production patterns in speech. It could be said that this connection is an obvious one, and yet,
it is only relatively recently that approaches in fields including sociophonetics, speech
perception, speech production and exemplar models of language have converged on the study
of such a relationship between perception and production. Recent understanding of variation
in speech is underscored by the relationship that binds patterns of speech perception and
production together. As humans, we neither listen, nor speak, in a vacuum, and we begin to
form and understand patterns from the first months of life. Perception studies with infants and
children show how infants up to six months are able to distinguish phonemic contrasts not
necessarily present in their own language (their mother tongue), but then begin a process of
attunement when they are around twelve months old (Aslin and Pisoni 1980). As Clopper and
Pisoni (2004: 32) then elaborate, "children are born with the basic sensory capacities to
discriminate all possible phonemic contrasts in any language, but early linguistic experience
shapes their perceptual abilities to enhance the relevant contrasts in their native language".
It is therefore reasonable to assume that a speaker's speech patterns are most likely to
be informed by what they are exposed to as infants, in the first place, but also, during the
continuing learning curve of exposure to different sources of language and language
variation. If a frequently occurring characteristic in a particular speech community then
triggers certain indexical information about a speaker, the characteristic in question could
then be associated with identifying that same speech community, or a particular facet of it. In
concrete terms, if as a native listener/speaker of MaltE, I have grown used to hearing /æ/
realised as [ɛ] in such words as 'man' or 'hat', from speakers in Malta, or even, within a
subgroup of the Maltese speech community, then I will eventually also consider MaltE (or a
particular subgroup) as – at least partly – readily 'identifiable' in this pattern of variation.
Thus, the more a listener is exposed to a given characteristic depending on their particular
social networks and linguistic experience, the more likely that characteristic may be
associated with being able to identify a given speech community in some way.
3.1.2 Narrowing down identifiable patterns of variation for MaltE
I have taken a bottom-up approach to the link between perception and production of
identifiable MaltE characteristics. There is, as yet, little understanding both of the phonetic
variation that each characteristic presents, and more crucially, of how these characteristics might
interact with each other and combine to trigger such perceptions, if this is the case. Thus,
although in some anecdotal sense, native MaltE listeners may sometimes pinpoint vowel
quality, for example, as a strong indicator of a MaltE speaker, there is as yet no clear
understanding of what the acoustic properties might point to in their concrete realisation in
speech. In particular, references to preference for full vowels over vowel reduction, or the
realisation of fricative phonemes /θ/ and /ð/ as [t] or [d] abound both in research (see Section
2.2.2 for closer consideration) and in popular culture, and are often parodied (Appendix A).
They may then be viewed as strong contenders in a feature pool in the Mufwene (2001)
sense, and likely to trigger the perception of being identifiable in native MaltE listeners.
However, the issue of how, and to what degree such characteristics, having been repeatedly
identified in both academic and popular literature, can be considered truly identifiably MaltE,
occupies the main theme of this research.
There is also no indication whether any one characteristic or feature alone will
unequivocally trigger a perception of MaltE. Current theory suggests that it is much more
likely for speakers of a dialect or variety to draw on a pool of available features depending on
any number of variables including peer/self identification, register, context for speaking and
many others (Schneider 2011 and Mufwene 2001). Although it may well be the case that
MaltE is eventually most easily defined by one particular characteristic over any other, there
is not enough research to support this, or, conversely, to pinpoint which of the many features
available – and not just at the phonetic level - might be considered more "identifiable" for
MaltE than any other. Closely connected with this consideration is the concern with the
notion of variation within a dialect operating on a continuum. As the small but suggestive
amounts of evidence have indicated (for example Bonnici 2010, for the most recent), any of
those characteristics broadly categorised, for example, as present or absent in a speaker, are
likely to exhibit a more complex pattern that goes beyond a simple binary categorisation and
would better be interpreted along a continuum or gradient. Vella (1995:75) also refers to
evidence both in her own data and in previous data (Calleja 1987), that phonetic realisation in
MaltE may vary as a function of context, where, for example, a MaltE speaker may
consciously attempt a more standardised realisation of a phoneme, in some circumstances but
not in others (see also 2.2.3 above and 6.4.2 below for more on register). This concept of
variability occupying a gradient space, rather than a categorical one, is well recognised
throughout studies on variation (for example, Clopper and Pisoni, 2004; Purnell, Idsardi and
Baugh, 1999; Syrda, 1996) but it has not yet been thoroughly investigated with reference to
MaltE.
Thus the notion of what is identifiably MaltE could in some senses be considered a
simplistic, black or white issue: a native listener either will, or will not, readily recognise a
speaker as MaltE. However, as the foregoing discussion suggests (see above), identifiability
can also be considered on a continuum of more compared with less, or as a quality that
occupies more of a gradient space, as its perception by a native listener is triggered by
similarly gradient effects in the realisation of any number of individual features or
combinations of features. The suggestion highlighted above that native listeners can perceive
some degree of nuance or at least, broad phonetic patterns of variation within or across
languages, dialects, and speech communities, is not a new one, and it also complements our
understanding of the production of variant patterns operating on a continuum. Johnson
(2005:363) refers to "speaker normalization" which highlights the fact "that phonologically
identical utterances show a great deal of acoustic variation across talkers, and that listeners
are able to recognise words spoken by different talkers despite this variation". Listeners are
therefore able to 'normalize' much of what they hear, so that, in spite of any evident variation,
they can also access the intended message. The fact that listeners do in fact appear to filter
out the variation in order to arrive at the intended message may be the reason why, for many
years, "variation in speech was treated as a source of noise" (Clopper & Pisoni, 2005: 314).
In fact it appears that listeners can do at least these two calculations at once: 'normalize' the
data enough to perceive the intended message, and also match the item they hear to any
number of stored instances of the same item, thus also making use of the minute details of the
message. Much research in speech perception has readily supported this, both with, and
without, reference to the inevitable link with social indexing. More recent work (for example,
Podesva, 2011; Sumner & Samuel, 2009; Warren, 2008; Clopper & Pisoni, 2004, among
others) has begun to successfully consider the broader phonological categories of speech
sounds in combination with the phonetic detail which, far from being simply 'noise', is more
likely to convey nuanced clues which can be matched with indexical information.
3.2 Measuring perception
In turning now to the measurement of the perceptions of what might be considered
identifiable, or not, there are a number of options. These will be discussed briefly below,
before continuing to discuss in more detail the chosen measure for this research.
One aspect of perception research is that which focuses on the perception of dialects
and tasks involving dialect categorisation. In some important ways, dialect perception can be
seen as roughly analogous to the theme of the current research, discussing how given
characteristics might trigger recognition, or perceptions of what is identifiably MaltE, and
this section briefly considers perception study research in this light. As Clopper and Pisoni
(2005) note, perception studies focusing on dialect categorisation were sparse up until the
early 2000s. Research has since begun to gather pace, also, perhaps, encouraged by new
directions and energy within sociolinguistic variation and sociophonetics, in particular, where
the focus encompasses how variation in dialects is perceived, as well as produced, and
perception here is noted to be different from attitudes. Thus the focus has broadened to
include experimental research aimed at identifying introspective judgments or perceptions, as
well as the more consciously held attitudinal stances or overt associations with indexical
features such as social group, or age. This more recent approach has come to the fore in the
last fifteen years (Montgomery and Beal 2011, Warren 2008, Hay, Warren and Drager 2006,
Niedzielski 2001, among others). Experiment design ranges from vowel matching tasks,
where participants are given framed target vowels and asked to match these to a continuum of
the target vowel sound(s), to forced choice identification tasks. Clopper and Pisoni
(2005:327) conclude:
there is one theoretical conclusion that the results of all of the studies lead
to that cannot be ignored: phonetic variation attributable to dialect
differences between talkers is well-resolved perceptually. (...) This
perceptual ability suggests that listeners retain a memory of the varieties of
their native language and these representations develop naturally through a
person's experience with and exposure to his community and the world at
large.
Increasingly, much research on speech perception has come to appreciate that the
perception of variation is difficult to separate from judgments of or attitudes towards the
social traits that might be associated with speakers and their language choices. In other
words, social indexing often becomes an integral part of studies on speech perception
concerned with variation. For example, a number of studies note the impact of suggestive
props on listener perceptions. In one such study, Hay and Drager (2010) followed up on
earlier research in perception studies, to conduct a vowel matching exercise with 26 native
New Zealanders. The participants were asked to match vowels in sentences read out by a
male New Zealander to a synthesised counterpart on a six-point continuum, but attention was
also drawn to a stuffed toy placed on the main desk at the start of the experiment. The 26
participants participated in one of two task conditions, one in which the stuffed toy was a
kiwi (associated with New Zealand) and one where the stuffed toy was a koala or kangaroo
(associated with Australia). The authors found that resulting perceptions could have been
influenced by the added presence of the stuffed toys, and found the study to suggest "that a
wide variety of nonlinguistic information could potentially bias participants’ perception of
vowels." (Hay and Drager 2010:889). On the basis of this and previous studies, the authors
also point out that their findings support exemplar models "in which phonetically detailed
memories are indexed with social information." (Hay and Drager 2010:889).
Other research has attempted to incorporate the nature of variation operating on a
cline, or continuum, by asking participants to use a scale for their perceptions, or judgments.
Examples of these scales include the traditional Likert-type scale with n points on the scale, as
well as magnitude estimation, which will be described more extensively in the following
section.
3.3 Using magnitude estimation: closer considerations
In view of the foregoing discussions concerning language perception, perception
studies and the measurement of perception expressed by listeners as judgments of various
forms, it was important to decide upon a measure which would, among other things, allow, at
least to some degree, the capturing of more nuanced, as opposed to categorical perceptions.
Thus one driving question underlying this research does not just focus on "Is any given
speaker (X) from Malta or not?", but rather, "Is any given speaker (X) more/less readily
identifiable (than a given example) as coming from Malta?". In particular at this earlier stage
of the research, it was important to identify measures which were more likely to encourage
participants – in this case, native MaltE listeners – to represent their estimations on a scale
which could capture some level of variation in their perceptions, which in turn, could also
throw light on some of the more identifiable patterns of variation in the production of
predetermined features (see Section 4.2.1 for more on these) employed by the speakers.
Magnitude estimation (henceforth ME) was considered to be one such measure offering
such possibilities, as is further explained in Section 3.3.1 below.
3.3.1 The rationale for magnitude estimation
The somewhat uncomfortable idea that linguists or language experts themselves fall
prey to the same type of preconceived notions as the rest of the population when discussing
such ill-defined ideas as ‘acceptability’ or ‘grammaticality’ has been addressed a few times
by linguists in recent years. The overall resulting conviction is that theories based on one
linguist’s concept of acceptability should at least be balanced by mirror studies that collect
‘real’ language examples, or that get naïve listeners to balance out expert listeners in their
judgements of data (for example, Dąbrowska, 2010). This argument has its roots in
Chomsky’s distinction between competence and performance, which, when extended to the
present concern, would allow a linguist, as an informed judge, but also as a competent
language user, to rely on their own language resources and judgments to arrive at a
conclusion on what is an acceptable or grammatical feature of a language or languages. Such
an argument of course holds true, to a large degree. Nevertheless, recent years have seen
increased questioning of the validity of such a position, given that “We must, of course,
acknowledge that judgments can be, and often are, influenced by extragrammatical factors”
(Dąbrowska 2010: 5). That language experts may still – if to a lesser extent – be prey to
“extragrammatical factors” is captured in a neat quip by Featherstone (2005), cited in
Dąbrowska (2010: 4): “if data supports my theory it must be Grammatical, if it supports your
theory it is just markedness”.
In Malta, the concern over the validity and robustness of the views and judgements of
any one expert listener may well be magnified by the uncertain nature of the status that
English currently holds throughout the islands. In the first place in Malta, we seem uncertain
about whose English to use as a model (our own? Standard British? American?), and we are
also unsure about whether we acquire English as a second language, learn it as a foreign
language or even speak it as a mother tongue. It is appropriate to reiterate here that the
current context for studying the English language as it is used in Malta today, is one of
pioneering and principled, but isolated, studies, coupled with a generally widespread feeling
– as evidenced by newspaper articles and conversational asides – that the English language in
Malta faces an uncertain future and anyway, we are not quite sure what type of English it is
that we do speak or should speak. There is not enough research done or broadly disseminated
to counteract the effects of language choice as a highly politicised topic, where polarised
attitudes are often adopted in reaction to variation at different points on the continuum.
In this context, it was felt that any form of judgment task on perceptions would do
well to shift the focus away from eliciting attitudinal views. There are now, in any case, some
established studies in this area (for example Caruana, 2007; Sciriha & Vassallo 2006) which
have examined some aspects of attitudinal positions regarding language usage in Malta (see
also Sections 1.1 or 2.2.3). The focus here, rather, is to consider the perceptions of the
linguistic patterns which are in use and which could be said to circumscribe MaltE, without
encouraging too much of a concentration on the attitudes or preformed concepts regarding
what they are listening to. It was thus considered relevant, at this stage, to capture perceptions
of patterns while avoiding any further inference of associated levels of education, social strata
or other indexical information. The effort, here, was to sidestep the possible issues relating to
subjective views of what might constitute 'good' or 'bad' English (see Chapter 2 for more on
this), and concentrate instead on those views – still subjective, but less at risk of pejorative
judgment – regarding the parameters of what can and what cannot be readily considered
identifiable in the patterns of English of a speaker coming from Malta (i.e. in Maltese
English, or "MaltE"). It was also important to bear in mind that "it is well-known in social
and cognitive psychology that behavioral responses to stimuli require reference and
comparison to a standard, either internal or external. If a benchmark is not provided by the
experimenter, then the participant must rely on his or her own internal standard which may
shift over the course of the experiment." (Clopper & Pisoni 2005: 321).
Magnitude estimation (ME) was introduced by Stevens in the 1950s, in the fields of
psychology and psychophysics, to measure the estimated strength of a perceived force or
power (Bard, Robertson & Sorace 1996), and more recently it
has also been adopted as a tool for measuring judgment and perception in linguistic data.
Bard et al. (1996) were the first linguists to discuss the benefits of using ME to capture, in
particular, the notion that there is often a continuum of acceptability in language, rather than
a clear distinction between two opposing features of ‘acceptable’ or ‘unacceptable’ (Bard et
al. 1996: 33). This particular focus on a continuum or a gradience in perceptions of
acceptability resonates particularly well with the current research, in which variation is more
about relativity and gradience, and less about categorically neat classifications.
Typically, participants in a ME study are presented with a control item, or ‘modulus’
(for linguistics this would most likely be a sentence, utterance or other extract of language),
and asked to give it a score, against which they then score each subsequent item. So if the
modulus has been assigned a score of 100, an item perceived as twice as strong (or
acceptable, or grammatical, in linguistic terms) will be scored 200, three times as strong 300,
and half as strong, 50, and so on. This will allow for the possibility that “With no preemptive
limitation of the measurement scale, the tension between relative and absolute measurement
is lost as subjects build a whole scale by means of relative judgments.” (Bard et al. 1996: 65).
More recently, it has been suggested that an alternative to the fixed modulus is to either have
the modulus randomised from among the list of extracts to be judged, or to have no fixed
modulus, but instead to use the score of the previous extract as the basis for the subsequent
judgment. In all the variations of test design, however, the main theme is focused on the
measurements of the intervals between two judgements on a scale, rather than on the
judgment scores themselves.
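The arithmetic behind these designs can be sketched in a few lines of Python. This is a minimal illustration only, not the analysis procedure used in this thesis: the function names and the example scores are hypothetical, and the base-2 logarithm is simply one convenient way of making 'twice as strong' and 'half as strong' symmetric around zero.

```python
import math

def ratios_to_modulus(modulus_score, item_scores):
    """Express each raw ME score as a proportion of the listener's modulus score."""
    if modulus_score <= 0:
        raise ValueError("ME scores must be positive")
    return [score / modulus_score for score in item_scores]

def ratios_to_previous(scores):
    """Variant with no fixed modulus: each score is compared with the one before it."""
    return [later / earlier for earlier, later in zip(scores, scores[1:])]

# Hypothetical listener who scored the modulus at 100
raw = [200, 300, 50]
fixed = ratios_to_modulus(100, raw)         # twice, three times, half as strong
log_fixed = [math.log2(r) for r in fixed]   # doubling gives +1, halving gives -1
sliding = ratios_to_previous([100] + raw)   # each item relative to the previous one
```

A second hypothetical listener who scored the modulus at 10 and the same three items at 20, 30 and 5 would yield identical ratios, which is the sense in which the proportions, rather than the raw figures, carry the information.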
In this way, and unlike in fixed n-point rating scales, it is the intervals and proportions
of strength (or grammatical acceptability, in linguistics) that are measured. This ultimately
results in measurements which are purported to have more statistical meaning, in the first
place, and the resulting interpretations are also likely to be more nuanced than they would be
following the more categorical rating scales. For while in both ME and binary or n-point
rating scale formats the numerical values might be considered artificial, ME then
concentrates on the interval between the given figures, in other words, it concentrates on the
proportion or the magnitude, so to speak, of divergence from the modulus, and not on the
numerical values themselves. In the ‘validation phase’ of the first experiment using ME to
judge acceptability in a range of sentences, Bard et al. (1996) also introduced cross-modality
matching to confirm that even if participants were wary of the consistency of elicited
judgments, “however unprincipled it seems to them, the spacing of judgments by our subjects
is no matter of whim, but a reflection of intuitions on which they can draw repeatedly.” (Bard
et al. 1996: 54), and again they reiterate that, "even simple informal exercises in magnitude
estimation do yield judgments which are worth pursuing, because we have reason to believe
that judges will be self-consistent and will perform like other judges" (Bard et al. 1996: 65).
Since its introduction as an experimental measure for degrees, or gradients, of
grammatical acceptability, ME has also come under strong criticism both in relation to the
relative complexity of its design, as well as the nature of its results. To consider the latter
point first, Weskott and Fanselow (2008, 2011) for example, have repeatedly argued that
while ME results do indeed generally fall in with the patterns of results yielded by other
traditional scales, such as the Likert-type n-point scale, the drive for finer detail in allowing
judges to determine their own scales and numerical values can lead to "more spurious
evidence" (Weskott and Fanselow 2008:431). The concern here is that this could compromise
the statistical power of any results, even if it did still yield more fine-grained detail.
Additionally, the compromise between detailed nuance on the one hand, and reduced
statistical power on the other, would not necessarily be a compromise worth making, given
that other interval scales can still in fact present substantially detailed results of responses to a
task without compromising the robustness of a statistical analysis. At the very least, it is
possible that "there is no difference in the amount of information linguists can draw out of an
experiment using ME judgments versus using binary or seven-point judgments" (Weskott and
Fanselow 2011:250).
Further criticism relates more closely to the ME experiment design. The concern here
is that the relative complexity involved both in the setting up and in the conduct of such an
experiment may impact on the ability of participants to provide reliable results. Sprouse
(2011), for example, has pointed out that more recently, "the field of psychophysics has
systematically questioned whether participants can actually perform the cognitive task asked
of them by the magnitude-estimation procedure" (2011:274). He reports on two experiments
carried out in the related task of magnitude production, which together suggested that
"participants are able to provide meaningful ratio judgements of loudness, but are unable to
report those judgments in a mathematically meaningful way" (2011:279). Sprouse's own task
was to test the same concerns with respect to grammatical acceptability judgments, and
amongst other issues, he reflected Weskott and Fanselow's (2011) position that grammatical
acceptability might not necessarily be interpretable on a continuum of gradience.
Accordingly, it would not be possible for participants to reflect grammatical acceptability in
ratio judgments.
In the light of these concerns, it was important to establish whether ME could still be
considered a useful measuring tool for the relevant judgment tasks at hand. The above
discussion relates exclusively to the matter of ME being used in linguistics to capture the
measure of grammatical acceptability, where this is perceived as a gradient. Whether or not
grammaticality is interpretable by measures of gradience has been questioned and this
uncertainty is reflected in the discussion on whether or not ME can be considered a useful
tool for measuring grammatical acceptability. This, however, is not the main focus of
concern in the present research.
The current task in this research is concerned with capturing the perception of
identifiably MaltE patterns of variation, rather than degrees of grammatical acceptability. In
the current research, native MaltE listener participants' perceptions were to be captured as
they were asked to identify a number of speakers along a continuum according to whether
those speakers sounded more or less Maltese to them. This differs in one important respect
from a task aiming to capture empirical evidence of grammatical acceptability across a range
of structures. The current judgement of identifiability in MaltE is investigative, rather than
conclusive in nature. The aim here is to gather as much evidence as possible relating to the
pegging of predetermined characteristics to patterns of perception. Thus, the task does not
seek to provide empirical evidence in support of or against, perhaps, a more introspectively
identified hypothesis. The intention is instead to use ME to provide an inclusive framework
of measuring what listeners judge on a continuum of variation, to be identifiable, in a variety
of English. There is therefore no reference to be made to any objective point of reference, in
the same way that there may be with the notion of 'acceptability', for in the first place, even
the notion of what was indeed identifiable needed questioning further. The notion of
'identifiability' itself is also fundamentally different from 'acceptability' in that identifiability
cannot be divided into a series of discrete items, as, say, tokens of grammaticality can be.
Thus, while the traditional n-point rating scale may have been considered to suffice
for the purposes, it seemed inevitable that some valid information would have been lost in
two important respects. Firstly, the numerical values 1 through to 5, or 7, or 9 do not have a
clearly interpretable meaning, both in themselves, as well as in relation to each other. For
example, my interpretation of '3' might be different to another person's interpretation of '3'.
Secondly, as an extension of the previous argument, the distance between my '1' and '3' in
perceptual terms, may again be different from the next person's perception of the same
distance. Thus, in trying to capture a listener's perception of 'identifiable MaltE' using an n-
point rating scale, analysis would concentrate on broad categorisation of what is perceived to
be identifiable and what is not, but it would not be able to attach any meaningful
interpretation to the rates entered by the participants and their actual perceptions of
identifiable MaltE. At the analysis stage, such categorical numbering would undoubtedly be
much easier to collate, and the results may well provide a more or less appropriate reflection
of listeners' judgments and a true picture of what is perceived as identifiably MaltE.
However, at the experiment stage, in the process of circling a number between, for example,
one and seven, the more categorical nature of such scales would have limited the listeners to
a predetermined map which cannot easily be converted into any meaningful results.
Conversely, it was expected that it would be possible for me to draw out some broad
categories relating to a range of identifiable features, from listeners' ME scales where the
focus is on the interval, rather than on a category. In other words, I suggest that more
valuable information can be obtained from allowing listeners a freer rein in trying to reflect
their own estimations numerically, than if a predetermined set of categories were imposed,
even if I ultimately intend to then categorise these data at a later stage. The advantage here, is
that even if the resulting data are, to some degree, more detailed than necessary, it is better to
have more detailed data that can then be categorised, than to have listeners unable to reflect
the finer points of any judgments they might wish to make.
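As a rough illustration of how detailed ME data could be reduced to broad categories after the fact, the following Python sketch bins ratio-to-modulus values into three groups. The category labels, the `categorise` function and the `tolerance` band are all hypothetical choices made for this example; they are not the categories or thresholds applied in the analysis reported later.

```python
import math

def categorise(ratio, tolerance=0.25):
    """Place one ratio-to-modulus value into a broad category.

    `tolerance` is a hypothetical band, in log2 units, within which a
    stimulus is treated as roughly level with the modulus.
    """
    distance = math.log2(ratio)
    if distance > tolerance:
        return "more identifiable than modulus"
    if distance < -tolerance:
        return "less identifiable than modulus"
    return "similar to modulus"

# Hypothetical ratios for three stimuli judged by one listener
ratios = [2.0, 1.1, 0.5]
labels = [categorise(r) for r in ratios]
```

The reverse step is not available: once a listener has been restricted to a fixed set of categories, the finer proportions cannot be recovered from the category labels.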
ME was considered to offer this additional bonus by virtue of the fact that the ensuing
scores allocated by participants could present more fine-grained information regarding the
perception of what sounds 'identifiable' for MaltE. To quote from ME's stronger critics, it was
expected that at worst, the level of detail afforded by an ME task would be "spurious" and
"there is no difference in the amount of information linguists can draw out of an experiment
using ME judgments versus using binary or seven-point judgments." (Weskott & Fanselow,
2011:250). While this may well be the case, I would reiterate here that in fact it is the placing
of the judge (here, listener) in full control of the measuring tool at their disposal that gives
ME an edge over other scales, in this particular context. In other words, it is an advantage in
ME that participant judges are allowed free rein, even if this might involve more risk of
yielding results which are less easily controlled for or indeed, manageable.
The second criticism levelled at ME concerns its design and the fact that it is
dependent for its successful interpretation on the ability of participants to provide
mathematically meaningful results. Again, the essence of this criticism is made with
reference to the measuring of grammatical acceptability judgments, which to a large degree,
are still expected ultimately to conform to and confirm a fixed or itemised list of previously
determined structures ranging from acceptable to completely unacceptable. This is not the
case in the current research, where, in fact, there is no clear consensus, as yet, over what
constitutes identifiable MaltE. It is precisely the nature of the issue at hand, that in the face of
such fluid parameters, it was more important to first include the broadest and most inclusive
range of patterns or features evidenced in individual speakers, before later trying to weed out
some of those aspects of language usage which might transpire to be less salient, in this
regard.
Thus, while it seems that ME is in many respects not exactly the "'gold standard' in
acceptability-judgment literature" (Sprouse 2011:274) that it was first hoped to be, its
potential to yield results indicating both general trends and also more individualised and fine-
grained feedback, is still something to be reckoned with. With particular reference to the
current research, the fact that it does not concern itself with grammatical acceptability, but
rather, with a continuum of variation, meant that many of the concerns with ME as an
appropriate measure for grammatical acceptability did not apply in this case.
In conclusion, it was decided that in spite of the concerns highlighted above, ME
offered important advantages that other measures could not quite match. Chapter 4 will
describe in detail how ME was adapted to be used with auditory, as opposed to visual cues
for the purposes of the study of perception and judgment patterns in language variation. The
chapter will also describe in full the methods and procedures adopted for the two studies
forming the focus of this thesis, designed to research patterns of perception, patterns of
production, and the tentative understanding of the relationship between the two, as far as
MaltE is concerned.
4 Methods and Procedures
That gradability or variation within the variety MaltE exists, even without reference to
an 'exonormative' (Kachru 1990, 1992) or generally accepted 'Standard' of some sort, has
been noted in works (Borg, 1980, 1986; Camilleri, 1992; Vella, 1995, for example)
concerned with language usage in Malta (see also 2.1.2 for a closer consideration of the
notion of a cline). However, the issue of beginning to define where such variation might be
seen to begin and end in MaltE, remains largely untouched. In other words, if we considered
the variation within the variety as a continuum, then at which extremes on either end of the
continuum would the variation cease to be perceived of as identifiable of this variety which is
referred to as MaltE?
Such a question focusing solely on the measuring of perceptions could easily become
the subject of a full scale study in its own right and it became apparent that a decision would
have to be taken as to whether or not to allow this issue to dominate the research.
However, it was also still considered important to study the most salient of those linguistic
features which appeared to trigger these perceptions in the first place, with the intention of
understanding more fully, the notion of variation within the variety, from the point of view of
a linguistic analysis.
Two related studies were therefore devised, with each having a different emphasis on
perception or production patterns, while still ensuring that the relationship between the two
patterns remained paramount. In the first study (Study 1, henceforth), the focus was on the
use of Magnitude Estimation (ME) as the measuring instrument, and on seeking feedback and
an informed opinion from participants, regarding both the use of ME, and the issue of
identifiably MaltE patterns of variation as described in more detail in Section 4.1. The second
study (Study 2, henceforth), also focused on obtaining judgments on the perception of
identifiable MaltE using ME as the measuring instrument, but it also focused on an analysis
of production patterns in six native MaltE speakers, for five predetermined characteristics
noted to be particularly salient in the identification of MaltE patterns of variation (see Section
4.2.1 for details of their selection).
As the research consists of two studies, this chapter is divided into two main sections.
Section 4.1 and its subsections, is devoted to the methods and rationale for Study 1, while
Section 4.2 and its subsections, describes the same for Study 2. Finally, Section 4.3 presents
an overview of how the data resulting from Studies 1 and 2 were managed, annotated and
analysed, together with details of references to further subsections throughout this thesis
which treat in more depth all the various stages concerned with the findings and conclusions
established.
4.1 Study 1: overview and rationale
Study 1 was devised in part to test the efficacy of ME as the appropriate measuring
tool for a perception study in which more introspective judgments, as opposed to more
consciously held attitudes, were to be sought. The other aim for Study 1 was to seek the
informed opinion of the listener participants regarding the judgments they made about each
speaker. Thus, participants in this first study were required to act both as genuine
participants, namely as native MaltE listeners, but also as commentators, or critics, of Study 1
itself. Participants for this study were therefore recruited as 'expert' listeners, defined further
as native Maltese (see Section 4.1.3 below), having a good knowledge of Malta and its
linguistic context, and of language and linguistic study and analysis, more generally.
The auditory stimuli for the ME task to be presented to the above expert native MaltE
listeners were to represent native MaltE speakers, and since it has now been established that
variation also exists within MaltE (see Sections 2.1.2 and 2.2.2) it was crucial that a healthy
range of variation of MaltE was represented across the chosen speakers (see Section 4.1.1 for
detail). As far as the execution of the ME task itself was concerned, a number of decisions
needed to be taken in order to ensure a smooth re-adaptation of this measurement for the
purposes of audio stimuli. These are elaborated upon below in Section 4.1.2. It was, however,
also expected that the expert listeners would be able to comment on aspects of the use of ME
in order to allow further fine tuning for its use with the naïve native MaltE listeners in Study
2. These are reported on in detail as part of the data analysis in Chapter 5, Sections 5.1 and
5.2, and they are also referred to in Section 4.3 as the conclusions drawn from the results and
feedback of Study 1 were used to inform the design of Study 2.
4.1.1 The speakers in the audio stimuli for Study 1
The development of the Malta component of the International Corpus of English
(ICE-MTA), under the direction of Manfred Krug and Michaela Hilbert at the University of
Bamberg has helped to put MaltE on the map of World Englishes, and it was considered an
excellent starting point in the attempt to identify patterns of variation in MaltE. The ICE
corpus currently contains available data for some 14 varieties of English largely in both
written and spoken formats, thus offering a healthy range of contexts and media for studying
each variety.
The speakers for Study 1, designed in part to confirm some of the more salient
characteristics which might trigger the perception of readily identifiable MaltE, were all
drawn from the radio and/or lecture broadcasts in the spoken component of ICE-MTA. It was
not possible at the time of this study, to draw on any examples of more spontaneous
dialogues as these often consisted of stretches of speech which were too short for the task
required of the listeners (see Section 4.1.3 below). Nevertheless, the speakers were all chosen
as exemplars of MaltE patterns, and between them, the resulting audio stimuli presented
instances of variation relating to a broad range of features, as identified in previous research,
and described above in Section 2.2.2. There was a concern that the choice of lectures/radio
broadcasts would limit the register to a formal one, and this could influence the efforts of
speakers to adhere more closely to perceived standards of correctness than they might
otherwise have done in more spontaneous speech. Efforts to mitigate this included ensuring
that the clips were taken from the middle or towards the end of a recording where it was
expected that speakers would have relaxed into their role and the recording environment (see
Section 4.1.2 below). Furthermore, it was also expected that variant patterns at the
phonetic/phonological levels would be more difficult to fully control, and since these are the
domains in focus, the compromise in favour of clarity and more extended stretches of speech
was preferable to choosing clips which were either too short or not easily understood due to
background noise. Audio stimuli from native MaltE speakers were therefore obtained from
the ICE-MTA corpus (Hilbert & Krug, 2010).
4.1.2 Preparing audio stimuli for ME
11 audio clips were chosen from ICE-MTA and were uploaded from the corpus to the
Praat software system for speech analysis (Boersma and Weenink, 2013). The first of the 11
clips of the audio stimuli for the listening task was to act as the control or ‘modulus’ (see
section 3.3.1 above for more on this) to be played once at the beginning of the exercise, and
then repeatedly, interspersed by each of the remaining 10 clips. Among the final choice of
audio stimuli, the modulus was identified, following the adaptation of ME for linguistic
estimations in Bard, Robertson and Sorace (1996), as one of the speakers who could be
considered mid-range on the continuum of variation for MaltE. In other words, the speaker
used as the modulus presented to some degree, some characteristics in MaltE, as identified
above in Section 2.2.2, used with varying frequency. Thus some of the said characteristics
were neither totally absent, nor abundantly frequent. A caveat is offered here, however,
following the discussion concerning the relationship between native listeners and their
perceptions of native speakers (see Sections 3.1.1 and 3.1.2), as it is possible that what might
be considered relatively 'neutral' or 'mid-range' as a modulus for one native listener (in this
case, myself), might not be considered in the same way by another listener. This may be
especially true given the ambivalent climate surrounding the discussion of a new variety of
MaltE (as described in Sections 1.1 or 2.2.3). Nevertheless, it was felt that the advantage of
ME lay also precisely in its concentration on the subsequent interval measures of the estimate
for the modulus, made by the listener, compared with all other estimates for subsequent
stimuli, and it was expected that this would address any potential disagreements over the
choice of a modulus. The same argument is applied also to the choice of modulus in Study 2
(see Section 4.3.2), following discussion with the expert listeners about this issue (elaborated
further in Section 5.2.1). Bearing this concern in mind, however, every effort was made to
ensure that the stimulus chosen as the modulus was not likely to be highlighted as either
highly identifiable as MaltE speech, or completely unlikely as a MaltE speaker. It was also
decided that the modulus would be fixed a priori to one specific stimulus, the first one, for two
reasons. In the first place, this task was already testing a new task design, in that it was
applying ME as an estimate of a participant's perceptions of an audio stimulus. In the second
place, it was expected (and this expectation was borne out), that ME would be a novel form
of measuring judgments for the vast majority of the participants. These two points together
led to the conclusion that it would be important not to introduce too many new elements into
the presentation of a measuring scale which was itself to be considered unfamiliar and a
novelty.
The remaining 10 clips to be used as audio stimuli and to be estimated in relation to
the modulus, were concatenated in Praat, with each clip played once, then repeated once
following a 5 second interval, after which the controlling clip was played once more, and so
on, until all 10 clips had been concatenated. This process would allow the listener some time
to observe, think, record observations on a tasksheet and then check back with the controlling
clip (the ‘modulus’), before moving on to a new clip, but it would not allow the listener to
revisit each clip at will.
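The order of presentation described above can be sketched as follows. This Python fragment illustrates the sequencing only: the concatenation itself was carried out in Praat, and the labels used here (`modulus`, `clip01`, `pause_5s`) are hypothetical names introduced for the example. Any further time left for note-taking is not specified in the description above and is therefore not modelled.

```python
def build_sequence(modulus, clips, pause="pause_5s"):
    """Return the playback order: the modulus once, then, for every clip,
    the clip, a 5-second pause, the clip again, and the modulus again."""
    sequence = [modulus]
    for clip in clips:
        sequence += [clip, pause, clip, modulus]
    return sequence

clips = [f"clip{n:02d}" for n in range(1, 11)]  # the 10 stimuli to be judged
order = build_sequence("modulus", clips)
print(order[:5])  # ['modulus', 'clip01', 'pause_5s', 'clip01', 'modulus']
```

Because the sequence is fixed in advance, every listener hears the modulus immediately before each new stimulus, and no listener can return to an earlier clip.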
The length (20 seconds) of each clip, repeated once, was tested by myself and two
other linguists in order to establish that it allowed enough time for an expert listener to make
a judgement, without extending the opportunity to agonise or even theorise over decisions
made. This was considered an important element of the design of the study, in order to
contain somewhat the tendency of an expert listener to revisit and rethink patterns and
evidence. For while the whole point of obtaining the judgement of expert listeners rather than
naïve listeners was to extract an informed opinion, it was still thought necessary, at this stage,
to ensure also that such judgements did not give rise to extended and possibly unwieldy
discussions of the data, or result in a participant changing their original estimation too
radically.
It was estimated that the whole procedure would take a maximum of forty minutes,
with the bulk of the time – 30 minutes – assigned to the cycle of listening, complete with
repeats and pauses, to the audio clips.
4.1.3 The expert listeners in Study 1
Trained linguists and language experts familiar with Malta, and much of whose
research concerns Maltese, English and MaltE, were recruited for this pilot study. The experts
in question were all of Maltese nationality, and were all personal colleagues and friends who
agreed to participate and provide feedback both on the efficacy of ME as a tool, and more
centrally, in relation to identifying which of the features present in the speakers in each of the
stimuli might have triggered the initial perception of more or less identifiable MaltE. A total
of nine experts were recruited, and the study was carried out by meeting with each one
individually, setting aside a maximum of 45 minutes for the task's running time. All nine
experts work in linguistics and language fields both in Malta and abroad.
Seven of the nine experts also agreed to offer remarks and feedback explaining their
reasons for their perceptions and this was considered to be a key feature of this study. It has
already been noted that Malta's current language context is such that it is anticipated to be
difficult to obtain even moderately unbiased judgments about MaltE speaker patterns where
these are overtly elicited. However, it was expected that experts' familiarity with Malta's
linguistic context, as well as their ability to suspend attitudinal reactions to speaker patterns
in favour of more linguistically objective observations, would serve to surmount this issue
enough to obtain a more precise understanding of which features or patterns might be
prompting perceptions of the identifiable in MaltE.
4.1.4 Further implications resulting from Study 1
Many of the concerns raised above in section 3.3.1, pertaining to the nature of ME
(Magnitude Estimation) as a measuring tool, and the risk that it may be excessively complex
for participants to grasp easily enough in order to produce reliable results, did not appear to
be an issue here, as the data analysis in Sections 5.1.2 and 5.2.1 will expand to show.
Feedback on the execution of Study 1 itself was also solicited (reported in Section 5.1.2), and
this was taken into consideration in the preparation of Study 2. One useful aspect of the
feedback concerned the length of each audio stimulus and the repetition of the modulus clip
before each new stimulus to be judged. The length of each stimulus from the chosen speakers
was ultimately considered unnecessarily long, and it was also reported that the repetition of
the modulus clip could perhaps be left for the listener to control. While the extended length of
each clip was useful in Study 1 in order to allow for the opportunity to observe (hear) and
provide feedback on, as many features as possible, this was not necessary for Study 2, where
participants' impressions of what triggered their perceptions were not going to be solicited. On
the contrary, a more intuitive and less studied reaction was to be sought in Study 2. The
analysis and findings of Study 1 are presented more fully in Chapter 5, and will demonstrate
that ME can yield rich, nuanced and measurable information from data, together with another
type of result, indicating general trends in the perceptions of what is identifiable for MaltE.
Until this point, therefore, I have considered previous literature describing MaltE,
together with contemporary suggestions and informed feedback made available through the
pilot study described above in Section 3.3.2. Together, these sources have helped me to focus
further study on what can be considered identifiably MaltE, in terms of phonetic and
phonological variation, with respect to a number of salient characteristics.
4.2 Study 2: overview and rationale
In contrast with Study 1, Study 2 had a broader scope, while the target language under
focus both for production and perception patterns was rather more controlled. In this study,
one of the central aims was to begin an exploration of the relationship between what native
MaltE listeners perceived as identifiably MaltE, and what might be some of the patterns
produced by native MaltE speakers that could trigger such perceptions. Although initial
intuition combined with some persistent pointers in previous literature (Delceppo, 1986, on
vowel space; Vella, 1995, on prosodic features; Debrincat, 1999, on ranges of variation at the
segmental level; Bonnici, 2010, on rhoticity, among others) led me to focus on potentially
salient patterns of variation at the phonetic/phonological levels, it was not possible to single
out any one of these (or any other) features as the best one to focus on in detail
in identifying a solid link between perception and production. Thus it was crucial, at this
stage, to cast the net a little wider, and to bear in mind the concept of a "pool" of competing
features (Mufwene, 2001; Schneider, 2007; Cheshire et al., 2011, among others).
This study was therefore devised to explore the feature pool for MaltE, by focusing on
a number of characteristics, or features, which have so far presented themselves as likely
candidates to be used by native MaltE speakers, and to be picked up on by native MaltE
listeners. Section 4.2.1 below explains the rationale for choosing five particular
characteristics of the MaltE feature pool, and its subsections describe the context of study for
each of these characteristics in detail. Sections 4.2.2 and 4.2.3, and subsections, describe the
procedures for recruiting speakers for Study 2, and the design of the tasks used for their
recordings, respectively, while Section 4.2.4 gives details about the recording procedures
themselves. Finally, Section 4.3 and its subsections explain how the resulting data from both
the native MaltE listeners in their ME Perception study, and the native MaltE speakers in
their production task, were collated, annotated and prepared for analysis.
4.2.1 Identifying five salient characteristics in MaltE
A key premise of this thesis has been that variation in MaltE is present in a range of
characteristics, not necessarily in just one, and it is also patterned and systematic. As such, in
turn, systematic patterns of variation trigger the perceptions in native listeners of this same
variety in such a way that they can readily identify a native speaker of MaltE (see also
Section 3.1.1). Exactly which aspects of variation, and which linguistic features, might be
triggering such perceptions in the first place, remains to be studied further, and this thesis
seeks to examine this issue more closely.
In order to be able to do this, it was necessary to begin with some of the linguistic
features which appeared to be the most promising candidates, as it were, to trigger such
perceptions in the first place. Previous research on this matter proved a valuable point of
departure for this, as there is now a fair amount of discussion in the literature which
highlights, in particular, variation at the phonetic and phonological levels of analysis as a key
feature of MaltE. The key works dealing with this aspect of variation have been presented
more fully in Section 2.2.1 above, and are referred to again in the context of Study 1 later in this
section. As discussed in Section 2.2.1, variation at the phonetic/phonological levels
has been noted to be widespread in MaltE, and has also been identified to be prevalent across
the continuum of variation within the variety of MaltE (particularly in Vella, 1995).
Furthermore, studies carried out so far (notably, in Delceppo, 1986; Calleja, 1987; Vella,
1995; and Debrincat, 1999, as described more fully above, Section 2.2.1) have homed in on
variation related to vowel quality, vowel duration, and patterns of realisation of prominence.
Variation has also been noted in studies on other segmental features, including the
realisation of /θ/ and /ð/ as plosives, final consonant devoicing, or rhoticity (Delceppo, 1986;
Mazzon, 1992; Bonnici, 2010). These studies were instrumental in confirming that there was
potentially a range of variables to consider in the variation of MaltE, and that some of the
most salient characteristics of this range of variables included those related to the realisation
of prominence and to issues related to duration. At this point, however, it was important to
narrow the field in such a way as to allow for the possibility of a range of characteristics to be
studied, and not just one characteristic. This latter point was crucial to this study in its aim to
explore if, or how, a range of characteristics might interact in the identification of MaltE. At
the same time, it was not possible to test all of the features listed that have been referred to or
studied to date within the remit of this research. In order to home in further on which
characteristics might be more salient than others in the triggering of perceptions of MaltE, I
therefore tested out a broader range of features and characteristics on the group of expert
native MaltE listeners participating in Study 1.
One task presented to the expert listeners in Study 1 (see Sections 4.1.3 and 5.2.2) was
to elicit from them which features, or characteristics, heard in the audio stimuli of 11
different MaltE speakers, might lead them to judge each speaker to be unequivocally Maltese.
Although a focus on the phonetic/phonological interface had by this point been determined,
and a range of features had been included across the stimuli presented (see Section 4.1.1), the
expert listeners were not guided specifically to limit their own observations to features within
these two domains. It has already been pointed out that phonetic, and also phonological,
variation in MaltE is widely observed across the full continuum of the variety, so it was
expected that expert listeners would more readily identify characteristics within these
domains even without prompting. Furthermore, it was not my intention, at this point and with
these participants, to highlight features in a domain which may or may not have been relevant
to each expert listener's area of expertise. For example, while the expert listener cohort
included at least one phonetician, other expert listeners' expertise included areas as diverse as
semiotics, morphology or semantics, and it was important for each expert to understand that
they had been recruited for their general orientation towards language study, not for their
expertise in language study. Thus, while offering reasoned and more objective feedback,
these experts would still be broadly identifying the most salient features that attracted their
attention. Broadly, it was expected that many of these salient features would fall within the
phonetic/phonological domain.
The full set of feedback remarks obtained from Study 1 is analysed fully in Section
5.2.2 below, while this section presents an outline of this feedback in order to clarify how this
study helped to inform the choosing of five characteristics to focus on in Study 2. The table
below is a summary of all the remarks and comments elicited from the 9 expert listeners
during the pilot study. The three categories represent the three broad domains which capture
all of the comments, expanded upon more fully in Section 5.2.2 below, and in Appendix B.
Table 4.1 A summary of native MaltE expert listeners' observations
PHONETICS/PHONOLOGY: Some use of intonation less typical of MaltE; noted difference (undefined) in intonation or "phrasing" (= rhythm and intonation); stress patterns suggest Maltese dominance, lack of vowel reduction e.g. "operationAL", "outsidER", or lengthened vowels in "amphIbious"; vowel quality and variant vowel realisation; Post Vocalic 'r', word final devoicing; stops instead of interdental fricatives.
MORPHOLOGY/SYNTAX: Word order; variant use "would" and progressive tense; and one case of switched relative pronouns.
DISCOURSE/PRAGMATICS: Reading style; filled pauses/discourse markers typical of MaltE e.g. "ehm", hesitation.
As Table 4.1 above indicates, many of the features noted fall into the
phonetics/phonology category, while there are noticeably fewer in the other two categories.
This may be due to the choice of clips taken from the ICE-MTA corpus, which, as noted
earlier (see Section 4.1.2), favoured clarity, sufficient length and a broad spectrum of
speakers on a MaltE cline, over spontaneity. This meant that most audio clips were extracted
from interviews or lectures, where speakers may well have been more vigilant about, or careful with,
their language use. The issue of register has already been noted and will be discussed further
in 6.4.2, but here it seems to support the intuitive claims that variation at the morpho-
syntactic level might be less prevalent across the cline of MaltE, and may instead be more
expected at specific points along the continuum. Another possibility is that variation at the
morpho-syntactic level in MaltE differs according to specific speech communities, rather than
operating on a cline across the entire range of MaltE speakers. For example, variant verb
tense patterns (he had been staying up all night) might be frequent in one subgroup of MaltE
but not in another, while sentence final "but" (I told him, but) might be confined to a different
subgroup. Conversely, comments regarding aspects of phonetic/phonological variation
recurred across all 11 speakers presented for judgment, as summarised in Table 4.1 above.
A close consideration of previous literature, particularly in Delceppo (1986), Vella
(1995), Debrincat (1999) and Bonnici (2010), reveals that vowel quality, as well as vowel
duration, (Delceppo, 1986; Debrincat, 1999) is one range of characteristics which stands out
in the identification of MaltE. Similarly, variation in the realisation of particular segments
also seems to be quite persistently singled out, especially for interdental fricatives /θ/ and /ð/,
or for post-vocalic (r) (Debrincat, 1999; Bonnici, 2010). Again, features at the prosodic level
were highlighted by Vella (1995), who also drew attention to the issue of timing and variation
in the realisation of vowel segments as a contributing factor to the rhythm patterns of MaltE.
This, therefore, is the point at which I could home in on the five characteristics described
below and more fully in subsequent subsections. I took the decision to focus on more than
one or two characteristics in order to test the notion of the feature pool for MaltE (see also
Section 1.1). At the same time, I needed to decide on a manageable number of characteristics
for further analysis which would also capture as many examples of meaningful variation as
possible. The following five characteristics listed in Table 4.2 below were chosen with the
above concerns in mind.
Table 4.2 Five variables chosen for analysis
The use of post-vocalic (r) forms (rhoticity): variable (r)
Substitution of interdental phonemes /ð θ/: variable (th)
Variant realisations of the phoneme /æ/: variable (a)
The use of full vowels as opposed to schwa: variable (schwaØ)
Aspects of rhythm: variable (PVI V.dur) (Pairwise Variability Index for Vowel Duration)
The substitution of interdental fricatives, listed in Table 4.2 above as variable (th), was
considered worth focusing on as it is suspected that wholesale substitution is not what is
happening here. Rather, the details of how “target” interdental fricatives are
realised constitute a grey area involving more variation than might be expected given both the
foregoing literature and, indeed, anecdotal evidence (see Section 4.2.1.1). The presence or
absence of post-vocalic (r) was chosen for the same reason, where it is expected that rather
than total rhoticity or non-rhoticity, we have another grey area of variation as Bonnici (2010)
also suggests (see below Section 4.2.1.2). One persistent reference to vowel quality is made
throughout the literature in relation to the realisation of the HAT vowel /æ/ (Delceppo, 1986;
Vella, 1995; Debrincat, 1999), so this was chosen as a good candidate to focus on in so far as
vowel segments go, together with schwa /ə/, and its absence or presence. /ə/ was also seen as
an important contributing characteristic to perceptions of potential variation in rhythm
patterns too (Vella, 1995), so this vowel segment was also considered useful to study in this
respect. Finally, one aspect of the issue of rhythm patterns was also decided upon as a salient
feature of MaltE. The notion of durational characteristics (Azzopardi, 1981; Calleja, 1987;
Vella, 1995) is also prevalent in the literature on both Maltese and MaltE, and was also raised
in the feedback from the expert listeners in Study 1 (see Section 5.2.2). For these reasons it
was picked as the fifth characteristic to study further.
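As a hedged illustration of how a durational measure such as variable (PVI V.dur) can be computed, the sketch below implements the raw and normalised forms of the Pairwise Variability Index as they are commonly formulated in the rhythm literature. Which form is used in this study is not specified in this section, so both are shown; the function names and the sample durations (in milliseconds) are hypothetical and are not drawn from the MaltE data.

```python
from typing import Sequence


def raw_pvi(durations: Sequence[float]) -> float:
    """Mean absolute difference between successive vowel durations."""
    if len(durations) < 2:
        raise ValueError("at least two durations are needed")
    pairs = list(zip(durations, durations[1:]))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


def normalised_pvi(durations: Sequence[float]) -> float:
    """As raw_pvi, but each difference is divided by the mean of the
    pair (to reduce the effect of speech rate) and scaled by 100."""
    if len(durations) < 2:
        raise ValueError("at least two durations are needed")
    pairs = list(zip(durations, durations[1:]))
    return 100 * sum(abs(a - b) / ((a + b) / 2) for a, b in pairs) / len(pairs)


# Hypothetical vowel durations (ms) for two illustrative utterances.
even_timing = [100.0, 100.0, 100.0]   # identical successive vowels
alternating = [60.0, 120.0, 60.0]     # short-long-short pattern

print(normalised_pvi(even_timing))
print(normalised_pvi(alternating))
```

Lower values indicate more even successive vowel durations, while higher values indicate greater alternation between longer and shorter vowels.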
A point about the notational conventions used in the rest of this thesis, particularly
those for vowels, needs to be made at this stage. First, as shown above in Table 4.2, brackets
are used to refer to the five linguistic variables under analysis, such as, for example, variable
(th). Second, forward slashes are used to indicate representations, such as /æ/, of a broader
and more abstract sort, while square brackets indicate more phonetic representations where it
was considered more appropriate to provide a narrower, more phonetic representation of the
realisation of the variable involved in a specific instance, such as [æ].
It is worth remembering that one of the characteristics of MaltE is precisely the high
degree of variability, both inter- and intra-speaker (see Sections 2.2 and 6.2 for more on this).
This results in speakers sometimes going for productions which seem to call to mind more
Maltese vowels, whilst other speakers might produce vowels more readily identified as
British English variants. Equally possibly, speakers may do neither of these two things,
but may instead produce vowels with some "intermediate" quality, not always consistent even
intra-speaker. With this in mind, the representation used here is not based on the phonetic quality
of productions, but on what “targets” could be expected.
In practice, the conventions used should be interpreted in the following way. Variable
(x) is a “target” sound that I am interested in here. In the case of variable (th) this includes /θ/
as well as /ð/, which can be realised in a number of different ways, including realisations with
a more plosive-like quality, represented using square brackets as discussed below in Section
6.3.3. In the case of the variables (a) and (schwaØ), different speakers, and indeed the same
speakers on different occasions, may not only produce a wide variety of different
phones but may also be going for different target sounds. Given that studies based on acoustic
analysis of this variety are as yet largely unavailable, it is difficult to say with any certainty
precisely which vowel symbols provide the best representation of the sound segments under
analysis here. In this context, it was decided to use those symbols which also have some
relevance to the local context. Thus, in discussions of the HAT vowel /æ/, it is noted that this
vowel in MaltE could possibly be expected to approach the quality of a more central vowel,
represented here as /ɐ/, or that of a front vowel, represented here as /ɛ/.
All five characteristics are studied in the context of a scripted text read aloud (referred
to in subsequent sections as TextAloud), as well as in more naturally occurring speech data
(see Section 4.2.2.1 below for more on these), with a view to understanding how, if at all,
they might act as a trigger for the perception of identifiable MaltE among native MaltE
listeners. The intention here is therefore twofold: to understand more about the variation
patterns in these five characteristics, listed as linguistic variables above; and to see whether
any of the five might influence perception in any way. Finally, the production of these same
characteristics will contribute towards the compilation of an index in relation to a continuum
of variation in MaltE.
Each of these five characteristics of MaltE identified above could equally command
their own extended research, but the intention here is to begin an exploration of how a group
of features or characteristics might be drawn on to varying degrees in combination, with a
view to eventually combining in an index of variation in relation to the continuum of
variation referred to throughout the literature (Borg 1986, 2011; Camilleri, 1992; Vella 1995,
2012). The specific aspects of how these five characteristics are therefore to be considered in
this research are further described in the following subsections.
4.2.1.1 Segmental variation: the variables (a), (schwaØ) and (th).
In terms of the available literature discussing MaltE, it is evident that MaltE is no
different from other dialects of English in having developed variation in its realisation of both
vowel and consonant segments. The references both to "vowel quality" and to the substitution
of /θ/ or /ð/ in earlier studies (Delceppo, 1986; Calleja, 1987; Debrincat 1999, among others),
and also in the first perception study carried out in Study 1, together with remarks concerning
'schwa', or the lack of it, helped to hone this research to focus on these three aspects of
phonetic realisation as a potential hub for variation, namely, variation in the phoneme /æ/ and
the use of /ə/ in weak stressed syllables, and some form of substitution of the phonemes /θ/
and /ð/. Note that /ə/, and the preference for full vowels, will be discussed more fully below in
Section 4.2.1.3 within the context of rhythm, timing and vowel duration, but it is also relevant here, in
a discussion of variation at the segmental level of analysis, in an effort to address the notion
of vowel quality.
(th) substitution is a feature identified widely throughout the previous studies (for
example Delceppo, 1986; Mazzon, 1993, among others) and it is also widely noted to take a
range of forms in studies on varieties within the broader study of World Englishes, including
replacement in manner of articulation by plosives /t/ or /d/ or in place of articulation, by
alveolars /s/ or /z/ (Schneider 2007). In MaltE, the perception seems to indicate a clear
substitution with /t/ or /d/, as is illustrated in the following headline from a local blogger
(http://daphnecaruanagalizia.com/), commenting on a local politician's speech, entitled, "der
for you all dough she is heterosexual". The same headline is also a clear example of aspects
of stigmatised features referred to above in Chapter 2. The general assumption in previous
literature also seems to suggest that interdental fricatives are regularly substituted by a
voiceless/voiced counterpart /t/ or /d/ respectively (Mazzon 1993). However, there is no
indication in the literature of how this substitution is distributed, nor is there yet any extended
study on the phonetic realisation of variation in (th).
The same question arises in relation to the comment in Table 4.1 above, and among the
expert listeners, relating to 'vowel quality'. There is on the one hand, the real sense in which
all native MaltE language users 'know' what we mean when we say 'vowel quality'; it seems
to relate to length, sometimes, and at other times to a shift in the vowel space. However, we
are not yet so sure in what ways this issue of 'vowel quality' relates to the perception of
variation in MaltE. The vowel phonemes /æ/ and /ə/ have been taken as a possible starting
point for further study in this area of variation, both because they are widely mentioned
throughout the literature already referred to above, and also, because they too seem to have
been taken up by YouTube bloggers as good examples of MaltE. Unlike in the case of (th),
however, references to variation in vowels are less overt and more inferred. One clip (available
at http://www.youtube.com/watch?v=bfMVRLx-RMs), for example, picks up on the /æ/ vs /ɐ/
distinction in the word "back", "move back" (the context given is most likely imitating a bus
driver in Malta), with (a) in "back" here pronounced with a distinctly [ɐ] quality, roughly
corresponding to /bɐk/.
In MaltE, realisation of /æ/ can sometimes occupy more of an /ɛ/ or more of an /ɐ/
space, as illustrated in the diagram below (also Delceppo, 1986; Vella, 1995). Schneider
(2011) also notes instances of variation in the realisation of /æ/, which in some varieties can
be realised as a lower /a/ or a higher /ɛ/. Neither /æ/ nor the fricative realisations of (th) are listed in
the phonemic inventory for Maltese (Azzopardi, 1981; Borg and Azzopardi-Alexander,
1997), so it is conceivable that their production may be restricted to one end of the
continuum, where speakers are more likely to use both English (unspecified) and Maltese,
and in some cases, possibly just English as the dominant language. Conversely, speakers at
the other end of the spectrum, who are most likely to speak Maltese all the time and English
(unspecified) only where necessary, may more readily replace these segments with close
alternatives. However, this may be an overly simplistic view of a much more complex reality,
one where social norms and perceptions of social belonging and identification are more fluid,
and less clear-cut. This will be explored further throughout Chapters 6 and 7.
Figure 4.1. Some of the vowels relevant to a discussion on MaltE: /ə/, /ɛ/, /æ/ and /ɐ/ (diagram not reproduced)
Note: Vowels indicated as phonemes are enclosed in forward slashes / /, while vowels indicated as part of a phonetic
description are enclosed in square brackets [ ].
As Figure 4.1 above illustrates, the target sound /æ/ is a low front vowel, while some
of the more readily observed preferred variants for MaltE (Delceppo, 1986; Debrincat, 1999)
are either another front but slightly higher vowel /ɛ/ or even a mid-low central vowel, /ɐ/
(Vella, 1995). Given the greater phonetic distance from /ɐ/ to /æ/, we might expect the
phonetically closer /ɛ/ to be the preferred variant, as /ɛ/, although a little
higher than /æ/, is also a front vowel, while /ɐ/ is both a little higher and considerably further
back than /æ/.
As regards (schwaØ), this can involve a question not only of vowel quality but also of
vowel duration. Giegerich (1992) suggests that schwa does not constitute part of the
phonemic inventory for English (variety unspecified), as it is not contrastive with any other
vowel, but rather, it is a popular option for reduction in weak-stressed syllables. Roach
(2009:102) also comments that "ə is not a phoneme of English, but is an allophone of several
different vowel phonemes when those phonemes occur in an unstressed syllable". Schwa is
also not part of the phonemic inventory for Maltese (Azzopardi, 1981; Borg & Azzopardi-Alexander,
1997), although there is, as yet, not enough research carried out on natural speech
data in Maltese to be able to assert that it is never present in Maltese or in MaltE speech
patterns. Given its potentially questionable status as a phoneme both in English as an
idealised or prototypical unspecified variety, and more definitively, in Maltese, it may be
expected that speech data on MaltE is likely to show preference for full vowels and less
evidence of schwa. In view of the preliminary nature of this investigation with respect to
schwa, it is being analysed here in terms of a presence or absence of schwa, indicated as the
variable (schwaØ), and is thus considered more closely as having a bearing on vowel
duration, and the perceived variability of vowel duration patterns (see section 4.2.1.3 below).
4.2.1.2 Post Vocalic r
The variable (r) here specifically refers to post-vocalic (r) in syllable final position, in
words such as "motorbike" or "father", but not in "jerry", where the (r) is inter-vocalic, and
also syllable initial. The linguistic variable (r) includes instances of the first type, but not of
the second type. Rhoticity has also received significant attention in MaltE, notably from
Bonnici (2010), where the data focused more closely on native MaltE speakers whose
language dominance was unequivocally English.
The presence or absence of (r) (rhotic and non-rhotic, respectively) has been widely
recognised as a defining feature in English and its many varieties, and its behaviour has been
traced as a marker of variation across time, socio-economic status, and region, among others.
While, as Bonnici (2010) reports, the overwhelming majority of studies have been
carried out on established varieties of languages, notably American and British English, there
has been more recently, a steadily growing interest in pushing the boundaries of current
research to account also for rhotic and non-rhotic patterns in a range of other varieties of
English. For example, Hartman and Zerbian (2009) examined rhoticity across affluent and
less affluent university students in South African English, while Stuart-Smith (2007) focused
on derhoticisation in Glaswegian English, among many others. Hickey's earlier study on r-
coloured vowels in Irish English (1989) also draws attention to the issue of how rhoticisation
can affect vowel length, which is also considered in the present study. In MaltE, Bonnici
(2010) serves here as an important context for the current study. Bonnici's findings on the
presence or absence of (r) in her data have been noted earlier in Chapter 2, and the state of
flux between rhotic and non-rhotic accents is echoed in the current data too, as described
more fully in Chapter 6.
The understanding of rhoticity here follows the descriptions in Lindau (1985) and
Ladefoged and Maddieson (1996). Some of the well established realisations of (r) include an
alveolar flap or tap, an alveolar trill, a retroflex (r), a uvular (r), and finally, an alveolar
approximant. The current data collected for this research, together with ongoing
impressionistic accounts, suggest that one of the most common realisations of (r) for MaltE is
as an alveolar approximant, which may or may not include – in rarer instances – elements of
frication, while a tap or flap may be likely in syllable onset position, but less so, in
postvocalic distribution.
The growing interest in (r) is also extending to the study of its acoustic
correlates, which, until recently, has largely been sidestepped in favour of auditory
analysis. Now that more studies on (r) are being carried out, researchers are starting to look
for ways to methodically identify it in an accompanying acoustic analysis. This has been a
challenge, not least because of the many and varied realisations it can take. Some studies on
the variant realisations of (r) have had more success than others in identifying waveform and
formant patterns, particularly if they are examining those realisations that allow for more
constriction or closure in their production, such as a uvular fricative or the trilled or tapped r,
familiar to us through their use in French, German, Italian and Greek (Ladefoged &
Maddieson, 1996; Baltazani, 2009), which are understandably a little easier to identify
acoustically (Ladefoged & Maddieson, 1996). The acoustic and auditory analysis of (r) is
taken up again in Chapter 6, below, while here, it is relevant to note that three different
realisations of post-vocalic (r) are identified: Ø (null realisation), an alveolar approximant [ɹ]
and an alveolar approximant with frication [ɹ̝].
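A minimal sketch of how auditorily coded tokens of post-vocalic (r) might be tallied across the three realisations is given below. The labels "null", "approximant" and "fricated", the function names and the sample tokens are assumptions introduced for illustration only; they do not reproduce the coding scheme or the data described in Chapter 6.

```python
from collections import Counter
from typing import Dict, Iterable

# Hypothetical labels for the three realisations named in the text:
# null realisation, alveolar approximant, approximant with frication.
R_VARIANTS = ("null", "approximant", "fricated")


def r_variant_proportions(tokens: Iterable[str]) -> Dict[str, float]:
    """Return the proportion of coded tokens falling under each variant."""
    counts = Counter(tokens)
    unknown = set(counts) - set(R_VARIANTS)
    if unknown:
        raise ValueError(f"unexpected variant labels: {sorted(unknown)}")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tokens supplied")
    return {variant: counts[variant] / total for variant in R_VARIANTS}


def rhoticity_rate(tokens: Iterable[str]) -> float:
    """Proportion of tokens with any overt (non-null) realisation of (r)."""
    return 1.0 - r_variant_proportions(tokens)["null"]


# Illustrative coding of four post-vocalic (r) sites for one speaker.
sample = ["null", "approximant", "approximant", "fricated"]
print(r_variant_proportions(sample))
print(rhoticity_rate(sample))
```

Treating the two overt realisations together gives a single rate per speaker, while the separate proportions keep the distinction between plain and fricated approximants available for closer inspection.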
As in the case of the discussion on rhythm and timing in section 4.2.1.3 below, a
discussion on the different realisations and distributions of (r) goes beyond the scope of this
study, particularly as it also seems to correlate strongly with indexical features in a speech
community. However, the purpose here has been to respond to the evidence and patterns
suggested in the two perception studies described in the following chapters, in which the use
of (r) was identified as one of a number of potential triggers in the identification of MaltE. The
study thus does not aim to present an exhaustive account of this variable in all its facets, but
rather, to identify aspects which give evidence of some form of patterning that might be
considered salient in the identification of MaltE.
4.2.1.3 Rhythm, timing and durational characteristics
Some attention has been accorded to the notion of 'rhythm' in some of the literature on
MaltE (Calleja, 1987; Vella, 1995), and it was also raised as a possible indicator of variation
during Study 1 (see Section 4.2.1 above, and Section 5.2.2 below). Rhythm has been
succinctly described most recently by Nokes and Hay (2012) as "the patterning of prominent
elements in spoken language, as perceived by the listener" (2012:1). Barry (2007) has argued
that it is more helpful for teaching L2 learners to think of rhythm in terms of such 'prominent
elements', illustrating that "the sum of these (essentially segmental) properties are the
determining features of an acceptable (prosodic) prominence pattern" (Barry 2007:113).
Attempting anything more than that in terms of isochrony, or regularity, Barry argues, is
actually detrimental for an L2 learner, because "no single structural correlate has been found
which justifies the labels as phonological categories" (ibid.) with regard to isochrony
(regularity) in the distinction of 'stress' vs 'syllable' timing. Nokes and Hay (2012) clarify that
rhythm cannot be equated simply with a question of timing, and the same authors highlight a
number of parameters including pitch, intensity and duration, among others, which together
combine to generate the listener's perception of a patterning which has traditionally been
termed 'rhythm'.
The notion of rhythm has commonly been closely associated with 'timing', thanks to
the perceptions of a regularly occurring sequence of observable events. Languages have
traditionally been described in terms of "isochrony", or sequences of regular events, as either
'stress-timed', 'syllable-timed' or 'mora-timed', where the emphasis is clearly on the issue
of timing. However, Arvaniti (2012) suggests that there may be a problem in "confounding
rhythm with timing" (Arvaniti 2012:1), and Nokes and Hay (2012:2) reiterate this issue, in
saying "It is clear that the perception of rhythm is not based solely on timing, and conversely,
that timing is affected by considerations other than rhythm". Pike (1945) and Abercrombie
(1967) are credited with first encapsulating the notion that languages could be typologically
distinguished on the basis of their rhythm patterns. Since then, this view has come full circle,
from being gradually debunked to being, more recently, partly restored in modified form. The
original views expressed by Pike and by Abercrombie resulted in the division of languages
into 'syllable-timed' or 'stress-timed' according to whether all syllables, stressed or unstressed,
are produced with more or less perceived even timing or whether timing is organised
primarily around stressed syllables, respectively. Abercrombie (1967: 97) also described
rhythm in terms which suggest an observable activity complete with physiological indications
as "Speech rhythm is essentially a muscular rhythm...". Curiously though, Abercrombie goes
on to give a surprisingly prescient account of rhythm as also being more of a combined
understanding between the speaker and listener "empathetically" in tune with one another,
where, if the speaker/hearer pair does not share the same mother tongue, then "the sounds
will not be recognized as accurate clues to the movements that produce them" (ibid.). In a
broader attempt to circumscribe the nature of rhythm, however, Abercrombie (1967: 97) can
also be quoted more fully:
In the one kind, known as a syllable-timed rhythm, the periodic recurrence
of movement is supplied by the syllable-producing process (…) In the other
kind, known as a stress-timed rhythm, the periodic recurrence of movement
is supplied by the stress-producing process (…)
Nokes and Hay (2012:3) remind us that Beckman (1992) has referred to the attempts
to capture rhythm patterns as "one of the most persistent metaphors in the history of our
struggle to understand speech rhythms", and the word 'metaphor' might capture why some
linguists prefer to treat rhythm as a perceptual phenomenon, rather than an objectively
measurable one in temporal terms. Couper-Kuhlen (1986) took this route, for example, while
noting that "it is a natural human tendency to impose structure on perceptual stimuli"
(1986:52).
Roach (1982:73) was also one of the first to observe that "the distinction between
stress-timed and syllable-timed languages may rest entirely on perceptual skills acquired
through training." His attempts to verify Abercrombie's original claims that a) syllable length
is regular in syllable-timed languages, but varied in stress-timed ones and b) that stress
'pulses' are irregular in syllable-timed languages, flagged up a number of issues, including the
fact that up to that point, the stress/syllable timing distinction was based on "the intuitions of
speakers of various Germanic languages, all of which are said to be stress-timed" (Roach
1982: 79). Roach's reasonable conclusion was that similar studies from native speakers of the
languages designated 'syllable-timed' would also be necessary, for the distinction to remain
robust. The underlying belief, up to the 1990s, then remained that perhaps, rhythm was best
studied within the domain of perception.
Nevertheless, Roach (1982: 79) also signalled the possibility of another way forward,
in following Mitchell (1969) to say that "there is no language which is totally syllable-timed
or totally stress-timed." Much subsequent research has allowed for this more fluid view
where languages fall anywhere along a continuum, the two extremes of which might be
considered 'stress-timed' or 'syllable-timed', with mora-timed languages, such as Japanese,
reportedly behaving similarly to syllable-timed ones in patterning (Dauer, 1983; Dauer, 1987;
Bertinetto & Bertini, 2008, among others). The new perspective involving a continuum,
rather than mutually exclusive categorisation, also nudged subsequent research into the
domain of phonetic, as well as phonological, interpretations of rhythm, where discrete events
could be measured and correlated with perceptions of rhythm being more or less syllable- or
stress-timed.
Since the late 1990s, a plethora of measurements and equations have been broached in
an effort to capture these discrete events and analyse them as the acoustic correlates of
rhythm patterns of a range of different languages. Success here was with hindsight considered
muted, partly because it was found that individual studies used different parameters and
criteria in their measurements of segments, and partly because different studies often arrived
at different results for the same languages, using the same or similar measures. It was
concluded that many of these studies were somewhat circular in their efforts to try and
illustrate how a language was either more stress-timed or syllable-timed, depending on how
it was measured (Arvaniti, 2012). Another issue that later researchers had with these
attempts echoed Barry (2008), referred to earlier, in insisting that in fact, rhythm was simply
an umbrella term for a wide array of individual – often segmental – features, and thus could
not really be reducible to a single measure equated with 'rhythm', which was considered
essentially a phonological, rather than a phonetic construct. Such research set about testing
more systematically a number of the more established measures previously used in this
attempt to capture the acoustic cues to rhythm (Gut, 2012; Arvaniti, 2012; White, Mattys and
Wiget, 2012, among others).
However, these later arguments notwithstanding, it is also true that, in terms of
perception, duration or timing is always identified at least as one of three possible acoustic
correlates of prominence. So while it cannot be considered the sum total of rhythm,
durational characteristics of segments are often considered a strong indicator of some form
of patterning (Nokes and Hay, 2012). In English for example, it is widely understood that
stressed vowels are full vowels and are longer than unstressed ones, which can be weakened
and shortened. Nokes and Hay (2012: 4) note that "Other factors held equal, a longer vowel
length will give rise to a percept of syllable stress, and thus rhythmic prominence, in
English". Although since argued to be flawed in some ways, it is also accepted that the first
efforts to identify and measure some of the acoustic correlates of rhythm such as those
presented by Ramus, Nespor and Mehler (1999) or by Grabe and Low (2002) nevertheless
made an important contribution to the understanding of rhythm, even if they could not wholly
capture the sum total of the nature of the concept of this elusive property of language. Some
of these frameworks and measures are therefore further outlined below, with respect to their
connection with the present study.
As White and Mattys (2007: 501) put it: "Rhythm derives from the repetition of
elements perceived as similar. In speech these elements are syllables, or stressed syllables in
particular". If rhythm could be described with reference to syllables, then it would follow
that measurements of the various components of syllables might indicate where along the
rhythm continuum that language might fall. The 1980s saw the beginning of a new drive to
capture those aspects of rhythm that were more tangible than previously described by "the
subjective perception of isochrony" (Ramus, Nespor & Mehler, 1999:268). Dauer (1983) first
noted that languages of different rhythm types presented systematic patterns in their syllable
structure and in vowel reduction, indicating the real possibility of hinging rhythm patterns to
measurable components of syllables in a language. Guided by this new direction, and
prompted by studies in infant perceptions of speech, Ramus, Nespor and Mehler (1999) first
followed this lead, but crucially chose to move away from the domain of the syllable, itself
subject to differing interpretations and measures depending on alternate points of view.
Instead, Ramus, Nespor & Mehler (1999) measured vowel and consonant intervals, based on
the premise that "The measurements suggest that intuitive rhythm types reflect specific
phonological properties, which in turn are signaled by the acoustic/phonetic properties of
speech" (Ramus, Nespor & Mehler, 1999: 265, henceforth, Ramus et al. 1999). In subtly
removing from the debate some of the more subjective factors, such as the question of
isochrony, or the domain of the syllable, Ramus et al. (1999) and subsequent studies sought
to apply a range of formulae to more objectively identifiable segments, in the identification of
distinct rhythm patterns across different languages. Ramus et al. (1999) identified three
variables, namely, the proportion of vowel duration within an utterance, labelled %V, and
the standard deviations of vowel length and consonant length, ΔV and ΔC respectively. Results
suggested that %V and ΔC correlated well and offered some level of predictive power.
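The three variables just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the interval durations are hypothetical, and the population standard deviation is assumed, since the choice between population and sample deviation is not specified in the discussion above.

```python
from statistics import pstdev

def ramus_measures(vowel_ms, consonant_ms):
    """Return (%V, deltaV, deltaC) for one utterance.

    vowel_ms / consonant_ms: durations in milliseconds of the vocalic
    and consonantal intervals. Population standard deviation is an
    assumption made for this sketch.
    """
    total = sum(vowel_ms) + sum(consonant_ms)
    percent_v = 100.0 * sum(vowel_ms) / total   # share of vocalic material
    delta_v = pstdev(vowel_ms)                  # variability of vowel durations
    delta_c = pstdev(consonant_ms)              # variability of consonant durations
    return percent_v, delta_v, delta_c

# Hypothetical durations (ms), invented purely for illustration.
vowels = [60, 40, 80, 120]
consonants = [70, 90, 50, 110, 80]

percent_v, delta_v, delta_c = ramus_measures(vowels, consonants)
print(f"%V = {percent_v:.1f}, deltaV = {delta_v:.1f}, deltaC = {delta_c:.1f}")
```

All three values are global: they summarise the utterance as a whole and are unaffected by the order in which the intervals occur.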
Following this, a number of other studies have challenged, developed and refined the
original concepts, with varying degrees of success. Dellwo (2006), for example, presented
VarcoΔC to account for between-language fluctuations in speech tempo, due, in part, to the
different syllable structure and phonotactic patterns typical across languages. Here it was
found that a negative correlation between ΔC and speech tempo could be refined by
comparing relative variation to the norm, thus accounting for between-language differences
in the complexity or otherwise of consonant clusters. While the former measures and acoustic
correlates introduced by Ramus et al. (1999) or by Dellwo (2006) aimed to present global
patterns, Grabe and Low (2002), and Low, Grabe and Nolan (2000) came up with another
equation which gave the added dimension of more localised variability between pairs of
vocalic or intervocalic intervals in a Pairwise Variability Index (PVI), which will be
discussed at greater length below, and in Chapter 6.
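The rate normalisation attributed to Dellwo (2006) above is commonly expressed as a coefficient of variation, that is, the standard deviation divided by the mean and multiplied by 100. The sketch below assumes that formulation; the function name and the durations are hypothetical, and the second list simply halves the first to mimic a faster speech tempo.

```python
from statistics import mean, pstdev

def varco(durations_ms):
    """Coefficient of variation (in %) of a list of interval durations.

    Assumed form: 100 * standard deviation / mean. Dividing by the mean
    removes the effect of overall tempo on the standard deviation.
    """
    return 100.0 * pstdev(durations_ms) / mean(durations_ms)

# Hypothetical consonantal interval durations (ms) at two tempos.
slow = [70, 90, 50, 110, 80]
fast = [d / 2 for d in slow]   # same pattern, spoken twice as fast

print(pstdev(slow), pstdev(fast))   # raw deviation shrinks with tempo
print(varco(slow), varco(fast))     # normalised value stays the same
```

The raw standard deviation drops when the same pattern is produced faster, whereas the normalised value does not, which is the motivation for comparing relative rather than absolute variation.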
The framework adopting Pairwise Variability Indices (PVIs) for the measurement of
vocalic and intervocalic intervals is based on the premise that stress-timed languages allow
for greater variability in vowel length, owing in part to greater use of vowel reduction,
whereas syllable-timed languages tend to observe less vocalic variability. As Nokes and Hay
(2012) also successfully demonstrate, the PVI also has the advantage of being able to
measure the variability between any two consecutive features, besides duration, including
intensity, or pitch variation, for example. The PVI takes the absolute difference between each
pair of successive measures of intensity, duration, or pitch, for example, and then takes the
mean of all the measured differences, resulting in an index of how much variability exists
between one interval and the next. A higher index would indicate greater variability, and a lower
index would indicate less variability between each pair of intervals measured.
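Stated compactly, the index described in this paragraph can be written as below. The notation is introduced here for illustration only: m is the number of successive measurements and d_k is the k-th measurement (a duration, intensity or pitch value). Absolute differences are assumed, so that increases and decreases do not cancel out.

```latex
\mathrm{PVI}_{\mathrm{raw}} = \frac{1}{m-1} \sum_{k=1}^{m-1} \left| d_k - d_{k+1} \right|
```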
To illustrate further with an example from my data, we take part of an utterance that
was scripted, from TextAloud (see 4.2.1 below), namely "this is a cartoon of a bike rally". As
this was a scripted text to be read aloud, we have examples of the same utterance from each
of the speakers recruited for the main study (described fully below in 4.2.2). In the Grabe and
Low (2002) version, each of the vowel intervals and consonant intervals are measured, for
each of the speakers, resulting in a table similar to the one below (for vowels):
Table 4.3 Vowel interval measures for 2 speakers of MaltE

Word      Interval   Duration, Speaker X (ms)   Duration, Speaker Y (ms)
this      i          61                         59
is        i          33                         45
a         a          61                         58
cartoon   a          159                        49
cartoon   oo         75                         157

Duration measurements in milliseconds
The difference between each pair of successive vowels was recorded, so for Speaker
X, this would be 61-33; 33-61; 61-159, and so on, and these results then averaged, for an
index of variability, and the same process would be carried out for Speaker Y, and so on (see
section 6.1.2 for further explanation on this computation). Note here that "intervals"
corresponds to individual vowel segments, but in the original Grabe and Low (2002) study, a
vocalic interval is measured from the onset of the first vowel to the offset of the last one, thus
in "the arched handlebars", /ɪ/ or /ə/ in 'the' together with the following /ɑ:/ in "arched" would
be measured as one interval together. Since I am interested in vowel durations as a possible
indicator of rhythm, but not as the entire story of rhythm in MaltE, I have followed Nokes
and Hay (2012), in measuring vowels as segments, rather than as vocalic intervals.
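The pairwise computation described above can be reproduced for the five vowel durations listed in Table 4.3. The following Python sketch is illustrative only: the function name is hypothetical, absolute differences are assumed, and the five vowels cover just part of the utterance, so the resulting figures are not the values reported for the full data.

```python
def raw_pvi(durations_ms):
    """Mean absolute difference between successive durations (ms)."""
    pairs = list(zip(durations_ms, durations_ms[1:]))
    if not pairs:
        raise ValueError("need at least two vowel durations")
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

# Vowel segment durations (ms) from Table 4.3:
# "this", "is", "a", and the two vowels of "cartoon".
speaker_x = [61, 33, 61, 159, 75]
speaker_y = [59, 45, 58, 49, 157]

pvi_x = raw_pvi(speaker_x)
pvi_y = raw_pvi(speaker_y)
print(f"Speaker X: {pvi_x:.1f} ms, Speaker Y: {pvi_y:.1f} ms")
```

For these five vowels the successive differences are 28, 28, 98 and 84 ms for Speaker X and 14, 13, 9 and 108 ms for Speaker Y, giving means of 59.5 ms and 36.0 ms respectively.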
As Grabe and Low (2002) originally conceived it, the PVI would be normalised for
vowel durations in order to account for speech rate too. Grabe and Low (2002) and Low,
Grabe & Nolan (2000) undertook PVI analyses of a number of languages including both
those identified as syllable- or stress-timed, and those hitherto unclassified. In choosing to
identify the differences between successive pairs of intervals, the PVI analysis seeks to tease
out the more dynamic aspects of language rhythm, assuming as it does, that much of the
effect of rhythm is captured in the variability between consecutive segments, rather than in
the overall pattern of a spoken text. The suggestion is that the distinction is better captured at
a local level, rather than at a global one such as that proposed by either Ramus, Nespor and
Mehler (1999) in %V, or by Dellwo (2006) in ∆C. Grabe and Low analysed eighteen
languages, including seven previously unclassified, and found that those previously
established as stress-timed (for example, English or German), compared with those
established as syllable-timed (for example, French or Spanish) clustered naturally in a
normalised PVI analysis for vocalic intervals, suggesting that this can indeed reflect a
distinction between the perceptually distinct classifications. However, other languages did not
test out with such distinct patterns, leading the authors to conclude that "a strict categorical
distinction between stress-timing and syllable-timing cannot be defended" (Grabe and Low
2002:10).
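The normalisation for speech rate mentioned at the start of this discussion is usually written by dividing each pairwise difference by the mean of the two durations concerned, and scaling the result by 100. The sketch below assumes that formulation and applies it, for illustration only, to the vowel segment durations of Table 4.3; the function name is hypothetical.

```python
def normalised_pvi(durations_ms):
    """Normalised PVI: each absolute pairwise difference is divided by
    the mean duration of that pair, then the average is scaled by 100."""
    pairs = list(zip(durations_ms, durations_ms[1:]))
    if not pairs:
        raise ValueError("need at least two durations")
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100.0 * sum(terms) / len(terms)

# Vowel segment durations (ms) from Table 4.3.
speaker_x = [61, 33, 61, 159, 75]
speaker_y = [59, 45, 58, 49, 157]

npvi_x = normalised_pvi(speaker_x)
npvi_y = normalised_pvi(speaker_y)

# Halving every duration (a faster rendition of the same pattern)
# leaves the normalised index unchanged.
npvi_x_fast = normalised_pvi([d / 2 for d in speaker_x])
print(f"X: {npvi_x:.1f}, Y: {npvi_y:.1f}, X at double tempo: {npvi_x_fast:.1f}")
```

Because each difference is expressed relative to the local pair, the index remains sensitive to the order of the segments, which is the local property that distinguishes it from global measures such as %V.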
Barry et al. (2003) have tackled both the pros and cons of PVI analyses, relating the
lack of a neat reflection of perceptual classification into stress- or syllable-timing to a number
of methodological issues. For instance, speech style and number of speakers analysed, as well
as structural differences across languages, were not taken into account in the original PVI
analyses (Barry et al., 2003: 2693). The authors also concluded that a normalised PVI for
vocalic intervals and a separate PVI calculation for intervocalic intervals did not yield clear
distinctions between languages. However, a new combined PVI-CV, in which the variability
indices of the two intervals are combined, together with the aforementioned %V of Ramus et al.
(1999) (see above), did yield more significant results. The %V measure was seen to be
the best performing measure across different languages and their crucially differing speech
tempos. Further, the authors indicate that a PVI analysis of the sequential vocalic and
intervocalic intervals would capture more precisely the interplay between these measures
which combine to affect perceptions of rhythm (Barry et al., 2003: 2696).
So it can be seen from the above discussion that recent efforts to capture rhythm
patterns across languages have by no means been straightforward. They are still subject to
controversy, even doubts as to their reliability, but at best, they are more simply considered to
be in need of further refinements and further testing under different conditions. In spite of the
valid concerns, one aspect of this current direction in the study of language rhythm has
proved useful in that it captures important features of phonetic detail in timing and duration
within and across languages. The durational study of segments may well have some bearing
on our perceptions of rhythm, particularly if they are considered together with other prosodic
features, such as intensity or pitch, as a cumulative indicator of perceived rhythm and
prominence. Studied in isolation, the measurement of durational characteristics can describe
variation at the phonetic level which could be considered the anchor for other aspects of
prominence to converge on. Accordingly in this research, I have elected to focus on vowel
segment durations to start with, on the understanding that these, while not necessarily
capturing the full picture of rhythm quality or classification for MaltE, nevertheless do begin
to account for perceptions of variation in timing patterns. This characteristic is referred to
throughout the thesis as the variable (PVI V.Dur), namely the Pairwise Variability Index for
Vowel Duration.
The new directions in the effort to understand how rhythm works across languages as
outlined above, try to capture those points at which differences in rhythm patterns are
perceived quite succinctly by native speakers and by infants. Some of the criticism
concerning the theoretical basis for these efforts, and referred to earlier in this section, is
focused on whether rhythm can ever be reduced to this level of acoustic analysis, and whether
it would not be better represented as a perceptual phenomenon more realistically captured on
the level of phonology. In the meantime, however, the effort to capture measurements related
to durational features can undoubtedly still be considered a very informative exercise, as it
points to systematic variation in how the production of successive segments can be exploited
in terms of duration, as well as in terms of quality. Here, too, I return to the suggestions
already indicated in sections 4.2.1.1 and 4.2.1.2 above, that the realisation of specific
segments can have a bearing on how durational features are realised. The discussions above
concerning the variables relating to vowel reduction (schwaØ) in 4.2.1.1, and to rhoticity (r)
in 4.2.1.2, have both alluded to their potential to affect variation in duration, which may in
turn, affect perceptions of variation in rhythm. If we now also consider the above arguments
relating to the concept that rhythm is realised as a series of acoustically correlated cues,
which combine to define rhythm essentially as a perceptual event, then it is useful also to
consider how the presence or absence of post-vocalic r, and how the preference for full
vowels over schwa, can both have a bearing on variability in vowel durations. In the case of
(r), both Hickey (1989) and Stuart-Smith (2007) have noted the different effects in duration
of post-vocalic 'r' on the preceding vowels, and the use or avoidance of schwa has also been
noted in relation to its status as an allophone, rather than as a phoneme (Giegerich, 1992;
Roach 2009). These studies and comments have been taken to indicate the importance of
considering these two features to be relevant to a study on durational characteristics of vowel
segments. The analyses of (schwaØ), and of (r), therefore both include reference to their
effects on duration in the relevant sections below (see Section 6.3).
The variable (PVI V.Dur) presented below in Section 6.2 will therefore focus on
variability in the duration of successive vowel pairs, but this discussion will also be further
informed both in terms of the frequency of (schwaØ), as well as the potential for (r) to affect
the durations of the preceding vowels.
4.2.2 The speakers and their data
The rationale underlying the decisions regarding which form of spoken data to focus
on, and whether or not to use pre-recorded and readily available data either from a formal
corpus such as the ICE-MTA corpus, or indeed from informal data banks available online
(YouTube, TV websites etc.), will be discussed in this section. Subsequent
sections detail the decisions ultimately taken in the light of these concerns and taking into
consideration Malta's current linguistic context. Here I also consider the issue of register, in
relation to experimental design, particularly with reference to the MaltE context, while
Section 4.2.4 explains the rationale and processes involved in task design and data recording
sessions, respectively.
The participants were selected from among friends, or colleagues, but also a number
of strangers introduced through friends, who could establish a history of living, growing up
and being educated in Malta. My main concern here was to try to not choose people from just
one of my own personal social circles, but from as broad a range as possible, in an attempt to
tap into different possibilities of variation among MaltE speakers. All 11 of the participants
who agreed to participate in my study were asked to complete a questionnaire which included
a self-rating exercise on proficiency in both English and Maltese, together with questions
related to language usage aimed at a brief record of their linguistic backgrounds. Participants
were asked to rate themselves on a scale of 0-5 according to how comfortable they felt in the
four main language skills (Listening, Speaking, Reading and Writing) in both Maltese and
English. Although self-rated, this short exercise has been adopted regularly in attempts to
identify differing degrees of bilingualism, and patterns of language dominance. The average
score is taken from the scores allocated for each skill, and for each language preference. If the
average for one language is higher than for another, then the respondent is considered
dominant in that language. If the scores balance out equally, then the respondent is
considered a 'balanced' bilingual (following Romaine, 1995; Dandria, 2002). It will be seen in
Chapter 6 that this by no means guarantees an objective account of a person's language
patterns or abilities, but it does serve as a preliminary way to initially classify participants
according to their own perceptions. A copy of the questionnaire is included in the appendices
(Appendix C).
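The classification procedure described above can be summarised in a short Python sketch. It is a minimal illustration only: the function name, the dictionary layout and the ratings are hypothetical, and the labels simply follow the dominant/balanced distinction described in the text.

```python
from statistics import mean

SKILLS = ("listening", "speaking", "reading", "writing")

def classify_dominance(maltese, english):
    """Compare mean self-ratings (0-5) across the four skills.

    Returns a label plus the two averages. Equal averages are treated
    as 'balanced', following the procedure described in the text.
    """
    maltese_avg = mean(maltese[s] for s in SKILLS)
    english_avg = mean(english[s] for s in SKILLS)
    if maltese_avg > english_avg:
        label = "Maltese-dominant"
    elif english_avg > maltese_avg:
        label = "English-dominant"
    else:
        label = "balanced"
    return label, maltese_avg, english_avg

# Hypothetical self-ratings, invented for illustration only.
maltese = {"listening": 5, "speaking": 5, "reading": 4, "writing": 4}
english = {"listening": 4, "speaking": 4, "reading": 5, "writing": 5}

label, m_avg, e_avg = classify_dominance(maltese, english)
print(label, m_avg, e_avg)
```

In this invented case the two averages are equal, so the respondent would be classed as balanced even though the skill profiles differ, which illustrates why the measure is only a preliminary classification.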
Besides my needing to be assured that each participant represented an eligible example of MaltE,
the same data was to be used in the new Perception study, and here too it was important for
participants to meet the above-mentioned criteria. To this end, it was also desirable to
include among the speaker participants, at least one speaker, who, while still Maltese by birth
and nationality, and who also grew up in Malta, had also lived abroad for a while, and might
consequently still display some of the targeted features listed in Section 4.2.2.2 above, but
perhaps only residually, or faintly. One such speaker was identified and agreed to participate
in this study (identified in Clip 8 in the Perception study for Study 2. See Section 6.1.1
below).
Conversely, although the opportunity did present itself, it was decided ultimately not
to request participants from neighbouring Gozo. It was felt that the context for bilingual
development and/or second language acquisition in Gozo is critically different from that in
Malta. Although, in the absence of extensive study, evidence for this is limited to initial
impressions and intuition, discussions with a number of linguists familiar with this
archipelago's linguistic history have
served to clarify that the two islands have afforded their respective speech communities
decisively different contexts for the development of English variation. Although
geographically close to each other, and alike in more ways than they are dissimilar, the two
islands may have a number of reasons for presenting some subtle, but important differences
in their collective linguistic profiles. I will not go into detail here, except to mention a few
salient points. Throughout its largely colonised past, Malta's size, terrain and deep but
well-protected natural harbour have all contributed to it being the main hub of trade,
administration and government. Gozo, to the north, often provided – and still provides – a
welcome break from this, but it was mainly viewed as a rural outpost, and largely left to fend
for itself. As a result, its linguistic background may well have developed differently as British
settlement is more likely to have been occasional, or unsustained, as opposed to Malta, where
army and navy bases were well established. The area singled out by Bonnici (2010) in fact
was well known for its thriving British community of Forces personnel, spouses and children.
This, combined with patterns of emigration for the two islands, with many Gozitans (or their
children) returning to their homeland, and many Maltese choosing to continue living abroad,
should serve to give an indication of how the two islands' linguistic development could be
considered quite different. A larger-scaled study of MaltE patterns of variation would
definitely need to encompass the full spectrum of variation evident in Gozo as well as Malta,
but for the purposes of the present research, I have decided to limit myself to studying the
variation of just one island, Malta.
It is widely accepted that variation in MaltE can be found at most levels of structure,
and both in spoken, as well as in written form (Vella 1995). In speech, variation at the
morphological and syntactic levels includes instances such as periphrastic possessives ("He is the
brother of my father"), 3rd person s/v agreement ("She work from home"), different uses of modals
or "want completion" ("You want I come with you?") or sentence-final "but" ("He didn't see me but"),
to name just a few instances itemised in Bonnici (2010) and in Bonnici, Hilbert & Krug
(2012). The distribution of such variation has not yet been rigorously studied, and in the
absence of extensive study in this regard, it is not possible to go beyond intuition and initial
observations shared anecdotally. This seems to suggest, initially, that patterns operating in the
phonetic and phonological domains understandably often command most attention, perhaps
most significantly because they tend to cut across all strata of society, are present to varying
degrees in all speech communities, and, anecdotally, are readily picked up by newcomers to
the islands. Conversely, variation at the levels involving morphological or syntactic patterns
is not necessarily distributed as consistently throughout Maltese society, and tends to interact
more closely with levels of schooling and education, or socio-economic status (Bonnici
2010). Since the major focus here was to examine aspects of broadly 'identifiable' MaltE,
rather than different sub-strata within MaltE which might be more closely associated with
one particular section of society, speech community or geographical region, the phonetic and
phonological domains were thus the main focus of this research.
4.2.3 Recordings and data collection
In an ideal situation a researcher would be able to collect data unobtrusively from
completely naturalistic contexts, such as a chat over dinner, at a restaurant, a conversation
between friends, where participants are completely at ease, and so on. In practice, this ideal
scenario is hard to come by. In the first place, there is the question of ethical considerations,
as best practice requires participants to give their willing consent to being recorded. This
naturally strips some of the ad hoc and truly spontaneous nature of the task away, as
participants, however willing to help, are unlikely to use exactly the same language as they
would have done unobserved.1 It is also unlikely that truly spontaneous speech would yield
enough instances of lower frequency variables and more generally, complete reliance on
casual conversation or speech would also mean that the researcher had little control over the
outcome of the data collection process (Nortier, 2010). Conversely, a highly formalised setting
such as a recording studio, with microphones and other sophisticated equipment, is equally
likely to yield undesirable results as participants are likely to become tongue-tied, at worst, or
on their best behaviour, at best.
Particularly in the current context of language use in Malta, and of Maltese and English in
particular, the issue of register as it is elicited in the setting, as well as in the type of
data used in the experiment, would be a determining factor in the speech output. For example,
if English is mostly associated with schooling, and by extension, with writing, and Maltese is
more associated with everyday usage (Vella 2012, 1995), a context requiring participants to
perform highly structured tasks, such as reading aloud from a given text, is very likely to
result in an unnaturally formal speech output, with very careful self-monitoring and self-
regulation on the part of the speakers. This was considered to be a major concern. As
described in earlier chapters, there is now, perhaps, a growing awareness among the Maltese,
of variation and a sense of separation from the historically close association with the British
English accepted and aspired-to norm. There is also an associated set of attitudinal stances
towards perceived variation within MaltE as is now well documented in a number of
dissertations and papers (among them, Bonnici, 2010; Debrincat, 1999; Camilleri Grima,
1992). There is thus a heightened awareness that personal linguistic choices made, may well
have a bearing on how the speaker is perceived, and this in turn, may be enough to prompt a
MaltE speaker to be more on guard about their speech patterns if these were being recorded
in a highly controlled language laboratory setting.
1 Consent forms were obtained, as per University of Malta Research Ethics Committee guidelines, for all
participants involved in all studies for this thesis.
Bearing the concerns with speech styles and recording settings in mind, the recordings
aimed for clarity and quality, but this was not extended to conducting studio recordings. At
this point, efforts were made to also try to control for surrounding noise, or for contexts
which could generate unwanted noise. However, in this case, flexibility was also possible, as
it was considered sufficient to decide on a mutually agreeable location with each pair of
participants, which the latter would feel comfortable with. My only conditions for this were
that the chosen room be relatively small, contain an amount of soft furnishings that would
help to absorb echo effects and dampen movements such as desks creaking or chairs
scraping, and finally, that the room could be enclosed and shut, during the recording process.
This, to varying degrees, is what was achieved for the recordings with all 11 participating
speakers.
In all cases, fieldwork was carried out by myself, and recordings were made in
familiar settings for the participants, using a Tascam DR-100MKII 24-bit palm-held digital
recorder. Participants were told that I was interested in collecting data in English rather than
Maltese, but this was not emphasised and no other details were given, and those participants
who were not naturally English-speaking were paired with participants who were English-
speaking. It was expected that the nature of the tasks provided would also serve to encourage
English as opposed to Maltese language usage as much as possible. Participants were asked if
they would accept to be recorded as they negotiated their way through the four tasks laid out
for them. Prior to each recording, participants were presented with minimal information
regarding my research, together with approval forms which had been obtained by following
the standard procedures of data collection practice stipulated by the Research Ethics
Committee at the University of Malta (UREC). Once participants were clear about
procedures and what was expected of them in each task, they were left undisturbed.
The resulting recordings have yielded rich data both in terms of directly comparable
and framed phonetic/phonological variables which are the main focus of this research
analysis, but also, in terms of more spontaneous speech patterns and discourse strategies
interesting to consider for MaltE. Details and examples of how the resulting sound files were
annotated, segmented and analysed are presented, along with screenshots of analysis using
Praat v.5.4.8 below in Section 6.2.1.
4.2.4 Task design
The broad aim of the design for the tasks which the native MaltE speakers were to
carry out as they were being recorded was to ensure the collection of enough data that was as
close as possible to natural speech. The speech tasks ultimately decided upon ranged from
eliciting spontaneous speech or "connected speech" (Warner 2012) through to scripted
speech, which was necessary for the study of comparable data on durational features (see
Section 4.2.1.3). The same theme, subject matter, and therefore lexis, were retained across all
speaking tasks, and centred around a task commonly used in communicative language
teaching classes. Such speaking tasks are typically devised in order to simulate the need to
communicate, but at the same time, they also serve to distract participants (or learners, in a
class) from worrying about being observed. The tasks are therefore engaging and participants
become more focused on successfully managing and completing the task at hand, rather than
worrying about the fact that they are being recorded (or observed in a class). These same
tasks are regularly used in the Communicative Language Teaching classroom, and are often
referred to as Information Gap Activities, in examples of Task-Based Learning, designed so
that students are more concerned with completing the given task than in using 'correct'
language patterns. It is understood that if the task is completed with enough success – and the
information gap is effectively closed - then communication has also been successful.
The HCRC Map Task (Human Communication Research Centre, Universities of
Edinburgh and Glasgow) is one such familiar information gap activity used successfully in
the context of work on phonetics and phonology for a wide variety of purposes, to collect
data from different languages and language varieties. The HCRC Map Task requires one
participant to guide another from point A to point B using similar, but not identical, versions
of a map which cannot be compared whilst participants are engaged in the task. There is thus
an information 'gap', where the guiding participant is not sure which landmarks, referred to as
"Target Items", the following participant might have on their version of the map. The main
reasons this task was not ultimately adopted were related to small but, it was felt,
nevertheless significant details. Firstly, the nature of the map task immediately places one participant
in a Leader role, and another in a Follower role. This further constrains an
already compromised degree of spontaneity in the task. Secondly, since the elicitation of
directly comparable data was crucial for my purposes, it was felt that even with a limited set
of possible "Target Items" (generally Map landmarks), participants were still free to choose
their own map route, and this might still result in too few instances of these "Target Items"
being produced.
For these reasons it was decided to keep the essence of the HCRC Map Task in its
design as an information gap activity, but to change the theme slightly so that the resulting
tasks would allow for a greater range of data, from quasi-spontaneous, through to completely
scripted text types. The whole data collection process would be arranged around a common
theme, designed to encourage the elicitation of as many comparable instances of data, in the
form of Target Items (see 4.2.4.1 below for more on these), as possible. This balance still
allowed for the eventuality that some tasks would elicit fewer instances of comparable data
than others, but would serve the twofold purpose of yielding rich naturalistic data that might
also (but not necessarily) contain some of the same Target Items. Spontaneity would also
extend to include which of the two participants might assume a lead role, if any.
The resulting information gap activity took the shape of the familiar childhood game
'Spot the Difference', with the information gap generated by giving each participant
in a pair a different version of a picture (reproduced in Appendix D), and asking the
pair to identify six differences between the two pictures. This task was to provide the main
theme which would contextualise both the quasi-spontaneous information gap task itself, as
well as three other tasks each designed to elicit more scripted and structured language in the
form of Target Items embedded in both the natural speech context and more formally, in the
read text. The individual tasks and the set Target Items incorporated into the design are
described in detail below (Section 4.2.4).
In total, four short tasks were set, each designed to obtain a different degree of
spontaneity or scriptedness. The first task was a familiarisation task, where each participant
in turn was asked to briefly describe their version of the picture using, where useful, the list
of Target Items presented to them on flashcards (Appendix D (Word list)). This served firstly
to allow participants the opportunity to warm to the given topic, and secondly as the first
opportunity to begin using the Target Items. Any of the Target Items or their close
equivalents used during this task were itemised and are referred to throughout subsequent
chapters or in the Appendix, with the code 'Description' (more on coding in Praat is given
below in Sections 4.2.4 and 4.3).
The second task involved each pair of participants negotiating towards completing a
'Spot the Difference' activity. Both participants were given instructions to identify six
differences in each picture, without being allowed to show their respective pictures to each
other. This task was designed to allow for more spontaneous speech patterns, while still
allowing the opportunity for the Target Items to be used. These were not prioritised, however,
in this particular task, but if used, they are coded in the data analysis as 'Differences'.
The third task required a more focused use of the Target Items (or TI(s),
henceforward), presented as flashcards as each participant in turn was asked, following a few
moments' thought, to compose and say aloud a sentence contextualising as many of the
Target Items as they could. This task was intended to elicit the controlled use of the Target
Items, if for some reason, they did not crop up already in the two previous and more
spontaneous speech tasks. Target Items collected in this exercise are coded as 'Sentences'.
The fourth and final task again involved the use of the same Target Items, this time
presented as part of a text to be read aloud, thus constituting the most scripted version of the
TIs. This, together with the third task, was designed to ensure that even if the TIs failed to
crop up in the previous more quasi-spontaneous tasks, they would still be recorded at this
point, allowing for directly comparable data analysis here. This directly comparable data
analysis was also necessary for the analysis of the fifth linguistic variable labelled (PVI
V.Duration). This final task was considered to be the main focus for data analysis as it
allowed for longer stretches of directly comparable data. It has been tagged as 'TextAloud'
throughout subsequent chapters. All four task types are tabulated below.
Table 4.4 Task types for data collection

| Task Types | Elicitation of Target Items (TIs) | Rationale | Coding |
|---|---|---|---|
| Picture description | Random, possibly not elicited at all | Familiarisation with the topic, naturally occurring speech data | Description |
| Spot the difference | Random, but some more obvious TIs may be used | Naturally occurring speech data, quasi-spontaneous, with added possibility of some TIs being used | Difference |
| Sentence framing | Controlled, within a frame | An element of naturally occurring speech data, as participants invent their own contextualising sentences (frames), but with more controlled elicitation of TIs | Sentences |
| Reading aloud | Completely controlled | Elicitation of directly comparable data | TextAloud |

Thus several steps were taken to address the compromise needed between collecting
speech data that was as naturalistic and spontaneous as possible, but that would also serve to
yield enough relevant data with enough instances of the targeted speech patterns defined as
relevant to a study of MaltE variation. In particular, a balance was achieved between
ensuring the elicitation of a relatively fixed number of TIs and eliciting these same features in
as spontaneous a manner as possible. It was expected that, since some of the variables in
question are often stigmatised in Maltese society, more care would be taken to avoid their use
in the more controlled tasks, while the more spontaneous tasks would elicit less guarded
responses (refer to Section 4.1.1 above for the discussion on register and speech sources).
The compromise extended towards ensuring that participants were as comfortable as possible
with their surroundings, and that they were given enough time to familiarise themselves with
their task.
4.2.4.1 The Target Items
One of the primary aims for the data collected within the Perception Study was to
have enough instances of salient features produced across all speakers, in such a way as to
allow for closely or directly comparable analyses. While it was considered useful to be able
to collect different speech styles including quasi-spontaneous, through to completely scripted
speech, the overwhelmingly important consideration was the elicitation of as many tokens as
possible of items containing a given set of features, defined in section 4.1.1 as 5 linguistic
variables, which had been determined particularly salient in the identification of MaltE
patterns both in previous research, as well as in the outcome of the Pilot Study described in
section 3.3.2 above.
The Target Items (TIs) were designed in order to maximise the elicitation of the five
characteristics which had now been decided upon following up on a combination of the
intuitions of myself and others as native MaltE speakers, previous research and feedback
from Study 1 (as described fully above in Section 4.2.1). They are therefore somewhat
contrived and not necessarily in themselves readily conducive to spontaneous use. However,
it was expected that the information gap activity and its related tasks, while ostensibly
designed to allow participants enough time to familiarise themselves with the context, and to
focus on identifying their 'Differences', would in fact serve to distance participants from the
contrived nature of the TIs themselves.
The TIs were contextualised in the Spot the Difference pictures (Figure 4.2) below,
and were designed to elicit enough instances of the four segmental characteristics identified
as (r), (th), (a) and (schwaØ) for subsequent analysis. Durational characteristics would be
studied in the directly comparable data obtained in the scripted text (TextAloud).
The TI list is reproduced in full, together with the targeted features for each item, in
Appendix D. The pictures below, and the accompanying speaking tasks, elicited previously
identified TIs including 'crash helmet' or 'canvas bag', 'father' or 'bike enthusiast' (variant 'th'
realisation as stops instead of fricatives) and 'daughter' or 'motorbike' (post-vocalic r), among
others. In each case of the targeted features, a balanced distribution was also a concern. Thus
in the case of (r), TIs included options for both word final, 'daughter', as well as word medial,
but still syllable coda 'motorbike', and also coda complex, 'scarf'. Similarly, for (th), as well
as (a), both word initial and word medial distributions were elicited in the TIs 'The', 'weather'
or 'enthusiast' or 'Actually', 'crash' or 'rally', respectively.
Figure 4.2 Two pictures for 'Spot the difference'
Thus a number of TIs were contextualised, which could be repetitively used
throughout all four of the tasks, ranging from quasi-spontaneous in 'Description' and
'Differences', through to more scripted framed sentences, 'Sentences', for the TIs, and a
scripted text to be read aloud for more direct comparison, 'TextAloud'. The TIs were devised
in order to ensure some element of directly comparable data, as well as to ensure the
elicitation of enough instances of the four segmental characteristics for analysis, while the
'TextAloud' was constructed in order to have direct comparisons for vowel duration
measurements, across longer stretches of speech.
4.3 Preparing and collecting the data
Study 2 described above throughout section 4.2 and its subsections, thus has a twofold
purpose. Firstly, native MaltE speaker data was to be collected and analysed with reference to
five characteristics expected to be salient in the identifiability of MaltE. Secondly, native
MaltE listener perceptions of the same speakers were to be recorded using (M)agnitude
(E)stimation (referred to henceforth as Perception study).
4.3.1 Speaker data: preparation and annotation
The newly collected spoken MaltE data which had been recorded with pairs of
participants was then prepared for the two related studies, for which the same MaltE spoken
data were to be used, namely, the Perception study for perception patterns (see Section 4.3.2
below), and the case study analysis of six speakers (identified further in Section 4.3.1.1
below) for production patterns (this section). The recordings of the spoken MaltE data were
therefore first separated into individual sound files for each speaker, and for each task type,
resulting in four sound files (or sometimes five, if these were long), for the "Description",
"Differences", "Sentences", and "TextAloud" task types, for each speaker. These files were
stored together with an anonymised speaker identification code in individual sound files in
.wav format, samples of which are also attached to this thesis. Analysis of this data is
presented throughout Chapter 6, supported by spectrograms, together with their respective
annotations in textgrids. Further examples of annotated data in relation to the five
characteristics in question can also be found in Appendix G.
4.3.1.1 Case studies for 6 MaltE speakers
Once all the data from 11 speakers had been collected, it was necessary to identify
some of the speakers for in-depth data analysis with reference to the five salient
characteristics for the identifiability for MaltE. In the first place, the choice of which
speaker's data to examine further was made on the basis of recording quality, and on whether
the speaker managed to execute all of the four tasks (Section 4.2.4) fully.
In order to obtain as informative a range of variation in MaltE as possible, for the
purposes of the Perception study, I needed to also include speakers whose use of English may
have been limited to isolated occasions, resulting in overall less fluency in the language.
Where this was the case, the TextAloud task was sometimes not completed and therefore
could not be used for analysis. The data from these speakers were not selected for further
analysis here, as the TextAloud was necessary for analysis of durational characteristics.
Similarly, the sound quality for another speaker was sometimes compromised as the speaker
shifted position away from the microphones, or rustled papers. Again, this speaker data was
not used for further analysis, as it could not be studied in full. These choices resulted in a data
set for six speakers, who were also noted, at the end of the Perception study, to have scored
across the range of identifiability for MaltE, including "highly identifiable" and
"unidentifiable" (as elaborated on further in Sections 5.1 and 6.1).
The full set of data for six of the eleven participant speakers was prepared for the
analysis of production patterns with particular reference to the five linguistic characteristics
occupying the main theme of this dissertation (see Chapter 6 for this). For each of the six
speaker case studies, the "TextAloud" was annotated orthographically and also segmented in
full, in preparation for an analysis of vowel duration patterns (PVI V.Dur). The remaining
tasks, listed as "Description", "Differences" and "Sentences" were annotated and segmented
in relation to observable instances of each of the four segmental characteristics identified in
4.2.1 above. Praat version 5.3.48 (Boersma and Weenink 2013) was used throughout the
annotation, segmentation and analysis process.
4.3.1.2 Preparing audio stimuli for ME perception in Study 2
Clips from the various files of all 11 MaltE speakers were extracted for the Perception
Study (see Section 4.3.2 below) to be presented as audio stimuli for magnitude estimation.
Each resulting stimulus represented one speaker (including the modulus, discussed further in
Section 4.3.2 below). Each stimulus also included at least one instance of each of the four
targeted segmental variables, (r), (th), (a), and (schwaØ), and enough time for variability in
vowel duration patterns to register (PVI V.Dur). The resulting sound clips were between 13 and
16 seconds long, for each speaker, and were embedded in the powerpoint slides used to
present the Perception Study task to the 34 MaltE listeners. Taking on board the feedback
presented by the expert listeners in Study 1, the audio clips were shortened, and participants
were presented with a new stimulus (as an audio clip) on each slide, together with a repetition
of the modulus. This allowed greater flexibility and manipulation for the participating
listeners, while also ensuring that all the necessary data remained close at hand.
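The two constraints placed on each stimulus above (a length of 13 to 16 seconds, and at least one token of each of the four segmental variables) can be expressed as a simple check. What follows is only an illustrative Python sketch: the `Clip` record, its field names, the speaker codes and all example values are hypothetical and are not taken from the thesis materials.

```python
from dataclasses import dataclass, field

# The four segmental variables targeted in each stimulus.
REQUIRED_VARIABLES = {"r", "th", "a", "schwaØ"}


@dataclass
class Clip:
    """Hypothetical record for one audio stimulus (one speaker)."""
    speaker: str
    duration_s: float
    # Variable name -> number of tokens annotated in the clip.
    tokens: dict = field(default_factory=dict)


def missing_variables(clip: Clip) -> set:
    """Return the segmental variables that have no token in the clip."""
    present = {name for name, count in clip.tokens.items() if count > 0}
    return REQUIRED_VARIABLES - present


def is_usable(clip: Clip, min_s: float = 13.0, max_s: float = 16.0) -> bool:
    """True if the clip is within the length range and covers all variables."""
    return min_s <= clip.duration_s <= max_s and not missing_variables(clip)


# Invented example values, for illustration only.
ok_clip = Clip("SpX", 14.2, {"r": 2, "th": 1, "a": 3, "schwaØ": 1})
short_clip = Clip("SpY", 11.0, {"r": 1, "th": 1, "a": 1, "schwaØ": 1})
no_th_clip = Clip("SpZ", 15.0, {"r": 1, "th": 0, "a": 2, "schwaØ": 1})
```

A check of this kind only covers the segmental variables; whether a clip is long enough for vowel duration patterns to register remains a matter of judgement.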
4.3.2 Perception study: design and preparation
The Perception study in Study 2 was informed by the procedures and methods
adopted in Study 1 (see Section 4.1.2 above) and reported on below in Section 5.2. This
Perception study was conducted with a larger cohort of participants than that conducted for
Study 1, all of whom would be considered naïve listeners. The participants were all recruited
from the student or administrative staff populations at the University of Malta over a period
of two weeks during the second semester of 2013, and totalled 34 by the end of the study. Of
this number, three sets of scores had to be removed following participation, as it transpired
that the participants had not managed to follow the rubric and scored differently to the
remaining participants. Although three sets of scores were not included, they are worth
referring to briefly in section 6.1 below, as they serve to highlight two interesting response
patterns. Furthermore, in analysing the data of 31 participants, 3 outliers were also removed
in order to give a more coherent picture of the whole exercise. Thus the final count of data
available for analysis was from 28 participants.
The study was presented as a Powerpoint presentation to the participating MaltE
listeners at the University of Malta, recruited via the same University's administration portal.
All of the participants were Maltese nationals who had been educated to a minimum of
university entry stage, and all participants were asked to fill in a brief questionnaire
concerning their linguistic background (Appendix C).
The powerpoint presentation (see Appendix E) was divided into two stages, a training
stage, and the perception study itself. The training stage consisted of a series of exercises
designed to allow participants time to familiarise themselves with the idea of recording their
estimations using M(agnitude) E(stimation). By the end of training, participants would have
been given practice in estimating both a visual and an audio stimulus. The Perception study
itself was contextualised by the following description: "Imagine you are at a coffee shop in
an international airport and you overhear the following speakers. In each case, decide
how much more or less Maltese than the Modulus each speaker sounds and enter a
number to show this in the corresponding row on your table."
11 clips were taken from the newly collected data (Section 4.2.4) and linked to the
powerpoint slides. One of the clips was identified as the modulus and all other clips were
randomised and presented on separate slides, alongside the modulus. Following closely on
the discussion regarding the choice of speaker for the modulus stimulus in Section 4.1.2
above, it was again important to choose a speaker who I felt, intuitively as a native speaker of
MaltE, would not be perceived as especially representative of any extreme, whether as
extremely Maltese-sounding or extremely (Maltese)English-sounding or even extremely non-
Maltese sounding. Every effort was made to pick a speaker who presented what I felt to be
uncontroversial, in this sense. Accordingly, the choice for the modulus for the Perception
study in Study 2 was firstly made on the basis that the questionnaire data (Appendix C) for
this speaker revealed her to be comfortable with both Maltese and English. Furthermore,
careful consideration of the sound files for this speaker allowed me to identify a range of
variation both in relation to the variables in question, as well as others, which led me to
expect that this speaker would be considered neither too obviously identifiable, nor
completely unidentifiable, for MaltE. The same caveat applies here, however, in that I am just
one MaltE speaker on the continuum, albeit an informed one, and my linguistic background
may influence my choices. It was expected that the design of the task – including specifically
that the interval measures were more relevant than any absolute measures - would mitigate
any influence the choice of the modulus might have made.
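Because each listener is free to choose their own number range, magnitude estimates are generally read relative to the modulus rather than as absolute values. The sketch below shows one common way of doing this, dividing each score by the listener's modulus score and taking the base-10 logarithm. It is an assumption made for illustration and not necessarily the procedure followed in this study; the function name, speaker codes and numbers are hypothetical.

```python
import math


def relative_scores(scores: dict, modulus_score: float) -> dict:
    """Express each raw estimate as log10(score / modulus).

    0 means "as Maltese-sounding as the modulus"; positive values mean
    more, negative values less. All scores must be positive numbers.
    """
    if modulus_score <= 0:
        raise ValueError("modulus score must be positive")
    result = {}
    for speaker, score in scores.items():
        if score <= 0:
            raise ValueError(f"score for {speaker} must be positive")
        result[speaker] = math.log10(score / modulus_score)
    return result


# Two invented listeners using very different number ranges but
# expressing the same ratios relative to their own modulus score.
listener_a = relative_scores({"Sp1": 20, "Sp2": 5}, modulus_score=10)
listener_b = relative_scores({"Sp1": 200, "Sp2": 50}, modulus_score=100)
```

In this illustration the two listeners end up with the same relative scores, which is the sense in which intervals matter more than absolute values.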
The Perception study was conducted in one of the IT training rooms at the University
of Malta. The powerpoint presentation was uploaded to separate workstations in the training
room and each participant was given a head set, pen, and scoring sheet, allowing them to
work in virtual seclusion at their own pace, but with additional support from myself should
they have requested it. Once they had filled in the brief questionnaire for an overview of their
linguistic profile (see above), and once they had been given a moment to familiarise
themselves with their surroundings, each participant was familiarised with the details of the
task at hand, and was then left to conduct the powerpoint and enter their scores on the sheet
provided at their own pace. All of the scoring sheets and completed questionnaires were
collected before the participants left the training room.
As noted above (Section 4.2.4.1) in the description of the Target Items and their
purpose, each of the 11 audio stimuli extracted from the recorded MaltE speakers (see
Section 4.2.2) contained as many instances of the five characteristics under consideration as it
was possible to get in one short clip. The audio stimuli were chosen from among the data
produced by the native MaltE speakers performing the spontaneous speech tasks, not the
TextAloud scripted task, in order to avoid having listeners become alerted to the same –
possibly stigmatised – variables, which in turn might run the risk of artificially leading
listeners on. On the contrary, the intention for the Perception Study was to simulate as closely
as possible, a situation where a listener might casually overhear a speaker in a neutral place
and be able to identify the speaker as MaltE (or not). Each clip does still, however, contain
instances of each of the four segmental variables, and is also long enough to give a sense of
such aspects of rhythm as can be described with reference to variability in vowel duration
patterns. These clips are therefore taken to be a faithful representation of the speech patterns
typical for each individual speaker.
4.4 Data annotation, markup and analysis: procedures
This section describes the treatment of the collected spoken data gathered from the six
MaltE speakers identified on the basis of the quality of the resulting data, as well as on the
results obtained from the Perception study in Study 2 in judging these speakers on a range of
identifiability for MaltE.
The data from each of the speakers analysed was divided into two separate lots. The
first includes all those text types involving spontaneous or semi-spontaneous speech styles,
including the Picture Description, Spot the Difference and using Target Words in Sentences
(see Section 4.2.4). These sound files were labelled, respectively: Description, Differences
and Sentences, for each of the six speakers analysed¹ and have been transcribed and
segmented as described more fully below and in Figure 4.3 below. The second set of data was
the scripted monologue where speakers were given a prepared text to read aloud (TextAloud)
(see Section 4.2.4 above). This latter speech style was intended to yield directly comparable
data and has been transcribed and segmented at word and at segment level.
¹ The accompanying Sound Files have examples of all speech styles for each speaker.
Segmentation and analysis have been carried out in the context of two central aims in
this thesis. The first aim was to collect a range of data that would include, in part, some form
of (semi) spontaneous or naturally occurring speech. The second aim was to study this data
for evidence of any identifiable acoustic patterns which might relate to the idea of a
continuum of variation, or variation within those features identified as salient for MaltE, as
suggested repeatedly in previous research (Delceppo, 1986; Vella, 1995, and Bonnici, 2010
on rhoticity). The two sections below (Section 4.4.1 and 4.4.2) explain procedures for
annotation and segmentation of the sound files in the light of these two aims.
4.4.1 Annotation and mark-up procedures
Regarding the first aim listed above, it was felt that the compromise in favour of
approaching more natural speech outputs, as opposed to strictly controlled repetitive
segments produced in a laboratory setting, would better capture the wide-ranging array of
patterns expected at a phonetic level of analysis (see Section 1.2 for expectations and the
intended plan of research). With these two points in mind, and in the expectation that much of
the data would yield valid information at the phonetic level, the materials were prepared in
such a way as to simulate a typical speaking situation, namely, the need to contribute missing
information with a prepared information-gap activity (see Section 4.2.4). Furthermore, every
effort was made to ensure that the recording location was free of external noise and
disturbances, as far as possible (see Section 4.2.3 for details on recording procedures).
The overall results were encouraging, with the resulting data being clean and noise-
free enough for auditory analysis, in the first place, and in most cases, also acoustic analysis.
Annotation and segmentation for both the spontaneous and read texts was carried out
using a combination of auditory and acoustic analysis of the uncompressed .wav files using
Praat (Boersma and Weenink 2013) speech analysis software. Praat enables the analysis of
speech data by uploading sound files aligned with text files on a number of tiers, allowing for
different levels of segmentation and labelling. An example of segmentation, coding and
analysis in Praat is presented below in Figure 4.3. All of the spontaneous and scripted data
files were first annotated orthographically (Tier 2), and segmented phonemically (Tier 1,
labelled "Vr" for TextAloud, and "Phones" for all other files). Two versions of TextAloud
were copied, with one version "TextAloudB" keeping the above-mentioned segments
transcribed phonemically, and the other version, "TextAloudpvi" using a slightly different
labelling for Tier 1 (Vr) to allow for Praat scripting support in the computation of vowel
duration measurements for the analysis of (PVI V.dur) (see Section 4.2.1.3). Here, all vowel
phonemes were replaced by orthographic 'V' to indicate 'Vowel'. Vowel durations for
pairwise variability measurements were carried out solely on the scripted text (TextAloud), as
this analysis required longer chunks of directly comparable, and therefore identical, strings of
segments.
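The relabelling step and the duration measure it supports can be sketched as follows. This is an illustrative Python sketch rather than the Praat script used for this thesis: intervals are represented as hypothetical `(label, start, end)` tuples, the vowel inventory is a placeholder set, and the normalised form of the Pairwise Variability Index (nPVI) is assumed, since this section does not state which variant was computed.

```python
# Placeholder vowel symbols; the real set depends on the transcription used.
VOWELS = {"a", "e", "i", "o", "u", "@"}


def relabel_vowels(intervals, vowels=VOWELS):
    """Replace every vowel label with 'V', leaving other labels unchanged."""
    return [("V" if label in vowels else label, start, end)
            for label, start, end in intervals]


def vowel_durations(intervals):
    """Durations (in seconds) of the intervals labelled 'V', in order."""
    return [end - start for label, start, end in intervals if label == "V"]


def npvi(durations):
    """Normalised PVI over successive pairs of (positive) durations:
    100 * mean(|d1 - d2| / ((d1 + d2) / 2))."""
    pairs = list(zip(durations, durations[1:]))
    if not pairs:
        raise ValueError("need at least two durations")
    total = sum(abs(d1 - d2) / ((d1 + d2) / 2) for d1, d2 in pairs)
    return 100 * total / len(pairs)


# Invented segment boundaries, for illustration only.
tier = [("w", 0.00, 0.05), ("o", 0.05, 0.15), ("t", 0.15, 0.20),
        ("a", 0.20, 0.50), ("m", 0.50, 0.55)]
v_tier = relabel_vowels(tier)
durs = vowel_durations(v_tier)
```

Collapsing all vowels to a single 'V' label means the duration script only has to look for one label, which is presumably why identical strings of segments across speakers were needed for this measure.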
The remaining tiers (see Figure 4.3 below), namely Tiers 3-6 inclusive, contain short
commentaries which could also be decoded in the Praat script .txt files, one for each of the four
target variables, namely (a), (r), (th) and (schwaØ), and relating more closely to the two passes of
analysis that were to be carried out for three of these variables, as described more fully below
in Section 4.4.2.
Segmentation was based on auditory-acoustic analysis (see Section 4.4.2 below for
more on procedures for analysis) and follows closely on studies and observations in
Ladefoged (2003), Ladefoged and Maddieson (1995), Di Paolo and Yaeger-Dror (2011) and
Harrington (2013). Vowels were segmented to incorporate solid-state evidence in the
spectrogram and were calculated from vowel onset to vowel offset. In the case of onset
glides, these were analysed as a consonant followed by a vowel pattern following the
observation of distinct formant structure or amplitude changes, as in Figure 4.3 below in
"wartime".
Figure 4.3 An analysis in Praat, Sp5
Figure 4.3 above illustrates the typical layout in Praat, with the soundwave and
spectrographic analysis at the top aligned with, in this case, six tiers illustrating annotation as
described earlier. Each tier contains different annotation information. Phonetic segmentation
is aligned with boundary markers on the first tier from the top, "Phones" (or, for
"TextAloudpvi" files for PVI analysis, "Vr"), while the accompanying annotation uses
broad phonemic script in order to allow for readability in Praat script .txt files. This is
followed by orthographic annotation on Tier 2, and then a tier for each variable under
discussion, namely, (a), (th), and (Schwa).
4.4.2 Analysis of data: procedures
As for the second aim listed earlier (Section 4.4), concerning the study of variation
within the salient characteristics of MaltE, annotation of the files and the accompanying
analysis took two forms, for the study of three of the variables, namely (a), (r), and (th).
Firstly, a first pass analysis was made as a simple tally, for each of the three segmental
variables. The second stage consisted of a second pass analysis to study in more detail the
kind of variation present, for each of the three listed variables.
The tally first recorded whether there was variation in the realisation of each of these
three variables, or not. So for example, if the expected realisation of the (a) variable is [æ],
then any other realisation was noted down and marked 'Yes' on the relevant tier in Praat (see
Section 4.4.1 above) indicating varied realisations of (a). Conversely, if an example of the
linguistic variable (a) presented the expected realisation of [æ], this example was marked 'No'
on the relevant tier (see Sections 2.2.1 and 4.2.1.1 for the discussion on varied realisations of
this vowel). No other detail regarding the specific realisation of variation in each case was
noted during this first pass analysis. In the second pass analysis, all those instances which
were tallied as 'Yes', therefore indicating varied realisations for the segmental variables listed
above, were further analysed acoustically in order to explore more fully the notion of
variation in this respect. A detailed analysis of the use of schwa and its varied realisations
was beyond the scope of this thesis, and is here confined to an initial understanding of
whether or not it is ever used in MaltE. However, in the presentation of vowel formant
analyses (see Section 4.4.3, below), it will be noted that schwa was also included for analysis,
and therefore a preliminary overview of how this vowel can be realised in MaltE is also
suggested here.
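The two-pass procedure can be sketched as a simple tally followed by a filter. This is an illustrative Python sketch only: the `(variable, word, mark)` records stand in for the 'Yes'/'No' labels on the Praat tiers, and the example tokens and their marks are invented.

```python
from collections import Counter


def first_pass_tally(tokens):
    """Count 'Yes' (varied) and 'No' (expected) marks per variable."""
    tally = {}
    for variable, _word, mark in tokens:
        if mark not in ("Yes", "No"):
            raise ValueError(f"unexpected mark: {mark!r}")
        tally.setdefault(variable, Counter())[mark] += 1
    return tally


def second_pass_candidates(tokens):
    """Tokens marked 'Yes', i.e. those kept for closer acoustic analysis."""
    return [token for token in tokens if token[2] == "Yes"]


# Invented annotations, for illustration only.
tokens = [
    ("a", "crash", "No"),
    ("a", "rally", "Yes"),
    ("th", "father", "Yes"),
    ("th", "weather", "Yes"),
    ("r", "daughter", "No"),
]
tally = first_pass_tally(tokens)
candidates = second_pass_candidates(tokens)
```

Keeping the first pass binary means the tally itself stays neutral about the kind of variation involved; the detail is deferred to the second pass.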
Thus both an auditory and an acoustic analysis were carried out. Auditory analysis
was carried out by myself and then discussed and confirmed by at least one other linguist,
and acoustic analysis was carried out on all speech styles for each speaker, excluding only
those instances where the utterance might have been unclear, or where noise interfered with
the speech signal.
4.4.3 Phonetic analysis: procedures
This section describes the procedures adopted for phonetic analysis in more detail.
The first part of this section is concerned with vowel segment analysis, particularly in relation
to the two variables defined throughout this thesis as (a) and (Schwa). The second part of
this section concentrates on the phonetic analysis of post-vocalic (r), /θ/ and /ð/ and their
variants (defined as the variable (th) throughout this thesis), and finally, on the analysis of the
pairwise variability of vowel durations (defined as the variable (PVI V.dur) throughout this
thesis).
It is widely understood that formant structure behaviour depends to a great extent on
indexical and contextual information, including anything from the gender of the speaker, to
the nature of the surrounding segments. Peterson and Barney (1952) provide a seminal
account of variation in the realisation of vowel segments, in what has now become one of the
most widely quoted studies in the field of acoustic intra- and inter-speaker variation. The
1952 paper notes both variation in production, and in perception, of vowel sounds, and
illustrates how the production of a vowel sound in a CVC context varies within and across
speakers, but is nonrandom in its distribution. Also relevant to the discussion here, however,
are the findings which indicate that vowels can be identified in terms of the acoustic
measurements of the first and second formants, F1 and F2, and that these measurements
present continuity when described on an F1-F2 plane, as "the distribution of points in the
F1-F2 plane is continuous in going from sound to sound" (Peterson and Barney 1952:183).
Harrington (2013) also cites research by Lindblom (1963), Moon and Lindblom (1994) and
van Bergem (1993) to say that while vowel articulation does not necessarily shift according
to contiguous segments, "the formants shift in the direction of the loci of the flanking
segments." (Harrington 2013). Johnson (2005: 365) also clarifies the issue in this way, "the
first two perceptual dimensions always correspond to the frequencies of F1 and F2. However,
the perceptual value of F1 and F2 are modulated by other acoustic properties of vowels".
Another related issue to consider is the case of vowel normalisation in formant
analyses in order to cater for the effects of physiological, or anatomical differences, and
possibly also regional differences, between speakers (Watt, Fabricius and Kendall, 2011;
Johnson, 1997, 2005). This would indeed be a useful process in a study aiming to establish a
statistically robust analysis of vowel realisation for MaltE. However, this goes beyond the
scope of the present work, which is more concerned with exploring the nature of the
interaction among five variables in a given speaker, with a view also to tentatively exploring
the connection between such an interaction of variables on the one hand, and native listener
perception, on the other.
In the context of the discussion above, it was considered relevant to carry out a non-
normalised formant analysis of a range of vowels as used by one male and one female MaltE
speaker from among the six speakers identified (see Section 4.3.1) in order to better
understand the context of the vowel space in which varying realisations of /æ/ might occur. In
this way, the effects of the flanking segments on formants, as well as the physiological effects
of a male and female voice as highlighted above may be somewhat mitigated by this more
contextualised view of vowel realisation in two native MaltE speakers. Both the speakers in
question have been identified in the Perception study in Study 2 as "highly identifiable" for
MaltE (see Section 6.1.1).
Bearing the above discussion in mind, a formant analysis was carried out for two
speakers (Sp1 and Sp2, identified in Section 6.2 below) in order to map out the vowel spaces
used by these two speakers. For this exercise, F1 and F2 formants were measured for as many
instances of the realisation of /æ/ as possible, only excluding those instances where the sound
quality was not clear. Similarly, F1 and F2 were also extracted for /ɪ/, /e/, /ʊ/, /u/, /ɒ/, and /ə/.
The resulting formant analysis is presented and discussed below in Section 6.3.2.
Consonant segments were also analysed following conventions set out in Ladefoged
and Maddieson (1996). As described in Section 4.4.2 above in relation to /æ/, the two
variables (th) and (r) underwent two cycles of analysis. In the first pass analysis, each instance
of (th) and of (r) was checked for evidence of variation in its realisation, in which case it was
marked 'Yes' on the relevant tier (see Section 4.4.1). If an example of either variable did not
present evidence of variation, but instead yielded the expected realisation, then that example
was marked 'No' on the relevant tier. For the (th) variable, an expected realisation was the
fricatives [θ] or [ð], while for the (r) variable, the expected realisation was a non-rhotic
pattern (see Section 4.2.1 above). The auditory and acoustic analysis of post-vocalic r has
been discussed more fully in Section 4.2.1.2; here it is noted that, where possible, acoustic
evidence consisted of what has been described as a change in patterns of formant structure
(particularly in the third formant, F3), following Ladefoged and Maddieson (1996). If no such
evidence, either auditory or acoustic, could be obtained, post-vocalic 'r', that is, the variable (r),
was considered absent. Figures 4.4 and 4.5 below give two instances of the same word with
overt realisation of (r) and null realisation of (r) respectively:
Figure 4.4 Overt realisation, post-vocalic (r), Sp2
Figure 4.5 Null realisation, post-vocalic (r), Sp2
Figures 4.4 and 4.5 illustrate two different realisations of post-vocalic (r) in the same
speaker. Section 6.4 explores the theme of intra-speaker variation more thoroughly, but here,
suffice it to note the two different realisations, together with their annotation.
In the case of the analysis of the different realisations of the (th) variable, acoustic
analysis concentrated on the observation of spectrographic evidence of stop/release burst
patterns, as opposed to frication, as evidence of the often noted (and
stigmatised) use of stops /t/ or /d/ in place of the expected interdental fricatives /θ/ and /ð/
respectively. Figures 4.6 and 4.7 below show examples of the (th) variable realised with more
of a burst, or with less of a burst but with distinctive aspiration, respectively:
Figure 4.6 "authentic", Sp3, evidence of a burst in the realisation of the (th) variable
Figure 4.7 "forty-three", Sp5, a different pattern of little, if any, burst, followed by pronounced aspiration
In Figure 4.6 above, note the similarity in the spectrographic display for the analysis
of (th) compared with /t/ in the word "authentic", where both segments display a distinctly
stop-like burst. Compare this with evidence of frication in the realisation of the (th) variable
for Sp5 (Figure 4.7 above), where there is no evidence of a sharp burst-release pattern, but
still pronounced aspiration, compared with the /t/, in which an initial stop is followed by
striation and aspiration.
Again, as with the entire foregoing discussion on acoustic and auditory analysis, a
more controlled laboratory-based experiment would allow for a more closely measurable
analysis of the continuum of variation observed here, but the data and the findings described
fully in Section 6.3.3 will begin to present evidence of rich patterns of variation.
In the case of durational analysis for the variable (PVI V.Dur), each of the six
TextAloud transcriptions for the six speakers was extracted, tabulated in Excel and sorted into
vowel and consonant segments as shown below in Table 4.5.
Table 4.5 Extract, Excel file extracted from Praat, for normalised PVI calculations
Speaker/Location            Word      Segment   Segment duration (ms)   PVI (normalised)
Sp2_TextAloudpvi_textgrid   This      i         59
Sp2_TextAloudpvi_textgrid   is        i         45                      0.26923
Sp2_TextAloudpvi_textgrid   a         a         58                      0.25243
Sp2_TextAloudpvi_textgrid   cartoon   a         49                      0.16822
Sp2_TextAloudpvi_textgrid   cartoon   oo        157                     1.04854
Sp2_TextAloudpvi_textgrid   of        o         55                      0.96226
The Excel file extract in Table 4.5 above lists the first words of the "TextAloud"
monologue from Sp2, sorted into the vowel segments of each of the respective words.
Cartoon, for example, is listed twice to account for the 2 vowel segments in its two syllables.
The 'duration' column lists the duration of each vowel segment in milliseconds, while the
final column lists the results of the normalised Pairwise Variability formula, defined as the
difference between each successive duration, normalised to account for speech rate (Grabe
and Low 2002). A table similar to the one in Table 4.5 above was used to construct a
normalised Pairwise Variability Index (PVI) for each of the six speakers. The PVI formula as
used in the present work calculates the measurable difference of features such as duration,
intensity or pitch, between successive pairs of segments, and then averages each successive
difference to arrive at an index of variability for the given feature. The current analysis
follows the later interpretations in Nokes and Hay (2012), with measurements now calculated
on vowel segments, rather than on vowel intervals, as originally computed by Grabe and Low
(2002). Section 4.2.1.3 above explains the rationale for this decision in more detail, in
relation to the notion of rhythm classification.
The analysis captures a pattern of localised variability of duration. Table 4.6 below
provides a simplified illustration (without normalisation) of how this process works:
Table 4.6 PVI calculations for three vowel segments
Word   Segment   Duration (ms)   Difference between successive pairs (Duration)   Results
This   i         59
is     i         45              (59-45)                                          14
a      a         58              (45-58)                                          13
etc.                                                                              (Average)
The resulting measurement is then also normalised to account for differences across
individual speakers' speech rates, following Grabe and Low (2002), as one of the first
applications of the PVI formula devised to measure vocalic and intervocalic intervals.
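The calculation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the Excel procedure actually used: the function names are hypothetical, and it assumes that each normalised term is the absolute difference between two successive durations divided by the mean of that pair. Grabe and Low's (2002) published formula additionally multiplies the averaged result by 100; that scaling is omitted here because the values in Table 4.5 appear to be unscaled.

```python
from statistics import mean


def pairwise_differences(durations):
    """Absolute difference between each successive pair of durations (ms)."""
    return [abs(a - b) for a, b in zip(durations, durations[1:])]


def normalised_pairwise_terms(durations):
    """Each pairwise difference divided by the mean duration of that pair.

    Assumption: this per-pair normalisation is what the 'PVI (normalised)'
    column of Table 4.5 contains; no scaling by 100 is applied.
    """
    return [abs(a - b) / ((a + b) / 2) for a, b in zip(durations, durations[1:])]


def pvi(durations, normalised=True):
    """Average the successive-pair terms to obtain a single index."""
    if normalised:
        return mean(normalised_pairwise_terms(durations))
    return mean(pairwise_differences(durations))


# Vowel durations (ms) for the first words of Sp2's TextAloud monologue (Table 4.5).
sp2_vowel_durations = [59, 45, 58, 49, 157, 55]

raw_terms = pairwise_differences(sp2_vowel_durations[:3])
norm_terms = [round(t, 5) for t in normalised_pairwise_terms(sp2_vowel_durations)]

print(raw_terms)   # [14, 13], as in Table 4.6
print(norm_terms)  # [0.26923, 0.25243, 0.16822, 1.04854, 0.96226], as in Table 4.5
print(round(pvi(sp2_vowel_durations), 5))
```

Run on the six durations in Table 4.5, the per-pair values reproduce the final column of that table, which suggests that the assumed formula matches the one used in the spreadsheet.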
Section 4.4 and its subsections have described the procedures and decisions made for the
annotation, mark-up and analysis of the five linguistic variables under discussion in the
present work. Earlier, Sections 4.1, 4.2, and 4.3 and their subsections explained the various
methods and procedures used in the design and execution of the two studies (Study 1 and
Study 2) which constitute the heart of the present work. Chapters 5 and 6 present the findings
of the data analysis of Study 1 and Study 2, respectively.
5 Study 1: analysis and findings
This chapter presents analysis and findings for Study 1, in which nine native MaltE
expert listeners estimated their perceptions regarding how identifiably MaltE or otherwise 10
native MaltE speakers sounded, in relation to a control, or modulus stimulus, who was
another native MaltE speaker.
Section 3.3 above explains the rationale for asking participants to devise their own
scale of estimation and this inevitably resulted in a range of scales being employed, according
to what each participant felt most comfortable using. Some participants used a scale of 1-10
and sometimes included fractions or decimals for added nuance, while other participants
worked on a scale of 1-20 or 1-100, and so on. The resulting scores for each subsequent clip
across the entire participant group are therefore not comparable, without first being
normalised. This was the first step carried out, with each score being divided by the score
given to the modulus by each participant in order to obtain comparable results across each of
the two participant groups. The analysis in both chapters on both studies will henceforth refer
to normalised scores, as just described. Once normalised, a score below 1.0 indicates that the
speaker in that clip was judged less identifiable than the modulus by the listener, while a
score above 1.0 indicates that the speaker was considered more identifiable than the modulus.
The data from each ME task were then studied for patterns and for outliers in preparation for
the final analysis. In the case of Study 1 there was one outlier whose data was retained and
will be discussed further in Section 5.2 below.
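The normalisation step described above amounts to a single division per score. The following Python sketch is illustrative only: the function name is hypothetical, and the raw clip scores are invented for the example, chosen so that they would yield Li1's normalised scores for Clip1 and Clip9, given that Li1 scored the modulus 4 on a scale of 1-10.

```python
def normalise_me_scores(raw_scores, modulus_score):
    """Divide each raw ME score by the listener's own modulus score."""
    if modulus_score <= 0:
        raise ValueError("modulus score must be positive")
    return {clip: score / modulus_score for clip, score in raw_scores.items()}


# Hypothetical raw scores for a listener who gave the modulus 4 on a 1-10 scale.
raw = {"Clip1": 2, "Clip9": 8}
normalised = normalise_me_scores(raw, modulus_score=4)

print(normalised)  # {'Clip1': 0.5, 'Clip9': 2.0}
```

A normalised value below 1.0 (here Clip1) marks a clip judged less identifiable than the modulus, and a value above 1.0 (here Clip9) marks one judged more identifiable, whatever scale the listener originally chose.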
5.1 Overview and comments on using magnitude estimation
One of the most useful aspects of (M)agnitude (E)stimation has undoubtedly been the
constant reference provided by the modulus. In particular, raw scores for the modulus in
Study 1 reported here, and also in Study 2 (discussed more fully in Section 6.1) highlight the
varied responses that one can expect for a given stimulus with no point of reference other
than the listener's own linguistic experience. In each ME task, all participants were given the
instruction to fix a scale of their own, and then within that scale, to allocate a score for the
modulus according to how certain they were that the speaker was Maltese, and in speaking
English, was therefore a MaltE speaker. The raw scores and accompanying scale used by
participants in Study 1, presented below in Table 5.1, show a lack of agreement regarding
how identifiably MaltE the speaker was, and this throws into relief the concerns (see also
section 3.3) with setting a Perception study that allows for categorical judgments with no
external reference points. For if such varied responses, even among expert listeners, are
likely, then it may be considered quite difficult to obtain any kind of consensus at all
regarding our consciously-held views on what constitutes identifiable MaltE.
Table 5.1 Modulus scores and scales for 9 expert MaltE listeners
Listener   Modulus score (raw)   Scale used
Li1        4                     1-10
Li2        9                     1-10
Li3        85                    1-100
Li4        50                    1-100
Li5        8                     1-10
Li6        15                    1-20
Li7        70                    1-100
Li8        15                    1-20
Li9        10                    1-20
As Table 5.1 above indicates, there is a fair degree of movement in what the expert
listeners perceived in the modulus in terms of readily identifiable MaltE. Some of them rated
this audio stimulus particularly highly, such as Li2, scoring 9/10, or Li3, with 85/100,
compared with Li1, scoring the same stimulus a 4/10, or Li4, with a 50/100.
However, with ME as a measuring tool, it is possible to take note of such differing
scores but then to focus, rather, on the interval scores subsequently allocated to each
speaker in relation to the original modulus score. Thus, in actual fact, as long as the interval
score is captured, the original modulus score being so different for individual listeners no
longer remains an obstacle.
The details of the raw scores in Table 5.1 above reflect the finely tuned responses of
each individual listener, and illustrate how individuals have different opinions regarding the
precise placement of the modulus and subsequent stimuli as identifiably MaltE, or not. In one
way, though, these differing scores highlight the strength of ME as a scale, for when the
modulus scores are matched with the normalised scores for each subsequent stimulus, a clear
pattern of agreement emerges, as shown in Figure 5.1 below. This pattern indicates that,
particularly in a context where strong opinions and perceptions abound, it is still possible,
with ME, to identify and capture the ways in which individuals present overall agreement
in relation to the same stimuli. If responses to the modulus are taken into account, in terms of
agreement, it is fair to suggest that without a point of reference, responses to all the other
stimuli may well be equally wide-ranging, depending on any number of variables, including,
not least, an individual's own linguistic experience. Since listeners are now required to relate
their perceptions of a speaker in each stimulus to their perception of the modulus, rather than
to their own internalised perception, a degree of objectivity is achieved. Figure 5.1 below
presents the normalised scores for each clip judged in relation to the modulus by the 9 expert
listeners in Study 1.
Figure 5.1 Expert MaltE listener judgments on 10 speakers
Note the encouraging trend emerging in Figure 5.1 above, which suggests a strong degree
of consensus across all 9 listeners judging each clip in relation to the modulus. In particular,
listeners now seem to reach some agreement both over what not to consider identifiably
MaltE (Clip 1 or Clip 6), and also, broadly, over which speaker to consider more identifiably
so (Clip 9). Section 5.2 and its subsections will consider Study 1 and its findings in more
detail.
5.2 Study 1: analysis of Magnitude Estimation results
The nine native MaltE expert listeners were presented with ten speakers and an
eleventh speaker taken as the modulus, and they used ME to measure their judgements on the
identifiability or otherwise of each speaker in relation to the modulus, while seven of them
also offered remarks to justify their decisions. The resulting scores are tabulated in full in
Table 5.2 below.
[Figure 5.1 chart: "9 Expert Native MaltE Listeners' normalised scores for 10 speakers' clips". x-axis: clips representing 10 native MaltE speakers (Clip1 to Clip10); y-axis: normalised ME scores (0.00 to 2.50); one series per listener (li1 to li9).]
Table 5.2 li(stener) scores on 10 clips, with average and standard deviation, per clip.
          Clip1  Clip2  Clip3  Clip4  Clip5  Clip6  Clip7  Clip8  Clip9  Clip10
li1       0.50   0.75   1.25   1.50   1.50   0.38   0.63   0.50   2.00   2.25
li2       0.11   0.33   0.78   0.89   0.78   0.44   0.56   0.56   1.11   0.89
li3       0.71   0.76   0.82   1.08   1.12   0.82   0.94   0.71   1.16   1.15
li4       0.60   0.80   1.10   1.20   1.20   0.80   0.90   1.00   1.30   1.20
li5       0.38   0.88   1.25   1.00   0.63   0.75   0.63   1.25   1.00   1.00
li6       0.33   1.00   0.67   1.00   1.00   0.33   0.67   0.33   1.33   0.53
li7       0.43   0.79   0.93   1.00   1.14   0.64   1.07   0.50   1.43   1.29
li8       0.27   0.47   0.53   0.80   0.80   0.40   0.73   0.47   1.13   1.20
li9       0.30   0.35   1.20   0.85   1.00   0.40   0.80   0.90   1.30   1.40
average   0.40   0.68   0.95   1.04   1.02   0.55   0.77   0.69   1.31   1.21
stdev     0.18   0.24   0.27   0.21   0.26   0.20   0.17   0.30   0.29   0.47
The table above lists each score (normalised) awarded by each individual expert
listener for each of the 10 clips, always in relation to the modulus (not represented here, since
scores are normalised, in order to clarify the table's presentation), together with the average
score and standard deviation for each clip. The trend of overall agreement is also captured in
Figure 5.1 above. Relatively low standard deviations for most of the clips suggest a good
amount of agreement between the expert listeners, on whether to consider a clip as
identifiably MaltE or not. For example, on average, Clip1 obtains a relatively low score, and
always less than 1, indicating that expert listeners judged this Clip as less identifiably MaltE
than the modulus. Conversely, with a high average score and low standard deviation, we see
that Clip 9 is widely considered more identifiably MaltE than the modulus.
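As a check on how the summary rows of Table 5.2 can be derived, the short Python sketch below recomputes the average and standard deviation for Clip1 and Clip9 from the nine listener scores in the table. It assumes the sample standard deviation (dividing by n - 1), which the text does not state explicitly; the variable and function names are illustrative.

```python
from statistics import mean, stdev

# Normalised scores from Table 5.2, listed in listener order li1 to li9.
clip_scores = {
    "Clip1": [0.50, 0.11, 0.71, 0.60, 0.38, 0.33, 0.43, 0.27, 0.30],
    "Clip9": [2.00, 1.11, 1.16, 1.30, 1.00, 1.33, 1.43, 1.13, 1.30],
}


def summarise(scores):
    """Return (average, sample standard deviation), rounded to two decimals."""
    return round(mean(scores), 2), round(stdev(scores), 2)


summary = {clip: summarise(scores) for clip, scores in clip_scores.items()}
print(summary)  # {'Clip1': (0.4, 0.18), 'Clip9': (1.31, 0.29)}
```

Both results match the corresponding "average" and "stdev" entries in Table 5.2, which is consistent with the sample formula having been used.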
An encouraging trend also emerges in how the expert listeners scored each speaker.
For example, in the case of Li2, it is clear that this listener considered all but one of the Clips
to be less identifiably MaltE than the modulus, who was scored at 9, on a scale of 1-10, with
all but one normalised score (Clip9, at 1.11) coming in at less than 1.0. A 9 out of 10 score for
the modulus indicates that this expert listener felt the modulus to be particularly identifiable as
MaltE, unlike the majority of other listeners, some of whom scored the modulus quite highly,
but not as high as
9/10. Within this chosen range for Li2, therefore, it is understandable that scores estimated in
relation to a highly scored modulus would present differences, although it is still possible to
notice a similar trend emerging in this listener too. If, for example, we take Clip1 and Clip9,
which received strong indications of being least and most identifiable, respectively, we can see
in Li2's scores that the same trend emerges, where Clip1 and Clip9 are rated much less and
much more identifiable, respectively, even if Clip9 is rated only slightly above the modulus
(1.11 in Table 5.2). When
examining this listener's scores for the remaining clips, it is also possible to see the trend of
scoring for 'more' or 'less' identifiable and the trend line still follows the general direction that
the majority of the listeners were taking.
In another example, if we take Li1, we notice that distinctions made across the
speakers appear to be much more pronounced for this listener. Again, however, the general
pattern follows that of the other listeners, and shows that even where individual preferences
and perceptions may differ, there is still a prominent pattern of consensus which emerges
overall. There is agreement that new varieties of English often
display patterns of variation which are widespread at the phonological level and this is also
true for MaltE, as previous chapters have illustrated and reported. It is possible to interpret
the scores for Li1, therefore, as reflecting this sensitivity to variation at the phonological
level, with Li1's scores still following the same trends of agreement on what is and what is
not identifiably MaltE, but also presenting a wider range in scoring patterns, when compared
with the other expert listeners.
In this case too, ME has proved a useful measuring tool for this task, allowing each
individual to express their distinctions up to any level of desired refinement (or not, if that is
also the case), while still allowing the researcher to obtain valid information from the
exercise. Figure 5.1 above and the ensuing correlation analysis in Table 5.2 above both yield
the subtlety allowed by the interpretation of a ME scale, as well as the extraction of more
general trends and patterns of consensus where this is strong. This multi-dimensional picture
makes it possible both to pick up on broad patterns, such as the general trends as illustrated in
Figure 5.1 above, but also to identify more fine-grained reactions to the stimuli offered, as
discussed in this section, and described in Table 5.2.
Correlations and accompanying p-values have also been computed and tabulated
below.
Table 5.3 Expert listener correlations
       Li1     Li2     Li3     Li4     Li5     Li6     Li7     Li8     Li9
Li1    1.000   0.849   0.873   0.851   0.318   0.578   0.865   0.910   0.825
Li2    0.849   1.000   0.836   0.978   0.546   0.606   0.850   0.858   0.892
Li3    0.873   0.836   1.000   0.822   0.051   0.625   0.916   0.926   0.694
Li4    0.851   0.978   0.822   1.000   0.559   0.604   0.810   0.832   0.892
Li5    0.318   0.546   0.051   0.559   1.000   0.124   0.182   0.251   0.577
Li6    0.578   0.606   0.625   0.604   0.124   1.000   0.713   0.554   0.343
Li7    0.865   0.850   0.916   0.810   0.182   0.713   1.000   0.932   0.765
Li8    0.910   0.858   0.926   0.832   0.251   0.554   0.932   1.000   0.814
Li9    0.825   0.892   0.694   0.892   0.577   0.343   0.765   0.814   1.000
High correlations are indicated by underlining.
Table 5.3 above presents the correlations for the 9 experts' ME judgements, with each
expert listener paired in turn, with each of the remaining eight listeners. Here too, with the
clear exception of one listener (Li5), the figures indicate strong trends of consensus in the
scoring patterns. The high correlations (underlined) frequently obtained across pairs of
listeners reiterate the trend observed above (Figure 5.1 and Table 5.2), that there is
considerable agreement on what to consider identifiable and non-identifiable MaltE, in
relation to a modulus.
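To illustrate how a single cell of Table 5.3 can be obtained, the Python sketch below computes a correlation coefficient between the Li1 and Li2 rows of Table 5.2. It assumes Pearson's product-moment correlation, which the text does not name explicitly; the function name is illustrative, and the corresponding p-values are not reproduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / sqrt(spread_x * spread_y)


# Normalised scores for Clip1 to Clip10, taken from Table 5.2.
li1 = [0.50, 0.75, 1.25, 1.50, 1.50, 0.38, 0.63, 0.50, 2.00, 2.25]
li2 = [0.11, 0.33, 0.78, 0.89, 0.78, 0.44, 0.56, 0.56, 1.11, 0.89]

r_li1_li2 = pearson_r(li1, li2)
print(round(r_li1_li2, 3))
```

Under this assumption, the result for the Li1 and Li2 pair comes out very close to the 0.849 reported in Table 5.3; small discrepancies for other pairs would be expected, since the scores in Table 5.2 are themselves rounded to two decimal places.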
The majority of correlations in Table 5.3 above come in quite high at over 0.6, and
sometimes well over 0.7. Corresponding p-values for the above correlations are listed below
in Table 5.4, and show a significant degree of confidence, with the majority yielding
p-values below 0.05, suggesting a healthy trend of consensus across the nine expert listeners.
Table 5.4 Corresponding p-values for expert listener correlations
       Li1     Li2     Li3     Li4     Li5     Li6     Li7     Li8     Li9
Li1    0       0.002   0.001   0.002   0.37    0.08    0.001   0       0.003
Li2    0.002   0       0.003   0       0.103   0.063   0.002   0.002   0.001
Li3    0.001   0.003   0       0.004   0.888   0.053   0       0       0.026
Li4    0.002   0       0.004   0       0.093   0.064   0.005   0.003   0.001
Li5    0.37    0.103   0.888   0.093   0       0.734   0.614   0.483   0.081
Li6    0.08    0.063   0.053   0.064   0.734   0       0.021   0.097   0.333
Li7    0.001   0.002   0       0.005   0.614   0.021   0       0       0.01
Li8    0       0.002   0       0.003   0.483   0.097   0       0       0.004
Li9    0.003   0.001   0.026   0.001   0.081   0.333   0.01    0.004   0
Table 5.4 above shows p-values well below 0.05 in the majority of cases, indicating that
most of the correlations are statistically significant. Note the figures for Li5, in both the
correlations and the accompanying p-values. In each case, correlation is particularly weak,
and confidence levels
correspondingly so. While no further commentary was solicited from Li5, or indeed any other
expert listener regarding the rationale for the scoring, it may be possible, again given the
climate in which variation within MaltE may be considered negatively, that accompanying
expectations with regards to accuracy and correctness may well contribute to the mental
position adopted by any listener in relation to notions of identifiable MaltE. It has already
been noted earlier that it is close to impossible, when asking an informant to externalise a
perception in some way, for there to be no trace of personal experience or background
informing the outcome.
5.3 Study 1: results of feedback commentary
This section takes up the comments offered by the seven of the nine expert listeners
who agreed to justify their choices for judging each speaker as more or less identifiable as
MaltE. The full set of comments collated can be found in Appendix B, while this section
focuses on those features that have been identified for more extensive study in this research.
Seven of the linguists and language experts offered short comments to describe those
identifiable features which they felt would be more likely to be used by a Maltese speaker of
English. One advantage of asking expert listeners to judge sound clips in this Pilot study was
that they could be expressly requested to suspend judgments on prescriptive notions such as
'good' or 'bad' English, or 'correct' or 'incorrect' English, and instead be requested to focus on
linguistic features of variation. It is quite unrealistic to expect that preconceived notions can
be considered completely absent, in any native listener acting as a judge, even if they are
trained linguists, or language specialists, as is the case here. Nevertheless, it is still reasonable
to ask an expert listener to focus more closely on observable features at different levels of
linguistic analysis, while suppressing, or ignoring, other more subjective views. This is what
was asked of the linguists/language experts participating in this task, and the resulting
comments presented clear patterns of features or characteristics which struck the participants
as salient in some form.
The focus here is primarily on those features of spoken MaltE which seemed to be
most prevalent across the comments. Other features relating to lexis, morphology or syntax
are also listed in the same Appendix B but are not the focus of this discussion. It is fair to
note that variation across the continuum on the phonetic/phonological level is likely to trigger
the identification of a Maltese speaker, particularly in a short space of time, such as the
exposure to each clip for a maximum of 20 seconds. On the other hand, variation at other
levels, including the syntactic, may be evident more sporadically in speech and may well
benefit from other forms of research, particularly corpus-based research (see Hilbert
& Krug, 2010), for MaltE. While even this observation may have started to change as the
contexts of English language acquisition and usage are also gradually changing throughout
the islands, it is clear both in the literature, as well as in the data used for Study 1 (ICE_MTA,
2010), that variation at the phonetic/phonological levels is evident right across the continuum
of variation, compared with variation at other levels, which may be more typical of specific
pockets of variation along the same continuum rather than others. The expert listener
feedback seems to bear this out, as all expert listeners commented regularly on features at the
phonetic and phonological levels, even if their field of expertise was perhaps morphology, or
syntax, and they could therefore have been expected to have focused more closely on
variation at these levels. In fact, while it is indeed the case that each expert listener did
identify features pertaining to their main field of expertise as these arose, they also certainly
offered more broad-ranging examples of phonetic and phonological features. In particular, all
seven of the expert listeners offering feedback commented on segmental features relating to
duration and to various prosodic features.
The segmental features considered salient triggers in identifying MaltE by this group
of expert listeners included increased rhoticity, substitution of interdental fricatives by their
stop counterparts /t/ or /d/, final consonant devoicing, particularly with syllable-final /s/
instead of /z/ in words such as 'was', and issues concerning vowel quality. On the prosodic
level, lack of vowel reduction was widely noted, as were variant patterns of strong/weak
stress and variant patterns of intonation also.
It was expected that many of the features to be indicated as triggering the perception
of identifiable MaltE would be those on the level of prosody. Four of the expert listeners in
their comments included 'intonation' or 'intonation and phrasing', while one other, a
phonetician, also indicated 'rhythm' or 'marked prosody'. The other recurring comment
concerned vowels, including vowel length, vowel quality or hypercorrection in vowel sounds.
Table 5.5 below illustrates the kinds of comments used by the expert listeners that were
closely drawn upon when preparing the choice of speakers and choosing five characteristics
considered salient for MaltE, that were to become the focus in Study 2 (see section 4.2.1),
and the full list of comments can be found in Appendix B.
Table 5.5 Examples of comments by expert listeners in the feedback section of the Pilot Study
Vowels:       lack of/no vowel reduction. Full vowels.
              "operational" vowels pushed back
              no schwa
Consonants:   stopping of interdental fricatives
              consonant lengthening
              gemination
              final consonant devoicing
r:            rhoticity
              post vocalic r
              roller r
Although the remarks (see full comments in Appendix B) unsurprisingly favoured
intonation as one of the most salient features, it had already been decided that a broader study
on the patterns of intonation would not be attempted in this research and there were two
reasons for this. Firstly, the label 'intonation' can be something of a 'one-size-fits-all' feature,
in that it allows for a quick summing up of an extremely wide range of possible patterns,
generally perceived as intangible and a little nebulous. As an illustration of this from the
remarks noted, alongside the comments of 'intonation' or 'typical intonation' there are also
much less nebulous examples of variation at the syntactic level, quoted directly, for example
"this picture, how it shows" (to illustrate a form of resumptive pronoun) or "overuse of
'would'". Secondly, while the comments concerning 'intonation' were undoubtedly indicative
and appropriate, it was felt that a broad-ranging study of the intonation patterns of MaltE was
far too vast a topic to cover within the scope of the current research, with the hope of
doing it any justice. Any such treatment would undoubtedly have meant the automatic
exclusion of any other potentially salient features, in an effort to convincingly identify what,
precisely, in the intonation patterns of MaltE might constitute significant, and systematic
variation at this level.
Although slightly less prevalent than the comment 'intonation', the comments
concerning vowel quality, vowel reduction and possibly 'gemination' for consonants, together
with the mention of 'rhythm' seemed to indicate another important strand of enquiry. The
question of rhythm, indeed, is inextricably linked with the issue of prominence, and the only
extensive work on MaltE intonation (Vella, 1995) does in fact suggest that perceived
variation in intonation phrasing in Maltese and also partly in MaltE, is closely related to the
question of alternative allocations of prominence to salient syllables, with the accompanying
tandem effect on intonation phrasing thereby being a natural corollary. However, for reasons
discussed more fully above in Section 4.2.1.3, rhythm is also something of a 'one-size-fits-all'
feature and it was therefore decided that a more useful focus of study, particularly in view of
the comments regarding vowel length, would concentrate on identifying duration as an
acoustic correlate of prominence. The duration of segments or groups of intervocalic/vocalic
segments has sometimes been assumed to be the main (or even the sole) acoustic correlate
of rhythm, but the most recent studies (e.g. Nokes and Hay 2012; Arvaniti 2012) are
beginning to clarify that durational characteristics of segments or segment clusters constitute
one aspect of the perception of a patterned event, among other patterns relating to pitch
accent events, which might be more usefully described at the level of phonology, rather than
at the level of phonetic detail associated with duration. Thus, although 'rhythm' was noted as
a possible umbrella term to envelop the listeners' broad comments concerning a range of
characteristics related to phrasing or intensity, or durational characteristics of segments, it
was concluded that a more faithful analysis based on linguists' comments would do better to
focus on identifying potential patterns in measurable acoustic events such as time, than on the
broader and more abstract notion of rhythm, which, in the view of many, should be analysed
within the phonological, not the phonetic domain (e.g. Barry et al. 2008). Postvocalic r has
also received a healthy amount of attention from linguists, not least because of its perceived
effects on the vowels it follows (Hickey, 1989), with the potential consequence of also
affecting vowel length and, by extension, patterns of rhythm and duration.
The feedback exercise in Study 1 with 7 native MaltE expert listeners serves to further
support – and in many cases strongly reinforce – both earlier intuitions, and much of the
literature suggesting that some characteristics and features drawn on by MaltE speakers can
trigger a corresponding perception in listeners. The choice of five characteristics listed as
relevant linguistic variables in a study of systematic variation in MaltE in Section 4.2.1 above
was informed by this exercise as well as by the previous literature.
5.4 Study 1: conclusions
The ME results from Study 1, together with their efficacy in identifying both general
trends and more nuanced effects, indicate that ME has tested out as an extremely useful
measuring tool for the purposes of trying to understand how listeners were judging their
perceptions of what is, or is not, identifiably MaltE. It allowed for both the observation of
general trends in scoring the concept of what is 'identifiable', as well as the isolation of more
nuanced patterns of scoring which reflected individual listeners' particular emphasis or
reactions.
Perhaps one drawback to using ME has been the average potential participant's
lack of familiarity with it. Participants are generally much more used to the ubiquitous Likert-type
n-point scale, where scores are still expected to yield a general pattern, if such a pattern is to be
found. The numerical values used in such a scale are well known, and participants can
score with very little thought, if they choose to. A fixed, n-point scale also does not require
participants to worry about providing their own figures and proportions, unlike in ME, as the
scale is provided and scores are made simply in relation to this scale. With ME, participants
must decide on their own scale, and constantly keep the modulus and adjacent scores in mind
as they progress through the task. This process may not appeal much to those who consider
themselves not very mathematically minded. While not impossible, and definitely not so
complicated as to undermine the ultimate scoring process, this feature certainly needed
attending to and taking into account, both during Study 1 itself, with its small group of
participants, and particularly in Study 2, where the participant cohort was larger. During
Study 1, particular note was taken of how easily or otherwise participants responded to this
new scale, and of any further measures that could be taken to clarify the process, and these
were implemented in the Perception Study carried out with Study 2, described fully in
Chapter 6.
6 Study 2: analysis and findings
This chapter presents the analysis and findings of Study 2, which comprises two
related parts, focusing in turn on patterns of perception and of production in MaltE. Section
6.1 and its subsections describe the outcomes of the Perception study in which 34 naïve listeners
of MaltE participated (the methods and execution of this study are described in Section 4.3.2
above). The production patterns of six native MaltE speakers (see Section 4.3), with reference
to the predetermined characteristics for further study (see Section 4.2.1), are analysed further
in Section 6.2 and its subsections.
6.1 Perception study (Study 2)
In working with a larger group of naïve, as opposed to expert, listener participants than
in Study 1 (34 in the execution of the task, rather than nine), a number of differences were
anticipated both in the running of the study, and in its subsequent analysis, and this indeed
proved to be the case. For instance, it was expected that the participants in the Study might
request clarification or need reassurance regarding how to use the ME scoring procedures.
This was factored in, both in the presentation of the task, which allowed more time for
training and familiarisation than Study 1 had, and also in my own continued presence,
on hand to answer individual queries as they arose. The handling of the Study 2 Perception
study data required a different approach and findings are presented here in terms of general
trends rather than of a series of individual responses as they were in Study 1. Furthermore, in
view of the larger cohort of participants in the Study, the level of nuance afforded by an ME
task is used to illustrate specific trends in the analysis, while more categorical information
was also drawn from the resulting data. The full table of correlations and corresponding
confidence levels is presented for reference in Appendix B.
The statistical analysis of listeners' estimations of how much more or less identifiable
than a given modulus a succession of 10 stimuli sounded offers an encouraging pattern of
consensus and, as in Study 1, indicates that ME can provide nuanced, rich and
meaningful results which outweigh the disadvantages that having to use an unfamiliar tool
might present.
Thirty-four participants originally took part in this Perception study, but 3 sets of scores were
removed from the group when it emerged that in one case the participant had in effect scored
categorically, rather than using a gradient, and in two other instances the participants had actually
inverted the scale, allocating a higher score to those speakers whom they judged as sounding
less Maltese.
These 3 anomalous scores in themselves also suggest an important response to the
judgment task that is worth noting here. In the case of the categorical marking, the participant
ignored the concern with scale, or degree of variation in MaltE, and instead commented that
they were allocating a top score of 100 across all clips because they were sure that each of the
speakers in the 11 clips was Maltese, regardless of the individual variation that may have
been observable. Thus, although it was not possible to use the resulting data, it was still
interesting to note the reaction that there was no question in this participant's mind that all the
speakers were undoubtedly Maltese, and therefore using MaltE.
The other two sets of scores which could not be used presented a neat inversion of all
the scores. The two participants both noted that they could not help equating the more
Maltese sounding speakers with the notion of 'bad' or 'incorrect' and therefore with a lower
score, while those speakers scoring higher up the scale were considered to sound more
English, and closer to a perceived 'correctness' or standard. Again, while it was not possible
to use these scores, it was interesting to note that for these participants, it was impossible to
equate a more typically Maltese-sounding speaker with anything other than a lower
standard, and consequently, a lower score.
Finally, in analysing the remaining data, three outliers were also removed. It is only
possible to speculate on the scoring patterns in these three cases, but it is fair to assume,
particularly in the light of the other 28 scores, that the nature of the ME task, together with
the relative novelty of being asked to judge audio stimuli, meant that these participants were
unsure of what they were doing in some way. These three outlying scores are not considered
further in this section, which ultimately analyses the data results from 28 participants.
The 28 participants responded positively to using a scale that might have been less
familiar to them, even if they were on the conservative side in allocating scores, and did not
exploit the degree to which they could express their reactions numerically as much as they
could have. The most popular scale used was 1-100, with one participant working between
1-1000, but still providing similar proportions across the ME task. So while all 28 participants
readily drew up their own scales and gave their scores, the resultant figures were often
rounded up or down, and no decimals or fractions were used.
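Because each ME score is read relative to the listener's own modulus score, the choice of scale (1-100 or 1-1000) should not matter so long as the proportions are preserved. A minimal sketch of this point follows, assuming that normalisation is a simple division of each clip score by the listener's modulus score. The function name and all scores below are hypothetical and are not data from the study.

```python
def normalise(raw_scores, modulus_score):
    """Express each raw ME score as a ratio to the listener's modulus score."""
    if modulus_score <= 0:
        raise ValueError("modulus score must be positive")
    return [score / modulus_score for score in raw_scores]


# Hypothetical listener working on a 1-100 scale (modulus scored at 50).
listener_a = normalise([75, 50, 25], modulus_score=50)

# Hypothetical listener working on a 1-1000 scale (modulus scored at 500).
listener_b = normalise([750, 500, 250], modulus_score=500)

# Both listeners express the same proportions, so the ratios coincide.
print(listener_a)  # [1.5, 1.0, 0.5]
print(listener_b)  # [1.5, 1.0, 0.5]
```

Under this assumption, a ratio above 1 marks a clip judged more identifiable than the modulus and a ratio below 1 marks one judged less identifiable, whatever scale the listener chose.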
In spite of this tendency to revert to more familiar parameters in scoring, the
following analysis will suggest that it is possible to identify some patterns of consensus even
across this larger cohort of participants. Similarly, although the ME scale may have been
largely unfamiliar to them, the description of general trends presented below will also suggest
that participants could still respond in a meaningful manner, even while having no scale
imposed on them, and being asked to provide one of their own. Participants were therefore
encouraged to think about how to portray their judgments, as opposed to acting within a
predefined, but mathematically meaningless framework such as the more categorical binary
or n-point scales would present. Thus by asking participants to formulate their own scale and
then make judgments according to this, the task required the listeners to think a little bit more
consciously about their scoring than they might otherwise have done.
6.1.1 Perception study (Study 2): Analysis
It was expected that there would not be as strong a degree of consensus on what is
considered identifiably MaltE in this larger group of listeners classed as naïve, when
compared with the much smaller group of expert listeners in Study 1 (see Section 5.2). This
was indeed the case, but it is also true that the relatively high correlations of some of the
paired participants still suggest strong patterns of covariance in the perception of what is and
what is not identifiable for MaltE among the naïve listeners in Study 2.
To begin with the modulus scores, every effort was again made to choose a speaker
who was neither exceptionally identifiable nor particularly non-identifiable for MaltE. Again,
however, the same caveat applies as in Study 1 (see Sections 4.1.2 and 4.3.2, for the
discussion on choosing the modulus stimulus), in which the very choice of an appropriate
speaker for this stimulus highlights the difficulties in establishing what is considered a
relatively neutral form of this particular variety of English. In short, it is still not easy to
pinpoint the telltale signature of an identifiably MaltE speaker. All the participants chose a
scale using multiples of ten, so the modulus scores are presented as percentages in Figure 6.1
below.
[Figure: bar chart "Modulus Scores %"; x-axis: 28 listeners (1 to 28); y-axis: % scores (0 to 120)]
Figure 6.1 Modulus scores from 28 naïve listeners expressed as %
Note that the vast majority of scores are at least at 50%. Scores are actually generally
much higher, with a number of them at 100%. This highlights quite succinctly some of the
concerns raised throughout this thesis so far. Firstly, categorical scales cannot adequately
capture the range of perceptions of MaltE variation, at least not at this stage in the life of
this variety. Also, native speakers (and therefore listeners) are acutely aware of the variation
within MaltE, all of which is often taken to be a strong indication of a speaker's status in
society and educational background, and listeners are likely to respond to this amount of
variation by drawing on their own personal experiences, social status and/or educational
background in making a judgment. Thus while it is possible to categorise the results for all
the clips in Study 2 into 'more' or 'less' identifiably MaltE, based on broad readings of scores,
the strength of ME lies in the freedom it affords in allowing participants to make their own
estimations in relation to a given point of reference.
Figure 6.2 below presents percentages of the normalised scores given by the 28
listeners for each MaltE speaker in Clips 1 through to 10. At this point, scores have been
grouped, or categorised, with reference to the modulus, as either more identifiable (>1) or
less identifiable than the modulus (<1). Thus a <1 score indicates that the clip in question is
considered less identifiable than the modulus, while a >1 score is considered more
identifiable, bearing in mind that in any case, scores for the modulus were often on the high
side, as shown in Figure 6.1 above.
[Figure: bar chart "% of Listener scores for each clip"; x-axis: 10 clips of variation in MaltE (Clip1 to Clip10); y-axis: % scores of more/less identifiable (0 to 120); series: More Identifiably MaltE, Less Identifiably MaltE; bar labels as extracted: 89, 54, 71, 82, 57, 75, 71, 4, 32, 68 and 11, 46, 29, 18, 43, 25, 29, 96, 68, 32]
Figure 6.2 Total scores % at <1 (less identifiable) and >1 (more identifiable) for each of 10 clips by 28 naïve listeners
As in Study 1, the listeners represented in Figure 6.2 above seem to be particularly
clear on determining what is definitely, and what is definitely not, within the parameters of
identifiable MaltE, in relation to the modulus. Thus Clip 8 and Clip 1 are considered most
definitively to be not at all identifiable and extremely identifiable, respectively. Production
patterns for the speakers in these clips have been analysed for patterns of variation and results
are presented in Section 6.2 below. Note that the speaker in Clip 8 presents the outlier accent
variation (see Section 4.2.2 above) in that she is the only speaker to have lived abroad in
Africa and England for long stretches of time and impressionistic accounts would peg this
speaker as very close to Standard British English. The speaker in Clip 1 was also noted
impressionistically (see Section 6.4.1 below) to employ an accent most probably learned at
school, and claims to speak English rarely unless expressly required to for some reason,
reverting in such circumstances to the only English ever learned (in school). Nevertheless,
listeners still identified this speaker as the most identifiably MaltE. In each case of non-
identifiable and very identifiable, listeners achieved a surprising degree of consensus.
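The grouping behind Figure 6.2 can be sketched as follows. The sketch assumes normalised scores (clip score divided by modulus score) and, in line with the treatment of Clips 2 and 5 later in this section, counts a score equal to the modulus with the 'more identifiable' group. The function name and the example scores are hypothetical.

```python
def percent_more_less(normalised_scores):
    """Return (% of listeners scoring >= 1, % scoring < 1), as whole numbers."""
    total = len(normalised_scores)
    if total == 0:
        raise ValueError("no scores supplied")
    more = sum(1 for score in normalised_scores if score >= 1)
    less = total - more
    return round(100 * more / total), round(100 * less / total)


# Hypothetical clip: 27 of 28 listeners score it below the modulus, one above.
scores = [0.5] * 27 + [1.2]
print(percent_more_less(scores))  # (4, 96)
```

With one listener out of 28 on the 'more' side, the split comes out at 4% to 96%, which is the kind of strongly one-sided pattern the text describes for the least identifiable clip.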
Conversely, Clip 2 and Clip 5 attracted quite high proportions of judgments both as more
and as less identifiably MaltE. The two speakers here present a slightly different pattern of variation within
MaltE from the remaining speakers, and a native Maltese person is highly likely to identify
both as 'English (variety unspecified) speaking' (see Bonnici 2010, and Section 6.4.1 for more
on this), in other words, English is their L1. This is indeed the case, both as self-reported and
self-rated (see Appendix C), and as I observed, the speakers in Clips 2 and 5 strongly identify
with the L1 MaltE speaking community noted in previous work to often come from the
central parts of the island (e.g. Bonnici 2010) and to have quite a strong sense of identity
through English as their first language. Figure 6.3 below describes scores less than 1, and
therefore less identifiably MaltE than the modulus, for the speakers in Clips 2 and 5.
[Figure: bar chart "Listener Scores <1 for Clips 2 and 5"; x-axis: listeners who scored Clips 2 and 5 <1 (1 to 13); y-axis: normalised ME scores for Clips 2 and 5 (0.00 to 1.05); series: Clip 2<1, Clip 5<1]
Figure 6.3 % score <1 for speakers in 2 clips
Note: there is one more entry for Clip 2<1 (dark shade), representing one more listener's judgment as less identifiably
MaltE than the modulus.
Figure 6.3 above takes all those scores representing listeners' judgments as less
identifiable of MaltE than the modulus. Note that only isolated instances present particularly
low scores (0.29, or 0.4), with the majority of scores coming in quite close to 1.00. This
would suggest firstly that even though these speakers are considered less identifiably MaltE
than the modulus, they cannot be classed as unidentifiable, particularly as the modulus itself
was regularly scored quite high for identifiable MaltE (see above, Figure 6.1). Secondly, the
chart above indicates that native MaltE listeners are not only well able to identify another
Maltese person through a relatively short audio clip, but they can also identify shades of
variation across different speakers. These judgments of perception of identifiability, or
otherwise, may therefore still broadly fall into the 'identifiable' or 'non-identifiable'
categories, but within those categories, they may also still show some degree of nuance which
is particularly useful to understand in the given context of trying to establish a continuum of
variation for MaltE. Furthermore, if we consider the results in Figure 6.3 above alongside the
scores obtained for these same two clips which were greater (i.e. more identifiable) than the
modulus (see Figure 6.4 below), we can now see that although these two clips might have
overall been judged by a higher proportion of the 28 listeners as less identifiable than the
modulus, they are still not justifiably described as non-identifiable in any way.
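[Figure: bar chart "Listener scores >1 for Clips 2 and 5"; x-axis: listeners who scored Clips 2 and 5 >1 (1 to 16); y-axis: normalised ME scores for Clips 2 and 5 (0.00 to 1.45); series: Clip 2, Clip 5]
Figure 6.4 % score >1 for speakers in 2 clips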
Figure 6.4 above illustrates the proportion of 28 listeners judging Clips 2 and 5 to be
either equal to the modulus, or more identifiable than the modulus. Note, again, that as the
modulus never obtained a score below 50, anything scored the same as the modulus is
also considered as more rather than less identifiable for MaltE. Taken together, these two
clips, Clip 2 and Clip 5, which at first glance seemed to be perceived as not particularly
identifiably MaltE, can in fact, in this ME study, be seen as at least moderately
identifiable as MaltE. The scoring patterns for Clip 2 and Clip 5 are different from those
of all the other clips, where judgments for more, or indeed less, identifiable are a little more
polarised (see Figure 6.2 above). For the purposes of contextualising six of the speakers for
case studies of production patterns (see Section 4.3.1), from among the 11 speakers
represented in the 11 clips described in this Perception study (Study 2), I am interpreting the
three broadly different patterns of judgment on perception in the following way (see also
Table 6.1 below). The speakers in those clips which were judged >1 by more than 65% of the
listeners are classed as highly identifiable for MaltE; the speakers in those clips which were
judged <1 by more than 70% of the listeners are classed as non-identifiable for MaltE;
finally, the speakers in those clips obtaining roughly 50% either way of the listeners' scores,
are judged moderately identifiable for MaltE.
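The three-way interpretation just described can be written as a simple rule. This is only a sketch: the thresholds of 65% and 70% come from the text above, but 'roughly 50% either way' is not given a numerical range, so everything falling outside the two outer bands is treated here as moderately identifiable, which is an assumption. The function name and the example percentages are illustrative.

```python
def classify_clip(percent_more):
    """Classify a clip from the % of listeners scoring it >1 (more identifiable).

    The % scoring <1 is taken to be the remainder, 100 - percent_more.
    """
    if not 0 <= percent_more <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    percent_less = 100 - percent_more
    if percent_more > 65:
        return "highly identifiable"
    if percent_less > 70:
        return "non-identifiable"
    # Assumption: anything between the two outer bands counts as moderate.
    return "moderately identifiable"


for percent in (89, 54, 4):
    print(percent, classify_clip(percent))
```

The rule is deliberately coarse, in keeping with the rough categorisation it illustrates; a different cut-off for the middle band would move borderline clips between categories.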
Table 6.1 A rough categorisation of ME judgment of perception ratings for "identifiability"
% of listeners scoring    Classification resulting from Perception study (Study 2)    Clips
above 65% scoring >1      highly identifiable                                         1, 3, 4, 6, 7, 9, and 10
at 50% scoring >1         moderately identifiable                                     2, 5
above 65% scoring <1      non-identifiable                                            8
It will be noted that the majority of clips thus seem to have been judged as highly
identifiable, and this is indeed a very broad category. However, once again, taking into
consideration that a) the modulus itself was scored quite highly and b) those scores lower
than 1 are, with only a few exceptions, not far below 1, as illustrated with Clips 2 and
5 earlier, this is not a surprising outcome. Conversely, if we consider the scores for Clip 8,
classified as non-identifiable, we note a different trend again (see Figure 6.5 below).
Figure 6.5 Scores <1 for Clip 8
In Figure 6.5 above, only one listener judged the speaker in Clip 8 to be identifiably
MaltE, while the remaining listeners judged this speaker as less identifiably MaltE than the
modulus. Accordingly there are only 27 listeners listed in Figure 6.5 above, while the
remaining 28th listener judged Clip 8 to be higher than the modulus. Note that results are still
not completely categorical, with some of the listeners in Figure 6.5 above still allocating quite
a high score at or above 0.50. Nevertheless, the proportion of listeners judging this speaker
much lower than the modulus is overall higher. Thus, while the speaker in Clip 8 has been
judged non-identifiable, it must be emphasised that this is only for broad classification
purposes, as reflected in the relatively lower scores allocated. It is by no means the case,
however, that listeners were exceptionally more categorical in judging this clip significantly
less identifiable than all the others.
[Figure 6.5: bar chart "Listeners judging Clip 8 non-identifiably (<1) MaltE"; x-axis: listeners (1 to 27); y-axis: normalised scores for Clip 8 (0.0 to 1.0); series: Clip8 <1; bar labels as extracted: 0.3, 0.5, 0.3, 0.0, 0.8, 0.3, 0.5, 0.8, 0.4, 0.3, 0.7, 0.0, 0.9, 0.8, 0.5, 0.6, 0.9, 0.0, 0.3, 0.4, 0.3, 0.3, 0.3, 0.0, 0.8, 0.1, 0.2]
[Figure 6.6: bar chart "Mean correlations for each listener"; x-axis: listeners (1 to 28); y-axis: mean correlation (0 to 0.8)]
The scores for all the listeners were also correlated in order to be able to study a little
more closely the extent of the consensus across the entire cohort, as well as within each
subgroup. Correlations were given for each listener paired with the remaining listeners. A
brief overview of the pattern of correlation is presented as the mean correlation obtained per
listener, below in Figure 6.6, while the full set of correlations for the 28 listeners is presented
below in Table 6.2.
Figure 6.6 Mean correlation per listener
Note that 23 listeners show a moderate to relatively strong correlation over 0.5, while
5 do not. It was expected that there would be less consensus among naïve listeners compared
with expert listeners, as the latter can be expected, to a degree, to put aside personal opinion
for the exercise, while naïve listeners would not necessarily have the same level of awareness
about accent or variation to do this. In spite of this, and in spite of some weak correlations
across the cohort, the pattern in Figure 6.6 above suggests an encouraging pattern of
consensus in many cases. The figure above also indicates clearly that there is sensitivity to
patterns of variation in MaltE, even if there is less agreement about which aspects of this
variation are particularly salient in this variety. No doubt a more in-depth study focusing also
on the linguistic background of the listeners would uncover some patterns of perception in
relation to socio-indexical information and/or linguistic experience. Table 6.2 below shows
the full set of correlations with the stronger correlations shaded in grey.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28
1 1.0 0.9 0.9 0.7 0.6 0.8 0.6 0.8 0.6 0.6 0.9 0.6 0.9 0.8 0.9 0.7 0.7 0.6 0.9 1.0 0.3 0.9 0.8 0.6 1.0 0.6 0.9 1.0
2 0.9 1.0 0.9 0.7 0.3 0.8 0.7 0.6 0.4 0.6 0.7 0.7 0.8 0.6 0.8 0.6 0.5 0.4 0.8 0.8 0.4 0.9 0.9 0.6 0.9 0.6 0.9 0.9
3 0.9 0.9 1.0 0.7 0.5 0.9 0.7 0.6 0.4 0.5 0.8 0.8 0.9 0.7 0.9 0.8 0.8 0.3 0.8 0.8 0.4 0.9 0.7 0.8 0.9 0.6 0.9 0.9
4 0.7 0.7 0.7 1.0 0.3 0.8 0.7 0.5 0.2 0.5 0.6 0.6 0.7 0.5 0.8 0.8 0.5 0.4 0.7 0.7 0.4 0.8 0.7 0.7 0.7 0.8 0.8 0.7
5 0.6 0.3 0.5 0.3 1.0 0.4 0.3 0.7 0.7 0.1 0.7 0.1 0.4 0.8 0.5 0.5 0.6 0.7 0.7 0.6 0.1 0.3 0.2 0.4 0.6 0.2 0.5 0.4
6 0.8 0.8 0.9 0.8 0.4 1.0 0.9 0.4 0.2 0.4 0.8 0.9 0.9 0.8 0.9 0.9 0.8 0.3 0.7 0.7 0.6 0.9 0.8 0.9 0.8 0.8 0.9 0.8
7 0.6 0.7 0.7 0.7 0.3 0.9 1.0 0.1 -0.1 0.1 0.7 0.9 0.7 0.7 0.6 0.8 0.7 0.2 0.5 0.5 0.7 0.7 0.8 0.9 0.6 1.0 0.7 0.5
8 0.8 0.6 0.6 0.5 0.7 0.4 0.1 1.0 0.9 0.6 0.6 0.1 0.7 0.6 0.8 0.4 0.4 0.7 0.9 0.9 -0.2 0.6 0.3 0.2 0.8 0.0 0.6 0.8
9 0.6 0.4 0.4 0.2 0.7 0.2 -0.1 0.9 1.0 0.5 0.4 -0.2 0.4 0.5 0.5 0.2 0.3 0.8 0.7 0.6 -0.3 0.3 0.1 0.1 0.6 -0.2 0.3 0.5
10 0.6 0.6 0.5 0.5 0.1 0.4 0.1 0.6 0.5 1.0 0.2 0.2 0.6 0.2 0.6 0.3 0.1 0.2 0.6 0.6 -0.2 0.7 0.4 0.3 0.6 0.1 0.4 0.6
11 0.9 0.7 0.8 0.6 0.7 0.8 0.7 0.6 0.4 0.2 1.0 0.6 0.8 1.0 0.8 0.7 0.8 0.5 0.8 0.8 0.5 0.7 0.7 0.7 0.8 0.5 0.8 0.7
12 0.6 0.7 0.8 0.6 0.1 0.9 0.9 0.1 -0.2 0.2 0.6 1.0 0.7 0.5 0.7 0.7 0.7 -0.1 0.4 0.5 0.7 0.8 0.8 0.9 0.6 0.8 0.8 0.6
13 0.9 0.8 0.9 0.7 0.4 0.9 0.7 0.7 0.4 0.6 0.8 0.7 1.0 0.8 0.9 0.6 0.7 0.5 0.8 0.9 0.3 1.0 0.8 0.7 0.9 0.6 0.8 0.9
14 0.8 0.6 0.7 0.5 0.8 0.8 0.7 0.6 0.5 0.2 1.0 0.5 0.8 1.0 0.8 0.7 0.8 0.6 0.7 0.7 0.4 0.6 0.6 0.7 0.7 0.5 0.7 0.6
15 0.9 0.8 0.9 0.8 0.5 0.9 0.6 0.8 0.5 0.6 0.8 0.7 0.9 0.8 1.0 0.8 0.7 0.4 0.9 0.9 0.3 0.9 0.7 0.7 1.0 0.5 0.9 0.9
16 0.7 0.6 0.8 0.8 0.5 0.9 0.8 0.4 0.2 0.3 0.7 0.7 0.6 0.7 0.8 1.0 0.8 0.3 0.7 0.6 0.6 0.7 0.6 0.8 0.7 0.8 0.9 0.6
17 0.7 0.5 0.8 0.5 0.6 0.8 0.7 0.4 0.3 0.1 0.8 0.7 0.7 0.8 0.7 0.8 1.0 0.4 0.6 0.5 0.7 0.6 0.6 0.9 0.6 0.5 0.8 0.5
18 0.6 0.4 0.3 0.4 0.7 0.3 0.2 0.7 0.8 0.2 0.5 -0.1 0.5 0.6 0.4 0.3 0.4 1.0 0.6 0.6 0.0 0.3 0.3 0.2 0.5 0.2 0.3 0.5
19 0.9 0.8 0.8 0.7 0.7 0.7 0.5 0.9 0.7 0.6 0.8 0.4 0.8 0.7 0.9 0.7 0.6 0.6 1.0 1.0 0.1 0.8 0.6 0.5 1.0 0.4 0.8 0.9
20 1.0 0.8 0.8 0.7 0.6 0.7 0.5 0.9 0.6 0.6 0.8 0.5 0.9 0.7 0.9 0.6 0.5 0.6 1.0 1.0 0.1 0.9 0.6 0.4 1.0 0.4 0.8 1.0
21 0.3 0.4 0.4 0.4 0.1 0.6 0.7 -0.2 -0.3 -0.2 0.5 0.7 0.3 0.4 0.3 0.6 0.7 0.0 0.1 0.1 1.0 0.3 0.6 0.8 0.2 0.6 0.6 0.2
22 0.9 0.9 0.9 0.8 0.3 0.9 0.7 0.6 0.3 0.7 0.7 0.8 1.0 0.6 0.9 0.7 0.6 0.3 0.8 0.9 0.3 1.0 0.8 0.6 0.9 0.7 0.8 0.9
23 0.8 0.9 0.7 0.7 0.2 0.8 0.8 0.3 0.1 0.4 0.7 0.8 0.8 0.6 0.7 0.6 0.6 0.3 0.6 0.6 0.6 0.8 1.0 0.7 0.7 0.8 0.8 0.7
24 0.6 0.6 0.8 0.7 0.4 0.9 0.9 0.2 0.1 0.3 0.7 0.9 0.7 0.7 0.7 0.8 0.9 0.2 0.5 0.4 0.8 0.6 0.7 1.0 0.6 0.8 0.8 0.5
25 1.0 0.9 0.9 0.7 0.6 0.8 0.6 0.8 0.6 0.6 0.8 0.6 0.9 0.7 1.0 0.7 0.6 0.5 1.0 1.0 0.2 0.9 0.7 0.6 1.0 0.5 0.9 1.0
26 0.6 0.6 0.6 0.8 0.2 0.8 1.0 0.0 -0.2 0.1 0.5 0.8 0.6 0.5 0.5 0.8 0.5 0.2 0.4 0.4 0.6 0.7 0.8 0.8 0.5 1.0 0.7 0.5
27 0.9 0.9 0.9 0.8 0.5 0.9 0.7 0.6 0.3 0.4 0.8 0.8 0.8 0.7 0.9 0.9 0.8 0.3 0.8 0.8 0.6 0.8 0.8 0.8 0.9 0.7 1.0 0.9
28 1.0 0.9 0.9 0.7 0.4 0.8 0.5 0.8 0.5 0.6 0.7 0.6 0.9 0.6 0.9 0.6 0.5 0.5 0.9 1.0 0.2 0.9 0.7 0.5 1.0 0.5 0.9 1.0
Table 6.2 Paired correlations for 28 MaltE listeners
The shaded correlations are all noted to be strong. Many, though not all, of the above
shaded high correlations also have p < 0.05, indicating a significant degree of
correlation.
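The paired correlations in Table 6.2 and the per-listener means in Figure 6.6 can be sketched as follows. The text does not state here which correlation coefficient was used, so Pearson's r is an assumption, as is the exclusion of each listener's self-correlation (1.0) from the mean. All function names and scores are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def correlation_matrix(scores_by_listener):
    """Correlate every listener's scores with every listener's scores."""
    return [[pearson(a, b) for b in scores_by_listener] for a in scores_by_listener]


def mean_correlations(matrix):
    """Mean correlation per listener, leaving out the diagonal self-correlation."""
    means = []
    for i, row in enumerate(matrix):
        others = [r for j, r in enumerate(row) if j != i]
        means.append(sum(others) / len(others))
    return means


# Hypothetical normalised scores from three listeners over four clips.
listeners = [
    [1.2, 0.8, 1.0, 0.4],
    [1.5, 0.9, 1.1, 0.3],
    [0.6, 1.1, 0.9, 1.3],
]
matrix = correlation_matrix(listeners)
for listener, mean_r in enumerate(mean_correlations(matrix), start=1):
    print(f"Listener {listener}: mean r = {mean_r:.2f}")
```

Note that a listener who gave every clip the same score, as the excluded categorical scorer did, has no variance and therefore no defined correlation with anyone else, which is one practical reason such a set of scores cannot enter this analysis.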
6.1.2 Perception study (Study 2): discussion and conclusions
Magnitude estimation has most recently been described as an ideal method for
measuring "introspection data" (Hoffmann 2013) and it has proven a useful tool for
measuring native listeners' perceptions of MaltE for two reasons. In the first place there is the
reference to an externalised referent in the modulus. Other scales and instruments (such as a
Matched Guise technique, or a Likert-type scale) can also use referents in the same way as the
modulus is used for ME, but ME has an edge in requesting participants (here listeners) to
give their own estimation of by how much a stimulus differs from that referent. This process helps to externalise and capture
thoughts, and achieve some distance from personal positions.
To say that personal experience or opinions on language usage are put aside
completely is too simplistic by far, but it is perhaps acceptable to claim that by asking a
listener to focus on two points, the listener's own position is at least held in check. The
personal input in terms of each listener having to decide their own scale and subsequent
numerical allocations helps to involve each listener in the task, without allowing too much
time to indulge entrenched opinions which may or may not reflect realities in a credible way.
A second related advantage of using ME is in allowing listeners to present their own scale,
which emphasizes the freedom of each listener to operate within a range that is familiar and
meaningful to them, compared with other scales which "might artificially limit their [the
participants'] choices" (Hoffmann 2013:103). While there is, generally speaking, nothing
unfamiliar about many of the available scaled or forced-choice categorisation tasks, the
option to operate within individually determined parameters may well serve to encourage
participants (listeners) to invest some energy in creating a scale which is more meaningful to
them. From my perspective in trying to understand how MaltE might be perceived, ME offers
the potential to capture gradience, and the introspective judgments of perceptions of stimuli
which are inherently continuous, to a large degree.
The resulting data suggest encouraging trends in patterns of perception of what may
be considered identifiable MaltE. Both Study 1 and the Perception study in Study 2 show
many instances of strong correlations across judgment scores in each group of expert and
naïve listeners, suggesting both that perception of what is identifiably MaltE can go beyond
entrenched attitudes, and also, that there are in fact, strong patterns of variation in the
production of MaltE that can trigger those perceptions. Both the expert and the naïve listener
groups demonstrated sensitivity to patterns of variation in the MaltE variety and could
respond to this in nuanced ways. Some of this nuance is captured by the proportions of
listeners judging each clip. A larger proportion allocating high scores to one clip is taken to
suggest that that clip therefore strikes a wider range of listeners as more identifiably MaltE
(see Section 6.1.1). Another aspect of nuance can be captured by examining an individual
listener's pattern of scoring across all of the clips in order to discover whether there may be
patterns of perception that can be interpreted as a function of the listener's linguistic
experience and background.
Ultimately, and for the purposes of this study which seeks to look at patterns of
production of identifiable MaltE, as well as patterns of perception, I have chosen to broadly
categorise the results from the Study 2 Perception study into three types, indicating speakers in
clips who are highly identifiable, moderately identifiable or non-identifiable (see Section
6.1.1 above). However, ME could doubtless be exploited more fully in order to study other
aspects of the patterns of perception of MaltE.
6.2 Patterns of variation in the production of speech in 6 MaltE speakers
Now that speakers in the clips used for the Perception Study described in Section 6.1
above have been categorised as more identifiable or less so, this chapter will focus on six of
the speakers and examine patterns of variation in their speech more closely. Section 6.1
concluded that native MaltE listeners both in Study 1 and in Study 2 were able to judge quite
systematically those speakers who could be considered highly identifiable, in relation to a
modulus, as well as those who were completely non-identifiably MaltE. Another broad
category to be observed is that of the middle ground, where speakers were still often judged
highly identifiable, but by a smaller proportion of the listener populations in both studies (see
Section 6.1.1 above for these categorisations). In this chapter, features in the speech of six
speakers drawn from these three broad categories (highly identifiable, moderately
identifiable and non-identifiable) are examined with a view to considering whether
identifiability (or non-identifiability) can be determined on the basis of such features. The
other speakers not falling into these two clear categories, but instead inviting a slightly less
categorical response, either in terms of lower scores, or in terms of fewer listeners judging
them identifiable, would be expected to show some characteristics of both identifiable and
non-identifiable speakers. These will also be analysed in this chapter.
The findings presented in this and the following sections are therefore based on a
qualitative study of six of the speakers recorded for Study 2 (see Section 4.3.1). The speakers
analysed here correspond to Clips 1, 2, 3, 4, 5, and 8, in the Perception study in Study 2 (see
Section 6.1.1 above). They have been relabelled and reordered for the purposes of analysis
and presentation in this section, to correspond with the data files on the accompanying CD.
The data files are organised in descending order of identifiability to include the 3 clips judged
highly identifiable, and labelled throughout this section (and on the data files) as Sp(eaker)1,
Sp2 and Sp3; 2 clips judged moderately identifiable, and labelled Sp4 and Sp5; and finally
one clip judged non-identifiable, and labelled Sp6. For clarity, the clips presented during the
Study, together with the corresponding Sp(eakers) analysed are tabulated below.
Table 6.3 Summary of Labelling
The Perception Study   Clip01  Clip03  Clip04  Clip02  Clip05  Clip08
Variation Analysis     Sp1     Sp2     Sp3     Sp4     Sp5     Sp6
In all, 4 segmental features were analysed, namely the substitution of the phoneme
represented orthographically as 'th', labelled throughout this work as (th); variation in the
realisation of /æ/, labelled throughout as (a); the choice of vowel reduction using /ə/ as
opposed to using full vowels, labelled throughout as (schwaØ); and the use, or otherwise, of
post-vocalic r, labelled throughout as (r). A qualitative analysis of the specific
characteristics and varied realisations of (th), (a), (schwaØ) and (r) follows in the next
section, but for the purposes of quantifying instances of each segmental feature and their
varied realisations, this section contains findings described in terms of their proportional
occurrence in categorical terms. This was calculated following a tally of all occurrences of
each characteristic for each of the six speakers across all four tasks, and findings are
presented as proportions of the total number of instances tallied (see Section 4.4.2). Specific
instances of variation in the different realisations of each of these segments are discussed
further in Section 6.3 and its subsections.
This section presents the qualitative study of patterns of variation in the six speakers
chosen according to the criteria outlined above in section 6.1. It is naturally not advisable to
expect to be able to base any sort of predictive model of identifiability on these six speakers;
however, two trends observed in this group of speakers are worth focusing on in some detail.
These are dealt with in Section 6.2.2, while in Section 6.2.1 I analyse variation in each of the
five linguistic variables in turn.
In the following analysis, speakers 1, 2 and 3 (Sp1, Sp2, Sp3) all received high scores
as highly identifiable for MaltE from the native listeners in Study 2 (see Section 6.1.1), while
Sp(eaker)6 was considered unidentifiable as a MaltE speaker. Speakers 4 and 5 also received
high identifiability ratings but by a slightly lower percentage of listeners. The overwhelming
majority of the remaining percentage of listeners judging Sp4 and Sp5 less identifiable than
the modulus still judged these speakers rather highly in relation to the modulus. In other
words, if the normalised score for the modulus was 1, listener scores, with 3 exceptions, did
not go below .70 and were mostly much higher (see Section 6.1.1 for more details).
The overview of the findings presented in this section is based on both the read text
task ('TextAloud') and the more spontaneous speech styles. Results are presented in the form
of raw scores and proportions, which are useful for easier and direct comparisons.
Comparisons between findings for TextAloud and more naturalistic speech in the unguided
tasks (Description, Differences and Sentences), are also drawn where appropriate. In all, five
characteristics were elicited throughout the data collection process through four different
tasks. Four of the five features are segmental features which can be analysed phonetically,
without reference to other domains of analysis, such as the syllable, their coarticulation with
other elements, lexical stress, intonation, or indeed any other element more typically
associated with the prosodic domain. In some ways such narrow analysis may be considered
contrived, as it is of course difficult to imagine that a post-vocalic r, for example, can be
studied without reference to the influence it has on adjacent segments, or indeed, across
syllable boundaries. On the other hand, it is also useful to study individual segments in
isolation in order to then be in a position to better understand their behaviour in connected
speech. For example, in the first instance, the variable (th) is examined in isolation and
without reference to adjacent segments; however, Section 6.3.3 also considers further
characteristics of this feature, and directions for potential further investigation are also
indicated in Chapter 8. The same multi-layered analysis is applied to both the variables
labelled (a) and (r), which are examined in this section in terms of a binary yes/no
categorisation: whether (a) is realised as [æ] or not, and whether (r) has a null or overt realisation.
Sections 6.3.4 and 6.3.5, meanwhile, will look at more specific patterns and their distribution
across individual speakers. A different kind of analysis (PVI V.Dur) is used to report on the
analysis of rhythm and timing, which is by nature prosodic. The results of this analysis are
reported at the end of 6.2.1.
6.2.1 Patterns of variation at the segmental level: An overview
As expected, variation in each of the five chosen variables is often realised more in
terms of continua, rather than in terms of stark contrasts. As Bonnici (2010) notes in relation
to rhoticity, analysis is therefore largely a case of identifying more versus less variation,
rather than presence or absence of variation. It is expected that the more identifiably MaltE
rated speakers would show more evidence of those features considered relatively more
variant, with variation here being defined with reference to earlier descriptions (see Sections
2.2.1 and 4.2.1) of aspects of MaltE, such as those identified in Mazzon (1993), Vella (1995),
Debrincat (1999) or Bonnici (2010). The variant realisations of each of the features in
question here would therefore involve:
o more rhoticity - (r)
o more variant realisation of (a), approximating /ɛ/ or /ɐ/, as well as /æ/- (a)
o varied substitution of (th) - (th)
o more full vowel realisation in place of schwa – (schwa Ø)
o less variability across vowel segment durations – (PVI V.Dur)
The bar charts in figure 6.7 (i-v) illustrate distributions for each of the five variables
above across the six speakers. The bars express the percentage proportion of variation among
the number of instances for each variable across all four speech styles, for the segmental
variables. The (PVI V.Dur) variable is expressed as an index. The title of each chart (i-v)
refers to the linguistic variable, rather than to an actual phonological/phonetic identity, in
order to avoid confusing the label with its varied realisations.
Figure 6.7 (i-v). Proportion of variation in five variables for six speakers
[Panels (i)-(iv): bar charts for (th), (schwaØ), (r) and (a), percentage per speaker, Sp1 to Sp6. Panel (v): bar chart for (PVI V.Dur), index per speaker, Sp1 to Sp6.]
A glance at the bar charts in Figure 6.7 above suggests that the four segmental
variables show a strong level of variation, expressed in higher percentage proportions for the
different speakers along the continuum of variation in MaltE (see also Figures 6.30 and 6.31
in Section 6.4.1). Note the relatively stark contrast between Sp6 (non-identifiable) on the one
hand, and Sp1, Sp2 and Sp3 on the other hand. In the case of the four segmental variables,
Sp6 presents little or no evidence of marked variation, while Sp1, Sp2 and Sp3 systematically
show the highest level of variation in most of the four variables. Similarly, although the scale
here is reversed, in (PVI V.Dur) Sp6 shows a higher index than all the other speakers,
indicating more variability in vowel durations, and Sp1, Sp2, Sp3 and Sp4 all indicate a much
lower index, and lower vowel duration variability, while Sp5 falls somewhere in between the
two camps. This would be the expected pattern of distribution across the speakers in view of
both previous research and impressionistic views which suggest that MaltE prefers full
vowels over vowel reduction.
Both Sp4 and Sp5 (moderately identifiably MaltE), sometimes present patterns
similar to that of the non-identifiable speaker, and at other times, more closely resemble the
remaining 3 highly identifiable speakers. To take the variable labelled (th), for example, five
of the speakers (Sp1-Sp5) can be broadly categorised in two groups as illustrated in Figure 6.8
below, while Sp6 (non-identifiable) forms a third category.
Figure 6.8 Varied realisation of /θ/ and /ð/ (variable (th)) distribution across all 6 speakers
[Bar chart, (th), percentage of substitution per speaker: Sp1 87.8; Sp2 76.9; Sp3 88.4; Sp4 60.5; Sp5 63.6; Sp6 0.0]
Sp1, Sp2 and Sp3 all had a high proportion of substituted /θ/, or /ð/, around the 80%
range, while Sp4 and Sp5 had a moderately high proportion of substitution, around the 60%
range. Conversely, Sp6 showed no evidence at all of (th) substitution. The mean proportion
of frequency of (th) substitution for all 6 speakers is 62.87 (standard deviation=32.95), with
Sp1, Sp2 and Sp3 well above the mean, and the two moderately identifiable speakers Sp4 and
Sp5 clustering closely around the mean. Thus Sp4 and Sp5, while presenting a lower
proportion of (th) substitution than the 3 highly identifiable speakers, can still be more closely
patterned with the identifiable, rather than with the non-identifiable, in this case.
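As a check on the summary figures quoted above, the following minimal Python sketch recomputes the mean and standard deviation of (th) substitution from the percentages shown as data labels in Figure 6.8. It assumes that these six percentages are the values being summarised and that the standard deviation is the sample (n - 1) form; the variable names are illustrative only.

```python
from statistics import mean, stdev

# Percentage of (th) substitution per speaker, read from the data labels of Figure 6.8.
th_substitution = {
    "Sp1": 87.8, "Sp2": 76.9, "Sp3": 88.4,
    "Sp4": 60.5, "Sp5": 63.6, "Sp6": 0.0,
}

values = list(th_substitution.values())
th_mean = mean(values)   # arithmetic mean across the six speakers
th_sd = stdev(values)    # sample standard deviation (n - 1 denominator); an assumption

print(f"mean = {th_mean:.2f}, sd = {th_sd:.2f}")
```

Under these assumptions the result agrees with the reported mean of 62.87 and standard deviation of 32.95.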
To take a different pattern, results for the variable (a) show the six speakers grouped
differently. In the first place, the proportion of variation is lower than in the (th) variable, as
Figure 6.9 below illustrates.
Figure 6.9 Distribution of variant realisations of /æ/ (variable (a)) in six speakers
[Bar chart, (a), percentage of variant realisations per speaker: Sp1 64.0; Sp2 9.4; Sp3 34.5; Sp4 0.0; Sp5 0.0; Sp6 0.0]
In this case, only the 3 most identifiably MaltE speakers presented variation in the
realisation of (a), collectively grouped as [e] or [ʌ] variants (see section 6.3.4 below for a
qualitative analysis of this). On the other hand, the moderately identifiable speakers, Sp4
and Sp5, together with the non-identifiable speaker, Sp6, consistently used [æ] with no
variation in its realisation, which presents a different distribution of a marked feature across
the six MaltE speakers, compared with the distribution of (th).
In the case of (schwaØ), the preference for full vowels seems once again to indicate a
clear pattern of distribution between the highly identifiably MaltE speakers, on the one hand,
and on the other hand, the moderately identifiable and non-identifiable speakers. The cut-off
point here, however, is much less categorical than in the case of (th) above, as the data points
in Figure 6.10 below indicate:
Figure 6.10 Proportion of the realisation of schwa as full vowels in six speakers
[Bar chart, (schwaØ), percentage of full vowel realisations per speaker: Sp1 51.9; Sp2 41.1; Sp3 60.9; Sp4 31.9; Sp5 20.5; Sp6 8.2]
In Figure 6.10 above, the higher frequencies indicate greater preference for full
vowels over schwa, while the lower frequencies (for example in Sp6) indicate a preference
for schwa. The mean proportion of frequency for realisation of schwa as full vowels defined
as variable (schwaØ) is 35.73 (sd=19.62), with the 3 highly identifiable MaltE speakers
presenting higher proportions, and the 2 moderately identifiable speakers still hovering
around the mean but not as closely as in the case of (th) above and also more widely
dispersed than in the case of (th). Again, the non-identifiable speaker, Sp6, falls well below
the mean.
Finally, in the case of rhoticity labelled (r) in Figure 6.11 below, the frequency of
post-vocalic (r) is once again distributed slightly differently across the six speakers:
Figure 6.11 Preference for post-vocalic (r) across six speakers
[Bar chart, (r), percentage of overt post-vocalic (r) per speaker: Sp1 41.2; Sp2 84.3; Sp3 88.1; Sp4 84.3; Sp5 4.3; Sp6 7.8]
Higher frequencies here indicate a higher proportion of overt realisation of post-vocalic
(r) as opposed to null realisation of post-vocalic (r). Here, the proportion of post-vocalic
(r) in Sp1 is much lower than that of the 2 remaining highly identifiable speakers
(Sp2 and Sp3), and also lower than one of the more moderately identifiable speakers, Sp4. On
the strength of the other variables discussed, Sp1 would be expected to present patterns
similar to those of Sp2 and Sp3, as highly identifiable speakers, whereas in fact, Sp1 here
falls somewhere between the highly identifiable and non-identifiable speakers. Similarly,
Sp5, considered a moderately identifiable speaker, presents an even lower proportion of (r)
than even the completely non-identifiable Sp6. The other moderately identifiable speaker,
Sp4, also presents a pattern of (r) distribution more similar to Sp2 and Sp3, rated as highly
identifiable speakers. Reasons for these apparent anomalies are discussed further below in
6.4, as it is becoming increasingly clear that rhoticity is likely to present an instance of more
localised, as opposed to external, norm-developing patterns. So the overt use, or not, of post-
vocalic (r) may be a function of localised social norms, and its distribution may also change
as a function of register.
The variable (PVI V.Dur) presents a more fine-grained pattern of variation, and in this
case, the more identifiable speakers are expected to have lower indices, while the less
identifiably MaltE speakers would be expected to have a higher index. This is expected
because of the noted preference for full vowels in MaltE patterns, while the non-identifiable
speaker, who is an English dominant speaker having lived abroad for some of her life (see 6.4
below), would be expected to use more vowel reduction in weak stressed syllables, and
possibly more generally, more phonetically reduced speech. The resulting measure of
variability would therefore be expected to be higher in the non-identifiable speaker who may
have a more pronounced distinction between the reduced weak stressed syllable segments and
the durationally longer strong stressed ones. Conversely, the more identifiably MaltE
speakers would not be expected to present such variability between reduced and non-reduced
vowel segments. In the first discussion of the Pairwise Variability Index in Section 4.2.1.3
above, such fine-grained patterning was considered too fine to give a clear indication of
rhythm classification across different languages; however, this observation was made in the
context of a very disparate cohort of speakers, with just one speaker for each different
language (Grabe and Low, 2002). In the current context, where all the speakers belong not
only to the same language group, but also to the same variety, MaltE, it is expected that the
differences in PVI across the 6 speakers will be slight, but still indicative for our purposes. A
glance at the results for the variable (PVI V.Dur) above in Figure 6.7 verifies this expectation
which will be discussed in further detail below in section 6.2.2.
In the meantime, if we pan out, again, to consider all five variables collectively, we
see a picture of variation in MaltE that suggests encouraging evidence of systematic
patterning both within and across the five salient characteristics which have been singled
out. In the first place, the three most identifiable speakers often present similar distributions
in some of the characteristics, notably in variant realisations of fricatives /θ/ and /ð/, variant
realisation of /æ/, and in a preference for full vowels instead of schwa. On the other hand, the
two moderately identifiable speakers taken together, can sometimes be grouped with the non-
identifiable speaker, as in the case of variant realisation of /æ/, but also share some patterns
with the highly identifiable speakers, such as in variant realisations of fricatives /θ/ and /ð/.
Here too, however, while there are similarities between Sp4 and Sp5 as moderately
identifiable speakers, this is not to say that they present a uniform pattern, as can be noted in
the case of overt vs null realisation of post-vocalic (r), or in the case of (PVI V.Dur), where
Sp4 is more similar to Sp1, Sp2 and Sp3, while Sp5 is more closely patterned with Sp6. This
overview of the results of the data analysis seems to suggest, then, some clear indications of
how the continuum of variation within MaltE may begin to be mapped out, with respect to
these five characteristics in question.
6.2.2 Two trends
All five variables were tested for correlations with identifiability ranking obtained
through the Perception study in Study 2, presented in Section 6.1 and its subsections.
Pearson's correlation coefficient showed strong, statistically significant correlations between
identifiability and (PVI V.Dur), (schwaØ), and (th). The test returned a negative correlation of
-0.883 (p value = 0.02) for (PVI V.Dur), a positive correlation of 0.857
(p value = 0.03) for (schwaØ), and a positive correlation of 0.965 (p value = 0.001) for (th).
Correlations were also run for (a) and for (r), but these did not reach
significance. For (a) and identifiability, the correlation was 0.620 (p value = 0.19), while for
(r) and identifiability, the correlation was 0.581 (p value = 0.23).
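The kind of calculation reported here can be illustrated with a minimal Python sketch. The inputs below are the per-speaker values shown as data labels in Figures 6.8 to 6.12; treating these rounded chart values as the inputs to the correlation, and the percentage of listeners in Figure 6.12 as the identifiability measure, is an assumption, and the function and variable names are illustrative only. The sketch computes the coefficient alone, not the p values.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)

# Values per speaker (Sp1..Sp6), read from chart data labels; assumed inputs.
identifiability = [89, 82, 71, 57, 54, 4]                  # % listeners, Figure 6.12
variables = {
    "(PVI V.Dur)": [55.1, 56.8, 49.5, 57.9, 69.7, 81.1],   # Figure 6.12
    "(schwaØ)":    [51.9, 41.1, 60.9, 31.9, 20.5, 8.2],    # Figure 6.10
    "(th)":        [87.8, 76.9, 88.4, 60.5, 63.6, 0.0],    # Figure 6.8
    "(a)":         [64.0, 9.4, 34.5, 0.0, 0.0, 0.0],       # Figure 6.9
    "(r)":         [41.2, 84.3, 88.1, 84.3, 4.3, 7.8],     # Figure 6.11
}

results = {name: pearson_r(vals, identifiability) for name, vals in variables.items()}
for name, r in results.items():
    print(f"{name}: r = {r:.3f}")
```

With these inputs the coefficients for (PVI V.Dur) and (th) come out at about -0.883 and 0.965. Those for the other three variables land close to, though not exactly on, the values reported above, which may reflect unrounded or slightly different inputs in the original calculation.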
Two possible reasons for the lack of clear correlation for (a) and for (r) could be a
relatively small number of tokens in each variable concerned, or the fact that the number of
speakers presenting a relatively wide range of variation within MaltE, constituted too small a
cohort on which to carry out any level of predictive modelling. Total number of tokens for
instances of the variable (r) which could be realised as either null or overt, ranged between 47
and 64, across the six speakers; and for variant realisation of /æ/ (label (a)), they ranged
between 25 and 32 tokens across the six speakers. Attempts were made to convert the data
sets for these two variables, to an indexing system suggested in Chambers and Trudgill
(1998), where each variant of a variable is multiplied by 1 or by 2 in cases of more than one
variant, while each instance of the expected realisation is multiplied by 0 (see more on this in
Chapter 7 below). The result is a higher or lower number as an index of variation for a
particular variable, depending on how many unmarked and how many marked tokens each
speaker presents. However, this conversion to indices creating higher numbers did not
increase the correlation at all. The other - and more likely - possibility is that the six speakers,
presenting, as they did, a rather substantial range of variation within the continuum of MaltE,
constituted too small a number of speakers for any significant correlation to be observed.
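The indexing conversion described above can be sketched as follows. This is a minimal illustration of the weighting as summarised in this section (0 for the expected realisation, 1 or 2 for variant realisations), not a reproduction of the procedure in Chambers and Trudgill (1998); the function name, the labels and the token counts are hypothetical.

```python
def variation_index(token_counts, weights):
    """Weighted sum of token counts: each realisation's count times its weight."""
    return sum(weights[realisation] * count
               for realisation, count in token_counts.items())

# Hypothetical weights: the expected realisation scores 0, variants score 1 or 2.
weights = {"expected": 0, "variant_1": 1, "variant_2": 2}

# Hypothetical token counts for one speaker and one variable (not data from this study).
speaker_tokens = {"expected": 20, "variant_1": 6, "variant_2": 4}

index = variation_index(speaker_tokens, weights)
print(index)  # 0*20 + 1*6 + 2*4 = 14
```

A speaker with more marked tokens receives a higher index, while a speaker using only the expected realisation receives 0.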
These results suggest a particular trend worth exploring further. Note that
strong correlations are obtained for (PVI V.Dur) and (schwaØ), as well as for (th), but not
for the remaining two variables, (r) and (a), most likely because the number of speakers is
quite small. Alongside this, there is the evidence presented earlier in
Figure 6.7 (i-v) that the proportion of variant forms of the variables labelled (a) and (r) is
considerably higher for the more identifiable MaltE speakers, and by contrast much lower, or
completely absent, for the less identifiable MaltE speakers. The six speakers for this analysis
were chosen because they presented a wide enough range of the different manifestations of
variation along the continuum of the MaltE variety. Given such a potentially wide range of
variation, spread rather thinly across just six speakers, it is to be expected that it would be
difficult to observe patterns which have any high predictive power. It is perhaps, therefore,
more remarkable that three of the variables did, in fact, still return a strong correlation, rather
than that the remaining two variables did not return a significant correlation. I suggest that the
three variables which have correlated strongly with identifiability ratings therefore be
considered strong triggers in the perception of identifiable MaltE across the whole continuum
or range of variation in this variety. In other words, these three variables are prevalent in
MaltE speakers on any point along the continuum and they co-vary strongly with the
perceived identifiability ratings for MaltE. Conversely, the remaining two variables are
clearly also to be considered particularly salient for MaltE, and therefore most likely do
trigger a perception of identifiability, as Figure 6.7 above suggests, but they may operate on a
different pattern, in being salient for specific subsets along the continuum, rather than being
prevalent across the full spectrum of MaltE. These subsets are further described in Section
6.4 below. Two trends are therefore observed in these data, suggesting firstly, that the
combination of variant realisation of full vowels versus schwa, together with the degrees of
variability in vowel duration are highly indicative of identifiability in MaltE. The correlation
between the third variable (th) and identifiability ratings also suggests that this variable is a
strong trigger in the perception of identifiable MaltE. Conversely, the two remaining
variables (r) and (a) clearly present patterns of variation in identifiably MaltE speakers, when
compared with less identifiable MaltE speakers, but in these cases, variation may be better
understood as a function of subtle social meaning, at a more localised level, compared with
the broader variation of, for example, vowel duration patterns, or (th).
In both cases, of strongly correlated and of non-correlated characteristics, it would be
useful to expand on this study with larger subject populations both across the various MaltE
speech communities, as well as within subgroups of MaltE speakers.
6.3 Fine-grained analysis
In this section I now turn to considering each of the five variables in turn, and their
distribution across the six speakers. In a general observation, all five variables yielded
interesting patterns of variation, and given the evidence of this data, it is now clearer why
these are so often referred to as somehow relevant to MaltE speakers, either obliquely in
anecdotal reference or parodied imitation, or more objectively and systematically by expert
listeners. This section also considers individual patterns of variation with reference to each of
the 5 characteristics.
6.3.1 A Pairwise Variability Index (Vowel Durations)
The PVI analysis of vowel duration follows the formula presented in Grabe and Low
(2002), but it also follows Nokes and Hay (2012) in analysing each vowel segment (see
Section 4.4.3 for a closer description of how this method was used), rather than each vocalic
vs intervocalic segment. This distinction is important given that it was specifically vowel
segment durations that this research was primarily interested in. Nokes and Hay (2012) make
a similar distinction also with reference to the moot point that an acoustic measurement of
various timing patterns cannot necessarily be generalised and extrapolated to account for
rhythm classification. In other words, as described more fully in the discussion on rhythm in
Section 4.2.1.3 above, it is important not to consider durational features to be the sole
acoustic cue in the classification of rhythm patterns across languages. However, as Nokes and
Hay clarify, the perception of stress and rhythm patterns in English is still closely associated
with durational features, and for this, the vowel segment is a likely candidate in capturing
timing: "Other factors held equal, a longer vowel length will give rise to a percept of syllable
length, and thus rhythmic prominence, in English" (Nokes and Hay 2012:4). With this
argument, the authors carry out a PVI analysis on successive vowel segments, as opposed to
vocalic vs intervocalic groups, which is what I have also done here.
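For reference, a pairwise variability index over successive vowel durations can be sketched as below. The sketch assumes the normalised form of the index (nPVI), in which each difference between adjacent durations is divided by the mean of the pair and the average is multiplied by 100; whether this is exactly the variant applied here is an assumption (see Section 4.4.3 for the method actually used). The function name and the durations are illustrative only.

```python
def npvi(durations):
    """Normalised Pairwise Variability Index over successive durations.

    nPVI = 100 * mean( |d_k - d_(k+1)| / ((d_k + d_(k+1)) / 2) )
    taken over all adjacent pairs k = 1 .. m-1.
    """
    if len(durations) < 2:
        raise ValueError("at least two durations are needed")
    pairs = zip(durations, durations[1:])
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100 * sum(terms) / len(terms)

# Hypothetical vowel durations in milliseconds (not data from this study).
even_timing = [100, 100, 100, 100]   # no variability between successive vowels
alternating = [150, 50, 150, 50]     # long/short alternation, as with reduced vowels

print(npvi(even_timing))
print(npvi(alternating))
```

Evenly timed vowels give an index of 0, while a strict long/short alternation gives a much higher value, which is the direction of the contrast discussed for the more and the less identifiable speakers.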
The resulting analyses are not directly comparable with either Grabe and Low (2002)
or with Nokes and Hay (2012) both because speakers are using different texts, and in the case
of Grabe and Low (2002), because different structures are being measured. Nevertheless,
some observations and comparisons can still be drawn. For example, Grabe and Low's British
English speaker obtained one of the higher indices among a group of speakers of eighteen
different languages. This speaker, together with a Dutch and a German
speaker (all three languages would be classified 'stress-timed'), had the highest PVI
indices compared with the other languages, including other stress-timed languages.
Similarly, Sp6 in my data also obtained a high index for vowel duration variability (PVI
V.Dur). This speaker was considered non-identifiable for MaltE, and could possibly be
mistaken for an English person, having spent part of her life in England.
In the case of the more moderately identifiable speakers Sp4 and Sp5, the higher (PVI
V.Dur) might begin to account for the perception that, although still perhaps recognisably
MaltE, they are no longer unequivocally identifiably MaltE, as are Sp1, Sp2 and Sp3. Figure
6.12 below illustrates the way in which this index patterns with identifiability.
Figure 6.12 Comparing % listeners' identifiability ratings and measured (PVI V.Dur)
[Bar chart, "PVI V.dur and Identifiability in MaltE", by speaker. PVI V.Dur: Sp1 55.1; Sp2 56.8; Sp3 49.5; Sp4 57.9; Sp5 69.7; Sp6 81.1. Identifiability rating (%): Sp1 89; Sp2 82; Sp3 71; Sp4 57; Sp5 54; Sp6 4]
The bar chart in Figure 6.12 above presents the identifiability rating for each speaker
(pale), alongside the (PVI V.Dur) index, showing the corresponding dip in identifiability
rating as vowel duration variability increases. This pattern is borne out by the significant
negative correlation obtained between identifiability ratings and (PVI V.Dur), as described
in 6.2.2 above.
Nokes and Hay (2012) looked at changes in segment durations over real time in New
Zealand English. As the authors describe it, New Zealand English is understood to be more
syllable-timed than other varieties of English, and further, this current observation is seen as a
shift from earlier rhythm patterns, observed to have been much more stress-timed (Nokes and
Hay 2012:1). The study describes vowel duration patterns across speakers born between 1851
and 1988, presenting almost a century's worth of data. The PVI analysis for the more
identifiable MaltE speakers I have presented here yields indices more or less within the same
region as those in Nokes and Hay for the younger population (born around 1970s onwards),
whose English is described as more syllable-timed than that used some 50 years earlier. The
Nokes and Hay (2012) indices also present a continuum of change ranging from 69.7 for a
speaker born in 1868, to 51.6, for a speaker born in 1981, indicating a decrease in durational
variability over time. In the data in the present work, indices (PVI V.Dur) for the more
identifiably MaltE speakers range from 49.5 to 56.8, while the index for the
non-identifiably MaltE speaker, who could in fact be considered to sound English, is much higher
at 81.1.
6.3.2 Preference for full vowels over schwa (schwaØ)
The variable labelled as (schwaØ), namely the substitution of an expected schwa by
full vowels instead, was another variable where correlation patterns could be considered
indicative, but also expected, given the strong correlation between a lower (PVI V.Dur) and
identifiability. The preference for full vowels over reduced ones in unstressed positions in
MaltE speakers is widely noted, and stigmatised, but not fully understood. The current
analysis was considered to be a preliminary exploration of this pattern, with only those
instances where a schwa in weak stress position could confidently be expected. Those
instances where weak stress could have a number of other realisations, as well as schwa, were
not included here. In TextAloud, for example, the word 'handlebar' was not analysed as
expecting schwa. In 'handlebar', the weak-stressed penultimate syllable could easily be
realised as a syllabic [l̩].
In the case of the naturally occurring speech data, each instance where schwa could be
expected was analysed, but here too, those instances where there was any query or
uncertainty over whether or not a schwa could be expected, were not included in the analysis.
For example, in Sp4, "the" (Figure 6.13a below) in the context of a pause was not analysed,
as the pause itself is likely to have triggered vowel lengthening in this case. As can be
expected in variant patterns involving vowel duration, timing and rhythm, phrase final
contexts may well present an important focus of attention for future studies. In the second
example in Figure 6.13b, "enthusiast" in Sp3 was also not analysed for a schwa in the
antepenultimate syllable, for two reasons. Firstly, although the vowel in the first syllable of
"enthusiast" could be reduced, it might not automatically be reduced to a schwa, and some
dictionaries in fact list the vowel in "en" as a short /ɪ/ (the three dictionaries checked were:
Oxford English Dictionary Online, Cambridge Dictionary Online, Collins Dictionary
Online). In the second place, this vowel also yielded variant stress patterns across many of
the MaltE speakers in the data, where in fact, the antepenultimate syllable often took primary
stress, as opposed to the more expected penultimate syllable taking primary stress, resulting -
in a broad transcription - in /ˈentʊzjæst/. In the example in Figure 6.13b below, the final
syllable contains a schwa, in this case, but this too was not analysed, as it was not considered
to be part of the expected pattern either.
Figure 6.13 a. "the", Sp4
Figure 6.13 b. "enthusiasts", Sp3
In spite of its lack of phonemic status in either English or Maltese, in fact, the data
yields evidence of substantial schwa realisation, (including in unexpected positions, as
evidenced in Figure 6.13 (b), above) together with an alternative preference for full vowels
instead. Examples of full vowels analysed where schwa might have been expected are given
below in Figures 6.14 (a-c).
Figure 6.14 a. "motorbike", Sp3, with a full vowel instead of schwa in "tor"
Figure 6.14 b. "the weather", Sp3. Evidence of a schwa, and of a full vowel where schwa was expected, respectively
Figure 6.14 c. "father", Sp4. Two instances of "father", with expected schwa in syllable final position
The figures above present evidence of both schwa and a preference for full vowels in
different speakers. This pattern of evidence of both full and reduced vowels (in relation to
schwa, here) holds across all six speakers, and not just for those speakers who are less
identifiable, and who might define themselves as English-speaking (see below Section 6.4 for
more on this), though to varying degrees. However, a preference for full vowels over
reduction to schwa is much more prevalent in those speakers considered more identifiably
MaltE. Sp1, Sp2 and Sp3 all had more instances of full vowels where schwa might have been
used instead, compared with the remaining three speakers. This pattern is distributed fairly
evenly throughout both the naturally occurring speech data (spontaneous speech style) and
the scripted text (careful speech style), but with a slightly higher number of full vowels over
schwa in the spontaneous over the read texts (see Section 6.4.2 below for a discussion on
register and speech styles). In Table 6.4 below, preference for full vowels where schwa could
have been expected is expressed as raw figures, accompanied by the total number of
instances of schwa for each speaker.
Table 6.4 (schwaØ) in 'Spontaneous' and in scripted 'TextAloud' speech
        Spontaneous                  TextAloud
        (schwaØ)   Totals   %        (schwaØ)   Totals   %
Sp1     24         30       80       20         40       50
Sp2     11         26       42       14         33       42
Sp3     23         35       66       21         36       58
Sp4     21         60       35       9          32       28
Sp5     11         46       24       5          32       16
Sp6     4          31       13       1          30       3
Note that the percentage of preferences for (schwaØ), in other words for full vowels
where schwa could be expected, is higher or at least the same, for Spontaneous speech across
all speakers, both identifiable and non-identifiable. The three most identifiable speakers also
consistently show greater preference for (schwaØ) than the less identifiable and non-
identifiable speakers, and this pattern can be seen to complement the pairwise variability
index (Section 6.3.1) measuring variability in vowel durations. Preference for full vowels
over schwa would be expected to encourage more homogeneity across vowel durations in a
speaker, and this combined pattern is noted in these data. However, this is not to say that a
preference for full vowels is the sole reason for the more identifiably MaltE pattern of a lower
(PVI V.Dur), as this could also result from a combination of other factors, not examined here,
but including consonant gemination, and lack of reduction in other vowels also in unstressed
positions, notably [ɪ].
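The relationship suggested above, between a preference for full vowels and a lower (PVI V.Dur), can be illustrated with a minimal sketch. It assumes the normalised pairwise variability index in its commonly cited form; the exact formulation applied in Section 6.3.1 may differ. The function name `npvi` and the duration values are invented for illustration only and are not measurements from the six speakers.

```python
def npvi(durations_ms):
    """Normalised pairwise variability index over successive vowel durations.

    Assumed formulation (not confirmed against Section 6.3.1):
    100 * mean(|d_k - d_k+1| / ((d_k + d_k+1) / 2)).
    """
    if len(durations_ms) < 2:
        raise ValueError("at least two durations are needed")
    pairs = zip(durations_ms, durations_ms[1:])
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100 * sum(terms) / len(terms)


# Hypothetical vowel durations (ms) for one phrase; NOT thesis data.
# Full vowels alternating with short, reduced schwas:
with_reduction = [120, 40, 110, 45, 130, 50]
# Full vowels where schwa could have been expected, i.e. (schwaØ):
with_full_vowels = [120, 95, 110, 100, 130, 105]

print(round(npvi(with_reduction), 1))
print(round(npvi(with_full_vowels), 1))
```

Under these invented values the second sequence yields the lower index, which is the direction of the combined pattern noted in the data; the sketch says nothing about the other contributing factors mentioned above, such as gemination.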
6.3.3 Variant realisation of fricatives /θ/ and /ð/, labelled (th)
The distribution of this feature across the 6 speakers presents a rather different pattern
to either (PVI V.Dur) or (schwaØ), in that the non-identifiable speaker has no instances at all
of substitution of these segments, whereas both the moderately and the highly identifiable
speakers had quite a high proportion of substitution, in both task types (Spontaneous and
TextAloud). Again, numbers and percentages are presented in Table 6.5 below.
Table 6.5 Preference for substitution of (th) in Spontaneous speech and scripted TextAloud
        Spontaneous                TextAloud
        th_N   Totals   %          th_N   Totals   %
Sp1     24     28       85.7       17     18       94.4
Sp2     20     24       83.3       12     17       70.6
Sp3     26     28       92.9       13     16       81.3
Sp4     19     30       63.3       9      15       60.0
Sp5     11     19       57.9       11     15       73.3
Sp6     0      10       0.0        0      15       0.0
Among the five speakers showing variation in (th) realisation, Sp2, Sp3 and Sp4 had
a higher rate of substitution in the Spontaneous, over the scripted text. Across all the first five
speakers, rated highly or moderately identifiable, both speech styles show a high rate of (th)
variation, before the plunge to no variation at all in Sp6.
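The percentages in Table 6.5, and the comparison across the two speech styles, can be re-derived from the raw counts. The following is a minimal sketch; the variable and function names are my own labels for illustration, and only the counts are taken from the table.

```python
# Raw counts from Table 6.5: (th_N, total) per speaker and speech style.
table_6_5 = {
    "Sp1": {"Spontaneous": (24, 28), "TextAloud": (17, 18)},
    "Sp2": {"Spontaneous": (20, 24), "TextAloud": (12, 17)},
    "Sp3": {"Spontaneous": (26, 28), "TextAloud": (13, 16)},
    "Sp4": {"Spontaneous": (19, 30), "TextAloud": (9, 15)},
    "Sp5": {"Spontaneous": (11, 19), "TextAloud": (11, 15)},
    "Sp6": {"Spontaneous": (0, 10), "TextAloud": (0, 15)},
}


def rate(count, total):
    """Percentage of (th) tokens realised with substitution."""
    return 100 * count / total if total else 0.0


rates = {
    speaker: {style: rate(*counts) for style, counts in styles.items()}
    for speaker, styles in table_6_5.items()
}

# Speakers whose substitution rate is higher in Spontaneous than in TextAloud.
higher_in_spontaneous = [
    speaker for speaker, r in rates.items()
    if r["Spontaneous"] > r["TextAloud"]
]

for speaker, r in rates.items():
    print(f"{speaker}: Spontaneous {r['Spontaneous']:.1f}%, "
          f"TextAloud {r['TextAloud']:.1f}%")
print(higher_in_spontaneous)  # ['Sp2', 'Sp3', 'Sp4']
```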
Until this point this variable has been analysed as a binary category +/- (th); however,
the evidence in the data suggests that it might also be useful to consider a phonetic continuum
of the variable (th) in some of the speakers. For the purposes of this analysis, an attempt was
made to begin to capture some of the more noticeable points along this continuum. These
points have been annotated here as [th] for a strongly aspirated plosive (see Figure 6.15
below), [t] or [d] for a plosive (Figure 6.16), and then [θ] or [ð] for the respective voiceless
and voiced fricatives (Figure 6.17). Note the annotation [t] does not assume no aspiration at
all in the realisation of /t/, but is being annotated in this way here, in order to distinguish it
from the realisation of a voiceless /t/ which suggests evidence of much more pronounced
aspiration to the extent of almost approaching frication. The evidence of phonetic variation at
this level doubtlessly needs to be explored further in a more controlled context of data
collection and analysis, while here, the analysis is confined to identifying and describing
some aspects of the more readily observable patterns of phonetic variation. The variation
annotated as highly aspirated [th] can be described as more salient than the expected fricative
realisations of [θ] or [ð], but slightly less salient than plosive realisations [t] or [d]. It can
perhaps be regarded as an approximation to frication, as it appears to be more strongly
aspirated even than an expected plosive with similar distribution.
Figure 6.15 "think", Sp1. (th) annotated as strongly aspirated [th]
Figure 6.16 "enthusiasts" in Sp1. (th) annotated as [t]
Figure 6.17 "thing" in Sp6. (th) annotated as [θ]
Three realisations of (th) presented in the figures (Figures 6.15-6.17) above indicate
instances of variation at the phonetic level which may be relevant for further controlled
studies, also in relation to perceived variation in the realisation of this variable. In the first
two instances, Figures 6.15 and 6.16 both present evidence of a burst, followed by some
degree of aspiration. I am taking the first case, (Figure 6.15) to be particularly pronounced
aspiration, when compared with the second instance in Figure 6.16, which I am considering
to be the unmarked realisation of the plosive /t/ when not preceded by /s/ in English. In other
words, I expect some degree of aspiration in the realisation of /t/ not preceded by /s/ in
English, and indeed this is evident in many instances of /θ/ substitution across the current
data, but these data also provide evidence of further variation in the realisation of /θ/, which
is manifested as more strongly pronounced aspiration (and annotated here, therefore, as [th];
in the Praat files on tier 4, it is annotated also in script-friendly fonts as "no = t+h").
In another set of examples, Figures 6.18-6.20 below present variation of /θ/ in
TextAloud, realised by three different speakers. In the first instance (Figure 6.18), /θ/ is
realised as a fricative, [θ], while in Figures 6.19 and 6.20, /θ/ shows more evidence of a burst
and aspiration pattern, or evidence of no burst, but a relatively longer and more pronounced
aspiration, respectively. The example in Figure 6.20 is therefore being analysed as having
more pronounced aspiration than that in Figure 6.19. Interestingly, in the case of this set of
examples, one of the highly identifiably MaltE speakers realises /θ/ as a fricative. As Table
6.5 above indicates, and as the correlations in Section 6.2.2 above also suggest, the more
identifiably MaltE speakers could be more likely to show evidence of a plosive followed by
varying degrees of aspiration, rather than the unmarked fricative. However this is not the case
in this instance, highlighting the issue that variation must also be noted at intra-speaker level,
and not only at inter-speaker level. As noted further below (this section, see also Table 6.5),
the moderately identifiable speaker in Figure 6.20 presents more evidence of the particular
realisation of /θ/ that I have analysed as strongly aspirated, and the strong correlation between
the variable (th) and different degrees of identifiability described more fully in Section 6.2.2
above suggests that this may be worth looking into further, as evidence of meaningful
variation in MaltE.
Figure 6.18 "authentic" in Sp1, with evidence of frication for /θ/
Figure 6.19 "authentic" in Sp3, with evidence of a burst, followed by aspiration, annotated as [t]
Figure 6.20 "enthusiast" in Sp5 with evidence of no burst and aspiration
The distribution of (th) across speech styles and speakers is listed in Table 6.6 below.
In this table, instances are presented as raw figures as they are limited in number.
Table 6.6 (th) realised as a strongly aspirated stop [th]
        Spontaneous         TextAloud
        [th]   Totals       [th]   Totals
Sp1     2      28           0      18
Sp2     2      24           0      17
Sp3     1      28           0      16
Sp4     3      30           1      15
Sp5     4      19           3      15
Sp6     0      10           0      15
Raw figures are presented here for clarity.
Note more instances of this variant in Spontaneous, over TextAloud, and again, no
variation in either speech style for Sp6. Both the moderately identifiable speakers (Sp4 and
Sp5) have a rather higher proportion of this variant, particularly in the Spontaneous texts.
Phonetically, [th] could be seen as intermediate between an interdental fricative, and a
plosive, and it has been analysed as salient, compared with the expected fricatives, but less
salient than realisation as a plosive. The bar charts in Figure 6.21 below illustrate the
distribution of all three realisations of (th), labelled here in Excel-friendly manner as th_Y,
where /θ/ is realised as a fricative, th_h for /θ/ realised with strong aspiration, and th_N, for
/θ/ realised as a plosive:
Figure 6.21 Three realisations of /θ/ (variable (th)) across six speakers, by text type (raw figures)
The grey-shaded bars in Figure 6.21 above show (th) realised as a plosive to be the
most popular variant in all the five more identifiable speakers, across both text types. (th)
realised as a plosive and also as [th] (with strong aspiration) are both more widely used in the
Spontaneous texts, than in TextAloud for all the MaltE speakers, except in the case of Sp5,
who has a marginally higher proportion of [th] in TextAloud. The connection between the
moderately identifiably MaltE (Sp4 and Sp5) speakers and the phonetically intermediate [th]
is noted as potentially suggestive of a pattern reflecting meaningful variation in MaltE along
a continuum, as opposed to a more categorically defined presence versus absence of frication
in the realisation of the variable labelled here (th).
6.3.4 (a)
As in the case of the variable labelled (th) in Section 6.3.3 above, the variable labelled
(a) here was also earlier analysed as either the realisation of (a) as [æ] or not. In other words,
a first pass analysis was presented (see Section 4.4.2). This section presents findings for the
second pass analysis, in which the data was examined for evidence of more nuanced
variation. Once again, there appears to be a stronger pattern of variation in the highly
identifiable MaltE speakers (Sp1, Sp2, Sp3), when compared with the moderately identified
or unidentified speakers (Sp4, Sp5 and Sp6). Three possible variants of (a) were identified in
Sp1, Sp2 and Sp3, while a much more consistent realisation of variable (a) as [æ] was noted
in Sp4, Sp5, and Sp6. In Table 6.7 below, raw figures are presented as the indication of a
possible avenue for future study, for they are not large enough to be statistically relevant.
Table 6.7 Variant realisation of (a) in the three highly identifiable MaltE speakers
(a) realised as /ɛ/ or /ɐ/
        Spontaneous Texts   TextAloud
Sp1     7                   9
Sp2     3                   0
Sp3     5                   5
Sp4     0                   0
Sp5     0                   0
Sp6     0                   0
This seems to be a salient feature for MaltE, and one which, in the present data,
singles out the highly identifiable speakers from all the others. The more marked realisations
of (a) have been divided between 2 alternatives, occupying spaces closer to /ɛ/ or to /ɐ/. After
[æ] the more consistently preferred realisation is [ɛ], while [ɐ] is used much less frequently,
although it is still present in all three highly identifiable speakers. [ɐ] has been identified as
the more marked of the two variants because it is also phonetically more distant from the
phoneme /æ/ (see also Section 4.2.1.1 for a discussion of these variants).
Table 6.8 Two marked realisations of (a) in TextAloud and Spontaneous text
        [ɛ] TextAloud   [ɛ] Spontaneous   [ɐ] TextAloud   [ɐ] Spontaneous
Sp1     6               5                 3               2
Sp2     0               2                 0               1
Sp3     4               3                 1               2
Thus variant realisation of (a) in three of the speakers totals 29, with 20 instances of
[ɛ] and 9 instances of [ɐ].
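The totals above can be checked against the raw counts with a short sketch. It assumes that the four columns of Table 6.8 are to be read in the order [ɛ] TextAloud, [ɛ] Spontaneous, [ɐ] TextAloud, [ɐ] Spontaneous; this reading is supported by the cross-check against Table 6.7 at the end of the block. The variable and function names are labels of my own for illustration.

```python
# Counts from Table 6.8 (column order as assumed in the lead-in above).
table_6_8 = {
    "Sp1": {"ɛ": {"TextAloud": 6, "Spontaneous": 5},
            "ɐ": {"TextAloud": 3, "Spontaneous": 2}},
    "Sp2": {"ɛ": {"TextAloud": 0, "Spontaneous": 2},
            "ɐ": {"TextAloud": 0, "Spontaneous": 1}},
    "Sp3": {"ɛ": {"TextAloud": 4, "Spontaneous": 3},
            "ɐ": {"TextAloud": 1, "Spontaneous": 2}},
}

# Counts from Table 6.7: both marked variants combined, per speech style.
table_6_7 = {
    "Sp1": {"Spontaneous": 7, "TextAloud": 9},
    "Sp2": {"Spontaneous": 3, "TextAloud": 0},
    "Sp3": {"Spontaneous": 5, "TextAloud": 5},
}


def variant_total(variant):
    """Sum one variant over all three speakers and both speech styles."""
    return sum(sum(row[variant].values()) for row in table_6_8.values())


totals = {variant: variant_total(variant) for variant in ("ɛ", "ɐ")}
print(totals, sum(totals.values()))  # {'ɛ': 20, 'ɐ': 9} 29

# Cross-check: per speaker and style, the two variants add up to Table 6.7.
for speaker, by_variant in table_6_8.items():
    for style in ("Spontaneous", "TextAloud"):
        combined = sum(counts[style] for counts in by_variant.values())
        assert combined == table_6_7[speaker][style]
```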
If we look at proportions of variation in (a) as a function of speech style, we find that
the variation is present for Sp2 in the Spontaneous texts and not in the TextAloud (scripted)
text, while the remaining two speakers present variation in both speech styles as Figure 6.22
below illustrates. Again, instances are presented as raw figures here.
Figure 6.22 Three realisations of (a) across text type (raw)
Note that the variable (a) realised as [æ] (blue) is by far the preferred choice across
most of the speakers, but note also that for Sp1, rated highly identifiable, this is not the case,
as in both speech styles, Sp1's preference for [ɛ] is higher. Sp1's use of [ɐ] is also greater than
that of the remaining two highly identifiable speakers, Sp2 and Sp3.
A formant analysis carried out for one male speaker and one female speaker (Sp1 and
Sp2, respectively, both highly identifiable) also indicates some degree of phonetic variation
in relation to how the phoneme /æ/ has been realised in these speakers. The formant
measurements for F1 and F2 are plotted in Excel to illustrate the use of the vowel space by
each of these two MaltE speakers. Figure 6.23 (a,b) below presents formant measurements
for the targeted /æ/ in one male and one female speaker, while Figure 6.23 (c,d,e,f) presents
formant measurements for other targeted sounds /e/ or /ʌ/, in English (variety unspecified), in
the same speakers, respectively, in order to illustrate the use of vowel space for these
typically contrasting sounds, in English. The vowel tokens selected were those used in
stressed syllables, with no expectation of reduction. These three vowels are presented here in
order to home in on the auditory analyses, which first suggested that variation in the
realisation of /æ/ could possibly approximate front vowel /ɛ/ or more rarely, a central vowel,
such as /ɐ/, as well as [æ]. Figure 6.24 below combines the formant measurements in Figure
6.23 together with a token number of other vowels in order to illustrate the relative vowel
space used by each speaker. Once again, there is evidence here to suggest that phonetic
variation in the realisation of /æ/ merits further investigation.
Figure 6.23 (a-f). Vowel space used by Sp1 (male) and Sp2 (female), with particular reference to [æ], [e], [ʌ]
(a) Formants for [æ] in Sp1 (male); (b) Formants for [æ] in Sp2 (female)
(c) Formants for [e] in Sp1 (male); (d) Formants for [e] in Sp2 (female)
(e) Formants for [ʌ] in Sp1 (male); (f) Formants for [ʌ] in Sp2 (female)
The six figures in Figures 6.23 (a-f) above do in fact illustrate patterns of variation
worth exploring further. Firstly, the use of vowel space for Sp1 /æ/ and /e/ seems to be more
centralised, whereas tokens for /ʌ/ in 6.23 (e) above appear to be particularly spread out. In
the case of all three vowel segments, however, there is evidence that there is less distinction
in the realisation of each of the three phonemes /æ/, /e/, and /ʌ/, as there are a number of
instances in which these vowels are more centralised, and consequently, the resulting vowel
space is tighter and allows for more overlapping in the realisation of the three segments. This
apparent contraction of the vowel space in evidence here supports the identification of
variants in the realisation of the target phoneme /æ/, which have been described here as either
a front vowel or a central vowel, /ɛ/ or /ɐ/, respectively. On the other hand, Sp2 presents some
realisations of each of the three phonemes /æ/, /e/, and /ʌ/ which seem to occupy more
distinctive vowel spaces, particularly in the cases of /e/ and of /ʌ/, which are often realised as
more front, or more back vowels, respectively. Conversely, however, note that the relatively
wide dispersal of /æ/ tokens again suggests some overlapping with the vowel spaces for /ɛ/ or
for /ɐ/, confirming the auditory analysis to some degree. Figure 6.24 (a, b) below presents all
three vowel tokens together with a small number of other vowel sounds in order to illustrate
the vowel space for the same two speakers described above. Note in Figure 6.24 (a, b) below
that a few schwa tokens have also been included. However, these are presented here for
illustrative purposes only, and have not been studied further here, as it is understood that the
acoustic properties of schwa are themselves subject to controversy (for example, Cruttenden,
2008; Flemming and Johnson, 2007; Barry, 2007).
Figure 6.24 (a). Formant measurements (F1, F2) for a range of vowels in Sp1 (male, highly identifiably MaltE)
Figure 6.24 (b). Vowel space used by Sp2 (female), with particular reference to [æ], [e], [ʌ]: F2, F1 plot for a range of vowels in Sp2 (female, highly identifiably MaltE)
Note the diffusion of possible measurements for target sounds /æ/, /e/, and /ʌ/ with
this relationship showing up more clearly in Figure 6.24 (b) above for the female speaker,
Sp2. Sp1 also shows evidence of variation in the realisation of the vowels in question,
although the vowel space here appears to be more concentrated and perhaps centralised (even
allowing for the fact that these represent non-normalised measurements, given that a small
number of data points for one female and one male speaker were analysed). In both cases
above, however, there is an indication that the vowel space used by speakers of MaltE is a
potentially interesting area for further study, generally, and perhaps in particular, with
reference to variant realisations of the target sound /æ/.
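As a pointer for such further study, the notions of a more "concentrated" or more "diffuse" vowel space could be given a rough numerical form. The sketch below is one possible approach and is not the analysis carried out here: it takes the mean Euclidean distance of tokens from their category centroid as a measure of spread, and the distance between centroids as a rough indication of overlap. All formant values are hypothetical, the function names are my own, and raw Hz values are used without normalisation, so that F2 differences weigh more heavily than F1 differences and comparison across a male and a female speaker would not be meaningful.

```python
from math import dist
from statistics import mean


def centroid(tokens):
    """Mean (F1, F2) of a list of (F1, F2) tokens, in Hz."""
    f1_values, f2_values = zip(*tokens)
    return (mean(f1_values), mean(f2_values))


def dispersion(tokens):
    """Mean Euclidean distance (Hz) of tokens from their own centroid."""
    centre = centroid(tokens)
    return mean(dist(token, centre) for token in tokens)


# Hypothetical (F1, F2) values in Hz; NOT measurements from Sp1 or Sp2.
tokens = {
    "æ": [(700, 1700), (650, 1800), (600, 1600), (720, 1500)],
    "e": [(550, 1900), (520, 2000), (580, 1850)],
    "ʌ": [(650, 1300), (620, 1400), (680, 1250)],
}

for vowel, vowel_tokens in tokens.items():
    print(vowel, centroid(vowel_tokens), round(dispersion(vowel_tokens), 1))

# Distance between category centroids: smaller values suggest more overlap.
print(round(dist(centroid(tokens["æ"]), centroid(tokens["e"])), 1))
```

A speaker-normalised scale would be needed before values of this kind could be compared across speakers.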
6.3.5 (r)
As Bonnici (2010) notes, (r) presents a stereotypical feature of distinct variation in
MaltE. In the current data, four of the speakers show evidence of a strong preference for
rhoticity, one speaker (Sp5) is almost entirely non-rhotic, and, most interestingly, one other
speaker (Sp6) is mostly non-rhotic, but seems to use (r) from time to time when in
conversation with another rhotic speaker, during Spontaneous tasks eliciting a more
spontaneous speech style. This process, which could be described as linguistic
accommodation to, or alignment with, another speaker, is discussed briefly below in
Section 6.4.2. The data include high proportions of both overt and null realisation of a post
vocalic (r) across the six speakers, described in Figure 6.25 below as r_Y (Yes), and r_N
(No) respectively. A third variant was also noted and is described here as r with Frication, or
r_F.
Figure 6.25 Three variants of (r) in six speakers
Bars are a representation of raw frequencies for each variant.
Figure 6.25 above presents the distribution for three variants of (r), labelled here in an
Excel-friendly manner, rather than phonetically, as r_N (no post-vocalic (r)), r_Y (post-vocalic
(r) as alveolar approximant), and r_F (post-vocalic (r) as alveolar approximant with
evidence of frication). Note that one of the most highly rated speakers for identifiability, Sp1,
here presents a preference for null realisation, r_N. The same speaker, however, does have
evidence of both other variants which are considered salient here, namely overt realisation,
r_Y and also r_F. As expected, the English dominant, non-identifiably MaltE speaker Sp6 has
very little evidence of overt post-vocalic (r) realisation, preferring instead null realisation.
Across all six speakers, however, there is evidence to support Bonnici (2010) in her claim
that variation tends to operate more in terms of a continuum of more versus less preference of
one variant over another, as opposed to more categorical choice of one variant over another.
Figure 6.26 below illustrates the distribution of the three possible realisations of (r) across the
six speakers for the two different speech styles:
Figure 6.26 Variant realisation of (r) by speaker and by speech style
Figure 6.26 above presents a slightly different pattern of variation distributed across
the speakers, when compared to other patterns of distribution described for the other variables
in question in Study 2. For here, Sp1, considered highly identifiably MaltE, appears to pattern
more closely with the less identifiably MaltE speakers, Sp5 and Sp6 in preferring null
realisation of post-vocalic (r). Conversely, preferring overt realisation in both text types, Sp2,
Sp3 and Sp4 share similar patterns. The third variant of post-vocalic (r) realised with frication
is here being considered more marked than either of the other variants for MaltE, in particular
as it involves further complexity in its phonetic realisation, and also, because it does not
appear as frequently as either of the remaining two variants. Sp1, who overall prefers null
realisation of post-vocalic (r), still has evidence of (r) with frication, and Sp1 is also the only
speaker to have evidence of frication in (r) in the scripted speech style in TextAloud.
Although not as frequent as null realisation or overt realisation of (r) without frication,
it is, however, still noticeable, and shows evidence of some patterning, throughout the data
analysed for Study 2. Bonnici (2010) suggests that rhoticity in MaltE may present evidence
of some form of change in progress for this variety of English, and the distribution of (r)
variation in the current data can support this position. Three aspects of the (r) characteristic in
particular prompt this view. In the first place, there is the twofold pattern that rhoticity is
evident in all 5 of the highly, and moderately identifiably MaltE speakers, but the distribution
of this variable is slightly different to that of the other variables above, as noted earlier in this
section. In particular, Sp1, who was consistent in preferring the more salient variants for
MaltE in the other 4 variables, and who is rated the most identifiably MaltE speaker, here in
the case of (r), prefers the least marked, non-rhotic variant of (r), across both speech styles.
Conversely, one of the more moderately identifiably MaltE speakers Sp4, showed strong
evidence of (r) variation, particularly in the Spontaneous speech styles.
Secondly, there is evidence to suggest that rhoticity can be used as a discourse
marker, as variant realisations of (r) either as an alveolar approximant or as an alveolar
approximant with frication seem to be distributed as a function of discourse structure in some
cases. Note, for example, that r_F in the barchart in Figure 6.26 above, is only present for Sp1
in both speech styles. For all other speakers, r_F is either not an available option, or it is only
used in the Spontaneous speech styles. A closer look at this variant of (r) with frication shows
its distribution to be limited to phrase final position, or in some cases to where post vocalic r
precedes a /t/. Figure 6.27 below illustrates an instance of post-vocalic (r) analysed as being
followed by frication:
Figure 6.27 (r) realised as an alveolar approximant [ɹ], and as an alveolar approximant with frication [ɹ̝], Sp1
Two target items (TIs) "german army" and "wartime", have elicited two variants of (r)
in Sp1. In the first case, for "german army", the dipped shape in the dark banding of the
spectrogram above in Figure 6.27 describes a dip in formant structures – specifically the 3rd
formant - which typically indicates the presence of a post-vocalic (r) realised as an alveolar
approximant (in the excel-friendly tables above listed as r_Y). The same dip in formant
structure (F3) is noted in the second target item, "wartime", but this is also accompanied by a
slight shadowing towards the top of the spectrogram, which more typically indicates the
presence of some element of frication. Furthermore, the very thin striation just visible within
the /r/ segment might also be suggestive of an element of tap or flap quality, possibly in
anticipation of the following /t/. No such shadowing or striations are visible in the case of (r)
elicited in "german army". In a similar case of what I suggest is context-dependent realisation
of particular variants, the same speaker again presents in this case a non-rhotic pattern,
followed by [ɹ̝] (or r_F, in the tables), illustrated below in Figure 6.28.
Figure 6.28 (r) realised as null, and as an alveolar approximant with frication [ɹ̝], Sp1
In Figure 6.28 above, Sp1 first says "motorbike" with no visible or auditory evidence
of overt (r), but then says "are", where (r) is realised with frication, followed by a pause. r_F,
described phonetically as [ɹ̝], may therefore also be considered highly marked for this speaker
of MaltE, and more specifically, analysis of these data suggests that its distribution may be
determined as a function of discourse, or by a particular phonetic environment, preceding /t/.
In the three remaining speakers also presenting evidence of [ɹ̝], all instances are also either
phrase final or in the context of V_t, which may again support the suggestion that this
realisation of (r) is worth investigating in a broader context of variation in MaltE.
(r) realisation may also have a bearing on perceptions of vowel duration in these data,
as suggested earlier in section 4.1.1. In view of the discussions on vowel duration
characteristics as a salient feature of variation in MaltE, and rhoticity patterns as another
highly prevalent feature, vowel duration patterns preceding (r) were also measured in the
TextAloud speech style, which allowed for directly comparable data.
Figure 6.29 Vowel durations (average duration in ms, by speaker) when followed by r_Y (overt) and r_N (null) realisations of post-vocalic (r)
In Figure 6.29 above, all six speakers show a clear pattern of shorter vowel durations
when vowels are followed by an overt realisation of post-vocalic (r), and longer vowel
durations when there is no post-vocalic (r). Only one speaker, Sp4, has
similar vowel durations in both contexts, although here too, vowel durations are slightly
longer when there is no following post-vocalic (r). This suggests that the absence of (r) in
the less identifiable
speakers may contribute to some degree of compensatory pattern in the form of lengthened
vowels, which in turn may contribute to greater variability in vowel duration patterns. Longer
vowel durations here, combined with weakened vowels in unstressed positions, may be at
least two indications of how variability in vowel duration is often realised. On the other hand,
note that those speakers with a much lower index of variability in vowel durations (Sp1, Sp2,
Sp3, and Sp4), also have higher frequencies of (r), and correspondingly shorter vowel
duration patterns, which may also contribute to the greater homogeneity of vowel duration
patterns in these speakers.
6.4 Individual speaker variation
Part of the motivation for this research was based on the understanding that a native
MaltE listener would often, quite quickly and accurately, be able to single out a MaltE
speaker among speakers of other dialects or varieties of English. Alongside this there is the
growing body of commentaries and parodies of the ways in which Maltese people speak a
distinct variety of English, already illustrated briefly, throughout the earlier chapters of this
thesis. These reactions and behaviours might suggest that any individual born and raised in
Malta is likely to employ distinct patterns of variation, which act as triggers in the perception
of this local variety of English. Foregoing literature indicates that such patterns are likely to
operate on a continuum of more versus less prevalent use. The hypothesis that I have
undertaken to investigate here, is that this continuum of patterns of use may be matched with
perceptions of identifiability in order to begin to map out a spectrum of MaltE variation with
respect to five characteristics identified as particularly salient for MaltE. This map, or index,
of identifiability, can then act as a coherent context within which to further study the socially
meaningful patterns of variation plotted along its continuum. The ensuing section begins a
tentative examination of some of the more persistent patterns among the five variables
analysed here in relation to their potential for indexing social factors.
With the above position in mind, the criteria for identifying appropriate speakers were
that they were Maltese, and, with the exception of one speaker, that they had grown up and
lived in Malta. Accordingly, speakers were chosen from an extended circle of acquaintances
in an effort to identify speakers at different points along the continuum of variation. It was
expected that the range of speakers would trigger different identifiability ratings in the
Perception study (See Section 4.3.2, and Section 6.1 above). The discussion in this section is
based on the analysis carried out above as well as on a broader consideration of some of the
more salient patterns identified across the whole group of speakers.
6.4.1 Variation and indexical information
In case studies with six speakers, it is understandably not the intention to extend or
extrapolate results in the hope of accounting for a whole nation or even a whole speech
community. The intention is, rather, to study individual patterns of variation with reference to
the five variables identified, in the context of a continuum of variation for MaltE, and in so
doing, to begin an exploration of those patterns of variation which may be more salient than
others in identifying MaltE.
The six speakers analysed in this chapter present a wide range of variation in terms of
linguistic background and geographical location. Two speakers (Sp1 and Sp6) come from
two towns in the south of Malta, another two (Sp2 and Sp3) come from Sliema, while the last
two (Sp4 and Sp5) come from the central part of the island, B'Kara/Balzan. They can be
divided into two age groups where two of the speakers (Sp1 and Sp6) are aged 55 or over (55 and 67
years old respectively), and the remaining 4 are between 38 and 43 years old. All 6 speakers
attended what are commonly called church schools for at least part of their education, but this
factor is not taken into close consideration here, as type of schooling (church, state or
independent), together with school location, has undergone significant changes during the
relevant timeframes in question, and the demographics of the schooling system in Malta have
been in a state of flux for some years, since roughly the 1980s. This information was gathered
from the basic questionnaire at the start of the data collection process (Appendix C), together
with a short self-report on language ability and language usage across English and Maltese,
and is collated below in Table 6.9.
Table 6.9 A summary of speaker background
| | Sp1 | Sp2 | Sp3 | Sp4 | Sp5 | Sp6 |
|---|---|---|---|---|---|---|
| Age | 55 | 38 | 39 | 42 | 39 | 67 |
| Locality | Fgura (Southern Harbour) | Sliema (Northern Harbour) | Sliema (Northern Harbour) | B'Kara (Northern Harbour) | Balzan (West) | Senglea (Southern Harbour) |
| Schooling | Church/State | Church | Church | Church | Church | State/Church |
| Self-report, language use | Maltese | Maltese and English | Maltese | Maltese and English | English and Maltese | English |
| Self-report, language ability | Balanced | Balanced | Maltese Dominant | English Dominant | English Dominant | English Dominant |
| Personal impression, language ability | Maltese Dominant | Balanced | Maltese Dominant | Balanced | English Dominant | English Dominant |
My own impressions of speakers' language ability are included here as additional
information, and are based on the short interview surrounding the completion of the
questionnaire. These interviews were not recorded for additional data, as it was considered
more important to use this stage of the process as a warming-up and breaking-the-ice phase.
It was nevertheless used as an opportunity for me to establish or verify my view of some
aspects of each speaker's linguistic background. In fact in some cases I came to a slightly
different conclusion from the speakers' self-reports, although the differences in each case
(Sp1 and Sp4) are by no means drastic. The question of language dominance relates closely
to a combination of the broader discussion (Chapters 1 and 2) of how the two languages are
viewed and used throughout the Maltese islands, and it also relates to the question of ability
to function in either language, as self-reported and also as observed. Thus "Dominance"
suggests overall preference for and proficiency in one or other of the two languages in
question, while "Balanced" suggests a more equal spread of both usage and proficiency across
the two. This in itself has not been widely studied in Malta and at the time of writing, it is still
not possible to agree on any objective, or nationally established description for such terms, so
they are used here as a broad yardstick, rather than as a discrete categorisation.
Note that the three speakers listed as using Maltese extensively in their daily lives also
had the highest incidence of more marked features for MaltE. Among the more English
dominant speakers, Sp6 consistently presented a different pattern to the remaining two, Sp4
and Sp5, who alternately shared some features of variation with the more Maltese dominant
speakers, as well as with Sp6, in some cases.
If we consider the age and gender variables, it is also possible to see evidence of some
patterns to corroborate earlier studies and findings. Portelli (2006) and Bonnici (2010) both
consider the issue of variation as a function of gender, and specifically, as an affirmation of
masculinity/femininity (Portelli, 2006). With respect to rhoticity, for example, "the desire to
avoid sounding English-speaking causes men to prefer the r-ful (rhotic) form and perhaps
a more Maltese-influenced English overall" (Bonnici, 2010:225). Elsewhere, Bonnici
(2010) also notes that in its closer association with Received Pronunciation, and the
latter's status as the preferred accent in schools, "Some may assume that the prestige form
in postcolonial contexts is the standard variety of the former colonizers. In this line of
reasoning, since it was a former British colony, r-lessness would be the prestige form in
Malta" (Bonnici 2010:176). Findings in my data seem to reflect this question of the status
of (r). Firstly, two male speakers, Sp3 and Sp4, both have a very high frequency of (r),
with the former being rated highly identifiable, and the latter rated moderately
identifiable as MaltE speakers. In contrast, the other male speaker also rated highly
identifiable – Sp1 – showed a preference for non-rhotic patterns, with 58% null
realisation of (r). This is not an exceptionally high proportion of non-rhotic patterning,
particularly when compared with the other two non-rhotic speakers, whose null
realisation of (r) is above 90%, suggesting a much more decidedly non-rhotic accent.
However, Sp1 clearly shows a stronger preference for non-rhotic patterns (null realisation)
than for rhotic patterns, and this is noteworthy when compared with the highly rhotic patterns
of the other two males in the cohort. It is also worth noting that although Sp1 presents a lower
frequency of overall rhoticity than the other two highly identifiably MaltE speakers, he
nevertheless also presents a high frequency of the most marked variant of (r), [ɹ̝]. Indeed,
a pattern emerges for (r) followed by frication, [ɹ̝], where the 3 male speakers, including the
more balanced (and also more moderately identifiable) Sp4, all present the highest frequency
for this variant of (r) across both speech styles, while a female speaker, Sp2, does have
evidence of [ɹ̝] but this is restricted to just one instance in the Spontaneous speech style. Thus
although for a highly identifiably MaltE male speaker, Sp1 has a much lower frequency of
rhotic patterns than two other highly identifiable, and one moderately identifiable speaker, I
suggest that this results more as a function of age, than of gender (see also below, this
section, and section 6.4.2 below for further discussion of prestige forms, and of register,
respectively).
If we look at age in relation to (r), Sp1 as the older Maltese dominant speaker (55 yrs)
has a much lower frequency of overt (r) realisation as an alveolar approximant than both the
other two Maltese dominant (and identifiable) Sp2 (38 yrs) and Sp3 (39 yrs), and also than
the balanced, moderately identifiably MaltE male Sp4 (43 yrs), preferring instead, null
realisation of (r). A shift from a non-rhotic to a more rhotic accent is noted by Bonnici (2010)
in her study which uses an apparent-time study of change, with participants ranging from 18
to 81 years of age. Older generations in Malta were often taught by native English-speaking
teachers, who were likely to have adopted the then more prestigious Received Pronunciation
(RP) accent during school teaching, even if these might not always have been RP speakers
originally. Accordingly, Maltese students, particularly those, such as potentially Sp1, who
were essentially Maltese-speaking, might have been exposed to an RP variety favouring non-
rhotic patterns as the main – or even the only – source of English encountered on a regular
basis, through school. Conversely, younger generations may have been exposed to both RP
and non-RP speakers as the older generation of teachers was replaced by a younger
generation of native MaltE speakers. So it is plausible to suggest that an older speaker of
MaltE may opt for the perceived prestige form last taught, in this case, a non-rhotic pattern.
In the case of a preference for non-rhotic patterns in some of the other speakers, there
may also be other conditions affecting choices, including home town, and therefore most
likely, type of school, which may well be other contributing factors. For example, Sp5 is non-
rhotic and is a female speaker in the same age group as the more rhotic Sp2 (female), and Sp3
and Sp4 (male). She therefore does not seem to pattern neatly with the rest of the cohort in
terms of rhotic patterns, as she contrasts starkly with Sp2, who is a female speaker of a
similar age, and patterns instead more closely with the older female speaker, Sp6. The other
female speaker, Sp2, could not be classed as non-rhotic, although instances of both (r) and (r)
followed by frication are just slightly lower than for the two male speakers. This hint at
salient differences in the distribution of (r) across speakers as a function of age, gender and
possibly also schooling, suggests that post vocalic (r) in Malta is a healthy indicator of an
emerging variety of English shifting from a norm-following to a norm-developing variety.
Such evidence may corroborate Thusat et al. (2009:28) and Bonnici (2010: 168), who both
suggest that variation in MaltE is shifting away from its British English roots and instead
moving towards more locally meaningful patterns of variation.
Rhoticity is one of the features widely held to be identifiable in MaltE speakers, but
according to Bonnici's (2010) most recent extended study, it may also carry social meaning
and distinction both within MaltE speech communities, as well as throughout the Maltese
Islands. At the time when more widespread schooling was implemented in Malta (early 20th
Century), the RP non-rhotic variety would have been established as the more prestigious
norm, even if other more rhotic varieties may already have filtered into the islands through
the first British settlers (Bonnici 2010). It is likely that this was the accent most widely
prescribed, and associated with 'good' English, for many years, and this may have prompted
an effort towards less rhoticism, at least in more careful or self-monitored speech. On the
other hand many of the islands' inhabitants may also have been exposed to different accents
of English through less formal channels, such as at the ports and harbours, the docks, in shops
or in bars, and so on, so although non-rhotic forms may have been more prestigious, they
were unlikely to have been the only form widely present during, and beyond, British
colonisation in Malta. Bonnici (2010: 177) suggests that at least one instance of social
marker may see an increased preference for rhoticism in younger generations, particularly in
terms of gender distribution. Maltese itself is also a rhotic language, although we cannot
assume that this is the only, or indeed the most relevant factor in a discussion on rhoticism in
MaltE, particularly in the case of MaltE L1 speakers. These rich and diverse foundations may
well have made for the amount of variation found on the continuum of rhoticism today, and
the evidence of variation in the 6 speakers analysed here provides further evidence of this.
To return to the gender variable, note also the case of (a) which, like (r), also did not
correlate strongly with perceptions of identifiability. Here too there is also a possible context
for variation as a function of gender. Evidence of variation in the realisation of (a) is
restricted to the three speakers rated most identifiable MaltE (and Maltese dominant)
speakers (Sp1, Sp2, Sp3), and of these three speakers, it is the two male speakers (Sp1, Sp3)
with the higher frequencies of variation across both speech styles, while the female speaker
shows isolated instances of variation in the Spontaneous speech style (see also section 6.4.2
below for more on register shift).
Pertinent to this discussion here, are the Principles of Linguistic Change established
by Labov in the 1990s and early 2000s, which as Tagliamonte (2012: 62) explains, were
formulated in response to, and captured "the overwhelming consistency of patterns of
linguistic change and their associated social correlates". Such patterns include, among others,
the "gender paradox" (ibid.), where on the one hand, women can be expected to adopt
innovative forms in a language if these are overtly accepted, but on the other hand, will not
adopt language forms perceived to be stigmatised. If we consider the cases of (r), and (a) in
the highly identifiably MaltE female Sp2, we might interpret the high frequency of (r), and
the comparatively low frequency of variation in (a) in this light, if we consider (r) to be an
accepted innovation, but (a) to perhaps be more stigmatised. So in this case, female Sp2
would be willing to adopt a rhotic pattern, but less willing to use any variants of (a), except
the established [æ] form, even though here some subconscious variation might still be
occasionally in evidence in the more Spontaneous speech styles, where less self-monitoring
of speech patterns is possible. Conversely, Sp5, also female, and also within the 35-45 age
bracket, might perhaps be more conservative, rather than innovative, in her choices to retain
non-rhotic patterns, particularly as her proportion of null realisation of (r) matches that of the
older female, Sp6, at around 90%.
So the broad categorisation of speakers into highly identifiable, moderately identifiable
and non-identifiable for MaltE does seem to yield some interesting suggestions of patterns of
variation. Such patterns seem to suggest, in the first place, some strong correlations with
identifiability (notably in the case of PVI V.Dur), but equally crucially, there are suggestions
here of variation having more nuanced social meanings, potentially as a function of gender,
age, or schooling and upbringing. The examples of (r) and (a) distribution in some of the
speakers have been discussed above in this light, while the figures below (6.30 and 6.31)
return to the theme of identifiability and corresponding patterns of variation across the
speakers in order to conclude with some general observations.
Firstly, in the group of three highly identifiably MaltE speakers, namely Sp1, Sp2,
and Sp3, note that although Sp3 (male) was rated lower for identifiability (71%) than both
male Sp1 and female Sp2 (89% and 82%, respectively), he nevertheless reflects similar
patterns of variation to Sp1 in three of the variables, illustrated below in Figure 6.30.
[Figure 6.30: bar chart, "Variation in the 3 highly identifiable MaltE speakers"; bars for Sp1, Sp2 and Sp3 across PVI V.Dur, (r), (th), (schwaØ) and (a); vertical axis 0 to 100]
Figure 6.30 Patterns of variation in the group of speakers rated highly identifiable
As Figure 6.30 above illustrates, the three highly identifiably MaltE speakers broadly
pattern in similar ways, but then, the two male speakers Sp1 and Sp3 also reflect similar
frequencies for the variables (th), (schwaØ) and (a), compared with the female, Sp2, which
may suggest the potential for variation as a function of gender to be an important feature in
the MaltE patterns here. Conversely, the case of closer similarity between female Sp2 and
male Sp3 on (r), compared with Sp1, may be more related to the age, than to the gender
factor, as Sp1 is rather older than both Sp2 and Sp3. (PVI V.Dur) may be manifesting a
different level of patterning, as already discussed above in 6.3.1, as its measurement may well
be affected by other factors, including the preference for full vowels over schwa, or more
rhotic patterning.
If we return to the full cohort of speakers and their frequency patterns across all five
variables (Figure 6.31 below), we may also note another potential instance of variation as a
function of gender.
[Figure 6.31: bar chart, "Variation in all 6 MaltE speakers"; bars for Sp1 to Sp6 across PVI V.Dur, (r), (th), (schwaØ) and (a); vertical axis 0 to 100]
Figure 6.31 Patterns of variation, across all six speakers
Both Sp4 (male) and Sp5 (female) were rated moderately identifiable in the
Perception study (Study 2), with 57% and 54% of the listeners scoring them more identifiably
MaltE than the modulus, respectively. Sp4 reported himself as speaking both Maltese and
English in daily life, but self-rated his proficiency as English-dominant. Sp5 also rated
herself as English-dominant and speaks English, and to a lesser extent, Maltese, on a daily
basis. Note however, that although Sp4 and Sp5 both have similar identifiability ratings, they
do not always predictably pattern in the same way, noticeably in the case of (r), and also in
(PVI V.Dur). In these two cases, Sp4 seems to pattern more closely with the three highly
identifiable speakers, while the female Sp5 regularly patterns more closely with the older, but
non-identifiable female Sp6. Again, this could raise the question of whether, besides the
overall determining factor of salient cues for perceptions of identifiability across the cohort
and through the Perception study, we can also see here evidence of another level of socially
meaningful patterning relating particularly to gender, or age, but also to other factors such as
register, and prestige forms.
6.4.2 Variation, speech style and register
Another determining factor for variation may well be the use of different registers,
elicited in some speakers as they negotiated different types of tasks, and in some other
speakers, simply by virtue of the fact that they were asked to complete all tasks in English. In
Chapter 4, and also in Chapter 2, I have presented the context and rationale for opting to elicit
speech data that was for the most part, as much as possible, naturalistic, and quasi-
spontaneous. In three of the four tasks set, quasi-spontaneous dialogue/monologue was
elicited as participant speakers concentrated on a series of tasks, also involving information
gap activities (Section 4.2.2), geared to focus attention on the task more than on the recording
context, and these tasks were all grouped together in eliciting "Spontaneous" speech styles.
However, it was also necessary, given the focus on vowel duration measurements, to obtain
directly comparable data, and this required some form of reading aloud, referred to
throughout as "TextAloud".
Perhaps inevitably, the "TextAloud" task tended to elicit speech that was much more
carefully produced and subject to an amount of self-monitoring, than the three Spontaneous
speech tasks. The study of what in many cases resulted in some amount of register shift has
also been deeply informative, and suggests important directions for further research. Two
aspects of variation in the context of register are particularly striking in these data. If we take
the case of (r), note (also as part of the earlier discussion in 6.4.1 above), that the most
identifiable speaker Sp1, who prefers the more marked patterns for all the remaining four
variables, here prefers the least marked option of null realisation of post vocalic (r), and he
has a much lower proportion of (r) at 44% than either of the other two highly identifiable
speakers, or one of the moderately identifiable speakers (at 86%, 88% and 84%). This
speaker's TextAloud was also noticeably slower paced than all the other speakers, at 60.6
seconds, where the mean length across the remaining five speakers was 43.1 seconds. This
can indicate that he was speaking carefully and allowing time for self-monitoring. Sp1 speaks
Maltese almost exclusively in his daily life, where both his work and home environment are
Maltese-speaking. He may only need to switch to English with foreigners or with English-
speaking Maltese people, and in conversing with me (predominantly English-speaking), he is
used to using a little bit of English alongside his Maltese. In short, out of all the six speakers,
Sp1 is the most likely to function entirely in Maltese. His careful performance in both the
Spontaneous and the TextAloud tasks suggests that speaking in English for an extended
period prompted him to fall back on learnt and drilled classroom English acquired at school.
As already discussed (see Section 6.4.1 above), this may well have emphasised and required
non-rhotic, rather than rhotic variants, as the learned, and preferred choice of (r).
Similarly, variation in the realisation of (a) already described as a function of gender
in Section 6.4.1 above, may also be interpreted in terms of register shift. (a) variation only
occurs in the three highly identifiable speakers, among which, the female speaker has the
lowest proportion of (a), and all instances are restricted to the Spontaneous speech style.
Finally, another possible instance of variation as a function of the context in which it
is used may be glimpsed in the cases of linguistic accommodation in the conversation
between Sp6 and Sp7 (not analysed here). Sp6 is very clearly a non-rhotic speaker, having
only 5 instances (8%) of (r) across all speech styles, and none of them occurring in
TextAloud. Sp7, on the other hand, is a rhotic speaker and there are at least two instances
where Sp6 shifts to using a post-vocalic (r) in her exchanges with Sp7. Of the remaining 3
instances of (r) in Sp6, 2 also occur in the same conversation, even though they involve
instances of talking aloud, or providing a running commentary of the task at hand. The last
instance of (r) was used in the task immediately following this conversation, in "Sentences".
There are no instances of (r) in the TextAloud task for Sp6, and across all the five variables,
Sp6 consistently opts for the least marked variant in each case, such as in the variable (th), for
example, which is consistently realised as an interdental fricative in Sp6. Sp6 was born and
raised in an entirely Maltese-speaking environment, but claimed to have always loved
reading, and loved English, going on to do two degrees in English before moving to England
for a number of years. Sp6 has been described as "non-identifiably MaltE" throughout this
analysis, because she was rated less identifiable than the modulus by more listeners in the ME
Perception study (Study 2). Nevertheless, this cannot be considered a clear-cut categorisation,
as there were still a number of the listeners who rated Sp6 as quite closely similar to the
modulus. As the modulus clip itself was marked relatively highly, in many cases, (see above
Section 6.1.1) Sp6's ratings could be interpreted as an acknowledgement on the part of the
listeners of some noticeable, residual effects of MaltE patterns of speech. Figure 6.32
illustrates identifiability ratings for Sp6.
Figure 6.32 Identifiability ratings for Sp6
Normalised scores for Sp6 from each of the 28 listeners. A score of "1.00" is equal to the score for the modulus.
Apart from the first listener rating Sp6 higher than the modulus, all other scores rate
Sp6 less identifiably MaltE than the modulus, and quite a few of the scores (15 listeners)
come in lower than 0.4, when compared with the rating for the modulus at 1.00. Compared
with ratings for the five other speakers in the data set analysed throughout this chapter, the
ratings for Sp6 are indeed particularly low, but it is important to note that they are still not
categorically all low, or even all very low.
This chapter has analysed patterns of speech, with particular reference to five
linguistic variables, in six speakers rated, to varying degrees, as identifiably MaltE. Findings
show that patterns of systematic and meaningful variation can be identified across all the six
speakers, and many of these patterns have been shown to correlate strongly with the
perceptions of identifiability for MaltE. On the basis of these findings, I now turn to
examining more closely the possibility of interpreting these patterns of variation as an index
of identifiability which could be pegged to a continuum of variation for MaltE, with reference
to the six speakers in question.
7 An Index of MaltE Variation
Ten speakers (excluding the modulus) were recruited for the Perception study (Study 2)
and among these, the data of six speakers were analysed with respect to five linguistic
characteristics which a combination of previous literature and the informed opinions of nine
language experts or linguists had identified as salient in the perception of what is
identifiably MaltE. The identifiability ratings for all ten speakers are presented in the figure
below for reference, in order to contextualise the six speakers whose speech patterns were
studied more closely. Note that the first six bars listed in Figure 7.1 below describe these six
speakers, while the remaining bars describe all those speakers who were rated for
identifiability in the Perception study (Study 2), but who were not included in the foregoing
analysis.
[Figure 7.1: bar chart, "Identifiability Ratings %"; vertical axis 0 to 100. Bar labels: Sp1 89.3, Sp2 82.1, Sp3 71.4, Sp4 57.1, Sp5 53.6, Sp6 3.6, Sp7 75, Sp8 71.4, Sp9 32.1, Sp10 67.9]
Figure 7.1 % of listeners rating each speaker (Sp) more identifiable than the modulus
The first six speakers have been presented in descending order of identifiability,
defined here as the percentage of native MaltE listeners, from a total of 28, who awarded the
speakers a higher rating than the one given to the modulus, for sounding identifiably MaltE.
The task put to the listeners was: compared with the modulus, how sure are you that (these
speakers) are Maltese? (see Section 4.3.2). Throughout the analysis presented in Chapter 6, it
has become clear that with some slight exceptions, but overall with healthy consistency, the
three speakers obtaining greatest agreement on their identifiability for MaltE show strong
preference for realising variants considered more marked for MaltE. The correlations
described in Section 6.2.2 above indicate that three of the five variables show particularly
strong correlations, while two other variables do not show any significant correlations. The
two variables which do not seem to correlate significantly with identifiability ratings
nevertheless suggest different patterns of variation which may be more closely related to
specific speech communities, or particular socio-indexical properties within a broad
description of MaltE.
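The identifiability rating defined above (the percentage of listeners who scored a speaker higher than the modulus) can be illustrated with a short sketch. The function name `identifiability_rating` and the listener scores below are hypothetical and serve only to show the arithmetic; they are not the data collected in the Perception study. The sketch assumes that scores are normalised so that the modulus equals 1.00.

```python
def identifiability_rating(normalised_scores, modulus=1.0):
    """Percentage of listeners who scored a speaker above the modulus."""
    if not normalised_scores:
        raise ValueError("at least one listener score is required")
    above = sum(1 for score in normalised_scores if score > modulus)
    return 100 * above / len(normalised_scores)


# Hypothetical scores from 28 listeners (NOT the study's data): one listener
# rates the speaker above the modulus, the other 27 rate the speaker below it.
scores = [1.2] + [0.5] * 27
rating = round(identifiability_rating(scores), 1)  # 1 of 28 listeners
```

With 28 listeners, each listener contributes roughly 3.6 percentage points, which is why the ratings reported for the speakers move in steps of that size.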
The interplay between identifiability and marked realisation of characteristics has
already been briefly explored in relation to vowel duration in Section 6.2.2 above, and is
revisited here also in relation to the remaining four variables in Figure 7.2 below:
Figure 7.2 Identifiability ratings corresponding to PVI V.Dur and four segmental variables in six MaltE speakers
The first bar (dots) represents % listener identifiability ratings for each speaker labelled. Subsequent bars represent the five
characteristics in question, in the order PVI V.Dur, (r), (th), (schwaØ) and (a).
To begin with, note the first two bars for each speaker, where a higher identifiability
corresponds with a lower normalised PVI for vowel duration (PVI V.Dur) while a lower
identifiability corresponds with a high (PVI V.Dur). If we take the two extremes, Sp1 and
Sp6, note the contrasting patterns, as a higher identifiability corresponds with higher
proportions for the remaining four segmental variables (as well as a correspondingly low PVI
V.Dur), and a lower identifiability corresponds to low proportions of variation across the four
remaining variables, together with the high (PVI V.Dur). If we consider the two other highly
identifiably MaltE speakers Sp2 and Sp3, we see a similar pattern of lower PVI V.Dur,
combined with evidence of marked variation in the other four variables. Conversely, Sp4 and
Sp5, both considered moderately identifiable, both begin to show less – though still visible –
evidence of variation. More specifically, the male Sp4 is similar to the three highly
identifiable speakers in rates of (PVI V.Dur), and for (r), but patterns more closely with the
non-identifiable Sp6 on (a). For (th) and for (schwaØ) he falls somewhere in between the two
extremes.
Sp5 patterns more closely with the non-identifiable Sp6 on all variables except (th). It is
worth remembering at this stage, however, that although Sp5 is considered only moderately
identifiable, with 54% of listeners agreeing that she was more identifiable than the modulus,
this is still a relatively high rating, especially when we remember that those listeners who
rated Sp5 less identifiable than the modulus still marked her quite highly, with a few
exceptions. A corresponding middle ground can be observed across variation patterns for
Sp5 in (th), (PVI V.Dur), and to a degree, also in (schwaØ). This pattern suggests that
listeners are very sensitive to even the most subtle variation in a short stretch of speech,
because although Sp5 shows strong similarities with the non-identifiable Sp6, there are two
variables which are distinct enough in their patterning, to mark out Sp5 as much more
identifiably MaltE than Sp6. For example, PVI V.Dur, capturing variability in vowel
duration, is much higher than for the remaining identifiable speakers, with an index of 70,
where 57 is the highest index for Sp1, Sp2, Sp3 and Sp4, and yet it is also lower than that of
the non-identifiable speaker's PVI V.Dur, at 81. Again, (th) variation for Sp5 is more similar
to the more identifiable than to the non-identifiable speakers at 65%, compared to 0% for
Sp6. We can see, therefore, even with this small cohort of speakers, the beginnings of a
continuum of variation expressed in close correspondence with each speaker's identifiability
rating, and the final stage in this research is concerned with starting to map out this
continuum of variation for MaltE.
There are a number of ways in which to approach this exercise, although for a more
accurate description of variation in MaltE, it will of course be necessary to have larger data
sets, i.e. more speakers, and indeed, more variables targeted. As it is, with this small cohort
of six speakers, a more descriptive approach is necessary at this stage. Nevertheless, two
types of index have been computed to illustrate the potential for further study, and they both
show some potential to reflect some of the variation across the six speakers that have been
studied in Chapter 6, ranging from highly identifiable, to non-identifiable. The
understandable compromise in having six speakers is that some of the more subtle variation
does tend to get levelled out, particularly in the case of Sp5, highlighted further below. Note
that the following exercise applies to the four variables analysed at the segmental level. An
index of variability in vowel duration has already been computed and it is plotted, or pegged,
on the continuum of MaltE variation separately, because it also operates differently. In the
case of the four segmental variables, a higher index is expected to correspond to a higher
identifiability rating, because the index is calculated on greater variation at the segmental
level. Conversely, in the pairwise variability for vowel duration (PVI V.Dur) a lower index is
expected to indicate a higher identifiability rating, because vowel durations are expected to
be more homogenised, and therefore less variable, for more identifiably MaltE speakers.
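To make the direction of the vowel duration index concrete, the following sketch computes a normalised pairwise variability index over a sequence of durations. It assumes the commonly used formulation in which each pair of successive durations contributes the absolute difference divided by the pair's mean, with the average multiplied by 100; the exact computation behind the figures reported in this thesis may differ in detail. The function name `normalised_pvi` and the durations are hypothetical.

```python
def normalised_pvi(durations):
    """Normalised pairwise variability index over successive durations."""
    if len(durations) < 2:
        raise ValueError("at least two durations are required")
    # Each successive pair contributes |a - b| divided by the pair's mean.
    terms = [
        abs(a - b) / ((a + b) / 2)
        for a, b in zip(durations, durations[1:])
    ]
    return 100 * sum(terms) / len(terms)


# Hypothetical vowel durations in milliseconds (NOT data from the study).
even = [100, 100, 100, 100]    # homogenised durations give a low index
uneven = [60, 180, 60, 180]    # alternating short/long durations give a high index

low_index = normalised_pvi(even)
high_index = normalised_pvi(uneven)
```

On this measure, more even vowel durations yield a lower index, which is the pattern expected for the more identifiably MaltE speakers.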
7.1 Indexing variation in a series of features
The first index of variation for the four segmental variables is based on Chambers and
Trudgill (1998:51), who compute their index by multiplying each successively more marked
variant by a higher score, adding the total, and dividing this by the total number of tokens for
each speaker. The least marked variant is allocated a 0, the next a 1 and the most marked, a 2,
and so on, with all allocated scores corresponding to phonetically more distant variants from
the least marked option. For example, in my data, the three possible realisations of the (a)
phoneme are as /æ/, /ɛ/, and /ɐ/. Of these, the first, /æ/ would be considered least marked, as
it can be considered an expected realisation. The other two realisations can be placed on a
gradient as successively more marked, and also as being successively more phonetically
distant from the unmarked version. /ɛ/ is articulated in a similar way to /æ/, as both are front
vowels, with the former slightly more close than the latter (see Sections 4.2.1.1 and 6.2.2
above). Conversely, /ɐ/, as a central vowel, is phonetically more distant from /æ/. Thus /æ/, /ɛ/
and /ɐ/ can be seen to fall on a continuum of articulation, corresponding also to degrees of
variation, and markedness. According to the Chambers and Trudgill index, /æ/, /ɛ/ and /ɐ/
would therefore be allocated a 0, 1 and 2 respectively. Indices for my data have been
computed according to this system, with results illustrated in Table 7.1 below. In cases where
only one marked variant is expected, such as (schwaØ), each token of that variant is scored 1,
not 2, while the unmarked realisation is scored 0. The total score for each variable, per speaker, was
divided by the total number of instances of each variable per speaker recorded and the final
results are listed in Table 7.1 below.
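The scoring procedure described above can be sketched as follows. This is a minimal illustration, not the procedure actually used to produce the figures in this chapter: the function name `variation_index` and the token counts are hypothetical, and only the 0, 1, 2 scores for /æ/, /ɛ/ and /ɐ/ are taken from the text.

```python
from collections import Counter


def variation_index(tokens, scores):
    """Sum the score of every token and divide by the number of tokens."""
    if not tokens:
        raise ValueError("at least one token is required")
    counts = Counter(tokens)
    total = sum(scores[variant] * n for variant, n in counts.items())
    return total / len(tokens)


# Scores for the (a) variable as described in the text.
A_SCORES = {"æ": 0, "ɛ": 1, "ɐ": 2}

# A binary variable such as (schwaØ): the single marked variant scores 1.
# The labels "schwa" and "full" are hypothetical.
SCHWA_SCORES = {"schwa": 0, "full": 1}

# Hypothetical speaker with ten tokens of (a): invented, not thesis data.
a_tokens = ["æ"] * 5 + ["ɛ"] * 3 + ["ɐ"] * 2
a_index = variation_index(a_tokens, A_SCORES)  # (5*0 + 3*1 + 2*2) / 10

# Hypothetical speaker with four tokens of (schwaØ).
schwa_tokens = ["full", "schwa", "full", "full"]
schwa_index = variation_index(schwa_tokens, SCHWA_SCORES)  # 3 / 4

print(a_index, schwa_index)
```

With three variants the index ranges from 0 to 2, and with a single marked variant it ranges from 0 to 1, so indices are only directly comparable across speakers within the same variable.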
Table 7.1 An index following Chambers and Trudgill (1998) for four segmental variables
Speaker   (r)    (th)   (schwaØ)   (a)
Sp1       0.6    1.7    0.6        0.9
Sp2       0.9    1.5    0.4        0.1
Sp3       1.0    1.7    0.6        0.4
Sp4       0.9    1.1    0.3        0.0
Sp5       0.0    1.1    0.2        0.0
Sp6       0.1    0.0    0.1        0.0
Above indices calculated by scoring each token for each speaker of each of the 4 segmental variables, then dividing by the
total number of tokens for each speaker.
The patterns are now familiar as higher indices are consistently associated with the
more identifiable speakers, Sp1, Sp2 and Sp3, while the lowest indices frequently correspond
to the non-identifiable speaker Sp6. With a larger cohort of speakers, it should also be
possible to observe the more subtle differences also present, for example, between Sp4 and
Sp5 on one hand, and the three highly identifiable speakers on the other. The more
moderately identifiable speakers Sp4 and Sp5 alternately have lower or higher indices,
reflecting slight shifts in preference patterns for the different characteristics analysed here.
Figure 7.3 below illustrates an encouraging trend of patterning across the six speakers, where
Sp1 is rated most identifiable and Sp6 is rated non-identifiable. In order to be able to display
all five indices together, the four indices for the four segmental variables listed above have
been multiplied by 100.
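Figure 7.3 Indices of variation for all five variables per speaker (x-axis: speakers Sp1 to Sp6; y-axis: computed indices to represent variability; series: (r), (th), (schwaØ), (a), (PVI V.Dur))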
In Figure 7.3 above, all lines except the dotted one illustrating (PVI V.Dur) show a
downward trend from Sp1 to Sp6. This downward trend is perhaps most dramatic in (th), but
is also visible in (r) (diamond shapes), which also shows a marked dip for Sp5.
This index therefore suggests some degree of reliability in retaining the distinction
between speakers. In accounting for different degrees of markedness in within-speaker and
across-speaker variation by multiplying more marked variants by successively higher
numbers, this index succeeds in illustrating clearly each speaker's place on the continuum of
variation in MaltE, with reference to a particular variable.
7.2 Indexing speakers on a continuum of variation
Another form of index might be one which sums all the instances of variation for all
six speakers into one index. Any attempt to collate these data more ruthlessly in the hope of
coming up with a single index for each speaker incorporating all five variables is likely to run
the risk of homogenising the data and losing important distinctions between the speakers, as
the following exercise will illustrate. Nevertheless, with a view to plotting or pegging larger
datasets of MaltE speakers within a similar framework, it is useful to study the potential of
this possibility too. In this exercise, the mean frequency for each segmental variable (or the
mean index of variation for PVI V.Dur) is calculated. Each speaker's frequency for each
segmental variable is then allocated (+) if it lies above the mean for that variable, or a (-) if it
falls below the mean. For the (PVI V.Dur), this pattern is reversed, as a lower index here is
associated with markedness for MaltE. The (+) markers are then counted to form a
rough index for each of the six speakers.
Table 7.2 A collated index of variability per speaker
Speaker  PVI V.Dur (index)  (r) %     (th) %    (schwaØ) %  (a) %     Index  Identifiability rating
Sp1      55.1 (+)           42 (-)    89 (+)    63 (+)      64 (+)    4      89
Sp2      56.8 (+)           84 (+)    78 (+)    42 (+)      9 (-)     4      82
Sp3      49.5 (+)           88 (+)    89 (+)    62 (+)      34 (+)    5      71
Sp4      57.9 (+)           84 (+)    62 (-)    33 (-)      0 (-)     2      57
Sp5      69.7 (-)           4 (-)     65 (+)    21 (-)      0 (-)     1      54
Sp6      81.1 (-)           7 (-)     0 (-)     8 (-)       0 (-)     0      4
Mean     61.7               51.9      63.7      38.1        18.0
Raw frequencies of variation (r), (th), (schwaØ) and (a), per speaker, are allocated a (+) if they fall above the Mean, and a
(-) if they fall below the Mean. PVI V.Dur is allocated a (-) i.e. unmarked for MaltE if it falls above the Mean and (+) i.e.
marked for MaltE if it falls below the Mean.
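The tallying of (+) markers described above can be sketched as follows. The per-speaker values and the means are transcribed from Table 7.2; the names `collated_index`, `SPEAKERS` and `REVERSED` are hypothetical and serve only to illustrate the procedure, which is reconstructed here from the description in the text rather than from any script used in the study.

```python
# Column order follows Table 7.2.
VARIABLES = ["PVI V.Dur", "(r)", "(th)", "(schwaØ)", "(a)"]
MEANS = [61.7, 51.9, 63.7, 38.1, 18.0]

# Values transcribed from Table 7.2 (PVI as an index, the rest as %).
SPEAKERS = {
    "Sp1": [55.1, 42, 89, 63, 64],
    "Sp2": [56.8, 84, 78, 42, 9],
    "Sp3": [49.5, 88, 89, 62, 34],
    "Sp4": [57.9, 84, 62, 33, 0],
    "Sp5": [69.7, 4, 65, 21, 0],
    "Sp6": [81.1, 7, 0, 8, 0],
}

# For PVI V.Dur a value BELOW the mean counts as marked for MaltE.
REVERSED = {"PVI V.Dur"}


def collated_index(values):
    """Count the variables for which a speaker is allocated a (+)."""
    plus_markers = 0
    for name, value, mean in zip(VARIABLES, values, MEANS):
        marked = value < mean if name in REVERSED else value > mean
        plus_markers += int(marked)
    return plus_markers


indices = {speaker: collated_index(v) for speaker, v in SPEAKERS.items()}
print(indices)
```

Because each variable contributes only a binary marker, two speakers with quite different raw frequencies can receive the same collated index, which is the homogenising effect noted in the surrounding discussion.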
This index quite successfully represents the different degrees of variation, and overall,
it corresponds relatively well with the identifiability rating each speaker scored in the
Perception study (see Section 6.1). As expected, however, results for some of the variation
which may still be relevant as salient triggers of identifiability have been levelled out, such as
variation in (a) for Sp2, which falls below the Mean, and is therefore allocated (-), or again,
(PVI V.Dur) for Sp5, which is also allocated a (-) in relation to the Mean. In both cases, the
figures still indicate some form of variation, also in relation to the least marked of all the
speakers, Sp6, but in a binary categorisation, some of the rather more subtle variation cannot
be accounted for. In spite of this slight homogenisation of the data, note that the index still
manages to reflect a continuum of MaltE variation across the six speakers in these data. In
particular, the behaviour of this index may strengthen the argument for considering clusters
of different linguistic characteristics, as opposed to individual ones.
7.3 Discussion
This study has focused on variation in MaltE across five characteristics within the
domain of phonetic analysis. Taken individually, each characteristic often gives a clear
indication of how identifiably MaltE or otherwise a speaker may be rated. This is particularly
the case for the three most highly rated speakers for identifiability, who were all allocated (+)
markers for most or all of the variables. However, the more moderately identifiable speakers
only obtained (+) markers distributed sporadically across the five variables, ensuring that a
detailed study of any one characteristic in a small dataset such as this is likely to miss salient
aspects of variation in MaltE. Sp4 obtained (+) markers for PVI V.Dur and for (r), while Sp5
only obtained (+) for (th). Conversely, a broader analysis of a number of characteristics
which can then collate the frequencies into an index is still able to capture both the more
marked aspects of variation in the highly identifiable speakers, as well as the more subtle, but
still salient variation in the more moderately identifiable speakers. This distinction is
especially useful given the context of bilingualism and continua of language use that a
multicultural nation such as that of the Maltese islands is likely to present.
Besides allowing for a broad categorisation of different degrees of variation among
speakers within MaltE, the five characteristics identified in this thesis also suggest another
pattern of relationships across the individual linguistic variables. This pattern can perhaps
best be understood in the context of the "feature pool" (Mufwene, 2001; Schneider, 2007).
Warren (2008:47) expands on this concept of a feature pool, thus:
there is always a great deal of variability in how a given variety of a
language is spoken, but what contributes to change is that some of the
variability becomes meaningful through its association with particular
groups of speakers, who exploit particular ways of speaking as part of their
identity.
Such exploitation of "particular ways of speaking" is noted in the speech patterns of
the six speakers analysed here. On the one hand, we find that durational characteristics
described within the prosodic domain strongly indicate MaltE variation across the whole
continuum, and on the other hand, a series of segmental characteristics are exploited to
greater or lesser extents by individuals who can then perhaps be more precisely identified at
specific points along this same continuum. If we take each feature individually, the only
features which might be said to trigger a perception of identifiability, evident also in that they
present strong correlations with identifiability, are the all-encompassing (PVI V.Dur) feature,
or the (th) feature, which both strongly correlate with identifiability ratings. On the other
hand, (r) or (a), when considered individually, may well be viewed as minimally relevant to
the study of MaltE. However, a more broad-ranging study, such as this research has
attempted, is more concerned with capturing variation across a group of characteristics, or
features, and the results of studying such groupings, or "pools" of features, suggest that
variation often operates in terms of groups of features. Thus, if we consider the five linguistic
variables identified in this dissertation, and their potential interaction, we see an overall
pattern that also seems to be reflected in identifiability ratings, which may in fact have been
triggered by the cumulative effect of how these five features were used by the various
individuals. Conversely, the analysis of one feature may not always indicate such a clear
relationship between perceptions of identifiable MaltE, on the one hand, and systematic
variation on the other. For example, if we look solely at rhoticity, it may seem as though the
resulting frequencies present unexpected patterns, which could almost seem random. Sp1, as
a speaker rated highly identifiable, does not pattern together with the other two highly
identifiable speakers, and again, Sp5, identified as moderately identifiable, shows less
preference for rhoticity than even the non-identifiable Sp6. Similarly, if we consider (a), we
would see that while two highly identifiable speakers present distinct variation here, the other
highly identifiable speaker presents only a low frequency of variation, and could just as easily
be classified together with the moderately identifiable speakers, who showed no variation at
all. Although a closer sociophonetic study would still be very likely to uncover the meaning
of the patterns which are undoubtedly there, it may be more difficult to identify a distinctly
MaltE profile here, with reference to just one linguistic feature.
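The contrast drawn above, between features that pattern closely with identifiability ratings and features that do not, can be explored with a simple correlation coefficient. The sketch below uses the per-speaker values given in Table 7.2 and assumes a Pearson coefficient; this choice, and the function name `pearson`, are assumptions made for illustration, and the output should not be read as the correlations reported in this thesis, which may have been computed differently. With only six speakers, any such coefficient is indicative at best.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (>= 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Values for Sp1 to Sp6, transcribed from Table 7.2.
ratings = [89, 82, 71, 57, 54, 4]
pvi_vdur = [55.1, 56.8, 49.5, 57.9, 69.7, 81.1]
th = [89, 78, 89, 62, 65, 0]
r = [42, 84, 88, 84, 4, 7]

r_pvi = pearson(pvi_vdur, ratings)  # expected to be negative
r_th = pearson(th, ratings)         # expected to be positive
r_rhotic = pearson(r, ratings)

print(r_pvi, r_th, r_rhotic)
```

The sign of each coefficient shows the direction of the relationship: a lower (PVI V.Dur) accompanies a higher rating, whereas a higher frequency of (th) variation accompanies a higher rating.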
So it may be more useful for the purposes of trying to understand how variation in
MaltE works, to view these characteristics (and doubtless others too) as a series of linguistic
choices whose qualities, or features, make them optimal candidates for meaningful variation
in this emerging variety of English. In other words, among the hundreds or thousands of
possible permutations of linguistic variation, it is possible to identify a much smaller number
of characteristics which are likely to interact more closely with each other, and it is this
interaction of a combination of some characteristics over others that can inform the
perceptions of what is identifiable for, in this case, MaltE. Further, once we have identified a
series of combined characteristics which are likely to trigger the perception of identifiability
in MaltE, we can then also begin to examine exactly how these combined characteristics
interact to indicate a more nuanced understanding of how variation in MaltE operates along a
continuum of identifiability. Among the five characteristics identified in this research, for
instance, (PVI V.Dur) can be considered a strong trigger of the perception of identifiability in
MaltE, across all six speakers analysed, while (r) might serve to place a MaltE speaker more precisely
within a particular social network, or speech community. So it is possible to consider the
process of perception of MaltE as a multi-tiered one, where the first, perhaps almost
subconscious perceptions are triggered by (PVI V.Dur), or by (th), while on another level,
characteristics such as (r) or (a) serve to modify or confirm that initial reaction.
In this chapter, I have explored the possibility of beginning to peg the data from six
MaltE speakers onto a continuum of variation of MaltE. This framework is based on the
correspondence between perception ratings obtained for each of the six speakers from the
Perception study (Study 2) with 28 native MaltE listeners, and the variation indices calculated
for each of the five chosen characteristics (PVI V.Dur), (schwaØ), (r), (a) and (th).
8 Conclusion
This study presents evidence to support the view that native speakers of MaltE use
systematic patterns of variation, with reference to the five characteristics analysed here. This
study has also shown that MaltE listeners are frequently able to perceive such systematic
patterns of variation, and also identify with a fair degree of subtlety, how identifiably MaltE
or not an individual speaker may be. Such evidence serves to strengthen the suggestion that
MaltE is approaching a stage identified by Schneider (2003, 2007) as "nativization", where a
newly emerging dialect, or variety, of English, shows increasing evidence of "the emergence
of locally distinctive linguistic forms and structures" (Schneider, 2007:71). Together, the
patterns of perception evidenced in 28 native MaltE listeners, and the patterns of production
analysed in the speech of six native MaltE speakers, further suggest evidence that MaltE
operates on a continuum of variation, very much in line with the concept of the "cline in
bilingualism" proposed by Kachru (1990), and noted with reference to language use in Malta
by a number of scholars observing that variation in MaltE operates on a continuum, and
cannot be homogenised (Borg, 1986; Vella, 1995). This variation within MaltE, far from
being the aberration, or the manifestation of bad English, that it is sometimes vilified as, can
be considered as evidence of a desire to engage creatively with the English language in a way
that is meaningful for the community.
8.1 Patterns of production
There is widespread evidence of systematic variation across all six MaltE speakers
whose speech patterns were analysed with respect to five characteristics which are widely
considered to be associated with MaltE patterns of speech. Four segmental features analysed
in the six speakers yielded strong indications of systematic variation, which in the case of
(th) and (schwaØ) also resulted in a strong positive correlation with identifiability ratings.
The remaining two segmental features, (a) and (r), did not show particularly strong
correlations, but patterns of variation are nevertheless still clearly evident, as indicated
by the generally high frequency of variation for the highly identifiable MaltE speakers. I have
suggested one interpretation for this to be that while variation in the realisation of (th) is more
widespread throughout MaltE speech patterns, variation in the realisation of (r), and (a), is
less prevalent, and may instead be restricted to specific subgroups within the MaltE speech
community. A closer look at how (a) and (r) were distributed in the data in terms of speech
styles shows an unexpectedly lower frequency of (a) variation for Sp2, and of (r) for Sp1 in
the reading aloud passage (TextAloud), when compared with the other identifiable, or
moderately identifiable speakers in the cohort. This might be accounted for by a tendency in
speakers to self-monitor and control for any variants which may be perceived as stigmatised,
in some way, in speech styles which allow for more careful production. Bonnici (2010) has
indicated that (r) is consciously discussed as a feature of distinction in MaltE, with less rhotic
speakers frequently associated either with older people or with younger people perceived as
snobbish, on account of this feature's close association with ex-colonial Standard British
English patterns. Similarly, anecdotal evidence and parodies (Appendix A), together with
earlier literature, also highlight variation in the realisation of (a) as a feature which is readily
associated with MaltE. So both these variants may be or may once have been considered non-
standard, or even stigmatised, and may therefore have been controlled for to some degree, by
two speakers in the highly identifiable MaltE group who may have been more sensitive to the
social implications for how (a) and (r) are realised.
In particular, patterns of variability in vowel duration (PVI V.Dur) were found to be
especially indicative of identifiable MaltE speech, as increased variability showed a strong
negative correlation with perceptions of identifiability obtained through the Perception study
(Study 2). This shows that a lower index of variability in vowel duration patterns is more
salient for MaltE, while a high index is considered non-identifiable for MaltE. In an expected
and closely related analysis, the preference for full vowels over schwa (schwaØ) was also
much higher for the more identifiable MaltE speakers, while less identifiable speakers
showed more preference for schwa over full vowels. However, this distinction between
preferring full vowels, or preferring schwa, may not be the sole factor to influence the
different indices of vowel duration variability across the six speakers. Another factor seen to
have a bearing on durational characteristics is rhoticity. In all six speakers, both the rhotic
and non-rhotic speakers showed evidence of shorter mean lengths of vowel duration when
vowels were followed by overt realisation of (r) than in cases of null realisation of (r). It
follows, then, that speakers who are less rather than more rhotic have more opportunity, as it
were, to exploit variability in vowel durations than speakers who are more rhotic. The
speakers who were rated as being more identifiably MaltE were generally more rhotic, used
more full vowels rather than schwa, and also had a lower pairwise variability index for vowel
duration.
Besides the evidence of salient variation across the full spectrum of MaltE, most
significantly in (PVI V.Dur), (schwaØ) and (th), some preliminary findings in these data are
also suggestive of a distribution of these five characteristics in terms of gender, a sensitivity
to prestige norms, and by association, to speech styles, or register. This may be particularly
noticeable in the case of variation in (a), which is only evident in the three highly identifiable
speakers, and among these, is only minimally present in the female speaker, Sp2. One
possible interpretation of this patterning is that variation in (a) is stigmatised in some way and
therefore is avoided in more careful speech. It is also therefore not considered a popular
option for a female speaker, who, as it has been previously established, is more likely to show
sensitivity to the distinction between prestige forms and stigmatised ones. Compare with this
the distribution of (r), where the distinction between prestige and stigmatised forms is not so
clear cut. In this case, the highly identifiable female speaker Sp2 also presents similar
patterns of high frequencies of rhoticity, compared with the other identifiable male speakers,
and one of the more moderately identifiable speakers also presents a similarly high frequency
of (r).
In terms of the link between identifiability ratings obtained through the Perception
study, and patterns of variation in the production of these five characteristics, findings
suggest that it is possible to begin to establish an index of variation for MaltE. Two types of
index were devised to capture on the one hand variation within each characteristic, and on the
other hand, variation across the five characteristics as a group. In each case, the resulting
index managed to capture distinctions among the six speakers which reflect both general
trends across the speakers, as well as more subtle distinctions between them. These same
subtleties and trends are also reflected in the identifiability ratings allocated by the MaltE
listeners.
8.2 Patterns of perception
Previous studies of the varied English evident throughout the
Maltese islands have usually needed first to justify the existence of a distinct and systematic
pattern of variation, which I have called MaltE throughout this dissertation. The same research
has also often felt the need to place MaltE within a context of reference to some form of
exonormative standard, such as, usually, Standard British English, with RP accompanying it
as accent of choice. An amount of later research on MaltE has circumvented this limitation by
attempting to contextualise MaltE solely with reference to its localised speech community,
and the subgroups within it (for example, Camilleri, 1992; Vella, 1995; Bonnici, 2010), and
the research for this dissertation builds on this new direction by investigating patterns of
perception in MaltE listeners listening to spoken MaltE.
In this dissertation I tapped into the more intuitive, as opposed to more overtly
judgmental perceptions of MaltE speech patterns. This required a carefully designed
perception study where participants were encouraged to focus on whether or not they thought
a series of audio clips represented speakers who were Maltese, or not. The timing of each
audio clip was purposely very short, in order to minimise the opportunity to read too much
meaning into the clip, other than addressing the issue of whether or not each clip represented
speakers who were Maltese. Each clip also included a number of examples of the target
characteristics under discussion throughout this thesis.
The decision to use magnitude estimation (ME) as a measuring instrument posed
some risks, not least because of its novelty, and also because it required participants to think
numerically, and to interpret their impressions on a numerical scale of their own devising.
Nevertheless, these risks paid off, to a great extent, as the resulting scores then offered the
possibility both of broader categorical groupings, in terms of more versus less identifiable, as
well as a more nuanced understanding of how each individual speaker was perceived by the
individual listeners. In effect, ME was therefore also useful in allowing listeners more
freedom in expressing their perceptions than the traditional Likert-type scale might have
allowed for. The ME scores from each listener were normalised and categorised first
according to whether they fell above or below a score of 1, indicating more or less
identifiable than the modulus respectively. The scores were then grouped into the percentage
of listeners scoring each clip more or less than 1, in order to arrive at an identifiability rating
for each clip, representing the MaltE speakers.
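The conversion of ME scores into identifiability ratings described above can be sketched as follows. The sketch assumes that each listener's scores are normalised by dividing them by the score that listener gave to the modulus, so that a normalised score of 1 means "as identifiable as the modulus"; this normalisation step, the function names and all of the scores are hypothetical and are not taken from the Perception study. It also assumes that every listener rated every clip.

```python
def normalise(raw_scores, modulus_score):
    """Express one listener's raw scores relative to their modulus score."""
    return {clip: score / modulus_score for clip, score in raw_scores.items()}


def identifiability_ratings(listeners):
    """Percentage of listeners whose normalised score for a clip exceeds 1.

    `listeners` is a list of (modulus_score, {clip: raw_score}) pairs.
    """
    above_modulus = {}
    for modulus_score, raw_scores in listeners:
        for clip, score in normalise(raw_scores, modulus_score).items():
            above_modulus.setdefault(clip, 0)
            if score > 1:
                above_modulus[clip] += 1
    n_listeners = len(listeners)
    return {clip: 100 * k / n_listeners for clip, k in above_modulus.items()}


# Four hypothetical listeners, each using a numerical scale of their own.
listeners = [
    (10, {"clipA": 30, "clipB": 5}),
    (50, {"clipA": 75, "clipB": 50}),
    (2, {"clipA": 1, "clipB": 1}),
    (100, {"clipA": 400, "clipB": 20}),
]

ratings = identifiability_ratings(listeners)
print(ratings)
```

Dividing by each listener's own modulus score is what allows scores on very different self-devised scales to be pooled before the percentages are taken.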
Patterns of perception as observed in the ME Perception Study showed trends
typically observed elsewhere in previous studies of dialect perception. In particular, listeners
showed strong levels of consensus over what could definitely and what could definitely not
be considered typically MaltE. Listeners also showed a degree of sensitivity with regard to
more fine-grained aspects of variation, which subsequently allowed for further distinction
among those speakers who had been identified by listeners as identifiably MaltE to different
degrees.
8.3 Future research
In shifting our perspective slightly, to consider variation in MaltE on its own terms,
and not in reference to an external standard, or norm, we notice many examples of systematic
and coherent patterning which suggest that such MaltE variation operates on a meaningful
level, within its speech community. Questions relating to ways in which variation in MaltE is
manifested as a function of gender, or age, or register have been raised in this thesis, and
more research in this field would undoubtedly enrich our understanding of the sociolinguistic
context of Maltese society. It would be interesting, for example, to understand how
variation in (a) is manifested in larger participant groups, whether it is preferred by males or
females, or, on the other hand, whether its distribution changes as a function of register.
Similarly, as this work focused on developing an index of variation with reference to
five linguistic characteristics at the phonetic/phonological level, it would be useful to expand
here by conducting further studies with larger cohorts of speakers. This process could test or
strengthen the current findings across a wider range of variation, or indeed across a wider
range of characteristics, at other levels of linguistic structure. Both of these directions would
serve to develop a more refined index of variation.
As the role of the use of English in Malta is considered to be sometimes a utilitarian
one, important for travel, and better employment opportunities, more studies on speech styles
are also needed. My work has suggested some self-monitoring in more careful speech styles,
and perhaps also, more linguistic accommodation in conversation, particularly when speakers
on different points along the continuum of variation engage in conversation with each other.
Future research should concentrate, for example, on how the same speakers shift between scripted
and spontaneous speech styles, and on the effects of such shifts.
To conclude, there is no shortage of avenues to explore in the study of Maltese
English, where research to date has been limited, though still able to present a number of
clear directions for future work. This thesis has presented a perspective of the continuum of
variation for MaltE, and it would undoubtedly be useful to learn more about how variation in
this newly emerging dialect operates in a meaningful way in the Maltese islands.
References
Abercrombie, D. (1967). Elements of general phonetics. Edinburgh: Edinburgh University
Press.
Achebe, C. (1965). English and the African writer. Transition, 75/76, 342-349.
Aitchison, J. (2000). Language change: Progress or decay? (3rd ed.). Cambridge: Cambridge
University Press.
Anderson, A. H., Bader, M., Bard, E. G., Boyle, E., Docherty, G., Garrod, S., . . . Weinert, R.
(1991). The HCRC map task corpus. Language and Speech, 34(4), 351-366.
doi:10.1177/002383099103400404
Arvaniti, A. (2009). Rhythm, timing, and the timing of rhythm. Phonetica, 66, 46-63.
doi:10.1159/000208930
Arvaniti, A. (2012). The usefulness of metrics in the quantification of speech rhythm. Journal
of Phonetics, 40(3), 351-373. doi:10.1016/j.wocn.2012.02.003
Arvaniti, A. (forthcoming). Rhythm classes and speech perception [prepublication version].
In O. Niebuhr (Ed.), Prosodies: Context, function and communication. Walter de
Gruyter.
Azzopardi, M. (1981). The phonetics of Maltese: Some areas relevant to the deaf.
(Unpublished PhD). University of Edinburgh.
Bailey, R. W. (2003). Ideologies, attitudes, and perceptions. American Speech, (88), 123-150.
Baltazani, M. (2005). Phonetic variability of the Greek rhotic sound. Phonetics and
Phonology in Iberia Conference, Barcelona.
Baltazani, M. (2007). Prosodic rhythm and the status of vowel reduction in Greek. Selected
Papers on Theoretical and Applied Linguistics from the 17th International Symposium
on Theoretical and Applied Linguistics, 1, 31-43.
Bamgbose, A. (1982). Standard Nigerian English: Issues of identification. In B. B. Kachru
(Ed.), The other tongue. English across culture (pp. 99-111). Oxford: Pergamon Press.
Bard, E. G., Robertson, D., & Sorace, A. (1996). Magnitude estimation of linguistic
acceptability. Language, 72(1), 32-68.
Barras, W. S. (2011). The sociophonology of rhoticity and r-sandhi in East Lancashire
English (PhD). Available from www.era.lib.ed.ac.uk/.
Barry, W. (1998). Time as a factor in the acoustic variation of schwa. 5th International
Conference on Spoken Language Processing, Sydney Australia.
Barry, W. J. (2007). Rhythm as an L2 problem: How prosodic is it? In J. Trouvain, & U. Gut
(Eds.), Non-native prosody. phonetic description and teaching practice (pp. 97-120).
Berlin, New York: De Gruyter Mouton.
Barry, W. J. (2009). Do rhythm measures reflect perceived rhythm? Phonetica, 66(1/2), 78-94.
Barry, W. J., Andreeva, B., Russon, M., Dimitrova, S., & Kostadinova, T. (2003). Do rhythm
measures tell us anything about language type? Proceedings of the 15th International
Congress of Phonetic Sciences, Barcelona. 2693-2696.
Beal, J. (2009). "You're not from New York city, you're from Rotherham": Dialect and
identity in UK "indie" music. Journal of English Linguistics, 37, 223-240.
Beckman, M. E. (1992). Evidence of speech rhythms across languages. In Y. Tohkura,
E. Vatikiotis-Bateson, & Y. Sagisaka (Eds.), Speech perception, production and linguistic
structure (pp. 457-463). Tokyo: OHM Publishing Co.
Benzeghiba, M., De Mori, R., Deroo, O., Dupont, S., Erbes, T., Jouvet, D., Fissore, L.,
Laface, P., Mertins, A., Ris, C., Rose, R., & Wellekens, C. (2007). Automatic speech
recognition and speech variability: A review. Speech Communication, 49(10–11), 763-786.
doi:10.1016/j.specom.2007.02.006
Bertinetto, P. M. (1989). Reflections on dichotomy 'stress' vs. 'syllable' timing. Revue De
Phonétique Appliquée, 91-93, 99-130.
Bertinetto, P. M., & Bertini, C. (2008). On modeling the rhythm of natural languages.
Proceedings of Speech Prosody, Campinas, Brazil. 427-430.
Bhatt, R. M. (2001). World Englishes. Annual Review of Anthropology, 30, 527-550.
Bilal, H. A., Mahmood, M. A., & Saleem, R. M. (2011). Acoustic analysis of front vowels in
Pakistani English. International Journal of Academic Research, 3(6), 20-27.
Boersma, P. (2014). Acoustic analysis. In R. J. Podesva, & D. Sharma (Eds.), Research
methods in linguistics (pp. 375-395). Cambridge: Cambridge University Press.
Boersma, P., & Weenink, D. (2013). Praat version 5.3.48
Bolton, K. (2006). Varieties of World Englishes. In B. B. Kachru, Y. Kachru & C. Nelson
(Eds.), The handbook of World Englishes (pp. 289-312). Oxford: Blackwell.
Bonnici, L. M. (2007). Maltese English: History of use, structural variation, and
sociolinguistic status. In B. Comrie, R. Fabri, E. Hume, M. Mifsud, T. Stolz & M.
Vanhove (Eds.), Introducing Maltese linguistics: Selected papers from the 1st
international conference on Maltese linguistics, Bremen, 18-20 October, 2007 (pp. 393-414).
Amsterdam: John Benjamins.
Bonnici, L. M. (2010). Variation in Maltese English: The interplay of the local and the
global in an emerging postcolonial variety. (Unpublished PhD). University of California,
Davis.
Bonnici, L. M., Hilbert, M., & Krug, M. (2012). Maltese English. In B. Kortmann, & K.
Lunkenheimer (Eds.), The Mouton World Atlas of variation in English (pp. 653-668).
Berlin and New York: Mouton de Gruyter.
Borg, A. (1980). Language and socialization in developing Malta. Work in Progress,
Department of Linguistics, University of Edinburgh, 13, 60-71.
Borg, A. (1986). The maintenance of Maltese as a language: What chances? Council of
Europe European Workshop on Multicultural Studies in Higher Education, 89-106.
Borg, A., & Azzopardi-Alexander, M. (1997). Maltese. London: Routledge.
Brincat, J. (2011). Maltese and other languages: A linguistic history of Malta. Malta: Midsea
Books ltd.
Buchstaller, I., & D'Arcy, A. (2009). Localized globalization: A multi-local, multivariate
investigation of quotative be like. Journal of Sociolinguistics, 13(3), 291-331.
doi:10.1111/j.1467-9841.2009.00412.x
Bybee, J. (2001). Phonology and language use. Cambridge: Cambridge University Press.
Calleja, M. (1987). A study of stress and rhythm as used by Maltese speakers of English.
(Unpublished B.Ed (Hons.)). University of Malta.
Camilleri, A. (1992). The sociolinguistic status of English in Malta. Edinburgh:
Camilleri-Grima, A. (2001). The Maltese bilingual classroom: A microcosm of local society.
Mediterranean Journal of Educational Studies, 6(1), 3-12.
Camilleri-Grima, A. (2013). A select review of bilingualism in education in Malta.
International Journal of Bilingual Education and Bilingualism, 16(5), 553-569.
doi:10.1080/13670050.2012.716811
Campbell-Kibler, K. (2008). I'll be the judge of that: Diversity in social perceptions of (ing).
Language in Society, 37(5), 637-659.
Carabott, S. (2013, ). Potatoes, briefs and other ways to sell Malta. The Times of Malta
Caruana, S. (2006). Trilingualism in Malta - Maltese, English and Italiano televisivo.
International Journal of Multilingualism, 3(3), 159-172.
Caruana, S. (2007). Language use and language attitudes in Malta. In D. Lasagabaster, & A.
Huguet (Eds.), Multilingualism in European bilingual contexts: Language use and
attitudes (pp. 184-207). Clevedon, U.K.: Multilingual Matters.
Chambers, J. K., & Trudgill, P. (1998). Dialectology. Cambridge: Cambridge University
Press.
Chambers, J. K., Trudgill, P., & Schilling-Estes, N. (Eds.). (2002). The handbook of language
variation and change. MA., USA: Blackwell Publishing.
Cheshire, J. (1991). English around the world: Sociolinguistic perspectives. Cambridge:
Cambridge University Press.
Chisanga, T., & Kamwangamalu, N. M. (1997). Owning the other tongue: The English
language in Southern Africa. Journal of Multilingual and Multicultural Development,
18(2), 89-99. doi:10.1080/01434639708666305
Clopper, C., & Pisoni, D. B. (2004). Homebodies and army brats: Some effects of early
linguistic experience and residential history on dialect categorization. Language Variation
and Change, 16, 13-48. doi:10.1017/S0954394504161036
Clopper, C., & Pisoni, D. B. (2005). Perceptions of dialect variation. In D. B. Pisoni, & R.
Remez (Eds.), The handbook of speech perception (pp. 313-337). Oxford: Blackwell.
Coetzee, A. W., & Wissing, D. P. (2007). Global and local durational properties in three
varieties of South African English. The Linguistic Review, 24, 263-289.
doi:10.1515/TLR.2007.010
Cohn, A. C., Fougeron, C., Huffman, M. K., & Renwick, M. E. L. (Eds.). (2012). The
Oxford handbook of laboratory phonology. Oxford: Oxford University Press.
Connell, L., & Ramscar, M. (2001). Using distributional measures to model typicality in
categorization. Proceedings of the 23rd Annual Conference of the Cognitive Science
Society, University of Edinburgh.
Cruttenden, A. (2001). Gimson's pronunciation of English, 6th Edition, London: Arnold.
Crystal, D. (2003). The Cambridge Encyclopedia of the English language. Cambridge:
Cambridge University Press.
Crystal, D. (2011). The future of Englishes. University of Malta.
Dabrowska, E. (2010). Naive vs expert intuitions: An empirical study of acceptability
judgments. The Linguistic Review, 27, 1-23. doi:10.1515/tlir.2010.001
Dandria, R. (2002). Lexical access and prototypicality in the English of Maltese bilingual
speakers. (Unpublished B.A.(Hons)). University of Malta, Malta.
Dauer, R. M. (1983). Stress-timing and syllable-timing reanalysed. Journal of Phonetics, 11,
51-62.
Dauer, R. M. (1987). Phonetic and phonological components of language rhythm.
Proceedings of the 11th International Congress of Phonetic Sciences, Tallinn. 447-449.
Debrincat, R. (1999). Accent characteristics and variation in Maltese English. (Unpublished
B.A. Hons). University of Malta, Malta.
Delceppo, R. (1986). The acquisition of English phonology by the Maltese child.
(Unpublished B.Ed.(Hons)). University of Malta.
Dellwo, V. (2006). Rhythm and speech rate: A variation coefficient for deltaC. Language and
Language-Processing: Proceedings of the 28th Linguistic Colloquium, 231-241.
Deterding, D. (2001). The measurement of rhythm: A comparison of Singapore and British
English. Journal of Phonetics, 29(2), 217-230.
Di Paolo, M., & Yaeger-Dror, M. (Eds.). (2011). Sociophonetics: A student's guide. Oxon:
Routledge.
Di Paolo, M., Yaeger-Dror, M. & Beckford Wassink, A. (2011). Analyzing vowels. In M. Di
Paolo & M. Yaeger-Dror (Eds.), Sociophonetics. (pp. 87-106). Oxon: Routledge.
Docherty, G. J., & Foulkes, P. (2014). An evaluation of usage-based approaches to the
modelling of sociophonetic variability. Lingua, 142, 42-56.
Dufour, S., Nguyen, N., & Frauenfelder, U. H. (2007). The perception of phonemic contrasts
in a non-native dialect. Journal of the Acoustical Society of America, 121(4), EL131-
EL136. doi:10.1121/1.2710742
Featherston, S. (2005). Magnitude estimation and what it can do for your syntax: Some wh-
constraints in German. Lingua, 115(11), 1525-1550. doi:10.1016/j.lingua.2004.07.003
Flemming, E., & Johnson, S. (2007). Rosa's roses: Reduced vowels in American English.
Journal of the International Phonetic Association, 37, 83-96.
Foulkes, P., & Docherty, G. J. (2006). The social life of phonetics and phonology. Journal of
Phonetics, 34, 409-438. doi:10.1016/j.wocn.2005.08.002
Frendo, H. (1992). Language and nationhood in the Maltese experience: Some comparative
and theoretical approaches. Msida: University of Malta.
Galea Cavallazzi, K. (2004). The phonology-syntax interface of spoken Maltese English.
(Unpublished M.A.). University of Malta.
Garrod, S., & Pickering, M. J. (2004). Why is conversation so easy? Trends in Cognitive
Sciences, 8(1), 8-11. doi:10.1016/j.tics.2003.10.016
Gessinger, J. (2010). Language variation, language change and perceptual dialectology.
Multilingua, 29(3), 361-383. doi:10.1515/mult.2010.018
Giegerich, H. J. (1992). English phonology: An introduction. Cambridge: Cambridge
University Press.
Gil-Günzburger, D. (1979). Estimation tests as a means of quantifying linguistic expectancy.
Acta Psychologica, 43(4), 271-282. doi:10.1016/0001-6918(79)90036-2
Grabe, E., & Low, E. (2002). Durational variability in speech and the rhythm class
hypothesis. Papers in Laboratory Phonology, 7, 515-546.
Gries, S. T. (2013). Statistics for linguistics with R: A practical introduction. Berlin/Boston:
De Gruyter Mouton.
Gut, U. (2012). Rhythm in L2 speech. In D. Gibbon, D. Hirst & N. Campbell (Eds.), Rhythm
melody and harmony in speech. Studies in honour of Wiktor Jassem. Special edition.
Speech and language technology. (14th ed., pp. 83-94)
Hamdi, R., Barkat-Defradas, M., Ferragne, E., & Pellegrino, F. (2004). Speech timing and
rhythmic structure in Arabic dialects: A comparison of two approaches. Interspeech,
1613-1616.
Harrington, J. (2010). Phonetic analysis of speech corpora. MA., USA: Wiley-Blackwell.
Hartmann, D., & Zerbian, S. (2009). Rhoticity in Black South African English: A
sociolinguistic study. South African Linguistics and Applied Language Studies, 27(2),
135-148. doi:10.2989/SALALS.2009.27.2.2.865
Hawkins, S. (2003). Contribution of fine phonetic detail to speech understanding.
Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona. 293-296.
Hay, J., & Drager, K. (2010). Stuffed toys and speech perception. Linguistics, 48(4), 865-892.
doi:10.1515/LING.2010.027
Hay, J., Warren, P., & Drager, K. (2006). Factors influencing speech perception in the
context of a merger-in-progress. Journal of Phonetics, 34(4), 458-484.
Hickey, R. (1989). R-coloured vowels in Irish English. Journal of the International Phonetic
Association, 15(2), 44-58. doi:10.1017/S0025100300002978
Hilbert, M., & Krug, M. (2010). The compilation of ICE Malta: State of the art and
challenges along the way. ICAME Journal, 34, 56-63.
Hoffmann, T. (2013). Obtaining introspective acceptability judgments. In M. Krug, & J.
Schluter (Eds.), Research methods in language variation and change (pp. 99-118).
Cambridge: Cambridge University Press.
Hughes, R., & Szczepek Reed, B. (2011). Learning about speech by experiment: Issues in the
investigation of spontaneous talk within the experimental research paradigm. Applied
Linguistics, 32(2), 197-214. doi:10.1093/applin/amq044
Hull, G. (1993). The Malta language question: A case study in linguistic imperialism. Malta:
Said International.
International Phonetic Association. (2005). IPA chart, available under a creative commons
attribution-sharealike 3.0 unported license. Retrieved from
http://www.langsci.ucl.ac.uk/ipa/ipachart.html
Johnson, K. (1997). Speech perception without speaker normalization. In J. W. Mullennix
(Ed.), Talker variability in speech processing (pp. 145-165). California, USA: Academic
Press.
Johnson, K. (2005). Speaker normalization in speech perception. In D. B. Pisoni, & R. Remez
(Eds.), The handbook of speech perception (pp. 363-389). Oxford: Blackwell.
Kachru, B. B. (Ed.). (1992). The other tongue: English across cultures. University of Illinois
Press.
Kachru, B. B. (1990). The alchemy of English: The spread, functions and models of
non-native Englishes. University of Illinois Press.
Kachru, B. B. (1996). World Englishes: Agony and ecstasy. Journal of Aesthetic Education,
30(2), 135-155.
Kortmann, B., & Lunkenheimer, K. (2013). The Electronic World Atlas of Varieties of
English. Retrieved from http://ewave-atlas.org/
Krug, M. (2001). Frequency, iconicity, categorization: Evidence from emerging modals. In J.
Bybee, & P. Hopper (Eds.), Frequency and the emergence of linguistic structure (pp.
309-335). Amsterdam: John Benjamins.
Krug, M. (In Press 2015). Maltese English. In J. P. Williams, E. W. Schneider, D. Schreier &
P. Trudgill (Eds.), Further studies in the lesser-known varieties of English (pp. 8-50).
Cambridge: Cambridge University Press.
Labov, W. (1978). Sociolinguistic patterns. Oxford: Blackwell.
Labov, W. (2001). Principles of linguistic change. volume 2: Social factors. Massachusetts,
USA: Blackwell Publishers, Inc.
Ladefoged, P. (2003). Phonetic data analysis. Oxford: Blackwell.
Ladefoged, P., & Ferrari Disner, S. (2012). Vowels and consonants (3rd ed.). MA., USA:
Wiley-Blackwell.
Ladefoged, P., & Maddieson, I. (1996). The sounds of the world's languages. Oxford:
Blackwell.
Language act, art.5. (1964). Constitution of Malta.
Lass, R. (1997). Historical linguistics and language change. Cambridge: Cambridge
University Press.
Lehiste, I. (1970). Suprasegmentals. Cambridge, MA: MIT Press.
Lindau, M. (1985). The story of /r/. In V. A. Fromkin (Ed.), Phonetic linguistics: Essays in
honor of Peter Ladefoged (pp. 157-168). Orlando, FL: Academic Press.
Llisterri, J. (1995). Relationships between speech production and speech perception in a
second language. Proceedings of the 13th International Congress of Phonetic Sciences,
Stockholm. 92-99.
Loebach, J. L., Bent, T., & Pisoni, D. B. (2008). Multiple routes to the perceptual learning of
speech. Journal of the Acoustical Society of America, 124(1), 552-561.
doi:10.1121/1.2931948
Low, E. (2010). The acoustic reality of the Kachruvian circles: A rhythmic perspective. World
Englishes, 29(3), 394-405. doi:10.1111/j.1467-971X.2010.01662.x
Maguire, W., & McMahon, A. (Eds.). (2011). Analysing variation in English. Cambridge:
Cambridge University Press.
Mazzon, G. (1992). L'Inglese di Malta. Napoli: Liguori.
Mazzon, G. (1993). English in Malta. English World-Wide, 14(2), 171-208.
McMahon, A. (1994). Understanding language change. Cambridge: Cambridge University
Press.
Mesthrie, R. (1996). A lexicon of South African Indian English. Leeds: Peepal Tree Press.
Mesthrie, R. (2003). The World Englishes paradigm and contact linguistics: Refurbishing the
foundations. World Englishes, 22(4), 449-461. doi:10.1111/j.1467-971X.2003.00312.x
Micheli, S. M. (2001). Language attitudes of the young generation in Malta. (Unpublished
M.Phil.). University of Vienna.
Milroy, J. (2001). Language ideologies and the consequences of standardization. Journal of
Sociolinguistics, 5(4), 530-555.
Milroy, J., & Milroy, L. (1985). Linguistic change, social network and speaker innovation.
Journal of Linguistics, 21(2), 339-384.
Montgomery, C., & Beal, J. (2011). Perceptual dialectology. In W. Maguire, & A. McMahon
(Eds.), Analysing variation in English (pp. 121). Cambridge: Cambridge University
Press.
Mufwene, S. S. (2001). The ecology of language evolution. Cambridge: Cambridge
University Press.
Nelson, C. (2011). Intelligibility in World Englishes: Theory and application. New York:
Routledge.
Niedzielski, N. (1999). The effect of social information on the perception of sociolinguistic
variables. Journal of Language and Social Psychology, 18(1), 62-85.
Niedzielski, N. (2001). Chipping away at the perception/production interface. University of
Pennsylvania Working Papers in Linguistics, 7(3), 247-256.
Nokes, J., & Hay, J. (2012). Acoustic correlates of rhythm in New Zealand English: A
diachronic study. Language Variation and Change, 24(1), 1-31.
doi:10.1017/S0954394512000051
Nortier, J. (2012). Types and sources of bilingual data. In L. Wei, & M. G. Moyer (Eds.), The
Blackwell guide to research methods in bilingualism and multilingualism (pp. 35-52).
MA., USA: Blackwell.
Ochs, E. (1993). Constructing social identity: A language socialization perspective. Research
on Language and Social Interaction, 26(3), 287-306.
Ohala, J. (1989). Sound change is drawn from a pool of synchronic variation. In L. E.
Breivik, & E. H. Jahr (Eds.), Language change: Contribution to the study of its causes
(pp. 173-198). Berlin: De Gruyter Mouton.
Paavola, H. (1987). Features of Maltese English. (Unpublished University of Turku, Finland.
Payne, E., Post, B., Prieto, P., Vanrell, M. M., & Astruc, L. (2012). Measuring child rhythm.
Language and Speech, 55(2), 203-229. doi:10.1177/0023830911417687
Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of the vowels. The
Journal of the Acoustical Society of America, 24(2), 175-184.
doi:10.1121/1.1906875
Pierrehumbert, J. B. (2003). Phonetic diversity, statistical learning and acquisition of
phonology. Language and Speech, 46(2/3), 115-154.
doi:10.1177/00238309030460020501
Platt, J. T., Weber, H., & Ho, M. L. (1984). The new Englishes. London: Routledge.
Podesva, R. J., Roberts, S. J., & Campbell-Kibler, K. (2001). Sharing resources and indexing
meanings in the production of gay styles. In K. Campbell-Kibler, R. J. Podesva, S. J.
Roberts & A. Wong (Eds.), Language and sexuality: Contesting meaning in theory and
practice. (pp. 175-189) Center for the Study of Language and Information.
Portelli, J. R. (2006). Language: An important signifier of masculinity in a bilingual context.
Gender and Education, 18(4), 413-430.
Preston, D. R. (1986). Five visions of America. Language in Society, 15(2), 221-240.
Preston, D. R. (2013). The influences of regard on language variation and change. Journal of
Pragmatics, 52, 93-104. doi:10.1016/j.pragma.2012.12.015
Prieto, P., Vanrell, M. M., Astruc, L., Payne, E., & Post, B. (2012). Phonotactic and phrasal
properties of speech rhythm: Evidence from Catalan, English and Spanish. Speech
Communication, 54, 681-702. doi:10.1016/j.specom.2011.12.001
Purnell, T., Idsardi, W. J., & Baugh, J. (1999). Perceptual and phonetic experiments on
American English dialect identification. Journal of Language and Social Psychology,
18(1), 10-30.
Ramus, F., Dupoux, E., & Mehler, J. (2003). The psychological reality of rhythm classes:
Perceptual studies. Proceedings of the 15th International Congress of Phonetic Sciences,
Barcelona. 337-342.
Ramus, F., Nespor, M., & Mehler, J. (1999). Correlates of linguistic rhythm in the speech
signal. Cognition, 73, 265-292.
Rathcke, T., & Smith, R. (2011). Exploring timing in accents of British English. Proceedings
of the 17th International Congress of Phonetic Sciences, Hong Kong. 1666-1669.
Recasens, D., & Espinosa, A. (2007). Phonetic typology and positional allophones for
alveolar rhotics in Catalan. Phonetica, 63, 1-28. doi:10.1159/000100059
Roach, P. (1982). On the distinction between 'stress-timed' and 'syllable-timed' languages.
Retrieved from http://www.personal.reading.ac.uk/~11sroach/phon2/frp.pdf
Romaine, S. (1995). Bilingualism (2nd ed.). MA. USA: Blackwell.
Sailaja, P. (2009). Dialects of English: Indian English. Edinburgh: Edinburgh University
Press.
Scharinger, M., Monahan, P. J., & Idsardi, W. J. (2011). You had me at "hello": Rapid
extraction of dialect information from spoken words. Neuroimage, 1-10.
doi:10.1016/j.neuroimage.2011.04.007
Schneider, E. W. (2003). The dynamics of new Englishes: From identity construction to
dialect birth. Language, 79(2), 233-281.
Schneider, E. W. (2011). English around the world. Cambridge: Cambridge University Press.
Schneider, E. W. (2007). Postcolonial English: Varieties around the world. Cambridge:
Cambridge University Press.
Sciriha, L. (2001). Trilingualism in Malta: Social and educational perspectives. International
Journal of Bilingual Education and Bilingualism, 4(1), 23-37.
Sciriha, L., & Vassallo, M. (2006). Living languages in Malta. Malta: Print It.
Scobbie, J. M. (2006). (r) as a variable. In K. Brown (Ed.), Encyclopedia of language and
linguistics (2nd ed., pp. 337-344). Oxford: Elsevier.
Sharbawi, S., & Deterding, D. (2010). Rhoticity in Brunei English. English World-Wide,
31(2), 121-137. doi:10.1075/eww.31.2.01sha
Sharma, D. (2005). Dialect stabilization and speaker awareness in non-native varieties of
English. Journal of Sociolinguistics, 9(2), 194-224.
doi:10.1111/j.1360-6441.2005.00290.x
Silverstein, M. (2003). Indexical order and the dialectics of sociolinguistic life. Language and
Communication, 23, 193-229. doi:10.1016/S0271-5309(03)00013-2
Sorace, A. (2010). Using magnitude estimation in developmental linguistics research. In E.
Blom, & S. Unsworth (Eds.), Experimental methods in language acquisition research
(pp. 57-72). Amsterdam: John Benjamins.
Sorace, A., & Keller, F. (2005). Gradience in linguistic data. Lingua, 115(11), 1497-1524.
doi:10.1016/j.lingua.2004.07.002
Sprouse, J. (2011). A test of the cognitive assumptions of magnitude estimation:
Commutativity does not hold for acceptability judgments. Language, 87(2), 274-288.
Stuart-Smith, J. (2007). A sociophonetic investigation of post-vocalic /r/ in Glaswegian
adolescents. Proceedings of the 16th International Congress of Phonetic Sciences,
Saarbrucken. 1449-1452.
Sumner, M. (2011). The role of variation in the perception of accented speech. Cognition,
119, 131-136.
Sumner, M., & Samuel, A. G. (2009). The effect of experience on the perception and
representation of dialect variants. Journal of Memory and Language, 60, 487-501.
Tagliamonte, S. (2012). Variationist sociolinguistics: Change, observation, interpretation.
Oxford: Blackwell.
Tauberer, J., & Evanini, K. (2009). Intrinsic vowel duration and the post-vocalic voicing
effect: Some evidence from dialects of North American English. 10th Annual Conference
of the International Speech Communication Association, Brighton. 2211-2214.
Thusat, J., Anderson, E., Davis, S., Ferris, M., Javed, A., Laughlin, A., . . . Wrubel, J. (2009).
Maltese English and the nativization phase of the dynamic model. English Today, 25(2),
25-32. doi:10.1017/S0266078409000157
Trudgill, P., & Gordon, E. (2006). Predicting the past: Dialect archaeology and Australian
English rhoticity. English World-Wide, 27(3), 235-246.
Vella, A. (1995). Prosodic structure and intonation in Maltese and its influence on Maltese
English. (Unpublished PhD). University of Edinburgh.
Vella, A. (2012). Languages and language varieties in Malta. International Journal of
Bilingual Education and Bilingualism, 1-21. doi:10.1080/13670050.2012.716812
Volín, J., & Skarnitzl, R. (2010). The strength of foreign accent in Czech English under
adverse listening conditions. Speech Communication, 52(11–12), 1010-1021.
doi:10.1016/j.specom.2010.06.009
Warren, P. (2008). The language we use. New Zealand English Journal, 22, 45-53.
Watts, R.J., & Trudgill, P. (Eds.). (2004). Alternative histories of English. London:
Routledge.
Watt, D., Fabricius, A., & Kendall, T. (2011). More on vowels. In M. Di Paolo & M. Yaeger-
Dror (Eds.), Sociophonetics. (pp. 107-118). Oxon: Routledge.
Weskott, T., & Fanselow, G. (2008). Variance and informativity in different measures of
linguistic acceptability. Proceedings of the 27th West Coast Conference on Formal
Linguistics, University of California. 431-439.
Weskott, T., & Fanselow, G. (2011). On the informativity of different measures of linguistic
acceptability. Language, 87(2), 249-273.
White, L., & Mattys, S. L. (2007). Calibrating rhythm: First language and second language
studies. Journal of Phonetics, 35, 501-522.
White, L., Mattys, S. L., Series, S., & Gage, S. (2007). Rhythm metrics predict rhythmic
discrimination. Proceedings of the 16th International Congress of Phonetic Sciences,
Saarbrucken. 1009-1012.
White, L., Mattys, S. L., & Wiget, L. (2012). Language categorization by adults is based on
sensitivity to durational cues, not rhythm class. Journal of Memory and Language, 66,
665-679.
Zammit, A. (2013, February 20). One very precious legacy. The Times of Malta
Appendix A MaltE commentary and complaints
1. Fran, lost in Salisbury, on Facebook:
Finding myself lost in Salisbury I stopped a random bus to ask if it by any chance
passes from my road. The driver stared for a second and said "jaqaw inti Maltija?" Then I got
a "itla' sabiħa nieħdok fejn trid!" … hehe gotta love the Maltese and the fact that we're
Everywhere!
Jaqaw inti Maltija? = By any chance, are you Maltese?
Itla' sabiħa nieħdok fejn trid = Hop on, darling, I'll take you wherever you want.
2. Extract from an article in The Times of Malta entitled "Potatoes, briefs and
other ways to sell Malta". Retrieved from:
http://www.timesofmalta.com/articles/view/20130507/local/Potatoes-briefs-and-other-ways-to-sell-Malta.468653
A young Maltese farmer romanticises about the cultivation of potatoes (…)
These articles served as a welcome boost for the island after a slide-show on BBC’s portal
featured migrant birds illegally shot in Malta and the release of Chiara Siracusa’s single
Żarbun two days later fuelled criticism over the choice of lyrics.
But the bubble soon burst when a video about the heritage of potato harvesting in Malta,
carrying the Air Malta logo, went viral in the middle of the week, reaching 120,000 views
yesterday.
The clip, by Duerinck Productions, was produced for Jansen Dongen, one of the top suppliers
of vegetable products for supermarket chains in The Netherlands and Germany.
In it, a young farmer romanticises about the cultivation of potatoes, going to lengths to
explain that potato harvesting runs in his family’s blood, and each crop is grown with so
much passion that it breaks their heart to part from it.
He tries to put into words the authenticity of this home-grown hand-picked vegetable which
tastes of the sea, the church and the sun – distinctive Maltese features.
The initial reaction, mostly from Maltese viewers, generated a series of memes, picking
on the young farmer for his broken English and for making a fool of the Maltese.
The reaction, which surprised the producers themselves, soon shifted however when
foreigners praised the Maltese crop and fellow islanders challenged the critics.
The producers uploaded another clip called Malta Derby Potatoes, without the subtitles,
but this boosted criticism, prompting them to excuse themselves for posting “a
preproduction version with errors in the subtitles”.
3. Reader commentary on an article in The Times of Malta entitled "One very
precious legacy". Retrieved from:
http://www.timesofmalta.com/articles/view/20140220/opinion/One-very-precious-legacy.507520
Whilst agreeing with (and regretting along with) Mr Zammit about the decline in our
fluency in English there could be an explanation for a part of the mis-spelling we
come across so often in the printed word. A Word processor program inserts its own
view of the word typed out, irrespective of the original intended interpretation. Thus
"wear" becomes "where", "breadth" becomes "Breath" and so on.
But conversely, I cannot ever bring myself to accept the glorification of illiteracy to
be seen in the rendition of so many Maltese translations such as, to my mind, terrible
howlers such as translating the English "television" as "telfixxin" in the Maltese
version. Neither the French nor the Italians are guilty of such "Barbarismi"
"telfixxin" would be an excellent example of what I was saying, Mr Camilleri.
But my Maltese dictionary gives televizjoni/televixin...
So what's afoot?
Are you using more than one dictionary, and thence a variety of optional spellings?
What hope is there, for either language?
Two brief comments on this very entertaining and educational article:
To my knowledge, English is still the most widely spoken language in the world,
while Mandarin is the most spoken language.
I know I am going against the grain by saying this, but the perceived decline in the
level of English should be examined further. Suffice to say that today the base is
much broader than it was forty years ago. The vast increase in the number of students
attending educational institutions since then may confuse our perceptions
Appendix B – Expert Listener Feedback
Clip0 Clip1 Clip2 Clip3 Clip4 Clip5 Clip6 Clip7 Clip8 Clip9 Clip10
Li1
rhythm
differences,
lack of
vowel
reduct.;
lack of
dark l
Sliema
English
with some
use of inton.
less typical
of ME
educated
speaker,
academic
flow, typical
ME
prosody, to
which lack
of vowel
reduction
contributes.
Fp very ME
marked
prosody and
use of FPs
very
prominent.
Some
influence of
M in syntax
Reading style,
English learnt at
school, not
regularly used
non-reading
variant of
previous.
Highly
marked.
Doesn't use
English
regularly?
educated ME
speaker,
possibly
regular user.
Some vowel
reduction, non-
marked
prosody
comfortable
user of E,
still marked
especially
in prosody
sliema type
English but full
vowels hence
marked rhythm
infrequent user
of E, good
communicative
ability, but
highly marked
pronunciation
heavily
marked
Li2
intonation,
vowel
quality
final z/z/
devoicing
intonation intonation,
rhoticity
vowel
quality,
devoicing.
Stopping
interdental
frics.,
vowel quality,
intonation,
devoicing, rhoticity
'started
cropping
up'',
intonation,
no schwa,
vowel
quality
intonation,
vowel quality
'because he
would have
had'', final z
devoicing
in his, was
intonation,
vowels e->I in
carrots
rhotic, vowel
quality,
topicalisation
''this picture
how it
shows…''
intonation,
rhotic, no
schwa
Li3
no vowel
reduction,
MT L1
influenced
intonation
intonation
is MT
influenced
word final
devoicing,
MT
intonation,
stress.
stress suggests MT
dominant speaker;
consonant
lengthening,
gemination
grammar L1
MT
influenced,
intonation
hypercorrection
e -> I in
desperate
use of
would, final
devoicing,
MT
intonation
hypercorrection,
vowels pushed
back.
Li4
intonation,
vowel
quality
final
intonation
'even better
still''
vowel
quality,
devoicing
devoicing,
intonation and
phrasing
rhoticity?,
varied use
relative
pronouns
hypercorrection
e -> I in
desperate
phrasing
and
intonation
devoicing,
regressive
assimilation
vowel quality,
intonation
vowel
quality,
intonation
Li6
typical ME
intonation;
frequent
use of
"used to +
V"
calquing
MT "kien +
V"
not your
typical ME
speaker,
even
though
there are
some
intonational
traits of
ME
typical ME
intonation;
"better still"
at the end of
a phrase
calquing
MT "tajjeb
xorta"
not as typical
as 0 and 2,
but has some
features
which are
characteristic
of ME such
as the
phonemic
substitution θ -
t (three,
months)
typical ME
intonation; note
syntactic 'error'
which for who
typical
intonation
patterns e.g.
"from
outside" bit;
some
phonemic
substitutions
ð - d (them,
there, then)
could be the
speaker of
another
English, not
necessarily
ME, I guess
typical
intonation,
e.g., "at
home, at
school..."
not typical ME
intonation;
again, could be
the speaker of
another English
unmistakably
ME, very
typical
intonation,
rolled r,
pronunciation
of operationAl
not many
characteristic
features of
ME
Li8
"what was
THE
point";
intonation
and rhythm
very B
English.
Expat. No
interdental
stops
r' is used
almost as
approximant
ME vowels
"because";
discourse
marker ehm
diphthongises
suitable;
Intonation=reading?
stopping
interD
fricatives,
"outsidER",
discourse:
like
hesitation and
FPs
vowels;
intonation;
discourse: "I
mean",
Rising
Intonation
Vowels and FPs
Vowels. No
reduction
"amphibious",
"operationAL",
rhythm
Vowels. No
reduction
"similAR",
stops for
fricatives.
Li9 post
vocalic r
something
about
intonation
odd
emphasis
"better",
intonation
stopped
fricatives,
WO "quite
some
students
relative pronouns
variation
"which/who"
intonation.
Stopping
fricatives
vowel quality
"desperate",
variant stress
patterns
variant use
"would",
intonation
operational and
amphibious
noted, no th
heavy
intonation,
also no v
reduction.
Appendix C Participant information pack
Data Collection Information Pack and Tasks
Part 1 - Information for you, as a participant in this study:
My name is Sarah Grech and I am reading for a PhD in Linguistics at the University of Malta, where I also work as an Assistant Lecturer in the Department of English. As part of my PhD research, I am collecting data on language usage in Malta. This data will consist of a short written questionnaire (attached), and a short recorded conversation (together totalling 30 minutes). I would greatly appreciate it if you would consider participating in this study. Your participation will consist of a meeting lasting a maximum of 30 minutes, which will be audio recorded. The resulting data from the audio recording and from the questionnaire will be used for research purposes only and will not be used for identification in any way. Any personal information inadvertently arising during the course of the interview will be discarded and deleted. The data collected may form part of the International Corpus of English (ICE) for Malta and 20 seconds of each of the collected audio recordings will be made available on a dedicated website focusing on language analysis and language perception. By signing you are indicating that you are happy to participate in this study. The time you invest will contribute to an increased awareness and understanding of our linguistic identity, behaviour and patterns of language usage here in Malta.
Thank you very much.
Part 2 – Your consent
I agree to participate in a 30 minute study which will be audio recorded, which may form part of the International Corpus of English and 20 seconds of which will be extracted to form part of an extended study made available on a dedicated website. I understand that any personal information inadvertently arising during the course of the interview will be discarded and deleted. I understand that I may withdraw from the study at any point, without providing a reason for doing so.
NAME*:
SIGNATURE:
Sarah Grech Professor Ray Fabri
Supervisor
I can be contacted at any point regarding the study here: [email protected]
*This is only required for initial stages of organising data collection. Any data you provide will immediately be coded
something like this: Sp(eaker)1m/Sp(eaker)1f.
Part 3 – Questionnaire
The answers to these questions will be used in conjunction with the recorded conversations
for research purposes only.
How old are you?
Which type of school/s did you
attend? (answering Church, State
or Private is sufficient)
What language do you mostly
use at home?
What language did you mostly
use at school?
Where in Malta/Gozo have
you lived most of your life?
(this can include the place
you’re currently living in)
Where do you currently live?
Please complete the following table. An example is provided below.
Instructions: rate yourself from 0 (I cannot use this skill in this language), through to 5 (I am
very comfortable with this skill in this language) in the table below. You can use the digits 0,
1, 2, 3, 4 and 5 in your self-rating.
Name:
Language Speaking Listening Reading Writing
Maltese
English
Name: >>example<< Sarah Grech
Language Speaking Listening Reading Writing
Maltese 4 3 2 1
English 5 5 5 5
Appendix D Speech production task materials.
Spot the difference pictures: [images not included in this transcript]
TextAloud:
This is a cartoon of a bike rally, showing a motorbike possibly belonging in the German army. Actually the Montecarlo Rally is an internationally famous event. Dressed in the protective clothing typically used for a wartime motorbike, this proud father and his daughter are clearly bike enthusiasts, enduring damp weather in order to participate in this rally. The bike is military beige and it has some interesting typical features, such as the arched handlebars, together with side mirrors, an old canvas bag, an authentic jerry can and an ammo container seen as part of the sidecar outfit. The man is wearing a crash helmet but the woman only has a head scarf on her head. The vehicle number which is forty-three in one version of the picture, seems a little strange, as do the miserable mulberry trees in the background.
Word List:
1 bike rally
2 crash helmet
3 side mirrors
4 canvas bag
5 jerry can
6 sidecar outfit
7 ammo container
8 Montecarlo Rally
9 head scarf
10 vehicle number
11 mulberry tree
12 damp weather
13 proud father
14 forty-three
15 arched handlebars
16 bike enthusiast
17 German army
18 wartime motorbike
19 military beige
20 protective clothing
Appendix E Perception Study Powerpoint Slides
Slide 1
An Estimation Task
Thank you for taking part in my research.
Slide 2
A Magnitude Estimation Study
This is an estimation task. There are no right or wrong answers, only your close estimation of what you see and hear.
Slide 3
Here’s what you need to do. Look at the line appearing below. How long do you think it might be? Give it a number to reflect your estimation of the length, using the column titled: Trial length, on your Participant Task Sheet.
This line is your yardstick, or modulus, against which you can now compare other lengths.
Slide 4
Like this one:
Modulus (Trial Length)
Trial Length A
• This line is longer than the modulus. Enter a new number to reflect the difference.
Slide 5
Or this one:
Modulus (Trial Length)
Trial Length B
• Enter a new number again for this line.
Tip: although the line is shorter, please do not use negative numbers (below zero) because my statistics programme won’t like it!
Slide 6
This is how magnitude estimation works.
You are estimating your impressions by comparing them to the example set by the modulus.
For my study, I’m going to ask you to do the same kind of estimation for people talking.
Slide 7
Estimating people talking
In the following three clips (one modulus + two speakers), please listen and enter your score in the Excel sheet column marked ‘Trial Sound’. You can listen to the modulus more than once if you need to by clicking the yellow loudspeaker icon.
Tip: avoid using a small scale like 1-10 for this, because it doesn’t give you much flexibility. You could work on a scale of 1-100 or 1-1000, for example, and you can also include decimals or fractions if you need to.
Slide 8
The modulus
Trial Sound A
Trial Sound Clip B
Enter your estimated score in the column labelled ‘Trial Estimating Sound’.
First estimate how clear the modulus sounds, then give Trial Sounds A and B higher or lower scores, depending on whether you think they are clearer or less clear than the modulus.
Click the yellow loudspeaker icon when you are ready to listen.
Slide 9
And now for my study. You will now listen to 11 different speakers and judge them in the same way as you judged the lines and the previous 3 speakers.
Be as adventurous as you like with numbers, so that they reflect your judgement as closely as possible.
The first speaker is your modulus (Clip 00), and it will be repeated throughout the experiment to help you, but you can manipulate the slides yourself, so you can repeat a sound file or skip it, as you choose.
Slide 10
In the following clips, please estimate the following:
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. In each case, decide how much more or less Maltese than the Modulus each speaker sounds and enter a number to show this in the corresponding row on your table.
Use Clip 00 as your reference, then estimate each other clip as either more, less, or equally likely to come from Malta.
Enter your score on the accompanying sheet, in the column that corresponds with the slide title (Clip 00, Clip 01 etc.).
Slide 11
Clip 00 (Modulus), Clip 01
• Clip 00
• Clip 01
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 12
Clip 00, Clip 02
• Clip 00
• Clip 02
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 13
Clip 00, Clip 03
• Clip 00
• Clip 03
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 14
Clip 00, Clip 04
• Clip 00
• Clip 04
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 15
Clip 00, Clip 05
• Clip 00
• Clip 05
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 16
Clip 00, Clip 06
• Clip 00
• Clip 06
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 17
Clip 00, Clip 07
• Clip 00
• Clip 07
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 18
Clip 00, Clip 08
• Clip 00
• Clip 08
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 19
Clip 00, Clip 09
• Clip 00
• Clip 09
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Slide 20
Clip 00, Clip 10
• Clip 00
• Clip 10
Imagine you are at a coffee shop in an international airport and you overhear the following speakers. Compared with the modulus, how sure are you that they are Maltese?
Appendix F Sample data file for Sp1
Vowel Durations r_Y(es) and r_N(o)
Source Word Duration (ms) r
Sp1_TextAloudB_textgrid cartoon 120 yes
Sp1_TextAloudB_textgrid Montecarlo 92 yes
Sp1_TextAloudB_textgrid internationally 55 yes
Sp1_TextAloudB_textgrid for 57 yes
Sp1_TextAloudB_textgrid wartime 62 yes
Sp1_TextAloudB_textgrid are 164 yes
Sp1_TextAloudB_textgrid clearly 155 yes
Sp1_TextAloudB_textgrid participate 53 yes
Sp1_TextAloudB_textgrid arched 76 yes
Sp1_TextAloudB_textgrid mirrors 119 yes
Sp1_TextAloudB_textgrid container 47 yes
Sp1_TextAloudB_textgrid part 99 yes
Sp1_TextAloudB_textgrid sidecar 73 yes
Mean: 90.15385
SD: 39.08078
Sp1_TextAloudB_textgrid motorbike 53 no
Sp1_TextAloudB_textgrid German 111 no
Sp1_TextAloudB_textgrid army 202 no
Sp1_TextAloudB_textgrid motorbike 58 no
Sp1_TextAloudB_textgrid father 142 no
Sp1_TextAloudB_textgrid daughter 136 no
Sp1_TextAloudB_textgrid weather 175 no
Sp1_TextAloudB_textgrid order 187 no
Sp1_TextAloudB_textgrid order 164 no
Sp1_TextAloudB_textgrid features 123 no
Sp1_TextAloudB_textgrid handlebars 236 no
Sp1_TextAloudB_textgrid together 70 no
Sp1_TextAloudB_textgrid scarf 147 no
Sp1_TextAloudB_textgrid her 96 no
Mean: 135.7143
SD: 54.87108
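The two summary figures that close each block above agree with the arithmetic mean and the sample standard deviation (n - 1 denominator) of the durations in that block. The following is a minimal sketch of that check, not the procedure actually used for the thesis; the names `r_yes`, `r_no` and `summarise` are hypothetical and introduced only for illustration.

```python
from statistics import mean, stdev

# Vowel durations in ms for Sp1, copied from the table above.
r_yes = [120, 92, 55, 57, 62, 164, 155, 53, 76, 119, 47, 99, 73]
r_no = [53, 111, 202, 58, 142, 136, 175, 187, 164, 123, 236, 70, 147, 96]


def summarise(durations_ms):
    """Return (mean, sample standard deviation) of a list of durations."""
    return mean(durations_ms), stdev(durations_ms)


for label, durations in (("r = yes", r_yes), ("r = no", r_no)):
    m, sd = summarise(durations)
    print(f"{label}: n = {len(durations)}, mean = {m:.5f}, SD = {sd:.5f}")
```

Using `statistics.pstdev` (population standard deviation) instead would give smaller values that do not match the figures in the table.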
Pairwise Variability Index calculated for vowel durations
Source Text Vowel Duration (ms) PVI calculation
Sp1_TextAloudpvi_textgrid This i 53
Sp1_TextAloudpvi_textgrid is i 60 0.124
Sp1_TextAloudpvi_textgrid a a 65 0.080
Sp1_TextAloudpvi_textgrid cartoon a 98 0.405
Sp1_TextAloudpvi_textgrid cartoon oo 195 0.662
Sp1_TextAloudpvi_textgrid of o 69 0.955
Sp1_TextAloudpvi_textgrid a a 229 1.074
Sp1_TextAloudpvi_textgrid bike i 137 0.503
Sp1_TextAloudpvi_textgrid rally a 71 0.635
Sp1_TextAloudpvi_textgrid rally i 79 0.107
Sp1_TextAloudpvi_textgrid showing o 57 0.324
Sp1_TextAloudpvi_textgrid showing i 55 0.036
Sp1_TextAloudpvi_textgrid a a 77 0.333
Sp1_TextAloudpvi_textgrid motorbike o 91 0.167
Sp1_TextAloudpvi_textgrid motorbike i 149 0.483
Sp1_TextAloudpvi_textgrid possibly o 46 1.056
Sp1_TextAloudpvi_textgrid possibly i 45 0.022
Sp1_TextAloudpvi_textgrid possibly i 84 0.605
Sp1_TextAloudpvi_textgrid belonging e 25 1.083
Sp1_TextAloudpvi_textgrid belonging o 72 0.969
Sp1_TextAloudpvi_textgrid belonging i 83 0.142
Sp1_TextAloudpvi_textgrid in i 38 0.744
Sp1_TextAloudpvi_textgrid the e 55 0.366
Sp1_TextAloudpvi_textgrid German e 111 0.675
Sp1_TextAloudpvi_textgrid German a 32 1.105
Sp1_TextAloudpvi_textgrid army a 202 1.453
Sp1_TextAloudpvi_textgrid Actually a 66 1.015
Sp1_TextAloudpvi_textgrid Actually a 60 0.095
Sp1_TextAloudpvi_textgrid Actually i 130 0.737
Sp1_TextAloudpvi_textgrid the e 36 1.133
Sp1_TextAloudpvi_textgrid Montecarlo o 47 0.265
Sp1_TextAloudpvi_textgrid Montecarlo e 47 0.000
Sp1_TextAloudpvi_textgrid Montecarlo a 92 0.647
Sp1_TextAloudpvi_textgrid Montecarlo o 71 0.258
Sp1_TextAloudpvi_textgrid rally a 73 0.028
Sp1_TextAloudpvi_textgrid is i 55 0.281
Sp1_TextAloudpvi_textgrid an a 20 0.933
Sp1_TextAloudpvi_textgrid internationally i 61 1.012
Sp1_TextAloudpvi_textgrid internationally e 55 0.103
Sp1_TextAloudpvi_textgrid internationally a 54 0.018
Sp1_TextAloudpvi_textgrid internationally a 30 0.571
Sp1_TextAloudpvi_textgrid internationally a 42 0.333
Sp1_TextAloudpvi_textgrid famous a 122 0.976
Sp1_TextAloudpvi_textgrid famous ou 40 1.012
Sp1_TextAloudpvi_textgrid event e 75 0.609
Sp1_TextAloudpvi_textgrid event e 66 0.128
Sp1_TextAloudpvi_textgrid Dressed e 90 0.308
Sp1_TextAloudpvi_textgrid in i 62 0.368
Sp1_TextAloudpvi_textgrid the e 50 0.214
Sp1_TextAloudpvi_textgrid protective o 39 0.247
Sp1_TextAloudpvi_textgrid protective e 53 0.304
Sp1_TextAloudpvi_textgrid protective i 43 0.208
Sp1_TextAloudpvi_textgrid clothing o 116 0.918
Sp1_TextAloudpvi_textgrid clothing i 64 0.578
Sp1_TextAloudpvi_textgrid typically i 28 0.783
Sp1_TextAloudpvi_textgrid typically a 21 0.286
Sp1_TextAloudpvi_textgrid used u 180 1.582
Sp1_TextAloudpvi_textgrid for o 57 1.038
Sp1_TextAloudpvi_textgrid a a 41 0.327
Sp1_TextAloudpvi_textgrid wartime a 62 0.408
Sp1_TextAloudpvi_textgrid wartime i 98 0.450
Sp1_TextAloudpvi_textgrid motorbike o 80 0.202
Sp1_TextAloudpvi_textgrid motorbike o 58 0.319
Sp1_TextAloudpvi_textgrid motorbike i 125 0.732
Sp1_TextAloudpvi_textgrid this i 25 1.333
Sp1_TextAloudpvi_textgrid proud ou 84 1.083
Sp1_TextAloudpvi_textgrid father a 138 0.486
Sp1_TextAloudpvi_textgrid father e 142 0.029
Sp1_TextAloudpvi_textgrid and a 37 1.173
Sp1_TextAloudpvi_textgrid his i 34 0.085
Sp1_TextAloudpvi_textgrid daughter au 133 1.186
Sp1_TextAloudpvi_textgrid daughter e 136 0.022
Sp1_TextAloudpvi_textgrid are a 164 0.187
Sp1_TextAloudpvi_textgrid clearly ea 155 0.056
Sp1_TextAloudpvi_textgrid bike i 140 0.102
Sp1_TextAloudpvi_textgrid enthusiasts e 67 0.705
Sp1_TextAloudpvi_textgrid enthusiasts u 73 0.086
Sp1_TextAloudpvi_textgrid enthusiasts i 49 0.393
Sp1_TextAloudpvi_textgrid enthusiasts a 89 0.580
Sp1_TextAloudpvi_textgrid enduring e 97 0.086
Sp1_TextAloudpvi_textgrid enduring u 153 0.448
Sp1_TextAloudpvi_textgrid enduring i 61 0.860
Sp1_TextAloudpvi_textgrid damp a 95 0.436
Sp1_TextAloudpvi_textgrid weather ea 67 0.346
Sp1_TextAloudpvi_textgrid weather e 175 0.893
Sp1_TextAloudpvi_textgrid in i 94 0.602
Sp1_TextAloudpvi_textgrid order o 187 0.662
Sp1_TextAloudpvi_textgrid order e 164 0.131
Sp1_TextAloudpvi_textgrid to o 42 1.184
Sp1_TextAloudpvi_textgrid participate a 53 0.232
Sp1_TextAloudpvi_textgrid participate i 37 0.356
Sp1_TextAloudpvi_textgrid participate i 37 0.000
Sp1_TextAloudpvi_textgrid participate a 184 1.330
Sp1_TextAloudpvi_textgrid in i 88 0.706
Sp1_TextAloudpvi_textgrid this i 48 0.588
Sp1_TextAloudpvi_textgrid rally a 74 0.426
Sp1_TextAloudpvi_textgrid The e 37 0.667
Sp1_TextAloudpvi_textgrid bike i 183 1.327
Sp1_TextAloudpvi_textgrid is i 57 1.050
Sp1_TextAloudpvi_textgrid a a 89 0.438
Sp1_TextAloudpvi_textgrid military i 10 1.596
Sp1_TextAloudpvi_textgrid military i 49 1.322
Sp1_TextAloudpvi_textgrid beige ei 153 1.030
Sp1_TextAloudpvi_textgrid and a 124 0.209
Sp1_TextAloudpvi_textgrid it i 68 0.583
Sp1_TextAloudpvi_textgrid has a 93 0.311
Sp1_TextAloudpvi_textgrid some o 182 0.647
Sp1_TextAloudpvi_textgrid interesting i 28 1.467
Sp1_TextAloudpvi_textgrid interesting e 26 0.074
Sp1_TextAloudpvi_textgrid interesting e 58 0.762
Sp1_TextAloudpvi_textgrid interesting i 53 0.090
Sp1_TextAloudpvi_textgrid features ea 83 0.441
Sp1_TextAloudpvi_textgrid features u 123 0.388
Sp1_TextAloudpvi_textgrid such u 100 0.206
Sp1_TextAloudpvi_textgrid as a 146 0.374
Sp1_TextAloudpvi_textgrid the e 95 0.423
Sp1_TextAloudpvi_textgrid handlebars a 45 0.714
Sp1_TextAloudpvi_textgrid handlebars a 236 1.359
Sp1_TextAloudpvi_textgrid together o 33 1.509
Sp1_TextAloudpvi_textgrid together e 62 0.611
Sp1_TextAloudpvi_textgrid with i 26 0.818
Sp1_TextAloudpvi_textgrid side i 174 1.480
Sp1_TextAloudpvi_textgrid mirrors i 36 1.314
Sp1_TextAloudpvi_textgrid mirrors o 119 1.071
Sp1_TextAloudpvi_textgrid an a 82 0.368
Sp1_TextAloudpvi_textgrid old o 104 0.237
Sp1_TextAloudpvi_textgrid canvas a 52 0.667
Sp1_TextAloudpvi_textgrid canvas a 69 0.281
Sp1_TextAloudpvi_textgrid bag a 150 0.740
Sp1_TextAloudpvi_textgrid an a 55 0.927
Sp1_TextAloudpvi_textgrid authentic au 92 0.503
Sp1_TextAloudpvi_textgrid authentic e 62 0.390
Sp1_TextAloudpvi_textgrid authentic i 40 0.431
Sp1_TextAloudpvi_textgrid jerry e 73 0.584
Sp1_TextAloudpvi_textgrid can a 125 0.525
Sp1_TextAloudpvi_textgrid and a 113 0.101
Sp1_TextAloudpvi_textgrid an a 107 0.055
Sp1_TextAloudpvi_textgrid ammo a 101 0.058
Sp1_TextAloudpvi_textgrid ammo o 135 0.288
Sp1_TextAloudpvi_textgrid container o 51 0.903
Sp1_TextAloudpvi_textgrid container ai 117 0.786
Sp1_TextAloudpvi_textgrid container e 47 0.854
Sp1_TextAloudpvi_textgrid seen ee 85 0.576
Sp1_TextAloudpvi_textgrid as a 93 0.090
Sp1_TextAloudpvi_textgrid part a 99 0.063
Sp1_TextAloudpvi_textgrid of o 77 0.250
Sp1_TextAloudpvi_textgrid the e 84 0.087
Sp1_TextAloudpvi_textgrid sidecar i 142 0.513
Sp1_TextAloudpvi_textgrid sidecar a 73 0.642
Sp1_TextAloudpvi_textgrid outfit ou 121 0.495
Sp1_TextAloudpvi_textgrid outfit i 132 0.087
Sp1_TextAloudpvi_textgrid The e 39 1.088
Sp1_TextAloudpvi_textgrid man a 159 1.212
Sp1_TextAloudpvi_textgrid is i 73 0.741
Sp1_TextAloudpvi_textgrid wearing ea 80 0.092
Sp1_TextAloudpvi_textgrid wearing i 47 0.520
Sp1_TextAloudpvi_textgrid a a 75 0.459
Sp1_TextAloudpvi_textgrid crash a 44 0.521
Sp1_TextAloudpvi_textgrid helmet e 55 0.222
Sp1_TextAloudpvi_textgrid helmet e 63 0.136
Sp1_TextAloudpvi_textgrid but u 60 0.049
Sp1_TextAloudpvi_textgrid the e 33 0.581
Sp1_TextAloudpvi_textgrid woman o 55 0.500
Sp1_TextAloudpvi_textgrid woman a 47 0.157
Sp1_TextAloudpvi_textgrid only o 136 0.973
Sp1_TextAloudpvi_textgrid has a 91 0.396
Sp1_TextAloudpvi_textgrid a a 66 0.318
Sp1_TextAloudpvi_textgrid head ea 74 0.114
Sp1_TextAloudpvi_textgrid scarf a 147 0.661
Sp1_TextAloudpvi_textgrid on o 61 0.827
Sp1_TextAloudpvi_textgrid her e 96 0.446
Sp1_TextAloudpvi_textgrid head ea 112 0.154 0.551
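Each value in the PVI calculation column matches the absolute difference between a vowel's duration and that of the preceding vowel, divided by the mean of the two durations. The closing figure, 0.551, appears to be the mean of these pairwise terms; that reading is an assumption rather than something stated in the table. A minimal sketch of the calculation follows, with `pairwise_terms` and `pvi` as hypothetical helper names.

```python
from statistics import mean


def pairwise_terms(durations_ms):
    """Return |d_k - d_(k+1)| / mean(d_k, d_(k+1)) for each successive pair."""
    return [
        abs(a - b) / ((a + b) / 2)
        for a, b in zip(durations_ms, durations_ms[1:])
    ]


def pvi(durations_ms):
    """Mean of the pairwise terms (assumed to be the table's summary figure)."""
    return mean(pairwise_terms(durations_ms))


# First five vowel durations from the table: This, is, a, car-, -toon.
first_vowels = [53, 60, 65, 98, 195]
terms = pairwise_terms(first_vowels)
print([round(t, 3) for t in terms])  # [0.124, 0.08, 0.405, 0.662]
print(round(pvi(first_vowels), 3))
```

The four printed terms correspond to the first four entries of the PVI calculation column. Some studies report this index multiplied by 100; the table above keeps the unscaled ratio.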