

Language Sciences 47 (2015) 84–106

Contents lists available at ScienceDirect

Language Sciences

journal homepage: www.elsevier.com/locate/langsci

Manner asymmetries in Central Catalan pre-vocalic voicing

Patrycja Strycharczuk

Clinical Audiology Speech and Language Research Centre, Queen Margaret University, Queen Margaret University Drive, Edinburgh, EH21 6UU, UK

Article info

Article history:
Received 2 April 2014
Received in revised form 16 August 2014
Accepted 16 August 2014
Available online

Keywords:
Voicing
Sandhi
Catalan
Sound change
Naturalness

E-mail address: [email protected].

http://dx.doi.org/10.1016/j.langsci.2014.08.004
0388-0001/© 2014 Elsevier Ltd. All rights reserved.

Abstract

In Central Catalan, word-final sibilants and stop + sibilant clusters undergo voicing when followed by a vowel in the next word. In word-final pre-vocalic singleton stops, however, no voicing is observed. This asymmetry is unexpected from the point of view of phonetic naturalness, and it invites a closer empirical investigation, especially in the light of recent phonetic findings which challenge earlier descriptions of a similar voicing process before sonorant consonants in Catalan. This study presents a systematic acoustic comparison of voicing in pre-vocalic stops /p, b/, sibilants /s, z/ and stop + sibilant clusters /bz/ with voicing before obstruents and before sonorant consonants. The results show that while pre-sonorant voicing is limited and highly variable, pre-vocalic sibilant and cluster voicing are both robust categorical processes for most speakers. Drawing on evidence from how voicing is realised in a subset of gradient cases, I propose that pre-vocalic cluster voicing developed from pre-vocalic sibilant voicing which in turn descended from an earlier intervocalic voicing process. I further question whether the outcome of the sound changes in Catalan can be synchronically analysed using grounded markedness constraints, and to what extent the pre-vocalic voicing may be regulated by universal vs. language-specific mechanisms of phonological abstraction.


1. Introduction

1.1. Descriptive reports on Catalan voicing

Central Catalan voice assimilation is a challenging case to consider in the context of naturalness, as it presents a complex picture of manner interactions between trigger and undergoer, which are phonetically motivated to different degrees. In clusters of two obstruents, regressive voice assimilation applies, where the [voice] value is determined by the rightmost segment in the cluster, as shown in (1).

(1) Voice assimilation in obstruent + obstruent clusters (Cuartero Torres, 2001; Wheeler, 2005)

    a. Assimilation across word boundaries
       nap gran        [bg]          ‘big turnip’
       gos danès       [zð]          ‘Danish dog’
       set terrible    [tː]          ‘terrible thirst’
       cas terrible    [st]          ‘terrible case’

    b. Word-internal clusters
       futbol          [db] ~ [bː]   ‘football’
       afgà            [vɣ]          ‘Afghan’
       dubte           [pt]          ‘doubt’
       tastar          [st]          ‘taste’

In addition to the cross-linguistically common voice assimilation in obstruent clusters, Catalan has also been reported to have voicing in domain-final (word-final, syllable-final and prefix-final) obstruents followed by sonorant consonants, as shown in (2). This type of voicing is potentially phonetically motivated, since it applies in a sequence of consonants where the first is phonetically voiceless, and the second is phonetically voiced. Yet, pre-sonorant voicing is relatively less common in a cross-linguistic perspective (Jansen, 2004; Strycharczuk, 2012), and it does not seem to apply independently of voice neutralisation and voice assimilation in obstruent clusters.

(2) Voice assimilation in obstruent + sonorant clusters (Cuartero Torres, 2001; Wheeler, 2005)

    a. Assimilation across word boundaries
       sap riure       [br]   ‘knows how to laugh’
       poc whisky      [gw]   ‘not much whiskey’
       cas notable     [zn]   ‘remarkable case’
       mateix líquid   [ʒl]   ‘same liquid’

    b. Word-internal clusters
       hipnosi         [bn]   ‘hypnosis’
       ovni            [vn]   ‘UFO’
       esnob           [zn]   ‘snob’

Most languages which show pre-sonorant voicing also have obstruent voicing before vowels. This has been found, for instance, in West-Flemish (De Schutter and Taeldeman, 1986; Weijnen, 1991; Simon, 2010; Strycharczuk and Simon, 2013), Ecuadorian Spanish (Robinson, 1979; Lipski, 1989; Strycharczuk et al., 2013) and Slovak (Blaho, 2008; Bárkányi and Kiss, 2012). Catalan also shows pre-vocalic voicing, but according to Wheeler (2005), pre-vocalic voicing only applies in sibilants, but not in any other fricatives or stops. Examples are in (3).¹

(3) Word-final obstruent voicing before a vowel (Wheeler, 2005)

    a. Non-sibilants
       sap ajudar       [p]    ‘knows how to help’
       llarg any        [k]    ‘long year’
       xef únic         [f]    ‘sole chef’

    b. Sibilants
       cas extrem       [z]    ‘extreme case’
       mateix element   [ʒ]    ‘same element’
       mig any          [d͡ʒ]   ‘half a year’

The pre-vocalic voicing pattern of fricatives described above occurs not only at the end of a word, but also prefix-finally, as illustrated in (4).

(4) Prefix-final voicing (Bonet and Lloret, 1998; Bermúdez-Otero, 2001)
       des-estimar      [z]    ‘to rule out’
       sots-arrendar    [d͡z]   ‘to sublet’

While pre-vocalic voicing does not apply to singleton stops, it does apply to stop + sibilant clusters, as shown in (5).

(5) Pre-vocalic voicing in consonant clusters with a final sibilant (Bonet and Lloret, 1998; Wheeler, 2005)
       cops amagats     [bz]    ‘hidden blows’
       amics íntims     [gz]    ‘close friends’
       discs antics     [zgz]   ‘old discs’

1 The example from Wheeler (2005) in (3) groups labiodental fricatives with stops in resisting pre-vocalic voicing. According to Bonet and Lloret (1998), there is pre-vocalic voicing in labio-dental fricatives. However, both sources acknowledge that their generalisations do not take into account that pre-vocalic voicing is variable for labiodental fricatives. Such variation is also noted by Recasens (1993) and Recasens and Mira (2012).

In addition to the voice assimilation processes discussed above, Catalan also shows voice neutralisation in coda position. The laryngeal contrast in Catalan is generally limited to obstruents, including stops (p, b, t, d, k, g), fricatives (f, v, s, z, ʃ, ʒ) and affricates (t͡s, d͡z, t͡ʃ, d͡ʒ), and it is only observed in onset position. In coda position, the laryngeal contrast is neutralised, as illustrated in (6).

(6) Voice neutralisation and contrast (Wheeler, 2005)

    Final devoicing                             Voicing contrast
    sap      [ˈsap]       ‘(he/she) knows’      sabem      [sə.ˈbɛm]                ‘we know’
    tip      [ˈtip]       ‘fed up, MASC.’       tipa       [ˈti.pə]                 ‘fed up, FEM.’
    cas      [ˈkas]       ‘case’                casos      [ˈka.zus]                ‘cases’
    braç     [ˈbras]      ‘arm’                 braços     [ˈbɾa.sus]               ‘arms’
    serf     [ˈserf]      ‘serf, MASC.’         serfa      [ˈser.bə] ~ [ˈser.və]    ‘serf, FEM.’
    buf      [ˈbuf]       ‘puff’                bufar      [bu.ˈfa]                 ‘to puff’
    mig      [ˈmit͡ʃ]      ‘half, MASC.’         mitja      [ˈmid.d͡ʒə]               ‘half, FEM.’
    despatx  [dəs.ˈpat͡ʃ]  ‘office’              despatxos  [dəs.ˈpa.t͡ʃus]           ‘offices’

2 The reader is referred to Ernestus (2011) and Strycharczuk (2012, Ch. 2) for recent surveys of the gradience-categoricity distinction and the phonetics–phonology interface.

1.2. Previous phonetic studies

The sources introduced in Section 1.1 above mostly treat the voicing processes they describe in categorical terms. However, recent phonetic data deliver new evidence on regressive voicing in Catalan, challenging aspects of the earlier impressionistic descriptions.

Some of the debate around the description of Catalan voice assimilation concerns the categorical or gradient status of voice assimilation in obstruent clusters. This, in turn, pertains to a larger issue of whether the voicing is interpreted as a phonetic or a phonological process.² In a study with two native speakers of Catalan, Cuartero Torres (2001) analysed the ratio of voicing duration to obstruent duration in obstruent + obstruent clusters with systematically varied [voice] specifications. The voicing duration was extracted from the electroglottographic (EGG) signal, whereas the obstruent duration was analysed based on time-aligned electropalatographic (EPG) data. One of the speakers in the experiment was found to mostly produce categorical assimilation in obstruent clusters. The other speaker, however, showed evidence of partial, gradient voicing in underlyingly voiceless obstruents followed by a voiced obstruent. Nevertheless, Cuartero Torres argues that voice assimilation in obstruents is a phonological rule in Catalan, even though voicing in clusters may be phonetically incomplete due to aerodynamic difficulties in sustaining voicing throughout prolonged constriction. A different position is taken by Recasens and Mira (2012), who analyse voicing ratio in CC clusters, using time-aligned EGG and acoustic signals. Recasens & Mira report that voicing in /p, t, k/ followed by a voiced C2 is typically incomplete, and that the mean voicing ratio is relatively low, averaging 58.7% (for comparison, Snoeren et al. (2006) report an average voicing ratio of 76.3% in French). Furthermore, the voicing ratio is subject to large inter-speaker variation. Based on these observations, Recasens & Mira argue that regressive voicing in Catalan is a gradient process, conditioned by the interaction between the degree to which C1 may be prone to undergo voicing (coarticulatory resistance) and the degree to which C2 may exert a coarticulatory influence (coarticulatory aggressiveness). A similar line of argumentation is pursued by Recasens and Mira (2013), based on EGG data from voice assimilation in three-way consonant clusters. For these clusters, Recasens & Mira find a main effect of the following consonant on the voicing ratio of preceding CC clusters, but the effect was relatively limited, and affected C2 more than C1.

When it comes to sonorants as triggers of voice assimilation, existing phonetic findings do not entirely confirm the descriptive data. Cuartero Torres (2001) analysed sequences of obstruents followed by laterals and nasals, finding much less systematic voicing patterns compared to obstruent + obstruent clusters. While there were instances of complete voicing in Cuartero Torres’s data for some stop + nasal sequences, partially voiced and voiceless realisations were also found. In addition, voicing was overall limited in fricative + sonorant sequences. Similarly, Recasens and Mira (2012) found very little regressive voicing in consonants followed by nasals or laterals (38.6%–45.4%). For CCC clusters, Recasens and Mira (2013) found on average more voicing in word-final CC clusters followed by voiced stops /b, d/ than in corresponding clusters followed by nasals or laterals. Carbonell (1992) reports that voicing before nasals may apply categorically, but only for some speakers, which appears consistent with the interpretation of pre-nasal voicing as an optional but categorical rule. All these findings question the generalisation stated in (2), that regressive voicing in Catalan applies categorically to pre-sonorant obstruents. Instead, it would appear that a gradient or a categorical but optional interpretation of this voicing rule may be more appropriate.

Another noteworthy finding which follows from recent research into Catalan voicing concerns the relative susceptibility of voicing in stops vs. fricatives, which corresponds to the degree of coarticulatory resistance in Recasens & Mira’s terms. Recasens and Mira (2012) report that fricatives /f, s, ʃ/ showed an overall lower voicing degree than stops /p, t, k/. The authors do not report whether stops are more susceptible to voicing than fricatives before voiced obstruents, before sonorants, or in both contexts. Although Recasens & Mira note that this finding is consistent with previous results by Slis (1986) for Dutch, the fricative resistance to voicing is surprising given what has been reported for stop and fricative voicing before a following vowel. Recall that, according to existing descriptions exemplified in (3), only fricatives but not stops undergo voicing before a following vowel. Therefore, the phonetic asymmetry in susceptibility to voicing observed by Recasens and Mira (2012) goes against the asymmetry found in a potentially phonologised voicing pattern in the pre-vocalic context.

Importantly, the phonological asymmetry between pre-vocalic stop and fricative voicing is not assured, as no systematic data seem to exist on the relative degree of voicing in singleton coda stops and fricatives before a vowel. Recasens and Mira (2012) included some contexts with word-final fricatives before a vowel, and they do find high voicing ratios for word-final /s/ (82.7%) and /ʃ/ (82.0%). However, they do not systematically report how this effect compares to fricative voicing in other environments, such as before a following sonorant or a voiced obstruent. More importantly, their study did not include word-final pre-vocalic stops. Similarly, no phonetic data seem to exist on voicing in word-final stop + sibilant clusters followed by a vowel in the next word. Therefore, the existing descriptions of pre-vocalic voicing effects in Catalan have yet to be experimentally verified. An empirical confirmation of the pre-vocalic voicing descriptions is also called for given that descriptive and experimental sources diverge considerably when it comes to the extent of voice assimilation before voiced obstruents and sonorants.

1.3. Issues related to pre-vocalic voicing

The generalisations concerning pre-vocalic voicing deserve closer scrutiny not only for the sake of adequate description, but also because pre-vocalic voicing in Catalan presents an unusual case of tension between phonetic and phonological naturalness. Recall that word-final (and morpheme-final) sibilants and stop + sibilant clusters undergo voicing before a vowel, but no such voicing applies to singleton stops, as schematised in (7).

(7) Summary of voicing generalisations for word-final prevocalic obstruents

    Data summary         Reference
    /s#V/  → [zV]        (3b)
    /ps#V/ → [bzV]       (5)
    /p#V/  → [pV]        (3a)

From a phonetic point of view, the occurrence of voicing in stop + sibilant clusters but not in singleton stops is unnatural. Obstruent clusters are inherently more resistant to voicing than singleton obstruents, which is related to the aerodynamic requirements on initiating and sustaining voicing. Vocal fold vibration requires a sufficient pressure drop across the glottis to be sustained. However, intraoral air pressure rises continuously throughout prolonged constriction, eventually quenching voicing. A common phonetic consequence of these aerodynamic restrictions is that voicing ceases at some point during cluster production, even if the production had been planned as voiced. This has been observed for Catalan by Cuartero Torres (2001), who notes that clusters are particularly prone to devoicing, subject to inter-speaker variation. Considering the production difficulty involved in cluster voicing, the preference for voicing in pre-vocalic clusters compared to pre-vocalic stops goes directly against phonetic motivation.

Even though cluster voicing is phonetically unnatural, it can be considered phonologically natural, seeing how it can be modelled as a feeding interaction. Pre-vocalic cluster voicing follows from the activity of two independent rules of Catalan phonology: voicing of pre-vocalic fricatives and regressive voice assimilation in obstruent clusters, where the former rule feeds the latter. This relationship can also be easily modelled in a constraint-based approach, such as Optimality Theory. A number of formal analyses for the pattern have been proposed within this framework, including Bermúdez-Otero (2001), Wheeler (2005) and Jiménez and Lloret (2008). While the individual analyses differ in some aspects, they all postulate that word-final pre-vocalic fricatives in Catalan undergo voicing as a result of feature spreading or constraints on feature agreement. The preference for pre-vocalic fricative voicing is seen as a markedness relationship formalised by means of specific constraints. Bermúdez-Otero (2001) proposes the constraint *CONTVOILAG which requires fricative but not stop voicing before a vowel. Wheeler (2005) introduces the constraint LAZYSIBILANTS, which requires word-final fricatives to be voiced before vowels. Jiménez and Lloret (2008) postulate a markedness hierarchy of NO-LINK-VC constraints, which prohibit the sharing of a feature [+voice] between a vowel and a preceding obstruent, depending on the obstruent’s degree of constriction (Jiménez, 1999). NO-LINK-VC-Ocl is said to be high-ranked in this hierarchy, prohibiting voicing of post-vocalic stops, but allowing voicing in segments of relatively lesser constriction, such as fricatives.
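The feeding order described above can be made concrete with a small rule-based sketch. It is only an illustration under simplifying assumptions and is not taken from any of the cited analyses: segments are single characters, `#` marks a word boundary, the segment inventory and the function names (`voice_prevocalic_sibilant`, `regressive_assimilation`, `derive`) are hypothetical, and final devoicing is not modelled.

```python
# Toy inventory (an assumption for illustration): voiceless obstruent -> voiced counterpart.
VOICED = {"p": "b", "t": "d", "k": "g", "s": "z", "f": "v"}
VOICED_OBSTRUENTS = set(VOICED.values())
SIBILANTS = {"s", "z"}
VOWELS = set("aeiou")


def voice_prevocalic_sibilant(segs):
    """Rule A: voice a word-final sibilant when the next word begins with a vowel."""
    out = list(segs)
    for i in range(len(out) - 2):
        if out[i] in SIBILANTS and out[i + 1] == "#" and out[i + 2] in VOWELS:
            out[i] = VOICED.get(out[i], out[i])
    return out


def regressive_assimilation(segs):
    """Rule B: a voiceless obstruent voices before a voiced obstruent (scanning right to left)."""
    out = list(segs)
    for i in range(len(out) - 1, -1, -1):
        j = i + 1
        if j < len(out) and out[j] == "#":  # assimilation may cross a word boundary
            j += 1
        if j < len(out) and out[i] in VOICED and out[j] in VOICED_OBSTRUENTS:
            out[i] = VOICED[out[i]]
    return out


def derive(form):
    """Apply Rule A first so that its output can feed Rule B."""
    return "".join(regressive_assimilation(voice_prevocalic_sibilant(list(form))))


for underlying in ("s#a", "ps#a", "p#a"):
    print(underlying, "->", derive(underlying))
```

In this sketch the stop in `ps#a` is voiced only because Rule A has already voiced the sibilant; Rule B applied on its own leaves `ps#a` unchanged, which is the sense in which the first rule feeds the second.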

All the analyses cited above are descriptively adequate in capturing the generalisations concerning Catalan pre-vocalic voicing, and they all succeed in modelling cluster voicing as resulting from an interaction of constraints that play a role in other independently occurring phonological processes in Catalan. A potentially important implication follows from this, namely that phonologically natural processes applying in a feeding order may produce a phonetically unnatural outcome. This implication, however, only holds insofar as regressive voice assimilation and pre-vocalic sibilant voicing are considered both phonological and natural. As far as voice assimilation is concerned, its naturalness is fairly uncontroversial, as regressive voicing is both cross-linguistically common and phonetically motivated (see Jansen, 2004 for extensive discussion). However, the phonological status of this process has been brought into question. As discussed in Section 1.2, Recasens and Mira (2012) propose that voicing in obstruent clusters is a gradient phonetic process. When it comes to pre-vocalic sibilant voicing, the OT analyses discussed above seem to assume that the process is natural and regulated by markedness constraints or markedness hierarchies. The authors justify the constraints and constraint hierarchies they use mainly based on typological evidence. Bermúdez-Otero (2001) cites the case of the Kentish dialect of Middle English where Old English [f, s, θ] underwent voicing to [v, z, ð] (e.g. OE fæder > ME vader). Jiménez and Lloret (2008), who link voicing to degree of stricture, look for evidence in the typology of Catalan dialects. They report that Catalan dialects form a continuum with respect to which obstruents undergo pre-vocalic voicing. At one end of the spectrum, the Alicante dialect voices all word-final pre-vocalic obstruents, including stops. Central Catalan voices all obstruents in the same position, excluding stops (as well as variably the labial fricative). Two Valencian dialects are relatively more restrictive than Central Catalan, with voicing applying pre-vocalically in sibilant fricatives, but not in stops, labial fricatives, or affricates. The typology is completed by Central Valencian, where pre-vocalic obstruent voicing does not occur at all. Wheeler (2005) notes that sibilant liaison is a common process in Romance, as seen in Portuguese, Catalan, Occitan and French, and that it reflects a series of sound changes where singleton intervocalic obstruents underwent lenition, which was followed by a simplification of geminates. Unfortunately, relying on typological evidence alone in support of constraint formulation is prone to circularity: markedness constraints are introduced on the basis of typological observations, and then used to account for the very same typology (see also Gurevich, 2001 and Haspelmath, 2006 for discussion on this point). The grounding of constraints would be more convincing if, like in the case of voice assimilation in clusters, there was also some functional preference. However, such functional motivation is problematic when it comes to the preference for pre-vocalic sibilant voicing compared to pre-vocalic stop voicing. As discussed in Section 1.2, Recasens and Mira (2012) show that in Catalan, in other environments than pre-vocalically, stops show more phonetic voicing than fricatives. In addition, the question of whether pre-vocalic sibilant and cluster voicing are indeed categorical processes has not been systematically addressed by previous phonetic studies on Catalan voicing, and hence the phonological status of pre-vocalic voicing yet awaits empirical confirmation.

1.4. The current study

Pre-vocalic voicing in Central Catalan has potentially vital consequences for our understanding of phonetic and phonological naturalness and of how the two interact. However, the discussion of the relevant theoretical points must be informed by more phonetic detail. The present study undertakes a phonetic investigation of voicing in Catalan with two aims in mind. One of these aims is descriptive. Despite the relatively high number of phonetic investigations into Catalan voicing contrast and voicing (Dinnsen and Charles-Luce, 1984; Charles-Luce, 1993; Cuartero Torres, 2001; Recasens and Mira, 2012, 2013, inter alia), no study has as yet investigated the extent of pre-vocalic voicing depending on the undergoer. The second aim is to gain a better understanding of voicing processes in Catalan through relating some formal tools used in the existing analyses to phonetic evidence. The experimental design and analysis were guided by the following research questions:

1. Does voicing apply to the same extent preceding vowels, voiced obstruents and sonorant consonants?
2. How does the effect of the following environment interact with the manner of articulation of the undergoer?
3. Is there any evidence for the relative preference for voicing in sibilants as opposed to stops?

2. Material and methods

Production data were collected from six native speakers of Central Catalan from the Barcelona region. They were all bilingual, using Catalan and Spanish on a daily basis. The selection criterion for the participation in the experiment was that Catalan had been the language spoken at home. Five of the speakers were female and one was male. The age range was 18–32. The speakers were not aware of the purpose of the experiment and they were not paid for participation.

The test items were words ending in obstruents or obstruent clusters (/p/, /b/, /s/, /z/, /bz/) followed by a vowel or a sonorant consonant in the next word. An equal number of underlyingly voiced and voiceless tokens were used in the case of singleton stops and sibilants. Stop + sibilant clusters all had an underlyingly voiced obstruent followed by a morphological marker -s. Place of articulation of the undergoer was not strictly controlled for, since labial stops and coronal sibilants were used.³ This was necessary in order to avoid using word-final coronal stops, which undergo place assimilation before following laterals and nasals (Wheeler, 2005, p. 167). Labial fricatives were also avoided as pre-vocalic voicing in their case had been reported as highly variable (Recasens, 1993; Bonet and Lloret, 1998). For every type of undergoer three vocalic contexts were included ([i], [ə], [u]), and they were varied systematically. The sonorant contexts included a following /m/, /n/, /r/, /l/, and /ʎ/. The test items were embedded in the same carrier phrase. An example is in (8).

(8) Example stimulus sentence
       Digui ‘sap anglès’ un altre cop.
       ‘Say ‘knows English’ one more time.’

3 In discussing present results, I refer to ‘sibilants’ and ‘stops’, but the reader should keep in mind that only alveolar sibilants and labial stops were tested.

In addition, 20 tokens of word-final obstruents followed by an obstruent (/t/, /d/, /s/, /z/) in the next word were included as controls. For ease of segmentation following fricatives were used when the word-final obstruent was a plosive, whereas word-final sibilants were placed in the context of a following stop. Similarly to the test items, the control items were embedded in a carrier phrase.

The recordings were made in a sound-treated room on a Pioneer PDR-609 CD recorder, using a Sennheiser MKH20 P-48 microphone. The speakers were positioned 30 cm away from the microphone and instructed to read the sentences at a comfortable rate. They were encouraged to correct themselves if they made a mistake. The test items were randomised for each speaker and presented on a computer screen, one at a time. The experiment was self-timed. The speakers read four repetitions of the stimuli.

The audio data were sampled at 44.1 kHz. Segmentation and acoustic analysis were carried out in Praat (Boersma and Weenink, 2009) on a 5 ms Gaussian window (spectrogram bandwidth 260 Hz). Boundaries were inserted manually based on visual analysis of the spectrograms. Altogether (40 (test stimuli) + 10 (control stimuli)) * 6 (speakers) * 4 (repetitions) = 1200 utterances were recorded. 78 utterances were excluded due to deletions, mispronunciations, segmentation difficulties, instances of pre-vocalic glottalisation, or place/manner assimilation (e.g. /s#r/ → [rː]). This left 1122 test utterances for analysis.
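As a quick check on the counts just given, the arithmetic can be reproduced in a few lines. The variable names are illustrative only; the numbers are those reported in the text.

```python
# Counts as reported in the text.
test_stimuli = 40
control_stimuli = 10
speakers = 6
repetitions = 4
excluded = 78

recorded = (test_stimuli + control_stimuli) * speakers * repetitions
analysed = recorded - excluded

print(recorded)  # 1200
print(analysed)  # 1122
```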

The following acoustic measurements were taken.

1. Duration of glottal pulsing during closure (voicing duration). Previous research shows that the presence of vocal fold vibration is the primary cue to voicing in Catalan (Cuartero Torres, 2001; Recasens and Mira, 2012, 2013). Duration of glottal pulsing was measured manually based on the presence of the voicing bar in the spectrogram and periodicity in the waveform. Absence of voicing was coded as 0. In cases where partial voicing was present in obstruents, the periodic oscillations visible in the waveform would fade away gradually towards the end of the voiced portion. In such cases, the offset of voicing was placed at the last pulse in the sequence as detected by Praat (cf. Fig. 1). For male speakers the pitch range was set to 75–300 Hz, and for female speakers it was set to 100–500 Hz, as recommended in the Praat manual.

2. Obstruent duration. In the case of sibilants, the boundaries were placed to coincide with the onset and offset of high-frequency noise visible in the spectrogram. This included periods of relatively weak frication which sometimes occurred, e.g. preceding nasals. For stops the duration measurement was taken to correspond to the closure phase. Stop closure was measured from the offset of the formant structure of the preceding vowel to the offset of closure (which coincided with the onset of burst for released stops and the onset of the following segment where no clear burst occurred). Example segmentations of obstruents are illustrated in Fig. 2. Duration measurements were taken primarily with the aim of contextualising voicing duration. However, in some languages segmental duration has been shown to be a correlate of voicing (Chen, 1970; Kluender et al., 1988).

3. Vowel duration. Charles-Luce (1993) reports that in some cases vowel lengthening is observed before underlyingly voiced obstruents in word-final position. Similar effects have been reported for a number of other languages, notably by studies of incomplete neutralisation (e.g. Port and O’Dell, 1985 for German, Warner et al., 2004 for Dutch). Vowel duration was measured manually. The beginning of the vowel was placed at the beginning of the formant structure for vowels preceded by an obstruent. Occasionally, initial obstruents underwent lenition, and were realised as approximants. If an abrupt transition in amplitude could be seen in such cases, such a transition would be taken as a vowel onset. Where no such abrupt transition was present, the initial vowel boundary was placed at the onset of the formant steady state. The offset of the vowel was taken to correspond to the onset of the following obstruent.

4. f0 and f1 preceding the obstruent (20 ms and 10 ms before the onset of closure/frication). f0 and f1 lowering has been observed preceding voiced stops in Dutch by Jansen (2004). f0 and f1 measurements in the following segment were not taken, since some of the following segments involved voiceless obstruents, where the measurements would be impossible or unreliable. f0 and f1 were measured using the autocorrelation and Burg algorithms in Praat, respectively.

Fig. 1. An example of partial initial voicing. The voicing offset in cases like this was placed to coincide with the last visible pulse detected by Praat.

Fig. 2. Example segmentations.


All of the duration measurements, including voicing, stop closure, burst, and vowel duration, were taken in milliseconds. f0 and f1 were measured in Hertz.

3. Results

3.1. Realisation of voicing

In the cases where obstruent voicingwas present in the data, it was realised in one of the three followingways. In 55% of allthe test utterances, voicing was present in the initial part of the obstruent and then it gradually faded away to be resumedduring the following voiced sound. An example of such initial partial voicing is illustrated in Fig. 3 (see also Fig. 1 and thediscussion of segmentation criteria above).

In some cases, voicing continued uninterrupted throughout closure or constriction, as illustrated in Fig. 4. This was found for 38% of the test items. Continuous voicing was never found in obstruents preceding a voiceless obstruent, but there were instances of continuous voicing preceding other types of triggers (voiced obstruents, vowels and sonorants) in combination with any assimilation undergoer (stops, sibilants, stop + sibilant clusters).

Finally, there were some infrequent cases (3% of the data) where the voicing was present initially, gradually faded away, and then reappeared in the latter part of the constriction. An example of interrupted voicing is given in Fig. 5. This kind of interrupted voicing was only identified in sibilants and stop + sibilant clusters, but never in stops. Fig. 6 illustrates the voicing ratio in partially voiced tokens, i.e. tokens where the obstruent voicing stopped before the end of constriction, or where it was interrupted. The figure shows that the vast majority of partially voiced tokens had some initial voicing. The median voicing ratio was ca. 0.25, but in some cases the voicing extended for longer than that, continuing for more than 50% of the obstruent. Anticipatory voicing, on the other hand, occurred very infrequently, as all instances of anticipatory voicing were outliers. In addition, all the instances of anticipatory voicing involved interrupted voicing, with voicing present at the beginning and end of the constriction.

In the remaining 4% of the data, no voicing during constriction could be identified.
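The four realisation types described above can be made explicit with a short sketch. It assumes, hypothetically, that each token is available as the obstruent's start and end times (in ms) plus a list of non-overlapping voiced intervals, for instance derived from the pulses detected by Praat. The function names, the "other" label and the 5 ms edge tolerance are illustrative assumptions and not part of the segmentation procedure used in the study.

```python
def clip_voicing(obstruent, voiced):
    """Restrict voiced intervals (ms) to the obstruent's constriction."""
    start, end = obstruent
    clipped = []
    for s, e in sorted(voiced):
        s, e = max(s, start), min(e, end)
        if e > s:
            clipped.append((s, e))
    return clipped


def voicing_ratio(obstruent, voiced):
    """Duration of vocal fold vibration divided by obstruent duration."""
    start, end = obstruent
    voiced_ms = sum(e - s for s, e in clip_voicing(obstruent, voiced))
    return voiced_ms / (end - start)


def classify_voicing(obstruent, voiced, tol=5.0):
    """Label one token; tol (ms) is a hypothetical edge tolerance."""
    start, end = obstruent
    clipped = clip_voicing(obstruent, voiced)
    if not clipped:
        return "none"
    starts_voiced = clipped[0][0] <= start + tol
    ends_voiced = clipped[-1][1] >= end - tol
    if len(clipped) == 1:
        if starts_voiced and ends_voiced:
            return "continuous"
        # A single stretch that does not start at the left edge is
        # left unclassified in this sketch.
        return "initial partial" if starts_voiced else "other"
    # Two or more stretches: voicing fades and later reappears.
    return "interrupted" if starts_voiced and ends_voiced else "other"
```

For a 100 ms obstruent with voicing during the first 25 ms only, `classify_voicing((0.0, 100.0), [(0.0, 25.0)])` gives the label "initial partial" and `voicing_ratio` gives 0.25.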

3.2. Voicing-related acoustic parameters

The data were analysed using linear mixed-effects regression. The analysis was run in R (R Development Core Team, 2005), using the lme4 package (Bates and Maechler, 2009). A series of models were fitted to predict the realisation of the following acoustic measurements associated with the voicing contrast: voicing duration, ratio of the duration of vocal fold vibration to obstruent duration (voicing ratio), obstruent duration, duration of the preceding vowel, f0 and f1 at 10 and 20 ms before the onset of the obstruent. For all the models, the effects of speaker and item were treated as random. The fixed effects considered in the modelling included: the obstruent’s manner of articulation (whether it was a sibilant, a stop or a cluster), manner of articulation of the following segment (whether it was a sonorant consonant, a vowel, or a voiced/voiceless obstruent), and the underlying voicing of the obstruent. The obstruent’s manner of articulation is abbreviated as ‘undergoer’, and the manner of articulation of the following segment is abbreviated as ‘trigger’ in tables and figures throughout this section. In the process of modelling, individual predictors were added to the model one by one. Predictors were retained in the fixed part of the model, or as a random slope within speaker/item, only if they significantly improved the model fit according to the log-likelihood test. The best fitting models for every variable are presented in this section. For every model a table of coefficients is included, summarising the b values, as well as the standard error, and the t-values. Significant interactions are represented graphically in Figs. 7–10. The lines used to connect factor levels in these figures are there to improve the legibility of the plots, and they should not be taken as an indication that the observations are connected.

Fig. 3. Initial partial voicing.

Fig. 4. Uninterrupted voicing.

3.2.1. Voicing duration

The best fitting model of voicing duration had an interaction between the manner of articulation of the trigger and of the undergoer in the fixed part. The random part of the model included random intercepts for speaker and item, as well as a random slope for the undergoer’s manner of articulation within speaker. The model was not significantly improved by the inclusion of the undergoer’s underlying voicing as a predictor. The model summary is in Table 1.

The main effect of the undergoer’s manner of articulation was overall very small, and the associated small t-values (|t| < 0.5) suggest that it was also not significant. There were, however, relatively large differences in voicing duration depending on the trigger. Voicing duration was considerably greater before a voiced obstruent than before a sonorant (b = 20.70, SE = 6.97, t = 2.97). At the same time, pre-sonorant obstruents were associated with increased duration of vocal fold vibration compared to obstruents followed by voiceless obstruents (b = −23.92, SE = 6.93, t = −3.45). Vowels triggered on average more voicing in preceding obstruents than sonorants did (b = 23.10, SE = 5.23, t = 4.41). This last effect was also found to interact with the undergoer’s manner of articulation. The interaction is illustrated in Fig. 7. While within sibilants and clusters there was more voicing before a vowel than before a sonorant, the opposite effect was found in stops, where there was virtually no pre-vocalic voicing; pre-vocalic stops patterned with stops followed by voiceless obstruents in terms of voicing duration. The duration of voicing did not vary with the undergoer in the context of a sonorant or in the context of a voiceless stop. Before voiced obstruents, clusters showed increased duration of voicing compared with both singleton sibilants and stops. Pre-vocalic clusters also showed greater voicing duration than pre-vocalic sibilants.

Fig. 5. Interrupted voicing.

Fig. 6. Voicing ratio in partially voiced obstruents as a function of where the voicing was located.

The fit of the model improved significantly with the inclusion of a random effect of undergoer within speaker, which indicates that there were individual differences with respect to how much voicing was found on average in sibilants, stops and clusters. I will return to the issue of individual variation in Section 3.3.
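Because the model is treatment-coded with a word-final sibilant before a sonorant as the intercept, the fitted mean for any undergoer/trigger combination is the sum of the relevant terms in Table 1. The following sketch does that arithmetic; the coefficient values are copied from Table 1, while the variable and function names are assumptions introduced here for illustration, and the sums ignore the random effects.

```python
# Fixed-effect estimates copied from Table 1 (voicing duration, ms).
INTERCEPT = 46.25  # word-final sibilant followed by a sonorant
UNDERGOER = {"sibilant": 0.0, "cluster": 2.10, "stop": -2.37}
TRIGGER = {
    "sonorant": 0.0,
    "voiced obstruent": 20.70,
    "voiceless obstruent": -23.92,
    "vowel": 23.10,
}
INTERACTION = {
    ("cluster", "voiced obstruent"): 31.30,
    ("stop", "voiced obstruent"): -1.78,
    ("cluster", "voiceless obstruent"): 0.21,
    ("stop", "voiceless obstruent"): 6.37,
    ("cluster", "vowel"): 42.91,
    ("stop", "vowel"): -42.04,
}


def predicted_voicing_duration(undergoer, trigger):
    """Sum the treatment-coded terms for one undergoer/trigger cell."""
    return (INTERCEPT + UNDERGOER[undergoer] + TRIGGER[trigger]
            + INTERACTION.get((undergoer, trigger), 0.0))


for undergoer in UNDERGOER:
    for trigger in ("vowel", "voiceless obstruent"):
        ms = predicted_voicing_duration(undergoer, trigger)
        print(f"{undergoer:8s} before {trigger:19s}: {ms:6.2f} ms")
```

Under these estimates, pre-vocalic stops come out at about 24.9 ms, close to the 26.3 ms obtained for stops before voiceless obstruents, whereas pre-vocalic sibilants and clusters come out at about 69.4 ms and 114.4 ms respectively.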

3.2.2. Voicing ratio

A linear mixed-effects regression model was fitted to the data predicting the ratio of voicing duration to obstruent duration. The best fitting model of voicing ratio had an interaction of the undergoer’s and the trigger’s manner of articulation in its fixed part. The model did not improve significantly with the inclusion of underlying voicing as a predictor. The model did improve when its random part was expanded by adding random slopes for the trigger’s and undergoer’s manner of articulation. The model summary is in Table 2.

From the effects summary it follows that voicing ratio was generally lower in obstruent clusters than in sibilants (b = −0.23, SE = 0.07, t = −3.45), but there was very little difference between voicing ratio in sibilants and in stops (b = 0.03, SE = 0.10, t = 0.32). The main effect of the trigger’s manner on voicing ratio was similar to its effect on voicing duration. Voicing ratio was lower in obstruents followed by sonorants than in obstruents followed by a voiced obstruent (b = 0.36, SE = 0.08, t = 4.32), but higher than in obstruents followed by a voiceless obstruent (b = −0.28, SE = 0.09, t = −3.19). Overall the voicing ratio was higher in pre-vocalic than in pre-sonorant obstruents (b = 0.34, SE = 0.06, t = 5.58), except when the obstruent was a stop. An interaction between the trigger’s and the undergoer’s manner of articulation is plotted in Fig. 8. Pre-vocalic stops patterned with stops followed by a voiceless obstruent in terms of voicing ratio. A relatively low voicing ratio was also observed in pre-sonorant clusters.

Fig. 7. Interaction between the undergoer’s and trigger’s manner of articulation in a model of voicing duration in Catalan word-final obstruents. The error bars correspond to the standard error in the model.

Fig. 8. Interaction between the undergoer’s and trigger’s manner of articulation in a model of voicing ratio in Catalan word-final obstruents. The error bars correspond to the standard error in the model.

The model of voicing ratio with random slopes for the undergoer’s and the trigger’s manner of articulation fitted the data significantly better than a model with random intercepts (for speaker and item) only (χ² = 144.6, df = 20, p < 0.001). Significant improvements were also found in a stepwise model selection between models with only one random effect within speaker. This, again, is indicative of a considerable amount of individual variation, and therefore the generalisations that emerge from the model for the population might not accurately reflect the behaviour of individual speakers.

Fig. 9. Interaction between the undergoer’s and trigger’s manner of articulation in a model of vowel duration preceding Catalan word-final obstruents. The error bars correspond to the standard error in the model.

Fig. 10. Interaction between the undergoer’s and trigger’s manner of articulation in a model of f1 (in Hz) of the preceding vowel measured at 10 ms before the offset. The error bars correspond to the standard error in the model.


3.2.3. Obstruent duration

A model of obstruent duration achieved the best fit of the data with two main fixed effects: of the undergoer’s and the trigger’s manner of articulation. The fit of the model did not significantly improve when an interaction between these two factors was added to the model (χ² = 10.34, df = 6, p = 0.11), nor when the underlying voicing of the obstruent was added as a main effect (χ² = 0, df = 1, p = 1). The summary of the fixed effects is in Table 3. The random part of the best fitting model included random intercepts for speaker and item, as well as random slopes for the undergoer’s and the trigger’s manner of articulation within speaker.
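The model comparisons reported here are likelihood-ratio tests: twice the difference in log-likelihood between the nested models is referred to a chi-square distribution whose degrees of freedom equal the number of added parameters. The sketch below shows that calculation in plain Python, as a stand-in for the comparison that would normally be run in R. The log-likelihoods themselves are not reported in the text, so only the published statistics can be checked; all function names are assumptions made for this illustration.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)**i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: start from df = 1 and apply the incomplete-gamma recurrence.
    total = math.erfc(math.sqrt(half))
    for i in range(1, (df - 1) // 2 + 1):
        total += math.exp(-half) * half ** (i - 0.5) / math.gamma(i + 0.5)
    return total


def likelihood_ratio_test(loglik_simple, loglik_complex, df):
    """Return the chi-square statistic and p-value for two nested models."""
    stat = 2.0 * (loglik_complex - loglik_simple)
    return stat, chi2_sf(stat, df)


# Statistic reported for adding the undergoer-by-trigger interaction.
print(round(chi2_sf(10.34, 6), 2))
```

For the reported statistic of 10.34 on 6 degrees of freedom this gives p ≈ 0.11, in line with the value in the text.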

Clusters were considerably longer than sibilants (b = 68.88, SE = 5.46, t = 12.61), whereas there was very little difference in the average duration of frication and stop closure (b = −5.38, SE = 6.70, t = −0.80). Pre-vocalic obstruents were on average shorter than obstruents followed by a sonorant (b = −7.07, SE = 3.14, t = −2.25). Obstruents followed by another obstruent were also shorter than pre-sonorant obstruents, although in this case the small t-values indicate a non-significant effect.

3.2.4. Vowel duration

The model of preceding vowel duration was found to provide the best fit of the data with the following structure: an interaction between the undergoer’s and the trigger’s manner of articulation in the fixed part, and random effects for speaker and item, as well as the undergoer’s and the trigger’s manner of articulation within speaker. The summary of the fixed part of the model is in Table 4. Vowels preceding sibilants were considerably longer than vowels preceding stops or stop + sibilant clusters (stops: b = −29.22, SE = 3.05, t = −9.57; clusters: b = −30.53, SE = 3.02, t = −10.12). Vowels preceding pre-vocalic obstruents were relatively longest (b = 12.18, SE = 2.47, t = 4.93 compared to the pre-sonorant context), followed by vowels preceding obstruent + voiced obstruent clusters (b = 6.41, SE = 3.42, t = 1.87 compared to the pre-sonorant context). Vowels followed by an obstruent in the context of a voiceless obstruent were relatively shortest (b = −11.28, SE = 4.38, t = −2.57 compared to the pre-sonorant context). The comparison of trigger effects between voiced and voiceless obstruents confirms the existence of vowel lengthening in the context of a following voiced sound. The difference in length between vowels followed by an obstruent in the context of a voiced vs. voiceless obstruent (established by re-running the model with voiced obstruent as a baseline) was 10.27 ms (SE = 3.83, t = −2.68). This effect of the trigger’s manner of articulation (vowel > voiced obstruent > sonorant > voiceless obstruent) was observed within stops, sibilants and clusters, but the size of the effect of the trigger differed somewhat depending on the undergoer. The interaction between undergoer and trigger is plotted in Fig. 9.

3.2.5. f0 of the preceding vowel

Modelling of f0 of the vowel preceding a word-final obstruent did not yield any significant main effects.

3.2.6. f1 of the preceding vowel

The best-fitting model of the preceding f1 measured 10 ms before the offset had an interaction between the undergoer’s and the trigger’s manner of articulation in its fixed part, random intercepts for speaker and item, as well as a random slope for the undergoer’s manner of articulation within speaker. The fixed part of the model is summarised in Table 5. f1 modelling showed an effect of f1 lowering in the context where voicing is expected. This generalisation follows from the main effect of the trigger’s manner of articulation. In a model of f1 measured at 10 ms before the vowel offset there was a main effect of the trigger’s manner, with f1 lowering in the context of an obstruent followed by a voiced obstruent. f1 was lower by 70.96 Hz in the context of a voiced obstruent (across another intervening obstruent) compared to the context of a voiceless obstruent (b = 70.96, SE = 18.18, t = 3.90). The context of a following sonorant consonant had a similar effect on f1 to the context of a voiced obstruent. The main effect of vowel involved f1 lowering compared to sonorants (b = −37.91, SE = 10.84, t = −3.50), and compared to voiced obstruents (b = −32.70, SE = 15.79, t = −2.07). This effect, however, was mostly conditioned by vowels followed by pre-vocalic sibilants. An interaction between the trigger’s and the undergoer’s manner of articulation on f1 at 10 ms before the vowel offset is plotted in Fig. 10. The effect of the trigger’s manner of articulation varied considerably within sibilants, with f1 lowering in pre-vocalic sibilants, and relatively high f1 before sibilants followed by voiceless obstruents. Within stops and clusters the effect of the trigger’s manner was more limited, although clusters show some f1 lowering in the preceding vowel when followed by voiced obstruents.

Table 1
Regression coefficients, with standard error and t values for a model predicting the duration of vocal fold vibration (in ms) in word-final obstruents. The intercept corresponds to a word-final sibilant followed by a sonorant.

Term                Level                         b         SE      t
(Intercept)                                       46.25     5.78    8.01
Undergoer           cluster                       2.10      8.28    0.25
Undergoer           stop                          −2.37     7.76    −0.31
Trigger             voiced obstruent              20.70     6.97    2.97
Trigger             voiceless obstruent           −23.92    6.93    −3.45
Trigger             vowel                         23.10     5.23    4.41
Undergoer: trigger  cluster: voiced obstruent     31.30     11.94   2.62
Undergoer: trigger  stop: voiced obstruent        −1.78     9.70    −0.18
Undergoer: trigger  cluster: voiceless obstruent  0.21      11.92   0.02
Undergoer: trigger  stop: voiceless obstruent     6.37      9.62    0.66
Undergoer: trigger  cluster: vowel                42.91     8.90    4.82
Undergoer: trigger  stop: vowel                   −42.04    7.44    −5.65

A model of f1 measured at 20 ms before the vowel offset showed very similar results to the model of f1 measured at 10 ms before the offset. Both models achieved the best fit with the same effects: an interaction between the trigger’s and the undergoer’s manner of articulation as a fixed effect, as well as a random effect of speaker, item and undergoer’s manner of articulation within speaker. The model of f1 at 20 ms before the offset is summarised in Table 6, and the interaction between the undergoer’s and the trigger’s manner of articulation is shown in Fig. 11.
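Comparisons against a baseline other than the sonorant context, such as vowel vs. voiced obstruent, can be recovered approximately from Table 5 by differencing the trigger coefficients. The sketch below does this for the 10 ms model; the values are copied from Table 5, and the dictionary and function names are assumptions introduced for illustration. The differences apply to the reference undergoer (sibilant) only, and the standard error of such a contrast cannot be obtained this way: it requires refitting the model with a different baseline, as done in the study.

```python
# Trigger estimates from Table 5 (f1 in Hz, 10 ms before vowel offset).
# They hold for the reference undergoer (sibilant); the reference trigger
# (sonorant) has an implicit coefficient of zero.
TRIGGER_F1 = {
    "sonorant": 0.0,
    "voiced obstruent": -5.21,
    "voiceless obstruent": 65.74,
    "vowel": -37.91,
}


def trigger_contrast(level, baseline):
    """Difference in fitted f1 (Hz) between two trigger levels."""
    return TRIGGER_F1[level] - TRIGGER_F1[baseline]


print(round(trigger_contrast("vowel", "voiced obstruent"), 2))
print(round(trigger_contrast("voiced obstruent", "voiceless obstruent"), 2))
```

The first difference is −32.70 Hz, matching the vowel vs. voiced obstruent estimate in the text. The second is −70.95 Hz, which agrees in size with the reported 70.96 Hz up to rounding of the tabulated coefficients.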

3.2.7. Interim summary

The results of mixed-effects modelling largely confirm the literature reports concerning the realisation of voicing in word-final pre-vocalic obstruents. Pre-vocalic sibilants and clusters were typically realised with an extended portion of vocal fold vibration, and they patterned with sibilants and clusters followed by voiced obstruents in terms of voicing duration and voicing ratio. The effect of voicing in pre-vocalic sibilants was also manifested by f1 lowering of the preceding vowel, although in clusters this effect was relatively smaller. Unlike sibilants and stop + sibilant clusters, singleton stops did not undergo voicing in the pre-vocalic position. Word-final pre-vocalic stops patterned with stops followed by a voiceless obstruent with respect to most acoustic predictors, including voicing duration, voicing ratio, closure duration, and f1.

Voicing before sonorant consonants was relatively limited. Word-final obstruents followed by a sonorant consonant in the next word did show increased voicing duration and voicing ratio, as well as f1 lowering compared to obstruents followed by a voiceless obstruent in the next word. At the same time, however, average voicing duration and ratio were considerably lower in pre-sonorant obstruents than in obstruents followed by a voiced obstruent, with a difference in means of 20.7 ms (for voicing duration) and 0.36 (for voicing ratio), and high associated t-values.

Table 2
Regression coefficients, with standard error and t values for a model predicting voicing ratio in word-final obstruents. The intercept corresponds to a word-final sibilant followed by a sonorant.

Term                Level                         b         SE      t
(Intercept)                                       0.56      0.06    8.87
Undergoer           cluster                       −0.23     0.07    −3.45
Undergoer           stop                          0.03      0.10    0.32
Trigger             voiced obstruent              0.36      0.08    4.32
Trigger             voiceless obstruent           −0.28     0.09    −3.19
Trigger             vowel                         0.34      0.06    5.58
Undergoer: trigger  cluster: voiced obstruent     0.07      0.13    0.53
Undergoer: trigger  stop: voiced obstruent        −0.17     0.10    −1.64
Undergoer: trigger  cluster: voiceless obstruent  0.12      0.13    0.92
Undergoer: trigger  stop: voiceless obstruent     0.08      0.10    0.78
Undergoer: trigger  cluster: vowel                0.11      0.09    1.22
Undergoer: trigger  stop: vowel                   −0.57     0.08    −7.21

Table 3
Regression coefficients, with standard error and t values for a model predicting obstruent duration (in ms) in word-final obstruents. The intercept corresponds to a word-final sibilant followed by a sonorant.

Term                Level                         b         SE      t
(Intercept)                                       86.16     6.35    13.58
Undergoer           cluster                       68.88     5.46    12.61
Undergoer           stop                          −5.38     6.70    −0.80
Trigger             voiced obstruent              −8.56     6.26    −1.37
Trigger             voiceless obstruent           −3.97     5.39    −0.74
Trigger             vowel                         −7.07     3.14    −2.25

Table 4
Regression coefficients, with standard error and t values for a model predicting vowel duration (in ms) preceding word-final obstruents. The intercept corresponds to a word-final sibilant followed by a sonorant.

Term                Level                         b         SE      t
(Intercept)                                       115.29    4.38    26.35
Undergoer           cluster                       −30.53    3.02    −10.12
Undergoer           stop                          −29.22    3.05    −9.57
Trigger             voiced obstruent              6.41      3.42    1.87
Trigger             voiceless obstruent           −11.28    4.38    −2.57
Trigger             vowel                         12.18     2.47    4.93
Undergoer: trigger  cluster: voiced obstruent     −4.82     5.08    −0.95
Undergoer: trigger  stop: voiced obstruent        −3.51     4.12    −0.85
Undergoer: trigger  cluster: voiceless obstruent  7.59      5.07    1.50
Undergoer: trigger  stop: voiceless obstruent     8.39      4.07    2.06
Undergoer: trigger  cluster: vowel                −7.99     3.82    −2.09
Undergoer: trigger  stop: vowel                   −11.97    3.18    −3.76


The effect size in the model of voicing duration and ratio points to pre-vocalic voicing being categorical in sibilants and clusters, but absent from stops. The results concerning pre-sonorant voicing, however, are less clear. The mean effect of voicing in pre-sonorant obstruents could arise from a variety of factors. First, it could be conditioned by categorical intra-speaker variation, where all speakers voice their pre-sonorant obstruents in an optional but categorical fashion. A similar situation could arise from a gradient type of voicing within speaker, where obstruents are partially voiced in pre-sonorant position. Finally, the effect could also follow from inter-speaker variation of either the categorical-but-optional, or the gradient kind. The results of mixed-effects modelling indicate the existence of inter-speaker variation in the data. All of the models presented above were found to fit the data significantly better once the effect of the trigger’s manner of articulation and/or the undergoer’s manner of articulation was allowed to vary within speaker. The degree of inter-speaker variation is considered below in Section 3.3.
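The reason a mean alone cannot separate these scenarios can be shown with a toy calculation. The voicing ratios below are invented for illustration and are not taken from the study: one list mimics optional but categorical voicing, the other mimics gradient voicing, and both are constructed to have the same mean.

```python
from statistics import mean, pstdev

# Invented voicing ratios for ten pre-sonorant tokens per scenario.
categorical_optional = [1.0] * 5 + [0.12] * 5  # fully or barely voiced
gradient = [0.56] * 10                         # every token partly voiced


def summarise(ratios):
    """Return the mean and population standard deviation, rounded."""
    return round(mean(ratios), 2), round(pstdev(ratios), 2)


print(summarise(categorical_optional))
print(summarise(gradient))
```

Both lists have a mean of 0.56, but the spread is 0.44 in the first and 0 in the second, which is why the distribution of voicing within individual speakers has to be inspected before the type of variation can be identified.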

3.3. Individual variation

The results of the regression modelling confirm that individual speakers differed with respect to the effect of the undergoer’s and the trigger’s manner of articulation, but the exact nature of the variation is difficult to gauge based on the model results. The reason for this is that the manner of articulation of the trigger interacted with the manner of articulation of the undergoer (as seen for instance in the stop-sibilant asymmetry with respect to pre-vocalic voicing), but it was not possible to include interactions between these two factors as random effects within speaker, as such models would not converge. Consequently, studying the coefficients from the random part of the models allows us to analyse the effect of trigger and/or undergoer on a speaker-by-speaker basis, but not how the two factors interacted within individual participants. In the face of this problem, a mixed-effects regression analysis was carried out on subsets of the data. Only voicing duration and ratio were analysed, based on the fact that the duration of vocal fold vibration (absolute and relative to obstruent duration) is the primary phonetic correlate of voicing in Catalan, as reported previously (Cuartero Torres, 2001), and as confirmed for the current data by the results of the regression analysis (Section 3.2). Linear mixed regression models of voicing duration and ratio were fitted to the data from the pre-vocalic (n = 333) and pre-sonorant (n = 541) environments. In that way the effect of the undergoer’s manner of articulation in pre-vocalic and pre-sonorant voicing could be analysed within speaker. All the models had the same fixed structure, which included only one main effect, that of the undergoer’s manner of articulation. The random part of the model included a random intercept for speaker and item in all cases. A model with random intercepts was then compared to a model with a random effect of undergoer within speaker to test whether the inclusion of the random slope improved the model fit.

Table 5
Regression coefficients, with standard error and t values for a model predicting f1 (in Hz) of the vowel preceding word-final obstruents at 10 ms before the vowel offset. The intercept corresponds to a word-final sibilant followed by a sonorant.

Term                Level                         b         SE      t
(Intercept)                                       616.15    23.78   25.91
Undergoer           cluster                       24.27     23.21   1.05
Undergoer           stop                          32.32     24.65   1.31
Trigger             voiced obstruent              −5.21     14.30   −0.37
Trigger             voiceless obstruent           65.74     14.08   4.67
Trigger             vowel                         −37.91    10.84   −3.50
Undergoer: trigger  cluster: voiced obstruent     −17.19    24.20   −0.71
Undergoer: trigger  stop: voiced obstruent        3.95      19.45   0.20
Undergoer: trigger  cluster: voiceless obstruent  −56.07    24.07   −2.33
Undergoer: trigger  stop: voiceless obstruent     −61.72    19.06   −3.24
Undergoer: trigger  cluster: vowel                40.25     18.38   2.19
Undergoer: trigger  stop: vowel                   52.55     15.28   3.44

Table 6
Regression coefficients, with standard error and t values for a model predicting the f1 (in Hz) of the vowel preceding word-final obstruents at 20 ms before the vowel offset. The intercept corresponds to a word-final sibilant followed by a sonorant.

Term                Level                         b         SE      t
(Intercept)                                       668.60    20.55   32.54
Undergoer           cluster                       30.85     21.40   1.44
Undergoer           stop                          45.40     21.32   2.13
Trigger             voiced obstruent              −5.99     12.37   −0.48
Trigger             voiceless obstruent           52.61     12.16   4.33
Trigger             vowel                         −28.56    9.39    −3.04
Undergoer: trigger  cluster: voiced obstruent     −3.98     20.90   −0.19
Undergoer: trigger  stop: voiced obstruent        2.91      16.76   0.17
Undergoer: trigger  cluster: voiceless obstruent  −43.26    20.77   −2.08
Undergoer: trigger  stop: voiceless obstruent     −54.07    16.38   −3.30
Undergoer: trigger  cluster: vowel                35.99     15.91   2.26
Undergoer: trigger  stop: vowel                   36.64     13.22   2.77

The models of voicing duration and ratio before a vowel improved significantly upon including a random effect of the undergoer’s manner of articulation within speaker (model of voicing duration: χ² = 30.91, df = 5, p < 0.001; model of voicing ratio: χ² = 34.37, df = 5, p < 0.001). The random coefficients for undergoer within speaker are plotted in Fig. 12. The fixed coefficients of both models are summarised in Table 7.

The main effect of the undergoer’s manner of articulation on voicing duration shows increased duration in clusters followed by a vowel compared to pre-vocalic sibilants, but the mean difference in the voicing ratio between these two conditions was not significant, as indicated by the low t-value of −1.32. Stops showed significantly less voicing in terms of both voicing duration and voicing ratio. This general pattern was displayed by all the participants in the experiment with the exception of speaker B, who had a much lower voicing ratio in pre-vocalic clusters compared to pre-vocalic sibilants, or compared to any other speaker.

Fig. 11. Interaction between the undergoer’s and trigger’s manner of articulation in a model of f1 (in Hz) of the preceding vowel measured at 20 ms before the offset. The error bars correspond to the standard error in the model.

The models of voicing duration and ratio before a sonorant consonant also showed a significant improvement in fit when a random slope for the undergoer’s manner of articulation within speaker was added (voicing duration: χ² = 66.10, df = 5, p < 0.001; voicing ratio: χ² = 102.71, df = 5, p < 0.001). The fixed coefficients of the two models are in Table 8. Fig. 13 represents the random coefficients for undergoer within speaker.

The results of pre-sonorant modelling do not support reliable generalisations for the population. The only relatively stable effect involved lower voicing ratio in pre-sonorant clusters compared to pre-sonorant sibilants (b = −0.23, SE = 0.09, t = −2.47), and compared to pre-sonorant stops (b = −0.27, SE = 0.11, t = −2.35). The difference between stops and sibilants was subject to a large amount of inter-speaker variation (cf. Fig. 13), and no stable generalisations for the population can be discerned, as confirmed by the large standard error and low t-values associated with the difference in means between voicing duration and ratio in pre-sonorant sibilants and stops.

In summary, the analysis of individual strategies in the realisation of voicing duration and voicing ratio indicates that pre-vocalic voicing is categorical in sibilants and clusters, with high mean values for every speaker (with the exception of pre-vocalic clusters realised by speaker B). Most of the inter-speaker variation shown by the regression modelling seems to be confined to obstruents followed by sonorant consonants. As for the nature of variation in the pre-sonorant obstruents across speakers, it involved both the overall degree of voicing (ranging from fairly limited voicing in speaker B to a considerable degree of voicing in speaker G), as well as the effect of the undergoer’s manner of articulation, as individual speakers showed different relative levels of voicing in sibilants, clusters and stops.

4. Discussion

4.1. Main results and comparison with previous findings

The current results largely confirm the generalisations concerning pre-vocalic voicing in the existing descriptive literature on Catalan. Crucially, the findings provide phonetic support for the generalisation that pre-vocalic voicing is categorical in sibilants. A similar pattern was found for word-final stop + sibilant clusters. However, unlike sibilant voicing, cluster voicing was found to vary between speakers. While most speakers showed evidence for both sibilant and cluster voicing in their production, one speaker (speaker B) had categorical voicing in pre-vocalic sibilants, but not in clusters. None of the participants showed any considerable degree of voicing in pre-vocalic singleton stops. Evidence for categorical voicing was also found in obstruents followed by another obstruent. In comparison, obstruent voicing before sonorant consonants was more limited and subject to considerable inter- and intra-speaker variation.

One of the aims of this study was to verify the phonological status of various aspects of Catalan voicing. The phonological status of regressive voicing in obstruent clusters has recently been questioned by Recasens and Mira (2012), who report that obstruents show relatively limited voicing ratios when followed by a voiced obstruent (58.7% for /p, t, k/ before a voiced C2), and that voicing ratios vary considerably between speakers. A similar argument against a categorical regressive voicing rule is presented by Recasens and Mira (2013) based on three-way clusters with a medial fricative followed by a word boundary (e.g. /ps#b/). The voicing ratios in such fricatives were once again found to be low, on average below 45%. Note, however, that Recasens and Mira (2013) average over clusters with a voiced obstruent C3 /b, d, g/ and clusters with a sonorant C3 /m, n, l, ʎ, j/. The relatively low voicing ratio may be conditioned mainly by clusters with a sonorant C3. This would be consistent with the findings of the present study, as well as the findings by Cuartero Torres (2001) and by Recasens and Mira (2012, 2013) that sonorants trigger relatively little regressive voicing compared to voiced obstruents. Contrary to Recasens & Mira, the present findings suggest that regressive voicing in obstruent clusters has a phonological status in Catalan. The average voicing ratios for obstruents followed by a voiced obstruent C2 were high: 78% for stops before a voiced C2, 92% for sibilants in the same position and 76% for stop + sibilant clusters (cf. Fig. 8). These results are not directly comparable with Recasens and Mira (2012) or Recasens and Mira (2013), who do not report all trigger–undergoer interactions in their study. Nevertheless, the average voicing ratios obtained by the present study appear higher overall. On the other hand, the present results approximate the results obtained by Cuartero Torres (2001) for one of his two speakers. Cuartero Torres reports the following average voicing ratios: in fricatives followed by voiced stops /sd, zd/, 80% for speaker MJ and 71% for speaker AN; in stops followed by voiced stops /tg, dg/, 29% for speaker MJ and 87% for speaker AN.

Fig. 12. The effect of undergoer within speaker in models predicting voicing duration (left) and voicing ratio (right) in pre-vocalic obstruents. The error bars correspond to one standard deviation in the fitted data.

Table 7
Regression coefficients, with standard error and t values for models predicting voicing duration and voicing ratio in pre-vocalic obstruents. The intercept corresponds to a sibilant.

Term                Level                         b         SE      t
Voicing duration
(Intercept)                                       70.36     5.77    12.19
Undergoer           cluster                       44.18     12.66   3.49
Undergoer           stop                          −45.59    8.10    −5.63
Voicing ratio
(Intercept)                                       0.91      0.03    28.70
Undergoer           cluster                       −0.13     0.10    −1.32
Undergoer           stop                          −0.55     0.09    −6.39

Table 8
Regression coefficients, with standard error and t values for models predicting voicing duration and voicing ratio in pre-sonorant obstruents. The intercept corresponds to a sibilant.

Term                Level                         b         SE      t
Voicing duration
(Intercept)                                       46.42     7.35    6.32
Undergoer           cluster                       1.51      11.01   0.14
Undergoer           stop                          −2.49     9.85    −0.25
Voicing ratio
(Intercept)                                       0.56      0.09    6.26
Undergoer           cluster                       −0.23     0.09    −2.47
Undergoer           stop                          0.04      0.14    0.25

Fig. 13. The effect of undergoer within speaker in models predicting voicing duration (left) and voicing ratio (right) in pre-sonorant obstruents. The error bars correspond to one standard deviation in the fitted data.

There are two potential reasons why the present findings on voicing ratios in obstruent + obstruent clusters may deviate from Recasens and Mira (2012, 2013). First, according to Recasens and Mira (2012), acoustic analysis may overestimate the degree of voicing. Based on the analysis of the EGG signal, Recasens & Mira note the occurrence of quasi-periodic low-amplitude oscillations. These occurred at the offset of continuous voicing, and were not counted as true voicing. The acoustic analysis applied in the present study is not sensitive enough to distinguish between the two types of oscillations, and consequently the extent of what Recasens & Mira term ‘true glottal vibrations’ may be slightly overestimated in the present study. Note, however, that while the present results deviate from Recasens & Mira in the mean voicing ratio, more gross generalisations concerning the effect of manner of voicing are largely consistent between this experiment and other phonetic studies on Catalan voicing. For instance, all studies deliver the result that voiced obstruents condition considerably more regressive voicing than sonorant consonants. Similarly, vowels trigger a high degree of voicing in preceding singleton word-final sibilants (82.7% in Recasens and Mira (2012), 92% in the present study). Consequently, while the acoustic approach may overestimate the degree of ‘true voicing’, it is accurate enough to capture the relevant generalisations concerning the relative effect of voicing in different classes of sounds. In addition, the present average results are close to the results obtained by Cuartero Torres (2001) for speaker AN, based on electroglottographic data.

The second factor which may have contributed to the differences in results obtained by different phonetic studies is individual variation. Speakers in the present study varied considerably in how much voicing they produced in two-way obstruent clusters, where the second obstruent was voiced. Similarly, Cuartero Torres (2001) and Recasens and Mira (2012, 2013) all note considerable individual variation in their data. In relation to this, Cuartero Torres (2001, p. 66) points out that the degree of voicing observed in obstruent + obstruent clusters may be low even when the whole sequence had been planned as voiced. This is due to the aerodynamic difficulty in maintaining voicing throughout prolonged constriction. Consequently, voicing ratios alone may not be very telling, and they may not be comparable between different studies. What is more important in assessing the extent of assimilatory voicing is the relative extent of voicing compared to the relevant baselines.

Based on the analysis of how undergoer and trigger interact in conditioning C1 voicing, the present study delivers two major arguments for why voicing before voiced obstruents should be considered phonological in Catalan. The first comes from comparing the extent of voicing in singleton consonants vs. consonant clusters. The average voicing ratio preceding voiced obstruents was consistently high for singleton stops, sibilants and stop + sibilant clusters (cf. Table 2 and Fig. 8). However, stop + sibilant clusters showed considerably greater voicing duration compared to singleton stops or sibilants followed by a voiced obstruent (cf. Table 1 and Fig. 7). This can only be explained as an element of an active voicing strategy on the part of speakers: since clusters are inherently longer than singleton consonants, they require increased voicing duration to maintain a similar voicing ratio. The second argument comes from the effect of voiced obstruents on initial voicing in cases where incomplete voicing was produced before a voiced stop. When some partial devoicing was observed in stop + sibilant clusters before a voiced obstruent, it would occur in the latter part of the cluster (i.e. during the sibilant). This again suggests that the speakers plan the realisation of the cluster as voiced, even if continuous voicing may not be sustained throughout the entire obstruent sequence.
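The arithmetic behind the first argument can be illustrated with a minimal sketch. It assumes that the voicing ratio is the voiced portion of the constriction divided by the total constriction duration. The durations are hypothetical values chosen for illustration, not measurements from this study, and the function name `voicing_ratio` is likewise an assumption.

```python
def voicing_ratio(voicing_ms, constriction_ms):
    """Return the proportion of the constriction produced with voicing."""
    if constriction_ms <= 0:
        raise ValueError("constriction duration must be positive")
    # Voicing cannot last longer than the constriction it is measured in.
    return min(voicing_ms, constriction_ms) / constriction_ms


# Hypothetical durations in milliseconds (illustrative only, not study data).
SINGLETON_MS = 80.0
CLUSTER_MS = 150.0

# A singleton with 72 ms of voicing is voiced for 90% of its duration.
singleton = voicing_ratio(72.0, SINGLETON_MS)

# If the cluster received the same 72 ms of voicing, its ratio would drop.
cluster_same_duration = voicing_ratio(72.0, CLUSTER_MS)

# To match the singleton's ratio, the cluster needs proportionally more voicing.
cluster_same_ratio = voicing_ratio(135.0, CLUSTER_MS)

print(f"singleton:              {singleton:.2f}")
print(f"cluster, same duration: {cluster_same_duration:.2f}")
print(f"cluster, same ratio:    {cluster_same_ratio:.2f}")
```

Under these assumed values, a fixed amount of voicing spilling over from the preceding vowel would give the longer cluster a much lower ratio than the singleton. A similar ratio in both is therefore only reached if voicing duration scales with constriction duration.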

Similarly to regressive obstruent voicing in obstruent + obstruent clusters, pre-vocalic sibilant and cluster voicing appear to be an active phonological process, consistent with the existing descriptive literature on Catalan. The results on word-final pre-vocalic sibilants are in line with the phonetic findings by Recasens and Mira (2012), showing high average voicing ratios. Vowels patterned with voiced obstruents as triggers of voicing in word-final sibilants, while the voicing itself was phonetically realised by means of extended vocal fold vibration throughout constriction, with the mean voicing ratio of over 0.9. A new empirical result delivered by the present study is that the extent of voicing in pre-vocalic stop + sibilant clusters is comparable to voicing in singleton sibilants. This suggests that pre-vocalic cluster voicing is phonological. This interpretation is further supported by the following two observations, similar to those made for clusters before a voiced obstruent. Firstly, both clusters and singleton sibilants have similar voicing ratios before a vowel, but clusters show a considerably longer average voicing duration. Just like in the context of a voiced obstruent, this extended voicing duration in clusters indicates that an active voicing mechanism is responsible for maintaining a high voicing ratio. Secondly, when partial devoicing occurs in clusters, it affects the second rather than the first obstruent in the sequence, suggesting that the cluster is planned as voiced. Importantly, unlike sibilant voicing, cluster voicing was found to vary between speakers. While most speakers showed evidence for both sibilant and cluster voicing in their production, one speaker (speaker B) had categorical voicing in pre-vocalic sibilants, but not in clusters. None of the participants showed any considerable degree of voicing in pre-vocalic singleton stops: pre-vocalic stops patterned with stops followed by voiceless obstruents.

Compared to pre-vocalic voicing and voicing before obstruents, pre-sonorant voicing was highly variable. There were a number of instances where continuous voicing occurred in obstruents followed by a sonorant. On average, sonorants triggered more regressive voicing than voiceless obstruents. However, there was considerably less regressive voicing before sonorants than before voiced obstruents. These results are in line with the findings on pre-sonorant voicing by Cuartero Torres (2001) and Recasens and Mira (2012), and they problematise the generalisations concerning pre-sonorant voicing in the descriptive literature on Catalan (cf. example (2) in Section 1.1). Pre-sonorant voicing does not seem to be a stable phonological process in Catalan. Instead, it seems to be phonetic and gradient, or it might perhaps be currently undergoing phonologisation, having reached different stages in different speakers.

As it is not entirely phonologised, pre-sonorant voicing provides a good testing ground for the phonetic characteristics of the relationship between the undergoer’s manner of articulation and the degree of voicing. One of the research questions leading the present study was whether any phonetic tendency might be observed indicating a preference for voicing in sibilants compared to stops, or vice versa. The asymmetry observed between stops and sibilants as undergoers of voicing in the context of a vowel might suggest a preference for voicing in sibilants, but it is difficult to tease apart phonetic constraints on the production of voicing from other factors which may have played a part in the phonologisation of pre-vocalic voicing. It is usually not possible to investigate the factors responsible for sound change in cases where change is already phonologised. Instead, investigating cases where sound change is not yet complete may provide more information on phonetic tendencies involved in such changes (Solé, 2010). Building upon this view, if the degree to which stops and sibilants are susceptible to voicing was rooted in production factors, we would expect to find a preference for sibilant voicing before vowels and sonorants alike. However, no such tendency is confirmed by the results on pre-sonorant voicing in Catalan. As shown in Tables 1 and 2, and illustrated in Figs. 7 and 8, unlike in the pre-vocalic context, there was no significant difference in voicing duration or voicing ratio between stops and sibilants followed by a sonorant consonant. This finding was further confirmed by the results of regression models fitted to the subset of the data from the pre-sonorant environment. Stops and sibilants showed a similar mean voicing duration and ratio, with small associated t-values, indicating that the mean differences between the two groups were not significant. On an individual level, speakers such as speaker A showed an appreciable increase in voicing duration in sibilants compared to stops, but other speakers did not show a similar effect size, and not always even the same trend. Therefore, the current results do not support the generalisation that sibilant voicing is phonetically preferred to stop voicing. However, neither do they support the opposite prediction, made by Recasens and Mira (2012), that stop voicing is preferred to sibilant voicing.

Finally, a pertinent finding delivered by this study concerns the location of gradient voicing in partially voiced obstruents. As discussed in Section 3.1, partial voicing typically involved cases where voicing was found only in the initial part of C1. Some authors consider such voicing to be purely inertia from the preceding vowel (e.g. Cuartero Torres, 2001, p. 65). However, the duration of the post-vocalic voicing tail can vary as a function of the following environment. For instance, Recasens and Mira (2012) observe that a post-vocalic voicing tail may occur in clusters of two voiceless obstruents, but that in such cases the C1 voicing ratio does not usually go beyond 20% (in the current data, the C1 voicing ratio in obstruents followed by voiceless obstruents was on average 22% for stops and 25% for sibilants). However, in other environments, post-vocalic voicing could be longer, suggesting an effect of C2. The same is observed in the current data: there was considerable variation in the duration of the post-vocalic partial voicing, which was systematically affected by the following context.

4.2. The diachronic development of pre-vocalic cluster voicing

The categorical status of pre-vocalic cluster and sibilant voicing seems to accord well with the existing formal accounts of Catalan voicing introduced in Section 1.3. All these accounts propose that cluster voicing follows from the interaction between pre-vocalic sibilant voicing and voice assimilation between obstruents. However, in doing so, the synchronic accounts rely on a markedness relationship which favours sibilant voicing over stop voicing. Unfortunately, such a preference is not clearly supported by any external factors related to speech production, as argued previously in Section 4.1. In contrast, a potential functional explanation for how the stop-sibilant asymmetry may evolve through language change is available when one considers factors related to speech perception. Perceptual studies have shown that the presence of vocal fold vibration during frication increases the chance of voiced percepts in listeners (Forrez, 1966; Stevens et al., 1992). It is conceivable that the importance of vocal fold vibration as a voicing cue varies depending on manner of articulation. Voicing in sibilants affects the intensity of high-frequency frication (Ladefoged and Maddieson, 1996), and it is thus potentially salient. Passive voicing in the closure phase of stops, on the other hand, does not have a similar effect, due to an overall low intensity of occlusion, which might make it less perceptible.4 In addition, initial voicing in stops does not affect the Voice Onset Time, which is recognised as one of the main voicing cues in stops (Lisker and Abramson, 1964). The different ways in which vocal fold vibration interacts with other voicing cues in stops and sibilants may affect the degree to which passive voicing is perceived by listeners, making sibilants more likely to undergo perceptual reanalysis as voiced due to the presence of passive voicing. This could account for why pre-vocalic voicing develops in sibilants but not in stops, while a clear production preference for voicing in either of these contexts is missing.

This perceptual account feeds into the following possible scenario, schematised in Table 9, of how the Catalan pattern might have originated. The pattern started off as intervocalic voicing that targeted delaryngealised sibilants. The connection between delaryngealisation and voicing is supported by a close match between the distribution of these two processes. While the close match in the environment for pre-sonorant/pre-vocalic voicing and the environment for delaryngealisation has been noted in other languages, the Catalan case strongly suggests that the link is not coincidental, given the application of pre-vocalic sibilant voicing in prefix-final position (cf. (4)). The application of delaryngealisation and pre-vocalic/pre-sonorant voicing at the end of the word could potentially be argued to follow directly from independent functional factors. This, however, does not explain the prefix-final voicing, since prosodically conditioned functional pressures do not apply directly to morphologically defined environments. For both delaryngealisation and pre-vocalic sibilant voicing to apply in the same prefix-final domain only two diachronic scenarios are possible. Either the two processes started off as prosodically conditioned to then become re-analysed as prefix-final independently of each other, or one of the processes developed from the other. The latter explanation appears more principled, as it accords well with the phonetic interpretation of voicing targets and their role in conditioning coarticulation (Jansen, 2004). Delaryngealised obstruents do not have their own voicing targets, and so they are predicted to be more susceptible to laryngeal coarticulation from the neighbouring sounds. Passive voicing spilling over from the neighbouring sounds may be sustained for longer by delaryngealised obstruents, as no active devoicing gesture is executed to counteract the voicing. This extended portion of voicing is likely to be more perceptible to listeners compared to stop voicing, and so it may also be actively targeted in production once it is the listeners’ turn to speak. This mechanism may yield a categorical voicing process whose distribution matches closely the environment for delaryngealisation. Passive voicing may also be less perceivable in stops than in sibilants, which could explain the observed synchronic asymmetry between pre-vocalic stop and sibilant voicing.

4 For the purpose of this discussion, I adopt the following definition of passive voicing by Jansen (2004, p. 36): “Sounds or parts of sounds are said to be passively voiced if a closed equilibrium position of the vocal folds and normal subglottal pressure (according to Stevens, 1998, 8000 dyne/cm²/800 Pa is typical) are sufficient to initiate or maintain the physical conditions for vocal fold vibration.”

Table 9. Diachronic development of Catalan prevocalic voicing.

| Stage | Change | Examples |
|---|---|---|
| 1 | Initial stage | [baz ǝntik] [sab ǝsglεs] [sabz ǝsglεs] |
| 2 | Word-final delaryngealisation | [baS ǝntik] [saP ǝsglεs] [saPS ǝsglεs] |
| 3 | Initial passive voicing | [ba ǝntik] [sa ǝsglεs] [sa s ǝsglεs] |
| 4 | Reinterpretation of intervocalic sibilant voicing as categorical | [baz ǝntik] [sa ǝsglεs] [sa s ǝsglεs] |
| 5 | Rule generalisation | [baz ǝntik] [sa ǝsglεs] [sa z ǝsglεs] |
| 6 | Feeding rule ordering | [baz ǝntik] [sa ǝsglεs] [sabz ǝsglεs] |

The next stage in the development of the Catalan undergoer asymmetries involves rule generalisation (Vennemann, 1972). The generalisation consists of a reinterpretation of intervocalic sibilant voicing as pre-vocalic. What is interesting about this process is the concomitant gradual loss of the functional motivation that instigated voicing in the first place. From the point of view of diachronic functionalism the presence of the vowel on the following side is just one of the factors involved in triggering the change. The other necessary ingredients include the presence of a preceding vowel, as well as the undergoer being a singleton sibilant of limited duration. All of these play a part in the origin of aerodynamically conditioned passive voicing, as well as in the perception of passively voiced sibilants as voiced. In the synchronic grammar of Catalan, however, the role of the preceding vowel seems to have been reduced, or else it would block voicing in clusters. This development reflects a diachronic transition from a phonetically conditioned process in an environment where it is functionally motivated to a partially arbitrary, but nonetheless productive synchronic generalisation.

Loss of phonetic conditioning as part of rule generalisation has been observed in numerous instances of sound change, going back to Schuchardt (1885 [1972]). It is in fact relatively uncommon for phonological processes to develop in a way that exactly mirrors the phonetic environment where they were motivated in the first place. An example of this kind of generalisation comes from palatalisations. Based on experimental data, Cole and Iskarous (2001) argue that palatalisations are perceptually best motivated in VCV sequences where an obstruent is flanked by two high vowels. However, this conditioning is rarely reflected in the output of sound changes, which tend to only preserve the following high vowel as the palatalising context. Similarly, Coetzee and Pretorius (2013) argue that following phonologisation, intervocalic stop voicing becomes sensitive to non-phonetic environment effects, specifically morphological boundary effects. Further examples of rule generalisation involve for instance the extension of /l/-darkening from a syllable-based process to a foot-based process (Bermúdez-Otero, 2012, and references therein).

In the final stage in the development of Catalan pre-vocalic voicing, the output of the voicing rule (a voiced pre-vocalic sibilant) becomes an input to a voice assimilation rule that operates independently in the language. This feeding interaction yields voicing in pre-vocalic sibilants and in stop + sibilant clusters, while singleton stops do not undergo voicing.5
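The feeding interaction can be sketched as two ordered string-rewriting rules. This is a simplified illustration rather than an analysis proposed in the paper. Delaryngealised obstruents are written P and S as in Table 9, the unvoiced realisation of the singleton stop is written simply as p, assimilation to a following word-initial obstruent is not modelled, and all function names are hypothetical.

```python
VOWELS = set("aeiouǝəεɔ")
VOICED = {"P": "b", "S": "z"}      # voiced realisations
VOICELESS = {"P": "p", "S": "s"}   # default (unvoiced) realisations


def sibilant_voicing(coda, next_word):
    """Rule 1: voice a word-final sibilant before a vowel-initial word."""
    coda = list(coda)
    if coda and coda[-1] == "S" and next_word[0] in VOWELS:
        coda[-1] = VOICED["S"]
    return coda


def voice_assimilation(coda):
    """Rule 2: an unspecified obstruent copies voicing from a following voiced obstruent."""
    coda = list(coda)
    for i in range(len(coda) - 2, -1, -1):  # scan right to left
        if coda[i] in VOICED and coda[i + 1] in VOICED.values():
            coda[i] = VOICED[coda[i]]
    return coda


def derive(stem, coda, next_word):
    """Apply rule 1, then rule 2 (feeding order), then default devoicing."""
    segments = voice_assimilation(sibilant_voicing(coda, next_word))
    segments = [VOICELESS.get(seg, seg) for seg in segments]
    return stem + "".join(segments) + " " + next_word


for stem, coda, next_word in [("ba", "S", "ǝntik"),
                              ("sa", "P", "ǝsglεs"),
                              ("sa", "PS", "ǝsglεs")]:
    print(derive(stem, coda, next_word))
```

In this sketch the stop in the cluster is voiced only because rule 1 has already produced a voiced sibilant for rule 2 to assimilate to. A singleton stop never meets the condition of either rule, which mirrors the asymmetry described above.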

4.3. Synchronic generalisation

Instances of rule generalisation present an interesting problem in reflecting a mixture of phonetic and phonological influences. This is also the case with Catalan pre-vocalic voicing. On the one hand, the voicing pattern shows traces of functional motivation, as it descends from coarticulation. On the other hand, the synchronically unnatural aspects of pre-vocalic voicing (partial loss of the conditioning environment and cluster voicing in the absence of singleton stop voicing) indicate an influence of an abstract phonological generalisation shaping the process. An important question which arises in this context is how natural this process is in a synchronic perspective, and consequently how much grounding can and should be incorporated into a synchronic analysis.

5 A reviewer points out that cluster voicing interacts with another diachronic process which involved the loss of the post-tonic non-low vowel whereby word-final obstruent + sibilant clusters were created, e.g. SAPIS > *ˈsabəs > ˈsabz ~ ˈsaps. Notably, instances of word-final stop + sibilant clusters as studied in this paper seem to always involve the morphological marker -s.


The original phonetic source of voicing in sequences like /Vs#V/ is post-vocalic, but the rule is initially phonologised as intervocalic, and later re-interpreted as pre-vocalic. The first step of the generalisation is evidenced by the absence of voicing in other post-vocalic environments than before a following vowel. The second step of the generalisation is indicated by the fact that stop + sibilant clusters also undergo voicing before a vowel (cf. discussion in Section 4.2 above). In line with this observation, sibilant voicing is analysed as pre-vocalic in the existing synchronic Optimality Theoretic accounts by Bermúdez-Otero (2001), Wheeler (2005) and Jiménez and Lloret (2008) introduced in Section 1.3. In all these analyses, the following three aspects of pre-vocalic sibilant voicing are subsumed under the activity of a single constraint (*CONTVOILAG in Bermúdez-Otero (2001), a NO-LINK-VC constraint in Jiménez and Lloret (2008) and LAZYSIBILANTS in Wheeler (2005)): i) voicing is regressive; ii) the trigger is a vowel; iii) the undergoer is a sibilant/the undergoer is not a stop. However, from the phonetic and variationist evidence it follows that a constraint which effectively enforces regressive pre-vocalic sibilant voicing is not a single coherent expression of a unified markedness preference. Instead, it reflects a mixture of partial phonetic and phonological influences, some of which are filtered through perceptual speaker–listener interactions in language use. Thus, for an analysis which aims to preserve genuine insights about grounding in grammar, different aspects of voicing, i.e. the environment, the undergoer and the direction, should be teased apart to reflect their different conditioning factors. However, even such an elaborated proposal is subject to the following major challenge related to the inter-speaker variation observed in the data.

Any analysis of pre-vocalic voicing in Catalan should also be able to accommodate a related system where sibilant voicing is intervocalic, and to express the fact that the two systems are closely related. As previously mentioned in the discussion on domain narrowing, there is a frequent mismatch between two-sided and one-sided environments in conditioning phonetic vs. phonological generalisations. A two-sided generalisation is potentially observed in the current data for speaker B, who produces voicing in pre-vocalic sibilants, but not in clusters. This type of system is predicted by the diachronic analysis proposed in Section 4.2, where pre-vocalic voicing is preceded by intervocalic sibilant voicing which does not affect clusters. For a synchronic analysis, the observed inter-speaker variation is problematic. One possibility to exclude voicing of stop + sibilant clusters is by introducing an additional constraint, similar to constraints responsible for handling the asymmetry between pre-vocalic voicing of stops and sibilants. This solution, however, is unsatisfactory on several counts. First, it loses the analytical insight that the presence or absence of cluster voicing results from an interaction of independent constraints. Instead, a specific ban against cluster voicing needs to be stipulated. More seriously, the putative two-sided constraint would have to be freely rankable to model pre-vocalic cluster voicing in the absence of singleton stop voicing. However, this would have undesirable typological consequences, also predicting unattested systems where pre-vocalic voicing may occur only in clusters, but not in pre-vocalic singleton sibilants or stops. Alternatively, one could analyse the system of speaker B as involving intervocalic voicing, which is altogether different from pre-vocalic voicing. As NO-LINK-VC constraints (and their equivalents in the analyses by Bermúdez-Otero, 2001 and Wheeler, 2005) are one-sided, a whole new set of constraints would have to be posited to handle intervocalic voicing. This approach, however, essentially involves positing process-specific constraints, and defeats the programme of deriving typological restrictions by constraint interaction. In addition, using a different constraint family to analyse intervocalic and pre-vocalic voicing loses sight of the fact that the two processes are related and that the former may morph into the latter.
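The contrast between a one-sided and a two-sided environment can be made concrete with a rule-style sketch. It is not an implementation of the Optimality Theoretic constraints in the cited analyses; the function `sibilant_is_voiced`, the flag `two_sided` and the simplified segment symbols are assumptions made for illustration only.

```python
VOWELS = set("aeiouǝəεɔ")


def sibilant_is_voiced(preceding, following, two_sided):
    """Decide whether a word-final sibilant is voiced.

    two_sided=False: pre-vocalic rule, only the following segment matters.
    two_sided=True: intervocalic rule, both neighbours must be vowels.
    """
    before_vowel = following in VOWELS
    after_vowel = preceding in VOWELS
    return before_vowel and (after_vowel or not two_sided)


# (segment before the sibilant, segment after the word boundary)
contexts = {
    "V_#V (singleton sibilant before a vowel)": ("a", "ǝ"),
    "C_#V (stop + sibilant cluster before a vowel)": ("p", "ǝ"),
    "V_#C (singleton sibilant before a voiceless stop)": ("a", "t"),
}

for label, (preceding, following) in contexts.items():
    one = sibilant_is_voiced(preceding, following, two_sided=False)
    two = sibilant_is_voiced(preceding, following, two_sided=True)
    print(f"{label}: one-sided={one}, two-sided={two}")
```

The two settings differ only in the cluster context, where the two-sided version leaves the sibilant unvoiced. That is the pattern described for speaker B, and it shows why the two systems are naturally treated as variants of one process rather than as unrelated rules.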

The inter-speaker variation observed in current data shows just how carefully one needs to stipulate the balance between phonetic fidelity and formal simplicity, in order to model the feeding interaction (or lack thereof) between stop voicing and cluster voicing. For the feeding interaction to work, the grounding of the constraints is carefully limited, as we have to posit one-sided constraints referring just to the following environment for pre-vocalic voicing. Modelling the counterfeeding interaction (i.e. speaker B’s system), on the other hand, requires either abandoning free ranking or reformulating the relevant constraints in terms of a two-sided environment. Given how specific the constraints need to be in each case, the question arises whether constraints like that can be claimed to come from a universal limited set, which would be the default assumption in an innate approach to markedness. Alternatively, one could assume that markedness constraints are neither innate nor functional, but language-specific and discovered by the learner in the process of acquisition. This possibility is discussed by Hayes (1999), who takes up the issue of indirect phonology-phonetics mappings in relation to natural class behaviour in sound patterns. Hayes (1999) proposes that constraint inventories are restricted by functional factors on the one hand, and formal simplicity on the other. In consequence, synchronic grammars are neither entirely functional, nor formal. Instead, grammars reflect phonetic naturalness mediated by an abstract generalisation biased towards structural simplicity.

Hayes’s proposal comes with a computational solution concerning the inductive grounding of markedness constraints, i.e. discovery of the relevant constraints on the part of the learner. This process is proposed to involve creating a phonetic difficulty map based on the learner’s experience with language, followed by a selection of the most effective constraints to generalise over the observed language variation. This mechanism allows for the creation of markedness constraints with partial functional grounding. For instance, inductive grounding could give rise to a markedness relationship conditioning pre-vocalic sibilant voicing, similar to the constraints proposed by the Optimality Theoretic analyses of Catalan discussed in Section 1.2. However, when it comes to modelling the inter-speaker variation, some of the previously discussed problems resurface. Analysing speaker B’s system as intervocalic voicing is potentially more compatible with the assumptions of inductive grounding, as the relationship between intervocalic and pre-vocalic voicing could reflect the tension between phonetic grounding and structural simplicity. In a theory like this, however, the main task of a learner becomes the discovery of relevant constraints which can accommodate either a one-sided or a two-sided process. The constraints are left with little work to do in terms of controlling phonological abstraction. Instead, the theoretical focus comes down to accounting for what constraints one needs. Furthermore, the formal simplification involved in discovering the relevant one-sided or two-sided constraints is similar to mechanisms of abstraction posited in theories that work with purely language-specific generalisations. Such a language-specific approach has been taken by a number of phonologists working within Evolutionary Phonology (Blevins, 2004, 2006), Exemplar Theory (Silverman, 2006), Substance-Free Phonology (Hale and Reiss, 2008; Blaho, 2008), and Rule-Based Phonology (Vaux, 2008). What these proposals largely share is the idea that synchronic phonology does not directly encode the factors responsible for conditioning sound patterns. Instead, the explanation for typological frequencies is sought in extra-grammatical factors. These are typically understood in the context of influences on sound change, the main idea being that while synchronic grammars may produce arbitrary patterns, the development of such patterns will be limited by external pressures, including phonetic biases, speaker–hearer interaction in a speech community, and language acquisition. Thus, markedness effects may emerge in synchronic grammars, but they are not thought of as the primary driving force.

4.4. Residual issues. Trigger asymmetry

Aside from the issue of manner asymmetries with respect to pre-vocalic voicing, Central Catalan also presents the problem of the difference between pre-vocalic and pre-sonorant voicing. This difference manifested itself in three ways: 1) Pre-vocalic voicing applied to sibilants, and (in all but one speaker) also to clusters, but failed to apply to stops. Pre-sonorant voicing showed no systematic differences depending on the undergoer’s manner of articulation. 2) In the environments where it did apply, pre-vocalic voicing was categorical, occurring in the vast majority of potential environments and typically resulting in fully voiced obstruents. Pre-sonorant voicing was more variable, as indicated by low mean values for voicing duration and ratio compared to obstruents followed by a voiced obstruent. 3) There was relatively little inter-speaker variation with respect to pre-vocalic voicing, but pre-sonorant voicing varied from speaker to speaker. The variation included both the overall mean of voicing duration per speaker and the within-speaker mean as a function of the undergoer’s manner of articulation.

From the point of view of functional factors discussed in this paper, the vowel-sonorant asymmetry is puzzling. A perception-driven re-analysis account of pre-sonorant/pre-vocalic voicing proposes that the sound change originates as the voicing from the preceding vowel is extended into a word-final obstruent due to the lack of the obstruent’s voicing target. The following environment has an effect on the duration of passive voicing, but the influence is mostly due to the absence or presence of a voicing target (e.g. the presence of a following voiceless obstruent will quench passive voicing). Under this view no difference is expected between sonorant consonants and vowels in the conditioning of word-final obstruent voicing. The presence of a robust difference in this respect in Catalan necessitates reconsidering the role that the following environment might play in the initiation of voicing.

Considering the degree of inter- and intra-speaker variation observed in pre-sonorant voicing vis-à-vis the relatively established pattern of voicing before a vowel, it is likely that pre-vocalic voicing is the older pattern of the two. This allows us to reformulate the question concerning the role of the trigger’s manner of articulation as a question of why voicing might have applied only before a vowel at some stage, but not before a sonorant. An answer to this question would require an investigation into how the manner of articulation of the sound following a word-final obstruent affects the obstruent’s duration and the air pressure. It would need to be conducted in a language that has no manner asymmetries of the kind found in Catalan, as otherwise the results could be confounded by voicing which affects both air pressure and segmental duration. Vowel articulations are relatively more open compared to sonorant consonants, so their production may involve lower intraoral pressure. In consequence, vowels might be potentially more conducive to passive voicing. However, the questions of whether such differences indeed exist, and if so, whether they are big enough to trigger an asymmetry of the kind we see in Catalan, yet await a systematic investigation.

5. Conclusion

Central Catalan displays a range of interactions between the undergoer’s and trigger’s manner of articulation, some of which are subject to inter-speaker variation. These interactions have consequences for the analysis of how the Catalan pattern might have developed diachronically, as well as for how the pattern may be synchronically analysed by the learner. An extension of word-final sibilant voicing before a vowel to stop + sibilant clusters, but not to singleton stops, supports the hypothesis that the pattern developed through an interaction of phonetic and phonological pressures, specifically a phonetic bias for intervocalic sibilant voicing and a phonological reanalysis of this pattern as a right-to-left pre-vocalic voicing rule. The product of this sound change is arbitrary from the point of view of synchronic functionalism, and hence it does not lend itself to a non-arbitrary synchronic analysis. Although the existing synchronic markedness-based analyses of Catalan voicing are formally adequate, they crucially depart from phonetic grounding in their constraint formulation in order to achieve descriptive adequacy. This tension between descriptive adequacy and phonetic grounding raises questions about the explanatory power of universal markedness constraints. If we assume that Universal Grammar supplies constraints with precisely the right balance between phonetic fidelity and formal simplicity, then explanation is simply deferred. Alternatively, assuming that the required constraints are discovered by the child during acquisition emphasises the role of mechanisms of abstraction and generalisation that are also central to theories assuming language-particular generalisations.


Acknowledgements

Many thanks to the participants who volunteered for this research. I wish to thank Eulàlia Bonet and Maria-Rosa Lloret for their extensive help with data collection for this paper. Eulàlia and Maria-Rosa have also given me advice on the experimental design, as have Joan Mascaró, Laia Mayol, Clàudia Pons-Moll, Daniel Recasens and Xico Torres-Tamarit. I am thankful to Maria-Josep Solé for discussion of my study. I would also like to thank Carles Salse Capdevila for technical support. I have received many helpful comments on this paper from Ricardo Bermúdez-Otero, Yuni Kim, Koen Sebregts and from Language Sciences reviewers. Any remaining omissions are mine. The research reported here was supported by an Arts and Humanities Research Council Block Grant Partnership Postgraduate Award no. AH/H029141/1.

Appendix

Items included in the study

/p#ə/ drap anglès ‘English cloth’
/p#i/ drap històric ‘historical cloth’
/p#u/ drap opac ‘opaque cloth’
/p#m/ drap millor ‘best cloth’
/p#n/ drap noruec ‘Norwegian cloth’
/p#r/ drap romà ‘Roman cloth’
/p#l/ drap letó ‘Latvian cloth’
/p#ʎ/ drap llampant ‘bright cloth’
/p#s/ drap suec ‘Swedish cloth’
/p#z/ drap zambià ‘Zambian cloth’

/b#ə/ sap anglès ‘(s)he knows English’
/b#i/ sap història ‘(s)he knows history’
/b#ɔ/ sap obrir ‘(s)he knows how to open’
/b#m/ sap mongol ‘(s)he knows Mongolian’
/b#n/ sap noruec ‘(s)he knows Norwegian’
/b#r/ sap remar ‘(s)he knows how to row’
/b#l/ sap letó ‘(s)he knows Latvian’
/b#ʎ/ sap llegir ‘(s)he knows how to read’
/b#s/ sap suec ‘(s)he knows Swedish’
/b#z/ sap zoologia ‘(s)he knows zoology’

/s#ə/ pas audaç ‘bold step’
/s#i/ pas immens ‘huge step’
/s#ɔ/ pas obert ‘open step’
/s#m/ pas mandrós ‘tired step’
/s#n/ pas nerviós ‘nervous step’
/s#r/ pas robust ‘robust step’
/s#l/ pas legítim ‘legitimate step’
/s#ʎ/ pas lleuger ‘light step’
/s#t/ pas tranquil ‘calm step’
/s#d/ pas dinàmic ‘dynamic step’

/z#ə/ vas antic ‘old glass’
/z#i/ vas immens ‘huge glass’
/z#u/ vas oficial ‘official glass’
/z#m/ vas mullat ‘wet glass’
/z#n/ vas normal ‘normal glass’
/z#r/ vas rodó ‘round glass’
/z#l/ vas letó ‘Latvian glass’
/z#ʎ/ vas lleuger ‘light glass’
/z#t/ vas trencat ‘broken glass’
/z#d/ vas daurat ‘gold glass’

/bz#ə/ saps anglès ‘you know English’
/bz#i/ saps història ‘you know history’
/bz#ɔ/ saps obrir ‘you know how to open’
/bz#m/ saps mongol ‘you know Mongolian’
/bz#n/ saps noruec ‘you know Norwegian’
/bz#r/ saps remar ‘you know how to row’
/bz#l/ saps letó ‘you know Latvian’
/bz#ʎ/ saps llegir ‘you know how to read’
/bz#t/ saps tocar ‘you know how to touch/play (an instrument)’
/bz#d/ saps daurar ‘you know how to gild’


References

Bárkányi, Z., Kiss, Z.G., 2012. On the Border of Phonetics and Phonology: Sonorant Voicing in Hungarian and Slovak. Paper presented at the Twentieth Manchester Phonology Meeting.

Bates, D., Maechler, M., 2009. lme4: Linear Mixed-effects Models Using S4 Classes. http://CRAN.R-project.org/package=lme4 (accessed 13.08.10). R package version 0.999375-32.

Bermúdez-Otero, R., 2001. Voicing and Continuancy in Catalan: a Nonvacuous Duke-of-York Gambit and a Richness-of-the-Base Paradox. Ms., University of Manchester, Manchester. http://www.bermudez-otero.com/Catalan.pdf (accessed 10.07.10).

Bermúdez-Otero, R., 2012. Traces of Change in Synchronic Phonology: English Syllabification and the Life Cycle of Lenition. http://www.bermudez-otero.com/PortoAlegre.pdf. Paper presented at the IV Seminário Internacional de Fonologia, Porto Alegre, 26 April 2012.

Blaho, S., 2008. The Syntax of Phonology: a Radically Substance-free Approach (Ph.D. thesis). Universitetet i Tromsø, Tromsø.

Blevins, J., 2004. Evolutionary Phonology. The Emergence of Sound Patterns. Cambridge University Press, Cambridge.

Blevins, J., 2006. A theoretical synopsis of Evolutionary Phonology. Theor. Linguist. 32, 117–166.

Boersma, P., Weenink, D., 2009. Praat: Doing Phonetics by Computer [Computer Programme]. http://www.praat.org/ (accessed 15.10.09). Version 5.1.12.

Bonet, E., Lloret, M.-R., 1998. Fonologia Catalana. Ariel, Barcelona.

Carbonell, J., 1992. Final Devoicing and Voicing Assimilation in Catalan: an Acoustic Experiment (Master’s thesis). University College London, London.

Charles-Luce, J., 1993. The effects of semantic context on voicing neutralization. Phonetica 50, 28–43.

Chen, M., 1970. Vowel length variation as a function of the voicing of the consonant environment. Phonetica 22, 129–159.

Coetzee, A.W., Pretorius, R., 2013. Phonetically grounded phonology and sound change: the case of Tswana labial plosives. J. Phon. 38, 404–421. Poster presented at Phonology 2013, University of Massachusetts.

Cole, J., Iskarous, K., 2001. Effects of vowel context on consonant place identification: implications for a theory of phonologization. In: Hume, E., Johnson, K. (Eds.), The Role of Speech Perception in Phonology. Academic Press, San Diego, pp. 103–122.

Cuartero Torres, N., 2001. Voicing Assimilation in Catalan and English (Ph.D. thesis). Universitat Autònoma de Barcelona.

De Schutter, G., Taeldeman, J., 1986. Assimilatie van Stem in de Zuidelijke Nederlandse Dialekten. In: Devos, M., Taeldeman, J. (Eds.), Vruchten van z’n akker: opstellen van (oud-) medewerkers en oud-studenten voor Prof. V.F. Vanacker. Seminarie voor Nederlandse Taalkunde, Ghent, pp. 91–133.

Dinnsen, D.A., Charles-Luce, J., 1984. Phonological neutralization, phonetic implementation, and individual differences. J. Phon. 12, 49–60.

Ernestus, M., 2011. Gradience and categoricality in phonological theory. In: van Oostendorp, M., Ewen, C.J., Hume, E., Rice, K. (Eds.), Blackwell Companion to Phonology, vol. IV. Wiley-Blackwell, Chichester.

Forrez, G., 1966. Relevante parameters van de stemhebbende fricatief /z/. Inst. Perceptie Onderzoek Verslag (Eindhoven).

Gurevich, N., 2001. A critique of markedness-based theories in phonology. Stud. Linguist. Sci. 31, 89–114.

Hale, M., Reiss, C., 2008. The Phonological Enterprise. Oxford University Press, USA.

Haspelmath, M., 2006. Against markedness (and what to replace it with). J. Linguist. 42, 25–70.

Hayes, B., 1999. Phonetically driven phonology: the role of Optimality Theory and inductive grounding. In: Darnell, M., Moravcsik, E., Noonan, M., Newmeyer, F., Wheatley, K. (Eds.), Functionalism and Formalism in Linguistics, vol. 1. John Benjamins, Amsterdam, pp. 243–285.

Jansen, W., 2004. Laryngeal Contrast and Phonetic Voicing: a Laboratory Phonology Approach to English, Hungarian, and Dutch (Ph.D. thesis). University of Groningen, Groningen.

Jiménez, J., 1999. L’estructura sil·làbica del català. IIFV, Publicacions de l’Abadia de Montserrat, València, Barcelona.

Jiménez, J., Lloret, M.-R., 2008. Asimetrías perceptivas y similitud articulatoria en la asimilación de sonoridad del catalán. Cuadernos de Lingüística del I.U.I. Ortega y Gasset 15, 71–90.

Kluender, K.R., Diehl, R.L., Wright, B.A., 1988. Vowel-length differences before voiced and voiceless consonants: an auditory explanation. J. Phon. 16, 153–169.

Ladefoged, P., Maddieson, I., 1996. The Sounds of the World’s Languages. Blackwell, Cambridge, MA.

Lipski, J.M., 1989. /s/-voicing in Ecuadoran Spanish: patterns and principles of consonantal modification. Lingua 79, 49–71.

Lisker, L., Abramson, A.S., 1964. A cross-language study of voicing in initial stops: acoustical measurements. Word 20, 384–422.

Port, R., O’Dell, M., 1985. Neutralization of syllable-final voicing in German. J. Phon., 455–471.

R Development Core Team, 2005. R: a language and environment for statistical computing. R Foundation for Statistical Computing. Austria, Vienna. http://www.R-project.org. ISBN 3-900051-07-0.

Recasens, D., 1993. Fonètica i fonologia. Enciclopèdia Catalana, Barcelona.

Recasens, D., Mira, M., 2012. Voicing assimilation in Catalan two-consonant clusters. J. Phon. 40, 639–654.

Recasens, D., Mira, M., 2013. Voicing assimilation in Catalan three-consonant clusters. J. Phon. 41, 264–280.

Robinson, K.L., 1979. On the voicing of intervocalic s in the Ecuadorian Highlands. Roman. Philol. 33, 132–143.

Schuchardt, H., 1885 [1972]. On sound laws: against the neogrammarians. In: Vennemann, T., Wilbur, T.H. (Eds.), Schuchardt, The Neogrammarians and the Transformational Theory of Phonological Change: Four Essays. Athenäum Verlag, Frankfurt am Main, pp. 41–72. First published (1885), Über die Lautgesetze: gegen die Junggrammatiker, Berlin: Oppenheim.

Silverman, D., 2006. The diachrony of labiality in Trique, and the functional relevance of gradience and variation. In: Goldstein, L.M., Whalen, D.H., Best, C.T. (Eds.), Papers in Laboratory Phonology VIII. Mouton de Gruyter, Berlin, pp. 133–154.

Simon, E., 2010. Voicing in Contrast. Acquiring a Second Language Laryngeal System. Academia Press (Ginkgo series), Ghent.

Slis, I.H., 1986. Assimilation of voice in Dutch as a function of stress, word boundaries and sex of speaker and listener. J. Phon. 14, 311–326.

Snoeren, N.D., Hallé, P.A., Segui, J., 2006. A voice for the voiceless: production and perception of assimilated stops in French. J. Phon. 34, 241–268.

Solé, M.-J., 2010. Effects of syllable position on sound change: an aerodynamic study of final fricative weakening. J. Phon. 38, 289–305.

Stevens, K., 1998. Acoustic Phonetics. MIT Press, Cambridge, MA.

Stevens, K.N., Blumstein, S.E., Glicksman, L., Burton, M., Kurowski, K., 1992. Acoustic and perceptual characteristics of voicing in fricatives and fricative clusters. J. Acoust. Soc. Am. 91, 2979–3000.

Strycharczuk, P., 2012. Phonetics-phonology Interactions in Pre-sonorant Voicing (Ph.D. thesis). University of Manchester.

Strycharczuk, P., Simon, E., 2013. Explaining pre-sonorant voicing. The case of West-Flemish. Nat. Lang. Linguist. Theory 31, 563–588.

Strycharczuk, P., van ’t Veer, M., Bruil, M., Linke, K., 2013. Phonetic evidence on phonology-morphosyntax interactions. Sibilant voicing in Quito Spanish. J. Linguist. 50, 403–452.

Vaux, B., 2008. Why the phonological component must be serial and rule-based. In: Vaux, B., Nevins, A. (Eds.), Rules, Constraints and Phonological Phenomena. Oxford University Press, Oxford, pp. 20–60.

Vennemann, T., 1972. Phonetic analogy and conceptual analogy. In: Vennemann, T., Wilbur, T.H. (Eds.), Schuchardt, the Neogrammarians, and the Transformational Theory of Phonological Change. Athenäum Verlag, Frankfurt am Main, pp. 181–204.

Warner, N., Jongman, A., Sereno, J., Kemps, R., 2004. Incomplete neutralization and other sub-phonemic durational differences in production and perception: evidence from Dutch. J. Phon. 32, 2.

Weijnen, A., 1991. Vergelijkende Klankleer van de Nederlandse Dialecten. SDU, The Hague.

Wheeler, M.W., 2005. The Phonology of Catalan. Oxford University Press, New York.