Speech Timing and Rhythm

Dafydd Gibbon

Jinan University, Guangzhou, China, November 2017

Background: the Architecture of Speech and Language

The Ranks and Interpretations Model

Guangzhou, November 2017 D. Gibbon: Acoustic Phonetics 2: Speech Timing 3

The architecture of language: Ranks and Interpretations

(MORPHO)PHONEME

MORPHEME

LEXICAL ROOT

DERIVED WORD

COMPOUND WORD

PHRASE

CLAUSE

SENTENCE

TEXT

LEXICON – holistic properties, opacity

DISCOURSE

Grammar – compositionality

MULTIMODAL HIERARCHIES: speech, writing, gesture

SEMANTICS/PRAGMATIC HIERARCHIES: concepts, objects, events

Categorial Ranks / Interpretations

semiotic relation between meaning and phonetic form

Prosody in the Ranks and Interpretations Model

(MORPHO)PHONEME

MORPHEME

LEXICAL ROOT

DERIVED WORD

COMPOUND WORD

PHRASE

CLAUSE

SENTENCE

TEXT

LEXICON – holistic properties, opacity

DISCOURSE

Grammar – compositionality

Prosodic-phonetic Interpretation

phoneme rank: segment/tone/accent/stress

word rank: morphological tone/accent/stress

sentence, clause, phrase rank: intonation, rhythm, phrasal accent, boundary tone

utterance rank: intonation, rhythm

discourse rank: intonation, rhythm

Rank

Speech Timing:

Regularities and Rhythm

Speech timing

Relevance of speech timing

– Studies in prosodic typology of timing, e.g. mora, syllable, foot timing (depending on annotation)
– Studies in musicology, e.g. song, music performance
– Speech technology:
  ● measuring foreign language phonetic proficiency
  ● diagnosis and therapy in speech pathology
  ● natural duration models for speech synthesis
  ● designing disambiguation models in speech recognition

Speech timing

● Discourse rank:
  – prosodic adaptation, turn-taking (sequence, interruption)
  – back channel intonation
● Text/utterance rank:
  – rhetorical pause, rhythm
  – timing of new/old information
● Sentence, phrase rank:
  – stress or syllable timed regularities
  – phrase-final lengthening
● Word rank (simple, derived, compound, inflected):
  – mora, syllable, foot timing (depending on annotation)

Measuring Timing Regularities

Analysis of timing relations

1. Empirical resources
   1. corpus creation: planning, recording, storage
   2. transcription, annotation
2. Method
   1. Recording, transcription, annotation
   2. Measurement:
      1. Manual, with spreadsheets
      2. Automatic analysis, e.g. TGA, Time Group Analysis – an online tool for studying speech timing and rhythm:
         http://wwwhomes.uni-bielefeld.de/gibbon/TGA/

Speech data: corpus creation

● Pre-recording phase:
  ● definition of purposes for which the data will be used
  ● scenario: domain, activities, speakers
  ● equipment and technical operator:
    – general: digital audio (recorder / laptop), digital video
    – specialised: laryngograph, etc.
● Recording phase:
  ● negotiate scenario with chiefs, elders, speakers
  ● ensure the recording location is quiet
  ● if possible ensure the microphones, video tripod etc. can be stably positioned
● Post-recording phase:
  ● provide recordings with metadata immediately
  ● label the data media immediately
  ● make safety copies immediately

Timing and stress: pitch pattern and syllable duration

Questions for discussion:

● Measure the durations of the speech sounds in these words.
● Can you order the types of speech sound by their average durations?

Regularity and Rhythm in Speech

Rhythm is an emergent property of timing, determined by multiple factors in the typology of languages:

● Functional factors – well investigated:
  Discourse: speech rate, pauses
● Grammatical factors – well investigated:
  Lexical: contrastive duration (2 or 3 values)
  Phrasal: relations between stressed and unstressed items
  Discoursal: rhetorical pause
● Phonetic factors – somewhat controversial:
  Inherent consonant and vowel duration
  Balance of consonant and vowel duration in syllables
  Language-specific compositional units of timing: mora – syllable – foot

A common theory of timing regularity distinguishes between different kinds of regularity in timing in different types of language:

● stress timing (or foot timing) – a ‘foot’ or ‘rhythm unit’ is a sequence of syllables consisting of a stressed syllable and its neighbouring unstressed syllables; there are different theories of foot structure

● syllable timing – a ‘syllable’ is a sequence of speech sounds consisting of a vowel and its neighbouring consonants; there are different theories of syllable structure

● mora timing – a ‘mora’ is a unit of timing which is smaller than the syllable

Duration properties of speech sounds:
● intrinsic duration:
  – vowels
  – consonants
● contrastive duration:
  – vowels
  – consonants
● rhythmic duration:
  – strong – weak durations:
    ● syllable patterns of consonant and vowel alternations: C – V – C – V ...
    ● foot patterns of stressed and unstressed syllable alternations: CVC – cvc – CVC – cvc ...

Regularity and Rhythm in Speech: A Basic Rhythm Model

Dafydd Gibbon, Guangzhou Prosody Lectures, November 2016

Lecture 6: Speech Timing 20

Discovering regularities in rhythm:

Relations between neighbouring syllables

Partial recovery of alternation: Wagner quadrants

Wagner (2006) has a topological procedure for recovering non-absolute differences by plotting DUR(i) against DUR(i+1), i.e. the duration of each syllable against the duration of the following syllable:

Note: still binary relations

However, 4 quadrants permit distinguishing between long-short & short-long
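
The quadrant idea can be sketched in a few lines of Python. This is a minimal illustration only: it assumes that durations are z-scored and that the quadrant boundaries lie at zero (the mean), which may differ from the normalisation actually used by Wagner (2006). The function names and the duration values are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(durations):
    """Standardise durations: z = (x - mean) / standard deviation."""
    mu = mean(durations)
    sigma = pstdev(durations)
    return [(d - mu) / sigma for d in durations]

def quadrant(z_i, z_next):
    """Name the quadrant of one (DUR(i), DUR(i+1)) point.

    Assumption of this sketch: 'long' means above the mean (z > 0),
    'short' means at or below the mean.
    """
    first = "long" if z_i > 0 else "short"
    second = "long" if z_next > 0 else "short"
    return f"{first}-{second}"

def quadrant_labels(durations):
    """Label every pair of neighbouring durations with its quadrant."""
    z = z_scores(durations)
    return [quadrant(a, b) for a, b in zip(z, z[1:])]

# Hypothetical syllable durations in seconds (not measured data).
durations = [0.30, 0.12, 0.28, 0.10, 0.32, 0.11]
labels = quadrant_labels(durations)
print(labels)
```

With strictly alternating durations like these, the points fall only into the long-short and short-long quadrants, which is the distinction that a single difference value cannot express.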

Binary duration relations: Wagner Quadrants for German

[Figure: scatter plot of neighbouring syllable durations, D(i) against D(i+1). Green: stressed -> unstressed; blue: unstressed -> stressed; red: ... -> phrase-final.]

Comment: stress timed - green & blue disjoint

Binary duration relations: Wagner Quadrants for English

[Figure: scatter plot of neighbouring syllable durations, D(i) against D(i+1). Green: stressed -> unstressed; blue: unstressed -> stressed; red: ... -> phrase-final.]

Comment: stress timed - green & blue disjoint

Binary duration relations: Wagner Quadrants for French

[Figure: scatter plot of neighbouring syllable durations, D(i) against D(i+1). Green: stressed -> unstressed; blue: unstressed -> stressed; red: ... -> phrase-final.]

Comment: syllable timed - green & blue overlap

Binary duration relations: Wagner Quadrants for Italian

[Figure: scatter plot of neighbouring syllable durations, D(i) against D(i+1). Green: stressed -> unstressed; blue: unstressed -> stressed; red: ... -> phrase-final.]

Comment: stress timed - green & blue disjoint

Binary duration relations: Wagner Quadrants for Polish

[Figure: scatter plot of neighbouring syllable durations, D(i) against D(i+1). Green: stressed -> unstressed; blue: unstressed -> stressed; red: ... -> phrase-final.]

Comment: highly syllable timed - green & blue overlap

Discovering regularities in rhythm:

dynamic timing models

Models of rhythm and entrainment

● Fred Cummins and Robert Port
  – rhythm
  – entrainment of rhythm
● Plinio Barbosa
  – rhythm as oscillation
  – different domains of oscillation
● Petra Wagner and associates
  – examination of different oscillator models

Barbosa’s dynamic timing model

Def. “rhythm”: speech rhythm is understood as the consequence of the variation of perceived duration along the entire utterance.

Two levels of duration encoding / control / specification, with coupling between two oscillators:
– syllabic: intrinsic lexical level
– phrasal: extrinsic, properly rhythmic level
– entrainment (coupling) of the oscillators

Emulation of results of other rhythm studies:
– the greater w₀, the more like stress-timing
– the smaller w₀, the more like syllable-timing

Barbosa’s dynamic rhythm model

[Figure labels: phrase pulse; syllable oscillations (for English these could be stress oscillations)]

Note also work by Cummins, Port, Wagner, Windman and others on oscillator models of rhythm.

Regularity and Rhythm in Speech:

‘Isolated’ words in citation contexts

Timing and stress: pitch pattern and syllable duration

Questions for discussion:
● What are the durations of the syllables?
● What are the ratios between the durations of syllables in each word?
● Assuming (a big assumption) that the effect of stress is the same, whether the syllable is first or second, what is the effect of final lengthening?

Which is more important for stress – pitch change or duration?
Which is more important for duration – stress or final lengthening?

Word      Syllable 1   Syllable 2   Ratio (syllable 2 : syllable 1)
IMport!   0.18         0.47         2.6:1
imPORT!   0.22         0.49         2.2:1
IMport?   0.22         0.49         2.2:1
imPORT?   0.17         0.5          2.9:1
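
The ratios can be recomputed from the listed durations with a short Python sketch. The dictionary layout, the joined word labels and the function name are illustrative assumptions; only the duration values come from the slide.

```python
# Syllable durations as listed on the slide: (first syllable, second syllable).
# Capitals mark the stressed syllable; "!" and "?" mark the two utterance types.
measurements = {
    "IMport!": (0.18, 0.47),
    "imPORT!": (0.22, 0.49),
    "IMport?": (0.22, 0.49),
    "imPORT?": (0.17, 0.50),
}

def duration_ratio(first, second):
    """Ratio of the second (word-final) syllable to the first syllable."""
    return second / first

ratios = {
    word: round(duration_ratio(first, second), 1)
    for word, (first, second) in measurements.items()
}

for word, ratio in ratios.items():
    print(f"{word}: {ratio}:1")
```

In all four cases the final syllable is more than twice as long as the first, whether or not it carries the stress, which is the observation behind the question about final lengthening.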

Regularity and Rhythm in Speech:

Words and phrases in utterance contexts

Timing and stress: pitch pattern and syllable duration

Tasks for discussion:
● Enter the durations of the syllables into a spreadsheet.
● Find the mean & standard deviation of all syllable durations (no pauses).
● Find the mean and standard deviation of the stressed syllable durations.
● Find the mean and standard deviation of the unstressed syllable durations.
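
The same tasks can also be done outside a spreadsheet. The following Python sketch is one possible way, using invented example values: the syllable labels, durations and stress marks are hypothetical, and `pstdev` is chosen because it divides by n, like the standard deviation formula used in these slides.

```python
from statistics import mean, pstdev

# Hypothetical annotation: (syllable label, duration in seconds, stressed?).
# Pauses are assumed to have been left out already.
syllables = [
    ("syl1", 0.30, True),
    ("syl2", 0.10, False),
    ("syl3", 0.26, True),
    ("syl4", 0.14, False),
    ("syl5", 0.28, True),
    ("syl6", 0.12, False),
]

def summarise(durations):
    """Return (mean, standard deviation) of a list of durations."""
    return mean(durations), pstdev(durations)

all_durations = [d for _, d, _ in syllables]
stressed = [d for _, d, is_stressed in syllables if is_stressed]
unstressed = [d for _, d, is_stressed in syllables if not is_stressed]

all_mean, all_sd = summarise(all_durations)
stressed_mean, stressed_sd = summarise(stressed)
unstressed_mean, unstressed_sd = summarise(unstressed)

print(f"all:        mean={all_mean:.3f} sd={all_sd:.3f}")
print(f"stressed:   mean={stressed_mean:.3f} sd={stressed_sd:.3f}")
print(f"unstressed: mean={unstressed_mean:.3f} sd={unstressed_sd:.3f}")
```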

Words and phrases in utterance contexts

● Procedure:
  – Select a Praat TextGrid
  – Enter into a spreadsheet the timestamps of items in the tier you are examining (e.g. syllables):
    start timestamp
    end timestamp
  – Calculate durations of the items: end – start
  – Calculate the mean (average) duration
  – Calculate the standard deviation of the duration
  – Calculate the coefficient of variation (relative standard deviation) of the duration
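
The procedure above can be sketched in Python as follows. The timestamps are invented and are assumed to have been copied by hand from one TextGrid tier; reading the TextGrid file itself is not shown here.

```python
from statistics import mean, pstdev

# Hypothetical (start, end) timestamps in seconds for the items of one tier.
intervals = [(0.00, 0.25), (0.25, 0.40), (0.40, 0.70), (0.70, 0.80)]

# Duration of each item: end - start.
durations = [end - start for start, end in intervals]

mu = mean(durations)       # mean (average) duration
sigma = pstdev(durations)  # standard deviation, dividing by n
cv = sigma / mu            # coefficient of variation (relative standard deviation)

print(f"mean={mu:.3f} s, sd={sigma:.3f} s, CV={cv:.3f}")
```

Because the coefficient of variation divides the standard deviation by the mean, it has no unit, which makes it easier to compare recordings with different speech rates.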

Some data processing functions used in timing analysis

Mean (average):

\[ \mu = \frac{\sum_{i=1}^{n} x_i}{n} \]

Standard deviation (of sample):

\[ \sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}} \]

Coefficient of variation (relative standard deviation), ‘varco’:

\[ CV = \frac{\sigma}{\mu} \]

Normalisation or standardisation of data values from different sources in order to make them comparable, the z-score (standard score):

\[ z = \frac{x - \mu}{\sigma} \]

Task for discussion:

First calculate the z-scores for your data, then calculate the mean, standard deviation and coefficient of variation.
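
A minimal Python sketch of the task, using invented durations, is given below. It divides by n in the standard deviation, as in the formula above; the function name and the values are illustrative assumptions.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Standardise values: z = (x - mean) / standard deviation (dividing by n)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

# Hypothetical syllable durations in seconds (not measured data).
durations = [0.25, 0.15, 0.30, 0.10]

z = z_scores(durations)
z_mean = mean(z)
z_sd = pstdev(z)

print([round(value, 3) for value in z])
print(f"mean of z-scores: {z_mean:.3f}, sd of z-scores: {z_sd:.3f}")
# The mean of z-scores is 0 and their standard deviation is 1 by construction,
# so the coefficient of variation (sd / mean) is not meaningful for z-scores.
```

This is worth checking on real data: after standardisation every data set has the same mean and standard deviation, so the comparison between sources has to be made on the individual z-values or on measures derived from them.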

Speech timing – regularity measures

‘Rhythm metrics’ of relative isochrony:
– measures of regularity ... irregularity of timing units:
  – σ: Standard Deviation
  – PIM: Pairwise Irregularity Measure
  – PFD: Pairwise Foot Difference
  – rPVI, nPVI: raw and normalised Pairwise Variability Index
– not rhythm: they ignore rhythmic alternation
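
For reference, the two PVI variants are commonly defined as follows. These formulas are supplied here as background and do not appear on the slides; d_k stands for the duration of the k-th interval and m for the number of intervals, and both symbols are notation chosen for this note.

```latex
\[ \mathrm{rPVI} = \frac{1}{m-1} \sum_{k=1}^{m-1} \left| d_k - d_{k+1} \right| \]

\[ \mathrm{nPVI} = 100 \times \frac{1}{m-1} \sum_{k=1}^{m-1}
   \left| \frac{d_k - d_{k+1}}{(d_k + d_{k+1})/2} \right| \]
```

Both indices average the absolute difference between neighbouring durations. Taking the absolute value discards the direction of each difference, so long-short and short-long pairs contribute equally.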

– rPVI: used for consonants, whose duration is not so variable
– nPVI: used for vowels, whose duration is quite variable

Timing: temporal relations – rhythm metrics

Task:
1. Choose a Mandarin TextGrid and an English TextGrid.
2. Calculate the z-score (optional step).
3. Calculate the nPVI for each.
4. Compare these results with the results for Standard Deviation.
5. Can you draw any conclusions about syllable timing and stress timing in English and Mandarin?
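
Steps 3 and 4 of the task could be sketched in Python as below. The two duration lists are invented to show the contrast between strongly alternating and nearly even timing; they are not Mandarin or English measurements, and the function names are hypothetical. The sketch uses the commonly cited PVI definitions, which are not given on the slides.

```python
from statistics import pstdev

def rpvi(durations):
    """Raw PVI: mean absolute difference between neighbouring durations."""
    pairs = list(zip(durations, durations[1:]))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)

def npvi(durations):
    """Normalised PVI: each difference is divided by the mean of its pair,
    and the average is scaled by 100."""
    pairs = list(zip(durations, durations[1:]))
    return 100 * sum(abs(a - b) / ((a + b) / 2) for a, b in pairs) / len(pairs)

# Hypothetical syllable durations in seconds (not measured data).
alternating = [0.30, 0.10, 0.30, 0.10, 0.30, 0.10]  # long-short alternation
even = [0.20, 0.21, 0.19, 0.20, 0.21, 0.19]         # nearly equal durations

for name, durations in (("alternating", alternating), ("even", even)):
    print(f"{name}: sd={pstdev(durations):.3f} "
          f"rPVI={rpvi(durations):.3f} nPVI={npvi(durations):.1f}")
```

On these invented values both the standard deviation and the nPVI are much higher for the alternating sequence. With real TextGrid data the two measures need not agree so closely, which is the point of comparing them in step 4.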
