Phonology and Phonetics of Rhythm: States and Times or Cycles and Frequencies? Dafydd Gibbon Bielefeld University Jinan University 13th Chinese Phonetics Conference Guangzhou, November 2018

Page 1: States and Times or Cycles and Frequencies?

Phonology and Phonetics of Rhythm:

States and Times or Cycles and Frequencies?

Dafydd Gibbon

Bielefeld University, Jinan University

13th Chinese Phonetics Conference, Guangzhou, November 2018

Page 2: States and Times or Cycles and Frequencies?

13th Chinese Phonetics Conference, 2018 D. Gibbon, States and Times or Cycles and Frequencies 2

Theses: YARD (Yet Another Discussion of Rhythm)

In speech (and in music):
1) Rhythms are oscillations, not sequences of isochronous states
2) Rhythmic oscillations are essentially linear (‘finite state’)
3) Rhythms occur in time domains of different sizes, each of them linear
4) The time domains are associated with ranks of units of speech/language from discourse to phoneme

So rhythms appear to be hierarchical, but the hierarchy
● is layered
● has limited depth
● has linear layers with iterative cycles
● is not a general recursive hierarchy

Page 3: States and Times or Cycles and Frequencies?

Linear Layered Discourse Rhythms

57 seconds of a BBC news broadcast, from Aix-MARSEC corpus (A0101B)

Two frequency zones:
1) Approximately 120 – 220 Hz
2) Approximately 300 – 400 Hz (on ‘paratone’ onsets)

Finite-depth linear 2-cycle iterative layered structure (FS, cf. Pierrehumbert; not a recursive hierarchy)

● Two independently motivated but structurally dependent linear layers (like hours and minutes on a clock)
● Each gradually declining and periodically resetting

Page 4: States and Times or Cycles and Frequencies?

The Multilinear Rank Interpretation Model

Three ‘hierarchies’, not one, with a ternary semiotic basis for signs at all ranks.

[Diagram labels: Semantic Interpretation (1. Denotation, Reference; 2. Symbols, Icons, Indices in Space-Time); semiotic relation; 3. Categorial simple and structured forms; Multimodal Interpretation (hierarchical patterning; Focus, Contrast, Emphasis; linear patterns)]

Page 5: States and Times or Cycles and Frequencies?

The Multilinear Rank Interpretation Model

Ranks: Categories and their Interpretations

Multilinear Category Ranks:
● Discourse: Monologue, Dialogue
● Utterance: turn, IPU, ...
● Sentence, clause, phrase
● Word: simple, inflected, compound, derived

[Diagram label: Multimodal Rank Interpretation Architecture]

Page 6: States and Times or Cycles and Frequencies?

The discourse dimension: people communicating with their multimodal rank interpretation architectures

The Multilinear Rank Interpretation Model

Page 7: States and Times or Cycles and Frequencies?

States and Times or Cycles and Frequencies? - Overview

1. Rhythm Communities
2. Phonological cycles: abstract, iterative rhythm
3. Phonetic states: sequences of isochronous states as rhythm
4. Phonetic cycles: rhythm as oscillation:
   – Production as amplitude and frequency modulation
   – Perception as amplitude and frequency demodulation
5. Pitch cycles: the role of F0 in rhythm
   – AM and FM similarities
   – Some problems with F0 estimators (aka ‘pitch trackers’)
6. More roles of F0 in discourse rhythms

Page 8: States and Times or Cycles and Frequencies?

Rhythm Communities – a Selection

Conversation Analysis

Linguistic Analysis

Phonetic Annotation Analysis

Phonetic Signal Analysis

Page 9: States and Times or Cycles and Frequencies?

Rhythm Communities – a Selection

Conversation Analysis

‘Authentic’ everyday dialogue data
Ethnomethodology: task scenario
Systematic but subjective judgments
Focus on interpretation:
● discourse grammar: framing
● semantics: topic development
● pragmatics: speaker, addressee

Linguistic Analysis

Constructed data paradigms
Rules for verbal categories
Abstract stress position patterns:
● linear structures, grids
● hierarchical structures, trees
● words
● sentences

Phonetic Annotation Analysis

Standardised reading data
● sentences
● stories (The North Wind…)

Annotation:
● manual, semi-automatic
● syllables, feet, C or V chunks

Focus on isochrony

Phonetic Signal Analysis

Smoothing models
● splines, polynomials

Oscillation models
● production oriented
● perception oriented
● Amplitude Modulation Spectrum
● Frequency Modulation Spectrum

Page 10: States and Times or Cycles and Frequencies?

Rhythm Communities – a Selection

Tree Structures

Linguistics: deductive, top-down
● paradigmatic (classificatory)
● syntagmatic (compositional)

Phonetics: inductive, bottom-up
● classificatory (CART etc.)
● compositional

Page 11: States and Times or Cycles and Frequencies?

Phonological Iteration as Abstract Oscillation:

Iterative intonation

Iterative tonal sandhi (Niger Congo)

Iterative tonal sandhi (Tianjin dialect)

Note: by ‘cycle’ I do not mean tone cycles in the paradigmatic, classificatory sense, but syntagmatic, compositional iterations.

Page 12: States and Times or Cycles and Frequencies?

Empirical overgeneration
1) Accents in a sequence tend to be all H* or all L*
2) Global contours tend to be rising with L* accents, falling with H* accents
3) Global contours may span more than 1 turn

Empirical undergeneration
1) Paratone hierarchy not included
2) No time constraints

Pierrehumbert’s regular grammar / finite state transition network

Phonological iteration as a layered hierarchy of linear abstract oscillations

Not the first (cf. Reich, ’t Hart et al., Fujisaki, …)

But linguistically the most interesting.
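
The finite-state reading of an intonation grammar can be sketched as a small transition network. This is a minimal sketch, not Pierrehumbert's full grammar: the tone inventory is reduced (as an assumption) to H*/L* pitch accents, H-/L- phrase accents and H%/L% boundary tones, and the state names and the `accepts` helper are hypothetical. It shows the pitch-accent loop as the iterative cycle, and why such a network overgenerates: mixed H*/L* sequences are accepted as readily as uniform ones.

```python
# Simplified finite-state transition network for one intonation phrase.
# Assumption: reduced tone inventory; state names are illustrative only.
TRANSITIONS = {
    ("start", "H*"): "accents",
    ("start", "L*"): "accents",
    ("accents", "H*"): "accents",   # the iterative cycle
    ("accents", "L*"): "accents",   # the iterative cycle
    ("accents", "H-"): "phrase",
    ("accents", "L-"): "phrase",
    ("phrase", "H%"): "end",
    ("phrase", "L%"): "end",
}


def accepts(tones):
    """Return True if the tone sequence is generated by the network."""
    state = "start"
    for tone in tones:
        state = TRANSITIONS.get((state, tone))
        if state is None:          # no transition defined: reject
            return False
    return state == "end"


uniform = accepts(["H*", "H*", "H*", "L-", "L%"])
mixed = accepts(["H*", "L*", "H*", "L-", "L%"])   # accepted: overgeneration
no_accent = accepts(["L-", "L%"])                   # rejected: no pitch accent
```

Nothing in the network constrains time or the choice between H* and L* from one accent to the next, which is the point of the over- and undergeneration lists on this slide.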

Page 13: States and Times or Cycles and Frequencies?

From an allotonic point of view:

● 3 cycles
● 1-tape (= 1-level) transition network

Niger-Congo Iterative Tonal Sandhi (the most general case)

Phonological iteration as a layered hierarchy of linear abstract oscillations

At the most abstract level, just one node with H and L cycling around it.

Page 14: States and Times or Cycles and Frequencies?

From an allotonic point of view:

● 3 cycles
● 2-tape (= 2-level) transition network

Niger-Congo Iterative Tonal Sandhi (the most general case)

Phonological iteration as a layered hierarchy of linear abstract oscillations

Page 15: States and Times or Cycles and Frequencies?

From a phonetic signal processing point of view:

● 3 cycles
● 3-tape (= 3-level) transition network

Phonological iteration as a layered hierarchy of linear abstract oscillations

Niger-Congo Iterative Tonal Sandhi (the most general case)

Page 16: States and Times or Cycles and Frequencies?

Martin Jansche 1998
Tianjin Mandarin tone sandhi

Tianjin Dialect Iterative Tonal Sandhi

Phonological iteration as a layered hierarchy of linear abstract oscillations

Page 17: States and Times or Cycles and Frequencies?

Phonetics, the Search for Rhythm I

States and Times: the Isochrony Approach

● 1-dimensional
● 2-dimensional
● 3-dimensional

Page 18: States and Times or Cycles and Frequencies?

One-dimensional annotation mining of time-stamp durations

One-dimensional because the result of the analysis is a single scale. The results are all comparable to variance or standard deviation, but differ in detail.

For example, with the nPVI, subtraction is between neighbouring data in a moving window (so a kind of AMDF, Average Magnitude Difference Function), not between mean and data, thus factoring out tempo variations to some extent.

(or Standard Deviation)
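
The contrast between a mean-based measure and the neighbour-based nPVI can be sketched in a few lines. This is a minimal sketch using the usual nPVI definition (mean of the absolute difference of adjacent durations, each normalised by the pair's mean, times 100); the duration values are invented toy data in seconds, not measurements from the talk.

```python
import statistics


def npvi(durations):
    """Normalised Pairwise Variability Index of a duration sequence."""
    pairs = zip(durations, durations[1:])          # moving window of size 2
    terms = [abs(a - b) / ((a + b) / 2) for a, b in pairs]
    return 100 * sum(terms) / len(terms)


isochronous = [0.2, 0.2, 0.2, 0.2]
alternating = [0.1, 0.3, 0.1, 0.3]
slowing = [0.10, 0.11, 0.12, 0.13]   # gradual tempo change, no alternation

npvi_iso = npvi(isochronous)
npvi_alt = npvi(alternating)
npvi_slow = npvi(slowing)
sd_slow = statistics.pstdev(slowing)  # mean-based measure, for comparison
```

Because each difference is normalised locally, the gradual slowing-down contributes little to the nPVI, while the strict short-long alternation gives a high value. Neither measure says whether the long and short units actually alternate in a regular cycle.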

Page 19: States and Times or Cycles and Frequencies?

Two-dimensional annotation mining of time-stamp durations

Wagner, Petra. “Visualizing levels of rhythmic organisation.” Proc. International Congress of Phonetic Sciences, Saarbrücken, pp. 1113-1116, 2007.

Two-dimensional because duration relations are represented in a z-scored scatter plot, not as a single scale.

Result, visualising the scale in two dimensions:
● Mandarin: means are scattered relatively evenly around the centre
● English: e.g. count(short-short) > count(long-long), not binary!

[Figure: scatter-plot quadrants labelled LONG-LONG, LONG-SHORT, SHORT-SHORT, SHORT-LONG]
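
The quadrant labels can be illustrated by z-scoring a duration sequence and counting where each pair of neighbouring units falls. This is a simplified sketch, not the procedure of the cited paper: the threshold at z = 0, the helper name `quadrant_counts` and the toy durations are all assumptions made for illustration.

```python
import statistics
from collections import Counter


def quadrant_counts(durations):
    """Count adjacent duration pairs per quadrant of the z-scored plane."""
    mean = statistics.mean(durations)
    sd = statistics.pstdev(durations)
    z = [(d - mean) / sd for d in durations]
    counts = Counter()
    for first, second in zip(z, z[1:]):
        label_1 = "LONG" if first > 0 else "SHORT"     # assumed threshold z = 0
        label_2 = "LONG" if second > 0 else "SHORT"
        counts[f"{label_1}-{label_2}"] += 1
    return counts


# Toy syllable durations in seconds (hypothetical values).
durations = [0.10, 0.12, 0.30, 0.09, 0.11, 0.28, 0.10]
counts = quadrant_counts(durations)
```

With these toy values the SHORT-SHORT quadrant is populated and the LONG-LONG quadrant is empty, the kind of asymmetry the slide reports for English.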

Page 20: States and Times or Cycles and Frequencies?

Three-dimensional annotation mining of time-stamp durations

Gibbon, Dafydd. “Time types and time trees: Prosodic mining and alignment of temporally annotated data.” In: Stefan Sudhoff et al., eds. Methods in Empirical Prosody Research. Berlin: Walter de Gruyter, pp. 281–309, 2006.

Three-dimensional because alternative trees are possible, depending on the algorithm settings:
- binary/nonbinary, lower/higher percolated
- related to phrasal and discourse patterns

Induction of compositional tree structures

Page 21: States and Times or Cycles and Frequencies?

All these approaches
● are based on annotated time-stamps, not the signal
● model static durations
● use pairwise relations
● focus on isochrony, equal timing of durations
● completely ignore the essential property of rhythm, which is alternation of units with approximately equal duration
● in other words: oscillation

[Diagram: Time Stamps: Annotation Mining; 1D Isochrony; 2D Relations; 3D Time Trees]

Page 22: States and Times or Cycles and Frequencies?

Phonetics, The Search for Rhythm II

Cycles and Frequencies: Oscillation Approaches

Rhythm Production as Amplitude Modulation (AM)
Rhythmic Amplitude Envelope + Carrier Frequency

There are many studies of oscillation in production.

I will concentrate on ...

Rhythm Perception as Amplitude Demodulation
Rhythmic Amplitude Envelope – Carrier Frequency
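
The production side can be made concrete with a synthetic signal in which a slow rhythmic envelope modulates a faster carrier. This is a minimal sketch of textbook amplitude modulation (envelope multiplied by carrier); the sample rate, carrier frequency, modulation rate and depth are assumed values chosen for illustration, not parameters from the talk.

```python
import math

SAMPLE_RATE = 1000   # Hz (assumed)
CARRIER_HZ = 150.0   # stands in for a voice-like carrier (assumed)
RHYTHM_HZ = 4.0      # syllable-rate modulation frequency (assumed)
DEPTH = 0.8          # modulation depth between 0 and 1 (assumed)


def am_signal(duration_s):
    """Return a carrier whose amplitude follows a slow sinusoidal envelope."""
    samples = []
    for n in range(int(duration_s * SAMPLE_RATE)):
        t = n / SAMPLE_RATE
        envelope = 1.0 + DEPTH * math.sin(2 * math.pi * RHYTHM_HZ * t)
        carrier = math.sin(2 * math.pi * CARRIER_HZ * t)
        samples.append(envelope * carrier)
    return samples


signal = am_signal(1.0)
peak = max(abs(s) for s in signal)   # bounded by 1 + DEPTH
```

Demodulation reverses the step: the carrier is removed and the 4 Hz envelope, the rhythm, is what remains.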

Page 23: States and Times or Cycles and Frequencies?

Selected Work on Amplitude Envelope Demodulation Spectra

[1] Cummins, Fred, Felix Gers and Jürgen Schmidhuber. “Language identification from prosody without explicit features.” Proc. Eurospeech, 1999.
[2] He, Lei and Volker Dellwo. “A Praat-Based Algorithm to Extract the Amplitude Envelope and Temporal Fine Structure Using the Hilbert Transform.” Proc. Interspeech 2016, San Francisco, pp. 530-534, 2016.
[3] Hermansky, Hynek. “History of modulation spectrum in ASR.” Proc. ICASSP, 2010.
[4] Leong, Victoria and Usha Goswami. “Acoustic-Emergent Phonology in the Amplitude Envelope of Child-Directed Speech.” PLoS One 10(12), 2015.
[5] Leong, Victoria, Michael A. Stone, Richard E. Turner and Usha Goswami. “A role for amplitude modulation phase relationships in speech rhythm perception.” Journal of the Acoustical Society of America, 2014.
[6] Liss, Julie M., Sue LeGendre and Andrew J. Lotto. “Discriminating Dysarthria Type From Envelope Modulation Spectra.” Journal of Speech, Language and Hearing Research 53(5):1246–1255, 2010.
[7] Ludusan, Bogdan, Antonio Origlia and Francesco Cutugno. “On the use of the rhythmogram for automatic syllabic prominence detection.” Proc. Interspeech, pp. 2413-2416, 2011.
[8] Ojeda, Ariana, Ratree Wayland and Andrew Lotto. “Speech rhythm classification using modulation spectra (EMS).” Poster presentation at the 3rd Annual Florida Psycholinguistics Meeting, 21.10.2017, U Florida, 2017.
[9] Tilsen, Samuel and Keith Johnson. “Low-frequency Fourier analysis of speech rhythm.” Journal of the Acoustical Society of America 124(2):EL34–EL39, 2008. [PubMed: 18681499]
[10] Tilsen, Samuel and Amalia Arvaniti. “Speech rhythm analysis with decomposition of the amplitude envelope: Characterizing rhythmic patterns within and across languages.” Journal of the Acoustical Society of America 134, p. 628, 2013.
[11] Todd, Neil P. McAngus and Guy J. Brown. “A computational model of prosody perception.” Proc. ICSLP 94, pp. 127-130, 1994.
[12] Varnet, Léo, Maria Clemencia Ortiz-Barajas, Ramón Guevara Erra, Judit Gervain and Christian Lorenzi. “A cross-linguistic study of speech modulation spectra.” Journal of the Acoustical Society of America 142(4), 1976–1989, 2017.

Page 24: States and Times or Cycles and Frequencies?

Rhythm is oscillation and iteration

[Diagram labels:]
● INFORMATION, AMPLITUDE MODULATION, CARRIER FREQUENCY, NOISE FREQUENCIES, FILTER COEFFICIENTS (combined by + and ✕)
● INFORMATION, FREQUENCY MODULATION
● SPECTRAL ANALYSES in different frequency zones, COORDINATION
● AMPLITUDE DEMODULATION in different time zones: rectification, LP filtering, envelope detection
● FREQUENCY DEMODULATION in different frequency zones: pitch tracking, formant tracking
● AM: multiple oscillators, production emulation
● AM: spectral zones, perception emulation
● FM: discourse, turns, emotion

Page 25: States and Times or Cycles and Frequencies?

Modulation Regularities – Maybe the Long-Term Spectrum?

Rhythm: the long-term spectrum?

Page 26: States and Times or Cycles and Frequencies?

Modulation Regularities – Maybe the Long-Term Spectrum?

[Figure label: F0]

Rhythm: the long-term spectrum?

Page 27: States and Times or Cycles and Frequencies?

Modulation Regularities – Maybe the Long-Term Spectrum?

[Figure labels: F0, F0, F0 harmonics]

Long-term spectrum: not so useful

Rhythm: the long-term spectrum?

Page 28: States and Times or Cycles and Frequencies?

Modulation Regularities – Maybe the Long-Term Spectrum?

What is happening here?

Page 29: States and Times or Cycles and Frequencies?

Modulation Regularities – Maybe F0?

And what is the role of F0?

Page 30: States and Times or Cycles and Frequencies?

Rhythm as iteration

The Reality of Rhythm!

Page 31: States and Times or Cycles and Frequencies?

Amplitude Envelope Modulation Spectrum (AEMS, AMS, EMS) Frequency Zones

Amplitude Envelope Modulation
↓
Amplitude Envelope Demodulation: absolute value of Hilbert transform (or rectification & peak-picking / LP filtering)
↓
Spectral slice (FFT)
↓
Spectral Zone Edge Detection

Amplitude Modulation, Demodulation and Envelope Spectrum
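
The first three stages of this pipeline can be sketched with the standard library alone. This is a minimal sketch under stated assumptions: it uses the rectification and low-pass alternative named on the slide rather than the Hilbert transform, a moving average as the low-pass filter, and a naive DFT over the low-frequency bins in place of an FFT library call; the test signal, sample rate and window size are invented, and spectral zone edge detection is not attempted.

```python
import cmath
import math

SAMPLE_RATE = 400  # Hz, kept low so that a naive DFT stays fast (assumed)


def am_test_signal(duration_s, carrier_hz=100.0, rhythm_hz=5.0):
    """Synthetic carrier modulated by a slow 'rhythm' envelope."""
    out = []
    for n in range(int(duration_s * SAMPLE_RATE)):
        t = n / SAMPLE_RATE
        env = 1.0 + 0.8 * math.sin(2 * math.pi * rhythm_hz * t)
        out.append(env * math.sin(2 * math.pi * carrier_hz * t))
    return out


def envelope(signal, window=8):
    """Full-wave rectification followed by a moving-average low-pass."""
    rectified = [abs(s) for s in signal]
    half = window // 2
    smoothed = []
    for i in range(len(rectified)):
        chunk = rectified[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def low_frequency_spectrum(env, max_hz=20.0):
    """Naive DFT magnitudes of the mean-removed envelope, up to max_hz."""
    n = len(env)
    mean = sum(env) / n
    centred = [e - mean for e in env]
    spectrum = {}
    k = 1
    while k * SAMPLE_RATE / n <= max_hz:
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centred))
        spectrum[k * SAMPLE_RATE / n] = abs(coeff) / n
        k += 1
    return spectrum


spectrum = low_frequency_spectrum(envelope(am_test_signal(2.0)))
dominant_hz = max(spectrum, key=spectrum.get)   # strongest modulation frequency
```

For the synthetic input the strongest component of the envelope spectrum sits at the 5 Hz modulation rate, not at the 100 Hz carrier, which is what makes the envelope spectrum a candidate rhythm measure.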

Page 32: States and Times or Cycles and Frequencies?

Amplitude Demodulation and the Amplitude Envelope Spectrum

Amplitude Modulation Spectrum

Page 33: States and Times or Cycles and Frequencies?

AM Spectrum and the AM Difference Spectrum

Amplitude Modulation Difference Spectrum

(pioneered by Wiktor Jassem)

Page 34: States and Times or Cycles and Frequencies?

The Role of F0

If a spectrum can be derived from the AM envelope, why not derive a spectrum from the FM track and see whether they correlate?

Preliminary answer:

Yes, they do correlate to some extent, but not overwhelmingly strongly! This is not very surprising, of course, since they are partly co-extensive locally and globally, though locally not too similar.

I will look at both AM and FM spectra.

Page 35: States and Times or Cycles and Frequencies?

Mandarin, female, 30 s, < 1 Hz

AM and FM Demodulation

Page 36: States and Times or Cycles and Frequencies?

Mandarin, female, 30 s, < 1 Hz

0.2 Hz / 5 s: the rhythm of interpausal units?

AM and FM Demodulation

Page 37: States and Times or Cycles and Frequencies?

AM and FM Demodulation and Spectrum Analysis: Calibration

Page 38: States and Times or Cycles and Frequencies?

English (RP), Edinburgh corpus: “The North Wind and the Sun”

Beijing Mandarin, Yu corpus: “bei3 feng1 gen1 tai4 yang2”

Envelope Demodulation: Extending to Discourse Spectra

Page 39: States and Times or Cycles and Frequencies?

English (RP), Edinburgh corpus: “The North Wind and the Sun”

Beijing Mandarin, Yu corpus: “bei3 feng1 gen1 tai4 yang2”

[Figure labels: Short phrases; Short IPUs; Paratone, IPUs; IPU layers; Phrases, IPUs; 1 Hz]

Envelope Demodulation: Extending to Discourse Spectra

Page 40: States and Times or Cycles and Frequencies?

[Figure: spectral frequency zone boundaries; panels: English newsreading, English story]

Extending to Discourse Spectra: English Genres

Page 41: States and Times or Cycles and Frequencies?

Contrast: elderly male English, young female Mandarin

Extending to discourse spectra: English-Mandarin

Page 42: States and Times or Cycles and Frequencies?

Rhythms in Syntagmatic, Compositional Spectral Trees

Page 43: States and Times or Cycles and Frequencies?

Frequency Domain Trees: Spectral Zone Hierarchies

[Figure panels: L-strong, >; L-strong, <; R-strong, >; R-strong, <. Data: English newsreading]

AM and FM Demodulation and Spectral Tree Induction

Page 44: States and Times or Cycles and Frequencies?

AEMS Frequency Tree

[Figure: L-strong, <. Data: English newsreading]

AM and FM Demodulation and Spectral Tree Induction

Page 45: States and Times or Cycles and Frequencies?

AEMS Frequency Tree

[Figure: L-strong, <. Data: English, North Wind & Sun]

AM and FM Demodulation and Spectral Tree Induction

Page 46: States and Times or Cycles and Frequencies?

AEMS Frequency Tree

[Figure: L-strong, <. Data: Mandarin, North Wind & Sun]

AM and FM Demodulation and Spectral Tree Induction

Page 47: States and Times or Cycles and Frequencies?

Rhythms in Paradigmatic, Classificatory Spectral Trees

Page 48: States and Times or Cycles and Frequencies?

Data: Chunks from “The North Wind and the Sun”
● Male, English: 40 s
● Female, Mandarin: 40 s

Method: Comparison of non-overlapping adjacent 5 s audio chunks
– offsets into recording: 0, 5, 10, 15, 20, 25, 30, 35
– AEMS for each chunk
– Inter-speaker comparison (AEMS pointwise means, r=0.82)
– Classification by hierarchical similarity / distance:
  ● Pearson Distance: 1 – r, range 0 … 2
  ● comparison of 7 classifiers
  ● similar classification methods used in dialectometry, stylometry, language typology

Consistency of AM Spectra: Rhythm Classification
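
The Pearson distance used as the similarity criterion can be sketched directly from its definition, 1 – r. The helper names and the toy 'spectra' below are hypothetical; the hierarchical clustering step that would consume the distance matrix is not shown.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def pearson_distance(x, y):
    """1 - r: 0 for identical shape, 1 for uncorrelated, 2 for inverted."""
    return 1.0 - pearson_r(x, y)


def distance_matrix(spectra):
    """Pairwise Pearson distances between all spectra."""
    return [[pearson_distance(a, b) for b in spectra] for a in spectra]


# Toy 'spectra' (hypothetical values, not data from the talk).
chunk_a = [0.1, 0.5, 0.9, 0.4, 0.2]
chunk_b = [0.2, 0.6, 1.0, 0.5, 0.3]   # same shape, shifted up
chunk_c = [0.9, 0.5, 0.1, 0.6, 0.8]   # inverted shape

matrix = distance_matrix([chunk_a, chunk_b, chunk_c])
```

Because the distance depends only on spectral shape, two chunks that differ in overall level but share a shape have a distance near 0, and an inverted shape approaches the maximum of 2.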

Page 49: States and Times or Cycles and Frequencies?

Similarity criterion: >3 adjacent same-speaker settings

Largest per speaker score: (4+3)/16

Largest cluster: 4/16

Consistency of Spectra: Rhythm Classification

Page 50: States and Times or Cycles and Frequencies?

Highest speaker-specific total: 1 Nrst.Pt. (4+3+3)/10

Largest cluster: 5 UPGMC 5/10

Consistency of Spectra: Rhythm Classification

Page 51: States and Times or Cycles and Frequencies?

Discourse Rhythms: Long FM contours

Thesis: in evolution,
– frequency modulation and rhythm came first
  ● emotional cries
  ● turn-taking came before grammar (Levinson, “Turn-taking in Human Communication – Origins and Implications for Language Processing”, 2015)

Note: in infant speech,
– frequency modulation and rhythm also come first
  ● emotional cries (Wermke, Sebastian-Galles)
  ● turn-taking (cf. the ‘bootstrapping’ literature; the infant ‘twin-talk’ videos on YouTube ☺)

Page 52: States and Times or Cycles and Frequencies?

Answer: falling utterance contour

Question+Answer: rising-falling adjacency pair contour, syntagmatic entrainment

Question: rising utterance contour

Discourse Rhythms: Long FM contours

Page 53: States and Times or Cycles and Frequencies?

Answer: falling utterance contour

L* sequence, global rise, final rise

H* sequence, global fall

H* sequence, global fall

L* sequence, global fall, final rise

Response Continuation

Interview start Question Question

Discourse Rhythms: Long FM contours

Page 54: States and Times or Cycles and Frequencies?

But there are Methodological Problems in F0 / Pitch estimation

1. Terminology:
– articulation rate (production)
– F0 (acoustic transmission)
– pitch (perception)

2. Measurement:
– F0 estimation implementations yield slightly different results
  ● Autocorrelation
  ● Normalised Cross-correlation
  ● Average Magnitude Difference Function (AMDF)
  ● FFT peak detection
  ● Cepstrum
– Environment differences
  ● Preprocessing: low-pass filter; centre-clipping
  ● Postprocessing: moving median
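
Two of the environment steps named above, centre-clipping and the moving median, are simple enough to sketch. This is an illustrative sketch, not the settings of any of the estimators compared in the talk: the clipping ratio, the window size and the toy values are assumptions.

```python
import statistics


def centre_clip(frame, ratio=0.3):
    """Zero all samples below ratio * peak and shift the rest towards zero."""
    threshold = ratio * max(abs(s) for s in frame)
    clipped = []
    for s in frame:
        if s > threshold:
            clipped.append(s - threshold)
        elif s < -threshold:
            clipped.append(s + threshold)
        else:
            clipped.append(0.0)
    return clipped


def moving_median(track, window=3):
    """Median-smooth an F0 track; the window shrinks at the edges."""
    half = window // 2
    return [statistics.median(track[max(0, i - half): i + half + 1])
            for i in range(len(track))]


clipped = centre_clip([0.1, -0.2, 1.0, -0.5])
f0_track = [120.0, 121.0, 242.0, 122.0, 123.0]   # one octave error at index 2
smoothed = moving_median(f0_track)
```

The median removes the isolated octave jump without averaging it into its neighbours, which is why it is a common postprocessing choice. Different choices of ratio and window are one source of the differences between implementations.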

Page 55: States and Times or Cycles and Frequencies?

RAPT (Robust Algorithm for Pitch Tracking)

[Figure: RAPT, C binary (David Talkin)]

Page 56: States and Times or Cycles and Frequencies?

RAPT (Python emulation)

[Figure: RAPT, Python emulation (Daniel Gaspari)]

Page 57: States and Times or Cycles and Frequencies?

Reaper (Robust Epoch And Pitch EstimatoR)

[Figure: Reaper, compiled C binary (David Talkin)]

Page 58: States and Times or Cycles and Frequencies?

Praat

[Figure: Praat, script + binary (Paul Boersma)]

Page 59: States and Times or Cycles and Frequencies?

YIN (as opposed to YANG, Python emulation)

[Figure: YIN, Python emulation (Patrice Guyot)]

Page 60: States and Times or Cycles and Frequencies?

YAAPT (Yet Another Algorithm for Pitch Tracking, Python emulation)

[Figure: YAAPT, Python emulation (Bernardo J. B. Schmitt)]

Page 61: States and Times or Cycles and Frequencies?

SWIPE (Sawtooth Waveform Inspired Pitch Estimator, Python emulation)

[Figure: SWIPE, Python emulation (Disha Garg)]

Page 62: States and Times or Cycles and Frequencies?

F0 – Pitch: Methodological Problems

So let’s do something about it:
● do-it-yourself F0 estimation
● with full control over all parameters.

Page 63: States and Times or Cycles and Frequencies?

F0 – Pitch: A Constructive Do-It-Yourself Strategy

Time domain: AMDF

A kind of auto-correlation, but with subtraction minima, not correlation maxima

Preprocessing:
● centre-clipper
● low-pass filter
Postprocessing:
● moving median

Simple: no
● voice detection
● candidate weighting

(code on GitHub)
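
The AMDF idea can be sketched for a single frame as follows. This is not the code referred to on the slide but an independent minimal sketch: the function name, the F0 search range, the tolerance and the synthetic test frame are assumptions, and the pre- and postprocessing steps are left out. Like the slide's version it has no voice detection and no candidate weighting.

```python
import math


def amdf_f0(frame, sample_rate, f0_min=60.0, f0_max=400.0, tolerance=1e-3):
    """Estimate F0 from the lag with the smallest average magnitude difference."""
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    values = {}
    for lag in range(lag_min, lag_max + 1):
        diffs = [abs(frame[i] - frame[i + lag])
                 for i in range(len(frame) - lag)]
        values[lag] = sum(diffs) / len(diffs)
    floor = min(values.values())
    # Multiples of the period are also minima, so take the shortest lag
    # close to the global minimum to avoid subharmonic (octave) errors.
    best_lag = next(lag for lag in sorted(values)
                    if values[lag] <= floor + tolerance)
    return sample_rate / best_lag


SAMPLE_RATE = 8000
# 50 ms of a synthetic 200 Hz tone with one harmonic (hypothetical test input).
frame = [math.sin(2 * math.pi * 200.0 * n / SAMPLE_RATE)
         + 0.5 * math.sin(2 * math.pi * 400.0 * n / SAMPLE_RATE)
         for n in range(400)]
f0 = amdf_f0(frame, SAMPLE_RATE)
```

The frame must be longer than the longest lag searched; with the defaults above that means more than 133 samples at 8 kHz.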

Frequency domain: FFT + spectrum peak-picking

Finding the lowest-frequency spectral peak in the Fourier transform

Preprocessing:
● centre-clipper
● low-pass filter
Postprocessing:
● moving median

Simple: no
● voice detection
● candidate weighting

(code on GitHub)
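
Picking the lowest-frequency spectral peak can be sketched for one frame in the same spirit. Again this is an independent sketch rather than the code referred to on the slide: a naive DFT over the search range stands in for an FFT library call, and the Hann window, the relative threshold, the search range and the test frame are assumptions.

```python
import cmath
import math


def dft_magnitudes(frame, sample_rate, f_min, f_max):
    """Magnitudes of a Hann-windowed naive DFT between f_min and f_max."""
    n = len(frame)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    first_bin = int(f_min * n / sample_rate)
    last_bin = int(f_max * n / sample_rate)
    mags = {}
    for k in range(first_bin, last_bin + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(windowed))
        mags[k * sample_rate / n] = abs(coeff)
    return mags


def fft_peak_f0(frame, sample_rate, f_min=60.0, f_max=500.0, rel_threshold=0.3):
    """Return the lowest local spectral maximum above a relative threshold."""
    mags = dft_magnitudes(frame, sample_rate, f_min, f_max)
    freqs = sorted(mags)
    limit = rel_threshold * max(mags.values())
    for prev, freq, nxt in zip(freqs, freqs[1:], freqs[2:]):
        if (mags[freq] >= limit and mags[freq] > mags[prev]
                and mags[freq] > mags[nxt]):
            return freq
    return None   # no peak found in the search range


SAMPLE_RATE = 8000
# Synthetic frame: the 400 Hz harmonic is stronger than the 200 Hz fundamental.
frame = [math.sin(2 * math.pi * 200.0 * n / SAMPLE_RATE)
         + 1.5 * math.sin(2 * math.pi * 400.0 * n / SAMPLE_RATE)
         for n in range(400)]
f0 = fft_peak_f0(frame, SAMPLE_RATE)
```

Choosing the lowest qualifying peak rather than the strongest one is what returns the fundamental here even though the harmonic has more energy. The frequency resolution is the sample rate divided by the frame length, 20 Hz in this example.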

Page 64: States and Times or Cycles and Frequencies?

How about AMDF (Average Magnitude Difference Function, Python)

[Figure: AMDF, naive difference minima, Python (Dafydd Gibbon)]

Page 65: States and Times or Cycles and Frequencies?

FFTpeak (Simple F0 Tracker, Python)

[Figure: S0FT, naive FFT peak, Python (Dafydd Gibbon*)]

* inspired by a snippet from ‘Jonathan’, gist.github.com/endolith/255291

Page 66: States and Times or Cycles and Frequencies?

Comparing F0 estimators with RAPT as ‘gold standard’

Benchmark against standard F0 estimators

Encouraging for FFTpeak (problems remain, of course):
● correlation ignores some relevant properties such as overall difference in pitch height
● (slightly positively biased) idea of the relationship
● not enough test data
● not as robust as RAPT
● but suggests that RAPT is fit for purpose

F0 estimator pair     Correlation
RAPT:PyRAPT           0.8902
RAPT:Praat            0.8657
RAPT:FFTpeak          0.9096
RAPT:f0AMDF           0.8605

FFTpeak:RAPT          0.9096
FFTpeak:Praat         0.8409
FFTpeak:PyRAPT        0.8352
FFTpeak:f0AMDF        0.8016

Page 67: States and Times or Cycles and Frequencies?

RAPT (Robust Algorithm for Pitch Tracking)

[Figure: RAPT, C binary (David Talkin)]

Page 68: States and Times or Cycles and Frequencies?

Continuing with FM anyway: Emotive Rhythms

Thesis 1: In the evolutionary time domain: emotive ‘animal’ modulations came before structural modulations

Thesis 2: In the beginning was “Wow!” (Or “Aaah!”)

Thesis 3: Or the wolf whistle (it’s not simply ‘cat-calling’)

Thesis 4: Other primates wowed, aahed and whistled first – we humans continued the custom

Is this why in some societies there is a taboo on whistling?

Page 69: States and Times or Cycles and Frequencies?

Many Uses of Frequency Modulation in Discourse Rhythms

Page 70: States and Times or Cycles and Frequencies?

Alibaba FM

Page 71: States and Times or Cycles and Frequencies?

Cantonese region (Guangzhou) Wu region (Shanghai)

‘Tone 6’ ☺, Tone 4

Emotive Exclamations

Page 72: States and Times or Cycles and Frequencies?

Cantonese region (Shenzhen)

Emotive Exclamations

Page 73: States and Times or Cycles and Frequencies?

FM Rhythms in Teleglossia

Page 74: States and Times or Cycles and Frequencies?

Thank you! 谢谢!