Gareth Walker · gareth-walker.staff.shef.ac.uk/pubs/walker-lboro-2017.pdf


Visual representations of acoustic data in CA research: a survey and suggestions

Gareth Walker

Introduction


http://www.incredibleart.org/lessons/elem/elem2.html

Aims of this talk

• to encourage thought about how visual representations of acoustic data are prepared and used

• to give some suggestions on good practice in constructing visual representations of acoustic data

Visual representations of acoustic data in ROLSI

[Figure. x-axis: Year (2001–2016); y-axis: cumulative proportion (%), 0–30.]

Cumulative proportion of ROLSI papers including phonetics terms that contain visual representations of acoustic data

Visual representations of acoustic data in ROLSI

from Clayman, S. E. & Raymond, C. W. (2015). Modular pivots: A resource for extending turns at talk. ROLSI, 48(4), 388-405.

In this case, the speaker raises her pitch toward the end of the first sentential completion, but then begins to drop to her previous level on the second syllable of “turkey.” This downward trajectory is sustained across the pivot’s onset boundary. As evident in the spectrogram, there is also a merging of the final consonant of “turkey” with the initial consonant of “yihknow” across

(10) [NB.IV.13.R, Page 3]
 1 Emm: °°Ril cute °° But uh (0.7) .t.hhh They left early Lottie 'n
 2      then we decideh we j'z we were goin ho:::me 'n then we
 3 ->   deci:ded it wz so nice'n quiet dow-.hhhh HEY I B'N EAT'N A
 4 ->   LO:TTA TURKEY YIHKNOW I DON'T HAVE ONE: BITTA ITCHI:NGk?
 5      (1.2)
 6 Emm: .t.hhhh YIHKNOW AH HEARD THET T(h)URKEY WZ GOO::D FOR YUH
 7      with this thi:ng?
 8      (0.3)
 9 Lot: Is that ri::ght?
10 Emm: eeYah a girl'n the apartm'n tol'me tha:t. Thet the doctor
11      cured it? An' I'm tellin yuh yin- I've never had s'ch a
12      healing. I have no(h)o pro(h)oblems:.

[Figure: spectrogram and pitch trace of “A LO:TTA TURKEY YIHKNOW I DON’T HAVE ONE:”. Pitch (Hz), 75–500; Frequency (Hz), 0–5000; Time (s), 0–1.875.]


‘As evident in the spectrogram, there is . . . a merging of the final consonant of “turkey” with the initial consonant of “yihknow” across this juncture, with no break in voicing and a single palatal place of articulation.’ (Clayman & Raymond, 2015, p. 400)

Visual representations of acoustic data in ROLSI

[Figure: spectrogram of “KEY YIH”. Frequency (kHz), 0–5; Time (s), 0–0.25.]

i) provides support for the claim

ii) the reader can independently verify that claim

Pitch traces

GREETING: DISPLAYING STANCE THROUGH PROSODY 379

FIGURE 1 F0 trace of Paula’s greeting to Amanda (Excerpt 1, line 2).

FIGURE 2 F0 trace of Paula’s greeting to Derik (Excerpt 2, line 2). To facilitate comparison, the window sizes of Figures 1 and 2 are about the same.


different interactional function. Although these multiples are produced exactly twice and in immediate succession by the same speaker, these multiples contain a pitch change such that the second saying of the token is produced with higher pitch than the first token. Tokens in this category take the shape shown in Figure 2. In terms of its interactional placement, this type of double, that is, ja^ja., is always positioned in interactional environments in which the interactants’ intersubjectivity or common world view is fractured. The basic sequence unfolds as follows: A produces an utterance, and B responds to it. B’s response displays B’s misalignment with the previous turn. It is in response to this misalignment that speaker A produces a turn containing a turn-initial ja^ja. of the prosodic shape described previously. With the production of the ja^ja. turn, speaker A acknowledges speaker B’s utterance while simultaneously realigning the interactants. Put differently, with the ja^ja., its speaker treats the action/content of the previous speaker’s utterance as either unwarranted or self-evident and takes issue with it. All instances of ja^ja. turns or ja^ja.-fronted turns in the collection fall into the following types of misalignment: (a) the prior speaker tells the jaja speaker something over which the jaja speaker has epistemic authority (i.e., B-event statements; Labov & Fanshel, 1977), (b) the prior speaker asks for clarification or comments on something that the jaja speaker already said or implied in the preceding turn(s), or (c) the prior speaker is responding to something that was not the main point of the jaja speaker (sequential misalignment). In all three categories, the “fault” for the misalignment is attributable to the prior speaker (i.e., the non-ja^ja. speaker). Moreover, the data convey the sense that the prior speaker should have known better (i.e., “I already know, and you should have known that I already know, what you just said”). In the

252 GOLATO AND FAGYAL

FIGURE 2 ja^ja. token with pitch peak in the second syllable.

All confirmation sequences were analyzed auditorily and subsequently in PRAAT 5.3.77.[2] The symbol ʔ is used in the transcripts to represent glottalization, while the = symbol indicates vowel linking. At the relevant word boundary, pitch accents are recorded as capitals where they occur, with an indication of the pitch movement. Throughout the transcript syllable lengthening and pausing are also represented (see the appendix for transcription symbols). None of these prosodic parameters accounts for the contrast described; as a result their analysis has been kept to a minimum. Similarly, finer phonetic detail has not been transcribed and is not referenced here, since the primary explanation for the contrast between glottalization and linking in the context of turn extension after initial confirmations is an action-based rather than a phonological one.

Transcript lines are translated into English in a separate line. The translations aim to strike a balance between an appropriate gloss and a sufficiently strong sense of the original lexical choices. We draw attention to the fact that translation of all the nuances of the original is not possible. Ashmore and Reed (2000) note that transcripts of natural data are twice removed from the original event through recording and subsequent notation. Translation adds another layer to this process, and neither the transcripts in the original language nor their translations should therefore be considered “data.”

[Figure: waveform and pitch trace of “ja aber”. Pitch (Hz), 75–600; Time (s), 0–0.5646.]

Figure 2. Glottalized ja aber, Extract 7, line 473.

[2] http://www.fon.hum.uva.nl/praat/

RESEARCH ON LANGUAGE AND SOCIAL INTERACTION 133

176 BARTH-WEINGARTEN

[Figure: pitch trace, labels “jA JA” and “s o”. Pitch (Hz), ticks at 50, 70, 100, 150, 200, 300 and 500; Time (s), 0–0.3093.]

FIGURE 3 Pitch trace of joke-aligning JAJA in excerpt 7, line 2315. (The end of the second JA is not properly taken up by PRAAT as it is overlapped.)

On the (narrow-sense) phonetic side, these JAJAs are regularly accompanied by smile voice, i.e., lip spreading, which results in a raised first formant, an audibly “broader” pronunciation of vowels, and possibly the perception of raised pitch, for instance (see Ford & Fox, 2010), and often they shade off into, or are followed by, laughter (for the relevance of this cluster of features for joke-aligning JAJAs, see Barth-Weingarten, in press). Interactionally, these JAJAs are affiliating with the jocular mode and accomplish alignment in that they are neither sequence closing nor are they followed by any topicalization of a misalignment or the like, but instead their speakers seem to be content with a continuation of the sequence in the same mode as before.

Consider Excerpt 7. It is taken from another edition of the TV talk show “Die Woche” with audience present. The talk show host Gerd Müller-Gerbes (MG) invited, among others, the pop singer Howard Carpendale (HC) and the politician Heiner Geißler (HG). HC has just jokingly complimented HG on the way he presents himself in this show, upon which HG points out that Biermann, another famous German political singer and songwriter, had already said the same thing to him on some earlier occasion.

Excerpt 7 Biermann (Fasch_2304, Fasch_2309, Fasch_2314 (1:00:00-1:00:35)

2299 HG: der BIER]mann hat des AUCH [schon mal zu mir gesagt;=nIcht?=]
         {name: Biermann} said this to me too once you know
2301 HC:                            [((smiles till cut to HG))       ]
2302 MG: =WER hat das-
         =who has
         |_____________|((HG visible with smile))
2303     der BIERma[nn. ]
         {name: Biermann}
         _____________|((HG visible with smile))


Pitch traces

Callhome EN4093: 1207s

1 B: [ has    ] Kim been here before
2 A: [and the-]
3    (0.2)
4 A: → no

Pitch traces

[Figure: pitch trace. Pitch (Hz), 0–450; Time (s), 0–0.4.]

linear; not scaled to range

Linear vs. non-linear scales

we perceive differences in frequency better at lower frequencies

130.8 Hz and 138.6 Hz; 261.6 Hz and 277.2 Hz
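The two frequency pairs on this slide can be compared in Hz and in semitones. The sketch below uses the standard conversion of 12 semitones per octave. That the slide uses the pairs to illustrate equal musical steps at different frequencies is an assumption, and the function name `semitones` is illustrative rather than taken from the talk's scripts.

```python
import math


def semitones(f1_hz: float, f2_hz: float) -> float:
    """Interval from f1_hz to f2_hz in semitones (12 semitones per octave)."""
    return 12 * math.log2(f2_hz / f1_hz)


# The two pairs shown on the slide.
pairs = [(130.8, 138.6), (261.6, 277.2)]

for low, high in pairs:
    diff_hz = high - low
    diff_st = semitones(low, high)
    print(f"{low} Hz to {high} Hz: {diff_hz:.1f} Hz apart, {diff_st:.2f} semitones apart")
```

Both pairs come out at about one semitone, although the second pair is twice as far apart in Hz as the first.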


Linear vs. non-linear scales

[Figure: two panels plotting Pitch (Hz), ticks at 100, 150, 200, 250 and 300, against Time.]

a non-linear scale makes changes at lower frequencies look bigger; higher frequencies are ‘squished’
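A minimal sketch of this effect, assuming the non-linear scale is logarithmic (the slide does not name the scale). It computes how much of the axis height a 50 Hz step takes up at the bottom and at the top of a 100–300 Hz axis. The function names are illustrative.

```python
import math

F_MIN, F_MAX = 100.0, 300.0  # axis limits, taken from the tick labels on the slide


def linear_position(f_hz: float) -> float:
    """Fraction of the axis height at which f_hz sits on a linear Hz axis."""
    return (f_hz - F_MIN) / (F_MAX - F_MIN)


def log_position(f_hz: float) -> float:
    """Fraction of the axis height at which f_hz sits on a logarithmic axis."""
    return math.log(f_hz / F_MIN) / math.log(F_MAX / F_MIN)


def span(position, low_hz: float, high_hz: float) -> float:
    """Share of the axis height covered by the step from low_hz to high_hz."""
    return position(high_hz) - position(low_hz)


for low, high in [(100.0, 150.0), (250.0, 300.0)]:
    lin = span(linear_position, low, high)
    log = span(log_position, low, high)
    print(f"{low:.0f}-{high:.0f} Hz: linear {lin:.2f} of axis, log {log:.2f} of axis")
```

On the linear axis both steps take up the same height. On the logarithmic axis the 100–150 Hz step takes up more than twice the height of the 250–300 Hz step.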

Linear vs. non-linear scales

[Figure: pitch trace. Pitch (Hz), ticks at 50, 150, 300 and 450; Time (s), 0–0.4. Annotation: -5.1 ST.]


Scaling to a speaker’s range

• what is high (or low) for one speaker might not be high (or low) for another

• relative placement in a speaker’s range has interactional relevance (Couper-Kuhlen, 1996; Local, 2005)

Scaling to a speaker’s range

[Figure: Pitch (Hz), 0–700, by speaker. Legend: male, female, median, mean.]

• what is high (or low) for A may not be high (low) for B

• some speakers seem to have huge ranges, others much smaller ones

• it looks like women have wider ranges than men

Scaling to a speaker’s range

[Figure: Range (semitones re. baseline), 0–35, by speaker. Legend: male, female, median, mean.]

• semitones, relative to the speaker’s baseline

• men and women are mixed

• ranges look more similar
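A rough sketch of why ranges look more similar once they are expressed in semitones relative to each speaker's baseline. The speakers and F0 values below are invented for illustration and are not data from the talk. Taking the baseline to be the bottom of the speaker's range is also an assumption, since the slide does not say how the baseline is defined.

```python
import math

# Hypothetical speakers: (sex, lowest F0 in Hz, highest F0 in Hz).
# Invented values for illustration only.
speakers = {
    "A": ("male", 80.0, 240.0),
    "B": ("female", 160.0, 480.0),
    "C": ("male", 100.0, 250.0),
}


def range_hz(low_hz: float, high_hz: float) -> float:
    """Width of the pitch range in Hz."""
    return high_hz - low_hz


def range_semitones(low_hz: float, high_hz: float) -> float:
    """Width of the pitch range in semitones re. baseline (assumed: low_hz)."""
    return 12 * math.log2(high_hz / low_hz)


for name, (sex, low, high) in speakers.items():
    print(
        f"{name} ({sex}): {range_hz(low, high):.0f} Hz, "
        f"{range_semitones(low, high):.1f} semitones re. baseline"
    )
```

In this made-up example, speaker B's range is twice as wide as speaker A's in Hz, but the two ranges are identical in semitones.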

Scaling to a speaker’s range

variation in pitch heights and ranges means it is often worth drawing pitch traces relative to the speaker’s range


Scaling to a speaker’s range

[Figure: pitch trace scaled to the speaker’s range, with axes in Pitch (Hz), ticks at 86, 150, 200 and 250, and Pitch (semitones), ticks at 0, 6, 12 and 18; shown with the same trace on a Pitch (Hz) axis of 0–450. Time (s), 0–0.4.]
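The Hz and semitone tick labels on this slide can be related by converting relative to a baseline. The sketch below assumes that the baseline is 86 Hz, the lowest tick on the Hz axis; the slide does not state this explicitly. The function names are illustrative, not taken from the talk's scripts.

```python
import math

BASELINE_HZ = 86.0  # assumed baseline: the lowest Hz tick on the slide


def hz_to_st(f_hz: float, baseline_hz: float = BASELINE_HZ) -> float:
    """Convert a frequency in Hz to semitones above the baseline."""
    return 12 * math.log2(f_hz / baseline_hz)


def st_to_hz(st: float, baseline_hz: float = BASELINE_HZ) -> float:
    """Convert semitones above the baseline back to a frequency in Hz."""
    return baseline_hz * 2 ** (st / 12)


# Where the semitone ticks fall in Hz.
for st in (0, 6, 12, 18):
    print(f"{st:>2} ST = {st_to_hz(st):.1f} Hz")

# Where the Hz ticks fall in semitones.
for hz in (86.0, 150.0, 200.0, 250.0):
    print(f"{hz:.0f} Hz = {hz_to_st(hz):.1f} ST")
```

Under this assumption the top Hz tick (250 Hz) falls a little above 18 semitones, which is consistent with the two sets of tick labels spanning the same range.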

Summing up

• trying to share some advice on preparing robust visual representations

• trying to prompt more careful thought about how visual representations are prepared and used

• visual representations, along with descriptions and transcriptions, are how readers usually ‘get at’ the data


scripts: tinyurl.com/visreps
slides: tinyurl.com/gwshef