
Upload: bertha-chase

Post on 19-Jan-2016



Getting Data

● Kinds of data for linguistics
  – Written
  – Spoken
  – Visual (ASL, body language)
● Phonetics
  – Implosives: larynx lowering, rounding, x-ray movies
  – Judgments, reaction times, phonetic measurements, fMRI


Getting Data

● Don't reinvent the wheel; use available corpora
  – Linguistic Data Consortium (Audio)
    ● Switchboard, CallHome, CallFriend (phone conversations)
    ● Santa Barbara Corpus of Spoken English
  – How is /t/ pronounced across words?
    ● “He read it aloud.” [t, ʔ, ɾ]
    ● “The port of San Francisco” [t, ʔ, ɾ]
    ● [ɾ] is used more by older speakers and males
    ● [ʔ] is used more by younger speakers and females
  – Podcasts
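
One way to approach the /t/ question is to code each token for its variant and the speaker's group, then tally proportions per group. A minimal Python sketch, assuming hand-coded tokens: the rows below are invented placeholders (not corpus results), and `variant_rates` is a hypothetical helper name.

```python
from collections import Counter, defaultdict

# Hypothetical hand-coded tokens: (speaker age group, realization of /t/).
# These rows are invented placeholders, not measurements from any corpus.
tokens = [
    ("older", "ɾ"),
    ("older", "ɾ"),
    ("older", "t"),
    ("younger", "ʔ"),
    ("younger", "ʔ"),
    ("younger", "ɾ"),
]


def variant_rates(rows):
    """Return {group: {variant: proportion}} for coded tokens."""
    counts = defaultdict(Counter)
    for group, variant in rows:
        counts[group][variant] += 1
    rates = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        rates[group] = {v: n / total for v, n in counter.items()}
    return rates


rates = variant_rates(tokens)
for group, dist in rates.items():
    print(group, dist)
```

With real data, each row would come from listening to a corpus recording and noting the variant plus speaker metadata; the same tally works for gender or any other grouping.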


What are audio recordings good for?

● Pronunciation
  – Sociolinguistics
  – Learner speech
  – L2 acquisition
  – Phonetic questions


What are transcriptions of audio recordings good for?

● Vocabulary
● Idioms
● Conversation Analysis
  – What do “um, well, uh huh, mmm, yeah” mean?
  – How do you end your turn talking? How do you know the other person is done?
● Regionalisms
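
A first step toward the “um, well, uh huh” question is simply counting how often each marker occurs in a transcript. A minimal Python sketch, assuming plain-text transcript lines; the four-line exchange is invented for illustration, and `count_markers` is a hypothetical helper name.

```python
import re
from collections import Counter

MARKERS = ["um", "well", "uh huh", "mmm", "yeah"]

# Try longer markers first so "uh huh" is matched as one unit.
pattern = re.compile(
    r"\b("
    + "|".join(re.escape(m) for m in sorted(MARKERS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def count_markers(lines):
    """Count whole-word occurrences of each marker, ignoring case."""
    counts = Counter()
    for line in lines:
        for match in pattern.findall(line):
            counts[match.lower()] += 1
    return counts


# Invented two-speaker exchange, for illustration only.
transcript = [
    "A: Well, I was, um, thinking we could leave early.",
    "B: Uh huh.",
    "A: Yeah, um, around noon?",
    "B: Mmm, yeah, that works.",
]

counts = count_markers(transcript)
print(counts)
```

Counts only show where to look: a match on “well” does not distinguish the discourse marker from the adverb, so each hit still has to be read in context.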


What are transcripts of audio recordings good for?

● Available transcripts
  – CNN
  – NPR
  – Movie scripts


Getting your own recordings

● Practical considerations:
  – Audio only or audio and video?
  – Format for digital recorder?
  – How much time do you need?
  – External microphone
  – Making copies
  – Identifying speakers
  – Quiet place: no kids running around, no traffic


Problems with recordings

● You may not get enough instances of what you are looking for

● Technical difficulties

● Very time consuming to go through, find things, and transcribe


Ethical Questions

I. R. B.: Institutional Review Board

YOU MUST GET THIS IF YOU ARE WORKING WITH PEOPLE


IRB

● BYU wants to protect people in studies
● BYU wants to protect students doing studies
● BYU doesn't want its name associated with iffy, marginal, questionable studies
● BYU does not want to get sued


Why IRB?


Ethical Questions

● Can you secretly record people?
● Can you use broadcast information without permission of speakers?
  – Oprah Winfrey and use of [a] vs. [aj]
  – The Queen's English
  – “Say yes to the dress”
● Can you use data from corpora like LDC without permission of speakers?


Ethical Questions

● Observation studies (non-linguistic)
  – Reporter joins Jerry Falwell's group, goes on mission trip, writes book about experiences
  – Woman goes back to school at NAU, observes coed roommates, writes book about them and her experiences (Cathy Small)


Observations

● Recordings are observations
  – "The collection of data without manipulating it."
  – "Simply observe ongoing activities, without making any attempt to control or determine them."
● People know they are being recorded


Observer's Paradox

● We want speech in natural, casual, unfiltered format.


● When people know they are being observed, they shift their speech toward more careful, standard usage.


Other kinds of observation

● L1 acquisition
  – Diaries
  – Wired house
● Labov's department store
● Labov-French or English in Montreal
  – Age, gender, place, topic of conversation
● Give kids puppets to play with
  – Girls use cooperative language, boys use aggressive language


● Bring your friend to experiment
  – Experimenter called out, camera keeps filming
  – What do men and women talk about?
● Observe slips of the tongue


Problems with corpora, recordings, observations

● You may find few cases of what you are studying
  – “Might could”
  – “She really nice lady”
● What is the age, origin, ethnicity, and educational level of the speaker?


Case Studies

● Study one person or a small group over a long period of time.
  – L1 diaries
  – Aphasia patients, speech therapy
  – Investigator observes class over school year
    ● Teachers interact less with minority students
  – Feral children (Genie)


● Problems
  – Generalizability
  – Subject retention
  – Researcher loses interest