Page 1: Automatic Transcript Generation

Automatic Transcript Generation

Helmer Strik

A2RT

Dept. of Language & Speech

University of Nijmegen

Page 2: Automatic Transcript Generation

Problem & Solution

Problem:
– We have: audio from radio & TV
– We need: transcripts

Solution:
ASR: Automatic Speech Recognition

Page 3: Automatic Transcript Generation

History of ASR

It all started more than 100 years ago

Page 4: Automatic Transcript Generation

History of ASR

1870 - Alexander Graham Bell: make speech visible, for the hearing impaired

1952 - AT&T Bell Laboratories: 1st ASR - ten English digits

2001 - ASR is ‘everywhere’:
– PC: dictation + ‘Command & Control’
– mobile phones (hands free)
– call-centers
– tap phone calls

Page 5: Automatic Transcript Generation

First: A/D-conversion

Before ASR, the speech signal must be digitized:

Speech (analogue & continuous) → Mic. + sound card (A/D-conversion) → WAV file (digital & discrete)
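The A/D step can be illustrated with a minimal sketch using only the Python standard library. A sine function stands in for the continuous speech signal; it is sampled at discrete times, quantized to integers, and stored in WAV format. The 16 kHz sample rate, 16-bit depth, 440 Hz tone, and all function names are assumptions made for this example, not values from the slides.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # samples per second (assumed value)
BIT_DEPTH = 16       # bits per sample (assumed value)


def analogue_signal(t):
    """Stand-in for a continuous signal: a 440 Hz tone in [-1, 1]."""
    return math.sin(2 * math.pi * 440.0 * t)


def digitize(duration_s):
    """Sample at discrete times and quantize to signed 16-bit integers."""
    max_amp = 2 ** (BIT_DEPTH - 1) - 1
    n_samples = round(duration_s * SAMPLE_RATE)
    return [round(analogue_signal(n / SAMPLE_RATE) * max_amp)
            for n in range(n_samples)]


def write_wav(samples, target):
    """Write mono PCM samples to a path or a seekable file-like object."""
    with wave.open(target, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(BIT_DEPTH // 8)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack(f"<{len(samples)}h", *samples))


samples = digitize(0.01)       # 10 ms of signal
buffer = io.BytesIO()          # in-memory stand-in for a .wav file
write_wav(samples, buffer)
```

Passing a file name such as "speech.wav" instead of the in-memory buffer would write the same data to disk.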

Page 6: Automatic Transcript Generation

What is ASR?

Answer: conversion from speech to text

X: unknown speech signal → ASR → W: a string of words

Page 7: Automatic Transcript Generation

How: probabilistic approach

Find the W that maximizes P(W|X)

P(W|X) = P(X|W) * P(W) / P(X)

• P(W) - language model
• P(X|W) - acoustic model
  – Whole word models
  – Phoneme models + Lexicon
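As a small illustration of this decision rule: P(X) is the same for every candidate W, so maximizing P(W|X) is equivalent to maximizing P(X|W) * P(W). The sketch below picks the best of two candidate word strings. The candidates and all probability values are invented for the example, and the function name is hypothetical.

```python
import math

# Hypothetical scores for one utterance X; all numbers are invented.
ACOUSTIC = {"recognize speech": 0.4, "wreck a nice beach": 0.6}      # P(X|W)
LANGUAGE = {"recognize speech": 0.01, "wreck a nice beach": 0.0001}  # P(W)


def best_transcript(acoustic, language):
    """Return the W that maximizes P(X|W) * P(W).

    P(X) is constant across candidates, so it is left out.
    Log probabilities are summed to avoid underflow on long utterances.
    """
    def log_score(w):
        return math.log(acoustic[w]) + math.log(language[w])

    return max(acoustic, key=log_score)


best = best_transcript(ACOUSTIC, LANGUAGE)
print(best)
```

In this toy case the acoustic model slightly prefers the second candidate, but the language model makes the first one far more probable overall.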

Page 8: Automatic Transcript Generation

ASR

ASR =

• Phoneme models (HMMs) + Lexicon: P(X|W)

• Language model: P(W)
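One way to picture how the lexicon connects words to phoneme models is sketched below: each word is looked up in the lexicon, and its model is the concatenation of the states of its phoneme models. The lexicon entries, the phoneme symbols, and the choice of three states per phoneme are assumptions for illustration; the slides do not specify them.

```python
# Hypothetical pronunciation lexicon: word -> phoneme sequence.
LEXICON = {
    "we": ["w", "iy"],
    "need": ["n", "iy", "d"],
}

STATES_PER_PHONEME = 3  # assumed HMM topology, a common but not universal choice


def word_to_states(word):
    """Concatenate the HMM states of every phoneme in the word."""
    return [f"{phoneme}_{i}"
            for phoneme in LEXICON[word]
            for i in range(STATES_PER_PHONEME)]


def sentence_to_states(sentence):
    """Expand a word string W into one long sequence of phoneme states."""
    states = []
    for word in sentence.split():
        states.extend(word_to_states(word))
    return states


print(word_to_states("we"))
```

Because words share phoneme models, a new word only needs a lexicon entry, not new acoustic training data of its own.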

Page 9: Automatic Transcript Generation

Training

HMMs & LMs are trained:

Speech + manual transcripts (lexicon) → Training procedure → ASR:

• HMMs (Hidden Markov Models)

• Language Models
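For the language-model half of training, a minimal sketch is to count word pairs in the manual transcripts and turn the counts into conditional probabilities. A bigram model with maximum-likelihood estimates is assumed here; the slides do not say which kind of language model is used, and the two transcript lines are invented.

```python
from collections import Counter


def train_bigram_lm(transcripts):
    """Estimate P(word | previous word) by relative frequency."""
    history_counts = Counter()
    bigram_counts = Counter()
    for line in transcripts:
        # <s> and </s> mark the start and end of each utterance.
        words = ["<s>"] + line.lower().split() + ["</s>"]
        history_counts.update(words[:-1])
        bigram_counts.update(zip(words[:-1], words[1:]))
    return {pair: count / history_counts[pair[0]]
            for pair, count in bigram_counts.items()}


# Hypothetical manual transcripts.
model = train_bigram_lm(["we have audio", "we need transcripts"])
print(model[("we", "have")])
```

A real system would also smooth these estimates so that unseen word pairs do not get probability zero.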

Page 10: Automatic Transcript Generation

Decoding

Automatic Transcript Generation:

X: unknown speech signal → ASR → W: the automatic transcripts
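Decoding with HMMs is commonly done with the Viterbi algorithm, which finds the most probable state sequence for an observed signal. The slides do not name the search method, so the sketch below is an assumption, shown on a toy two-state model with invented probabilities rather than a real speech recognizer.

```python
import math


def viterbi(observations, states, start, trans, emit):
    """Return the most probable state path, computed in log space."""
    best = [{s: math.log(start[s]) + math.log(emit[s][observations[0]])
             for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Best predecessor of state s at this time step.
            prev = max(states,
                       key=lambda p: best[-1][p] + math.log(trans[p][s]))
            scores[s] = (best[-1][prev] + math.log(trans[prev][s])
                         + math.log(emit[s][obs]))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy model: silence vs. speech, observed through low/high energy frames.
STATES = ["sil", "speech"]
START = {"sil": 0.8, "speech": 0.2}
TRANS = {"sil": {"sil": 0.7, "speech": 0.3},
         "speech": {"sil": 0.2, "speech": 0.8}}
EMIT = {"sil": {"low": 0.9, "high": 0.1},
        "speech": {"low": 0.2, "high": 0.8}}

path = viterbi(["low", "high", "high", "low"], STATES, START, TRANS, EMIT)
print(path)  # ['sil', 'speech', 'speech', 'sil']
```

In a full recognizer the states would be phoneme-model states, and the search would also apply the lexicon and language model to recover the word string W.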

Page 11: Automatic Transcript Generation

C-3PO - 6 million languages

Page 12: Automatic Transcript Generation

MUMIS