Ultrasound speech analysis: State of the art

Alan Wrench

Overview
1. Machines
2. Methods of recording image sequences and syncing with audio
3. Probes
4. Head stabilisation
5. Contour tracking
6. Parameterisation

Choosing an ultrasound: considerations
– Physical: size, weight, portability, fan noise. Small and quiet is good.
– Probe design: low frequency, microconvex, short handled is good.
– Method of extracting images: ideally high quality, high frame rate, fast.
– Method of synchronising audio: ideally fully automated hardware frame sync.
– Cost: low cost, of course.

There is no single perfect system.

Overview: ~50 speech labs now using ultrasound
– Aloka SSD 1000: 1 lab
– Aloka SSD 4000: 1 lab
– Aloka SSD 5000: 1 lab
– Aloka SSD 5500: 1 lab
– Mindray DP2200: 6 labs
– Mindray DP6600: 15 labs
– Mindray DP6900: 1 lab
– Mindray M5: 1 lab
– Echoblaster 128: 3 labs
– GE Logiq e: 3 labs
– GE Logiq alpha 100: 1 lab
– Interson SeeMore: 2 labs and British Columbia health service
– Sonosite 180+: 3 labs (being replaced)
– Sonosite Titan: 3 labs
– Terason T3000: 7 labs
– Toshiba Famio 8 (SSA-530A): 1 lab
– Ultrasonix RP, Tablet, Touch: 7 labs
– Zonare Z.one: 1 lab

Recording ultrasound
Acquiring image data via the video port (NTSC or digital)

Methods used:

– Frame grabber card and AAA software. Audio is captured separately via a soundcard and synced using a box that places a flash on the video and a tone on the audio. Automatic post-processing detects the flash and tone and aligns them. Recording is fast, but setting the sync parameters can be a bit tricky.

– Canopus ADVC 110 video capture card. Provides integrated synchronous audio. Requires video editing software for capture, such as Sony Vegas, Apple Final Cut Pro, iMovie or Avid Xpress DV.

– Record to a DVD recorder, then transfer to PC offline.

– Recording the screen using Snagit or Camtasia is an option for machines running under Windows, such as the Interson SeeMore. Although this does not use the video port, it results in a video file.
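The flash-and-tone alignment described for the frame grabber method can be sketched as follows. This is only an illustration and not the AAA implementation: the function names, the thresholds, and the idea that each video frame has already been reduced to one mean brightness value are all assumptions.

```python
def first_index_above(values, threshold):
    """Return the index of the first value whose magnitude exceeds threshold, or None."""
    for i, v in enumerate(values):
        if abs(v) > threshold:
            return i
    return None


def audio_video_offset(frame_brightness, fps, audio, sample_rate,
                       flash_threshold, tone_threshold):
    """Estimate how many seconds the sync tone lags the sync flash.

    frame_brightness: mean pixel brightness per video frame (hypothetical input)
    audio: mono samples; the sync tone is assumed louder than tone_threshold
    A positive result means the tone appears later in the audio recording
    than the flash appears in the video recording.
    """
    flash_frame = first_index_above(frame_brightness, flash_threshold)
    tone_sample = first_index_above(audio, tone_threshold)
    if flash_frame is None or tone_sample is None:
        raise ValueError("sync flash or tone not found")
    flash_time = flash_frame / fps
    tone_time = tone_sample / sample_rate
    return tone_time - flash_time
```

Because the flash can only be located to the nearest video frame, an estimate of this kind is no more precise than one frame period (1/fps).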

If the data is not compressed, de-interlacing provides 60 frames per second. If it is compressed, de-interlacing may not be possible.
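De-interlacing doubles the frame rate because an interlaced video frame holds two fields captured at different times, one in the even rows and one in the odd rows. A minimal sketch, assuming each frame is a list of pixel rows (the function names are illustrative only):

```python
def split_fields(frame):
    """Split an interlaced frame (a list of pixel rows) into its two fields."""
    even_field = frame[0::2]  # rows 0, 2, 4, ...
    odd_field = frame[1::2]   # rows 1, 3, 5, ...
    return even_field, odd_field


def deinterlace(frames):
    """Turn N interlaced frames into 2N half-height fields in time order.

    Assumes the even-row field is captured first; real capture hardware may
    use the opposite field order, so this must be checked per device.
    """
    fields = []
    for frame in frames:
        even_field, odd_field = split_fields(frame)
        fields.extend([even_field, odd_field])
    return fields
```

At a nominal 30 interlaced frames per second this yields 60 fields per second, each with half the vertical resolution. Compression that treats each frame as a single image can blend adjacent rows, which is one reason the fields may not be recoverable.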

Things to look out for:

(These factors can vary between individual models of ultrasound, even ones from the same manufacturer or if settings are changed.)

– There may be a lag between the ultrasound and audio if the machine takes appreciable time to process the ultrasound signal.

– There may be duplicate frames.

– There may be blurring if frame averaging cannot be switched off.

– Video images may be “torn” when made of parts of different sweeps.

– Careful selection of the ultrasound system can mitigate these problems.

– 25 labs, using the Aloka SSD 1000 and 4000, Mindray DP2200, DP6600 and DP6900, GE Logiq e and Toshiba Famio 8, use video port capture.
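One of the checks above, looking for duplicate frames in a captured sequence, can be sketched as follows. The frame representation (a flat list of pixel values) and the tolerance parameter are assumptions made for illustration.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized flat frames."""
    return sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)


def find_duplicate_frames(frames, tolerance=0.0):
    """Return indices of frames that repeat the frame immediately before them.

    frames: list of flat pixel lists (hypothetical representation)
    tolerance: largest mean difference still counted as a duplicate; a value
    above zero may be needed if capture noise or compression is present.
    """
    duplicates = []
    for i in range(1, len(frames)):
        if mean_abs_diff(frames[i - 1], frames[i]) <= tolerance:
            duplicates.append(i)
    return duplicates
```

A regular pattern of duplicates would suggest that the video output rate is higher than the rate at which the machine produces new ultrasound frames.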

Recording ultrasound
Cineloop: direct access to ultrasound memory

Advantage

– No “torn” images. Frame rates higher than 60 fps are possible.

Disadvantages

– Automatic audio synchronisation is not possible (with exceptions). Audio must be recorded separately, merged in video editing software and synchronised by manual observation of stop releases (or a flash/beep signal, ref. CHAUSA).

– Cine loops have limited size, which limits recording time. Sometimes this is a few seconds, sometimes several minutes (with exceptions).

This approach is used by 10 labs with:
– GE Logiq e
– Zonare Z.one
– Mindray M5
– Sonosite 180+
– Sonosite Titan
– Interson SeeMore

There are 4 systems in use which allow automated synchronisation of cineloop data and audio:

– Aloka SSD 5500 (Haskins): one-off modification, not generally available.
– Ultrasonix: both frame and scanline pulse sequences generated by hardware.
– Terason T3000: hardware sync signal not generally available, so software sync is used. Ultraspeech software polls the system for new frames.
– Echoblaster 128: TTL frame pulses.

By recording these pulse signals on a second audio track alongside the microphone input, automated precise synchronisation is possible.

16 labs use this method, using either Terason/UltraSpeech or Ultrasonix/AAA or Matlab.

With the exception of the Aloka, these systems also provide software programming toolkits so that bespoke speech applications can be written:
– UltraSpeech
– AAA
– Matlab
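The pulse-based synchronisation described above (frame pulses recorded on a second audio track) can be sketched as rising-edge detection on that track. This is an illustration only, not code from UltraSpeech, AAA or any vendor toolkit; the function name and threshold are assumptions.

```python
def frame_times_from_pulses(pulse_channel, sample_rate, threshold):
    """Return the time in seconds of each rising edge on the pulse channel.

    pulse_channel: samples from the audio track carrying the frame pulses
    threshold: level separating pulse "low" from "high" (an assumed value
    that would depend on the recording gain)
    """
    times = []
    previous_high = False
    for index, sample in enumerate(pulse_channel):
        high = sample > threshold
        if high and not previous_high:
            times.append(index / sample_rate)
        previous_high = high
    return times
```

Because the pulse track and the microphone track share one sample clock, each returned time indexes directly into the speech waveform, and the n-th time can be paired with the n-th ultrasound frame.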

Probes
Convex and particularly microconvex (<20 mm radius) probes are generally preferred for midsagittal tongue imaging.

– Probes come in a range of frequencies from 2 to 12 MHz.
– Low frequency = good penetration = tongue image doesn’t disappear for high vowels and consonants.
– Small radius means the transmitting array fits under the chin.
– Large field of view means more of the tongue can be imaged.
– Short handle.

Aloka UST-9121 Multi Frequency Tight Convex Transducer
– Scan angle: 120°
– Radius: 14 mm
– Frequency range: 2.5-6 MHz
– Short handle
– Narrow cylindrical grip, ideal for a clamp.

Probe specifications

Model             Probe         MHz     Radius (mm)  FOV (°)  Handle
T3000             8MC3          3-8                  120      short
T3000             8EC4          4-8     15           140      short
GE Logiq e        8C-RS         4-10    15           133      short
EchoBlaster128    C6.5/20/128Z  5-8     10           156      short
EchoBlaster128    C3.5/20/128Z  2-4     20           104      short
Ultrasonix        C9-5/10       5-9     10           148      short
Ultrasonix        C7-3/50       3-7     50           69       short
Mindray           65EC10EA      5-8     10           120      long
Mindray           65C15EA       5-8     15           90       short
Mindray           35C20EA       2-6     20           83       short
Aloka             UST-9121      2.5-4   15           120      short
Interson SeeMore  99-5901       3.5-5   10           90       short
Zonare            C9-4t         4-9     11           134      short
Sonosite          C11           4-7     11           90       short
Sonosite          C15           2-4     15           101      short

Table legend (highlighting not preserved): best specification; poor FOV.

Probe stabilisation
– Headset: 30+ labs
– Rest forehead against headrest with probe in fixed position: 2 labs
– Fixed head restraint and spring-loaded probe
– Fixed head restraint, fixed probe

Head movement correction
– Palatoglossatron, Peterotron
  https://github.com/jjberry/Autotrace/blob/master/old/
  APIL wiki ??
– HOCUS
  http://www.psych.mcgill.ca/labs/mcl/pdf/HOCUS.pdf
– GIPSA accelerometers and gyrometers

Contour tracking

Edgetrak – Maryland – standalone PC application – snakes
  http://vims.cis.udel.edu/~mli/research.htm

AAA – QMU – integrated PC application – fan-based edge detection – similar performance to Edgetrak within a recording and analysis GUI. Also a snakes-based contour fitting interface.

Tonguetrack – Simon Fraser – Matlab – MRF energy minimisation
  http://tonguetrack.cs.sfu.ca/TongueTrackUserGuide.pdf
  L. Tang and G. Hamarneh. Graph-based tracking of the tongue contour in ultrasound sequences with adaptive temporal regularization. In Mathematical Methods for Biomedical Image Analysis (MMBIA), pages 1–8, 2010.

GetContours – Haskins – Matlab – Edgetrak with a GUI – available on request from Mark Tiede

Ultramat – Gipsa – Matlab – Thomas Hueber

Autotrace – Arizona – python script – Jeff Berry
  https://github.com/jjberry/Autotrace

Noname – Munich – Matlab – in progress – Phil Hoole

UltraPraat – Arizona – in progress

UltraCats – Toronto – manual contour drawing – Tim Bressman

Jacob – Rochester – speckle tracking – software not available
  Jacob, M., H. Lehnert-LeHouillier, S. Bora, S. McAleavey, D. Dialecki, J. McDonough. 2008. "Speckle Tracking for the Recovery of Displacement and Velocity Information from Sequences of Ultrasound Images of the Tongue". Proceedings of the 8th International Seminar on Speech Production, Strasbourg, France, 53-57.

Roussos – UCL/Trier/Queen Mary – active appearance models – software not available
  Roussos, A. Katsamanis, and P. Maragos, “Tongue tracking in ultrasound images with active appearance models,” in Proc. IEEE Int’l Conf. on Image Processing, 2009.

Speckle tracking
University of Rochester Biomedical Engineering
It provides displacement estimates giving “virtual fleshpoints”. Works on clear vowel images.

Parameterisation

Lingua – Quebec – Matlab
  ISSP 2008 http://www.phonetique.uqam.ca/upload/files/anniebrasseur/menard%20et%20al%20issp2008.pdf

Zharkova – QMU – python
  Zharkova, N. (2013). A normative-speaker validation study of two indices developed to quantify tongue dorsum activity from midsagittal tongue shapes. Clinical Linguistics & Phonetics, 27, 484-496.

Hueber – GIPSA – Matlab – EigenTongues – Ultraspeech tools
  www.ultraspeech.com

Also Hoole – Munich – Matlab – principal components analysis; Mielke, NCSU, USA; and Richmond, Edinburgh

NYU – SSANOVA using the gss package in R.

Haskins – shape analysis methods based on polynomial fitting and Procrustes comparison to a resting tongue shape.

AAA – tongue averaging – pointwise t-tests.

Surfaces – displays a sequence of contours as a time-motion display. Contour sequences can be averaged and compared numerically.
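The idea of a pointwise t-test between two sets of tongue contours can be sketched as below. This is not the AAA implementation. It assumes that every contour has been measured as a distance along the same fixed set of lines, so that point k is comparable across contours, and it uses a Welch (unequal variance) t statistic; the function name is illustrative.

```python
from math import sqrt
from statistics import mean, variance


def pointwise_t(contours_a, contours_b):
    """Welch t statistic at each contour point for two sets of contours.

    contours_a, contours_b: lists of contours, each contour a list of radial
    distances measured along the same fixed set of lines (an assumption of
    this sketch). Each set needs at least two contours.
    """
    t_values = []
    for point_a, point_b in zip(zip(*contours_a), zip(*contours_b)):
        standard_error = sqrt(variance(point_a) / len(point_a)
                              + variance(point_b) / len(point_b))
        if standard_error == 0:
            t_values.append(float("nan"))  # no variation at this point
        else:
            t_values.append((mean(point_a) - mean(point_b)) / standard_error)
    return t_values
```

The sketch returns only the statistic at each point. Turning it into a significance decision needs degrees of freedom and some handling of the many simultaneous comparisons, neither of which is addressed here.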

Miscellaneous

Ultrasonix 4D – Haskins

GE Logiq – linear probe – laryngeal – Victoria

EchoTools – a set of tools for analyzing Echo-Doppler tongue images
  https://github.com/jjberry/EchoTools