

Winter School on Speech and Audio Processing
Deep Learning for Multi-Lingual Speech Processing

17 - 20 January, 2014

Organized by
Centre of Excellence in Signal Processing (CESP)
International Institute of Information Technology (IIIT), Hyderabad 500 032, India

About WiSSAP - 2014

WiSSAP - Winter School on Speech and Audio Processing provides a forum for students, researchers and professionals to enhance their background and get exposed to intricate research areas in the field of speech and audio signal processing. WiSSAP 2014 is the ninth in the series, following the successful earlier winter schools, WiSSAP 2006 to WiSSAP 2013, which focused on different topics in speech and audio processing.

The exponential growth of audio and speech data, coupled with the increase in computing power, has led to the increasing popularity of deep learning using neural networks. The use of a generative, layer-by-layer pre-training method for initializing the weights before running the discriminative back-propagation learning has triggered the rise of deep neural networks and their successful application in speech recognition and synthesis. WiSSAP 2014 aims to cover the theory and applications of Deep Learning for Multi-Lingual Speech Processing.

Technical Content (Tentative)

• Algorithms for speech signal processing
• Basics of neural networks - Convolutional and Recursive
• Deep linguistic structures in speech
• Learning in neural networks
• Algorithms to train deep networks
• Unsupervised vs Supervised training
• Optimization methods for training
• Building effective acoustic models
• Improvements in phone/subword recognition and large vocabulary recognition
• Preprocessing speech for deep neural networks
• Multilingual data processing
• Dealing with under-resourced languages and languages with no written forms

Invited Speakers

Li Deng is currently a Principal Researcher at Microsoft Research, Redmond, USA. His research interests include deep learning and machine intelligence, deep/recurrent/dynamic neural networks, neural information processing, and speech/language processing.

Tanja Schultz is currently a Professor and the director of the Cognitive Systems Lab at Karlsruhe Institute of Technology, Germany. Her research activities centre on human-computer interaction and applications based on bio-signals and speech, and on the rapid adaptation of speech and language processing systems to new domains and languages.

Tara Sainath is currently a member of the Speech and Language Algorithms group at IBM T.J. Watson Research Center, New York, USA. Her research interests are in acoustic modeling, including sparse representations, deep belief networks and adaptation methods.

C. Chandra Sekhar is currently a Professor at the Indian Institute of Technology Madras, Chennai, India. His research interests are in acoustic modeling, kernel methods and content-based information retrieval.

B. Yegnanarayana is currently a Professor at the International Institute of Information Technology, Hyderabad, India. His research interests are in signal processing, speech, image processing, and neural networks.

Peri Bhaskararao is currently a Professor at the International Institute of Information Technology, Hyderabad, India. His areas of interest include general linguistic phonetics, the fundamentals of phonetics of Indian languages, and phonetics of Indian languages applied to speech technology.

S.R.M. Prasanna is currently a Professor at the Indian Institute of Technology Guwahati, India. His areas of interest are speech signal processing, speech enhancement, speaker recognition, speech recognition, speech synthesis and handwriting recognition.

S. Chandra Sekhar is currently an Assistant Professor at the Indian Institute of Science, Bangalore, India. His research interests include speech/audio/bio-acoustic signal processing, sampling theories, sparse signal processing and biomedical imaging/image processing.

Program Schedule

Time | 17th Jan | 18th Jan | 19th Jan | 20th Jan
8:30 - 10:00 | S.R.M. Prasanna / S. Chandra Sekhar | L. Deng - 1 | T. Sainath - 2 | T. Schultz - 3
10:30 - 12:00 | C. Chandra Sekhar - 1 | T. Schultz - 1 | T. Schultz - 2 | L. Deng - 3
13:30 - 15:00 | C. Chandra Sekhar - 2 | T. Sainath - 1 | L. Deng - 2 | Closing Comments
15:30 - 17:00 | P. Bhaskararao | Research Demos/Posters | T. Sainath - 3 | -
17:30 - 19:00 | B. Yegnanarayana | Meeting Slot - I | Meeting Slot - II | -

Programme Committee

C. Chandra Sekhar, IIT Madras
Hema A. Murthy, IIT Madras
Kishore S. Prahallad, IIIT Hyderabad
S.R.M. Prasanna, IIT Guwahati
Preeti Rao, IIT Bombay
V. Ramasubramanian, PESIT, Bangalore
Rohit Sinha, IIT Guwahati
Rajesh M. Hegde, IIT Kanpur
K. Samudravijaya, TIFR Mumbai
T. V. Sreenivas, IISc Bangalore
K. Sreenivasa Rao, IIT Kharagpur
S. Umesh, IIT Madras

Registration Details

Category | ISCA/IEEE Members∗ | Others
Students | 2700 | 3000
Academic Faculty | 5500 | 6000
Industry Participants | 9500 | 10000

∗ - Must indicate membership number and validity.

• For registration form and modes of payment visit http://wissap.iiit.ac.in
• Last date of registration is 17 December, 2013.

Accommodation Details: Registration fees do not include accommodation charges. A limited number of rooms are available for non-student participants at the Guest House on campus. Students will be accommodated in the institute hostels. More details are available online.

Contact Address

• Kishore S. Prahallad (Convener)
• Suryakanth V. Gangashetty (Convener)

Speech and Vision Laboratory
Gachibowli, Hyderabad 500 032, India

Tel: +91-40-6653-1422, Fax: +91-40-6653-1413
Email: [email protected]
Url: http://wissap.iiit.ac.in