BE Final Semester Project Synopsis Format



    MANIPAL INSTITUTE OF TECHNOLOGY (A Constituent Institute of MANIPAL UNIVERSITY)

    MANIPAL - 576 104, KARNATAKA, INDIA

    Synopsis

    TITLE

    SUBMITTED BY

    STUDENT NAME    REG NO    E-MAIL ADDRESS / PHONE

    Under the Guidance of:

    GUIDE'S NAME
    Designation

    Department

    Name of the Organisation


    Details of the organisation (with postal address):

    Name of Guide with contact details and email address:

    Date of commencement of the project:

    Signature of Guide:


    1. Introduction:

    1.1 Topic

    Language is man's most important means of communication and speech its primary
    medium. A speech signal is a complex combination of a variety of airborne pressure
    waveforms. This complex pattern must be detected by the human auditory system and
    decoded by the brain, which combines audio and visual cues to perceive speech more
    effectively. The project aims to emulate this mechanism in human-machine
    communication systems by exploiting the acoustic and visual properties of human
    speech.

    1.2 Organization

    2. Need for the project:

    Current speech recognition engines employing only acoustic features are not
    100% robust. Visual cues can be used to resolve the ambiguity in the auditory
    modality. Hence a flexible and reliable system for speech perception can be
    designed, which finds a variety of applications in:

    o Dictation systems
    o Voice-based communication in tele-banking, voice mail, database query systems,
      information retrieval systems, etc.
    o System control in automobiles, robotics, airplanes, etc.
    o Security systems for speaker verification

    3. Objective:

    Recognise 10 English words (speaker independent) with at least 90% accuracy in a
    noisy environment.
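
    The synopsis does not say how this figure would be measured; as a minimal sketch
    (not part of the original document), the Python snippet below computes
    speaker-independent word accuracy over a hypothetical noisy test set. The word
    labels and counts are illustrative assumptions only.

        # Illustrative only: measuring word-recognition accuracy against the 90% target.
        def word_accuracy(true_labels, predicted_labels):
            """Fraction of test utterances whose recognised word matches the spoken word."""
            correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
            return correct / len(true_labels)

        if __name__ == "__main__":
            # Hypothetical test run: 46 of 50 noisy utterances recognised correctly -> 92%.
            true = ["one"] * 25 + ["two"] * 25
            pred = ["one"] * 24 + ["two"] * 23 + ["nine"] * 3
            print(f"Accuracy: {word_accuracy(true, pred):.1%}")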

    4. Methodology:

    The project is carried out in the following parts (illustrative sketches follow the
    list):

    Processing of Audio Signals
    o Detection of end points to demarcate word boundaries
    o Analysis of various acoustic features such as pitch and formants, energy and
      time difference of speech signals, etc.
    o Extraction of selected features

    Processing of Video Signals
    o Demarcate frames from the video sequence
    o Identify faces, and then lip regions
    o Extract features from the lip profile

    Recognition of Speech by Synchronizing Audio and Visual Data


    o Synchronize audio and video features for pattern recognition using
      standardized algorithms
    o Train the system to recognize the spoken word under adverse acoustic
      conditions.
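
    The synopsis does not fix an implementation for these steps. As a minimal sketch of
    the audio-processing part, assuming 16 kHz speech, NumPy, and a short-time-energy
    threshold (all assumptions, not details taken from the synopsis), end-point
    detection and a simple log-energy feature could look like this:

        import numpy as np

        def frame_signal(signal, frame_len=400, hop=160):
            """Split speech into overlapping frames (25 ms / 10 ms at an assumed 16 kHz)."""
            n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
            return np.stack([signal[i * hop: i * hop + frame_len] for i in range(n_frames)])

        def detect_endpoints(signal, energy_ratio=0.1, frame_len=400, hop=160):
            """Crude end-point detection: frames whose short-time energy exceeds a fraction
            of the peak frame energy are treated as speech; returns (start, end) samples."""
            energy = np.sum(frame_signal(signal, frame_len, hop).astype(float) ** 2, axis=1)
            active = np.where(energy > energy_ratio * energy.max())[0]
            if active.size == 0:
                return 0, len(signal)
            return active[0] * hop, active[-1] * hop + frame_len

        def log_energy_features(signal):
            """Per-frame log-energy, one simple acoustic feature alongside pitch and formants."""
            frames = frame_signal(signal).astype(float)
            # Column vector: one feature value per frame.
            return np.log(np.sum(frames ** 2, axis=1) + 1e-10).reshape(-1, 1)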
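
    For the video part, a minimal sketch assuming OpenCV (cv2) and its bundled
    Haar-cascade face detector (the synopsis does not name a face-detection method),
    with the lip region approximated as the lower third of each detected face box:

        import cv2

        face_cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

        def lip_regions(video_path):
            """Demarcate frames, find faces, and crop an approximate lip region per frame."""
            capture = cv2.VideoCapture(video_path)
            regions = []
            while True:
                ok, frame = capture.read()
                if not ok:
                    break
                gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
                for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.3, 5):
                    # Assumption: the mouth lies in the lower third of the face bounding box.
                    regions.append(gray[y + 2 * h // 3: y + h, x: x + w])
            capture.release()
            return regions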
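
    For the recognition step, one possible realisation (assuming the hmmlearn package
    and feature streams already synchronised to a common frame rate, neither of which
    is specified here) is to concatenate audio and visual features and train one
    Gaussian HMM per word:

        import numpy as np
        from hmmlearn import hmm

        def train_word_models(training_data, n_states=5):
            """training_data maps each word to a list of (audio_feats, visual_feats) pairs,
            each array having one row per frame (assumed already time-aligned)."""
            models = {}
            for word, examples in training_data.items():
                observations = [np.hstack([a, v]) for a, v in examples]  # early feature fusion
                lengths = [obs.shape[0] for obs in observations]
                model = hmm.GaussianHMM(n_components=n_states, covariance_type="diag", n_iter=50)
                model.fit(np.vstack(observations), lengths)
                models[word] = model
            return models

        def recognise(models, audio_feats, visual_feats):
            """Return the word whose HMM assigns the highest log-likelihood to the fused features."""
            fused = np.hstack([audio_feats, visual_feats])
            return max(models, key=lambda word: models[word].score(fused))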

    5. Project Schedule:

    January 2011
    o Processing of audio signals
    o Feature extraction from the chosen training database
    o Pattern recognition and signature extraction from the features
    o Training the HMM with the training set

    February 2011
    o Processing of video signals
    o Feature extraction from the chosen training database
    o Pattern recognition and signature extraction from the features

    March 2011
    o Synchronize audio and video features for pattern recognition
    o Extension of the training data set to 10 words

    April 2011
    o Upgrading the system for speaker-independent applications
    o Performance analysis by comparing results of the audio-only approach with those
      of the joint audio-visual approach

    May 2011
    o Documentation

    References:

    1. Tsuhan Chen, "Audiovisual Speech Processing, Lip Reading and Lip
       Synchronization", IEEE Signal Processing Magazine, January 2001.

    2. R. Chellappa, C. L. Wilson and S. Sirohey, "Human and Machine Recognition
       of Faces: A Survey", Proceedings of the IEEE, vol. 83, no. 5, May 1995.