breaking voice based captchas ppt

Breaking Voice Based Captcha

Group Members: Harshal R Joshi, Kushal Kamra, Mayur Gahiwad

Internal Guide: Prof. M. B. Jhade


Index

1. What is Captcha

2. Types of Captcha

3. Voice based Captcha

4. Comparison between existing Captcha and new tool

5. Architectural diagram

6. Voice Activity Detection

7. VAD algorithm

8. Speech to Text conversion

9. Overall digit recognition system

10. Dynamic Time Warping

11. Application and Future enhancement


What is Captcha?

• Completely Automated Public Turing Test to tell Computers and Humans Apart

• Ensures that the response is not generated by a computer

• Tests that humans can pass but computers cannot



Types of …

• Text based…

• Image based…

• Voice Based…



Voice based Captcha

• Advancement over text and image captchas – Invented in 2006

• Text and image captchas have already been broken

• Provides a new dimension to the concept of Captcha

Aim: To design a testing tool to try and break the voice based Captcha. Pre-recorded voice samples will be used for the purpose.



Comparison between existing Captcha and New Breaking Tool

Currently, text based and image based Captchas are available.

In a text based Captcha, the user has to identify the distorted text and retype it. In an image based Captcha, the user has to identify the common object in a collection of pictures. In our project, we are building a breaking tool that recognizes the voice and performs speech to text conversion.


Architecture Diagram

[Flowchart; blocks as labeled in the diagram: I/P Voice, Fast Fourier Transform, Voice Activity Detection, Calculate SNR, decision "SNR < 5 dB" (Yes / No), Spectral Subtraction, MMSE, Clean Speech, Speech To Text Converter, Converted Text]
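The Calculate SNR block and its 5 dB decision could be sketched as below. This is only an illustration under assumptions: the slides do not say how the SNR is estimated, so the sketch assumes a noise-only stretch of the recording is available and subtracts its power from the noisy signal's power. The names `mean_power`, `snr_db` and `is_low_snr` are hypothetical, and the sketch does not decide which noise reduction block (Spectral Subtraction or MMSE) follows each branch.

```python
import math

def mean_power(samples):
    """Average power of a list of samples."""
    return sum(x * x for x in samples) / len(samples)

def snr_db(noisy_speech, noise_only):
    """Estimate the SNR in dB (assumed method, not taken from the slides).

    Signal power is approximated as noisy power minus noise power.
    Both powers are floored so the logarithm is always defined.
    """
    noise_power = max(mean_power(noise_only), 1e-12)
    signal_power = max(mean_power(noisy_speech) - noise_power, 1e-12)
    return 10.0 * math.log10(signal_power / noise_power)

def is_low_snr(snr, threshold_db=5.0):
    """True when the SNR falls below the 5 dB threshold from the diagram."""
    return snr < threshold_db
```

A clip whose estimated SNR is below 5 dB would take one branch of the diagram, and any other clip would take the other branch.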


Voice Activity Detection (VAD)

• The process of separating conversational speech from silence, music, noise or other non-speech signals.

• Primary function: indicate the presence of speech to facilitate speech processing, and possibly provide delimiters for the beginning and end of a speech segment.



VAD Algorithm…

1. Spectral distance voice activity detector is used.

2. Spectral distance threshold is decided.

3. If the spectral distance of the segment is less than the threshold, the noise flag is set to 1, indicating a noise segment.

4. A noise counter is maintained to keep track of the immediately preceding noise frames.

5. If this counter is greater than some threshold (hangover), the entire segment is treated as a silence segment; otherwise it is treated as a speech segment.
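The steps above can be sketched roughly as follows. Several details are assumptions, since the slides do not specify them: the spectral distance is taken as the mean positive difference, in dB, between a frame's magnitude spectrum and a noise estimate; the counter is reset whenever a non-noise frame arrives; and the threshold values and the names `spectral_distance` and `vad` are hypothetical.

```python
import math

def spectral_distance(frame_mag, noise_mag, eps=1e-12):
    """Mean positive log-spectral difference in dB (assumed definition)."""
    total = 0.0
    for s, n in zip(frame_mag, noise_mag):
        diff = 20.0 * (math.log10(s + eps) - math.log10(n + eps))
        total += max(diff, 0.0)  # ignore bins that are quieter than the noise
    return total / len(frame_mag)

def vad(frames, noise_mag, dist_threshold=3.0, hangover=2):
    """Return one (noise_flag, speech_flag) pair per frame.

    frames: list of magnitude spectra, one per frame.
    noise_mag: magnitude spectrum of the noise estimate.
    """
    noise_counter = 0
    decisions = []
    for frame_mag in frames:
        if spectral_distance(frame_mag, noise_mag) < dist_threshold:
            noise_flag = 1
            noise_counter += 1   # one more consecutive noise frame
        else:
            noise_flag = 0
            noise_counter = 0    # assumption: a non-noise frame resets the run
        # Silence is declared only after more than `hangover` noise frames in a row.
        speech_flag = 0 if noise_counter > hangover else 1
        decisions.append((noise_flag, speech_flag))
    return decisions
```

The hangover keeps short pauses inside a spoken digit from being cut out as silence.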



Speech to Text Conversion

• The sound is sampled, or digitized, by taking precise measurements of the wave at frequent intervals.

• The system filters the digitized sound to remove unwanted noise, and separates it into different bands of frequency.

• Next the signal is divided into small segments as short as a few hundredths of a second.

• The program then matches these segments to known phonemes in the appropriate language.

• It runs the contextual phoneme plot through a complex statistical model and determines what the user was probably saying and outputs it as text.
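The segmentation step above (dividing the digitized signal into segments a few hundredths of a second long) might look like the sketch below. The 20 ms frame length, the 10 ms hop between frame starts and the name `split_into_frames` are illustrative assumptions, not values from the slides.

```python
def split_into_frames(samples, sample_rate, frame_ms=20.0, hop_ms=10.0):
    """Cut a sampled signal into short, possibly overlapping segments.

    samples: list of digitized amplitude values.
    sample_rate: samples per second.
    frame_ms / hop_ms: frame length and step between frame starts (assumed values).
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    hop_len = max(1, int(sample_rate * hop_ms / 1000))
    frames = []
    # Stop at the last position where a full frame still fits.
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frames.append(samples[start:start + frame_len])
    return frames
```

At 8000 samples per second, a 20 ms frame holds 160 samples. Each frame would then go on to the phoneme matching stage.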



Overall Digit Recognition System



Dynamic Time Warping (DTW)

• DTW is an algorithm for measuring similarity between two sequences which may vary in time or speed.

• It allows a computer to find an optimal match between two given sequences (e.g. time series) with certain restrictions.

• The sequences are "warped" non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension.
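A minimal sketch of DTW on one-dimensional sequences is given below, together with a nearest-template lookup of the kind a digit recognizer could use. The absolute-difference local cost, the unconstrained warping path and the names `dtw_distance` and `recognize` are assumptions. A real system would compare sequences of feature vectors, not single numbers.

```python
def dtw_distance(a, b):
    """Classic dynamic-programming DTW cost between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]

def recognize(sample, templates):
    """Return the label of the stored template closest to the sample."""
    return min(templates, key=lambda label: dtw_distance(sample, templates[label]))
```

Because the path may repeat elements, the same pattern spoken slower or faster still matches its template at low cost.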


Applications

• To test the vulnerability of websites in order to build more robust Captchas.

• Noise reduction: to reduce noise in wireless communication.

• Speech to text conversion: security, voice calculator, and helping disabled persons.

Future Enhancement

• A word recognition system using a Markov model.


References:

• Digital Speech Processing – L.R. Rabiner, R.W. Schafer

• What is Fast Fourier Transform – William T. Cochran, James W. Cooley

• Single channel Noise Reduction algorithm for Hands free Operation in Distorted Environments – Stefan Schmitt, Malte Sandrock

• Spectral Subtraction Basics – Steven F. Boll

• MMSE – Ephraim, Malah

• IEEE Papers



THANK YOU!!!

