

Fractal Speech Processing

Although widely employed in image processing, the use of fractal techniques and the fractal dimension for speech characterization and recognition is a relatively new concept, which is now receiving serious attention. This book represents the fruits of research carried out to develop novel fractal-based techniques for speech and audio signal processing. Much of this work is finding its way into practical commercial applications.

The book starts with an introduction to speech processing and fractal geometry, setting the scene for the heart of the book where fractal techniques are described in detail with numerous applications and examples, and concludes with a chapter summing up the potential and advantages of these new techniques over conventional processing methods. It will provide a valuable resource for researchers, academics and practising engineers working in the field of audio signal processing and communications.

Professor Marwan Al-Akaidi is Head of the School of Engineering and Technology at De Montfort University, UK. He is a Senior Member of the Institute of Electrical and Electronic Engineers and Fellow of the Institute of Electrical Engineering. He is Chair of the IEEE UKRI Signal Processing Society and has presided over many national and international conferences in the field.

www.cambridge.org © Cambridge University Press

Cambridge University Press, 0521814588 - Fractal Speech Processing - Marwan Al-Akaidi, Frontmatter


Fractal Speech Processing

Marwan Al-Akaidi


Published by the Press Syndicate of the University of Cambridge
The Pitt Building, Trumpington Street, Cambridge, United Kingdom

Cambridge University Press
The Edinburgh Building, Cambridge CB2 2RU, UK
40 West 20th Street, New York, NY 10011–4211, USA
477 Williamstown Road, Port Melbourne, VIC 3207, Australia
Ruiz de Alarcón 13, 28014 Madrid, Spain
Dock House, The Waterfront, Cape Town 8001, South Africa

http://www.cambridge.org

© M. Al-Akaidi 2004

This book is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2004

Printed in the United Kingdom at the University Press, Cambridge

Typefaces: Times 10.5/14 pt. and Helvetica. System: LaTeX 2ε [tb]

A catalogue record for this book is available from the British Library

Library of Congress Cataloguing in Publication data

Al-Akaidi, Marwan, 1959–
Fractal speech processing / Marwan Al-Akaidi.
p. cm.
Includes bibliographical references and index.
ISBN 0-521-81458-8
1. Speech processing systems. 2. Fractals–Data processing. I. Title.
TK7882.S65A43 2004
006.4′54 – dc22 2003055750

ISBN 0 521 81458 8 hardback


Contents

List of acronyms and abbreviations

1 Introduction to speech processing

1.1 Introduction
1.2 The human vocal system
1.3 Acoustic phonetics
1.4 The fundamental speech model
1.5 Speech coding techniques

References

2 Computational background

2.1 Digital filters
2.2 The fast Fourier transform
2.3 Data windowing
2.4 Random number generation and noise
2.5 Computing with the FFT
2.6 Digital filtering in the frequency domain
2.7 Digital filtering in the time domain

Reference

3 Statistical analysis, entropy and Bayesian estimation

3.1 The probability of an event
3.2 Bayes’ rule
3.3 Maximum-likelihood estimation
3.4 The maximum-likelihood method


3.5 The maximum a posteriori method
3.6 The maximum-entropy method
3.7 Spectral extrapolation
3.8 Formulation of the problem
3.9 Reformulation of the problem
3.10 Ill-posed problems
3.11 The linear least squares method
3.12 The Gerchberg–Papoulis method
3.13 Application of the maximum-entropy criterion

References

4 Introduction to fractal geometry

4.1 History of fractal geometry
4.2 Fractal-dimension segmentation
4.3 Non-stationary fractal signals

References

5 Application to speech processing and synthesis

5.1 Segmentation of speech signals based on fractal dimension
5.2 Isolated word recognition

References

6 Speech processing with fractals

6.1 Introduction
6.2 Sampling strategies for speech
6.3 Numerical algorithms and examples
6.4 Template-matching techniques
6.5 Robustness of recognition system to natural noise
6.6 Application of the correlation dimension

References

7 Speech synthesis using fractals

7.1 Computing fractal noise
7.2 Non-stationary algorithms for speech synthesis

References


8 Cryptology and chaos

8.1 Introduction
8.2 Cryptology
8.3 Unpredictability and randomness
8.4 Chaotic systems
8.5 Chaotic stream ciphers
8.6 Chaotic block ciphers

References

Index


Acronyms and abbreviations

AbS analysis by synthesis
ADM adaptive delta modulation
ADPCM adaptive differential pulse code modulation
AUSSAT Australian Satellite
BA binary arithmetic
BCM box-counting method
BPP bounded-away error probabilistic polynomial-time computations
CA cellular automata
CCITT International Consultative Committee for Telephone and Telegraph
CDF cumulative distribution function
CELP code-excited linear prediction
CM correlation method
CPU central processing unit
DAC digital-to-analogue converter
DCC discrete chaotic cryptology
DCT discrete cosine transform
DFT discrete Fourier transform
DM delta modulation
DP dynamic programming
DPCM differential pulse code modulation
DSP digital signal processing
DTMF dual-tone multifrequency
DTW dynamic time warping
EEG electroencephalogram
FDS fractal-dimension segmentation
FFT fast Fourier transform
FIR finite impulse response
FPA floating-point arithmetic
GSM Global System for Mobile Communication
HMM hidden Markov model
IFFT inverse fast Fourier transform


IMBE improved multiband excitation
INMARSATM International Maritime Satellite
IFS iterated-function system
KLT Karhunen–Loeve transform
KS Kolmogorov–Sinai (entropy)
LCG linear congruential generator
LFSR linear feedback shift register
LP linear approximation probability
LPC linear predictive coding
MAP maximum a posteriori
MBE multiband excitation
ML maximum likelihood
MMEE minimum mean square error estimator
MPC multipulse coding
NFSR non-linear feedback shift register
PC polynomial-time computations
P-box permutation box
PCM pulse code modulation
PCNG pseudo-chaotic number generator
PDF probability density function
PNP non-deterministic polynomial-time computations
PRNG pseudo-random-number generator
PSDF power spectral density function
PSM power spectrum method
PSTN Public Switched Telephone Network
RELP residual excited linear prediction
RPE regular-pulse-excitation coder
RSF random scaling fractal
S-box substitution box
SFD stochastic fractal differential
SME spectral magnitude estimation
SNR signal-to-noise ratio
STC sinusoidal transform coder
STFT short-time Fourier transform
VQ vector quantization
WDM walking-divider method
XOR ‘exclusive or’ gate
