report format for final yr

Upload: muhammad-majeeb

Post on 07-Apr-2018


  • 8/4/2019 Report Format for Final Yr

    1/23

    Study and Implementation of ITU-T G.723.1

    Group Members

    1. M.Sajjad.Khan 07-HITEC-EE-01

    2. Kamran khan 07-HITEC-EE-02

    3. Qaisar Khan 07-HITEC-EE-03

    Project Advisor

    (Dr. Jameel Ahmed)

    Professor

    Head, Department of Electrical

    Department of Electronic & Computer Engineering

    NFC Institute of Engineering & Technological Training, Multan-Pakistan

    July 2006


    Department of Electronic & Computer Engineering

    NFC Institute of Engineering & Technological Training, Multan-Pakistan

    The project ___________________________________________, presented by:

    1. M. Sajjad Khan 2K1-Electro-01

    2. Kamran Khan 2K1-Electro-02

    3. Qaisar Khan 2K1-Electro-03

    under the supervision of their project advisor and approved by the project

    examination committee, has been accepted by the NFC Institute of Engineering &

    Technological Training, in partial fulfillment of the requirements for the four-year

    degree of B.Sc. (Electronic Engineering).

    __________________              _______________

    (Engr. Abdul Manan)             (Dr. M. Ali Unar)

    Lecturer                        Professor

    Internal Examiner               External Examiner

    __________________

    (Engr. Jameel Ahmed)

    Associate Professor

    Head, Department of Electronic & Computer Engineering


    DEDICATION

    On this page you may dedicate your work to whomever you wish.


    ACKNOWLEDGMENT

    Acknowledgement is due to NFC Institute of Engineering & Technological Training

    for support of this Project.

    On this page you are advised to express appreciation to the teachers who

    helped you during your project, and to name those who guided you

    throughout your project, thesis and evaluation.


    TABLE OF CONTENTS

    Certificate ii

    Dedication iii

    Acknowledgement iv

    Table of Contents v

    List of Tables x

    List of Figures xi

    Abstract xiii

    CHAPTER 1: INTRODUCTION 1

    CHAPTER 2: LITERATURE SURVEY 5

    CHAPTER 3: SPEECH CODING TECHNIQUES 9

    3.1 Basic properties of speech coding 9

    3.2 Classes of speech 12

    3.2.1 Voiced sounds 12

    3.2.2 Unvoiced sounds 12

    3.2.3 Plosive sounds 13

    3.3 Properties of speech 16

    3.4 Speech modeling 16

    3.5 Waveform coding 17

    3.5.1 Pulse code modulation 19

    3.5.2 Delta modulation 19

    3.5.3 Adaptive differential PCM 19

    3.6 Vocoding 20

    3.6.1 Types of vocoders 22

    3.6.1.1 Homomorphic vocoders 22

    3.6.1.2 Linear predictive vocoders 23

    3.7 Hybrid coding 24

    3.7.1 Regular pulse excited coding 25

    3.7.2 Code excited linear predictor coders 26

    3.8 The G.728 Recommendation 27

    3.9 The G.729 Recommendation 27

    3.10 The G.723.1 Recommendation 28


    CHAPTER 4: UNDERSTANDING THE G.723.1 STANDARD 29

    4.1 Introduction 29

    4.1.1 Scope 30

    4.1.2 Bit rates 30

    4.1.3 Possible input signals 30

    4.1.4 Delay 30

    4.2 Speech coder description 31

    4.2.1 Analysis-by-Synthesis coding techniques 31

    4.3 Encoder principle 40

    4.3.1 Framer 42

    4.3.2 High Pass Filter 44

    4.3.3 LPC analysis 44

    4.3.3.1 Autocorrelation method 48

    4.3.4 LSP quantizer 48

    4.3.4.1 LPC to LSP conversion 51

    4.3.4.2 Quantization of the LPC coefficients 51

    4.3.5 LSP decoder 52

    4.3.6 LSP interpolation 53

    4.3.7 Formant Perceptual weighting filter 53

    4.3.8 Pitch estimation 56

    4.3.9 Harmonic noise shaping 56

    4.3.10 Impulse response calculator 57

    4.3.11 Zero input response and ringing subtraction 58

    4.3.12 Pitch prediction 58

    4.3.12.1 Adaptive codebook 58

    4.3.13 Multi pulse LPC 59

    4.3.14 High rate excitation (MP-MLQ) 61

    4.3.15 Code excited linear predictive coding (CELP) 62

    4.3.16 Low rate excitation (ACELP) 64

    4.3.17 Excitation decoder 67

    4.3.18 Decoding of the pitch information 68

    4.3.19 Memory update 69

    4.4 Decoder Principle 69

    4.4.1 General description 70

    4.4.2 LSP decoder 72

    4.4.3 LSP interpolator 72

    4.4.4 Decoding of the pitch information 72

    4.4.5 Excitation Decoder 72

    4.4.6 Pitch postfilter 72

    4.4.7 LPC synthesis filter 74

    4.4.8 Formant postfilter 75

    4.4.9 Gain scaling unit 75


    CHAPTER 5: SYSTEM DESIGN 77

    5.1 Introduction 77

    5.2 Use cases 77

    5.3 Detailed use cases 80

    5.3.1 Encoder 80

    5.3.1.1 Set Rate 80

    5.3.1.2 Allocate Buffer 80

    5.3.1.3 Analyze LPC 80

    5.3.1.4 Quantize LSP 80

    5.3.1.5 Weighting Formants 82

    5.3.1.6 Estimate Pitch 82

    5.3.1.7 Shape Noise 82

    5.3.1.8 Predict Pitch 82

    5.3.1.9 Encode Excitation 82

    5.3.2 Decoder 83

    5.3.2.2 Allocate Buffer 83

    5.3.2.3 Decode LSP 83

    5.3.2.4 Interpolate LSP 83

    5.3.2.5 Decode Pitch 83

    5.3.2.6 Decode Excitation 83

    5.3.2.7 Postfilter Pitch 85

    5.3.2.8 Synthesize Signal 85

    5.3.2.9 Postfilter Formants 85

    Collaboration Diagrams 85

    Collaboration Diagrams of the Encoder 85

    Collaboration Diagrams of the Decoder 95

    Class Diagram 104

    Classes and attributes 104

    Class associations 104

    Methods and attributes type information 105

    CHAPTER 6: IMPLEMENTATION OF THE G.723.1 STANDARD 108

    6.1 Introduction 108

    6.2 Speech coding considerations 108

    6.2.1 Platform independency 109

    6.2.2 Robustness 109

    6.2.3 Security 110

    6.2.4 Object orientation facility 110

    6.3 Java compared 111

    6.4 Algorithmic description 114

    6.4.1 Framer 114

    6.4.2 High pass filter 114

    6.4.2.1 Rem_Dc 115


    6.4.3 LPC analysis 115

    6.4.3.1 Comp_Lpc 115

    6.4.3.2 Durbin 116

    6.4.4 LSP quantizer 117

    6.4.4.1 AtoLsp 117

    6.4.4.2 LspQnt 118

    6.4.4.3 Lsp_Svq 119

    6.4.5 LSP decoder 119

    6.4.5.1 Lsp_Inq 119

    6.4.6 LSP interpolation 120

    6.4.6.1 Lsp_Int 121

    6.4.6.2 LsptoA 121

    6.4.7 Formant perceptual weighting filter 122

    6.4.7.1 Wght_Lpc 122

    6.4.7.2 Error_Wght 122

    6.4.8 Pitch estimation 123

    6.4.8.1 Estim_Pitch 123

    6.4.9 Harmonic noise shaping filter 124

    6.4.9.1 Comp_Pw 124

    6.4.9.2 Filt_Pw 124

    6.4.10 Impulse response calculator 125

    6.4.10.1 Comp_Ir 125

    6.4.11 Zero input response and ringing subtraction 126

    6.4.11.1 Sub_Ring 126

    6.4.12 Pitch prediction 126

    6.4.12.1 Find_Acbk 127

    6.4.12.2 Get_Rez 128

    6.4.12.3 Decod_Acbk 128

    6.4.13 High rate excitation (MP-MLQ) 129

    6.4.13.1 Find_Fcbk 129

    6.4.13.2 Find_Best 130

    6.4.13.3 Gen_Trn 131

    6.4.13.4 Fcbk_Pack 131

    6.4.14 Low rate excitation (ACELP) 132

    6.4.14.1 search_T0 132

    6.4.14.2 ACELP_LBC_code 132

    6.4.14.3 Cor_h 132

    6.4.14.4 Cor_h_X 132

    6.4.14.5 G_code 133

    6.4.15 Excitation decoder 133

    6.4.15.1 Fcbk_Unpk 133

    6.4.16 Decoding of the pitch information 134

    6.4.17 Memory update 134

    6.4.17.1 Upd_Ring 134


    6.4.18 Bit allocation 135

    6.4.18.1 Line_Pack 135

    6.4.19 Pitch postfilter 141

    6.4.19.1 Comp_Lpf 141

    6.4.19.2 Find_B 142

    6.4.19.3 Find_F 142

    6.4.19.4 Get_Ind 143

    6.4.19.5 Filt_Lpf 143

    6.4.20 LPC synthesis filter 144

    6.4.20.1 Synt 144

    6.4.21 Formant postfilter 145

    6.4.21.1 Spf 145

    6.4.21.2 Comp_En 146

    6.4.22 Gain scaling unit 146

    6.4.22.1 Scale 146

    6.4.23 Frame interpolation handling 147

    6.4.23.1 Comp_Info 147

    6.4.23.2 Regen 148

    CHAPTER 7: RESULTS AND OBSERVATIONS 149

    CHAPTER 8: CONCLUSION 166

    APPENDIX A: TABULAR DISTRIBUTION FOR CODEC 168

    APPENDIX B: CD CONTENTS 170

    NOMENCLATURE 171

    REFERENCES 173


    LIST OF TABLES

    Table 4.1 ACELP excitation codebook 65

    Table 6.1 Bit allocation of the 6.3 kbps coding algorithm 139

    Table 6.2 Bit allocation of the 5.3 kbps coding algorithm 140

    Table 7.1 Size in bytes of test vectors for encoder and decoder modes 151


    LIST OF FIGURES

    Figure 3.1 Physical model of speech production 11

    Figure 3.2 Periodic nature of a voiced sound 14

    Figure 3.3 Attenuation of the power of a voiced sound 15

    Figure 3.4 General speech production model of vocoders 21

    Figure 4.1 Basic structure of AbS-LPC speech encoder 33

    Figure 4.2 Frequency domain plot of the weighting filter of LPC envelope 36

    Figure 4.3 Block diagram of the speech coder 41

    Figure 4.4 Logical division of the frame 43

    Figure 4.5 Source filter model of speech production 45

    Figure 4.6 The zeros of the coefficients of the polynomial 50

    Figure 4.7 Spectral envelope of output quantization noise 56

    Figure 4.8 A typical pulse position structure for MPLPC 60

    Figure 4.9 Block diagram of the speech decoder 71

    Figure 5.1 High level use case diagram 79

    Figure 5.2 Detailed use case diagram of the encoding process 81

    Figure 5.3 Detailed use case diagram of the decoding process 84

    Figure 5.5 Collaboration diagram of Analyze LPC 87

    Figure 5.6 Collaboration diagram of Quantize LSP 88

    Figure 5.7 Collaboration diagram of Weighting Formants 89

    Figure 5.8 Collaboration diagram of Estimate Pitch 90

    Figure 5.9 Collaboration diagram of Shape Noise 91

    Figure 5.10 Collaboration diagram of Predict Pitch 92

    Figure 5.11 Collaboration diagram of Encode Excitation at 6.3 kbps 93

    Figure 5.12 Collaboration diagram of Encode Excitation at 5.3 kbps 94

    Figure 5.13 Collaboration diagram of decoder's Allocate Buffer 96

    Figure 5.14 Collaboration diagram of Decode LSP 97

    Figure 5.15 Collaboration diagram of Interpolate LSP 98

    Figure 5.16 Collaboration diagram of Decode Pitch 99

    Figure 5.17 Collaboration diagram of Decode Excitation 100

    Figure 5.18 Collaboration diagram of Postfilter Pitch 101

    Figure 5.19 Collaboration diagram of Synthesize Signal 102

    Figure 5.20 Collaboration diagram of Postfilter Formants 103

    Figure 5.21 Class diagram of the Codec 107

    Figure 6.1 Programming languages compared 112

    Figure 7.1 Delay comparison of test vector Overc53h.tin on Windows Platform 152

    Figure 7.2 Delay comparison of test vector Overc63h.tin on Windows Platform 153

    Figure 7.3 Delay comparison of test vector Overd53.tco on Windows Platform 154

    Figure 7.4 Delay comparison of test vector Overd63p.tco on Windows Platform 155

    Figure 7.5 Delay of test vector Overc53h.tin on Linux Platform 156

    Figure 7.6 Delay of test vector Overc63h.tin on Linux Platform 157

    Figure 7.7 Delay of test vector Overd53.tco on Linux Platform 158

    Figure 7.8 Delay of test vector Overd63p.tin on Linux Platform 159

    Figure 7.9 Test of Overc53h.tin on the optimized codec 160

    Figure 7.10 Test of Overc63h.tin on the optimized codec 161

    Figure 7.11 Test of Overd53.tco on the optimized codec 162

    Figure 7.12 Test of Overd63p.tco on the optimized codec 163

    Figure 7.13 Distribution of computational load at High-rate 164

    Figure 7.14 Distribution of computational load at Low-rate 165

    ABSTRACT

    Digital transmission of coded speech is becoming increasingly important in a

    wide range of VoIP applications, e.g. teleconferencing, video-on-demand and Internet

    telephony, which run on a variety of platforms and therefore call for secure, robust,

    flexible and platform-independent software. Traditionally, the software supporting

    multimedia applications was not equipped to meet these requirements. In this

    project, we describe the ITU-T G.723.1 dual-rate speech coding algorithm and its

    implementation in Java, which reproduces the bit-exact, fixed-point mathematical

    operations specified by the Recommendation. The implementation was tested for

    bit-by-bit compatibility with the ITU-T standard using the test vectors provided by

    the ITU. The performance results from these tests, carried out on different platforms,

    show the versatility of our codec.

    Chapter 1


    INTRODUCTION

    Although the emergence of optical fibers has made bandwidth in wired communications

    inexpensive, there is a growing need for bandwidth conservation and

    enhanced privacy in wireless cellular and satellite communications. In particular,

    cellular communications have been enjoying a tremendous worldwide growth and

    there is a great deal of R&D activity geared towards establishing global portable

    communications through wireless personal communication networks (PCNs). On the

    other hand, there is a trend toward integrating voice-related applications (e.g.,

    voicemail) on desktop and portable personal computers - often in the context of

    multimedia communications. Most of these applications require that the speech signal

    is in digital format so that it can be processed, stored, or transmitted under software

    control. Speech is generally band-limited to 4 kHz (or 3.2 kHz) and sampled at 8 kHz.

    Although digital speech brings flexibility and opportunities for encryption, it is also

    associated (when uncompressed) with a high data rate and hence high requirements of

    transmission bandwidth and storage. Speech Coding or Speech Compression is the

    field concerned with obtaining compact digital representations of voice signals for the

    purpose of efficient transmission or storage. Speech coding involves sampling and

    amplitude quantization. While the sampling is almost invariably done at a rate equal to

    or greater than twice the bandwidth of analog speech, there has been a great deal of

    variability among the proposed methods in the representation of the sampled

    waveform. The objective in speech coding is to represent speech with a minimum

    number of bits while maintaining its perceptual quality. The quantization or binary

    representation can be direct or parametric. Direct quantization implies binary

    representation of the speech samples themselves while parametric quantization

    involves binary representation of speech model and/or spectral parameters.

    The simplest non-parametric coding technique is Pulse Code Modulation (PCM), which is

    simply a quantizer of sampled amplitudes. Speech coded at 64 kilobits per second (kbps)

    using logarithmic PCM is considered "non-compressed" and is often used as a reference

    for comparisons. The term medium-rate is used for coding in the range of 8-16 kbps,

    low-rate for systems working below 8 kbps and down to 2.4 kbps, and very-low-rate for

    coders operating below 2.4 kbps.
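    The 64 kbps reference figure follows from 8000 samples per second at 8 bits per

    sample. As a minimal sketch of logarithmic PCM, the following Java fragment applies

    the continuous mu-law companding curve and then a uniform 8-bit quantizer. The class

    and method names are hypothetical and are not taken from the project code, and the

    curve is the textbook continuous form rather than the segmented G.711 encoder.

```java
// Illustrative sketch only: continuous mu-law companding followed by a
// uniform 8-bit quantizer. Not the segmented G.711 encoder.
final class LogPcmSketch {
    static final double MU = 255.0;          // mu-law parameter
    static final int SAMPLE_RATE_HZ = 8000;  // telephone-band sampling rate
    static final int BITS_PER_SAMPLE = 8;

    // Compress a sample x in [-1, 1] to a value in [-1, 1].
    static double compress(double x) {
        return Math.signum(x) * Math.log1p(MU * Math.abs(x)) / Math.log1p(MU);
    }

    // Inverse of compress().
    static double expand(double y) {
        return Math.signum(y) * Math.expm1(Math.abs(y) * Math.log1p(MU)) / MU;
    }

    // Uniformly quantize a compressed value to a signed code in [-127, 127].
    static int quantize(double y) {
        return (int) Math.round(y * 127.0);
    }

    // Map a code back to a compressed value in [-1, 1].
    static double dequantize(int code) {
        return code / 127.0;
    }

    // Bit rate of the resulting PCM stream in bit/s.
    static int bitRate() {
        return SAMPLE_RATE_HZ * BITS_PER_SAMPLE;
    }

    public static void main(String[] args) {
        double x = 0.01; // a quiet sample
        double xr = expand(dequantize(quantize(compress(x))));
        System.out.println("bit rate = " + bitRate() + " bit/s");
        System.out.println("x = " + x + ", reconstructed = " + xr);
    }
}
```

    The logarithmic curve spends more quantizer levels on small amplitudes, which is why

    a quiet sample survives the 8-bit round trip with a small error.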

    Speech coding at medium-rates and below is achieved using an analysis-synthesis process.

    In the analysis stage, speech is represented by a compact set of parameters, which are

    encoded efficiently. In the synthesis stage, these parameters are decoded and used in

    conjunction with a reconstruction mechanism to form speech. Analysis can be open-loop

    or closed-loop.

    In closed-loop analysis, the parameters are extracted and encoded by minimizing explicitly

    a measure (usually the mean square) of the difference between the original and the

    reconstructed speech. Therefore closed-loop analysis incorporates synthesis and hence this

    process is also called analysis-by-synthesis. Parametric representations can be speech or

    non-speech specific. Non-speech specific coders or waveform coders are concerned with

    the faithful reconstruction of the time-domain waveform and generally operate at medium-

    rates. Speech specific coders or voice coders (vocoders) rely on speech models and are

    focused upon producing perceptually intelligible speech without necessarily matching the

    waveform. Vocoders are capable of operating at very-low rates but also tend to produce

    speech of synthetic quality.
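    The closed-loop (analysis-by-synthesis) idea described above can be sketched in a few

    lines of Java: every candidate excitation is passed through the synthesis filter, and

    the candidate with the smallest mean square error against the target is kept. The

    one-pole filter, the tiny codebook and all names below are assumptions chosen for

    illustration; a real coder uses a higher-order LPC filter, gains and perceptual weighting.

```java
// Illustrative sketch only: exhaustive closed-loop codebook search.
final class AbsSearchSketch {
    // Hypothetical one-pole synthesis filter: s[n] = e[n] + a * s[n-1].
    static double[] synthesize(double[] excitation, double a) {
        double[] out = new double[excitation.length];
        double prev = 0.0;
        for (int n = 0; n < excitation.length; n++) {
            out[n] = excitation[n] + a * prev;
            prev = out[n];
        }
        return out;
    }

    // Mean square error between two equal-length signals.
    static double meanSquareError(double[] x, double[] y) {
        double sum = 0.0;
        for (int n = 0; n < x.length; n++) {
            double d = x[n] - y[n];
            sum += d * d;
        }
        return sum / x.length;
    }

    // Returns the index of the codebook entry whose synthesized output
    // is closest to the target in the mean-square sense.
    static int search(double[] target, double[][] codebook, double a) {
        int best = -1;
        double bestErr = Double.POSITIVE_INFINITY;
        for (int i = 0; i < codebook.length; i++) {
            double err = meanSquareError(target, synthesize(codebook[i], a));
            if (err < bestErr) {
                bestErr = err;
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        double[][] codebook = {
            {1, 0, 0, 0}, {0, 1, 0, 0}, {1, 0, -1, 0}
        };
        // Build a target from a known entry so the search has an exact match.
        double[] target = synthesize(codebook[2], 0.5);
        System.out.println("best index = " + search(target, codebook, 0.5));
    }
}
```

    Because the decoder's synthesis step runs inside the encoder's search loop, the cost

    grows with the codebook size, which is the complexity trade-off the later chapters address.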

    Although this is the generally accepted classification in speech coding, there are coders

    that combine features from both categories, for example hybrid coders, which rely on

    analysis-by-synthesis linear prediction. Hybrid coders combine the coding efficiency of

    vocoders with the high-quality potential of waveform coders by modeling the spectral

    properties of speech (much like vocoders) and exploiting the perceptual properties of the

    ear, while at the same time providing for waveform matching (much like waveform

    coders). Modern hybrid coders can achieve communications quality speech at 8 kbits/s and

    below at the expense of increased complexity.

    The International Telecommunication Union (ITU) is an international standards

    organization chartered by the United Nations to formulate worldwide communications

    standards. Its members represent nearly every nation in the world, with delegates

    typically drawn from the largest telecommunication service providers and equipment

    manufacturers in the member countries. Manufacturing equipment according to these

    standards ensures compatibility of equipment and protocols worldwide. The most

    widely adopted ITU standards for speech coding in multimedia applications are

    G.728, G.729 and G.723.1.

    The speech compression technology designated G.723.1 has enabled visual

    telephony over the public telephone network, among a variety of other teleconferencing

    and multimedia applications. This technology operates at data rates of 6.3 and 5.3

    kbps, a substantial improvement in compression ratio over earlier ITU standards,

    while maintaining high speech quality. The higher bit rate gives better quality. The

    lower bit rate gives good quality and provides system designers with additional

    flexibility. The high-quality speech is possible because of significant advances in

    digital speech compression introduced by the contributing parties and advances in

    digital signal processing technologies.

    The algorithm used for coding of speech at higher rate (6.3 kbps) is Multipulse Maximum

    Likelihood Quantization (MP-MLQ) and for lower rate (5.3 kbps) is Algebraic-Code-

    Excited Linear Prediction (ACELP). It is possible to switch between the two rates at any

    frame boundary.
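    A short worked example may help relate the two rates to frame sizes. The sketch below

    assumes the 30 ms frame at 8 kHz sampling and the packed frame sizes of 24 bytes

    (6.3 kbps mode) and 20 bytes (5.3 kbps mode) given in the G.723.1 Recommendation;

    these constants and the class name are not taken from the project code. The rates

    implied by the packed sizes come out slightly above the nominal values, presumably

    because the packed frame also carries rate-flag and padding bits.

```java
// Illustrative sketch only: frame arithmetic for a dual-rate coder that
// may switch rate at any frame boundary.
final class G7231FrameMath {
    static final int SAMPLE_RATE_HZ = 8000;
    static final int FRAME_MS = 30;
    static final int HIGH_RATE_FRAME_BYTES = 24; // 6.3 kbps mode (assumed from the Recommendation)
    static final int LOW_RATE_FRAME_BYTES = 20;  // 5.3 kbps mode (assumed from the Recommendation)

    // Number of input samples consumed per frame.
    static int samplesPerFrame() {
        return SAMPLE_RATE_HZ * FRAME_MS / 1000;
    }

    // Bit rate in bit/s implied by a packed frame size (integer division).
    static int packedBitRate(int frameBytes) {
        return frameBytes * 8 * 1000 / FRAME_MS;
    }

    // Total payload for a stream in which each frame picks its own rate.
    static int payloadBytes(boolean[] useHighRate) {
        int total = 0;
        for (boolean high : useHighRate) {
            total += high ? HIGH_RATE_FRAME_BYTES : LOW_RATE_FRAME_BYTES;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("samples per frame = " + samplesPerFrame());
        System.out.println("high rate = " + packedBitRate(HIGH_RATE_FRAME_BYTES) + " bit/s");
        System.out.println("low rate  = " + packedBitRate(LOW_RATE_FRAME_BYTES) + " bit/s");
        // Three frames: high, low, high.
        System.out.println("payload = " + payloadBytes(new boolean[] {true, false, true}) + " bytes");
    }
}
```

    Against 8-bit PCM, which needs 240 bytes for the same 240 samples, these frame sizes

    correspond to compression ratios of 10:1 and 12:1.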

    In this project we have studied and implemented the ITU G.723.1 speech codec in Java,

    which provides a more flexible, extensible, robust, secure and platform-independent

    implementation.

    The report is organized as follows.

    Chapter 2 presents the literature survey, which gives an overview of different

    publications on speech compression. Chapter 3 examines the characteristics of

    human speech, which serve as a foundation for discussing how voice can be analyzed

    and synthesized. By discussing different voice-digitization methods, it also covers

    different international standards, laying the foundation for the information presented

    in the chapters that follow. Chapter 4 presents a block-by-block explanation of the

    ITU G.723.1 dual-rate speech coder. Chapter 5 illustrates the system design aspects of

    our codec. Chapter 6 deals with the implementation aspects and the software

    specifications of G.723.1 in Java. Chapter 7 presents the observations made by

    executing our codec on different machines and platforms. Chapter 8 draws the

    conclusions of the research and offers suggestions for future work in this area.

    Chapter 2

    LITERATURE SURVEY


    Andreas S. Spanias [5] provides an overview of speech coding methodologies with

    emphasis on those algorithms that are part of the recent low-rate standards for cellular

    communications. Although the emphasis is on the new low-rate coders, the paper attempts

    to provide a comprehensive survey by covering some of the traditional methodologies as

    well. It not only points out key references but also provides valuable background for

    beginners.

    Richard V. Cox and Peter Kroon [19] have compared different ITU standards, which are

    applicable to low bit-rate multimedia communications. ITU Rec. G.729 8 kb/s CS-ACELP

    has a 15 ms algorithmic codec delay and provides network-quality speech. It was originally

    designed for wireless applications, but is applicable for multimedia communications as

    well. Annex A of Rec. G.729 is a reduced complexity version of the CS-ACELP coder. It

    was designed explicitly for simultaneous voice and data applications that are prevalent in

    low bit-rate multimedia communications. These two coders use the same bit-stream format

    and can interoperate. ITU Rec. G.723.1 6.3 and 5.3 kb/s speech coder for multimedia

    communications was designed originally for low bit-rate videophones. Its frame size of 30

    ms and one-way algorithmic codec delay of 37.5 ms allow for a further reduction in bit rate

    compared to the G.729 coder. In applications where low-delay is important, the delay of

    G.723.1 may be too large. However, if the delay is acceptable, G.723.1 provides a lower

    complexity alternative to G.729 at the expense of a slight degradation in quality. The

    authors describe the attributes of speech coders such as bit rate, complexity, delay and

    quality, and discuss the basic concepts of the three ITU coders by comparing their specific

    attributes.
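    The two delay figures quoted above can be reproduced as frame length plus look-ahead.

    The sketch below assumes the frame and look-ahead lengths given in the respective

    Recommendations (G.729: 10 ms frame, 5 ms look-ahead; G.723.1: 30 ms frame, 7.5 ms

    look-ahead), expressed in samples at 8 kHz. The class name is hypothetical.

```java
// Illustrative sketch only: algorithmic delay = frame length + look-ahead.
final class CodecDelaySketch {
    static final int SAMPLE_RATE_HZ = 8000;

    // Delay in milliseconds for the given frame and look-ahead lengths in samples.
    static double algorithmicDelayMs(int frameSamples, int lookAheadSamples) {
        return (frameSamples + lookAheadSamples) * 1000.0 / SAMPLE_RATE_HZ;
    }

    public static void main(String[] args) {
        // G.729: 80-sample frame, 40-sample look-ahead (assumed from the Recommendation).
        System.out.println("G.729   delay = " + algorithmicDelayMs(80, 40) + " ms");
        // G.723.1: 240-sample frame, 60-sample look-ahead (assumed from the Recommendation).
        System.out.println("G.723.1 delay = " + algorithmicDelayMs(240, 60) + " ms");
    }
}
```

    The longer frame is what lets G.723.1 spend fewer bits per second, at the cost of the

    extra 22.5 ms of one-way delay.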

    Kashif Israr Siddiqui et al. [21] give a brief account of their work to implement and

    optimize a dual-rate speech codec for real-time operation on TriMedia's Very Long

    Instruction Word (VLIW) Digital Signal Processor (DSP) Central Processing Unit (CPU),

    so that the speech codec can operate under limited processor resources. The codec they

    implemented has two bit rates, 5.3 and 6.3 kbits/s, and was optimized to represent

    speech with high quality at these rates using a limited amount of complexity.

    Fu-Kun Chen et al. [24] have proposed condensed stochastic codebook search approaches

    that progressively reduce the computation required for the algebraic code excited linear

    predictive (ACELP) and multi-pulse maximum likelihood quantization (MP-MLQ) coders.

    By reducing the number of codebook candidates before the search procedure, the proposed

    methods can effectively reduce the computation required for the ITU-T G.723.1 dual-rate

    speech coder. Their simulation results show that the proposed methods can save over 50

    percent of the computation for the stochastic codebook search with perceptually

    negligible degradation in speech quality.

    J. P. Woodard and L. Hanzo [25] have considered extensions to the Analysis-by-Synthesis

    (AbS) loop used in Code Excited Linear Predictive (CELP) speech codecs. They have

    examined the methods for updating the short-term synthesis filter once the excitation

    parameters have been determined. They show that significant improvements can be

    achieved by updating the synthesis filter, similar to those obtained using the well-known

    methods of interpolation and bandwidth expansion. However their proposed method of

    update avoids the increase in the delay of a codec that is usually associated with

    interpolation. Furthermore the traditional sequential method of determining the adaptive

    and fixed codebook parameters is examined and compared to an exhaustive search of both

    codebooks. Three sub-optimum techniques were proposed for improving the performance

    of the codebook search while maintaining a reasonable level of complexity. The most

    complex of these increases the codec complexity by only about 40% but provides 80% of

    the maximum possible 1.1 dB segmental SNR improvement associated with an exhaustive

    codebook search.

    Benjamin W. Wah and Dong Lin [26] discuss a fundamental issue in real-time interactive

    voice transmission over unreliable IP networks: the loss or late arrival of packets for

    playback. This problem is especially serious when transmitting low bit-rate coded speech

    with pervasive inter-frame dependencies. In such a case, the loss or late arrival of a

    single packet leads to the loss of subsequent dependent frames. In their paper, they

    describe end-to-end loss-concealment schemes for ensuring high quality in playback.

    They propose a novel multiple-description coding method for concealing packet losses in

    transmitting low bit-rate coded speech. Based on the high correlations observed in the

    linear predictor parameters, in the form of Line Spectral Pairs (LSPs), of adjacent

    frames, they generate multiple descriptions at the sender by interleaving LSPs, and

    reconstruct lost LSPs at the receiver by linear interpolation. As excitation codewords

    have low correlations, they further enlarge the segment size for excitation generation

    and replicate excitation codewords in all descriptions in order to maintain the same

    transmission bandwidth.
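    The interleave-and-interpolate idea can be sketched as follows. This is a simplified

    illustration under stated assumptions, not the scheme of [26]: two descriptions carry

    the even-numbered and odd-numbered frames respectively, a lost frame is rebuilt as the

    average of its two received neighbours, and an edge frame copies its only neighbour.

    All names are hypothetical, and the excitation handling is omitted.

```java
// Illustrative sketch only: two-description interleaving of per-frame LSP
// vectors, with lost frames rebuilt by linear interpolation.
final class LspInterleaveSketch {
    // Keep only the frames with the given parity (0 = even, 1 = odd);
    // frames belonging to the other description are left as null.
    static double[][] describe(double[][] lsp, int parity) {
        double[][] d = new double[lsp.length][];
        for (int i = 0; i < lsp.length; i++) {
            if (i % 2 == parity) {
                d[i] = lsp[i].clone();
            }
        }
        return d;
    }

    // Fill each missing (null) frame with the average of its received
    // neighbours; at the edges, copy the single available neighbour.
    static double[][] reconstruct(double[][] received) {
        int n = received.length;
        double[][] out = new double[n][];
        for (int i = 0; i < n; i++) {
            if (received[i] != null) {
                out[i] = received[i].clone();
                continue;
            }
            double[] prev = i > 0 ? received[i - 1] : null;
            double[] next = i + 1 < n ? received[i + 1] : null;
            if (prev == null && next == null) {
                throw new IllegalArgumentException("no neighbour for frame " + i);
            }
            if (prev == null) prev = next;
            if (next == null) next = prev;
            out[i] = new double[prev.length];
            for (int k = 0; k < prev.length; k++) {
                out[i][k] = 0.5 * (prev[k] + next[k]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        double[][] lsp = { {0.10, 0.20}, {0.20, 0.40}, {0.30, 0.60} };
        // Suppose only the even-frame description arrives.
        double[][] rebuilt = reconstruct(describe(lsp, 0));
        System.out.println(java.util.Arrays.toString(rebuilt[1]));
    }
}
```

    Interpolation works here only to the extent that LSPs of adjacent frames are highly

    correlated, which is the observation the method rests on.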

    J. P. Woodard and L. Hanzo [27] have developed a programmable 8-16 kbps low-delay

    speech codec, which is compatible with the G.728 16 kbps ITU codec at its top rate and

    exhibits similarly attractive trade-offs in terms of speech quality, delay and complexity in

    the range of 8-16 kbps.

    The application report by Thomas J. Dillon, Jr. [36] describes how the G.723.1 Dual-Rate

    Speech Coder has been implemented on the Texas Instruments (TI) TMS320C62x digital signal

    processor (DSP). Beyond the use of the C62x intrinsic functions, the application report

    includes specific changes required to allow this coder to operate in a real-time system with

    other speech coders. Also reported is information on several optimization techniques used

    to yield multiple channels running concurrently. Finally, the application report includes the

    performance resulting from this implementation of the algorithm.

    REFERENCES

    [1] A.M Kondoz, Digital Speech, John Wiley & sons, January 2001.

    [2] David Flanagan, Java in a Nutshell, O'Reilly, August 1998.

    [3] Jonathan (Y) Stein, Digital Signal Processing, John Wiley & sons, January 2000.


    [4] Sophocles J. Orfanidis, Introduction to Signal Processing, Prentice Hall, 1996.

    [5] Andreas S. Spanias, Speech Coding: A Tutorial Review, Proc. IEEE, vol.82, no. 10,

    pp. 1541-1582, October 1994.

    [6] C. G. Bell et al. Reduction of speech spectra by Analysis-by-synthesis techniques,

    Journal of the Acoustic Soc. of America, 33:1725-1736, December 1961.

    [7] B. Atal, Predictive Coding of Speech at low Bit Rates, IEEE Trans. On

    Communications, pages 600-614, April 1982.

    [8] M. R. Schroeder and B. S. Atal, Code Excited Linear Prediction (CELP): High Quality

    Speech at Low Bit Rates, Proc. of ICASSP, pages 937-940, 1985.

    [9] P. Kroon and E. Deprettere, A Class of Analysis-By-Synthesis Predictive Coders for

    High Quality Speech Coding at Rates Between 4.8 and 16 Kbps, IEEE jour. On Selected

    Areas in Communications, pages 353-363, February 1988.

    [10] M. R. Schroeder and B. S. Atal. Predictive Coding of Speech Signals and Subjective

    Error Criteria. IEEE Trans. On ASSP, 27(3):247-254, June 1979.

    [11] M. R. Schroeder, B. S. Atal, and J. L. Hall. Optimizing Digital Speech Coders by

    Exploiting Masking Properties of the Human Ear. Journal of the Acoustic Soc. of

    America, 66(6):1647-1652, December 1979.

    [12] Edward Chilton. Factors Affecting the Quality of Linear Predictive Coding of

    Speech at Low Bit-rates, PhD thesis, University of Surrey. Guildford, Surrey, U.K.,

    October 1990.

    [13] K. Y. Lee. Analysis By Synthesis Linear Predictive Coding, PhD thesis, University

    of Surrey, Guildford, Surrey, U.K., October 1990.

    [14] S. Singhal and B. S. Atal, Improving the performance of multi-pulse LPC coders at

    low bit rates, In Proc. Of ICASSP, pages 1.3.1-1.3.4, San Diego, 1984.

    [15] R. Soheili, J. Horos, A. M. Kondoz, and B. G. Evans. New Innovations in Multi-

    Pulse Coding for bit rates below 8kbits/s. In Proc. of EUROSPEECH Conf., pages 298-

    301, Paris, France, September 1989.

    [16] L. Rabiner and R. Schafer. Digital Processing of Speech Signals. Signal Processing.

    Prentice-Hall, 1978

    [17] John R. Deller, John G. Proakis, and John H. L. Hansen. Discrete-Time

    Processing of Speech Signals. Macmillan, 1993.

    [18] Luis Miguel Teixeira de Jesus, Speech Coding and Synthesis Using

    Parametric Curves, MS thesis, University of East Anglia, October 1997.

    [19] Richard V. Cox and Peter Kroon. Low Bit-Rate Speech Coders for Multimedia

    Communication, Murray Hill, NJ 07904, November 1996.

    [20] Thomas E. Tremain, The Government Standard Linear Predictive Coding

    Algorithm: LPC-10, Speech Technology Magazine, p.40-49, April 1982.

    [21] Kashif Israr Siddiqui et al. Real-Time Implementation of ITU-T's G.723.1 Dual Rate

    Speech Coder for Multimedia Communications Transmitting at 5.3 and 6.3 kbits/s on

    TriMedia's TM-1000 VLIW DSP CPU, Proc. of INMIC, December 2001.

    [22] Gill Held, Voice and Data Internetworking, Osborne/McGraw-Hill, 2001.

    [23] Patrick Niemeyer and Joshua Peck, Exploring Java, O'Reilly, September 1997.

    [24] Fu-Kun Chen et al. Complexity Scalability for ACELP and MP-MLQ Speech Coders,

    IEICE Trans. Inf. & Syst., Vol. E85-D, No. 1, January 2002.

    [25] J. P. Woodard and L. Hanzo, Improvements to the Analysis-by-Synthesis Loop in

    CELP Codecs, Department of Electronics and Computer Science, University of

    Southampton, June 1994.

    [26] Benjamin W. Wah and Dong Lin, LSP-Based Multiple-Description Coding for Real-Time

    Low Bit-Rate Voice Transmissions, Department of Electrical and Computer Engineering

    and the Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801, USA.

    [27] J. P. Woodard and L. Hanzo, A G.728-compatible Programmable Rate 8-16 kbps

    Low-Delay CELP Codec, Department of Electronics and Computer Science, University

    of Southampton, June 1994.

    [28] Simon Haykin, Analog and Digital Communications, John Wiley & sons, 1994.

    [29] Wayne Tomasi, Advanced Electronic Communications Systems, Fourth Edition,

    Prentice Hall, 1998.

    [30] Sophocles J. Orfanidis, Introduction to Signal Processing, Prentice Hall Signal

    Processing Series, 1996.

    [31] Phillips and Parr, Signals, Systems, and Transforms, Prentice Hall, 1995.

    [32] Peter Kraniauskas, Transforms in Signals and Systems, Addison-Wesley, 1993.

    [33] Leland B. Jackson, Signals, Systems, and Transforms, Addison-Wesley, 1991.

    [34] Alan V. Oppenheim, Signals and Systems, Prentice Hall, 1983.

    [35] Justin Zobel, Writing for Computer Science, Springer, 1998.


    [36] Thomas J. Dillon, Jr., G.723.1 Dual-Rate Speech Coder: Multichannel

    TMS320C62x Implementation, Application Report, SPRA552B, February 2000.

    [37] Cisco Systems official documentation, Waveform Coding Techniques, Cisco

    Systems Inc, 2001.

    [38] Excelsior, Excelsior JET 2.5, http://www.excelsior-usa.com, 2002.

    [39] Craig Larman, Applying UML and Patterns, Prentice Hall, 2000.

    [40] Martin Fowler, UML Distilled, Addison-Wesley, 2000.

    [41] Bertrand Meyer, Object-Oriented Software Construction, Prentice Hall, 1997.

    [42] Bruce Eckel, Thinking in Java, Prentice Hall, 2000.
