Non-linear speech processing: overview of COST-277 current research
Nonlinear speech processing (NOLISP)
Overview of COST-277 current research
Marcos Faúndez-Zanuy
COST-277 Chairman
OUTLINE
1. Overview: what does “nonlinear” mean?
2. Organization of COST-277
3. Activity report, June ’01 – June ’03
What does “non-linear” mean? (Strict sense)
A system f is non-linear when the superposition principle does not hold. Superposition requires that,
given f(x1) = y1 and f(x2) = y2:
f(a·x1) = a·y1 and f(x1 + x2) = y1 + y2
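The superposition test can be tried numerically. The sketch below is illustrative only: the function names, the gain of 0.5 and the quantizer step of 2.0 are assumptions, not taken from the slides. It checks homogeneity and additivity on random inputs for a pure gain and for a uniform quantizer.

```python
import random


def satisfies_superposition(f, trials=200, tol=1e-9):
    """Empirically test homogeneity and additivity of a scalar map f."""
    rng = random.Random(0)  # fixed seed so the check is repeatable
    for _ in range(trials):
        x1, x2, a = (rng.uniform(-8.0, 8.0) for _ in range(3))
        homogeneous = abs(f(a * x1) - a * f(x1)) <= tol
        additive = abs(f(x1 + x2) - (f(x1) + f(x2))) <= tol
        if not (homogeneous and additive):
            return False
    return True


def gain(x):
    """A linear system: constant gain (hypothetical value)."""
    return 0.5 * x


def quantize(x, step=2.0):
    """Uniform quantizer with a hypothetical step size."""
    return step * round(x / step)


print("gain is linear:", satisfies_superposition(gain))
print("quantizer is linear:", satisfies_superposition(quantize))
```

A random test like this can only disprove linearity; passing it is evidence, not proof.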
What does “non-linear” mean? Strict sense: in reality, almost “everything” is nonlinear
Acquisition: quantizer (linear, A-law, etc.)
Parameterization: cepstrum
Models: HMM, VQ
[Figure: uniform 3-bit quantizer, output versus input]
$\mathrm{cepstrum}(x(n)) = F^{-1}\{\log |F\{x(n)\}|\}$
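As a rough illustration of why the cepstrum is a nonlinear parameterization, the sketch below computes a real cepstrum (inverse transform of the log magnitude spectrum) with a direct DFT from the standard library. The function names and the magnitude floor are assumptions made for this example; a practical system would use an FFT.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct O(N^2) discrete Fourier transform, adequate for short frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, floor=1e-12):
    """Inverse DFT of the log magnitude spectrum of one frame."""
    spectrum = dft(frame)
    # The floor (an assumption) avoids log(0) for empty spectral bins.
    log_mag = [math.log(max(abs(s), floor)) for s in spectrum]
    return [c.real for c in dft(log_mag, inverse=True)]


impulse = [1.0] + [0.0] * 7
doubled = [2.0 * v for v in impulse]
print(real_cepstrum(impulse)[0], real_cepstrum(doubled)[0])
```

Doubling the input adds log 2 to the first cepstral coefficient instead of doubling the cepstrum, so superposition does not hold.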
Non-linearities are always present
Nonlinearities of the systems that generate the signal and/or noise
Nonlinearities of the signal acquisition system
Nonlinearities of the transmission channel
Nonlinearities of the human perception mechanism
Classical approach (wide sense): linear speech processing
The speech signal model consists of a pulse/noise source and a linear filter, both of which change their characteristics on a frame-by-frame basis.
This approach neglects structure known to be present in the speech signal.
Evidence of nonlinearities
Residue comparison
Correlation dimension
Higher-order statistics
Probability density functions
Example: Linear vs NL
Drawbacks of NOLISP approaches
Lack of a unifying theory for the different nonlinear processing tools (neural nets, homomorphic, polynomial, morphological and order-statistics filters, and so on)
High computational burden
Well-known analysis tools are not applicable
Usually, a closed-form formulation does not exist, and iterative methods (with local minima problems) must be used.
What are we mainly looking for?
The replacement of the linear filter (or parts thereof) with nonlinear operators (models) should enable us to obtain an accurate description of the speech signal with a lower number of parameters. This in turn should lead to better performance of practical speech processing applications.
What is COST?
Intergovernmental Cooperation
– Created in 1971
– 17 Scientific and Technical Domains
Participation
– 33 COST Countries
– European Commission
– International Organisations
– Organizations from Non-COST Countries on Mutual Benefit Basis
COST Actions
– Concerted Actions of Nationally Funded R&D
COST TIST: Telecommunications, Information Science and Technologies
COST Countries
The fifteen EU Member States
The EFTA Member States
Iceland
Norway
Switzerland
Central and Eastern countries
Estonia
Latvia
Lithuania
Poland
the Czech Republic
Slovakia
Slovenia
Croatia
Romania
Bulgaria
Other countries
Cyprus
Malta
Turkey
Hungary
Evolution of COST Actions
[Figure: number of total and starting COST Actions per year, 1980 to 2000]
WHAT IS A COST ACTION?
Concerted Action
Pan-European
“NON-COMPETITIVE” Research
R&D Financed Nationally
Flexibility
Bottom-up
A la carte participation
Commission funds only coordination activities
COST Senior Officials (CSO)
Responsible for the overall strategy of COST
Decides on the launching of each individual COST Action
Approves participation of institutes from non-COST countries
Approves prolongation of COST Actions
COST Technical Committee (TC)
Selection of new COST Actions
Monitoring of ongoing COST Actions
Evaluation of completed COST Actions
Dissemination and Valorisation of COST activities
Provide Advice to EC on Budget Planning
Management Committee (MC)
Supervises and coordinates the implementation of the Action
Composed of:
– Maximum two representatives of each signatory country; they ensure the scientific coordination at national level
– One representative of any non-COST institution admitted to participate
– The Scientific Secretary
– Representatives of the Commission services
Each signatory has one vote
Working Group (WG)
Small number of researchers per working group
Working group members may be:
– Management Committee members
– Other scientists from the signatory countries
COST TIST
~28 Actions, ~2000 Organisations
Covering Basic Research on:
– Antennas and Radio Propagation
– Satellite Technologies and Services
– Mobile Technologies and Services
– Optical Networking Components and Services
– Internet & Multimedia Network Services
– Speech Technologies
– Information and Computer Science
Strong Relationship with IST Program
Evolution of COST TIST Actions
[Figure: number of total and starting COST TIST Actions, 1996 to 2000]
COST TIST Research Domains & Actions
Special Needs & User Requirements: COST 219bis, 269
Antennas/Radio Propagation: COST 244bis, 255, 260, 261, 271
Mobile & Personal Comm.: COST 259, 273
Satellite Tech. & Services: COST 272
Optical Networking: COST 265, 266, 267, 268, 270
New Internet & Multimedia Services: COST 211 Quad, 256, 257, 263, 264, 269, 275, 279
Speech Technologies: COST 258, 277, 278
Information & Computer Science: COST 274, 276
Other COST Actions in Speech Technologies
COST 275: Biometrics-Based Recognition of People over the Internet
– Involves the use of both voice and face recognition for user authentication over the Internet
COST 278: Spoken Language Interaction in Telecommunications
– Improve knowledge regarding issues and problems related to spoken language interaction, including robustness and multi-lingual aspects
– Human-computer interaction using spoken language in multi-modal context, including dialogue theories and application evaluation
Relationship between COST Actions 275, 277 and 278
[Diagram relating the three Actions to speech technology topics]
Actions: 275 (Biometrics-Based Recognition of People over the Internet), 277 (Non-linear Speech Processing), 278 (Spoken Language Interaction in Telecommunication)
Topics: Speaker Recognition, Speech Recognition, Natural Language Processing, Multi-Modality & Data Fusion, Speech Analysis & Coding, Image Analysis & Graphics, Speech Synthesis, Dialogue
Layers: Application Fields, Interface Components, Generic Functions
GRANT CONTRACTS
COST TIST support is provided through annual Grant Contracts with the coordinating organisation
The contract covers costs for:
– Secretariat (manpower to cover administration)
– Meetings (WG and MC)
– Seminars and workshops
– Short Term Scientific Missions
– Publications
SECRETARIAT
Contract Management, Payments
Reimbursement of Meetings
Rebuilding of WWW site
– Repository of Official Documents
– TC and Action Activities and Events
Enhancing Dissemination
– News Letter
– Central Index and Storage of Reports for Retrieval
Links with EC (IST) and National Programmes
Overview: COST-277
[Diagram (© ukl 2002): human speech, synthetic speech, coded speech and written speech linked through discrete models by analysis, synthesis, recognition and coding; conversions labelled TtS, StT, StC and CtS]
Organization
Chair: Marcos Faúndez
Vice-Chair: Gernot Kubin
Secretary: Stephen McLaughlin
– WG1: Bastiaan Kleijn
– WG2: Bojan Petek
– WG3: Stephen McLaughlin
– WG4: Gerard Chollet
Countries
Austria Belgium Czech Republic France Germany Greece Ireland Italy Lithuania Portugal Slovakia Slovenia Spain Sweden Switzerland UK
Canada
Dissemination of info
e-mail distribution list:
Subscribe/unsubscribe [email protected]
Website:
http://www.ee.ed.ac.uk/cost277/
Future Meetings of the management committee
Publications and reports
International Journal of Control and Intelligent Systems, special issue on non-linear speech processing techniques and applications, ACTA Press. Invited editor: A. Hussain (COST-277 MC member)
Special sessions in EUSIPCO’02, IWANN’01, IWANN’03, EUSIPCO’04 (TBC)
COST Actions in Speech Technologies
COST 275: Biometrics-Based Recognition of People over the Internet
– Involves the use of both voice and face recognition for user authentication over the Internet
COST 277: Nonlinear speech processing
COST 278: Spoken Language Interaction in Telecommunications
– Improve knowledge regarding issues and problems related to spoken language interaction, including robustness and multi-lingual aspects
– Human-computer interaction using spoken language in multi-modal context, including dialogue theories and application evaluation
COST-277: A different approach
“The four classical areas of speech processing:
Speech Recognition (Speech-to-Text, StT)
Speech Synthesis (Text-to-Speech, TtS and Code-to-Speech, CtS)
Speech Coding (Speech-to-Code, StC with CtS) and
Speaker Verification and Identification (SV)
have all developed their own methodology almost independently from the neighboring areas. This has led to a plurality of tools and methods that are hard to integrate to any small multifunctional speech processing system (a mobile phone performing speaker verification and continuous speech recognition in addition to speech coding should have many separate processes running in parallel).
Relations between different fields
[Diagram (© ukl 2002): human speech, synthetic speech, coded speech and written speech linked through discrete models by analysis, synthesis, recognition and coding; conversions labelled TtS, StT, StC and CtS]
COST 277: Non-linear speech processing
PROGRESS REPORT
Period: from (June-2001) to (June-2003)
Speech coding
LINEAR PREDICTION
Scalar linear prediction: AR modeling of order P:
$x[n] = \sum_{i=1}^{P} a_i \, x[n-i] + e[n]$
where $a_i$ are the scalar prediction coefficients, obtained with the Levinson-Durbin recursion.
Vectorial linear prediction: AR-vector modeling of order P:
$\mathbf{x}[n] = \sum_{i=1}^{P} A_i \, \mathbf{x}[n-i] + \mathbf{e}[n]$
where $\{A_i\}_{i=1,\dots,P}$ are $m \times m$ matrices.
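For the scalar case, the Levinson-Durbin recursion named above can be sketched as follows. This is a minimal version under the sign convention x[n] = Σ a_i x[n-i] + e[n]; the function names are chosen for this example, and the autocorrelation estimate is the plain biased one without windowing.

```python
def autocorrelation(x, order):
    """Biased autocorrelation estimates r[0..order] of a frame x."""
    n = len(x)
    return [sum(x[t] * x[t - lag] for t in range(lag, n)) / n
            for lag in range(order + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for the predictor x[n] ~ sum_i a_i x[n-i].

    r holds autocorrelation values r[0..order].
    Returns (a, err): a[i-1] is the coefficient a_i, err is the final
    prediction error power.
    """
    a = []
    err = r[0]
    for m in range(1, order + 1):
        # Reflection coefficient for model order m.
        acc = r[m] - sum(a[i] * r[m - 1 - i] for i in range(m - 1))
        k = acc / err
        # Update lower-order coefficients, then append the new one.
        a = [a[i] - k * a[m - 2 - i] for i in range(m - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


# Autocorrelation of an ideal AR(1) process with coefficient 0.5.
coeffs, power = levinson_durbin([1.0, 0.5, 0.25], order=2)
print(coeffs, power)
```

For this AR(1) autocorrelation the second coefficient comes out as zero, as expected, since one past sample already explains all the predictable structure.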
NL SCALAR PREDICTION WITH NNET
[Figure: neural network with input, hidden and output layers; inputs x[n-p], x[n-p+1], ..., x[n-1]; output x[n]]
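The structure of such a predictor can be sketched as a one-hidden-layer network that maps the p previous samples to an estimate of x[n]. Everything here is an assumption made for illustration: the tanh activation, the layer sizes, the random untrained weights and the function names. The slides do not specify the network or its training.

```python
import math
import random


def init_mlp(p, hidden, seed=0):
    """Random, untrained weights for a p-input, one-output network."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(p)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def predict(params, past):
    """Estimate x[n] from past = [x[n-p], ..., x[n-1]]."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * x for w, x in zip(row, past)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def prediction_error(params, x, p):
    """Residual e[n] = x[n] - prediction, for n = p .. len(x)-1."""
    return [x[n] - predict(params, x[n - p:n]) for n in range(p, len(x))]


params = init_mlp(p=4, hidden=3)
signal = [math.sin(0.3 * n) for n in range(20)]
print(len(prediction_error(params, signal, p=4)))
```

In practice the weights would be trained to minimise the residual power on each frame or on a training set; only the forward pass is shown here.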
NL VECTORIAL PREDICTION WITH NNET
[Figure: neural network with input, hidden and output layers; inputs x[n-p], x[n-p+1], ..., x[n-1]; outputs x[n], x[n+1]]
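ADPCM NNET PREDICTION
[Block diagram: ADPCM coder. The prediction is subtracted from the input x[n] to give the error d[n], which is quantized (Q) into the code c[n] and dequantized (Q^-1); the outputs of MLP1, MLP2, ..., MLPN are combined (COM.) to form the prediction from the reconstructed signal]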
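The predictive coding loop can be sketched independently of the predictor. In the example below the predictor is a plug-in function operating on reconstructed samples, so the decoder can repeat the same predictions. The names are illustrative, the step size is fixed rather than adapted (a simplification of real ADPCM), and a trivial last-sample predictor stands in for the neural networks.

```python
import math


def quantize(d, step, bits):
    """Uniform quantizer returning an integer code clipped to `bits` bits."""
    levels = 2 ** bits
    code = int(round(d / step))
    return max(-levels // 2, min(levels // 2 - 1, code))


def dequantize(code, step):
    return code * step


def adpcm_encode(x, predictor, order, step=0.05, bits=3):
    history = [0.0] * order  # reconstructed samples, as seen by the decoder
    codes = []
    for sample in x:
        pred = predictor(history)
        code = quantize(sample - pred, step, bits)
        codes.append(code)
        history = history[1:] + [pred + dequantize(code, step)]
    return codes


def adpcm_decode(codes, predictor, order, step=0.05):
    history = [0.0] * order
    out = []
    for code in codes:
        recon = predictor(history) + dequantize(code, step)
        out.append(recon)
        history = history[1:] + [recon]
    return out


def last_sample_predictor(history):
    """Stand-in predictor: repeat the most recent reconstructed sample."""
    return history[-1]


x = [0.1 * math.sin(0.1 * n) for n in range(50)]
codes = adpcm_encode(x, last_sample_predictor, order=1)
y = adpcm_decode(codes, last_sample_predictor, order=1)
print(max(abs(a - b) for a, b in zip(x, y)))
```

Because prediction uses only reconstructed samples, quantization errors do not accumulate: as long as the quantizer does not clip, the reconstruction error stays within half a step.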
VECTORIAL NL-ADPCM RESULTS
[Figure: SEGSNR versus bits per sample (1 to 4) for vector dimensions 1D, 2D, 3D, 4D and 5D]
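SEGSNR (segmental signal-to-noise ratio) is commonly computed as the average of per-frame SNR values in dB. The sketch below follows that common definition; the frame length of 160 samples, the small constant that guards against division by zero, and the function name are assumptions, since the slides do not give the exact settings used.

```python
import math


def segsnr(reference, decoded, frame_len=160, eps=1e-12):
    """Mean of per-frame SNR values in dB over all complete frames.

    Requires at least one complete frame of `frame_len` samples.
    """
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        snrs.append(10.0 * math.log10((signal + eps) / (noise + eps)))
    return sum(snrs) / len(snrs)


reference = [1.0] * 320
decoded = [0.9] * 320
print(round(segsnr(reference, decoded), 3))
```

Averaging in the log domain gives quiet frames the same weight as loud ones, which is why the measure is often preferred over a single global SNR for speech coders.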
Very low bit rate speech coder
Demonstration
Broadcast news audio segmentation, classification, clustering and speech recognition
Demonstration
Available at http://193.126.86.80
SPEAKER RECOGNITION
Current systems rely on low-level information in speech.
– Short time extent analysis windows (20-30 ms)
– Spectral energy based (MFCC)
Another possibility: high-level information
– Speaking rate
– Pitch patterns
– Word/phrase usage
– Idiosyncratic pronunciation
SPEAKER RECOGNITION: Possibilities of NOLISP
Low-level information:
– Non-linear predictive models instead of LPCC
– Parameters: fractal, Lyapunov exponents, correlation dimension, etc.
High-level information:
– Take advantage of the other working groups. For instance, intonation is fundamental in speech synthesis and useful for speaker recognition.
Why use NL models?
By listening to the residual signal of an LPC analysis, it is possible to identify who is speaking.
– Usually the residual signal is discarded.
– NL models offer a better fit and a whiter residual signal.
NL models can offer an improvement in coding and synthesis, so there is room for speaker recognition improvement.
BANDWIDTH EXTENSION: An example of NL processing
A speech signal that has passed through the public switched telephone network (PSTN) generally has a limited frequency range, between 0.3 and 3.4 kHz.
Bandwidth extension algorithms aim to recover the lost low-frequency (0 – 0.3 kHz) and/or high-frequency (3.4 – 8 kHz) band from the narrow-band speech signal.
SPECTRAL BAND REPLICATION
[Figure: spectra between 0 and fs/2 at successive processing stages, from the initial to the final signal, including a low-pass filter (LPF) stage; frequency axis f in kHz]
BANDWIDTH EXTENSION
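Databases:
– Original fullband: [0.3, 7] kHz
– Narrow band: [0.3, 3.4] kHz, obtained from the fullband signal with a low-pass filter (LPF)
– Bandwidth extended: [0.3, 7] kHz, obtained from the narrow-band signal by bandwidth extension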
MIC database: DCF for several MELCEPS-l
[Figure: DCF versus l (8 to 26) with MELCEPS parameterization, for three signals: [0, 8] kHz, [0.3, 3.4] kHz, and [0.3, 8] kHz bandwidth extended (BWext)]
Bandwidth extension
For human beings, recognition is easier with full-band signals.
No new information is added.
Experimental results reveal that:
– The bandwidth extension algorithm does not introduce any damaging artifacts
– With MELCEPS parameterization, the results are better than with the narrow-band signal.