![Page 1: i-vector based joint anti-spoofing and speaker verificationcostic1206.uvigo.es/sites/default/files/Meetings/Belgrade/Presentations/presentation... · i-vector based joint anti-spoofing](https://reader030.vdocuments.us/reader030/viewer/2022041203/5d500ef288c99387498bbd3d/html5/thumbnails/1.jpg)
i-vector based joint anti-spoofing and speaker verification
Tomi Kinnunen, Elie Khoury, Aleksandr Sizov, Zhizheng Wu, Sébastien Marcel
Contact: [email protected]
Spoofing attacks: Achilles’ heel of biometrics
• 2014: Samsung Galaxy S5 linked with user’s PayPal account, fake fingerprint
• 2011: HK->Canada passenger with fake face mask
• 2013: Apple iPhone 5S Touch ID, fake fingerprints
• 2014 book on the topic
Spoofing speaker verification
• “Sneakers” (1992)
IS IT RELEVANT IN REAL APPLICATIONS?
Increasing use of ASV in finance
• Barclays bank (48 million customers in 50 countries) [ http://www.computerweekly.com/news/2240179218/Barclays-streamlines-phone-banking-with-voice-biometrics ]
• Banco Santander México [ http://findbiometrics.com/road2bup-commerce-3-deployments-of-biometrics-in-finance ]
• National Australia Bank [ http://www.businessspectator.com.au/news/2012/11/21/technology/nab-speaks-loud-and-clear-voice-biometrics ]
• Australian Health Management [ http://www.zdnet.com/voice-biometrics-replaces-id-check-at-ahm-1339274060/ ]
• “Voice unlock” feature in Lenovo A586 phone [ http://hlt.i2r.a-star.edu.sg/site_media/news_articles/Digital_Life_Voiceprint_Tech_30_Jan_2013.pdf ]
Four ways to spoof ASV
“Tomi here, verify me!”
• Impersonation: mimicry by a human being
• Replay: replay of a previously-recorded utterance
• Text-to-Speech (TTS): generation of speech signal from text input
• Voice conversion: conversion of speaker identity of an utterance
[ Wu et al., “Spoofing and countermeasures for automatic speaker verification: a survey”, to appear in Speech Communication ]
Voice conversion demo
• Male-to-female: source, target, converted
• Use ~30 seconds to train conversion
• Convert spectrum only, retain F0
Voice conversion increases FAR

| Study | Speaker verification technique | Zero-effort FAR (%) | FAR after VC spoofing (%) |
|---|---|---|---|
| Perrot et al. 2005 | GMM-UBM | 16.0 | 40.0 |
| Matrouf et al. 2006 | GMM-UBM | 8.00 | 100.0 |
| Bonastre et al. 2007 | GMM-UBM | 6.61 | 55.0 |
| Kinnunen et al. 2012 | JFA | 3.24 | 17.33 |
| Wu et al. 2013 | i-vector PLDA | 2.99 | 41.25 |
| Wu et al. 2013 | Text-dep. HMM | 2.92 | 21.87 |
| Alegre et al. 2013 | i-vector PLDA | 3.03 | 55.00 |
| Kons et al. 2013 | Text-dep. HMM-NAP | 1.00 | 36.00 |

FAR: false acceptance rate
VC: voice conversion
GMM-UBM: Gaussian Mixture Model - Universal Background Model
JFA: Joint Factor Analysis
PLDA: Probabilistic Linear Discriminant Analysis
HMM: Hidden Markov Model
NAP: Nuisance Attribute Projection
Traditional approach: Independent ASV and spoofing detector systems
[Figure: separate ASV and spoofing detector pipelines; labels: MFCCs; hand-crafted features based on knowledge of the attacks]
[ Wu, Z., Kinnunen, T., Chng, E.S., Li, H., Ambikairajah, E., 2012b. “A study on spoofing attack in state-of-the-art speaker verification: the telephone speech case”, in: Proc. Asia-Pacific Signal Information Processing Association Annual Summit and Conference (APSIPA ASC) ]
[ Wu, Z., Li, H., 2013. “Voice conversion and spoofing attack on speaker verification systems”, in: Proc. Asia-Pacific Signal Information Processing Association Annual Summit and Conference (APSIPA ASC) ]
Could we use the same MFCC i-vector front-end?
• Simpler system
• Only one threshold
• Computational savings
i-Vector extraction
Utterance → MFCC extraction → GMM mean supervector extraction → i-vector

$$\mathbf{s} = \mathbf{m} + \mathbf{T}\boldsymbol{\phi}$$

• $\mathbf{s}$: utterance-dependent supervector (dimension 30720)
• $\mathbf{m}$: UBM supervector (dimension 30720), from the universal background model (UBM) with 512 Gaussians × 60 MFCCs
• $\mathbf{T}$: low-rank matrix (30720 x 400)
• $\boldsymbol{\phi}$: i-vector (dimension 400)
[ N. Dehak, P. Kenny, R. Dehak, P. Dumouchel, and P. Ouellet, “Front-end factor analysis for speaker verification,” IEEE Trans. on Audio, Speech, and Language Processing, vol. 19, no. 4, pp. 788–798, May 2011 ]
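The sizes on this slide follow from simple arithmetic: stacking 512 Gaussian means of 60 MFCC coefficients each gives a 30720-dimensional supervector, and the model s = m + Tφ maps a 400-dimensional i-vector to that space. The sketch below only illustrates this generative direction on a toy example; it is not the extractor itself (which estimates φ from an utterance), and the helper names are hypothetical.

```python
# Sizes quoted on the slide.
NUM_GAUSSIANS = 512
FEATURE_DIM = 60
IVECTOR_DIM = 400


def supervector_dim(num_gaussians, feature_dim):
    """One mean vector per Gaussian, stacked into a single long vector."""
    return num_gaussians * feature_dim


def synthesize_supervector(m, T, phi):
    """Return s = m + T * phi using plain lists (T is a list of rows)."""
    if len(T) != len(m) or any(len(row) != len(phi) for row in T):
        raise ValueError("shape mismatch between m, T and phi")
    return [m_k + sum(t_kj * p_j for t_kj, p_j in zip(row, phi))
            for m_k, row in zip(m, T)]


dim = supervector_dim(NUM_GAUSSIANS, FEATURE_DIM)
t_shape = (dim, IVECTOR_DIM)  # shape of the low-rank matrix T

# Toy example (illustrative values only): 4-dim supervector, 2-dim i-vector.
m_toy = [1.0, 2.0, 3.0, 4.0]
T_toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, -1.0]]
phi_toy = [0.5, -1.0]
s_toy = synthesize_supervector(m_toy, T_toy, phi_toy)
```

In a real system the direction is reversed: the i-vector is inferred from the utterance statistics given m and T, as described in the Dehak et al. reference.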
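Probabilistic Linear Discriminant Analysis (PLDA) modeling of i-vectors

$$\boldsymbol{\phi}_{ij} = \mathbf{V}\mathbf{y}_i + \mathbf{U}\mathbf{x}_{ij} + \boldsymbol{\varepsilon}_{ij}$$

• $\boldsymbol{\phi}_{ij}$: j:th i-vector of speaker i
• $\mathbf{V}\mathbf{y}_i$: between-speaker subspace $\mathbf{V}$, speaker factor $\mathbf{y}_i$
• $\mathbf{U}\mathbf{x}_{ij}$: within-speaker subspace $\mathbf{U}$, factors $\mathbf{x}_{ij}$
• $\boldsymbol{\varepsilon}_{ij}$: residual with $N(\mathbf{0}, \boldsymbol{\Sigma})$, diagonal covariance
[ S. J. D. Prince and J. H. Elder, “Probabilistic linear discriminant analysis for inferences about identity,” in IEEE ICCV, 2007, pp. 1–8 ]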
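To make the PLDA decomposition concrete, here is a minimal sketch that draws i-vectors from the generative model: a speaker term V y_i shared by all sessions of speaker i, a session term U x_ij, and a residual with diagonal covariance. Standard-normal priors on the factors are assumed, as in the cited PLDA formulation; the function names and toy dimensions are hypothetical.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def sample_plda(V, U, noise_std, num_speakers, sessions_per_speaker, rng):
    """Draw phi_ij = V y_i + U x_ij + eps_ij for every speaker i and session j."""
    dim = len(V)
    data = []
    for _ in range(num_speakers):
        # The speaker factor y_i is drawn once and reused for all sessions.
        y = [rng.gauss(0.0, 1.0) for _ in range(len(V[0]))]
        speaker_part = matvec(V, y)
        sessions = []
        for _ in range(sessions_per_speaker):
            # The within-speaker factor x_ij changes from session to session.
            x = [rng.gauss(0.0, 1.0) for _ in range(len(U[0]))]
            session_part = matvec(U, x)
            # Diagonal covariance: independent noise per dimension.
            eps = [rng.gauss(0.0, noise_std[d]) for d in range(dim)]
            sessions.append([speaker_part[d] + session_part[d] + eps[d]
                             for d in range(dim)])
        data.append(sessions)
    return data


# Toy usage: 3-dim "i-vectors", 1-dim speaker and session subspaces.
rng = random.Random(0)
V_toy = [[1.0], [0.5], [-1.0]]
U_toy = [[0.2], [0.0], [0.1]]
toy_data = sample_plda(V_toy, U_toy, [0.05, 0.05, 0.05], 2, 3, rng)
```

Verification with PLDA then asks whether two i-vectors are better explained by a shared y or by two independent ones; that scoring step is not shown here.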
Three use cases of PLDA
1. Stand-alone speaker verification
2. Stand-alone spoofing detection
3. Joint speaker verification and anti-spoofing
Requires additional i-vectors extracted from synthetic speech
“Synthetic” i-vector generation
• Original utterance → MFCC extraction → i-vector extraction → natural speech i-vectors
• Original utterance → MCEP or LPC vocoder → copy-synthesis utterance → MFCC extraction → i-vector extraction → vocoded speech i-vectors
[Figure: scatter of natural speech i-vectors and vocoded speech i-vectors]
Examples of i-vectors (reduced to 2-d with linear discriminant analysis)
[Figure panels: before length normalization; after length normalization]
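Length normalization is usually understood as scaling each i-vector to unit Euclidean norm, so that all vectors lie on a sphere. The slide does not spell out the exact variant used (some pipelines also whiten first), so the sketch below shows only the plain unit-norm scaling, with a hypothetical function name.

```python
import math


def length_normalize(ivector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in ivector))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [v / norm for v in ivector]


# Toy usage: after normalization the vector has length 1.
normalized = length_normalize([3.0, 4.0])
normalized_length = math.sqrt(sum(v * v for v in normalized))
```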
Experiments
• Text-independent set-up
• Subset of NIST 2006 core task trials
• Voice conversion spoofs: non-parallel frame alignment
• Mel-cepstrum (MCEP) & linear prediction (LPC) vocoders from the SPTK toolkit
• Joint-density GMM conversion of the spectral features
• Equalization of mean and variance of log-F0 (RAPT F0 extraction)
[ T. Kinnunen, Z.-Z. Wu, K. A. Lee, F. Sedlak, E. S. Chng, H. Li, “Vulnerability of Speaker Verification Systems Against Voice Conversion Spoofing Attacks: the Case of Telephone Speech”, Proc. ICASSP 2012, pp. 4401--4404, Kyoto, Japan, March 2012 ]
[ Z. Wu, T. Kinnunen, E.S. Chng, H. Li, E. Ambikairajah, “A Study on spoofing attack in state-of-the-art speaker verification: the telephone speech case”, Proc. APSIPA ASC 2012, pp. 1--5, Hollywood, USA, December 2012 ]
[ Speech Signal Processing Toolkit (SPTK), http://sp-tk.sourceforge.net/ ]
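The log-F0 equalization step can be illustrated with a short sketch: each voiced F0 value is standardized with the source speaker's log-F0 mean and standard deviation and then rescaled to the target speaker's statistics. This is the common mean-variance transform, assumed here rather than taken from the authors' code; the function names are hypothetical, and unvoiced frames are assumed to be marked with F0 = 0.

```python
import math
import statistics


def log_f0_stats(f0_track):
    """Mean and (population) standard deviation of log-F0 over voiced frames."""
    logs = [math.log(f) for f in f0_track if f > 0.0]
    return statistics.mean(logs), statistics.pstdev(logs)


def equalize_log_f0(source_f0, src_mean, src_std, tgt_mean, tgt_std):
    """Map source F0 so that its log-F0 mean and variance match the target's."""
    converted = []
    for f0 in source_f0:
        if f0 <= 0.0:
            converted.append(0.0)  # assumption: 0 marks an unvoiced frame
        else:
            z = (math.log(f0) - src_mean) / src_std
            converted.append(math.exp(tgt_mean + tgt_std * z))
    return converted


# Toy usage with made-up F0 tracks in Hz (0.0 = unvoiced).
source_track = [100.0, 120.0, 0.0, 140.0, 110.0]
target_track = [200.0, 220.0, 260.0, 0.0, 240.0]
src_mean, src_std = log_f0_stats(source_track)
tgt_mean, tgt_std = log_f0_stats(target_track)
converted_track = equalize_log_f0(source_track, src_mean, src_std, tgt_mean, tgt_std)
```

By construction, the voiced frames of `converted_track` have the target's log-F0 mean and standard deviation, while the shape of the source contour is kept.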
Database summary

| | Male | Female | Total |
|---|---|---|---|
| Target speakers | 241 | 342 | 583 |
| Genuine trials | 1,614 | 2,332 | 3,946 |
| Zero-effort impostor trials | 1,132 | 1,615 | 2,747 |
| Voice conversion impostors (MCEP) | 1,132 | 1,615 | 2,747 |
| Voice conversion impostors (LPC) | 1,132 | 1,615 | 2,747 |

ZERO-EFFORT SPOOF PROTOCOL: subset of the original NIST 2006 core task
MCEP SPOOF PROTOCOL: Voice conversion attack with
Mel-cepstral features and joint-density GMM conversion
LPC SPOOF PROTOCOL: Voice conversion attack with linear
prediction vocoder and joint-density GMM conversion
Stand-alone spoof detection (% correct)

| | Training samples | Attack samples | Cos | SVM | PLDA |
|---|---|---|---|---|---|
| A. | MCEP | MCEP | 92.2 | 91.7 | 91.8 |
| B. | LPC | MCEP | 53.0 | 53.6 | 53.1 |
| C. | MCEP | LPC | 98.3 | 98.3 | 98.7 |
| D. | LPC | LPC | 99.3 | 99.4 | 99.4 |

A & B: DEDICATED ATTACKER, “matched” MCEP vocoder with the recognizer features
C & D: SLOPPY ATTACKER, “mismatched” vocoder with recognizer features
Take SVM as a baseline spoofing detector
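The slide does not say how the cosine ("Cos") back-end was configured. One simple possibility, assumed here purely for illustration, is to compare a test i-vector against the mean natural i-vector and the mean synthetic i-vector and pick the closer one by cosine similarity; all names below are hypothetical and the toy vectors are made up.

```python
import math


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def is_spoof(test_ivector, natural_mean, synthetic_mean):
    """Flag the test i-vector as spoofed if it is closer to the synthetic mean."""
    return cosine(test_ivector, synthetic_mean) > cosine(test_ivector, natural_mean)


def percent_correct(trials, natural_mean, synthetic_mean):
    """trials: list of (ivector, true_is_spoof) pairs."""
    hits = sum(1 for vec, label in trials
               if is_spoof(vec, natural_mean, synthetic_mean) == label)
    return 100.0 * hits / len(trials)


# Toy 2-dim data standing in for i-vectors.
natural_mean = mean_vector([[1.0, 0.1], [0.9, -0.1]])
synthetic_mean = mean_vector([[0.1, 1.0], [-0.1, 0.9]])
toy_trials = [([1.0, 0.2], False), ([0.2, 1.0], True),
              ([0.8, 0.1], False), ([0.3, 0.9], True)]
toy_accuracy = percent_correct(toy_trials, natural_mean, synthetic_mean)
```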
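“Integrated PLDA”
Expand training set:

$$\Phi^{\text{natural}}_{1:n} = \{[\phi^{\text{nat}}_{11}, \phi^{\text{nat}}_{12}, \ldots, \phi^{\text{nat}}_{1s}], \ldots, [\phi^{\text{nat}}_{n1}, \phi^{\text{nat}}_{n2}, \ldots, \phi^{\text{nat}}_{ns}]\}$$

$$\Phi^{\text{mcep}}_{1:n} = \{[\phi^{\text{mcep}}_{11}, \phi^{\text{mcep}}_{12}, \ldots, \phi^{\text{mcep}}_{1s}], \ldots, [\phi^{\text{mcep}}_{n1}, \phi^{\text{mcep}}_{n2}, \ldots, \phi^{\text{mcep}}_{ns}]\}$$

$$\Phi^{\text{lpc}}_{1:n} = \{[\phi^{\text{lpc}}_{11}, \phi^{\text{lpc}}_{12}, \ldots, \phi^{\text{lpc}}_{1s}], \ldots, [\phi^{\text{lpc}}_{n1}, \phi^{\text{lpc}}_{n2}, \ldots, \phi^{\text{lpc}}_{ns}]\}$$

Two times more ‘speakers’ to train PLDA:
• Integrated PLDA (lpc): $\Phi^{\text{natural}}_{1:n} \cup \Phi^{\text{lpc}}_{1:n}$
• Integrated PLDA (mcep): $\Phi^{\text{natural}}_{1:n} \cup \Phi^{\text{mcep}}_{1:n}$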
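The training-set expansion can be sketched as a labelling step: every natural speaker keeps its own class, and the vocoded version of the same speaker becomes a separate class, so n speakers yield 2n PLDA training "speakers". The data layout and function name below are hypothetical; only the idea of treating synthetic speakers as new speakers comes from the slides.

```python
def expand_training_set(natural, synthetic):
    """Merge natural and synthetic i-vectors into one labelled training list.

    natural, synthetic: dict mapping speaker id -> list of i-vectors.
    Returns a list of ((speaker id, kind), i-vector) pairs, where the
    (speaker id, kind) tuple is the class label handed to PLDA training.
    """
    expanded = []
    for speaker, vectors in natural.items():
        expanded.extend(((speaker, "natural"), vec) for vec in vectors)
    for speaker, vectors in synthetic.items():
        # Same person, but a distinct class: the vocoded "speaker".
        expanded.extend(((speaker, "synthetic"), vec) for vec in vectors)
    return expanded


# Toy usage with made-up 2-dim vectors.
natural_ivectors = {"spk1": [[0.1, 0.2], [0.1, 0.3]], "spk2": [[0.9, 0.8]]}
vocoded_ivectors = {"spk1": [[0.2, 0.1]], "spk2": [[0.8, 0.9]]}
training_set = expand_training_set(natural_ivectors, vocoded_ivectors)
training_classes = {label for label, _ in training_set}
```

With this labelling, a vocoded trial resembles a different "speaker" than the claimed natural one, which is what lets ordinary PLDA scoring reject it.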
Interaction of speaker verification and anti-spoofing, FAR (%)

| System | Spoof detector training data | Zero-effort spoofs | MCEP spoofs | LPC spoofs |
|---|---|---|---|---|
| Baseline PLDA | -- | 1.76 | 6.13 | 10.84 |
| Score fusion | MCEP | 1.62 | 7.12 | 13.13 |
| Score fusion | LPC | 1.73 | 4.89 | 9.35 |
| Integrated PLDA | MCEP | 1.24 | 3.90 | 5.82 |
| Integrated PLDA | LPC | 1.42 | 5.94 | 2.97 |

Baseline PLDA: No countermeasures
Score fusion: SVM countermeasure + PLDA ASV, weights from natural utterances
Integrated PLDA: Expanded training set including synthetic i-vectors
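For reference, the FAR values reported here are percentages of impostor (zero-effort or spoofed) trials that the system wrongly accepts at a fixed decision threshold. A minimal sketch follows, assuming the convention that a trial is accepted when its score is at or above the threshold; the function name and the scores are made up for illustration.

```python
def false_acceptance_rate(impostor_scores, threshold):
    """Percentage of impostor trials accepted (score >= threshold)."""
    if not impostor_scores:
        raise ValueError("need at least one impostor trial")
    accepted = sum(1 for score in impostor_scores if score >= threshold)
    return 100.0 * accepted / len(impostor_scores)


# Toy usage: the same threshold applied to two sets of impostor scores.
zero_effort_scores = [-2.1, -1.4, -0.3, 0.9]
spoof_scores = [-0.5, 0.4, 1.1, 1.8]
far_zero_effort = false_acceptance_rate(zero_effort_scores, 0.5)
far_spoof = false_acceptance_rate(spoof_scores, 0.5)
```

Keeping the threshold fixed while swapping the impostor set is what makes the columns of the table comparable.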
Zero-effort FAR not affected much, good...
But is not systematically helpful in reducing spoof FAR
Pooled MCEP and LPC spoofs
[Figure panels: additional i-vectors: MCEP; additional i-vectors: LPC]
Summary
What? Use of i-vectors to do speaker verification & anti-spoofing
How? A simple “integrated PLDA” recipe
1. Create synthetic i-vectors by copy-synthesis
2. Treat synthetic speakers as new “speakers”
3. Score as usual
Worth further study!
• Does not solve the problem of wrong training vocoder
• Improvements not as impressive as with dedicated countermeasures, but a much simpler system
What next?
• More vocoders
• Impersonation, replay, synthesis attacks
• Other biometric modalities
Other recent topics
Foreign accent & regional dialect identification
• H. Behravan, V. Hautamäki, T. Kinnunen, “Factors Affecting i-Vector Based Foreign Accent Recognition: a Case Study in Spoken Finnish”, Speech Communication (to appear)
• H. Behravan, V. Hautamäki, S.M. Siniscalchi, T. Kinnunen, C.-H. Lee, “Introducing attribute features to foreign accent recognition”, Proc. ICASSP 2014
• H. Behravan, V. Hautamäki, S.M. Siniscalchi, E. Khoury, T. Kurki, T. Kinnunen, C.-H. Lee, “Dialect Levelling in Finnish: A Universal Speech Attribute Approach”, Proc. Interspeech 2014
Effect of human mimicry (impersonation)
• R. Gonzalez Hautamäki, T. Kinnunen, V. Hautamäki, A.-M. Laukkanen, “Comparison of human listeners and speaker verification systems using voice mimicry data”, Proc. Odyssey 2014: The Speaker & Language Recognition Workshop, pp. 137--144, Joensuu, Finland, June 2014
• R. Gonzalez Hautamäki, T. Kinnunen, V. Hautamäki, T. Leino, A.-M. Laukkanen, “I-vectors meet imitators: on vulnerability of speaker verification systems against voice mimicry”, Proc. Interspeech 2013, pp. 930--934, Lyon, France, August 2013
Recording device identification from speech
• C. Hanilçi and T. Kinnunen, “Source Cell-Phone Recognition from Recorded Speech Using Non-Speech Segments”, Digital Signal Processing (to appear)
Vocal effort compensation
• J. Pohjalainen, C. Hanilçi, T. Kinnunen, P. Alku, “Mixture Linear Prediction in Speaker Verification Under Vocal Effort Mismatch”, IEEE Signal Processing Letters, 21(12): 1516--1520, December 2014
Foreign accent detection
Data: Finnish national foreign language certificate (FSD) corpus
Finnish spoken utterances produced by foreigners
[ H. Behravan, V. Hautamäki, T. Kinnunen, “Factors Affecting i-Vector Based Foreign Accent Recognition: a Case Study in Spoken Finnish”, Speech Communication (to appear) ]
Thank you