AURAYA SYSTEMS
One Tara Boulevard | Nashua, New Hampshire 03062 | +1 603 123 7654 | twitter.com/armorvox | linkedin/in/armorvox
TECHNOLOGY WHITEPAPER
ArmorVox – “Black-List” Fraud Detection
ARMORVOX – “BLACK-LIST” FRAUD DETECTION
© 2012 Auraya Systems www.ArmorVox.com
The Problem
When the economy takes a “downturn”, one thing is guaranteed to take an “upturn”, and that is fraud.
As Internet security has become stronger, criminals and fraudsters are increasingly targeting soft
options, such as contact centers, to gain access to accounts and personal information. Using stolen
personal information, which is readily available on the Web either through criminal groups or through
the myriad of burgeoning social networking sites, a fraudster can easily gain access to millions of
accounts simply by quoting the personal information to a contact center agent. The problem is that the
agent cannot differentiate between a legitimate caller and the fraudster, especially when the correct
personal information is being quoted. As the uptake of mobility increases, the problem is spreading to
other channels such as the Internet and contact mediums like web chat, and it is only going to get
worse.
In this whitepaper, Auraya describes the application of its speaker adaptive voice authentication
technology to provide a solution to this problem. The solution allows the contact center agent to take a
call while, in the background, the speech provided by the caller during the conversation is compared
against a “black-list” of known fraudsters and suspicious callers to see if the voice matches. Where
there is a close match, the agent or call center manager can be notified and appropriate action taken.
Simulating Fraudulent Calls
Set up - To assess the effectiveness of black-list detection, Auraya set up a simulation. The first step
was to configure the Auraya system to perform the “black-list” detection process. Figure 1 shows the
architecture. In this arrangement speech spoken by a caller is compared against each of the acoustic
models of the fraudulent “black-list” speakers. The output from this process is a list of scores
representing how well the speech matches each of the “black-list” acoustic models. The list is ranked in
descending order and a threshold is set to detect if the match is close enough to raise an alarm that the
speaker’s voice matches the voice of one of the speakers in the “black-list”.
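
The ranking-and-threshold step described above can be sketched as follows. This is a minimal illustration only: the function names, the example scores and the threshold value are hypothetical, and the scores are assumed to have already been produced by the voice-matching engine, whose interface is not described in this whitepaper.

```python
from typing import Dict, List, Optional, Tuple


def rank_blacklist_scores(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (black-list ID, score) pairs sorted from best to worst match."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def check_alarm(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the best-matching black-list ID if its score meets the threshold, else None."""
    ranked = rank_blacklist_scores(scores)
    if not ranked:
        return None
    best_id, best_score = ranked[0]
    return best_id if best_score >= threshold else None


# Hypothetical scores for one caller against three black-list voiceprints.
caller_scores = {"101": 1.30, "102": 0.21, "103": 0.35}
print(rank_blacklist_scores(caller_scores))
print(check_alarm(caller_scores, threshold=1.0))
```

Only the top-ranked score is tested against the threshold, since an alarm on any lower-ranked entry would imply an alarm on the top one as well.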
Figure 1. ArmorVox “black-list” Architecture
For this exercise, a speech database of some 200 speakers collected in a telephone call
center environment was used. The simulation was set up in three stages:
Stage 1 - Black-List Enrollment - This stage involved selecting ten speakers from the
database to act as the “black-list” fraudsters. The selection was purely arbitrary, with no special
criteria applied, and includes both male and female speakers.
Table 1 shows the selection made:
“Black-List” Reference ID Speaker ID (from Speech Database)
101 106
102 117
103 120
104 160
105 294
106 385
107 386
108 398
109 415
110 438
Table 1. Fraudster Selection
In this arrangement, speaker ID 106 from the database was enrolled as “Black-list” reference
101, speaker ID 117 as “Black-list” reference 102, and so on, to create the “Black-list” of ten.
Stage 2 - “Black-List” Detection - Once enrollment was complete, the second stage involved processing
the database of 200 speakers (which included the “black-list” speakers) and systematically comparing
each speaker against each of the “black-list” voiceprints. The results of this process were loaded into
a database for further analysis.
The process generates two thousand authentication results, that is, 200 speakers each compared
against the ten “Black-list” enrollments. The results generated by this process were then sorted into
descending order, with the highest scores (closest matches) ranked at the top.
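
As a rough sketch of this cross-comparison, the snippet below scores every speaker against every black-list voiceprint and sorts the results. The scorer `dummy_score`, the identifiers and the result layout are all hypothetical placeholders, since the real scoring engine and database schema are not described here.

```python
from itertools import product
from typing import Callable, List, Tuple

Result = Tuple[str, str, float]  # (black-list ID, speaker ID, score)


def run_cross_comparison(
    blacklist_ids: List[str],
    speaker_ids: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Result]:
    """Score every speaker against every black-list voiceprint, best matches first."""
    results = [(b, s, score_fn(b, s)) for b, s in product(blacklist_ids, speaker_ids)]
    return sorted(results, key=lambda r: r[2], reverse=True)


def top_matches(results: List[Result], blacklist_id: str, n: int = 10) -> List[Result]:
    """Return the n highest-scoring results for one black-list ID."""
    return [r for r in results if r[0] == blacklist_id][:n]


def dummy_score(blacklist_id: str, speaker_id: str) -> float:
    """Placeholder scorer standing in for the real voice-matching engine."""
    return 1.0 if blacklist_id == speaker_id else 0.0


blacklist = [f"B{i}" for i in range(10)]
speakers = [f"S{i}" for i in range(190)] + blacklist  # 200 speakers in total
results = run_cross_comparison(blacklist, speakers, dummy_score)
print(len(results))  # 10 black-list IDs x 200 speakers
print(top_matches(results, "B0", n=3))
```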
Table 2 shows an analysis for “Black-List” ID 101 (which is speaker 106 from the database). In this
table only the top ten closest matches are listed (from the 200 matches generated). The table shows that
the highest score was generated by wave file 10611-2-1-1.wav, i.e. speaker ID 106 (with a score of
1.3374), with ID 214 and voice file 21411-2-1-1.wav being the second closest match (with a score of
0.5512), and so on.
This test confirmed that Auraya’s technology was successful, in a database of 200 callers, in detecting
ID 106 from the database as fraudster 101. Further, the fraudster’s match score was over double that of
the second closest match, showing that there was a very positive authentication of the fraudster’s voice
in this test.
Black List ID Speaker ID (Database Reference) “Black-List” Raw Score Delta
101 10611-2-1-1.wav 1.3374 0.7862
101 21411-2-1-1.wav 0.5512 0.7885
101 28711-2-1-1.wav 0.5489 0.8198
101 24744-2-1-1.wav 0.5176 0.8582
101 21311-2-1-1.wav 0.4792 0.9032
101 17444-2-1-1.wav 0.4342 0.9119
101 24644-2-1-1.wav 0.4255 0.9786
101 21813-2-1-1.wav 0.3588 1.0162
101 11044-2-1-1.wav 0.3212 1.041
101 12011-2-1-1.wav 0.2964 1.0693
Table 2. Results for “Black-List” ID 101 showing voice file of speaker 106 to be highest scoring match.
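
The margin between the top match and the rest can be checked directly from the raw scores in Table 2. The short sketch below is an illustration, not part of the original analysis; it computes the gap between the top score and each lower-ranked score, plus the ratio of the top score to the runner-up. The computed gaps coincide with the first nine entries of the Delta column, which suggests (as an inference, not something the table states) that each Delta value is the top score minus the next-ranked score.

```python
# Raw scores for black-list ID 101, copied from Table 2 (best match first).
scores = [1.3374, 0.5512, 0.5489, 0.5176, 0.4792,
          0.4342, 0.4255, 0.3588, 0.3212, 0.2964]

top = scores[0]

# Gap between the top score and each lower-ranked score.
gaps = [round(top - s, 4) for s in scores[1:]]

# Ratio of the best match to the runner-up.
ratio = top / scores[1]

print(gaps)
print(round(ratio, 2))
```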
An analysis of the complete data set shows that the result achieved for ID 106 was consistent across all
“Black-list” IDs. That is, in each case, the Auraya technology was able to clearly identify the correct
speaker from the total voice database against the corresponding “black-list” speaker. Table 3a shows
the top three speaker IDs for each “black-list” ID. In every case the corresponding speaker ID was
ranked number one in each data set.
The speech in this database consists of callers quoting account numbers. A second test was run to see
if the technology could reliably detect “black-list” speakers when they were quoting a different
account number. This tested the performance of the technology in matching the voice quality, not the
content of the speech files.
The results generated are detailed in Table 3b.
In all cases except ID 110, the “Black-List” speaker was successfully selected as the number one ranked
match by a large margin, despite the fact that the speaker was saying different information from the
information enrolled: a clear and positive identification. In the case of 110, ID 513 was placed ahead
of the real fraudster’s ID, which was 438. Note that ID 513 was ranked third in the initial test,
suggesting that the voice of ID 513 is highly confusable with the voice of the nominated ID 438.
Further, it would appear that “Black-list” ID 110 also produced a weak voiceprint, resulting in a low
match score in the first test (0.7403, compared to around 1.5). This is something that we will come back
to later in the business rule analysis.
Table 3a (first test)

Black List ID   Speaker ID (Database Reference)   Raw Score   Ranking
101             10611-2-1-1.wav                   1.3374      First
101             21411-2-1-1.wav                   0.5512      Second
101             28711-2-1-1.wav                   0.5489      Third
102             11711-2-1-1.wav                   1.4163      First
102             29411-2-1-1.wav                   0.5328      Second
102             29311-2-1-1.wav                   0.4347      Third
103             12011-2-1-1.wav                   1.9814      First
103             28711-2-1-1.wav                   0.5514      Second
103             29011-2-1-1.wav                   0.4936      Third
104             16011-2-1-1.wav                   1.0953      First
104             12011-2-1-1.wav                   0.3928      Second
104             55644-2-1-1.wav                   0.2712      Third
105             29411-2-1-1.wav                   1.8086      First
105             22822-2-1-1.wav                   0.7991      Second
105             29311-2-1-1.wav                   0.6144      Third
106             38511-2-1-1.wav                   1.0417      First
106             38111-2-1-1.wav                   0.4209      Second
106             39411-2-1-1.wav                   0.3661      Third
107             38611-2-1-1.wav                   2.5391      First
107             45311-2-1-1.wav                   1.0601      Second
107             39611-2-1-1.wav                   0.9634      Third
108             39844-2-1-1.wav                   0.7882      First
108             82144-2-1-1.wav                   0.5045      Second
108             40011-2-1-1.wav                   0.5007      Third
109             41511-2-1-1.wav                   2.4704      First
109             61744-2-1-1.wav                   0.933       Second
109             41844-2-1-1.wav                   0.6561      Third
110             43844-2-1-1.wav                   0.7403      First
110             58644-2-1-1.wav                   0.4773      Second
110             51344-2-1-1.wav                   0.4204      Third

Table 3b (second test)

Black List ID   Speaker ID (Database Reference)   Raw Score   Ranking
101             10611-2-2-1.wav                   1.6778      First
101             10930-2-2-1.wav                   0.7576      Second
101             17444-2-2-1.wav                   0.7552      Third
102             11711-2-2-1.wav                   1.2436      First
102             33644-2-2-1.wav                   0.4595      Second
102             10411-2-2-1.wav                   0.4266      Third
103             12011-2-2-1.wav                   1.6463      First
103             36144-2-2-1.wav                   0.6992      Second
103             10011-2-2-1.wav                   0.683       Third
104             16011-2-2-1.wav                   1.231       First
104             12011-2-2-1.wav                   0.5576      Second
104             21411-2-2-1.wav                   0.4029      Third
105             29411-2-2-1.wav                   1.8922      First
105             33644-2-2-1.wav                   0.5514      Second
105             34944-2-2-1.wav                   0.4285      Third
106             38511-2-2-1.wav                   0.9484      First
106             51344-2-2-1.wav                   0.6673      Second
106             38111-2-2-1.wav                   0.557       Third
107             38611-2-2-1.wav                   2.5511      First
107             39611-2-2-1.wav                   0.9081      Second
107             58644-2-2-1.wav                   0.8672      Third
108             39844-2-2-1.wav                   0.9732      First
108             41711-2-2-1.wav                   0.4509      Second
108             49644-2-2-1.wav                   0.441       Third
109             41511-2-2-1.wav                   1.6375      First
109             61744-2-2-1.wav                   0.7019      Second
109             39611-2-2-1.wav                   0.5715      Third
110             51344-2-2-1.wav                   0.8005      First
110             43844-2-2-1.wav                   0.6628      Second
110             81444-2-2-1.wav                   0.605       Third

Table 3a and 3b. Top three matches for each “Black-list” ID
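
The rank-one outcome of both tests can be verified mechanically from the tables. The sketch below is illustrative only and rests on one assumption inferred from the text rather than stated in it: that the first three digits of each file name are the speaker ID (for example, 51344-2-2-1.wav belongs to ID 513). The enrolled speakers come from Table 1 and the top-ranked files from Tables 3a and 3b.

```python
# Enrolled speaker for each black-list ID (Table 1).
enrolled = {101: "106", 102: "117", 103: "120", 104: "160", 105: "294",
            106: "385", 107: "386", 108: "398", 109: "415", 110: "438"}

# Top-ranked file for each black-list ID in the first test (Table 3a).
top_3a = {101: "10611-2-1-1.wav", 102: "11711-2-1-1.wav", 103: "12011-2-1-1.wav",
          104: "16011-2-1-1.wav", 105: "29411-2-1-1.wav", 106: "38511-2-1-1.wav",
          107: "38611-2-1-1.wav", 108: "39844-2-1-1.wav", 109: "41511-2-1-1.wav",
          110: "43844-2-1-1.wav"}

# Top-ranked file for each black-list ID in the second test (Table 3b).
top_3b = {101: "10611-2-2-1.wav", 102: "11711-2-2-1.wav", 103: "12011-2-2-1.wav",
          104: "16011-2-2-1.wav", 105: "29411-2-2-1.wav", 106: "38511-2-2-1.wav",
          107: "38611-2-2-1.wav", 108: "39844-2-2-1.wav", 109: "41511-2-2-1.wav",
          110: "51344-2-2-1.wav"}


def speaker_of(filename: str) -> str:
    """Assumed convention: the first three digits of the file name are the speaker ID."""
    return filename[:3]


def misses(top_files: dict) -> list:
    """Black-list IDs whose top-ranked file does not belong to the enrolled speaker."""
    return [b for b, f in top_files.items() if speaker_of(f) != enrolled[b]]


print(misses(top_3a))  # first test
print(misses(top_3b))  # second test
```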
Stage 3 - Business Rules Development - Showing that the technology works effectively for
“Black-List” detection is one thing. Developing an effective set of rules that can take the results
generated by the technology and turn them into a solution that provides a reliable alarm for the call
center agent or operator is another.
One approach developed by Auraya for tuning authentication applications has been the use
of “speaker space” analysis. In “speaker space” analysis, the “position” of each speaker in a “speaker
space” can be plotted from the results generated by the authentication technology. Using this analysis,
the “position” in the “space” of the non-“black-list” speakers can be compared to the “position” of the
“black-list” speakers, and appropriate rules developed that maximize the separation of the speakers in
the space.
Figure 2 shows this analysis. In this figure, blue dots represent non-“black-list” speakers, while the
red diamonds represent the “black-list” speakers. In this analysis there are approximately 2000
non-“black-list” speakers (blue dots) and ten “black-list” speech samples (red diamonds).
Figure 2. “Black-list” Scatter Analysis
This analysis demonstrates the distribution of the non-“black-list” speakers compared to the
“black-list” speakers. From inspection of these distributions, a threshold (shown as the broken black
line) can be developed as a prototype business rule that separates the non-“black-list” speakers from
the “black-list” speakers.
Given this prototype business rule and the database used in the analysis, all “black-list” speakers
would have been successfully detected, with four (out of 2000) non-“black-list” speakers being falsely
detected as “black-list” speakers, i.e. false alarms. In this analysis, the false alarms are those blue
dots that are above the business rule threshold. This equates to a fraud detection rate of 100% with a
false alarm rate of 0.2%.
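
The two rates quoted above follow from simple counting. The sketch below shows the arithmetic under a deliberate simplification: it applies a single score threshold, whereas the prototype rule in Figure 2 is a line drawn in the two-dimensional speaker space. The function name and the synthetic scores are hypothetical and merely mirror the reported counts (10 of 10 black-list samples above the rule, 4 of 2000 others above it).

```python
from typing import Sequence, Tuple


def detection_and_false_alarm_rates(
    blacklist_scores: Sequence[float],
    other_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (detection rate, false-alarm rate) for a single score threshold."""
    detected = sum(s >= threshold for s in blacklist_scores)
    false_alarms = sum(s >= threshold for s in other_scores)
    return detected / len(blacklist_scores), false_alarms / len(other_scores)


# Synthetic data mirroring the reported counts, not the actual experimental scores.
blacklist_scores = [1.0] * 10
other_scores = [0.9] * 4 + [0.1] * 1996
det, fa = detection_and_false_alarm_rates(blacklist_scores, other_scores, threshold=0.7)
print(f"{det:.0%} detection, {fa:.1%} false alarms")
```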
Whilst this result is good, we wanted to see whether a business rule could be constructed that would
reduce the false alarm rate to zero. The analysis shows that black-list IDs 108 and 110 are the most
problematic and most confusable with the non-“black-list” speakers. A separate analysis of ID 110 only
(shown in Figure 3) demonstrates that the current rule does reliably separate “black-list” ID 110 from
all the non-“black-list” speakers, indicating that this rule would result in successful detection of
this “black-list” fraudster with no false alarms.
Figure 3. Fraudster ID 110 Scatter Analysis
An analysis of “black-list” ID 107 (shown in Figure 4), which generates a very strong match that places
its response towards the top right-hand corner of the speaker space, also shows that it is easily
separated from non-“black-list” speakers.
Figure 4. Fraudster ID 107 Scatter Analysis
However, whilst the rule works effectively for “black-list” ID 110, applying the same rule to ID 107
would generate a number of false alarms, as shown by the speakers circled. In fact, it appears that in
this analysis all false alarms are associated with matches to “black-list” ID 107. In this case a
successful business rule can be achieved by increasing the settings, essentially moving the threshold
closer to the top right-hand corner of the speaker space.
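
One way to picture a rule that is tuned separately for each “black-list” ID is a per-ID threshold table. The sketch below is a one-dimensional simplification on raw scores, not the speaker-space rule used in the figures. The midpoint heuristic, the function names and the default threshold are assumptions made for illustration; the scores are the first- and second-ranked values for IDs 107 and 110 from Table 3a.

```python
from typing import Dict, Tuple

# (enrolled speaker's score, highest other score) per black-list ID, from Table 3a.
table_3a_top_two: Dict[int, Tuple[float, float]] = {
    107: (2.5391, 1.0601),
    110: (0.7403, 0.4773),
}


def midpoint_thresholds(top_two: Dict[int, Tuple[float, float]]) -> Dict[int, float]:
    """Assumed heuristic: put each ID's threshold halfway between its two top scores."""
    return {bid: (target + other) / 2 for bid, (target, other) in top_two.items()}


def raises_alarm(blacklist_id: int, score: float,
                 thresholds: Dict[int, float], default: float = 1.0) -> bool:
    """Apply the threshold tuned for this black-list ID, or a default if none is set."""
    return score >= thresholds.get(blacklist_id, default)


thresholds = midpoint_thresholds(table_3a_top_two)
print(raises_alarm(110, 0.7403, thresholds))  # enrolled speaker for ID 110
print(raises_alarm(107, 1.0601, thresholds))  # runner-up for ID 107
```

With these numbers, a single shared threshold low enough to catch the weak ID 110 voiceprint would also flag the runner-up for ID 107, which is why a higher setting for ID 107 removes those false alarms without losing ID 110.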
Conclusions
The simulation clearly demonstrates the effectiveness of the speaker adaptive voice authentication
technology in detecting “black-list” callers. By customizing the business rule for each “black-list”
ID, all “black-list” IDs are successfully detected as fraudsters with no false alarms.
As the economic downturn worsens and the problem of identity fraud intensifies, call center operators
can rest assured that Auraya will continue to develop new technologies and new solutions to help
enhance security. Auraya’s “black-list” detection solution not only enhances security and addresses the
insidious problem of identity fraud, but does this unobtrusively in the background, enabling call center
agents to focus on offering the best possible personal service, confident that the caller is indeed
“who they say they are”.
Dr. Clive Summerfield is Auraya Systems’ Founder and Chief Executive Officer.
Clive is an internationally recognized authority on voice technology and holds numerous
patents in Australia, USA and UK in radar processing, speech chip design, speech
recognition and voice biometrics.
As a former Founder Deputy Director of the National Centre for Biometric Studies (NCBS) at the
University of Canberra, Clive undertook in 2005 what was at the time the world’s largest scientific
analysis of voice biometric systems, leading to the adoption of voice biometrics for secure services.
That experience led Clive to found Auraya in 2006, a business exclusively focused on advanced voice
biometric technologies for enterprise and cloud based services. Visit ArmorVox.com for Clive
Summerfield’s full bio.
About Auraya Systems
Founded in 2006, Auraya Systems, the creator of the ArmorVox™ Speaker Identity System, is a global
leader in the delivery of advanced voice biometric technologies for security and identity management
applications in a wide range of markets including banks, government, and health services. Offices are
located near Boston USA, Canberra and Sydney Australia. For more information, please
visit www.armorvox.com.