armor vox whitepaper_blacklist-fraud-detection_

Upload: aubrey-thomas

Post on 01-Jun-2015

Category:

Technology


AURAYA SYSTEMS

One Tara Boulevard | Nashua, New Hampshire 03062 | +1 603 123 7654 | twitter.com/armorvox | linkedin/in/armorvox

TECHNOLOGY WHITEPAPER

ArmorVox – “Black-List” Fraud Detection

ARMORVOX – “BLACK-LIST” FRAUD DETECTION

© 2012 Auraya Systems www.ArmorVox.com

The Problem

When the economy takes a “downturn”, there is one thing that can be guaranteed to take an “upturn”, and that is fraud. As Internet security has become stronger, criminals and fraudsters are increasingly targeting soft options, such as contact centers, to gain access to accounts and personal information. Using stolen personal information, which is readily available on the Web either through criminal groups or through the myriad of burgeoning social networking sites, a fraudster can easily gain access to millions of accounts simply by quoting the personal information to a contact center agent. The problem is that the agent cannot differentiate between a legitimate caller and the fraudster, especially when the correct personal information is being quoted. As the uptake of mobility increases and the problem spreads to other channels, such as the Internet and contact media like web chat, it is only going to get worse.

In this whitepaper, Auraya describes the application of its speaker adaptive voice authentication technology to provide a solution to this problem. The solution allows the contact center agent to take a call while, in the background, the speech provided by the caller during the conversation is compared against a “black-list” of known fraudsters and suspicious callers to see if the voice matches. Where there is a close match, the agent or call center manager can be notified and appropriate action taken.

Simulating Fraudulent Calls

Set up - To assess the effectiveness of “black-list” detection, Auraya set up a simulation. The first step was to configure the Auraya system to perform the “black-list” detection process. Figure 1 shows the architecture. In this arrangement, speech spoken by a caller is compared against each of the acoustic models of the fraudulent “black-list” speakers. The output from this process is a list of scores representing how well the speech matches each of the “black-list” acoustic models. The list is ranked in descending order and a threshold is set to detect whether the match is close enough to raise an alarm that the speaker’s voice matches the voice of one of the speakers in the “black-list”.
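
The ranking and threshold step described above can be sketched as follows. This is a minimal illustration, not Auraya's implementation: the function names, the example scores and the threshold value are all assumptions, and the scores are taken as already produced by some voice-matching engine.

```python
from typing import Dict, List, Optional, Tuple


def rank_blacklist_scores(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (blacklist_id, score) pairs sorted from best to worst match."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def check_caller(scores: Dict[str, float], threshold: float) -> Optional[Tuple[str, float]]:
    """Return the top-ranked black-list match if it reaches the threshold, else None."""
    ranked = rank_blacklist_scores(scores)
    if not ranked:
        return None
    best_id, best_score = ranked[0]
    return (best_id, best_score) if best_score >= threshold else None


# Hypothetical scores for one caller against three black-list models.
example_scores = {"101": 0.31, "102": 1.42, "103": 0.55}

# Hypothetical threshold; a real value would come from tuning on recorded calls.
alarm = check_caller(example_scores, threshold=1.0)
print(alarm)
```

A result of None means no alarm is raised; otherwise the returned pair names the black-list entry that triggered it, which is what the agent or call center manager would be shown.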

Figure 1. ArmorVox “black-list” Architecture

For this exercise, a speech database of some 200 speakers collected in a telephone call center environment was used. The simulation was set up in three stages:

Stage 1: “Black-List” Enrollment - This stage involved selecting ten speakers from the database to act as the “black-list” fraudsters. The selection was purely arbitrary, with no special circumstances, and includes both male and female speakers.

Table 1 shows the selection made:

“Black-List” Reference ID    Speaker ID (from Speech Database)
101                          106
102                          117
103                          120
104                          160
105                          294
106                          385
107                          386
108                          398
109                          415
110                          438

Table 1. Fraudster Selection

In this arrangement, speaker ID 106 from the database was enrolled as “black-list” reference 101, speaker ID 117 as “black-list” reference 102, and so on, to create the “black-list” of ten.

Stage 2: “Black-List” Detection - Once enrollment was complete, the second stage involved processing the database of 200 speakers (which included the “black-list” speakers) and systematically comparing each speaker against each of the “black-list” voiceprints. The results of this process were loaded into a database for further analysis.

The process generates two thousand authentication results, that is, 200 speakers each compared against the ten “black-list” enrollments. The results were then sorted into descending order, with the highest scores (closest matches) ranked at the top.
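
The exhaustive comparison can be sketched as below. The speaker labels, the enrollment mapping and the scoring function are all hypothetical stand-ins (the real scores come from the authentication engine); the sketch only shows how 200 speakers against ten enrollments yield 2000 results that are then sorted with the closest matches first.

```python
from typing import Callable, Dict, List, Tuple

Result = Tuple[str, str, float]  # (blacklist_id, speaker_id, score)


def compare_all(
    speakers: List[str],
    blacklist_ids: List[str],
    score_fn: Callable[[str, str], float],
) -> List[Result]:
    """Score every speaker against every black-list entry, best matches first."""
    results = [(b, s, score_fn(s, b)) for b in blacklist_ids for s in speakers]
    # Group by black-list ID, then sort each group by descending score.
    results.sort(key=lambda r: (r[0], -r[2]))
    return results


# Hypothetical labels: 200 speakers, ten of whom are enrolled on the black-list.
speakers = [f"spk{i:03d}" for i in range(200)]
enrolled: Dict[str, str] = {f"bl{j:02d}": speakers[j * 20] for j in range(10)}


def toy_score(speaker_id: str, blacklist_id: str) -> float:
    """Placeholder score: 1.0 for the enrolled speaker, 0.0 for everyone else."""
    return 1.0 if enrolled[blacklist_id] == speaker_id else 0.0


results = compare_all(speakers, list(enrolled), toy_score)
print(len(results))
```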

Table 2 shows an analysis for “black-list” ID 101 (which is speaker 106 from the database). In this table only the top ten closest matches are listed (from the 200 matches generated). The table shows that the highest score was generated by wave file 10611-2-1-1.wav, i.e. speaker ID 106 (with a score of 1.3374), with ID 214 (voice file 21411-2-1-1.wav) being the second closest match (with a score of 0.5512), and so on.

This test confirmed that, in a database of 200 callers, Auraya’s technology successfully detected ID 106 from the database as fraudster 101. Further, the fraudster’s match score was over double that of the second closest match (1.3374 against 0.5512), showing that there was a very positive authentication of the fraudster’s voice in this test.

Black-List ID    Speaker ID (Database Reference)    “Black-List” Raw Score    Delta
101              10611-2-1-1.wav                    1.3374                    0.7862
101              21411-2-1-1.wav                    0.5512                    0.7885
101              28711-2-1-1.wav                    0.5489                    0.8198
101              24744-2-1-1.wav                    0.5176                    0.8582
101              21311-2-1-1.wav                    0.4792                    0.9032
101              17444-2-1-1.wav                    0.4342                    0.9119
101              24644-2-1-1.wav                    0.4255                    0.9786
101              21813-2-1-1.wav                    0.3588                    1.0162
101              11044-2-1-1.wav                    0.3212                    1.041
101              12011-2-1-1.wav                    0.2964                    1.0693

Table 2. Results for “Black-List” ID 101, showing the voice file of speaker 106 to be the highest scoring match.
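
The margin between the fraudster and the other callers can be checked directly from the raw scores in Table 2. The short sketch below computes the ratio of the top score to the runner-up and the gap between the top score and each lower-ranked score; the variable names are illustrative only. The gaps it produces reproduce the first nine Delta entries, each one row above the score it is computed from. The paper does not define the Delta column, so that reading is an inference.

```python
# Raw scores from Table 2 (black-list ID 101), in ranked order.
table2_scores = [
    1.3374, 0.5512, 0.5489, 0.5176, 0.4792,
    0.4342, 0.4255, 0.3588, 0.3212, 0.2964,
]

top, runner_up = table2_scores[0], table2_scores[1]

# How many times larger the best match is than the second-best match.
ratio = top / runner_up

# Gap between the top score and every lower-ranked score, rounded to 4 places.
gaps_to_top = [round(top - score, 4) for score in table2_scores[1:]]

print(f"top / runner-up = {ratio:.2f}")
print(gaps_to_top)
```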

An analysis of the complete data set shows that the result achieved for ID 106 was consistent across all “black-list” IDs. That is, in each case, the Auraya technology was able to clearly identify the correct speaker from the total voice database against the corresponding “black-list” speaker. Table 3a shows the top three speaker IDs for each “black-list” ID. In every case the corresponding speaker ID was ranked number one in its data set.

The speech in this database is based on account numbers. A second test was run to see if the technology could reliably detect “black-list” speakers when they were quoting a different account number. This tested the performance of the technology in matching the voice quality, not the content of the speech files.

The results generated are detailed in Table 3b.

In all cases except “black-list” ID 110, the “black-list” speaker was successfully selected as the number one ranked match by a large margin, despite saying different information from the information enrolled: a clear and positive identification. In the case of 110, ID 513 was placed ahead of the real fraudster’s ID, which was 438. Note that ID 513 was ranked third in the initial test, suggesting that the voice of ID 513 is highly confusable with the voice of the nominated ID 438. Further, it would appear that “black-list” ID 110 also produced a weak voiceprint, resulting in a low match score in the first test (0.7403, compared to an average of around 1.5). This is something that we will come back to later in the business rule analysis.

Table 3a (first test)

Black-List ID    Speaker ID (Database Reference)    Raw Score    Ranking
101              10611-2-1-1.wav                    1.3374       First
101              21411-2-1-1.wav                    0.5512       Second
101              28711-2-1-1.wav                    0.5489       Third
102              11711-2-1-1.wav                    1.4163       First
102              29411-2-1-1.wav                    0.5328       Second
102              29311-2-1-1.wav                    0.4347       Third
103              12011-2-1-1.wav                    1.9814       First
103              28711-2-1-1.wav                    0.5514       Second
103              29011-2-1-1.wav                    0.4936       Third
104              16011-2-1-1.wav                    1.0953       First
104              12011-2-1-1.wav                    0.3928       Second
104              55644-2-1-1.wav                    0.2712       Third
105              29411-2-1-1.wav                    1.8086       First
105              22822-2-1-1.wav                    0.7991       Second
105              29311-2-1-1.wav                    0.6144       Third
106              38511-2-1-1.wav                    1.0417       First
106              38111-2-1-1.wav                    0.4209       Second
106              39411-2-1-1.wav                    0.3661       Third
107              38611-2-1-1.wav                    2.5391       First
107              45311-2-1-1.wav                    1.0601       Second
107              39611-2-1-1.wav                    0.9634       Third
108              39844-2-1-1.wav                    0.7882       First
108              82144-2-1-1.wav                    0.5045       Second
108              40011-2-1-1.wav                    0.5007       Third
109              41511-2-1-1.wav                    2.4704       First
109              61744-2-1-1.wav                    0.933        Second
109              41844-2-1-1.wav                    0.6561       Third
110              43844-2-1-1.wav                    0.7403       First
110              58644-2-1-1.wav                    0.4773       Second
110              51344-2-1-1.wav                    0.4204       Third

Table 3b (second test)

Black-List ID    Speaker ID (Database Reference)    Raw Score    Ranking
101              10611-2-2-1.wav                    1.6778       First
101              10930-2-2-1.wav                    0.7576       Second
101              17444-2-2-1.wav                    0.7552       Third
102              11711-2-2-1.wav                    1.2436       First
102              33644-2-2-1.wav                    0.4595       Second
102              10411-2-2-1.wav                    0.4266       Third
103              12011-2-2-1.wav                    1.6463       First
103              36144-2-2-1.wav                    0.6992       Second
103              10011-2-2-1.wav                    0.683        Third
104              16011-2-2-1.wav                    1.231        First
104              12011-2-2-1.wav                    0.5576       Second
104              21411-2-2-1.wav                    0.4029       Third
105              29411-2-2-1.wav                    1.8922       First
105              33644-2-2-1.wav                    0.5514       Second
105              34944-2-2-1.wav                    0.4285       Third
106              38511-2-2-1.wav                    0.9484       First
106              51344-2-2-1.wav                    0.6673       Second
106              38111-2-2-1.wav                    0.557        Third
107              38611-2-2-1.wav                    2.5511       First
107              39611-2-2-1.wav                    0.9081       Second
107              58644-2-2-1.wav                    0.8672       Third
108              39844-2-2-1.wav                    0.9732       First
108              41711-2-2-1.wav                    0.4509       Second
108              49644-2-2-1.wav                    0.441        Third
109              41511-2-2-1.wav                    1.6375       First
109              61744-2-2-1.wav                    0.7019       Second
109              39611-2-2-1.wav                    0.5715       Third
110              51344-2-2-1.wav                    0.8005       First
110              43844-2-2-1.wav                    0.6628       Second
110              81444-2-2-1.wav                    0.605        Third

Table 3a and 3b. Top three matches for each “Black-list” ID
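
The rank-one outcome of both tests can be verified from Table 1 and the first-ranked rows of Table 3. The sketch below assumes that the first three characters of a file name are the speaker ID, which is inferred from the examples in the text (10611-2-1-1.wav for speaker 106, 51344 for ID 513) rather than stated as a naming rule; the function and variable names are illustrative.

```python
# Table 1: black-list reference ID -> enrolled speaker ID.
ENROLLED = {
    101: "106", 102: "117", 103: "120", 104: "160", 105: "294",
    106: "385", 107: "386", 108: "398", 109: "415", 110: "438",
}

# First-ranked file for each black-list ID in the first test (Table 3a).
TOP_TEST_1 = {
    101: "10611-2-1-1.wav", 102: "11711-2-1-1.wav", 103: "12011-2-1-1.wav",
    104: "16011-2-1-1.wav", 105: "29411-2-1-1.wav", 106: "38511-2-1-1.wav",
    107: "38611-2-1-1.wav", 108: "39844-2-1-1.wav", 109: "41511-2-1-1.wav",
    110: "43844-2-1-1.wav",
}

# First-ranked file for each black-list ID in the second test (Table 3b).
TOP_TEST_2 = {
    101: "10611-2-2-1.wav", 102: "11711-2-2-1.wav", 103: "12011-2-2-1.wav",
    104: "16011-2-2-1.wav", 105: "29411-2-2-1.wav", 106: "38511-2-2-1.wav",
    107: "38611-2-2-1.wav", 108: "39844-2-2-1.wav", 109: "41511-2-2-1.wav",
    110: "51344-2-2-1.wav",
}


def speaker_id(filename: str) -> str:
    """Assumed convention: the speaker ID is the first three characters."""
    return filename[:3]


def rank1_misses(top_matches: dict) -> list:
    """Black-list IDs whose first-ranked file is not the enrolled speaker."""
    return [bl for bl, name in sorted(top_matches.items())
            if speaker_id(name) != ENROLLED[bl]]


misses_1 = rank1_misses(TOP_TEST_1)
misses_2 = rank1_misses(TOP_TEST_2)
print(misses_1, misses_2)
```

Under that naming assumption, the first test has no rank-one misses and the second test misses only black-list ID 110, where speaker 513 outranks the enrolled speaker 438.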

Stage 3: Business Rules Development - Showing that the technology works effectively for “black-list” detection is one thing; developing an effective set of rules that can take the results generated by the technology and turn them into a solution that provides a reliable alarm for the call center agent or operator is another.

One approach developed by Auraya for tuning authentication applications has been the use of “speaker space” analysis. In “speaker space” analysis, the “position” of each speaker in a “speaker space” can be plotted from the results generated by the authentication technology. Using this analysis, the “position” in the “space” of the non-“black-list” speakers can be compared to the “position” of the “black-list” speakers, and appropriate rules developed that maximize the separation of the speakers in the space.

Figure 2 shows this analysis. In this figure, blue dots represent non-“black-list” speakers, while the red diamonds represent the “black-list” speakers. In this analysis there are approximately 2000 non-“black-list” comparison results (blue dots) and ten “black-list” speech samples (red diamonds).

Figure 2. “Black-list” Scatter Analysis

This analysis demonstrates the distribution of the non-“black-list” speakers compared to the “black-list” speakers. From inspection of these distributions, a threshold (shown as the broken black line) can be developed as a prototype business rule that separates the non-“black-list” speakers from the “black-list” speakers.

Given this prototype business rule and the database used in the analysis, all “black-list” speakers would have been successfully detected, with four (out of 2000) non-“black-list” speakers being falsely detected as “black-list” speakers, i.e. false alarms. In this analysis, the false alarms are those blue dots that are above the business rule threshold. This equates to a fraud detection rate of 100% with a false alarm rate of 0.2%.
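
The two rates quoted above follow from simple counts. As a small worked check (the helper name is illustrative), ten of the ten “black-list” speakers detected and four false alarms out of 2000 give:

```python
def rate_percent(count: int, total: int) -> float:
    """Express count out of total as a percentage."""
    return 100.0 * count / total


# Ten black-list speakers, all detected by the prototype rule.
detection_rate = rate_percent(10, 10)

# Four false alarms out of the 2000 results, as quoted in the text.
false_alarm_rate = rate_percent(4, 2000)

print(f"detection rate: {detection_rate:.0f}%")
print(f"false alarm rate: {false_alarm_rate:.1f}%")
```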

Whilst this is good, we wanted to see if a business rule could be constructed that would reduce the false alarm rate to zero. The analysis shows that “black-list” IDs 108 and 110 are the most problematic and most confusable with the non-“black-list” speakers. A separate analysis of ID 110 only (shown in Figure 3) demonstrates that the current rule does reliably separate “black-list” ID 110 from all the non-“black-list” speakers, indicating that this rule would result in successful detection of this “black-list” fraudster with no false alarms.

Figure 3. Fraudster ID 110 Scatter Analysis

An analysis of “black-list” ID 107 (shown in Figure 4), which appears to generate a very strong match, placing the response towards the top right-hand corner of the speaker space, also shows that it is easily separated from non-“black-list” speakers.

Figure 4. Fraudster ID 107 Scatter Analysis

However, whilst the rule works effectively for “black-list” ID 110, applied to ID 107 it would generate a number of false alarms, as shown by the speakers circled. In fact, it appears that in this analysis all false alarms are associated with matches to “black-list” ID 107. In this case a successful business rule can be achieved by increasing the settings, essentially moving the threshold closer to the top right-hand corner of the speaker space.
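
The effect of raising the threshold for one “black-list” ID only can be illustrated with the first-test scores for IDs 107 and 110 from Table 3. This is a one-dimensional simplification on raw scores, whereas the paper's rule is a line drawn in a two-dimensional speaker space whose axes are not given here. The threshold values 0.7 and 1.5 are hypothetical, chosen only to show the mechanism.

```python
from typing import Dict, List, Tuple

# (blacklist_id, is_enrolled_fraudster, raw_score) from the first-test rows of Table 3.
ROWS: List[Tuple[int, bool, float]] = [
    (107, True, 2.5391), (107, False, 1.0601), (107, False, 0.9634),
    (110, True, 0.7403), (110, False, 0.4773), (110, False, 0.4204),
]


def evaluate(rows: List[Tuple[int, bool, float]],
             thresholds: Dict[int, float]) -> Tuple[int, int]:
    """Return (detections, false_alarms) for per-black-list-ID thresholds."""
    detections = sum(1 for bl, genuine, score in rows
                     if genuine and score >= thresholds[bl])
    false_alarms = sum(1 for bl, genuine, score in rows
                       if not genuine and score >= thresholds[bl])
    return detections, false_alarms


# Hypothetical shared threshold, low enough to catch the weak voiceprint of ID 110.
shared_rule = {107: 0.7, 110: 0.7}

# Hypothetical customized rule: the threshold for ID 107 alone is raised.
per_id_rule = {107: 1.5, 110: 0.7}

print(evaluate(ROWS, shared_rule))
print(evaluate(ROWS, per_id_rule))
```

With these illustrative values, the shared rule detects both fraudsters but raises two false alarms, both against ID 107; raising the threshold for ID 107 alone keeps both detections and removes the false alarms on the listed rows.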

Conclusions

The simulation clearly demonstrates the effectiveness of the speaker adaptive voice authentication technology in detecting “black-list” callers. By customizing the business rule for the individual “black-list” IDs, all “black-list” IDs are successfully detected as fraudsters with no false alarms.

As the economic downturn worsens and the problem of identity fraud intensifies, call center operators can rest assured that Auraya will continue to develop new technologies and new solutions to help enhance security. Auraya’s “black-list” detection solution not only enhances security and addresses the insidious problem of identity fraud, but does so unobtrusively in the background, enabling call center agents to focus on offering the best possible personal service, confident that the caller is indeed “who they say they are”.

Dr. Clive Summerfield is Auraya Systems’ Founder and Chief Executive Officer. Clive is an internationally recognized authority on voice technology and holds numerous patents in Australia, USA and UK in radar processing, speech chip design, speech recognition and voice biometrics.

As a former Founder Deputy Director of the National Centre for Biometric Studies (NCBS) at University of Canberra, in 2005 Clive undertook what was at the time the world’s largest scientific analysis of voice biometric systems, leading to the adoption of voice biometrics for secure services. That experience led Clive to found Auraya in 2006, a business exclusively focused on advanced voice biometric technologies for enterprise and cloud based services. Visit ArmorVox.com for Clive Summerfield’s full bio.

About Auraya Systems

Founded in 2006, Auraya Systems, the creator of the ArmorVox™ Speaker Identity System, is a global leader in the delivery of advanced voice biometric technologies for security and identity management applications in a wide range of markets including banks, government, and health services. Offices are located near Boston USA, Canberra and Sydney Australia. For more information, please visit www.armorvox.com.