Managing the Business Risk of Fraud using Sampling and Data Mining
EZ-R Stats, LLC
Fall 2009
Mike Blakley
Presented to:

Post on 19-Dec-2015

Managing the business risk of fraud EZ-R Stats, LLC

PWC Global Survey – Nov, 2009: “Economic crime in a downturn”

Sharp rise in accounting fraud over the past 12 months

Accounting fraud had grown to 38 percent of the economic crimes in 2009

Employees face increased pressures to:
– meet performance targets
– keep their jobs
– keep access to funding

Survey findings

Greater risk of fraud due to increased incentives or pressures

More opportunities to commit fraud, partially due to reductions in internal finance staff

While companies are expecting more fraud, they have not done much

People who look for fraud are more likely to find it

Session objectives

Understand the framework for managing the business risk of fraud

Plan, perform and explain statistical sampling in audits
Reduce audit costs using data mining, sequential sampling and other sampling techniques
Apply SAS 56, the new SAS suite and the revised (2007) Yellow Book.
Run, hands-on, the most productive analytic technique (regression analysis).
Use data mining to introduce greater efficiency into the audit process, without losing effectiveness.

Session agenda - 1

Introduction and the Process for Managing the Business Risk of Fraud

Introductions All Around
Course Objectives
Framework of risk management for fraud
Fundamentals of data mining
Data mining: The Engine That Drives analysis
– Analytics and Regression
Sources of Analytics Data
Basic and Intermediate ARTs
SAS 56
IIA Practice Advisory 2320
The Yellow Book (2007 revision)
The Guide – “Managing the Business Risk of Fraud”

Session Agenda (cont’d) – Sampling refresher

Sampling
The sampling process
Sampling methods
RAT-STATS
– Random Numbers
– Determining Sample Size
– Case Study
– Attribute sampling
– Variable Sampling
– Case study
– Stratified Sampling
– Obtaining and Interpreting the results
Other Sampling Approaches
DCAA Audit Package
Sequential Sampling
Overview of the process
Attribute Sampling
Variable Sampling

Session Agenda (cont’d) – Linear regression as an audit tool

Regression Analysis
Overview
Terms
Statistical basis
Charting Regression … Seeing Is Believing
Plotting Data
– Inserting a “Trend line”
Statistical Intervals
– Confidence Intervals
– Prediction Intervals
– Calculation of Statistical “Confidence Bounds”

Case Study - Wake County Schools Bus Maintenance

Session Agenda (cont’d) – Data mining, or How to test 100%

Overview
Statistical Basis
Data Conversion and Extraction
Data mining objectives
– Classification
– Trends
– Identification of extremes
– Major types of data analysis
  Numeric
  Date
  Text

Session Agenda (cont’d) – Excel as an Analytics tool

Macros
Tools – Data Analysis
The Macro facility
– Adding a little “class” to your audit
– VBA – “friend” or “foe”

Handout (CD)

CD with articles and software
PowerPoint presentation
More info at www.ezrstats.com

“Cockroach” theory of auditing

If you spot one roach….

There are probably 30 more that you don’t see…

Statistics based “roach” hunting

Many frauds coulda/woulda/shoulda been detected with analytics

Overview

Fraud patterns detectable with digital analysis

Basis for digital analysis approach

Usage examples
Continuous monitoring
Business analytics

The Why and How

Three brief examples
ACFE/IIA/AICPA Guidance Paper
Practice Advisory 2320-1
Auditors “Top 10”
Process Overview
Who, What, Why, When & Where

Objective 1

Example 1: Wake County Transportation Fraud

Supplier Kickback – School Bus parts

$5 million
Jail sentences
Period of years

Objective 1a

Too little too late

Understaffed internal audit
Software not used
Data on multiple platforms
Transaction volumes large

Objective 1a

Preventable

Need structured, objective approach

Let the data “talk to you”
Need efficient and effective approach

Objective 1a

Regression Analysis

Stepwise to find relationships
– Forwards
– Backwards
Intervals
– Confidence
– Prediction

Objective 1

Data outliers

Objective 1

Sometimes an “out and out Liar”

But how do you detect it?

Data Outliers

Plot transportation costs vs. number of buses

“Drill down” on costs
– Preventive maintenance
– Fuel
– Inspection

Objective 1

Scatter plot with prediction and confidence intervals
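
The idea behind this kind of plot can be sketched in a few lines of Python: fit a straight line to cost versus number of buses, put a prediction band around it, and flag any point that falls outside the band. This is only an illustration, not the presenter's tool. The data are hypothetical, the function names are made up for this sketch, and the band uses a normal quantile as an approximation (a Student's t quantile with n - 2 degrees of freedom would be more accurate for small samples).

```python
import math
from statistics import NormalDist, mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def prediction_interval(xs, ys, x_new, confidence=0.95):
    """Approximate prediction interval for one new observation at x_new.

    Assumption: a normal quantile stands in for the Student's t quantile.
    """
    n = len(xs)
    intercept, slope = fit_line(xs, ys)
    x_bar = mean(xs)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    residual_ss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(residual_ss / (n - 2))  # residual standard error
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    half_width = z * s * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    centre = intercept + slope * x_new
    return centre - half_width, centre + half_width

def flag_outliers(xs, ys, confidence=0.95):
    """Return the indices of points that fall outside the prediction band."""
    flagged = []
    for i, (x, y) in enumerate(zip(xs, ys)):
        low, high = prediction_interval(xs, ys, x, confidence)
        if not low <= y <= high:
            flagged.append(i)
    return flagged

# Hypothetical data: number of buses per depot and annual maintenance cost ($000).
buses = [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65]
costs = [52, 74, 101, 123, 152, 176, 330, 224, 251, 273, 302, 326]
print(flag_outliers(buses, costs))
```

A flagged point is a lead for follow-up ("drill down"), not proof of fraud.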

Cost of six types of AIDS drugs

[Chart: Total Cost of AIDS Drugs. x-axis: Drug Type (NDC1 to NDC6); y-axis: Dollar Amount (0 to 200)]

Example 2 Objective 1a

Medicare HIV Infusion Costs

Objective 1

CMS Report for 2005
South Florida - $2.2 Billion
Rest of the country combined - $0.1 Billion

Pareto Chart
Objective 1

[Chart: Medicare HIV Infusion Costs - 2005 ($Billions); data source: HHS CMS. x-axis: County; y-axis: Annual Medicare Costs; series: Pct, Cum Pct]
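
The two series on a Pareto chart (percent and cumulative percent) are simple to compute. The sketch below uses only the two regional totals quoted on the CMS slide; the county-level values behind the chart are not recoverable here, and the function name is an assumption made for illustration.

```python
def pareto_table(costs_by_group):
    """Return (group, cost, pct, cum_pct) rows sorted by cost, largest first."""
    total = sum(costs_by_group.values())
    rows = []
    running = 0.0
    ordered = sorted(costs_by_group.items(), key=lambda kv: kv[1], reverse=True)
    for group, cost in ordered:
        running += cost
        rows.append((group, cost, 100 * cost / total, 100 * running / total))
    return rows

# Figures from the CMS slide, in $ billions.
costs = {"Rest of the country combined": 0.1, "South Florida": 2.2}
for group, cost, pct, cum_pct in pareto_table(costs):
    print(f"{group:30s} {cost:5.1f} {pct:6.1f}% {cum_pct:6.1f}%")
```

On these figures a single region accounts for roughly 96% of the total, which is the kind of concentration a Pareto chart is meant to expose.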

Typical Prescription Patterns

[Chart: AIDS Drugs Prescription Patterns. x-axis: Prescriber (Prov 1 to Prov 6); y-axis: Dollar Value (0.0 to 60.0); series: NDC1 to NDC6]

Example 2 Objective 1a

Prescriptions by Dr. X

[Chart: Dr. X compared with Total Population. x-axis: Drug Type (NDC1 to NDC6); y-axis: Dollar Amount (0 to 350); series: Population, Dr. X]

Example 2 Objective 1a

Off-label use

Serostim
– Treat wasting syndrome, side effect of AIDS, OR
– Used by body builders for recreational purposes
– One physician prescribed $11.5 million worth (12% of the entire state)

Example 2 Objective 1a

Revenue trends

[Chart: Overall Revenue Trend. x-axis: Calendar Year (2001 to 2003); y-axis: Annual Billings (0.9 to 1.2); series: Overall, Linear (Overall)]

Example 3 Objective 1a

Dental Billings

[Chart: Rapid Increase in Revenues. x-axis: Calendar Year (2001 to 2003); y-axis: Annual Billings ($millions) (0 to 5); series: Billings A, Billings B, Linear (Billings A)]

Example 3 Objective 1a

Guidance Paper

A proposed implementation approach
“Managing the Business Risk of Fraud: A Practical Guide”
http://tinyurl.com/3ldfza
Five Principles
Fraud Detection
Coordinated Investigation Approach

Objective 1b

Managing the Business Risk of Fraud: A Practical Guide

ACFE, IIA and AICPA
Exposure draft issued 11/2007, final 5/2008

Section 4 – Fraud Detection

Objective 1b

Guidance Paper

Five Sections
– Fraud Risk Governance
– Fraud Risk Assessment
– Fraud Prevention
– Fraud Detection
– Fraud Investigation and corrective action

Risk Governance

Fraud risk management program
Written policy – management’s expectations regarding managing fraud risk

Risk Assessment

Periodic review and assessment of potential schemes and events

Need to mitigate risk

Fraud Prevention

Establish prevention techniques
Mitigate possible impact on the organization

Fraud Detection

Establish detection techniques for fraud
“Back stop” where preventive measures fail, or
Unmitigated risks are realized

Fraud Investigation and Corrective Action

Reporting process to solicit input on fraud
Coordinated approach to investigation
Use of corrective action

“60 Minutes” – “World of Trouble”

2/15/09 – Scott Pelley
– Fraud Risk Governance – “one grand wink-wink, nod-nod”
– Fraud Risk Assessment - categorically false
– Fraud Prevention – “my husband passed away”
– Fraud Detection - We didn't know? Never saw one.
– Fraud Investigation and corrective action - Pick-A-Payment losses $36 billion

Section 4 – Fraud Detection

Detective Controls
Process Controls
Anonymous Reporting
Internal Auditing
Proactive Fraud Detection

Objective 1b

Proactive Fraud Detection

Data Analysis to identify:
– Anomalies
– Trends
– Risk indicators

Objective 1b

Fraud Detective Controls

Operate in the background
Not evident in everyday business environment
These techniques usually –
– Occur in ordinary course of business
– Corroboration using external information
– Automatically communicate deficiencies
– Use results to enhance other controls

Examples of detective controls

Whistleblower hot-lines (DHHS and OSA have them)

Process controls (Medicaid audits and edits)
Proactive fraud detection procedures
– Data analysis
– Continuous monitoring
– Benford’s Law
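
As a rough illustration of a Benford's Law check, the sketch below compares the leading digits of a set of amounts with the Benford proportions log10(1 + 1/d) and summarizes the difference as a chi-square statistic. The function names and both data sets are hypothetical; 15.507 is the standard 5% critical value for 8 degrees of freedom.

```python
import math
from collections import Counter

def leading_digit(amount):
    """First significant digit of a non-zero amount."""
    digits = f"{abs(amount):.15e}"  # scientific notation, e.g. '4.25...e+03'
    return int(digits[0])

def benford_expected(digit):
    """Benford's Law probability that the leading digit equals `digit`."""
    return math.log10(1 + 1 / digit)

def benford_chi_square(amounts):
    """Chi-square statistic comparing observed leading digits to Benford's Law."""
    values = [a for a in amounts if a != 0]
    counts = Counter(leading_digit(a) for a in values)
    n = len(values)
    return sum(
        (counts.get(d, 0) - n * benford_expected(d)) ** 2 / (n * benford_expected(d))
        for d in range(1, 10)
    )

CRITICAL_5PCT_DF8 = 15.507  # chi-square critical value, 8 degrees of freedom

# Hypothetical inputs: powers of 2 track Benford's Law closely, while a run of
# amounts just under a 5,000 approval limit does not.
natural = [2 ** k for k in range(1, 301)]
suspicious = list(range(4900, 5000))
for label, amounts in (("natural", natural), ("suspicious", suspicious)):
    stat = benford_chi_square(amounts)
    print(label, round(stat, 1), "investigate" if stat > CRITICAL_5PCT_DF8 else "ok")
```

A large statistic only says the digits do not look Benford-like; some legitimate populations (assigned numbers, amounts with fixed limits) never follow the law.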

Specific Examples Cited

Journal entries – suspicious transactions

Identification of relationships
Benford’s Law
Continuous monitoring

Objective 1b

Data Analysis enhances ability to detect fraud

Identify hidden relationships
Identify suspicious transactions
Assess effectiveness of internal controls
Monitor fraud threats
Analyze millions of transactions

Objective 1b

Continuous Monitoring of Fraud Detection

Organization should develop ongoing monitoring and measurements

Establish measurement criteria (and communicate to Board)

Measurable criteria include:

Measurable Criteria – number of

fraud allegations
fraud investigations resolved
Employees attending annual ethics course
Whistle blower allegations
Messages supporting ethical behavior delivered by executives
Vendors signing ethical behavior standards

Management ownership of each technique implemented

Each process owner should:
– Evaluate effectiveness of technique regularly
– Adjust technique as required
– Document adjustments
– Report modifications needed for techniques which become less effective

Practice Advisory 2320-1: Analysis and Evaluation

International Standards for the Professional Practice of Internal Auditing

Analytical audit procedures
– Efficient and effective
– Useful in detecting
  Differences that are not expected
  Potential errors
  Potential irregularities

Analytical Audit Procedures

May include
– Study of relationships
– Comparison of amounts with similar information in the organization
– Comparison of amounts with similar information in the industry

Analytical audit procedures

Performed using monetary amounts, physical quantities, ratios or percentages

Ratio, trend and regression analysis
Period to period comparisons
Auditors should use analytical audit procedures in planning the engagement

Factors to consider

Significance of the area being audited
Assessment of risk
Adequacy of system of internal control
Availability and reliability of information
Extent to which procedures provide support for engagement results

Peeling the Onion

Population as Whole

Possible Error Conditions

Fraud Items

Objective 1c

Fraud Pattern Detection

Market Basket

Stratification

Trend Line

Holiday

Day of Week

Duplicates

Univariate

Gaps

Benford’s Law

Round Numbers

Target Group

Objective 1d
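
Three of the patterns listed above (duplicates, gaps and round numbers) are easy to sketch in code. The example below is only a minimal illustration: the invoice records, field names and function names are all hypothetical, and a real test would run over the full transaction file.

```python
from collections import Counter

def find_duplicates(records, key_fields):
    """Return the key tuples that occur more than once."""
    counts = Counter(tuple(r[f] for f in key_fields) for r in records)
    return [key for key, count in counts.items() if count > 1]

def find_gaps(numbers):
    """Return (start, end) ranges missing from a sequence of document numbers."""
    ordered = sorted(set(numbers))
    return [(a + 1, b - 1) for a, b in zip(ordered, ordered[1:]) if b - a > 1]

def is_round(amount, multiple=100):
    """True when the amount is an exact multiple of `multiple`."""
    return amount % multiple == 0

# Hypothetical invoice records.
invoices = [
    {"number": 1001, "vendor": "A", "amount": 1250.00},
    {"number": 1002, "vendor": "B", "amount": 5000.00},
    {"number": 1002, "vendor": "B", "amount": 5000.00},
    {"number": 1005, "vendor": "C", "amount": 317.42},
]

print(find_duplicates(invoices, ["vendor", "amount"]))
print(find_gaps([inv["number"] for inv in invoices]))
print([inv["amount"] for inv in invoices if is_round(inv["amount"])])
```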

Digital Analysis (5W)

Who
What
Why
Where
When

Objective 1e

A little about the basics of digital analysis….

Who Uses Digital Analysis

Traditionally, IT specialists
With appropriate tools, audit generalists (CAATs)
Growing trend of business analytics
Essential component of continuous monitoring

Objective 1e

What - Digital Analysis

Using software to:
– Classify
– Quantify
– Compare

Both numeric and non-numeric data

Objective 1e

How - Assessing fraud risk

Basis is quantification
Software can do the “leg work”
Statistical measures of difference
– Chi square
– Kolmogorov-Smirnov
– D-statistic

Specific approaches

Objective 1e
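
Two of the measures of difference named on this slide can be sketched briefly. Chi-square compares observed counts with the counts expected from a reference distribution; the Kolmogorov-Smirnov statistic, usually written D, is the largest gap between two cumulative distributions. The function names and the prescriber figures are hypothetical, chosen only to echo the "Dr. X compared with Total Population" example.

```python
from bisect import bisect_right

def chi_square(observed, expected_shares):
    """Chi-square statistic for observed counts against expected proportions."""
    total = sum(observed)
    return sum(
        (o - total * p) ** 2 / (total * p)
        for o, p in zip(observed, expected_shares)
    )

def ks_d_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Hypothetical: one prescriber's prescription counts across six drug codes,
# compared with the share of each code in the whole population.
dr_x_counts = [5, 4, 6, 120, 3, 2]
population_shares = [0.20, 0.15, 0.20, 0.10, 0.20, 0.15]
print(round(chi_square(dr_x_counts, population_shares), 1))

# Hypothetical: claim amounts for one provider versus a peer group.
print(ks_d_statistic([100, 120, 130, 900, 950], [90, 110, 125, 140, 160]))
```

Either number is a ranking device: the larger the statistic, the further the subject sits from the reference group and the higher it goes on the review list.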

Why - Advantages

Automated process
Handle large data populations
Objective, quantifiable metrics
Can be part of continuous monitoring
Can produce useful business analytics
100% testing is possible
Quantify risk
Repeatable process

Objective 1e

Why - Disadvantages

Costly (time and software costs)
Learning curve
Requires specialized knowledge

Objective 1e

When to Use Digital Analysis

Traditional – intermittent (one off)
Trend is to use it as often as possible
Continuous monitoring
Scheduled processing

Objective 1e

Where Is It Applicable?

Any organization with data in digital format, and especially if:
– Volumes are large
– Data structures are complex
– Potential for fraud exists

Objective 1e

Disadvantages of digital analysis

Cost
– Software
– Training
– Skills not widely available
Time consuming
– Development costs
– Testing resources

Objective 1 Summarized

Three brief examples
CFE Guidance Paper
“Top 10” Metrics
Process Overview
Who, What, Why, When & Where

Objective 1

Objective 1 - Summarized

Understand the framework for managing the business risk of fraud

Plan, perform and explain statistical sampling in audits
Reduce audit costs using data mining, sequential sampling and other sampling techniques
Apply SAS 56, the new SAS suite and the revised (2007) Yellow Book.
Run, hands-on, the most productive analytic technique (regression analysis).
Use data mining to introduce greater efficiency into the audit process, without losing effectiveness.

Next is plan, perform …

Statistical Sampling

Brief History / Timeline
Overview
Attribute Sampling – Compliance
Variable Sampling – Numeric Estimates

History of Sampling

Basis is two laws/theorems of probability
Law of Large Numbers
Central Limit Theorem

Law of large numbers

[Chart: Simulated rolling of dice. x-axis: Observation (1 to 85); y-axis: Value (0 to 7); series: Result, Average, Linear (Result)]
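
A simulation like the one charted here takes only a few lines. The sketch below rolls a fair die repeatedly and tracks the running average, which settles toward the expected value of 3.5 as the number of rolls grows. The function name and the seed are arbitrary choices for this illustration.

```python
import random

def running_average_of_dice(rolls, seed=2009):
    """Simulate fair die rolls and return the running average after each roll."""
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    total = 0
    averages = []
    for i in range(1, rolls + 1):
        total += rng.randint(1, 6)
        averages.append(total / i)
    return averages

averages = running_average_of_dice(20000)
for n in (10, 100, 1000, 20000):
    print(n, round(averages[n - 1], 3))
```

Early averages bounce around; later ones barely move. That stabilizing behavior is what lets an auditor estimate a population value from a sample.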

Time Line - LLN

Indian mathematician Brahmagupta, 600 AD
Italian mathematician Cardano, 1500s

Statement without proof that empirical statistics improve with more trials

Time line LLN (continued)

Jacob Bernoulli first to prove in 1713
Foundation for central limit theorem

Central limit theorem

Classic measure

Mean of a sufficiently large number of random samples will be approximately normally distributed.

The traditional explanation

Central Limit Theorem

See it in action today
Any population
Large number of samples
Average is “normally” distributed
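
One way to see the theorem in action is to draw many samples from a deliberately lopsided population and look at how the sample averages behave. The sketch below is a hypothetical illustration: the population (many small amounts, a few large ones), the sample size and the function name are all assumptions made for the demonstration.

```python
import random
from statistics import mean, stdev

def sample_means(population, sample_size, n_samples, seed=2009):
    """Draw repeated random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [mean(rng.choices(population, k=sample_size)) for _ in range(n_samples)]

# Hypothetical skewed population: 900 amounts of 10 and 100 amounts of 1,000.
population = [10] * 900 + [1000] * 100
means = sample_means(population, sample_size=50, n_samples=2000)

centre, spread = mean(means), stdev(means)
within = sum(abs(m - centre) <= 1.96 * spread for m in means) / len(means)
print(round(centre, 1), round(spread, 1), round(within, 3))
```

Although the population itself is far from bell-shaped, the sample averages cluster around the population mean (109 here), and the share within 1.96 standard deviations should land near the 95% a normal curve predicts.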

History of Central Limit Theorem

French mathematician Abraham de Moivre
1733 – approximate distribution from tossing coin (heads/tails)
Ho hum reaction
French mathematician Laplace – expanded it
Ho hum reaction

History of CLT (cont’d)

Russian mathematician Lyapunov
Proof in 1901
Same reaction

Industrial revolution

Manufacturing

Engineering

Excitement!

Student’s T

William Gosset - 1908

Guinness Brewery

SAS 39

Effective June, 1983
Exposure draft for revision in 2009

Attribute sampling

Buonaccorsi (1987)
Refined calculations
Few software packages use it

Overview

Sample size calculations
Attribute sampling
Variable sampling
Random number generators
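
For the random number step, the key audit requirement is that the selection can be reproduced. The sketch below is not the RAT-STATS generator; it uses Python's standard generator with a recorded seed, and the function name, population size and seed are hypothetical.

```python
import random

def select_sample(population_size, sample_size, seed):
    """Pick `sample_size` distinct item numbers from 1..population_size."""
    rng = random.Random(seed)  # a recorded seed makes the selection repeatable
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# Hypothetical: select 60 of 5,000 numbered vouchers.
selected = select_sample(5000, 60, seed=12345)
print(selected[:10])
```

Keeping the seed, population size and sample size in the workpapers lets a reviewer regenerate exactly the same list.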

Sample size calculation

It’s a guess…

Every package – different answer

Need to know the population

But that’s why you’re taking a sample!

Attribute Sampling

Using RAT-STATS

Unrestricted populations

Session Objectives

1. Understand what attribute sampling is and when to use it

2. Understand unrestricted populations

3. Overview of the process using RAT-STATS

4. Understand the formula behind the computations

Attribute sampling

“Attribute”

Compliance testing

Signatures on approval documents, attachment of supporting documentation, etc.

Statistical approach

Recommended

Economical

Efficient

Requires determination of a sample size

Overview of process

Determine the sampling objective

– Confidence

– Precision

Determine required sample size

Identify samples to be selected based upon random numbers

Pull the sample and test

Compute the sampling results (i.e. estimate of range)

How this is done in RAT-STATS

The sampling parameters are first developed by the auditor

RAT-STATS is used to compute sample size

RAT-STATS is used to generate random numbers

Pull the sample and test

Enter results in RAT-STATS to compute estimates

Step 1 – Develop sampling parameters

1. Size of population

2. Expected error rate

3. Required confidence

4. Required precision
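
The four parameters above are enough to approximate a sample size by hand. The sketch below is a minimal illustration that assumes the textbook normal-approximation formula with a finite population correction. It is not RAT-STATS's own calculation, and the function name `attribute_sample_size` and the example inputs are hypothetical.

```python
import math
from statistics import NormalDist


def attribute_sample_size(population, expected_rate, confidence, precision):
    """Approximate sample size for estimating an error rate.

    Assumes the normal approximation to the binomial plus a finite
    population correction; exact (hypergeometric) methods will differ.
    """
    # Two-sided z value for the requested confidence level (0.95 -> about 1.96)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Sample size that would be needed for an infinite population
    n0 = (z ** 2) * expected_rate * (1.0 - expected_rate) / precision ** 2
    # Reduce it to reflect the finite population
    n = n0 / (1.0 + (n0 - 1.0) / population)
    return min(population, math.ceil(n))


# Hypothetical inputs: 5,000 invoices, 5% expected error rate,
# 95% confidence, precision of plus or minus 2 percentage points
print(attribute_sample_size(5000, 0.05, 0.95, 0.02))
```

Because the expected error rate is itself a guess, it can help to run the calculation for a few plausible rates and see how much the sample size moves.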

Step 2 – Obtain the random numbers

Done by entering info into RAT-STATS

Output can be sent to a variety of destinations:

– Text File

– Excel

– Microsoft Access

– Print File
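
For readers without RAT-STATS at hand, the idea of this step can be sketched in a few lines. The example assumes items are numbered 1 through the population size and are selected without replacement; the function name `select_sample_items` and the seed value are hypothetical, and Python's generator is not the one RAT-STATS uses.

```python
import random


def select_sample_items(population_size, sample_size, seed):
    """Pick distinct item numbers from 1..population_size, reproducibly."""
    rng = random.Random(seed)  # a recorded seed lets the selection be re-created
    chosen = rng.sample(range(1, population_size + 1), sample_size)
    return sorted(chosen)      # sorted order makes pulling documents easier


# Hypothetical example: 30 items out of a population of 5,000
items = select_sample_items(5000, 30, seed=2009)
print(items)
```

Recording the seed in the workpapers is one way to let a reviewer regenerate exactly the same selection.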

Step 3 – Pull the sample

Each random number selected corresponds with an item

Put the selected item on a separate schedule

Step 4 - Test each selected item

Generally requires reviewing documents

Step 5 – Compute the results

Enter summary information into RAT-STATS

Output can be in a variety of formats:

– Excel

– Microsoft Access

– Text File

– Print File

– Printer
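
The kind of estimate produced in this step can be illustrated as follows. This is a sketch under stated assumptions: it uses a normal-approximation interval with a finite population correction, which will not match an exact calculation, and the function name `attribute_estimate` and the example counts are hypothetical.

```python
import math
from statistics import NormalDist


def attribute_estimate(population, sample_size, errors_found, confidence):
    """Return (point estimate, lower limit, upper limit) for the error rate."""
    p = errors_found / sample_size
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Standard error of a sample proportion, reduced for a finite population
    se = math.sqrt(p * (1.0 - p) / (sample_size - 1))
    se *= math.sqrt(1.0 - sample_size / population)
    lower = max(0.0, p - z * se)
    upper = min(1.0, p + z * se)
    return p, lower, upper


# Hypothetical results: 20 exceptions found in a sample of 400 from 5,000 items
rate, low, high = attribute_estimate(5000, 400, 20, 0.95)
print(f"rate {rate:.3f}, range {low:.3f} to {high:.3f}")
```

Multiplying the three rates by the population size converts them into an estimated number of exceptions in the population.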

That’s It!

Now we’ll see an actual demo using the RAT-STATS software

Excel population of 5,000 invoices

Results of test of attributes stored in the worksheet

Variable Sampling

Using RAT-STATS

Unrestricted populations

Session Objectives

1. Understand what variable sampling is and when to use it

2. Understand unrestricted populations

3. Overview of the process using RAT-STATS

4. Understand the formula behind the computations

Variable sampling

“Variable”

Estimating account balances

Estimating transaction totals

Statistical approach

Recommended

Economical

Efficient

Requires determination of a sample size

Overview of process

Determine the sampling objective

– Confidence

– Precision

Determine required sample size

Identify samples to be selected based upon random numbers

Pull the sample and test

Compute the sampling results (i.e. estimate of range)

How this is done in RAT-STATS

The sampling parameters are first developed by the auditor

RAT-STATS is used to compute sample size

RAT-STATS is used to generate random numbers

Pull the sample and test

Enter results in RAT-STATS to compute estimates

Step 1 – Develop sampling parameters

1. Probe sample

2. Statistical measure

3. Excel formula

Step 1 – Develop sampling parameters

1. Size of population

2. Average value

3. Standard deviation
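
A small probe sample is one way to obtain the average and standard deviation listed above before the real sample size is known. The sketch below assumes the common normal-approximation formula with a finite population correction; the function name `variable_sample_size`, the probe amounts and the precision figure are all hypothetical, and RAT-STATS may compute the size differently.

```python
import math
from statistics import NormalDist, mean, stdev


def variable_sample_size(population, std_dev, confidence, precision_per_item):
    """Approximate sample size for estimating an average amount.

    precision_per_item is the acceptable margin on the average item value,
    in the same units as std_dev.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    n0 = (z * std_dev / precision_per_item) ** 2  # infinite-population size
    n = n0 / (1.0 + n0 / population)              # finite population correction
    return min(population, math.ceil(n))


# Hypothetical probe sample of invoice amounts
probe = [120.0, 95.5, 210.0, 80.25, 150.0, 99.0, 175.5, 60.0]
average = mean(probe)   # comparable to Excel's AVERAGE
std_dev = stdev(probe)  # sample standard deviation, comparable to Excel's STDEV
print(average, std_dev, variable_sample_size(5000, std_dev, 0.95, 10.0))
```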

Step 2 – Obtain the random numbers

Done by entering info into RAT-STATS

Output can be sent to a variety of destinations:

– Text File

– Excel

– Microsoft Access

– Print File

Step 3 – Pull the sample

Each random number selected corresponds with an item

Put the selected item on a separate schedule

Step 4 - Test each selected item

Generally requires reviewing documents

Example data contains both “examined” and “audited” values.

Step 5 – Compute the results

Enter summary information into RAT-STATS

Output can be in a variety of formats:

– Excel

– Microsoft Access

– Text File

– Print File

– Printer
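
To make the computation in this step concrete, here is a minimal mean-per-unit sketch that projects audited sample values to a population total. It assumes a normal-approximation interval (a t-based interval would be somewhat wider for small samples); the function name `mean_per_unit_estimate` and the sample values are hypothetical and not taken from RAT-STATS.

```python
import math
from statistics import NormalDist, mean, stdev


def mean_per_unit_estimate(population, audited_values, confidence):
    """Return (point estimate, lower limit, upper limit) of the population total."""
    n = len(audited_values)
    average = mean(audited_values)
    # Standard error of the sample mean, reduced for a finite population
    se_mean = stdev(audited_values) / math.sqrt(n)
    se_mean *= math.sqrt(1.0 - n / population)
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    point = population * average
    margin = population * z * se_mean
    return point, point - margin, point + margin


# Hypothetical audited values for five sampled items out of 1,000
total, low, high = mean_per_unit_estimate(1000, [100.0, 110.0, 90.0, 105.0, 95.0], 0.95)
print(total, low, high)
```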

That’s It!

Now we’ll see an actual demo using the RAT-STATS software

Excel population of 5,000 invoices

Audited values stored in the worksheet

Attribute Sampling

Using RAT-STATS

Stratified populations

Session Objectives

1. Understand what stratification is and when to use it

2. Overview of the process using RAT-STATS

Stratified sampling

“Strata”

Homogeneous

More efficient in some instances

Overview of process

Separation into strata

Determine the sampling objective

– Confidence

– Precision

Determine required sample size

Identify samples to be selected based upon random numbers

Pull the sample and test

Compute the sampling results (i.e. estimate of range)

How this is done in RAT-STATS

The sampling parameters are first developed by the auditor

RAT-STATS is used to compute sample size

RAT-STATS is used to generate random numbers

Pull the sample and test

Enter results in RAT-STATS to compute estimates

Step 1 – Develop sampling parameters

1. Size of population

2. Expected error rate

3. Required confidence

4. Required precision

Step 2 – Obtain the random numbers

Done by entering info into RAT-STATS

Output can be sent to a variety of destinations:

– Text File

– Excel

– Microsoft Access

– Print File

Step 3 – Pull the sample

Each random number selected corresponds with an item

Put the selected item on a separate schedule

Step 4 - Test each selected item

Generally requires reviewing documents

Step 5 – Compute the results

Enter summary information into RAT-STATS

Output can be in a variety of formats:

– Excel

– Microsoft Access

– Text File

– Print File

– Printer

That’s It!

Now we’ll see an actual demo using the RAT-STATS software

Excel population of 5,000 invoices

Results of test of attributes stored in the worksheet

Variable Sampling

Using RAT-STATS

Stratified populations

Session Objectives

1. Understand what stratified sampling is and when to use it

2. Populations benefiting from stratified sampling

3. Overview of the process using RAT-STATS

4. Understand the formula behind the computations

Stratified variable sampling

“Stratified”

“Variable”

Estimating amounts

Narrower standard deviation

Overview of process

Determine the sampling objective

– Confidence

– Precision

Determine required sample size

Identify samples to be selected based upon random numbers

Pull the sample and test

Compute the sampling results (i.e. estimate of range)

How this is done in RAT-STATS

The sampling parameters are first developed by the auditor

RAT-STATS is used to compute sample size

RAT-STATS is used to generate random numbers

Pull the sample and test

Enter results in RAT-STATS to compute estimates

Step 1 – Develop sampling parameters

1. Probe sample

2. Statistical measure

3. Excel formula

Step 1 – Develop sampling parameters

1. Number of strata

2. Size of population

3. Average value

4. Standard deviation
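
Once each stratum's size and standard deviation are known, an overall sample has to be divided among the strata. The slides do not say which allocation rule is used, so the sketch below assumes Neyman allocation (sample in proportion to stratum size times standard deviation) purely as an illustration; the function name `neyman_allocation` and the strata figures are hypothetical.

```python
def neyman_allocation(strata, total_sample):
    """Split total_sample across strata in proportion to size * std_dev.

    strata is a list of (stratum_size, std_dev) pairs. Rounding means the
    allocations may not add up exactly to total_sample.
    """
    weights = [size * std_dev for size, std_dev in strata]
    total_weight = sum(weights)
    allocation = []
    for (size, _), weight in zip(strata, weights):
        n_h = round(total_sample * weight / total_weight)
        # Keep at least 2 items per stratum so a standard deviation can be
        # computed later, and never more than the stratum contains
        allocation.append(min(size, max(2, n_h)))
    return allocation


# Hypothetical three strata: many small items, fewer medium, a few large
strata = [(4000, 10.0), (900, 50.0), (100, 400.0)]
print(neyman_allocation(strata, 125))
```

The example shows why stratifying can be efficient: the 100 large, highly variable items receive as many sample items as the 4,000 small ones.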

Step 2 – Obtain the random numbers

Done by entering info into RAT-STATS

Multi-stage random numbers

Output can be sent to a variety of destinations:

– Text File

– Excel

– Microsoft Access

– Print File

Step 3 – Pull the sample

Each random number selected corresponds with an item in a stratum

Put the selected item on a separate schedule

Step 4 - Test each selected item

Generally requires reviewing documents

Example data contains both “examined” and “audited” values.

Step 5 – Compute the results

Enter summary information into RAT-STATS

Output can be in a variety of formats:

– Excel

– Microsoft Access

– Text File

– Print File

– Printer
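
For a stratified variable sample, the results from each stratum are projected separately and then combined. The sketch below assumes the standard stratified mean-per-unit estimator with a normal-approximation interval; the function name `stratified_total_estimate` and the sample values are hypothetical, and RAT-STATS output may differ in detail.

```python
import math
from statistics import NormalDist, mean, variance


def stratified_total_estimate(strata, confidence):
    """Return (point estimate, lower limit, upper limit) of the population total.

    strata is a list of (stratum_size, audited_values) pairs, with at least
    two audited values per stratum.
    """
    point = 0.0
    var_total = 0.0
    for size, audited_values in strata:
        n_h = len(audited_values)
        # Project this stratum's sample average to the whole stratum
        point += size * mean(audited_values)
        # Variance of that projection, with finite population correction
        var_total += size ** 2 * (1.0 - n_h / size) * variance(audited_values) / n_h
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    margin = z * math.sqrt(var_total)
    return point, point - margin, point + margin


# Hypothetical two strata: 100 small items and 10 large items
strata = [(100, [10.0, 12.0, 8.0]), (10, [100.0, 110.0, 90.0])]
print(stratified_total_estimate(strata, 0.95))
```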

That’s It!

Now we’ll see an actual demo using the RAT-STATS software

Excel population of 5,000 invoices

Divided into three strata

Audited values stored in the worksheet

Objective 2 - Summarized

Understand the framework for managing the business risk of fraud

Plan, perform and explain statistical sampling in audits

Reduce audit costs using data mining, sequential sampling and other sampling techniques

Apply SAS 56, the new SAS suite and the revised (2007) Yellow Book.

Run, hands-on, the most productive analytic technique (regression analysis).

Use data mining to introduce greater efficiency into the audit process, without losing effectiveness.

Next is cost reduction …

Techniques for cost reduction

Optimize sample size (most “bang” for the buck)

Skip sampling – review 100% of transactions using computer assisted audit techniques (CAATs)

Sample optimization

Sequential sampling

University of Hawaii

Banana aphids

Sequential sampling

Banana aphids

100% test using CAATs

Provides complete coverage

Best practice

Basis for continuous monitoring

Repeatable process

Objective 3 - Summarized

Understand the framework for managing the business risk of fraud

Plan, perform and explain statistical sampling in audits

Reduce audit costs using data mining, sequential sampling and other sampling techniques

Apply SAS 56, the new SAS suite and the revised (2007) Yellow Book.

Run, hands-on, the most productive analytic technique (regression analysis).

Use data mining to introduce greater efficiency into the audit process, without losing effectiveness.

Next is Yellow Book and SAS 56 …

Yellow book standards

Standards regarding statistical sampling and IT

General standards

3.43 Technical Knowledge and competence

“The staff assigned to conduct an audit or attestation engagement under GAGAS must collectively possess the technical knowledge, skills, and experience necessary to be competent for the type of work being performed before beginning work on that assignment.

The staff assigned to a GAGAS audit or attestation engagement should collectively possess:”

Stat sampling and IT

Skills appropriate for the work being performed. For example, staff or specialist skills in

(1) statistical sampling if the work involves use of statistical sampling;

(2) information technology

SAS 56 – Analytical procedures

Requires use of analytic review procedures for:

Audit planning

Overall review stages

SAS 56 – Analytical Review

Encourages use of analytical review

Provides guidance

“A wide variety of analytical procedures may be useful for this purpose.”

Objective 4 - Summarized

Understand the framework for managing the business risk of fraud

Plan, perform and explain statistical sampling in audits

Reduce audit costs using data mining, sequential sampling and other sampling techniques

Apply SAS 56, the new SAS suite and the revised (2007) Yellow Book.

Run, hands-on, the most productive analytic technique (regression analysis).

Use data mining to introduce greater efficiency into the audit process, without losing effectiveness.

Next is linear regression …

Next Metric

1. Outliers

2. Stratification

3. Day of Week

4. Round Numbers

5. Made Up Numbers

6. Market basket

7. Trends

8. Gaps

9. Duplicates

10. Dates

Trend Busters

Does the pattern make sense?

[Chart: ACME Technology. Sales and Employee Count plotted by Date; vertical axis: Amount, 0 to 30,000]

7 - Trends

Trend Busters

Linear regression

Sales are up, but cost of goods sold is down

“Spikes”

7 – Trends

Purpose / Type of Errors

Identify trend lines, slopes, etc.

Correlate trends

Identify anomalies

Key punch errors where the amount is off by an order of magnitude

7 – Trends

Linear Regression

Test relationships (e.g. invoice amount and sales tax)

Perform multi-variable analysis

7 – Trends

How is it done?

Estimate linear trends using “best fit”

Measure variability (standard errors)

Measure slope

Sort descending by slope, variability, etc.
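
The steps above can be sketched with ordinary least squares. The example is a minimal illustration: the function names `trend_line` and `rank_by_slope`, the account labels and the monthly amounts are hypothetical, and the standard error shown is the standard error of the slope, which is one of several variability measures that could be used.

```python
import math
from statistics import mean


def trend_line(xs, ys):
    """Least-squares best fit; returns (slope, intercept, std_err_of_slope)."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three points to measure variability")
    mean_x, mean_y = mean(xs), mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(sse / (n - 2)) / math.sqrt(sxx)
    return slope, intercept, std_err


def rank_by_slope(series_by_account):
    """Fit each account and sort descending by slope."""
    rows = []
    for account, (xs, ys) in series_by_account.items():
        slope, _, std_err = trend_line(xs, ys)
        rows.append((account, len(xs), slope, std_err))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Hypothetical monthly totals (month number, amount) for two accounts
data = {
    "A": ([1, 2, 3, 4], [100.0, 112.0, 119.0, 131.0]),
    "B": ([1, 2, 3, 4], [100.0, 92.0, 81.0, 70.0]),
}
for row in rank_by_slope(data):
    print(row)
```

Accounts at the bottom of the ranking, or with an unusually large standard error, are candidates for a closer look rather than proof of a problem.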

7 – Trends

Trend Lines by Account - Example Results

Generally the trend is gently sloping up, but two accounts (43870 and 54630) are different.

Account   N    Slope    Std Err
32451     18    1.230   0.87
43517     17    1.070   4.3
32451     27    1.023   0.85
43517     32    1.010   0.36
43870     23    0.340   2.36
54630     56   -0.560   1.89

7 – Trends

Scatter plot with prediction and confidence intervals

Objective 5 - Summarized

Understand the framework for managing the business risk of fraud

Plan, perform and explain statistical sampling in audits

Reduce audit costs using data mining, sequential sampling and other sampling techniques

Apply SAS 56, the new SAS suite and the revised (2007) Yellow Book.

Run, hands-on, the most productive analytic technique (regression analysis).

Use data mining to introduce greater efficiency into the audit process, without losing effectiveness.

Next is data mining …

Basis for Pattern Detection

Analytical review

Isolate the “significant few”

Detection of errors

Quantified approach

Objective 6

Understanding the Basis

Quantified Approach

Population vs. Groups

Measuring the Difference

Stat 101 – Counts, Totals, Chi Square and K-S

The metrics used

Objective 2

Quantified Approach

Based on measurable differences

Population vs. Group

“Shotgun” technique

Objective 2a

Detection of Fraud Characteristics

Something is different than expected

Objective 2a

Fraud patterns

Common theme – “something is different”

Groups

Group pattern is different than overall population

Objective 2b

Measurement Basis

Transaction counts

Transaction amounts

Objective 2c

A few words about statistics (the “s” word)

Detailed knowledge of statistics not necessary

Software packages do the “number-crunching”

Statistics used only to highlight potential errors/frauds

Not used for quantification

Objective 2d

How is digital analysis done?

Comparison of group with population as a whole

Can be based on either counts or amounts

Difference is measured

Groups can then be ranked using a selected measure

High difference = possible error/fraud

Objective 2d

Demo in Excel of the process

Based roughly on the Wake County Transportation fraud

Illustrates how the process works, using Excel

Histograms

Attributes tallied and categorized into “bins”

Counts or sums of amounts
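
As an illustration of binning, the sketch below tallies transactions into monthly bins, by count or by amount. The function name `monthly_histogram` and the transactions are hypothetical; the same function would be run once on the whole population and once on each group being compared.

```python
from collections import Counter
from datetime import date


def monthly_histogram(transactions, use_amounts=False):
    """Tally (date, amount) pairs into (year, month) bins.

    Returns counts per bin by default, or summed amounts when
    use_amounts is True.
    """
    bins = Counter()
    for tx_date, amount in transactions:
        key = (tx_date.year, tx_date.month)
        bins[key] += amount if use_amounts else 1
    return dict(sorted(bins.items()))


# Hypothetical transactions
txs = [
    (date(2007, 1, 5), 100.0),
    (date(2007, 1, 20), 50.0),
    (date(2007, 3, 1), 25.0),
]
print(monthly_histogram(txs))
print(monthly_histogram(txs, use_amounts=True))
```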

Objective 2d

Two histograms obtained

Population and group

[Charts: “Population” histogram by month, Jan-07 to Dec-07, vertical axis 0 to 700; “Group” histogram by month, Jan-07 to Dec-07, vertical axis 0 to 80]

Objective 2d

Compute Cumulative Amount for each

[Figure: “Count by Month” chart, Jan-07 through Dec-07. Axes: Month, Count (0 to 80) and Cum Pct (0.0% to 120.0%).]

Objective 2d

Page 164: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Are the histograms different?

Two statistical measures of difference

Chi Squared (counts)

K-S (distribution)

Both yield a difference metric

Objective 2d

Page 165: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Chi Squared

Classic test on data in a table

Answers the question – are the rows/columns different

Some limitations on when it can be applied

Objective 2d

Page 166: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Chi Squared

Table of Counts

Degrees of Freedom

Chi Squared Value

P-statistic

Computationally intensive
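As a rough illustration of the Chi Squared value, here is a minimal Python sketch. It assumes the group's bin counts are compared with the counts expected if the group followed the population's proportions. The function name and the sample counts are hypothetical and do not come from the presentation.

```python
def chi_squared(group_counts, population_counts):
    """Chi-squared statistic of a group's bin counts versus population proportions.

    Assumption: the expected count for a bin is the group total multiplied by
    the population's share of that bin. Degrees of freedom would be bins - 1.
    """
    group_total = sum(group_counts)
    population_total = sum(population_counts)
    statistic = 0.0
    for observed, pop in zip(group_counts, population_counts):
        expected = group_total * pop / population_total
        if expected > 0:  # skip bins the population never uses
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts: the group piles up in the first bin.
print(chi_squared([30, 10], [50, 50]))
```

A group that mirrors the population yields a value near zero; larger values indicate a bigger difference.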

Objective 2d

Page 167: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Kolmogorov-Smirnov

Two Russian mathematicians

Comparison of distributions

Metric is the “d-statistic”

Objective 2d

Page 168: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

How is K-S test done?

Four step process

1. For each cluster element determine percentage

2. Then calculate cumulative percentage

3. Compare the differences in cumulative percentages

4. Identify the largest difference
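The four steps above can be sketched in a few lines of Python. This is only an illustration, assuming both histograms use the same bins in the same order. The function name is hypothetical.

```python
def d_statistic(group_counts, population_counts):
    """Largest gap between the cumulative percentages of two histograms."""
    group_total = sum(group_counts)
    population_total = sum(population_counts)
    cumulative_group = 0.0
    cumulative_population = 0.0
    largest = 0.0
    for g, p in zip(group_counts, population_counts):
        # Steps 1 and 2: percentage of each bin, accumulated as we go.
        cumulative_group += g / group_total
        cumulative_population += p / population_total
        # Steps 3 and 4: compare the cumulative percentages, keep the largest gap.
        largest = max(largest, abs(cumulative_group - cumulative_population))
    return largest


print(d_statistic([10, 0], [5, 5]))
```

The result ranges from 0 (identical distributions) to 1 (no overlap), which makes it convenient for ranking groups.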

Objective 2d

Page 169: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Kolmogorov-Smirnov

Objective 2d - KS

Page 170: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Classification by metrics

Stratification

Day of week

Happens on holiday

Round numbers

Variability

Benford’s Law

Trend lines

Relationships (market basket)

Gaps

Duplicates

Objective 2e

Page 171: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Auditor’s “Top 10” Metrics

1. Outliers / Variability

2. Stratification

3. Day of Week

4. Round Numbers

5. Made Up Numbers

6. Market basket

7. Trends

8. Gaps

9. Duplicates

10. Dates

Objective 2e

Page 172: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Understanding the Basis

Quantified Approach

Population vs. Groups

Measuring the Difference

Stat 101 – Counts, Totals, Chi Square and K-S

The metrics used

Objective 2

Page 173: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Objective 2 - Summarized

1. Understand why and how

2. Understand statistical basis for quantifying differences

3. Identify ten general tools and techniques

4. Understand examples done using Excel

5. How pattern detection fits in

Next are the metrics …

Page 174: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

It’s that time!

Session Break!

Page 175: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

The “Top 10” Metrics

Overview

Explain Each Metric

Examples of what it can detect

How to assess results

Objective 3

Page 176: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Trapping anomalies

Objective 3

Page 177: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Fraud Pattern Detection

Market Basket

Stratification

Trend Line

Holiday

Day of Week

Duplicates

Univariate

Gaps

Benford’s Law

Round Numbers

Target Group

Objective 3

Page 178: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Outliers / Variability

Outliers are amounts which are significantly different from the rest of the population

1 - Outliers

Page 179: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Outliers / Variability

Charting (visual)

Software to analyze “z-scores”

Top and Bottom 10, 20 etc.

High and low variability (coefficient of variation)

1 - Outliers

Page 180: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Drill down to the group level

Basic statistics
– Minimum, maximum and average
– Variability

Sort by statistic of interest
– Variability (coefficient of variation)
– Maximum, etc.
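The group-level drill down can be sketched as follows. This is a hedged example, not the presenter's software. It assumes the coefficient of variation is the population standard deviation divided by the mean, expressed as a percentage, and that the input is a list of (group, amount) pairs. All names are hypothetical.

```python
from collections import defaultdict
from statistics import mean, pstdev


def coefficient_of_variation(amounts):
    """Population standard deviation as a percentage of the mean (assumed definition)."""
    average = mean(amounts)
    return pstdev(amounts) / average * 100 if average else float("nan")


def rank_groups_by_variability(rows):
    """Return (group, n, min, max, average, cv) tuples, highest variability first."""
    groups = defaultdict(list)
    for group_id, amount in rows:
        groups[group_id].append(amount)
    stats = [
        (gid, len(a), min(a), max(a), mean(a), coefficient_of_variation(a))
        for gid, a in groups.items()
    ]
    return sorted(stats, key=lambda s: s[5], reverse=True)


# Hypothetical claims: provider "B" is far more variable than provider "A".
claims = [("A", 100), ("A", 100), ("B", 10), ("B", 190)]
for row in rank_groups_by_variability(claims):
    print(row)
```

Sorting on a different tuple position (for example the maximum) gives the other rankings mentioned on the slide.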

1 - Outliers

Page 181: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Example Results

Provider N Coeff Var

3478421 3,243 342.23

2356721 4,536 87.23

3546789 3,421 23.25

5463122 2,311 18.54

Two providers (3478421 and 2356721) had significantly more variability in the amounts of their claims than all the rest.

1 - Outliers

Page 182: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Next Metric

1. Outliers

2. Stratification

3. Day of Week

4. Round Numbers

5. Made Up Numbers

6. Market basket

7. Trends

8. Gaps

9. Duplicates

10. Dates

Page 183: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Unusual stratification patterns

Do you know how your data looks?

2 - Stratification

Page 184: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Stratification - How

Charting (visual)

Chi Squared

Kolmogorov-Smirnov

By groups

2 - Stratification

Page 185: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Purpose / types of errors

Transactions out of the ordinary

“Up-coding” insurance claims

“Skewed” groupings

Based on either count or amount

2 – Stratification

Page 186: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

The process?

1. Stratify the entire population into “bins” specified by auditor

2. Same stratification on each group (e.g. vendor)

3. Compare the group tested to the population

4. Obtain measure of difference for each group

5. Sort descending on difference measure
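The five steps can be illustrated with a short Python sketch. It assumes the auditor supplies bin edges as a sorted list of amounts and that the difference measure is the largest gap in cumulative percentages (a K-S style d-statistic). The function names and sample data are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def stratify(amounts, bin_edges):
    """Count amounts per bin; an amount equal to an edge falls in the higher bin."""
    counts = [0] * (len(bin_edges) + 1)
    for amount in amounts:
        counts[bisect_right(bin_edges, amount)] += 1
    return counts


def d_stat(group_counts, population_counts):
    """Largest gap between the two cumulative percentage curves."""
    g_total, p_total = sum(group_counts), sum(population_counts)
    g_cum = p_cum = largest = 0.0
    for g, p in zip(group_counts, population_counts):
        g_cum += g / g_total
        p_cum += p / p_total
        largest = max(largest, abs(g_cum - p_cum))
    return largest


def rank_groups(rows, bin_edges):
    """Steps 1 to 5: stratify population and groups, measure, sort descending."""
    population = stratify([amount for _, amount in rows], bin_edges)
    by_group = defaultdict(list)
    for group_id, amount in rows:
        by_group[group_id].append(amount)
    results = [
        (gid, len(a), d_stat(stratify(a, bin_edges), population))
        for gid, a in by_group.items()
    ]
    return sorted(results, key=lambda r: r[2], reverse=True)


rows = [("A", 50), ("A", 60), ("A", 500), ("A", 2000), ("B", 5000), ("B", 6000)]
print(rank_groups(rows, bin_edges=[100, 1000]))
```

Groups at the top of the returned list differ most from the population and would be the first candidates for follow-up.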

2 – Stratification

Page 187: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Units of Service Stratified - Example Results

Two providers (2735211 and 4562134) are shown to be much different from the overall population (as measured by Chi Square).

Provider N Chi Sq D-stat

2735211 6,011 7,453 0.8453

4562134 8,913 5,234 0.7453

4321089 3,410 342 0.5231

4237869 2,503 298 0.4632

2 – Stratification

Page 189: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Day of Week

Activity on weekdays

Activity on weekends

Peak activity mid to late week

3 – Day of Week

Page 190: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Purpose / Type of Errors

Identify unusually high/low activity on one or more days of week

Dentist who only handled Medicaid on Tuesday

Office is empty on Friday

3 – Day of Week

Page 191: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

How is it done?

Programmatically check entire population

Obtain counts and sums by day of week (1-7)

Prepare histogram

For each group do the same procedure

Compare the two histograms

Sort descending by metric (chi square/d-stat)
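The counting step can be sketched in Python as shown here. The sketch assumes transactions arrive as (date, amount) pairs and uses the ISO numbering where Monday is day 1 and Sunday is day 7. The presentation does not say which day numbering its software uses, so that choice is an assumption, as is the function name.

```python
from datetime import date


def day_of_week_histogram(transactions):
    """Return (counts, sums) by weekday; index 0 is Monday, index 6 is Sunday."""
    counts = [0] * 7
    sums = [0.0] * 7
    for when, amount in transactions:
        slot = when.isoweekday() - 1  # isoweekday(): Monday=1 ... Sunday=7
        counts[slot] += 1
        sums[slot] += amount
    return counts, sums


# Hypothetical claims: two Tuesdays and one Wednesday in 2007.
claims = [
    (date(2007, 6, 12), 100.0),
    (date(2007, 6, 19), 50.0),
    (date(2007, 7, 4), 25.0),
]
counts, sums = day_of_week_histogram(claims)
print(counts)
print(sums)
```

Running the same function over the whole population and over each group gives the two histograms to compare.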

Page 192: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Day of Week - Example Results

Provider 2735211 only provided service for Medicaid on Tuesdays. Provider 4562134 was closed on Thursdays and Fridays.

Provider N Chi Sq D-stat

2735211 5,404 12,435 0.9802

4562134 5,182 7,746 0.8472

4321089 5,162 87 0.321

4237869 7,905 56 0.2189

3 – Day of Week

Page 194: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Round Numbers

It’s about….

Estimates!

4 – Round Numbers

Page 195: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Purpose / Type of Errors

Isolate estimates

Highlight account numbers in journal entries with round numbers

Split purchases (“under the radar”)

Which groups have the most estimates

4 – Round Numbers

Page 196: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Round numbers

Classify population amounts
– $1,375.23 is not round
– $5,000 is a round number – type 3 (3 zeros)
– $10,200 is a round number – type 2 (2 zeros)

Quantify expected vs. actual (d-statistic)

Generally represents an estimate

Journal entries
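The classification rule can be sketched in Python as follows. It assumes the "type" is the number of trailing zeros in a whole-dollar amount, and that any amount with cents is not round. The function name is hypothetical.

```python
from decimal import Decimal


def round_number_type(amount):
    """Number of trailing zeros in a whole-dollar amount; 0 means not round."""
    value = Decimal(str(amount))
    if value == 0 or value != value.to_integral_value():
        return 0  # zero, or an amount with cents, is treated as not round
    whole = abs(int(value))
    zeros = 0
    while whole % 10 == 0:
        zeros += 1
        whole //= 10
    return zeros


for example in ("1375.23", "5000", "10200"):
    print(example, round_number_type(example))
```

Tallying these types per account gives the histogram that is then compared with the population.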

4 – Round Numbers

Page 197: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Round Numbers in Journal Entries - Example Results

Two accounts, 2735211 and 4562134 have significantly more round number postings than any other posting account in the journal entries.

Account N Chi Sq D-stat

2735211 4,136 54,637 0.9802

4562134 833 35,324 0.97023

4321089 8,318 768 0.321

4237869 9,549 546 0.2189

4 – Round Numbers

Page 199: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Made up Numbers

Curb stoning

Imaginary numbers

Benford’s Law

5 – Made up numbers

Page 200: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

What can be detected

Made up numbers – e.g. falsified inventory counts, tax return schedules

5 – Made Up Numbers

Page 201: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Benford’s Law using Excel

Basic formula is “=log(1+(1/N))”, where N is the leading digit (Excel’s LOG defaults to base 10)

Workbook with formulae available at http://tinyurl.com/4vmcfs

Obtain leading digits using “Left” function, e.g. left(Cell,1)

5 – Made Up Numbers

Page 202: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Made up numbers

Benford’s Law

Check Chi Square and d-statistic

First 1,2,3 digits

Last 1,2 digits

Second digit

Sources for more info

5 – Made Up Numbers

Page 203: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

How is it done?

Decide type of test – (first 1-3 digits, last 1-2 digits, etc.)

For each group, count number of observations for each digit pattern

Prepare histogram

Based on total count, compute expected values

For the group, compute Chi Square and d-stat

Sort descending by metric (chi square/d-stat)
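For the first-digit test, the procedure can be sketched in Python as shown here. The sketch assumes amounts are currency values with two decimals and uses the Chi Square measure only. The function names and sample amounts are hypothetical.

```python
import math
from collections import Counter


def benford_expected(digit):
    """Expected share of a leading digit (1-9) under Benford's Law."""
    return math.log10(1 + 1 / digit)


def first_digit(amount):
    """First non-zero digit of an amount, or None for zero amounts."""
    digits = f"{abs(amount):.2f}".replace(".", "").lstrip("0")
    return int(digits[0]) if digits else None


def benford_chi_squared(amounts):
    """Chi-squared difference between observed and expected first-digit counts."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    if not digits:
        return 0.0
    observed = Counter(digits)
    statistic = 0.0
    for digit in range(1, 10):
        expected = len(digits) * benford_expected(digit)
        statistic += (observed[digit] - expected) ** 2 / expected
    return statistic


# Hypothetical invoices: every amount starts with 9, which Benford's Law rarely expects.
print(benford_chi_squared([900.00] * 50))
```

Computing this value per group (for example per store) and sorting descending produces a ranking like the example results.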

5 – Made Up Numbers

Page 204: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Invoice Amounts tested with Benford’s law - Example Results

During tests of invoices by store, two stores, 324 and 563 have significantly more differences than any other store as measured by Benford’s Law.

Store Hi Digit Chi Sq D-stat

324 79 5,234 0.9802

563 89 4,735 0.97023

432 23 476 0.321

217 74 312 0.2189

5 – Made Up Numbers

Page 206: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Market Basket

Medical “Ping ponging”

Pattern associations

Apriori program

References at end of slides

Apriori – Latin a (from) priori (former)

Deduction from the known

6 – Market Basket

Page 207: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Purpose / Type of Errors

Unexpected patterns and associations

Based on “market basket” concept

Unusual combinations of diagnosis code on medical insurance claim

6 – Market basket

Page 208: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Market Basket

JE Accounts

JE Approvals

Credit card fraud in Japan – taxi and ATM

6 – Market basket

Page 209: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

How is it done?

First, identify groups, e.g. all medical providers for a patient

Next, for each provider, assign a unique integer value

Create a text file containing the values

Run “apriori” analysis

6 – Market basket

Page 210: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Apriori outputs

For each unique value, probability of other values

If you see Dr. Jones, you will also see Dr. Smith (80% probability)

If you see a JE to account ABC, there will also be an entry to account XYZ (30%)
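The kind of probability shown above can be illustrated with a simplified pairwise calculation in Python. This is not the Apriori program itself. It only computes, for each ordered pair of items, the share of baskets containing the first item that also contain the second. The function name and sample baskets are hypothetical.

```python
from collections import Counter
from itertools import permutations


def pair_confidence(baskets):
    """Map (a, b) to the share of baskets containing a that also contain b."""
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = set(basket)  # ignore repeats within one basket
        item_counts.update(items)
        pair_counts.update(permutations(items, 2))
    return {(a, b): n / item_counts[a] for (a, b), n in pair_counts.items()}


# Hypothetical patients and the doctors each one saw.
baskets = [{"Jones", "Smith"}] * 4 + [{"Jones"}]
confidence = pair_confidence(baskets)
print(confidence[("Jones", "Smith")])
print(confidence[("Smith", "Jones")])
```

Unexpectedly high values for a pair of providers or accounts are the associations worth a closer look.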

6 – Market basket

Page 212: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Numeric Sequence Gaps

What’s there is interesting, what’s not there is critical …

8 - Gaps

Page 213: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Purpose / Type of Errors

Missing documents (sales, cash, etc.)

Inventory losses (missing receiving reports)

Items that “walked off”

8 – Gaps

Page 214: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

How is it done?

Check any sequence of numbers supposed to be complete, e.g.

Cash receipts

Sales slips

Purchase orders
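A gap check on any such sequence can be sketched in Python as follows. It assumes the document numbers are integers that should increase by one. The function name and sample numbers are hypothetical.

```python
def find_gaps(numbers):
    """Return (start, end, missing) for every break in a numeric sequence."""
    ordered = sorted(set(numbers))
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > 1:
            gaps.append((previous, current, current - previous - 1))
    return gaps


# Hypothetical check numbers with one check missing and then two missing.
print(find_gaps([10789, 10791, 10792, 10793, 10796]))
```

Each tuple reports the numbers on either side of the gap and how many are missing between them.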

8 – Gaps

Page 215: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Gaps Using Excel

Excel – sort and check

Excel formula

Sequential numbers and dates

8 – Gaps

Page 216: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Gap Testing - Example Results

Four check numbers are missing.

Start End Missing

10789 10791 1

12523 12526 2

17546 17548 1

8 – Gaps

Page 218: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Duplicates

Why is there more than one?

Same, Same, Same, and

Same, Same, Different

9 - Duplicates

Page 219: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Two types of (related) tests

Same items – same vendor, same invoice number, same invoice date, same amount

Different items – same employee name, same city, different social security number

9 – Duplicates

Page 220: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Duplicate Payments

High payback area

“Fuzzy” logic

Overriding software controls

9 - Duplicates

Page 221: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Fuzzy matching with software

Levenshtein distance

Soundex

“Like” clause in SQL

Regular expression testing in SQL

Vendor/employee situations

Russian physicist
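As an illustration of the first technique, Levenshtein distance counts the single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below is a standard dynamic-programming version in Python. The sample vendor names are hypothetical.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


# Hypothetical vendor names that differ by one character.
print(levenshtein("Acme Corp", "Acme Corp."))
```

A small distance between two vendor or employee names flags a pair that may be the same party entered twice.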

9 - Duplicates

Page 222: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

How is it done?

First, sort file in sequence for testing

Compare items in consecutive rows

Extract exceptions for follow-up
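The sort-and-compare approach can be sketched in Python as follows. It assumes each record is a dictionary and that the auditor chooses which fields define a duplicate. The function and field names are hypothetical.

```python
def find_duplicates(records, key_fields):
    """Sort on the key fields, then flag each row that matches the row before it."""
    def key(record):
        return tuple(record[field] for field in key_fields)

    ordered = sorted(records, key=key)
    exceptions = []
    for previous, current in zip(ordered, ordered[1:]):
        if key(previous) == key(current):
            exceptions.append(current)
    return exceptions


# Hypothetical invoices: the first one appears three times.
invoices = [
    {"vendor": "10245", "date": "2007-06-15", "amount": 3544.78},
    {"vendor": "17546", "date": "2007-02-12", "amount": 1500.00},
    {"vendor": "10245", "date": "2007-06-15", "amount": 3544.78},
    {"vendor": "10245", "date": "2007-06-15", "amount": 3544.78},
]
print(len(find_duplicates(invoices, ["vendor", "date", "amount"])))
```

Each flagged row is an extra copy, so a record that appears three times contributes two exceptions.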

9 - Duplicates

Page 223: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Possible Duplicates - Example Results

Five invoices may be duplicates.

Vendor | Invoice Date | Invoice Amount | Count

10245 6/15/2007 3,544.78 4

10245 8/31/2007 2,010.37 2

17546 2/12/2007 1,500.00 2

9 - Duplicates

Page 225: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Date Checking

If we’re closed, why is there …

Adjusting journal entry?

Receiving report?

Payment issued?

10 - Dates

Page 226: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Holiday Date Testing

Red Flag indicator

10 – Dates

Page 227: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Date Testing challenges

Difficult to determine

Floating holidays – Friday, Saturday, Sunday, Monday

10 – Dates

Page 228: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Typical audit areas

Journal entries

Employee expense reports

Business telephone calls

Invoices

Receiving reports

Purchase orders

10 – Dates

Page 229: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Determination of Dates

Transactions when business is closed

Federal Office of Personnel Management

An excellent fraud indicator in some cases

10 – Dates

Page 230: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Holiday Date Testing

Identifying holiday dates:
– Error prone
– Tedious

U.S. only

10 – Dates

Page 231: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Federal Holidays

Established by Law

Ten dates

Specific date (unless weekend), OR

Floating holiday

10 – Dates

Page 232: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Federal Holiday Schedule

Office of Personnel Management

Example of specific date – Independence Day, July 4th (unless weekend)

Example of floating date – Martin Luther King’s birthday (3rd Monday in January)

Floating – Thanksgiving – 4th Thursday in November
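Both kinds of holiday date can be computed with a short Python sketch. It assumes the usual federal convention that a fixed holiday falling on Saturday is observed the preceding Friday, and one falling on Sunday is observed the following Monday. The function names are hypothetical.

```python
from datetime import date, timedelta


def nth_weekday(year, month, weekday, n):
    """Date of the n-th given weekday in a month (Monday=0 ... Sunday=6)."""
    first = date(year, month, 1)
    offset = (weekday - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def observed(holiday):
    """Shift a fixed-date holiday off the weekend (assumed federal convention)."""
    if holiday.weekday() == 5:  # Saturday is observed on Friday
        return holiday - timedelta(days=1)
    if holiday.weekday() == 6:  # Sunday is observed on Monday
        return holiday + timedelta(days=1)
    return holiday


print(nth_weekday(2007, 1, 0, 3))    # 3rd Monday in January 2007
print(nth_weekday(2007, 11, 3, 4))   # 4th Thursday in November 2007
print(observed(date(2009, 7, 4)))    # July 4, 2009 fell on a Saturday
```

A set of dates built this way can then be matched against transaction dates to count holiday activity.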

10 – Dates

Page 233: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

How is it done?

Programmatically count holidays for entire population

For each group, count holidays

Compare the two histograms (group and population)

Sort descending by metric (chi square/d-stat)

10 – Dates

Page 234: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Holiday Counts - Example Results

Two employees (10245 and 32325) were “off the chart” in terms of expense amounts incurred on a Federal Holiday.

Employee Number N Chi Sq D-stat

10245 37 5,234 0.9802

32325 23 4,735 0.97023

17546 18 476 0.321

24135 34 312 0.2189

10 – Dates

Page 235: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

The “Top 10” Metrics

Overview

Explain Each Metric

Examples of what it can detect

How to assess results

Objective 3

Page 236: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Objective 3 - Summarized

1. Understand why and how

2. Understand statistical basis for quantifying differences

3. Identify ten general tools and techniques

4. Understand examples done using Excel

5. How pattern detection fits in

Next – using Excel …

Page 237: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Use of Excel

Built-in functions

Add-ins

Macros

Database access

Objective 4

Page 238: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Excel templates

Variety of tests
– Round numbers
– Benford’s Law
– Outliers
– Etc.

Objective 4

Page 239: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Excel – Univariate statistics

Work with Ranges

=sum, =average, =stdevp

=large(Range,1), =small(Range,1)

=min, =max, =count

Tools | Data Analysis | Descriptive Statistics

Objective 4

Page 240: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Excel Histograms

Tools | Data Analysis | Histogram

Bin Range

Data Range

Objective 4

Page 241: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Excel Gaps testing

Sort by sequential value

=if(thiscell-lastcell <> 1,thiscell-lastcell,0)

Copy/paste special

Sort

Objective 4

Page 242: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Detecting duplicates with Excel

Sort by sort values

=if testing

=if(and(thiscell=lastcell, etc.))

Objective 4

Page 243: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Performing audit tests with macros

Repeatable process

Audit standardization

Learning curve

Streamlining of tests

More efficient and effective

Examples - http://ezrstats.com/Macros/home.html

Objective 4

Page 244: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Using database audit software

Many “built-in” functions right off the shelf with SQL

Control totals

Exception identification

“Drill down”

Quantification

June 2008 article in the EDP Audit & Control Journal (EDPACS) “SQL as an audit tool”

http://ezrstats.com/doc/SQL_As_An_Audit_Tool.pdf

Objective 4

Page 245: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Managing the business risk of fraud EZ-R Stats, LLC

Use of Excel

Built-in functions

Add-ins

Macros

Database access

Objective 4

Page 246: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Objective 4 - Summarized

1. Understand why and how

2. Understand statistical basis for quantifying differences

3. Identify ten general tools and techniques

4. Understand examples done using Excel

5. How Pattern Detection fits in

Next – Fit …

Page 247: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

How Pattern Detection Fits In

Business Analytics

Fraud Pattern Detection

Continuous monitoring

Objective 5

Page 248: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Where does Fraud Pattern Detection fit in?

Business Analytics

Fraud Pattern Detection

Continuous fraud pattern detection

Continuous Monitoring

Right in the middle

Objective 5

Page 249: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Business Analytics

Fraud analytics -> business analytics

Business analytics -> fraud analytics

Objective 5

Page 250: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Role in Continuous Monitoring (CM)

Fraud analytics can feed CM

Continuous fraud pattern detection

Use output from CM to tune fraud pattern detection

Objective 5

Page 251: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Objective 6 - Summarized

Understand the framework for managing the business risk of fraud

Plan, perform and explain statistical sampling in audits

Reduce audit costs using data mining, sequential sampling and other sampling techniques

Apply SAS 56, the new SAS suite and the revised (2007) Yellow Book.

Run, hands-on, the most productive analytic technique (regression analysis).

Use data mining to introduce greater efficiency into the audit process, without losing effectiveness.

Page 252: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Links for more information

Kolmogorov-Smirnov http://tinyurl.com/y49sec

Benford’s Law http://tinyurl.com/3qapzu

Chi Square tests http://tinyurl.com/43nkdh

Continuous monitoring http://tinyurl.com/3pltdl

Page 253: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Market Basket

Apriori testing for “ping ponging”

Temple University http://tinyurl.com/5vax7r

Apriori program (“open source”) http://tinyurl.com/5qehd5

Article – “Medical ping ponging” http://tinyurl.com/5pzbh4

Page 254: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Excel macros used in auditing

Excel as an audit software - http://tinyurl.com/6h3ye7

Selected macros - http://ezrstats.com/Macros/home.html

Spreadsheets forever - http://tinyurl.com/5ppl7t

Page 255: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Questions?

Page 256: Managing the business risk of fraud using sampling and data mining EZ-R Stats, LLC Managing the Business Risk of Fraud using Sampling and Data Mining Fall

Contact info

Phone: (919)-219-1622

E-mail: [email protected]

Blog: http://blog.ezrstats.com