how to catch a thief: identifying tax evasion

34
How To Catch A Thief Identifying Tax Evasion with Machine Learning

Upload: srisatish-ambati

Post on 05-Apr-2017

56 views

Category:

Technology


0 download

TRANSCRIPT

Page 1: How to Catch a Thief: Identifying Tax Evasion

How To Catch A Thief

Identifying Tax Evasion

with

Machine Learning

Page 2: How to Catch a Thief: Identifying Tax Evasion

About Me

UC Berkeley,

BA Geophysics

Scripps Institution of Oceanography,

MS Geophysics

Science/AAAS,

Science Journalism Fellow

+

Lauren DiPerna, Data Scientist @ H

2

O.ai

Page 3: How to Catch a Thief: Identifying Tax Evasion

What is Money Laundering?

Making Dirty Money Look Clean

Page 4: How to Catch a Thief: Identifying Tax Evasion

How it Works

Placement

Layering

Integration

CHECKLIST

Convert Dirty Money

Deposit into Bank

Withdraw from Bank

Page 5: How to Catch a Thief: Identifying Tax Evasion

Get Your Money Into a Financial InstitutionPlacement

Layering

Integration

Pros

❏ Stop Stuffing Money into Mattress❏ Stop Paying for Extra Guards

Cons

❏ Possibility of Getting Caught❏ Hard to Find Trustworthy Smurfs

Page 6: How to Catch a Thief: Identifying Tax Evasion

Use Multiple Accounts to Avoid SuspicionPlacement

Layering

Integration

Pros

❏ Won’t Lead Back to Me❏ Hard for IRS to audit

Cons

❏ Headaches From Keeping Track of Everything❏ Have to Figure out Loopholes in Different Languages

Page 7: How to Catch a Thief: Identifying Tax Evasion

Get Your Money BackPros

❏ Can Access Money & Appear Innocent❏ Excuse to Buy a House❏ Excuse to Buy a Boat❏ Excuse to Buy Expensive Art❏ Money

Placement

Layering

Integration

Page 8: How to Catch a Thief: Identifying Tax Evasion

Notorious Money Laundering

“Russian criminals laundered about $70 billion through this shack in Nauru”- NYTimes

Shell Banks No Record

Page 9: How to Catch a Thief: Identifying Tax Evasion

Why is Money Laundering Hard To Catch?

RULE

Untangle Transactions& Identifying the Source is Hard

Rule-Based Systems Rely on Relatively Stateless Checks

Criminals Find WaysAround Regulatory

Detection

Page 10: How to Catch a Thief: Identifying Tax Evasion

Could We Use ML to Help?

Machine Human

2 0

Page 11: How to Catch a Thief: Identifying Tax Evasion

Why We Should Use H2O Machine LearningHow Would you Teach a Child to Identify an Animal?

Page 12: How to Catch a Thief: Identifying Tax Evasion

Why H2O Machine Learning Can Beat RulesHard Code RulesSpecify ThresholdsDeterministic

Learns Rules from DataOptimizes Thresholds to Minimize ErrorProbabilistic Scores

If deposit > 10,000 andAcct_age < 1 year then....

Page 13: How to Catch a Thief: Identifying Tax Evasion

About H20.aiDistributed in-memory platform

Easy to use SDK / APIs

Open Source

Can use ALL business data

GLM, DRF, GBM, Deep Learning, K-means, PCA

Java, Python, R, Scala, JSON, Browser-Based GUI

Ownership of Methods

Modeling without Sampling

Page 14: How to Catch a Thief: Identifying Tax Evasion

H2O.ai Integration

Java R Python Scala REST

On Spark On HadoopStandalone

H2O

Page 15: How to Catch a Thief: Identifying Tax Evasion

H2O Algorithm OverviewStatistical Analysis

Deep Neural Networks

Ensembles

Clustering

Anomaly Detection

Dimensionality Reduction

Generalized Linear ModelsNaïve Bayes

Distributed Random Forest Gradient Boosting Machine

Deep learning

K-means

Principal Component AnalysisGeneralized Low Rank Models

Autoencoders

Page 16: How to Catch a Thief: Identifying Tax Evasion

Distributed Random Forest

SuspectInnocentSuspectInnocent

.

.

.

.

.

.

.

Acc. 1Acc. 2Acc. 3Acc. 4

Original Dataset

Predictions

Page 17: How to Catch a Thief: Identifying Tax Evasion

Tax EvasionUse Case

How Regulators Deal with Tax Evasion

Page 18: How to Catch a Thief: Identifying Tax Evasion

Bank Secrecy Act

THE IRS/FinCENTHE BANKS

Page 19: How to Catch a Thief: Identifying Tax Evasion

What Gets Sent to Regulators

DepositFlag

Don’t Flag

Page 20: How to Catch a Thief: Identifying Tax Evasion

Structuring: How Criminals Avoid Detection

Spaced Out Deposits

Page 21: How to Catch a Thief: Identifying Tax Evasion

How H2O Machine Learning Can Help

What Gets Sent to Regulators

DepositFlag

Don’t Flag

Page 22: How to Catch a Thief: Identifying Tax Evasion

But How is ML Different from Rule Systems?Threshold Trends/Spikes Memory Train/Retrain

Machine Learning

Rules

Page 23: How to Catch a Thief: Identifying Tax Evasion

Made for Machine Learning

What Features Should We Input to an Algorithm?

Page 24: How to Catch a Thief: Identifying Tax Evasion

Features that Provide ContextProfile Summary

Current balance: $50,000

Deposits: 1

Withdrawals: 3

Age of account: 10 years

Linked accounts: 3

Page 25: How to Catch a Thief: Identifying Tax Evasion

Feature 1. Total Amount Deposited Last 7 Day

$2,000

$0

$49,995

$9,900

Day 0-7

Day 7-14

Day 14-21

Day 21-28

Time

Amou

nt

Page 26: How to Catch a Thief: Identifying Tax Evasion

Feature 2. Total Amount Withdrawn Last 7 Day

$8,000

$2,000

$2,000

-$9,000

Day 0-7

Day 7-14

Day 14-21

Day 21-28

Time

Amou

nt

Page 27: How to Catch a Thief: Identifying Tax Evasion

Feature 3. % of Monetary Instruments Last 30 Days

100% Cash 50% Cash 20% Cash

Suspicious

Moderately Suspicious

Normal

Page 28: How to Catch a Thief: Identifying Tax Evasion

Feature 4. Tax Day Behavior

November December January February

Add weight to transactions closer to tax season

Page 29: How to Catch a Thief: Identifying Tax Evasion

How the Algo Uses Features to Identify StructuringSplit along features that minimize prediction error

Ability to include feature interactions

Composition of features yields a probability score,

used to identify structuring

Page 30: How to Catch a Thief: Identifying Tax Evasion

Bonus Machine Learning TreatFeature Importance

% of Monetary Instruments

Total Amount Withdrawn

Total Amount Deposited

Account Holder Name

Page 31: How to Catch a Thief: Identifying Tax Evasion

Putting It All Together

AllTransactions

retrain verify

Page 32: How to Catch a Thief: Identifying Tax Evasion

Machine Learning for Anti-Money Laundering

Enhance RulesHelp Generate Better RulesIdentify New Criminal PatternsReplace Rule-Based Systems

LOOPHOLES

Page 33: How to Catch a Thief: Identifying Tax Evasion

H2O Machine Learning Resources

http://docs.h2o.ai/ https://groups.google.com/forum/#!forum/h2ostream http://stackoverflow.com/questions/tagged/h2o

[email protected]

H2O User Guide

Page 34: How to Catch a Thief: Identifying Tax Evasion

Don’t Forget to File your Taxes (on all your income)