blind spots in big data erez koren @ forter

Blind Spots in BIG DATA

Erez Koren, Forter

About myself - Erez Koren

In the computer business since 2nd grade

Love building products and hacking stuff

Currently in my 3rd startup adventure

Working at Forter from before day one

About Forter

We catch Fraudsters & protect E-commerce merchants

Founded 3.5 years ago

~80 employees worldwide

Backed by

We detect fraud, give a real-time decision (approve/decline) every time and guarantee it (chargeback protection), covering the whole customer lifecycle

We collect data from browsers (JS) and mobile apps (through an SDK)

We also receive order/account data server-to-server (S2S) into our API and reply with our decision in real time

Our stack:

Forter - What & How We Do It

Compliance:

And more...

The big data infrastructure...

But you have a feeling that something is wrong

Are you sure the data contains everything you need?

How do you ensure the quality of your data?

The COVERAGE challenge

In some cases the data you are analyzing is only partial

Today’s internet is a jungle.

There are thousands of devices, platforms, browsers and configurations.

Are you sure you are collecting data from all / most of the relevant sources?

The COVERAGE Challenge

Demo time

This is how we do it

MULTIPLE DEVICES & PLATFORMS

MULTIPLE VERSIONS, INCLUDING DEV. VERSIONS

SENDING EVENTS FROM 25 DIFFERENT CONFIGS

SELENIUM TESTS COVER THE FULL CHECKOUT EXPERIENCE

IN THE REAL WORLD, SOME OF THE TESTS ALWAYS FAIL

EXAMPLE OF UNEXPECTED DATA IN THE REAL WORLD

Chrome, Safari, Mobile Safari, Firefox, IE, Android Browser, Edge, Chrome WebView, PhantomJS, undefined, Opera, WebKit

Detect exceptions that occur on the client side

Browsers (JS), mobile SDKs and any other client integrations

CLIENT SIDE CODE MONITORING

JS SCRIPT TIMEOUTS

A merchant checked the website with a browser that does not support JavaScript

Detect gaps between the script requests seen by the server and the script events received
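One way to picture the gap check is the minimal Python sketch below. It is an illustration, not Forter's implementation: the function name `script_event_gaps`, the per-segment labels, the sample counts and the 20% threshold are all assumptions made for the example. It counts how many script requests the server saw per segment and how many of those produced events, then reports the segments where too many requests stayed silent.

```python
from collections import Counter

def script_event_gaps(script_requests, script_events, max_gap=0.2):
    """Return {segment: gap_ratio} for segments whose gap exceeds max_gap.

    Both arguments are iterables of segment labels (hypothetical here,
    e.g. a browser family): one entry per script request served and one
    entry per session that actually sent events back.
    """
    requested = Counter(script_requests)
    received = Counter(script_events)
    gaps = {}
    for segment, n_requested in requested.items():
        # Cap at n_requested so duplicate events cannot hide a gap.
        n_received = min(received.get(segment, 0), n_requested)
        gap = 1 - n_received / n_requested
        if gap > max_gap:
            gaps[segment] = round(gap, 3)
    return gaps

# Made-up sample data: Opera requests the script but rarely reports back.
requests = ["Chrome"] * 100 + ["Opera"] * 20
events = ["Chrome"] * 95 + ["Opera"] * 5
print(script_event_gaps(requests, events))
```

A segment that shows up here is a candidate blind spot: the script was requested, but little or no data came back.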

Compare the segment distribution of the general population with the segment distribution in your data

Test it as if you were a real user

Even if everything is working now, in the future it will not

Takeaways
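The first takeaway, comparing the general population with your own data, can be sketched in a few lines of Python. Everything specific here is an assumption for illustration: the function name `coverage_gaps`, the `min_ratio` threshold, and the made-up market-share and event numbers. In practice the population shares would come from an external benchmark of your choice.

```python
def coverage_gaps(population_share, observed_counts, min_ratio=0.5):
    """Flag segments that are under-represented in the collected data.

    population_share: {segment: expected fraction of traffic}
    observed_counts:  {segment: number of events actually collected}
    A segment is flagged when its observed share is below
    min_ratio * expected share.
    """
    total = sum(observed_counts.values())
    flagged = {}
    for segment, expected in population_share.items():
        observed = observed_counts.get(segment, 0) / total if total else 0.0
        if observed < expected * min_ratio:
            flagged[segment] = (expected, round(observed, 3))
    return flagged

# Illustrative numbers only, not real market data.
population = {"Chrome": 0.5, "Safari": 0.3, "Firefox": 0.2}
observed = {"Chrome": 700, "Safari": 290, "Firefox": 10}
print(coverage_gaps(population, observed))
```

With these sample numbers Firefox is expected to be 20% of traffic but makes up only 1% of the collected events, which suggests the collection code may be failing there.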

The MONITORING Challenge

Is “measuring everything” good enough?

How often are you checking the graphs?

Do you have enough alerts or too many?

There are always technical issues that can corrupt the alerting data

Demo 2 time

This is how we do it

API AVAILABILITY CHECK

External monitoring (watch the watcher), including round-trip

Pingdom and StatusCake

DEPENDENCIES MONITORING (RSS)

e.g. AWS, GitHub

Reported to our #productionroom in Slack

API RESPONSES ANOMALY DETECTION

Detect decline increase from X% to Y% in a given time window
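A minimal sketch of this kind of check, assuming a simple sliding time window rather than whatever model is actually used in production: decisions are recorded with a timestamp, old ones fall out of the window, and an alert fires when the decline rate in the window reaches the threshold (the "Y%" above). The class name `DeclineRateMonitor` and all parameter values are hypothetical.

```python
from collections import deque

class DeclineRateMonitor:
    """Alert when the decline rate inside a sliding time window is too high."""

    def __init__(self, window_seconds=300, alert_rate=0.15, min_events=20):
        self.window_seconds = window_seconds
        self.alert_rate = alert_rate    # the "Y%" threshold, as a fraction
        self.min_events = min_events    # avoid alerting on tiny samples
        self.events = deque()           # (timestamp, was_declined) pairs

    def record(self, timestamp, decision):
        """Record 'approve' or 'decline'; return True if an alert should fire."""
        self.events.append((timestamp, decision == "decline"))
        # Drop decisions that have fallen out of the time window.
        while self.events and self.events[0][0] <= timestamp - self.window_seconds:
            self.events.popleft()
        if len(self.events) < self.min_events:
            return False
        declined = sum(1 for _, was_declined in self.events if was_declined)
        return declined / len(self.events) >= self.alert_rate

# Made-up traffic: one decision per second, every fourth one is a decline.
monitor = DeclineRateMonitor(window_seconds=60, alert_rate=0.15, min_events=20)
alerts = [monitor.record(t, "decline" if t % 4 == 0 else "approve") for t in range(40)]
print(alerts[-1])
```

The `min_events` guard is a design choice for the sketch: without it, a single decline in an otherwise empty window would count as a 100% decline rate.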

1. Making sure we don’t slow the site down or impact the checkout funnel, via automated Selenium tests (with and without our script, in multiple browsers)

2. Incremental deployment support for

JS SCRIPT MONITORING

ML FEATURES ANOMALY DETECTION

Monitoring the system’s health by measuring the distribution of our machine learning features over time
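The slides do not say which statistic is used to compare feature distributions. As one possible illustration, the sketch below uses the Population Stability Index (PSI), a swapped-in technique chosen for the example, to compare a feature's current values against a baseline period. The function name, the bin count and the sample data are assumptions.

```python
import math

def population_stability_index(baseline, current, bins=10):
    """PSI between two samples of one numeric feature.

    Bin edges are taken from the baseline sample; current values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 for a constant feature

    def shares(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at a tiny value to avoid log(0).
        return [max(c / len(values), 1e-6) for c in counts]

    b, c = shares(baseline), shares(current)
    return sum((ci - bi) * math.log(ci / bi) for bi, ci in zip(b, c))

# Made-up feature values: the current period is shifted upwards.
baseline = list(range(100))
shifted = [v + 50 for v in baseline]
print(population_stability_index(baseline, baseline))
print(population_stability_index(baseline, shifted))
```

A commonly quoted rule of thumb treats a PSI above roughly 0.25 as a significant shift, but the right threshold depends on the feature and should be tuned against your own history.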

VULNERABILITIES MONITORING (RSS)

OS, databases, libraries etc.

ALERTS DAILY SUMMARY

Summary of the alerts in the last 24h + ability to drill down into the graphs

Takeaways

Make sure every alert can be drilled down into a graph and related to the raw metric

Know how to investigate - leave breadcrumbs to raw data (even when the data is aggregated)

Differentiate between critical alerts and other alerts (that can be fixed the next morning)

Measure low values as well as the high ones - alerts for low values (e.g. CPU) are something that most systems are missing
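The last point, alerting on low values as well as high ones, can be illustrated with a two-sided threshold check. This is only a sketch: the function name `check_metric` and the CPU thresholds are invented for the example.

```python
def check_metric(name, value, low, high):
    """Return an alert message if value is outside [low, high], else None."""
    if value < low:
        return f"{name} unusually low: {value} < {low}"
    if value > high:
        return f"{name} unusually high: {value} > {high}"
    return None

# A CPU that is nearly idle may mean traffic has stopped reaching the server.
print(check_metric("cpu_percent", 1.5, low=5, high=90))
print(check_metric("cpu_percent", 42, low=5, high=90))
```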

Takeaways

Understand the pipes and filters: make sure there are no hidden blockages in the data pipelines

Log errors from both the client side and the server side when possible, and analyze them together

Make sure incidents that affect input data are shared with your data scientists by using a “dirty” or “partial” flag
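One hedged way to implement the “dirty” or “partial” flag is to keep a list of incident time windows and tag every record that falls inside one, so that data scientists can filter or down-weight it later. The `Incident` class, the `tag_records` function, the `ts` and `quality_flags` field names and the sample data below are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Incident:
    start: float  # incident start time (e.g. epoch seconds)
    end: float    # incident end time
    flag: str     # "dirty" or "partial"

def tag_records(records, incidents):
    """Return copies of records with a 'quality_flags' list attached.

    Each record is a dict with a 'ts' timestamp; it receives the flag of
    every incident whose time window contains that timestamp.
    """
    tagged = []
    for record in records:
        flags = sorted({i.flag for i in incidents if i.start <= record["ts"] <= i.end})
        tagged.append({**record, "quality_flags": flags})
    return tagged

# Made-up example: a collection outage between t=10 and t=20.
incidents = [Incident(start=10, end=20, flag="partial")]
records = [{"ts": 5, "order": "a"}, {"ts": 15, "order": "b"}]
for row in tag_records(records, incidents):
    print(row)
```

Keeping the flag on the record itself, rather than only in an incident log, means it travels with the data into any later aggregation or training set.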

Thank you!
