A BOT WILL ALWAYS BE A BOT
USING MACHINE LEARNING TO PROTECT YOUR WEBSITE AND MOBILE APPS
FROM AUTOMATED TRAFFIC
Mark Greenwood – Head of Data Science
OVERVIEW
⁄ Automated Traffic
⁄ Your application as an opportunity
⁄ Impact of web bots
⁄ Types of web bots
⁄ Evolution and sophistication of attack
⁄ Machine Learning
⁄ What is machine learning?
⁄ Why is it useful for tackling web bots?
⁄ How we use machine learning to identify web bots
YOUR WEB APPLICATION
⁄ Enables users to interact with your business
⁄ Application enforces business rules…
⁄ …through user interface and API interactions
⁄ Interactions are inspectable
⁄ Query syntax
⁄ Application logic
⁄ Business logic/rules
⁄ 24/7 operation
⁄ Available to probe and catalogue any time…
⁄ …from anywhere in the world
REAL WORLD EXAMPLES
IMPACT OF BOTS – ACCOUNT TAKEOVER
DUNKIN DONUTS
OKCUPID
TURBOTAX
DELIVEROO
HSBC
NEST
“81% of Hacking-Related Breaches Leverage Compromised Credentials”
- Verizon DBIR 2017
IMPACT OF BOTS - INVENTORY
IMPACT OF BOTS
Automated traffic makes up >50% of the Internet (IDM)
$6.5-$7bn lost each year to Account Takeover (Forrester)
Bad bots account for 29% of all Internet traffic (The Atlantic)
1bn bots involved in 210m fraud attempts in Q1 2018 (Security Intelligence)
WEB BOTS
⁄ Exploit automated interactions to scale attacks
⁄ Content Scraping/theft
⁄ Ad-fraud
⁄ Inventory abuses
⁄ Account takeover and credential stuffing
⁄ Carding attacks
⁄ Range of approaches
⁄ Basic scripts
⁄ Browser automation
⁄ Off-the-shelf tools/platforms
⁄ Often tuned/configurable to a specific application
TOOLS AND TUTORIALS
EVOLUTION OF WEB BOTS
Basic Bot
Script run in one location making basic attempts to conceal identity.
Automated Bot
Application in one or a limited number of locations using off-the-shelf tools to automate parts of the attack.
Distributed Bot
Uses a bot network and automation to launch a distributed attack that mimics some real user behaviour.
Advanced Bot
Fully automated and distributed attack with the ability to adapt in real time to mitigations. Often goes undetected and is difficult to prevent.
COST OF ANONYMITY?
EVOLUTION OF WEB BOT MITIGATION
Network Security Problem
⁄ WAF rules
⁄ User agents
⁄ Rate limiting
⁄ ACLs
⁄ IP Reputation
Application Problem
⁄ Client-side/device validation
⁄ Captcha tests
⁄ Password policies
⁄ Mobile-phone MFA
These mitigations are:
⁄ Brittle
⁄ Enumerable
⁄ Inspectable
⁄ Circumventable
Business logic enumeration and
exploitation…
…including common Bot Mitigations!
AN EXAMPLE – DEVICE VERIFICATION
• Fingerprint
• Source
• User agent
• Browser features
• User interactions
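A minimal sketch of how the signals listed on this slide might be combined into a static client check. The field names, the `AUTOMATION_AGENT_MARKERS` list and the rules themselves are hypothetical and are not taken from any real product. The second client shows why a check like this can be enumerated: once an operator learns which values the rules expect, the bot can supply them.

```python
from dataclasses import dataclass

# Hypothetical markers of automation tools; a real list would be far longer.
AUTOMATION_AGENT_MARKERS = ("curl", "python-requests", "headlesschrome")


@dataclass
class ClientReport:
    """Signals a client-side check might collect (all field names are assumptions)."""
    fingerprint: str          # opaque device fingerprint
    source_ip: str            # where the request came from (recorded, not checked here)
    user_agent: str           # User-Agent header
    has_webdriver_flag: bool  # browser feature hinting at automation
    mouse_events: int         # count of user interactions observed


def failed_checks(report: ClientReport) -> list[str]:
    """Return the names of the static rules this client fails."""
    failures = []
    if not report.fingerprint:
        failures.append("fingerprint")
    agent = report.user_agent.lower()
    if any(marker in agent for marker in AUTOMATION_AGENT_MARKERS):
        failures.append("user_agent")
    if report.has_webdriver_flag:
        failures.append("browser_features")
    if report.mouse_events == 0:
        failures.append("user_interactions")
    return failures


naive_bot = ClientReport("", "203.0.113.7", "python-requests/2.31", False, 0)
# The same bot after its operator has learned which values the rules expect.
tuned_bot = ClientReport("a1b2c3", "203.0.113.7", "Mozilla/5.0", False, 12)

print(failed_checks(naive_bot))  # ['fingerprint', 'user_agent', 'user_interactions']
print(failed_checks(tuned_bot))  # []
```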
THE THREAT LANDSCAPE
⁄ Business/Application logic enumerable
⁄ Exposes business to breach/exploitation
⁄ Breach impact
⁄ Reputation
⁄ Financial
⁄ Web bots allow attackers to scale and mask their attacks
⁄ Pay-offs for attackers are not always obvious
⁄ Growing sophistication of attacks
⁄ Harder to identify attackers and stay ahead…
⁄ …means growing sophistication in mitigation
MACHINE LEARNING & ADAPTABLE DEFENCE
WHAT IS MACHINE LEARNING?
⁄ Take action without explicit programming
⁄ Exploit patterns in data to make predictions/decisions
[Diagram: Historic data → Training → Model; New data → Model → Prediction]
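The training and prediction flow on this slide can be sketched in a few lines. This is an illustration only: the single feature (requests per minute), the made-up numbers, the `max_z` cut-off and the "human"/"automated" labels are all assumptions. The point is that the boundary comes from the historic data rather than from a hard-coded rate.

```python
from statistics import mean, stdev


def train(historic_requests_per_minute: list[float]) -> dict:
    """'Training': summarise historic data into a model (here, just two numbers)."""
    return {
        "mean": mean(historic_requests_per_minute),
        "stdev": stdev(historic_requests_per_minute),
    }


def predict(model: dict, requests_per_minute: float, max_z: float = 3.0) -> str:
    """'Prediction': apply the learned model to new data."""
    z = (requests_per_minute - model["mean"]) / model["stdev"]
    return "automated" if z > max_z else "human"


historic = [4, 6, 5, 7, 5, 6, 4, 5]  # made-up rates from past sessions
model = train(historic)

print(predict(model, 6))    # human
print(predict(model, 120))  # automated
```

Retraining on different historic data moves the boundary without any change to the code.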
WHAT IS MACHINE LEARNING?
Supervised
⁄ Historic data is labelled
⁄ Learn to associate data with labels
Unsupervised
⁄ Unlabelled data
⁄ Learn relationships/patterns in data
⁄ Responds to similarities/differences in new data
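To make the contrast concrete, here is a small sketch of both styles on the same kind of data. The two features, the sample values and the nearest-centroid and distance-from-centre techniques are chosen purely for illustration. The features are left unscaled for brevity, which a real system would not do.

```python
from math import dist
from statistics import mean

# Each sample: (requests per minute, fraction of failed logins). Values are made up.
Sample = tuple[float, float]


def centroid(samples: list[Sample]) -> Sample:
    return (mean(s[0] for s in samples), mean(s[1] for s in samples))


# Supervised: historic data carries labels, so learn one centroid per label.
def fit_supervised(labelled: list[tuple[Sample, str]]) -> dict[str, Sample]:
    by_label: dict[str, list[Sample]] = {}
    for sample, label in labelled:
        by_label.setdefault(label, []).append(sample)
    return {label: centroid(samples) for label, samples in by_label.items()}


def predict_label(centroids: dict[str, Sample], sample: Sample) -> str:
    return min(centroids, key=lambda label: dist(centroids[label], sample))


# Unsupervised: no labels, so learn what the data as a whole looks like
# and report how different a new sample is from it.
def fit_unsupervised(unlabelled: list[Sample]) -> tuple[Sample, float]:
    centre = centroid(unlabelled)
    typical_distance = mean(dist(centre, s) for s in unlabelled)
    return centre, typical_distance


def dissimilarity(model: tuple[Sample, float], sample: Sample) -> float:
    centre, typical_distance = model
    return dist(centre, sample) / typical_distance


labelled = [((5, 0.1), "human"), ((7, 0.0), "human"),
            ((90, 0.9), "bot"), ((110, 0.95), "bot")]
centroids = fit_supervised(labelled)
print(predict_label(centroids, (80, 0.8)))  # bot

unlabelled = [(5, 0.1), (7, 0.0), (6, 0.05), (4, 0.1)]
baseline = fit_unsupervised(unlabelled)
print(dissimilarity(baseline, (95, 0.9)) > dissimilarity(baseline, (6, 0.05)))  # True
```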
WHY MACHINE LEARNING?
⁄ What these actors are trying to achieve is non-standard
⁄ Focus on behaviour and intent
⁄ Bots will not interact with the site like other users do
⁄ Data around how users usually interact with applications…
⁄ …can be used to highlight non-standard activity
⁄ Generalisation
⁄ Not hand-crafted
⁄ Not tuned to specific attacks or actors
⁄ Adaptable
⁄ To the threat landscape
⁄ To the business's appetite for risk
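One way to picture "data around how users usually interact" is a table of page-to-page transitions learned from ordinary sessions, against which a new session can be scored. The sketch below assumes a first-order transition model with add-one smoothing. The page names and sessions are invented, and this is not presented as the method used by any particular product.

```python
from collections import Counter, defaultdict
from math import log


def learn_transitions(sessions: list[list[str]]) -> dict[str, Counter]:
    """Count how often users move from one page to the next."""
    counts: dict[str, Counter] = defaultdict(Counter)
    for pages in sessions:
        for current, following in zip(pages, pages[1:]):
            counts[current][following] += 1
    return counts


def surprise(counts: dict[str, Counter], pages: list[str]) -> float:
    """Average negative log-probability of a session's transitions.

    Add-one smoothing keeps unseen transitions finite; higher means less usual.
    """
    vocabulary = {page for current, nxt in counts.items() for page in (current, *nxt)}
    steps = list(zip(pages, pages[1:]))
    if not steps:
        return 0.0
    total = 0.0
    for current, following in steps:
        seen = counts.get(current, Counter())
        prob = (seen[following] + 1) / (sum(seen.values()) + len(vocabulary))
        total -= log(prob)
    return total / len(steps)


normal_sessions = [
    ["home", "search", "product", "basket", "checkout"],
    ["home", "product", "basket", "checkout"],
    ["home", "search", "product", "search", "product"],
    ["home", "login", "account"],
]
counts = learn_transitions(normal_sessions)

typical = ["home", "search", "product", "basket"]
repetitive = ["login", "login", "login", "login"]  # e.g. repeated login attempts
print(surprise(counts, repetitive) > surprise(counts, typical))  # True
```

Nothing here names a specific attack: the repetitive session scores higher only because it differs from how the recorded users moved through the site.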
A BOT WILL ALWAYS BE A BOT
THE NETACEA APPROACH
⁄ Focus on interactions with the API
⁄ These actions have to be carried out to get what the attacker wants
⁄ Identify patterns in live traffic that point to automation
⁄ Device/client verification
⁄ One potential signal amongst many
THE NETACEA APPROACH
⁄ Holistic view of traffic
⁄ Monitor trends and patterns across whole estate…
⁄ …not just at an individual level
⁄ Model user behaviour
⁄ API interactions
⁄ Standard versus non-standard
⁄ Similarities/differences
⁄ Unsupervised
⁄ What does ‘normal’ look like?
⁄ What groups of user behaviours are there?
⁄ Supervised
⁄ Previously seen attack patterns
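A short sketch of why the estate-wide view matters. The event shape, the `/login` path and the numbers are invented for illustration. In this made-up distributed attack no single client looks unusual, yet the share of failed logins across the whole estate is very high.

```python
from collections import Counter

Event = tuple[str, str, int]  # (client id, path, HTTP status); a hypothetical log shape


def max_failures_per_client(events: list[Event]) -> int:
    """Individual view: the most failed logins seen from any single client."""
    failures = Counter(
        client for client, path, status in events
        if path == "/login" and status == 401
    )
    return max(failures.values(), default=0)


def estate_failure_ratio(events: list[Event]) -> float:
    """Estate-wide view: share of all login attempts that failed."""
    logins = [status for _, path, status in events if path == "/login"]
    if not logins:
        return 0.0
    return sum(status == 401 for status in logins) / len(logins)


# 50 distinct clients, two failed logins each, hidden among 20 genuine logins.
attack = [(f"bot-{i}", "/login", 401) for i in range(50) for _ in range(2)]
genuine = [(f"user-{i}", "/login", 200) for i in range(20)]
events = attack + genuine

print(max_failures_per_client(events))          # 2
print(round(estate_failure_ratio(events), 2))   # 0.83
```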
DATA PIPELINE
[Diagram: Client browser → HTTP requests → Web server → Real-time data streaming → Feature extraction → Supervised/unsupervised models → Near real-time threat scores & recommendations. External knowledge sources also feed into the pipeline.]
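The feature-extraction and scoring stages of the pipeline might look roughly like the sketch below. Everything in it is an assumption made for illustration: the request fields, the three features, and especially `threat_score`, which is a hand-picked stand-in for a trained model so that the example runs end to end.

```python
from collections import defaultdict
from typing import Iterable

Request = dict  # hypothetical parsed request: {"client": str, "path": str, "status": int}


def extract_features(requests: Iterable[Request]) -> dict[str, dict[str, float]]:
    """Feature extraction: aggregate raw requests into per-client features."""
    raw = defaultdict(lambda: {"requests": 0, "failures": 0, "paths": set()})
    for request in requests:
        entry = raw[request["client"]]
        entry["requests"] += 1
        entry["failures"] += request["status"] >= 400
        entry["paths"].add(request["path"])
    return {
        client: {
            "requests": entry["requests"],
            "failure_rate": entry["failures"] / entry["requests"],
            "distinct_paths": len(entry["paths"]),
        }
        for client, entry in raw.items()
    }


def threat_score(features: dict[str, float]) -> float:
    """Stand-in for the model stage; the weights are arbitrary, not learned."""
    volume = min(features["requests"] / 100, 1.0)
    repetition = 1.0 / features["distinct_paths"]
    return 0.4 * volume + 0.4 * features["failure_rate"] + 0.2 * repetition


# A made-up stream: one client hammering /login, one browsing normally.
stream = [{"client": "a", "path": "/login", "status": 401} for _ in range(80)]
stream += [{"client": "b", "path": p, "status": 200}
           for p in ("/", "/search", "/product/1", "/basket")]

scores = {client: threat_score(f) for client, f in extract_features(stream).items()}
print(sorted(scores, key=scores.get, reverse=True))  # ['a', 'b']
```

In a streaming deployment the aggregation would run over a time window rather than a complete list, and the scoring function would be replaced by the supervised and unsupervised models.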
TRANSPARENCY THROUGH INTELLIGENCE
PICTURE THIS…
⁄ You have full visibility of all traffic to your website, mobile apps and APIs.
⁄ You can differentiate between human and non-human activity.
⁄ You are able to make informed decisions based on intelligence and context.
⁄ Genuine users always have a frictionless experience.
⁄ Your website login page is secure from credential stuffing attacks, giving your customers peace of mind.
⁄ Your online reputation is protected.
THANK YOU