
An Artificial Coevolutionary Framework for Adversarial AI

Jamal Toutouh

[email protected]

ALFA: http://groups.csail.mit.edu/ALFA

ALFA Group 2018-2019

Members: Erik Hemberg, Nicole Hoffman, Abdullah Al-Dujaili, Una-May O’Reilly, Shashank Srikant, Jamal Toutouh, Andrew Zhang, Linda Zhang, Michal Shlapentokh-Rothman, Milka Piszczek, Saeyoung Rho (TPP), Jonathan Kelly, Ayesha Bajwa, Li Wang, Edilberto Amorim, Jonathan Rubin, Miguel Paredes

Role groups shown on the slide: Core Members, Research Assoc., PhD, Grad, MEng, UROP

Agenda

• Adversarial Engagements and Arms Races

• Network Security Arms Races

– RIVALS framework

» RIVALS: Robustness vs Denial

» AVAIL: Isolation vs Contagion

» DARK Horse and ADHD: Deception vs reconnaissance

» Acknowledgments:

▪ Funding

❖ Member companies supporting Cybersecurity@CSAIL

❖ DARPA XD3

❖ MIT Lincoln Labs

▪ Work

❖ Members of the ALFA group and collaborators

LEARNING

LEARNING

RIVALS

RIVALS helps the defense anticipate attack strategies given a defensive configuration (and mission).

RIVALS helps the defense consider arms races and design effective courses of action so that the network is resilient.

ENGAGEMENT

ENVIRONMENT

Advanced Persistent Threat Kill Chain

Source: http://drshem.com/wp-content/uploads/2016/01/APT-Life-Cycle-Expanded.png (from Intel Security)

Deceptive Defense With Honeypots

[Figure panels. DeceptionSDN: SDN Controller, Network view generator, Flow Rule Generator, Deception Server (configures the deceptive virtual network, analyzes flow statistics, manipulates virtual network traffic, simulates virtual network resources). DarkHorse: Coevolutionary Algorithm over attacker configurations and defender configurations, with a Fitness Function (time to detect first scan, number of scans detected, number of real nodes detected) returning a fitness score.]

Fig. 1: Overview of the SDN system with deception, named DeceptiveSDN, and DarkHorse. The two systems recreate the evolving competition between scans and deceptive views.

2) We analyze the optimized configurations for both defenders and attackers in a deceptive network, either against static adversaries or when coevolving.

3) Our experiments show that when one side adapts against a static adversary, crowding real nodes onto one subnet is either the optimized configuration (when the defender adapts) or the worst configuration (when the attacker adapts).

4) Our experiments with coevolving adversaries indicate sensitivity to the rates of evolution. When the attacker evolves more (i.e., the defender is temporarily static), the attacker’s optimized scanning batch sizes are small and the defender’s best response is to distribute real nodes randomly. Conversely, when the defender evolves more (i.e., the attacker is temporarily static), the attacker’s optimized scanning batch size is 20 times larger and the defender’s best response is to distribute real nodes evenly rather than randomly.

Achleitner’s deception system [1], which we call DeceptiveSDN, is described in Section II. Our threat model and AI system, named DarkHorse, are described in Section III. We analyze the experimental results in Section IV. Finally, Section V summarizes the results and future work.

II. BACKGROUND

Two examples of research on deception are [5] and [1]. Passive deception works by deploying static decoy systems such as honeypots [8], [16], [12]. Honeypots appear real to other nodes on the network, but do not exist in the underlying network. They do not contain any live data or information, but they can contain false information that makes them appear to be high-value targets. They can be configured to prevent the intruder from accessing network enclaves, to gather information, and to signal unauthorized use (see Figure 2).

Fig. 2: Example of a virtual network view showing the placement of nodes and honeypots in order to delay adversaries from identifying real nodes

Achleitner’s deception system [1], which we call DeceptiveSDN, defends against network reconnaissance by simulating virtual network topologies and using passive deception. It relies upon SDN flow tables and the SDN separation of controller and switch. These support agile distribution of virtual, imposter networks that overlay the true one. Each time a node on the network requests a DHCP lease, it is internally assigned a unique deceptive network view. A network view generator sets up these deceptive views so that the overall address space appears falsely large. It places real nodes on virtual subnets to increase the time it takes a scanner to find them, and it inserts honeypots that it can monitor for illicit activity. Using the dynamic address translation provided by the deception server, only one real node is used, yet it is simulated to appear as many nodes dispersed throughout the network. To ensure accurate routing, a server handles dynamic address translation between the overlay network and the true network addresses by rewriting the headers in “real time”. Packets are transferred to it through flow table logic. Virtual routers simulate virtual paths so that the scan attack cannot infer the actual topology. [1] demonstrates the system against a variety of scan attacks.
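The placement step can be illustrated with a minimal sketch. It is not Achleitner's generator: the function names, the host labels, and the three strategy names ("even", "random", "crowded") are assumptions chosen to mirror the distributions discussed in this document.

```python
import random

def build_view(num_subnets, real_nodes, honeypots_per_subnet, strategy, seed=0):
    """Return {subnet_index: [host labels]} for one hypothetical deceptive view."""
    rng = random.Random(seed)
    # Every virtual subnet is padded with honeypots so the address space looks larger.
    view = {
        s: [f"honeypot-{s}-{i}" for i in range(honeypots_per_subnet)]
        for s in range(num_subnets)
    }
    for k, node in enumerate(real_nodes):
        if strategy == "even":
            subnet = k % num_subnets          # round-robin across subnets
        elif strategy == "random":
            subnet = rng.randrange(num_subnets)
        elif strategy == "crowded":
            subnet = 0                        # all real nodes on one subnet
        else:
            raise ValueError(f"unknown strategy: {strategy}")
        view[subnet].append(node)
    return view

def real_per_subnet(view):
    """Count the real (non-honeypot) hosts on each subnet, in subnet order."""
    return [
        sum(1 for host in hosts if not host.startswith("honeypot-"))
        for _, hosts in sorted(view.items())
    ]
```

Calling `real_per_subnet(build_view(4, reals, 3, "even"))` with eight real nodes gives two per subnet, while "crowded" puts all eight on subnet 0.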

As we will detail in the next section, the defensive network views and scanning attacks of DeceptiveSDN comprise strategic, parametrically configurable, multi-dimensional spaces. DeceptiveSDN serves as an example of a system where it is unknown how to optimally set up an attack or defense configuration, because the best choice depends on the adversary. (Note that from now on we refer to the virtual network unless explicitly stated.) DarkHorse utilizes DeceptiveSDN as an experimental platform. DeceptiveSDN allows DarkHorse to measure the outcome of an “adversarial engagement” between an attack and a defense, each specified by some set of configuration values. The outcome informs the genetic algorithm (GA) about relatively better configurations, given the adversaries’ conflicting objectives. We next succinctly describe our threat model and DarkHorse.
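A minimal sketch of how engagement outcomes could drive two coevolving populations is given here. It is not the DarkHorse implementation: `toy_engagement` is a hypothetical stand-in for a DeceptiveSDN run, and the configuration keys (`batch_size`, `honeypots`), the value ranges, and the selection scheme are assumptions. The `attacker_steps` and `defender_steps` parameters let one side evolve more often than the other.

```python
import random

def toy_engagement(attacker, defender):
    """Hypothetical engagement: attacker payoff in [0, 1], defender gets the complement."""
    risk = min(1.0, attacker["batch_size"] * defender["honeypots"] / 1000.0)
    return 1.0 - risk

def mean(values):
    values = list(values)
    return sum(values) / len(values)

def mutate(cfg, key, low, high, rng):
    """Copy a configuration and nudge one integer parameter, clamped to [low, high]."""
    child = dict(cfg)
    child[key] = max(low, min(high, cfg[key] + rng.randint(-5, 5)))
    return child

def evolve(pop, score, key, low, high, rng):
    """One generation: keep the better half, refill with mutated survivors."""
    survivors = sorted(pop, key=score, reverse=True)[: len(pop) // 2]
    children = [
        mutate(rng.choice(survivors), key, low, high, rng)
        for _ in range(len(pop) - len(survivors))
    ]
    return survivors + children

def coevolve(generations=10, attacker_steps=1, defender_steps=1, pop_size=6, seed=0):
    rng = random.Random(seed)
    attackers = [{"batch_size": rng.randint(1, 100)} for _ in range(pop_size)]
    defenders = [{"honeypots": rng.randint(1, 50)} for _ in range(pop_size)]
    for _ in range(generations):
        for _ in range(attacker_steps):   # attacker adapts, defender held fixed
            attackers = evolve(
                attackers,
                lambda a: mean(toy_engagement(a, d) for d in defenders),
                "batch_size", 1, 100, rng,
            )
        for _ in range(defender_steps):   # defender adapts, attacker held fixed
            defenders = evolve(
                defenders,
                lambda d: mean(1.0 - toy_engagement(a, d) for a in attackers),
                "honeypots", 1, 50, rng,
            )
    return attackers, defenders
```

Setting `defender_steps=0` yields the static-defender case, and a larger `attacker_steps` lets the attacker evolve more between defender updates.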

III. METHOD

Threat Model: DarkHorse assumes that a node on a network has been compromised, and that an attacker is scanning the SDN from it in an attempt to identify vulnerable nodes. The attacker’s goals are to evade detection while scanning over as much of the network as possible and discovering the real



Achleitner et al.

Engagement

CONFLICTING OBJECTIVES

MEASUREMENTS

- d: time for the defense to detect a scan (sec.)

- t: time to run the scan (sec.)

- n: number of scan detections by the defender

- h/H: ratio of real nodes that were discovered to total real nodes

Evaluate using Mininet
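One way the measurements d, t, n and h/H might be combined into conflicting fitness scores is sketched here. The slides do not give the formula, so the weighted sum, the weight names, their default values, and the zero-sum defender score are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class Measurements:
    d: float  # time for the defense to detect a scan (sec.)
    t: float  # time to run the scan (sec.)
    n: int    # number of scan detections by the defender
    h: int    # real nodes discovered
    H: int    # total real nodes

def attacker_fitness(m, w_found=1.0, w_detect=1.0, w_stealth=0.5):
    """Hypothetical weighted sum: reward discovery and late detection, penalize detections."""
    found = m.h / m.H if m.H else 0.0
    stealth = min(m.d / m.t, 1.0) if m.t > 0 else 0.0
    return w_found * found + w_stealth * stealth - w_detect * m.n

def defender_fitness(m, **weights):
    """Assumed zero-sum: the defender's score is the negated attacker score."""
    return -attacker_fitness(m, **weights)
```

With these assumed defaults, a scan that finds every real node and is never detected scores 1.5 for the attacker, while one that finds nothing and is detected twice scores -2.0.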


Adversarial Behaviors

APT SCANNING TACTICS

SDN DECEPTION SYSTEM PARAMETERS

Defensive Learning


Static Attack – Optimized Defense

Hypothesis: A good defense has more honeypots, subnets and real hosts, with an even distribution

Results:

- Smaller NMAP batch sizes are more difficult to detect

- The fitness function rewards discovering more real hosts less than it penalizes being detected, so smaller scans do better

- Defending against an attacker that scans with local preference is the most difficult

- Attackers are expected to start by scanning their local subnet

Possible recommendation: create subnets for DHCP leases, with real hosts in a different subnet

Attack Learning


Static Defense – Optimized Attack

Results:

- Difficult to attack when there are many honeypots and subnets.

- Easier with a crowded distribution of real hosts: large reward when that subnet is scanned (and similarly for the defender when it is avoided)

This points to adopting a high-risk, high-reward tactic

Coevolutionary Arms Race


Coevolution of Scanning and Deception

[Plot: Best Fitness, Lockstep Coevolution. Y-axis: Fitness. Tick labels: A0 to A19 and D0 to D4.]

Defender does significantly worse vs an evolving attacker, thus beware of static assumptions

Evolved for defense & attack

From Biological Coevolution Towards Adversarial AI Via Artificial Coevolution

Wikimedia Commons: Cuttlefish changing color. URL https://upload.wikimedia.org/wikipedia/commons/thumb/1/1c/Cuttlefish_color.jpg/636px-Cuttlefish_color.jpg. Picture taken by Nick Hobgood - License: CC BY-SA 3.0

Wikimedia Commons: Misumena vatia with wasp (1998). URL https://en.wikipedia.org/wiki/File:Misumena.vatia.beute.wespe.1771.jpg#filelinks. Picture taken by Olaf Leillinger - License: CC-BY-SA-2.0/DE and GNU FDL

Wikimedia Commons: Viceroy butterfly (2005). URL https://commons.wikimedia.org/wiki/File:Viceroy_Butterfly.jpg. License: CC BY-SA 3.0. Subject to disclaimers.

Wikimedia Commons: Monarch in may (2007). URL https://en.wikipedia.org/wiki/Monarch_butterfly#/media/File:Monarch_In_May.jpg. Created: 29 May 2007 By Kenneth Dwain Harrelson - License: CC BY-SA 3.0

• Biological arms races can provide adaptation

• Can coevolution help to improve robustness in other adversarial settings?

• Multiple comparisons can aid robustness and improve diversity

• Help to anticipate

• Replay the arms-race

Use Case

RIVALS

DarkHorse

AVAIL

Engagement

Environment

Evaluation

Objective

- MinMax

- Mean Utility

- Pareto
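The three evaluation objectives can be illustrated on a small payoff table. This is a sketch under assumptions: `payoff[i][j]` is taken to be the utility of defender solution i against attacker j (higher is better), and the function names are hypothetical rather than part of the framework.

```python
def best_worst_case(payoff):
    """MinMax-style choice: the row whose worst outcome is the best."""
    return max(range(len(payoff)), key=lambda i: min(payoff[i]))

def best_mean_utility(payoff):
    """Mean-utility choice: the row with the highest average outcome."""
    return max(range(len(payoff)), key=lambda i: sum(payoff[i]) / len(payoff[i]))

def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(payoff):
    """Indices of rows that no other row dominates."""
    return [
        i for i, row in enumerate(payoff)
        if not any(dominates(other, row) for other in payoff)
    ]

# Illustrative numbers only, not experimental results.
payoff = [
    [0.9, 0.1, 0.5],
    [0.4, 0.4, 0.4],
    [0.8, 0.3, 0.7],
    [0.3, 0.3, 0.3],
]
```

On this table the objectives disagree: the worst-case choice is row 1, the mean-utility choice is row 2, and the Pareto front keeps rows 0, 1 and 2.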

Attacker Behavior

Parameters

Objectives

Threat

Environment

Attacker Population

Defender Population

Defender Behavior

Parameters

Fitness Evaluation

Measurements

Adversarial AI Framework Concept

Compendium

Cache

Learning Variants

- Coevolution

- Non-negative matrix factorization

- Gaussian processes

- Ensembles

DDoS Network Defense

https://proxy.duckduckgo.com/iu/?u=https%3A%2F%2Fcdn-images-1.medium.com%2Fmax%2F1600%2F1*8gG2DMHmnGJBIyDtR79rjA.png&f=1

RIVALS: Network Routing Problem

DEFENDER ACTIONS

link flooding

shortest path

CHORD

FITNESS

mission disruption

attacks in number and duration

Defender Objective: maximize

Attacker Objective: maximize

ATTACKER ACTIONS

node, start time, end time

complete loss of node

FITNESS

mission completion

time and hops
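The interaction between node outages and a hop-based mission measure can be sketched as follows. This is not the ALFA simulator: the breadth-first search, the half-open outage interval, and the example topology are assumptions. Each attack is written as (node, start time, end time), and a node under attack is treated as completely lost.

```python
from collections import deque

def nodes_down_at(attacks, time):
    """Nodes lost at `time`, given attacks as (node, start_time, end_time) tuples."""
    return frozenset(node for node, start, end in attacks if start <= time < end)

def hops(graph, src, dst, down=frozenset()):
    """Fewest hops from src to dst avoiding lost nodes; None if unreachable."""
    if src in down or dst in down:
        return None
    seen, queue = {src}, deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt in graph[node]:
            if nxt not in seen and nxt not in down:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

# Hypothetical five-node topology with two routes from "a" to "d".
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "e"],
    "d": ["b", "e"],
    "e": ["c", "d"],
}
attacks = [("b", 5, 10)]
```

While "b" is down (times 5 up to, but not including, 10), the route from "a" to "d" grows from 2 hops to 3, which is the kind of mission cost a hop-based fitness would register.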

STRATEGY ADAPTATION

&

COEVOLUTION

OF

DDOS DEFENSE & ATTACKS

PEER TO PEER OVERLAY

NETWORK

TESTBED

DOS Attack Campaign

Defense Strategy

Mission Progress

Network Metrics

ALFA Sim of P2P

Fig. 9: Results from a Coev run for network routing on logical topology on network topology 0. Top: Median and best fitness results for attacker population over 20 generations. Bottom: Median and best fitness results for defender population over 20 generations.

Defender finds an optimal solution

Network Segmentation

The Definitive Guide to Micro-Segmentation, John Friedman, CyberEdge Group

AVAIL: Enclaves vs Contagion

DEFENDER ACTIONS

set tap sensitivity and size

for each enclave

FITNESS

mission delay

budget remaining

OBJECTIVE: minmax

ATTACKER ACTIONS

set strength and duration of attack

for each enclave

FITNESS

mission delay

budget remaining

STRATEGY ADAPTATION

&

COEVOLUTION

OF

DEFENSE & ATTACKS

CONTAGION &

DETECTION

SIMULATOR

Attack: strength, duration

Defense: per enclave, #devices, tap sensitivity

Mission delay

Budget use

Realistic parameters from a

validated low level emulator

Evolve Defense

Defender learns sound network segmentation practices over generations


Compendium Analysis

Cache

SUBSET

SELECTION

Then

ALL VS ALL

ENGAGEMENTS

WITHIN SIDE

COMPARISON

COMPENDIUM SOLUTIONS

SELECTION

VISUALIZATIONS
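The all-vs-all step with a cache, followed by a within-side comparison, can be sketched as follows. The function names, the toy `engage` function, and the ranking by mean score are assumptions. The defender ranking additionally assumes a zero-sum engagement in which the defender's value is the negated attacker score.

```python
from itertools import product

def all_vs_all(attackers, defenders, engage, cache=None):
    """Score every attacker against every defender, reusing cached engagements."""
    cache = {} if cache is None else cache
    for pair in product(attackers, defenders):
        if pair not in cache:
            cache[pair] = engage(*pair)   # only run engagements not seen before
    return {pair: cache[pair] for pair in product(attackers, defenders)}

def rank_side(results, side):
    """Rank one side's solutions by mean score across all of its engagements."""
    idx = 0 if side == "attacker" else 1
    sign = 1.0 if side == "attacker" else -1.0   # assumed zero-sum
    per_solution = {}
    for pair, score in results.items():
        per_solution.setdefault(pair[idx], []).append(sign * score)
    means = {sol: sum(vals) / len(vals) for sol, vals in per_solution.items()}
    return sorted(means, key=means.get, reverse=True)

calls = []

def engage(attacker, defender):
    """Toy engagement (illustrative only): attacker score is a simple difference."""
    calls.append((attacker, defender))
    return attacker - defender

cache = {}
results = all_vs_all([1, 5, 10], [2, 4], engage, cache)
```

A second call with the same `cache` runs no new engagements, which matters when each engagement is an expensive emulation.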

Use Case

Attack Campaign Performance Comparison

Different metrics and Ranking Schemes

Attack Campaign Similarity

Summary & Future Work

• Adversarial Engagements and Arms Races

• Network Security Arms Races

– RIVALS Adversarial AI framework

» RIVALS: Robustness vs Denial

» AVAIL: Isolation vs Contagion

» DARK Horse and ADHD: Deception vs reconnaissance

• Future

– Validate, refine, and extend use cases
