Towards Scalable Cluster Auditing through Grammatical Inference over Provenance Graphs Wajih Ul Hassan , Mark Lemay, Nuraini Aguse, Adam Bates, Thomas Moyer NDSS Symposium 2018 Feb 20, 2018


Towards Scalable Cluster Auditing through Grammatical Inference over Provenance Graphs

Wajih Ul Hassan, Mark Lemay, Nuraini Aguse, Adam Bates, Thomas Moyer

NDSS Symposium 2018 Feb 20, 2018

Notable Data Breach in 2017

Equifax Data Breach Timeline 2017

[Figure: timeline from April to October 2017 marking four events: Breach Detected, Hackers in Equifax Servers, Patched, Breach Announced.]

3 months of crucial attack audit logs

Are current auditing systems scalable?

Data Provenance aka Audit Log

• Lineage of system activities
• Represented as a Directed Acyclic Graph (DAG)
• Used for forensic analysis

Code execution:

Bash: exec("./NGINX"); NGINX: recv(…, "abc.com"); fread("index.html");

Audit log:

1. Bash, Spawns NGINX
2. NGINX, Receives from abc.com
3. NGINX, Reads File index.html
4. …

[Figure: provenance graph with vertices Bash, NGINX, abc.com and index.html.]
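The step from audit log to provenance graph on this slide can be sketched in a few lines of Python. This is only an illustration: the `(subject, action, object)` record format, the action names and the choice to point edges from an entity back to what it depends on are assumptions, not the format used by any particular audit system.

```python
# Hypothetical audit records as (subject, action, object) triples,
# mirroring the three log entries on the slide.
AUDIT_LOG = [
    ("Bash", "spawns", "NGINX"),
    ("NGINX", "receives_from", "abc.com"),
    ("NGINX", "reads_file", "index.html"),
]


def build_provenance_graph(records):
    """Map each vertex to a list of (action, dependency) edges."""
    graph = {}
    for subject, action, obj in records:
        graph.setdefault(subject, [])
        graph.setdefault(obj, [])
        if action == "spawns":
            # The spawned process depends on its parent.
            graph[obj].append((action, subject))
        else:
            # The process depends on the data it received or read.
            graph[subject].append((action, obj))
    return graph


def lineage(graph, vertex):
    """Collect every vertex that `vertex` transitively depends on."""
    seen = []
    stack = [vertex]
    while stack:
        current = stack.pop()
        for _action, dependency in graph[current]:
            if dependency not in seen:
                seen.append(dependency)
                stack.append(dependency)
    return sorted(seen)


graph = build_provenance_graph(AUDIT_LOG)
print(lineage(graph, "NGINX"))
```

Asking for the lineage of `NGINX` walks the graph backwards, which is the kind of query a forensic analyst would run after an incident.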

Data Provenance in a Cluster

[Figure: cluster with several worker nodes and one master node.]

Centralized auditing is not practical due to two limitations

Limitation #1: Graph Complexity

• NGINX and MySQL running for 5 mins on a single machine

[Figure: the resulting provenance graph.]

A needle-in-a-haystack problem

Limitation #2: Storage Overhead

• Leads to network overhead as logs are transferred to the master node

[Figure: audit log size growth for a single NGINX server. x-axis: Day 1 to Day 5; y-axis: log size (GB), 0 to 14; series: Uncompressed, Compressed.]

2.54 GB/day on a single machine

With compression > 1 GB/day
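To see how quickly these per-machine rates add up, the following sketch projects them linearly over time and across machines. The linear growth, the 90-day retention window and the 10-node cluster are assumptions chosen for illustration; only the two rates come from the slide.

```python
# Per-machine rates quoted on the slide, in GB per day.
UNCOMPRESSED_GB_PER_DAY = 2.54
COMPRESSED_GB_PER_DAY = 1.0  # the slide says "> 1 GB/day", so this is a lower bound


def projected_log_size_gb(rate_gb_per_day, days, nodes=1):
    """Linear projection: rate x days x nodes."""
    return rate_gb_per_day * days * nodes


# Hypothetical scenario: logs retained for 90 days, on 1 and on 10 nodes.
per_node = projected_log_size_gb(UNCOMPRESSED_GB_PER_DAY, days=90)
cluster = projected_log_size_gb(UNCOMPRESSED_GB_PER_DAY, days=90, nodes=10)
compressed_cluster = projected_log_size_gb(COMPRESSED_GB_PER_DAY, days=90, nodes=10)

print(f"one node, 90 days, uncompressed: {per_node:.1f} GB")
print(f"10 nodes, 90 days, uncompressed: {cluster:.1f} GB")
print(f"10 nodes, 90 days, compressed (lower bound): {compressed_cluster:.1f} GB")
```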

Winnower

• Cluster applications are replicated in accordance with microservice architecture principles
• Replicated apps produce highly homogeneous provenance graphs
  • Core execution behaviour is similar

Key idea: remove redundancy from provenance graphs across the cluster before sending them to the master node

Master Node View with Winnower

[Figure: master node view before and after Winnower. Vertex labels shown: Bash, NGINX, mysql, mysqld, /up/*, /db/*, * and other library file vertices.]

Winnower

• Build a consensus model across the cluster using graph grammars
  • Like string grammars, graph grammars provide rule-based mechanisms for generating, manipulating and analyzing graphs
  • Induction: produce a grammar from a given graph
  • Parsing: test whether a given graph is a member of a grammar

[Figure: an example graph with vertices labelled a, t, b and e, next to its graph grammar. Rules legible in the figure: S ≔ A T, A ≔ a B, S ≔ e; the rules for T and B involve the terminals t and b.]

Architecture

[Figure: on each worker node, the Audit Module produces a provenance graph and the Winnower Agent fetches the graph at each epoch. Graph Abstraction turns the fine-grained graph into an abstracted graph, and Graph Induction turns the abstracted graph into a model graph. The Model Aggregator on the master node receives model graphs/grammars from the cluster.]

Architecture

[Figure: the same pipeline at the next epoch. The Winnower Agent fetches the graph at the next epoch, runs Graph Abstraction and Graph Induction, and only sends model updates to the aggregated model on the master node.]
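The "only send model updates" step in the architecture above can be sketched as follows. Here the model is simply a set of labelled edges and the membership check is a set difference; that is a deliberate simplification standing in for the graph grammar, and the function names are hypothetical.

```python
def membership_check(model_edges, epoch_edges):
    """Return the edges of this epoch's graph that the model does not cover."""
    return set(epoch_edges) - set(model_edges)


def process_epoch(model_edges, epoch_edges):
    """Update the local model in place and return only the update to send."""
    update = membership_check(model_edges, epoch_edges)
    model_edges |= update
    return update


model = set()
epoch_1 = [("ftp", "ftp listener"), ("ftp listener", "ftp worker")]
epoch_2 = epoch_1 + [("ftp worker", "/up/*")]

sent_1 = process_epoch(model, epoch_1)  # first epoch: the whole model is new
sent_2 = process_epoch(model, epoch_2)  # later epoch: only the new edge is sent
sent_3 = process_epoch(model, epoch_2)  # unchanged behaviour: nothing to send

print(len(sent_1), len(sent_2), len(sent_3))
```

When behaviour repeats from one epoch to the next, the worker has nothing new to report, which is where the network and storage savings would come from.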

Architecture

[Figure: the master node queries part of the provenance graph, and the worker node returns the high-fidelity provenance graph.]

Provenance Graph Abstraction

• The graph induction process builds a model/grammar that concisely describes the whole graph
• However, instance-specific fields frustrate any attempt to build a generic application behaviour model

[Figure: provenance graphs from Node 1 and Node 2 passed to graph induction. Vertex labels shown: ftp pid:2788, ftp pid:2780, ftp listener pid:2789, ftp listener pid:2791, ftp worker pid:2795, ftp worker pid:2797, 192.168.0.1, 192.168.0.2, /up/File1 Inode:3, /up/File2 Inode:5.]

No general model, as instance-specific information such as the PID differs among graphs

Provenance Graph Abstraction

• Provenance graph vertices have well-defined fields
  • E.g. pid:1234, FilePath:/etc/ld.so
• Rules that remove or generalize these fields are defined manually

[Figure: graph abstraction applied to the Node 1 and Node 2 graphs. Before: ftp pid:2788, ftp pid:2780, ftp listener pid:2789, ftp listener pid:2791, ftp worker pid:2795, ftp worker pid:2797, 192.168.0.1, 192.168.0.2, /up/File1 Inode:3, /up/File2 Inode:5. After, on both nodes: ftp, ftp listener, ftp worker, 192.168.0.0/24, /up/*.]
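A minimal sketch of such manually defined rules, written against the label formats shown on this slide. The three rules, the function name `abstract_label` and the assignment of labels to the two nodes are assumptions for illustration; the slides do not list Winnower's actual rule set.

```python
import ipaddress
import posixpath
import re


def abstract_label(label):
    """Apply three hypothetical abstraction rules to one vertex label."""
    # Rule 1: drop instance-specific fields such as pid and inode.
    label = re.sub(r"\s*\b(pid|inode):\d+", "", label, flags=re.IGNORECASE).strip()
    # Rule 2: generalize an IPv4 address to its /24 network.
    try:
        ipaddress.IPv4Address(label)
    except ValueError:
        pass
    else:
        return str(ipaddress.ip_network(f"{label}/24", strict=False))
    # Rule 3: generalize a file path to its directory plus a wildcard.
    if label.startswith("/"):
        return posixpath.dirname(label).rstrip("/") + "/*"
    return label


# Illustrative grouping of the slide's labels into two nodes.
node_1 = ["ftp pid:2788", "ftp listener pid:2789", "192.168.0.1", "/up/File1 Inode:3"]
node_2 = ["ftp pid:2780", "ftp listener pid:2791", "192.168.0.2", "/up/File2 Inode:5"]

abstracted_1 = [abstract_label(v) for v in node_1]
abstracted_2 = [abstract_label(v) for v in node_2]
print(abstracted_1)
print(abstracted_1 == abstracted_2)
```

After abstraction the two nodes carry identical labels, which is the precondition for building one shared model.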

Provenance Graph Induction

• Deterministic Finite Automata (DFA) learning is used to generate the grammar
  • Encodes the causality in generated models
  • In DFA learning, the present state of a vertex includes the path taken to reach the vertex (provenance ancestry)
  • Winnower extends it to remember descendants (provenance progeny)
• The state of each vertex consists of three items:
  1. Label
  2. Provenance ancestry
  3. Provenance progeny

[Figure: example graph with vertices Bash, gzip and File1.txt, marking the ancestry and the progeny of the gzip vertex.]
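The three-part vertex state can be illustrated with a small sketch. It assumes edges point in the direction of flow and summarizes ancestry and progeny as sorted tuples of labels, which is a simplification of "the path taken to reach the vertex"; the example edges are hypothetical, since the figure's exact structure is not recoverable here.

```python
def reachable(adjacency, start):
    """Return every vertex reachable from `start` by following `adjacency`."""
    seen = set()
    stack = [start]
    while stack:
        for nxt in adjacency.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def vertex_states(edges, labels):
    """Compute (label, ancestry, progeny) for every vertex."""
    forward, backward = {}, {}
    for src, dst in edges:
        forward.setdefault(src, []).append(dst)
        backward.setdefault(dst, []).append(src)
    states = {}
    for vertex, label in labels.items():
        ancestry = tuple(sorted(labels[v] for v in reachable(backward, vertex)))
        progeny = tuple(sorted(labels[v] for v in reachable(forward, vertex)))
        states[vertex] = (label, ancestry, progeny)
    return states


# Hypothetical graph: Bash and an input file flow into gzip, gzip writes a file.
labels = {1: "File1.txt", 2: "Bash", 3: "gzip", 4: "File1.txt"}
edges = [(2, 3), (1, 3), (3, 4)]

states = vertex_states(edges, labels)
print(states[3])
```

Two vertices with the same label but different ancestry or progeny end up with different states, so they would not be treated as the same behaviour.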

Provenance Graph Induction

• Finds repetitive patterns using standard implicit and explicit state merging algorithms
  • Implicit state merging combines two subgraphs if the states of each vertex are the same in both subgraphs

[Figure: graph induction merges the abstracted Node 1 and Node 2 graphs (vertices ftp, ftp listener, ftp worker, 192.168.0.0/24, /up/*) into a single model graph. Legend: confidence level 2.]
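The effect of merging identical subgraphs and recording a confidence level can be sketched as below. Each node's abstracted graph is reduced to a set of labelled edges and the confidence of an edge is the number of nodes that produced it. This stands in for comparing full vertex states, and the edge list is a guess at the figure's structure rather than a transcription of it.

```python
from collections import Counter


def merge_graphs(graphs):
    """Merge per-node edge lists into one model with a confidence per edge."""
    confidence = Counter()
    for edges in graphs:
        # A set ensures each worker node counts at most once per edge.
        confidence.update(set(edges))
    return confidence


# Hypothetical abstracted FTP graph, identical on Node 1 and Node 2.
ftp_graph = [
    ("192.168.0.0/24", "ftp"),
    ("ftp", "ftp listener"),
    ("ftp listener", "ftp worker"),
    ("192.168.0.0/24", "ftp worker"),
    ("ftp worker", "/up/*"),
]

model = merge_graphs([ftp_graph, ftp_graph])
for edge, level in sorted(model.items()):
    print(edge, level)
```

Two identical graphs collapse into one model of the same size, with every edge at confidence level 2.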

Explicit State Merging

• At a high level, explicit state merging:
  • Picks two nodes and makes their states the same
  • Checks if the subgraph can be merged implicitly
• Consider a chained map reduce job

[Figure: Node 1 graph of a chained job with vertices java, java mapper, data and java reducer, shown before and after two nodes are merged. The merged graph uses the vertices A, B, C and D.]

Graph grammar:

:=SS:=AS:=T|V
T := A->X | A->Y
X := B->W
Y := C->W
W := D | D->S
A := data
B := java mapper
C := java reducer
D := java
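The folding effect of explicit merging on a chained job can be sketched as follows. The vertex identifiers, the two-job chain and the decision to merge purely by label are assumptions; the sketch also omits the follow-up check that the surrounding subgraph can be merged implicitly.

```python
def merge_vertices(edges, keep, drop):
    """Redirect every edge endpoint from `drop` to `keep`, removing duplicates."""
    def rename(vertex):
        return keep if vertex == drop else vertex
    return {(rename(src), rename(dst)) for src, dst in edges}


def one_job(i):
    """Edges of one map reduce job; vertices are (label, job index) pairs."""
    return [
        (("data", i), ("java mapper", i)),
        (("data", i), ("java reducer", i)),
        (("java mapper", i), ("java", i)),
        (("java reducer", i), ("java", i)),
    ]


# Two chained jobs: the output of job 1 becomes the data of job 2.
chained = set(one_job(1)) | set(one_job(2)) | {(("java", 1), ("data", 2))}

folded = chained
for label in ("data", "java mapper", "java reducer", "java"):
    folded = merge_vertices(folded, keep=(label, 1), drop=(label, 2))

print(len(chained), len(folded))
```

The second job folds onto the first, and the chaining edge becomes a loop from `java` back to `data`, which matches the recursive rule `W := D | D->S` in the grammar above.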

Provenance Graph Induction

• Consider a graph with a malicious activity
• Malicious behavior is visible in the final model

[Figure: graph induction over the graphs of Node 1, Node 2 and Node 3. All three contain ftp, ftp listener, ftp worker and /up/*; one graph additionally contains bash, wget, x.x.x.x and a malicious file. The model on the master node keeps these extra vertices. Legend: confidence level from 1 to 3.]
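One way an analyst might use the confidence levels in the final model is to list the parts seen on fewer nodes than the whole cluster. The sketch below is an assumption about usage, not a feature described on the slides, and the edge list only loosely follows the figure.

```python
def low_confidence_edges(confidence, cluster_size):
    """Return edges seen on fewer nodes than the whole cluster, rarest first."""
    rare = [(count, edge) for edge, count in confidence.items() if count < cluster_size]
    return [edge for _count, edge in sorted(rare)]


# Hypothetical model for a 3-node cluster; one node also ran bash and wget.
confidence = {
    ("ftp", "ftp listener"): 3,
    ("ftp listener", "ftp worker"): 3,
    ("ftp worker", "/up/*"): 3,
    ("ftp worker", "bash"): 1,
    ("bash", "wget"): 1,
    ("x.x.x.x", "wget"): 1,
    ("wget", "Malicious file"): 1,
}

suspicious = low_confidence_edges(confidence, cluster_size=3)
for edge in suspicious:
    print(edge)
```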

Evaluation Setup

• Setup
  • 1 VM as master node, 4 VMs as worker nodes
  • SPADE and Docker Swarm
  • Epoch size 50 sec
• Metrics
  • Storage overhead
  • Computational cost
  • Effectiveness

Storage Overhead on Master Node

[Figure: bar chart of log size in MB (axis 0 to 700) for HTTPD, ProFTPD and MySQL, comparing Raw against Winnower. Bar values shown: 485, 630 and 130; 0.11, 0.12 and 0.17.]

98.7% decrease

Storage Reduction on Master Node

• Apache web server with moderate workload
• Note the log scale on the y-axis

[Figure: log size (MB, log scale from 1 to 100000) against time (sec, 50 to 600) for Raw (Uncompressed), Raw (Compressed) and Winnower.]

7z compression is not suitable:
• No global view of the cluster
• Oblivious to the previous batch

Evaluation: Computation Cost

• Average time spent in induction and membership test at each epoch

[Figure: average time (sec, 0 to 35) against elapsed time (sec, 50 to 600) for Apache, MySQL and ProFTPD. Annotations: generate model for first time; membership check in existing model; heterogeneous workload -> updates model.]

Case Study: Ransomware Attack

• Attacker exploits a vulnerability in Redis database server versions < 3.2
• The vulnerability allows the attacker to change the SSH key and log in as root
• Attacker deletes the database and leaves a note, written using vim, asking for bitcoins to get the database back

Traditional Graph of Attack

• 10 instances of redis running in the cluster
• ~80K vertices and ~83K edges with 161 MB size
• Part of the provenance graph is shown below

[Figure: part of the traditional provenance graph.]

Winnower Generated Provenance Graph

• 54 vertices and 68 edges with 0.7 MB size
• Part of the graph is shown below:

[Figure: Winnower model graph with a region marked "Attack Provenance". Vertex labels shown: Worker, Nginx, redis-server, bash, sshd, vim, x.x.x.x, 172.17.0.0/24, *, /uploads/*, /root/.ssh/authorized_keys, /var/lib/redis/dump.rdb, /proc/12743/stat, /var/log/redis/redis.log, /root/ransomware.note, /dev/tty and other library files. Legend: confidence level from 1 to 10.]

Winnower Generated Provenance Graph

• What happens if we attack all the nodes in the cluster?

[Figure: the model graph with the same vertex labels as on the previous slide, including the "Attack Provenance" region. Legend: confidence level 10.]

Conclusion

• Winnower is the first practical system for provenance-based auditing of clusters at scale with low overhead
• Winnower significantly improves attack identification and investigation in a large cluster

Questions

Thank you for your time.

Backup Slides

Threat Model

• Assumptions
  • Winnower only tracks user-space attacks, i.e. it trusts the OS
  • Log integrity is maintained
• Attack surface
  • Distributed application replicated on worker nodes
• Attacker's motive
  • Gain control over a worker node by exploiting a software vulnerability in the distributed application