VAST: A Unified Platform for Interactive Network Forensics

Matthias Vallentin (1,2), Vern Paxson (1,2), Robin Sommer (2,3)

(1) UC Berkeley

(2) International Computer Science Institute (ICSI)

(3) Lawrence Berkeley National Laboratory (LBNL)

March 17, 2016

USENIX NSDI

1 / 28

Omnipresent Data Breaches

2 / 28

Breach Timeline

[Figure: timeline along a time axis with the labels Compromise, Detection, and Forensics]

3 / 28

Network Forensics — Characteristics

Interactive data exploration
  - Iterative query refinement
  - High-dimensional search

Disparate data access
  - Temporal
  - Spatial

Massive data volumes
  - 50–100K events/sec
  - 10s TBs/day

4 / 28

Log Example — Bro Connection Log

#separator \x09
#set_separator ,
#empty_field (empty)
#unset_field -
#path conn
#open 2016-01-06-15-28-58
#fields ts uid id.orig_h id.orig_p id.resp_h id.resp_p proto service duration orig_bytes resp_bytes conn_..
#types time string addr port addr port enum string interval count count string bool bool count string
1258531.. Cz7SRx3.. 192.168.1.102 68 192.168.1.1 67 udp dhcp 0.163820 301 300 SF - - 0 Dd 1 329 1 328 (empty)
1258531.. CTeURV1.. 192.168.1.103 137 192.168.1.255 137 udp dns 3.780125 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258531.. CUAVTq1.. 192.168.1.102 137 192.168.1.255 137 udp dns 3.748647 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258531.. CYoxAZ2.. 192.168.1.103 138 192.168.1.255 138 udp - 46.725380 560 0 S0 - - 0 D 3 644 0 0 (empty)
1258531.. CvabDq2.. 192.168.1.102 138 192.168.1.255 138 udp - 2.248589 348 0 S0 - - 0 D 2 404 0 0 (empty)
1258531.. CViJEOm.. 192.168.1.104 137 192.168.1.255 137 udp dns 3.748893 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258531.. CSC2Hd4.. 192.168.1.104 138 192.168.1.255 138 udp - 59.052898 549 0 S0 - - 0 D 3 633 0 0 (empty)
1258531.. Cd3RNm1.. 192.168.1.103 68 192.168.1.1 67 udp dhcp 0.044779 303 300 SF - - 0 Dd 1 331 1 328 (empty)
1258531.. CEwuIl2.. 192.168.1.102 138 192.168.1.255 138 udp - - - - S0 - - 0 D 1 229 0 0 (empty)
1258532.. CXxLc94.. 192.168.1.104 68 192.168.1.1 67 udp dhcp 0.002103 311 300 SF - - 0 Dd 1 339 1 328 (empty)
1258532.. CIFDQJV.. 192.168.1.102 1170 192.168.1.1 53 udp dns 0.068511 36 215 SF - - 0 Dd 1 64 1 243 (empty)
1258532.. CXFISh5.. 192.168.1.104 1174 192.168.1.1 53 udp dns 0.170962 36 215 SF - - 0 Dd 1 64 1 243 (empty)
1258532.. CQJw4C3.. 192.168.1.1 5353 224.0.0.251 5353 udp dns 0.100381 273 0 S0 - - 0 D 2 329 0 0 (empty)
1258532.. ClfEd43.. fe80::219:e3ff:fee7:5d23 5353 ff02::fb 5353 udp dns 0.100371 273 0 S0 - - 0 D 2 369 0 0
1258532.. C67zf02.. 192.168.1.103 137 192.168.1.255 137 udp dns 3.873818 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258532.. CG1FKF1.. 192.168.1.102 137 192.168.1.255 137 udp dns 3.748891 350 0 S0 - - 0 D 7 546 0 0 (empty)
1258532.. CNFkeF2.. 192.168.1.103 138 192.168.1.255 138 udp - 2.257840 348 0 S0 - - 0 D 2 404 0 0 (empty)
1258532.. Cq4eis4.. 192.168.1.102 1173 192.168.1.1 53 udp dns 0.000267 33 497 SF - - 0 Dd 1 61 1 525 (empty)
1258532.. CHpqv31.. 192.168.1.102 138 192.168.1.255 138 udp - 2.248843 348 0 S0 - - 0 D 2 404 0 0 (empty)
1258532.. CFoJjT3.. 192.168.1.1 5353 224.0.0.251 5353 udp dns 0.099824 273 0 S0 - - 0 D 2 329 0 0 (empty)
1258532.. Cc3Ayyz.. fe80::219:e3ff:fee7:5d23 5353 ff02::fb 5353 udp dns 0.099813 273 0 S0 - - 0 D 2 369 0 0

5 / 28
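A connection log in this layout can be read with a few lines of code. The following is a minimal sketch in Python, assuming the usual Bro ASCII layout in which a tab separates columns and "-" marks an unset field. The helper parse_bro_log is hypothetical (it is not part of Bro or VAST), only a subset of the columns is used, and the ts and uid values are placeholders because the slide truncates them.

```python
def parse_bro_log(lines):
    """Parse tab-separated Bro log lines into dicts keyed by the #fields names."""
    fields, records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#fields"):
            # The column names follow the "#fields" token.
            fields = line.split("\t")[1:]
        elif line.startswith("#") or not line:
            continue  # other header lines (#separator, #types, ...) are skipped
        else:
            values = line.split("\t")
            # "-" marks an unset field; represent it as None.
            records.append({k: (None if v == "-" else v)
                            for k, v in zip(fields, values)})
    return records


# Placeholder ts/uid values; addresses and ports follow the slide.
sample = [
    "#separator \\x09",
    "#path\tconn",
    "#fields\tts\tuid\tid.orig_h\tid.orig_p\tid.resp_h\tid.resp_p\tproto\tservice",
    "0.0\tC1\t192.168.1.102\t68\t192.168.1.1\t67\tudp\tdhcp",
    "0.0\tC2\t192.168.1.103\t138\t192.168.1.255\t138\tudp\t-",
]
records = parse_bro_log(sample)
for record in records:
    print(record["id.orig_h"], record["id.resp_p"], record["service"])
```

All values stay strings here; a typed reader would additionally consult the #types header.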

Existing Solutions

MapReduce (Hadoop)
  ✓ Scalability
  ✗ Batch-oriented: no iterative, exploratory analysis

In-Memory Cluster Computing (Spark)
  ✓ Efficient & complex analysis
  ✗ Thrashing when working set does not fit in aggregate memory

6 / 28

Contribution

VAST: Visibility Across Space and Time

Architecture
  - Performance: concurrent & modular design
  - Scaling: intra-machine & inter-machine
  - Typing: strong & rich

Implementation
  - Composition: high-level bitmap indexing framework
  - Adaptation: fine-grained component flow-control
  - Asynchrony: finite state machines for query execution

7 / 28

Outline

1. Architecture

2. Implementation

3. Evaluation

VAST Architecture — Single Machine

[Figure: single-machine architecture. Labels: source, node (importer, archive, index, exporter), sink. Sample events: "10.0.0.1 10.0.0.254 53/udp" and "10.0.0.2 10.0.0.254 80/tcp"]

8 / 28

VAST Architecture — Ingestion

[Figure: ingestion pipeline]
  - source: generate event batch from input such as "10.0.0.1 53/udp" and "10.0.0.2 80/tcp"; each event carries type, data, and meta
  - importer: assign IDs
  - archive: compress batch
  - index: append data to bitmap index (type, 10.0.0.1, 53/udp, 10.0.0.2, 80/tcp)

9 / 28
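The ingestion steps (generate event batch, assign IDs, compress batch, append to bitmap index) can be sketched as follows. This is an illustrative Python sketch, not VAST's implementation: the class names, the JSON plus zlib encoding, and the use of Python integers as bit vectors are all assumptions made for the example.

```python
import itertools
import json
import zlib
from collections import defaultdict


class Importer:
    """Assigns consecutive, unique IDs to incoming events."""

    def __init__(self):
        self._next_id = itertools.count()

    def assign_ids(self, events):
        return [(next(self._next_id), event) for event in events]


class Archive:
    """Stores each batch compressed, together with the ID range it covers."""

    def __init__(self):
        self.segments = []  # (first_id, last_id, compressed bytes)

    def store(self, batch):
        blob = zlib.compress(json.dumps(batch).encode())
        self.segments.append((batch[0][0], batch[-1][0], blob))


class Index:
    """One bit vector per (field, value); bit i is set if event i has that value."""

    def __init__(self):
        self.bitmaps = defaultdict(int)

    def append(self, batch):
        for event_id, event in batch:
            for field, value in event.items():
                self.bitmaps[(field, value)] |= 1 << event_id


def source():
    """Generates one event batch (values taken from the slide)."""
    return [{"addr": "10.0.0.1", "port": "53/udp"},
            {"addr": "10.0.0.2", "port": "80/tcp"}]


importer, archive, index = Importer(), Archive(), Index()
batch = importer.assign_ids(source())
archive.store(batch)
index.append(batch)
print(bin(index.bitmaps[("port", "80/tcp")]))
```

Keeping the ID assignment in a single component means archive and index agree on which position refers to which event.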

VAST Architecture — Index

[Figure: index structure. Labels: index, meta index, partitions, indexer. Sample conn event: 10.0.0.2 53/udp 8.8.4.4 53/udp "dns"]

10 / 28

VAST Architecture — Querying

Query: X in 10.0.0.0/8 || X == 80/tcp

[Figure: query pipeline]
  - exporter: takes the query expression
  - index: look up bit vectors from partitions, one per predicate (X in 10.0.0.0/8, X == 80/tcp)
  - archive: locate & ship event batch for ID
  - exporter: decompress batch, candidate check
  - sink: render results (e.g., "10.0.0.1 53/udp", "10.0.0.2 80/tcp", …)

11 / 28
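The query path (index lookup, archive retrieval, candidate check, rendering) can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the events are made up, lookup is a stand-in that scans the events instead of consulting a real bitmap index, and the archive is a single zlib-compressed JSON batch.

```python
import ipaddress
import json
import zlib

NET = ipaddress.ip_network("10.0.0.0/8")

# Made-up events; the event ID is the position in the list.
events = [
    {"addr": "10.0.0.1", "port": "53/udp"},
    {"addr": "10.0.0.2", "port": "80/tcp"},
    {"addr": "192.168.0.7", "port": "80/tcp"},
    {"addr": "172.16.0.9", "port": "22/tcp"},
]

# Archive: the batch is stored compressed.
archive = zlib.compress(json.dumps(events).encode())


def lookup(predicate):
    """Stand-in for the index: bit i of the result is set if event i is a hit."""
    return sum(1 << i for i, event in enumerate(events) if predicate(event))


def in_net(event):
    return ipaddress.ip_address(event["addr"]) in NET


def is_http(event):
    return event["port"] == "80/tcp"


# Index: one bit vector per predicate, combined with OR for "||".
hits = lookup(in_net) | lookup(is_http)
candidate_ids = [i for i in range(len(events)) if (hits >> i) & 1]

# Archive: ship the batch; the exporter decompresses it.
batch = json.loads(zlib.decompress(archive))

# Candidate check: re-evaluate the query on each candidate event.
results = [batch[i] for i in candidate_ids
           if in_net(batch[i]) or is_http(batch[i])]

# Sink: render results.
for event in results:
    print(event["addr"], event["port"])
```

The candidate check is redundant in this toy setup because the stand-in index is exact; it matters when the index can return false positives.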

VAST Architecture — Distributed

12 / 28

Indexing Basics — Tree Indexes

13 / 28

Indexing Basics — Composition

14 / 28

Indexing Basics — Inverted Index

[Figure: inverted index mapping each distinct value (A, B, C, D) to the positions where it occurs]

15 / 28

Indexing Basics — Bitmap Index

[Figure: bitmap index with one bit vector per distinct value (A, B, C, D)]

16 / 28
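A bitmap index keeps one bit vector per distinct value, with bit i set when position i holds that value. The sketch below shows the idea in Python, using arbitrary-precision integers as uncompressed bit vectors. The column data is made up, and the function names are hypothetical, not VAST's API.

```python
from collections import defaultdict


def build_bitmap_index(column):
    """Map each distinct value to a bit vector; bit i is 1 iff column[i] == value."""
    index = defaultdict(int)
    for position, value in enumerate(column):
        index[value] |= 1 << position
    return dict(index)


def positions(bits):
    """Decode a bit vector into the sorted list of set positions."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


# Made-up column with ten positions and four distinct values.
column = ["A", "C", "B", "A", "D", "B", "A", "C", "B", "A"]
index = build_bitmap_index(column)
print(positions(index["A"]))  # [0, 3, 6, 9]
```

An equality lookup is then a single dictionary access, and the result is already in a form that can be combined with other lookups using bitwise operations.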

Indexing Basics — Bitmap Composition

[Figure: bit vectors for the predicates X ∈ 192.168.0.0/24 and Y ≥ 60s]

17 / 28
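Composing predicates reduces to bitwise operations on their bit vectors. The following Python sketch assumes the two predicates on the slide are joined by a conjunction and that the second one reads "at least 60 seconds"; the events are made up, and the bitmap helper scans them as a stand-in for real index lookups.

```python
import ipaddress

# Made-up events: (address X, duration Y in seconds).
events = [
    ("192.168.0.5", 120),
    ("10.1.2.3", 300),
    ("192.168.0.9", 5),
    ("192.168.0.77", 61),
]


def bitmap(predicate):
    """Stand-in for an index lookup: bit i is set if event i satisfies predicate."""
    bits = 0
    for i, event in enumerate(events):
        if predicate(event):
            bits |= 1 << i
    return bits


net = ipaddress.ip_network("192.168.0.0/24")
x_hits = bitmap(lambda e: ipaddress.ip_address(e[0]) in net)  # events 0, 2, 3
y_hits = bitmap(lambda e: e[1] >= 60)                         # events 0, 1, 3

universe = (1 << len(events)) - 1  # mask with one bit per event
both = x_hits & y_hits             # conjunction
either = x_hits | y_hits           # disjunction
not_x = ~x_hits & universe         # negation, masked to the valid positions

print(bin(both))
```

The mask in the negation is needed because Python integers are unbounded; a fixed-length bit vector type would handle this internally.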

Indexing Challenges

High-cardinality values
  - Represent millions of distinct values compactly
  - Provide low-latency lookups

High-level operations
  - Support type-specific operations
  - Relational operators: {<, ≤, =, ≠, ≥, >, ∈, ∉}

18 / 28

Query Language

Boolean Expressions
  - Conjunctions &&
  - Disjunctions ||
  - Negations !
  - Predicates: LHS op RHS
  - (expr)

Examples
  - A && B || !(C && D)
  - orig_h in 10.0.0.1 && &time < now - 2h
  - &type == "conn" || "foo" in :string
  - duration > 60s && service == "tcp"

Extractors
  - &tag
  - x.y.z
  - :type

Relational Operators
  - <, <=, ==, >=, >
  - in, ni, [+, +]
  - !in, !ni, [-, -]
  - ~, !~

Values
  - T, F
  - +42, 1337, 3.14
  - "foo"
  - 10.0.0.0/8
  - 80/tcp, 53/?
  - {1, 2, 3}

19 / 28
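To make the structure of such expressions concrete, here is a small Python sketch that represents a query as a tree of predicates, conjunctions, disjunctions, and negations and evaluates it against one event. It does not parse the query syntax, it covers only the comparison operators, and the class names are hypothetical rather than VAST's.

```python
import operator
from dataclasses import dataclass

OPS = {"<": operator.lt, "<=": operator.le, "==": operator.eq,
       ">=": operator.ge, ">": operator.gt}


@dataclass
class Predicate:
    lhs: str     # field name to extract from the event
    op: str      # one of the keys in OPS
    rhs: object  # constant to compare against

    def evaluate(self, event):
        return self.lhs in event and OPS[self.op](event[self.lhs], self.rhs)


@dataclass
class And:
    left: object
    right: object

    def evaluate(self, event):
        return self.left.evaluate(event) and self.right.evaluate(event)


@dataclass
class Or:
    left: object
    right: object

    def evaluate(self, event):
        return self.left.evaluate(event) or self.right.evaluate(event)


@dataclass
class Not:
    operand: object

    def evaluate(self, event):
        return not self.operand.evaluate(event)


# duration > 60s && service == "tcp", with the duration given in seconds.
query = And(Predicate("duration", ">", 60), Predicate("service", "==", "tcp"))
print(query.evaluate({"duration": 75, "service": "tcp"}))  # True
print(query.evaluate({"duration": 75, "service": "dns"}))  # False
```

The same tree shape works for index-based execution if evaluate returns bit vectors and the connectives become bitwise AND, OR, and NOT.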

Data Model

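[Figure: type hierarchy]
  - basic types: bool, int, count, real, duration, time, string, pattern, address, subnet, port, none
  - container types: vector, set, table (KEY and VALUE types)
  - compound types: record (field 1 … field n, each with its own TYPE)
  - recursive types: container and compound types, whose elements are themselves of some TYPE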

20 / 28

Bitmap Index for IP Addresses

192.168.0.42 = 11000000.10101000.00000000.00101010

X ∈ 192.168.0.0/27

21 / 28
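One way to answer a subnet query such as X ∈ 192.168.0.0/27 with bitmaps is to keep one bit vector per address bit and combine only the vectors for the prefix bits. The Python sketch below assumes 32 slices for IPv4 and uses integers as uncompressed bit vectors; the class AddressIndex and the sample addresses are illustrative, and the details of VAST's actual address index are in the paper.

```python
import ipaddress


class AddressIndex:
    """Bit-sliced index: slice k holds bit k (most significant first) of each address."""

    def __init__(self):
        self.slices = [0] * 32  # one bit vector (Python int) per address bit
        self.size = 0           # number of addresses appended so far

    def append(self, address):
        value = int(ipaddress.IPv4Address(address))
        for k in range(32):
            if (value >> (31 - k)) & 1:
                self.slices[k] |= 1 << self.size
        self.size += 1

    def lookup_subnet(self, cidr):
        """Return a bit vector of the positions whose address lies inside cidr."""
        network = ipaddress.IPv4Network(cidr)
        prefix = int(network.network_address)
        all_ones = (1 << self.size) - 1
        hits = all_ones
        for k in range(network.prefixlen):  # only the prefix bits matter
            if (prefix >> (31 - k)) & 1:
                hits &= self.slices[k]
            else:
                hits &= ~self.slices[k] & all_ones
        return hits


index = AddressIndex()
for addr in ["192.168.0.42", "192.168.0.7", "192.168.0.31",
             "192.168.0.32", "10.0.0.1"]:
    index.append(addr)

hits = index.lookup_subnet("192.168.0.0/27")
# Bits 1 and 2 are set: 192.168.0.7 and 192.168.0.31 fall into the /27.
print(bin(hits))
```

A /27 lookup touches 27 bit vectors regardless of how many distinct addresses were indexed, which is how this layout copes with high-cardinality values.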

Data Set

Single-Machine

Data:
  - 10M packets from a 24-hour trace (5 fields/event)
  - 3.4M derived Bro connection logs (20 fields/event)

Machine:
  - 2 × 8-core Intel Xeon CPUs
  - 128 GB RAM
  - 4 × 3 TB SAS 7.2K disks
  - 64-bit FreeBSD

Cluster

Data:
  - 1.24B Bro connection logs (152 GB)
  - Split into N slices for N nodes
  - N ∈ [1, 24]

Nodes:
  - 2 × 8-core Intel Xeon CPUs
  - 12 GB of RAM
  - 2 × 500 MB SATA disks
  - 64-bit FreeBSD

22 / 28

Queries

23 / 28

Performance – Index Latency

[Figure: latency in seconds (0–16) versus number of cores (4–16) for queries A–I]

24 / 28

Performance — Scaling

[Figure: scaling with the number of nodes (5–25). Panels: Import and Export. Axis labels: 1 / Utilization, Latency (seconds)]

25 / 28

Details in the paper

26 / 28

Conclusion

Network Forensics Challenges
  - Explorative high-dimensional search
  - Disparate data access
  - Massive data volumes

VAST: Visibility Across Space and Time
  - Platform for network forensics
  - Interactive & iterative search
  - Inter-machine and intra-machine scaling
  - Open-source, permissive license (BSD)

27 / 28

Questions?

http://vast.io

28 / 28
