
Post on 04-Oct-2020


Introduction to Artificial Intelligence for Cyber Security Applications

Rob Collins

Director – Sales Engineering, APAC

GLOSSARY

k-means – simple clustering algorithm

DBSCAN – more advanced clustering algorithm

NB – Naïve Bayes classifier model

GMM – Gaussian Mixture Model clustering algorithm

LSTM – Long Short-Term Memory Neural Network algorithm

CNN – Convolutional Neural Network

RNN – Recurrent Neural Network

LR – Logistic Regression classifier

DT – Decision Tree

"Machine learning is a field of study that gives computers the ability to learn without explicitly being programmed."

- Arthur Samuel, 1959

MACHINE LEARNING WILL BE EVERYWHERE

The trend to incorporate ML capabilities into new and existing security products will continue apace. According to an April 2016 Gartner report:

By 2018, 25% of security products used for detection will have some form of machine learning built into them.

By 2018, prescriptive analytics will be deployed in at least 10% of UEBA products to automate response to incidents, up from zero today.

Gartner Core Security, The Fast-Evolving State of Security Analytics, April 2016, Report ID: G00298030, accessed at https://hs.coresecurity.com/gartnerreprint-2017

TWO THINGS CAME TOGETHER TO ENABLE AI

Big Data

Large collections of spam, malware, exploits, network traffic, user behaviors

+

Cloud Computing Power

Possible to consume over 100,000 CPU/GPU cores

SUPERVISED

SUPERVISED PROCESS

UNSUPERVISED

NEURAL NETWORK

Works like a human brain – useful connections remain, others dropped

Feature            This dog   This cat
Ears hanging       0.1        0.0
Eye roundness      0.9        0.5
Pupil roundness    0.9        0.1
Nostrils open      1.0        0.2
Nostrils size      0.9        0.1
Tongue visible     0.8        0.0
Tongue width       0.7        0.0
Tongue length      0.4        0.0

This dog’s feature vector: (0.1, 0.9, 0.9, 1.0, 0.9, 0.8, 0.7, 0.4)

This cat’s feature vector: (0.0, 0.5, 0.1, 0.2, 0.1, 0.0, 0.0, 0.0)
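As a minimal sketch of how feature vectors like the two above can be compared, the Python below labels a new vector by whichever known example is closest in Euclidean distance. The nearest-example rule, the function name nearest_label, and the "unknown" animal are assumptions made for illustration; the slides do not say which classifier was used.

```python
import math

# Feature order follows the slide's feature vectors.
FEATURES = (
    "ears hanging", "eye roundness", "pupil roundness", "nostrils open",
    "nostrils size", "tongue visible", "tongue width", "tongue length",
)

# Feature vectors taken from the slide.
DOG = (0.1, 0.9, 0.9, 1.0, 0.9, 0.8, 0.7, 0.4)
CAT = (0.0, 0.5, 0.1, 0.2, 0.1, 0.0, 0.0, 0.0)

LABELED = {"dog": DOG, "cat": CAT}


def nearest_label(vector, labeled=LABELED):
    """Return the label whose example vector is closest (Euclidean distance)."""
    if len(vector) != len(FEATURES):
        raise ValueError("expected one value per feature")
    return min(labeled, key=lambda label: math.dist(vector, labeled[label]))


# Hypothetical unseen animal, invented for this example.
unknown = (0.0, 0.6, 0.2, 0.3, 0.1, 0.1, 0.1, 0.0)
print(nearest_label(unknown))  # prints "cat"
```

With only one example per class this is a toy; a real supervised model would be trained on many labeled vectors.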

[Figure: feature-space visualization with groups labeled Cats, Persian Cats, Dogs, and Dachshunds. Features: ears hanging, eye roundness, pupil roundness, nostrils open, nostrils size, tongue visible, tongue width, tongue length. Visualization: Mark Borg]

[Figure: feature-space visualization with groups labeled Good software, MS Office, Malware, and WannaCry. Features (>2.7M): code sequences, strings, wavelet analysis, control structures, disassembly graphs, header structures, compilers, timezones, toolkits, ... Visualization: Mark Borg]
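To make one of the listed feature types concrete, the sketch below derives a few numeric features from the printable strings in a file's bytes. It is an illustration only: the minimum string length of 4, the function name string_features, the three features chosen, and the sample bytes are all assumptions, not the feature set behind the >2.7M figure.

```python
import re

# Runs of 4 or more printable ASCII bytes, similar in spirit to the Unix
# `strings` tool. The minimum length of 4 is an assumption.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def string_features(data: bytes) -> dict:
    """Summarize the printable strings found in a blob of bytes."""
    found = [m.decode("ascii") for m in PRINTABLE_RUN.findall(data)]
    return {
        "string_count": len(found),
        "longest_string": max((len(s) for s in found), default=0),
        "mentions_http": any("http" in s.lower() for s in found),
    }


# Hypothetical bytes standing in for a file's contents.
sample = (
    b"\x00\x01MZ\x90\x00"
    b"http://example.invalid/payload"
    b"\x00\xff"
    b"kernel32.dll"
    b"\x00ab\x00"
)
print(string_features(sample))
```

Each key in the returned dictionary would become one position in the file's feature vector, in the same way that "ears hanging" is one position in the animal example.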

5 GENERATIONS OF ML FOR CYBER SECURITY

Generational Factors

• R: Runtime
• F: Features
• D: Datasets
• H: Human Interaction
• G: Goodness of Fit

Generation 1

R: Cloud only training / prediction
F: Small features (~1,000)
D: Small samples (~1M)
D: Hand picked and human labeled
H: Easily interpretable
G: High FPs / Underfit / Easy to bypass

Generation 2

R: Cloud training / local prediction
F: Medium features (~100,000)
D: Medium samples (~100M)
D: Mostly human labeled / some heuristic
H: Largely uninterpretable
G: Misleading FP rate / Overfit

Generation 3

R: Cloud enhanced models
F: Large features (~3M)
D: Large samples (~1B)
D: Largely heuristic labeled
H: Some interpretability with visualization
G: Fit appropriately / accuracy metrics generalize

Generation 4

R: Models learn from local training
F: Large features (>3M)
D: Online learning
H: Model explains strategy & gets feedback
G: Model fits current and future inputs

Generation 5

R: Unsupervised local training
F: Unlimited with semi-supervised discovery and data collection
D: Active learning
H: Human input optional
G: Model identifies and adapts to concept drift

DARPA’s Three AI Waves: DESCRIBE CATEGORIZE EXPLAIN
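The Goodness of Fit entries above refer to false-positive (FP) rates and overfitting. As a rough sketch, the Python below computes an FP rate and shows how a model scored only on its own training data can report a rate that does not hold on held-out data. The labels and predictions are invented for illustration, and the convention that 1 means malicious and 0 means benign is an assumption.

```python
def false_positive_rate(y_true, y_pred):
    """FP / (FP + TN), where 1 means malicious and 0 means benign."""
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    if fp + tn == 0:
        raise ValueError("no benign samples to measure against")
    return fp / (fp + tn)


# Hypothetical labels and predictions, invented for illustration.
train_true = [0, 0, 0, 0, 1, 1]
train_pred = [0, 0, 0, 0, 1, 1]  # perfect on the data the model was fit to
held_true = [0, 0, 0, 0, 1, 1]
held_pred = [0, 1, 1, 0, 1, 0]   # two benign samples flagged on unseen data

print(false_positive_rate(train_true, train_pred))  # 0.0
print(false_positive_rate(held_true, held_pred))    # 0.5
```

The gap between the two numbers is the kind of signal that would indicate an overfit model with a misleading FP rate.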

QUESTIONS AND ANSWERS

THANK YOU
