Introduction to Artificial Intelligence for Cyber Security Applications
Rob Collins
Director – Sales Engineering, APAC
GLOSSARY
k-means – simple clustering algorithm
DBSCAN – more advanced clustering algorithm
NB – Naïve Bayes classifier model
GMM – Gaussian Mixture Model clustering algorithm
LSTM – Long Short-Term Memory Neural Network algorithm
CNN – Convolutional Neural Network
RNN – Recurrent Neural Network
LR – Logistic Regression classifier
DT – Decision Tree
MACHINE LEARNING IS A FIELD OF STUDY THAT GIVES COMPUTERS THE ABILITY TO LEARN WITHOUT EXPLICITLY BEING PROGRAMMED
- Arthur Samuel, 1959
MACHINE LEARNING WILL BE EVERYWHERE
The trend to incorporate ML capabilities into new and existing security
products will continue apace. According to an April 2016 Gartner
report:
By 2018, 25% of security products used for detection will have some form of
machine learning built into them.
By 2018, prescriptive analytics will be deployed in at least 10% of UEBA products to
automate response to incidents, up from zero today.
Gartner Core Security, The Fast-Evolving State of Security Analytics, April, 2016, Report ID: G00298030 accessed
at https://hs.coresecurity.com/gartnerreprint-2017
TWO THINGS CAME TOGETHER TO ENABLE AI
Big Data
Large collections of spam, malware, exploits, network traffic, user behaviors
+
Cloud Computing Power
Possible to consume over 100,000 CPU/GPU cores
SUPERVISED
SUPERVISED PROCESS
UNSUPERVISED
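As a rough illustration of the difference between the two settings, the sketch below uses hypothetical one-dimensional toy data (the values, labels, and function names are assumptions, not taken from the slides). The supervised part uses the labels to build one centroid per class; the unsupervised part is a tiny two-cluster k-means that sees no labels and can only report group 0 or group 1.

```python
from statistics import mean

# Hypothetical toy data: one numeric feature per sample.
samples = [0.1, 0.2, 0.15, 0.9, 0.85, 0.95]
labels = ["benign", "benign", "benign", "malicious", "malicious", "malicious"]

def train_supervised(xs, ys):
    """Supervised: use the labels to compute one centroid per class."""
    classes = sorted(set(ys))
    return {c: mean(x for x, y in zip(xs, ys) if y == c) for c in classes}

def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda c: abs(x - centroids[c]))

def cluster_unsupervised(xs, iterations=10):
    """Unsupervised: 2-means clustering; no labels, so groups are just 0 and 1."""
    centers = [min(xs), max(xs)]  # simple deterministic initialization
    for _ in range(iterations):
        groups = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            groups[nearest].append(x)
        # Keep the old center if a group ends up empty.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1 for x in xs]

centroids = train_supervised(samples, labels)
print(predict(centroids, 0.8))
print(cluster_unsupervised(samples))
```

The supervised model can name the class of a new sample because it was given names during training; the clustering output only says which samples belong together, and a human (or a later labeling step) has to decide what each group means.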
NEURAL NETWORK
Works like a human brain – useful connections remain, others dropped
‘Nostrils open’?
‘Nostrils size’?
‘Ears hanging’?
‘Tongue visible’?
‘Tongue width’?
‘Tongue length’?
‘Eye roundness’?
‘Pupil roundness’?
Dog:
‘Nostrils open’ = 1.0
‘Nostrils size’ = 0.9
‘Ears hanging’ = 0.1
‘Tongue visible’ = 0.8
‘Tongue width’ = 0.7
‘Tongue length’ = 0.4
‘Eye roundness’ = 0.9
‘Pupil roundness’ = 0.9
Cat:
‘Nostrils open’ = 0.2
‘Nostrils size’ = 0.1
‘Ears hanging’ = 0.0
‘Tongue visible’ = 0.0
‘Tongue width’ = 0.0
‘Tongue length’ = 0.0
‘Eye roundness’ = 0.5
‘Pupil roundness’ = 0.1
This dog’s feature vector: (0.1, 0.9, 0.9, 1.0, 0.9, 0.8, 0.7, 0.4)
This cat’s feature vector: (0.0, 0.5, 0.1, 0.2, 0.1, 0.0, 0.0, 0.0)
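Once each animal is reduced to a feature vector, comparing animals becomes arithmetic. The sketch below is one simple way to do that, not necessarily what the presenter had in mind: it measures Euclidean distance between vectors and assigns an unknown sample to the nearest reference. The feature order is inferred by matching the listed values against the two vectors, and the `unknown` vector is a made-up example.

```python
import math

# Feature order inferred from the values above (an assumption).
FEATURES = ["ears hanging", "eye roundness", "pupil roundness", "nostrils open",
            "nostrils size", "tongue visible", "tongue width", "tongue length"]

dog = (0.1, 0.9, 0.9, 1.0, 0.9, 0.8, 0.7, 0.4)
cat = (0.0, 0.5, 0.1, 0.2, 0.1, 0.0, 0.0, 0.0)

def euclidean(a, b):
    """Straight-line distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify(vector, references):
    """Return the name of the reference vector closest to `vector`."""
    return min(references, key=lambda name: euclidean(vector, references[name]))

references = {"dog": dog, "cat": cat}

# Hypothetical new animal, not from the slides.
unknown = (0.2, 0.8, 0.8, 0.9, 0.8, 0.6, 0.5, 0.3)
print(classify(unknown, references))
```

A real classifier would learn from many examples per class rather than one reference vector each, but the principle is the same: similar things end up close together in feature space.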
[Figure: feature-space visualization with clusters labeled Cats, Persian Cats, Dogs, and Dachshunds. Feature dimensions: Ears hanging, Eye roundness, Pupil roundness, Nostrils open, Nostrils size, Tongue visible, Tongue width, Tongue length. Visualization: Mark Borg]
[Figure: feature-space visualization with clusters labeled Good software, MS Office, Malware, and WannaCry. Visualization: Mark Borg]
Features (>2.7M):
Code sequences
Strings
Wavelet analysis
Control structures
Disassembly graphs
Header structures
Compilers
Timezones
Toolkits
...
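To make the idea of turning a file into features concrete, the sketch below handles just one of the feature families listed above, strings. It pulls printable ASCII runs out of raw bytes and folds them into a small fixed-length count vector using feature hashing. The hashing step, the vector size of 16, the function names, and the sample bytes are all illustrative assumptions; the slides do not say how features are actually extracted or encoded.

```python
import re
import zlib

def extract_strings(data, min_length=4):
    """Return printable ASCII runs of at least `min_length` bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [m.decode("ascii") for m in re.findall(pattern, data)]

def hashed_features(strings, size=16):
    """Map a variable number of strings to a fixed-length count vector."""
    vector = [0] * size
    for s in strings:
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        vector[zlib.crc32(s.encode("ascii")) % size] += 1
    return vector

# Hypothetical file contents: binary noise with a few embedded strings.
sample = (b"\x00\x01MZ\x90\x00kernel32.dll\x00\xffCreateFileA\x00ab\x00"
          b"http://example.invalid/x\x00")

strings = extract_strings(sample)
print(strings)
print(hashed_features(strings))
```

Short runs such as `MZ` and `ab` fall below the minimum length and are ignored. A production feature set on the scale mentioned above would combine many such extractors and a far larger vector.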
5 GENERATIONS OF ML FOR CYBERSECURITY
Generation 1
R: Cloud only training / prediction
F: Small features (~1,000)
D: Small samples (~1M)
D: Hand picked and human labeled
H: Easily interpretable
G: High FPs / Underfit / Easy to bypass

Generation 2
R: Cloud training / local prediction
F: Medium features (~100,000)
D: Medium samples (~100M)
D: Mostly human labeled / some heuristic
H: Largely uninterpretable
G: Misleading FP rate / Overfit

Generation 3
R: Cloud enhanced models
F: Large features (~3M)
D: Large samples (~1B)
D: Largely heuristic labeled
H: Some interpretability with visualization
G: Fit appropriately / accuracy metrics generalize

Generation 4
R: Models learn from local training
F: Large features (>3M)
D: Online learning
H: Model explains strategy & gets feedback
G: Model fits current and future inputs

Generation 5
R: Unsupervised local training
F: Unlimited with semi-supervised discovery and data collection
D: Active learning
H: Human input optional
G: Model identifies and adapts to concept drift
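The goodness-of-fit entries above mention misleading FP rates and overfitting. The sketch below shows how that can happen, using a deliberately overfit toy model: it memorizes the benign files it saw in training and flags everything else. The file names, labels, and the `MemorizingModel` class are hypothetical and exist only to make the arithmetic visible.

```python
def false_positive_rate(predictions, truths):
    """FP / (FP + TN), where True means 'malicious'."""
    fp = sum(1 for p, t in zip(predictions, truths) if p and not t)
    tn = sum(1 for p, t in zip(predictions, truths) if not p and not t)
    return fp / (fp + tn) if (fp + tn) else 0.0

class MemorizingModel:
    """Deliberately overfit: remembers benign training samples, flags the rest."""

    def fit(self, samples, truths):
        self.known_benign = {s for s, t in zip(samples, truths) if not t}
        return self

    def predict(self, samples):
        return [s not in self.known_benign for s in samples]

# Hypothetical data: True = malicious, False = benign.
train_x = ["a.exe", "b.exe", "c.exe", "evil1.exe"]
train_y = [False, False, False, True]
test_x = ["a.exe", "d.exe", "e.exe", "evil2.exe"]
test_y = [False, False, False, True]

model = MemorizingModel().fit(train_x, train_y)
train_fpr = false_positive_rate(model.predict(train_x), train_y)
test_fpr = false_positive_rate(model.predict(test_x), test_y)
print(f"FP rate on training data: {train_fpr:.2f}")
print(f"FP rate on unseen data:   {test_fpr:.2f}")
```

Measured on its own training data the model looks perfect; on unseen files it flags two of three benign samples. An FP rate is only meaningful when it is measured on data the model has not seen.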
DARPA’s Three AI Waves: DESCRIBE, CATEGORIZE, EXPLAIN
Generational Factors
• R: Runtime
• F: Features
• D: Datasets
• H: Human Interaction
• G: Goodness of Fit
QUESTIONS AND ANSWERS
THANK YOU