is that normal? behaviour modelling on the cheap
DESCRIPTION
Originally presented at BSides Ottawa on 06-Sep-2014, this talk lays out the challenges faced by today's defender (for context) and the gap in our current defensive strategies (what we'll address), and explains how to start a basic behavioural analysis practice with minimal investment. Remember this is a BSides presentation so there may be some language which causes a double-take ;-) Open with caution.
TRANSCRIPT
Mark Nunnikhoven, bunch of letters @marknca
Just like you probably can’t see this, I can’t see the backchannel. Tweet me now @marknca and I’ll reply after the talk…
Is That Normal? Behaviour modelling on the cheap
What is it?
What folks are doing
Today’s talk
Context, The gap, Getting started
Recently…
450 000 000
Target 27-Nov-2013—15-Dec-2013
First CEO “resignation” due to information security incident
The Home Depot Early May-2014—Late Aug-2014
a/k/a “Target 2”
eBay Late Feb-2014—Mid May-2014
Nominated for “Worst Communications During An Incident”
Houston Astros 17-Jun–2013—17-Oct-2014
“Oh shit, they tried to trade me for an old bus and a hot dog vendor?”
Amazing visualization from Information Is Beautiful “World’s Biggest Data Breaches & Hacks”
0d
Because it was successful, it was “an APT”…at least according to marketing
KISS
Simple works. A lot. With minimal effort. Why waste a “bunker buster” when they left the door open?
The Problem
Restrict inbound, Restrict outbound, Heavily monitor access
Data
Restrict inbound, Allow outbound, Little to no monitoring
User
Yes, we only use 2 types of controls to police this space. Amazing isn’t it?
Authentication, Authorization
3 is more than 2. So that’s an immediate win when reporting up to your boss(es)
Authentication, Authorization, Behaviour analysis
How?
What to look at
All traffic leaving user space
What to look for
Malicious patterns
You might want to consider buying something here, or at least Martin’s solution. However, if you don’t have a strong process for handling alerts, don’t bother!
What to look for
Odd access patterns
You can buy products that help here, but we can get good ROI with DIY. If you already have a SIEM, put this effort into tuning its rules & alerts
Starting point
…and only a starting point
The Goal
Provide actionable information to your team
You’re never going to get 100% automated here BUT you can reduce your team’s workload
In order of importance
Access, Transactions, Authentication
<< fancy circles for no particular reason
And then?
Dump it all in a database
Yes, an old school relational database
Dump it?
Well no…that’ll cause problems*
The #1 problem with RDBMS is that few people consider what they want to get _out_ of them
* Only if you want to do anything with the data. If you want a(nother) shelfware project, go ahead
It’s amazing what an old school DB can do when structured properly. There is a reason why we’ve stuck with the tech for 40+ years
Hardware      Table Structure
Desktop       Hour
Bigger        Day
Biggest       Week
Bigger-est    Month
Ridiculous    This talk has “on the Cheap” in the title. Stop showing off
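One way to read the table above is "one table per time bucket, sized to your hardware". Here is a minimal sketch of that idea, assuming SQLite as the old school relational database. The `events_` table prefix, the column names, and the `table_for` and `store` helpers are all hypothetical, not from the talk.

```python
import sqlite3
from datetime import datetime


def table_for(ts: datetime, period: str = "hour") -> str:
    """Return the (hypothetical) bucket table name for an event at `ts`."""
    if period == "hour":
        suffix = ts.strftime("%Y_%m_%d_%H")
    elif period == "day":
        suffix = ts.strftime("%Y_%m_%d")
    elif period == "week":
        iso_year, iso_week, _ = ts.isocalendar()
        suffix = f"{iso_year}_w{iso_week:02d}"
    elif period == "month":
        suffix = ts.strftime("%Y_%m")
    else:
        raise ValueError(f"unknown period: {period}")
    return "events_" + suffix


def store(conn, ts, username, domain, bytes_out, period="hour"):
    """Insert one event into its bucket table, creating the table if needed."""
    table = table_for(ts, period)
    # The table name comes only from date formatting, never from log content.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {table} "
        "(ts TEXT, username TEXT, domain TEXT, bytes_out INTEGER)"
    )
    conn.execute(
        f"INSERT INTO {table} VALUES (?, ?, ?, ?)",
        (ts.strftime("%Y-%m-%d-%H-%M-%S"), username, domain, bytes_out),
    )
    return table
```

On a desktop you would call `store(..., period="hour")`; with more hardware, widen the bucket to `"day"`, `"week"`, or `"month"`.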
Anything else?
Add metadata on ingestion*
I felt like using the term “metadata” would add more credibility and a nice NSA-esque feeling here
* You’re trying to save computation later on. And it’s easier to line up usernames or groups now rather than later. You can do fun things with caching too
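A rough sketch of what adding metadata on ingestion could look like. Everything here is an assumption for illustration: the `USER_GROUPS` directory, the `.corp.example` internal suffix, and the field names. The cache is just the standard library's `lru_cache`, as one example of the "fun things with caching".

```python
from functools import lru_cache

# Hypothetical directory data; in practice this might come from LDAP or an HR export.
USER_GROUPS = {"alice": "finance", "bob": "engineering"}


@lru_cache(maxsize=4096)
def lookup_group(username: str) -> str:
    """Resolve a normalized username to a group once, then serve it from cache."""
    return USER_GROUPS.get(username, "unknown")


def enrich(event: dict) -> dict:
    """Return a copy of a raw log event with extra columns attached at ingestion."""
    name = event["username"].lower()
    enriched = dict(event)
    enriched["username"] = name
    enriched["user_group"] = lookup_group(name)
    # Assumed convention: internal hosts live under .corp.example
    enriched["is_internal"] = event["domain"].endswith(".corp.example")
    return enriched
```

Doing the lookup now means later queries can group by `user_group` without a join or a second pass over the data.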
Indices?
Store the timestamp as YYYY-MM-DD-HH-MM-SS*
First person to say “what about seconds since the epoch?” gets a free gift. It’s not a good gift. You don’t want it. Trust me on this
* No wiggle room. It’s easier to do computations on it this way
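A small sketch of the fixed-width timestamp, assuming the appeal is that zero-padded strings sort chronologically and can be sliced by prefix (that reasoning is my reading, not stated on the slide). The helper names are hypothetical.

```python
from datetime import datetime

TS_FORMAT = "%Y-%m-%d-%H-%M-%S"


def to_stamp(dt: datetime) -> str:
    """Format a datetime as YYYY-MM-DD-HH-MM-SS."""
    return dt.strftime(TS_FORMAT)


def from_stamp(stamp: str) -> datetime:
    """Parse a YYYY-MM-DD-HH-MM-SS string back into a datetime."""
    return datetime.strptime(stamp, TS_FORMAT)


def in_hour(stamps, hour: datetime):
    """Keep only the stamps that fall inside the given hour, by string prefix."""
    prefix = hour.strftime("%Y-%m-%d-%H")
    return [s for s in stamps if s.startswith(prefix)]
```

Because every field is zero-padded, a plain string sort (or a plain index on the column) is already in time order.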
How you structure your query has a major impact on performance. That should be obvious. If not, it is now
Hardware      Query Breadth (in tables)
Desktop       1
Bigger        2-3
Biggest       3-5
Bigger-est    3-5
Ridiculous    Didn’t you get the message 2 slides ago?
More dimensions == slower performance but potentially more useful answers. Use your judgement here
Hardware      Query Size (in dimensions)
Desktop       2-3
Bigger        3-5
Biggest       5-7
Bigger-est    5-7
Ridiculous    Seriously, WTF?
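To make "query size in dimensions" concrete, here is a sketch that treats each GROUP BY column as one dimension. The `access` table, its columns, and the `count_by` helper are assumptions for illustration; `user_group` and `hour` are imagined as metadata columns filled in at ingestion.

```python
import sqlite3

# Hypothetical columns that are safe to group on.
ALLOWED = ("username", "user_group", "domain", "hour")


def count_by(conn, dimensions):
    """Count access rows grouped by the given dimensions (one column each)."""
    if not dimensions:
        raise ValueError("need at least one dimension")
    for name in dimensions:
        if name not in ALLOWED:
            raise ValueError(f"unknown dimension: {name}")
    cols = ", ".join(dimensions)
    sql = f"SELECT {cols}, COUNT(*) FROM access GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()
```

`count_by(conn, ["user_group"])` is a one-dimension question; `count_by(conn, ["user_group", "hour"])` costs more but tells you when each group is active.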
How do I frame questions for the data?
Based on the average of X, what are the outliers?
Not the Malcolm Gladwell Outliers, actual math-y type ones
* select min(thing_I_want) from (group_of_things_I_want)
  select max(thing_I_want) from (group_of_things_I_want)
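The min/max queries give you the range; to get from "the average of X" to actual outliers you need some notion of spread. A minimal sketch, assuming SQLite, a hypothetical `transactions(username, bytes_out)` table, and a "more than k standard deviations from the mean" rule. That rule is my choice for illustration; the talk does not name one.

```python
import math
import sqlite3


def find_outliers(conn, k=2.0):
    """Return (username, bytes_out) rows more than k std devs from the mean."""
    avg, avg_sq = conn.execute(
        "SELECT AVG(bytes_out), AVG(bytes_out * bytes_out) FROM transactions"
    ).fetchone()
    if avg is None:  # empty table, nothing to compare
        return []
    # Population variance via E[x^2] - E[x]^2; clamp tiny negative rounding errors.
    spread = math.sqrt(max(avg_sq - avg * avg, 0.0))
    return conn.execute(
        "SELECT username, bytes_out FROM transactions "
        "WHERE ABS(bytes_out - ?) > ? ORDER BY bytes_out DESC",
        (avg, k * spread),
    ).fetchall()
```

Tune `k` until the list is short enough that someone will actually look at it.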
Start simple, build up the questions you ask based on success. “If it isn’t actionable, get rid of it”, Rob Edwards < awesome guy
Questions you should ask your data?
<Timeline for logins>
<Period of access for user>
<Size of transaction>
<Number of domains per day>*
These four will net a lot of interesting info
Use your logs, Reduce work for your team, Start small, build
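Two of those questions sketched as queries, assuming SQLite, a hypothetical `access(ts, username, domain, bytes_out)` table, and timestamps stored as YYYY-MM-DD-HH-MM-SS so the day is simply the first 10 characters. The query names and the `ask` helper are illustrative only.

```python
import sqlite3

# <Number of domains per day>: distinct domains each user touched, per day.
DOMAINS_PER_DAY = """
SELECT username, substr(ts, 1, 10) AS day, COUNT(DISTINCT domain) AS domains
FROM access
GROUP BY username, substr(ts, 1, 10)
ORDER BY username, substr(ts, 1, 10)
"""

# <Period of access for user>: first and last time of day each user was seen.
PERIOD_OF_ACCESS = """
SELECT username, substr(ts, 1, 10) AS day,
       MIN(substr(ts, 12)) AS first_seen, MAX(substr(ts, 12)) AS last_seen
FROM access
GROUP BY username, substr(ts, 1, 10)
ORDER BY username, substr(ts, 1, 10)
"""


def ask(conn, question):
    """Run one of the canned questions and return all rows."""
    return conn.execute(question).fetchall()
```

A user who normally shows up 09:00 to 17:30 and suddenly appears at 03:15 hitting new domains is exactly the kind of row you want in front of a human.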
Thanks!
Mark Nunnikhoven @marknca
Now send me a tweet ;-)