1
Impact of IT Monoculture on Behavioral End Host
Intrusion Detection
Dhiman Barman, UC Riverside/Juniper
Jaideep Chandrashekar, Intel Research
Nina Taft, Intel Research
Michalis Faloutsos, UC Riverside/stopthehacker.com
Ling Huang, Intel Research
Frederic Giroire, INRIA
2
Problem: How should we configure behavioral HIDS across an enterprise?
Enterprise laptops run HIDS
Each device can have its own threshold
Key question: does “one size fit all”?
[Figure: enterprise network diagram; labels: Users, Firewall, Enterprise, Internet, SysAdmin, Server]
HIDS = Host Intrusion Detection Systems
3
Motivation: so far, monoculture!
Why? We polled sysadmins:
  “easier to manage”
  no method for setting thresholds otherwise
  results are harder to interpret without a monoculture
Term: monoculture = homogeneous configuration
4
Contributions
We challenge the practice of monoculture
Measure enterprise behavior: 350 laptops
We observe that:
  User behavior is diverse
  Diversity is better than monoculture in HIDS
We propose a new approach: partial diversity
  A little diversity goes a long way!
6
Our data collection
User traffic:
  350 laptops of enterprise employees
  5 weeks in Q1 of 2007
  Collected all packet headers
  Collection tool runs on the laptop
Malicious traffic:
  Collected traces from machines with known botnets on them
7
Measured key detection features
We study features used in real systems
Selection of features is an orthogonal question
8
Threat Models
#1: Attacker knows nothing about user behavior
#2: Attacker monitors user behavior and builds histograms of behavior for a typical HIDS feature
  Attacker cannot know the instantaneous value of a feature, only its histogram
  Attacker selects the volume of malicious traffic to “hide” inside normal traffic
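Threat model #2 can be illustrated with a minimal sketch of how an attacker holding only a histogram might size the hidden traffic. It assumes the attacker also guesses the detection threshold and adds a fixed extra volume to every time window; the function name, its parameters, and the alarm budget are hypothetical and are not taken from the slides.

```python
def stealth_volume(observed, threshold, max_detect_fraction):
    """Largest extra per-window volume that keeps alarms within a budget.

    observed: feature counts the attacker recorded for one user (the histogram).
    threshold: detection threshold the attacker assumes the HIDS uses.
    max_detect_fraction: share of windows the attacker accepts being flagged.
    Returns 0 when no positive volume meets that budget.
    """
    ordered = sorted(observed)
    n = len(ordered)
    allowed = int(max_detect_fraction * n)  # windows allowed to exceed the threshold
    if allowed >= n:
        raise ValueError("budget allows every window to alarm; volume is unbounded")
    # All but the `allowed` busiest windows must stay at or below the threshold.
    busiest_kept = ordered[n - allowed - 1]
    return max(0, threshold - busiest_kept)


if __name__ == "__main__":
    history = list(range(1, 101))  # illustrative per-window connection counts
    print(stealth_volume(history, threshold=99, max_detect_fraction=0.25))
```

Under these assumptions, a lower threshold for a light user directly shrinks the volume this function returns, which is the sense in which per-user thresholds limit a histogram-aware attacker.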
9
Defining the optimization goal
Far from obvious: FN (False Negatives) vs FP (False Positives)
  failing to detect vs false alarms
Our Utility provides a flexible definition
  Sysadmins need to decide this
For user i with threshold T_i, w is the relative importance of FN versus FP
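A minimal sketch of how such a utility could be computed for one user. It assumes the form U = 1 - (w * FN + (1 - w) * FP), with FN and FP expressed as rates in [0, 1]; this form, the function names, and the fixed additive attack volume are illustrative assumptions, not the authors' definition.

```python
def utility(fn_rate, fp_rate, w):
    """Assumed utility: 1 - (w * FN + (1 - w) * FP), with w in [0, 1]."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return 1.0 - (w * fn_rate + (1.0 - w) * fp_rate)


def rates_for_threshold(benign, attack_added, threshold):
    """Return (FN rate, FP rate) for one user and one threshold.

    benign: per-window feature counts without any attack.
    attack_added: extra malicious volume superimposed on every window (assumption).
    """
    n = len(benign)
    fp = sum(1 for x in benign if x > threshold) / n                  # false alarms
    fn = sum(1 for x in benign if x + attack_added <= threshold) / n  # missed attacks
    return fn, fp


if __name__ == "__main__":
    counts = [1, 2, 3, 10]  # illustrative per-window counts
    fn, fp = rates_for_threshold(counts, attack_added=1, threshold=4)
    print(utility(fn, fp, w=0.5))
```

Setting w close to 1 penalizes missed attacks most heavily, while w close to 0 penalizes false alarms; the sysadmin picks the value.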
11
User behavior varies a lot!
Focus on the tail behavior of users: 99th and 99.9th percentiles
Spans 4 orders of magnitude
13
Different users could detect different types of attacks
Is the feature activity correlated?
  Not necessarily
  Some users are "light" in terms of the maximum number of UDP connections, but "heavy" in TCP connections
Conclusion:
  All users are important
  Synthesizing alarms is non-trivial
14
An uber-policy for enterprise diversity
We propose a tunable policy
  Monoculture: one threshold for all
  Full diversity: one threshold per user
  Partial diversity: one threshold per group
    We use 8 groups
Partial diversity subsumes the other two
  A key question: grouping users
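The way partial diversity subsumes the other two policies can be sketched with a single function that takes a grouping of users. It assumes each group's threshold is a percentile of the group's pooled traffic samples; that rule, the nearest-rank percentile, and all names below are illustrative assumptions rather than the authors' procedure.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def thresholds(samples_by_user, groups, q=99):
    """Assign one threshold per group; returns {user: threshold}.

    groups: list of lists of user ids that together cover every user.
    """
    result = {}
    for members in groups:
        pooled = [x for user in members for x in samples_by_user[user]]
        group_threshold = percentile(pooled, q)
        for user in members:
            result[user] = group_threshold
    return result


if __name__ == "__main__":
    samples = {  # illustrative per-window counts
        "alice": list(range(1, 101)),
        "bob": list(range(1, 101)),
        "carol": [10 * x for x in range(1, 101)],
    }
    users = sorted(samples)
    print(thresholds(samples, [users]))                       # monoculture: one group
    print(thresholds(samples, [[u] for u in users]))          # full diversity: singletons
    print(thresholds(samples, [["alice", "bob"], ["carol"]])) # partial diversity
```

Monoculture is the case of one group containing everyone and full diversity is the case of one group per user, so only the grouping argument changes between policies.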
15
Partial Diversity: grouping
Our goal here: show that there exists a grouping with good results for diversity
k-means clustering did not work well:
  skewed distribution with a wide and continuous spread
Heuristic: follow the nature of the distribution:
  the top 15% of users, split into 4 subgroups
  the bottom 85%, split into 4 subgroups
Experimented with 2, 3, 5, and 8 groups
  We show only the 8-group case (best results)
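The grouping heuristic can be sketched as follows. The sketch assumes users are ranked by a tail statistic of one feature (for example a 99th-percentile value) and that each of the two segments is cut into equal-sized subgroups by rank; the slides do not say how the subgroups are formed, so the equal-size split and all names here are hypothetical.

```python
def split_evenly(items, k):
    """Split a list into k contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), k)
    chunks, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def group_users(tail_by_user, top_fraction=0.15, subgroups=4):
    """Rank users from heaviest to lightest, then split top and bottom segments."""
    ranked = sorted(tail_by_user, key=tail_by_user.get, reverse=True)
    n_top = max(1, round(top_fraction * len(ranked)))
    top, bottom = ranked[:n_top], ranked[n_top:]
    groups = split_evenly(top, subgroups) + split_evenly(bottom, subgroups)
    return [g for g in groups if g]  # drop empty chunks when there are few users


if __name__ == "__main__":
    tails = {f"u{i:02d}": i for i in range(40)}  # illustrative tail values
    for group in group_users(tails):
        print(len(group), group[:3])
```

Giving the heavy tail its own subgroups keeps a few very active users from inflating the thresholds of everyone else, which is one plausible reading of "follow the nature of the distribution".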
16
Evaluation approach
Train using real data
Test with malicious traces superimposed
Evaluation method:
  Train on the previous week -> thresholds
  Apply thresholds to the current week
Interesting: weekly thresholds vary!
  A 99th-percentile threshold from the previous week does not guarantee a 1% false positive rate this week
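The train-on-previous-week, test-on-current-week loop can be sketched for a single user as below. The nearest-rank percentile and the function names are illustrative assumptions; the sample weeks are synthetic and only show why last week's 99th percentile need not yield a 1% false positive rate this week.

```python
import math


def percentile(values, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def false_positive_rate(samples, threshold):
    """Fraction of benign windows that would raise an alarm."""
    return sum(1 for x in samples if x > threshold) / len(samples)


def weekly_false_positive_rates(weeks, q=99):
    """weeks: per-week lists of benign feature counts for one user, in time order."""
    rates = []
    for previous, current in zip(weeks, weeks[1:]):
        threshold = percentile(previous, q)  # train on the previous week
        rates.append(false_positive_rate(current, threshold))  # apply to the current week
    return rates


if __name__ == "__main__":
    week1 = list(range(1, 101))
    week2 = [x + 5 for x in week1]  # the user is slightly busier in week 2
    print(weekly_false_positive_rates([week1, week2, week1]))
```

In this synthetic case a modest upward shift in activity produces more alarms than the nominal 1%, and a quieter week after a busy one produces none.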
19
Limiting the attacker’s opportunity: measuring the stealth traffic
Naïve attacker will be detected
Clever attacker will be “limited”
20
Conclusions
Time to revisit the question of diversity
Diversity can offer benefits
We propose Partial Diversity:
  striking the balance in a tunable way
Our work is a first step in providing
  a framework to compare initial techniques to establish thresholds