Post on 19-Dec-2015
Real-time Traffic monitoring and containment
A. L. Narasimha Reddy
Dept. of Electrical Engineering
Texas A & M University
http://ee.tamu.edu/~reddy/
Acknowledgements
• Deying Tong, Smitha, Phani Achanta
• Seong Soo Kim
Outline
• Introduction & Motivation
• DOS attacks
– Partial state routers
• DDOS attacks, worms
– Aggregate Packet header data as signals
– Signal/image based anomaly/attack detectors
Introduction
• UDP-based multimedia traffic increasing
• UDP does not have congestion control
• Applications can be “selfish”
– If everyone is selfish, network can break down
• Controlling “selfish” flows desired
– Identify Resource hogs and control them
Impact of UDP -- Unfairness
• When UDP and TCP compete, UDP wins by pushing TCP into congestion [Floyd&Fall 99]
UDP -- Summary
• Individual flows need to respond to congestion
• When end-hosts don’t respond to congestion
– Need to identify and contain such flows
– Need network mechanisms for such control
Introduction (cont’d)
• Many Network attacks
• Exploit Application, Protocol, Network architecture vulnerabilities
• Denial of Service attacks
– Consume all resources
– Leave no resources for legitimate users
TCP SYN Flooding (cont’d)
• The attacker initiates a TCP connection to the server with a SYN (using a legitimate or spoofed source address).
• The server replies with a SYN-ACK and allocates memory for the pending connection.
• The client never sends back the final ACK, so the server holds that memory and waits. (If the client spoofed the initial source address, it will never receive the SYN-ACK.)
TCP SYN Flooding: Results
• The half-open connections buffer on the victim server will eventually fill
• The system will be unable to accept any new incoming connections until the buffer is emptied out.
• There is a timeout associated with a pending connection, so the half-open connections will eventually expire.
• The attacking system can keep sending new connection requests faster than the victim system can expire the pending connections.
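The interaction between the queue size, the timeout, and the attack rate can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: time advances in whole seconds, every SYN is spoofed and never completed, and the function name and parameters are hypothetical rather than taken from any real TCP stack.

```python
from collections import deque


def simulate_syn_backlog(syn_rate, backlog_size, timeout, duration):
    """Count SYNs refused because the half-open queue is full.

    syn_rate: spoofed SYNs arriving per second (none are ever completed)
    backlog_size: capacity of the half-open connection queue
    timeout: seconds before a pending connection expires
    duration: number of seconds to simulate
    Returns (refused SYNs, half-open connections still pending at the end).
    """
    pending = deque()  # expiry times of half-open connections, oldest first
    refused = 0
    for second in range(duration):
        # Expire pending connections whose timeout has passed.
        while pending and pending[0] <= second:
            pending.popleft()
        for _ in range(syn_rate):
            if len(pending) < backlog_size:
                pending.append(second + timeout)
            else:
                refused += 1  # a legitimate client would be refused here too
    return refused, len(pending)
```

With a 100-entry queue and a 30-second timeout, 10 SYNs per second keep the queue full most of the time, while 1 SYN per second never fills it.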
TCP Three-Way Handshake
[Figure: client connecting to a TCP port]
• SYN: Client wishes to establish connection (client initiates request)
• SYN-ACK: Server agrees to connection request (connection is now half-open)
• ACK: Client finishes handshake (client and server connections established)
SYN Flood Illustrated
[Figure: the client sends a flood of spoofed SYN requests; the server's connections become half-open until its queue is filled. Server: "I have ACKed these connections, but I have not received an ACK back!"]
Smurf Example
[Figure: an attacker, a victim, and two networks, 192.168.1.0/24 and 10.1.2.0/24]
1. Attacker sends ICMP packet with spoofed source IP (Victim → 10.1.2.255)
2. Attacker sends ICMP packet with spoofed source IP (Victim → 192.168.1.255)
3. Victim is flooded with ICMP echo responses
4. Victim hangs?
Distributed Denial of Service Attacks (DDOS)
• Attacker logs into Master and signals slaves to launch an attack on a specific target address (victim).
• Slaves then respond by initiating TCP, UDP, ICMP or Smurf attack on victim.
Network Attacks -- Summary
• Many vulnerabilities exist in Networks
• Malicious traffic increasing
– For fun and profit
• Need mechanisms to identify and control malicious traffic
• DOS and DDOS
• DOS, resource hog problem similar
• DDOS requires new approach
Real-time traffic monitoring
• Attacks motivate us to monitor network traffic
– Potential anomaly/attack detectors
– Potentially contain/throttle them as they happen
• Line speeds are increasing
– Need simple, effective mechanisms
• Attacks constantly changing
– CodeRed yesterday, MyDoom today, what next
Motivation
• Most current monitoring/policing tools are tailored to known attacks
– Look for packets with port number 1434 (SQL Slammer)
– Contain Kazaa traffic to 20% of the link
• Become ineffective when traffic patterns or attacks change
– New threats are constantly emerging
Motivation
• Can we design generic (and generalized) mechanisms for attack detection and containment?
• Can we make them simple enough to implement them at line speeds?
Introduction
• Why look for Kazaa packets
– They consume resources
– Consume more resources than we want
• Not much different from DOS flood
– Consumes resources to stage attacks
• Why not monitor resource usage?
– Do not want to rely on attack specific info
Attacks
• DOS attacks
– Few sources = resource hogs
• DDOS attacks, worms
– Many sources
– Individual flows look normal
– Look at the aggregate picture
DOS attacks & Network Flows
• Too many flows to monitor each flow
• Maintain a fixed amount of state/memory
– State not enough to monitor all flows (Partial state)
– Manage the state to monitor high-bandwidth flows
– How?
• Sample packets
– High-BW flows more likely to be selected
• Use a cache and employ LRU type policy
– Traffic driven
– Cache retains frequently arriving flows
Partial State Approach
• Similar to how caches are employed in computer memory systems
– Exploit locality
• Employ an engineering solution in an architecture-transparent fashion
Identifying resource hogs
• Lots of web flows
– Tend to corrupt the cache quickly
• Apply probabilistic admission into cache
– Flow has to arrive often to be included in cache
– Most web flows not admitted
• Works well in identifying high-BW flows
• Can apply resource management techniques to contain cached/identified flows
LRU with probabilistic admission
• Employ a modified LRU
• On a miss, flow admitted with probability p
– When p is small, keeps smaller flows out
– High-BW flows more likely admitted
– Allows high-BW flows to be retained in cache
• Nonresponsive flows more likely to stay in cache
Traffic Driven State Management
• Monitor top 100 flows at any time
– Don’t know the identity of these flows
– Don’t know how much BW these may consume
Policy Driven State Management
• An ISP could decide to monitor flows above 1Mbps
– Will need state >= link capacity/1 Mbps
• Could monitor flows consuming more than 1% of link capacity
– For security reasons
– At most 100 flows with 1% BW consumption
UDP Cache Occupancy
[Figure: Time in seconds (y-axis) versus Rate in Mb (x-axis)]
TCP Cache Occupancy
[Figure: Time in seconds (y-axis) versus Flow Number (x-axis)]
Preferential Dropping
[Figure: drop prob (y-axis, marked at maxp and 1) versus Queue length (x-axis, marked at minth and maxth), with one curve for high bandwidth flows and one for other flows]
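One way to read the preferential dropping figure is as two RED-style ramps that share minth and maxth but rise to different ceilings. The sketch below assumes that the ramp for high-bandwidth flows rises to 1 while the ramp for other flows rises to maxp; that reading, along with the function and parameter names, is an assumption for illustration and not a statement of the exact scheme.

```python
def red_drop_prob(avg_queue, min_th, max_th, max_p):
    """Standard RED ramp: 0 below min_th, linear up to max_p, 1 at or above max_th."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def preferential_drop_prob(avg_queue, min_th, max_th, max_p, is_high_bw):
    """Assumed differential RED: identified high-BW flows use a steeper ramp."""
    # Assumption: the ceiling for high-bandwidth flows is 1.0, others use max_p.
    ceiling = 1.0 if is_high_bw else max_p
    return red_drop_prob(avg_queue, min_th, max_th, ceiling)
```

At the midpoint between the two thresholds, a flow marked high-bandwidth is dropped with probability 0.5 here, while other flows see only half of maxp.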
Multiple possibilities
• SACRED: Monitor flows above certain rate (policy driven), differential RED (IWQoS 99)
• LRU-RED: Traffic driven state management, differential RED (Globecom 01)
– Approximately fair BW distribution
• LRU-FQ: Traffic driven state management, fair queuing (ICC 04)
– Contain DOS attacks
– Provide shorter delays for short-term flows
SACRED
• Sampling And Caching RED
• Maintain flow rate as state for cached flows
• If flow rate > threshold, drop at higher rate
– Drop rate keeps increasing if flow stays above threshold
– Tends to punish nonresponsive flows, high-BW flows
• If flow rate < threshold, remove from cache
– Make room for another flow
LRU-FQ flow chart – enqueue event
[Flow chart: on packet arrival, check whether the flow is in the cache. If yes, increment ‘count’ and move the flow to the top of the cache. If no, check whether the cache has space and admit the flow with probability ‘p’; an admitted flow has its details recorded and ‘count’ initialized to 0. If ‘count’ >= ‘threshold’, enqueue the packet in the partial state queue; otherwise enqueue it in the normal queue.]
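The enqueue decision in the LRU-FQ flow chart can be sketched as follows. The class and method names (`LruFlowCache`, `classify`) and the injectable `rng` parameter are hypothetical. The sketch also assumes that when the cache is full, an admitted flow replaces the least recently used entry; the transcript does not spell out that branch.

```python
import random
from collections import OrderedDict


class LruFlowCache:
    """LRU flow cache with probabilistic admission (illustrative sketch)."""

    def __init__(self, size, p, threshold, rng=random.random):
        self.size = size            # number of flows that can be tracked
        self.p = p                  # admission probability on a miss
        self.threshold = threshold  # count at which a flow is treated as high-BW
        self.rng = rng              # injectable so tests can be deterministic
        self.flows = OrderedDict()  # flow id -> count, most recently seen last

    def classify(self, flow_id):
        """Return 'partial' or 'normal': the queue this packet should join."""
        if flow_id in self.flows:
            # Hit: increment count and move the flow to the top of the cache.
            self.flows[flow_id] += 1
            self.flows.move_to_end(flow_id)
        elif self.rng() < self.p:
            # Miss, admitted with probability p.
            if len(self.flows) >= self.size:
                # Assumption: evict the least recently used flow when full.
                self.flows.popitem(last=False)
            self.flows[flow_id] = 0
        else:
            return "normal"  # miss and not admitted
        return "partial" if self.flows[flow_id] >= self.threshold else "normal"
```

A short web flow rarely survives the admission test often enough to reach the threshold, while a persistent high-rate flow is admitted quickly and then accumulates hits.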
Linux IP Packet Forwarding
[Figure: Linux IP packet forwarding path across the link layer, IP layer, and upper layers, with the design space marked. Stages shown: packet arrival; check and store packet; enqueue packet; request scheduler to invoke bottom half; scheduler runs and invokes bottom half; error checking; verify destination (a local packet is delivered to the upper layers); route to destination; update packet; packet enqueued; device driver; device prepares packet; packet departure.]
Linux Kernel traffic control
• Filters are used to distinguish between different classes of flows.
• Each class of flows can be further categorized into sub-classes using filters.
• Queuing disciplines control how the packets are enqueued and dequeued
LRU-FQ Implementation
• LRU component of the scheme is implemented as a filter.
– All parameters (threshold, probability and cache size) are passed as parameters to the filter
• Fair Queuing employed as a queuing discipline.
– Scheduling based on queue’s weight.
– Start-time Fair Queuing
Control of Non-responsive Proportion
Long-Term flow differentiation
[Figure: TCP Throughput Fraction (20 TCP Flows) versus LRU Weight (x/10); series: Ideal, UDP Flows = 2, UDP Flows = 3, UDP Flows = 4, UDP Flows = 5, Normal Router]
Probability = 1/25, Cache size = 11, threshold = 125
Normal TCP fraction = 0.07
Long-term flow differentiation: UDP Rate Based Experiments
[Figure: TCP Throughput fraction versus LRU Weight Proportion (x/10); series: Ideal, UDP Rate = 100%, UDP Rate = 80%, UDP Rate = 60%, UDP Rate = 40%]
Probability = 1/25, Cache size = 11, threshold = 125
Protecting Web Mice
[Figure: Histogram of Web File Distribution (Frequency versus File Size)]
File Size:  500  5k   50k  500k  5m
Frequency:  350  500  140  9     1
Protecting Web mice: Experimental Setup
Long Term TCP Flows: 20
Long Term UDP Flows: 2 – 4
Web Clients: 20
Probability: 1/50
Threshold: 125
LRU Cache Size: 11
LRU : Normal Queue: 1:1
Protecting Web Mice: Bandwidth Results
Normal Router
UDP Flows  UDP Tput  # Web Requests  TCP Tput  TCP Fraction
2          89.45     1313            5.88      0.062
3          89.80     1284            5.55      0.058
4          89.13     927             6.21      0.065
LRU-FQ Router
UDP Flows  UDP Tput  # Web Requests  TCP Tput  TCP Fraction
2          45.73     13915           44.92     0.49
3          45.73     13828           44.83     0.49
4          46.24     13632           44.51     0.49
Protecting Web Mice: Timing Results
Normal Router
UDP  AvgRsp  DevRsp  MinRsp  MaxRsp  AvgConn  DevConn  MinConn  MaxConn
2    2.54    4.43    0.026   45.08   1.95     3.07     0.0118   45
3    2.7     4.92    0.026   93.02   1.94     3.11     0.0115   45.01
4    3.06    4.83    0.026   45.03   2.11     3.42     0.0122   45
LRU-FQ Router
UDP  AvgRsp  DevRsp  MinRsp  MaxRsp  AvgConn  DevConn  MinConn  MaxConn
2    0.26    0.85    0.012   21.15   0.14     0.66     0.0014   21.01
3    0.26    0.85    0.013   22.27   0.13     0.59     0.0017   9.03
4    0.26    0.88    0.013   21.05   0.13     0.61     0.002    9.02
Summary of Partial-State
• Sampling and Caching allows simple identification of resource hogs
• Provides a good control of DOS attacks with limited number of flows
• Provides fairer distribution of link BW
• Partial state packet handling cost is not an issue at 100Mbps/1Gbps.
– 1Gbps implemented on Intel Network processor
Applications of Partial State
• More intelligent control of network traffic
• Accounting and measurement of high bandwidth flows
• Denial of Service (DOS) attack prevention
• Tracing of high bandwidth flows
• QOS routing
Approach
[Figure: processing pipeline. Network Traffic → Signal Generation & Data Filtering (Address correlation) → Statistical or Signal Analysis (Wavelets or DCT) → Anomaly Detection (Thresholding) → Detection Signal]
Signal Generation
• Traffic volume (bytes or packets)
– Analyzed before
– May not be a great signal when links are always congested (typical campus access links)
• Lot more information in packet headers
– Source address
– Destination address
– Protocol number
– Port numbers
Signal Generation
• Per packet cost is important driver
• Update a counter for each packet header field
– Too much memory to put in SRAM
• Break the field into multiple 8-bit fields
– 32-bit address into four 8-bit fields
– 1024 locations instead of 2^32 locations
– In general, 256 * (k/8) instead of 2^k
– k/8 counter updates instead of 1
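The byte-wise counting trade-off can be sketched for a 32-bit IPv4 address: four arrays of 256 counters (1024 locations) and four updates per packet. The class name `ByteCounters` and its layout are hypothetical, chosen only to illustrate the idea.

```python
import ipaddress


class ByteCounters:
    """Four 256-entry counter arrays, one per byte of an IPv4 address."""

    def __init__(self):
        # 4 * 256 = 1024 counters instead of 2**32.
        self.counts = [[0] * 256 for _ in range(4)]

    def update(self, address):
        """Record one packet: k/8 = 4 counter updates instead of 1."""
        packed = ipaddress.IPv4Address(address).packed  # 4 bytes, network order
        for position, byte in enumerate(packed):
            self.counts[position][byte] += 1
```

The price of the smaller table is that the counters no longer identify a full address exactly; two addresses that share a byte value at the same position update the same counter.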
Signal Generation
• What kind of signals can we generate with addresses, port numbers and protocol numbers?
Addresses are correlated
• Most of us have habits
– Access same web sites
• Large web sites get significant part of traffic
– Google.com, hp.com, yahoo.com
• Large downloads correlate over time
– ftp, video
• On an aggregate, addresses are correlated
Address Correlation –attacks?
• Address correlation changes when traffic patterns change abruptly
– Denial of service attacks
– Flash crowds
– Worms
• Results in differences in correlation
– High: single attack victim
– Low: lots of addresses (worm)
Address correlation signals
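• Address correlation:
$$C(n) = \frac{\sum_m (p_{m,n-1} - \bar{p}_{n-1})\,(p_{m,n} - \bar{p}_n)}{\sqrt{\sum_m (p_{m,n-1} - \bar{p}_{n-1})^2 \; \sum_m (p_{m,n} - \bar{p}_n)^2}}$$
• Simplified Address correlation: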
m mnm mnmn pppnC 1)(
Signal Analysis
• Capture information over a sampling period
– Of the order of a few seconds to minutes
• Analyze each sample to detect anomalies
– Compare with historical norms
• Post-mortem/Real-time analysis
– May use different amounts of data & analysis
• Detailed information of past few samples
• Less detailed information of older samples
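Comparing each sample with historical norms can be sketched as a simple threshold test on a time series such as the address correlation signal. The window length, the multiplier k = 3, and the function name are assumptions for illustration; the slides do not specify the thresholding rule, and this sketch does not include the wavelet analysis.

```python
from statistics import mean, stdev


def find_anomalies(signal, history=10, k=3.0):
    """Return indices of samples that deviate from recent history.

    A sample is flagged when it lies more than k standard deviations from the
    mean of the previous `history` samples. Flagged samples are not removed
    from later windows, which is a simplification.
    """
    anomalies = []
    for i in range(history, len(signal)):
        window = signal[i - history:i]
        mu, sigma = mean(window), stdev(window)
        # Guard against a perfectly flat history (sigma == 0).
        if abs(signal[i] - mu) > k * max(sigma, 1e-9):
            anomalies.append(i)
    return anomalies
```

Because each test only needs the last few samples, this kind of check fits the idea of keeping detailed information for recent samples and less for older ones.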
Signal Analysis
• Address correlation as a time series signal
• Employ known techniques to analyze time series signals
• Wavelets: one powerful technique
– Allows analysis in both time and frequency domain
• Per-sample analysis has more flexibility
– Not in forwarding path
Image based analysis
• Treat the traffic data as images
• Apply image processing based analysis
• Treat each sample as a frame in a video
– Video compression techniques lead to data reduction
– Scene change analysis leads to anomaly detection
– Motion prediction leads to attack prediction
Signal Generation
[Figure 2. The visualization of network traffic signal in IP address. (a) 1 dimension: for each of IP byte 0 to IP byte 3 of the source and destination IP address, the 256 byte values (0 to 255) laid out in rows of 16. (b) 2 dimension: cells indexed by (source IP address, destination IP address) byte pairs, from (0, 0) to (255, 255), shown for IP byte 0.]
Two dimensional images
• Horizontal/vertical lines indicate anomalies
– Infected machine contacting multiple destinations (worm propagation)
– Multiple source machines targeting a destination (DDOS)
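A line in the two-dimensional image corresponds to one source byte paired with many destination bytes, or one destination byte paired with many source bytes. The sketch below looks for both cases in a list of (source byte, destination byte) pairs. The function name and the `min_fanout` threshold are hypothetical, and which case appears as a horizontal or a vertical line depends on how the axes are drawn.

```python
from collections import defaultdict


def find_lines(packets, min_fanout=16):
    """Find line-like patterns in (src_byte, dst_byte) pairs, each 0..255.

    Returns (scanners, victims):
      scanners: source bytes seen with at least min_fanout distinct destinations
      victims:  destination bytes seen with at least min_fanout distinct sources
    """
    dsts_per_src = defaultdict(set)
    srcs_per_dst = defaultdict(set)
    for src, dst in packets:
        dsts_per_src[src].add(dst)
        srcs_per_dst[dst].add(src)
    scanners = sorted(s for s, d in dsts_per_src.items() if len(d) >= min_fanout)
    victims = sorted(d for d, s in srcs_per_dst.items() if len(s) >= min_fanout)
    return scanners, victims
```

A scanning host shows up in the first list (worm-like propagation) and a heavily targeted address in the second (DDOS-like pattern).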
Evaluation
• True Positive Rate
• False Alarm Rate or False Positive Rate
• True Negative Rate
• False Negative Rate
• LR = true positive rate / false positive rate
• NLR = false negative rate / true negative rate
• Ideally, LR = infinity, NLR = 0
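The two ratios follow directly from the four confusion-matrix counts. The sketch below uses a hypothetical function name and assumes both classes are present, so that at least one actual positive and one actual negative exist.

```python
import math


def likelihood_ratios(tp, fp, tn, fn):
    """Return (LR, NLR) from counts of true/false positives and negatives."""
    tpr = tp / (tp + fn)  # true positive rate
    fnr = fn / (tp + fn)  # false negative rate
    fpr = fp / (fp + tn)  # false positive (false alarm) rate
    tnr = tn / (fp + tn)  # true negative rate
    lr = math.inf if fpr == 0 else tpr / fpr
    nlr = math.inf if tnr == 0 else fnr / tnr
    return lr, nlr
```

A detector with no false alarms and no misses gives the ideal pair: LR is infinite and NLR is 0.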
Protocol Composition
• During attack, attack protocol volume will be higher
– Observation of changes can lead to detection
End host attacks
• Common solution to several kinds of attacks?
• Do something simple in the network layer
– State maintenance and policing
• Our Key Idea: Per Resource regulation
– Hierarchical regulation (per resource, per flow) also possible
• Move regulation away from server into the network (e.g., at a firewall)
End host – QOS regulation
• Limit consumption of each resource
– At bastion Host
• Limit resource consumption to a traffic class so that other classes keep getting service
End host protection
• Have a uniform picture of resources at the network layer
– We do this at the QOS Regulator
• Resource Aggregates (resource principals)
– Memory, Protocol State Buffers, mbuf / sk_buff Clusters, Network Bandwidth, CPU Cycles...
• Charge incoming traffic to one or more of these resource aggregates
End host protection (cont’d)
• What does Rate Control achieve?
– UDP flood regulation
– ICMP flood regulation
– Interrupt / packet processing regulation
– What about TCP SYN? CGI attack?
– Consume Fixed number of resources
• What does Window Control achieve?
– Regulates fixed number of resources
– Need to keep track of resource usage
– TCP SYN data structures, CGI processes, Memory
– Sometimes action required to reset system state and free resources
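The difference between the two controls can be sketched with a token bucket for rate control and a simple in-use counter for window control. Both classes and their parameters are hypothetical illustrations and do not describe the regulator's actual implementation; time is passed in explicitly so the behavior is deterministic.

```python
class RateControl:
    """Token bucket: limits arrivals per unit time (e.g. UDP or ICMP packets)."""

    def __init__(self, rate, burst):
        self.rate = rate      # tokens added per second
        self.burst = burst    # maximum number of stored tokens
        self.tokens = burst
        self.last = 0.0       # time of the previous call, in seconds

    def allow(self, now):
        """Return True if a packet arriving at time `now` may pass."""
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class WindowControl:
    """Caps resources held at once (e.g. half-open TCP connections)."""

    def __init__(self, limit):
        self.limit = limit
        self.in_use = 0

    def acquire(self):
        """Claim one resource; refuse if the window is already full."""
        if self.in_use < self.limit:
            self.in_use += 1
            return True
        return False

    def release(self):
        """Free one resource, e.g. on completion, timeout, or forced reset."""
        self.in_use = max(0, self.in_use - 1)
```

Rate control bounds how fast work arrives, but a slow attacker can still exhaust a fixed pool. Window control bounds how much of the pool is held at once, which is why it needs usage tracking and an explicit release.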
Advantages
• Not looking for specific known attacks
• Generic mechanism
• Works in real-time
– Latencies of a few samples
– Simple enough to be implemented inline
Prototypes
• Linux-PC boxes
• On Intel Network processors
– Can push to Gbps packet forwarding rates
– Forwarding throughput not impacted
– Sampling rates of a few ms possible
Related Work
• Resource usage monitoring
– Estan & Varghese: Bloom filters
– Kodialam & Lakshman: Run detection
– Mahajan et al: RED-PD
– Duffield (AT&T): Sampling
– Others
Related Work –Worms
• Payload monitoring
– Singh, Savage & Varghese, Tang & Chen
– Look for matches against constant length payloads
• Sampling, Rabin Signatures
– Prototype implementation
– Detects worms within 5-30 seconds
– Effective with polymorphic worms
Related Work -- Worms
• Look for TCP Reset signals
– Weaver & Paxson
– Random host scan at specific ports
– Not all hosts open attack port
– Attacking worm will get many Resets
– Too many Resets => Attacker
– Effective for TCP based attacks
– Can detect/contain in real-time
Related Work -- Worms
• Quick spreading worms use randomly generated addresses
– Normal users use names, DNS
– Worms don’t have DNS activity
– Lots of accesses without DNS requests => Worms
– Many detectors within a campus
• Local DNS servers
Related Work -- Worms
• Address honeypots
– Arbor networks, Paxson, Crowcroft
– Configure machines to accept packets for unassigned addresses
– Only worms will contact these machines
– Capture payloads to analyze
– Quickly propagate signatures
Related Work -- Worms
• IP Traceback
– Savage et al
– Address spoofing makes origin of attacks difficult to detect
– Tracing, if universal, will limit attacks
• Fear of detection
– Post-attack detection
• Not helpful in mitigation or detection
– Most attack machines are innocent participants
Related Work –host based
• Limit the number of new connections of individual hosts
– Twycross & Williamson (HP)
– Reduces the speed at which a worm can spread
– Can be used to detect worms
• Monitor application execution sequences
– Profiling based indication of anomalous behavior => Detect and sandbox worms
Conclusion
• Real-time resource accounting is feasible
• Real-time traffic monitoring is feasible
– Simple enough to be implemented inline
• Can rely on many tools from signal/image processing area
– More robust offline analysis possible
– Concise for logging and playback
Thank you !!
For more information, http://ee.tamu.edu/~reddy
LRU-RED Results
[Figure: % TCP Throughput (y-axis) versus % UDP flows (x-axis: 50, 67, 75, 80); series: Droptail, LQD, CHOKe, LRU, RED]
RTT Bias -TCP flows
[Figure: % Drop rate (y-axis) versus RTT in ms (x-axis); series: CHOKe, RED, DropTail, LQD, LRU]
Impact of Cache size
• Effect of varying cache size
– to study impact of cache size on performance of the scheme
– probability = 1/55, threshold = 125
– number of TCP flows = 20
– equal weights for both queues
Normal Workloads
• Performance under normal workloads
– working of scheme when non-responsive loads are absent or use their fair share of bandwidth
– cache size = 9, threshold = 125
– probability = 1/55