Faculty of Computer Science
Privacy and Security Lab
Network Awareness and Network Security
John McHugh
Canada Research Chair in Privacy and Security
Director, Privacy and Security Laboratory
Dalhousie University, Halifax, NS
CASCON CyberSecurity Workshop
17 October 2005

Overview
The internet has grown to the point where observation is one of the few ways left to understand it.
• In this case, we have a view of the border of a large composite network with more than a /8 of address space and several million hosts.
• Given a broad view, it is possible to look at both similarities and differences in the traffic going to and from various sites.
• In this talk, we will try to sketch a variety of analyses that can be performed when such data is available.
Note: Similar data can be aggregated in a variety of ways. Opportunities to obtain similar views exist.

Network Data Collection Approach
Look at flow abstractions of network traffic.
• No payload data, just headers: source and destination IP addresses and ports; protocol; times; traffic volumes (e.g., packets and bytes)
  - Cisco NetFlow-like sources
Comprehensive coverage
• >95% coverage of the monitored network
• Multiple network points of attachment (many borders)
Collect a lot of data
• Requires a data center with large computational and storage capacity for historical analysis
• Scalable collection and analysis
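As a rough illustration of what a header-only flow abstraction carries, the sketch below models one flow record in Python. The class name FlowRecord and its field names are assumptions made for illustration; they are not the actual record layout of the collection system or of NetFlow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical header-only flow summary; no payload is kept."""
    sip: str        # source IP address
    dip: str        # destination IP address
    sport: int      # source port
    dport: int      # destination port
    proto: int      # IP protocol number (6 = TCP, 17 = UDP, 1 = ICMP)
    start: float    # flow start time, seconds since the epoch
    end: float      # flow end time, seconds since the epoch
    packets: int    # packets observed in the flow
    nbytes: int     # bytes observed in the flow

    def duration(self) -> float:
        """Length of the flow in seconds."""
        return self.end - self.start


# One made-up record: a single 40-byte TCP packet sent to port 80.
example = FlowRecord("192.0.2.1", "198.51.100.7", 40000, 80, 6,
                     1000.0, 1000.0, 1, 40)
```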

Organization/Analysis Approach
NetFlow data
• Organized by hour, type, sensor (router), etc., for outside-to-inside and inside-to-outside traffic
• Packed format for efficient linear search
• 10s of TB and growing by 10s of GB/day
Primary operation is selection based on time, IP, and flow characteristics (protocol, volume, etc.)
• Creates files of raw data satisfying selection criteria
• Statistics on files can be produced
Can create sets or bags (multisets, counted sets) of IPs with selection characteristics
• Sets can be used for further selection / filtering
• Multisets (bags) allow efficient counting
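The selection, set, and bag operations described above can be sketched in a few lines of Python. This is only an in-memory illustration under assumed names (select, the tuple layout, and the sample addresses are all made up); the real system works on packed files rather than Python lists.

```python
from collections import Counter

# Made-up flows as (hour, source IP, destination IP, protocol, packets).
flows = [
    (0, "203.0.113.5", "192.0.2.10", 6, 3),
    (0, "203.0.113.5", "192.0.2.11", 6, 1),
    (1, "203.0.113.9", "192.0.2.10", 17, 2),
    (1, "203.0.113.5", "192.0.2.12", 6, 1),
]


def select(records, predicate):
    """Linear scan that keeps the records satisfying the selection criteria."""
    return [r for r in records if predicate(r)]


# Selection: TCP flows only (protocol 6).
tcp = select(flows, lambda r: r[3] == 6)

# Set: the distinct source IPs seen in the selection.
tcp_sources = {r[1] for r in tcp}

# Bag (multiset): how many selected flows each source IP produced.
tcp_source_bag = Counter(r[1] for r in tcp)

# A set can drive a further selection: all flows from those sources.
from_tcp_sources = select(flows, lambda r: r[1] in tcp_sources)
```

The set answers "who", the bag answers "how often", and either can be fed back into another selection pass.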

A few quick examples
The next few slides show some typical samples of the kinds of information that can be derived from the flow data.
They range from network-wide analyses to examinations of the characteristics of specific subnets and even of specific hosts.
• The analysis system can be viewed as a powerful zoom lens.
• It is capable of looking at the overall traffic on a few percent of the internet at its widest angle, or at a single host at its highest magnification.

Network Characterization
We can use the set of addresses that emit traffic to approximate the active population of a subnet.
• The active addresses can be used to partition inbound traffic
  • Hit - mostly intentional
  • Miss - accidental, malicious, etc.
• The traffic from the active addresses can be used to characterize a subnet
  • Host roles
  • Activity patterns
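A minimal sketch of the hit / miss partition, assuming flows are available as (source IP, destination IP) pairs. The function names and sample addresses are hypothetical, and the standard ipaddress module is used only to test subnet membership.

```python
import ipaddress


def active_hosts(outbound, subnet):
    """Addresses inside `subnet` that were seen emitting traffic."""
    net = ipaddress.ip_network(subnet)
    return {src for src, _dst in outbound if ipaddress.ip_address(src) in net}


def partition_inbound(inbound, active):
    """Split inbound flows into hits (to active hosts) and misses."""
    hits = [f for f in inbound if f[1] in active]
    misses = [f for f in inbound if f[1] not in active]
    return hits, misses


# Made-up flows as (source IP, destination IP).
outbound = [("10.1.0.5", "203.0.113.9"), ("10.1.0.7", "198.51.100.2")]
inbound = [
    ("203.0.113.9", "10.1.0.5"),      # destination has emitted traffic: hit
    ("198.51.100.77", "10.1.0.200"),  # destination never emitted traffic: miss
    ("198.51.100.77", "10.1.0.7"),    # destination has emitted traffic: hit
]

active = active_hosts(outbound, "10.1.0.0/16")
hits, misses = partition_inbound(inbound, active)
```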

One week on a /16
[Figure: hourly flow file size (0 to 4,500,000) versus date and time, 11-Jan to 18-Jan, with two series: inbound hits and inbound misses.]

Host Characterization
Capt. Damon Becknel, USA, looked at host characterization based on port mixes. We developed the following visualizations to help in the process.
• The log scale on flows helps when there are large differences in flow volumes among ports
• The data was NOT filtered by protocol, and there is “noise” in the port fields for some protocols, especially ICMP and possibly others
• We see a mix of “normal” and questionable activity

Workstation?
[Figure: host port-mix plot.]

Web Server?
[Figure: host port-mix plot.]

Scanner!!
[Figure: host port-mix plot.]
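The data behind port-mix views like these can be approximated by counting flows per port for one host and taking a logarithm of the counts. The sketch below is an assumption about how such a summary might be computed, not the code used for the slides; port_mix, log_scaled, and the sample flows are hypothetical.

```python
import math
from collections import Counter


def port_mix(flows, host):
    """Count flows per destination port for traffic sent to `host`."""
    return Counter(dport for dip, dport in flows if dip == host)


def log_scaled(mix):
    """Map each port's flow count to log10 so low-volume ports stay visible."""
    return {port: math.log10(count) for port, count in mix.items()}


# Made-up flows as (destination IP, destination port).
flows = (
    [("192.0.2.10", 80)] * 1000
    + [("192.0.2.10", 443)] * 100
    + [("192.0.2.10", 22)] * 10
    + [("192.0.2.99", 25)]
)

mix = port_mix(flows, "192.0.2.10")
scaled = log_scaled(mix)
```

On a linear axis the 10 flows to port 22 would vanish next to the 1000 flows to port 80; on the log scale they sit two units apart.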

Developing the contact surface
In looking for scanners, Carrie Gates asked whether there is a normal contact pattern.
• i.e., how many external hosts contact how many internal hosts per unit time?
One might expect clustering or a power law, but
[Figure: contact surface, with axes for number of sources and number of destinations.]

Structure in the contact surface
Now you see it (July 2003)

Structure in the contact surface
Away it goes (August 2003)

Structure in the contact surface
Now you don’t (September 2003)

Structure in the contact surface
Here it is again (Feb / April / June 2004)

We are still studying this
Phase 1
• Persisted from Jan - Aug
• Details for 1 week in July
• 91% port 80, 87% SYN, 96% unique sIP/dIP pairs
• 49% to 60 /16s in 1 /8
• 3 /8 sources: 46%, 34%, 20%
• 2 AP, 1 SA
• Too slow to be DoS
• Too persistent for scan?
Blaster released on Aug 11
• Could this be related?
Phase 2
• From mid Feb - early June
• Again SYN to port 80
Conclusion
• We don’t know what phase 1 is, but it is highly likely that phase 2 is Welchia B
• Forensics may hold the clue to the behavior

Weekly contact activity
• During this week there are about 3.3 million sources that show up exactly once.
• The distribution tapers off rapidly from there.
• In a linear plot it looks like a straight line at approximately zero, but the log plot shows an interesting upturn at the end. This contains:
  • One connection that persists for the entire week with flows in each hour.
  • A number of apparently scripted transfers that occur every few minutes. At least two involve access to a weather site.
• At first glance, no evidence of hourly, stealthy activity.

Weekly Contact Hours
[Figure: Contacts per Source (1 Week). Number of sources (log scale, 1 to 10,000,000) versus total contact hours (0 to 150).]
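A distribution like the one plotted above could be derived by counting, for each source, the distinct hours in which it appears, and then counting sources per total. The sketch assumes flows reduced to (hour of week, source IP) pairs and reads "contact hours" as distinct hours with at least one flow; both the reading and the function names are assumptions.

```python
from collections import Counter, defaultdict


def contact_hours(flows):
    """Map each source to the number of distinct hours in which it appears."""
    hours_seen = defaultdict(set)
    for hour, sip in flows:
        hours_seen[sip].add(hour)
    return {sip: len(hours) for sip, hours in hours_seen.items()}


def hours_histogram(per_source):
    """Count how many sources have each total number of contact hours."""
    return Counter(per_source.values())


# Made-up week (168 hours) of flows as (hour of week, source IP).
flows = [(h, "203.0.113.1") for h in range(168)]   # present in every hour
flows += [(5, "203.0.113.2"), (5, "203.0.113.2")]  # two flows in one hour
flows += [(9, "203.0.113.3")]                      # a single flow

per_source = contact_hours(flows)
histogram = hours_histogram(per_source)
```

Sources at the far right of the histogram (here the one present in all 168 hours) correspond to the upturn at the end of the log plot.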

Partitioning Heavy Hitters
The set of hosts in a network that have been observed to emit traffic during some time interval approximates the active population of the network.
• Traffic sent to addresses that are not associated with hosts is arguably malicious
• The active host set can be used to separate this traffic from traffic addressed to active hosts
• Additional refinements are possible
• We look at high-volume hit / miss traffic (>32000 flows per hour)

Hit and Miss Sources above 32000 Flows/Hour
[Figure: number of sources per hour (0 to 80) versus hour in week (0 to 150), with two series: hit sources and miss sources.]
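The per-hour counts plotted above could be derived along the following lines: count flows per (hour, hit or miss, source) and keep the sources above the threshold. This is a sketch under assumed inputs, with flows given as (hour, source IP, destination IP) and the active set already computed; heavy_sources and the sample data are hypothetical.

```python
from collections import Counter

THRESHOLD = 32000  # flows per hour, the cut-off quoted on the slide


def heavy_sources(flows, active, threshold=THRESHOLD):
    """Return {(hour, kind): set of sources} with more than `threshold` flows.

    `kind` is "hit" when the destination is in the active set, else "miss".
    """
    counts = Counter()
    for hour, sip, dip in flows:
        kind = "hit" if dip in active else "miss"
        counts[(hour, kind, sip)] += 1
    heavy = {}
    for (hour, kind, sip), n in counts.items():
        if n > threshold:
            heavy.setdefault((hour, kind), set()).add(sip)
    return heavy


# Made-up data; a tiny threshold keeps the example small.
active = {"10.1.0.5"}
flows = (
    [(0, "203.0.113.1", "10.1.0.200")] * 3   # 3 misses in hour 0
    + [(0, "203.0.113.2", "10.1.0.5")] * 3   # 3 hits in hour 0
    + [(0, "203.0.113.3", "10.1.0.5")]       # 1 hit, below the threshold
)
heavy = heavy_sources(flows, active, threshold=2)
```

Taking the size of each set gives the number of heavy hit and miss sources per hour.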

So who are these sources?
The following are in the heavy miss list for more than 50 hours in the week analyzed.
Source        Hours  Registrant
063.210.x.y   66     Level 3 Communications, Inc., CO
199.099.x.y   135    Performance Systems International, DC
200.074.x.y   51     Metropolis Intercom, Santiago, CL
203.129.x.y   77     Software Technology Parks of India, Pune, IN
217.107.x.y   113    RTComm.ru
Two are registered with ARIN, one with the Asian registry, one with the Latin American registry, and one with RIPE.

And what are they doing?
[Figure: Flows from Heavy sources. Flows per hour (0 to 1,000,000) versus hour in week (0 to 150), with one series each for 063.210.x.y, 199.099.x.y, 200.074.x.y, 203.129.x.y, and 217.107.x.y.]

1 hour of 217.107.x.x, etc.
        sIP| dPort| pro| flags| records| cluster|
217.107.x.y|  1080|   6| S    |  217985|       1|
217.107.x.y|    80|   6| S    |  215356|       2|
217.107.x.y|  8080|   6| S    |  210326|       3|
217.107.x.y|  3128|   6| S    |  209895|       4|
217.107.x.y|    80|   6| SR   |    1064|       5|
217.107.x.y|  8080|   6| SR   |     325|       6|
217.107.x.y|  1080|   6| SR   |      85|       7|
217.107.x.y|    80|   6| R    |      52|       8|
217.107.x.y|  8080|   6| R    |      21|       9|
217.107.x.y|  1080|   6| R    |       2|      10|
217.107.x.y|    74|   1|      |       1|      11|
End of input after 855112 records forming 11 clusters.
Note the best yield is 0.4% on port 80.
199.99.x.y and 203.129.x.y are scanning port 80.
200.74.x.y is scanning 80, 8080, 1080, and 3128.
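A cluster listing of this shape can be produced by grouping records on a chosen tuple of fields and ranking the groups by size. The sketch below is a generic group-and-count in Python, not the tool that produced the table; the cluster function, the dict-based record layout, and the sample records are assumptions.

```python
from collections import Counter


def cluster(records, fields):
    """Group records by the named fields and rank clusters by size."""
    counts = Counter(tuple(r[f] for f in fields) for r in records)
    return counts.most_common()


def flow(sport, dport, flags):
    """Build one made-up TCP flow record from a fixed source."""
    return {"sIP": "198.51.100.1", "sPort": sport, "dPort": dport,
            "pro": 6, "flags": flags}


records = (
    [flow(40000 + i, 80, "S") for i in range(5)]
    + [flow(50000 + i, 8080, "S") for i in range(3)]
    + [flow(60000, 80, "SR")]
)

by_dport = cluster(records, ("sIP", "dPort", "pro", "flags"))
by_sport = cluster(records, ("sIP", "sPort", "pro", "flags"))
```

Grouping on a field that varies from flow to flow (sPort in this made-up data) yields many single-record clusters, so the choice of key determines whether any structure shows up.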

But what about 063.210.x.y?
       sIP| dPort| pro| packets| bytes| flags|
63.210.x.y| 46177|   6|       1|    40| R A  |
63.210.x.y| 26478|   6|       1|    40| R A  |
63.210.x.y| 27373|   6|       1|    40| R A  |
• This does not cluster on dPort; try sPort
       sIP| sPort| pro| packets| bytes| flags| records| cluster|
63.210.x.y|    80|   6|       1|    40| R A  |  142786|       1|
63.210.x.y|   443|   6|       1|    40| R A  |  140083|       2|
• Almost all the activity is RA on 80 or 443
• This is possibly a response to a flooding attack

Proactive use - Spyware
Outgoing traffic from spyware creates hot spots.
• Aggregation of destinations hints at targets
• Similarity of content provides additional evidence
• The wider the spread of the spyware, the easier it is to detect this way
• The more data is aggregated, the easier it is to see this
Zombie command and control networks might be detectable this way, also.

Summary / Conclusion / Future
We have an unprecedented ability to examine network traffic
• Long periods of time, large volumes
Empirical analyses
• How to identify scans, denials of service
• What defenses work?
• Evidence of compromise
Acquire additional data sources
• Different network topologies - would like to look at interior nodes and subnets as well as the border
• Different types of networks (e.g., wireless)

Credit where credit is due
To the entire Situational Awareness team, especially:
The late Suresh L. Konda
• Suresh was, and remains, the inspiration for this program
Mike Collins, Andrew Kompanek, and the SiLKtools developers
• Mike Duggan, Mark Thomas
CERT analysts and users of the data
• Carrie Gates, Marc Kellner, Capt. Damon Becknel, USA
• Carrie and Damon are responsible for many of the graphics
Tom Longstaff for managing the unmanageable