worms: taxonomy and detection mark shaneck 2/6/2004

Worms: Taxonomy and Detection

Mark Shaneck

2/6/2004

Outline

- Introduction
- Worm Classification
  - Spreading Media
  - Target Acquisition
  - Polymorphic Worms
- Detection / Prevention
- Conclusion

Introduction

- Worms are common and costly
- So far, mostly benign…
- Defenses need to react within seconds, which is too fast for a human

Spreading Media

- Traditional
- Email
- Windows File Sharing
- Hybrid

Traditional

- Self-propagates through the network
- Exploits some vulnerability to automatically execute the worm payload
  - Most common: buffer overflow
- Least common in existence
- Largest potential danger
  - Spreads fastest
- Main subject of detection and containment research

Email

- Spreads through email
- Relies on humans or poor application design
  - Most are executable attachments
  - Nimda executed automatically when previewed
- Most common form of worm
- Very hard to detect, but they spread slowly

Windows File Sharing

- Spreads through Windows file shares
- Worms don’t generally spread this way alone
- Very hard to penetrate a network perimeter this way
- Worms usually use other methods to penetrate the network and then this method to spread within it

Hybrid Worms

- Combination of methods
- Example: Nimda
  - Spread through email
  - Copied itself to open network shares (was executed if someone viewed it in Windows Explorer)
  - Traditional methods
    - Used subnet scanning to look for open Code Red II and Sadmind backdoors
    - Exploited multiple IIS directory traversal vulnerabilities
  - Modified web pages to cause clients to download and execute the worm payload

Hybrid Worms

- Detection difficulties
  - Propagation pattern is difficult to predict since humans are involved
  - If one method is blocked, the worm might find another way in…

Target Acquisition

- Random Scanning
- Subnet Scanning
- Routing Worm
- Pre-generated Hit List
- Topological
- Stealth / Passive

Random Scanning

- A 32-bit number is randomly generated and used as the IP address
  - Examples: Slammer and Code Red I
- Hits black (unallocated) IP space frequently
  - Only 28.6% of IP space is allocated
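
As a rough illustration of why uniform random scanning wastes most of its probes, the sketch below draws random 32-bit values, renders them as IPv4 addresses, and measures how many land in allocated space. The `is_allocated` check is a stand-in, not real allocation data: it simply treats the lowest 28.6% of the address range as allocated so that the numbers match the figure on the slide.

```python
import ipaddress
import random

ALLOCATED_FRACTION = 0.286  # figure quoted on the slide


def random_ipv4(rng):
    """Return a uniformly random IPv4 address built from one 32-bit draw."""
    return ipaddress.IPv4Address(rng.getrandbits(32))


def is_allocated(addr):
    """Hypothetical stand-in: pretend the lowest 28.6% of the space is allocated."""
    return int(addr) < int(ALLOCATED_FRACTION * 2**32)


def allocated_hit_rate(samples, seed=0):
    """Fraction of random draws that fall in the (pretend) allocated space."""
    rng = random.Random(seed)
    hits = sum(is_allocated(random_ipv4(rng)) for _ in range(samples))
    return hits / samples
```

With enough samples the hit rate settles near 0.286, so roughly seven probes in ten go to addresses where no host can answer. Those wasted probes are what black-space monitors later in the deck rely on.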

Subnet Scanning

- Generate the last 1, 2, or 3 bytes of the IP address randomly
  - Examples: Code Red II and Blaster
- Some scans must be completely random to infect the whole Internet
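
The address arithmetic behind this strategy can be sketched as below: keep the leading bytes of a known address and randomize only the trailing 1, 2, or 3 bytes. The function name and the use of a documentation-range address are assumptions made for illustration; the code only computes addresses and sends no traffic.

```python
import ipaddress
import random


def local_candidate(own_ip, random_bytes, rng):
    """Keep the leading bytes of own_ip and randomize the last `random_bytes` bytes."""
    if random_bytes not in (1, 2, 3):
        raise ValueError("random_bytes must be 1, 2, or 3")
    bits = 8 * random_bytes
    # Clear the low bits, then fill them with fresh random bits.
    prefix = (int(ipaddress.IPv4Address(own_ip)) >> bits) << bits
    return ipaddress.IPv4Address(prefix | rng.getrandbits(bits))
```

Calling `local_candidate("192.0.2.77", 1, random.Random())` always yields an address in `192.0.2.0/24`, which is why a purely local strategy never leaves its starting network and needs some fully random scans mixed in.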

Routing Worm

- BGP information can tell which IP address blocks are allocated
- This information is publicly available
  - http://www.routeviews.org/
  - http://www.ripe.net/ris/

BGP Routing Worm

- By including routable prefixes in its payload, the worm can limit its scanning to allocated addresses
- Could reduce scanning space by 71.4%
- Aggregation and compression could reduce the space needed to 175 KB
- Compare
  - Slammer: 376 bytes
  - Blaster: 6 KB
  - Nimda: 57 KB
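
The aggregation step can be illustrated with the standard library's `ipaddress.collapse_addresses`, which merges adjacent or overlapping prefixes into the smallest covering set. The prefixes below are documentation ranges rather than real BGP data, and the 5-bytes-per-entry encoding (4 address bytes plus 1 prefix-length byte) is an assumption used only to show how the table shrinks.

```python
import ipaddress


def aggregate(prefixes):
    """Merge adjacent or overlapping prefixes into the fewest covering networks."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(nets))


def table_size_bytes(networks, bytes_per_entry=5):
    """Assumed encoding: 4 address bytes plus 1 prefix-length byte per entry."""
    return len(networks) * bytes_per_entry


# Placeholder prefixes: two adjacent /25s and a /26 nested inside a /24.
raw = ["198.51.100.0/25", "198.51.100.128/25", "203.0.113.0/24", "203.0.113.64/26"]
merged = aggregate(raw)
```

Here four entries collapse to two, halving the table. The same table is equally useful to a defender, who can use it to recognize probes aimed at unrouted space.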

Class A Routing Worm

- By examining BGP data you can see which Class A addresses are allocated
- Only 116 of 256 Class A addresses are publicly routable (45.3% of total IP space)
- Only 116 extra bytes are needed to cut the scanning space by more than half
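
The arithmetic on this slide can be checked with a small first-octet table. The particular set of 116 octets below is a placeholder (the slides do not list the routable blocks), so only its size is meaningful.

```python
# Placeholder: 116 made-up first-octet values, one byte each in a payload table.
ROUTABLE_FIRST_OCTETS = frozenset(range(1, 117))


def in_routable_class_a(address_int):
    """True if the top byte of a 32-bit address is in the routable table."""
    return (address_int >> 24) in ROUTABLE_FIRST_OCTETS


def scanning_space_fraction():
    """Share of the 32-bit space that survives the first-octet filter."""
    return len(ROUTABLE_FIRST_OCTETS) / 256
```

`scanning_space_fraction()` is 116/256, about 45.3%, which matches the slide: a one-byte-per-entry table discards a little more than half of the address space.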

Pre-generated Hit List

- Hit list of vulnerable machines is sent with the payload
  - Determined before worm launch by scanning
- Gives the worm a boost in the slow start phase
- Skips the phase that follows the exponential model
  - Infection rate looks linear in the rapid propagation phase
- Can avoid detection by the early detection systems
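
To see how much of the slow start a hit list removes, the sketch below uses the closed-form solution of the simple epidemic (logistic) model, dI/dt = rate * I * (1 - I/N). The population size, rate, and hit-list size in the usage note are illustrative assumptions, not measurements from any real worm.

```python
import math


def time_to_reach(target, initial, population, rate):
    """Time for the logistic model to grow from `initial` to `target` infected hosts.

    Uses the fact that the odds I/(N - I) grow as exp(rate * t).
    """
    def odds(infected):
        return infected / (population - infected)

    return math.log(odds(target) / odds(initial)) / rate


# Illustrative numbers only: 100,000 vulnerable hosts, growth rate 0.01 per second.
N, RATE = 100_000, 0.01
t_single_seed = time_to_reach(N // 2, 1, N, RATE)
t_hit_list = time_to_reach(N // 2, 10_000, N, RATE)
```

Under these assumptions the hit-list start reaches 50% infection in less than a fifth of the time, and the early exponential stretch that trend-based detectors look for is largely skipped.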

Topological

- Uses info on the infected host to find the next target
- Morris Worm used Network Yellow Pages and the /etc/hosts file to find more hosts
- Email worms use address books
- P2P systems usually store info about the hosts they connect to

Stealth / Passive

- Waits for a vulnerable system to contact it
- Hides the infection among normal traffic
  - No active scanning
- Nimda: modification of server web pages
- P2P systems: infected host could respond to requests with the worm

Polymorphic Worms

- Worms can easily be enhanced for self-modification
  - Simple encryption with a random key would randomize the payload
  - Small decryption routine would remain
  - This could be obfuscated and randomized as well
    - Random do-nothing instructions
    - Random padding
- Exploit might remain common
  - Nimda email: no exploit data
  - Buffer overflow: return address might be the same
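
The effect on content signatures can be shown with harmless placeholder bytes: XOR-encoding the same body under two keys produces two packets whose bodies differ everywhere, while the fixed decoder stub stays identical. `STUB`, `BODY`, and the packet layout are invented for this illustration and do not correspond to any real worm.

```python
STUB = b"DECODER-STUB"  # placeholder for the small routine that must stay in the clear
BODY = b"example payload bytes, harmless placeholder text"


def xor_bytes(data, key):
    """XOR every byte of data with the repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def make_variant(key):
    """Return stub + key + XOR-encoded body as one 'packet'."""
    return STUB + key + xor_bytes(BODY, key)


def decode_variant(packet, key_len):
    """Recover the body by reading the key that follows the stub."""
    key = packet[len(STUB):len(STUB) + key_len]
    return xor_bytes(packet[len(STUB) + key_len:], key)
```

A signature taken from the encoded body of one variant will not match another. Only the stub, and any exploit bytes that cannot be encoded, remain as common content, which is the short invariant that the detection slides have to work with.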

Detection / Prevention

- Ideal: Dynamic Quarantine and Automatic Signature Generation
- IPv6 vs. Worms
- EarlyBird
- Honeycomb
- BGP Information
- Kalman Filter
- Hidden Markov Models
- Email Worm Detection

Ideal

- Detect worm outbreak quickly
- Automatically generate signatures and filter packets immediately
- Distribute alerts and signatures faster than worms can spread
- Is this possible?

IPv6 vs. Worms

- IPv6 has 2^128 IP addresses
- Smallest subnet has 2^64 addresses
  - Equivalent to 4 billion IPv4 internets
- Consider a sub-network with
  - 1,000,000 vulnerable hosts
  - 100,000 scans per second (Slammer: 4,000)
  - 1,000 initially infected hosts
- It would take 40 years to infect 50% of the vulnerable population with random scanning
- Scan-based worms will be ineffective
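
The 40-year figure can be reproduced from the simple epidemic (logistic) model. The sketch assumes that the scan rate is per infected host and that scans are uniform over a single 2^64-address subnet; the slide implies both but does not state them.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_infect_half(address_space, vulnerable, scans_per_second, initially_infected):
    """Closed-form time for the logistic model to reach 50% infection."""
    # Growth rate while most hosts are still uninfected:
    # scans per second times the chance that a scan finds a vulnerable host.
    rate = scans_per_second * vulnerable / address_space
    # The odds I/(N - I) grow as exp(rate * t) and equal 1 at 50% infection.
    seconds = math.log((vulnerable - initially_infected) / initially_infected) / rate
    return seconds / SECONDS_PER_YEAR


years = years_to_infect_half(2**64, 1_000_000, 100_000, 1_000)
```

With the slide's numbers this comes out at roughly 40 years, even though the assumed scan rate is 25 times Slammer's.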

EarlyBird

- “Flows” are identified by packet content (or hash of content)
- Counters of distinct sources and destinations are kept for popular flows
- When counts cross the threshold, the flow is considered a worm, and its content is used for the signature
- Additional “guilt” can be assigned to flows sent to black address space
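
The counting idea can be sketched as follows. This is a minimal illustration that keeps exact sets per content hash; the class name, the SHA-1 key, and the threshold of 3 are assumptions, and a system working at line rate would need far more compact counters than Python sets.

```python
import hashlib
from collections import defaultdict


class ContentPrevalenceTracker:
    """Track distinct sources and destinations seen for each payload hash."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.sources = defaultdict(set)
        self.destinations = defaultdict(set)

    def observe(self, src, dst, payload):
        """Record one packet; return True once its content crosses both thresholds."""
        key = hashlib.sha1(payload).hexdigest()
        self.sources[key].add(src)
        self.destinations[key].add(dst)
        return (len(self.sources[key]) >= self.threshold
                and len(self.destinations[key]) >= self.threshold)
```

Requiring both counts to be high is what separates worm-like spread (many senders, many receivers, same bytes) from a single busy server sending identical content to many clients.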

EarlyBird

- Benefits
  - Counts distinct sources and destinations
  - Most systems simply examine total traffic on a particular port and look for changes in the traffic pattern

EarlyBird

- Packet content examination can be evaded with simple polymorphism
  - The authors suggest using sampled Rabin fingerprinting to find commonly occurring fixed-length strings
  - If only 4 bytes are in common for a polymorphic worm, then the packets will be identified by only 4 bytes…
  - How to differentiate packets?
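
Fingerprinting every fixed-length substring is cheap when the hash is updated incrementally as the window slides. The sketch below uses a plain polynomial rolling hash as a simplified stand-in for Rabin fingerprints, which are defined over polynomials modulo an irreducible polynomial. `BASE`, `MOD`, and the sampling mask are arbitrary illustrative choices.

```python
BASE = 257
MOD = (1 << 61) - 1


def window_fingerprints(data, width):
    """Yield (offset, fingerprint) for every `width`-byte substring of data."""
    if len(data) < width:
        return
    high = pow(BASE, width - 1, MOD)  # weight of the byte leaving the window
    fp = 0
    for byte in data[:width]:
        fp = (fp * BASE + byte) % MOD
    yield 0, fp
    for i in range(width, len(data)):
        # Drop the oldest byte, shift, and append the newest byte.
        fp = ((fp - data[i - width] * high) * BASE + data[i]) % MOD
        yield i - width + 1, fp


def sampled_fingerprints(data, width, sample_mask=0x3):
    """Keep only fingerprints whose low bits are zero (value-based sampling)."""
    return {fp for _, fp in window_fingerprints(data, width) if fp & sample_mask == 0}
```

Because sampling depends on the fingerprint value rather than on position, a substring chosen in one packet is also chosen wherever it reappears. The slide's objection still stands: if the shared content is only a few bytes long, the window must be that short too, and short windows match plenty of innocent traffic.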

Honeycomb

- Plugin to honeyd
- Assumption: all traffic to a honeypot is suspicious
- For every inbound packet, use the longest common substring (LCS) algorithm to find a signature (after performing header analysis)
- Adds signature to the signature pool
- Periodically outputs signature pool to Snort/Bro
- Problems: Traffic to regular hosts? Polymorphism?
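
The core signature step, finding the longest byte string shared by two payloads, can be sketched with a straightforward dynamic program. This quadratic-time version is chosen for readability and is not claimed to be how Honeycomb itself implements LCS.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous byte string present in both a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous byte pair.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return a[best_end - best_len:best_end]
```

For example, `longest_common_substring(b"aaaSIGNATUREbbb", b"ccSIGNATUREdd")` returns `b"SIGNATURE"`. Against a polymorphic worm the shared substring can shrink to a few bytes, which is the weakness the slide's last bullet points at.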

BGP Information

- Use black address space to watch for scans
  - Only useful for detecting random scanning worms
- Use AS profiling to build a model of how much traffic comes from each AS and watch for drastic changes
  - Will it detect in time?
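
A black-space monitor reduces to counting, per source, the probes that arrive at prefixes where no host should exist. In the sketch below `DARK_PREFIXES` is a placeholder (a documentation range, not a real unallocated block) and the `min_probes` threshold is an arbitrary assumption.

```python
import ipaddress
from collections import Counter

# Placeholder: a documentation prefix standing in for unallocated space.
DARK_PREFIXES = [ipaddress.ip_network("198.51.100.0/24")]


def is_dark(addr):
    """True if addr falls inside any monitored dark prefix."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in DARK_PREFIXES)


def dark_space_scanners(packets, min_probes=3):
    """packets: iterable of (src, dst) pairs. Return sources probing dark space often."""
    probes = Counter(src for src, dst in packets if is_dark(dst))
    return {src for src, count in probes.items() if count >= min_probes}
```

A hit-list or topological worm never sends to dark space, so it would not appear in this count at all, which is why the slide limits the technique to random scanners.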


Kalman Filter

Worm propagation follows the epidemic model

[Figure: number of infected hosts I_t (y-axis, 0 to 10 x 10^4) versus time t in seconds (x-axis, 0 to 200)]
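
The slide does not write the model out. In its usual simple form it reads as follows, where N (the vulnerable population), β (the pairwise infection rate), and I_0 (the initial number of infected hosts) are symbols assumed here; only I_t appears on the slide.

```latex
\frac{dI_t}{dt} = \beta\, I_t \,(N - I_t),
\qquad
I_t = \frac{N I_0}{I_0 + (N - I_0)\, e^{-\beta N t}}
```

While I_t is much smaller than N, the right-hand side is approximately βN I_t, so early growth is exponential with rate βN. That early exponential phase is what trend-based detection tries to recognize.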

Kalman Filter

- Best system currently is by Don Towsley et al.
- Distribute sensors (ingress and egress filters) around the network to measure
  - Scan rate
  - Scan distribution
  - Total number of scans
  - Total number of infected hosts
- Info sent to a centralized Malware Warning Center (MWC)

Kalman Filter

[Figure: monitored illegitimate traffic rate and its on-line exponential rate estimate, shown for worm traffic and for a non-worm traffic burst]

Kalman Filter

- MWC uses a Kalman filter to calculate the trend in the growth
  - If it matches the exponential model, it is considered a worm
- Sensors measure the info from packets sent to black IP space
  - Sensors must monitor 2^20 IP addresses to get accurate information
- Can be circumvented by a hit-list or topological worm
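
The trend test can be sketched with a scalar Kalman filter that tracks a single growth factor g in the model count[t] = g * count[t-1] + noise. This is a deliberately reduced stand-in, not the filter from Towsley et al.: the state model, the noise variance, and the decision thresholds are all assumptions made for illustration.

```python
def estimate_growth_factor(counts, measurement_var=1.0):
    """Scalar Kalman filter for g in the model count[t] = g * count[t-1] + noise."""
    g, p = 1.0, 1.0  # initial guess: no growth, unit uncertainty
    history = []
    for prev, cur in zip(counts, counts[1:]):
        gain = p * prev / (prev * prev * p + measurement_var)
        g += gain * (cur - prev * g)  # correct the estimate by the prediction error
        p *= 1.0 - gain * prev        # shrink the uncertainty after the update
        history.append(g)
    return history


def looks_like_worm(counts, min_growth=1.05, sustained=5):
    """Flag traffic whose estimated growth factor stays above min_growth."""
    recent = estimate_growth_factor(counts)[-sustained:]
    return len(recent) == sustained and all(g > min_growth for g in recent)
```

A series that grows by a steady factor keeps the estimate above 1, while a one-off burst pushes it up once and then lets it fall back. Requiring the estimate to stay high for several intervals is what separates the two cases in this sketch.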

Hidden Markov Model

- Not very useful in worm detection
- HMMs are based on changes in states
- Worm outbreaks effectively consist of two states: vulnerable and infected
- To be of use, the transition to infected would need to be detected, which is basically worm detection…

Email Worm Detection

- Email Mining Toolkit (EMT), Columbia
- Cliques: users usually send email to particular sets of users
- Assumption: if a user sends to a set that is not a subset of a clique, something is wrong
- Anomaly detection to find suspicious email to be examined in more detail
- Problems: if a user sends one broadcast email, the clique is useless. False positives.
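
The subset test can be sketched in a few lines. Treating every past recipient set as a clique is a simplification assumed here; the slides do not describe how EMT actually derives its cliques.

```python
def learn_cliques(history):
    """history: iterable of recipient collections from one user's past mail."""
    return {frozenset(recipients) for recipients in history}


def is_suspicious(recipients, cliques):
    """Flag a message whose recipient set is not a subset of any known clique."""
    target = frozenset(recipients)
    return not any(target <= clique for clique in cliques)
```

With cliques learned from `[["ann", "bob", "cy"], ["dee", "eli"]]`, a message to `ann` and `dee` is flagged because it crosses two groups. Once the history contains a single broadcast to all five, the same message passes, which is the weakness noted in the last bullet.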

Conclusion

- Ideal in fighting worms: detection and quarantine / signature generation
- Most research focuses on early detection
- It is not clear how to protect after detection
  - Is it enough to close the port?
  - Ban offending IP addresses temporarily?
- Is it possible to automatically generate signatures for any worm?
