Page 1: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Wolfgang John and Sven Tafvelin
Dept. of Computer Science and Engineering
Chalmers University of Technology
Göteborg, Sweden

Page 2: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

2008-01-23, ICOIN 2008

Introduction: Measurement location

[Figure: measurement location diagram. Labels: Internet; Regional ISPs; Göteborg; Stockholm; Göteborg’s Univ.; Chalmers Univ.; Student-Net; Other smaller Univ. and Institutes]

• 2x 10 Gbit/s (OC-192)
• capturing headers only
• IP addresses anonymized
• tightly synchronized
• bidirectional per-flow analysis
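The bidirectional per-flow analysis mentioned above can be pictured with a minimal Python sketch. The packet tuple layout and the names `flow_key` and `merge_directions` are assumptions made for illustration; the slides do not describe the actual flow-assembly code.

```python
from collections import defaultdict

def flow_key(src_ip, src_port, dst_ip, dst_port, proto):
    """Direction-independent key: both directions of a connection share one key."""
    a, b = (src_ip, src_port), (dst_ip, dst_port)
    lo, hi = (a, b) if a <= b else (b, a)
    return (lo, hi, proto)

def merge_directions(packets):
    """Aggregate packet headers into bidirectional flows.

    Each packet is a hypothetical tuple:
    (src_ip, src_port, dst_ip, dst_port, proto, length).
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for src_ip, src_port, dst_ip, dst_port, proto, length in packets:
        flow = flows[flow_key(src_ip, src_port, dst_ip, dst_port, proto)]
        flow["packets"] += 1
        flow["bytes"] += length
    return dict(flows)

# Two packets of one connection (opposite directions) plus one other connection.
packets = [
    ("10.0.0.1", 40000, "10.0.0.2", 80, "tcp", 60),
    ("10.0.0.2", 80, "10.0.0.1", 40000, "tcp", 1500),
    ("10.0.0.1", 40001, "10.0.0.2", 80, "tcp", 60),
]
flows = merge_directions(packets)
```

Sorting the two endpoints before building the key is one simple way to make both directions land in the same record; it works on anonymized addresses as long as the anonymization is consistent per address.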

Page 3: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Introduction: Motivation

• Problem:
– Operators don’t know the type of their traffic
– How to:
  • Improve network design and provisioning?
  • Support QoS or security monitoring?
  • Enhance accounting possibilities?
  • Reveal trends and changes in network applications?

Page 4: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Introduction: Classification

• Solution: Traffic classification
– Four basic approaches:
1. Port numbers
   + easy to implement
   - unreliable (P2P, malicious traffic)
2. Packet payloads
   + accurate
   - requires updated payload signatures
   - privacy and legal issues
   - high processing requirements
   - does not work on encrypted traffic (P2P)

Page 5: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Introduction: Classification (2)

• Solution: Traffic classification (contd.)
3. Statistical fingerprinting
   + no detailed packet information needed
   - depends on quality of training data
   - promising, but still immature
4. Connection patterns
   + no payload required
   + no training data required
   - not perfect accuracy

Page 6: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Methodology: Traffic Classification

• Two articles classify P2P flows according to connection patterns:
– Karagiannis et al., 2004
– Perenyi et al., 2006

• Updated classification heuristics:
– Refined the heuristics in prior articles
– Added new, necessary heuristics

Page 7: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Methodology: Proposed Heuristics

• Rules based on connection patterns and port numbers
– 5 rules for P2P traffic (H1-H5)
– 10 rules to classify other traffic types (F1-F10)
  • remove ‘false positives’ from P2P
– Rules are applied:
  • On flows in 10 minute intervals
  • Independently on all flows and prioritized when fetched from the database
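A rough Python sketch of "apply independently, then prioritize" follows. The two example rules, the port 6881 (a commonly used BitTorrent port, chosen only for illustration), and the priority order are assumptions; the slides do not give the actual ordering or the database schema.

```python
def classify(flow, rules, priority):
    """Apply every rule independently, then return the highest-priority match.

    `rules` maps a rule name to a predicate on a flow record (a dict here);
    `priority` lists rule names from highest to lowest priority.
    """
    matched = {name for name, predicate in rules.items() if predicate(flow)}
    for name in priority:
        if name in matched:
            return name, matched
    return "unclassified", matched

# Hypothetical rule set with two entries, for illustration only.
rules = {
    "H2": lambda f: f["dst_port"] in {6881},
    "F3": lambda f: f["src_port"] == f["dst_port"] and f["dst_port"] < 501,
}
# Assumed ordering: the F rule outranks the P2P rule, in line with the
# F rules removing false positives from P2P.
priority = ["F3", "H2"]

label, matched = classify({"src_port": 53, "dst_port": 53}, rules, priority)
```

Keeping the full set of matched rules next to the final label makes it possible to change the priority order later without re-running the rules.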

Page 8: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Methodology: Proposed Heuristics (2)

– Heuristics for potential P2P traffic (H1-H5)
  • All traffic to and from potential P2P hosts is marked as P2P traffic
  • H1: TCP and UDP traffic between IP pair
  • H2: Well known P2P ports
  • H3: Re-usage of source port within short time
  • H4: Non-parallel connections to endpoint (IP/Port)
  • H5: unclassified, long flows
    – unclassified by H1-H4 and F1-F9
    – more than 1MB in one direction or
    – duration of more than 10 minutes
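H1 and H5 are specific enough on the slide to sketch in Python. The flow record fields (`src_ip`, `proto`, `bytes_fwd`, `duration_s`, and so on) are hypothetical, and reading "1MB" as 10^6 bytes is an assumption. `h5_long_flow` is meant to be called only on flows that no other rule has matched.

```python
from collections import defaultdict

def h1_hosts(flows):
    """H1: an IP pair exchanging both TCP and UDP marks both hosts as potential P2P."""
    protos = defaultdict(set)
    for f in flows:
        pair = frozenset((f["src_ip"], f["dst_ip"]))
        protos[pair].add(f["proto"])
    hosts = set()
    for pair, seen in protos.items():
        if {"tcp", "udp"} <= seen:
            hosts |= pair
    return hosts

ONE_MB = 1_000_000   # assumption: "1MB" read as 10^6 bytes
TEN_MINUTES = 600    # seconds

def h5_long_flow(f):
    """H5: more than 1 MB in one direction, or lasting more than 10 minutes."""
    big = max(f["bytes_fwd"], f["bytes_rev"]) > ONE_MB
    return big or f["duration_s"] > TEN_MINUTES

flows = [
    {"src_ip": "a", "dst_ip": "b", "proto": "tcp"},
    {"src_ip": "b", "dst_ip": "a", "proto": "udp"},
    {"src_ip": "a", "dst_ip": "c", "proto": "tcp"},
]
p2p_hosts = h1_hosts(flows)
```

Once a host is in `p2p_hosts`, all traffic to and from it would be marked as P2P, following the first bullet of the slide.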

Page 9: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Methodology: Proposed Heuristics (3)

– Heuristics for other traffic (F1-F10)
  • F1 and F2: Web servers:
    – parallel connections to Web ports
    – All traffic to and from Web servers is Web traffic
  • F3: common services (DNS, BGP)
    – Equal source and destination port and port<501
  • F4: Mail servers:
    – Hosts receiving traffic on mail ports (smtp, imap, pop) while sending traffic via smtp
    – All traffic to and from Mail servers is Mail traffic
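As an illustration, F3 and F4 could be written along these lines in Python. The flow fields are hypothetical and `src_ip` is assumed to be the initiator of the flow. The port numbers are the standard assignments for SMTP (25), POP3 (110) and IMAP (143); the slides do not list the exact port set used.

```python
SMTP = 25
MAIL_PORTS = {25, 110, 143}  # smtp, pop3, imap (standard assignments)

def f3_common_service(f):
    """F3: equal source and destination port, and port below 501."""
    return f["src_port"] == f["dst_port"] and f["dst_port"] < 501

def f4_mail_servers(flows):
    """F4: hosts that receive traffic on mail ports while also sending via SMTP."""
    receives_mail = set()
    sends_smtp = set()
    for f in flows:
        if f["dst_port"] in MAIL_PORTS:
            receives_mail.add(f["dst_ip"])
        if f["dst_port"] == SMTP:
            sends_smtp.add(f["src_ip"])
    return receives_mail & sends_smtp

flows = [
    {"src_ip": "x", "dst_ip": "m", "src_port": 40000, "dst_port": 25},
    {"src_ip": "m", "dst_ip": "y", "src_port": 40001, "dst_port": 25},
    {"src_ip": "z", "dst_ip": "w", "src_port": 40002, "dst_port": 143},
]
mail_servers = f4_mail_servers(flows)
```

In this toy data, host `w` receives IMAP traffic but never sends via SMTP, so only `m` satisfies both conditions.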

Page 10: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Methodology: Proposed Heuristics (4)

– Heuristics for other traffic (F1-F10)
  • F5 and F6: Messenger and Gaming
    – Hosts connected to by a number of different IPs on well-known messenger, chat or gaming ports within a period of 10 days
    – All traffic to and from these hosts is messenger or gaming
  • F7: FTP
    – Active FTP with initiating port number of 20
  • F8: non P2P ports:
    – Some well-known, privileged port numbers, typically not used by P2P, like dns, telnet, ssh, ftp, mail, rtp, bgp …
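A possible Python reading of F5/F6 and F7 is sketched below. The threshold `min_peers` and the port set `CHAT_PORTS` (only IRC's port 6667, as an example) are hypothetical, since the slide says only "a number of different IPs". The flows passed in are assumed to be already restricted to the 10-day window, with `src_port` belonging to the initiator.

```python
from collections import defaultdict

CHAT_PORTS = {6667}  # illustrative only: IRC

def f5_f6_hosts(flows, ports=CHAT_PORTS, min_peers=3):
    """F5/F6: hosts contacted by at least `min_peers` distinct IPs on the given ports."""
    peers = defaultdict(set)
    for f in flows:
        if f["dst_port"] in ports:
            peers[f["dst_ip"]].add(f["src_ip"])
    return {host for host, ips in peers.items() if len(ips) >= min_peers}

def f7_active_ftp(f):
    """F7: active-mode FTP data connection, initiated from port 20."""
    return f["proto"] == "tcp" and f["src_port"] == 20

flows = [
    {"src_ip": "c1", "dst_ip": "h", "dst_port": 6667},
    {"src_ip": "c2", "dst_ip": "h", "dst_port": 6667},
    {"src_ip": "c3", "dst_ip": "h", "dst_port": 6667},
    {"src_ip": "c1", "dst_ip": "g", "dst_port": 6667},
]
chat_hosts = f5_f6_hosts(flows)
```

Counting distinct source IPs rather than connections keeps a single busy client from making an ordinary host look like a chat or game server.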

Page 11: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Methodology: Proposed Heuristics (5)

– Heuristics for other traffic (F1-F10)
  • F9: malicious and attack traffic
    – Scans (scan from one source through port ranges)
    – Sweeps (scans from one source through IP ranges)
    – DoS attacks (“hammering attacks” from one source to few hosts in high frequency)
  • F10: unclassified, known non-P2P port
    – unclassified by H1-H4 and F1-F9 (no connection pattern)
    – Well known ports including Web, messenger and gaming
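The scan and sweep patterns in F9 can be approximated by counting distinct targets per source, as in this Python sketch. The function name and the threshold `min_targets` are hypothetical; the slides give no cut-off values, and DoS detection would additionally need timing information that is left out here.

```python
from collections import defaultdict

def find_scans_and_sweeps(flows, min_targets=20):
    """Return (scanners, sweepers) among the source IPs.

    Scan:  one source probing many ports on the same destination host.
    Sweep: one source probing the same port on many destination hosts.
    """
    ports_per_pair = defaultdict(set)   # (src, dst)  -> destination ports seen
    hosts_per_port = defaultdict(set)   # (src, port) -> destination IPs seen
    for f in flows:
        ports_per_pair[(f["src_ip"], f["dst_ip"])].add(f["dst_port"])
        hosts_per_port[(f["src_ip"], f["dst_port"])].add(f["dst_ip"])
    scanners = {src for (src, _), ports in ports_per_pair.items()
                if len(ports) >= min_targets}
    sweepers = {src for (src, _), hosts in hosts_per_port.items()
                if len(hosts) >= min_targets}
    return scanners, sweepers

# Toy data: "a" walks 25 ports on one host, "c" hits port 445 on 25 hosts.
flows = [{"src_ip": "a", "dst_ip": "b", "dst_port": p} for p in range(1, 26)]
flows += [{"src_ip": "c", "dst_ip": f"h{i}", "dst_port": 445} for i in range(25)]
flows += [{"src_ip": "d", "dst_ip": "e", "dst_port": 80}]
scanners, sweepers = find_scans_and_sweeps(flows)
```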

Page 12: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Verification of proposed rule-set

[Figure: Comparison of classification methods for P2P traffic. Panels: number of connections (in 10^6); amount of data (in TB)]

Page 13: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Results

[Figure: Application Breakdown, April 2006]

Page 14: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Results (2)

Detailed results will be published at PAM 2008:
W. John, S. Tafvelin and T. Olovsson, Trends and Differences in Connection Behavior within Classes of Internet Backbone Traffic, to be presented at the Passive and Active Measurement Conference, Cleveland, Ohio, USA, April 2008. (Proceedings to be published in Springer LNCS)
http://pam2008.cs.wpi.edu/

Documentation about measurements (raw data):
DatCat – Internet Measurement Data Catalog by CAIDA
http://www.datcat.org (search for SUNET)

Page 15: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Conclusions

• Previous classification methods on packet header traces don’t work well on backbone data

• Proposal of refined and updated heuristics
– Combining previous approaches
– Extension and adjustment of heuristics
– Including a rule for attack traffic

• Simple and fast method to decompose traffic
– no payload required (encryption, header data, etc.)

• Effectively used even on short traces (10 min)
• 0.2% of the data left unclassified

Page 16: Heuristics to Classify Internet Backbone Traffic based on Connection Patterns

Thank you very much for your attention!

Questions?