TAAD - A Tool for Traffic Analysis and Automatic Diagnosis

Kathy L. Benninger

benninger@psc.edu

NLANR/Pittsburgh Supercomputing Center

NLANR/PSC http://www.ncne.nlanr.net/TCP/TAAD

Outline

• Context for development of TAAD

• Characteristics of the tool

• Performance model

• Output description and interpretation

• OCXmon

• Practical considerations

Context

• TAAD is being developed by the NLANR network research group based at the Pittsburgh Supercomputing Center

• NCNE Pittsburgh GigaPoP based at PSC

• Coexistence of NLANR group and the NCNE Pittsburgh GigaPoP provides ample opportunity for development and test.

Context (cont’d)

• Need for a tool to support NLANR/PSC’s TCP Trace-based Performance Diagnosis Flowchart
– Analysis of heavily aggregated traffic
– Automatic problem detection and partial diagnosis

• Availability of OCXmon data collection

Tool Characteristics

• Searches aggregate traffic for mistuned microflows

• Tool for GigaPoP operators

• Examines traffic from GigaPoP viewpoint, but detects end-system problems

Tool Characteristics (cont’d)

• Uses model developed in “The Macroscopic Behavior of the TCP Congestion Avoidance Algorithm” [Mathis, Semke, Mahdavi, Ott, CCR July 1997]

• Compares actual TCP performance to performance predicted by the Model

Tool Characteristics (cont’d)

• Diagnosis of bulk flows

• Does not pinpoint why performance is poor

• Evolving...

Macroscopic Performance Model

$$\mathrm{Rate} = \frac{MSS}{RTT} \cdot \frac{C}{\sqrt{p}}$$

• Rate = Estimated data rate (bytes/second)

• MSS = Maximum Segment Size (bytes)

• RTT = Round Trip Time (seconds)

• p = Segment loss rate (probability)

• C = Proportionality constant (typically 0.7)
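The model, Rate = (MSS / RTT) * (C / sqrt(p)), can be evaluated directly. The following is a minimal sketch, not TAAD code: the function name `model_rate` and the example path parameters are hypothetical, and C defaults to the 0.7 quoted on this slide.

```python
import math


def model_rate(mss_bytes: float, rtt_s: float, p: float, c: float = 0.7) -> float:
    """Estimated data rate in bytes/second: (MSS / RTT) * (C / sqrt(p))."""
    if mss_bytes <= 0 or rtt_s <= 0:
        raise ValueError("MSS and RTT must be positive")
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be a probability in (0, 1]")
    return (mss_bytes / rtt_s) * (c / math.sqrt(p))


# Hypothetical path (not from the slides): 1460-byte segments,
# 70 ms round trip, one lost segment per 10,000.
example_rate = model_rate(mss_bytes=1460, rtt_s=0.070, p=1e-4)
print(f"{example_rate:,.0f} bytes/second")
```

For this hypothetical path the model gives roughly 1.46 million bytes/second. Because the rate varies as 1/sqrt(p), quadrupling the loss rate halves the predicted rate.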

TAAD Calculation

$$\mathrm{GainRatio} = \frac{\mathrm{RatePredictedByModel}}{\mathrm{MeasuredRate}}$$

Model used by TAAD

$$\mathrm{GainRatio} = \frac{C / \sqrt{p}}{\mathrm{MeasuredRate} \cdot (RTT / MSS)}$$

• GainRatio = Indicates potential performance improvement

• p = Analogous to loss rate, but derived from number of packets successfully delivered between recovery events

• MeasuredRate = Data rate (bytes/second)

• RTT = Round Trip Time (seconds)

• MSS = Maximum Segment Size (bytes)
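A short sketch of this calculation follows. It is an illustration, not TAAD's implementation: the function names are hypothetical, and the estimator p = 1/N (N being the number of packets delivered between recovery events) is an assumption, since the slide says only that p is derived from that count.

```python
import math


def estimate_p(packets_between_recoveries: float) -> float:
    """ASSUMED estimator: one recovery event per N delivered packets gives p = 1/N."""
    if packets_between_recoveries <= 0:
        raise ValueError("packet count must be positive")
    return 1.0 / packets_between_recoveries


def gain_ratio(measured_rate: float, rtt_s: float, mss_bytes: float,
               p: float, c: float = 0.7) -> float:
    """GainRatio = (C / sqrt(p)) / (MeasuredRate scaled by RTT / MSS)."""
    model_window = c / math.sqrt(p)                       # segments per RTT per the model
    measured_window = measured_rate * rtt_s / mss_bytes   # segments per RTT as observed
    return model_window / measured_window


# Hypothetical flow: 730,000 bytes/second over a 70 ms path with 1460-byte
# segments and about 10,000 packets between recovery events.
p = estimate_p(10_000)
ratio = gain_ratio(measured_rate=730_000, rtt_s=0.070, mss_bytes=1460, p=p)
print(round(ratio, 2))
```

Scaling the measured rate by RTT/MSS turns it into segments per round trip, so numerator and denominator are both window sizes and the ratio is dimensionless.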

TAAD Output Fields

• Source addresses and ports

• Destination addresses and ports

• Start time and duration of flow

• Counts of packets and bytes

• GainRatio and OpportunitySize

TAAD Output Interpretation

• If GainRatio
– is ~ 1, flow performance is close to Model
– is > 1, indicates a non-IP bottleneck
– is >> 1, invites tuning to improve performance
– is < 1 means cheating!
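These interpretation rules could be turned into a simple classifier. The slides give no numeric cutoffs for "~ 1" or ">> 1", so the thresholds below (within 10% of 1, and 2.0 or more) are purely illustrative assumptions, as are the function name and the label strings.

```python
def interpret_gain_ratio(ratio: float, near_one: float = 0.1,
                         much_greater: float = 2.0) -> str:
    """Map a GainRatio to one of the four readings listed on the slide.

    near_one and much_greater are ASSUMED thresholds, not values from TAAD.
    """
    if ratio <= 0:
        raise ValueError("GainRatio must be positive")
    if abs(ratio - 1.0) <= near_one:
        return "close to model"
    if ratio < 1.0:
        return "faster than model (cheating)"
    if ratio >= much_greater:
        return "tuning candidate"
    return "possible non-IP bottleneck"


for value in (0.5, 1.05, 1.5, 3.7):
    print(value, "->", interpret_gain_ratio(value))
```

In practice an operator would pick the cutoffs from experience with the measured traffic.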

TAAD Output Interpretation (cont’d)

• OpportunitySize is GainRatio scaled by number of packets
– Indicates how much data could have been transmitted in the same amount of time on a properly tuned connection
– Output flows are sorted by OpportunitySize
– Flows with largest OpportunitySize offer largest payoff with tuning
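The ranking can be sketched as follows, reading "scaled by number of packets" as GainRatio multiplied by the packet count. The `FlowSummary` class and the three flows are hypothetical and serve only to show the ordering.

```python
from dataclasses import dataclass


@dataclass
class FlowSummary:
    label: str
    pkts: int
    gain_ratio: float

    @property
    def opportunity_size(self) -> float:
        # GainRatio scaled by the number of packets in the flow.
        return self.gain_ratio * self.pkts


# Hypothetical flows, not taken from any trace.
flows = [
    FlowSummary("flow-a", pkts=100, gain_ratio=1.0),
    FlowSummary("flow-b", pkts=400, gain_ratio=2.5),
    FlowSummary("flow-c", pkts=50, gain_ratio=6.0),
]

# Largest OpportunitySize first, matching the order of the tool's output.
ranked = sorted(flows, key=lambda f: f.opportunity_size, reverse=True)
for flow in ranked:
    print(f"{flow.opportunity_size:8.1f}  {flow.label}")
```

Here flow-c has the highest GainRatio but carries few packets, so flow-b ranks first. Weighting by packet count directs tuning effort toward flows that move the most data.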

Sample Output

# for if 0 vp:vc 1:153
# unknown_encaps: 0
# not_ipv4: 0
# pkts: 32434
# bytes: 18128703
# first: 0.9297266
# latest: 3.69090628
# oprtunsz  src        dst       sport  dport  start_time  duration     pkts  bytes   gainratio
  1244.2    0.177.0.0  0.4.0.0   1023   1383   0.00178352  3.669245243  339   476124  3.7
  1180.0    0.25.0.0   0.4.0.0   20     1037   0.93062580  2.754259825  415   584200  2.8
  558.0     0.58.0.0   0.2.0.0   41415  25     0.93210152  2.620216131  217   305052  2.6
  454.9     0.228.0.0  0.2.0.0   119    2101   0.00483488  3.684525251  133   199500  3.4
  404.0     3.59.0.0   0.4.0.0   7919   5501   0.25671660  3.203558207  187   229234  2.2
  370.0     1.206.0.0  0.4.0.0   80     4586   0.06984540  3.618816853  199   297544  1.9
  352.4     2.60.0.0   0.2.0.0   80     1170   0.10803892  3.455293179  174   218624  2.0
  295.8     0.23.0.0   0.4.0.0   3474   1393   0.36288916  3.281085014  113   157084  2.6
  267.2     2.157.0.0  0.2.0.0   80     4252   0.14957588  3.521914005  103   143855  2.6
  241.8     0.228.0.0  0.30.0.0  80     1547   0.35120440  3.325309753  126   189000  1.9
  208.5     0.23.0.0   0.4.0.0   1275   6699   0.00540136  3.400367737  106   142896  2.0
  187.3     0.212.0.0  0.4.0.0   1986   20     0.00383024  3.671326876  103   140748  1.8
  94.0      0.23.0.0   0.4.0.0   20     1422   0.25331492  3.378539562  103   57204   0.9
# end data for if 0 vp:vc 1:153
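Output in this layout is straightforward to post-process. The sketch below assumes what the sample suggests: lines beginning with `#` are comments or summary counters, and each flow is one row of ten whitespace-separated fields in the order named by the `# oprtunsz src dst ...` header. The `FlowRecord` type and `parse_taad_output` function are hypothetical helpers, not part of TAAD.

```python
from typing import List, NamedTuple


class FlowRecord(NamedTuple):
    oprtunsz: float
    src: str
    dst: str
    sport: int
    dport: int
    start_time: float
    duration: float
    pkts: int
    byte_count: int
    gainratio: float


def parse_taad_output(text: str) -> List[FlowRecord]:
    """Parse flow rows, skipping '#' lines and rows without exactly ten fields."""
    records = []
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0].startswith("#") or len(fields) != 10:
            continue
        records.append(FlowRecord(
            float(fields[0]), fields[1], fields[2],
            int(fields[3]), int(fields[4]),
            float(fields[5]), float(fields[6]),
            int(fields[7]), int(fields[8]), float(fields[9]),
        ))
    return records


# First two flow rows of the sample output on this slide.
SAMPLE = """\
# for if 0 vp:vc 1:153
# oprtunsz src dst sport dport start_time duration pkts bytes gainratio
1244.2 0.177.0.0 0.4.0.0 1023 1383 0.00178352 3.669245243 339 476124 3.7
1180.0 0.25.0.0 0.4.0.0 20 1037 0.93062580 2.754259825 415 584200 2.8
# end data for if 0 vp:vc 1:153
"""

records = parse_taad_output(SAMPLE)
for rec in records:
    print(rec.src, rec.dst, rec.pkts, rec.gainratio)
```

Multiplying the rounded gainratio by pkts (3.7 × 339 ≈ 1254) lands close to the reported oprtunsz of 1244.2, which is consistent with OpportunitySize being GainRatio scaled by packet count.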

OC3mon

• Available through development efforts of
– NLANR/MOAT project at SDSC
– MCI’s OCXmon activity
– CAIDA’s CoralReef software suite

• Passive network monitoring tool

OC3mon (cont’d)

• Data format
– Trace files collected in Coral .crl format
– Analysis output of TAAD is ASCII

• Collects packet headers

• Does not collect payload

Operation

• Five minute trace on one or two interfaces

• New trace capture begins while previous five minutes of data is analyzed

• Data volume (per interface, mid-day)
– Capture .crl file ~ 40MB/minute
– Analysis output file size ~ 25KB/minute
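A back-of-the-envelope calculation shows what these rates imply per interface. The figures are the mid-day rates quoted above, so the daily totals are an upper-bound sketch that assumes every minute is as busy as mid-day; the variable names are illustrative and decimal units (1 GB = 1000 MB) are used.

```python
CAPTURE_MB_PER_MIN = 40    # raw .crl capture rate, per interface, mid-day
ANALYSIS_KB_PER_MIN = 25   # ASCII analysis output rate, per interface, mid-day
TRACE_MINUTES = 5          # length of one trace
MINUTES_PER_DAY = 24 * 60

# Size of a single five-minute trace file.
trace_file_mb = CAPTURE_MB_PER_MIN * TRACE_MINUTES

# Daily totals if traffic stayed at the mid-day rate all day (an assumption).
raw_gb_per_day = CAPTURE_MB_PER_MIN * MINUTES_PER_DAY / 1000
summary_mb_per_day = ANALYSIS_KB_PER_MIN * MINUTES_PER_DAY / 1000

print(f"one trace file:    {trace_file_mb} MB")
print(f"raw capture/day:   {raw_gb_per_day:.1f} GB")
print(f"analysis data/day: {summary_mb_per_day:.1f} MB")
```

The analysis output is about 1600 times smaller than the raw capture (40MB versus 25KB per minute), which is why keeping only the summaries is practical.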

Operational Issues

• Data Policy
– Amount of data
– Security and privacy
– Legal liability

• Run time
– ATM card(s) devoted to continuous capture
– Recommend dedicated machine

Resource requirement

• Currently running on one Intel 450MHz CPU
– CPU ~2% load during trace capture
– CPU ~75-80% load during analysis (and continued trace)
– Wall-clock time for analysis is < 1 minute for a 5 minute trace capture (~200MB trace file)

• 6GB disk sufficient for summary data

Future

• Verification and release

• Adaptation for use with other trace tools

• Additional tools to create a TAAD toolset

Conclusion

• TAAD is intended to help meet the need for a tool to automate the analysis and diagnosis of aggregated bulk flows.

• The analysis and diagnosis are based on comparing modeled and actual performance.

• Output is intended to be a pointer for where to direct tuning efforts for maximum benefit

References

• Macroscopic paper
– http://www.psc.edu/networking/papers/model_ccr97.ps

• TCP Tuning
– http://www.ncne.nlanr.net/TCP/

• TAAD
– http://www.ncne.nlanr.net/TCP/TAAD

• CoralReef
– http://www.caida.org/Tools/CoralReef/