uw-madison - flowscan and rate limiting adventures

UW-Madison - FlowScan and Rate Limiting Adventures I2 Techs Conference May 17, 2001 Michael Hare


TRANSCRIPT

Page 1: UW-Madison - FlowScan and Rate Limiting Adventures

UW-Madison - FlowScan and Rate Limiting Adventures

I2 Techs Conference

May 17, 2001

Michael Hare

Page 2: UW-Madison - FlowScan and Rate Limiting Adventures

Presentation Overview

FlowScan

Controlling ResNet traffic: Some experiences with rate limits

Page 3: UW-Madison - FlowScan and Rate Limiting Adventures

FlowScan: A Network Traffic Reporting and Visualization Tool

FlowScan is a software package for open systems that is freely available under the terms of the GNU General Public License.

Primarily developed by Dave Plonka of UW-Madison.

FlowScan analyses and reports on flow data exported by IP routers.

FlowScan produces graph images, suitable for web pages, which provide a continuous, near real-time view of network traffic across a network's border.

Page 4: UW-Madison - FlowScan and Rate Limiting Adventures

Background on Flows

• The notion of flow profiling was introduced by the research community.

• Today, for performance and accounting reasons, flow profiling is built into some networking devices.

• Not yet standards-based, FlowScan utilizes flows defined and exported by Cisco's NetFlow feature.

Page 5: UW-Madison - FlowScan and Rate Limiting Adventures

Sample Flows - FTP

An IP flow is a unidirectional series of IP packets of a given protocol, travelling between a source and destination, within a certain period of time.
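The definition above can be sketched in a few lines of Python. This is only an illustration of the idea, not NetFlow's actual algorithm: the `Packet` record, the `build_flows` name, and the 60-second idle timeout are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical packet record; the field names are illustrative, not NetFlow's.
Packet = namedtuple("Packet", "ts src dst proto sport dport size")

def build_flows(packets, idle_timeout=60.0):
    """Group packets into unidirectional flows.

    A flow is keyed by (src, dst, proto, sport, dport). A key that has been
    idle longer than idle_timeout seconds starts a new flow.
    """
    active = {}  # key -> flow currently being accumulated
    done = []    # flows that expired because of the idle timeout
    for p in sorted(packets, key=lambda pkt: pkt.ts):
        key = (p.src, p.dst, p.proto, p.sport, p.dport)
        flow = active.get(key)
        if flow is not None and p.ts - flow["last"] > idle_timeout:
            done.append(flow)
            flow = None
        if flow is None:
            flow = {"key": key, "first": p.ts, "last": p.ts,
                    "packets": 0, "bytes": 0}
            active[key] = flow
        flow["last"] = p.ts
        flow["packets"] += 1
        flow["bytes"] += p.size
    return done + list(active.values())

# An FTP control exchange: two packets one way, one packet back.
pkts = [
    Packet(0.0, "10.10.1.5", "192.0.2.9", "tcp", 1025, 21, 60),
    Packet(0.1, "192.0.2.9", "10.10.1.5", "tcp", 21, 1025, 60),
    Packet(0.2, "10.10.1.5", "192.0.2.9", "tcp", 1025, 21, 80),
]
flows = build_flows(pkts)
```

Because flows are unidirectional, the single FTP control connection in the sample shows up as two flows, one per direction.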

Page 6: UW-Madison - FlowScan and Rate Limiting Adventures

FlowScan

• FlowScan maintains counters based upon flow classifications and periodically exports information into databases.

• Counters are currently maintained based on these flow attributes:

– Protocols (ICMP, TCP, UDP)

– Services (FTP, SMTP, HTTP, P2P Apps)

– Subnets (if desired)

– AS pairs

• Works with most Cisco and Riverstone RS routers.

• Compatibility with Juniper's routers and packet-sampling-based flows is in the planning stages (more on this later).
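As a rough illustration of counters kept per flow classification, the sketch below tallies bytes by protocol and service. The port table, the `classify` and `tally_bytes` names, and the flow dictionaries are assumptions for the example; FlowScan itself is written in Perl and its real classification rules are richer than this.

```python
from collections import Counter

# Illustrative port-to-service map (an assumption, not FlowScan's own table).
SERVICE_PORTS = {20: "ftp", 21: "ftp", 25: "smtp", 80: "http"}

def classify(flow):
    """Return a coarse label for a flow dict with proto/sport/dport keys."""
    if flow["proto"] != "tcp":
        return flow["proto"]  # e.g. "udp" or "icmp"
    for port in (flow["sport"], flow["dport"]):
        if port in SERVICE_PORTS:
            return SERVICE_PORTS[port]
    return "tcp-other"

def tally_bytes(flows):
    """Accumulate byte counters per classification for one reporting interval."""
    counters = Counter()
    for flow in flows:
        counters[classify(flow)] += flow["bytes"]
    return counters

sample = [
    {"proto": "tcp", "sport": 1025, "dport": 80, "bytes": 1200},
    {"proto": "tcp", "sport": 80, "dport": 1025, "bytes": 48000},
    {"proto": "udp", "sport": 53, "dport": 4000, "bytes": 300},
    {"proto": "tcp", "sport": 5000, "dport": 6000, "bytes": 700},
]
totals = tally_bytes(sample)
```

In a real deployment the counters would be written to a database at the end of each interval and then reset.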

Page 7: UW-Madison - FlowScan and Rate Limiting Adventures

Some Uses for FlowScan

• Short term network analysis lets you discover recent changes in network behavior. Graphs over a short time frame are based upon five-minute intervals.

• Long term network analysis aids in capacity planning and traffic shaping efforts.

Page 8: UW-Madison - FlowScan and Rate Limiting Adventures

Short-Term Network Analysis: Red Hat 7.1 Release

Events, such as the release of Red Hat 7.1, are visible as jumps in outbound traffic patterns. Outbound Computer Science traffic increased from 10 Mb/s to nearly 80 Mb/s almost instantly.

Page 9: UW-Madison - FlowScan and Rate Limiting Adventures

Short-Term Network Analysis: DoS Detection

Network abuse, such as flood-based Denial of Service attacks, is visible as "stalagmites" and "stalactites". These would be hidden in coarser-grained long-term graphs. Since one flow is created for each series of packets between a source and a destination, portscans are common culprits for these "Flow Explosions".

Page 10: UW-Madison - FlowScan and Rate Limiting Adventures

Short-Term Network Analysis: DoS Detection (cont)

Difference in the number of hosts talking out vs. being talked to in a 5 minute period. Another scheme for detecting portscans unearths the huge number of probes initiated in a 48 hour period.
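One way to picture the "talking out vs. being talked to" comparison is the sketch below. It is a guess at the bookkeeping, not FlowScan's code: the campus prefix, the `host_counts` name, and the sample addresses are placeholders.

```python
import ipaddress

# Placeholder campus prefix; an assumption for the example.
CAMPUS = ipaddress.ip_network("10.10.0.0/16")

def is_local(addr):
    return ipaddress.ip_address(addr) in CAMPUS

def host_counts(flows):
    """flows: (src, dst) pairs seen in one 5 minute interval.

    Returns (local hosts talking out, local hosts being talked to).
    """
    talking_out, talked_to = set(), set()
    for src, dst in flows:
        if is_local(src) and not is_local(dst):
            talking_out.add(src)
        elif is_local(dst) and not is_local(src):
            talked_to.add(dst)
    return len(talking_out), len(talked_to)

# One outside host probing 50 campus addresses, plus one ordinary outbound flow.
scan = [("198.51.100.7", "10.10.0.%d" % i) for i in range(1, 51)]
normal = [("10.10.0.1", "192.0.2.1")]
out_hosts, in_hosts = host_counts(scan + normal)
```

An inbound portscan inflates the "talked to" count far beyond the "talking out" count, which is the imbalance the graph makes visible.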

Page 11: UW-Madison - FlowScan and Rate Limiting Adventures

Long Term Analysis: Input/Output totals, 730 days prior to 12 May 2001

• The academic calendar year dramatically influences campus traffic levels, most notably in ResNet. Since the beginning of data collection in early 1999, ResNet users have typically been larger providers than consumers of Internet content.

• Outbound traffic consistently exceeds our inbound traffic level, but this academic year's inbound/outbound traffic patterns haven't experienced the typical 'doubling' effect; access links are at or near capacity.

Page 12: UW-Madison - FlowScan and Rate Limiting Adventures

Long Term Analysis: Application totals, 365 days prior to 12 May 2001

Here, we get a glimpse of the rise and fall of Napster, the first 'killer' P2P app. Although Napster usage has declined, outbound traffic from ResNet has not.

Page 13: UW-Madison - FlowScan and Rate Limiting Adventures

UW-Madison: Napster vs. Gnutella Usage

Mid-December through mid-January was a quiet time on campus for Napster, as the primary users were not using the network. Here, we clearly illustrate the declining usage of Napster and the increased usage of Gnutella. As with Napster, the campus appears to be a larger provider than consumer of Gnutella data.

Page 14: UW-Madison - FlowScan and Rate Limiting Adventures

UW-Madison: Napster vs. Gnutella Usage (cont)

For the first time, Gnutella overtakes Napster as UW-Madison’s most popular P2P file swapping application.

Page 15: UW-Madison - FlowScan and Rate Limiting Adventures

Long Term Analysis: Peering, 730 days prior to 12 May 2001

FlowScan lets you monitor the effectiveness of your peering by reporting the next-hop source or destination ASes of your traffic. Our biggest peers are WiscNet and Abilene.

Page 16: UW-Madison - FlowScan and Rate Limiting Adventures

CampusIO Extension Modules

Top ASNs

FlowScan can help you make informed peering and provisioning decisions by reporting the amount of traffic that other ASes source, sink, or carry for your institution.

Above, our most popular origin (endpoint) peer is @Home. We are currently working on a peering arrangement.

Page 17: UW-Madison - FlowScan and Rate Limiting Adventures

CampusIO Extension Modules

Alerts

To deal with DoS floods, alerts via pager and email were introduced. Currently based on tolerances set in a configuration file. Looking for ways to utilize AI-type heuristics to automate tolerances.
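A tolerance check of this kind might look like the following sketch. The counter names and limits are invented for illustration and stand in for whatever the real configuration file holds; delivery by pager or email is left out.

```python
# Hypothetical tolerances, standing in for values read from a configuration file.
TOLERANCES = {
    "flows_per_interval": 500000,
    "icmp_bits_per_sec": 5000000,
}

def check_alerts(measured, tolerances=TOLERANCES):
    """Return one message for every measured counter above its tolerance."""
    alerts = []
    for name, limit in tolerances.items():
        value = measured.get(name)
        if value is not None and value > limit:
            alerts.append("%s = %s exceeds tolerance %s" % (name, value, limit))
    return alerts

messages = check_alerts({"flows_per_interval": 900000, "icmp_bits_per_sec": 1000})
```

Fixed limits like these are exactly what the slide hopes to replace with heuristics that adapt the tolerances automatically.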

Page 18: UW-Madison - FlowScan and Rate Limiting Adventures

CampusIO Extension Modules

Top Talkers

This output of FlowScan’s Top Talkers module (anonymized sample shown here) lets you see top bandwidth consumers and providers.
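The ranking behind such a report can be approximated as below. The `top_talkers` function and the (source, destination, bytes) tuples are assumptions for the example, not the module's actual interface.

```python
from collections import Counter

def top_talkers(flows, n=10):
    """flows: (src, dst, bytes) tuples for the reporting period.

    Returns (top providers by bytes sent, top consumers by bytes received).
    """
    sent, received = Counter(), Counter()
    for src, dst, nbytes in flows:
        sent[src] += nbytes
        received[dst] += nbytes
    return sent.most_common(n), received.most_common(n)

sample_flows = [
    ("10.10.1.5", "192.0.2.1", 5000),
    ("10.10.1.6", "192.0.2.1", 200),
    ("10.10.1.5", "192.0.2.2", 3000),
    ("192.0.2.1", "10.10.1.6", 9000),
]
providers, consumers = top_talkers(sample_flows, n=2)
```

Restricting the input to flows that cross the campus border would give the per-host view of who provides and who consumes Internet content.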

Page 19: UW-Madison - FlowScan and Rate Limiting Adventures

Implementing FlowScan in Large Scale (ISP) Networks

WiscNet, Wisconsin’s statewide educational network, is currently researching several challenges of utilizing FlowScan in a large environment.

– Limitations of the flow processor itself (FlowScan)

– Limitations of the exporting hardware (Routers)

Page 20: UW-Madison - FlowScan and Rate Limiting Adventures

Limitations of Flow Processing: Flow Processing

UW-Madison campus collects flows from a Cisco 7507 and processes them on a 700 MHz P3. FlowScan almost falls behind during peak usage times, because there are too many flows to process.

WiscNet handles 2 to 3 times the traffic of Madison, and will be collecting flows from multiple border routers and processing them on a 1 GHz machine. Without some course of action, it is doubtful that the processor will be able to keep up.

Page 21: UW-Madison - FlowScan and Rate Limiting Adventures

Limitations of Flow Processing: Flow Exporting

Large ISPs tend to have devices with high-speed interfaces. Because of router CPU utilization, current hardware is not able to support full flow export on heavily utilized high-speed interfaces (OC12+).

Running FlowScan in an environment with multiple edge routers, possibly with mixed vendors, adds complexity. Juniper routers do not support full scale flow exporting, but they do support a concept known as packet sampling.

Page 22: UW-Madison - FlowScan and Rate Limiting Adventures

Packet Sampling

In order to reduce the CPU demands on their routers, Juniper utilized the concept of packet sampling; instead of considering each packet for flow export, they only examine a configurable percentage.

UW-Madison campus recently evaluated a Juniper router, and found that with its current interfaces and amount of traffic processed, a sampling rate of 1 out of every 96 packets had to be set, otherwise the Juniper would become overburdened in flow export duties.

Page 23: UW-Madison - FlowScan and Rate Limiting Adventures

Packet Sampling (cont)

With packet sampling, the produced graphs looked similar to the graphs produced during non-sampled periods.

Page 24: UW-Madison - FlowScan and Rate Limiting Adventures

The Bright Side of Packet Sampling

As an added bonus, we saw the number of flows being exported from our router drop nearly 90%. FlowScan could easily keep up with this level of flows.

Page 25: UW-Madison - FlowScan and Rate Limiting Adventures

The Ugly Side of Packet Sampling

Packet sampling broke some things we expected it to break, and more.

• Our security team relied on the logs produced by the 1 to 1 flow exporting when investigating network abuse and techno-crimes. We no longer could provide a completely accurate view of our network traffic.

• We lost the ability to detect DoS attacks based on the "stalagmites" and "stalactites" in the flow graphs, because we were only catching about 1/96th of the portscan flows, which usually consist of a single packet.

Page 26: UW-Madison - FlowScan and Rate Limiting Adventures

More Ugly Things about Packet Sampling

• FlowScan itself relies on the 1 to 1 flow exporting for application classification. The Napster and Passive FTP detection modules determined users by looking for patterns in the packets:

• For Napster, look for a client talking to an index server before counting port 6699 traffic as Napster data.

• For Passive FTP, look for established port 20 connections.

Packet sampling gives us no guarantee that these packets will be sampled.
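The dependence on seeing the "setup" traffic can be shown with a small sketch of the Napster rule described above. The index server address, the flow dictionaries, and the `napster_bytes` name are hypothetical; FlowScan's real module is not reproduced here.

```python
NAPSTER_DATA_PORT = 6699  # data port named in the slides

def napster_bytes(flows, index_servers):
    """Count port-6699 bytes only for hosts already seen talking to an index server.

    flows must be in time order; index_servers is a set of addresses.
    """
    known_clients = set()
    total = 0
    for f in flows:
        if f["dst"] in index_servers:
            known_clients.add(f["src"])
        elif NAPSTER_DATA_PORT in (f["sport"], f["dport"]):
            if f["src"] in known_clients or f["dst"] in known_clients:
                total += f["bytes"]
    return total

INDEX_SERVERS = {"192.0.2.50"}  # hypothetical index server address
observed = [
    # client contacts the index server first
    {"src": "10.10.1.5", "dst": "192.0.2.50", "sport": 1030, "dport": 8888, "bytes": 400},
    # the same client then serves data from port 6699
    {"src": "10.10.1.5", "dst": "198.51.100.7", "sport": 6699, "dport": 2000, "bytes": 5000},
    # port 6699 traffic from a host never seen at an index server
    {"src": "10.10.9.9", "dst": "198.51.100.8", "sport": 6699, "dport": 2001, "bytes": 900},
]
with_setup = napster_bytes(observed, INDEX_SERVERS)
without_setup = napster_bytes(observed[1:], INDEX_SERVERS)
```

If sampling happens to miss the short index-server flow, the later data flow is no longer counted as Napster, which is the failure described on this slide.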

Page 27: UW-Madison - FlowScan and Rate Limiting Adventures

Statistical Accuracy of Flow Sampling: Non Sampled Model

I was surprised to find that more than 88% of our flows consisted of twelve packets or fewer. 76% were six packets or fewer.

Upon investigation, these were typically SYNs, ACKs, UDP, and small bits of web content.

Page 28: UW-Madison - FlowScan and Rate Limiting Adventures

Statistical Accuracy of Flow Sampling: Sampled Model

In a sampled model, only 39% of our flows consisted of twelve packets or fewer, and only 27% of our flows were six packets or fewer.

This was compared to 88% and 76% respectively in our non-sampled model.

Page 29: UW-Madison - FlowScan and Rate Limiting Adventures

Conclusions on Sampling

• In our short eval period, the sampled application and input/output graphs appeared representative of the campus traffic, but the nature of the traffic being reported dramatically shifted.

• Larger flows were over-represented, and smaller flows were under-represented.

• Longer studies need to be done.
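A back-of-the-envelope model helps explain why small flows fade from sampled data. The sketch assumes every packet is sampled independently with probability 1/96; that is a simplification chosen for the example, not a description of how the Juniper router actually picks packets.

```python
def p_flow_seen(n_packets, sample_rate=96):
    """Chance that at least one of n_packets is sampled at 1-in-sample_rate."""
    return 1.0 - (1.0 - 1.0 / sample_rate) ** n_packets

# Probability of a flow appearing at all, by flow size in packets.
for n in (1, 6, 12, 100, 1000):
    print(n, round(p_flow_seen(n), 3))
```

Under this assumption a single-packet flow is seen about 1 time in 96, a twelve-packet flow roughly 12 percent of the time, and a thousand-packet flow almost always, so the sampled flow mix tilts toward large flows.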

Page 30: UW-Madison - FlowScan and Rate Limiting Adventures

The Future of Flow Accounting:

– FlowScan is currently coded in Perl for easy maintenance and portability. Further speed improvements may come from a rewrite in C, or from creating a codebase that can utilize multiple processors.

– Running multiple FlowScan instances and aggregating totals collected by each flow processor.

• Breaks stateful inspection.

– Vendor support in hardware for 1 to 1 flow accounting.

Page 31: UW-Madison - FlowScan and Rate Limiting Adventures

FlowScan

Information on flowscan can be found here:

http://net.doit.wisc.edu/~plonka/FlowScan/

The UW-Madison Campus uses FlowScan to graph traffic patterns. The live site is available here.

http://wwwstats.net.wisc.edu

Page 32: UW-Madison - FlowScan and Rate Limiting Adventures

FlowScan

This concludes this portion of the presentation.

Page 33: UW-Madison - FlowScan and Rate Limiting Adventures

Controlling ResNet Traffic

We started investigating rate limits in order to get a handle on ResNet usage.

Outbound Napster traffic at times comprised 50% of our outbound traffic. We first tried educating users to remove server functions of their Napster clients, but no change in network behavior was observed.

Page 34: UW-Madison - FlowScan and Rate Limiting Adventures

Rate Limiting

• Once UW-Madison had FlowScan in place for measurement instrumentation, it became a great tool by which to gauge the effectiveness of configuration changes.

• We needed to attain predictability for network costs, including bandwidth, engineering, and equipment resources.

Page 35: UW-Madison - FlowScan and Rate Limiting Adventures

Basic Types of Rate Limiting: Traffic Shaping

• Traffic shaping - Traffic comes into a queue and is released at a specified rate, thereby smoothing the flow of traffic. This queuing introduces latency into the flow.

(Juniper Networks)
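The queue-and-release behaviour can be simulated with a few lines. This is a simplified single-queue model with an unbounded buffer, written for illustration; the `shape` function and its arguments are assumptions, and real shapers also bound the queue length.

```python
def shape(arrivals, rate_bps):
    """Release packets no faster than rate_bps.

    arrivals: time-ordered (arrival_time_s, size_bytes) pairs.
    Returns (departure_time_s, queueing_delay_s) for each packet.
    """
    released = []
    link_free_at = 0.0
    for arrival, size in arrivals:
        start = max(arrival, link_free_at)               # wait for earlier packets
        link_free_at = start + size * 8.0 / rate_bps     # time to clock this one out
        released.append((link_free_at, link_free_at - arrival))
    return released

# A burst of three 1250-byte packets arriving together, shaped to 1 Mb/s.
burst = [(0.0, 1250), (0.0, 1250), (0.0, 1250)]
shaped = shape(burst, 1000000)
```

Each packet in the burst leaves 10 ms after the one before it, so the output is smooth but the last packet has waited longest: the latency cost mentioned above.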

Page 36: UW-Madison - FlowScan and Rate Limiting Adventures

Basic Types of Rate Limiting: Traffic Shaping

– Advantages

• Prevents congestion at aggregation points.

• Available in a number of routers.

– Disadvantages

• Doesn't necessarily allow all available network capacity to be utilized.

• Doesn't allow "bursting" beyond the configured rate-limit, even if the average rate would conform to the limit.

Page 37: UW-Madison - FlowScan and Rate Limiting Adventures

Basic Types of Rate Limiting: Traffic Policing

Traffic policing - Traffic comes into an interface, and a decision is made either to drop, pass, or mark the traffic (best effort/less than best effort). Queuing is not involved, so it doesn't degrade performance of conformant traffic.

(Juniper Networks)

Page 38: UW-Madison - FlowScan and Rate Limiting Adventures

Basic Types of Rate Limiting: Traffic Policing

Hard policing causes an immediate drop of the packet, which causes retransmissions.

Soft policing is the ability to defer the decision about whether or not to drop a given packet until that packet reaches a downstream router which is better informed as to whether or not congestion currently exists.
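Hard policing is commonly built on a token bucket, and the sketch below shows that general technique. It is not taken from any vendor's implementation; the class name, the rate, and the burst size are assumptions chosen to keep the arithmetic easy to follow.

```python
class TokenBucketPolicer:
    """Hard policer: conforming packets pass untouched, excess packets are dropped."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps / 8.0   # refill rate in bytes per second
        self.burst = burst_bytes     # bucket depth in bytes
        self.tokens = burst_bytes    # start with a full bucket
        self.last = 0.0

    def offer(self, now, size_bytes):
        # Refill for the time elapsed since the last packet, capped at the burst.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "transmit"
        return "drop"  # no queue: the packet is simply discarded

policer = TokenBucketPolicer(rate_bps=8000, burst_bytes=1500)
decisions = [
    policer.offer(0.0, 1500),  # uses the whole burst allowance
    policer.offer(0.0, 100),   # bucket is empty
    policer.offer(1.0, 1000),  # one second of refill covers this packet
]
```

Nothing is buffered, so conforming traffic sees no added delay, while every dropped TCP segment has to be retransmitted by the sender.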

Page 39: UW-Madison - FlowScan and Rate Limiting Adventures

Practical Rate-Limit Methods: Aggregate Rate-Limiting

Aggregate rate limits are usually enforced at some central point in the network. The rate-limit is applied to either a physical interface or to traffic defined by addressing or by application, for example, as can be defined using a Cisco Access Control List (ACL). Aggregate limits can be implemented with policing and/or shaping techniques.

Page 40: UW-Madison - FlowScan and Rate Limiting Adventures

Aggregate Rate-Limiting: Pros and Cons

– Advantages

• Relatively simple to configure.

• Simple to enforce for the router hardware because most rate-limit implementations of this sort do not need to track the state of individual connections.

– Disadvantages

• Inability to track individual users, hosts, or application sessions. As such, they can unfairly punish some users or applications by indiscriminately dropping their packets rather than others'.

• Decreases goodput by causing retransmissions.

Page 41: UW-Madison - FlowScan and Rate Limiting Adventures

Aggregate Rate-Limiting: Experiences

• UW-Madison has experimented with Cisco's Committed Access Rate (CAR) limits on a 7507 router at our campus border. Although it effectively limited traffic to the specified level, it was reported that FTP users in the outside world were unable to even establish a connection to the rate-limited FTP servers because all of the returning ACK packets were dropped during high congestion.

Page 42: UW-Madison - FlowScan and Rate Limiting Adventures

Aggregate Rate-Limiting: Example

• The following commands configured CAR on our Cisco border router to limit a user population's outbound traffic to 10 Mb/s:

access-list 125 permit tcp 10.10.0.0 0.0.255.255 any
interface (your interface)
rate-limit output access-group 125 10000000 1000000 1000000 conform-action transmit exceed-action drop

Page 43: UW-Madison - FlowScan and Rate Limiting Adventures

Practical Rate-Limit Methods: Flow-Based Rate-Limiting

• Flow-based rate limits conform individual traffic flows to a predetermined allocation of bandwidth. They are most effective nearest the population one wishes to control. As with aggregate limits, the rate-limit is applied to either a physical interface or to traffic defined by addressing or by application, and can also be implemented with policing and/or shaping techniques.

Page 44: UW-Madison - FlowScan and Rate Limiting Adventures

Flow-Based Rate-Limiting: Pros and Cons

– Advantages

• Somewhat fair in that they distribute packet drops across individual application sessions of users.

• One user's session doesn't impinge on another's since each flow gets its own allocation.

• There is a fine level of granularity of control, because each direction of individual streams can be affected.

– Disadvantages

• Individual users can't burst traffic within a single session.

• Retransmissions are caused when packets are dropped, leading to decreased goodput.

• Users that create more simultaneous sessions get more bandwidth. A local server can get a large percentage of available bandwidth.

• There are some scalability issues, but this is improving with application-specific hardware support.

• Doesn't set any 'hard' limits. Bandwidth usage not guaranteed.
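The "more sessions, more bandwidth" effect falls straight out of per-flow accounting, as this sketch suggests. It uses a crude per-interval byte budget instead of a real policer, and the function name, host labels, and traffic mix are invented for the example.

```python
from collections import defaultdict

def per_flow_limit(packets, rate_bps, interval_s=1.0):
    """Apply the same byte budget to every flow for one interval.

    packets: (host, flow_id, size_bytes) tuples offered during the interval.
    Returns bytes passed per host.
    """
    budget = rate_bps / 8.0 * interval_s   # bytes each flow may send
    used = defaultdict(float)              # bytes already passed per flow
    passed = defaultdict(int)              # bytes passed per host
    for host, flow_id, size in packets:
        flow = (host, flow_id)
        if used[flow] + size <= budget:
            used[flow] += size
            passed[host] += size
    return dict(passed)

# Host A offers one busy flow; host B offers four equally busy flows.
offered = [("A", 1, 1000)] * 50
offered += [("B", flow_id, 1000) for flow_id in range(4) for _ in range(50)]
result = per_flow_limit(offered, rate_bps=100000)
```

With a 100 Kb/s per-flow limit, host B ends up with four times host A's throughput simply by opening four sessions, and no host-level ceiling is enforced.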

Page 45: UW-Madison - FlowScan and Rate Limiting Adventures

Flow-Based Rate-Limiting: Experiences

• The UW-Madison campus currently has this implemented on a Riverstone RS to limit residence hall network (ResNet) traffic. This can also be done with Cisco gear such as the Catalyst 65xx with the requisite additional cards.

• It was reported that ResNet users had difficulty with UDP based applications, although we had per-flow UDP limiting set to 10Mb/s. The problems disappeared after completely removing the limits.

Page 46: UW-Madison - FlowScan and Rate Limiting Adventures

Flow-Based Rate-Limiting: Example

Example: configuring rate-limits on a Riverstone aggregation router to limit a user population's outbound TCP flows to 100 Kb/s each. Also, limit flows that are likely to be Napster to 33.6 Kb/s. Consider outbound flows to campus destinations and to web servers to be "preferred", so only limit those to 10 Mb/s.

acl resnet_napdata permit tcp 10.10.0.0/16 any 6699 any
acl resnet_napdata permit tcp 10.10.0.0/16 any any 6699
acl resnet_napdata permit tcp 10.10.0.0/16 any 6688 any
acl resnet_napdata permit tcp 10.10.0.0/16 any any 6688
acl resnet_tcp permit tcp 10.10.0.0/16 any
acl resnet_tcp_preferred permit tcp 10.10.0.0/16 any any http
acl resnet_tcp_preferred permit tcp 10.10.0.0/16 10.1.0.0/16
acl resnet_tcp_preferred permit tcp 10.10.0.0/16 10.2.0.0/16
acl resnet_tcp_preferred permit tcp 10.10.0.0/16 10.3.0.0/16
acl resnet_tcp_preferred permit tcp any 10.10.0.0/16
rate-limit resnet_tcp input acl resnet_tcp rate 100000 exceed-action drop-packets sequence 1
rate-limit resnet_napdata input acl resnet_napdata rate 33600 exceed-action drop-packets sequence 3
rate-limit resnet_tcp_preferred input acl resnet_tcp_preferred rate 10000000 exceed-action drop-packets sequence 4
rate-limit resnet_napdata apply interface backbone
rate-limit resnet_tcp apply interface backbone
rate-limit resnet_tcp_preferred apply interface backbone

Page 47: UW-Madison - FlowScan and Rate Limiting Adventures

Results of Rate-Limit Implementations

• Aggregate-based hard policing causes a steady 20 Mb/s output from ResNet.

• Winter break: ResNet traffic very low.

• Flow-based hard policing lowers the amount of traffic, but the level is not steady.

Page 48: UW-Madison - FlowScan and Rate Limiting Adventures

Practical Rate-Limit Methods: TCP Rate Control

• TCP rate control shapes flows of TCP traffic at the same fine granularity available with other flow-based rate limits by manipulating TCP header fields, which are used to negotiate window size information, and by pacing response ACKs.

• Sites such as UW-Whitewater have experience with the Packeteer PacketShaper product. Last weekend, UW-Madison began experimenting with a PacketShaper.
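The reason window manipulation can control rate is the familiar approximation that a single TCP connection moves at most one window of data per round trip. The sketch below only works through that arithmetic; the function names are assumptions, and it says nothing about how the PacketShaper actually chooses window values or paces ACKs.

```python
def window_for_rate(rate_bps, rtt_s):
    """Advertised window (bytes) that holds a flow to roughly rate_bps."""
    return int(round(rate_bps * rtt_s / 8.0))

def rate_for_window(window_bytes, rtt_s):
    """Approximate ceiling (bits/s) for one connection: one window per round trip."""
    return window_bytes * 8.0 / rtt_s

# Holding a flow to 100 Kb/s over an 80 ms round trip.
target_window = window_for_rate(100000, 0.08)
# Ceiling for an unscaled 65535-byte window over the same path.
unshaped_ceiling = rate_for_window(65535, 0.08)
```

Because the sender is told to slow down instead of having its packets dropped, this approach avoids the retransmissions that policing causes.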

Page 49: UW-Madison - FlowScan and Rate Limiting Adventures

TCP Rate Control: Pros and Cons

– Advantages:

• Maximizes goodput by minimizing packet retransmissions.

• A mature commercial product implementing it is available.

• TCP rate control could offer some protection against some obscure denial-of-service attacks which generate non-conforming TCP packets.

– Disadvantages:

• As the name implies, TCP rate control is TCP specific, and therefore must be augmented with other rate-limiting mechanisms.

• There are scalability issues. PacketShaper, for example, must track state of connections and manipulate packets on the fly.

• Patented, closed-source implementations.

Page 50: UW-Madison - FlowScan and Rate Limiting Adventures

TCP Rate Control: Experiences

– At UW-Whitewater, which is connected via DS3 to WiscNet, the transfer rate with the PacketShaper set to 45 Mb/s was roughly 2/3 of the rate measured without the PacketShaper in the network. Simply having the device in the network slowed transfer rates.

– Current maximum physical connection rate is 100Mb Ethernet. Using PacketShaper in large networks is tricky.

– It is basically a flow-based rate controller, so those advantages and disadvantages apply as well.

Page 51: UW-Madison - FlowScan and Rate Limiting Adventures

Rate-Limit Alternatives

• A number of institutions have implemented quota systems. These limits help ensure that limited bandwidth is shared fairly among all residents.

• University of Texas implements weekly bandwidth quotas for their ResNet.

http://resnet.utexas.edu/meter.html

Page 52: UW-Madison - FlowScan and Rate Limiting Adventures

Adaptive Rate Limiting

• Also, some adaptive rate-limit systems have been prototyped and appear to be somewhat effective. For example: http://www.ncne.nlanr.net/training/techs/2001/0128/presentations/200101-kline1_files/v3_document.htm

• "Top Talker" reports have been added to FlowScan to facilitate the enforcement of adaptive rate-limit policies. These systems are complicated to implement because they need to manipulate router configurations programmatically via CLI.

Page 53: UW-Madison - FlowScan and Rate Limiting Adventures

Rate Limit Links

Policing and Shaping overview: http://www.cisco.com/univercd/cc/td/doc/product/software/ios120/12cgcr/qos_c/qcpart4/qcpolts.htm

Rate-limiting and Traffic-policing Features: http://www.juniper.net/techcenter/techpapers/200005.html

Committed Access Rate: http://www.cisco.com/warp/public/732/Tech/car/

TCP Rate Control: http://www.cs.rpi.edu/~karans/report/

Generic Traffic Shaping: http://www.cisco.com/warp/public/732/Tech/gts/

Page 54: UW-Madison - FlowScan and Rate Limiting Adventures

The End…?

[email protected]

Thanks to Dave Plonka, UW Madison