

On Evaluating Policy-Based Bandwidth Management Devices

Huan-Yun Wei¹, Ying-Dar Lin

Department of Computer and Information Science

National Chiao Tung University, Hsinchu, Taiwan

Tel: +886-3-5712121-ext56667

FAX: +886-3-5721490

Email: {hywei,ydlin}@cis.nctu.edu.tw

Policy-based bandwidth management defines how to allocate bandwidth resources according to

organizational policy rules. Enterprises often employ such policy-based devices at their organizational

edges to manage the narrow but expensive Internet access links. This work designs a novel testbed and

uses it to evaluate the functionality and performance of many such devices, including six commercial

products and one open source solution. Their policy rules can be categorized into (1) class-based rule;

(2) connection rule within a class; (3) bandwidth borrowing rule among classes. The testbed mimics the

real-life Internet with heterogeneous Internet delays/delay jitters/packet loss rates, and evaluates the

effectiveness of policy enforcement of the above three policy types in terms of accuracy, fairness,

stability, robustness, bandwidth borrowing, and voice over IP (VoIP) quality. The test results² reveal that

(1) explicitly sizing the TCP window could cause performance or fairness degradation even under slight

packet loss rates; (2) the open source solution can compete with commercial products in accurately

limiting flow aggregates; (3) voice quality over IP networks depends significantly on the packet sizes of all other traffic when a narrowband (125kbps) access link is used.

Keywords: policy-based, bandwidth management, TCP, testbed, emulator

¹ Corresponding author

² All test results have been verified by the vendors and are reproducible with our open tools. Most benchmark reports today are financed by vendors, may be biased, and lack practical testbeds. Guided by this neutral test, readers can gain in-depth insight when examining bandwidth management devices.


1. Introduction

Internet services provide an economical and convenient way to carry out business, such as

efficient information exchange among branch offices, or efficient customer/provider access to the

services. However, the importance of the services varies, and enterprises often fail to effectively utilize

the narrow but expensive WAN link bandwidth. For instance, the bandwidth required by ERP

(Enterprise Resource Planning), voice over IP (VoIP), and e-business may be occupied by less-important

applications such as FTP. Since end-to-end Internet QoS such as DiffServ [1] is still experimental,

enterprises seek to at least manage their inbound and outbound links. Thus, policy-based bandwidth

management devices are employed at organizational edges to set and enforce organizational policies for

pursuing the utmost benefits.

Network administrators define policy rules to achieve resource management objectives for the

enterprise. Each policy rule contains “condition” and “action” fields to define specific actions for

specific conditions. The condition defines the packet-matching criteria, such as a certain subnet, application, or protocol. The action defines the bandwidth parameters, such as “at least 100kbps” or “at most 200kbps”. Each policy rule is thus class-based: it groups a set of traffic flows into a per-class queue according to

the specified packet filter (condition), and then the class of traffic is scheduled out at its corresponding

specified bandwidth (action). Moreover, the class-based rules can be further configured with bandwidth

borrowing among the classes to dynamically utilize available bandwidth effectively. Additionally, each

connection within a class can be guaranteed to have at least a certain amount of bandwidth. Throughout

this work we evaluate the effectiveness of various policy enforcements for the above three policy types:

(1) class-based rule; (2) connection rule within a class; (3) bandwidth borrowing rule among classes.

The following subsections review traditional and prevalent technologies to enforce these policy rules.
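The condition/action structure of a class-based rule can be sketched in a few lines of Python. This is only an illustration: the field names (`subnet`, `protocol`, `dst_port`, `min_kbps`, `max_kbps`), the example rules, and the first-match lookup order are assumptions made for the sketch, not the rule format of any device under test.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class PolicyRule:
    """One class-based rule: a packet filter (condition) plus bandwidth bounds (action)."""
    name: str
    subnet: str                      # condition: source subnet in CIDR form
    protocol: Optional[str] = None   # condition: "tcp", "udp", or None for any
    dst_port: Optional[int] = None   # condition: destination port, or None for any
    min_kbps: Optional[int] = None   # action: "at least" guarantee
    max_kbps: Optional[int] = None   # action: "at most" limit

    def matches(self, src_ip: str, protocol: str, dst_port: int) -> bool:
        if ipaddress.ip_address(src_ip) not in ipaddress.ip_network(self.subnet):
            return False
        if self.protocol is not None and self.protocol != protocol:
            return False
        if self.dst_port is not None and self.dst_port != dst_port:
            return False
        return True


def classify(rules, src_ip, protocol, dst_port):
    """Return the first rule whose condition matches, or None (first-match order is an assumption)."""
    for rule in rules:
        if rule.matches(src_ip, protocol, dst_port):
            return rule
    return None


# Hypothetical rules mirroring the "at least 100kbps" / "at most 200kbps" examples.
RULES = [
    PolicyRule("voip", "10.0.0.0/8", protocol="udp", min_kbps=100),
    PolicyRule("ftp", "10.0.0.0/8", protocol="tcp", dst_port=21, max_kbps=200),
]
```

A packet that matches no rule returns `None` here; a real device would typically place such traffic in a default class.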

Traditional Technology—Queuing

A straightforward method for bandwidth management is to queue less-important traffic and pass

important traffic as soon as possible. Queuing can be roughly categorized into (1) priority-based queuing

and (2) rate-based queuing. Priority-based queuing sets the priority among the classes and the highest

priority class is scheduled out first. This is suitable for short-lived, extremely important, or transaction-

oriented flows. However, priority-based queuing cannot quantitatively guarantee/limit the bandwidth for

a class. As an analogy, if everyone is a VIP, then no one is really a VIP. In contrast, rate-based queuing employs packet scheduling algorithms [2] that decide which class supplies the next packet for transmission. This can effectively limit senders that try to overburden the resource.

Besides, the minimum bandwidth for important applications can be quantitatively guaranteed. Floyd and


Jacobson [3] further investigate the bandwidth borrowing among the classes. Queuing has different

impacts upon UDP and TCP data flows. Next we briefly review UDP and TCP protocols.
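The contrast between the two queuing families can be made concrete with a toy model. The sketch below assumes equal-size packets and permanently backlogged classes, and it uses plain weighted round-robin as a stand-in for the rate-based schedulers surveyed in [2]; it is not the algorithm of any tested device.

```python
from collections import deque


def strict_priority(queues, slots):
    """Serve the highest-priority non-empty queue in each slot; queues[0] is highest."""
    sent = [0] * len(queues)
    for _ in range(slots):
        for i, q in enumerate(queues):
            if q:
                q.popleft()
                sent[i] += 1
                break
    return sent


def weighted_round_robin(queues, weights, slots):
    """Serve up to weights[i] packets from queue i per round (equal-size packets assumed)."""
    sent = [0] * len(queues)
    remaining = slots
    while remaining > 0 and any(queues):
        for i, q in enumerate(queues):
            for _ in range(weights[i]):
                if remaining == 0 or not q:
                    break
                q.popleft()
                sent[i] += 1
                remaining -= 1
    return sent


def backlog(n):
    """A queue holding n dummy packets."""
    return deque(range(n))
```

With two backlogged classes and 60 transmission slots, `strict_priority` gives every slot to the first class, whereas `weighted_round_robin` with weights 3 and 1 splits the slots 45 to 15. The second result is a quantitative share, which is what priority alone cannot express.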

Queuing the Internet Traffic: TCP vs. UDP

The majority of software applications today use TCP (Transmission Control Protocol) for data

transmission because TCP can establish a reliable end-to-end connection. TCP receivers acknowledge

the successful reception of each data packet by returning an Ack to their TCP senders. Ack packets thus trigger senders to send out new data packets. Unacknowledged data packets are retransmitted to

guarantee reliability of data transfers. TCP also incorporates flow control mechanisms that prevent a

sender from overburdening the network capacity or overflowing its receiver’s buffer. Thus each TCP

sender keeps two window values, congestion window (CWND) and receiver advertised window

(RWND), and seeks to satisfy both network capacity (congestion control) and receiver's capability of

receiving the data, respectively. Each TCP sender therefore never has more unacknowledged data outstanding than min(CWND, RWND). RWND is advertised by the receiver in TCP Ack packets and varies widely among operating systems. CWND increases exponentially during the slow-start phase and linearly during the congestion avoidance phase, probing for available bandwidth until packet losses occur. TCP versions differ mainly in how CWND is shrunk and raised after a loss and in how precisely the lost segments are retransmitted. Fall and Floyd [4] give a good overview of the Tahoe, Reno, NewReno, and SACK TCP versions and their problems. Padhye and Floyd [5] further investigate the TCP

version distribution among 4550 Web servers. Unlike TCP, UDP (User Datagram Protocol) lacks the

connection establishment, reliability of data transfer, and flow control. UDP only provides port number

multiplexing and is commonly used by real-time applications such as video conferencing and Voice over

IP (VoIP).
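The two-window rule and the two growth phases described above can be summarized in a small sketch. The slow-start threshold `ssthresh` and the segment size `mss` are standard TCP quantities that the text does not name; the per-round-trip update is a simplification of real per-Ack behavior.

```python
def sendable_bytes(cwnd, rwnd, in_flight):
    """Bytes the sender may still transmit: the effective window is min(CWND, RWND)."""
    return max(0, min(cwnd, rwnd) - in_flight)


def grow_cwnd(cwnd, ssthresh, mss):
    """CWND after one loss-free round trip (simplified model).

    Below ssthresh (slow start) the window doubles per round trip, capped at
    ssthresh; at or above it (congestion avoidance) it grows by one segment.
    """
    if cwnd < ssthresh:
        return min(cwnd * 2, ssthresh)
    return cwnd + mss
```

Starting from one 1460-byte segment with `ssthresh` at eight segments, repeated calls to `grow_cwnd` yield 2920, 5840, 11680, 13140, and 14600 bytes: exponential growth up to the threshold, then linear growth.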

Queuing has different impacts upon UDP and TCP flows. As for real-time UDP traffic, the bit rate

is often fixed and the video/voice quality heavily depends on the loss rate, delay, and delay jitter. The

packet scheduler must precisely allocate enough bandwidth for real-time UDP traffic to minimize packet

losses and delay at the controlling device. Moreover, real-time packets must be scheduled out smoothly at even intervals to minimize delay jitter. As for TCP traffic, TCP flows competing for the same queue can cause a large number of data packets to be queued in the device, resulting in high buffer requirements and large packet latency at the device. Moreover, the TCP flows

may not fairly share the class bandwidth, especially when their round-trip times (RTT) are different.

Thus many vendors apply specific algorithms for regulating TCP traffic.

Specific Algorithms for TCP Traffic

To guarantee each TCP connection bandwidth within a class, and hence achieve fairness among

the flows within a class, the ideal solution is to actively control the sending rate of each sender within


the class instead of letting them compete with each other. Queuing, and with it the queuing delay and buffer requirement, can thus be reduced. Other types of traffic, such as UDP, can only resort to the primitive solution, queuing, to control their bandwidth passively. Two methods exist for controlling each TCP connection: (1)

window-sizing and (2) packet-dropping.

1. Window Sizing: Since a TCP connection can be actively controlled through the feedback

Acks, the window-sizing method directly influences the number of bytes sent by shrinking the RWND in the TCP Acks. In this test, iPolicer, PacketShaper, WiseWAN, QoSWorks, and Guardian Pro belong to this type. Karandikar et al. [6], sponsored by Packeteer, investigate the window-sizing

technique. Though window-sizing can directly control per-connection bandwidth, it needs to

readjust its Ack regulations when another connection enters or leaves the class.

2. Packet Dropping: Because a TCP sender slows down its transmission rate in response to

network congestion by halving its congestion window size, the packet-dropping method drops

packets and expects that the sender will slow down its rate when detecting the packet loss events

[7]. In this test, FloodGate (uses per-flow queuing) and ALTQ_CBQ+RED belong to this type.
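The arithmetic behind window sizing can be sketched as follows. A window-limited sender transmits at most one window per round trip, so a target rate maps to a window of roughly rate times RTT. The function names, the rounding to whole segments, and the equal per-connection split are assumptions of this sketch, not the algorithm of any tested product.

```python
import math


def advertised_window(target_bps, rtt_s, mss=1460):
    """RWND (bytes) that holds one connection near target_bps over a path with the given RTT.

    A window-limited sender sends at most one window per round trip, so
    rate <= window / RTT. The result is rounded up to whole segments and
    never drops below one segment.
    """
    window_bytes = target_bps / 8 * rtt_s
    segments = max(1, math.ceil(window_bytes / mss))
    return segments * mss


def per_connection_share(class_bps, n_connections):
    """Equal split of the class bandwidth; recomputed whenever a connection joins or leaves."""
    return class_bps / n_connections
```

For example, a 200kbps target over a 100ms round trip needs 2500 bytes in flight, which rounds up to two segments (2920 bytes). At a 5kbps per-connection share and a 10ms round trip the ideal window is only about 6 bytes, far below one segment, which suggests why very narrow classes are hard to control by window size alone.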

This work designs a novel testbed for evaluating the effectiveness of various policy enforcement

techniques used by existing products or solutions. The testbed mimics the real-life Internet

characteristics such as WAN delay, delay jitter, and packet loss. Section 2 compares the relevant

information of the devices under test (DUT). Section 3 then describes the design of our testbed and the

test methodology. Section 4 demonstrates the test results. Finally, a summary of the test results and

conclusions are given in Section 5.

2. Device under Test (DUT)

This test project invited nine vendors, and six of them joined. Table 1 compares the relevant information of all the DUTs. Most DUTs are installed on the LAN-router link to prevent router queues from overflowing and causing congestion. Because the grade of each DUT differs, only low-bandwidth configurations (below 1.544Mbps) are tested. This minimizes hardware differences so that the test results reflect the true management capability of each DUT.

| Vendor / Model | Grade (announced) | S/W ver. | OS, HW/SW | Install at | Boot from | CPU | RAM | Interface | Fail over | Log to |
|---|---|---|---|---|---|---|---|---|---|---|
| ALTQ 2.2 [8] | 100Mbps | 2.2 | FreeBSD, Software | Between LAN and router | Hard disk (our PC) | PIII 700MHz | 256M SDRAM | 2 Intel 100M NICs | N | Same FreeBSD |
| NetGuard’s Guardian Pro [9] | 10Mbps | 5.02 | NT 4.0, Software | Between LAN and router | Hard disk (our PC) | PIII 700MHz | 256M SDRAM | 2 Intel 100M NICs | N | Same NT server |
| CheckPoint’s FloodGate [10] | 45 Mbps | 4.1 | NT 4.0, Software | Between LAN and router | Hard disk (our PC) | PIII 700MHz | 256M SDRAM | 2 Intel 100M NICs | HA* | Same NT server |
| BroadWeb/Acute’s iPolicer 100-CR2202 [11][12] | 100 Mbps | 1.6.4 | Embedded NT, Hardware | Between LAN and router | Flash 32M | PIII 600 | 128M | 10/100Mbps | N | Another NT server |
| Packeteer’s PacketShaper 4500 [13] | 45 Mbps | 4.1.2 | Embedded Linux, Hardware | Between LAN and router | Flash | PIII 600 | 128M | 10/100Mbps | Y | Embedded hard disk |
| Sitara’s QoSWorks QWX-10000 [14] | 100 Mbps | 1.8 | Embedded FreeBSD, Hardware | Between LAN and router | Hard disk | PIII 600 | 192M | 10/100Mbps | Y | Embedded hard disk |
| NetReality’s WiseWan 200/500 [15] | 5Mbps | 4.0 | Proprietary, Hardware | WAN link | Flash 32M | P 133 | 32M | V.35 (10Mbps log) | Y | Another NT server |

Note 1: Invited vendors also included Lucent’s Access Point and Allot’s NetEnforcer (these two decided not to join this test after examining our test plan) and Cisco’s Cisco Assure (which declined to join from the beginning).

Note 2: Fail Over is defined as the capability of bypassing traffic when the power is off. HA means high availability module (optional).

Note 3: Sitara revealed to us that QoSWorks uses ALTQ_CBQ.

Table 1: Product information and software/hardware platforms

2.1 Functionality of Policy Console

Network administrators use the policy console to define organizational bandwidth policy rules. Table 2 lists the functionality of each policy console. All DUTs can limit the bandwidth of a class. Moreover, most DUTs, except for Guardian Pro and ALTQ, can guarantee a minimum bandwidth for each connection within the class. These two settings can be further combined with (a) inter-class bandwidth borrowing and (b) intra-class bandwidth borrowing, respectively. In (a), the DUTs redistribute any bandwidth left unused by some classes to other active classes; in (b), if any flow in a class terminates, its bandwidth is fairly redistributed to the other flows.

| Vendor / Model | Classifier: Src/Dst IP/Port#, mask, Prot. ID | Classifier: Host list | Direction (In/Out) | UDP traffic control | WAN link speed setup | Class limit | Guarantee BW for each connection in the class | Inter-class borrowing | Intra-class borrowing |
|---|---|---|---|---|---|---|---|---|---|
| ALTQ | Y | N | Both | Y | Y | Y | N | Auto | Compete² |
| NetGuard’s Guardian Pro | Y | Y | Both | Y | Y | Y | N | Degree¹ | Compete |
| CheckPoint’s FloodGate | Y | Y | Both | Y | Y | Y | Y | Degree | Degree |
| NetReality’s WiseWan | Y | Y | Both | Y | Y | Y | Y | Auto | Auto |
| Acute/Broadweb’s iPolicer | Y | Y | Both | N | N | Y | Y | N | N |
| Packeteer’s PacketShaper | Y | Y | Both | Y | Y | Y | Y | Degree | Degree |
| Sitara’s QoSWorks | Y | N | Both | Y | Y | Y | Y | Auto | Auto |

1 Degree means that administrators can manually specify the degree of bandwidth borrowing.

2 DUTs without connection guarantee let the flows within the class compete with each other.

Table 2: Functionality Comparison of the Devices under Test


2.3 Protocol Support

Table 3 compares the protocol support of each DUT. Most Internet services/protocols can be

recognized by layer-4 TCP/UDP port numbers. However, layer-7 awareness can increase the simplicity

and capability of bandwidth management. For example, FTP includes a passive mode, in which the FTP-data port (port 20, for sending data) can be changed dynamically to another port through negotiation on the FTP-Cmd port (port 21, for sending FTP commands). If the DUT cannot follow the negotiation on the FTP-Cmd port, it cannot control the connection that actually carries the data.

PacketShaper and WiseWAN have the richest layer-7 awareness. In terms of quantity of port-service

mapping entries, WiseWAN and PacketShaper are the richest. The next richest are FloodGate and

Guardian Pro. iPolicer, QoSWorks, and ALTQ have few or no built-in port-service mapping entries and

require manual lookups in the port-service mapping table. Although iPolicer can identify UDP, it cannot

control its bandwidth.
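To make the FTP example concrete, the sketch below shows the kind of layer-7 inspection a device would need: reading the passive-mode reply on the command connection to learn which data port to control. The reply format (`227 ... (h1,h2,h3,h4,p1,p2)`, with port = p1 × 256 + p2) comes from the FTP specification; the function name is illustrative, and nothing here describes how any tested product implements it.

```python
import re

# RFC 959 reply to PASV: "227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)"
_PASV_RE = re.compile(r"^227 .*?(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)")


def parse_pasv_reply(line):
    """Return (ip, port) of the negotiated data connection, or None if the line is not a 227 reply."""
    match = _PASV_RE.match(line)
    if match is None:
        return None
    h1, h2, h3, h4, p1, p2 = (int(g) for g in match.groups())
    return f"{h1}.{h2}.{h3}.{h4}", p1 * 256 + p2
```

A device that only matches port 20 or 21 would miss the data connection announced in such a reply, since the negotiated port can be any value.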

| Vendor / Model | Layer | Layer-7 type | Built-in port-service mappings: TCP | Built-in port-service mappings: UDP | ICMP | IPX | # of other protocols |
|---|---|---|---|---|---|---|---|
| ALTQ | 4 | N | 0 (Manually assign port #) | 0 (Manually assign port #) | N | N | Manually assign port # |
| NetGuard’s Guardian Pro | 4 | N | 60 | 35 | Y | N | 15 |
| CheckPoint’s FloodGate | 7 | URL/MIME-TYPE | 60 | 35 | Y | N | Manually assign port # |
| NetReality’s WiseWan | 7 | URL/MIME-TYPE | 109 | 79 | Y | Y | Above 250 |
| Acute/BroadWeb’s iPolicer | 4 | N | 12 | Cannot control | Y | N | Manually assign port # |
| Packeteer’s PacketShaper | 7 | URL | Total above 200 (layer 2~7) | Total above 200 (layer 2~7) | Y | Y | Above 200 |
| Sitara’s QoSWorks | 4 | N | 0 (Manually assign port #) | 0 (Manually assign port #) | N | N | Manually assign port # |

*Note: This table lists only the protocols that a DUT can control, not those it can merely recognize.

Table 3: Comparison of Protocols Support

Appendix A-1 and A-2 further compare the policy console user interface and special functions of

the DUTs. Most DUTs mix priority-based and rate-based queuing; however, this test focuses on “rate-based policy” that controls “TCP connections flowing from enterprises (LAN) to WAN”, since TCP accounts for most Internet traffic. As for UDP traffic, this test focuses on real-time applications

such as Voice over IP (VoIP). Differences between configured bandwidth and measured results will be


quantified.

3. Testbed and Test Methodology

Testbed and test methodology significantly influence test results and require careful examination to

avoid misinterpretation of the results.

3.1 Testbed: Mimics the Real-Life Internet

The Internet is very dynamic. Different connections take different paths and therefore have different distances and path qualities. Our testbed mimics these properties by assigning a WAN delay, WAN delay jitter, and WAN packet loss rate to each routing path. Figure 1 and Table 6 show complete information about our testbed and testing tools. Test data flows travel from X to Y, passing through the DUT, routers, monitoring point, and WAN emulator. The Cisco routers are installed specifically for WiseWAN because of its V.35 interface. Each DUT is tested individually on this testbed. Appendix B displays a photo of our testbed. IP-aliasing, employed at A and I in Fig. 1, emulates multiple competing sources and their corresponding sinks, respectively. A self-written wan-emu virtual interface driver is used to emulate the dynamics of the Internet. They are detailed as follows:

Figure 1: The Testbed: Mimic the Real-life Internet

Note: All PCs are equipped with Intel Express Pro 10/100Mbps network interface cards. The V.35 serial clock rate between the Cisco routers is set to 2Mbps. Each DUT is individually tested on this testbed.


| Tool | Function | Description | Position in Fig. 1 |
|---|---|---|---|
| Ncftpput [16] | TCP traffic generator | Traffic: 20 ncftpput flows from subnet X to subnet Y. Packet size: 1,500 bytes. TCP options: SACK/timestamp/window-scaling disabled. | A |
| SmartVoIpQoS [17] | VoIP (UDP) traffic generator | Traffic: single VoIP flow with RTP-format UDP packets. Codec: G.729 (50 frames/sec, frame size = 74 bytes, around 30kbps). | M |
| VoIP Gateway | Same as above | Same as above | K and N |
| ttt [18] | Real-time traffic bandwidth monitor | Monitors the bandwidth of the traffic passing through it by protocol, source/destination IP, etc. | G |
| Tcpdump [19] | Packet sniffer | Dumps each packet’s header to the RAM disk to avoid I/O overheads. | A and H |
| Self-written AWK scripts [20] | Data analyzer | Calculates statistics from the tcpdump result. | G |
| Self-written wan emulator [20] | WAN emulator | Imposes different delays, delay jitters, and random/periodic packet loss rates on different flows. | H |

Table 6: Testing Tools

1. IP-aliasing³: In Linux, each network interface card (NIC) can emulate 100 NICs, each virtual NIC having a unique IP address. With a proper routing table setup at A in Fig. 1, we can direct flows destined to a given virtual NIC at I through a given virtual NIC at A. Virtual NICs generate packets with their corresponding IP addresses, so the DUT sees outgoing TCP data packets as coming from different local hosts and incoming TCP Acks as coming from different remote hosts. Moreover, packets are sent without link-layer collisions, since only a single physical NIC is present at each of A and I.

2. wan-emu: Wan-emu is a Linux virtual interface driver that resides between the IP layer and the NIC

driver. In this testbed, multiple wan-emu virtual devices are attached to the sink-side last-hop NIC

driver (at H with IP 10.1.1.254) to have different impairments on different routes. With proper static

route, we can direct flows destined to a virtual NIC at I through a specific wan-emu interface that has

the desired link characteristics. Each packet passing through is stamped with the time at which it should be released. An interrupt fires every 1ms to check how many packets are due and should be forwarded. The timer granularity can easily be tuned up to 8192 Hz in Linux. Impairments such

as the random/periodic loss rate and delay jitter are also implemented.
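The timestamp-and-tick mechanism described for wan-emu can be modeled in user space as follows. This is a simplified Python sketch of the idea, not the driver itself: the class and method names are hypothetical, time is passed in explicitly in milliseconds, and only a fixed delay and periodic loss are modeled (no jitter or random loss).

```python
import heapq
import itertools


class WanEmulator:
    """Delay-and-drop queue driven by a periodic timer tick (times in ms)."""

    def __init__(self, delay_ms, loss_period=0):
        self.delay_ms = delay_ms
        self.loss_period = loss_period   # drop every Nth packet; 0 disables loss
        self._arrivals = 0
        self._seq = itertools.count()    # tie-breaker keeps FIFO order for equal release times
        self._heap = []

    def enqueue(self, now_ms, packet):
        """Stamp the packet with its release time, or drop it (periodic loss)."""
        self._arrivals += 1
        if self.loss_period and self._arrivals % self.loss_period == 0:
            return False
        heapq.heappush(self._heap, (now_ms + self.delay_ms, next(self._seq), packet))
        return True

    def tick(self, now_ms):
        """Called on every timer interrupt: release all packets that are due."""
        due = []
        while self._heap and self._heap[0][0] <= now_ms:
            due.append(heapq.heappop(self._heap)[2])
        return due
```

In this model a 1% periodic loss rate corresponds to `loss_period=100`, and a 1ms tick means a packet is released at most 1ms after its stamped time.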

3.2 Test Methodology

This test includes three sub-tests: Basic Test, Robustness Test, and Advanced Test.

³ Note that some operating systems, such as FreeBSD and Windows 2000, support only alias IP addresses, not alias interfaces.


A. Basic Test

This test evaluates the accuracy of the class bandwidth and the fairness among the connections within the class. It also investigates the stability of each DUT across five runs. The total WAN link bandwidth is set to T1 (1.544Mbps)⁴ and is partitioned into five classes (20, 40, 128, 256, and 1100kbps), with each class matching four TCP connections. Each class is set to guarantee each connection 1/4 of the class bandwidth⁵. All settings are fixed, without any bandwidth borrowing. The test is repeated for five consecutive runs, with 200-second intervals in between. Within each run, 20 FTP connections flow simultaneously from A to I (Table 6), with each class matching 4 connections. After 250 seconds, all the ncftpput processes are killed. Data from 30 to 230 seconds are

analyzed. The statistics are explained in Table 7. Appendix C uses an intuitive example to illustrate the

following statistics.

| Statistic | Quantify what? | Definition | Comparison standard |
|---|---|---|---|
| Accuracy | The differences between (1) the class bandwidth settings and (2) the measured class bandwidth. | Averaged normalized goodput* | The closer to 1, the better |
| Stability of accuracy | The differences of the accuracy statistics among the five runs. | CoV** of normalized goodput among the five runs (same as above, but take the CoV among the 5 runs instead of the average). | It depends***. |
| Fairness | Fairness of bandwidth usage among the 4 connections in each class. | Averaged CoV among 4 connections’ goodputs | The closer to 0, the better |
| Stability of fairness | Differences of the fairness statistic among the five runs. | Same as above, but take the standard deviation among the 5 runs instead of the average. | It depends***. |
| Retransmission ratio | Retransmission ratio in each class. | | The closer to 0, the better |

* Goodput is the effective throughput (bytes/time) excluding the bandwidth consumed by retransmission.

** CoV denotes “coefficient of variation,” which means “standard deviation over mean.”

*** If the accuracy tends to 1, it would be better for its stability to be 0. This implies the DUT always performs accurately. However, if the accuracy tends to

0, and its stability also tends to 0, it implies that the DUT always performs inaccurately. This also applies to fairness and its stability (Appendix C).

Table 7: Basic Test Statistics
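The statistics of Table 7 can be sketched as below. Two points are assumptions of this sketch rather than statements from the table: normalized goodput is taken as measured class goodput divided by the configured class bandwidth, and the population standard deviation is used for the CoV. The retransmission ratio is omitted because its definition is not given in the table.

```python
from statistics import mean, pstdev


def cov(values):
    """Coefficient of variation: standard deviation over mean (population form assumed)."""
    return pstdev(values) / mean(values)


def accuracy(class_goodputs_kbps, configured_kbps):
    """Averaged normalized goodput over the runs; 1.0 means the class got exactly its setting."""
    return mean(g / configured_kbps for g in class_goodputs_kbps)


def accuracy_stability(class_goodputs_kbps, configured_kbps):
    """CoV of the normalized goodput among the runs."""
    return cov([g / configured_kbps for g in class_goodputs_kbps])


def fairness(per_run_connection_goodputs):
    """Averaged (over runs) CoV among the connections' goodputs; 0.0 is perfectly fair."""
    return mean(cov(run) for run in per_run_connection_goodputs)


def fairness_stability(per_run_connection_goodputs):
    """Standard deviation of the per-run fairness values among the runs."""
    return pstdev([cov(run) for run in per_run_connection_goodputs])
```

As a small usage example, two runs in which four connections receive `[5, 5, 5, 5]` and `[10, 0, 10, 0]` kbps have per-run CoVs of 0 and 1, so the fairness statistic is 0.5 and its stability is also 0.5.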

⁴ BroadWeb/Acute iPolicer does not have a WAN link speed setup.

⁵ NetGuard Guardian Pro cannot accept per-connection settings.


B. Robustness Test

Packets may be generated by different operating systems, hence different TCP implementations,

and pass through paths with various delays and loss rates. Long-distance TCP connections are expected

to be vulnerable to Internet losses because they need more time to obtain the Acks required to recover to their target bandwidth. Since many DUTs regulate TCP Acks, we also examine whether they are compatible with the major operating systems. Table 8 describes our test methodology.

| Test item | DUT settings | Test methodology | Comparison standard |
|---|---|---|---|
| Under Heterogeneous Internet Delays | Same as Basic Test. | WAN delays of the four connections in each class are 10ms, 50ms, 100ms, and 150ms. | Same as the Basic Test. |
| Under Various Internet Loss Rates | 200kbps for the test flow. | A single TCP connection is tested under 0.5%, 1%, 2%, 4%, and 8% periodic loss rates. | Whether the goodput can smoothly degrade. |
| Under Different Sending Operating Systems | 80kbps for the test flow. | (1) WAN: delay = 50ms, periodic loss rate = 1%. (2) TCP source OS = {Linux 2.2.14, Windows 2000, FreeBSD 4.0, Solaris 8}. (3) TCP receiver OS = Linux 2.2.14. (4) Each time a single TCP connection is tested. | How closely the byte-time lines of the operating systems overlap with each other. |

Table 8: Robustness Test Methodology

C. Advanced Test

This test includes a bandwidth borrowing test and a VoIP quality test. Bandwidth borrowing has been described in Section 2. VoIP quality is tested separately through SmartBits and a VoIP gateway to evaluate whether the DUTs can precisely allocate adequate bandwidth for voice traffic. Each test is conducted under heavily loaded FTP traffic. Detailed test methodologies are given in Table 9.

| Test item | DUT settings | Test methodology | Comparison standard |
|---|---|---|---|
| Inter-class Bandwidth Borrowing | (1) Link speed = T1 (1.544Mbps), divided into 2 classes A, B. A = B = 777kbps. (2) Class A matches connection 1, Class B matches connection 2. (3) A and B can borrow from each other. | Connections 1 and 2 are started and stopped in sequence. | (1) Stability of each connection. (2) How seamless the total bandwidth line is when connection 1 terminates. |
| Intra-class Bandwidth Borrowing | (1) Link speed = T1 (1.544Mbps), with a single class A. A = 1.544Mbps. (2) The class matches connections 1 and 2. (3) Per-connection bandwidth: at least 777kbps, at most 1.544Mbps. | Same as above. | Same as above. |
| VoIP test using SmartVoIpQoS | (1) Link speed = {T1, 125kbps}, divided into 2 classes A, B. (2) A = 30kbps for voice traffic, B = {T1, 125kbps} - 30kbps for FTP traffic. (3) FTP traffic can occupy the voice class until voice traffic begins. | Background: 20 FTP connections. Foreground: a 30kbps G.729 VoIP flow. | PSQM¹, jitter, delay, and loss. |
| VoIP test using VoIP Gateway (Cisco 1750) | Same as above. | Background: 20 FTP connections. Foreground: dial a phone (JP to NP, G.729 codec), hold X’s and Y’s phones, speak 1 to 10 at 2 words/sec, and judge the voice quality. | Listening by ear². |

¹ PSQM (Perceptual Speech Quality Measurement) is calculated from delay, jitter, and loss statistics. A PSQM rating of 6.5 indicates the poorest quality.

² The VoIP Gateway is set to sample the sound continuously, even when the primary tester keeps silent. Thus the data flow is always around 30 kbps.

Table 9: Advanced Test Methodology
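The voice-class size and the two link speeds used in the VoIP tests can be checked with simple arithmetic. The sketch below uses the G.729 parameters from the testing tools table (50 frames/sec, 74-byte frames) and the 1,500-byte FTP packet size; the function names are illustrative.

```python
def stream_kbps(frames_per_sec, frame_bytes):
    """Bit rate of a constant-rate stream in kbit/s."""
    return frames_per_sec * frame_bytes * 8 / 1000


def serialization_ms(packet_bytes, link_kbps):
    """Time in ms that one packet occupies a link of the given speed."""
    return packet_bytes * 8 / link_kbps
```

`stream_kbps(50, 74)` gives 29.6, matching the 30kbps voice class. `serialization_ms(1500, 125)` gives 96 ms, against roughly 7.8 ms on a T1 link, while voice frames are generated every 20 ms. A single full-size FTP packet on the 125kbps link therefore occupies the line for almost five voice frame intervals, which gives an intuition for why the packet size of other traffic matters on a narrowband link.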

4. Benchmark Test Results

A. Basic Test Results

A-1. Accuracy and Stability of Accuracy

Figure 2 (A1 is accuracy, B1 is its stability, A2 and B2 will be discussed in the robustness test)

reveals that the DUTs can be classified into three groups: ALTQ_CBQ, PacketShaper, and QoSWorks

have the most accurate and stable control for each class; WiseWAN and FloodGate are less effective in

the narrowband class (20kbps) because of their large retransmission ratios, as will be shown in Section A-3; iPolicer and Guardian Pro are the least effective. Several iPolicer connections terminate in the middle of each run. Those connections no longer send data, so their bandwidth is wasted, which results in instability among the five runs⁶.

⁶ Note: The test crew performed many “five-run” tests on iPolicer. Only after verifying the above phenomenon did we include the most typical of these “five-run” tests in our analysis.


Figure 2: Results of accuracy and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)

A-2. Fairness and Stability of Fairness

Figure 3 (A1 is fairness, B1 is its stability, A2 and B2 will be discussed in robustness test) also

distinguishes three groups: PacketShaper is the fairest and most stable; QoSWorks is less fair but stable in the 20kbps class, implying that it is consistently less fair in that class across all five test runs (Appendix C). FloodGate and WiseWAN are less fair and less stable in the 20kbps class. iPolicer, Guardian Pro, and ALTQ_CBQ+RED provide poor fairness. Pure CBQ has the poorest fairness in the narrowband (20~40kbps) classes. This unfairness is somewhat alleviated by applying RED to each class, because RED

tends to drop more packets from the connection that is more aggressively sending the data.
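The reason RED penalizes aggressive senders can be seen from its basic drop rule. The sketch below uses the textbook linear form of the RED drop probability; the parameter names (`min_th`, `max_th`, `max_p`) and the example values are illustrative and do not reflect the ALTQ configuration used in this test.

```python
def red_drop_probability(avg_queue, min_th, max_th, max_p):
    """Drop probability of basic RED as a function of the average queue length."""
    if avg_queue < min_th:
        return 0.0
    if avg_queue >= max_th:
        return 1.0
    return max_p * (avg_queue - min_th) / (max_th - min_th)


def expected_drops(arrivals_per_flow, drop_probability):
    """Expected drops per flow when every arriving packet faces the same drop probability."""
    return {flow: n * drop_probability for flow, n in arrivals_per_flow.items()}
```

Because every arriving packet faces the same probability, a flow that sends three times as many packets into the queue sees about three times as many drops, and so backs off more often than its slower competitors.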


Figure 3: Results of fairness and its stability (A1, B1: No Internet Delay; A2, B2: With Internet Delay)

A-3. Retransmission Ratio

Figure 4 A1 (A2 will be discussed in the robustness test) shows large retransmission ratios in the narrowband classes (20~40kbps) for all DUTs except PacketShaper and QoSWorks, and especially for WiseWAN, iPolicer, FloodGate, and ALTQ_CBQ+RED. As an analogy, a small exit often keeps many people waiting in front of it. FloodGate and ALTQ_CBQ+RED use “packet dropping” to slow down TCP flows, so they have high retransmission ratios. WiseWAN suffers enormous packet losses at the Cisco router, before WiseWAN can control the traffic at the WAN link. The iPolicer results are hard to reconcile with the technology it claims to use (adjusting the TCP window size).


Figure 4: Test results of retransmission ratio (A1: No Internet Delay; A2: With Internet Delay)

B. Robustness Test Results

B-1. Under Heterogeneous Internet Delays

To make comparison with the Basic Test easy, the results are presented alongside those of the Basic Test: Figure 2 (A2, B2), Figure 3 (A2, B2), and Figure 4 (A2) show the respective results. Most results magnify the differences among the DUTs observed in the Basic Test, especially for iPolicer and ALTQ_CBQ in the fairness statistic. Long-distance connections are vulnerable to packet losses caused by buffer overflows at the controlling device, as described in Section 3.2 B. ALTQ_CBQ+RED alleviates the unfairness of ALTQ_CBQ because the short-distance connections, which send data more aggressively, have more packets dropped by the RED mechanism. Guardian Pro cannot guarantee per-connection bandwidth and thus shows significant instability between the Basic Test and this test. QoSWorks is less fair in the broadband class (1.1Mbps).

B-2. Under Various Packet Loss Rates

Normally a TCP flow slows down its transmission rate when packet losses occur. Figure 5 shows the goodput of each DUT under different Internet packet loss rates (each flow is configured at 200kbps, and the measured goodput is averaged over 200 seconds as in the Basic Test). Almost all the DUTs lower their goodput smoothly as the packet loss rate increases, except for PacketShaper and iPolicer. These two devices stop sizing the TCP window once they detect a TCP loss event (triple duplicate Acks). The TCP sending window then suddenly jumps up and sends a burst of packets to the controlling device, resulting in a higher goodput at the 0.5% loss rate. This phenomenon diminishes as the packet loss rate increases.


Figure 5: Robustness Test—goodput under various packet loss rates

B-3. Under Different Sending Operating Systems

In this compatibility test (see Fig. 6; the X axis is time and the Y axis is the bytes sent, so the slope is the bandwidth), TCP connections sent from different operating systems through PacketShaper produce different results. PacketShaper shrinks the TCP window so that no more than 4 packets are in the WAN pipe. Each packet loss therefore has to be recovered by a retransmission timeout instead of fast retransmit [21]. BSD-derived UNIX systems use a coarse-grained retransmission timer (500ms) [21], so they retransmit lost packets slowly. In contrast, Linux keeps a fine-grained retransmission timer and has the best performance when packet losses occur. iPolicer has a serious bug when data is sent from Windows 2000 to Linux 2.2.14. The tcpdump tool shows that the TCP Ack header length is miscalculated when the Ack passes through iPolicer, which incorrectly triggers data packets from TCP senders. TCP has many options and various implementations, so explicitly modifying the packet header requires rigorous compatibility testing. The other products treat TCP flows from different operating systems fairly.

Figure 6: Robustness test— Under different Sending Operating Systems
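
The interaction between a small window and loss recovery can be sketched as follows. This is a simplified model, not PacketShaper's or any TCP stack's actual code: it assumes that every segment delivered after the lost one produces exactly one duplicate Ack and that fast retransmit needs three duplicate Acks [21]. The function name and its parameters are hypothetical.

```python
DUP_ACK_THRESHOLD = 3  # fast retransmit fires on the third duplicate Ack [21]


def recovery_mode(packets_in_flight: int, lost_index: int) -> str:
    """Return how a single loss is recovered in this simplified model.

    packets_in_flight: segments outstanding when the loss occurs.
    lost_index: 0-based position of the lost segment within that flight.
    """
    if not 0 <= lost_index < packets_in_flight:
        raise ValueError("lost_index must be within the flight")
    # Each segment delivered after the lost one triggers one duplicate Ack.
    dup_acks = packets_in_flight - lost_index - 1
    return "fast retransmit" if dup_acks >= DUP_ACK_THRESHOLD else "timeout"


for window in (2, 4, 8):
    modes = [recovery_mode(window, i) for i in range(window)]
    timeouts = modes.count("timeout")
    print(f"window={window}: {timeouts}/{window} loss positions need a timeout")
```

Under these assumptions, a flight of at most 4 packets leaves most loss positions without enough duplicate Acks, so recovery falls back to the retransmission timer and the timer granularity of the sending operating system dominates the result.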

C. Advanced Test Results

C-1. Bandwidth Borrowing Test Results

This test uses ttt to observe the effectiveness of bandwidth borrowing. In each figure we only focus


on three lines: the total bandwidth (ip/ether line), the bandwidth of connection 1 (xxxx/tcp line) and the

bandwidth of connection 2 (yyyy/tcp line). The test crew draws another baseline indicating the ideal

total link bandwidth (1.544Mbps) for comparison.

Inter-Class Bandwidth Borrowing Test Results

Figure 7 shows the inter-class bandwidth borrowing benchmark results. iPolicer does not have

this function, so we set the bandwidth of both classes to 1.544Mbps. However, the Cisco

routers' link is set to 2Mbps, so the two 1.544Mbps flows through iPolicer exceed the baseline

bandwidth. After connection 1 terminates, the total bandwidth narrows down to around 1.5Mbps

with some bandwidth fluctuation. WiseWAN and ALTQ can automatically borrow bandwidth among

classes, and the others can be further configured with the degree of bandwidth borrowing. Guardian

Pro appears unstable when connection 2 starts to obtain a bandwidth share. ALTQ_CBQ and

ALTQ_CBQ+RED can only borrow a limited amount of bandwidth (from 777kbps to 1.1Mbps). FloodGate,

PacketShaper and QoSWorks perform inter-class bandwidth borrowing seamlessly.

(a) Acute/Broadweb iPolicer (b) CheckPoint FloodGate (c) NetGuard GuardianPro (d) ALTQ_CBQ

(e) Packeteer PacketShaper (f) Sitara QoSWorks (g) NetReality WiseWAN (h) ALTQ_CBQ+RED

Figure 7: Inter-class Bandwidth Borrowing Test
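
To make the notion of inter-class borrowing concrete, the following is a minimal sketch of an idealized allocator. It is not the algorithm of any tested DUT: the function `allocate`, the class names, the 772kbps class rates (half of the 1.544Mbps link) and the demand figures are all assumptions chosen for illustration. Each class first receives its configured rate, and leftover link bandwidth is then split among the classes that are allowed to borrow and still have demand.

```python
def allocate(link_kbps, classes):
    """Idealized inter-class borrowing.

    classes maps a class name to (configured_kbps, demand_kbps, can_borrow).
    Returns a dict of class name -> allocated kbps.
    """
    # Step 1: every class gets its configured rate, capped by its demand.
    alloc = {n: float(min(rate, demand)) for n, (rate, demand, _) in classes.items()}
    spare = link_kbps - sum(alloc.values())
    # Step 2: hand leftover bandwidth to borrowing classes that still want more.
    while spare > 1e-9:
        hungry = [n for n, (_, demand, borrow) in classes.items()
                  if borrow and demand - alloc[n] > 1e-9]
        if not hungry:
            break
        share = spare / len(hungry)
        for n in hungry:
            extra = min(share, classes[n][1] - alloc[n])
            alloc[n] += extra
            spare -= extra
    return alloc


LINK = 1544  # T1 link in kbps
both_active = {"class1": (772, 1544, True), "class2": (772, 1544, True)}
one_idle = {"class1": (772, 0, True), "class2": (772, 1544, True)}
no_borrow = {"class1": (772, 0, False), "class2": (772, 1544, False)}

for label, cfg in (("both active", both_active),
                   ("class1 idle", one_idle),
                   ("class1 idle, borrowing off", no_borrow)):
    print(label, allocate(LINK, cfg))
```

In this idealized model the remaining class takes over the whole link once the other class goes idle, which corresponds to the seamless behavior described above; a device without the function stays at its configured rate.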

Intra-Class Bandwidth Borrowing

Figure 8 shows the intra-class bandwidth borrowing benchmark results. iPolicer lacks this

function so after connection 1 terminates, connection 2 cannot occupy the newly available

bandwidth within the class. Guardian Pro and ALTQ_CBQ have fluctuating bandwidth sharing

between the two connections since they cannot guarantee per-connection bandwidth. This

phenomenon in ALTQ_CBQ is again slightly alleviated after applying RED. The other four products

are quite similar in this test, except that PacketShaper and FloodGate show small gaps.


(a) Acute/Broadweb iPolicer (b) CheckPoint FloodGate (c) NetGuard Guardian Pro (d) ALTQ_CBQ

(e) Packeteer PacketShaper (f) Sitara QoSWorks (g) NetReality WiseWAN (h) ALTQ_CBQ+RED

Figure 8: Intra-class Bandwidth Borrowing Test

C-2. VoIP Quality Test

This test does not include iPolicer since presently it cannot control UDP traffic. This test is

performed by the Smartbits and by the Cisco 1750 VoIP gateways. The former gives quantitative results

while the latter judges the voice quality through hearing.

Figure 9 (a) shows that under the T1 WAN link (1.544Mbps) the DUTs differ in latency and jitter.

However, the ultimate voice quality grades (PSQM) are similar except for ALTQ_CBQ. This is also

verified by the VoIP Gateway test (Table 10). We thus conclude that under a T1 access link the G.729 bit

rate can be easily allocated. In contrast, under the 125kbps WAN link (Fig.9 (b) and Table 10), only with

PacketShaper can the voice be barely recognized. Transmitting a large packet (1500 bytes) to the

narrowband WAN link (125kbps) takes so long that the following small voice packet (74 bytes)

has to wait until the previous large packet is completely scheduled out. However, after QoSWorks

exercises Packet Size Optimization (reducing the maximum transfer unit of FTP connections when

establishing the connections), the voice quality approaches the original voice in both the Smartbits and

Gateway tests. While this is promising, readers should be aware that reducing the packet size of all other

TCP connections can cause large overhead. As an analogy, the overhead of several small trucks carrying

the goods is larger than that of a big truck carrying the same goods. The right tradeoff is left to the

network administrator.

[Figure 9 charts. Panels: (a) T1 WAN link (1.544Mbps); (b) 125kbps WAN link. Each panel has a "Latency and jitter" chart (Average Latency, Max Latency, and Jitter (Latency Variation), all in ms) and a "PSQM and Loss Rate" chart (PSQM and Loss rate in %). X axis: Base, PacketShaper, FloodGate, WiseWAN, GuardianPro, QoSWorks, ALTQ_CBQ; panel (b) also includes QoSWorks2.]

Note: “Base” results are obtained on a clean testbed without enabling any DUT. The G.729 codec is not lossless, so even when jitter and loss are low, the PSQM is at least 2.2.

Figure 9: VoIP Test Results of SmartVoIPQoS
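
The waiting-time argument for the narrowband link, and the overhead cost of shrinking packets, can be checked with a little arithmetic. The sketch below is only illustrative: it takes 125kbps to mean 125,000 bit/s, assumes 40 bytes of TCP/IP headers per packet (no options), and uses a hypothetical reduced MTU of 256 bytes, none of which comes from the test configuration.

```python
def serialization_delay_ms(packet_bytes: int, link_bps: float) -> float:
    """Time to clock one packet onto the link, in milliseconds."""
    return packet_bytes * 8 / link_bps * 1000.0


def header_overhead(mtu_bytes: int, header_bytes: int = 40) -> float:
    """Fraction of each full-sized packet spent on TCP/IP headers.

    header_bytes=40 assumes 20-byte IP and 20-byte TCP headers, no options.
    """
    return header_bytes / mtu_bytes


NARROWBAND_BPS = 125_000   # 125kbps link, taken as 125,000 bit/s (assumption)
T1_BPS = 1_544_000         # T1 link, 1.544Mbps

for name, bps in (("125kbps", NARROWBAND_BPS), ("T1", T1_BPS)):
    big = serialization_delay_ms(1500, bps)    # large FTP data packet
    voice = serialization_delay_ms(74, bps)    # small voice packet
    print(f"{name}: 1500-byte packet {big:.1f} ms, 74-byte packet {voice:.1f} ms")

# Hypothetical reduced MTU of 256 bytes, only to show the tradeoff.
for mtu in (1500, 256):
    wait = serialization_delay_ms(mtu, NARROWBAND_BPS)
    print(f"MTU {mtu}: worst-case wait {wait:.1f} ms, "
          f"header overhead {header_overhead(mtu):.1%}")
```

On the 125kbps link a 1500-byte packet occupies the link for about 96 ms, against under 8 ms on the T1 link, so a voice packet queued behind it is delayed by roughly that amount. A smaller MTU shortens the wait but raises the share of each packet spent on headers, which is the truck tradeoff described above.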

DUT | T1 WAN link: Calling time | T1 WAN link: Delay time (estimated by ears) | T1 WAN link: Voice quality (legibility) | 125kbps WAN link: Calling time | 125kbps WAN link: Delay time (estimated by ears) | 125kbps WAN link: Voice quality (legibility)
--- | --- | --- | --- | --- | --- | ---
Baseline (only voice) | About 0.2 sec | Very short (< 0.1 sec) | Very good | < 1 sec | Very short (< 0.1 sec) | Very good
Baseline (with background FTP) | Cannot establish the connection | | | Cannot establish the connection | |
iPolicer | Cannot be tested (does not support UDP traffic control) | | | Cannot be tested (does not support UDP traffic control) | |
FloodGate | About 0.5 sec | Very short (< 0.1 sec) | Very good | About 7 sec | About 1 sec | Very poor (<10%)
Guardian Pro | About 0.5 sec | Very short (< 0.1 sec) | Very good | About 3 sec | About 1.5 sec | Ultra poor (<1%)
WiseWAN | About 0.5 sec | Very short (< 0.1 sec) | Very good | About 7 sec | About 1.5 sec | Ultra poor (<1%)
PacketShaper | About 0.5 sec | Very short (< 0.1 sec) | Very good | About 1 sec | About 1 sec | Poor (60%)
ALTQ_CBQ | About 2 sec | Very short (< 0.1 sec) | Very good | About 18 sec | About 1 sec | Very poor (<10%)
QoSWorks | About 1 sec | Very short (< 0.1 sec) | Very good | About 17 sec | About 1 sec | Very poor (<10%)
QoSWorks Optimized | Not tested (no need to) | | | About 6 sec | Very short (< 0.2 sec) | Very good

Table 10: VoIP Test Results Through VoIP Gateway

5. Conclusions

This work designs a novel testbed that mimics the real-life Internet conditions, such as multiple

connections, heterogeneous WAN delays/delay jitters/packet loss rates, and different TCP source

implementations. Most test reports, such as those by the Tolly Group [22], are financed by the vendors

and may be biased. Additionally, the testbeds in those reports are over-simplified, lacking in-depth test

items or an adequate number of connections. This work first classifies the policy rules into three

major types: (1) class-based rule; (2) connection rule within a class; (3) bandwidth borrowing rule

among classes. The test methodology then quantifies the effectiveness of the above policy rule types of

each device in terms of accuracy, fairness, stability, robustness, bandwidth borrowing, and VoIP quality.

The test results reveal several findings that can be reproduced with our open tools: (1) the narrowband

class-based rule and its fairness among the flows are harder to enforce when multiple TCP connections

compete for the same queue, resulting in large queue lengths and TCP retransmissions; (2) explicitly

sizing the TCP window could cause performance or fairness degradation even under slight packet loss

rates; (3) the open source solution can compete with commercial products in accurately limiting flow

aggregates; (4) the video/voice quality of real-time applications depends significantly on the packet

sizes of all other traffic when using a narrowband (125kbps) access link. Detailed functionality

comparison among the DUTs gives further directions for enhancing open source solutions, such as

Packeteer’s traffic discovery and QoSWorks’s intuitive user interface. The ALTQ package lacks a

per-connection bandwidth guarantee within the class, so it needs further refinement to satisfy

enterprises’ demands. Some vendors in this test use open source code but never release their kernel

patches. We are currently patching ALTQ with a per-connection bandwidth guarantee and will contribute it back to

the Open Source community. After all, open source should be open.

6. References

[1] S. Blake, D. Black, M. Carlson, E. Davies, Z. Wang, and W. Weiss, An Architecture for

Differentiated Services, RFC 2475, Dec. 1998.

[2] D. Stiliadis and A. Varma, Latency-Rate Servers: A General Model for Analysis of Traffic

Scheduling Algorithms, IEEE/ACM Transactions on Networking, Vol. 6, No. 5, pp.611-624, Oct.

1998.

[3] S. Floyd, and V. Jacobson, Link-sharing and resource management models for packet

networks, IEEE/ACM Transactions on Networking, Vol. 3, No. 4, pp.365-386, 1995.

[4] K. Fall, and S. Floyd, Simulation-based Comparisons of Tahoe, Reno, and SACK TCP,

ACM Computer Communication Review, Vol. 26 No. 3, pp.5-21, Jul. 1996.

[5] J. Padhye, and S. Floyd, On Inferring TCP Behavior, ACM SIGCOMM'2001, San Diego,

USA, August, 2001. http://www.acm.org/sigcomm/sigcomm2001/p23.html (to appear)

[6] S. Floyd and V. Jacobson, Random Early Detection Gateways for Congestion Avoidance,

IEEE/ACM Transactions on Networking, Vol. 1, No. 4, pp.397-413, Aug. 1993.

[7] S. Karandikar, S. Kalyanaraman, P. Bagal, and B. Packer, TCP Rate Control, ACM

Computer Communication Review, Vol. 30, No. 1, Jan. 2000.

[8] K. Cho, Alternate Queueing for BSD UNIX (ALTQ), http://www.csl.sony.co.jp/person/kjc

[9] NetGuard Corporation, http://www.netguard.com

[10] Check Point Software Technologies, http://www.checkpoint.com

[11] BroadWeb Corporation, http://www.broadweb.com.tw

[12] Acute Communication Corporation, http://www.acutecomm.com

[13] Packeteer Corporation, http://www.packeteer.com


[14] Sitara Networks, http://www.sitaranetworks.com

[15] NetReality Corporation, http://www.net-reality.com

[16] Ncftpput Software, http://www.ncftp.com

[17] K. Cho, Tele Traffic Tapper (ttt), http://www.csl.sony.co.jp/person/kjc

[18] Spirent Communications, http://www.netcomsystems.com

[19] Lawrence Berkeley National Laboratory, tcpdump, http://www-nrg.ee.lbl.gov

[20] H. Y. Wei, WAN Emulator, http://speed.cis.nctu.edu.tw/wanemu/

[21] W. R. Stevens, TCP/IP Illustrated Volume 1 - The Protocols, Addison-Wesley, 1994.

[22] Tolly Group, http://www.tolly.com

Acknowledgements

We thank the vendors who so generously provided us with the devices and their verifications of

the test results. We are grateful to Ching-Chuan Chiang and Yi-Chung Liu for their help on the

preliminary tests and functionality comparisons.

Appendix

Appendix A. Detailed Functionality Comparison

A-1. Policy Console User Interface

As for the policy console user interface (Table A), a notable function is how many devices a

management console can control. The policy consoles of PacketShaper and QoSWorks can control only one

device since they use built-in web servers for configuration with Web browsers. The policy consoles of the

others (except for ALTQ) can remotely control multiple devices located at different places. As for

schedule control, per-rule schedule control is more effective. For example, some rules can be inactive

during non-office hours, but the VoIP rule should always be active to guarantee voice quality.
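
The difference between per-rule and per-device schedule control can be sketched in a few lines. This is a hypothetical data model, not the configuration format of any tested product: the `Rule` class, the rule names, the rates and the office hours are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import time
from typing import Optional, Tuple


@dataclass
class Rule:
    name: str
    rate_kbps: int
    # (start, end) of the daily active window; None means always active.
    active_window: Optional[Tuple[time, time]] = None

    def is_active(self, now: time) -> bool:
        if self.active_window is None:
            return True
        start, end = self.active_window
        return start <= now < end


rules = [
    Rule("ftp-limit", 200, (time(9, 0), time(18, 0))),  # office hours only
    Rule("voip-guarantee", 64),                          # always active
]


def active_rules(now: time):
    """Names of the rules that apply at the given time of day."""
    return [r.name for r in rules if r.is_active(now)]


print(active_rules(time(10, 30)))
print(active_rules(time(22, 0)))
```

With a per-device schedule the two rules could only be switched on or off together, whereas here the VoIP rule stays in force after the FTP limit is lifted.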

Vendor/Model | Type | Schedule Control | Management Console | OS | Monitor/Statistics | Alert
--- | --- | --- | --- | --- | --- | ---
ALTQ | Config File | N | Single device | FreeBSD 4.0 | Per-class bandwidth usage | N/A
NetGuard’s Guardian Pro | GUI Win32 Application | Per-rule | Global devices | Win NT/2000 | Line Statistics Report/Response Time Report/Protocol Distribution Report | Log
CheckPoint’s FloodGate | GUI Win32 Application | Per-rule | Global devices | Win NT/2000 | Line Statistics Report/Response Time Report/Protocol Distribution Report | N/A
NetReality’s WiseWan | GUI Java Application | Per-rule | Global devices | Win NT/Solaris | Line Statistics Report/Port Report/Response Time Report/Protocol Distribution Report/VoIP Report/Top Ten Talkers/Top Ten Protocols or Apps | SNMP trap
BroadWeb/Acute’s iPolicer | Web Browser (Java Applet) | Per-rule | Global devices | Web Server: Another NT; Web Client: IE 5.0 | Line Statistics Report/Top Ten Report/Top Ten Talkers/Top Ten Protocols | Email trap
Packeteer’s PacketShaper | Web Browser (HTML) | Per-device | Single device | Web Server: Embedded Web Server; Web Client: Any | Utilization/Network Efficiency/Top Ten Classes/Top Twenty Talkers/Per-class Bandwidth Usage/Response Time Report | SNMP trap
Sitara’s QoSWorks | Web Browser (HTML) | Per-device | Single device | Web Server: Embedded Web Server; Web Client: Any | Per-class Bandwidth Usage/Link statistics/Top classes per link/Top Applications/Protocol Distribution/Traffic by address | SNMP trap

Table A: Management Interface and Statistics of Flow

A-2. Special Functions

PacketShaper is superior in its Traffic Discovery, which can automatically identify the protocols

of the traffic passing through it and provide an instant feedback to the network administrator for further

bandwidth setting. With the other devices, the administrator has to manually check whether the newly specified packet filters

capture the corresponding traffic. WiseWAN is installed directly on the WAN link (V.35 cable) and thus

can verify whether the measured bandwidth matches the subscribed bandwidth. Additionally, it can

detect PVCs in the frame relay network. Thus a single WiseWAN device can control all the traffic on the

mesh-structured frame relay links among branch offices. QoSWorks focuses heavily on controlling

VoIP traffic. By shrinking the TCP data packet size, it lets VoIP traffic (UDP packets) pass through

QoSWorks smoothly, especially on a narrowband WAN link. Moreover, QoSWorks has a built-in Web cache

(not verified in this report). Both FloodGate and Guardian Pro can be integrated with their firewall, VPN

and NAT packages. Integrated solutions may reduce management costs.


Appendix B. Testbed Photo

Figure B: Testbed Photo

Appendix C. Intuitive Example for Basic Test Statistics

This intuitive example illustrates how the Basic Test statistics of the 20kbps class are derived. As

described in Section 3.2, each class matches four connections, and the test repeats for five runs. Ideally

within each run each connection receives 1/4 of the class bandwidth. In the example, the

accuracy statistic is 19, which approaches the ideal result of 20 but does not reflect real conditions. Given

the poor stability of accuracy, we can judge that the DUT is actually not good in accuracy. On the

other hand, “Not fair” with “Good stability of fairness” means that the DUT “cannot fairly” treat the

flows almost “all the time”.

Figure C: Intuitive Example for Basic Test Statistics
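
The point that an accuracy of 19 can hide poor behavior is easy to reproduce numerically. The sketch below does not use the data of Figure C or the exact definitions of Section 3.2: the per-run goodput values are hypothetical, and it assumes that accuracy is the mean class goodput over the five runs and that stability is the standard deviation across those runs.

```python
from statistics import mean, pstdev

IDEAL_KBPS = 20  # configured bandwidth of the class in the example

# Hypothetical class goodput (kbps) measured in each of the five runs.
unstable_runs = [5, 33, 8, 30, 19]
stable_runs = [19, 19, 19, 19, 19]


def summarize(runs):
    """Return (accuracy, stability) under the assumptions stated above."""
    return mean(runs), pstdev(runs)


for label, runs in (("unstable DUT", unstable_runs), ("stable DUT", stable_runs)):
    accuracy, stability = summarize(runs)
    print(f"{label}: accuracy={accuracy:.1f} kbps (ideal {IDEAL_KBPS}), "
          f"std dev across runs={stability:.1f} kbps")
```

Both hypothetical devices average 19kbps, yet only the second one delivers close to 20kbps in every run, which is why the accuracy statistic has to be read together with its stability.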
