experience with simulating real tcp/ip-protocol...
TRANSCRIPT
INSTITUT FÜRNACHRICHTENVERMITTLUNG
UND DATENVERARBEITUNGProf. Dr.-Ing. Dr. h. c. mult. P. J. Kühn
Universität StuttgartINSTITUT FÜR
KOMMUNIKATIONSNETZEUND RECHNERSYSTEME
Prof. Dr.-Ing. Dr. h. c. mult. P. J. Kühn
Universität Stuttgart
Experience with SimulatingReal TCP/IP-Protocol Stacks
Michael Scharf, Christina ZeehInstitute of Communication Networks and Computer Engineering
University of [email protected]
ITG Fachgruppe 5.2.1 Workshop "Network Engineering" - Essen, November 22, 2007
This work is partly funded by the German Research Foundation (DFG) through the Center of Excellence (SFB) 627 "Nexus".
Institute of Communication Networks and Computer Engineering University of Stuttgart
• Motivation: Current and future Internet
• Towards accurate simulators
• Network Simulation Cradle
• Accuracy and performance tests
• Work-in-progress
• Conclusion and future work
Outline
Institute of Communication Networks and Computer Engineering University of Stuttgart
The Internet
• A highly decentralized, imperfect, and extremely successfull system
• Design principles
- End-to-end argument
- Resource management by TCP congestion control (no QoS!)
IP
IP
IP
IP
IP
Potential
Router
Internet
congestion
Appl.
TCPIP
Appl.
TCPIP
Appl.
TCPIP
Link
Link
Link
End system End system
End system
Motivation: Current and Future Internet
Institute of Communication Networks and Computer Engineering University of Stuttgart
Internet Applications
BW
Util
ity
BW
Util
ity
BW
Util
ity
BW
Util
ity
BW
Util
ity
BW
Util
ity
Streaming Best EffortInteractive
VoIPRemote shell
P2P
Filedownl.
Backup
Instant messaging
Multimedia(Youtube, ...)
Netw. attached storage
IPTV
Virtual reality3D Internet
(ERP, ...)
Online gaming
Research, GRID
WWW
DNS, ...
HD−Video
E−mail
Terminal appl.
Internet−TV
Gbps
kbps
Mbps
Elasticity
Typicalbandwidth UDP TCP
Internet
ARPANET
Motivation: Current and Future Internet
Further Details: Michael Scharf, "Future Internet Transport Layer - Heading towards a Post-TCP Era?",Future Internet Design Workshop, ECOC, Sept. 2007
Institute of Communication Networks and Computer Engineering University of Stuttgart
Design Principles of the Internet Congestion Control• Sender-side control of data rate by TCP congestion window
• Greedy probing of available bandwidth on path (window increase)
• Implicit congestion feedback by packet loss (window decrease)
➥ Basics almost unchanged since V. Jacobson’s proposal from 1988
Challenges• Broadband wide area networks
- Large bandwidth-delay product
- Extreme variety and dynamics (sensor networks to highspeed photonics)
• Fairness (network neutrality debate)
• Network demanding applications
• ...
Further details: Michael Welzl, Dimitri Papadimitriou, Michael Scharf, "Open Research Issues inInternet Congestion Control", IETF internet draft, work in progress, July 2007, draft-irtf-iccrg-welzl-congestion-control-open-research-00.txt
Motivation: Current and Future Internet
Institute of Communication Networks and Computer Engineering University of Stuttgart
Recent TCP Research and Standardization
➥ Lots of ongoing work!
TCP tweaking
Highspeed−TCP
AQM
ECN
Quick−Start
re−ECN
XCP
(always)
(2003)
(1998)
(2001)
(2007)
(?)
(?)
Incremental Evolution Revolution
Motivation: Current and Future Internet
Institute of Communication Networks and Computer Engineering University of Stuttgart
The "Credibility Gap" of Internet Simulations• TCP is the basic Internet transport protocol... and inherently complex
• How to investigate new network and transport protocols?
- Real development in testbeds not always possible
- Faster "time-to-paper" of simulations
• However: Lack of accurate (TCP) simulators
- Many missing features• Unidirectional transfer only
• Constant packet size
• No flow control
• ...
- Seldomly validated
- Real-world stacks always differ to specs and permanently evolve
➥ Possible remedy: Direct execution of real TCP/IP stack code insimulations
Towards Accurate Simulators
Institute of Communication Networks and Computer Engineering University of Stuttgart
Recent Research Efforts
− NCTUns (Chiao Tung/− Various tools, often Matlab based
− Separate code basis
(flow control, conn. management, ...)
− Plain ns−2 simulator − NS−2 TCP Linux (Caltech, since 2006)
− Simplified functions
− Hybrid code basis
− Only parts directly affecting performance
− Synthetic app. models
(cong. control)
− Greedy sources
− High−level behavior,
− Synthetic app. models − Synthetic app. models
− Kernel or user space emulation
cation of host kernel− May require modifi−
− Real applications
− Network stack ported to sim. environment
or Linux kernel− Based on FreeBSD
− Various other special
as OPNET, ...
− NSC (U. of Waikato, since 2003) Harvard, since 1999)
− UML Simulator (2003)
− dummynet, NIST Net, netem, ...
− OppBSD (Karlsruhe, since 2004)
− Lunar (Virginia Tech, 2004)
− Mathematical model
e. g. fluid−flow model
TCP simulators, such
− IKR tcplib
Code extraction Stack integration Runtime emulationAbstract model Reimplementation
Difference from real code
Complexity/cost
Towards Accurate Simulators
Institute of Communication Networks and Computer Engineering University of Stuttgart
Challenges of the Direct Code Execution Approach• Moving code from kernel-space to user-space
- No priviliged CPU instructions
- Many kernel subsystems not needed (e. g., memory management)
• Multiple stack instances in simulators
- No global variables
- Multi-tasking, threading, and scheduler difficult to model
• Simulator interface
- Virtual time: Timer interrupt replaced by simulator events
- Full packets transport, instead of function calls
- Byte-stream socket interface vs. message-oriented simulators
- Programming language mismatch (e. g., C vs. C++ code)
Towards Accurate Simulators
Institute of Communication Networks and Computer Engineering University of Stuttgart
Comparison of Recent Solutions
oLinux 2.6.13+NS−2 TCP Linux
Linux, *BSD, ...NSC
oOppBSD for OMNeT++ FreeBSD 6.0
o ?Lunar Linux 2.4.3
+ o oNCTUns Linux/BSD
? oUMLSimulator User−mode Linux
+
+
+
o
+
+
+
+
+
+
−
−−
−
++
Performance ExtensibilityApproach Code basis AccuracyMaturity
Towards Accurate Simulators
Institute of Communication Networks and Computer Engineering University of Stuttgart
Overview• Developed by Sam Jansen at University of Waikato, Hamilton, New
Zealand, since 2003
• Supports TCP stacks of Linux (2.4, 2.6.14.2), FreeBSD, OpenBSD, lwIP
• Frontend to ns-2 simulator
Features• Powerfull automatic C parser
- Substitution of global variables
- Can be adapted to new stacks
• Supports execution of different stacksin parallel
• Good documentation
• Well validated
Socket interface
TCP/IP stack code
Cra
dle
Network
Config
Create
Timer interrupt
Application
Network interface
Network Simulation Cradle (NSC)
Institute of Communication Networks and Computer Engineering University of Stuttgart
Simulator Structure
Extensions• Flexible towards other simulators
➥ Implementation of new frontend to IKR simlib with rather limited effort
• Integration of new protocol stacks
➥ Can be quite time-consuming
Sim. program NS−2 Sim. program
IKR Simlib entitiesNS−2 agent
IKR Simlib
NSC interface
Stack and Socket Classes (Wrapper)
Implemented
Kernel
functions
UnimplementedSupport functions
Kernel
functionsTCP/IP stack codeSha
red
libra
ry
Stack−independent code
Simulator
Stack−specific code
Kernel code
Network Simulation Cradle (NSC)
Institute of Communication Networks and Computer Engineering University of Stuttgart
Scenario 1: Goodput of Greedy Source
Scenario 2: Head-of-Line Blocking (HOL) in Signaling
➥ Scenarios similar to validation tests of Sam Jansen
Linux Linux
Data
ACKs
Source Sink
10 Mbit/s, 50ms
10 Mbit/s, 50ms
lossPacket
Goodput
• One TCP connection with greedy source
• Ethernet with MTU of 1500 byte
• Buffer size of 1000 packets
• Simulation: Linux 2.6.14.2
• Measurement: Linux 2.6.20.20 on P4 PC,"netem" network emulation
Linux Linux
64 byte
100 Mbit/s, 10ms lossPacket
Echo app.
100 Mbit/s, 10msPacketloss
64 byte data
data
Sign. app.
timeResponse
• One TCP connection
• Request-response of 64 byte messageswith neg.-exp. IAT of 10 ms
• Ethernet with MTU 1500 byte
• Buffer size of 1000 packets
• Socket option "NODELAY"
• Simulation: Linux 2.6.14.2
• Measurement: Linux 2.6.20.20 on P4 PC,"netem" network emulation
Accuracy and Performance Tests
Institute of Communication Networks and Computer Engineering University of Stuttgart
Result Scenario 1: Goodput of Greedy Source
➥ High accuracy, also for new congestion control algorithms
0.001 0.01 0.1 1Packet loss probability [%]
0
2
4
6
8
10G
oodp
ut [M
bit/s
]
PFTK modelMeasurement (Linux 2.6.20.20)Simulation (NSC+IKR simlib)Simulation (ns2-sack1)Simulation (IKR tcplib)
Reno Bic/Cubic
Accuracy and Performance Tests
Institute of Communication Networks and Computer Engineering University of Stuttgart
Result Scenario 2: HOL in Signaling
➥ Response time in simulations less than in reality
➥ Reasons: Kernel scheduler delays, network emulation errors
Further Details: Michael Scharf, Sebastian Kiesel: "Head-of-Line Blocking in TCP and SCTP: Analysisand Measurements", Proc. IEEE Globecom, San Francisco, CA, USA, Nov. 2006
0.1 1Unidirectional packet loss probability [%]
0
20
40
60
80
100M
ean
resp
onse
tim
e [m
s]
Analytical lower boundMeasurement (Linux 2.6.20.20)Simulation (NSC+IKR simlib)
Bic/
Reno
Analytical lower bound
Cubic
problemsStability
0 50 100 150 200 250 300Response time [ms]
10-3
10-2
10-1
100
CC
DF
Measurement (Linux 2.6.20.20)Simulation (NSC+IKR simlib)
Reno
Bic/Cubic
Distribution at 1% lossMean response time
Accuracy and Performance Tests
Institute of Communication Networks and Computer Engineering University of Stuttgart
Overhead and Cost
• Significant overhead of NSC compared to abstract simulators
• Improvement possible
- Decrease timer interrupt frequency
- Usage of pseudo data as payload of packets
NSC ns2-sack1 IKR tcplib1
10
100
1000S
peed
up C
ompa
red
to R
ealti
me
NSC ns2-sack1 IKR tcplib0
20000
40000
60000
80000
100000
Req
uire
d M
emor
y [K
iB]
Virtual vs. real-time Memory requirement
About 69 MB forshared library,ca. 400 KiB perstack
Scenario: Test case 1 at 0.1% loss; Plattform: Intel P4 2.8 Ghz, 2 GB RAM, Ubuntu 7.04
Accuracy and Performance Tests
Institute of Communication Networks and Computer Engineering University of Stuttgart
Quick-Start TCP Extension (RFC 4782)
• Speeds up interactive WAN applications
- After connection setup or idle periods
- For large bandwidth-delay products
• Reality check
- Requires support in all routers
- Some open (research) issues
➥ May be an option for future broadband IP networks
Quick-Start:
- Recent experimental TCP extension- (Almost) immediately use large window
Slow-Start:
- One pillar of TCP congestion control- Exponential window growth
Con
gest
ion
win
dow
Time
Slow Start
Slow start threshold
Quick−Start
Time
Slow start threshold
Con
gest
ion
win
dow
Router RouterSYN
pacing
Rate!
Rate?
Rate
Standardalgorithms
ACKSYN,ACK
New ACK
Host 2Host 1
QS request
QS response
Work-in-Progress
Institute of Communication Networks and Computer Engineering University of Stuttgart
Required Linux Kernel Modifications
Further details: Michael Scharf, Haiko Strotbek, "Experiences with Implementing Quick-Start in theLinux Kernel", Presentation at IETF 69, TSVAREA, Chicago, IL, USA, July 2007
Typical flow of packets
Config
Device driver
IP
Application
Kernel space
User space
TCP
ip_rcv
net_rx_action
Routing
ip_local_deliver
Analysis
Socket interface
Sysctl config
Sysctl config
ip_finish_output
tcp_send_msg
dev_queue_xmit
tcp_write_xmit
tcp_v4_rcv Send ACK tcp_transmit_skb
ip_forward_finish
ip_queue_xmit
ip_forward
Fast/slow path State
do_tcp_setsockopt
Handle SYN Cong.control
send_packetip_build_and_
Work-in-Progress
Institute of Communication Networks and Computer Engineering University of Stuttgart
Required Linux Kernel Modifications
Further details: Michael Scharf, Haiko Strotbek, "Experiences with Implementing Quick-Start in theLinux Kernel", Presentation at IETF 69, TSVAREA, Chicago, IL, USA, July 2007
Typical flow of packets
Config
Device driver
IP
Application
Kernel space
User space
TCP
ip_rcv
net_rx_action
Routing
ip_local_deliver
Analysis
Socket interface
Sysctl config
Sysctl config
ip_finish_output
tcp_send_msg
dev_queue_xmit
tcp_write_xmit
tcp_v4_rcv Send ACK tcp_transmit_skb
ip_forward_finish
ip_queue_xmit
ip_forward
Fast/slow path State
do_tcp_setsockopt
Handle SYN Cong.control
send_packetip_build_and_
Options Options Options
Flow control
New sysctlActivate QS
Traffic metering,adm. control
Ratepacing
Metering, adm.
Hist.
QS TTL decr.
Work-in-Progress
Institute of Communication Networks and Computer Engineering University of Stuttgart
Measurement vs. Simulation
➥ Support of kernel prototype software development
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2Time [s]
0
2
4
6
8
10D
ata
rate
[Mbi
t/s]
Slow-Start (Measurement)Quick-Start (Measurement)
0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 1.8 2Time [s]
0
2
4
6
8
10
Dat
a ra
te [M
bit/s
]
Slow-Start (Simulation)Quick-Start (Simulation)
Scenario
• 10 Mbps Ethernet
• 100 ms RTT
• Simulation: Linux2.6.14.2, Reno congestioncontrol
• Measurement: Linux2.6.20.20, Renocongestion control,"netem" networkemulation
• (Almost) same kernelpatch
• Quick-Start request for5.12 Mbit/s
Work-in-Progress
Institute of Communication Networks and Computer Engineering University of Stuttgart
Conclusions• Simulating TCP/IP networks is challenging
• (More) Accurate simulators by direct execution of TCP/IP stack code
• Promising solution: Network Simulation Cradle
- Extensible, supports many stacks
- Frontends to different simulators possible (ns-2, IKR simlib)
- Can support experimental protocol development
• Limitation: Less scalable than abstract simulators
Future Work• Adaptation to newest stack versions
• Improvement of scalability and addition of features
• Better models for applications, kernel schedulers, ...
Conclusions and Future Work