Improving the Performance of the Linux Network Subsystem

King Fahd University of Petroleum and Minerals (KFUPM)
INFORMATION AND COMPUTER SCIENCE DEPARTMENT

Dr. K. Salah

April 22, 2007
Dhahran, Saudi Arabia

Project Seminar 2007, Dhahran, Saudi Arabia

KFUPM, ICS Department, K. Salah

Agenda

• Introduction
• Receive-livelock Phenomenon
• Existing Schemes
• Previous Work. Why Hybrid Scheme?
• Problem Statement
• Project Objectives
• Equipment
• Project Phases and Scheduling
• Benefits and Utilizations
• Budget
• Summary

Introduction

• High-Speed Network devices are widely deployed

• Gigabit Ethernet Technology supports 1 Gb/s and 10 Gb/s raw bandwidth

• The network performance bottleneck has shifted to servers and end hosts

• The large increase in bandwidth can negatively impact OS performance because of the interrupt overhead caused by incoming gigabit traffic.

• Because interrupt handling takes priority over other processing, this leads to the receive-livelock phenomenon

Typical Architecture Model

[Figure: typical architecture model. User space: applications. Kernel space: network protocol stack and device driver. The NIC (Rx MAC, Tx MAC, Rx DMA engine, Tx DMA engine) sits between the PCI bus and the network link. Annotation: the Rx circular buffer starting address is loaded at initialization to start DMAing incoming packets.]

Packet Arrival Rate - Slow

[Figure: host system at a slow packet arrival rate. Labels: Network traffic, Protocol Stack, Applications.]

Packet Arrival Rate - Fast

[Figure: host system at a fast packet arrival rate. Labels: Network traffic, Protocol Stack, Applications; two elements in the diagram are marked "XX".]

Receive-livelock Phenomenon

[Figure: throughput versus offered load, with three curves labeled Ideal, Acceptable, and Livelock, and the MLFRR marked. Source: K. K. Ramakrishnan, 1993]
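The shape of the livelock curve can be illustrated with a toy model. This is only a sketch under assumptions that are not in the slides: a single CPU, a constant per-packet interrupt cost `t_int`, a constant per-packet protocol-processing cost `t_proto`, and interrupts that always win the CPU. The function names and the numbers in the usage note are hypothetical.

```python
def delivered_rate(offered_pps, t_int, t_proto):
    """Packets/s delivered to applications when every packet raises an interrupt.

    Toy model (assumption): interrupt handling has absolute priority and
    costs t_int seconds per arriving packet; only the CPU time left over
    is available for protocol processing at t_proto seconds per packet.
    """
    # Fraction of the CPU not consumed by interrupt handling.
    cpu_left = max(0.0, 1.0 - offered_pps * t_int)
    # Packets/s the protocol stack can still process with that share.
    capacity = cpu_left / t_proto
    return min(offered_pps, capacity)


def loss_free_limit(t_int, t_proto):
    """Highest offered rate at which this toy model drops nothing."""
    return 1.0 / (t_int + t_proto)
```

With hypothetical costs of 5 µs per interrupt and 15 µs per packet of protocol processing, throughput tracks the offered load up to about 50,000 packets/s, then falls, and reaches zero once interrupts alone consume the whole CPU.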

Existing Schemes

• Normal Interruption

• Interrupt Disabling and Enabling

• Polling
  – Pure Polling vs. NAPI Polling

• Interrupt Coalescing (IC)

• Hybrid Scheme

Interrupt Disabling and Enabling

• The pure interrupt disable-enable scheme keeps the interrupts for incoming packets disabled as long as there are packets to be processed by the kernel’s protocol stack, i.e., while the protocol buffer is not empty.

• When the buffer is empty, the interrupts are re-enabled.
  – Any packets that arrive while the interrupts are disabled are DMA’d quietly to the protocol buffer without incurring any interrupt overhead.
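A minimal sketch of how this scheme cuts the interrupt count, assuming a constant protocol-processing time per packet and ignoring the cost of the interrupt itself. The function name and parameters are hypothetical, not taken from any kernel code.

```python
def count_interrupts(arrivals, service_time):
    """Count interrupts raised under the interrupt disable-enable scheme.

    arrivals: sorted packet arrival times (seconds).
    service_time: protocol-processing time per packet (assumed constant).
    """
    interrupts = 0
    busy_until = 0.0  # time at which the protocol buffer becomes empty
    for t in arrivals:
        if t >= busy_until:
            # Buffer is empty, so interrupts are enabled: this packet raises one.
            interrupts += 1
            busy_until = t + service_time
        else:
            # Interrupts are disabled: the packet is DMA'd quietly and queued.
            busy_until += service_time
    return interrupts
```

Widely spaced packets each raise an interrupt, while a burst that arrives during processing raises only one for the whole burst.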

Polling

• Disable interrupts for incoming packets altogether, thus eliminating interrupt overhead completely.

• The OS periodically polls host system memory (i.e., the protocol processing buffer or DMA Rx ring) to find packets to process.
  – In general, exhaustive polling is rarely implemented. Polling with a quota is the usual case: only a maximum number of packets is processed in each poll, which leaves some CPU power for application processing.

• Polling has two drawbacks.
  – First, polls can be unsuccessful, since packets are not guaranteed to be present in host memory at all times, and CPU power is then wasted.
  – Second, incoming packets are not processed immediately, because they are queued until the next poll.

• Selecting the polling period is crucial.
  – Very frequent polling can be detrimental to performance, since each poll can incur significant overhead.
  – On the other hand, if polling is performed infrequently, packets may encounter long delays.
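The quota and the two drawbacks can be made concrete with a small simulation. This is an illustrative sketch only: the Rx ring is modeled as a `deque`, and `poll`, `simulate_polling`, and their parameters are hypothetical names rather than kernel interfaces.

```python
from collections import deque


def poll(rx_ring, quota):
    """Process at most `quota` packets from the ring in a single poll."""
    processed = []
    while rx_ring and len(processed) < quota:
        processed.append(rx_ring.popleft())
    return processed


def simulate_polling(arrivals, period, quota, n_polls):
    """Return (number of empty polls, per-packet queueing delays).

    arrivals: packet arrival times; period: time between polls,
    in the same time unit as the arrivals.
    """
    pending = deque(sorted(arrivals))
    rx_ring = deque()
    empty_polls = 0
    delays = []
    for k in range(1, n_polls + 1):
        now = k * period
        # Packets that arrived since the last poll were DMA'd into the ring.
        while pending and pending[0] <= now:
            rx_ring.append(pending.popleft())
        batch = poll(rx_ring, quota)
        if not batch:
            empty_polls += 1  # wasted CPU: nothing was there to process
        delays.extend(now - arrived for arrived in batch)
    return empty_polls, delays
```

For example, three packets arriving at times 5, 6, and 7 with a polling period of 10 and a quota of 2 give delays of 5, 4, and 13 time units, followed by one empty poll.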

Pure Polling vs. NAPI Polling

Shortcomings of NAPI

• Rotten Packets
  – When NAPI re-enables interrupts, one or more packets may sneak in during that window and go undetected until a fresh packet arrives. These packets are known as “rotten packets”.

• Poor Performance with CPU-bound Applications
  – NAPI has been reported to perform poorly on hosts that are heavily loaded with CPU-bound applications. The cause is that polling is scheduled through Linux softIRQs: CPU-bound user applications compete with softIRQs for the CPU, so softIRQs (and hence NAPI) get fewer chances to run.

Interrupt Coalescing

• Most network adapters (NICs) are manufactured with support for interrupt coalescing (IC).

• In IC, the NIC generates a single interrupt for a group of incoming packets.
  – This is in contrast to normal interruption mode, in which the NIC generates an interrupt for every incoming packet.

• Two schemes mitigate the rate of interrupts:
  – Count-based IC
    • The NIC generates an interrupt when a predefined number of packets has been received.
  – Time-based IC
    • The NIC waits a predefined time period before it generates an interrupt. During this time period multiple packets can be received.
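The two coalescing schemes can be compared by counting interrupts for the same arrival trace. This is a sketch under simplifying assumptions: the time-based timer is assumed to start at the first packet that arrives after the previous interrupt, and the function names are hypothetical.

```python
def count_based_interrupts(arrivals, batch_size):
    """One interrupt per `batch_size` received packets.

    Packets left over after the last full batch raise no interrupt in
    this pure count-based model.
    """
    return len(arrivals) // batch_size


def time_based_interrupts(arrivals, window):
    """One interrupt per coalescing window.

    Assumption: a window of length `window` opens at the first packet
    that arrives after the previous window has closed.
    """
    interrupts = 0
    window_end = None
    for t in arrivals:
        if window_end is None or t >= window_end:
            # First packet since the last interrupt: start a new timer.
            interrupts += 1
            window_end = t + window
    return interrupts
```

For ten packets arriving one time unit apart, normal interruption would raise ten interrupts, whereas a count of 4 or a window of 5 raises only two.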

Hybrid Scheme

• A combination of:
  – Interrupt Disabling and Enabling
  – Polling
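One way such a combination could be driven is a two-state switch with hysteresis, in line with the Phase I plan in this deck to use lower and upper thresholds around the cliff point. The sketch below is an assumption, not the project's implementation: it assumes polling is chosen at high traffic rates and interrupt disable-enable at low rates, and all names and threshold values are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    INTERRUPT_DISABLE_ENABLE = "interrupt disable-enable"
    POLLING = "polling"


def next_mode(mode, rate, low, high):
    """Pick the handling scheme for the next interval.

    rate: estimated packet arrival rate; low/high: thresholds placed
    below and above the cliff point (assumed to be given).
    """
    if mode is Mode.INTERRUPT_DISABLE_ENABLE and rate > high:
        return Mode.POLLING
    if mode is Mode.POLLING and rate < low:
        return Mode.INTERRUPT_DISABLE_ENABLE
    # Between the thresholds, keep the current mode to avoid oscillation.
    return mode
```

Because the mode changes only when the rate leaves the band between `low` and `high`, a rate that hovers near the cliff point does not cause repeated switching.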

Why?

Problem Statement

• In this research we intend:
  – to implement a novel hybrid interrupt-handling scheme that improves the performance of the Linux networking subsystem and overcomes the shortcomings of NAPI.
  – to show experimentally that our proposed scheme outperforms NAPI under different system configurations and load conditions.

Project Objectives

• Devise a novel scheme for the Linux platform to enhance packet reception on links at Gigabit speed.
  – The scheme is expected to outperform NAPI, as currently implemented in the latest Linux 2.6, in terms of latency, throughput, and CPU availability.
  – The novel scheme should include a proper solution for measuring and forecasting the traffic rate.
  – The scheme should also work for hosts with single and multiple interfaces.
  – More importantly, the scheme should work on SMP (Symmetric Multi-Processing) architectures, where the host’s motherboard has multiple processors.

Project Objectives (cont’d)

• Find solutions to the shortcomings and open issues of NAPI (other than latency, throughput, and CPU availability). These shortcomings include rotten packets and poor network performance when the system is heavily loaded with CPU-bound applications.

• Devise a novel generic benchmark for Linux hosts to find the switching point (cliff point).

Project Objectives (cont’d)

• Develop an experimental testbed to examine and compare the performance of the modified Linux version against the latest Linux NAPI.
  – The experiment takes into account numerous test conditions and variables:
    • Linux host with single and multiple network interfaces
    • Different types of input traffic (bursty, constant, Poisson)
    • Different packet sizes
    • Various types of system loads, including CPU-bound and I/O-bound applications
    • Hosts with single and multiple processors (i.e., SMP)

• The experiment should follow the testing and benchmarking guidelines laid out in RFC 2544.
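The three input traffic types listed above can be sketched as arrival-time generators, for instance to drive a simulation before hardware tests. This is an illustrative assumption only: the project itself plans to use an IXIA traffic generator, and the on/off model of bursty traffic and all function names here are hypothetical.

```python
import random


def constant_arrivals(rate, n):
    """n arrival times with a fixed gap of 1/rate seconds."""
    gap = 1.0 / rate
    return [i * gap for i in range(1, n + 1)]


def poisson_arrivals(rate, n, seed=0):
    """n arrival times with exponentially distributed gaps (mean 1/rate)."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    t, out = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)
        out.append(t)
    return out


def bursty_arrivals(burst_size, intra_gap, idle_gap, n_bursts):
    """Bursts of closely spaced packets separated by idle periods
    (a simple on/off model, assumed here for illustration)."""
    t, out = 0.0, []
    for _ in range(n_bursts):
        for _ in range(burst_size):
            t += intra_gap
            out.append(t)
        t += idle_gap
    return out
```

Each generator returns a sorted list of timestamps, so the same trace can be replayed against different interrupt-handling models.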

Experimental Equipment

[Figure: experimental setup. A Linux box (interfaces eth0, eth1, eth2), an IXIA 400T traffic generator/analyzer (interfaces if 1, if 2, if 3), and a laptop used to manage and control the IXIA software.]

Project Phases and Scheduling

• Phase I (six months)
  – This is primarily a Linux network stack re-design and modification phase.

• Phase II (twelve months)
  – This phase is concerned with the testbed and experimental setup, as well as running the performance evaluation of NAPI and our proposed hybrid scheme.

• Phase III (six months)
  – This phase is concerned with the performance of our hybrid scheme on hosts with SMP support.

Phase I

1. Devise an appropriate technique to measure the traffic arrival rate in real time. This task includes the following subtasks:

• Perform an extensive literature review on measuring and forecasting the traffic arrival rate. Devise a forecasting technique that is:
  (1) computationally simple, with minimal overhead and operations,
  (2) accurate, i.e., comparable to the actual data rate,
  (3) stable, i.e., it ignores short traffic spikes, and
  (4) responsive, i.e., it follows changes in the actual traffic rate.

• Examine the effectiveness of the proposed technique in forecasting the traffic arrival rate and compare it with other techniques proposed in the literature. The technique must be appropriate for different types of traffic, including bursty traffic with empirical packet sizes. Discrete Event Simulation (DES) will be used to assess the performance and effectiveness of our proposed technique.

• Plot, analyze, and compare the performance of the proposed technique for forecasting the traffic arrival rate.

• Determine (using simulation and fine-tuning of parameters) the minimum and maximum values (i.e., the confidence interval) of the forecasted/estimated traffic rate. These values serve as the lower and upper thresholds around the cliff point, which the hybrid scheme uses to switch between interrupt disable-enable and polling. They also prevent frequent oscillation between the two schemes, thereby minimizing the overall overhead.
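The four requirements listed for the forecasting technique can be illustrated with an exponentially weighted moving average (EWMA). The slides do not name a technique, so EWMA is purely an assumption chosen as a common low-cost estimator; the class name and the smoothing factor `alpha` are hypothetical.

```python
class EwmaRateEstimator:
    """Smoothed packet-rate estimate from per-interval packet counts.

    alpha close to 0 favors stability (short spikes are damped);
    alpha close to 1 favors responsiveness (changes are followed quickly).
    """

    def __init__(self, alpha):
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.rate = None  # no estimate until the first sample

    def update(self, packets, interval):
        """Fold in one measurement: `packets` seen during `interval` seconds."""
        sample = packets / interval
        if self.rate is None:
            self.rate = sample
        else:
            # One multiply-add per update keeps the overhead minimal.
            self.rate = self.alpha * sample + (1.0 - self.alpha) * self.rate
        return self.rate
```

With `alpha = 0.25`, a jump from 100 to 200 packets/s moves the estimate only to 125 packets/s after one interval, which shows the trade-off between stability and responsiveness that parameter tuning would have to settle.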

Phase I – cont’d

2. Thoroughly understand the Linux kernel and the complex NAPI code. This requires the following subtasks:

• Perform an extensive review and study of the Linux 2.6 network stack (NAPI) and the NIC network drivers.

• Set up a code-browsing utility such as cscope or kscope to navigate the actual Linux code and understand it thoroughly.

• Identify exactly what code needs to be changed in both the Linux kernel and the network driver.

• Identify how the code must differ to support single-processor and multi-processor (i.e., SMP) hosts.

3. Investigate known open issues or shortcomings of NAPI (other than the expected latency at low traffic rates) and critique the solutions proposed in the literature.

• These shortcomings include rotten packets and poor network performance under heavy CPU-bound applications.

• More importantly, investigate how our proposed hybrid scheme will resolve these known open issues.

Phase II

4. Modify, recompile, and test the Linux 2.6 code to implement our proposed hybrid scheme and the scheme for forecasting the traffic arrival rate. In addition, the code has to include solutions to rotten packets and to the poor performance of the network stack on a system heavily loaded with CPU-bound applications.

5. Learn how to use the IXIA 400T traffic generator/analyzer. Configure a simple experiment that generates and receives packets.

6. Identify the proper cliff point for the system. This can be accomplished only by determining the interrupt overhead and the protocol processing time, both of which will be obtained by measurement.
  – Using IXIA or some other technique, devise a generic and useful way to measure interrupt overhead. Determine the distribution of the interrupt overhead.
  – Using IXIA or some other technique, devise a way to measure protocol processing at the OS level. Determine the distribution of the kernel’s protocol processing time.

Phase II – cont’d

7. Using the IXIA 400T and a PC with Linux 2.6 and NAPI enabled, measure and plot the following performance metrics:

• Packet forwarding latency
• Packet forwarding throughput
• CPU utilization with packet forwarding

8. The above experiment will consider the following configurations and conditions:

• Different packet sizes
• Traffic distribution: Poisson vs. bursty
• Traffic reception and transfer on a single NIC
• Traffic reception and transfer on multiple NICs

9. Using the IXIA 400T and a PC with our proposed hybrid scheme, perform the same performance measurements as in Task 7 and Task 8.

10. Plot and compare the performance of NAPI and our proposed hybrid scheme. Draw proper conclusions.

11. Compare and evaluate the performance of our solutions for the NAPI shortcomings of rotten packets and poor network performance under CPU-bound applications. Consider the performance conditions and configurations of Task 7 and Task 8.

Phase III

12. Examine the performance impact described in the previous tasks (Tasks 6-11) under Linux SMP support on a dual-processor motherboard.
  – Compare SMP performance to the performance obtained with only a single processor. This is a huge phase, as six tasks are to be carried out again. Note that, according to RFC 2544 recommendations, obtaining a reported value for a single performance point requires repeating a test at least 20 times and reporting the average of the 20 recorded values. The recommendations and guidelines also state that the test has to run for at least 20 minutes to obtain a single reported value.

13. Ensure that the novel scheme preserves the order of packets, i.e., there is no need for packet re-ordering.

14. Prepare and deliver the final report.

Work Plan

Personnel Requirements

• The project team will consist of the primary investigator (PI) and two graduate students (PhD or MS degree candidates).

• The graduate students will be computer science/engineering graduates and will work under the supervision and guidance of the PI.

Benefits and Utilization

• Contribute to the advancement of open-source operating systems (such as Linux) by providing a step-up version that improves the performance of the networking subsystem to suit Gigabit network traffic.
  – This will lead to better Linux-based routers, firewalls, servers, and proxies.

• Utilize the previous theoretical work of [24] to devise a new hybrid interrupt-handling scheme that improves the networking performance of Linux or any other operating system.

• Provide adequate solutions to the NAPI shortcomings of the current Linux 2.6 networking subsystem.

Benefits and Utilization -- cont’d

• Demonstrate that the proposed hybrid scheme is a significant performance enhancement over current versions across many different configurations and load conditions.

• Provide an algorithm and a computationally optimized technique to forecast the traffic arrival rate. Such an algorithm or technique should have no or minimal impact on Linux performance.

• Provide a generic methodology and benchmark to identify the switching point.

• The research community at large can benefit substantially from the experimental work in terms of methodology, testbed, experimental setup, and configuration. The experimental methodology and techniques can be employed to conduct performance comparisons for similar systems.

Benefits and Utilization -- cont’d

• Major beneficiaries may include almost all Saudi companies, as well as governmental and non-governmental institutions, that show keen interest in using Linux.
  – GbE deployment
  – Linux’s wide popularity

• The project will benefit KFUPM in general and the Department of Information and Computer Science in particular.
  – It is anticipated that a modified version of Linux that best suits Gigabit traffic will carry the name of KFUPM and the ICS department.
  – KFUPM can be seen as an active contributor to open-source code and the open-source community.

• Results of general interest to the research community will be published at key international conferences, such as those of the IEEE and ACM. It is also anticipated that this research work will lead to publications in reputable refereed journals.

• There are no network traffic generators or analyzers at KFUPM.
  – Such a project can lay the ground for further research and development by making such equipment available. The equipment can be utilized for research.
  – The IT center at the university can also use such equipment for diagnosing and troubleshooting network problems related to performance bottlenecks.

Budget

Summary

• In this research we intend to improve the performance of the Linux networking subsystem and overcome the shortcomings of NAPI.

• The project will be of great benefit to the research and open-source communities, to KFUPM, and to the public at large.