outline of the paper introduction. overview of l4. design and implementation of linux server....

Upload: toni-dalling

Post on 15-Jan-2016


Page 1: Outline of the Paper Introduction. Overview Of L4. Design and Implementation Of Linux Server. Evaluating Compatibility Performance. Evaluating Extensibility
Outline of the Paper

• Introduction
• Overview of L4
• Design and Implementation of Linux Server
• Evaluating Compatibility Performance
• Evaluating Extensibility Performance
• Alternative concepts from a performance point of view
• Conclusion

Introduction

Motivation: Microkernel-based systems were found to be too slow.

Goal: Show that microkernel-based systems can be practical and perform well.

Method:
• Conduct experiments on L4, a lean second-generation microkernel, with Linux running on top of it.
• The resulting system is called L4linux.
• Compare the performance of L4linux with native Linux and with Mklinux, which is Linux running on a Mach-derived first-generation microkernel.

L4 Essentials

• Based on two concepts: address spaces and threads.
• Address spaces
  - Constructed recursively by user-level servers, called pagers, outside the kernel.
  - The initial address space represents physical memory.
  - Further address spaces are created by granting, mapping and unmapping flexpages.
  - Flexpages: logical pages of size 2^n, ranging from one physical page to the entire address space.
  - Pagers act as main memory managers, enabling the implementation of memory management policies.
• Threads
  - A thread is an activity executing inside an address space.
  - A thread can dynamically associate with individual pagers.
• IPC refers to cross-address-space communication.
• I/O ports are treated as part of the address space.
• Hardware interrupts are handled as IPC.
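
The recursive construction of address spaces by mapping, granting and unmapping can be illustrated with a toy model. This is only a sketch under stated assumptions: the class `AddressSpace`, its methods and the space names are hypothetical, pages are plain integers rather than flexpages, and none of it is the real L4 interface.

```python
class AddressSpace:
    """Toy model of an L4-style address space (hypothetical, not the L4 API)."""

    def __init__(self, name, pages=()):
        self.name = name
        self.pages = set(pages)  # pages currently accessible in this space
        self.children = {}       # page -> spaces that received the page from us
        self.parent = {}         # page -> space we received the page from

    def _check(self, page):
        if page not in self.pages:
            raise ValueError(f"{self.name} has no access to page {page}")

    def map(self, page, receiver):
        """Share a page with receiver; the mapper keeps access."""
        self._check(page)
        receiver.pages.add(page)
        receiver.parent[page] = self
        self.children.setdefault(page, []).append(receiver)

    def grant(self, page, receiver):
        """Hand a page over to receiver; the granter loses access."""
        self._check(page)
        self.pages.discard(page)
        receiver.pages.add(page)
        # Toy simplification: the receiver takes the granter's place
        # in the mapping tree.
        for child in self.children.pop(page, []):
            child.parent[page] = receiver
            receiver.children.setdefault(page, []).append(child)
        parent = self.parent.pop(page, None)
        if parent is not None:
            siblings = parent.children[page]
            siblings[siblings.index(self)] = receiver
            receiver.parent[page] = parent

    def unmap(self, page):
        """Revoke a page from every space that got it from us, recursively."""
        self._check(page)
        for child in self.children.pop(page, []):
            if page in child.pages:
                child.unmap(page)  # revoke mappings derived from the child first
                child.pages.discard(page)
                child.parent.pop(page, None)


initial = AddressSpace("initial", pages=range(4))  # stands for physical memory
linux = AddressSpace("linux-server")
app = AddressSpace("user-process")

initial.map(0, linux)    # a pager maps page 0 into the Linux server
linux.map(0, app)        # the Linux server maps it on to a user process
initial.unmap(0)         # revoked from linux and, recursively, from app
initial.grant(1, linux)  # page 1 is now accessible in linux only
```

After these calls, page 0 is accessible only in `initial` and page 1 only in `linux`, which is the behaviour the three operations are meant to convey.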

Linux Design and Implementation

• L4 is implemented on Pentium, Alpha and MIPS architectures.
• Linux has architecture-dependent and architecture-independent parts.
• All modifications were made to the architecture-dependent part.
• The Linux application binary interface is unmodified.
• No Linux-specific modifications were made to L4.

The Linux Kernel

• On booting, the Linux server requests memory from its pager, which maps physical memory into the server’s address space.
• The Linux server then acts as the pager for the user processes it creates.
• Hardware page tables are kept inside L4 and cannot be accessed directly by user-level processes, so the Linux server keeps additional logical page tables of its own.
• A single L4 thread is multiplexed by L4linux to handle system calls and page faults.
• Interrupts are disabled for synchronization and in critical sections.

Interrupt Handling and Device Drivers

• Interrupt handlers in native Linux are subdivided into top halves (run immediately) and bottom halves (run later).
• L4 maps hardware interrupts into messages.
• Top-half interrupt handlers are implemented as threads waiting for such messages, one thread per interrupt source.
• Another thread handles all bottom halves once the top halves have completed.
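
The thread structure described here can be sketched with ordinary queues and threads. This is an illustrative analogy only: the function names, the interrupt source names and the `None` shutdown sentinel are assumptions made for the demo, and Python queues stand in for L4 interrupt messages.

```python
import queue
import threading


def top_half(irq, messages, bottom_halves):
    """One thread per interrupt source, blocked until a message arrives."""
    while True:
        msg = messages.get()  # plays the role of waiting for an interrupt message
        if msg is None:       # shutdown sentinel, needed only for this demo
            break
        # Urgent work would run here; the rest is deferred to the bottom half.
        bottom_halves.put((irq, msg))


def bottom_half_worker(bottom_halves, log):
    """A single thread that runs all deferred bottom-half work."""
    while True:
        item = bottom_halves.get()
        if item is None:
            break
        log.append(item)


bottom_halves = queue.Queue()
log = []
sources = {irq: queue.Queue() for irq in ("timer", "disk")}

tops = [
    threading.Thread(target=top_half, args=(irq, q, bottom_halves))
    for irq, q in sources.items()
]
bottom = threading.Thread(target=bottom_half_worker, args=(bottom_halves, log))
for thread in tops + [bottom]:
    thread.start()

# Deliver three "interrupts" as messages.
sources["timer"].put("tick-1")
sources["disk"].put("block-ready")
sources["timer"].put("tick-2")

# Shut down: stop the top halves first, then the bottom-half thread.
for q in sources.values():
    q.put(None)
for thread in tops:
    thread.join()
bottom_halves.put(None)
bottom.join()
```

Once the threads have been joined, `log` holds all three deferred items. The order across different sources is not fixed, but messages from one source keep their order.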

Linux User Processes

• Each Linux user process is implemented as an L4 task.
• The Linux server creates the task and associates it with a pager.
• L4 converts any Linux user page fault into an RPC and sends it to the Linux server.
• The server then replies by mapping pages from its own address space into the address space of the process, or by unmapping them.
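
The page fault path can be sketched as a small simulation. All names here (`LinuxServer`, `Task`, `page_fault_rpc`, `access`) are hypothetical, the page size is an assumed 4096 bytes, and a direct method call stands in for the RPC that the microkernel would generate.

```python
PAGE_SIZE = 4096  # assumed page size for the demo


class LinuxServer:
    """Stands in for the Linux server acting as pager (hypothetical toy)."""

    def __init__(self):
        self.next_frame = 0
        self.faults = 0

    def page_fault_rpc(self, task, vpage):
        """Answer a fault message by mapping a frame into the faulting task."""
        self.faults += 1
        frame = self.next_frame
        self.next_frame += 1
        task.page_table[vpage] = frame  # the "map" reply


class Task:
    """Stands in for a Linux user process running as a task."""

    def __init__(self, pager):
        self.pager = pager
        self.page_table = {}  # virtual page number -> frame number

    def access(self, addr):
        vpage, offset = divmod(addr, PAGE_SIZE)
        if vpage not in self.page_table:
            # The microkernel would turn this fault into an RPC to the pager.
            self.pager.page_fault_rpc(self, vpage)
        return self.page_table[vpage] * PAGE_SIZE + offset


server = LinuxServer()
proc = Task(pager=server)
first = proc.access(0x1234)   # faults, the pager maps the page
second = proc.access(0x1238)  # same page, no second fault
```

Only the first access triggers a fault; the second one hits the mapping installed by the pager's reply.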

System Call Mechanisms

• L4linux system calls are implemented as RPCs between user processes and the Linux server.
• There are three system call interfaces:
  1. A modified version of libc.so, which uses L4 IPC primitives to call the Linux server.
  2. A correspondingly modified libc.a.
  3. A user-level exception handler, which emulates the native system call trap instruction by calling the corresponding routine in the modified shared library.
• TLB flushes are avoided.
• L4linux uses physical copyin and copyout to exchange data between the kernel and user processes, instead of relying on address translation by the hardware.

Signaling

• The native Linux kernel delivers signals to user processes by manipulating their stack, stack pointer and program counter.
• In L4linux, each user process has a signal handler thread.
• Upon receiving a signal from the Linux server, the signal handler thread causes the process’s main thread to save its state and enter Linux; the main thread is then resumed.

Scheduling

• All threads are scheduled by L4’s internal scheduler.
• The Linux server’s native schedule() operation is used only for multiplexing the Linux server thread across coroutines when concurrent calls are made.
• The number of coroutine switches is minimized by sleeping until a new system call or wakeup call is received.

Supporting Tagged TLBs or Small spaces

• Tagged TLBs are used to avoid TLB flushes in native Linux.
• However, TLB conflicts have the same effect as TLB flushes, owing to the extensive use of libraries and the identical virtual allocation of code and data in all address spaces.
• In L4linux, a special library permits the customization of code and data.
• The emulation library and signal thread can also be mapped close to the application.
• Thus, servers executing in small address spaces can be built.

Compatibility Performance

Three questions:

1. What is the penalty of using L4linux instead of native Linux?
   - Answered by running benchmarks on native Linux and L4linux on the same hardware.
2. Does the performance of the underlying microkernel matter?
   - Answered by comparing L4linux to Mklinux.
3. How much does co-location improve performance?
   - Answered by comparing user-mode L4linux to an in-kernel version of Mklinux.

Micro Benchmarks

• Used to analyze the detailed behavior of L4linux mechanisms.
• getpid, the shortest system call, was repeated in a loop.
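
A getpid loop of this kind can be sketched as follows. The function name `time_getpid` and the iteration count are assumptions, and the figure it reports includes Python interpreter and loop overhead, so it is not comparable to the numbers from the original measurements.

```python
import os
import time


def time_getpid(iterations=100_000):
    """Return the average time per os.getpid() call in microseconds."""
    start = time.perf_counter()
    for _ in range(iterations):
        os.getpid()
    elapsed = time.perf_counter() - start
    return elapsed / iterations * 1e6  # includes interpreter and loop overhead


per_call_us = time_getpid()
print(f"getpid: {per_call_us:.3f} us per call")
```

Repeating a very short call many times and dividing by the iteration count is what makes the per-call cost measurable at all, since a single call is far below the timer's useful resolution.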

Micro Benchmarks

• The lmbench benchmark suite measures system calls, context switches, memory accesses, pipe operations, networking operations, etc.
• hbench is a revised version of lmbench.

Macro Benchmarks

• Measure the system’s overall performance.
• Recompiling the Linux server on L4linux was 6-7% slower than on native Linux and 10-20% faster than on both Mklinux versions.
• The commercial AIM multiuser benchmark was used for a more systematic evaluation.
• System performance was measured under different application loads.

Compatibility Performance Analysis

• The current implementation of L4linux comes close to native Linux even under high load with penalties ranging from 5-10%.

• Both the macro and micro benchmarks show that the performance of the microkernel matters.

• All benchmarks suggest that co-location by itself does not improve performance.

Extensibility Performance

Main advantage of a microkernel: extensibility/specialization.

Three questions:
1. Can we add services outside L4linux to improve performance by specializing Unix functionality?
2. Can we improve certain applications by using native microkernel mechanisms in addition to the classical API?
3. Can we achieve high performance for non-classical, Unix-incompatible systems coexisting with L4linux?

These three questions are answered by specific examples.

Pipes and RPC

Four variants of data exchange are compared:

1. Standard pipe mechanism.
2. Asynchronous pipes on L4: run only on L4 and need no Linux kernel.
3. Synchronous RPC: uses blocking IPC directly, without buffering data.
4. Synchronous mapping RPC: the sender maps pages into the receiver’s address space.

lmbench was used to measure latency and bandwidth.
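
For the first variant, the standard pipe, latency and bandwidth can be measured roughly as below. This sketch is not the benchmark suite's own method: it writes and reads within a single process, the function names and sizes are assumptions, and Python overhead is included in the results.

```python
import os
import time


def pipe_round_trip_us(iterations=10_000):
    """Average time to push one byte through a pipe and read it back."""
    read_fd, write_fd = os.pipe()
    try:
        start = time.perf_counter()
        for _ in range(iterations):
            os.write(write_fd, b"x")
            os.read(read_fd, 1)
        elapsed = time.perf_counter() - start
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return elapsed / iterations * 1e6


def pipe_bandwidth_mb_s(total_bytes=4 * 1024 * 1024, chunk=4096):
    """Throughput when moving total_bytes through a pipe in small chunks."""
    read_fd, write_fd = os.pipe()
    payload = b"\0" * chunk  # small enough to fit in the pipe buffer
    try:
        moved = 0
        start = time.perf_counter()
        while moved < total_bytes:
            os.write(write_fd, payload)
            moved += len(os.read(read_fd, chunk))
        elapsed = time.perf_counter() - start
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return moved / elapsed / 1e6


print(f"latency:   {pipe_round_trip_us():.2f} us per round trip")
print(f"bandwidth: {pipe_bandwidth_mb_s():.1f} MB/s")
```

Latency is dominated by per-operation cost and bandwidth by per-byte copying cost, which is why the two are reported separately for each variant.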

Cache Partitioning

• L4’s hierarchical user-level pagers allow both the L4linux memory system and a dedicated real-time system to run in parallel.
• In real-time systems, the worst-case execution time is the optimization criterion.
• A memory manager on top of L4 is used to partition the cache between multiple real-time tasks to minimize cache interference costs.
• The time for a matrix multiplication was measured in three cases:
  1. Uninterrupted execution: 10.9 ms
  2. Interrupted, with cache conflicts: 96.1 ms
  3. Cache partitioning, avoiding secondary cache interference: 24.9 ms
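
The three timings are easier to compare as ratios. The short calculation below uses only the figures listed above; the dictionary keys are labels chosen for this sketch.

```python
# Matrix multiplication times in milliseconds, from the three cases above.
times_ms = {"uninterrupted": 10.9, "interrupted": 96.1, "partitioned": 24.9}

base = times_ms["uninterrupted"]
slowdown = {case: t / base for case, t in times_ms.items()}
gain = times_ms["interrupted"] / times_ms["partitioned"]

print(f"interrupted: {slowdown['interrupted']:.1f}x the uninterrupted time")
print(f"partitioned: {slowdown['partitioned']:.1f}x the uninterrupted time")
print(f"partitioning cuts the interrupted time by a factor of {gain:.1f}")
```

Cache interference makes the run about 8.8 times slower, while partitioning limits the slowdown to about 2.3 times, an improvement of roughly 3.9 times over the unpartitioned interrupted case.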

Virtual Memory Operations

• The times taken (in microseconds) for selected memory operations in native Linux and L4linux are compared.

Operation   L4     Linux
Fault       6.2    n/a
Trap        3.4    12
Appel1      12     55
Appel2      10     44
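
The speedup implied by each row can be computed directly from the table. The snippet uses only the values shown; the variable names are chosen for this sketch.

```python
# Microseconds per operation, copied from the table: (L4, Linux).
# "Fault" is left out because the Linux column is n/a.
vm_times_us = {
    "Trap": (3.4, 12.0),
    "Appel1": (12.0, 55.0),
    "Appel2": (10.0, 44.0),
}

speedup = {op: linux / l4 for op, (l4, linux) in vm_times_us.items()}
for op, factor in speedup.items():
    print(f"{op}: {factor:.1f}x faster on L4")
```

This gives factors of about 3.5 for Trap, 4.6 for Appel1 and 4.4 for Appel2.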

Extensibility Performance Analysis

• Unix-compatible functionality can be improved by microkernel primitives. E.g. pipes, VM operations.
• Unix-compatible or partially compatible functions can be added to the system that outperform implementations based on the Unix API. E.g. RPC, user-level pagers for VM operations.
• A microkernel offers possibilities for coexisting systems based on different paradigms. E.g. real-time systems and MMU.

Alternative Basic Concepts

Can a mechanism at a lower level than IPC, or a grafting model, improve the performance of a microkernel?

Protected Control Transfer

• A parameterless cross-address-space procedure call via a callee-defined gate.
• The times taken for PCT and IPC were compared; PCT does not offer a significant improvement.

Grafting

• Downloading extensions into the kernel.
• The performance impact is still an open question.

Conclusion

• The performance of L4 is significantly better than that of the first-generation microkernel.

• The throughput of L4linux is only 5% lower than that of native Linux, whereas the penalty for the first-generation microkernel system was 5-7 times higher.

• The overall system performance does depend on the performance of the microkernel.

• Modifications to Linux to suit L4 will further improve performance.

• L4 provides an apt platform to build specialized systems.