µ-kernel construction (1) - itec-os start · µ-kernel construction (1) overview, motivation,...
TRANSCRIPT
© 2009 University of Karlsruhe, System Architecture Group
µµ--Kernel Construction (1)Kernel Construction (1)
Overview, Motivation, Problems
2© 2009 University of Karlsruhe, System Architecture Group
Staff
� Lecturer: Raphael Neider
� PhD student
� Meeting Times
� Tue, 14:00-15:30h
� Bldg. 50.34, Room 161
3© 2009 University of Karlsruhe, System Architecture Group
Background
� L4Ka project (http://l4ka.org)
� State-of-the-art in microkernel construction
� L4Ka::Pistachio
� Microkernel
� IDL4
� Interface compiler
� Virtual machines
� Target application
4© 2009 University of Karlsruhe, System Architecture Group
Purpose of Operating Systems
� Abstract from the hardware
� Interrupts, exceptions
� Provide common services
� Address spaces and protection
� Threads and concurrency control
� Persistency of data
� Bridge semantic gap
� Application demands vs. hardware provides
5© 2009 University of Karlsruhe, System Architecture Group
Purpose
HardwareHardware
OperatingSystem
OperatingSystem
AppApp AppAppAppApp
ApplicationApplicationApplication
documents
windowsthreads
coroutines
symbolsstacks & heaps
arrays & structures
variables
modules
procedures
statements
HardwareHardwareHardwarebit
wordregisterbyte
instruction
Operating SystemOperating SystemOperating Systemfile
address space
socket
semaphore
IPC
process
monitor
eventsegment
mutex
priority
ACLthread
pipepagetask
schedule
6© 2009 University of Karlsruhe, System Architecture Group
Operating System Designs
application-specific monolithic
µ-kernel with object interfaces
�Ultimate flexibility�Ultimate minimalism�Ultimate performance�Normal programming
paradigm� Proprietary and
incompatible solutions
�Standard interface�Runs programs from
different vendors� Compromised
interface� Poor performance� Inflexible
�Standard interface�User-defined interfaces�Runs subsystems from
different vendors�Good flexibility�Good minimalism�Good performance� Difficult to use� Different paradigm
7© 2009 University of Karlsruhe, System Architecture Group
Monolithic Kernels – Advantages
� Kernel has access to everything, potentially
� All optimizations are possible
� All techniques/mechanisms/concepts can be implemented
� Kernel extended by adding more code
AppApp AppAppAppApp
HardwareHardware
LinuxLinux
Driver Driver
TCP/IP EXT2
8© 2009 University of Karlsruhe, System Architecture Group
0
10
20
30
40
50
60
70
1994-03-13 1996-12-07 1999-09-03 2002-05-30 2005-02-23 2007-11-20
Siz
e (M
B)
<=2.0 2.1 2.2 2.3 2.4 2.5 2.6.16 2.6
Linux Kernel Evolution (.tar.gz)
Linux 2.6.25:Linux 2.6.25:
5.7M lines of code5.7M lines of code(using “sloccount”)
9© 2009 University of Karlsruhe, System Architecture Group
Approaches to Tackling Complexity
� Monolithic approaches
� Layered Kernels
� Modular Kernels
� Object Oriented Kernels
� Alternatives
� Extensible Kernels
� Microkernels
10© 2009 University of Karlsruhe, System Architecture Group
History
� Monolithic kernels
� 1st-generation µ-kernels
� Mach
� Chorus
� Amoeba
� (L3)
� 2nd-generation µ-kernels
� (Spin)
� Exokernel
� L4
CMU, OSF
Inria, Chorus
Vrije Universiteit
GMD
U Washington
MIT
GMD / IBM / UKa Recursive Address SpacesRecursive Address Spaces
External PagerExternal Pager
UserUser --Level DriverLevel Driver
11© 2009 University of Karlsruhe, System Architecture Group
µ-Kernel BasedSystems
HardwareHardware
LinuxLinux
HardwareHardware
L4 µ-kernelL4 µ-kernel
L4LinuxL4Linux
AppApp AppAppAppApp AppApp AppAppAppApp
Monolithic System µ-Kernel BasedMonolithic System Smaller trusted
computing base
12© 2009 University of Karlsruhe, System Architecture Group
Smaller trusted computing base
µ-Kernel BasedSystems
HardwareHardware
LinuxLinux
HardwareHardware
L4 µ-kernelL4 µ-kernel
L4LinuxL4Linux
AppApp AppAppAppApp AppApp AppAppAppApp
Monolithic System µ-Kernel BasedMonolithic System
What if address spaceWhat if address space
switches were for free?switches were for free?
13© 2009 University of Karlsruhe, System Architecture Group
µ-Kernel BasedSystems
AppApp AppAppAppApp
HardwareHardware
LinuxLinux
Driver Driver
TCP/IP EXT2
HardwareHardware
L4 µ-kernelL4 µ-kernel
AppApp AppAppAppApp
L4LinuxL4Linux
Driver Driver
TCP/IP EXT2
Monolithic System µ-Kernel BasedMonolithic System
Multi-Server System
Net DrvNet Drv IDE DrvIDE Drv
TCP/IPTCP/IP EXT2EXT2
HardwareHardware
L4 µ-kernelL4 µ-kernel
AppApp AppAppAppApp
14© 2009 University of Karlsruhe, System Architecture Group
µ-Kernel BasedSystems
Net DrvNet Drv IDE DrvIDE Drv
TCP/IPTCP/IP EXT2EXT2
HardwareHardware
LinuxLinux
HardwareHardware
L4 µ-kernelL4 µ-kernel
L4LinuxL4Linux
AppApp AppAppAppApp AppApp AppAppAppApp
Driver Driver
TCP/IP EXT2
Driver Driver
TCP/IP EXT2
Monolithic System µ-Kernel BasedMonolithic System
HardwareHardware
L4 µ-kernelL4 µ-kernel
Multi-Server System
AppApp AppAppAppApp
HardwareHardwareHardwarebit
wordregisterbyte
instruction
ApplicationApplicationApplication
documents
windowsthreads
coroutines
symbolsstacks & heaps
arrays & structures
variables
modules
procedures
statements
µµµ---KernelKernelKerneladdress space
thread
ServerServerServerPageServerServerServerMutex ServerServerServerSocket ServerServerServer
File
15© 2009 University of Karlsruhe, System Architecture Group
Microkernel Based Systems
HardwareHardware
L4 µ-kernelL4 µ-kernel
L4LinuxL4Linux
AppApp AppAppAppApp
µ-Kernel BasedServer Consolidation
L4LinuxL4Linux L4LinuxL4Linux
Driver Driver
TCP/IP EXT2
Hardware
Linux
App AppApp
Monolithic System
Driver Driver
TCP/IP EXT2
Hardware
L4 µ-kernel
L4Linux
App AppApp
µ-Kernel BasedMonolithic System
Net Drv IDE Drv
TCP/IP EXT2
Hardware
L4 µ-kernel
Multi-Server System
App AppApp
16© 2009 University of Karlsruhe, System Architecture Group
Microkernel Based Systems
HardwareHardware
L4 µ-kernelL4 µ-kernel
AppApp AppAppAppApp
Coupling withReal-Time Systems
L4LinuxL4Linux RTSystem
RTSystem
Driver Driver
TCP/IP EXT2
Hardware
Linux
App AppApp
Monolithic System
Driver Driver
TCP/IP EXT2
Hardware
L4 µ-kernel
L4Linux
App AppApp
µ-Kernel BasedMonolithic System
Net Drv IDE Drv
TCP/IP EXT2
Hardware
L4 µ-kernel
Multi-Server System
App AppApp
Hardware
L4 µ-kernel
Linux
App AppApp
µ-Kernel BasedServer Consolidation
Linux Linux
17© 2009 University of Karlsruhe, System Architecture Group
Microkernel Based Systems
Driver Driver
TCP/IP EXT2
Hardware
Linux
App AppApp
Monolithic System
Driver Driver
TCP/IP EXT2
Hardware
L4 µ-kernel
L4Linux
App AppApp
µ-Kernel BasedMonolithic System
Net Drv IDE Drv
TCP/IP EXT2
Hardware
L4 µ-kernel
Multi-Server System
App AppApp
Hardware
L4 µ-kernel
Linux
App AppApp
µ-Kernel BasedServer Consolidation
Linux Linux
Hardware
L4 µ-kernel
App AppApp
Coupling withReal-Time Systems
L4LinuxRT
System
Hardware
L4 µ-kernel
App AppApp
Thin Clients
NativeJava
Embd.JavaApp
Hardware
L4 µ-kernel
Specialized Systems
SpecializedComponent
Hardware
L4 µ-kernel
App AppApp
Coupling withSecure Systems
L4LinuxSecureSystem
18© 2009 University of Karlsruhe, System Architecture Group
Microkernel Based Systems:The Challenge
HardwareHardware
L4 µ-kernelL4 µ-kernel
Net DrvNet Drv IDE DrvIDE Drv
TCP/IPTCP/IP
Exec ServExec Serv
SCSI DrvSCSI Drv KBD DrvKBD Drv
EXT2 FSEXT2 FS MM ServMM Serv
Proc ServProc ServGFX ServGFX Serv Swap ServSwap Serv
shshgccgcclesslessemacsemacs twmtwm
Regular O
S operation
s
Regular O
S operation
s
require lot
s of comm
unication
require lot
s of comm
unication
19© 2009 University of Karlsruhe, System Architecture Group
The Great Promise
� Coexistence of different
� APIs
� File systems
� OS personalities
� Flexibility
� Extensibility
� Simplicity
� Maintainability
� Security
� Safety IBM IBM WorkPlaceWorkPlace OS: OS:
~2,000,000,000 US$~2,000,000,000 US$
� SLOW
� INFLEXIBLE
� LARGE
The Big Disaster
20© 2009 University of Karlsruhe, System Architecture Group
25 MHz 386 50 MHz 486 90 MHz Pentium 133 MHz Alpha
The 100The 100--µµs Disasters Disaster
21© 2009 University of Karlsruhe, System Architecture Group
0
50
100
0 50 100 150
msg len
[µs]
Mach
L4
IPC Costs (486, 50 MHz)
0
100
200
300
400
0 2000 4000 6000
msg len
Mach
raw copy
[µs]
L4
22© 2009 University of Karlsruhe, System Architecture Group
0%
5%
10%
15%
20%
25%
30%
35%
40%
45%
50%
100 200 400 800 1600 3200 6400 12800 25600 51200 102400 204800
Mach 486
Mach Alpha
Chorus 486
Spin Alpha
L4 Pentium
average cycles between successive IPCs
over
head
due
to IP
C
Overhead due to IPC
~2.400 cycles
~140.000 cycles
23© 2009 University of Karlsruhe, System Architecture Group
L4 IPC: ’95—’00
82
16
23
36
55
7
38
0
20
40
60
80
100
120
140
Pentium R4600 Alpha
0.73 µs (Pentium 166 MHz)
0.91 µs (R4600 100 MHz)
0.10 µs (21164 433 MHz)
[cycles]
236
180
200
50
100
150
200
250
Pent III P3 Sysops P3 Lipc ?
0.47 µs (P III 500 MHz)
0.36 µs (P III 500 MHz)
0.04 µs (P III 500 MHz)
(hopefully)
24© 2009 University of Karlsruhe, System Architecture Group
82
16
23
36
55
7
38
0
20
40
60
80
100
120
140
Pentium R4600 Alpha
0.73 µs (Pentium 166 MHz)
0.91 µs (R4600 100 MHz)
0.10 µs (21164 433 MHz)
[cycles]
L4 IPC Performance
984
180169
148
0
20
40
60
80
100
120
140
160
180
200
220
P4 P3 Itanic1 Itanic2
0.23 µs (P III 800 MHz)
0.16 µs (Itanic2 900 MHz)
0.21 µs (Itanic1 800 MHz)
0.35 µs (P4 2.800 GHz)
25© 2009 University of Karlsruhe, System Architecture Group
L4Ka::Pistachio (today)
0.00
0.10
0.20
0.30
0.40
0.50
0.60
0.70
0 4 8 12 16 20 24 28 32 36 40 44 48 52 56 60
message length in words
[us]
IA32 800MHz Inter-AS
Itanium1 800MHz
Itanium2 1GHz
IA32 800MHz Intra-AS
26© 2009 University of Karlsruhe, System Architecture Group
Thesis
A A µµ--kernel does the job ifkernel does the job if
�� properly designed andproperly designed and
�� carefully implemented.carefully implemented.
� Minimality
� Architectural Integration
� Elegance
� Efficiency
� Flexibility
27© 2009 University of Karlsruhe, System Architecture Group
Thesis
A A µµ--kernel does the job ifkernel does the job if
�� properly designed andproperly designed and
�� carefully implemented.carefully implemented.
� Minimality
� Architectural Integration
� Elegance
� Efficiency
� Flexibility
When analyzing IPC performance,When analyzing IPC performance,
cycles are not the only thing to consider!cycles are not the only thing to consider!
28© 2009 University of Karlsruhe, System Architecture Group
Processor-DRAM Gap (Latency)
µProc60%/yr.
DRAM7%/yr.
1
10
100
1000
DRAM
CPU
1980
1981
1983
1984
1985
1986
1987
1988
1989
1990
1991
1992
1993
1994
1995
1996
1997
1998
1999
2000
1982
Processor-MemoryPerformance Gap(grows 50% / year)
Per
form
ance
Time
“Moore’s Law”
Slide originally fromDave Patterson, Parcon 98
29© 2009 University of Karlsruhe, System Architecture Group
Today’s Situation: Microprocessor
� Microprocessor-DRAM performance gap
� Time of a full cache miss in instructions executed1st Alpha (7000): 340 ns/5.0 ns = 68 clks x 2 instr. = 136 instr.
2nd Alpha (8400): 266 ns/3.3 ns = 80 clks x 4 instr. = 320 instr.
3rd Alpha (t.b.d.): 180 ns/1.7 ns =108 clks x 6 instr. = 648 instr.
� 1/2X latency x 3X clock rate x 3X instr/clock ⇒ ≈4.5X
� Minimize kernel-induced cache misses!
Slide originally fromDave Patterson, Parcon 98
30© 2009 University of Karlsruhe, System Architecture Group
99.85%
0.15%
L2 cache
� 8192 cache lines (256K)
� 12 lines used for IPC
98.83%
1.17%
L1 cache
� 1024 cache lines (16K + 16K)
� 12 lines used for IPC
Cache Working Sets
31© 2009 University of Karlsruhe, System Architecture Group
Multi-Processor Architectures
� Synchronization
� Bus locks
� Inter-processor interrupts
� NUMA behavior
� AMD Opteron
� Simultaneous multithreading (HyperThreading)
32© 2009 University of Karlsruhe, System Architecture Group
A Perfect Microkernel?
Proving minimality, necessarity and completeness would be nice but is impossible, since there is no agreed-upon metric and all is Turing-equivalent.
Jochen Liedtke, On µ-Kernel Construction, SOSP ‘95
33© 2009 University of Karlsruhe, System Architecture Group
Small Kernel ≠ Small Problem
SecuritySecurity
PerformancePerformance
MultiprocessorMultiprocessor
SchedulingScheduling
MemoryMemory
DevicesDevices
PersistencePersistence
Fault toleranceFault tolerance
Quality of serviceQuality of service
PortabilityPortability
Formal verificationFormal verification
Software engineeringSoftware engineering
34© 2009 University of Karlsruhe, System Architecture Group
�� ThreadsThreads
�� Address SpacesAddress Spaces
IPCIPC
MappingMapping
µ-Kernel Design
� A µ-kernel does no real work
� µ-kernel services are only required to overcome µ-kernel constraints
� Therefore, µ-kernels have to be infinitely fast!
MinimalityMinimality is the key!is the key!
35© 2009 University of Karlsruhe, System Architecture Group
Threads – IPC
A
B
C
� Enable controlled communication across address space boundaries
36© 2009 University of Karlsruhe, System Architecture Group
User-Level Device Drivers
A
B
DiskNet Gfx
IRQIRQ IRQ
C
37© 2009 University of Karlsruhe, System Architecture Group
Address Spaces – Mapping
� Setup shared memory regions
map
38© 2009 University of Karlsruhe, System Architecture Group
Address Spaces – Mapping
unmap
� Revoke shared memory regions
39© 2009 University of Karlsruhe, System Architecture Group
Address Spaces – Mapping
grant
� Donate memory regions to others
� Frees up virtual memory in the granting space
� Required for file servers, … (at least useful)
40© 2009 University of Karlsruhe, System Architecture Group
PF IPC
res IPC
Page Fault Handling
PagerApplicationmap msg
"PF" msg
41© 2009 University of Karlsruhe, System Architecture Group
EAXSPIP
…
EAXSPIP
…
Exception Handling
ExceptionHandler
Applicationcontinue msg
exception msg
Kernel modifies register contents according to reply
message
42© 2009 University of Karlsruhe, System Architecture Group
Physical Memory
Initial AS
Pager 1 Pager 2
Pager 3
Pager 4Application
Application Application
Application
Driver
Driver
Recursive Address Spaces
43© 2009 University of Karlsruhe, System Architecture Group
Mach Virtual Memory
Physical Memory
Paging PolicyMach
Application Application
Application
External PagerInflexible
44© 2009 University of Karlsruhe, System Architecture Group
Tolerated Concepts
A concept is tolerated in the µ-kernel if …
competing user-level implementations would violate system requirements.
45© 2009 University of Karlsruhe, System Architecture Group
Functional Requirements
� Principle of independence
� Subsystem Sprovides guarantees independent of S’
� Principle of integrity
� Other subsystems can rely on independence guarantees
S1S1 S2S2
S’S’
46© 2009 University of Karlsruhe, System Architecture Group
Requirement: Address Spaces
� µ-kernel must hide hardware address spaces
� Otherwise violates integrity principle
� But must permit arbitrary protection schemes [and non-protection]
� Solution: recursive construction of address spaces outside the kernel
47© 2009 University of Karlsruhe, System Architecture Group
Requirement: Threads
� A thread τ is an activity inside an address space with
� registers
� instruction pointer
� stack pointer
� state information
� σ(τ) := address space of thread τ
48© 2009 University of Karlsruhe, System Architecture Group
Why Tolerate Threads in µ-kernels?
� The decisive reason: σ(τ)� Modifications to address spaces
σ(τ) := σ’
must be controlled by the kernel
� Thus a notion of τ that represents the above
activity
� Additionally: concurrency
49© 2009 University of Karlsruhe, System Architecture Group
Requirement: IPC
� IPC = inter-process communication
� Inherently required in µ-kernel
� Classical approach
� Transfer messages between threads
� Contractual
� Sender determines what to send
� Receiver agrees to receive the information
50© 2009 University of Karlsruhe, System Architecture Group
Size Comparison
Linux (all platforms)5.7 Million lines
Mach 4 (x86)90,000 lines
L4Ka::Pistachio (x86+AMD64)45,000 lines
51© 2009 University of Karlsruhe, System Architecture Group
In this course …
You do learn
� How to design a µK
� Why L4 is sooooo fast
� Reasons why others failed
� Nitty-gritty details about IA32
� Some OS bashing …
� More cool stuff …
You don’t learn
� How to construct a system on a µK (�SDI)
� Linux dos and don’ts
� Operating system X is better than Y
52© 2009 University of Karlsruhe, System Architecture Group
Plan
� Overview, Motivation, Problems
� Threads, System-calls, and Thread Switching
� TCBs and Address Space Layout
� IPC Functionality and Implementation
� Dispatching
� Small Address Spaces - IPC
� Virtual Memory and Mapping Database
� Interrupts, Exceptions and CPU Virtualization
� Security
Many algorithms, often influencing the system design.