Improving IPC by Kernel Design
Jochen Liedtke
Shane Matthews
Portland State University
3/12/2004 Portland State University
Summary
• Review
• Performance improved
– Architecture Level
– Algorithmic Level
– Interface Level
– Coding Level
Micro-kernels
• Minimal OS, providing a set of primitives used to implement thread/address space management and IPC [1]
• Everything else is moved to user-space (servers)
Terminology (L3)
• Dataspace
– Memory object, mapped into address space
• Task
– Composed of threads, dataspaces, and an address space
• Message
– String/memory object
L3 Architecture & IPC
• Active components communicate via messages
• Applies to:
– Device drivers
  • Implemented as user level tasks
– Hardware interrupts
  • Interrupt message from micro-kernel to thread
L3 Redesign Principles
• IPC performance is the master
– Security and performance must not be affected
• Synergetic effects taken into consideration
– (Think combined effects)
– May lead to reinforcement or diminution
• Design must aim at performance goal
– Per short message transfer
– 350 cycles (7 microseconds)
Architectural Level
• Messages
• Process Structure
• Control Blocks
Compound Messages
• Multiple send/receive -> 1 send/receive
• Messages consist of direct/indirect strings and memory objects
Twofold message copy
• [A space] -> [kernel] -> [B space]
• About 20 + 0.75n cycles, where n is the message size in bytes
• Good for small messages
• Need something better as n grows
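To see why this cost matters as messages grow, the slide's estimate can be evaluated for a few message sizes. This is only a sketch: `copy_cost_cycles` and `cycles_to_us` are illustrative names, and the 50 MHz clock is an assumption taken from the 486 DX-50 test machine listed in the results.

```python
CLOCK_MHZ = 50  # assumption: the 486 DX-50 test machine from the results


def copy_cost_cycles(n_bytes: int) -> float:
    """Slide's estimate for the twofold copy: about 20 + 0.75 n cycles."""
    return 20 + 0.75 * n_bytes


def cycles_to_us(cycles: float, clock_mhz: float = CLOCK_MHZ) -> float:
    """Convert a cycle count to microseconds at the given clock rate."""
    return cycles / clock_mhz


for n in (8, 512, 4096):
    cycles = copy_cost_cycles(n)
    print(f"{n:5d} bytes: {cycles:7.0f} cycles = {cycles_to_us(cycles):6.2f} us")
```

Under these assumptions an 8-byte message costs 26 cycles, which is small next to the 350-cycle goal, while a 4096-byte message costs 3092 cycles, roughly nine times that goal.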
LRPC and SRC RPC
• Client/server share user level memory
– sender -> shared buffer
• Problems
– When one server has many clients, shared regions of address space become critical resources
– Shared regions require explicit opens (unlike L3)
– The message can change during or after checking
Direct Message Copy Via Windows
• L3's method
– Destination mapped into window
– Message copied to window
• Window
– per address space
– Accessed exclusively by kernel
Communication Windows
• Problems
– Must be fast
– Different threads coexisting within address space
• L3 Implementation
– Copy one word (a page directory entry) from page directory B to page directory A
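The window mechanism can be sketched as a toy model in which each address space has a page directory and the kernel maps the receiver's region into the sender's window by copying a single directory entry. Everything here is illustrative: `AddressSpace`, `send_via_window`, `WINDOW_SLOT`, and the tiny region size are assumptions, and unmapping the window immediately after the copy is a simplification rather than a description of L3's actual policy.

```python
ENTRIES = 1024       # page-directory entries on a 486 (each covers 4 MB)
WINDOW_SLOT = 1000   # hypothetical index reserved for the communication window
REGION_BYTES = 64    # toy region size; a real entry covers 4 MB


class AddressSpace:
    def __init__(self, name: str):
        self.name = name
        # Each entry refers to a region of memory (a bytearray here) or None.
        self.page_directory = [None] * ENTRIES


def send_via_window(src: AddressSpace, dst: AddressSpace,
                    dst_slot: int, offset: int, message: bytes) -> None:
    """Kernel-side copy of `message` straight into the receiver's region."""
    # One-word operation: copy the receiver's directory entry into the window.
    src.page_directory[WINDOW_SLOT] = dst.page_directory[dst_slot]
    window = src.page_directory[WINDOW_SLOT]
    # Single copy: writing through the window writes the receiver's memory.
    window[offset:offset + len(message)] = message
    # Simplification: drop the mapping again right away.
    src.page_directory[WINDOW_SLOT] = None


a = AddressSpace("A")
b = AddressSpace("B")
b.page_directory[3] = bytearray(REGION_BYTES)   # receiver's buffer region
send_via_window(a, b, dst_slot=3, offset=8, message=b"hello")
print(bytes(b.page_directory[3][8:13]))          # b'hello'
```

Because the window entry and the receiver's entry refer to the same region, the message is copied once rather than twice.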
Process Structure
• Threads running in kernel mode have 1 kernel stack per thread
– Efficient since interrupts, page faults, and IPC already save state on the kernel stack
• Continuations
– Pro:
  • Reduce the number of kernel stacks
– Cons:
  • Require additional copies between kernel stack and continuation
  • Interfere with other optimizations
Thread Control Blocks
• Implemented as large array in kernel
– fast tcb access
• TCB address = array base + tcb number × tcb size
– Saves TLB misses (IPC)
• Kernel stacks of sender and receiver located in TCB page
– Locking done by unmapping the TCB
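The layout above can be illustrated with a little address arithmetic. This is a sketch under stated assumptions: the base address, the 1024-byte slot size, and the choice to place the kernel stack at the end of each slot are hypothetical. Only the 4 KB page size is a property of the 486. The point it illustrates is one plausible reading of the slide: if a thread's kernel stack lives in the same page as its TCB, touching both needs only one page translation.

```python
PAGE_SIZE = 4096               # 486 page size
TCB_SIZE = 1024                # hypothetical slot size: TCB plus kernel stack
TCB_ARRAY_BASE = 0xE000_0000   # hypothetical virtual base of the TCB array


def tcb_address(thread_number: int) -> int:
    """Address of a thread's control block: base + number * size."""
    return TCB_ARRAY_BASE + thread_number * TCB_SIZE


def kernel_stack_top(thread_number: int) -> int:
    """Assumed layout: the stack grows down from the end of the TCB slot."""
    return tcb_address(thread_number) + TCB_SIZE


def page_of(address: int) -> int:
    return address // PAGE_SIZE


t = 5
print(hex(tcb_address(t)))                                         # 0xe0001400
print(page_of(tcb_address(t)) == page_of(kernel_stack_top(t) - 1))  # True
```

With a slot size that divides the page size, no slot straddles a page boundary, so the TCB and its stack always share a page in this model.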
Algorithmic Level
• Thread Identifier
• Lazy Scheduling
• Short Messages Via Registers
Thread Identifier
• Thread addressed by 64-bit UID in user mode
• Thread number in lower 32 bits of UID
– AND with bit mask, add to TCB array base
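A minimal sketch of the lookup, with a Python list standing in for the TCB array. The names `Tcb`, `make_uid`, and `lookup` are illustrative. The check that the stored UID matches the requested one is an assumption added here to show how a stale identifier could be rejected; the slide itself only describes the mask and the add.

```python
from typing import List, Optional

THREAD_NO_MASK = 0xFFFF_FFFF   # lower 32 bits of the UID hold the thread number


class Tcb:
    def __init__(self, uid: int):
        self.uid = uid


def make_uid(upper: int, thread_number: int) -> int:
    """Build a 64-bit UID from an upper half and a thread number."""
    return (upper << 32) | (thread_number & THREAD_NO_MASK)


def lookup(tcb_array: List[Tcb], uid: int) -> Optional[Tcb]:
    """Mask out the thread number and index the TCB array with it."""
    number = uid & THREAD_NO_MASK
    if number >= len(tcb_array):
        return None
    tcb = tcb_array[number]
    # Assumed validity check: the full UID must match the one stored in the TCB.
    return tcb if tcb.uid == uid else None


tcb_array = [Tcb(make_uid(7, n)) for n in range(4)]
print(lookup(tcb_array, make_uid(7, 2)) is tcb_array[2])   # True
print(lookup(tcb_array, make_uid(8, 2)))                   # None
```

The lookup needs no search or hash table: one AND and one indexed access reach the TCB.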
Lazy Scheduling
• An IPC call or reply & receive next operation requires:
– Delete sending thread from ready queue
– Insert into waiting queue
– Delete receiving thread from waiting queue
– Insert into ready queue
• Too many queue operations!
Lazy Scheduling cont.
• L3 queue invariants
– Ready queue contains at least all ready threads
– Waiting queue contains at least all waiting threads
• TCB contains the thread's state (ready/waiting)
• While parsing a queue, the scheduler removes all threads that do not belong in it
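The invariants above can be sketched with a ready queue that is allowed to hold stale entries. This is a simplified model, not L3's code: `Tcb`, `LazyScheduler`, and the method names are hypothetical, only the ready queue is modelled, and the running thread is assumed to be put back on the queue when it is preempted.

```python
from collections import deque
from dataclasses import dataclass

READY, WAITING = "ready", "waiting"


@dataclass
class Tcb:
    name: str
    state: str = READY
    in_ready_queue: bool = False


class LazyScheduler:
    def __init__(self):
        self.ready_queue = deque()

    def make_ready(self, tcb: Tcb) -> None:
        """Eager path, used when a thread is preempted or created."""
        tcb.state = READY
        if not tcb.in_ready_queue:
            self.ready_queue.append(tcb)
            tcb.in_ready_queue = True

    def ipc_call(self, sender: Tcb, receiver: Tcb) -> Tcb:
        """IPC only flips state flags in the TCBs; no queue is touched."""
        sender.state = WAITING
        receiver.state = READY
        return receiver            # control passes directly to the receiver

    def pick_next(self):
        """Queue parsing: drop entries whose TCB state is no longer ready."""
        while self.ready_queue:
            tcb = self.ready_queue.popleft()
            if tcb.state == READY:
                self.ready_queue.append(tcb)   # keep it queued, round robin
                return tcb
            tcb.in_ready_queue = False         # stale entry removed lazily
        return None


sched = LazyScheduler()
client = Tcb("client")
server = Tcb("server", state=WAITING)
sched.make_ready(client)

current = sched.ipc_call(client, server)
print(len(sched.ready_queue))      # 1: the stale client entry is still queued
sched.make_ready(current)          # server is preempted later
print(sched.pick_next().name)      # server: the client entry was skipped
```

The IPC path performs no queue operations at all; the cost of cleaning up is paid only when the scheduler actually parses the queue.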
Short Messages Via Registers
• High proportion of messages are short
– Ex. Driver ack/error, hardware interrupts
• 486
– 7 general registers
– 3 needed for sender ID and result code
– 4 available
• 8-byte messages can be passed in registers using a coding scheme
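The slide does not describe the coding scheme itself, so the following only illustrates the underlying idea: an 8-byte message fits exactly into two 32-bit registers and can be recovered unchanged on the receiving side. The function names and the little-endian layout are assumptions made for this sketch.

```python
import struct


def pack_into_registers(message: bytes) -> tuple:
    """Split an 8-byte message into two 32-bit register values."""
    if len(message) != 8:
        raise ValueError("register transfer is limited to 8-byte messages")
    return struct.unpack("<II", message)


def unpack_from_registers(reg_lo: int, reg_hi: int) -> bytes:
    """Rebuild the 8-byte message from the two register values."""
    return struct.pack("<II", reg_lo, reg_hi)


message = b"ACK\x00\x00\x00\x00\x01"
reg_lo, reg_hi = pack_into_registers(message)
print(hex(reg_lo), hex(reg_hi))                          # 0x4b4341 0x1000000
print(unpack_from_registers(reg_lo, reg_hi) == message)  # True
```

A message passed this way never touches memory during the transfer, which is why short acknowledgements and interrupt messages benefit most.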
Interface Level
• Simple RPC stubs
– Load registers, system call, check success
– Compiler generates stubs inline
• Parameter Passing
– Use registers when possible
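The three stub steps (load registers, system call, check success) can be sketched as follows. This is a toy model: `ipc_call` is a stand-in for the kernel entry point, the register names and the destination number are hypothetical, and the "server" simply adds two words.

```python
OK = 0


def ipc_call(registers: dict) -> dict:
    """Stand-in for the kernel's IPC system call (hypothetical)."""
    # Toy server behind the call: adds the two message words.
    return {"result": OK, "w0": registers["w0"] + registers["w1"], "w1": 0}


def add_stub(x: int, y: int) -> int:
    """Client stub for a remote add(x, y)."""
    registers = {"dest": 42, "w0": x, "w1": y}   # 1. load registers
    reply = ipc_call(registers)                  # 2. system call
    if reply["result"] != OK:                    # 3. check success
        raise RuntimeError(f"IPC failed with code {reply['result']}")
    return reply["w0"]


print(add_stub(2, 3))   # 5
```

Because the stub is only a few operations long, generating it inline at each call site avoids the cost of an extra procedure call.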
Coding Level
• Reduce cache and TLB misses
– Short kernel code
  • Short jumps, use registers, short address displacements
– IPC kernel code in one page
– Handle save/restore of coprocessor lazily
  • Delayed until a different thread needs to use it
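Lazy coprocessor handling can be sketched as follows: a thread switch leaves the coprocessor state alone, and the save and restore happen only when a thread other than the current owner first uses the coprocessor. The `Cpu` class, its counter, and the method names are illustrative assumptions, not the kernel's interface.

```python
class Cpu:
    def __init__(self):
        self.current = None      # running thread
        self.fpu_owner = None    # thread whose state is in the coprocessor
        self.fpu_saves = 0       # number of save/restore operations performed

    def switch_to(self, thread: str) -> None:
        """A thread switch does not touch the coprocessor state."""
        self.current = thread

    def use_fpu(self) -> None:
        """Called when the running thread executes a coprocessor instruction."""
        if self.fpu_owner != self.current:
            if self.fpu_owner is not None:
                self.fpu_saves += 1          # save old owner's state, load ours
            self.fpu_owner = self.current


cpu = Cpu()
cpu.switch_to("a"); cpu.use_fpu()   # a becomes the owner, nothing to save
cpu.switch_to("b")                  # b never uses the coprocessor
cpu.switch_to("a"); cpu.use_fpu()   # still a's state: no save needed
print(cpu.fpu_saves)                # 0
cpu.switch_to("c"); cpu.use_fpu()   # a different thread needs it now
print(cpu.fpu_saves)                # 1
```

In this run three thread switches cost a single save, since threads that never use the coprocessor never trigger one.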
Results
• An increase of 100% means the IPC time doubles
• Removing all of the optimizations increases IPC time by 134% for an 8-byte message (2.34 times the optimized time)
Results
• L3 vs. Mach
• System
– Intel 486 DX-50
– 256 KB external cache
– 16 MB memory
Results cont.
Conclusions
• IPC improved by applying
– Performance based reasoning
– Synergetic effects
– Architecture -> coding
References
• [1] http://en.wikipedia.org/wiki/Micro_kernel
• [2] Jochen Liedtke, "Improving IPC by Kernel Design"