![Page 1: 1 Presented by: Jeff Schaffer Sr. Field Applications Engineer QNX Software Systems jpschaffer@qnx.com 818-227-5105 Embedded Operating Systems: The State](https://reader036.vdocuments.us/reader036/viewer/2022062618/5514a95255034640138b4d12/html5/thumbnails/1.jpg)
Presented by:
Jeff Schaffer
Sr. Field Applications Engineer
QNX Software Systems
jpschaffer@qnx.com
818-227-5105
“Embedded Operating Systems:
The State of the Art”
QNX is a leading provider of real time operating system (RTOS) software, development tools, and services for mission critical embedded applications.
Role of the Embedded OS
Traditional
– Permit sharing of common resources of the computer (disks, printers, CPU)
– Provide low-level control of I/O devices that may be complex, time dependent, and non-portable
– Provide device-independent abstractions (e.g. files, filenames, directories)
Additional Roles
– Prevent common causes of system failure and instability; minimize impact when they occur
– Extend system life cycles
– Isolate problems during development and at runtime
Architecture Comparison
REAL TIME EXECUTIVE
Advantage: single address space
Disadvantage: single address space, different binary images
Failure: means reboot
MONOLITHIC KERNEL
Advantage: apps run in own memory space
Disadvantage: kernel not protected, kernel testing
Failure: might mean reboot
TRUE MICROKERNEL
Advantage:
– Modules run in own memory space
– Add/replace services on the fly
– Reusable modules
– Direct hardware access
Disadvantage: context switching
Failure: usually does not mean reboot
MicroKernel – Neutrino
[Diagram: a microkernel (x86, PPC, MIPS, SH4, ARM, StrongARM, XScale) surrounded by separate processes: App, Photon GUI, Flash fsys, Audio driver, TCP/IP, Serial driver, HTTP server, Java, Process Manager]
• Dynamic architecture makes hot-start and upgrades easy, even with drivers
• Philosophy: a trusted kernel running a system of untrusted software components
• Processes provide a reusable component model with well defined message interfaces
• Processes communicate via messages or other methods, such as shared memory. Permits loose inter-module coupling.
• No requirement for filesystem, GUI, etc.
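The message-based coupling between components can be illustrated with a minimal sketch. It assumes Python threads standing in for separate processes and a `queue.Queue` standing in for a message channel; `server`, `send`, and the upper-casing "service" are hypothetical names for illustration, not the Neutrino API.

```python
import queue
import threading

def server(channel):
    """Receive requests until a None sentinel arrives, replying to each one."""
    while True:
        msg = channel.get()
        if msg is None:          # shutdown request
            break
        payload, reply_to = msg
        reply_to.put(payload.upper())   # the "service" performed for the client

def send(channel, payload):
    """Send one message and block until the server replies."""
    reply_to = queue.Queue(maxsize=1)
    channel.put((payload, reply_to))
    return reply_to.get()

channel = queue.Queue()
worker = threading.Thread(target=server, args=(channel,))
worker.start()
reply = send(channel, "read sensor")
channel.put(None)
worker.join()
print(reply)
```

The client knows only the channel and the message format, which is the loose coupling the slide refers to: the server could be replaced or restarted without changing client code.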
Typical Forms of IPC
[Diagram: forms of IPC between Process 1 and Process 2. Labels: Mailboxes, Kernel, Pipes, Shared Memory (a shared memory object mapped into each process address map), Message Queues (msg 2 to msg 5 queued between the two processes)]
Which Architecture for me?
• Depends on your application and processor!
• Simple apps (such as single control loops) generally only need a real-time executive
• As system becomes more complex, typically need a more complex operating system architecture
• Need to look at factors such as scalability and reliability
• Do standards matter?
APIs
Two most common standards
Advantages of standards
– Portability of code
– Hiring of programmers
Do I need Real-Time?
What is Real Time?
– Less than 1 second response?
– Less than 1 millisecond response?
– Less than 1 microsecond response?
Maybe ...
Real-Time
"A real-time system is one in which the correctness of the computations not only
depends upon the logical correctness of the computation but also upon the time at which
the result is produced. If the timing constraints of the system are not met, system
failure is said to have occurred."
Donald Gillies (comp.realtime FAQ)
A Simple Example...
“it doesn’t do you any good if the signal that cuts fuel to the jet engine arrives a millisecond after the engine
has exploded”
Bill O. Gallmeister - POSIX.4 Programming for the Real World
ATM
“Hard” vs. “Soft” Real Time
Hard
– absolute deadlines
– late responses cannot be tolerated and may have a catastrophic effect on the system
– example: flight control
Soft
– systems which have reduced constraints on "lateness"; e.g. late responses may still have some value
– still must operate very quickly and repeatably
– example: cardiac pacemaker
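One way to picture the hard/soft distinction is as the "value" of a result as a function of when it arrives. The sketch below is illustrative only: the function name `result_value`, the `grace` parameter, and the linear decay for soft deadlines are assumptions, not something the slides define.

```python
def result_value(finish_time, deadline, hard, grace=0.0):
    """Return the usefulness of a result, from 1.0 (on time) down to 0.0.

    hard=True : any lateness makes the result worthless (system failure).
    hard=False: a late result keeps some value, decaying linearly to zero
                over `grace` time units past the deadline (assumed model).
    """
    if finish_time <= deadline:
        return 1.0
    if hard or grace <= 0:
        return 0.0
    lateness = finish_time - deadline
    return max(0.0, 1.0 - lateness / grace)

print(result_value(11, 10, hard=True))            # late hard result
print(result_value(11, 10, hard=False, grace=4))  # late soft result
```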
Real-time OS Requirements
Operating system factors that permit real-time:
– Thread Scheduling
– Control of Priority Inversion
– Time Spent in Kernel
– Interrupt Processing
Factor #1: Scheduling
Non real-time scheduling
– round-robin
– FIFO
– adaptive
Real-time scheduling
– priority based
– sporadic
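The difference between arrival-order (FIFO) scheduling and priority-based pre-emptive scheduling can be seen in a small tick-by-tick simulation. This is a sketch under assumed inputs: the task tuples `(name, arrival, work, priority)` and both function names are hypothetical, and a larger number means higher priority.

```python
def fifo_finish_times(tasks):
    """Run tasks to completion in arrival order, ignoring priority."""
    now, finish = 0, {}
    for name, arrival, work, _priority in sorted(tasks, key=lambda t: t[1]):
        now = max(now, arrival) + work
        finish[name] = now
    return finish

def priority_finish_times(tasks):
    """Each tick, run the highest-priority ready task (pre-emptive)."""
    remaining = {name: work for name, _, work, _ in tasks}
    finish, tick = {}, 0
    while len(finish) < len(tasks):
        ready = [t for t in tasks if t[1] <= tick and t[0] not in finish]
        if ready:
            name = max(ready, key=lambda t: t[3])[0]
            remaining[name] -= 1
            if remaining[name] == 0:
                finish[name] = tick + 1
        tick += 1
    return finish

# A long low-priority job arrives first; a short urgent job arrives one tick later.
TASKS = [("logger", 0, 5, 1), ("control", 1, 1, 10)]
fifo = fifo_finish_times(TASKS)
prio = priority_finish_times(TASKS)
print(fifo["control"], prio["control"])
```

Under FIFO the urgent task waits behind the whole logging job; with priorities it pre-empts as soon as it becomes ready, which is what makes its response time predictable.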
Factor #2: Priority Inversion
Sequence:
1. Low priority task acquires bus mutex to transfer data
2. High priority task blocks until mutex released
3. Medium priority task pre-empts low priority task
4. Watchdog timer resets since Bus Manager has not run in some time
[Diagram tasks: Information Bus Manager, Meteorological Data Gathering Task, Communications Task]
Source: Embedded Systems Programming
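The sequence above can be reproduced in a small simulation, together with one common remedy, priority inheritance (the mutex owner temporarily runs at the priority of the highest-priority task it is blocking). The slide does not name the remedy; the task set, tick model, and names below are all assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    priority: int      # larger number = higher priority
    arrival: int       # tick at which the task becomes ready
    work: int          # ticks of CPU time needed
    needs_mutex: bool  # True if all of its work is done holding the shared mutex

def simulate(tasks, inherit):
    """Return {task name: finish tick} for a single CPU and one shared mutex."""
    remaining = {t.name: t.work for t in tasks}
    finish, owner, tick = {}, None, 0
    while len(finish) < len(tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.name not in finish]
        blocked = [t for t in ready
                   if t.needs_mutex and owner is not None and owner is not t]
        runnable = [t for t in ready if t not in blocked]

        def effective(t):
            # With inheritance, the owner is boosted to its highest blocked waiter.
            if inherit and t is owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        if runnable:
            current = max(runnable, key=effective)
            if current.needs_mutex and owner is None:
                owner = current                  # acquire the mutex
            remaining[current.name] -= 1
            if remaining[current.name] == 0:
                finish[current.name] = tick + 1
                if owner is current:
                    owner = None                 # release on completion
        tick += 1
    return finish

TASKS = [
    Task("low", 1, arrival=0, work=3, needs_mutex=True),
    Task("medium", 2, arrival=2, work=5, needs_mutex=False),
    Task("high", 3, arrival=1, work=2, needs_mutex=True),
]
without = simulate(TASKS, inherit=False)
with_inheritance = simulate(TASKS, inherit=True)
print(without["high"], with_inheritance["high"])
```

Without inheritance the high-priority task finishes after the medium-priority one, even though it never competes with it directly; with inheritance the low-priority owner finishes its critical section first and the high-priority task runs next.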
Factor #3: Kernel Time
• Kernel operations must be pre-emptible
– if they are not, an unknown amount of time can be spent in the kernel performing an operation on behalf of a user process
– can cause real-time process to miss deadline
• All kernels have some window (or multiple windows) of time where pre-emption cannot occur
• Some operating systems attempt to provide real-time capability by adding “checkpoints” within the kernel so they can be interrupted at these points
Example
A kernel call is a software interrupt
[Diagram: timeline of a kernel call, from "int KER" to "iret". Phases labelled:
– Entry: a few opcodes, interrupts off
– Unlocked: kernel operation, which may include message pass; usecs to msecs; pre-emptable
– Locked: usecs; no pre-emption; interrupts on
– Unlocked: usecs; pre-emptable
– Exit: a few opcodes, interrupts off]
Split Out Long Operations
[Diagram: Process Manager, with two groups of operations and the labels Nto and Proc:
– Thread, Sync, Message, Sched, Signal, Channel, Clock, Timer, Intr
– Fork, Exec, Pathname, Spawn, Mmap, Waitpid, Session, UID/GID, Debug]
Factor #4: Interrupts
This is broken down into the following areas:
• Method of handling the interrupt processing chain
• Handling of Nested Interrupts
Interrupt Processing Chain
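[Diagram 1: INT x and INT y each trigger an interrupt service routine (ISR), which hands off to an interrupt service thread (IST)]
IST scheduled whenever queue emptied, non-deterministic
[Diagram 2: INT x and INT y each trigger an ISR, which hands off to an IST]
IST scheduled by normal OS scheduling, deterministic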
Can I Make Any Conventional OS Real-Time?
[Diagram: Conventional OS running on top of a Real-time kernel]
Method
– Add real-time layer below conventional OS, running conventional OS as a low priority real-time process
– Add real-time layer to hardware service layer
Problems
– different APIs
– real-time layer proprietary
– existing OS apps not R/T
– poor communication between operating systems
– loss of control issue
Scalability
Scaling Solution #1:Single Board, Single Node
[Diagram: a single node with CPU, Bridge, Mem., Bus, PCI, Peripherals]
The only scaling possible is a CPU replacement
Scaling Solution #2:Single Board, Multiple Nodes
• Relatively simple to implement
• Allows “scaling-on-demand”
• Suitable if nodes have independent “work”
• Inter-node IPC slower than memory access
• Complexity in maintaining global view of data
• Difficult to break up computationally-intensive tasks
[Diagram: two nodes (Node 1, Node 2) on one board, with blocks labelled CPU, Bridge, Mem., Bus, PCI, Peripherals]
Scaling Solution #3:Single Board, Multiple Processors
[Diagram: CPU0 and CPU1 sharing one Bridge, Mem., Bus, PCI, Peripherals]
• Tightly-coupled symmetric multiprocessing (SMP)
• All processors have a symmetric and consistent view of physical memory and peripherals
• Scales processing power
• Need software (RTOS) support
The SMP OS Dilemma
SMP systems to date use desktop operating systems; not responsive enough for real-time requirements
• Application servers
• Databases
• Web servers
Typical real-time operating systems (home-built or commercial), such as are commonly used in routers and switches today, do not have SMP support
SMP capable real-time operating systems run the CPU’s as independent processors with independent operating systems
SMP Support
True (tightly coupled) SMP support
Only the kernel needs SMP awareness
Transparent to application software and drivers - identical binaries for UP and SMP systems
Automatic scheduling across all CPU’s
QNX “True” SMP
[Diagram: a running thread on each of CPU 0 and CPU 1 (threads belong to processes), priority-ordered ready queues (priority 63, 62, 61 ... 0) holding ready threads, and threads in blocked states]
• STATE_RUNNING thread on each processor
• Priority-based ready queues
• Each thread can be locked to a specific CPU by using a processor affinity mask
• Scheduler remembers last CPU thread ran on
– Minimize thread migration
– Optimize cache usage
• Highest-priority READY thread always immediately scheduled
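The interaction between an affinity mask and "remember the last CPU" can be sketched as a CPU-selection rule. This is not the QNX scheduler; `choose_cpu` and its tie-breaking (lowest-numbered allowed CPU) are assumptions made only to illustrate the two bullets.

```python
def choose_cpu(affinity_mask, last_cpu, idle_cpus):
    """Pick a CPU for a ready thread.

    affinity_mask: bit n set means the thread may run on CPU n.
    last_cpu:      CPU the thread last ran on (its cache may still be warm).
    idle_cpus:     set of CPUs currently available to run the thread.
    Returns a CPU number, or None if no permitted CPU is available.
    """
    allowed = [cpu for cpu in sorted(idle_cpus) if affinity_mask & (1 << cpu)]
    if not allowed:
        return None
    if last_cpu in allowed:
        return last_cpu      # avoid migration, keep cache contents useful
    return allowed[0]

print(choose_cpu(0b11, last_cpu=1, idle_cpus={0, 1}))  # prefers last CPU
print(choose_cpu(0b01, last_cpu=1, idle_cpus={0, 1}))  # mask forbids CPU 1
```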
Why Is Cache Important?
Cache efficiency is probably the single largest determinant of performance on SMP
Coherent view of physical memory is maintained using cache snooping
Cache snooping is done at the CPU bus level and so operates at lower speeds than core
Coherency is “invisible” to software
Performance Implications
• Snoop traffic expected on SMP
• Cache hits generally cause no bus transaction
• Multiple processors writing to same location degrades performance (ping-pong effect)
• Performance degrades when large amount of data modified on one processor and read on the other
• Sometimes it is better to have specific threads in a process run on same CPU
Designing for SMP: One Big Task
[Diagram: a Giant App containing a single thread]
• Will not work with SMP
Designing for SMP: Single Threaded Tasks
[Diagram: App 1 and App 2, each containing a single thread]
• Works with SMP
• Process data can be shared with shared memory
• Good concurrency, some complexity
• IPC not usually as efficient as memory sharing
Designing for SMP: Scaling Software with Threads
[Diagram: a Server process containing multiple threads]
• Single copy server
• All process data is implicitly shared and accessible
• Can achieve good concurrency with less complexity
• POSIX synchronization used
– Mutexes
– Semaphores
– Condition variables
– Usually more efficient than inter-process synchronization
Note: SMP finds concurrency problems fast!
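Because all process data is implicitly shared between threads, every update to shared state needs a synchronization object around it. The sketch below uses Python's `threading.Lock` as a stand-in for a POSIX mutex; `SharedCounter` is a hypothetical class for illustration.

```python
import threading

class SharedCounter:
    """A counter that several threads may increment safely."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # The read-modify-write below must not interleave between threads.
        with self._lock:
            self.value += 1

def add_many(counter, n):
    for _ in range(n):
        counter.increment()

counter = SharedCounter()
threads = [threading.Thread(target=add_many, args=(counter, 10_000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

Removing the lock is the kind of bug that can stay hidden on a uniprocessor and surface quickly once threads truly run in parallel.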
Optimizing Compute-intensive Applications
[Diagram: an Application with a main thread and several worker threads]
• Pool of worker threads
• Dispatch “work” to worker threads
• Scales very well with SMP
• The tricky part is “breaking up” the problem
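A minimal sketch of the worker-pool pattern, assuming a problem (a sum of squares) that splits cleanly into independent ranges. The function names are hypothetical. Note that in CPython the global interpreter lock limits CPU-bound speed-up from threads, so this shows the dispatch structure rather than a performance result.

```python
from concurrent.futures import ThreadPoolExecutor

def sum_of_squares(lo, hi):
    """The unit of work handed to one worker: squares of lo..hi-1."""
    return sum(i * i for i in range(lo, hi))

def parallel_sum_of_squares(n, workers=4):
    """Break 0..n-1 into chunks, dispatch them to a pool, combine the results."""
    step = max(1, (n + workers - 1) // workers)
    chunks = [(lo, min(lo + step, n)) for lo in range(0, n, step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum_of_squares(*chunk), chunks)
        return sum(partials)

print(parallel_sum_of_squares(1000))
```

The chunking step is the "breaking up" part: it works here only because each range can be computed without reference to the others.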
Interrupt Handling
[Diagram: IRQ 7 to IRQ 10 routed to CPU 0 and CPU 1; each ISR hands off to an IST]
IRQ   CPU
7     0
8     1
9     1
10    1
• Interrupt processed on CPU that was targeted
• Can distribute load by handling interrupts on different processors
• Sometimes not the optimal strategy due to cache effects
Scaling Solution #4:Multiple Processors/Nodes
[Diagram: two nodes (Node 1, Node 2), each with CPU0 and CPU1 and blocks labelled Bridge, Mem., Bus, PCI, Peripherals]
Example
[Diagram: a chassis containing line cards, each attached to a network, joined by a high-speed interconnect and a low-speed bus]
The QNET MicroNetwork
[Diagram: two microkernel nodes, each with its own message bus, joined by QNET over a LAN, the Internet, or a backplane. Processes shown: App, Flash Fsys, CDROM Fsys, TCP/IP, Audio, Photon, App, Process Manager]
Messages flow transparently through QNET from one message bus to another.
All applications and servers become network distributed without any special code.
QNX Qnet Manager
[Diagram: two line cards and a control card]
• Extends message passing across multiple QNX microkernels
• Over anything with a packet driver:
– Ethernet, RapidIO, 3GIO, InfiniBand, Stargen, etc.
• Class of service
• Use symbolic prefixes to make client code independent of location of resource manager
QNET Class of Service
[Diagram: a control card linked to two line cards]
One or multiple links can connect different nodes.
QNET: Load-Balanced Distribution
[Diagram: a control card linked to two line cards]
Data is sent out the link which will deliver it the fastest. This is based upon link speed and queue length for each link.
QNET: Ordered Distribution
[Diagram: a control card linked to two line cards]
Data is sent out a primary link. If it fails, data is diverted to a secondary link. The primary link is probed and when it comes back online, data is diverted back to it.
QNET: Parallel Distribution
[Diagram: a control card linked to two line cards]
Data is sent out both links at the same time. A failure on either of the links is handled gracefully.
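The three class-of-service policies (load-balanced, ordered, parallel) can be compared as link-selection rules. This is a sketch, not Qnet code: the `Link` fields, the function names, and the delivery-time estimate (queued data plus message size, divided by link speed) are assumptions; the slides say only that the choice depends on link speed and queue length.

```python
from dataclasses import dataclass

@dataclass
class Link:
    name: str
    speed_bps: float   # link speed in bits per second
    queued_bits: int   # data already waiting to go out on this link
    up: bool = True

def load_balanced(links, message_bits):
    """Pick the live link expected to deliver the message soonest."""
    live = [link for link in links if link.up]
    if not live:
        return None
    return min(live, key=lambda l: (l.queued_bits + message_bits) / l.speed_bps)

def ordered(links):
    """Pick the first live link in preference order (primary, then secondary)."""
    for link in links:
        if link.up:
            return link
    return None

def parallel(links):
    """Send on every live link at once; losing one link loses no data."""
    return [link for link in links if link.up]

fast = Link("fast", speed_bps=1e9, queued_bits=8_000_000)
slow = Link("slow", speed_bps=1e8, queued_bits=0)
print(load_balanced([fast, slow], 8_000).name)  # a busy fast link can lose
```

In the ordered policy, returning to the primary once it is probed as healthy falls out naturally: the primary is first in the list, so it is chosen again as soon as its `up` flag is set.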
Designing for Networked SMP: Single/Multi Threaded Tasks
[Diagram: App 1 with multiple threads, App 2 with a single thread]
• Different processes necessary for different nodes
• Works with SMP
• Process data can be shared with shared memory
• IPC for networked communication
Transparent Redirection
[Diagram: a client on the client node opens /service, a link that points to /net/a/dev/service on node A or /net/b/dev/service on node B]
• Simple link provides transparent redirection
• Process has to monitor status of link
• Switch over is not transparent to client
Transparent Redirection
[Diagram: a client on the client node opens /dev/service, provided by a service manager that forwards to /net/a/dev/service on node A or /net/b/dev/service on node B]
• Service manager acts as a proxy
• Monitors health of and/or load on services/nodes
• Switch over is transparent to client
Redundant Links
[Diagram: a client on the client node opens /dev/service, provided by a service manager that sends each request to both /net/a/dev/service on node A and /net/b/dev/service on node B]
• Requests serviced redundantly
• First/majority/best result
• Different implementations
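When the same request is serviced redundantly, the service manager has to pick one answer. Two of the strategies named above, "first" and "majority", can be sketched as follows; the function names and the use of `None` for "no answer" are assumptions for illustration.

```python
from collections import Counter

def first_result(results):
    """Return the first answer that actually arrived (None means no answer)."""
    for result in results:
        if result is not None:
            return result
    return None

def majority_result(results):
    """Return the answer given by more than half of the replicas, else None."""
    if not results:
        return None
    value, count = Counter(results).most_common(1)[0]
    return value if count > len(results) / 2 else None

print(first_result([None, 7, 8]))
print(majority_result([42, 42, 41]))
```

Majority voting is most useful with the "different implementations" bullet: independently written replicas are less likely to share the same fault.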
[Diagram: three nodes joined by Qnet over a MOST bus. Node labels: (1) Flash Fsys, TCP/IP, App, App, BlueTooth; (2) Flash Fsys, Graphics, Browser, Audio, Photon; (3) CDROM Fsys, Graphics, Browser, Audio, Photon]
[Diagram: three nodes joined by Qnet over a MOST bus. Labels: Flash Fsys, TCP/IP, App, App, BlueTooth; Flash Fsys, Graphics, Browser, Audio, Photon; CDROM Fsys, Graphics; Browser]
Reliability and Availability
Why?
• Embedded systems are different!
• Failure in an embedded system can have severe effects - like death …
“Pilots really hate to be told they have to reboot their plane while in flight”
Walter Shawlee
Definitions
MTBF: Mean Time Between Failure
– The average number of hours between failures for a large number of components over a long time. (e.g. MIL-HDBK-217)
MTTR: Mean Time To Repair
– Total amount of time spent performing all corrective maintenance repairs divided by the number of repairs
MTBI: Mean Time Between Interruptions
– The average number of hours between failures while a redundant component is down.
Defining HA
Reliability
– Quantified by failure rate (MTBF)
– Time to resume service after failure is MTTR
Availability
– Allows for failure, with quick service restoration.
– As MTTR → 0, Availability → 100%
5 Nines
– < 5 minutes downtime / year (> 99.999% uptime)
– Assume faults exist: design to contain, notify, recover and restore rapidly
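The relationship between MTBF, MTTR and availability can be made concrete with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The slides do not state this formula; it is supplied here as the usual definition, and the function names and the 365-day year are assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # assumes a 365-day year

def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up, given mean time between failures
    and mean time to repair (both in hours)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)

def downtime_minutes_per_year(avail):
    """Expected minutes of downtime per year at the given availability."""
    return (1.0 - avail) * MINUTES_PER_YEAR

print(availability(999, 1))                 # one hour of repair per 1000 hours
print(downtime_minutes_per_year(0.99999))   # the "five nines" budget
```

With MTBF fixed, shrinking MTTR is the only lever left, which is why fast component restart matters more than it first appears.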
Annual Cost of Downtime versus Availability
Annual availability    Annual losses
99%                    $68,372,928
99.9%                  $6,837,293
99.99%                 $683,729
99.999%                $68,373
Source: Gartner Group ($13,000/minute Cross-industry Average)
Costs speak for themselves
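The figures in the chart follow directly from the $13,000/minute average. The sketch below reproduces them; the 365.24-day year is an inference made here because it matches the published numbers, not something the slide states, and `annual_downtime_cost` is a hypothetical name.

```python
MINUTES_PER_YEAR = 365.24 * 24 * 60   # inferred: reproduces the slide's figures
COST_PER_MINUTE = 13_000              # Gartner cross-industry average, per the slide

def annual_downtime_cost(availability_percent):
    """Dollars lost per year at the given availability, rounded to the dollar."""
    downtime_fraction = (100 - availability_percent) / 100
    return round(downtime_fraction * MINUTES_PER_YEAR * COST_PER_MINUTE)

for percent in (99, 99.9, 99.99, 99.999):
    print(f"{percent}%: ${annual_downtime_cost(percent):,}")
```

Each additional nine divides the annual loss by ten.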
Availability via Reliability and Repair
• low MTTR -> high availability
– System is composed of reliable components that are protected from each other and that communicate ONLY through well known interfaces.
• this leads to
– fault isolation
– speedy recovery
– reset a component not a board/system
– dynamic control
  • stop/start
  • upgrade
Software vs Hardware HA
Hardware HA
– utilizes redundancy of key components
• a single fault cannot cause all redundant components to fail (No SPOF). e.g. mirrored disks, multiple system boards, I/O cards
– Active/active, active/spare, active/standby
Software is a Significant Cause of Downtime
But that’s only part of the problem!!!
Comparison
[Pie chart: share of downtime by cause]
Software Fault: 40%
Planned Outage: 30%
Operator Error: 15%
Hardware: 10%
Environment: 5%
High Level Look at a Core Router/Switch
One or more control elements
[Diagram: a 20-slot shelf holding OCLD cards (1W to 4W and 1E to 4E), OCI cards (1A/1B to 4A/4B), OCM cards (A, B), a Shelf Processor and a Filler, with a Maintenance Panel, Fiber Management Trough, Optical Multiplexer Tray (OMX) and Cooling Unit]
Handling Failures
[Diagram: the 20-slot shelf of OCLD, OCI and OCM cards with Shelf Processor, Filler, Maintenance Panel, Fiber Management Trough, Optical Multiplexer Tray (OMX) and Cooling Unit]
Isolate Fault to a Board
Switch to Backup
[Diagram: the 20-slot shelf of OCLD, OCI and OCM cards with Shelf Processor, Filler, Maintenance Panel, Fiber Management Trough, Optical Multiplexer Tray (OMX) and Cooling Unit; one board is expanded to show its software: Route Manager, TCP/IP stack, SNMP Manager, Applications, Flash Drivers, Device Manager, Network Manager, all running on the RTOS and Hardware]
May not be in the Hardware
Isolate fault to a SW component
60
Ideal: Identify and Fix
[Figure: software stack of applications, Route Manager, SNMP Manager, TCP/IP stack, Network Manager, Device Manager, and Flash Drivers on the RTOS]
Faulty Software Component
• Isolate and contain
• Repair (e.g. restart)
• Notify
• Diagnose
• Upgrade
61
Component-level recovery rarely done
– Lack of suitable protection and isolation
– Lack of modularity
– Tight component coupling
– Few dynamic capabilities
Software failures normally handled by:
– Hardware watchdogs
– Redundant boards
62
Repair Time

| Recovery method | Repair time |
|---|---|
| Board replacement | Hours |
| Reboot | Minutes |
| Failover to standby | Seconds |
| SW component restart | Tens of milliseconds |
| SW failover | Milliseconds |
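
The effect of repair time can be made concrete with the standard steady-state availability formula, availability = MTTF / (MTTF + MTTR). The sketch below applies it to the recovery methods listed above. The mean time to failure of 1,000 hours and the specific repair times (4 hours, 3 minutes, and so on) are illustrative assumptions picked to match each order of magnitude; they are not figures from the presentation.

```python
# Steady-state availability = MTTF / (MTTF + MTTR).
# Every number below is an illustrative assumption, not a measured value.

HOUR_S = 3600.0  # seconds per hour

ASSUMED_MTTF_S = 1000 * HOUR_S  # hypothetical: one failure per 1,000 hours of uptime

# Hypothetical repair times, one per recovery method, slowest first.
ASSUMED_REPAIR_TIMES_S = {
    "Board replacement": 4 * HOUR_S,
    "Reboot": 3 * 60.0,
    "Failover to standby": 5.0,
    "SW component restart": 0.050,
    "SW failover": 0.005,
}


def availability(mttf_s, mttr_s):
    """Fraction of time the system is up."""
    return mttf_s / (mttf_s + mttr_s)


def downtime_per_year_s(mttf_s, mttr_s):
    """Expected seconds of downtime in a 365-day year."""
    return (1.0 - availability(mttf_s, mttr_s)) * 365 * 24 * HOUR_S


if __name__ == "__main__":
    for method, mttr_s in ASSUMED_REPAIR_TIMES_S.items():
        a = availability(ASSUMED_MTTF_S, mttr_s)
        d = downtime_per_year_s(ASSUMED_MTTF_S, mttr_s)
        print(f"{method:22s} availability={a:.9f}  downtime/year={d:.3f} s")
```

With the failure rate held fixed, each step down the list shortens yearly downtime roughly in proportion to the repair time.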
63
High Availability Manager
[Figure: TCP/IP, ATM, flash fsys, disk fsys, and HA Manager processes running on the microkernel]
– Process memory violation
– Kernel notifies HA Manager
– Dump file for post-mortem analysis
– HA Manager restarts service
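
The restart step on this slide can be illustrated with a generic supervisor loop. This is a minimal sketch using ordinary child processes and Python's standard `subprocess` module; it is not the QNX HA Manager API, and the `supervise` function, its `max_restarts` parameter, and the always-failing child command are hypothetical names chosen for the example.

```python
import subprocess
import sys


def supervise(command, max_restarts=3):
    """Run `command`, restarting it after each abnormal exit.

    Returns (restarts_performed, final_return_code).
    """
    restarts = 0
    while True:
        result = subprocess.run(command)
        if result.returncode == 0:
            return restarts, 0  # clean exit: nothing to recover
        # On POSIX, a negative return code means the child was killed by a
        # signal, which is how a memory violation (SIGSEGV) would show up.
        print(f"service exited abnormally (code {result.returncode})", file=sys.stderr)
        if restarts == max_restarts:
            return restarts, result.returncode  # give up and escalate
        restarts += 1


if __name__ == "__main__":
    # Hypothetical faulty service: a child process that always exits with status 1.
    faulty = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(faulty, max_restarts=2))
```

Bounding the number of restarts matters: a component that fails immediately on every start should be escalated (for example to a board failover) rather than restarted forever.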
64
Notification and Recovery
[Figure: HAM and HAM Guardian alongside a driver, stack, and app, with checkpointed state]
– HA Manager (HAM) monitors components, sends notification of component failure
– Heart-beat services detect component hangs
– Core file on crash can be created for debugging and analysis
– Checkpointing permits recovering current state
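
Two of the ideas on this slide, heartbeat-based hang detection and checkpointed state, can be sketched in a few lines. This is a simplified model under stated assumptions, not the HAM interface: `HeartbeatMonitor`, `RunningTotal`, the 2-second timeout, and the component names are all hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
class HeartbeatMonitor:
    """Flags components whose last heartbeat is older than `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_beat = {}

    def beat(self, name, now):
        """Record that component `name` reported in at time `now`."""
        self.last_beat[name] = now

    def hung(self, now):
        """Return the sorted names of components that missed their deadline."""
        return sorted(n for n, t in self.last_beat.items() if now - t > self.timeout)


class RunningTotal:
    """Toy component whose entire state is one running total."""

    def __init__(self, checkpoint=None):
        # Resume from a checkpoint if one exists; otherwise start fresh.
        self.total = checkpoint["total"] if checkpoint else 0

    def add(self, n):
        self.total += n

    def checkpoint(self):
        """Return a copy of the state that a restarted instance can resume from."""
        return {"total": self.total}


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout=2.0)
    monitor.beat("stack", now=0.0)
    monitor.beat("driver", now=0.0)
    monitor.beat("stack", now=2.5)      # the driver stays silent
    print(monitor.hung(now=3.0))        # only the driver has missed its deadline

    app = RunningTotal()
    app.add(5)
    saved = app.checkpoint()
    app.add(1)                          # work after the checkpoint is lost on a crash
    restarted = RunningTotal(checkpoint=saved)
    print(restarted.total)              # resumes from the checkpointed value
```

A crash is visible to the kernel, but a hang is not, which is why a heartbeat deadline is needed in addition to crash notification.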
65
Recovery
• A second “shadow” server attaches to the same name
66
Recovery
• A second “shadow” server attaches to the same name
• If primary faults, new clients connect to shadow server
• Old clients can re-connect to shadow server.
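
The shadow-server scheme can be modeled as a name registry that hands each connecting client the first live server attached to a name. The sketch below is a toy in-memory model, not the QNX name-resolution mechanism; `Server`, `NameRegistry`, and the name `/dev/service` used here are assumptions made for illustration.

```python
class Server:
    def __init__(self, label):
        self.label = label
        self.alive = True


class NameRegistry:
    """Maps a service name to the servers attached to it, in attach order."""

    def __init__(self):
        self._servers = {}

    def attach(self, name, server):
        self._servers.setdefault(name, []).append(server)

    def connect(self, name):
        """Return the first live server attached to `name`."""
        for server in self._servers.get(name, []):
            if server.alive:
                return server
        raise ConnectionError(f"no live server attached to {name}")


if __name__ == "__main__":
    registry = NameRegistry()
    primary, shadow = Server("primary"), Server("shadow")
    registry.attach("/dev/service", primary)
    registry.attach("/dev/service", shadow)        # shadow attaches to the same name

    old_client = registry.connect("/dev/service")  # served by the primary
    primary.alive = False                          # primary faults
    new_client = registry.connect("/dev/service")  # new client reaches the shadow
    old_client = registry.connect("/dev/service")  # old client re-connects
    print(new_client.label, old_client.label)
```

Because clients only ever use the name, neither new nor re-connecting clients need to know which server instance is answering.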
67
Recovery
• Start a new “shadow” server
68
Service Upgrades
[Figure: a client connected to Server v1.0 and a new client connected to Server v1.1, with both servers attached to /dev/service]
– New version of server attaches to same name
– New clients connect to new server
– Old server exits when all old clients have exited
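
The upgrade sequence can be sketched as a small state model: the newest server attached to a name takes all new connections, and a superseded server exits once its last client leaves. This is an illustrative model only; `VersionedServer`, `ServiceName`, and the client labels are hypothetical, and no real QNX calls are used.

```python
class VersionedServer:
    def __init__(self, version):
        self.version = version
        self.clients = set()
        self.superseded = False  # set when a newer version attaches to the same name
        self.running = True

    def exit_if_idle(self):
        """A superseded server exits once it has no clients left."""
        if self.superseded and not self.clients:
            self.running = False

    def disconnect(self, client):
        self.clients.discard(client)
        self.exit_if_idle()


class ServiceName:
    """One name (such as /dev/service) to which successive versions attach."""

    def __init__(self):
        self.current = None

    def attach(self, server):
        if self.current is not None:
            self.current.superseded = True
            self.current.exit_if_idle()
        self.current = server

    def connect(self, client):
        if self.current is None:
            raise ConnectionError("no server attached")
        self.current.clients.add(client)
        return self.current


if __name__ == "__main__":
    name = ServiceName()
    v10 = VersionedServer("1.0")
    name.attach(v10)
    old = name.connect("old client")

    v11 = VersionedServer("1.1")
    name.attach(v11)                   # new version attaches to the same name
    new = name.connect("new client")   # new clients reach the new server
    print(old.version, new.version, v10.running)

    v10.disconnect("old client")       # last old client exits
    print(v10.running)                 # the old server has now exited
```

The service is never unavailable during the upgrade, since the old server keeps serving its existing clients until they finish.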
69
QNX Momentics Tools
70
Design Goals
– Tools needed to be easy to learn
– Tools needed to take advantage of QNX
– Tools needed to integrate with tools from other vendors, company-designed tools, and industry-specific tools, so that all of them work with our tools and with each other
– Tools needed to be customizable to the user or the company
71
The Best Tools and the Best RTOS
[Figure: QNX® Momentics and the QNX® Neutrino® RTOS]
– Host platforms: Windows, Solaris, QNX Neutrino
– IDE Workbench (Eclipse framework): source debugger, C/C++ code developer, Java code developer, target information, system builder, profiler, memory analysis, Photon app builder
– Command-line tools (the IDE can invoke command-line tools)
– 3rd-party tools: Virtio, Rational, …TBA
– Host-to-target links: Ethernet, serial, JTAG, ROMulator
– QNX® Neutrino® RTOS: microkernel, target agent, Photon microGUI, flash fsys, TCP/IP, HTTP server, Java
– Other labels: BSPs, DDKs, Neutrino runtime, XScale
72
QNX IDE: Standards based
– IBM donated framework
– Java IDE
– 200 person-years of effort
– Open Source
Consortium founding members include
73
System Profiling
74
Systems Analysis Toolkit
[Figure: application, protocol, TCP/IP, and device driver run on the instrumented microkernel, which traces system events into a system event log for data display, printing, and statistical and numerical analysis]
System Events
• interrupts
• scheduler
• messages
• system calls
System Characterization
• Performance analysis
• Field diagnostic
• Live or post-mortem
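
As a rough illustration of the kind of post-mortem analysis a system event log allows, the sketch below counts events per class and totals the time each thread spent running between scheduler events. The record layout, the event class names, and the sample data are invented for this example; they do not reflect the actual trace format produced by the instrumented microkernel.

```python
from collections import Counter, defaultdict

# Hypothetical event records: (timestamp in microseconds, event class, detail).
# For "scheduler" events the detail names the thread that starts running.
SAMPLE_LOG = [
    (0, "scheduler", "app"),
    (120, "interrupt", "irq5"),
    (130, "scheduler", "driver"),
    (180, "message", "driver->tcpip"),
    (200, "scheduler", "tcpip"),
    (260, "syscall", "write"),
    (300, "scheduler", "app"),
    (400, "scheduler", "idle"),
]


def count_by_class(log):
    """Number of events seen per event class."""
    return Counter(event_class for _, event_class, _ in log)


def running_time_us(log):
    """Time each thread spent running, measured between consecutive scheduler events."""
    totals = defaultdict(int)
    current, since = None, None
    for timestamp, event_class, detail in log:
        if event_class != "scheduler":
            continue
        if current is not None:
            totals[current] += timestamp - since
        current, since = detail, timestamp
    # The thread still running at the end of the log has no end time and is not counted.
    return dict(totals)


if __name__ == "__main__":
    print(dict(count_by_class(SAMPLE_LOG)))
    print(running_time_us(SAMPLE_LOG))
```

The same two passes work on a live stream or on a saved log, which matches the "live or post-mortem" use noted on the slide.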
Providing Technology for Today…
Architecture for Tomorrow
Irvine Office - 949-727-0444
David Weintraub - Regional Sales Manager
Woodland Hills Office - 818-227-5105
Jeff Schaffer - Sr. Field Applications Engineer