
Presented by:

Jeff Schaffer
Sr. Field Applications Engineer
QNX Software Systems
jpschaffer@qnx.com
818-227-5105

“Embedded Operating Systems: The State of the Art”

QNX is a leading provider of real-time operating system (RTOS) software, development tools, and services for mission-critical embedded applications.

Role of the Embedded OS

Traditional

– Permit sharing of common resources of the computer (disks, printers, CPU)

– Provide low-level control of I/O devices that may be complex, time dependent, and non-portable

– Provide device-independent abstractions (e.g. files, filenames, directories)

Additional Roles

– Prevent common causes of system failure and instability; minimize impact when they occur

– Extend system life cycles

– Isolate problems during development and at runtime

Architecture Comparison

REAL TIME EXECUTIVE
– Advantage: single address space
– Disadvantage: single address space, different binary images
– Failure: means reboot

MONOLITHIC KERNEL
– Advantage: apps run in own memory space
– Disadvantage: kernel not protected, kernel testing
– Failure: might mean reboot

TRUE MICROKERNEL
– Advantages: modules run in own memory space; add/replace services on the fly; reusable modules; direct hardware access
– Disadvantage: context switching
– Failure: usually does not mean reboot

[Figure: the microkernel (x86, PPC, MIPS, SH4, ARM, StrongARM, XScale) surrounded by separate processes: Process Manager, Photon GUI, Flash filesystem, Audio driver, TCP/IP, Serial driver, HTTP server, Java]

• Dynamic architecture makes hot-start and upgrades easy, even with drivers

• Philosophy: a trusted kernel running a system of untrusted software components

• Processes provide a reusable component model with well defined message interfaces

• Processes communicate via messages or other methods, such as shared memory. Permits loose inter-module coupling.

• No requirement for filesystem, GUI, etc.

MicroKernel – Neutrino

Typical Forms of IPC

[Figure: three forms of IPC between Process 1 and Process 2, mediated by the kernel. Shared memory: each process address map includes the same shared memory object. Message queues: msg 2 through msg 5 queued between the processes. Mailboxes.]

Which Architecture for me?

• Depends on your application and processor!
• Simple apps (such as single control loops) generally only need a real-time executive
• As the system becomes more complex, it typically needs a more complex operating system architecture
• Need to look at factors such as scalability and reliability
• Do standards matter?
  – APIs: two most common standards
  – Advantages of standards: portability of code, hiring of programmers

Less than 1 second response?

Less than 1 millisecond response?

Less than 1 microsecond response?

Do I need Real-Time?

What is Real Time?

Maybe ...

Real-Time

"A real-time system is one in which the correctness of the computations not only depends upon the logical correctness of the computation but also upon the time at which the result is produced. If the timing constraints of the system are not met, system failure is said to have occurred."

Donald Gillies (comp.realtime FAQ)

A Simple Example...

“it doesn’t do you any good if the signal that cuts fuel to the jet engine arrives a millisecond after the engine has exploded”

Bill O. Gallmeister - POSIX.4 Programming for the Real World

“Hard” vs. “Soft” Real Time

Hard
– absolute deadlines
– late responses cannot be tolerated and may have a catastrophic effect on the system
– example: flight control

Soft
– systems which have reduced constraints on “lateness”; e.g. late responses may still have some value
– still must operate very quickly and repeatably
– example: cardiac pacemaker

Real-time OS Requirements

Operating system factors that permit real-time:
– Thread Scheduling
– Control of Priority Inversion
– Time Spent in Kernel
– Interrupt Processing

Factor #1: Scheduling

Non real-time scheduling
– round-robin
– FIFO
– adaptive

Real-time scheduling
– priority based
– sporadic

Sequence:
1. Low priority task acquires bus mutex to transfer data
2. High priority task blocks until mutex released
3. Medium priority task pre-empts low priority task
4. Watchdog timer resets since Bus Manager has not run in some time

Factor #2: Priority Inversion

Source: Embedded Systems Programming

Information Bus Manager

Meteorological Data Gathering Task

Communications Task
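The four-step sequence can be replayed with a small tick-based simulation. This is only a sketch: the task names, tick counts, and the `simulate` helper are hypothetical, and priority inheritance is used here as one common way to control priority inversion (the slides do not say which method is meant). The late finish of the high-priority task in the first run is the delay that would let a watchdog expire.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int       # larger number = more urgent
    release: int        # tick at which the task becomes ready
    work: int           # CPU ticks the task needs
    needs_mutex: bool   # True if the task holds the bus mutex while it runs


def scenario():
    # Hypothetical numbers chosen only to reproduce the four-step sequence.
    return [
        Task("low", priority=1, release=0, work=3, needs_mutex=True),
        Task("high", priority=3, release=1, work=2, needs_mutex=True),
        Task("medium", priority=2, release=2, work=5, needs_mutex=False),
    ]


def simulate(tasks, inherit):
    """Return {task name: tick at which the task finished}."""
    remaining = {t.name: t.work for t in tasks}
    finished = {}
    owner = None  # task currently holding the mutex
    tick = 0
    while len(finished) < len(tasks):
        ready = [t for t in tasks if t.release <= tick and t.name not in finished]
        if not ready:
            tick += 1
            continue

        def effective_priority(t):
            # With inheritance, the owner runs at the priority of its highest waiter.
            if inherit and t is owner:
                waiting = [w.priority for w in ready if w.needs_mutex and w is not owner]
                return max([t.priority] + waiting)
            return t.priority

        # A task that needs the mutex is blocked while another task owns it.
        runnable = [t for t in ready
                    if not (t.needs_mutex and owner is not None and owner is not t)]
        current = max(runnable, key=effective_priority)
        if current.needs_mutex:
            owner = current
        remaining[current.name] -= 1
        tick += 1
        if remaining[current.name] == 0:
            finished[current.name] = tick
            if owner is current:
                owner = None
    return finished


plain = simulate(scenario(), inherit=False)
inherited = simulate(scenario(), inherit=True)
print("high-priority task finishes at tick", plain["high"], "without inheritance")
print("high-priority task finishes at tick", inherited["high"], "with inheritance")
```

Without inheritance the medium-priority task finishes before the high-priority one, which is the inversion. With inheritance the mutex holder is boosted, releases the mutex early, and the high-priority task runs ahead of the medium one.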

Factor #3: Kernel Time

• Kernel operations must be pre-emptible
  – if they are not, an unknown amount of time can be spent in the kernel performing an operation on behalf of a user process
  – can cause real-time process to miss deadline
• All kernels have some window (or multiple windows) of time where pre-emption cannot occur
• Some operating systems attempt to provide real-time capability by adding “checkpoints” within the kernel so they can be interrupted at these points

[Figure: phases of a kernel call (int KER). A kernel call is a software interrupt. Entry: a few opcodes, interrupts off. Unlocked: microseconds, pre-emptable. Kernel operation (label reads “which may include message”): microseconds to milliseconds, pre-emptable. Locked: microseconds, no pre-emption, interrupts on. Exit: a few opcodes, interrupts off.]

Example

Split Out Long Operations

[Figure: labels Process Manager, Thread, Message, Signal, Channel, Clock, Timer, Pathname, Waitpid, Session, UID/GID; grouped under Nto and Proc]

Factor #4: Interrupts

This is broken down into the following areas:
• Method of handling the interrupt processing chain
• Handling of Nested Interrupts

Interrupt Processing Chain

[Figure: interrupt service threads (ISTs) in two systems. Conventional OS: IST scheduled whenever queue emptied, non-deterministic. Real-time kernel: IST scheduled by normal OS scheduling, deterministic.]

Problems
– different APIs
– real-time layer proprietary
– existing OS apps not R/T
– poor communication between operating systems
– loss of control issue

Can I Make Any Conventional OS Real-Time?

Method
– Add real-time layer below conventional OS, running conventional OS as a low priority real-time process
– Add real-time layer to hardware service layer


Scalability

Scaling Solution #1: Single Board, Single Node

[Figure: a single node; labels: Bridge, Mem., PCI Bus, Peripherals]

The only scaling possible is a CPU replacement

Scaling Solution #2: Single Board, Multiple Nodes

• Relatively simple to implement
• Allows “scaling-on-demand”
• Suitable if nodes have independent “work”

• Inter-node IPC slower than memory access
• Complexity in maintaining global view of data
• Difficult to break-up computationally-intensive

[Figure: two nodes (Node 1, Node 2) on one board; labels: Bridge, Mem., PCI Bus, Peripherals]

Scaling Solution #3: Single Board, Multiple Processors

[Figure: one board; labels: Bridge, Mem., Peripherals, CPU1]

• Tightly-coupled symmetric multiprocessing (SMP)
• All processors have a symmetric and consistent view of physical memory and peripherals
• Scales processing power
• Need software (RTOS) support

The SMP OS Dilemma

SMP systems to date use desktop operating systems; not responsive enough for real-time requirements

• Application servers
• Databases
• Web servers

Typical real-time operating systems (home-built or commercial), such as are commonly used in routers and switches today, do not have SMP support

SMP-capable real-time operating systems run the CPUs as independent processors with independent operating systems

SMP Support

True (tightly coupled) SMP support

Only the kernel needs SMP awareness

Transparent to application software and drivers - identical binaries for UP and SMP systems

Automatic scheduling across all CPUs

[Figure: processes and their threads; a running thread on CPU 0; ready queues indexed by priority from 63 down to 0; threads in blocked states]

QNX “True” SMP

STATE_RUNNING thread on each processor

Priority-based ready queues

Each thread can be locked to a specific CPU by using a processor affinity mask

Scheduler remembers last CPU thread ran on

– Minimize thread migration
– Optimize cache usage

Highest-priority READY thread always immediately scheduled
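The affinity-mask and last-CPU rules can be sketched as a CPU selection function. This is an illustration under simplifying assumptions, not the QNX scheduler: `pick_cpu` is a hypothetical helper, it only considers idle CPUs, and it ignores pre-emption of lower-priority threads.

```python
def pick_cpu(idle_cpus, affinity_mask, last_cpu):
    """Choose a CPU for a ready thread, or return None if no allowed CPU is idle.

    idle_cpus     -- set of CPU numbers with nothing to run
    affinity_mask -- bit n set means the thread may run on CPU n
    last_cpu      -- CPU the thread last ran on (its cache may still be warm)
    """
    allowed = [cpu for cpu in sorted(idle_cpus) if affinity_mask & (1 << cpu)]
    if not allowed:
        return None
    if last_cpu in allowed:
        return last_cpu      # avoid migration, keep cache contents useful
    return allowed[0]


same_cpu = pick_cpu({0, 1}, affinity_mask=0b11, last_cpu=1)
locked_to_cpu0 = pick_cpu({0, 1}, affinity_mask=0b01, last_cpu=1)
must_wait = pick_cpu({1}, affinity_mask=0b01, last_cpu=0)
```

The first call stays on CPU 1, the second is forced to CPU 0 by the mask, and the third finds no permitted idle CPU.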

Why Is Cache Important?

Cache efficiency is probably the single largest determinant of performance on SMP

Coherent view of physical memory is maintained using cache snooping

Cache snooping is done at the CPU bus level and so operates at lower speeds than the core

Coherency is “invisible” to software

Performance Implications

• Snoop traffic expected on SMP
• Cache hits generally cause no bus transaction
• Multiple processors writing to same location degrades performance (ping-pong effect)
• Performance degrades when large amount of data modified on one processor and read on the other
• Sometimes it is better to have specific threads in a process run on same CPU

Designing for SMP: One Big Task

Single thread

Giant App

• Cannot take advantage of SMP: a single thread runs on only one CPU at a time

Designing for SMP: Single Threaded Tasks

Single thread

• Works with SMP
• Process data can be shared with shared memory

• Good concurrency, some complexity

• IPC not usually as efficient as memory sharing

Designing for SMP: Scaling Software with Threads

Threads

Server

• Single copy server
• All process data is implicitly shared and accessible

• Can achieve good concurrency with less complexity

• POSIX synchronization used
  – Mutexes
  – Semaphores
  – Condition variables
• Usually more efficient than inter-process synchronization

Note: SMP finds concurrency problems fast!
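A minimal sketch of mutex-protected shared data inside one multi-threaded process. Python's `threading.Lock` stands in for a POSIX mutex here, and `count_with_mutex` is a hypothetical name used only for illustration.

```python
import threading


def count_with_mutex(num_threads=4, increments=10_000):
    """Increment one shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            with lock:          # only one thread updates the counter at a time
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


total = count_with_mutex(num_threads=4, increments=1_000)
```

Because every update happens while the lock is held, the total always equals threads times increments. Removing the lock makes the read-modify-write unprotected, which is the kind of bug that true parallel execution tends to expose quickly.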

Optimizing Compute-intensive Applications

[Figure: an application whose main thread dispatches to a pool of worker threads]

• Pool of worker threads
• Dispatch “work” to worker threads
• Scales very well with SMP
• The tricky part is “breaking up” the problem
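The worker-pool pattern can be sketched with the standard `concurrent.futures` module. The `split` and `parallel_sum_of_squares` helpers are hypothetical, and the workload is a toy. In CPython the global interpreter lock keeps pure-Python threads from running in parallel, so this shows the dispatch structure rather than an SMP speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def split(items, parts):
    """Break a list into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(items) // parts))   # ceiling division, at least 1
    return [items[i:i + size] for i in range(0, len(items), size)]


def parallel_sum_of_squares(values, workers=4):
    """Main thread splits the problem, workers compute partial results."""
    chunks = split(list(values), workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = pool.map(lambda chunk: sum(v * v for v in chunk), chunks)
        return sum(partials)


result = parallel_sum_of_squares(range(10))
```

Here the problem breaks up cleanly because each chunk is independent. Work with shared intermediate state needs more care in how it is partitioned.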

[Figure: IRQ 7 routed to CPU 0; IRQs 8, 9, and 10 routed to CPU 1; the ISR runs on the targeted CPU]

IRQ   CPU
7     0
8     1
9     1
10    1

Interrupt processed on CPU that was targeted

Can distribute load by handling interrupts on different processors

Sometimes not the optimal strategy due to cache effects

Interrupt Handling

Scaling Solution #4: Multiple Processors/Nodes

[Figure: Node 1 and Node 2 (labels: Bridge, Mem., Peripherals, CPU1) joined by a network. Example: line cards in a chassis joined by a network.]

Messages flow transparently through QNET from one message bus to another.

All applications and servers become network distributed without any special code.

[Figure: microkernel message buses joined over a LAN, the Internet, or a backplane; attached processes include App, Flash filesystem, CDROM, TCP/IP, Audio, Photon, and Process Manager]

The QNET MicroNetwork

[Figure: a line card and a control card]

QNX Qnet Manager

Extends message passing across multiple QNX microkernels

Over anything with a packet driver:

– Ethernet, RapidIO, 3GIO, InfiniBand, Stargen, etc.

• Class of service
• Use symbolic prefixes to make client code independent of location of resource manager

[Figure: a control card linked to two line cards]

One or multiple links can connect different nodes.

QNET Class of Service

Data is sent out the link which will deliver it the fastest. This is based upon link speed and queue length for each link.
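One way to read "fastest delivery based on link speed and queue length" is to estimate, per link, how long the queued data plus the new packet would take to transmit. The sketch below assumes that reading. The `Link` class, its fields, and the numbers are hypothetical, and queues never drain in this toy model.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    bits_per_second: int
    queued_bits: int = 0


def estimated_delivery_seconds(link, packet_bits):
    # Everything already queued must go out before the new packet.
    return (link.queued_bits + packet_bits) / link.bits_per_second


def send_load_balanced(links, packet_bits):
    """Queue the packet on the link expected to deliver it soonest."""
    best = min(links, key=lambda link: estimated_delivery_seconds(link, packet_bits))
    best.queued_bits += packet_bits
    return best.name


links = [Link("fast", bits_per_second=1000), Link("slow", bits_per_second=300)]
choices = [send_load_balanced(links, packet_bits=1000) for _ in range(4)]
```

The fast link takes the first three packets. By the fourth its queue is long enough that the slow but idle link would deliver sooner, so traffic spills over.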

[Figure: a control card linked to two line cards]

QNET: Load-Balanced Distribution

Data is sent out a primary link. If it fails, data is diverted to a secondary link. The primary link is probed and when it comes back online, data is diverted back to it.
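Ordered distribution amounts to "use the most preferred link that is currently up". A minimal sketch, assuming a hypothetical probe callable that reports link status; the link names are made up for the example.

```python
def pick_ordered(links, is_up):
    """Return the first link, in preference order, that the probe reports as up.

    links -- link names, primary first
    is_up -- callable(name) -> bool, a hypothetical link probe
    """
    for name in links:
        if is_up(name):
            return name
    return None


link_status = {"primary": False, "secondary": True}
chosen_during_outage = pick_ordered(["primary", "secondary"], link_status.get)

link_status["primary"] = True    # the probe sees the primary link come back
chosen_after_recovery = pick_ordered(["primary", "secondary"], link_status.get)
```

Because the primary is probed on every decision, traffic returns to it as soon as it is reported up again.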

[Figure: a control card linked to two line cards]

QNET: Ordered Distribution

Data is sent out both links at the same time. A failure on either of the links is handled gracefully.

[Figure: a control card linked to two line cards]

QNET: Parallel Distribution

Designing for Networked SMP: Single/Multi Threaded Tasks

Multiple threads

Single thread

• Different processes necessary for different nodes

• Works with SMP
• Process data can be shared with shared memory

• IPC for networked communication

Client /service

Client Node

/net/a/dev/service

/net/b/dev/service

• Simple link provides transparent redirection
• Process has to monitor status of link
• Switch over is not transparent to client

Transparent Redirection

Client

Client Node

/net/a/dev/service

/net/b/dev/service

Servicemgr

• Service manager acts as a proxy
• Monitors health of and/or load on services/nodes
• Switch over is transparent to client
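The proxy idea can be sketched as a small class that forwards each request to the first healthy backend and fails over when one stops answering. The `ServiceManager` class and the two stand-in node functions are hypothetical; only the pathnames come from the slide.

```python
class ServiceManager:
    """Proxy that hides which node actually serves /dev/service."""

    def __init__(self, backends):
        # backends: mapping of pathname -> callable(request) -> reply, in preference order
        self.backends = dict(backends)
        self.healthy = {path: True for path in self.backends}

    def request(self, payload):
        for path, handler in self.backends.items():
            if not self.healthy[path]:
                continue
            try:
                return handler(payload)
            except ConnectionError:
                self.healthy[path] = False   # remember the failure, try the next node
        raise RuntimeError("no healthy backend for /dev/service")


def node_a(request):
    raise ConnectionError("node a is down")   # simulated failure


def node_b(request):
    return f"b handled {request}"


manager = ServiceManager({
    "/net/a/dev/service": node_a,
    "/net/b/dev/service": node_b,
})
reply = manager.request("read")
```

The client only ever talks to the manager, so the switch from node a to node b is invisible to it.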

/dev/service

Transparent Redirection

Client

Client Node

/net/a/dev/service

/net/b/dev/service

Servicemgr

/dev/service

• Requests serviced redundantly
• First/majority/best result
• Different implementations
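Of the three result policies, majority voting is the easiest to show in a few lines. This is a sketch with a hypothetical `majority_result` helper; the replies would come from the redundant servers.

```python
from collections import Counter


def majority_result(results):
    """Return the reply given by more than half of the servers, else None."""
    if not results:
        return None
    winner, votes = Counter(results).most_common(1)[0]
    return winner if votes > len(results) / 2 else None


agreed = majority_result([42, 42, 41])    # one server disagrees
no_majority = majority_result([1, 2, 3])  # no two servers agree
```

A "first result" policy would simply take the earliest reply, and a "best result" policy needs an application-specific ranking.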

Redundant Links

[Figure: networked nodes, each running processes such as Flash filesystem, CDROM filesystem, TCP/IP, Bluetooth, Graphics, Browser, Audio, Photon, and Apps]


Reliability and Availability

Embedded systems are different!

Failure in an embedded system can have severe effects - like death …

“Pilots really hate to be told they have to reboot their plane while in flight”
Walter Shawlee

Definitions

MTBF: Mean Time Between Failure
– The average number of hours between failures for a large number of components over a long time. (e.g. MIL-HDBK-217)

MTTR: Mean Time To Repair
– Total amount of time spent performing all corrective maintenance repairs divided by the number of repairs

MTBI: Mean Time Between Interruptions
– The average number of hours between failures while a redundant component is down.

Defining HA

Reliability
– Quantified by failure rate (MTBF)
– Time to resume service after failure is MTTR

Availability
– Allows for failure, with quick service restoration
– As MTTR → 0, Availability → 100%

5 Nines
– < 5 minutes downtime / year (> 99.999% uptime)
– Assume faults exist: design to contain, notify, recover and restore rapidly
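The relationship between MTBF, MTTR, and availability is usually written as availability = MTBF / (MTBF + MTTR). The sketch below uses that standard formula; the 10,000-hour MTBF and both repair times are hypothetical values picked for the example.

```python
MINUTES_PER_YEAR = 365 * 24 * 60   # 525,600 minutes in a 365-day year


def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is in service."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail):
    return (1.0 - avail) * MINUTES_PER_YEAR


# Hypothetical component that fails every 10,000 hours on average.
for label, mttr_hours in [("2 hour repair", 2.0), ("1 second repair", 1.0 / 3600)]:
    a = availability(10_000, mttr_hours)
    print(f"{label}: availability {a:.6%}, "
          f"downtime {downtime_minutes_per_year(a):.2f} min/year")
```

With the same failure rate, shrinking the repair time moves availability toward 100%. Exactly 99.999% works out to roughly 5.26 minutes per 365-day year, so a five-minute budget is a slightly stricter round number.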

Annual Cost of Downtime versus Availability

Annual availability   Annual cost of downtime
99%                   $68,372,928
99.9%                 $6,837,293
99.99%                $683,729
99.999%               $68,373

Source: Gartner Group ($13,000/minute Cross-industry Average)

Costs speak for themselves
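The cost figures follow from multiplying unavailable minutes per year by the $13,000/minute average. The year length is not stated on the slide; a 365.24-day year is an assumption that happens to reproduce the published numbers to the dollar. `annual_downtime_cost` is a hypothetical helper, and exact fractions are used to avoid floating-point rounding.

```python
from fractions import Fraction

COST_PER_MINUTE = 13_000                            # from the cited average
MINUTES_PER_YEAR = Fraction("365.24") * 24 * 60     # assumed year length


def annual_downtime_cost(availability_percent):
    """Dollars lost per year at the given availability, e.g. "99.9"."""
    unavailable = 1 - Fraction(availability_percent) / 100
    return round(unavailable * MINUTES_PER_YEAR * COST_PER_MINUTE)


costs = {pct: annual_downtime_cost(pct) for pct in ("99", "99.9", "99.99", "99.999")}
for pct, dollars in costs.items():
    print(f"{pct}% available: ${dollars:,} per year")
```

Each extra nine cuts the annual cost by a factor of ten.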

Availability via Reliability and Repair

low MTTR -> high availability
– System is composed of reliable components, that are protected from each other, and that communicate ONLY through well known interfaces.

this leads to
– fault isolation
– speedy recovery
– reset a component not a board/system
– dynamic control
  • stop/start
  • upgrade

Software vs Hardware HA

Hardware HA
– utilizes redundancy of key components

• a single fault cannot cause all redundant components to fail (No SPOF). e.g. mirrored disks, multiple system boards, I/O cards

– Active/active, active/spare, active/standby

Software is a Significant Cause of Downtime

But that’s only part of the problem!!!

Comparison

Software Fault: 40%
Planned Outage
Operator Error: 15%
Environment: 5%
Hardware: 10%

High Level Look at a Core Router/Switch

One or more control elements

[Figure: router/switch chassis with maintenance panel, card slots 1 to 20, fiber management trough, optical multiplexer tray (OMX), and cooling unit]

Handling Failures


Isolate Fault to a Board

Switch to Backup


[Figure: software components running above the hardware: Route Manager, TCP/IP stack, SNMP Manager, Flash Drivers, Device Manager, Network Manager, and Applications]

Isolate fault to a SW component

May not be in the Hardware

[Figure: the same software components (Route Manager, TCP/IP stack, SNMP Manager, Flash Drivers, Device Manager, Network Manager, Applications), one marked as the faulty software component]

• Isolate and contain
• Repair (e.g. restart)
• Notify
• Diagnose
• Upgrade

Ideal: Identify and Fix

Component-level recovery rarely done

• Lack of suitable protection and isolation
• Lack of modularity
• Tight component coupling
• Few dynamic capabilities

Software failures normally handled by:
• Hardware watchdogs
• Redundant boards

Repair Time

Repair action          Typical time
Board Replacement      Hours
Reboot                 Minutes
Failover to Standby    Seconds
SW Component Restart   10’s Milliseconds
SW Failover            Milliseconds

[Figure: microkernel with TCP/IP, FLASHFSYS, DISKFSYS, ATM, and HA Manager processes. Process memory violation; kernel notifies HA Manager; dump file for post-mortem analysis; HA Manager restarts service.]

High Availability Manager

[Figure: a driver, the HAM, a HAM Guardian, and checkpointed state]

HA Manager (HAM) monitors components, sends notification of component failure

Heart-beat services detect component hangs

Core file on crash can be created for debugging and analysis

Checkpointing permits recovering current state
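Hang detection by heartbeat can be sketched as "report any component whose last beat is older than a timeout". The `HeartbeatMonitor` class below is a hypothetical illustration and not the HAM API. Time is passed in explicitly so the behaviour is deterministic; a real monitor would read a monotonic clock.

```python
class HeartbeatMonitor:
    """Track the last heartbeat of each component and report hung ones."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def beat(self, component, now):
        # Called by (or on behalf of) a component to show it is still making progress.
        self.last_seen[component] = now

    def hung(self, now):
        # A component is considered hung once its last beat is older than the timeout.
        return sorted(name for name, seen in self.last_seen.items()
                      if now - seen > self.timeout)


monitor = HeartbeatMonitor(timeout=3)
monitor.beat("tcpip", now=0)
monitor.beat("flashfsys", now=0)
monitor.beat("tcpip", now=4)          # tcpip keeps beating, flashfsys goes quiet
suspects = monitor.hung(now=5)
```

A crashed process is noticed through a kernel notification; the heartbeat covers the case where a process is still alive but stuck.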

Notification and Recovery

• A second “shadow” server attaches to the same name
• If primary faults, new clients connect to shadow server
• Old clients can re-connect to shadow server.

Recovery

• Start a new “shadow” server

Recovery

[Figure: Server v1.0 with its client and Server v1.1 with a new client, both attached to /dev/service]

Service Upgrades

New version of server attaches to same name

New clients connect to new server

Old server exits when all old clients have exited
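The three upgrade rules can be modelled with a toy name registry. The `Server` and `Namespace` classes are hypothetical stand-ins for server processes and the pathname space; only the `/dev/service` name and the version numbers come from the slide.

```python
class Server:
    def __init__(self, version):
        self.version = version
        self.clients = 0
        self.running = True


class Namespace:
    """Maps a pathname to the servers attached to it, newest last."""

    def __init__(self):
        self.attached = {}

    def _retire(self, path, server):
        server.running = False
        self.attached[path].remove(server)

    def attach(self, path, server):
        servers = self.attached.setdefault(path, [])
        for old in [s for s in servers if s.clients == 0]:
            self._retire(path, old)          # idle old versions can exit at once
        servers.append(server)

    def connect(self, path):
        server = self.attached[path][-1]     # new clients get the newest server
        server.clients += 1
        return server

    def disconnect(self, path, server):
        server.clients -= 1
        if server.clients == 0 and server is not self.attached[path][-1]:
            self._retire(path, server)       # old server exits with its last client


ns = Namespace()
v1_0 = Server("1.0")
ns.attach("/dev/service", v1_0)
old_client_server = ns.connect("/dev/service")

v1_1 = Server("1.1")
ns.attach("/dev/service", v1_1)
new_client_server = ns.connect("/dev/service")

ns.disconnect("/dev/service", old_client_server)
```

The existing client stays with v1.0 until it disconnects, so the upgrade happens without interrupting service.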

QNX Momentics Tools

Design Goals

Tools needed to be easy to learn

Tools which could take advantage of QNX

Tools which could integrate tools from other vendors, company designed tools, and industry specific tools and have them work with our tools and each other

Tools needed to be customizable to the user or the company

[Figure: QNX Momentics architecture. Hosts: Windows, Solaris, QNX Neutrino. IDE Workbench (Eclipse framework) with source debugger, Java code developer, C/C++ code developer, target information, system builder, profiler, Photon app builder, memory analysis, command-line tools (invoked from the IDE), and 3rd-party tools (Virtio, Rational, … TBA). Connection over Ethernet, serial, JTAG, or ROMulator to a target agent on the QNX® Neutrino® RTOS runtime (microkernel, Photon microGUI, Flash filesystem, TCP/IP, HTTP server, Java); target label: XScale.]

QNX® Momentics

The Best Tools and the Best RTOS

• IBM donated Framework
• Java IDE
• 200 person-years of effort
• Open Source

Consortium founding members include

QNX IDE: Standards based

System Profiling

[Figure: application, protocol (TCP/IP), and device driver running on the instrumented microkernel, which writes a system event log; the log feeds data display, printer, and statistical and numerical analysis]

System Events
• interrupts
• scheduler
• messages
• system calls

System Characterization
• Performance analysis
• Field diagnostic
• Live or post-mortem

Systems Analysis Toolkit

Providing Technology for Today…

Architecture for Tomorrow

Irvine Office - 949-727-0444
David Weintraub - Regional Sales Manager
dweintraub@qnx.com

Woodland Hills Office - 818-227-5105
Jeff Schaffer - Sr. Field Applications Engineer
jpschaffer@qnx.com
