Post on 27-Mar-2015


1

Presented by:

Jeff Schaffer
Sr. Field Applications Engineer
QNX Software Systems
jpschaffer@qnx.com
818-227-5105

“Embedded Operating Systems: The State of the Art”

QNX is a leading provider of real time operating system (RTOS) software, development tools, and services for mission critical embedded applications.

2

Role of the Embedded OS

Traditional

– Permit sharing of common resources of the computer (disks, printers, CPU)

– Provide low-level control of I/O devices that may be complex, time dependent, and non-portable

– Provide device-independent abstractions (e.g. files, filenames, directories)

Additional Roles

– Prevent common causes of system failure and instability; minimize impact when they occur

– Extend system life cycles

– Isolate problems during development and at runtime

3

Architecture Comparison

REAL TIME EXECUTIVE
– Advantage: single address space
– Disadvantage: single address space, different binary images
– Failure: means reboot

MONOLITHIC KERNEL
– Advantage: apps run in own memory space
– Disadvantage: kernel not protected, kernel testing
– Failure: might mean reboot

TRUE MICROKERNEL
– Advantage:
  • Modules run in own memory space
  • Add/replace services on the fly
  • Reusable modules
  • Direct hardware access
– Disadvantage: context switching
– Failure: usually does not mean reboot

4

MicroKernel – Neutrino

[Diagram: microkernel (x86, PPC, MIPS, SH4, ARM, StrongARM, XScale) surrounded by separate processes: App, Photon GUI, Flash fsys, Audio driver, TCP/IP, Serial driver, HTTP server, Java, Process Manager]

• Dynamic architecture makes hot-start and upgrades easy, even with drivers

• Philosophy: a trusted kernel running a system of untrusted software components

• Processes provide a reusable component model with well defined message interfaces

• Processes communicate via messages or other methods, such as shared memory. Permits loose inter-module coupling.

• No requirement for filesystem, GUI, etc.

5

Typical Forms of IPC

[Diagram: Process 1 and Process 2 communicating through the kernel by pipes, mailboxes, message queues (msg 2 to msg 5 queued) and a shared memory object mapped into each process address map]

6

Which Architecture for me?

– Depends on your application and processor!
– Simple apps (such as single control loops) generally only need a real-time executive
– As system becomes more complex, typically need a more complex operating system architecture
– Need to look at factors such as scalability and reliability

Do standards matter?

API’s
– Two most common standards

Advantages of standards
– Portability of code
– Hiring of programmers

8

Do I need Real-Time?

What is Real Time?
– Less than 1 second response?
– Less than 1 millisecond response?
– Less than 1 microsecond response?

Maybe ...

9

Real-Time

"A real-time system is one in which the correctness of the computations not only depends upon the logical correctness of the computation but also upon the time at which the result is produced. If the timing constraints of the system are not met, system failure is said to have occurred."

Donald Gillies (comp.realtime FAQ)

10

A Simple Example...

“it doesn’t do you any good if the signal that cuts fuel to the jet engine arrives a millisecond after the engine has exploded”

Bill O. Gallmeister - POSIX.4 Programming for the Real World

11

ATM

“Hard” vs. “Soft” Real Time

Hard
– absolute deadlines
– late responses cannot be tolerated and may have a catastrophic effect on the system
– example: flight control

Soft
– systems which have reduced constraints on “lateness”; e.g. late responses may still have some value
– still must operate very quickly and repeatably
– example: cardiac pacemaker
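The hard/soft distinction can be pictured as a value function over lateness. The sketch below is illustrative only: the function names, the `grace_ms` parameter and the linear decay chosen for the soft case are assumptions, not something stated on the slide.

```python
def hard_value(finish_ms: float, deadline_ms: float) -> float:
    """Hard real time: a late result counts as a failure and has no value."""
    return 1.0 if finish_ms <= deadline_ms else 0.0


def soft_value(finish_ms: float, deadline_ms: float, grace_ms: float) -> float:
    """Soft real time: a late result keeps some value.

    The linear decay to zero over grace_ms is an assumed shape.
    """
    if finish_ms <= deadline_ms:
        return 1.0
    lateness = finish_ms - deadline_ms
    return max(0.0, 1.0 - lateness / grace_ms)
```

With a 10 ms deadline, a result at 12 ms is worth nothing under the hard model but still half its value under this soft model with a 4 ms grace period.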

12

Real-time OS Requirements

Operating system factors that permit real-time:
– Thread Scheduling
– Control of Priority Inversion
– Time Spent in Kernel
– Interrupt Processing

13

Factor #1: Scheduling

Non real-time scheduling
– round-robin
– FIFO
– adaptive

Real-time scheduling
– priority based
– sporadic

14

Factor #2: Priority Inversion

Sequence:
1. Low priority task acquires bus mutex to transfer data
2. High priority task blocks until mutex released
3. Medium priority task pre-empts low priority task
4. Watchdog timer resets since Bus Manager has not run in some time

[Diagram: task timeline for the Information Bus Manager, the Meteorological Data Gathering Task and the Communications Task]

Source: Embedded Systems Programming
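The sequence on this slide can be replayed in a small tick-based simulation. This is a minimal sketch under stated assumptions: the `Task` fields, the tick counts and the rule that a mutex-using task holds the mutex for all of its work are invented for illustration, and priority inheritance is shown as one common remedy (the slide does not name a specific fix).

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int      # larger number = higher priority
    arrival: int       # tick at which the task becomes ready
    work: int          # ticks of CPU time it needs
    needs_mutex: bool  # simplification: all of its work is done holding the mutex


def simulate(tasks, inherit):
    """Tick-based preemptive priority scheduler; returns finish tick per task."""
    remaining = {t.name: t.work for t in tasks}
    finished = {}
    owner = None  # task currently holding the mutex
    tick = 0
    while len(finished) < len(tasks):
        ready = [t for t in tasks if t.arrival <= tick and remaining[t.name] > 0]
        blocked = [t for t in ready
                   if t.needs_mutex and owner is not None and owner is not t]
        runnable = [t for t in ready if all(t is not b for b in blocked)]

        def effective(t):
            # With inheritance, the mutex owner runs at its highest waiter's priority.
            if inherit and t is owner and blocked:
                return max(t.priority, max(b.priority for b in blocked))
            return t.priority

        if runnable:
            current = max(runnable, key=effective)
            if current.needs_mutex:
                owner = current
            remaining[current.name] -= 1
            if remaining[current.name] == 0:
                finished[current.name] = tick + 1
                if owner is current:
                    owner = None
        tick += 1
    return finished


def scenario():
    return [
        Task("low", 1, arrival=0, work=3, needs_mutex=True),
        Task("high", 3, arrival=1, work=1, needs_mutex=True),
        Task("medium", 2, arrival=2, work=5, needs_mutex=False),
    ]
```

Without inheritance the high priority task finishes only after the medium priority task has run to completion; with inheritance the low priority task is boosted, releases the mutex early, and the high priority task finishes well before the medium one.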

15

Factor #3: Kernel Time

Kernel operations must be pre-emptible
– if they are not, an unknown amount of time can be spent in the kernel performing an operation on behalf of a user process
– can cause real-time process to miss deadline

All kernels have some window (or multiple windows) of time where pre-emption cannot occur

Some operating systems attempt to provide real-time capability by adding “checkpoints” within the kernel so they can be interrupted at these points

16

Example

A kernel call is a software interrupt (int KER on entry, iret on exit)

– Entry: a few opcodes; interrupts off
– Kernel operation, which may include message pass: usecs to msecs; unlocked; pre-emptable
– Locked: usecs; no pre-emption; interrupts on
– Unlocked: usecs; pre-emptable
– Exit: a few opcodes; interrupts off

Split Out Long Operations

[Diagram: Process Manager, with two groups of services labelled Nto and Proc. First group: Thread, Sync, Message, Sched, Signal, Channel, Clock, Timer, Intr. Second group: Fork, Exec, Pathname, Spawn, Mmap, Waitpid, Session, UID/GID, Debug]

18

Factor #4: Interrupts

This is broken down into the following areas:
– Method of handling the interrupt processing chain
– Handling of Nested Interrupts

19

Interrupt Processing Chain

[Diagram 1: INT x and INT y each run an ISR, which hands off to an IST. IST scheduled whenever queue emptied: non-deterministic]

[Diagram 2: INT x and INT y each run an ISR, which hands off to an IST. IST scheduled by normal OS scheduling: deterministic]

20

Can I Make Any Conventional OS Real-Time?

[Diagram: conventional OS layered on a real-time kernel]

Method
– Add real-time layer below conventional OS, running conventional OS as a low priority real-time process
– Add real-time layer to hardware service layer

Problems
– different API’s
– real-time layer proprietary
– existing OS apps not R/T
– poor communication between operating systems
– loss of control issue

21

Scalability

22

Scaling Solution #1: Single Board, Single Node

[Diagram: one CPU connected through a bridge to memory and, over the PCI bus, to peripherals]

The only scaling possible is a CPU replacement

23

Scaling Solution #2: Single Board, Multiple Nodes

– Relatively simple to implement
– Allows “scaling-on-demand”
– Suitable if nodes have independent “work”

– Inter-node IPC slower than memory access
– Complexity in maintaining global view of data
– Difficult to break up computationally-intensive tasks

[Diagram: Node 1 and Node 2, each with its own CPU, bridge, PCI bus and peripherals]

24

Scaling Solution #3: Single Board, Multiple Processors

[Diagram: CPU0 and CPU1 sharing one bridge, memory, PCI bus and peripherals]

– Tightly-coupled symmetric multiprocessing (SMP)
– All processors have a symmetric and consistent view of physical memory and peripherals
– Scales processing power
– Need software (RTOS) support

25

The SMP OS Dilemma

SMP systems to date use desktop operating systems; not responsive enough for real-time requirements

• Application servers
• Databases
• Web servers

Typical real-time operating systems (home-built or commercial), such as are commonly used in routers and switches today, do not have SMP support

SMP capable real-time operating systems run the CPU’s as independent processors with independent operating systems

26

SMP Support

True (tightly coupled) SMP support

Only the kernel needs SMP awareness

Transparent to application software and drivers - identical binaries for UP and SMP systems

Automatic scheduling across all CPU’s

27

QNX “True” SMP

[Diagram: a running thread from a process on each of CPU 0 and CPU 1; ready queues ordered by priority (63, 62, 61 ... 0) holding ready threads; further threads in blocked states]

STATE_RUNNING thread on each processor

Priority-based ready queues

Each thread can be locked to a specific CPU by using a processor affinity mask

Scheduler remembers last CPU thread ran on

– Minimize thread migration
– Optimize cache usage

Highest-priority READY thread always immediately scheduled
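The priority-based ready queues and the affinity mask described above can be sketched as follows. This is a simplified model rather than QNX code: the `ReadyQueues` class and its method names are hypothetical, and it does not model the scheduler remembering the last CPU a thread ran on.

```python
from collections import deque

NUM_PRIORITIES = 64  # the slide's diagram shows priorities 63 down to 0


class ReadyQueues:
    """One FIFO queue of ready threads per priority level."""

    def __init__(self):
        self.queues = [deque() for _ in range(NUM_PRIORITIES)]

    def make_ready(self, name, priority, affinity_mask=~0):
        # Bit n of affinity_mask set means the thread may run on CPU n.
        # The default (~0, all bits set) allows every CPU.
        self.queues[priority].append((name, affinity_mask))

    def pick(self, cpu):
        """Remove and return the highest-priority ready thread allowed on cpu."""
        for prio in range(NUM_PRIORITIES - 1, -1, -1):
            queue = self.queues[prio]
            for index, (name, mask) in enumerate(queue):
                if mask & (1 << cpu):
                    del queue[index]
                    return name
        return None
```

A thread locked to CPU 1 (mask `0b10`) is skipped when CPU 0 asks for work, even if it has the highest priority in the system; CPU 0 takes the next eligible thread instead.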

28

Why Is Cache Important?

Cache efficiency is probably the single largest determinant of performance on SMP

Coherent view of physical memory is maintained using cache snooping

Cache snooping is done at the CPU bus level and so operates at lower speeds than core

Coherency is “invisible” to software

29

Performance Implications

– Snoop traffic expected on SMP
– Cache hits generally cause no bus transaction
– Multiple processors writing to same location degrades performance (ping-pong effect)
– Performance degrades when large amount of data modified on one processor and read on the other
– Sometimes it is better to have specific threads in a process run on same CPU

30

Designing for SMP: One Big Task

Single thread

Giant App

• Will not work with SMP

31

Designing for SMP: Single Threaded Tasks

[Diagram: App 1 and App 2, each with a single thread]

• Works with SMP
• Process data can be shared with shared memory
• Good concurrency, some complexity
• IPC not usually as efficient as memory sharing

32

Designing for SMP: Scaling Software with Threads

[Diagram: one server process containing several threads]

• Single copy server
• All process data is implicitly shared and accessible
• Can achieve good concurrency with less complexity
• POSIX synchronization used
  – Mutexes
  – Semaphores
  – Condition variables
  – Usually more efficient than inter-process synchronization

Note: SMP finds concurrency problems fast!
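A minimal sketch of threads sharing process data under a mutex is shown below. Python's `threading.Lock` stands in here for a POSIX mutex; the function name and the shared counter are made up for the example.

```python
import threading


def count_with_threads(num_threads=4, increments=10_000):
    """Have several threads update one shared counter, guarded by a mutex."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # The read-modify-write must be atomic with respect to other threads.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter
```

Removing the lock turns the increment into an unprotected read-modify-write, which is the kind of concurrency problem that real parallel execution on SMP tends to expose quickly.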

33

Optimizing Compute-intensive Applications

[Diagram: one application whose main thread feeds several worker threads]

– Pool of worker threads
– Dispatch “work” to worker threads
– Scales very well with SMP
– The tricky part is “breaking up” the problem
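The worker-pool pattern can be sketched with Python's standard `ThreadPoolExecutor`. The workload (a sum of squares) and the `split` helper are assumptions chosen for illustration; the sketch shows the structure of breaking up and dispatching work, not a speed-up measurement, since CPython threads do not run CPU-bound code in parallel.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Break data into at most `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(data[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The unit of "work" handed to one worker thread."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(data, workers=4):
    # The main thread dispatches one chunk per worker and combines the results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum_of_squares, split(data, workers)))
```

Here the problem breaks up cleanly because each chunk is independent; workloads with dependencies between pieces need more careful partitioning.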

34

Interrupt Handling

IRQ   CPU
7     0
8     1
9     1
10    1

[Diagram: IRQ 7 delivered to CPU 0 and IRQs 8, 9 and 10 to CPU 1; each interrupt runs an ISR, followed by an IST]

Interrupt processed on CPU that was targeted

Can distribute load by handling interrupts on different processors

Sometimes not the optimal strategy due to cache effects

35

Scaling Solution #4: Multiple Processors/Nodes

[Diagram: Node 1 and Node 2, each with CPU0 and CPU1 sharing a bridge, PCI bus and peripherals]

36

Example

[Diagram: a chassis of line cards, each with network connections, joined by a high-speed interconnect and a low-speed bus]

The QNET MicroNetwork

[Diagram: two microkernel nodes, each with its own processes (App, Flash fsys, CD-ROM fsys, TCP/IP, Audio, Photon, Process Manager), linked by QNET over a LAN, the Internet or a backplane]

Messages flow transparently through QNET from one message bus to another.

All applications and servers become network distributed without any special code.

38

QNX Qnet Manager

[Diagram: two line cards and a control card]

– Extends message passing across multiple QNX microkernels
– Over anything with a packet driver: Ethernet, RapidIO, 3GIO, InfiniBand, Stargen, etc.
– Class of service
– Use symbolic prefixes to make client code independent of location of resource manager

39

QNET Class of Service

[Diagram: two line cards and a control card joined by links]

One or multiple links can connect different nodes.

40

QNET: Load-Balanced Distribution

Data is sent out the link which will deliver it the fastest. This is based upon link speed and queue length for each link.

41

QNET: Ordered Distribution

Data is sent out a primary link. If it fails, data is diverted to a secondary link. The primary link is probed and when it comes back online, data is diverted back to it.

42

QNET: Parallel Distribution

Data is sent out both links at the same time. A failure on either of the links is handled gracefully.
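The three distribution policies can be compared with a small link-selection sketch. This is not Qnet's implementation or configuration syntax: the `Link` class, its fields and the delivery-time estimate are assumptions made to illustrate the decisions described on the slides.

```python
from dataclasses import dataclass


@dataclass
class Link:
    name: str
    speed_mbps: float
    queued_bytes: int = 0
    up: bool = True

    def delivery_estimate(self, size):
        """Seconds to drain the queue plus this packet (assumed cost model)."""
        return (self.queued_bytes + size) * 8 / (self.speed_mbps * 1_000_000)


def load_balanced(links, size):
    """Pick the live link expected to deliver the data soonest."""
    live = [link for link in links if link.up]
    if not live:
        return []
    return [min(live, key=lambda link: link.delivery_estimate(size))]


def ordered(links, size):
    """Links are listed primary first; use the first one that is up."""
    for link in links:
        if link.up:
            return [link]
    return []


def parallel(links, size):
    """Send on every live link at once."""
    return [link for link in links if link.up]
```

Because `ordered` re-evaluates the list on every call, traffic returns to the primary as soon as its `up` flag is set again, which mirrors the probe-and-divert-back behaviour described for ordered distribution.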

43

Designing for Networked SMP: Single/Multi Threaded Tasks

[Diagram: App 1 with multiple threads and App 2 with a single thread]

• Different processes necessary for different nodes
• Works with SMP
• Process data can be shared with shared memory
• IPC for networked communication

44

Transparent Redirection

[Diagram: Client on the client node; /service link; /net/a/dev/service on node A; /net/b/dev/service on node B]

• Simple link provides transparent redirection
• Process has to monitor status of link
• Switch over is not transparent to client

45

Transparent Redirection

[Diagram: Client on the client node; service manager at /dev/service; /net/a/dev/service on node A; /net/b/dev/service on node B]

• Service manager acts as a proxy
• Monitors health of and/or load on services/nodes
• Switch over is transparent to client

46

Redundant Links

[Diagram: Client on the client node; service manager at /dev/service; /net/a/dev/service on node A; /net/b/dev/service on node B]

• Requests serviced redundantly
• First/majority/best result
• Different implementations
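When a request is serviced redundantly, something has to choose among the replies. The sketch below shows the "first" and "majority" choices; the function names and the convention that `None` marks a replica that failed to answer are assumptions, and "best result" is left out because it depends on an application-specific quality measure.

```python
from collections import Counter


def first_result(results):
    """Return the first reply that arrived (None marks a failed replica)."""
    for result in results:
        if result is not None:
            return result
    raise RuntimeError("no replica answered")


def majority_result(results):
    """Return the reply given by more than half of all replicas."""
    answered = [result for result in results if result is not None]
    if answered:
        value, votes = Counter(answered).most_common(1)[0]
        if votes > len(results) // 2:
            return value
    raise RuntimeError("no majority")
```

Majority voting can mask a replica that returns a wrong answer, which is where running different implementations of the same service becomes useful.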

[Diagram, shown on two slides: nodes on a MOST bus, each running Qnet alongside its own services. One node: FLASHFSYS, TCP/IP, BlueTooth and two apps. One node: FLASHFSYS, Graphics, Browser, Audio, Photon. One node: CDROMFSYS, Graphics, Browser, Audio, Photon]

49

Reliability and Availability

50

Why?

Embedded systems are different!

Failure in an embedded system can have severe effects - like death …

“Pilots really hate to be told they have to reboot their plane while in flight”
Walter Shawlee

51

Definitions

MTBF: Mean Time Between Failure
– The average number of hours between failures for a large number of components over a long time. (e.g. MIL-HDBK-217)

MTTR: Mean Time To Repair
– Total amount of time spent performing all corrective maintenance repairs divided by the number of repairs

MTBI: Mean Time Between Interruptions
– The average number of hours between failures while a redundant component is down.

52

Defining HA

Reliability
– Quantified by failure rate (MTBF)
– Time to resume service after failure is MTTR

Availability
– Allows for failure, with quick service restoration.
– As MTTR → 0, Availability → 100%

5 Nines
– < 5 minutes downtime / year (> 99.999% uptime)
– Assume faults exist: design to contain, notify, recover and restore rapidly
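The relationship between MTBF, MTTR and availability can be made concrete with the standard steady-state formula, availability = MTBF / (MTBF + MTTR). The formula is not printed on the slide; it is the usual textbook definition, and the function names and the 365.24-day year below are assumptions.

```python
MINUTES_PER_YEAR = 365.24 * 24 * 60  # assumed year length


def availability(mtbf_hours, mttr_hours):
    """Fraction of time the system is up, from MTBF and MTTR in hours."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail):
    """Expected minutes of downtime per year for a given availability."""
    return (1.0 - avail) * MINUTES_PER_YEAR
```

For example, `availability(999, 1)` is 0.999 (three nines), and driving MTTR towards zero pushes the result towards 1.0 regardless of MTBF. An availability of 0.99999 works out to roughly five minutes of downtime per year.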

53

Annual Cost of Downtime versus Availability

Annual availability   Annual losses
99%                   $68,372,928
99.9%                 $6,837,293
99.99%                $683,729
99.999%               $68,373

Source: Gartner Group ($13,000/minute Cross-industry Average)

Costs speak for themselves
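The figures on this slide follow directly from the quoted $13,000 per minute. The sketch below reproduces them; the 365.24-day year is an assumption, chosen because it matches the slide's totals, and the function name is made up.

```python
COST_PER_MINUTE = 13_000             # USD, the slide's cross-industry average
MINUTES_PER_YEAR = 365.24 * 24 * 60  # assumed; this year length matches the slide


def annual_loss(availability_percent):
    """Annual downtime cost in USD, rounded to the nearest dollar."""
    down_fraction = (100 - availability_percent) / 100
    return round(down_fraction * MINUTES_PER_YEAR * COST_PER_MINUTE)
```

Each additional nine cuts the annual loss by a factor of ten, from about $68 million at 99% to about $68 thousand at 99.999%.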

54

Availability via Reliability and Repair

low MTTR -> high availability
– System is composed of reliable components, that are protected from each other, and that communicate ONLY through well known interfaces.

this leads to
– fault isolation
– speedy recovery
– reset a component not a board/system
– dynamic control
  • stop/start
  • upgrade

55

Software vs Hardware HA

Hardware HA
– utilizes redundancy of key components
  • a single fault cannot cause all redundant components to fail (No SPOF). e.g. mirrored disks, multiple system boards, I/O cards
– Active/active, active/spare, active/standby

Software is a Significant Cause of Downtime

But that’s only part of the problem!!!

56

Comparison

Cause            Share
Software Fault   40%
Planned Outage   30%
Operator Error   15%
Environment      5%
Hardware         10%

57

High Level Look at a Core Router/Switch

One or more control elements

[Diagram: equipment shelf with slots 1 to 20 holding OCLD cards (1W to 4W and 1E to 4E), OCI cards (1A to 4B), OCM cards (A and B), a shelf processor and a filler; above and below the shelf are a maintenance panel, a fiber management trough, an optical multiplexer tray (OMX) and a cooling unit]

58

Handling Failures

[Diagram: the same equipment shelf (OCLD, OCI and OCM cards, shelf processor, filler, maintenance panel, fiber management trough, optical multiplexer tray, cooling unit)]

Isolate Fault to a Board

Switch to Backup

59

[Diagram: the same equipment shelf (OCLD, OCI and OCM cards, shelf processor, filler, maintenance panel, fiber management trough, optical multiplexer tray, cooling unit)]

[Diagram: software stack on one board. Hardware at the bottom, RTOS above it, then Route Manager, TCP/IP stack, SNMP Manager, Flash Drivers, Device Manager, Network Manager and several applications]

Isolate fault to a SW component

May not be in the Hardware

60

Ideal: Identify and Fix

[Diagram: the same software stack (Route Manager, TCP/IP stack, SNMP Manager, Flash Drivers, Device Manager, Network Manager, applications, RTOS) with one component marked “Faulty Software Component”]

• Isolate and contain
• Repair (e.g. restart)
• Notify
• Diagnose
• Upgrade

61

Component-level recovery rarely done
– Lack of suitable protection and isolation
– Lack of modularity
– Tight component coupling
– Few dynamic capabilities

Software failures normally handled by:
– Hardware watchdogs
– Redundant boards

62

Repair Time

Board Replacement       Hours
Reboot                  Minutes
Failover to Standby     Seconds
SW Component Restart    10’s of milliseconds
SW Failover             Milliseconds

63

High Availability Manager

[Diagram: microkernel with TCP/IP, FLASHFSYS, DISKFSYS, ATM and HA Manager processes]

– Process memory violation
– Kernel notifies HA Manager
– Dump file for post-mortem analysis
– HA Manager restarts service

64

Notification and Recovery

[Diagram: HAM and HAM Guardian alongside a driver, a stack and an app; checkpointed state shown for HAM and for the app]

HA Manager (HAM) monitors components, sends notification of component failure

Heart-beat services detect component hangs

Core file on crash can be created for debugging and analysis

Checkpointing permits recovering current state
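Heartbeat-based hang detection can be sketched in a few lines. This is not the QNX HAM API: the `HeartbeatMonitor` class, its methods and the restart callback are hypothetical, and time is passed in explicitly so the behaviour is deterministic.

```python
class HeartbeatMonitor:
    """Flags components whose last heartbeat is older than `timeout` seconds."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}  # component name -> time of last heartbeat

    def heartbeat(self, component, now):
        """Record that `component` reported in at time `now`."""
        self.last_seen[component] = now

    def check(self, now, restart):
        """Call restart(name) for each hung component; return their names."""
        hung = sorted(name for name, seen in self.last_seen.items()
                      if now - seen > self.timeout)
        for name in hung:
            restart(name)
            # Give the restarted component a fresh window before the next check.
            self.last_seen[name] = now
        return hung
```

A component that crashes outright can be reported by the kernel, as on the previous slide; the heartbeat covers the other case, where a component is still present but has stopped making progress.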

65

Recovery

• A second “shadow” server attaches to the same name

66

Recovery

• A second “shadow” server attaches to the same name
• If primary faults, new clients connect to shadow server
• Old clients can re-connect to shadow server.

67

Recovery

• Start a new “shadow” server

68

Service Upgrades

[Diagram: Client connected to Server v 1.0 at /dev/service; New Client connected to Server v 1.1 at /dev/service]

New version of server attaches to same name

New clients connect to new server

Old server exits when all old clients have exited
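The upgrade sequence can be modelled with a small name table. This is a sketch under stated assumptions, not the QNX pathname manager: the `NameTable` class, its methods and the dictionary-based server records are invented for illustration.

```python
class NameTable:
    """Maps a pathname such as /dev/service to the servers attached to it."""

    def __init__(self):
        self.servers = {}  # name -> list of server records, newest last

    def attach(self, name, version):
        """Register a server under `name`; several versions may coexist."""
        server = {"version": version, "clients": set(), "running": True}
        self.servers.setdefault(name, []).append(server)
        return server

    def connect(self, name, client):
        """New clients always connect to the most recently attached server."""
        server = self.servers[name][-1]
        server["clients"].add(client)
        return server

    def disconnect(self, name, client):
        """Drop a client; a superseded server exits once it has no clients."""
        for server in self.servers[name]:
            server["clients"].discard(client)
        for server in self.servers[name][:-1]:
            if not server["clients"]:
                server["running"] = False
        self.servers[name] = [s for s in self.servers[name] if s["running"]]
```

The same mechanism underlies the shadow-server recovery slides: because several servers can attach to one name, clients never need to know which instance they are talking to.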

69

QNX Momentics Tools

70

Design Goals

Tools needed to be easy to learn

Tools which could take advantage of QNX

Tools which could integrate tools from other vendors, company designed tools, and industry specific tools and have them work with our tools and each other

Tools needed to be customizable to the user or the company

71

The Best Tools and the Best RTOS

[Diagram: QNX® Momentics. Host side (Windows, Solaris, QNX Neutrino): IDE Workbench (Eclipse framework) with C/C++ code developer, Java code developer, source debugger, profiler, memory analysis, target information, system builder and Photon app builder; command-line tools, BSPs, DDKs and Neutrino runtime; 3rd-party tools (Virtio, Rational, … TBA) that invoke the command-line tools. Connection to the target over Ethernet, Serial, JTAG or ROMulator. Target side (XScale): QNX® Neutrino® RTOS microkernel with target agent, Photon microGUI, Flash fsys, TCP/IP, HTTP server and Java]

72

QNX IDE: Standards based

– IBM donated Framework
– Java IDE
– 200 person-years of effort
– Open Source

Consortium founding members include

73

System Profiling

74

Systems Analysis Toolkit

[Diagram: the instrumented microkernel traces the application, protocol (TCP/IP) and device driver into a system event log, which feeds data display, statistical & numerical analysis and a printer]

System Events
• interrupts
• scheduler
• messages
• system calls

System Characterization
• Performance analysis
• Field diagnostic
• Live or post-mortem

Providing Technology for Today… Architecture for Tomorrow

Irvine Office - 949-727-0444
David Weintraub - Regional Sales Manager
dweintraub@qnx.com

Woodland Hills Office - 818-227-5105
Jeff Schaffer - Sr. Field Applications Engineer
jpschaffer@qnx.com
