Real-Time ORB Middleware: Standards, Applications, and Variations
Christopher Gill
Center for Distributed Object Computing
Department of Computer Science and Engineering
Washington University, St. Louis, MO
Research supported in part by DARPA contracts F33615-01-C-1898 (NEST); and F33615-00-C-3048 and F33615-03-C-4111 (PCES)
Research conducted in collaboration with colleagues at Washington University, Vanderbilt University, University of Kansas, University of Rhode Island, Ohio University, OOMWorks, Boeing, BBN, Honeywell, and Tech-X
2 - Chris Gill – 04/20/23
Main Themes
Standards enforce commonality» Specify interfaces, etc., on which applications can rely
Applications are heterogeneous» Which standards are relevant may vary from app to app» Apps may rely on different subsets of standard features
What if commonality & heterogeneity don’t match?» E.g., app needs a feature the standard doesn’t address» E.g., a needed feature may conflict with specified ones
Developing and using standards-based middleware effectively demands attention to these issues (especially if time, space, reliability are involved)
Motivating Example: Avionics Mission Computing
In-flight collaboration between aircraft personnel» Exchange imagery and annotations over a wireless network
Trade-offs between image quality and transfer latency » Managed adaptively during download, to ensure timeliness
Why use CORBA, and for what parts of the system?» For DOC between Ada/ORBExpress server and C++/TAO client» For prioritization of OFP and image handling operations on client» For adaptive rate-based scheduling on client
[Figure: server side and client side connected by a low-bandwidth radio link. Labels: image server; virtual folder, images; transmission middleware; adaptation middleware; cockpit displays]
Collaborative research with Boeing, BBN, Honeywell Technology Center, supported by Boeing/AFRL contract F33615-97-D-1155/0005 (WSOA)
Outline: Three Illustrative Technology Studies
Real-Time CORBA 1.0» Location/language transparency, low latency, priorities» Trade-offs in time, footprint, and features
Lightweight CCM» Component assembly, deployment, (re-)configuration» Trade-offs in the timeliness of configuration itself
Real-Time CORBA 1.2» Pluggable dynamic scheduling, distributable threads» Trade-offs in flexibility, overhead, and mechanisms
Technology Study I: Real-Time CORBA 1.0
Location/language transparency
Low latency
Static priorities
Trade-offs» time, footprint, and features
CORBA Location/Language Transparency
IDL provides type safety between client and server
A client obtains an interoperable object reference (IOR)» Encodes IP address, port, object ID, etc.
A wire format is defined for invocation messages» Client stubs marshal, server skeletons un-marshal messages
Other details are left as ORB implementation features» How to combine threads, sockets, event de-multiplexers, etc.» ORB developers can (and should) exploit this design freedom
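The stub and skeleton roles listed above can be sketched in standard C++. This is only an illustration under simplifying assumptions: the `Request` struct, the `marshal` and `unmarshal` helpers, and the length-prefixed big-endian layout are hypothetical and are not the GIOP/CDR wire format a real ORB uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical request: which object, which operation, one integer argument.
struct Request {
  std::string object_id;
  std::string operation;
  std::int32_t argument;
};

using Buffer = std::vector<std::uint8_t>;

// Append a 32-bit value in big-endian byte order.
inline void put_u32(Buffer& out, std::uint32_t v) {
  for (int shift = 24; shift >= 0; shift -= 8)
    out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xFF));
}

// Read a 32-bit big-endian value and advance the cursor.
inline std::uint32_t get_u32(const Buffer& in, std::size_t& pos) {
  std::uint32_t v = 0;
  for (int i = 0; i < 4; ++i) v = (v << 8) | in.at(pos++);
  return v;
}

// Strings are written as a 32-bit length followed by the raw bytes.
inline void put_string(Buffer& out, const std::string& s) {
  put_u32(out, static_cast<std::uint32_t>(s.size()));
  out.insert(out.end(), s.begin(), s.end());
}

inline std::string get_string(const Buffer& in, std::size_t& pos) {
  const std::uint32_t n = get_u32(in, pos);
  std::string s;
  for (std::uint32_t i = 0; i < n; ++i)
    s.push_back(static_cast<char>(in.at(pos++)));
  return s;
}

// Client-side "stub" role: turn a typed call into bytes.
inline Buffer marshal(const Request& r) {
  Buffer out;
  put_string(out, r.object_id);
  put_string(out, r.operation);
  put_u32(out, static_cast<std::uint32_t>(r.argument));
  return out;
}

// Server-side "skeleton" role: recover the typed call from bytes.
inline Request unmarshal(const Buffer& in) {
  std::size_t pos = 0;
  Request r;
  r.object_id = get_string(in, pos);
  r.operation = get_string(in, pos);
  r.argument = static_cast<std::int32_t>(get_u32(in, pos));
  return r;
}
```

Because both sides agree only on the byte layout, the client and server could in principle be written in different languages, which is the point of defining a wire format.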
[Figure: Client and Stub on one ORB, Skeleton and Servant on another ORB; the client holds an object reference and the ORBs exchange an IIOP message]
Exploiting Design Freedom for Low Latency
Re-use portable, type-safe, efficient mechanisms» Concurrency, communication, event demultiplexing, etc. » Available for many POSIX-like OS platforms» Also RTOS: VxWorks, LynxOS, KURT-Linux/LibeRTOS
Compose to avoid blocking, queueing, locking, etc.
E.g., ACE Framework
Real-Time CORBA 1.0: Static Priorities
Lanes enforce priority separation between threads
Set minimum (static) and additional (dyn) # of threads
Set stack size, use of thread borrowing, request buffering
// Define two lanes
RTCORBA::ThreadpoolLane high_priority =
  {10 /* Prio */, 3 /* Static Threads */, 0 /* Dyn Threads */};
RTCORBA::ThreadpoolLane low_priority =
  {5 /* Prio */, 2 /* Static Threads */, 2 /* Dyn Threads */};
RTCORBA::ThreadpoolLanes lanes (2);
lanes.length (2);
lanes[0] = high_priority;
lanes[1] = low_priority;
RTCORBA::ThreadpoolId pool_id =
  rt_orb->create_threadpool_with_lanes
    (1024 * 10,    // Stack size
     lanes,        // Thread pool lanes
     false,        // No thread borrowing
     false, 0, 0); // No request buffering
[Figure: Thread Pool with Lanes, one lane at priority 10 and one lane at priority 5]
When Trade-Offs Impinge on a Standard
Active Damage Detection on structures (e.g., aircraft tail)
Ping nodes create vibrations that are measured by sensors
Computational nodes do analysis, schedule other nodes
DOC middleware can help ease programming complexity
Crucial trade-offs in time vs. footprint vs. features
Can (and should) ORB developers stay within the standard?
[Figure: structure with embedded or bonded piezoelectric transducers (nodes 1 to 4) exchanging acoustic waves (kHz range)]
Design Challenges
General purpose middleware aims at supporting a wide variety of applications» Tends to support a breadth of alternative features
Extra features may impact some applications» E.g., footprint in memory-constrained networked embedded systems demanding real-time assurances
Need to study and select middleware features based on application requirements
Fundamental tension between» Generality/standardization» Application-specific customization
Critical Path Analysis and Trade-offs in nORB
[Figure: critical path of a remote call in nORB. Client side: stub code (using ACE_CDR) marshals parameters and makes the remote call through the ORB (Reactor, Acceptor, Connection Cache). Server side: the ORB (Reactor, Acceptor, Connection Cache) hands the request to a Simple Object Adapter for operation lookup and dispatch (Foo(), Bar()); skeleton code (using ACE_CDR) unmarshals parameters and calls the implementation. Numbered trade-offs on the path:]
1) Could be avoided for homogeneous nodes
2) Only a subset of GIOP messages
3) Simple life cycle management
4) Hash-table vs. linear search
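The hash-table vs. linear search trade-off for operation lookup can be illustrated with two small dispatch tables. This is a hedged sketch in standard C++, not nORB code: `LinearDispatcher`, `HashDispatcher`, and the `foo`/`bar` handlers are hypothetical names, and the slides do not state which variant nORB uses in which configuration.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical servant operations; each returns a code for illustration.
inline int foo() { return 1; }
inline int bar() { return 2; }

using Operation = int (*)();

// Linear search: very little supporting code, O(n) comparisons per request.
class LinearDispatcher {
public:
  void add(const std::string& name, Operation op) {
    ops_.emplace_back(name, op);
  }
  Operation find(const std::string& name) const {
    for (const auto& entry : ops_)
      if (entry.first == name) return entry.second;
    return nullptr;  // unknown operation
  }
private:
  std::vector<std::pair<std::string, Operation>> ops_;
};

// Hash table: O(1) expected lookup, at the cost of a larger data structure.
class HashDispatcher {
public:
  void add(const std::string& name, Operation op) { ops_[name] = op; }
  Operation find(const std::string& name) const {
    auto it = ops_.find(name);
    return it == ops_.end() ? nullptr : it->second;
  }
private:
  std::unordered_map<std::string, Operation> ops_;
};
```

For an interface with only a handful of operations, the linear variant may be both smaller and fast enough, which is the kind of footprint vs. time decision the slide points at.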
Footprint Comparison: ACE, nORB, TAO
Footprint in KB
Configuration            Node   NodeRegistry
ACE                      376    324
TAO                      1800   1778
nORB                     567    549
compile optimized TAO    1738   1725
compile optimized nORB   509    492
ACE costs 212KB; nORB+ACE costs 345KB; TAO+ACE costs ~1.7MB
Node application code alone costs 164KB
Ping Scheduling Algorithm Convergence Time
Technology Study I: Summary
The CORBA standard promotes DOC programming» Portable, interoperable, language/location transparent» Gives ORB developers freedom to optimize/strategize
The RT-CORBA 1.0 standard adds real-time QoS» E.g., thread pools, prioritized lanes, etc.» Here too, design freedom is crucial, e.g., for low latency
However, some application contexts raise issues» E.g., with stringent memory and RT constraints, how crucial is strict standards compliance to developers?» Minimum CORBA, other specifications acknowledge this» Further attention to “degrees of compliance” may help
Technology Study II: Lightweight CCM
Component assembly, deployment, re-configuration
Some applications require optimization and trade-offs in the timeliness of configuration itself
Rethink deployment/configuration lifecycle » Must fit within stringent system initialization bounds
A Review of RT-DOC Middleware Evolution
Distributed Object Computing (DOC) Middleware» E.g., CORBA, Java RMI» Simplifies client-side programming via location (language) independence
Real-Time DOC Middleware» E.g., Real-Time CORBA 1.0, 1.2» Enforces real-time properties between client and server
Component Middleware» E.g., CORBA Component Model (CCM), EJB/J2EE» Simplifies server programming through declarative configuration
Real-Time Component Middleware» E.g., the Component-Integrated ACE ORB (CIAO), QoS EJB» Enforces configured real-time properties within server itself» Are the configuration activities themselves real-time?
Motivating Example Application
Simple component application from avionics domain (Boeing)
Represents many other distributed real-time applications
Application composed flexibly via component middleware
Real-time (and other) aspects can be configured this way» E.g., RT-CORBA policies, thread pools, replicas for fault tolerance, etc.
Real-time bounds on configuration itself may matter as well» E.g., minimum initialization time when system is (re-)started» Constrains timing of component assembly and deployment stages
Static vs. Dynamic Configuration
Dynamic Configuration» Component assembly & deployment uses DLLs, XML parsing» Problems: parsing/loading time; no support for .so/.dll libs on some platforms (e.g., VxWorks)
Static Configuration» Move as much off-line as possible» Focus on preserving only run-time flexibility that is needed» Use static linking to “load” implementations» Use run-time drivers to configure implementations at initialization
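The static approach can be sketched as a factory table fixed at build time plus a small run-time driver. This is an illustrative sketch in standard C++ under stated assumptions, not CIAO's actual mechanism: `Component`, `GpsComponent`, `DisplayComponent`, `kFactories`, and `configure` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical component interface.
struct Component {
  virtual ~Component() = default;
  virtual std::string name() const = 0;
};

struct GpsComponent : Component {
  std::string name() const override { return "GPS"; }
};

struct DisplayComponent : Component {
  std::string name() const override { return "Display"; }
};

using Factory = std::unique_ptr<Component> (*)();

inline std::unique_ptr<Component> make_gps() {
  return std::unique_ptr<Component>(new GpsComponent);
}

inline std::unique_ptr<Component> make_display() {
  return std::unique_ptr<Component>(new DisplayComponent);
}

// Table fixed at build time: static linking "loads" the implementations,
// so no DLL loading or XML parsing is needed at start-up.
struct FactoryEntry {
  const char* type;
  Factory create;
};

static const FactoryEntry kFactories[] = {
  {"GPS", &make_gps},
  {"Display", &make_display},
};

// Run-time driver: instantiate the component types named in the plan,
// in plan order; types with no linked factory are skipped.
inline std::vector<std::unique_ptr<Component>>
configure(const std::vector<std::string>& plan) {
  std::vector<std::unique_ptr<Component>> instances;
  for (const auto& type : plan)
    for (const auto& entry : kFactories)
      if (type == entry.type) instances.push_back(entry.create());
  return instances;
}
```

The plan passed to `configure` is the only flexibility preserved at run time; everything else is decided off-line, when the table is compiled and linked.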
Static vs. Dynamic Configuration Experiments
Compared performance of static and dynamic configuration» Used example avionics domain application
Goal: identify sources of performance difference
Tests were run on a single machine» Pentium IV 2.5GHz CPU, 500MB RAM, 512KB cache» OS was Linux 2.4.18 with KURT-Linux patches applied (supports DLLs for the dynamic configuration approach; offers good real-time predictability for experiments)» Experiments used CIAO 0.4.1 / TAO 1.4.1 / ACE 5.4.1
Time for Application Assembly
Without RT features» msec vs. 100s of msec» 2 orders of magnitude
With RT features» Constant additional overhead» Greater relative cost at low orders of magnitude
Differences attributed to» Loading DLLs, spawning processes (most expensive)» XML parsing on-line (secondary)
Component Server Creation Time
Server configuration is 2nd largest contributor to performance differences» 100s vs. 10s of msec
Static gives a baseline» Most of time was spent hooking RT CORBA features into server» 2 orders of magnitude less for non-RT version
[Figure label: Configuring RT-CORBA features]
Home Creation Time
Homes manage component instances
Configuring homes less expensive than» application assembly» component server
Loaded vs. linked homes account for the difference
Real-time features» Didn’t increase the total time significantly
CIAO vs. PRISM Configuration
CIAO’s static configuration similar to Boeing’s PRISM domain-specific component middleware» But configuration steps and flexibility/cost differ significantly» CCM (Extension Interface pattern) vs. C++ (Façade pattern) model
CIAO vs. PRISM Configuration Experiments
Platform details» Motorola 5110-2263 VME board» MPC7410 500MHz processor w/ 512 MB RAM» VxWorks 5.4.2» Post x.4 (pre-release) version of CIAO w/ static configuration
High resolution time measurement used two tick counters» 5msec resolution: VxWorks tickGet()» 40ns resolution: VxWorks sysTimestamp()
PRISM/CIAO Home Creation Time
PRISM homes» C++ objects: memory allocation, object initialization
CIAO homes» C++ object costs …» … plus CORBA initialization costs: home activation, etc.
PRISM/CIAO Component Creation Time
Again see C++ vs. CORBA differences
Most expensive step in static CIAO configuration
Still well bounded» for all but the finest time scales
[Figure labels: bounded by 4 msec; component activation, etc.]
PRISM/CIAO Connection Establishment Time
Least expensive configuration step
Again reflects C++ vs. CORBA differences
Trade-off is between performance and flexibility
[Figure label: CORBA connection setup cost]
Technology Study II: Summary
Static approach gives real-time configuration» Avoids costs/features that hamper real-time behavior
Main costs are DLL loading, spawning processes» Concentrated in application assembly, server creation» Intermediate design point: limited on-line XML parsing?
PRISM & CIAO differ somewhat in flexibility, cost» C++ based components vs. CORBA components» Intermediate design point: mixture of object types?
Static configuration capabilities described here are available as open-source within DAnCE» Implement Deployment & Configuration specification
» http://deuce.doc.wustl.edu/Download.html
Technology Study III: Real-Time CORBA 1.2
Distributable threads
Pluggable dynamic scheduling
Trade-offs in flexibility, overhead, mechanisms
Motivation
More evolution of middleware programming model» Distributable threads are natural for certain applications (i.e., those with long-running distributed sequential activities); may also help with distributed scheduling, load balancing, etc.» Integrated with pluggable/dynamic scheduling semantics
Design and implementation goals» Flexible on-the-fly adaptation of real-time properties» Preserve info on paths a distributable thread traverses» Provide efficient, rigorous enforcement mechanisms
RT-CORBA 1.2 Implementation in TAO
Implementation of Distributable Threads» Thread identity and cancellation design considerations (give the application better control of concurrency overall; OS vs. distributable thread identity issues and approach; cancellation interface and its implementation)
Dynamic scheduling service framework» Flexible interface between scheduler and application» OS and middleware based priority scheduler implementations
Benchmarks» Quantify cost of managing distributable, OS thread ids» Compare OS, middleware scheduling techniques
RT-CORBA 1.2 Concepts
Distributable thread – distributed concurrency abstraction
Scheduling segment – governed by a single scheduling policy
Locus of execution – where distributable thread is currently running
Dynamic schedulers – enforce distributable thread eligibility
[Figure: a distributable thread runs from a client through IDL stubs, the ORB core, the object adapter, and IDL skeletons to an object (servant), carrying a service context. Labels: scheduling segment A (BSS-A to ESS-A) and scheduling segment B (BSS-B to ESS-B); current locus of execution; segment scheduling policies (A: EDF, B: MUF); a dynamic scheduler on each side]
Intro to RTC2 Distributable Threads
With only 2-way CORBA invocations, distributable threads behave much like traditional OS threads» But can move (with their context) from one endsystem to another» Cross through different resource scheduling domains
Distributable threads contend with OS threads, each other» With locking, effect can span endsystems, though scheduling is local
[Figure: distributable threads DT1, DT2, and DT3 traverse Host 1, Host 2, and Host 3 via 2-way invocations, each passing through scheduling segments delimited by BSS/ESS pairs (A through E)]
Creating Distributable Threads
Distributable threads can be created three different ways» An application thread calling BSS outside a distributable thread» A distributable thread calling the spawn() method» A distributable thread making an asynchronous (one-way) invocation
New distributable thread inherits scheduling parameters
[Figure: on Host 1, DT1 calls spawn() to create DT2; a segment BSS-A to ESS-A and a 1-way invocation involve Host 2; DT3 and DT4 appear on Host 3]
Distributable Thread Path Example
Scheduler upcalls at several points on path» At creation of a new distributable thread» At BSS, USS, ESS calls» When GIOP request is sent» Receipt of GIOP request» When GIOP reply is sent» Receipt of GIOP reply
In each upcall, scheduling information is updated» Additional interception points can (and sometimes should) be supported by the ORB and the scheduler/policy
[Figure: path of Operation() (in args; out args + return value) from the client through IDL stubs, the ORB core, the object adapter, and IDL skeletons to the object (servant), with the dynamic scheduler and service context; the numbered upcall points are listed below]
1. BSS - RTScheduling::Current::begin_scheduling_segment() or RTScheduling::Current::spawn()
2. USS - RTScheduling::Current::update_scheduling_segment()
3. ESS - RTScheduling::Current::end_scheduling_segment()
4. send_request() interceptor call
5. receive_request() interceptor call
6. send_reply() interceptor call
7. receive_reply() interceptor call
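The order of upcalls for a single two-way invocation can be sketched with a pluggable scheduler interface. This is a minimal sketch in standard C++: the `Scheduler` and `TracingScheduler` classes and their method names are hypothetical stand-ins chosen for illustration, not the RTScheduling IDL interfaces or TAO's implementation.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical pluggable scheduler interface: one upcall per point on the path.
class Scheduler {
public:
  virtual ~Scheduler() = default;
  virtual void begin_segment(const std::string& name) = 0;  // BSS
  virtual void end_segment(const std::string& name) = 0;    // ESS
  virtual void send_request() = 0;
  virtual void receive_request() = 0;
  virtual void send_reply() = 0;
  virtual void receive_reply() = 0;
};

// A scheduler that only records its upcalls. A real policy would update
// scheduling information (e.g., deadlines or importance) at each point.
class TracingScheduler : public Scheduler {
public:
  void begin_segment(const std::string& name) override {
    trace.push_back("BSS " + name);
  }
  void end_segment(const std::string& name) override {
    trace.push_back("ESS " + name);
  }
  void send_request() override { trace.push_back("send_request"); }
  void receive_request() override { trace.push_back("receive_request"); }
  void send_reply() override { trace.push_back("send_reply"); }
  void receive_reply() override { trace.push_back("receive_reply"); }

  std::vector<std::string> trace;
};

// Simulate one two-way invocation made inside scheduling segment "A":
// the client-side scheduler sees BSS, the request going out, the reply
// coming back, and ESS; the server-side scheduler sees the other two points.
inline void two_way_invocation(Scheduler& client, Scheduler& server) {
  client.begin_segment("A");
  client.send_request();
  server.receive_request();
  server.send_reply();
  client.receive_reply();
  client.end_segment("A");
}
```

Swapping `TracingScheduler` for another subclass changes the policy without touching the invocation path, which is the sense in which the scheduler is pluggable.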
Middleware Based Scheduling
[Figure: a ready queue of distributable threads ordered by importance (15, 10), each waiting on its own condition variable (CV); a new distributable thread with importance 8 is added, giving a queue of 15, 10, 8]
Benefit: scales in # of distributable threads per OS thread
Drawback: queue management costs for some policies
Alternatives: 1:1 OS:distributable thread, lanes, groups
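The importance-ordered ready queue can be sketched as a small data structure. This is a simplified illustration in standard C++ under stated assumptions: `ReadyQueue` and its methods are hypothetical, higher importance is assumed to mean more eligible, and the per-thread condition variables that would block and wake waiting threads are deliberately left out.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Ready queue of distributable threads, ordered by importance
// (assumption: a higher value is more eligible to run).
class ReadyQueue {
public:
  // Insert a distributable thread; cost grows with queue length,
  // which is the queue management cost noted above.
  void enqueue(int importance, const std::string& dt_id) {
    queue_.emplace(importance, dt_id);
  }

  bool empty() const { return queue_.empty(); }

  // Remove and return the most eligible distributable thread.
  // In a full implementation this is where its condition variable
  // would be signaled so that the thread resumes execution.
  std::string dispatch() {
    assert(!queue_.empty());
    auto head = queue_.begin();
    std::string id = head->second;
    queue_.erase(head);
    return id;
  }

private:
  // std::greater keeps the highest importance at begin().
  std::multimap<int, std::string, std::greater<int>> queue_;
};
```

With entries of importance 15 and 10 already queued, enqueueing a thread of importance 8 places it last, matching the queue order shown in the figure.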
Middleware/OS Scheduling Benchmark
Simple comparison of OS and middleware scheduling
Both approaches show reasonable control at a resolution of seconds
Notice some latency in last transition in middleware approach
This OS/middleware difference is characteristic of other dynamic scheduling approaches (e.g., Group Scheduling)
[Figure: traces for OS-level scheduling and middleware-level scheduling, with the difference marked as Δ latency]
Thread Identity and Cancellation Issues
[Figure: a distributable thread (DT) spans Host 1 and Host 2, each running an RTCORBA 2.0 scheduler, and is identified on each host by a <GUID, TID> pair. Annotations: binding of a single DT to two different OS threads; DT carries scheduling parameters with it; can cancel from either endsystem]
Other mechanisms affect real-time performance, as well» Managing identities of distributable and OS threads» Configuring and using mechanisms sensitive to thread identity» Supporting safe and efficient cancellation of thread execution
Thread Specific Storage (TSS) Example
A distributable thread can use thread-specific storage» Avoids locking of global data
OS provided TSS is efficient, uses OS thread id
However, distributable thread may span OS threads
Solution: TSS emulation based on <GUID,tid> pair
What is TSS emulation cost compared to OS TSS?
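The identity problem can be made concrete with a small model. This is a hedged sketch in standard C++, not TAO's emulation: `Binding`, `Tss`, and the numeric ids are hypothetical, OS threads are modeled as plain ids rather than real threads, and the emulated store is assumed to look slots up by the GUID half of the <GUID,tid> pair so that data survives a change of OS thread.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <utility>

using TssKey = int;

// Identity of the running code: which distributable thread (guid),
// currently bound to which OS thread (tid). Both ids are hypothetical.
struct Binding {
  std::uint64_t guid;
  std::uint64_t tid;
};

// One store, two lookup modes: keyed by OS thread id (native-style TSS)
// or keyed by distributable thread GUID (emulated TSS).
class Tss {
public:
  explicit Tss(bool key_by_guid) : key_by_guid_(key_by_guid) {}

  void write(const Binding& b, TssKey key, int value) {
    slots_[slot(b, key)] = value;
  }

  // Returns true and sets value if the slot exists for this identity.
  bool read(const Binding& b, TssKey key, int& value) const {
    auto it = slots_.find(slot(b, key));
    if (it == slots_.end()) return false;
    value = it->second;
    return true;
  }

private:
  std::pair<std::uint64_t, TssKey> slot(const Binding& b, TssKey key) const {
    return std::make_pair(key_by_guid_ ? b.guid : b.tid, key);
  }

  bool key_by_guid_;
  std::map<std::pair<std::uint64_t, TssKey>, int> slots_;
};
```

If distributable thread 7 writes a slot while bound to OS thread 100 and later runs on OS thread 200 of the same endsystem, the tid-keyed store no longer finds the value while the GUID-keyed store does; the extra map lookup is the kind of cost the benchmark question refers to.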
[Figure: distributable threads DT 1 and DT 2 on Host 1 and Host 2, bound to OS threads 1 and 2, performing tss_write and tss_read]
TSS Emulation Benchmarks
[Figure: TSS Key Create. Time (nsec, 0 to 4000) vs. number of keys created (1 to 501); series: Emulated, Native OS]
[Figure: TSS Write/Read. Time (nsec, 0 to 3000) vs. number of successive iterations (1 to 1001); series: Emulated Write, Emulated Read, Native OS Write, Native OS Read]
Pentium tick timestamps» nsec resolution on 2.8 GHz P4, 512KB cache, 512MB memory» RedHat 7.3, real-time class» Called create repeatedly» Then, called write/read repeatedly on one key
Upper graph shows scalability of key creation» Scales linearly with number of keys in OS, ACE TSS» Emulation cost ~2usec more per key creation
Lower graph shows the emulated write costs ~1.5usec, read ~.5usec more
Distributable Thread Cancellation
Context: distributable thread can be cancelled to save cost
Problem: only safe to cancel» on an endsystem that is in the thread’s run-time “call stack”» when thread is at a safe preemption point
Solution: cancellation is» invoked via cancel method on distributable thread instance» handled at next scheduling point (scheduler upcall)
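Deferred cancellation can be sketched as a flag that is set by `cancel` and acted on only at scheduling points. This is an illustrative sketch in standard C++: the `DistributableThread` class here and the `run` helper are hypothetical and do not reproduce the RT-CORBA 1.2 interface or the cross-host propagation step.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical distributable thread handle with deferred cancellation.
class DistributableThread {
public:
  // May be called at any time; it only records the request.
  void cancel() { cancel_requested_.store(true); }

  // Called at each scheduling point (scheduler upcall). The pending
  // request is processed here, where stopping is safe.
  // Returns false once the thread must stop.
  bool scheduling_point() {
    if (cancel_requested_.load()) cancelled_ = true;
    return !cancelled_;
  }

  bool cancelled() const { return cancelled_; }

private:
  std::atomic<bool> cancel_requested_{false};
  bool cancelled_ = false;
};

// Run up to `steps` units of work with a scheduling point before each
// unit. A cancel request arrives in the middle of unit `cancel_during`
// (pass -1 for no cancel). Returns the number of units completed.
inline int run(DistributableThread& dt, int steps, int cancel_during) {
  int completed = 0;
  for (int i = 0; i < steps; ++i) {
    if (!dt.scheduling_point()) break;    // cancel takes effect here
    if (i == cancel_during) dt.cancel();  // request arrives mid-unit
    ++completed;                          // the unit in progress still finishes
  }
  return completed;
}
```

The unit of work that is in progress when the request arrives still completes; the thread stops only at the following scheduling point, which is what keeps cancellation safe.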
[Figure: a DT that began at BSS-A spans Host 1, Host 2, and Host 3, where the head of the DT is executing; a cancel of the DT is propagated toward the head and processed at the next scheduling point, after which the DT is cancelled]
Technology Study III: Summary
RT-CORBA 1.2 can give predictable real-time performance
Allows dynamic scheduling of distributable threads
A range of thread management mechanisms matter» must also be designed for real-time performance
RT-CORBA 1.2 implementation in TAO» open-source software, freely available on the web» http://deuce.doc.wustl.edu/Download.html
Concluding Remarks
CORBA Developers balance ongoing trade-offs» Between what standards specify …» And what their applications need
Often many application needs are addressed well» Inter-operability, location/language transparency» Component configuration support» Prioritization, other QoS aspects as well
However, the standards don’t cover everything» Developers must exercise judgment WRT standards: when to adhere, when to augment, when to diverge from them
Sometimes, divergences are the basis for upgrading standards
The key point is that it’s an evolutionary process» Applications try to converge toward standards» Standards try to converge toward applications
For More Information
Avionics application case study» www.cse.wustl.edu/~cdgill/PDF/RTSJ_WSOA.pdf
Small footprint real-time middleware» www.cse.wustl.edu/~cdgill/PDF/rtas04_nORB.pdf
RT-CORBA 1.2» www.cse.wustl.edu/~cdgill/PDF/JBCS_RTC1.2.pdf» www.cse.wustl.edu/~cdgill/PDF/rtas05_DTEC.pdf
Dynamic scheduling» www.cse.wustl.edu/~cdgill/PDF/dynamic.pdf» www.cse.wustl.edu/~cdgill/PDF/embedded_sched.pdf» www.cse.wustl.edu/~cdgill/PDF/rtas05_groupsched.pdf» www.cse.wustl.edu/~cdgill/PDF/rtas05_DSRM.pdf
Real-Time component configuration» www.cse.wustl.edu/~cdgill/PDF/doa04_ciao.pdf» www.cse.wustl.edu/~cdgill/PDF/rtss04_ciao.pdf