◆ Designing Communication Protocol Software for Performance
Shivani Arora
Performance objectives need to be built into the design of time-critical protocols. This letter presents software design and architectural guidelines that could be used to meet performance requirements, and considers other software design requirements, including portability and its effect on performance. © 2006 Lucent Technologies Inc.
Bell Labs Technical Journal 11(1), 203–207 (2006) © 2006 Lucent Technologies Inc. Published by Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com). • DOI: 10.1002/bltj.20154
Introduction
A communication protocol consists of a set of very well defined physical connections, messages and procedures across an open interface to support a set of features. This letter presents attributes to be considered in the design of communication protocol software and explores the trade-offs between performance objectives and other design objectives such as modularity, portability, scalability, and reliability. It is important to understand which design attributes are critical for module development in a particular software package and to make architectural choices and design decisions accordingly. Significant decisions have to be made about the organization of the software, and the selection of the structural elements, their interfaces, and their behaviors. Designers need to consider various factors including the
• Platform (e.g., multi-processor or single processor),
• Protocol stack (e.g., multi-threaded or single threaded),
• Inter-module communication schemes, and
• Buffer management requirements.
Background
To aid this discussion, the message transfer part (MTP) protocol stack [3] is used as an example. The MTP is a part of the Signaling System 7 (SS7) protocol suite. It is a common transport mechanism for distribution of signaling messages between user parts residing on different network nodes of a telecommunications network. SS7 protocol recommendations suggest a layered architecture, very similar to the architecture proposed by Open Systems Interconnection [2], as shown in Figure 1.
The MTP is further divided into three levels. Level 1 (MTPL1) defines mechanisms for physical transfer of data and is typically implemented in hardware. Level 2 (MTPL2), or signaling link protocol, defines procedures for error detection and recovery. Level 3 (MTPL3), or the signaling network layer, handles network management and routing.
Performance Trade-Offs
During the design of software for communication protocols, performance objectives have to be given the utmost importance. Meeting performance objectives demands efficient utilization of resources like CPU and memory. This can impact design decisions regarding functionality breakdown, methods for inter-task communication, formats for inter-task communication, data structures, search algorithms, and detailed task design. When designing for performance, it may not always be possible to satisfy all other design objectives at all times. Each of these design decisions is considered below.
Panel 1. Abbreviations, Acronyms, and Terms
API: Application programmable interface
CPU: Central processing unit
DPC: Destination point code
FSM: Finite state machine
IAM: Initial address message
ISDN: Integrated services digital network
ISUP: ISDN user part
MTP: Message transfer part
MTPL1: MTP level 1
MTPL2: MTP level 2
MTPL3: MTP level 3
REL: Release
SME: Subject matter expert
SS7: Signaling System 7
SU: Signaling unit
TDM: Time division multiplex
Functionality Breakdown
Most communication protocols, including SS7, are defined as layers, so it is generally better to implement the various layers as separate execution threads, tasks, or processes that can be run at different task priorities. Timers running at lower layers are more stringent than those running at higher layers, which imposes stricter real-time response constraints on the lower layers. Under load, the lower layers should therefore be run at higher priority than the higher layers, with higher layer execution pre-empted whenever there is an event for a lower layer. This approach supports modularity and also meets performance requirements. However, it introduces trade-offs, including scheduling delays and the need to handle issues like thread synchronization and inter-thread communication.
[Figure 1. OSI layers versus MTP levels. OSI layers 4–7 correspond to the users of SS7 (example user parts: TC, ISUP, SCCP). OSI layers 3, 2, and 1 correspond to the three MTP levels: signaling network (Level 3), signaling link (Level 2), and signaling data link (Level 1). OSI: Open Systems Interconnection; SCCP: Signaling connection control part; TC: Transaction capability.]
To overcome these issues, a designer may choose to implement each layer as a module within a single task. Real-time response constraints require a design of this nature to be completely re-entrant, and higher layer processing is “shelved” when there is an event for lower layers. To avoid fault propagation, reliability requirements demand rollback only from the layer impacted when errors occur. This requires such tasks to have rollback designed at the module level within the task. Task design that considers both real-time response constraints and reliability requirements is comparatively more complex, and more difficult to understand and maintain. Protection of critical resources is an important design consideration in both cases.
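
As an illustration of the single-task arrangement, the following C sketch keeps one event queue per layer and always services the lowest layer that has pending work, so higher layer events wait whenever a lower layer has something to do. The queue depth, the layer constants, and the function names are assumptions made for this example and are not taken from any particular MTP implementation; re-entrancy, rollback, and locking are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layer indices: a lower index means a lower protocol layer. */
enum { LAYER_MTPL2 = 0, LAYER_MTPL3 = 1, LAYER_USER = 2, LAYER_COUNT = 3 };

#define QUEUE_DEPTH 16  /* assumed depth, for illustration only */

typedef struct {
    int    events[QUEUE_DEPTH];
    size_t head, tail, count;
} event_queue;

/* One fixed-size ring buffer per layer, zero-initialized at start-up. */
event_queue queues[LAYER_COUNT];

/* Queue an event for a layer. Returns 0 on success, -1 if the queue is full. */
int post_event(int layer, int event)
{
    assert(layer >= 0 && layer < LAYER_COUNT);
    event_queue *q = &queues[layer];
    if (q->count == QUEUE_DEPTH)
        return -1;
    q->events[q->tail] = event;
    q->tail = (q->tail + 1) % QUEUE_DEPTH;
    q->count++;
    return 0;
}

/* Fetch the next event to process, scanning from the lowest layer upward.
 * Returns the layer the event belongs to, or -1 if nothing is pending. */
int next_event(int *event_out)
{
    assert(event_out != NULL);
    for (int layer = 0; layer < LAYER_COUNT; layer++) {
        event_queue *q = &queues[layer];
        if (q->count > 0) {
            *event_out = q->events[q->head];
            q->head = (q->head + 1) % QUEUE_DEPTH;
            q->count--;
            return layer;
        }
    }
    return -1;
}
```

In this sketch the main loop of the task would call next_event() repeatedly and hand each event to the module for the returned layer. Lower layers are favored only between events; interrupting a higher layer in the middle of its processing is what makes the re-entrancy requirement described above necessary.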
Inter-Task Communication
All communication protocols involve processing of messages or packets received by each layer from external interfaces. One disadvantage of breaking the MTP into separate threads is the need for message passing between them. Message passing involves posting to queues, which leads to delays. If the same MTP is implemented as a module within one process, the communication between layers can be implemented using function calls or application programmable interfaces (APIs). Such APIs are more CPU-efficient and reduce latency. However, they make the software tightly coupled. Loosely coupled, portable software architectures can meet future scalability requirements by distributing the various layers on separate processors; they are also more modular and maintainable. Note, however, that implementations designed to run on different kinds of hardware architectures are not very efficient in terms of CPU usage, and lead to increased latency.
Another approach would be to design the stack as
a tightly coupled module and run multiple instances
of the whole stack on different processors. This would
be very efficient. However, some protocol stack layers,
e.g., MTPL3, require that only one instance is defined
on a specific network node.
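
The two communication styles discussed in this section can be contrasted in a small C sketch. Everything here is hypothetical: the layer_msg type, the mailbox depth, and the function names are invented for the example, and a real mailbox shared between threads would also need synchronization. The point is only that a direct call is processed immediately in the caller's context, while a posted message waits until the receiving layer drains its queue.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message exchanged between MTPL2 and MTPL3. */
typedef struct { int type; int payload; } layer_msg;

/* How MTPL2 hands a message up; the binding decides the coupling. */
typedef int (*deliver_fn)(const layer_msg *msg);

int mtpl3_handled = 0;  /* counts messages processed by MTPL3 */

/* Option A: direct function call, processed at once (tightly coupled). */
int mtpl3_receive(const layer_msg *msg)
{
    assert(msg != NULL);
    mtpl3_handled++;
    return 0;
}

/* Option B: post to a mailbox, processed later (loosely coupled).
 * A real inter-thread mailbox would need locking; omitted here. */
#define MAILBOX_DEPTH 8  /* assumed depth */
layer_msg mailbox[MAILBOX_DEPTH];
size_t mailbox_count = 0;

int mtpl3_post(const layer_msg *msg)
{
    assert(msg != NULL);
    if (mailbox_count == MAILBOX_DEPTH)
        return -1;                     /* mailbox full */
    mailbox[mailbox_count++] = *msg;   /* copy adds cost and delay */
    return 0;
}

/* Run by the MTPL3 task when it is scheduled; returns messages processed. */
size_t mtpl3_drain(void)
{
    size_t n = mailbox_count;
    for (size_t i = 0; i < n; i++)
        mtpl3_receive(&mailbox[i]);
    mailbox_count = 0;
    return n;
}

/* MTPL2 side: identical code for either binding. */
int mtpl2_forward(deliver_fn deliver, const layer_msg *msg)
{
    return deliver(msg);
}
```

Calling mtpl2_forward(mtpl3_receive, &msg) completes the work before returning, whereas mtpl2_forward(mtpl3_post, &msg) only queues it until mtpl3_drain() runs.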
Communication Format
MTPL1 receives signaling messages on a time division multiplex (TDM) trunk. The general message format of a signaling unit (SU) is shown in Figure 2. SUs are received by the MTPL1 device and stored in fixed-size buffers tuned to receive the maximum-size SU. Each layer needs to decode the relevant SU headers, process them, and deliver the decoded information required by higher layers. For better performance, one approach is to use the same fixed-size buffer across all layers of the protocol stack, instead of allocating a fresh buffer at each layer for passing information on to other layers.
For received messages, the buffers allocated at MTPL1 would have an additional common header allocated for storing information used across protocol stack layers. The buffer size is therefore the size of the common header plus the maximum SU length supported by the protocol. Each layer would know the offset at which to store the information required by each higher layer, and the higher layers would retrieve that information from the header. This would achieve effective information passing, without the unnecessary overheads of buffer allocation, de-allocation, and copying at each layer, and thus result in better CPU utilization. Dynamic memory operations account for a high percentage of total message processing time, as has also been shown in the studies reported in [1].
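
A minimal C sketch of such a shared receive buffer follows. The field names, the choice of which values go in the common header, and the MAX_SU_LEN value are assumptions made for illustration; the source does not specify the header contents. Each layer writes only its own header fields and no layer copies the SU after MTPL1 has stored it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_SU_LEN 279  /* assumed maximum SU size in octets */

/* Hypothetical common header: filled in by lower layers, read by higher ones. */
typedef struct {
    uint16_t su_len;        /* octets actually received (set by MTPL1) */
    uint16_t mtpl3_offset;  /* start of MTPL3 fields in su[] (set by MTPL2) */
    uint16_t user_offset;   /* start of user part data (set by MTPL3) */
    uint16_t dpc;           /* decoded DPC for use by user parts (set by MTPL3) */
} common_hdr;

/* One fixed-size buffer: common header plus room for the largest SU. */
typedef struct {
    common_hdr hdr;
    uint8_t    su[MAX_SU_LEN];
} su_buffer;

/* MTPL1: store the received SU once. Returns -1 if it does not fit. */
int mtpl1_receive(su_buffer *buf, const uint8_t *data, size_t len)
{
    assert(buf != NULL && data != NULL);
    if (len > MAX_SU_LEN)
        return -1;
    memset(&buf->hdr, 0, sizeof buf->hdr);
    memcpy(buf->su, data, len);
    buf->hdr.su_len = (uint16_t)len;
    return 0;
}

/* MTPL2: record where the MTPL3 fields begin; the SU is not copied. */
void mtpl2_process(su_buffer *buf, uint16_t l2_hdr_len)
{
    buf->hdr.mtpl3_offset = l2_hdr_len;
}

/* MTPL3: record the decoded DPC and where the user part data begins. */
void mtpl3_process(su_buffer *buf, uint16_t l3_hdr_len, uint16_t dpc)
{
    buf->hdr.user_offset = (uint16_t)(buf->hdr.mtpl3_offset + l3_hdr_len);
    buf->hdr.dpc = dpc;
}

/* User part: read its data in place, using only the common header. */
const uint8_t *user_part_payload(const su_buffer *buf, size_t *len_out)
{
    assert(buf->hdr.user_offset <= buf->hdr.su_len);
    *len_out = (size_t)(buf->hdr.su_len - buf->hdr.user_offset);
    return buf->su + buf->hdr.user_offset;
}
```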
Communication protocols define variable length
bit-packed messages to save bandwidth across network
elements while defining a wide spectrum of services.
For better performance, all the relevant information in
an incoming message is extracted by parsers at each
layer and stored in structures. Subsequently, other
modules may use these structures directly without
having to process the bit-packed message again. In
a multi-processor network element, it is desirable to
define structures that are word aligned.
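
To illustrate parsing a bit-packed field into a word-aligned structure, the sketch below decodes a four-octet routing label. It assumes the ITU-style layout in which the label is a 32-bit value sent least significant octet first, with a 14-bit DPC, a 14-bit originating point code, and a 4-bit signaling link selection; the structure and function names are invented for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded form: one 32-bit word per field, so every member is word aligned. */
typedef struct {
    uint32_t dpc;  /* destination point code, 14 bits */
    uint32_t opc;  /* originating point code, 14 bits */
    uint32_t sls;  /* signaling link selection, 4 bits */
} routing_label;

/* Unpack the bit-packed label once; later modules read the structure. */
void parse_routing_label(const uint8_t octets[4], routing_label *out)
{
    assert(octets != NULL && out != NULL);

    /* Assemble the four octets into one value, least significant first. */
    uint32_t raw = (uint32_t)octets[0]
                 | ((uint32_t)octets[1] << 8)
                 | ((uint32_t)octets[2] << 16)
                 | ((uint32_t)octets[3] << 24);

    out->dpc = raw & 0x3FFFu;          /* bits 0..13  */
    out->opc = (raw >> 14) & 0x3FFFu;  /* bits 14..27 */
    out->sls = (raw >> 28) & 0xFu;     /* bits 28..31 */
}
```

After this step, routing and user part code can compare label.dpc directly instead of repeating the shifts and masks on the raw octets.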
[Figure 2. Message format for communication across protocol layers. Signaling unit (SU) from MTPL1: Flag | MTPL2 fields | MTPL3 fields | UserPartParam | CK | Flag. Protocol message in the buffer: ComnHdr | MTPL2 fields | MTPL3 fields | UserPartParam. Common header (ComnHdr): UserPartInfo | MTPL3Info | MTPL2Info. CK: Check sum.]
For outgoing messages, the highest layer of the protocol stack (e.g., user parts in SS7) could allocate the buffer so that there is no need for further reallocation as the buffer is processed sequentially by lower layers before it is sent out. This is possible when the headers added by each of the protocol layers are of a fixed length, and a predefined size can be added to the size required by the encoded application layer message.
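
The outgoing case can be sketched in C by reserving headroom for the lower layer headers when the user part allocates the buffer. The header lengths, the maximum user message size, and all names below are assumptions chosen for illustration. Each lower layer prepends its fixed-length header into the reserved space, so the buffer is never reallocated or copied on the way down.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MTPL2_HDR_LEN 3    /* assumed fixed MTPL2 header length */
#define MTPL3_HDR_LEN 5    /* assumed fixed MTPL3 header length */
#define HEADROOM (MTPL2_HDR_LEN + MTPL3_HDR_LEN)
#define MAX_USER_LEN 272   /* assumed maximum encoded user message */

typedef struct {
    uint8_t data[HEADROOM + MAX_USER_LEN];
    size_t  start;  /* index of the first valid octet */
    size_t  len;    /* number of valid octets */
} tx_buffer;

/* User part: place the encoded message after the reserved headroom. */
int user_part_build(tx_buffer *buf, const uint8_t *msg, size_t msg_len)
{
    assert(buf != NULL && msg != NULL);
    if (msg_len > MAX_USER_LEN)
        return -1;
    buf->start = HEADROOM;
    buf->len = msg_len;
    memcpy(buf->data + buf->start, msg, msg_len);
    return 0;
}

/* Lower layers: write a header immediately in front of the current data.
 * Returns -1 if the reserved headroom is exhausted. */
int prepend_header(tx_buffer *buf, const uint8_t *hdr, size_t hdr_len)
{
    assert(buf != NULL && hdr != NULL);
    if (hdr_len > buf->start)
        return -1;
    buf->start -= hdr_len;
    buf->len += hdr_len;
    memcpy(buf->data + buf->start, hdr, hdr_len);
    return 0;
}
```

MTPL3 would call prepend_header() with its header first, followed by MTPL2, after which buf->data + buf->start points at the complete outgoing message.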
Data Structures and Search Algorithms
When designing data structures, preference should be given to fixed-size arrays over dynamic memory allocation. As an example, if an MTP has to be designed for provisioning 1000 links with an average use case of 100 links, using arrays to store link-related information may seem like a waste of memory. However, a design based on dynamic buffer allocation and de-allocation for link data structures makes software more error-prone, adds testing requirements, and adds performance overheads. Saving memory when the module is exercising 10% of its rated capacity will not provide much benefit. A design where memory allocations are done for 100 links at a time may also be considered, particularly if the saved memory can be utilized for other applications running on the processor.
As a contrasting case, consider SS7 destination point codes (DPCs), which uniquely identify every node in an SS7 network. The MTPL3 protocol needs to handle a value that can range from 0 to 2^14 – 1, but at any node there will never be more than 20 to 25 neighbors. Indexing an array over this full range, when the values being looked up (i.e., neighboring DPCs) form a set of only 20 to 25 values, would certainly be a waste of memory. In cases such as this, it is better to create a hash table. A simple hashing function can be used since memory optimizations should not override CPU occupancy constraints. Using the 8 least significant bits of the DPC would be very efficient, since the neighboring DPCs will normally be contiguous.
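
A minimal C sketch of such a hash table is shown below, using the 8 least significant bits of the DPC as the hash and a fixed-size pool of entries so that no dynamic allocation is needed. The pool size, the link_set field, and the function names are assumptions for this example. Entries whose DPCs share the same low 8 bits are chained within a bucket.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HASH_BUCKETS  256  /* one bucket per value of the 8 low DPC bits */
#define MAX_NEIGHBORS 32   /* assumed capacity, above the 20 to 25 expected */

typedef struct neighbor {
    uint16_t         dpc;       /* 14-bit destination point code */
    int              link_set;  /* hypothetical per-neighbor data */
    struct neighbor *next;      /* chain for DPCs sharing the low 8 bits */
} neighbor;

/* Fixed-size storage: entries come from an array, not from the heap. */
neighbor  pool[MAX_NEIGHBORS];
size_t    pool_used = 0;
neighbor *buckets[HASH_BUCKETS];

/* Simple hash: the 8 least significant bits of the DPC. */
static unsigned dpc_hash(uint16_t dpc)
{
    return dpc & 0xFFu;
}

/* Add a neighbor. Returns 0 on success, -1 if the pool is exhausted. */
int neighbor_add(uint16_t dpc, int link_set)
{
    assert(dpc <= 0x3FFFu);
    if (pool_used == MAX_NEIGHBORS)
        return -1;
    neighbor *n = &pool[pool_used++];
    unsigned h = dpc_hash(dpc);
    n->dpc = dpc;
    n->link_set = link_set;
    n->next = buckets[h];
    buckets[h] = n;
    return 0;
}

/* Look up a neighbor by DPC. Returns NULL if it is not a neighbor. */
const neighbor *neighbor_find(uint16_t dpc)
{
    for (const neighbor *n = buckets[dpc_hash(dpc)]; n != NULL; n = n->next) {
        if (n->dpc == dpc)
            return n;
    }
    return NULL;
}
```

With contiguous neighboring DPCs most buckets hold at most one entry, so a lookup is usually one mask, one array index, and one comparison. The table costs 256 pointers instead of one slot for each of the 2^14 possible DPC values.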
Task Design
Task design should focus on optimization of the more frequent scenarios. In the case of processing messages from a mailbox, the switch on the message type should list the most frequently received message types first. For example, every ISUP (ISDN user part) finite state machine (FSM) will need to handle IAM (new call) and REL (call disconnection) messages, but most will only infrequently encounter INF messages, which provide information about inter-working scenarios. This philosophy has to be supported by priority handling for rollback messages from the fault handler.
Communication protocol design lends itself to a table-driven approach since the FSM is very well defined. This approach is more efficient than a “case” statement, since it can be implemented as an array of function handlers with “current state” and “event” as indices, which gives direct access to the handler. With case statements, the code has to run two searches, one on “event” and the other on “current state,” to find the correct function handler.
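
The table-driven approach can be sketched in C as a two-dimensional array of function pointers indexed by current state and event. The states, the events, and the transitions below form a deliberately simplified, hypothetical call FSM; only IAM and REL come from the text, and the ANM (answer) event is added as an assumption to make the example complete.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified call states and events. */
typedef enum { ST_IDLE, ST_WAIT_ANSWER, ST_ACTIVE, ST_COUNT } call_state;
typedef enum { EV_IAM, EV_ANM, EV_REL, EV_COUNT } call_event;

/* A handler performs the action and returns the next state. */
typedef call_state (*handler_fn)(call_state current);

static call_state on_iam(call_state s) { (void)s; return ST_WAIT_ANSWER; }
static call_state on_anm(call_state s) { (void)s; return ST_ACTIVE; }
static call_state on_rel(call_state s) { (void)s; return ST_IDLE; }
static call_state ignore(call_state s) { return s; }  /* unexpected event */

/* The FSM as data: fsm_table[state][event] is the handler to call. */
static const handler_fn fsm_table[ST_COUNT][EV_COUNT] = {
    [ST_IDLE]        = { [EV_IAM] = on_iam, [EV_ANM] = ignore, [EV_REL] = ignore },
    [ST_WAIT_ANSWER] = { [EV_IAM] = ignore, [EV_ANM] = on_anm, [EV_REL] = on_rel },
    [ST_ACTIVE]      = { [EV_IAM] = ignore, [EV_ANM] = ignore, [EV_REL] = on_rel },
};

/* Dispatch is two array indexes and one indirect call, with no searching. */
call_state fsm_dispatch(call_state current, call_event event)
{
    assert(current < ST_COUNT && event < EV_COUNT);
    return fsm_table[current][event](current);
}
```

A further benefit of this layout is that the table mirrors the state/event matrix of the protocol specification, so a missing or wrong transition is easy to spot in review.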
A parser is also an integral part of a protocol layer. It can be designed as a separate module, tightly coupled to the state machine through well-defined APIs, to meet both performance and modularity requirements.
Conclusion
To keep pace with both market and user demands, many features are being built into communication protocols as add-ons. Since it is difficult to redesign a software module to accommodate these add-on features, it is important to design for the best and most efficient utilization of CPU and other resources from the beginning. As more and more features are added to designs, performance objectives should not be compromised. Lack of attention to performance objectives in the development stage will make failures more likely during field testing, and will set the stage for a progressively degenerative process of fire-fighting.
Acknowledgements
I thank my colleagues Dr. S. Rudra Kumar, Chetan Vinchhi, and Bhavani Yerrapalli of the India Product Realization Centre’s (IPRC) Subject Matter Expert (SME) team, whose valuable feedback enabled me to better articulate the work presented in this paper.
References
[1] M. Cortes, J. R. Ensor, and J. O. Esteban, “On SIP Performance,” Bell Labs Tech. J., 9:3 (2004), 155–172.
[2] International Organization for Standardization, “Information Technology - Open Systems Interconnection - Basic Reference Model: The Basic Model,” ISO/IEC Standard 7498-1, 1994, <http://www.iso.org>.
[3] International Telecommunication Union, CCITT Blue Book, “Specifications of Signaling System No. 7 - Recommendations Q.700–Q.716,” IXth Plenary Assembly, (Melbourne, Australia, 1988), <http://www.itu.org>.
(Manuscript approved December 2005)
SHIVANI ARORA is a member of technical staff at the India Product Realization Centre (IPRC), Lucent Technologies, Bangalore. Her areas of interest are communication protocols and software architectures and designs. She is a member of the subject matter expert team focusing on EVDO RNC development. She received her B.E. in computer science and engineering from the Indian Institute of Technology (IIT) in Roorkee, India, and her M.Tech. in computer technology from IIT New Delhi. She has 16 years of experience in the development of wireline and wireless access network products. ◆