◆ Designing Communication Protocol Software for Performance

Shivani Arora

Performance objectives need to be built into the design of time-critical protocols. This letter presents software design and architectural guidelines that could be used to meet performance requirements, and considers other software design requirements, including portability and its effect on performance. © 2006 Lucent Technologies Inc.

Bell Labs Technical Journal 11(1), 203–207 (2006) © 2006 Lucent Technologies Inc. Published by Wiley Periodicals, Inc. Published online in Wiley InterScience (www.interscience.wiley.com). • DOI: 10.1002/bltj.20154

Introduction

A communication protocol consists of a set of very well defined physical connections, messages, and procedures across an open interface to support a set of features. This letter presents attributes to be considered in the design of communication protocol software and explores the trade-offs between performance objectives and other design objectives such as modularity, portability, scalability, and reliability. It is important to understand which design attributes are critical for module development in a particular software package and to make architectural choices and design decisions accordingly. Significant decisions have to be made about the organization of the software, and the selection of the structural elements, their interfaces, and their behaviors. Designers need to consider various factors including the

• Platform (e.g., multi-processor or single-processor),

• Protocol stack (e.g., multi-threaded or single-threaded),

• Inter-module communication schemes, and

• Buffer management requirements.

Background

To aid this discussion, the message transfer part (MTP) protocol stack [3] is used as an example. The MTP is a part of the Signaling System 7 (SS7) protocol suite. It is a common transport mechanism for distribution of signaling messages between user parts residing on different network nodes of a telecommunications network. SS7 protocol recommendations suggest a layered architecture, very similar to the architecture proposed by Open Systems Interconnection [2], as shown in Figure 1.

The MTP is further divided into three levels. Level 1 (MTPL1) defines mechanisms for physical transfer of data and is typically implemented in hardware. Level 2 (MTPL2), or signaling link protocol, defines procedures for error detection and recovery. Level 3 (MTPL3), or the signaling network layer, handles network management and routing.

Panel 1. Abbreviations, Acronyms, and Terms

API—Application programmable interface
CPU—Central processing unit
DPC—Destination point code
FSM—Finite state machine
IAM—Initial address message
ISDN—Integrated services digital network
ISUP—ISDN user part
MTP—Message transfer part
MTPL1—MTP level 1
MTPL2—MTP level 2
MTPL3—MTP level 3
REL—Release
SME—Subject matter expert
SS7—Signaling System 7
SU—Signaling unit
TDM—Time division multiplex

Performance Trade-Offs

During the design of software for communication protocols, performance objectives have to be given the utmost importance. Meeting performance objectives demands efficient utilization of resources like CPU and memory. This can impact design decisions regarding functionality breakdown, methods for inter-task communication, formats for inter-task communication, data structures, search algorithms, and detailed task design. When designing for performance, it may not always be possible to satisfy all other design objectives at all times. Each of these design decisions is considered below.

Functionality Breakdown

Most communication protocols, SS7 included, are defined as layers, so it is better to implement the various layers as separate execution threads, tasks, or processes that can be run at different task priorities. Timers running at lower layers are required to be more stringent than those running at higher layers. This mandates stricter real-time response constraints on the lower layers. Under load, the lower layers should be run at higher priority than the higher layers, with higher layer execution pre-empted whenever there is an event for a lower layer. This approach supports modularity and also meets performance requirements. However, it presents trade-offs, including scheduling delays and the need to handle issues such as thread synchronization and inter-thread communication.

[Figure 1. OSI layers versus MTP levels. The diagram sets the OSI layers (Layer 1, Layer 2, Layer 3, Layers 4–7) against the SS7 stack: the MTP levels (Signaling data link, Level 1; Signaling link, Level 2; Signaling network, Level 3), SCCP, and the users of SS7, with TC and ISUP as example user parts. ISDN—Integrated services digital network; ISUP—ISDN user part; MTP—Message transfer part; OSI—Open Systems Interconnection; SCCP—Signaling connection control part; SS7—Signaling system no. 7; TC—Transaction capability.]

To overcome these issues, a designer may choose to implement each layer as a module within a single task. Real-time response constraints require a design of this nature to be completely re-entrant, and higher layer processing is “shelved” when there is an event for lower layers. To avoid fault propagation, reliability requirements demand rollback only from the layer impacted when errors occur. This requires such tasks to have rollback designed at the module level within the task. Task design that considers both real-time response constraints and reliability requirements is comparatively more complex, and more difficult to understand and maintain. Protection of critical resources is an important design consideration in both cases.
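
The single-task variant can be illustrated with a small priority dispatcher. This is only a sketch under stated assumptions: the layer names, the priority values, and the `Dispatcher` class are hypothetical, and the loop serves pending events strictly by layer priority rather than modeling mid-handler pre-emption or rollback.

```python
import heapq
from itertools import count

# Hypothetical priorities: a smaller number is served first, so events
# for lower protocol layers are handled before pending higher-layer work.
PRIORITY = {"MTPL2": 0, "MTPL3": 1, "ISUP": 2}


class Dispatcher:
    """Single-task event loop that always serves the lowest layer first."""

    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a layer

    def post(self, layer, event):
        heapq.heappush(self._heap, (PRIORITY[layer], next(self._seq), layer, event))

    def run(self):
        handled = []
        while self._heap:
            _, _, layer, event = heapq.heappop(self._heap)
            handled.append((layer, event))  # a real handler would run here
        return handled


d = Dispatcher()
d.post("ISUP", "IAM")
d.post("MTPL3", "route")
d.post("MTPL2", "timer-expiry")
d.post("MTPL2", "su-received")
order = d.run()
```

Although the ISUP event was posted first, it is handled last; the two MTPL2 events keep their arrival order.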

Inter-Task Communication

All communication protocols involve processing of messages or packets received by each layer from external interfaces. One of the disadvantages of breaking the MTP into separate threads is the need for message passing between them. Message passing involves posting to queues, which introduces delays. If the same MTP is implemented as a module within one process, the communication between layers can be implemented using function calls or application programmable interfaces (APIs). Such APIs are more CPU-efficient and reduce latency. However, they make the software tightly coupled. Loosely coupled, portable software architectures can meet future scalability requirements by distributing the various layers on separate processors; they are also more modular and maintainable. Note, however, that implementations designed to run on different kinds of hardware architectures are not very efficient in terms of CPU usage, and lead to increased latency.

Another approach would be to design the stack as a tightly coupled module and run multiple instances of the whole stack on different processors. This would be very efficient. However, some protocol stack layers, e.g., MTPL3, require that only one instance is defined on a specific network node.
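
The two coupling styles can be contrasted in a few lines. The function names and the `mailbox` queue below are hypothetical and stand in for whatever inter-task mechanism a real platform provides; the sketch shows only the structural difference, not measured latency.

```python
from collections import deque


def mtpl3_handle(msg):
    """Hypothetical layer-3 entry point: returns a routing decision."""
    return ("routed", msg)


# Tightly coupled: layer 2 calls layer 3 directly, with no queueing delay.
def mtpl2_deliver_direct(msg):
    return mtpl3_handle(msg)


# Loosely coupled: layer 2 only posts to a mailbox; layer 3 drains it
# later, possibly from another thread or on another processor.
mailbox = deque()


def mtpl2_deliver_queued(msg):
    mailbox.append(msg)


def mtpl3_poll():
    results = []
    while mailbox:
        results.append(mtpl3_handle(mailbox.popleft()))
    return results


direct = mtpl2_deliver_direct("SU-1")   # result is available immediately
mtpl2_deliver_queued("SU-2")            # result is deferred
pending = len(mailbox)
queued = mtpl3_poll()
```

In the queued variant the sender never references the receiver's function, which is what allows the layers to be moved apart later, at the cost of the extra hop.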

Communication Format

MTPL1 receives signaling messages on a time division multiplex (TDM) trunk. The general message format of a signaling unit (SU) is shown in Figure 2. SUs are received by the MTPL1 device and stored in fixed-size buffers tuned to receive the maximum-size SU. Each layer needs to decode the relevant SU headers, process them, and deliver the decoded information required by higher layers. For better performance, one approach is to use the same fixed-size buffer across all layers of the protocol stack, instead of reallocating a fresh buffer for each layer for passing information on to other layers.

For received messages, the buffers allocated at MTPL1 would have an additional common header allocated for storing information used across protocol stack layers. The buffer size is the size of the common header plus the maximum SU length supported by the protocol. Each layer would know the offset at which to store the information required by each higher layer, and the higher layers would retrieve that information from the header. This would achieve effective information passing, without the unnecessary overheads of buffer allocation, de-allocation, and copying at each layer, and thus result in better CPU utilization. Dynamic memory operations account for a high percentage of total message processing time, as has also been shown in the studies reported in [1].
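
A minimal sketch of this buffer layout follows. The common-header fields, their offsets, and the value of `MAX_SU_LEN` are assumptions made for illustration; the source does not specify them.

```python
import struct

MAX_SU_LEN = 280                      # assumed maximum SU size in octets
# Hypothetical common header: one 16-bit slot per layer.
COMMON_HDR = struct.Struct("<HHH")    # l2_info, l3_info, user_part_info
L2_OFF, L3_OFF, UP_OFF = 0, 2, 4      # each layer's offset in the header
BUF_SIZE = COMMON_HDR.size + MAX_SU_LEN


def alloc_buffer():
    """Allocated once at MTPL1 and reused by every layer above it."""
    return bytearray(BUF_SIZE)


def mtpl1_receive(buf, su):
    if len(su) > MAX_SU_LEN:
        raise ValueError("SU exceeds buffer capacity")
    start = COMMON_HDR.size
    buf[start:start + len(su)] = su   # the only copy: device to buffer
    return memoryview(buf)            # upper layers share this view


def mtpl2_process(view, su_len):
    struct.pack_into("<H", view, L2_OFF, su_len)   # leave info for higher layers


def mtpl3_process(view, dpc):
    struct.pack_into("<H", view, L3_OFF, dpc)


buf = alloc_buffer()
view = mtpl1_receive(buf, b"\x01\x02\x03")
mtpl2_process(view, 3)
mtpl3_process(view, 0x1234)
l2_info, l3_info, up_info = COMMON_HDR.unpack_from(view, 0)
```

Every layer works on the same underlying `bytearray`; nothing is allocated, freed, or copied after `mtpl1_receive`.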

Communication protocols define variable length bit-packed messages to save bandwidth across network elements while defining a wide spectrum of services. For better performance, all the relevant information in an incoming message is extracted by parsers at each layer and stored in structures. Subsequently, other modules may use these structures directly without having to process the bit-packed message again. In a multi-processor network element, it is desirable to define structures that are word aligned.
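
As an illustration of parsing once into a structure, the sketch below decodes a 4-octet routing label. The layout (14-bit DPC, 14-bit origination point code, 4-bit link selection field, least significant bits first) is an assumption based on the ITU-style label and is not taken from this letter; `RoutingLabel` and both functions are hypothetical names.

```python
import struct
from dataclasses import dataclass


@dataclass
class RoutingLabel:
    dpc: int   # destination point code, 14 bits
    opc: int   # origination point code, 14 bits
    sls: int   # signaling link selection, 4 bits


def parse_routing_label(octets):
    """Unpack the bit fields once; later modules read the structure."""
    (word,) = struct.unpack_from("<I", octets, 0)
    return RoutingLabel(
        dpc=word & 0x3FFF,
        opc=(word >> 14) & 0x3FFF,
        sls=(word >> 28) & 0xF,
    )


def pack_routing_label(label):
    word = label.dpc | (label.opc << 14) | (label.sls << 28)
    return struct.pack("<I", word)


raw = pack_routing_label(RoutingLabel(dpc=0x1234, opc=0x0ABC, sls=5))
label = parse_routing_label(raw)
```

After the parse, routing code can test `label.dpc` directly instead of repeating the shifts and masks.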

[Figure 2. Message format for communication across protocol layers.
Signaling unit (SU) from MTPL1: Flag | MTPL2 fields | MTPL3 fields | User part param | CK | Flag
Protocol message in the buffer: Common header | MTPL2 fields | MTPL3 fields | User part param
Common header: User part info | MTPL3 info | MTPL2 info
CK—Check sum; MTP—Message transfer part; MTPL1—MTP level 1; MTPL2—MTP level 2; MTPL3—MTP level 3; SU—Signaling unit.]

For outgoing messages, the highest layer of the protocol stack (e.g., user parts in SS7) could allocate the buffer so that there is no need for further reallocation as the buffer is processed sequentially by lower layers before it is sent out. This is possible when the headers added by each of the protocol layers are of a fixed length, and a predefined size can be added to the size required by the encoded application layer message.
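
One way to realize this is to reserve headroom in front of the payload when the user part allocates the buffer. The header lengths and the `TxBuffer` class below are hypothetical; the sketch assumes, as the text does, that every lower-layer header has a fixed length.

```python
L2_HDR_LEN = 3   # assumed fixed header sizes, in octets
L3_HDR_LEN = 5
HEADROOM = L2_HDR_LEN + L3_HDR_LEN


class TxBuffer:
    """One allocation by the user part; lower layers fill headers in place."""

    def __init__(self, payload):
        self.data = bytearray(HEADROOM + len(payload))
        self.start = HEADROOM              # payload sits after the headroom
        self.data[self.start:] = payload

    def prepend(self, header):
        if len(header) > self.start:
            raise ValueError("not enough headroom reserved")
        self.start -= len(header)
        self.data[self.start:self.start + len(header)] = header

    def frame(self):
        return bytes(self.data[self.start:])


tx = TxBuffer(b"IAM")
tx.prepend(b"L3HDR")   # MTPL3 writes its fixed-length header
tx.prepend(b"L2H")     # MTPL2 writes its header in front of that
frame = tx.frame()
```

The buffer length never changes after construction; each layer only moves the `start` index backwards.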

Data Structures and Search Algorithms

When designing data structures, preference should be given to fixed-size arrays over dynamic memory allocation. As an example, if an MTP has to be designed for provisioning 1000 links with an average use case of 100 links, using arrays to store link-related information may seem like a waste of memory. However, a design based on dynamic buffer allocation and de-allocation for the link data structure makes software more error-prone, adds testing requirements, and adds performance overheads. Saving memory when the module is exercising 10% of its rated capacity will not provide much benefit. A design where memory allocations are done for 100 links at a time may also be considered, particularly if the saved memory can be utilized for other applications running on the processor.
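
A sketch of the fixed-size approach, using the 1000-link figure from the example above; the `LinkTable` class and its fields are hypothetical.

```python
MAX_LINKS = 1000   # provisioned capacity from the example above


class LinkTable:
    """Preallocated link records, indexed directly by link number."""

    def __init__(self):
        # Every slot exists up front; nothing is allocated or freed later.
        self.in_use = [False] * MAX_LINKS
        self.state = [0] * MAX_LINKS

    def activate(self, link_id, state):
        self.in_use[link_id] = True
        self.state[link_id] = state

    def deactivate(self, link_id):
        self.in_use[link_id] = False
        self.state[link_id] = 0


table = LinkTable()
for link_id in range(100):        # the average use case of 100 links
    table.activate(link_id, 1)
active = sum(table.in_use)
```

Activating or deactivating a link touches only its own slot, so there is no allocation path to test and no lookup cost beyond an index.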

As an alternative, consider the case of SS7 destination point codes (DPCs), which uniquely identify every node in an SS7 network. The MTPL3 protocol needs to handle a value that can range from 0 to 2^14 − 1, but at any node there will never be more than 20 to 25 neighbors. Indexing an array over this full range of 16,384 values, when the values being looked up (i.e., the neighboring DPCs) form a set of only 20 to 25, would certainly be a waste of memory. In cases such as this, it is better to create a hash table. A simple hashing function can be used, since memory optimizations should not override CPU occupancy constraints. Using the 8 least significant bits of the DPC would be very efficient, since neighboring DPCs will normally be contiguous.
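
The hashing scheme described above can be sketched as follows. The `DpcTable` class, the sample DPC values, and the stored strings are hypothetical; collisions are resolved here by a short per-bucket list, which is one option among several.

```python
NUM_BUCKETS = 256   # one bucket per value of the 8 least significant bits


class DpcTable:
    def __init__(self):
        self.buckets = [[] for _ in range(NUM_BUCKETS)]

    @staticmethod
    def bucket_of(dpc):
        return dpc & 0xFF          # the hash: 8 least significant bits

    def add(self, dpc, info):
        self.buckets[self.bucket_of(dpc)].append((dpc, info))

    def find(self, dpc):
        for key, info in self.buckets[self.bucket_of(dpc)]:
            if key == dpc:
                return info
        return None


table = DpcTable()
for dpc in range(0x2100, 0x2100 + 25):     # 25 contiguous neighbor DPCs
    table.add(dpc, f"linkset-{dpc - 0x2100}")
longest_chain = max(len(bucket) for bucket in table.buckets)
```

With contiguous neighbor DPCs every entry lands in its own bucket, so a lookup costs one mask and one comparison, while the table holds 256 slots instead of 16,384.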

Task Design

Task design should focus on optimization of the more frequent scenarios. In the case of processing messages from a mailbox, the switch on the message type should list the most frequently received message types first. For example, every ISUP (ISDN user part) finite state machine (FSM) will need to handle IAM (new call) and REL (call disconnection) messages, but most will only infrequently encounter INF messages, which provide information about inter-working scenarios. This philosophy has to be supported by priority handling for rollback messages from the fault handler.

Communication protocol design lends itself to a table-driven approach since the FSM is very well defined. This approach is more efficient than a “case” statement, since it can be implemented as an array of function handlers with “current state” and “event” as indices and the compiler can use direct access. For the case statement, the compiler runs two searches, one on “event” and the other on “current state,” to find the correct function handler.
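
A table-driven FSM of this kind can be sketched with a two-dimensional array of handlers. The two states, the handlers, and the `ctx` dictionary are simplified assumptions for illustration and do not represent the real ISUP state machine.

```python
IDLE, ACTIVE = range(2)        # hypothetical, simplified call states
EV_IAM, EV_REL = range(2)      # events: new call, call disconnection


def start_call(ctx):
    ctx["calls"] += 1
    return ACTIVE


def release_call(ctx):
    ctx["calls"] -= 1
    return IDLE


def ignore(ctx):
    return ctx["state"]        # unexpected event: stay in the same state


# TABLE[current state][event] -> handler, reached by direct indexing.
TABLE = [
    [start_call, ignore],      # IDLE
    [ignore, release_call],    # ACTIVE
]


def dispatch(ctx, event):
    ctx["state"] = TABLE[ctx["state"]][event](ctx)
    return ctx["state"]


ctx = {"state": IDLE, "calls": 0}
trace = [dispatch(ctx, ev) for ev in (EV_IAM, EV_IAM, EV_REL, EV_REL)]
```

Adding a state or an event means adding a row or a column to `TABLE`; `dispatch` itself does not change.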

A parser is also an integral part of a protocol layer. It can be designed as a separate module, tightly coupled to the state machine through well-defined APIs, to meet both performance and modularity requirements.

Conclusion

To keep pace with both market and user demands, many features are being built into communication protocols as add-ons. Since it is difficult to redesign a software module to accommodate these add-on features, it is important to design for the best and most efficient utilization of CPU and other resources from the beginning. As more and more features are added to designs, performance objectives should not be compromised. Lack of attention to performance objectives in the development stage will make failures more likely during field testing, and will set the stage for a progressively degenerative process of fire-fighting.

Acknowledgements

I thank my colleagues Dr. S. Rudra Kumar, Chetan Vinchhi, and Bhavani Yerrapalli of the India Product Realization Centre’s (IPRC) Subject Matter Expert (SME) team, whose valuable feedback enabled me to better articulate the work presented in this paper.

References

[1] M. Cortes, J. R. Ensor, and J. O. Esteban, “On SIP Performance,” Bell Labs Tech. J., 9:3 (2004), 155–172.

[2] International Organization for Standardization, “Information Technology—Open Systems Interconnection—Basic Reference Model: The Basic Model,” ISO/IEC Standard 7498-1, 1994, <http://www.iso.org>.

[3] International Telecommunication Union, CCITT Blue Book, “Specifications of Signaling System No. 7—Recommendations Q.700–Q.716,” IXth Plenary Assembly, (Melbourne, Australia, 1988), <http://www.itu.org>.

(Manuscript approved December 2005)

SHIVANI ARORA is a member of technical staff at the India Product Realization Centre (IPRC), Lucent Technologies, Bangalore. Her areas of interest are communication protocols and software architectures and designs. She is a member of the subject matter expert team focusing on EVDO RNC development. She received her B.E. in computer science and engineering from the Indian Institute of Technology (IIT) in Roorkee, India, and her M.Tech. in computer technology from IIT New Delhi. She has 16 years of experience in the development of wireline and wireless access network products. ◆
