
Report: High Performance Trading - FIX Messaging Testing for Low Latency

Abstract:

FIX is the de-facto standard protocol used extensively for electronic communication between buy-side and sell-side firms and execution venues, where the performance requirements of algorithmic and high frequency trading are extreme and/or the benefits of STP (straight through processing) are sought from electronic connectivity.

January 2012

WITH THANKS TO THE TEAM AT INTEL FASTERLAB UK

STATEMENT OF CONFIDENTIALITY / DISCLAIMER

This document has been prepared by the consortium of companies described herein. No part of this document shall be reproduced without the consultation of these parties and acknowledgement of its source. Contact can be made to [email protected].


FIX Messaging Low Latency Testing

January 2012

TABLE OF CONTENTS

1. SUMMARY
2. INTRODUCTION
2.1 PURPOSE
2.2 ROLES AND RESPONSIBILITIES
2.3 CONDUCT AND PRESENTATION
3. METHOD
3.1 TEST HARNESS – SOFTWARE DESIGN
3.2 TEST HARNESS - HARDWARE DESIGN
3.3 THE MESSAGE PASSING PROCESS
3.4 TIMINGS
3.5 POST-TEST DATA PROCESSING
3.6 TEST SCENARIOS
4. RESULTS AND OBSERVATIONS
4.1 EFFECTIVENESS OF KERNEL BYPASS
4.2 MAIN RESULTS
5. DISCUSSION
5.1 VALUE OF THE EXERCISE TO THE ELECTRONIC FINANCIAL TRADING COMMUNITY
5.2 PERFORMANCE OF THE TEST RIG
5.3 RAISING THE TEST RIG TO PRODUCTION STANDARD
5.4 EXPLOITING THE RESULTS
6. CONCLUSION
APPENDICES
1A TECHNOLOGY MEMBERS


1. Summary

This briefing paper reports on the activity of a consortium of leading IT vendors that have joined forces to create demonstrable high performance solution stacks to address common business requirements in financial trading. The initial focus of the consortium is on a referenceable technology stack of products and services to support FIX protocol communication functions. The paper describes the test environment, documents a set of benchmark tests performed on both commercial and open source FIX engine offerings, and details and interprets the representative latency and throughput figures achieved.

The objective is to create transparency in, and capability around, comparing performance statistics for key functions along the trading life cycle. The tests used business workloads and were deliberately aligned to reflect the market's current interest in the measurement of interparty latency across the trade life cycle, using FIX formatted messages for defined legs.

An on-going objective is to provide the market with useful data in order to support decisions in technology investment. Therefore, a range of technologies and application software has been addressed. Approaches were made to a number of application vendors, with the ultimate agreement to test FIX engines covering both C++ and Java implementations from EPAM Systems' B2BITS unit and Rapid Addition, respectively. As a datum point for comparison, the open source QuickFIX, in both its C++ and Java variants, was used.

OnX Enterprise Solutions Ltd is leading a consortium whose charter members include Intel, Dell, Arista Networks and Solarflare Communications, with additional services provided by Edge Technology Group, GreySpark Partners and Equinix. The foundation objective is to create transparent comparative performance statistics for key functions along the trading life cycle using business workloads, FIX being used on a number of legs of the typical trade life cycle.

A series of tests was undertaken that demonstrates the value of commercial software (versus open source) and of specialist technologies in a low latency infrastructure. The consortium approach recognises the reality that the creation of high performance solutions requires the interaction of many leading edge technologies and the integration of components from several vendors. These parties must work together in order to specify correct parts and then to tune them together such that a complete and reliable solution is available through a collective single channel.

Results for the tests showed that both the B2BITS and Rapid Addition commercial FIX engines outperformed the open source QuickFIX offerings (C++ and Java) in a range of tests, being between 4 and 16 times faster in generating messages during a standardised simulated trade. The average latency for the commercial engines was 11 to 12 microseconds, whereas the open source engines averaged between 45 and 180 microseconds. The variation in results was equally stark: the frequency distributions for the commercial engines were bell shaped, whereas the open source results had a long, fat tail. This indicated that the commercial solutions significantly reduced the effect of network jitter and with it the undesired variance of performance.

The B2BITS FIX Antenna engine was a C++ implementation; Rapid Addition's Cheetah engine was written in Java. Both demonstrated similar performance characteristics over a range of tests and workloads. The similarity of results between the commercial C++ and Java engines stood in contrast to the open source equivalents, demonstrating that Java can perform as well as C++ code when implemented in an optimised fashion.


2. Introduction

In the online and Co-Lo based financial trading markets, performance, in terms of both latency and throughput, is paramount. It is the difference between a firm being 'in the market' or not. Complete trading systems are built from many complex elements, including market data capture, trading algorithms, trade execution, and in-flow risk analysis. These elements run on critical infrastructure components (hardware, software, network and connectivity), all of which must interoperate with each other.

Today, there is a lack of industry-recognised benchmarks with which designers can demonstrate that solutions have 'high performance' characteristics. To achieve performance and agility, with low up-front and on-going operating costs, trade infrastructure implementation teams need to source the best available components from different innovative specialist vendors, integrate them and tune their interoperability.

FIX is the de-facto standard protocol used extensively for electronic communication between buy-side and sell-side firms and execution venues, where the performance requirements of algorithmic and high frequency trading are extreme and/or the benefits of STP (straight through processing) are sought from electronic connectivity.

2.1 Purpose

FIX message generation is an increasingly important leg in automated trading and can be a source of significant latency and jitter, which can adversely impact the success of business and trading strategies. As trading strategies require access to a greater diversity of execution venues, communication over the standard FIX protocol is more cost effective than accessing markets via diverse proprietary protocols at the various venues.

Infrastructure deployment teams have to select appropriate components, integrate them, commission them and deploy them for maximum performance, which can be an extreme challenge. It requires a combination of knowledge, skills, experience and deployment ability that is today scarce and expensive in the market.

The testing undertaken by OnX in the Intel lab, with support from the consortium, was to investigate these assertions:

1. Using commercial FIX engines would achieve lower latency and less jitter.

2. Using specialist low latency network techniques would have a significant impact on latency.

Full results for each environment and latency improvement are available on request.

2.2 Roles and Responsibilities

OnX Enterprise Solutions, as a “solution facilitator”, led a collaborative approach through the creation of a consortium of IT vendors focused on the creation of high performance infrastructure designs specifically for financial trading systems. OnX consultants provided input into the hardware selection, conduct of the tests and post-test analysis.


The benchmarks were conducted at Intel's fasterLAB in the UK. Intel engineers screened hardware and software performance for optimization. Intel engineers performed the tests, recorded the results and carried out the post-test processing to produce tables of averages and graph outputs.

Software suppliers Rapid Addition and B2BITS EPAM Systems provided their FIX engines. The test harness was designed by Rapid Addition, and the open source implementations in Java and C++ were supplied by Rapid Addition and B2BITS respectively.

2.2.1 Consortium Members

A number of technology and services providers have invested as charter members of the consortium. However, the initiative is open, and further participant members may be added in the future. Between them, these members provide a complete infrastructure capability and created the reference architecture, each drawing on specific expertise while OnX provided the integration and build capability.

The charter group members directly involved in building the initial technology stack and in the performance benchmark testing comprise:

Lead:

OnX Enterprise Solutions – Product procurement and architecture design

Infrastructure component providers:

Arista Networks – Network Switch

Dell – X86 Servers

Intel – Intel® Xeon® processors, and lab environment

Solarflare Communications – Network Interface Card

Implementation and deployment Services:

Edge Technology Group – Buy-side solutions

Equinix – Trading ecosystem hosting

GreySpark Partners – Capital Markets business, management and Technology consulting services

Applications under test:

Rapid Addition – FIX engine

B2BITS EPAM – FIX engine

QuickFIX – Open Source FIX engine

2.3 Conduct and Presentation

Tests were performed by Intel engineers and preliminary results were shared with the software suppliers, who were then given an opportunity to optimize their code. A second round of testing was then conducted, the results of which were used in the preparation of this paper.

The software houses had access to the test harness prior to testing, in order to agree and finalize the methodology, but no access or amendment was allowed during the test runs. All results were captured by Intel and shared with OnX. Rapid Addition's results were shared only with Rapid Addition, and likewise the B2BITS EPAM results were shared only with B2BITS EPAM.


3. Method

3.1 Test Harness – Software Design

The test harness used to perform the benchmarks was designed by Rapid Addition (audited by B2BITS EPAM), with implementations for C++ and Java written by B2BITS EPAM and Rapid Addition, respectively.

The test software was implemented across two servers. One ran simulators for a market data (MD) feed and an execution venue (EV); the other represented a typical software implementation of a real-life algorithmic trading system, with all applications run on a single server and the algorithmic application logic linked 'in-process' to the FIX engine under test, in order to minimise latency.

The tests measured recognisable legs in the trading life cycle, mapping to real-life workflow scenarios and reflecting current industry interest in the measurement of interparty latency over discrete legs of a trading cycle.

The very simple logic of the simulated algorithmic trading component minimises latency and jitter, allowing the benchmark to focus on the FIX engines themselves.

The benchmarking of the FIX engines focused on their ability to process both (a) FIX-formatted market data and (b) order processing messages for 2 defined stages in the trade cycle, at different throughput rates, over both burst and prolonged periods.

The companies under scrutiny were given controlled access to the test rig with the ability to run tests, analyse results, tune and re-test. This activity was supported by skilled Intel engineers, who were also available to assist the companies in optimising their code for the target hardware stack.

The diagram below illustrates the test harness with its simulated market data and execution venue.

Figure 1: Test Harness Overview


3.2 Test Harness - Hardware Design

In order to conduct benchmark tests on the FIX engines, the reference architecture was specified and built by OnX at the fasterLAB in the UK. OnX also analysed and interpreted the benchmarks, and provided an independent audit of the test activities by the FIX engine vendors. These vendors accessed the test rig via remote access under pre-approved and agreed conditions. The main components of the reference architecture are shown below:

Figure 2: Reference Architecture Components

3.2.1 CPU and Servers

The test harness server was a Dell PowerEdge R710 server. This occupies 2U of rack space and incorporates energy efficient technologies to reduce power consumption and cooling. Such servers are typically deployed in co-location environments, where space and power can be limited.

The market data simulator and execution venue server included 2 x Intel® Xeon® processor X5677, each with 4 cores, at 3.47GHz, and 16GB of RAM, running Microsoft Windows Server 2008. This configuration was sufficient for the test harness task of generating a suitable trading workload.

Dell also provided the monitoring server that hosted the network monitoring service. This comprised an Endace network monitor; timings were uploaded to the operating system only for post-test processing.


The test harness server housed the algorithmic trading system simulator and FIX engine. This server included a single Intel® Xeon® processor X5698 (dual-core), clocked at 4.4GHz, with 12MB of L3 cache and 96GB of RAM (12 x 8GB), running Red Hat Enterprise Linux (RHEL 6.0). This processor has been designed based on feedback directly from Intel's teams in the field close to financial trading, for applications where the fastest single-thread instruction execution is required. Performance increases of more than 20% compared with other processors in the Intel® Xeon® processor X5600 series were noted.

Preliminary tests were undertaken to select the most appropriate processor for the workload by comparing the Intel® Xeon® processor X5698 (4.4GHz) against the Intel® Xeon® processor X5680 (3.33GHz). The preliminary test, using a message rate of 100,000 messages a second, showed the Intel® Xeon® processor X5698 to have 36% better latency performance than the Intel® Xeon® processor X5680. The clock speed difference between the processors was 32%, indicating that the Intel® Xeon® processor X5698 scaled slightly better than linearly with clock speed under test and was better suited to the FIX engine workload. On the basis of this preliminary test, the Intel® Xeon® processor X5698 was selected for the test environment.
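The processor comparison above can be checked with simple arithmetic. The sketch below assumes that the 32% figure is the ratio of the two nominal clock speeds quoted in the text; the variable names are illustrative and do not come from the report.

```python
# Nominal clock speeds quoted in the text, in GHz.
x5698_ghz = 4.4
x5680_ghz = 3.33

# Relative clock speed advantage of the X5698, as a percentage.
clock_gain_pct = (x5698_ghz / x5680_ghz - 1) * 100

# Latency improvement reported by the preliminary test, in percent.
reported_latency_gain_pct = 36

# Improvement not explained by clock speed alone, in percentage points.
beyond_clock_pts = reported_latency_gain_pct - clock_gain_pct

print(f"clock speed advantage: {clock_gain_pct:.1f}%")
print(f"gain beyond clock scaling: {beyond_clock_pts:.1f} points")
```

Under this assumption the measured improvement exceeds the clock speed advantage by a few percentage points, which is the margin the text reads as better-than-linear scaling.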

3.2.2 Network Design

The test harness network used a switched design rather than incorporating network taps at the points of measurement. Network taps are often deployed to measure latency across certain trade processing legs; however, they can introduce instability and unreliability into the network. Port mirroring was used to forward packet data to the Endace network monitor. This is a much more common network implementation in production trading environments.

Two types of network switch were considered:

1. Cut-through switch. This switch starts forwarding a network packet before the whole packet has been received, normally as soon as the destination address is processed. This reduces latency at the switch but decreases reliability, as corrupted packets may be forwarded.

2. Store and forward switch. This design buffers the whole packet before processing it. This enables the switch to validate the integrity of the packet before forwarding it. There is a consequential delay as a result of the buffering process, which increases latency.

Knowing that timings were likely to be in the range of 5 microseconds to 300 microseconds, the low latency cut-through switch design was selected. The delay of switching packets using a cut-through switch is of the order of 300 to 1000 nanoseconds, depending on manufacturer. The delay of store and forward switching is between 500 and 1000 microseconds, again depending on manufacturer.

Therefore, the cut-through design was adopted, with the switch from Arista Networks (7124SX) built into the stack. The 7124SX uses a low latency application specific integrated circuit (ASIC) with a switching latency of 250 nanoseconds. The ASIC is from Fulcrum Microsystems (an Intel company). This network switch runs Arista's Extensible Operating System (EOS), which can support additional features, such as PTP (Precision Time Protocol), and can also use Arista's latency analyser utility, known as LANZ.

The switched design depended on a feature called port mirroring, which is used for monitoring traffic by sending information on a specified physical port to another interface. In this case it was also essential that port mirroring copied both received and transmitted packets on the mirror source to the mirror destination. In the configuration, the source was the port connecting the FIX engine server, and the destination was the server hosting the Endace network monitor card.


Low latency network interface cards (NICs) were selected for all servers. With a non-specialized network card, latencies are around 20 microseconds. Empirical evidence from Solarflare Communications indicates that this can be reduced by 50% by using a specialized low latency network card and by a further 50% using a technique referred to as 'kernel bypass'. Solarflare is a recognized provider of low latency NICs offering kernel bypass support for both Transmission Control Protocol (TCP) and User Datagram Protocol (UDP) traffic. Typically, market data is broadcast using stateless UDP and trade execution uses TCP. The model selected was the dual interface SFN5122-F.

3.3 The Message Passing Process

The diagram below illustrates how messages pass through the test harness.

Figure 3: Execution Flow of Messages Through the Test Harness

With reference to Figure 3, the message flow is described in detail below:

1. The market data simulator created Market Data Incremental Refresh messages (tag 35=X), assigning an MD Entry Price (tag 270) that was incremented through a saw tooth pattern from 0.001 in 0.001 increments, cycling through a small, in-memory list of stocks (tag 55); with each cycle of the saw tooth pattern the integer part of the price was incremented.

[Figure 3 flowchart labels. Components: MD Simulator, FIX Engine Under Test (Session-1, Session-2, Network Card), Algo simulator, EV Simulator. Decisions: Is this a bid (269=0)? Does MD Entry Price end .00? Is this “New” ER? Has “buy” traded? Actions: create “buy” or “sell” Order Single (35=D), create 1st ER (35=8, 39=0), create 2nd ER (35=8, 39=2), hand up message, discard. The key marks the timing points.]


2. The FIX engine listened to this stream of messages on a single FIX session (Session-1) and handed each message up to the algorithmic trading simulator.

3. The algorithm simulator interrogated the data and, when a bid (tag 269=0) had an MD Entry Price that ended in “.000” (e.g. 270=56.000), it instructed the FIX engine to create and send a New Order Single (tag 35=D) message to buy 100 lots (tag 38=100) of the symbol (tag 55) to an execution venue simulator on a second FIX session (Session-2). Since each market data message had a unique price, market data messages could be correlated with the order messages that they triggered.

4. The execution venue simulator automatically filled the order by creating two Execution Reports (tag 35=8). The first had an Order Status of “New” (tag 39=0); the second, “Filled” (tag 39=2). These were returned on the same FIX session (Session-2).

5. On receipt of the fill (tag 35=8; tag 39=2) the algo simulator instructed the FIX engine to send another New Order Single (tag 35=D) to sell 100 lots (tag 38=100) of the same symbol (tag 55).

6. Again, the execution venue simulator automatically filled the order by creating two Execution Reports (tag 35=8). The first had an Order Status of “New” (tag 39=0); the second, “Filled” (tag 39=2).

7. Note: Tests were performed without use of persistent storage.
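The market data pattern and the buy trigger described in steps 1 and 3 can be sketched as follows. This is an illustration only, not the harness code: the function names and symbol list are hypothetical, every message is assumed to be a bid, messages are modelled as plain dictionaries rather than wire-format FIX, and tag 54 (Side, 1 = buy) is taken from the FIX specification rather than from the report.

```python
from itertools import count, islice


def price_stream():
    """Yield prices 0.001, 0.002, ... as three-decimal strings.

    Integer thousandths are used so that no floating point drift occurs.
    """
    for ticks in count(1):
        yield f"{ticks // 1000}.{ticks % 1000:03d}"


def should_buy(fields):
    """Step 3 rule: a bid (269=0) whose MD Entry Price (270) ends in .000."""
    return fields.get("269") == "0" and fields.get("270", "").endswith(".000")


def new_order_single(symbol, side, qty=100):
    """Tag/value pairs for a New Order Single (35=D); tag 54 is an assumption."""
    return {"35": "D", "55": symbol, "54": side, "38": str(qty)}


symbols = ["AAA", "BBB", "CCC"]  # hypothetical in-memory stock list
orders = []
for i, price in enumerate(islice(price_stream(), 5000)):
    # Assumption: every simulated market data entry is a bid.
    md = {"35": "X", "269": "0", "270": price, "55": symbols[i % len(symbols)]}
    if should_buy(md):
        orders.append(new_order_single(md["55"], side="1"))

print(len(orders), "orders from 5000 market data messages")
```

With 0.001 increments, one price in every thousand ends in .000, so under these assumptions the sketch produces one buy order per 1,000 market data messages.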

3.4 Timings

Since timestamps within the test harness hardware components were not sufficiently accurate at the microsecond level, timings were recorded on an Endace network monitor.

Three timestamps were recorded for each benchmark process:

1. Receipt of the market data message from the market data simulator (T1).

2. Transmission of each of the 2 single order messages to the execution venue simulator (T2).

3. Receipt of confirmation of the execution of the order triggered at T1 from the execution venue simulator (T3).

3.5 Post-Test Data Processing

The timings collected on the Endace card were uploaded to a Unix workstation for post processing. Depending on the test run parameters and duration, between 3000 and 24 million timings (generating 24MB of test result data per second) were recorded. Scripts were used to analyze the timestamps to give 2 performance measurements:

1. The FIX engine's ability to process market data messages and create single order messages as a result, calculated as T2-T1 for each buy order.

2. The FIX engine's ability to generate single order messages after receipt of an order filled message, calculated from T3 and T2 for each sell order (the time from receipt of the buy fill, T3, to transmission of the sell order, T2).
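The two measurements can be sketched as below. This is not the consortium's post-processing script: the record layout and timestamp values are hypothetical, and the sell-order interval is assumed to run from receipt of the buy fill (T3) to transmission of the sell order (the second T2).

```python
from statistics import mean

# Hypothetical capture records, with timestamps in microseconds.
# t1: market data received, t2_buy: buy order sent,
# t3: buy fill received, t2_sell: sell order sent.
samples = [
    {"t1": 0.0, "t2_buy": 11.0, "t3": 75.0, "t2_sell": 87.0},
    {"t1": 1000.0, "t2_buy": 1012.0, "t3": 1078.0, "t2_sell": 1089.0},
]

# Measurement 1: market data in to buy order out.
buy_latency = [s["t2_buy"] - s["t1"] for s in samples]

# Measurement 2: fill in to sell order out (assumed interval, see lead-in).
sell_latency = [s["t2_sell"] - s["t3"] for s in samples]

print("mean buy latency (us):", mean(buy_latency))
print("mean sell latency (us):", mean(sell_latency))
```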


3.6 Test Scenarios

The benchmark process was repeated as part of 3 different test cycles covering short to extended duration periods. Each set of benchmark cycles was repeated 3 times, in order to establish mean latency figures for the FIX engines.

The 3 intended test cycles were:

1. Burst test – where 50,000 market data messages per second were generated by the market data simulator for a period of 5 minutes.

2. Sustained test – where market data message rates were increased from 10,000 to 100,000 per second, by 10,000 every 4 minutes, for a total of 40 minutes.

3. Extended sustained test – where market data rates were increased from 10,000 to 50,000 per second, by 10,000 every 10 minutes, and then held at 50,000 for a total time of 4 hours.

At 50,000 market data messages per second, 50 orders per second were generated by the algo simulator, and at 100,000 market data messages per second, 100 orders per second were generated.
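The stepped rate schedules can be written out explicitly, as in the sketch below. The helper name is hypothetical, and the ratio of one order per 1,000 market data messages is inferred from the two rates quoted above rather than stated in the report.

```python
def ramp_schedule(start, stop, step, minutes_per_step):
    """Return (start_minute, messages_per_second) pairs for a stepped ramp."""
    rates = range(start, stop + 1, step)
    return [(i * minutes_per_step, rate) for i, rate in enumerate(rates)]


# Sustained test: 10,000 to 100,000 per second, +10,000 every 4 minutes.
sustained = ramp_schedule(10_000, 100_000, 10_000, 4)
sustained_minutes = len(sustained) * 4

# Extended test ramp: 10,000 to 50,000 per second, +10,000 every 10 minutes.
# After the ramp, the rate is held at 50,000 for the rest of the 4 hour run.
extended_ramp = ramp_schedule(10_000, 50_000, 10_000, 10)

# Inferred order rate: one order per 1,000 market data messages.
orders_per_second = {rate: rate // 1000 for _, rate in sustained}

print("sustained test length (min):", sustained_minutes)
print("orders/s at 50,000 and 100,000 msgs/s:",
      orders_per_second[50_000], orders_per_second[100_000])
```

Ten steps of 4 minutes account for the 40 minute total given for the sustained test.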

Different execution venue simulator delays were tested, since not all matching engines at different exchanges are equal. The delays were varied for the burst test as follows:

Delay (microseconds)    Packets per second
10                      100,000
14                      71,429
20                      50,000
50                      20,000
100                     10,000
200                     5,000
1,000                   1,000
2,000                   500
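The two columns of the table are related by rate = 1,000,000 / delay, that is, one packet per delay interval. The short check below reproduces the table from the delay column alone; the variable names are illustrative.

```python
# Execution venue simulator delays from the table, in microseconds.
delays_us = [10, 14, 20, 50, 100, 200, 1000, 2000]

# One packet per delay interval: packets per second = 1,000,000 / delay.
packets_per_second = {d: round(1_000_000 / d) for d in delays_us}

for delay, rate in packets_per_second.items():
    print(f"{delay:>5} us -> {rate:>7,} packets/s")
```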


4. Results and Observations

A total of 84 test runs were conducted and analyzed, out of an anticipated 96, across the FIX engines tested: B2BITS EPAM and QuickFIX C++; Rapid Addition and QuickFIX Java. Performance issues with the QuickFIX C++ engine prevented it from operating when the execution venue's matching engine was set to a delay shorter than 50 microseconds.

4.1 Effectiveness of Kernel Bypass

A preliminary design test was conducted and results indicated that the use of kernel bypass (Solarflare's OpenOnload product) had an impact on the commercial FIX engines from B2BITS and Rapid Addition across all tests. No observable impact was recorded when testing the open source variants.

With this observation established, it was decided that kernel bypass would be enabled for all tests, irrespective of whether the application design was capable of taking advantage of it.

4.2 Main Results

The results of the tests showed varying performance characteristics among the Java, C++ and open source code streams. These included outright latency when delivering message workloads, and the level of jitter displayed by the engines as they performed their tasks across the period of the test workloads.

The graphs below show a selection of performance characteristics. Full detailed figures for each environment can be seen in the C++ and Java results reports respectively, where each commercial engine is compared with its open source equivalent.


4.2.1 Test Results and Observations

The two graphs above show the latency of the workload completion over a 300 microsecond range, comparing open source against the commercially available Java and C++ FIX engines, respectively.

Buy Orders – Execution Venue simulating a 50 microsecond order matching delay


The two graphs above show the same results over a 60 microsecond range, comparing open source against the commercially available Java and C++ FIX engines, respectively.

The two graphs above show the latency of the workload completion over a 300 microsecond range,

comparing open source against the commercially available Java and C++ FIX engines, respectively. Note

the absence of performance test results from the open source C++ engine under these test conditions.

[Graph title: Buy Orders – Execution Venue simulating a 14 microsecond order matching delay]

The two graphs above show the same results over a 60 microsecond range, comparing open source

against the commercially available Java and C++ FIX engines, respectively.

1. The commercial FIX engines completed the messaging tasks between 30 and 50 microseconds

more quickly than the QuickFIX engines.

2. The QuickFIX engines had outlying results to 300 microseconds (they did not complete their task

inside this time), a source of jitter (unpredictability).

3. QuickFIX C++ was unable to perform with the Exchange simulator set at 14 microseconds.

4. Across the range of tests, each commercial engine exhibited different characteristics, with

differences in outright latency and jitter, which showed no common theme as to performance

characteristics and are hence considered to be within experimental error. This assertion is

demonstrated when examining the whole result set.

5. The open source/free Java and C++ QuickFIX engines showed random variation between themselves,

and the C++ version could not perform at the 14 microsecond load level.

6. The commercial FIX engines were consistent and deterministic throughout the tests.

The commercial engines showed a normal distribution pattern and calculations of standard deviation were

undertaken. The results for the QuickFIX engines showed a large number of outlying results (which

translates to poor reliability in handling trading workloads) and did not fit the normal distribution model.
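The graphs described above bin completion latencies over a fixed range (300 or 60 microseconds). A minimal sketch of that kind of binning follows. The function name, the bin width and the sample values are hypothetical and are not taken from the test harness. Samples at or beyond the range are counted separately as overflow, which is one way to make outliers visible.

```python
def latency_histogram(samples_us, range_us, bin_width_us=1.0):
    """Bin latencies (in microseconds) into fixed-width bins over [0, range_us).

    Returns (counts, overflow). counts[i] covers
    [i * bin_width_us, (i + 1) * bin_width_us). overflow is the number of
    samples at or beyond range_us, i.e. outside the graphed window.
    """
    n_bins = int(range_us / bin_width_us)
    counts = [0] * n_bins
    overflow = 0
    for sample in samples_us:
        if sample < 0:
            raise ValueError("latency cannot be negative")
        index = int(sample // bin_width_us)
        if index >= n_bins:
            overflow += 1  # did not complete inside the window
        else:
            counts[index] += 1
    return counts, overflow


# Hypothetical samples, for illustration only (not measured results).
samples = [11.2, 12.9, 14.0, 45.5, 61.0, 310.0]
counts_300, overflow_300 = latency_histogram(samples, range_us=300, bin_width_us=10)
counts_60, overflow_60 = latency_histogram(samples, range_us=60, bin_width_us=10)
```

Narrowing the range from 300 to 60 microseconds gives finer visual detail for the fast engines, at the cost of pushing more slow samples into the overflow count.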

The commercial FIX engines showed a much tighter distribution range of 4 microseconds, as opposed to

50 microseconds for the QuickFIX engines.

The sample run below illustrates the point. Note the difference in microsecond range on the X axis of

each graph below.

[Figure: two sample-run histograms, each plotting Number Of Samples (Y axis) against Time in µs (X axis). Panels: QuickFIX - open source; Sample commercial engine.]
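The comparison of distribution range and standard deviation can be reproduced on any set of latency samples. The sketch below is illustrative only: the function name and the choice of a three-standard-deviation threshold are assumptions, not the calculation used in the report. For normally distributed data, roughly 0.3% of samples fall beyond three standard deviations, so a much larger share suggests a poor fit to the normal model.

```python
import statistics


def summarize_latencies(samples_us):
    """Return basic spread statistics for latency samples in microseconds."""
    mean = statistics.mean(samples_us)
    stdev = statistics.pstdev(samples_us)  # population standard deviation
    spread = max(samples_us) - min(samples_us)
    # Count samples further than 3 standard deviations from the mean.
    beyond = sum(1 for s in samples_us if abs(s - mean) > 3 * stdev)
    return {
        "mean_us": mean,
        "stdev_us": stdev,
        "range_us": spread,
        "beyond_3_sigma": beyond,
    }


# Hypothetical data: a tight run, and the same run with one large outlier.
tight = summarize_latencies([10.0, 11.0, 12.0, 11.0, 10.0, 12.0])
spiky = summarize_latencies([11.0] * 99 + [300.0])
```

A single large outlier inflates both the mean and the standard deviation, which is why the range and the outlier count are reported alongside them.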

5. Discussion

5.1 Value of the Exercise to the Electronic Financial Trading Community

The testing exercise has illuminated the debate among practitioners who look to quantify the benefits of

commercial FIX engines over their open source counterparts. It is clear that the commercial engines

outperform open source versions by an order of magnitude – and also have significantly higher

consistency in performance, an essential feature for the execution of certain trading strategies. While the

open source model is widely successful as a driver for innovation, in the case of FIX it is clearly

important to select software products based on the required workload and performance characteristics.

The Java based FIX engine closely matched the native C++ code – with each engine showing individual

characteristics.

Finally, the exercise has demonstrated the value of optimized high performance infrastructure when

deploying automated electronic trading systems.

5.2 Performance of the Test Rig

The test rig in the Intel fasterLAB did not prove to be a limiting factor in the testing process. The

infrastructure showed itself to be reliable (with no failed components over the test cycle) when running at

the extremes of performance, including running the CPUs at 100% capacity for prolonged periods.

One enhancement being pursued for future tests is to implement the Precision Time Protocol (PTP),

which is accurate to 500 nanoseconds. PTP-enabled NICs will be tested, including Solarflare's

SFN5322F, which has an accurate oscillator to act as a grand master clock. Other network components

are then synchronized, provided they implement the PTP network daemon, which is available for both

Red Hat and the Arista EOS switch operating system.
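As background to the synchronization step, the sketch below shows the standard PTP delay request-response arithmetic, by which a slave clock estimates its offset from the master. It assumes a symmetric network path, which is the usual PTP assumption. The function name and timestamp values are hypothetical and are not taken from the test rig.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Estimate slave clock offset and one-way path delay.

    t1: master sends Sync          (master clock)
    t2: slave receives Sync        (slave clock)
    t3: slave sends Delay_Req      (slave clock)
    t4: master receives Delay_Req  (master clock)
    All values share one unit (nanoseconds here). Assumes the path delay
    is the same in both directions.
    """
    delay = ((t2 - t1) + (t4 - t3)) / 2
    offset = ((t2 - t1) - (t4 - t3)) / 2  # slave time minus master time
    return offset, delay


# Hypothetical exchange: slave runs 1500 ns ahead, path delay is 2000 ns.
offset_ns, delay_ns = ptp_offset_and_delay(
    t1=1_000_000, t2=1_003_500, t3=1_010_000, t4=1_010_500
)
```

Any asymmetry between the two directions shows up directly as an error in the offset estimate, which is one reason hardware time stamping at the NIC or switch is attractive.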

Latency can be measured at the switch using the LANZ feature from Arista. This will reduce the number

of components required to accurately time stamp network packets generated during the trade cycle.

This enhancement will continue to ensure the integrated trading test suite remains at the leading edge of

network component innovation.

5.3 Raising the Test Rig to Production Standard

5.3.1 Deploying Production-Quality Infrastructure

A focus of this series of tests has been to illustrate the importance of design in the technical infrastructure

and its direct and positive impact on performance.

Moving from a lab experiment to a stable production system, which can support live trading execution

strategies that rely on speed and reliability, can be expensive and time consuming.

Deploying high performance infrastructure requires prudent engineering discipline, which has to be

accommodated in any implementation plan. This is characterized by the non-functional requirements

listed below:

Reliability – the ability of a system to reproduce the same results under the same conditions on

an on-going basis with minimal intervention.

Availability – the ability to continue operation, with failover/disaster recovery when one or more

components fail.

Testability – the ability to scrutinize and assert the integrity of the system as fit for purpose, as

planned and required.

Manageability – control of the system, start, stop and vary the control parameters, using planned

resources, be they in-house or outsourced to a service provider.

Performance – closely aligned to reliability – the ability of the system to work within the required

functional constraints and meet operational expectations.

Security – ensure that access control, audit and privacy of the system are maintained, including the

required audit trails and access to information for internal prudence and external compliance.

Scalability – the system can maintain performance requirements and/or accommodate spikes in

demand as workloads increase within defined boundaries.

Extensibility – the ease of change of a component without consequential change to adjacent

components – the ability to extend the scope of the system to support additional business

functions, e.g. adjacent and/or new roles such as risk reporting, compliance, etc.

Project governance is required across the implementation of a high performance trading infrastructure.

This begins with an analysis of the current environment, whether it is a green field deployment or a

complete replacement of existing systems.

A critical component is to ensure that any new system can integrate effectively with existing systems

(SOR, risk, market data, etc.). Across the financial services technology landscape, these skills and

competencies are typically spread across multiple parties with differing and often overlapping areas of

competence and responsibility. This can introduce variance in the effectiveness of the trading

infrastructure, which can impact the overall effectiveness of the deployment project.

The consortium has been assembled to bring together parties who can carry the resource loads

of planning and designing suitable infrastructure within the context of each firm's current and ongoing

environment. This approach can be equally applied whether the deployment is in-house or at a

co-location facility in proximity to market liquidity.

5.3.2 Implementing Commercial FIX Engines

Implementing a FIX engine is a non-trivial exercise, which can be split into two parts.

1. Application Integration: The market data, algorithm and trade execution components of a

trading platform need to be linked to the FIX engine through linked libraries. This requires a level

of programming skill that depends on the complexity of the trading platform, level and quality of

the FIX engine documentation and the number of FIX engine touch points to the trading

application.

Even the simplified test rig had three implementation stages: 1) Planning the application

integration; 2) Executing and optimizing the applications and infrastructure for optimum

performance; and 3) Commissioning and deploying the infrastructure. These stages can be

accelerated while identifying and containing elements of risk by engaging a suitable specialist

systems integrator, such as GreySpark Partners.

2. Execution Venue Integration: Each execution venue will have its own rules and tests to allow

market participants to join the market. These tests typically require validation within test

environments with a prescribed test schedule. Passing the venue integration test requires planning

and logistical rigor.
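To give a sense of what a FIX engine handles on every message, the sketch below assembles a tag=value FIX message and computes the two framing fields defined by the FIX standard: BodyLength (tag 9) and CheckSum (tag 10). It is a simplified illustration, not the API of any engine tested here. The function names, the FIX.4.2 version string and the field values are assumptions, and the example order is not complete or venue-valid.

```python
SOH = "\x01"  # FIX field delimiter


def build_fix_message(begin_string, msg_type, fields):
    """Assemble a FIX message from (tag, value) pairs.

    BodyLength counts the bytes after the BodyLength field up to and
    including the delimiter before CheckSum. CheckSum is the sum of all
    preceding bytes modulo 256, written as three digits.
    """
    body = f"35={msg_type}{SOH}" + "".join(
        f"{tag}={value}{SOH}" for tag, value in fields
    )
    header = f"8={begin_string}{SOH}9={len(body.encode('ascii'))}{SOH}"
    unsigned = header + body
    checksum = sum(unsigned.encode("ascii")) % 256
    return f"{unsigned}10={checksum:03d}{SOH}"


def checksum_ok(message):
    """Recompute the checksum of a message and compare it with tag 10."""
    head, marker, tail = message.rpartition(f"{SOH}10=")
    if not marker:
        return False
    expected = sum((head + SOH).encode("ascii")) % 256
    return tail.rstrip(SOH) == f"{expected:03d}"


# Hypothetical buy order (MsgType D); identifiers and values are made up.
order = build_fix_message("FIX.4.2", "D", [
    (49, "BUYSIDE"), (56, "VENUE"), (34, 1), (11, "ORD1"),
    (55, "TEST"), (54, 1), (38, 100), (40, 2), (44, "10.5"),
])
```

A production engine additionally manages sessions, sequence numbers, resend requests and venue-specific validation, which is where most of the integration effort described above is spent.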

5.3.3 Implementing Production-Ready Networks

Building the network infrastructure to production standard is a pre-requisite of the integration work. Four

areas of infrastructure design and operation need to be considered.

1. Assessment of and elimination of single points of failure: the test rig had one network link to a

single switch. Building redundant links between servers and having a redundant switch is

common practice, and is recommended when deploying this type of infrastructure. In network

terms this is known as multi-chassis link aggregation groups (MLAG). See Figure 4 below.

2. Application failover: failover via the network enables the software components of a system to

restart on a standby server (for High Availability). Clustering techniques reduce the time that

failover takes to complete; business input is required to assess the relative cost of the outage

period and so determine the appropriate complexity of the clustering solution.

3. Backup and Restore: every solution should have a tested backup and restore mechanism to protect

the business from system failure. Since some trading platforms tend to be stateless, the restore

mechanism will resemble the original commissioning steps (having recorded configuration details

and files). Designing a backup and restoration mechanism requires the same business input as

application failover, specifically guided by the cost of outage.

4. Operational Management: this encompasses all aspects of change and configuration

management, systems monitoring and maintenance of operational integrity.
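The trade-off between outage cost and solution complexity in items 2 and 3 can be framed as simple arithmetic. The sketch below is one hypothetical way to compare options. The cost model, the function names and all figures are assumptions for illustration, not data from the tests.

```python
def expected_outage_cost(failures_per_year, recovery_seconds, cost_per_second):
    """Expected yearly cost of downtime for one recovery mechanism."""
    return failures_per_year * recovery_seconds * cost_per_second


def compare_options(options, failures_per_year, cost_per_second):
    """Return (cheapest option name, total yearly cost per option).

    options maps a name to (yearly solution cost, recovery time in seconds).
    """
    totals = {
        name: solution_cost
        + expected_outage_cost(failures_per_year, recovery_s, cost_per_second)
        for name, (solution_cost, recovery_s) in options.items()
    }
    return min(totals, key=totals.get), totals


# Hypothetical figures only.
options = {"cold standby": (10_000, 600), "cluster": (50_000, 5)}
best_high, totals_high = compare_options(options, failures_per_year=2, cost_per_second=100)
best_low, totals_low = compare_options(options, failures_per_year=2, cost_per_second=1)
```

With a high cost per second of outage the more complex clustering option is cheaper overall, while with a low cost the simpler standby wins, which is why business input on the cost of outage drives the design.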

Figure 4: Reliable Network Connectivity

5.4 Exploiting the Results

The choice of application software for financial services is extensive. Making an appropriate selection is a

challenge for the business – be it buy-side, sell-side or execution venue. For those applications addressing

the trading functions there is a lack of transparency and consistency in measuring and assessing

performance of solutions from individual vendors, or solution sets of interoperating elements.

This consortium based testing program is an exercise in collaboration to achieve operating solutions with

FIX messaging as the initial focus. The composition of the group models the reality facing trading firms.

Production systems come from the assembly of many parts from many entities and with resources from

different sources. This exercise has provided a basis for consistent testing and comparison of how

technologies handle the (FIX) business workloads.

Its success in achieving a granular and detailed set of results comes in major part from the facilitation that

OnX Enterprise Solutions brings through its product distribution and architecture design capabilities –

and Intel's objective to support testing of solution scenarios on its Xeon processors. Across the team, each

party to the consortium has volunteered its core capability, whether it is product or service IP.

The combined resources effectively anticipate the exercise that trading firms would have to address in

selection, procurement and commissioning of systems. Remarkable levels of co-operation and open

sharing have been displayed, with visibly useful results. This output can feed directly into the technology

selection processes of firms.

[Figure 4 labels: MLAG Peer Link; Private Cluster links; MLAG pair (two instances)]

6. Conclusion

The major result from the testing exercise was the collaboration between parties to create a robust and

representative testing environment, which was able to produce results simulating real-life conditions and

their effect on the key function of FIX message transmission.

The commercial FIX engines were between 4 and 16 times faster (depending on load) than the open

source QuickFIX equivalent engines, with an average latency test result of 11 microseconds, as opposed

to 180 microseconds. This was even more evident when the performance of the execution venue was

increased to reflect faster matching (sub 50 microseconds). Stress exerted on the FIX engine drew out

different performance characteristics.
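The headline ratio can be checked against the quoted averages. The short calculation below uses only the two average latencies stated above; the function and variable names are illustrative.

```python
def speedup(slow_us, fast_us):
    """How many times faster the fast engine is than the slow one."""
    return slow_us / fast_us


commercial_avg_us = 11.0   # average latency quoted for the commercial engines
quickfix_avg_us = 180.0    # average latency quoted for the QuickFIX engines

ratio = speedup(quickfix_avg_us, commercial_avg_us)
saving_us = quickfix_avg_us - commercial_avg_us
```

The ratio of the two averages is a little over 16, which corresponds to the upper end of the quoted range of 4 to 16 times.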

Under different stress conditions, each engine exhibited different characteristics. The commercial

engines' performance was vastly superior to that of the open source models. The standard deviation from

the mean for a commercial engine was only 1 microsecond.

The open source software exhibited results which when translated into the real production world would

not be considered sufficiently robust to support automated trading strategies. The major factors affecting

open source variants are poor performance under high load, higher levels of network jitter and trade

execution outliers up to 300 microseconds.

Tuning the Network Interface Cards with kernel bypass technology improved the performance of both

commercial engines and demonstrated a 50% reduction in latency. This translated into a round-trip saving

in latency, which would have material impact on the trading strategy being executed. Engineering an

integrated trading platform was proven to deliver incremental benefits in reducing overall latency.

Both Java and C++ environments in open source and commercial form exhibited individual

characteristics across the various code streams in the applications. This indicates the on-going scope for

improvement in the software, which can lead to improvements in overall performance.

The test results demonstrated that trading strategies which rely on minimising response times should be

deployed on a high performance infrastructure. This is integral in obtaining enhanced levels of

performance and reliability. Each layer in the technology stack has a role to play with incremental

enhancements being possible when implementing options, such as kernel bypass.

Appendices

1. Consortium

The consortium comprises a group of companies whose combined capability maps to the provision of

trading technology solutions. This is not a closed group – and is fully open to inputs from additional

parties on an on-going basis.

1a Technology Members

A number of technology and services providers have invested as charter members of the consortium.

However the initiative is open, and further participant members may be added in the future. Between

them, these members provide a complete infrastructure capability and created the reference architecture,

each drawing on specific expertise, while OnX provided the integration and build capability.

The charter group members with technology products directly involved in building the rig and in the

performance benchmark testing comprise:

OnX Enterprise Solutions – As consortium lead, OnX selected vendors for the benchmark test stack,

built the test rig by integrating the product components and interpreted the results of the tests.

Arista Networks – Provided its 7124SX network switch to connect servers for the benchmark and its

LANZ (Latency Analyzer) capability for tuning.

Dell – The benchmark was run on two Dell PowerEdge R710 servers, one of which was equipped with an

Intel® Xeon® processor X5698.

Intel – The benchmarks were conducted at Intel‟s fasterLAB in the UK. Intel® Xeon® processors X5677

and X5698 were installed in the Dell servers. Intel engineers screened hardware and software

performance for optimum utilisation of iA (Intel Architecture) features, including use of Intel Compiler.

Solarflare Communications – SFN5122F 10 gigabit Ethernet network adaptors were installed in each of

the Dell servers, offering kernel-bypass communications.

Other consortium members, which can provide services for deployment in real life production scenarios –

be they Co-Lo, onsite or other – include:

Edge Technology Group – Provides integration and managed services, in particular for buy-side participants.

Equinix – Runs financial services data centres around the globe supporting high-performance

trading across multiple asset classes on a deep mix of trading venues. Trading participants are

connected inside the data centre using cross-connects to reduce network latency delay and enable

price discovery, order routing and execution at the highest possible performance levels.

GreySpark Partners – Provides 'top down' trading strategy and technology consulting, and

integration services, with a focus on assessing requirements and designing 'technology bundles'

for high performance.

1b Application Providers Being Tested

Rapid Addition – FIX engine "Cheetah", in Java, and QuickFIX Java harness.

B2BITS EPAM – FIX engine "FIX Antenna" 2.7, in C++, and QuickFIX C++ harness.

QuickFIX – Open source FIX engine in C++ and Java.