1 High Frequency Futures Data Ewan Kirk CEO Cantab Capital Partners LLP Jan 2008


Page 1

High Frequency Futures Data

Ewan Kirk CEO Cantab Capital Partners

LLP

Jan 2008

Page 2

Introduction

A vast amount of information is produced in the financial markets every day

Typically the vast majority of the data is either discarded or ignored by both practitioners and researchers

With powerful computers and massive data storage more of this data can be analysed

But…

We’ve lived in a Gaussian or near-Gaussian financial world for our entire professional lives and much of this data is not even close to Gaussian

Concepts like “return” and “volatility” cease to have meaning

There are huge amounts of data: 200 MB per contract per day.

In the univariate case, the data on a single contract is bursty and not evenly spaced

In the multivariate case, data is not contemporaneous.

There is no obvious intellectual framework.

But…

The financial opportunities are large.

Page 3

A Descent into the Data

20 years of SP500 data. Eyeball statistics tell you that this is log-normal with a drift. And, to a first approximation, it is.

But think hard about this. What is this data? Does it bear any resemblance to the market on the day? There are 5000 data points here and this is probably about as much as you can reasonably hope to work with.

[Figure: S&P 500 Spot Index, 15 Jan 1988 to 1 Jan 2006; y-axis 200 to 1600]

Page 4

What about a month’s worth of data?

29 days of SP500 data. It’s pretty obvious here that you can’t say much statistically.

Oh and don’t forget that there are weekends, holidays and early closing days in this graph.

[Figure: S&P 500 Spot Index, 1 Dec 2007 to 21 Jan 2008; y-axis 1380 to 1520]

Page 5

Could we use more frequent data?

The futures market generates a lot of data but there are lots of little issues with futures.

They’re not the same as the spot index (and in some cases, like oil, there isn’t a spot index). They also expire and roll. There is lots of tough stuff to worry about here, but let’s ignore all these issues and just look at the hourly data.

[Figure: E-Mini S&P 500 Index CME Nrby b 01 and S&P 500 Spot Index, 1 Dec 2007 to 21 Jan 2008; y-axis 1380 to 1540]

Page 6

So here’s hourly data

Oh no! What’s gone wrong?

Well there are zeros in the data stream

[Figure: hloc2(ccp_rt_esh8.trdprc_1,-60), 12/1 7:30:00 to 1/31 7:30:00; y-axis 0 to 1600]

Page 7

Let’s clean it up

Looking a bit better but there are some pretty odd things happening here
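The chart labels on this slide wrap the series in `zapz(...)`, which appears to be the author's own tool; its exact behaviour is not documented here. A minimal sketch, assuming it simply drops observations whose price is zero (the `(timestamp, price)` layout and the sample values are hypothetical):

```python
from typing import Iterable, List, Tuple

Tick = Tuple[str, float]  # (timestamp, price); this layout is an assumption


def drop_zero_prices(ticks: Iterable[Tick]) -> List[Tick]:
    """Keep only observations with a strictly positive price."""
    return [(ts, px) for ts, px in ticks if px > 0.0]


# Made-up hourly observations with zeros in the stream
raw = [("07:30", 1395.25), ("08:30", 0.0), ("09:30", 1396.00), ("10:30", 0.0)]
clean = drop_zero_prices(raw)
print(clean)
```

Dropping the zeros leaves an unevenly spaced series, which is part of why the cleaned chart still looks odd.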

And let’s not forget that I’ve just said “hourly”, but what does that mean? Average price over the hour (argh!), highest price, lowest price, last price, first price? When does an hour start and end? Last traded price or mean of last bid and offer? How much do you weight a price at 11pm on a Friday compared to 2pm on a Wednesday? What do 11pm and 2pm mean in this context?
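The questions above can be made concrete with a small sketch that computes several candidate "hourly" prices from the same ticks. The bucketing rule (clock hours, start inclusive) and the sample prices are assumptions chosen for illustration, not the author's method:

```python
from collections import OrderedDict
from datetime import datetime
from statistics import mean


def hourly_bars(ticks):
    """Group (datetime, price) ticks by clock hour and summarise each hour.

    Assumes ticks are sorted by time and that an "hour" runs from hh:00:00
    inclusive to the next hh:00:00 exclusive. Both are arbitrary choices.
    """
    buckets = OrderedDict()
    for ts, px in ticks:
        key = ts.replace(minute=0, second=0, microsecond=0)
        buckets.setdefault(key, []).append(px)
    return {
        key: {
            "first": pxs[0],
            "high": max(pxs),
            "low": min(pxs),
            "last": pxs[-1],
            "mean": mean(pxs),
        }
        for key, pxs in buckets.items()
    }


# Made-up trade prices
ticks = [
    (datetime(2008, 1, 15, 10, 5), 1396.00),
    (datetime(2008, 1, 15, 10, 20), 1398.00),
    (datetime(2008, 1, 15, 10, 55), 1394.00),
    (datetime(2008, 1, 15, 11, 10), 1395.00),
]
bars = hourly_bars(ticks)
for hour, bar in bars.items():
    print(hour, bar)
```

Even in this toy case the first, last and mean prices for the 10:00 hour all differ, so "the hourly price" depends on which convention is picked.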

[Figure: zapz(hloc2(ccp_rt_esh8.trdprc_1,-60)), 680 data points, 12/1 7:30:00 to 1/31 7:30:00; y-axis 1360 to 1540]

Page 8

Let’s zoom in some more. One Day

Hourly: 24 (?) points. 15 minute: 93 points. Minutely: 1387 points. Note that 4 days of minutely data is more than the 20 year S&P graph!

Clearly minutely data gives a lot more information than hourly and there are some interesting bits of structure here. But there are lots of times when not much is happening. How do we deal with this? Oh and the kurtosis is 45…

Even with minute data, we’re throwing away more than 99.5% of the data.
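The kurtosis remark above can be illustrated with a short sketch. The slide does not say which convention (plain or excess) or which return definition produced the figure of 45, so the plain fourth-moment ratio on log returns used here is an assumption, and the prices are made up:

```python
from math import log


def log_returns(prices):
    """Log return between each pair of consecutive prices."""
    return [log(b / a) for a, b in zip(prices, prices[1:])]


def sample_kurtosis(xs):
    """Plain (non-excess) kurtosis: fourth central moment over variance squared.

    A Gaussian sample gives a value of roughly 3.
    """
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / (m2 * m2)


# Made-up minutely prices: nothing happens, then a single jump
prices = [1400.0] * 50 + [1410.0] * 50
rets = log_returns(prices)
print(sample_kurtosis(rets))
```

A series that is flat most of the time with occasional jumps produces a kurtosis far above the Gaussian value, which is the pattern the slide describes.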

[Figure: Hourly, 15 Minutes and Minutes series, 1/15 0:00:00 to 1/16 0:00:00; y-axis 1370 to 1420]

Page 9

Tick Data

Every time something happens on a futures exchange, the exchange sends out a message to every subscriber saying what has happened. So what can happen?

A trade. We get trade time (to nearest millisecond but there are lags), trade volume and, obviously, trade price.

The best bid can change and the size on the best bid can change.

The best offer can change and the size on the best offer can change.

In addition, there is “level data” which is the next best bids and offers down 5 (or more) levels.

This screen flashes continuously pretty much all day…
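The message types described on this slide can be sketched as a small data model. The class and field names are hypothetical (no real feed format is implied), the timestamps are invented, and level data is left out for brevity:

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Trade:
    time_ms: int   # timestamp in milliseconds since midnight (assumed unit)
    price: float
    volume: int


@dataclass(frozen=True)
class Quote:
    time_ms: int
    bid: float
    bid_size: int
    ask: float
    ask_size: int


@dataclass
class TopOfBook:
    """Latest best bid/offer and last trade seen so far."""
    quote: Optional[Quote] = None
    last_trade: Optional[Trade] = None

    def apply(self, event: Union[Trade, Quote]) -> None:
        if isinstance(event, Trade):
            self.last_trade = event
        else:
            self.quote = event

    def spread(self) -> Optional[float]:
        return None if self.quote is None else self.quote.ask - self.quote.bid


book = TopOfBook()
book.apply(Quote(time_ms=38_293_010, bid=1397.25, bid_size=69, ask=1397.50, ask_size=186))
book.apply(Trade(time_ms=38_293_020, price=1397.25, volume=1))
print(book.spread(), book.last_trade)
```

Keeping trades and quotes as separate event types preserves the fact that a quote change and a trade are different things, which bar-style summaries discard.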

Page 10

Tick Data

This is starting to look more interesting.

[Figure: zapz(ccp_rt_esh8.trdprc_1), zapz(ccp_rt_esh8.bid) and zapz(ccp_rt_esh8.ask), 1/15 0:00:00.000 to 1/16 0:00:00.000; y-axis 1370 to 1420]

Page 11

Zoom in (30 minutes)

This is a randomly chosen 30 minute window from 10:30 EST to 11:00 EST on the 15th of January

[Figure: Trade, Bid and Ask, 1/15 10:30:00.000 to 1/15 11:00:00.000; y-axis 1390 to 1400]

What can we say about this data? The digital nature of the system is starting to become more obvious.

Oh and let’s not forget that there are 1400 trades in this period and 43488 changes of the bid or offer price or size.

In this single half hour, there is more data than in the entire history of the S&P series since 1945… And we get a new set every 30 minutes…

Page 12

Zoom in even more (2 minutes)

[Figure: Trade, Bid and Ask (price axis 1395.5 to 1397.5) with Traded Volume, Bid Size and Ask Size (size axis 0 to 1400), 1/15 10:38:00.000 to 1/15 10:40:00.000]

Page 13

Zoom in even more (10 seconds)

[Figure: Trade, Bid and Ask (price axis 1396.75 to 1397.5) with Traded Volume, Bid Size and Ask Size (size axis 0 to 650), 1/15 10:38:10.000 to 1/15 10:38:20.000]

Page 14

Zoom in even more (1 second)

[Figure: Trade, Bid and Ask (price axis 1396.75 to 1397.5) with Traded Volume, Bid Size and Ask Size (size axis 0 to 650), 1/15 10:38:13.000 to 1/15 10:38:14.000]

Rule of thumb, one second of data is equivalent to about 6 months of daily data.

Look at the interesting structure. Artifacts too!

Did the bid size change here? Nope

Trades happening on the bid

Simultaneous trades at bid and offer? Then nothing for over 2/10 of a second! Relative calm!

Page 15

Here’s that same second in tabular format

Time      Trade    Bid      Ask      Traded Volume  Bid Size  Ask Size
10:38:13  1397.25  1397.25  1397.50  1              69        186
10:38:13  -        1397.25  1397.50  -              68        189
10:38:13  1397.50  1397.25  1397.50  20             68        180
10:38:13  -        1397.25  1397.50  -              71        181
10:38:13  -        1397.25  1397.50  -              76        173
10:38:13  -        1397.25  1397.50  -              76        160
10:38:13  1397.25  1397.25  1397.50  76             76        170
10:38:13  1397.00  1397.00  1397.25  250            134       9
10:38:13  1397.00  1397.00  1397.25  8              133       93
10:38:13  1397.00  1397.00  1397.25  5              111       122
10:38:13  1397.00  1397.00  1397.25  2              100       136
10:38:13  1397.00  -        -        1              -         -
10:38:13  1397.00  1397.00  1397.25  1              91        138
10:38:13  1397.00  1397.00  1397.25  20             91        146
10:38:13  1397.00  1397.00  1397.25  1              69        126
10:38:13  1397.00  1397.00  1397.25  1              68        126
10:38:13  1397.00  1397.00  1397.25  1              2         122
10:38:13  1397.00  1396.75  1397.00  2              611       58
10:38:13  -        1396.75  1397.00  -              596       59
10:38:13  -        1396.75  1397.00  -              564       35
10:38:13  -        1396.75  1397.00  -              544       50

Note that Excel (which was used to reformat the data) doesn’t understand times less than one second.

Page 16

Shall we make it more complicated?

One minute in equities land.

I’ve removed bids/asks and sizes (but don’t forget the richness of that data)

FTSE and the S&P are both equities so they should be related but how? Not sure that the “correlation of returns” is really going to help here…
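One reason “correlation of returns” is awkward here is that the two contracts tick at different, irregular times. A common workaround, sketched below, is previous-tick sampling onto a shared time grid. This is a standard technique and not necessarily what the author uses; the times, prices and the one-second grid are made-up assumptions:

```python
from bisect import bisect_right


def previous_tick(times, values, grid):
    """For each grid time, return the last value observed at or before it.

    Returns None for grid times before the first observation.
    `times` must be sorted in ascending order.
    """
    out = []
    for t in grid:
        i = bisect_right(times, t)
        out.append(values[i - 1] if i > 0 else None)
    return out


# Seconds past 10:00:00; illustrative trade prices only
spx_t, spx_p = [0.2, 0.9, 3.1], [1396.00, 1396.25, 1396.50]
ftse_t, ftse_p = [1.5, 2.2], [6010.0, 6010.5]

grid = [1.0, 2.0, 3.0, 4.0]
spx_on_grid = previous_tick(spx_t, spx_p, grid)
ftse_on_grid = previous_tick(ftse_t, ftse_p, grid)
print(list(zip(grid, spx_on_grid, ftse_on_grid)))
```

The choice of grid spacing matters: too fine and most grid points repeat stale prices, too coarse and most of the ticks are thrown away again.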

[Figure: SP500 Trade, FTSE Trade, Dax, CAC and STOXX, 1/15 10:00:00.000 to 1/15 10:01:00.000; y-axis -2 to 6]

Page 17

Oh and different assets look different at micro scale

One minute, quite a lot of moves but maybe not as many trades?

[Figure: Crude Oil, Crude Bid and Crude Ask, 1/15 10:00:00.000 to 1/15 10:01:00.000; y-axis 91.92 to 92.03]

Page 18

Here’s another one

The bund moves a lot less in one minute

[Figure: Bund, Bund Bid and Bund Ask (price axis 115.77 to 115.79) with FGBLH8.BIDSIZE and FGBLH8.ASKSIZE (size axis 0 to 1000), 1/15 10:00:00.000 to 1/15 10:01:00.000]

Page 19

Let’s not lose sight of the Amazonian rain forest for the trees

There are literally thousands of futures contracts. Hundreds of them produce this much data every second.

Conservative estimate is that since the start of electronic trading in 2002, the crude oil contract has produced 30 billion data points (5 levels deep bids and asks x 12 contracts x 255 days x 5 years).

Include equity indices, bonds, currency futures, other commodities, and it is close to 10 trillion data points.

This is a hugely richer data set than the usual SP500, Lehman Bond Index and “Oil” daily data that most people seem to do research on.

Apart from high energy physics, there probably aren’t very many areas where there is this much data which needs to be modelled.

But why do we want to model it?
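As a rough check on the crude oil estimate above, the sketch below multiplies out the quoted factors and backs out the update rate they imply. Reading “bids and asks” as two sides of the book is an assumption, as is treating the 30 billion as the product of these factors and a per-day update count:

```python
# Factors quoted on the slide; sides = 2 is an assumed reading of "bids and asks"
levels = 5
sides = 2
contracts = 12
days_per_year = 255
years = 5

# Number of (level, side, contract, day) combinations over the period
book_slot_days = levels * sides * contracts * days_per_year * years

quoted_total = 30e9  # "30 billion data points" from the slide

# Implied average updates per level, per side, per contract, per day
implied_updates = quoted_total / book_slot_days

print(book_slot_days)
print(round(implied_updates))
```

Under these assumptions the estimate corresponds to a little under 200,000 updates per book level per day.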

Page 20

Either “make money” or “don’t lose money”

Make Money (Statistical “Arbitrage”):

With all this information, can we predict where the next trade will be? Can we identify short term trends or short term mean reversion? Does the intra-day information tell us something about the next tick, the next 30 seconds, the next hour, the next day?

Forget your GARCH models: intra-day volatility is a lot better than GARCH at forecasting vol tomorrow.

Costs in these markets are tiny! FX is the best, with $1m of notional costing $3 to trade, 1/30th of the tightest bid-offer spread. FX spreads are often <1bp. Futures costs are considerably less than one tick and the markets are 1 tick wide most of the time. So if you can get a 2 tick move, you’re making money.

But how do you back test strategies? Just because you saw a trade at the bid doesn’t mean that you got done at the bid in your backtest. Queues, latency etc.
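The remark above about intra-day volatility can be illustrated with realized variance, the sum of squared intra-day log returns. The slide does not say which estimator is meant, so this is a naive sketch under that assumption. The forecast rule (root of the average of past daily realized variances) and the prices are invented for illustration:

```python
from math import log, sqrt


def realized_variance(prices):
    """Sum of squared log returns over one day's sampled intra-day prices."""
    return sum(log(b / a) ** 2 for a, b in zip(prices, prices[1:]))


def naive_vol_forecast(intraday_paths):
    """Forecast next-day volatility as the root of the mean realized variance."""
    rvs = [realized_variance(p) for p in intraday_paths]
    return sqrt(sum(rvs) / len(rvs))


# Two made-up "days" of sampled prices
day1 = [1396.00, 1397.00, 1395.50, 1396.25]
day2 = [1396.25, 1396.50, 1396.00, 1398.00]
forecast = naive_vol_forecast([day1, day2])
print(forecast)
```

Sampling at the tick level would feed bid-offer bounce straight into the squared returns, so the sampling interval is itself a modelling decision.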

Don’t Lose Money (Algorithmic Trading)

If the bid market depth is 50 lots at 100, 50 lots at 99 and 250 lots at 98, and I have to sell 350 lots, I know I can do this right now at a WAP of about 98.43. Can I do better? What about if I need to do 3500 lots? There is a risk trade-off here.

And it gets even more complicated because I might need to do 100 trades (“program trading” as it is known). I can wait but might miss my price. Market impact, order arrival, agents… This has spawned the whole VWAP, TWAP, Iceberg… etc. industry.
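The weighted average price in the depth example above can be reproduced by walking down the bid side of the book. A minimal sketch, with a hypothetical function name and the book given as `(price, size)` pairs, best bid first:

```python
def sell_wap(bids, quantity):
    """Average price from selling `quantity` lots immediately into the bids.

    `bids` is a list of (price, size) pairs ordered from best bid downwards.
    Raises ValueError if the displayed depth is smaller than `quantity`.
    """
    remaining = quantity
    proceeds = 0.0
    for price, size in bids:
        take = min(size, remaining)
        proceeds += take * price
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("not enough displayed depth")
    return proceeds / quantity


bids = [(100.0, 50), (99.0, 50), (98.0, 250)]
wap = sell_wap(bids, 350)  # (50*100 + 50*99 + 250*98) / 350 = 34450 / 350
print(round(wap, 2))
```

Asking the same function for 3500 lots raises an error, because the displayed depth is only 350 lots. That is the point at which the order has to be worked over time and the risk trade-off begins.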

There are LOTS of computers in the market doing this on the back of some pretty hokey modelling.

Page 21

When Algorithms Go Bad…

10 very very bad seconds for some statistician at a bank…

[Figure: ESH8.TRDPRC_1, ESH8.BID and ESH8.ASK (price axis 1355 to 1435) with ESH8.ACVOL_1 (volume axis 20000 to 80000), 1/14 2:01:10.000 to 1/14 2:01:20.000]

Page 22

Next Steps

I’m not presenting a model

I’m presenting a problem.

A big problem.

If anybody is interested in the problem then I’m happy to talk through it in more detail.

I was going to hand out a CD with 600 MB of data on it. This was the data for 15Jan08 for the front month FTSE, SP500, 10y Bund and Crude oil.

But there are licensing issues with our data provider. If you want access to this data then get in touch and we’ll work out a way to do it.

[email protected]

Page 23

Disclaimer

This document is issued by Cantab Capital Partners (“CCP”), authorised and regulated by the FSA in relation to shares (the “Shares”) in the CCP Quantitative Fund (the “Fund”). The Fund will not be a recognised collective investment scheme under the Financial Services Act 1986 (the “Act”) and accordingly, investors in the Fund will not benefit from the rules and regulations made under the Act for the protection of investors, nor from the UK Investors’ Compensation Scheme.

CCP are regulated by FSA. This Brochure is issued only to persons falling within article 11(3) of the Financial Services Act 1986 (Investment Advertisements) (Exemptions) Order 1996 and may not be passed on to any other person. It does not constitute an offer or solicitation of an offer of any investment or investment service.

The value of the Shares, and any income from them, may go down as well as up and an investor may not receive back, on redemption of his Shares, the amount which he invested. Changes in rates of exchange between the US Dollar and the currencies in which the investments of the Fund are denominated may cause the value of the Shares to go up or down. The Shares will not be dealt in on a recognised or designated investment exchange for the purposes of the Act, nor will there be a market maker in the Shares, and it may therefore be difficult for an investor to dispose of his Shares otherwise than by way of redemption or to obtain reliable information about the extent of the risks to which his investment is exposed.

This document does not constitute or form part of any offer to issue or sell, or any solicitation of any offer to subscribe or purchase, the Shares, nor shall it or the fact of its distribution form the basis of or be relied on in connection with, any contract therefore. Recipients of this document who intend to apply for Shares following the publication of the prospectus to be issued by the Fund are reminded that any such application must be made solely on the basis of the information and opinions contained in the prospectus which may be different from the information and opinions contained herein. Neither CCP, nor their directors or employees warrant the accuracy, adequacy or completeness of the information contained herein and CCP expressly disclaims liability for errors or omissions in such information. No warranty of any kind implied, express or statutory is given by CCP or any of its directors or employees in connection with the information contained herein. Under no circumstances may this document, or any part thereof, be copied, reproduced or redistributed without the express permission of a partner of CCP. Registered in England No. OC317557. Registered office: Daedalus House, Station Road, Cambridge, CB1 2RE © Cantab Capital Partners LLP