Predicting Returns and Volatilities with Ultra-High Frequency Data

http://weber.ucsd.edu/~mbacci/engle/


EFFICIENT MARKET HYPOTHESIS

In its simplest form asserts that excess returns are unpredictable - possibly even by agents with special information
Is this true for long horizons?
It is probably not true at short horizons
Microstructure theory discusses the transition to efficiency


Why Don’t Informed Traders Make Easy Profits?

Only by trading can they profit
If others watch their trades, prices will move to reduce the profit
When informed traders are buying, sellers will require higher prices until the advantage is gone.
Trades carry information about prices


TRANSITION TO EFFICIENCY

Glosten-Milgrom (1985), Easley and O’Hara (1987), Easley and O’Hara (1992), Copeland and Galai (1983) and Kyle (1985)
Two indistinguishable classes of traders - informed and uninformed
When there is good news, informed traders will buy while the rest will be buyers and sellers.
When there are more buyers than sellers, there is some probability that this is due to informed traders; hence prices are increased by sophisticated market makers.


CONSEQUENCES

Informed traders make temporary excess profits at the expense of uninformed traders.
The higher the proportion of informed traders, the faster prices adjust to trades, the wider is the bid ask spread and the lower are the profits per informed trader.
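
The link between the proportion of informed traders and the bid ask spread can be illustrated with a minimal sketch in the spirit of Glosten-Milgrom. Everything specific here is an assumption made for illustration, not taken from the slides: the asset takes one of two values, a fraction `mu` of traders is informed and trades in the direction of the news, and uninformed traders buy or sell with probability 1/2. The function name and the numbers are hypothetical.

```python
def glosten_milgrom_quotes(v_low, v_high, p_high, mu):
    """Return (bid, ask) for one trade in a two-value setting.

    Assumptions (illustrative only): the asset is worth v_high with prior
    probability p_high and v_low otherwise; a fraction mu of traders is
    informed; uninformed traders buy or sell with probability 1/2 each.
    """
    # Probability of observing a buy under each value of the asset.
    p_buy_high = mu + (1.0 - mu) * 0.5
    p_buy_low = (1.0 - mu) * 0.5
    # Bayes' rule: posterior probability of the high value after a buy.
    p_buy = p_high * p_buy_high + (1.0 - p_high) * p_buy_low
    post_high_buy = p_high * p_buy_high / p_buy
    # Same update after a sell (every arrival either buys or sells here).
    p_sell = 1.0 - p_buy
    post_high_sell = p_high * (1.0 - p_buy_high) / p_sell
    # Quotes are the expected value conditional on the incoming order.
    ask = post_high_buy * v_high + (1.0 - post_high_buy) * v_low
    bid = post_high_sell * v_high + (1.0 - post_high_sell) * v_low
    return bid, ask


for mu in (0.1, 0.3, 0.5):
    bid, ask = glosten_milgrom_quotes(49.0, 51.0, 0.5, mu)
    print(f"mu={mu:.1f}  bid={bid:.3f}  ask={ask:.3f}  spread={ask - bid:.3f}")
```

Under these assumptions the spread widens as `mu` rises, which is the market maker's protection against trading with better-informed counterparties.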


Easley and O’Hara(1992)

Three possible events- Good news, Bad news and no news

Three possible actions by traders- Buy, Sell, No Trade

Same updating strategy is used


BEGINNING OF DAY

P(INFORMATION)=
P(GOOD NEWS)=
P(AGENT IS INFORMED)=
P(UNINFORMED WILL BE BUYER)=
P(UNINFORMED WILL TRADE)=

END OF DAY
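
A minimal sketch of the updating over the three possible events (good news, bad news, no news) after each buy, sell or no-trade. The parameter names `alpha`, `delta`, `mu`, `gamma` and `eps` are hypothetical labels for the five probabilities listed above, the numerical values are invented, and the sketch assumes that on a no-news day every arriving trader is uninformed.

```python
def trade_probs(mu, gamma, eps):
    """P(buy), P(sell), P(no trade) for one arrival, by type of day.

    Assumptions (illustrative): on news days a trader is informed with
    probability mu and trades in the direction of the news; an uninformed
    trader trades with probability eps and, if trading, buys with
    probability gamma; on no-news days every trader is uninformed.
    """
    u_buy, u_sell, u_none = eps * gamma, eps * (1 - gamma), 1 - eps
    return {
        "good": (mu + (1 - mu) * u_buy, (1 - mu) * u_sell, (1 - mu) * u_none),
        "bad": ((1 - mu) * u_buy, mu + (1 - mu) * u_sell, (1 - mu) * u_none),
        "none": (u_buy, u_sell, u_none),
    }


def update_beliefs(prior, probs, outcome):
    """One step of Bayes' rule over the three types of day."""
    idx = {"buy": 0, "sell": 1, "no_trade": 2}[outcome]
    joint = {day: prior[day] * probs[day][idx] for day in prior}
    total = sum(joint.values())
    return {day: p / total for day, p in joint.items()}


alpha, delta = 0.4, 0.5  # P(information), P(good news | information)
beliefs = {"good": alpha * delta, "bad": alpha * (1 - delta), "none": 1 - alpha}
probs = trade_probs(mu=0.3, gamma=0.5, eps=0.6)
for outcome in ["buy", "buy", "no_trade"]:
    beliefs = update_beliefs(beliefs, probs, outcome)
    print(outcome, {day: round(p, 3) for day, p in beliefs.items()})
```

In this sketch a buy raises the probability of good news, while a no-trade interval shifts weight toward the no-news event, so the timing of trades is itself informative.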


Easley Kiefer and O’Hara

Empirically estimated these probabilities
Econometrics involves simply matching the proportions of buys, sells and non-trades to those observed.
Does not use (or need) prices, quantities or sequencing of trades


[Figure: series EVA and EVB; vertical axis from 49.9 to 50.3, horizontal axis from 10 to 100.]


[Figure: ASKING QUOTES WITH VARIOUS FRACTIONS OF INFORMED TRADERS. Series: ASK1, ASK_EKO, ASK2, ASK3, ASK4; vertical axis from 50.00 to 50.30, horizontal axis from 2 to 14.]


[Figure: ASK QUOTES AFTER A SEQUENCE OF BUYS WITH INTERVENING NONTRADES. Series: EVA, EVAN, EVA2N, EVA3N, EVA4N, EVA5N; vertical axis from 50.00 to 50.30, horizontal axis from 2 to 14.]


INFORMED TRADERS

What is an informed trader?
Information about true value
Information about fundamentals
Information about quantities
Information about who is informed

Temporary profits from trading but ultimately will be incorporated into prices


HOW FAST IS THIS TRANSITION?

Could be decades in emerging markets
Could be seconds in big liquid markets
Speed depends on market characteristics and on the ability of the market to distinguish between informed and uninformed traders
Transparency is a factor


HOW CAN THE MARKET DETECT INFORMED TRADERS?

When traders are informed, they are more likely to be in a hurry (short durations)
When traders are informed, they prefer to trade large volumes.
When bid ask spreads are wide, it is likely that the proportion of informed traders is high as market makers protect themselves


EMPIRICAL EVIDENCE

Engle, Robert and Jeff Russell (1998), “Autoregressive Conditional Duration: A New Model for Irregularly Spaced Data”, Econometrica
Engle, Robert (2000), “The Econometrics of Ultra-High Frequency Data”, Econometrica
Dufour and Engle (2000), “Time and the Price Impact of a Trade”, Journal of Finance, forthcoming
Engle and Lunde, “Trades and Quotes - A Bivariate Point Process”
Russell and Engle, “Econometric analysis of discrete-valued, irregularly-spaced, financial transactions data”


APPROACH

Model the time to the next price change as a random duration
This is a model of volatility (its inverse)
Model is a point process with dependence and deterministic diurnal effects
NEW ECONOMETRICS REQUIRED
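
A minimal sketch of how price durations might be extracted from a transaction record: keep only the trades at which the price has moved by at least a chosen threshold since the last retained trade, and record the time elapsed. The function name, the threshold and the toy data are assumptions made for illustration.

```python
def price_durations(times, prices, threshold):
    """Thin a transaction record to price events and return their durations.

    A price event occurs when the price has moved by at least `threshold`
    since the previous price event (the first trade starts the clock).
    """
    durations = []
    ref_time, ref_price = times[0], prices[0]
    for t, p in zip(times[1:], prices[1:]):
        if abs(p - ref_price) >= threshold:
            durations.append(t - ref_time)
            ref_time, ref_price = t, p
    return durations


# Toy record: seconds since the open and transaction prices in eighths.
times = [0, 5, 9, 20, 26, 40]
prices = [50.000, 50.000, 50.125, 50.125, 50.250, 50.000]
print(price_durations(times, prices, threshold=0.125))  # [9, 17, 14]
```

Short price durations mean the price is moving quickly, which is why a model of the duration is a model of the inverse of volatility.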


PRICE PATH

[Figure: price path over time. Labels: Time, Price, Duration.]


Econometric Tools

Data are irregularly spaced in time
The timing of trades is informative
Will use Engle and Russell (1998) Autoregressive Conditional Duration (ACD)


THE CONDITIONAL INTENSITY PROCESS

The conditional intensity is the probability that the next event occurs at time t+Δt given past arrival times and the number of events:

$$\lambda(t, N(t); t_1, \dots, t_{N(t)}) = \lim_{\Delta t \to 0} \frac{P\big(N(t+\Delta t) > N(t) \mid N(t), t_1, \dots, t_{N(t)}\big)}{\Delta t}$$


THE ACD MODEL

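The statistical specification is:

$$\text{(i)}\quad E(x_i \mid t_{i-1}, \dots, t_1) = \psi_i(t_{i-1}, \dots, t_1; \theta) \equiv \psi_i$$

$$\text{(ii)}\quad x_i = \psi_i \varepsilon_i$$

where $x_i = t_i - t_{i-1}$ is the duration, $\psi_i$ is the conditional duration and $\varepsilon_i$ is an i.i.d. random variable with non-negative support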


TYPES OF ACD MODELS

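Specifications of the conditional duration:

$$\psi_i = \omega + \alpha x_{i-1} + \beta \psi_{i-1}$$

$$\psi_i = \omega + \sum_{j} \alpha_j x_{i-j} + \sum_{j} \beta_j \psi_{i-j}$$

or, more generally, $\psi_i$ as a function of $x$, $y$ and $z$

Specifications of the disturbances:
Exponential
Weibull
Generalized Gamma
Non-parametric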
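
To make the ACD recursion concrete, here is a minimal simulation sketch of an ACD(1,1) with unit-mean exponential disturbances, in which the conditional duration follows psi_i = omega + alpha * x_{i-1} + beta * psi_{i-1}. The function name, the parameter values and the choice to start at the unconditional mean are assumptions made for illustration.

```python
import random


def simulate_acd11(n, omega, alpha, beta, seed=0):
    """Simulate n durations from an exponential ACD(1,1).

    Assumes alpha + beta < 1 so that the unconditional mean
    omega / (1 - alpha - beta) exists; the recursion starts there.
    """
    rng = random.Random(seed)
    psi = omega / (1.0 - alpha - beta)
    durations, cond_durations = [], []
    for _ in range(n):
        eps = rng.expovariate(1.0)  # i.i.d., non-negative, mean one
        x = psi * eps               # x_i = psi_i * eps_i
        durations.append(x)
        cond_durations.append(psi)
        psi = omega + alpha * x + beta * psi  # next conditional duration
    return durations, cond_durations


durations, cond_durations = simulate_acd11(20000, omega=0.1, alpha=0.1, beta=0.8)
print("sample mean duration:", sum(durations) / len(durations))
print("unconditional mean  :", 0.1 / (1.0 - 0.1 - 0.8))
```

Swapping `rng.expovariate` for another unit-mean, non-negative draw would give the other disturbance families; the recursion itself is unchanged.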


MAXIMUM LIKELIHOOD ESTIMATION

For the exponential disturbance

$$\log L = -\sum_i \left[ \log \psi_i + \frac{x_i}{\psi_i} \right]$$

which is so closely related to GARCH that theorems and software designed for GARCH can often be used for ACD. It is a QML estimator.
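
A minimal sketch of the exponential (quasi-)log-likelihood, minus the sum of log(psi_i) + x_i / psi_i, for an ACD(1,1). Initialising the recursion at the sample mean is an assumption of this sketch, and the function name is hypothetical; in practice the function would be handed to a numerical optimiser.

```python
import math


def acd11_loglik(durations, omega, alpha, beta):
    """Exponential log-likelihood of an ACD(1,1): -sum(log(psi) + x / psi)."""
    psi = sum(durations) / len(durations)  # start-up value (an assumption)
    loglik = 0.0
    for x in durations:
        loglik -= math.log(psi) + x / psi
        psi = omega + alpha * x + beta * psi  # update for the next duration
    return loglik


sample = [2.0, 2.0, 2.0]
print(acd11_loglik(sample, omega=2.0, alpha=0.0, beta=0.0))
print(acd11_loglik(sample, omega=5.0, alpha=0.0, beta=0.0))
```

For the constant toy sample the first call gives the larger value, since a conditional duration of 2 matches the data while 5 does not.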


MODELING PRICE DURATIONS

WITH IBM PRICE DURATION DATA
ESTIMATE ACD(2,2)
ADD IN PREDETERMINED VARIABLES REPRESENTING STATE OF THE MARKET
Key predictors are transactions/time, volume/transaction, spread


Parameter        Model 1               Model 2

ω                .2107  (6.14)         .3027  (18.22)
α1               .0457  (2.60)         .0507  (2.24)
α2               .1731  (5.94)         .1578  (5.19)
β1               .0769  (1.00)         .1646  (1.61)
β2               .5609  (8.07)         .4600  (5.16)
#Trans/Sec      -.0440  (-12.65)      -.0359  (-13.40)
Spread                                -.0782  (-15.68)
Volume/Trans                          -.0041  (-4.58)


STATISTICAL MODELS

There are two kinds of random variables:

Arrival Times of events such as trades
Characteristics of events called Marks which further describe the events

Let x denote the time between trades, called durations, and y be a vector of marks
Data: $\{(x_i, y_i),\ i = 1, \dots, N\}$


A MARKED POINT PROCESS

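Joint density conditional on the past:

$$(x_i, y_i) \mid F_{i-1} \sim f(x_i, y_i \mid \tilde{x}_{i-1}, \tilde{y}_{i-1}; \theta_i)$$

can always be written:

$$f(x_i, y_i \mid \tilde{x}_{i-1}, \tilde{y}_{i-1}; \theta_i) = g(x_i \mid \tilde{x}_{i-1}, \tilde{y}_{i-1}; \theta_1)\, q(y_i \mid x_i, \tilde{x}_{i-1}, \tilde{y}_{i-1}; \theta_2)$$

where a tilde denotes the past history of a variable.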


MODELING VOLATILITY WITH TRANSACTION DATA

Model the change in midquote from one transaction to the next conditional on the duration.
Build GARCH model of volatility per unit of calendar time conditional on the duration.
Find that short durations and wide spreads predict higher volatilities in the future
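
One way to read "volatility per unit of calendar time" is to scale each midquote change by the square root of the duration over which it occurred, and then run a GARCH-type recursion on the scaled changes with the reciprocal duration as an extra regressor. The sketch below is only an illustration under that reading: the function names, the coefficient names `c`, `a`, `b`, `g` and the start-up value are assumptions, not the specification estimated in the talk.

```python
import math


def scaled_returns(midquotes, durations):
    """Midquote changes divided by the square root of their durations."""
    changes = [b - a for a, b in zip(midquotes[:-1], midquotes[1:])]
    if len(changes) != len(durations):
        raise ValueError("need exactly one duration per midquote change")
    return [r / math.sqrt(x) for r, x in zip(changes, durations)]


def variance_path(scaled, durations, c, a, b, g):
    """GARCH(1,1)-style variance per unit time with a 1/duration term.

    sigma2_i = c + a * e_{i-1}^2 + b * sigma2_{i-1} + g / x_i,
    started at the sample second moment (an assumption of this sketch).
    """
    sigma2 = [sum(e * e for e in scaled) / len(scaled)]
    for i in range(1, len(scaled)):
        sigma2.append(c + a * scaled[i - 1] ** 2 + b * sigma2[-1] + g / durations[i])
    return sigma2


mid = [50.0, 50.5, 50.0, 50.25]
dur = [4.0, 1.0, 16.0]
e = scaled_returns(mid, dur)
print(e)
print(variance_path(e, dur, c=0.01, a=0.2, b=0.6, g=0.5))
```

With a positive `g`, a short duration raises the next variance, which is the direction of the effect reported in the third point above.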


                 GARCH(1,1)                      GARCH&ECON
VARIABLE         Coef      Std.Err   Z-Stat      Coef      Std.Err   Z-Stat

MEAN
DURS            -0.008     0.004     -1.892     -0.007     0.002     -4.027
AR(1)            0.279     0.023     12.29       0.186     0.022      8.507
MA(1)           -0.656     0.019    -33.86      -0.570     0.016    -35.70

VARIANCE
C                0.988     0.092     10.74      -0.111     0.047     -2.358
ARCH(1)          0.245     0.020     12.33       0.250     0.013     18.73
GARCH(1)         0.622     0.025     24.70       0.158     0.014     11.71
1/DUR                                            0.587     0.028     21.27
DUR/EXPDUR                                      -0.040     0.005     -7.992
LONGVOL(-1)                                      0.096     0.011      8.801
1/EXPDUR
SPREAD(-1)>>                                     0.736     0.065     11.29
SIZE>10000                                       0.193     0.119      1.624

LOGLIK          -112246.3                       -107406.4
LB(15)           93.092    0.000                 40.810    0.000
LB2(15)          30.422    0.004                169.12     0.000


APPROACH

Extend Hasbrouck’s Vector Autoregressive measurement of price impact of trades
Measure effect of time between trades on price impact
Use ACD to model stochastic process of trade arrivals


[Figure: Cumulative percentage quote revision after an unexpected buy. Vertical axis from 0 to 0.08; horizontal axis: Transaction Time (t), 1 to 21. Series: 1/17/91 and 12/24/90.]


[Figure: Cumulative percentage quote revision after an unexpected buy. Vertical axis from 0 to 0.08; horizontal axis: Calendar time (min:sec), 0:00 to 20:50. Series: 1/17/91 and 12/24/90.]


SUMMARY

The price impacts, the spreads, the speed of quote revisions, and the volatility all respond to information variables
TRANSITION IS FASTER WHEN THERE IS INFORMATION ARRIVING
Econometric measures of information:
high shares per trade
short duration between trades
sustained wide spreads


Jeffrey R. Russell
University of Chicago
Graduate School of Business

Robert F. Engle
University of California, San Diego

http://gsbwww.uchicago.edu/fac/jeffrey.russell/research/


[Figure: IBM. Vertical axis: Transaction Price (104.8 to 105.4); horizontal axis: Time (Minutes), 0 to 14.]


Goal: Develop an econometric model for discrete-valued, irregularly-spaced time series data.

Method: Propose a class of models for the joint distribution of the arrival times of the data and the associated price changes.

Questions: Are returns predictable in the short or long run? How long is the long run? What factors influence this adjustment rate?


Hausman,Lo and MacKinlay

Estimate Ordered Probit Model, JFE (1992)
States are different price processes
Independent variables:
Time between trades
Bid Ask Spread
Volume
SP500 futures returns over 5 minutes
Buy-Sell indicator
Lagged dependent variable


A Little Notation

Let t_i be the arrival time of the ith transaction, where t_0 < t_1 < t_2 < …

A sequence of strictly increasing random variables is called a simple point process.

N(t) denotes the associated counting process.

Let p_i denote the price associated with the ith transaction and let y_i = p_i - p_{i-1} denote the price change associated with the ith transaction.

Since the price changes are discrete we define y_i to take k unique values. That is, y_i is a multinomial random variable.

The bivariate process (y_i, t_i) is called a marked point process.


We take the following conditional joint distribution of the arrival time t_i and the mark y_i as the general object of interest:

$$f(y_i, t_i \mid y^{(i-1)}, t^{(i-1)})$$

where $y^{(i-1)} = (y_{i-1}, y_{i-2}, \dots)$ and $t^{(i-1)} = (t_{i-1}, t_{i-2}, \dots)$.

In the spirit of Engle (2000) we decompose the joint distribution into the product of the conditional and the marginal distribution:

$$f(y_i, t_i \mid y^{(i-1)}, t^{(i-1)}) = g(y_i \mid y^{(i-1)}, t^{(i)})\, q(t_i \mid y^{(i-1)}, t^{(i-1)})$$

q: ACD, Engle and Russell (1998)
g: ?


SPECIFYING THE PROBABILITY STRUCTURE

Let $x_i$ be a kx1 vector which has a 1 in only one place indicating the current state
Let $\pi_i$ be the conditional probability of all the states in period i.
A standard Markov chain assumes

$$\pi_i = P x_{i-1}$$

Instead we want modifiers of P

$$\pi_i = P(x_{i-1}, z_{i-1}, t_i, t_{i-1})\, x_{i-1}$$


RESTRICTIONS

For P to be a transition matrix
It must have non negative elements
All columns must sum to one

To impose these constraints, parameterize P as an inverse logistic function of its determinants


THE PARAMETERIZATION

For each time period t, express the probability of state i relative to a base state k as:

$$\log(\pi_{i,t} / \pi_{k,t}) = A_i x_{t-1} + b_i, \quad \text{for } i = 1, \dots, k-1$$

Which implies that:

$$P_{ij} = \frac{\exp(A_{ij} + b_i)}{1 + \sum_{m=1}^{k-1} \exp(A_{mj} + b_m)}$$
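
A minimal sketch of how such a logistic parameterization yields a valid transition matrix. It assumes that the log-odds of each non-base state relative to the base state are A_i x + b_i, that x is the (k-1)-dimensional indicator of the previous state, and that the base state is coded as the zero vector; the function name and the numbers are hypothetical.

```python
import math


def transition_matrix(A, b):
    """Build a k x k matrix whose column j holds P(next state | previous j).

    A is (k-1) x (k-1) and b has length k-1; both are unrestricted.
    The last row and column correspond to the base state.
    """
    k = len(b) + 1
    P = [[0.0] * k for _ in range(k)]
    for j in range(k):
        # Log-odds against the base state; A drops out when the previous
        # state was the base state (its indicator vector is all zeros).
        h = [(A[i][j] if j < k - 1 else 0.0) + b[i] for i in range(k - 1)]
        denom = 1.0 + sum(math.exp(v) for v in h)
        for i in range(k - 1):
            P[i][j] = math.exp(h[i]) / denom
        P[k - 1][j] = 1.0 / denom  # base state takes the remaining mass
    return P


A = [[0.5, -1.0], [0.2, 0.3]]
b = [-0.4, 0.1]
P = transition_matrix(A, b)
for row in P:
    print([round(p, 3) for p in row])
print("column sums:", [round(sum(P[i][j] for i in range(3)), 6) for j in range(3)])
```

Whatever values A and b take, every entry is positive and every column sums to one, so the constraints never need to be imposed during estimation.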


Rewriting the k-1 log functions as h(π), this can be written in simple form as:

$$(2)\quad h(\pi) = Ax + b$$

where A is an unrestricted (k-1)x(k-1) matrix, b is an unrestricted (k-1)x1 vector and x is the (k-1)x1 state vector.


MORE GENERALLY

Let matrices have time subscripts and allow other lagged variables:

$$h(\pi_t) = A_t x_{t-1} + B_t \pi_{t-1} + C_t h(\pi_{t-1}) + D_t z_t$$

The ACM likelihood is simply a multinomial for each observation conditional on the past

$$L_{ACM}(\pi; x) = \sum_t x_t' \log(\pi_t)$$
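
A minimal sketch of a multinomial log-likelihood of this kind: each observation contributes the log of the probability assigned to the state that actually occurred. For simplicity the sketch uses full-length indicator and probability vectors that include the base state, which is an assumption of the example; the function name and data are hypothetical.

```python
import math


def acm_loglik(states, probs):
    """Sum over observations of x_t' log(pi_t).

    `states` holds one-hot indicator vectors, `probs` the matching
    conditional probability vectors (each summing to one).
    """
    total = 0.0
    for x, pi in zip(states, probs):
        # Only the observed state contributes; skipping zeros also avoids
        # evaluating log(0) for states that were assigned zero probability.
        total += sum(xi * math.log(p) for xi, p in zip(x, pi) if xi)
    return total


states = [[1, 0, 0], [0, 0, 1]]
probs = [[0.5, 0.3, 0.2], [0.1, 0.1, 0.8]]
print(acm_loglik(states, probs))
```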


THE FULL LIKELIHOOD

The sum of the ACD and ACM log likelihood is

$$L(x, \tau; \pi, \psi) = \sum_t \left[ x_t' \log(\pi_t) - \log(\psi_t) - \frac{\tau_t}{\psi_t} \right]$$


Even more generally, we define the Autoregressive Conditional Multinomial (ACM) model as:

$$h(\pi_i) = \sum_{j=1}^{p} A_j (x_{i-j} - \pi_{i-j}) + \sum_{j=1}^{q} B_j x_{i-j} + \sum_{j=1}^{r} C_j h(\pi_{i-j}) + G Z_i$$

Where $h: (K-1) \to (K-1)$ is the inverse logistic function.

Z_i might contain t_i, a constant term, a deterministic function of time, or perhaps other weakly exogenous variables.

We call this an ACM(p,q,r) model.


The data:

58,944 transactions of IBM stock over the 3 months of Nov.1990 - Jan. 1991 on the consolidated market. (TORQ)

98.6% of the price changes took one of 5 different values.

[Figure: histogram of price changes. Vertical axis: Percent (0 to 70); horizontal axis: Price Change.]


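We therefore consider a 5 state model defined as

$$x_i = \begin{cases} [1,0,0,0] & \text{if } \Delta p_i < -.125 \\ [0,1,0,0] & \text{if } -.125 \le \Delta p_i < 0 \\ [0,0,0,0] & \text{if } \Delta p_i = 0 \\ [0,0,1,0] & \text{if } 0 < \Delta p_i \le .125 \\ [0,0,0,1] & \text{if } \Delta p_i > .125 \end{cases}$$

It is interesting to consider the sample cross correlogram of the state vector x_i.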
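
A minimal sketch of the mapping from a price change to a state vector in a 5 state model of this kind. It assumes a tick of 0.125, a zero vector for the no-change base state, and that changes of exactly one tick fall in the inner states; the function name is hypothetical.

```python
def state_vector(dp, tick=0.125):
    """Map a price change to a 4-element indicator (base state = all zeros).

    Order of the elements: down 2 or more ticks, down 1 tick,
    up 1 tick, up 2 or more ticks.
    """
    if dp < -tick:
        return [1, 0, 0, 0]
    if dp < 0:
        return [0, 1, 0, 0]
    if dp == 0:
        return [0, 0, 0, 0]
    if dp <= tick:
        return [0, 0, 1, 0]
    return [0, 0, 0, 1]


for dp in (-0.25, -0.125, 0.0, 0.125, 0.25):
    print(dp, state_vector(dp))
```

Leaving the base state as the zero vector keeps the state vector at k-1 elements, which matches the dimensions of an unrestricted (k-1)x(k-1) coefficient matrix.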


[Figure: Sample cross correlations of x at lags 1 to 15. Rows and columns: up 2, up 1, down 1, down 2.]


Initially, we consider simple parameterizations in which the information set for the joint likelihood consists of the filtration of past arrival times and past price changes.

$$f(y_i, t_i \mid y^{(i-1)}, t^{(i-1)}) = \underbrace{g(y_i \mid y^{(i-1)}, t^{(i)})}_{\text{ACM}}\; \underbrace{q(t_i \mid y^{(i-1)}, t^{(i-1)})}_{\text{ACD}}$$

Parameters are estimated using the joint distribution of arrival times and price changes.


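ACM(p,q,r) specification:

$$h(\pi_i) = \sum_{j=1}^{p} A_j V_{i-j}^{-1/2} (x_{i-j} - \pi_{i-j}) + \sum_{j=1}^{q} B_j x_{i-j} + \sum_{j=1}^{r} C_j h(\pi_{i-j}) + \dots$$

[The remaining terms carry the coefficients g_1, g_2, g_3, g_4 and logarithmic arguments; their exact form is not legible.]

ACD(s,t): Engle and Russell (1998) specifies the conditional probability of the ith event arrival at time t_i+τ by

[Equation: a recursion for ln(ψ_i) with sums over lags j = 1,...,s, j = 1,...,t, j = 1,...,v and j = 1,...,w; the individual terms are not legible.]

where $\psi_i = E(\tau_i \mid t_{i-1}, t_{i-2}, \dots, x_{i-1}, x_{i-2}, \dots)$, $\tau_i = t_i - t_{i-1}$, and the g_j are symmetric.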


[Figure: Conditional Variance of Price Changes as a Function of Expected Duration. Vertical axis: Volatility (0 to 0.008); horizontal axis: Expected Duration.]


Simulations

We perform simulations with spreads, volume, and transaction rates all set to their median values and examine the long run price impact of two consecutive trades that push the price down 1 tick each.

We then perform simulations with spreads, volume and transaction rates set to their 95th percentile values, one at a time, for the initial two trades and then reset them to their median values for the remainder of the simulation.


[Figure: Price impact of 2 consecutive trades each pushing the price down by 1 tick. Vertical axis: Dollars (-0.3 to 0); horizontal axis: Transaction (1 to 49). Series: Median, High Transaction Rate, Large Volume, Wide Spread.]


[Figure: vertical axis: Dollars (-0.06 to 0); horizontal axis: Transaction (1 to 49). Series: High Transaction Rate, Large Volume, Wide Spread.]


Conclusions

1. Both the realized and the expected duration impact the distribution of the price changes for the data studied.

2. Transaction rates tend to be lower when prices are falling.

3. Transaction rates tend to be higher when volatility is higher.

4. Simulations suggest that the long run price impact of a trade can be very sensitive to the volume but is less sensitive to the spread and the transaction rates.
