
Electronic copy available at: http://ssrn.com/abstract=1697879

Unveiling the Identity of PIN from the Flash Crash:

Illiquidity or Information Asymmetry?

Qin Lei

First Draft: October 25, 2010

Current Version: November 25, 2010

Qin Lei, Finance Department, Cox School of Business at Southern Methodist University, 6212 Bishop Blvd, Dallas, TX 75275-0333. Phone: (214) 768-3183. Email: [email protected].



Abstract

This paper extends the original PIN framework to explicitly allow for the coexistence of liquidity shocks and fundamental news, both of which can lead to order imbalances. The pseudo market makers submit contrarian orders in the event of liquidity shocks and thus move the stock prices back to the fundamental level. Consequently, the conventional PIN measure consists of one component driven by the informed traders who receive the fundamental news and another component due to pseudo market makers who arrive upon liquidity shocks. During the flash crash on May 6, 2010, there is a nearly ten-fold market-wide increase in the illiquidity component of PIN but there is a lack of uniform increase in the information asymmetry component, based on the estimation of the extended PIN model for common stocks listed on NYSE and AMEX. In contrast, the original PIN model disallows liquidity shocks and thus overestimates the extent of asymmetric information. In addition to introducing a conceptually purer measure of asymmetric information than was previously available, this paper also contributes to the literature through methodological improvements to the PIN estimation: it provides a recipe to eradicate the numerical overflow and underflow problems and to impute the daily PIN series from repeated estimations of quarterly PINs.

JEL Classifications: G10, G14

Keywords: Probability of Informed Trading (PIN), Information Asymmetry, Flash Crash, Floating Point Overflow, Daily PIN Series



1 Introduction

The U.S. financial markets experienced a tumultuous day on May 6, 2010, when the Dow Jones Industrial Average (DJIA) stock index witnessed its biggest intraday loss of 998.5 points since its inception more than 100 years ago. Miraculously, the sharp decline across a much wider spectrum of stocks than just the thirty component stocks in DJIA reversed itself within a thirty-minute interval in the same day. Due to the dramatic plummet and swift reversal of stock prices, this event became known as a "flash crash" in the popular press. The flash crash continues to reverberate among market participants and policy makers, and academic researchers also have keen interests in knowing more about the mechanism underlying this event. I study the flash crash in this paper because this event amounts to a critical challenge to the PIN literature, but at the same time presents a unique opportunity to unveil the identity of PIN. This paper addresses the challenge in an extended PIN model that is better supported by the data and exploits the flash crash as a natural experiment to reveal that PIN does not always purely represent asymmetric information.

In a series of important papers, Easley, O'Hara and their coauthors design and test the probability of information-based trading (with the shorthand PIN) to capture the most likely fraction of informed trades among all orders submitted by informed and uninformed traders under a statistical structure that describes the news and order arrival processes. The PIN measure has subsequently been studied in a number of contexts to quantify the effect of asymmetric information.¹ Yet the flash crash reveals a key weakness of the PIN measure in that it overlooks the possibility of order imbalances induced by firm-specific or market-wide liquidity shocks and thus would overestimate the fraction of informed trades during such times. The notion of PIN as a measure of asymmetric information being contaminated by liquidity effects extends beyond the flash crash event. In fact, the interpretation of PIN has been subject to much controversy despite its popularity. For instance, Easley, Hvidkjaer, and O'Hara (2002) find that a 10% difference in PIN of two stocks results in a 2.5% difference in expected returns and interpret this finding as the information risk being priced. Duarte and Young (2009) counter that the PIN factor is priced only because it is a proxy for illiquidity in light of the evidence regarding the disappearance of the pricing power for the private information factor after controlling for illiquidity. Even before the identity crisis of PIN in the asset pricing context, there has been some confusion in the literature over the varying interpretations of PIN.² In light of the potential dual roles of the PIN measure, it is important

1 Here is an incomplete list of studies that apply the PIN measure outside the market microstructure field. Easley and O'Hara (2004) and Duarte, Han, Harford and Young (2008) study the effect of PIN on the cost of capital. Vega (2006) and Jayaraman (2008) examine the role of PIN in the context of corporate earnings. Bharath, Pasquariello and Wu (2009) study the relationship between PIN and the capital structure. Easley, Hvidkjaer and O'Hara (2002) and Duarte and Young (2009) examine the pricing power of an aggregated PIN factor in the asset pricing context.

2 For instance, Easley, Kiefer, O'Hara and Paperman (1996) understandably advocate PIN as a measure of private information, yet Easley, Engle, O'Hara and Wu (2008) assert PIN as a simple measure of illiquidity


to ascertain the true identity of the PIN measure. Is it a pure measure of information asymmetry as originally designed? Or, is it confounded by the extent of illiquidity? If the latter is true, how can one possibly carve the illiquidity component out of the PIN measure and obtain a purer measure of asymmetric information? Answering these questions requires disentangling the role of asymmetric information from the role of illiquidity, and it can be done either qualitatively or quantitatively. This paper takes the initiative to do both.

Given the elusive nature of illiquidity and information asymmetry, it is difficult to tell them apart unless there is some exogenous shock that naturally separates them. The flash crash on May 6, 2010, is such a natural experiment because it enables a qualitative distinction between the two competing interpretations for the PIN measure. The U.S. Commodity Futures Trading Commission (CFTC) and the Securities and Exchange Commission (SEC) attribute the flash crash to a short-lived liquidity crisis on both the index futures market and the equity market (e.g., CFTC-SEC, 2010). This quick episode of a market-wide liquidity crunch would necessarily imply a hike in an illiquidity measure for many stocks on the day of the flash crash. Consequently, one can largely rule out a uniform increase in the estimated PIN if it purely measures the extent of asymmetric information. I repeatedly estimate the quarterly PINs with and without the day of the flash crash and then impute the daily PIN series based on the knowledge that the informed orders have to add up over time. It turns out that there is a marked increase in the estimated PINs for almost all stocks on May 6, 2010, and this finding goes against PIN as a pure measure of asymmetric information. It appears unlikely that there is a simultaneous hike in the amount of private information across all stocks. A systematic liquidity shock is more plausible than the common arrival of private information across all stocks at once, especially because of the sharp and swift price reversal on the day of the flash crash. Stock prices are supposed to gradually incorporate information revealed from the informed orders and thus the information-based trading activities would imply a price continuation rather than a sharp reversal. In other words, the qualitative inference around the flash crash event suggests that the empirical data lean in favor of the illiquidity interpretation for the PIN measure at least during the flash crash.

Having achieved the qualitative separation between asymmetric information and illiquidity, I turn to quantifying the distinction so as to obtain a purer measure of information asymmetry. For this purpose, I extend the original PIN framework to explicitly allow for the coexistence of liquidity shocks and fundamental news, both of which can lead to order imbalances. The probability that a news event is a liquidity shock is observed from the actual frequency with which sizeable intraday price reversals occur. The idea is that fundamental news should be steadily incorporated into stock prices without major reversal within a short time span, while a sharp and quick reversal in stock prices is the hallmark of liquidity shocks that are

(pp. 190). Amihud (2002) treats the PIN measure as both a finer and better measure of illiquidity (pp. 32) and a measure of microstructure risk ... that reflects the adverse selection cost resulting from asymmetric information (pp. 34).


unrelated to the fundamental news. The pseudo market makers submit contrarian orders in the event of liquidity shocks and thus move the stock prices back to the fundamental level. Consequently, the conventional PIN measure consists of one component driven by the informed traders who receive the fundamental news and another component due to pseudo market makers who arrive upon liquidity shocks. During the flash crash on May 6, 2010, there is a nearly ten-fold market-wide increase in the illiquidity component of PIN that would have been mistakenly attributed to information-based trading under the original PIN model that disallows liquidity shocks. The information asymmetry component also rises relative to the preceding quarter but not in a uniform manner. The daily increment in asymmetric information is not statistically significant at the 1% level for four of ten volume deciles, and stocks in the highest two volume deciles experience the largest hike since they have a very low level of asymmetric information in the preceding quarter. These findings could suggest an environment with higher information asymmetry among those heavily traded stocks on the day of the flash crash. The staff report in CFTC-SEC (2010) identifies the flash crash as initiated in the S&P 500 index futures market. Since the index futures market leads the equity market on an intraday basis (e.g., Chan, 2002), it is plausible that some of the S&P 500 index component stocks were indeed traded as if accompanied by material private information on the day of the flash crash. This circumstance would naturally translate into a more prominent effect among the most heavily traded stocks.

Using a novel idea to trace the identity of PIN through the flash crash as a natural experiment, this paper helps to address some challenges that the original PIN model faces. Aktas, de Bodt, Declerck, and Van Oppens (2007) document the apparent difficulty in reconciling the information leakage with the lower PIN estimates prior to the announcements of mergers and acquisitions. One would have expected a higher estimated PIN during periods of information leakage if PIN purely captures the extent of information asymmetry. Though Aktas et al. label the inconsistency as a PIN anomaly, it is possible that the stock liquidity actually improves when traders exploit the leaked information so long as the lower PIN estimates reflect lower illiquidity. Therefore, it warrants further investigation to see if the extended PIN model resolves the anomaly.

This paper is closely related to Duarte and Young (2009) in that both papers extend the original PIN framework to address the concern that the original PIN measure captures illiquidity as well as information asymmetry. One critical distinction is that my extension explicitly allows pseudo market makers to submit one-sided orders upon the occurrence of liquidity shocks and thus addresses the problematic premise in the original PIN framework that informed traders are the exclusive source of order imbalances. Duarte and Young (2009) acknowledge the possibility that order imbalances could also result from liquidity shocks rather than informed trades, leading to potentially misleading inferences from the problematic premise. However, they also carry on the tradition in Easley, Kiefer, O'Hara and Paperman


(1996) that order imbalances are interpreted as an exclusive indication for informed trades, and leave the issue unaddressed as an important caveat. Moreover, Duarte and Young (2009) have a fairly different motivation behind their extension compared to this paper. My model extension is inspired by the flash crash and I introduce the pseudo market makers to break the exclusivity of the informed traders in creating order imbalances. In contrast, Duarte and Young (2009) are most concerned about the mismatch between theory and reality because the observed correlation between buy orders and sell orders is positive even though the original PIN framework implies a negative correlation. They introduce symmetric positive order flow shocks on both the buy side and the sell side to accomplish the goal of eliminating the mismatch.

Beyond the theoretical extension of the PIN framework to explicitly allow for liquidity effects to coexist with, and thus be separated from, information asymmetry, this paper also contributes to the literature by providing methodological improvements to the PIN estimation. Specifically, I design one simple procedure to dynamically factorize the daily log-likelihood function for the maximum likelihood estimation of PIN and effectively eliminate the numerical overflow and underflow problems that have long plagued academic researchers and practitioners alike. The buy and sell orders have steadily increased in recent years, especially in light of the prevalence of algorithmic trading that often splits large orders into smaller pieces. The explosive growth of the number of trades often contributes to the failure of PIN estimations. After applying the dynamic factorization scheme, my estimation has a 100% convergence rate while avoiding corner solutions and local maxima. Without applying the scheme, however, the estimation failure rate is a staggering 54.88%. I also make available a technique to impute the daily PIN series through repeated estimations of quarterly PINs. Researchers are expected to benefit from these methodological improvements in different settings, especially among studies of short-lived corporate events where the change of asymmetric information needs to be measured on a daily basis.

The balance of the paper proceeds as follows. Section 2 describes the original PIN framework and proposes a few methodological improvements to the PIN estimation. The PIN framework is then extended in Section 3 to explicitly allow for liquidity shocks. Section 4 contains the empirical analysis and the concluding remarks are in Section 5.

2 PIN Estimation

2.1 Original PIN Framework

The expanding literature concerning the probability of informed trading is built on the theoretical foundation in Easley and O'Hara (1992). Easley, Kiefer and O'Hara (1996) and especially Easley, Kiefer, O'Hara and Paperman (1996) popularize the PIN measure by providing an empirical recipe for the maximum likelihood estimation of PIN. The parsimonious structure in Easley, Kiefer, O'Hara and Paperman (1996) becomes the natural starting point for many subsequent papers that extend the trading process, the parameterization underlying the PIN measure or both. In essence the original PIN framework in Easley, Kiefer, O'Hara and Paperman (1996) imposes a statistical structure on the observed order flows for a given stock and relies on the parameter values that maximize the sample likelihood to compute the average fraction of orders due to information-based trading.

Only two types of investors trade stocks in the setting of Easley, Kiefer, O'Hara and Paperman (1996), either informed or uninformed. The orders from these traders are modeled as Poisson processes with arrival rates $\mu$ and $\varepsilon$ for the informed traders and the uninformed traders, respectively. While the uninformed traders submit both buy orders and sell orders with equal probabilities on average, the informed traders commit to one-sided orders that are consistent with the private news about the stock fundamentals. There is an $\alpha$ probability that the fundamental news would arrive on any trading interval, and the arrived news has a $\delta$ probability of being negative. Therefore, there is an $\alpha\delta$ probability for a trading interval to be associated with bad news, during which the informed traders submit only sell orders. Likewise, there is an $\alpha(1-\delta)$ probability that the informed traders submit only buy orders on a trading interval with good news. When there is a lack of news with probability $1-\alpha$, only the uninformed traders participate in trading the stock.

Easley, Kiefer, O'Hara and Paperman (1996) recommend aggregating the order flows at the daily level for all stocks so that the modeled trading interval lasts exactly one day. The daily likelihood of observing $B$ buy orders and $S$ sell orders on one specific stock is

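$$
\begin{aligned}
L[(B,S)\mid\Theta] ={}& (1-\alpha)\exp(-2\varepsilon)\frac{\varepsilon^{B}}{B!}\frac{\varepsilon^{S}}{S!} \\
&+ \alpha\delta\exp(-\mu-2\varepsilon)\frac{\varepsilon^{B}}{B!}\frac{(\mu+\varepsilon)^{S}}{S!} \\
&+ \alpha(1-\delta)\exp(-\mu-2\varepsilon)\frac{(\mu+\varepsilon)^{B}}{B!}\frac{\varepsilon^{S}}{S!},
\end{aligned}
\tag{1}
$$

where $\Theta \equiv (\alpha, \delta, \mu, \varepsilon)$ denotes the vector of parameters to be estimated.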

The standard practice in the literature is that under the assumption of constant parameters over each calendar quarter, one can estimate the set of parameters that maximize the sample likelihood of observing the daily order flows. The average orders from the informed traders are $\alpha\mu$ while the uninformed traders contribute $2\varepsilon$. Therefore, the most likely fraction of informed orders, or the probability of informed trading, can be defined as

$$PIN = \frac{\alpha\mu}{\alpha\mu + 2\varepsilon}. \tag{2}$$
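As a quick numerical illustration of equation (2), the sketch below evaluates the PIN for one set of parameter values. The function name `pin` and the numbers are hypothetical, chosen only for illustration; they are not estimates from this paper.

```python
def pin(alpha: float, mu: float, eps: float) -> float:
    """Probability of informed trading: alpha*mu / (alpha*mu + 2*eps)."""
    informed = alpha * mu      # expected informed orders per day
    uninformed = 2.0 * eps     # expected uninformed buys plus sells per day
    return informed / (informed + uninformed)


# Hypothetical inputs: news on 40% of days, 30 informed orders on a news day,
# and 20 uninformed orders on each side of the market every day.
example = pin(alpha=0.4, mu=30.0, eps=20.0)   # 12 / (12 + 40)
```

With these inputs the informed traders contribute 12 of the 52 expected daily orders, so the PIN is about 23%.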

It is fairly intuitive that under this framework the informed traders are the sole source of order imbalance by construction and the observation of high order imbalance is necessarily associated with a high level of estimated PIN. As long as the order imbalance can be exclusively attributed to the informed traders, it is straightforward to demonstrate that the PIN is essentially equivalent to the absolute percentage order imbalance. Three teams of researchers uncover this relationship independently around the same time. Kaul, Lei and Stoffman (2005) derive it through the change of variables using a system of equations after invoking perfect foresight. Aktas, de Bodt, Declerck, and Van Oppens (2007) find that the PIN is the ratio of expected absolute order imbalances to expected total orders. Easley, Engle, O'Hara and Wu (2008) document the same relationship with a first-order approximation while noting that the expected absolute difference of Poisson variables is quite complicated.

2.2 Common Factorization of Log-likelihood

As the number of orders gets very large, the likelihood function becomes harder, and even impossible in certain cases, to compute due to the factorial, the exponential and the power functions. Regardless of the specific hardware and software used for the computation, there are limits on the maximum and minimum numbers allowed, beyond which an overflow and underflow error would be triggered, respectively. To get around this issue, one can re-arrange the likelihood function to produce a common factor whose natural logarithm is easy to compute. That is, rewrite the likelihood function as

$$L[(B,S)\mid\Theta] = c \cdot m / (B!\,S!),$$

where the common factor $c$ makes $\ln(c)$ easy to compute and the multiplicative factor $m$ is constructed to moderate the magnitude of inputs to the exponential functions and the power functions. One can skip calculating the common factorial in the denominator because it does not involve any parameters to be estimated and thus affects only the absolute magnitude of the likelihood value.

After purging some constants unrelated to parameters in $\Theta$, one can write the daily log-likelihood function in the following computation-friendly form,

$$\mathcal{L}(\Theta) = -2\varepsilon + (B+S)\ln(\varepsilon) + \ln\bigl\{(1-\alpha) + \alpha\delta\exp[-\mu - S\ln(k)] + \alpha(1-\delta)\exp[-\mu - B\ln(k)]\bigr\}, \tag{3}$$

where $k \equiv \varepsilon/(\mu+\varepsilon)$, and thus $\ln(k) \le 0$. The ratio $k$ of arrival rates is bounded between 0 and 1, and thus the inputs for the exponential functions can be of moderate size.
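The step from equation (1) to equation (3) can be written out explicitly. The worked derivation below uses only the symbols of the original PIN model ($\alpha$, $\delta$, $\mu$, $\varepsilon$, $B$, $S$, $k$); identifying $c$ with $e^{-2\varepsilon}\varepsilon^{B+S}$ and $m$ with the bracketed sum is one reading of the factorization that is consistent with the text, offered as a sketch.

```latex
\begin{align*}
B!\,S!\;L[(B,S)\mid\Theta]
 &= e^{-2\varepsilon}\varepsilon^{B+S}
    \Bigl[(1-\alpha)
      + \alpha\delta\,e^{-\mu}\Bigl(\tfrac{\mu+\varepsilon}{\varepsilon}\Bigr)^{S}
      + \alpha(1-\delta)\,e^{-\mu}\Bigl(\tfrac{\mu+\varepsilon}{\varepsilon}\Bigr)^{B}\Bigr] \\
 &= \underbrace{e^{-2\varepsilon}\varepsilon^{B+S}}_{c}\;
    \underbrace{\Bigl[(1-\alpha)
      + \alpha\delta\,e^{-\mu-S\ln k}
      + \alpha(1-\delta)\,e^{-\mu-B\ln k}\Bigr]}_{m},
 \qquad k \equiv \frac{\varepsilon}{\mu+\varepsilon}, \\
\ln c + \ln m
 &= -2\varepsilon + (B+S)\ln\varepsilon
    + \ln\Bigl\{(1-\alpha)
      + \alpha\delta\,e^{-\mu-S\ln k}
      + \alpha(1-\delta)\,e^{-\mu-B\ln k}\Bigr\}.
\end{align*}
```

Because $\ln k \le 0$, the terms $-S\ln k$ and $-B\ln k$ are non-negative and grow linearly in the order counts, so the exponents inside $m$ can still become large for heavily traded stocks.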


2.3 Eliminating Overflow and Underflow Problems

The common factorization in equation (3) works reasonably well to alleviate the overflow and underflow problems among stocks with low to moderate trading volumes, but it is far from eliminating these problems. Stocks with high trading volume often suffer from the overflow and underflow problems even after the moderation introduced by the factorization. Given the recent trends of institutional investors breaking up their orders into smaller pieces and the increasing prevalence of high frequency traders who often submit orders of small size, more and more stocks fall into the category for which the PIN estimation simply fails. Note, however, that the overflow and underflow problems do not afflict only stocks with high trading volumes. To break down the estimation process, it takes no more than one day of severely one-sided order flows or the optimization procedure directing one of the parameters of interest into a region of values that would trigger an overflow or underflow problem.

In contrast to the dire situation regarding the empirical estimation of PIN, the PIN measure as a theoretical concept has clearly gained popularity among researchers who are keen to measure the extent of information asymmetry in various contexts. It is thus desirable to effectively eliminate the overflow and underflow problems from the PIN estimation. This paper provides one such solution by dynamically changing the factorization process for each pair of order flows on a daily basis so as to actively avoid triggering any overflow or underflow error.

To implement the dynamic factorization, it is necessary to first identify the trigger value for overflow and underflow errors in the hardware and software combination for the PIN estimations. In order to obtain the trigger value, the researcher can keep increasing the input value $C$ to an exponential function $\exp\{\pm|C|\}$ until the calculation fails. For instance, I use the SAS software on a desktop computer that associates $\exp\{708\}$ with an overflow and $\exp\{-708\}$ with an underflow. Alternatively, one can use the constant function in SAS to identify the trigger values. In my computer, I find that constant('logsmall') = -708.396 and constant('logbig') = 709.783. In other words, the combination of my computing hardware and software yields an approximate critical value $C = 708$, and the factorization of the daily log-likelihood function has to be done in a way to actively avoid numbers outside the range $[\exp(-C), \exp(C)]$, which is equivalent to $[10^{-C/\ln(10)}, 10^{C/\ln(10)}]$. Otherwise, an overflow or underflow problem can occur.
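The same trigger values can be looked up outside SAS. The following is a minimal Python sketch, assuming IEEE 754 double precision; the names `LOG_BIG`, `LOG_SMALL`, `C_SAFE` and `find_overflow_trigger` are hypothetical and simply mirror the two approaches described above (querying the machine constants, or increasing the input until the exponential fails).

```python
import math
import sys

# Natural logs of the largest and the smallest positive normal doubles.
LOG_BIG = math.log(sys.float_info.max)     # about 709.783
LOG_SMALL = math.log(sys.float_info.min)   # about -708.396

# A conservative symmetric critical value in the spirit of C = 708.
C_SAFE = math.floor(min(LOG_BIG, -LOG_SMALL))


def find_overflow_trigger(step: float = 1.0) -> float:
    """Increase the input to exp() until the call fails; return the last safe input."""
    c = 0.0
    while True:
        try:
            math.exp(c + step)
        except OverflowError:
            return c
        c += step
```

In Python an oversized input raises `OverflowError`, whereas a very negative input silently returns zero, so only the overflow side needs an explicit search.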

Note that the overflow and underflow problems affect mainly the multiplicative factor $m$ of the daily likelihood. The basic strategy of my factorization scheme is to pull the largest exponential input out of the multiplicative factor $m$, make it part of the common factor $c$, and replace with zero every exponential that would otherwise trigger an underflow problem. The Appendix spells out the full details of a simple three-step procedure to dynamically factorize the daily log-likelihood function. Once the overflow and underflow problems are completely eliminated through the dynamic factorization algorithm, it is fairly easy to conduct a grid search over different regions of parameter values so as to ensure a global maximization.
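The basic strategy can be sketched with the generic log-sum-exp device: find the largest exponent on each day, move it outside the logarithm, and let the remaining exponentials (all with non-positive inputs) shrink toward zero. The Python sketch below is a stand-in under that assumption and is not the three-step procedure of the Appendix; the function name `daily_loglik` is hypothetical, and the formula it evaluates is the log-likelihood of the original PIN model in equation (3).

```python
import math


def daily_loglik(alpha, delta, mu, eps, B, S):
    """Daily log-likelihood of the original PIN model, up to the constant -ln(B! S!)."""
    log_k = math.log(eps) - math.log(mu + eps)   # ln(k) <= 0
    # (weight, exponent) pairs for the no-news, bad-news and good-news states.
    states = [
        (1.0 - alpha, 0.0),
        (alpha * delta, -mu - S * log_k),
        (alpha * (1.0 - delta), -mu - B * log_k),
    ]
    states = [(w, e) for w, e in states if w > 0.0]
    # Pull the largest exponent out of the sum so that every exp() input is <= 0.
    e_max = max(e for _, e in states)
    # Exponentials that underflow evaluate to 0.0 here, which is the intended outcome.
    total = sum(w * math.exp(e - e_max) for w, e in states)
    return -2.0 * eps + (B + S) * math.log(eps) + e_max + math.log(total)
```

For a hypothetical heavily traded day with `B = 50000`, `S = 20000`, `mu = 3000` and `eps = 20000`, the largest exponent is close to 3988, far beyond the overflow trigger, yet the function returns a finite value because that exponent never enters `exp()` directly.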

2.4 Computing Daily PIN Series

The presence of information asymmetry is applicable in many contexts. It is very often the case that empirical researchers seek to measure the change in the extent of asymmetric information around certain corporate events that can be short-lived. It is the common practice in the literature to estimate the PIN measure over one quarter of daily data for a given stock. Therefore, the estimated quarterly PIN measures are not well suited for studying corporate events whose effects on information asymmetry may last only a day or two. While there are studies that extend the original theoretical framework underlying the PIN measure to allow for the estimation of daily PIN series (e.g., Lei and Wu, 2005; Easley, Engle, O'Hara and Wu, 2008), the high frequency series comes at the cost of imposing more elaborate structures on the observed order flows and thus the extended models are not nearly as popular as the simple estimation of quarterly PINs. In fact, even the studies that promote the extended PIN models allowing for high frequency PIN series often avoid a large scale estimation for many stocks and restrict the exercise to a selected few stocks instead.

In this paper, I propose a simple method to impute the daily PIN series from the quarterly PIN estimates. The basic idea is to estimate the quarterly PIN measures with and without the trading day $t$ and infer the daily PIN measure from the difference in quarterly PIN estimates. Denote by $N_x$ the total number of trades in the quarter (e.g., 62 trading days) prior to trading day $t$. Denote by $N_t$ the total number of trades in trading day $t$. Then the cumulative total number of trades over a 63-trading-day span ended on trading day $t$ is

$$N_c = N_x + N_t.$$

Denote by $PIN_x$ the PIN estimated from using the first 62 days of trades. Denote by $PIN_c$ the PIN estimated from using the 63 days of trades. Denote by $PIN_t$ the imputed PIN measure for trading day $t$. Clearly, the informed orders have to add up over the period of 63 days. In other words, the following relation holds

$$N_x\,PIN_x + N_t\,PIN_t = N_c\,PIN_c.$$

Substituting the definition of the cumulative total number of trades and re-arranging the terms, the implied daily PIN measure is

$$PIN_t = PIN_c + \frac{N_x}{N_t}(PIN_c - PIN_x). \tag{4}$$

The daily incremental PIN relative to the prior 62 trading days is

$$PIN_t - PIN_x = \frac{N_c}{N_t}(PIN_c - PIN_x).$$

So one alternative representation of the imputed daily PIN measure is

$$PIN_t = PIN_x + \frac{N_c}{N_t}(PIN_c - PIN_x). \tag{5}$$

The intuition behind the daily PIN measure is straightforward. The PIN estimated over 63 trading days is essentially a weighted average of the PIN measure on day $t$ and the PIN estimated over the preceding 62 trading days. If the PIN measure over the period inclusive of the trading day $t$ is higher than the PIN measure excluding the trading day $t$, then it must be the case that the PIN on the trading day $t$ is higher than before. On the other hand, if the PIN drops lower on the trading day $t$, then the inclusive PIN measure must be lower than the PIN measure excluding the trading day $t$.
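The imputation in equation (4) amounts to one line of arithmetic. The sketch below is illustrative only: the function name `impute_daily_pin` and the trade counts and PIN values are hypothetical, not figures from the empirical analysis.

```python
def impute_daily_pin(pin_x: float, pin_c: float, n_x: float, n_t: float) -> float:
    """Imputed PIN on day t from equation (4).

    pin_x: PIN estimated over the prior window (excluding day t)
    pin_c: PIN estimated over the window including day t
    n_x:   number of trades in the prior window
    n_t:   number of trades on day t
    """
    if n_t <= 0:
        raise ValueError("day t needs at least one trade")
    return pin_c + (n_x / n_t) * (pin_c - pin_x)


# Hypothetical example: 6,200 trades over the prior 62 days, 300 trades on day t,
# and the quarterly PIN moving from 0.15 to 0.16 once day t is included.
pin_t = impute_daily_pin(pin_x=0.15, pin_c=0.16, n_x=6200, n_t=300)
```

A one-percentage-point rise in the quarterly estimate implies a daily PIN of roughly 0.37 here, because day t carries less than 5% of the trades and its change is diluted accordingly.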

The inference above also delivers a boundary condition between $PIN_x$ and $PIN_c$ as an added benefit. Since the daily PIN is bounded between zero and one, the estimated $PIN_c$ must be bounded as well,

$$\frac{N_x}{N_c}PIN_x \le PIN_c \le \frac{N_x}{N_c}PIN_x + \frac{N_t}{N_c}. \tag{6}$$

In practice, estimated pairs of $PIN_x$ and $PIN_c$ that do not satisfy the above boundary condition should be re-visited. The violation of the boundary condition could have resulted from local maximum rather than global maximum estimates for either $PIN_x$ or $PIN_c$ and thus re-estimations may help. If the model structure is too rigid to fit the data well, however, the estimated pairs have to be either discarded or re-estimated under an alternative model structure. For instance, an estimation window with a different time span may fit the data better, with daily order flows over one month as opposed to one quarter.
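A simple screen for the boundary condition in equation (6) can be sketched as follows. The helper names `pin_c_bounds` and `satisfies_boundary` and the numbers are hypothetical; the check only flags pairs of estimates for re-visiting and does not decide how to re-estimate them.

```python
def pin_c_bounds(pin_x: float, n_x: float, n_t: float) -> tuple:
    """Lower and upper bounds on the inclusive PIN implied by 0 <= PIN_t <= 1."""
    n_c = n_x + n_t
    lower = (n_x / n_c) * pin_x      # reached when the daily PIN equals zero
    upper = lower + n_t / n_c        # reached when the daily PIN equals one
    return lower, upper


def satisfies_boundary(pin_x: float, pin_c: float, n_x: float, n_t: float) -> bool:
    """True when the pair (pin_x, pin_c) is consistent with equation (6)."""
    lower, upper = pin_c_bounds(pin_x, n_x, n_t)
    return lower <= pin_c <= upper


# Hypothetical example: with 6,200 prior trades, 300 trades on day t and a prior
# PIN of 0.15, the inclusive PIN must lie between about 0.143 and 0.189.
ok = satisfies_boundary(pin_x=0.15, pin_c=0.16, n_x=6200, n_t=300)
flagged = not satisfies_boundary(pin_x=0.15, pin_c=0.20, n_x=6200, n_t=300)
```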

3 PIN Extension

The flash crash on May 6, 2010, provides a good motivation to introduce liquidity shocks into an extended PIN framework. In the event of a firm-specific or market-wide liquidity shock, the stock prices can experience a sizeable reversal over a short period of time that would not necessarily be consistent with the presence of informed traders. Stock prices are typically assumed to gradually incorporate the information revealed from the informed orders and thus the information-based activities would imply a price continuation rather than a sizeable reversal. One way of justifying the sizeable price reversals associated with liquidity shocks is to introduce a third group of investors, known as the pseudo market makers, whose orders arrive only upon the liquidity shocks. In the extended PIN model, the pseudo market makers trade in a contrarian fashion in the same way as market makers would and thus move stock prices back to the fundamental level.

3.1 Revised Trade Process and Sample Likelihood

It is useful to extend the news arrival process so that news reflects either signals about the fundamental value of the stock or simply liquidity shocks. The trading process can be revised as follows. There is a news event in each trading interval (e.g., one day) with probability $\alpha$. Conditional on the arrival of a news event, there is a probability, denoted $\theta$ here, that the news reflects the liquidity shock and a $1-\theta$ probability that the news reflects the value-relevant fundamental. The orders from the pseudo market makers arrive only upon the occurrence of a liquidity shock and follow a Poisson process with an arrival rate denoted $\eta$ here. As before, the informed orders arrive upon the release of fundamental news and follow a Poisson process with arrival rate $\mu$. Regardless of whether a news event occurs, the uninformed orders always arrive in each trading interval and follow a Poisson process with arrival rate $\varepsilon$. Irrespective of the news type, each news event has an identical probability $\delta$ of being negative. The uninformed orders are insensitive to the news nature and thus balanced across the buy and sell sides. In contrast, both the informed investors and the pseudo market makers submit only one-sided orders depending upon the news nature. Specifically, the liquidity shock triggers only buy (or sell) orders from the pseudo market makers on days with bad (or good) news, while the fundamental news induces only sell (or buy) orders from the informed traders on days with bad (or good) news.

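Denote by $\Theta$ the vector of parameters to be estimated. With $\theta$ denoting the probability that a news event is a liquidity shock and $\eta$ the arrival rate of the pseudo market makers, the daily likelihood of observing $B$ buy trades and $S$ sell trades on one specific stock is

$$
\begin{aligned}
L[(B,S)\mid\Theta] ={}& (1-\alpha)\exp(-2\varepsilon)\frac{\varepsilon^{B}}{B!}\frac{\varepsilon^{S}}{S!} \\
&+ \alpha\theta\delta\exp(-\eta-2\varepsilon)\frac{(\eta+\varepsilon)^{B}}{B!}\frac{\varepsilon^{S}}{S!} \\
&+ \alpha\theta(1-\delta)\exp(-\eta-2\varepsilon)\frac{\varepsilon^{B}}{B!}\frac{(\eta+\varepsilon)^{S}}{S!} \\
&+ \alpha(1-\theta)\delta\exp(-\mu-2\varepsilon)\frac{\varepsilon^{B}}{B!}\frac{(\mu+\varepsilon)^{S}}{S!} \\
&+ \alpha(1-\theta)(1-\delta)\exp(-\mu-2\varepsilon)\frac{(\mu+\varepsilon)^{B}}{B!}\frac{\varepsilon^{S}}{S!}.
\end{aligned}
\tag{7}
$$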
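The five-state likelihood can be evaluated in logs without overflow by the same device of pulling out the largest exponent. The Python sketch below is an illustration under stated assumptions: `theta` and `eta` are placeholder names for the liquidity-shock probability and the pseudo market maker arrival rate, the function name `extended_daily_loglik` is hypothetical, and the generic log-sum-exp step stands in for the dynamic factorization of the Appendix.

```python
import math


def extended_daily_loglik(alpha, delta, theta, mu, eta, eps, B, S):
    """Daily log-likelihood of the extended model, up to the constant -ln(B! S!).

    theta (liquidity-shock probability) and eta (pseudo market maker arrival
    rate) are placeholder names assumed for this sketch.
    """
    log_k_mu = math.log(eps) - math.log(mu + eps)     # ln[eps / (mu + eps)] <= 0
    log_k_eta = math.log(eps) - math.log(eta + eps)   # ln[eps / (eta + eps)] <= 0
    # (weight, exponent) per state after factoring out exp(-2*eps) * eps**(B + S).
    states = [
        (1.0 - alpha, 0.0),                                           # no news
        (alpha * theta * delta, -eta - B * log_k_eta),                # liquidity shock, bad news: contrarian buys
        (alpha * theta * (1.0 - delta), -eta - S * log_k_eta),        # liquidity shock, good news: contrarian sells
        (alpha * (1.0 - theta) * delta, -mu - S * log_k_mu),          # fundamental bad news: informed sells
        (alpha * (1.0 - theta) * (1.0 - delta), -mu - B * log_k_mu),  # fundamental good news: informed buys
    ]
    states = [(w, e) for w, e in states if w > 0.0]
    e_max = max(e for _, e in states)                 # largest exponent, moved outside the log
    total = sum(w * math.exp(e - e_max) for w, e in states)
    return -2.0 * eps + (B + S) * math.log(eps) + e_max + math.log(total)
```

Setting `theta = 0` removes the two liquidity-shock states, so the function then returns the log-likelihood of the original three-state model.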


Purging some constants unrelated to parameters in $\Theta$, one can write the daily log-likelihood function in the following computation-friendly form,

$$\mathcal{L}(\Theta) = -2\varepsilon + (B+S)\ln(\varepsilon) + \ln\bigl\{ \ldots$$



Instead of specifying a constant liquidity-shock probability $\theta$ based on the observed frequency of intraday price reversals, one may choose to directly introduce the price series into the model and make both the probability of liquidity shock and the PIN time-varying. It is also possible to link the time-varying probability of liquidity shock to certain well-known liquidity measures. One has to carefully balance, though, the benefits of having a model with rich dynamics against the costs of imposing a more elaborate and thus complex structure on the data. I choose to allow for a constant $\theta$ in the PIN extension here because of its inherent parsimony.

3.3 PIN Decomposition

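The conventional PIN measure in the extended framework can be re-defined as

$$PIN = \frac{\alpha(1-\theta)\mu + \alpha\theta\eta}{\alpha(1-\theta)\mu + \alpha\theta\eta + 2\varepsilon} \tag{8}$$

to reflect the fact that both the informed investors (arrival rate $\mu$) and the pseudo market makers (arrival rate $\eta$, active with probability $\theta$ conditional on news) contribute to order imbalances. The component of PIN measure that is purely related to information asymmetry can be isolated as

$$PIN_{inasy} = \frac{\alpha(1-\theta)\mu}{\alpha(1-\theta)\mu + \alpha\theta\eta + 2\varepsilon}, \tag{9}$$

after carving out the component of PIN measure related to illiquidity

$$PIN_{illiq} = \frac{\alpha\theta\eta}{\alpha(1-\theta)\mu + \alpha\theta\eta + 2\varepsilon}. \tag{10}$$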

Note that the arrival rate $\eta$ of the pseudo market makers is endogenously determined by the extended PIN framework and thus can have great influence over the PIN decomposition, even though the constant probability $\theta$ is pinned down from the price reversal statistics that are outside the PIN framework.
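The decomposition in equations (8) to (10) can be sketched in a few lines. This is an illustrative helper under assumed notation, not code from the paper: the function name and the arguments `alpha`, `theta`, `mu`, `eta` and `eps` are hypothetical labels for the news probability, the liquidity-shock probability, and the arrival rates of informed traders, pseudo market makers and uninformed traders.

```python
def pin_decomposition(alpha, theta, mu, eta, eps):
    """Split the conventional PIN into information-asymmetry and illiquidity parts.

    All three measures share the same denominator: the expected one-sided
    order flow from informed traders and pseudo market makers plus the
    two-sided uninformed flow 2 * eps.
    """
    informed = alpha * (1 - theta) * mu   # expected informed order flow
    pseudo_mm = alpha * theta * eta       # expected pseudo-market-maker order flow
    total = informed + pseudo_mm + 2 * eps
    return {
        "pin": (informed + pseudo_mm) / total,
        "pin_inasy": informed / total,
        "pin_illiq": pseudo_mm / total,
    }
```

By construction the two components add up to the conventional PIN, and setting `theta` to 0 or 1 reproduces the two special cases in which PIN reflects only information asymmetry or only illiquidity.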

It is clear now that the conventional PIN measure actually consists of both an illiquidity component and an information asymmetry component. In the special case with $\theta = 0$, the PIN measure fully represents the extent of asymmetric information. In the special case with $\theta = 1$, the PIN measure fully represents the extent of illiquidity. One of these two roles can dominate the other from time to time. Keeping in mind the coexisting roles of illiquidity and information asymmetry helps one to reconcile the potential confusion over the varying interpretations of the PIN measure in the literature (see the second footnote).

From the perspective of the dual roles that the PIN measure can take, it is possible to address the PIN anomaly documented in Aktas, de Bodt, Declerck, and Van Oppens (2007). Instead of finding higher PIN estimates in periods with information leakage prior to the announcements of mergers and acquisitions, these authors find lower PIN estimates and thus label the finding an anomaly. It is not inconceivable that the lower PIN estimates could actually reflect the improved liquidity due to the heightened trading activities along with information leakage. I examine this empirical possibility in a separate paper.

3.4 Literature Review

In a closely related paper, Duarte and Young (2009) decompose the PIN measure into two components and attribute the pricing power of the PIN factor in the asset pricing context to the illiquidity component rather than the information asymmetry component. This finding makes an important contrast to the finding of information risk being priced in Easley, Hvidkjaer, and O'Hara (2002). Despite the similarity over the decomposition of the PIN measure, this paper is distinctively different from Duarte and Young (2009) in several aspects. As discussed earlier, the PIN measure is essentially equivalent to absolute order imbalance under the original framework in Easley, Kiefer, O'Hara and Paperman (1996). Duarte and Young (2009) inherit the same critical premise as Easley, Kiefer, O'Hara and Paperman (1996) that the informed investors are the exclusive source of order imbalances. Since this assumption does not have to hold in reality, Duarte and Young (2009) carefully discuss the potential problem with this assumption. They acknowledge the possibility that the order imbalances could also result from liquidity shocks rather than informed trades, leading to potentially problematic inferences. My paper directly tackles this important caveat in Duarte and Young (2009) and explicitly allows both the informed investors and the pseudo market makers to create order imbalances. This paper's decomposition of the PIN measure clearly reflects the importance of liquidity shocks. In light of this important distinction, it is worthwhile to examine whether or not the conclusion in Duarte and Young (2009) regarding the pricing power of the two components of the PIN factor is robust to introducing liquidity shocks to the PIN framework. I carry out this empirical exercise in another paper.

Moreover, the motivation behind the PIN extension in Duarte and Young (2009) is quite different. In this paper, I introduce the liquidity shocks, upon which the pseudo market makers arrive to move prices back to the fundamental level, in order to break the exclusivity of informed traders in creating order imbalances. In contrast, Duarte and Young (2009) reasonably argue that the one-sided nature of informed orders necessarily implies a negative correlation between buy orders and sell orders even though the observed daily correlation is positive. The PIN extension in Duarte and Young (2009) is motivated by eliminating the mismatch with the observed order flows in terms of correlation, and they accomplish the goal by introducing symmetric positive shocks to both buy and sell orders. In a way, their motivation and approach are quite similar to the PIN extension in Weston (2001), who also worries about trading volume on information days being abnormally large on both buy and sell sides. Weston (2001) argues that the positive correlation between buy and sell orders is driven by noise trading, which is characterized as a third group of traders that submit both buy and sell orders simultaneously. While Weston (2001) allows the symmetric order flow hikes to take place only on a day with news arrival, Duarte and Young (2009) introduce the symmetric order flow hikes regardless of whether the fundamental news arrives. Since informed traders and pseudo market makers in my extension would submit only one-sided orders depending upon the nature of news events, one limitation of my extension is that it does not imply a positive correlation between buy and sell order flows. This is an empirical limitation in the sense that the modelled trading interval does not have to span exactly one day and the observed positive correlation between buy and sell orders does not necessarily extend to sampling intervals other than one day. One way to address the limitation is to have a finer grid of trading intervals so as to allow intraday interactions between different news events and thus higher buy orders and sell orders on the same day. Alternatively, one can follow the lead of Weston (2001) and Duarte and Young (2009) and further complement the orders from the pseudo market makers with symmetric order flow hikes to ensure daily order flows that are positively correlated.

Easley, Lopez and O'Hara (2010) also study the PIN around the flash crash but rely on an approximation rather than the maximum likelihood estimation that is typically used in the literature. In light of the finding that the original PIN measure is essentially equivalent to absolute order imbalance as discussed earlier, Kaul, Lei and Stoffman (2005) advocate using the absolute percentage of order imbalance (AIM) in place of the PIN measure that is much harder to estimate than AIM. Easley, Lopez and O'Hara (2010) advance this proposal by detailing a procedure to measure the absolute order imbalance in lieu of PIN and applying the revised measure to a number of different security products beyond stocks. As a result, these two papers step outside the typical PIN framework and do not conduct maximum likelihood estimations for the proposed PIN alternative. It is noteworthy that Easley, Lopez and O'Hara (2010) update the order imbalance more frequently among heavily traded stocks than thinly traded stocks and thus partly address the issue that Kaul, Lei and Stoffman (2005) raise regarding the practice in the literature of applying a uniform frequency to measure order flows for all stocks. Unfortunately, however, the absolute percentage order imbalance can be a proxy for both illiquidity and information asymmetry in much the same way as the original PIN measure. Both Kaul, Lei and Stoffman (2005) and Easley, Lopez and O'Hara (2010) suffer from the lack of distinction between these two roles precisely because of the flawed assumption that the informed traders are the sole source of order imbalance. Moreover, Easley, Engle, O'Hara and Wu (2008) illustrate that using the absolute percentage order imbalance as an approximation for PIN may actually miss the dynamics over short-lived corporate events such as earnings announcements that a daily PIN series would have captured. So it is not straightforward to conclude that the documented properties of the alternative measure in Easley, Lopez and O'Hara (2010) necessarily reflect those of PIN around the flash crash.


In sum, this paper's extension to the PIN framework marks an important departure from the extant literature and contributes a measure of information asymmetry that is conceptually purer than what was previously available.

4 Empirical Analysis

4.1 Construction of Sample

My primary data source is the detailed stock transactions from the New York Stock Exchange (NYSE) Trade and Quote (TAQ) database between February 5 and May 6 of 2010. This study focuses on stocks listed on the NYSE and the American Stock Exchange (AMEX). Because the auto-quotes are not filtered in TAQ, I follow Chordia, Roll and Subrahmanyam (2001) in using only the primary market (NYSE) quotes, and retain quotes within the regular trading block after purging those quotes with non-positive bid or ask prices, negative bid or ask sizes, missing time stamps, or bid prices higher than ask prices. I also remove trades that are out of sequence, recorded before the open or after the close time, have special settlement conditions, or have missing trade size or time stamp. As is the standard practice in the literature, the algorithm in Lee and Ready (1991) is utilized to determine the buyer-initiated or seller-initiated nature of each trade.⁴ Basically, all trades with a price higher (or lower) than the midpoint of the bid and ask prices are classified as buyer-initiated (or seller-initiated). Trades with a price identical to the midpoint of the prevailing quote are subject to a tick test so that a trade is classified as buyer-initiated (or seller-initiated) if the price is higher (or lower) than the preceding trade. I follow the advice of Chordia, Roll and Subrahmanyam (2005), who recommend revoking the five-second delay rule in Lee and Ready (1991) for matching trades with quotes starting in 1999.
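The classification rule just described can be sketched as follows. This is a simplified illustration rather than the exact procedure applied to TAQ: it assumes each trade has already been matched to its prevailing quote with no delay, it extends the tick test to the most recent non-zero price change when consecutive trades print at the same price (a common convention that the text does not spell out), and the function name and input layout are hypothetical.

```python
def classify_trades(trades):
    """Sign trades with a quote rule followed by a tick test.

    `trades` is a time-ordered iterable of (price, bid, ask) tuples, where
    bid and ask form the prevailing quote for that trade. Returns a list with
    +1 for buyer-initiated, -1 for seller-initiated and 0 when neither the
    quote rule nor the tick test can sign the trade.
    """
    signs = []
    last_price = None
    last_tick = 0  # direction of the most recent non-zero price change
    for price, bid, ask in trades:
        midpoint = (bid + ask) / 2
        if last_price is not None and price != last_price:
            last_tick = 1 if price > last_price else -1
        if price > midpoint:
            sign = 1            # above the midpoint: buyer-initiated
        elif price < midpoint:
            sign = -1           # below the midpoint: seller-initiated
        else:
            sign = last_tick    # at the midpoint: fall back to the tick test
        signs.append(sign)
        last_price = price
    return signs
```

Counting the +1 and -1 entries per day yields the buy and sell totals $B$ and $S$ that enter the likelihood. Working with prices in integer cents avoids spurious floating-point mismatches when comparing a price with the midpoint.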

For each stock the PIN measure is estimated separately for the 62-day period ending on May 5, 2010, and the 63-day period ending on May 6, 2010. With a minimum requirement of order flows for 30 trading days, the maximum likelihood estimation is carried out using the NLMIXED procedure in SAS. The dynamic factorization of the daily log-likelihood function is remarkably successful. After an extensive grid search over different regions of parameter values to ensure a global maximum, the optimization exercise finishes successfully for all stocks in each estimation period. To facilitate imputing the daily PIN series on the date of the flash crash, stocks with zero trades on May 6, 2010, are removed from the sample. As discussed earlier in Section 2, the imputation of the daily PIN involves a set of boundary conditions on the resulting pair of quarterly PIN estimates. Only stock quarters that survive this additional requirement remain in the final sample.

⁴ Note that Boehmer, Grammig and Theissen (2007) study the bias on PIN estimates introduced by the sometimes erroneous classification of the trade initiation and provide a method to correct this bias.


The master file of the TAQ database provides the CUSIP underlying each stock ticker symbol, and I rely on the Center for Research in Security Prices (CRSP) database to extract the stock characteristics (such as primary exchange, share code and market equity) after merging the two datasets on CUSIP. There are 1,765 stocks on the NYSE/AMEX with qualified pairs of quarterly PIN estimates thus far. To check the sensitivity of the results to the exact grouping of stocks, I employ a set of filters to further refine the sample. After removing the American Depositary Receipts (ADRs), the sample size becomes 1,600. Focusing on common stocks with a CRSP share code of either 10 or 11 further reduces the sample size to 998 stocks. To guard against the potential confounding effects from the earnings announcements adjacent to the flash crash event, I also remove stocks that have their earnings announced between May 5 and May 7, 2010, inclusive on both ends. The announcement dates are extracted from the actual earnings file for the U.S. firms in the I/B/E/S database. The sample size comes down to 847.

4.2 Estimation of Original PIN Measure

The simple algorithm of dynamic factorization for the daily log-likelihood function outlined in the Appendix is quite successful, achieving a 100% convergence rate in my sample while avoiding corner solutions and local maxima. In contrast, the common factorization in equation (3) fares much worse and has a success rate of 45.12% in the same sample. The staggering failure rate from the common factorization in equation (3) illustrates the dire situation of the PIN estimation for the trading data in recent years. With algorithmic trading increasingly popular, many orders are split into smaller pieces, often resulting in tens of thousands of trades for one stock on one typical day. The sharp increase in the observed order flows makes it more likely to trigger a numerical overflow or underflow. Hence it is critical to have an effective factorization scheme that is flexible enough to adapt to various patterns of daily order flows in eradicating the overflow and underflow problem.

To show that extreme cases of order imbalances significantly contribute to the estimation complexity, I run a logit regression to explain the success of maximum likelihood estimations for the original PIN framework with the common factorization of daily log-likelihood in equation (3). The cross-sectional regression results are reported in Table 1. When the total number of trades averaged across all trading days is the sole predictor, it is inversely related to the estimation success. In other words, the PIN estimation is more difficult among heavily traded stocks. The maximum absolute order imbalance also adds to the difficulty of maximum likelihood estimation in that extreme imbalances often trigger numerical overflow and underflow problems. Note that the extreme absolute order imbalance delivers a better fit than the total trades as a sole predictor for the estimation success, and there is little incremental explanatory power from the total trades after controlling for the extreme order imbalance. The percentage absolute order imbalance averaged across all trading days beats the aforementioned two predictors, however, by delivering a pseudo-$R^2$ of 0.42 as a sole predictor. The positive coefficient on the percentage absolute order imbalance suggests that the original PIN framework thrives in cases with extremely imbalanced orders on average, which in turn strongly reflect the presence of informed orders. Putting these predictive variables together to explain the estimation success retains their respective signs with the exception of the total orders. Further augmenting the logit regression with the logarithmic market equity does not materially change the inferences, and as expected the estimations for large cap stocks are more difficult. All the estimated coefficients in the top panel of Table 1, including the intercepts, are statistically significant at the 1% level.

In the bottom panel of Table 1, I repeat the same set of six regression designs while replacing the independent variables by the cross-sectional percentile rank when possible. The percentile ranks help us to gauge the result sensitivity to potential outliers since the regressions in the top panel would place too much weight on observations with extreme values. The qualitative pattern of results remains largely unchanged with a few exceptions. The goodness of fit has improved after the transformation of independent variables. The firm size is no longer statistically significant and the intercepts in two designs are also less statistically significant than before. Moreover, the total number of trades has one extra change of sign in the bottom panel compared to the top panel.

Overall, order imbalances contribute to the estimation complexity in an interesting way. While a high level of extreme imbalances implies a lower estimation success, stocks with a higher percentage of order imbalances are actually easier to estimate. The former finding speaks directly to the numerical overflow and underflow problems of the estimation and the latter points to the strategy of the original PIN framework in identifying order imbalances as informed trades.

4.3 Inferences based on Daily PIN Series

Table 2 reports the cross-sectional mean probability of informed trading related to the flash crash on May 6, 2010, based on the estimations of the original PIN model for a number of sub-samples. The reported PIN measures include the quarterly PIN excluding the flash crash event, the imputed daily PIN on the day of the flash crash as well as the incremental PIN on the day of the flash crash. Relative to the PIN estimated for the preceding quarter, there appears to be a market-wide hike of about 0.12 (or a doubling effect) in the imputed daily PIN on the flash crash event regardless of whether we exclude American Depositary Receipts, focus on the common stocks only, or exclude stocks with earnings announced on days immediately adjacent to the flash crash. The incremental PIN on May 6, 2010, is reliably positive, as are the quarterly and daily PIN measures.

To better understand the cross-sectional differences, I further classify the 847 common stocks in the final sample into ten volume deciles based on the daily average total number of trades over the quarter ended on May 5, 2010. The pattern of PIN estimates in the quarter leading up to the flash crash appears similar to that reported in Easley, Kiefer, O'Hara and Paperman (1996). That is, thinly traded stocks have higher estimated PINs than heavily traded stocks. The PIN estimates are monotonically declining as the volume decile gets higher. The average PIN of 0.221 for the stocks in the lowest volume decile nearly triples that for the stocks in the highest volume decile at 0.080.

The pattern of imputed PINs on the day of the flash crash is remarkably different. While the stocks in the lowest volume decile continue to have the highest average daily PIN at 0.318, the average daily PIN for the remaining nine volume deciles ranges from 0.217 to 0.258 without any discernible pattern among them. The heavily traded stocks in the 9th and 10th volume deciles share the same average imputed PIN of 0.238 on the day of the flash crash, which almost triples their respective PIN level in the preceding quarter. The stark contrast of the estimated PINs around the flash crash, coupled with the seeming lack of distinction between stocks with high volume and those with modest volume on the event day, suggests the uniqueness and the usefulness of the flash crash event in revealing the true identity of the PIN. The pattern of the daily imputed PIN series points out one key weakness of the original PIN as a pure measure of information asymmetry. For someone holding such a pure view, it is very worrisome that the level of asymmetric information is at least 0.217 for stocks in every volume decile, even among the most heavily traded stocks. It is also difficult to make the case that all stocks outside the most thinly traded group exhibit the same extent of asymmetric information on the day of the flash crash. In contrast, it is far easier for someone viewing the PIN as a simple measure of illiquidity to associate the flash crash event with a market-wide liquidity shock that affects almost all stocks to the same degree on average.

There are at least two ways to present the contrast between the daily imputed PIN on the day of the flash crash and the quarterly PIN just prior to that date. Table 2 reports both the cross-sectional mean incremental PIN and the ratio of average daily PIN to average quarterly PIN. The most thinly traded stocks experience the least increase in PIN on the day of the flash crash while the most heavily traded stocks experience the largest hike. Based on the ratio of means, the most thinly traded stocks register a 44% hike in PIN and the most heavily traded stocks 199%. The degree of PIN hike is gradually increasing as the volume decile climbs higher, but not in a strictly monotonic fashion. The finding of a stronger PIN hike on the day of the flash crash among those most frequently traded stocks is another piece of evidence corroborating the notion that the conventional PIN measure may actually better proxy for illiquidity than information asymmetry on the day of the flash crash. After all, the staff report by CFTC-SEC (2010) traces the flash crash to a large and aggressive trade in the S&P 500 index futures market, and the highest two volume deciles indeed include many stocks in the S&P 500 index.

In light of the findings above, it appears reasonable to conclude that the empirical evidence surrounding the flash crash leans in favor of the illiquidity interpretation rather than the information asymmetry interpretation for the conventional PIN measure. After all, it is very difficult to exclusively attribute the market-wide hike in the PIN on the day of the flash crash to asymmetric information as the original PIN model would. The extended PIN framework demonstrates that the conventional PIN measure consists of both an illiquidity component and an information asymmetry component. It is interesting to see how well the extended PIN model addresses the situation.

4.4 Estimation of Extended PIN Measures

As discussed in Section 3, it goes beyond the observation of daily order flows to identify possible liquidity shocks. Consequently, the constant probability $\theta$ of liquidity shocks is determined outside the PIN structure and becomes a crucial input for the extended PIN model. In this paper, I equate the constant probability of liquidity shock to the empirical frequency for the occurrence of a sizeable intraday reversal of stock prices within a given stock quarter. Here is the detailed procedure to identify sizeable intraday reversals. First, one can cut each regular trading day into thirteen half-hour slots from 9:30am EST to 4:00pm EST and find the minimum and maximum prices within each time slot. Second, the timing information of these minimum and maximum prices along with the opening and closing prices helps us create an intraday return series and determine the intraday maximum and minimum returns. Suppose that the aforementioned intraday maximum return happens to be positive and the intraday minimum return is negative. If, moreover, both the intraday maximum and minimum returns exceed a pre-specified threshold level in absolute value, then this trading day qualifies to be a day with sizeable intraday price reversals. Finally, one can tally the number of trading days with sizeable price reversals and compute the fraction of such days within all trading days over the entire estimation period. The resulting fraction is the constant probability $\theta$ of liquidity shocks that is used to estimate the rest of the parameters in the maximum likelihood estimation and construct the two components of PIN.
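The reversal-counting procedure above can be sketched as follows. The text does not specify exactly how the intraday return series is formed, so this sketch assumes one plausible reading: the opening price, the half-hour lows and highs in their order of occurrence, and the closing price are arranged into one time-ordered path, and simple returns are taken between consecutive points. The function names and the input layout are hypothetical.

```python
def is_reversal_day(prices, threshold):
    """Flag a day whose intraday path contains a sizeable up-move and down-move.

    `prices` is the time-ordered path for one day: open, then the half-hour
    lows/highs in order of occurrence, then close. The day qualifies when the
    largest return exceeds +threshold and the smallest falls below -threshold.
    """
    returns = [p1 / p0 - 1.0 for p0, p1 in zip(prices, prices[1:])]
    if not returns:
        return False
    return max(returns) > threshold and min(returns) < -threshold


def liquidity_shock_probability(days, threshold):
    """Fraction of trading days in the estimation period flagged as reversal days."""
    if not days:
        raise ValueError("need at least one trading day")
    flagged = sum(is_reversal_day(prices, threshold) for prices in days)
    return flagged / len(days)
```

Passing the sample standard deviation of daily returns as `threshold` corresponds to the stock-specific benchmark, while a fixed value such as 0.02 corresponds to a uniform one.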

The pre-specified return threshold can be either stock-specific or uniform across all stocks. For the former, I use the sample standard deviation of daily stock returns based on the consecutive daily closing prices over the entire estimation period. The intuition behind this benchmark is that intraday stock price reversals exceeding one standard deviation of daily returns in each direction constitute a sizeable swing within the day. In a robustness check, I also set a uniform cutoff of 2% across all stocks to identify sizeable intraday reversals. The cross-sectional average $\theta$ is 0.0805 based on the stock-specific cutoffs and 0.1191 based on the uniform cutoff of 2% during the quarter ended on the flash crash. When the date of the flash crash is excluded, the cross-sectional average $\theta$s are 0.0801 and 0.1081, respectively. Given the equal weight assigned to all trading days associated with liquidity shocks irrespective of the magnitude of the price reversal beyond the threshold, the inclusion of the flash crash event only slightly boosts the empirical frequency of liquidity shocks.

The extended PIN model is repeatedly estimated for the final sample of common stocks listed on the NYSE/AMEX, excluding those with earnings announced on days immediately adjacent to the flash crash. The maximum likelihood estimations for each stock produce two pairs of PIN components, one for the quarter excluding the flash crash and the other including the flash crash. As before, each PIN component can be imputed for the day of the flash crash based on the set of quarterly PIN components with and without the flash crash. Depending upon the cutoff used to identify liquidity shocks, it is possible that none of the trading days in the estimation period qualifies to be a day with liquidity shocks, resulting in a zero probability of liquidity shock. For instance, 8.84% of stock quarters correspond to a zero probability of liquidity shock when the stock-specific cutoff is used to identify liquidity shocks. In such cases, the extended PIN model degenerates to the original PIN model and no further estimation is needed.

4.5 PIN Decomposition

Under the extended PIN model, the conventional PIN measure can be decomposed into an information asymmetry component and an illiquidity component. Table 3 presents the decomposition for common stocks across ten volume deciles around the flash crash. In the quarter ended one day before the flash crash, the information asymmetry component $PIN_{inasy}$ is unsurprisingly large (at the level of around 0.20) among the lowest three volume deciles, gradually declines in trading volume but not in a strictly monotonic fashion, and reaches the lowest value of 0.066 for the highest volume decile. The estimated $PIN_{inasy}$ for the low volume deciles is two to three times larger than for the highest volume decile. The quantitative pattern here appears comparable to the conventional PIN measure in Table 2. In the same quarter, the illiquidity component $PIN_{illiq}$ for the lowest volume decile is about twice as large as that for each of the remaining nine volume deciles, at 0.026 versus about 0.013. The quarterly decomposition prior to the flash crash suggests that the information asymmetry component strictly dominates the illiquidity component by a factor of 4.7 to 22.6. Even at the lowest volume decile, where the illiquidity component is twice as large as in the rest of the volume deciles, the information asymmetry component is more than seven times as large as the illiquidity component.

While the quarterly PIN decomposition is highlighted by the strict dominance of the information asymmetry component over the illiquidity component, the imputed PIN components on the day of the flash crash are characteristic of the disappearance of this strong dominance and the lack of any distinctive pattern across volume deciles. The daily $PIN_{inasy}$ for stocks in each of the lowest three volume deciles exceeds 0.200, followed by the fourth and the ninth volume decile at 0.192 and 0.191, respectively. One might have expected the lowest volume decile to continue having the highest $PIN_{illiq}$ on the day of the flash crash as it does in the quarter prior to the flash crash. This is actually not the case as the fifth volume decile has the highest $PIN_{illiq}$. As far as the magnitude is concerned, the illiquidity component beats the information asymmetry component in the fifth and sixth volume deciles, and is only slightly behind in the other eight volume deciles.

The quarterly and daily PIN components reported in Table 3 are all reliably positive and statistically significant at any conventional level. The daily incremental PIN relative to the quarterly PIN in terms of the information asymmetry component shows a modest increase among thinly traded stocks but registers a fairly large hike among heavily traded stocks, ranging from 0.041 for the lowest volume decile to 0.106 for the highest volume decile. Also note that the incremental $PIN_{inasy}$ is not statistically different from zero at the 1% level for the lowest three volume deciles and the fifth decile. When expressed in terms of the ratio of average daily $PIN_{inasy}$ to average quarterly $PIN_{inasy}$, the PIN hike varies from 21% to 124%. Overall, the hike in asymmetric information on the day of the flash crash is considerably weaker, both economically and statistically, under the extended model than under the original PIN framework, which fails to distinguish the information asymmetry component from the illiquidity component.

In striking contrast, the daily PIN in terms of the illiquidity component is drastically higher than its quarterly counterpart. The daily incremental $PIN_{illiq}$ is invariably positive and reliably so for all ten volume deciles. The boost in $PIN_{illiq}$ on the day of the flash crash amounts to a more than five-fold increase for the most thinly traded stocks and a nearly 14-fold increase for the fourth volume decile.

The aforementioned results of PIN decomposition are not confined to using the stock-specific cutoffs to identify liquidity shocks. In a robustness check, I repeat the exercise after defining liquidity shocks as intraday price reversals exceeding 2% for all stocks. The results under the uniform 2% cutoff are reported in Table 4, which closely replicates all the qualitative patterns in Table 3 under the stock-specific cutoffs.

There are a number of lessons we can draw from the PIN decomposition around the flash crash. First and foremost, it is critically important to introduce liquidity shocks to extend the original PIN framework. Otherwise, the original PIN measure can be misleadingly high in cases where the credit is due to the illiquidity component. Second, even though the illiquidity component of PIN is negligibly small in the quarter leading up to the flash crash, it accounts for nearly as large a fraction as the information asymmetry component on the day of the flash crash. Since the asset pricing tests of the information risk as measured by a PIN factor have been done at the annual interval using the original PIN model, it may well be worthwhile to revisit the test using the extended PIN model at the monthly interval that is typically used by asset pricing studies. To the extent that the illiquidity component of PIN is declining in the length of the sampling interval, the factor based on the illiquidity component of PIN may be even stronger than previously reported in Duarte and Young (2009). Third, the roughly similar magnitude of $PIN_{illiq}$ across volume deciles points to the commonality of liquidity shocks across all stocks at the time of crisis.⁵ This is evidence further corroborating the staff report by CFTC-SEC (2010) that documents the flash crash as twin liquidity crises on the S&P 500 index futures market and the equity market.

4.6 Forecasting the Opening Bid-Ask Spread

To examine the role of PIN components in predicting future spreads, I run the regression

$$
\ln(ospread_{i,t}) = a_0 + a_1 \ln(PIN_{inasy,i,x}) + a_2 \ln(PIN_{illiq,i,x}) + a_3 \ln(volume_{i,x}) + m_{i,t}.
$$

The dependent variable is the logarithmic opening bid-ask spread as a percentage of midpoint price on the day of the flash crash (with time subscript $t$). The set of predictors are measured in the quarter immediately preceding the flash crash (with time subscript $x$) and include the logarithmic PIN components on information asymmetry and illiquidity as well as the logarithmic share volume. The individual stocks are denoted by the subscript $i$ and the residuals are denoted by $m_{i,t}$. The logarithmic transformation of variables is partly motivated by theory and has appeared in previous studies such as Weston (2001) and Easley, Engle, O'Hara and Wu (2008). The cross-sectional regression results are presented in Table 5.
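For readers who wish to reproduce a regression of this form, the following is a minimal ordinary least squares sketch in plain Python. It is an illustration only and not the code behind Table 5: the function names are hypothetical, the normal equations are solved by Gaussian elimination for self-containment, and standard errors and the adjusted $R^2$ are left out.

```python
import math


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit_log_spread(ospread, pin_inasy, pin_illiq, volume):
    """OLS of ln(ospread) on a constant and the logs of the three predictors.

    Returns [a0, a1, a2, a3]. All inputs must be strictly positive.
    """
    y = [math.log(v) for v in ospread]
    X = [[1.0, math.log(a), math.log(b), math.log(c)]
         for a, b, c in zip(pin_inasy, pin_illiq, volume)]
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve_linear(XtX, Xty)
```

Because the PIN components enter in logarithms, stock quarters with a zero illiquidity component would need to be handled before calling such a routine; how the paper treats them is not stated in this section.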

When liquidity shocks are identified using firm-specific cutoffs, both the information asymmetry and the illiquidity components are positive and statistically significant as a standalone predictor. The illiquidity component has a much weaker forecasting power for the opening spread than does the asymmetric information component, with adjusted $R^2$ of 0.012 and 0.195, respectively. Both PIN components in the preceding quarter are positive and highly statistically significant when they join the share volume in forecasting the opening spread on the day of the flash crash. Not surprisingly, the average daily share volume in the preceding quarter is negatively associated with the opening spread and statistically significant.

All the qualitative patterns of the cross-sectional regression results above are preserved

when liquidity shocks are identified through intraday price reversals that uniformly exceed 2%

for all stocks. As far as the goodness of fit is concerned under the extended PIN estimations

with a uniform cutoff, the forecasting power of the information asymmetry component is

5 Chordia, Roll and Subrahmanyam (2000), Hasbrouck and Seppi (2001) and Korajczyk and Sadka (2008) study the cross-sectional commonality of liquidity. It can be interesting to carry out the principal component analysis on the illiquidity component of the PIN over an extended period of time even though the data limitation around the flash crash event prevents such an exercise here.

weakened somewhat while that of the illiquidity component strengthens, with adjusted $R^2$ of

0.095 and 0.181, respectively.

It is clear that the two PIN components and the share volume are able to jointly explain a

large fraction of the cross-sectional variations in the opening spread. The adjusted $R^2$ is 0.292

with stock-specific cutoffs and 0.394 with a uniform 2% cutoff. The power of the information asymmetry

component in forecasting the opening spread and the positive and highly significant

association between these two variables are consistent with the findings in the literature (e.g.,

Easley, Kiefer, O'Hara and Paperman, 1996; Weston, 2001; Lei and Wu, 2005; Easley, Engle,

O'Hara and Wu, 2008). This is the first paper to my knowledge that formally introduces

the firm-specific or market-wide liquidity shocks into the PIN framework so as to break the

exclusivity of the informed trades in creating order imbalances. So the new finding of the

illiquidity component of PIN as an important predictor for future bid-ask spreads validates the

PIN extension in this paper. It contributes to the literature by helping us better understand

the role of liquidity shocks and allowing practitioners to better anticipate the trading costs

and design trading strategies accordingly.

4.7 Explaining the Illiquidity Component of PIN

In a further analysis of the illiquidity component of PIN, I run the contemporaneous cross-sectional

regression for stocks in the final sample on the day of the flash crash

$$PIN_{\mathrm{illiq},i,t} = b_0 + b_1 \ln(\mathrm{twspread}_{i,t}) + b_2 \ln(\mathrm{volume}_{i,t}) + n_{i,t}.$$

The individual stocks are denoted by subscript i and the residuals are denoted by $n_{i,t}$. Since

the opening spread provides only a snapshot, it may not adequately represent the full-day

dynamics of the spread on the day of the flash crash. So I construct the time-weighted

average spread (denoted by twspread) as a percentage of the midpoint price.6 The illiquidity

component of PIN is expected to be positively correlated with the time-weighted average

spread. The contemporaneous share volume is also included in the regression.

Table 6 reports the regression results with liquidity shocks defined using either stock-specific

cutoffs or the uniform 2% cutoff. Regardless of the cutoff scheme, the time-weighted

average spread is positive and highly statistically significant in explaining the cross-sectional

variations of the illiquidity component of PIN. This relationship is quite remarkable in that

the illiquidity component of PIN is based on the rather coarse price reversal statistics and

primarily driven by the daily order flows that are abstracted from any price information. So

6 For the construction of the time-weighted average spread on the day of the flash crash, I retain the same set of quotes that are used to determine the trade initiation as the Lee and Ready (1991) procedure requires, and purge other quotes that do not correspond to any actual trades. The time span in seconds between the retained quotes is then used as the weight for each bid-ask spread in percentage of the midpoint price to compute the time-weighted average spread for each day.

the positive relationship is quite revealing in light of the fact that the time-weighted spread

is purely price information.
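The time-weighted average spread described in footnote 6 can be sketched as follows. The footnote does not say how the last retained quote of the day is weighted, so this sketch assumes it stays in force until a caller-supplied `session_end` time; that parameter, the function name, and the sample quotes are all hypothetical.

```python
def time_weighted_spread(quotes, session_end):
    """Time-weighted average percentage spread for one stock-day.

    quotes: list of (time_in_seconds, bid, ask), sorted by time and already
            filtered to the quotes matched to trades.
    session_end: time in seconds at which the last quote stops counting
                 (an assumption; the source does not specify this).
    """
    weighted_sum = 0.0
    total_time = 0.0
    for idx, (t, bid, ask) in enumerate(quotes):
        t_next = quotes[idx + 1][0] if idx + 1 < len(quotes) else session_end
        duration = t_next - t          # seconds this quote is in force
        if duration <= 0:
            continue                   # skip simultaneous or stale quotes
        midpoint = (bid + ask) / 2.0
        pct_spread = 100.0 * (ask - bid) / midpoint
        weighted_sum += duration * pct_spread
        total_time += duration
    if total_time == 0.0:
        raise ValueError("no quote has a positive duration")
    return weighted_sum / total_time


# Made-up quotes: a tight market, a 30-second blowout, then a recovery.
sample = [(0, 9.99, 10.01), (60, 9.90, 10.10), (90, 9.99, 10.01)]
twspread = time_weighted_spread(sample, session_end=120)
print(twspread)
```

Weighting by duration keeps a brief but extreme widening of the spread from dominating the daily figure, which is what distinguishes this measure from the opening-spread snapshot.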

As a single explanatory variable, the share volume is inversely related to the illiquidity

component. This relationship is weak from the statistical viewpoint, however, reinforcing

the conclusion from the visual inspection of Tables 3 and 4 that volume is not a very good

sorting device for the illiquidity component of PIN on a standalone basis. After controlling for

the time-weighted average spread, however, the logarithmic share volume registers a positive

coefficient that is highly statistically significant. The positive relationship with share volume

is quite distinctive here because it alludes to the fact that some of the S&P 500 index component

stocks are hit the hardest with extreme price reversals during the flash crash, and the most

heavily traded stocks experience the largest hike in the illiquidity component of PIN.

When liquidity shocks are identified with stock-specific cutoffs, the time-weighted spread

and the share volume explain a fairly small fraction of the cross-sectional variations in the

illiquidity component with an adjusted $R^2$ of 0.024. When a uniform 2% cutoff is used instead,

the time-weighted spread and the share volume have a much better fit with the data. The

adjusted $R^2$ is 0.161. While further research is needed to glean additional insights from the

illiquidity component of PIN, the findings thus far in this paper illustrate the importance of

introducing liquidity shocks into the PIN framework.

5 Conclusion

The flash crash event on May 6, 2010, provides both the motivation and the testing field

for this paper. During this event, the sharp drop of stock prices and the swift reversal over

a thirty-minute interval are very interesting in that they essentially amount to a serious

challenge to the original PIN framework in Easley, Kiefer, O'Hara and Paperman (1996). On

the day of the flash crash, there is a widespread large increase in PIN for various sub-samples

of stocks, and the PIN nearly tripled among the most heavily traded stocks. Such a pervasive

PIN hike cannot be solely attributed to the increase of asymmetric information, as the

original PIN model would imply, and the gradual incorporation of private information into stock prices

seems at odds with the sizeable and quick reversal of stock prices across the board.

By explicitly allowing for liquidity shocks, this paper extends the original PIN framework

to introduce a third trading motive in addition to the private information and the exogenous

liquidity needs. The pseudo market makers can submit contrarian orders during periods of

liquidity shocks and thus help to restore stock prices back to the fundamental level, resulting

in an observed price reversal. The coexistence of fundamental news and liquidity shocks in

the extended PIN model implies that the informed investors are no longer the sole source

of order imbalances and the pseudo market makers can also submit one-sided orders during

liquidity shocks. Consequently, the conventional PIN measure consists of both an information

asymmetry component and an illiquidity component.

The extended PIN model is then put to the test around the flash crash. The illiquidity

component of PIN accounts for a negligible fraction during the quarter leading up to the flash

crash but experiences a five- to fourteen-fold hike on the day of the flash crash, reaching a

level nearly on par with the information asymmetry component. Even though the information

asymmetry component also witnesses an increase on the day of the flash crash, it is not nearly

as drastic as that of the illiquidity component. Compared to the original PIN framework, the hike

in asymmetric information on the day of the flash crash is weakened substantially under the

extended model from both the statistical and the economic perspectives. Moreover, there is

evidence that both the information asymmetry component and the illiquidity component of

PIN can forecast the opening bid-ask spread. On the day of the flash crash, the illiquidity

component of PIN is positively and contemporaneously correlated with the time-weighted average

spread, further supporting the notion that the firm-specific or market-wide liquidity shock

affects the inference on the information-based trading.

These new findings contribute to the literature and deepen our understanding of the

role of information asymmetry. They certainly point to the importance of accounting for

liquidity shocks in the PIN framework and invite us to revisit a number of interesting issues.

For instance, would the PIN decomposition under the extended model imply a stronger PIN

factor or a weaker one in the asset pricing context? Can we actually resolve the documented

PIN anomaly in the context of mergers and acquisitions announcements? I study these

and other interesting questions in a series of companion studies.

In addition to the development and testing of an extension to the PIN framework, this

paper also provides a number of methodological improvements to the PIN estimation. In the

Appendix, I outline one simple procedure to dynamically factorize the daily log-likelihood

function for the maximum likelihood estimation and effectively eliminate the numerical overflow

and underflow problems that have long plagued academic researchers and practitioners

alike in the PIN context. Moreover, this paper also furnishes the guidelines for imputing the

daily PIN series through repeated estimations of quarterly PINs. Researchers are expected

to benefit from these methodological improvements in a wide variety of settings, especially

among the corporate event studies that would most appreciate the availability of a daily PIN

series without the cost of imposing a complex data structure.

6 Appendix. Dynamic Factorization of Log-likelihood

For ease of exposition, I illustrate the factorization process under the original PIN framework.

The daily log-likelihood function can be written as

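$$L(\theta) = -2\varepsilon + (B + S)\ln(\varepsilon) + \ln\Big[\sum_i w_i \exp(x_i)\Big],$$

where the weights and the exponential inputs are given in the table below.

Weight $w_i$            Exponential input $x_i$
$1-\alpha$              $0$
$\alpha\delta$          $-\mu - S\ln(k)$
$\alpha(1-\delta)$      $-\mu - B\ln(k)$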
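As a sketch of where the three rows of the table come from, assume the standard daily likelihood of the original PIN model: buys $B$ and sells $S$ arrive as independent Poisson counts with rate $\varepsilon$, raised to $\mu + \varepsilon$ on the sell side after bad news (probability $\alpha\delta$) and on the buy side after good news (probability $\alpha(1-\delta)$). That likelihood is not restated in this appendix, so its form here is an assumption carried over from the original framework; $k = \varepsilon/(\mu+\varepsilon)$ is the ratio used later in this appendix.

```latex
\begin{align*}
\mathcal{L}(\theta \mid B, S)
  &= (1-\alpha)\, e^{-\varepsilon}\frac{\varepsilon^{B}}{B!}\,
                  e^{-\varepsilon}\frac{\varepsilon^{S}}{S!}
   + \alpha\delta\, e^{-\varepsilon}\frac{\varepsilon^{B}}{B!}\,
                  e^{-(\mu+\varepsilon)}\frac{(\mu+\varepsilon)^{S}}{S!} \\
  &\quad + \alpha(1-\delta)\, e^{-(\mu+\varepsilon)}\frac{(\mu+\varepsilon)^{B}}{B!}\,
                  e^{-\varepsilon}\frac{\varepsilon^{S}}{S!} \\
  &= \frac{e^{-2\varepsilon}\,\varepsilon^{B+S}}{B!\,S!}
     \Big[(1-\alpha)
        + \alpha\delta\, e^{-\mu - S\ln(k)}
        + \alpha(1-\delta)\, e^{-\mu - B\ln(k)}\Big].
\end{align*}
```

Taking logarithms and dropping the parameter-free constant $-\ln(B!\,S!)$ leaves $-2\varepsilon + (B+S)\ln(\varepsilon)$ plus the logarithm of a weighted sum of three exponentials, with the weights and inputs matching the table row by row.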

The computational complexity lies in the weighted sum of exponential functions, each of which

has the potential of triggering an overflow or underflow. One can dynamically factorize the

log-likelihood function on a daily basis using a three-step procedure.

First, find the maximum input $x_{\max}$ and pull $x_{\max}$ into the common factor. Alternatively

speaking, one can compute the modified exponential input

$$y_i = x_i - x_{\max}.$$

Second, examine each modified exponential input $y_i$ and see if it falls below the critical

value $-C$ that is determined by the researcher's hardware and software for estimating the

PINs. Note that the maximum input $y_{\max}$ is zero and thus $y_i \le 0$ always holds. If $y_i \le -C$,

then it is necessary to force a zero weight so as to avoid the underflow from evaluating $\exp(y_i)$.

If $-C < y_i \le 0$, then it is fine to directly evaluate $\exp(y_i)$. Alternatively speaking, one can

compute the modified weight

$$v_i = w_i \cdot \mathbf{1}(-C < y_i \le 0),$$

where the indicator function takes the value of 1 if $-C < y_i \le 0$ and 0 otherwise.

Third, the daily log-likelihood function can be rewritten as

$$L(\theta) = -2\varepsilon + (B + S)\ln(\varepsilon) + x_{\max} + \ln\Big[\sum_{v_j \neq 0} v_j \exp(y_j)\Big].$$

Note that there is no need to check for the logarithmic inputs. The arrival rates are often

coded as an exponential function to ensure their positiveness so $\ln(\varepsilon)$ will not cause numerical

problems. Moreover, the fact of $y_i \le 0$ implies that the input for the second logarithmic

function is properly bounded between 0 and 1.
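The three-step factorization can be sketched in a few lines. This is an illustration under stated assumptions rather than the author's code: the critical value defaults to `C = 700.0` (a choice suited to double-precision `exp`, which overflows a little above 709), the function name is hypothetical, and one safeguard is added that the text does not mention, namely that rows with zero weight are ignored when choosing the maximum input so the final logarithm never receives zero.

```python
import math


def daily_loglik(alpha, delta, mu, eps, B, S, C=700.0):
    """Daily log-likelihood of the original PIN model, up to -ln(B! S!).

    alpha: news probability; delta: probability that news is bad;
    mu, eps: informed and uninformed arrival rates; B, S: buys and sells.
    C is an assumed critical value for double-precision arithmetic.
    """
    ln_k = math.log(eps) - math.log(mu + eps)   # ln(k), k = eps / (mu + eps)
    rows = [
        (1.0 - alpha, 0.0),                      # no news
        (alpha * delta, -mu - S * ln_k),         # bad news
        (alpha * (1.0 - delta), -mu - B * ln_k)  # good news
    ]
    # Safeguard not in the source: drop zero-weight rows before step 1.
    active = [(w, x) for w, x in rows if w > 0.0]
    # Step 1: pull the largest exponential input into the common factor.
    x_max = max(x for _, x in active)
    total = 0.0
    for w, x in active:
        y = x - x_max                            # y <= 0 by construction
        # Step 2: keep the term only if -C < y <= 0, else force weight zero.
        if y > -C:
            total += w * math.exp(y)
    # Step 3: reassemble the factorized log-likelihood.
    return -2.0 * eps + (B + S) * math.log(eps) + x_max + math.log(total)


# With large counts the unfactorized exp() input is about 753.6, which
# overflows in double precision, while the factorized form stays finite.
print(daily_loglik(0.4, 0.5, 5000.0, 15000.0, 20000, 20000))
```

The row attaining the maximum always contributes `exp(0) = 1` times a positive weight, so the argument of the last logarithm is strictly positive and at most 1.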

Having discussed the process of factoring the log-likelihood function, I should also note the

importance of checking for the presence of overflow and underflow problems when handling

the various transformations of raw parameter inputs that help to ensure that $0 \le \alpha \le 1$,

$0 \le \delta \le 1$, $\mu > 0$, and $\varepsilon > 0$. Researchers often use the exponential transformation to ensure

a positive parameter and the logistic transformation for a parameter that is a probability. One

can use techniques similar to the ones documented above to handle these transformations.

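For instance, denote by $\tilde{\alpha}$, $\tilde{\delta}$, $\tilde{\mu}$ and $\tilde{\varepsilon}$ the parameters before transformation and $c$ a constant

to scale the arrival rates so as to make the Hessian matrix of the vector of parameters well

behaved. The transformation for the news probability $\alpha$ and the probability $\delta$ of arrived news

being negative can be written as

$$
\alpha =
\begin{cases}
0 & \text{if } \tilde{\alpha} \le -C; \\
\dfrac{1}{1+\exp(-\tilde{\alpha})} & \text{if } |\tilde{\alpha}| < C; \\
1 & \text{if } \tilde{\alpha} \ge C.
\end{cases}
\qquad
\delta =
\begin{cases}
0 & \text{if } \tilde{\delta} \le -C; \\
\dfrac{1}{1+\exp(-\tilde{\delta})} & \text{if } |\tilde{\delta}| < C; \\
1 & \text{if } \tilde{\delta} \ge C.
\end{cases}
$$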

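The informed arrival rate $\mu$ and the uninformed arrival rate $\varepsilon$ can be written as

$$
\mu =
\begin{cases}
0 & \text{if } \tilde{\mu} \le -C; \\
\exp(\tilde{\mu})\exp(c) & \text{if } |\tilde{\mu}| < C; \\
\exp(C)\exp(c) & \text{if } \tilde{\mu} \ge C.
\end{cases}
\qquad
\varepsilon =
\begin{cases}
0 & \text{if } \tilde{\varepsilon} \le -C; \\
\exp(\tilde{\varepsilon})\exp(c) & \text{if } |\tilde{\varepsilon}| < C; \\
\exp(C)\exp(c) & \text{if } \tilde{\varepsilon} \ge C.
\end{cases}
$$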
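The guarded transformations above might be coded along the following lines. The function names, the module-level value `C = 700.0`, and the default scaling constant `c = 0.0` are assumptions chosen for double-precision arithmetic; they are not taken from the paper.

```python
import math

C = 700.0  # assumed critical value; exp(700) is still finite in doubles


def to_probability(raw):
    """Map an unconstrained raw parameter to [0, 1] (used for alpha, delta)."""
    if raw <= -C:
        return 0.0
    if raw >= C:
        return 1.0
    return 1.0 / (1.0 + math.exp(-raw))   # logistic transform, |raw| < C


def to_rate(raw, c=0.0):
    """Map an unconstrained raw parameter to a non-negative arrival rate.

    c is the scaling constant for the arrival rates (hypothetical default).
    """
    if raw <= -C:
        return 0.0
    # Cap the exponent at C so exp() itself cannot overflow.
    return math.exp(min(raw, C)) * math.exp(c)


print(to_probability(0.0), to_probability(-1e6), to_probability(1e6))
print(to_rate(1.0, c=2.0), to_rate(-1e6))
```

An optimizer can then search over the raw values on the whole real line while the likelihood only ever sees parameters inside their admissible ranges.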

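Finally, one needs to anticipate potential overflow and underflow problems for the computation of

$$k = \frac{\varepsilon}{\mu + \varepsilon} = \frac{1}{1 + \exp(\tilde{\mu} - \tilde{\varepsilon})} \quad \text{and} \quad \ln(k) = -\ln\left[1 + \exp(\tilde{\mu} - \tilde{\varepsilon})\right].$$

There are four cases to consider.

(1) If $\tilde{\mu} - \tilde{\varepsilon} \le -C$ then $k = 1$ and $\ln(k) = 0$.

(2) If $\tilde{\mu} - \tilde{\varepsilon} \ge C$ then $k = 0$ and $\ln(k) = \tilde{\varepsilon} - \tilde{\mu}$.

(3) If $|\tilde{\mu} - \tilde{\varepsilon}| < C$ and $1 + \exp(\tilde{\mu} - \tilde{\varepsilon}) \ge 10^{C/\ln(10)}$ then $k = 0$ and $\ln(k) = \tilde{\varepsilon} - \tilde{\mu}$.

(4) If $|\tilde{\mu} - \tilde{\varepsilon}| < C$ and $1 + \exp(\tilde{\mu} - \tilde{\varepsilon}) < 10^{C/\ln(10)}$ then $k = 1/[1 + \exp(\tilde{\mu} - \tilde{\varepsilon})]$ and $\ln(k) = -\ln[1 + \exp(\tilde{\mu} - \tilde{\varepsilon})]$.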
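The four cases for $k$ and $\ln(k)$ translate into a short guarded routine. As before, `C = 700.0` and the function name are assumptions for double-precision arithmetic. The threshold $10^{C/\ln(10)}$ equals $e^{C}$ mathematically, so the sketch evaluates it as `math.exp(C)`.

```python
import math

C = 700.0  # assumed critical value for double-precision arithmetic


def k_and_ln_k(raw_mu, raw_eps):
    """Return (k, ln k) for k = eps / (mu + eps), given raw parameters.

    raw_mu and raw_eps are the arrival-rate parameters before the
    exponential transformation; any common scaling constant cancels in k.
    """
    d = raw_mu - raw_eps
    if d <= -C:                      # case 1: informed flow is negligible
        return 1.0, 0.0
    if d >= C:                       # case 2: exp(d) would overflow
        return 0.0, -d
    denom = 1.0 + math.exp(d)
    if denom >= math.exp(C):         # case 3: 10**(C / ln 10) == e**C
        return 0.0, -d
    return 1.0 / denom, -math.log(denom)   # case 4: safe to evaluate


print(k_and_ln_k(0.0, 0.0))
print(k_and_ln_k(1000.0, 0.0))
print(k_and_ln_k(-1000.0, 0.0))
```

In cases 2 and 3 the value of `ln k` is still returned exactly as the difference of the raw parameters even though `k` itself is reported as zero, which is what the exponential inputs of the log-likelihood require.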

The log-likelihood function under the extended PIN framework can be handled in a similar

way. For instance, the table of weights and exponential inputs can be simply augmented with

two more rows along with the introduction of the parameter. I omit the details for brevity.

References

[1] Aktas, Nihat, Eric de Bodt, Fany Declerck, and Herve Van Oppens, 2007, The PIN

anomaly around M&A announcements, Journal of Financial Markets 10, 169-191.

[2] Benos, Evangelos, and Marek Jochec, 2007, Testing the PIN variable, University of

Illinois at Urbana-Champaign Working Paper.

[3] Bessembinder, Hendrik, Kalok Chan, and Paul J. Seguin, 1996, An empirical examination of information, differences of opinion, and trading activity, Journal of Financial Economics 40, 105-134.

[4] Bharath, Sreedhar T., Paolo Pasquariello, and Guojun Wu, 2009, Does asymmetric

information drive capital structure decisions?, Review of Financial Studies 22, 3211-

3243.

[5] Boehmer, Ekkehart, Joachim Grammig, and Erik Theissen, 2007, Estimating the probability of informed trading: does trade misclassification matter?, Journal of Financial Markets 10, 26-47.

[6] Chan, Kalok, 1992, A further analysis of the lead-lag relationship between the cash market and stock index futures market, Review of Financial Studies 5, 123-152.

[7] Chordia, Tarun, Richard Roll, and Avanidhar Subrahmanyam, 2000, Commonality in

liquidity, Journal of Financial Economics 56, 3-28.

[8] Chordia, Tarun, Richard Roll, and Avanidhar Subrahmanyam, 2001, Market liquidity

and trading activity, Journal of Finance 56, 501-530.

[9] Chordia, Tarun, Richard Roll, and Avanidhar Subrahmanyam, 2005, Evidence on the speed of convergence to market efficiency, Journal of Financial Economics 76, 271-292.

[10] CFTC-SEC, 2010, Findings regarding the market events of May 6, 2010, Report of the Staffs of the CFTC and SEC to the Joint Advisory Committee on Emerging Regulatory Issues.

[11] Duarte, Jefferson, Xi Han, Jarrad Harford, and Lance Young, 2008, Information asymmetry, information dissemination and the effect of regulation FD on the cost of capital, Journal of Financial Economics 87, 24-44.

[12] Duarte, Jefferson, and Lance Young, 2009, Why is PIN priced?, Journal of Financial Economics 91, 119-138.

[13] Easley, David, Robert F. Engle, Maureen O'Hara, and Liuren Wu, 2008, Time-varying arrival rates of informed and uninformed trades, Journal of Financial Econometrics 6, 171-207.

[14] Easley, David, Soeren Hvidkjaer, and Maureen O'Hara, 2002, Is information risk a determinant of asset returns?, Journal of Finance 57, 2185-2221.

[15] Easley, David, Nicholas M. Kiefer, and Maureen O'Hara, 1996, Cream-skimming or profit-sharing? The curious role of purchased order flow, Journal of Finance 51, 811-833.

[16] Easley, David, Nicholas M. Kiefer, and Maureen O'Hara, 1997, One day in the life of a very common stock, Review of Financial Studies 10, 805-835.

[17] Easley, David, Nicholas M. Kiefer, Maureen O'Hara, and Joseph B. Paperman, 1996, Liquidity, information, and infrequently traded stocks, Journal of Finance 51, 1405-1436.

[18] Easley, David, Marcos M. Lopez de Prado, and Maureen O'Hara, 2010, The microstructure of the Flash crash: Flow toxicity, liquidity crashes and the probability of informed trading, Cornell University Working Paper.

[19] Easley, David, and Maureen O'Hara, 1992, Time and the process of security price adjustment, Journal of Finance 47, 577-605.

[20] Easley, David, and Maureen O'Hara, 2004, Information and the cost of capital, Journal of Finance 59, 1553-1583.

[21] Hasbrouck, Joel, and Duane J. Seppi, 2001, Common factors in prices, order flows, and liquidity, Journal of Financial Economics 59, 383-411.

[22] Jayaraman, Sudarshan, 2008, Earnings volatility, cash flow volatility, and informed trading, Journal of Accounting Research 46, 809-851.

[23] Kaul, Gautam, Qin Lei, and Noah Stoffman, 2005, AIMing at PIN: Order flow, information, and liquidity, University of Michigan Working Paper.

[24] Korajczyk, Robert A., and Ronnie Sadka, 2008, Pricing the commonality across alter-

native measures of liquidity, Journal of Financial Economics 87, 45-72.

[25] Lee, Charles M. C., and Mark J. Ready, 1991, Inferring trade direction from intraday

data, Journal of Finance 46, 733-746.

[26] Lei, Qin, and Guojun Wu, 2005, Time-varying informed and uninformed trading activi-

ties, Journal of Financial Markets 8, 153-181.

[27] Vega, Clara, 2006, Stock price reaction to public and private information, Journal of

F
