
ORIGINAL RESEARCH

Removing biases in computed returns

Lawrence Fisher • Daniel G. Weaver • Gwendolyn Webb

Published online: 21 January 2010 © Springer Science+Business Media, LLC 2010

Abstract This paper presents a straightforward method for asymptotically removing the

well-known upward bias in observed returns of equally-weighted portfolios. Our method

removes all of the bias due to any random transient errors such as bid-ask bounce and

allows for the estimation of short horizon returns. We apply our method to the CRSP

equally-weighted monthly return indexes for the NYSE, Amex, and NASDAQ and show

that the bias is cumulative. In particular, a NASDAQ index (with a base of 100 in 1973)

grows to the level of 17,975 by 2006, but nearly half of the increase is due to cumulative

bias. We also conduct a simulation in which we simulate true prices and set spreads

according to a discrete pricing grid. True prices are then not necessarily at the midpoint of

the spread. In the simulation we compare our method to calculating returns based on

observed closing quote midpoints and find that the returns from our method are statistically

indistinguishable from the (simulated) true returns. While the mid-quote method results in

an improvement over using closing transaction prices, it still results in a statistically

significant amount of upward bias. We demonstrate that applying our methodology results

in a reversal of the relative performance of NASDAQ stocks versus NYSE stocks over a

25 year window.

Keywords Unbiased market index · Bias in computed returns · Jensen’s inequality · Asset pricing · Index construction

L. Fisher, Department of Finance and Economics, Rutgers Business School, Rutgers University, 111 Washington Street, Newark, NJ 07102, USA

D. G. Weaver (&), Department of Finance and Economics, Rutgers Business School, Rutgers University, 94 Rockafeller Road, Piscataway, NJ 08854-8054, USA, e-mail: [email protected]

G. Webb, Bert W. Wasserman Department of Economics and Finance, Baruch College, Zicklin School of Business, One Bernard Baruch Way, Box B10-225, New York, NY 10010, USA, e-mail: [email protected]


Rev Quant Finan Acc (2010) 35:137–161, DOI 10.1007/s11156-009-0161-8

JEL Classification G10 · G12 · C43

1 Introduction

It is well known that observed returns of single-security prices are upward biased and that

the bias is caused by errors in quoted prices.1 Frequently re-balanced, equally-weighted

indexes (or portfolios) are especially prone to the bias, which is known to be cumulative

(Conrad and Kaul 1993). However, equal weighting is the method used in virtually all

event studies2 and is the preferred method of forming an index when there are

disproportionate market capitalization weights among representative stocks.3 Also, since the bias is

cumulative, it can potentially affect the relative rankings of stock performance, especially

over longer periods of time.

Blume and Stambaugh (1983) show that a buy-and-hold portfolio contains a diversifi-

cation effect that removes the bias after the first period, as the number of stocks increases.4

Previous implementations of their methodology have focused on long investment hori-

zons.5 Alternative methodologies for estimating unbiased returns that are applicable to

short investment horizons are shown to have undesirable properties.6 In this paper, we

present an alternative derivation of Blume and Stambaugh’s buy-and-hold methodology.

This allows for the estimation of unbiased short horizon equally-weighted returns which

can then be used in event studies, as well as for the construction of frequently rebalanced

indexes.

The intuition behind our methodology is straightforward and is similar to that for deriving

implied future single-period spot rates from two-period rates (assuming that the liquidity

premium is zero). The implied future spot rate for t - 1 to t is solved by dividing the square

of one plus the spot rate from t - 2 to t by one plus the spot rate from t - 2 to t - 1. In a

similar manner, in this paper we show that dividing a two-period average portfolio price

relative (one plus the observed return) ending at time t, by a one-period average portfolio

price relative that ends at time t - 1 results in an unbiased estimate of the true one-period

price relative (hence return) ending at time t, as long as errors in t, t - 1, and t - 2 are

independent.
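The forward-rate analogy can be made concrete with a minimal numerical sketch. The rates below are hypothetical, and the function name `implied_forward_rate` is introduced here only for illustration; it is not part of the paper.

```python
def implied_forward_rate(two_period_spot: float, one_period_spot: float) -> float:
    """Implied one-period rate from t-1 to t.

    two_period_spot: per-period spot rate covering t-2 to t.
    one_period_spot: spot rate covering t-2 to t-1.
    Assumes a zero liquidity premium, as in the text.
    """
    return (1.0 + two_period_spot) ** 2 / (1.0 + one_period_spot) - 1.0


# Hypothetical rates, chosen only to illustrate the arithmetic.
forward = implied_forward_rate(two_period_spot=0.05, one_period_spot=0.04)
print(f"implied forward rate: {forward:.6f}")
```

The same division, applied to average portfolio price relatives instead of interest rates, is what the method in this paper uses.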

1 See Macaulay (1938) and Fisher (1966), among others.

2 A notable exception is Fama et al. (1969), who use continuously compounded returns.

3 For example, the explanation of the weighting method for the Dow Jones Turkey Equal Weighted 15 Index from the company’s web site (http://www.djindexes.com/mdsidx/?event=showTurkey15) is that ‘‘The index includes the largest stocks traded on the Istanbul Stock Exchange, and is equal weighted to limit the influence of the biggest companies on overall index performance.’’ Also in 2005, NASDAQ began constructing an equally-weighted version (rebalanced quarterly) of several of their indexes including the NASDAQ 100. See also Hamza et al. (2006) for a discussion of the efficacy of different index weighting methods for emerging markets.

4 In Blume and Stambaugh (1983) a buy-and-hold portfolio sets equal weights in a portfolio at the beginning of the period and no rebalancing is done before the end of the multi-period investment horizon. In contrast, a rebalanced portfolio is rebalanced each period. See Roll (1983) for a comparison of rebalanced and buy-and-hold portfolio returns.

5 For example, Blume and Stambaugh (1983) examine one-year investment horizons and Conrad and Kaul (1993) examine 3-year investment horizons.

6 Blume and Stambaugh (1983) and Bessembinder and Kalcheva (2007) note that while short horizon continuous-compounded rates of return contain no bias, they also possess certain properties that limit their use in many tests.


The average bias inherent in observed returns is due to pricing errors at the beginning of

the holding period. We show, as others have, that by invoking the law of large numbers the

expected bias in observed prices at time t is zero, leaving only the bias in observed prices at

the beginning of the period. Our contribution to the literature is to provide a method for

removing the remaining bias due to any random transient errors (not limited to bid-ask

bounce as in previous studies). The only assumptions that we need to make are that (1)

transient errors in successive observed prices are independent, and (2) all observed prices

are finite and greater than zero. In our methodology, both the numerator and denominator

start at the same time. Because they have the same amount of bias, it cancels out, leaving

asymptotically unbiased estimates of true returns.

In addition to discrete pricing errors, a growing literature (e.g., Hou and Moskowitz

(2005)) finds significant lagged responses to new information. We examine the efficacy of

our methodology in the context of lagged adjustment and find that as long as prices at time

t have fully adjusted to the information available at time t - 1, then we still produce an

unbiased estimate of true returns. Given that Hou and Moskowitz find that some stocks

take up to 4 weeks to adjust to new information, our method is probably adequate for

estimating true monthly returns of U.S. stocks, but not shorter periods.

One method that has been proposed for avoiding the bias in observed returns is to use

the mid-quote. However, that method assumes that the true price is the mid-point of the

spread, which in turn assumes both continuous pricing, and no other departures from

equilibrium. This then naturally leads to the simplifying assumption of a binomial distri-

bution of errors, as assumed by Blume and Stambaugh (1983), among others. However,

true prices may not be at the midpoint of the spread because of the discrete pricing grid.

We first show that our methodology does not depend on the distribution of error terms.

Then we employ a simulation to compare the accuracy of our methodology to using the

mid-quote to calculate returns. In our simulation, equilibrium spreads are applied to

generated true prices and the observed bid and ask prices are then determined using a

discrete pricing grid. We find that taking the mid-quote removes between one-third and two-thirds

of the bias in observed returns, but that our methodology removes all but random

errors. This suggests that taking the mid-quote is an insufficient remedy for removing bias

in observed returns.

We then compare our methodology for constructing an index to the naïve observed

return approach. We replicate the return on the CRSP Equally-Weighted Index for the New

York Stock Exchange, American Stock Exchange, and Nasdaq.7 We compute monthly

returns and resulting index levels, setting December 31, 1973 equal to 100. Since observed

returns are upward biased, the indexes are cumulative with respect to the bias. Comparing

our unbiased equally-weighted index to the CRSP index for each of the three markets

reveals the extent of this cumulative effect. For example, we find that over the 33 years

from 1973 to 2006, the ending level of the CRSP Nasdaq index is over 90% higher than our

unbiased Nasdaq index (17,976 vs. 9,418). The indexes for the other two markets show

similar, but smaller, cumulative bias. When we compare markets, we find that the overall

performance of Nasdaq stocks as measured by the CRSP index is about 54 percent higher

than that of NYSE stocks in the 1973–2006 period. However, after we remove the bias, we

find that Nasdaq stocks perform on average no better than stocks on the NYSE. This

shows the importance of our methodology for calculating equally-weighted stock

indexes and for asset class horse races.
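To give a sense of scale, the two ending Nasdaq index levels quoted above can be converted into an implied average monthly bias. This is a back-of-the-envelope sketch, not a figure reported in the paper: it assumes 396 monthly links (December 1973 to December 2006) and treats the bias as a constant multiplicative factor per month.

```python
import math

crsp_level = 17_976.0      # CRSP equally-weighted Nasdaq index level, end of 2006
unbiased_level = 9_418.0   # bias-corrected Nasdaq index level, end of 2006
months = 33 * 12           # assumption: 396 monthly links, Dec 1973 to Dec 2006

# Both indexes start at 100, so the ratio of ending levels is the cumulated bias.
ratio = crsp_level / unbiased_level

# Constant per-month multiplicative bias that compounds to that ratio.
monthly_bias = math.exp(math.log(ratio) / months) - 1.0

print(f"ratio of ending levels: {ratio:.4f}")
print(f"implied average monthly bias: {monthly_bias * 1e4:.1f} b.p.")
```

Under these assumptions the implied bias is on the order of 16 b.p. per month, which shows how a small per-period bias compounds into a large difference in index levels.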

7 The CRSP Equally-Weighted Index methodology is first developed by Cohen and Fitch (1966).


We examine the average size of the bias in observed monthly returns for 5-year time

periods, as well as other periods, for each of the three markets. For NYSE stocks, the bias

ranges from an average of 2.27 basis points (hereafter b.p.) for the last half of the 1950s, to

almost 50 b.p. during the first half of the 1930s. The NYSE has the lowest average level of

bias in observed monthly returns, followed by the Amex, and then NASDAQ. During the

period of our study, previous authors have found NASDAQ spreads to be wider than those

on the NYSE. This underscores the applicability of our methodology for constructing

indexes in markets with larger transient errors (e.g. wider spreads), such as emerging

markets. Finally, we show that although the bias in January returns is larger than other

months for all three markets, virtually all calendar months have substantial upward bias in

observed returns.

Our methodology is of interest to academics who routinely construct equally-weighted

portfolios, as well as to practitioners who construct indexes for the purpose of evaluating

market performance.

The rest of this paper is organized as follows. The next section reviews relevant liter-

ature. Section 3 develops a model of price adjustment that allows for random transient

errors and lagged adjustment to information. In Sects. 4 and 5, we present results of our

simulation comparison tests, and then our measurement of the level of bias in CRSP return

indexes. Section 6 concludes and discusses several areas for future research.

2 Literature review

It is well known that the existence of significant transaction costs leads to imperfect and

non-instantaneous adjustment of reported prices to new information. This leads to security

price behavior that is inconsistent with that expected in an efficient market. For example, Fisher (1966) observes that frequently rebalanced indexes for equally-weighted portfolios

seem to outperform buy-and-hold portfolios.8 In particular, Fisher compares n-period

observed returns, $Q_n$, to chained equally-weighted observed monthly returns, $r_t$, and finds:

$$Q_n < \left[\prod_{t=1}^{n}(1 + r_t)\right] - 1. \tag{1}$$

Fisher considers whether this inequality reflects bias due to infrequent trading and the

propensity of databases to use non-synchronous closing prices. He conjectures that the lagged

response to new information, as well as random errors, could lead to observed link relatives (one

plus return), $(1+\hat r_t) = \hat P_t / \hat P_{t-1}$, being upward biased. Let $\hat r_t$ be the fully-adjusted observed return

of a security, and let $e_t$ be the error in the observed price, $\hat P_t$, of the security at time t. Then

$$(1 + \hat r_t) = \frac{1 + e_t}{1 + e_{t-1}}\,(1 + r_t) \tag{2}$$

where $r_t$ is the true (unbiased) return at time t. If $E(e_t) = E(e_{t-1}) = 0$ and the errors are independent, then by Jensen’s

inequality, the expected value of the ratio on the right-hand side of Eq. (2) is greater than 1.0. Hence, observed

returns are biased upward relative to true returns.
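The Jensen's inequality argument can be checked with a small exact calculation. The sketch below assumes, purely for illustration, that the error in each period is either +delta or -delta with equal probability and is independent across periods; the function name and the value of `delta` are hypothetical.

```python
from itertools import product


def expected_error_ratio(delta: float) -> float:
    """Exact E[(1 + e_t) / (1 + e_{t-1})] for independent errors of +/- delta.

    Each of the four (e_t, e_{t-1}) combinations has probability 1/4,
    so the expectation is a plain average over the combinations.
    """
    outcomes = [-delta, delta]
    ratios = [(1.0 + e_t) / (1.0 + e_prev)
              for e_t, e_prev in product(outcomes, repeat=2)]
    return sum(ratios) / len(ratios)


delta = 0.01  # hypothetical 1% relative pricing error
bias = expected_error_ratio(delta) - 1.0
print(f"upward bias in the expected link relative: {bias:.8f}")
print(f"variance of the error (delta squared):     {delta ** 2:.8f}")
```

Even though both errors have mean zero, the expected ratio exceeds one, and the excess is close to the variance of the beginning-of-period error.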

8 This upward bias in equally-weighted returns is first observed by Macaulay (1938, pp. 149–154) in his analysis of railroad stocks. Macaulay concludes that the bias, which he calls mathematical drift, is larger than can be caused by chance, and is much larger than the bias observed for a value-weighted index of the same stocks.



Fisher (1966) conjectures that differential lagged response to market-wide information

will result in negative serial correlation in the residuals from a market model regression.

However, he further conjectures that the amount of error introduced by the return gener-

ation process is very small.

Blume and Stambaugh (1983) extend the work of Fisher (1966) by modeling bias as a

function of bid-ask spread.9 The spread supposedly compensates liquidity providers for

providing liquidity and creates a friction in the market. They assume that a security sells

either at the bid price, Pb, or the ask price, Pa, and that security’s true price, P, is at the

midpoint of the spread. Then the observed price contains a relative error equal to

$$e = \pm\frac{P_a - P_b}{2P}. \tag{3}$$

Blume and Stambaugh (1983) show that a relative error of $e$ causes upward bias equal to

$e^2/(1+e^2)$, which is shown to be approximately equal to $e^2$. Further, they suggest that the bias

can be approximated by the variance of the previous period’s error terms, $\sigma^2(e_{t-1})$. They show

that bid-ask spread is a problem only in equally-weighted portfolios.10 They also show that

equally-weighted portfolios will contain the average amount of bias $\sigma^2(e_{t-1})$ across firms,

which can be significant. Although the returns on buy-and-hold portfolios contain less bias

than rebalanced portfolios, they are still biased upward relative to true returns.11

It is clear from Eq. (3) that wider spreads will cause larger errors in observed prices.

Blume and Stambaugh (1983) conjecture that, since smaller firms typically have wider

spreads, the small-firm effect (Reinganum 1982; Keim 1983) may be due to bid-ask bias.

To test this hypothesis, they form ten portfolios based on year-end firm values. They then

calculate the following year’s average daily return based on: (1) a daily rebalancing

strategy and (2) a 1 year buy-and-hold strategy. The overall difference between the

smallest and largest firm size portfolios is only half as large for the buy-and-hold strategy

relative to the rebalanced strategy. They conclude that at least part of the small firm effect

is due to bias in returns induced by relatively wider bid-ask spreads.

Roll (1984) exploits the fact that observed prices differ from true prices due to bid-ask

spreads to develop a method for estimating the ‘‘effective’’ bid-ask spread.12 His method is

based on the transactional model of Niederhoffer and Osborne (1966), as well as the

models of Cootner (1962) and Samuelson (1965). Niederhoffer and Osborne show that the

market-making process (which causes the bid-ask spread) results in negative serial

dependence in observed trade prices. Roll extends previous work by showing that the bid-

ask spread results in negative serial covariance in observed returns. He shows that if true

price equals $(P_B + P_A)/2$ and true returns are serially uncorrelated, then

9 Although Blume and Stambaugh (1983) consider other possible pricing errors (see their Sect. 2.4), they assert that the bias from them is negligible.

10 Equally weighting portfolios is the method most commonly used in event studies.

11 Buy-and-hold portfolios reduce the bias because the weights used after the first period have a negative correlation with subsequent observed returns which offsets the upward bias. It then follows that other portfolio weighting methods will also reduce the bias if the weights are similarly negatively correlated with observed returns. Bessembinder and Kalcheva (2007) show that the method presented in this paper is one such method.

12 While quoted bid-ask spread is defined as $P_a - P_b$, effective spread takes into account the fact that trades can occur at prices other than the posted bid and ask. Effective spread measures the distance between the midpoint of the spread and trade prices. Mathematically it can be expressed as $2\left[P_t - \frac{P_a + P_b}{2}\right]$, where $P_t$ is the observed trade price.



covret ¼ �4 � ES2 ð4Þ

where covret is the serial covariance of observed returns, and ES is the effective spread as a

function of price.

Roll estimates the average effective spread for NYSE and Amex stocks based on daily

holding periods to be 0.017. For 5-day holding periods, the average effective spread is

estimated to be 0.30. He proves that in an efficient market the serial covariance of true

returns is zero. He also argues that the measurement of effective bid-ask spread is inde-

pendent of the return interval. Therefore, since the effective bid-ask spread is not the cause

of the difference between 1- and 5-day intervals, he concludes that the NYSE and Amex

are not as informationally efficient as previously thought. Fama (1991) also concludes that

stocks do not immediately adjust to new information. Therefore, observed prices may

contain errors other than those induced by bid-ask spreads. The other errors arise from a

combination of non-synchronous trading, discrete pricing, and slow adjustment to

information.13

Although both Blume and Stambaugh (1983) and Roll (1984) show that observed

returns are upward biased, to date no studies have proposed a method for removing the bias

that allows for frequent rebalancing. Conrad and Kaul (1993) suggest that studies exam-

ining long-term returns use annual (or longer) buy-and-hold returns. Canina et al. (1998)

agree with Conrad and Kaul (1993) and suggest that researchers construct long holding-

period buy-and-hold portfolios when they are concerned with long-run performance.14

However, an annual holding period is not suitable for event studies, which typically

examine returns in the days following an event.

In a recent paper, Bessembinder and Kalcheva (2007) examine the impact of bid-ask

bounce errors on asset pricing tests. They suggest that the upward bias imparted by bid-ask

bounce results in noisy beta estimates and downward bias in the estimated premium for

beta risk. They examine several potential ways of removing the bias in observed returns,

including the one presented in this paper. One of their methods applies if the true price is

equal to the quote midpoint. They first find proportional spreads for each security as in Roll

(1984) as $P_t - \frac{P_a + P_b}{2}$, then find the variance of these biases for a time series of observations

for a firm and subtract the obtained variance from observed returns.

Bessembinder and Kalcheva (2007) also examine two potential methods of removing

bias that do not require knowing the bid-ask spread. The first of these is to use continu-

ously-compounded returns. They show that the mean of observed continuously-

compounded returns is equal to the mean of true continuously-compounded returns.

However, as they note, there are several problems associated with using these returns. For

example, Fisher (1966) illustrates that chained short horizon continuously-compounded

returns are downward biased relative to long holding period returns by about the same

amount. In addition, Ferson and Korajczyk (1995) argue that continuously-compounded

returns are inappropriate for tests of asset pricing models.

The last correction Bessembinder and Kalcheva (2007) examine is the method presented

in this paper—as well as in Weaver (1991).15 They find that our method effectively

13 Discrete pricing may cause errors if the amount of expected price adjustment to information is less than the minimum tick size on a market. In addition, discrete pricing may cause true prices to deviate from the mid-point of the bid-ask spread.

14 Fisher and Lorie (1964, 1968, and 1977) do just that. However, they find that initial year returns are almost always higher than second and subsequent year returns.

15 Multiplying each stock’s ‘‘return weight’’ by one plus the stock’s observed return and cumulating over stocks yields our Eq. (11).



removes the upward bias in observed returns. In the next section, we present our method

for producing asymptotically unbiased equally-weighted indexes.

3 The model

As mentioned above, observed returns may contain biases due to random transient errors

(e.g., bid-ask spread), non-synchronous trading, discrete pricing, and/or slow adjustment to

information.16 Here we develop a model of price adjustment that takes these factors into

account. We then develop an asymptotically unbiased method that removes random

transient errors. In subsequent sections, we use this method to estimate the bias inherent in

the monthly CRSP Equally-Weighted Index and to examine the length of the price

adjustment process.

Let the observed price of security i at time t, $\hat P_{i,t}$, be equal to17:

$$\hat P_{i,t} = \left[(1 - a_{i,t})\,P_{i,t-1} + a_{i,t}\,P_{i,t}\right](1 + e_{i,t}) \tag{5}$$

where $a_{i,t}$ = the adjustment coefficient, which shows the extent to which a price has

adjusted to information released since t - 1. If there is no lag, then $a_{i,t} = 1$. $e_{i,t}$ = a random

transient error, expressed as a fraction of $P_{i,t}$.

Equation (5) implies that two different types of errors cause departures from true prices.

The first is an independent random transient error, e. This class of error can either be a

byproduct of the return generating process (e.g., bid-ask spreads) or independent transient

errors in prices. The second type is a lagged response process. There is considerable

empirical support for lagged response to information. For example, Hou and Moskowitz

(2005) find that for some stocks, responses can take up to 4 weeks.
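A minimal sketch of Eq. (5) may help to separate the two error types. It assumes one reading of the equation, in which the prices on the right-hand side are true prices and only the left-hand side is observed; the function name and the numerical values are hypothetical.

```python
def observed_price(true_prev: float, true_now: float, a: float, e: float) -> float:
    """Observed price under Eq. (5).

    true_prev, true_now: true prices at t-1 and t.
    a: adjustment coefficient (1 = fully adjusted, 0 = no adjustment yet).
    e: random transient error, as a fraction of price.
    """
    partially_adjusted = (1.0 - a) * true_prev + a * true_now
    return partially_adjusted * (1.0 + e)


# Hypothetical case: the true price moves from 20.00 to 21.00.
fully_adjusted = observed_price(20.0, 21.0, a=1.0, e=0.0)     # no lag, no error
not_adjusted = observed_price(20.0, 21.0, a=0.0, e=0.0)       # stale price
with_transient = observed_price(20.0, 21.0, a=1.0, e=0.005)   # +0.5% transient error
print(fully_adjusted, not_adjusted, with_transient)
```

Setting `a` to 1 isolates the transient error, and setting `e` to 0 isolates the lagged response, which mirrors the order in which the two cases are treated below.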

The process of removing the errors in returns of equally-weighted portfolios is more

easily understood if the effect of each type of error is first examined separately. Therefore,

we examine the random transient type error (eit) first for the case where errors are com-

pletely corrected by the next observation, and then we take up the lagged response case.

3.1 Random transient errors

In the absence of a lagged adjustment process (ai,t = 1), observed prices are equal to true

prices on average since E(ei,t) = 0. However, returns are upward biased since

$$\left[1 + E(\hat r_{it})\right] = E\left[\frac{\hat P_{it}}{\hat P_{i,t-1}}\right] = E\left[\frac{P_{it}(1 + e_{it})}{P_{i,t-1}(1 + e_{i,t-1})}\right] = \left[1 + E(r_{it})\right] \cdot E\left[\frac{1 + e_{it}}{1 + e_{i,t-1}}\right] \tag{6}$$

and from Jensen’s inequality we know that $E\left[\frac{1 + e_{i,t}}{1 + e_{i,t-1}}\right] > 1$. Blume and Stambaugh (1983)

show that by invoking the law of large numbers, defining $R_t$ as the return on a large

portfolio (or index) at time t, and noting that $E(e_{i,t}) = 0$, Eq. (6) can be rewritten as

$$1 + E(\hat R_t) \approx \left[1 + E(R_t)\right] \cdot E\left[\frac{1}{1 + e_{t-1}}\right] \tag{7}$$

16 Additional examples of random transient errors include errors in observed stock prices caused by incorrect order entry or transaction recording.

17 Observed variables will be indicated with a hat and true values of variables will have no notation.



Therefore, the bias inherent in observed portfolio returns is the result of random transient

errors in observed prices at the beginning of the holding period. Blume and Stambaugh

further show that Eq. (6) can be approximated using a Taylor series as18

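$$1 + E(\hat r_{i,t}) \approx \left[1 + E(r_{i,t})\right] \cdot \left[1 + \sigma^2(e_{i,t-1})\right] \tag{8}$$

and for the return on a frequently rebalanced equally-weighted portfolio, Eq. (7) becomes19

$$1 + E(\hat R_t) \approx \left[1 + E(R_t)\right] \cdot \left\{1 + \sigma^2(e_{i,t-1})\right\}. \tag{9}$$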

Thus far, we have dealt with single holding period returns. To denote a multiple period

return or price, we employ leading subscripts to indicate the beginning of the holding

period and a trailing subscript to denote its end. For example, a two holding-period true

return on a portfolio observed at time t would be ${}_2R_t$. Next, we define a one holding-period

true portfolio price relative, $W_t$, as $(1 + R_t)$ and a two holding-period true portfolio price

relative as ${}_2W_t = (1 + {}_2R_t)$.20

In the absence of any errors (both random transient and lagged adjustment), it is clear

that the expectation of a two-period portfolio price relative is equal to the product of the

expectations of two sequential portfolio price relatives (although the distributions are

different). Thus,

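$$E({}_2W_t) = E(W_{t-1}) \cdot E(W_t) \tag{10}$$

Then21

$$E(W_t) = E\left[\frac{{}_2W_t}{W_{t-1}}\right] \tag{11}$$

We note that the observed price relative on a two holding-period portfolio ending at time t is

$${}_2\hat W_t = 1 + E({}_2\hat R_t) \approx \left[1 + E({}_2R_t)\right]\left\{1 + \sigma^2(e_{i,t-2})\right\} \tag{12}$$

In addition, the observed wealth relative on a one holding-period portfolio ending at time

t - 1 is

$$\hat W_{t-1} = 1 + E(\hat R_{t-1}) \approx \left[1 + E(R_{t-1})\right]\left\{1 + \sigma^2(e_{i,t-2})\right\} \tag{13}$$

The expectation of Eq. (11) asymptotically becomes22

$$E\left[\frac{{}_2\hat W_t}{\hat W_{t-1}}\right] = \frac{\left[1 + E({}_2R_t)\right] \cdot \left\{1 + \sigma^2(e_{i,t-2})\right\}}{\left[1 + E(R_{t-1})\right] \cdot \left\{1 + \sigma^2(e_{i,t-2})\right\}} = \frac{1 + E({}_2R_t)}{1 + E(R_{t-1})} = 1 + E(R_t) \tag{14}$$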

18 As shown in Blume and Stambaugh (1983) footnote 6.

19 Blume and Stambaugh (1983), and some others, implicitly assume continuous pricing so that the distribution of error terms is binomial. That is, the true price of stock is the midpoint of the spread. Given the reality of tick-induced discrete pricing, true price is not necessarily at the midpoint of spread. Therefore, a log normal distribution of error terms is more representative. In ‘‘Appendix 1’’ we show that the bias arising from a log normally distributed error term is equal to that of a binomially distributed error.

20 Subtracting 1 from a wealth relative yields the return on an index or portfolio.

21 Assuming no lagged adjustment is equivalent to assuming no serial covariance, thus the product of expectations is equal to the expectation of the product.

22 Since $\left\{1 + \sigma^2(e_{i,t-2})\right\}$ is in both the numerator and denominator, Jensen’s inequality does not apply.



The intuition for this is straightforward. Recall that the bias inherent in observed returns

is due to the variance of error terms at the beginning of the holding period that are

corrected by the end of the holding period. In Eq. (14), since the two-period wealth relative

in the numerator and the one-period wealth relative in the denominator both begin in the

same period; they both contain the same average bias. Therefore, the biases cancel out and

the asymptotic result is approximately the expected one-period portfolio wealth relative for

time t.
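The cancellation can be illustrated with an exact population calculation rather than a random simulation. The sketch below assumes, for illustration only, that every stock has the same true returns, that there is no lagged adjustment, and that the errors at t - 2, t - 1, and t are independent and equal to +delta or -delta with equal probability. The function names and parameter values are hypothetical.

```python
from itertools import product


def mean(values):
    return sum(values) / len(values)


def average_price_relatives(r_prev: float, r_now: float, delta: float):
    """Population averages of observed price relatives with +/- delta errors.

    Returns (one-period relative ending at t,
             one-period relative ending at t-1,
             two-period relative ending at t).
    """
    one_t, one_prev, two_t = [], [], []
    # Enumerate the eight equally likely error paths (t-2, t-1, t).
    for e2, e1, e0 in product([-delta, delta], repeat=3):
        p2 = 1.0 * (1.0 + e2)                              # observed price at t-2
        p1 = (1.0 + r_prev) * (1.0 + e1)                   # observed price at t-1
        p0 = (1.0 + r_prev) * (1.0 + r_now) * (1.0 + e0)   # observed price at t
        one_t.append(p0 / p1)
        one_prev.append(p1 / p2)
        two_t.append(p0 / p2)
    return mean(one_t), mean(one_prev), mean(two_t)


w_naive, w_prev, w_two = average_price_relatives(r_prev=0.02, r_now=0.01, delta=0.02)
naive_return = w_naive - 1.0            # observed return, biased upward
corrected_return = w_two / w_prev - 1.0  # ratio used in Eq. (14)
print(f"true return:      {0.01:.6f}")
print(f"naive return:     {naive_return:.6f}")
print(f"corrected return: {corrected_return:.6f}")
```

Because the two-period and one-period averages share the same beginning-of-period error term, the ratio recovers the true return for period t, while the naive one-period average does not.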

3.2 Lagged response

Transient errors are not the only variables to have an impact on observed prices. There is

also the possibility that prices do not adjust instantaneously or simultaneously. In this

section, we analyze the effect of a lagged response on observed returns. Recall Eq. (5):

$$\hat P_{i,t} = \left[(1 - a_{i,t})\,P_{i,t-1} + a_{i,t}\,P_{i,t}\right] \cdot (1 + e_{i,t}) \tag{5}$$

For simplicity we assume when prices adjust to new information, they do so fully.

Accordingly then ai,t is a Kronecker delta with the value of 1 for prices that have fully

adjusted, otherwise 0. Next, assume a ‘‘true’’ price generating process

$$P_{i,t} = \left(P_{i,t-1} + \beta_i R_t P_{i,t-1}\right) \cdot (1 + u_{i,t}) \tag{15}$$

where $R_t$ is the true return on the market index at time t and $u_{i,t}$ is a random non-systematic

component for period t. For tractability, we assume that all betas are equal to 1. Then by

defining the index link relative as $W_t = (1 + R_t)$, Eq. (15) becomes

$$P_{i,t} = P_{i,t-1}(W_t) \cdot (1 + u_{i,t}) \tag{16}$$

Note that $\hat P_{i,t-1}$ is also subject to non-adjustment to new information so that

$$\hat P_{i,t-1} = \left[(1 - a_{i,t-1})\,P_{i,t-2} + a_{i,t-1}\,P_{i,t-1}\right] \cdot (1 + e_{i,t-1})$$

Define $h$ as the probability that $a_{i,t} = 0$ and $a$ as $(1 - h)$. Then by noting that $\sum u = 0$ we

can combine Eqs. (5) and (16) to obtain the aggregate observed index relative, $\hat W_t$, as23

$$\hat W_t = \left[1 + \sigma^2(e_{t-1})\right](a + hW_{t-1})(h + aW_t) \tag{17}$$

This can be rewritten in return form as

$$\hat W_t = \left[1 + \sigma^2(e_{t-1})\right](1 + hR_{t-1})(1 + aR_t). \tag{18}$$

Similarly, we show in ‘‘Appendix 3’’ that in the presence of slow adjustment to new

information, the ratio of a two-period index relative to a one-period index relative ending at

time t - 1, as in Eq. (14), is

$$E\left[\frac{{}_2\hat W_t}{\hat W_{t-1}}\right] = \frac{(1 + R_{t-1})(1 + aR_t)}{(1 + aR_{t-1})} \tag{19}$$

If prices fully adjust by time t for information released at time t - 1 (i.e., a = 1), then

23 The conditional expected index relative table is provided in ‘‘Appendix 2’’.



$$E\left[\frac{{}_2\hat W_t}{\hat W_{t-1}}\right] = 1 + E(R_t) \tag{20}$$

Therefore, if prices fully adjust to new information within a month, our method provides

an unbiased estimate of true index returns at time t. Given that Hou and Moskowitz (2005)

find that there is a lagged response of up to 4 weeks for some U.S. stocks, it is reasonable

to conclude that Eq. (20) provides an unbiased estimate of monthly index returns. How-

ever, it is also clear that it may not be adequate for estimating unbiased returns for periods

of less than 1 month.
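The role of the adjustment probability can be seen by evaluating the ratio of the two-period to the one-period index relative, read here as (1 + R_{t-1})(1 + a R_t) / (1 + a R_{t-1}), for different values of a. This is a sketch under that reading of Eq. (19); the function name and the return values are hypothetical.

```python
def ratio_with_lag(r_prev: float, r_now: float, a: float) -> float:
    """Ratio of the two-period to the one-period index relative under lagged response.

    r_prev, r_now: true index returns for periods t-1 and t.
    a: probability that a price has fully adjusted (a = 1 means no lag).
    """
    return (1.0 + r_prev) * (1.0 + a * r_now) / (1.0 + a * r_prev)


r_prev, r_now = 0.02, 0.01   # hypothetical true index returns
full = ratio_with_lag(r_prev, r_now, a=1.0)      # full adjustment: equals 1 + R_t
partial = ratio_with_lag(r_prev, r_now, a=0.8)   # 20% of prices have not adjusted
print(f"a = 1.0: estimate of 1 + R_t = {full:.6f}")
print(f"a = 0.8: estimate of 1 + R_t = {partial:.6f}")
```

With full adjustment the ratio reduces to 1 + R_t; with partial adjustment it is contaminated by the previous period's return, which is why the method is applied here to monthly rather than shorter horizons.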

4 Simulation results

To examine how our index construction method performs relative to other bias correction

methods, we perform simulations based on the characteristics of observed historical NYSE

prices and the CRSP Equally-Weighted Index return (including all distributions). Although

stock pricing errors can occur due to bid-ask spreads (or other random transient errors),

slow adjustment to information, and non-trading, we only examine the first type of error.

To create more realistic simulation data, we base our simulated prices and market returns

on the actual distributions of monthly prices and index returns for the period January 1926

to December 1996. During this period, the tick size for stocks is $0.125 for all stocks priced

over $1.

We find that the distribution of NYSE prices (over $1) during the $0.125 tick period can

be well approximated by a gamma distribution with shape parameter of 1.757, mean of

$28.55, and standard deviation of $21.54. During this period, returns on the CRSP Equally-

Weighted Index are approximately log normally distributed with a mean of 1.024% per

month and standard deviation of 6.68%. For each run of the simulation, we first randomly

generate Month 1 prices for 1,000 stocks based on the distribution of observed prices.

These prices are deemed ‘‘true’’ prices. We next assume that true spreads for each stock are

a percentage of true price (separately 0.5, 1.0, 2.0, and 5%). Since minimum tick sizes

result in discrete pricing, we next determine the observed bid and ask prices based on

rounding the true spread up to the next highest discrete spread. The observed bid is found by rounding down to

the tick just below the simulated true price less one half of the simulated true spread; the rounded spread is then

added to the bid to find the simulated observed ask. For example, a simulated

true price of $21.15 with an assumed true spread of 1% ($0.2115) would have an observed

bid of $21.00 and an ask of $21.25. We assume that observed closing prices are either at

the bid or the ask with equal probability.
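The quoting rule can be sketched as a short function. This is an illustration of the rule as described in the text, not the authors' simulation code; the function name is hypothetical, and a value that falls exactly on a tick is assumed to stay on that tick.

```python
import math

TICK = 0.125  # tick size in dollars, as in the simulation period


def discrete_quotes(true_price: float, spread_pct: float, tick: float = TICK):
    """Observed (bid, ask) on a discrete pricing grid.

    true_price: simulated true price.
    spread_pct: true spread as a fraction of the true price (e.g. 0.01 for 1%).
    """
    true_spread = true_price * spread_pct
    # Bid: round (true price - half the true spread) down to the grid.
    bid = math.floor((true_price - true_spread / 2.0) / tick) * tick
    # Observed spread: true spread rounded up to a whole number of ticks.
    observed_spread = math.ceil(true_spread / tick) * tick
    return bid, bid + observed_spread


bid, ask = discrete_quotes(21.15, 0.01)
mid_quote = (bid + ask) / 2.0
print(bid, ask, mid_quote)  # 21.0 21.25 21.125
```

In this example the mid-quote is $21.125 while the true price is $21.15, which shows how the discrete grid moves the true price away from the midpoint of the observed spread.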

To generate Month 2 simulated true prices we assume that the true return for our stocks

follows the market model with an $\alpha$ of zero, or

$$R_{it} = \alpha_i + \beta_i R_{Mt} + e \tag{21}$$

where $R_{it}$ is the ‘‘true’’ return on stock i at time t and $R_{Mt}$ is the return on a simulated

equally-weighted index. $\beta_i$ is the beta of stock i, drawn from a log normal distribution with

mean and standard deviation of one. We assume that residuals $e$ are equal to 10% of the

market price times a normally distributed random number with mean zero and a standard

deviation of 1. The month’s ‘‘true’’ prices, $P_2$, are determined as $P_1(1 + R_i)$. The true

spread is then determined as well as the observed discrete spread and observed closing

price using the method described above. The same procedure is performed to simulate

Month 3 values. We simulate 3 months since the unbiased method requires at least



3 months of observed prices to calculate an unbiased 1-month index return, while other

(biased) methods only require 2 months of data (i.e., 1 month returns based on observed

returns or quote mid-points).
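The draws described above can be sketched with the Python standard library. Several details are assumptions made for illustration only: the gamma scale is backed out from the quoted mean and shape, the log normal parameters are backed out from the quoted means and standard deviations, log normality of the index is taken to apply to one plus the return, and the residual is taken to be 0.10 times a standard normal draw added to the return. No guard against non-positive prices or prices under $1 is included.

```python
import math
import random

SHAPE, MEAN_PRICE = 1.757, 28.55   # gamma shape and mean quoted in the text
SCALE = MEAN_PRICE / SHAPE         # gamma mean = shape * scale

def lognormal_params(mean, sd):
    """(mu, sigma) of the underlying normal for a log normal with this mean and sd."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)
    return math.log(mean) - sigma2 / 2.0, math.sqrt(sigma2)

BETA_MU, BETA_SIGMA = lognormal_params(1.0, 1.0)        # beta: mean 1, sd 1
MKT_MU, MKT_SIGMA = lognormal_params(1.01024, 0.0668)   # assumed to apply to 1 + R_M

def simulate_month(prices, betas, rng):
    """Advance true prices one month under the market model with alpha = 0."""
    r_m = rng.lognormvariate(MKT_MU, MKT_SIGMA) - 1.0
    # Assumption: residual = 0.10 * standard normal, added to the return.
    return [p * (1.0 + b * r_m + 0.10 * rng.gauss(0.0, 1.0))
            for p, b in zip(prices, betas)]

rng = random.Random(0)
p1 = [rng.gammavariate(SHAPE, SCALE) for _ in range(1000)]       # Month 1 true prices
betas = [rng.lognormvariate(BETA_MU, BETA_SIGMA) for _ in range(1000)]
p2 = simulate_month(p1, betas, rng)                              # Month 2 true prices
p3 = simulate_month(p2, betas, rng)                              # Month 3 true prices
```

With the quoted shape and mean, the implied gamma standard deviation is sqrt(1.757) x 16.25, or about $21.54, which agrees with the figure reported in the text.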

For each simulation run, we compute the true equally-weighted index return for a

universe of 1,000 stocks and compare it to the biased observed index return as

well as to the returns from two potential methods for removing the bias:

• Computing returns based on the midpoint of the closing observed spreads

• Our proposed unbiased method of portfolio return estimation.

For observed returns and the quote-midpoint method, we employ observed prices in months 2

and 3 to calculate the index return for Month 3. All 3 months are used to calculate the

unbiased portfolio return. We run the simulation 1,000 times and compute average errors

across simulations. The results are summarized in Table 1. As mentioned earlier, we

examine true spread widths of 0.5, 1.0, 2.0, and 5% of true price separately. As expected,

we find that observed returns contain upward bias, and this bias increases with spread

width. The average error is statistically significant at the 0.01 level for every spread width.

Turning to the proposed methods for creating an unbiased equally-weighted index, we

find that basing returns on the midpoint of observed closing bid-ask spreads reduces the

bias by about one-third for spreads of up to 2% of price, and by two-thirds for spreads of

5%. Recall that in our simulations, the true price is not the midpoint of the spread. All of

the remaining errors using the mid-point of observed quotes are still statistically

significant.

Consistent with our earlier proof, the unbiased portfolio return exhibits statistically

insignificant errors on average for spreads of less than 10% of price. We compare the

unbiased portfolio returns to actual observed index returns for several markets and periods

in the next section.

5 CRSP equally-weighted index bias results

In the previous sections of this paper, we have noted that the observed returns of an

equally-weighted portfolio, such as the CRSP Equally-Weighted Index, are upward biased.

We also presented a method for asymptotically removing the bias. Since the bias in

observed returns is positive, it will be cumulative. In this section, we compare the observed

CRSP Equally-Weighted Index to an index constructed using our methodology and

examine the bias and its cumulative impact.

We begin by constructing the price relative for our index, i.e. ${}_{2}W_t / W_{t-1}$, for all stocks

contained in the CRSP Equally-Weighted Index using monthly returns. We separately

examine NYSE, Amex, and NASDAQ stocks through 2006.24 We define the bias as

RCRSP - RUnbiased, where the former is the return on the monthly CRSP Index and the

latter the return on the unbiased index. In order to examine comparative cumulative effects,

we set the index value equal to 100 on December 1973 for each market. (NASDAQ stocks

are included in CRSP beginning in December 1972). Each month, the previous month’s

index value is multiplied by the applicable index relative for the month. Table 2 contains

the December levels of the resulting series. It reveals the significant cumulative upward

bias inherent in an index constructed using observed equally-weighted returns.
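The chaining step described above is simple compounding. A minimal sketch follows; the function name and the three monthly relatives are hypothetical and are not taken from CRSP data.

```python
def chain_index(relatives, base=100.0):
    """Compound monthly index relatives (1 + return) into index levels."""
    levels = [base]
    for rel in relatives:
        levels.append(levels[-1] * rel)
    return levels

# Hypothetical relatives: the observed series is 10 b.p. per month above the unbiased one.
observed = [1.020, 0.990, 1.015]
unbiased = [1.019, 0.989, 1.014]
print(chain_index(observed)[-1], chain_index(unbiased)[-1])
```

Because every monthly relative of the observed series is at least as large as that of the unbiased series, the gap between the two index levels can only widen as months are added.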

24 As a check, we first re-construct the CRSP Equally-Weighted Index, so that we can find possible differences in the data bases.



For stocks listed on the NYSE, the ending index value in December 2006 based on

observed returns is over 20% greater than that of our unbiased index (11,638 vs. 9,664,

from a base of 100 in December 1973).25 For Amex and NASDAQ

stocks the differences are more dramatic. The ending index value based on observed

returns for NASDAQ stocks is over 90% larger than that for the unbiased index.

We next compare the overall returns of the NYSE and Nasdaq markets. The biased

(CRSP) index shows that between 1973 and 2006, the overall index level of the Nasdaq

market is 17,975.6, fully 54 percent higher than the index level of 11,638.3 for NYSE

stocks. This is consistent with the general perception that returns of smaller stocks, more

Table 1 Comparison of proposed alternative methods of removing bias

                  Average error, by spread as a percentage of price
Method            0.5%          1%            2%            5%
Observed          0.009%***     0.014%***     0.028%***     0.102%***
                  (0.0010)      (0.0015)      (0.0028)      (0.0069)
Quote midpoint    0.006%***     0.009%***     0.016%***     0.033%***
                  (0.0006)      (0.0009)      (0.0020)      (0.0054)
Unbiased          -0.003%       -0.000%       0.006%        0.022%
                  (0.0142)      (0.0143)      (0.0146)      (0.0160)

This table reports the average errors found for observed returns and two proposed methods of removing bias in index returns. "True" monthly returns are generated by simulating the price distribution for stocks listed on the NYSE and applying a market model with a zero alpha and a mean return equal to the CRSP Equally-Weighted Index return over the period January 1926 to December 1996 of 1.024% and a standard deviation of 6.68%. Simulated prices are drawn from a gamma distribution with shape parameter of 1.757, a mean of $28.55, and a standard deviation of $21.54. For each run of the simulation we first randomly generate period (month) 1 prices for 1,000 stocks based on the distribution of observed prices. We next assume that true spreads for each stock are a percentage of true price (separately 0.5, 1.0, 2.0, and 5%). Observed bid and ask prices are found by rounding the true spread to the next highest discrete spread. Observed closing prices are assumed to be either at the bid or at the ask with equal probability. Second and subsequent month "true" prices are generated by applying the market model of simulated returns with beta of each stock drawn from a log normal distribution with mean one. We assume that residuals are equal to 10% of the market return times a normally distributed random number with mean zero and standard deviation of 1. The month's "true" prices, P2, are determined as P1(1 + Ri); the true spread is then determined, as well as the observed discrete spread and observed closing price, using the method described above. The same procedure is performed for simulation month 3. Three months are simulated since the unbiased method requires at least 3 months of observed prices to calculate an unbiased 1-month index return, while other methods only require two. For each simulation run, we compare the true equally-weighted index return for our simulated universe of 1,000 stocks to the biased observed index return as well as to two potential methods for removing the bias

• Computing returns based on the midpoint of the closing observed spreads

• The unbiased method of index construction

Reported is the average error (true return minus each method's return) for each return estimation method based on over 1,000 simulations. Standard errors are shown in parentheses below each estimate, and statistical significance, based on a t-test, is indicated by asterisks

***, **, * Denote significant at the 0.01, 0.05 and the 0.10 level, respectively

25 If the index value for December 1926 is set to 100, then the unbiased NYSE index would have a value of 361,016 by 2002, while the CRSP NYSE index would reach a value of 836,852. This further illustrates the impact of bias in the construction of long series of equally-weighted indexes.



Table 2 Values of the unbiased and CRSP equally-weighted indexes, December 1926–2006

      NYSE                         Amex                         NASDAQ
Year  Unbiased index  CRSP EWRETD  Unbiased index  CRSP EWRETD  Unbiased index  CRSP EWRETD

1926 1.1 0.6 – – – –

1927 1.5 0.8 – – – –

1928 2.1 1.2 – – – –

1929 1.4 0.8 – – – –

1930 0.9 0.5 – – – –

1931 0.4 0.3 – – – –

1932 0.4 0.3 – – – –

1933 1 0.7 – – – –

1934 1.1 0.9 – – – –

1935 1.8 1.4 – – – –

1936 2.7 2.1 – – – –

1937 1.4 1.1 – – – –

1938 1.9 1.5 – – – –

1939 1.9 1.6 – – – –

1940 1.7 1.5 – – – –

1941 1.5 1.4 – – – –

1942 2 1.8 – – – –

1943 3.3 2.9 – – – –

1944 4.6 4.1 – – – –

1945 7.4 6.7 – – – –

1946 6.6 6 – – – –

1947 6.6 6 – – – –

1948 6.4 5.9 – – – –

1949 7.8 7.2 – – – –

1950 10.6 9.8 – – – –

1951 12.2 11.4 – – – –

1952 13.4 12.5 – – – –

1953 13 12.1 – – – –

1954 20.3 19 – – – –

1955 24.3 22.9 – – – –

1956 25.9 24.5 – – – –

1957 22.3 21 – – – –

1958 35.3 33.5 – – – –

1959 40.7 38.8 – – – –

1960 40 38.2 – – – –

1961 51.5 49.4 – – – –

1962 44.7 43.1 46.2 40.1 – –

1963 52.9 51.1 50.9 44.9 – –

1964 62.4 60.4 59.3 53.1 – –

1965 80.1 77.8 85.5 77.2 – –

1966 74 72.2 79.8 72.8 – –



1967 110.8 108.5 174.5 161.2 – –

1968 144.1 141.3 272.2 254.1 – –

1969 114.5 112.5 186.1 173.8 – –

1970 110.2 109.2 144.9 138.3 – –

1971 131 130.4 173 167.4 – –

1972 141.6 141.5 174.4 170.4 – –

1973 100 100 100 100 100 100

1974 72.4 73.5 70.4 73 71.8 74.3

1975 113.8 118.9 115.6 127 110.3 119.2

1976 164.9 173.1 175.3 196.9 159.8 176.7

1977 180 189.5 214.9 245.6 210.1 235.5

1978 203.5 216.2 272.5 319.6 274.8 313.8

1979 274.3 292.6 397.3 472.2 387 448.5

1980 357.1 382.8 547.6 663.5 585.7 679.7

1981 376.9 405.4 552.2 676.1 555.8 653.2

1982 486.5 525.2 694.2 861.5 663.1 796.1

1983 644.3 700.6 970.4 1,226.40 893.1 1,098.20

1984 645.3 706.3 862.6 1,098.10 739.2 923.1

1985 835.8 917.3 1,030.50 1,321.80 906.5 1,151.10

1986 951.7 1,047.00 1,092.70 1,423.90 946 1,215.50

1987 922.2 1,017.00 968.8 1,275.60 839.3 1,093.40

1988 1,119.30 1,243.30 1,126.00 1,504.60 960.7 1,285.90

1989 1,304.40 1,451.60 1,288.10 1,726.10 1,039.50 1,405.30

1990 1,066.30 1,195.20 933.1 1,276.70 790.6 1,091.30

1991 1,461.80 1,661.20 1,266.10 1,795.90 1,224.80 1,742.10

1992 1,707.30 1,960.60 1,553.00 2,279.40 1,565.00 2,278.40

1993 2,033.50 2,340.50 2,053.40 3,027.20 1,975.40 2,951.10

1994 1,950.10 2,249.90 1,986.20 2,989.50 1,820.80 2,761.70

1995 2,404.60 2,778.30 2,479.40 3,777.40 2,402.80 3,708.30

1996 2,881.40 3,335.10 2,972.90 4,623.90 2,757.70 4,299.20

1997 3,583.60 4,154.80 3,607.10 5,667.60 3,181.80 5,042.10

1998 3,466.30 4,048.50 3,131.70 5,105.90 2,953.90 4,927.90

1999 3,621.20 4,237.40 3,723.30 6,117.80 4,584.90 7,680.40

2000 3,972.50 4,669.30 3,467.30 5,712.10 3,569.30 5,884.30

2001 4,399.30 5,264.30 3,951.20 6,949.40 4,115.00 7,441.80

2002 4,090.30 4,920.10 3,699.90 6,587.40 3,380.70 6,327.70

2003 5,973.60 7,167.30 6,221.10 11,145.7 6,606.30 12,377.2

2004 7,260.70 8,744.20 7,436.10 13,358.9 7,967.20 15,068.5

2005 7,946.60 9,559.50 7,847.10 14,197.9 8,122.60 15,427.3



predominant on the Nasdaq market, tend to show higher returns than larger stocks.

However, when we compare the levels of the unbiased indexes, the ranking changes

dramatically: the index level for Nasdaq is 9,418.2, slightly lower than the 9,663.5 level for

NYSE stocks, by about 2.5%. This result indicates that accounting for bias in index returns

over long periods of time can change rankings of relative stock performance.
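The percentage comparisons quoted in this section follow directly from the December 2006 levels in Table 2. The short check below recomputes them; the dictionary layout and function name are illustrative only.

```python
# December 2006 index levels from Table 2 (December 1973 = 100).
levels_2006 = {
    "NYSE":   {"unbiased": 9663.5, "crsp": 11638.3},
    "Amex":   {"unbiased": 9212.3, "crsp": 16739.5},
    "NASDAQ": {"unbiased": 9418.2, "crsp": 17975.6},
}

def pct_gap(a, b):
    """Percentage by which level a exceeds level b."""
    return 100.0 * (a / b - 1.0)

# Observed (CRSP) index relative to the unbiased index, by market.
within_market = {m: pct_gap(v["crsp"], v["unbiased"]) for m, v in levels_2006.items()}

# NASDAQ relative to NYSE, first with the biased indexes, then with the unbiased ones.
biased_rank = pct_gap(levels_2006["NASDAQ"]["crsp"], levels_2006["NYSE"]["crsp"])
unbiased_rank = pct_gap(levels_2006["NASDAQ"]["unbiased"], levels_2006["NYSE"]["unbiased"])

for market, gap in within_market.items():
    print(market, round(gap, 1))
print(round(biased_rank, 1), round(unbiased_rank, 1))
```

The observed index exceeds the unbiased index by roughly 20% for the NYSE, 82% for Amex, and 91% for NASDAQ, and the NASDAQ lead of about 54% over the NYSE under the biased indexes becomes a shortfall of about 2.5% under the unbiased indexes.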

We report average bias levels in observed monthly index returns by 5-year (and several

longer) periods in Table 3. Examining the results for NYSE stocks reveals that the average

monthly bias ranges from 2.27 b.p. for the last half of the 1950s to 49.21 b.p. for the first

half of the 1930s. Through 1995, all of the NYSE 5-year sub-periods exhibit bias that is statistically

significant at the 0.10 level or better. Although the periods starting in 1996 exhibit upward

bias, it is not statistically significant. In 1997, U.S. markets cut their tick size from $1/8 to

$1/16, and further to $0.01 in 2001. These reductions are associated with a narrowing of

spreads on the NYSE.26 It follows that if spreads narrow, the bias inherent in observed

returns also declines.

The average bias in observed returns for indexes of Amex and NASDAQ stocks is over

three times larger than that found for NYSE stocks. In particular,

we find that the average bias for Amex (NASDAQ) observed index returns is about 15 (16)

b.p. a month over the period April 1973 through December 2006. All of the average biases

are statistically different from zero at acceptable levels. This suggests that all time periods

are subject to significant upward bias in computed index returns. In addition, the facts that

Amex and NASDAQ stocks are typically smaller and have wider spreads indicate that

indexes for this type of stock (e.g., emerging markets indexes) will benefit most from the

index construction method presented in this paper.
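A rough consistency check links these average monthly biases to the cumulative gaps in Table 2. The sketch below compounds the April 1973 to December 2006 mean biases from Table 3 over the 396 months from December 1973 to December 2006. It is only an approximation: the ratio of the two index levels is the product of monthly ratios of relatives, not of one plus the average bias, and the averaging window starts a few months before the index base date.

```python
# Mean monthly bias in basis points, April 1973 to December 2006 (Table 3, Panel B).
mean_bias_bp = {"NYSE": 4.66, "Amex": 15.34, "NASDAQ": 16.45}

MONTHS = 33 * 12  # December 1973 to December 2006

def compounded_factor(bias_bp, months=MONTHS):
    """Growth factor from compounding a constant monthly bias (approximation)."""
    return (1.0 + bias_bp / 10000.0) ** months

factors = {m: compounded_factor(bp) for m, bp in mean_bias_bp.items()}
for market, f in factors.items():
    print(market, round(f, 2))
```

The compounded factors are about 1.20, 1.83, and 1.92, close to the ratios of the CRSP to the unbiased December 2006 levels in Table 2 (about 1.20, 1.82, and 1.91).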

Blume and Stambaugh (1983) find that one-half of the small firm effect is due to upward

bias in observed returns. Accordingly, we disaggregate our data by time period and

calendar month and show the bias by calendar month in Table 4. It is apparent that

January’s bias is much larger than any other month’s bias for every market and sub-period.

However, there is statistically significant upward bias for observed returns in most other

calendar months for each market and sub-period. Therefore, this is not a January anomaly.

6 Conclusions and future research

It is well known that market microstructure noise induces upward bias in individual returns

and in equally-weighted portfolios. Noise can be due to simple error, bid-ask bounce,

Table 2 continued

      NYSE                         Amex                         NASDAQ
Year  Unbiased index  CRSP EWRETD  Unbiased index  CRSP EWRETD  Unbiased index  CRSP EWRETD

2006 9,663.50 11,638.3 9,212.30 16,739.5 9,418.20 17,975.6

This table compares end-of-year index values, based on compounded monthly returns, for the traditional CRSP Equally-Weighted Index and the unbiased index. Index values for the NYSE, Amex, and NASDAQ are shown separately. Since CRSP's coverage of NASDAQ stocks begins in December 1972, all (end-of-year) index values are set to 100 as of December 1973

26 See Jones and Lipson (2001) and Bessembinder (2003), among others.



Table 3 Bias estimates for various periods

Columns for each market: N, mean bias, s.e., t-statistic, and s.e. of transient errors (s.e.TE)

             NYSE                                Amex                                NASDAQ
Period       N    Mean   s.e.  t-stat    s.e.TE  N    Mean   s.e.  t-stat    s.e.TE  N    Mean   s.e.  t-stat    s.e.TE

A. 5 year periods
4/26–12/30   57   7.12   2.48  2.87***   2.67    –    –      –     –         –       –    –      –     –         –
1/31–12/35   60   49.21  8.38  5.87***   7.02    –    –      –     –         –       –    –      –     –         –
1/36–12/40   60   18.91  4.51  4.20***   4.35    –    –      –     –         –       –    –      –     –         –
1/41–12/45   60   8.99   4.95  1.82*     3       –    –      –     –         –       –    –      –     –         –
1/46–12/50   60   3.46   0.86  4.01***   1.86    –    –      –     –         –       –    –      –     –         –
1/51–12/55   60   2.72   0.46  5.91***   1.65    –    –      –     –         –       –    –      –     –         –
1/56–12/60   60   2.27   0.79  2.88***   1.51    –    –      –     –         –       –    –      –     –         –
1/61–12/65   60   2.72   0.69  3.95***   1.65    –    –      –     –         –       –    –      –     –         –
1/66–12/70   60   3.35   1.17  2.86***   1.83    60   9.18   2.41  3.81***   3.03    –    –      –     –         –
1/71–12/75   60   8.95   2.76  3.25***   2.99    60   23.35  4.12  5.66***   4.83    33   28.67  5.15  5.57***   5.35
1/76–12/80   60   4.24   1.4   3.02***   2.06    60   16.66  1.94  8.57***   4.08    60   11.68  2.47  4.73***   3.42
1/81–12/85   60   3.92   1.12  3.51***   1.98    60   9.44   1.77  5.35***   3.07    60   14.98  2.13  7.04***   3.87
1/86–12/90   60   3.53   1.68  2.10**    1.88    60   10.79  2.19  4.94***   3.28    60   13.75  2.71  5.07***   1.32
1/91–12/95   60   5.13   2.1   2.44**    2.26    60   18.1   4.65  3.89***   4.25    60   18.68  2.99  6.25***   4.32
1/96–12/00   60   2.88   1.96  1.47      1.7     60   12.81  6.62  1.93*     3.58    60   11.32  9.84  1.15      3.36
1/01–12/06   72   3.38   2.26  1.49      1.84    72   13.86  5.74  2.41**    3.72    72   20.71  9.47  2.19***   4.55

B. Longer periods
4/26–10/62   439  12.76  1.67  7.63***   3.57    –    –      –     –         –       –    –      –     –         –
11/62–12/06  530  4.27   0.63  6.78***   2.07    530  14.08  1.39  10.15***  3.75    –    –      –     –         –
4/26–3/73    564  10.6   1.32  8.03***   3.26    –    –      –     –         –       –    –      –     –         –
4/73–12/06   405  4.66   0.8   5.84***   2.16    405  15.34  1.76  8.72***   3.92    405  16.45  2.39  6.88***   4.06
4/26–12/06   969  8.12   0.84  9.63***   2.85    –    –      –     –         –       –    –      –     –         –

This table lists the average bias in the monthly return on the traditional CRSP Equally-Weighted Index. Bias is defined as RCRSP - RUnbiased, where the former is the return on the monthly CRSP index and the latter the return on the unbiased index. Biases are expressed in basis points (1 b.p. = 0.0001 or 0.01%). Standard errors are expressed as a percentage. Biases in indexes for NYSE, Amex, and NASDAQ stocks are examined separately. For each period and market, we list the number of observations and the mean bias, as well as the standard error and t-statistic. Panel A lists 5-year sub-periods (except for the first, which is 57 months long). Panel B lists the results for sub-periods conforming to CRSP start dates for the three markets. Tests of significance are based on two-tailed tests


Table 4 Bias estimates by calendar month

Columns for each market: N, mean bias, s.e., t-statistic, and s.e. of transient errors (s.e.TE)

           NYSE                                Amex                                Nasdaq
Month      N    Mean   s.e.  t-stat    s.e.TE  N    Mean   s.e.  t-stat    s.e.TE  N    Mean   s.e.  t-stat    s.e.TE

A. April 1926–October 1962
January    36   37.56  9.15  4.10***   6.13    –    –      –     –         –       –    –      –     –         –
February   36   5.59   7.19  0.78      2.36    –    –      –     –         –       –    –      –     –         –
March      36   13.14  4.27  3.08***   3.62    –    –      –     –         –       –    –      –     –         –
April      37   9.75   3.66  2.67**    3.12    –    –      –     –         –       –    –      –     –         –
May        37   9.29   4.01  2.32**    3.05    –    –      –     –         –       –    –      –     –         –
June       37   14.41  7.35  1.96*     3.8     –    –      –     –         –       –    –      –     –         –
July       37   14.86  4.22  3.52***   3.85    –    –      –     –         –       –    –      –     –         –
August     37   12.11  3.48  3.48***   3.48    –    –      –     –         –       –    –      –     –         –
September  37   11.48  6.73  1.71*     3.39    –    –      –     –         –       –    –      –     –         –
October    37   12.66  8.00  1.58      3.56    –    –      –     –         –       –    –      –     –         –
November   36   7.59   3.24  2.34**    2.75    –    –      –     –         –       –    –      –     –         –
December   36   4.83   2.10  2.30**    2.20    –    –      –     –         –       –    –      –     –         –

B. November 1962–December 2006
January    44   19.12  4.41  4.34***   4.37    44   50.65  9.15  5.54***   7.12    –    –      –     –         –
February   44   2.95   2.52  1.17      1.72    44   18.76  6.16  3.05***   4.33    –    –      –     –         –
March      44   5.82   1.85  3.14***   2.41    44   14.05  4.20  3.34***   3.75    –    –      –     –         –
April      44   1.28   1.13  1.13      1.13    44   10.48  3.45  3.04***   3.24    –    –      –     –         –
May        44   1.38   1.09  1.27      1.18    44   5.70   3.14  1.82*     2.39    –    –      –     –         –
June       44   2.00   1.08  1.85*     1.42    44   9.09   2.83  3.22***   3.01    –    –      –     –         –
July       44   3.44   1.40  2.45**    1.85    44   12.23  2.59  4.72***   3.50    –    –      –     –         –
August     44   2.55   1.41  1.81*     1.60    44   8.27   2.50  3.31***   2.88    –    –      –     –         –
September  44   1.60   1.65  0.97      1.26    44   10.87  4.38  2.48**    3.30    –    –      –     –         –
October    44   4.95   2.20  2.25**    2.22    44   12.9   3.14  4.11***   3.59    –    –      –     –         –
November   45   1.89   1.95  0.97      1.37    45   8.73   3.56  2.45**    2.96    –    –      –     –         –
December   45   4.30   2.03  2.12**    2.07    45   7.52   4.81  1.56      2.74    –    –      –     –         –

C. April 1926–March 1973
January    47   30.47  7.26  4.20**    5.52    –    –      –     –         –       –    –      –     –         –
February   47   4.19   5.52  0.76      2.05    –    –      –     –         –       –    –      –     –         –
March      47   10.82  3.32  3.26***   3.29    –    –      –     –         –       –    –      –     –         –
April      47   7.69   2.96  2.60**    2.77    –    –      –     –         –       –    –      –     –         –
May        47   7.44   3.22  2.31**    2.73    –    –      –     –         –       –    –      –     –         –
June       47   11.65  5.84  2.00**    3.41    –    –      –     –         –       –    –      –     –         –
July       47   11.98  3.43  3.49***   3.46    –    –      –     –         –       –    –      –     –         –
August     47   10.50  2.81  3.74***   3.24    –    –      –     –         –       –    –      –     –         –
September  47   9.53   5.31  1.79*     3.09    –    –      –     –         –       –    –      –     –         –
October    47   10.69  6.36  1.68*     3.27    –    –      –     –         –       –    –      –     –         –
November   47   7.00   2.57  2.72***   2.65    –    –      –     –         –       –    –      –     –         –
December   47   5.22   1.71  3.05***   2.29    –    –      –     –         –       –    –      –     –         –

D. April 1973–December 2006
January    33   23.08  5.69  4.06***   4.80    33   57.95  11.78 4.92***   7.61    33   58.16  15.85 3.67***   7.63
February   33   4.06   3.29  1.24      2.02    33   21.85  8.08  2.70***   4.67    33   16.91  12.62 1.34      4.11
March      33   6.68   2.46  2.72**    2.58    33   15.48  5.55  2.79***   3.93    33   17.89  10.53 1.70*     4.23
April      34   1.62   1.33  1.22      1.27    34   13.42  4.17  3.22***   3.66    34   15.20  5.44  2.79***   3.90
May        34   1.62   1.28  1.27      1.27    34   5.27   3.93  1.34      2.30    34   5.93   5.43  1.09      2.43
June       34   2.17   1.30  1.67      1.47    34   8.97   3.57  2.51**    2.99    34   9.19   4.21  2.18**    3.03
July       34   4.06   1.75  2.33**    2.02    34   12.69  3.13  4.05***   3.56    34   15.15  3.90  3.89***   3.89
August     34   1.96   1.70  1.15      1.40    34   9.31   3.10  3.01***   3.05    34   13.32  3.29  4.05***   3.65
September  34   1.38   2.13  0.65      1.18    34   12.24  5.53  2.21**    3.50    34   11.58  3.85  3.00***   3.40
October    34   5.41   2.61  2.07**    2.32    34   12.33  3.60  3.43***   3.51    34   17.14  7.01  2.44**    4.14
November   34   0.86   2.38  0.36      0.93    34   8.06   4.51  1.79*     2.84    34   5.44   5.81  0.94      2.33
December   34   3.59   2.56  1.40      1.90    34   7.94   6.13  1.29      2.82    34   12.75  8.96  1.42      3.57

This table lists the average bias in the monthly return on the traditional CRSP Equally-Weighted Index by calendar month. Bias is defined as RCRSP - RUnbiased, where the former is the return on the monthly CRSP index and the latter the return on the unbiased index. Biases are expressed in basis points (1 b.p. = 0.0001 or 0.01%). The standard error is expressed as a percentage of the true price at the beginning of the month. Biases in indexes for NYSE, Amex, and NASDAQ stocks are examined separately. For each period and market, we list the number of observations used and the mean bias, as well as the standard error and t-statistic. To allow for comparisons across markets, we partition the data into four time spans to take into account the differing start periods for CRSP indexes from the NYSE, Amex, and NASDAQ. Panel B compares the NYSE and Amex, while Panel D compares all three markets. Panels A and C list calendar month biases for the NYSE from the start of CRSP data until the beginning of CRSP data for the Amex and NASDAQ, respectively. Tests of significance are based on two-tailed tests

***, **, * Denote significant at the 0.01, 0.05 and the 0.10 level, respectively



discrete pricing, or slow adjustment of prices to new information. In this paper, we address

how to remove the bias in large equally-weighted portfolios such as the CRSP Equally-

Weighted Index. We develop a model of price formation that corrects for transient pricing

errors and slow adjustment to information.

We separately examine the two types of errors just mentioned. We first show that the

bias due to transient errors, consistent with previous studies, is due to errors in price at the

beginning of the holding period. We then show that the transient error bias in observed

portfolio returns can be asymptotically removed by taking the ratio of a two-period wealth

ratio (i.e., one plus the portfolio return) starting at time t - 2 to a one-period wealth

relative also starting at time t - 2. Since the bias in each relative is due to the average

transient pricing error at time t - 2, and both the numerator and denominator of the ratio

contain the same bias, the biases cancel out leaving an asymptotically unbiased portfolio

return.
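The cancellation argument can be illustrated with a small Monte Carlo sketch. It is not the simulation reported in Section 4: for simplicity every stock is given the same hypothetical true relatives, the only pricing error is a symmetric bid-ask bounce of 5% of price on either side, and all names are illustrative.

```python
import random

def simulate(n_stocks=200_000, half_spread=0.05, seed=1):
    """Return (naive relative, ratio-method relative, true relative) for month t."""
    rng = random.Random(seed)
    w1, w2 = 1.02, 1.01  # hypothetical true one-period relatives for t-1 and t
    naive = two_period = one_period = 0.0
    for _ in range(n_stocks):
        # Observed price = true price * (1 + e), with e = +/- half_spread.
        e0, e1, e2 = (rng.choice((-half_spread, half_spread)) for _ in range(3))
        p0, p1, p2 = (1 + e0), w1 * (1 + e1), w1 * w2 * (1 + e2)
        naive += p2 / p1        # observed one-period relative ending at t
        two_period += p2 / p0   # observed two-period relative starting at t-2
        one_period += p1 / p0   # observed one-period relative starting at t-2
    # Both sums divide by the t-2 price, so its error inflates them equally.
    return naive / n_stocks, two_period / one_period, w2

naive, ratio, true = simulate()
print(round(naive - true, 4), round(ratio - true, 4))
```

Under these assumptions the naive equally-weighted relative overstates the true relative by roughly the variance of the transient error (about 25 b.p. here), while the ratio of the two-period to the one-period relative is left with sampling noise only.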

We also examine the impact of slow adjustment of prices to new information on the

return on a large portfolio. We show that as long as the slow adjustment pricing error is

corrected by the end of the holding period, our method will not be impacted and will still

result in an asymptotically unbiased portfolio return. Given that previous studies have

found that prices adjust to new information within a month, the method presented here will

yield unbiased estimates of large portfolio returns for monthly or longer holding periods.

We test our method using a simulation that allows for errors in price due to bid-ask

bounce, a coarse pricing grid, and random errors. We compare portfolio returns generated

using: observed returns; the ratio method presented in this paper; quote mid-points; and

continuously compounded returns. The simulation is performed 1,000 times for 1,000 stock

portfolios. We find that the bias in observed returns increases linearly with proportional

spread width. We also find that our method results in the smallest proportional bias.

Assuming that our model is representative, this suggests that taking the quote midpoint will

not remove all of the bias in observed returns due to a discrete pricing grid.

We argue in the paper that the bias in observed returns, since it is always positive, is

cumulative in indexes. To determine the extent of the cumulative bias we examine the

commonly used CRSP Equally-Weighted Index over the period 1926 through 2006. We

create indexes equal to 100 on December 31, 1973 and then compare the CRSP index return

to our ratio method for each of the three markets covered by CRSP. We find that the bias is

indeed cumulative and results in large index errors over time. For example, upward bias in

observed NASDAQ returns results in a cumulative error of 90% in the ending index value

on December 31, 2006. After the bias is removed, the overall equally-weighted return on

Nasdaq stocks is found to be slightly lower than that of NYSE stocks.

Examining 5-year sub-periods, we find that the bias in observed monthly returns ranges

from 2.27 b.p. during the last half of the 1950s for NYSE stocks to over 49 b.p. for the first

half of the 1930s. During the period 1973–2006 (when all three markets are covered by

CRSP) we find that NYSE stock indexes contain the smallest monthly bias (4.66 b.p.)

while NASDAQ stock indexes contain the largest (16.45 b.p.). Finally, we find that although

January contains the largest amount of bias, virtually all months, and markets, contain

statistically significant amounts of bias.

Our findings suggest that the ratio method presented here should be used to estimate the

return on all equally-weighted indexes (or large portfolios) measured over a monthly

holding period in the United States. While this paper presents a good start to solving the

problem of upward bias in observed returns, there is much work yet to be done. For

example, we show that as long as prices adjust to new (independent) information by the

next price observation, our method is adequate in removing bias. Given that other studies



have shown that some stocks take up to 4 weeks to fully adjust to new information, it is not

clear how our method will perform for holding periods of less than a month. Another area

for future work is the relationship between the number of stocks in a portfolio and the

amount of bias removed: our method removes bias asymptotically, but we do not examine

how quickly the bias vanishes as the number of stocks in the portfolio grows.

The cumulative bias in equally-weighted returns over long periods of time is substantial

and can vary significantly from one market to another. For these reasons, it can affect

relative rankings of return performance across markets, and may have significant impli-

cations for risk/return comparisons of stock performance. Evaluations of stocks in thinly

traded and smaller markets, such as those in emerging markets, may be especially sensitive

to this bias. In addition, given that equal weighting is the method used in virtually all event

studies, it is clear that most studies report returns that are upward biased. If transient errors

increase around events then ‘‘abnormal’’ returns may merely reflect increased transient

errors. If this is the case, then the conclusions of many event studies need to be revisited.

Acknowledgments We thank Marshall Blume, Ivan Brick, Stephen Brown, Douglas Jones, Jay Ritter, Scott Linn, Michael Pagano, David C. Porter, Robert Stambaugh, Yusif Simaan, and David Whitcomb for their comments on earlier versions of this study. Fisher and Weaver thank the Whitcomb Center for Research in Financial Services for research support. Fisher also thanks the donors of the First Fidelity Bank Research Professorship of Finance.

Appendix 1

Proof that a log normally distributed error term is approximately equal to a binomially

distributed error term

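Let $Y_{it} = \log_e(1 + \varepsilon_{it})$. If $(1 + \varepsilon_{it})$ is log normally distributed, $Y_{it}$ is normally distributed

with mean $\mu_{it}$ and variance $\sigma^2(Y_{it})$. From Aitchison and Brown (1957, pp. 8–10),

$$E(1 + \varepsilon_{i,t-1}) = \exp\left[\mu_{it} + 0.5\,\sigma^2(Y_{it})\right] \qquad (A1)$$

and

$$E\left[\frac{1}{1 + \varepsilon_{i,t-1}}\right] = \exp\left[-\mu_{it} + 0.5\,\sigma^2(Y_{it})\right] \qquad (A2)$$

By assumption $E(1 + \varepsilon_{it}) = 1$, then since exp(0) = 1 it must be that

$$\mu_{it} + 0.5\,\sigma^2(Y_{it}) = 0 \qquad (A3)$$

Solving for $\mu_{it}$ and substituting the result into Eq. (A2) yields

$$E\left[\frac{1}{1 + \varepsilon_{i,t-1}}\right] = \exp\left[\sigma^2(Y_{it})\right] \qquad (A4)$$

Aitchison and Brown (1957, Eq. 2.9) state that the exponential function of the variance of

the transformed distribution (i.e., the right-hand side of Eq. (A4)) is equal to one plus the

squared coefficient of variation of the parent, $(1 + \varepsilon_{it})$, distribution or $e^{\sigma^2} = 1 + \eta^2$.

Therefore, Eq. (A4) can be rewritten as

$$E\left[\frac{1}{1 + \varepsilon_{i,t-1}}\right] = 1 + \frac{\sigma^2(1 + \varepsilon_{i,t-1})}{\left[E(1 + \varepsilon_{i,t-1})\right]^2} \qquad (A5)$$

Since $E(1 + \varepsilon_{i,t-1}) = 1$ and $\sigma^2(1 + \varepsilon_{i,t-1}) = \sigma^2(\varepsilon_{i,t-1})$, Eq. (A5) becomes

$$E\left[\frac{1}{1 + \varepsilon_{i,t-1}}\right] = 1 + \sigma^2(\varepsilon_{i,t-1}) \qquad (A6)$$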
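The result can be checked numerically for both error distributions named in the appendix title. The sketch below is an illustration under stated assumptions: the binomial case is taken to mean an error of plus or minus s with equal probability, the 1% error size is hypothetical, and the function names are not from the paper.

```python
import math

def inv_moment_lognormal(var_eps):
    """E[1/(1+e)] when (1+e) is log normal with mean 1 and variance var_eps."""
    sigma2_y = math.log(1.0 + var_eps)     # from exp(sigma_Y^2) = 1 + eta^2
    mu = -0.5 * sigma2_y                   # Eq. (A3)
    return math.exp(-mu + 0.5 * sigma2_y)  # Eq. (A2)

def inv_moment_binomial(s):
    """E[1/(1+e)] when e is +s or -s with equal probability (assumed form)."""
    return 0.5 * (1.0 / (1.0 + s) + 1.0 / (1.0 - s))

s = 0.01  # hypothetical transient error of 1% of price, so var(e) = s**2
print(inv_moment_lognormal(s * s), inv_moment_binomial(s), 1.0 + s * s)
```

In the log normal case the expectation equals one plus the error variance, as in Eq. (A6); in the binomial case it equals 1/(1 - s^2), which differs from 1 + s^2 only at order s^4.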

Appendix 2

Conditional one-period expected index relatives assuming lagged adjustments to new

information

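Recall Eq. (5) from the text:

$$P_{i,t} = \left[(1-\alpha_{it})P_{i,t-1} + \alpha_{it}P_{it}\right](1+\varepsilon_{it}) \tag{5}$$

where the left-hand side is the observed price, $P_{i,t-1}$ and $P_{it}$ on the right-hand side are true prices, and $\alpha_{i,t}$ is the adjustment coefficient, which shows the extent to which prices have adjusted to information released since $t-1$. If there is no lag process, $\alpha_{i,t} = 1$. For tractability we assume either that all prices fully adjust or that they do not adjust at all (that is, $\alpha_{i,t}$ is either 1 or 0).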

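| $\alpha_{i,t-1}$ | $\alpha_{i,t}$ | $P_{i,t-1}$ (observed) | $P_{i,t}$ (observed) | Conditional $E(W_t)$ |
|---|---|---|---|---|
| 0 | 0 | $P_{i,t-2}(1+\varepsilon_{i,t-1})$ | $P_{i,t-1}(1+\varepsilon_{it})$ | $W_{t-1}(1+\sigma^2(\varepsilon_{t-1}))$ |
| 0 | 1 | $P_{i,t-2}(1+\varepsilon_{i,t-1})$ | $P_{it}(1+\varepsilon_{it})$ | $W_{t-1}W_t(1+\sigma^2(\varepsilon_{t-1}))$ |
| 1 | 0 | $P_{i,t-1}(1+\varepsilon_{i,t-1})$ | $P_{i,t-1}(1+\varepsilon_{it})$ | $(1+\sigma^2(\varepsilon_{t-1}))$ |
| 1 | 1 | $P_{i,t-1}(1+\varepsilon_{i,t-1})$ | $P_{it}(1+\varepsilon_{it})$ | $W_t(1+\sigma^2(\varepsilon_{t-1}))$ |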

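Define $\theta$ as the probability that an adjustment coefficient equals 0, and let $\alpha$ (without subscripts) denote the complementary probability $(1-\theta)$. Weighting the four rows of the table by their probabilities gives

$$\begin{aligned} E(W_t) &\approx \left(1+\sigma^2(\varepsilon_{t-1})\right)\left[\theta^2 W_{t-1} + \alpha\theta W_{t-1}W_t + \alpha\theta + \alpha^2 W_t\right] \\ &\approx \left(1+\sigma^2(\varepsilon_{t-1})\right)\left(\alpha + \theta W_{t-1}\right)\left(\theta + \alpha W_t\right) \end{aligned} \tag{A7}$$

and since $W = (1+R)$, where $R$ is the return on the true aggregate portfolio, and $\alpha = 1-\theta$, Eq. (A7) can be rewritten as

$$E(W_t) \approx \left(1+\sigma^2(\varepsilon_{t-1})\right)\left[1-\theta+\theta(1+R_{t-1})\right]\left[\theta+(1-\theta)(1+R_t)\right] \tag{A8}$$

or

$$E(W_t) \approx \left(1+\sigma^2(\varepsilon_{t-1})\right)\left(1+\theta R_{t-1}\right)\left(1+\alpha R_t\right) \tag{18}$$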
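The step from the conditional table to Eq. (18) can be checked by brute force. The sketch below weights the four adjustment states by their probabilities and compares the sum with the factored form of Eq. (18). The function names and the sample inputs are hypothetical and serve only to illustrate the algebra.

```python
from itertools import product


def expected_relative_by_enumeration(theta, r_prev, r_curr, var_eps):
    """Sum the four (alpha_{t-1}, alpha_t) rows of the Appendix 2 table."""
    alpha = 1.0 - theta
    w_prev, w_curr = 1.0 + r_prev, 1.0 + r_curr   # true relatives W_{t-1} and W_t
    prob = {0: theta, 1: alpha}                   # probability of each coefficient value
    total = 0.0
    for a_prev, a_curr in product((0, 1), repeat=2):
        # An unadjusted price at t-1 leaves W_{t-1} inside the observed relative;
        # an adjusted price at t brings W_t into it.
        relative = (w_prev if a_prev == 0 else 1.0) * (w_curr if a_curr == 1 else 1.0)
        total += prob[a_prev] * prob[a_curr] * relative
    return (1.0 + var_eps) * total


def expected_relative_eq18(theta, r_prev, r_curr, var_eps):
    """Factored form, Eq. (18)."""
    alpha = 1.0 - theta
    return (1.0 + var_eps) * (1.0 + theta * r_prev) * (1.0 + alpha * r_curr)


if __name__ == "__main__":
    args = (0.3, 0.02, -0.01, 0.0004)  # theta, R_{t-1}, R_t, variance of the error (sample values)
    print(expected_relative_by_enumeration(*args), expected_relative_eq18(*args))
```

With $\theta = 0$ (no lag) both functions reduce to $(1+\sigma^2(\varepsilon_{t-1}))(1+R_t)$, so only the transient-error bias remains.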

Appendix 3

The ratio of a two-period expected index relative to a one-period expected index relative assuming lagged adjustment to new information

Following the methodology of ‘‘Appendix 2’’, finding the expected one-period index relative ending at time $t-1$ yields the following conditional table.



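| $\alpha_{i,t-2}$ | $\alpha_{i,t-1}$ | $P_{i,t-2}$ (observed) | $P_{i,t-1}$ (observed) | Conditional $E(W_{t-1})$ |
|---|---|---|---|---|
| 0 | 0 | $P_{i,t-3}(1+\varepsilon_{i,t-2})$ | $P_{i,t-2}(1+\varepsilon_{i,t-1})$ | $W_{t-2}(1+\sigma^2(\varepsilon_{t-2}))$ |
| 0 | 1 | $P_{i,t-3}(1+\varepsilon_{i,t-2})$ | $P_{i,t-1}(1+\varepsilon_{i,t-1})$ | $W_{t-2}W_{t-1}(1+\sigma^2(\varepsilon_{t-2}))$ |
| 1 | 0 | $P_{i,t-2}(1+\varepsilon_{i,t-2})$ | $P_{i,t-2}(1+\varepsilon_{i,t-1})$ | $(1+\sigma^2(\varepsilon_{t-2}))$ |
| 1 | 1 | $P_{i,t-2}(1+\varepsilon_{i,t-2})$ | $P_{i,t-1}(1+\varepsilon_{i,t-1})$ | $W_{t-1}(1+\sigma^2(\varepsilon_{t-2}))$ |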

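Also as in ‘‘Appendix 2’’, define $\theta$ as the probability that an adjustment coefficient equals 0 and $\alpha$ as $1-\theta$. Then

$$E(W_{t-1}) \approx \left(1+\sigma^2(\varepsilon_{t-2})\right)\left[\theta^2 W_{t-2} + \alpha\theta W_{t-2}W_{t-1} + \alpha\theta + \alpha^2 W_{t-1}\right] \tag{A9}$$

$$\approx \left(1+\sigma^2(\varepsilon_{t-2})\right)\left(\alpha + \theta W_{t-2}\right)\left(\theta + \alpha W_{t-1}\right) \tag{A10}$$

and once again, since $W = (1+R)$, where $R$ is the return on the true aggregate portfolio, and $\alpha = 1-\theta$, Eq. (A10) can be rewritten as

$$E(W_{t-1}) \approx \left(1+\sigma^2(\varepsilon_{t-2})\right)\left[1-\theta+\theta(1+R_{t-2})\right]\left[\theta+(1-\theta)(1+R_{t-1})\right] \tag{A11}$$

or

$$E(W_{t-1}) \approx \left(1+\sigma^2(\varepsilon_{t-2})\right)\left(1+\theta R_{t-2}\right)\left(1+\alpha R_{t-1}\right) \tag{A12}$$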

Similarly, the following table gives the two-period aggregate wealth relative.

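| $\alpha_{i,t-2}$ | $\alpha_{it}$ | $P_{i,t-2}$ (observed) | $P_{it}$ (observed) | Conditional $E({}_2W_t)$ |
|---|---|---|---|---|
| 0 | 0 | $P_{i,t-3}(1+\varepsilon_{i,t-2})$ | $P_{i,t-1}(1+\varepsilon_{it})$ | $W_{t-2}W_{t-1}(1+\sigma^2(\varepsilon_{t-2}))$ |
| 0 | 1 | $P_{i,t-3}(1+\varepsilon_{i,t-2})$ | $P_{it}(1+\varepsilon_{it})$ | $W_{t-2}W_{t-1}W_t(1+\sigma^2(\varepsilon_{t-2}))$ |
| 1 | 0 | $P_{i,t-2}(1+\varepsilon_{i,t-2})$ | $P_{i,t-1}(1+\varepsilon_{it})$ | $W_{t-1}(1+\sigma^2(\varepsilon_{t-2}))$ |
| 1 | 1 | $P_{i,t-2}(1+\varepsilon_{i,t-2})$ | $P_{it}(1+\varepsilon_{it})$ | $W_{t-1}W_t(1+\sigma^2(\varepsilon_{t-2}))$ |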

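Defining $\theta$ as the probability that an adjustment coefficient equals 0 and $\alpha$ as $(1-\theta)$, then

$$E({}_2W_t) \approx \left(1+\sigma^2(\varepsilon_{t-2})\right)\left[\theta^2 W_{t-2}W_{t-1} + \alpha\theta W_{t-2}W_{t-1}W_t + \alpha\theta W_{t-1} + \alpha^2 W_{t-1}W_t\right] \tag{A13}$$

and in return form

$$E({}_2W_t) \approx \left(1+\sigma^2(\varepsilon_{t-2})\right)\left[(1+R_{t-1})(1+\theta R_{t-2})(1+\alpha R_t)\right] \tag{A14}$$

then the ratio of (A14) to (A12) is

$$\frac{E({}_2W_t)}{E(W_{t-1})} = \frac{\left(1+\sigma^2(\varepsilon_{t-2})\right)\left[(1+R_{t-1})(1+\theta R_{t-2})(1+\alpha R_t)\right]}{\left(1+\sigma^2(\varepsilon_{t-2})\right)(1+\theta R_{t-2})(1+\alpha R_{t-1})} \tag{A15}$$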

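Since $\left\{1+\sigma^2(\varepsilon_{i,t-2})\right\}$ is in both the numerator and the denominator, it cancels and Jensen’s inequality does not apply; thus

$$E\left[\frac{{}_2W_t}{W_{t-1}}\right] \approx \frac{(1+R_{t-1})(1+\alpha R_t)}{(1+\alpha R_{t-1})} \tag{A16}$$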

Therefore, in the presence of a lagged adjustment process longer than the holding period

under consideration, our method will remove all of the random transient error bias, but not

all of the lagged adjustment process bias.
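The conclusion can be illustrated numerically. The sketch below evaluates Eqs. (A12) and (A14) and takes their ratio as in Eq. (A15), showing that the transient-error variance cancels while a lag-related factor remains whenever $\theta > 0$. The function names and the sample returns are assumptions chosen for illustration only.

```python
def one_period_relative(theta, r_tm2, r_tm1, var_eps):
    """Eq. (A12): expected one-period index relative ending at t-1."""
    alpha = 1.0 - theta
    return (1.0 + var_eps) * (1.0 + theta * r_tm2) * (1.0 + alpha * r_tm1)


def two_period_relative(theta, r_tm2, r_tm1, r_t, var_eps):
    """Eq. (A14): expected two-period index relative ending at t."""
    alpha = 1.0 - theta
    return (1.0 + var_eps) * (1.0 + r_tm1) * (1.0 + theta * r_tm2) * (1.0 + alpha * r_t)


def ratio_estimate(theta, r_tm2, r_tm1, r_t, var_eps):
    """Eq. (A15): ratio of the two-period to the one-period expectation."""
    return (two_period_relative(theta, r_tm2, r_tm1, r_t, var_eps)
            / one_period_relative(theta, r_tm2, r_tm1, var_eps))


if __name__ == "__main__":
    returns = (0.01, 0.02, 0.03)  # sample true returns R_{t-2}, R_{t-1}, R_t
    for theta in (0.0, 0.3):
        for var_eps in (0.0, 0.01):
            print(theta, var_eps, ratio_estimate(theta, *returns, var_eps))
```

For a given $\theta$ the printed ratio does not depend on the error variance. With $\theta = 0$ it equals the true relative $1+R_t$; with $\theta > 0$ it equals the right-hand side of Eq. (A16), which departs from $1+R_t$ unless $R_{t-1} = R_t$.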

References

Aitchison J, Brown JAC (1957) The lognormal distribution, with special reference to its uses in economics. University Press, Cambridge

Bessembinder H (2003) Trade execution costs and market quality after decimalization. J Financ Quant Anal 38:747–777

Bessembinder H, Kalcheva I (2007) Liquidity biases in asset pricing tests. Working Paper, David Eccles School of Business, University of Utah

Blume ME, Stambaugh RF (1983) Biases in computed returns: an application to the size effect. J Financ Econ 12:387–404

Canina L, Michaely R, Thaler R, Womack K (1998) Caveat compounder: a warning about using the daily CRSP equal-weighted index to compute long-run excess returns. J Financ 53(1):403–416

Cohen KJ, Fitch BP (1966) The average investment performance index. Manage Sci 12(6):B195–B215 (Series B, Managerial)

Conrad J, Kaul G (1993) Long-term market overreaction or biases in computed returns. J Financ 48(1):39–63

Cootner PH (1962) Stock prices: random versus systematic changes. Ind Manag Rev 3:24–45

Fama EF (1991) Efficient capital markets: II. J Financ 46:1575–1617

Fama EF, Fisher L, Jensen MC, Roll R (1969) The adjustment of stock prices to new information. Int Econ Rev 10(1):1–21

Ferson WE, Korajczyk RA (1995) Do arbitrage pricing models explain the predictability of stock returns? J Bus 68:309–349

Fisher L (1966) Some new stock-market indexes. J Bus 39:191–225

Fisher L, Lorie JH (1964) Rates of return on investments in common stock. J Bus 37:1–21

Fisher L, Lorie JH (1968) Rates of return on investments in common stock: the year-by-year record, 1926–65. J Bus 41:291–316

Fisher L, Lorie JH (1977) A half century of returns on stocks and bonds: Rates of return on investments in common stocks and on U.S. Treasury securities, 1926–1976. University of Chicago Graduate School of Business, Chicago

Hamza O, Kortas M, L’Her J-F, Roberge M (2006) International equity portfolios: selecting the right benchmark for emerging markets. Emerg Market Rev 7:111–128

Hou KW, Moskowitz T (2005) Market frictions, price delay, and the cross-section of expected returns. Rev Financ Stud 18(3):981–1020

Jones CM, Lipson ML (2001) Sixteenths: direct evidence on institutional execution costs. J Financ Econ 59:253–278

Keim D (1983) Size related anomalies and stock market seasonality: further empirical evidence. J Financ Econ 12:13–32

Macaulay FR (1938) Some theoretical problems suggested by the movements of interest rates, bond yields, and stock prices in the United States since 1856. National Bureau of Economic Research, New York

Niederhoffer V, Osborne MFM (1966) Market making and reversal on the stock exchange. J Am Stat Assoc 61:897–916

Reinganum MR (1982) A direct test of Roll’s conjecture on the firm size effect. J Financ 37:27–35

Roll R (1983) On computing mean returns and the small firm premium. J Financ Econ 12:371–386

Roll R (1984) A simple measure of the effective bid-ask spread in an efficient market. J Financ 39:1127–1139

Samuelson PA (1965) Proof that properly anticipated prices fluctuate randomly. Ind Manag Rev 6:41–49

Weaver DG (1991) Sources of short-term errors in the relative prices of common stocks. Ph.D Dissertation, Rutgers University

