
Electronic copy available at: http://ssrn.com/abstract=2448648

    THERE IS NO CHAOS IN STOCK MARKETS

    JAMAL MUNSHI

    ABSTRACT: The elegant simplicity of the Efficient Market Hypothesis (EMH) is its greatest weakness because human nature

    demands complicated answers to important questions and Chaos Theory readily fills that demand for complexity claiming that it

    reveals the hidden structure in stock prices. In this paper we take a close look at the Rescaled Range Analysis tool of chaos

    theorists and show that their findings are undermined by weaknesses in their methods1.

    1. INTRODUCTION

    Stock market data have thwarted decades of effort by mathematicians and statisticians to discover their

    hidden pattern. Simple time series analyses such as AR, MA, ARMA, and ARIMA2 were eventually

    replaced with more sophisticated instruments of torture such as spectral analysis but the data refused to

    confess. The failure to discover the structure in price movements convinced many researchers that the

    movements were random. The random walk hypothesis (RWH) of Osborne and others (Osborne, 1959)

    (Bachelier, 1900) was developed into the efficient market hypothesis (EMH) by Eugene Fama (Fama,

    1965) (Fama, Efficient Capital Markets: A Review of Theory and Empirical Work, 1970), which serves

    as a foundational principle of finance.

The `weak form' of the EMH implies that movements in stock returns are random events independent of

    historical values. The rationale is that prices contain all publicly available information and if patterns did

    exist, arbitrageurs would take advantage of them and thereby quickly eliminate them. Both the RWH and

    the EMH came under immediate attack from technical analysts and this attack continues to this day partly

    because the statistics used in tests of the EMH are controversial. The null hypothesis states that the market

    is efficient. The test then consists of presenting convincing evidence that it is not. The tests usually fail.

Many argue that the failure of these tests represents a Type II error, that is, a failure to detect a real effect

    because of low power of the statistical test employed. It is therefore logical to conjecture that the reason

    for the failure of statistics to reject the EMH is not the strength of the theory but the weakness of the

    statistics. If that is the case, perhaps a different and more powerful mathematical device that allowed for

    more complexity might be successful in discovering the hidden structure of stock prices.

    In the early seventies, it appeared that Catastrophe Theory was just such a device (Zeeman, 1974)

(Zeeman, Catastrophe Theory, 1976). It had a seductive ability to mimic long bull market periods

    followed by catastrophic crashes. But it proved to be a mathematical artifact. Its properties could not be

    generalized. It yielded no secret structure or patterns in stock prices. The results of other non-EMH

    models such as the Rational Bubble theory (Diba, 1988) and the Fads Theory (Camerer, 1989) are equally

    unimpressive for the same reasons.

1 Date: June 2014

Key words and phrases: rescaled range analysis, fractal, chaos theory, statistics, Monte Carlo, stock returns
Author affiliation: Professor Emeritus, Sonoma State University, Rohnert Park, CA, 94928, [email protected]
2 Auto Regressive, Moving Average, Auto Regressive Moving Average, Auto Regressive Integrated Moving Average


    2. THEORY

    Many economists feel that the mathematics of time series implied by Chaos Theory (Mandelbrot B. ,

    1963) is a promising alternative. If time series data have a memory of the past and behave accordingly

    even to a small extent then much of what appears to be random behavior may turn out to be part of the

deterministic response of the system. Certain non-linear dynamical systems of equations can generate time

    series data that appear remarkably similar to the behavior of stock market prices.

    Research in this area of finance is motivated by the idea that by using new mathematical techniques

    hidden structures can be discovered in what appears to be a random time series. One technique, attributed

    to Lorenz (Lorenz, 1963), uses a plot of the data in phase space to detect patterns called strange attractors

    (Bradley, 2010) (Peters E. , 1991). Another method proposed by Takens (Takens, 1981) uses an algorithm

    to determine the `correlation dimension' of the data (Wikipedia, 2014) (Schouten, 1994). A low

    correlation dimension indicates a deterministic system. A high correlation dimension is indicative of

    randomness.

    The correlation dimension technique has yielded mixed results with stock data. Halbert White and others

    working with daily returns of IBM concluded that the correlation dimension was sufficiently high to

regard the time series as white noise (White, 1988) although Scheinkman et al (Scheinkman, 1989) claim
to have found a significant deterministic component in weekly returns.

    2.1 Rescaled range analysis. A third technique for discovering structure in time series data has been

    described by Mandelbrot (Mandelbrot, 1982), Hurst (Hurst, 1951), Feder (Feder, 1988), and most recently

    by Peters (Peters, 1991) (Peters E. , 1991) (Peters E. , 1994). Called `rescaled range analysis', or R/S, it is

    a test for randomness of a series not unlike the runs test. The test rests on the relationship that in a truly

random series, a selection of sub-samples of size ν taken sequentially without replacement from a large
sample of size N should produce a random sampling distribution with a standard deviation given by

Equation 1    σx = (σ/ν^0.5)*((N-ν)/(N-1))^0.5

Here σx is the standard deviation of the distribution of sample means obtained by drawing samples of size ν
sequentially and without replacement from a large sample of size N, and σ is the standard deviation of
the large sample, i.e., σx = σ when ν = 1. However, when the time series has runs, it can be shown that the
exponent of ν in the term ν^0.5 will differ from 0.5. The paper by Peters describes the following
relationship.

Equation 2    R/S = ν^H

where R is the range of the sequential running totals of the deviations of the sub-sample values from the
sub-sample mean3, S is the standard deviation of the sub-sample, and ν is the size of the sub-sample. The
`H' term is called the Hurst constant or the Hurst exponent. It serves as a measure of the fractal and
non-random nature of the time series. If the series is random H will have a value of 0.5 as shown in Equation
1, but if it has runs the H exponent will be different from 0.5.

    If there is a tendency for positive runs, that is, increases are more likely to be followed by increases and

    decreases are more likely to be followed by decreases, then H will be greater than 0.5 but less than 1.0.

    Values of H between 0 and 0.5 are indicative of negative runs, that is increases are more likely to be

    followed by decreases and vice versa. Hurst and Mandelbrot have found that many natural phenomena

    previously thought to be random have high H-values that are indicative of a serious departure from

    independence and randomness (Hurst, 1951).

    Once `H' is determined for a time series, the autocorrelation in the time series is computed as follows:

Equation 3    CN = 2^(2H-1) - 1

    CN is the correlation coefficient and its magnitude may be interpreted as the degree to which the elements

    of the time series are dependent on historical values. The interpretation of this coefficient used by Peters

    to challenge the EMH is that it represents the percentage of the variation in the time series that can be

    explained by historical data. The weak form of the EMH implies that this correlation is zero; i.e., that the

    observations are independent of each other. Therefore, evidence of such a correlation can be interpreted to

    mean that the weak form of the EMH does not hold.

    Peters (Peters E. , 1991) (Peters E. , 1994) studied monthly returns of the S&P500 index, 30-year

government T-bond, and the excess of stock returns over the bond returns and computed the R/S values

    for all three time series data for eleven sequential sub-sample sizes. He found very high values of H and

    CN and therefore rejected the EMH null hypothesis. His papers and books on the subject have generated a

    great deal of interest in R/S research with many papers reporting a serious departure from randomness in

    stock returns previously assumed to be random (Pallikari, 1999) (Bohdalova, 2010) (Jasic, 1998)

    (McKenzie, 1999) (Mahalingam, 2012). We now examine this methodology in some detail.

    3 The sum of these values is of course zero by definition but the intermediate values in the running sub-totals will

    have a range that is related to the tendency for the data to have positive or negative runs.

3. METHODOLOGY

An appropriate time series that is sufficiently long to facilitate sequential sub-sampling without
replacement is selected for R/S analysis. It is sampled in cycles: in each of many sub-sampling cycles,
sub-samples are taken sequentially and without replacement. The sub-sample sizes are
left to the discretion of the researcher. In the data shown in Table 1, for example, we find that the
researcher (Peters E. , 1991) has taken eleven cycles of sub-samples from a large sample of N=463
monthly returns. In the first sub-sampling cycle he selected one sample of ν=N=463. In the second cycle
he selected two samples of ν=230 returns sequentially and without replacement. Similarly, in the third
cycle he selected three sub-samples of ν=150 sequentially and without replacement; and so on, gradually
reducing the sample size until he selected 77 sub-samples of ν=6.

ν      Stocks   Bonds    Premium

463    31.877   45.05    27.977
230    22.081   21.587   18.806
150    16.795   15.72    15.161
116    12.247   12.805   11.275
75     12.182   10.248   11.626
52     10.121   9.29     8.79
36     7.689    7.711    7.014
25     6.296    5.449    4.958
18     4.454    4.193    4.444
13     3.58     4.471    3.549
6      2.168    2.11     2.209

Table 1 ν and R/S data from Peters

    For each of the many sub-samples he generated, he computed the sample mean, the deviation of each

    observation in that sub-sample from the sub-sample mean, the running sum of these deviations, the range4

    of the running sum of the deviations, and the standard deviation of the sub-sample. He then divided the

    range by the standard deviation of the sub-sample to obtain values shown in Table 1.

For ν=463 there was of course just one sample and the value of Range over Standard deviation (R/S) for
stocks for that single sample was found to be 31.877 as shown in Table 1. For ν=230, there were two sub-
samples and therefore two values of R/S. We don't know what they were but we know from Table 1 that
their average for stocks is 22.081. Similarly, there were three values of R/S for the three sub-samples of
size ν=150 but this information is not reported. Instead we are told that the average of the three R/S
values for stocks is 16.795. Thus for all sub-sample sizes shown in Table 1, the only R/S values reported

    and the only R/S values subjected to further analysis are the averages of the R/S values of all the sub-

    samples of a given size. The additional information contained in the original data, particularly with

4 Range = maximum minus minimum. The sum of all the deviations is of course zero but the running partial sums
may have a large range that deviates from that of a random series if there is a tendency for positive or negative persistence in the data.

reference to variance, is lost and gone forever. As we shall see later, this information loss is a serious

    weakness in R/S research methodology.
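The R/S computation described above can be sketched in a few lines of Python. This is a minimal illustration, not the author's Excel implementation; the function names are ours.

```python
# Minimal sketch of the rescaled range (R/S) statistic: the range of the
# running sum of deviations from the sub-sample mean, divided by the
# sub-sample standard deviation.
import numpy as np

def rescaled_range(x):
    """R/S for one sub-sample of returns."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()                 # deviations from the sub-sample mean
    running = np.cumsum(dev)           # running partial sums (last entry is ~0)
    R = running.max() - running.min()  # range = maximum minus minimum
    S = x.std()                        # standard deviation of the sub-sample
    return R / S

def average_rs(series, nu):
    """Average R/S over sequential, non-overlapping sub-samples of size nu."""
    series = np.asarray(series, dtype=float)
    k = len(series) // nu              # number of sub-samples in this cycle
    return np.mean([rescaled_range(series[i*nu:(i+1)*nu]) for i in range(k)])
```

As in Table 1, `average_rs` discards the spread of the individual R/S values and keeps only their mean for each sub-sample size, which is the information loss criticized in the text.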

Once all the values of ν and the average R/S for sub-samples of each size are tabulated as shown in Table

    1, the researcher is ready to estimate the value of H by utilizing Equation 2 which states the relationship

    between and R/S. To do that, the researcher first renders Equation 2 into linear form by taking the

    natural logarithm of both sides to yield

Equation 4    ln(R/S) = H*ln(ν)

Equation 4 implies that there is a linear relationship between ln(R/S) and ln(ν) and that the slope of this
line is H. To estimate the value of H the researcher takes the natural logarithms of the R/S values and of
the corresponding values of ν and then carries out OLS5 linear regression between the logarithms to

    estimate the slope using the regression model

    Equation 5 y=b0+b1x

If x is set to ln(ν) and y is set to ln(R/S), then, it is claimed, the regression coefficient b1 can be

    interpreted as our best unbiased estimate of the value of H. The regression results and the values of b1 are

    shown in Table 2. For example, the value of b1 for stocks is b1=0.611 and the researcher uses that

    information to conclude that his best unbiased estimate for H is H=0.611. He may decide that this value is

    rather high, much higher than H=0.5 for a random series, and conclude that the series is not random but

    contains a positive persistence. He can now use Equation 3 to put a value on the amount of persistence as

    CN=0.166, that is 16.6% of the value of the returns can be explained by the effect of past returns.
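The estimation step in Equations 3-5 can be sketched as follows, using the stocks column of Table 1. We use `np.polyfit` as one way to carry out the OLS fit; small numerical differences from Peters' published values are to be expected.

```python
# Sketch of the H-estimation step: fit Equation 5 to the logs of the Table 1
# data (stocks column), then convert H to CN with Equation 3.
import numpy as np

nu = np.array([463, 230, 150, 116, 75, 52, 36, 25, 18, 13, 6])
rs = np.array([31.877, 22.081, 16.795, 12.247, 12.182,
               10.121, 7.689, 6.296, 4.454, 3.58, 2.168])

b1, b0 = np.polyfit(np.log(nu), np.log(rs), 1)  # slope b1 estimates H
H = b1
CN = 2 ** (2 * H - 1) - 1                       # Equation 3

print(round(H, 3), round(CN, 3))  # close to Peters' H=0.611 and CN=0.166
```

Note that the slope is invariant under a change of logarithm base, but the intercept is not, so b0 here need not match the published b0 unless the same base is used.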

PETERS RESULTS

parameter                  stocks   bonds    premium
Regression results  b0     -0.103   -0.151   -0.185
                    b1      0.611    0.641    0.658
Model estimates     H       0.611    0.641    0.658
                    CN      0.166    0.216    0.245

Table 2 The findings as presented in the paper by Peters

There are two serious problems with this regression procedure, one of which we will address in this paper.

    First, we should take note that the regression algorithm minimizes the sum of squared errors in the natural

    logarithm of R/S. We have no assurance that the value of H at which the sum of squared errors in ln(R/S)

    is minimized is the same as the value of H at which the sum of squared errors in R/S is also minimized.

    We will examine this issue in a future paper. For now we turn to the more serious issue of b0, a value that

    is computed and presented but then forgotten and never interpreted. Some R/S researchers (Pallikari,

    1999) (Jasic, 1998) have attempted to acknowledge the existence of b0 by changing Equation 2 to the

    form

    5 Ordinary Least Squares

Equation 6    R/S = C*ν^H

Equation 7    C = exp(b0)

For example, in Table 2, the C and H values for stocks are C = e^(-0.103) = 0.902 and H = 0.611. So we get a sense

    that the value of b0 now has a place in R/S research but we still have no attempt by researchers to interpret

    C or to examine the relationship between C and H.

    In fact, there is an inverse relationship between C and H that works like a see-saw. Higher values of C are

    associated with lower values of H and lower values of C are associated with higher values of H. This

    relationship presents the second methodological problem for R/S research because it implies that the

    values of H and C must be evaluated together as a pair and not in isolation. Alternately, one could fix the

    regression intercept at zero where C=1 and compare H values directly at the cost of increasing the error

    sum of squares in the regression. We investigate these possibilities in the next section.

Yet a third issue in chaos research that must be investigated is that the way these studies are structured often leads
to a high probability of spurious findings. First, the α level of hypothesis tests is usually set to α=0.05. This
means that five percent of the time researchers will find an effect in random data. Recent studies of the
irreproducibility of results in the social sciences have led many to insist that this error level should be set to
α=0.001 (Johnson, 2013). In many R/S papers, the false positive error rate is further exacerbated by
multiple comparisons that are made without a Bonferroni adjustment of the α level (Mundfrom, 2006).
For example, if five comparisons are made at α=0.05, the probability of finding an effect in random data
rises to 1-0.95^5, or about 23% - an unacceptable rate of production of spurious and irreproducible results.
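The family-wise false-positive rate, and the effect of a Bonferroni adjustment, can be checked with the standard formula 1 - (1 - α)^k:

```python
# Family-wise error rate for k independent tests at level alpha,
# with and without the Bonferroni correction (alpha divided by k).
alpha, k = 0.05, 5

fwer = 1 - (1 - alpha) ** k           # P(at least one false positive)
fwer_bonf = 1 - (1 - alpha / k) ** k  # same, at the Bonferroni-adjusted level

print(round(fwer, 3), round(fwer_bonf, 3))  # 0.226 0.049
```

The correction restores the family-wise rate to approximately the nominal 5% level.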

    4. DATA ANALYSIS

    For a demonstration of R/S analysis and the methodological issues we raised, we have selected four time

    series of daily returns and four series of weekly returns each with N=2486. The stocks and stock indexes

selected for study are listed in Table 3. The number of cycles and the sub-sample sizes to use are
arbitrarily assigned as follows. The sub-samples are taken in six cycles. In the first cycle we take one
sample of ν=2486; in the second cycle, two samples of ν=1243 each; in the third cycle, four samples of
ν=621 each; in the fourth cycle, eight samples of ν=310 each; in the fifth cycle, sixteen samples of ν=155
each; and in the sixth cycle we take thirty-two samples of ν=77 each. In each cycle the samples are taken
sequentially and without replacement. We then compute the value of R/S for each sample of each series.

    These values are shown in Table 5. Average R/S values as they would appear in conventional research are

    shown for reference in Table 4. All data and Microsoft Excel computational files are available in the

    online data archive of this paper (Munshi, 2014)6.

    6 The use of the R/S analysis spreadsheet is explained in the Appendix

Symbol  Name                          From  To    Returns  N

IXIC    NASDAQ Composite Index        2004  2014  daily    2486
DJI     Dow Jones Industrial Average  2004  2014  daily    2486
BAC     Bank of America               2004  2014  daily    2486
CL      Colgate Palmolive             2004  2014  daily    2486
SPX     S&P500 Index                  1966  2014  weekly   2486
CAT     Caterpillar                   1966  2014  weekly   2486
BA      Boeing                        1966  2014  weekly   2486
IBM     IBM                           1966  2014  weekly   2486

Table 3 Returns series selected for R/S analysis

ν     IXIC    DJI     BAC     CL      SPX     CAT     BA      IBM

2486  67.103  71.259  72.107  41.150  65.899  42.414  62.702  61.596
1243  40.069  39.317  55.765  31.469  52.971  30.363  48.096  44.789
621   26.854  23.887  29.925  21.898  34.672  25.526  35.427  36.034
310   20.426  17.773  18.719  15.808  23.039  19.659  24.614  25.140
155   14.387  13.072  14.453  12.458  15.099  13.723  15.373  14.998
77    9.487   9.231   9.370   8.977   9.796   9.204   9.938   10.614

Table 4 The average value of R/S for each value of ν

By comparing the values of R/S in Table 4 with those in Table 5 for the same value of ν we can see that

    there is a great deal of variance among the actual observed R/S values and that this variance simply

    vanishes from the data when the average value is substituted for them. Of course, the reduction of

    variance thus achieved increases the precision of the regression with higher values for R-squared and

    correspondingly lower values for the variance of the residuals and the standard error of the regression

    coefficient b1. As a result, it appears that we have estimated H with a great deal of precision but this

    precision is illusory and fictional because it does not actually exist in the data.

Yet another problem created by throwing away the actual data and substituting one average value of R/S
to represent many observed values of R/S is that the R/S values at lower values of ν are not weighted
sufficiently in the least squares procedure and therefore the regression is unduly influenced by the large
samples. For example, the squared residuals of the R/S values for ν=155 carry a weight of 16 if the actual data
are used in the regression instead of a weight of 1 when one average value is substituted for sixteen
observations. The variance and weighting problems together act to corrupt the regression procedure.

We therefore choose to carry out our regression using the data in Table 5 instead of the average values in
Table 4. With Equation 6 as our empirical model, we take the logarithm of both sides to obtain the
regression model ln(R/S) = ln(C) + H*ln(ν). Linear regression is carried out between y=ln(R/S) and
x=ln(ν) and the regression coefficients b0 and b1 are used as our unbiased estimates of the parameters in
Equation 6 with H=b1 and C=exp(b0). These results are summarized in Table 6.

ν     IXIC    DJI     BAC     CL      SPX     CAT     BA      IBM

2486  67.103  71.259  72.107  41.150  65.899  42.414  62.702  61.596

1243  51.309  52.210  54.747  32.024  43.696  24.955  60.936  36.848
1243  28.829  26.424  56.783  30.915  62.245  35.772  35.255  52.730

621   29.682  21.223  26.771  22.764  34.683  20.575  50.970  37.105
621   31.785  30.288  35.840  25.693  30.024  22.941  33.671  37.393
621   20.237  23.359  27.944  21.509  37.810  24.660  23.726  46.462
621   25.713  20.679  29.143  17.628  36.170  33.931  33.343  23.175

310   21.686  16.900  16.022  19.321  29.664  19.673  31.186  30.035
310   25.180  19.334  15.103  12.126  26.607  19.483  23.536  25.724
310   21.297  17.508  16.620  17.864  20.039  15.674  25.636  27.780
310   21.462  18.272  24.181  17.273  20.073  18.064  26.447  21.216
310   19.619  18.104  19.288  18.850  18.165  18.820  23.861  28.885
310   17.099  19.098  16.541  12.516  23.090  17.210  14.981  25.668
310   22.256  18.486  22.588  14.318  20.320  19.411  26.802  17.154
310   14.810  14.482  19.414  14.195  26.351  28.940  24.465  24.660

155   15.702  13.432  10.524  15.382  14.605  9.877   20.571  17.603
155   20.579  15.052  15.727  12.499  17.725  12.534  14.393  17.697
155   15.521  9.978   15.379  9.674   17.187  14.446  15.825  13.806
155   15.114  11.873  14.014  12.337  17.924  17.099  16.399  21.653
155   13.939  16.481  10.149  9.048   14.473  14.299  15.193  15.273
155   15.930  12.707  10.778  14.061  18.773  14.238  18.211  16.065
155   14.180  11.235  15.423  14.286  9.640   11.549  14.829  14.978
155   11.935  13.367  15.740  13.984  16.650  11.034  16.629  16.202
155   13.455  12.881  14.667  14.152  14.948  12.421  12.580  15.857
155   16.258  13.275  18.580  12.453  18.088  11.937  11.986  12.682
155   12.544  12.515  17.654  10.921  11.005  11.785  13.044  13.058
155   10.086  10.684  9.266   10.798  12.600  15.035  13.145  14.135
155   16.071  14.108  21.575  9.445   14.079  15.665  17.146  14.022
155   17.333  14.939  11.894  12.730  12.496  13.889  13.500  11.632
155   9.892   13.125  13.289  11.921  20.200  21.845  16.938  16.054
155   11.649  13.497  16.583  15.644  11.188  11.925  15.576  9.257

77    8.392   9.484   8.906   11.395  13.576  8.834   14.273  10.773
77    9.855   9.161   7.118   7.119   11.781  7.296   9.586   10.067
77    12.330  11.570  11.016  9.100   14.479  10.615  11.301  14.476
77    12.180  9.469   8.440   9.271   11.659  11.430  10.346  13.900
77    7.868   8.495   12.870  7.219   7.190   8.695   8.927   8.236
77    12.964  10.496  10.287  7.777   13.160  9.894   10.802  9.957
77    9.299   7.426   8.161   8.562   8.972   9.907   10.170  11.608
77    7.807   8.526   11.315  12.522  8.748   11.213  13.913  12.331
77    10.151  9.622   6.730   10.405  7.598   7.069   8.428   9.733
77    9.101   7.086   6.298   8.707   12.652  9.828   10.719  15.051
77    9.589   9.441   10.574  6.744   11.315  8.451   14.611  10.573
77    9.083   8.250   8.774   7.840   10.865  7.108   8.418   11.742
77    10.850  10.348  10.946  7.576   9.079   8.228   7.268   12.453
77    10.262  8.191   7.890   11.614  6.673   10.719  9.732   11.451
77    9.654   9.064   7.986   7.782   12.759  7.743   13.020  12.755
77    9.553   9.052   9.836   8.747   10.282  8.121   7.375   13.922
77    9.967   11.382  9.209   8.527   11.010  9.737   8.988   12.289
77    7.454   6.692   9.604   10.500  7.189   8.491   11.645  12.574
77    10.740  11.077  11.248  8.565   8.270   9.268   8.933   8.091
77    7.553   9.874   6.334   11.670  10.510  8.201   7.447   9.600
77    8.465   8.939   9.684   8.718   8.634   10.098  6.343   10.708
77    9.621   11.993  8.712   8.649   9.350   6.315   6.836   8.260
77    9.003   9.181   8.425   9.078   5.357   7.916   12.648  8.333
77    8.555   7.333   8.455   8.891   9.187   8.003   10.457  8.487
77    6.842   8.966   8.569   8.604   12.017  13.898  9.902   10.859
77    10.029  10.065  10.610  8.742   9.000   9.711   7.341   8.653
77    12.157  5.701   11.106  7.067   6.761   9.779   9.961   8.834
77    8.790   11.964  8.901   7.535   8.546   8.878   11.736  8.119
77    9.993   6.883   8.849   8.204   12.201  13.071  9.974   11.741
77    9.360   8.788   14.334  8.490   9.188   7.354   7.478   9.111
77    6.806   11.101  8.534   10.629  7.687   10.189  8.959   7.085
77    9.310   9.787   10.127  11.013  7.781   8.480   10.470  7.884

Table 5 R/S values for all sub-samples taken

           IXIC    DJI     BAC     CL      SPX     CAT     BA      IBM

C=exp(b0)  0.9550  1.0012  0.7339  1.3406  0.6999  1.1850  0.7721  0.9346
H=b1       0.5282  0.5065  0.5810  0.4352  0.6030  0.4741  0.5872  0.5548
R2         0.8627  0.8639  0.8594  0.8382  0.8472  0.8211  0.8478  0.8395
σH         0.0270  0.0257  0.0301  0.0245  0.0328  0.0283  0.0319  0.0311

Table 6 Results of linear regression on the natural logarithms of the data in Table 5

In Table 6, the R2 value expresses the percentage of the total sum of squared deviations from the mean
that is explained by the regression. These values are quite a bit lower than the R2 values7 one usually
encounters in R/S research but they are probably more realistic. Similarly, the standard errors in the
estimation of H, listed as σH in Table 6, are higher but more reliable for the same reasons.

    If we scan Table 6 for H values that appear to be very different from H=0.5, CL and SPX stand out with

    CL showing a rather low value of H and SPX a somewhat high value. This is of course the line of

    reasoning conventionally taken. However, if we also look at the values of C we find something very

    interesting. Colgate Palmolive (CL), with a value of H unusually lower than H=0.5 shows a value of C

    that is unusually higher than C=1. In the same way, the S&P500 Index (SPX) with a value of H unusually

    higher than the neutral value of H=0.5 shows a value of C that is correspondingly lower than the neutral

    value of C=1. The DJIA Index (DJI) appears to be in the middle of these extremes with an H value very

    close to H=0.5 and a C value very close to C=1.

    In fact, if we compare all the H and C values we can discern the see-saw effect mentioned earlier. Values

    of H greater than 0.5 correspond with values of C less than 1.0 and values of H less than 0.5 correspond

    with values of C greater than 1.0. Therefore, to interpret the regression results in terms of persistence in

    the data and chaos in stock prices, we must first understand the relationship between C and H in a random

series that has no persistence, no memory, and no chaos. To do that we took forty random samples of size
n=2486 from a Gaussian distribution with μ=0 and σ=1 and computed the R/S values using the same sub-
sampling strategy that we used for the stock returns. Linear regression between ln(R/S) and ln(ν) was
carried out for each of the forty samples. The regression results are shown in Figure 1 and Table 7.

Figure 1 Observed relationship between C and H in random numbers [scatter of ln(H), vertical axis, against ln(C), horizontal axis, with the fitted line y = -0.4199x - 0.6572, R2 = 0.9717]

7 well above 0.95

          All data                  Averages8

Sample#   C       H      R-sqrd    C       H      R-sqrd

    14 0.629 0.617 0.816 0.631 0.622 0.989

    15 0.683 0.606 0.847 0.754 0.592 0.997

    32 0.684 0.601 0.853 0.588 0.630 0.961

    5 0.688 0.607 0.786 0.816 0.581 0.998

    17 0.758 0.588 0.845 0.732 0.599 0.997

    35 0.797 0.572 0.827 0.811 0.572 0.998

    37 0.799 0.571 0.781 0.716 0.595 0.987

    2 0.804 0.565 0.857 0.880 0.551 0.985

    16 0.805 0.579 0.804 0.751 0.597 0.997

    40 0.811 0.564 0.773 0.562 0.637 0.979

    19 0.814 0.566 0.818 1.205 0.499 0.980

    20 0.816 0.560 0.792 0.830 0.562 0.995

    28 0.843 0.556 0.833 0.612 0.618 0.961

    36 0.847 0.561 0.810 0.844 0.564 0.980

    9 0.881 0.550 0.782 0.847 0.561 0.980

    8 0.884 0.553 0.814 1.235 0.496 0.994

    21 0.884 0.532 0.783 0.654 0.591 0.976

    1 0.885 0.555 0.734 1.097 0.522 0.983

    10 0.899 0.549 0.784 1.357 0.479 0.980

    22 0.924 0.528 0.816 0.840 0.548 0.990

    12 0.924 0.531 0.814 0.705 0.585 0.981

    38 0.938 0.536 0.848 0.901 0.546 0.997

    30 0.939 0.528 0.785 0.910 0.539 0.999

    34 0.950 0.526 0.859 0.675 0.590 0.959

    27 0.960 0.531 0.738 1.089 0.515 0.995

    29 0.980 0.519 0.807 1.192 0.487 0.994

    23 0.989 0.520 0.790 1.094 0.505 0.992

    18 1.008 0.516 0.765 0.758 0.571 0.959

    33 1.030 0.521 0.817 1.364 0.472 0.990

    11 1.032 0.524 0.851 1.312 0.485 0.989

    25 1.058 0.502 0.855 0.879 0.538 0.973

    31 1.072 0.502 0.800 1.019 0.515 0.998

    7 1.073 0.496 0.757 1.508 0.438 0.975

    24 1.083 0.504 0.775 1.069 0.511 0.982

    3 1.098 0.506 0.773 1.644 0.437 0.973

    39 1.117 0.509 0.800 1.475 0.461 0.982

    4 1.149 0.480 0.761 1.525 0.432 0.981

    13 1.303 0.470 0.775 1.881 0.406 0.983

    6 1.462 0.442 0.798 2.120 0.376 0.975

    26 1.482 0.424 0.682 1.952 0.377 0.993

    mean 0.944 0.537 0.800 1.046 0.530 0.984

    stdev 0.188 0.042 0.038 0.394 0.068 0.011

    max 1.482 0.617 0.859 2.120 0.637 0.999

    min 0.629 0.424 0.682 0.562 0.376 0.959

    range 0.853 0.193 0.177 1.559 0.260 0.040

    Table 7 The values of C and H for random numbers

    8 The results for the regression of average values are shown only for comparison.

The inverse relationship between C and H in random numbers is evident in the sorted data in Table 7 and also graphically in Figure 1. The relationship between H and C for random numbers in our sub-sample structure may thus be estimated numerically with linear regression in a purely empirical way, as shown in Figure 1.
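The empirical fit of Figure 1 is an ordinary least squares regression of ln(H) on ln(C). As a sketch (not the author's workbook), the computation looks like this; only three (C, H) pairs from the "All data" columns of Table 7 are used here for brevity, whereas the paper fits all forty samples:

```python
import math

# Three (C, H) pairs taken from Table 7 (samples 14, 18, and 26)
pairs = [(0.629, 0.617), (1.008, 0.516), (1.482, 0.424)]
xs = [math.log(c) for c, _ in pairs]   # ln(C)
ys = [math.log(h) for _, h in pairs]   # ln(H)

# ordinary least squares: slope and intercept of ln(H) on ln(C)
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
intercept = my - slope * mx
print(slope, intercept)   # close to the -0.4199 and -0.6572 of Equation 8
```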

Equation 8: ln(H) = -0.6572 - 0.4199*ln(C)

It is noted that Equation 8 does not yield a value of H=0.5 when C=1, as we would expect according to Equation 2. The actual value we compute from Equation 8 is H(C=1) = e^(-0.6572) = 0.5183. We know that Equation 8 is not accurate, but it is a better alternative than ignoring the value of C. We can now use

    Equation 8 for hypothesis tests to determine if the observed values of H for the stock returns shown in

    Table 6 are different from the values of H we would observe in a random series for the same value of C.

    The hypothesis tests are shown in Table 8. The row marked H(C) in Table 8 refers to the value of H we

    would expect in a random series for the value of C in each column.
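A small sketch of this test in Python (not the author's spreadsheet): Equation 8 gives the benchmark H(C) that a random series with the same C would produce, and the difference between the observed H and H(C) is tested with a t-statistic using the standard error of H.

```python
import math

def expected_H(C):
    """Equation 8: ln(H) = -0.6572 - 0.4199*ln(C), solved for H."""
    return math.exp(-0.6572 - 0.4199 * math.log(C))

def t_value(H_obs, C_obs, se_H):
    """t-statistic for H_obs against the random-series benchmark H(C)."""
    return (H_obs - expected_H(C_obs)) / se_H

# SPX column of Table 8: C = 0.6999, H = 0.6030
print(round(expected_H(0.6999), 4))           # 0.6021, the H(C) row
print(round(0.6030 - expected_H(0.6999), 4))  # 0.0009, the Difference row
```

Applying `t_value` with the standard error of H from Table 6 yields the t-values and p-values in Table 8.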

    IXIC DJI BAC CL SPX CAT BA IBM

    C 0.9550 1.0012 0.7339 1.3406 0.6999 1.1850 0.7721 0.9346

    H 0.5282 0.5065 0.5810 0.4352 0.6030 0.4741 0.5872 0.5548

H(C) 0.5284 0.5180 0.5902 0.4583 0.6021 0.4826 0.5778 0.5332

    Difference 0.0002 0.0116 0.0092 0.0231 0.0009 0.0085 0.0094 0.0216

    t-value 0.008221 0.4500939 0.3054518 0.9421487 0.0287287 0.3006475 0.2957537 0.6942445

    p-value 0.9947665 0.7307531 0.8112735 0.5189576 0.9817158 0.8140747 0.8169358 0.6136656

    Table 8 Hypothesis test for H

It is easy to see in Table 8 that the values of H observed for the stock series are very similar to the values we would expect to see in a random series, and the t-test9 confirms this intuition. These results indicate that the R/S procedure carried out in the manner shown here does not show any evidence of persistence in the eight time series studied, which include both daily and weekly returns.

Yet another way to deal with C is simply to get rid of it and test Equation 2 directly by setting C=1. In the regression model for y=ln(R/S) against x=ln(τ) we force the y-intercept to be zero. That yields the regression equation y=b1x. The b0 term is gone, leaving us only with H=b1. The results are shown in Table 9. None of the H-values shows a large departure from H=0.5, in conformity with our findings in Table 8.
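A minimal sketch of this C=1 variant: least squares through the origin, where the slope b1 = Σxy/Σx² is taken directly as H.

```python
def hurst_through_origin(log_tau, log_rs):
    """Zero-intercept least squares: H = sum(x*y) / sum(x^2)."""
    sxy = sum(x * y for x, y in zip(log_tau, log_rs))
    sxx = sum(x * x for x in log_tau)
    return sxy / sxx

# Synthetic check: if ln(R/S) is exactly 0.5*ln(tau), H comes back as 0.5
print(hurst_through_origin([1.0, 2.0, 3.0], [0.5, 1.0, 1.5]))  # 0.5
```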

    IXIC DJI BAC CL SPX CAT BA IBM

    H 0.5192 0.5067 0.5205 0.4925 0.5332 0.5073 0.5366 0.5416

    R2 0.8624 0.8639 0.8498 0.8332 0.8355 0.8170 0.8413 0.8390

    Table 9 Regression with C=1

    9 Using the standard error of estimate for H in Table 6

5. CONCLUSIONS

    It is proposed that the findings of persistence and chaotic behavior in stock returns by the application of

    Rescaled Range Analysis are not real but artifacts of the methodology employed. The regression

    procedure normally employed yields values for H and C and the interpretation of H in isolation without

    consideration of the corresponding value of C can lead to spurious results. We also note that the practice

    of averaging sub-sample R/S values and then using that average as if it were an observation can also

introduce serious errors in the estimation of H and the standard error of H. Yet another weakness of conventional R/S research is the use of high values of α in hypothesis tests and the absence of Bonferroni corrections for multiple comparisons. Under these circumstances we see no evidence of non-randomness

    in R/S research. We conclude that stock returns are a random walk and that the weak form of the Efficient

    Market Hypothesis has not been proven wrong by the R/S methodology.

    6. REFERENCES

Bachelier, L. (1900). Théorie de la Spéculation. Annales Scientifiques de l'École Normale Supérieure.

    Bohdalova, M. a. (2010). Markets, Information and their Fractal Analysis. Retrieved 2014, from g-casa.com:

    http://www.g-casa.com/conferences/budapest/papers/Bohdalova.pdf

    Bradley, L. (2010). Strange attractors. Retrieved 2014, from Space telescope science institute:

    http://www.stsci.edu/~lbradley/seminar/attractors.html

    Camerer, C. (1989). Bubbles and Fads in Asset Prices. Journal of Economic Surveys , 3-41.

Chen, N. R. (1983). Economic forces and the stock market: testing the APT and alternate asset pricing theories. Working paper.

    Chen, N. (1983). Some empirical tests of the theory of arbitrage pricing. Journal of Finance , Dec p414.

    Diba, B. (1988). The Theory of Rational Bubbles in Stock Prices. The Economic Journal , vol 98 No. 392 p746-754.

    Dybvig, P. a. (1985). Yes, the APT is Testable. Journal of Finance .

Fama, E. (1970). Efficient Capital Markets: A Review of Theory and Empirical Work. Journal of Finance, 25(2): 383-417.

Fama, E. (1965). The Behavior of Stock Market Prices. Journal of Business, 38: 34-105.

    Feder, J. (1988). Fractals. NY: Plenum Press.

    Hurst, H. (1951). Long term storage capacity of reservoirs. Transactions of the American Society of Civil Engineers ,

    Vol. 116, p770.

Jasic, T. (1998). Testing of nonlinearity and deterministic chaos in monthly Japanese stock market returns. Zagreb

    International Review of Economics and Business , Vol. 1 No. 1 pp. 61-82.

Johnson, V. E. (2013, November). Revised Standards for Statistical Evidence. Retrieved December 2013, from

    Proceedings of the National Academy of Sciences: http://www.pnas.org/content/110/48/19313.full

Kryzanowski, L. S. (1994). Some tests of APT mispricing using mimicking portfolios. Financial Review, v29:2, p153.

Lorenz, E. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20(2): 130-141.

    Mahalingam, G. (2012). Persistence and long range dependence in Indian stock market returns. Retrieved 2014,

    from IJMBS: http://www.ijmbs.com/24/gayathri.pdf

    Mandelbrot, B. (1982). The Fractal Geometry of Nature. NY: Freeman.

Mandelbrot, B. (1963). The variation of certain speculative prices. Journal of Business, 36(4): 394-419.

    McKenzie, M. (1999). Non-periodic Australian stock market cycles. Retrieved 2014, from RMIT University:

    http://mams.rmit.edu.au/ztghsoxhhjw1.pdf

    Mundfrom, D. (2006). Bonferroni adjustments in tests for regression coefficients. Retrieved 2014, from University

    of Northern Colorado: http://mlrv.ua.edu/2006/Mundfrom-etal-MLRV-3.pdf

    Munshi, J. (2014). RS data archive. Retrieved 2014, from Dropbox:

    https://www.dropbox.com/sh/bu1mdjtg9mvlmfa/AACLwFys7FMblJzJPGandpjfa

    Osborne, M. (1959). Brownian motion in the stock market. Operations Research vol 7 , 145-173.

    Pallikari, F. a. (1999). A rescaled range analysis of random events. Journal of Scientific Exploration , Vol 13. No. 1

    pp. 25-40.

    Peters, E. (1991). A Chaotic Attractor for the S&P500. Financial Analysts Journal , Vol.47 No. 2 p55.

    Peters, E. (1991). Chaos and order in the capital markets : a new view of cycles, prices, and market volatility. NY:

    John Wiley and Sons.

Peters, E. (1994). Fractal market analysis. NY: John Wiley and Sons.

    Roll, R. (1977). A critique of the asset pricing theory's tests. Journal of Financial Economics , March, p129.

    Roll, R. a. (1980). An empirical investigation of the arbitrage pricing theory. Journal of Finance , Dec p1073.


    Ross, S. (1976). The arbitrage theory of capital pricing. Journal of Economic Theory , v13, p341, 1976.

    Scheinkman, J. (1989). Nonlinear dynamics and stock returns. Journal of Business , Vol. 62 p. 311.

    Schouten, J. (1994). Estimation of the dimension of a noisy attractor. Retrieved 2014, from Google Scholar:

    http://scholar.google.co.th/scholar_url?hl=en&q=http://repository.tudelft.nl/assets/uuid:bd5339a0-24e5-4362-

    b378-

    adc7c60cac99/aps_schouten_1994.pdf&sa=X&scisig=AAGBfm36mO9vIeHfx7XnQ_zLbcHlaxZD6Q&oi=scholarr&ei=

    mkRzU92RKYSskAXfz4D4BA&ved=0CCoQgAMoADAA


    Shanken, J. (1982). The Arbitrage Pricing Theory: Is it Testable? Journal of Finance , 1129-1140.

Sharpe, W. (1962). A simplified model for portfolio returns. Management Science, p277.

    Sharpe, W. (1964). Capital asset prices: a theory of market equilibrium under conditions of risk. Journal of Finance ,

    v19, p425.

Takens, F. (1981). Detecting strange attractors in turbulence. In D. A. Rand & L.-S. Young (Eds.), Dynamical Systems and Turbulence (pp. 366-381). Springer-Verlag.

    University of South Carolina. (2014). Multicollinearity and variance inflation factors. Retrieved 2014, from

    University of South Carolina: http://www.stat.sc.edu/~hansont/stat704/vif_704.pdf

    Virginia Tech. (2014). Methods for multiple linear regression analysis. Retrieved 2014, from vt.edu:

    http://scholar.lib.vt.edu/theses/available/etd-219182249741411/unrestricted/Apxd.pdf

White, H. (1988). Economic prediction using neural networks: the case of IBM daily stock returns. IEEE International Conference on Neural Networks, vol. 2, pp. 451-458.

    Wikipedia. (2014). Autoregressive Model. Retrieved 2014, from Wikipedia:

    http://en.wikipedia.org/wiki/Autoregressive_model

    Wikipedia. (2014). Correlation dimension. Retrieved 2014, from Wikipedia:

    http://en.wikipedia.org/wiki/Correlation_dimension

    Wikipedia. (2014). Roll's Critique. Retrieved 2014, from wikipedia: http://en.wikipedia.org/wiki/Roll's_critique

Zeeman, E. (1976). Catastrophe Theory. Scientific American, 65-83.

    Zeeman, E. (1974). On the unstable behavior of stock exchanges. Journal of Mathematical Economics , 39-49.

    APPENDIX

The Microsoft Excel file that computes R/S values is called "rescaled range analysis worksheet". Use Paste Values to place a column of returns data of sample size n=2486 starting in cell A6 of the worksheet called "RS Computation". Then go to the worksheet called "Regression". All the R/S values are there along with the regression results.