CAPM: Do you want fries with that?

Investment decision making ultimately comes down to questions of risk. How should risk be assessed? How much risk should we take to obtain a given return? What types of risk are rewarded and what types are not?

The capital asset pricing model (CAPM) is the standard model representing the relationship between risk and return. CAPM states that risk is measured by the variance of the returns, so that the expected return of an investment represents the reward, while the variance of returns is the risk. In this representation of reality, given two investments with the same expected return but different variances, an investor will always choose the investment with smaller variance. Similarly, given two investments with the same variance of returns but different expected returns, an investor will always choose the investment with higher expected return.

Under the CAPM model, all variance is risk, but not all risk is rewarded. For any asset, risk comes from two sources: effects that come from the specific actions of the asset manager (which affect only that asset), and marketwide movements (which affect all assets). Since marketwide effects will affect all assets, they cannot be diversified away. On the other hand, asset-specific components of risk will cancel out with each other if a large portfolio of assets is constructed, so under CAPM they are not rewarded. That is, under CAPM only variability related to market variability (the systematic risk or nondiversifiable risk) is rewarded.

Under CAPM, the expected return on an asset $R$ can be written as a function of the risk-free rate $R_f$ (the return on the riskless asset, which has no variance; this is typically taken to be a short- or long-term bond rate, such as the 3-month Treasury bill rate) and the expected return of the market $E(R_m)$:

$$E(R) = R_f + \beta (E[R_m] - R_f) = R_f (1 - \beta) + \beta E[R_m], \qquad (1)$$

where $\beta$ is the beta of the asset, the covariance of the asset's returns with the market returns divided by the market return variance. This function is called the security market line.

© 2011, Jeffrey S. Simonoff


    The beta of a security is of interest to an investor, as it measures the relative risk of the security compared with the market (a beta greater than one indicates a riskier-than-average security, while a beta less than one is consistent with a safer-than-average security). The beta can be estimated using a regression model relating stock returns to market returns,

    $$R_i = \beta_0 + \beta_1 R_{mi} + \varepsilon_i, \qquad (2)$$

    with $V(\varepsilon_i) = \sigma^2$. Comparing this regression equation to (1) shows that the estimate of the slope is an estimate of beta. The estimated constant term can be compared to $R_f(1 - \beta)$ to see how the stock performed relative to the prediction of performance using CAPM. (Technically, this is called the Sharpe-Lintner version of CAPM; the Black version replaces $R_f(1 - \beta)$ in equation (1) with $E(R_{0m})(1 - \beta)$, where $R_{0m}$ is the return on the so-called zero-beta portfolio, the portfolio that has the minimum variance of all portfolios uncorrelated with the market portfolio of assets.) The $R^2$ of the regression, which estimates the proportion of the variability in the security accounted for by the market, estimates the market (nondiversifiable) risk of the security.
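    As a rough illustration of the two descriptions of beta (covariance over market variance, and the slope of regression (2)), the sketch below uses only the first 13 months of the returns listed in this handout (8811 through 8911). The function names are illustrative choices, and with so few months the numbers will not match the full 89-month fit.

```python
# First 13 months (8811 through 8911) of the returns listed in this handout.
mcd = [-0.042501, 0.021505, -0.001347, 0.079096, -0.009143, 0.048028, 0.084656,
       0.016789, 0.017058, -0.016772, 0.011736, 0.018951, 0.040373]
mkt = [-0.020172, 0.028091, 0.037379, 0.041268, -0.017024, 0.024549, 0.052701,
       0.020975, 0.037563, 0.076022, 0.009559, -0.048047, -0.019689]


def mean(values):
    return sum(values) / len(values)


def beta_from_covariance(asset, market):
    """Beta as cov(asset, market) / var(market)."""
    ma, mm = mean(asset), mean(market)
    cov = sum((a - ma) * (m - mm) for a, m in zip(asset, market))
    var = sum((m - mm) ** 2 for m in market)
    return cov / var  # the 1/(n-1) factors cancel in the ratio


def ols_line(asset, market):
    """Intercept and slope from the least squares normal equations."""
    n = len(market)
    sx, sy = sum(market), sum(asset)
    sxx = sum(m * m for m in market)
    sxy = sum(m * a for a, m in zip(asset, market))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    intercept = (sy - slope * sx) / n
    return intercept, slope


def r_squared(asset, market):
    """Proportion of variability in the asset accounted for by the market."""
    b0, b1 = ols_line(asset, market)
    ma = mean(asset)
    sse = sum((a - (b0 + b1 * m)) ** 2 for a, m in zip(asset, market))
    sst = sum((a - ma) ** 2 for a in asset)
    return 1 - sse / sst


beta = beta_from_covariance(mcd, mkt)
intercept, slope = ols_line(mcd, mkt)
print(f"beta (cov/var) = {beta:.4f}, OLS slope = {slope:.4f}")
print(f"R-squared (13 months only) = {r_squared(mcd, mkt):.3f}")
```

    The two routes give the same number because the least squares slope in a one-predictor regression is exactly the sample covariance divided by the sample variance of the predictor.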

    The data examined here are the monthly returns for the McDonald's Food Corporation. The data cover November 1988 through March 1996, or 89 months. The market return is measured using the New York Stock Exchange Composite Index. Here are the values (dates are in YYMM form):

    Row  Date  McDonalds return  Market return

    1 8811 -0.042501 -0.020172

    2 8812 0.021505 0.028091

    3 8901 -0.001347 0.037379

    4 8902 0.079096 0.041268

    5 8903 -0.009143 -0.017024

    6 8904 0.048028 0.024549

    7 8905 0.084656 0.052701

    8 8906 0.016789 0.020975

    9 8907 0.017058 0.037563

    10 8908 -0.016772 0.076022

    11 8909 0.011736 0.009559

    12 8910 0.018951 -0.048047

    13 8911 0.040373 -0.019689

    14 8912 0.076532 0.025833

    15 9001 -0.054485 -0.028048

    16 9002 0.017563 -0.030466

    17 9003 -0.025920 0.034622

    18 9004 -0.005864 0.007355

    19 9005 0.021756 0.019589

    20 9006 0.098020 0.015266

    21 9007 0.010572 -0.013860

    22 9008 -0.197538 -0.088701

    23 9009 -0.069716 -0.050428

    24 9010 0.002323 -0.024987

    25 9011 0.056075 0.012492

    26 9012 0.039858 0.038468

    27 9101 -0.045752 0.006911

    28 9102 0.103667 0.115468

    29 9103 0.096968 0.007186

    30 9104 0.027639 0.005604

    31 9105 -0.025118 0.007187

    32 9106 -0.023000 0.018338

    33 9107 -0.002770 -0.008401

    34 9108 -0.020769 0.013011

    35 9109 0.010605 -0.004602

    36 9110 0.071503 0.021460

    37 9111 -0.016911 0.003192

    38 9112 0.022588 -0.029701

    39 9201 0.194721 0.095603

    40 9202 0.019247 0.006729

    41 9203 -0.034140 -0.002026

    42 9204 0.006742 -0.000769

    43 9205 0.063522 0.023114

    44 9206 0.028110 -0.014667

    45 9207 -0.009537 -0.005648

    46 9208 -0.055916 -0.011874

    47 9209 0.038035 -0.001818

    48 9210 -0.030287 -0.016921

    49 9211 0.087144 0.029031

    50 9212 0.046727 0.022980

    51 9301 0.000653 0.010092

    52 9302 0.020408 0.032486

    53 9303 0.050000 0.021991

    54 9304 -0.065486 0.007931

    55 9305 0.001916 0.000844

    56 9306 0.008910 -0.002222

    57 9307 -0.011977 0.010511

    58 9308 0.100776 0.026065

    59 9309 0.000000 0.003056

    60 9310 -0.001149 0.004092

    61 9311 0.039424 0.012023

    62 9312 0.030143 0.014189

    63 9401 -0.002184 0.026788

    64 9402 0.051041 0.000882

    65 9403 0.003107 -0.028628

    66 9404 -0.057672 -0.052835

    67 9405 0.041512 -0.001732

    68 9406 0.006816 0.007713

    69 9407 -0.042189 -0.009808

    70 9408 -0.075072 0.009913

    71 9409 0.025863 0.003630

    72 9410 0.004587 -0.017590

    73 9411 0.065059 -0.011684

    74 9412 -0.020339 -0.017653

    75 9501 0.021881 0.039433

    76 9502 0.132760 0.024697

    77 9503 0.047243 0.014850

    78 9504 0.007220 0.037510

    79 9505 0.044817 0.026078

    80 9506 0.033427 0.030091

    81 9507 0.013278 0.053668

    82 9508 -0.025370 -0.004127

    83 9509 0.052920 0.029195

    84 9510 0.012769 0.001223

    85 9511 0.085914 0.023503

    86 9512 0.045700 0.040412

    87 9601 0.016655 -0.005367

    88 9602 0.112645 0.050522

    89 9603 -0.000628 0.005572

    The use of monthly returns is quite typical in CAPM calculations, but the 7½-year time period is a bit longer than is typical (for example, Value Line and Standard & Poor's use five years of data, while Bloomberg uses two).

    CAPM implies a linear relationship between McDonald's returns and market returns, which looks reasonable here:

    [Figure: scatter plot of McDonalds return (vertical axis) versus Market return (horizontal axis).]

    There is one noteworthy month at the lower left, which is case 22 (August 1990). This was at the beginning of a recession, and while the market did poorly (a 9% drop), McDonald's did particularly poorly (a 20% drop). It's not too surprising that a company that specializes in fast food (hardly a staple item) would suffer in a recession, and McDonald's did; its long-term debt was $4.4 billion in 1990, its highest value ever up through early 1996.

    Here are the results of a regression fit.

    Regression Analysis

    The regression equation is

    McDonalds return = 0.00735 + 1.09 Market return

    Predictor Coef SE Coef T P

    Constant 0.007351 0.004641 1.58 0.117

    Market r 1.0893 0.1503 7.25 0.000

    S = 0.04171 R-Sq = 37.7% R-Sq(adj) = 36.9%

    Analysis of Variance

    Source DF SS MS F P

    Regression 1 0.091398 0.091398 52.55 0.000

    Error 87 0.151328 0.001739

    Total 88 0.242726

    The estimate of beta is 1.089; while this is greater than one (indicating a riskier-than-average stock), it is not significantly greater than one, as a t-test for the hypothesis $H_0: \beta_1 = 1$ is

    $$t = \frac{1.0893 - 1}{.1503} = .59.$$

    $R^2 = .377$, leaving 62.3% diversifiable risk. This value of market (nondiversifiable) risk is a bit higher than is typical for U.S. stocks, since market risk averages about 27.0% in the U.S. market (it averages about 35% for U.K. stocks, 45% for German stocks, and 60% for the Taiwanese stock market).
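    The arithmetic behind these statements can be checked directly from the reported regression output. This is only a small sketch; the helper name `t_statistic` is an illustrative choice, and the inputs are the coefficient, standard error, and R-Sq values printed above.

```python
def t_statistic(estimate, hypothesized, std_error):
    """t ratio for testing whether a coefficient equals a hypothesized value."""
    return (estimate - hypothesized) / std_error


# Slope and its standard error, copied from the OLS output above.
slope, se_slope = 1.0893, 0.1503

t_beta_zero = t_statistic(slope, 0.0, se_slope)  # the T column of the output
t_beta_one = t_statistic(slope, 1.0, se_slope)   # test of beta = 1

market_risk = 0.377                      # R-Sq from the output
diversifiable_risk = 1.0 - market_risk   # share not explained by the market

print(f"t for beta = 0: {t_beta_zero:.2f}")
print(f"t for beta = 1: {t_beta_one:.2f}")
print(f"diversifiable risk: {diversifiable_risk:.1%}")
```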

    Does the least squares model fit these data? Here are some regression diagnostics (SRES1 is the standardized residual, HI1 the leverage value, and COOK1 Cook's distance). Note that August 1990 is apparently an outlier / leverage / influential point; August 1989, February 1991, and January 1992 also show up as possibly problematic.

    Data Display

    Row Date SRES1 HI1 COOK1

    1 8811 -0.67611 0.022593 0.005283

    2 8812 -0.39749 0.015770 0.001266

    3 8901 -1.19775 0.021396 0.015683

    4 8902 0.65035 0.024418 0.005293

    5 8903 0.04970 0.020305 0.000026

    6 8904 0.33654 0.014214 0.000817

    7 8905 0.48577 0.035575 0.004352

    8 8906 -0.32366 0.012974 0.000688

    9 8907 -0.75655 0.021530 0.006297

    10 8908 -2.65715 0.068855 0.261047

    11 8909 -0.14534 0.011236 0.000120

    12 8910 1.57633 0.054091 0.071046

    13 8911 1.32082 0.022226 0.019828

    14 8912 0.99137 0.014740 0.007352

    15 9001 -0.76136 0.029448 0.008794

    16 9002 1.05761 0.031876 0.018414

    17 9003 -1.71889 0.019493 0.029369

    18 9004 -0.51187 0.011290 0.001496

    19 9005 -0.16732 0.012583 0.000178

    20 9006 1.78572 0.011682 0.018846

    21 9007 0.44332 0.018263 0.001828

    22 9008 -2.79304 0.136198 0.615005

    23 9009 -0.54669 0.057717 0.009153

    24 9010 0.53931 0.026592 0.003973

    25 9011 0.84681 0.011360 0.004120

    26 9012 -0.22787 0.022203 0.000590

    27 9101 -1.46207 0.011317 0.012234

    28 9102 -0.76968 0.157294 0.055288

    29 9103 1.97225 0.011300 0.022228

    30 9104 0.34204 0.011424 0.000676

    31 9105 -0.97171 0.011300 0.005396

    32 9106 -1.21419 0.012272 0.009159

    33 9107 -0.02342 0.015352 0.000004

    34 9108 -1.01992 0.011405 0.006000

    35 9109 0.19962 0.013783 0.000278

    36 9110 0.98415 0.013123 0.006440

    37 9111 -0.66903 0.011737 0.002658

    38 9112 1.15927 0.031091 0.021562

    39 9201 2.11255 0.107705 0.269347

    40 9202 0.11011 0.011329 0.000069

    41 9203 -0.94805 0.012932 0.005888

    42 9204 0.00553 0.012580 0.000000

    43 9205 0.74825 0.013676 0.003882

    44 9206 0.88922 0.018759 0.007558

    45 9207 -0.25925 0.014178 0.000483

    46 9208 -1.21729 0.017115 0.012901

    47 9209 0.78831 0.012871 0.004051

    48 9210 -0.46522 0.020234 0.002235

    49 9211 1.16446 0.016237 0.011190

    50 9212 0.34628 0.013629 0.000828

    51 9301 -0.42658 0.011242 0.001035

    52 9302 -0.54036 0.018153 0.002699

    53 9303 0.45122 0.013293 0.001371

    54 9304 -1.96467 0.011264 0.021987

    55 9305 -0.15330 0.012187 0.000145

    56 9306 0.09605 0.012991 0.000061

    57 9307 -0.74217 0.011252 0.003134

    58 9308 1.57097 0.014840 0.018588

    59 9309 -0.25758 0.011759 0.000395

    60 9310 -0.31250 0.011602 0.000573

    61 9311 0.45758 0.011325 0.001199

    62 9312 0.17691 0.011533 0.000183

    63 9401 -0.93543 0.015159 0.006735

    64 9402 1.03083 0.012179 0.006551

    65 9403 0.65592 0.030016 0.006657

    66 9404 -0.18482 0.061532 0.001120

    67 9405 0.86994 0.012846 0.004924

    68 9406 -0.21549 0.011273 0.000265

    69 9407 -0.93920 0.016029 0.007185

    70 9408 -2.24787 0.011239 0.028719

    71 9409 0.35112 0.011669 0.000728

    72 9410 0.39732 0.020697 0.001668

    73 9411 1.70344 0.017010 0.025107

    74 9412 -0.20497 0.020742 0.000445

    75 9501 -0.68952 0.022943 0.005582

    76 9502 2.37893 0.014272 0.040971

    77 9503 0.57198 0.011621 0.001923

    78 9504 -0.99361 0.021492 0.010842

    79 9505 0.21884 0.014845 0.000361

    80 9506 -0.16210 0.016792 0.000224

    81 9507 -1.28342 0.036674 0.031354

    82 9508 -0.68139 0.013613 0.003204

    83 9509 0.33280 0.016321 0.000919

    84 9510 0.09859 0.012105 0.000060

    85 9511 1.27870 0.013817 0.011454

    86 9512 -0.13765 0.023719 0.000230

    87 9601 0.36586 0.014069 0.000955

    88 9602 1.22558 0.033186 0.025779

    89 9603 -0.33879 0.011427 0.000663

    We could now try to address potential model violations relative to the OLS model. For example, August 1990 might be removed, and we would reanalyze without it. Rather than do that, however, I'd like to raise a different question: is August 1990 really unusual? It's further from the regression line than we would expect under OLS assumptions, but there is good reason to doubt one of those assumptions here: the assumption of constant variance of the errors. If August 1990 corresponds to an observation with inherently larger residual variance, then its observed McDonald's return might not be unusually low at all.

    Why might we expect nonconstant variance here? It comes from a crucial CAPM assumption: that the beta is constant over the entire 7½-year time period. This is unlikely to be true, as there is ample empirical evidence that betas change over time. If we fit a model with a constant beta to data consistent with changing beta, this will show up as nonconstant variance of a specific type.

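    Let's consider a simple example: say there are two possible beta values for a given month, $\beta_1 + c$ and $\beta_1 - c$ (obviously we could choose $\beta_1$ and $c$ to represent the two values this way). The true underlying regression relationships are

    $$R_i = \beta_0 + (\beta_1 + c) R_{mi} + \varepsilon_i \qquad (3a)$$

    with probability .5, and

    $$R_i = \beta_0 + (\beta_1 - c) R_{mi} + \varepsilon_i \qquad (3b)$$

    with probability .5. Under this model, we have

    $$E(R_i) = .5[\beta_0 + (\beta_1 + c) R_{mi}] + .5[\beta_0 + (\beta_1 - c) R_{mi}] = \beta_0 + \beta_1 R_{mi};$$

    that is, on average the asset returns satisfy the CAPM formula (2). However, what are the variances of the errors, $E[R_i - E(R_i) \mid R_{mi}]^2$? For group (3a), we have

    $$\begin{aligned}
    V(\varepsilon_i) &= E[R_i - E(R_i) \mid R_{mi}]^2 \\
    &= E[\beta_0 + (\beta_1 + c) R_{mi} + \varepsilon_i - \{\beta_0 + \beta_1 R_{mi}\}]^2 \\
    &= E[c R_{mi} + \varepsilon_i]^2 \\
    &= c^2 R_{mi}^2 + \sigma^2.
    \end{aligned}$$

    For group (3b), we have

    $$\begin{aligned}
    V(\varepsilon_i) &= E[R_i - E(R_i) \mid R_{mi}]^2 \\
    &= E[\beta_0 + (\beta_1 - c) R_{mi} + \varepsilon_i - \{\beta_0 + \beta_1 R_{mi}\}]^2 \\
    &= E[-c R_{mi} + \varepsilon_i]^2 \\
    &= c^2 R_{mi}^2 + \sigma^2.
    \end{aligned}$$

    In both cases the cross term $\pm 2 c R_{mi} E(\varepsilon_i)$ vanishes because $E(\varepsilon_i) = 0$. That is, if the true beta varies in this way, the variance of the errors is $\sigma^2 + c^2 R_{mi}^2$; we have heteroscedasticity, with the observed variance being a quadratic function of the market return.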
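    The two-beta argument can be checked by simulation. The sketch below is only illustrative: the values of beta1, c, and sigma are hypothetical choices (not estimates from the McDonald's data), and the function name is made up for this example. It draws returns from the mixture of (3a) and (3b) at a fixed market return and compares the average squared deviation from the constant-beta line with $c^2 R_{mi}^2 + \sigma^2$.

```python
import random


def error_variance_about_average_line(r_m, beta1, c, sigma, n=100_000, seed=1):
    """Average squared deviation from beta0 + beta1*r_m when the true beta
    is beta1 + c or beta1 - c, each with probability 0.5."""
    rng = random.Random(seed)
    beta0 = 0.0  # the intercept cancels, so its value does not matter
    total = 0.0
    for _ in range(n):
        beta = beta1 + c if rng.random() < 0.5 else beta1 - c
        r = beta0 + beta * r_m + rng.gauss(0.0, sigma)
        deviation = r - (beta0 + beta1 * r_m)
        total += deviation ** 2
    return total / n


# Hypothetical parameter values, chosen only to make the effect visible.
BETA1, C, SIGMA = 1.0, 0.5, 0.04

for r_m in (0.0, 0.05, 0.10):
    simulated = error_variance_about_average_line(r_m, BETA1, C, SIGMA)
    theory = C ** 2 * r_m ** 2 + SIGMA ** 2
    print(f"R_m = {r_m:5.2f}: simulated {simulated:.6f}, theory {theory:.6f}")
```

    The simulated variance grows with the square of the market return, which is the pattern the residual plot is meant to reveal.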

    We can look at a plot of the absolute residuals from the OLS fit versus the market

    return values to see if nonconstant variance of this form is indicated. Here is a plot, with

    a lowess curve superimposed. This curve is an example of what is called a nonparametric

    regression estimate. Basically, it puts a smooth curve through the data points to help

    suggest structure that might not otherwise show up very clearly (it does this by fitting

    straight lines locally, rather than one straight line globally). The quadratic form of the

    nonconstant variance is very obvious.

    [Figure: absolute residuals (vertical axis) versus Market return (horizontal axis), with a lowess curve superimposed.]

    A Levene's test clearly rejects constant variance in favor of a quadratic model for heteroscedasticity (see the appendix for discussion of how to identify and handle nonconstant variance that is related to a numerical predictor, rather than group membership):

    Regression Analysis

    The regression equation is

    Absolute residuals = 0.691 - 1.35 Market return + 119 Markretsquared

    Predictor Coef SE Coef T P

    Constant 0.69102 0.07033 9.83 0.000

    Market r -1.353 2.322 -0.58 0.562

    Markrets 118.77 34.53 3.44 0.001

    S = 0.5928 R-Sq = 12.7% R-Sq(adj) = 10.7%

    Analysis of Variance

    Source DF SS MS F P

    Regression 2 4.3986 2.1993 6.26 0.003

    Error 86 30.2192 0.3514

    Total 88 34.6178
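    The mechanics of this Levene-type regression can be sketched as follows. This is a sketch under stated assumptions, not the handout's computation: the data are synthetic (generated with error variance that grows with the squared predictor), the absolute residuals are taken to be absolute standardized residuals, and the function names are illustrative.

```python
import numpy as np


def standardized_residuals(X, y):
    """OLS residuals divided by their estimated standard deviations."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    leverage = np.diag(X @ np.linalg.inv(X.T @ X) @ X.T)
    s2 = resid @ resid / (len(y) - X.shape[1])
    return resid / np.sqrt(s2 * (1.0 - leverage))


def anova_f(X, y):
    """Coefficients, overall F statistic, and sums of squares for a
    regression whose design matrix X includes an intercept column."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    fitted = X @ coef
    ssr = np.sum((fitted - y.mean()) ** 2)
    sse = np.sum((y - fitted) ** 2)
    df_reg, df_err = X.shape[1] - 1, len(y) - X.shape[1]
    return coef, (ssr / df_reg) / (sse / df_err), ssr, sse


# Synthetic data (NOT the McDonald's data): error variance grows with x^2.
rng = np.random.default_rng(0)
n = 89
x = rng.normal(0.01, 0.035, n)
y = 0.007 + 1.0 * x + rng.normal(0.0, np.sqrt(0.03 ** 2 + (1.0 * x) ** 2))

# Step 1: OLS fit of y on x, keeping the absolute standardized residuals.
X1 = np.column_stack([np.ones(n), x])
abs_sres = np.abs(standardized_residuals(X1, y))

# Step 2: regress the absolute residuals on x and x^2; the overall F test
# of this regression plays the role of the Levene's test.
X2 = np.column_stack([np.ones(n), x, x ** 2])
coef, f_stat, ssr, sse = anova_f(X2, abs_sres)
print("coefficients:", np.round(coef, 3))
print(f"F statistic on (2, {n - 3}) degrees of freedom: {f_stat:.2f}")
```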

    Here is a regression to estimate the weights for a WLS fit:

    Regression Analysis

    The regression equation is

    lgsressq = - 1.51 + 1.28 Market return + 262 Markretsquared

    Predictor Coef SE Coef T P

    Constant -1.5086 0.2430 -6.21 0.000

    Market r 1.278 8.023 0.16 0.874

    Markrets 261.8 119.3 2.19 0.031

    S = 2.048 R-Sq = 6.6% R-Sq(adj) = 4.4%

    Analysis of Variance

    Source DF SS MS F P

    Regression 2 25.337 12.669 3.02 0.054

    Error 86 360.871 4.196

    Total 88 386.208

    The following plot illustrates the quadratic fit being used to estimate these weights:

    [Figure: Regression Plot of lgsressq (vertical axis) versus Market return (horizontal axis), with fitted curve Y = -1.50862 + 1.27803X + 261.782X**2, R-Sq = 0.066.]
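    To see what the fitted quadratic implies for the weights, the sketch below evaluates it at three months from the data table. It rests on two assumptions that the output above does not state explicitly: that lgsressq is the natural log of the squared standardized residuals, and that each weight is the reciprocal of the exponentiated fitted value. The function and dictionary names are illustrative.

```python
import math

# Coefficients of the lgsressq regression reported above.
B0, B1, B2 = -1.5086, 1.278, 261.8


def estimated_weight(market_return):
    """Assumed weight: 1 / exp(fitted log squared residual)."""
    log_variance = B0 + B1 * market_return + B2 * market_return ** 2
    return 1.0 / math.exp(log_variance)


# Market returns copied from the data table (dates in YYMM form).
months = {"9008": -0.088701, "9102": 0.115468, "9305": 0.000844}

for date, r_m in months.items():
    print(f"{date}: market return {r_m:9.6f}, weight {estimated_weight(r_m):.3f}")
```

    Under these assumptions, months with large absolute market returns (such as August 1990 and February 1991) receive much smaller weights than a quiet month such as May 1993.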

    Here is a WLS version of the CAPM fit:

    Regression Analysis

    Weighted analysis using weights in wt

    The regression equation is

    McDonalds return = 0.00956 + 0.961 Market return

    Predictor Coef SE Coef T P

    Constant 0.009556 0.004241 2.25 0.027

    Market r 0.9610 0.1904 5.05 0.000

    S = 0.07457 R-Sq = 22.6% R-Sq(adj) = 21.8%

    Analysis of Variance

    Source DF SS MS F P

    Regression 1 0.14168 0.14168 25.47 0.000

    Error 87 0.48401 0.00556

    Total 88 0.62569
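    For readers who want to see the mechanics of a weighted fit, here is a minimal sketch of weighted least squares with one predictor. It uses only the first 13 months of the data listed in this handout, and it assumes weights of the form 1/exp(fitted lgsressq), so the estimates will not match the 89-month output above. The function names are illustrative.

```python
import math

# First 13 months (8811 through 8911) of the returns listed in this handout.
mcd = [-0.042501, 0.021505, -0.001347, 0.079096, -0.009143, 0.048028, 0.084656,
       0.016789, 0.017058, -0.016772, 0.011736, 0.018951, 0.040373]
mkt = [-0.020172, 0.028091, 0.037379, 0.041268, -0.017024, 0.024549, 0.052701,
       0.020975, 0.037563, 0.076022, 0.009559, -0.048047, -0.019689]


def variance_weight(x, b0=-1.5086, b1=1.278, b2=261.8):
    """Assumed weight: reciprocal of exp(quadratic fit to lgsressq)."""
    return 1.0 / math.exp(b0 + b1 * x + b2 * x * x)


def wls_line(y, x, w):
    """Intercept and slope minimizing sum of w_i * (y_i - b0 - b1*x_i)^2."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


weights = [variance_weight(x) for x in mkt]
b0_ols, b1_ols = wls_line(mcd, mkt, [1.0] * len(mkt))  # equal weights = OLS
b0_wls, b1_wls = wls_line(mcd, mkt, weights)

print(f"OLS: intercept {b0_ols:.5f}, slope {b1_ols:.4f}")
print(f"WLS: intercept {b0_wls:.5f}, slope {b1_wls:.4f}")
```

    Months with large absolute market returns get small weights, so they pull the fitted line less than they do under ordinary least squares.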

    Things have changed a bit. The estimated beta for McDonald's is now less than one (although again, not significantly different from one). Note also that if this regression model were used to predict the McDonald's return from a given market return, the use of weights could change things dramatically. One would expect to find that a prediction interval from the WLS model would be narrower than one from the OLS model for a prediction for a small (close to zero) market return month, and wider for a prediction for a large (absolute) market return month, reflecting the inherent difference in variability off the regression line in these circumstances.

    August 1990 is no longer an outlier, since its high variability is accounted for by a

    small weight (that is, the assessment of the point as an outlier has changed because our

    model for the underlying variability of the observation has changed). Similarly, points

    previously flagged as potential leverage points are no longer assessed as problematic.

    Row Date SRES2 HI2 COOK2

    1 8811 -0.90899 0.0313846 0.0133862

    2 8812 -0.38421 0.0216368 0.0016323

    3 8901 -1.10109 0.0278343 0.0173564

    4 8902 0.67416 0.0301746 0.0070705

    5 8903 -0.06580 0.0278675 0.0000621

    6 8904 0.38974 0.0193488 0.0014985

    7 8905 0.47698 0.0348600 0.0041088

    8 8906 -0.34625 0.0172601 0.0010528

    9 8907 -0.67121 0.0279507 0.0064772

    10 8908 -1.28713 0.0313948 0.0268489

    11 8909 -0.19751 0.0133179 0.0002633

    12 8910 1.24446 0.0582867 0.0479275

    13 8911 1.38654 0.0308341 0.0305820

    14 8912 1.09424 0.0201590 0.0123170

    15 9001 -0.99147 0.0405856 0.0207919

    16 9002 0.98144 0.0433563 0.0218270

    17 9003 -1.66055 0.0260398 0.0368614

    18 9004 -0.63790 0.0132034 0.0027223

    19 9005 -0.17892 0.0165339 0.0002691

    20 9006 2.03582 0.0146603 0.0308323

    21 9007 0.40715 0.0245613 0.0020871

    22 9008 -1.33968 0.0398550 0.0372495

    23 9009 -0.67044 0.0592510 0.0141552

    24 9010 0.45649 0.0370008 0.0040032

    25 9011 0.96311 0.0138295 0.0065039

    26 9012 -0.15504 0.0285165 0.0003528

    27 9101 -1.75915 0.0132096 0.0207128

    28 9102 -0.07833 0.0096077 0.0000298

    29 9103 2.28448 0.0132046 0.0349175

    30 9104 0.36163 0.0132861 0.0008805

    31 9105 -1.17992 0.0132046 0.0093147

    32 9106 -1.36401 0.0159271 0.0150562

    33 9107 -0.12198 0.0196353 0.0001490

    34 9108 -1.19281 0.0139605 0.0100721

    35 9109 0.15737 0.0169432 0.0002134

    36 9110 1.10382 0.0175263 0.0108677

    37 9111 -0.84500 0.0136596 0.0049442

    38 9112 1.09992 0.0424880 0.0268421

    39 9201 0.76404 0.0201325 0.0059970

    40 9202 0.09159 0.0132150 0.0000562

    41 9203 -1.20053 0.0155118 0.0113544

    42 9204 -0.05963 0.0149358 0.0000270

    43 9205 0.83955 0.0184774 0.0066344

    44 9206 0.92534 0.0253776 0.0111478

    45 9207 -0.39286 0.0176171 0.0013839

    46 9208 -1.54221 0.0226415 0.0275492

    47 9209 0.86912 0.0154108 0.0059115

    48 9210 -0.66391 0.0277551 0.0062915

    49 9211 1.25946 0.0222654 0.0180612

    50 9212 0.39922 0.0183982 0.0014936

    51 9301 -0.52347 0.0133812 0.0018582

    52 9302 -0.50161 0.0246045 0.0031735

    53 9303 0.51414 0.0178250 0.0023986

    54 9304 -2.34118 0.0132102 0.0366878

    55 9305 -0.24256 0.0143164 0.0004273

    56 9306 0.04283 0.0156088 0.0000145

    57 9307 -0.88904 0.0134404 0.0053840

    58 9308 1.71500 0.0203083 0.0304847

    59 9309 -0.35750 0.0136898 0.0008870

    60 9310 -0.41817 0.0134847 0.0011951

    61 9311 0.51192 0.0137214 0.0018229

    62 9312 0.19265 0.0143000 0.0002692

    63 9401 -0.96648 0.0207773 0.0099098

    64 9402 1.16619 0.0143037 0.0098676

    65 9403 0.56110 0.0412568 0.0067741

    66 9404 -0.34721 0.0599191 0.0038420

    67 9405 0.96668 0.0153696 0.0072933

    68 9406 -0.28770 0.0132056 0.0005538

    69 9407 -1.21161 0.0207955 0.0155879

    70 9408 -2.65115 0.0133585 0.0475814

    71 9409 0.36648 0.0135692 0.0009238

    72 9410 0.33529 0.0284856 0.0016481

    73 9411 1.90440 0.0224654 0.0416743

    74 9412 -0.36314 0.0285548 0.0019381

    75 9501 -0.58862 0.0291046 0.0051932

    76 9502 2.60271 0.0194413 0.0671540

    77 9503 0.64719 0.0145156 0.0030847

    78 9504 -0.90140 0.0279175 0.0116676

    79 9505 0.26432 0.0203163 0.0007244

    80 9506 -0.12688 0.0229809 0.0001893

    81 9507 -0.92057 0.0350679 0.0153991

    82 9508 -0.89047 0.0166545 0.0067148

    83 9509 0.38746 0.0223760 0.0017181

    84 9510 0.05847 0.0141909 0.0000246

    85 9511 1.41815 0.0187098 0.0191727

    86 9512 -0.06134 0.0296836 0.0000575

    87 9601 0.35241 0.0174310 0.0011016

    88 9602 1.09682 0.0342752 0.0213484

    89 9603 -0.44263 0.0132891 0.0013193

    The Levene's test is no longer significant, which is consistent with the residual plots, which all look fine:

    The regression equation is

    absres = 0.816 - 0.61 Market return - 8.3 Marketsquared

    Predictor Coef SE Coef T P

    Constant 0.81588 0.07184 11.36 0.000

    Market r -0.607 2.371 -0.26 0.799

    Marketsq -8.31 35.27 -0.24 0.814

    S = 0.6055 R-Sq = 0.2% R-Sq(adj) = 0.0%

    Analysis of Variance

    Source DF SS MS F P

    Regression 2 0.0729 0.0365 0.10 0.905

    Residual Error 86 31.5284 0.3666

    Total 88 31.6013

    [Figure: SRES2 (vertical axis) versus Market return (horizontal axis).]

    [Figure: absres (vertical axis) versus Market return (horizontal axis).]

    [Figure: Residuals Versus the Order of the Data (response is McDonald); Standardized Residual versus Observation Order.]

    [Figure: Normal Probability Plot of the Residuals (response is McDonald); Standardized Residual versus Normal Score.]

    A new estimate of the market risk is based on squaring the correlation between the fits from this model and the observed McDonald's returns:

    Correlations (Pearson)

    Correlation of McDonalds return and FITS2 = 0.614

    That is, $R^2_{w1} = .614^2 = 37.7\%$ (the F-based $R^2$ measure is only $R^2_{w2} = 22.6\%$, reflecting that much of the apparent market risk is driven by months with high volatility). The riskless rate, as measured by the monthly equivalent rate for the first three-month Treasury bill auction for that month, averaged .0045 over this time period; comparing the observed constant term to $R_f(1 - \hat\beta_1)$ gives us an estimate of how McDonald's performed compared to what CAPM would have predicted for it. Here this equals

    $$.009556 - (.0045)(1 - .9610) = .009381.$$

    That is, McDonald's outperformed its CAPM prediction by 0.9381% per month, which converts to an 11.86% annual outperformance of its CAPM prediction [$(1.009381)^{12} = 1.1186$].
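    The outperformance calculation is simple enough to reproduce in a few lines. The sketch below uses the WLS intercept and slope and the average riskless rate quoted above; the function names are illustrative.

```python
def capm_outperformance(intercept, riskfree, beta):
    """Observed constant term minus the CAPM-implied constant R_f * (1 - beta)."""
    return intercept - riskfree * (1.0 - beta)


def annualize(monthly_rate):
    """Compound a monthly rate over 12 months."""
    return (1.0 + monthly_rate) ** 12 - 1.0


# WLS estimates and the average monthly riskless rate from the text.
monthly = capm_outperformance(intercept=0.009556, riskfree=0.0045, beta=0.9610)
annual = annualize(monthly)

print(f"monthly outperformance: {monthly:.6f}")
print(f"annualized outperformance: {annual:.2%}")
```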

    The value of beta reported by investment analysts is usually rounded off to the nearest .05. It is also usually shrunk towards one because of regression to the mean (that is, analysts believe that stocks with unusually high or low betas in the past will probably be less extreme in the future). So, given our WLS estimate of 0.961, we would probably report McDonald's beta as 1.00. In fact, at this time the Value Line Investment Survey reported a beta of 1.00 for McDonald's, so we're right in line with established opinion.

    One flaw in the previous analysis is that it is difficult to assess whether the observed unexpected performance (relative to CAPM) could just be due to random fluctuations; that is, is the 11.86% annual outperformance significantly different from zero? Also, the comparison of $\hat\beta_0$ to $R_f(1 - \hat\beta_1)$ assumes that the riskless rate is constant over the entire time period, which is not reasonable. We can correct these problems if we use a slightly different regression model to fit CAPM: one based on excess returns. Let's go back to the original formulation of the CAPM model, but represent it a little differently:

    $$E(R) = R_f + \beta (E[R_m] - R_f),$$

    $$E(R) - R_f = \beta (E[R_m] - R_f). \qquad (4)$$

    The values $E(R) - R_f$ and $E[R_m] - R_f$ are the expected excess returns of the asset and the market, respectively, over the riskless rate; that is, they represent the returns that can be expected to be gained beyond those that come with zero risk. A regression model based on (4),

    $$R_i - R_{fi} = \beta_0 + \beta_1 (R_{mi} - R_{fi}) + \varepsilon_i,$$

    where the target and predictor values are now excess returns, provides an alternative way to estimate beta (via the slope in the model). Further, by (4), CAPM implies that the expected excess return exactly equals beta times the market excess return, so $\hat\beta_0$ is an estimate of McDonald's performance relative to its predicted CAPM performance (sometimes called $\alpha$). A test of whether the observed performance is significantly above or below the expected performance is then just the usual t-test for the constant term equaling zero.
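    The link between the two formulations can be illustrated numerically. The sketch below makes a deliberate simplification: it holds the riskless rate constant at .0045 (the average quoted earlier), whereas the excess-return analysis in this handout uses a rate that changes from month to month. It also uses only the first 13 months of the listed data. Under a constant rate, the excess-return slope equals the raw-return slope, and the excess-return intercept equals the raw intercept minus $R_f(1 - \hat\beta_1)$.

```python
# First 13 months (8811 through 8911) of the returns listed in this handout.
mcd = [-0.042501, 0.021505, -0.001347, 0.079096, -0.009143, 0.048028, 0.084656,
       0.016789, 0.017058, -0.016772, 0.011736, 0.018951, 0.040373]
mkt = [-0.020172, 0.028091, 0.037379, 0.041268, -0.017024, 0.024549, 0.052701,
       0.020975, 0.037563, 0.076022, 0.009559, -0.048047, -0.019689]

RF = 0.0045  # simplification: constant monthly riskless rate


def ols_line(y, x):
    """Intercept and slope of the least squares line of y on x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Raw-return regression, as in model (2).
a0, a1 = ols_line(mcd, mkt)

# Excess-return regression, as in the model based on (4).
excess_mcd = [r - RF for r in mcd]
excess_mkt = [r - RF for r in mkt]
b0, b1 = ols_line(excess_mcd, excess_mkt)

print(f"raw:    intercept {a0:.6f}, slope {a1:.4f}")
print(f"excess: intercept {b0:.6f}, slope {b1:.4f}")
print(f"raw intercept - RF*(1 - slope) = {a0 - RF * (1 - a1):.6f}")
```

    With a riskless rate that varies by month the two fits no longer coincide exactly, which is why the excess-return estimates reported in this handout differ slightly from the raw-return ones.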

    Here is an OLS regression using excess returns:

    Regression Analysis

    The regression equation is

    McDonalds excess rate = 0.00773 + 1.09 Market excess rate

    Predictor Coef SE Coef T P

    Constant 0.007735 0.004481 1.73 0.088

    Market e 1.0931 0.1504 7.27 0.000

    S = 0.04170 R-Sq = 37.8% R-Sq(adj) = 37.1%

    Analysis of Variance

    Source DF SS MS F P

    Regression 1 0.091861 0.091861 52.83 0.000

    Error 87 0.151277 0.001739

    Total 88 0.243138

    The estimate of beta (1.093) is similar to that from the earlier OLS fit (1.089). The estimated outperformance of McDonald's from its CAPM prediction is .007735 (9.69% annualized), and it is not significantly different from zero at a .05 level (p = .088). Residual plots and a Levene's test (not given here) again indicate heteroscedasticity in the square of market return, with the weights estimated from the following regression:

    Regression Analysis

    The regression equation is

    lgsressq = - 1.51 + 3.75 Market excess rate + 266 Markexsq

    Predictor Coef SE Coef T P

    Constant -1.5100 0.2446 -6.17 0.000

    Market e 3.752 7.761 0.48 0.630

    Markexsq 266.3 120.0 2.22 0.029

    S = 2.088 R-Sq = 6.5% R-Sq(adj) = 4.4%

    Analysis of Variance

    Source DF SS MS F P

    Regression 2 26.280 13.140 3.01 0.054

    Error 86 375.025 4.361

    Total 88 401.305

    Here is the WLS fit:

    Regression Analysis

    Weighted analysis using weights in wt

    The regression equation is

    McDonalds excess rate = 0.00936 + 0.945 Market excess rate

    Predictor Coef SE Coef T P

    Constant 0.009357 0.004084 2.29 0.024

    Market e 0.9454 0.1913 4.94 0.000

    S = 0.07487 R-Sq = 21.9% R-Sq(adj) = 21.0%

    Analysis of Variance

    Source DF SS MS F P

    Regression 1 0.13692 0.13692 24.43 0.000

    Error 87 0.48765 0.00561

    Total 88 0.62457

    The estimated beta (.945) is similar to the earlier WLS estimated beta (.961). The
    estimated outperformance of McDonald's compared to CAPM is .009357 (11.82% annualized),
    very similar to the earlier WLS estimate of .009381. Note that from this model fit,
    however, we can establish that this outperformance is apparently significantly different
    from zero (p = .024), something that the other model fits could not do. That is, CAPM
    fails for McDonald's, in the sense that McDonald's performance is significantly better
    than CAPM predicts.
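The annualized figures quoted for the two intercepts are consistent with compounding the per-period excess return over 12 periods. The following is a minimal sketch of that arithmetic; the function name `annualize` is hypothetical, and the choice of 12 periods per year (monthly returns) is an assumption inferred from the quoted numbers rather than stated in this section.

```python
def annualize(per_period_return, periods_per_year=12):
    """Compound a per-period return over one year.

    periods_per_year=12 is an assumption (monthly data); it is the value
    that reproduces the annualized figures quoted in the text.
    """
    return (1.0 + per_period_return) ** periods_per_year - 1.0


alpha_first_fit = annualize(0.007735)  # intercept from the first fit discussed
alpha_wls_fit = annualize(0.009357)    # intercept from the WLS fit
print(f"{alpha_first_fit:.2%}  {alpha_wls_fit:.2%}")
```

Under that assumption the two values agree with the 9.69% and 11.82% reported above.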

    An interesting application of WLS in the CAPM context can be found in the paper
    "Outlier-Resistant Estimates of Beta" by R.D. Martin and T.T. Simin (Financial Analysts
    Journal, 59(5), 56-69 [2003]). In that paper the authors use WLS to construct an estimator
    of beta that is resistant to the long-tailed nature of stock returns by downweighting
    outlying observations in the regression.

    Appendix: WLS when the error variance is related to numerical predictors

    We have previously discussed how nonconstant variance related to group membership
    can be identified using Levene's test, and handled using weighted least squares with the
    weights for the members of each group being the inverse of the residual variance for that
    group. Another way to refer to nonconstant variance related to group membership is
    to say that nonconstant variance is related to the values of a predictor variable, where
    that predictor variable happens to be categorical. It is also possible (as was the case
    here) that the variance of the errors is related to a (potential) predictor variable that is
    numerical (in this case it was effectively related to two variables, Market return and
    Market return squared). Generalizing the Levene's test for this situation is straightforward;
    just construct a regression with the absolute residuals as the response and the potential
    numerical variable as a predictor. Note that this also can be combined with the situation
    with natural subgroups by running an ANCOVA model with the absolute residuals as the
    response and both the grouping variable(s) and the numerical variable(s) as predictors. It
    is important to remember that the response variable itself should never be used as a
    potential predictor for nonconstant variance, since the (potential) nonconstant variance
    is already reflected in that response.
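The generalized Levene's test described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the handout's Minitab procedure: the data are simulated, the helper `simple_ols` is hypothetical, there is a single numerical predictor, and raw rather than standardized residuals are used to keep the example short.

```python
import math
import random


def simple_ols(x, y):
    """Least squares fit of y on x; returns (intercept, slope, slope t-statistic)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope / se_slope


# Simulated data (illustrative only): the error s.d. grows with x.
random.seed(0)
x = [random.uniform(0.0, 1.0) for _ in range(300)]
y = [1.0 + 2.0 * xi + random.gauss(0.0, 0.2 + xi) for xi in x]

# Fit the regression of interest and take absolute residuals.
b0, b1, _ = simple_ols(x, y)
abs_resid = [abs(yi - b0 - b1 * xi) for xi, yi in zip(x, y)]

# Levene-type check: regress the absolute residuals on the candidate variable.
_, levene_slope, levene_t = simple_ols(x, abs_resid)
print(f"slope = {levene_slope:.3f}, t = {levene_t:.2f}")
```

A large t-statistic for the slope in the second regression suggests that the error variability changes with the numerical variable.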

    Constructing weights for WLS in this situation is more complicated. What is needed
    is a model for what the relationship between the variances and the numerical predictor
    actually looks like. An exponential/linear model for this relationship is often used, whose
    parameters can be estimated from the data (this model has the advantage that it can only
    produce positive values for the variances, which of course is consistent with the actual
    situation). The model for the variance of the ith error is

    $$\mathrm{var}(\varepsilon_i) = \sigma_i^2 = \sigma^2 \exp\Bigl(\sum_j \lambda_j z_{ij}\Bigr),$$

    where $z_{ij}$ is the value of the jth variance predictor for the ith case and $\sigma^2$ is an
    overall average variance of the errors. These z variables would presumably be the predictors
    that were used above for the Levene's test, and while they would typically be chosen from
    the same pool of potential predictors as those for the regression itself (what we typically
    call the x's), they don't have to be the same variables ($\sigma_i^2$ could be related to a
    variable that isn't related to E(y), and it could be unrelated to a variable that is).

    The problem with this formulation is that the $\lambda_j$ coefficients are unknown, and need
    to be estimated from the data. The key is to recognize that since $\sigma_i^2 = E(\varepsilon_i^2)$,
    by the model given above

    $$\log E(\varepsilon_i^2) = \log \sigma^2 + \sum_j \lambda_j z_{ij} \equiv \lambda_0 + \sum_j \lambda_j z_{ij}.$$

    That is, the logged expected squared errors follow a linear relationship with the z variables.
    This suggests that linear regression could be used to estimate the parameters, except that
    the expected squared errors are (of course) unknown. The trick is then to say that since
    the residuals are the best guesses we have for the errors, the squared residuals should be
    reasonable guesses for the expected squared errors, which means that the logged squared
    residuals can be used as a response in a regression to estimate the $\lambda$'s. The steps are
    thus as follows:

    (1) Create a variable that is the natural logarithm of the squares of the standardized
        residuals from the initial (unweighted) regression (LGSRESSQ, say). This variable can
        be formed in Minitab using the transformation Let LGSRESSQ = LN(SRES*SRES).

    (2) Perform a regression of LGSRESSQ on the variance predictor variables (the z variables),
        and record the fitted regression coefficients (don't worry about measures of fit for
        this regression).

    (3) Create a weight variable for use in the weighted least squares analysis. The weights
        are estimates of the inverse of the variance of the errors for each observation. They
        have the form WT = 1/exp(FITS1), where FITS1 is the variable with fitted values from
        the regression in step 2.

    (4) Perform a weighted least squares regression, specifying WT as the weighting variable.
        You should redo a Levene's test to make sure that the nonconstant variance has been
        corrected. Remember that all plots and tests must be based on the standardized
        residuals, not the ordinary residuals, since the attempts to address nonconstant
        variance are accounted for in the standardized residuals.
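The four steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the handout's Minitab session: the data are simulated, the helper `wls_line` is hypothetical, there is one regression predictor x, and the single variance predictor is taken to be z = x squared.

```python
import math
import random


def wls_line(x, y, w=None):
    """Weighted least squares fit of y on a single predictor x.

    Returns (intercept, slope, s, standardized residuals); w=None gives OLS.
    """
    n = len(x)
    w = [1.0] * n if w is None else w
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    s = math.sqrt(sum(wi * e * e for wi, e in zip(w, resid)) / (n - 2))
    # Leverage of each case in the weighted fit.
    lev = [wi * (1.0 / sw + (xi - xbar) ** 2 / sxx) for wi, xi in zip(w, x)]
    sres = [math.sqrt(wi) * e / (s * math.sqrt(1.0 - h))
            for wi, e, h in zip(w, resid, lev)]
    return intercept, slope, s, sres


# Simulated data (illustrative only): error variance = exp(-7 + 150 z), z = x^2.
random.seed(1)
x = [random.uniform(-0.15, 0.15) for _ in range(500)]
z = [xi ** 2 for xi in x]  # the variance predictor
y = [0.01 + 1.0 * xi + random.gauss(0.0, math.sqrt(math.exp(-7.0 + 150.0 * zi)))
     for xi, zi in zip(x, z)]

# Step 1: log of squared standardized residuals from the unweighted fit.
_, _, _, sres = wls_line(x, y)
lgsressq = [math.log(r * r) for r in sres]

# Step 2: regress LGSRESSQ on the variance predictor and keep the coefficients.
g0, g1, _, _ = wls_line(z, lgsressq)

# Step 3: weights are 1/exp(fitted values from step 2).
wt = [1.0 / math.exp(g0 + g1 * zi) for zi in z]

# Step 4: weighted least squares fit using those weights.
b0, b1, s, sres_wls = wls_line(x, y, wt)
print(f"variance model: {g0:.2f} + {g1:.1f} z;  WLS fit: {b0:.4f} + {b1:.3f} x")
```

The standardized residuals returned by the last call (`sres_wls`) are the ones that would be used for the follow-up plots and Levene's test in step 4.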

    Just as was the case when doing WLS based on a categorical predictor, the estimated
    variance of the error for any member of the population is $s^2/\mathrm{WT}_i$. The value of
    $\mathrm{WT}_i$ comes from the estimated regression function in step 2 above (which is why it
    is a good idea to write down that function). So, for example, for the CAPM data the
    function that defined the weights was

    $$\mathrm{WT} = 1/\exp(-1.51 + 1.278\,\text{Market return} + 261.8\,\text{Market return}^2).$$

    If a prediction for a new trading day for the McDonald's return was desired, and the market
    return on that day was .05 (for example), the weight associated with that day would be

    $$1/\exp(-1.51 + (1.278)(.05) + (261.8)(.05^2)) = 2.207.$$

    The estimated McDonald's return on that day, found by substituting .05 for the market
    return into the WLS model, would be .05761, while the estimated standard deviation of the
    error term for that day would be $s/\sqrt{\mathrm{WT}} = .07457/\sqrt{2.207} = .0502$, where the
    s value also comes from the WLS model. Note that this estimated standard deviation of the
    errors is larger than that from the OLS model (which was .04171), which reflects that a day
    with a market return of .05 will have higher than average variability. A rough prediction
    interval for the McDonald's return on that day is thus $.05761 \pm (2)(.0502)$, or
    (-.0428, .158).

    The exact prediction interval that comes out of Minitab requires more work. Here is
    the output that comes out if confidence and prediction intervals are requested for a value
    of market return equal to .05:

    * WARNING * The prediction interval output assumes a weight of 1. An
                adjustment must be made if a weight other than 1 is used.

    Predicted Values for New Observations

    New
    Obs      Fit   SE Fit        95% CI                95% PI
      1  0.05761  0.00928  (0.03917, 0.07605)  (-0.09176, 0.20698)

    Values of Predictors for New Observations

    New  Market
    Obs  return
      1  0.0500

    Note that Minitab provides a warning that the prediction interval is incorrect. The
    problem is that the program assumes that the appropriate weight is equal to 1, even though
    we just saw that it really should be 2.207. The correction for this must be made by hand.
    The standard error of the fitted value (used for confidence intervals) is given correctly, but
    we need to calculate the standard error of the predicted value. This equals

    $$\sqrt{(\text{Standard error of fitted value})^2 + (\text{Residual MS})/(\text{Weight})},$$

    where the Residual MS comes from the WLS fit. Then, the prediction interval is

    $$\text{Predicted value} \pm t^{n-p-1}_{\alpha/2}\,(\text{Standard error of predicted value}).$$

    The standard error of the predicted value in this case is
    $\sqrt{.00928^2 + .00556/2.207} = .051$. For n = 89 and p = 1 the appropriate critical value
    for a 95% interval is 1.988, giving prediction interval

    $$.05761 \pm (1.988)(.051) = (-.0438, .159),$$

    which is of course very similar to the rough prediction interval given earlier.
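The hand correction of the prediction interval can be checked with a short calculation. The sketch below uses only the numbers quoted in the text; the intercept of the variance function is taken as -1.51, which is the value that reproduces the quoted weight of 2.207, and the critical value 1.988 is entered directly rather than computed. The variable names are illustrative.

```python
import math

# Values quoted in the text for the CAPM example.
market_return = 0.05
fit, se_fit = 0.05761, 0.00928  # from the Minitab prediction output
residual_ms = 0.00556           # Residual MS of the WLS fit (s = .07457)
t_crit = 1.988                  # t critical value, 87 df, 95% interval

# Weight for the new observation from the estimated variance function.
wt = 1.0 / math.exp(-1.51 + 1.278 * market_return + 261.8 * market_return ** 2)

# Standard error of the predicted value, adjusted for the weight.
se_pred = math.sqrt(se_fit ** 2 + residual_ms / wt)

lower, upper = fit - t_crit * se_pred, fit + t_crit * se_pred
print(f"weight = {wt:.3f}, se(pred) = {se_pred:.3f}, PI = ({lower:.4f}, {upper:.4f})")
```

Because the code carries unrounded intermediate values, the endpoints can differ from the hand calculation in the last reported digit.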

    There is another mechanism by which variances of errors in a regression model can
    be different for different observations, and related to a numerical variable. Say that the
    response variable at the level of an individual follows the usual regression model,

    $$y_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi} + \varepsilon_i,$$

    with $\varepsilon_i \sim N(0, \sigma^2)$. Imagine, however, that the ith observed response is
    actually an average $\bar{y}_i$ for a sample of size $n_i$ with the observed predictor values
    $\{x_{1i}, \ldots, x_{pi}\}$. The model is thus

    $$\bar{y}_i = \beta_0 + \beta_1 x_{1i} + \cdots + \beta_p x_{pi} + \tilde{\varepsilon}_i,$$

    where

    $$V(\tilde{\varepsilon}_i) = V(\bar{y}_i \mid \{x_{1i}, \ldots, x_{pi}\}) = \sigma^2/n_i.$$

    An example of this kind of situation could be as follows. Say you were interested in

    modeling the relationship between student test scores and (among other things) income.

    While it might be possible to obtain test scores at the level of individual students, it would

    be impossible to get incomes at that level because of privacy issues. On the other hand,

    average incomes at the level of census tract or school district might be available, and could

    be used to predict average test scores at that same level.

    This is just a standard heteroscedasticity model, and WLS is used to fit it. In fact,
    this is a particularly simple case, since the weights do not need to be estimated at all;
    since $V(\tilde{\varepsilon}_i) = \sigma^2/n_i$, the weight for the ith observation is just $n_i$.
    That is, quite naturally, observations based on larger samples are weighted more heavily
    in estimating the regression coefficients.
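A small simulation illustrates why $n_i$ is the natural weight. When every individual in a group shares the same predictor value, a WLS fit to the group averages with weights $n_i$ gives the same coefficient estimates as an OLS fit to the individual-level data. The sketch below assumes simulated groups and a single predictor; the helper `wls_line` and all data values are hypothetical.

```python
import random


def wls_line(x, y, w):
    """Weighted least squares fit of y on x; returns (intercept, slope)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Simulated groups (illustrative only): group i has n_i individuals who share
# one predictor value x_i; individual responses have constant error variance.
random.seed(2)
sizes = [5, 40, 12, 80, 25, 8]
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
groups = [[3.0 + 0.5 * xi + random.gauss(0.0, 1.0) for _ in range(ni)]
          for xi, ni in zip(xs, sizes)]

# Aggregated data: one average response per group, weighted by n_i.
ybars = [sum(g) / len(g) for g in groups]
agg_fit = wls_line(xs, ybars, [float(ni) for ni in sizes])

# Individual-level data, fit by ordinary least squares (all weights 1).
x_ind = [xi for xi, g in zip(xs, groups) for _ in g]
y_ind = [yi for g in groups for yi in g]
ind_fit = wls_line(x_ind, y_ind, [1.0] * len(y_ind))

print(agg_fit, ind_fit)
```

The two printed coefficient pairs agree up to floating-point error, although the standard errors from the two fits would differ because the aggregated fit has far fewer observations.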

    It should be noted, however, that this sort of aggregation is not without problems.

    Inferences made from aggregated data about individuals are called ecological inferences,

    and they can be very misleading. As is always the case, we must be aware of confounding

    effects of missing predictors; for example, if school districts with wealthier residents also

    have lower proportions of non-native English speakers, a positive slope for income could be

    reflecting an English speaker effect, rather than an income effect. In addition, ecological

    inferences potentially suffer from aggregation bias, whereby the information lost when

    aggregating (as it is clear that some information will be lost) is different for some individuals

    than for others (for example, if there is more variability in incomes in some school districts

    compared to others, more information is lost in those school districts), resulting in biased

    inferences.

    Minitab commands

    To construct a scatter plot with a lowess curve superimposed on it, enter the appropriate
    variables under Y variables and X variables as usual. Click on Data View, then
    Smoother, and click the button next to Lowess. Alternatively, a lowess curve can be
    superimposed on an existing plot by right clicking on the plot, and clicking Add Smoother,
    and then OK.
