
A GMM based, parametric portfolio policy for large-scale equity portfolio optimization leveraging key accounting characteristics

Student: Sangsang Liu

Advisor: Suresh Govindaraj

August, 2010

Abstract

Given the recent trend of developing new methods for portfolio optimization, we review two recent papers that use the parametric portfolio policy and propose potential improvements. Specifically, we first introduce the details of the two papers (Brandt, et al., 2009; Hand & Green, 2009) that use the parametric portfolio policy, a novel approach proposed by Brandt, et al. for making the optimal portfolio decision with firm-specific characteristics.

We then suggest two potential improvements:

1) Applying a progressive learning algorithm to the coefficient θ

2) Introducing discretionary accruals and non-discretionary accruals as firm-specific accounting-based characteristics to account for the potential effect of earnings management

Finally, because the authors are not willing to share their proprietary Matlab code, we reverse engineer the Matlab programs and the associated generalized method of moments (GMM) optimizer from the available publications and demonstrate that we can achieve empirical results comparable to those described by Hand & Green. In addition, we demonstrate that average returns can be further improved by incorporating more recent trend information, using either a simple “sliding window” method or our proposed “time-series correction to out-of-sample coefficients” method.

1. Introduction

Markowitz introduces the mean–variance approach for portfolio optimization in his 1952

landmark publication “Portfolio Selection”. The mean–variance method requires modeling

the expected returns, variances, and covariances of all stocks as functions of their firm-

specific characteristics. This is not only an extremely difficult econometric problem given

the large number of moments involved and the need to ensure the positive definiteness of the

covariance matrix, but the results of the method are also extremely noisy and unstable (e.g.,

Michaud 1989).

Building on the work of Markowitz (1952), Sharpe (1964), Lintner (1965) and Black (1972) develop the famous CAPM, a single-factor model of asset pricing. The major prediction of the model is that the market portfolio of invested wealth is “mean-variance efficient”. However, the CAPM oversimplifies a complex market, and a number of empirical contradictions to it have been found in the later literature. For example, the CAPM cannot explain the phenomenon that average returns on small (low ME) stocks are too high given their β estimates, while average returns on large stocks are too low.

In particular, in multivariate tests, the negative relation between size and average returns is robust to the inclusion of other variables, and the positive relation between book-to-market ratio and average returns also persists in combination with other variables. Moreover, although the size effect has gained more attention, the book-to-market ratio has a consistently stronger role in predicting average returns. As a result, the CAPM does not seem to explain the cross-section of average stock returns for more recent periods (e.g., 1980-1990 data).

Given that the combination of size and book-to-market ratio seems to absorb the roles of leverage and E/P in average stock returns, Fama and French (1993) introduce a three-factor model. Besides the market factor, two additional variables (firm size and book-to-market ratio) are added to help explain the cross-section of average stock returns.

One key challenge for all of these factor-based or firm-specific characteristics-based models is to leverage multiple firm-specific characteristics in a computationally efficient way to make optimal portfolio investment decisions. The mean–variance method requires modeling every firm’s expected return, variance and covariances as functions of those characteristics, which is very difficult to implement when the portfolio includes a large number of assets. Therefore, Brandt, Santa-Clara and Valkanov (2009) develop a novel and elegant parametric portfolio policy (PPP) method for portfolio optimization. This method offers three distinctive advantages. First, it is computationally simple for large-scale equity portfolios. Second, it can be easily modified and extended to account for portfolio variations (e.g., long-only portfolios, transaction costs). Finally, it offers robust in-sample and out-of-sample performance.

The PPP method directly models portfolio weights as a linear function of firm characteristics (firm size, book-to-market ratio, momentum/lagged return), and then estimates the parameters of the linear function by maximizing the average utility of the portfolio’s return that a risk-averse investor would have realized had he or she implemented the policy over a given sample period. The results of Brandt, et al. (BSCV) demonstrate the importance of the firm’s market capitalization, book-to-market ratio, and one-year lagged return for explaining deviations of the optimal portfolio for a CRRA investor from the value-weighted market portfolio. Compared to the value-weighted market portfolio, the optimal portfolio (with and without short-sale constraints) allocates considerably more wealth to stocks of small firms, firms with high book-to-market ratios (value firms), and firms with large positive lagged returns (past winners). This result is consistent with the findings of Fama and French and the related literature.

In addition, the PPP method can be easily applied to other asset classes. For example, investors can use a similar approach to form bond portfolios based on bond characteristics (e.g., duration, convexity, coupon rate) or currency portfolios based on currency characteristics (e.g., interest rate and inflation rate differentials).

Building on the work of Brandt, et al. (2009), Hand and Green (2009) find that three illustrative accounting-based characteristics (accruals, change in earnings, and asset growth) are economically important. Including these three accounting-based characteristics generates a higher out-of-sample, pre-transactions-costs annual information ratio than that of the standard price-based characteristics of firm size, book-to-market, and momentum. Nevertheless, Hand and Green do not discuss the logic behind the selection of the three illustrative accounting-based characteristics. For example, one could argue that discretionary accruals, rather than total accruals, are a better indicator of earnings management and therefore should be selected. In addition, Brandt, et al. mention the possibility of including a business cycle variable to account for variations over time. However, neither Brandt, et al. nor Hand & Green fully implement the concept. One could also argue that a simple business cycle variable cannot fully account for shifts in the recent performance trend.

Therefore, the focus of our work is twofold.

1) Applying a progressive learning algorithm to the coefficient θ. To account for variations over time and for the recent performance trend, we update the calculation of the coefficient θ of the characteristics, which is constant across assets and through time in the BSCV and Hand and Green papers, by applying a time-series correction to θ. As a result, the coefficient learns progressively from its recent trend. The advantage of the learning model is that, in forming the optimized portfolio, it gives more weight to the characteristics that have had more influence on the most recent years’ optimized portfolios. In other words, we allow the most recent trend to have more impact on the immediate portfolio selection decision while keeping the fundamental methodology for estimating the coefficient consistent between past and future. The model is thus responsive to new trends while preserving historical learning.

2) Introducing discretionary accruals and non-discretionary accruals as firm-specific accounting-based characteristics to account for the potential effect of earnings management. Specifically, we split the accounting-based characteristic “accruals” into “discretionary accruals” and “non-discretionary accruals” and treat them as two separate accounting-based characteristics.

To test our potential improvements on empirical data, we reverse engineer the Matlab programs and the associated GMM optimizer from the available publications (Brandt, et al., 2009; Hand and Green, 2009), because the authors are not willing to share their proprietary Matlab code. In the empirical section of this paper, we demonstrate that we can achieve empirical results comparable to those described by Hand & Green. We also demonstrate that average returns can be further improved by incorporating more recent trend information, using either a simple “sliding window” method or our proposed “time-series correction to out-of-sample θs” method.

The remainder of the paper proceeds as follows. Section 2 explains the algebra, strengths and limitations of BSCV’s PPP method along with our proposed improvements. Section 3 describes the data we use and our implementation procedure and timeline, and Section 4 presents our empirical results. Section 5 outlines some caveats of our study and potential directions for future research. Section 6 concludes.

2. Methodology and our proposed improvements

2.1 Basic structure of the parametric portfolio policy (PPP) method of Brandt, Santa-Clara and Valkanov (2009)

The basic assumption of the PPP method is that at every date t the investor chooses a set of portfolio weights $\{w_{i,t}\}$, $i = 1, 2, \ldots, N_t$, for a set of $N_t$ stocks so as to maximize the conditional expected utility of the portfolio’s one-period-ahead return $r_{p,t+1}$. For each time t, the return of stock i from t to t+1 is $r_{i,t+1}$ and its standardized characteristic vector is $x_{i,t}$. The problem is to find the weight of each stock in the optimized portfolio that maximizes the investor’s expected utility of the portfolio's return $r_{p,t+1}$. If the weight of stock i in the optimized portfolio at time t is $w_{i,t}$, the conditional optimization problem is

$$\max_{\{w_{i,t}\}_{i=1}^{N_t}} E_t\left[u(r_{p,t+1})\right] = E_t\left[u\left(\sum_{i=1}^{N_t} w_{i,t}\, r_{i,t+1}\right)\right] \qquad (1)$$

where $w_{i,t}$ is calculated from the estimated coefficient θ and the stock’s characteristics as follows,

$$w_{i,t} = \bar{w}_{i,t} + \frac{1}{N_t}\,\theta^{\top} x_{i,t} \qquad (2)$$

Here $\bar{w}_{i,t}$ is the weight of stock i at time t in a benchmark portfolio, such as the value-weighted market portfolio, and θ is a vector of coefficients to be estimated. This method is a so-called “portfolio policy” because the estimated weights are generated by a single function of characteristics that applies to all stocks over time, rather than by estimating one weight for each stock at each point in time. Given the assumption that the coefficients θ are constant across assets, the estimated weight of each stock in the optimized portfolio depends only on the stock's characteristics and not on its historical returns. In other words, two stocks that are close to each other in the characteristics associated with expected returns and risk should be assigned similar weights in the portfolio even if their historical returns are very different. Moreover, as a result of the “constant coefficients through time” assumption, the coefficients that maximize the investor’s conditional expected utility at a given date are the same for all dates. Therefore, the conditional optimization problem can be rewritten as an unconditional optimization problem,

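$$\max_{\theta}\ E\left[u(r_{p,t+1})\right] = E\left[u\left(\sum_{i=1}^{N_t}\left(\bar{w}_{i,t} + \frac{1}{N_t}\,\theta^{\top} x_{i,t}\right) r_{i,t+1}\right)\right] \qquad (3)$$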

Then θ can be estimated by maximizing the corresponding sample analog combined with

equation (2),

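$$\max_{\theta}\ \frac{1}{T}\sum_{t=0}^{T-1} u\left(\sum_{i=1}^{N_t}\left(\bar{w}_{i,t} + \frac{1}{N_t}\,\theta^{\top} x_{i,t}\right) r_{i,t+1}\right) \qquad (4)$$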
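As a rough illustration of how equations (2) and (4) fit together, the following minimal Python sketch computes the policy weights for one date and the sample average utility for a given θ. It is not the Matlab implementation used in this paper. The function names, the toy numbers, and the data layout (one tuple per date) are assumptions made for the example, and the CRRA utility with risk aversion 5 is used as the utility function u.

```python
def ppp_weights(benchmark, chars, theta):
    """Weights for one date: benchmark weight plus (1/N) * theta . x_i."""
    n = len(benchmark)
    return [
        w_bar + sum(th * x for th, x in zip(theta, x_i)) / n
        for w_bar, x_i in zip(benchmark, chars)
    ]


def crra_utility(r, gamma=5.0):
    """CRRA utility of a simple return r; assumes 1 + r > 0 and gamma != 1."""
    return (1.0 + r) ** (1.0 - gamma) / (1.0 - gamma)


def average_utility(theta, panel, gamma=5.0):
    """Sample objective: mean utility of the policy's portfolio return.

    panel is a list of dates; each date is a tuple
    (benchmark weights, characteristic vectors, next-period returns).
    """
    total = 0.0
    for benchmark, chars, next_returns in panel:
        weights = ppp_weights(benchmark, chars, theta)
        r_p = sum(w * r for w, r in zip(weights, next_returns))
        total += crra_utility(r_p, gamma)
    return total / len(panel)


# Toy date (hypothetical numbers): three stocks, one demeaned characteristic.
benchmark = [0.5, 0.3, 0.2]
chars = [[0.5], [0.0], [-0.5]]
weights = ppp_weights(benchmark, chars, theta=[1.2])
```

Because the characteristic is demeaned across stocks, the tilts sum to zero and the policy weights still sum to one. An optimizer would then search over θ to maximize `average_utility`.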

This approach has four advantages. First, given the relatively low dimensionality of the parameter vector, it is computationally simple to optimize a portfolio of a very large number of stocks. In other words, only the number of characteristics entering the portfolio policy (six in our paper), rather than the number of assets in the portfolio, has an impact on the computational burden. As a result, the PPP method avoids the difficult task of modeling the joint distribution of returns and characteristics and thus considerably reduces the dimensionality of the optimization problem. Second, since the entire portfolio is optimized by choosing only a few parameters θ, the method is numerically robust and less prone to in-sample overfitting. Third, the model simultaneously takes into account the relations between firm characteristics and all moments of returns, including higher-order moments. Fourth, in contrast to the traditional mean-variance approach, the PPP method can accommodate different investors’ utility functions.

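To estimate the parameter θ, BSCV first take the first-order condition (FOC) of the maximization problem in equation (4), which must equal zero at the maximum of the expected utility.

FOC:

$$\frac{1}{T}\sum_{t=0}^{T-1} h(r_{t+1}, x_t; \theta) \equiv \frac{1}{T}\sum_{t=0}^{T-1} u'(r_{p,t+1})\left(\frac{1}{N_t}\, x_t^{\top} r_{t+1}\right) = 0 \qquad (5)$$

where $x_t$ stacks the characteristic vectors $x_{i,t}$ of the $N_t$ stocks and $r_{t+1}$ stacks their returns. Because equation (5) sets the sample mean of a moment condition to zero, we can apply the generalized method of moments (GMM), formalized by Hansen (1982), to estimate the parameter θ, where

$$\hat{\theta} = \arg\max_{\theta}\ \frac{1}{T}\sum_{t=0}^{T-1} u\left(\sum_{i=1}^{N_t}\left(\bar{w}_{i,t} + \frac{1}{N_t}\,\theta^{\top} x_{i,t}\right) r_{i,t+1}\right)$$

The asymptotic covariance matrix of the estimator is

$$\Sigma_{\theta} \equiv \mathrm{AsyVar}[\hat{\theta}] = \frac{1}{T}\left[G^{\top} V^{-1} G\right]^{-1} \qquad (6)$$

where

$$G \equiv \frac{1}{T}\sum_{t=0}^{T-1} \frac{\partial h(r_{t+1}, x_t; \theta)}{\partial \theta} \qquad (7)$$

and V is a consistent estimator of the covariance matrix of h(r,x; θ), which can be estimated by

$$V = \frac{1}{T}\sum_{t=0}^{T-1} h(r_{t+1}, x_t; \hat{\theta})\, h(r_{t+1}, x_t; \hat{\theta})^{\top} \qquad (8)$$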
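To make the link between the first-order condition and the estimator concrete, here is a minimal Python sketch for a single characteristic (scalar θ). It assumes CRRA utility, so that u'(r) = (1 + r)^(-γ), takes the per-date moment to be u'(r_p) times the characteristic-weighted return divided by the number of stocks, and estimates V as the average squared moment. The bisection solver, the function names, and the toy panel are illustrative assumptions, not the GMM optimizer used in this paper.

```python
def portfolio_return(theta, benchmark, x, r):
    """Policy return for one date with a single characteristic."""
    n = len(r)
    return sum((wb + theta * xi / n) * ri for wb, xi, ri in zip(benchmark, x, r))


def moment(theta, date, gamma):
    """h_t = u'(r_p) * (1/N) x'r, with CRRA u'(r) = (1 + r)^(-gamma)."""
    benchmark, x, r = date
    r_p = portfolio_return(theta, benchmark, x, r)
    tilt_return = sum(xi * ri for xi, ri in zip(x, r)) / len(r)
    return (1.0 + r_p) ** (-gamma) * tilt_return


def sample_moment(theta, panel, gamma=5.0):
    """Average of the per-date moments; zero at the optimum."""
    return sum(moment(theta, d, gamma) for d in panel) / len(panel)


def solve_foc(panel, lo, hi, gamma=5.0, tol=1e-12):
    """Bisection on the sample moment; needs a sign change on [lo, hi]."""
    f_lo = sample_moment(lo, panel, gamma)
    if f_lo * sample_moment(hi, panel, gamma) > 0:
        raise ValueError("no sign change on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = sample_moment(mid, panel, gamma)
        if f_mid * f_lo > 0:
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def asymptotic_variance(theta, panel, gamma=5.0):
    """Scalar version of (1/T) [G V^-1 G]^-1, i.e. V / (T * G^2)."""
    t = len(panel)
    v = sum(moment(theta, d, gamma) ** 2 for d in panel) / t
    g = 0.0
    for benchmark, x, r in panel:
        r_p = portfolio_return(theta, benchmark, x, r)
        tilt_return = sum(xi * ri for xi, ri in zip(x, r)) / len(r)
        # Derivative of h_t with respect to theta uses u''(r) = -gamma (1 + r)^(-gamma - 1).
        g += -gamma * (1.0 + r_p) ** (-gamma - 1.0) * tilt_return ** 2
    g /= t
    return v / (t * g * g)


# Toy panel (hypothetical numbers): two stocks, four dates.
benchmark, x = [0.5, 0.5], [0.5, -0.5]
panel = [
    (benchmark, x, [0.10, 0.02]),
    (benchmark, x, [-0.04, 0.03]),
    (benchmark, x, [0.06, -0.01]),
    (benchmark, x, [0.01, 0.05]),
]
theta_hat = solve_foc(panel, 0.0, 20.0)
std_error = asymptotic_variance(theta_hat, panel) ** 0.5
```

Bisection works here only because the objective is concave in a scalar θ, so the sample moment is decreasing. With several characteristics a multivariate optimizer would be needed instead.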

2.2 Our first improvement: Applying a progressive learning algorithm to the coefficient θ

Both Hand & Green and Brandt, et al. use a “quasi-fixed time period” method that weights all available historical data equally to estimate θt. Although this is a simple way to leverage all available data, the influence that a new trend (e.g., the information contained in year t-1’s data) has on θt decreases as time progresses. To allow the model to be responsive to new trends while preserving historical learning, we suggest implementing a progressive learning model by applying a time-series correction to θ.

First, we use bootstrapping to increase the number of data points for each year over the time horizon 1975-2000 and thereby obtain the in-sample θ for each year t, i.e., $\theta_{t,\text{in-sample}}$. Then we calculate the difference $\Delta\theta_t$ for year t, where $\Delta\theta_t = \theta_{t,\text{in-sample}} - \theta_{t,\text{out-of-sample}}$. For example, for year 1980, the out-of-sample $\theta_{1980,\text{out-of-sample}}$ is calculated from the data for 1965-1975, while $\theta_{1980,\text{in-sample}}$ is calculated from the data for Jan.-Dec. 1980. We can then use a simple average of $\Delta\theta_t$ over the previous N years to account for the new trend in $\theta_t$. In other words, for year t+1, the estimated correction to the original out-of-sample θ of year t+1 is

$$\Delta\theta'_{t+1} = \frac{1}{N}\sum_{i=0}^{N-1} \Delta\theta_{t-i}$$

so the new out-of-sample coefficient is

$$\theta'_{t+1,\text{out-of-sample}} = \theta_{t+1,\text{out-of-sample}} + \Delta\theta'_{t+1}.$$
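The correction step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and argument layout are hypothetical, the histories are lists of coefficient vectors ordered from oldest to newest and ending at year t, and the in-sample and out-of-sample estimates are taken as given rather than estimated here.

```python
def corrected_theta(theta_out_next, theta_in_hist, theta_out_hist, n_years):
    """Add the average of the last n_years gaps (in-sample minus out-of-sample)
    to next year's out-of-sample coefficient vector."""
    if n_years < 1 or len(theta_in_hist) < n_years or len(theta_out_hist) < n_years:
        raise ValueError("need at least n_years of history")
    recent_in = theta_in_hist[-n_years:]
    recent_out = theta_out_hist[-n_years:]
    k = len(theta_out_next)
    # Average gap per characteristic over the most recent n_years.
    correction = [
        sum(t_in[j] - t_out[j] for t_in, t_out in zip(recent_in, recent_out)) / n_years
        for j in range(k)
    ]
    return [theta_out_next[j] + correction[j] for j in range(k)]
```

For example, with gaps of 1.0 in each of the last two years for some characteristic, next year's out-of-sample coefficient for that characteristic is shifted up by 1.0.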

Owing to time constraints, we leave to future work a more sophisticated optimization model that updates θ based on its recent trend and maximizes the investor's expected utility simultaneously.

The advantage of using the progressive learning model to estimate θt at year t is that it allows the most recent trend to have more impact on the immediate portfolio selection decision while keeping the fundamental methodology for estimating the coefficient consistent between past and future.

2.3 Our second improvement: Separating discretionary accruals and non-discretionary

accruals to account for the potential effect of earnings management

To account for the potential effect of earnings management in the portfolio optimization, we propose to split total accruals (TA) into two separate characteristics, “discretionary accruals” (DA) and “non-discretionary accruals” (NDA), so that TA = DA + NDA. The calculation of discretionary and non-discretionary accruals is based on the “modified Jones model”. In this model, non-discretionary accruals are estimated by an ordinary least squares (OLS) regression of total accruals scaled by lagged total assets on $1/A_{t-1}$, $(\Delta REV_t - \Delta REC_t)$, and $PPE_t$, i.e.

$$TA_t = a_1\left(\frac{1}{A_{t-1}}\right) + a_2\left(\Delta REV_t - \Delta REC_t\right) + a_3\, PPE_t + v_t \qquad (9)$$

$$NDA_t = \alpha_1\left(\frac{1}{A_{t-1}}\right) + \alpha_2\left(\Delta REV_t - \Delta REC_t\right) + \alpha_3\, PPE_t \qquad (10)$$

where

$A_{t-1}$ = total assets at t-1

$\Delta REV_t$ = revenues in year t less revenues in year t-1, scaled by total assets at t-1

$\Delta REC_t$ = net receivables in year t less net receivables in year t-1, scaled by total assets at t-1

$PPE_t$ = gross property, plant and equipment in year t, scaled by total assets at t-1

$TA_t$ = total accruals scaled by total assets at t-1

$\alpha_1, \alpha_2, \alpha_3$ = firm-specific parameters to be estimated by OLS

$a_1, a_2, a_3$ = OLS estimators of $\alpha_1, \alpha_2, \alpha_3$

$a_1(1/A_{t-1}) + a_2(\Delta REV_t - \Delta REC_t) + a_3\, PPE_t$ = the estimate of $NDA_t$

$v_t$ = $DA_t$

The residual of the above regression is the discretionary accruals (DA), and the non-discretionary accruals (NDA) equal TA minus DA. The empirical test will be conducted with these seven characteristics (discretionary accruals, non-discretionary accruals, change in earnings, asset growth, market capitalization, book-to-market, and momentum) instead of the original six characteristics proposed by Hand and Green (2009).
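The split of total accruals into a fitted part and a residual can be sketched as follows. This is a minimal illustration in plain Python, not the procedure used in this paper: it assumes the inputs are already scaled by lagged total assets as described above, runs a single no-intercept OLS regression on one set of observations (for example one firm's yearly history), and solves the normal equations by Gaussian elimination. All function and argument names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def split_accruals(ta, inv_lag_assets, d_rev, d_rec, ppe):
    """Regress TA on 1/A(t-1), (dREV - dREC) and PPE without an intercept.

    Returns (coefficients, fitted NDA, residual DA)."""
    rows = [
        [ia, rev - rec, p]
        for ia, rev, rec, p in zip(inv_lag_assets, d_rev, d_rec, ppe)
    ]
    k = 3
    # Normal equations: (X'X) a = X'y.
    xtx = [[sum(row[i] * row[j] for row in rows) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(rows, ta)) for i in range(k)]
    coef = solve_linear(xtx, xty)
    nda = [sum(c * v for c, v in zip(coef, row)) for row in rows]
    da = [y - fit for y, fit in zip(ta, nda)]
    return coef, nda, da
```

By construction NDA plus DA reproduces TA for every observation. At least three observations with linearly independent regressors are needed for the normal equations to be solvable.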

For the choice of utility function, as in most empirical applications, we use standard CRRA preferences over wealth,

$$u(r_{p,t+1}) = \frac{(1 + r_{p,t+1})^{1-\gamma}}{1-\gamma}$$

Following the BSCV and Hand and Green papers, we set γ = 5.

3. Data and implementation procedure

The empirical part of this study focuses on reproducing Hand & Green’s results, with all procedures implemented in our own Matlab code, for the time horizon 1965-2008 with the six illustrative firm-specific characteristics: three accounting-based characteristics (accruals, change in earnings, and asset growth) and three price-based characteristics (market capitalization, book-to-market, and momentum). We also demonstrate that average returns can be further improved by incorporating more recent trend information, using either a simple “sliding window” method or our proposed “time-series correction to out-of-sample θs” method. We will also split accruals into two separate characteristics, “discretionary accruals” and “non-discretionary accruals”, to account for the potential effect of earnings management in the portfolio optimization. The calculation of “discretionary accruals” and “non-discretionary accruals” is based on the “modified Jones model”. We leave the empirical test with these seven characteristics (discretionary accruals, non-discretionary accruals, change in earnings, asset growth, market capitalization, book-to-market, and momentum) to later work.

We first describe the data and then present our results, which include the base case with the six illustrative firm-specific characteristics, both in and out of sample, and the preliminary results for one of the potential improvements we propose. Unless otherwise stated, we assume an investor with CRRA preferences and a relative risk aversion of five. Following BSCV, the investor is restricted to investing only in U.S. stocks in our application. In other words, we do not include the risk-free asset in the investment opportunity set because, to a first-order approximation, including the risk-free asset affects only the leverage of the optimized portfolio.

3.1 Data

The financial statement data, price per share and number of shares outstanding, which are used to calculate the accounting-based characteristics, book equity and market equity, are collected from the COMPUSTAT annual industrial file, while the monthly stock returns are collected from the CRSP monthly files. The one-month Treasury bill rates (risk-free rate) are collected from the Fama-French factor data set at WRDS. We collect all COMPUSTAT variables from fiscal year 1963 through fiscal year 2008, together with CRSP data from Jan. 1963 through Dec. 2008.

As a first step, for each firm in this dataset, we define the six illustrative firm-specific characteristics we use for portfolio optimization with the PPP method. Firm size or market capitalization MVE is defined as the market value of common equity at the firm’s fiscal year end, calculated as price per share times the number of shares outstanding. Book-to-market BTM is the fiscal year-end book value scaled by MVE (Stattman, 1980; Rosenberg, Reid and Lanstein, 1985). Book equity is calculated as total assets minus liabilities, plus balance sheet deferred taxes and investment tax credits, minus preferred stock value. Momentum MOM is taken to be the compounded return between months t − 13 and t − 2 for the monthly portfolio return at month t. When operating cash flow is available, annual accruals ACC are net income less operating cash flow, scaled by average total assets; otherwise, per Sloan (1996), we set ACC = (∆current assets − ∆cash) − (∆current liabilities − ∆debt in current liabilities − ∆taxes payable) − depreciation, all scaled by average total assets. If any of the above components is missing, we set it to zero. The change in annual earnings UE is simply net income in the most recent fiscal year less net income of the prior year, scaled by average total assets (Ball and Brown, 1968; Foster, Olsen and Shevlin, 1984). Lastly, asset growth AGR is defined as the natural log of one plus total assets at the end of the most recent fiscal year less the natural log of one plus total assets one year earlier (Cooper, Gulen and Schill, 2008, 2009).
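Several of the characteristic definitions above reduce to one-line formulas. The Python sketch below illustrates four of them. The function names and the convention that `monthly_returns` is a list indexed by month number (with `None` marking a missing return) are assumptions for the example, not part of the original implementation.

```python
import math


def momentum(monthly_returns, t):
    """Compounded return over months t-13 .. t-2 (12 months), or None if unavailable."""
    if t < 13:
        return None
    window = monthly_returns[t - 13 : t - 1]
    if len(window) < 12 or any(r is None for r in window):
        return None
    growth = 1.0
    for r in window:
        growth *= 1.0 + r
    return growth - 1.0


def asset_growth(total_assets, total_assets_prior):
    """ln(1 + total assets) minus ln(1 + total assets one year earlier)."""
    return math.log1p(total_assets) - math.log1p(total_assets_prior)


def book_to_market(book_equity, price, shares_outstanding):
    """Fiscal year-end book equity scaled by market value of equity."""
    return book_equity / (price * shares_outstanding)


def change_in_earnings(net_income, net_income_prior, avg_total_assets):
    """Change in annual net income scaled by average total assets."""
    return (net_income - net_income_prior) / avg_total_assets
```

Returning `None` from `momentum` when any month in the window is missing mirrors the filtering rule in which such a firm is dropped for month t.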

We then use CUSIP and calendar year to match the CRSP and COMPUSTAT data for the entire universe of U.S. stocks from Jan. 1965 to Dec. 2008. If both the CUSIP and the calendar year match, we treat the records as the same firm and merge the data. After matching the two databases, we check the availability of monthly stock returns, the availability of all the price-based data items (price, number of shares outstanding) and, lastly, the availability of sufficient data to compute a firm's accruals, change in earnings, and asset growth. Specifically, for the purpose of calculating momentum, we also check the availability of the firm’s monthly returns between months t − 13 and t − 2. If any of the monthly returns during this period is not available, we remove the firm from our dataset for month t. Following BSCV, we also delete the smallest 20% of firms as measured by MVE, since such small firms tend to have low liquidity, high bid-ask spreads and disproportionately high transactions costs. The final number of firms varies greatly by year, rising from a low of only 359 in 1965 to a peak of 4773 in 1998. The average number of firms per year is 2917.

3.2 Implementation procedure

In this section, we first describe how we line up real-time stock return data with financial statement variables under a six-month lag rule, in order to make sure this information would have been available to investors before the end of month t and can therefore be used to estimate the parameters following BSCV’s PPP method. Secondly, we describe how we standardize the raw firm characteristics in order to make comparisons across characteristics and to reduce the impact of outliers on the PPP estimation procedure. Finally, we describe the “partially-rolling parameter estimation period” method we use to generate the out-of-sample returns.

Similar to the methodology used by Brandt, et al. (2009) and Hand & Green (2009), for each month t over the period Jan. 1965-Dec. 2008, we assume that investors have monthly return information up to the end of month t, and that accounting-based information is available with a six-month lag past a firm’s fiscal year-end. For instance, at the end of Sep. 1998, we assume that investors only have access to annual accounting information published by firms with fiscal years ending on or before Mar. 31, 1998. For firms with fiscal years ending Apr. 1 through Sep. 30, we assume that the most recent annual accounting information available to investors is from the prior fiscal year-end. For example, the accounting information available to investors at the end of Sep. 1998 for a firm whose fiscal year-end is Apr. 30, 1998 will be extracted from the firm’s Apr. 30, 1997 annual report. We impose this constraint in order to avoid look-ahead problems and to make our methods consistent with those of Hand and Green (2009). We also apply this six-month delay rule to MVE, in addition to all other accounting variables from the annual report. That is, we do not use the MVE of month t but instead use the fiscal year-end MVE determined by the six-month delay rule.
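The six-month lag rule can be expressed as a small date calculation. The sketch below is an illustration only: it works at month granularity (assuming fiscal years end at a month-end), and the function name and arguments are hypothetical.

```python
def latest_usable_fiscal_year(fye_month, asof_year, asof_month, lag_months=6):
    """Calendar year of the most recent fiscal year-end (falling in fye_month)
    that is at least lag_months old at the end of (asof_year, asof_month)."""
    year = asof_year
    asof_index = asof_year * 12 + asof_month
    # Step back one fiscal year at a time until the lag requirement is met.
    while asof_index - (year * 12 + fye_month) < lag_months:
        year -= 1
    return year


# The two cases from the text, evaluated at the end of Sep. 1998.
march_firm = latest_usable_fiscal_year(3, 1998, 9)   # fiscal year ends in March
april_firm = latest_usable_fiscal_year(4, 1998, 9)   # fiscal year ends in April
```

Here the March year-end firm uses its fiscal 1998 report, while the April year-end firm falls back to the report for the fiscal year ended Apr. 1997.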

To make comparisons across characteristics and to reduce the impact of outliers on the PPP estimation procedure, we transform all raw firm characteristics by a ranking method. For every month t, each characteristic is ranked into 100 bins (0.01 to 1). We then subtract the mean from the ranked characteristics to guarantee that the characteristics have a mean of zero and a range of -0.5 to 0.5.
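A minimal Python sketch of this rank-based standardization for one characteristic in one month is shown below. How ties and bin boundaries are handled is not specified in the text, so the choices here (ties broken by sort order, bin equal to the ceiling of the percentile rank) are assumptions, as is the function name.

```python
def standardize_by_rank(values, n_bins=100):
    """Map raw values to rank bins 1/n_bins .. 1 and subtract the cross-sectional mean."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    binned = [0.0] * n
    for rank, i in enumerate(order, start=1):
        # Integer ceiling of rank * n_bins / n gives the bin number 1..n_bins.
        bin_number = (rank * n_bins + n - 1) // n
        binned[i] = bin_number / n_bins
    mean = sum(binned) / n
    return [b - mean for b in binned]
```

With 100 distinct values, the smallest maps to 0.01 and the largest to 1.00 before demeaning, so the standardized values run from about -0.495 to 0.495 and sum to zero.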

Following Hand and Green (2009), we use the 408 monthly returns between Jan. 1975 and Dec. 2008 to calculate in-sample results. The full period Jan. 1975-Dec. 2008 is therefore used to estimate one single parameter set in equation (2).

Following BSCV, we calculate out-of-sample returns using a “partially-rolling parameter estimation period” method, also called the “quasi-fixed time period” method. For each month in the first year of the out-of-sample period, Jan.-Dec. 1975, we use data from Jan. 1965-Dec. 1974 to estimate the coefficients $\theta_{1965-1974}$. We then combine this initial parameter set $\theta_{1965-1974}$ with the standardized, monthly varying firm characteristics to generate out-of-sample returns of the optimized portfolio for Jan.-Dec. 1975. For each month in the following out-of-sample year, Jan.-Dec. 1976, we roll the ending point, but not the beginning point, of the parameter estimation period forward one year, through Dec. 1975, to estimate the parameter set $\theta_{1965-1975}$. The same method is repeated every year from 1976 to 2008 to obtain a specific θ for each year. In mathematical terms, the “quasi-fixed time period” method can be described as follows.

At year t,

$$\hat{\theta}_t = \arg\max_{\theta}\ \frac{1}{T}\sum_{n=0}^{T-1} u\left(\sum_{i=1}^{N_n}\left(\bar{w}_{i,n} + \frac{1}{N_n}\,\theta^{\top} x_{i,n}\right) r_{i,n+1}\right)$$

where T equals the number of years for which historical data are available. Therefore, at year t, T = t (t = 0 for the first year in the available dataset, 1965).

At year t+1,

$$\hat{\theta}_{t+1} = \arg\max_{\theta}\ \frac{1}{T}\sum_{n=0}^{T-1} u\left(\sum_{i=1}^{N_n}\left(\bar{w}_{i,n} + \frac{1}{N_n}\,\theta^{\top} x_{i,n}\right) r_{i,n+1}\right)$$

where T = t+1.

The main drawback of the “quasi-fixed time period” method is that the calculation of θ places progressively more weight on older historical returns, and as a result θ becomes more and more “stationary” over time. Therefore, we propose the progressive learning model for the calculation of θ to better account for time variation. The model has been introduced in the methodology part of this paper.
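The difference between the “quasi-fixed time period” (expanding) estimation window and a sliding window can be made explicit by listing the estimation period used for each out-of-sample year. The sketch below is illustrative only: the function names are hypothetical, and the ten-year sliding window length in the usage lines is an assumption, since the text does not state the length used.

```python
def expanding_windows(first_data_year, first_oos_year, last_oos_year):
    """(out-of-sample year, first estimation year, last estimation year):
    the end point rolls forward each year, the start point stays fixed."""
    return [
        (year, first_data_year, year - 1)
        for year in range(first_oos_year, last_oos_year + 1)
    ]


def sliding_windows(first_oos_year, last_oos_year, length):
    """Same layout, but both end points roll forward with a fixed length in years."""
    return [
        (year, year - length, year - 1)
        for year in range(first_oos_year, last_oos_year + 1)
    ]


quasi_fixed = expanding_windows(1965, 1975, 2008)
sliding = sliding_windows(1975, 2008, length=10)  # window length is an assumption
```

Under the expanding scheme, the estimate for 2008 uses data from 1965 through 2007, whereas a ten-year sliding window would use only 1998 through 2007.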

4. Empirical Results

Our findings are shown in the attached figures and tables. Following most of the empirical implementation procedures of Hand & Green (2009), we apply BSCV’s PPP method to an average of 2829 firm observations per month without imposing any short-sale constraints or transactions costs.

Similar to Hand & Green (2009), we include six accounting-based and price-based firm characteristics to optimize the investor’s average utility as described in equation (4). Figure 1 shows the trend of the estimated PPP parameters, i.e., the GMM estimators of the coefficients of the six firm characteristics. These time series share a common feature: the estimated PPP parameters all drop significantly during the dot-com bubble years 2000-2001. In addition to the PPP parameters, we also calculate the monthly weighted characteristics of the portfolio as $\sum_{i=1}^{N_t} \hat{w}_{i,t}\, x_{i,t}$.

Table 2 presents our average out-of-sample and in-sample parameter estimates, descriptive statistics on portfolio weights, and the average monthly weighted firm characteristics. VW is defined over all observations included in our data set, not the unrestricted CRSP universe. For the PPP in-sample scenarios, a single PPP parameter set is estimated over the full sample period Jan. 1975-Dec. 2008. In contrast, the PPP out-of-sample scenarios show the average of 34 different θs (1975-2008). The sample period for the out-of-sample scenarios is defined by the “quasi-fixed time period” method explained in Section 3.2. The standard errors for the in-sample scenarios are taken from the sample asymptotic covariance matrix of the GMM optimization. Similar to Hand & Green (2009), we use an alternative method to calculate the standard errors for the out-of-sample scenarios: the standard errors of the parameter estimates are the average of the standard errors of the 34 out-of-sample coefficients estimated by the “partially-rolling parameter estimation period” method.

From Table 2, we observe that the deviations of the optimal weights from the value-weighted market portfolio weights decrease with the firm’s market capitalization (firm size), accruals, and asset growth. On the other hand, the deviations increase with the firm’s book-to-market ratio, change in earnings and momentum (lagged one-year return). The signs of these estimates are consistent with those reported by Hand & Green (2009). These findings are also consistent with those of Brandt, et al. (2009): investors overweight small firms (low market capitalization), value firms (high book-to-market ratio), and past winners (high lagged one-year return), and underweight large firms, growth firms, and past losers. Because the characteristics are standardized cross-sectionally, the magnitudes of the estimated parameters can be compared with each other. Among all six firm characteristics, a high change in earnings leads to the quantitatively largest overweighting of a stock relative to the value-weighted market portfolio. If we consider only the three price-based characteristics (MVE, MOM and BTM), a high book-to-market ratio leads to the quantitatively largest overweighting of a stock. This is consistent with the finding of Brandt, et al. (2009).

Following BSCV, we also report in Table 2 the PPP in-sample and out-of-sample monthly absolute weight of the optimal portfolio, the monthly maximum and minimum weights, the total short weights, and the proportion of negative weights. The last part of Table 2 reports the time-series average of the monthly weighted averages of the firms’ standardized characteristics in the optimal portfolio. Positive values indicate an overall preference for firms with relatively higher values of the normalized characteristic. For example, a positive value for the characteristic BTM indicates that, over the sample period of our empirical test, the optimal portfolio is on average weighted toward value firms.

A few observations from Table 2 are worth noting. First, for the in-sample PPP parameter estimates, five of the six firm characteristic parameter estimates lie more than two standard errors from zero (BTM, MOM, UE, ACC and AGR). For the out-of-sample PPP parameter estimates, only the estimates for UE and AGR lie more than two standard errors from zero. Second, the PPP method generates non-extreme maximum (around 3.2%) and minimum (around –2%) portfolio weights. Lastly, across the estimated optimal portfolios from 1975-2008, the average proportion of short positions is around 48%. This is consistent with the literature showing that the proportion of risky assets held short in both the mean-variance tangency portfolio and the minimum variance portfolio tends in the limit to 50% if no constraints on short selling are imposed (Levy, 1983; Green and Hollifield, 1992; Levy and Ritov, 2001).

We report statistics on the returns generated by our optimal portfolios side by side with

those of Hand & Green in Table 3. Following BSCV, the annualized return is

defined as the simple sum of the calendar year’s monthly returns. Specifically, Table 3

provides the certainty equivalent of the optimal portfolio’s mean annualized return, the mean

and standard deviation of the annualized return, together with the portfolio’s annualized

Sharpe ratio and information ratio.
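A minimal sketch of these return statistics follows. The annualized return is the simple sum of twelve monthly returns, as defined above. The certainty equivalent assumes CRRA utility with relative risk aversion γ (five in our empirical application), and the Sharpe ratio takes a risk-free rate as an input. The CRRA functional form, the function names and the `risk_free` parameter are assumptions made for illustration, not a transcription of the code behind Table 3.

```python
import math


def annual_returns(monthly_returns, months_per_year=12):
    """Simple sum of monthly returns within each complete year."""
    n_years = len(monthly_returns) // months_per_year
    return [
        sum(monthly_returns[y * months_per_year:(y + 1) * months_per_year])
        for y in range(n_years)
    ]


def mean(values):
    return sum(values) / len(values)


def sample_std(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def crra_certainty_equivalent(returns, gamma=5.0):
    """Return r_ce such that u(1 + r_ce) equals average realized CRRA utility.

    Assumes gamma != 1 and 1 + r > 0 for every return.
    """
    avg_utility = mean([(1 + r) ** (1 - gamma) / (1 - gamma) for r in returns])
    return ((1 - gamma) * avg_utility) ** (1 / (1 - gamma)) - 1


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / sample_std(excess)
```

For a risk-averse investor the certainty equivalent of a volatile return series lies below its mean, which is why a portfolio can have both a higher mean return and higher volatility and still be ranked by its certainty equivalent.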

Table 3 indicates several key findings. First, the magnitudes of the mean (pre-transactions-

costs) returns from all the PPP optimized portfolios span quite a wide range, both in- and

out-of-sample. As a result, the standard deviations of returns from the PPP optimized

portfolios are also large. Second, compared to Hand & Green’s results, our results show a

higher mean annualized return for the optimized portfolios as well as a higher volatility. This

difference may be caused by differences in the fine details of the empirical implementation,

such as the definition of characteristics (e.g., in our case, MOM is defined as the

cumulative raw return for the twelve months ending two months before the portfolio month,

while Hand & Green use the cumulative raw return for the twelve months ending four

months after the most recent fiscal year end), the filtering of the

unrestricted CRSP universe, and the matching of the CRSP database with the COMPUSTAT database.

Third, our results show a higher CE (certainty equivalent) than Hand & Green’s

work both in- and out-of-sample. In other words, our results are closer to optimal in terms of

maximizing the average utility of the portfolio’s return that a risk-averse investor would have

realized had he/she implemented the policy over the given sample period. Lastly, although lower

than Hand & Green’s results, our annualized Sharpe ratios and information ratios for all

optimal portfolios are much higher than those of the value-weighted market portfolio.

In Figure 2, we show the comparison of annual returns for our in-sample optimal portfolio,

out-of-sample optimal portfolio and the value-weighted market portfolio which serves as a

benchmark, for the time horizon of 1975 to 2008.

In Figure 3, we show that average returns can be improved by incorporating more recent

trend information using a simple sliding window method. This finding supports our

hypothesis that incorporating more recent trend information could improve the overall PPP

performance and serves as the first step toward fully implementing a progressive learning model of θ.

Finally, in Figure 4, we demonstrate that average returns can also be improved by applying a time-

series correction to out-of-sample θs using the method we propose in Section 2.2.
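To make the sliding-window idea concrete, the sketch below contrasts an expanding estimation sample (all years before the test year) with a sliding window that keeps only the most recent years. It only selects the estimation years; estimating θ on each window is left to the optimizer. The function name, the `initial_years` parameter and the window length used in the example are hypothetical and are not the settings behind Figure 3.

```python
def estimation_windows(first_year, last_year, initial_years, window=None):
    """List (test_year, training_years) pairs for out-of-sample estimation.

    window=None  -> expanding sample: every year from first_year up to test_year - 1
    window=k     -> sliding window: only the k most recent years before test_year
    """
    schedule = []
    for test_year in range(first_year + initial_years, last_year + 1):
        if window is None:
            start = first_year
        else:
            start = max(first_year, test_year - window)
        schedule.append((test_year, list(range(start, test_year))))
    return schedule


expanding = estimation_windows(1975, 1980, initial_years=3)
sliding = estimation_windows(1975, 1980, initial_years=3, window=2)
# expanding[-1] == (1980, [1975, 1976, 1977, 1978, 1979])
# sliding[-1]   == (1980, [1978, 1979])
```

A shorter window discards older data, so the estimated θ reacts faster to recent trends at the cost of noisier estimates.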

Overall, we conclude from the results reported above that portfolios formed by the PPP method

together with both accounting-based characteristics and price-based characteristics on

average deliver highly positive returns relative to the value-weighted market portfolio,

even during the 2000-2001 dot-com bubble and the 2007-2008 market meltdown. In addition, our

proposed “time-series correction to out-of-sample θs” method can further improve average

returns.

5. Caveats and potential future research direction

We highlight two limitations of this study. First, we do not consider

transaction costs. It is possible that the results would be less impressive if

transaction costs were taken into account. Second, both the impressive returns and the optimal

choice of firm characteristics may not persist in the future. As Figure 1 shows,

the PPP parameters vary over the years, so going forward the relative significance of each

firm characteristic in forming the optimal portfolio may change.

In addition, several theoretical and empirical research directions could be

pursued in the future. For example:

1) Introduce qualitative firm-specific characteristics in addition to the traditional quantitative characteristics

2) Conduct a systematic scan of key potential combinations of three firm characteristics and explore return patterns, should data and computational time permit

3) Apply the model to a specific subset of the equity universe (e.g., a single industry, recent IPO stocks, etc.)

4) Modify the model to account for industry/sector-specific characteristics (e.g., a two-layer model that first calculates coefficients specifically for each industry and then conducts relative prioritization/allocation between industries)

5) Add an “exiting” point in the model to account for a potential investor preference for safer assets (e.g., T-bills, bonds)

6) Explore non-linear functions to see if performance can be further improved

7) Test the model on foreign stock exchange datasets and explore whether any differences could be explained by differences in accounting regulation

6. Conclusion

In this paper, we review two recent papers that use the parametric portfolio policy and propose

potential improvements. Specifically, we first introduce the details of the two papers (Brandt et

al., 2009; Hand & Green, 2009) that use the parametric portfolio policy, a novel approach

proposed by Brandt et al., to make the optimal portfolio decision with firm-specific

accounting-based and price-based characteristics. Then, we suggest two potential

improvements. First, we apply a progressive learning algorithm to the coefficient θ. Second, we

introduce discretionary accruals and non-discretionary accruals as firm-specific accounting-

based characteristics to account for the effect of earnings management. Finally, because the

authors are not willing to share their proprietary code, we reverse engineer Matlab programs and

an associated generalized method of moments (GMM) optimizer based on available publications

and demonstrate that we can achieve empirical results comparable to those described by

Hand & Green (2009). We also demonstrate that average returns can be further improved by

incorporating more recent trend information.

In our empirical application, we find that on a pre-transactions-cost basis and with a

relative risk aversion of five, the optimal portfolio formed by the PPP method together with

the six illustrative firm characteristics yields a certainty equivalent return of 66.3% (in-

sample) and 72% (out-of-sample), compared to 48.4% (in-sample) and 43.6% (out-of-

sample) for Hand & Green (2009). The out-of-sample optimal portfolio consistently

outperforms the value-weighted market portfolio over the period 1975-2008, even

during the dot-com bubble and the subprime crisis. In addition, we demonstrate that

average returns can be improved by incorporating more recent trend information using a

simple “sliding window” method, and also by applying our proposed “time-series

correction to out-of-sample θs” method.

We also find that among all six firm characteristics, a high change in earnings leads to the

quantitatively largest overweighting of a stock in the optimal portfolio compared to the

value-weighted market portfolio. Within the three price-based characteristics (MVE, MOM

and BTM), a high book-to-market ratio leads to the quantitatively largest overweighting of a

stock, which is consistent with the finding of Brandt et al. (2009).

Figure 1: Time-series of annual optimal portfolio policy parameter estimates

Figure 2: Comparison of annual returns for PPP out-of-sample optimal portfolio, in-sample optimal portfolio and value-weighted market portfolio

Figure 3: A simple “sliding-window” improves returns, suggesting the potential value of implementing a progressive learning model of θ

Figure 4: Applying time-series correction (five years) to out-of-sample θs improves returns

Table 1: Variable definitions and number of firms by year

This table reports the definitions of firm characteristics used in the PPP method and the number of companies by year in our final dataset after all constraints are imposed.

Definitions of accounting-based firm characteristics and price-based firm characteristics used in the PPP method

Firm size (MVE) = fiscal year market value (price per share times the number of shares outstanding) of common equity.

Book-to-market (BTM) = book value of common equity / MVE.

Momentum (MOM) = compounded return between months t − 13 and t − 2 for the monthly portfolio return at month t.

Accruals (ACC) = (net income − operating cash flow) scaled by average total assets if operating cash flow is available; otherwise ACC = ∆current assets − ∆cash − ∆current liabilities − ∆debt in current liabilities − ∆taxes payable − ∆depreciation, all scaled by average total assets. If any of the above components is missing, we set it to zero.

Change in earnings (UE) = change in net income scaled by average total assets.

Asset growth (AGR) = ln [1 + total assets] − ln [1 + lagged total assets].

Number of firms by year (after requiring the availability of monthly returns and sufficient data to compute all the firm characteristics, and deleting the smallest 20% of stocks)
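As an illustration of the definitions above, the following sketch computes BTM, UE, AGR and the cash-flow version of ACC for a single firm-year. The function names and the input numbers are hypothetical, and average total assets is assumed to be the mean of current and lagged total assets, which the definitions do not spell out.

```python
import math


def average_total_assets(total_assets, lagged_total_assets):
    # Assumption: simple mean of current and lagged total assets.
    return (total_assets + lagged_total_assets) / 2


def book_to_market(book_equity, mve):
    return book_equity / mve


def change_in_earnings(net_income, lagged_net_income, avg_assets):
    return (net_income - lagged_net_income) / avg_assets


def asset_growth(total_assets, lagged_total_assets):
    return math.log(1 + total_assets) - math.log(1 + lagged_total_assets)


def accruals_from_cash_flow(net_income, operating_cash_flow, avg_assets):
    return (net_income - operating_cash_flow) / avg_assets


# Hypothetical firm-year, all amounts in the same currency unit
avg_ta = average_total_assets(109.0, 99.0)           # 104.0
btm = book_to_market(50.0, 100.0)                    # 0.5
ue = change_in_earnings(12.0, 10.0, avg_ta)          # 2 / 104
agr = asset_growth(109.0, 99.0)                      # ln(110 / 100)
acc = accruals_from_cash_flow(12.0, 15.0, avg_ta)    # -3 / 104
```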

Table 2: Parameter estimates, average portfolio weights, and average firm characteristics in the optimal portfolio

                              PPP in-sample   PPP out-of-sample
θ_me                                  -5.06               -7.18
se[θ_me]                               3.03                4.26
θ_btm                                 21.10               13.04
se[θ_btm]                              4.36                8.05
θ_mom                                  7.33                3.78
se[θ_mom]                              3.15                3.94
θ_acc                                -46.05               -3.27
se[θ_acc]                             10.14               14.76
θ_ue                                  51.24               66.51
se[θ_ue]                              11.67               11.53
θ_agr                                -32.17              -53.84
se[θ_agr]                             10.50               13.21
Avg. |w_i| × 100                       0.50                0.57
Avg. max w_i × 100                     3.24                3.28
Avg. min w_i × 100                    -2.25               -2.29
Avg. Σ w_i I(w_i < 0)                 -7.48               -8.63
Avg. Σ I(w_i < 0) / N_t                0.48                0.49
Avg. weighted MVE                     -0.11               -0.40
Avg. weighted BTM                      1.46                1.01
Avg. weighted MOM                      1.12                1.15
Avg. weighted ACC                     -3.27                0.43
Avg. weighted UE                       2.58                4.15
Avg. weighted AGR                     -2.45               -3.54

Table 3: Descriptive statistics on the returns generated by our optimal portfolios compared to those of Hand & Green (2009)

Statistics for annual return   PPP in-sample   PPP out-of-sample   Hand & Green's in-sample   Hand & Green's out-of-sample   VW market
Certainty equivalent                   66.3%                 72%                      48.4%                          43.6%        6.2%
Mean r                                  105%                115%                      74.7%                          75.5%       12.7%
σ(r)                                   54.7%               68.6%                      31.7%                          37.0%       15.4%
Sharpe ratio                            1.83                1.60                       2.18                           1.89        0.45
Information ratio                       1.68                1.51                       2.31                           2.02

References

Ball, R.J., and P. Brown, 1968. An empirical evaluation of accounting income numbers. Journal of Accounting Research 6, 159-178.

Banz, R., 1981. The relationship between return and market value of common stocks. Journal of Financial Economics 9, 3-18.

Bernard, V.L., and J.K. Thomas, 1989. Post-earnings-announcement drift: Delayed price response or risk premium? Journal of Accounting Research 27, 1-36.

Bernard, V.L., and J.K. Thomas, 1990. Evidence that stock prices do not fully reflect the implications of current earnings for future earnings. Journal of Accounting and Economics 13, 305-340.

Brandt, M.W, Santa-Clara, P., and R. Valkanov, 2009. Parametric portfolio policies: Exploiting characteristics in the cross section of equity returns. Review of Financial Studies 22: 3411-3447.

Black, Fischer. 1972. “Capital Market Equilibrium with Restricted Borrowing.” Journal of Business. 45:3, pp. 444-454.

Black, F., and R. Litterman. 1992. Global Portfolio Optimization. Financial Analysts Journal 48:28–43.

Carhart, M. M. 1997. On Persistence in Mutual Fund Performance. Journal of Finance 52:57–82.

Cooper, M.J., Gulen, H., and M.J. Schill, 2008. Asset growth and the cross-section of stock returns. Journal of Finance 63, 1609-1651.

Cooper, M.J., Gulen, H., and M.J. Schill, 2009. The asset growth effect in stock returns. Working paper, University of Utah.

Davis, M. H. A., and A. R. Norman. 1990. On Portfolio Optimization: Forecasting Covariances and Choosing the Risk Model. Review of Financial Studies, 12:937-74.

Dechow, P.M., Sloan, R.G., and A.P. Sweeney, 1995. Detecting earnings management. The Accounting Review 70 (2), 193-225.

Fama, E. F., and K. R. French. 1992. The Cross-Section of Expected Stock Returns. The Journal of Finance, Vol. 47, No. 2 (Jun., 1992), pp. 427-465

Fama, E. F., and K. R. French. 1993. Common Risk Factors in the Returns of Stocks and Bonds. Journal of Financial Economics 33:3–56.

Fama, E. F., and K. R. French. 1996. Multifactor Explanations of Asset Pricing Anomalies. Journal of Finance 51:55–84.

Foster, G., Olsen, C., and T. Shevlin, 1984. Earnings releases, anomalies, and the behavior of security prices. The Accounting Review, 4, 574-603

Green, J., Hand, J.R.M., and M. Soliman, 2009. Going, going, gone? The demise of the accruals anomaly. Working paper, UNC Chapel Hill.

Green, R.C., and B. Hollifield, 1992. When will mean-variance efficient portfolios be well diversified? Journal of Finance 47, 1785-1809.

Grinold, R.C., 1992. Are benchmark portfolios efficient? The Journal of Portfolio Management 19, 1, 34-40.

Hand and Green. 2009. The Importance of Accounting Information in Portfolio Optimization. Working paper.

Hansen, L. P. 1982. Large Sample Properties of Generalized Method of Moments Estimators. Econometrica 50:1029–53.

Hasbrouck, J. 2006. Trading Costs and Returns for US Equities: Estimating Effective Costs from Daily Data. Working paper, NYU.

Haugen, R.A., and N.L. Baker, 1996. Commonality in the determinants of expected stock returns. Journal of Financial Economics 41, 401-439.

Hirshleifer, D., Hou, K., and S.H. Teoh, 2006. The accrual anomaly: Risk or mispricing? Working paper, Ohio State University.

Jagannathan, A. and Skoulakis, G. 2002. Generalized Method of Moments: Application in Finance. Journal of Business & Economic Statistics.

Jacobs, B.R., and K.N. Levy, 1993. Long/short equity investing: Profit from both winners and losers. Journal of Portfolio Management 20 (1), 52-63.

Jacobs, B.R., and K.N. Levy, 1997. The long and short on long-short. The Journal of Investing (Spring) 2-15.

Jegadeesh, N., 1990. Evidence of predictable behavior of security returns. Journal of Finance 45, 881-898.

Jegadeesh, N., and S. Titman, 1993. Returns to buying winners and selling losers: Implications for stock market efficiency. Journal of Finance 48, 65-91.

Jones, S.L, and G. Larsen, 2004. How short selling expands the investment opportunity set and improves upon potential portfolio efficiency. In Short selling: Strategies, risks, and rewards. Ed. F. Fabozzi, Wiley Finance.

Jones, Jennifer J. 1991. Earnings Management During Import Relief Investigations. Journal of Accounting Research, Vol. 29, No. 2 (Autumn, 1991), pp. 193-228

Korkie, B., and H.J. Turtle, 2002. A mean-variance analysis of self-financing portfolios. Management Science 48 (3), 427-443.

Kothari, S.P., 2001. Capital markets research in accounting. Journal of Accounting and Economics 31, 105-231.

Kothari, S.P., and J. Shanken, 2002. Anomalies and efficient portfolio formation. Research monograph, The Research Foundation of AIMR.

Lee, M.C., 2001. Market efficiency and accounting research: a discussion of ‘capital market research in accounting’ by S.P. Kothari. Journal of Accounting and Economics 31, 233-253.

Levy, H., 1983. The capital asset pricing model: Theory and evidence. Economic Journal 93, 145-165.

Levy, M., and Y. Ritov, 2001. Portfolio optimization with many assets: The importance of short-selling. Working paper, Hebrew University of Jerusalem.

Lintner, John. 1965. “The Valuation of Risk Assets and the Selection of Risky Investments in Stock Portfolios and Capital Budgets.” Review of Economics and Statistics. 47:1, pp. 13–37.

Lo, A.W., 2002. The statistics of Sharpe ratios. Financial Analysts Journal, July/August 36-51.

Markowitz, H. M. 1952. Portfolio Selection. Journal of Finance 7:77–91.

Michaud, R. O. 1989. The Markowitz Optimization Enigma: Is Optimized Optimal? Financial Analysts Journal 45:31–42.

Penman, S.H., 2009. Financial statement analysis and security valuation. McGraw-Hill/Irwin.

Reinganum, M.R., 1981. Misspecification of capital asset pricing: Empirical anomalies based on earnings' yields and market values. Journal of Financial Economics 9, 19-46.

Rosenberg, B., Reid, K., and R. Lanstein, 1985. Persuasive evidence of market inefficiency. Journal of Portfolio Management 11, 9-17.

Shumway, T., and V. Warther, 1999. The delisting bias in CRSP’s Nasdaq data and its implications for the size effect. Journal of Finance, 2361-2379.

Sharpe, William F. 1964. “Capital Asset Prices: A Theory of Market Equilibrium under Conditions of Risk.” Journal of Finance. 19:3, pp. 425–442.

Sharpe, W.F., 1994. The Sharpe ratio. Journal of Portfolio Management 21 (1), 49-58.

Sloan, R.G., 1996. Do stock prices fully reflect information in accruals and cash flows about future earnings? The Accounting Review 71, 3, 289-315.

Sorensen, Bent E. Teaching notes on GMM, 2007

Stattman, D. 1980. Book Values And Stock Returns. The Chicago MBA: A Journal Of Selected Papers, 4, 25-45.

Wikipedia

Zivot, Eric, “Generalized Method of Moments” class notes, University of Washington
