TRANSCRIPT
Crowdsourced alpha
Macquarie Global Quant Conference, Hong Kong, 22 September 2014
Vinesh Jha, CEO
ExtractAlpha
23 September 2014
Agenda
• Motivation: new data
• What is crowdsourced alpha?
• Crowdsourced earnings estimates
• Alpha capture
• Financial bloggers
August, 2007
• Oops! Our models are all pretty much the same!
[Figure: Daily returns to 1-day reversal strategy, Aug 2007 (Khandani and Lo, 2007). Y-axis: -4% to 7%; x-axis: 7/30/2007 to 8/31/2007.]
Why?
• Pretty much the same universe…
• Pretty much the same modeling techniques…
• Pretty much the same risk models…
• Pretty much the same alphas!
Where to get new data? (1)
• Collect from existing traditional data sets, but dig deeper
  • Detailed financial statements
  • Industry specific
  • Footnotes
  • Conference call transcripts
  • Broker research reports
  • News sentiment
Where to get new data? (2)
• Collect as “exhaust” from alternate sources
  • Transactional data (e.g., point of sale)
  • Consumer behavior (e.g., foot traffic, web traffic)
Where to get new data? (3)
• Crowdsource it!
  • Ask people…
    • To collect data for you (e.g., Premise Data Corp)
    • For their opinions (the subject of this talk)
  • …or collect those opinions from public forums
    • Blogs, social media, Amazon reviews, …
Caveats
• History often limited (or none if you’ve just started asking people)
• In-sample/out-of-sample testing harder
• Cross-sectional coverage often thin
• Often US only
• Nasty formats
• No clean identifiers
• Need to figure out sentiment
Defining crowdsourced alpha
• The use of multiple humans’ forecasts to make investment decisions
• The humans needn’t know their forecasts are being used
• They needn’t actually be explicit forecasts
  • For example, sentiment on products in a Twitter feed
What we need
• A measurable thing to forecast
  • Which we need help in forecasting
• Humans to make the forecasts
• A platform for submission or collection of the forecasts
  • This often starts with unstructured data
  • Often have to build this yourself!
• An incentive for people to contribute their forecasts
• A way to clean up the noise
What we want
• A diversity of opinions or knowledge
• A good way to measure skill
  • Objective
  • Appropriate to the task and to our expectations
• Persistence of skill
  • Can only find this if there is a diversity of skill levels
• Incentive to forecast well (not just to forecast)
  • Leaderboards, monetary incentives, track record, marketing
Example
• IBES
  • The original crowdsourced alpha!
  • Fulfills all of our needs
    • Platform for structured, useful, relatively noise-free forecasts by incentivized humans
  • And most of our wants
    • Somewhat diverse opinions, persistent skill
    • Incentives are mixed
We can use the crowd to forecast...

What              Who                                   Where
EPS, revenues     Sell side                             IBES, FactSet, CapIQ
                  Buy side, independents, individuals   Estimize
Returns           Sell side research                    IBES, FactSet, CapIQ, TipRanks
                  Sell side sales desks                 TIM Group
                  Financial bloggers and newsletters    Seeking Alpha, Motley Fool, TipRanks
                  Individuals                           PredictWallStreet, StockTwits
Social sentiment  Twitterers, Facebookists              Gnip, many others
Macro             Governments, economists               Reuters, Gov't, Consensus Economics, Estimize
M&A               Rumor mill                            Mergerize by Estimize
Strategies        Fund managers                         FoF, multistrat firms
                  Algo developers                       Quantopian, QuantConnect
Crowdsourced earnings estimates
• Data from Estimize
  • EPS and revenue estimates
  • November 2011-2014 on U.S. stocks
  • Pseudonymous
• Contributor base
  • Buy side, independent, individuals, and students
  • Diversity of backgrounds and forecasting methodologies
  • Users can contribute biographical information
Estimize data
• 25,000 registered users, 75,000 unique viewers of data last quarter
• 4,000 contributors, 17,000 estimates made last quarter
• Coverage (3+ estimates) on 900+ stocks in recent quarters
• Cleaned for errors/noise
• Highly seasonal
How accurate?
• For what % of EPS reports is the Estimize consensus closer to actual EPS than is the sell side?
  • Just using equally weighted crowd estimates

                 n      % more accurate   Estimize error   Wall Street error
>= 1 analyst     8971   53%               17.3%            17.4%
>= 3 analysts    4916   58%               13.7%            14.5%
>= 10 analysts   1438   62%               11.7%            12.6%
>= 20 analysts   487    62%               12.6%            13.3%
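
The "% more accurate" comparison can be sketched roughly as follows. This is only an illustration under stated assumptions: the consensus is taken as the plain mean of individual estimates, "closer" means a strictly smaller absolute error, and the function names and the four toy reports are made up rather than taken from the Estimize data.

```python
def consensus(estimates):
    """Equally weighted consensus: the plain mean of the individual estimates."""
    return sum(estimates) / len(estimates)


def pct_more_accurate(reports):
    """Share of reports where the crowd consensus is strictly closer to actual EPS."""
    wins = 0
    for report in reports:
        crowd_err = abs(consensus(report["crowd"]) - report["actual"])
        street_err = abs(consensus(report["street"]) - report["actual"])
        if crowd_err < street_err:
            wins += 1
    return wins / len(reports)


# Hypothetical EPS reports, for illustration only.
reports = [
    {"actual": 1.00, "crowd": [0.98, 1.03, 1.01], "street": [0.93, 0.95]},
    {"actual": 0.50, "crowd": [0.40, 0.44], "street": [0.48, 0.51, 0.49]},
    {"actual": 2.10, "crowd": [2.05, 2.12, 2.08], "street": [1.98, 2.00]},
    {"actual": 0.75, "crowd": [0.76, 0.74], "street": [0.70, 0.71]},
]
share = pct_more_accurate(reports)
print(f"Crowd consensus closer in {share:.0%} of reports")
```

In this toy set the crowd is closer in three of the four reports; the slide's table applies the same kind of count to real reports, bucketed by how many analysts cover the stock.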
A better benchmark for expectations

                          Estimize                            Wall Street
                          N      1 day    2 day    5 day      N      1 day     2 day     5 day
IC                        4614   0.010    0.016    0.024      4614   (0.018)   (0.012)   (0.001)
Mean return
  All surprises           4548   0.14%    0.14%    0.19%      4417   0.08%     0.03%     0.00%
  > 1% surprises          4059   0.14%    0.13%    0.16%      4107   0.07%     0.02%     -0.01%
  > 5% surprises          2521   0.20%    0.20%    0.21%      2755   0.13%     0.06%     0.01%
  > 10% surprises         1654   0.20%    0.25%    0.27%      1849   0.10%     0.05%     -0.09%
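
To make the benchmark comparison concrete, here is a small sketch that measures the same earnings reports against two different consensus numbers and computes an information coefficient (IC) for each. The assumptions are mine: the surprise is defined as (actual - consensus) / |consensus|, the IC is a rank correlation between surprise and subsequent return, and all numbers are invented. The slide does not say which definitions were used.

```python
def pct_surprise(actual, consensus):
    """Surprise as a fraction of the consensus (assumed definition)."""
    return (actual - consensus) / abs(consensus)


def ranks(values):
    """Rank positions 0..n-1; assumes there are no ties in the toy data."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order):
        out[idx] = rank
    return out


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def rank_ic(signal, forward_returns):
    """Rank correlation between a signal and the returns that follow it."""
    return pearson(ranks(signal), ranks(forward_returns))


# Hypothetical data: four earnings reports and the return after each one.
actual = [1.00, 0.50, 2.00, 0.80]
crowd = [0.95, 0.52, 1.80, 0.80]
street = [0.90, 0.40, 2.05, 0.78]
next_day_return = [0.01, -0.02, 0.03, 0.00]

ic_crowd = rank_ic([pct_surprise(a, c) for a, c in zip(actual, crowd)], next_day_return)
ic_street = rank_ic([pct_surprise(a, c) for a, c in zip(actual, street)], next_day_return)
print(ic_crowd, ic_street)
```

The toy numbers are deliberately chosen so that the crowd-based surprise lines up with returns and the sell-side one does not; real ICs, as in the table, are far smaller in magnitude.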
Earnings surprise strategy
[Figure: Cumulative residual return to surprise strategies (1 day holding, 5 day holding). Y-axis: -40% to 80%; x-axis: 11/3/2011 to 2/3/2014.]

Holding period     1 day    5 day
Ann ret            25.7%    10.7%
Ann SD             19.8%    14.5%
Sharpe             1.30     0.73
% days invested    29%      77%
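
The summary statistics above can be reproduced from a daily return series along these lines. The reported 1 day figures are consistent with Sharpe = annualized return / annualized SD (25.7% / 19.8% ≈ 1.30). Everything else in this sketch is an assumption: a 252-day year, no risk-free adjustment (the returns are residual), days with no position counted as zero return, and a made-up return series.

```python
import math

TRADING_DAYS = 252  # assumed annualization factor; not stated on the slide


def annualized_stats(daily_returns):
    """daily_returns: floats, with None on days when no position is held."""
    filled = [0.0 if r is None else r for r in daily_returns]
    n = len(filled)
    mean = sum(filled) / n
    var = sum((r - mean) ** 2 for r in filled) / (n - 1)
    ann_ret = mean * TRADING_DAYS
    ann_sd = math.sqrt(var * TRADING_DAYS)
    invested = sum(r is not None for r in daily_returns) / n
    return {
        "ann_ret": ann_ret,
        "ann_sd": ann_sd,
        "sharpe": ann_ret / ann_sd,
        "pct_days_invested": invested,
    }


# Hypothetical daily residual returns; None marks days with no open position.
toy_returns = [0.010, None, -0.004, None, None, 0.006, None, 0.002]
stats = annualized_stats(toy_returns)
print(stats)
```

A short holding period leaves the strategy uninvested on most days, which is why "% days invested" is reported alongside the Sharpe ratio.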
Accuracy is persistent
• Require >= 5 prior quarters
• Compute error z score relative to other estimators, adjust by coverage
• Of estimators in the top (bottom) 20% per their prior coverage, what % end up in the top (bottom) 20% going forward?
  • Would be 20% if random

            Current ----->
Prior       Bad      Good     Persistence
Bad         26.4%    19.4%    7.0%
Good        15.4%    21.0%    5.6%
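
A persistence table of this kind can be sketched as a transition count between prior and current quintiles. In the table, the Persistence column appears to be the same-bucket share minus the opposite-bucket share (for example 26.4% - 19.4% = 7.0%). The sketch assumes a lower error z score is better and uses ten invented estimators; the function names are hypothetical.

```python
def bucket(scores, frac=0.2):
    """Label each estimator 'good' (lowest error), 'bad' (highest error) or 'mid'."""
    ordered = sorted(scores, key=scores.get)
    k = max(1, int(len(ordered) * frac))
    good, bad = set(ordered[:k]), set(ordered[-k:])
    return {name: "good" if name in good else "bad" if name in bad else "mid"
            for name in ordered}


def transition(prior, current):
    """Share of prior bad/good estimators that land in the current bad/good bucket."""
    p, c = bucket(prior), bucket(current)
    table = {}
    for start in ("bad", "good"):
        members = [name for name in p if p[name] == start]
        table[start] = {end: sum(c[name] == end for name in members) / len(members)
                        for end in ("bad", "good")}
    return table


# Hypothetical error z scores for ten estimators (lower is more accurate).
prior = {"a": -1.5, "b": -1.0, "c": -0.5, "d": -0.2, "e": 0.0,
         "f": 0.1, "g": 0.3, "h": 0.6, "i": 1.1, "j": 1.8}
current = {"a": -1.2, "b": 0.2, "c": -0.9, "d": 0.0, "e": -0.1,
           "f": 0.4, "g": -0.3, "h": 0.5, "i": 0.9, "j": 1.4}

table = transition(prior, current)
persistence = {"bad": table["bad"]["bad"] - table["bad"]["good"],
               "good": table["good"]["good"] - table["good"]["bad"]}
print(table, persistence)
```

With random skill each cell would sit near 20%, so any positive persistence value is the part of the ranking that carries over from one period to the next.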
What makes for an accurate estimate?
• Regress estimate-level accuracy (% error) against the following (+/- marks the expected sign):
  • Track record (+)
    • How good has the analyst been in this sector in the past?
  • Difficulty of forecasting (-)
    • Condition track record on the overall accuracy of the Estimize community
    • Expect less accuracy if everyone’s been inaccurate
  • Amount of coverage (+)
    • More is better, to a point
  • Days to report (-)
    • More recent forecasts contain more information
  • Bias (+)
    • Higher estimates tend to be more accurate
  • Commentary (+)
    • Estimates accompanied by commentary are more accurate
What makes for an accurate estimate?
N = 23,342

Factor            Parameter   T          p
Track record      0.07        9.38       <.0001
Difficulty        (0.03)      (2.69)     0.007
Coverage          0.02        3.34       0.001
Days to report    (0.10)      (11.59)    <.0001
Bias              0.14        21.85      <.0001
Comment           0.02        2.07       0.039
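
For readers who want to see the mechanics, the sketch below fits a least-squares regression of an accuracy score on two of the factors, using only the standard library. It is not the model behind the table: the data are invented and noiseless, only two factors are included, and the helper names (`solve`, `ols`) are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(X, y):
    """Least-squares coefficients via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)


# Hypothetical estimate-level rows: (track_record, days_to_report).
features = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 3)]
X = [[1.0, tr, days] for tr, days in features]  # leading 1.0 is the intercept
# Made-up accuracy score built from known coefficients, so the fit is checkable.
y = [2.0 + 0.5 * tr - 1.0 * days for tr, days in features]

intercept, b_track_record, b_days_to_report = ols(X, y)
print(intercept, b_track_record, b_days_to_report)
```

The fitted signs match the pattern in the table (positive for track record, negative for days to report) only because the toy data were built that way.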
Alpha capture
• Data from TIM Group
• Sell side trading desks produce short-term (5-30 day) trade ideas for select hedge fund clients
• These are distinct from research desks’ recommendations
• Incentive: paid for in commissions!
• Global, 2006-2014
• 300 brokers, 3,000 contributors (“authors”) providing stock-level forecasts
• 64% of ideas are long
Alpha capture event study
[Figure: Cumulative signed mean return to ideas by region (North America, Europe, Asia). Y-axis: -0.4% to 0.4%; x-axis: -5 to 20.]
• Residual returns
• Require > US$100mm market cap, > US$4 equivalent, > US$1mm ADV
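
A cumulative signed mean return path like the one in this event study could be computed roughly as below. The assumptions are mine: "signed" means each idea's residual return is multiplied by +1 for a long idea and -1 for a short idea, daily means are summed rather than compounded, and the three toy ideas and the function name are hypothetical.

```python
def cumulative_signed_mean(ideas):
    """ideas: list of (direction, residual_returns), one return per event day.

    direction is +1 for a long idea and -1 for a short idea.
    Returns the running sum of the cross-idea mean signed return.
    """
    horizon = len(ideas[0][1])
    path, total = [], 0.0
    for day in range(horizon):
        mean = sum(direction * rets[day] for direction, rets in ideas) / len(ideas)
        total += mean
        path.append(total)
    return path


# Hypothetical ideas with three event days each.
ideas = [
    (+1, [0.001, 0.002, 0.001]),
    (-1, [0.000, -0.003, 0.001]),
    (+1, [-0.001, 0.001, 0.000]),
]
path = cumulative_signed_mean(ideas)
print(path)
```

Signing the returns lets long and short ideas be pooled in one curve: a short idea followed by a falling price contributes positively, just as a long idea followed by a rising price does.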
Stronger & longer for small caps
[Figure: Cumulative signed mean return to ideas by size (Large, Mid, Small). Y-axis: -0.4% to 0.6%; x-axis: -5 to 20.]
Ideas just after earnings are weaker
[Figure: Cumulative signed mean return to ideas by earnings date, North America (Post-earnings, others). Y-axis: -0.4% to 0.4%; x-axis: -5 to 20.]
Author performance is persistent
• Require at least 3 ideas over the last 2 years
• Measure performance by a blend of average return, Sharpe ratio, hit rate
  • Using residual returns

            Current ----->
Prior       Bad      Good     Persistence
Bad         26.1%    17.9%    8.3%
Good        18.8%    27.6%    8.9%
Combining the insights: The TIM Indicator
Financial bloggers
• Data from TipRanks
• U.S., 2010-2014
• Data from 65 financial blogs: 122,000 recommendations from 4,000+ authors on 2,000+ stocks, collected in real time
• NLP used to algorithmically determine buy/sell recommendation from the blog
• 84% long
[Figure: Blogger recommendations per month (Nbuy, Nsell). Y-axis: 0 to 3,500; x-axis: 201009 to 201403.]
Why blogs?
• Many include in-depth research by independent analysts, not captured in other data sets like news or broker research
• Contributors include buy side and industry experts
• Contributors are often compensated for providing original research, and typically disclose their positions
• Editorial standards vary across blogs; for example, some are trying to upsell to premium content
Blogger event study
• Residual returns
• Require > $100mm market cap, > $4, > $1mm ADV
[Figure: Cumulative residual returns, blogger recommendations (Buy, Sell). Y-axis: -0.3% to 0.3%; x-axis: -10 to 60.]
TipRanks Expert Sentiment Signal (TRESS)
• Same universe, market neutral “deciles”
• 18.8% annualized returns, Sharpe 2.13
[Figure: returns chart. Y-axis: -10% to 80%; x-axis: 9/1/2010 to 6/1/2014.]
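
One common way to build a market neutral decile portfolio from a stock-level signal is sketched below. The slide does not describe how TRESS portfolios were actually constructed, so this is a generic illustration: equal weights, long the top decile and short the bottom decile, with a hypothetical function name and invented signal values.

```python
def decile_long_short(signal):
    """Equal-weight long the top decile and short the bottom decile of a signal.

    signal: dict mapping stock name to signal value (higher is more bullish).
    Returns a dict of weights that sum to zero (dollar neutral).
    """
    ordered = sorted(signal, key=signal.get)
    k = max(1, len(ordered) // 10)
    weights = {name: 0.0 for name in ordered}
    for name in ordered[-k:]:
        weights[name] = 1.0 / k
    for name in ordered[:k]:
        weights[name] = -1.0 / k
    return weights


# Hypothetical universe of 20 stocks with made-up signal values.
signal = {f"s{i}": float(i) for i in range(20)}
weights = decile_long_short(signal)
print({name: w for name, w in weights.items() if w != 0.0})
```

Equal long and short weights make the portfolio dollar neutral; a production version would typically also control for beta, sector, and other risk exposures.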
So…
• Lots of interesting new data out there (finally!)
• We need to be OK with limited history for many data sets
• Crowdsourcing, broadly defined, seems to add value!
• Let’s crowdsource more things!