Simulating a basketball match with a homogeneous Markov model and forecasting the outcome

International Journal of Forecasting 28 (2012) 532–542

© 2011 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.

1. Introduction

As Stekler, Sendor, and Verlander (2010) recently pointed out in their survey of sports forecasting, a large amount of effort is spent in forecasting the outcomes of sporting events. While part of this effort focuses on the properties of different types of forecasts and forecasters, the majority of such research is driven by sports betting and revolves around betting markets. The evolution of betting markets brings new challenges for forecasters. In the past, only a few different betting markets were available for a sports event. Today, bets can be made on all kinds of outcomes, ranging from the number of rebounds in a basketball match to the number of corners in a football match. Furthermore, the increasing accessibility of in-play betting has brought a demand for real-time forecasts based on the progression of the sporting event. One way of dealing with these issues in a uniform way is to construct a model that can simulate the sporting event. In this paper we take a step in this direction by exploring the use of Markov models for basketball match simulation.

Corresponding author. E-mail addresses: [email protected] (E. Štrumbelj), [email protected] (P. Vračar).

1.1. Related work

Basketball has received far less attention in the scientific literature than horse racing or football. Zak, Huang, and Sigfried (1979) combined defensive and offensive elements to rank individual teams. This ranking produced the same order as the teams' actual efficiencies. However, no attempt was made to use the ranking to forecast the results of individual matches. Berri (1999) proposed a model for measuring how individual players contribute to a team's success. The sum of the contributions of all of its players was a good estimator of the actual number of wins achieved by a team. Kvam and Sokol (2006) used a Markov model to predict the winner of NCAA basketball matches. Each team was a state in the model, transition probabilities were fit using the teams' past performances, and the teams were ranked according to the model's stationary distribution. The model was better at forecasting the winner than several other statistical models and polls. Stern (1994) used the scores at the end of each quarter of 493 NBA matches to fit a Brownian motion model of the score progression. The model was good at estimating in-play win probabilities, given the current points differences. For a more comprehensive list of basketball-related forecasting references, especially those which focus on forecasting match outcomes from teams' ratings or rankings, see Stekler et al. (2010).

0169-2070/$ - see front matter. doi:10.1016/j.ijforecast.2011.01.004

Journal homepage: www.elsevier.com/locate/ijforecast

Erik Štrumbelj, Petar Vračar
Faculty of Computer and Information Science, Tržaška 25, 1000 Ljubljana, Slovenia

Article info

Keywords: Sports forecasting; Probability forecasting; Monte Carlo; Simulation; National Basketball Association

Abstract

We used a possession-based Markov model to model the progression of a basketball match. The model's transition matrix was estimated directly from NBA play-by-play data and indirectly from the teams' summary statistics. We evaluated both this approach and other commonly used forecasting approaches: logit regression of the outcome, a latent strength rating method, and bookmaker odds. We found that the Markov model approach is appropriate for modelling a basketball match and produces forecasts of a quality comparable to that of other statistical approaches, while giving more insight into basketball. Consistent with previous studies, bookmaker odds were the best probabilistic forecasts.

The most relevant is the work by Shirley (2007). Shirley used a Markov model with states which corresponded to how a team obtained (retained) possession and how many points were scored during the previous transition. Shirley fit a model for each team using summary statistics and showed that the simulations based on such models were good at estimating a team's win probability against an average opponent. However, the results were in-sample and the quality of the model as a forecaster was not investigated. Shirley also discussed fitting the model using exclusively in-match data (as opposed to summary statistics) and taking into account the individual teams' strengths, but did not pursue this due to insufficient data.

1.2. Research objectives

Motivated by Shirley's work, we investigated the use of Markov models for basketball match simulation. While the long term goal is a detailed and accurate simulation of a basketball match, legitimate questions can be asked about whether a more complex model can still produce accurate forecasts for something as relevant as the match outcome. Therefore, our main objective was to determine whether such a model could be used to simulate a match between two specific teams, while still producing probabilistic forecasts of a quality comparable to that of the other statistical approaches that are most commonly used to forecast the outcome of a sporting event. We have revisited Shirley's average home vs. average away and team vs. average team models and fit them using play-by-play data from 2252 NBA basketball matches. We have also proposed an extension which takes into account the strengths of the two competing teams. This is achieved by inferring the transition probabilities from the teams' summary statistics.

We found that some related basketball forecasting studies do not include an ex-ante evaluation. Those that do usually evaluate the forecasts using the percentage of correct predictions, which is more appropriate for evaluating binary forecasts than for evaluating probabilistic forecasts. Therefore, a secondary objective is an overview of the quality of different probabilistic forecasts for basketball. We evaluate our approach and three of the most commonly used approaches in forecasting: logit regression of the outcome, a latent strength ranking model, and bookmaker odds.

The remainder of the paper is organized as follows. In the following section we describe the Markov model and analyze its properties on NBA play-by-play data. Section 3 focuses on producing a match-specific Markov model from the teams' summary statistics. In Section 4 we describe three other well-known approaches to forecasting. The results of the empirical evaluation are given in Section 5. Finally, in Section 6 we conclude the paper and discuss some ideas for further work.

2. A possession-based Markov model

In basketball, two teams (A and B) alternate possession of the ball and try to score points. Most modern approaches to basketball analysis evaluate the match at the level of possessions (see Kubatko, Oliver, Pelton, & Rosenbaum, 2007), focusing on the number of possessions a team has and how effectively they convert them into points. Shirley (2007) proposed a Markov model with states that correspond to a team obtaining (or retaining) possession. He considered the following states: restarting the action with an inbound pass (i), steal or non-whistle turnover (s), offensive (o) or defensive rebound (d), and going to the free throw line after a shooting/bonus/technical foul (f).¹ These states cover all moments in a game during which the teams can obtain possession of the ball. However, such a state does not imply a change in possession. For example, a team can get two consecutive offensive rebounds, thus obtaining the ball while already having possession (that is, retaining possession). Therefore, it is possible for a team to keep possession for several consecutive states.

The model takes into account the obtainment (retainment) of possession, how it was obtained, and the number of points scored during the transition. A transition encompasses everything that happens between two states. During a transition, 0, 1, 2, or 3 points are scored. This leads us to the set of states: {A, B} × {i, s, o, d, f} × {0, 1, 2, 3}. For example, let team B miss a shot and team A rebound, then team A miss a 2pt shot, get an offensive rebound, and dunk for 2pt while being fouled for an extra free throw. The following states and transitions (each arrow labelled with the points scored) describe this sequence up to the free throw line: $\xrightarrow{0} A_{d0} \xrightarrow{0} A_{o0} \xrightarrow{2} A_{f2}$. Only 30 of the 40 states are actually possible. If possession was obtained by a steal, no points could have been scored on the previous transition, and scoring 3 points in a transition can never lead to a rebound.² This eliminates 10 states. Table 1 shows the remaining 30 states and describes the transitions between them.
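The count of 30 feasible states can be checked by enumerating the product set and applying the two exclusion rules from the text. The following is a minimal sketch; the function and variable names are illustrative and not taken from the paper.

```python
from itertools import product

TEAMS = ("A", "B")
METHODS = ("i", "s", "o", "d", "f")  # inbound, steal, off. rebound, def. rebound, free throws
POINTS = (0, 1, 2, 3)


def is_possible(method: str, points: int) -> bool:
    """Apply the two exclusion rules stated in the text."""
    if method == "s" and points > 0:
        return False  # a steal cannot follow a scoring transition
    if method in ("o", "d") and points == 3:
        return False  # scoring 3 points never leads to a rebound
    return True


all_states = [f"{t}{m}{p}" for t, m, p in product(TEAMS, METHODS, POINTS)]
states = [s for s in all_states if is_possible(s[1], int(s[2]))]

print(len(all_states), len(states))  # 40 30
```

Per team, the rules remove s1, s2, s3, o3 and d3, which gives the 10 eliminated states.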

First, we estimated the transition probabilities using the frequencies from the play-by-play data available for the 2007/08 and 2008/09 seasons, separately. The result was a model for a match between the average home team and the average away team for the corresponding season: the average home vs. average away model. For more details about the play-by-play data and how they were transformed into states, see Table 2. The NBA play-by-play data were obtained from www.basketballgeek.com.
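Estimating transition probabilities from frequencies amounts to counting observed state-to-state transitions and normalizing each row. The sketch below assumes the play-by-play data have already been parsed into per-match state sequences. The function name and the toy sequences are hypothetical and are not real NBA data.

```python
from collections import Counter, defaultdict


def estimate_transition_matrix(sequences):
    """Relative transition frequencies, pooled over all observed state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    matrix = {}
    for current, row in counts.items():
        total = sum(row.values())
        matrix[current] = {nxt: n / total for nxt, n in row.items()}
    return matrix


# Toy input: two made-up state sequences (not real play-by-play data).
toy_sequences = [
    ["Ai2", "Bd0", "As0", "Bi2", "Ad0", "Bi2"],
    ["Ai2", "Bd0", "Ai2", "Bi2", "Ad0"],
]
P = estimate_transition_matrix(toy_sequences)
```

Rows are stored sparsely as dictionaries, so transitions that were never observed simply have no entry. With only a few matches per team this leaves many zero cells.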

The transition matrix can be used to estimate the number of points scored and the win probabilities. We used a Markov chain Monte Carlo model to estimate the model's stationary distribution. From this distribution we get the number of points per transition, which, multiplied by the number of transitions, gives an estimate of the number of points scored. To estimate the win probabilities, 10 000 matches were simulated (draws were ignored). Note that some teams play fast-paced basketball, while others play at a slower pace. For example, the mean number of transitions per game for the 2007/08 (2008/09) season was 267 (264) with a standard deviation of 7.9 (8.0).
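The simulation step can be sketched as a random walk over the chain for a fixed number of transitions. Several things here are assumptions for illustration: the four-state toy matrix and its probabilities are made up, and the points encoded in the destination state are credited to the team that held possession in the source state, which is our reading of the state definitions and not a rule stated explicitly in the text.

```python
import random

# Toy 4-state chain (hypothetical probabilities): a possession ends either in a
# made 2pt shot (opponent inbounds) or in a miss and a defensive rebound.
TOY_P = {
    "Ai2": {"Bi2": 0.38, "Bd0": 0.62},
    "Ad0": {"Bi2": 0.38, "Bd0": 0.62},
    "Bi2": {"Ai2": 0.36, "Ad0": 0.64},
    "Bd0": {"Ai2": 0.36, "Ad0": 0.64},
}


def simulate_match(P, start, n_transitions, rng):
    """Walk the chain; credit points to the team in possession before the transition."""
    score = {"A": 0, "B": 0}
    state = start
    for _ in range(n_transitions):
        row = P[state]
        nxt = rng.choices(list(row), weights=list(row.values()))[0]
        score[state[0]] += int(nxt[2])  # third character = points on the transition
        state = nxt
    return score["A"], score["B"]


def win_probability(P, start, n_transitions, n_matches, seed=0):
    """Share of simulated non-drawn matches won by team A (draws are ignored)."""
    rng = random.Random(seed)
    wins = decided = 0
    for _ in range(n_matches):
        a, b = simulate_match(P, start, n_transitions, rng)
        if a != b:
            decided += 1
            wins += a > b
    return wins / decided
```

A call such as `win_probability(TOY_P, "Ad0", 267, 10_000)` mirrors the 10 000 simulated matches and the 2007/08 mean of 267 transitions mentioned above.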

¹ Note that going to the free throw line is a state, while subsequent free throws are not considered as states. Instead, they are encompassed in the transition from the free-throw state to the next state.

² Scoring 2 points can lead to a rebound, for example in the situation when a player has 3 free throws, and makes the first two but misses the last one.

Table 1
Outline of the transition matrix of Shirley's Markov model for basketball. The numbers/letters and the corresponding descriptions provide more details about how a transition can occur between the row and column states.

Row and column states:
  Team A has possession: Ai0 Ai1 Ai2 Ai3 As0 Ao0 Ao1 Ao2 Ad0 Ad1 Ad2 Af0 Af1 Af2 Af3
  Team B has possession: Bi0 Bi1 Bi2 Bi3 Bs0 Bo0 Bo1 Bo2 Bd0 Bd1 Bd2 Bf0 Bf1 Bf2 Bf3

Non-empty cells of each row, listed in column order:
  Ai0-Ai3, As0, Ao0-Ao2, Ad0-Ad2:  0 2 4 6 7 8 A B C D F
  Af0:                              0 1 1 2 3 3 5 5 5 8 9 9 B D E E G G G
  Af1, Af2:                         0 1 2 3 5 5 8 9 9 D E G G
  Af3:                              0 2 5 8 9 D G
  Bi0-Bi3, Bs0, Bo0-Bo2, Bd0-Bd2:  8 A B C D F 0 2 4 6 7
  Bf0:                              8 9 9 B D E E G G G 0 1 1 2 3 3 5 5 5
  Bf1, Bf2:                         8 9 9 D E G G 0 1 2 3 5 5
  Bf3:                              8 9 D G 0 2 5

Transition codes:
  0  timeout; non-free-throw foul; defense knocks ball out of bounds
  1  missed last free throw, defense tipped ball out of bounds
  2  missed shot or all free throws, offensive rebound
  3  missed last free throw, offensive rebound
  4  fouled
  5  missed last free throw, loose ball foul
  6  made 2pt shot, shooting foul
  7  made 3pt shot, shooting foul
  8  missed shot, tipped ball out of bounds; offensive foul or violation
  9  missed last FT, tipped ball out of bounds; made last free throw
  A  made 2pt shot
  B  made 3pt shot; made 3 free throws
  C  steal; block; turnover without whistle
  D  missed shot or all free throws, defensive rebound
  E  missed last free throw, defensive rebound
  F  technical foul; offensive foul when in bonus
  G  missed last free throw, loose ball foul

For 2007/08 and 2008/09, the two teams with the quickest style of play were the Denver Nuggets (287) and the Golden State Warriors (285). The San Antonio Spurs had the slowest pace in both 2007/08 and 2008/09, with 255 and 249 transitions per game, respectively. Throughout the experiments, we use the teams' mean numbers of transitions as an estimate of the match length. When estimating for a particular team, only that team's games are used. When estimating for a match between two specific teams, the mean of the two teams' means is used.

Table 2
An excerpt from the first quarter of a match between Seattle (A) and Phoenix (B).

  Play-by-play data                                    Parsed model states
  #  Team  Player            Event         Result      Possession  Method        Pts. on previous  State
  1  PHX   Amare Stoudemire  Shot          2pt         SEA         Inbound pass  2                 Ai2
  2  SEA   Damien Wilkins    Shot          Missed      -
  3  PHX   Grant Hill        Def. rebound  -           PHX         Def. rebound  0                 Bd0
  4  PHX   Steve Nash        Bad pass      -           SEA         Steal         0                 As0
  5  SEA   Kevin Durant      Shot          2pt         PHX         Inbound pass  2                 Bi2
  6  PHX   Shawn Marion      Shot          3pt Missed  -
  7  SEA   Earl Watson       Def. rebound  -           SEA         Def. rebound  0                 Ad0
  8  SEA   Damien Wilkins    Shot          2pt         PHX         Inbound pass

Table 3 shows that both the 2007/08 and 2008/09 average home vs. average away models produce good estimates. Fig. 1 shows how the home team's win probability changes over time, given leads of different sizes. This was estimated using the 2007/08 model, and its shape is consistent with the results presented by Stern (1994). Therefore, the Markov model is capable of modelling the home team advantage and the progress of the score over time.

Table 3
Actual mean values and average home vs. average away models' estimates for the 2007/08 and 2008/09 seasons.

                     Actual                                  Estimated
  Season   Matches   Home win %  Home points  Away points    Home win %  Home points  Away points
  2007/08  1132      0.606       100.9        97.1           0.598       101.0        97.5
  2008/09  1122      0.609       100.4        97.0           0.607       100.9        97.5

Fig. 1. Home team's win probability as a function of time and given leads of different sizes. Estimated using the 2007/08 season's home vs. away model.

Shirley (2007) also fit a smaller 18-state model using the teams' summary statistics (3pt %, field goal %, rebounds, etc.) for the 2003/04 season, for each team separately: the team vs. average opponent model. Shirley's results suggest that the models provide a good estimate of the teams' actual win percentages (R² = 0.935). We repeated the procedure for the 2007/08 season, using the 30-state model and the play-by-play data. We estimated two models for each team, one from the team's home games and one from the team's away games. The two win probabilities estimated by these two models, weighted by the share of home (away) matches, give an estimate of the team's win percentage against an average opponent. Fig. 3(a) shows that this is a good estimate of the teams' actual win percentages. The results are similar if an out-of-sample ex-ante evaluation procedure is used (see Fig. 3(b)). We used a rolling procedure: only matches which preceded a given match were used to estimate the transition matrix for that match.

Fig. 3. Actual vs. estimated win percentages for the 2007/08 season. (a) In-sample results. (b) Out-of-sample results.

2.1. Limitations of the described Markov model approach

The teams' win percentages obtained using the Markov model and the Monte Carlo simulation are unbiased estimates of the home team win and the number of points. However, when looking at individual teams, the models overestimate the weaker teams (see Fig. 3). The same thing is observed in the results of Shirley (2007). The proposed Markov model is homogeneous: the transition probabilities do not change as the match progresses. This is a strong assumption, and those familiar with the sport will agree that it is not a valid assumption. We therefore treated each second as a separate bin and recorded the number of points scored in that particular second (see Fig. 2). Each quarter of an NBA match is 12 min or 720 s long. We pooled data from all four quarters of the 2008/09 season (the 2007/08 data give similar results). The number of points scored (and the method of scoring) remains on a similar level throughout a quarter, with the exception of approximately the first and last minutes. Those familiar with the game of basketball know how the dynamics of the match differ between time periods. In the final seconds, the teams hold up the ball and attempt to score just before time runs out to deny their opponents another possession, etc. Note also that more points are scored from free throws as a quarter progresses, because more bonus foul situations arise as the teams accumulate fouls.

Fig. 2. Actual total number of points scored for each second-long slice of a quarter. We pooled data from all four quarters of each available match from the 2008/09 season.
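The per-second pooling used to examine the homogeneity assumption can be sketched as a simple binning of scoring events by the second of the quarter in which they occurred. The event format, the function name, and the toy events are assumptions for illustration.

```python
from collections import Counter

QUARTER_SECONDS = 12 * 60  # 720 s per NBA quarter, as stated in the text


def points_per_second(scoring_events):
    """Sum points into one-second bins, pooling all quarters together.

    scoring_events: iterable of (seconds_elapsed_in_quarter, points) pairs.
    """
    bins = Counter()
    for second, points in scoring_events:
        if not 0 <= second < QUARTER_SECONDS:
            raise ValueError(f"second {second} is outside the quarter")
        bins[second] += points
    return [bins[s] for s in range(QUARTER_SECONDS)]


# Made-up events for illustration only.
toy_events = [(15, 2), (15, 3), (300, 1), (719, 2)]
profile = points_per_second(toy_events)
```

A flat profile would be consistent with a homogeneous chain, whereas spikes near the end of the quarter point to time-dependent transition probabilities.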

Differences also exist between quarters. Table 4 shows the mean total number of points scored and the mean difference between the numbers of points scored by the teams (net points), for each quarter. We tested the significance of the differences between quarters with a repeated measures analysis of variance, and obtained [...] a quarter does not have a strong impact on its practical use. Using all available matches, we also performed a two-way analysis of variance, using the quarter (4 categories) and minute (12 categories) as factors, and the number of points scored as the target variable. Both factors, quarter (Pr(>|F|) = 3.1 × 10⁻⁵) and minute (Pr(>|F|) < 10⁻¹⁵), as well as the interaction quarter × minute (Pr(>|F|) [...]

[Table fragment]
  ORBrh    5.687   3.936  0.149  0.278  0.223  0.359  0.025
  DRBrh    5.444   4.212  0.196  0.743  0.680  0.792  0.021
  FTFh     5.938   3.905  0.128  0.243  0.185  0.315  0.026
  oEFG%h   25.490  4.961

[...] loss function (also known as the Brier score) is defined as $(p_i - o_i)^2$, where $p_i$ and $o_i$ are the predicted probability of a home win and the actual outcome for the home team (1 = won, 0 = lost), respectively.

4.1. Latent strength modelling

Team (player) ratings or rankings are often used for forecasting the outcome of a sports event. The rating is used to describe a team's latent strength, and the ratings of a given pair of teams are used to determine the outcome of a match between the two teams. Before any matches have been played, the ratings are set to a predetermined value, which is often the same for all teams. As matches are played, the ratings are updated, depending on the teams' performances. A well-known method for rating a player's skill level is the Elo rating system. Given the ratings $R_A$ and $R_B$ of teams A and B, the probability that A wins is

\[ \Pr(A \text{ wins}) = \frac{10^{R_A/400}}{10^{R_A/400} + 10^{R_B/400}}. \tag{1} \]

Given the outcome of the match, we adjust the ratings by adding $\Delta R = K\,(\text{actual outcome for A} - \Pr(A \text{ wins}))$ to $R_A$ and subtracting $\Delta R$ from $R_B$, where $K$ is a rate-of-change constant. Note that Eq. (1) does not take into account the home team advantage. For example, observe the match between home team A and away team B, where the two teams have similar ratings. The increase in rating for team A, if they win at home, will be the same as the increase in rating for team B, if they win an away match. This is unfair, because team A had the advantage of playing at home. We compensate for this by subtracting a constant from the away team's rating,

\[ \Pr(A \text{ wins}) = \frac{10^{R_A/400}}{10^{R_A/400} + 10^{(R_B - C)/400}}, \tag{2} \]

effectively handicapping the away team.

The Elo rating system was originally developed for rating chess players, where there is no need to take the winning margin into account. However, in basketball, it is reasonable to take the winning margin into account, rewarding bigger wins more.³ Hvattum and Arntzen (2010) used an extension of the Elo system, where the value of $K$ is set according to the winning margin, $K = K_0(1 + \delta)^{\lambda}$, where $\delta$ is the absolute winning margin and $K_0 > 0$ and $\lambda > 0$ are parameters. Hvattum and Arntzen (2010) used this approach for ranking football teams. However, it can easily be adapted for any sport. All that is required is a proper calibration of the parameters $K_0$ and $\lambda$. Note that sometimes parameters other than 10 and 400 are used in Eqs. (1) and (2). The purpose of these two parameters is to set the scale of the ratings. Parameters $K$ and $C$ require more attention. We used the following parameter values for the basic ELO model and the margin-based ELO extension: $K = 24.0$, $K_0 = 2$ and $\lambda = 1$. The home team advantage constant was set to $C = 64$. To facilitate the out-of-sample evaluation, we did not use data from 2007/08 or 2008/09 in the calibration. The chosen parameter values minimized the squared loss for the 2006/07 season. We rounded the values to simplify presentation, without significantly worsening the performance. The match scores which were used to perform the calibration were obtained from www.basketball-reference.com. References to several more complex latent-strength modelling approaches to sports forecasting, including that of Glickman and Stern (1998), are given by Stekler et al. (2010). Improvements might also be achieved by using the teams' ELO ratings (or some other measure of the team quality) and fitting a Bradley-Terry model (Bradley & Terry, 1952).
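The rating update described in this section can be sketched in a few lines. The default parameter values are those reported above (K = 24, K0 = 2, exponent 1, C = 64). The function names, the margin formula written as K0 · (1 + margin)^λ, and the starting rating of 1500 in the usage note are assumptions for illustration.

```python
def win_probability_home(r_home, r_away, c=64.0):
    """Eq. (2): home win probability with the away rating handicapped by C."""
    q_home = 10 ** (r_home / 400)
    q_away = 10 ** ((r_away - c) / 400)
    return q_home / (q_home + q_away)


def rate_of_change(margin=None, k=24.0, k0=2.0, lam=1.0):
    """Constant K for basic Elo, or K0 * (1 + margin) ** lam when a margin is given."""
    if margin is None:
        return k
    return k0 * (1 + abs(margin)) ** lam


def update(r_home, r_away, home_won, margin=None, c=64.0):
    """Return the updated (home, away) ratings after one match."""
    expected = win_probability_home(r_home, r_away, c)
    delta = rate_of_change(margin) * ((1.0 if home_won else 0.0) - expected)
    return r_home + delta, r_away - delta
```

For two equally rated teams, `win_probability_home(1500, 1500)` is about 0.59, so a home win earns the home team fewer points than an away win would earn the visitors. Setting `c=0` recovers Eq. (1).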

4.2. Bookmaker odds as forecasts

Bookmakers' odds are probabilistic forecasts. For example, let $d_A$ and $d_B$ be the quoted decimal odds for a match between teams A and B. Then $p_A = \frac{1}{d_A}$ and $p_B = \frac{1}{d_B}$ are win probability estimates. Due to bookmakers' profit margins, $\frac{1}{d_A}$ and $\frac{1}{d_B}$ can sum to more than 1, and thus it is common to use the normalization $p'_A = \frac{p_A}{p_A + p_B}$ and $p'_B = \frac{p_B}{p_A + p_B}$.
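The conversion from decimal odds to normalized probabilities can be illustrated with a short function. The function name and the example odds are hypothetical.

```python
def implied_probabilities(d_a, d_b):
    """Convert decimal odds to normalized win probabilities."""
    p_a, p_b = 1 / d_a, 1 / d_b
    total = p_a + p_b  # exceeds 1 when the odds include a profit margin
    return p_a / total, p_b / total


# Hypothetical odds: 1/1.60 + 1/2.40 is roughly 1.042 before normalization.
p_a, p_b = implied_probabilities(1.60, 2.40)
```

With these odds the raw inverse odds sum to roughly 1.042, and normalization yields probabilities of 0.6 and 0.4.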

We used odds from the world's largest betting exchange: Betfair. Unlike regular bookmakers, betting exchanges do not offer odds directly, but simply facilitate peer-to-peer betting. Users can both back a selection (bet on an event occurring) or lay a bet (bet against an event occurring) at any odds they choose. Matching back and lay bets are matched by the system. The odds change over time, so we used the odds of the first matched bet (Betfair0) and the odds most frequently matched during the 5 min before the start of the match (Betfair1). These historical odds data were obtained from data.betfair.com.⁴

Bookmaker odds are considered to be the overall best source for probabilistic forecasts of sporting events. However, recent results by Franck, Verbeek, and Nüesch (2010) and Smith, Patton, and Williams (2010) suggest that betting exchange odds are better forecasts than regular bookmaker odds. However, these results are limited to horse racing and football. To the best of our knowledge, no empirical evidence exists for basketball. To test this hypothesis, we compared forecasts from 11 bookmakers and Betfair across a set of 692 NBA matches from 2008/09, such that all bookmakers' forecasts were available for each match. The selected bookmakers were 5Dimes (0.1859, 6.24), Bet365 (0.1856, 6.68), BetCRIS (0.1857, 6.76), BetUS (0.1862, 7.05), Betway (0.1866, 7.20), Canbet (0.1858, 7.01), Expekt (0.1859, 6.70), Jetbull (0.1868, 6.64), Paddy Power (0.1864, 6.37), Sports Interaction (0.1858, 6.22), Unibet (0.1866, 5.60), and Betfair (0.1855, 5.55). The numbers in parentheses represent the mean quadratic loss and mean rank (with respect to the mean quadratic loss), both across all 692 selected matches. We found no significant difference between the bookmakers' prediction qualities (p-value ≈ 1). By applying the square root to the mean quadratic loss of the best model (Betfair, 0.4307) and the worst model (Betway, 0.4320), we express them in the same unit as the predictions, thus simplifying comparison and interpretation. The difference in quality between the best and worst bookmakers is approximately one tenth of a percent. However, Betfair has the lowest mean rank. That is, the Betfair odds are, on average, closest to the actual outcome. Friedman's non-parametric test revealed that the differences in rank are significant (p-value ≈ 0). We used Nemenyi's test for post-hoc comparison. The critical difference in rank at the 0.01 significance level was 0.74. Therefore, the Betfair loss ranks significantly lower (better) than those of eight other bookmakers.

³ Some rating systems, for example Sagarin ratings, also adjust for blowouts, so that a very large margin in a single game does not have a proportionately large impact on the team's rating.

⁴ The data are free of charge, but some terms apply. One must be a registered Betfair user with a certain (symbolic) number of loyalty points.
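The two comparisons in this paragraph, the square root of the mean quadratic loss and the count of bookmakers whose mean rank exceeds Betfair's by more than the critical difference, can be reproduced from the reported figures. The sketch below only re-applies the numbers quoted in the text and does not rerun the Friedman or Nemenyi tests. The variable names are illustrative.

```python
from math import sqrt

# (mean quadratic loss, mean rank) as reported in the text.
BOOKMAKERS = {
    "5Dimes": (0.1859, 6.24), "Bet365": (0.1856, 6.68), "BetCRIS": (0.1857, 6.76),
    "BetUS": (0.1862, 7.05), "Betway": (0.1866, 7.20), "Canbet": (0.1858, 7.01),
    "Expekt": (0.1859, 6.70), "Jetbull": (0.1868, 6.64), "Paddy Power": (0.1864, 6.37),
    "Sports Interaction": (0.1858, 6.22), "Unibet": (0.1866, 5.60), "Betfair": (0.1855, 5.55),
}
CRITICAL_DIFFERENCE = 0.74  # Nemenyi critical difference at the 0.01 level (from the text)

# Root of the mean quadratic loss: same unit as the probability forecasts.
root_loss = {name: sqrt(loss) for name, (loss, _) in BOOKMAKERS.items()}

betfair_rank = BOOKMAKERS["Betfair"][1]
significantly_worse = sorted(
    name for name, (_, rank) in BOOKMAKERS.items()
    if rank - betfair_rank > CRITICAL_DIFFERENCE
)

print(round(root_loss["Betfair"], 4), round(root_loss["Betway"], 4))  # 0.4307 0.432
print(len(significantly_worse))  # 8
```

The bookmakers that do not exceed the critical difference are 5Dimes, Sports Interaction and Unibet, which leaves the eight mentioned in the text.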

We also tested the distribution of the difference in loss between Betfair0 odds and Betfair1 odds. Using a pairwise t-test, we rejected the zero mean hypothesis and accepted the alternative hypothesis that the mean is greater than 0 (p-value ≈ 0). Therefore, the odds from just before the start of the match are significantly better probabilistic forecasts than the odds from the start of the betting period. This is consistent with the findings of Gandar et al., who showed that opening lines set by bookmakers (in our case, the market) are less accurate than the closing lines established by the market (see Gandar, Dare, Brown, & Zuber, 1998; Gandar, Zuber, & Dare, 2000).⁵

5. Empirical evaluation

We compared the following sources of forecasts: (1) the home team vs. average opponent model from Section 2 (MAvsAVG), (2) the Markov model approach (using multinomial logit row models) described in Section 3 (MML), (3) Betfair1 odds, (4) the Elo rating models (see Section 4.1), and (5) a logit regression model using the

⁵ Stekler et al. (2010) suggest that this indicates that the market (closing line) is more accurate than the bookmakers (opening line). We should interpret such indirect evidence with caution. Opening lines could also be worse because more information becomes available after opening lines have been set. Without all of the participants' forecasts at different points in time, we cannot distinguish between the two possibilities.

[...] the optimization of any such linear opinion pool which included Betfair odds always resulted in the Betfair odds being weighted approximately 1. The bookmaker odds forecasts could not be improved on using a linear opinion pool. We did not investigate other types of forecast combinations.

We selected the weights (0.54 for ELO, 0.46 for MML) that minimized the quadratic loss on the 2006/07 data, which were reserved for parameter selection and are not used in the evaluation. In the out-of-sample procedure, we started with a sample of the first five home and first five away matches for each team (we did not include these matches in the evaluation). A rolling procedure was then used to forecast the remaining matches in a season: after forecasting and evaluating the forecasts for the next day's matches, we added the matches to the sample and rebuilt all models using the current sample of matches. This procedure was repeated until the end of the season.

We evaluated each season separately. We evaluated the models' forecasts using the quadratic loss function. We also measured the percentage of correct predictions when the forecasts were treated as being binary. For each basketball game, we had all of the models' losses (that is, a complete block design). We viewed individual games as subjects and forecasting models as treatments. We were interested in seeing whether there exist differences in mean loss between forecasting models. The equivalence [...]

[...] MAvsAVG worse than all of the other forecasts. The Betfair1 odds are better than all of the other forecasts. The differences in quality between the remaining statistical models are not significant. The in-sample results for some of the models illustrate how misleading their quality can be and emphasize the importance of evaluating ex-ante forecasts.

All of the statistical approaches were unbiased estimators of the win percentage, but, when observing individual teams, they exhibited a bias towards weaker teams (see Table 12). This can be explained by the fact that the statistical models do not take into account the amount of effort that was invested in a match (see Section 2.1). In other words, the summary statistics (and/or transition frequencies) are not completely accurate representations of a team's actual strength. The three statistical approaches (logit regression, ELO rating, and the proposed model) exhibit different levels of bias, but produce forecasts of the same quality. Therefore, each model represents a different bias-variance trade-off.

6. Conclusion

Using summary statistics to estimate Shirley's Markov model for basketball produced a model for a match between two specific teams. The model was used to simulate the match and produce outcome forecasts of a quality comparable to that of other statistical approaches, [...]

Table 10
Empirical evaluation results. The accuracy is the proportion of correct predictions [...]

  2007/08 (N = 957)               In-sample loss  Out-of-sample loss
  Betfair1                                        0.1900
  (54/100) ELO + (46/100) MML                     0.1978
  ELO                                             0.2009
  Logit                           0.1800          0.2022
  MML                             0.1890          0.2048
  ELO                                             0.2065
  MAvsAVG                         0.2003          0.2199
  Pr(Home win) = 0.6                              0.2373

Table 11
Adjusted p-values for pairwise tests of the equality of means. The alternative [...] column (model).

Table 12
Linear models of the relationship between the actual win percentage and the [...] All of the forecasters exhibit a statistically significant bias towards the weaker [...]

              MML                   (54/100) ELO + (46/100) MML
              Intercept  Slope      Intercept  Slope
  Estimate    0.170      0.657      0.131      0.73[...]
  Std. Err.   0.020      0.039      0.015      0.02[...]
  t-value     8.354      16.694     8.844      25.7[...]
  Pr(>|t|)    [...]
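The evaluation measures and the linear opinion pool used in the empirical evaluation can be sketched as follows. The weights 0.54 and 0.46 are those reported in the text. The function names and the four toy forecasts are made up for illustration. Treating a probability of exactly 0.5 as a predicted home loss is an arbitrary choice of this sketch.

```python
def quadratic_loss(probs, outcomes):
    """Mean of (p_i - o_i)^2 over all matches (Brier score)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def accuracy(probs, outcomes):
    """Share of matches where the binary pick (home win if p > 0.5) was correct."""
    return sum((p > 0.5) == bool(o) for p, o in zip(probs, outcomes)) / len(probs)


def linear_pool(forecasts, weights):
    """Weighted average of several forecasters' probabilities, match by match."""
    return [sum(w * p for w, p in zip(weights, ps)) for ps in zip(*forecasts)]


# Made-up forecasts for four matches; 1 = home win, 0 = home loss.
elo = [0.70, 0.55, 0.40, 0.65]
mml = [0.60, 0.65, 0.30, 0.45]
outcomes = [1, 1, 0, 0]
pooled = linear_pool([elo, mml], [0.54, 0.46])  # weights reported in the text
```

A constant forecast of 0.6 scores a loss of 0.16 on every home win and 0.36 on every home loss, which is the kind of baseline shown in the last row of Table 10.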
