International Journal of Forecasting 28 (2012) 532–542
Contents lists available at SciVerse ScienceDirect
International Journal of Forecasting
journal homepage: www.elsevier.com/locate/ijforecast

Simulating a basketball match with a homogeneous Markov model and forecasting the outcome

Erik Štrumbelj, Petar Vračar
Faculty of Computer and Information Science, Tržaška 25, 1000 Ljubljana, Slovenia

Article info

Keywords: Sports forecasting; Probability forecasting; Monte Carlo; Simulation; National Basketball Association

Abstract

We used a possession-based Markov model to model the progression of a basketball match. The model's transition matrix was estimated directly from NBA play-by-play data and indirectly from the teams' summary statistics. We evaluated both this approach and other commonly used forecasting approaches: logit regression of the outcome, a latent strength rating method, and bookmaker odds. We found that the Markov model approach is appropriate for modelling a basketball match and produces forecasts of a quality comparable to that of other statistical approaches, while giving more insight into basketball. Consistent with previous studies, bookmaker odds were the best probabilistic forecasts.

© 2011 International Institute of Forecasters. Published by Elsevier B.V. All rights reserved.

1. Introduction

As Stekler, Sendor, and Verlander (2010) recently pointed out in their survey of sports forecasting, a large amount of effort is spent in forecasting the outcomes of sporting events. While part of this effort focuses on the properties of different types of forecasts and forecasters, the majority of such research is driven by sports betting and revolves around betting markets. The evolution of betting markets brings new challenges for forecasters. In the past, only a few different betting markets were available for a sports event. Today, bets can be made on all kinds of outcomes, ranging from the number of rebounds in a basketball match to the number of corners in a football match. Furthermore, the increasing accessibility of in-play betting has brought a demand for real-time forecasts based on the progression of the sporting event. One way of dealing with these issues in a uniform way is to construct a model that can simulate the sporting event. In this paper we take a step in this direction by exploring the use of Markov models for basketball match simulation.

Corresponding author. E-mail addresses: [email protected] (E. Štrumbelj), [email protected] (P. Vračar).

1.1. Related work

Basketball has received far less attention in the scientific literature than horse racing or football. Zak, Huang, and Sigfried (1979) combined defensive and offensive elements to rank individual teams. This ranking produced the same order as the teams' actual efficiencies. However, no attempt was made to use the ranking to forecast the results of individual matches. Berri (1999) proposed a model for measuring how individual players contribute to a team's success. The sum of the contributions of all of its players was a good estimator of the actual number of wins achieved by a team. Kvam and Sokol (2006) used a Markov model to predict the winner of NCAA basketball matches. Each team was a state in the model, transition probabilities were fit using the teams' past performances, and the teams were ranked according to the model's stationary distribution. The model was better at forecasting the winner than several other statistical models and polls. Stern (1994) used the scores at the end of each quarter of 493 NBA matches to fit a Brownian motion model of the score progression. The model was good at estimating in-play win probabilities, given the current points differences. For a more comprehensive list of basketball-related forecasting references, especially those which focus on forecasting match outcomes from teams' ratings or rankings, see Stekler et al. (2010).

0169-2070/$ – see front matter. doi:10.1016/j.ijforecast.2011.01.004



    The most relevant is the work by Shirley (2007). Shirley used a Markov model with states which corresponded to how a team obtained (retained) possession and how many points were scored during the previous transition. Shirley fit a model for each team using summary statistics and showed that the simulations based on such models were good at estimating a team's win probability against an average opponent. However, the results were in-sample and the quality of the model as a forecaster was not investigated. Shirley also discussed fitting the model using exclusively in-match data (as opposed to summary statistics) and taking into account the individual teams' strengths, but did not pursue this due to insufficient data.

    1.2. Research objectives

    Motivated by Shirley's work, we investigated the use of Markov models for basketball match simulation. While the long term goal is a detailed and accurate simulation of a basketball match, legitimate questions can be asked about whether a more complex model can still produce accurate forecasts for something as relevant as the match outcome. Therefore, our main objective was to determine whether such a model could be used to simulate a match between two specific teams, while still producing probabilistic forecasts of a quality comparable to that of the other statistical approaches that are most commonly used to forecast the outcome of a sporting event. We have revisited Shirley's average home vs. average away and team vs. average team models and fit them using play-by-play data from 2252 NBA basketball matches. We have also proposed an extension which takes into account the strengths of the two competing teams. This is achieved by inferring the transition probabilities from the teams' summary statistics.

    We found that some related basketball forecasting studies do not include an ex-ante evaluation. Those that do usually evaluate the forecasts using the percentage of correct predictions, which is more appropriate for evaluating binary forecasts than for evaluating probabilistic forecasts. Therefore, a secondary objective is an overview of the quality of different probabilistic forecasts for basketball. We evaluate our approach and three of the most commonly used approaches in forecasting: logit regression of the outcome, a latent strength ranking model, and bookmaker odds.

    The remainder of the paper is organized as follows. In the following section we describe the Markov model and analyze its properties on NBA play-by-play data. Section 3 focuses on producing a match-specific Markov model from the teams' summary statistics. In Section 4 we describe three other well-known approaches to forecasting. The results of the empirical evaluation are given in Section 5. Finally, in Section 6 we conclude the paper and discuss some ideas for further work.

    2. A possession-based Markov model

    In basketball, two teams (A and B) alternate possession of the ball and try to score points. Most modern approaches to basketball analysis evaluate the match at the level of possessions (see Kubatko, Oliver, Pelton, & Rosenbaum, 2007), focusing on the number of possessions a team has and how effectively they convert them into points. Shirley (2007) proposed a Markov model with states that correspond to a team obtaining (or retaining) possession. He considered the following states: restarting the action with an inbound pass (i), steal or non-whistle turnover (s), offensive (o) or defensive rebound (d), and going to the free throw line after a shooting/bonus/technical foul (f).¹ These states cover all moments in a game during which the teams can obtain possession of the ball. However, such a state does not imply a change in possession. For example, a team can get two consecutive offensive rebounds, thus obtaining the ball while already having possession (that is, retaining possession). Therefore, it is possible for a team to keep possession for several consecutive states.

    The model takes into account the obtainment (retainment) of possession, how it was obtained, and the number of points scored during the transition. A transition encompasses everything that happens between two states. During a transition, 0, 1, 2, or 3 points are scored. This leads us to the set of states: $\{A, B\} \times \{i, s, o, d, f\} \times \{0, 1, 2, 3\}$. For example, let team B miss a shot and team A rebound, then team A miss a 2pt shot, get an offensive rebound, and dunk for 2pt while being fouled for an extra free throw. The following states and transitions describe this sequence up to the free throw line: $\cdots \xrightarrow{0} \mathrm{Ad0} \xrightarrow{0} \mathrm{Ao0} \xrightarrow{2} \mathrm{Af2}$. Only 30 of the 40 states are actually possible. If possession was obtained by a steal, no points could have been scored on the previous transition, and scoring 3 points in a transition can never lead to a rebound.² This eliminates 10 states. Table 1 shows the remaining 30 states and describes the transitions between them.
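The count of 30 states can be checked by enumerating the 40 combinations and discarding the impossible ones. The following is a minimal sketch of that count, not the authors' code; the names `TEAMS`, `METHODS`, `POINTS`, `is_possible` and `enumerate_states` are hypothetical, and the state labels follow the `Ai2`-style naming used in the tables.

```python
from itertools import product

TEAMS = ("A", "B")
# i = inbound pass, s = steal, o = offensive rebound,
# d = defensive rebound, f = free throw line
METHODS = ("i", "s", "o", "d", "f")
POINTS = (0, 1, 2, 3)  # points scored on the transition into the state


def is_possible(method: str, points: int) -> bool:
    """Apply the two exclusion rules stated in the text."""
    if method == "s":
        # A steal cannot follow a transition in which points were scored.
        return points == 0
    if method in ("o", "d"):
        # Scoring 3 points in a transition can never lead to a rebound.
        return points < 3
    return True


def enumerate_states() -> list:
    """Return the labels of all possible states, e.g. 'Ai2' or 'Bd0'."""
    return [
        f"{team}{method}{points}"
        for team, method, points in product(TEAMS, METHODS, POINTS)
        if is_possible(method, points)
    ]


STATES = enumerate_states()
print(len(STATES))
```

Each team loses the states s1, s2, s3, o3 and d3, which accounts for the 10 eliminated states.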

    First, we estimated the transition probabilities using the frequencies from the play-by-play data available for the 2007/08 and 2008/09 seasons, separately. The result was a model for a match between the average home team and the average away team for the corresponding season: the average home vs. average away model. For more details about the play-by-play data and how they were transformed into states, see Table 2. The NBA play-by-play data were obtained from www.basketballgeek.com.
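Estimating transition probabilities from frequencies amounts to counting consecutive state pairs and normalizing each row. The sketch below illustrates this under the assumption that the play-by-play data have already been parsed into per-match state sequences; the function name and the two toy sequences are hypothetical and are not taken from the paper's data.

```python
from collections import Counter, defaultdict


def estimate_transition_probabilities(sequences):
    """Estimate P[current][next] as relative frequencies of observed transitions."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # Count every pair of consecutive states within a match.
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    probabilities = {}
    for state, row in counts.items():
        total = sum(row.values())
        probabilities[state] = {nxt: n / total for nxt, n in row.items()}
    return probabilities


# Hypothetical parsed state sequences (two very short "matches").
toy_sequences = [
    ["Ai2", "Bd0", "As0", "Bi2", "Ad0", "Bi2"],
    ["Ai2", "Bd0", "Ai2"],
]
P = estimate_transition_probabilities(toy_sequences)
print(P["Bd0"])
```

States that never occur as a source state get no row in this sketch; with real data one would also need to decide how to treat transitions that are possible but unobserved.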

    The transition matrix can be used to estimate the number of points scored and the win probabilities. We used a Markov chain Monte Carlo model to estimate the model's stationary distribution. From this distribution we get the number of points per transition, which, multiplied by the number of transitions, gives an estimate of the number of points scored. To estimate the win probabilities, 10 000 matches were simulated (draws were ignored). Note that some teams play fast-paced basketball, while others play at a slower pace. For example, the mean number of transitions per game for the 2007/08 (2008/09) season was 267 (264) with a standard deviation of 7.9 (8.0).
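A match simulation of the kind described above can be sketched as a random walk over the states for a fixed number of transitions. This is an illustrative sketch only: the four-state matrix `TOY_P` is hypothetical, the function names are invented here, and crediting the points of a transition to the team in possession in the source state is a simplifying assumption.

```python
import random

# Hypothetical four-state matrix: A scores 2 on half of its possessions,
# B on 45% of its possessions. State labels follow the 'Ai2' convention.
TOY_P = {
    "Ai2": {"Bi2": 0.50, "Bd0": 0.50},
    "Ad0": {"Bi2": 0.50, "Bd0": 0.50},
    "Bi2": {"Ai2": 0.45, "Ad0": 0.55},
    "Bd0": {"Ai2": 0.45, "Ad0": 0.55},
}


def simulate_match(P, start_state, n_transitions, rng):
    """Walk the chain for n_transitions steps and return (points_A, points_B)."""
    score = {"A": 0, "B": 0}
    state = start_state
    for _ in range(n_transitions):
        next_states = list(P[state])
        weights = [P[state][s] for s in next_states]
        nxt = rng.choices(next_states, weights=weights)[0]
        # Assumption: points go to the team holding the ball in the source state.
        score[state[0]] += int(nxt[2])
        state = nxt
    return score["A"], score["B"]


def estimate_win_probability(P, start_state, n_transitions=267,
                             n_matches=10_000, seed=0):
    """Share of simulated matches won by team A, ignoring draws."""
    rng = random.Random(seed)
    wins_a = decided = 0
    for _ in range(n_matches):
        a, b = simulate_match(P, start_state, n_transitions, rng)
        if a != b:
            decided += 1
            wins_a += a > b
    return wins_a / decided if decided else float("nan")


p_a = estimate_win_probability(TOY_P, "Ai2", n_matches=500)
print(round(p_a, 3))
```

The default of 267 transitions mirrors the 2007/08 mean quoted in the text; the usage line runs only 500 matches to keep the example fast.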

    1 Note that going to the free throw line is a state, while subsequent free throws are not considered as states. Instead, they are encompassed in the transition from the free-throw state to the next state.

    2 Scoring 2 points can lead to a rebound, for example in the situation when a player has 3 free throws, and makes the first two but misses the last one.

    Table 1
    Outline of the transition matrix of Shirley's Markov model for basketball. The numbers/letters and the corresponding descriptions provide more details about how a transition can occur between the row and column states.

    Row and column states, in order:
    Team A has possession: Ai0 Ai1 Ai2 Ai3 As0 Ao0 Ao1 Ao2 Ad0 Ad1 Ad2 Af0 Af1 Af2 Af3
    Team B has possession: Bi0 Bi1 Bi2 Bi3 Bs0 Bo0 Bo1 Bo2 Bd0 Bd1 Bd2 Bf0 Bf1 Bf2 Bf3

    Transition codes appearing in each row, in column order (empty cells omitted):
    Row state(s)                      Codes
    Ai0-Ai3, As0, Ao0-Ao2, Ad0-Ad2    0 2 4 6 7 8 A B C D F
    Af0                               0 1 1 2 3 3 5 5 5 8 9 9 B D E E G G G
    Af1, Af2                          0 1 2 3 5 5 8 9 9 D E G G
    Af3                               0 2 5 8 9 D G
    Bi0-Bi3, Bs0, Bo0-Bo2, Bd0-Bd2    8 A B C D F 0 2 4 6 7
    Bf0                               8 9 9 B D E E G G G 0 1 1 2 3 3 5 5 5
    Bf1, Bf2                          8 9 9 D E G G 0 1 2 3 5 5
    Bf3                               8 9 D G 0 2 5

    Code  Description
    0     timeout; non-free-throw foul; defense knocks ball out of bounds
    1     missed last free throw, defense tipped ball out of bounds
    2     missed shot or all free throws, offensive rebound
    3     missed last free throw, offensive rebound
    4     fouled
    5     missed last free throw, loose ball foul
    6     made 2pt shot, shooting foul
    7     made 3pt shot, shooting foul
    8     missed shot, tipped ball out of bounds; offensive foul or violation
    9     missed last FT, tipped ball out of bounds; made last free throw
    A     made 2pt shot
    B     made 3pt shot; made 3 free throws
    C     steal; block; turnover without whistle
    D     missed shot or all free throws, defensive rebound
    E     missed last free throw, defensive rebound
    F     technical foul; offensive foul when in bonus
    G     missed last free throw, loose ball foul

    For 2007/08 and 2008/09, the two teams with the quickest style of play were the Denver Nuggets (287) and the Golden State Warriors (285). The San Antonio Spurs had the slowest pace in both 2007/08 and 2008/09, with 255 and 249 transitions per game, respectively. Throughout the experiments, we use the teams' mean numbers of transitions as an estimate of the match length. When estimating for a particular team, only that team's games are used. When estimating for a match between two specific teams, the mean of the two teams' means is used.

    Table 2
    An excerpt from the first quarter of a match between Seattle (A) and Phoenix (B).

    Play-by-play data                                    Parsed model states
    #  Team  Player            Event         Result      Possession  Method        Pts. on previous  State
    1  PHX   Amare Stoudemire  Shot          2pt         SEA         Inbound pass  2                 Ai2
    2  SEA   Damien Wilkins    Shot          Missed      -
    3  PHX   Grant Hill        Def. rebound  -           PHX         Def. rebound  0                 Bd0
    4  PHX   Steve Nash        Bad pass      -           SEA         Steal         0                 As0
    5  SEA   Kevin Durant      Shot          2pt         PHX         Inbound pass  2                 Bi2
    6  PHX   Shawn Marion      Shot          3pt Missed  -
    7  SEA   Earl Watson       Def. rebound  -           SEA         Def. rebound  0                 Ad0
    8  SEA   Damien Wilkins    Shot          2pt         PHX         Inbound pass

    Table 3
    Actual mean values and average home vs. average away models' estimates for the 2007/08 and 2008/09 seasons.

                      Actual                                  Estimated
    Season   Matches  Home win %  Home points  Away points   Home win %  Home points  Away points
    2007/08  1132     0.606       100.9        97.1          0.598       101.0        97.5
    2008/09  1122     0.609       100.4        97.0          0.607       100.9        97.5

    Fig. 1. Home team's win probability as a function of time and given leads of different sizes. Estimated using the 2007/08 season's home vs. away model.

    Table 3 shows that both the 2007/08 and 2008/09 average home vs. average away models produce good estimates. Fig. 1 shows how the home team's win probability changes over time, given leads of different sizes. This was estimated using the 2007/08 model, and its shape is consistent with the results presented by Stern (1994). Therefore, the Markov model is capable of modelling the home team advantage and the progress of the score over time.

    Shirley (2007) also fit a smaller 18-state model using the teams' summary statistics (3pt %, field goal %, rebounds, etc.) for the 2003/04 season, for each team separately: the team vs. average opponent model. Shirley's results suggest that the models provide a good estimate of the teams' actual win percentages ($R^2 = 0.935$). We repeated the procedure for the 2007/08 season, using the 30-state model and the play-by-play data. We estimated two models for each team, one from the team's home games and one from the team's away games. The two win probabilities estimated by these two models, weighted by the share of home (away) matches, give an estimate of the team's win percentage against an average opponent. Fig. 3(a) shows that this is a good estimate of the teams' actual win percentages. The results are similar if an out-of-sample ex-ante evaluation procedure is used (see Fig. 3(b)). We used a rolling procedure: only matches which preceded a given match were used to estimate the transition matrix for that match.

    Fig. 3. Actual vs. estimated win percentages for the 2007/08 season. (a) In-sample results. (b) Out-of-sample results.

    2.1. Limitations of the described Markov model approach

    The teams' win percentages obtained using the Markov model and the Monte Carlo simulation are unbiased estimates of the home team win and the number of points. However, when looking at individual teams, the models overestimate the weaker teams (see Fig. 3). The same thing is observed in the results of Shirley (2007). The proposed Markov model is homogeneous: the transition probabilities do not change as the match progresses. This is a strong assumption, and those familiar with the sport will agree that it is not a valid assumption. We therefore treated each second as a separate bin and recorded the number of points scored in that particular second (see Fig. 2). Each quarter of an NBA match is 12 min or 720 s long. We pooled data from all four quarters of the 2008/09 season (the 2007/08 data give similar results). The number of points scored (and the method of scoring) remains on a similar level throughout a quarter, with the exception of approximately the first and last minutes. Those familiar with the game of basketball know how the dynamics of the match differ between time periods. In the final seconds, the teams hold up the ball and attempt to score just before time runs out to deny their opponents another possession, etc. Note also that more points are scored from free throws as a quarter progresses, because more bonus foul situations arise as the teams accumulate fouls.

    Fig. 2. Actual total number of points scored for each second-long slice of a quarter. We pooled data from all four quarters of each available match from the 2008/09 season.

    Differences also exist between quarters. Table 4 shows the mean total number of points scored and the mean difference between the numbers of points scored by the teams (net points), for each quarter. We tested the significance of the differences between quarters with a repeated measures analysis of the variance, and obtained [...] a quarter does not have a strong impact on its practical use. Using all available matches, we also performed a two-way analysis of variance, using the quarter (4 categories) and minute (12 categories) as factors, and the number of points scored as the target variable. Both factors, quarter ($\Pr(>|F|) = 3.1 \times 10^{-5}$) and minute ($\Pr(>|F|) < 10^{-15}$), as well as the interaction quarter × minute ($\Pr(>|F|)$ [...]

    [...]
    ORBrh   5.687   3.936  0.149  0.278  0.223  0.359  0.025
    DRBrh   5.444   4.212  0.196  0.743  0.680  0.792  0.021
    FTFh    5.938   3.905  0.128  0.243  0.185  0.315  0.026
    oEFG%h  25.490  4.961  [...]

    [...] loss function (also known as the Brier score) is defined as $(p_i - o_i)^2$, where $p_i$ and $o_i$ are the predicted probability of a home win and the actual outcome for the home team (1 = won, 0 = lost), respectively.

    4.1. Latent strength modelling

    Team (player) ratings or rankings are often used for forecasting the outcome of a sports event. The rating is used to describe a team's latent strength, and the ratings of a given pair of teams are used to determine the outcome of a match between the two teams. Before any matches have been played, the ratings are set to a predetermined value, which is often the same for all teams. As matches are played, the ratings are updated, depending on the teams' performances. A well-known method for rating a player's skill level is the [...]

    $$\Pr(A \text{ wins}) = \frac{10^{R_A/400}}{10^{R_A/400} + 10^{R_B/400}}. \qquad (1)$$

    Given the outcome of the match, we adjust the ratings by adding $\Delta R = K\,(\text{actual outcome for } A - \Pr(A \text{ wins}))$ to $R_A$ and subtracting $\Delta R$ from $R_B$, where $K$ is a rate-of-change constant. Note that Eq. (1) does not take into account the home team advantage. For example, observe the match between home team A and away team B, where the two teams have similar ratings. The increase in rating for team A, if they win at home, will be the same as the increase in rating for team B, if they win an away match. This is unfair, because team A had the advantage of playing at home. We compensate for this by subtracting a constant from the away team's rating

    $$\Pr(A \text{ wins}) = \frac{10^{R_A/400}}{10^{R_A/400} + 10^{(R_B - C)/400}}, \qquad (2)$$

    effectively handicapping the away team.
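The rating update built on Eqs. (1) and (2) can be written in a few lines. The sketch below is illustrative rather than the authors' implementation: the function names are hypothetical, and the default values of `k` and `c` are placeholders chosen for the example.

```python
def win_probability(r_home: float, r_away: float, c: float = 0.0) -> float:
    """Eq. (2): home win probability; with c = 0 this reduces to Eq. (1)."""
    q_home = 10 ** (r_home / 400)
    q_away = 10 ** ((r_away - c) / 400)  # handicap the away team by c
    return q_home / (q_home + q_away)


def update_ratings(r_home, r_away, home_won, k=24.0, c=64.0):
    """Add Delta R = K * (outcome - Pr(A wins)) to the home rating, subtract it from the away rating."""
    outcome = 1.0 if home_won else 0.0
    delta = k * (outcome - win_probability(r_home, r_away, c))
    return r_home + delta, r_away - delta


# Two equally rated teams: the home win is expected, so it moves the ratings
# less than an away win would.
after_home_win = update_ratings(1500.0, 1500.0, home_won=True)
after_away_win = update_ratings(1500.0, 1500.0, home_won=False)
print(after_home_win, after_away_win)
```

Because the same amount is added to one rating and subtracted from the other, the sum of the two ratings is unchanged by an update.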

    The Elo rating system was originally developed for rating chess players, where there is no need to take the winning margin into account. However, in basketball, it is reasonable to take the winning margin into account, rewarding bigger wins more.³ Hvattum and Arntzen (2010) used an extension of the Elo system, where the value of $K$ is set according to the winning margin, $K = K_0(1 + \delta)^{\lambda}$, where $\delta$ is the absolute winning margin and $K_0 > 0$ and $\lambda > 0$ are parameters. Hvattum and Arntzen (2010) used this approach for ranking football teams. However, it can easily be adapted for any sport. All that is required is a proper calibration of the parameters $K_0$ and $\lambda$. Note that sometimes parameters other than 10 and 400 are used in Eqs. (1) and (2). The purpose of these two parameters is to set the scale of the ratings. Parameters $K$ and $C$ require more attention. We used the following parameter values for the basic ELO model and the margin-based ELO extension: $K = 24.0$, $K_0 = 2$ and $\lambda = 1$. The home team advantage constant was set to $C = 64$. To facilitate the out-of-sample evaluation, we did not use data from 2007/08 or 2008/09 in the calibration. The chosen parameter values minimized the squared loss for the 2006/07 season. We rounded the values to simplify presentation, without significantly worsening the performance. The match scores which were used to perform the calibration were obtained from www.basketball-reference.com. References to several more complex latent-strength modelling approaches to sports forecasting, including that of Glickman and Stern (1998), are given by Stekler et al. (2010). Improvements might also be achieved by using the teams' ELO ratings (or some other measure of the team quality) and fitting a Bradley-Terry model (Bradley & Terry, 1952).
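The margin-dependent rate of change can be illustrated with a short sketch. The symbols in the extracted formula are partly lost, so the form used here, K0 multiplied by (1 + margin) raised to an exponent, is an assumption based on the surrounding description; the function name `margin_k` and the parameter name `exponent` are hypothetical.

```python
def margin_k(margin: float, k0: float = 2.0, exponent: float = 1.0) -> float:
    """Rate-of-change constant that grows with the absolute winning margin.

    Assumed form: K = K0 * (1 + |margin|) ** exponent.
    """
    return k0 * (1.0 + abs(margin)) ** exponent


# With K0 = 2 and exponent 1, a 10-point win is weighted more heavily
# than a 1-point win.
k_close = margin_k(1)
k_clear = margin_k(10)
print(k_close, k_clear)
```

With these values a win by 11 points gives a rate of 24, the same as the constant used for the basic model.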

    4.2. Bookmaker odds as forecasts

    Bookmakers' odds are probabilistic forecasts. For example, let $d_A$ and $d_B$ be the quoted decimal odds for a match between teams A and B. Then $p_A = 1/d_A$ and $p_B = 1/d_B$ are win probability estimates. Due to bookmakers' profit margins, $1/d_A$ and $1/d_B$ can sum to more than 1, and thus it is common to use the normalization $p'_A = \frac{p_A}{p_A + p_B}$ and $p'_B = \frac{p_B}{p_A + p_B}$.
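The conversion from decimal odds to normalized probabilities is a two-step calculation. The sketch below shows it for a pair of made-up odds; the function name and the example odds of 1.5 and 2.5 are hypothetical.

```python
def implied_probabilities(d_a: float, d_b: float):
    """Convert decimal odds to win probabilities that sum to 1."""
    p_a, p_b = 1.0 / d_a, 1.0 / d_b  # raw estimates; may sum to more than 1
    total = p_a + p_b
    return p_a / total, p_b / total


# Hypothetical odds: the raw inverses sum to about 1.067, which reflects
# the bookmaker's margin; normalization removes it.
p_a, p_b = implied_probabilities(1.5, 2.5)
print(p_a, p_b)
```

This proportional normalization spreads the margin evenly across both outcomes, which is the simplest of several possible choices.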

    We used odds from the world's largest betting exchange: Betfair. Unlike regular bookmakers, betting exchanges do not offer odds directly, but simply facilitate peer-to-peer betting. Users can either back a selection (bet on an event occurring) or lay a bet (bet against an event occurring) at any odds they choose. Matching back and lay bets are matched by the system. The odds change over time, so we used the odds of the first matched bet (Betfair0) and the odds most frequently matched during the 5 min before the start of the match (Betfair1). These historical odds data were obtained from data.betfair.com.⁴

    3 Some rating systems, for example Sagarin ratings, also adjust for blowouts, so that a very large margin in a single game does not have a proportionately large impact on the team's rating.

    4 The data are free of charge, but some terms apply. One must be a registered Betfair user with a certain (symbolic) number of loyalty points.

    Bookmaker odds are considered to be the overall best source for probabilistic forecasts of sporting events. However, recent results by Franck, Verbeek, and Nuesch (2010) and Smith, Patton, and Williams (2010) suggest that betting exchange odds are better forecasts than regular bookmaker odds. However, these results are limited to horse racing and football. To the best of our knowledge, no empirical evidence exists for basketball. To test this hypothesis, we compared forecasts from 11 bookmakers and Betfair across a set of 692 NBA matches from 2008/09, such that all bookmakers' forecasts were available for each match. The selected bookmakers were 5Dimes (0.1859, 6.24), Bet365 (0.1856, 6.68), BetCRIS (0.1857, 6.76), BetUS (0.1862, 7.05), Betway (0.1866, 7.20), Canbet (0.1858, 7.01), Expekt (0.1859, 6.70), Jetbull (0.1868, 6.64), Paddy Power (0.1864, 6.37), Sports Interaction (0.1858, 6.22), Unibet (0.1866, 5.60), and Betfair (0.1855, 5.55). The numbers in parentheses represent the mean quadratic loss and mean rank (with respect to the mean quadratic loss), both across all 692 selected matches. We found no significant difference between the bookmakers' prediction qualities (p-value ≈ 1). By applying the square root to the mean quadratic loss of the best model (Betfair, 0.4307) and the worst model (Betway, 0.4320), we express them in the same unit as the predictions, thus simplifying comparison and interpretation. The difference in quality between the best and worst bookmakers is approximately one tenth of a percent. However, Betfair has the lowest mean rank. That is, the Betfair odds are, on average, closest to the actual outcome. Friedman's non-parametric test revealed that the differences in rank are significant (p-value ≈ 0). We used Nemenyi's test for post-hoc comparison. The critical difference in rank at the 0.01 significance level was 0.74. Therefore, the Betfair loss ranks significantly lower (better) than those of eight other bookmakers.
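The square-root conversion quoted above is easy to reproduce. The snippet below is a small arithmetic check using only the two mean quadratic losses that the text cites for this comparison; the variable names are hypothetical.

```python
import math

# Mean quadratic losses quoted in the text for the two compared sources.
mean_quadratic_loss = {"Betfair": 0.1855, "Betway": 0.1866}

# Taking the square root expresses the loss in the same unit as the forecasts.
root_loss = {name: math.sqrt(loss) for name, loss in mean_quadratic_loss.items()}
gap = root_loss["Betway"] - root_loss["Betfair"]

for name, value in root_loss.items():
    print(name, round(value, 4))
print("difference:", round(gap, 4))
```

The gap comes out slightly above 0.001, which is where the statement about roughly one tenth of a percent comes from.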

    We also tested the distribution of the difference in loss between Betfair0 odds and Betfair1 odds. Using a pairwise t-test, we rejected the zero mean hypothesis and accepted the alternative hypothesis that the mean is greater than 0 (p-value ≈ 0). Therefore, the odds from just before the start of the match are significantly better probabilistic forecasts than the odds from the start of the betting period. This is consistent with the findings of Gandar et al., who showed that opening lines set by bookmakers (in our case, the market) are less accurate than the closing lines established by the market (see Gandar, Dare, Brown, & Zuber, 1998; Gandar, Zuber, & Dare, 2000).⁵

    5. Empirical evaluation

    We compared the following sources of forecasts: (1) the home team vs. average opponent model from Section 2 (MAvsAVG), (2) the Markov model approach (using multinomial logit row models) described in Section 3 (MML), (3) Betfair1 odds, (4) the Elo rating models (see Section 4.1), and (5) a logit regression model using the

    5 Stekler et al. (2010) suggest that this indicates that the market (closing line) is more accurate than the bookmakers (opening line). We should interpret such indirect evidence with caution. Opening lines could also be worse because more information becomes available after opening lines have been set. Without all of the participants' forecasts at different points in time, we cannot distinguish between the two possibilities.


    matches in the evaluation). A rolling procedure was then used to forecast the remaining matches in a season: after forecasting and evaluating the forecasts for the next day's matches, we added the matches to the sample and rebuilt all models using the current sample of matches. This procedure was repeated until the end of the season.
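The rolling procedure can be sketched as follows. This is a minimal illustration under stated assumptions: `fit` is a hypothetical stand-in model (the home-win frequency in the current sample), not any of the models evaluated in the paper, and the match records with `day` and `home_win` fields are invented for the example.

```python
from collections import defaultdict


def fit(sample):
    """Hypothetical stand-in model: home-win frequency in the sample."""
    if not sample:
        return 0.5
    return sum(m["home_win"] for m in sample) / len(sample)


def rolling_evaluation(initial_sample, remaining):
    """Forecast one day at a time, then add that day's matches and refit.

    Returns the mean quadratic loss over all forecast matches.
    """
    sample = list(initial_sample)
    by_day = defaultdict(list)
    for match in remaining:
        by_day[match["day"]].append(match)

    losses = []
    p_home = fit(sample)
    for day in sorted(by_day):
        # Forecast and evaluate the day's matches with the current model.
        for match in by_day[day]:
            losses.append((p_home - match["home_win"]) ** 2)
        # Only then add them to the sample and rebuild the model.
        sample.extend(by_day[day])
        p_home = fit(sample)
    return sum(losses) / len(losses)


# Invented toy data: two initial matches, then three matches over two days.
initial = [{"day": 0, "home_win": 1}, {"day": 0, "home_win": 0}]
later = [
    {"day": 1, "home_win": 1},
    {"day": 1, "home_win": 1},
    {"day": 2, "home_win": 0},
]
print(rolling_evaluation(initial, later))
```

The key property is that every forecast uses only matches played before that day, so the resulting loss is an ex-ante (out-of-sample) measure.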

    We evaluated each season separately. We evaluated the models' forecasts using the quadratic loss function. We also measured the percentage of correct predictions when the forecasts were treated as being binary. For each basketball game, we had all of the models' losses (that is, a complete block design). We viewed individual games as subjects and forecasting models as treatments. We were interested in seeing whether there exist differences in mean loss between forecasting models. The equivalence

    the summary statistics (and/or transition frequencies) are not completely accurate representations of a team's actual strength. The three statistical approaches (logit regression, ELO rating, and the proposed model) exhibit different levels of bias, but produce forecasts of the same quality. Therefore, each model represents a different bias-variance trade-off.

    6. Conclusion

    Using summary statistics to estimate Shirley's Markov model for basketball produced a model for a match between two specific teams. The model was used to simulate the match and produce outcome forecasts of a quality comparable to that of other statistical approaches,

    the optimization of any such linear opinion pool which included Betfair odds always resulted in the Betfair odds being weighted approximately 1. The bookmaker odds forecasts could not be improved on using a linear opinion pool. We did not investigate other types of forecast combinations.
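A linear opinion pool of two forecasters, with the weight chosen to minimize the mean quadratic loss, can be sketched as below. The grid search is an assumption made for illustration (the text does not say how the weights were optimized), the loss is assumed to be the squared difference between the forecast probability and the 0/1 outcome, and all data and function names are invented.

```python
def mean_quadratic_loss(probs, outcomes):
    """Mean of (p - y)^2 over matches, with y = 1 for a home win, else 0."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(outcomes)


def pool(w, probs_a, probs_b):
    """Linear opinion pool: w * forecaster A + (1 - w) * forecaster B."""
    return [w * a + (1 - w) * b for a, b in zip(probs_a, probs_b)]


def best_weight(probs_a, probs_b, outcomes, steps=100):
    """Grid search over w in [0, 1] for the lowest mean quadratic loss."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda w: mean_quadratic_loss(pool(w, probs_a, probs_b), outcomes),
    )


# Invented toy data: forecaster A is perfect, forecaster B is uninformative.
outcomes = [1, 0, 1, 1, 0]
probs_a = [1.0, 0.0, 1.0, 1.0, 0.0]
probs_b = [0.5, 0.5, 0.5, 0.5, 0.5]
print(best_weight(probs_a, probs_b, outcomes))
```

When one forecaster dominates the other, as in this toy case, the selected weight goes to the boundary, which is the same pattern the paragraph describes for pools that include the Betfair odds.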

    We selected the weights (0.54 for ELO, 0.46 for MML) that minimized the quadratic loss on the 2006/07 data, which were reserved for parameter selection and are not used in the evaluation. In the out-of-sample procedure, we started with a sample of the first five home and first five away matches for each team (we did not include these

    AvsAVG worse than all of the other forecasts. The Betfair1 odds are better than all of the other forecasts. The differences in quality between the remaining statistical models are not significant. The in-sample results for some of the models illustrate how misleading their quality can be and emphasize the importance of evaluating ex-ante forecasts.

    All of the statistical approaches were unbiased estimators of the win percentage, but, when observing individual teams, they exhibited a bias towards weaker teams (see Table 12). This can be explained by the fact that the statistical models do not take into account the amount of effort that was invested in a match (see Section 2.1). In other words,

    Table 10
    Empirical evaluation results. The accuracy is the proportion of correct predi

    2007/08 (N = 957)
                               In-sample   Out-of-sample
                               Loss        Loss
    Betfair1                               0.1900
    54/100 ELO + 46/100 MML                0.1978
    ELO                                    0.2009
    Logit                      0.1800      0.2022
    MML                        0.1890      0.2048
    ELO                                    0.2065
    MAvsAVG                    0.2003      0.2199
    Pr(Home win) = 0.6                     0.2373

    Table 11
    Adjusted p-values for pairwise tests of the equality of means. The alternativ
    column (model).

    8 7 6

    1 Betfair1


    Table 12
    Linear models of the relationship between the actual win percentage and the
    All of the forecasters exhibit a statistically significant bias towards the weak

                 MML                    54/100 ELO + 46/100 MML
                 Intercept   Slope      Intercept   Slope
    Estimate     0.170       0.657      0.131       0.73
    Std. Err.    0.020       0.039      0.015       0.02
    t-value      8.354       16.694     8.844       25.7
    Pr(>|t|)     10^-9