
EXACTA BETTING OPTIMIZATION MODEL

Image taken from: http://andrewsgrouptravel.com/wp-content/uploads/2013/10/Keeneland82.jpg

Ben Johnson
12/05/2014

ME 647 Final Project: Optimization Model Paired with Monte Carlo Analysis

ME 647 Final Project – Ben Johnson

Using an Excel® add-in that nests the built-in Solver within a Monte Carlo analysis, an attempt is made to maximize the probable betting returns using historical race statistics.

Exacta Betting Optimization Model
OPTIMIZATION MODEL WITH MONTE CARLO ANALYSIS

INTRODUCTION | ONE

It could easily take a socially unacceptable portion of one's adult life to amass the handicapping expertise necessary to actually make money betting on horse racing. But maybe it is possible, provided a few practical constraints and historical race statistics, to maximize the probable return on one's 'investments' in the thoroughbred racing industry, possibly even bringing the average returns into the black. By employing a linear programming model, this project makes an attempt to determine a way to hedge one's bets for a given race that is profitable on average according to Monte Carlo simulations of the race outcome.

PROJECT OBJECTIVE | TWO

The overall objective of this project is to utilize statistical data and optimization techniques in an attempt to turn a simulated profit for a range of given races without taking into consideration any sort of typical handicapping information. For this project only the final odds of each horse will be considered, as the final odds of the win and place (1st and 2nd place) horses follow known distributions to a good degree of accuracy. While the omission of available information such as the field’s past lap times and win/loss records seems questionable, utilizing such information takes practice and is inherently subjective. After all, the odds associated with each horse are basically derived from the available handicapping stats, so these stats are in essence being included nonetheless.

BACKGROUND | THREE

Exacta Betting


The type of bet that is considered for the purpose of this project is an “exacta” bet. To make an exacta bet means to pick both the winning horse and the second place horse in the correct order. This type of bet was chosen for several reasons, but mostly due to payoff potential and project sizing. While picking the winner of a race would involve only 1 decision variable per horse, exacta betting involves a larger number of decision variables, which may be obtained from eq. 3.1.

$$\text{Number of decision variables} = (n_{\text{horses}})^2 - n_{\text{horses}} \tag{3.1}$$ [1]

The model for this project will include up to 10 horses for a total of up to 90 decision variables (i.e. how much to bet on each of the 90 possible outcomes). The payoff for a “win” bet is the product of the winner’s final odds and the bet amount. The final odds are determined by eq. 3.2.

$$\text{Final odds of winner} = \frac{\left(\sum \text{of all bets}\right) - \left(\text{House take}\right)}{\sum \text{of all bets on winner}} \tag{3.2}$$

In Kentucky the house take is 16% for straight bets and 19% for “exotic” bets such as the exacta [2]. From this information it can be construed that the implied win probabilities (the reciprocals of the final odds given by eq. 3.2) for all horses in a given race will add up to approximately 1 plus the house take. Another way of interpreting this information is that the hypothetical profit of any optimized betting method should exceed the house take to be in the black. While Kentucky house takes are significant, some online betting sites with limited overhead can have take percentages as low as 1-2% [2].

The fair (and much larger) payoff for an exacta bet is given by eq. 3.3.

$$\text{Fair exacta payoff} = (\text{bet amount})\,(\text{win horse odds})\,(\text{place horse odds} + 1) \tag{3.3}$$ [3]

In reality, exacta payoffs vary somewhat about the fair amount. But unlike the payoffs for a win which are not known until the betting windows are closed and the bets are tallied, probable exacta payoffs are displayed in the minutes leading up to the race potentially allowing for this information’s inclusion in the optimization model. For this project the payoff of each exacta will be assumed to be the fair amount obtained from eq. 3.3 with the understanding that the probable payoffs are available for implementation in reality.
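As an illustration of eqs. 3.1 and 3.3, the following Python sketch builds a table of fair $1 exacta payoffs for a small field. The function name `fair_exacta_payoffs` and the three-horse odds list are assumptions made for this example; they are not taken from the project spreadsheet.

```python
from itertools import permutations


def fair_exacta_payoffs(odds, bet=1.0):
    """Return {(i, j): payoff} for every ordered win/place pair (eq. 3.3).

    `odds` holds the final odds (x:1) of each horse; horses are numbered from 1.
    """
    payoffs = {}
    for i, j in permutations(range(len(odds)), 2):
        # bet amount * win horse odds * (place horse odds + 1)
        payoffs[(i + 1, j + 1)] = bet * odds[i] * (odds[j] + 1)
    return payoffs


# Hypothetical three-horse field with odds of 5:1, 6:1 and 8:1.
table = fair_exacta_payoffs([5, 6, 8])
print(len(table))      # number of ordered pairs, n^2 - n (eq. 3.1)
print(table[(1, 2)])   # horse 1 wins, horse 2 places
```

Passing a list of ten odds to the same function yields the 90 entries that correspond to the 90 decision variables of the full model.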

Likelihood of a Given Outcome

In order to apply a Monte Carlo technique for race simulation, a probability must be associated with each possible outcome. A Minitab® distribution analysis of the final odds for the win and place horses in the last 100+ Kentucky Derbys shows that both distributions may be described as log-normal. This data is illustrated in the probability plots for win and place odds in figures 3.1 and 3.2 respectively.

[1] An illustration of this equation’s origin may be seen in figure 4.1.
[2] http://horseworlddata.com/pmtrcks.html
[3] http://www.brisnet.com/library/software/allnews/favoriteArticles/Final%20The%20Very%20Best%20Way%20We%20Know%20to%20Play%20the%20Exacta.pdf

[Probability Plot of Win Odds (Lognormal, 95% CI). Horizontal axis: Win Odds (logarithmic, 0.1 to 1000); vertical axis: Percent (0.1 to 99.9). Fitted parameters: Loc 1.567, Scale 1.131, N 105, AD 0.369, P-Value 0.422.]

Figure 3.1 Probability Plot of Final Odds for Win Horse [4]

[4] Past race results were obtained from the following website, where [year] is replaced by a year between 1901 and 2009: http://www.kentuckyderby.com/sites/kentuckyderby.com/files/charts/[year].pdf



[Probability Plot of Place Odds (Lognormal, 95% CI). Horizontal axis: Place Odds (logarithmic, 0.1 to 1000); vertical axis: Percent (0.1 to 99.9). Fitted parameters: Loc 1.780, Scale 1.149, N 112, AD 0.182, P-Value 0.910.]

Figure 3.2 Probability Plot of Final Odds for Place Horse [5]

For the purpose of this project, the relative probability of each outcome is estimated in the following manner. The probability associated with the win horse of a given outcome is approximated by the integral of the log-normal probability density function over some interval (x±α) about the odds of that horse.

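$$\text{Win probability of a horse with odds } x = \int_{x-\alpha}^{x+\alpha} \frac{1}{t\,\sigma\sqrt{2\pi}} \exp\!\left[-\frac{(\ln t - \mu)^2}{2\sigma^2}\right] dt, \quad x > 0 \tag{3.4}$$ [6]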

Fortunately, the Excel function LOGNORM.DIST can provide the cumulative probability below any x (odds) value, and hence by subtraction the value of this integral over any interval, provided values for µ and σ are supplied. Values for those quantities may be obtained from the probability plot in figure 3.1, where µ and σ are designated by Loc (location) and Scale respectively. The probability associated with the place horse is estimated from the same equation as the win horse, the only difference being the values for µ and σ, which are now taken from figure 3.2. The relative probability of each outcome is now estimated as the product of the win horse probability and the place horse probability. Since not all possible odds are represented in a 10 horse race, the probabilities associated with each possible outcome are multiplied by a common scaling factor until the cumulative probability is approximately 1.

[5] Derby results for years 2010-2014 were obtained from http://en.wikipedia.org/wiki/[year]_Kentucky_Derby where [year] is again replaced by a year from 2010-2014.
[6] An α value of 0.5 was utilized for the purpose of this analysis.
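The probability estimate described above can be sketched in Python as follows. This is a minimal illustration rather than the spreadsheet itself: the function names are hypothetical, the lognormal cumulative distribution is written out with `math.erf` in place of the Excel function, and the probabilities are scaled to sum to exactly 1 (the report scales them to slightly under 1). The Loc and Scale values come from figures 3.1 and 3.2, and the odds are those of Race 1 in figure 5.1.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """Cumulative lognormal distribution; returns 0 for x <= 0."""
    if x <= 0:
        return 0.0
    return 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def interval_probability(odds, mu, sigma, alpha=0.5):
    """Integral of the lognormal density over (odds - alpha, odds + alpha)."""
    return (lognormal_cdf(odds + alpha, mu, sigma)
            - lognormal_cdf(odds - alpha, mu, sigma))


def outcome_probabilities(odds, win_fit, place_fit, alpha=0.5):
    """Relative probability of each (win, place) pair, scaled to sum to 1."""
    win_mu, win_sigma = win_fit
    place_mu, place_sigma = place_fit
    raw = {}
    for i, win_odds in enumerate(odds, start=1):
        for j, place_odds in enumerate(odds, start=1):
            if i == j:
                continue  # a horse cannot finish both first and second
            raw[(i, j)] = (interval_probability(win_odds, win_mu, win_sigma, alpha)
                           * interval_probability(place_odds, place_mu, place_sigma, alpha))
    scale = 1.0 / sum(raw.values())  # common scaling factor
    return {pair: scale * p for pair, p in raw.items()}


WIN_FIT = (1.567, 1.131)     # Loc, Scale from figure 3.1
PLACE_FIT = (1.780, 1.149)   # Loc, Scale from figure 3.2
race1_odds = [5, 6, 8, 8, 9, 9, 10, 12, 15, 20]  # Race 1 of figure 5.1
probs = outcome_probabilities(race1_odds, WIN_FIT, PLACE_FIT)
print(len(probs), round(sum(probs.values()), 6))
```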

MATHEMATICAL MODEL | FOUR

Decision Variables

Assuming a 10 horse race, a table of decision variable definitions may be constructed as shown in figure 4.1.

Decision Variable Definitions

Rows: Horse 1 (win). Columns: Horse 2 (place).

Horse 1     1     2     3     4     5     6     7     8     9     10
   1        -     X12   X13   X14   X15   X16   X17   X18   X19   X110
   2        X21   -     X23   X24   X25   X26   X27   X28   X29   X210
   3        X31   X32   -     X34   X35   X36   X37   X38   X39   X310
   4        X41   X42   X43   -     X45   X46   X47   X48   X49   X410
   5        X51   X52   X53   X54   -     X56   X57   X58   X59   X510
   6        X61   X62   X63   X64   X65   -     X67   X68   X69   X610
   7        X71   X72   X73   X74   X75   X76   -     X78   X79   X710
   8        X81   X82   X83   X84   X85   X86   X87   -     X89   X810
   9        X91   X92   X93   X94   X95   X96   X97   X98   -     X910
  10        X101  X102  X103  X104  X105  X106  X107  X108  X109  -

Figure 4.1 Table of Decision Variable Definitions (xij)

In the above table each decision variable cell (Xij) represents the amount bet on the associated win/place (horse1/horse2) combination. For example, X12 represents the amount bet on the exacta outcome where horse 1 wins and horse 2 comes in 2nd.

Potential Payoffs

As was done for the decision variables, a table is also created providing the $1 payoff associated with each bet. Entries are obtained as previously described from eq. 3.3 and will be referred to as pij in the objective function.

$1 Payoffs (p_ij)

Rows: Horse 1 (win). Columns: Horse 2 (place).

                Horse 2    1    2    3    4    5    6    7    8    9   10
                odds       5    6    8    8    8    9   10   12   15   20
Horse 1  odds
   1       5               -   35   45   45   45   50   55   65   80  105
   2       6              36    -   54   54   54   60   66   78   96  126
   3       8              48   56    -   72   72   80   88  104  128  168
   4       8              48   56   72    -   72   80   88  104  128  168
   5       9              54   63   81   81    -   90   99  117  144  189
   6       9              54   63   81   81   81    -   99  117  144  189
   7      10              60   70   90   90   90  100    -  130  160  210
   8      12              72   84  108  108  108  120  132    -  192  252
   9      15              90  105  135  135  135  150  165  195    -  315
  10      20             120  140  180  180  180  200  220  260  320    -

Figure 4.2 Table of $1 Payoffs (pij)

Race Simulation

Once the odds have been established for each horse and the probability associated with each outcome has been calculated as described above, the race is “simulated” 100 times in the following manner. The decision variables are listed in a single column adjacent to a column of associated probabilities. Another adjacent column tracks the cumulative value of the outcome probabilities (whose final value is set to slightly less than 1 by adjusting the scaling factor, as previously described). This cumulative probability column serves as a lookup table against which a random number is matched. The outcome whose cumulative probability value is closest to the random number (without going over) is picked, thereby choosing a race outcome. This method is repeated in 100 columns, producing 100 simulated race outcomes. A binary variable is defined as follows for each potential outcome ij for all n simulated races.

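Binary Win/Loss Variable for Outcome ij During Simulation n:

$$(y_{ij})_n = \begin{cases} 1, & \text{if outcome } ij \text{ occurs for simulation } n \\ 0, & \text{if outcome } ij \text{ does not occur for simulation } n \end{cases} \qquad \forall\, i = 1,2,\dots,10;\ \forall\, j = 1,2,\dots,10;\ i \ne j;\ \forall\, n = 1,2,\dots,100 \tag{4.1}$$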

The result of this algorithm is 100 columns with a 1 in a single row corresponding to the winning outcome for that race simulation.
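The lookup procedure can be sketched in Python as below. This is an assumption-laden illustration, not the spreadsheet logic itself: it uses standard inverse-CDF sampling (the first row whose cumulative probability exceeds the random number), which may differ from the spreadsheet lookup by one row at the boundaries, and the function names and the three-outcome probability table are hypothetical.

```python
import bisect
import random
from itertools import accumulate


def simulate_races(probabilities, n_races=100, rng=None):
    """Draw `n_races` outcomes by looking up uniform random numbers in the
    cumulative probability column; returns a list of winning (i, j) pairs."""
    rng = rng or random.Random()
    outcomes = list(probabilities)
    cumulative = list(accumulate(probabilities[o] for o in outcomes))
    winners = []
    for _ in range(n_races):
        u = rng.random() * cumulative[-1]
        # Index of the first row whose cumulative value exceeds u.
        row = bisect.bisect_right(cumulative, u)
        winners.append(outcomes[min(row, len(outcomes) - 1)])
    return winners


def indicator_columns(outcomes, winners):
    """(y_ij)_n of eq. 4.1: one 0/1 column per simulated race."""
    return [[1 if o == w else 0 for o in outcomes] for w in winners]


# Hypothetical three-outcome example; a 10 horse race would have 90 outcomes.
probs = {(1, 2): 0.5, (2, 1): 0.3, (1, 3): 0.2}
winners = simulate_races(probs, n_races=100, rng=random.Random(647))
y = indicator_columns(list(probs), winners)
print(len(y), all(sum(column) == 1 for column in y))
```

Fixing the seed of the random number generator plays the same role as freezing the random cells in the spreadsheet: the simulated outcomes stay put while the bets are being optimized.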

Objective Function

Since the objective of this endeavor is to maximize average profit, the objective function is constructed with that end in mind. Assuming that the same bet is made for each of the 100 simulated races, the cost associated with the 100 simulated bets is given by eq. 4.2.

$$\text{Total cost of 100 simulated bets} = 100 \sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij} \tag{4.2}$$

The total gross winnings from the 100 simulations is illustrated in eq. 4.3.

$$\text{Gross winnings from 100 simulations} = \sum_{n=1}^{100} \sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij}\, p_{ij}\, (y_{ij})_n \tag{4.3}$$

The total net winnings (eq 4.4) is the difference of eq’s 4.3 and 4.2.

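$$\text{Net winnings from 100 simulations} = \sum_{n=1}^{100} \sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij}\, p_{ij}\, (y_{ij})_n - 100 \sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij} \tag{4.4}$$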

The mean winnings is given by equation 4.5.

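$$\text{Mean winnings} = \frac{\displaystyle \sum_{n=1}^{100} \sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij}\, p_{ij}\, (y_{ij})_n - 100 \sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij}}{100} \tag{4.5}$$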


The objective is to maximize the mean winnings; therefore the objective function is given by eq. 4.6.

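$$\text{Objective Function: maximize } Z = \frac{\displaystyle \sum_{n=1}^{100} \sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij}\, p_{ij}\, (y_{ij})_n - 100 \sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij}}{100} \tag{4.6}$$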
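Because the simulated outcomes are held fixed while the solver runs, eq. 4.6 can be regrouped by outcome, which makes its linearity in the bets explicit. The symbol f_ij (the fraction of the 100 simulated races in which outcome ij occurs) is introduced here purely for illustration and does not appear in the original model.

```latex
\begin{aligned}
Z &= \frac{1}{100}\left[\sum_{n=1}^{100}\sum_{i=1}^{10}\sum_{\substack{j=1\\ j\ne i}}^{10} x_{ij}\,p_{ij}\,(y_{ij})_n
      \;-\; 100\sum_{i=1}^{10}\sum_{\substack{j=1\\ j\ne i}}^{10} x_{ij}\right] \\
  &= \sum_{i=1}^{10}\sum_{\substack{j=1\\ j\ne i}}^{10} x_{ij}\left(p_{ij}\,f_{ij} - 1\right),
  \qquad f_{ij} = \frac{1}{100}\sum_{n=1}^{100}(y_{ij})_n
\end{aligned}
```

Read this way, a dollar placed on outcome ij raises Z only when p_ij f_ij exceeds 1, i.e. when the payoff more than compensates for how rarely that outcome occurs in the simulated races.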

Constraints

In order for this problem to bear semblance to reality, the bets must all be non-negative.

$$x_{ij} \ge 0 \qquad \forall\, i = 1,2,\dots,10;\ \forall\, j = 1,2,\dots,10;\ i \ne j \tag{4.7}$$

The constraints for this problem were varied somewhat in the attempt to produce a profit. The following constraints were eventually settled upon as a compromise between potential profits and up-front financial demands.

No individual bet greater than $20:

$$x_{ij} \le 20 \qquad \forall\, i = 1,2,\dots,10;\ \forall\, j = 1,2,\dots,10;\ i \ne j \tag{4.8}$$

Total bet for a given race no greater than $1000:

$$\sum_{i=1}^{10} \sum_{\substack{j=1 \\ j \ne i}}^{10} x_{ij} \le 1000 \tag{4.9}$$
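With the simulated outcomes fixed, the objective (eq. 4.6) and the constraints (eqs. 4.7 to 4.9) form a linear program in which every bet has the same unit cost against the budget. For a problem of this particular shape, a greedy allocation that funds the most profitable outcomes first reaches the same optimum as a general LP solver. The sketch below uses that substitute technique; it is not the Excel Solver used in the project, and the function name, argument names and three-outcome data are hypothetical.

```python
def optimize_bets(payoffs, win_counts, n_races=100, max_bet=20.0, budget=1000.0):
    """Maximize mean winnings subject to 0 <= x_ij <= max_bet and sum(x) <= budget.

    payoffs[o]    : $1 payoff p_ij of outcome o
    win_counts[o] : number of simulated races (out of n_races) won by outcome o
    """
    # Mean profit per dollar staked on each outcome.
    edge = {o: payoffs[o] * win_counts.get(o, 0) / n_races - 1.0 for o in payoffs}
    bets = {o: 0.0 for o in payoffs}
    remaining = budget
    # Fund the most profitable outcomes first until the budget is exhausted.
    for o in sorted(edge, key=edge.get, reverse=True):
        if edge[o] <= 0 or remaining <= 0:
            break
        bets[o] = min(max_bet, remaining)
        remaining -= bets[o]
    mean_winnings = sum(bets[o] * edge[o] for o in bets)
    return bets, mean_winnings


# Hypothetical data: three outcomes and how often each won in 100 simulated races.
payoffs = {(1, 2): 35, (2, 1): 36, (1, 3): 45}
win_counts = {(1, 2): 4, (2, 1): 2, (1, 3): 3}
bets, z = optimize_bets(payoffs, win_counts)
print(bets, round(z, 2))
```

In this sketch every bet ends up at either $0 or the $20 cap, except possibly one partial bet when the budget runs out.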

Monte Carlo Analysis

In order for the Excel Solver to optimize the problem as defined thus far, the random variables determining the winner of each of the 100 races must be held fixed (made non-volatile) during the solver’s operation. Otherwise, each time the solver performs a calculation or changes decision variable values, the race winners would again be randomized, creating a moving target for the solver to chase. The Excel add-in utilized for this project (MCSimSolver) [7] achieves this goal by means of a Visual Basic macro. The macro provides for the completion of a specified number of iterations where the random variables assume new values (once per iteration) and the optimization is repeated based upon the newly randomized values, newly chosen race winners in this case. Values of all desired cells are recorded at the end of each iteration and are reported at the end of the Monte Carlo analysis. For this project 1000 iterations were completed for each simulated race, which on the author’s modest PC [8] takes about 5 or 6 minutes to complete.

[7] Excel Monte Carlo/solver add-in obtained from the following web resource: http://www3.wabash.edu/econometrics/EconometricsBook/Basic%20Tools/ExcelAddIns/MCSimSolver.htm
[8] Intel® i5™ 2.5 GHz with 8.0 GB RAM and 64-bit Windows®


At the end of each simulation the mode value (from 1000 iterations) for each decision variable is taken to be the optimal value for that variable. Mean values were also attempted, with significantly less profitable results. At this point the optimal decision variable values are inserted into the model. The Monte Carlo simulation is then repeated with the race winners randomized each iteration but with the solver disabled. The average profit from each iteration is recorded. This simulation only takes about 1 or 2 minutes since the optimization routine is excluded.
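Taking the mode of each decision variable across iterations can be sketched as follows. The function name and the three-iteration sample data are hypothetical; in the project itself the per-iteration values are those recorded by the add-in.

```python
from collections import Counter


def modal_bets(iteration_solutions):
    """Most frequent value of each decision variable across iterations.

    iteration_solutions: list of {outcome: bet} dicts, one per Monte Carlo
    iteration. Ties are broken in favor of the value seen first.
    """
    modes = {}
    for outcome in iteration_solutions[0]:
        counts = Counter(solution[outcome] for solution in iteration_solutions)
        modes[outcome] = counts.most_common(1)[0][0]
    return modes


# Hypothetical solver output from three iterations for two outcomes.
solutions = [
    {(1, 2): 20.0, (2, 1): 0.0},
    {(1, 2): 20.0, (2, 1): 20.0},
    {(1, 2): 0.0, (2, 1): 0.0},
]
print(modal_bets(solutions))
```

Since the optimized bets tend to sit at either $0 or the $20 cap, the mode keeps each bet at one of those two values, whereas a mean would spread partial bets across many outcomes.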

SIMULATION RESULTS | FIVE

The following four 10 horse races were simulated, spanning a range of values for mean odds:

Odds (:1) by race

Horse    Race 1   Race 2   Race 3   Race 4
  1        5        3        2        1.5
  2        6        5        3        1.8
  3        8        7       10       10
  4        8        9       10       20
  5        9        9       12       30
  6        9       10       15       40
  7       10       12       30       50
  8       12       20       30       50
  9       15       25       30       75
 10       20       50       60      100
Mean      10.2     15       20.2     37.83

Figure 5.1 Table of Simulated Races with Odds :1

Decision Variable Values

Race 1 (rows: Horse 1, win; columns: Horse 2, place)

Horse 1    1    2    3    4    5    6    7    8    9   10
   1       -   20    0    0   20   20   20    0    0    0
   2      20    -   20   20   20   20    0    0   20    0
   3       0   20    -    0    0    0    0   20    0    0
   4       0   20    0    -    0    0    0   20    0    0
   5       0   20    0    0    -    0    0    0    0    0
   6       0   20    0    0    0    -    0   20    0    0
   7      20    0    0    0    0    0    -    0    0    0
   8       0    0   20   20    0    0    0    -    0    0
   9       0    0    0    0    0    0    0    0    -    0
  10       0    0    0    0    0    0    0    0    0    -

Figure 5.2 Table of Optimized Decision Variable Values – Race 1



[Histogram of Mean Earnings, Race 1. Horizontal axis: mean earnings, -160 to 240. Summary statistics: Average 31.896, SD 59.0919, Max 251.800, Min -154.600.]

Figure 5.3 Histogram of Mean Earnings for Optimized Decision Variable Values – Race 1


Race 2 (rows: Horse 1, win; columns: Horse 2, place)

Horse 1    1    2    3    4    5    6    7    8    9   10
   1       -   20   20   20   20   20    0    0    0    0
   2      20    -   20   20   20   20    0    0    0    0
   3       0   20    -   20    0    0    0    0    0    0
   4       0    0    0    -    0    0    0    0    0    0
   5       0    0    0    0    -   20    0    0    0    0
   6       0   20    0    0    0    -    0    0    0    0
   7       0    0    0   20   20    0    -    0    0    0
   8       0    0    0    0    0    0    0    -    0    0
   9       0    0    0    0    0    0    0    0    -    0
  10       0    0    0    0    0    0    0    0    0    -

Figure 5.4 Table of Optimized Decision Variable Values – Race 2






[Histogram of Mean Earnings, Race 2. Horizontal axis: mean earnings, -100 to 200.]

Figure 5.5 Histogram of Mean Earnings for Optimized Decision Variable Values – Race 2


Race 3 (rows: Horse 1, win; columns: Horse 2, place)

Horse 1    1    2    3    4    5    6    7    8    9   10
   1       -   20   20   20   20    0    0    0    0    0
   2      20    -   20   20   20   20    0    0    0    0
   3       0    0    -   20   20    0    0    0    0    0
   4       0    0   20    -   20    0    0    0    0    0
   5       0    0   20   20    -    0    0    0    0    0
   6       0    0    0    0    0    -    0    0    0    0
   7       0    0    0    0    0    0    -    0    0    0
   8       0    0    0    0    0    0    0    -    0    0
   9       0    0    0    0    0    0    0    0    -    0
  10       0    0    0    0    0    0    0    0    0    -

Figure 5.6 Table of Optimized Decision Variable Values – Race 3






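[Histogram of Mean Earnings, Race 3. Horizontal axis: mean earnings, -80 to 270.]

Figure 5.7 Histogram of Mean Earnings for Optimized Decision Variable Values – Race 3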




Race 4 (rows: Horse 1, win; columns: Horse 2, place)

Horse 1    1    2    3    4    5    6    7    8    9   10
   1       -   20   20    0    0    0    0    0    0    0
   2      20    -   20   20    0    0    0    0    0    0
   3       0   20    -    0    0    0    0    0    0    0
   4       0    0    0    -    0    0    0    0    0    0
   5       0    0    0    0    -    0    0    0    0    0
   6       0    0    0    0    0    -    0    0    0    0
   7       0    0    0    0    0    0    -    0    0    0
   8       0    0    0    0    0    0    0    -    0    0
   9       0    0    0    0    0    0    0    0    -    0
  10       0    0    0    0    0    0    0    0    0    -

Figure 5.8 Table of Optimized Decision Variable Values – Race 4






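[Histogram of Mean Earnings, Race 4. Horizontal axis: mean earnings, -10 to 140.]

Figure 5.9 Histogram of Mean Earnings for Optimized Decision Variable Values – Race 4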



A summary of the optimal results obtained for all 4 races is included in figure 5.10.

Avg. Odds   Individual Bet ($)   ROI ($)   %ROI
  10.2            400             31.90     7.98
  15              320             57.99    18.12
  20.2            300             82.45    27.48
  37.8            120             60.89    50.74

Figure 5.10 Summary of Results – All 4 Simulated Races

A plot of the mean odds versus percent return on investment (%ROI) is illustrated on figure 5.11. A logarithmic trendline seems to fit the data well, possibly due to the log-normal distributions included in the model.



Figure 5.11 Mean Odds vs. %ROI
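The logarithmic trend can be checked with a short least-squares fit of %ROI against the natural logarithm of the mean odds. The sketch below is one way to do this under stated assumptions: the four data points are read from figure 5.10, the function name is hypothetical, and an ordinary least-squares fit is assumed rather than taken from the original plot.

```python
import math


def fit_log_trendline(xs, ys):
    """Least-squares fit of y = a * ln(x) + b; returns (a, b, r_squared)."""
    lx = [math.log(x) for x in xs]
    n = len(xs)
    mean_lx = sum(lx) / n
    mean_y = sum(ys) / n
    sxx = sum((u - mean_lx) ** 2 for u in lx)
    sxy = sum((u - mean_lx) * (y - mean_y) for u, y in zip(lx, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    a = sxy / sxx
    b = mean_y - a * mean_lx
    return a, b, sxy ** 2 / (sxx * syy)


mean_odds = [10.2, 15.0, 20.2, 37.8]    # Avg. Odds column of figure 5.10
pct_roi = [7.98, 18.12, 27.48, 50.74]   # %ROI column of figure 5.10
a, b, r2 = fit_log_trendline(mean_odds, pct_roi)
print(f"%ROI ~ {a:.2f} * ln(mean odds) + {b:.2f}, R^2 = {r2:.3f}")
```

With only four simulated races the fit is suggestive at best; more races across the range of mean odds would be needed before relying on the trend.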

CONCLUSIONS | SIX

While these simulations would seem to suggest that creating an average profit might be possible, a significant up-front investment of both time and money would be required to enact the plan outlined here. Some assumptions have been made for the purpose of this project that might have significant impact on the results. Nonetheless, optimization techniques have been shown to be a useful tool for navigating a complex problem utilizing relationships obtained by bounding the problem realistically. In addition, this project has illustrated the power of regarding a complex system as a statistical entity rather than attempting to dissect the complex interactions of its individual components. Further work could employ genetic algorithms in an attempt to decipher those interactions, perhaps increasing the average %ROI significantly, but in the meantime this model provides a baseline that may be worth testing.

