Predicting Football Matches using Neural Networks in MATLAB®

Andrade, Pablo; Cisneros, Jorge; Suárez, Francisco (mechanical engineering students)

Escuela Politécnica Nacional, Faculty of Mechanical Engineering, Quito, Ecuador

Mechatronics

Abstract:

The purpose of this project is to anticipate the outcome of a football match of a local team (LDU) using various types of neural networks in MATLAB®. To achieve this objective, input data related to the team in question had to be collected; the data were taken from the records of its past matches against different teams. With the relevant data and the targets for the project, three neural networks were trained (Perceptron, Feed-Forward and Cascade) and then simulated with the latest match played by the home team to see whether each network could accurately predict the outcome of the match.

The best results were achieved with the feed-forward neural network. These results, as well as the results from the other types of networks used, are discussed in this project.

1 INTRODUCTION

There are many methods to predict the outcome of a football match. One option is a statistical model, such as an ordered probit regression model. This particular method was used to predict English league football matches [1].

In the statistical model, a wide range of variables was taken into account, in addition to the teams' past match results. These variables are the significance of each match for championship, promotion or relegation issues; the involvement of the teams in cup competition; the geographical distance between the teams' home towns; and a 'big team' effect [1].

Because these results serve as a starting point for establishing the prices and awards for betting in the sports industry, the efficiency of such prices is also analyzed using empirical results [1].

A limited but increasing number of academic researchers have attempted to model match results data for football. Different distributions have been used for this purpose, such as the Poisson and the negative binomial distributions [1].

The statistical approach to predicting football matches is widely used to improve the bettor's chances; however, the algorithm also requires training the machine. A database collected over the past years provides a sample for training and for validation. The bigger the percentage of the data that is used for training, the better the system will fare, simply because the analysis can use more data. On the other hand, the bigger the percentage of the data that is used for testing, the more statistically reliable the test will be. To split the data, Weka offers a very good solution, namely ten-fold cross-validation. It splits the data into ten equal-sized portions and uses nine of the ten portions as training data and the remaining one as testing data. It repeats the process ten times, each time choosing a different portion as the testing data [2].

The selection of the relevant features is important, since an accurate set makes it much easier to predict the outcomes of matches. Features are characteristics of recent matches of the teams involved, but how far back in history do we need to go in order to get the best predictions? To answer this question we set up a very basic set of features, then repeatedly changed the amount of history looked at and compared the results. This initial set included the following features:

• Goals scored by home team in its latest x matches
• Goals scored by away team in its latest x matches
• Goals conceded by home team in its latest x matches
• Goals conceded by away team in its latest x matches
• Average number of points gained by home team in its latest x matches
• Average number of points gained by away team in its latest x matches

The x stands for the (variable) number of matches looked at. The first four features are straightforward; the last two describe the points the home and away teams gained in their latest matches. These are calculated as in the football competition itself, namely 3 points for a win, 1 for a draw and 0 for a loss, and the average over the latest x matches is taken. By importing the features into Weka and letting several machine learning algorithms classify the data as described in Section 1.3, a percentage of correctly predicted instances is obtained.
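The six features described above can be computed from a list of past results. The following is a minimal Python sketch, not the code used in ref. [2]: the function names, the `(goals_for, goals_against)` record layout and the sample histories are all assumptions made for illustration.

```python
from statistics import mean

def points(goals_for, goals_against):
    """League points for one match: 3 for a win, 1 for a draw, 0 for a loss."""
    if goals_for > goals_against:
        return 3
    if goals_for == goals_against:
        return 1
    return 0

def team_features(history, x):
    """Summarise a team's latest x matches (history is ordered oldest first)."""
    latest = history[-x:]
    return {
        "goals_scored": sum(gf for gf, _ in latest),
        "goals_conceded": sum(ga for _, ga in latest),
        "avg_points": mean([points(gf, ga) for gf, ga in latest]),
    }

def match_features(home_history, away_history, x):
    """Feature vector in the same order as the list in the text."""
    home = team_features(home_history, x)
    away = team_features(away_history, x)
    return [home["goals_scored"], away["goals_scored"],
            home["goals_conceded"], away["goals_conceded"],
            home["avg_points"], away["avg_points"]]

# Hypothetical histories, each entry being (goals_for, goals_against).
home_history = [(2, 0), (1, 1), (0, 3), (2, 1)]
away_history = [(0, 0), (1, 2), (3, 1), (1, 1)]
features = match_features(home_history, away_history, x=3)
print(features)
```

Changing `x` and recomputing the features is the experiment the text describes: the same matches are re-encoded with a shorter or longer history before being handed to the classifiers.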

Now that an optimal number of matches to be considered has been found, we can move on to selecting the best possible classifier (machine learning algorithm). Each classifier labels every match as a home win, a draw or an away win, depending on the features belonging to that match. During the previous test round a selection has already been made. Below is a list of seven classifiers, including a short description of each one:

• ClassificationViaRegression – This algorithm uses linear regression in order to predict the right class.
• MultiClassClassifier – This algorithm is a lot like ClassificationViaRegression, except that it uses logistic regression instead of linear regression.
• RotationForest – This algorithm uses a decision tree to predict the right class.
• LogitBoost – This is a boosting algorithm that also uses logistic regression.
• BayesNet – This algorithm uses Bayesian networks to predict the right class.
• NaiveBayes – This algorithm resembles BayesNet, except [...]
• Home wins – This algorithm will, regardless of the feature set, always predict a home win.

In the previous section we have already seen that the first two perform best when using the given simple feature set. We now expand our feature set by a few more features and make several selections of them to see which classifier is best. Please note that the "home wins" classifier is used merely as a reference. It can immediately be seen that this classifier performs worse than all the others.
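The ten-fold cross-validation from the introduction and the "home wins" reference classifier can be combined into a small evaluation harness. The sketch below is a plain Python illustration, not Weka's implementation: the function names, the `(features, label)` sample layout and the label counts are assumptions.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle the sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(samples, fit, k=10):
    """Mean accuracy over k folds.

    samples is a list of (features, label) pairs; fit(train_samples) must
    return a function that maps features to a predicted label.
    """
    accuracies = []
    for test_fold in k_fold_indices(len(samples), k):
        held_out = set(test_fold)
        train = [s for i, s in enumerate(samples) if i not in held_out]
        predict = fit(train)
        correct = sum(predict(samples[i][0]) == samples[i][1] for i in test_fold)
        accuracies.append(correct / len(test_fold))
    return sum(accuracies) / len(accuracies)

def home_wins_baseline(train_samples):
    """Reference classifier: ignores the data and always predicts a home win."""
    return lambda features: "home"

# Hypothetical data set: 50 home wins, 25 draws, 25 away wins (features unused here).
samples = [([], "home")] * 50 + [([], "draw")] * 25 + [([], "away")] * 25
baseline_accuracy = cross_validate(samples, home_wins_baseline)
print(f"home-wins baseline accuracy: {baseline_accuracy:.2f}")
```

Any real classifier can be plugged in through the same `fit` interface, and its cross-validated accuracy is only meaningful if it exceeds what this baseline reaches on the same folds.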

A Bayesian network was used to predict the results of the FC Barcelona team in the Spanish League [3]. During the last decade, Bayesian networks (and probabilistic graphical models in general) have become very popular in artificial intelligence. Bayesian networks (BNs) are graphical models for reasoning under uncertainty, where the nodes represent variables (discrete or continuous) and the arcs represent direct connections between them. These direct connections are often causal connections. In addition, BNs model the quantitative strength of the connections between variables, allowing probabilistic beliefs about them to be updated automatically as new information becomes available. A Bayesian network for a set of variables $X = \{X_1, \dots, X_n\}$ consists of:

1. A network structure S that encodes a set of conditional independence assertions about variables in X,
2. A set P of local probability distributions associated with each variable.

Together, these components define the joint probability distribution for X. The network structure S is a directed acyclic graph.
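For reference, the joint distribution defined by S and P is usually written as a product of the local distributions. This is the standard Bayesian-network factorization; the symbol $\mathrm{Pa}_i$, denoting the parents of $X_i$ in S, is notation introduced here and does not appear in the original text.

```latex
p(X_1, \dots, X_n) = \prod_{i=1}^{n} p\left(X_i \mid \mathrm{Pa}_i\right)
```

A variable without parents contributes its unconditional distribution $p(X_i)$ to the product.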

The BN used in the research of ref. [3] is as follows:

[Figure: Bayesian network used in ref. [3]]

A neural network approach can also be used to predict the results of football matches, as in ref. [4]. In that work, the input and output variables were known, but the hidden layer and the weight distributions were not.

Alternatively, a compound approach can be adopted, as explained in ref. [5]. The authors designed FRES (Football Result Expectation System), which consists of two major components: a rule-based reasoner and a Bayesian network component. This approach is a compound one in the sense that two different methods cooperate in predicting the result of a football match.

The reasoning can be divided into two stages, strategy-making and result-calculating. Strategies include overlapping, man-marking, pressing, position, and passing. The results from Bayesian networks form the basis for these decisions. Each team is assumed to have its own particular characteristics, such as work rate, aggressiveness, pass length, etc. Jess takes all these facets into consideration to determine a strategy. As well as play-making strategies, the system also reasons about higher-level decisions such as substitutions and formation changes. The result-calculating part models the actual flow of a match. It models such aspects as the effect of goals on morale, and the effect of reputations, relative scores, and locations on the state of the players. The state changes throughout the match: for example, a team's morale may be very good at one moment, but if nothing special happens for a long time, their morale can be expected to converge to normal [5].


A Bayesian network (also called a Bayes network, belief network, Bayes(ian) model or probabilistic directed acyclic graphical model) is a probabilistic graphical model (a type of statistical model) that represents a set of random variables and their conditional dependencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases [6].
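The disease and symptom example can be made concrete with the smallest possible network, a single arc from Disease to Symptom. The Python sketch below applies Bayes' theorem to that two-node case; the function name and all probabilities are hypothetical values chosen for illustration.

```python
def posterior_disease(prior, p_symptom_given_disease, p_symptom_given_healthy):
    """P(disease | symptom observed) for a two-node network Disease -> Symptom."""
    joint_disease = prior * p_symptom_given_disease
    joint_healthy = (1.0 - prior) * p_symptom_given_healthy
    return joint_disease / (joint_disease + joint_healthy)

# Hypothetical numbers: 1% prevalence, symptom present in 90% of the sick
# and in 5% of the healthy.
posterior = posterior_disease(prior=0.01,
                              p_symptom_given_disease=0.9,
                              p_symptom_given_healthy=0.05)
print(round(posterior, 3))
```

Larger networks perform the same kind of update over many variables at once, which is what inference algorithms for Bayesian networks automate.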

WHAT

The project intends to create an artificial neural network capable of predicting, within a reasonable margin of error, the outcome of a football match during a specific season. The prediction is based on statistical data from past seasons and on performance ratings of the players, as well as of the team as a whole, when playing against other teams from the same league.

WHY

• Mathematical and statistical challenge
• The process needed to train an artificial neural network can be implemented in other similar applications
• Advancing the artificial intelligence field
• Betting

2 METHODOLOGY

The team to be analysed is LIGA DE QUITO, the latest winner of the stage in the Ecuadorian Cup.

A neural network will be established for each team, taking into account the statistics from 15 matches of the last season. These statistics are taken from http://www.futbolmetrics.com [7].

2.1 Inputs

1. Shooting ratings
2. Effectivity ratings
3. Goalkeeper saves
4. Team defensive challenges won
5. Goals in favor
6. Goals against

2.2 Outputs

1. Winning the match.
2. Drawing the match.
3. Losing the match.
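The inputs and outputs listed above map naturally onto a six-element input vector and a three-element target vector per match. The Python sketch below shows one way to build them. The key names, the sample statistics and, in particular, the order of the three outputs are assumptions: the list above suggests win, draw, loss, while the simulation section writes a draw as [0; 0; 1], so the order actually used in MATLAB may differ.

```python
# Keys follow the list of inputs in the text; the names themselves are hypothetical.
INPUT_NAMES = ["shooting_rating", "effectivity_rating", "goalkeeper_saves",
               "defensive_challenges_won", "goals_for", "goals_against"]

# Assumed output order; pass a different list if the networks use another one.
OUTCOMES = ["win", "draw", "loss"]

def input_vector(stats):
    """Order one match's statistics as the six network inputs."""
    return [float(stats[name]) for name in INPUT_NAMES]

def target_vector(outcome, outcomes=OUTCOMES):
    """One-of-three target: 1 in the position of the observed outcome, 0 elsewhere."""
    if outcome not in outcomes:
        raise ValueError(f"unknown outcome: {outcome}")
    return [1 if name == outcome else 0 for name in outcomes]

# Hypothetical statistics for a single match.
match_stats = {"shooting_rating": 0.55, "effectivity_rating": 0.40,
               "goalkeeper_saves": 3, "defensive_challenges_won": 12,
               "goals_for": 2, "goals_against": 1}
x = input_vector(match_stats)
t = target_vector("win")
print(x, t)
```

Stacking these vectors for the 15 training matches gives a 6-by-15 input matrix and a 3-by-15 target matrix, the column-per-sample layout that MATLAB's neural network tools expect.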

The neural network methodology consists of establishing three different types of network:

• Cascade
• Feed forward
• Perceptron

These networks will be defined using the NNTOOL toolbox of MATLAB.

The results of these simulations are shown in the next section.
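The structural difference between the feed forward and cascade networks can be illustrated with a forward pass written in plain Python. This is a sketch under stated assumptions, not the NNTOOL implementation: the weights are random and untrained, tanh is used in every layer for brevity, and the layer sizes (6 inputs, two layers of 10 neurons, 3 outputs) are taken from the inputs, outputs and feed forward description given elsewhere in this document.

```python
import math
import random

def dense(inputs, weights, biases):
    """One fully connected layer with tanh activation: tanh(W x + b)."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def random_layer(rng, n_in, n_out):
    """Untrained layer with weights and biases drawn uniformly from [-1, 1]."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases

def feed_forward(x, layers):
    """Each layer sees only the output of the layer before it."""
    signal = list(x)
    for weights, biases in layers:
        signal = dense(signal, weights, biases)
    return signal

def cascade_forward(x, layers):
    """Each layer sees the network input plus the outputs of all earlier layers."""
    seen = list(x)
    out = []
    for weights, biases in layers:
        out = dense(seen, weights, biases)
        seen = seen + out
    return out

rng = random.Random(0)
n_inputs, hidden, n_outputs = 6, 10, 3
ff_layers = [random_layer(rng, n_inputs, hidden),
             random_layer(rng, hidden, hidden),
             random_layer(rng, hidden, n_outputs)]
cascade_layers = [random_layer(rng, n_inputs, hidden),
                  random_layer(rng, n_inputs + hidden, hidden),
                  random_layer(rng, n_inputs + 2 * hidden, n_outputs)]

x = [0.5, 0.4, 3.0, 12.0, 2.0, 1.0]  # hypothetical match statistics
ff_out = feed_forward(x, ff_layers)
cascade_out = cascade_forward(x, cascade_layers)
print(len(ff_out), len(cascade_out))
```

The cascade variant has more weights for the same number of neurons, because every later layer also receives the raw inputs and the outputs of all preceding layers.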

2.3 Simulation

The simulation process consists of feeding the statistics of the last match to each network and comparing the simulated output with the actual result.

3 RESULTS

The results of the different networks are presented first for LIGA DE QUITO.

3.1  LIGA DE QUITO

3.1.1  Perceptron


3.1.2  Feed forward

3.1.3  Cascade


3.2 SIMULATION

The statistics of the match taken into account are those of the second match of the second season of 2015, as shown below [7].

These simulations will be done in each neural network. The combined results of these simulations are shown below.

The expected result is a draw, i.e. a matrix of:

[0; 0; 1]

The simulation that best suits the result is that of the Feed Forward network:

[. ; . ; .]
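Choosing the simulation that best suits the expected result can be done by measuring how far each output vector lies from the target. The Python sketch below uses the squared error for this. The three output vectors are hypothetical placeholders chosen only to mirror the qualitative behaviour reported in the discussion; the numeric outputs of the original simulations are not available in this text.

```python
def squared_error(output, target):
    """Sum of squared differences between a network output and the target."""
    return sum((o - t) ** 2 for o, t in zip(output, target))

target = [0, 0, 1]  # expected result (a draw), as encoded in the text

# Hypothetical outputs, NOT the values obtained in the original simulations.
simulated = {
    "perceptron": [1, 0, 0],
    "feed_forward": [0.02, 0.01, 0.97],
    "cascade": [0, 0, 0],
}

errors = {name: squared_error(out, target) for name, out in simulated.items()}
best = min(errors, key=errors.get)
print(best, errors)
```

With these placeholder values the feed forward output lies closest to the target, the all-zero cascade output is one unit away, and a one-hot output in the wrong position is the farthest.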

4 DISCUSSION AND APPLICATIONS

4.1 Perceptron Network

The perceptron network is the simplest kind of network, and its results are the easiest to compare visually, because it shows only values of 1, 0 or -1.

The training stage is also easier; however, the results did not converge, and the network always reached the maximum epoch without a conclusive result.

The error in predicting draws is large. However, the error of this network is null for predicting losses and wins.

In the simulation, this network did not accurately predict the outcome of the test match: it showed a win.
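Reaching the maximum epoch without converging is the typical behaviour of the perceptron learning rule when no linear boundary separates the classes. Whether that was the cause here is not established by the results above, so the Python sketch below only illustrates the mechanism on two toy problems: the logical AND (linearly separable) and XOR (not separable). The function name, the epoch limit and the toy data are assumptions.

```python
def train_perceptron(samples, max_epochs=100):
    """Perceptron learning rule with a hard-limit output (0 or 1).

    Returns (weights, bias, converged). converged is False when the epoch
    limit is reached while at least one sample is still misclassified.
    """
    n_inputs = len(samples[0][0])
    weights, bias = [0.0] * n_inputs, 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in samples:
            activation = sum(w * x for w, x in zip(weights, inputs)) + bias
            output = 1 if activation >= 0 else 0
            error = target - output
            if error != 0:
                mistakes += 1
                # Move the decision boundary towards the misclassified sample.
                weights = [w + error * x for w, x in zip(weights, inputs)]
                bias += error
        if mistakes == 0:
            return weights, bias, True
    return weights, bias, False

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

_, _, and_converged = train_perceptron(AND)
_, _, xor_converged = train_perceptron(XOR)
print(and_converged, xor_converged)
```

A three-way outcome in which the draw sits between win and loss may be hard to separate with single linear units, which would be consistent with the large error reported for draws.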

4.2 Feed forward

This network was implemented with 3 layers, with 10 neurons in each of the first and second layers.

The feed forward network begins with a large error, but the training process reduces the error dramatically: the error in the last training was on the order of 4 × 10^-[...].

The results of the training proved to be very accurate compared with the target. No values differed from the expected results.

In the simulation process, it is the only network that accurately predicted the outcome: it predicted a draw (very close to 1).

4.3 Cascade

This network showed a good training process, with a reduced error in each training.

The error in the learning process turned out to be small for the last training, on the order of 4 × 10^-[...].

The simulation result of this neural network was not conclusive, since it did not predict any outcome: the values for drawing, losing and winning were all 0.

4.4 Applications

With further refinement of the input variables, this work can be applied to predict the outcome of a football match.

Another application of this project can be in other sports.


5 CONCLUSIONS AND RECOMMENDATIONS

• The best suited neural network for this project is the feed forward network, since it was the one that learnt that scoring more goals than the team concedes translates into winning the match.
• The perceptron network is not suited for this kind of project, since it does not cope well with draws.
• The cascade network is not good for this project, since it does not predict any outcome.
• The current network does not predict accurately, since it needs the goals scored in the match in order to predict. Further variables are needed so that the goals can be discarded from the inputs.

References

[1] J. Goddard, Modelling football match results and the efficiency of fixed-odds betting, Swansea: University of Wales.

[2] D. Buursma, Predicting sports events from past results, Twente: University of Twente.

[3] P. E. a. F. S. M. Farzin Owramipur, "Football Result Prediction with Bayesian Network in Spanish League-Barcelona Team," vol. 5, no. 5, 2013.

[4] [Online]. Available: http://neuroph.sourceforge.net/tutorials/SportsPrediction/Premier%20League%20Prediction.html.

[5] C. C. a. R. I. (. M. Byungho Min, "A Compound Approach for Football Result Prediction," Seoul National University, Seoul.

[6] "Bayesian network," [Online]. Available: https://en.wikipedia.org/wiki/Bayesian_network.

[7] "http://www.futbolmetrics.com/," [Online].