beating the bookmakers: leveraging statistics and twitter microposts for predicting soccer results

1
http://multimedialab.elis.ugent.be Ghent University – iMinds, ELIS Department/Multimedia Lab Gaston Crommenlaan 8 bus 201 B-9050 Ledeberg – Ghent, Belgium Fréderic Godin, Jasper Zuallaert, Baptist Vandersmissen, Wesley De Neve and Rik Van de Walle Workshop on Large-scale Sports Analytics, KDD 2014 Beating the Bookmakers: Leveraging Statistics and Twitter Microposts for Predicting Soccer Results 24/8/2014, New York, USA Research Question Approach General evaluation of the predictions for 100 soccer games Method Match day 20-24 Match day 29-34 Overall Home Team Wins 48% 54% 51% A BBC Soccer Expert 62% 58% 60% The Bookmakers 66% 68% 67% Twitter Volume Model 48% 52% 50% Sentiment Model 48% 56% 52% User Prediction Model 58% 68% 63% Statistical Model 58% 70% 64% Majority Voting 64% 64% 64% Late Fusion 62% 70% 66% Early Fusion 66% 70% 68% 1. Harvest input data 2. Construct feature vectors 3. Train individual prediction models Statistical Analysis Twitter Volume Sentiment Analysis User Prediction Analysis Statistical model User Prediction Model Twitter Volume Model Sentiment Model Late Fusion Majority Voting Early Fusion Evaluation Can we use the wisdom of the crowd to predict the outcome of a soccer game correctly? Monetary Profit Conclusion Method Money earned The Bookmakers €18.55 Statistical Model €25.82 Early Fusion €29.70 How much would we earn if we bet €1 on every game? (100 games) 30% profit! By using the wisdom of the crowd we could beat the bookmakers in predicting the result of a soccer game. 4. Train combined prediction models rederic_godin, @jasperzuallaert, @BaptistV, @wmdeneve and @rvdwalle

Upload: fgodin

Post on 07-Dec-2014

165 views

Category:

Engineering


3 download

DESCRIPTION

Poster presented at the Large-Scale Sports Analytics Workshop at KDD 2014 (conference on Knowledge Discovery and Datamining) Paper can be found at : http://www.large-scale-sports-analytics.org/Large-Scale-Sports-Analytics/Submissions.html or Research Gate. ---------------------------- ABSTRACT: In this paper, we investigate the feasibility of using collec-tive knowledge for predicting the winner of a soccer game. Specifically, we developed different methods that extract and aggregate the information contained in over 50 million Twitter microposts to predict the outcome of soccer games, considering methods that use the Twitter volume, the sen-timent towards teams and the score predictions made by Twitter users. Apart from collective knowledge-based pre-diction methods, we also implemented traditional statistical methods. Our results show that the combination of different types of methods using both statistical knowledge and large sources of collective knowledge can beat both expert and bookmaker predictions. Indeed, we were for instance able to realize a monetary profit of almost 30% when betting on soccer games of the second half of the English Premier League 2013-2014.

TRANSCRIPT

Page 1: Beating the Bookmakers: Leveraging Statistics and Twitter Microposts for Predicting Soccer Results

http://multimedialab.elis.ugent.beGhent University – iMinds, ELIS Department/Multimedia Lab

Gaston Crommenlaan 8 bus 201B-9050 Ledeberg – Ghent, Belgium

Fréderic Godin, Jasper Zuallaert, Baptist Vandersmissen, Wesley De Neve and Rik Van de WalleWorkshop on Large-scale Sports Analytics, KDD 2014

Beating the Bookmakers: Leveraging Statistics and Twitter Microposts for Predicting Soccer Results

24/8/2014, New York, USA

Research Question

Approach

General evaluation of the predictions for 100 soccer gamesMethod Match day 20-24 Match day 29-34 OverallHome Team Wins 48% 54% 51%A BBC Soccer Expert 62% 58% 60%The Bookmakers 66% 68% 67%Twitter Volume Model 48% 52% 50%Sentiment Model 48% 56% 52%User Prediction Model 58% 68% 63%Statistical Model 58% 70% 64%Majority Voting 64% 64% 64%Late Fusion 62% 70% 66%Early Fusion 66% 70% 68%

1. Harvest input data 2. Construct feature vectors

3. Train individual prediction models

Statistical Analysis

Twitter Volume

Sentiment Analysis

User Prediction Analysis

Statistical model

User Prediction Model

Twitter Volume Model

Sentiment Model

Late Fusion

Majority Voting

Early Fusion

Evaluation

Can we use the wisdom of the crowd to predict the outcome of a soccer game correctly?

Monetary Profit

Conclusion

Method Money earnedThe Bookmakers €18.55Statistical Model €25.82Early Fusion €29.70

How much would we earn if we bet €1 on every game? (100 games)

30% profit!

By using the wisdom of the crowd we could beat the bookmakers in predicting the result of a soccer game.

4. Train combined prediction models

@frederic_godin, @jasperzuallaert, @BaptistV, @wmdeneve and @rvdwalle