la liga 2013 2014 analysis
NIIT BUSINESS ANALYTICS
La Liga Performance Analysis Project
Analyzing the strategic formula for success in La Liga
Rituparna Sarkar
In this project we try to explain how a team's wins and losses in La Liga (2013-2014) depend on the goals it scored ("forwarded") and conceded ("accepted"), and then carry the analysis forward to establish the relation between a team's shooting efficiency and the goals it scored. We then repeat the analysis for home and away matches to establish how the results differ under the two conditions.
Contents
Introduction
  Objective
  Data source
Analysis
  Data Preparation
  Data Analysis
Conclusion
Introduction
Objective
The analysis will establish the following:
1) Relation of the goals forwarded (scored) by a team to the number of matches it won.
2) Relation of the goals accepted (conceded) by a team to the number of matches it won.
3) Relation of the goals forwarded by a team to the number of matches it lost.
4) Relation of the goals accepted by a team to the number of matches it lost.
5) Relation of the number of goals scored by a team to the number of shots it took.
6) Relation of the number of goals scored by a team to its number of shots on target.
Based on the results of the above analysis, we will offer some suggestions that can help teams improve their performance in the upcoming season.
Data source
http://www.football-data.co.uk/spainm.php
La Liga Primera Division (2013-2014)_original.csv
All the data used for this analysis were collected from the above website.
Analysis
Data Preparation
The source file contains match-wise data for the season; to achieve our objective we need to transform it into team-wise data.
Steps followed to prepare the data:
1) Used “remove duplicate data” to find the unique team names.
2) Used a pivot table to get the data for each individual column, then copied each column's data and pasted it as values. Different pivot-table field combinations were used to produce the individual data columns.
3) The data was produced separately for the Home and Away scenarios; to get the data for the complete season, the Home and Away values were summed.
4) To compute the efficiency of shots, the simple formula (Shots on target / Total shots) was used.
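The preparation steps above can also be sketched in code. The following is a minimal illustration, not the spreadsheet procedure actually used: it assumes the match file has the columns HomeTeam, AwayTeam, FTHG and FTAG (full-time home and away goals), HS and AS (shots), and HST and AST (shots on target). The function name team_table, the output keys Shots and Shots_Eff, and the two sample rows with teams "A" and "B" are hypothetical and are not real results.

```python
from collections import defaultdict


def team_table(matches):
    """Aggregate match-wise rows into team-wise season totals."""
    keys = ("Win", "Loss", "Draw", "Goals_F", "Goals_A", "Shots", "Shot_OT")
    table = defaultdict(lambda: dict.fromkeys(keys, 0))
    for m in matches:
        home_goals, away_goals = int(m["FTHG"]), int(m["FTAG"])
        # One entry per side: (team, goals for, goals against, shots, on target)
        sides = (
            (m["HomeTeam"], home_goals, away_goals, int(m["HS"]), int(m["HST"])),
            (m["AwayTeam"], away_goals, home_goals, int(m["AS"]), int(m["AST"])),
        )
        for team, gf, ga, shots, on_target in sides:
            row = table[team]
            row["Win"] += gf > ga      # True counts as 1
            row["Loss"] += gf < ga
            row["Draw"] += gf == ga
            row["Goals_F"] += gf
            row["Goals_A"] += ga
            row["Shots"] += shots
            row["Shot_OT"] += on_target
    # Shot efficiency = shots on target / total shots
    for row in table.values():
        row["Shots_Eff"] = row["Shot_OT"] / row["Shots"] if row["Shots"] else 0.0
    return dict(table)


# Hypothetical sample rows, for illustration only (not real match results).
sample = [
    {"HomeTeam": "A", "AwayTeam": "B", "FTHG": "2", "FTAG": "1",
     "HS": "10", "AS": "8", "HST": "5", "AST": "2"},
    {"HomeTeam": "B", "AwayTeam": "A", "FTHG": "0", "FTAG": "0",
     "HS": "6", "AS": "4", "HST": "3", "AST": "1"},
]
table = team_table(sample)
print(table["A"])
```

With the real file, the rows could be read with csv.DictReader and passed to the same function. Summing home and away contributions in one pass corresponds to step 3 above.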
Our final datasheet
DataSet_La liga Analysis.xlsx
Keys for our data sheet
Goals_F - Goals Forwarded
Goals_A - Goals Accepted
Shot_OT - Shots On Target
Shots Eff - Shot Efficiency
H_Win - Home Win
H_Loss - Home Loss
H_Draw - Home Draw
H_Goals_F - Home Goals Forwarded
H_Goals_A - Home Goals Accepted
H_shots - Home Shots
H_shots_OT - Home Shots On Target
H_Shot_Eff - Home Shot Efficiency
A_Win - Away Win
A_Loss - Away Loss
A_Draw - Away Draw
A_Goals_F - Away Goals Forwarded
A_Goals_A - Away Goals Accepted
A_shots - Away Shots
A_shots_OT - Away Shots On Target
A_Shot_Eff - Away Shot Efficiency
[Figure: line graph "Total Win vs Total Goals Scored", one point per team]
Teams: Almeria, Ath Bilbao, Ath Madrid, Barcelona, Betis, Celta, Elche, Espanol, Getafe, Granada, Levante, Malaga, Osasuna, Real Madrid, Sevilla, Sociedad, Valencia, Valladolid, Vallecano, Villarreal
Total Win: 11, 20, 28, 27, 6, 14, 9, 11, 11, 12, 12, 12, 10, 27, 18, 16, 13, 7, 13, 17
Total Goals Scored: 43, 66, 77, 100, 36, 49, 30, 41, 35, 32, 35, 39, 32, 104, 69, 62, 51, 38, 46, 60
Data Analysis
Primary Analysis
1) Relation between matches won and the number of goals scored by the team
Data columns: Total Win / Total Goals Scored
Line graph:
Conclusion:
As the line graph shows, the rise and fall of the blue line (Total Win) closely tracks the rise and fall of the red line (Total Goals Scored). We can reasonably assume that a strong relation exists between these two parameters.
Linear Regression:
Conclusion:
Based on the regression result, 86% of the variation in Total Win can be explained by the Total Goals Scored in the tournament. The Significance F value also indicates that the probability of this relation arising by chance is very low.
So we can say that teams with a strong offence have performed better in La Liga.
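The regression step can be reproduced outside a spreadsheet. The following is a minimal sketch, assuming ordinary least squares with Total Win as the dependent variable and Total Goals Scored as the single predictor. The values are read off the data labels of the line graph for this comparison, in team order, and the function name linear_fit is illustrative. The F statistic is derived from R square for a one-predictor model; converting it into a Significance F (a p-value) would additionally require the F distribution, which is omitted here.

```python
def linear_fit(x, y):
    """Least-squares fit y = intercept + slope * x; returns slope, intercept, R square."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_square = sxy * sxy / (sxx * syy)
    return slope, intercept, r_square


# Data labels from the "Total Win vs Total Goals Scored" chart, in team order.
total_win = [11, 20, 28, 27, 6, 14, 9, 11, 11, 12,
             12, 12, 10, 27, 18, 16, 13, 7, 13, 17]
goals_scored = [43, 66, 77, 100, 36, 49, 30, 41, 35, 32,
                35, 39, 32, 104, 69, 62, 51, 38, 46, 60]

slope, intercept, r_square = linear_fit(goals_scored, total_win)

# F statistic for a regression with one predictor and n observations.
n = len(total_win)
f_stat = r_square / (1 - r_square) * (n - 2)

print(f"slope={slope:.3f} intercept={intercept:.3f} "
      f"R square={r_square:.3f} F={f_stat:.1f}")
```

The slope can be read as the additional wins associated with each additional goal scored over the season, under the assumption that a straight line is an adequate description of the data.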
[Figure: line graph "Total Win vs Total Goals Accepted", one point per team]
Teams: Almeria, Ath Bilbao, Ath Madrid, Barcelona, Betis, Celta, Elche, Espanol, Getafe, Granada, Levante, Malaga, Osasuna, Real Madrid, Sevilla, Sociedad, Valencia, Valladolid, Vallecano, Villarreal
Total Win: 11, 20, 28, 27, 6, 14, 9, 11, 11, 12, 12, 12, 10, 27, 18, 16, 13, 7, 13, 17
Total Goals Accepted: 71, 39, 26, 33, 78, 54, 50, 51, 54, 56, 43, 46, 62, 38, 52, 55, 53, 60, 80, 44
2) Relation between matches won and the number of goals accepted by the team
Data columns: Total Win / Total Goals Accepted
Line Graph:
Conclusion:
As the line graph shows, there is an obvious inverse relation between the two parameters. We assume the two parameters are related, but their rise and fall are not as closely in sync as in the previous pair. The linear regression results will help us establish the relation more firmly.
Linear Regression:
Conclusion:
As seen, the value of R square suggests that Total Win is not well explained by Total Goals Accepted. From this we can assume that most teams in La Liga emphasize a strong offence rather than defense: even with a high number of goals accepted, some teams have still performed well.
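[Figure: line graph "Total Loss vs Total Goals Scored", one point per team; values listed in chart order]
Total Loss: 20, 8, 4, 5, 25, 17, 16, 18, 18, 21, 14, 17, 19, 5, 11, 11, 15, 16, 21, 13
Total Goals Scored: 43, 66, 77, 100, 36, 49, 30, 41, 35, 32, 35, 39, 32, 104, 69, 62, 51, 38, 46, 60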
3) Relation between matches lost and the number of goals scored by the team
Data columns: Total Loss / Total Goals Scored
Line graph:
Conclusion:
As seen, this pair of parameters also has an inverse relation. Here, however, the relation is assumed to be strong based on a visual analysis of the line graph. A linear regression will help us establish the relation.
Linear Regression:
Conclusion:
The results of the linear regression show that 73% of the variation in Total Loss can be explained by Total Goals Scored. This also supports the view that teams with more goals have won more matches. Hence, we can say that teams with a strong offence have performed better in La Liga.
[Figure: line graph "Total Loss vs Total Goals Accepted", one point per team]
Teams: Almeria, Ath Bilbao, Ath Madrid, Barcelona, Betis, Celta, Elche, Espanol, Getafe, Granada, Levante, Malaga, Osasuna, Real Madrid, Sevilla, Sociedad, Valencia, Valladolid, Vallecano, Villarreal
Total Loss: 20, 8, 4, 5, 25, 17, 16, 18, 18, 21, 14, 17, 19, 5, 11, 11, 15, 16, 21, 13
Total Goals Accepted: 71, 39, 26, 33, 78, 54, 50, 51, 54, 56, 43, 46, 62, 38, 52, 55, 53, 60, 80, 44
4) Relation between matches lost and the number of goals accepted by the team
Data columns: Total Loss / Total Goals Accepted
Line Graph:
Conclusion:
As seen, we can establish a strong relation between the two parameters. They are directly related to each other, and the rise and fall of one explains the rise and fall of the other. The degree of the relation will be established by the linear regression.
Linear Regression:
Conclusion:
The linear regression confirms that the total number of losses can be explained by the total number of goals accepted by a team. However, the relation was stronger in the case of Total Win vs Total Goals Scored, which adds to the evidence that teams in La Liga emphasize a strong offence.
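The four win and loss comparisons can be placed side by side with a correlation coefficient, whose sign separates direct from inverse relations and whose square equals R square for a one-predictor regression. The sketch below is an illustration under that assumption, not the spreadsheet output of this report; the values are read off the data labels of the four line graphs, in team order, and the names pearson_r, pairs and results are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy)


# Data labels from the charts, in team order.
win = [11, 20, 28, 27, 6, 14, 9, 11, 11, 12,
       12, 12, 10, 27, 18, 16, 13, 7, 13, 17]
loss = [20, 8, 4, 5, 25, 17, 16, 18, 18, 21,
        14, 17, 19, 5, 11, 11, 15, 16, 21, 13]
scored = [43, 66, 77, 100, 36, 49, 30, 41, 35, 32,
          35, 39, 32, 104, 69, 62, 51, 38, 46, 60]
accepted = [71, 39, 26, 33, 78, 54, 50, 51, 54, 56,
            43, 46, 62, 38, 52, 55, 53, 60, 80, 44]

pairs = {
    "Total Win vs Goals Scored": (scored, win),
    "Total Win vs Goals Accepted": (accepted, win),
    "Total Loss vs Goals Scored": (scored, loss),
    "Total Loss vs Goals Accepted": (accepted, loss),
}
results = {name: pearson_r(x, y) for name, (x, y) in pairs.items()}

for name, r in results.items():
    direction = "direct" if r > 0 else "inverse"
    print(f"{name}: r = {r:+.2f} ({direction}), R square = {r * r:.2f}")
```

A negative r corresponds to the inverse relations described for comparisons 2 and 3, and a positive r to the direct relations in comparisons 1 and 4.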
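[Figure: line graph "Total Goals vs Total Shots", one point per team; values listed in chart order]
Total Goals Scored: 43, 66, 77, 100, 36, 49, 30, 41, 35, 32, 35, 39, 32, 104, 69, 62, 51, 38, 46, 60
Total Shots: 397, 510, 503, 643, 501, 494, 403, 415, 429, 432, 364, 473, 433, 743, 499, 503, 524, 373, 537, 436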
5) Relation between goals scored and the number of shots
Data columns: Total Goals Scored / Total Shots
Line Graph:
Conclusion:
A visual analysis of the line graph shows that a relation does exist between the two parameters, but only of moderate strength. The size of the rise in the red line is not well reflected in the blue line.
Linear Regression:
Conclusion:
The linear regression gives an R square of 70% for the two parameters. This helps us conclude that even when teams have a very strong offence, the efficiency of that offence is not as strong.
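[Figure: line graph "Total Goals Scored vs Shots on Target", one point per team; values listed in chart order]
Total Goals Scored: 43, 66, 77, 100, 36, 49, 30, 41, 35, 32, 35, 39, 32, 104, 69, 62, 51, 38, 46, 60
Shots on target: 146, 189, 202, 266, 162, 171, 125, 136, 133, 129, 125, 156, 134, 301, 163, 191, 176, 126, 176, 160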
6) Relation between goals scored and shots on target
Data columns: Total Goals Scored / Shots on target
Line graph:
Conclusion:
Shots on target give us a better reflection of the total goals scored.
Linear Regression:
Conclusion:
The high value of R square (87%) shows that goals scored depend strongly on shots on target.
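The comparison between the two shot measures can be checked by fitting each as the single predictor of goals scored and comparing the R square values. This is a minimal sketch under the assumption of ordinary least squares with one predictor; the values are read off the data labels of the two shot charts, in team order, and the helper name r_square is illustrative.

```python
def r_square(x, y):
    """R square of a least-squares line fitted to y with x as the only predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


# Data labels from the charts, in team order.
goals = [43, 66, 77, 100, 36, 49, 30, 41, 35, 32,
         35, 39, 32, 104, 69, 62, 51, 38, 46, 60]
total_shots = [397, 510, 503, 643, 501, 494, 403, 415, 429, 432,
               364, 473, 433, 743, 499, 503, 524, 373, 537, 436]
shots_on_target = [146, 189, 202, 266, 162, 171, 125, 136, 133, 129,
                   125, 156, 134, 301, 163, 191, 176, 126, 176, 160]

r2_shots = r_square(total_shots, goals)
r2_on_target = r_square(shots_on_target, goals)

print(f"Goals vs Total Shots:     R square = {r2_shots:.2f}")
print(f"Goals vs Shots on target: R square = {r2_on_target:.2f}")
```

A higher R square for shots on target than for total shots is what supports reading shot accuracy, rather than shot volume, as the better indicator of goals.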
Combining the above results
The above analysis shows 2 very strong relations.
1) Total Win vs Total Goals
2) Total Goals scored vs Total Shots on target.
As we see, most of the above analysis suggests that teams in La Liga primarily focus on offence rather than defense.
1) Wins are well explained by the number of goals scored, but not as strongly by the number of goals accepted.
2) Matches lost are about equally well explained by the total number of goals scored and the total number of goals accepted.
The later part of the analysis also suggests that the teams, although strong in offence, are not as efficient.
3) Total goals scored are better explained by shots on target than by total shots.
Supporting the above facts
1) Only 17 of the 380 matches in the season (38 × 20 / 2, since each match involves two teams), i.e. less than 5%, ended with no goals.
2) In only 100 of the 760 team appearances (38 × 20), i.e. less than 14%, did a team keep a clean sheet.
3) Even the best teams in the league have a maximum shot efficiency of 41%; almost 60% of their shots are wasted.
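The shot efficiency figure can be recomputed from the chart values with the formula from the data preparation step, (Shots on target / Total shots). The sketch below is illustrative only: the team names and values are read off the chart labels, in team order, and the variable names are hypothetical.

```python
teams = ["Almeria", "Ath Bilbao", "Ath Madrid", "Barcelona", "Betis",
         "Celta", "Elche", "Espanol", "Getafe", "Granada",
         "Levante", "Malaga", "Osasuna", "Real Madrid", "Sevilla",
         "Sociedad", "Valencia", "Valladolid", "Vallecano", "Villarreal"]
total_shots = [397, 510, 503, 643, 501, 494, 403, 415, 429, 432,
               364, 473, 433, 743, 499, 503, 524, 373, 537, 436]
shots_on_target = [146, 189, 202, 266, 162, 171, 125, 136, 133, 129,
                   125, 156, 134, 301, 163, 191, 176, 126, 176, 160]

# Shot efficiency = shots on target / total shots, per team.
efficiency = {
    team: on_target / shots
    for team, shots, on_target in zip(teams, total_shots, shots_on_target)
}

best_team = max(efficiency, key=efficiency.get)
best_eff = efficiency[best_team]

# List teams from most to least efficient.
for team, eff in sorted(efficiency.items(), key=lambda item: item[1], reverse=True):
    print(f"{team:12s} {eff:.1%}")
```

Sorting the teams by efficiency also makes it easy to see how far the rest of the league sits below the most efficient sides.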
Assumed reasons why this kind of performance is observed
1) Teams use formations with fewer players in defensive positions.
2) Teams lack good defensive players.
3) Teams do more offensive training.
4) Defensive players are not motivated as much as offensive players.
5) Managers are buying offensive players.
6) Teams are dominated by individual performers.
Conclusion
The following is the final set of suggestions for all teams in La Liga to improve their performance.
1) Increase defensive strength
a. Use formations with more players in defense.
b. Buy better defensive players in the next transfer window.
c. Focus more on defensive training.
d. Train players to have a neutral attitude rather than an offensive attitude in matches.
e. Give importance to the defense and motivate defenders to play more actively.
2) Increase efficiency of shots
a. Work on off-the-ball running to achieve better positioning.
b. Train the team to convert shots off target into shots on target.
c. Encourage players to keep possession of the ball rather than take shots that go off target.
Adopting one or more of the above strategies is believed to help teams perform better in the upcoming seasons of La Liga.