kddm2 team24 machine learning
kti.tugraz.at/.../2017/presentations/final/team-24.pdf

Final presentation Project 4: Machine Learning Josef Koini [ #24 ] Knowledge Discovery and Data Mining 2 TU Graz 29 June 2017


Final presentation
Project 4: Machine Learning

Josef Koini [ #24 ]
Knowledge Discovery and Data Mining 2

TU Graz

29 June 2017


Recapitulation

29 June 2017 [KDDM2] Final presentation 2


What happened before...

- Problem: automatic tagging of songs
- Data set: Last.fm data set
- Planned approach: classification with Naive Bayes


Approach


Example

{
  "artist": "Neil Diamond",
  "timestamp": "2011-08-09 02:43:27.936416",
  "similars": [
    ["TRWERMW128F92D19EB", 0.891814],
    ...,
    ["TREJCAS128F9309618", 0.00048260300000000001]
  ],
  "tags": [
    ["Soundtrack", "100"],
    ["soft rock", "33"],
    ["Neil Diamond", "33"],
    ["brooklyn connections", "16"],
    ["new york connections", "16"],
    ["stage-and-screen", "16"],
    ...,
    ["diamond", "16"],
    ["male vocalists", "16"],
    ["cinematic", "16"],
    ["american", "16"]
  ],
  "track_id": "TRJFKKR128F92D1950",
  "title": "Dear Father"
}
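A minimal sketch of how one such record could be read in Python. The record below is a trimmed copy of the slide example with the elided entries left out, and `parse_track` is a hypothetical helper name, not something taken from the project.

```python
import json

# Trimmed copy of the record from the slide; the elided entries ("...") are omitted.
RECORD = """
{
  "artist": "Neil Diamond",
  "timestamp": "2011-08-09 02:43:27.936416",
  "similars": [["TRWERMW128F92D19EB", 0.891814]],
  "tags": [["Soundtrack", "100"], ["soft rock", "33"], ["Neil Diamond", "33"]],
  "track_id": "TRJFKKR128F92D1950",
  "title": "Dear Father"
}
"""


def parse_track(raw):
    """Return (title, artist, {tag: weight}) for one track record."""
    track = json.loads(raw)
    # Tag weights are stored as strings in the record, so convert them to int.
    tags = {name: int(weight) for name, weight in track["tags"]}
    return track["title"], track["artist"], tags


title, artist, tags = parse_track(RECORD)
print(title, "|", artist, "|", tags)
```

Title and artist are the text fields used as features later on; the tag list supplies the labels.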


Feature preparation

- Bag of words approach
  - Title
  - Artist
- Bigrams
- Minimum feature appearance
- Tf-idf
- Feature selection
- Removing tracks without tags from training set
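The feature preparation steps could look roughly like this in scikit-learn. This is a sketch under assumptions: the toy titles and labels are made up, `min_df=2` and `k=2` are arbitrary values, and chi-squared is only one possible selection criterion (the slide does not say which one was used).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2

# Hypothetical toy data: track titles and a binary label for a single tag.
titles = [
    "Dear Father", "Father and Son", "Dear Prudence", "Sweet Caroline",
    "Sweet Child", "Dear Mama", "Father Figure", "Sweet Dreams",
]
has_tag = [1, 1, 0, 0, 0, 1, 1, 0]

# Bag of words over unigrams and bigrams, weighted by tf-idf.
# min_df=2 drops features that appear in fewer than two titles.
vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=2)
X = vectorizer.fit_transform(titles)

# Keep the k features most associated with the tag (chi-squared score).
selector = SelectKBest(chi2, k=2)
X_selected = selector.fit_transform(X, has_tag)

# Recover the feature names in column order from the fitted vocabulary.
names = sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get)
kept = [name for name, keep in zip(names, selector.get_support()) if keep]
print(X.shape, "->", X_selected.shape, kept)
```

The same pipeline would be fitted separately on the artist strings, since title and artist are treated as two independent bags of words.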


Classification

- Scikit-learn
- Multinomial Naive Bayes classifier
- Top n tags
- 2 classifiers per tag
  - Title
  - Artist
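A sketch of the "2 classifiers per tag" idea with scikit-learn's `MultinomialNB`. The training tracks are invented, and the slide does not say how the title and artist classifiers are combined; averaging the two positive-class probabilities, as done in the hypothetical `predict_tag` below, is purely an assumption for illustration.

```python
from collections import Counter

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy training data: (title, artist, set of tags).
tracks = [
    ("Dear Father", "Neil Diamond", {"soft rock"}),
    ("Sweet Caroline", "Neil Diamond", {"soft rock"}),
    ("Father and Son", "Cat Stevens", {"soft rock"}),
    ("Sweet Dreams", "Eurythmics", {"pop"}),
    ("Dear Mama", "2Pac", {"rap"}),
    ("Sweet Child", "Eurythmics", {"pop"}),
]
titles = [title for title, _, _ in tracks]
artists = [artist for _, artist, _ in tracks]

# Top n tags: restrict training to the n most frequent tags (n=2 here).
tag_counts = Counter(tag for _, _, tags in tracks for tag in tags)
top_tags = [tag for tag, _ in tag_counts.most_common(2)]


def train_tag(tag):
    """Train one title classifier and one artist classifier for a single tag."""
    y = [int(tag in tags) for _, _, tags in tracks]
    title_clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
    artist_clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
    return title_clf.fit(titles, y), artist_clf.fit(artists, y)


def predict_tag(models, title, artist):
    """Assumed combination rule: average the two positive-class probabilities."""
    title_clf, artist_clf = models
    p_title = title_clf.predict_proba([title])[0][1]
    p_artist = artist_clf.predict_proba([artist])[0][1]
    return (p_title + p_artist) / 2 >= 0.5


models = {tag: train_tag(tag) for tag in top_tags}
for tag in top_tags:
    print(tag, bool(predict_tag(models[tag], "Dear Caroline", "Neil Diamond")))
```

Each tag becomes an independent binary problem, so the number of models grows linearly with n.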


Results


Evaluation

- Accuracy
- Precision
- Recall
- F1-measure
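For reference, the four measures can be computed from the confusion counts of a single binary tag classifier. The sketch below uses the standard definitions; the label vectors are invented, and how the project averaged the measures over tags is not stated on the slide.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is predicted or nothing is tagged.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Hypothetical example: 3 of 10 tracks carry the tag.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
accuracy, precision, recall, f1 = binary_metrics(y_true, y_pred)
print(accuracy, precision, recall, f1)
```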


Problems

- Unbalanced classes
- Tracks without tags
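Why unbalanced classes matter can be illustrated with a trivial baseline that never assigns the tag. The track counts below are hypothetical and only show the effect: the rarer the tag, the higher the accuracy of this useless classifier, while its recall stays at zero.

```python
def always_negative_baseline(n_tracks, n_tagged):
    """Accuracy and recall of a classifier that never assigns the tag."""
    # Every untagged track is a true negative; every tagged track is missed.
    accuracy = (n_tracks - n_tagged) / n_tracks
    recall = 0.0
    return accuracy, recall


# Hypothetical tag frequencies in a collection of 10,000 tracks.
for n_tagged in (5000, 500, 50):
    accuracy, recall = always_negative_baseline(10_000, n_tagged)
    print(f"{n_tagged:>5} tagged tracks: accuracy={accuracy:.3f}, recall={recall:.1f}")
```

This may be one reason why accuracy on its own is hard to interpret for rare tags, and why precision, recall and F1 are reported alongside it.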


Tags


Tag distribution


Accuracy

Bar chart, y-axis from 0% to 100%. Values listed in x-axis order:

x-axis label   Accuracy
1              86.12%
2              87.89%
5              90.78%
10             87.66%
20             92.40%
50             94.51%
100            96.38%
200            97.71%
500            98.83%
1000           99.31%


Precision

Bar chart, y-axis from 0% to 100%. Values listed in x-axis order:

x-axis label   Precision
1              64.57%
2              64.38%
5              64.43%
10             42.54%
20             51.03%
50             49.47%
100            53.11%
200            55.51%
500            57.97%
1000           59.05%


Recall

Bar chart, y-axis from 0% to 100%. Values listed in x-axis order:

x-axis label   Recall
1              68.15%
2              63.28%
5              62.33%
10             65.30%
20             57.71%
50             54.63%
100            46.12%
200            40.15%
500            33.68%
1000           29.29%


F1

Bar chart, y-axis from 0% to 100%. Values listed in x-axis order:

x-axis label   F1
1              66.31%
2              63.82%
5              63.36%
10             51.52%
20             54.17%
50             51.92%
100            49.37%
200            46.60%
500            42.60%
1000           39.16%
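As a cross-check, the F1 values on this chart can be compared with the harmonic mean of the precision and recall values from the two preceding charts. The sketch assumes that each value belongs to the x-axis label at the same position, in the order the values appear on the slides; `harmonic_mean` is a helper name introduced here.

```python
# Values transcribed from the Precision, Recall and F1 charts, in x-axis order.
X_LABELS = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]
PRECISION = [64.57, 64.38, 64.43, 42.54, 51.03, 49.47, 53.11, 55.51, 57.97, 59.05]
RECALL = [68.15, 63.28, 62.33, 65.30, 57.71, 54.63, 46.12, 40.15, 33.68, 29.29]
F1_REPORTED = [66.31, 63.82, 63.36, 51.52, 54.17, 51.92, 49.37, 46.60, 42.60, 39.16]


def harmonic_mean(p, r):
    """F1 as the harmonic mean of precision and recall (both in percent)."""
    return 2 * p * r / (p + r) if p + r else 0.0


deviations = []
for x, p, r, f1 in zip(X_LABELS, PRECISION, RECALL, F1_REPORTED):
    computed = harmonic_mean(p, r)
    deviations.append(abs(computed - f1))
    print(f"{x:>5}: computed {computed:.2f}%, reported {f1:.2f}%")

print("largest deviation in percentage points:", round(max(deviations), 3))
```

The computed and reported values agree up to rounding of the inputs, which supports reading the three charts position by position.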


Conclusion


To put it in a nutshell...

- Number of tags influences the strategy
- Efficient method
- Fairly good results


Questions?

Don't hesitate to ask ;)
