![Page 1: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/1.jpg)
Topic Models and Recommendations
Morten Arngren, Senior Data Scientist
![Page 2: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/2.jpg)
About
Topic Modelling
Recommendations
![Page 3: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/3.jpg)
“…YouTube for Publications…”
![Page 4: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/4.jpg)
Started in 2006 by 5 dudes.
15M publications (free)
7.5B page views / month
340M pages (25 km²)
2013
83M unique visitors / month
![Page 5: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/5.jpg)
Data Science Team (Copenhagen)
12x 2.6GHz
96GB RAM
2TB SSD
2TB Hard Drive

Morten Arngren
Ph.D. in Machine Learning and AI (2011)
M.Sc.A.M. (2007)
B.Sc.E.E. (1997)
ISSUU, Data Scientist (2011 - present)
DTU & FOSS Analytical, Machine Learning in Food Quality (2008-2011)
Nokia Mobile Phones, Digital Signal Processing (2000-2007)
Alcatel Space Denmark, Building Rockets (1997-2000)

Andrius Butkus
Ph.D. in Digital Media Personalisation (2009)
M.Sc.E.E. (2004)
B.Sc.E.E. (2002)
ISSUU, Data Scientist (2011 - present)
DTU External Lecturer, Human Computer Interaction (2010 - present)
DTU Assistant Professor, Digital Media Engineering (2008-2010)

Amazon Web Services
ML Gadgets
![Page 6: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/6.jpg)
Data
![Page 7: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/7.jpg)
Data
![Page 8: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/8.jpg)
Layout
(Quantify text and image boxes)
Article Extraction
OCR
Image
Cover Analysis
Explicit Detection
Doc. Type Classification
Text
Detect Language (56)
Translate to English (from 24 languages)
LDA Topics
Page Content
DB
40k Pubs / Day
![Page 9: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/9.jpg)
Reader Activity
[Figure: reader sessions along a time axis, written to the DB; example reader: “Birdie Nam Nam”]
200GB / Day
![Page 10: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/10.jpg)
Topic Modelling
![Page 11: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/11.jpg)
LATENT DIRICHLET ALLOCATION
150 topics (preset parameter)
Topic model based on Bag-of-Words Data
http://radimrehurek.com/gensim/
Wikipedia Training Data: ~4.5M Single Articles (Pure Topics)
[Figure: LDA maps a document to a Topic Distribution over topics 1 to 150; example topics: arabic, Australia, history, business, islands, environment, hotels, poetic, food, design, arts, plants, animals]
D. Blei, A. Ng, and M. Jordan. Latent Dirichlet allocation. Journal of Machine Learning Research, 3:993–1022, January 2003.
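To make the bag-of-words input and the resulting topic distribution concrete, here is a minimal sketch in plain Python. It is not the gensim code used in the talk: gensim's `LdaModel` is based on online variational Bayes, whereas this toy uses a collapsed Gibbs sampler because it fits in a few lines. The four example texts, the choice of 2 topics (instead of 150), and the hyperparameters `alpha` and `beta` are all assumptions made for illustration.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Count tokens; word order is discarded, which is all LDA looks at."""
    return Counter(text.lower().split())


def fit_lda(bows, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler. Returns one topic distribution per document."""
    rng = random.Random(seed)
    vocab = sorted({word for bow in bows for word in bow})
    word_id = {word: i for i, word in enumerate(vocab)}
    # Expand each bag back into a token list; the order carries no information.
    docs = [[word_id[w] for w in bow.elements()] for bow in bows]
    vocab_size = len(vocab)
    topics = range(num_topics)

    # Count tables: topic counts per document, word counts per topic, topic sizes.
    doc_topic = [[0] * num_topics for _ in docs]
    topic_word = [[0] * vocab_size for _ in topics]
    topic_total = [0] * num_topics
    assignment = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for w, z in zip(doc, assignment[d]):
            doc_topic[d][z] += 1
            topic_word[z][w] += 1
            topic_total[z] += 1

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                z = assignment[d][i]
                doc_topic[d][z] -= 1
                topic_word[z][w] -= 1
                topic_total[z] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in topics
                ]
                z = rng.choices(topics, weights=weights)[0]
                assignment[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][w] += 1
                topic_total[z] += 1

    # Smoothed per-document topic distribution; each row sums to 1.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha) for k in topics]
        for d, doc in enumerate(docs)
    ]


texts = [
    "beach hotel island beach travel",
    "hotel island travel flight",
    "recipe food kitchen recipe",
    "food kitchen cooking recipe",
]
theta = fit_lda([bag_of_words(t) for t in texts], num_topics=2)
for row in theta:
    print([round(p, 2) for p in row])
```

Each printed row is one document's topic distribution. With 150 topics and Wikipedia-scale training data the same idea applies, but a library such as gensim is the practical choice.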
![Page 12: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/12.jpg)
LATENT DIRICHLET ALLOCATION
Properties: $\theta_k \in [0, 1]$ and $\sum_k \theta_k = 1$
[Figure: Issuu Publications plotted in LDA Space]
![Page 13: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/13.jpg)
TOPIC CATEGORIES
(Learning from Wikipedia Dataset)
~4.5 Mio. Wikipedia articles
~9 Mio.
Density distribution not the same
Empty locations in LDA space.
Example categories: Travel, Cocktails, Chemistry, Botanics, Drinks, Dancing
Example mixture: 0.5 Travel, 0.4 Sports, 0.1
![Page 14: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/14.jpg)
Recommendation System
![Page 15: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/15.jpg)
READER ACTIVITY
[Figure: publications read over Time by the example reader “Birdie Nam Nam”]
No Explicit Rating…
Extract Implicit Rating…?
![Page 16: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/16.jpg)
Session {
  UserName: “Birdie-Nam-Nam”
  DocID: xxx-xxxxx
  Pages:
    1: [250, 725, 569, 134, ...]
    2: [1056, 1259, ...]
    3: [1056, 1259, ...]
    4: [102, 356, 208, 438]
    5: [102, 356, 208, 438]
    6: [5250, 3567, 809]
    7: [5250, 3567, 809]
    ...
  TimeStamp: 1378935850
  DocID: yyy-yyyyy
}
Pages: [1,2,3,6,7]
ReadTime: 25789 ms.
TimeStamp: 1378935850
Browsing or Reading?
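The slide does not say how the summary record is derived from the raw session. One reading that is consistent with the numbers shown: sum the per-page values (assumed here to be dwell times in ms), treat pages below some minimum as browsing, and add up the rest as read time. The sketch below follows that reading. The 1500 ms cut-off and the names `summarise_session` and `ReadSummary` are hypothetical, and the entries elided as `...` on the slide are simply left out, so the total differs slightly from the 25789 ms shown.

```python
from dataclasses import dataclass


@dataclass
class ReadSummary:
    pages: list
    read_time_ms: int
    timestamp: int


def summarise_session(page_visits, timestamp, min_page_ms=1500):
    """Collapse per-page dwell times into an implicit 'was read' signal.

    page_visits maps page number -> list of dwell times in ms (assumed unit).
    Pages whose total dwell time is below min_page_ms count as browsing.
    """
    totals = {page: sum(times) for page, times in page_visits.items()}
    read_pages = sorted(page for page, total in totals.items() if total >= min_page_ms)
    read_time = sum(totals[page] for page in read_pages)
    return ReadSummary(read_pages, read_time, timestamp)


# Values copied from the slide; entries shown as "..." there are omitted.
session_pages = {
    1: [250, 725, 569, 134],
    2: [1056, 1259],
    3: [1056, 1259],
    4: [102, 356, 208, 438],
    5: [102, 356, 208, 438],
    6: [5250, 3567, 809],
    7: [5250, 3567, 809],
}

summary = summarise_session(session_pages, timestamp=1378935850)
print(summary.pages)         # [1, 2, 3, 6, 7]
print(summary.read_time_ms)  # 25560
```

Pages 4 and 5 total 1104 ms each and drop out, which matches the page list on the slide.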
[Figure: Readers × Publications matrix]
![Page 17: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/17.jpg)
Item2Item Matrix
Reader indexed learning
Pages: [1,6,7,10,11]
ReadTime: 11250 ms.
TimeStamp: 1385437850
Decay function: decay per week (time in weeks)
[Figure: publications × publications matrix, filled in from each reader's read times along a Time axis]
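A minimal sketch of how an item-to-item matrix with a weekly decay could be built by looping over readers. The slides do not give the exact decay function, so the exponential form, the factor 0.9 per week, and the names `decay_weight` and `build_item2item` are assumptions for illustration only.

```python
from collections import defaultdict
from itertools import combinations

SECONDS_PER_WEEK = 7 * 24 * 3600


def decay_weight(t_a, t_b, decay_per_week=0.9):
    """Shrink the link between two reads by `decay_per_week` for each week apart."""
    weeks_apart = abs(t_a - t_b) / SECONDS_PER_WEEK
    return decay_per_week ** weeks_apart


def build_item2item(reads_by_reader, decay_per_week=0.9):
    """Accumulate co-read weights, one reader at a time.

    reads_by_reader maps reader -> list of (doc_id, unix_timestamp).
    Returns a symmetric nested dict: matrix[doc_a][doc_b] -> weight.
    """
    matrix = defaultdict(lambda: defaultdict(float))
    for reads in reads_by_reader.values():
        for (doc_a, t_a), (doc_b, t_b) in combinations(reads, 2):
            if doc_a == doc_b:
                continue  # re-reading the same publication adds no link
            weight = decay_weight(t_a, t_b, decay_per_week)
            matrix[doc_a][doc_b] += weight
            matrix[doc_b][doc_a] += weight
    return matrix


# Hypothetical reading histories: two readers, three publications.
reads = {
    "reader-1": [("doc-a", 0), ("doc-b", SECONDS_PER_WEEK)],
    "reader-2": [("doc-a", 0), ("doc-b", 0), ("doc-c", 2 * SECONDS_PER_WEEK)],
}
item2item = build_item2item(reads)
print(round(item2item["doc-a"]["doc-b"], 3))
print(round(item2item["doc-a"]["doc-c"], 3))
```

Publications read close together in time by the same reader end up strongly linked, while reads that are weeks apart contribute progressively less.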
![Page 18: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/18.jpg)
RECOMMENDING
Item2Item Matrix
Item Matrix Weight Mapping Function
Read History
Likes
Stacks
[Figure: a reader's read history over Time, likes and stacks are mapped to weights and combined with the Item2Item Matrix]
![Page 19: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/19.jpg)
RECOMMENDING
Item Matrix Weight Mapping Function
Item Weights
Weighted Sampling
[Figure: the weighted Item2Item rows are summed into Item Weights, from which recommendations are drawn by Weighted Sampling]
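The flow on these two slides (weighted history, Item2Item rows, item weights, weighted sampling) can be sketched as follows. This is an illustration under assumptions, not the production system: the weight mapping function is taken as given (the per-document weights are passed in directly), already-read publications are excluded, and the names `item_weights` and `weighted_sample` are hypothetical.

```python
import random
from collections import defaultdict


def item_weights(item2item, history_weights):
    """Sum the Item2Item rows of the reader's documents, scaled by their weights.

    item2item maps doc -> {other_doc: similarity}.
    history_weights maps doc -> weight produced by some weight mapping function.
    """
    scores = defaultdict(float)
    for doc, weight in history_weights.items():
        for other, similarity in item2item.get(doc, {}).items():
            if other not in history_weights:  # skip what the reader already knows
                scores[other] += weight * similarity
    return dict(scores)


def weighted_sample(scores, k, rng):
    """Draw up to k distinct items, each draw proportional to the remaining weights."""
    pool = {item: score for item, score in scores.items() if score > 0}
    picked = []
    while pool and len(picked) < k:
        items = list(pool)
        choice = rng.choices(items, weights=[pool[i] for i in items])[0]
        picked.append(choice)
        del pool[choice]
    return picked


# Hypothetical Item2Item rows and a reader who has read "a" and "d".
item2item = {
    "a": {"b": 2.0, "c": 1.0},
    "d": {"c": 3.0, "a": 5.0},
}
history = {"a": 1.0, "d": 0.5}

scores = item_weights(item2item, history)
print(scores)  # {'b': 2.0, 'c': 2.5}
print(weighted_sample(scores, k=2, rng=random.Random(0)))
```

Sampling rather than always taking the top-scoring items means lower-weighted publications still surface occasionally, so repeated visits do not show an identical list.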
![Page 20: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/20.jpg)
Max. Rank
![Page 21: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/21.jpg)
Tuned Parameters
![Page 22: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/22.jpg)
Deep Belief Network Model
Bag-of-Words model Training Data
Lars Maaløe
[Figure: network diagram labelled 2000, 500, 20, 2]
Kasper Johansen
Collaborative Filtering Using Social Media Knowledge
Master Student Project
![Page 23: Issuu Talk on Topic Models and Recommendation Systems](https://reader033.vdocuments.us/reader033/viewer/2022042720/568c530a1a28ab4916b922d8/html5/thumbnails/23.jpg)
Morten Arngren
Senior Data Scientist