beetjukes - insight project
A tune for every beat.
@alvinko
Music for the right time
DEMO
DISCOVERY
Finding new things is difficult.*
*For computers at least, anyway.
Songs Users
Songs Users Connections
Liz
Amanda
Jill
We Found Love Rihanna
Hot n Cold Katy Perry
Style Taylor Swift
Songs Users Connections
Liz
Amanda
We Found Love Rihanna
Hot n Cold Katy Perry
Style Taylor Swift
Jill
Songs Users
Amanda
We Found Love Rihanna
Liz
Style Taylor Swift
Songs Users Connections
Amanda
We Found Love Rihanna
Liz
Style Taylor Swift
Who’s Taylor Swift, anyway?
Data Pipeline (and what I learned from building it)
Million Song Dataset (metadata store)
~200GB
User Taste Profile 40m rows, 120k users
<userid>, <songid>, <# times listened>
~5GB .tsv
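A minimal sketch of reading the taste-profile triplets, assuming one tab-separated `<userid>, <songid>, <# times listened>` record per line and no header row. The names `Play` and `read_triplets` and the sample IDs are hypothetical, chosen only for illustration.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class Play(NamedTuple):
    user_id: str
    song_id: str
    play_count: int


def read_triplets(handle: TextIO) -> Iterator[Play]:
    """Yield one Play per tab-separated line, streaming instead of loading the whole file."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) != 3:
            continue  # skip blank or malformed lines
        user_id, song_id, count = row
        yield Play(user_id, song_id, int(count))


# Made-up sample standing in for the real ~5GB file (IDs are illustrative only).
sample = io.StringIO("userA\tsong1\t3\nuserA\tsong2\t1\nuserB\tsong2\t7\n")
plays = list(read_triplets(sample))
```

Streaming row by row keeps memory flat, which matters at this file size; for the real file, `open(path, newline="")` would replace the in-memory sample.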
Userid Songid
A 1
A 2
A 3
B 2
B 3
B 4
C 2
C 5
Userid Songid Userid2
A 2 B
A 2 C
A 3 B
B 2 A
B 2 C
B 3 A
C 2 A
C 2 B
Userid Userid2 Count
A B 2
A C 1
B A 2
B C 1
C A 1
C B 1
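The three tables above amount to a self-join on Songid followed by a group-and-count. Here is a small in-memory sketch of that idea using the same toy data; the function name `co_listen_counts` is hypothetical, and this is not the query the pipeline actually ran.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Toy (userid, songid) rows from the first table.
listens = [
    ("A", 1), ("A", 2), ("A", 3),
    ("B", 2), ("B", 3), ("B", 4),
    ("C", 2), ("C", 5),
]


def co_listen_counts(pairs):
    """Count, for each ordered pair of distinct users, how many songs they share."""
    users_by_song = defaultdict(set)
    for user, song in pairs:
        users_by_song[song].add(user)

    counts = Counter()
    for users in users_by_song.values():
        # Every ordered pair of distinct listeners of one song is one joined row.
        for u1, u2 in permutations(sorted(users), 2):
            counts[(u1, u2)] += 1
    return counts


user_counts = co_listen_counts(listens)
for (u1, u2), n in sorted(user_counts.items()):
    print(u1, u2, n)
```

Grouping by song first avoids comparing every row against every other row, but the number of pairs per song still grows quadratically with its listener count.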
Userid Songid
A 1
A 2
A 3
B 2
B 3
B 4
C 2
C 5
Songid Userid Songid2
1 A 2
1 A 3
2 A 1
2 A 3
2 B 3
2 C 5
3 A 1
3 A 2
3 B 2
5 C 2
Songid Songid2 Count
1 2 1
1 3 1
2 1 1
2 3 2
2 5 1
3 1 1
3 2 2
5 2 1
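The song-to-song table can be sketched the same way, this time joining on Userid. The helper names `co_occurrence_counts` and `top_related` are hypothetical. Note that a full self-join over the toy data also produces pairs involving song 4 (user B listened to songs 2, 3 and 4), which the tables above appear to leave out.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Toy (userid, songid) rows from the table above.
listens = [
    ("A", 1), ("A", 2), ("A", 3),
    ("B", 2), ("B", 3), ("B", 4),
    ("C", 2), ("C", 5),
]


def co_occurrence_counts(pairs):
    """Count, for each ordered pair of distinct songs, how many users played both."""
    songs_by_user = defaultdict(set)
    for user, song in pairs:
        songs_by_user[user].add(song)

    counts = Counter()
    for songs in songs_by_user.values():
        for s1, s2 in permutations(sorted(songs), 2):
            counts[(s1, s2)] += 1
    return counts


def top_related(counts, song, k=3):
    """Return up to k (other_song, count) pairs that co-occur most often with `song`."""
    related = Counter({s2: n for (s1, s2), n in counts.items() if s1 == song})
    return related.most_common(k)


song_counts = co_occurrence_counts(listens)
print(top_related(song_counts, 2))
```

For song 2 the strongest neighbour is song 3, shared by users A and B, which is the kind of signal a "listeners also played" recommendation could start from.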
Over 3B rows in each table
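A rough way to see why these tables get so large: a key shared by n rows contributes n × (n − 1) rows to a self-join that excludes self-pairs. The sketch below computes that size without materializing the join; `self_join_rows` is a hypothetical helper, and the numbers apply only to the toy data, not to the real dataset.

```python
from collections import Counter

# Toy (userid, songid) rows from the tables above.
listens = [
    ("A", 1), ("A", 2), ("A", 3),
    ("B", 2), ("B", 3), ("B", 4),
    ("C", 2), ("C", 5),
]


def self_join_rows(keys):
    """Rows produced by joining a table to itself on `keys`, excluding self-pairs."""
    group_sizes = Counter(keys)
    return sum(n * (n - 1) for n in group_sizes.values())


# Join on songid: each song's listeners are paired with each other.
user_pair_rows = self_join_rows(song for _, song in listens)
# Join on userid: each user's songs are paired with each other.
song_pair_rows = self_join_rows(user for user, _ in listens)
print(user_pair_rows, song_pair_rows)
```

Because the cost is quadratic in group size, a handful of very popular songs or very active users can dominate the output.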
Graph traversal done at a large scale is HARD!
Airpal
Camus
m4.large m4.large m4.large m4.large m4.large
m4.large m4.large m4.large m4.large
m3.medium
m3.medium
Airpal
1. Iterate often
2. Fail fast
3. Learn from mistakes
4. Be courageous (but not stupid)
Alvin Ko
Advanced Analytics, Caesars Entertainment
BSBA Economics, UNLV
#dataviz #analytics #cars #traveling #hiking
[email protected] ALVINKO
Alternative recipes…
Still in incubation
Acquired with no plans to continue development
Airpal
Camus