Post on 20-Aug-2015

Big Data as a Streaming Service

Julie Knibbe, Product Manager – Deezer (@julieknibbe)

Manuel Moussalam, R&D – Deezer

Product Manager

Defines features that meet users' needs

Based on:
• Market research
• Product data analytics
• User feedback
• Competitive analysis
• Creativity

The Leanback Experience Team at Deezer

• Product Manager
• Project Manager
• R&D developers
• Big Data developers
• Web developers (front/back)
• Mobile developers
• QA

Deezer

• Active users: 30M
• Countries: 180+
• Tracks in catalog: 35M
• Artists in catalog: 1M
• Music providers: 1K+

The recommendation problem

No one wants to hear music they don’t like

No one wants to hear the same 200 tracks over and over again

You need to hear a song between 1 and 7 times before you like it

Parameters and variables:
• Mood
• Tastes
• Habits
• Openness
• Sociological profile
• …

Dimensions:
• 35M tracks
• 1M artists
• 30M users

Building a user profile

• Onboarding users
• Monitoring user actions

Deezer – User qualification

User Profile

User Profile – Implicit / Explicit feedback

Adaptation:
• Add new information
• Forget old interests
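One common way to get both adaptation behaviours is to decay every existing weight before adding the new signals. The sketch below is only an illustration of that idea, not Deezer's method: the `update_profile` name, the genre keys, the `decay` factor and the signal strengths are all assumptions.

```python
def update_profile(profile, events, decay=0.9):
    """Return a new profile where old weights fade and new events add weight.

    profile: dict mapping a genre (hypothetical key) to a weight.
    events:  list of (genre, signal) pairs; the signal strength is an
             assumption, e.g. larger for explicit feedback than implicit.
    """
    # Forget old interests: shrink every existing weight.
    updated = {genre: weight * decay for genre, weight in profile.items()}
    # Add new information: each event adds its signal to the matching genre.
    for genre, signal in events:
        updated[genre] = updated.get(genre, 0.0) + signal
    return updated


profile = {"rock": 1.0, "jazz": 0.5}
profile = update_profile(profile, [("electro", 1.0), ("rock", 0.5)])
```

With a decay below 1, an interest that stops receiving signals shrinks at every update, while one that keeps receiving them stays high.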

Music Recommendation

Given a listening profile for user X, what music should we recommend?

Recommendation system – adapting to user types

User types, from riskier recommendations to popular recommendations:
• Savants
• Enthusiasts
• Casuals
• Indifferents

Finding the right mix between novelty, familiarity and relevance

Sources:
• http://alchemi.co.uk/archives/mus/groups_and_beha.html
• http://musicmachinery.com/2014/01/14/the-zero-button-music-player-2/

Use cases

• Playlist / Channel generation
• Discovery
• Personal Search

Deezer features – Flow

Deezer features – Hear This

At Deezer

Mixing collaborative filtering with semi-supervised approaches:
• Curation: Deezer Editors
• Multi-layered graph structure of tracks & artists
• Usage monitoring

Based on Hadoop + ElasticSearch + Spark

Collaborative Filtering: Matching

Collaborative filtering: "User X listened to the Rolling Stones. Users who listen to the Rolling Stones usually also listen to the Who, so let's suggest the Who to user X."

Popularized by the Netflix Prize
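The Rolling Stones / the Who reasoning can be sketched as a simple co-occurrence count. This is a toy illustration rather than the production system: the `recommend` function and the `listens` data are made up for the example.

```python
from collections import Counter


def recommend(target_user, listens, top_n=1):
    """Suggest artists heard by users who share at least one artist with the target."""
    seen = listens[target_user]
    scores = Counter()
    for user, artists in listens.items():
        # Skip the target and any user with no artist in common.
        if user == target_user or not (artists & seen):
            continue
        # Each overlapping user votes for the artists the target has not heard.
        for artist in artists - seen:
            scores[artist] += 1
    return [artist for artist, _ in scores.most_common(top_n)]


# Hypothetical listening histories: user -> set of artists.
listens = {
    "x": {"The Rolling Stones"},
    "a": {"The Rolling Stones", "The Who"},
    "b": {"The Rolling Stones", "The Who", "Stromae"},
    "c": {"Stromae"},
}
suggestions = recommend("x", listens)
```

Here two Rolling Stones listeners vote for the Who and only one for Stromae, so the Who is ranked first for user "x".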

Collaborative Filtering

Compute similarity between users, between items, or both
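As an illustration, the same similarity function can be applied to the rows of a play-count matrix (users) or to its columns (items). Cosine similarity is used here as one common choice; the slides do not say which measure Deezer uses, and the `plays` matrix is invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical play counts: rows are users, columns are items.
plays = [
    [5, 3, 0],
    [4, 2, 0],
    [0, 0, 6],
]

# User-user similarity compares rows.
user_sim = cosine(plays[0], plays[1])
# Item-item similarity compares columns (the transposed matrix).
columns = list(zip(*plays))
item_sim = cosine(columns[0], columns[1])
```

Users 0 and 1 play the same items in similar proportions, so their similarity is close to 1, while user 2 shares nothing with them and scores 0.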

Real data

Collaborative filtering: Exemplar based

Association rules:
• Market basket analysis
• Apriori algorithm
• …

But:
• Scalability issues
• Hub and island issues (Stromae example)
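For a single rule "listeners of A also listen to B", market basket analysis boils down to two numbers: support and confidence. The sketch below computes them for one rule over invented baskets; it does not implement the full Apriori candidate search, and `rule_stats` is a hypothetical helper.

```python
def rule_stats(baskets, antecedent, consequent):
    """Return (support, confidence) for the rule antecedent -> consequent."""
    n = len(baskets)
    with_a = sum(1 for b in baskets if antecedent in b)
    with_both = sum(1 for b in baskets if antecedent in b and consequent in b)
    # Support: share of all baskets that contain both artists.
    support = with_both / n
    # Confidence: share of the antecedent's baskets that also contain the consequent.
    confidence = with_both / with_a if with_a else 0.0
    return support, confidence


# Hypothetical baskets: the artists each user listened to.
baskets = [
    {"The Rolling Stones", "The Who"},
    {"The Rolling Stones", "The Who", "Stromae"},
    {"The Rolling Stones", "Stromae"},
    {"Stromae", "Daft Punk"},
]
support, confidence = rule_stats(baskets, "The Rolling Stones", "The Who")
```

Counting every pair this way grows quickly with catalog size, and an artist who appears in most baskets ends up as the consequent of many rules, which may be what the scalability and hub issues above refer to.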

Collaborative filtering: Model based

Matrix Factorization

A (n × m) ≈ U (n × k) × I (k × m)

• U is a low-dimensional model of the users
• I is a low-dimensional model of the items
• Recommended items are the missing entries of A
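A minimal sketch of matrix factorization, assuming stochastic gradient descent on the observed entries only. The slides do not say which solver Deezer uses, and the ratings, the hyperparameters and the `factorize`, `predict` and `mse` names are all illustrative.

```python
import random


def factorize(ratings, n_users, n_items, k=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Learn user factors U (n_users x k) and item factors I (n_items x k)."""
    rng = random.Random(seed)
    U = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_users)]
    I = [[rng.uniform(0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(steps):
        for (u, i), r in ratings.items():
            err = r - sum(U[u][f] * I[i][f] for f in range(k))
            for f in range(k):
                uf, itf = U[u][f], I[i][f]
                # Gradient step on the squared error with L2 regularization.
                U[u][f] += lr * (err * itf - reg * uf)
                I[i][f] += lr * (err * uf - reg * itf)
    return U, I


def predict(U, I, u, i):
    """Predicted score: dot product of the user's and the item's factors."""
    return sum(a * b for a, b in zip(U[u], I[i]))


def mse(U, I, ratings):
    """Mean squared error over the observed entries."""
    return sum((r - predict(U, I, u, i)) ** 2 for (u, i), r in ratings.items()) / len(ratings)


# Hypothetical observed entries of A: (user, item) -> rating. Four entries are missing.
ratings = {
    (0, 0): 5, (0, 1): 4, (0, 2): 1,
    (1, 0): 4, (1, 1): 5, (1, 3): 1,
    (2, 0): 1, (2, 2): 5, (2, 3): 4,
    (3, 1): 1, (3, 2): 4, (3, 3): 5,
}
U, I = factorize(ratings, n_users=4, n_items=4)
error = mse(U, I, ratings)
# Scores for the missing entries are the candidate recommendations.
missing = {(u, i): predict(U, I, u, i)
           for u in range(4) for i in range(4) if (u, i) not in ratings}
```

The item factors are stored here one row per item, so the product in the formula above corresponds to U times the transpose of this `I`.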

Collaborative Filtering: Limitations

• Cold start problem
• Sparse user-item matrix (1% coverage)
• Only based on social behaviors
• Popularity bias ("The rich get richer")

Content-based filtering: Music items representation

Content-based filtering: Limitations

• Cold start problem
• Users with atypical tastes
• Lack of novelty
• Subjectivity not taken into account

Content Similarity

Clustering tracks, artists, albums…

Methods:
• Matrix factorization techniques
• Spectral clustering
• Musical feature extraction
• Louvain algorithm
• …

Example: Multiple Spectral Clustering

Cleaning

• Mislabeled data: different sources tell different things about songs, artists, albums
• No universally adopted music ontology
• Subjectivity
• Outlier detection: cross-checking several sources and models

Cleaning: Example

In real life…

A/B Testing

Algorithms A/B Testing

Deezer users are split into two groups:
• Algo A
• Algo B

Observe results:
• Daily Active Users
• Streams per user
• Satisfaction
• …
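A hedged sketch of how such a test could be wired: each user is assigned to Algo A or Algo B by hashing the user ID, then a metric is averaged per group. The experiment name, the 50/50 split and the function names are assumptions, not details from the talk.

```python
import hashlib
from statistics import mean


def assign_variant(user_id, experiment="reco-test"):
    """Deterministically map a user to 'A' or 'B' (hypothetical 50/50 split)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def streams_per_user(stream_counts):
    """Average streams per user in each variant; stream_counts maps user_id -> streams."""
    by_variant = {"A": [], "B": []}
    for user_id, streams in stream_counts.items():
        by_variant[assign_variant(user_id)].append(streams)
    return {variant: mean(counts) if counts else 0.0
            for variant, counts in by_variant.items()}


# Hypothetical data: every user streamed 10 tracks, so both averages are equal.
stream_counts = {user_id: 10 for user_id in range(1000)}
averages = streams_per_user(stream_counts)
```

Hashing keeps a user in the same group across sessions without storing the assignment. A real test would also check that the difference between groups is statistically significant before picking a winner.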

Algorithms A/B Testing: Example

Test: Are new users (with no profile data) more satisfied with chart items or with new items?

Thanks!