LYRIC-BASED ARTIST NETWORK Derek Gossi CS 765 Fall 2014

Page 1:

LYRIC-BASED ARTIST NETWORK

Derek Gossi

CS 765

Fall 2014

Page 3:

The Big Problem

How do we make better music recommendations?

Personalized recommendations

Anonymous recommendations based on similarity

Playlist generation

Page 4:

The Big Problem

How do we make better music recommendations?

Ideally: Understand all the factors which link songs or artists together

Page 5:

Topics

• Background on Music Recommendation

• The Dataset

• Existing Research

• Proposed Research

Page 6:

BACKGROUND ON MUSIC RECOMMENDATION

Page 7:

Music Recommendation Systems

Page 8:

Approaches to Recommendation

• Collaborative Filtering
  • Users that liked this artist/song also liked that artist/song
  • Amazon, iTunes store, Spotify
• Tagging
  • Categorization based on user-generated or pre-defined tags
    • Calm, sad, romantic, cheerful, anxious, depressed
  • Last.fm
• Content-based
  • Look at the audio signal
  • Not widely used in industry yet
  • Pandora, Spotify (in progress)
  • What can the lyrics tell us?
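
The "users that liked this also liked that" idea can be sketched as item-to-item similarity over shared listeners. This is only a minimal illustration under stated assumptions: it is not how Amazon, iTunes, or Spotify implement collaborative filtering, the function names `item_similarities` and `also_liked` are hypothetical, and the toy listening data is invented.

```python
from collections import defaultdict
from math import sqrt


def item_similarities(likes):
    """Cosine similarity between artists, based on which users liked both.

    likes: dict mapping a user id to the set of artists that user liked.
    """
    # Invert the data: artist -> set of users who liked that artist.
    listeners = defaultdict(set)
    for user, artists in likes.items():
        for artist in artists:
            listeners[artist].add(user)

    sims = defaultdict(dict)
    names = sorted(listeners)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            shared = len(listeners[a] & listeners[b])
            if shared:
                score = shared / sqrt(len(listeners[a]) * len(listeners[b]))
                sims[a][b] = score
                sims[b][a] = score
    return sims


def also_liked(artist, sims, k=3):
    """Return up to k artists most similar to `artist` (ties broken by name)."""
    ranked = sorted(sims.get(artist, {}).items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Invented toy data: four users, four artists.
likes = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"B", "C"},
    "u4": {"A", "D"},
}
sims = item_similarities(likes)
print(also_liked("A", sims))
```

Note that nothing about the music itself enters the computation, which is the gap the content-based approaches try to fill.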

Page 9:

Approaches to Recommendation

Page 10:

The Problem with Tags

Page 11:

Care vs. Scale

B Whitman, Co-Founder of The Echo Nest, “How music recommendation works—and doesn’t work”

Page 13:

Comparison of Approaches

• Collaborative filtering is widely used in practice
• Precision vs. Profit
  • Even though you might like x better, Amazon makes more money by recommending y
  • Probably less of an issue for subscription services such as Spotify
• Existing recommendation systems largely do not take the content of the music into account
  • Why?
    • Possibility for large error
    • Computational cost
    • Still being researched

Page 14:

MIR (Music Information Retrieval)

• Emerging area of research
• Gathering information directly from the audio signal
• Success in determining tempo, key, and loudness
• Research in time signature tracking, melody detection

Page 15:

MIR (Music Information Retrieval)

• What about trying to predict location in a reduced-dimension latent space of users and songs using audio features?
  • Deep learning methodologies

Page 17:

The Question

• Can lyrics be used to improve recommender systems?
• Benefits of lyrical analysis approach
  • Known factors make for easy error checking
  • Large-scale factors such as repetition or key words are easy to compute
  • Nearly as scalable as pure audio analysis for most popular genres
• Disadvantages of lyrical analysis approach
  • Not all songs have lyrics!
  • Text analysis is a subtle and complex problem too
  • Audio + lyrics make for new interpretations
• Reducing to artist level will “average out” some error
• A combined approach will likely be the best approach
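
To illustrate why large-scale factors such as repetition and key words are cheap to compute, here is a minimal sketch. The particular features (type-token ratio, share of repeated lines, key word counts) and the name `lyric_features` are assumptions chosen for illustration, not features taken from the slides, and the sample lyric is invented.

```python
import re
from collections import Counter


def lyric_features(lyrics, key_words):
    """Compute a few simple, easily checked features from raw lyric text."""
    tokens = re.findall(r"[a-z']+", lyrics.lower())
    counts = Counter(tokens)

    # Non-empty lines, normalised so identical lines compare equal.
    lines = [ln.strip().lower() for ln in lyrics.splitlines() if ln.strip()]
    line_counts = Counter(lines)
    repeated = sum(c for c in line_counts.values() if c > 1)

    return {
        # Distinct words divided by total words: lower means more repetition.
        "type_token_ratio": len(counts) / len(tokens) if tokens else 0.0,
        # Fraction of lines that occur more than once in the song.
        "repeated_line_share": repeated / len(lines) if lines else 0.0,
        # Raw count of each key word of interest.
        "key_word_hits": {w: counts[w] for w in key_words},
    }


# Invented three-line lyric, used only to exercise the function.
sample = """la la love
la la love
you are gone
"""
features = lyric_features(sample, ["love", "gone", "rain"])
print(features)
```

Because each feature is a plain count or ratio, a wrong value can be checked by hand against the lyric sheet, which is the "easy error checking" benefit listed above.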

Page 18:

Care vs. Scale

B Whitman, Co-Founder of The Echo Nest, “How music recommendation works—and doesn’t work”

Page 19:

Care vs. Scale

Lyrical analysis

B Whitman, Co-Founder of The Echo Nest, “How music recommendation works—and doesn’t work”

Page 20:

Care vs. Scale

Lyrical analysis

Lyrical analysis + audio analysis + CF

B Whitman, Co-Founder of The Echo Nest, “How music recommendation works—and doesn’t work”

Page 21:

THE DATASET

The Million Song Dataset (MSD)

Page 22:

Million Song Dataset

• Open source dataset released in Feb 2011
• Metadata and audio features for a million contemporary audio tracks

Page 23:

The Million Song Dataset Challenge

• Online competition
• Given full listening history for 1 million users
• Given half of the listening history for 110,000 users
• Goal: predict the other half of the listening history
• Metric: mean average precision
• Best ranked teams used some form of collaborative filtering
• See F. Aiolli, “A Preliminary Study on a Recommender System for the Million Song Dataset Challenge”
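
The mean average precision metric can be sketched as follows. This is a generic version under stated assumptions: the optional cutoff `k`, the normalisation by the smaller of the number of held-out songs and the list length, and the function names are illustrative choices rather than the challenge's official scoring code. The song ids are invented, and each ranked list is assumed to contain no duplicates.

```python
def average_precision(ranked, relevant, k=None):
    """Average precision of one ranked recommendation list.

    ranked:   recommended song ids, best first (assumed free of duplicates).
    relevant: set of song ids in the user's held-out listening history.
    k:        optional cutoff; only the top k recommendations are scored.
    """
    if k is not None:
        ranked = ranked[:k]
    hits = 0
    score = 0.0
    for rank, song in enumerate(ranked, start=1):
        if song in relevant:
            hits += 1
            score += hits / rank  # precision at this rank
    denom = min(len(relevant), len(ranked))
    return score / denom if denom else 0.0


def mean_average_precision(recommended, held_out, k=None):
    """Mean of average_precision over every user in held_out."""
    users = list(held_out)
    if not users:
        return 0.0
    total = sum(
        average_precision(recommended.get(u, []), held_out[u], k) for u in users
    )
    return total / len(users)


# Invented example: two users with short recommendation lists.
recommended = {"u1": ["s1", "s2", "s3", "s4"], "u2": ["s5", "s6"]}
held_out = {"u1": {"s1", "s3"}, "u2": {"s6"}}
print(mean_average_precision(recommended, held_out))
```

The metric rewards putting held-out songs near the top of the list, so two systems that recover the same songs can still score differently.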

Page 24:

The Million Song Dataset Challenge

Page 25:

EXISTING RESEARCH

A Summary

Page 26:

Network Topology

• P. Cano, O. Celma, and M. Koppenberger. “The topology of music recommendation networks,” Feb 2008.
• Analyzes four music recommendation systems from a network perspective
• Directed edges
• n = 16,302 (Yahoo) to 51,616 (MSN)
• m = 158,866 (AMG) to 511,539 (Yahoo)
• Small-world properties in all networks
  • Average shortest path < 8
  • Clustering coefficient from 0.14 (Amazon) to 0.54 (MSN)
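
The two small-world indicators quoted above, average shortest path and clustering coefficient, can be computed on a toy graph as sketched below. As a simplifying assumption the graph is treated as undirected and connected, whereas the networks in the paper have directed edges; the helper names and the four-node example are hypothetical.

```python
from collections import deque


def undirected_adjacency(edges):
    """Build a node -> set-of-neighbours map from (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj):
    """Average local clustering coefficient; nodes with degree < 2 count as 0."""
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        nbr_list = sorted(nbrs)
        # Count edges that exist between pairs of this node's neighbours.
        links = sum(
            1
            for i, u in enumerate(nbr_list)
            for v in nbr_list[i + 1:]
            if v in adj[u]
        )
        total += 2 * links / (k * (k - 1))
    return total / len(adj) if adj else 0.0


def average_shortest_path(adj):
    """Mean hop count over all reachable ordered pairs, via BFS from each node."""
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# Invented toy graph: a triangle A-B-C with a pendant node D attached to C.
adj = undirected_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
print(clustering_coefficient(adj), average_shortest_path(adj))
```

A network is usually called small-world when it combines short average paths with clustering well above that of a random graph of the same size.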

Page 29:

Lyrical Analysis

• X. Hu, J. S. Downie, and A. F. Ehmann. “Lyric text mining in music mood classification,” 2009.
• 2,829 unique audio tracks from Last.fm with lyrics and tags
• Tags grouped into 18 distinct categories
  • calm, comfort, quiet, serene, mellow, chill out, …
  • grief, heartbreak, mournful, sorrow, sorry, …
• Objective: predict tag category
• Lyrical model, audio feature model, and combined model
• Lyrical features were found to outperform audio in some cases

Page 30:

Lyrical Analysis

• Y. Xia, K. Wong, L. Wang, and M. Xu. “Sentiment vector space model for lyric-based song sentiment classification,” June 2008.
• Custom sentiment vector space model (s-VSM) used to classify 2,653 Chinese pop songs
• Only two classes: light-hearted and heavy-hearted
• Lyrics found to outperform audio features in the classification problem

Page 31:

PROPOSED RESEARCH

Page 32:

Proposed Research

• Use the MSD to create a network of songs and artists linked by threshold lyrical similarity
• Metric of similarity will be based on:
  • Use of key words or key word groups
  • Word complexity and range of words used
  • Sentiment
• A random sample will need to be used, as mapping the full dataset would require ~750,000² iterations
• Cluster the network into n distinct “communities”
  • Unsupervised approach
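
A thresholded similarity network of this kind could be built roughly as sketched below. The sketch assumes plain bag-of-words cosine similarity as a stand-in for the proposed metric (which would also draw on word complexity and sentiment); the names `lyric_network` and `pair_count`, the 0.5 threshold, and the three toy "artists" are hypothetical.

```python
import re
from collections import Counter
from itertools import combinations
from math import sqrt


def bag_of_words(text):
    """Word-count vector for one artist's combined lyrics."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def lyric_network(lyrics_by_artist, threshold):
    """Return (artist, artist, similarity) edges with similarity >= threshold."""
    vectors = {name: bag_of_words(text) for name, text in lyrics_by_artist.items()}
    edges = []
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score >= threshold:
            edges.append((a, b, score))
    return edges


def pair_count(n):
    """Number of unordered pairs that an all-pairs comparison must visit."""
    return n * (n - 1) // 2


# Invented toy lyrics for three hypothetical artists.
lyrics_by_artist = {
    "artist_x": "love love heart",
    "artist_y": "love heart heart",
    "artist_z": "road truck dust",
}
edges = lyric_network(lyrics_by_artist, threshold=0.5)
print(edges)
print(pair_count(750_000))
```

The pair count shows why sampling is needed: comparing every unordered pair among 750,000 items takes about 2.8 × 10^11 similarity computations, roughly half of the ~750,000² ordered-pair figure quoted above.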

Page 33:

Research Questions

• Network properties?
  • Scale, clustering, etc.
• What are the most natural communities?
  • Genre, mood, complexity?
• How does it compare to existing models?
  • How much error is introduced by using lyrics only?
  • How does the network topology of artists linked by lyrical similarity compare to existing user-based collaborative filtering networks?
• Can it be used to improve music recommendation?

Page 34:

QUESTIONS?