Influence and Similarity between Contemporary Jazz Artists, plus Six Degrees of Kind of Blue

Gabriele Giaquinto, Cora Bledsoe, and Brian McGuirk University of Michigan

1 INTRODUCTION

There is a large segment of modern jazz that does not fit any historical style and is very

difficult to categorize. With this project we use exploratory network analysis to understand which

artists have most influenced contemporary jazz artists, how contemporary jazz artists can be

categorized, which artists are central in contemporary jazz, and whether centrality is correlated with

commercial success.

The rest of the paper is organized as follows. Section 2 describes the data set and how we

collected it. Section 3 provides an exploratory network analysis for the similarity network.

Section 4 provides an exploratory network analysis for the influence network. Section 5 provides

a network analysis of a 1959 jazz recording collaboration network. Section 6 discusses related

work. In Section 7, we present our conclusion.

2 DATA SET

We have collected data from All Music Guide [1], a music recommendation network [4]. The

set of artists, their styles, and their connections are created by music experts. Therefore, the data we

collected is subjective and does not represent objective data, such as collaborations between

artists (as done in [2] and [6]) or relationships between songwriters and singers (as done in [5]).

We defined a jazz artist to be contemporary if he or she is currently active and started his or her career no earlier than the 1990s. There are, of course, several jazz artists who started their careers earlier and who have contributed greatly to modern jazz. However, we wanted to study only young jazz artists in this network and wanted the networks to be manageable in size.

To create the contemporary era networks we have developed a software tool that reads a set

of contemporary jazz artists’ pages, creates two Pajek networks, and generates a list of artists that

require further research. We started with the list of most important jazz artists, as provided by All

Music Guide, and performed several iterations.
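The internals of the tool are not described here. As a minimal sketch of its output step only, the following Python function writes a node list and a link list in the Pajek .net text format. The function name, the artist names, and the links are hypothetical and are not taken from our data set.

```python
def to_pajek(labels, links, directed=False):
    """Return the text of a Pajek .net file.

    labels: list of vertex names; links: list of (source, target) name pairs.
    """
    index = {name: i + 1 for i, name in enumerate(labels)}  # Pajek ids start at 1
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{index[name]} "{name}"' for name in labels]
    # Undirected links go under *Edges, directed links under *Arcs.
    lines.append("*Arcs" if directed else "*Edges")
    lines += [f"{index[a]} {index[b]}" for a, b in links]
    return "\n".join(lines) + "\n"


# Toy data, for illustration only.
artists = ["Artist A", "Artist B", "Artist C"]
similar = [("Artist A", "Artist B"), ("Artist B", "Artist C")]
similarity_net = to_pajek(artists, similar)  # undirected similarity network
influence_net = to_pajek(artists, [("Artist A", "Artist C")], directed=True)
print(similarity_net)
```

The similarity network would use the undirected form and the influence network the directed form, which matches the edge and arc terminology used in the rest of the paper.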

To address the contributions of older jazz musicians we developed a network of 1959 jazz recording collaborations using the personnel playing on the five albums that All About Jazz defines as “the back bone of a great jazz collection” [8]. We used the musicians playing on the following albums:

• Miles Davis, Kind of Blue

• John Coltrane, Giant Steps

• Dave Brubeck Quartet, Time Out

• Charles Mingus, Mingus Ah Um

• Ornette Coleman, The Shape of Jazz to Come

We then developed a collaboration network using additional data from the All Music Guide.


2.1 Similarities Network

Each node represents a contemporary jazz artist and an edge represents a similarity between two artists, as defined by All Music Guide. The network has 216 nodes and 258 edges. The average degree is 2.38 and the average shortest path length is 8.23.
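As a quick check of these statistics, the sketch below computes the average degree of an undirected network as 2E/N, which gives about 2.39 for 216 nodes and 258 edges (reported above as 2.38). It also shows one way to compute an average shortest path length with breadth-first search. Averaging only over reachable pairs is an assumption, since the network has isolated nodes and the treatment of disconnected pairs is not stated; the toy graph is hypothetical.

```python
from collections import deque


def average_degree(num_nodes, num_edges):
    # Each undirected edge contributes to the degree of two nodes.
    return 2 * num_edges / num_nodes


def average_shortest_path(adj):
    """Mean BFS distance over ordered pairs of distinct, mutually reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # nodes reached from source, excluding itself
    return total / pairs if pairs else 0.0


print(average_degree(216, 258))  # about 2.39

# Toy path graph a - b - c - d, for illustration only.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
print(average_shortest_path(path))
```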

2.2 Influences Network

Each node represents a jazz artist and an arc represents the musical influence of one artist on another, as defined by All Music Guide. The network consists of two groups: group 1, the contemporary jazz artists, and group 2, the earlier artists who influence them. It has 418 nodes and 335 arcs.

2.3 1959 Collaboration Network

Each node represents a jazz artist who either played on one of the five “influential” albums listed above or recorded with an artist who did. Edges represent artists having played together on an album. The network has 196 nodes and 1591 edges.
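The edge rule can be sketched as follows: every pair of musicians listed on the same album becomes one undirected edge. The album titles and personnel below are hypothetical placeholders, not the actual personnel of the seed albums.

```python
from itertools import combinations


def collaboration_edges(albums):
    """Map album -> personnel list to a set of undirected musician pairs."""
    edges = set()
    for personnel in albums.values():
        # Sorting makes (a, b) and (b, a) the same tuple.
        for a, b in combinations(sorted(set(personnel)), 2):
            edges.add((a, b))
    return edges


# Hypothetical personnel lists, for illustration only.
albums = {
    "Album X": ["bassist_1", "drummer_1", "pianist_1"],
    "Album Y": ["bassist_1", "saxophonist_1"],
}
edges = collaboration_edges(albums)
nodes = {musician for pair in edges for musician in pair}
print(len(nodes), len(edges))
```

A musician who appears on both albums, like the bassist in this toy example, is what links the two groups of personnel together.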

3 SIMILARITIES NETWORK ANALYSIS

We imported the network into GUESS and visualized it using the Bin Pack layout, as shown in

Figure 1.

Figure 1 – Network of similarities among contemporary jazz artists

3.1 Centrality

We measured degree centrality, closeness centrality, and betweenness centrality. Figure 2

visualizes degree centrality; the size of a vertex is proportional to its degree and the vertices

colored in blue are the ones with highest degree centrality. Table 1 shows the 5 artists with the

highest degree centrality and the 5 artists with the highest closeness centrality. Table 2 shows the

5 artists with the highest betweenness centrality.
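For readers unfamiliar with the three measures, the following is a minimal pure-Python sketch on a toy star graph. The normalizations are common textbook choices and are assumptions: the tool used for the tables may scale closeness and betweenness differently, so the numbers are not directly comparable to Tables 1 and 2.

```python
from collections import deque


def bfs_distances(adj, source):
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def degree_centrality(adj):
    return {v: len(neighbors) for v, neighbors in adj.items()}


def closeness_centrality(adj):
    # (number of reachable nodes) / (sum of distances to them); 0 if isolated.
    result = {}
    for v in adj:
        dist = bfs_distances(adj, v)
        total = sum(dist.values())
        result[v] = (len(dist) - 1) / total if total else 0.0
    return result


def betweenness_centrality(adj):
    """Brandes' algorithm for an unweighted, undirected graph, scaled to [0, 1]."""
    score = dict.fromkeys(adj, 0.0)
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):  # farthest nodes first
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    n = len(adj)
    scale = (n - 1) * (n - 2) if n > 2 else 1  # each unordered pair is counted twice
    return {v: b / scale for v, b in score.items()}


# Toy star graph, for illustration only.
star = {"hub": ["a", "b", "c"], "a": ["hub"], "b": ["hub"], "c": ["hub"]}
degree = degree_centrality(star)
closeness = closeness_centrality(star)
betweenness = betweenness_centrality(star)
print(degree["hub"], closeness["a"], betweenness["hub"])
```

In the star, the hub scores highest on all three measures; in a real network the rankings can diverge, as Tables 1 and 2 show.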


Figure 2 – Degree centrality in the similarities network

Artist                    Degree    Closeness    Artist
Diana Krall               13        0.1209       David Sanchez
Bobby Sanabria            10        0.1188       Danilo Perez
Danilo Perez              10        0.1183       Leon Parker
David Sanchez             10        0.1155       Greg Tardy
Medeski, Martin & Wood    10        0.1128       The Bad Plus

Table 1: Degree and Closeness Centrality

Artist Betweenness

Medeski, Martin & Wood 0.198

Leon Parker 0.180

David Sanchez 0.173

Greg Tardy 0.152

Jim Black 0.145

Table 2: Betweenness Centrality

Four artists have high centrality in more than one metric: David Sanchez, Danilo Perez, Leon Parker, and Greg Tardy; Figure 3 shows them in white. Sanchez and Perez belong to the Latin Jazz community (see Section 3.2 on community structure) and are similar, respectively, to Parker (Contemporary Jazz community) and Tardy (Post Bop community). Being part of a well-connected community and being able to cross the boundaries between musical styles seem to be a good way of being central in modern jazz.


Figure 3 – Some artists that have high centrality in more than one metric

We tried to correlate centrality with the number of albums sold by an artist. To estimate the number of albums sold, we collected the number of gold and platinum albums awarded to each artist by the Recording Industry Association of America [7]. A gold album certifies the sale of 500,000 albums; a platinum album certifies the sale of 1,000,000 albums. The first interesting result of this research is that only a handful of contemporary jazz artists have been awarded gold albums and only superstar Diana Krall has been awarded platinum albums. Table 3 shows the number of albums sold by contemporary jazz artists. The second observation is that there is no correlation between centrality and number of albums sold: being central has little to do with making money in modern jazz. The only exception is Diana Krall, who has both the highest degree centrality and the highest number of albums sold. We think her case can be explained in two ways. First, when an artist gains commercial success, music labels tend to favor similar artists, perhaps because that success indicates a style favored by the general public. Second, commercial success inspires many artists to follow the same “stylistic” road (there is a fine line between emulation and inspiration here).

Artist             Albums Sold (in millions)
Diana Krall        5
Boney James        1.5
Fourplay           1.5
Chris Botti        1
Candy Dulfer       0.5
Rachelle Ferrell   0.5

Table 3: Number of albums sold

3.2 Community structure

We used the Girvan-Newman betweenness clustering algorithm to find communities of

similar contemporary jazz artists. We immediately notice that there are several artists that are not

similar to any other artist. There are also some very small communities that are isolated from the

rest of the network. The isolated nodes and the small isolated components are immediately

identified by the algorithm (Figure 4). The interesting part is how the algorithm finds communities in the giant component. As a stopping criterion we inspected the network visually and stopped removing edges when the resulting communities looked coherent. Here is a list of the components found by the algorithm within the giant component (Figure 5):

• Vocal Jazz (light green component top-left corner): it seems to be centered around Diana

Krall

• Jazz influenced by other genres like Rock, Funk, and Pop (brown component): this

component has low density

• A Contemporary Jazz component (green component middle-left side)

• Smooth Jazz (light blue component in the middle): pretty small and with low density

• Latin Jazz (light violet middle-left side): this group seems to have high density

• Post Bop (purple component left-bottom)

• Another Post Bop component (light green component bottom-left side): practically attached to the previous component. Together, these two components are fairly large, and they should really form a single component; the algorithm did not work very well in this case

• A group that is difficult to identify (purple component middle-bottom)

• Avant-Garde Jazz (violet component middle-right): this seems to be a large component, but it

is not very dense

Figure 4 – Similarity network with zero edges removed

Figure 5 – Community structure in similarity network


4 INFLUENCES NETWORK ANALYSIS

At first glance, even after being laid out with the Fruchterman-Reingold or Kamada-Kawai algorithms, the Influence Network appears roughly circular with many isolated 0-degree nodes. None of the group 2 artists have positive indegree, because only their outdegree is recorded: it represents the influence they have on more recent artists.

4.1 Community Structure

To uncover the community structure, we use the Girvan-Newman algorithm in GUESS, which

removes edges in order of descending betweenness.
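The edge-removal idea can be sketched in pure Python as follows. This is an illustrative reimplementation, not the GUESS code: it recomputes edge betweenness after every removal and stops as soon as one more component appears, whereas our stopping point in the paper was chosen by visual inspection. The toy graph of two triangles joined by a bridge is hypothetical.

```python
from collections import deque


def components(adj):
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        comps.append(comp)
    return comps


def edge_betweenness(adj):
    """Brandes-style accumulation of shortest-path counts on edges."""
    score = {}
    for s in adj:
        order, preds = [], {v: [] for v in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for u in preds[w]:
                flow = sigma[u] / sigma[w] * (1 + delta[w])
                edge = frozenset((u, w))
                score[edge] = score.get(edge, 0.0) + flow
                delta[u] += flow
    return score


def girvan_newman_step(adj):
    """Remove highest-betweenness edges until the component count increases."""
    adj = {v: set(nbrs) for v, nbrs in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        score = edge_betweenness(adj)
        if not score:  # no edges left to remove
            break
        u, v = max(score, key=score.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)


# Toy graph: two triangles joined by the bridge c - d.
toy = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
communities = girvan_newman_step(toy)
print(sorted(sorted(c) for c in communities))
```

In the toy graph the bridge carries every shortest path between the two triangles, so it is removed first and the triangles separate. A network without such bridging edges gives the algorithm little to work with.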

Figures 6-10 – Even after several iterations of removing edges, the Influence Network shows no real sign of community structure

Unlike the Similarity Network, the Influence Network shows little to no community structure. Rather than several smaller sub-communities, there appears to be one large community, along with many 0-degree nodes that are not connected to any others.

In order to determine some sort of structure for the Influence Network, we also performed a hierarchical clustering. This was done using Pajek, exporting the result as a dendrogram.

Figure 11 – Dendrogram showing hierarchical clustering in the Influence Network

Even without reading the names of all the jazz artists in the network, we can see that somewhat more structure takes shape after a few iterations of hierarchical clustering than with the Girvan-Newman betweenness algorithm.

4.2 Prestige and PageRank

By running the PageRank algorithm responsible for Google’s search capabilities, we can tell

which artists carry prestige within the network.


Figure 12 – PageRank algorithm shows prestige in the Influence Network

In link analysis, where PageRank is normally used, a webpage has high PageRank if it has some combination of many in-links, in-links from pages with few out-links of their own, and in-links from other high-ranking pages. In the world of jazz artists, the artists with high PageRank, such as Gretchen Parlato, Diego Rivera, and Jorge Pescara, have most likely been influenced by many people, by a few very important people, or by some combination of the two.
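A minimal power-iteration sketch of PageRank on a toy influence network illustrates this reading. Arcs point from influencer to influenced, as in Section 2.2. The damping factor of 0.85 is the conventional default and is an assumption, as is the even redistribution of rank from nodes without outgoing arcs; the node names are hypothetical.

```python
def pagerank(arcs, damping=0.85, iterations=100):
    """Power iteration; arcs are (influencer, influenced) pairs."""
    nodes = sorted({n for arc in arcs for n in arc})
    out = {n: [] for n in nodes}
    for src, dst in arcs:
        out[src].append(dst)
    rank = {n: 1 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing arcs is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1 - damping + damping * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank


# Toy influence network, for illustration only.
arcs = [("elder_1", "young"), ("elder_2", "young"), ("elder_1", "other")]
rank = pagerank(arcs)
print(max(rank, key=rank.get))
```

The artist with two influencers ranks highest, and the two elders, who have no incoming arcs, rank lowest. This is why the high-PageRank artists in our network are contemporary artists rather than the earlier artists who influenced them.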

4.3 Motifs

We wanted to find the most common patterns of connection within the Influence Network, so we performed a motif analysis. Using FANMOD, a tool for fast network motif detection (http://www.minet.uni-jena.de/~wernicke/motifs/index.html), we found a single recurring motif. To test its significance, we constructed 1000 random networks with the same number of nodes and arcs as the original Influence Network.

Adjacency Matrix   Frequency (original)   Mean-Frequency (random)   Std. Dev. (random)   p-value
(not shown)        72.917%                72.82%                    0.00047829           0.019

Figure 13 – FANMOD output showing the single motif and its frequencies.

The FANMOD output describes the single recurring 3-node motif as one jazz artist with two influencers who do not influence each other in any way.


• The motif occurred with a frequency of 72.92% in the original network

• On average, the motif occurred with a frequency of 72.82% in the random networks

• The standard deviation of the frequency across the random networks is very close to zero, at 0.00047829

• The p-value of a motif is the number of random networks in which it occurred more often than in the original network, divided by the total number of random networks. Here, we have a p-value of 0.019, which is close to 0, making our motif fairly significant.
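The p-value definition above can be written out directly, as in the sketch below. The z-score is not part of the reported output; it is computed here from the reported frequencies as an additional sanity check. The list of random-network frequencies is synthetic, constructed only so that 19 of 1000 values exceed the original frequency.

```python
def empirical_p_value(original_freq, random_freqs):
    """Fraction of random networks in which the motif is more frequent
    than in the original network."""
    more_frequent = sum(1 for f in random_freqs if f > original_freq)
    return more_frequent / len(random_freqs)


def z_score(original_freq, random_mean, random_std):
    return (original_freq - random_mean) / random_std


# Values reported in Figure 13.
z = z_score(0.72917, 0.7282, 0.00047829)
print(round(z, 2))  # 2.03

# Synthetic frequencies: 19 of 1000 random networks exceed the original.
synthetic = [0.73] * 19 + [0.728] * 981
p = empirical_p_value(0.72917, synthetic)
print(p)  # 0.019
```

The original frequency lies about two standard deviations above the random mean, which is consistent with the small p-value even though the absolute difference in frequency is under 0.1 percentage points.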

5 1959 JAZZ RECORDING COLLABORATION

We first imported the network into Pajek. To reduce the number of nodes in visualizations, we kept only nodes with a degree of 13 or higher, leaving us with 54 nodes. Community finding was performed with the entire network of 196 nodes imported into GUESS.
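The node reduction can be sketched as a single-pass filter on degree. Whether the reduction in Pajek was applied once or repeated until all remaining nodes met the threshold is not stated, so the single pass here is an assumption; the toy graph and the threshold of 2 are for illustration only.

```python
def filter_by_degree(adj, threshold):
    """Keep nodes whose degree in the full network is at least `threshold`,
    together with the edges among the kept nodes."""
    keep = {v for v, nbrs in adj.items() if len(nbrs) >= threshold}
    return {v: {u for u in adj[v] if u in keep} for v in keep}


# Toy graph, for illustration only.
toy = {
    "hub": {"a", "b", "c"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub"},
}
reduced = filter_by_degree(toy, 2)
print(sorted(reduced))
```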

5.1 Centrality

After importing and node reduction we measured degree centrality, closeness centrality, and

betweenness centrality as with the similarities network. Figure 14 visualizes betweenness

centrality with the size of a node proportional to its betweenness. Table 4 shows the 5 artists with

the highest degree centrality. Table 5 shows the 5 artists with the highest closeness centrality.

Table 6 shows the 5 artists with the highest betweenness centrality.

Figure 14 – Betweenness display of the 1959 Jazz Network. Larger node size denotes higher betweenness

Degree   Musician
169      Paul Chambers
99       Wynton Kelly
97       Jimmy Cobb
70       Philly Joe Jones
53       Tommy Flanagan

Table 4: Top 5 Degree in the 1959 Jazz Network

Closeness   Musician
0.550389    Paul Chambers
0.490848    Philly Joe Jones
0.461616    Scott Lafaro
0.445051    Jimmy Cobb
0.440833    Wynton Kelly

Table 5: Top 5 Closeness in the 1959 Jazz Network

Betweenness   Musician
0.39          Paul Chambers
0.32          Scott Lafaro
0.08          Philly Joe Jones
0.05          Danny Bank
0.05          Jimmy Cobb

Table 6: Top 5 Betweenness in the 1959 Jazz Network

With the exception of Danny Bank in the betweenness table, all of the musicians with the top centrality scores play a rhythm section instrument (drums, bass, or piano). This makes sense: nearly every jazz album uses these three instruments, while several albums use no woodwind or brass instruments. Rhythm section players therefore get more recording work, which raises their degree and their other centrality measures (though the network’s seed albums all used woodwind instruments).

Of all the centrality measures, the most interesting is the high betweenness of Paul Chambers and Scott Lafaro, whose nodes are the largest in the betweenness visualization. Both musicians play the bass, but they have very different degree scores: Chambers has the highest degree in the network (169), while Lafaro is near the bottom (14). Figure 15 illustrates their connections to two very large bands. Paul Chambers is connected to a group of musicians who played on the Miles Davis album Sketches of Spain, shown as the yellow nodes. Scott Lafaro (brown node) is connected to a group of musicians appearing on several Stan Kenton big band albums, shown as the blue and grey nodes in the northern portion of the visualization. Because Lafaro and Chambers share an edge, each benefits from the other bass player’s connection to a large but less connected group. It is likely that they never actually played together on the album on which they both appear; instead, two different recording sessions were probably combined into one album, a common practice of the era. Their high betweenness, and the generally high betweenness of rhythm section players, makes these musicians the people to get in touch with should one be looking for recording work in 1959, due to their connections to several clusters in the network.


Figure 15 – Demonstrating Betweenness of Scott Lafaro and Paul Chambers

5.2 Community structure

Figure 16 – Girvan-Newman Betweenness Clustering of 1959 Jazz Network

We used the Girvan-Newman betweenness clustering algorithm to find communities of musicians appearing on the same albums. The enlarged red nodes represent the leaders of the five albums that seeded the network, while the large black node represents Stan Kenton of the Stan Kenton big band. Three distinct communities form through the removal of high-betweenness edges. The community in blue, with the leaders John Coltrane and Miles Davis, grew from the albums Kind of Blue and Giant Steps. Because these two albums share 6 musicians, they form the largest cluster. The lime-colored cluster represents the Dave Brubeck and Ornette Coleman bands. Several of the sidemen (non-leaders) on these albums play on each other’s albums when they appear as leaders, linking these two albums’ communities into one cluster. The darker green cluster consists of players associated with Charles Mingus and the Stan Kenton band. Charles Mingus is a bit of an anomaly in the network. Earlier we discussed the high betweenness of rhythm section players, and this is usually the case, but Mingus, a bassist, records as a leader and does not appear on any other leader’s albums, so he sits in a community of his own, linked to the 7-12 musicians he uses on projects depending on the musical need. Scott Lafaro is the bassist of choice for Mingus’ musicians when Mingus is not leading, so his link to them also connects the Stan Kenton big band (the cluster with the large black dot) to the Charles Mingus community, placing the Mingus big band and the Kenton big band in the same community. These are also the only two big bands in the network.

Also interesting, in this community finding and in the data as a whole, are the musicians who are not represented. Some very famous and influential leaders were playing in 1959, including Stan Getz, Sonny Rollins, and Thelonious Monk, yet none of these leaders connect to this album collaboration network. Further work could determine how many steps it would take until these leaders entered the data.

6 RELATED WORK

Managing Metadata [3], written by David Datta, describes helpful hints and pitfalls to consider when building a robust, usable database. He notes that before building the database one must understand how the data will be used, how the different pieces will link to each other, and what the legal ramifications are. For the All Music Guide (AMG), one of the most used aspects is not the factual information taken from the disc jacket but rather the creative content written by the AMG staff, which includes genres, influences, keywords, moods, etc. Users of the site will often get lost for hours exploring and browsing the creative content, says Datta. He acknowledges that there is much more to consider when building a good database but that his paper provides a good understanding of the basics.

Community Structure in Jazz [2].

Data: The study uses data from The Red Hot Jazz Archive database, which stores 198 bands that performed from 1912 to 1940. 1275 musician names appear in the database. Band sizes average around 5-10 members, but some very large bands have 171 members.

The authors derive two networks from the data. The first is a musician-to-musician network in which nodes represent musicians and an edge is created between two musicians if they have played together in a band. The second is a band-to-band network in which nodes represent bands and edges connect bands with at least one shared member. Both networks exhibit small-world properties and high degrees of clustering. After deriving the networks, the authors remove high-betweenness edges one by one to form communities. In the musician network the authors find two communities, which reflect black/white racial segregation. In the band network the authors identify the same two communities as in the musician network, but they also find two communities within the black musician community, which correspond to the cities (Chicago and New York) in which the bands recorded.
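The band-to-band construction described in [2] can be sketched as a projection of band rosters: two bands are linked when their rosters intersect. The rosters below are hypothetical placeholders and are not taken from the Red Hot Jazz Archive.

```python
from itertools import combinations


def band_network(rosters):
    """Link two bands when they share at least one member."""
    edges = set()
    for band_a, band_b in combinations(sorted(rosters), 2):
        if set(rosters[band_a]) & set(rosters[band_b]):
            edges.add((band_a, band_b))
    return edges


# Hypothetical rosters, for illustration only.
rosters = {
    "Band 1": ["musician_a", "musician_b"],
    "Band 2": ["musician_b", "musician_c"],
    "Band 3": ["musician_d"],
}
band_edges = band_network(rosters)
print(sorted(band_edges))
```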

7 CONCLUSIONS

In examining these three networks we learned a few things about jazz musicians that carried across networks. From the community finding done in all three networks using two separate methods, we learned that jazz musicians work in communities, and that these communities are bridged by musicians who have high centrality due to their ability to play in different settings and styles, or by musicians who incorporate many influences into their own style. We also learned that centrality has more to do with a musician's ability to adapt to many musical settings than with high album sales (demonstrated in the similarity network) or name recognition (demonstrated in the 1959 recording network). The fact that these characteristics appear to extend over a period of more than forty years is interesting and could be the subject of further work on different time periods.


REFERENCES

[1] All Music Guide. http://allmusic.com/

[2] P. Gleiser and L. Danon. Community Structure in Jazz. In Advances in Complex Systems,

Vol. 6, No. 4, 565-573, 2003.

[3] D. Datta. Managing Metadata. In Proceedings of the 3rd International Conference on Music

Information Retrieval, October 2002.

[4] P. Cano, O. Celma, M. Koppenberger, and J. M. Buldú. Topology of music

recommendation networks. In Chaos: An Interdisciplinary Journal of Nonlinear Science,

Vol. 16, 013107, January 2006.

[5] D. de Lima e Silva, M. Medeiros Soares, M.V.C. Henriques, M.T. Schivani Alves, S.G. de

Aguiar, T.P. de Carvalho, G. Corso, and L.S. Lucena. The complex network of the

Brazilian Popular Music. In Physica A, 332, 559-565, 2003.

[6] R. D. Smith. The network of collaboration among rappers and its community structure. In

Journal of Statistical Mechanics: Theory and Experiment, P02006, 2006.

[7] Recording Industry Association of America. http://www.riaa.com/

[8] G. Barber. 1959: A Great Year in Jazz. All About Jazz, 27 Oct. 2004. http://www.allaboutjazz.com/php/article.php?id=15310