Networks, Maps, Relations. (Humanities Hackathon 2012, Day 4). Objects of study: novels, species, philosophers, philosophies, words, concepts, languages, songs…. The problem at hand: describe relationships between the objects (similarity, influence, equivalence, co-location…).
![Page 1: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/1.jpg)
Networks, Maps, Relations
(Humanities Hackathon 2012, Day 4)
![Page 2: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/2.jpg)
![Page 3: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/3.jpg)
![Page 4: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/4.jpg)
Objects of study: novels, species, philosophers, philosophies, words, concepts, languages, songs….
The problem at hand: describe relationships between the objects. (similarity, influence, equivalence, co-location….)
![Page 5: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/5.jpg)
Graphs
• Simplest case: relations between pairs of objects.
• BINARY: objects are either related or they’re not (no attempt to measure extent or other qualities)
![Page 6: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/6.jpg)
(D.P. Hayes, Social Network Theory and the Claim that Shakespeare of Stratford…)
![Page 7: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/7.jpg)
![Page 8: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/8.jpg)
![Page 9: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/9.jpg)
How I made this graph (not recommended)
• adj <- array(c(0,0,0,0,0,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,1,0,0,0,1,1,0,0,0,0,0,0,0,0,0,1,1,0,1,1,1,0,0,0,0,0,0,1,0,0,0,1,0,1,1,0,1,0,0,1,1,0,0,0,0,1,1,1,0,0,1,1,0,0,1,1,0,0,0,0,0,0,0,0,0,0,1,1,0,0,1,0,1,1,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,1,0,0,1,0,0,0,0,0,0,1,1,0,1,0,0,0,1,1,0,1,0,0,1,1,0,0,0,0,0,1,0,0,0,0,0,1,0,0,1,1,1,1,1,1,0,1,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,1,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,1,0,0,0,0,0,0,1,0,0,1,1,0,0,1,1,1,0,0,1,0,0,0,0,0,0,1,0,0,0,1,1,0,0,0,0,0,0,1,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,1,0,1,0,0,0,1,0,0,1,0,0,0,1,1,0,0,1,0,0,0,0,0,1,1,0,0,0,0,0,0,0,1,1,0,1,1,1,0,0,1,0,0,0,1,1,0,1,0,0),c(20,20))
> library(igraph)
> PL = graph.adjacency(adj,mode="undirected")
![Page 10: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/10.jpg)
How I made this graph
> Names = c("Beaumont", "Chapman", "Chettle", "Dekker", "Drayton", "Fletcher", "Greene", "Heywood", "Jonson", "Kyd", "Lodge", "Lyly", "Marlowe", "Marston", "Middleton", "Munday", "Nashe", "Peele", "Webster", "SHAKESPEARE")
> V(PL)$name = Names
OR
> V(PL)$name <- Names
![Page 11: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/11.jpg)
Graphs
A graph (or network) consists of:
• A set of vertices (or nodes)
• A set of edges of the form (v,w) where v and w are vertices.
• Two vertices are adjacent if they are joined by an edge.
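The definition above can be sketched in base R (no igraph needed). The vertex names and the edge list here are made-up toy data, and `is_adjacent` is a hypothetical helper, not an igraph function.

```r
# Hypothetical toy data: four vertices and four undirected edges.
vertices <- c("a", "b", "c", "d")
edges <- rbind(c("a", "b"), c("a", "c"), c("b", "c"), c("c", "d"))

# Build the symmetric 0/1 adjacency matrix from the edge list.
adj <- matrix(0, nrow = length(vertices), ncol = length(vertices),
              dimnames = list(vertices, vertices))
for (i in seq_len(nrow(edges))) {
  v <- edges[i, 1]
  w <- edges[i, 2]
  adj[v, w] <- 1
  adj[w, v] <- 1  # undirected: record the edge in both directions
}

# Two vertices are adjacent exactly when their matrix entry is 1.
is_adjacent <- function(adj, v, w) adj[v, w] == 1

is_adjacent(adj, "a", "b")  # TRUE
is_adjacent(adj, "a", "d")  # FALSE
```

The edge list and the matrix carry the same information; the matrix is just the form that the `array(...)` call earlier in the deck types out by hand.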
![Page 12: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/12.jpg)
Directed graphs
Undirected graphs model symmetric relations: A is connected to B means B is connected to A.
(similarity, overlap, blood relation…)
Directed graphs (or digraphs) model non-symmetric relations:
(biological descent, Internet links, phone calls…)
![Page 13: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/13.jpg)
Weighted graphs
In a weighted graph, edges are assigned numbers – typically measuring the strength of a relation, not just whether it is there or not.
(e.g. edge from v to w records number of e-mails from v to w, not just existence of e-mail from v to w.)
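A minimal base-R sketch of the e-mail example, assuming invented counts for three hypothetical people. Entry `[v, w]` holds the weight of the edge from v to w, so the matrix need not be symmetric.

```r
# Hypothetical e-mail counts: entry [v, w] is the number of e-mails from v to w.
people <- c("v", "w", "x")
emails <- matrix(c(0, 12, 0,
                   3,  0, 1,
                   0,  0, 0),
                 nrow = 3, byrow = TRUE, dimnames = list(people, people))

# The unweighted directed graph keeps only whether any e-mail was sent.
has_emailed <- (emails > 0) * 1

emails["v", "w"]       # weight of the edge from v to w
emails["w", "v"]       # weight of the edge from w to v (different)
has_emailed["v", "w"]  # 1: the edge exists
has_emailed["x", "v"]  # 0: x never e-mailed v
```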
![Page 14: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/14.jpg)
![Page 15: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/15.jpg)
![Page 16: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/16.jpg)
Shakespeare graph (undirected):
• Vertices are Elizabethan playwrights
• Edges are collaborations (or friendships, or co-defendancies)
![Page 17: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/17.jpg)
![Page 18: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/18.jpg)
![Page 19: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/19.jpg)
MORAL: A picture of a graph is not a graph. The graph is the list of adjacencies, nothing more.
![Page 20: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/20.jpg)
ASIDE: why do this?
Oversimplification, BUT
All statements about books are oversimplifications, e.g. “Raymond Carver wrote Cathedral”
Our goal is “distant reading”
![Page 21: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/21.jpg)
Basic notions
• The degree (or valence) of a vertex is the number of edges attached to it. Loose measure of “importance”
> degree(PL)
   Beaumont     Chapman     Chettle      Dekker     Drayton    Fletcher
          2           5           7          10           5           5
…
    Webster SHAKESPEARE
          4           9
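For an undirected graph stored as a 0/1 adjacency matrix, the degree of a vertex is just its row sum. A small base-R sketch on a made-up four-vertex graph (not the playwright data):

```r
# Hypothetical toy graph with edges a-b, a-c, b-c, c-d.
vertices <- c("a", "b", "c", "d")
adj <- matrix(c(0, 1, 1, 0,
                1, 0, 1, 0,
                1, 1, 0, 1,
                0, 0, 1, 0),
              nrow = 4, byrow = TRUE, dimnames = list(vertices, vertices))

# Degree = number of edges attached to each vertex = row sum.
deg <- rowSums(adj)
deg

# The vertex with the most edges, as a loose measure of "importance".
names(which.max(deg))
```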
![Page 22: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/22.jpg)
• For directed graphs, the in-degree of a vertex x is the number of edges pointing to x, and the out-degree is the number of edges emanating from x.
• Web graph: in-degree = number of links pointing to my page, out-degree = number of outbound links on my page
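A sketch of the web-graph example in base R, assuming three hypothetical pages. With the convention that `links[v, w] = 1` means "v links to w", out-degree is a row sum and in-degree is a column sum.

```r
# Hypothetical link structure: links[v, w] = 1 means page v links to page w.
pages <- c("home", "about", "blog")
links <- matrix(c(0, 1, 1,
                  1, 0, 0,
                  1, 1, 0),
                nrow = 3, byrow = TRUE, dimnames = list(pages, pages))

out_degree <- rowSums(links)  # outbound links on each page
in_degree  <- colSums(links)  # links pointing to each page

out_degree
in_degree
```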
![Page 23: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/23.jpg)
Basic notions
• The distance between two vertices is the length of the shortest chain of adjacencies connecting them.
> shortest.paths(PL,"SHAKESPEARE","Lyly")
            Lyly
SHAKESPEARE    3
> lapply(get.shortest.paths(PL,'SHAKESPEARE','Lyly'),function(x) V(PL)$name[x])
[[1]]
[1] "SHAKESPEARE" "Greene" "Nashe" "Lyly"
(sorry for this ugliness)
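One way to compute such distances by hand is breadth-first search: visit the neighbors of the starting vertex, then their unvisited neighbors, and so on. The sketch below is base R on a made-up four-vertex graph; `bfs_distances` is a hypothetical helper and is not claimed to be how igraph does it.

```r
# Breadth-first search: distance (number of edges) from `source` to every vertex.
bfs_distances <- function(adj, source) {
  n <- nrow(adj)
  dist <- rep(Inf, n)   # Inf marks "not reached yet"
  dist[source] <- 0
  queue <- source
  while (length(queue) > 0) {
    v <- queue[1]
    queue <- queue[-1]
    for (w in which(adj[v, ] == 1)) {
      if (is.infinite(dist[w])) {
        dist[w] <- dist[v] + 1
        queue <- c(queue, w)
      }
    }
  }
  dist
}

# Hypothetical toy graph with edges 1-2, 2-3, 3-4, 1-3.
adj <- matrix(0, 4, 4)
for (e in list(c(1, 2), c(2, 3), c(3, 4), c(1, 3))) {
  adj[e[1], e[2]] <- 1
  adj[e[2], e[1]] <- 1
}

bfs_distances(adj, 1)
```

Vertices that cannot be reached keep the distance `Inf`.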
![Page 24: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/24.jpg)
Basic notions
• The diameter of a graph is the greatest distance between any two vertices.
> diameter(PL)
[1] 5
> farthest.nodes(PL)
[1] 1 12 5
> shortest.paths(PL,1,12)
         Lyly
Beaumont    5
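A base-R sketch of the diameter on a toy path graph. It uses the Floyd-Warshall all-pairs update, chosen here for brevity and not claimed to be igraph's method. `all_pairs_distances` is a hypothetical helper, and the sketch assumes a connected graph (otherwise the maximum is `Inf`).

```r
# All-pairs shortest distances via the Floyd-Warshall update rule.
all_pairs_distances <- function(adj) {
  n <- nrow(adj)
  d <- ifelse(adj == 1, 1, Inf)  # direct edges have length 1
  diag(d) <- 0
  for (k in seq_len(n)) {
    # Allow paths that pass through vertex k.
    d <- pmin(d, outer(d[, k], d[k, ], "+"))
  }
  d
}

# Hypothetical toy graph: the path 1-2-3-4-5.
n <- 5
adj <- matrix(0, n, n)
for (i in 1:(n - 1)) {
  adj[i, i + 1] <- 1
  adj[i + 1, i] <- 1
}

d <- all_pairs_distances(adj)
diameter <- max(d)  # greatest distance between any two vertices
farthest <- which(d == diameter & upper.tri(d), arr.ind = TRUE)
diameter
farthest            # the pair(s) of vertices realizing the diameter
```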
![Page 25: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/25.jpg)
Complete graphs
• Every vertex adjacent to every other
5 vertices
10 edges
![Page 26: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/26.jpg)
Complete graphs
More generally: n vertices, each vertex connected to n-1 others, for a total of n(n-1).
This counts each edge twice!
So (n^2-n)/2 edges.
Number of edges scales as number of vertices squared: studying a graph on 10 times as many vertices can take 100 times as long. (Or more, depending on the question asked…)
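The count can be checked numerically. `complete_graph_edges` is a hypothetical helper; `choose(n, 2)` is base R's "number of pairs" function and should agree with it.

```r
# Number of edges in a complete graph on n vertices: (n^2 - n) / 2.
complete_graph_edges <- function(n) (n^2 - n) / 2

complete_graph_edges(5)   # the 5-vertex example
complete_graph_edges(50)  # ten times as many vertices

# Ratio of edge counts when the vertex count grows tenfold.
complete_graph_edges(50) / complete_graph_edges(5)

# Same thing as counting pairs of vertices.
all(complete_graph_edges(2:20) == choose(2:20, 2))
```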
![Page 27: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/27.jpg)
Trees
A tree is a graph in which every two vertices are joined by one, but only one, path. Equivalently: connected, with no cycles.
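A sketch of a tree test in base R. It relies on a standard fact not stated on the slide: a graph on n vertices is a tree exactly when it is connected and has n-1 edges. `make_adj`, `is_connected` and `is_tree` are hypothetical helpers, and the three small graphs are toy data.

```r
# Build a symmetric adjacency matrix from a two-column matrix of edges.
make_adj <- function(n, edges) {
  adj <- matrix(0, n, n)
  for (i in seq_len(nrow(edges))) {
    adj[edges[i, 1], edges[i, 2]] <- 1
    adj[edges[i, 2], edges[i, 1]] <- 1
  }
  adj
}

# Connected if a search started at vertex 1 reaches every vertex.
is_connected <- function(adj) {
  seen <- 1
  frontier <- 1
  while (length(frontier) > 0) {
    nbrs <- which(colSums(adj[frontier, , drop = FALSE]) > 0)
    frontier <- setdiff(nbrs, seen)
    seen <- c(seen, frontier)
  }
  length(seen) == nrow(adj)
}

# Tree = connected and exactly n - 1 edges.
is_tree <- function(adj) {
  n_edges <- sum(adj) / 2
  is_connected(adj) && n_edges == nrow(adj) - 1
}

star     <- make_adj(4, rbind(c(1, 2), c(1, 3), c(1, 4)))
triangle <- make_adj(3, rbind(c(1, 2), c(2, 3), c(1, 3)))
tri_plus_isolated <- make_adj(4, rbind(c(1, 2), c(2, 3), c(1, 3)))

is_tree(star)               # TRUE
is_tree(triangle)           # FALSE: contains a cycle
is_tree(tri_plus_isolated)  # FALSE: n - 1 edges, but not connected
```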
![Page 28: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/28.jpg)
![Page 29: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/29.jpg)
Communities
• A clique is a set of vertices which are all mutually adjacent.
(So: any pair of adjacent vertices is a clique of size 2, any “triangle” is a clique of size 3…)
• e.g. Shakespeare, Dekker, Chettle.
> largest.cliques(PL)
[[1]]
[1] 4 3 16 8 20
(Dekker, Chettle, Munday, Heywood, Shakespeare)
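Checking whether a given set of vertices is a clique is easy (finding the largest one is the hard part). A base-R sketch on a made-up graph; `is_clique` is a hypothetical helper.

```r
# A set of vertices is a clique if every pair within it is adjacent.
is_clique <- function(adj, members) {
  sub <- adj[members, members, drop = FALSE]
  # Every off-diagonal entry of the sub-matrix must be 1.
  all(sub[row(sub) != col(sub)] == 1)
}

# Hypothetical toy graph with edges 1-2, 1-3, 2-3, 3-4.
adj <- matrix(0, 4, 4)
for (e in list(c(1, 2), c(1, 3), c(2, 3), c(3, 4))) {
  adj[e[1], e[2]] <- 1
  adj[e[2], e[1]] <- 1
}

is_clique(adj, c(1, 2, 3))     # TRUE: a triangle
is_clique(adj, c(3, 4))        # TRUE: any adjacent pair
is_clique(adj, c(1, 2, 3, 4))  # FALSE: 4 is not adjacent to 1 or 2
```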
![Page 30: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/30.jpg)
![Page 31: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/31.jpg)
Communities
A graph is connected if any vertex can be reached from any other by a chain of adjacencies. Every graph breaks up into connected pieces called connected components.
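A base-R sketch that labels each vertex with its connected component by repeatedly growing a search from an unlabelled vertex. `connected_components` is a hypothetical helper and the five-vertex graph is toy data.

```r
# Label each vertex with the index of its connected component.
connected_components <- function(adj) {
  n <- nrow(adj)
  membership <- rep(NA_integer_, n)
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(membership[start])) next  # already placed in a component
    current <- current + 1L
    membership[start] <- current
    frontier <- start
    while (length(frontier) > 0) {
      # All neighbors of the frontier that are still unlabelled.
      nbrs <- which(colSums(adj[frontier, , drop = FALSE]) > 0)
      frontier <- nbrs[is.na(membership[nbrs])]
      membership[frontier] <- current
    }
  }
  membership
}

# Hypothetical toy graph with edges 1-2, 2-3 and 4-5: two pieces.
adj <- matrix(0, 5, 5)
for (e in list(c(1, 2), c(2, 3), c(4, 5))) {
  adj[e[1], e[2]] <- 1
  adj[e[2], e[1]] <- 1
}

membership <- connected_components(adj)
membership
max(membership) == 1  # FALSE here: the graph is not connected
```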
![Page 32: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/32.jpg)
A geometry of their own
“Really, universally, relations stop nowhere, and the exquisite problem of the artist is eternally but to draw, by a geometry of his own, the circle within which they shall happily appear to do so.” (Henry James, preface to Roderick Hudson)
How to draw this circle?
![Page 33: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/33.jpg)
Clustering
Connected component: a set of vertices which has no connection to the remainder of the graph.
Cluster: a set of vertices which has relatively few connections to the rest of the graph.
(Note that this isn’t a definition…) Many ways to cluster, no “right way”
![Page 34: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/34.jpg)
Clustering in R
> edge.betweenness.community(PL)
Graph community structure calculated with the edge betweenness algorithm
Number of communities (best split): 2
Modularity (best split): 0.2781065
Membership vector:
   Beaumont     Chapman     Chettle      Dekker     Drayton    Fletcher
          1           1           1           1           1           1
     Greene     Heywood      Jonson         Kyd       Lodge        Lyly
          2           1           1           2           2           2
    Marlowe     Marston   Middleton      Munday       Nashe       Peele
          2           1           1           1           2           2
    Webster SHAKESPEARE
          1           1
![Page 35: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/35.jpg)
How the clusters look
![Page 36: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/36.jpg)
“The University Wits were a group of late 16th century English playwrights who were educated at the universities (Oxford or Cambridge) and who became playwrights and popular secular writers. Prominent members of this group were Christopher Marlowe, Robert Greene, and Thomas Nashe from Cambridge, and John Lyly, Thomas Lodge, George Peele from Oxford.” (Wikipedia)
![Page 37: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/37.jpg)
Macbeth
![Page 38: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/38.jpg)
Clusters of characters in Macbeth
> edge.betweenness.community(Macbeth)
Graph community structure calculated with the edge betweenness algorithm
Number of communities (best split): 10
Modularity (best split): 0.06733369
Membership vector:
        MACBETH    LADY MACBETH         MACDUFF         MALCOLM
              1               2               1               1
           ROSS          BANQUO     First Witch          LENNOX
              1               3               4               1
 First Murderer          DUNCAN    Second Witch     Third Witch
              2               5               4               4
            ALL          SIWARD       Messenger Second Murderer
              1               6               7               8
        Servant          SEYTON
              9              10
![Page 39: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/39.jpg)
Breakpoint
When can networks tell us things we don’t already know?
![Page 40: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/40.jpg)
200 names
Vertices: 200 baby names for boys popular in 2011.
For each name, record popularity in WI, TX, PA, CA, MA, GA, OH, MO, FL, CO, NY, IL
Edges: Two names are adjacent if their popularity distribution across states are “very similar”
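The slides do not say how "very similar" was measured. As one possible reading, the sketch below joins two names when the correlation of their state-by-state popularity exceeds a cutoff. The similarity measure, the 0.9 cutoff, the names and all numbers are assumptions made up for illustration.

```r
# Hypothetical popularity of four names in three states (rows = names).
popularity <- rbind(
  Liam = c(WI = 10, TX = 20, NY = 30),
  Noah = c(WI = 11, TX = 19, NY = 31),
  Owen = c(WI = 30, TX = 20, NY = 10),
  Cole = c(WI = 29, TX = 22, NY = 9)
)

# Assumed similarity: Pearson correlation between the rows.
similarity <- cor(t(popularity))

# Assumed rule: two names are adjacent if similarity exceeds the cutoff.
cutoff <- 0.9
adj <- (similarity > cutoff) * 1
diag(adj) <- 0  # no name is adjacent to itself

adj
```

A different similarity measure or cutoff would give a different graph, which is worth keeping in mind when interpreting the cliques and communities that follow.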
![Page 41: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/41.jpg)
200 names
> lapply(largest.cliques(MaleNames), function(x) V(MaleNames)$name[ x ])
[[1]]
[1] "Jacob" "Anthony" "Dylan" "Matthew" "Brian"
(popular in NY, CA, MA, less so in CO, MO, GA)
![Page 42: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/42.jpg)
200 names
> V(MaleNames)$name[neighbors(MaleNames,'Malachi')]
[1] "Ashton" "Ashton" "Kaden" "Kaden" "Malachi" "Malachi"
> V(MaleNames)$name[neighbors(MaleNames,'Owen')]
[1] "Maxwell" "Maxwell" "Brady" "Brady" "Cole" "Cole" "Owen" "Owen"
> V(MaleNames)$name[neighbors(MaleNames,'Patrick')]
[1] "Thomas" "Thomas" "Patrick" "Patrick" "John" "John" "Sean" "Sean" "Ryan" "Ryan" "Peter" "Peter"
![Page 43: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/43.jpg)
![Page 44: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/44.jpg)
![Page 45: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/45.jpg)
edge.betweenness.community finds groups of girls’ names like
• Alaina, Maci, Mackenzie, Lillian, Addison, Alivia
• Piper, Harper, Brooklyn, Brooklynn
• Aubrey, Zoey, Autumn, Ellie
• Lucy, Josephine, Elise, Clara, Eleanor
![Page 46: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/46.jpg)
Density
How likely are two things to be related?
The density of a graph is the probability that two random elements are related: i.e.
[total number of edges]/[total number of pairs of vertices]
> graph.density(MaleNames)
[1] 0.1084846
> graph.density(FemaleNames)
[1] 0.09950159
> graph.density(Macbeth)
[1] 0.2810458
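The formula is short enough to write out in base R. `graph_density` is a hypothetical helper that assumes an undirected graph with no self-loops, and the four-vertex graph is toy data.

```r
# Density = [number of edges] / [number of pairs of vertices].
graph_density <- function(adj) {
  n <- nrow(adj)
  n_edges <- sum(adj) / 2  # each undirected edge appears twice in the matrix
  n_edges / choose(n, 2)
}

# Hypothetical toy graph with edges 1-2, 1-3, 2-3, 3-4: 4 edges out of 6 pairs.
adj <- matrix(0, 4, 4)
for (e in list(c(1, 2), c(1, 3), c(2, 3), c(3, 4))) {
  adj[e[1], e[2]] <- 1
  adj[e[2], e[1]] <- 1
}
graph_density(adj)

# Sanity check on scale: 43 edges among 18 vertices would give about 0.281,
# which is consistent with the Macbeth value (the edge count is inferred, not given).
43 / choose(18, 2)
```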
![Page 47: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/47.jpg)
Transitivity
• A relation is transitive if “A related to B” and “B related to C” implies “A related to C.”
Transitive: “Is descended from,” “born in same city as”
Non-transitive: “is friends with”, “lived at some point in same city as”
![Page 48: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/48.jpg)
How transitive is a graph?
Some relations are transitive, others are not. But we don’t have to stop at “yes” or “no”.
How frequently are two friends of yours friends with each other?
• Always
• Never
• Something in between
![Page 49: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/49.jpg)
How transitive is a graph?
Transitivity (or “clustering coefficient”) gives the probability that two random neighbors of the same vertex are neighbors to each other.
> transitivity(MaleNames)
[1] 0.4972335
> transitivity(FemaleNames)
[1] 0.4546713
> transitivity(Macbeth)
[1] 0.4545455
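A base-R sketch of this probability, computed as (ordered pairs of neighbors that are themselves adjacent) divided by (all ordered pairs of distinct neighbors). This is my reading of the global clustering coefficient; `transitivity_ratio` is a hypothetical helper and the graph is toy data.

```r
# Fraction of "two neighbors of the same vertex" pairs that are adjacent.
transitivity_ratio <- function(adj) {
  deg <- rowSums(adj)
  # diag(A %*% A %*% A)[v] counts ordered neighbor pairs of v that are adjacent,
  # so each triangle is counted 6 times in the sum.
  closed <- sum(diag(adj %*% adj %*% adj))
  # Ordered pairs of distinct neighbors around each vertex.
  possible <- sum(deg * (deg - 1))
  closed / possible
}

# Hypothetical toy graph with edges 1-2, 1-3, 2-3, 3-4 (one triangle plus a tail).
adj <- matrix(0, 4, 4)
for (e in list(c(1, 2), c(1, 3), c(2, 3), c(3, 4))) {
  adj[e[1], e[2]] <- 1
  adj[e[2], e[1]] <- 1
}
transitivity_ratio(adj)

# In a complete graph every pair of neighbors is adjacent.
complete <- matrix(1, 4, 4)
diag(complete) <- 0
transitivity_ratio(complete)
```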
![Page 50: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/50.jpg)
How transitive is a graph?
In both name cases, two random neighbors have about a 50% chance of being connected (while two random vertices have about a 10% chance of being connected.) Quite transitive!
Facebook thinks the same is true for “friends” (and makes this so by thinking so!)
![Page 51: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/51.jpg)
Stub: incompletely specified networks
Standard problem: incomplete data. Did X and Y collaborate? Lack of an edge might mean “we know they didn’t” or “we don’t know that they did.”
One idea: use network structure – if graph is highly transitive, and X and Y have many common collaborators, this is evidence that X and Y collaborated.
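One simple version of this idea: for every pair with no recorded edge, count their common neighbors and rank the pairs. The sketch below uses the fact that entry `[v, w]` of the squared adjacency matrix is the number of common neighbors of v and w. The people and the known collaborations are invented.

```r
# Hypothetical known collaborations among five people.
people <- c("X", "Y", "A", "B", "C")
adj <- matrix(0, 5, 5, dimnames = list(people, people))
known <- rbind(c("X", "A"), c("X", "B"), c("Y", "A"),
               c("Y", "B"), c("A", "B"), c("C", "A"))
for (i in seq_len(nrow(known))) {
  adj[known[i, 1], known[i, 2]] <- 1
  adj[known[i, 2], known[i, 1]] <- 1
}

# (A %*% A)[v, w] = number of common neighbors of v and w.
common <- adj %*% adj

# Pairs with no recorded edge, each listed once.
candidates <- which(adj == 0 & upper.tri(adj), arr.ind = TRUE)
scores <- data.frame(
  v = people[candidates[, "row"]],
  w = people[candidates[, "col"]],
  common_neighbors = common[candidates]
)

# Most plausible missing collaborations first.
ranked <- scores[order(-scores$common_neighbors), ]
ranked
```

A high count is only evidence, not proof; how much weight to give it depends on how transitive the graph is.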
![Page 52: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/52.jpg)
Metrics, clustering, trees
Suppose given: a set of objects (e.g. novels) and for each pair of objects a degree of dissimilarity (a number)
(survey data, lexical similarity, voting similarity…)
This data (subject to “triangle inequality”) is called a metric on the set of objects.
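The triangle inequality says that a direct dissimilarity never exceeds a detour: d(i,j) ≤ d(i,k) + d(k,j) for all i, j, k. A base-R sketch of a checker follows; `satisfies_triangle_inequality` is a hypothetical helper and both matrices are toy data.

```r
# TRUE if d[i, j] <= d[i, k] + d[k, j] for all i, j, k.
satisfies_triangle_inequality <- function(d, tol = 1e-12) {
  for (k in seq_len(nrow(d))) {
    detour <- outer(d[, k], d[k, ], "+")  # detour[i, j] = d[i, k] + d[k, j]
    if (any(d > detour + tol)) return(FALSE)
  }
  TRUE
}

# Distances between the points 0, 1 and 3 on a line: a genuine metric.
d_good <- as.matrix(dist(c(0, 1, 3)))

# Made-up dissimilarities where 1 and 3 are "far" despite both being close to 2.
d_bad <- matrix(c(0, 1, 5,
                  1, 0, 1,
                  5, 1, 0), nrow = 3, byrow = TRUE)

satisfies_triangle_inequality(d_good)  # TRUE
satisfies_triangle_inequality(d_bad)   # FALSE: 5 > 1 + 1
```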
![Page 53: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/53.jpg)
Metrics, clustering, trees
Can we associate each object with a point on the plane so that the distances between points correspond to the dissimilarities between objects?
![Page 54: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/54.jpg)
Metrics, clustering, trees
| From City | To City | Distance (km) |
|---|---|---|
| Newark | Jersey City | 8.02 |
| Paterson | Elizabeth | 28.3 |
| Toms River | Edison | 65.4 |
| Trenton | Camden | 45.55 |
| Clifton | Cherry Hill | 126.24 |
| Passaic | East Orange | 11.84 |
| Union City | North Bergen | 2.92 |
| Irvington | Bayonne | 12.38 |
| South Vineland | Wayne | 176.47 |
| Union | Vineland | 149.49 |
| New Brunswick | Bloomfield | 42.14 |
| Perth Amboy | East Brunswick | 15.46 |
| West Orange | Plainfield | 23.19 |
| West New York | Hackensack | 11.18 |
| Sayreville Junction | Lakewood | 41.97 |
| Atlantic City | Sayreville | 121.87 |
| Teaneck | Linden | 36.19 |
| … | … | … |
![Page 55: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/55.jpg)
Metrics, clustering, trees
Doesn’t always work: 4 objects, each pair at distance 1.
Multidimensional scaling: embeds objects in the plane (or higher-dimensional space) while approximately realizing desired distances.
(e.g. Rosenberg, Nelson, Vivekananthan (1968))
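Base R ships a classical multidimensional scaling routine, `cmdscale`. The sketch below applies it to the "4 objects, each pair at distance 1" example: in the plane the distances can only be approximated, while three dimensions realize them (up to rounding). Classical MDS is one variant among several; nothing here is claimed about which variant the cited study used.

```r
# Four objects, every pair at dissimilarity 1.
d <- matrix(1, 4, 4)
diag(d) <- 0

# Embed in the plane (k = 2) and in 3-dimensional space (k = 3).
plane <- cmdscale(d, k = 2)
space <- cmdscale(d, k = 3)

# Worst mismatch between embedded distances and the desired ones.
err_plane <- max(abs(as.matrix(dist(plane)) - d))
err_space <- max(abs(as.matrix(dist(space)) - d))

err_plane  # clearly above zero: no exact picture in the plane
err_space  # essentially zero: a regular tetrahedron works
```

Calling `plot(plane)` would draw the approximate planar picture.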
![Page 56: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/56.jpg)
![Page 57: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/57.jpg)
Hierarchical clustering
A clustering of a set is a partition into categories.
A hierarchical clustering is when we partition the categories into subcategories, subcategories into subsubcategories….
![Page 58: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/58.jpg)
![Page 59: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/59.jpg)
A hierarchical clustering on a set of objects is the same as a tree whose leaves are the objects!
Agglomerative clustering, etc. – find hierarchical clustering that best respects measured dissimilarities (analogue of MDS)
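Base R's `hclust` performs agglomerative clustering. A minimal sketch follows, assuming five hypothetical objects with invented dissimilarities and average linkage (one of several linkage rules; the choice is an assumption, not something the slides specify).

```r
# Hypothetical dissimilarities among five objects.
objects <- c("A", "B", "C", "D", "E")
d <- matrix(c(0, 1, 6, 7, 7,
              1, 0, 6, 7, 7,
              6, 6, 0, 2, 3,
              7, 7, 2, 0, 3,
              7, 7, 3, 3, 0),
            nrow = 5, byrow = TRUE, dimnames = list(objects, objects))

# Agglomerative clustering: repeatedly merge the two closest clusters.
tree <- hclust(as.dist(d), method = "average")

# Cutting the tree at different heights gives coarser or finer categories.
two_groups   <- cutree(tree, k = 2)
three_groups <- cutree(tree, k = 3)
two_groups
three_groups
```

Calling `plot(tree)` draws the tree (dendrogram) whose leaves are the objects; the three-group cut refines the two-group cut, which is what makes the clustering hierarchical.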
![Page 60: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/60.jpg)
• Desideratum: objects that are very dissimilar should not be in the same subsubsubsubcategory (or: their distance in the tree should be large)
![Page 61: Networks, Maps, Relations](https://reader036.vdocuments.us/reader036/viewer/2022062520/56816147550346895dd0c2e4/html5/thumbnails/61.jpg)
LET US HACK!