![Page 1: Graph Theory and Social Networks - IRISA · • Social networks . ... • Running over many machines makes the problem worse ... • Neo4j Graph Database • Flexible schema](https://reader030.vdocuments.us/reader030/viewer/2022020204/5b552dd37f8b9a0d398deabd/html5/thumbnails/1.jpg)
1
Alexandru Costan
Graph Theory and Social Networks
2 Outline
• Graph problems and representations
• Structure of social networks
• Applications of structural analysis
3
Source: Wikipedia (Königsberg)
4 What is a graph?
• G = (V, E)
  • V represents the set of vertices (nodes)
  • E represents the set of edges (links)
  • Both vertices and edges may contain additional information
• Different types of graphs:
  • Directed vs. undirected edges
  • Presence or absence of cycles
• Graphs are everywhere:
  • Hyperlink structure of the Web
  • Highway system
  • Social networks
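As a minimal sketch of G = (V, E) in code, vertices and edges can be stored as plain dictionaries that carry the additional information. The names (`alice`, `since`, `to_undirected`) are hypothetical and only serve the illustration.

```python
# Vertices: name -> extra information attached to the vertex.
vertices = {
    "alice": {"age": 31},
    "bob": {"age": 27},
    "carol": {"age": 45},
}

# Directed edges: (source, target) -> extra information attached to the edge.
edges = {
    ("alice", "bob"): {"since": 2015},
    ("bob", "carol"): {"since": 2018},
}


def to_undirected(directed_edges):
    """Forget direction: (u, v) and (v, u) collapse into one unordered pair."""
    return {frozenset(edge) for edge in directed_edges}


undirected = to_undirected(edges)
# In the undirected view the order of the endpoints no longer matters.
print(frozenset({"bob", "alice"}) in undirected)
```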
5 Some graph problems
• Finding shortest paths
  • Routing Internet traffic and UPS trucks
• Finding minimum spanning trees
  • Telco laying down fiber
• Finding max flow
  • Airline scheduling
• Identify “special” nodes and communities
  • Breaking up terrorist cells, spread of avian flu
• Bipartite matching
  • Tinder
• PageRank
6 Graphs are hard!
• Poor locality of memory access
• Very little work per vertex
• Changing degree of parallelism
• Running over many machines makes the problem worse
• Graph storage:
  • Flat files: no query support
  • RDBMS: can store the graph with limited support for graph queries
• State of the art today:
  • Write your own infrastructure
  • MapReduce – tends to be inefficient
7 Distributed Graph Processing
• Google’s Pregel
  • Large-scale graph processing
  • Vertex-centered computation
• Apache Giraph
  • Open source
  • Iterative graph processing
  • Used at Facebook
• Twitter’s Cassovary
  • In-memory computation
  • Used for: “Who to Follow” and “Similar to”
  • Very simple to use (no need for persistence, databases or partitions)
• Neo4j Graph Database
  • Flexible schema
  • Powerful query language, ACID
8 Representing graphs
Two common representations:
• Adjacency matrix
• Adjacency list
9 Adjacency matrices
Represent a graph as an n × n square matrix M
• n = |V|
• Mij = 1 means a link from node i to node j

|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 0 | 1 | 0 | 1 |
| 2 | 1 | 0 | 1 | 1 |
| 3 | 1 | 0 | 0 | 0 |
| 4 | 1 | 0 | 1 | 0 |
10 Adjacency matrices: critique
Advantages:
• Easy mathematical manipulation
• Iteration over rows and columns corresponds to computations on outlinks and inlinks
Disadvantages:
• Lots of zeros for sparse matrices
• Lots of wasted space
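The row/column correspondence can be sketched on the 4-node example matrix from the previous slide. The helper names `outlinks` and `inlinks` are illustrative, and nodes are numbered from 1 as on the slides.

```python
# Example matrix: M[i][j] == 1 means a link from node i+1 to node j+1.
M = [
    [0, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 1, 0],
]


def outlinks(matrix, i):
    """Nodes that node i links to: scan row i."""
    return [j + 1 for j, bit in enumerate(matrix[i - 1]) if bit]


def inlinks(matrix, j):
    """Nodes that link to node j: scan column j."""
    return [i + 1 for i, row in enumerate(matrix) if row[j - 1]]


print(outlinks(M, 2))  # [1, 3, 4]
print(inlinks(M, 1))   # [2, 3, 4]

# The cost: every absent link still occupies a cell.
zeros = sum(row.count(0) for row in M)
print(zeros, "of", len(M) ** 2, "cells are zero")  # 8 of 16 cells are zero
```

Even in this tiny example half of the cells are zeros; for sparse graphs the wasted share grows with n.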
11 Adjacency lists
Take adjacency matrices… and throw away all the zeros
|   | 1 | 2 | 3 | 4 |
|---|---|---|---|---|
| 1 | 0 | 1 | 0 | 1 |
| 2 | 1 | 0 | 1 | 1 |
| 3 | 1 | 0 | 0 | 0 |
| 4 | 1 | 0 | 1 | 0 |

1: 2, 4
2: 1, 3, 4
3: 1
4: 1, 3
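"Throwing away the zeros" can be sketched as a short conversion. The function name `matrix_to_adjacency_list` is an assumption made for this example.

```python
def matrix_to_adjacency_list(matrix):
    """Keep, for each node, only the columns that hold a 1 (nodes numbered from 1)."""
    return {
        i + 1: [j + 1 for j, bit in enumerate(row) if bit]
        for i, row in enumerate(matrix)
    }


M = [
    [0, 1, 0, 1],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 1, 0],
]

for node, targets in matrix_to_adjacency_list(M).items():
    print(f"{node}: {', '.join(map(str, targets))}")
```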
12 Adjacency lists: critique
Advantages:
• Much more compact representation
• Easy to compute over outlinks
Disadvantages:
• Much more difficult to compute over inlinks
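One way to see the inlink difficulty: an adjacency list answers "whom does node i link to?" directly, but "who links to node j?" requires a pass over every list. A minimal sketch, with the hypothetical helper `invert`, builds the reversed lists once:

```python
from collections import defaultdict

# Adjacency list of the running example (outlinks only).
adjacency = {1: [2, 4], 2: [1, 3, 4], 3: [1], 4: [1, 3]}


def invert(adjacency_list):
    """Build target -> [sources] by visiting every edge exactly once."""
    incoming = defaultdict(list)
    for source, targets in adjacency_list.items():
        for target in targets:
            incoming[target].append(source)
    return dict(incoming)


inlinks = invert(adjacency)
print(inlinks[1])  # [2, 3, 4]
```

Keeping both the forward and the inverted list trades memory for cheap inlink lookups.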
13 Social graphs
14 Social graphs
• Asymmetric follow relationship: very skewed graphs
• Very valuable “interest graphs”
• Huge graphs:
15 What can networks tell us?
• The strength of weak ties [Granovetter ’73]
• Motivating question: How do people find new jobs?
• Through acquaintances rather than close friends
• Surprising fact: discovery is enabled by weak ties
• Understanding structure affords deep insights
• Interplay between sociology and graph theory
16 Triadic Closure
Question: What are the mechanisms by which nodes arrive and depart and by which edges form and vanish?
If two people in a social network have a friend in common, then there is an increased likelihood that they will become friends themselves at some point in the future
The new edge B-C closes the triangle
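Triadic closure suggests a simple way to list the edges most likely to appear next: pairs that share a friend but are not yet connected. The sketch below assumes an undirected friendship graph stored as a dict of neighbor sets; `closure_candidates` and the three-node example are hypothetical.

```python
from itertools import combinations


def closure_candidates(friends):
    """Map each unconnected pair to the set of friends they have in common."""
    candidates = {}
    for node, neighbors in friends.items():
        for u, v in combinations(sorted(neighbors), 2):
            if v not in friends[u]:
                candidates.setdefault((u, v), set()).add(node)
    return candidates


# B and C both know A but not each other.
friends = {
    "A": {"B", "C"},
    "B": {"A"},
    "C": {"A"},
}

print(closure_candidates(friends))  # {('B', 'C'): {'A'}}
```

Once the edge B-C exists, the pair no longer shows up as a candidate.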
17 Triadic Closure
Over time…
… new edges are forming. But not all due to triadic closure (e.g. DG)
18 Clustering Coefficient
• The probability that two randomly selected friends of A are friends with each other.
• The fraction of pairs of A’s friends that are connected to each other by edges.
• For node A:
  • in (a): 1/6
  • in (b): 1/2
• The more strongly triadic closure is operating in the neighborhood of the node, the higher the clustering coefficient will tend to be.
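The definition translates directly into code. The two small graphs below are an assumption: they are chosen so that node A gets the values 1/6 and 1/2 quoted above, and may differ from the figure on the original slide.

```python
from fractions import Fraction
from itertools import combinations


def make_graph(edges):
    """Undirected graph as a dict of neighbor sets."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def clustering_coefficient(graph, node):
    """Fraction of pairs of the node's friends that are themselves connected."""
    pairs = list(combinations(sorted(graph[node]), 2))
    if not pairs:
        return Fraction(0)  # convention for nodes with fewer than two friends
    linked = sum(1 for u, v in pairs if v in graph[u])
    return Fraction(linked, len(pairs))


star = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E")]
before = make_graph(star + [("C", "D")])
after = make_graph(star + [("C", "D"), ("B", "C"), ("D", "E")])

print(clustering_coefficient(before, "A"))  # 1/6
print(clustering_coefficient(after, "A"))   # 1/2
```

A has four friends, hence six pairs; one connected pair gives 1/6, three give 1/2.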
19 Reasons for Triadic Closure
• Opportunity
• Trust
• Incentive
20 Strength of weak ties
• Definition: a bridge in a graph is an edge whose removal places its two endpoints in different connected components.
Bridges are presumably extremely rare in real social networks!
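A bridge test can be sketched with a breadth-first search that ignores the edge in question and checks whether one endpoint can still reach the other. The function `is_bridge` and the example graph are illustrative, and the sketch assumes the edge u-v exists in an undirected graph stored as neighbor sets.

```python
from collections import deque


def is_bridge(graph, u, v):
    """True if, without the edge u-v, there is no path left from u to v."""
    seen = {u}
    queue = deque([u])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if {node, nxt} == {u, v}:
                continue  # pretend the edge u-v has been removed
            if nxt == v:
                return False  # v is still reachable another way
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return True


# Triangle A-B-C with a pendant node D attached to C.
graph = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

print(is_bridge(graph, "C", "D"))  # True
print(is_bridge(graph, "A", "B"))  # False
```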
21 Strength of weak ties
• Definition: a local bridge in a graph is an edge whose endpoints have no common neighbor.
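The local-bridge condition is a purely local check: intersect the two neighbor sets. In the sketch below the helper name and the example graph are assumptions; the graph is built so that A-B is a local bridge without being a bridge, because the longer path A-C-D-B remains.

```python
def is_local_bridge(graph, u, v):
    """True if u-v is an edge and u and v have no neighbor in common."""
    return v in graph[u] and not (graph[u] & graph[v])


# Edges: A-B, A-C, C-D, D-B, A-E, C-E (undirected, stored as neighbor sets).
graph = {
    "A": {"B", "C", "E"},
    "B": {"A", "D"},
    "C": {"A", "D", "E"},
    "D": {"B", "C"},
    "E": {"A", "C"},
}

print(is_local_bridge(graph, "A", "B"))  # True: no common neighbor
print(is_local_bridge(graph, "A", "C"))  # False: E is a common neighbor
```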
22 Types of edges
• Structural approach:
  • Local bridges or not
• Interpersonal approach:
  • Weak or strong
Challenge: how to link them?
23 Strong Triadic Closure
• Strong Triadic Closure Property: if a node has strong ties to two neighbors, then these neighbors must have at least a weak tie between them.
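The property can be checked node by node once every edge is labeled. The sketch below assumes ties are stored as a dict from an unordered pair to the label "strong" or "weak"; this encoding and the function name are hypothetical.

```python
from itertools import combinations


def satisfies_strong_triadic_closure(ties, node):
    """True if every two strong-tie neighbors of `node` share some tie."""
    strong_neighbors = sorted(
        next(iter(edge - {node}))  # the other endpoint of the edge
        for edge, strength in ties.items()
        if node in edge and strength == "strong"
    )
    return all(
        frozenset({u, v}) in ties  # a weak or strong tie both count
        for u, v in combinations(strong_neighbors, 2)
    )


ties = {
    frozenset({"A", "B"}): "strong",
    frozenset({"A", "C"}): "strong",
}
print(satisfies_strong_triadic_closure(ties, "A"))  # False: B and C are not tied

ties[frozenset({"B", "C"})] = "weak"
print(satisfies_strong_triadic_closure(ties, "A"))  # True
```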
24 Local bridges and weak ties
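• Claim: If a node A in a network satisfies the Strong Triadic Closure Property and is involved in at least two strong ties, then any local bridge it is involved in must be a weak tie.
• Why: suppose a local bridge A-B were a strong tie. A has another strong tie, say to C, so Strong Triadic Closure forces an edge B-C. Then C is a common neighbor of A and B, which contradicts the definition of a local bridge.
• Consequence: when these assumptions hold, all local bridges are weak ties!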
25 Strength of weak ties
26 Strength of Weak Ties
• Discovery is enabled by weak ties
• Surprising strength of weak ties!
• Simple structural model explains this cleanly
• Applies to Twitter/Facebook
27 Tie strength on Facebook
28 Tie strength on Twitter
• Stronger…
  • Directed tweets: @someone
• … and weaker ties
  • Followers
• The number of strong ties remains relatively modest
  • Below 50 even for users with over 1000 followers.