![Page 1: Social Network Analysis. Outline l Background of social networks –Definition, examples and properties l Data in social networks –Data creation, flow and](https://reader036.vdocuments.us/reader036/viewer/2022081511/5697bfa11a28abf838c95e6a/html5/thumbnails/1.jpg)
Social Network Analysis
Outline
Background of social networks
– Definition, examples and properties
Data in social networks
– Data creation, flow and storage
Analytic tasks in social networks
– Problems, solutions and examples
Summary
What is a Social Network?
A definition from Wikipedia
– A social network is a social structure made up of a set of social actors (such as individuals or organizations) and a set of the dyadic ties between these actors.
– Social network analysis: analyze the structure of the whole network, identify local and global patterns, locate influential entities, and examine network dynamics.
Social Network Representation
Graph Representation
Matrix Representation
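A minimal sketch of how the two representations relate. The four-node edge list is a hypothetical example (the slide's own figure is not part of this transcript); the code turns the graph view (nodes and ties) into the matrix view, where entry A[i][j] is 1 when nodes i and j are tied.

```python
# Hypothetical undirected network; node labels and edges are illustrative only.
nodes = [1, 2, 3, 4]
edges = [(1, 2), (1, 3), (2, 3), (3, 4)]


def adjacency_matrix(nodes, edges):
    """Return the symmetric 0/1 adjacency matrix as a list of rows."""
    index = {v: i for i, v in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1
        matrix[index[v]][index[u]] = 1  # undirected tie: mirror the entry
    return matrix


A = adjacency_matrix(nodes, edges)
for row in A:
    print(row)
```

The row sums of the matrix give the node degrees, which is why the matrix form is convenient for the measures introduced later in the deck.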
Social Network: Examples
The Scale and Growth of Social Networks
Facebook statistics
– 829 million daily active users on average in June 2014
– 1.32 billion monthly active users as of June 30, 2014
– 81.7% of daily active users are outside the U.S. and Canada
– 22% increase in Facebook users from 2012 to 2013
Facebook activities (every 20 minutes on Facebook)
– 1 million links shared
– 2 million friends requested
– 3 million messages sent
http://newsroom.fb.com/company-info/
http://www.statisticbrain.com/facebook-statistics/
Visualizing Friendships on Facebook
The Scale and Growth of Social Networks
Twitter statistics
– 271 million monthly active users in 2014
– 135,000 new users signing up every day
– 78% of Twitter active users are on mobile
– 77% of accounts are outside the U.S.
Twitter activities
– 500 million Tweets are sent per day
– 9100 Tweets are sent per second
https://about.twitter.com/company
http://www.statisticbrain.com/twitter-statistics/
A Tweet Map of America
Properties of Large-Scale Social Networks
Scale-free distributions
Small-world effect
Strong community structure
Scale-free Distributions
The degree distribution in large-scale networks often follows a power law, that is, the fraction p(x) of nodes in the network having x connections to other nodes decays for large values of x as:
$p(x) \sim x^{-\alpha}$, where $\alpha > 0$ is a constant exponent
A.k.a. long tail distribution, scale-free distribution
Log-log Plot
A power law distribution becomes a straight line when plotted on a log-log scale
[Figure: two log-log plots, "Friendship Network in Flickr" and "Friendship Network in YouTube"]
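Why the line is straight: for a power law of the form p(x) = C * x^(-alpha), taking logarithms gives log p(x) = log C - alpha * log x, which is linear in log x with slope -alpha. The sketch below checks this on a synthetic distribution; the constants `alpha` and `C` and the helper `loglog_slope` are assumptions made for illustration, not values from the Flickr or YouTube data. A least-squares fit on log-log axes is only a rough estimate for real, noisy degree data.

```python
import math


def loglog_slope(xs, ps):
    """Least-squares slope of log(p) against log(x)."""
    lx = [math.log(x) for x in xs]
    lp = [math.log(p) for p in ps]
    mean_x = sum(lx) / len(lx)
    mean_p = sum(lp) / len(lp)
    cov = sum((a - mean_x) * (b - mean_p) for a, b in zip(lx, lp))
    var = sum((a - mean_x) ** 2 for a in lx)
    return cov / var


# Synthetic, illustrative distribution: p(x) = C * x**(-alpha).
alpha, C = 2.5, 0.7
xs = list(range(1, 101))
ps = [C * x ** (-alpha) for x in xs]

slope = loglog_slope(xs, ps)
print(round(slope, 3))  # the fitted slope recovers -alpha
```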
Small-world Effect
“Six Degrees of Separation”
A famous experiment conducted by Travers and Milgram (1969)
– Subjects were asked to send a chain letter to an acquaintance in order to reach a target person
– The average path length is around 5.5
Verified on a planetary-scale IM network of 180 million users (Leskovec and Horvitz 2008)
– The average path length is 6.6
Facebook users (721 million) were separated by 4.74 degrees as of May 2011.
Diameter
Measures used to calibrate the small world effect
– Diameter: the longest shortest path distance in a network
– Average shortest path length
Example
– The shortest distance between node 1 and node 9 is 4.
– The diameter of the network is 5, corresponding to the shortest distance between nodes 2 and 9.
[Figure: two panels, "Shortest Path" and "The Longest Shortest Path"]
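Both measures can be computed with breadth-first search from every node. This is a sketch under one stated assumption: the 9-node edge list is reconstructed from the node degrees, neighbor sets and size-3 cliques quoted elsewhere in this deck, because the figure itself is not part of the transcript.

```python
from collections import deque

# Assumed edge list, reconstructed from the degrees and cliques quoted in the deck.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6),
         (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Shortest-path (hop) distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


adj = build_adjacency(EDGES)
all_dist = {u: bfs_distances(adj, u) for u in adj}

# Diameter: the largest of all shortest-path distances.
diameter = max(max(d.values()) for d in all_dist.values())

# Average over all ordered pairs of distinct nodes (graph assumed connected).
n = len(adj)
avg_path_length = sum(sum(d.values()) for d in all_dist.values()) / (n * (n - 1))

print(all_dist[1][9], diameter, round(avg_path_length, 2))
```

On the assumed network this reproduces the two values stated in the example: distance 4 between nodes 1 and 9, and diameter 5 between nodes 2 and 9.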
Community Structure
Community: People in a group interact with each other more frequently than with those outside the group
Friends of a friend are likely to be friends as well
Measured by clustering coefficient:
– density of connections among one’s friends
Clustering Coefficient
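Example for node 6 (d: degree, N: set of neighbors, k: number of edges among the neighbors)
d6 = 4, N6 = {4, 5, 7, 8}
k6 = 4, as e(4,5), e(5,7), e(5,8), e(7,8) connect neighbors of node 6
C6 = k6 / (d6*(d6-1)/2) = 4/(4*3/2) = 2/3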
Average clustering coefficient
C = (C1 + C2 + … + Cn)/n
C = 0.61 for the left network
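The calculation above can be sketched in a few lines. Two assumptions are made here: the edge list is reconstructed from the degrees, neighbor sets and cliques quoted in this deck (the figure is not in the transcript), and nodes with fewer than two neighbors are given a coefficient of 0, a common convention that the slide does not state.

```python
from itertools import combinations

# Assumed edge list, reconstructed from the degrees and cliques quoted in the deck.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6),
         (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def clustering_coefficient(adj, node):
    """Edges among the node's neighbors divided by the number of neighbor pairs."""
    neighbors = adj[node]
    d = len(neighbors)
    if d < 2:
        return 0.0  # assumed convention: no neighbor pairs means C = 0
    k = sum(1 for u, v in combinations(neighbors, 2) if v in adj[u])
    return k / (d * (d - 1) / 2)


adj = build_adjacency(EDGES)
c6 = clustering_coefficient(adj, 6)
average_c = sum(clustering_coefficient(adj, v) for v in adj) / len(adj)
print(round(c6, 3), round(average_c, 2))
```

With these assumptions the sketch gives C6 = 2/3 and an average of about 0.61, matching the figures on the slide.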
Data in Social Networks
Data creation
Data flow
Data storage
Data Creation in Social Networks
User profiles and relationships
User-generated content
– Text (blogs, microblogs, messages, reviews, etc.), e.g., 500 million tweets are sent per day.
– Images, audio, and video, e.g., 100 hours of video are uploaded to YouTube every minute.
Distinction from Content in Traditional Media (Newspaper, TV, etc.)
Inexpensive to generate and publish
Widely accessible
Varying quality
Rich user interaction
Data Flow Architecture at Facebook
Hadoop: a distributed file system and map-reduce platform
Scribe: a distributed and scalable data bus that aggregates logs from web servers
Hive: a data warehousing framework for reporting, querying and analysis
Federated MySQL: contains all the Facebook site related data
[Thusoo et al., SIGMOD’10]
Data Storage at Facebook
The production cluster usually has to hold only one month’s worth of data
The ad hoc cluster needs to hold all the historical data, so that measures, models and hypotheses can be tested on historical data
Data is compressed with gzip, giving a compression factor of 6-7
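A compression factor is the uncompressed size divided by the compressed size. The sketch below measures it with Python's standard `gzip` module on synthetic log lines; the log format is made up for illustration and is not Facebook data, so the factor it prints says nothing about the 6-7 figure quoted above.

```python
import gzip

# Synthetic, repetitive log lines (illustrative only; not Facebook data).
lines = [f"2014-06-30T12:00:{i % 60:02d} GET /profile?id={i} 200\n"
         for i in range(10000)]
raw = "".join(lines).encode("utf-8")

compressed = gzip.compress(raw)
factor = len(raw) / len(compressed)  # uncompressed bytes per compressed byte
print(f"compression factor: {factor:.1f}")

# gzip is lossless: decompressing returns the original bytes.
restored = gzip.decompress(compressed)
```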
Cold Data Storage
Facebook uses 10,000 Blu-ray discs to store a petabyte (=1,000,000 GB) of ‘cold’ data that hardly ever needs to be accessed, including duplicates of its users’ photos and videos that Facebook keeps for backup purposes.
The Blu-ray system reduces costs by 50% and energy use by 80% compared with its current cold-storage system, which uses hard disk drives.
Server Racks in Facebook’s Data Center
Data Analytic Tasks in Social Networks
Community detection
Friend recommendation
Importance of nodes
Influence propagation
Event detection
Community Detection
What is a Community?
Community: a group formed by individuals such that those within the group interact with each other more frequently than with those outside the group
– a.k.a. group, cluster, cohesive subgroup, module in different contexts
Two types of groups in social networks
– Explicit Groups: formed by user subscriptions
– Implicit Groups: implicitly formed by social interactions
Community Example
[McAuley and Leskovec, NIPS’2012]
Subjectivity of Community Definition
[Figure: two panels, "Each component is a community" and "A densely-knit community"]
The definition of a community can be subjective.
Community Detection
Community detection: discovering groups in a network where individuals’ group memberships are not explicitly given
Some social media sites allow people to join groups. Is it still necessary to extract groups based on network topology?
– Not all sites provide a community platform
– Not all people want to make the effort to join groups
– Groups can change dynamically
Community Detection based on Cliques
Clique: a complete subgraph, i.e., a set of nodes that are all adjacent to each other; the maximum clique is the largest such subgraph in the network
In a clique of size k, each node maintains degree >= k-1 (for example, node 7 with degree 4)
Nodes with degree < k-1 will not be included in the clique (for example, node 9 with degree 1)
Nodes 5, 6, 7 and 8 form a clique of size 4
Maximum Clique Example
To find a clique of size > 3, repeatedly remove all nodes with degree <= 3-1 = 2
– Step 1. Remove nodes 2 and 9
– Step 2. Remove nodes 1 and 3
– Step 3. Remove node 4
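The three removal steps can be sketched as an iterative pruning loop: a node of degree below k-1 cannot belong to a clique of size k, and removing it may push other nodes below the threshold. The edge list is an assumption, reconstructed from the degrees and cliques quoted in this deck, and `prune_low_degree` is a hypothetical helper name.

```python
# Assumed edge list, reconstructed from the degrees and cliques quoted in the deck.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6),
         (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def prune_low_degree(adj, k):
    """Repeatedly drop nodes of degree < k - 1 (they cannot be in a size-k clique).

    Returns the surviving nodes and the list of nodes removed in each round.
    """
    adj = {u: set(vs) for u, vs in adj.items()}  # work on a copy
    rounds = []
    while True:
        low = sorted(u for u, vs in adj.items() if len(vs) < k - 1)
        if not low:
            return set(adj), rounds
        for u in low:
            for v in adj.pop(u):
                if v in adj:  # neighbor may already be gone in this round
                    adj[v].discard(u)
        rounds.append(low)


survivors, rounds = prune_low_degree(build_adjacency(EDGES), k=4)
print(rounds, sorted(survivors))
```

Pruning only narrows the candidates; the surviving nodes still have to be checked for pairwise adjacency. Here the survivors 5, 6, 7 and 8 do form a clique of size 4.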
Clique Percolation Method (CPM)
Clique is a very strict definition, unstable
Normally use cliques as a core or a seed to find larger communities
CPM is such a method to find overlapping communities
– Input
  A parameter k, and a network
– Procedure
  Find out all cliques of size k in a given network
  Construct a clique graph. Two cliques are adjacent if they share k-1 nodes
  Each connected component in the clique graph forms a community
CPM Example
Cliques of size 3: {1, 2, 3}, {1, 3, 4}, {4, 5, 6}, {5, 6, 7}, {5, 6, 8}, {5, 7, 8}, {6, 7, 8}
Communities: {1, 2, 3, 4} and {4, 5, 6, 7, 8}
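A brute-force sketch of the CPM procedure for small networks: enumerate all size-k cliques, link cliques that share k-1 nodes, and merge each connected group of cliques into one community. The edge list is an assumption reconstructed from the cliques and degrees quoted in this deck; the union-find bookkeeping is one possible way to find the connected components of the clique graph, not necessarily the method used in the original work.

```python
from itertools import combinations

# Assumed edge list, reconstructed from the degrees and cliques quoted in the deck.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6),
         (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def k_cliques(adj, k):
    """All node sets of size k whose members are pairwise adjacent (brute force)."""
    return [frozenset(c) for c in combinations(sorted(adj), k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def clique_percolation(adj, k):
    """Communities = unions of size-k cliques connected by sharing k - 1 nodes."""
    cliques = k_cliques(adj, k)
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(communities.values(), key=sorted)


adj = build_adjacency(EDGES)
communities = clique_percolation(adj, 3)
print(communities)
```

Node 4 appears in both communities, which illustrates the overlap CPM allows; node 9 belongs to no size-3 clique and is left unassigned.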
Friend Recommendation
Friend Recommendation Example
What is Friend Recommendation?
Given a snapshot of a social network, can we recommend new friendships among its members that are likely to occur in the near future?
– a.k.a. link prediction
Observation: Users do not form friendships at random with all other users. Instead, they tend to prefer other users that are “close” to them.
Popular Link Prediction Heuristics
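For a pair of nodes $(x, y)$, compute $\text{score}(x, y)$ to estimate the proximity between nodes x and y using the following heuristics, where $N_x$ denotes the set of neighbors of node x:

| Heuristic | Score Definition |
| --- | --- |
| shortest path distance | $-d(x, y)$, the negated length of the shortest path between x and y |
| common neighbors | $\lvert N_x \cap N_y \rvert$ |
| Adamic/Adar | $\sum_{z \in N_x \cap N_y} \frac{1}{\log \lvert N_z \rvert}$ |
| ensemble of all paths | $\sum_{\ell=1}^{\infty} \beta^{\ell} \cdot \lvert \text{paths}_{x,y}^{\langle \ell \rangle} \rvert$, where $\text{paths}_{x,y}^{\langle \ell \rangle}$ = {paths of length exactly $\ell$ from x to y} |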
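A sketch of the four heuristics, using the score definitions from Liben-Nowell & Kleinberg (2003), which this deck cites. Several things are assumptions: the edge list is reconstructed from the degrees and cliques quoted elsewhere in the deck, the pair (4, 7) is chosen only for illustration, and the "ensemble of all paths" score is truncated at `max_len` with a damping factor `beta` and counts walks (node repeats allowed), as the usual matrix formulation does.

```python
import math
from collections import deque

# Assumed edge list, reconstructed from the degrees and cliques quoted in the deck.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6),
         (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_score(adj, x, y):
    """Negated hop distance from x to y (BFS); -inf if y is unreachable."""
    dist = {x: 0}
    queue = deque([x])
    while queue:
        u = queue.popleft()
        if u == y:
            return -dist[u]
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return -math.inf


def common_neighbors(adj, x, y):
    return len(adj[x] & adj[y])


def adamic_adar(adj, x, y):
    # A common neighbor of two distinct nodes has degree >= 2, so log(degree) > 0.
    return sum(1 / math.log(len(adj[z])) for z in adj[x] & adj[y])


def katz_truncated(adj, x, y, beta=0.1, max_len=3):
    """Sum of beta**l * (number of length-l walks from x to y) for l <= max_len."""
    counts = {x: 1}  # number of walks of the current length ending at each node
    score = 0.0
    for length in range(1, max_len + 1):
        nxt = {}
        for u, c in counts.items():
            for v in adj[u]:
                nxt[v] = nxt.get(v, 0) + c
        counts = nxt
        score += beta ** length * counts.get(y, 0)
    return score


adj = build_adjacency(EDGES)
x, y = 4, 7  # a non-adjacent pair chosen for illustration
sp = shortest_path_score(adj, x, y)
cn = common_neighbors(adj, x, y)
aa = adamic_adar(adj, x, y)
katz = katz_truncated(adj, x, y)
print(sp, cn, round(aa, 3), round(katz, 3))
```

Higher scores mean closer pairs for every heuristic, which is why the shortest path distance is negated.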
Link Prediction Heuristics Example
Common neighbors
Adamic/Adar
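ensemble of all paths: $\beta^2 \cdot 2 + \beta^3 \cdot 1 + \beta^4 \cdot 1$ (two length-2, one length-3, and one length-4 paths), where $\beta$ is the damping factor applied per unit of path length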
Link Prediction Accuracy
[Figure: link prediction accuracy* of Random, Shortest Path, Common Neighbors, Adamic/Adar, and Ensemble of short paths]
*Liben-Nowell & Kleinberg, 2003; Brand, 2005; Sarkar & Moore, 2007; Sarkar, 2010
The number of paths matters, not the length
For large dense graphs, common neighbors are enough
Differentiating between different degrees is important
In sparse graphs, length 3 or more paths help in prediction.
Importance of Nodes
Importance of Nodes
Not all nodes are equally important
Find out the most important nodes (influential entities) in one network
Commonly-used measures
– Degree Centrality
– Closeness Centrality
Degree Centrality
The importance of a node is determined by the number of nodes adjacent to it
– The larger the degree, the more important the node is
– Only a small number of nodes have high degrees in many real-life networks
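Degree Centrality: $C_D(v_i) = d_i$, the degree of node $v_i$
Normalized Degree Centrality: $C'_D(v_i) = d_i / (n - 1)$, where n is the number of nodes in the network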
Degree Centrality Example
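Which node is the most important in the network?

| Node | Degree centrality | Normalized degree centrality |
| --- | --- | --- |
| 1 | 3 | 3/8 |
| 2 | 2 | 2/8 |
| 3 | 3 | 3/8 |
| 4 | 4 | 4/8 |
| 5 | 4 | 4/8 |
| 6 | 4 | 4/8 |
| 7 | 4 | 4/8 |
| 8 | 3 | 3/8 |
| 9 | 1 | 1/8 |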
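The table can be reproduced with a short sketch. The edge list is an assumption (reconstructed from the cliques and degrees quoted in this deck, since the figure is not in the transcript); the normalization divides by n - 1 = 8, which is what the table's denominators indicate.

```python
from fractions import Fraction

# Assumed edge list, reconstructed from the degrees and cliques quoted in the deck.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6),
         (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def degree_centrality(edges):
    """Count how many edges touch each node."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return degree


degree = degree_centrality(EDGES)
n = len(degree)
# Fraction reduces automatically, so 4/8 is shown as 1/2.
normalized = {v: Fraction(d, n - 1) for v, d in degree.items()}

for v in sorted(degree):
    print(v, degree[v], normalized[v])
```

Nodes 4, 5, 6 and 7 tie for the highest degree, so degree centrality alone does not single out one most important node in this network.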
Closeness Centrality
“Central” nodes are important, as they can reach the whole network more quickly than non-central nodes
Importance measured by how close a node is to other nodes
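Average Distance: $D_{avg}(v_i) = \frac{1}{n-1} \sum_{j \neq i} g(v_i, v_j)$, where $g(v_i, v_j)$ is the shortest path distance between nodes $v_i$ and $v_j$
Closeness Centrality: $C_C(v_i) = \left[ \frac{1}{n-1} \sum_{j \neq i} g(v_i, v_j) \right]^{-1} = \frac{n-1}{\sum_{j \neq i} g(v_i, v_j)}$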
Closeness Centrality Example
Node 4 is more central than node 3
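The comparison can be checked with a sketch that takes closeness as the inverse of the average shortest-path distance to all other nodes. The edge list is an assumption (reconstructed from the degrees and cliques quoted in this deck), and the code assumes a connected network so that every distance is finite.

```python
from collections import deque

# Assumed edge list, reconstructed from the degrees and cliques quoted in the deck.
EDGES = [(1, 2), (1, 3), (1, 4), (2, 3), (3, 4), (4, 5), (4, 6), (5, 6),
         (5, 7), (5, 8), (6, 7), (6, 8), (7, 8), (7, 9)]


def build_adjacency(edges):
    """Map each node to the set of its neighbors (undirected graph)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def closeness_centrality(adj, source):
    """(n - 1) divided by the sum of hop distances from source (connected graph)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return (len(adj) - 1) / sum(dist.values())


adj = build_adjacency(EDGES)
c3 = closeness_centrality(adj, 3)
c4 = closeness_centrality(adj, 4)
print(round(c3, 2), round(c4, 2))
```

On the assumed network the distances from node 3 sum to 17 and those from node 4 sum to 13, giving 8/17 for node 3 and 8/13 for node 4, consistent with node 4 being more central.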
Summary
In this lecture, we introduce
– social networks, examples and their properties
– data creation, flow and storage in social networks
– social network analysis tasks, applications and case studies
References
Community Detection and Mining in Social Media. Lei Tang and Huan Liu, Morgan & Claypool, September, 2010.