
Page 1: Social Network Analysis - UTKweb.eecs.utk.edu/~cphill25/cs594_spring2017/...Social Networks What is a social network? A collection of social entities and their interactions. Typically,

Social Network Analysis

Colin Bird, Clarence Jackson, Brett Hagan

Questions

1. In what year did Leo Katz introduce Katz Centrality?

2. What clustering algorithm did we implement?

3. What was the name of the professor whose student adopted using lines and points to represent social relations?

Colin Bird

Clarence Jackson II

Major: Computer Science

Advisor: Dr. Michael Langston

Research: Graph Theoretics, Machine Learning

Hometown: Flint, MI

Interests: Basketball, gaming, math, tech

Brett Hagan

Computer Science Undergraduate, Senior

Born in Morristown, TN

Hobbies include music production, gaming, and Tennessee athletics

Presentation Outline

Overview

History

Algorithms

Applications

Implementations

Open Issues

Social Networks

What is a social network?

A collection of social entities and their interactions. Typically, these entities are people, but they could represent other things.

Social Network Analysis (SNA)

What is social network analysis?

The process of investigating social structures through the use of networks and graph theory. There is an assumption of non-randomness or locality. This condition is the hardest to formalize, but the intuition is that relationships tend to cluster.

Examples of Social Networks

Telephone Networks

Email Networks

Collaboration Networks

History of SNA

Social structure developed as one of the early key concepts in the social sciences.

In the early 20th century, scientists began to theorize social relationships systematically.

Mathematical and computational models are at the base of more current applications.

History of SNA

Leopold von Wiese (1924/1932) - Adopted the use of lines and points to describe social relations.

Jacob Moreno (1934) - Introduced the idea of depicting social structure as a network diagram (‘sociometry’).

History of SNA

Leo Katz (1953) - Introduced a measure of centrality in a network, used to measure degrees of influence in a social network.

Ithiel de Sola Pool and Manfred Kochen (1978) - Introduced the small-world model, quantifying the distance between people through chains of connections.

Robert Luce & Albert Perry (1949) - First to use graph theoretics for SNA, specifically cliques.

Centrality

Centrality: Indicators of centrality identify the most important vertices in a graph.

Intuitively, the nodes in a social network with higher centrality measures are the nodes we are probably most interested in.

Eigenvector centrality was an early method, but it is inapplicable to directed acyclic graphs.

Katz Centrality: Introduced by Leo Katz in 1953

Katz Centrality

Computes the relative influence of individual nodes within a network.

Takes into account the total number of walks between pairs of nodes.

Let A be the adjacency matrix of the network under consideration. The elements a_ij of A take the value 1 if node i is connected to node j and 0 otherwise. The powers of A indicate the presence (or absence) of links between two nodes through intermediaries.

α is an attenuation factor that penalizes connections made with distant nodes.
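As a rough illustration of the definition above, the sketch below sums α^k times the number of walks of length k that end at each node, truncated after a fixed number of terms. The function name `katz_centrality`, the `max_power` cutoff, the "walks ending at the node" direction convention, and the 3-node path graph are all assumptions made for this example; they do not come from the slides.

```python
def katz_centrality(adj, alpha, max_power=50):
    """Truncated Katz score: sum over k = 1..max_power of
    alpha**k * (number of walks of length k that end at each node)."""
    n = len(adj)
    scores = [0.0] * n
    # walks[i] holds alpha**k * (number of length-k walks ending at i);
    # for k = 0 that is the single empty walk at each node.
    walks = [1.0] * n
    for _ in range(max_power):
        walks = [alpha * sum(adj[j][i] * walks[j] for j in range(n))
                 for i in range(n)]
        scores = [s + w for s, w in zip(scores, walks)]
    return scores


# Hypothetical 3-node path graph 0 - 1 - 2 (not the network from the slides).
path = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
scores = katz_centrality(path, alpha=0.5)
print([round(s, 3) for s in scores])  # [2.0, 3.0, 2.0]
```

The middle node scores highest because more walks end there. The series only converges when α is smaller than the reciprocal of the largest eigenvalue of A, so the cutoff is only meaningful for such α.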

Katz Centrality

Attenuation factor = .5

[Figure: network of Peter, Ashley, Ben, Tim, John, Eric, and Sarah; the two edges from Tim to John and to Ben are each labeled .5]

John and Ben are neighbors of Tim, so the weight assigned to these edges is (.5)^1 = .5

Katz Centrality

Attenuation factor = .5

[Figure: the same network of Peter, Ashley, Ben, Tim, John, Eric, and Sarah]

Katz Centrality

Attenuation factor = .5

[Figure: the same network, with edge weights .5 and .5 at distance one from Tim, .25, .25, and .25 at distance two, and .125 at distance three]

Sarah is a path length of three away from Tim, so the weight of the edge is (.5)^3 = .125

Katz(Tim) = 2(.5) + 3(.25) + (.125) = 1.875

Katz Centrality

Attenuation factor = .3

[Figure: the same network, with edge weights .3 and .3 at distance one from Tim, .09, .09, and .09 at distance two, and .027 at distance three]

Lowering the attenuation factor causes longer paths to have less influence.

Katz(Tim) = 2(.3) + 3(.09) + (.027) = .897
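The arithmetic on these slides can be reproduced with a few lines of code. This sketch follows the simplified reading used in the example, where each node contributes α raised to its path length from Tim. The node counts per distance (two at distance one, three at distance two, one at distance three) are read off the sums above; the function and variable names are hypothetical.

```python
def distance_weighted_score(nodes_at_distance, alpha):
    """Sum alpha**d once for every node that sits d steps away."""
    return sum(count * alpha ** d for d, count in nodes_at_distance.items())


# Distance from Tim -> number of nodes at that distance, as in the slides' sums.
tim = {1: 2, 2: 3, 3: 1}

score_half = distance_weighted_score(tim, 0.5)
score_low = distance_weighted_score(tim, 0.3)
print(round(score_half, 3), round(score_low, 3))  # 1.875 0.897
```

With the smaller factor, Sarah's contribution at distance three drops from .125 to .027, which is the effect the slide describes.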

Clustering

Graph Clustering: Finding sets of related vertices in a graph

From a social network perspective, we might expect different clusters of users to have sets of common friends or interests.

The Markov Clustering Algorithm (MCL)

Developed by Stijn van Dongen in 2000 at the Centre for Mathematics and Computer Science in the Netherlands

The Markov Clustering Algorithm (MCL)

Markov Chain: A sequence of variables where, given the present state, the past and future states are independent

In MCL, the variables in the Markov chain are stochastic (probability) matrices.

The Markov Clustering Algorithm (MCL)

During early powers of the Markov chain, edge weights are higher in links within clusters and lower in links between clusters.

MCL boosts this effect using two mathematical operations:

Expansion: Taking powers of the Markov chain transition matrix

Inflation: Raising the entries of each column to a non-negative power, followed by normalization

The Markov Clustering Algorithm (MCL)

Expansion: Allows flow to connect to different parts of the graph

Inflation: Responsible for both strengthening strong flow and weakening weak flow. Corresponds to taking the Hadamard (entrywise) power of the matrix, followed by normalization.
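To make the inflation step concrete, here is a minimal sketch applied to a single column of a stochastic matrix. The helper name `inflate_column` and the example probabilities are assumptions chosen for illustration, not values from the presentation.

```python
def inflate_column(column, r):
    """Raise each entry to the power r, then rescale so the column sums to 1."""
    powered = [x ** r for x in column]
    total = sum(powered)
    return [x / total for x in powered]


column = [0.5, 0.3, 0.2]  # hypothetical transition probabilities
inflated = inflate_column(column, r=2)
print([round(x, 3) for x in inflated])  # [0.658, 0.237, 0.105]
```

The largest entry grows and the smaller ones shrink, while the column still sums to 1.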

The Markov Clustering Algorithm (MCL)

1. Input graph, power parameter e, and inflation parameter r

2. Create adjacency matrix

3. Normalize adjacency matrix to create stochastic probability matrix

4. Expansion: raise the matrix to the power e

5. Inflation: apply the inflation operation with parameter r

6. Alternate between expansion and inflation until convergence

7. Interpret the resulting matrix to extract clusters
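The steps above can be sketched in plain Python as follows. This is a small illustration under stated assumptions, not the presenters' implementation: the function names, the added self-loops, the convergence tolerance, the 1e-6 cutoff used to read clusters off the final matrix, and the two-triangle test graph are all choices made for this example.

```python
def normalize_columns(m):
    """Scale every column so that it sums to 1 (column-stochastic)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(n, edges, e=2, r=2.0, max_iter=100, tol=1e-9):
    # Step 2: adjacency matrix of an undirected graph.
    # Self-loops are added here as an assumption; the slides do not mention them.
    m = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for u, v in edges:
        m[u][v] = m[v][u] = 1.0
    # Step 3: normalize into a stochastic matrix.
    m = normalize_columns(m)
    for _ in range(max_iter):
        # Step 4: expansion, the matrix raised to the power e.
        expanded = m
        for _step in range(e - 1):
            expanded = matmul(expanded, m)
        # Step 5: inflation, entrywise power r followed by normalization.
        inflated = normalize_columns([[x ** r for x in row] for row in expanded])
        # Step 6: stop once the matrix no longer changes.
        change = max(abs(inflated[i][j] - m[i][j])
                     for i in range(n) for j in range(n))
        m = inflated
        if change < tol:
            break
    # Step 7: each row that still carries flow lists the members of one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6)
                for i in range(n)}
    clusters.discard(frozenset())
    return sorted(sorted(c) for c in clusters)


# Hypothetical graph: two triangles joined by the single edge 2 - 3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
clusters = mcl(6, edges)
print(clusters)  # [[0, 1, 2], [3, 4, 5]]
```

Dense list-of-lists matrices keep the sketch dependency-free; a graph of realistic size would need sparse matrices and pruning of near-zero entries.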

Applications

Crime (FBI, NSA, terror prevention)

Health care (primary care pattern analysis)

Mining social media data

Implementation and results

● First, we needed to collect the data

● Started by using tools we had already created

● Used open source libraries for betweenness centrality

● Implemented Markov Clustering

Cooking the data

● Facebook

○ Click Farm 1 (500 bots ~ 20% densely interconnected)

○ Click Farm 2 (500 people)

● Twitter

○ Click Farm 1 (10k bots ~ 2% densely interconnected)

○ Click Farm 2 (10k people)

MCL Clustering of Facebook w/ Click Farms

MCL Facebook actual

Facebook w/ MCL Clustering close up

Without MCL Clustering

With MCL Reduction

Open Issues

Adversarial nature of the problem (for click farms)

False positives are harmful

Large, dynamic data

References

Otte, Evelien; Rousseau, Ronald (2002). "Social network analysis: a powerful strategy, also for the information sciences". Journal of Information Science. 28 (6): 441–453. doi:10.1177/016555150202800601. Retrieved 2015-03-23.

Katz, Leo (1953). "A New Status Index Derived from Sociometric Analysis". Psychometrika. 18 (1): 39–43.

http://phya.snu.ac.kr/~dkim/PRL87278701.pdf

https://www.researchgate.net/publication/281368621_Network_Analysis_History_of?enrichId=rgreq-93ca7beab51cc6ca0390597397632fb3-XXX&enrichSource=Y292ZXJQYWdlOzI4MTM2ODYyMTtBUzoyNjg0MTYxNzkyNDA5NjNAMTQ0MTAwNjgxMjI4Nw%3D%3D&el=1_x_2&_esc=publicationCoverPdf

Discussion

Questions

1. In what year did Leo Katz introduce Katz Centrality?

2. What clustering algorithm did we implement?

3. What was the name of the professor whose student adopted using lines and points to represent social relations?