
Post on 28-Jun-2020


Page 1: Community Dynamicsshilpaa/Community_Dynamics.pdf · Visualizing the Evolution of Subgroups in Social Networks. Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web


Community Dynamics
Course: Analysis of Social Media

Shilpa Arora
Language Technologies Institute
School of Computer Science
Carnegie Mellon University


Papers Presented

• Tanja Falkowski and Myra Spiliopoulou. Data Mining for Community Dynamics. Künstliche Intelligenz (Journal), 2007, pp. 23-29.

• Tanja Falkowski, Jörg Bartelheimer and Myra Spiliopoulou. Mining and Visualizing the Evolution of Subgroups in Social Networks. Proceedings of the 2006 IEEE/WIC/ACM International Conference on Web Intelligence.

• Tanja Falkowski, Jörg Bartelheimer and Myra Spiliopoulou. Community Dynamics Mining. In Proc. of the 14th European Conference on Information Systems (ECIS 2006), Göteborg, Sweden, 2006.

• Main References:
  - Michelle Girvan and M.E.J. Newman. Community structure in social and biological networks. Proc. Natl. Acad. Sci. USA, 2002.
  - Filippo Radicchi, Claudio Castellano, Federico Cecconi, Vittorio Loreto, and Domenico Parisi. Detecting and identifying communities in networks. Proc. Natl. Acad. Sci. USA, 2004.


Authors

Tanja Falkowski, Research Associate, Department for Information Systems, Otto-von-Guericke-University, Magdeburg

Myra Spiliopoulou, Professor of Business Informatics, Computer Science, Otto-von-Guericke-University, Magdeburg

Jörg Bartelheimer, Viadeo S.A.


Outline

• Introduction
• Motivation
• Detecting Communities
• Community Dynamics
• Visualization Tools
• Results
• Conclusion


Introduction

• Community of Practice: a group of people with common interests who interact to exchange their knowledge about the topic of interest

• Highly dynamic social network
  - Structure changes over time
  - Members and interactions fluctuate
  - Community: a persistent structure in a graph of interactions among fluctuating members
  - Community instance: a densely connected subgroup that is only loosely connected to the rest of the graph
  - Communities = clusters of similar community instances across time


Motivation

• Communities in
  - social networks: social groupings
  - citation networks: related papers on a single topic
  - the web: pages on related topics

• Organizations encourage community development to facilitate knowledge sharing

• Factors affecting communities
  - Internal: infrastructure, leadership
  - External: publicity

• Detecting points of structural change
  - Predict similar future behavior


Prior Work

• Community detection using aggregated data; drawbacks:
  - All interactions over time are treated equally, so large aggregates dominate rather than the most current interactions
  - Transitions in interaction behavior cannot be observed, e.g. merging communities or periodically active communities

• Vertex- and edge-level tools
  - SoNIA (Moody et al., 2005) and TeCFlow (Gloor & Zhao, 2004)
  - A change in the behavior of a single actor is captured, but the dynamics of groups cannot be observed


Proposed Approach

• Dynamic temporal observation of communities
  - Time windows
  - Detect community instances in each window
  - Compare community instances across time windows
  - Link together similar community instances to form a community
  - Interactively visualize changes in community structure


Community Detection

• Graph G = (V, E): V = nodes (members), E = edges (interactions)

• Weights: number of messages exchanged or total length of messages
  - Aggregated weights favor old members
  - Weights are assigned for each time window
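A minimal sketch of per-window edge weights, assuming each interaction is a `(day, sender, receiver)` tuple with `day` counted from the start of observation. The function name `window_edge_weights`, the tuple layout, and the sample messages are hypothetical. The weight used here is the message count; total message length would work the same way.

```python
from collections import defaultdict

def window_edge_weights(messages, window_days=14):
    """Group (day, sender, receiver) messages into fixed-size time windows.

    Returns {window_index: {frozenset({u, v}): message_count}}.
    The input format and the window length are assumptions for illustration.
    """
    windows = defaultdict(lambda: defaultdict(int))
    for day, sender, receiver in messages:
        if sender == receiver:
            continue  # ignore self-messages
        w = day // window_days  # index of the window this message falls into
        windows[w][frozenset((sender, receiver))] += 1
    return {w: dict(edges) for w, edges in windows.items()}

# Hypothetical sample data: three messages in window 0, two in window 1
messages = [(0, "a", "b"), (3, "b", "a"), (13, "a", "c"),
            (14, "a", "b"), (20, "c", "d")]
weights = window_edge_weights(messages)
```

Because weights are counted separately in each window, an old member with many past messages does not dominate a later window.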


Community Detection

• Hierarchical divisive clustering
  - Iterative removal of edges that do not contribute to a community
  - Two measures for finding edges to be removed:
    - Edge betweenness (Girvan et al., 2002)
    - Edge clustering coefficient (Radicchi et al., 2004)


Community Detection

• Edge betweenness (Girvan et al., 2002)
  - Edges that are least central to communities = edges between the communities
  - Edge betweenness: number of shortest paths between pairs of vertices that include this edge
  - Communities = densely connected subgraphs that are loosely connected to each other, so there are few inter-group edges through which shortest paths go
  - Global quantity that uses properties of the whole system
  - Complexity: O(m^2 n), m = number of edges, n = number of vertices
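The edge-removal idea can be sketched as follows. This is an illustrative pure-Python version for small unweighted graphs, not the implementation used in the presented papers. Function names and the toy graph (two triangles joined by a bridge) are hypothetical.

```python
from collections import defaultdict, deque

def edge_betweenness(adj):
    """Edge betweenness via BFS path counting (unweighted, undirected graph)."""
    bet = defaultdict(float)
    for s in adj:
        dist, sigma, preds, order = {s: 0}, defaultdict(float), defaultdict(list), []
        sigma[s] = 1.0
        queue = deque([s])
        while queue:  # BFS: count shortest paths and record predecessors
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = defaultdict(float)
        for w in reversed(order):  # accumulate path shares, farthest vertex first
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                bet[frozenset((v, w))] += share
                delta[v] += share
    # every vertex pair was counted from both endpoints
    return {e: b / 2.0 for e, b in bet.items()}

def components(adj):
    """Connected components as a list of vertex sets."""
    seen, comps = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in comp:
                    comp.add(w)
                    queue.append(w)
        seen |= comp
        comps.append(comp)
    return comps

def split_once(adj):
    """Remove highest-betweenness edges until the graph gains a component."""
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    start = len(components(adj))
    while len(components(adj)) == start:
        bet = edge_betweenness(adj)  # recomputed after every removal
        if not bet:
            break
        u, v = max(bet, key=bet.get)
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)

# Toy graph: triangles a-b-c and d-e-f joined by the bridge c-d
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)
bet = edge_betweenness(adj)
parts = split_once(adj)
```

In the toy graph all nine cross-triangle shortest paths run over the bridge c-d, so it is removed first and the two triangles become separate groups. Recomputing betweenness after each removal is what drives the cost noted on the slide.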


Community Detection

• Edge clustering coefficient (Radicchi et al., 2004)

  $C_{ij}^{(3)} = \dfrac{z_{ij}^{(3)} + 1}{\min(k_i - 1,\; k_j - 1)}$

  - $z_{ij}^{(3)}$ = number of triangles to which the edge (i, j) belongs
  - $k_i$ = degree of vertex i
  - $\min(k_i - 1,\; k_j - 1)$ = maximum possible number of triangles including that edge
  - Intuition: edges between communities belong to fewer short loops; many such triangles occur within communities
  - Added advantage: nodes with only one connection are not split off as isolated communities, as the coefficient is infinite for their unique edges
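As an illustration, the triangle-based coefficient of Radicchi et al. (number of triangles containing the edge plus one, divided by min(k_i - 1, k_j - 1)) can be computed from neighbor sets. The function name and the toy graph are hypothetical. The graph is two triangles joined by a bridge, plus a single-link node `g`.

```python
import math

def edge_clustering_coefficient(adj, u, v):
    """Triangle-based edge clustering coefficient for the edge (u, v)."""
    triangles = len(adj[u] & adj[v])  # common neighbors close a triangle
    max_triangles = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if max_triangles == 0:
        return math.inf  # edge of a degree-1 node is never removed first
    return (triangles + 1) / max_triangles

# Toy graph: triangles a-b-c and d-e-f, bridge c-d, leaf g attached to f
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f"), ("f", "g")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

inside = edge_clustering_coefficient(adj, "a", "b")
bridge = edge_clustering_coefficient(adj, "c", "d")
leaf = edge_clustering_coefficient(adj, "f", "g")
```

The bridge gets the lowest value, so a divisive procedure that removes the edge with the smallest coefficient cuts between the triangles and leaves the leaf edge alone. Only the neighborhoods of the two endpoints are needed, which is why the measure is local.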


Community Detection

• Betweenness vs. clustering coefficient (Radicchi et al., 2004)
  - Compared on an artificial test graph with four communities
  - Comparable performance
  - Global quantity (betweenness) vs. local quantity (clustering coefficient)
  - Complexity (m = # edges, n = # nodes)
    - Betweenness: O(m^2 n)
    - Clustering coefficient: O(m^4 / n^2)
  - Strong anti-correlation between the two measures, but not perfect


[Figure: fraction of correctly classified nodes for each approach]
R0 = fraction of nodes correctly classified
pout = probability with which pairs of nodes in different groups are connected
GN = edge betweenness approach; edge clustering approach with g = 3 (triangle loops) and g = 4 (square loops)

[Figure: average time needed to analyze a random graph of N nodes]


Community Detection

• Result of hierarchical clustering: dendrogram

• Meaningful network partition (modularity, Newman and Girvan, 2004)
  - Fraction of edges connecting nodes within a community minus the expected value of the same quantity in a network with the same community structure but random edges
  - Look for peaks in modularity values: good split points


Community Detection

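• Modularity (Q):

  $Q = \sum_i \left( e_{ii} - a_i^2 \right) = \operatorname{Tr} e - \lVert e^2 \rVert$

  - e = k x k matrix, k = # communities; $e_{ij}$ = fraction of all edges in the network that link vertices in community i to vertices in community j
  - $\lVert x \rVert$ = sum of the elements of matrix x
  - $\operatorname{Tr} e = \sum_i e_{ii}$ = trace of e, the fraction of edges that connect vertices in the same community
  - $a_i = \sum_j e_{ij}$ = fraction of edges that connect to vertices in community i
  - In a network where edges fall between vertices without regard for the communities they belong to, we have $e_{ij} = a_i a_j$
  - Q = 0: no more within-community edges than expected at random; Q approaching 1 (the maximum): strong community structure; Q > 0.3: significant community structure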
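A minimal sketch of the modularity computation for an unweighted graph, building the matrix e explicitly and then summing e_ii - a_i^2 with a_i taken as the row sum of e. The function name, the edge-list input, and the toy graph are hypothetical.

```python
def modularity(edges, community_of):
    """Modularity Q = sum_i (e_ii - a_i^2) for an undirected edge list.

    community_of maps each vertex to a community label.
    """
    m = len(edges)
    labels = sorted(set(community_of.values()))
    index = {c: i for i, c in enumerate(labels)}
    k = len(labels)
    e = [[0.0] * k for _ in range(k)]
    for u, v in edges:
        i, j = index[community_of[u]], index[community_of[v]]
        # split each edge evenly between e[i][j] and e[j][i] so e is symmetric
        e[i][j] += 0.5 / m
        e[j][i] += 0.5 / m
    q = 0.0
    for i in range(k):
        a_i = sum(e[i])  # fraction of edge ends attached to community i
        q += e[i][i] - a_i ** 2
    return q

# Toy graph: triangles a-b-c and d-e-f joined by the bridge c-d
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
two_groups = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
one_group = {v: 0 for v in two_groups}
q_split = modularity(edges, two_groups)
q_whole = modularity(edges, one_group)
```

For the toy graph the two-triangle partition gives Q = 5/14 (about 0.36), while putting every vertex in one community gives Q = 0. Evaluating Q at each level of the dendrogram and keeping the peak is one way to pick the cut.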


Community Detection

• Step 1: The time axis is separated into equidistant time windows

• Step 2: Detect community instances in each time window using hierarchical divisive clustering

• Step 3: Find similar community instances across time windows (community survival)
  - Similarity = overlap in members

• Step 4: Community survival graph: connect matching community instances over time
  - Borders of clusters: if and when a community dies or merges with another community

• Step 5: Groups of community instances are discovered using hierarchical divisive clustering


Community Detection


Community Detection

• Overlap between two community instances:

= number of vertices in a community instance or intersection

Similarity function:
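The slide's formulas did not survive in this transcript, so the following sketch rests on stated assumptions rather than on the authors' exact definitions. Overlap is assumed to be the size of the member intersection divided by the size of the smaller instance. Two instances are assumed to be similar when their overlap reaches a threshold and their windows are at most a given number of windows apart. The names `overlap`, `survival_edges`, `tau_overlap`, and `tau_period` are hypothetical, and the defaults 0.5 and 6 are placeholders.

```python
def overlap(x, y):
    """Assumed overlap: |x & y| / min(|x|, |y|) for two member sets."""
    if not x or not y:
        return 0.0
    return len(x & y) / min(len(x), len(y))

def survival_edges(instances, tau_overlap=0.5, tau_period=6):
    """Link similar community instances from different time windows.

    instances: list of (window_index, set_of_members).
    Returns index pairs (a, b) that form the edges of a survival graph.
    Both thresholds are assumptions for illustration.
    """
    edges = []
    for a in range(len(instances)):
        for b in range(a + 1, len(instances)):
            (ta, xa), (tb, xb) = instances[a], instances[b]
            if ta == tb:
                continue  # only compare instances across windows
            if abs(ta - tb) <= tau_period and overlap(xa, xb) >= tau_overlap:
                edges.append((a, b))
    return edges

# Hypothetical instances: (window, members)
instances = [(0, {"a", "b", "c"}), (1, {"a", "b", "d"}),
             (1, {"x", "y"}), (9, {"a", "b", "c"})]
links = survival_edges(instances)
```

In this example only the first two instances are linked: they share two of three members and lie in adjacent windows, whereas the instance in window 9 is too far away in time. Clustering the resulting survival graph would then group linked instances into communities.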


Visualizing Community Dynamics

• CoDyM: Community Discovering & Dynamics Mining

• Two ways to compare:
  - Fixed: the chosen time window is compared with all other time windows
  - Periodic: the chosen time window is compared to the previous time window

• Measures:
  - Stability: how stable is the composition of the group?
    - Fixed stability = number of members from the current window who are active in all other time windows
    - Low periodic stability indicates large changes in the membership structure of the subgroup
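The two stability views can be sketched as below. Fixed stability follows the slide's wording (a count of current members active in all other windows). The periodic variant is an assumption, since the slide gives no formula: it is taken here as the share of current members who were also present in the previous window. Function names and sample data are hypothetical.

```python
def fixed_stability(members_by_window, current):
    """Number of members of the current window active in all other windows."""
    stable = set(members_by_window[current])
    for window, members in members_by_window.items():
        if window != current:
            stable &= members
    return len(stable)

def periodic_stability(members_by_window, current):
    """Assumed definition: share of current members seen in the previous window."""
    cur = members_by_window[current]
    prev = members_by_window.get(current - 1, set())
    return len(cur & prev) / len(cur) if cur else 0.0

# Hypothetical subgroup membership per time window
members_by_window = {
    0: {"a", "b", "c"},
    1: {"a", "b", "d"},
    2: {"a", "e"},
}
fixed = fixed_stability(members_by_window, 1)
periodic = periodic_stability(members_by_window, 1)
```

Here only `a` is active in every window, so the fixed stability of window 1 is 1, while two of its three members carry over from window 0.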


Visualizing Community Dynamics

• Density: edges inside the group / edges to members outside the group
  - Indicates connectivity inside the group

• Cohesion: how connected the group is to members outside of the instance
  - Greater cohesion with lower density leads to an unstable subgroup

• Euclidean distance of the vectors representing subgroups
  - Euclidean distance = 0 => structurally equivalent
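A short sketch of the density ratio and the Euclidean distance, assuming an unweighted edge list and assuming each subgroup is represented by a numeric vector of equal length (for example, one entry per possible tie). Function names and sample values are hypothetical.

```python
import math

def density(edges, group):
    """Edges inside the group divided by edges crossing the group boundary."""
    inside = sum(1 for u, v in edges if u in group and v in group)
    crossing = sum(1 for u, v in edges if (u in group) != (v in group))
    return inside / crossing if crossing else math.inf

def euclidean_distance(x, y):
    """Euclidean distance between two equal-length subgroup vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

# Toy graph: triangles a-b-c and d-e-f joined by the bridge c-d
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("d", "f"), ("e", "f")]
group_density = density(edges, {"a", "b", "c"})
same = euclidean_distance([1, 0, 1], [1, 0, 1])
apart = euclidean_distance([0, 0], [3, 4])
```

The triangle a-b-c has three internal edges and one edge to the outside, giving a density of 3. Identical vectors have distance 0, which the slide reads as structural equivalence.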


Visualizing Community Dynamics

• Correlation coefficient: covariance of the vector representations of the graphs / product of their standard deviations
  - Structurally equivalent subgroups have correlation +1

• Group activity: number of interactions in each time window
  - Min internal and external group activities measure the reciprocity inside and outside the group, respectively
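The correlation measure is the Pearson coefficient. A minimal sketch, assuming the two subgroups are given as equal-length numeric vectors that are not constant (a constant vector has zero standard deviation and would make the ratio undefined). The function name and sample vectors are hypothetical.

```python
import math

def correlation(x, y):
    """Pearson correlation: covariance / (std(x) * std(y))."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

identical = correlation([1, 0, 1, 1], [1, 0, 1, 1])
opposite = correlation([1, 0], [0, 1])
```

Identical vectors give +1 and mirrored vectors give -1, up to floating-point rounding.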


Observing Community Transitions & Triggers

• A community persists, grows, evolves, disappears, matures, merges, splits
  - Observing vertices and edges
  - Comparing community instances across time windows

• Triggers
  - Community leadership change
    - Degree of vertex and vertex betweenness -> centrality
    - Low edge clustering but high edge betweenness implies a higher probability that the node acts as a bridge
  - External influences, e.g. public campaign -> immediate effect on community
    - Global properties: # vertices & edges, average shortest path, diameter of the graph, modularity of the graph, etc.


Data Set

• Online student community
  - Interactions: guest book entries
  - Edge: bilateral message exchange
  - 1000 members + 250,000 guest book entries over 18 months (June 2004 - November 2005)
  - Time windows: 14 days (not chosen automatically but by experiments to ensure low standard deviation)
  - Similarity threshold: = 0.5, = 6
  - 1025 similar community instances, 4 communities after clustering


Rectangle = community instance; height = # members

Edge between similar communities

Different colors = different communities after clustering


Graph of similar community instances without clustering

First break in community evolution detected after 27 clustering iterations


2nd change in evolution - 3 communities

3rd change in evolution - 4 communities

Dendrogram - result of clustering

Best dendrogram cut based on modularity measure


List of subgroups for the user to choose from


Visualizing Community Evolution

• Student community: integration of international students who stay for 1 or 2 semesters

• High fluctuations at the end of a semester or beginning of a new one

• Structural changes correspond to the beginning or end of a semester (Christmas '04, Summer '05, Winter '05), as expected


Conclusion & Critique

• Temporal evolution of online communities

• CoDyM - Community Dynamics Miner
  - Evolution of online communities
  - Triggers for structural changes

• Useful tool for community providers
  - Foster intra-organizational knowledge sharing

• Need for appropriate similarity measures that scale better with changing activity and density

• Quantitatively evaluating a community detection algorithm is difficult and requires manual analysis; better measures are needed

• Automatic identification of time window size & thresholds


References

• Newman, M. E. J. and Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69, 026113.

• Moody, J., McFarland, D. and Bender-deMoll, S. (2005). Dynamic Network Visualization. American Journal of Sociology, 110(4), 1206-1241.

• Gloor, P. A. and Zhao, Y. (2004). TeCFlow - A Temporal Communication Flow Visualizer for Social Networks Analysis. In: CSCW'04 Workshop on Social Networks, ACM.


Questions?